Algorithms
An algorithm is a way of doing something. It is a method, or recipe, or set of instructions to follow to solve some problem.
We start with a very simple example. Suppose we have two pieces of data, x and y, and we need to exchange their values. As a test case:
before: x=3 and y=7
after: x=7 and y=3
This needs to work with any values. Whatever x and y are before, afterwards they must have their values exchanged.
How to do this? What algorithm to use?
We might say:
x ← y // this means copy the value of y to x
y ← x
Does this work? We can check using a trace table. We work this out on paper, keeping track of the values of the variables at each step, like this:

                x   y
Before          3   7    start values
after x ← y     7   7    the 7 was copied to x
after y ← x     7   7    the 7 was copied back to y
At the end      7   7
So, this does not work. When we copied 7 to x, we lost the initial value of x.
One way which does work is to use an extra storage space, which we will call 'temp', because it is only for temporary use:
temp ← x
x ← y
y ← temp
We check this with a trace table:

                 x   y   temp
Before           3   7   ??     it makes no difference what temp is
after temp ← x   3   7   3      the 3 was copied to temp
after x ← y      7   7   3      the 7 was copied to x
after y ← temp   7   3   3      temp (the start value of x) copied to y
At the end       7   3   3      values exchanged
So this algorithm will exchange two data values.
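In a language like Python this swap can be sketched as follows (the variable names are just the x, y and temp from the trace):

```python
# Exchange two values using a temporary variable.
x, y = 3, 7

temp = x   # save the start value of x
x = y      # copy y to x (the old x would otherwise be lost)
y = temp   # copy the saved start value of x to y

print(x, y)  # 7 3
```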
So an algorithm is a sequence of steps, each step being an 'instruction'. These become instructions to the computer in the program, so we can only have very simple steps. Only the following are allowed:
Moving a value from one location to another, or assigning a constant. So for example x ← y or x ← 3
Doing arithmetic. For example x ← 2*35 (the * means multiply)
Decisions based on simple logic. For example if x > 10 then x ← 2*35
Loops. For example do 10 times x ← x+1
Input and output (I/O).
The input might come from the keyboard, or from a file of data. The output might be displayed on a screen, or on a printer, or written out to a file. We often ignore I/O in an algorithm. We take it that we already have the data the algorithm operates on. And we do not always output the results. We might, for example, use the results in another algorithm.
We cannot use instructions which require intelligence, like 'solve x² + 3x − 4 = 0'. If we say something like 'test if x is a prime number', that is in fact using another algorithm, and we should give that algorithm, to say how to test if a number is prime.
The arithmetic must be basic, like add subtract multiply and divide. Actions like find the sine of an angle are algorithms in themselves.
We can only have a finite number of instructions, and the algorithm must end after a finite number of steps, with the correct result.
We write down algorithms in pseudocode. 'Pseudo' is pronounced 'sudo' (the p is silent). It means 'like, but not real'. So pseudocode is like real program code, but not actual program code. We use pseudocode because:
It is easier to read and understand than actual code
It is less strict than code
It can easily be converted to any programming language
Computers cannot run pseudocode algorithms. They can only run actual programs, written in some programming language.
Algorithms should be language-agnostic. That means they should not depend on any particular programming language; they will work in any language.
We often have a choice between different algorithms. How do we decide which is better? Remember all algorithms must 'work', that is, end with the desired result. Otherwise they do not count as an algorithm.
An algorithm is better if
it is faster
it uses less memory
For a small number of data items, any algorithm is OK. It is only for a very large amount of data that speed and memory use matter.
The time complexity of an algorithm is how many steps it takes to finish, depending on the number of data items, normally called n. So we need to know how many steps it takes for very large n. If n is small, and an algorithm takes 10 or 20 microseconds, we just do not care. If n is a million, and it takes 2 hours, we do care. The step count for very large n is called the asymptotic behaviour.
Obviously any algorithm will be faster on a faster computer. So we ignore the actual time, and just count the steps. That way we can compare algorithms on different computers.
As an example, suppose an algorithm takes 3n+20 steps to execute, with n data items.
If n=100, this is 320. If n=1000, it is 3020. If n=1000000, it is 3000020. For large n, the +20 is negligible, so we ignore it.
The actual time taken is 3nt, if each step takes t seconds. But on a faster processor t is smaller, whatever the algorithm, so we ignore t and just use the step count 3n. If we double n, 3n doubles. If we triple n, it triples. So the step count is proportional to n. We say this is O[n]. This is 'big-O' notation. O[n] is linear time.
O[n²] is slower. For 100 times the data, it takes 10000 times longer. This is quadratic time. Often this is too slow.
Some algorithms do not depend on how much data there is. For example, accessing an array element does not depend on how big the array is. This is 'constant time', O[1].
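A short sketch (using the figures from the 3n+20 example) shows why the +20 soon stops mattering, while n² races ahead:

```python
# Compare the growth of a linear step count (3n+20) with a quadratic one (n*n).
for n in (100, 1000, 1_000_000):
    linear = 3 * n + 20
    quadratic = n * n
    print(n, linear, quadratic)
```

For n = 100 the two are comparable; for n = 1000000 the quadratic count is hundreds of thousands of times larger.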
The space complexity is how much memory an algorithm uses. Some algorithms just move data around in the data structure. These work 'in place' and use no extra memory beyond what the data itself needs. Other algorithms move the data into a different structure; these use twice the memory of the initial data structure.
Usually we can trade off speed and memory, so that an algorithm which uses more memory is faster.
The algorithm to process some data depends on what data structure it is in: a list, a stack, a tree or whatever. As a result, algorithms and data structures are usually studied together.
In a search algorithm, we have a key field, and we want to find the matching record in some data structure, to fetch the value fields in that record. For example we might have a student id and want to find their phone number.
The search result might be that the key is not present in the data.
Linear search applies to lists of data. We simply start at the beginning of the list, and test elements in turn until we find the target, or we reach the end without finding it.
In pseudocode:
set index = 0
loop start
    if data at index == target
        end loop : target found
    index = index + 1
    if index passed end of list
        end loop : target not present
loop end
What is the time complexity of a linear search? Suppose the list is n=1000 items long. We might be lucky and find it at the first item  so just 1 step. Or unlucky, and find it at the last item, so 1000 steps. So it will be between 1 and 1000. On average, with random data, we will find it half way through, so n/2 steps. So the step count is proportional to n. Twice the amount of data means twice the step count. So this is O[n]. We would like a faster algorithm than this.
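The pseudocode above can be sketched in Python as follows (returning the index, with -1 meaning 'not present'; that convention is an assumption, not part of the pseudocode):

```python
def linear_search(data, target):
    """Test elements in turn; return the index of target, or -1 if absent."""
    index = 0
    while index < len(data):          # stop once index passes the end of the list
        if data[index] == target:
            return index              # target found
        index = index + 1
    return -1                         # reached the end: target not present

print(linear_search([4, 3, 6, 2, 5], 2))   # 3
print(linear_search([4, 3, 6, 2, 5], 9))   # -1
```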
Write and test a program (any language) which
Fills an array with 10 random integers
Prints them out
Assigns a value to search for, at random
Does a linear search for the target, outputting the location if found, and a suitable message if it is not present.
Adapt the program to count, and output, the number of comparisons made
Adapt the program so it works not just for 10 integers, but for a value n, which can be altered.
Run it for n = 10, 20, 30, ... 100 integers. Draw a graph of n against the number of comparisons.
Binary search works only for data which is in an ordered list:
We have two pointers which select out a range of the list. These pointers can be just indexes into an array. We initialise low to be the bottom of the list, and high to be the top.
Then we have a loop:
We work out the middle, between high and low, and check the data there.
If it is too big, the target must be in the lower half. We change high to middle and carry on, using the lower half.
If it is too small, the target is in the upper half. Change low to middle and continue.
If the data at middle == the target, it is found: finish.
What is the time complexity of a binary search? It is how often we can divide the list by 2. So it is O[log2 n]
This is very fast. If we are searching 1 million items, then we can search 2 million instead with just one more step.
But the data must be ordered.
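A sketch of the loop in Python. The bounds here are half-open (high is one past the end, and low moves to middle + 1 on the 'too small' branch); that detail is one common convention, slightly tighter than the 'change low to middle' wording above, and it guarantees the range shrinks every time:

```python
def binary_search(data, target):
    """data must be sorted. Return an index of target, or -1 if absent."""
    low, high = 0, len(data)          # current search range is data[low:high]
    while low < high:
        middle = (low + high) // 2
        if data[middle] == target:
            return middle             # found
        if data[middle] > target:     # too big: target must be in the lower half
            high = middle
        else:                         # too small: target is in the upper half
            low = middle + 1
    return -1                         # range is empty: target not present

print(binary_search([1, 3, 4, 7, 9, 12], 7))   # 3
print(binary_search([1, 3, 4, 7, 9, 12], 5))   # -1
```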
Write and test a program which
Fills an array with integers, as follows: the first element should be zero, and every other element is a random amount up to 4 greater than the previous element.
Prints out the array.
Assigns a target, and does a binary search on the array to find it.
A binary search tree (BST) is a binary tree which is left-right ordered. In other words, everything in the left subtree is less than the node data, and everything in the right subtree is larger. For example:
To search for a value, we:
start at the root
repeat:
    if the data at the node == the target, return : found
    if the data at the node is less than the target, go right
    if the data at the node is more than the target, go left
    if this is a 'dead end', return : value not present
For example, suppose in the tree shown we search for 80.
Start at 50 : too small, go right
75 : too small, go right
85 : too big, go left
80 : found.
Again search for 35
Start at 50 : too big, go left
25 : too small, go right
30 : too small, go right
dead end : 35 is not in the tree
What is the time complexity of this? When we go from a node to a subtree, we halve the number of nodes; in effect we reject half the tree beneath us. So the number of steps (the number of comparisons) is the depth of the tree. This is the same as the number of times we can divide n by two, so this is O[log2 n].
This is the average. Suppose the data inserted into a BST is, for example, 2, 5, 7 and 8; in other words, sorted data. We get an unbalanced tree as shown, with no left subtrees. Then a search is in effect a search down a list, so is O[n].
This illustrates that there may be a difference between best case, average case and worst case behaviour.
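The BST search can be sketched in Python (the Node class and the small tree are illustrative, built to match the 50/75/85/80 worked example above):

```python
class Node:
    def __init__(self, data, left=None, right=None):
        self.data, self.left, self.right = data, left, right

def bst_search(node, target):
    """Walk down from the root; return True if target is in the tree."""
    while node is not None:
        if node.data == target:
            return True               # found
        if node.data < target:
            node = node.right         # node too small: go right
        else:
            node = node.left          # node too big: go left
    return False                      # dead end: value not present

# A tree matching the example: 50 at the root, 25 and 75 below, and so on.
root = Node(50,
            Node(25, None, Node(30)),
            Node(75, None, Node(85, Node(80), None)))

print(bst_search(root, 80))   # True
print(bst_search(root, 35))   # False
```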
To sort here means to put into order, increasing or decreasing.
For example 4 3 6 2 5 is not sorted. If we sort that data, we get 2 3 4 5 6 .
We have the data in a list  maybe an array.
We go through the list, comparing each number with the next, and exchanging them if they are in the wrong order. For example:

4 3 6 2 5    swap
3 4 6 2 5    leave
3 4 6 2 5    swap
3 4 2 6 5    swap
3 4 2 5 6    finished this scan
After this, one number (here 6) has been moved to its correct position. But others (like 2) are still in the wrong position. One scan gets one number correct, so to get all n correct, do it n times:
n times:
go through the list and swap each with the next if wrong.
It is called a bubble sort because on each scan a number 'bubbles' up to the end.
What is the time complexity? Each scan requires n comparisons (in fact n−1, but for large n this is about the same as n), and we do n scans. So there is a total of n² comparisons, and this is O[n²].
This is a basic bubble sort. We can make some improvements, but it remains O[n²].
This is usually too slow.
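The basic bubble sort sketched in Python, with n scans of n−1 comparisons and none of the improvements:

```python
def bubble_sort(data):
    """Sort a list in place with repeated scan-and-swap passes."""
    n = len(data)
    for _ in range(n):                       # n scans
        for i in range(n - 1):               # compare each element with the next
            if data[i] > data[i + 1]:        # wrong order: exchange them
                data[i], data[i + 1] = data[i + 1], data[i]
    return data

print(bubble_sort([4, 3, 6, 2, 5]))   # [2, 3, 4, 5, 6]
```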
Write and test a program (any language) which
Fills an array with 10 random integers
Prints them out
Sorts the array with a bubble sort
Prints them out again
The merge sort works as follows. To start with, suppose we have a function that merges two sorted lists into a single sorted list. So for example with input
list 1 : 2,4,7,8,9,12,15
list 2 : 3,5,6,10,13
the output is: 2,3,4,5,6,7,8,9,10,12,13,15
How? We start by comparing the heads of the two lists: 2 and 3. 2 is less, so put it to the output and remove it from list 1. Then compare 4 and 3. 3 is less, so output it and remove it from list 2. Now compare 4 and 5. 4 is less, so output it, and so on. Continue until one input list is empty, and then just move the rest of the other list to the output.
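The merge function can be sketched in Python (both inputs must already be sorted):

```python
def merge(list1, list2):
    """Merge two sorted lists into one sorted list."""
    output = []
    i = j = 0
    while i < len(list1) and j < len(list2):
        if list1[i] <= list2[j]:      # the smaller head goes to the output
            output.append(list1[i]); i += 1
        else:
            output.append(list2[j]); j += 1
    output.extend(list1[i:])          # one list is empty: move the rest over
    output.extend(list2[j:])
    return output

print(merge([2, 4, 7, 8, 9, 12, 15], [3, 5, 6, 10, 13]))
# [2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 13, 15]
```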
Then, we use this function as follows. The input is an unsorted list  maybe
9,2,5,8,1,3,7,4,3,3,8
Treat this as a sequence of lists, 1 element long, and merge them pairwise.
9 2 | 5 8 | 1 3 | 7 4 | 3 3 | 8     pairs of 1-element lists
2 9 | 5 8 | 1 3 | 4 7 | 3 3 | 8     each pair merged
Then we take pairs 2 elements long, and merge them:
2 9 | 5 8 | 1 3 | 4 7 | 3 3 | 8     pairs of 2-element lists
2 5 8 9 | 1 3 4 7 | 3 3 8           each pair merged
Then sublists 4 elements long. In fact, pair 1 has two lists of 4 elements; pair 2 has one list of length 3 and one of length 0.
2 5 8 9 | 1 3 4 7 | 3 3 8           pairs of 4-element lists
1 2 3 4 5 7 8 9 | 3 3 8             each pair merged
(the second list of pair 2 has zero length). Then sublists 8 elements long. In fact we have one list of length 8, merging with one of length 3:
1 2 3 4 5 7 8 9 | 3 3 8             one pair
1 2 3 3 3 4 5 7 8 8 9               merged : the list is sorted
One scan through takes n steps, through the whole list. How many scans? The sublist lengths were 1, 2, 4, 8... Going backwards, this halves the list each time, and the number of scans is how often we can divide n by 2, which is log2 n.
So the total step count is O[n log2 n]. This is much faster than a bubble sort.
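The whole bottom-up merge sort can be sketched in Python; the merge helper is the function described above:

```python
def merge(list1, list2):
    """Merge two sorted lists into one sorted list."""
    output, i, j = [], 0, 0
    while i < len(list1) and j < len(list2):
        if list1[i] <= list2[j]:
            output.append(list1[i]); i += 1
        else:
            output.append(list2[j]); j += 1
    return output + list1[i:] + list2[j:]

def merge_sort(data):
    """Bottom-up merge sort: merge sublists of width 1, 2, 4, ... pairwise."""
    width = 1
    while width < len(data):
        merged = []
        for start in range(0, len(data), 2 * width):
            merged += merge(data[start:start + width],
                            data[start + width:start + 2 * width])
        data = merged
        width *= 2                    # log2(n) scans, each of about n steps
    return data

print(merge_sort([9, 2, 5, 8, 1, 3, 7, 4, 3, 3, 8]))
# [1, 2, 3, 3, 3, 4, 5, 7, 8, 8, 9]
```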
Write a program which does a merge sort on an array of random integers
To traverse a data structure means to follow links from node to node until every node has been reached.
With a binary tree, as shown, three ways to traverse it are
left, root, right (inorder)
root, left, right (preorder)
left, right, root (postorder)
These are recursive. The left subtree for example, is itself a tree, and we need to traverse it.
The algorithm for an inorder traversal is to visit the root, where visit is:
visit(node):
    if node.left != null, visit(node.left)
    output node (see below)
    if node.right != null, visit(node.right)
The 'output node' might be replaced with whatever we want to do with the data at each node: place it in a queue, do arithmetic with it, or whatever.
An inorder traversal corresponds to the way a BST is ordered. So a use for an inorder traversal is to output a BST in order of keys.
In a preorder traversal, we visit the root, then recurse into the left subtree, then the right subtree. The root is visited before the left and right.
A use for this is to copy a tree. We clone the root, link its left field to a clone of the left subtree, then link its right field to a clone of the right subtree. This gives the root of the tree copy.
A postorder traversal is: left, right, root. The root is visited after the left and right. This is used in abstract syntax tree evaluation.
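The three traversals sketched in Python, collecting node data into a list in place of 'output node' (the Node class and the small tree are illustrative):

```python
class Node:
    def __init__(self, data, left=None, right=None):
        self.data, self.left, self.right = data, left, right

def inorder(node, out):
    if node is None:
        return
    inorder(node.left, out)    # left
    out.append(node.data)      # root
    inorder(node.right, out)   # right

def preorder(node, out):
    if node is None:
        return
    out.append(node.data)      # root first
    preorder(node.left, out)
    preorder(node.right, out)

def postorder(node, out):
    if node is None:
        return
    postorder(node.left, out)
    postorder(node.right, out)
    out.append(node.data)      # root last

tree = Node(50, Node(25, None, Node(30)), Node(75))
for walk in (inorder, preorder, postorder):
    result = []
    walk(tree, result)
    print(walk.__name__, result)
```

On this tree the inorder walk gives the keys in sorted order, as the BST section says it should.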
One traversal method is breadth-first.
Suppose this is our graph, and we start from node A:
In breadth-first, we start by visiting every node next to the start. Then every node next to those, and so on.
More exactly:
put the start node into a queue
while the queue is not empty
    pop the queue head and visit it
    put every node next to this into the queue, if it has not already been queued
So here, starting at A
visit A
Put B, C, D in the queue
Pop a node, get B. Visit it. Add its nearest to the queue. The queue is now C D T X (C was already in)
Pop a node, get C. Queue is D T X V W
Pop D. Queue is T X V W G
Visit T
Visit X
Visit V
Visit W
Visit G
Queue empty, so end.
The sequence was A B C D T X V W G.
The order depends on the order in which the nearest nodes are put into the queue. For example we could have put D, C, B in as nearest to A.
This algorithm is used to find the shortest path from one node to another (see Dijkstra later).
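A sketch of breadth-first traversal in Python, on a graph stored as an adjacency dictionary. The graph figure is not shown here, so the adjacency lists below are an assumption, chosen to be consistent with the worked trace above:

```python
from collections import deque

def breadth_first(graph, start):
    """Visit nodes level by level, using a queue."""
    visited = []
    queued = {start}                  # nodes that have ever been in the queue
    queue = deque([start])
    while queue:
        node = queue.popleft()        # pop the queue head and visit it
        visited.append(node)
        for neighbour in graph[node]: # queue each neighbour not already queued
            if neighbour not in queued:
                queued.add(neighbour)
                queue.append(neighbour)
    return visited

# Hypothetical adjacency lists matching the A B C D T X V W G trace.
graph = {"A": ["B", "C", "D"], "B": ["A", "T", "X"],
         "C": ["A", "V", "W"], "D": ["A", "G"],
         "T": ["B"], "X": ["B"], "V": ["C"], "W": ["C"], "G": ["D"]}
print(breadth_first(graph, "A"))
```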
There is a set of ideas relating to expressions and their evaluation. Programs contain expressions, and program execution means evaluating those expressions.
An expression is something which can be worked out, for example 2+3*4.
Evaluating an expression means working out its value
Expressions contain operators, like +, −, * and /. There might also be functions like sin, cos and log.
They also contain operands, the things the operators work on. These might be integers like 2 and 3, floats like 3.141, variables like x and y, and so on.
Operators have precedence levels, which say what order they should be carried out in. For example * normally has higher precedence than +. So in 2+3*4, the * should be done first, and the value is 14, not 20. The fact that the * is to the right of the + means we cannot just work left to right: doing so would give 2+3 = 5, then *4 = 20, which is the wrong answer.
The usual precedence levels are given by BODMAS: brackets, orders (powers), division, multiplication, addition, subtraction.
Expressions can be written in different notations. The usual notation is called infix. The operator comes in between operands  like 2 + 3
In prefix, the operator is before (pre) the operands, like + 2 3
In postfix, the operator is after (post) the operands, like 2 3 +. Postfix is also called reverse Polish or RPN, after Jan Łukasiewicz, a Polish logician.
Reverse Polish is used, internally within interpreters and compilers, because it is fast to evaluate, and brackets are not needed.
In conversion of infix to reverse Polish, the order of operands is unchanged. Some examples:
Infix                Reverse Polish
2 + 3                2 3 +
2 + 3 * 4            2 3 4 * +
2 * 3 + 4            2 3 * 4 +
2 * (3 + 4)          2 3 4 + *
(2 + 3) * (5 − 2)    2 3 + 5 2 − *
In 2 3 + 5 2 − *, we can work this out, left to right, as follows:
2 3 + : apply the + to the two things before it, 2 and 3, and replace them with the result, 5
5 5 2 − : apply the − to the 5 and 2, giving 3
5 3 * : apply the * to the 5 and 3
15
Check that the brackets in (2+3)*(5−2) are not needed in 2 3 + 5 2 − *
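Reverse Polish evaluation is usually done with a stack: push operands, and on each operator pop two values, apply it, and push the result. This is equivalent to the left-to-right replacement above. A sketch in Python, using the ASCII '-' for the minus sign:

```python
def evaluate_rpn(tokens):
    """Evaluate a reverse Polish expression given as a list of tokens."""
    stack = []
    for token in tokens:
        if token in ("+", "-", "*", "/"):
            b = stack.pop()           # the two things before the operator
            a = stack.pop()
            if token == "+": stack.append(a + b)
            elif token == "-": stack.append(a - b)
            elif token == "*": stack.append(a * b)
            else: stack.append(a / b)
        else:
            stack.append(int(token))  # an operand: push it
    return stack[0]                   # the single remaining value

print(evaluate_rpn("2 3 + 5 2 - *".split()))   # 15
```

Note that no brackets appear anywhere: the order of the tokens alone fixes the order of evaluation.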