Run time analysis of insertion sort and quick sort
CM0212 – Algorithms and Data Structures
Philip Strong 0807259

An empirical study into the run time characteristics of two simple sorting algorithms. The insertion sort and quick sort are analysed to show the usefulness of each depending on the application.

Introduction
The insertion sort and quick sort are both sorting algorithms: each takes an unsorted list and reorders its elements into numerical order. Each algorithm has its own characteristics as to how the run time changes with respect to the number of inputs. This is commonly shown using ‘Big Oh’ notation, where the O stands for ‘order of’. Big Oh notation describes how the run time of an algorithm grows when very large inputs are applied, so only the dominant term of the run time function is kept and constant factors are dropped [1]. For instance, when T(n) = An + B, the runtime complexity is O(n), as B has little effect on the runtime as n tends towards infinity.

The average and worst case complexity of the insertion sort is O(n²), and the best case is O(n). When the input is already in ascending order, each element only needs to be compared once with its predecessor, and no elements need to be moved. This results in a runtime complexity of O(n), one step per position. The worst case is when the input is in descending order, and each element must be compared with everything in the sorted part of the list. This results in a runtime complexity of O(n²).

The best and average case complexity of the quick sort is O(n log n), and the worst case is O(n²). The best case of a quick sort is when the partitioning stage splits the list in half [2]. Random input tends to produce reasonably balanced splits, so the average case complexity is close to the best case. The worst case occurs when every partition is as unbalanced as possible. With the first element as the pivot, this happens when the list is already ordered, for example in descending order [2].

Design
The sorting algorithms were based on the pseudocode shown below. As can be seen, the insertion sort algorithm is significantly shorter and so could be a better choice when program space is critical. To simplify the design of the algorithm, the quick sort implementation uses the middle element of the array as the pivot.

Insertion sort pseudo code [3]

    Algorithm insertionSort(input, n)
    Input: An array storing n integers
    Output: An array sorted in ascending order
    for i = 1 to (n – 1) do
        item = input[i]
        j = i – 1
        while j ≥ 0 and input[j] > item do
            input[j + 1] = input[j]
            j = j – 1
        input[j + 1] = item

Quick sort pseudo code [3]

    Algorithm quickSort(list, lower, upper)
    Input: Partial array, list[lower::upper]
    Output: Partial array, sorted
    if upper > lower then
        j ← partition(list, lower, upper)
        quickSort(list, lower, j - 1)
        quickSort(list, j + 1, upper)

    Algorithm partition(list, lower, upper)
    Input: Partial array, list[lower::upper]
    Output: Partial array, partitioned
    i ← lower; j ← upper + 1
    v ← list(lower)
    do {
        do i ← i + 1 while (list(i) < v)
        do j ← j - 1 while (list(j) > v)
        if (i < j) exchange(list(i), list(j))
    } while (i < j)
    exchange(list(lower), list(j))
    return j

Experimental Planning
The experiment was set up by implementing both algorithms in Java. A random number generator will create a text file for the algorithms to sort. The file will be read into an array, and the array will then be passed as an argument to each algorithm. The sorted output will be written to a file to reduce console output. The cost of running each algorithm will be measured by adding a counter into every loop. The final count will be recorded in a comma separated values (CSV) file, along with the number of inputs and the sort type. The tests will be set to run automatically. The order of the tests and functions will be as follows:

    Generate random data set 1
    Run insertion sort on data set 1
    Run quick sort on data set 1
    Generate random data set 2
    Run insertion sort on data set 2
    Run quick sort on data set 2
    Generate random data set 3
    Run insertion sort on data set 3
    Run quick sort on data set 3
    Generate ascending data
    Run best insertion sort
    Run best quick sort
    Generate descending data
    Run worst insertion sort
    Run worst quick sort
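
The three kinds of input in the plan above could also be produced directly in memory. The following is a minimal sketch under stated assumptions, not the code used in the experiment: the class name DataSets, its method names and the seed parameter are all hypothetical, and the list of sizes is taken from the "Number of inputs" column in Appendix A.

```java
import java.util.Random;

// Hypothetical helper class, not part of the coursework code.
class DataSets {
    // Input sizes as listed in Appendix A: 1, then 50,000 to 600,000 in steps of 50,000.
    static int[] inputSizes() {
        int[] sizes = new int[13];
        sizes[0] = 1;
        for (int k = 1; k < sizes.length; k++) {
            sizes[k] = k * 50_000;
        }
        return sizes;
    }

    // Random integers in the range 1..maxValue. The seed is an assumption
    // added here so that a run can be repeated exactly.
    static int[] random(int n, int maxValue, long seed) {
        Random rng = new Random(seed);
        int[] data = new int[n];
        for (int i = 0; i < n; i++) {
            data[i] = rng.nextInt(maxValue) + 1;
        }
        return data;
    }

    // 0, 1, ..., n - 1 (input for the best case tests)
    static int[] ascending(int n) {
        int[] data = new int[n];
        for (int i = 0; i < n; i++) {
            data[i] = i;
        }
        return data;
    }

    // n, n - 1, ..., 1 (input for the worst case tests)
    static int[] descending(int n) {
        int[] data = new int[n];
        for (int i = 0; i < n; i++) {
            data[i] = n - i;
        }
        return data;
    }

    public static void main(String[] args) {
        for (int n : inputSizes()) {
            System.out.println(n);
        }
    }
}
```

Fixing the seed is a design choice that makes an anomalous run reproducible; the experiment itself used Math.random(), which is not seeded.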

Each sort will be run on between 1 and 600,000 inputs, in steps of 50,000. The average case tests will be run 3 times to check for anomalies.

Implementation

    public insertionSort(int[] input, int n)
    {
        if (n > input.length) // Check range is available in the data
        {
            System.out.println("Out of range, only " + input.length + " integers available.");
        }
        else
        {
            // Insertion sort algorithm
            for (int i = 1; i <= (n - 1); i++)
            {
                counter++;
                item = input[i];
                j = i - 1;
                while ((j >= 0) && (input[j] > item))
                {
                    counter++;
                    input[j + 1] = input[j];
                    j = j - 1;
                }
                input[j + 1] = item;
            }
            output = input;
        }
    }

    // http://www.algolist.net/Algorithms/Sorting/Quicksort
    // [4]
    public void sort(int[] input, int lower, int upper)
    {
        int index = partition(input, lower, upper);
        if (lower < index - 1)
        {
            sort(input, lower, index - 1);
        }
        if (index < upper)
        {
            sort(input, index, upper);
        }
        output = input;
    }

    public int partition(int input[], int lower, int upper) // [4]
    {
        int i = lower;
        int j = upper;
        int tmp;
        pivot = input[(lower + upper) / 2]; // Set pivot as middle value
        while (i <= j)
        {
            while (input[i] < pivot)
            {
                counter++;
                i++;
            }
            while (input[j] > pivot)
            {
                counter++;
                j--;
            }
            if (i <= j)
            {
                counter++;
                tmp = input[i];
                input[i] = input[j];
                input[j] = tmp;
                i++;
                j--;
            }
        }
        return i;
    }

Testing
To test the function of the algorithms, I set the random number generator to produce only a very small set of data, consisting of 10 random numbers between 1 and 100. The screenshots below show the input data file (data.tst), with the output data file (OUTdata.tst) shown on top. If the output data is in numerical order, then the algorithm is functioning correctly.
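
The check described above was done by eye. As a complement, the same check could be automated. The sketch below is an illustration only: the class SortCheck and its methods are hypothetical and were not part of the experiment. It tests that an array is in ascending order, and also that it holds the same values as the input by comparing against the library sort.

```java
import java.util.Arrays;

// Hypothetical helper class, not part of the coursework code.
class SortCheck {
    // True if every element is less than or equal to the one after it.
    static boolean isAscending(int[] data) {
        for (int i = 1; i < data.length; i++) {
            if (data[i - 1] > data[i]) {
                return false;
            }
        }
        return true;
    }

    // True if 'sorted' is exactly what the library sort produces from
    // 'original', so no values were lost or duplicated along the way.
    static boolean matchesReference(int[] original, int[] sorted) {
        int[] expected = Arrays.copyOf(original, original.length);
        Arrays.sort(expected);
        return Arrays.equals(expected, sorted);
    }

    public static void main(String[] args) {
        int[] input = {42, 7, 99, 7, 15};
        int[] output = {7, 7, 15, 42, 99};
        System.out.println(isAscending(output));
        System.out.println(matchesReference(input, output));
    }
}
```

Checking against a reference sort catches a fault that an ordering check alone would miss, such as an output file that is in order but has dropped a value.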
Figure 1 – Test data for insertion sort
Figure 2 – Test data for quick sort

As is evident from the images, both the insertion sort and the quick sort are sorting the data as expected.

Results
The data collected by running the tests is shown in Appendix A. To determine whether the runtime complexity of each algorithm is as expected, the number of iterations is plotted against the expected growth function of the number of inputs (n, n² or n log n). If the prediction is correct, the resulting graph is a straight line.
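
The same idea can be checked numerically: if the iteration count grows like f(n), then count / f(n) should stay roughly constant as n grows. The sketch below is an illustration under stated assumptions. The class GrowthCheck and its members are hypothetical; the counts are three Test 1 rows copied from Appendix A, and the base 2 logarithm is an arbitrary choice that only changes the constant.

```java
import java.util.function.DoubleUnaryOperator;

// Hypothetical helper class, not part of the coursework code.
class GrowthCheck {
    // Three rows of Appendix A, Test 1 column (average case).
    static final long[] SIZES = {50_000L, 300_000L, 600_000L};
    static final long[] INSERTION_AVG = {627_542_123L, 22_480_796_819L, 89_988_940_660L};
    static final long[] QUICK_AVG = {830_404L, 5_779_951L, 11_939_090L};

    // Divides each count by growth(n). A near-constant result supports
    // the hypothesis that the count grows like growth(n).
    static double[] ratios(long[] sizes, long[] counts, DoubleUnaryOperator growth) {
        double[] result = new double[sizes.length];
        for (int k = 0; k < sizes.length; k++) {
            result[k] = counts[k] / growth.applyAsDouble(sizes[k]);
        }
        return result;
    }

    static double nSquared(double n) {
        return n * n;
    }

    static double nLog2N(double n) {
        return n * Math.log(n) / Math.log(2);
    }

    public static void main(String[] args) {
        for (double r : ratios(SIZES, INSERTION_AVG, GrowthCheck::nSquared)) {
            System.out.println("insertion / n^2:      " + r);
        }
        for (double r : ratios(SIZES, QUICK_AVG, GrowthCheck::nLog2N)) {
            System.out.println("quick / (n log2 n):   " + r);
        }
    }
}
```

For these rows the insertion sort ratios stay close to 0.25, and the quick sort ratios stay slightly above 1.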
Figure 3, above, shows the average case runtime of the insertion sort with random inputs, plotted against n². The plot is a straight line, so the runtime complexity with respect to the number of inputs is O(n²), as expected.
Figure 4, above, shows the best case runtime of the insertion sort. As the graph has been plotted with n against the number of iterations, the straight line indicates that the best case for insertion sort is O(n), which is what was predicted.
Figure 5, above, shows the worst case runtime of the insertion sort. The graph has again been plotted against n², so the linear nature of the graph suggests that the runtime complexity is O(n²), as stated.
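
The best and worst case counts for the insertion sort can also be predicted exactly. With one count per pass of the outer loop and one per pass of the inner loop, ascending input gives n − 1 iterations, and descending input gives (n − 1) + n(n − 1)/2. The sketch below is an illustration: the class InsertionCount and its method names are hypothetical, and the loop follows the counter placement of the insertion sort listing in this report.

```java
// Hypothetical helper class, not part of the coursework code.
class InsertionCount {
    // Sorts the array in place and returns the number of loop iterations,
    // counting once per outer pass and once per inner pass.
    static long countIterations(int[] input) {
        long counter = 0;
        for (int i = 1; i <= input.length - 1; i++) {
            counter++;
            int item = input[i];
            int j = i - 1;
            while (j >= 0 && input[j] > item) {
                counter++;
                input[j + 1] = input[j];
                j = j - 1;
            }
            input[j + 1] = item;
        }
        return counter;
    }

    // Ascending input: the inner loop never runs.
    static long bestCase(long n) {
        return n - 1;
    }

    // Strictly descending input: element i is shifted past all i predecessors.
    static long worstCase(long n) {
        return (n - 1) + n * (n - 1) / 2;
    }

    public static void main(String[] args) {
        int n = 1000;
        int[] ascending = new int[n];
        int[] descending = new int[n];
        for (int i = 0; i < n; i++) {
            ascending[i] = i;
            descending[i] = n - i;
        }
        System.out.println(countIterations(ascending) + " vs " + bestCase(n));
        System.out.println(countIterations(descending) + " vs " + worstCase(n));
    }
}
```

For n = 50,000 the two formulas give 49,999 and 1,250,024,999, which agree with the best and worst case insertion sort rows in Appendix A.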
Figure 6, above, shows the average case runtime of the quick sort with random inputs, plotted against n log n. The plot is a straight line, so the runtime complexity with respect to the number of inputs is O(n log n), as expected.
The graph above shows that the best case runtime is linear when plotted against n log n. This shows that the best case runtime complexity of the quick sort algorithm is O(n log n).
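As can be seen from the graph above, the worst case output of the quick sort is not linear. This is due to a coding error which could not be traced after testing. The worst case counts in Appendix A are close to the best case counts, which is consistent with the middle element still being used as the pivot: a middle pivot splits a descending list roughly in half.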
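
For comparison, quadratic behaviour can be produced with a different pivot rule. The sketch below is not the quick sort used in this experiment: it swaps in a first element pivot with a simple single scan (Lomuto style) partition, and counts comparisons rather than loop iterations. The class FirstPivotQuickSort and its method names are hypothetical. On input that is already ordered, every partition leaves one side empty, so the comparison count is n(n − 1)/2.

```java
// Hypothetical illustration, not part of the coursework code.
class FirstPivotQuickSort {
    private static long comparisons;

    // Sorts the array in place and returns the number of element comparisons.
    static long sortAndCount(int[] data) {
        comparisons = 0;
        sort(data, 0, data.length - 1);
        return comparisons;
    }

    private static void sort(int[] a, int lo, int hi) {
        if (lo >= hi) {
            return;
        }
        int p = partition(a, lo, hi);
        sort(a, lo, p - 1);
        sort(a, p + 1, hi);
    }

    // Uses a[lo] as the pivot and returns its final position.
    private static int partition(int[] a, int lo, int hi) {
        int pivot = a[lo];
        int boundary = lo; // last index holding a value smaller than the pivot
        for (int k = lo + 1; k <= hi; k++) {
            comparisons++;
            if (a[k] < pivot) {
                boundary++;
                swap(a, boundary, k);
            }
        }
        swap(a, lo, boundary);
        return boundary;
    }

    private static void swap(int[] a, int x, int y) {
        int tmp = a[x];
        a[x] = a[y];
        a[y] = tmp;
    }

    public static void main(String[] args) {
        int n = 1000; // kept small: the recursion depth is about n on ordered input
        int[] descending = new int[n];
        for (int i = 0; i < n; i++) {
            descending[i] = n - i;
        }
        System.out.println(sortAndCount(descending) + " comparisons for n = " + n);
    }
}
```

Because the recursion depth grows with n on ordered input, a sketch like this would need an iterative rewrite or a larger stack before being run at the input sizes used in this study.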

Analysis
For random inputs of 50,000 items and above, the insertion sort needed far more iterations than the quick sort. This is evident from the iteration counts in Appendix A. Even 50,000 inputs cause the insertion sort to run approximately 625,000,000 iterations, compared to approximately 850,000 for the quick sort. The exception is input that is already in ascending order, where the insertion sort needs only n – 1 iterations and so does less work than the quick sort.

An advantage of the insertion sort over the quick sort is its simplicity, which makes it easier to code and helps it fit into very small program spaces, such as a microcontroller where program size can be limited to as little as 1,000 operations.

The iteration count of the quick sort also grew at a slower rate than that of the insertion sort as more inputs were applied.

Conclusion
The empirical study has confirmed the predictions made for the insertion sort and for the best and average cases of the quick sort. The best case runtime complexity of an insertion sort is O(n); the average and worst case runtime complexity of an insertion sort is O(n²). The best and average case runtime complexity of quick sort is O(n log n). No conclusion can be drawn about the worst case runtime complexity of the quick sort, because the algorithm could not be run with the pivot overridden and the input list in descending order.

It is also concluded that insertion sort is a simpler sorting algorithm, which can be implemented in far less code and time. This is a characteristic which would be useful when processing time is not an important factor, but program space is.

If this experiment were to be repeated, I would run the algorithms on a more powerful computer with far higher numbers of inputs, to get a more accurate impression of the runtime characteristics. I would also make an effort to ensure that the program which runs the algorithms works correctly, as I had trouble with the worst case scenario when using the quick sort.

References
[1] http://leepoint.net/notes-java/algorithms/big-oh/bigoh.html
[2] http://www.personal.kent.edu/~rmuhamma/Algorithms/MyAlgorithms/Sorting/quickSort.htm
[3] Mumford, C. CM0212 Lecture notes, Cardiff University
[4] http://www.algolist.net/Algorithms/Sorting/Quicksort

Appendix A – Results table

All values are iteration counts. The best and worst case tests were run once each.

    Test                   Number of inputs   Test 1         Test 2         Test 3
    Insertion sort         1                  0              0              0
    Insertion sort         50000              627542123      626147799      624675965
    Insertion sort         100000             2516824290     2509868625     2495291701
    Insertion sort         150000             5641976345     5634034367     5611317362
    Insertion sort         200000             10034022051    10000410843    9996017518
    Insertion sort         250000             15662852797    15619101593    15619988064
    Insertion sort         300000             22480796819    22505840073    22491947988
    Insertion sort         350000             30590139318    30633132481    30616998925
    Insertion sort         400000             39974765255    40079147489    39983259067
    Insertion sort         450000             50635908205    50720042238    50589772366
    Insertion sort         500000             62515563283    62609607992    62468540235
    Insertion sort         550000             75618761287    75716370481    75624789786
    Insertion sort         600000             89988940660    90051873106    89994380407
    Quick sort             1                  2              1              2
    Quick sort             50000              830404         842897         866601
    Quick sort             100000             1873389        1809351        1796251
    Quick sort             150000             2859706        2849807        2698369
    Quick sort             200000             4003389        3783696        3827994
    Quick sort             250000             5338076        4932477        5129526
    Quick sort             300000             5779951        5844113        5868753
    Quick sort             350000             6644237        6812426        6763763
    Quick sort             400000             7512232        7905009        7658100
    Quick sort             450000             8727657        8758091        9097588
    Quick sort             500000             10354666       9978611        9675388
    Quick sort             550000             10974410       11900488       11018520
    Quick sort             600000             11939090       12267535       12358614
    Insertion sort Best    1                  0
    Insertion sort Best    50000              49999
    Insertion sort Best    100000             99999
    Insertion sort Best    150000             149999
    Insertion sort Best    200000             199999
    Insertion sort Best    250000             249999
    Insertion sort Best    300000             299999
    Insertion sort Best    350000             349999
    Insertion sort Best    400000             399999
    Insertion sort Best    450000             449999
    Insertion sort Best    500000             499999
    Insertion sort Best    550000             549999
    Insertion sort Best    600000             599999
    Quick sort Best        1                  2
    Quick sort Best        50000              784481
    Quick sort Best        100000             1668946
    Quick sort Best        150000             2587875
    Quick sort Best        200000             3537875
    Quick sort Best        250000             4487875
    Quick sort Best        300000             5475732
    Quick sort Best        350000             6475732
    Quick sort Best        400000             7475732
    Quick sort Best        450000             8475732
    Quick sort Best        500000             9475732
    Quick sort Best        550000             10501445
    Quick sort Best        600000             11551445
    Insertion sort Worst   1                  0
    Insertion sort Worst   50000              1250024999
    Insertion sort Worst   100000             5000049999
    Insertion sort Worst   150000             11250074999
    Insertion sort Worst   200000             20000099999
    Insertion sort Worst   250000             31250124999
    Insertion sort Worst   300000             45000149999
    Insertion sort Worst   350000             61250174999
    Insertion sort Worst   400000             80000199999
    Insertion sort Worst   450000             1.0125E+11
    Insertion sort Worst   500000             1.25E+11
    Insertion sort Worst   550000             1.5125E+11
    Insertion sort Worst   600000             1.8E+11
    Quick sort Worst       1                  5
    Quick sort Worst       50000              834436
    Quick sort Worst       100000             1768898
    Quick sort Worst       150000             2737824
    Quick sort Worst       200000             3737824
    Quick sort Worst       250000             4737824
    Quick sort Worst       300000             5775678
    Quick sort Worst       350000             6825678
    Quick sort Worst       400000             7875678
    Quick sort Worst       450000             8925678
    Quick sort Worst       500000             9975678
    Quick sort Worst       550000             11051388
    Quick sort Worst       600000             12151388

Appendix B – Code listing

runTests.java

    import java.io.*;

    public class runTests
    {
        // Class to run all the tests for CM0212 Runtime analysis coursework
        // Philip Strong

        private String fileLoc;
        PrintWriter output = new PrintWriter(new FileWriter("counter.csv"));
        private final int generatorSize = 100; // Max size of generated data
        private final int numOfInts = 10;      // Number of integers to be generated
        private int[] range = {1, 10};

        public static void main(String[] args) throws IOException
        {
            new runTests(args);
        }

        public runTests(String[] args) throws IOException
        {
            fileLoc = "data.tst"; // Data file name
            insertionSort insertionTest;
            quickSort quickTest;

            // Note: quickSort treats its upper argument as an inclusive index,
            // so range[i] must be smaller than the number of integers in the file.

            for (int j = 0; j < 3; j++)
            {
                generateRandomData(); // Call random number generator

                System.out.println("Running average insertion sort " + (j + 1) + "...");
                for (int i = 0; i < range.length; i++)
                {
                    System.out.print((i + 1) + " / " + (range.length) + " ");
                    insertionTest = new insertionSort(parseFileToArray(fileLoc), range[i]); // Do insertion sort
                    dumpArrayToFile(insertionTest.getArray(), i); // Dump sorted data to file
                    dumpInfoToFile("ins", range[i], insertionTest.getIterations());
                }

                System.out.println("Running average quicksort " + (j + 1) + "...");
                for (int i = 0; i < range.length; i++)
                {
                    System.out.print((i + 1) + " / " + (range.length) + " ");
                    quickTest = new quickSort(parseFileToArray(fileLoc), 0, range[i], false); // Do quick sort
                    dumpArrayToFile(quickTest.getArray(), i); // Dump sorted data to file
                    dumpInfoToFile("qui", range[i], quickTest.getIterations());
                }
            }

            generateAscendingData(); // Call ascending data generator

            System.out.println("Running best insertion sort...");
            for (int i = 0; i < range.length; i++)
            {
                System.out.print((i + 1) + " / " + (range.length) + " ");
                insertionTest = new insertionSort(parseFileToArray(fileLoc), range[i]); // Do insertion sort
                dumpArrayToFile(insertionTest.getArray(), i); // Dump sorted data to file
                dumpInfoToFile("insBest", range[i], insertionTest.getIterations());
            }

            System.out.println("Running best quicksort...");
            for (int i = 0; i < range.length; i++)
            {
                System.out.print((i + 1) + " / " + (range.length) + " ");
                quickTest = new quickSort(parseFileToArray(fileLoc), 0, range[i], false); // Do quick sort
                dumpArrayToFile(quickTest.getArray(), i); // Dump sorted data to file
                dumpInfoToFile("quiBest", range[i], quickTest.getIterations());
            }

            generateDescendingData(); // Call descending data generator

            System.out.println("Running worst insertion sort...");
            for (int i = 0; i < range.length; i++)
            {
                System.out.print((i + 1) + " / " + (range.length) + " ");
                insertionTest = new insertionSort(parseFileToArray(fileLoc), range[i]); // Do insertion sort
                dumpArrayToFile(insertionTest.getArray(), i); // Dump sorted data to file
                dumpInfoToFile("insWorst", range[i], insertionTest.getIterations());
            }

            System.out.println("Running worst quicksort...");
            for (int i = 0; i < range.length; i++)
            {
                System.out.print((i + 1) + " / " + (range.length) + " ");
                quickTest = new quickSort(parseFileToArray(fileLoc), 0, range[i], false); // Do quick sort
                dumpArrayToFile(quickTest.getArray(), i); // Dump sorted data to file
                dumpInfoToFile("quiWorst", range[i], quickTest.getIterations());
            }

            output.close();
            System.out.println("Done");
        }

        public void dumpInfoToFile(String cmd, int range, long iterations) throws IOException
        // Appends one line of the form "sort type,number of inputs,iterations" to the CSV file
        {
            try
            {
                output.println(cmd + "," + range + "," + iterations);
            }
            finally
            {
                System.out.println("Added to timing file.");
            }
        }

        public int[] parseFileToArray(String fileLoc) throws IOException
        // Take data file and place each item in an array position
        // Based on Mumford, C 2010 notes
        {
            int[] num = new int[10];
            int k = 10;
            int count = 0;
            BufferedReader input;
            String line;
            try
            {
                input = new BufferedReader(new FileReader(fileLoc));
                line = input.readLine();
                while (line != null)
                {
                    // dynamic array dimensions
                    // makes array storage bigger, if needed
                    if (count == (num.length - 1))
                    {
                        int n = num.length;
                        int[] original = num;
                        num = new int[n * 2];
                        for (int i = 0; i < n; i++)
                        {
                            num[i] = original[i];
                        }
                        original = null;
                    }
                    num[count] = Integer.parseInt(line);
                    line = input.readLine();
                    count++;
                }
                input.close();

                // Trim the array to the number of items actually read
                int[] original = num;
                num = new int[count];
                for (int i = 0; i < count; i++)
                {
                    num[i] = original[i];
                }
                original = null;
            }
            catch (IOException e)
            {
                System.out.println("Error: " + e.toString());
            }
            return num;
        }

        public void dumpArrayToFile(int[] input, int k) throws IOException
        // Takes an array and dumps it to file
        {
            try
            {
                PrintWriter output;
                output = new PrintWriter(new FileWriter("OUT" + fileLoc));
                for (int i = 0; i < range[k]; i++)
                {
                    output.println(input[i]);
                }
                output.close();
            }
            catch (IOException e)
            {
                System.out.println("Error: " + e.toString());
            }
        }

        public void generateRandomData() throws IOException
        // Generates a file filled with random integers
        {
            System.out.println("Generating random data...");
            try
            {
                PrintWriter output;
                int number;
                output = new PrintWriter(new FileWriter(fileLoc));
                for (int i = 0; i < numOfInts; i++)
                {
                    /* Debug output
                     * if (i == (numOfInts * 0.1)) System.out.print("10%...");
                    if (i == (numOfInts * 0.2)) System.out.print("20%...");
                    if (i == (numOfInts * 0.3)) System.out.print("30%...");
                    if (i == (numOfInts * 0.4)) System.out.print("40%...");
                    if (i == (numOfInts * 0.5)) System.out.print("50%...");
                    if (i == (numOfInts * 0.6)) System.out.print("60%...");
                    if (i == (numOfInts * 0.7)) System.out.print("70%...");
                    if (i == (numOfInts * 0.8)) System.out.print("80%...");
                    if (i == (numOfInts * 0.9)) System.out.print("90%...");*/
                    number = (int) ((Math.random() * generatorSize) + 1);
                    output.println(number);
                }
                output.close();
            }
            catch (IOException e)
            {
                System.out.println("Error: " + e.toString());
            }
            finally
            {
                /*System.out.println("100% done");*/
            }
        }

        public void generateAscendingData() throws IOException
        // Generates a file filled with integers in ascending order
        {
            System.out.println("Generating ascending data...");
            try
            {
                PrintWriter output;
                output = new PrintWriter(new FileWriter(fileLoc));
                for (int i = 0; i < numOfInts; i++)
                {
                    output.println(i);
                }
                output.close();
            }
            catch (IOException e)
            {
                System.out.println("Error: " + e.toString());
            }
            finally
            {
                System.out.println("100% done");
            }
        }

        public void generateDescendingData() throws IOException
        // Generates a file filled with integers in descending order
        {
            System.out.println("Generating descending data...");
            try
            {
                PrintWriter output;
                int number;
                output = new PrintWriter(new FileWriter(fileLoc));
                for (int i = numOfInts; i > 0; i--)
                {
                    output.println(i);
                }
                output.close();
            }
            catch (IOException e)
            {
                System.out.println("Error: " + e.toString());
            }
            finally
            {
                System.out.println("100% done");
            }
        }
    }

insertionSort.java

    public class insertionSort
    {
        private int item;
        private int j;
        private int n;
        private int[] output;
        private long counter = 0;

        public insertionSort(int[] input, int n)
        {
            // input: array input storing minimum of n integers
            // output: array output stored in ascending order
            // n: number of data elements to sort
            if (n > input.length) // Check range is available in the data
            {
                System.out.println("Out of range, only " + input.length + " integers available.");
            }
            else
            {
                // Insertion sort algorithm
                for (int i = 1; i <= (n - 1); i++)
                {
                    counter++;
                    item = input[i];
                    j = i - 1;
                    while ((j >= 0) && (input[j] > item))
                    {
                        counter++;
                        input[j + 1] = input[j];
                        j = j - 1;
                    }
                    input[j + 1] = item;
                }
                output = input;
            }
        }

        public int[] getArray()
        // Returns the completed array
        {
            return output;
        }

        public long getIterations()
        // Return timing information
        {
            return counter;
        }
    }

quickSort.java

    public class quickSort
    {
        private int[] output;
        private long counter;
        private int pivot;
        private boolean overRidePivot = false;

        public quickSort(int[] input, int lower, int upper, boolean worstCase)
        {
            if (worstCase) setPivot(2); // Override for worst case pivot placement
            sort(input, lower, upper);
        }

        public void sort(int[] input, int lower, int upper)
        {
            int index = partition(input, lower, upper);
            if (lower < index - 1)
            {
                sort(input, lower, index - 1);
            }
            if (index < upper)
            {
                sort(input, index, upper);
            }
            output = input;
        }

        public int partition(int input[], int lower, int upper)
        {
            int i = lower;
            int j = upper;
            int tmp;
            if (!overRidePivot)
            {
                pivot = input[(lower + upper) / 2];
            }
            else
            {
                overRidePivot = false;
            }
            while (i <= j)
            {
                counter++;
                while (input[i] < pivot)
                {
                    counter++;
                    i++;
                }
                while (input[j] > pivot)
                {
                    counter++;
                    j--;
                }
                if (i <= j)
                {
                    tmp = input[i];
                    input[i] = input[j];
                    input[j] = tmp;
                    i++;
                    j--;
                }
            }
            return i;
        }

        public void setPivot(int pivotIn)
        // Set pivot when overridden
        {
            overRidePivot = true;
            pivot = pivotIn;
        }

        public int[] getArray()
        // Return sorted array
        {
            return output;
        }

        public long getIterations()
        // Return timing information
        {
            return counter;
        }
    }