An algorithm that figures out and analyses the pattern of an array and then go for the best sorting algorithm possible for the array.
Objective
When it comes to sorting an array, we have many different algorithms to choose from but which algorithm to choose for a particular situation, to get optimal performance, depends on the input data for that situation.
Imagine an algorithm that takes an array as input, analyses it and figures out the best sorting algorithm for it. Yeah! We did the same in this project so that you need not care about choose the right algorithm for optimal performance anymore.
| Pattern | Description | Sort Algorithm |
|---|---|---|
| Repeated | Array containing many repeated elements. | Count Sort |
| Nearly Sorted | Array is mostly in ascending order. | Insertion Sort |
| Nearly Reversed | Array is mostly in descending order. | Heap Sort |
| Random | Array in random pattern or if the pattern doesn’t match with any of the above. | Quick Sort |
Technique Used - Random Sampling:
In this method, we give priority to performance over accuracy. However, we were still able to achieve 96.4% success rate with this method. Here, we randomly pick few elements from the array in the increasing order of the index and give scores for each of the four patterns. The pattern with highest score at the end is taken as the array pattern. The number of elements to be sampled is given by the formula log2n sets, each of size log5n.
| Array Size | Actual Pattern | Predicted Pattern | Sampled Size | Repeated Score | Reverse Score | Nearly Sorted Score | Random Score | Time Taken |
|---|---|---|---|---|---|---|---|---|
| 50000 | Repeated | Repeated | 112 | 138 | 60 | 51 | 73 | 13299259 |
| 50000 | Reverse | Reverse | 112 | 0 | 112 | -1 | 1 | 4342272 |
| 50000 | Nearly Sorted | Nearly Sorted | 112 | 0 | -1 | 112 | 1 | 2139414 |
| 50000 | Random | Random | 112 | 0 | 64 | 47 | 72 | 4027845 |
| 50000 | Repeated | Repeated | 112 | 164 | 59 | 52 | 74 | 21102568 |
| 50000 | Reverse | Reverse | 112 | 0 | 111 | 0 | 3 | 3830045 |
| 50000 | Nearly Sorted | Nearly Sorted | 112 | 0 | 1 | 110 | 5 | 1021657 |
| 50000 | Random | Random | 112 | 0 | 57 | 54 | 71 | 1723754 |
| 50000 | Repeated | Repeated | 112 | 79 | 52 | 59 | 76 | 8487678 |
| 50000 | Reverse | Reverse | 112 | 0 | 110 | 1 | 5 | 3957402 |
| 50000 | Nearly Sorted | Nearly Sorted | 112 | 0 | -1 | 112 | 1 | 1111693 |
| 50000 | Random | Random | 112 | 0 | 54 | 57 | 79 | 2218720 |
| 200000 | Repeated | Repeated | 144 | 136 | 74 | 69 | 97 | 14832677 |
| 200000 | Reverse | Reverse | 144 | 0 | 144 | -1 | 1 | 2772467 |
| 200000 | Nearly Sorted | Nearly Sorted | 144 | 0 | -1 | 144 | 1 | 1833383 |
| 200000 | Random | Random | 144 | 0 | 67 | 76 | 93 | 1575403 |
| 5000 | Repeated | Repeated | 78 | 122 | 41 | 36 | 55 | 9144990 |
| 5000 | Reverse | Reverse | 78 | 0 | 76 | 1 | 5 | 1162543 |
| 5000 | Nearly Sorted | Nearly Sorted | 78 | 0 | -1 | 78 | 1 | 1148080 |
| 5000 | Random | Random | 78 | 0 | 41 | 36 | 44 | 1253978 |
| 5000 | Repeated | Repeated | 78 | 78 | 35 | 42 | 50 | 6224173 |
| 5000 | Reverse | Reverse | 78 | 0 | 75 | 2 | 7 | 805662 |
| 5000 | Nearly Sorted | Nearly Sorted | 78 | 0 | 0 | 77 | 3 | 756212 |
| 5000 | Random | Random | 78 | 0 | 39 | 38 | 54 | 877971 |
| 5000 | Repeated | Repeated | 78 | 122 | 44 | 33 | 57 | 7720736 |
| 5000 | Reverse | Reverse | 78 | 0 | 73 | 4 | 11 | 868174 |
| 5000 | Nearly Sorted | Nearly Sorted | 78 | 0 | 5 | 72 | 11 | 1114958 |
| 5000 | Random | Random | 78 | 0 | 38 | 39 | 54 | 1255844 |
| 1000 | Repeated | Repeated | 50 | 72 | 22 | 27 | 34 | 8849224 |
| 1000 | Reverse | Reverse | 50 | 0 | 44 | 5 | 13 | 773473 |
| 1000 | Nearly Sorted | Nearly Sorted | 50 | 18 | 8 | 41 | 17 | 1207794 |
| 1000 | Random | Random | 50 | 0 | 29 | 20 | 31 | 756679 |
| 1000 | Repeated | Repeated | 50 | 90 | 30 | 19 | 32 | 5835103 |
| 1000 | Reverse | Reverse | 50 | 0 | 46 | 3 | 9 | 447382 |
| 1000 | Nearly Sorted | Nearly Sorted | 50 | 0 | 6 | 43 | 10 | 472574 |
| 1000 | Random | Random | 50 | 9 | 22 | 27 | 36 | 508963 |
| 1000 | Repeated | Repeated | 50 | 95 | 26 | 23 | 31 | 7780916 |
| 1000 | Reverse | Reverse | 50 | 0 | 48 | 1 | 5 | 770208 |
| 1000 | Nearly Sorted | Nearly Sorted | 50 | 0 | 2 | 47 | 7 | 647982 |
| 1000 | Random | Random | 50 | 0 | 23 | 26 | 35 | 773473 |
| 100 | Repeated | Repeated | 21 | 38 | 12 | 8 | 15 | 7350328 |
| 100 | Reverse | Reverse | 21 | 0 | 13 | 7 | 11 | 313028 |
| 100 | Nearly Sorted | Nearly Sorted | 21 | 0 | 2 | 18 | 7 | 249582 |
| 100 | Random | Random | 21 | 5 | 11 | 9 | 14 | 431055 |
| 100 | Repeated | Repeated | 21 | 136 | 8 | 12 | 10 | 5614911 |
| 100 | Reverse | Reverse | 21 | 0 | 18 | 2 | 7 | 380672 |
| 100 | Nearly Sorted | Nearly Sorted | 21 | 0 | 4 | 16 | 6 | 366677 |
| 100 | Random | Random | 21 | 5 | 12 | 8 | 17 | 444584 |
| 60 | Repeated | Repeated | 18 | 87 | 7 | 10 | 12 | 5831372 |
| 60 | Reverse | Repeated | 18 | 15 | 11 | 6 | 10 | 269643 |
| 60 | Nearly Sorted | Nearly Sorted | 18 | 0 | 5 | 12 | 11 | 221125 |
| 60 | Random | Random | 18 | 0 | 6 | 11 | 13 | 248183 |
| 60 | Repeated | Repeated | 18 | 99 | 6 | 11 | 9 | 6804045 |
| 60 | Reverse | Reverse | 18 | 5 | 14 | 3 | 9 | 281306 |
| 60 | Nearly Sorted | Nearly Sorted | 18 | 0 | 4 | 13 | 11 | 219259 |
| 60 | Random | Reverse | 18 | 0 | 9 | 8 | 8 | 243985 |
| 60 | Repeated | Repeated | 18 | 60 | 7 | 10 | 9 | 8722799 |
| 60 | Reverse | Reverse | 18 | 0 | 13 | 4 | 9 | 270576 |
| 60 | Nearly Sorted | Nearly Sorted | 18 | 0 | 2 | 15 | 7 | 253782 |
| 60 | Random | Random | 18 | 0 | 7 | 10 | 15 | 452981 |
Note: The efficiency of this algorithm can be felt only if the array size is huge. For smaller sized arrays, performing sort directly without prediction is much faster. During our analysis, we felt the impact only when the array size is greater than 5000.