Sorting

Stable Sort: When algo does not change sequence of similar content
INDEX 0 1 2 3 4 5 6 7 8 9
Before 35 33 42 10 14 19 26 44 26 31
After 10 14 19 26 26 31 33 35 42 44

Not Stable Sort: When algo changes sequence of similar content
INDEX 0 1 2 3 4 5 6 7 8 9
Before 35 33 42 10 14 19 26 44 26 31
After 10 14 19 26 26 31 33 35 42 44

InPlace: does not require extra space. E.g. Bubble Sort
NotInPlace: requires extra space. E.g. Merge Sort

Adaptive: When algo takes advantage of already sorted elements in list to be sorted. 
Non-Adaptive: When algo try to force every single element to be re-ordered even if they were sorted

Wikipedia classifies sorting algorithms as follows: [Wikipedia: Sorting Algorithm]
Simple Sorts: Insertion, Selection
Efficient Sorts: Merge, Heap, Quick
Bubble Sort variants: Bubble, Shell, Comb
Distribution Sort: Counting, Bucket, Radix.

Easy to implement but O(n^2) Algo
Bubblesort: Scan the array. Whenever two consecutive items are found that are out of order, swap them. Repeat until all consecutive items are in order.
Insertion sort: Assume that Arr[1..i − 1] have already been sorted. Insert Arr[i] into its proper position in this subarray, by shifting all larger elements to the right by one to make space for the new item.
Selection sort: Assume that Arr[1..i − 1] contain the i − 1 smallest elements in sorted order. Find the smallest element in Arr[i..n], and then swap it with Arr[i].
Merge sort is O(nlogn)

Bubble Sort  O(n^2). Inplace. 
for (i=0 ; i< N-1; i++)
    swapped=0;
    for(j=0; j< N-1-i; j++)
        if(arr[j] > arr[j+1])
            tmp=arr[j]; arr[j]=arr[j+1];  arr[j+1]=tmp; swapped=1;
    if(!swapped) break;

Optimized, 
1) when no swap takes place in an iteration, stop algo.
2) After every iteration, highest values settles at the end. Thus, inner loop need not run full, can stop at N-1-i.

Insertion Sort  O(n^2) Inplace.
Algo:
  • If it is first elem, its already sorted, return 1
  • Pick next elem, compare with all elem in sorted subarr
  • Shift all elem in sorted subarr that is > value to be sorted
  • Insert value
  • Repeat until arr is sorted
It compares two elements and keep a sorted sub array on left side. When sub-array becomes unsorted, it makes it sorted again.
E.g.:
arr= 14,33,27,10,35,19,42,44. total elements 8.
Step 1: arr= 14,33,27,10,35,19,42,44  => arr= 14,33,27,10,35,19,42,44
Step 2: arr= 14,33,27,10,35,19,42,44  => arr= 14,27,33,10,35,19,42,44
Step 3: arr= 14,27,33,10,35,19,42,44  => arr= 14,27,10,33,35,19,42,44 
     => arr => 14,10,27,33,35,19,42,44 => arr= 10,14,27,33,35,19,42,44 
keep doing it until entire array is sorted.

for (i=1; i<N; i++)
    valToInsert=arr[i]
    holepos=i
    while(holepos>0 && arr[holepos-1] > valToInsert)
        arr[holepos] = arr[holepos-1];  holepos--; //arr[holdpos] moved.

    if(holepos!=i)
        arr[holepos]=valToInsert;  //Item valToInsert added at holepos
        

Selection Sort   O(n^2) InPlace
Algo
  • Set min_idx to 0
  • Search min elem in arr, swap with value at min_idx
  • Increment min_idx to point to next elem
  • Repeat until sorted.
Scan full list to find minimum element, swap with first element if required
Scan remaining list to find minimum element, swap with second element if required.
Continue.. each iteration, that correct element for that position is found and placed.

arr= 14,33,27,10,35,19,42,44.
Step 1: min=10, swap 14,10 => 10,33,27,14,35,19,42,44
Step 2: next min=14, swap 33,14 => 10,14,27,33,35,19,42,44

Step 2: next min=19, swap 27,19=> 10,14,19,33,35,27,42,44

for( i=0; i<N-1; i++)
    min_idx=i;
    for(j=i+1; j<N; j++)
        if (arr[j] < arr[min_idx])  min_idx=j;

    if(min_idx!=i)
        tmp=arr[min_idx]; arr[min_idx]=arr[i]; arr[i]=tmp; //swap

Merge Sort   O(nlogn) Divide&Conquer  NotInPlace
Divides array into equal halves and then combines them in sorted manner.
Algo:
  • Return 1 if it is only element in arr as this is already sorted
  • Divide arr recursively into two halves until further divide not possible
  • Merge smaller arr into new arr in sorted way
divide array into two equal halves:  arr= 14,33,27,10,35,19,42,44. 
keep doing it until array can no more be divided
then merge back in same way as were divided by sorting each division
arr1= 14,33,27,10   arr2= 35,19,42,44 
arr1= 14,33,  arr2=27,10   arr3= 35,19   arr4=42,44         
arr1= 14 arr2=33,  arr3=27 arr4=10   arr5= 35  arr6=19   arr7=42   arr8=44  [no further division possible]
14,33     27,10    35,19     42,44  =>  14,33    10,27    19,35    42,44     [sort 2 pair]
14,33,10,27        19,35,42,44      =>   10,14,27,33       19,35,42,44       [sort 4 pair]
10,14,27,33,19,35,42,44             => 10,14,19,27,33,35,42,44               [sort 8 pair]

void mergesort(int low, int high)
    if (low<high)
        mid=(low+high)/2
        mergesort(low, mid);
        mergesort(mid+1, high)
        merging (low, mid, high)
    else return

void merging(int low, int mid, int high)
    for(low1=low, low2=mid+1, i=low;  low1<=mid && low2<=high;  i++)
        if(arr[low1]<=arr[low2)  b[i] = a[low1++]
        else                                  b[i]= a[low2++]
    while(low1<=mid)  b[i++] = a[low1++];
    while(low2<=high)  b[i++] = a[low2++];
    for(i=low; i<=high; i++)  a[i] = b[i];

Shell Sort    O(n)  Efficient For Mid-size data.   
It uses insertion sort on widely spread elements first, then sort less widely spread elements.
spacing/interval is calculated with Knuth's formula: h=h*3+1;  h=interval with 1 initial value

Algo:
  • Initialize h (interval)
  • Divide arr into smaller subarr of equal interval h
  • Sort subarr using insertion sort
  • Repeat until complete array sorted
arr= 14,33,27,10,35,19,42,44.   interval h=4, club and generate subarr
sub arr= {14,35}, {33,19}, {27,42}, {10,44}  compare each value, swap if required in original arr  =>  arr=14,19,27,10,35,33,42,44

now, interval=2, get 2 sub-arrays {8 elem/2=4 in each}
sub arr= {14,27,35,42}  {19,10,33,44} compare each subarray and update original array if reqd
arr=14,19,27,10,35,33,42,44  [no swaps required in this step]

now, interval=1, get 1 sub-array (full size) and use insertion sort.
Step 1: 14,19,27,10,35,33,42,44
Step 2: 14,19,27,10,35,33,42,44
Step 3: 14,19,27,10,35,33,42,44
Step 4: 14,19,27,10,35,33,42,44
Step 5: 14,19,10,27,35,33,42,44
Step 6: 14,10,19,27,35,33,42,44
Step 7: 10,14,19,27,35,33,42,44
Step 8: 14,19,27,10,35,33,42,44
Step 9: 14,19,27,10,33,35,42,44
Step 10: 14,19,27,10,35,33,42,44
Just 4 swaps to sort array.

void shellsort()
    while(h <= elem/3)  h=h*3+1;
    while(h > 0) 
        for(outer=h; outer<elements; outer++)
            valToInsert=arr[outer]
            inner=outer;
            while(inner > h-1 && arr[inner-h] > valToInsert) //shift elem to right
                arr[inner] = arr[inner-h]
                inner -= h
            arr[inner]=valToInsert //insert at hole
        h=(h-1)/3; //calculate interval
        i++;

Quick Sort   O(nlogn)

Algo: [Quicksort]
  • Make right most index value pivot
  • Partition array using pivot
  • Quicksort left and right partition recursively.
Algo: [Quicksort pivot]
  • Make highest index value as pivot
  • Partition array using pivot [two var, left and right of arr, excluding pivot]
  • while value@left<pivot, move left to right side
  • while value@right>pivot, move right to left side
  • if both previous steps fail, swap element@ left and right
  • if left>=right, point where they met is new pivot.
  • Quicksort left and right partition recursively.
void quicksort(int left, int right)
    if(right-left <=0) return
    int pivot = arr[right];
    int partitionpivot = partition(left, right, pivot)
    quicksort(left, partitionpivot-1)
    quicksort(partitionpivot+1, right)

int partition(int left, int right, int pivot)   //quicksort pivot
    while (1)
        int leftptr = left-1, rightptr = right;
        while(arr[++leftptr] < pivot)  {}
        while(rightptr>0 && arr[--rightptr]>pivot) {}

        if(leftptr >= rightptr) break;
        else  swap(arr[leftptr], arr[rightptr])
    swap(arr[leftptr], arr[right])
    return leftptr

The story begins here, above was theory to understand basic sorting algorithms.

Bucket Sort
When input is distributed over a range, e.g. set of floating numbers uniformly distributed over 0-10.0.
Merge, Heap, Quick comparison based sort need nlogn time minimum. Counting sort can not be used as keys are floating number here.
Algo:

  • Create n empty buckets/lists
  • For each element arr[i]:
  •   Insert arr[i] to bucket[n*arr[i]];
  • Sort each bucket using insertion sort
  • Contatinate all buckets

void bucketsort(float arr[], int size)
{
    vector<float> bucket[size];  int idx=0;
    for(int i=0;i<size;i++) { int bi=size*arr[i];  bucket[bi].push_back(arr[i]); }
    for(int i=0;i<size;i++) { sort(bucket[i].begin(), bucket[i].end()); }
    for(int i=0;i<size;i++)
        for(int j=0;j<bucket[i].size();j++)
            arr[idx++]=bucket[i][j];
}
 
Counting and Radix sort are not comparison based algo, very good for sorting small integers.

Counting Sort
Arr[0..n] with each element in ranve 0..(r-1). uses auxiliary array cnt of counters and outputs sorted version of arr[[ as b[].

int[] countingsort(int[] arr, int r)
{
    int cnt[]=new int [r];
    int b[]=new int[a.length];
    int len=arr.length;

    for(int i=0;i<len;i++)  cnt[a[i]]++;
    for(int i=1;i<r;i++)  cnt[i]+=cnt[i-1];
    for(int i=len-1;i>=0;i--)  b[--cnt[a[i]]] = arr[i];
    return b;
}

Reference

No comments:

Post a Comment