1 CSCI 104 Sorting Algorithms Mark Redekopp David Kempe
2 Algorithm Efficiency SORTING
3 Sorting
If we have an unordered list, sequential search becomes our only choice
If we will perform a lot of searches, it may be beneficial to sort the list, then use binary search
Many sorting algorithms of differing complexity (i.e. faster or slower)
Sorting provides a "classical" study of algorithm analysis because there are many implementations with different pros and cons
[Figure: an original (unsorted) list at indices 0-5 and its sorted version 1 3 5 6 7 8]
4 Applications of Sorting
Find the set_intersection of the 2 lists to the right
  [Figure: unsorted lists A and B at indices 0-5; B = 9 3 4 2 7 8]
  How long does it take?
Try again now that the lists are sorted
  [Figure: sorted lists at indices 0-5; A = 1 3 5 6 7 8, B = 2 3 4 7 8 9]
  How long does it take?
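With unsorted lists, each item of A has to be compared against every item of B, which is O(n*m) work. Once both lists are sorted, a single pass with two read indices is enough, which is O(n+m). The sketch below illustrates the sorted case; the function name `intersect_sorted` is hypothetical, and it assumes both inputs are already sorted in ascending order.

```cpp
#include <cstddef>
#include <vector>

// Intersection of two ascending-sorted vectors using two read indices.
// Hypothetical helper; the slides only pose the question.
std::vector<int> intersect_sorted(const std::vector<int>& a,
                                  const std::vector<int>& b)
{
  std::vector<int> out;
  std::size_t i = 0, j = 0;
  while (i < a.size() && j < b.size()) {
    if (a[i] < b[j]) {
      ++i;                   // a[i] is smaller than everything left in b
    } else if (b[j] < a[i]) {
      ++j;                   // b[j] is smaller than everything left in a
    } else {
      out.push_back(a[i]);   // match: keep it and advance both indices
      ++i;
      ++j;
    }
  }
  return out;
}
```

On the sorted lists from the slide (A = 1 3 5 6 7 8, B = 2 3 4 7 8 9) this yields 3 7 8. The standard library offers the same operation as `std::set_intersection` in `<algorithm>`.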
5 Sorting Stability
A sort is stable if the order of equal items in the original list is maintained in the sorted list
  Good for searching with multiple criteria
  Example: Spreadsheet search
    List of students in alphabetical order first
    Then sort based on test score
    I'd want students with the same test score to appear in alphabetical order still
As we introduce you to certain sort algorithms, consider if they are stable or not
  List index:        0    1    2    3    4
  Original:          7,a  3,b  5,e  8,c  5,d
  Stable sorting:    3,b  5,e  5,d  7,a  8,c
  Unstable sorting:  3,b  5,d  5,e  7,a  8,c
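One way to see stability in practice is `std::stable_sort`, which is guaranteed to keep equal items in their input order. This is a minimal sketch rather than anything from the slides: the `Record` alias and the function name `sort_by_score` are assumptions made for illustration.

```cpp
#include <algorithm>
#include <string>
#include <utility>
#include <vector>

using Record = std::pair<int, std::string>;  // (score, label), hypothetical type

// Sort by score only. Because the comparison ignores the label,
// records with equal scores keep the order they had on input.
void sort_by_score(std::vector<Record>& records)
{
  std::stable_sort(records.begin(), records.end(),
                   [](const Record& a, const Record& b) {
                     return a.first < b.first;
                   });
}
```

Applied to the slide's list 7,a 3,b 5,e 8,c 5,d, the result is 3,b 5,e 5,d 7,a 8,c: 5,e stays ahead of 5,d. Plain `std::sort` makes no such promise.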
6 Bubble Sorting
Main Idea: Keep comparing neighbors, moving the larger item up and the smaller item down until the largest item is at the top. Repeat on a list of size n-1
Have one loop to count each pass (a.k.a. i) to identify which index we need to stop at
Have an inner loop start at the lowest index and count up to the stopping location, comparing neighboring elements and advancing the larger of the neighbors
  Original:      7 3 8 6 5 1
  After Pass 1:  3 7 6 5 1 8
  After Pass 2:  3 6 5 1 7 8
  After Pass 3:  3 5 1 6 7 8
  After Pass 4:  3 1 5 6 7 8
  After Pass 5:  1 3 5 6 7 8
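7 Bubble Sort Algorithm
    // list is passed by reference so the caller sees the sorted result
    void bsort(vector<int>& mylist)
    {
      int i;
      for(i=mylist.size()-1; i > 0; i--){   // i = last index of the unsorted part
        for(int j=0; j < i; j++){
          if(mylist[j] > mylist[j+1]) {     // larger neighbor moves up
            swap(mylist[j], mylist[j+1]);
          }
        }
      }
    }
  Pass 1:    7 3 8 6 5 1  swap
             3 7 8 6 5 1  no swap
             3 7 8 6 5 1  swap
             3 7 6 8 5 1  swap
             3 7 6 5 8 1  swap
             3 7 6 5 1 8
  Pass 2:    3 7 6 5 1 8  no swap
             3 7 6 5 1 8  swap
             3 6 7 5 1 8  swap
             3 6 5 7 1 8  swap
             3 6 5 1 7 8
  Pass n-2:  3 1 5 6 7 8  swap
             1 3 5 6 7 8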
8 Bubble Sort
[Figure: bubble sort animation plotting value against list index; courtesy of wikipedia.org]
10 Bubble Sort Analysis
Best Case Complexity: When already sorted (no swaps) but still have to do all compares: O(n^2)
Worst Case Complexity: When sorted in descending order: O(n^2)
Code: bsort, as on slide 7
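The O(n^2) best case comes from bsort always running every pass. A common variation, not the version on the slide, stops as soon as a pass makes no swaps, which brings the best case down to one pass. The sketch below returns the number of comparisons so the difference can be observed; the name `bsort_early_exit` is hypothetical.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Bubble sort that stops once a full pass makes no swaps.
// Returns the number of element comparisons performed.
std::size_t bsort_early_exit(std::vector<int>& mylist)
{
  std::size_t compares = 0;
  for (std::size_t i = mylist.size(); i > 1; --i) {  // i = size of unsorted part
    bool swapped = false;
    for (std::size_t j = 0; j + 1 < i; ++j) {
      ++compares;
      if (mylist[j] > mylist[j + 1]) {
        std::swap(mylist[j], mylist[j + 1]);
        swapped = true;
      }
    }
    if (!swapped) break;  // no swaps means the list is already sorted
  }
  return compares;
}
```

For six items, an already sorted list needs 5 comparisons (one pass), while a descending list still needs 5+4+3+2+1 = 15.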
11 Loop Invariants
A loop invariant is a statement about what is true either before an iteration begins or after one ends
Consider bubble sort and look at the data after each iteration (pass)
What can we say about the patterns of data after the k-th iteration?
Code and pass-by-pass trace: bsort, as on slide 7
13 Loop Invariants
What is true after the k-th iteration?
All data at indices n-k and above are sorted: $\forall i,\ n-k \le i < n-1:\ a_i \le a_{i+1}$
All data at indices below n-k are less than (or equal to) the value at n-k: $\forall i,\ i < n-k:\ a_i \le a_{n-k}$
Code and pass-by-pass trace: bsort, as on slide 7
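The two claims can be checked mechanically by asserting them after every pass. This is a sketch under assumptions: the helper names `bubble_invariant_holds` and `bsort_checked` are made up for illustration, passes are counted from k = 1, and the comparisons use <= so lists with duplicate values also pass.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Checks both claims after k passes (1 <= k < n) over a list of size n.
bool bubble_invariant_holds(const std::vector<int>& a, std::size_t k)
{
  const std::size_t n = a.size();
  for (std::size_t i = n - k; i + 1 < n; ++i)  // indices n-k and above are sorted
    if (a[i] > a[i + 1]) return false;
  for (std::size_t i = 0; i < n - k; ++i)      // everything below n-k is <= a[n-k]
    if (a[i] > a[n - k]) return false;
  return true;
}

// Bubble sort that asserts the invariant at the end of every pass.
void bsort_checked(std::vector<int>& mylist)
{
  const std::size_t n = mylist.size();
  for (std::size_t k = 1; k < n; ++k) {        // k-th pass
    for (std::size_t j = 0; j + k < n; ++j)    // compare neighbors up to index n-k
      if (mylist[j] > mylist[j + 1]) std::swap(mylist[j], mylist[j + 1]);
    assert(bubble_invariant_holds(mylist, k));
  }
}
```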
14 Selection Sort
Selection sort does away with the many swaps and just records where the min or max value is and performs one swap at the end (of each pass)
The list/array can again be thought of in two parts
  Sorted
  Unsorted
The problem starts with the whole array unsorted and slowly the sorted portion grows
We could find the max and put it at the end of the list, or we could find the min and put it at the start of the list
  Just for variation let's choose the min approach
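15 Selection Sort Algorithm
    // list is passed by reference so the caller sees the sorted result
    void ssort(vector<int>& mylist)
    {
      for(int i=0; i < mylist.size()-1; i++){
        int min = i;                          // index of smallest item seen so far
        for(int j=i+1; j < mylist.size(); j++){
          if(mylist[j] < mylist[min]) {
            min = j;
          }
        }
        swap(mylist[i], mylist[min]);         // one swap per pass
      }
    }
  Pass 1:    on 7 3 8 6 5 1, min goes 0, 1, 1, 1, 5; after the swap: 1 3 8 6 5 7
  Pass 2:    min stays at 1; the list remains 1 3 8 6 5 7
  Pass n-2:  min starts at 4; the list ends as 1 3 5 6 7 8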
16 Selection Sort
[Figure: selection sort animation plotting value against list index; courtesy of wikipedia.org]
18 Selection Sort Analysis
Best Case Complexity: Sorted already: O(n^2)
Worst Case Complexity: When sorted in descending order: O(n^2)
Code: ssort, as on slide 15
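Counting the work makes the analysis concrete: the inner loop always scans the whole unsorted part, so the comparison count is n(n-1)/2 no matter how the input is ordered, while the swap count is at most n-1. The sketch below is an instrumented variant, not the slide's code; `SortCounts` and `ssort_counted` are hypothetical names, and unlike ssort it skips the swap when the minimum is already in place.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

struct SortCounts {
  std::size_t compares;
  std::size_t swaps;
};

// Selection sort (min approach) that reports how much work it did.
SortCounts ssort_counted(std::vector<int>& mylist)
{
  SortCounts c{0, 0};
  for (std::size_t i = 0; i + 1 < mylist.size(); ++i) {
    std::size_t min = i;                       // index of smallest item so far
    for (std::size_t j = i + 1; j < mylist.size(); ++j) {
      ++c.compares;
      if (mylist[j] < mylist[min]) min = j;
    }
    if (min != i) {                            // skip swapping an item with itself
      std::swap(mylist[i], mylist[min]);
      ++c.swaps;
    }
  }
  return c;
}
```

For six items the comparison count is 15 for both a sorted and an unsorted input, which is why the best and worst cases are both O(n^2).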
20 Loop Invariant
What is true after the k-th iteration (the pass with i = k)?
All data at indices less than k are sorted: $\forall i,\ i < k:\ a_i \le a_{i+1}$
All data at indices k and above are greater than (or equal to) the value at k: $\forall i,\ i \ge k:\ a_k \le a_i$
Code and pass-by-pass trace: ssort, as on slide 15
21 Insertion Sort Algorithm
Imagine we pick up one element of the array at a time and then just insert it into the right position
Similar to how you sort a hand of cards in a card game
  You pick up the first (it is by nature sorted)
  You pick up the second and insert it at the right position, etc.
  Start:     ? ? ? ? ?
  1st card:  7 ? ? ? ?
  2nd card:  7 3 ? ? ?  ->  3 7 ? ? ?
  3rd card:  3 7 8 ? ?  ->  3 7 8 ? ?
  4th card:  3 7 8 6 ?  ->  3 6 7 8 ?
  5th card:  3 6 7 8 5  ->  3 5 6 7 8
22 Insertion Sort Algorithm
    // list is passed by reference so the caller sees the sorted result
    void isort(vector<int>& mylist)
    {
      for(int i=1; i < mylist.size(); i++){
        int val = mylist[i];                  // item being inserted
        int hole = i;                         // open slot it may move into
        while(hole > 0 && val < mylist[hole-1]){
          mylist[hole] = mylist[hole-1];      // shift larger item right
          hole--;
        }
        mylist[hole] = val;
      }
    }
  (h marks the hole)
  Pass 1, val=3:  7 7 8 6 5 1  ->  3 7 8 6 5 1
  Pass 2, val=8:  3 7 8 6 5 1  (no shift)
  Pass 3, val=6:  3 7 8 8 5 1  ->  3 7 7 8 5 1  ->  3 6 7 8 5 1
  Pass 4, val=5:  3 6 7 8 8 1  ->  3 6 7 7 8 1  ->  3 6 6 7 8 1  ->  3 5 6 7 8 1
23 Insertion Sort
[Figure: insertion sort animation plotting value against list index; courtesy of wikipedia.org]
25 Insertion Sort Analysis
Best Case Complexity: Sorted already (one compare per item, no shifts): O(n)
Worst Case Complexity: When sorted in descending order (every item shifts past all earlier ones): O(n^2)
Code: isort, as on slide 22 (the outer loop must run while i < mylist.size(), not size()-1, or the last item is never inserted)
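The gap between the best and worst case shows up directly in the number of shifts the inner loop performs. The sketch below is an instrumented copy of the algorithm; the name `isort_counted` is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Insertion sort that returns how many items were shifted one slot right.
std::size_t isort_counted(std::vector<int>& mylist)
{
  std::size_t shifts = 0;
  for (std::size_t i = 1; i < mylist.size(); ++i) {
    int val = mylist[i];                 // item being inserted
    std::size_t hole = i;                // slot it may move into
    while (hole > 0 && val < mylist[hole - 1]) {
      mylist[hole] = mylist[hole - 1];   // shift the larger item right
      --hole;
      ++shifts;
    }
    mylist[hole] = val;
  }
  return shifts;
}
```

For six items, a sorted list needs 0 shifts and a descending list needs 1+2+3+4+5 = 15. The slide's example 7 3 8 6 5 1 needs 11, one per out-of-order pair.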
27 Loop Invariant
What is true after the k-th iteration?
All data at indices less than k+1 are sorted: $\forall i,\ i < k:\ a_i \le a_{i+1}$
Can we make a claim about data at k+1 and beyond?
  No, it's not guaranteed to be smaller or larger than what is in the sorted list
Code and pass-by-pass trace: isort, as on slide 22
MERGESORT 28
29 Exercise
http://bits.usc.edu/websheets/?folder=cpp/cs104&start=merge&auth=google#merge
30 Merge Two Sorted Lists
Consider the problem of merging two sorted lists into a new combined sorted list
Can be done in O(n)
Can we merge in place or need an output array?
  Input lists:    3 7  and  6 8
  Merged result:  3 6 7 8
[Figure: five steps of the merge; read pointers r1 and r2 advance through the two input lists while write pointer w fills the output 3 6 7 8]
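The r1/r2/w pointers in the figure map directly onto three indices. This is a minimal sketch that writes into a separate output vector; the function name `merge_sorted` is hypothetical and both inputs are assumed to be sorted ascending.

```cpp
#include <cstddef>
#include <vector>

// Merge two ascending-sorted vectors into a new sorted vector in O(n).
std::vector<int> merge_sorted(const std::vector<int>& a,
                              const std::vector<int>& b)
{
  std::vector<int> out(a.size() + b.size());
  std::size_t r1 = 0, r2 = 0, w = 0;      // two read indices, one write index
  while (r1 < a.size() && r2 < b.size()) {
    if (b[r2] < a[r1]) out[w++] = b[r2++];
    else               out[w++] = a[r1++]; // ties come from a, keeping the merge stable
  }
  while (r1 < a.size()) out[w++] = a[r1++];  // copy whatever is left over
  while (r2 < b.size()) out[w++] = b[r2++];
  return out;
}
```

Each element is read once and written once, which is where the O(n) bound comes from. Using an output array keeps the code simple; merging in place is possible but considerably more involved.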
31 Recursive Sort (MergeSort)
Break sorting problem into smaller sorting problems and merge the results at the end
Mergesort(0..n)
  If list is size 1, return
  Else
    Mergesort(0..n/2-1)
    Mergesort(n/2..n)
    Combine each sorted list of n/2 elements into a sorted n-element list
[Figure: recursion tree over indices 0-7: Mergesort(0,8) calls Mergesort(0,4) and Mergesort(4,8), which call Mergesort(0,2), Mergesort(2,4), Mergesort(4,6), Mergesort(6,8)]
  Sorted pairs:   3 7 | 6 8 | 5 10 | 2 4
  Sorted halves:  3 6 7 8 | 2 4 5 10
  Final merge:    2 3 4 5 6 7 8 10
32 Recursive Sort (MergeSort)
Run-time analysis
  # of recursion levels = log2(n)
  Total operations to merge each level = n operations total to merge two lists over all recursive calls at a particular level
Mergesort = O(n * log2(n))
  Usually has high constant factors due to extra array needed for merge
[Figure: recursion tree, as on slide 31]
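34 MergeSort Run Time
Let's prove this more formally:
$$
\begin{aligned}
T(1) &= \Theta(1) \\
T(n) &= 2\,T(n/2) + \Theta(n) && (k=1) \\
T(n/2) &= 2\,T(n/4) + \Theta(n/2) \\
T(n) &= 2 \cdot 2\,T(n/4) + 2\,\Theta(n) && (k=2) \\
     &= 8\,T(n/8) + 3\,\Theta(n) && (k=3) \\
     &= 2^k\,T(n/2^k) + k\,\Theta(n)
\end{aligned}
$$
Stop at T(1), i.e. when $n = 2^k$, so $k = \log_2 n$:
$$
\begin{aligned}
T(n) &= 2^k\,T(n/2^k) + k\,\Theta(n) \\
     &= 2^{\log_2 n}\,\Theta(1) + \log_2 n \cdot \Theta(n) \\
     &= n\,\Theta(1) + \log_2 n \cdot \Theta(n) \\
     &= \Theta(n \log_2 n)
\end{aligned}
$$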
35 Merge Sort
[Figure: merge sort animation plotting value against list index; courtesy of wikipedia.org]
36 Recursive Sort (MergeSort)
    void mergesort(vector<int>& mylist)
    {
      vector<int> other(mylist); // copy of array
      // use other as the source array, mylist as the output array
      msort(other, mylist, 0, mylist.size() );
    }
    // sorts the range [start,end), where end is exclusive
    void msort(vector<int>& mylist, vector<int>& output, int start, int end)
    {
      // base case: 0 or 1 items (testing start >= end would recurse forever on 1 item)
      if(start+1 >= end) return;
      // recursive calls
      int mid = (start+end)/2;
      msort(mylist, output, start, mid);
      msort(mylist, output, mid, end);
      // merge
      merge(mylist, output, start, mid, mid, end);
    }
    void merge(vector<int>& mylist, vector<int>& output,
               int s1, int e1, int s2, int e2)
    {
      ...
    }
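The slide leaves the body of merge elided, so the following is only one possible way to complete the picture, not the course's implementation. It is a sketch under stated assumptions: the names `merge_halves`, `msort_range`, and `mergesort_sketch` are hypothetical, ranges are half-open [start,end), and each merge writes into a scratch vector and then copies the result back instead of alternating source and output arrays.

```cpp
#include <cstddef>
#include <vector>

// Merge sorted ranges [s1,e1) and [s2,e2) of src into out, starting at index s1.
// Assumes the two ranges are adjacent (e1 == s2).
void merge_halves(const std::vector<int>& src, std::vector<int>& out,
                  std::size_t s1, std::size_t e1,
                  std::size_t s2, std::size_t e2)
{
  std::size_t w = s1;
  while (s1 < e1 && s2 < e2) {
    if (src[s2] < src[s1]) out[w++] = src[s2++];
    else                   out[w++] = src[s1++];
  }
  while (s1 < e1) out[w++] = src[s1++];
  while (s2 < e2) out[w++] = src[s2++];
}

// Sort [start,end) of mylist, using scratch as temporary space.
void msort_range(std::vector<int>& mylist, std::vector<int>& scratch,
                 std::size_t start, std::size_t end)
{
  if (end - start < 2) return;                 // 0 or 1 items: already sorted
  std::size_t mid = start + (end - start) / 2;
  msort_range(mylist, scratch, start, mid);    // divide and recurse
  msort_range(mylist, scratch, mid, end);
  merge_halves(mylist, scratch, start, mid, mid, end);   // combine
  for (std::size_t i = start; i < end; ++i) mylist[i] = scratch[i];
}

void mergesort_sketch(std::vector<int>& mylist)
{
  std::vector<int> scratch(mylist.size());     // the extra array merge needs
  msort_range(mylist, scratch, 0, mylist.size());
}
```

The copy-back after every merge costs extra moves but keeps the data flow easy to follow; swapping the roles of the two arrays at each level avoids that copy.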
37 Divide & Conquer Strategy
Mergesort is a good example of a strategy known as "divide and conquer"
3 Steps:
  Divide: Split problem into smaller versions (usually partition the data somehow)
  Recurse: Solve each of the smaller problems
  Combine: Put solutions of smaller problems together to form larger solution
Another example of Divide and Conquer?
  Binary Search
QUICKSORT 38
39 Partition & QuickSort
Partition algorithm (arbitrarily) picks one number as the 'pivot' and puts it into the 'correct' location
[Figure: unsorted numbers between left and right become: items < pivot | p | items > pivot]
    // list is passed by reference so the caller sees the partitioned result
    int partition(vector<int>& mylist, int start, int end, int p)
    {
      int pivot = mylist[p];
      swap(mylist[p], mylist[end]);   // move pivot out of the way for now
      int left = start; int right = end-1;
      while(left < right){
        while(mylist[left] <= pivot && left < right) left++;
        while(mylist[right] >= pivot && left < right) right--;
        if(left < right) swap(mylist[left], mylist[right]);
      }
      if(mylist[right] > mylist[end]) {     // put pivot in
        swap(mylist[right], mylist[end]);   // correct place
        return right;
      }
      else { return end; }
    }
  Partition(mylist,0,5,5), pivot p = 7:
    3 6 8 1 5 7   l=0, r=4
    3 6 8 1 5 7   l stops at 8, r stays at 5
    3 6 5 1 8 7   after swapping 8 and 5
    3 6 5 1 8 7   l and r meet at index 4
    3 6 5 1 7 8   pivot swapped into place
Note: end is inclusive in this example
40 QuickSort
Use the partition algorithm as the basis of a sort algorithm
Partition on some number and then recursively call on both sides
[Figure: items < pivot | p | items > pivot]
    // range is [start,end] where end is inclusive
    void qsort(vector<int>& mylist, int start, int end)
    {
      // base case: list has 1 or fewer items
      if(start >= end) return;
      // pick a random pivot location in [start..end]
      int p = start + rand() % (end-start+1);
      // partition
      int loc = partition(mylist,start,end,p);
      // recurse on both sides
      qsort(mylist,start,loc-1);
      qsort(mylist,loc+1,end);
    }
Partition trace: as on slide 39
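Putting the two pieces together gives a complete sort. The sketch below follows the slides' partition scheme with an inclusive end index. The names `partition_range` and `quicksort` are chosen here (hypothetically) to avoid clashing with `std::partition` and the C library's `qsort`, and the pivot index is drawn from the current range with `std::rand`.

```cpp
#include <cstdlib>
#include <utility>
#include <vector>

// Partition the inclusive range [start,end] around the value at index p.
// Returns the pivot's final index.
int partition_range(std::vector<int>& mylist, int start, int end, int p)
{
  int pivot = mylist[p];
  std::swap(mylist[p], mylist[end]);     // park the pivot at the end
  int left = start;
  int right = end - 1;
  while (left < right) {
    while (mylist[left] <= pivot && left < right) left++;
    while (mylist[right] >= pivot && left < right) right--;
    if (left < right) std::swap(mylist[left], mylist[right]);
  }
  if (mylist[right] > mylist[end]) {     // meeting point belongs right of the pivot
    std::swap(mylist[right], mylist[end]);
    return right;
  }
  return end;                            // everything else is <= pivot
}

// Sort the inclusive range [start,end].
void quicksort(std::vector<int>& mylist, int start, int end)
{
  if (start >= end) return;              // 1 or fewer items
  int p = start + std::rand() % (end - start + 1);   // random index in [start,end]
  int loc = partition_range(mylist, start, end, p);
  quicksort(mylist, start, loc - 1);
  quicksort(mylist, loc + 1, end);
}
```

A call such as `quicksort(v, 0, static_cast<int>(v.size()) - 1)` sorts the whole vector. `std::rand` is used only to mirror the slide; `<random>` gives better-distributed pivots.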
41 Quick Sort
[Figure: quick sort animation plotting value against list index; courtesy of wikipedia.org]
43 QuickSort Analysis
Worst Case Complexity: When pivot chosen ends up being min or max item
  Runtime: T(n) = Θ(n) + T(n-1)
  Example: 3 6 8 1 5 7  ->  3 6 1 5 7 8
Best Case Complexity: Pivot point chosen ends up being the median item
  Runtime: Similar to MergeSort, T(n) = 2T(n/2) + Θ(n)
  Example: 3 6 8 1 5 7  ->  3 1 5 6 8 7
45 QuickSort Analysis
Worst Case Complexity: When pivot chosen ends up being max or min of each list: O(n^2)
Best Case Complexity: Pivot point chosen ends up being the middle item: O(n*lg(n))
Average Case Complexity: O(n*log(n))
  Randomly choose a pivot
Pivot and quicksort can be slower on small lists than something like insertion sort
  Many quicksort algorithms use pivot and quicksort recursively until lists reach a certain size and then use insertion sort on the small pieces
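The cutoff idea can be sketched in a few lines. Everything specific here is an assumption made for illustration: the threshold `kCutoff = 8`, the names `isort_range` and `hybrid_qsort`, the choice of the middle item as pivot, and the simple single-scan partition, which is a different scheme from the two-pointer partition on the earlier slide.

```cpp
#include <utility>
#include <vector>

const int kCutoff = 8;  // hypothetical threshold; real libraries tune this value

// Insertion sort on the inclusive range [start,end].
void isort_range(std::vector<int>& a, int start, int end)
{
  for (int i = start + 1; i <= end; ++i) {
    int val = a[i];
    int hole = i;
    while (hole > start && val < a[hole - 1]) {
      a[hole] = a[hole - 1];
      --hole;
    }
    a[hole] = val;
  }
}

// Quicksort on the inclusive range [start,end] that hands small
// pieces to insertion sort instead of recursing further.
void hybrid_qsort(std::vector<int>& a, int start, int end)
{
  if (end - start + 1 <= kCutoff) {     // small piece: finish with insertion sort
    isort_range(a, start, end);
    return;
  }
  std::swap(a[start + (end - start) / 2], a[end]);   // middle item becomes the pivot
  int pivot = a[end];
  int loc = start;                      // next slot for an item smaller than the pivot
  for (int i = start; i < end; ++i)
    if (a[i] < pivot) std::swap(a[i], a[loc++]);
  std::swap(a[loc], a[end]);            // pivot lands at its final index
  hybrid_qsort(a, start, loc - 1);
  hybrid_qsort(a, loc + 1, end);
}
```

Insertion sort wins on small pieces because it has very little overhead per element and its O(n^2) growth has not yet started to matter.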
46 Comparison Sorts
Big O of comparison sorts
  It is mathematically provable that comparison-based sorts need on the order of n*log(n) comparisons in the worst case, so they can never perform better than O(n*log(n))
So can we ever have a sorting algorithm that performs better than O(n*log(n))?
  Yes, but only if we can make some meaningful assumptions about the input
OTHER SORTS 47
48 Sorting in Linear Time
Radix Sort
  Sort numbers one digit at a time, starting with the least significant digit and moving to the most significant
Bucket Sort
  Assume the input is generated by a random process that distributes elements uniformly over the interval [0, 1)
Counting Sort
  Assume the input consists of an array of size N with integers in a small range from 0 to k
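Counting sort is the easiest of the three to show in a few lines, because the assumption about the input does all the work: no two elements are ever compared. This is a minimal sketch; the function name `counting_sort` is hypothetical, and it assumes every value really lies in the range 0 to k.

```cpp
#include <cstddef>
#include <vector>

// Counting sort for integers known to lie in [0, k]. Runs in O(N + k).
std::vector<int> counting_sort(const std::vector<int>& input, int k)
{
  std::vector<std::size_t> count(static_cast<std::size_t>(k) + 1, 0);
  for (int v : input) ++count[v];        // tally how often each value occurs
  std::vector<int> out;
  out.reserve(input.size());
  for (int v = 0; v <= k; ++v)           // emit the values in increasing order
    out.insert(out.end(), count[v], v);  // count[v] copies of v
  return out;
}
```

For example, the input 3 0 2 3 1 0 with k = 3 produces 0 0 1 2 3 3. When k is much larger than N the count array dominates the cost, which is why the slide restricts the input to a small range.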
50 Other Resources
http://www.youtube.com/watch?v=vxenklcs2tw
http://flowingdata.com/2010/09/01/what-different-sorting-algorithms-sound-like/
http://www.math.ucla.edu/~rcompton/musical_sorting_algorithms/musical_sorting_algorithms.html
http://sorting.at/
Awesome musical accompaniment: https://www.youtube.com/watch?v=epfmtym8cw