
Shellsort

Review: Insertion sort


The outer loop of insertion sort is:
for (outer = 1; outer < a.length; outer++) {...}
The invariant is that all the elements to the left of outer
are sorted with respect to one another
For all i < outer, j < outer, if i < j then a[i] <= a[j]
This does not mean they are all in their final correct place; the
remaining array elements may need to be inserted
When we increase outer, a[outer-1] is newly to the left of outer; we
must keep the invariant true by inserting a[outer-1] into its proper place
This means:
Finding the element's proper place
Making room for the inserted element (by shifting over other
elements)
Inserting the element
One step of insertion sort
3 4 7 12 14 14 20 21 33 38 | 10 55 9 23 28 16
(sorted)                     (next to be inserted)
10 is copied into temp; the sorted elements greater than 10
(12 14 14 20 21 33 38) are shifted one place right, and 10 is
inserted after the elements less than 10 (3 4 7)
This one step takes O(n) time
We must do it n times; hence, insertion sort is O(n^2)
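A minimal Java sketch of the insertion sort just described (the method
name insertionSort is illustrative, not from the slides):

    // Insert each a[outer] into the sorted region to its left
    static void insertionSort(int[] a) {
        for (int outer = 1; outer < a.length; outer++) {
            int temp = a[outer];        // next element to be inserted
            int inner = outer;
            // Make room: shift sorted elements greater than temp right
            while (inner > 0 && a[inner - 1] > temp) {
                a[inner] = a[inner - 1];
                inner--;
            }
            a[inner] = temp;            // insert into its proper place
        }
    }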
The idea of shellsort
With insertion sort, each time we insert an
element, other elements get nudged one step closer
to where they ought to be
What if we could move elements a much longer
distance each time?
We could move each element:
A long distance
A somewhat shorter distance
A shorter distance still
This approach is what makes shellsort so much
faster than insertion sort
Sorting nonconsecutive subarrays
Here is an array to be sorted (the actual numbers aren't important)
Consider just the red locations: every 5th element, starting with the first
Suppose we do an insertion sort on just these numbers, as if
they were the only ones in the array
Now consider just the yellow locations: every 5th element, starting
with the second
We do an insertion sort on just these numbers
Now do the same for each additional group of numbers
The resultant array is sorted within groups, but not overall
Doing the 1-sort
In the previous slide, we compared numbers that
were spaced every 5 locations
This is a 5-sort
Ordinary insertion sort is just like this, only the
numbers are spaced 1 apart
We can think of this as a 1-sort
Suppose, after doing the 5-sort, we do a 1-sort?
In general, we would expect that each insertion would
involve moving fewer numbers out of the way
The array would end up completely sorted
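A sketch of a single gap-sort in Java; the helper name hSort is an
assumption, not from the slides. Calling it with gap 5 gives the 5-sort
above, and with gap 1 it is exactly ordinary insertion sort:

    // Insertion-sort the interleaved subarrays whose elements lie
    // gap positions apart; gap == 1 is ordinary insertion sort
    static void hSort(int[] a, int gap) {
        for (int outer = gap; outer < a.length; outer++) {
            int temp = a[outer];
            int inner = outer;
            // Shift subarray elements greater than temp one slot right
            while (inner >= gap && a[inner - gap] > temp) {
                a[inner] = a[inner - gap];
                inner -= gap;
            }
            a[inner] = temp;
        }
    }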
Diminishing gaps
For a large array, we don't want to do a 5-sort; we
want to do an N-sort, where N depends on the size
of the array
N is called the gap size, or interval size
We may want to do several stages, reducing the
gap size each time
For example, on a 1000-element array, we may want to
do a 364-sort, then a 121-sort, then a 40-sort, then a
13-sort, then a 4-sort, then a 1-sort
Why these numbers?
The Knuth gap sequence
No one knows the optimal sequence of diminishing gaps
This sequence is attributed to Donald E. Knuth:
Start with h = 1
Repeatedly compute h = 3*h + 1
1, 4, 13, 40, 121, 364, 1093
Stop when h is larger than the size of the array, and use the
previous number as the first gap (364 for a 1000-element array)
To get successive gap sizes, apply the inverse formula:
h = (h - 1) / 3
This sequence seems to work very well
It turns out that just cutting the gap size in half each time
does not work out as well
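Putting this together, a sketch of shellsort using the Knuth sequence
and the hSort helper sketched earlier; for a 1000-element array the
gaps come out as 364, 121, 40, 13, 4, 1, matching the example above:

    static void shellsort(int[] a) {
        // Largest Knuth gap below the array size: 1, 4, 13, 40, ...
        int h = 1;
        while (3 * h + 1 < a.length) {
            h = 3 * h + 1;
        }
        // Diminishing gaps: h-sort, then apply the inverse formula
        while (h >= 1) {
            hSort(a, h);        // gap insertion sort from above
            h = (h - 1) / 3;    // 364 -> 121 -> 40 -> ... -> 1
        }
    }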
Analysis I
You cut the gap size by some fixed fraction each time
Consequently, you have about log n stages
Each stage takes O(n) time
Hence, the algorithm takes O(n log n) time
Right?
Wrong! This analysis assumes that each stage actually
moves elements closer to where they ought to be, by a
fairly large amount
What if all the red cells, for instance, contain the largest
numbers in the array?
None of them get much closer to where they should be
In fact, if we just cut the gap size in half each time,
sometimes we get O(n^2) behavior!
Analysis II
So what is the real running time of shellsort?
Nobody knows!
Experiments suggest something like O(n^(3/2)) or O(n^(7/6))
Analysis isn't always easy!
The End
