Popular sorting algorithms

While there are a large number of sorting algorithms, in practical implementations a few algorithms predominate. Insertion sort is widely used for small data sets, while for large data sets an asymptotically efficient sort is used, primarily heap sort, merge sort, or quicksort. Efficient implementations generally use a hybrid algorithm, combining an asymptotically efficient algorithm for the overall sort with insertion sort for small lists at the bottom of a recursion. Highly tuned implementations use more sophisticated variants, such as Timsort (merge sort, insertion sort, and additional logic), used in Android, Java, and Python, and introsort (quicksort and heap sort), used (in variant forms) in some C++ sort implementations and in .NET.
For more restricted data, such as numbers in a fixed interval, distribution sorts such as counting sort or radix sort are widely used. Bubble sort and variants are rarely used in practice, but are commonly found in teaching and theoretical discussions.
When physically sorting objects, such as alphabetizing papers (such as tests or books), people intuitively generally use insertion sorts for small sets. For larger sets, people often first bucket, such as by initial letter, and multiple bucketing allows practical sorting of very large sets. Often space is relatively cheap, such as by spreading objects out on the floor or over a large area, but operations are expensive, particularly moving an object a large distance; locality of reference is important. Merge sorts are also practical for physical objects, particularly as two hands can be used, one for each list to merge, while other algorithms, such as heap sort or quick sort, are poorly suited for human use. Other algorithms, such as library sort, a variant of insertion sort that leaves spaces, are also practical for physical use.
Simple sorts
Two of the simplest sorts are insertion sort and selection sort, both of which are efficient on small data, due to low overhead, but not efficient on large data. Insertion sort is generally faster than selection sort in practice, due to fewer comparisons and good performance on almost-sorted data, and thus is usually preferred, but selection sort uses fewer writes, and thus is used when write performance is a limiting factor.
Insertion sort
Main article: Insertion sort
Insertion sort is a simple sorting algorithm that is relatively efficient for small lists and mostly sorted lists, and is often used as part of more sophisticated algorithms. It works by taking elements from the list one by one and inserting them in their correct position into a new sorted list. In arrays, the new list and the remaining elements can share the array's space, but insertion is expensive, requiring shifting all following elements over by one. Shell sort (see below) is a variant of insertion sort that is more efficient for larger lists.
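The shared-space, in-array form described above can be sketched in a few lines of Python (the function name is illustrative); larger elements are shifted one slot right to open a gap for each new element:

```python
def insertion_sort(a):
    """Sort the list a in place; returns a for convenience."""
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        # Shift larger sorted elements one slot right to open a gap for key.
        while j >= 0 and a[j] > key:
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key
    return a
```

On an almost-sorted list the inner while loop rarely runs, which is why the algorithm approaches O(n) on such inputs.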
Selection sort
Main article: Selection sort
Selection sort is an in-place comparison sort. It has O(n²) complexity, making it inefficient on large lists, and it generally performs worse than the similar insertion sort. Selection sort is noted for its simplicity, and also has performance advantages over more complicated algorithms in certain situations.
The algorithm finds the minimum value, swaps it with the value in the first position, and repeats these steps for the remainder of the list. It does no more than n swaps, and thus is useful where swapping is very expensive.
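A minimal Python sketch of the swap-the-minimum procedure just described, which also shows why it performs at most n − 1 swaps (at most one per pass):

```python
def selection_sort(a):
    """Sort the list a in place with at most n - 1 swaps."""
    n = len(a)
    for i in range(n - 1):
        # Find the index of the minimum of the unsorted suffix a[i:].
        m = i
        for j in range(i + 1, n):
            if a[j] < a[m]:
                m = j
        if m != i:
            # One swap per pass at most, useful when swaps are expensive.
            a[i], a[m] = a[m], a[i]
    return a
```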
Efficient sorts
Practical general sorting algorithms are almost always based on an algorithm with average complexity (and generally worst-case complexity) O(n log n), of which the most common are heap sort, merge sort, and quicksort. Each has advantages and drawbacks, with the most significant being that simple implementation of merge sort uses O(n) additional space, and simple implementation of quicksort has O(n²) worst-case complexity. These problems can be solved or ameliorated at the cost of a more complex algorithm.
While these algorithms are asymptotically efficient on random data, for practical efficiency on real-world data various modifications are used. First, the overhead of these algorithms becomes significant on smaller data, so often a hybrid algorithm is used, commonly switching to insertion sort once the data is small enough. Second, the algorithms often perform poorly on already sorted data or almost sorted data; these are common in real-world data, and can be sorted in O(n) time by appropriate algorithms. Finally, they may also be unstable, and stability is often a desirable property in a sort. Thus more sophisticated algorithms are often employed, such as Timsort (based on merge sort) or introsort (based on quicksort, falling back to heap sort).
Merge sort
Main article: Merge sort
Merge sort takes advantage of the ease of merging already sorted lists into a new sorted list. It starts by comparing every two elements (i.e., 1 with 2, then 3 with 4...) and swapping them if the first should come after the second. It then merges each of the resulting lists of two into lists of four, then merges those lists of four, and so on; until at last two lists are merged into the final sorted list. Of the algorithms described here, this is the first that scales well to very large lists, because its worst-case running time is O(n log n). It is also easily applied to lists, not only arrays, as it only requires sequential access, not random access. However, it has additional O(n) space complexity, and involves a large number of copies in simple implementations.
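The merging scheme above can be sketched recursively in Python; this illustrative version returns a new list at every level, exhibiting the O(n) extra space and copying just mentioned:

```python
def merge_sort(a):
    """Return a new sorted list; stable because <= prefers the left run."""
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    left, right = merge_sort(a[:mid]), merge_sort(a[mid:])
    # Merge the two sorted halves into one sorted output list.
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i])
            i += 1
        else:
            out.append(right[j])
            j += 1
    out.extend(left[i:])
    out.extend(right[j:])
    return out
```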
Merge sort has seen a relatively recent surge in popularity for practical implementations, due to its use in the sophisticated algorithm Timsort, which is used for the standard sort routine in the programming languages Python[13] and Java (as of JDK7[14]). Merge sort itself is the standard routine in Perl,[15] among others, and has been used in Java at least since 2000 in JDK1.3.[16][17]
Heapsort
Main article: Heapsort
Heapsort is a much more efficient version of selection sort. It also works by determining the largest (or smallest) element of the list, placing that at the end (or beginning) of the list, then continuing with the rest of the list, but accomplishes this task efficiently by using a data structure called a heap, a special type of binary tree. Once the data list has been made into a heap, the root node is guaranteed to be the largest (or smallest) element. When it is removed and placed at the end of the list, the heap is rearranged so the largest element remaining moves to the root. Using the heap, finding the next largest element takes O(log n) time, instead of O(n) for a linear scan as in simple selection sort. This allows heapsort to run in O(n log n) time, and this is also the worst-case complexity.
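A minimal in-place Python sketch of the procedure described above, using an array-backed binary max-heap and a sift-down helper (names are illustrative):

```python
def heapsort(a):
    """Sort the list a in place using an array-backed max-heap."""
    n = len(a)

    def sift_down(root, end):
        # Restore the max-heap property for the subtree at `root`,
        # considering only the prefix a[0:end].
        while (child := 2 * root + 1) < end:
            if child + 1 < end and a[child + 1] > a[child]:
                child += 1          # pick the larger child
            if a[root] >= a[child]:
                return
            a[root], a[child] = a[child], a[root]
            root = child

    # Build a max-heap bottom-up, then repeatedly move the root
    # (current maximum) to the end of the shrinking unsorted prefix.
    for i in range(n // 2 - 1, -1, -1):
        sift_down(i, n)
    for end in range(n - 1, 0, -1):
        a[0], a[end] = a[end], a[0]
        sift_down(0, end)
    return a
```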
Quicksort
Main article: Quicksort
Quicksort is a divide-and-conquer algorithm which relies on a partition operation: to partition an array, an element called a pivot is selected. All elements smaller than the pivot are moved before it and all greater elements are moved after it. This can be done efficiently in linear time and in-place. The lesser and greater sublists are then recursively sorted. This yields average time complexity of O(n log n), with low overhead, and thus this is a popular algorithm. Efficient implementations of quicksort (with in-place partitioning) are typically unstable sorts and somewhat complex, but are among the fastest sorting algorithms in practice. Together with its modest O(log n) space usage, quicksort is one of the most popular sorting algorithms and is available in many standard programming libraries.
The important caveat about quicksort is that its worst-case performance is O(n²); while this is rare, in naive implementations (choosing the first or last element as pivot) this occurs for sorted data, which is a common case. The most complex issue in quicksort is thus choosing a good pivot element, as consistently poor choices of pivots can result in drastically slower O(n²) performance, but good choice of pivots yields O(n log n) performance, which is asymptotically optimal. For example, if at each step the median is chosen as the pivot then the algorithm works in O(n log n). Finding the median, such as by the median of medians selection algorithm, is however an O(n) operation on unsorted lists and therefore exacts significant overhead with sorting. In practice choosing a random pivot almost certainly yields O(n log n) performance.
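A short Python sketch with the random pivot suggested in the last sentence. Note that, unlike the in-place partitioning described above, this illustrative version builds new sublists on each call, trading extra space for simplicity:

```python
import random

def quicksort(a):
    """Return a new sorted list; expected O(n log n) via a random pivot."""
    if len(a) <= 1:
        return a
    pivot = random.choice(a)          # random pivot defeats sorted-input worst case
    less    = [x for x in a if x < pivot]
    equal   = [x for x in a if x == pivot]
    greater = [x for x in a if x > pivot]
    # Recurse on the strictly-lesser and strictly-greater partitions.
    return quicksort(less) + equal + quicksort(greater)
```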
Bubble sort and variants
Bubble sort, and variants such as the cocktail sort, are simple but highly inefficient sorts. They are thus frequently seen in introductory texts, and are of some theoretical interest due to ease of analysis, but they are rarely used in practice, and are primarily of recreational interest. Some variants, such as the Shell sort, have open questions about their behavior.
Bubble sort
A bubble sort, a sorting algorithm that continuously steps through a list, swapping items until they appear in the correct order.
Main article: Bubble sort
Bubble sort is a simple sorting algorithm. The algorithm starts at the beginning of the data set. It compares the first two elements, and if the first is greater than the second, it swaps them. It continues doing this for each pair of adjacent elements to the end of the data set. It then starts again with the first two elements, repeating until no swaps have occurred on the last pass. This algorithm's average and worst-case performance is O(n²), so it is rarely used to sort large, unordered data sets. Bubble sort can be used to sort a small number of items (where its asymptotic inefficiency is not a high penalty). Bubble sort can also be used efficiently on a list of any length that is nearly sorted (that is, the elements are not significantly out of place). For example, if any number of elements are out of place by only one position (e.g. 0123546789 and 1032547698), bubble sort's exchange will get them in order on the first pass, the second pass will find all elements in order, so the sort will take only 2n time.
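The repeated-pass procedure above can be sketched in Python; the scanned range shrinks by one each pass, since the largest unsorted element has bubbled to its final position:

```python
def bubble_sort(a):
    """Sort the list a in place; stops once a full pass makes no swaps."""
    n = len(a)
    swapped = True
    while swapped:
        swapped = False
        for i in range(n - 1):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
                swapped = True
        n -= 1   # the largest unsorted element is now in its final place
    return a
```

The early-exit `swapped` flag is what gives the O(n) behavior on nearly sorted input described above.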
Shell sort
A Shell sort, different from bubble sort in that it moves elements to numerous swapping positions.
Main article: Shell sort
Shell sort was invented by Donald Shell in 1959. It improves upon bubble sort and insertion sort by moving out-of-order elements more than one position at a time. One implementation can be described as arranging the data sequence in a two-dimensional array and then sorting the columns of the array using insertion sort.
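A Python sketch using Shell's original halving gap sequence (an assumption for simplicity; other gap sequences perform better). Each pass is a gapped insertion sort, so elements move many positions at once while the gap is large:

```python
def shell_sort(a):
    """Sort the list a in place with gapped insertion-sort passes."""
    gap = len(a) // 2          # Shell's original sequence: n/2, n/4, ..., 1
    while gap > 0:
        # Gapped insertion sort: keeps every slice a[i], a[i-gap], ... sorted.
        for i in range(gap, len(a)):
            key, j = a[i], i
            while j >= gap and a[j - gap] > key:
                a[j] = a[j - gap]
                j -= gap
            a[j] = key
        gap //= 2              # the final gap-1 pass is plain insertion sort
    return a
```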
Comb sort
Main article: Comb sort
Comb sort is a relatively simple sorting algorithm originally designed by Włodzimierz Dobosiewicz in 1980.[18] Later it was rediscovered and popularized by Stephen Lacey and Richard Box with a Byte Magazine article published in April 1991. Comb sort improves on bubble sort. The basic idea is to eliminate turtles, or small values near the end of the list, since in a bubble sort these slow the sorting down tremendously. (Rabbits, large values around the beginning of the list, do not pose a problem in bubble sort.)
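A Python sketch of comb sort using the commonly cited shrink factor of 1.3 (the exact factor here is an assumption, not taken from this article). The large initial gaps move turtles toward the front quickly; once the gap reaches 1, the passes degenerate into bubble sort:

```python
def comb_sort(a):
    """Sort the list a in place; gap shrinks by a constant factor each pass."""
    gap = len(a)
    shrink = 1.3               # commonly used shrink factor (assumption)
    done = False
    while not done:
        gap = max(1, int(gap / shrink))
        done = gap == 1        # can only finish once gap-1 passes are swap-free
        for i in range(len(a) - gap):
            if a[i] > a[i + gap]:
                a[i], a[i + gap] = a[i + gap], a[i]
                done = False
    return a
```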
Distribution sort
See also: External sorting
Distribution sort refers to any sorting algorithm where data are distributed from their input to multiple intermediate structures which are then gathered and placed on the output. For example, both bucket sort and flashsort are distribution-based sorting algorithms. Distribution sorting algorithms can be used on a single processor, or they can be a distributed algorithm, where individual subsets are separately sorted on different processors, then combined. This allows external sorting of data too large to fit into a single computer's memory.
Counting sort
Main article: Counting sort
Counting sort is applicable when each input is known to belong to a particular set, S, of possibilities. The algorithm runs in O(|S| + n) time and O(|S|) memory where n is the length of the input. It works by creating an integer array of size |S| and using the ith bin to count the occurrences of the ith member of S in the input. Each input is then counted by incrementing the value of its corresponding bin. Afterward, the counting array is looped through to arrange all of the inputs in order. This sorting algorithm often cannot be used because S needs to be reasonably small for the algorithm to be efficient, but it is extremely fast and demonstrates great asymptotic behavior as n increases. It also can be modified to provide stable behavior.
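A Python sketch for the simple case where S is the set of integers 0 to size − 1, following the bin-counting description above:

```python
def counting_sort(a, size):
    """Sort integers drawn from range(size); O(size + n) time, O(size) memory."""
    counts = [0] * size
    for x in a:
        counts[x] += 1         # bin i counts occurrences of value i
    # Walk the counting array in order, emitting each value count[i] times.
    out = []
    for value, c in enumerate(counts):
        out.extend([value] * c)
    return out
```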
Bucket sort
Main article: Bucket sort
Bucket sort is a divide-and-conquer sorting algorithm that generalizes counting sort by partitioning an array into a finite number of buckets. Each bucket is then sorted individually, either using a different sorting algorithm or by recursively applying the bucket sorting algorithm.
Because bucket sort must use a limited number of buckets, it is best suited to data sets of a limited scope. Bucket sort would be unsuitable for data that have a lot of variation, such as social security numbers.
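A Python sketch for the common textbook setting of uniformly distributed floats in [0, 1) (an assumed input range, chosen so a value's bucket can be found by scaling); each bucket is sorted with a different algorithm, here Python's built-in sort:

```python
def bucket_sort(a, n_buckets=10):
    """Sort floats in [0, 1) by scattering into buckets and sorting each."""
    buckets = [[] for _ in range(n_buckets)]
    for x in a:
        buckets[int(x * n_buckets)].append(x)   # scale value to a bucket index
    # Buckets cover ascending ranges, so sorted buckets concatenate in order.
    out = []
    for b in buckets:
        out.extend(sorted(b))
    return out
```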
Radix sort
Main article: Radix sort
Radix sort is an algorithm that sorts numbers by processing individual digits. n numbers consisting of k digits each are sorted in O(n · k) time. Radix sort can process digits of each number either starting from the least significant digit (LSD) or starting from the most significant digit (MSD). The LSD algorithm first sorts the list by the least significant digit while preserving their relative order using a stable sort. Then it sorts them by the next digit, and so on from the least significant to the most significant, ending up with a sorted list. While the LSD radix sort requires the use of a stable sort, the MSD radix sort algorithm does not (unless stable sorting is desired). In-place MSD radix sort is not stable. It is common for the counting sort algorithm to be used internally by the radix sort. A hybrid sorting approach, such as using insertion sort for small bins, improves performance of radix sort significantly.
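An LSD radix sort sketch in Python for non-negative integers. The per-digit pass uses stable bucketing by digit value, playing the role of the internal counting sort mentioned above and preserving relative order between passes:

```python
def radix_sort(a, base=10):
    """LSD radix sort for non-negative integers, one stable pass per digit."""
    if not a:
        return a
    digit = 1
    while digit <= max(a):
        # Stable bucketing by the current digit preserves earlier order.
        buckets = [[] for _ in range(base)]
        for x in a:
            buckets[(x // digit) % base].append(x)
        a = [x for b in buckets for x in b]
        digit *= base          # move from least to most significant digit
    return a
```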
Memory usage patterns and index sorting
When the size of the array to be sorted approaches or exceeds the available primary memory, so that (much slower) disk or swap space must be employed, the memory usage pattern of a sorting algorithm becomes important, and an algorithm that might have been fairly efficient when the array fit easily in RAM may become impractical. In this scenario, the total number of comparisons becomes (relatively) less important, and the number of times sections of memory must be copied or swapped to and from the disk can dominate the performance characteristics of an algorithm. Thus, the number of passes and the localization of comparisons can be more important than the raw number of comparisons, since comparisons of nearby elements to one another happen at system bus speed (or, with caching, even at CPU speed), which, compared to disk speed, is virtually instantaneous.
For example, the popular recursive quicksort algorithm provides quite reasonable performance with adequate RAM, but due to the recursive way that it copies portions of the array it becomes much less practical when the array does not fit in RAM, because it may cause a number of slow copy or move operations to and from disk. In that scenario, another algorithm may be preferable even if it requires more total comparisons.
One way to work around this problem, which works well when complex records (such as in a relational database) are being sorted by a relatively small key field, is to create an index into the array and then sort the index, rather than the entire array. (A sorted version of the entire array can then be produced with one pass, reading from the index, but often even that is unnecessary, as having the sorted index is adequate.) Because the index is much smaller than the entire array, it may fit easily in memory where the entire array would not, effectively eliminating the disk-swapping problem. This procedure is sometimes called "tag sort".[19]
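The tag-sort idea can be sketched in a few lines of Python; the record contents and field names here are illustrative placeholders, not from the source:

```python
# Hypothetical large records with a small sort key.
records = [
    {"key": 42, "payload": "large record A"},
    {"key": 7,  "payload": "large record B"},
    {"key": 19, "payload": "large record C"},
]

# Sort a compact index of positions instead of the records themselves.
index = sorted(range(len(records)), key=lambda i: records[i]["key"])

# One pass over the sorted index visits the records in key order.
keys_in_order = [records[i]["key"] for i in index]
```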
Another technique for overcoming the memory-size problem is to combine two algorithms in a way that takes advantage of the strength of each to improve overall performance. For instance, the array might be subdivided into chunks of a size that will fit in RAM, the contents of each chunk sorted using an efficient algorithm (such as quicksort), and the results merged using a k-way merge similar to that used in merge sort. This is faster than performing either merge sort or quicksort over the entire list.
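A small in-memory Python sketch of the chunk-then-merge scheme; a real external sort would write sorted runs to disk, but the standard library's `heapq.merge` illustrates the k-way merge step:

```python
import heapq

def chunked_sort(data, chunk_size):
    """Sort fixed-size chunks independently, then k-way merge the runs."""
    # Each run models a chunk small enough to sort entirely in RAM.
    runs = [sorted(data[i:i + chunk_size])
            for i in range(0, len(data), chunk_size)]
    # heapq.merge lazily k-way merges the already-sorted runs.
    return list(heapq.merge(*runs))
```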
Techniques can also be combined. For sorting very large sets of data that vastly exceed system memory, even the index may need to be sorted using an algorithm or combination of algorithms designed to perform reasonably with virtual memory, i.e., to reduce the amount of swapping required.
Inefficient/humorous sorts
Some algorithms are slow compared to those discussed above, such as the bogosort O(n·n!) and the stooge sort O(n^2.7).
Related algorithms
Related problems include partial sorting (sorting only the k smallest elements of a list, or alternatively computing the k smallest elements, but unordered) and selection (computing the kth smallest element). These can be solved inefficiently by a total sort, but more efficient algorithms exist, often derived by generalizing a sorting algorithm. The most notable example is quickselect, which is related to quicksort. Conversely, some sorting algorithms can be derived by repeated application of a selection algorithm; quicksort and quickselect can be seen as the same pivoting move, differing only in whether one recurses on both sides (quicksort, divide and conquer) or one side (quickselect, decrease and conquer).
A kind of opposite of a sorting algorithm is a shuffling algorithm. These are fundamentally different because they require a source of random numbers. Interestingly, shuffling can also be implemented by a sorting algorithm, namely by a random sort: assigning a random number to each element of the list and then sorting based on the random numbers. This is generally not done in practice, however, and there is a well-known simple and efficient algorithm for shuffling: the Fisher–Yates shuffle.
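The Fisher–Yates shuffle mentioned above can be sketched in Python; each position is swapped with a uniformly chosen position at or before it, which makes every permutation equally likely:

```python
import random

def fisher_yates(a):
    """Shuffle the list a in place, uniformly over all permutations."""
    for i in range(len(a) - 1, 0, -1):
        j = random.randint(0, i)   # uniform over positions 0..i inclusive
        a[i], a[j] = a[j], a[i]
    return a
```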