Friday 30 March 2012

Sorting in data structure

Comparison of algorithms

In this table, n is the number of records to be sorted. The columns "Best", "Average" and "Worst" give the time complexity in each case, under the assumption that the length of each key is constant, so that all comparisons, swaps, and other needed operations can proceed in constant time. "Memory" denotes the amount of auxiliary storage needed beyond that used by the list itself, under the same assumption. All of the algorithms in this first table are comparison sorts. Running time and memory can be expressed in several asymptotic notations (Big-O, Big-Omega, Big-Theta, little-o, little-omega). The entries below omit the notation symbol and give only the growth rate, so read each one as an asymptotic order of growth rather than an exact operation count.
Comparison sorts
Name | Best | Average | Worst | Memory | Stable | Method | Other notes
Quicksort | n log n | n log n | n^2 | log n | Depends | Partitioning | Quicksort is usually done in place with O(log n) stack space. Most implementations are unstable, as stable in-place partitioning is more complex. Naïve variants use an O(n) space array to store the partition.
Merge sort | n log n | n log n | n log n | Depends; worst case is n | Yes | Merging | Used to sort this table in Firefox [2].
In-place merge sort | - | - | n (log n)^2 | 1 | Yes | Merging | Implemented in Standard Template Library (STL): [3]; can be implemented as a stable sort based on stable in-place merging: [4]
Heapsort | n log n | n log n | n log n | 1 | No | Selection | -
Insertion sort | n | n^2 | n^2 | 1 | Yes | Insertion | O(n + d), where d is the number of inversions
Introsort | n log n | n log n | n log n | log n | No | Partitioning & Selection | Used in SGI STL implementations
Selection sort | n^2 | n^2 | n^2 | 1 | No | Selection | Stable with O(n) extra space, for example using lists [5]. Used to sort this table in Safari or other WebKit web browsers [6].
Timsort | n | n log n | n log n | n | Yes | Insertion & Merging | n comparisons when the data is already sorted or reverse sorted.
Shell sort | n | n (log n)^2 or n^(3/2) | Depends on gap sequence; best known is n (log n)^2 | 1 | No | Insertion | -
Bubble sort | n | n^2 | n^2 | 1 | Yes | Exchanging | Tiny code size
Binary tree sort | n | n log n | n log n | n | Yes | Insertion | When using a self-balancing binary search tree
Cycle sort | - | n^2 | n^2 | 1 | No | Insertion | In-place with theoretically optimal number of writes
Library sort | - | n log n | n^2 | n | Yes | Insertion | -
Patience sorting | - | - | n log n | n | No | Insertion & Selection | Finds all the longest increasing subsequences within O(n log n)
Smoothsort | n | n log n | n log n | 1 | No | Selection | An adaptive sort: n comparisons when the data is already sorted, and 0 swaps.
Strand sort | n | n^2 | n^2 | n | Yes | Selection | -
Tournament sort | - | n log n | n log n | - | - | Selection | -
Cocktail sort | n | n^2 | n^2 | 1 | Yes | Exchanging | -
Comb sort | n | n log n | n^2 | 1 | No | Exchanging | Small code size
Gnome sort | n | n^2 | n^2 | 1 | Yes | Exchanging | Tiny code size
Bogosort | n | n · n! | n · n! → ∞ | 1 | No | Luck | Randomly permute the array and check if sorted.
Slowsort | - | - | Ω(n^(log_2(n) / (2 + ε))) | - | No | Selection | Remarkably inefficient sorting algorithm [7]
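The note on the insertion sort row, O(n + d) with d the number of inversions, can be checked with a small experiment. The sketch below is only an illustration, not a reference implementation: the function names insertion_sort_with_shifts and count_inversions are made up for this example. It counts how many element shifts insertion sort performs and compares that with a brute-force inversion count.

```python
def insertion_sort_with_shifts(items):
    """Return (sorted copy of items, number of element shifts performed)."""
    result = list(items)
    shifts = 0
    for i in range(1, len(result)):
        key = result[i]
        j = i - 1
        # Move every larger element one slot to the right.
        # Each such shift removes exactly one inversion.
        while j >= 0 and result[j] > key:
            result[j + 1] = result[j]
            j -= 1
            shifts += 1
        result[j + 1] = key
    return result, shifts


def count_inversions(items):
    """Brute-force count of pairs (i, j) with i < j and items[i] > items[j]."""
    n = len(items)
    return sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if items[i] > items[j]
    )


if __name__ == "__main__":
    data = [3, 1, 2, 5, 4]
    ordered, shifts = insertion_sort_with_shifts(data)
    print(ordered, shifts, count_inversions(data))
```

The outer loop always runs n - 1 times and the inner loop runs once per inversion, which is where the n + d total comes from. On already sorted input d is 0, matching the "Best" entry of n. The strict comparison result[j] > key is also what keeps equal keys in their original order, so this version is stable.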
The following table describes integer sorting algorithms and other sorting algorithms that are not comparison sorts. As such, they are not limited by an Ω(n log n) lower bound. Complexities below are in terms of n, the number of items to be sorted, k, the size of each key, and d, the digit size used by the implementation. Many of them are based on the assumption that the key size is large enough that all entries have unique key values, and hence that n << 2^k, where << means "much less than."
Non-comparison sorts
Name | Best | Average | Worst | Memory | Stable | n << 2^k | Notes
Pigeonhole sort | - | n + 2^k | n + 2^k | 2^k | Yes | Yes | -
Bucket sort (uniform keys) | - | n + k | n^2 · k | n · k | Yes | No | Assumes uniform distribution of elements from the domain in the array. [2]
Bucket sort (integer keys) | - | n + r | n + r | n + r | Yes | Yes | r is the range of numbers to be sorted. If r = O(n) then the average running time is O(n). [3]
Counting sort | - | n + r | n + r | n + r | Yes | Yes | r is the range of numbers to be sorted. If r = O(n) then the average running time is O(n). [2]
LSD radix sort | - | n · k/d | n · k/d | n | Yes | No | [3][2]
MSD radix sort | - | n · k/d | n · k/d | n + (k/d) · 2^d | Yes | No | Stable version uses an external array of size n to hold all of the bins
MSD radix sort (in-place) | - | n · k/d | n · k/d | (k/d) · 2^d | No | No | In-place. k/d recursion levels, 2^d for count array
Spreadsort | - | n · k/d | n · (k/s + d) | (k/d) · 2^d | No | No | Asymptotics are based on the assumption that n << 2^k, but the algorithm does not require this.
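To make the n · k/d entry for LSD radix sort concrete, here is a minimal sketch, assuming non-negative integer keys that fit in key_bits bits. The function name lsd_radix_sort and the parameter names key_bits (the table's k) and digit_bits (the table's d) are chosen for this example only. Each pass is a stable counting sort on one digit, so there are about k/d passes of O(n + 2^d) work each.

```python
def lsd_radix_sort(keys, key_bits=32, digit_bits=8):
    """Sort non-negative integers below 2**key_bits, digit_bits bits per pass."""
    radix = 1 << digit_bits          # number of buckets per pass, 2^d
    mask = radix - 1
    result = list(keys)
    for key in result:
        if key < 0 or key >= (1 << key_bits):
            raise ValueError("key out of range for the assumed key size")

    # One pass per digit, least significant digit first: about k/d passes.
    for shift in range(0, key_bits, digit_bits):
        # Counting step: how many keys have each digit value.
        counts = [0] * radix
        for key in result:
            counts[(key >> shift) & mask] += 1

        # Prefix sums turn counts into the first output index of each digit.
        position = 0
        for digit in range(radix):
            counts[digit], position = position, position + counts[digit]

        # Stable scatter: keys with equal digits keep their relative order,
        # which is what makes the earlier (lower-digit) passes survive.
        output = [0] * len(result)
        for key in result:
            digit = (key >> shift) & mask
            output[counts[digit]] = key
            counts[digit] += 1
        result = output
    return result


if __name__ == "__main__":
    print(lsd_radix_sort([170, 45, 75, 90, 802, 24, 2, 66], key_bits=16, digit_bits=4))
```

The output array of size n is the "Memory" entry for the LSD row, and no key comparisons between elements are made anywhere, which is why the Ω(n log n) bound for comparison sorts does not apply.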
The following table describes some sorting algorithms that are impractical for real-life use due to extremely poor performance or a requirement for specialized hardware.
Name | Best | Average | Worst | Memory | Stable | Comparison | Other notes
Bead sort | - | N/A | N/A | - | N/A | No | Requires specialized hardware
Simple pancake sort | - | n | n | log n | No | Yes | Count is number of flips.
Spaghetti (Poll) sort | n | n | n | n^2 | Yes | Polling | A linear-time, analog algorithm for sorting a sequence of items, requiring O(n) stack space, and the sort is stable. This requires n parallel processors.
Sorting networks | - | log n | log n | n · log n | Yes | No | Requires a custom circuit of size O(n · log n)
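The pancake sort row counts flips (prefix reversals) rather than comparisons. A simple version can be sketched as follows. This is an illustrative sketch with a made-up function name, pancake_sort, and it still scans for the largest element on each round, so only the flip count is linear in n, not the total work.

```python
def pancake_sort(items):
    """Return (sorted copy of items, number of prefix reversals used)."""
    result = list(items)
    flips = 0
    # Shrink the unsorted prefix from the full list down to two elements.
    for size in range(len(result), 1, -1):
        # Index of the largest element in the unsorted prefix.
        largest = max(range(size), key=result.__getitem__)
        if largest == size - 1:
            continue  # already in its final position, no flip needed
        if largest != 0:
            # First flip brings the largest element to the front.
            result[:largest + 1] = result[:largest + 1][::-1]
            flips += 1
        # Second flip moves it from the front to the end of the prefix.
        result[:size] = result[:size][::-1]
        flips += 1
    return result, flips


if __name__ == "__main__":
    print(pancake_sort([3, 1, 2]))
```

Each round places one element with at most two flips, so this sketch never uses more than 2(n - 1) flips in total.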
Additionally, theoretical computer scientists have detailed other sorting algorithms that provide better than O(n log n) time complexity with additional constraints, including:
  • Han's algorithm, a deterministic algorithm for sorting keys from a domain of finite size, taking O(n log log n) time and O(n) space.[4]
  • Thorup's algorithm, a randomized algorithm for sorting keys from a domain of finite size, taking O(n log log n) time and O(n) space.[5]
  • An integer sorting algorithm taking O(n √(log log n)) expected time and O(n) space
