Sei sulla pagina 1di 2

Data Structures - Cheat Sheet

Trees Binary

Red-Black Tree Melding: If the heap is represented by an array, link the two
arrays together and Heapify-Up. O (n).
1. Red Rule : A red child must have a black father
2. Black Rule: All path s to external nodes pass thr ough the
Binomial
same number of black nodes.
3. All the leaves are black, and the sky is grey. Melding: Unify trees by rank like binary summation. O (log n)
Rotations are terminal cases. Only happen once per fixup.
If we have a series of insert-delete for which the insertion point Fibonacci Heap
is known, the amortized cost to each action is O (n). (1+

5)
≤ ≤
Height:log n h 2log n Maximum degree: D (n) ≤
logϕ n ; ϕ =
2
Limit of rotations: 2 per insert. Minimum size of degree k : sk Fk+2 ≥
Bound of ratios between two branches L, R: S (R) (S (L))2
≤ Marking: Every node which lost one child is marked.
Completely isomorphic to 2-4 Trees. Cascading Cut: Cut ever y marked node clim bing upwards.
Keeps amortized O(log n) time for deleteMin. Otherwise

O ( n).
B-Tree
Proof of the ϕk node size bound:
d defines the minimum number of keys on a node 1. All subtrees of junction j , sorted by order of insertion are of

Height: h logd n ≥ −
degree D [si ] i 2 (Proof: when x’s largest subtree was added,
d
1. Every node has at mos t d children and at least 2 children −
since D [x] was i 1, so was the sub tree. Since then, it could
(root excluded). lose only one child, so it is at least i 2) −
2. The root has at least 2 children if it isn’t a leaf. k k
2.Fk+2 =1 + i=0 Fi ; ϕFk+2 ≥

3. A non-leaf node with k children contains k 1 keys.

3. If x is a node an d k = deg [ x], Sx Fk+2 ϕk .
≥ ≥
4. On B+ trees, leaves appear at the same level. (Proof: Assume induc tion after the base cases and then sk =
5. Nodes at each level form linked lists 2 + ki=2 Si−2 2 + ki=2 Fi = 1 + ki=0 Fi = Fk+2 )
 ≥  
d is optimized for HDD/cache block size
Insert: Add to inse rtion point. If the node gets too large,
Structures

split.O (log n) O (logd n)
Split: The middle of the node (low median) moves up to be the Median Heap: one min- heap and one max-he ap with x ∀ ∈
edge of the father node. O (d) ∈
min, y max : x > y then the minimum is on the median-heap
Delete: If the key is not in a leaf, switch with succ/pred. Delete,
and deal with short node v :
1. If v is the root, discard; terminate.
Sorting
2. If v has a non-short sibling, steal from it; terminate. Comparables
3. Fuse v with its sibling, repeat with p ← p [v ].
Algorithm Expected Worst Storage
Traversals QuickSort O (n log n) O n2 In-Place
Partition recursively at each step.
 
BubbleSort O n2 In-Place
Traverse(t):
O n2
SelectionSort
  In-Place
if t==null then return
Traverse n slots keeping score of the
→ print (t) //pre-order
maximum. Swap it with A [n]. Repeat
Traverse(t.left)
→ (OR) print(t) //in-order for A [n − 1] .
Traverse(t.right) HeapSort O (n log n) Aux
→ (OR) print(t) //post-order InsertionSort Aux
MergeSort O (n log n) Aux

Heaps Linear Time

Binary Binomial Fibonacci BucketSort Θ (n):


findMin (1) (1) (1) If the range is known, make the appropriate number of buckets,
deleteMin Θ(log n) (log n) O (log n) then:
insert Θ(log n) O (log n) (1) 1. Scatter: Go over the srcinal array, putting each object in its
decreaseKey Θ(log n) (log n) (1) bucket.
meld Θ(n) (log n) (1) 2. Sort each non-empty bucket (recursively or otherwise)
1
3. Gather: Visit the buckets in order and put all elements back Two-Level Hashing
into the srcinal array.  −  =|
n 1 ni
CountSort Θ (n): The number of collisions per level: i=0 Col |
2
1. Given an array A bounded in the discrete range C , initialize 1. Choose m = n and h such that Col < n. | |
an array with that size. 2. Store the ni elements hashed to i in a small table of size n2i
2. Passing through A, increment every occurence of a number i using a perfect hash function hi .
in its proper slot in C . Random algorithm for constructing a perfect two level
3. Passing through C , add the number represented by i into A hash table:
a total of C [i] times. 1. Choose a rando m h from H (n) and compute the number of
RadixSort Θ (n):
collisions. If there are more than n collisions, repeat.
1. Take the least significant digit. 2. For each cell i,if ni > 1, choose a random hash function from
2. Group the keys based on that digit, but otherwise ke ep the H (ni2). If there are any collisions, repeat.
srcinal order of keys. (This is what makes the LSD radix sort Expected construction time – O (n)
a stable sort). Worst Case search time - O
3. Repeat the grouping process with each more significant digit. (1)

Union-Find
Selection
MakeSet(x) Union( x, y ) Find( x)
O n2
QuickSelect O (n)
  O (1) O (1) O (α (k ))
5-tuple Select Union by Rank: The larger tree remains the master tree in
every union.
Path Compression: every find operation first finds the master
Hashing root, then repeats its walk to change the subroots.
Universal Family: a family of mappi ngs H. h H. h : U ∀ ∈ →
1
∀  ∈
[m] is universal iff k1 = k2 U : P rh∈H [h(k1 ) = h(k2 )] m ≤ Recursion
Example: If U = [p] = 0, 1,...,p { − }
1 then Hp,m =
n
  + ( ); ≥ 1
{ | ≤ ≤
ha,b 1 a p; 0 b p ≤ ≤ } and every hash function is Master Theorem: for T (n) = aT b
f n a , b >
ha,b (k ) = (( ak + b) mod (p)) mod (m) 1, ε > 0:
logb a logb (a) ε
Linear Probing: Search in incremental order through the table T (n) = Θ n  (( )) == Θ  log−   ;
f n O n
  
from h (x) until a vacancy is found.  T (n) = Θ nlog b
a
log k+1
n f n nlog b
a k
n k ≥0
Open Addressing: Use h1 (x) to hash and h2 (x)to permute.
nlogb a+ε
 ( )=Ωf n
 
No pointers. T (n) = (f (n)) n
Open Hashing :
  ≥ ( )
af b
cf n
Perfect Hash: When one function clashes, try another. O ( ). ∞ Building a recursion tree: build one tree for running times (at
Load Factor α: The length of a possible collision chain. When T (αn)) and one for f (n).
| |
U = n, α = . m

Orders of Growth
Methods f x →∞
f = O (g ) lim sup x→∞ fg < ∞ f = o(g ) g
→ 0
Modular: Multipilicative, Additive, Tabular(byte)-additive f = Θ (g) lim x→∞ fg R+ ∈
f x →∞
f = Ω (g) limin f x→∞ fg > 0 f = ω (g ) g
→ ∞
Performance
Chaining E [X ] Worst Case Amortized Analysis
Successful Search/Del 1
2(1 + α) n Potential Method: Set Φ to examine a parameter on data
Failed Search/Verified Insert 1+α n
stucture Di where i indexes the stat e of the structure. If ci is
the actual cost of action i, then cˆi = ci + Φ (Di ) Φ (Di−1 ). −
Probing n 
The total potential amortized cost will then be i=1 cˆi =
 n n 
Linear: h (k, i) = ( h (k ) + i) mod m i=1 (ci + Φ (Di ) −
Φ (Di−1 )) = i=1 ci + Φ (Di ) Φ (D0 ) −
Quadratic:h k, i) = h (k ) + c1 i + c2 i2 mod m Deterministic algorithm: Always predictable.
  
Double: h (k, i) = (h1 (k ) + ih2 (k)) mod m Stirling’s Approximation: n! 2πn ne
n
∼√
log n!
 ⇒ ∼
E [X ] Unsuccessful Search Successful Search x log x x −
1 1 1
Uni. Probing
Lin. Probing 1
1+
1 α
 −   1
2
1
1ln+ − 
α 1 α
1
2 1−α 2 1−α
So Linear Probing is slightly worse but better for cache.
Collision Expectation: P [X 2E [X ]] 21
≤ ≥
So:
Scribbled by Omer Shapira, based on the course “Data Structures” at Tel Aviv
1. if m = n then E [ Col < n] n2 University.
2. if m = n2 then E [ Col < 1]
| || | ≥≥ 1
2 And with 2 there are no Redistribute freely.
collisions. Website: http://www.omershapira.com

Potrebbero piacerti anche