Fundamental Data Structures

Contents

1 Introduction 1
1.1 Abstract data type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1.1 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1.2 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.1.3 Defining an abstract data type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.1.4 Advantages of abstract data typing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.1.5 Typical operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.1.6 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.1.7 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.1.8 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.1.9 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.1.10 Citations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.1.11 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.1.12 Further reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.1.13 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.2 Data structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.2.1 Usage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.2.2 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.2.3 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.2.4 Language support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.2.5 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.2.6 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.2.7 Bibliography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.2.8 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.3 Analysis of algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.3.1 Cost models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.3.2 Run-time analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.3.3 Relevance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.3.4 Constant factors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.3.5 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.3.6 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.3.7 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.4 Amortized analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.4.1 History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.4.2 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.4.3 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.4.4 Common use . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.4.5 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.5 Accounting method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.5.1 The method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.5.2 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.5.3 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.6 Potential method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.6.1 Definition of amortized time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.6.2 Relation between amortized and actual time . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.6.3 Amortized analysis of worst-case inputs . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.6.4 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.6.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.6.6 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

2 Sequences 18
2.1 Array data type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.1.1 History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.1.2 Abstract arrays . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.1.3 Implementations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.1.4 Language support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.1.5 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.1.6 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.1.7 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.2 Array data structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
2.2.1 History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.2.2 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.2.3 Element identifier and addressing formulas . . . . . . . . . . . . . . . . . . . . . . . . . . 22
2.2.4 Efficiency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2.2.5 Dimension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.2.6 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.2.7 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.3 Dynamic array . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.3.1 Bounded-size dynamic arrays and capacity . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.3.2 Geometric expansion and amortized cost . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.3.3 Growth factor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.3.4 Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.3.5 Variants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.3.6 Language support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.3.7 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.3.8 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.4 Linked list . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.4.1 Advantages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.4.2 Disadvantages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.4.3 History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.4.4 Basic concepts and nomenclature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.4.5 Tradeoffs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.4.6 Linked list operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.4.7 Linked lists using arrays of nodes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
2.4.8 Language support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.4.9 Internal and external storage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.4.10 Related data structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
2.4.11 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
2.4.12 Footnotes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
2.4.13 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
2.4.14 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
2.5 Doubly linked list . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
2.5.1 Nomenclature and implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
2.5.2 Basic algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
2.5.3 Advanced concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
2.5.4 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
2.5.5 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
2.6 Stack (abstract data type) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
2.6.1 History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
2.6.2 Non-essential operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
2.6.3 Software stacks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
2.6.4 Hardware stacks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
2.6.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
2.6.6 Security . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
2.6.7 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
2.6.8 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
2.6.9 Further reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
2.6.10 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
2.7 Queue (abstract data type) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
2.7.1 Queue implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
2.7.2 Purely functional implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
2.7.3 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
2.7.4 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
2.7.5 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
2.8 Double-ended queue . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
2.8.1 Naming conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
2.8.2 Distinctions and sub-types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
2.8.3 Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
2.8.4 Implementations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
2.8.5 Language support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
2.8.6 Complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
2.8.7 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
2.8.8 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
2.8.9 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
2.8.10 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
2.9 Circular buffer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
2.9.1 Uses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
2.9.2 How it works . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
2.9.3 Circular buffer mechanics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
2.9.4 Optimization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
2.9.5 Fixed-length-element and contiguous-block circular buffer . . . . . . . . . . . . . . . . . 52
2.9.6 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52

3 Dictionaries 53
3.1 Associative array . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
3.1.1 Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
3.1.2 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
3.1.3 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
3.1.4 Language support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.1.5 Permanent storage . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
3.1.6 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
3.1.7 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
3.1.8 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
3.2 Association list . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
3.2.1 Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
3.2.2 Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
3.2.3 Applications and software libraries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
3.2.4 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
3.2.5 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
3.3 Hash table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
3.3.1 Hashing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
3.3.2 Key statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
3.3.3 Collision resolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
3.3.4 Dynamic resizing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
3.3.5 Performance analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
3.3.6 Features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
3.3.7 Uses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
3.3.8 Implementations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
3.3.9 History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
3.3.10 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
3.3.11 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
3.3.12 Further reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
3.3.13 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
3.4 Linear probing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
3.4.1 Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
3.4.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
3.4.3 Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
3.4.4 Choice of hash function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
3.4.5 History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
3.4.6 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
3.5 Quadratic probing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
3.5.1 Quadratic function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
3.5.2 Quadratic probing insertion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
3.5.3 Quadratic probing search . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
3.5.4 Limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
3.5.5 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
3.5.6 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
3.5.7 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
3.6 Double hashing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
3.6.1 Classical applied data structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
3.6.2 Implementation details for caching . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
3.6.3 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
3.6.4 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
3.6.5 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
3.7 Cuckoo hashing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
3.7.1 History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
3.7.2 Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
3.7.3 Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
3.7.4 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
3.7.5 Variations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
3.7.6 Comparison with related structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
3.7.7 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
3.7.8 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
3.7.9 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
3.8 Hopscotch hashing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
3.8.1 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
3.8.2 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
3.8.3 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
3.9 Hash function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
3.9.1 Uses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
3.9.2 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
3.9.3 Hash function algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
3.9.4 Locality-sensitive hashing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
3.9.5 Origins of the term . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
3.9.6 List of hash functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
3.9.7 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
3.9.8 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
3.9.9 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
3.10 Perfect hash function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
3.10.1 Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
3.10.2 Construction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
3.10.3 Space lower bounds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
3.10.4 Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
3.10.5 Related constructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
3.10.6 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
3.10.7 Further reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
3.10.8 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
3.11 Universal hashing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
3.11.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
3.11.2 Mathematical guarantees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
3.11.3 Constructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
3.11.4 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
3.11.5 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
3.11.6 Further reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
3.11.7 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
3.12 K-independent hashing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
3.12.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
3.12.2 Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
3.12.3 Techniques . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
3.12.4 Independence needed by different hashing methods . . . . . . . . . . . . . . . . . . . . . 93
3.12.5 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
3.12.6 Further reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
3.13 Tabulation hashing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
3.13.1 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
3.13.2 History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
3.13.3 Universality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
3.13.4 Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
3.13.5 Extensions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
3.13.6 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
3.13.7 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
3.14 Cryptographic hash function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
3.14.1 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
3.14.2 Illustration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
3.14.3 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
3.14.4 Hash functions based on block ciphers . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
3.14.5 Merkle–Damgård construction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
3.14.6 Use in building other cryptographic primitives . . . . . . . . . . . . . . . . . . . . . . . . 100
3.14.7 Concatenation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
3.14.8 Cryptographic hash algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
3.14.9 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
3.14.10 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
3.14.11 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101

4 Sets 103
4.1 Set (abstract data type) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
4.1.1 Type theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
4.1.2 Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
4.1.3 Implementations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
4.1.4 Language support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
4.1.5 Multiset . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
4.1.6 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
4.1.7 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
4.1.8 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
4.2 Bit array . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
4.2.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
4.2.2 Basic operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
4.2.3 More complex operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
4.2.4 Compression . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
4.2.5 Advantages and disadvantages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
4.2.6 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
4.2.7 Language support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
4.2.8 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
4.2.9 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
4.2.10 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
4.3 Bloom filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110
4.3.1 Algorithm description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
4.3.2 Space and time advantages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
4.3.3 Probability of false positives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112
4.3.4 Approximating the number of items in a Bloom filter . . . . . . . . . . . . . . . . . . . . 114
4.3.5 The union and intersection of sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
4.3.6 Interesting properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
4.3.7 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
4.3.8 Alternatives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
4.3.9 Extensions and applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
4.3.10 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
4.3.11 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
4.3.12 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
4.3.13 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
4.4 MinHash . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
4.4.1 Jaccard similarity and minimum hash values . . . . . . . . . . . . . . . . . . . . . . . . . 121
4.4.2 Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
4.4.3 Min-wise independent permutations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
4.4.4 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
4.4.5 Other uses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
4.4.6 Evaluation and benchmarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
4.4.7 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
4.4.8 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
4.4.9 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
4.5 Disjoint-set data structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
4.5.1 Disjoint-set linked lists . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
4.5.2 Disjoint-set forests . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 124
4.5.3 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
4.5.4 History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
4.5.5 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
4.5.6 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
4.5.7 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
4.6 Partition refinement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
4.6.1 Data structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
4.6.2 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
4.6.3 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
4.6.4 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127

5 Priority queues 129
5.1 Priority queue . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
5.1.1 Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
5.1.2 Similarity to queues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
5.1.3 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
5.1.4 Equivalence of priority queues and sorting algorithms . . . . . . . . . . . . . . . . . . . . 130
5.1.5 Libraries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
5.1.6 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
5.1.7 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
5.1.8 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
5.1.9 Further reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
5.1.10 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
5.2 Bucket queue . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
5.2.1 Basic data structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
5.2.2 Optimizations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
5.2.3 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
5.2.4 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
5.3 Heap (data structure) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
5.3.1 Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
5.3.2 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
5.3.3 Variants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
5.3.4 Comparison of theoretic bounds for variants . . . . . . . . . . . . . . . . . . . . . . . . . 136
5.3.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
5.3.6 Implementations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
5.3.7 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
5.3.8 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
5.3.9 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
5.4 Binary heap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 137
5.4.1 Heap operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
5.4.2 Building a heap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
5.4.3 Heap implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
5.4.4 Derivation of index equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
5.4.5 Related structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
5.4.6 Summary of running times . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
5.4.7 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
5.4.8 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
5.4.9 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
5.5 d-ary heap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
5.5.1 Data structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
5.5.2 Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
5.5.3 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
5.5.4 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
5.5.5 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
5.6 Binomial heap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
5.6.1 Binomial heap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
5.6.2 Structure of a binomial heap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
5.6.3 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
5.6.4 Summary of running times . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
5.6.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
5.6.6 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
5.6.7 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
5.6.8 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
5.7 Fibonacci heap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
5.7.1 Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 148
5.7.2 Implementation of operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
5.7.3 Proof of degree bounds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
5.7.4 Worst case . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
5.7.5 Summary of running times . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
5.7.6 Practical considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
5.7.7 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
5.7.8 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
5.8 Pairing heap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
5.8.1 Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
5.8.2 Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
5.8.3 Summary of running times . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
5.8.4 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
5.8.5 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
5.9 Double-ended priority queue . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
5.9.1 Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
5.9.2 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
5.9.3 Time complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
5.9.4 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
5.9.5 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
5.9.6 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
5.10 Soft heap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
5.10.1 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 157
5.10.2 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 158

6 Successors and neighbors 159
6.1 Binary search algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
6.1.1 Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
6.1.2 Performance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
6.1.3 Binary search versus other schemes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
6.1.4 Variations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
6.1.5 History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
6.1.6 Implementation issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
6.1.7 Library support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 163
6.1.8 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
6.1.9 Notes and references . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
6.1.10 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
6.2 Binary search tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
6.2.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
6.2.2 Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
6.2.3 Examples of applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
6.2.4 Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170
6.2.5 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
6.2.6 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
6.2.7 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
6.2.8 Further reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
6.2.9 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
6.3 Random binary tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
6.3.1 Binary trees from random permutations . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
6.3.2 Uniformly random binary trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
6.3.3 Random split trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
6.3.4 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
6.3.5 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
6.3.6 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
6.4 Tree rotation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
6.4.1 Illustration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
6.4.2 Detailed illustration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
6.4.3 Inorder invariance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
6.4.4 Rotations for rebalancing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
6.4.5 Rotation distance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
6.4.6 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
6.4.7 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
6.4.8 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
6.5 Self-balancing binary search tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
6.5.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
6.5.2 Implementations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
6.5.3 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
6.5.4 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
6.5.5 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
6.5.6 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
6.6 Treap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
6.6.1 Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179
6.6.2 Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179
6.6.3 Randomized binary search tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180
6.6.4 Comparison . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180
6.6.5 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
6.6.6 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
6.6.7 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
6.7 AVL tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
6.7.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
6.7.2 Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
6.7.3 Rebalancing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184
6.7.4 Comparison to other structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
6.7.5 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 186
6.7.6 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
6.7.7 Further reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
6.7.8 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
6.8 Red–black tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
6.8.1 History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
6.8.2 Terminology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
6.8.3 Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
6.8.4 Analogy to B-trees of order 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
6.8.5 Applications and related data structures . . . . . . . . . . . . . . . . . . . . . . . . . . . 190
6.8.6 Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190
6.8.7 Proof of asymptotic bounds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
6.8.8 Set operations and bulk operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
6.8.9 Parallel algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
6.8.10 Popular culture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
6.8.11 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
6.8.12 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
6.8.13 Further reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
6.8.14 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
6.9 WAVL tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
6.9.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
6.9.2 Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
6.9.3 Computational complexity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
6.9.4 Related structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
6.9.5 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
6.10 Scapegoat tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
6.10.1 Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
6.10.2 Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 199
6.10.3 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
6.10.4 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
6.10.5 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
6.11 Splay tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
6.11.1 Advantages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
6.11.2 Disadvantages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
6.11.3 Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 202
6.11.4 Implementation and variants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
6.11.5 Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
6.11.6 Performance theorems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
6.11.7 Dynamic optimality conjecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
6.11.8 Variants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 205
6.11.9 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206
6.11.10 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206
6.11.11 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 206
6.11.12 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
6.12 Tango tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
6.12.1 Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
6.12.2 Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 207
6.12.3 Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
6.12.4 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
6.12.5 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
6.13 Skip list . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
6.13.1 Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 209
6.13.2 History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210
6.13.3 Usages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 210
6.13.4 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211
6.13.5 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211
6.13.6 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 211
6.14 B-tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212
6.14.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 212
6.14.2 B-tree usage in databases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 213
6.14.3 Technical description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 214
6.14.4 Best case and worst case heights . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
6.14.5 Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 215
6.14.6 In filesystems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218
6.14.7 Variations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218
6.14.8 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 218
6.14.9 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
6.14.10 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
6.14.11 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 219
6.15 B+ tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220
6.15.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220
6.15.2 Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220
6.15.3 Characteristics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
6.15.4 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 221
6.15.5 History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222
6.15.6 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222
6.15.7 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222
6.15.8 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 222

7 Integer and string searching 223
7.1 Trie . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223
7.1.1 History and etymology . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223
7.1.2 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223
7.1.3 Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224
7.1.4 Implementation strategies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224
7.1.5 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
7.1.6 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 226
7.1.7 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
7.2 Radix tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 227
7.2.1 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 228
7.2.2 Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 228
7.2.3 History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
7.2.4 Comparison to other data structures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 229
7.2.5 Variants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230
7.2.6 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230
7.2.7 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 230
7.2.8 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231
7.3 Suffix tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231
7.3.1 History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231
7.3.2 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232
7.3.3 Generalized suffix tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232
7.3.4 Functionality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 232
7.3.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
7.3.6 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
7.3.7 Parallel construction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 233
7.3.8 External construction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 234
7.3.9 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 234
7.3.10 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 234
7.3.11 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 234
7.3.12 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
7.4 Suffix array . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
7.4.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
7.4.2 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
7.4.3 Correspondence to suffix trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
7.4.4 Space Efficiency . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
7.4.5 Construction Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 236
7.4.6 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237
7.4.7 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237
7.4.8 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 237
7.4.9 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 238
7.5 Suffix automaton . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 238
7.5.1 See also . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239
7.5.2 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239

7.5.3 Additional reading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239


7.6 Van Emde Boas tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239
7.6.1 Supported operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239
7.6.2 How it works . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 239
7.6.3 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
7.7 Fusion tree . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
7.7.1 How it works . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 242
7.7.2 Fusion hashing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243
7.7.3 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243
7.7.4 External links . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 244

8 Text and image sources, contributors, and licenses 245


8.1 Text . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245
8.2 Images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 256
8.3 Content license . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 261
Chapter 1

Introduction

1.1 Abstract data type

Not to be confused with Algebraic data type.

In computer science, an abstract data type (ADT) is a mathematical model for data types where a data type is defined by its behavior (semantics) from the point of view of a user of the data, specifically in terms of possible values, possible operations on data of this type, and the behavior of these operations. This contrasts with data structures, which are concrete representations of data, and are the point of view of an implementer, not a user.

Formally, an ADT may be defined as a "class of objects whose logical behavior is defined by a set of values and a set of operations";[1] this is analogous to an algebraic structure in mathematics. What is meant by "behavior" varies by author, with the two main types of formal specifications for behavior being axiomatic (algebraic) specification and an abstract model;[2] these correspond to axiomatic semantics and operational semantics of an abstract machine, respectively. Some authors also include the computational complexity ("cost"), both in terms of time (for computing operations) and space (for representing values). In practice many common data types are not ADTs, as the abstraction is not perfect, and users must be aware of issues like arithmetic overflow that are due to the representation. For example, integers are often stored as fixed-width values (32-bit or 64-bit binary numbers), and thus experience integer overflow if the maximum value is exceeded.

ADTs are a theoretical concept in computer science, used in the design and analysis of algorithms, data structures, and software systems, and do not correspond to specific features of computer languages—mainstream computer languages do not directly support formally specified ADTs. However, various language features correspond to certain aspects of ADTs, and are easily confused with ADTs proper; these include abstract types, opaque data types, protocols, and design by contract. ADTs were first proposed by Barbara Liskov and Stephen N. Zilles in 1974, as part of the development of the CLU language.[3]

1.1.1 Examples

For example, integers are an ADT, defined as the values …, −2, −1, 0, 1, 2, …, and by the operations of addition, subtraction, multiplication, and division, together with greater than, less than, etc., which behave according to familiar mathematics (with care for integer division), independently of how the integers are represented by the computer.[lower-alpha 1] Explicitly, "behavior" includes obeying various axioms (associativity and commutativity of addition, etc.) and preconditions on operations (one cannot divide by zero). Typically integers are represented in a data structure as binary numbers, most often as two's complement, but might be binary-coded decimal or ones' complement; the user is abstracted from the concrete choice of representation, and can simply use the data as integers.

An ADT consists not only of operations, but also of values of the underlying data and of constraints on the operations. An "interface" typically refers only to the operations, and perhaps some of the constraints on the operations, notably pre-conditions and post-conditions, but not other constraints, such as relations between the operations.

For example, an abstract stack, which is a last-in-first-out structure, could be defined by three operations: push, that inserts a data item onto the stack; pop, that removes a data item from it; and peek or top, that accesses a data item on top of the stack without removal. An abstract queue, which is a first-in-first-out structure, would also have three operations: enqueue, that inserts a data item into the queue; dequeue, that removes the first data item from it; and front, that accesses and serves the first data item in the queue. There would be no way of differentiating these two data types, unless a mathematical constraint is introduced that for a stack specifies that each pop always returns the most recently pushed item that has not been popped yet. When analyzing the efficiency of algorithms that use stacks, one may also specify that all operations take the same time no matter how many data items have been pushed into the stack, and that the stack uses a constant amount of storage for each element.
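To make the stack/queue contrast concrete, here is a small C program (an illustrative sketch added for this edition, not part of the original article; all names are invented) that feeds the same items to a bounded stack and a bounded queue and shows that only the order of removal distinguishes the two:

#include <stdio.h>

#define CAP 16

/* Bounded stack: insert and remove at the same end (last in, first out). */
typedef struct { int items[CAP]; int top; } Stack;
void push(Stack *s, int x) { s->items[s->top++] = x; }
int pop(Stack *s) { return s->items[--s->top]; }

/* Bounded queue: insert at the back, remove at the front (first in, first out). */
typedef struct { int items[CAP]; int front, back; } Queue;
void enqueue(Queue *q, int x) { q->items[q->back++] = x; }
int dequeue(Queue *q) { return q->items[q->front++]; }

int main(void) {
    Stack s = { .top = 0 };
    Queue q = { .front = 0, .back = 0 };
    for (int x = 1; x <= 3; x++) { push(&s, x); enqueue(&q, x); }
    printf("stack:");   /* pop returns the most recently pushed item not yet popped */
    for (int i = 0; i < 3; i++) printf(" %d", pop(&s));      /* prints 3 2 1 */
    printf("\nqueue:"); /* dequeue returns the first item inserted */
    for (int i = 0; i < 3; i++) printf(" %d", dequeue(&q));  /* prints 1 2 3 */
    printf("\n");
    return 0;
}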


1.1.2 Introduction

Abstract data types are purely theoretical entities, used (among other things) to simplify the description of abstract algorithms, to classify and evaluate data structures, and to formally describe the type systems of programming languages. However, an ADT may be implemented by specific data types or data structures, in many ways and in many programming languages; or described in a formal specification language. ADTs are often implemented as modules: the module's interface declares procedures that correspond to the ADT operations, sometimes with comments that describe the constraints. This information-hiding strategy allows the implementation of the module to be changed without disturbing the client programs.

The term abstract data type can also be regarded as a generalized approach of a number of algebraic structures, such as lattices, groups, and rings.[4] The notion of abstract data types is related to the concept of data abstraction, important in object-oriented programming and design by contract methodologies for software development.[5]

1.1.3 Defining an abstract data type

An abstract data type is defined as a mathematical model of the data objects that make up a data type as well as the functions that operate on these objects. There are no standard conventions for defining them. A broad division may be drawn between "imperative" and "functional" definition styles.

Imperative-style definition

In the philosophy of imperative programming languages, an abstract data structure is conceived as an entity that is mutable—meaning that it may be in different states at different times. Some operations may change the state of the ADT; therefore, the order in which operations are evaluated is important, and the same operation on the same entities may have different effects if executed at different times—just like the instructions of a computer, or the commands and procedures of an imperative language. To underscore this view, it is customary to say that the operations are executed or applied, rather than evaluated. The imperative style is often used when describing abstract algorithms. (See The Art of Computer Programming by Donald Knuth for more details.)

Abstract variable

Imperative-style definitions of ADT often depend on the concept of an abstract variable, which may be regarded as the simplest non-trivial ADT. An abstract variable V is a mutable entity that admits two operations:

• store(V, x), where x is a value of unspecified nature;
• fetch(V), that yields a value,

with the constraint that

• fetch(V) always returns the value x used in the most recent store(V, x) operation on the same variable V.

As in so many programming languages, the operation store(V, x) is often written V ← x (or some similar notation), and fetch(V) is implied whenever a variable V is used in a context where a value is required. Thus, for example, V ← V + 1 is commonly understood to be a shorthand for store(V, fetch(V) + 1).
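As a minimal illustration (a sketch added here, not from the original text), an abstract variable whose range is restricted to int can be modeled in C by hiding the representation behind store and fetch, with V ← V + 1 written out as its expansion:

#include <stdio.h>

/* An abstract variable with range int; the representation is a hidden detail. */
typedef struct { int value; } Var;

void store(Var *v, int x) { v->value = x; }  /* store(V, x) */
int fetch(const Var *v) { return v->value; } /* fetch(V): the most recently stored value */

int main(void) {
    Var v;
    store(&v, 41);
    store(&v, fetch(&v) + 1);  /* V <- V + 1, i.e. store(V, fetch(V) + 1) */
    printf("%d\n", fetch(&v)); /* prints 42 */
    return 0;
}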

The term abstract data type can also be regarded as a gen- In this definition, it is implicitly assumed that storing a
eralized approach of a number of algebraic structures, value into a variable U has no effect on the state of a dis-
such as lattices, groups, and rings.[4] The notion of ab- tinct variable V. To make this assumption explicit, one
stract data types is related to the concept of data ab- could add the constraint that
straction, important in object-oriented programming and
design by contract methodologies for software develop- • if U and V are distinct variables, the sequence {
ment.[5] store(U, x); store(V, y) } is equivalent to { store(V,
y); store(U, x) }.

1.1.3 Defining an abstract data type More generally, ADT definitions often assume that any
operation that changes the state of one ADT instance has
An abstract data type is defined as a mathematical model no effect on the state of any other instance (including
of the data objects that make up a data type as well as the other instances of the same ADT) — unless the ADT ax-
functions that operate on these objects. There are no stan- ioms imply that the two instances are connected (aliased)
dard conventions for defining them. A broad division may in that sense. For example, when extending the definition
be drawn between “imperative” and “functional” defini- of abstract variable to include abstract records, the opera-
tion styles. tion that selects a field from a record variable R must yield
a variable V that is aliased to that part of R.
The definition of an abstract variable V may also restrict
Imperative-style definition
the stored values x to members of a specific set X, called
the range or type of V. As in programming languages,
In the philosophy of imperative programming languages,
such restrictions may simplify the description and analysis
an abstract data structure is conceived as an entity that is
of algorithms, and improve their readability.
mutable—meaning that it may be in different states at dif-
ferent times. Some operations may change the state of the Note that this definition does not imply anything about
ADT; therefore, the order in which operations are eval- the result of evaluating fetch(V) when V is un-initialized,
uated is important, and the same operation on the same that is, before performing any store operation on V. An
entities may have different effects if executed at differ- algorithm that does so is usually considered invalid, be-
ent times—just like the instructions of a computer, or the cause its effect is not defined. (However, there are some
commands and procedures of an imperative language. To important algorithms whose efficiency strongly depends
underscore this view, it is customary to say that the oper- on the assumption that such a fetch is legal, and returns
ations are executed or applied, rather than evaluated. The some arbitrary value in the variable’s range.)
imperative style is often used when describing abstract
algorithms. (See The Art of Computer Programming by
Donald Knuth for more details) Instance creation Some algorithms need to create new
instances of some ADT (such as new variables, or new
stacks). To describe such algorithms, one usually includes
Abstract variable Imperative-style definitions of in the ADT definition a create() operation that yields an
ADT often depend on the concept of an abstract vari- instance of the ADT, usually with axioms equivalent to
able, which may be regarded as the simplest non-trivial
ADT. An abstract variable V is a mutable entity that • the result of create() is distinct from any instance in
admits two operations: use by the algorithm.

This axiom may be strengthened to exclude also partial aliasing with other instances. On the other hand, this axiom still allows implementations of create() to yield a previously created instance that has become inaccessible to the program.

Example: abstract stack (imperative)

As another example, an imperative-style definition of an abstract stack could specify that the state of a stack S can be modified only by the operations

• push(S, x), where x is some value of unspecified nature;
• pop(S), that yields a value as a result,

with the constraint that

• for any value x and any abstract variable V, the sequence of operations { push(S, x); V ← pop(S) } is equivalent to V ← x.

Since the assignment V ← x, by definition, cannot change the state of S, this condition implies that V ← pop(S) restores S to the state it had before the push(S, x). From this condition and from the properties of abstract variables, it follows, for example, that the sequence

{ push(S, x); push(S, y); U ← pop(S); push(S, z); V ← pop(S); W ← pop(S) }

where x, y, and z are any values, and U, V, W are pairwise distinct variables, is equivalent to

{ U ← y; V ← z; W ← x }

Here it is implicitly assumed that operations on a stack instance do not modify the state of any other ADT instance, including other stacks; that is,

• for any values x, y, and any distinct stacks S and T, the sequence { push(S, x); push(T, y) } is equivalent to { push(T, y); push(S, x) }.

An abstract stack definition usually includes also a Boolean-valued function empty(S) and a create() operation that returns a stack instance, with axioms equivalent to

• create() ≠ S for any stack S (a newly created stack is distinct from all previous stacks);
• empty(create()) (a newly created stack is empty);
• not empty(push(S, x)) (pushing something into a stack makes it non-empty).
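These axioms can be exercised mechanically. The following sketch (added for illustration; the fixed-size array is an arbitrary concrete model, not part of the abstract definition) runs the sequence { push(S, x); push(S, y); U ← pop(S); push(S, z); V ← pop(S); W ← pop(S) } and asserts the equivalence { U ← y; V ← z; W ← x } derived above:

#include <assert.h>
#include <stdio.h>

typedef struct { int items[8]; int top; } Stack;
void push(Stack *s, int x) { s->items[s->top++] = x; }
int pop(Stack *s) { return s->items[--s->top]; }
int empty(const Stack *s) { return s->top == 0; }

int main(void) {
    Stack S = { .top = 0 };
    int x = 10, y = 20, z = 30;
    push(&S, x);
    push(&S, y);
    int U = pop(&S);  /* pop undoes the push of y, so U <- y */
    push(&S, z);
    int V = pop(&S);  /* V <- z */
    int W = pop(&S);  /* W <- x */
    assert(U == y && V == z && W == x);
    assert(empty(&S)); /* every push has been matched by a pop */
    printf("U=%d V=%d W=%d\n", U, V, W);
    return 0;
}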
Single-instance style

Sometimes an ADT is defined as if only one instance of it existed during the execution of the algorithm, and all operations were applied to that instance, which is not explicitly notated. For example, the abstract stack above could have been defined with operations push(x) and pop(), that operate on the only existing stack. ADT definitions in this style can be easily rewritten to admit multiple coexisting instances of the ADT, by adding an explicit instance parameter (like S in the previous example) to every operation that uses or modifies the implicit instance.

On the other hand, some ADTs cannot be meaningfully defined without assuming multiple instances. This is the case when a single operation takes two distinct instances of the ADT as parameters. For an example, consider augmenting the definition of the abstract stack with an operation compare(S, T) that checks whether the stacks S and T contain the same items in the same order.

Functional-style definition

Another way to define an ADT, closer to the spirit of functional programming, is to consider each state of the structure as a separate entity. In this view, any operation that modifies the ADT is modeled as a mathematical function that takes the old state as an argument, and returns the new state as part of the result. Unlike the imperative operations, these functions have no side effects. Therefore, the order in which they are evaluated is immaterial, and the same operation applied to the same arguments (including the same input states) will always return the same results (and output states).

In the functional view, in particular, there is no way (or need) to define an "abstract variable" with the semantics of imperative variables (namely, with fetch and store operations). Instead of storing values into variables, one passes them as arguments to functions.

Example: abstract stack (functional)

For example, a complete functional-style definition of an abstract stack could use the three operations:

• push: takes a stack state and an arbitrary value, returns a stack state;
• top: takes a stack state, returns a value;
• pop: takes a stack state, returns a stack state.

In a functional-style definition there is no need for a create operation. Indeed, there is no notion of "stack instance". The stack states can be thought of as being potential states of a single stack structure, and two stack states that contain the same values in the same order are considered to be identical states. This view actually mirrors the behavior of some concrete implementations, such as linked lists with hash cons.

Instead of create(), a functional-style definition of an abstract stack may assume the existence of a special stack state, the empty stack, designated by a special symbol like Λ or "()"; or define a bottom() operation that takes no arguments and returns this special stack state. Note that the axioms imply that

• push(Λ, x) ≠ Λ.

In a functional-style definition of a stack one does not need an empty predicate: instead, one can test whether a stack is empty by testing whether it is equal to Λ.

Note that these axioms do not define the effect of top(s) or pop(s), unless s is a stack state returned by a push. Since push leaves the stack non-empty, those two operations are undefined (hence invalid) when s = Λ. On the other hand, the axioms (and the lack of side effects) imply that push(s, x) = push(t, y) if and only if x = y and s = t.

As in some other branches of mathematics, it is customary to assume also that the stack states are only those whose existence can be proved from the axioms in a finite number of steps. In the abstract stack example above, this rule means that every stack is a finite sequence of values, that becomes the empty stack (Λ) after a finite number of pops. By themselves, the axioms above do not exclude the existence of infinite stacks (that can be popped forever, each time yielding a different state) or circular stacks (that return to the same state after a finite number of pops). In particular, they do not exclude states s such that pop(s) = s or push(s, x) = s for some x. However, since one cannot obtain such stack states with the given operations, they are assumed "not to exist".

Whether to include complexity

Aside from the behavior in terms of axioms, it is also possible to include, in the definition of an ADT operation, their algorithmic complexity. Alexander Stepanov, designer of the C++ Standard Template Library, included complexity guarantees in the STL specification, arguing:

    The reason for introducing the notion of abstract data types was to allow interchangeable software modules. You cannot have interchangeable modules unless these modules share similar complexity behavior. If I replace one module with another module with the same functional behavior but with different complexity tradeoffs, the user of this code will be unpleasantly surprised. I could tell him anything I like about data abstraction, and he still would not want to use the code. Complexity assertions have to be part of the interface.
    — Alexander Stepanov[6]

1.1.4 Advantages of abstract data typing

Encapsulation

Abstraction provides a promise that any implementation of the ADT has certain properties and abilities; knowing these is all that is required to make use of an ADT object. The user does not need any technical knowledge of how the implementation works to use the ADT. In this way, the implementation may be complex but will be encapsulated in a simple interface when it is actually used.

Localization of change

Code that uses an ADT object will not need to be edited if the implementation of the ADT is changed. Since any changes to the implementation must still comply with the interface, and since code using an ADT object may only refer to properties and abilities specified in the interface, changes may be made to the implementation without requiring any changes in code where the ADT is used.

Flexibility

Different implementations of the ADT, having all the same properties and abilities, are equivalent and may be used somewhat interchangeably in code that uses the ADT. This gives a great deal of flexibility when using ADT objects in different situations. For example, different implementations of the ADT may be more efficient in different situations; it is possible to use each in the situation where they are preferable, thus increasing overall efficiency.

1.1.5 Typical operations

Some operations that are often specified for ADTs (possibly under other names) are

• compare(s, t), that tests whether two instances' states are equivalent in some sense;
• hash(s), that computes some standard hash function from the instance's state;
• print(s) or show(s), that produces a human-readable representation of the instance's state.

In imperative-style ADT definitions, one often finds also

• create(), that yields a new instance of the ADT;
• initialize(s), that prepares a newly created instance s for further operations, or resets it to some "initial state";
• copy(s, t), that puts instance s in a state equivalent to that of t;

• clone(t), that performs s ← create(), copy(s, t), and returns s;
• free(s) or destroy(s), that reclaims the memory and other resources used by s.

The free operation is not normally relevant or meaningful, since ADTs are theoretical entities that do not "use memory". However, it may be necessary when one needs to analyze the storage used by an algorithm that uses the ADT. In that case one needs additional axioms that specify how much memory each ADT instance uses, as a function of its state, and how much of it is returned to the pool by free.

1.1.6 Examples

Some common ADTs, which have proved useful in a great variety of applications, are

• Container
• List
• Set
• Multiset
• Map
• Multimap
• Graph
• Stack
• Queue
• Priority queue
• Double-ended queue
• Double-ended priority queue

Each of these ADTs may be defined in many ways and variants, not necessarily equivalent. For example, an abstract stack may or may not have a count operation that tells how many items have been pushed and not yet popped. This choice makes a difference not only for its clients but also for the implementation.

Abstract graphical data type

An extension of ADT for computer graphics was proposed in 1979:[7] an abstract graphical data type (AGDT). It was introduced by Nadia Magnenat Thalmann and Daniel Thalmann. AGDTs provide the advantages of ADTs with facilities to build graphical objects in a structured way.

1.1.7 Implementation

Further information: Opaque data type

Implementing an ADT means providing one procedure or function for each abstract operation. The ADT instances are represented by some concrete data structure that is manipulated by those procedures, according to the ADT's specifications.

Usually there are many ways to implement the same ADT, using several different concrete data structures. Thus, for example, an abstract stack can be implemented by a linked list or by an array.

In order to prevent clients from depending on the implementation, an ADT is often packaged as an opaque data type in one or more modules, whose interface contains only the signature (number and types of the parameters and results) of the operations. The implementation of the module—namely, the bodies of the procedures and the concrete data structure used—can then be hidden from most clients of the module. This makes it possible to change the implementation without affecting the clients. If the implementation is exposed, it is known instead as a transparent data type.

When implementing an ADT, each instance (in imperative-style definitions) or each state (in functional-style definitions) is usually represented by a handle of some sort.[8]

Modern object-oriented languages, such as C++ and Java, support a form of abstract data types. When a class is used as a type, it is an abstract type that refers to a hidden representation. In this model an ADT is typically implemented as a class, and each instance of the ADT is usually an object of that class. The module's interface typically declares the constructors as ordinary procedures, and most of the other ADT operations as methods of that class. However, such an approach does not easily encapsulate multiple representational variants found in an ADT. It also can undermine the extensibility of object-oriented programs. In a pure object-oriented program that uses interfaces as types, types refer to behaviors, not representations.

Example: implementation of the abstract stack

As an example, here is an implementation of the abstract stack above in the C programming language.

Imperative-style interface

An imperative-style interface might be:

typedef struct stack_Rep stack_Rep;       // type: stack instance representation (opaque record)
typedef stack_Rep* stack_T;               // type: handle to a stack instance (opaque pointer)
typedef void* stack_Item;                 // type: value stored in stack instance (arbitrary address)

stack_T stack_create(void);               // creates a new empty stack instance
void stack_push(stack_T s, stack_Item x); // adds an item at the top of the stack
stack_Item stack_pop(stack_T s);          // removes the top item from the stack and returns it
bool stack_empty(stack_T s);              // checks whether stack is empty

This interface could be used in the following manner:

#include <stack.h>          // includes the stack interface

stack_T s = stack_create(); // creates a new empty stack instance
int x = 17;
stack_push(s, &x);          // adds the address of x at the top of the stack
void* y = stack_pop(s);     // removes the address of x from the stack and returns it
if (stack_empty(s)) { }     // does something if stack is empty

This interface can be implemented in many ways. The implementation may be arbitrarily inefficient, since the formal definition of the ADT, above, does not specify how much space the stack may use, nor how long each operation should take. It also does not specify whether the stack state s continues to exist after a call x ← pop(s). In practice the formal definition should specify that the space is proportional to the number of items pushed and not yet popped; and that every one of the operations above must finish in a constant amount of time, independently of that number. To comply with these additional specifications, the implementation could use a linked list, or an array (with dynamic resizing) together with two integers (an item count and the array size).
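A linked-list implementation along those lines might look as follows. This is only one possible sketch, added here for illustration: it repeats the type declarations so it stands alone (in practice the opaque typedefs would come from the header) and omits error handling, such as popping an empty stack, for brevity. Space is proportional to the number of items pushed and not yet popped, and each operation runs in constant time.

#include <stdbool.h>
#include <stdlib.h>

typedef void* stack_Item;

/* Concrete representation: a singly linked list of the pushed items. */
typedef struct stack_Node { stack_Item item; struct stack_Node *next; } stack_Node;
typedef struct stack_Rep { stack_Node *top; } stack_Rep;
typedef stack_Rep* stack_T;

stack_T stack_create(void) {               /* creates a new empty stack instance */
    stack_T s = malloc(sizeof *s);
    s->top = NULL;
    return s;
}

void stack_push(stack_T s, stack_Item x) { /* O(1): link a new node on top */
    stack_Node *n = malloc(sizeof *n);
    n->item = x;
    n->next = s->top;
    s->top = n;
}

stack_Item stack_pop(stack_T s) {          /* O(1): unlink and free the top node */
    stack_Node *n = s->top;
    stack_Item x = n->item;
    s->top = n->next;
    free(n);
    return x;
}

bool stack_empty(stack_T s) { return s->top == NULL; }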
Functional-style interface

Functional-style ADT definitions are more appropriate for functional programming languages, and vice versa. However, one can provide a functional-style interface even in an imperative language like C. For example:

typedef struct stack_Rep stack_Rep;          // type: stack state representation (opaque record)
typedef stack_Rep* stack_T;                  // type: handle to a stack state (opaque pointer)
typedef void* stack_Item;                    // type: value of a stack state (arbitrary address)

stack_T stack_empty(void);                   // returns the empty stack state
stack_T stack_push(stack_T s, stack_Item x); // adds an item at the top of the stack state and returns the resulting stack state
stack_T stack_pop(stack_T s);                // removes the top item from the stack state and returns the resulting stack state
stack_Item stack_top(stack_T s);             // returns the top item of the stack state
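One plausible implementation of this functional-style interface (again a sketch added for illustration, not from the original text) is an immutable linked list with structural sharing: stack_push allocates a single node whose tail is the old state, and stack_pop simply returns the previous state, so all earlier stack states remain valid. Nothing is freed here, because old states stay reachable; in practice such implementations rely on garbage collection or reference counting, as with the hash-consed linked lists mentioned earlier.

#include <stdlib.h>

typedef void* stack_Item;

/* Each stack state is an immutable node; the empty state (Λ) is NULL. */
typedef struct stack_Rep { stack_Item item; struct stack_Rep *rest; } stack_Rep;
typedef stack_Rep* stack_T;

stack_T stack_empty(void) { return NULL; }      /* the empty stack state */

stack_T stack_push(stack_T s, stack_Item x) {
    stack_Rep *n = malloc(sizeof *n);
    n->item = x;
    n->rest = s;  /* the old state is shared, never modified */
    return n;
}

stack_T stack_pop(stack_T s) { return s->rest; }     /* undefined for the empty state */

stack_Item stack_top(stack_T s) { return s->item; }  /* undefined for the empty state */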

ADT libraries

Many modern programming languages, such as C++ and Java, come with standard libraries that implement several common ADTs, such as those listed above.

Built-in abstract data types

The specification of some programming languages is intentionally vague about the representation of certain built-in data types, defining only the operations that can be done on them. Therefore, those types can be viewed as "built-in ADTs". Examples are the arrays in many scripting languages, such as Awk, Lua, and Perl, which can be regarded as an implementation of the abstract list.

1.1.8 See also

• Concept (generic programming)
• Formal methods
• Functional specification
• Generalized algebraic data type
• Initial algebra
• Liskov substitution principle
• Type theory
• Walls and Mirrors

1.1.9 Notes

[1] Compare to the characterization of integers in abstract algebra.

1.1.10 Citations

[1] Dale & Walker 1996, p. 3.
[2] Dale & Walker 1996, p. 4.
[3] Liskov & Zilles 1974.
[4] Rudolf Lidl (2004). Abstract Algebra. Springer. ISBN 81-8128-149-7. Chapter 7, section 40.
[5] "What Is Object-Oriented Programming?". Hiring | Upwork. 2015-05-05. Retrieved 2016-10-28.
[6] Stevens, Al (March 1995). "Al Stevens Interviews Alex Stepanov". Dr. Dobb's Journal. Retrieved 31 January 2015.
[7] D. Thalmann, N. Magnenat Thalmann (1979). Design and Implementation of Abstract Graphical Data Types (PDF). Proc. 3rd International Computer Software and Applications Conference (COMPSAC'79), IEEE, Chicago, USA, pp. 519–524.
[8] Robert Sedgewick (1998). Algorithms in C. Addison-Wesley. ISBN 0-201-31452-5. Definition 4.4.

1.1.11 References

• Liskov, Barbara; Zilles, Stephen (1974). "Programming with abstract data types". Proceedings of the ACM SIGPLAN Symposium on Very High Level Languages. SIGPLAN Notices. 9. pp. 50–59. doi:10.1145/800233.807045.
• Dale, Nell; Walker, Henry M. (1996). Abstract Data Types: Specifications, Implementations, and Applications. Jones & Bartlett Learning. ISBN 978-0-66940000-7.

1.1.12 Further reading

• Mitchell, John C.; Plotkin, Gordon (July 1988). "Abstract Types Have Existential Type" (PDF). ACM Transactions on Programming Languages and Systems. 10 (3). doi:10.1145/44501.45065.

1.1.13 External links

• Abstract data type in NIST Dictionary of Algorithms and Data Structures

1.2 Data structure

For information on Wikipedia's data structure, see Wikipedia:Administration § Data structure and development.
Not to be confused with data type.

In computer science, a data structure is a particular way of organizing data in a computer so that it can be used efficiently.[1][2]

[Figure: a hash table, mapping keys through a hash function into buckets.]

1.2.1 Usage

Data structures can implement one or more particular abstract data types (ADT), which specify the operations that can be performed on a data structure and the computational complexity of those operations. In comparison, a data structure is a concrete implementation of the specification provided by an ADT.

Different kinds of data structures are suited to different kinds of applications, and some are highly specialized to specific tasks. For example, relational databases commonly use B-tree indexes for data retrieval,[3] while compiler implementations usually use hash tables to look up identifiers.

Data structures provide a means to manage large amounts of data efficiently for uses such as large databases and internet indexing services. Usually, efficient data structures are key to designing efficient algorithms. Some formal design methods and programming languages emphasize data structures, rather than algorithms, as the key organizing factor in software design. Data structures can be used to organize the storage and retrieval of information stored in both main memory and secondary memory.

1.2.2 Implementation

Data structures are generally based on the ability of a computer to fetch and store data at any place in its memory, specified by a pointer—a bit string, representing a memory address, that can be itself stored in memory and manipulated by the program. Thus, the array and record data structures are based on computing the addresses of data items with arithmetic operations; while the linked data structures are based on storing addresses of data items within the structure itself. Many data structures use both principles, sometimes combined in non-trivial ways (as in XOR linking).

The implementation of a data structure usually requires writing a set of procedures that create and manipulate instances of that structure. The efficiency of a data structure cannot be analyzed separately from those operations. This observation motivates the theoretical concept of an abstract data type, a data structure that is defined indirectly by the operations that may be performed on it, and the mathematical properties of those operations (including their space and time cost).
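Both addressing principles can be seen directly in C; the fragment below is an illustrative sketch added for this edition. Array indexing computes an element's address arithmetically from the base address, while a linked node stores the address of its successor inside the structure itself:

#include <stdio.h>

typedef struct node { int value; struct node *next; } node;

int main(void) {
    /* Array: the address of a[i] is computed as base + i * sizeof(int). */
    int a[4] = { 10, 20, 30, 40 };
    printf("a[2] = %d, stored at %p\n", *(a + 2), (void*)(a + 2));

    /* Linked structure: each node stores the address of the next node. */
    node n2 = { 20, NULL };
    node n1 = { 10, &n2 };
    printf("after %d comes %d, stored at %p\n", n1.value, n1.next->value, (void*)n1.next);
    return 0;
}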
1.2.3 Examples

Main article: List of data structures

There are numerous types of data structures, generally built upon simpler primitive data types:

• An array is a number of elements in a specific order, typically all of the same type. Elements are accessed using an integer index to specify which element is required (depending on the language, individual elements may either all be forced to be the same type, or may be of almost any type). Typical implementations allocate contiguous memory words for the elements of arrays (but this is not always a necessity). Arrays may be fixed-length or resizable.

• A linked list (also just called list) is a linear collection of data elements of any type, called nodes, where each node has itself a value, and points to the next node in the linked list. The principal advantage of a linked list over an array is that values can always be efficiently inserted and removed without relocating the rest of the list. Certain other operations, such as random access to a certain element, are however slower on lists than on arrays.

• A record (also called tuple or struct) is an aggregate data structure. A record is a value that contains other values, typically in fixed number and sequence and typically indexed by names. The elements of records are usually called fields or members.

• A union is a data structure that specifies which of a number of permitted primitive types may be stored in its instances, e.g. float or long integer. Contrast with a record, which could be defined to contain a float and an integer; in a union, there is only one value at a time. Enough space is allocated to contain the widest member datatype.

• A tagged union (also called variant, variant record, discriminated union, or disjoint union) contains an additional field indicating its current type, for enhanced type safety.

• A class is a data structure that contains data fields, like a record, as well as various methods which operate on the contents of the record. In the context of object-oriented programming, records are known as plain old data structures to distinguish them from classes.
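Several of these building blocks can be written down directly in C; the fragment below (an illustrative sketch with invented names, not part of the original article) declares a record, a union, and a tagged union:

#include <stdio.h>

/* Record: named fields in a fixed number and sequence. */
struct point { double x, y; };

/* Union: one member holds a value at a time; the instance is as wide as the widest member. */
union number { float f; long i; };

/* Tagged union: an extra field records which member is currently valid. */
struct tagged_number {
    enum { AS_FLOAT, AS_LONG } tag;
    union { float f; long i; } as;
};

int main(void) {
    struct point p = { 1.0, 2.0 };
    union number n;
    n.i = 42;  /* after this assignment, only n.i may be read */
    struct tagged_number t = { .tag = AS_LONG, .as.i = 7 };
    if (t.tag == AS_LONG)  /* the tag makes the access checkable */
        printf("p=(%g,%g) n.i=%ld t=%ld\n", p.x, p.y, n.i, t.as.i);
    return 0;
}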
1.2.4 Language support

Most assembly languages and some low-level languages, such as BCPL (Basic Combined Programming Language), lack built-in support for data structures. On the other hand, many high-level programming languages and some higher-level assembly languages, such as MASM, have special syntax or other built-in support for certain data structures, such as records and arrays. For example, the C (a direct descendant of BCPL) and Pascal languages support structs and records, respectively, in addition to vectors (one-dimensional arrays) and multi-dimensional arrays.[4][5]

Most programming languages feature some sort of library mechanism that allows data structure implementations to be reused by different programs. Modern languages usually come with standard libraries that implement the most common data structures. Examples are the C++ Standard Template Library, the Java Collections Framework, and the Microsoft .NET Framework.

Modern languages also generally support modular programming, the separation between the interface of a library module and its implementation. Some provide opaque data types that allow clients to hide implementation details. Object-oriented programming languages, such as C++, Java, and Smalltalk, typically use classes for this purpose.

Many known data structures have concurrent versions which allow multiple computing threads to access a single concrete instance of a data structure simultaneously.

1.2.5 See also

• Abstract data type
• Concurrent data structure
• Data model
• Dynamization
• Linked data structure
• List of data structures
• Persistent data structure
• Plain old data structure

1.2.6 References

[1] Black (ed.), Paul E. (2004-12-15). Entry for data structure in Dictionary of Algorithms and Data Structures. Online version. U.S. National Institute of Standards and Technology, 15 December 2004. Retrieved on 2009-05-21 from http://xlinux.nist.gov/dads/HTML/datastructur.html.
[2] Encyclopædia Britannica (2009). Entry data structure in the Encyclopædia Britannica (2009). Retrieved on 2009-05-21 from http://www.britannica.com/EBchecked/topic/152190/data-structure.
[3] Gavin Powell (2006). "Chapter 8: Building Fast-Performing Database Models". Beginning Database Design. ISBN 978-0-7645-7490-0. Wrox Publishing.
[4] "The GNU C Manual". Free Software Foundation. Retrieved 2014-10-15.
[5] "Free Pascal: Reference Guide". Free Pascal. Retrieved 2014-10-15.

1.2.7 Bibliography

• Peter Brass, Advanced Data Structures, Cambridge University Press, 2008.
• Donald Knuth, The Art of Computer Programming, vol. 1. Addison-Wesley, 3rd edition, 1997.

• Dinesh Mehta and Sartaj Sahni, Handbook of Data Structures and Applications, Chapman and Hall/CRC Press, 2007.
• Niklaus Wirth, Algorithms and Data Structures, Prentice Hall, 1985.

1.2.8 External links

• course on data structures
• Data structures Programs Examples in c,java
• UC Berkeley video course on data structures
• Descriptions from the Dictionary of Algorithms and Data Structures
• Data structures course
• An Examination of Data Structures from .NET perspective
• Schaffer, C. Data Structures and Algorithm Analysis

1.3 Analysis of algorithms

[Figure: graphs of the number of operations N vs. input size n for common complexities (log₂ n, √n, n, n log₂ n, n², 2ⁿ, n!), assuming a coefficient of 1.]

In computer science, the analysis of algorithms is the determination of the amount of resources (such as time and storage) necessary to execute them. Most algorithms are designed to work with inputs of arbitrary length. Usually, the efficiency or running time of an algorithm is stated as a function relating the input length to the number of steps (time complexity) or storage locations (space complexity).

The term "analysis of algorithms" was coined by Donald Knuth.[1] Algorithm analysis is an important part of a broader computational complexity theory, which provides theoretical estimates for the resources needed by any algorithm which solves a given computational problem. These estimates provide an insight into reasonable directions of search for efficient algorithms.

In theoretical analysis of algorithms it is common to estimate their complexity in the asymptotic sense, i.e., to estimate the complexity function for arbitrarily large input. Big O notation, Big-omega notation and Big-theta notation are used to this end. For instance, binary search is said to run in a number of steps proportional to the logarithm of the length of the sorted list being searched, or in O(log n), colloquially "in logarithmic time". Usually asymptotic estimates are used because different implementations of the same algorithm may differ in efficiency. However, the efficiencies of any two "reasonable" implementations of a given algorithm are related by a constant multiplicative factor called a hidden constant.

Exact (not asymptotic) measures of efficiency can sometimes be computed, but they usually require certain assumptions concerning the particular implementation of the algorithm, called a model of computation. A model of computation may be defined in terms of an abstract computer, e.g., Turing machine, and/or by postulating that certain operations are executed in unit time. For example, if the sorted list to which we apply binary search has n elements, and we can guarantee that each lookup of an element in the list can be done in unit time, then at most log₂ n + 1 time units are needed to return an answer.

1.3.1 Cost models

Time efficiency estimates depend on what we define to be a step. For the analysis to correspond usefully to the actual execution time, the time required to perform a step must be guaranteed to be bounded above by a constant. One must be careful here; for instance, some analyses count an addition of two numbers as one step. This assumption may not be warranted in certain contexts. For example, if the numbers involved in a computation may be arbitrarily large, the time required by a single addition can no longer be assumed to be constant.

Two cost models are generally used:[2][3][4][5][6]

• the uniform cost model, also called uniform-cost measurement (and similar variations), assigns a constant cost to every machine operation, regardless of the size of the numbers involved;
• the logarithmic cost model, also called logarithmic-cost measurement (and similar variations), assigns a cost to every machine operation proportional to the number of bits involved.

The latter is more cumbersome to use, so it is only employed when necessary, for example in the analysis of

arbitrary-precision arithmetic algorithms, like those used in cryptography.

A key point which is often overlooked is that published lower bounds for problems are often given for a model of computation that is more restricted than the set of operations that you could use in practice, and therefore there are algorithms that are faster than what would naively be thought possible.[7]

1.3.2 Run-time analysis

Run-time analysis is a theoretical classification that estimates and anticipates the increase in running time (or run-time) of an algorithm as its input size (usually denoted as n) increases. Run-time efficiency is a topic of great interest in computer science: a program can take seconds, hours or even years to finish executing, depending on which algorithm it implements (see also performance analysis, which is the analysis of an algorithm's run-time in practice).

Shortcomings of empirical metrics

Since algorithms are platform-independent (i.e. a given algorithm can be implemented in an arbitrary programming language on an arbitrary computer running an arbitrary operating system), there are significant drawbacks to using an empirical approach to gauge the comparative performance of a given set of algorithms.

Take as an example a program that looks up a specific entry in a sorted list of size n. Suppose this program were implemented on Computer A, a state-of-the-art machine, using a linear search algorithm, and on Computer B, a much slower machine, using a binary search algorithm. Benchmark testing on the two computers running their respective programs might look something like the following:

[Table: benchmark run-times of the two programs at increasing list sizes; not reproduced in this extraction.]

Based on these metrics, it would be easy to jump to the conclusion that Computer A is running an algorithm that is far superior in efficiency to that of Computer B. However, if the size of the input list is increased to a sufficient number, that conclusion is dramatically demonstrated to be in error:

Computer A, running the linear search program, exhibits a linear growth rate. The program's run-time is directly proportional to its input size. Doubling the input size doubles the run time, quadrupling the input size quadruples the run-time, and so forth. On the other hand, Computer B, running the binary search program, exhibits a logarithmic growth rate. Quadrupling the input size only increases the run time by a constant amount (in this example, 50,000 ns). Even though Computer A is ostensibly a faster machine, Computer B will inevitably surpass Computer A in run-time because it is running an algorithm with a much slower growth rate.

Orders of growth

Main article: Big O notation

Informally, an algorithm can be said to exhibit a growth rate on the order of a mathematical function if beyond a certain input size n, the function f(n) times a positive constant provides an upper bound or limit for the run-time of that algorithm. In other words, for a given input size n greater than some n0 and a constant c, the running time of that algorithm will never be larger than c·f(n). This concept is frequently expressed using Big O notation. For example, since the run-time of insertion sort grows quadratically as its input size increases, insertion sort can be said to be of order O(n²).

Big O notation is a convenient way to express the worst-case scenario for a given algorithm, although it can also be used to express the average-case — for example, the worst-case scenario for quicksort is O(n²), but the average-case run-time is O(n log n).

Empirical orders of growth

Assuming the execution time follows a power rule, t ≈ k·n^a, the coefficient a can be found[8] by taking empirical measurements of run time {t1, t2} at some problem-size points {n1, n2}, and calculating t2/t1 = (n2/n1)^a, so that a = log(t2/t1) / log(n2/n1). In other words, this measures the slope of the empirical line on the log–log plot of execution time vs. problem size, at some size point. If the order of growth indeed follows the power rule (and so the line on the log–log plot is indeed a straight line), the empirical value of a will stay constant at different ranges, and if not, it will change (and the line is a curved line) — but it could still serve for comparison of any two given algorithms as to their empirical local orders of growth behaviour. Applied to the above table:

[Table: empirical local orders of growth for the two programs, computed from the benchmark measurements; not reproduced in this extraction.]

It is clearly seen that the first algorithm exhibits a linear order of growth, indeed following the power rule. The empirical values for the second one are diminishing rapidly, suggesting it follows another rule of growth and in any case has much lower local orders of growth (and improving further still), empirically, than the first one.
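The local exponent a is straightforward to compute from two measurements; the fragment below is a sketch with made-up timing numbers, added for illustration (link with -lm):

#include <math.h>
#include <stdio.h>

/* Local order of growth, assuming t ≈ k * n^a:
   a = log(t2/t1) / log(n2/n1). */
double local_order(double n1, double t1, double n2, double t2) {
    return log(t2 / t1) / log(n2 / n1);
}

int main(void) {
    /* Hypothetical run times (in ns) at problem sizes 1,000 and 2,000:
       the time quadruples when the size doubles. */
    double a = local_order(1000.0, 250000.0, 2000.0, 1000000.0);
    printf("a = %.2f\n", a); /* prints 2.00: locally quadratic growth */
    return 0;
}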
Evaluating run-time complexity

The run-time complexity for the worst-case scenario of a given algorithm can sometimes be evaluated by examining the structure of the algorithm and making some simplifying assumptions. Consider the following pseudocode:

1  get a positive integer n from input
2  if n > 10
3      print "This might take a while..."
4  for i = 1 to n
5      for j = 1 to i
6          print i * j
7  print "Done!"

A given computer will take a discrete amount of time to execute each of the instructions involved with carrying out this algorithm. The specific amount of time to carry out a given instruction will vary depending on which instruction is being executed and which computer is executing it, but on a conventional computer, this amount will be deterministic.[9] Say that the actions carried out in step 1 are considered to consume time T1, step 2 uses time T2, and so forth.

In the algorithm above, steps 1, 2 and 7 will only be run once. For a worst-case evaluation, it should be assumed that step 3 will be run as well. Thus the total amount of time to run steps 1–3 and step 7 is:

T1 + T2 + T3 + T7.

The loops in steps 4, 5 and 6 are trickier to evaluate. The outer loop test in step 4 will execute (n + 1) times (note that an extra step is required to terminate the for loop, hence n + 1 and not n executions), which will consume T4(n + 1) time. The inner loop, on the other hand, is governed by the value of i: j iterates from 1 to i. On the first pass through the outer loop, j iterates from 1 to 1: the inner loop makes one pass, so running the inner loop body (step 6) consumes T6 time, and the inner loop test (step 5) consumes 2T5 time. During the next pass through the outer loop, j iterates from 1 to 2: the inner loop makes two passes, so running the inner loop body (step 6) consumes 2T6 time, and the inner loop test (step 5) consumes 3T5 time.

Altogether, the total time required to run the inner loop body can be expressed as an arithmetic progression:

T6 + 2T6 + 3T6 + ⋯ + (n − 1)T6 + nT6

which can be factored[10] as

T6[1 + 2 + 3 + ⋯ + (n − 1) + n] = ½(n² + n)T6

The total time required to run the inner loop test can be evaluated similarly:

2T5 + 3T5 + 4T5 + ⋯ + nT5 + (n + 1)T5
= T5 + 2T5 + 3T5 + ⋯ + nT5 + (n + 1)T5 − T5

which can be factored as

T5[1 + 2 + 3 + ⋯ + (n − 1) + n + (n + 1)] − T5
= ½(n² + n)T5 + (n + 1)T5 − T5
= ½(n² + n)T5 + nT5
= ½(n² + 3n)T5

Therefore, the total running time for this algorithm is:

f(n) = T1 + T2 + T3 + T7 + (n + 1)T4 + ½(n² + n)T6 + ½(n² + 3n)T5

which reduces to

f(n) = ½(n² + n)T6 + ½(n² + 3n)T5 + (n + 1)T4 + T1 + T2 + T3 + T7

As a rule of thumb, one can assume that the highest-order term in any given function dominates its rate of growth and thus defines its run-time order. In this example, n² is the highest-order term, so one can conclude that f(n) = O(n²). Formally this can be proven as follows:

Prove that ½(n² + n)T6 + ½(n² + 3n)T5 + (n + 1)T4 + T1 + T2 + T3 + T7 ≤ cn² for all n ≥ n0:

½(n² + n)T6 + ½(n² + 3n)T5 + (n + 1)T4 + T1 + T2 + T3 + T7
≤ (n² + n)T6 + (n² + 3n)T5 + (n + 1)T4 + T1 + T2 + T3 + T7   (for n ≥ 0)

Let k be a constant greater than or equal to each of T1, …, T7. Then

(n² + n)T6 + (n² + 3n)T5 + (n + 1)T4 + T1 + T2 + T3 + T7 ≤ k(n² + n) + k(n² + 3n) + k(n + 1) + 4k
= 2kn² + 5kn + 5k ≤ 2kn² + 5kn² + 5kn²   (for n ≥ 1)
= 12kn²

Therefore ½(n² + n)T6 + ½(n² + 3n)T5 + (n + 1)T4 + T1 + T2 + T3 + T7 ≤ cn² for n ≥ n0, with c = 12k and n0 = 1.

A more elegant approach to analyzing this algorithm would be to declare that [T1..T7] are all equal to one unit of time, in a system of units chosen so that one unit is greater than or equal to the actual times for these steps. This would mean that the algorithm's running time breaks down as follows:[11]

4 + ∑_{i=1}^{n} i ≤ 4 + ∑_{i=1}^{n} n = 4 + n² ≤ 5n²   (for n ≥ 1)   = O(n²).
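These step counts are easy to confirm empirically. The following sketch (added for illustration, not part of the original article) mirrors the loop structure of the pseudocode and counts every evaluation of the outer-loop test, the inner-loop test, and the inner-loop body, comparing them against n + 1, ½(n² + 3n), and ½(n² + n):

#include <stdio.h>

int main(void) {
    long n = 100;
    long outer_tests = 0, inner_tests = 0, inner_body = 0;

    long i = 1;
    for (;;) {
        outer_tests++;          /* step 4: outer loop test */
        if (i > n) break;
        long j = 1;
        for (;;) {
            inner_tests++;      /* step 5: inner loop test */
            if (j > i) break;
            inner_body++;       /* step 6: inner loop body */
            j++;
        }
        i++;
    }

    printf("outer tests: %ld, expected n + 1        = %ld\n", outer_tests, n + 1);
    printf("inner tests: %ld, expected (n^2 + 3n)/2 = %ld\n", inner_tests, (n * n + 3 * n) / 2);
    printf("inner body:  %ld, expected (n^2 + n)/2  = %ld\n", inner_body, (n * n + n) / 2);
    return 0;
}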
Growth rate analysis of other resources

The methodology of run-time analysis can also be utilized for predicting other growth rates, such as consumption of memory space. As an example, consider the following pseudocode, which manages and reallocates memory usage by a program based on the size of a file which that program manages:

while file still open:
    let n = size of file
    for every 100,000 kilobytes of increase in file size:
        double the amount of memory reserved

In this instance, as the file size n increases, memory will be consumed at an exponential growth rate, which is order O(2ⁿ). This is an extremely rapid and most likely unmanageable growth rate for consumption of memory resources.

1.3.3 Relevance

Algorithm analysis is important in practice because the accidental or unintentional use of an inefficient algorithm can significantly impact system performance. In time-sensitive applications, an algorithm taking too long to run can render its results outdated or useless. An inefficient algorithm can also end up requiring an uneconomical amount of computing power or storage in order to run, again rendering it practically useless.

1.3.4 Constant factors

Analysis of algorithms typically focuses on the asymptotic performance, particularly at the elementary level, but in practical applications constant factors are important, and real-world data is in practice always limited in size. The limit is typically the size of addressable memory, so on 32-bit machines 2³² = 4 GiB (greater if segmented memory is used) and on 64-bit machines 2⁶⁴ = 16 EiB. Thus given a limited size, an order of growth (time or space) can be replaced by a constant factor, and in this sense all practical algorithms are O(1) for a large enough constant, or for small enough data.

This interpretation is primarily useful for functions that grow extremely slowly: (binary) iterated logarithm (log*) is less than 5 for all practical data (2^65536 bits); (binary) log-log (log log n) is less than 6 for virtually all practical data (2⁶⁴ bits); and binary log (log n) is less than 64 for virtually all practical data (2⁶⁴ bits). An algorithm with non-constant complexity may nonetheless be more efficient than an algorithm with constant complexity on practical data if the overhead of the constant-time algorithm results in a larger constant factor, e.g., one may have K > k log log n so long as K/k > 6 and n < 2^(2⁶) = 2⁶⁴.

For large data linear or quadratic factors cannot be ignored, but for small data an asymptotically inefficient algorithm may be more efficient. This is particularly used in hybrid algorithms, like Timsort, which use an asymptotically efficient algorithm (here merge sort, with time complexity n log n), but switch to an asymptotically inefficient algorithm (here insertion sort, with time complexity n²) for small data, as the simpler algorithm is faster on small data.

1.3.5 See also

• Amortized analysis
• Analysis of parallel algorithms
• Asymptotic computational complexity
• Best, worst and average case
• Big O notation
• Computational complexity theory
• Master theorem
• NP-Complete
• Numerical analysis
• Polynomial time
• Program optimization
• Profiling (computer programming)
• Scalability
• Smoothed analysis
• Termination analysis — the subproblem of checking whether a program will terminate at all
• Time complexity — includes table of orders of growth for common algorithms
• Information-based complexity

1.3.6 Notes

[1] Donald Knuth, Recent News
[2] Alfred V. Aho; John E. Hopcroft; Jeffrey D. Ullman (1974). The design and analysis of computer algorithms. Addison-Wesley Pub. Co., section 1.3.
[3] Juraj Hromkovič (2004). Theoretical computer science: introduction to Automata, computability, complexity, algorithmics, randomization, communication, and cryptography. Springer. pp. 177–178. ISBN 978-3-540-14015-3.
[4] Giorgio Ausiello (1999). Complexity and approximation: combinatorial optimization problems and their approximability properties. Springer. pp. 3–8. ISBN 978-3-540-65431-5.
[5] Wegener, Ingo (2005). Complexity theory: exploring the limits of efficient algorithms. Berlin, New York: Springer-Verlag. p. 20. ISBN 978-3-540-21045-0.
[6] Robert Endre Tarjan (1983). Data structures and network algorithms. SIAM. pp. 3–7. ISBN 978-0-89871-187-5.
[7] Examples of the price of abstraction?, cstheory.stackexchange.com
[8] How To Avoid O-Abuse and Bribes, at the blog "Gödel's Lost Letter and P=NP" by R. J. Lipton, professor of Computer Science at Georgia Tech, recounting an idea by Robert Sedgewick.
[9] However, this is not the case with a quantum computer.
[10] It can be proven by induction that 1 + 2 + 3 + ⋯ + (n − 1) + n = n(n + 1)/2.
[11] This approach, unlike the above approach, neglects the constant time consumed by the loop tests which terminate their respective loops, but it is trivial to prove that such omission does not affect the final result.

1.3.7 References 1.4.2 Method


• Cormen, Thomas H.; Leiserson, Charles E.; Rivest, The method requires knowledge of which series of oper-
Ronald L. & Stein, Clifford (2001). Introduction to ations are possible. This is most commonly the case with
Algorithms. Chapter 1: Foundations (Second ed.). data structures, which have state that persists between op-
Cambridge, MA: MIT Press and McGraw-Hill. pp. erations. The basic idea is that a worst case operation can
3–122. ISBN 0-262-03293-7. alter the state in such a way that the worst case cannot
occur again for a long time, thus “amortizing” its cost.
• Sedgewick, Robert (1998). Algorithms in C, Parts 1-
4: Fundamentals, Data Structures, Sorting, Search- There are generally three methods for performing amor-
ing (3rd ed.). Reading, MA: Addison-Wesley Pro- tized analysis: the aggregate method, the accounting
fessional. ISBN 978-0-201-31452-6. method, and the potential method. All of these give the
same answers, and their usage difference is primarily cir-
[3]
• Knuth, Donald. The Art of Computer Programming. cumstantial and due to individual preference.
Addison-Wesley.
• Aggregate analysis determines the upper bound T(n)
• Greene, Daniel A.; Knuth, Donald E. (1982). Math- on the total cost of a sequence of n operations, then
ematics for the Analysis of Algorithms (Second ed.). calculates the amortized cost to be T(n) / n.[3]
Birkhäuser. ISBN 3-7643-3102-X.
• The accounting method determines the individual
• Goldreich, Oded (2010). Computational Complex- cost of each operation, combining its immediate ex-
ity: A Conceptual Perspective. Cambridge Univer- ecution time and its influence on the running time
sity Press. ISBN 978-0-521-88473-0. of future operations. Usually, many short-running
operations accumulate a “debt” of unfavorable state
in small increments, while rare long-running opera-
1.4 Amortized analysis tions decrease it drastically.[3]
• The potential method is like the accounting method,
“Amortized” redirects here. For other uses, see but overcharges operations early to compensate for
Amortization. undercharges later.[3]

In computer science, amortized analysis is a method 1.4.3 Examples


for analyzing a given algorithm’s time complexity, or how
much of a resource, especially time or memory in the con- Dynamic Array
text of computer programs, it takes to execute. The moti-
vation for amortized analysis is that looking at the worst-
case run time per operation can be too pessimistic.[1]
While certain operations for a given algorithm may have
a significant cost in resources, other operations may not
be as costly. Amortized analysis considers both the costly
and less costly operations together over the whole series
of operations of the algorithm. This may include account-
ing for different types of input, length of the input, and
other factors that affect its performance.[2]

1.4.1 History

Amortized analysis initially emerged from a method


called aggregate analysis, which is now subsumed by
amortized analysis. The technique was first formally in-
troduced by Robert Tarjan in his 1985 paper Amortized
Computational Complexity, which addressed the need for Amortized Analysis of the Push operation for a Dynamic Array
a more useful form of analysis than the common prob-
abilistic methods used. Amortization was initially used Consider a dynamic array that grows in size as more ele-
for very specific types of algorithms, particularly those ments are added to it such as an ArrayList in Java. If we
involving binary trees and union operations. However, started out with a dynamic array of size 4, it would take
it is now ubiquitous and comes into play when analyzing constant time to push four elements onto it. Yet pushing
many other algorithms as well.[2] a fifth element onto that array would take longer as the
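The arithmetic can be made concrete with a short Python sketch (an illustration; the class name DynamicArray and the writes counter are inventions of this example, not part of the cited sources):

class DynamicArray:
    # Simplified dynamic array that doubles its backing array when full.
    def __init__(self):
        self.capacity = 4
        self.size = 0
        self.data = [None] * self.capacity
        self.writes = 0                      # counts elementary element writes

    def push(self, x):
        if self.size == self.capacity:       # full: allocate double, copy over
            self.capacity *= 2
            new_data = [None] * self.capacity
            for i in range(self.size):
                new_data[i] = self.data[i]
                self.writes += 1
            self.data = new_data
        self.data[self.size] = x
        self.writes += 1
        self.size += 1

a = DynamicArray()
for k in range(1024):
    a.push(k)
print(a.writes / a.size)                     # roughly 2: O(1) amortized per push

Averaged over the whole sequence, the occasional O(n) copy is absorbed by the many cheap pushes, matching the O(1) amortized bound derived above.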
Queue

Let’s look at a Ruby implementation of a queue, a FIFO data structure:

class Queue
  def initialize
    @input = []
    @output = []
  end

  def enqueue(element)
    @input << element
  end

  def dequeue
    if @output.empty?
      while @input.any?
        @output << @input.pop
      end
    end
    @output.pop
  end
end

The enqueue operation just pushes an element onto the input array; this operation does not depend on the lengths of either input or output and therefore runs in constant time.

However, the dequeue operation is more complicated. If the output array already has some elements in it, then dequeue runs in constant time; otherwise, dequeue takes O(n) time to add all the elements onto the output array from the input array, where n is the current length of the input array. After copying n elements from input, we can perform n dequeue operations, each taking constant time, before the output array is empty again. Thus, we can perform a sequence of n dequeue operations in only O(n) time, which implies that the amortized time of each dequeue operation is O(1).[4]

Alternatively, we can charge the cost of copying any item from the input array to the output array to the earlier enqueue operation for that item. This charging scheme doubles the amortized time for enqueue, but reduces the amortized time for dequeue to O(1).

1.4.4 Common use

• In common usage, an “amortized algorithm” is one that an amortized analysis has shown to perform well.
• Online algorithms commonly use amortized analysis.

1.4.5 References

• Allan Borodin and Ran El-Yaniv (1998). Online Computation and Competitive Analysis. Cambridge University Press. pp. 20, 141.

[1] “Lecture 7: Amortized Analysis” (PDF). https://www.cs.cmu.edu/. Retrieved 14 March 2015.
[2] Rebecca Fiebrink (2007), Amortized Analysis Explained (PDF), retrieved 2011-05-03.
[3] “Lecture 20: Amortized Analysis”. http://www.cs.cornell.edu/. Cornell University. Retrieved 14 March 2015.
[4] Grossman, Dan. “CSE332: Data Abstractions” (PDF). cs.washington.edu. Retrieved 14 March 2015.

1.5 Accounting method

For accounting methods in business and financial reporting, see accounting methods.

In the field of analysis of algorithms in computer science, the accounting method is a method of amortized analysis based on accounting. The accounting method often gives a more intuitive account of the amortized cost of an operation than either aggregate analysis or the potential method. Note, however, that this does not guarantee such analysis will be immediately obvious; often, choosing the correct parameters for the accounting method requires as much knowledge of the problem and the complexity bounds one is attempting to prove as the other two methods.

The accounting method is most naturally suited for proving an O(1) bound on time. The method as explained here is for proving such a bound.

1.5.1 The method

A set of elementary operations which will be used in the algorithm is chosen and their costs are arbitrarily set to 1. The fact that the costs of these operations may differ in reality presents no difficulty in principle. What is important is that each elementary operation has a constant cost.

Each aggregate operation is assigned a “payment”. The payment is intended to cover the cost of elementary operations needed to complete this particular operation, with some of the payment left over, placed in a pool to be used later.

The difficulty with problems that require amortized analysis is that, in general, some of the operations will require greater than constant cost. This means that no constant
payment will be enough to cover the worst case cost of an operation, in and of itself. With proper selection of payment, however, this is no longer a difficulty; the expensive operations will only occur when there is sufficient payment in the pool to cover their costs.

1.5.2 Examples

A few examples will help to illustrate the use of the accounting method.

Table expansion

It is often necessary to create a table before it is known how much space is needed. One possible strategy is to double the size of the table when it is full. Here we will use the accounting method to show that the amortized cost of an insertion operation in such a table is O(1).

Before looking at the procedure in detail, we need some definitions. Let T be a table, E an element to insert, num(T) the number of elements in T, and size(T) the allocated size of T. We assume the existence of operations create_table(n), which creates an empty table of size n, for now assumed to be free, and elementary_insert(T,E), which inserts element E into a table T that already has space allocated, with a cost of 1.

The following pseudocode illustrates the table insertion procedure (line numbers are referenced in the analysis below):

1  function table_insert(T, E)
2      if num(T) = size(T)
3          U := create_table(2 × size(T))
4          for each F in T
5              elementary_insert(U, F)
6          T := U
7      elementary_insert(T, E)

Without amortized analysis, the best bound we can show for n insert operations is O(n^2) — this is due to the loop at line 4 that performs num(T) elementary insertions.

For analysis using the accounting method, we assign a payment of 3 to each table insertion. Although the reason for this is not clear now, it will become clear during the course of the analysis.

Assume that initially the table is empty with size(T) = m. The first m insertions therefore do not require reallocation and only have cost 1 (for the elementary insert). Therefore, when num(T) = m, the pool has (3 − 1) × m = 2m.

Inserting element m + 1 requires reallocation of the table. Creating the new table on line 3 is free (for now). The loop on line 4 requires m elementary insertions, for a cost of m. Including the insertion on the last line, the total cost for this operation is m + 1. After this operation, the pool therefore has 2m + 3 − (m + 1) = m + 2.

Next, we add another m − 1 elements to the table. At this point the pool has m + 2 + 2 × (m − 1) = 3m. Inserting an additional element (that is, element 2m + 1) can be seen to have cost 2m + 1 and a payment of 3. After this operation, the pool has 3m + 3 − (2m + 1) = m + 2. Note that this is the same amount as after inserting element m + 1. In fact, we can show that this will be the case for any number of reallocations.

It can now be made clear why the payment for an insertion is 3: 1 pays for the first insertion of the element, 1 pays for moving the element the next time the table is expanded, and 1 pays for moving an older element the next time the table is expanded. Intuitively, this explains why an element’s contribution never “runs out” regardless of how many times the table is expanded: since the table is always doubled, the newest half always covers the cost of moving the oldest half.

We initially assumed that creating a table was free. In reality, creating a table of size n may be as expensive as O(n). Let us say that the cost of creating a table of size n is n. Does this new cost present a difficulty? Not really; it turns out we use the same method to show the amortized O(1) bounds. All we have to do is change the payment.

When a new table is created, there is an old table with m entries. The new table will be of size 2m. As long as the entries currently in the table have added enough to the pool to pay for creating the new table, we will be all right. We cannot expect the first m/2 entries to help pay for the new table: those entries already paid for the current table. We must then rely on the last m/2 entries to pay the cost 2m. This means we must add 2m/(m/2) = 4 to the payment for each entry, for a total payment of 3 + 4 = 7.
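As a sanity check of the two payment values, the following Python sketch (added here for illustration; the function check_payment is an assumption of this example) simulates table doubling and asserts that the credit pool never goes negative:

def check_payment(payment, creation_cost, inserts=10000):
    # Simulate table doubling under the accounting method and verify
    # that the credit pool stays non-negative throughout.
    size, num, pool = 1, 0, 0
    for _ in range(inserts):
        pool += payment                  # payment deposited for this insertion
        if num == size:                  # table is full: reallocate
            size *= 2
            pool -= creation_cost(size)  # cost of creating the new table
            pool -= num                  # cost of moving the existing elements
        pool -= 1                        # the elementary insert itself
        num += 1
        assert pool >= 0, "payment too small: pool went negative"

check_payment(3, lambda n: 0)  # free table creation: a payment of 3 suffices
check_payment(7, lambda n: n)  # creation cost n: a payment of 7 suffices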
1.5.3 References

• Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms, Second Edition. MIT Press and McGraw-Hill, 2001. ISBN 0-262-03293-7. Section 17.2: The accounting method, pp. 410–412.

1.6 Potential method

In computational complexity theory, the potential method is a method used to analyze the amortized time and space complexity of a data structure, a measure of its performance over sequences of operations that smooths out the cost of infrequent but expensive operations.[1][2]

1.6.1 Definition of amortized time

In the potential method, a function Φ is chosen that maps states of the data structure to non-negative numbers. If S is a state of the data structure, Φ(S) may be thought of intuitively as an amount of potential energy stored in that state;[1][2] alternatively, Φ(S) may be thought of as representing the amount of disorder in state S or its distance from an ideal state. The potential value prior to the
operation of initializing a data structure is defined to be zero.

Let o be any individual operation within a sequence of operations on some data structure, with Sbefore denoting the state of the data structure prior to operation o and Safter denoting its state after operation o has completed. Then, once Φ has been chosen, the amortized time for operation o is defined to be

Tamortized(o) = Tactual(o) + C · (Φ(Safter) − Φ(Sbefore)),

where C is a non-negative constant of proportionality (in units of time) that must remain fixed throughout the analysis. That is, the amortized time is defined to be the actual time taken by the operation plus C times the difference in potential caused by the operation.[1][2]

1.6.2 Relation between amortized and actual time

Despite its artificial appearance, the total amortized time of a sequence of operations provides a valid upper bound on the actual time for the same sequence of operations.

For any sequence of operations O = o1, o2, ..., define:

• The total amortized time: Tamortized(O) = Σi Tamortized(oi)
• The total actual time: Tactual(O) = Σi Tactual(oi)

Then:

Tamortized(O) = Σi (Tactual(oi) + C · (Φ(Si+1) − Φ(Si))) = Tactual(O) + C · (Φ(Sfinal) − Φ(Sinitial)),

where the sequence of potential function values forms a telescoping series in which all terms other than the initial and final potential function values cancel in pairs.

Hence:

Tactual(O) = Tamortized(O) + C · (Φ(Sinitial) − Φ(Sfinal))

In case Φ(Sfinal) ≥ 0 and Φ(Sinitial) = 0, Tactual(O) ≤ Tamortized(O), so the amortized time can be used to provide accurate predictions about the actual time of sequences of operations, even though the amortized time for an individual operation may vary widely from its actual time.

1.6.3 Amortized analysis of worst-case inputs

Typically, amortized analysis is used in combination with a worst-case assumption about the input sequence. With this assumption, if X is a type of operation that may be performed by the data structure, and n is an integer defining the size of the given data structure (for instance, the number of items that it contains), then the amortized time for operations of type X is defined to be the maximum, among all possible sequences of operations on data structures of size n and all operations oi of type X within the sequence, of the amortized time for operation oi.

With this definition, the time to perform a sequence of operations may be estimated by multiplying the amortized time for each type of operation in the sequence by the number of operations of that type.

1.6.4 Examples

Dynamic array

A dynamic array is a data structure for maintaining an array of items, allowing both random access to positions within the array and the ability to increase the array size by one. It is available in Java as the “ArrayList” type and in Python as the “list” type.

A dynamic array may be implemented by a data structure consisting of an array A of items, of some length N, together with a number n ≤ N representing the positions within the array that have been used so far. With this structure, random accesses to the dynamic array may be implemented by accessing the same cell of the internal array A, and when n < N an operation that increases the dynamic array size may be implemented simply by incrementing n. However, when n = N, it is necessary to resize A, and a common strategy for doing so is to double its size, replacing A by a new array of length 2n.[3]

This structure may be analyzed using the potential function:

Φ = 2n − N

Since the resizing strategy always causes A to be at least half-full, this potential function is always non-negative, as desired.

When an increase-size operation does not lead to a resize operation, Φ increases by 2, a constant. Therefore, the constant actual time of the operation and the constant increase in potential combine to give a constant amortized time for an operation of this type.

However, when an increase-size operation causes a resize, the potential decreases from n to zero after the resize. Allocating a new internal array A and copying all of the values from the old internal array to the new one takes O(n) actual time, but (with an appropriate choice of the constant of proportionality C) this is entirely cancelled by the decrease in the potential function, leaving again a constant total amortized time for the operation.

The other operations of the data structure (reading and
writing array cells without changing the array size) do not cause the potential function to change and have the same constant amortized time as their actual time.[2]

Therefore, with this choice of resizing strategy and potential function, the potential method shows that all dynamic array operations take constant amortized time. Combining this with the inequality relating amortized time and actual time over sequences of operations, this shows that any sequence of n dynamic array operations takes O(n) actual time in the worst case, despite the fact that some of the individual operations may themselves take a linear amount of time.[2]
Multi-Pop Stack

Consider a stack which supports the following operations:

• Initialize - create an empty stack.
• Push - add a single element on top of the stack.
• Pop(k) - remove k elements from the top of the stack.

This structure may be analyzed using the potential function:

Φ = number of elements in the stack

This number is always non-negative, as required.

A Push operation takes constant time and increases Φ by 1, so its amortized time is constant.

A Pop operation takes time O(k) but also reduces Φ by k, so its amortized time is also constant.

This proves that any sequence of m operations takes O(m) actual time in the worst case.
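In code, the accounting is visible in the fact that Pop(k) can never remove more elements than were previously pushed; a minimal Python sketch (illustrative names, not from the cited sources):

class MultiPopStack:
    def __init__(self):
        self.items = []

    def push(self, x):
        self.items.append(x)         # actual cost 1, Φ increases by 1

    def pop(self, k):
        k = min(k, len(self.items))  # cannot remove more than Φ elements
        return [self.items.pop() for _ in range(k)]  # cost k, Φ drops by k

s = MultiPopStack()
for i in range(100):
    s.push(i)                        # 100 operations of constant cost
print(len(s.pop(30)))                # one O(k) operation, prepaid by the pushes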
Binary counter

Consider a counter represented as a binary number and supporting the following operations:

• Initialize - create a counter with value 0.
• Inc - add 1 to the counter.
• Read - return the current counter value.

This structure may be analyzed using the potential function:

Φ = number of bits equal to 1

This number is always non-negative and starts at 0, as required.

An Inc operation flips the least significant bit. If the LSB was flipped from 1 to 0, then the next bit must be flipped as well. This goes on until finally a bit is flipped from 0 to 1, at which point the flipping stops. If the number of bits flipped from 1 to 0 is k, then the actual time is k + 1 and the potential is reduced by k − 1, so the amortized time is 2. Hence, the actual time for running m Inc operations is O(m).
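A small Python sketch of the counter (added for illustration; inc and the flip counter are assumptions of this example) confirms the bound empirically:

def inc(bits):
    # Increment a binary counter stored as a list of bits (least significant
    # bit first) and return the number of bit flips, i.e. the actual cost.
    flips = 0
    i = 0
    while i < len(bits) and bits[i] == 1:
        bits[i] = 0                  # flip a 1 to 0 (the carry continues)
        flips += 1
        i += 1
    if i == len(bits):
        bits.append(1)               # the counter grows by one bit
    else:
        bits[i] = 1                  # the single 0-to-1 flip stops the carry
    return flips + 1

bits, total = [], 0
for _ in range(1000):
    total += inc(bits)
print(total)  # 1994: below 2 * 1000, matching the amortized cost of 2 per Inc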

1.6.5 Applications

The potential function method is commonly used to analyze Fibonacci heaps, a form of priority queue in which removing an item takes logarithmic amortized time, and all other operations take constant amortized time.[4] It may also be used to analyze splay trees, a self-adjusting form of binary search tree with logarithmic amortized time per operation.[5]

1.6.6 References

[1] Goodrich, Michael T.; Tamassia, Roberto (2002), “1.5.1 Amortization Techniques”, Algorithm Design: Foundations, Analysis and Internet Examples, Wiley, pp. 36–38.
[2] Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2001) [1990]. “17.3 The potential method”. Introduction to Algorithms (2nd ed.). MIT Press and McGraw-Hill. pp. 412–416. ISBN 0-262-03293-7.
[3] Goodrich and Tamassia, 1.5.2 Analyzing an Extendable Array Implementation, pp. 139–141; Cormen et al., 17.4 Dynamic tables, pp. 416–424.
[4] Cormen et al., Chapter 20, “Fibonacci Heaps”, pp. 476–497.
[5] Goodrich and Tamassia, Section 3.4, “Splay Trees”, pp. 185–194.
Chapter 2

Sequences

2.1 Array data type

This article is about the abstract data type. For the byte-level structure, see Array data structure. For other uses, see Array.

In computer science, an array type is a data type that is meant to describe a collection of elements (values or variables), each selected by one or more indices (identifying keys) that can be computed at run time by the program. Such a collection is usually called an array variable, array value, or simply array.[1] By analogy with the mathematical concepts of vector and matrix, array types with one and two indices are often called vector type and matrix type, respectively.

Language support for array types may include certain built-in array data types, some syntactic constructions (array type constructors) that the programmer may use to define such types and declare array variables, and special notation for indexing array elements.[1] For example, in the Pascal programming language, the declaration type MyTable = array [1..4,1..2] of integer defines a new array data type called MyTable. The declaration var A: MyTable then defines a variable A of that type, which is an aggregate of eight elements, each being an integer variable identified by two indices. In the Pascal program, those elements are denoted A[1,1], A[1,2], A[2,1], ..., A[4,2].[2] Special array types are often defined by the language’s standard libraries.

Dynamic lists are also more common and easier to implement than dynamic arrays. Array types are distinguished from record types mainly because they allow the element indices to be computed at run time, as in the Pascal assignment A[I,J] := A[N-I,2*J]. Among other things, this feature allows a single iterative statement to process arbitrarily many elements of an array variable.

In more theoretical contexts, especially in type theory and in the description of abstract algorithms, the terms “array” and “array type” sometimes refer to an abstract data type (ADT) also called abstract array, or may refer to an associative array, a mathematical model with the basic operations and behavior of a typical array type in most languages — basically, a collection of elements that are selected by indices computed at run-time.

Depending on the language, array types may overlap (or be identified with) other data types that describe aggregates of values, such as lists and strings. Array types are often implemented by array data structures, but sometimes by other means, such as hash tables, linked lists, or search trees.

2.1.1 History

Heinz Rutishauser’s programming language Superplan (1949–1951) included multi-dimensional arrays. Rutishauser, however, although describing how a compiler for his language should be built, did not implement one.

Assembly languages and low-level languages like BCPL[3] generally have no syntactic support for arrays.

Because of the importance of array structures for efficient computation, the earliest high-level programming languages, including FORTRAN (1957), COBOL (1960), and Algol 60 (1960), provided support for multi-dimensional arrays.

2.1.2 Abstract arrays

An array data structure can be mathematically modeled as an abstract data structure (an abstract array) with two operations:

get(A, I): the data stored in the element of the array A whose indices are the integer tuple I.
set(A, I, V): the array that results by setting the value of that element to V.

These operations are required to satisfy the axioms[4]

get(set(A, I, V), I) = V
get(set(A, I, V), J) = get(A, J) if I ≠ J

for any array state A, any value V, and any tuples I, J for which the operations are defined.

The first axiom means that each element behaves like a variable. The second axiom means that elements with distinct indices behave as disjoint variables, so that storing a value in one element does not affect the value of any other element.

These axioms do not place any constraints on the set of valid index tuples I, therefore this abstract model can be used for triangular matrices and other oddly-shaped arrays.
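A minimal Python model of the abstract array (added for illustration; the dictionary-based representation and the class name are assumptions of this example, and mutation stands in for “the array that results”) exhibits both axioms and permits oddly-shaped index sets:

class AbstractArray:
    def __init__(self):
        self._cells = {}          # maps index tuples to stored values

    def get(self, I):
        return self._cells[I]

    def set(self, I, V):
        self._cells[I] = V        # mutates in place, standing in for the
        return self               # abstract "array that results"

A = AbstractArray()
A.set((1, 1), 'x').set((1, 2), 'y')
assert A.get((1, 1)) == 'x'       # get(set(A, I, V), I) = V
assert A.get((1, 2)) == 'y'       # get(set(A, I, V), J) = get(A, J) for J != I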

2.1.3 Implementations

In order to effectively implement variables of such types as array structures (with indexing done by pointer arithmetic), many languages restrict the indices to integer data types (or other types that can be interpreted as integers, such as bytes and enumerated types), and require that all elements have the same data type and storage size. Most of those languages also restrict each index to a finite interval of integers that remains fixed throughout the lifetime of the array variable. In some compiled languages, in fact, the index ranges may have to be known at compile time.

On the other hand, some programming languages provide more liberal array types that allow indexing by arbitrary values, such as floating-point numbers, strings, objects, references, etc. Such index values cannot be restricted to an interval, much less a fixed interval, so these languages usually allow arbitrary new elements to be created at any time. This choice precludes the implementation of array types as array data structures. That is, those languages use array-like syntax to implement a more general associative array semantics, and must therefore be implemented by a hash table or some other search data structure.

2.1.4 Language support

Multi-dimensional arrays

The number of indices needed to specify an element is called the dimension, dimensionality, or rank of the array type. (This nomenclature conflicts with the concept of dimension in linear algebra,[5] where it is the number of elements. Thus, an array of numbers with 5 rows and 4 columns, hence 20 elements, is said to have dimension 2 in computing contexts, but represents a matrix with dimension 4-by-5 or 20 in mathematics. Also, the computer science meaning of “rank” is similar to its meaning in tensor algebra but not to the linear algebra concept of rank of a matrix.)

Many languages support only one-dimensional arrays. In those languages, a multi-dimensional array is typically represented by an Iliffe vector, a one-dimensional array of references to arrays of one dimension less. A two-dimensional array, in particular, would be implemented as a vector of pointers to its rows. Thus an element in row i and column j of an array A would be accessed by double indexing (A[i][j] in typical notation). This way of emulating multi-dimensional arrays allows the creation of jagged arrays, where each row may have a different size — or, in general, where the valid range of each index depends on the values of all preceding indices.

[Figure: A two-dimensional array stored as a one-dimensional array of one-dimensional arrays (rows).]

This representation for multi-dimensional arrays is quite prevalent in C and C++ software. However, C and C++ will use a linear indexing formula for multi-dimensional arrays that are declared with compile-time constant size, e.g. by int A[10][20] or int A[m][n], instead of the traditional int **A.[6]:p.81

Indexing notation

Most programming languages that support arrays support the store and select operations, and have special syntax for indexing. Early languages used parentheses, e.g. A(i,j), as in FORTRAN; others choose square brackets, e.g. A[i,j] or A[i][j], as in Algol 60 and Pascal (to distinguish from the use of parentheses for function calls).

Index types

Array data types are most often implemented as array structures: with the indices restricted to integer (or totally ordered) values, index ranges fixed at array creation time, and multilinear element addressing. This was the case in most “third generation” languages, and is still the case of most systems programming languages such as Ada, C, and C++. In some languages, however, array data types have the semantics of associative arrays, with indices of arbitrary type and dynamic element creation. This is the case in some scripting languages such as Awk and Lua, and of some array types provided by standard C++ libraries.
Bounds checking

Some languages (like Pascal and Modula) perform bounds checking on every access, raising an exception or aborting the program when any index is out of its valid range. Compilers may allow these checks to be turned off to trade safety for speed. Other languages (like FORTRAN and C) trust the programmer and perform no checks. Good compilers may also analyze the program to determine the range of possible values that the index may have, and this analysis may lead to bounds-checking elimination.

Index origin

Some languages, such as C, provide only zero-based array types, for which the minimum valid value for any index is 0. This choice is convenient for array implementation and address computations. With a language such as C, a pointer to the interior of any array can be defined that will symbolically act as a pseudo-array that accommodates negative indices. This works only because C does not check an index against bounds when used.

Other languages provide only one-based array types, where each index starts at 1; this is the traditional convention in mathematics for matrices and mathematical sequences. A few languages, such as Pascal, support n-based array types, whose minimum legal indices are chosen by the programmer. The relative merits of each choice have been the subject of heated debate. Zero-based indexing has a natural advantage over one-based indexing in avoiding off-by-one or fencepost errors.[7]

See comparison of programming languages (array) for the base indices used by various languages.

Highest index

The relation between numbers appearing in an array declaration and the index of that array’s last element also varies by language. In many languages (such as C), one should specify the number of elements contained in the array; whereas in others (such as Pascal and Visual Basic .NET) one should specify the numeric value of the index of the last element. Needless to say, this distinction is immaterial in languages where the indices start at 1.

Array algebra

Some programming languages support array programming, where operations and functions defined for certain data types are implicitly extended to arrays of elements of those types. Thus one can write A+B to add corresponding elements of two arrays A and B. Usually these languages provide both the element-by-element multiplication and the standard matrix product of linear algebra, and which of these is represented by the * operator varies by language.

Languages providing array programming capabilities have proliferated since the innovations in this area of APL. These are core capabilities of domain-specific languages such as GAUSS, IDL, Matlab, and Mathematica. They are a core facility in newer languages, such as Julia and recent versions of Fortran. These capabilities are also provided via standard extension libraries for other general purpose programming languages (such as the widely used NumPy library for Python).
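For instance, in Python with the NumPy library mentioned above, + and * act element by element, while the matrix product of linear algebra has its own operator (a short illustrative example):

import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[5, 6], [7, 8]])

print(A + B)  # element-by-element addition
print(A * B)  # in NumPy, * is element-by-element multiplication
print(A @ B)  # @ is the standard matrix product of linear algebra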

String types and arrays

Many languages provide a built-in string data type, with specialized notation (“string literals”) to build values of that type. In some languages (such as C), a string is just an array of characters, or is handled in much the same way. Other languages, like Pascal, may provide vastly different operations for strings and arrays.

Array index range queries

Some programming languages provide operations that return the size (number of elements) of a vector, or, more generally, the range of each index of an array. In C and C++ arrays do not support a size function, so programmers often have to declare a separate variable to hold the size, and pass it to procedures as a separate parameter.

Elements of a newly created array may have undefined values (as in C), or may be defined to have a specific “default” value such as 0 or a null pointer (as in Java).

In C++ a std::vector object supports the store, select, and append operations with the performance characteristics discussed above. Vectors can be queried for their size and can be resized. Slower operations like inserting an element in the middle are also supported.

Slicing

An array slicing operation takes a subset of the elements of an array-typed entity (value or variable) and then assembles them as another array-typed entity, possibly with other indices. If array types are implemented as array structures, many useful slicing operations (such as selecting a sub-array, swapping indices, or reversing the direction of the indices) can be performed very efficiently by manipulating the dope vector of the structure. The possible slicings depend on the implementation details: for example, FORTRAN allows slicing off one column of a matrix variable, but not a row, and treats it as a vector; whereas C allows slicing off a row from a matrix, but not a column.

On the other hand, other slicing operations are possible when array types are implemented in other ways.
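As an example of slicing in a language whose array type is not a plain array structure, Python list slices assemble new lists (an illustrative sketch added here):

A = [0, 10, 20, 30, 40, 50]

print(A[1:4])   # selecting a sub-array: [10, 20, 30]
print(A[::-1])  # reversing the direction of the index
print(A[::2])   # a non-contiguous slice: [0, 20, 40]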
Resizing

Some languages allow dynamic arrays (also called resizable, growable, or extensible): array variables whose index ranges may be expanded at any time after creation, without changing the values of the current elements.

For one-dimensional arrays, this facility may be provided as an operation append(A, x) that increases the size of the array A by one and then sets the value of the last element to x. Other array types (such as Pascal strings) provide a concatenation operator, which can be used together with slicing to achieve that effect and more. In some languages, assigning a value to an element of an array automatically extends the array, if necessary, to include that element. In other array types, a slice can be replaced by an array of different size, with subsequent elements being renumbered accordingly — as in Python’s list assignment A[5:5] = [10,20,30], which inserts three new elements (10, 20, and 30) before element A[5]. Resizable arrays are conceptually similar to lists, and the two concepts are synonymous in some languages.

An extensible array can be implemented as a fixed-size array with a counter that records how many elements are actually in use. The append operation merely increments the counter, until the whole array is used, when the append operation may be defined to fail. This is an implementation of a dynamic array with a fixed capacity, as in the string type of Pascal. Alternatively, the append operation may re-allocate the underlying array with a larger size, and copy the old elements to the new area.
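A sketch of the fixed-capacity variant in Python (illustrative; the class name BoundedArray is an assumption of this example):

class BoundedArray:
    # A fixed-size backing array plus a counter of the elements in use.
    def __init__(self, capacity):
        self.items = [None] * capacity
        self.count = 0

    def append(self, x):
        if self.count == len(self.items):
            raise OverflowError("capacity exhausted")  # append fails when full
        self.items[self.count] = x
        self.count += 1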
2.1.5 See also

• Array access analysis
• Array programming
• Array slicing
• Bounds checking and index checking
• Bounds checking elimination
• Delimiter-separated values
• Comparison of programming languages (array)
• Parallel array

Related types

• Variable-length array
• Dynamic array
• Sparse array

2.1.6 References

[1] Robert W. Sebesta (2001). Concepts of Programming Languages. Addison-Wesley. 4th edition (1998), 5th edition (2001), ISBN 9780201385960.
[2] K. Jensen and Niklaus Wirth, PASCAL User Manual and Report. Springer. Paperback edition (2007), 184 pages, ISBN 978-3540069508.
[3] John Mitchell, Concepts of Programming Languages. Cambridge University Press.
[4] Lukham, Suzuki (1979), “Verification of array, record, and pointer operations in Pascal”. ACM Transactions on Programming Languages and Systems 1(2), 226–244.
[5] See the definition of a matrix.
[6] Brian W. Kernighan and Dennis M. Ritchie (1988), The C Programming Language. Prentice-Hall, 205 pages.
[7] Edsger W. Dijkstra, Why numbering should start at zero.

2.1.7 External links

• NIST’s Dictionary of Algorithms and Data Structures: Array

2.2 Array data structure

This article is about the byte-layout-level structure. For the abstract data type, see Array data type. For other uses, see Array.

In computer science, an array data structure, or simply an array, is a data structure consisting of a collection of elements (values or variables), each identified by at least one array index or key. An array is stored so that the position of each element can be computed from its index tuple by a mathematical formula.[1][2][3] The simplest type of data structure is a linear array, also called a one-dimensional array.

For example, an array of 10 32-bit integer variables, with indices 0 through 9, may be stored as 10 words at memory addresses 2000, 2004, 2008, ..., 2036, so that the element with index i has the address 2000 + 4 × i.[4]

The memory address of the first element of an array is called the first address or foundation address.

Because the mathematical concept of a matrix can be represented as a two-dimensional grid, two-dimensional arrays are also sometimes called matrices. In some cases the term “vector” is used in computing to refer to an array, although tuples rather than vectors are more correctly the mathematical equivalent. Arrays are often used to implement tables, especially lookup tables; the word table is sometimes used as a synonym of array.
Arrays are among the oldest and most important data structures, and are used by almost every program. They are also used to implement many other data structures, such as lists and strings. They effectively exploit the addressing logic of computers. In most modern computers and many external storage devices, the memory is a one-dimensional array of words, whose indices are their addresses. Processors, especially vector processors, are often optimized for array operations.

Arrays are useful mostly because the element indices can be computed at run time. Among other things, this feature allows a single iterative statement to process arbitrarily many elements of an array. For that reason, the elements of an array data structure are required to have the same size and should use the same data representation. The set of valid index tuples and the addresses of the elements (and hence the element addressing formula) are usually,[3][5] but not always,[2] fixed while the array is in use.

The term array is often used to mean array data type, a kind of data type provided by most high-level programming languages that consists of a collection of values or variables that can be selected by one or more indices computed at run-time. Array types are often implemented by array structures; however, in some languages they may be implemented by hash tables, linked lists, search trees, or other data structures.

The term is also used, especially in the description of algorithms, to mean associative array or “abstract array”, a theoretical computer science model (an abstract data type or ADT) intended to capture the essential properties of arrays.

2.2.1 History

The first digital computers used machine-language programming to set up and access array structures for data tables, vector and matrix computations, and for many other purposes. John von Neumann wrote the first array-sorting program (merge sort) in 1945, during the building of the first stored-program computer.[6]p. 159 Array indexing was originally done by self-modifying code, and later using index registers and indirect addressing. Some mainframes designed in the 1960s, such as the Burroughs B5000 and its successors, used memory segmentation to perform index-bounds checking in hardware.[7]

Assembly languages generally have no special support for arrays, other than what the machine itself provides. The earliest high-level programming languages, including FORTRAN (1957), Lisp (1958), COBOL (1960), and ALGOL 60 (1960), had support for multi-dimensional arrays, and so has C (1972). In C++ (1983), class templates exist for multi-dimensional arrays whose dimension is fixed at runtime[3][5] as well as for runtime-flexible arrays.[2]

2.2.2 Applications

Arrays are used to implement mathematical vectors and matrices, as well as other kinds of rectangular tables. Many databases, small and large, consist of (or include) one-dimensional arrays whose elements are records.

Arrays are used to implement other data structures, such as lists, heaps, hash tables, deques, queues, stacks, strings, and VLists. Array-based implementations of other data structures are frequently simple and space-efficient (implicit data structures), requiring little space overhead, but may have poor space complexity, particularly when modified, compared to tree-based data structures (compare a sorted array to a search tree).

One or more large arrays are sometimes used to emulate in-program dynamic memory allocation, particularly memory pool allocation. Historically, this has sometimes been the only way to allocate “dynamic memory” portably.

Arrays can be used to determine partial or complete control flow in programs, as a compact alternative to (otherwise repetitive) multiple IF statements. They are known in this context as control tables and are used in conjunction with a purpose-built interpreter whose control flow is altered according to values contained in the array. The array may contain subroutine pointers (or relative subroutine numbers that can be acted upon by SWITCH statements) that direct the path of the execution.

2.2.3 Element identifier and addressing formulas

When data objects are stored in an array, individual objects are selected by an index that is usually a non-negative scalar integer. Indexes are also called subscripts. An index maps the array value to a stored object.

There are three ways in which the elements of an array can be indexed:

• 0 (zero-based indexing): The first element of the array is indexed by a subscript of 0.[8]
• 1 (one-based indexing): The first element of the array is indexed by a subscript of 1.[9]
• n (n-based indexing): The base index of an array can be freely chosen. Usually programming languages allowing n-based indexing also allow negative index values, and other scalar data types like enumerations or characters may be used as an array index.

Arrays can have multiple dimensions, thus it is not uncommon to access an array using multiple indices. For example, a two-dimensional array A with three rows and four columns might provide access to the element at the
2nd row and 4th column by the expression A[1, 3] in the case of a zero-based indexing system. Thus two indices are used for a two-dimensional array, three for a three-dimensional array, and n for an n-dimensional array.

The number of indices needed to specify an element is called the dimension, dimensionality, or rank of the array.

In standard arrays, each index is restricted to a certain range of consecutive integers (or consecutive values of some enumerated type), and the address of an element is computed by a “linear” formula on the indices.

One-dimensional arrays

A one-dimensional array (or single-dimension array) is a type of linear array. Accessing its elements involves a single subscript, which can either represent a row or a column index.

As an example consider the C declaration int anArrayName[10];

Syntax: datatype anArrayname[sizeofArray];

In the given example the array can contain 10 elements of any value available to the int type. In C, the array element indices are 0–9 inclusive in this case. For example, the expressions anArrayName[0] and anArrayName[9] are the first and last elements respectively.

For a vector with linear addressing, the element with index i is located at the address B + c × i, where B is a fixed base address and c a fixed constant, sometimes called the address increment or stride.

If the valid element indices begin at 0, the constant B is simply the address of the first element of the array. For this reason, the C programming language specifies that array indices always begin at 0; and many programmers will call that element “zeroth” rather than “first”.

However, one can choose the index of the first element by an appropriate choice of the base address B. For example, if the array has five elements, indexed 1 through 5, and the base address B is replaced by B − 30c, then the indices of those same elements will be 31 to 35. If the numbering does not start at 0, the constant B may not be the address of any element.

Multidimensional arrays

For a multidimensional array, the element with indices i, j would have the address B + c · i + d · j, where the coefficients c and d are the row and column address increments, respectively.

More generally, in a k-dimensional array, the address of an element with indices i1, i2, ..., ik is

B + c1 · i1 + c2 · i2 + ... + ck · ik.

For example: int a[2][3];

This means that array a has 2 rows and 3 columns, and is of integer type. It can store 6 elements, laid out linearly, starting with the first row and continuing with the second row. The array will be stored as a11, a12, a13, a21, a22, a23.

This formula requires only k multiplications and k additions, for any array that can fit in memory. Moreover, if any coefficient is a fixed power of 2, the multiplication can be replaced by bit shifting.

The coefficients ck must be chosen so that every valid index tuple maps to the address of a distinct element.

If the minimum legal value for every index is 0, then B is the address of the element whose indices are all zero. As in the one-dimensional case, the element indices may be changed by changing the base address B. Thus, if a two-dimensional array has rows and columns indexed from 1 to 10 and 1 to 20, respectively, then replacing B by B + c1 − 3c2 will cause them to be renumbered from 0 through 9 and 4 through 23, respectively. Taking advantage of this feature, some languages (like FORTRAN 77) specify that array indices begin at 1, as in mathematical tradition, while other languages (like Fortran 90, Pascal and Algol) let the user choose the minimum value for each index.
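The addressing formulas can be evaluated directly; the following Python sketch (added for illustration; address_of is an invented name) computes element addresses for the int a[2][3] example, assuming 4-byte integers at base address 2000:

def address_of(B, increments, indices):
    # Linear addressing: B + c1*i1 + c2*i2 + ... + ck*ik
    addr = B
    for c, i in zip(increments, indices):
        addr += c * i
    return addr

# 2 rows x 3 columns of 4-byte integers, stored row by row:
# row increment c1 = 3 * 4 = 12 bytes, column increment c2 = 4 bytes.
print(address_of(2000, [12, 4], [0, 0]))  # 2000: element a11
print(address_of(2000, [12, 4], [1, 2]))  # 2020: element a23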
Dope vectors

The addressing formula is completely defined by the dimension d, the base address B, and the increments c1, c2, ..., ck. It is often useful to pack these parameters into a record called the array’s descriptor or stride vector or dope vector.[2][3] The size of each element, and the minimum and maximum values allowed for each index, may also be included in the dope vector. The dope vector is a complete handle for the array, and is a convenient way to pass arrays as arguments to procedures. Many useful array slicing operations (such as selecting a sub-array, swapping indices, or reversing the direction of the indices) can be performed very efficiently by manipulating the dope vector.[2]

Compact layouts

Often the coefficients are chosen so that the elements occupy a contiguous area of memory. However, that is not necessary. Even if arrays are always created with contiguous elements, some array slicing operations may create non-contiguous sub-arrays from them.

There are two systematic compact layouts for a two-dimensional array. For example, consider the matrix

    1 2 3
A = 4 5 6
    7 8 9
In the row-major order layout (adopted by C for statically declared arrays), the elements in each row are stored in consecutive positions, and all of the elements of a row have a lower address than any of the elements of a consecutive row:

1 2 3 4 5 6 7 8 9

In column-major order (traditionally used by Fortran), the elements in each column are consecutive in memory, and all of the elements of a column have a lower address than any of the elements of a consecutive column:

1 4 7 2 5 8 3 6 9

For arrays with three or more indices, “row major order” puts in consecutive positions any two elements whose index tuples differ only by one in the last index. “Column major order” is analogous with respect to the first index.
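In Python, the two layouts of the matrix A above can be produced explicitly (an illustrative sketch added here):

A = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]

row_major    = [A[i][j] for i in range(3) for j in range(3)]
column_major = [A[i][j] for j in range(3) for i in range(3)]

print(row_major)     # [1, 2, 3, 4, 5, 6, 7, 8, 9]
print(column_major)  # [1, 4, 7, 2, 5, 8, 3, 6, 9]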
In systems which use processor cache or virtual memory, scanning an array is much faster if successive elements are stored in consecutive positions in memory, rather than sparsely scattered. Many algorithms that use multidimensional arrays will scan them in a predictable order. A programmer (or a sophisticated compiler) may use this information to choose between row- or column-major layout for each array. For example, when computing the product A·B of two matrices, it would be best to have A stored in row-major order, and B in column-major order.

Resizing

Main article: Dynamic array

Static arrays have a size that is fixed when they are created and consequently do not allow elements to be inserted or removed. However, by allocating a new array and copying the contents of the old array to it, it is possible to effectively implement a dynamic version of an array; see dynamic array. If this operation is done infrequently, insertions at the end of the array require only amortized constant time.

Some array data structures do not reallocate storage, but do store a count of the number of elements of the array in use, called the count or size. This effectively makes the array a dynamic array with a fixed maximum size or capacity; Pascal strings are examples of this.

Non-linear formulas

More complicated (non-linear) formulas are occasionally used. For a compact two-dimensional triangular array, for instance, the addressing formula is a polynomial of degree 2.

2.2.4 Efficiency

Both store and select take (deterministic worst case) constant time. Arrays take linear (O(n)) space in the number of elements n that they hold.

In an array with element size k and on a machine with a cache line size of B bytes, iterating through an array of n elements requires the minimum of ceiling(nk/B) cache misses, because its elements occupy contiguous memory locations. This is roughly a factor of B/k better than the number of cache misses needed to access n elements at random memory locations. As a consequence, sequential iteration over an array is noticeably faster in practice than iteration over many other data structures, a property called locality of reference (this does not mean, however, that using a perfect hash or trivial hash within the same (local) array will not be even faster, and achievable in constant time). Libraries provide low-level optimized facilities for copying ranges of memory (such as memcpy) which can be used to move contiguous blocks of array elements significantly faster than can be achieved through individual element access. The speedup of such optimized routines varies by array element size, architecture, and implementation.

Memory-wise, arrays are compact data structures with no per-element overhead. There may be a per-array overhead, e.g. to store index bounds, but this is language-dependent. It can also happen that elements stored in an array require less memory than the same elements stored in individual variables, because several array elements can be stored in a single word; such arrays are often called packed arrays. An extreme (but commonly used) case is the bit array, where every bit represents a single element. A single octet can thus hold up to 256 different combinations of up to 8 different conditions, in the most compact form.

Array accesses with statically predictable access patterns are a major source of data parallelism.

Comparison with other data structures

Growable arrays are similar to arrays but add the ability to insert and delete elements; adding and deleting at the end is particularly efficient. However, they reserve linear (Θ(n)) additional storage, whereas arrays do not reserve additional storage.

Associative arrays provide a mechanism for array-like functionality without huge storage overheads when the index values are sparse. For example, an array that contains values only at indexes 1 and 2 billion may benefit from using such a structure. Specialized associative arrays with integer keys include Patricia tries, Judy arrays, and van Emde Boas trees.

Balanced trees require O(log n) time for indexed access, but also permit inserting or deleting elements in O(log n) time,[15] whereas growable arrays require linear (Θ(n)) time to insert or delete elements at an arbitrary position. Linked lists allow constant time removal and insertion in the middle but take linear time for indexed access. Their memory use is typically worse than arrays, but is still linear.
[Figure: A two-dimensional array stored as a one-dimensional array of one-dimensional arrays (rows).]

An Iliffe vector is an alternative to a multidimensional array structure. It uses a one-dimensional array of references to arrays of one dimension less. For two dimensions, in particular, this alternative structure would be a vector of pointers to vectors, one for each row. Thus an element in row i and column j of an array A would be accessed by double indexing (A[i][j] in typical notation). This alternative structure allows jagged arrays, where each row may have a different size — or, in general, where the valid range of each index depends on the values of all preceding indices. It also saves one multiplication (by the column address increment), replacing it by a bit shift (to index the vector of row pointers) and one extra memory access (fetching the row address), which may be worthwhile in some architectures.
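In Python, the Iliffe-vector idea corresponds to a list of row lists (an illustrative sketch added here); rows may have different lengths, giving a jagged array:

A = [
    [1, 2, 3],     # row 0
    [4, 5, 6, 7],  # row 1: rows may differ in length
    [8],           # row 2
]

print(A[1][2])     # double indexing: row 1, column 2 gives 6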

2.2.5 Dimension

The dimension of an array is the number of indices needed to select an element. Thus, if the array is seen as a function on a set of possible index combinations, it is the dimension of the space of which its domain is a discrete subset. Thus a one-dimensional array is a list of data, a two-dimensional array a rectangle of data, a three-dimensional array a block of data, etc.

This should not be confused with the dimension of the set of all matrices with a given domain, that is, the number of elements in the array. For example, an array with 5 rows and 4 columns is two-dimensional, but such matrices form a 20-dimensional space. Similarly, a three-dimensional vector can be represented by a one-dimensional array of size three.

2.2.6 See also

• Dynamic array
• Parallel array
• Variable-length array
• Bit array
• Array slicing
• Offset (computer science)
• Row-major order
• Stride of an array

2.2.7 References

[1] Black, Paul E. (13 November 2008). “array”. Dictionary of Algorithms and Data Structures. National Institute of Standards and Technology. Retrieved 22 August 2010.
[2] Bjoern Andres; Ullrich Koethe; Thorben Kroeger; Hamprecht (2010). “Runtime-Flexible Multi-dimensional Arrays and Views for C++98 and C++0x”. arXiv:1008.2909 [cs.DS].
[3] Garcia, Ronald; Lumsdaine, Andrew (2005). “MultiArray: a C++ library for generic programming with arrays”. Software: Practice and Experience. 35 (2): 159–188. doi:10.1002/spe.630. ISSN 0038-0644.
[4] David R. Richardson (2002), The Book on Data Structures. iUniverse, 112 pages. ISBN 0-595-24039-9, ISBN 978-0-595-24039-5.
[5] Veldhuizen, Todd L. (December 1998). Arrays in Blitz++ (PDF). Computing in Object-Oriented Parallel Environments. Lecture Notes in Computer Science. 1505. Springer Berlin Heidelberg. pp. 223–230. doi:10.1007/3-540-49372-7_24. ISBN 978-3-540-65387-5.
[6] Donald Knuth, The Art of Computer Programming, vol. 3. Addison-Wesley.
[7] Levy, Henry M. (1984), Capability-based Computer Systems, Digital Press, p. 22, ISBN 9780932376220.
[8] “Array Code Examples - PHP Array Functions - PHP code”. http://www.configure-all.com/: Computer Programming Web programming Tips. Retrieved 8 April 2011. “In most computer languages array index (counting) starts from 0, not from 1. Index of the first element of the array is 0, index of the second element of the array is 1, and so on.”
[9] “Chapter 6 - Arrays, Types, and Constants”. Modula-2 Tutorial. http://www.modula2.org/tutor/index.php. Retrieved 8 April 2011. “The names of the twelve variables are given by Automobiles[1], Automobiles[2], ... Automobiles[12].” [That is, in Modula-2 the index starts at one.]
[10] Chris Okasaki (1995). “Purely Functional Random-Access Lists”. Proceedings of the Seventh International Conference on Functional Programming Languages and Computer Architecture: 86–95. doi:10.1145/224164.224187.
26 CHAPTER 2. SEQUENCES

2.3 Dynamic array

[Figure: several values are inserted at the end of a dynamic array using geometric expansion. Grey cells indicate space reserved for expansion. Most insertions are fast (constant time), while some are slow due to the need for reallocation (Θ(n) time, labelled with turtles). The logical size and capacity of the final array are shown.]

In computer science, a dynamic array, growable array, resizable array, dynamic table, mutable array, or array list is a random-access, variable-size list data structure that allows elements to be added or removed. It is supplied with standard libraries in many modern mainstream programming languages.

A dynamic array is not the same thing as a dynamically allocated array, which is an array whose size is fixed when the array is allocated, although a dynamic array may use such a fixed-size array as a back end.[1]

2.3.1 Bounded-size dynamic arrays and capacity

A simple dynamic array can be constructed by allocating an array of fixed size, typically larger than the number of elements immediately required. The elements of the dynamic array are stored contiguously at the start of the underlying array, and the remaining positions towards the end of the underlying array are reserved, or unused. Elements can be added at the end of a dynamic array in constant time by using the reserved space, until this space is completely consumed. When all space is consumed and an additional element is to be added, the underlying fixed-size array needs to be increased in size. Resizing is typically expensive because it involves allocating a new underlying array and copying each element from the original array. Elements can be removed from the end of a dynamic array in constant time, as no resizing is required. The number of elements used by the dynamic array contents is its logical size or size, while the size of the underlying array is called the dynamic array's capacity or physical size, which is the maximum possible size without relocating data.[2]

A fixed-size array will suffice in applications where the maximum logical size is fixed (e.g. by specification), or can be calculated before the array is allocated. A dynamic array might be preferred if

• the maximum logical size is unknown, or difficult to calculate, before the array is allocated
• it is considered that a maximum logical size given by a specification is likely to change
• the amortized cost of resizing a dynamic array does not significantly affect performance or responsiveness

2.3.2 Geometric expansion and amortized cost

To avoid incurring the cost of resizing many times, dynamic arrays resize by a large amount, such as doubling in size, and use the reserved space for future expansion. The operation of adding an element to the end might work as follows:

function insertEnd(dynarray a, element e)
    if (a.size = a.capacity)
        // resize a to twice its current capacity:
        a.capacity ← a.capacity * 2
        // (copy the contents to the new memory location here)
    a[a.size] ← e
    a.size ← a.size + 1
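A compilable C counterpart to this pseudocode might look as follows (a sketch; the names dynarray and da_push are illustrative, and error handling for realloc is omitted):

    #include <stdlib.h>

    typedef struct {
        int *data;       /* underlying fixed-size array */
        size_t size;     /* logical size */
        size_t capacity; /* physical size */
    } dynarray;

    /* Append e at the end, doubling the capacity once the
       reserved space is consumed. */
    void da_push(dynarray *a, int e) {
        if (a->size == a->capacity) {
            a->capacity = a->capacity ? 2 * a->capacity : 1;
            a->data = realloc(a->data, a->capacity * sizeof *a->data);
        }
        a->data[a->size++] = e;
    }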
As n elements are inserted, the capacities form a geometric progression. Expanding the array by any constant proportion a ensures that inserting n elements takes O(n) time overall, meaning that each insertion takes
amortized constant time. Many dynamic arrays also deallocate some of the underlying storage if its size drops below a certain threshold, such as 30% of the capacity. This threshold must be strictly smaller than 1/a in order to provide hysteresis (a stable band that avoids repeatedly growing and shrinking) and to support mixed sequences of insertions and removals with amortized constant cost. Dynamic arrays are a common example when teaching amortized analysis.[3][4]
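The amortized bound can be checked with the usual geometric-series argument (a sketch, for growth factor a > 1): after n insertions, the elements copied across all resizes so far number at most

\[
n + \frac{n}{a} + \frac{n}{a^2} + \cdots \;<\; n \sum_{k=0}^{\infty} a^{-k} \;=\; \frac{a}{a-1}\,n \;=\; O(n),
\]

so the n insertions cost O(n) total work, i.e. O(1) amortized per insertion; for a = 2 the copying never exceeds 2n element moves.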
2.3.3 Growth factor

The growth factor for the dynamic array depends on several factors, including a space-time trade-off and the algorithms used in the memory allocator itself. For growth factor a, the average time per insertion operation is about a/(a−1), while the number of wasted cells is bounded above by (a−1)n. If the memory allocator uses a first-fit allocation algorithm, then growth factor values such as a=2 can cause dynamic array expansion to run out of memory even though a significant amount of memory may still be available.[5] There have been various discussions on ideal growth factor values, including proposals for the golden ratio as well as the value 1.5.[6] Many textbooks, however, use a = 2 for simplicity and analysis purposes.[3][4] Growth factors used by popular implementations include roughly 1.125 for CPython's list,[7] 1.5 for Microsoft's C++ STL vector[8] and Facebook's folly,[9] and 2 for several other implementations.

2.3.4 Performance

The dynamic array has performance similar to an array, with the addition of new operations to add and remove elements:

• Getting or setting the value at a particular index (constant time)
• Iterating over the elements in order (linear time, good cache performance)
• Inserting or deleting an element in the middle of the array (linear time)
• Inserting or deleting an element at the end of the array (constant amortized time)

Dynamic arrays benefit from many of the advantages of arrays, including good locality of reference and data cache utilization, compactness (low memory use), and random access. They usually have only a small fixed additional overhead for storing information about the size and capacity. This makes dynamic arrays an attractive tool for building cache-friendly data structures. However, in languages like Python or Java that enforce reference semantics, the dynamic array generally will not store the actual data, but rather it will store references to the data that resides in other areas of memory. In this case, accessing items in the array sequentially will actually involve accessing multiple non-contiguous areas of memory, so the many advantages of the cache-friendliness of this data structure are lost.

Compared to linked lists, dynamic arrays have faster indexing (constant time versus linear time) and typically faster iteration due to improved locality of reference; however, dynamic arrays require linear time to insert or delete at an arbitrary location, since all following elements must be moved, while linked lists can do this in constant time. This disadvantage is mitigated by the gap buffer and tiered vector variants discussed under Variants below. Also, in a highly fragmented memory region, it may be expensive or impossible to find contiguous space for a large dynamic array, whereas linked lists do not require the whole data structure to be stored contiguously.

A balanced tree can store a list while providing all operations of both dynamic arrays and linked lists reasonably efficiently, but both insertion at the end and iteration over the list are slower than for a dynamic array, in theory and in practice, due to non-contiguous storage and tree traversal/manipulation overhead.

2.3.5 Variants

Gap buffers are similar to dynamic arrays but allow efficient insertion and deletion operations clustered near the same arbitrary location. Some deque implementations use array deques, which allow amortized constant time insertion/removal at both ends, instead of just one end. Goodrich[15] presented a dynamic array algorithm called tiered vectors that provides O(√n) performance for order-preserving insertions or deletions from the middle of the array.

The hashed array tree (HAT) is a dynamic array algorithm published by Sitarski in 1996.[16] A hashed array tree wastes order √n storage space, where n is the number of elements in the array. The algorithm has O(1) amortized performance when appending a series of objects to the end of a hashed array tree.

In a 1999 paper,[14] Brodnik et al. describe a tiered dynamic array data structure which wastes only O(√n) space for n elements at any point in time, and they prove a lower bound showing that any dynamic array must waste this much space if the operations are to remain amortized constant time. Additionally, they present a variant where growing and shrinking the buffer has not only amortized but worst-case constant time.

Bagwell (2002)[17] presented the VList algorithm, which can be adapted to implement a dynamic array.
2.3.6 Language support

C++'s std::vector is an implementation of dynamic arrays, as are the ArrayList[18] classes supplied with the Java API and the .NET Framework.[19] The generic List<> class supplied with version 2.0 of the .NET Framework is also implemented with dynamic arrays. Smalltalk's OrderedCollection is a dynamic array with dynamic start and end index, making the removal of the first element also O(1). Python's list datatype implementation is a dynamic array. Delphi and D implement dynamic arrays at the language's core. Ada's Ada.Containers.Vectors generic package provides a dynamic array implementation for a given subtype. Many scripting languages such as Perl and Ruby offer dynamic arrays as a built-in primitive data type. Several cross-platform frameworks provide dynamic array implementations for C, including CFArray and CFMutableArray in Core Foundation, and GArray and GPtrArray in GLib.

2.3.7 References

[1] See, for example, the source code of the java.util.ArrayList class from OpenJDK 6.
[2] Lambert, Kenneth Alfred (2009), “Physical size and logical size”, Fundamentals of Python: From First Programs Through Data Structures, Cengage Learning, p. 510, ISBN 1423902181.
[3] Goodrich, Michael T.; Tamassia, Roberto (2002), “1.5.2 Analyzing an Extendable Array Implementation”, Algorithm Design: Foundations, Analysis and Internet Examples, Wiley, pp. 39–41.
[4] Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2001) [1990]. “17.4 Dynamic tables”. Introduction to Algorithms (2nd ed.). MIT Press and McGraw-Hill. pp. 416–424. ISBN 0-262-03293-7.
[5] “C++ STL vector: definition, growth factor, member functions”. Retrieved 2015-08-05.
[6] “vector growth factor of 1.5”. comp.lang.c++.moderated. Google Groups.
[7] List object implementation from python.org, retrieved 2011-09-27.
[8] Brais, Hadi. “Dissecting the C++ STL Vector: Part 3 - Capacity & Size”. Micromysteries. Retrieved 2015-08-05.
[9] “facebook/folly”. GitHub. Retrieved 2015-08-05.
[10] Chris Okasaki (1995). “Purely Functional Random-Access Lists”. Proceedings of the Seventh International Conference on Functional Programming Languages and Computer Architecture: 86–95. doi:10.1145/224164.224187.
[11] Gerald Kruse. CS 240 Lecture Notes: Linked Lists Plus: Complexity Trade-offs. Juniata College. Spring 2008.
[12] Day 1 Keynote - Bjarne Stroustrup: C++11 Style at GoingNative 2012 on channel9.msdn.com from minute 45 or foil 44.
[13] Number crunching: Why you should never, ever, EVER use linked-list in your code again at kjellkod.wordpress.com.
[14] Brodnik, Andrej; Carlsson, Svante; Sedgewick, Robert; Munro, JI; Demaine, ED (1999), Resizable Arrays in Optimal Time and Space (Technical Report CS-99-09) (PDF), Department of Computer Science, University of Waterloo.
[15] Goodrich, Michael T.; Kloss II, John G. (1999), “Tiered Vectors: Efficient Dynamic Arrays for Rank-Based Sequences”, Workshop on Algorithms and Data Structures, Lecture Notes in Computer Science, 1663: 205–216, doi:10.1007/3-540-48447-7_21, ISBN 978-3-540-66279-2.
[16] Sitarski, Edward (September 1996), “HATs: Hashed array trees”, Algorithm Alley, Dr. Dobb's Journal, 21 (11).
[17] Bagwell, Phil (2002), Fast Functional Lists, Hash-Lists, Deques and Variable Length Arrays, EPFL.
[18] Javadoc on ArrayList.
[19] ArrayList Class.

2.3.8 External links

• NIST Dictionary of Algorithms and Data Structures: Dynamic array
• VPOOL - C language implementation of dynamic array.
• CollectionSpy — A Java profiler with explicit support for debugging ArrayList- and Vector-related issues.
• Open Data Structures - Chapter 2 - Array-Based Lists

2.4 Linked list

In computer science, a linked list is a linear collection of data elements, called nodes, each pointing to the next node by means of a pointer. It is a data structure consisting of a group of nodes which together represent a sequence. In the simplest form, each node is composed of data and a reference (in other words, a link) to the next node in the sequence. This structure allows for efficient insertion or removal of elements from any position in the sequence during iteration. More complex variants add additional links, allowing efficient insertion or removal from arbitrary element references.

[Figure: a singly linked list whose nodes contain two fields: an integer value and a link to the next node. The last node is linked to a terminator used to signify the end of the list.]
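In C, for instance, such a node could be declared as follows (a minimal sketch; the field names are illustrative):

    struct node {
        int value;          /* the data field */
        struct node *next;  /* link to the next node; NULL terminates the list */
    };

    /* The three-node list from the figure, 12 -> 99 -> 37: */
    struct node n3 = {37, NULL};
    struct node n2 = {99, &n3};
    struct node n1 = {12, &n2};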
Linked lists are among the simplest and most common data structures. They can be used to implement several other common abstract data types, including lists (the abstract data type), stacks, queues, associative arrays, and S-expressions, though it is not uncommon to implement those data structures directly without using a linked list as the basis of implementation.

The principal benefit of a linked list over a conventional array is that the list elements can easily be inserted or removed without reallocation or reorganization of the entire structure, because the data items need not be stored contiguously in memory or on disk, while an array has to be declared in the source code before compiling and running the program. Linked lists allow insertion and removal of nodes at any point in the list, and can do so with a constant number of operations if the link previous to the link being added or removed is maintained during list traversal.

On the other hand, simple linked lists by themselves do not allow random access to the data or any form of efficient indexing. Thus, many basic operations — such as obtaining the last node of the list (assuming that the last node is not maintained as a separate node reference in the list structure), finding a node that contains a given datum, or locating the place where a new node should be inserted — may require sequential scanning of most or all of the list elements. The advantages and disadvantages of using linked lists are given below.

2.4.1 Advantages

• Linked lists are a dynamic data structure, which can grow and be pruned, allocating and deallocating memory while the program is running.
• Insertion and deletion node operations are easily implemented in a linked list.
• Dynamic data structures such as stacks and queues can be implemented using a linked list.
• There is no need to define an initial size for a linked list.
• Items can be added or removed from the middle of the list.
• Backtracking is possible in a two-way (doubly) linked list.

2.4.2 Disadvantages

• They use more memory than arrays because of the storage used by their pointers.
• Nodes in a linked list must be read in order from the beginning, as linked lists are inherently sequential access.
• Nodes are stored noncontiguously, greatly increasing the time required to access individual elements within the list, especially with a CPU cache.
• Difficulties arise in linked lists when it comes to reverse traversal. For instance, singly linked lists are cumbersome to navigate backwards,[1] and while doubly linked lists are somewhat easier to traverse, memory is consumed in allocating space for a back-pointer.

2.4.3 History

Linked lists were developed in 1955–1956 by Allen Newell, Cliff Shaw and Herbert A. Simon at RAND Corporation as the primary data structure for their Information Processing Language. IPL was used by the authors to develop several early artificial intelligence programs, including the Logic Theory Machine, the General Problem Solver, and a computer chess program. Reports on their work appeared in IRE Transactions on Information Theory in 1956, and in several conference proceedings from 1957 to 1959, including Proceedings of the Western Joint Computer Conference in 1957 and 1958, and Information Processing (Proceedings of the first UNESCO International Conference on Information Processing) in 1959. The now-classic diagram consisting of blocks representing list nodes with arrows pointing to successive list nodes appears in “Programming the Logic Theory Machine” by Newell and Shaw in Proc. WJCC, February 1957. Newell and Simon were recognized with the ACM Turing Award in 1975 for having “made basic contributions to artificial intelligence, the psychology of human cognition, and list processing”. The problem of machine translation for natural language processing led Victor Yngve at the Massachusetts Institute of Technology (MIT) to use linked lists as data structures in his COMIT programming language for computer research in the field of linguistics. A report on this language entitled “A programming language for mechanical translation” appeared in Mechanical Translation in 1958.

LISP, standing for list processor, was created by John McCarthy in 1958 while he was at MIT, and in 1960 he published its design in a paper in the Communications of the ACM, entitled “Recursive Functions of Symbolic Expressions and Their Computation by Machine, Part I”. One of LISP's major data structures is the linked list.

By the early 1960s, the utility of both linked lists and languages which use these structures as their primary data representation was well established. Bert Green of the MIT Lincoln Laboratory published a review article entitled “Computer languages for symbol manipulation” in IRE Transactions on Human Factors in Electronics in
March 1961 which summarized the advantages of the linked list approach. A later review article, “A Comparison of list-processing computer languages” by Bobrow and Raphael, appeared in Communications of the ACM in April 1964.

Several operating systems developed by Technical Systems Consultants (originally of West Lafayette, Indiana, and later of Chapel Hill, North Carolina) used singly linked lists as file structures. A directory entry pointed to the first sector of a file, and succeeding portions of the file were located by traversing pointers. Systems using this technique included Flex (for the Motorola 6800 CPU), mini-Flex (same CPU), and Flex9 (for the Motorola 6809 CPU). A variant developed by TSC for, and marketed by, Smoke Signal Broadcasting in California used doubly linked lists in the same manner.

The TSS/360 operating system, developed by IBM for the System 360/370 machines, used a doubly linked list for its file system catalog. The directory structure was similar to Unix, where a directory could contain files and other directories and extend to any depth.

2.4.4 Basic concepts and nomenclature

Each record of a linked list is often called an 'element' or 'node'.

The field of each node that contains the address of the next node is usually called the 'next link' or 'next pointer'. The remaining fields are known as the 'data', 'information', 'value', 'cargo', or 'payload' fields.

The 'head' of a list is its first node. The 'tail' of a list may refer either to the rest of the list after the head, or to the last node in the list. In Lisp and some derived languages, the next node may be called the 'cdr' (pronounced could-er) of the list, while the payload of the head node may be called the 'car'.

Singly linked list

Singly linked lists contain nodes which have a data field as well as a 'next' field, which points to the next node in the line of nodes. Operations that can be performed on singly linked lists include insertion, deletion and traversal. In this terminology, each node (link) stores a data element; each node also holds a reference, often called next, to the next node; and the list itself holds a reference, often called first, to its first node.

[Figure: a singly linked list whose nodes contain two fields: an integer value and a link to the next node.]

Doubly linked list

Main article: Doubly linked list

In a 'doubly linked list', each node contains, besides the next-node link, a second link field pointing to the 'previous' node in the sequence. The two links may be called 'forward' and 'backwards', or 'next' and 'prev' ('previous').

[Figure: a doubly linked list whose nodes contain three fields: an integer value, the link forward to the next node, and the link backward to the previous node.]

A technique known as XOR-linking allows a doubly linked list to be implemented using a single link field in each node. However, this technique requires the ability to do bit operations on addresses, and therefore may not be available in some high-level languages.

Many modern operating systems use doubly linked lists to maintain references to active processes, threads, and other dynamic objects.[2] A common strategy for rootkits to evade detection is to unlink themselves from these lists.[3]
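As a sketch of the XOR-linking idea mentioned above (assuming each node stores prev XOR next in a single field; illustrative C, not suitable for garbage-collected environments):

    #include <stdint.h>

    struct xnode {
        int value;
        uintptr_t link;   /* (uintptr_t)prev ^ (uintptr_t)next */
    };

    /* Walk forward from the head, whose conceptual predecessor is NULL. */
    void xtraverse(struct xnode *head) {
        struct xnode *prev = NULL, *cur = head;
        while (cur != NULL) {
            /* recover the next pointer: next = link XOR prev */
            struct xnode *next = (struct xnode *)(cur->link ^ (uintptr_t)prev);
            /* ... visit cur->value here ... */
            prev = cur;
            cur = next;
        }
    }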
Multiply linked list

In a 'multiply linked list', each node contains two or more link fields, each field being used to connect the same set of data records in a different order (e.g., by name, by department, by date of birth, etc.). While doubly linked lists can be seen as special cases of multiply linked lists, the fact that the two orders are opposite to each other leads to simpler and more efficient algorithms, so they are usually treated as a separate case.

Circular linked list

In the last node of a list, the link field often contains a null reference, a special value used to indicate the lack of further nodes. A less common convention is to make it point to the first node of the list; in that case the list is said to be 'circular' or 'circularly linked'; otherwise it is said to be 'open' or 'linear'.

[Figure: a circular linked list.]
In the case of a circular doubly linked list, the only change that occurs is that the end, or “tail”, of the list is linked back to the front, or “head”, of the list, and vice versa.

Sentinel nodes

Main article: Sentinel node

In some implementations an extra 'sentinel' or 'dummy' node may be added before the first data record or after the last one. This convention simplifies and accelerates some list-handling algorithms, by ensuring that all links can be safely dereferenced and that every list (even one that contains no data elements) always has a “first” and “last” node.

Empty lists

An empty list is a list that contains no data records. This is usually the same as saying that it has zero nodes. If sentinel nodes are being used, the list is usually said to be empty when it has only sentinel nodes.

Hash linking

The link fields need not be physically part of the nodes. If the data records are stored in an array and referenced by their indices, the link field may be stored in a separate array with the same indices as the data records.

List handles

Since a reference to the first node gives access to the whole list, that reference is often called the 'address', 'pointer', or 'handle' of the list. Algorithms that manipulate linked lists usually get such handles to the input lists and return the handles to the resulting lists. In fact, in the context of such algorithms, the word “list” often means “list handle”. In some situations, however, it may be convenient to refer to a list by a handle that consists of two links, pointing to its first and last nodes.

Combining alternatives

The alternatives listed above may be arbitrarily combined in almost every way, so one may have circular doubly linked lists without sentinels, circular singly linked lists with sentinels, etc.

2.4.5 Tradeoffs

As with most choices in computer programming and design, no method is well suited to all circumstances. A linked list data structure might work well in one case, but cause problems in another. This is a list of some of the common tradeoffs involving linked list structures.

Linked lists vs. dynamic arrays

A dynamic array is a data structure that allocates all elements contiguously in memory, and keeps a count of the current number of elements. If the space reserved for the dynamic array is exceeded, it is reallocated and (possibly) copied, which is an expensive operation.

Linked lists have several advantages over dynamic arrays. Insertion or deletion of an element at a specific point of a list, assuming that we have already indexed a pointer to the node (before the one to be removed, or before the insertion point), is a constant-time operation (otherwise, without this reference, it is O(n)), whereas insertion in a dynamic array at a random location will require moving half of the elements on average, and all the elements in the worst case. While one can “delete” an element from an array in constant time by somehow marking its slot as “vacant”, this causes fragmentation that impedes the performance of iteration.

Moreover, arbitrarily many elements may be inserted into a linked list, limited only by the total memory available, while a dynamic array will eventually fill up its underlying array data structure and will have to reallocate — an expensive operation, one that may not even be possible if memory is fragmented, although the cost of reallocation can be averaged over insertions, and the cost of an insertion due to reallocation would still be amortized O(1). This helps with appending elements at the array's end, but inserting into (or removing from) middle positions still carries prohibitive costs due to data moving to maintain contiguity. An array from which many elements are removed may also have to be resized in order to avoid wasting too much space.

On the other hand, dynamic arrays (as well as fixed-size array data structures) allow constant-time random access, while linked lists allow only sequential access to elements. Singly linked lists, in fact, can be easily traversed in only one direction. This makes linked lists unsuitable for applications where it's useful to look up an element by its index quickly, such as heapsort. Sequential access on arrays and dynamic arrays is also faster than on linked lists on many machines, because they have optimal locality of reference and thus make good use of data caching.

Another disadvantage of linked lists is the extra storage needed for references, which often makes them impractical for lists of small data items such as characters or boolean values, because the storage overhead for the links may exceed by a factor of two or more the size of the data. In contrast, a dynamic array requires only the space for the data itself (and a very small amount of control data).[note 1] It can also be slow, and with a naïve allocator, wasteful, to allocate memory separately for each new element, a problem generally solved using memory pools.
Some hybrid solutions try to combine the advantages of the two representations. Unrolled linked lists store several elements in each list node, increasing cache performance while decreasing memory overhead for references. CDR coding does both of these as well, by replacing references with the actual data referenced, which extends off the end of the referencing record.

A good example that highlights the pros and cons of using dynamic arrays vs. linked lists is implementing a program that resolves the Josephus problem. The Josephus problem is an election method that works by having a group of people stand in a circle. Starting at a predetermined person, you count around the circle n times. Once you reach the nth person, take them out of the circle and have the members close the circle. Then count around the circle the same n times and repeat the process, until only one person is left. That person wins the election. This shows the strengths and weaknesses of a linked list vs. a dynamic array, because if you view the people as connected nodes in a circular linked list, then it shows how easily the linked list is able to delete nodes (as it only has to rearrange the links to the different nodes). However, the linked list will be poor at finding the next person to remove and will need to search through the list until it finds that person. A dynamic array, on the other hand, will be poor at deleting nodes (or elements) as it cannot remove one node without individually shifting all the elements up the list by one. However, it is exceptionally easy to find the nth person in the circle by directly referencing them by their position in the array.

The list ranking problem concerns the efficient conversion of a linked list representation into an array. Although trivial for a conventional computer, solving this problem by a parallel algorithm is complicated and has been the subject of much research.

A balanced tree has similar memory access patterns and space overhead to a linked list while permitting much more efficient indexing, taking O(log n) time instead of O(n) for a random access. However, insertion and deletion operations are more expensive due to the overhead of tree manipulations to maintain balance. Schemes exist for trees to automatically maintain themselves in a balanced state: AVL trees or red-black trees.

Singly linked linear lists vs. other lists

While doubly linked and circular lists have advantages over singly linked linear lists, linear lists offer some advantages that make them preferable in some situations.

A singly linked linear list is a recursive data structure, because it contains a pointer to a smaller object of the same type. For that reason, many operations on singly linked linear lists (such as merging two lists, or enumerating the elements in reverse order) often have very simple recursive algorithms, much simpler than any solution using iterative commands. While those recursive solutions can be adapted for doubly linked and circularly linked lists, the procedures generally need extra arguments and more complicated base cases.

Linear singly linked lists also allow tail-sharing, the use of a common final portion of sub-list as the terminal portion of two different lists. In particular, if a new node is added at the beginning of a list, the former list remains available as the tail of the new one — a simple example of a persistent data structure. Again, this is not true with the other variants: a node may never belong to two different circular or doubly linked lists.

In particular, end-sentinel nodes can be shared among singly linked non-circular lists. The same end-sentinel node may be used for every such list. In Lisp, for example, every proper list ends with a link to a special node, denoted by nil or (), whose CAR and CDR links point to itself. Thus a Lisp procedure can safely take the CAR or CDR of any list.

The advantages of the fancy variants are often limited to the complexity of the algorithms, not their efficiency. A circular list, in particular, can usually be emulated by a linear list together with two variables that point to the first and last nodes, at no extra cost.

Doubly linked vs. singly linked

Doubly linked lists require more space per node (unless one uses XOR-linking), and their elementary operations are more expensive; but they are often easier to manipulate because they allow fast and easy sequential access to the list in both directions. In a doubly linked list, one can insert or delete a node in a constant number of operations given only that node's address. To do the same in a singly linked list, one must have the address of the pointer to that node, which is either the handle for the whole list (in the case of the first node) or the link field in the previous node. Some algorithms require access in both directions. On the other hand, doubly linked lists do not allow tail-sharing and cannot be used as persistent data structures.

Circularly linked vs. linearly linked

A circularly linked list may be a natural option to represent arrays that are naturally circular, e.g. the corners of a polygon, a pool of buffers that are used and released in FIFO (“first in, first out”) order, or a set of processes that should be time-shared in round-robin order. In these applications, a pointer to any node serves as a handle to the whole list.

With a circular list, a pointer to the last node gives easy access also to the first node, by following one link. Thus, in applications that require access to both ends of the list (e.g., in the implementation of a queue), a circular structure allows one to handle the structure by a single pointer, instead of two.
A circular list can be split into two circular lists, in constant time, by giving the addresses of the last node of each piece. The operation consists in swapping the contents of the link fields of those two nodes. Applying the same operation to any two nodes in two distinct lists joins the two lists into one. This property greatly simplifies some algorithms and data structures, such as the quad-edge and face-edge.

The simplest representation for an empty circular list (when such a thing makes sense) is a null pointer, indicating that the list has no nodes. Without this choice, many algorithms have to test for this special case and handle it separately. By contrast, the use of null to denote an empty linear list is more natural and often creates fewer special cases.

Using sentinel nodes

A sentinel node may simplify certain list operations, by ensuring that the next or previous nodes exist for every element, and that even empty lists have at least one node. One may also use a sentinel node at the end of the list, with an appropriate data field, to eliminate some end-of-list tests. For example, when scanning the list looking for a node with a given value x, setting the sentinel's data field to x makes it unnecessary to test for end-of-list inside the loop. Another example is the merging of two sorted lists: if their sentinels have data fields set to +∞, the choice of the next output node does not need special handling for empty lists.

However, sentinel nodes use up extra space (especially in applications that use many short lists), and they may complicate other operations (such as the creation of a new empty list).

However, if the circular list is used merely to simulate a linear list, one may avoid some of this complexity by adding a single sentinel node to every list, between the last and the first data nodes. With this convention, an empty list consists of the sentinel node alone, pointing to itself via the next-node link. The list handle should then be a pointer to the last data node, before the sentinel, if the list is not empty; or to the sentinel itself, if the list is empty.

The same trick can be used to simplify the handling of a doubly linked linear list, by turning it into a circular doubly linked list with a single sentinel node. However, in this case, the handle should be a single pointer to the dummy node itself.[9]

2.4.6 Linked list operations

When manipulating linked lists in-place, care must be taken not to use values that you have invalidated in previous assignments. This makes algorithms for inserting or deleting linked list nodes somewhat subtle. This section gives pseudocode for adding or removing nodes from singly, doubly, and circularly linked lists in-place. Throughout we will use null to refer to an end-of-list marker or sentinel, which may be implemented in a number of ways.

Linearly linked lists

Singly linked lists

Our node data structure will have two fields. We also keep a variable firstNode which always points to the first node in the list, or is null for an empty list.

record Node {
    data // The data being stored in the node
    Node next // A reference to the next node, null for last node
}

record List {
    Node firstNode // points to first node of list; null for empty list
}

Traversal of a singly linked list is simple, beginning at the first node and following each next link until we come to the end:

node := list.firstNode
while node not null
    (do something with node.data)
    node := node.next

The following code inserts a node after an existing node in a singly linked list. The diagram shows how it works. Inserting a node before an existing one cannot be done directly; instead, one must keep track of the previous node and insert a node after it.

[Diagram: inserting newNode (37) after an existing node in the list 12 → 99.]

function insertAfter(Node node, Node newNode) // insert newNode after node
    newNode.next := node.next
    node.next := newNode

Inserting at the beginning of the list requires a separate function. This requires updating firstNode.

function insertBeginning(List list, Node newNode) // insert node before current first node
    newNode.next := list.firstNode
    list.firstNode := newNode

Similarly, we have functions for removing the node after a given node, and for removing a node from the beginning of the list. The diagram demonstrates the former. To find and remove a particular node, one must again keep track of the previous element.

function removeAfter(Node node) // remove node past this one
    obsoleteNode := node.next
    node.next := node.next.next
    destroy obsoleteNode

function removeBeginning(List list) // remove first node
    obsoleteNode := list.firstNode
    list.firstNode := list.firstNode.next // point past deleted node
    destroy obsoleteNode

Notice that removeBeginning() sets list.firstNode to null when removing the last node in the list.
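For concreteness, the same two operations might read as follows in C (a sketch mirroring the pseudocode above; free stands in for destroy, and remove_after assumes the following node exists):

    #include <stdlib.h>

    struct node { int data; struct node *next; };

    /* Insert newNode directly after node. */
    void insert_after(struct node *node, struct node *newNode) {
        newNode->next = node->next;
        node->next = newNode;
    }

    /* Unlink and free the node directly after node. */
    void remove_after(struct node *node) {
        struct node *obsolete = node->next;
        node->next = obsolete->next;
        free(obsolete);
    }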
[Diagram: removing the node past a given node in the list 12 → 99 → 37.]

Since we can't iterate backwards, efficient insertBefore or removeBefore operations are not possible. Inserting into a list before a specific node requires traversing the list, which would have a worst-case running time of O(n).

Appending one linked list to another can be inefficient unless a reference to the tail is kept as part of the List structure, because we must traverse the entire first list in order to find the tail, and then append the second list to this. Thus, if two linearly linked lists are each of length n, list appending has asymptotic time complexity of O(n). In the Lisp family of languages, list appending is provided by the append procedure.

Many of the special cases of linked list operations can be eliminated by including a dummy element at the front of the list. This ensures that there are no special cases for the beginning of the list and renders both insertBeginning() and removeBeginning() unnecessary. In this case, the first useful data in the list will be found at list.firstNode.next.

Circularly linked list

In a circularly linked list, all nodes are linked in a continuous circle, without using null. For lists with a front and a back (such as a queue), one stores a reference to the last node in the list. The next node after the last node is the first node. Elements can be added to the back of the list and removed from the front in constant time.

Circularly linked lists can be either singly or doubly linked.

Both types of circularly linked lists benefit from the ability to traverse the full list beginning at any given node. This often allows us to avoid storing firstNode and lastNode, although if the list may be empty we need a special representation for the empty list, such as a lastNode variable which points to some node in the list or is null if it's empty; we use such a lastNode here. This representation significantly simplifies adding and removing nodes with a non-empty list, but empty lists are then a special case.

Algorithms

Assuming that someNode is some node in a non-empty circular singly linked list, this code iterates through that list starting with someNode:

function iterate(someNode)
    if someNode ≠ null
        node := someNode
        do
            do something with node.value
            node := node.next
        while node ≠ someNode

Notice that the test "while node ≠ someNode" must be at the end of the loop. If the test were moved to the beginning of the loop, the procedure would fail whenever the list had only one node.

This function inserts a node "newNode" into a circular linked list after a given node "node". If "node" is null, it assumes that the list is empty.

function insertAfter(Node node, Node newNode)
    if node = null
        newNode.next := newNode
    else
        newNode.next := node.next
        node.next := newNode

Suppose that "L" is a variable pointing to the last node of a circular linked list (or null if the list is empty). To append "newNode" to the end of the list, one may do

insertAfter(L, newNode)
L := newNode

To insert "newNode" at the beginning of the list, one may do

insertAfter(L, newNode)
if L = null
    L := newNode
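A C rendering of the circular insertion might look like this (a sketch; as above, the list is represented by a pointer L to its last node):

    struct cnode { int value; struct cnode *next; };

    /* Insert newNode after node in a circular list. If node is NULL,
       the list was empty and newNode becomes a one-node cycle. */
    void cinsert_after(struct cnode *node, struct cnode *newNode) {
        if (node == NULL) {
            newNode->next = newNode;
        } else {
            newNode->next = node->next;
            node->next = newNode;
        }
    }

    /* Appending to the list whose last node is *L:
           cinsert_after(*L, n); *L = n;
       Prepending:
           cinsert_after(*L, n); if (*L == NULL) *L = n;          */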
2.4.7 Linked lists using arrays of nodes

Languages that do not support any type of reference can still create links by replacing pointers with array indices. The approach is to keep an array of records, where each record has integer fields indicating the index of the next (and possibly previous) node in the array. Not all nodes in the array need be used. If records are also not supported, parallel arrays can often be used instead.

As an example, consider the following linked list record that uses arrays instead of pointers:

record Entry {
    integer next // index of next entry in array
    integer prev // previous entry (if doubly linked)
    string name
    real balance
}

A linked list can be built by creating an array of these structures, and an integer variable to store the index of the first element:

integer listHead
Entry Records[1000]

Links between elements are formed by placing the array index of the next (or previous) cell into the Next or Prev field within a given element.
In the above example, listHead would be set to 2, the location of the first entry in the list. Notice that entries 3 and 5 through 7 are not part of the list. These cells are available for any additions to the list. By creating a listFree integer variable, a free list could be created to keep track of which cells are available. If all entries are in use, the size of the array would have to be increased or some elements would have to be deleted before new entries could be stored in the list.

The following code would traverse the list and display names and account balances:
i := listHead
while i ≥ 0 // loop through the list
    print i, Records[i].name, Records[i].balance // print entry
    i := Records[i].next

When faced with a choice, the advantages of this approach include:

• The linked list is relocatable, meaning it can be moved about in memory at will, and it can also be quickly and directly serialized for storage on disk or transfer over a network.
• Especially for a small list, array indexes can occupy significantly less space than a full pointer on many architectures.
• Locality of reference can be improved by keeping the nodes together in memory and by periodically rearranging them, although this can also be done in a general store.
• Naïve dynamic memory allocators can produce an excessive amount of overhead storage for each node allocated; almost no allocation overhead is incurred per node in this approach.
• Seizing an entry from a pre-allocated array is faster than using dynamic memory allocation for each node, since dynamic memory allocation typically requires a search for a free memory block of the desired size.

This approach has one main disadvantage, however: it creates and manages a private memory space for its nodes. This leads to the following issues:

• It increases the complexity of the implementation.
• Growing a large array when it is full may be difficult or impossible, whereas finding space for a new linked list node in a large, general memory pool may be easier.
• Adding elements to a dynamic array will occasionally (when it is full) unexpectedly take linear (O(n)) instead of constant time (although it's still amortized constant).
• Using a general memory pool leaves more memory for other data if the list is smaller than expected or if many nodes are freed.

For these reasons, this approach is mainly used for languages that do not support dynamic memory allocation. These disadvantages are also mitigated if the maximum size of the list is known at the time the array is created.

2.4.8 Language support

Many programming languages such as Lisp and Scheme have singly linked lists built in. In many functional languages, these lists are constructed from nodes, each called a cons or cons cell. The cons has two fields: the car, a reference to the data for that node, and the cdr, a reference to the next node. Although cons cells can be used to build other data structures, this is their primary purpose.

In languages that support abstract data types or templates, linked list ADTs or templates are available for building linked lists. In other languages, linked lists are typically built using references together with records.

2.4.9 Internal and external storage

When constructing a linked list, one is faced with the choice of whether to store the data of the list directly in the linked list nodes, called internal storage, or merely to store a reference to the data, called external storage. Internal storage has the advantage of making access to the data more efficient, requiring less storage overall, having better locality of reference, and simplifying memory management for the list (its data is allocated and deallocated at the same time as the list nodes).

External storage, on the other hand, has the advantage of being more generic, in that the same data structure and machine code can be used for a linked list no matter what the size of the data is. It also makes it easy to place the same data in multiple linked lists. Although with internal storage the same data can be placed in multiple lists by including multiple next references in the node data structure, it would then be necessary to create separate routines to add or delete cells based on each field. It is possible to create additional linked lists of elements that use internal storage by using external storage, and having the cells of the additional linked lists store references to the nodes of the linked list containing the data.

In general, if a set of data structures needs to be included in linked lists, external storage is the best approach. If a set of data structures needs to be included in only one linked list, then internal storage is slightly better, unless a generic linked list package using external storage is available. Likewise, if different sets of data that can be stored in the same data structure are to be included in a single linked list, then internal storage would be fine.

Another approach that can be used with some languages involves having different data structures, but all having their initial fields, including the next (and prev if doubly linked) references, in the same location. After defining separate structures for each type of data, a generic structure can be defined that contains the minimum amount of data shared by all the other structures and contained at the top (beginning) of the structures. Then generic routines can be created that use the minimal structure to perform linked list type operations, but separate routines
can then handle the specific data. This approach is often used in message parsing routines, where several types of messages are received, but all start with the same set of fields, usually including a field for message type. The generic routines are used to add new messages to a queue when they are received, and remove them from the queue in order to process the message. The message type field is then used to call the correct routine to process the specific type of message.

Example of internal and external storage

Suppose you wanted to create a linked list of families and their members. Using internal storage, the structure might look like the following:

record member { // member of a family
    member next
    string firstName
    integer age
}

record family { // the family itself
    family next
    string lastName
    string address
    member members // head of list of members of this family
}

To print a complete list of families and their members using internal storage, we could write:

aFamily := Families // start at head of families list
while aFamily ≠ null // loop through list of families
    print information about family
    aMember := aFamily.members // get head of list of this family's members
    while aMember ≠ null // loop through list of members
        print information about member
        aMember := aMember.next
    aFamily := aFamily.next

Using external storage, we would create the following structures:

record node { // generic link structure
    node next
    pointer data // generic pointer for data at node
}

record member { // structure for family member
    string firstName
    integer age
}

record family { // structure for family
    string lastName
    string address
    node members // head of list of members of this family
}

To print a complete list of families and their members using external storage, we could write:

famNode := Families // start at head of families list
while famNode ≠ null // loop through list of families
    aFamily := (family) famNode.data // extract family from node
    print information about family
    memNode := aFamily.members // get list of family members
    while memNode ≠ null // loop through list of members
        aMember := (member) memNode.data // extract member from node
        print information about member
        memNode := memNode.next
    famNode := famNode.next

Notice that when using external storage, an extra step is needed to extract the record from the node and cast it into the proper data type. This is because both the list of families and the list of members within the family are stored in two linked lists using the same data structure (node), and this language does not have parametric types.

As long as the number of families that a member can belong to is known at compile time, internal storage works fine. If, however, a member needed to be included in an arbitrary number of families, with the specific number known only at run time, external storage would be necessary.

Speeding up search

Finding a specific element in a linked list, even if it is sorted, normally requires O(n) time (linear search). This is one of the primary disadvantages of linked lists over other data structures. In addition to the variants discussed above, below are two simple ways to improve search time.

In an unordered list, one simple heuristic for decreasing average search time is the move-to-front heuristic, which simply moves an element to the beginning of the list once it is found. This scheme, handy for creating simple caches, ensures that the most recently used items are also the quickest to find again.

Another common approach is to "index" a linked list using a more efficient external data structure. For example, one can build a red-black tree or hash table whose elements are references to the linked list nodes. Multiple such indexes can be built on a single list. The disadvantage is that these indexes may need to be updated each time a node is added or removed (or at least, before that index is used again).

Random access lists

A random access list is a list with support for fast random access to read or modify any element in the list.[10] One possible implementation is a skew binary random access list using the skew binary number system, which involves a list of trees with special properties; this allows worst-case constant time head/cons operations, and worst-case logarithmic time random access to an element by index.[10] Random access lists can be implemented as persistent data structures.[10]

Random access lists can be viewed as immutable linked lists in that they likewise support the same O(1) head and tail operations.[10]

A simple extension to random access lists is the min-list, which provides an additional operation that yields the minimum element in the entire list in constant time (without mutation complexities).[10]

2.4.10 Related data structures

Both stacks and queues are often implemented using linked lists, and simply restrict the type of operations which are supported.
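As a small illustration, a stack reduces to insertion and removal at the head of a singly linked list (a C sketch with illustrative names; push/pop correspond to the insertBeginning/removeBeginning operations above, and error handling is omitted):

    #include <stdlib.h>

    struct node { int value; struct node *next; };

    /* push: insert at the head of the list */
    void push(struct node **top, int v) {
        struct node *n = malloc(sizeof *n);
        n->value = v;
        n->next = *top;
        *top = n;
    }

    /* pop: remove from the head of the list */
    int pop(struct node **top) {
        struct node *n = *top;
        int v = n->value;
        *top = n->next;
        free(n);
        return v;
    }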
The skip list is a linked list augmented with layers of pointers for quickly jumping over large numbers of elements, and then descending to the next layer. This process continues down to the bottom layer, which is the actual list.

A binary tree can be seen as a type of linked list where the elements are themselves linked lists of the same nature. The result is that each node may include a reference to the first node of one or two other linked lists, which, together with their contents, form the subtrees below that node.

An unrolled linked list is a linked list in which each node contains an array of data values. This leads to improved cache performance, since more list elements are contiguous in memory, and reduced memory overhead, because less metadata needs to be stored for each element of the list.

A hash table may use linked lists to store the chains of items that hash to the same position in the hash table.

A heap shares some of the ordering properties of a linked list, but is almost always implemented using an array. Instead of references from node to node, the next and previous data indexes are calculated using the current data's index.

A self-organizing list rearranges its nodes based on some heuristic which reduces search times for data retrieval by keeping commonly accessed nodes at the head of the list.
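For instance, an unrolled list node as described above might be declared like this in C (the block size is an illustrative choice):

    #define BLOCK 16   /* illustrative number of elements per node */

    struct unode {
        int count;            /* how many of the slots are in use */
        int values[BLOCK];    /* several contiguous elements per node */
        struct unode *next;
    };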
2.4.11 Notes

[1] The amount of control data required for a dynamic array is usually of the form K + B·n, where K is a per-array constant, B is a per-dimension constant, and n is the number of dimensions. K and B are typically on the order of 10 bytes.

2.4.12 Footnotes

[1] Skiena, Steven S. (2009). The Algorithm Design Manual (2nd ed.). Springer. p. 76. ISBN 9781848000704. “We can do nothing without this list predecessor, and so must spend linear time searching for it on a singly-linked list.”
[2] http://www.osronline.com/article.cfm?article=499
[3] http://www.cs.dartmouth.edu/~sergey/me/cs/cs108/rootkits/bh-us-04-butler.pdf
[4] Chris Okasaki (1995). “Purely Functional Random-Access Lists”. Proceedings of the Seventh International Conference on Functional Programming Languages and Computer Architecture: 86–95. doi:10.1145/224164.224187.
[5] Gerald Kruse. CS 240 Lecture Notes: Linked Lists Plus: Complexity Trade-offs. Juniata College. Spring 2008.
[6] Day 1 Keynote - Bjarne Stroustrup: C++11 Style at GoingNative 2012 on channel9.msdn.com from minute 45 or foil 44.
[7] Number crunching: Why you should never, ever, EVER use linked-list in your code again at kjellkod.wordpress.com.
[8] Brodnik, Andrej; Carlsson, Svante; Sedgewick, Robert; Munro, JI; Demaine, ED (1999), Resizable Arrays in Optimal Time and Space (Technical Report CS-99-09) (PDF), Department of Computer Science, University of Waterloo.
[9] Ford, William; Topp, William (2002). Data Structures with C++ using STL (2nd ed.). Prentice-Hall. pp. 466–467. ISBN 0-13-085850-1.
[10] Okasaki, Chris (1995). Purely Functional Random-Access Lists (PS). In Functional Programming Languages and Computer Architecture. ACM Press. pp. 86–95. Retrieved May 7, 2015.

2.4.13 References

• Juan, Angel (2006). “Ch20 – Data Structures; ID06 - PROGRAMMING with JAVA (slide part of the book 'Big Java', by Cay S. Horstmann)” (PDF). p. 3.
• Black, Paul E. (2004-08-16). Pieterse, Vreda; Black, Paul E., eds. “linked list”. Dictionary of Algorithms and Data Structures. National Institute of Standards and Technology. Retrieved 2004-12-14.
• Antonakos, James L.; Mansfield, Kenneth C., Jr. (1999). Practical Data Structures Using C/C++. Prentice-Hall. pp. 165–190. ISBN 0-13-280843-9.
• Collins, William J. (2005) [2002]. Data Structures and the Java Collections Framework. New York: McGraw Hill. pp. 239–303. ISBN 0-07-282379-8.
• Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2003). Introduction to Algorithms. MIT Press. pp. 205–213, 501–505. ISBN 0-262-03293-7.
• Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2001). “10.2: Linked lists”. Introduction to Algorithms (2nd ed.). MIT Press. pp. 204–209. ISBN 0-262-03293-7.
• Green, Bert F., Jr. (1961). “Computer Languages for Symbol Manipulation”. IRE Transactions on Human Factors in Electronics (2): 3–8. doi:10.1109/THFE2.1961.4503292.
• McCarthy, John (1960). “Recursive Functions of Symbolic Expressions and Their Computation by Machine, Part I”. Communications of the ACM. 3 (4): 184. doi:10.1145/367177.367199.
• Knuth, Donald (1997). “2.2.3–2.2.5”. Fundamental Algorithms (3rd ed.). Addison-Wesley. pp. 254–298. ISBN 0-201-89683-4.
• Newell, Allen; Shaw, F. C. (1957). "Programming the Logic Theory Machine". Proceedings of the Western Joint Computer Conference: 230–240.

• Parlante, Nick (2001). "Linked list basics" (PDF). Stanford University. Retrieved 2009-09-21.

• Sedgewick, Robert (1998). Algorithms in C. Addison Wesley. pp. 90–109. ISBN 0-201-31452-5.

• Shaffer, Clifford A. (1998). A Practical Introduction to Data Structures and Algorithm Analysis. New Jersey: Prentice Hall. pp. 77–102. ISBN 0-13-660911-2.

• Wilkes, Maurice Vincent (1964). "An Experiment with a Self-compiling Compiler for a Simple List-Processing Language". Annual Review in Automatic Programming. Pergamon Press. 4 (1): 1. doi:10.1016/0066-4138(64)90013-8.

• Wilkes, Maurice Vincent (1964). "Lists and Why They are Useful". Proceedings of the ACM National Conference, Philadelphia 1964. ACM (P–64): F1–1.

• Shanmugasundaram, Kulesh (2005-04-04). "Linux Kernel Linked List Explained". Retrieved 2009-09-21.

2.4.14 External links

• Description from the Dictionary of Algorithms and Data Structures

• Introduction to Linked Lists, Stanford University Computer Science Library

• Linked List Problems, Stanford University Computer Science Library

• Open Data Structures - Chapter 3 - Linked Lists

• Patent for the idea of having nodes which are in several linked lists simultaneously (note that this technique was widely used for many decades before the patent was granted)

2.5 Doubly linked list

In computer science, a doubly linked list is a linked data structure that consists of a set of sequentially linked records called nodes. Each node contains two fields, called links, that are references to the previous and to the next node in the sequence of nodes. The beginning and ending nodes' previous and next links, respectively, point to some kind of terminator, typically a sentinel node or null, to facilitate traversal of the list. If there is only one sentinel node, then the list is circularly linked via the sentinel node. It can be conceptualized as two singly linked lists formed from the same data items, but in opposite sequential orders.

A doubly linked list whose nodes contain three fields: an integer value, the link to the next node, and the link to the previous node.

The two node links allow traversal of the list in either direction. While adding or removing a node in a doubly linked list requires changing more links than the same operations on a singly linked list, the operations are simpler and potentially more efficient (for nodes other than first nodes) because there is no need to keep track of the previous node during traversal, and no need to traverse the list to find the previous node so that its link can be modified.

The concept is also the basis for the mnemonic link system memorization technique.

2.5.1 Nomenclature and implementation

The first and last nodes of a doubly linked list are immediately accessible (i.e., accessible without traversal, and usually called head and tail) and therefore allow traversal of the list from the beginning or end of the list, respectively: e.g., traversing the list from beginning to end, or from end to beginning, in a search of the list for a node with a specific data value. Any node of a doubly linked list, once obtained, can be used to begin a new traversal of the list, in either direction (towards beginning or end), from the given node.

The link fields of a doubly linked list node are often called next and previous or forward and backward. The references stored in the link fields are usually implemented as pointers, but (as in any linked data structure) they may also be address offsets or indices into an array where the nodes live.

2.5.2 Basic algorithms

Consider the following basic algorithms written in Ada:

Open doubly linked lists

record DoublyLinkedNode {
    prev   // A reference to the previous node
    next   // A reference to the next node
    data   // Data or a reference to data
}

record DoublyLinkedList {
    DoublyLinkedNode firstNode   // points to first node of list
    DoublyLinkedNode lastNode    // points to last node of list
}

Traversing the list

Traversal of a doubly linked list can be in either direction. In fact, the direction of traversal can change many times, if desired. Traversal is often called iteration, but that choice of terminology is unfortunate, for iteration has well-defined semantics (e.g., in mathematics) which are not analogous to traversal.

Forwards

node := list.firstNode
while node ≠ null
    <do something with node.data>
    node := node.next

Backwards

node := list.lastNode
while node ≠ null
    <do something with node.data>
    node := node.prev

Inserting a node

These symmetric functions insert a node either after or before a given node:

function insertAfter(List list, Node node, Node newNode)
    newNode.prev := node
    if node.next == null
        newNode.next := null   -- (not always necessary)
        list.lastNode := newNode
    else
        newNode.next := node.next
        node.next.prev := newNode
    node.next := newNode

function insertBefore(List list, Node node, Node newNode)
    newNode.next := node
    if node.prev == null
        newNode.prev := null   -- (not always necessary)
        list.firstNode := newNode
    else
        newNode.prev := node.prev
        node.prev.next := newNode
    node.prev := newNode

We also need a function to insert a node at the beginning of a possibly empty list:

function insertBeginning(List list, Node newNode)
    if list.firstNode == null
        list.firstNode := newNode
        list.lastNode := newNode
        newNode.prev := null
        newNode.next := null
    else
        insertBefore(list, list.firstNode, newNode)

A symmetric function inserts at the end:

function insertEnd(List list, Node newNode)
    if list.lastNode == null
        insertBeginning(list, newNode)
    else
        insertAfter(list, list.lastNode, newNode)

Removing a node

Removal of a node is easier than insertion, but requires special handling if the node to be removed is the firstNode or lastNode:

function remove(List list, Node node)
    if node.prev == null
        list.firstNode := node.next
    else
        node.prev.next := node.next
    if node.next == null
        list.lastNode := node.prev
    else
        node.next.prev := node.prev

One subtle consequence of the above procedure is that deleting the last node of a list sets both firstNode and lastNode to null, and so it handles removing the last node from a one-element list correctly. Notice that we also don't need separate "removeBefore" or "removeAfter" methods, because in a doubly linked list we can just use "remove(node.prev)" or "remove(node.next)" where these are valid. This also assumes that the node being removed is guaranteed to exist. If the node does not exist in this list, then some error handling would be required.

Circular doubly linked lists

Traversing the list

Assuming that someNode is some node in a non-empty list, this code traverses through that list starting with someNode (any node will do):

Forwards

node := someNode
do
    do something with node.value
    node := node.next
while node ≠ someNode

Backwards

node := someNode
do
    do something with node.value
    node := node.prev
while node ≠ someNode

Notice the postponing of the test to the end of the loop. This is important for the case where the list contains only the single node someNode.

Inserting a node

This simple function inserts a node into a doubly linked circularly linked list after a given element:

function insertAfter(Node node, Node newNode)
    newNode.next := node.next
    newNode.prev := node
    node.next.prev := newNode
    node.next := newNode

To do an "insertBefore", we can simply "insertAfter(node.prev, newNode)".

Inserting an element in a possibly empty list requires a special function:

function insertEnd(List list, Node node)
    if list.lastNode == null
        node.prev := node
        node.next := node
    else
        insertAfter(list.lastNode, node)
    list.lastNode := node

To insert at the beginning we simply "insertAfter(list.lastNode, node)".

Finally, removing a node must deal with the case where the list empties:

function remove(List list, Node node)
    if node.next == node
        list.lastNode := null
    else
        node.next.prev := node.prev
        node.prev.next := node.next
        if node == list.lastNode
            list.lastNode := node.prev
    destroy node

Deleting a node

As in doubly linked lists, "removeAfter" and "removeBefore" can be implemented with "remove(list, node.prev)" and "remove(list, node.next)".
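To make the circular variant concrete, here is a minimal runnable sketch in Python, our own illustration rather than part of the original pseudocode, following the same insertAfter/insertEnd/remove logic and a do-while-style traversal:

class Node:
    def __init__(self, value):
        self.value = value
        self.prev = self   # a lone node links to itself
        self.next = self

class CircularList:
    def __init__(self):
        self.last = None   # analogous to lastNode above

    def insert_after(self, node, new_node):
        new_node.next = node.next
        new_node.prev = node
        node.next.prev = new_node
        node.next = new_node

    def insert_end(self, new_node):
        if self.last is None:
            new_node.prev = new_node
            new_node.next = new_node
        else:
            self.insert_after(self.last, new_node)
        self.last = new_node

    def remove(self, node):
        if node.next is node:          # list empties
            self.last = None
        else:
            node.next.prev = node.prev
            node.prev.next = node.next
            if node is self.last:
                self.last = node.prev

ring = CircularList()
for v in (1, 2, 3):
    ring.insert_end(Node(v))
node = ring.last.next                  # front of the list
out = []
while True:                            # test postponed, as in the pseudocode
    out.append(node.value)
    node = node.next
    if node is ring.last.next:
        break
assert out == [1, 2, 3]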

2.5.3 Advanced concepts

Asymmetric doubly linked list

An asymmetric doubly linked list is somewhere between the singly linked list and the regular doubly linked list. It shares some features with the singly linked list (single-direction traversal) and others from the doubly linked list (ease of modification).

It is a list where each node's previous link points not to the previous node, but to the link to itself. While this makes little difference between nodes (it just points to an offset within the previous node), it changes the head of the list: it allows the first node to modify the firstNode link easily.[1][2]

As long as a node is in a list, its previous link is never null.

Inserting a node

To insert a node before another, we change the link that pointed to the old node, using the prev link; then set the new node's next link to point to the old node, and change that node's prev link accordingly.

function insertBefore(Node node, Node newNode)
    if node.prev == null
        error "The node is not in a list"
    newNode.prev := node.prev
    atAddress(newNode.prev) := newNode
    newNode.next := node
    node.prev := addressOf(newNode.next)

function insertAfter(Node node, Node newNode)
    newNode.next := node.next
    if newNode.next != null
        newNode.next.prev := addressOf(newNode.next)
    node.next := newNode
    newNode.prev := addressOf(node.next)

Deleting a node

To remove a node, we simply modify the link pointed by prev, regardless of whether the node was the first one of the list.

function remove(Node node)
    atAddress(node.prev) := node.next
    if node.next != null
        node.next.prev := node.prev
    destroy node

2.5.4 See also

• XOR linked list

• SLIP (programming language)

2.5.5 References

[1] http://www.codeofhonor.com/blog/avoiding-game-crashes-related-to-linked-lists

[2] https://github.com/webcoyote/coho/blob/master/Base/List.h

2.6 Stack (abstract data type)

For the use of the term LIFO in accounting, see LIFO (accounting).
For other uses, see stack (disambiguation).

In computer science, a stack is an abstract data type that serves as a collection of elements, with two principal operations: push, which adds an element to the collection, and pop, which removes the most recently added element that was not yet removed. The order in which elements come off a stack gives rise to its alternative name, LIFO (for last in, first out). Additionally, a peek operation may give access to the top without modifying the stack.

Simple representation of a stack runtime with push and pop operations.

The name "stack" for this type of structure comes from the analogy to a set of physical items stacked on top of each other, which makes it easy to take an item off the top of the stack, while getting to an item deeper in the stack may require taking off multiple other items first.[1]

Considered as a linear data structure, or more abstractly a sequential collection, the push and pop operations occur only at one end of the structure, referred to as the top of the stack. This makes it possible to implement a stack as a singly linked list and a pointer to the top element.

A stack may be implemented to have a bounded capacity. If the stack is full and does not contain enough space to accept an entity to be pushed, the stack is then considered to be in an overflow state. The pop operation removes an item from the top of the stack.

2.6.1 History

Stacks entered the computer science literature in 1946, in the computer design of Alan M. Turing (who used the terms "bury" and "unbury") as a means of calling and returning from subroutines.[2] Subroutines had already been implemented in Konrad Zuse's Z4 in 1945. Klaus Samelson and Friedrich L. Bauer of Technical University Munich proposed the idea in 1955 and filed a patent in 1957.[3] The same concept was developed, independently, by the Australian Charles Leonard Hamblin in the first half of 1957.[4]

Stacks are often described by analogy to a spring-loaded stack of plates in a cafeteria.[5][1][6] Clean plates are placed on top of the stack, pushing down any already there. When a plate is removed from the stack, the one below it pops up to become the new top.

2.6.2 Non-essential operations

In many implementations, a stack has more operations than "push" and "pop". An example is "top of stack", or "peek", which observes the top-most element without removing it from the stack.[7] Since this can be done with a "pop" and a "push" with the same data, it is not essential. An underflow condition can occur in the "stack top" operation if the stack is empty, the same as "pop". Also, implementations often have a function which just returns whether the stack is empty.

2.6.3 Software stacks

Implementation

A stack can be easily implemented either through an array or a linked list. What identifies the data structure as a stack in either case is not the implementation but the interface: the user is only allowed to pop or push items onto the array or linked list, with few other helper operations. The following will demonstrate both implementations, using pseudocode.

Array

An array can be used to implement a (bounded) stack, as follows. The first element (usually at the zero offset) is the bottom, resulting in array[0] being the first element pushed onto the stack and the last element popped off. The program must keep track of the size (length) of the stack, using a variable top that records the number of items pushed so far, therefore pointing to the place in the array where the next element is to be inserted (assuming a zero-based index convention). Thus, the stack itself can be effectively implemented as a three-element structure:

structure stack:
    maxsize : integer
    top : integer
    items : array of item

procedure initialize(stk : stack, size : integer):
    stk.items ← new array of size items, initially empty
    stk.maxsize ← size
    stk.top ← 0

The push operation adds an element and increments the top index, after checking for overflow:

procedure push(stk : stack, x : item):
    if stk.top = stk.maxsize:
        report overflow error
    else:
        stk.items[stk.top] ← x
        stk.top ← stk.top + 1

Similarly, pop decrements the top index after checking for underflow, and returns the item that was previously the top one:

procedure pop(stk : stack):
    if stk.top = 0:
        report underflow error
    else:
        stk.top ← stk.top − 1
        r ← stk.items[stk.top]
        return r
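A direct transcription of this bounded-array pseudocode into runnable Python might look as follows (the class name BoundedStack is our own):

class BoundedStack:
    def __init__(self, maxsize: int):
        self.items = [None] * maxsize   # fixed-size backing array
        self.maxsize = maxsize
        self.top = 0                    # number of items pushed so far

    def push(self, x):
        if self.top == self.maxsize:
            raise OverflowError("stack overflow")
        self.items[self.top] = x
        self.top += 1

    def pop(self):
        if self.top == 0:
            raise IndexError("stack underflow")
        self.top -= 1
        return self.items[self.top]

s = BoundedStack(4)
s.push("A"); s.push("B")
assert s.pop() == "B" and s.pop() == "A"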
Using a dynamic array, it is possible to implement a stack that can grow or shrink as much as needed. The size of the stack is simply the size of the dynamic array, which is a very efficient implementation of a stack since adding items to or removing items from the end of a dynamic array requires amortized O(1) time.

Linked list

Another option for implementing stacks is to use a singly linked list. A stack is then a pointer to the "head" of the list, with perhaps a counter to keep track of the size of the list:

structure frame:
    data : item
    next : frame or nil

structure stack:
    head : frame or nil
    size : integer

procedure initialize(stk : stack):
    stk.head ← nil
    stk.size ← 0

Pushing and popping items happens at the head of the list; overflow is not possible in this implementation (unless memory is exhausted):

procedure push(stk : stack, x : item):
    newhead ← new frame
    newhead.data ← x
    newhead.next ← stk.head
    stk.head ← newhead
    stk.size ← stk.size + 1

procedure pop(stk : stack):
    if stk.head = nil:
        report underflow error
    r ← stk.head.data
    stk.head ← stk.head.next
    stk.size ← stk.size - 1
    return r

Stacks and programming languages

Some languages, such as Perl, LISP and Python, make the stack operations push and pop available on their standard list/array types. Some languages, notably those in the Forth family (including PostScript), are designed around language-defined stacks that are directly visible to and manipulated by the programmer.

The following is an example of manipulating a stack in Common Lisp (">" is the Lisp interpreter's prompt; lines not starting with ">" are the interpreter's responses to expressions):

> (setf stack (list 'a 'b 'c))  ;; set the variable "stack"
(A B C)
> (pop stack)  ;; get top (leftmost) element, should modify the stack
A
> stack  ;; check the value of stack
(B C)
> (push 'new stack)  ;; push a new top onto the stack
(NEW B C)

Several of the C++ Standard Library container types have push_back and pop_back operations with LIFO semantics; additionally, the stack template class adapts existing containers to provide a restricted API with only push/pop operations. PHP has an SplStack class. Java's library contains a Stack class that is a specialization of Vector. Following is an example program in Java language, using that class.

import java.util.*;

class StackDemo {
    public static void main(String[] args) {
        Stack<String> stack = new Stack<String>();
        stack.push("A");    // Insert "A" in the stack
        stack.push("B");    // Insert "B" in the stack
        stack.push("C");    // Insert "C" in the stack
        stack.push("D");    // Insert "D" in the stack
        System.out.println(stack.peek());  // Prints the top of the stack ("D")
        stack.pop();        // removing the top ("D")
        stack.pop();        // removing the next top ("C")
    }
}
2.6.4 Hardware stacks

A common use of stacks at the architecture level is as a means of allocating and accessing memory.

Basic architecture of a stack

A typical stack, storing local data and call information for nested procedure calls (not necessarily nested procedures). This stack grows downward from its origin. The stack pointer points to the current topmost datum on the stack. A push operation decrements the pointer and copies the data to the stack; a pop operation copies data from the stack and then increments the pointer. Each procedure called in the program stores procedure return information (in yellow) and local data (in other colors) by pushing them onto the stack. This type of stack implementation is extremely common, but it is vulnerable to buffer overflow attacks (see the text).

A typical stack is an area of computer memory with a fixed origin and a variable size. Initially the size of the stack is zero. A stack pointer, usually in the form of a hardware register, points to the most recently referenced location on the stack; when the stack has a size of zero, the stack pointer points to the origin of the stack.

The two operations applicable to all stacks are:

• a push operation, in which a data item is placed at the location pointed to by the stack pointer, and the address in the stack pointer is adjusted by the size of the data item;

• a pop or pull operation: a data item at the current location pointed to by the stack pointer is removed, and the stack pointer is adjusted by the size of the data item.
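As an illustrative sketch (a simulation of ours, not real hardware code), the following models a downward-growing stack in a flat memory array, with push moving the stack pointer away from the origin and pop moving it back:

# Simulating a downward-growing hardware stack in a flat memory array.
memory = [0] * 1024
ORIGIN = 1000          # the stack's fixed origin
sp = ORIGIN            # stack pointer; an empty stack points at the origin

def push(item):
    global sp
    sp -= 1            # the pointer moves away from the origin...
    memory[sp] = item  # ...and the item is stored at the new top

def pop():
    global sp
    item = memory[sp]  # read the current top...
    sp += 1            # ...and move the pointer back toward the origin
    return item

push(42)
push(7)
assert pop() == 7 and pop() == 42 and sp == ORIGIN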
There are many variations on the basic principle of stack operations. Every stack has a fixed location in memory at which it begins. As data items are added to the stack, the stack pointer is displaced to indicate the current extent of the stack, which expands away from the origin.

Stack pointers may point to the origin of a stack or to a limited range of addresses either above or below the origin (depending on the direction in which the stack grows); however, the stack pointer cannot cross the origin of the stack. In other words, if the origin of the stack is at address 1000 and the stack grows downwards (towards addresses 999, 998, and so on), the stack pointer must never be incremented beyond 1000 (to 1001, 1002, etc.). If a pop operation on the stack causes the stack pointer to move past the origin of the stack, a stack underflow occurs. If a push operation causes the stack pointer to increment or decrement beyond the maximum extent of the stack, a stack overflow occurs.

Some environments that rely heavily on stacks may provide additional operations, for example:

• Duplicate: the top item is popped, and then pushed again (twice), so that an additional copy of the former top item is now on top, with the original below it.

• Peek: the topmost item is inspected (or returned), but the stack pointer is not changed, and the stack size does not change (meaning that the item remains on the stack). This is also called top operation in many articles.

• Swap or exchange: the two topmost items on the stack exchange places.

• Rotate (or Roll): the n topmost items are moved on the stack in a rotating fashion. For example, if n=3, items 1, 2, and 3 on the stack are moved to positions 2, 3, and 1 on the stack, respectively. Many variants of this operation are possible, with the most common being called left rotate and right rotate.

Stacks are often visualized growing from the bottom up (like real-world stacks). They may also be visualized growing from left to right, so that "topmost" becomes "rightmost", or even growing from top to bottom. The important feature is that the bottom of the stack is in a fixed position. The illustration in this section is an example of a top-to-bottom growth visualization: the top (28) is the stack "bottom", since the stack "top" is where items are pushed or popped from.

A right rotate will move the first element to the third position, the second to the first and the third to the second. Here are two equivalent visualizations of this process:

apple                          banana
banana   ===right rotate==>    cucumber
cucumber                       apple

cucumber                       apple
banana   ===left rotate==>     cucumber
apple                          banana

A stack is usually represented in computers by a block of memory cells, with the "bottom" at a fixed location, and the stack pointer holding the address of the current "top" cell in the stack. The top and bottom terminology are used irrespective of whether the stack actually grows towards lower memory addresses or towards higher memory addresses.

Pushing an item on to the stack adjusts the stack pointer by the size of the item (either decrementing or incrementing, depending on the direction in which the stack grows in memory), pointing it to the next cell, and copies the new top item to the stack area. Depending again on the exact implementation, at the end of a push operation, the stack pointer may point to the next unused location in the stack, or it may point to the topmost item in the stack. If the stack points to the current topmost item, the stack pointer will be updated before a new item is pushed onto the stack; if it points to the next available location in the stack, it will be updated after the new item is pushed onto the stack.

Popping the stack is simply the inverse of pushing. The topmost item in the stack is removed and the stack pointer is updated, in the opposite order of that used in the push operation.

Hardware support

Stack in main memory

Many CPU families, including the x86, Z80 and 6502, have a dedicated register reserved for use as (call) stack pointers and special push and pop instructions that manipulate this specific register, conserving opcode space. Some processors, like the PDP-11 and the 68000, also have special addressing modes for implementation of stacks, typically with a semi-dedicated stack pointer as well (such as A7 in the 68000). However, in most processors, several different registers may be used as additional stack pointers as needed (whether updated via addressing modes or via add/sub instructions).

Stack in registers or dedicated memory

Main article: Stack machine

The x87 floating point architecture is an example of a set of registers organised as a stack where direct access to individual registers (relative to the current top) is also possible. As with stack-based machines in general, having the top-of-stack as an implicit argument allows for a small machine code footprint with a good usage of bus bandwidth and code caches, but it also prevents some types of optimizations possible on processors permitting random access to the register file for all (two or three) operands. A stack structure also makes superscalar implementations with register renaming (for speculative execution) somewhat more complex to implement, although it is still feasible, as exemplified by modern x87 implementations.

Sun SPARC, AMD Am29000, and Intel i960 are all examples of architectures using register windows within a register-stack as another strategy to avoid the use of slow main memory for function arguments and return values.

There are also a number of small microprocessors that implement a stack directly in hardware, and some microcontrollers have a fixed-depth stack that is not directly accessible. Examples are the PIC microcontrollers, the Computer Cowboys MuP21, the Harris RTX line, and the Novix NC4016. Many stack-based microprocessors were used to implement the programming language Forth at the microcode level. Stacks were also used as a basis of a number of mainframes and mini computers. Such machines were called stack machines, the most famous being the Burroughs B5000.

2.6.5 Applications

Expression evaluation and syntax parsing

Calculators employing reverse Polish notation use a stack structure to hold values. Expressions can be represented in prefix, postfix or infix notations, and conversion from one form to another may be accomplished using a stack. Many compilers use a stack for parsing the syntax of expressions, program blocks etc. before translating into low level code. Most programming languages are context-free languages, allowing them to be parsed with stack based machines.
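To make the postfix case concrete, here is a small sketch of ours (the token format and operator set are assumptions) of a reverse-Polish evaluator that keeps operands on a stack:

def eval_rpn(tokens):
    """Evaluate a reverse Polish (postfix) expression, e.g. ['2', '3', '4', '*', '+']."""
    ops = {'+': lambda a, b: a + b,
           '-': lambda a, b: a - b,
           '*': lambda a, b: a * b,
           '/': lambda a, b: a / b}
    stack = []
    for tok in tokens:
        if tok in ops:
            b = stack.pop()           # the right operand is on top
            a = stack.pop()
            stack.append(ops[tok](a, b))
        else:
            stack.append(float(tok))  # operands are simply pushed
    return stack.pop()

assert eval_rpn("2 3 4 * +".split()) == 14.0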
Backtracking

Main article: Backtracking

Another important application of stacks is backtracking. Consider a simple example of finding the correct path in a maze. There are a series of points, from the starting point to the destination. We start from one point. To reach the final destination, there are several paths. Suppose we choose a random path. After following a certain path, we realise that the path we have chosen is wrong. So we need to find a way by which we can return to the beginning of that path. This can be done with the use of stacks. With the help of stacks, we remember the point where we have reached. This is done by pushing that point into the stack. In case we end up on the wrong path, we can pop the last point from the stack and thus return to the last point and continue our quest to find the right path. This is called backtracking.
The prototypical example of a backtracking algorithm is depth-first search, which finds all vertices of a graph that can be reached from a specified starting vertex. Other applications of backtracking involve searching through spaces that represent potential solutions to an optimization problem. Branch and bound is a technique for performing such backtracking searches without exhaustively searching all of the potential solutions in such a space.
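As a hedged illustration of this idea (the grid, names and API below are our own), an explicit stack can drive a depth-first maze search, pushing each visited point and popping, that is, backtracking, at dead ends:

def solve_maze(maze, start, goal):
    """Depth-first search over a grid of 0 = open, 1 = wall cells.
    The explicit stack records the path; popping it is backtracking."""
    stack, seen = [start], {start}
    while stack:
        r, c = stack[-1]
        if (r, c) == goal:
            return stack                   # the stack is the path found
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < len(maze) and 0 <= nc < len(maze[0])
                    and maze[nr][nc] == 0 and (nr, nc) not in seen):
                seen.add((nr, nc))
                stack.append((nr, nc))     # advance along a chosen path
                break
        else:
            stack.pop()                    # dead end: backtrack
    return None

maze = [[0, 1, 0],
        [0, 0, 0],
        [1, 0, 0]]
print(solve_maze(maze, (0, 0), (2, 2)))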
Runtime memory management

Main articles: Stack-based memory allocation and Stack machine

A number of programming languages are stack-oriented, meaning they define most basic operations (adding two numbers, printing a character) as taking their arguments from the stack, and placing any return values back on the stack. For example, PostScript has a return stack and an operand stack, and also has a graphics state stack and a dictionary stack. Many virtual machines are also stack-oriented, including the p-code machine and the Java Virtual Machine.

Almost all calling conventions—the ways in which subroutines receive their parameters and return results—use a special stack (the "call stack") to hold information about procedure/function calling and nesting in order to switch to the context of the called function and restore to the caller function when the calling finishes. The functions follow a runtime protocol between caller and callee to save arguments and return value on the stack. Stacks are an important way of supporting nested or recursive function calls. This type of stack is used implicitly by the compiler to support CALL and RETURN statements (or their equivalents) and is not manipulated directly by the programmer.

Some programming languages use the stack to store data that is local to a procedure. Space for local data items is allocated from the stack when the procedure is entered, and is deallocated when the procedure exits. The C programming language is typically implemented in this way. Using the same stack for both data and procedure calls has important security implications (see below) of which a programmer must be aware in order to avoid introducing serious security bugs into a program.

Efficient algorithms

Several algorithms use a stack (separate from the usual function call stack of most programming languages) as the principal data structure with which they organize their information. These include:

• Graham scan, an algorithm for the convex hull of a two-dimensional system of points. A convex hull of a subset of the input is maintained in a stack, which is used to find and remove concavities in the boundary when a new point is added to the hull.[8]

• Part of the SMAWK algorithm for finding the row minima of a monotone matrix uses stacks in a similar way to Graham scan.[9]

• All nearest smaller values, the problem of finding, for each number in an array, the closest preceding number that is smaller than it. One algorithm for this problem uses a stack to maintain a collection of candidates for the nearest smaller value. For each position in the array, the stack is popped until a smaller value is found on its top, and then the value in the new position is pushed onto the stack.[10] (A sketch of this technique appears after this list.)

• The nearest-neighbor chain algorithm, a method for agglomerative hierarchical clustering based on maintaining a stack of clusters, each of which is the nearest neighbor of its predecessor on the stack. When this method finds a pair of clusters that are mutual nearest neighbors, they are popped and merged.[11]
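The following is a brief runnable sketch of ours of the all-nearest-smaller-values technique just described, using a Python list as the stack:

def all_nearest_smaller_values(xs):
    """For each element, report the closest preceding smaller element (None if none)."""
    result, stack = [], []
    for x in xs:
        while stack and stack[-1] >= x:
            stack.pop()                        # discard candidates that are too large
        result.append(stack[-1] if stack else None)
        stack.append(x)                        # x is the best candidate for later items
    return result

assert all_nearest_smaller_values([0, 8, 4, 12, 2]) == [None, 0, 0, 4, 0]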
2.6.6 Security

Some computing environments use stacks in ways that may make them vulnerable to security breaches and attacks. Programmers working in such environments must take special care to avoid the pitfalls of these implementations.

For example, some programming languages use a common stack to store both data local to a called procedure and the linking information that allows the procedure to return to its caller. This means that the program moves data into and out of the same stack that contains critical return addresses for the procedure calls. If data is moved to the wrong location on the stack, or an oversized data item is moved to a stack location that is not large enough to contain it, return information for procedure calls may be corrupted, causing the program to fail.

Malicious parties may attempt a stack smashing attack that takes advantage of this type of implementation by providing oversized data input to a program that does not check the length of input. Such a program may copy the data in its entirety to a location on the stack, and in so doing it may change the return addresses for procedures that have called it. An attacker can experiment to find a specific type of data that can be provided to such a program such that the return address of the current procedure is reset to point to an area within the stack itself (and within the data provided by the attacker), which in turn contains instructions that carry out unauthorized operations.
This type of attack is a variation on the buffer overflow attack and is an extremely frequent source of security breaches in software, mainly because some of the most popular compilers use a shared stack for both data and procedure calls, and do not verify the length of data items. Frequently programmers do not write code to verify the size of data items, either, and when an oversized or undersized data item is copied to the stack, a security breach may occur.

2.6.7 See also

• List of data structures

• Queue

• Double-ended queue

• Call stack

• FIFO (computing and electronics)

• Stack-based memory allocation

• Stack overflow

• Stack-oriented programming language

2.6.8 References

[1] Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2009) [1990]. Introduction to Algorithms (3rd ed.). MIT Press and McGraw-Hill. ISBN 0-262-03384-4.

[2] Newton, David E. (2003). Alan Turing: a study in light and shadow. Philadelphia: Xlibris. p. 82. ISBN 9781401090791. Retrieved 28 January 2015.

[3] Dr. Friedrich Ludwig Bauer and Dr. Klaus Samelson (30 March 1957). "Verfahren zur automatischen Verarbeitung von kodierten Daten und Rechenmaschine zur Ausübung des Verfahrens" (in German). Germany, Munich: Deutsches Patentamt. Retrieved 2010-10-01.

[4] C. L. Hamblin, "An Addressless Coding Scheme based on Mathematical Notation", N.S.W University of Technology, May 1957 (typescript)

[5] Ball, John A. (1978). Algorithms for RPN calculators (1 ed.). Cambridge, Massachusetts, USA: Wiley-Interscience, John Wiley & Sons, Inc. ISBN 0-471-03070-8.

[6] Godse, A. P.; Godse, D. A. (2010-01-01). Computer Architecture. Technical Publications. pp. 1–56. ISBN 9788184315349. Retrieved 2015-01-30.

[7] Horowitz, Ellis: "Fundamentals of Data Structures in Pascal", page 67. Computer Science Press, 1984

[8] Graham, R.L. (1972). An Efficient Algorithm for Determining the Convex Hull of a Finite Planar Set. Information Processing Letters 1, 132-133

[9] Aggarwal, Alok; Klawe, Maria M.; Moran, Shlomo; Shor, Peter; Wilber, Robert (1987), "Geometric applications of a matrix-searching algorithm", Algorithmica, 2 (2): 195–208, doi:10.1007/BF01840359, MR 895444.

[10] Berkman, Omer; Schieber, Baruch; Vishkin, Uzi (1993), "Optimal doubly logarithmic parallel algorithms based on finding all nearest smaller values", Journal of Algorithms, 14 (3): 344–370, doi:10.1006/jagm.1993.1018.

[11] Murtagh, Fionn (1983), "A survey of recent advances in hierarchical clustering algorithms" (PDF), The Computer Journal, 26 (4): 354–359, doi:10.1093/comjnl/26.4.354.

• This article incorporates public domain material from the NIST document: Black, Paul E. "Bounded stack". Dictionary of Algorithms and Data Structures.

2.6.9 Further reading

• Donald Knuth. The Art of Computer Programming, Volume 1: Fundamental Algorithms, Third Edition. Addison-Wesley, 1997. ISBN 0-201-89683-4. Section 2.2.1: Stacks, Queues, and Deques, pp. 238–243.

2.6.10 External links

• Stacks and its Applications

• Stack Machines - the new wave

• Bounding stack depth

• Stack Size Analysis for Interrupt-driven Programs (322 KB)

2.7 Queue (abstract data type)

Representation of a FIFO (first in, first out) queue

In computer science, a queue (/ˈkjuː/ KYEW) is a particular kind of abstract data type or collection in which the entities in the collection are kept in order and the principal (or only) operations on the collection are the addition of entities to the rear terminal position, known as enqueue, and removal of entities from the front terminal position, known as dequeue.

This makes the queue a First-In-First-Out (FIFO) data structure. In a FIFO data structure, the first element added to the queue will be the first one to be removed. This is equivalent to the requirement that once a new element is added, all elements that were added before have to be removed before the new element can be removed. Often a peek or front operation is also included, returning the value of the front element without dequeuing it. A queue is an example of a linear data structure, or more abstractly a sequential collection.

Queues provide services in computer science, transport, and operations research where various entities such as data, objects, persons, or events are stored and held to be processed later. In these contexts, the queue performs the function of a buffer.

Queues are common in computer programs, where they are implemented as data structures coupled with access routines, as an abstract data structure or in object-oriented languages as classes. Common implementations are circular buffers and linked lists.

2.7.1 Queue implementation

Theoretically, one characteristic of a queue is that it does not have a specific capacity. Regardless of how many elements are already contained, a new element can always be added. It can also be empty, at which point removing an element will be impossible until a new element has been added again.

Fixed length arrays are limited in capacity, but it is not true that items need to be copied towards the head of the queue. The simple trick of turning the array into a closed circle and letting the head and tail drift around endlessly in that circle makes it unnecessary to ever move items stored in the array. If n is the size of the array, then computing indices modulo n will turn the array into a circle. This is still the conceptually simplest way to construct a queue in a high level language, but it does admittedly slow things down a little, because the array indices must be compared to zero and to the array size, which is comparable to the time taken to check whether an array index is out of bounds (a check some languages perform automatically). Even so, this will certainly be the method of choice for a quick and dirty implementation, or for any high level language that does not have pointer syntax. The array size must be declared ahead of time, but some implementations simply double the declared array size when overflow occurs. Most modern languages with objects or pointers can implement, or come with libraries for, dynamic lists. Such data structures may have no specified capacity limit besides memory constraints. Queue overflow results from trying to add an element onto a full queue, and queue underflow happens when trying to remove an element from an empty queue.

A bounded queue is a queue limited to a fixed number of items.[1]
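A minimal sketch of the modular-index trick in Python (our own illustration; the fixed capacity and the names are assumptions):

class CircularQueue:
    def __init__(self, capacity: int):
        self.items = [None] * capacity
        self.head = 0       # index of the front element
        self.count = 0      # number of stored elements

    def enqueue(self, x):
        if self.count == len(self.items):
            raise OverflowError("queue overflow")
        tail = (self.head + self.count) % len(self.items)   # wrap around
        self.items[tail] = x
        self.count += 1

    def dequeue(self):
        if self.count == 0:
            raise IndexError("queue underflow")
        x = self.items[self.head]
        self.head = (self.head + 1) % len(self.items)       # wrap around
        self.count -= 1
        return x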
There are several efficient implementations of FIFO queues. An efficient implementation is one that can perform the operations—enqueuing and dequeuing—in O(1) time.

• Linked list

  • A doubly linked list has O(1) insertion and deletion at both ends, so it is a natural choice for queues.

  • A regular singly linked list only has efficient insertion and deletion at one end. However, a small modification—keeping a pointer to the last node in addition to the first one—will enable it to implement an efficient queue.

• A deque implemented using a modified dynamic array

Queues and programming languages

Queues may be implemented as a separate data type, or may be considered a special case of a double-ended queue (deque) and not implemented separately. For example, Perl and Ruby allow pushing and popping an array from both ends, so one can use push and shift functions to enqueue and dequeue a list (or, in reverse, one can use unshift and pop), although in some cases these operations are not efficient.

C++'s Standard Template Library provides a "queue" templated class which is restricted to only push/pop operations. Since J2SE5.0, Java's library contains a Queue interface that specifies queue operations; implementing classes include LinkedList and (since J2SE 1.6) ArrayDeque. PHP has an SplQueue class and third party libraries like beanstalk'd and Gearman.

Examples

A simple queue implemented in Ruby:

class Queue
  def initialize
    @list = Array.new
  end

  def enqueue(element)
    @list << element
  end

  def dequeue
    @list.shift
  end
end

2.7.2 Purely functional implementation

Queues can also be implemented as a purely functional data structure.[2] Two versions of the implementation exist. The first one, called the real-time queue,[3] presented below, allows the queue to be persistent with operations in O(1) worst-case time, but requires lazy lists with memoization. The second one, with no lazy lists nor memoization, is presented at the end of the section. Its amortized time is O(1) if the persistency is not used; but its worst-case time complexity is O(n) where n is the number of elements in the queue.

Let us recall that, for a list l, |l| denotes its length, that NIL represents an empty list and CONS(h, t) represents the list whose head is h and whose tail is t.

Real-time queue

The data structure used to implement our queues consists of three linked lists (f, r, s), where f is the front of the queue and r is the rear of the queue in reverse order. The invariant of the structure is that s is the rear of f without its |r| first elements, that is, |s| = |f| − |r|. The tail of the queue (CONS(x, f), r, s) is then almost (f, r, s), and inserting an element x into (f, r, s) is almost (f, CONS(x, r), s). It is said almost because, in both of those results, |s| = |f| − |r| + 1. An auxiliary function aux must then be called for the invariant to be satisfied. Two cases must be considered, depending on whether s is the empty list, in which case |r| = |f| + 1, or not. The formal definition is aux(f, r, CONS(_, s)) = (f, r, s) and aux(f, r, NIL) = (f′, NIL, f′), where f′ is f followed by r reversed.

Let us call reverse(f, r) the function which returns f followed by r reversed. Let us furthermore assume that |r| = |f| + 1, since it is the case when this function is called. More precisely, we define a lazy function rotate(f, r, a) which takes as input three lists such that |r| = |f| + 1, and returns the concatenation of f, of r reversed, and of a. Then reverse(f, r) = rotate(f, r, NIL). The inductive definition of rotate is rotate(NIL, CONS(y, NIL), a) = CONS(y, a) and rotate(CONS(x, f), CONS(y, r), a) = CONS(x, rotate(f, r, CONS(y, a))). Its running time is O(r), but, since lazy evaluation is used, the computation is delayed until the result is forced by the computation.

The list s in the data structure has two purposes. This list serves as a counter for |f| − |r|; indeed, |f| = |r| if and only if s is the empty list. This counter allows us to ensure that the rear is never longer than the front list. Furthermore, using s, which is a tail of f, forces the computation of a part of the (lazy) list f during each tail and insert operation. Therefore, when |f| = |r|, the list f is totally forced. If it were not the case, the internal representation of f could be some append of append of... of append, and forcing would not be a constant time operation anymore.

Amortized queue

Note that, without the lazy part of the implementation, the real-time queue would be a non-persistent implementation of queue in O(1) amortized time. In this case, the list s can be replaced by the integer |f| − |r|, and the reverse function would be called when s is 0.
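A compact sketch of this amortized two-list technique in Python (ours; it ignores persistence but shows the batched reversal of the rear list):

class AmortizedQueue:
    """Two-list queue: enqueue onto the rear list, dequeue from the front list;
    the rear is reversed onto the front only when the front runs out."""
    def __init__(self):
        self.front = []   # holds the oldest elements, stored in reverse order
        self.rear = []    # holds the newest elements, in arrival order

    def enqueue(self, x):
        self.rear.append(x)

    def dequeue(self):
        if not self.front:
            if not self.rear:
                raise IndexError("queue underflow")
            self.front = self.rear[::-1]   # one O(n) reversal ...
            self.rear = []                 # ... amortized over the next n dequeues
        return self.front.pop()            # the oldest element sits at the end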
2.7.3 See also

• Circular buffer

• Double-ended queue (deque)

• Priority queue

• Queueing theory

• Stack (abstract data type) – the "opposite" of a queue: LIFO (Last In First Out)

2.7.4 References

[1] "Queue (Java Platform SE 7)". Docs.oracle.com. 2014-03-26. Retrieved 2014-05-22.

[2] Okasaki, Chris. "Purely Functional Data Structures" (PDF).

[3] Hood, Robert; Melville, Robert (November 1981). "Real-time queue operations in pure Lisp". Information Processing Letters. 13 (2).

• Donald Knuth. The Art of Computer Programming, Volume 1: Fundamental Algorithms, Third Edition. Addison-Wesley, 1997. ISBN 0-201-89683-4. Section 2.2.1: Stacks, Queues, and Deques, pp. 238–243.

• Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms, Second Edition. MIT Press and McGraw-Hill, 2001. ISBN 0-262-03293-7. Section 10.1: Stacks and queues, pp. 200–204.

• William Ford, William Topp. Data Structures with C++ and STL, Second Edition. Prentice Hall, 2002. ISBN 0-13-085850-1. Chapter 8: Queues and Priority Queues, pp. 386–390.

• Adam Drozdek. Data Structures and Algorithms in C++, Third Edition. Thomson Course Technology, 2005. ISBN 0-534-49182-0. Chapter 4: Stacks and Queues, pp. 137–169.

2.7.5 External links

• Queue Data Structure and Algorithm

• Queues with algo and 'c' programme

• STL Quick Reference

• VBScript implementation of stack, queue, deque, and Red-Black Tree

This article incorporates public domain material from the NIST document: Black, Paul E. "Bounded queue". Dictionary of Algorithms and Data Structures.

2.8 Double-ended queue

"Deque" redirects here. It is not to be confused with dequeueing, a queue operation.
Not to be confused with Double-ended priority queue.

In computer science, a double-ended queue (dequeue, often abbreviated to deque) is an abstract data type that generalizes a queue, for which elements can be added to or removed from either the front (head) or back (tail).[1] It is also often called a head-tail linked list, though properly this refers to a specific data structure implementation of a deque (see below).

2.8.1 Naming conventions

Deque is sometimes written dequeue, but this use is generally deprecated in technical literature or technical writing because dequeue is also a verb meaning "to remove from a queue". Nevertheless, several libraries and some writers, such as Aho, Hopcroft, and Ullman in their textbook Data Structures and Algorithms, spell it dequeue. John Mitchell, author of Concepts in Programming Languages, also uses this terminology.

2.8.2 Distinctions and sub-types

This differs from the queue abstract data type or First-In-First-Out List (FIFO), where elements can only be added to one end and removed from the other. This general data class has some possible sub-types:

• An input-restricted deque is one where deletion can be made from both ends, but insertion can be made at one end only.

• An output-restricted deque is one where insertion can be made at both ends, but deletion can be made from one end only.

Both the basic and most common list types in computing, queues and stacks, can be considered specializations of deques, and can be implemented using deques.

2.8.3 Operations

The basic operations on a deque are enqueue and dequeue on either end. Also generally implemented are peek operations, which return the value at that end without dequeuing it.

Names vary between languages; major implementations include:

2.8.4 Implementations

There are at least two common ways to efficiently implement a deque: with a modified dynamic array or with a doubly linked list.

The dynamic array approach uses a variant of a dynamic array that can grow from both ends, sometimes called array deques. These array deques have all the properties of a dynamic array, such as constant-time random access, good locality of reference, and inefficient insertion/removal in the middle, with the addition of amortized constant-time insertion/removal at both ends, instead of just one end. Three common implementations include:

• Storing deque contents in a circular buffer, and only resizing when the buffer becomes full. This decreases the frequency of resizings.

• Allocating deque contents from the center of the underlying array, and resizing the underlying array when either end is reached. This approach may require more frequent resizings and waste more space, particularly when elements are only inserted at one end.

• Storing contents in multiple smaller arrays, allocating additional arrays at the beginning or end as needed. Indexing is implemented by keeping a dynamic array containing pointers to each of the smaller arrays.
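As a rough sketch of the first approach, here is a fixed-capacity circular-buffer deque of ours in Python, with the resizing and overflow checks omitted for brevity:

class ArrayDeque:
    """Deque on a circular buffer: the head index moves left for front-pushes;
    the tail position is head + count, both taken modulo the capacity."""
    def __init__(self, capacity: int):
        self.buf = [None] * capacity
        self.head = 0
        self.count = 0

    def push_front(self, x):
        self.head = (self.head - 1) % len(self.buf)
        self.buf[self.head] = x
        self.count += 1

    def push_back(self, x):
        self.buf[(self.head + self.count) % len(self.buf)] = x
        self.count += 1

    def pop_front(self):
        x = self.buf[self.head]
        self.head = (self.head + 1) % len(self.buf)
        self.count -= 1
        return x

    def pop_back(self):
        self.count -= 1
        return self.buf[(self.head + self.count) % len(self.buf)]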
Purely functional implementation

Double-ended queues can also be implemented as a purely functional data structure.[2] Two versions of the implementation exist. The first one, called the real-time deque, is presented below. It allows the queue to be persistent with operations in O(1) worst-case time, but requires lazy lists with memoization. The second one, with no lazy lists nor memoization, is presented at the end of the section. Its amortized time is O(1) if the persistency is not used; but the worst-case time complexity of an operation is O(n) where n is the number of elements in the double-ended queue.

Let us recall that, for a list l, |l| denotes its length, that NIL represents an empty list and CONS(h, t) represents the list whose head is h and whose tail is t. The functions drop(i, l) and take(i, l) return the list l without its first i elements, and the first i elements of l, respectively. Or, if |l| < i, they return the empty list and l respectively.
A double-ended queue is represented as a sixtuple (lenf, f, sf, lenr, r, sr) where f is a linked list which contains the front of the queue, of length lenf. Similarly, r is a linked list which represents the reverse of the rear of the queue, of length lenr. Furthermore, it is assured that |f| ≤ 2|r|+1 and |r| ≤ 2|f|+1 - intuitively, it means that neither the front nor the rear contains more than a third of the list plus one element. Finally, sf and sr are tails of f and of r; they allow scheduling of the moment where some lazy operations are forced. Note that, when a double-ended queue contains n elements in the front list and n elements in the rear list, then the inequality invariant remains satisfied after i insertions and d deletions when (i+d)/2 ≤ n. That is, at most n/2 operations can happen between each rebalancing.

Intuitively, inserting an element x in front of the double-ended queue (lenf, f, sf, lenr, r, sr) leads almost to the double-ended queue (lenf+1, CONS(x, f), drop(2, sf), lenr, r, drop(2, sr)); the head and the tail of the double-ended queue (lenf, CONS(x, f), sf, lenr, r, sr) are x and almost (lenf−1, f, drop(2, sf), lenr, r, drop(2, sr)) respectively; and the head and the tail of (lenf, NIL, NIL, lenr, CONS(x, NIL), drop(2, sr)) are x and (0, NIL, NIL, 0, NIL, NIL) respectively. The functions to insert an element in the rear, or to drop the last element of the double-ended queue, are similar to the above functions which deal with the front of the double-ended queue. It is said almost because, after insertion and after an application of tail, the invariant |r| ≤ 2|f|+1 may not be satisfied anymore. In this case it is required to rebalance the double-ended queue.

In order to avoid an operation with an O(n) cost, the algorithm uses laziness with memoization, and forces the rebalancing to be partly done during the following (|f| + |r|)/2 operations, that is, before the following rebalancing. In order to create the scheduling, some auxiliary lazy functions are required. The function rotateRev(f, r, a) returns the list f, followed by the list r reversed, followed by the list a. It is required in this function that |r| − 2|f| is 2 or 3. This function is defined by induction as rotateRev(NIL, r, a) = reverse(r ++ a), where ++ is the concatenation operation, and by rotateRev(CONS(x, f), r, a) = CONS(x, rotateRev(f, drop(2, r), reverse(take(2, r)) ++ a)). It should be noted that rotateRev(f, r, NIL) returns the list f followed by the list r reversed. The function rotateDrop(f, j, r), which returns f followed by (r without its first j elements) reversed, is also required, for j < |f|. It is defined by rotateDrop(f, 0, r) = rotateRev(f, r, NIL), rotateDrop(f, 1, r) = rotateRev(f, drop(1, r), NIL) and rotateDrop(CONS(x, f), j, r) = CONS(x, rotateDrop(f, j−2, drop(2, r))).

The balancing function can now be defined with:

fun balance(q as (lenf, f, sf, lenr, r, sr)) =
    if lenf > 2*lenr + 1 then
        let val i = (lenf + lenr) div 2
            val j = lenf + lenr - i
            val f' = take(i, f)
            val r' = rotateDrop(r, i, f)
        in (i, f', f', j, r', r') end
    else if lenr > 2*lenf + 1 then
        let val j = (lenf + lenr) div 2
            val i = lenf + lenr - j
            val r' = take(j, r)
            val f' = rotateDrop(f, j, r)
        in (i, f', f', j, r', r') end
    else q

Note that, without the lazy part of the implementation, this would be a non-persistent implementation of queue in O(1) amortized time. In this case, the lists sf and sr can be removed from the representation of the double-ended queue.

2.8.5 Language support

Ada's containers provides the generic packages Ada.Containers.Vectors and Ada.Containers.Doubly_Linked_Lists, for the dynamic array and linked list implementations, respectively.

C++'s Standard Template Library provides the class templates std::deque and std::list, for the multiple array and linked list implementations, respectively.

As of Java 6, Java's Collections Framework provides a new Deque interface that provides the functionality of insertion and removal at both ends. It is implemented by classes such as ArrayDeque (also new in Java 6) and LinkedList, providing the dynamic array and linked list implementations, respectively. However, the ArrayDeque, contrary to its name, does not support random access.

Perl's arrays have native support for both removing (shift and pop) and adding (unshift and push) elements on both ends.

Python 2.4 introduced the collections module with support for deque objects. It is implemented using a doubly linked list of fixed-length subarrays.

As of PHP 5.3, PHP's SPL extension contains the 'SplDoublyLinkedList' class that can be used to implement Deque datastructures. Previously to make a Deque structure the array functions array_shift/unshift/pop/push had to be used instead.

GHC's Data.Sequence module implements an efficient, functional deque structure in Haskell. The implementation uses 2–3 finger trees annotated with sizes. There are other (fast) possibilities to implement purely functional (thus also persistent) double queues (most using heavily lazy evaluation).[3][4] Kaplan and Tarjan were the first to implement optimal confluently persistent catenable deques.[5] Their implementation was strictly purely functional in the sense that it did not use lazy evaluation. Okasaki simplified the data structure by using lazy evaluation with a bootstrapped data structure and degrading the performance bounds from worst-case to amortized. Kaplan, Okasaki, and Tarjan produced a simpler, non-bootstrapped, amortized version that can be implemented either using lazy evaluation or more efficiently using mutation in a broader but still restricted fashion. Mihaesau and Tarjan created a simpler (but still highly complex) strictly purely functional implementation of catenable deques, and also a much simpler implementation of strictly purely functional non-catenable deques, both of which have optimal worst-case bounds.
2.8.6 Complexity

• In a doubly-linked list implementation and assuming no allocation/deallocation overhead, the time complexity of all deque operations is O(1). Additionally, the time complexity of insertion or deletion in the middle, given an iterator, is O(1); however, the time complexity of random access by index is O(n).

• In a growing array, the amortized time complexity of all deque operations is O(1). Additionally, the time complexity of random access by index is O(1); but the time complexity of insertion or deletion in the middle is O(n).

2.8.7 Applications

One example where a deque can be used is the A-Steal job scheduling algorithm.[6] This algorithm implements task scheduling for several processors. A separate deque with threads to be executed is maintained for each processor. To execute the next thread, the processor gets the first element from the deque (using the "remove first element" deque operation). If the current thread forks, it is put back to the front of the deque ("insert element at front") and a new thread is executed. When one of the processors finishes execution of its own threads (i.e. its deque is empty), it can "steal" a thread from another processor: it gets the last element from the deque of another processor ("remove last element") and executes it. The steal-job scheduling algorithm is used by Intel's Threading Building Blocks (TBB) library for parallel programming.

2.8.8 See also

• Pipe

• Queue

• Priority queue

2.8.9 References

[1] Donald Knuth. The Art of Computer Programming, Volume 1: Fundamental Algorithms, Third Edition. Addison-Wesley, 1997. ISBN 0-201-89683-4. Section 2.2.1: Stacks, Queues, and Deques, pp. 238–243.

[2] Okasaki, Chris. "Purely Functional Data Structures" (PDF).

[3] http://www.cs.cmu.edu/~rwh/theses/okasaki.pdf C. Okasaki, "Purely Functional Data Structures", September 1996

[4] Adam L. Buchsbaum and Robert E. Tarjan. Confluently persistent deques via data structural bootstrapping. Journal of Algorithms, 18(3):513–547, May 1995. (pp. 58, 101, 125)

[5] Haim Kaplan and Robert E. Tarjan. Purely functional representations of catenable sorted lists. In ACM Symposium on Theory of Computing, pages 202–211, May 1996. (pp. 4, 82, 84, 124)

[6] Eitan Frachtenberg, Uwe Schwiegelshohn (2007). Job Scheduling Strategies for Parallel Processing: 12th International Workshop, JSSPP 2006. Springer. ISBN 3-540-71034-5. See p.22.

2.8.10 External links

• Type-safe open source deque implementation at Comprehensive C Archive Network

• SGI STL Documentation: deque<T, Alloc>

• Code Project: An In-Depth Study of the STL Deque Container

• Deque implementation in C

• VBScript implementation of stack, queue, deque, and Red-Black Tree

• Multiple implementations of non-catenable deques in Haskell

2.9 Circular buffer

A ring showing, conceptually, a circular buffer. This visually shows that the buffer has no real end and it can loop around the buffer. However, since memory is never physically created as a ring, a linear representation is generally used as is done below.

A circular buffer, circular queue, cyclic buffer or ring buffer is a data structure that uses a single, fixed-size buffer as if it were connected end-to-end. This structure lends itself easily to buffering data streams.

2.9.1 Uses

The useful property of a circular buffer is that it does not need to have its elements shuffled around when one is consumed. (If a non-circular buffer were used, it would be necessary to shift all elements when one is consumed.) In other words, the circular buffer is well suited as a FIFO buffer, while a standard, non-circular buffer is well suited as a LIFO buffer.

Circular buffering makes a good implementation strategy for a queue that has a fixed maximum size. Should a maximum size be adopted for a queue, then a circular buffer is a completely ideal implementation; all queue operations are constant time. However, expanding a circular buffer requires shifting memory, which is comparatively costly. For arbitrarily expanding queues, a linked list approach may be preferred instead.

In some situations, an overwriting circular buffer can be used, e.g. in multimedia. If the buffer is used as the bounded buffer in the producer-consumer problem, then it is probably desired for the producer (e.g., an audio generator) to overwrite old data if the consumer (e.g., the sound card) is momentarily unable to keep up. Also, the LZ77 family of lossless data compression algorithms operates on the assumption that strings seen more recently in a data stream are more likely to occur soon in the stream. Implementations store the most recent data in a circular buffer.

2.9.2 How it works

[Figure: A 24-byte keyboard circular buffer. When the write pointer is about to reach the read pointer (because the microprocessor is not responding), the buffer will stop recording keystrokes and, in some computers, a beep will be played.]

A circular buffer first starts empty and of some predefined length. For example, this is a 7-element buffer:

[ ][ ][ ][ ][ ][ ][ ]

Assume that a 1 is written into the middle of the buffer (the exact starting location does not matter in a circular buffer):

[ ][ ][ ][1][ ][ ][ ]

Then assume that two more elements are added — 2 & 3 — which get appended after the 1:

[ ][ ][ ][1][2][3][ ]

If two elements are then removed from the buffer, the oldest values inside the buffer are removed. The two elements removed, in this case, are 1 & 2, leaving the buffer with just a 3:

[ ][ ][ ][ ][ ][3][ ]

If the elements 4 through 9 are then added, the buffer holds 7 elements and is completely full:

[6][7][8][9][3][4][5]

A consequence of the circular buffer is that when it is full and a subsequent write is performed, it starts overwriting the oldest data. In this case, two more elements — A & B — are added and they overwrite the 3 & 4:

[6][7][8][9][A][B][5]

Alternatively, the routines that manage the buffer could prevent overwriting the data and return an error or raise an exception. Whether or not data is overwritten is up to the semantics of the buffer routines or the application using the circular buffer.

Finally, if two elements are now removed, then what would be returned is not 3 & 4 but 5 & 6, because A & B overwrote the 3 & the 4, yielding the buffer with:

[ ][7][8][9][A][B][ ]
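The sequence above can be reproduced with a short Python sketch of an overwriting buffer; the class name OverwritingRing and its methods are illustrative only:

class OverwritingRing:
    """Fixed-size circular buffer that overwrites the oldest data when full."""
    def __init__(self, capacity):
        self.data = [None] * capacity
        self.capacity = capacity
        self.start = 0   # index of the oldest element
        self.count = 0   # number of stored elements

    def write(self, value):
        end = (self.start + self.count) % self.capacity
        self.data[end] = value
        if self.count == self.capacity:
            self.start = (self.start + 1) % self.capacity  # overwrite oldest
        else:
            self.count += 1

    def read(self):
        if self.count == 0:
            raise IndexError("buffer is empty")
        value = self.data[self.start]
        self.start = (self.start + 1) % self.capacity
        self.count -= 1
        return value

ring = OverwritingRing(7)
for x in [1, 2, 3]:
    ring.write(x)
ring.read(); ring.read()          # removes 1 and 2, leaving 3
for x in [4, 5, 6, 7, 8, 9]:      # now completely full
    ring.write(x)
ring.write('A'); ring.write('B')  # overwrites the oldest values, 3 and 4
ring.read(); ring.read()          # returns 5 and 6, not 3 and 4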

2.9.3 Circular buffer mechanics

A circular buffer can be implemented using four pointers, or two pointers and two integers:

• buffer start in memory
• buffer end in memory, or buffer capacity
• start of valid data (index or pointer)
• end of valid data (index or pointer), or amount of data currently in the buffer (integer)

This image shows a partially full buffer:

[1][2][3][ ][ ][ ][ ]   (START at the 1; END just past the 3)

This image shows a full buffer with four elements (numbers 1 through 4) having been overwritten:

[6][7][8][9][A][B][5]   (END just past the B; START at the 5)

When an element is overwritten, the start pointer is incremented to the next element.

In the pointer-based implementation strategy, the buffer’s full or empty state can be resolved from the start and end indexes: when they are equal, the buffer is empty, and when the start is one greater than the end, the buffer is full.[1] When the buffer is instead designed to track the number of inserted elements n, checking for emptiness means checking n = 0 and checking for fullness means checking whether n equals the capacity.[2]
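A minimal Python sketch of the count-tracking convention, with illustrative names, might look like this:

class CountingRing:
    """Circular buffer that tracks the number of stored elements n,
    so the empty/full tests do not depend on comparing indices alone."""
    def __init__(self, capacity):
        self.data = [None] * capacity
        self.capacity = capacity
        self.start = 0  # start of valid data
        self.n = 0      # amount of data currently in the buffer

    def is_empty(self):
        return self.n == 0              # checking n = 0

    def is_full(self):
        return self.n == self.capacity  # n equals the capacity

    def put(self, value):
        if self.is_full():
            raise OverflowError("buffer is full")
        self.data[(self.start + self.n) % self.capacity] = value
        self.n += 1

    def get(self):
        if self.is_empty():
            raise IndexError("buffer is empty")
        value = self.data[self.start]
        self.start = (self.start + 1) % self.capacity
        self.n -= 1
        return value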
2.9.4 Optimization

A circular-buffer implementation may be optimized by mapping the underlying buffer to two contiguous regions of virtual memory. (Naturally, the underlying buffer’s length must then equal some multiple of the system’s page size.) Reading from and writing to the circular buffer may then be carried out with greater efficiency by means of direct memory access; those accesses which fall beyond the end of the first virtual-memory region will automatically wrap around to the beginning of the underlying buffer. When the read offset is advanced into the second virtual-memory region, both offsets—read and write—are decremented by the length of the underlying buffer.[1]

2.9.5 Fixed-length-element and contiguous-block circular buffer

Perhaps the most common version of the circular buffer uses 8-bit bytes as elements.

Some implementations of the circular buffer use fixed-length elements that are bigger than 8-bit bytes—16-bit integers for audio buffers, 53-byte ATM cells for telecom buffers, etc. Each item is contiguous and has the correct data alignment, so software reading and writing these values can be faster than software that handles non-contiguous and non-aligned values.

Ping-pong buffering can be considered a very specialized circular buffer with exactly two large fixed-length elements.

The Bip Buffer (bipartite buffer) is very similar to a circular buffer, except it always returns contiguous blocks, which can be variable length. This offers nearly all the efficiency advantages of a circular buffer while maintaining the ability for the buffer to be used in APIs that only accept contiguous blocks.[1]

Fixed-sized compressed circular buffers use an alternative indexing strategy based on elementary number theory to maintain a fixed-sized compressed representation of the entire data sequence.[3]

2.9.6 External links

[1] Simon Cooke (2003), “The Bip Buffer - The Circular Buffer with a Twist”

[2] Morin, Pat. “ArrayQueue: An Array-Based Queue”. Open Data Structures (in pseudocode). Retrieved 7 November 2015.

[3] Gunther, John C. (March 2014). “Algorithm 938: Compressing circular buffers”. ACM Transactions on Mathematical Software. 40 (2): 1–12. doi:10.1145/2559995.

• CircularBuffer at the Portland Pattern Repository
• Boost: Templated Circular Buffer Container
• http://www.dspguide.com/ch28/2.htm
Chapter 3

Dictionaries

3.1 Associative array

“Dictionary (data structure)” redirects here. It is not to be confused with data dictionary.
“Associative container” redirects here. For the implementation of ordered associative arrays in the standard library of the C++ programming language, see associative containers.

In computer science, an associative array, map, symbol table, or dictionary is an abstract data type composed of a collection of (key, value) pairs, such that each possible key appears at most once in the collection.

Operations associated with this data type allow:[1][2]

• the addition of a pair to the collection
• the removal of a pair from the collection
• the modification of an existing pair
• the lookup of a value associated with a particular key

The dictionary problem is a classic computer science problem: the task of designing a data structure that maintains a set of data during 'search', 'delete', and 'insert' operations.[3] The two major solutions to the dictionary problem are a hash table or a search tree.[1][2][4][5] In some cases it is also possible to solve the problem using directly addressed arrays, binary search trees, or other more specialized structures.

Many programming languages include associative arrays as primitive data types, and they are available in software libraries for many others. Content-addressable memory is a form of direct hardware-level support for associative arrays.

Associative arrays have many applications including such fundamental programming patterns as memoization and the decorator pattern.[6]

3.1.1 Operations

In an associative array, the association between a key and a value is often known as a “binding”, and the same word “binding” may also be used to refer to the process of creating a new association.

The operations that are usually defined for an associative array are:[1][2]

• Add or insert: add a new (key, value) pair to the collection, binding the new key to its new value. The arguments to this operation are the key and the value.

• Reassign: replace the value in one of the (key, value) pairs that are already in the collection, binding an old key to a new value. As with an insertion, the arguments to this operation are the key and the value.

• Remove or delete: remove a (key, value) pair from the collection, unbinding a given key from its value. The argument to this operation is the key.

• Lookup: find the value (if any) that is bound to a given key. The argument to this operation is the key, and the value is returned from the operation. If no value is found, some associative array implementations raise an exception.

Often, instead of add or reassign, there is a single set operation that adds a new (key, value) pair if one does not already exist, and otherwise reassigns it.

In addition, associative arrays may also include other operations such as determining the number of bindings or constructing an iterator to loop over all the bindings. Usually, for such an operation, the order in which the bindings are returned may be arbitrary.

A multimap generalizes an associative array by allowing multiple values to be associated with a single key.[7] A bidirectional map is a related abstract data type in which the bindings operate in both directions: each value must be associated with a unique key, and a second lookup operation takes a value as argument and looks up the key associated with that value.
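In Python, whose built-in dict is an associative array, these operations can be written as follows; subscript assignment doubles as the single set operation described above:

phone_book = {}

phone_book["John Smith"] = "521-1234"   # add or insert: bind a new key
phone_book["John Smith"] = "521-9655"   # reassign: rebind an existing key
number = phone_book["John Smith"]       # lookup: raises KeyError if absent
del phone_book["John Smith"]            # remove or delete: unbind the key

len(phone_book)          # determine the number of bindings
for key in phone_book:   # iterate over all the bindings
    print(key, phone_book[key])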


3.1.2 Example

Suppose that the set of loans made by a library is represented in a data structure. Each book in a library may be checked out only by a single library patron at a time. However, a single patron may be able to check out multiple books. Therefore, the information about which books are checked out to which patrons may be represented by an associative array, in which the books are the keys and the patrons are the values. Using notation from Python or JSON, the data structure would be:

{ "Pride and Prejudice": "Alice", "Wuthering Heights": "Alice", "Great Expectations": "John" }

A lookup operation on the key "Great Expectations" would return "John". If John returns his book, that would cause a deletion operation, and if Pat checks out a book, that would cause an insertion operation, leading to a different state:

{ "Pride and Prejudice": "Alice", "The Brothers Karamazov": "Pat", "Wuthering Heights": "Alice" }

3.1.3 Implementation

For dictionaries with very small numbers of bindings, it may make sense to implement the dictionary using an association list, a linked list of bindings. With this implementation, the time to perform the basic dictionary operations is linear in the total number of bindings; however, it is easy to implement and the constant factors in its running time are small.[1][8]

Another very simple implementation technique, usable when the keys are restricted to a narrow range of integers, is direct addressing into an array: the value for a given key k is stored at the array cell A[k], or if there is no binding for k then the cell stores a special sentinel value that indicates the absence of a binding. As well as being simple, this technique is fast: each dictionary operation takes constant time. However, the space requirement for this structure is the size of the entire keyspace, making it impractical unless the keyspace is small.[4]
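A minimal sketch of direct addressing, assuming integer keys drawn from a small keyspace and None as the sentinel value (the names are illustrative):

class DirectAddressTable:
    """Direct addressing: the value for key k lives at cell A[k]."""
    def __init__(self, keyspace_size):
        self.A = [None] * keyspace_size  # space is the size of the keyspace

    def insert(self, k, value):
        self.A[k] = value        # constant time

    def lookup(self, k):
        return self.A[k]         # None signals the absence of a binding

    def delete(self, k):
        self.A[k] = None         # constant time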

The two major approaches to implementing dictionaries are a hash table or a search tree.[1][2][4][5]

Hash table implementations

The most frequently used general purpose implementation of an associative array is with a hash table: an array combined with a hash function that separates each key into a separate “bucket” of the array. The basic idea behind a hash table is that accessing an element of an array via its index is a simple, constant-time operation. Therefore, the average overhead of an operation for a hash table is only the computation of the key’s hash, combined with accessing the corresponding bucket within the array. As such, hash tables usually perform in O(1) time, and outperform alternatives in most situations.

Hash tables need to be able to handle collisions: when the hash function maps two different keys to the same bucket of the array. The two most widespread approaches to this problem are separate chaining and open addressing.[1][2][4][9] In separate chaining, the array does not store the value itself but stores a pointer to another container, usually an association list, that stores all of the values matching the hash. On the other hand, in open addressing, if a hash collision is found, then the table seeks an empty spot in an array to store the value in a deterministic manner, usually by looking at the next immediate position in the array.

[Figure: This graph compares the average number of cache misses required to look up elements in tables with separate chaining and open addressing.]

Open addressing has a lower cache miss ratio than separate chaining when the table is mostly empty. However, as the table becomes filled with more elements, open addressing’s performance degrades exponentially. Additionally, separate chaining uses less memory in most cases, unless the entries are very small (less than four times the size of a pointer).

Tree implementations

Main article: Search tree

Self-balancing binary search trees  Another common approach is to implement an associative array with a self-balancing binary search tree, such as an AVL tree or a red-black tree.[10]

Compared to hash tables, these structures have both advantages and weaknesses. The worst-case performance of self-balancing binary search trees is significantly better than that of a hash table, with a time complexity in big O notation of O(log n). This is in contrast to hash tables, whose worst-case performance involves all elements sharing a single bucket, resulting in O(n) time complexity. In addition, and like all binary search trees, self-balancing binary search trees keep their elements in order. Thus, traversing the elements follows a least-to-greatest pattern, whereas traversing a hash table can result in elements being in seemingly random order. However, hash tables have a much better average-case time complexity than self-balancing binary search trees, namely O(1), and their worst-case performance is highly unlikely when a good hash function is used.

It is worth noting that a self-balancing binary search tree can be used to implement the buckets for a hash table that uses separate chaining. This allows for average-case constant lookup, but assures a worst-case performance of O(log n). However, this introduces extra complexity into the implementation, and may cause even worse performance for smaller hash tables, where the time spent inserting into and balancing the tree is greater than the time needed to perform a linear search on all of the elements of a linked list or similar data structure.[11][12]

Other trees

Associative arrays may also be stored in unbalanced binary search trees or in data structures specialized to a particular type of keys such as radix trees, tries, Judy arrays, or van Emde Boas trees, but these implementation methods are less efficient than hash tables as well as placing greater restrictions on the types of data that they can handle. The advantages of these alternative structures come from their ability to handle operations beyond the basic ones of an associative array, such as finding the binding whose key is the closest to a queried key, when the query is not itself present in the set of bindings.

3.1.4 Language support

Main article: Comparison of programming languages (mapping)

Associative arrays can be implemented in any programming language as a package, and many language systems provide them as part of their standard library. In some languages, they are not only built into the standard system, but have special syntax, often using array-like subscripting.

Built-in syntactic support for associative arrays was introduced by SNOBOL4, under the name “table”. MUMPS made multi-dimensional associative arrays, optionally persistent, its key data structure. SETL supported them as one possible implementation of sets and maps. Most modern scripting languages, starting with AWK and including Rexx, Perl, Tcl, JavaScript, Wolfram Language, Python, Ruby, Go, and Lua, support associative arrays as a primary container type. In many more languages, they are available as library functions without special syntax.

In Smalltalk, Objective-C, .NET,[13] Python, REALbasic, Swift, and VBA they are called dictionaries; in Perl, Ruby and Seed7 they are called hashes; in C++, Java, Go, Clojure, Scala, OCaml, and Haskell they are called maps (see map (C++), unordered_map (C++), and Map); in Common Lisp and Windows PowerShell, they are called hash tables (since both typically use this implementation). In PHP, all arrays can be associative, except that the keys are limited to integers and strings. In JavaScript (see also JSON), all objects behave as associative arrays with string-valued keys, while the Map and WeakMap types take arbitrary objects as keys. In Lua, they are called tables, and are used as the primitive building block for all data structures. In Visual FoxPro, they are called Collections. The D language also has support for associative arrays.[14]

3.1.5 Permanent storage

Main article: Key-value store

Most programs using associative arrays will at some point need to store that data in a more permanent form, such as in a computer file. A common solution to this problem is a generalized concept known as archiving or serialization, which produces a text or binary representation of the original objects that can be written directly to a file. This is most commonly implemented in the underlying object model, like .NET or Cocoa, which includes standard functions that convert the internal data into text form. The program can create a complete text representation of any group of objects by calling these methods, which are almost always already implemented in the base associative array class.[15]

For programs that use very large data sets, this sort of individual file storage is not appropriate, and a database management system (DB) is required. Some DB systems natively store associative arrays by serializing the data and then storing that serialized data and the key. Individual arrays can then be loaded or saved from the database using the key to refer to them. These key-value stores have been used for many years and have a history as long as that of the more common relational database (RDBs), but a lack of standardization, among other reasons, limited their use to certain niche roles. RDBs were used for these roles in most cases, although saving objects to a RDB can be complicated, a problem known as object-relational impedance mismatch.

After c. 2010, the need for high-performance databases suitable for cloud computing and more closely matching the internal structure of the programs using them led to a renaissance in the key-value store market. These systems can store and retrieve associative arrays in a native fashion, which can greatly improve performance in common web-related workflows.

3.1.6 See also

• Key-value database
• Tuple
• Function (mathematics)
• JSON

3.1.7 References

[1] Goodrich, Michael T.; Tamassia, Roberto (2006), “9.1 The Map Abstract Data Type”, Data Structures & Algorithms in Java (4th ed.), Wiley, pp. 368–371

[2] Mehlhorn, Kurt; Sanders, Peter (2008), “4 Hash Tables and Associative Arrays”, Algorithms and Data Structures: The Basic Toolbox (PDF), Springer, pp. 81–98

[3] Anderson, Arne (1989). “Optimal Bounds on the Dictionary Problem”. Proc. Symposium on Optimal Algorithms. Springer Verlag: 106–114.

[4] Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2001), “11 Hash Tables”, Introduction to Algorithms (2nd ed.), MIT Press and McGraw-Hill, pp. 221–252, ISBN 0-262-03293-7.

[5] Dietzfelbinger, M.; Karlin, A.; Mehlhorn, K.; Meyer auf der Heide, F.; Rohnert, H.; Tarjan, R. E. (1994). “Dynamic Perfect Hashing: Upper and Lower Bounds”. SIAM J. Comput. 23 (4): 738–761. doi:10.1137/S0097539791194094. http://portal.acm.org/citation.cfm?id=182370

[6] Goodrich & Tamassia (2006), pp. 597–599.

[7] Goodrich & Tamassia (2006), pp. 389–397.

[8] “When should I use a hash table instead of an association list?”. lisp-faq/part2. 1996-02-20.

[9] Klammer, F.; Mazzolini, L. (2006), “Pathfinders for associative maps”, Ext. Abstracts GIS-l 2006, GIS-I, pp. 71–74.

[10] Joel Adams and Larry Nyhoff. “Trees in STL”. Quote: “The Standard Template Library ... some of its containers -- the set<T>, map<T1, T2>, multiset<T>, and multimap<T1, T2> templates -- are generally built using a special kind of self-balancing binary search tree called a red-black tree.”

[11] Knuth, Donald (1998). The Art of Computer Programming. 3: Sorting and Searching (2nd ed.). Addison-Wesley. pp. 513–558. ISBN 0-201-89685-0.

[12] Probst, Mark (2010-04-30). “Linear vs Binary Search”. Retrieved 2016-11-20.

[13] “Dictionary<TKey, TValue> Class”. MSDN.

[14] “Associative Arrays, the D programming language”. Digital Mars.

[15] “Archives and Serializations Programming Guide”, Apple Inc., 2012

3.1.8 External links

• NIST’s Dictionary of Algorithms and Data Structures: Associative Array

3.2 Association list

In computer programming and particularly in Lisp, an association list, often referred to as an alist, is a linked list in which each list element (or node) comprises a key and a value. The association list is said to associate the value with the key. In order to find the value associated with a given key, a sequential search is used: each element of the list is searched in turn, starting at the head, until the key is found. Association lists provide a simple way of implementing an associative array, but are efficient only when the number of keys is very small.

3.2.1 Operation

An associative array is an abstract data type that can be used to maintain a collection of key–value pairs and look up the value associated with a given key. The association list provides a simple way of implementing this data type.

To test whether a key is associated with a value in a given association list, search the list starting at its first node and continuing either until a node containing the key has been found or until the search reaches the end of the list (in which case the key is not present). To add a new key–value pair to an association list, create a new node for that key–value pair, set the node’s link to be the previous first element of the association list, and replace the first element of the association list with the new node.[1] Although some implementations of association lists disallow having multiple nodes with the same keys as each other, such duplications are not problematic for this search algorithm: duplicate keys that appear later in the list are ignored.[2]

It is also possible to delete a key from an association list, by scanning the list to find each occurrence of the key and splicing the nodes containing the key out of the list.[1] The scan should continue to the end of the list, even when the key is found, in case the same key may have been inserted multiple times.
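These operations can be sketched in Python with a singly linked chain of nodes; this is an illustrative sketch, not a library interface:

class Node:
    def __init__(self, key, value, rest):
        self.key, self.value, self.rest = key, value, rest

def lookup(alist, key):
    # Sequential search from the head; later duplicates are ignored.
    node = alist
    while node is not None:
        if node.key == key:
            return node.value
        node = node.rest
    return None

def insert(alist, key, value):
    # Prepend a new node; constant time, may shadow an older binding.
    return Node(key, value, alist)

def delete(alist, key):
    # Splice out every occurrence of the key, scanning the whole list.
    if alist is None:
        return None
    rest = delete(alist.rest, key)
    if alist.key == key:
        return rest
    return Node(alist.key, alist.value, rest)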

3.2.2 Performance

The disadvantage of association lists is that the time to search is O(n), where n is the length of the list.[3] For large lists, this may be much slower than the times that can be obtained by representing an associative array as a binary search tree or as a hash table. Additionally, unless the list is regularly pruned to remove elements with duplicate keys, multiple values associated with the same key will increase the size of the list, and thus the time to search, without providing any compensatory advantage.

One advantage of association lists is that a new element can be added in constant time. Additionally, when the number of keys is very small, searching an association list may be more efficient than searching a binary search tree or hash table, because of the greater simplicity of their implementation.[4]

3.2.3 Applications and software libraries

In the early development of Lisp, association lists were used to resolve references to free variables in procedures.[5][6] In this application, it is convenient to augment association lists with an additional operation that reverses the addition of a key–value pair without scanning the list for other copies of the same key. In this way, the association list can function as a stack, allowing local variables to temporarily shadow other variables with the same names, without destroying the values of those other variables.[7]

Many programming languages, including Lisp,[5] Scheme,[8] OCaml,[9] and Haskell,[10] have functions for handling association lists in their standard libraries.

3.2.4 See also

• Self-organizing list, a strategy for re-ordering the keys in an association list to speed up searches for frequently-accessed keys

3.2.5 References

[1] Marriott, Kim; Stuckey, Peter J. (1998). Programming with Constraints: An Introduction. MIT Press. pp. 193–195. ISBN 9780262133418.

[2] Frické, Martin (2012). “2.8.3 Association Lists”. Logic and the Organization of Information. Springer. pp. 44–45. ISBN 9781461430872.

[3] Knuth, Donald. “6.1 Sequential Searching”. The Art of Computer Programming, Vol. 3: Sorting and Searching (2nd ed.). Addison-Wesley. pp. 396–405. ISBN 0-201-89685-0.

[4] Janes, Calvin (2011). “Using Association Lists for Associative Arrays”. Developer’s Guide to Collections in Microsoft .NET. Pearson Education. p. 191. ISBN 9780735665279.

[5] McCarthy, John; Abrahams, Paul W.; Edwards, Daniel J.; Hart, Timothy P.; Levin, Michael I. (1985). LISP 1.5 Programmer’s Manual (PDF). MIT Press. ISBN 0-262-13011-4. See in particular p. 12 for functions that search an association list and use it to substitute symbols in another expression, and p. 103 for the application of association lists in maintaining variable bindings.

[6] van de Snepscheut, Jan L. A. (1993). What Computing Is All About. Monographs in Computer Science. Springer. p. 201. ISBN 9781461227106.

[7] Scott, Michael Lee (2000). “3.3.4 Association Lists and Central Reference Tables”. Programming Language Pragmatics. Morgan Kaufmann. p. 137. ISBN 9781558604421.

[8] Pearce, Jon (2012). Programming and Meta-Programming in Scheme. Undergraduate Texts in Computer Science. Springer. p. 214. ISBN 9781461216827.

[9] Minsky, Yaron; Madhavapeddy, Anil; Hickey, Jason (2013). Real World OCaml: Functional Programming for the Masses. O'Reilly Media. p. 253. ISBN 9781449324766.

[10] O'Sullivan, Bryan; Goerzen, John; Stewart, Donald Bruce (2008). Real World Haskell: Code You Can Believe In. O'Reilly Media. p. 299. ISBN 9780596554309.

3.3 Hash table

Not to be confused with Hash list or Hash tree.
“Rehash” redirects here. For the South Park episode, see Rehash (South Park). For the IRC command, see List of Internet Relay Chat commands § REHASH.

[Figure: A small phone book as a hash table. A hash function maps the name keys (“John Smith”, “Lisa Smith”, “Sandra Dee”) into buckets 00–15, where the associated phone numbers are stored.]

In computing, a hash table (hash map) is a data structure used to implement an associative array, a structure that can map keys to values. A hash table uses a hash function to compute an index into an array of buckets or slots, from which the desired value can be found.

Ideally, the hash function will assign each key to a unique bucket, but most hash table designs employ an imperfect hash function, which might cause hash collisions where the hash function generates the same index for more than one key. Such collisions must be accommodated in some way.

In a well-dimensioned hash table, the average cost (number of instructions) for each lookup is independent of the number of elements stored in the table.

Many hash table designs also allow arbitrary insertions and deletions of key-value pairs, at (amortized[2]) constant average cost per operation.[3][4]

In many situations, hash tables turn out to be more efficient than search trees or any other table lookup structure. For this reason, they are widely used in many kinds of computer software, particularly for associative arrays, database indexing, caches, and sets.

3.3.1 Hashing

Main article: Hash function

The idea of hashing is to distribute the entries (key/value pairs) across an array of buckets. Given a key, the algorithm computes an index that suggests where the entry can be found:

index = f(key, array_size)

Often this is done in two steps:

hash = hashfunc(key)
index = hash % array_size

In this method, the hash is independent of the array size, and it is then reduced to an index (a number between 0 and array_size − 1) using the modulo operator (%).

In the case that the array size is a power of two, the remainder operation is reduced to masking, which improves speed, but can increase problems with a poor hash function.
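Both reductions can be written out in a few lines of Python; hashfunc here is simply Python's built-in hash, standing in for any hash function:

def bucket_index(key, array_size):
    h = hash(key)                # hash = hashfunc(key)
    return h % array_size        # index = hash % array_size

def bucket_index_pow2(key, array_size):
    # When array_size is a power of two, the remainder operation
    # reduces to masking off the low bits, which is faster.
    assert array_size & (array_size - 1) == 0
    return hash(key) & (array_size - 1)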

Choosing a hash function

A good hash function and implementation algorithm are essential for good hash table performance, but may be difficult to achieve.

A basic requirement is that the function should provide a uniform distribution of hash values. A non-uniform distribution increases the number of collisions and the cost of resolving them. Uniformity is sometimes difficult to ensure by design, but may be evaluated empirically using statistical tests, e.g., a Pearson’s chi-squared test for discrete uniform distributions.[5][6]

The distribution needs to be uniform only for table sizes that occur in the application. In particular, if one uses dynamic resizing with exact doubling and halving of the table size s, then the hash function needs to be uniform only when s is a power of two. Here the index can be computed as some range of bits of the hash function. On the other hand, some hashing algorithms prefer to have s be a prime number.[7] The modulus operation may provide some additional mixing; this is especially useful with a poor hash function.

For open addressing schemes, the hash function should also avoid clustering, the mapping of two or more keys to consecutive slots. Such clustering may cause the lookup cost to skyrocket, even if the load factor is low and collisions are infrequent. The popular multiplicative hash[3] is claimed to have particularly poor clustering behavior.[7]

Cryptographic hash functions are believed to provide good hash functions for any table size s, either by modulo reduction or by bit masking. They may also be appropriate if there is a risk of malicious users trying to sabotage a network service by submitting requests designed to generate a large number of collisions in the server’s hash tables. However, the risk of sabotage can also be avoided by cheaper methods (such as applying a secret salt to the data, or using a universal hash function). A drawback of cryptographic hashing functions is that they are often slower to compute, which means that in cases where the uniformity for any s is not necessary, a non-cryptographic hashing function might be preferable.

Perfect hash function

If all keys are known ahead of time, a perfect hash function can be used to create a perfect hash table that has no collisions. If minimal perfect hashing is used, every location in the hash table can be used as well.

Perfect hashing allows for constant time lookups in all cases. This is in contrast to most chaining and open addressing methods, where the time for lookup is low on average, but may be very large, O(n), for instance when all the keys hash to a few values.

3.3.2 Key statistics

A critical statistic for a hash table is the load factor, defined as

load factor = n / k,

where

• n is the number of entries;
• k is the number of buckets.

As the load factor grows larger, the hash table becomes slower, and it may even fail to work (depending on the method used). The expected constant time property of a hash table assumes that the load factor is kept below some bound. For a fixed number of buckets, the time for a lookup grows with the number of entries, and therefore the desired constant time is not achieved.

Second to that, one can examine the variance of the number of entries per bucket. For example, two tables both have 1,000 entries and 1,000 buckets; one has exactly one entry in each bucket, the other has all entries in the same bucket.

Clearly the hashing is not working in the second one.

A low load factor is not especially beneficial. As the load factor approaches 0, the proportion of unused areas in the hash table increases, but there is not necessarily any reduction in search cost. This results in wasted memory.

3.3.3 Collision resolution

Hash collisions are practically unavoidable when hashing a random subset of a large set of possible keys. For example, if 2,450 keys are hashed into a million buckets, even with a perfectly uniform random distribution, according to the birthday problem there is approximately a 95% chance of at least two of the keys being hashed to the same slot.

Therefore, almost all hash table implementations have some collision resolution strategy to handle such events. Some common strategies are described below. All these methods require that the keys (or pointers to them) be stored in the table, together with the associated values.

Separate chaining

[Figure: Hash collision resolved by separate chaining. Each bucket of the array points to a linked list of the entries that hashed to it.]

In the method known as separate chaining, each bucket is independent, and has some sort of list of entries with the same index. The time for hash table operations is the time to find the bucket (which is constant) plus the time for the list operation.

In a good hash table, each bucket has zero or one entries, and sometimes two or three, but rarely more than that. Therefore, structures that are efficient in time and space for these cases are preferred. Structures that are efficient for a fairly large number of entries per bucket are not needed or desirable. If these cases happen often, the hashing function needs to be fixed.

Separate chaining with linked lists  Chained hash tables with linked lists are popular because they require only basic data structures with simple algorithms, and can use simple hash functions that are unsuitable for other methods.

The cost of a table operation is that of scanning the entries of the selected bucket for the desired key. If the distribution of keys is sufficiently uniform, the average cost of a lookup depends only on the average number of keys per bucket—that is, it is roughly proportional to the load factor.

For this reason, chained hash tables remain effective even when the number of table entries n is much higher than the number of slots. For example, a chained hash table with 1000 slots and 10,000 stored keys (load factor 10) is five to ten times slower than a 10,000-slot table (load factor 1), but still 1000 times faster than a plain sequential list.

For separate-chaining, the worst-case scenario is when all entries are inserted into the same bucket, in which case the hash table is ineffective and the cost is that of searching the bucket data structure. If the latter is a linear list, the lookup procedure may have to scan all its entries, so the worst-case cost is proportional to the number n of entries in the table.

The bucket chains are often searched sequentially using the order the entries were added to the bucket. If the load factor is large and some keys are more likely to come up than others, then rearranging the chain with a move-to-front heuristic may be effective. More sophisticated data structures, such as balanced search trees, are worth considering only if the load factor is large (about 10 or more), or if the hash distribution is likely to be very non-uniform, or if one must guarantee good performance even in a worst-case scenario. However, using a larger table and/or a better hash function may be even more effective in those cases.

Chained hash tables also inherit the disadvantages of linked lists. When storing small keys and values, the space overhead of the next pointer in each entry record can be significant. An additional disadvantage is that traversing a linked list has poor cache performance, making the processor cache ineffective.
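A minimal Python sketch of a chained table along these lines, with each bucket held as a list of (key, value) pairs (illustrative only, with no resizing):

class ChainedHashTable:
    def __init__(self, slots=8):
        self.buckets = [[] for _ in range(slots)]

    def _bucket(self, key):
        # Finding the bucket is the constant-time part of every operation.
        return self.buckets[hash(key) % len(self.buckets)]

    def insert(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # reassign an existing key
                return
        bucket.append((key, value))

    def lookup(self, key):
        # Cost: scanning the entries of the selected bucket.
        for k, v in self._bucket(key):
            if k == key:
                return v
        return None

    def delete(self, key):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                del bucket[i]
                return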

Separate chaining with list head cells  Some chaining implementations store the first record of each chain in the slot array itself.[4] The number of pointer traversals is decreased by one for most cases. The purpose is to increase cache efficiency of hash table access.

[Figure: Hash collision by separate chaining with head records in the bucket array.]

The disadvantage is that an empty bucket takes the same space as a bucket with one entry. To save space, such hash tables often have about as many slots as stored entries, meaning that many slots have two or more entries.

Separate chaining with other structures  Instead of a list, one can use any other data structure that supports the required operations. For example, by using a self-balancing binary search tree, the theoretical worst-case time of common hash table operations (insertion, deletion, lookup) can be brought down to O(log n) rather than O(n). However, this introduces extra complexity into the implementation, and may cause even worse performance for smaller hash tables, where the time spent inserting into and balancing the tree is greater than the time needed to perform a linear search on all of the elements of a list.[3][8] A real-world example of a hash table that uses a self-balancing binary search tree for buckets is the HashMap class in Java version 8.[9]

The variant called array hash table uses a dynamic array to store all the entries that hash to the same slot.[10][11][12] Each newly inserted entry gets appended to the end of the dynamic array that is assigned to the slot. The dynamic array is resized in an exact-fit manner, meaning it is grown only by as many bytes as needed. Alternative techniques such as growing the array by block sizes or pages were found to improve insertion performance, but at a cost in space. This variation makes more efficient use of CPU caching and the translation lookaside buffer (TLB), because slot entries are stored in sequential memory positions. It also dispenses with the next pointers that are required by linked lists, which saves space. Despite frequent array resizing, space overheads incurred by the operating system such as memory fragmentation were found to be small.

An elaboration on this approach is the so-called dynamic perfect hashing,[13] where a bucket that contains k entries is organized as a perfect hash table with k² slots. While it uses more memory (n² slots for n entries, in the worst case and n × k slots in the average case), this variant has guaranteed constant worst-case lookup time, and low amortized time for insertion. It is also possible to use a fusion tree for each bucket, achieving constant time for all operations with high probability.[14]

Open addressing

Main article: Open addressing

[Figure: Hash collision resolved by open addressing with linear probing (interval=1). Note that “Ted Baker” has a unique hash, but nevertheless collided with “Sandra Dee”, who had previously collided with “John Smith”.]

In another strategy, called open addressing, all entry records are stored in the bucket array itself. When a new entry has to be inserted, the buckets are examined, starting with the hashed-to slot and proceeding in some probe sequence, until an unoccupied slot is found. When searching for an entry, the buckets are scanned in the same sequence, until either the target record is found, or an unused array slot is found, which indicates that there is no such key in the table.[15] The name “open addressing” refers to the fact that the location (“address”) of the item is not determined by its hash value. (This method is also called closed hashing; it should not be confused with “open hashing” or “closed addressing”, which usually mean separate chaining.)

Well-known probe sequences include:

• Linear probing, in which the interval between probes is fixed (usually 1); see the sketch below
• Quadratic probing, in which the interval between probes is increased by adding the successive outputs of a quadratic polynomial to the starting value given by the original hash computation
• Double hashing, in which the interval between probes is computed by a second hash function

A drawback of all these open addressing schemes is that the number of stored entries cannot exceed the number of slots in the bucket array. In fact, even with good hash functions, their performance dramatically degrades when the load factor grows beyond 0.7 or so. For many applications, these restrictions mandate the use of dynamic resizing, with its attendant costs.
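Linear probing, the simplest of these sequences, can be sketched as follows; this illustration assumes the table is never allowed to fill completely, and deletion (which requires tombstones or re-insertion) is omitted:

class LinearProbingTable:
    EMPTY = object()  # sentinel for unoccupied slots

    def __init__(self, slots=16):
        self.keys = [self.EMPTY] * slots
        self.values = [None] * slots

    def insert(self, key, value):
        i = hash(key) % len(self.keys)
        # Probe successive slots until the key or an unoccupied slot is found.
        while self.keys[i] is not self.EMPTY and self.keys[i] != key:
            i = (i + 1) % len(self.keys)
        self.keys[i], self.values[i] = key, value

    def lookup(self, key):
        i = hash(key) % len(self.keys)
        # Scan the same probe sequence that insertion used.
        while self.keys[i] is not self.EMPTY:
            if self.keys[i] == key:
                return self.values[i]
            i = (i + 1) % len(self.keys)
        return None  # an unused slot means the key is not in the table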

Open addressing schemes also put more stringent requirements on the hash function: besides distributing the keys more uniformly over the buckets, the function must also minimize the clustering of hash values that are consecutive in the probe order. Using separate chaining, the only concern is that too many objects map to the same hash value; whether they are adjacent or nearby is completely irrelevant.

Open addressing only saves memory if the entries are small (less than four times the size of a pointer) and the load factor is not too small. If the load factor is close to zero (that is, there are far more buckets than stored entries), open addressing is wasteful even if each entry is just two words.

[Figure: This graph compares the average number of cache misses required to look up elements in tables with chaining and linear probing. As the table passes the 80%-full mark, linear probing’s performance drastically degrades.]

Open addressing avoids the time overhead of allocating each new entry record, and can be implemented even in the absence of a memory allocator. It also avoids the extra indirection required to access the first entry of each bucket (that is, usually the only one). It also has better locality of reference, particularly with linear probing. With small record sizes, these factors can yield better performance than chaining, particularly for lookups. Hash tables with open addressing are also easier to serialize, because they do not use pointers.

On the other hand, normal open addressing is a poor choice for large elements, because these elements fill entire CPU cache lines (negating the cache advantage), and a large amount of space is wasted on large empty table slots. If the open addressing table only stores references to elements (external storage), it uses space comparable to chaining even for large records but loses its speed advantage.

Generally speaking, open addressing is better used for hash tables with small records that can be stored within the table (internal storage) and fit in a cache line. They are particularly suitable for elements of one word or less. If the table is expected to have a high load factor, the records are large, or the data is variable-sized, chained hash tables often perform as well or better.

Ultimately, used sensibly, any kind of hash table algorithm is usually fast enough; and the percentage of a calculation spent in hash table code is low. Memory usage is rarely considered excessive. Therefore, in most cases the differences between these algorithms are marginal, and other considerations typically come into play.

Coalesced hashing  A hybrid of chaining and open addressing, coalesced hashing links together chains of nodes within the table itself.[15] Like open addressing, it achieves space usage and (somewhat diminished) cache advantages over chaining. Like chaining, it does not exhibit clustering effects; in fact, the table can be efficiently filled to a high density. Unlike chaining, it cannot have more elements than table slots.

Cuckoo hashing  Another alternative open-addressing solution is cuckoo hashing, which ensures constant lookup time in the worst case, and constant amortized time for insertions and deletions. It uses two or more hash functions, which means any key/value pair could be in two or more locations. For lookup, the first hash function is used; if the key/value is not found, then the second hash function is used, and so on. If a collision happens during insertion, then the key is re-hashed with the second hash function to map it to another bucket. If all hash functions are used and there is still a collision, then the key it collided with is removed to make space for the new key, and the old key is re-hashed with one of the other hash functions, which maps it to another bucket. If that location also results in a collision, then the process repeats until there is no collision or the process traverses all the buckets, at which point the table is resized. By combining multiple hash functions with multiple cells per bucket, very high space utilization can be achieved.
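A sketch of the two-table variant follows; the second hash function here is improvised from Python's hash and is purely illustrative, and updates of an existing key are not handled:

class CuckooTable:
    def __init__(self, slots=16, max_kicks=32):
        self.t1 = [None] * slots   # each entry is a (key, value) pair
        self.t2 = [None] * slots
        self.max_kicks = max_kicks

    def _h1(self, key):
        return hash(key) % len(self.t1)

    def _h2(self, key):
        return hash((key, "salt")) % len(self.t2)  # illustrative second hash

    def lookup(self, key):
        # A key can only ever be in one of two locations: O(1) worst case.
        e = self.t1[self._h1(key)]
        if e is not None and e[0] == key:
            return e[1]
        e = self.t2[self._h2(key)]
        if e is not None and e[0] == key:
            return e[1]
        return None

    def insert(self, key, value):
        entry = (key, value)
        for _ in range(self.max_kicks):
            i = self._h1(entry[0])
            entry, self.t1[i] = self.t1[i], entry  # displace any occupant
            if entry is None:
                return
            j = self._h2(entry[0])
            entry, self.t2[j] = self.t2[j], entry
            if entry is None:
                return
        raise RuntimeError("cycle detected; a real table would resize and rehash")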

Hopscotch hashing  Another alternative open-addressing solution is hopscotch hashing,[16] which combines the approaches of cuckoo hashing and linear probing, yet seems in general to avoid their limitations. In particular it works well even when the load factor grows beyond 0.9. The algorithm is well suited for implementing a resizable concurrent hash table.

The hopscotch hashing algorithm works by defining a neighborhood of buckets near the original hashed bucket, where a given entry is always found. Thus, search is limited to the number of entries in this neighborhood, which is logarithmic in the worst case, constant on average, and with proper alignment of the neighborhood typically requires one cache miss. When inserting an entry, one first attempts to add it to a bucket in the neighborhood. However, if all buckets in this neighborhood are occupied, the algorithm traverses buckets in sequence until an open slot (an unoccupied bucket) is found (as in linear probing). At that point, since the empty bucket is outside the neighborhood, items are repeatedly displaced in a sequence of hops. (This is similar to cuckoo hashing, but with the difference that in this case the empty slot is being moved into the neighborhood, instead of items being moved out with the hope of eventually finding an empty slot.) Each hop brings the open slot closer to the original neighborhood, without invalidating the neighborhood property of any of the buckets along the way. In the end, the open slot has been moved into the neighborhood, and the entry being inserted can be added to it.

Robin Hood hashing

One interesting variation on double-hashing collision resolution is Robin Hood hashing.[17][18] The idea is that a new key may displace a key already inserted, if its probe count is larger than that of the key at the current position. The net effect of this is that it reduces worst-case search times in the table. This is similar to ordered hash tables[19] except that the criterion for bumping a key does not depend on a direct relationship between the keys. Since both the worst case and the variation in the number of probes are reduced dramatically, an interesting variation is to probe the table starting at the expected successful probe value and then expand from that position in both directions.[20] External Robin Hood hashing is an extension of this algorithm where the table is stored in an external file and each table position corresponds to a fixed-sized page or bucket with B records.[21]

2-choice hashing

2-choice hashing employs two different hash functions, h1(x) and h2(x), for the hash table. Both hash functions are used to compute two table locations. When an object is inserted in the table, it is placed in the table location that contains fewer objects (with the default being the h1(x) table location if there is equality in bucket size). 2-choice hashing employs the principle of the power of two choices.[22]

3.3.4 Dynamic resizing

The good functioning of a hash table depends on the fact that the table size is proportional to the number of entries. With a fixed size, and the common structures, it is similar to linear search, except with a better constant factor. In some cases, the number of entries may be definitely known in advance, for example keywords in a language. More commonly, this is not known for sure, if only due to later changes in code and data. It is one serious, although common, mistake to not provide any way for the table to resize. A general-purpose hash table “class” will almost always have some way to resize, and it is good practice even for simple “custom” tables. An implementation should check the load factor, and do something if it becomes too large (this needs to be done only on inserts, since that is the only thing that would increase it).

To keep the load factor under a certain limit, e.g., under 3/4, many table implementations expand the table when items are inserted. For example, in Java’s HashMap class the default load factor threshold for table expansion is 3/4, and in Python’s dict, the table size is resized when the load factor is greater than 2/3.

Since buckets are usually implemented on top of a dynamic array and any constant proportion for resizing greater than 1 will keep the load factor under the desired limit, the exact choice of the constant is determined by the same space-time tradeoff as for dynamic arrays. Resizing is accompanied by a full or incremental table rehash whereby existing items are mapped to new bucket locations.

To limit the proportion of memory wasted due to empty buckets, some implementations also shrink the size of the table—followed by a rehash—when items are deleted. From the point of space-time tradeoffs, this operation is similar to the deallocation in dynamic arrays.

Resizing by copying all entries

A common approach is to automatically trigger a complete resizing when the load factor exceeds some threshold rmax. Then a new larger table is allocated, each entry is removed from the old table, and inserted into the new table. When all entries have been removed from the old table, the old table is returned to the free storage pool. Symmetrically, when the load factor falls below a second threshold rmin, all entries are moved to a new smaller table.

For hash tables that shrink and grow frequently, the resizing downward can be skipped entirely. In this case, the table size is proportional to the maximum number of entries that ever were in the hash table at one time, rather than the current number. The disadvantage is that memory usage will be higher, and thus cache behavior may be worse. For best control, a “shrink-to-fit” operation can be provided that does this only on request.

If the table size increases or decreases by a fixed percentage at each expansion, the total cost of these resizings, amortized over all insert and delete operations, is still a constant, independent of the number of entries n and of the number m of operations performed.

For example, consider a table that was created with the minimum possible size and is doubled each time the load ratio exceeds some threshold. If m elements are inserted into that table, the total number of extra re-insertions that occur in all dynamic resizings of the table is at most m − 1. In other words, dynamic resizing roughly doubles the cost of each insert or delete operation.
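Grafting this policy onto the ChainedHashTable sketch from the separate-chaining section gives a minimal illustration; the 3/4 threshold and doubling factor are the illustrative choices discussed above, and the entry count is simplified (reassigning an existing key is not distinguished from a fresh insertion):

class ResizingChainedTable(ChainedHashTable):
    """Doubles the bucket array when the load factor n/k would exceed 3/4."""
    MAX_LOAD = 0.75

    def __init__(self, slots=8):
        super().__init__(slots)
        self.n = 0

    def insert(self, key, value):
        if (self.n + 1) / len(self.buckets) > self.MAX_LOAD:
            entries = [pair for bucket in self.buckets for pair in bucket]
            self.buckets = [[] for _ in range(2 * len(self.buckets))]
            for k, v in entries:          # full rehash into the new table
                super().insert(k, v)
        super().insert(key, value)
        self.n += 1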

Incremental resizing

Some hash table implementations, notably in real-time systems, cannot pay the price of enlarging the hash table all at once, because it may interrupt time-critical operations. If one cannot avoid dynamic resizing, a solution is to perform the resizing gradually:

• During the resize, allocate the new hash table, but keep the old table unchanged.
• In each lookup or delete operation, check both tables.
• Perform insertion operations only in the new table.
• At each insertion also move r elements from the old table to the new table.
• When all elements are removed from the old table, deallocate it.

To ensure that the old table is completely copied over before the new table itself needs to be enlarged, it is necessary to increase the size of the table by a factor of at least (r + 1)/r during resizing.

Disk-based hash tables almost always use some scheme of incremental resizing, since the cost of rebuilding the entire table on disk would be too high.

Monotonic keys

If it is known that key values will always increase (or decrease) monotonically, then a variation of consistent hashing can be achieved by keeping a list of the single most recent key value at each hash table resize operation. Upon lookup, keys that fall in the ranges defined by these list entries are directed to the appropriate hash function—and indeed hash table—both of which can be different for each range. Since it is common to grow the overall number of entries by doubling, there will only be O(log(N)) ranges to check, and binary search time for the redirection would be O(log(log(N))). As with consistent hashing, this approach guarantees that any key’s hash, once issued, will never change, even when the hash table is later grown.

Other solutions

Linear hashing[23] is a hash table algorithm that permits incremental hash table expansion. It is implemented using a single hash table, but with two possible lookup functions.

Another way to decrease the cost of table resizing is to choose a hash function in such a way that the hashes of most values do not change when the table is resized. Such hash functions are prevalent in disk-based and distributed hash tables, where rehashing is prohibitively costly. The problem of designing a hash such that most values do not change when the table is resized is known as the distributed hash table problem. The four most popular approaches are rendezvous hashing, consistent hashing, the content addressable network algorithm, and Kademlia distance.

3.3.5 Performance analysis

In the simplest model, the hash function is completely unspecified and the table does not resize. For the best possible choice of hash function, a table of size k with open addressing has no collisions and holds up to k elements, with a single comparison for successful lookup, and a table of size k with chaining and n keys has the minimum max(0, n − k) collisions and O(1 + n/k) comparisons for lookup. For the worst choice of hash function, every insertion causes a collision, and hash tables degenerate to linear search, with Ω(n) amortized comparisons per insertion and up to n comparisons for a successful lookup.

Adding rehashing to this model is straightforward. As in a dynamic array, geometric resizing by a factor of b implies that only n/b^i keys are inserted i or more times, so that the total number of insertions is bounded above by bn/(b − 1), which is O(n). By using rehashing to maintain n < k, tables using both chaining and open addressing can have unlimited elements and perform successful lookup in a single comparison for the best choice of hash function.

In more realistic models, the hash function is a random variable over a probability distribution of hash functions, and performance is computed on average over the choice of hash function. When this distribution is uniform, the assumption is called “simple uniform hashing” and it can be shown that hashing with chaining requires Θ(1 + n/k) comparisons on average for an unsuccessful lookup, and hashing with open addressing requires Θ(1/(1 − n/k)).[24] Both these bounds are constant, if we maintain n/k < c using table resizing, where c is a fixed constant less than 1.

3.3.6 Features

Advantages

The main advantage of hash tables over other table data structures is speed. This advantage is more apparent when the number of entries is large. Hash tables are particularly efficient when the maximum number of entries can be predicted in advance, so that the bucket array can be allocated once with the optimum size and never resized.

If the set of key-value pairs is fixed and known ahead of time (so insertions and deletions are not allowed), one may reduce the average lookup cost by a careful choice of the hash function, bucket table size, and internal data structures. In particular, one may be able to devise a hash function that is collision-free, or even perfect. In this case the keys need not be stored in the table.
significantly higher than the inner loop of the lookup algorithm for a sequential list or search tree. Thus hash tables are not effective when the number of entries is very small. (However, in some cases the high cost of computing the hash function can be mitigated by saving the hash value together with the key.)

For certain string processing applications, such as spell-checking, hash tables may be less efficient than tries, finite automata, or Judy arrays. Also, if there are not too many possible keys to store—that is, if each key can be represented by a small enough number of bits—then, instead of a hash table, one may use the key directly as the index into an array of values. Note that there are no collisions in this case.

The entries stored in a hash table can be enumerated efficiently (at constant cost per entry), but only in some pseudo-random order. Therefore, there is no efficient way to locate an entry whose key is nearest to a given key. Listing all n entries in some specific order generally requires a separate sorting step, whose cost is proportional to log(n) per entry. In comparison, ordered search trees have lookup and insertion cost proportional to log(n), but allow finding the nearest key at about the same cost, and ordered enumeration of all entries at constant cost per entry.

If the keys are not stored (because the hash function is collision-free), there may be no easy way to enumerate the keys that are present in the table at any given moment.

Although the average cost per operation is constant and fairly small, the cost of a single operation may be quite high. In particular, if the hash table uses dynamic resizing, an insertion or deletion operation may occasionally take time proportional to the number of entries. This may be a serious drawback in real-time or interactive applications.

Hash tables in general exhibit poor locality of reference—that is, the data to be accessed is distributed seemingly at random in memory. Because hash tables cause access patterns that jump around, this can trigger microprocessor cache misses that cause long delays. Compact data structures such as arrays searched with linear search may be faster, if the table is relatively small and keys are compact. The optimal performance point varies from system to system.

Hash tables become quite inefficient when there are many collisions. While extremely uneven hash distributions are extremely unlikely to arise by chance, a malicious adversary with knowledge of the hash function may be able to supply information to a hash that creates worst-case behavior by causing excessive collisions, resulting in very poor performance, e.g., a denial of service attack.[25][26][27] In critical applications, a data structure with better worst-case guarantees can be used; however, universal hashing—a randomized algorithm that prevents the attacker from predicting which inputs cause worst-case behavior—may be preferable.[28] The hash function used by the hash table in the Linux routing table cache was changed with Linux version 2.4.2 as a countermeasure against such attacks.[29]

3.3.7 Uses

Associative arrays

Main article: associative array

Hash tables are commonly used to implement many types of in-memory tables. They are used to implement associative arrays (arrays whose indices are arbitrary strings or other complicated objects), especially in interpreted programming languages like Perl, Ruby, Python, and PHP.

When storing a new item into a multimap and a hash collision occurs, the multimap unconditionally stores both items.

When storing a new item into a typical associative array and a hash collision occurs, but the actual keys themselves are different, the associative array likewise stores both items. However, if the key of the new item exactly matches the key of an old item, the associative array typically erases the old item and overwrites it with the new item, so every item in the table has a unique key.

Database indexing

Hash tables may also be used as disk-based data structures and database indices (such as in dbm) although B-trees are more popular in these applications. In multi-node database systems, hash tables are commonly used to distribute rows amongst nodes, reducing network traffic for hash joins.

Caches

Main article: cache (computing)

Hash tables can be used to implement caches, auxiliary data tables that are used to speed up the access to data that is primarily stored in slower media. In this application, hash collisions can be handled by discarding one of the two colliding entries—usually erasing the old item that is currently stored in the table and overwriting it with the new item, so every item in the table has a unique hash value.

Sets

Besides recovering the entry that has a given key, many hash table implementations can also tell whether such an entry exists or not.
Those structures can therefore be used to implement a set data structure, which merely records whether a given key belongs to a specified set of keys. In this case, the structure can be simplified by eliminating all parts that have to do with the entry values. Hashing can be used to implement both static and dynamic sets.

Object representation

Several dynamic languages, such as Perl, Python, JavaScript, Lua, and Ruby, use hash tables to implement objects. In this representation, the keys are the names of the members and methods of the object, and the values are pointers to the corresponding member or method.

Unique data representation

Main article: String interning

Hash tables can be used by some programs to avoid creating multiple character strings with the same contents. For that purpose, all strings in use by the program are stored in a single string pool implemented as a hash table, which is checked whenever a new string has to be created. This technique was introduced in Lisp interpreters under the name hash consing, and can be used with many other kinds of data (expression trees in a symbolic algebra system, records in a database, files in a file system, binary decision diagrams, etc.).

Transposition table

Main article: Transposition table

3.3.8 Implementations

In programming languages

Many programming languages provide hash table functionality, either as built-in associative arrays or as standard library modules. In C++11, for example, the unordered_map class provides hash tables for keys and values of arbitrary type.

The Java programming language (including the variant which is used on Android) includes the HashSet, HashMap, LinkedHashSet, and LinkedHashMap generic collections.[30]

In PHP 5, the Zend 2 engine uses one of the hash functions from Daniel J. Bernstein to generate the hash values used in managing the mappings of data pointers stored in a hash table. In the PHP source code, it is labelled as DJBX33A (Daniel J. Bernstein, Times 33 with Addition).

Python's built-in hash table implementation, in the form of the dict type, as well as Perl's hash type (%) are used internally to implement namespaces and therefore need to pay more attention to security, i.e., collision attacks. Python sets also use hashes internally, for fast lookup (though they store only keys, not values).[31]

In the .NET Framework, support for hash tables is provided via the non-generic Hashtable and generic Dictionary classes, which store key-value pairs, and the generic HashSet class, which stores only values.

In Rust's standard library, the generic HashMap and HashSet structs use linear probing with Robin Hood bucket stealing.

3.3.9 History

The idea of hashing arose independently in different places. In January 1953, H. P. Luhn wrote an internal IBM memorandum that used hashing with chaining.[32] Gene Amdahl, Elaine M. McGraw, Nathaniel Rochester, and Arthur Samuel implemented a program using hashing at about the same time. Open addressing with linear probing (relatively prime stepping) is credited to Amdahl, but Ershov (in Russia) had the same idea.[32]

3.3.10 See also

• Rabin–Karp string search algorithm

• Stable hashing

• Consistent hashing

• Extendible hashing

• Lazy deletion

• Pearson hashing

• PhotoDNA

• Search data structure

Related data structures

There are several data structures that use hash functions but cannot be considered special cases of hash tables:

• Bloom filter, memory efficient data-structure designed for constant-time approximate lookups; uses hash function(s) and can be seen as an approximate hash table.

• Distributed hash table (DHT), a resilient dynamic table spread over several nodes of a network.

• Hash array mapped trie, a trie structure, similar to the array mapped trie, but where each key is hashed first.
3.3.11 References

[1] Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2009). Introduction to Algorithms (3rd ed.). Massachusetts Institute of Technology. pp. 253–280. ISBN 978-0-262-03384-8.

[2] Charles E. Leiserson, Amortized Algorithms, Table Doubling, Potential Method. Lecture 13, course MIT 6.046J/18.410J Introduction to Algorithms—Fall 2005.

[3] Knuth, Donald (1998). The Art of Computer Programming. 3: Sorting and Searching (2nd ed.). Addison-Wesley. pp. 513–558. ISBN 0-201-89685-0.

[4] Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2001). “Chapter 11: Hash Tables”. Introduction to Algorithms (2nd ed.). MIT Press and McGraw-Hill. pp. 221–252. ISBN 978-0-262-53196-2.

[5] Pearson, Karl (1900). “On the criterion that a given system of deviations from the probable in the case of a correlated system of variables is such that it can be reasonably supposed to have arisen from random sampling”. Philosophical Magazine, Series 5. 50 (302). pp. 157–175. doi:10.1080/14786440009463897.

[6] Plackett, Robin (1983). “Karl Pearson and the Chi-Squared Test”. International Statistical Review (International Statistical Institute (ISI)). 51 (1). pp. 59–72. doi:10.2307/1402731.

[7] Wang, Thomas (March 1997). “Prime Double Hash Table”. Archived from the original on 1999-09-03. Retrieved 2015-05-10.

[8] Probst, Mark (2010-04-30). “Linear vs Binary Search”. Retrieved 2016-11-20.

[9] “How does a HashMap work in JAVA”. coding-geek.com.

[10] Askitis, Nikolas; Zobel, Justin (October 2005). Cache-conscious Collision Resolution in String Hash Tables. Proceedings of the 12th International Conference, String Processing and Information Retrieval (SPIRE 2005). 3772/2005. pp. 91–102. doi:10.1007/11575832_11. ISBN 978-3-540-29740-6.

[11] Askitis, Nikolas; Sinha, Ranjan (2010). “Engineering scalable, cache and space efficient tries for strings”. The VLDB Journal. 17 (5): 633–660. doi:10.1007/s00778-010-0183-9. ISSN 1066-8888.

[12] Askitis, Nikolas (2009). Fast and Compact Hash Tables for Integer Keys (PDF). Proceedings of the 32nd Australasian Computer Science Conference (ACSC 2009). 91. pp. 113–122. ISBN 978-1-920682-72-9.

[13] Erik Demaine, Jeff Lind. 6.897: Advanced Data Structures. MIT Computer Science and Artificial Intelligence Laboratory. Spring 2003. http://courses.csail.mit.edu/6.897/spring03/scribe_notes/L2/lecture2.pdf

[14] Willard, Dan E. (2000). “Examining computational geometry, van Emde Boas trees, and hashing from the perspective of the fusion tree”. SIAM Journal on Computing. 29 (3): 1030–1049. doi:10.1137/S0097539797322425. MR 1740562.

[15] Tenenbaum, Aaron M.; Langsam, Yedidyah; Augenstein, Moshe J. (1990). Data Structures Using C. Prentice Hall. pp. 456–461, p. 472. ISBN 0-13-199746-7.

[16] Herlihy, Maurice; Shavit, Nir; Tzafrir, Moran (2008). “Hopscotch Hashing”. DISC '08: Proceedings of the 22nd international symposium on Distributed Computing. Berlin, Heidelberg: Springer-Verlag. pp. 350–364.

[17] Celis, Pedro (1986). Robin Hood hashing (PDF) (Technical report). Computer Science Department, University of Waterloo. CS-86-14.

[18] Goossaert, Emmanuel (2013). “Robin Hood hashing”.

[19] Amble, Ole; Knuth, Don (1974). “Ordered hash tables”. Computer Journal. 17 (2): 135. doi:10.1093/comjnl/17.2.135.

[20] Viola, Alfredo (October 2005). “Exact distribution of individual displacements in linear probing hashing”. Transactions on Algorithms (TALG). ACM. 1 (2): 214–242. doi:10.1145/1103963.1103965.

[21] Celis, Pedro (March 1988). External Robin Hood Hashing (Technical report). Computer Science Department, Indiana University. TR246.

[22] http://www.eecs.harvard.edu/~michaelm/postscripts/handbook2001.pdf

[23] Litwin, Witold (1980). “Linear hashing: A new tool for file and table addressing”. Proc. 6th Conference on Very Large Databases. pp. 212–223.

[24] Doug Dunham. CS 4521 Lecture Notes. University of Minnesota Duluth. Theorems 11.2, 11.6. Last modified April 21, 2009.

[25] Alexander Klink and Julian Wälde’s Efficient Denial of Service Attacks on Web Application Platforms, December 28, 2011, 28th Chaos Communication Congress. Berlin, Germany.

[26] Mike Lennon. “Hash Table Vulnerability Enables Wide-Scale DDoS Attacks”. 2011.

[27] “Hardening Perl’s Hash Function”. November 6, 2013.

[28] Crosby and Wallach. Denial of Service via Algorithmic Complexity Attacks. quote: “modern universal hashing techniques can yield performance comparable to commonplace hash functions while being provably secure against these attacks.” “Universal hash functions ... are ... a solution suitable for adversarial environments. ... in production systems.”

[29] Bar-Yosef, Noa; Wool, Avishai (2007). Remote algorithmic complexity attacks against randomized hash tables. Proc. International Conference on Security and Cryptography (SECRYPT) (PDF). p. 124.

[30] https://docs.oracle.com/javase/tutorial/collections/implementations/index.html

[31] https://stackoverflow.com/questions/513882/python-list-vs-dict-for-look-up-table

[32] Mehta, Dinesh P.; Sahni, Sartaj. Handbook of Data Structures and Applications. p. 9-15. ISBN 1-58488-435-5.
3.3.12 Further reading

• Tamassia, Roberto; Goodrich, Michael T. (2006). “Chapter Nine: Maps and Dictionaries”. Data structures and algorithms in Java: [updated for Java 5.0] (4th ed.). Hoboken, NJ: Wiley. pp. 369–418. ISBN 0-471-73884-0.

• McKenzie, B. J.; Harries, R.; Bell, T. (Feb 1990). “Selecting a hashing algorithm”. Software Practice & Experience. 20 (2): 209–224. doi:10.1002/spe.4380200207.

3.3.13 External links

• A Hash Function for Hash Table Lookup by Bob Jenkins.

• Hash Tables by SparkNotes—explanation using C

• Hash functions by Paul Hsieh

• Design of Compact and Efficient Hash Tables for Java

• NIST entry on hash tables

• Lecture on Hash Tables

• Open Data Structures – Chapter 5 – Hash Tables

• MIT’s Introduction to Algorithms: Hashing 1 MIT OCW lecture Video

• MIT’s Introduction to Algorithms: Hashing 2 MIT OCW lecture Video

3.4 Linear probing

[Figure: a hash table in which the collision between John Smith and Sandra Dee (both hashing to cell 873) is resolved by placing Sandra Dee at the next free location, cell 874.]

Linear probing is a scheme in computer programming for resolving collisions in hash tables, data structures for maintaining a collection of key–value pairs and looking up the value associated with a given key. It was invented in 1954 by Gene Amdahl, Elaine M. McGraw, and Arthur Samuel and first analyzed in 1963 by Donald Knuth.

Along with quadratic probing and double hashing, linear probing is a form of open addressing. In these schemes, each cell of a hash table stores a single key–value pair. When the hash function causes a collision by mapping a new key to a cell of the hash table that is already occupied by another key, linear probing searches the table for the closest following free location and inserts the new key there. Lookups are performed in the same way, by searching the table sequentially starting at the position given by the hash function, until finding a cell with a matching key or an empty cell.

As Thorup & Zhang (2012) write, “Hash tables are the most commonly used nontrivial data structures, and the most popular implementation on standard hardware uses linear probing, which is both fast and simple.”[1] Linear probing can provide high performance because of its good locality of reference, but is more sensitive to the quality of its hash function than some other collision resolution schemes. It takes constant expected time per search, insertion, or deletion when implemented using a random hash function, a 5-independent hash function, or tabulation hashing. However, good results can be achieved in practice with other hash functions such as MurmurHash.[2]

3.4.1 Operations

Linear probing is a component of open addressing schemes for using a hash table to solve the dictionary problem. In the dictionary problem, a data structure should maintain a collection of key–value pairs subject to operations that insert or delete pairs from the collection or that search for the value associated with a given key. In open addressing solutions to this problem, the data structure is an array T (the hash table) whose cells T[i] (when nonempty) each store a single key–value pair. A hash function is used to map each key into the cell of T where that key should be stored, typically scrambling the keys so that keys with similar values are not placed near each other in the table. A hash collision occurs when the hash function maps a key into a cell that is already occupied by a different key. Linear probing is a strategy for resolving collisions, by placing the new key into the closest following empty cell.[3][4]

Search

To search for a given key x, the cells of T are examined, beginning with the cell at index h(x) (where h is the hash function) and continuing to the adjacent cells h(x) + 1, h(x) + 2, ..., until finding either an empty cell or a cell whose stored key is x. If a cell containing the key is found,
the search returns the value from that cell. Otherwise, if an empty cell is found, the key cannot be in the table, because it would have been placed in that cell in preference to any later cell that has not yet been searched. In this case, the search returns as its result that the key is not present in the dictionary.[3][4]

Insertion

To insert a key–value pair (x,v) into the table (possibly replacing any existing pair with the same key), the insertion algorithm follows the same sequence of cells that would be followed for a search, until finding either an empty cell or a cell whose stored key is x. The new key–value pair is then placed into that cell.[3][4]

If the insertion would cause the load factor of the table (its fraction of occupied cells) to grow above some preset threshold, the whole table may be replaced by a new table, larger by a constant factor, with a new hash function, as in a dynamic array. Setting this threshold close to zero and using a high growth rate for the table size leads to faster hash table operations but greater memory usage than threshold values close to one and low growth rates. A common choice would be to double the table size when the load factor would exceed 1/2, causing the load factor to stay between 1/4 and 1/2.[5]
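A minimal C sketch of the search and insertion procedures just described (illustrative, not from the article; it assumes nonnegative integer keys, a global table, and at least one free cell so that searches terminate):

#include <stddef.h>

#define TSIZE 1024                   /* table capacity (illustrative) */
enum { FREE = -1 };

static int keys[TSIZE];             /* every entry initialized to FREE */
static int vals[TSIZE];

static size_t hash(int key) { return (size_t)key % TSIZE; }

/* search: probe h(x), h(x)+1, ... until the key or an empty cell */
int lp_search(int key) {
    for (size_t i = hash(key); ; i = (i + 1) % TSIZE) {
        if (keys[i] == FREE) return -1;     /* empty cell: not present */
        if (keys[i] == key) return (int)i;  /* found */
    }
}

/* insert: follow the same probe sequence, replacing an equal key */
void lp_insert(int key, int val) {
    size_t i = hash(key);
    while (keys[i] != FREE && keys[i] != key)
        i = (i + 1) % TSIZE;                /* probe the next cell */
    keys[i] = key;
    vals[i] = val;
}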
Deletion

[Figure: when a key–value pair is deleted, it may be necessary to move another pair backwards into its cell, to prevent searches for the moved key from finding an empty cell.]

It is also possible to remove a key–value pair from the dictionary. However, it is not sufficient to do so by simply emptying its cell. This would affect searches for other keys that have a hash value earlier than the emptied cell, but that are stored in a position later than the emptied cell. The emptied cell would cause those searches to incorrectly report that the key is not present.

Instead, when a cell i is emptied, it is necessary to search forward through the following cells of the table until finding either another empty cell or a key that can be moved to cell i (that is, a key whose hash value is equal to or earlier than i). When an empty cell is found, then emptying cell i is safe and the deletion process terminates. But, when the search finds a key that can be moved to cell i, it performs this move. This has the effect of speeding up later searches for the moved key, but it also empties out another cell, later in the same block of occupied cells. The search for a movable key continues for the new emptied cell, in the same way, until it terminates by reaching a cell that was already empty. In this process of moving keys to earlier cells, each key is examined only once. Therefore, the time to complete the whole process is proportional to the length of the block of occupied cells containing the deleted key, matching the running time of the other hash table operations.[3]

Alternatively, it is possible to use a lazy deletion strategy in which a key–value pair is removed by replacing the value by a special flag value indicating a deleted key. However, these flag values will contribute to the load factor of the hash table. With this strategy, it may become necessary to clean the flag values out of the array and rehash all the remaining key–value pairs once too large a fraction of the array becomes occupied by deleted keys.[3][4]
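Continuing the earlier sketch, a hedged C version of the backward-shift deletion strategy might look as follows; the "movable" test checks whether a later key's home position permits it to fill the hole, with wrap-around taken into account:

/* delete with backward shifting, continuing the sketch above */
void lp_delete(int key) {
    int found = lp_search(key);
    if (found < 0) return;                   /* key not present */
    size_t i = (size_t)found;
    keys[i] = FREE;                          /* empty the cell i ... */
    for (size_t j = (i + 1) % TSIZE; keys[j] != FREE; j = (j + 1) % TSIZE) {
        size_t h = hash(keys[j]);
        /* keys[j] may move back to cell i unless its home h lies in the
           cyclic range (i, j], in which case moving would break lookups */
        int blocked = (i < j) ? (h > i && h <= j) : (h > i || h <= j);
        if (!blocked) {
            keys[i] = keys[j];               /* ... move a later key back */
            vals[i] = vals[j];
            keys[j] = FREE;                  /* which opens a new hole */
            i = j;                           /* continue from it */
        }
    }
}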
3.4.2 Properties

Linear probing provides good locality of reference, which causes it to require few uncached memory accesses per operation. Because of this, for low to moderate load factors, it can provide very high performance. However, compared to some other open addressing strategies, its performance degrades more quickly at high load factors because of primary clustering, a tendency for one collision to cause more nearby collisions.[3] Additionally, achieving good performance with this method requires a higher-quality hash function than for some other collision resolution schemes.[6] When used with low-quality hash functions that fail to eliminate nonuniformities in the input distribution, linear probing can be slower than other open-addressing strategies such as double hashing, which probes a sequence of cells whose separation is determined by a second hash function, or quadratic probing, where the size of each step varies depending on its position within the probe sequence.[7]

3.4.3 Analysis

Using linear probing, dictionary operations can be implemented in constant expected time. In other words, insert, remove and search operations can be implemented in O(1), as long as the load factor of the hash table is a constant strictly less than one.[8]

In more detail, the time for any particular operation (a search, insertion, or deletion) is proportional to the length of the contiguous block of occupied cells at which the operation starts. If all starting cells are equally likely, in a hash table with N cells, then a maximal block of k occupied cells will have probability k/N of containing the starting location of a search, and will take time O(k) whenever it is the starting location. Therefore, the expected time for an operation can be calculated as the product of these two terms, O(k²/N), summed over all of
the maximal blocks of contiguous cells in the table. A similar sum of squared block lengths gives the expected time bound for a random hash function (rather than for a random starting location into a specific state of the hash table), by summing over all the blocks that could exist (rather than the ones that actually exist in a given state of the table), and multiplying the term for each potential block by the probability that the block is actually occupied. That is, defining Block(i,k) to be the event that there is a maximal contiguous block of occupied cells of length k beginning at index i, the expected time per operation is

E[T] = O(1) + ∑_{i=1}^{N} ∑_{k=1}^{n} O(k²/N) Pr[Block(i,k)].

This formula can be simplified by replacing Block(i,k) by a simpler necessary condition Full(k), the event that at least k elements have hash values that lie within a block of cells of length k. After this replacement, the value within the sum no longer depends on i, and the 1/N factor cancels the N terms of the outer summation. These simplifications lead to the bound

E[T] ≤ O(1) + ∑_{k=1}^{n} O(k²) Pr[Full(k)].

But by the multiplicative form of the Chernoff bound, when the load factor is bounded away from one, the probability that a block of length k contains at least k hashed values is exponentially small as a function of k, causing this sum to be bounded by a constant independent of n.[3] It is also possible to perform the same analysis using Stirling's approximation instead of the Chernoff bound to estimate the probability that a block contains exactly k hashed values.[4][9]

In terms of the load factor α, the expected time for a successful search is O(1 + 1/(1 − α)), and the expected time for an unsuccessful search (or the insertion of a new key) is O(1 + 1/(1 − α)²).[10] For constant load factors, with high probability, the longest probe sequence (among the probe sequences for all keys stored in the table) has logarithmic length.[11]

3.4.4 Choice of hash function

Because linear probing is especially sensitive to unevenly distributed hash values,[7] it is important to combine it with a high-quality hash function that does not produce such irregularities.

The analysis above assumes that each key's hash is a random number independent of the hashes of all the other keys. This assumption is unrealistic for most applications of hashing. However, random or pseudorandom hash values may be used when hashing objects by their identity rather than by their value. For instance, this is done using linear probing by the IdentityHashMap class of the Java collections framework.[12] The hash value that this class associates with each object, its identityHashCode, is guaranteed to remain fixed for the lifetime of an object but is otherwise arbitrary.[13] Because the identityHashCode is constructed only once per object, and is not required to be related to the object's address or value, its construction may involve slower computations such as the call to a random or pseudorandom number generator. For instance, Java 8 uses an Xorshift pseudorandom number generator to construct these values.[14]

For most applications of hashing, it is necessary to compute the hash function for each value every time that it is hashed, rather than once when its object is created. In such applications, random or pseudorandom numbers cannot be used as hash values, because then different objects with the same value would have different hashes. And cryptographic hash functions (which are designed to be computationally indistinguishable from truly random functions) are usually too slow to be used in hash tables.[15] Instead, other methods for constructing hash functions have been devised. These methods compute the hash function quickly, and can be proven to work well with linear probing. In particular, linear probing has been analyzed from the framework of k-independent hashing, a class of hash functions that are initialized from a small random seed and that are equally likely to map any k-tuple of distinct keys to any k-tuple of indexes. The parameter k can be thought of as a measure of hash function quality: the larger k is, the more time it will take to compute the hash function but it will behave more similarly to completely random functions. For linear probing, 5-independence is enough to guarantee constant expected time per operation,[16] while some 4-independent hash functions perform badly, taking up to logarithmic time per operation.[6]

Another method of constructing hash functions with both high quality and practical speed is tabulation hashing. In this method, the hash value for a key is computed by using each byte of the key as an index into a table of random numbers (with a different table for each byte position). The numbers from those table cells are then combined by a bitwise exclusive or operation. Hash functions constructed this way are only 3-independent. Nevertheless, linear probing using these hash functions takes constant expected time per operation.[4][17] Both tabulation hashing and standard methods for generating 5-independent hash functions are limited to keys that have a fixed number of bits. To handle strings or other types of variable-length keys, it is possible to compose a simpler universal hashing technique that maps the keys to intermediate values and a higher quality (5-independent or tabulation) hash function that maps the intermediate values to hash table indices.[1][18]
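As a concrete (and hedged) illustration of tabulation hashing for 32-bit keys, with illustrative names and unseeded library randomness standing in for truly random table entries:

#include <stdint.h>
#include <stdlib.h>

/* Sketch of tabulation hashing: one 256-entry table of random words
   per key byte, combined with bitwise XOR. */
static uint32_t tab[4][256];

void tab_init(void) {            /* fill the tables with random words */
    for (int b = 0; b < 4; b++)
        for (int v = 0; v < 256; v++)
            tab[b][v] = ((uint32_t)rand() << 16) ^ (uint32_t)rand();
}

uint32_t tab_hash(uint32_t x) {
    return tab[0][x & 0xff] ^ tab[1][(x >> 8) & 0xff] ^
           tab[2][(x >> 16) & 0xff] ^ tab[3][(x >> 24) & 0xff];
}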
In an experimental comparison, Richter et al. found that the Multiply-Shift family of hash functions (defined
as h_z(x) = (x · z mod 2^w) ÷ 2^(w−d)) was “the fastest hash function when integrated with all hashing schemes, i.e., producing the highest throughputs and also of good quality” whereas tabulation hashing produced “the lowest throughput”.[2] They point out that each table look-up requires several cycles, being more expensive than simple arithmetic operations. They also found MurmurHash to be superior to tabulation hashing: “By studying the results provided by Mult and Murmur, we think that the trade-off for by tabulation (...) is less attractive in practice”.
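Written out in C for w = 64 and a d-bit table index, the Multiply-Shift scheme amounts to one multiplication and one shift; the constant below is an arbitrary odd example, not one prescribed by the study:

#include <stdint.h>

/* Multiply-Shift sketch: h_z(x) = (x * z mod 2^64) >> (64 - d),
   where z is a random odd 64-bit constant and d is the index width. */
enum { D = 16 };                                /* table has 2^16 buckets */
static const uint64_t Z = 0x9e3779b97f4a7c15u;  /* example odd constant */

uint64_t multiply_shift(uint64_t x) {
    return (x * Z) >> (64 - D);   /* the mod 2^64 is implicit in C */
}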
3.4.5 History

The idea of an associative array that allows data to be accessed by its value rather than by its address dates back to the mid-1940s in the work of Konrad Zuse and Vannevar Bush,[19] but hash tables were not described until 1953, in an IBM memorandum by Hans Peter Luhn. Luhn used a different collision resolution method, chaining, rather than linear probing.[20]

Knuth (1963) summarizes the early history of linear probing. It was the first open addressing method, and was originally synonymous with open addressing. According to Knuth, it was first used by Gene Amdahl, Elaine M. McGraw (née Boehme), and Arthur Samuel in 1954, in an assembler program for the IBM 701 computer.[8] The first published description of linear probing is by Peterson (1957),[8] who also credits Samuel, Amdahl, and Boehme but adds that “the system is so natural, that it very likely may have been conceived independently by others either before or since that time”.[21] Another early publication of this method was by Soviet researcher Andrey Ershov, in 1958.[22]

The first theoretical analysis of linear probing, showing that it takes constant expected time per operation with random hash functions, was given by Knuth.[8] Sedgewick calls Knuth’s work “a landmark in the analysis of algorithms”.[10] Significant later developments include a more detailed analysis of the probability distribution of the running time,[23][24] and the proof that linear probing runs in constant time per operation with practically usable hash functions rather than with the idealized random functions assumed by earlier analysis.[16][17]

3.4.6 References

[1] Thorup, Mikkel; Zhang, Yin (2012), “Tabulation-based 5-independent hashing with applications to linear probing and second moment estimation”, SIAM Journal on Computing, 41 (2): 293–331, doi:10.1137/100800774, MR 2914329.

[2] Richter, Stefan; Alvarez, Victor; Dittrich, Jens (2015), “A seven-dimensional analysis of hashing methods and its implications on query processing”, Proceedings of the VLDB Endowment, 9 (3): 293–331.

[3] Goodrich, Michael T.; Tamassia, Roberto (2015), “Section 6.3.3: Linear Probing”, Algorithm Design and Applications, Wiley, pp. 200–203.

[4] Morin, Pat (February 22, 2014), “Section 5.2: LinearHashTable: Linear Probing”, Open Data Structures (in pseudocode) (0.1Gβ ed.), pp. 108–116, retrieved 2016-01-15.

[5] Sedgewick, Robert; Wayne, Kevin (2011), Algorithms (4th ed.), Addison-Wesley Professional, p. 471, ISBN 9780321573513. Sedgewick and Wayne also halve the table size when a deletion would cause the load factor to become too low, causing them to use a wider range [1/8,1/2] in the possible values of the load factor.

[6] Pătraşcu, Mihai; Thorup, Mikkel (2010), “On the k-independence required by linear probing and minwise independence” (PDF), Automata, Languages and Programming, 37th International Colloquium, ICALP 2010, Bordeaux, France, July 6–10, 2010, Proceedings, Part I, Lecture Notes in Computer Science, 6198, Springer, pp. 715–726, doi:10.1007/978-3-642-14165-2_60.

[7] Heileman, Gregory L.; Luo, Wenbin (2005), “How caching affects hashing” (PDF), Seventh Workshop on Algorithm Engineering and Experiments (ALENEX 2005), pp. 141–154.

[8] Knuth, Donald (1963), Notes on “Open” Addressing.

[9] Eppstein, David (October 13, 2011), “Linear probing made easy”, 0xDE.

[10] Sedgewick, Robert (2003), “Section 14.3: Linear Probing”, Algorithms in Java, Parts 1–4: Fundamentals, Data Structures, Sorting, Searching (3rd ed.), Addison Wesley, pp. 615–620, ISBN 9780321623973.

[11] Pittel, B. (1987), “Linear probing: the probable largest search time grows logarithmically with the number of records”, Journal of Algorithms, 8 (2): 236–249, doi:10.1016/0196-6774(87)90040-X, MR 890874.

[12] “IdentityHashMap”, Java SE 7 Documentation, Oracle, retrieved 2016-01-15.

[13] Friesen, Jeff (2012), Beginning Java 7, Expert’s voice in Java, Apress, p. 376, ISBN 9781430239109.

[14] Kabutz, Heinz M. (September 9, 2014), “Identity Crisis”, The Java Specialists’ Newsletter, 222.

[15] Weiss, Mark Allen (2014), “Chapter 3: Data Structures”, in Gonzalez, Teofilo; Diaz-Herrera, Jorge; Tucker, Allen, Computing Handbook, 1 (3rd ed.), CRC Press, p. 3-11, ISBN 9781439898536.

[16] Pagh, Anna; Pagh, Rasmus; Ružić, Milan (2009), “Linear probing with constant independence”, SIAM Journal on Computing, 39 (3): 1107–1120, doi:10.1137/070702278, MR 2538852.

[17] Pătraşcu, Mihai; Thorup, Mikkel (2011), “The power of simple tabulation hashing”, Proceedings of the 43rd annual ACM Symposium on Theory of Computing (STOC '11), pp. 1–10, arXiv:1011.5200, doi:10.1145/1993636.1993638.
[18] Thorup, Mikkel (2009), “String hashing for linear probing”, Proceedings of the Twentieth Annual ACM-SIAM Symposium on Discrete Algorithms, Philadelphia, PA: SIAM, pp. 655–664, doi:10.1137/1.9781611973068.72, MR 2809270.

[19] Parhami, Behrooz (2006), Introduction to Parallel Processing: Algorithms and Architectures, Series in Computer Science, Springer, 4.1 Development of early models, p. 67, ISBN 9780306469640.

[20] Morin, Pat (2004), “Hash tables”, in Mehta, Dinesh P.; Sahni, Sartaj, Handbook of Data Structures and Applications, Chapman & Hall / CRC, p. 9-15, ISBN 9781420035179.

[21] Peterson, W. W. (April 1957), “Addressing for random-access storage”, IBM Journal of Research and Development, Riverton, NJ, USA: IBM Corp., 1 (2): 130–146, doi:10.1147/rd.12.0130.

[22] Ershov, A. P. (1958), “On Programming of Arithmetic Operations”, Communications of the ACM, 1 (8): 3–6, doi:10.1145/368892.368907. Translated from Doklady AN USSR 118 (3): 427–430, 1958, by Morris D. Friedman. Linear probing is described as algorithm A2.

[23] Flajolet, P.; Poblete, P.; Viola, A. (1998), “On the analysis of linear probing hashing”, Algorithmica, 22 (4): 490–515, doi:10.1007/PL00009236, MR 1701625.

[24] Knuth, D. E. (1998), “Linear probing and graphs”, Algorithmica, 22 (4): 561–568, doi:10.1007/PL00009240, MR 1701629.

3.5 Quadratic probing

Quadratic probing is an open addressing scheme in computer programming for resolving collisions in hash tables—when an incoming data’s hash value indicates it should be stored in an already-occupied slot or bucket. Quadratic probing operates by taking the original hash index and adding successive values of an arbitrary quadratic polynomial until an open slot is found.

For a given hash value, the indices generated by linear probing are as follows:

H + 1, H + 2, H + 3, H + 4, ..., H + k

This method results in primary clustering, and as the cluster grows larger, the search for those items hashing within the cluster becomes less efficient.

An example sequence using quadratic probing is:

H + 1², H + 2², H + 3², H + 4², ..., H + k²

Quadratic probing can be a more efficient algorithm in a closed hash table, since it better avoids the clustering problem that can occur with linear probing, although it is not immune. It also provides good memory caching because it preserves some locality of reference; however, linear probing has greater locality and, thus, better cache performance.

Quadratic probing is used in the Berkeley Fast File System to allocate free blocks. The allocation routine chooses a new cylinder-group when the current one is nearly full using quadratic probing, because of the speed it shows in finding unused cylinder-groups.

3.5.1 Quadratic function

Let h(k) be a hash function that maps an element k to an integer in [0, m−1], where m is the size of the table. Let the ith probe position for a value k be given by the function

h(k, i) = (h(k) + c1·i + c2·i²) (mod m)

where c2 ≠ 0. If c2 = 0, then h(k,i) degrades to a linear probe. For a given hash table, the values of c1 and c2 remain constant.

Examples:

• If h(k, i) = (h(k) + i + i²) (mod m), then the probe sequence will be h(k), h(k) + 2, h(k) + 6, ...

• For m = 2ⁿ, a good choice for the constants is c1 = c2 = 1/2, as the values of h(k,i) for i in [0, m−1] are all distinct (see the sketch after this list). This leads to a probe sequence of h(k), h(k) + 1, h(k) + 3, h(k) + 6, ... where the values increase by 1, 2, 3, ...

• For prime m > 2, most choices of c1 and c2 will make h(k,i) distinct for i in [0, (m−1)/2]. Such choices include c1 = c2 = 1/2, c1 = c2 = 1, and c1 = 0, c2 = 1. Because there are only about m/2 distinct probes for a given element, it is difficult to guarantee that insertions will succeed when the load factor is > 1/2.
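For a power-of-two table size, the c1 = c2 = 1/2 case can be implemented with pure integer arithmetic by adding the increments 1, 2, 3, ... between probes. A hedged sketch with illustrative names:

#define M 16                       /* table size: must be a power of two */
static int slot_used[M];           /* nonzero if the slot is occupied */

/* Probes h, h+1, h+3, h+6, ... (triangular-number offsets); for a
   power-of-two M this sequence visits every cell exactly once. */
int find_free_slot(unsigned h) {
    unsigned i = h % M;
    for (unsigned step = 1; step <= M; step++) {
        if (!slot_used[i]) return (int)i;
        i = (i + step) % M;        /* the increment grows by 1 each probe */
    }
    return -1;                     /* table is full */
}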
3.5.2 Quadratic probing insertion

The problem here is to insert a key at an available key space in a given hash table using quadratic probing.[1]

Algorithm to insert key in hash table

1. Get the key k
2. Set counter j = 0
3. Compute hash function h[k] = k % SIZE
4. If hashtable[h[k]] is empty
   (4.1) Insert key k at hashtable[h[k]]
   (4.2) Stop
   Else
   (4.3) The key space at hashtable[h[k]] is occupied, so we need to find the next available key space
   (4.4) Increment j
   (4.5) Compute new hash function h[k] = ( k + j * j ) % SIZE
   (4.6) Repeat Step 4 till j is equal to the SIZE of the hash table
5. The hash table is full
6. Stop

C function for key insertion
int quadratic_probing_insert(int *hashtable, int key, int *empty)
{
    /* hashtable[] is an integer hash table; empty[] is another array
       which indicates whether the key space is occupied.  If an empty
       key space is found, the function returns the index of the bucket
       where the key is inserted; otherwise it returns -1 if no empty
       key space is found. */
    int i, index;
    for (i = 0; i < SIZE; i++) {
        index = (key + i * i) % SIZE;
        if (empty[index]) {          /* found a free bucket */
            hashtable[index] = key;
            empty[index] = 0;        /* mark it as occupied */
            return index;
        }
    }
    return -1;                       /* no free bucket was found */
}
3.5.3 Quadratic probing search

Algorithm to search element in hash table

1. Get the key k to be searched
2. Set counter j = 0
3. Compute hash function h[k] = k % SIZE
4. If the key space at hashtable[h[k]] is occupied
   (4.1) Compare the element at hashtable[h[k]] with the key k
   (4.2) If they are equal
      (4.2.1) The key is found at the bucket h[k]
      (4.2.2) Stop
   Else
   (4.3) The element might be placed at the next location given by the quadratic function
   (4.4) Increment j
   (4.5) Set h[k] = ( k + (j * j) ) % SIZE, so that we can probe the bucket at a new slot, h[k]
   (4.6) Repeat Step 4 till j is greater than the SIZE of the hash table
5. The key was not found in the hash table
6. Stop

C function for key searching

int quadratic_probing_search(int *hashtable, int key, int *empty)
{
    /* If the key is found in the hash table, the function returns the
       index of the bucket where the key resides; otherwise it
       returns -1 if the key is not found. */
    int i, index;
    for (i = 0; i < SIZE; i++) {
        index = (key + i * i) % SIZE;
        if (!empty[index] && hashtable[index] == key)
            return index;
    }
    return -1;
}

3.5.4 Limitations

For linear probing it is a bad idea to let the hash table get nearly full, because performance is degraded as the hash table gets filled. In the case of quadratic probing, the situation is even more drastic. With the exception of the triangular number case for a power-of-two-sized hash table, there is no guarantee of finding an empty cell once the table gets more than half full, or even before the table gets half full if the table size is not prime. This is because at most half of the table can be used as alternative locations to resolve collisions.

If the hash table size is b (a prime greater than 3), it can be proven that the first b/2 alternative locations including the initial location h(k) are all distinct and unique. Suppose two of the alternative locations are given by h(k) + x² (mod b) and h(k) + y² (mod b), where 0 ≤ x, y ≤ (b / 2). If these two locations point to the same key space, but x ≠ y, then the following would have to be true:

h(k) + x² = h(k) + y² (mod b)
x² = y² (mod b)
x² − y² = 0 (mod b)
(x − y)(x + y) = 0 (mod b)

As b (the table size) is a prime greater than 3, either (x − y) or (x + y) has to be equal to zero. Since x and y are unique, (x − y) cannot be zero. Also, since 0 ≤ x, y ≤ (b / 2), (x + y) cannot be zero.

Thus, by contradiction, it can be said that the first (b / 2) alternative locations after h(k) are unique. So an empty key space can always be found as long as at most (b / 2) locations are filled, i.e., the hash table is not more than half full.

Alternating sign

If the sign of the offset is alternated (e.g. +1, −4, +9, −16, etc.), and if the number of buckets is a prime number p congruent to 3 modulo 4 (i.e., one of 3, 7, 11, 19, 23, 31 and so on), then the first p offsets will be unique modulo p.

In other words, a permutation of 0 through p−1 is obtained, and, consequently, a free bucket will always be found as long as there exists at least one.

The insertion algorithm only receives a minor modification (but do note that SIZE has to be a suitable prime number as explained above):

1. Get the key k
2. Set counter j = 0
3. Compute hash function h[k] = k % SIZE
4. If hashtable[h[k]] is empty
   (4.1) Insert key k at hashtable[h[k]]
   (4.2) Stop
   Else
   (4.3) The key space at hashtable[h[k]] is occupied, so we need to find the next available key space
   (4.4) Increment j
   (4.5) Compute new hash function h[k]. If j is odd, then h[k] = ( k + j * j ) % SIZE, else h[k] = ( k - j * j ) % SIZE
   (4.6) Repeat Step 4 till j is equal to the SIZE of the hash table
5. The hash table is full
6. Stop

The search algorithm is modified likewise.
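A hedged C version of the alternating-sign variant, mirroring quadratic_probing_insert above (SIZE must be a prime congruent to 3 modulo 4 for the full-table guarantee, and keys are assumed nonnegative):

int alternating_probing_insert(int *hashtable, int key, int *empty)
{
    /* Sketch only: offsets 0, +1, -4, +9, -16, ... as described above;
       same hashtable[]/empty[] convention as quadratic_probing_insert. */
    int j, index;
    for (j = 0; j < SIZE; j++) {
        long off = (long)j * j;
        if (j % 2 == 0) off = -off;          /* even j gets the minus sign */
        index = (int)((((key + off) % SIZE) + SIZE) % SIZE); /* keep nonnegative */
        if (empty[index]) {
            hashtable[index] = key;
            empty[index] = 0;
            return index;
        }
    }
    return -1;                               /* table is full */
}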

3.5.5 See also

• Hash tables[2]

• Hash collision

• Double hashing

• Linear probing

• Hash function

3.5.6 References

[1] Horowitz, Sahni, Anderson-Freed (2011). Fundamentals of Data Structures in C. University Press. ISBN 978-81-7371-605-8.

[2] Weiss, Mark Allen (2009). Data Structures and Algorithm Analysis in C++. Pearson Education. ISBN 978-81-317-1474-4.
3.5.7 External links

• Tutorial/quadratic probing

3.6 Double hashing

Double hashing is a computer programming technique used in hash tables to resolve hash collisions, in cases when two different values to be searched for produce the same hash key. It is a popular collision-resolution technique in open-addressed hash tables. Double hashing is implemented in many popular libraries.

Like linear probing, it uses one hash value as a starting point and then repeatedly steps forward an interval until the desired value is located, an empty location is reached, or the entire table has been searched; but this interval is decided using a second, independent hash function (hence the name double hashing). Unlike linear probing and quadratic probing, the interval depends on the data, so that even values mapping to the same location have different bucket sequences; this minimizes repeated collisions and the effects of clustering.

Given two randomly, uniformly, and independently selected hash functions h1 and h2, the ith location in the bucket sequence for value k in a hash table T is:

h(i, k) = (h1(k) + i · h2(k)) mod |T|

Generally, h1 and h2 are selected from a set of universal hash functions.
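In C, the probe sequence can be computed directly from the two hash functions. This hedged sketch uses a prime table size and keeps h2 nonzero (illustrative choices, echoing the nonzero requirement discussed below; keys are assumed nonnegative):

#include <stddef.h>

/* Double-hashing probe sequence sketch: |T| prime, h2 never zero. */
enum { TABLE_SIZE = 13 };

static size_t h1(int k) { return (size_t)k % TABLE_SIZE; }
static size_t h2(int k) { return 1 + (size_t)k % (TABLE_SIZE - 1); }

/* i-th location for key k: h(i,k) = (h1(k) + i*h2(k)) mod |T| */
size_t dh_probe(int k, size_t i) {
    return (h1(k) + i * h2(k)) % TABLE_SIZE;
}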

3.6.1 Classical applied data structure

Double hashing with open addressing is a classical data structure on a table T. Let n be the number of elements stored in T; then T's load factor is α = n/|T|.

Double hashing approximates uniform open address hashing. That is, start by randomly, uniformly and independently selecting two universal hash functions h1 and h2 to build a double hashing table T.

All elements are put in T by double hashing using h1 and h2. Given a key k, the (i + 1)-st hash location is computed by:

h(i, k) = (h1(k) + i · h2(k)) mod |T|

Let T have a fixed load factor α : 1 > α > 0. Bradford and Katehakis[1] showed the expected number of probes for an unsuccessful search in T, still using these initially chosen hash functions, is 1/(1 − α) regardless of the distribution of the inputs. More precisely, these two uniformly, randomly and independently chosen hash functions are chosen from a set of universal hash functions where pairwise independence suffices.

Previous results include: Guibas and Szemerédi[2] showed 1/(1 − α) holds for unsuccessful search for load factors α < 0.319. Also, Lueker and Molodowitch[3] showed this held assuming ideal randomized functions. Schmidt and Siegel[4] showed this with k-wise independent and uniform functions (for k = c log n, and a suitable constant c).

3.6.2 Implementation details for caching

Linear probing and, to a lesser extent, quadratic probing are able to take advantage of the data cache by accessing locations that are close together. Double hashing has, on average, larger intervals and is not able to achieve this advantage.

Like all other forms of open addressing, double hashing becomes linear as the hash table approaches maximum capacity. The only solution to this is to rehash to a larger size, as with all other open addressing schemes.

On top of that, it is possible for the secondary hash function to evaluate to zero. For example, if we choose k = 5 with the following function:

h2(k) = 5 − (k mod 7)

The resulting sequence will always remain at the initial hash value. One possible solution is to change the secondary hash function to:

h2(k) = (k mod 7) + 1

This ensures that the secondary hash function will always be nonzero.

3.6.3 See also

• Collision resolution in hash tables

• Hash function

• Linear probing

• Cuckoo hashing

3.6.4 Notes

[1] Bradford, Phillip G.; Katehakis, Michael N. (2007), “A probabilistic study on combinatorial expanders and hashing” (PDF), SIAM Journal on Computing, 37 (1): 83–111, doi:10.1137/S009753970444630X, MR 2306284.

[2] L. Guibas and E. Szemerédi: The Analysis of Double Hashing, Journal of Computer and System Sciences, 1978, 16, 226-274.

[3] G. S. Lueker and M. Molodowitch: More Analysis of Double Hashing, Combinatorica, 1993, 13(1), 83-96.

[4] J. P. Schmidt and A. Siegel: Double Hashing is Computable and Randomizable with Universal Hash Functions, manuscript.
3.6.5 External links

• How Caching Affects Hashing by Gregory L. Heileman and Wenbin Luo 2005.

• Hash Table Animation

• klib a C library that includes double hashing functionality.

3.7 Cuckoo hashing

Cuckoo hashing is a scheme in computer programming for resolving hash collisions of values of hash functions in a table, with worst-case constant lookup time. The name derives from the behavior of some species of cuckoo, where the cuckoo chick pushes the other eggs or young out of the nest when it hatches; analogously, inserting a new key into a cuckoo hashing table may push an older key to a different location in the table.

3.7.1 History

Cuckoo hashing was first described by Rasmus Pagh and Flemming Friche Rodler in 2001.[1]

3.7.2 Operation

Cuckoo hashing is a form of open addressing in which each non-empty cell of a hash table contains a key or key–value pair. A hash function is used to determine the location for each key, and its presence in the table (or the value associated with it) can be found by examining that cell of the table. However, open addressing suffers from collisions, which happen when more than one key is mapped to the same cell. The basic idea of cuckoo hashing is to resolve collisions by using two hash functions instead of only one. This provides two possible locations in the hash table for each key. In one of the commonly used variants of the algorithm, the hash table is split into two smaller tables of equal size, and each hash function provides an index into one of these two tables. It is also possible for both hash functions to provide indexes into a single table.

[Figure: cuckoo hashing example. The arrows show the alternative location of each key. A new item would be inserted in the location of A by moving A to its alternative location, currently occupied by B, and moving B to its alternative location which is currently vacant. Insertion of a new item in the location of H would not succeed: since H is part of a cycle (together with W), the new item would get kicked out again.]

Lookup requires inspection of just two locations in the hash table, which takes constant time in the worst case (see Big O notation). This is in contrast to many other hash table algorithms, which may not have a constant worst-case bound on the time to do a lookup. Deletions, also, may be performed by blanking the cell containing a key, in constant worst case time, more simply than some other schemes such as linear probing.

When a new key is inserted, and one of its two cells is empty, it may be placed in that cell. However, when both cells are already full, it will be necessary to move other keys to their second locations (or back to their first locations) to make room for the new key. A greedy algorithm is used: The new key is inserted in one of its two possible locations, “kicking out”, that is, displacing, any key that might already reside in this location. This displaced key is then inserted in its alternative location, again kicking out any key that might reside there. The process continues in the same way until an empty position is found,
completing the algorithm. However, it is possible for this 3.7.5 Variations


insertion process to fail, by entering an infinite loop or by
finding a very long chain (longer than a preset threshold
that is logarithmic in the table size). In this case, the hash Several variations of cuckoo hashing have been studied,
table is rebuilt in-place using new hash functions: primarily with the aim of improving its space usage by
increasing the load factor that it can tolerate to a num-
There is no need to allocate new tables for ber greater than the 50% threshold of the basic algorithm.
the rehashing: We may simply run through the Some of these methods can also be used to reduce the fail-
tables to delete and perform the usual insertion ure rate of cuckoo hashing, causing rebuilds of the data
procedure on all keys found not to be at their structure to be much less frequent.
intended position in the table. Generalizations of cuckoo hashing that use more than two
— Pagh & Rodler, “Cuckoo Hashing”[1] alternative hash functions can be expected to utilize a
larger part of the capacity of the hash table efficiently
while sacrificing some lookup and insertion speed. Us-
3.7.3 Theory ing just three hash functions increases the load to 91%.[3]
Another generalization of cuckoo hashing, called blocked
Insertions succeed in expected constant time, even con- cuckoo hashing consists in using more than one key per
[1]

sidering the possibility of having to rebuild the table, as bucket. Using [4]
just 2 keys per bucket permits a load factor
long as the number of keys is kept below half of the ca- above 80%.
pacity of the hash table, i.e., the load factor is below 50%. Another variation of cuckoo hashing that has been stud-
One method of proving this uses the theory of random ied is cuckoo hashing with a stash. The stash, in this data
graphs: one may form an undirected graph called the structure, is an array of a constant number of keys, used
“cuckoo graph” that has a vertex for each hash table lo- to store keys that cannot successfully be inserted into the
cation, and an edge for each hashed value, with the end- main hash table of the structure. This modification re-
points of the edge being the two possible locations of the duces the failure rate of cuckoo hashing to an inverse-
value. Then, the greedy insertion algorithm for adding a polynomial function with an exponent that can be made
set of values to a cuckoo hash table succeeds if and only if arbitrarily large by increasing the stash size. However,
the cuckoo graph for this set of values is a pseudoforest, larger stashes also mean slower searches for keys that are
a graph with at most one cycle in each of its connected not present or are in the stash. A stash can be used in
components. Any vertex-induced subgraph with more combination with more than two hash functions or with
edges than vertices corresponds to a set of keys for which blocked cuckoo hashing[5]to achieve both high load factors
there are an insufficient number of slots in the hash table. and small failure rates. The analysis of cuckoo hash-
When the hash function is chosen randomly, the cuckoo ing with a stash extends to practical hash functions, not
graph is a random graph in the Erdős–Rényi model. With just to the random hash function [6] model commonly used
high probability, for a random graph in which the ratio of in theoretical analysis of hashing.
the number of edges to the number of vertices is bounded Some people recommend a simplified generalization of
below 1/2, the graph is a pseudoforest and the cuckoo cuckoo hashing called skewed-associative cache in some
hashing algorithm succeeds in placing all keys. More- CPU caches.[7]
over, the same theory also proves that the expected size of
Another variation of a cuckoo hash table, called a cuckoo
a connected component of the cuckoo graph is small, en-
filter, replaces the stored keys of a cuckoo hash table with
suring that each insertion takes constant expected time.[2]
much shorter fingerprints, computed by applying another
hash function to the keys. In order to allow these fin-
3.7.4 Example

The following hash functions are given:

h(k) = k mod 11
h′(k) = ⌊k/11⌋ mod 11

Columns in the following two tables show the state of the hash tables over time as the elements are inserted.

Cycle

If you now wish to insert the element 6, then you get into a cycle. In the last row of the table we find the same initial situation as at the beginning again:

h(6) = 6 mod 11 = 6
h′(6) = ⌊6/11⌋ mod 11 = 0
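For concreteness, here is a minimal Python sketch of the insertion procedure using the two hash functions of this example. The max_loop bound and the rebuild-on-failure behaviour are assumptions standing in for the full cycle detection and rebuilding described in the theory section.

SIZE = 11
table1 = [None] * SIZE   # addressed by h
table2 = [None] * SIZE   # addressed by h'

def h(k):
    return k % 11

def h_prime(k):
    return (k // 11) % 11

def lookup(k):
    # a key can only ever live in one of its two candidate cells
    return table1[h(k)] == k or table2[h_prime(k)] == k

def insert(k, max_loop=50):
    # assumed bound: give up after max_loop displacements (likely a cycle)
    for _ in range(max_loop):
        pos = h(k)
        if table1[pos] is None:
            table1[pos] = k
            return True
        table1[pos], k = k, table1[pos]   # kick out the old occupant
        pos = h_prime(k)
        if table2[pos] is None:
            table2[pos] = k
            return True
        table2[pos], k = k, table2[pos]   # kick again and repeat
    return False  # caller must rebuild with new hash functions

Inserting 6 into a table whose candidate cells all lie on the cycle above keeps displacing the same keys until max_loop is exhausted, which is exactly the situation that forces a rebuild.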
3.7.5 Variations

Generalizations of cuckoo hashing that use more than two alternative hash functions can be expected to utilize a larger part of the capacity of the hash table efficiently while sacrificing some insertion and lookup speed. Using just three hash functions increases the load to 91%.[3] Another generalization of cuckoo hashing, called blocked cuckoo hashing, consists in using more than one key per bucket. Using just 2 keys per bucket permits a load factor above 80%.[4]

Another variation of cuckoo hashing that has been studied is cuckoo hashing with a stash. The stash, in this data structure, is an array of a constant number of keys, used to store keys that cannot successfully be inserted into the main hash table of the structure. This modification reduces the failure rate of cuckoo hashing to an inverse-polynomial function with an exponent that can be made arbitrarily large by increasing the stash size. However, larger stashes also mean slower searches for keys that are not present or are in the stash. A stash can be used in combination with more than two hash functions or with blocked cuckoo hashing to achieve both high load factors and small failure rates.[5] The analysis of cuckoo hashing with a stash extends to practical hash functions, not just to the random hash function model commonly used in theoretical analysis of hashing.[6]

A simplified generalization of cuckoo hashing, called skewed-associative cache, is used in some CPU caches.[7]

Another variation of a cuckoo hash table, called a cuckoo filter, replaces the stored keys of a cuckoo hash table with much shorter fingerprints, computed by applying another hash function to the keys. In order to allow these fingerprints to be moved around within the cuckoo filter, without knowing the keys that they came from, the two locations of each fingerprint may be computed from each other by a bitwise exclusive or operation with the fingerprint, or with a hash of the fingerprint. This data structure forms an approximate set membership data structure with much the same properties as a Bloom filter: it can store the members of a set of keys, and test whether a query key is a member, with some chance of false positives (queries that are incorrectly reported as being part of the set) but no false negatives. However, it improves on a Bloom filter in multiple respects: its memory usage is smaller by a constant factor, it has better locality of reference, and (unlike Bloom filters) it allows for fast deletion of set elements with no additional storage penalty.[8]
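A sketch of this partial-key trick in Python, with assumed parameters (bucket count, fingerprint width, mixing constant): because the alternate bucket is obtained by an XOR that is its own inverse, a fingerprint can be relocated without knowing the key it came from.

NUM_BUCKETS = 1 << 8          # power of two, so XOR stays in range

def fingerprint(key, bits=8):
    fp = (hash(key) >> 16) & ((1 << bits) - 1)
    return fp or 1            # reserve 0 to mean "empty" (assumed convention)

def bucket1(key):
    return hash(key) % NUM_BUCKETS

def alt_bucket(i, fp):
    # XOR with a hash of the fingerprint (mixing constant chosen
    # arbitrarily); applying the function twice returns the original bucket
    return i ^ ((fp * 0x5bd1e995) & (NUM_BUCKETS - 1))

i = bucket1("example key")
fp = fingerprint("example key")
j = alt_bucket(i, fp)
assert alt_bucket(j, fp) == i   # the two locations are symmetric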
3.7.6 Comparison with related structures

A study by Zukowski et al.[9] has shown that cuckoo hashing is much faster than chained hashing for small, cache-resident hash tables on modern processors. Kenneth Ross[10] has shown bucketized versions of cuckoo hashing (variants that use buckets that contain more than one key) to be faster than conventional methods also for large hash tables, when space utilization is high. The performance of the bucketized cuckoo hash table was investigated further by Askitis,[11] with its performance compared against alternative hashing schemes.

A survey by Mitzenmacher[3] presents open problems related to cuckoo hashing as of 2009.

3.7.7 See also

• Perfect hashing
• Linear probing
• Double hashing
• Hash collision
• Hash function
• Quadratic probing
• Hopscotch hashing

3.7.8 References

[1] Pagh, Rasmus; Rodler, Flemming Friche (2001). “Cuckoo Hashing”. Algorithms — ESA 2001. Lecture Notes in Computer Science. 2161. pp. 121–133. doi:10.1007/3-540-44676-1_10. ISBN 978-3-540-42493-2.

[2] Kutzelnigg, Reinhard (2006). Bipartite random graphs and cuckoo hashing (PDF). Fourth Colloquium on Mathematics and Computer Science. Discrete Mathematics and Theoretical Computer Science. AG. pp. 403–406.

[3] Mitzenmacher, Michael (2009-09-09). “Some Open Questions Related to Cuckoo Hashing | Proceedings of ESA 2009” (PDF). Retrieved 2010-11-10.

[4] Dietzfelbinger, Martin; Weidling, Christoph (2007), “Balanced allocation and dictionaries with tightly packed constant size bins”, Theoret. Comput. Sci., 380 (1–2): 47–68, doi:10.1016/j.tcs.2007.02.054, MR 2330641.

[5] Kirsch, Adam; Mitzenmacher, Michael D.; Wieder, Udi (2010), “More robust hashing: cuckoo hashing with a stash”, SIAM J. Comput., 39 (4): 1543–1561, doi:10.1137/080728743, MR 2580539.

[6] Aumüller, Martin; Dietzfelbinger, Martin; Woelfel, Philipp (2014), “Explicit and efficient hash families suffice for cuckoo hashing with a stash”, Algorithmica, 70 (3): 428–456, doi:10.1007/s00453-013-9840-x, MR 3247374.

[7] “Micro-Architecture”.

[8] Fan, Bin; Andersen, Dave G.; Kaminsky, Michael; Mitzenmacher, Michael D. (2014), “Cuckoo filter: Practically better than Bloom”, Proc. 10th ACM Int. Conf. Emerging Networking Experiments and Technologies (CoNEXT '14), pp. 75–88, doi:10.1145/2674005.2674994.

[9] Zukowski, Marcin; Heman, Sandor; Boncz, Peter (June 2006). “Architecture-Conscious Hashing” (PDF). Proceedings of the International Workshop on Data Management on New Hardware (DaMoN). Retrieved 2008-10-16.

[10] Ross, Kenneth (2006-11-08). “Efficient Hash Probes on Modern Processors” (PDF). IBM Research Report RC24100. Retrieved 2008-10-16.

[11] Askitis, Nikolas (2009). Fast and Compact Hash Tables for Integer Keys (PDF). Proceedings of the 32nd Australasian Computer Science Conference (ACSC 2009). 91. pp. 113–122. ISBN 978-1-920682-72-9.

3.7.9 External links

• A cool and practical alternative to traditional hash tables, U. Erlingsson, M. Manasse, F. McSherry, 2006.
• Cuckoo Hashing for Undergraduates, R. Pagh, 2006.
• Cuckoo Hashing, Theory and Practice (Part 1, Part 2 and Part 3), Michael Mitzenmacher, 2007.
• Naor, Moni; Segev, Gil; Wieder, Udi (2008). “History-Independent Cuckoo Hashing”. International Colloquium on Automata, Languages and Programming (ICALP). Reykjavik, Iceland. Retrieved 2008-07-21.
• Algorithmic Improvements for Fast Concurrent Cuckoo Hashing, X. Li, D. Andersen, M. Kaminsky, M. Freedman. EuroSys 2014.

Examples

• Concurrent high-performance Cuckoo hashtable written in C++
• Cuckoo hash map written in C++
• Static cuckoo hashtable generator for C/C++
• Generic Cuckoo hashmap in Java
• Cuckoo hash table written in Haskell
• Cuckoo hashing for Go
3.8 Hopscotch hashing

Hopscotch hashing is a scheme in computer programming for resolving hash collisions of values of hash functions in a table using open addressing. It is also well suited for implementing a concurrent hash table. Hopscotch hashing was introduced by Maurice Herlihy, Nir Shavit and Moran Tzafrir in 2008.[1] The name is derived from the sequence of hops that characterize the table's insertion algorithm.

The algorithm uses a single array of n buckets. For each bucket, its neighborhood is a small collection of nearby consecutive buckets (i.e., ones with indices close to the original hashed bucket). The desired property of the neighborhood is that the cost of finding an item in the buckets of the neighborhood is close to the cost of finding it in the bucket itself (for example, by having buckets in the neighborhood fall within the same cache line). The size of the neighborhood must be sufficient to accommodate a logarithmic number of items in the worst case (i.e., it must accommodate log(n) items), but only a constant number on average. If some bucket's neighborhood is filled, the table is resized.

In hopscotch hashing, as in cuckoo hashing, and unlike in linear probing, a given item will always be inserted into and found in the neighborhood of its hashed bucket. In other words, it will always be found either in its original hashed array entry, or in one of the next H−1 neighboring entries. H could, for example, be 32, a common machine word size. The neighborhood is thus a “virtual” bucket that has fixed size and overlaps with the next H−1 buckets. To speed the search, each bucket (array entry) includes a “hop-information” word, an H-bit bitmap that indicates which of the next H−1 entries contain items that hashed to the current entry's virtual bucket. In this way, an item can be found quickly by looking at the word to see which entries belong to the bucket, and then scanning through the constant number of entries (most modern processors support special bit manipulation operations that make the lookup in the “hop-information” bitmap very fast).

Here is how to add item x which was hashed to bucket i:

1. If the entry i is empty, add x to i and return.

2. Starting at entry i, use a linear probe to find an empty entry at index j.

3. If the empty entry's index j is within H−1 of entry i, place x there and return. Otherwise, entry j is too far from i. To create an empty entry closer to i, find an item y whose hash value lies between i and j, but within H−1 of j. Displacing y to j creates a new empty slot closer to i. Repeat until the empty entry is within H−1 of entry i, then place x there and return. If no such item y exists, or if the bucket i already contains H items, resize and rehash the table.

Figure: Hopscotch hashing. Here, H is 4. Gray entries are occupied. In part (a), the item x is added with a hash value of 6. A linear probe finds that entry 13 is empty. Because 13 is more than 4 entries away from 6, the algorithm looks for an earlier entry to swap with 13. The first place to look is H−1 = 3 entries before, at entry 10. That entry's hop information bit-map indicates that d, the item at entry 11, can be displaced to 13. After displacing d, entry 11 is still too far from entry 6, so the algorithm examines entry 8. The hop information bit-map indicates that item c at entry 9 can be moved to entry 11. Finally, a is moved to entry 9. Part (b) shows the table state just before adding x.
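The steps above can be condensed into a short sketch. The Python below is a simplified, non-concurrent illustration under stated assumptions (small fixed table, H = 4 for readability, no resizing): hop-information bitmaps are plain integers, and bit d of hops[b] means that the item stored at slot b+d hashed to bucket b.

H = 4  # neighborhood size; 32 is a typical production value

class HopscotchTable:
    def __init__(self, n=32):
        self.n = n
        self.slots = [None] * n      # the single array of buckets
        self.hops = [0] * n          # one H-bit bitmap per bucket

    def _bucket(self, x):
        return hash(x) % self.n

    def contains(self, x):
        i = self._bucket(x)
        for d in range(H):           # scan only the virtual bucket
            if (self.hops[i] >> d) & 1 and self.slots[(i + d) % self.n] == x:
                return True
        return False

    def add(self, x):
        i = self._bucket(x)
        # step 2: linear probe for an empty slot (assumes one exists)
        j = next(d for d in range(self.n)
                 if self.slots[(i + d) % self.n] is None)
        # step 3: walk the empty slot back towards bucket i
        while j >= H:
            for d in range(H - 1, 0, -1):      # earliest candidate first
                b = (i + j - d) % self.n       # bucket d slots before the hole
                bits = self.hops[b]
                e = next((e for e in range(d) if (bits >> e) & 1), None)
                if e is not None:
                    # displace the item y at slot b+e into the hole
                    self.slots[(i + j) % self.n] = self.slots[(b + e) % self.n]
                    self.slots[(b + e) % self.n] = None
                    self.hops[b] = (bits & ~(1 << e)) | (1 << d)
                    j = j - d + e              # the hole moved closer to i
                    break
            else:
                raise RuntimeError("resize and rehash needed")
        self.slots[(i + j) % self.n] = x
        self.hops[i] |= 1 << j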

The idea is that hopscotch hashing “moves the empty slot towards the desired bucket”. This distinguishes it from linear probing, which leaves the empty slot where it was found, possibly far away from the original bucket, and from cuckoo hashing which, in order to create a free bucket, moves an item out of one of the desired buckets in the target arrays, and only then tries to find the displaced item a new place.

To remove an item from the table, one simply removes it from the table entry. If the neighborhood buckets are cache aligned, then one could apply a reorganization operation in which items are moved into the now vacant location in order to improve alignment.

One advantage of hopscotch hashing is that it provides good performance at very high table load factors, even ones exceeding 0.9. Part of this efficiency is due to using a linear probe only to find an empty slot during insertion, not for every lookup as in the original linear probing hash table algorithm. Another advantage is that one can use any hash function, in particular simple ones that are close-to-universal.
3.8.1 See also

• Cuckoo hashing
• Hash collision
• Hash function
• Linear probing
• Open addressing
• Perfect hashing
• Quadratic probing

3.8.2 References

[1] Herlihy, Maurice; Shavit, Nir; Tzafrir, Moran (2008). “Hopscotch Hashing” (PDF). DISC '08: Proceedings of the 22nd international symposium on Distributed Computing. Arcachon, France: Springer-Verlag. pp. 350–364.

3.8.3 External links

• libhhash – a C hopscotch hashing implementation
• hopscotch-map – a C++ implementation of a hash map using hopscotch hashing

3.9 Hash function

This article is about a programming concept. For other meanings of “hash” and “hashing”, see Hash (disambiguation).

A hash function is any function that can be used to map data of arbitrary size to data of fixed size. The values returned by a hash function are called hash values, hash codes, digests, or simply hashes. One use is a data structure called a hash table, widely used in computer software for rapid data lookup. Hash functions accelerate table or database lookup by detecting duplicated records in a large file. An example is finding similar stretches in DNA sequences. They are also useful in cryptography. A cryptographic hash function allows one to easily verify that some input data maps to a given hash value, but if the input data is unknown, it is deliberately difficult to reconstruct it (or equivalent alternatives) by knowing the stored hash value. This is used for assuring integrity of transmitted data, and is the building block for HMACs, which provide message authentication.

Figure: A hash function that maps names to integers from 0 to 15. There is a collision between keys “John Smith” and “Sandra Dee”.

Hash functions are related to (and often confused with) checksums, check digits, fingerprints, lossy compression, randomization functions, error-correcting codes, and ciphers. Although these concepts overlap to some extent, each has its own uses and requirements and is designed and optimized differently. The HashKeeper database maintained by the American National Drug Intelligence Center, for instance, is more aptly described as a catalogue of file fingerprints than of hash values.

3.9.1 Uses

Hash tables

Hash functions are used in hash tables,[1] to quickly locate a data record (e.g., a dictionary definition) given its search key (the headword). Specifically, the hash function is used to map the search key to an index; the index gives the place in the hash table where the corresponding record should be stored. Hash tables, in turn, are used to implement associative arrays and dynamic sets.[2]

Typically, the domain of a hash function (the set of possible keys) is larger than its range (the number of different table indices), and so it will map several different keys to the same index. Therefore, each slot of a hash table is associated with (implicitly or explicitly) a set of records, rather than a single record. For this reason, each slot of a hash table is often called a bucket, and hash values are also called bucket indices.

Thus, the hash function only hints at the record's location — it tells where one should start looking for it. Still, in a half-full table, a good hash function will typically narrow the search down to only one or two entries.
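A toy Python sketch of this use: the hash function maps a search key to a bucket index, and only the records in that one bucket are compared directly (the names and numbers here are illustrative only).

NUM_BUCKETS = 16
buckets = [[] for _ in range(NUM_BUCKETS)]  # each slot holds a set of records

def bucket_index(key):
    return hash(key) % NUM_BUCKETS          # hash value -> table index

def store(key, value):
    buckets[bucket_index(key)].append((key, value))

def find(key):
    for k, v in buckets[bucket_index(key)]:  # scan one bucket only
        if k == key:
            return v
    return None

store("John Smith", "521-1234")
store("Sandra Dee", "521-9655")              # may land in the same bucket
print(find("Sandra Dee"))                    # -> 521-9655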
Caches

Hash functions are also used to build caches for large data sets stored in slow media. A cache is generally simpler than a hashed search table, since any collision can be resolved by discarding or writing back the older of the two colliding items. This is also used in file comparison.

Bloom filters

Main article: Bloom filter

Hash functions are an essential ingredient of the Bloom filter, a space-efficient probabilistic data structure that is used to test whether an element is a member of a set.
Finding duplicate records

Main article: Hash table

When storing records in a large unsorted file, one may use a hash function to map each record to an index into a table T, and to collect in each bucket T[i] a list of the numbers of all records with the same hash value i. Once the table is complete, any two duplicate records will end up in the same bucket. The duplicates can then be found by scanning every bucket T[i] which contains two or more members, fetching those records, and comparing them. With a table of appropriate size, this method is likely to be much faster than any alternative approach (such as sorting the file and comparing all consecutive pairs).
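A direct Python rendering of this procedure, with a list of records playing the role of the file and list indices as record numbers.

from collections import defaultdict
from itertools import combinations

def find_duplicates(records):
    T = defaultdict(list)              # T[i]: numbers of records hashing to i
    for num, rec in enumerate(records):
        T[hash(rec)].append(num)
    dups = []
    for nums in T.values():
        if len(nums) >= 2:             # only these buckets need scanning
            for a, b in combinations(nums, 2):
                if records[a] == records[b]:    # fetch and compare
                    dups.append((a, b))
    return dups

print(find_duplicates(["x", "y", "x", "z", "y"]))   # [(0, 2), (1, 4)]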
Protecting data

Main article: Security of cryptographic hash functions

A hash value can be used to uniquely identify secret information. This requires that the hash function is collision-resistant, which means that it is very hard to find data that will generate the same hash value. These functions are categorized into cryptographic hash functions and provably secure hash functions. Functions in the second category are the most secure but also too slow for most practical purposes. Collision resistance is accomplished in part by generating very large hash values. For example, SHA-1, one of the most widely used cryptographic hash functions, generates 160-bit values.

Finding similar records

Main article: Locality sensitive hashing

Hash functions can also be used to locate table records whose key is similar, but not identical, to a given key; or pairs of records in a large file which have similar keys. For that purpose, one needs a hash function that maps similar keys to hash values that differ by at most m, where m is a small integer (say, 1 or 2). If one builds a table T of all record numbers, using such a hash function, then similar records will end up in the same bucket, or in nearby buckets. Then one need only check the records in each bucket T[i] against those in buckets T[i+k] where k ranges between −m and m.

This class includes the so-called acoustic fingerprint algorithms, that are used to locate similar-sounding entries in large collections of audio files. For this application, the hash function must be as insensitive as possible to data capture or transmission errors, and to trivial changes such as timing and volume changes, compression, etc.[3]

Finding similar substrings

The same techniques can be used to find equal or similar stretches in a large collection of strings, such as a document repository or a genomic database. In this case, the input strings are broken into many small pieces, and a hash function is used to detect potentially equal pieces, as above.

The Rabin–Karp algorithm is a relatively fast string searching algorithm that works in O(n) time on average. It is based on the use of hashing to compare strings.

Geometric hashing

This principle is widely used in computer graphics, computational geometry and many other disciplines, to solve many proximity problems in the plane or in three-dimensional space, such as finding closest pairs in a set of points, similar shapes in a list of shapes, similar images in an image database, and so on. In these applications, the set of all inputs is some sort of metric space, and the hashing function can be interpreted as a partition of that space into a grid of cells. The table is often an array with two or more indices (called a grid file, grid index, bucket grid, and similar names), and the hash function returns an index tuple. This special case of hashing is known as geometric hashing or the grid method. Geometric hashing is also used in telecommunications (usually under the name vector quantization) to encode and compress multi-dimensional signals.

Standard uses of hashing in cryptography

Main article: Cryptographic hash function

Some standard applications that employ hash functions include authentication, message integrity (using an HMAC (Hashed MAC)), message fingerprinting, data corruption detection, and digital signature efficiency.

3.9.2 Properties

Good hash functions, in the original sense of the term, are usually required to satisfy certain properties listed below. The exact requirements are dependent on the application; for example, a hash function well suited to indexing data will probably be a poor choice for a cryptographic hash function.

Determinism

A hash procedure must be deterministic—meaning that for a given input value it must always generate the same hash value. In other words, it must be a function of the data to be hashed, in the mathematical sense of the term.
This requirement excludes hash functions that depend on external variable parameters, such as pseudo-random number generators or the time of day. It also excludes functions that depend on the memory address of the object being hashed, in cases where the address may change during execution (as may happen on systems that use certain methods of garbage collection), although sometimes rehashing of the item is possible.

The determinism is in the context of the reuse of the function. For example, Python adds the feature that hash functions make use of a randomized seed that is generated once when the Python process starts, in addition to the input to be hashed.[4] The Python hash is still a valid hash function when used within a single run. But if the values are persisted (for example, written to disk) they can no longer be treated as valid hash values, since in the next run the random value might differ.

Uniformity

A good hash function should map the expected inputs as evenly as possible over its output range. That is, every hash value in the output range should be generated with roughly the same probability. The reason for this last requirement is that the cost of hashing-based methods goes up sharply as the number of collisions—pairs of inputs that are mapped to the same hash value—increases. If some hash values are more likely to occur than others, a larger fraction of the lookup operations will have to search through a larger set of colliding table entries.

Note that this criterion only requires the value to be uniformly distributed, not random in any sense. A good randomizing function is (barring computational efficiency concerns) generally a good choice as a hash function, but the converse need not be true.

Hash tables often contain only a small subset of the valid inputs. For instance, a club membership list may contain only a hundred or so member names, out of the very large set of all possible names. In these cases, the uniformity criterion should hold for almost all typical subsets of entries that may be found in the table, not just for the global set of all possible entries.

In other words, if a typical set of m records is hashed to n table slots, the probability of a bucket receiving many more than m/n records should be vanishingly small. In particular, if m is less than n, very few buckets should have more than one or two records. (In an ideal "perfect hash function", no bucket should have more than one record; but a small number of collisions is virtually inevitable, even if n is much larger than m – see the birthday paradox).

When testing a hash function, the uniformity of the distribution of hash values can be evaluated by the chi-squared test.

Defined range

It is often desirable that the output of a hash function have fixed size (but see below). If, for example, the output is constrained to 32-bit integer values, the hash values can be used to index into an array. Such hashing is commonly used to accelerate data searches.[5] On the other hand, cryptographic hash functions produce much larger hash values, in order to ensure the computational complexity of brute-force inversion.[2] For example, SHA-1, one of the most widely used cryptographic hash functions, produces a 160-bit value.

Producing fixed-length output from variable-length input can be accomplished by breaking the input data into chunks of specific size. Hash functions used for data searches use some arithmetic expression which iteratively processes chunks of the input (such as the characters in a string) to produce the hash value.[5] In cryptographic hash functions, these chunks are processed by a one-way compression function, with the last chunk being padded if necessary. In this case, their size, which is called block size, is much bigger than the size of the hash value.[2] For example, in SHA-1, the hash value is 160 bits and the block size 512 bits.

Variable range In many applications, the range of hash values may be different for each run of the program, or may change along the same run (for instance, when a hash table needs to be expanded). In those situations, one needs a hash function which takes two parameters—the input data z, and the number n of allowed hash values.

A common solution is to compute a fixed hash function with a very large range (say, 0 to 2^32 − 1), divide the result by n, and use the division's remainder. If n is itself a power of 2, this can be done by bit masking and bit shifting. When this approach is used, the hash function must be chosen so that the result has fairly uniform distribution between 0 and n − 1, for any value of n that may occur in the application. Depending on the function, the remainder may be uniform only for certain values of n, e.g. odd or prime numbers.

We can allow the table size n to not be a power of 2 and still not have to perform any remainder or division operation, as these computations are sometimes costly. For example, let n be significantly less than 2^b. Consider a pseudorandom number generator (PRNG) function P(key) that is uniform on the interval [0, 2^b − 1]. A hash function uniform on the interval [0, n − 1] is ⌊n P(key)/2^b⌋. We can replace the division by a (possibly faster) right bit shift: (n · P(key)) >> b.
Variable range with minimal movement (dynamic hash function) When the hash function is used to store values in a hash table that outlives the run of the program, and the hash table needs to be expanded or shrunk, the hash table is referred to as a dynamic hash table.
A hash function that will relocate the minimum number of records when the table is resized is desirable. What is needed is a hash function H(z, n) – where z is the key being hashed and n is the number of allowed hash values – such that H(z, n + 1) = H(z, n) with probability close to n/(n + 1).

Linear hashing and spiral storage are examples of dynamic hash functions that execute in constant time but relax the property of uniformity to achieve the minimal movement property.

Extendible hashing uses a dynamic hash function that requires space proportional to n to compute the hash function, and it becomes a function of the previous keys that have been inserted.

Several algorithms that preserve the uniformity property but require time proportional to n to compute the value of H(z, n) have been invented.

Data normalization

In some applications, the input data may contain features that are irrelevant for comparison purposes. For example, when looking up a personal name, it may be desirable to ignore the distinction between upper and lower case letters. For such data, one must use a hash function that is compatible with the data equivalence criterion being used: that is, any two inputs that are considered equivalent must yield the same hash value. This can be accomplished by normalizing the input before hashing it, as by upper-casing all letters.

Continuity

“A hash function that is used to search for similar (as opposed to equivalent) data must be as continuous as possible; two inputs that differ by a little should be mapped to equal or nearly equal hash values.”[6]

Note that continuity is usually considered a fatal flaw for checksums, cryptographic hash functions, and other related concepts. Continuity is desirable for hash functions only in some applications, such as hash tables used in nearest neighbor search.

Non-invertible

In cryptographic applications, hash functions are typically expected to be practically non-invertible, meaning that it is not realistic to reconstruct the input datum x from its hash value h(x) alone without spending great amounts of computing time (see also One-way function).

3.9.3 Hash function algorithms

For most types of hashing functions, the choice of the function depends strongly on the nature of the input data, and their probability distribution in the intended application.

Trivial hash function

If the data to be hashed is small enough, one can use the data itself (reinterpreted as an integer) as the hashed value. The cost of computing this “trivial” (identity) hash function is effectively zero. This hash function is perfect, as it maps each input to a distinct hash value.

The meaning of “small enough” depends on the size of the type that is used as the hashed value. For example, in Java, the hash code is a 32-bit integer. Thus the 32-bit integer Integer and 32-bit floating-point Float objects can simply use the value directly; whereas the 64-bit integer Long and 64-bit floating-point Double cannot use this method.

Other types of data can also use this perfect hashing scheme. For example, when mapping character strings between upper and lower case, one can use the binary encoding of each character, interpreted as an integer, to index a table that gives the alternative form of that character (“A” for “a”, “8” for “8”, etc.). If each character is stored in 8 bits (as in extended ASCII[7] or ISO Latin 1), the table has only 2^8 = 256 entries; in the case of Unicode characters, the table would have 17×2^16 = 1114112 entries.

The same technique can be used to map two-letter country codes like “us” or “za” to country names (26^2 = 676 table entries), 5-digit zip codes like 13083 to city names (100000 entries), etc. Invalid data values (such as the country code “xx” or the zip code 00000) may be left undefined in the table or mapped to some appropriate “null” value.
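A small Python sketch of such a translation table: each 8-bit character code, read as an integer, directly indexes a 256-entry table holding its alternative form.

# 256-entry table mapping each 8-bit character code to its lower-case
# form; non-letters map to themselves ("8" stays "8").
to_lower = bytes(c | 0x20 if 0x41 <= c <= 0x5A else c for c in range(256))

assert to_lower[ord("A")] == ord("a")
assert to_lower[ord("8")] == ord("8")

# For data already as small as the hash width, the identity suffices.
def trivial_hash(x):
    return x          # the datum, reinterpreted as its own hash value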
Perfect hashing

Main article: Perfect hash function

A hash function that is injective—that is, maps each valid input to a different hash value—is said to be perfect. With such a function one can directly locate the desired entry in a hash table, without any additional searching.

Figure: A perfect hash function for the four names shown.

Minimal perfect hashing

A perfect hash function for n keys is said to be minimal if its range consists of n consecutive integers, usually from 0 to n−1. Besides providing single-step lookup, a minimal perfect hash function also yields a compact hash table, without any vacant slots. Minimal perfect hash functions are much harder to find than perfect ones with a wider range.

Figure: A minimal perfect hash function for the four names shown.
Hashing uniformly distributed data

If the inputs are bounded-length strings and each input may independently occur with uniform probability (such as telephone numbers, car license plates, invoice numbers, etc.), then a hash function needs to map roughly the same number of inputs to each hash value. For instance, suppose that each input is an integer z in the range 0 to N−1, and the output must be an integer h in the range 0 to n−1, where N is much larger than n. Then the hash function could be h = z mod n (the remainder of z divided by n), or h = (z × n) ÷ N (the value z scaled down by n/N and truncated to an integer), or many other formulas.

Hashing data with other distributions

These simple formulas will not do if the input values are not equally likely, or are not independent. For instance, most patrons of a supermarket will live in the same geographic area, so their telephone numbers are likely to begin with the same 3 to 4 digits. In that case, if m is 10000 or so, the division formula (z × m) ÷ M, which depends mainly on the leading digits, will generate a lot of collisions; whereas the remainder formula z mod m, which is quite sensitive to the trailing digits, may still yield a fairly even distribution.

Hashing variable-length data

When the data values are long (or variable-length) character strings—such as personal names, web page addresses, or mail messages—their distribution is usually very uneven, with complicated dependencies. For example, text in any natural language has highly non-uniform distributions of characters, and character pairs, very characteristic of the language. For such data, it is prudent to use a hash function that depends on all characters of the string—and depends on each character in a different way. In cryptographic hash functions, a Merkle–Damgård construction is usually used. In general, the scheme for hashing such data is to break the input into a sequence of small units (bits, bytes, words, etc.) and combine all the units b[1], b[2], …, b[m] sequentially, as follows:

S ← S0;                  // Initialize the state.
for k in 1, 2, ..., m do // Scan the input data units:
    S ← F(S, b[k]);      // Combine data unit k into the state.
return G(S, n)           // Extract the hash value from the state.

This schema is also used in many text checksum and fingerprint algorithms. The state variable S may be a 32- or 64-bit unsigned integer; in that case, S0 can be 0, and G(S, n) can be just S mod n. The best choice of F is a complex issue and depends on the nature of the data. If the units b[k] are single bits, then F(S, b) could be, for instance:

if highbit(S) = 0 then
    return 2 * S + b
else
    return (2 * S + b) ^ P

Here highbit(S) denotes the most significant bit of S; the '*' operator denotes unsigned integer multiplication with lost overflow; '^' is the bitwise exclusive or operation applied to words; and P is a suitable fixed word.[8]
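A runnable Python version of the schema and the single-bit F above; the 64-bit mask emulates unsigned overflow, and the value of P is an arbitrary assumption rather than a recommended constant.

MASK = (1 << 64) - 1   # keep the state in a 64-bit word
P = 0x1B               # assumed fixed word; real designs pick this carefully

def F(S, bit):
    # shift one input bit into the state, folding overflow back with XOR
    if S >> 63 == 0:
        return (2 * S + bit) & MASK
    return ((2 * S + bit) ^ P) & MASK

def hash_bits(bits, n, S0=0):
    S = S0             # initialize the state
    for b in bits:     # scan the input data units
        S = F(S, b)    # combine data unit into the state
    return S % n       # G(S, n): extract the hash value

print(hash_bits([1, 0, 1, 1, 0, 0, 1], 16))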
Special-purpose hash functions

In many cases, one can design a special-purpose (heuristic) hash function that yields many fewer collisions than a good general-purpose hash function. For example, suppose that the input data are file names such as FILE0000.CHK, FILE0001.CHK, FILE0002.CHK, etc., with mostly sequential numbers. For such data, a function that extracts the numeric part k of the file name and returns k mod n would be nearly optimal. Needless to say, a function that is exceptionally good for a specific kind of data may have dismal performance on data with different distribution.
Rolling hash

Main article: Rolling hash

In some applications, such as substring search, one must compute a hash function h for every k-character substring of a given n-character string t, where k is a fixed integer and n is greater than k. The straightforward solution, which is to extract every such substring s of t and compute h(s) separately, requires a number of operations proportional to k·n. However, with the proper choice of h, one can use the technique of rolling hash to compute all those hashes with an effort proportional to k + n.
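A Python sketch of the idea (Rabin–Karp style; the base B and modulus M are assumed, illustrative choices): after the first window, each subsequent hash is derived from the previous one in constant time, giving O(k + n) total work.

B, M = 256, (1 << 61) - 1        # base and modulus: assumed choices

def window_hashes(t, k):
    # hash of the first k-character window, in O(k) operations
    h = 0
    for c in t[:k]:
        h = (h * B + ord(c)) % M
    yield h
    top = pow(B, k - 1, M)       # weight of the outgoing character
    for i in range(k, len(t)):   # each further window in O(1)
        h = ((h - ord(t[i - k]) * top) * B + ord(t[i])) % M
        yield h

for pos, hv in enumerate(window_hashes("abracadabra", 3)):
    print(pos, hv)               # hash of t[pos:pos+3]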

Universal hashing

A universal hashing scheme is a randomized algorithm that selects a hashing function h among a family of such functions, in such a way that the probability of a collision of any two distinct keys is 1/n, where n is the number of distinct hash values desired—independently of the two keys. Universal hashing ensures (in a probabilistic sense) that the hash function application will behave as well as if it were using a random function, for any distribution of the input data. It will, however, have more collisions than perfect hashing and may require more operations than a special-purpose hash function. See also unique permutation hashing.[9]

Hashing with checksum functions

One can adapt certain checksum or fingerprinting algorithms for use as hash functions. Some of those algorithms will map arbitrary long string data z, with any typical real-world distribution—no matter how non-uniform and dependent—to a 32-bit or 64-bit string, from which one can extract a hash value in 0 through n − 1.

This method may produce a sufficiently uniform distribution of hash values, as long as the hash range size n is small compared to the range of the checksum or fingerprint function. However, some checksums fare poorly in the avalanche test, which may be a concern in some applications. In particular, the popular CRC32 checksum provides only 16 bits (the higher half of the result) that are usable for hashing. Moreover, each bit of the input has a deterministic effect on each bit of the CRC32; that is, one can tell, without looking at the rest of the input, which bits of the output will flip if the input bit is flipped; so care must be taken to use all 32 bits when computing the hash from the checksum.[10]

Multiplicative hashing

Multiplicative hashing is a simple type of hash function often used by teachers introducing students to hash tables.[11] Multiplicative hash functions are simple and fast, but have higher collision rates in hash tables than more sophisticated hash functions.[12]

In many applications, such as hash tables, collisions make the system a little slower but are otherwise harmless. In such systems, it is often better to use hash functions based on multiplication—such as MurmurHash and the SBoxHash—or even simpler hash functions such as CRC32—and tolerate more collisions, rather than use a more complex hash function that avoids many of those collisions but takes longer to compute.[12] Multiplicative hashing is susceptible to a “common mistake” that leads to poor diffusion—higher-value input bits do not affect lower-value output bits.[13]

Hashing with cryptographic hash functions

Some cryptographic hash functions, such as SHA-1, have even stronger uniformity guarantees than checksums or fingerprints, and thus can provide very good general-purpose hashing functions.

In ordinary applications, this advantage may be too small to offset their much higher cost.[14] However, this method can provide uniformly distributed hashes even when the keys are chosen by a malicious agent. This feature may help to protect services against denial of service attacks.

Hashing by nonlinear table lookup

Main article: Tabulation hashing

Tables of random numbers (such as 256 random 32-bit integers) can provide high-quality nonlinear functions to be used as hash functions or for other purposes such as cryptography. The key to be hashed is split into 8-bit (one-byte) parts, and each part is used as an index for the nonlinear table. The table values are then added by arithmetic or XOR addition to the hash output value. Because the table is just 1024 bytes in size, it fits into the cache of modern microprocessors and allows very fast execution of the hashing algorithm. As the table value is on average much longer than 8 bits, one bit of input affects nearly all output bits.

This algorithm has proven to be very fast and of high quality for hashing purposes (especially hashing of integer-number keys).
Efficient hashing of strings

See also: Universal hashing § Hashing strings

Modern microprocessors will allow for much faster processing if 8-bit character strings are not hashed by processing one character at a time, but by interpreting the string as an array of 32-bit or 64-bit integers and hashing/accumulating these “wide word” integer values by means of arithmetic operations (e.g. multiplication by constant and bit-shifting). The remaining characters of the string which are smaller than the word length of the CPU must be handled differently (e.g. being processed one character at a time).

This approach has proven to speed up hash code generation by a factor of five or more on modern microprocessors of a word size of 64 bits.

Another approach[15] is to convert strings to a 32- or 64-bit numeric value and then apply a hash function. One method that avoids the problem of strings having great similarity (“Aaaaaaaaaa” and “Aaaaaaaaab”) is to use a Cyclic redundancy check (CRC) of the string to compute a 32- or 64-bit value. While it is possible that two different strings will have the same CRC, the likelihood is very small and only requires that one check the actual string found to determine whether one has an exact match. CRCs will be different for strings such as “Aaaaaaaaaa” and “Aaaaaaaaab”. Although CRC codes can be used as hash values,[16] they are not cryptographically secure, since they are not collision-resistant.[17]

3.9.4 Locality-sensitive hashing

Locality-sensitive hashing (LSH) is a method of performing probabilistic dimension reduction of high-dimensional data. The basic idea is to hash the input items so that similar items are mapped to the same buckets with high probability (the number of buckets being much smaller than the universe of possible input items). This is different from the conventional hash functions, such as those used in cryptography, as in this case the goal is to maximize the probability of “collision” of similar items rather than to avoid collisions.[18]

One example of LSH is the MinHash algorithm used for finding similar documents (such as web pages):

Let h be a hash function that maps the members of A and B to distinct integers, and for any set S define h_min(S) to be the member x of S with the minimum value of h(x). Then h_min(A) = h_min(B) exactly when the minimum hash value of the union A ∪ B lies in the intersection A ∩ B. Therefore,

Pr[h_min(A) = h_min(B)] = J(A, B), where J is the Jaccard index.

In other words, if r is a random variable that is one when h_min(A) = h_min(B) and zero otherwise, then r is an unbiased estimator of J(A, B), although it has too high a variance to be useful on its own. The idea of the MinHash scheme is to reduce the variance by averaging together several variables constructed in the same way.
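A sketch of this averaging in Python; the random affine family modulo a prime is an assumed choice for h, made purely for illustration.

import random

P = (1 << 31) - 1                     # a Mersenne prime (assumed choice)

def make_hash():
    a, b = random.randrange(1, P), random.randrange(P)
    return lambda x: (a * hash(x) + b) % P

def minhash_similarity(A, B, trials=500):
    agree = 0
    for _ in range(trials):
        h = make_hash()
        if min(map(h, A)) == min(map(h, B)):   # does h_min(A) = h_min(B)?
            agree += 1
    return agree / trials                      # averages the indicator r

A = {"the", "quick", "brown", "fox"}
B = {"the", "quick", "brown", "dog"}
print(minhash_similarity(A, B))   # near J(A, B) = 3/5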
3.9.5 Origins of the term

The term “hash” offers a natural analogy with its non-technical meaning (to “chop” or “make a mess” out of something), given how hash functions scramble their input data to derive their output.[19] In his research for the precise origin of the term, Donald Knuth notes that, while Hans Peter Luhn of IBM appears to have been the first to use the concept of a hash function in a memo dated January 1953, the term itself would only appear in published literature in the late 1960s, in Herbert Hellerman's Digital Computer System Principles, even though it was already widespread jargon by then.[20]

3.9.6 List of hash functions

Main article: List of hash functions

• Coalesced hashing
• Cuckoo hashing
• Hopscotch hashing
• NIST hash function competition
• MD5
• Bernstein hash[21]
• Fowler–Noll–Vo hash function (32, 64, 128, 256, 512, or 1024 bits)
• Jenkins hash function (32 bits)
• Pearson hashing (64 bits)
• Zobrist hashing

3.9.7 See also

• Comparison of cryptographic hash functions
• Distributed hash table
• Identicon
• Low-discrepancy sequence
• PhotoDNA
• Transposition table

3.9.8 References

[1] Konheim, Alan (2010). “7. HASHING FOR STORAGE: DATA MANAGEMENT”. Hashing in Computer Science: Fifty Years of Slicing and Dicing. Wiley-Interscience. ISBN 9780470344736.
[2] Menezes, Alfred J.; van Oorschot, Paul C.; Vanstone, Scott A. (1996). Handbook of Applied Cryptography. CRC Press. ISBN 0849385237.

[3] “Robust Audio Hashing for Content Identification”, by Jaap Haitsma, Ton Kalker and Job Oostveen.

[4] “3. Data model — Python 3.6.1 documentation”. docs.python.org. Retrieved 2017-03-24.

[5] Sedgewick, Robert (2002). “14. Hashing”. Algorithms in Java (3rd ed.). Addison Wesley. ISBN 978-0201361209.

[6] “Fundamental Data Structures – Josiang p.132”. Retrieved May 19, 2014.

[7] Plain ASCII is a 7-bit character encoding, although it is often stored in 8-bit bytes with the highest-order bit always clear (zero). Therefore, for plain ASCII, the bytes have only 2^7 = 128 valid values, and the character translation table has only this many entries.

[8] Broder, A. Z. (1993). “Some applications of Rabin's fingerprinting method”. Sequences II: Methods in Communications, Security, and Computer Science. Springer-Verlag. pp. 143–152.

[9] Shlomi Dolev, Limor Lahiani, Yinnon Haviv, “Unique permutation hashing”, Theoretical Computer Science, Volume 475, 4 March 2013, pp. 59–65.

[10] Bret Mulvey, Evaluation of CRC32 for Hash Tables, in Hash Functions. Accessed April 10, 2009.

[11] Knuth, Donald E. The Art of Computer Programming, Volume 3: Sorting and Searching, Section 6.4: Hashing.

[12] Peter Kankowski. “Hash functions: An empirical comparison”.

[13] “CS 3110 Lecture 21: Hash functions”. Section “Multiplicative hashing”.

[14] Bret Mulvey, Evaluation of SHA-1 for Hash Tables, in Hash Functions. Accessed April 10, 2009.

[15] Performance in Practice of String Hashing Functions. http://citeseer.ist.psu.edu/viewdoc/summary?doi=10.1.1.18.7520

[16] Peter Kankowski. “Hash functions: An empirical comparison”.

[17] Cam-Winget, Nancy; Housley, Russ; Wagner, David; Walker, Jesse (May 2003). “Security Flaws in 802.11 Data Link Protocols”. Communications of the ACM. 46 (5): 35–39. doi:10.1145/769800.769823.

[18] A. Rajaraman and J. Ullman (2010). “Mining of Massive Datasets, Ch. 3”.

[19] Knuth, Donald E. (2000). Sorting and Searching (2nd ed., 6th printing, newly updated and rev.). Boston: Addison-Wesley. p. 514. ISBN 0-201-89685-0.

[20] Knuth, Donald E. (2000). Sorting and Searching (2nd ed., 6th printing, newly updated and rev.). Boston: Addison-Wesley. pp. 547–548. ISBN 0-201-89685-0.

[21] “Hash Functions”. cse.yorku.ca. September 22, 2003. Retrieved November 1, 2012. “the djb2 algorithm (k=33) was first reported by dan bernstein many years ago in comp.lang.c.”

3.9.9 External links

• Calculate hash of a given value by Timo Denk
• Hash Functions and Block Ciphers by Bob Jenkins
• The Goulburn Hashing Function (PDF) by Mayur Patel
• Hash Function Construction for Textual and Geometrical Data Retrieval. Latest Trends on Computers, Vol. 2, pp. 483–489, CSCC conference, Corfu, 2010.

3.10 Perfect hash function

In computer science, a perfect hash function for a set S is a hash function that maps distinct elements in S to a set of integers, with no collisions. In mathematical terms, it is a total injective function.

Perfect hash functions may be used to implement a lookup table with constant worst-case access time. A perfect hash function has many of the same applications as other hash functions, but with the advantage that no collision resolution has to be implemented.

3.10.1 Application

A perfect hash function with values in a limited range can be used for efficient lookup operations, by placing keys from S (or other associated values) in a lookup table indexed by the output of the function. One can then test whether a key is present in S, or look up a value associated with that key, by looking for it at its cell of the table. Each such lookup takes constant time in the worst case.[1]
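A Python sketch of this kind of lookup table for a small fixed key set. The perfect (though not minimal) hash here is found by retrying a random multiplier until the mapping is collision-free, a simplified preview of the randomized construction described next; the prime and key set are assumptions for illustration.

import random

P = 1000003                      # a prime larger than every key

def find_perfect(keys, size):
    # retry random k until x -> (k*x mod P) mod size is injective on keys
    while True:
        k = random.randrange(1, P)
        if len({(k * x % P) % size for x in keys}) == len(keys):
            return lambda x: (k * x % P) % size

S = [101, 305, 700, 998, 1331]
h = find_perfect(S, size=2 * len(S))

table = [None] * (2 * len(S))
for x in S:
    table[h(x)] = x              # each key gets its own cell

def member(x):
    return table[h(x)] == x      # constant worst-case time lookup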
3.10.2 Construction

A perfect hash function for a specific set S that can be evaluated in constant time, and with values in a small range, can be found by a randomized algorithm in a number of operations that is proportional to the size of S. The original construction of Fredman, Komlós & Szemerédi (1984) uses a two-level scheme to map a set S of n elements to a range of O(n) indices, and then map each index to a range of hash values. The first level of their construction chooses a large prime p (larger than the size of the universe from which S is drawn), and a parameter k, and maps each element x of S to the index
g(x) = (kx mod p) mod n.

If k is chosen randomly, this step is likely to have collisions, but the number of elements n_i that are simultaneously mapped to the same index i is likely to be small. The second level of their construction assigns disjoint ranges of O(n_i^2) integers to each index i. It uses a second set of linear modular functions, one for each index i, to map each member x of S into the range associated with g(x).[1]

As Fredman, Komlós & Szemerédi (1984) show, there exists a choice of the parameter k such that the sum of the lengths of the ranges for the n different values of g(x) is O(n). Additionally, for each value of g(x), there exists a linear modular function that maps the corresponding subset of S into the range associated with that value. Both k, and the second-level functions for each value of g(x), can be found in polynomial time by choosing values randomly until finding one that works.[1]

The hash function itself requires storage space O(n) to store k, p, and all of the second-level linear modular functions. Computing the hash value of a given key x may be performed in constant time by computing g(x), looking up the second-level function associated with g(x), and applying this function to x. A modified version of this two-level scheme with a larger number of values at the top level can be used to construct a perfect hash function that maps S into a smaller range of length n + o(n).[1]
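The two-level construction can be sketched compactly in Python. This is a simplification under stated assumptions: the keys are distinct integers below the fixed prime p, and both levels simply retry random multipliers until the stated size bounds hold, instead of derandomizing.

import random

p = 1000003                              # prime above the key universe

def fks_build(S):
    n = len(S)
    # first level: pick k so the bucket sizes satisfy sum n_i^2 = O(n)
    while True:
        k = random.randrange(1, p)
        buckets = [[] for _ in range(n)]
        for x in S:
            buckets[(k * x % p) % n].append(x)
        if sum(len(b) ** 2 for b in buckets) < 4 * n:
            break
    # second level: a collision-free table of size n_i^2 per bucket
    second = []
    for b in buckets:
        m = len(b) ** 2
        while True:
            k2 = random.randrange(1, p)
            cells = [None] * m
            ok = True
            for x in b:
                j = (k2 * x % p) % m
                if cells[j] is not None:
                    ok = False
                    break
                cells[j] = x
            if ok:
                break
        second.append((k2, cells))
    return k, second

def fks_lookup(x, k, second):
    k2, cells = second[(k * x % p) % len(second)]
    return bool(cells) and cells[(k2 * x % p) % len(cells)] == x

S = [37, 1025, 4711, 65537, 999331]
k, second = fks_build(S)
assert all(fks_lookup(x, k, second) for x in S)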

3.10.3 Space lower bounds

The use of O(n) words of information to store the function of Fredman, Komlós & Szemerédi (1984) is near-optimal: any perfect hash function that can be calculated in constant time requires at least a number of bits that is proportional to the size of S.[2]

3.10.4 Extensions

Dynamic perfect hashing

Main article: Dynamic perfect hashing

Using a perfect hash function is best in situations where there is a frequently queried large set, S, which is seldom updated. This is because any modification of the set S may cause the hash function to no longer be perfect for the modified set. Solutions which update the hash function any time the set is modified are known as dynamic perfect hashing,[3] but these methods are relatively complicated to implement.

Minimal perfect hash function

A minimal perfect hash function is a perfect hash function that maps n keys to n consecutive integers – usually the numbers from 0 to n − 1 or from 1 to n. A more formal way of expressing this is: Let j and k be elements of some finite set S. F is a minimal perfect hash function if and only if F(j) = F(k) implies j = k (injectivity) and there exists an integer a such that the range of F is a..a + |S| − 1. It has been proven that a general purpose minimal perfect hash scheme requires at least 1.44 bits/key.[4] The best currently known minimal perfect hashing schemes can be represented using approximately 2.6 bits per key.[5]

Order preservation

A minimal perfect hash function F is order preserving if keys are given in some order a1, a2, ..., an and for any keys aj and ak, j < k implies F(aj) < F(ak).[6] In this case, the function value is just the position of each key in the sorted ordering of all of the keys. A simple implementation of order-preserving minimal perfect hash functions with constant access time is to use an (ordinary) perfect hash function or cuckoo hashing to store a lookup table of the positions of each key. If the keys to be hashed are themselves stored in a sorted array, it is possible to store a small number of additional bits per key in a data structure that can be used to compute hash values quickly.[7] Order-preserving minimal perfect hash functions require necessarily Ω(n log n) bits to be represented.[8]

3.10.5 Related constructions

A simple alternative to perfect hashing, which also allows dynamic updates, is cuckoo hashing. This scheme maps keys to two or more locations within a range (unlike perfect hashing which maps each key to a single location) but does so in such a way that the keys can be assigned one-to-one to locations to which they have been mapped. Lookups with this scheme are slower, because multiple locations must be checked, but nevertheless take constant worst-case time.[9]

3.10.6 References

[1] Fredman, Michael L.; Komlós, János; Szemerédi, Endre (1984), “Storing a Sparse Table with O(1) Worst Case Access Time”, Journal of the ACM, 31 (3): 538, doi:10.1145/828.1884, MR 0819156.

[2] Fredman, Michael L.; Komlós, János (1984), “On the size of separating systems and families of perfect hash functions”, SIAM Journal on Algebraic and Discrete Methods, 5 (1): 61–68, doi:10.1137/0605009, MR 731857.

[3] Dietzfelbinger, Martin; Karlin, Anna; Mehlhorn, Kurt; Meyer auf der Heide, Friedhelm; Rohnert, Hans; Tarjan, Robert E. (1994), “Dynamic perfect hashing: upper and lower bounds”, SIAM Journal on Computing, 23 (4): 738–761, doi:10.1137/S0097539791194094, MR 1283572.
[4] Belazzougui, Djamal; Botelho, Fabiano C.; Dietzfelbinger, Martin (2009), “Hash, displace, and compress” (PDF), Algorithms—ESA 2009: 17th Annual European Symposium, Copenhagen, Denmark, September 7–9, 2009, Proceedings, Lecture Notes in Computer Science, 5757, Berlin: Springer, pp. 682–693, doi:10.1007/978-3-642-04128-0_61, MR 2557794.

[5] Baeza-Yates, Ricardo; Poblete, Patricio V. (2010), “Searching”, in Atallah, Mikhail J.; Blanton, Marina, Algorithms and Theory of Computation Handbook: General Concepts and Techniques (2nd ed.), CRC Press, ISBN 9781584888239. See in particular p. 2-10.

[6] Jenkins, Bob (14 April 2009), “order-preserving minimal perfect hashing”, in Black, Paul E., Dictionary of Algorithms and Data Structures, U.S. National Institute of Standards and Technology, retrieved 2013-03-05.

[7] Belazzougui, Djamal; Boldi, Paolo; Pagh, Rasmus; Vigna, Sebastiano (November 2008), “Theory and practice of monotone minimal perfect hashing”, Journal of Experimental Algorithmics, 16, Art. no. 3.2, 26pp, doi:10.1145/1963190.2025378.

[8] Fox, Edward A.; Chen, Qi Fan; Daoud, Amjad M.; Heath, Lenwood S. (July 1991), “Order-preserving minimal perfect hash functions and information retrieval”, ACM Transactions on Information Systems, New York, NY, USA: ACM, 9 (3): 281–308, doi:10.1145/125187.125200.

[9] Pagh, Rasmus; Rodler, Flemming Friche (2004), “Cuckoo hashing”, Journal of Algorithms, 51 (2): 122–144, doi:10.1016/j.jalgor.2003.12.002, MR 2050140.

3.10.7 Further reading

• Richard J. Cichelli. Minimal Perfect Hash Functions Made Simple, Communications of the ACM, Vol. 23, Number 1, January 1980.

• Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms, Second Edition. MIT Press and McGraw-Hill, 2001. ISBN 0-262-03293-7. Section 11.5: Perfect hashing, pp. 245–249.

• Fabiano C. Botelho, Rasmus Pagh and Nivio Ziviani. “Perfect Hashing for Data Management Applications”.

• Fabiano C. Botelho and Nivio Ziviani. “External perfect hashing for very large key sets”. 16th ACM Conference on Information and Knowledge Management (CIKM07), Lisbon, Portugal, November 2007.

• Djamal Belazzougui, Paolo Boldi, Rasmus Pagh, and Sebastiano Vigna. “Monotone minimal perfect hashing: Searching a sorted table with O(1) accesses”. In Proceedings of the 20th Annual ACM-SIAM Symposium On Discrete Mathematics (SODA), New York, 2009. ACM Press.

• Douglas C. Schmidt. GPERF: A Perfect Hash Function Generator, C++ Report, SIGS, Vol. 10, No. 10, November/December, 1998.

3.10.8 External links

• Minimal Perfect Hashing by Bob Jenkins

• gperf is an Open Source C and C++ perfect hash generator

• cmph is Open Source implementing many perfect hashing methods

• Sux4J is Open Source implementing perfect hashing, including monotone minimal perfect hashing in Java

• MPHSharp is Open Source implementing many perfect hashing methods in C#

3.11 Universal hashing

In mathematics and computing, universal hashing (in a randomized algorithm or data structure) refers to selecting a hash function at random from a family of hash functions with a certain mathematical property (see definition below). This guarantees a low number of collisions in expectation, even if the data is chosen by an adversary. Many universal families are known (for hashing integers, vectors, strings), and their evaluation is often very efficient. Universal hashing has numerous uses in computer science, for example in implementations of hash tables, randomized algorithms, and cryptography.

3.11.1 Introduction

See also: Hash function

Assume we want to map keys from some universe U into m bins (labelled [m] = {0, ..., m − 1}). The algorithm will have to handle some data set S ⊆ U of |S| = n keys, which is not known in advance. Usually, the goal of hashing is to obtain a low number of collisions (keys from S that land in the same bin). A deterministic hash function cannot offer any guarantee in an adversarial setting if the size of U is greater than m · n, since the adversary may choose S to be precisely the preimage of a bin. This means that all data keys land in the same bin, making hashing useless. Furthermore, a deterministic hash function does not allow for rehashing: sometimes the input data turns out to be bad for the hash function (e.g. there
are too many collisions), so one would like to change the hash function.

The solution to these problems is to pick a function randomly from a family of hash functions. A family of functions H = {h : U → [m]} is called a universal family if, for all x, y ∈ U with x ≠ y:

Pr_{h∈H}[h(x) = h(y)] ≤ 1/m.

In other words, any two distinct keys of the universe collide with probability at most 1/m when the hash function h is drawn randomly from H. This is exactly the probability of collision we would expect if the hash function assigned truly random hash codes to every key. Sometimes, the definition is relaxed to allow collision probability O(1/m). This concept was introduced by Carter and Wegman[1] in 1977, and has found numerous applications in computer science (see, for example,[2]). If we have an upper bound of ε < 1 on the collision probability, we say that we have ε-almost universality.
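An illustration (not a proof) in Python, using the classic Carter–Wegman construction h(x) = ((a·x + b) mod p) mod m with random parameters: for two fixed distinct keys, the empirical collision rate stays near 1/m. The specific p, m, and keys are arbitrary assumptions.

import random

p, m = 2**31 - 1, 10               # a prime p and the number of bins m

def draw_hash():
    a = random.randrange(1, p)     # a uniform in {1, ..., p - 1}
    b = random.randrange(p)
    return lambda x: ((a * x + b) % p) % m

x, y = 42, 1337                    # two fixed, distinct keys
trials, hits = 100000, 0
for _ in range(trials):
    h = draw_hash()
    hits += h(x) == h(y)
print(hits / trials, "vs 1/m =", 1 / m)   # observed rate is about 1/m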
Many, but not all, universal families have the following stronger uniform difference property:

∀x, y ∈ U, x ≠ y, when h is drawn randomly from the family H, the difference h(x) − h(y) mod m is uniformly distributed in [m].

Note that the definition of universality is only concerned with whether h(x) − h(y) = 0, which counts collisions. The uniform difference property is stronger.

(Similarly, a universal family can be XOR universal if ∀x, y ∈ U, x ≠ y, the value h(x) ⊕ h(y) mod m is uniformly distributed in [m], where ⊕ is the bitwise exclusive or operation. This is only possible if m is a power of two.)

An even stronger condition is pairwise independence: we have this property when ∀x, y ∈ U, x ≠ y, the probability that x, y will hash to any pair of hash values z1, z2 is as if they were perfectly random: P(h(x) = z1 ∧ h(y) = z2) = 1/m^2. Pairwise independence is sometimes called strong universality.

Another property is uniformity. We say that a family is uniform if all hash values are equally likely: P(h(x) = z) = 1/m for any hash value z. Universality does not imply uniformity. However, strong universality does imply uniformity.

Given a family with the uniform distance property, one can produce a pairwise independent or strongly universal hash family by adding a uniformly distributed random constant with values in [m] to the hash functions. (Similarly, if m is a power of two, we can achieve pairwise independence from an XOR universal hash family by doing an exclusive or with a uniformly distributed random constant.) Since a shift by a constant is sometimes irrelevant in applications (e.g. hash tables), a careful distinction between the uniform distance property and pairwise independence is sometimes not made.[3]

For some applications (such as hash tables), it is important for the least significant bits of the hash values to be also universal. When a family is strongly universal, this is guaranteed: if H is a strongly universal family with m = 2^L, then the family made of the functions h mod 2^L′ for all h ∈ H is also strongly universal for L′ ≤ L. Unfortunately, the same is not true of (merely) universal families. For example, the family made of the identity function h(x) = x is clearly universal, but the family made of the function h(x) = x mod 2^L′ fails to be universal.

UMAC and Poly1305-AES and several other message authentication code algorithms are based on universal hashing.[4][5] In such applications, the software chooses a new hash function for every message, based on a unique nonce for that message.

Several hash table implementations are based on universal hashing. In such applications, typically the software chooses a new hash function only after it notices that “too many” keys have collided; until then, the same hash function continues to be used over and over. (Some collision resolution schemes, such as dynamic perfect hashing, pick a new hash function every time there is a collision. Other collision resolution schemes, such as cuckoo hashing and 2-choice hashing, allow a number of collisions before picking a new hash function.) A survey of the fastest known universal and strongly universal hash functions for integers, vectors, and strings is found in.[6]

3.11.2 Mathematical guarantees

For any fixed set S of n keys, using a universal family guarantees the following properties.

1. For any fixed x in S, the expected number of keys in the bin h(x) is n/m. When implementing hash tables by chaining, this number is proportional to the expected running time of an operation involving the key x (for example a query, insertion or deletion).

2. The expected number of pairs of keys x, y in S with x ≠ y that collide (h(x) = h(y)) is bounded above by n(n − 1)/2m, which is of order O(n^2/m). When the number of bins, m, is O(n), the expected number of collisions is O(n). When hashing into n^2 bins, there are no collisions at all with probability at least a half.

3. The expected number of keys in bins with at least t keys in them is bounded above by 2n/(t − 2(n/m) + 1).[7] Thus, if the capacity of each bin is capped to three times the average size (t = 3n/m), the total number of keys in overflowing bins is at most O(m). This only holds with a hash family whose collision probability is bounded above by 1/m. If a weaker definition is used, bounding it by O(1/m), this result is no longer true.[7]
As the above guarantees hold for any fixed set S, they hold if the data set is chosen by an adversary. However, the adversary has to make this choice before (or independent of) the algorithm’s random choice of a hash function. If the adversary can observe the random choice of the algorithm, randomness serves no purpose, and the situation is the same as deterministic hashing.
The second and third guarantee are typically used in conjunction with rehashing. For instance, a randomized algorithm may be prepared to handle some O(n) number of collisions. If it observes too many collisions, it chooses another random h from the family and repeats. Universality guarantees that the number of repetitions is a geometric random variable.

3.11.3 Constructions

Since any computer data can be represented as one or more machine words, one generally needs hash functions for three types of domains: machine words (“integers”); fixed-length vectors of machine words; and variable-length vectors (“strings”).

Hashing integers

This section refers to the case of hashing integers that fit in machine words; thus, operations like multiplication, addition, division, etc. are cheap machine-level instructions. Let the universe to be hashed be U = {0, . . . , |U| − 1}.
The original proposal of Carter and Wegman[1] was to pick a prime p ≥ |U| and define

   h_{a,b}(x) = ((ax + b) mod p) mod m

where a, b are randomly chosen integers modulo p with a ≠ 0. (This is a single iteration of a linear congruential generator.)
To see that H = {h_{a,b}} is a universal family, note that h(x) = h(y) only holds when

   ax + b ≡ ay + b + i · m (mod p)

for some integer i between 0 and (p − 1)/m. If x ≠ y, their difference x − y is nonzero and has an inverse modulo p. Solving for a yields

   a ≡ i · m · (x − y)^{−1} (mod p)

There are p − 1 possible choices for a (since a = 0 is excluded) and, varying i in the allowed range, ⌊(p − 1)/m⌋ possible non-zero values for the right hand side. Thus the collision probability is

   ⌊(p − 1)/m⌋/(p − 1) ≤ ((p − 1)/m)/(p − 1) = 1/m

Another way to see that H is a universal family is via the notion of statistical distance. Write the difference h(x) − h(y) as

   h(x) − h(y) ≡ (a(x − y) mod p) (mod m)

Since x − y is nonzero and a is uniformly distributed in {1, . . . , p − 1}, it follows that a(x − y) modulo p is also uniformly distributed in {1, . . . , p − 1}. The distribution of (h(x) − h(y)) mod m is thus almost uniform, up to a difference in probability of ±1/p between the samples. As a result, the statistical distance to a uniform family is O(m/p), which becomes negligible when p ≫ m.
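For concreteness, a minimal C sketch of this mod-prime family follows. It assumes a universe of 32-bit keys and uses the Mersenne prime p = 2^61 − 1 ≥ |U|; the 128-bit intermediate product relies on the unsigned __int128 extension available in GCC and Clang, and the names are illustrative rather than taken from the sources cited in this section.

   #include <stdint.h>

   /* Carter-Wegman style hash h_{a,b}(x) = ((a*x + b) mod p) mod m.
    * Draw a uniformly from {1, ..., p-1} and b uniformly from
    * {0, ..., p-1} once, when the hash function is chosen. */
   static const uint64_t P = ((uint64_t)1 << 61) - 1;  /* prime >= |U| */

   uint64_t hash_ab(uint32_t x, uint64_t a, uint64_t b, uint64_t m)
   {
       unsigned __int128 t = (unsigned __int128)a * x + b;
       return (uint64_t)(t % P) % m;
   }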
The family of simpler hash functions

   h_a(x) = (ax mod p) mod m

is only approximately universal: Pr{h_a(x) = h_a(y)} ≤ 2/m for all x ≠ y.[1] Moreover, this analysis is nearly tight; Carter and Wegman[1] show that Pr{h_a(1) = h_a(m + 1)} ≥ 2/(m − 1) whenever (p − 1) mod m = 1.

Avoiding modular arithmetic  The state of the art for hashing integers is the multiply-shift scheme described by Dietzfelbinger et al. in 1997.[8] By avoiding modular arithmetic, this method is much easier to implement and also runs significantly faster in practice (usually by at least a factor of four[9]). The scheme assumes the number of bins is a power of two, m = 2^M. Let w be the number of bits in a machine word. Then the hash functions are parametrised over odd positive integers a < 2^w (that fit in a word of w bits). To evaluate h_a(x), multiply x by a modulo 2^w and then keep the high order M bits as the hash code. In mathematical notation, this is

   h_a(x) = (a · x mod 2^w) div 2^{w−M}

and it can be implemented in C-like programming languages by

   h_a(x) = (unsigned) (a*x) >> (w-M)

This scheme does not satisfy the uniform difference property and is only 2/m-almost-universal; for any x ≠ y, Pr{h_a(x) = h_a(y)} ≤ 2/m.
To understand the behavior of the hash function, notice that, if ax mod 2^w and ay mod 2^w have the same highest-order M bits, then a(x − y) mod 2^w has either all 1’s or all 0’s as its highest order M bits (depending on whether
ax mod 2^w or ay mod 2^w is larger). Assume that the least significant set bit of x − y appears on position w − c. Since a is a random odd integer and odd integers have inverses in the ring Z_{2^w}, it follows that a(x − y) mod 2^w will be uniformly distributed among w-bit integers with the least significant set bit on position w − c. The probability that these bits are all 0’s or all 1’s is therefore at most 2/2^M = 2/m. On the other hand, if c < M, then the higher-order M bits of a(x − y) mod 2^w contain both 0’s and 1’s, so it is certain that h(x) ≠ h(y). Finally, if c = M, then bit w − M of a(x − y) mod 2^w is 1 and h_a(x) = h_a(y) if and only if bits w − 1, . . . , w − M + 1 are also 1, which happens with probability 1/2^{M−1} = 2/m.
This analysis is tight, as can be shown with the example x = 2^{w−M−2} and y = 3x. To obtain a truly 'universal' hash function, one can use the multiply-add-shift scheme

   h_{a,b}(x) = ((ax + b) mod 2^w) div 2^{w−M}

which can be implemented in C-like programming languages by

   h_{a,b}(x) = (unsigned) (a*x+b) >> (w-M)

where a is a random odd positive integer with a < 2^w and b is a random non-negative integer with b < 2^{w−M}. With these choices of a and b, Pr{h_{a,b}(x) = h_{a,b}(y)} ≤ 1/m for all x ≢ y (mod 2^w).[10] This differs slightly but importantly from the mistranslation in the English paper.[11]
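A minimal C sketch of both schemes for w = 64 follows; unsigned multiplication and addition in C are already reduced modulo 2^64, so no explicit mod is needed. The function names are illustrative only.

   #include <stdint.h>

   /* 2/m-almost-universal multiply-shift into m = 2^M bins
    * (1 <= M <= 63); a is a random odd 64-bit integer. */
   uint64_t hash_ms(uint64_t x, uint64_t a, int M)
   {
       return (a * x) >> (64 - M);   /* product is taken mod 2^64 */
   }

   /* universal multiply-add-shift; a is random odd, 0 <= b < 2^(64-M) */
   uint64_t hash_mas(uint64_t x, uint64_t a, uint64_t b, int M)
   {
       return (a * x + b) >> (64 - M);
   }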
Hashing vectors

This section is concerned with hashing a fixed-length vector of machine words. Interpret the input as a vector x̄ = (x_0, . . . , x_{k−1}) of k machine words (integers of w bits each). If H is a universal family with the uniform difference property, the following family (dating back to Carter and Wegman[1]) also has the uniform difference property (and hence is universal):

   h(x̄) = (∑_{i=0}^{k−1} h_i(x_i)) mod m, where each h_i ∈ H is chosen independently at random.

If m is a power of two, one may replace summation by exclusive or.[12]
In practice, if double-precision arithmetic is available, this is instantiated with the multiply-shift hash family of.[13] Initialize the hash function with a vector ā = (a_0, . . . , a_{k−1}) of random odd integers on 2w bits each. Then if the number of bins is m = 2^M for M ≤ w:

   h_ā(x̄) = ((∑_{i=0}^{k−1} x_i · a_i) mod 2^{2w}) div 2^{2w−M}

It is possible to halve the number of multiplications, which roughly translates to a two-fold speed-up in practice.[12] Initialize the hash function with a vector ā = (a_0, . . . , a_{k−1}) of random odd integers on 2w bits each. The following hash family is universal:[14]

   h_ā(x̄) = ((∑_{i=0}^{⌈k/2⌉} (x_{2i} + a_{2i}) · (x_{2i+1} + a_{2i+1})) mod 2^{2w}) div 2^{2w−M}

If double-precision operations are not available, one can interpret the input as a vector of half-words (w/2-bit integers). The algorithm will then use ⌈k/2⌉ multiplications, where k was the number of half-words in the vector. Thus, the algorithm runs at a “rate” of one multiplication per word of input.
The same scheme can also be used for hashing integers, by interpreting their bits as vectors of bytes. In this variant, the vector technique is known as tabulation hashing and it provides a practical alternative to multiplication-based universal hashing schemes.[15]
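A minimal C sketch of the pair-multiply family for w = 32 (so that the 2w-bit arithmetic fits in uint64_t) follows; k is assumed even, and the names are illustrative only.

   #include <stdint.h>
   #include <stddef.h>

   /* Pair-multiply vector hash into m = 2^M bins: the x[i] are
    * w = 32-bit words, the a[i] are random odd 2w = 64-bit integers,
    * and the sum wraps mod 2^64 by itself. */
   uint64_t hash_vec(const uint32_t *x, const uint64_t *a, size_t k, int M)
   {
       uint64_t sum = 0;
       for (size_t i = 0; i < k; i += 2)
           sum += (x[i] + a[i]) * (x[i + 1] + a[i + 1]);
       return sum >> (64 - M);   /* keep the high-order M bits */
   }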
Strong universality at high speed is also possible.[16] Initialize the hash function with a vector ā = (a_0, . . . , a_k) of random integers on 2w bits. Compute

   h_ā(x̄)^{strong} = ((a_0 + ∑_{i=0}^{k−1} a_{i+1} x_i) mod 2^{2w}) div 2^w

The result is strongly universal on w bits. Experimentally, it was found to run at 0.2 CPU cycles per byte on recent Intel processors for w = 32.

Hashing strings

This refers to hashing a variable-sized vector of machine words. If the length of the string can be bounded by a small number, it is best to use the vector solution from above (conceptually padding the vector with zeros up to the upper bound). The space required is the maximal length of the string, but the time to evaluate h(s) is just the length of s. As long as zeroes are forbidden in the string, the zero-padding can be ignored when evaluating the hash function without affecting universality.[12] Note that if zeroes are allowed in the string, then it might be best to append a fictitious non-zero (e.g., 1) character to all strings prior to padding: this will ensure that universality is not affected.[16]
Now assume we want to hash x̄ = (x_0, . . . , x_ℓ), where a good bound on ℓ is not known a priori. A universal family proposed by [13] treats the string x as the coefficients of a polynomial modulo a large prime. If x_i ∈ [u], let p ≥ max{u, m} be a prime and define:

   h_a(x̄) = h_int((∑_{i=0}^{ℓ} x_i · a^i) mod p),

where a ∈ [p] is uniformly random and h_int is
chosen randomly from a universal family mapping integer domain [p] ↦ [m].
Using properties of modular arithmetic, the above can be computed without producing large numbers for large strings as follows:[17]

   uint hash(String x, int a, int p)
       uint h = INITIAL_VALUE
       for (uint i = 0; i < x.length; ++i)
           h = ((h * a) + x[i]) mod p
       return h

This Rabin-Karp rolling hash is based on a linear congruential generator.[18] The above algorithm is also known as a multiplicative hash function.[19] In practice, the mod operator and the parameter p can be avoided altogether by simply allowing the integer to overflow, because this is equivalent to mod (Max-Int-Value + 1) in many programming languages. The table below shows values chosen to initialize h and a for some of the popular implementations:

   Implementation                     INITIAL_VALUE   a
   Bernstein's hash djb2[20]          5381            33
   STLPort 4.6.2                      0               5
   Kernighan and Ritchie's hash[21]   0               31
   java.lang.String.hashCode()[22]    0               31

Consider two strings x̄, ȳ and let ℓ be the length of the longer one; for the analysis, the shorter string is conceptually padded with zeros up to length ℓ. A collision before applying h_int implies that a is a root of the polynomial with coefficients x̄ − ȳ. This polynomial has at most ℓ roots modulo p, so the collision probability is at most ℓ/p. The probability of collision through the random h_int brings the total collision probability to 1/m + ℓ/p. Thus, if the prime p is sufficiently large compared to the length of the strings hashed, the family is very close to universal (in statistical distance).
Other universal families of hash functions used to hash unknown-length strings to fixed-length hash values include the Rabin fingerprint and the Buzhash.

Avoiding modular arithmetic  To mitigate the computational penalty of modular arithmetic, the following tricks are used in practice:[12]

1. One chooses the prime p to be close to a power of two, such as a Mersenne prime. This allows arithmetic modulo p to be implemented without division (using faster operations like addition and shifts), as illustrated in the sketch after this list. For instance, on modern architectures one can work with p = 2^61 − 1, while the x_i's are 32-bit values.

2. One can apply vector hashing to blocks. For instance, one applies vector hashing to each 16-word block of the string, and applies string hashing to the ⌈k/16⌉ results. Since the slower string hashing is applied on a substantially smaller vector, this will essentially be as fast as vector hashing.

3. One chooses a power of two as the divisor, allowing arithmetic modulo 2^w to be implemented without division (using faster operations of bit masking). The NH hash-function family takes this approach.
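A minimal C sketch of the first trick follows: reduction modulo the Mersenne prime p = 2^61 − 1 without division, using the identity 2^61 ≡ 1 (mod p). The function name is illustrative only.

   #include <stdint.h>

   static const uint64_t P61 = ((uint64_t)1 << 61) - 1;   /* 2^61 - 1 */

   /* Reduce a 64-bit value v modulo 2^61 - 1: split v = hi*2^61 + lo
    * and use hi*2^61 + lo = hi + lo (mod p). */
   uint64_t mod_mersenne61(uint64_t v)
   {
       v = (v >> 61) + (v & P61);   /* now v < 2^61 + 7 */
       if (v >= P61)
           v -= P61;
       return v;
   }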
3.11.4 See also

• K-independent hashing
• Rolling hashing
• Tabulation hashing
• Min-wise independence
• Universal one-way hash function
• Low-discrepancy sequence
• Perfect hashing

3.11.5 References

[1] Carter, Larry; Wegman, Mark N. (1979). “Universal Classes of Hash Functions”. Journal of Computer and System Sciences. 18 (2): 143–154. doi:10.1016/0022-0000(79)90044-8. Conference version in STOC'77.

[2] Miltersen, Peter Bro. “Universal Hashing”. Archived from the original (PDF) on 24 June 2009.

[3] Motwani, Rajeev; Raghavan, Prabhakar (1995). Randomized Algorithms. Cambridge University Press. p. 221. ISBN 0-521-47465-5.

[4] David Wagner, ed. “Advances in Cryptology - CRYPTO 2008”. p. 145.

[5] Jean-Philippe Aumasson, Willi Meier, Raphael Phan, Luca Henzen. “The Hash Function BLAKE”. 2014. p. 10.

[6] Thorup, Mikkel (2015). “High Speed Hashing for Integers and Strings”.

[7] Baran, Ilya; Demaine, Erik D.; Pătraşcu, Mihai (2008). “Subquadratic Algorithms for 3SUM” (PDF). Algorithmica. 50 (4): 584–596. doi:10.1007/s00453-007-9036-3.

[8] Dietzfelbinger, Martin; Hagerup, Torben; Katajainen, Jyrki; Penttonen, Martti (1997). “A Reliable Randomized Algorithm for the Closest-Pair Problem” (Postscript). Journal of Algorithms. 25 (1): 19–51. doi:10.1006/jagm.1997.0873. Retrieved 10 February 2011.

[9] Thorup, Mikkel. “Text-book algorithms at SODA”.

[10] Woelfel, Philipp (2003). Über die Komplexität der Multiplikation in eingeschränkten Branchingprogrammmodellen (PDF) (Ph.D.). Universität Dortmund. Retrieved 18 September 2012.

[11] Woelfel, Philipp (1999). Efficient Strongly Universal and Optimally Universal Hashing (PDF). Mathematical Foundations of Computer Science 1999. LNCS. 1672. pp. 262–272. doi:10.1007/3-540-48340-3_24. Retrieved 17 May 2011.
[12] Thorup, Mikkel (2009). String hashing for linear probing. Proc. 20th ACM-SIAM Symposium on Discrete Algorithms (SODA). pp. 655–664. doi:10.1137/1.9781611973068.72. Archived (PDF) from the original on 2013-10-12., section 5.3

[13] Dietzfelbinger, Martin; Gil, Joseph; Matias, Yossi; Pippenger, Nicholas (1992). Polynomial Hash Functions Are Reliable (Extended Abstract). Proc. 19th International Colloquium on Automata, Languages and Programming (ICALP). pp. 235–246.

[14] Black, J.; Halevi, S.; Krawczyk, H.; Krovetz, T. (1999). UMAC: Fast and Secure Message Authentication (PDF). Advances in Cryptology (CRYPTO '99)., Equation 1

[15] Pătraşcu, Mihai; Thorup, Mikkel (2011). The power of simple tabulation hashing. Proceedings of the 43rd annual ACM Symposium on Theory of Computing (STOC '11). pp. 1–10. arXiv:1011.5200. doi:10.1145/1993636.1993638.

[16] Kaser, Owen; Lemire, Daniel (2013). “Strongly universal string hashing is fast”. Computer Journal. Oxford University Press. arXiv:1202.4961. doi:10.1093/comjnl/bxt070.

[17] “Hebrew University Course Slides” (PDF).

[18] Robert Uzgalis. “Library Hash Functions”. 1996.

[19] Kankowski, Peter. “Hash functions: An empirical comparison”.

[20] Yigit, Ozan. “String hash functions”.

[21] Kernighan; Ritchie (1988). “6”. The C Programming Language (2nd ed.). p. 118. ISBN 0-13-110362-8.

[22] “String (Java Platform SE 6)". docs.oracle.com. Retrieved 2015-06-10.

3.11.6 Further reading

• Knuth, Donald Ervin (1998). The Art of Computer Programming, Vol. III: Sorting and Searching (3rd ed.). Reading, Mass; London: Addison-Wesley. ISBN 0-201-89685-0.

3.11.7 External links

• Open Data Structures - Section 5.1.1 - Multiplicative Hashing

3.12 K-independent hashing

In computer science, a family of hash functions is said to be k-independent or k-universal[1] if selecting a function at random from the family guarantees that the hash codes of any designated k keys are independent random variables (see precise mathematical definitions below). Such families allow good average case performance in randomized algorithms or data structures, even if the input data is chosen by an adversary. The trade-offs between the degree of independence and the efficiency of evaluating the hash function are well studied, and many k-independent families have been proposed.

3.12.1 Background

See also: Hash function

The goal of hashing is usually to map keys from some large domain (universe) U into a smaller range, such as m bins (labelled [m] = {0, . . . , m − 1}). In the analysis of randomized algorithms and data structures, it is often desirable for the hash codes of various keys to “behave randomly”. For instance, if the hash code of each key were an independent random choice in [m], the number of keys per bin could be analyzed using the Chernoff bound. A deterministic hash function cannot offer any such guarantee in an adversarial setting, as the adversary may choose the keys to be precisely the preimage of a bin. Furthermore, a deterministic hash function does not allow for rehashing: sometimes the input data turns out to be bad for the hash function (e.g. there are too many collisions), so one would like to change the hash function.
The solution to these problems is to pick a function randomly from a large family of hash functions. The randomness in choosing the hash function can be used to guarantee some desired random behavior of the hash codes of any keys of interest. The first definition along these lines was universal hashing, which guarantees a low collision probability for any two designated keys. The concept of k-independent hashing, introduced by Wegman and Carter in 1981,[2] strengthens the guarantees of random behavior to families of k designated keys, and adds a guarantee on the uniform distribution of hash codes.

3.12.2 Definitions

The strictest definition, introduced by Wegman and Carter[2] under the name “strongly universal k hash family”, is the following. A family of hash functions H = {h : U → [m]} is k-independent if for any k distinct keys (x_1, . . . , x_k) ∈ U^k and any k hash codes (not necessarily distinct) (y_1, . . . , y_k) ∈ [m]^k, we have:

   Pr_{h∈H}[h(x_1) = y_1 ∧ · · · ∧ h(x_k) = y_k] = m^{−k}

This definition is equivalent to the following two conditions:
1. for any fixed x ∈ U, as h is drawn randomly from H, h(x) is uniformly distributed in [m].

2. for any fixed, distinct keys x_1, . . . , x_k ∈ U, as h is drawn randomly from H, h(x_1), . . . , h(x_k) are independent random variables.

Often it is inconvenient to achieve the perfect joint probability of m^{−k} due to rounding issues. Following,[3] one may define a (µ, k)-independent family to satisfy:

   ∀ distinct (x_1, . . . , x_k) ∈ U^k and ∀(y_1, . . . , y_k) ∈ [m]^k, Pr_{h∈H}[h(x_1) = y_1 ∧ · · · ∧ h(x_k) = y_k] ≤ µ/m^k

Observe that, even if µ is close to 1, the h(x_i) are no longer independent random variables, which is often a problem in the analysis of randomized algorithms.[2] Therefore, a more common alternative to dealing with rounding issues is to prove that the hash family is close in statistical distance to a k-independent family, which allows black-box use of the independence properties.

3.12.3 Techniques

Polynomials with random coefficients

The original technique for constructing k-independent hash functions, given by Carter and Wegman, was to select a large prime number p, choose k random numbers modulo p, and use these numbers as the coefficients of a polynomial of degree less than k whose values modulo p are used as the value of the hash function. All polynomials of the given degree modulo p are equally likely, and any polynomial is uniquely determined by any k-tuple of argument-value pairs with distinct arguments, from which it follows that any k-tuple of distinct arguments is equally likely to be mapped to any k-tuple of hash values.[2]
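A minimal C sketch of the polynomial construction follows, evaluating the random-coefficient polynomial by Horner's rule modulo the prime 2^31 − 1; note that the final reduction into m bins slightly perturbs the exact m^{−k} probabilities, as discussed above. All names are illustrative only.

   #include <stdint.h>

   static const uint64_t PRIME = ((uint64_t)1 << 31) - 1;   /* 2^31 - 1 */

   /* k-independent hash: evaluate c[k-1]*x^(k-1) + ... + c[0] mod PRIME.
    * The coefficients c[0..k-1] are drawn uniformly from
    * {0, ..., PRIME-1} once per hash function; the key x is assumed
    * to lie in the same range. */
   uint64_t hash_poly(uint64_t x, const uint64_t *c, int k, uint64_t m)
   {
       uint64_t h = 0;
       for (int i = k - 1; i >= 0; --i)   /* Horner's rule */
           h = (h * x + c[i]) % PRIME;    /* fits in 64 bits: h, x < 2^31 */
       return h % m;
   }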

Tabulation hashing

Main article: Tabulation hashing

Tabulation hashing is a technique for mapping keys to hash values by partitioning each key into bytes, using each byte as the index into a table of random numbers (with a different table for each byte position), and combining the results of these table lookups by a bitwise exclusive or operation. Thus, it requires more randomness in its initialization than the polynomial method, but avoids possibly-slow multiplication operations. It is 3-independent but not 4-independent.[4] Variations of tabulation hashing can achieve higher degrees of independence by performing table lookups based on overlapping combinations of bits from the input key, or by applying simple tabulation hashing iteratively.[5][6]

3.12.4 Independence needed by different hashing methods

The notion of k-independence can be used to differentiate between different hashing methods, according to the level of independence required to guarantee constant expected time per operation.
For instance, hash chaining takes constant expected time even with a 2-independent hash function, because the expected time to perform a search for a given key is bounded by the expected number of collisions that key is involved in. By linearity of expectation, this expected number equals the sum, over all other keys in the hash table, of the probability that the given key and the other key collide. Because the terms of this sum only involve probabilistic events involving two keys, 2-independence is sufficient to ensure that this sum has the same value that it would for a truly random hash function.
Double hashing is another method of hashing that requires a low degree of independence. It is a form of open addressing that uses two hash functions: one to determine the start of a probe sequence, and the other to determine the step size between positions in the probe sequence. As long as both of these are 2-independent, this method gives constant expected time per operation.[7]
On the other hand, linear probing, a simpler form of open addressing where the step size is always one, requires 5-independence. It can be guaranteed to work in constant expected time per operation with a 5-independent hash function,[8] and there exist 4-independent hash functions for which it takes logarithmic time per operation.[9]

3.12.5 References

[1] Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2009) [1990]. Introduction to Algorithms (3rd ed.). MIT Press and McGraw-Hill. ISBN 0-262-03384-4.

[2] Wegman, Mark N.; Carter, J. Lawrence (1981). “New hash functions and their use in authentication and set equality” (PDF). Journal of Computer and System Sciences. 22 (3): 265–279. doi:10.1016/0022-0000(81)90033-7. Conference version in FOCS'79. Retrieved 9 February 2011.

[3] Siegel, Alan (2004). “On universal classes of extremely random constant-time hash functions and their time-space tradeoff” (PDF). SIAM Journal on Computing. 33 (3): 505–543. doi:10.1137/S0097539701386216. Conference version in FOCS'89.

[4] Pătraşcu, Mihai; Thorup, Mikkel (2012), “The power of simple tabulation hashing”, Journal of the ACM, 59 (3): Art. 14, arXiv:1011.5200, doi:10.1145/2220357.2220361, MR 2946218.

[5] Siegel, Alan (2004), “On universal classes of extremely random constant-time hash functions”,
SIAM Journal on Computing, 33 (3): 505–543, doi:10.1137/S0097539701386216, MR 2066640.

[6] Thorup, M. (2013), “Simple tabulation, fast expanders, double tabulation, and high independence”, Proceedings of the 54th Annual IEEE Symposium on Foundations of Computer Science (FOCS 2013), pp. 90–99, doi:10.1109/FOCS.2013.18, MR 3246210.

[7] Bradford, Phillip G.; Katehakis, Michael N. (2007), “A probabilistic study on combinatorial expanders and hashing” (PDF), SIAM Journal on Computing, 37 (1): 83–111, doi:10.1137/S009753970444630X, MR 2306284.

[8] Pagh, Anna; Pagh, Rasmus; Ružić, Milan (2009), “Linear probing with constant independence”, SIAM Journal on Computing, 39 (3): 1107–1120, doi:10.1137/070702278, MR 2538852

[9] Pătraşcu, Mihai; Thorup, Mikkel (2010), “On the k-independence required by linear probing and minwise independence” (PDF), Automata, Languages and Programming, 37th International Colloquium, ICALP 2010, Bordeaux, France, July 6-10, 2010, Proceedings, Part I, Lecture Notes in Computer Science, 6198, Springer, pp. 715–726, doi:10.1007/978-3-642-14165-2_60

3.12.6 Further reading

• Motwani, Rajeev; Raghavan, Prabhakar (1995). Randomized Algorithms. Cambridge University Press. p. 221. ISBN 0-521-47465-5.

3.13 Tabulation hashing

In computer science, tabulation hashing is a method for constructing universal families of hash functions by combining table lookup with exclusive or operations. It was first studied in the form of Zobrist hashing for computer games; later work by Carter and Wegman extended this method to arbitrary fixed-length keys. Generalizations of tabulation hashing have also been developed that can handle variable-length keys such as text strings.
Despite its simplicity, tabulation hashing has strong theoretical properties that distinguish it from some other hash functions. In particular, it is 3-independent: every 3-tuple of keys is equally likely to be mapped to any 3-tuple of hash values. However, it is not 4-independent. More sophisticated but slower variants of tabulation hashing extend the method to higher degrees of independence.
Because of its high degree of independence, tabulation hashing is usable with hashing methods that require a high-quality hash function, including linear probing, cuckoo hashing, and the MinHash technique for estimating the size of set intersections.

3.13.1 Method

Let p denote the number of bits in a key to be hashed, and q denote the number of bits desired in an output hash value. Choose another number r, less than or equal to p; this choice is arbitrary, and controls the tradeoff between time and memory usage of the hashing method: smaller values of r use less memory but cause the hash function to be slower. Compute t by rounding p/r up to the next larger integer; this gives the number of r-bit blocks needed to represent a key. For instance, if r = 8, then an r-bit number is a byte, and t is the number of bytes per key. The key idea of tabulation hashing is to view a key as a vector of t r-bit numbers, use a lookup table filled with random values to compute a hash value for each of the r-bit numbers representing a given key, and combine these values with the bitwise binary exclusive or operation.[1] The choice of r should be made in such a way that this table is not too large; e.g., so that it fits into the computer’s cache memory.[2]
The initialization phase of the algorithm creates a two-dimensional array T of dimensions 2^r by t, and fills the array with random q-bit numbers. Once the array T is initialized, it can be used to compute the hash value h(x) of any given key x. To do so, partition x into r-bit values, where x_0 consists of the low order r bits of x, x_1 consists of the next r bits, etc. For example, with the choice r = 8, x_i is just the ith byte of x. Then, use these values as indices into T and combine them with the exclusive or operation:[1]

   h(x) = T[0][x_0] ⊕ T[1][x_1] ⊕ T[2][x_2] ⊕ ...
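A minimal C sketch with r = 8, t = 4 and q = 32 (32-bit keys hashed to 32-bit values) follows; rand() stands in for a proper source of random bits, and the names are illustrative only.

   #include <stdint.h>
   #include <stdlib.h>

   static uint32_t T[4][256];   /* t = 4 tables of 2^r = 256 random words */

   void tabulation_init(void)
   {
       for (int i = 0; i < 4; i++)
           for (int j = 0; j < 256; j++)   /* fill with random q-bit values */
               T[i][j] = ((uint32_t)rand() << 16) ^ (uint32_t)rand();
   }

   uint32_t tabulation_hash(uint32_t x)
   {
       return T[0][x & 0xFF] ^ T[1][(x >> 8) & 0xFF] ^
              T[2][(x >> 16) & 0xFF] ^ T[3][(x >> 24) & 0xFF];
   }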

3.13.2 History

The first instance of tabulation hashing is Zobrist hashing, a method for hashing positions in abstract board games such as chess named after Albert Lindsey Zobrist, who published it in 1970.[3] In this method, a random bitstring is generated for each game feature such as a combination of a chess piece and a square of the chessboard. Then, to hash any game position, the bitstrings for the features of that position are combined by a bitwise exclusive or. The resulting hash value can then be used as an index into a transposition table. Because each move typically changes only a small number of game features, the Zobrist value of the position after a move can be updated quickly from the value of the position before the move, without needing to loop over all of the features of the position.[4]
Tabulation hashing in greater generality, for arbitrary binary values, was later rediscovered by Carter & Wegman (1979) and studied in more detail by Pătraşcu & Thorup (2012).
3.13.3 Universality

Carter & Wegman (1979) define a randomized scheme for generating hash functions to be universal if, for any two keys, the probability that they collide (that is, they are mapped to the same value as each other) is 1/m, where m is the number of values that the keys can take on. They defined a stronger property in the subsequent paper Wegman & Carter (1981): a randomized scheme for generating hash functions is k-independent if, for every k-tuple of keys, and each possible k-tuple of values, the probability that those keys are mapped to those values is 1/m^k. 2-independent hashing schemes are automatically universal, and any universal hashing scheme can be converted into a 2-independent scheme by storing a random number x as part of the initialization phase of the algorithm and adding x to each hash value. Thus, universality is essentially the same as 2-independence. However, k-independence for larger values of k is a stronger property, held by fewer hashing algorithms.
As Pătraşcu & Thorup (2012) observe, tabulation hashing is 3-independent but not 4-independent. For any single key x, T[x_0,0] is equally likely to take on any hash value, and the exclusive or of T[x_0,0] with the remaining table values does not change this property. For any two keys x and y, x is equally likely to be mapped to any hash value as before, and there is at least one position i where x_i ≠ y_i; the table value T[y_i,i] is used in the calculation of h(y) but not in the calculation of h(x), so even after the value of h(x) has been determined, h(y) is equally likely to be any valid hash value. Similarly, for any three keys x, y, and z, at least one of the three keys has a position i where its value z_i differs from the other two, so that even after the values of h(x) and h(y) are determined, h(z) is equally likely to be any valid hash value.[5]
However, this reasoning breaks down for four keys because there are sets of keys w, x, y, and z where none of the four has a byte value that it does not share with at least one of the other keys. For instance, if the keys have two bytes each, and w, x, y, and z are the four keys that have either zero or one as their byte values, then each byte value in each position is shared by exactly two of the four keys. For these four keys, the hash values computed by tabulation hashing will always satisfy the equation h(w) ⊕ h(x) ⊕ h(y) ⊕ h(z) = 0, whereas for a 4-independent hashing scheme the same equation would only be satisfied with probability 1/m. Therefore, tabulation hashing is not 4-independent.[5]

3.13.4 Application

Because tabulation hashing is a universal hashing scheme, it can be used in any hashing-based algorithm in which universality is sufficient. For instance, in hash chaining, the expected time per operation is proportional to the sum of collision probabilities, which is the same for any universal scheme as it would be for truly random hash functions, and is constant whenever the load factor of the hash table is constant. Therefore, tabulation hashing can be used to compute hash functions for hash chaining with a theoretical guarantee of constant expected time per operation.[6]
However, universal hashing is not strong enough to guarantee the performance of some other hashing algorithms. For instance, for linear probing, 5-independent hash functions are strong enough to guarantee constant time operation, but there are 4-independent hash functions that fail.[7] Nevertheless, despite only being 3-independent, tabulation hashing provides the same constant-time guarantee for linear probing.[8]
Cuckoo hashing, another technique for implementing hash tables, guarantees constant time per lookup (regardless of the hash function). Insertions into a cuckoo hash table may fail, causing the entire table to be rebuilt, but such failures are sufficiently unlikely that the expected time per insertion (using either a truly random hash function or a hash function with logarithmic independence) is constant. With tabulation hashing, on the other hand, the best bound known on the failure probability is higher, high enough that insertions cannot be guaranteed to take constant expected time. Nevertheless, tabulation hashing is adequate to ensure the linear-expected-time construction of a cuckoo hash table for a static set of keys that does not change as the table is used.[8]

3.13.5 Extensions

Although tabulation hashing as described above (“simple tabulation hashing”) is only 3-independent, variations of this method can be used to obtain hash functions with much higher degrees of independence. Siegel (2004) uses the same idea of using exclusive or operations to combine random values from a table, with a more complicated algorithm based on expander graphs for transforming the key bits into table indices, to define hashing schemes that are k-independent for any constant or even logarithmic value of k. However, the number of table lookups needed to compute each hash value using Siegel’s variation of tabulation hashing, while constant, is still too large to be practical, and the use of expanders in Siegel’s technique also makes it not fully constructive. Thorup (2013) provides a scheme based on tabulation hashing that reaches high degrees of independence more quickly, in a more constructive way. He observes that using one round of simple tabulation hashing to expand the input keys to six times their original length, and then a second round of simple tabulation hashing on the expanded keys, results in a hashing scheme whose independence number is exponential in the parameter r, the number of bits per block in the partition of the keys into blocks.
Simple tabulation is limited to keys of a fixed length, because a different table of random values needs to be initialized for each position of a block in the keys. Lemire (2012) studies variations of tabulation hashing suitable for variable-length keys such as character strings. The general type of hashing scheme studied by Lemire uses a single table T indexed by the value of a block, regardless of its position within the key. However, the values from this table may be combined by a more complicated function than bitwise exclusive or. Lemire shows that no scheme of this type can be 3-independent. Nevertheless, he shows that it is still possible to achieve 2-independence. In particular, a tabulation scheme that interprets the values T[x_i] (where x_i is, as before, the ith block of the input) as the coefficients of a polynomial over a finite field and then takes the remainder of the resulting polynomial modulo another polynomial, gives a 2-independent hash function.

3.13.6 Notes

[1] Morin (2014); Mitzenmacher & Upfal (2014).

[2] Mitzenmacher & Upfal (2014).

[3] Thorup (2013).

[4] Zobrist (1970).

[5] Pătraşcu & Thorup (2012); Mitzenmacher & Upfal (2014).

[6] Carter & Wegman (1979).

[7] For the sufficiency of 5-independent hashing for linear probing, see Pagh, Pagh & Ružić (2009). For examples of weaker hashing schemes that fail, see Pătraşcu & Thorup (2010).

[8] Pătraşcu & Thorup (2012).

3.13.7 References

Secondary sources

• Morin, Pat (February 22, 2014), “Section 5.2.3: Tabulation hashing”, Open Data Structures (in pseudocode) (0.1Gβ ed.), pp. 115–116, retrieved 2016-01-08.

• Mitzenmacher, Michael; Upfal, Eli (2014), “Some practical randomized algorithms and data structures”, in Tucker, Allen; Gonzalez, Teofilo; Diaz-Herrera, Jorge, Computing Handbook: Computer Science and Software Engineering (3rd ed.), CRC Press, pp. 11-1 – 11-23, ISBN 9781439898529. See in particular Section 11.1.1: Tabulation hashing, pp. 11-3 – 11-4.

Primary sources

• Carter, J. Lawrence; Wegman, Mark N. (1979), “Universal classes of hash functions”, Journal of Computer and System Sciences, 18 (2): 143–154, doi:10.1016/0022-0000(79)90044-8, MR 532173.

• Lemire, Daniel (2012), “The universality of iterated hashing over variable-length strings”, Discrete Applied Mathematics, 160: 604–617, arXiv:1008.1715, doi:10.1016/j.dam.2011.11.009, MR 2876344.

• Pagh, Anna; Pagh, Rasmus; Ružić, Milan (2009), “Linear probing with constant independence”, SIAM Journal on Computing, 39 (3): 1107–1120, doi:10.1137/070702278, MR 2538852.

• Pătraşcu, Mihai; Thorup, Mikkel (2010), “On the k-independence required by linear probing and minwise independence” (PDF), Proceedings of the 37th International Colloquium on Automata, Languages and Programming (ICALP 2010), Bordeaux, France, July 6-10, 2010, Part I, Lecture Notes in Computer Science, 6198, Springer, pp. 715–726, doi:10.1007/978-3-642-14165-2_60, MR 2734626.

• Pătraşcu, Mihai; Thorup, Mikkel (2012), “The power of simple tabulation hashing”, Journal of the ACM, 59 (3): Art. 14, arXiv:1011.5200, doi:10.1145/2220357.2220361, MR 2946218.

• Siegel, Alan (2004), “On universal classes of extremely random constant-time hash functions”, SIAM Journal on Computing, 33 (3): 505–543, doi:10.1137/S0097539701386216, MR 2066640.

• Thorup, M. (2013), “Simple tabulation, fast expanders, double tabulation, and high independence”, Proceedings of the 54th Annual IEEE Symposium on Foundations of Computer Science (FOCS 2013), pp. 90–99, doi:10.1109/FOCS.2013.18, MR 3246210.

• Wegman, Mark N.; Carter, J. Lawrence (1981), “New hash functions and their use in authentication and set equality”, Journal of Computer and System Sciences, 22 (3): 265–279, doi:10.1016/0022-0000(81)90033-7, MR 633535.

• Zobrist, Albert L. (April 1970), A New Hashing Method with Application for Game Playing (PDF), Tech. Rep. 88, Madison, Wisconsin: Computer Sciences Department, University of Wisconsin.

3.14 Cryptographic hash function

A cryptographic hash function is a special class of hash function that has certain properties which make it suitable for use in cryptography. It is a mathematical algorithm that maps data of arbitrary size to a bit string of a fixed size (a hash function) which is designed to also be a one-way function, that is, a function which is infeasible to invert. The only way to recreate the input data from an ideal cryptographic hash function’s output is to attempt
[Figure: A cryptographic hash function (specifically SHA-1) at work. A small change in the input (in the word “over”) drastically changes the output (digest). This is the so-called avalanche effect.]

a brute-force search of possible inputs to see if they produce a match, or use a “rainbow table” of matched hashes. Bruce Schneier has called one-way hash functions “the workhorses of modern cryptography”.[1] The input data is often called the message, and the output (the hash value or hash) is often called the message digest or simply the digest.
The ideal cryptographic hash function has five main properties:

• it is deterministic so the same message always results in the same hash
• it is quick to compute the hash value for any given message
• it is infeasible to generate a message from its hash value except by trying all possible messages
• a small change to a message should change the hash value so extensively that the new hash value appears uncorrelated with the old hash value
• it is infeasible to find two different messages with the same hash value

Cryptographic hash functions have many information-security applications, notably in digital signatures, message authentication codes (MACs), and other forms of authentication. They can also be used as ordinary hash functions, to index data in hash tables, for fingerprinting, to detect duplicate data or uniquely identify files, and as checksums to detect accidental data corruption. Indeed, in information-security contexts, cryptographic hash values are sometimes called (digital) fingerprints, checksums, or just hash values, even though all these terms stand for more general functions with rather different properties and purposes.

3.14.1 Properties

Most cryptographic hash functions are designed to take a string of any length as input and produce a fixed-length hash value.
A cryptographic hash function must be able to withstand all known types of cryptanalytic attack. In theoretical cryptography, the security level of a cryptographic hash function has been defined using the following properties:

• Pre-image resistance: Given a hash value h it should be difficult to find any message m such that h = hash(m). This concept is related to that of one-way function. Functions that lack this property are vulnerable to preimage attacks.

• Second pre-image resistance: Given an input m1 it should be difficult to find a different input m2 such that hash(m1) = hash(m2). Functions that lack this property are vulnerable to second-preimage attacks.

• Collision resistance: It should be difficult to find two different messages m1 and m2 such that hash(m1) = hash(m2). Such a pair is called a cryptographic hash collision. This property is sometimes referred to as strong collision resistance. It requires a hash value at least twice as long as that required for preimage-resistance; otherwise collisions may be found by a birthday attack.[2]

Collision resistance implies second pre-image resistance, but does not imply pre-image resistance.[3] The weaker assumption is always preferred in theoretical cryptography, but in practice, a hash-function which is only second pre-image resistant is considered insecure and is therefore not recommended for real applications.
Informally, these properties mean that a malicious adversary cannot replace or modify the input data without changing its digest. Thus, if two strings have the same digest, one can be very confident that they are identical. A function meeting these criteria may still have undesirable properties. Currently popular cryptographic hash functions are vulnerable to length-extension attacks: given hash(m) and len(m) but not m, by choosing a suitable m' an attacker can calculate hash(m || m') where || denotes concatenation.[4] This property can be used to break naive authentication schemes based on hash functions. The HMAC construction works around these problems.
In practice, collision resistance is insufficient for many practical uses. In addition to collision resistance, it should be impossible for an adversary to find two messages with substantially similar digests; or to infer any useful information about the data, given only its digest. In particular, a hash function should behave as much as possible like a random function (often called a random oracle in proofs of security) while still being deterministic and efficiently computable. This rules out functions like the SWIFFT function, which can be rigorously proven to be collision resistant assuming that certain problems on ideal lattices are computationally difficult, but as a linear function, does not satisfy these additional properties.[5]
Checksum algorithms, such as CRC32 and other cyclic redundancy checks, are designed to meet much weaker requirements, and are generally unsuitable as cryptographic hash functions. For example, a CRC was used for message integrity in the WEP encryption standard, but an attack was readily discovered which exploited the linearity of the checksum.

Degree of difficulty

In cryptographic practice, “difficult” generally means “almost certainly beyond the reach of any adversary who must be prevented from breaking the system for as long as the security of the system is deemed important”. The meaning of the term is therefore somewhat dependent on the application, since the effort that a malicious agent may put into the task is usually proportional to his expected gain. However, since the needed effort usually grows very quickly with the digest length, even a thousand-fold advantage in processing power can be neutralized by adding a few dozen bits to the latter.
For messages selected from a limited set of messages, for example passwords or other short messages, it can be feasible to invert a hash by trying all possible messages in the set. Because cryptographic hash functions are typically designed to be computed quickly, special key derivation functions that require greater computing resources have been developed that make such brute force attacks more difficult.
In some theoretical analyses “difficult” has a specific mathematical meaning, such as “not solvable in asymptotic polynomial time”. Such interpretations of difficulty are important in the study of provably secure cryptographic hash functions but do not usually have a strong connection to practical security. For example, an exponential time algorithm can sometimes still be fast enough to make a feasible attack. Conversely, a polynomial time algorithm (e.g., one that requires n^20 steps for n-digit keys) may be too slow for any practical use.

3.14.2 Illustration

An illustration of the potential use of a cryptographic hash is as follows: Alice poses a tough math problem to Bob and claims she has solved it. Bob would like to try it himself, but would yet like to be sure that Alice is not bluffing. Therefore, Alice writes down her solution, computes its hash and tells Bob the hash value (whilst keeping the solution secret). Then, when Bob comes up with the solution himself a few days later, Alice can prove that she had the solution earlier by revealing it and having Bob hash it and check that it matches the hash value given to him before. (This is an example of a simple commitment scheme; in actual practice, Alice and Bob will often be computer programs, and the secret would be something less easily spoofed than a claimed puzzle solution).

3.14.3 Applications

Verifying the integrity of files or messages

Main article: File verification

An important application of secure hashes is verification of message integrity. Determining whether any changes have been made to a message (or a file), for example, can be accomplished by comparing message digests calculated before, and after, transmission (or any other event). For this reason, most digital signature algorithms only confirm the authenticity of a hashed digest of the message to be “signed”. Verifying the authenticity of a hashed digest of the message is considered proof that the message itself is authentic.
MD5, SHA1, or SHA2 hashes are sometimes posted along with files on websites or forums to allow verification of integrity.[6] This practice establishes a chain of trust so long as the hashes are posted on a site authenticated by HTTPS.

Password verification

Main article: password hashing

A related application is password verification (first invented by Roger Needham). Storing all user passwords as cleartext can result in a massive security breach if the password file is compromised. One way to reduce this danger is to only store the hash digest of each password. To authenticate a user, the password presented by the user is hashed and compared with the stored hash. (Note that this approach prevents the original passwords from being retrieved if forgotten or lost, and they have to be replaced with new ones.) The password is often concatenated with a random, non-secret salt value before the hash function is applied. The salt is stored with the password
hash. Because users have different salts, it is not feasible to store tables of precomputed hash values for common passwords. Key stretching functions, such as PBKDF2, Bcrypt or Scrypt, typically use repeated invocations of a cryptographic hash to increase the time required to perform brute force attacks on stored password digests.
In 2013 a long-term Password Hashing Competition was announced to choose a new, standard algorithm for password hashing.[7]
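The flow can be sketched in C as below. Here toy_hash is a stand-in (FNV-1a, which is not cryptographic) used only to keep the example self-contained; real code should use a key stretching function such as the ones named above, and all names are illustrative only.

   #include <stdbool.h>
   #include <stdint.h>
   #include <string.h>

   #define DIGEST_LEN 8

   /* Toy stand-in for a cryptographic hash of (salt || password);
    * FNV-1a is NOT cryptographically secure. */
   static void toy_hash(const char *salt, const char *password,
                        uint8_t digest[DIGEST_LEN])
   {
       uint64_t h = 14695981039346656037ULL;            /* FNV offset */
       for (const char *s = salt; *s; s++)
           h = (h ^ (uint8_t)*s) * 1099511628211ULL;    /* FNV prime */
       for (const char *s = password; *s; s++)
           h = (h ^ (uint8_t)*s) * 1099511628211ULL;
       memcpy(digest, &h, DIGEST_LEN);
   }

   /* Only (salt, digest) is stored; to authenticate, recompute
    * the digest from the presented password and compare. */
   bool verify_password(const char *salt, const char *attempt,
                        const uint8_t stored[DIGEST_LEN])
   {
       uint8_t digest[DIGEST_LEN];
       toy_hash(salt, attempt, digest);
       return memcmp(digest, stored, DIGEST_LEN) == 0;
   }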
Proof-of-work

Main article: Proof-of-work system

A proof-of-work system (or protocol, or function) is an economic measure to deter denial of service attacks and other service abuses such as spam on a network by requiring some work from the service requester, usually meaning processing time by a computer. A key feature of these schemes is their asymmetry: the work must be moderately hard (but feasible) on the requester side but easy to check for the service provider. One popular system – used in Bitcoin mining and Hashcash – uses partial hash inversions to prove that work was done, as a good-will token to send an e-mail. The sender is required to find a message whose hash value begins with a number of zero bits. The average work that the sender needs to perform in order to find a valid message is exponential in the number of zero bits required in the hash value, while the recipient can verify the validity of the message by executing a single hash function. For instance, in Hashcash, a sender is asked to generate a header whose 160-bit SHA-1 hash value has the first 20 bits as zeros. The sender will on average have to try 2^19 times to find a valid header.

File or data identifier

A message digest can also serve as a means of reliably identifying a file; several source code management systems, including Git, Mercurial and Monotone, use the sha1sum of various types of content (file content, directory trees, ancestry information, etc.) to uniquely identify them. Hashes are used to identify files on peer-to-peer filesharing networks. For example, in an ed2k link, an MD4-variant hash is combined with the file size, providing sufficient information for locating file sources, downloading the file and verifying its contents. Magnet links are another example. Such file hashes are often the top hash of a hash list or a hash tree which allows for additional benefits.
One of the main applications of a hash function is to allow the fast look-up of data in a hash table. Being hash functions of a particular kind, cryptographic hash functions lend themselves well to this application too.
However, compared with standard hash functions, cryptographic hash functions tend to be much more expensive computationally. For this reason, they tend to be used in contexts where it is necessary for users to protect themselves against the possibility of forgery (the creation of data with the same digest as the expected data) by potentially malicious participants.

Pseudorandom generation and key derivation

Hash functions can also be used in the generation of pseudorandom bits, or to derive new keys or passwords from a single secure key or password.

3.14.4 Hash functions based on block ciphers

There are several methods to use a block cipher to build a cryptographic hash function, specifically a one-way compression function.
The methods resemble the block cipher modes of operation usually used for encryption. Many well-known hash functions, including MD4, MD5, SHA-1 and SHA-2 are built from block-cipher-like components designed for the purpose, with feedback to ensure that the resulting function is not invertible. SHA-3 finalists included functions with block-cipher-like components (e.g., Skein, BLAKE) though the function finally selected, Keccak, was built on a cryptographic sponge instead.
A standard block cipher such as AES can be used in place of these custom block ciphers; that might be useful when an embedded system needs to implement both encryption and hashing with minimal code size or hardware area. However, that approach can have costs in efficiency and security. The ciphers in hash functions are built for hashing: they use large keys and blocks, can efficiently change keys every block, and have been designed and vetted for resistance to related-key attacks. General-purpose ciphers tend to have different design goals. In particular, AES has key and block sizes that make it nontrivial to use to generate long hash values; AES encryption becomes less efficient when the key changes each block; and related-key attacks make it potentially less secure for use in a hash function than for encryption.

3.14.5 Merkle–Damgård construction

Main article: Merkle–Damgård construction

A hash function must be able to process an arbitrary-length message into a fixed-length output. This can be achieved by breaking the input up into a series of equal-sized blocks, and operating on them in sequence using a one-way compression function. The compression function can either be specially designed for hashing or be built from a block cipher. A hash function built with the Merkle–Damgård construction is as resistant to colli-
[Figure: The Merkle–Damgård hash construction. Message blocks 1 through n, followed by length padding, are fed through the compression function f starting from an initialisation vector IV, with a finalisation step producing the hash.]

sions as is its compression function; any collision for the full hash function can be traced back to a collision in the compression function.
The last block processed should also be unambiguously length padded; this is crucial to the security of this construction. This construction is called the Merkle–Damgård construction. Most widely used hash functions, including SHA-1 and MD5, take this form.
The construction has certain inherent flaws, including length-extension and generate-and-paste attacks, and cannot be parallelized. As a result, many entrants in the recent NIST hash function competition were built on different, sometimes novel, constructions.
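A toy C sketch of the chaining pattern follows; the compression function f here is a simple mixing step with no cryptographic strength, so the code only illustrates the block iteration and the length padding, not a usable hash. All names are illustrative only.

   #include <stdint.h>
   #include <stddef.h>
   #include <string.h>

   /* Toy compression function: mixes an 8-byte block into the chaining
    * value. NOT one-way or collision resistant. */
   static uint64_t f(uint64_t h, uint64_t block)
   {
       h ^= block;
       h *= 0x9E3779B97F4A7C15ULL;
       h ^= h >> 31;
       return h;
   }

   uint64_t md_hash(const uint8_t *msg, size_t len)
   {
       uint64_t h = 0x6A09E667F3BCC908ULL;   /* fixed initialisation vector */
       size_t i = 0;
       for (; i + 8 <= len; i += 8) {        /* full 8-byte blocks */
           uint64_t block;
           memcpy(&block, msg + i, 8);
           h = f(h, block);
       }
       uint64_t last = 0;                    /* zero-padded final block */
       memcpy(&last, msg + i, len - i);
       h = f(h, last);
       return f(h, (uint64_t)len);           /* length padding */
   }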
hashes with no greater difficulty.[10] Among the n mes-
sages with the same MD5 hash, there is likely to be a col-
3.14.6 Use in building other cryptographic lision in SHA-1. The additional work needed to find the
SHA-1 collision (beyond the exponential birthday search)
primitives requires only polynomial time.[11][12]
Hash functions can be used to build other cryptographic
primitives. For these other primitives to be cryptograph- 3.14.8 Cryptographic hash algorithms
ically secure, care must be taken to build them correctly.
Message authentication codes (MACs) (also called keyed There is a long list of cryptographic hash functions, al-
hash functions) are often built from hash functions. though many have been found to be vulnerable and should
HMAC is such a MAC. not be used. Even if a hash function has never been bro-
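As a concrete illustration, the following is a minimal Python sketch of the HMAC construction, assuming SHA-256 as the underlying hash; the helper name hmac_sha256 is ours, and the result is checked against Python's standard hmac module:

import hashlib, hmac

def hmac_sha256(key: bytes, message: bytes) -> bytes:
    # HMAC(K, m) = H((K' xor opad) || H((K' xor ipad) || m)),
    # where K' is the key padded (or first hashed) to the block size.
    block_size = 64                      # SHA-256 block size in bytes
    if len(key) > block_size:
        key = hashlib.sha256(key).digest()
    key = key.ljust(block_size, b"\x00")
    ipad = bytes(b ^ 0x36 for b in key)
    opad = bytes(b ^ 0x5C for b in key)
    inner = hashlib.sha256(ipad + message).digest()
    return hashlib.sha256(opad + inner).digest()

assert hmac_sha256(b"key", b"msg") == hmac.new(b"key", b"msg", hashlib.sha256).digest()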
Just as block ciphers can be used to build hash functions, hash functions can be used to build block ciphers. Luby–Rackoff constructions using hash functions can be provably secure if the underlying hash function is secure. Also, many hash functions (including SHA-1 and SHA-2) are built by using a special-purpose block cipher in a Davies–Meyer or other construction. That cipher can also be used in a conventional mode of operation, without the same security guarantees. See SHACAL, BEAR and LION.

Pseudorandom number generators (PRNGs) can be built using hash functions. This is done by combining a (secret) random seed with a counter and hashing it.
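A minimal sketch of this seed-plus-counter scheme in Python, assuming SHA-256 as the hash (the function name hash_prng is ours; a production generator would use a vetted design rather than this illustration):

import hashlib

def hash_prng(seed: bytes, nbytes: int) -> bytes:
    # Combine the secret seed with an incrementing counter, hash each
    # combination, and concatenate the digests to form the output stream.
    out = bytearray()
    counter = 0
    while len(out) < nbytes:
        block = hashlib.sha256(seed + counter.to_bytes(8, "big")).digest()
        out.extend(block)
        counter += 1
    return bytes(out[:nbytes])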
Some hash functions, such as Skein, Keccak, and RadioGatún, output an arbitrarily long stream and can be used as a stream cipher, and stream ciphers can also be built from fixed-length digest hash functions. Often this is done by first building a cryptographically secure pseudorandom number generator and then using its stream of random bytes as keystream. SEAL is a stream cipher that uses SHA-1 to generate internal tables, which are then used in a keystream generator more or less unrelated to the hash algorithm. SEAL is not guaranteed to be as strong (or weak) as SHA-1. Similarly, the key expansion of the HC-128 and HC-256 stream ciphers makes heavy use of the SHA-256 hash function.

3.14.7 Concatenation

Concatenating outputs from multiple hash functions provides collision resistance as good as the strongest of the algorithms included in the concatenated result. For example, older versions of Transport Layer Security (TLS) and Secure Sockets Layer (SSL) use concatenated MD5 and SHA-1 sums.[8][9] This ensures that a method to find collisions in one of the hash functions does not defeat data protected by both hash functions.
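The combiner itself is simple to express; here is a Python sketch of the MD5||SHA-1 concatenation used by older TLS/SSL (for illustration only — both underlying hashes are now considered broken):

import hashlib

def md5_sha1_combiner(message: bytes) -> bytes:
    # Concatenated combiner in the style of SSL 3.0/TLS 1.0: a collision
    # against the combiner must collide in MD5 and SHA-1 simultaneously.
    return hashlib.md5(message).digest() + hashlib.sha1(message).digest()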
For Merkle–Damgård construction hash functions, the concatenated function is as collision-resistant as its strongest component, but not more collision-resistant. Antoine Joux observed that 2-collisions lead to n-collisions: if it is feasible for an attacker to find two messages with the same MD5 hash, the attacker can find as many messages as the attacker desires with identical MD5 hashes with no greater difficulty.[10] Among the n messages with the same MD5 hash, there is likely to be a collision in SHA-1. The additional work needed to find the SHA-1 collision (beyond the exponential birthday search) requires only polynomial time.[11][12]

3.14.8 Cryptographic hash algorithms

There is a long list of cryptographic hash functions, although many have been found to be vulnerable and should not be used. Even if a hash function has never been broken, a successful attack against a weakened variant may undermine the experts' confidence and lead to its abandonment. For instance, in August 2004 weaknesses were found in several then-popular hash functions, including SHA-0, RIPEMD, and MD5. These weaknesses called into question the security of stronger algorithms derived from the weak hash functions—in particular, SHA-1 (a strengthened version of SHA-0), RIPEMD-128, and RIPEMD-160 (both strengthened versions of RIPEMD). Neither SHA-0 nor RIPEMD is widely used, since both were replaced by their strengthened versions.

As of 2009, the two most commonly used cryptographic hash functions were MD5 and SHA-1. However, a successful attack on MD5 broke Transport Layer Security in 2008.[13]

The United States National Security Agency (NSA) developed SHA-0 and SHA-1.

On 12 August 2004, Joux, Carribault, Lemuet, and Jalby announced a collision for the full SHA-0 algorithm. Joux et al. accomplished this using a generalization of the Chabaud and Joux attack. They found that the collision had complexity 2^51 and took about 80,000 CPU hours on a supercomputer with 256 Itanium 2 processors—equivalent to 13 days of full-time use of the supercomputer.

In February 2005, an attack on SHA-1 was reported that would find collisions in about 2^69 hashing operations, rather than the 2^80 expected for a 160-bit hash function. In August 2005, another attack on SHA-1 was reported that would find collisions in 2^63 operations. Theoretical weaknesses of SHA-1 exist,[14][15] and in February 2017 Google announced a collision in SHA-1.[16] Security researchers recommend that new applications avoid these problems by using later members of the SHA family, such as SHA-2, or by using techniques such as randomized hashing[17][18] that do not require collision resistance.

However, to ensure the long-term robustness of applications that use hash functions, there was a competition to design a replacement for SHA-2. On October 2, 2012, Keccak was selected as the winner of the NIST hash function competition. A version of this algorithm became a FIPS standard on August 5, 2015 under the name SHA-3.[19]

Another finalist from the NIST hash function competition, BLAKE, was optimized to produce BLAKE2, which is notable for being faster than SHA-3, SHA-2, SHA-1, or MD5, and is used in numerous applications and libraries.
3.14.9 See also

3.14.10 References

[1] Schneier, Bruce. "Cryptanalysis of MD5 and SHA: Time for a New Standard". Computerworld. Retrieved 2016-04-20. "Much more than encryption algorithms, one-way hash functions are the workhorses of modern cryptography."

[2] Katz, Jonathan; Lindell, Yehuda (2008). Introduction to Modern Cryptography. Chapman & Hall/CRC.

[3] Rogaway & Shrimpton 2004, in Sec. 5. Implications.

[4] "Flickr's API Signature Forgery Vulnerability". Thai Duong and Juliano Rizzo.

[5] Lyubashevsky, Vadim; Micciancio, Daniele; Peikert, Chris; Rosen, Alon. "SWIFFT: A Modest Proposal for FFT Hashing". Springer. Retrieved 29 August 2016.

[6] Perrin, Chad (December 5, 2007). "Use MD5 hashes to verify software downloads". TechRepublic. Retrieved March 2, 2013.

[7] "Password Hashing Competition". Retrieved March 3, 2013.

[8] Mendel, Florian; Rechberger, Christian; Schläffer, Martin. "MD5 is Weaker than Weak: Attacks on Concatenated Combiners". Advances in Cryptology – ASIACRYPT 2009. p. 145. Quote: "Concatenating ... is often used by implementors to 'hedge bets' on hash functions. A combiner of the form MD5||SHA-1 as used in SSL3.0/TLS1.0 ... is an example of such a strategy."

[9] Harnik, Danny; Kilian, Joe; Naor, Moni; Reingold, Omer; Rosen, Alon. "On Robust Combiners for Oblivious Transfer and Other Primitives". Advances in Cryptology – EUROCRYPT 2005. p. 99. Quote: "the concatenation of hash functions as suggested in the TLS ... is guaranteed to be as secure as the candidate that remains secure."

[10] Joux, Antoine. Multicollisions in Iterated Hash Functions. Application to Cascaded Constructions. LNCS 3152/2004, pages 306–316. Full text.

[11] Finney, Hal (August 20, 2004). "More Problems with Hash Functions". The Cryptography Mailing List. Retrieved May 25, 2016.

[12] Hoch, Jonathan J.; Shamir, Adi (2008). "On the Strength of the Concatenated Hash Combiner when All the Hash Functions Are Weak" (PDF). Retrieved May 25, 2016.

[13] Sotirov, Alexander; Stevens, Marc; Appelbaum, Jacob; Lenstra, Arjen; Molnar, David; Osvik, Dag Arne; de Weger, Benne. MD5 considered harmful today: Creating a rogue CA certificate. Accessed March 29, 2009.

[14] Wang, Xiaoyun; Yin, Yiqun Lisa; Yu, Hongbo. Finding Collisions in the Full SHA-1.

[15] Schneier, Bruce. Cryptanalysis of SHA-1 (summarizes Wang et al. results and their implications).

[16] Fox-Brewster, Thomas. "Google Just 'Shattered' An Old Crypto Algorithm -- Here's Why That's Big For Web Security". Forbes. Retrieved 2017-02-24.

[17] Halevi, Shai; Krawczyk, Hugo. Update on Randomized Hashing.

[18] Halevi, Shai; Krawczyk, Hugo. Randomized Hashing and Digital Signatures.

[19] NIST.gov – Computer Security Division – Computer Security Resource Center.

3.14.11 External links

• Paar, Christof; Pelzl, Jan (2009). "11: Hash Functions". Understanding Cryptography, A Textbook for Students and Practitioners. Springer. (The companion web site contains an online cryptography course that covers hash functions.)

• "The ECRYPT Hash Function Website".

• Buldas, A. (2011). "Series of mini-lectures about cryptographic hash functions".

• Rogaway, P.; Shrimpton, T. (2004). "Cryptographic Hash-Function Basics: Definitions, Implications, and Separations for Preimage Resistance, Second-Preimage Resistance, and Collision Resistance". CiteSeerX 10.1.1.3.6200.
Chapter 4

Sets
4.1 Set (abstract data type)

In computer science, a set is an abstract data type that can store certain values, without any particular order, and with no repeated values. It is a computer implementation of the mathematical concept of a finite set. Unlike most other collection types, rather than retrieving a specific element from a set, one typically tests a value for membership in a set.

Some set data structures are designed for static or frozen sets that do not change after they are constructed. Static sets allow only query operations on their elements — such as checking whether a given value is in the set, or enumerating the values in some arbitrary order. Other variants, called dynamic or mutable sets, also allow the insertion and deletion of elements from the set.

An abstract data structure is a collection, or aggregate, of data. The data may be booleans, numbers, characters, or other data structures. If one considers the structure yielded by packaging[lower-alpha 1] or indexing,[lower-alpha 2] there are four basic data structures:[1][2]

1. unpackaged, unindexed: bunch

2. packaged, unindexed: set

3. unpackaged, indexed: string (sequence)

4. packaged, indexed: list (array)

In this view, the contents of a set are a bunch, and isolated data items are elementary bunches (elements). Whereas sets contain elements, bunches consist of elements. Further structuring may be achieved by considering the multiplicity of elements (sets become multisets, bunches become hyperbunches)[3] or their homogeneity (a record is a set of fields, not necessarily all of the same type).

4.1.1 Type theory

In type theory, sets are generally identified with their indicator function (characteristic function): accordingly, a set of values of type A may be denoted by 2^A or P(A). (Subtypes and subsets may be modeled by refinement types, and quotient sets may be replaced by setoids.) The characteristic function F of a set S is defined as:

F(x) = 1 if x ∈ S, and F(x) = 0 if x ∉ S.

In theory, many other abstract data structures can be viewed as set structures with additional operations and/or additional axioms imposed on the standard operations. For example, an abstract heap can be viewed as a set structure with a min(S) operation that returns the element of smallest value.

4.1.2 Operations

Core set-theoretical operations

One may define the operations of the algebra of sets:

• union(S,T): returns the union of sets S and T.

• intersection(S,T): returns the intersection of sets S and T.

• difference(S,T): returns the difference of sets S and T.

• subset(S,T): a predicate that tests whether the set S is a subset of set T.

Static sets

Typical operations that may be provided by a static set structure S are:

• is_element_of(x,S): checks whether the value x is in the set S.

• is_empty(S): checks whether the set S is empty.

• size(S) or cardinality(S): returns the number of elements in S.

• iterate(S): returns a function that returns one more value of S at each call, in some arbitrary order.
• enumerate(S): returns a list containing the elements of S in some arbitrary order.

• build(x1,x2,…,xn): creates a set structure with values x1,x2,…,xn.

• create_from(collection): creates a new set structure containing all the elements of the given collection or all the elements returned by the given iterator.

Dynamic sets

Dynamic set structures typically add:

• create(): creates a new, initially empty set structure.

• create_with_capacity(n): creates a new set structure, initially empty but capable of holding up to n elements.

• add(S,x): adds the element x to S, if it is not present already.

• remove(S,x): removes the element x from S, if it is present.

• capacity(S): returns the maximum number of values that S can hold.

Some set structures may allow only some of these operations. The cost of each operation will depend on the implementation, and possibly also on the particular values stored in the set, and the order in which they are inserted.

Additional operations

There are many other operations that can (in principle) be defined in terms of the above, such as:

• pop(S): returns an arbitrary element of S, deleting it from S.[4]

• pick(S): returns an arbitrary element of S.[5][6][7] Functionally, the mutator pop can be interpreted as the pair of selectors (pick, rest), where rest returns the set consisting of all elements except for the arbitrary element.[8] Can be interpreted in terms of iterate.[lower-alpha 3]

• map(F,S): returns the set of distinct values resulting from applying function F to each element of S.

• filter(P,S): returns the subset containing all elements of S that satisfy a given predicate P.

• fold(A0,F,S): returns the value A|S| after applying Ai+1 := F(Ai, e) for each element e of S, for some binary operation F. F must be associative and commutative for this to be well-defined.

• clear(S): delete all elements of S.

• equal(S1, S2): checks whether the two given sets are equal (i.e. contain all and only the same elements).

• hash(S): returns a hash value for the static set S such that if equal(S1, S2) then hash(S1) = hash(S2).

Other operations can be defined for sets with elements of a special type:

• sum(S): returns the sum of all elements of S for some definition of “sum”. For example, over integers or reals, it may be defined as fold(0, add, S).

• collapse(S): given a set of sets, return the union.[9] For example, collapse({{1}, {2, 3}}) == {1, 2, 3}. May be considered a kind of sum.

• flatten(S): given a set consisting of sets and atomic elements (elements that are not sets), returns a set whose elements are the atomic elements of the original top-level set or elements of the sets it contains. In other words, remove a level of nesting – like collapse, but allow atoms. This can be done a single time, or recursively flattening to obtain a set of only atomic elements.[10] For example, flatten({1, {2, 3}}) == {1, 2, 3}.

• nearest(S,x): returns the element of S that is closest in value to x (by some metric).

• min(S), max(S): returns the minimum/maximum element of S.

4.1.3 Implementations

Sets can be implemented using various data structures, which provide different time and space trade-offs for various operations. Some implementations are designed to improve the efficiency of very specialized operations, such as nearest or union. Implementations described as “general use” typically strive to optimize the element_of, add, and delete operations. A simple implementation is to use a list, ignoring the order of the elements and taking care to avoid repeated values. This is simple but inefficient, as operations like set membership or element deletion are O(n), as they require scanning the entire list;[lower-alpha 4] a sketch of this approach appears below. Sets are often instead implemented using more efficient data structures, particularly various flavors of trees, tries, or hash tables.

As sets can be interpreted as a kind of map (by the indicator function), sets are commonly implemented in the same way as (partial) maps (associative arrays) – in this case with the value of each key-value pair being the unit type or a sentinel value (like 1) – namely, a self-balancing binary search tree for sorted sets (which has O(log n) for most operations), or a hash table for unsorted sets (which has O(1) average-case, but O(n) worst-case, for most operations). A sorted linear hash table[11] may be used to provide deterministically ordered sets.
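Here is a minimal Python sketch of the naive list-backed implementation mentioned above (the class name ListSet is ours), showing the linear scans that make membership, duplicate-avoiding insertion, and deletion O(n):

class ListSet:
    # Naive "general use" set backed by an unordered list.
    def __init__(self):
        self._items = []

    def is_element_of(self, x):
        return x in self._items          # O(n): scans the whole list

    def add(self, x):
        if x not in self._items:         # O(n): scan to avoid duplicates
            self._items.append(x)

    def remove(self, x):
        if x in self._items:             # O(n): scan, then delete
            self._items.remove(x)

    def size(self):
        return len(self._items)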
Further, in languages that support maps but not sets, sets can be implemented in terms of maps. For example, a common programming idiom in Perl that converts an array to a hash whose values are the sentinel value 1, for use as a set, is:

my %elements = map { $_ => 1 } @elements;

Other popular methods include arrays. In particular, a subset of the integers 1..n can be implemented efficiently as an n-bit bit array, which also supports very efficient union and intersection operations. A Bloom map implements a set probabilistically, using a very compact representation but risking a small chance of false positives on queries.

The Boolean set operations can be implemented in terms of more elementary operations (pop, clear, and add), but specialized algorithms may yield lower asymptotic time bounds. If sets are implemented as sorted lists, for example, the naive algorithm for union(S,T) will take time proportional to the length m of S times the length n of T; whereas a variant of the list merging algorithm will do the job in time proportional to m+n. Moreover, there are specialized set data structures (such as the union-find data structure) that are optimized for one or more of these operations, at the expense of others.

4.1.4 Language support

One of the earliest languages to support sets was Pascal; many languages now include it, whether in the core language or in a standard library.

• In C++, the Standard Template Library (STL) provides the set template class, which is typically implemented using a binary search tree (e.g. red-black tree); SGI's STL also provides the hash_set template class, which implements a set using a hash table. C++11 has support for the unordered_set template class, which is implemented using a hash table. In sets, the elements themselves are the keys, in contrast to sequenced containers, where elements are accessed using their (relative or absolute) position. Set elements must have a strict weak ordering.

• Java offers the Set interface to support sets (with the HashSet class implementing it using a hash table), and the SortedSet sub-interface to support sorted sets (with the TreeSet class implementing it using a binary search tree).

• Apple's Foundation framework (part of Cocoa) provides the Objective-C classes NSSet, NSMutableSet, NSCountedSet, NSOrderedSet, and NSMutableOrderedSet. The CoreFoundation APIs provide the CFSet and CFMutableSet types for use in C.

• Python has built-in set and frozenset types since 2.4, and since Python 3.0 and 2.7, supports non-empty set literals using a curly-bracket syntax, e.g.: {x, y, z}.

• The .NET Framework provides the generic HashSet and SortedSet classes that implement the generic ISet interface.

• Smalltalk's class library includes Set and IdentitySet, using equality and identity for inclusion test respectively. Many dialects provide variations for compressed storage (NumberSet, CharacterSet), for ordering (OrderedSet, SortedSet, etc.) or for weak references (WeakIdentitySet).

• Ruby's standard library includes a set module which contains Set and SortedSet classes that implement sets using hash tables, the latter allowing iteration in sorted order.

• OCaml's standard library contains a Set module, which implements a functional set data structure using binary search trees.

• The GHC implementation of Haskell provides a Data.Set module, which implements immutable sets using binary search trees.[12]

• The Tcl Tcllib package provides a set module which implements a set data structure based upon TCL lists.

• The Swift standard library contains a Set type, since Swift 1.2.

As noted in the previous section, in languages which do not directly support sets but do support associative arrays, sets can be emulated using associative arrays, by using the elements as keys, and using a dummy value as the values, which are ignored.

4.1.5 Multiset

A generalization of the notion of a set is that of a multiset or bag, which is similar to a set but allows repeated (“equal”) values (duplicates). This is used in two distinct senses: either equal values are considered identical, and are simply counted, or equal values are considered equivalent, and are stored as distinct items. For example, given a list of people (by name) and ages (in years), one could construct a multiset of ages, which simply counts the number of people of a given age. Alternatively, one can construct a multiset of people, where two people are considered equivalent if their ages are the same (but may be different people and have different names), in which case each pair (name, age) must be stored, and selecting on a given age gives all the people of a given age.

Formally, it is possible for objects in computer science to be considered “equal” under some equivalence relation
but still distinct under another relation. Some types of multiset implementations will store distinct equal objects as separate items in the data structure, while others will collapse them down to one version (the first one encountered) and keep a positive integer count of the multiplicity of the element.

As with sets, multisets can naturally be implemented using hash tables or trees, which yield different performance characteristics.

The set of all bags over type T is given by the expression bag T. If by multiset one considers equal items identical and simply counts them, then a multiset can be interpreted as a function from the input domain to the non-negative integers (natural numbers), generalizing the identification of a set with its indicator function. In some cases a multiset in this counting sense may be generalized to allow negative values, as in Python.

• C++'s Standard Template Library implements both sorted and unsorted multisets. It provides the multiset class for the sorted multiset, as a kind of associative container, which implements this multiset using a self-balancing binary search tree. It provides the unordered_multiset class for the unsorted multiset, as a kind of unordered associative container, which implements this multiset using a hash table. The unsorted multiset is standard as of C++11; previously SGI's STL provided the hash_multiset class, which was copied and eventually standardized.

• For Java, third-party libraries provide multiset functionality:

• Apache Commons Collections provides the Bag and SortedBag interfaces, with implementing classes like HashBag and TreeBag.

• Google Guava provides the Multiset interface, with implementing classes like HashMultiset and TreeMultiset.

• Apple provides the NSCountedSet class as part of Cocoa, and the CFBag and CFMutableBag types as part of CoreFoundation.

• Python's standard library includes collections.Counter, which is similar to a multiset.

• Smalltalk includes the Bag class, which can be instantiated to use either identity or equality as predicate for inclusion test.

Where a multiset data structure is not available, a workaround is to use a regular set, but override the equality predicate of its items to always return “not equal” on distinct objects (however, such a set will still not be able to store multiple occurrences of the same object), or to use an associative array mapping the values to their integer multiplicities (this will not be able to distinguish between equal elements at all).

Typical operations on bags (a sketch in terms of collections.Counter follows this list):

• contains(B, x): checks whether the element x is present (at least once) in the bag B.

• is_sub_bag(B1, B2): checks whether each element in the bag B1 occurs in B1 no more often than it occurs in the bag B2; sometimes denoted as B1 ⊑ B2.

• count(B, x): returns the number of times that the element x occurs in the bag B; sometimes denoted as B # x.

• scaled_by(B, n): given a natural number n, returns a bag which contains the same elements as the bag B, except that every element that occurs m times in B occurs n * m times in the resulting bag; sometimes denoted as n ⊗ B.

• union(B1, B2): returns a bag containing just those values that occur in either the bag B1 or the bag B2, except that the number of times a value x occurs in the resulting bag is equal to (B1 # x) + (B2 # x); sometimes denoted as B1 ⊎ B2.
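These operations can be sketched in Python on top of collections.Counter, whose counts play the role of multiplicities (the function names follow the list above; union here is the bag sum ⊎):

from collections import Counter

def contains(B, x):     return B[x] > 0
def count(B, x):        return B[x]                               # B # x
def is_sub_bag(B1, B2): return all(B1[x] <= B2[x] for x in B1)    # B1 ⊑ B2
def scaled_by(B, n):    return Counter({x: n * m for x, m in B.items()})  # n ⊗ B
def union(B1, B2):      return B1 + B2                            # B1 ⊎ B2: adds counts

B = Counter({"a": 2, "b": 1})
assert count(union(B, B), "a") == 4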
Multisets in SQL

In relational databases, a table can be a (mathematical) set or a multiset, depending on the presence of unicity constraints on some columns (which turns them into a candidate key).

SQL allows the selection of rows from a relational table: this operation will in general yield a multiset, unless the keyword DISTINCT is used to force the rows to be all different, or the selection includes the primary (or a candidate) key.

In ANSI SQL the MULTISET keyword can be used to transform a subquery into a collection expression:

SELECT expression1, expression2... FROM table_name...

is a general select that can be used as a subquery expression of another more general query, while

MULTISET(SELECT expression1, expression2... FROM table_name...)

transforms the subquery into a collection expression that can be used in another query, or in assignment to a column of appropriate collection type.

4.1.6 See also

• Bloom filter

• Disjoint set
4.1.7 Notes

[1] “Packaging” consists in supplying a container for an aggregation of objects in order to turn them into a single object. Consider a function call: without packaging, a function can be called to act upon a bunch only by passing each bunch element as a separate argument, which complicates the function's signature considerably (and is just not possible in some programming languages). By packaging the bunch's elements into a set, the function may now be called upon a single, elementary argument: the set object (the bunch's package).

[2] Indexing is possible when the elements being considered are totally ordered. Being without order, the elements of a multiset (for example) do not have lesser/greater or preceding/succeeding relationships: they can only be compared in absolute terms (same/different).

[3] For example, in Python pick can be implemented on a derived class of the built-in set as follows:

class Set(set):
    def pick(self):
        return next(iter(self))

[4] Element insertion can be done in O(1) time by simply inserting at an end, but if one avoids duplicates this takes O(n) time.

4.1.8 References

[1] Hehner, Eric C. R. (1981), “Bunch Theory: A Simple Set Theory for Computer Science”, Information Processing Letters, 12 (1): 26, doi:10.1016/0020-0190(81)90071-5.

[2] Hehner, Eric C. R. (2004), A Practical Theory of Programming, second edition.

[3] Hehner, Eric C. R. (2012), A Practical Theory of Programming, 2012-3-30 edition.

[4] Python: pop()

[5] Management and Processing of Complex Data Structures: Third Workshop on Information Systems and Artificial Intelligence, Hamburg, Germany, February 28 – March 2, 1994. Proceedings, ed. Kai v. Luck, Heinz Marburger, p. 76.

[6] Python Issue7212: Retrieve an arbitrary element from a set without removing it; see msg106593 regarding standard name.

[7] Ruby Feature #4553: Add Set#pick and Set#pop.

[8] Inductive Synthesis of Functional Programs: Universal Planning, Folding of Finite Programs, and Schema Abstraction by Analogical Reasoning, Ute Schmid, Springer, Aug 21, 2003, p. 240.

[9] Recent Trends in Data Type Specification: 10th Workshop on Specification of Abstract Data Types Joint with the 5th COMPASS Workshop, S. Margherita, Italy, May 30 – June 3, 1994. Selected Papers, Volume 10, ed. Egidio Astesiano, Gianna Reggio, Andrzej Tarlecki, p. 38.

[10] Ruby: flatten()

[11] Wang, Thomas (1997), Sorted Linear Hash Table.

[12] Stephen Adams, “Efficient sets: a balancing act”, Journal of Functional Programming 3(4):553–562, October 1993. Retrieved on 2015-03-11.

4.2 Bit array

A bit array (also known as bitmap, bitset, bit string, or bit vector) is an array data structure that compactly stores bits. It can be used to implement a simple set data structure. A bit array is effective at exploiting bit-level parallelism in hardware to perform operations quickly. A typical bit array stores kw bits, where w is the number of bits in the unit of storage, such as a byte or word, and k is some nonnegative integer. If w does not divide the number of bits to be stored, some space is wasted due to internal fragmentation.

4.2.1 Definition

A bit array is a mapping from some domain (almost always a range of integers) to values in the set {0, 1}. The values can be interpreted as dark/light, absent/present, locked/unlocked, valid/invalid, etcetera. The point is that there are only two possible values, so they can be stored in one bit. As with other arrays, the access to a single bit can be managed by applying an index to the array. Assuming its size (or length) to be n bits, the array can be used to specify a subset of the domain (e.g. {0, 1, 2, ..., n−1}), where a 1-bit indicates the presence and a 0-bit the absence of a number in the set. This set data structure uses about n/w words of space, where w is the number of bits in each machine word. Whether the least significant bit (of the word) or the most significant bit indicates the smallest-index number is largely irrelevant, but the former tends to be preferred (on little-endian machines).

4.2.2 Basic operations

Although most machines are not able to address individual bits in memory, nor have instructions to manipulate single bits, each bit in a word can be singled out and manipulated using bitwise operations. In particular:

• OR can be used to set a bit to one: 11101010 OR 00000100 = 11101110

• AND can be used to set a bit to zero: 11101010 AND 11111101 = 11101000

• AND together with zero-testing can be used to determine if a bit is set: 11101010 AND 00000001 = 00000000 = 0; 11101010 AND 00000010 = 00000010 ≠ 0
• XOR can be used to invert or toggle a bit: 11101010 XOR 00000100 = 11101110; 11101110 XOR 00000100 = 11101010

• NOT can be used to invert all bits: NOT 10110010 = 01001101

To obtain the bit mask needed for these operations, we can use a bit shift operator to shift the number 1 to the left by the appropriate number of places, as well as bitwise negation if necessary.
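For example, in Python (where an arbitrary-precision int can stand in for a machine word):

word = 0b11101010

word |= 1 << 2                    # OR with a shifted mask sets bit 2
word &= ~(1 << 2)                 # AND with the negated mask clears bit 2
word ^= 1 << 4                    # XOR toggles bit 4
bit_set = word & (1 << 1) != 0    # AND with a mask tests bit 1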
Given two bit arrays of the same size representing sets, we can compute their union, intersection, and set-theoretic difference using n/w simple bit operations each (2n/w for difference), as well as the complement of either:

for i from 0 to n/w-1
    complement_a[i] := not a[i]
    union[i]        := a[i] or b[i]
    intersection[i] := a[i] and b[i]
    difference[i]   := a[i] and (not b[i])

If we wish to iterate through the bits of a bit array, we can do this efficiently using a doubly nested loop that loops through each word, one at a time. Only n/w memory accesses are required:

index := 0    // if needed
for i from 0 to n/w-1
    word := a[i]
    for b from 0 to w-1
        value := word and 1 ≠ 0
        word := word shift right 1
        // do something with value
        index := index + 1    // if needed

Both of these code samples exhibit ideal locality of reference, which will subsequently receive a large performance boost from a data cache. If a cache line is k words, only about n/wk cache misses will occur.

4.2.3 More complex operations

As with character strings it is straightforward to define length, substring, lexicographical compare, concatenation, and reverse operations. The implementation of some of these operations is sensitive to endianness.

Population / Hamming weight

If we wish to find the number of 1 bits in a bit array, sometimes called the population count or Hamming weight, there are efficient branch-free algorithms that can compute the number of bits in a word using a series of simple bit operations. We simply run such an algorithm on each word and keep a running total. Counting zeros is similar. See the Hamming weight article for examples of an efficient implementation.
Inversion

Vertical flipping of a one-bit-per-pixel image, or some FFT algorithms, requires flipping the bits of individual words (so b31 b30 ... b0 becomes b0 ... b30 b31). When this operation is not available on the processor, it is still possible to proceed by successive passes, in this example on 32 bits:

exchange two 16-bit halfwords
exchange bytes by pairs (0xddccbbaa -> 0xccddaabb)
...
swap bits by pairs
swap bits (b31 b30 ... b1 b0 -> b30 b31 ... b0 b1)

The last operation can be written ((x & 0x55555555) << 1) | ((x & 0xaaaaaaaa) >> 1).

Find first one

The find first set or find first one operation identifies the index or position of the 1-bit with the smallest index in an array, and has widespread hardware support (for arrays not larger than a word) and efficient algorithms for its computation. When a priority queue is stored in a bit array, find first one can be used to identify the highest priority element in the queue. To expand a word-size find first one to longer arrays, one can find the first nonzero word and then run find first one on that word. The related operations find first zero, count leading zeros, count leading ones, count trailing zeros, count trailing ones, and log base 2 (see find first set) can also be extended to a bit array in a straightforward manner.

4.2.4 Compression

A bit array is the densest storage for “random” bits, that is, where each bit is equally likely to be 0 or 1, and each one is independent. But most data is not random, so it may be possible to store it more compactly. For example, the data of a typical fax image is not random and can be compressed. Run-length encoding is commonly used to compress these long streams. However, most compressed data formats are not so easy to access randomly; also, by compressing bit arrays too aggressively we run the risk of losing the benefits due to bit-level parallelism (vectorization). Thus, instead of compressing bit arrays as streams of bits, we might compress them as streams of bytes or words (see Bitmap index (compression)).

4.2.5 Advantages and disadvantages

Bit arrays, despite their simplicity, have a number of marked advantages over other data structures for the same
problems:

• They are extremely compact; few other data structures can store n independent pieces of data in n/w words.

• They allow small arrays of bits to be stored and manipulated in the register set for long periods of time with no memory accesses.

• Because of their ability to exploit bit-level parallelism, limit memory access, and maximally use the data cache, they often outperform many other data structures on practical data sets, even those that are more asymptotically efficient.

However, bit arrays aren't the solution to everything. In particular:

• Without compression, they are wasteful set data structures for sparse sets (those with few elements compared to their range) in both time and space. For such applications, compressed bit arrays, Judy arrays, tries, or even Bloom filters should be considered instead.

• Accessing individual elements can be expensive and difficult to express in some languages. If random access is more common than sequential and the array is relatively small, a byte array may be preferable on a machine with byte addressing. A word array, however, is probably not justified due to the huge space overhead and additional cache misses it causes, unless the machine only has word addressing.

4.2.6 Applications

Because of their compactness, bit arrays have a number of applications in areas where space or efficiency is at a premium. Most commonly, they are used to represent a simple group of boolean flags or an ordered sequence of boolean values.

Bit arrays are used for priority queues, where the bit at index k is set if and only if k is in the queue; this data structure is used, for example, by the Linux kernel, and benefits strongly from a find-first-zero operation in hardware.

Bit arrays can be used for the allocation of memory pages, inodes, disk sectors, etc. In such cases, the term bitmap may be used. However, this term is frequently used to refer to raster images, which may use multiple bits per pixel.

Another application of bit arrays is the Bloom filter, a probabilistic set data structure that can store large sets in a small space in exchange for a small probability of error. It is also possible to build probabilistic hash tables based on bit arrays that accept either false positives or false negatives.

Bit arrays and the operations on them are also important for constructing succinct data structures, which use close to the minimum possible space. In this context, operations like finding the nth 1 bit or counting the number of 1 bits up to a certain position become important.

Bit arrays are also a useful abstraction for examining streams of compressed data, which often contain elements that occupy portions of bytes or are not byte-aligned. For example, the compressed Huffman coding representation of a single 8-bit character can be anywhere from 1 to 255 bits long.

In information retrieval, bit arrays are a good representation for the posting lists of very frequent terms. If we compute the gaps between adjacent values in a list of strictly increasing integers and encode them using unary coding, the result is a bit array with a 1 bit in the nth position if and only if n is in the list. The implied probability of a gap of n is 1/2^n. This is also the special case of Golomb coding where the parameter M is 1; this parameter is only normally selected when -log(2-p)/log(1-p) ≤ 1, or roughly when the term occurs in at least 38% of documents.

4.2.7 Language support

The APL programming language fully supports bit arrays of arbitrary shape and size as a Boolean datatype distinct from integers. All major implementations (Dyalog APL, APL2, APL Next, NARS2000, Gnu APL, etc.) pack the bits densely into whatever size the machine word is. Bits may be accessed individually via the usual indexing notation (A[3]) as well as through all of the usual primitive functions and operators, where they are often operated on using a special case algorithm such as summing the bits via a table lookup of bytes.

The C programming language's bitfields, pseudo-objects found in structs with size equal to some number of bits, are in fact small bit arrays; they are limited in that they cannot span words. Although they give a convenient syntax, the bits are still accessed using bitwise operators on most machines, and they can only be defined statically (like C's static arrays, their sizes are fixed at compile-time). It is also a common idiom for C programmers to use words as small bit arrays and access bits of them using bit operators. A widely available header file included in the X11 system, xtrapbits.h, is “a portable way for systems to define bit field manipulation of arrays of bits.” A more explanatory description of the aforementioned approach can be found in the comp.lang.c faq.
In C++, although individual bools typically occupy the same space as a byte or an integer, the STL type vector<bool> is a partial template specialization in which bits are packed as a space efficiency optimization. Since bytes (and not bits) are the smallest addressable unit in C++, the [] operator does not return a reference to an element, but instead returns a proxy reference. This might seem a minor point, but it means that vector<bool> is not a standard STL container, which is why the use of vector<bool> is generally discouraged. Another unique STL class, bitset,[1] creates a vector of bits fixed at a particular size at compile-time, and in its interface and syntax more resembles the idiomatic use of words as bit sets by C programmers. It also has some additional power, such as the ability to efficiently count the number of bits that are set. The Boost C++ Libraries provide a dynamic_bitset class[2] whose size is specified at run-time.

The D programming language provides bit arrays in its standard library, Phobos, in std.bitmanip. As in C++, the [] operator does not return a reference, since individual bits are not directly addressable on most hardware, but instead returns a bool.

In Java, the class BitSet creates a bit array that is then manipulated with functions named after bitwise operators familiar to C programmers. Unlike the bitset in C++, the Java BitSet does not have a “size” state (it has an effectively infinite size, initialized with 0 bits); a bit can be set or tested at any index. In addition, there is a class EnumSet, which represents a Set of values of an enumerated type internally as a bit vector, as a safer alternative to bitfields.

The .NET Framework supplies a BitArray collection class. It stores boolean values, supports random access and bitwise operators, can be iterated over, and its Length property can be changed to grow or truncate it.

Although Standard ML has no support for bit arrays, Standard ML of New Jersey has an extension, the BitArray structure, in its SML/NJ Library. It is not fixed in size and supports set operations and bit operations, including, unusually, shift operations.

Haskell likewise currently lacks standard support for bitwise operations, but both GHC and Hugs provide a Data.Bits module with assorted bitwise functions and operators, including shift and rotate operations, and an “unboxed” array over boolean values may be used to model a bit array, although this lacks support from the former module.

In Perl, strings can be used as expandable bit arrays. They can be manipulated using the usual bitwise operators (~ | & ^),[3] and individual bits can be tested and set using the vec function.[4]

In Ruby, you can access (but not set) a bit of an integer (Fixnum or Bignum) using the bracket operator ([]), as if it were an array of bits.

Apple's Core Foundation library contains CFBitVector and CFMutableBitVector structures.

PL/I supports arrays of bit strings of arbitrary length, which may be either fixed-length or varying. The array elements may be aligned — each element begins on a byte or word boundary — or unaligned — elements immediately follow each other with no padding.

Hardware description languages such as VHDL, Verilog, and SystemVerilog natively support bit vectors, as these are used to model storage elements like flip-flops, hardware busses and hardware signals in general. In hardware verification languages such as OpenVera, e and SystemVerilog, bit vectors are used to sample values from the hardware models, and to represent data that is transferred to hardware during simulations.

4.2.8 See also

• Bit field

• Arithmetic logic unit

• Bitboard (chess and similar games)

• Bitmap index

• Binary numeral system

• Bitstream

• Judy array

4.2.9 References

[1] std::bitset

[2] boost::dynamic_bitset

[3] http://perldoc.perl.org/perlop.html#Bitwise-String-Operators

[4] http://perldoc.perl.org/functions/vec.html

4.2.10 External links

• mathematical bases by Pr. D.E.Knuth

• vector<bool> Is Nonconforming, and Forces Optimization Choice

• vector<bool>: More Problems, Better Solutions

4.3 Bloom filter

Not to be confused with Bloom shader effect.

A Bloom filter is a space-efficient probabilistic data structure, conceived by Burton Howard Bloom in 1970, that is used to test whether an element is a member of a set. False positive matches are possible, but false negatives are not – in other words, a query returns either “possibly in set” or “definitely not in set”. Elements can be
added to the set, but not removed (though this can be addressed with a “counting” filter); the more elements that are added to the set, the larger the probability of false positives.

Bloom proposed the technique for applications where the amount of source data would require an impractically large amount of memory if “conventional” error-free hashing techniques were applied. He gave the example of a hyphenation algorithm for a dictionary of 500,000 words, out of which 90% follow simple hyphenation rules, but the remaining 10% require expensive disk accesses to retrieve specific hyphenation patterns. With sufficient core memory, an error-free hash could be used to eliminate all unnecessary disk accesses; on the other hand, with limited core memory, Bloom's technique uses a smaller hash area but still eliminates most unnecessary accesses. For example, a hash area only 15% of the size needed by an ideal error-free hash still eliminates 85% of the disk accesses – an 85–15 form of the Pareto principle.[1]

More generally, fewer than 10 bits per element are required for a 1% false positive probability, independent of the size or number of elements in the set.[2]

4.3.1 Algorithm description

[Figure: An example of a Bloom filter, representing the set {x, y, z}. The colored arrows show the positions in the bit array that each set element is mapped to. The element w is not in the set {x, y, z}, because it hashes to one bit-array position containing 0. For this figure, m = 18 and k = 3.]

An empty Bloom filter is a bit array of m bits, all set to 0. There must also be k different hash functions defined, each of which maps or hashes some set element to one of the m array positions with a uniform random distribution. Typically, k is a constant, much smaller than m, which is proportional to the number of elements to be added; the precise choice of k and the constant of proportionality of m are determined by the intended false positive rate of the filter.

To add an element, feed it to each of the k hash functions to get k array positions. Set the bits at all these positions to 1.

To query for an element (test whether it is in the set), feed it to each of the k hash functions to get k array positions. If any of the bits at these positions is 0, the element is definitely not in the set – if it were, then all the bits would have been set to 1 when it was inserted. If all are 1, then either the element is in the set, or the bits have by chance been set to 1 during the insertion of other elements, resulting in a false positive. In a simple Bloom filter, there is no way to distinguish between the two cases, but more advanced techniques can address this problem.
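The following is a minimal Python sketch of these add and query procedures; it derives the k positions by slicing a single wide SHA-256 digest into 4-byte fields, one of the techniques discussed in the next paragraph (the class and method names are ours, and the sketch assumes k ≤ 8):

import hashlib

class BloomFilter:
    def __init__(self, m, k):
        self.m, self.k = m, k
        self.bits = bytearray((m + 7) // 8)    # m-bit array, all 0

    def _positions(self, item):
        digest = hashlib.sha256(item.encode()).digest()   # 32 bytes
        for i in range(self.k):
            chunk = digest[4 * i : 4 * i + 4]             # one 4-byte field per index
            yield int.from_bytes(chunk, "big") % self.m

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, item):
        # A single 0 bit proves absence; all 1 bits mean "possibly in set".
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))

f = BloomFilter(m=1024, k=5)
f.add("x")
assert f.might_contain("x")    # always True once added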
The requirement of designing k different independent hash functions can be prohibitive for large k. For a good hash function with a wide output, there should be little if any correlation between different bit-fields of such a hash, so this type of hash can be used to generate multiple “different” hash functions by slicing its output into multiple bit fields. Alternatively, one can pass k different initial values (such as 0, 1, ..., k − 1) to a hash function that takes an initial value; or add (or append) these values to the key. For larger m and/or k, independence among the hash functions can be relaxed with negligible increase in false positive rate.[3] Specifically, Dillinger & Manolios (2004b) show the effectiveness of deriving the k indices using enhanced double hashing or triple hashing, variants of double hashing that are effectively simple random number generators seeded with the two or three hash values.

Removing an element from this simple Bloom filter is impossible because false negatives are not permitted. An element maps to k bits, and although setting any one of those k bits to zero suffices to remove the element, it also results in removing any other elements that happen to map onto that bit. Since there is no way of determining whether any other elements have been added that affect the bits for an element to be removed, clearing any of the bits would introduce the possibility for false negatives.

One-time removal of an element from a Bloom filter can be simulated by having a second Bloom filter that contains items that have been removed. However, false positives in the second filter become false negatives in the composite filter, which may be undesirable. In this approach re-adding a previously removed item is not possible, as one would have to remove it from the “removed” filter.

It is often the case that all the keys are available but are expensive to enumerate (for example, requiring many disk reads). When the false positive rate gets too high, the filter can be regenerated; this should be a relatively rare event.
4.3.2 Space and time advantages

While risking false positives, Bloom filters have a strong space advantage over other data structures for representing sets, such as self-balancing binary search trees, tries, hash tables, or simple arrays or linked lists of the entries. Most of these require storing at least the data items themselves, which can require anywhere from a small number of bits, for small integers, to an arbitrary number of bits, such as for strings (tries are an exception, since they can share storage between elements with equal prefixes). However, Bloom filters do not store the data items at all, and a separate solution must be provided for the actual storage. Linked structures incur an additional linear space overhead for pointers. A Bloom filter with 1% error and an optimal value of k, in contrast, requires only about 9.6 bits per element, regardless of the size of the elements. This advantage comes partly from its compactness, inherited from arrays, and partly from its probabilistic nature. The 1% false-positive rate can be reduced by a factor of ten by adding only about 4.8 bits per element.

[Figure: Bloom filter used to speed up answers in a key-value storage system. Values are stored on a disk which has slow access times. Bloom filter decisions are much faster. However, some unnecessary disk accesses are made when the filter reports a positive (in order to weed out the false positives). Overall answer speed is better with the Bloom filter than without the Bloom filter. Use of a Bloom filter for this purpose, however, does increase memory usage.]

However, if the number of potential values is small and many of them can be in the set, the Bloom filter is easily surpassed by the deterministic bit array, which requires only one bit for each potential element. Note also that hash tables gain a space and time advantage if they begin ignoring collisions and store only whether each bucket contains an entry; in this case, they have effectively become Bloom filters with k = 1.[4]

Bloom filters also have the unusual property that the time needed either to add items or to check whether an item is in the set is a fixed constant, O(k), completely independent of the number of items already in the set. No other constant-space set data structure has this property, but the average access time of sparse hash tables can make them faster in practice than some Bloom filters. In a hardware implementation, however, the Bloom filter shines because its k lookups are independent and can be parallelized.

To understand its space efficiency, it is instructive to compare the general Bloom filter with its special case when k = 1. If k = 1, then in order to keep the false positive rate sufficiently low, a small fraction of bits should be set, which means the array must be very large and contain long runs of zeros. The information content of the array relative to its size is low. The generalized Bloom filter (k greater than 1) allows many more bits to be set while still maintaining a low false positive rate; if the parameters (k and m) are chosen well, about half of the bits will be set,[5] and these will be apparently random, minimizing redundancy and maximizing information content.

4.3.3 Probability of false positives

[Figure: The false positive probability p as a function of the number of elements n in the filter and the filter size m (curves for log2(m) = 8 through 36). An optimal number of hash functions k = (m/n) ln 2 has been assumed.]

Assume that a hash function selects each array position with equal probability. If m is the number of bits in the array, the probability that a certain bit is not set to 1 by a certain hash function during the insertion of an element is

1 − 1/m.

If k is the number of hash functions, the probability that the bit is not set to 1 by any of the hash functions is

(1 − 1/m)^k.

If we have inserted n elements, the probability that a certain bit is still 0 is

(1 − 1/m)^(kn);

the probability that it is 1 is therefore

1 − (1 − 1/m)^(kn).
Now test membership of an element that is not in the set. Each of the k array positions computed by the hash functions is 1 with a probability as above. The probability of all of them being 1, which would cause the algorithm to erroneously claim that the element is in the set, is often given as

(1 − [1 − 1/m]^(kn))^k ≈ (1 − e^(−kn/m))^k.

This is not strictly correct, as it assumes independence for the probabilities of each bit being set. However, assuming it is a close approximation we have that the probability of false positives decreases as m (the number of bits in the array) increases, and increases as n (the number of inserted elements) increases.

An alternative analysis arriving at the same approximation without the assumption of independence is given by Mitzenmacher and Upfal.[6] After all n items have been added to the Bloom filter, let q be the fraction of the m bits that are set to 0. (That is, the number of bits still set to 0 is qm.) Then, when testing membership of an element not in the set, for the array position given by any of the k hash functions, the probability that the bit is found set to 1 is 1 − q. So the probability that all k hash functions find their bit set to 1 is (1 − q)^k. Further, the expected value of q is the probability that a given array position is left untouched by each of the k hash functions for each of the n items, which is (as above)

E[q] = (1 − 1/m)^(kn).

It is possible to prove, without the independence assumption, that q is very strongly concentrated around its expected value. In particular, from the Azuma–Hoeffding inequality, they prove that[7]

Pr(|q − E[q]| ≥ λ/m) ≤ 2 exp(−2λ^2/(kn)).

Because of this, we can say that the exact probability of false positives is

Σ_t Pr(q = t) (1 − t)^k ≈ (1 − E[q])^k = (1 − [1 − 1/m]^(kn))^k ≈ (1 − e^(−kn/m))^k,

as before.

Optimal number of hash functions

The number of hash functions, k, must be a positive integer. Putting this constraint aside, for a given m and n, the value of k that minimizes the false positive probability is

k = (m/n) ln 2.

The required number of bits, m, given n (the number of inserted elements) and a desired false positive probability p (and assuming the optimal value of k is used) can be computed by substituting the optimal value of k in the probability expression above:

p = (1 − e^(−((m/n) ln 2)(n/m)))^((m/n) ln 2),

which can be simplified to:

ln p = −(m/n) (ln 2)^2.

This results in:

m = −(n ln p)/(ln 2)^2,

with the corresponding number of hash functions k (ignoring integrality):

k = −(ln p)/(ln 2).

This means that for a given false positive probability p, the length of a Bloom filter m is proportionate to the number of elements being filtered n, and the required number of hash functions only depends on the target false positive probability p.[8]

The formula m = −(n ln p)/(ln 2)^2 is approximate for three reasons. First, and of least concern, it approximates 1 − 1/m as e^(−1/m), which is a good asymptotic approximation (i.e., one which holds as m → ∞). Second, of more concern, it assumes that during the membership test the event that one tested bit is set to 1 is independent of the event that any other tested bit is set to 1. Third, of most concern, it assumes that k = (m/n) ln 2 is fortuitously integral.

Goel and Gupta,[9] however, give a rigorous upper bound that makes no approximations and requires no assumptions. They show that the false positive probability for a finite Bloom filter with m bits (m > 1), n elements, and k hash functions is at most

(1 − e^(−k(n+0.5)/(m−1)))^k.

This bound can be interpreted as saying that the approximate formula (1 − e^(−kn/m))^k can be applied at a penalty of at most half an extra element and at most one fewer bit.
4.3.4 Approximating the number of items in a Bloom filter

Swamidass & Baldi (2007) showed that the number of items in a Bloom filter can be approximated with the following formula:

n* = −(m/k) ln[1 − X/m],

where n* is an estimate of the number of items in the filter, m is the length (size) of the filter, k is the number of hash functions, and X is the number of bits set to one.

4.3.5 The union and intersection of sets

Bloom filters are a way of compactly representing a set of items. It is common to try to compute the size of the intersection or union between two sets. Bloom filters can be used to approximate the size of the intersection and union of two sets. Swamidass & Baldi (2007) showed that for two Bloom filters of length m, their counts, respectively, can be estimated as

n(A*) = −(m/k) ln[1 − n(A)/m]

and

n(B*) = −(m/k) ln[1 − n(B)/m].

The size of their union can be estimated as

n(A* ∪ B*) = −(m/k) ln[1 − n(A ∪ B)/m],

where n(A ∪ B) is the number of bits set to one in either of the two Bloom filters. Finally, the intersection can be estimated as

n(A* ∩ B*) = n(A*) + n(B*) − n(A* ∪ B*),

using the three formulas together.
• The Squid Web Proxy Cache uses Bloom filters for
cache digests.[15]
4.3.6 Interesting properties
• Bitcoin uses Bloom filters to speed up wallet
• Unlike a standard hash table, a Bloom filter of a fixed synchronization.[16][17]
size can represent a set with an arbitrarily large num-
• The Venti archival storage system uses Bloom filters
ber of elements; adding an element never fails due
to detect previously stored data.[18]
to the data structure “filling up”. However, the false
positive rate increases steadily as elements are added • The SPIN model checker uses Bloom filters to
until all bits in the filter are set to 1, at which point track the reachable state space for large verification
all queries yield a positive result. problems.[19]
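These estimators are straightforward to compute from the raw bit counts. A minimal sketch, assuming both filters share the same length m and hash functions, and representing each filter's bit array as a Python int:

import math

def estimate_count(X, m, k):
    # Swamidass & Baldi: n* = -(m/k) ln(1 - X/m), where X is the
    # number of set bits in a filter of m bits built with k hash functions.
    return -(m / k) * math.log(1 - X / m)

def estimate_union(bits_a, bits_b, m, k):
    # |A ∪ B| estimated from the bitwise OR of the two filters.
    return estimate_count(bin(bits_a | bits_b).count("1"), m, k)

def estimate_intersection(bits_a, bits_b, m, k):
    # |A ∩ B| by inclusion-exclusion over the three estimates.
    n_a = estimate_count(bin(bits_a).count("1"), m, k)
    n_b = estimate_count(bin(bits_b).count("1"), m, k)
    return n_a + n_b - estimate_union(bits_a, bits_b, m, k)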

4.3.6 Interesting properties

• Unlike a standard hash table, a Bloom filter of a fixed size can represent a set with an arbitrarily large number of elements; adding an element never fails due to the data structure “filling up”. However, the false positive rate increases steadily as elements are added until all bits in the filter are set to 1, at which point all queries yield a positive result.

• Union and intersection of Bloom filters with the same size and set of hash functions can be implemented with bitwise OR and AND operations respectively. The union operation on Bloom filters is lossless in the sense that the resulting Bloom filter is the same as the Bloom filter created from scratch using the union of the two sets. The intersect operation satisfies a weaker property: the false positive probability in the resulting Bloom filter is at most the false positive probability in one of the constituent Bloom filters, but may be larger than the false positive probability in the Bloom filter created from scratch using the intersection of the two sets.

• Some kinds of superimposed code can be seen as a Bloom filter implemented with physical edge-notched cards. An example is Zatocoding, invented by Calvin Mooers in 1947, in which the set of categories associated with a piece of information is represented by notches on a card, with a random pattern of four notches for each category.

4.3.7 Examples

• Akamai's web servers use Bloom filters to prevent “one-hit-wonders” from being stored in its disk caches. One-hit-wonders are web objects requested by users just once, something that Akamai found applied to nearly three-quarters of their caching infrastructure. Using a Bloom filter to detect the second request for a web object and caching that object only on its second request prevents one-hit wonders from entering the disk cache, significantly reducing disk workload and increasing disk cache hit rates.[10]

• Google BigTable, Apache HBase and Apache Cassandra, and Postgresql[11] use Bloom filters to reduce the disk lookups for non-existent rows or columns. Avoiding costly disk lookups considerably increases the performance of a database query operation.[12]

• The Google Chrome web browser used to use a Bloom filter to identify malicious URLs. Any URL was first checked against a local Bloom filter, and only if the Bloom filter returned a positive result was a full check of the URL performed (and the user warned, if that too returned a positive result).[13][14]

• The Squid Web Proxy Cache uses Bloom filters for cache digests.[15]

• Bitcoin uses Bloom filters to speed up wallet synchronization.[16][17]

• The Venti archival storage system uses Bloom filters to detect previously stored data.[18]

• The SPIN model checker uses Bloom filters to track the reachable state space for large verification problems.[19]

• The Cascading analytics framework uses Bloom filters to speed up asymmetric joins, where one of the joined data sets is significantly larger than the other (often called Bloom join in the database literature).[20]

• The Exim mail transfer agent (MTA) uses Bloom filters in its rate-limit feature.[21]

• Medium uses Bloom filters to avoid recommending articles a user has previously read.[22]

4.3.8 Alternatives

Classic Bloom filters use 1.44 log₂(1/ε) bits of space per inserted key, where ε is the false positive rate of the Bloom filter. However, the space that is strictly necessary for any data structure playing the same role as a Bloom filter is only log₂(1/ε) per key.[23] Hence Bloom filters use 44% more space than an equivalent optimal data structure. Instead, Pagh et al. provide an optimal-space data structure. Moreover, their data structure has constant locality of reference independent of the false positive rate, unlike Bloom filters, where a smaller false positive rate ε leads to a greater number of memory accesses per query, log(1/ε). Also, it allows elements to be deleted without a space penalty, unlike Bloom filters. The same improved properties of optimal space usage, constant locality of reference, and the ability to delete elements are also provided by the cuckoo filter of Fan et al. (2014), an open source implementation of which is available.

Stern & Dill (1996) describe a probabilistic structure based on hash tables, hash compaction, which Dillinger & Manolios (2004b) identify as significantly more accurate than a Bloom filter when each is configured optimally. Dillinger and Manolios, however, point out that the reasonable accuracy of any given Bloom filter over a wide range of numbers of additions makes it attractive for probabilistic enumeration of state spaces of unknown size. Hash compaction is, therefore, attractive when the number of additions can be predicted accurately; however, despite being very fast in software, hash compaction is poorly suited for hardware because of worst-case linear access time.

Putze, Sanders & Singler (2007) have studied some variants of Bloom filters that are either faster or use less space than classic Bloom filters. The basic idea of the fast variant is to locate the k hash values associated with each key into one or two blocks having the same size as the processor's memory cache blocks (usually 64 bytes). This will presumably improve performance by reducing the number of potential memory cache misses. The proposed variants have, however, the drawback of using about 32% more space than classic Bloom filters.

The space-efficient variant relies on using a single hash function that generates for each key a value in the range [0, n/ε], where ε is the requested false positive rate. The sequence of values is then sorted and compressed using Golomb coding (or some other compression technique) to occupy a space close to n log₂(1/ε) bits. To query the Bloom filter for a given key, it suffices to check whether its corresponding value is stored in the Bloom filter. Decompressing the whole Bloom filter for each query would make this variant totally unusable. To overcome this problem, the sequence of values is divided into small blocks of equal size that are compressed separately. At query time, only half a block needs to be decompressed on average. Because of decompression overhead, this variant may be slower than classic Bloom filters, but this may be compensated by the fact that only a single hash function needs to be computed.

Another alternative to the classic Bloom filter is the one based on space-efficient variants of cuckoo hashing. In this case, once the hash table is constructed, the keys stored in the hash table are replaced with short signatures of the keys. Those signatures are strings of bits computed using a hash function applied on the keys.

4.3.9 Extensions and applications

Cache filtering

[Figure: Using a Bloom filter to prevent one-hit-wonders from being stored in a web cache decreased the rate of disk writes by nearly one half, reducing the load on the disks and potentially increasing disk performance.[10]]

Content delivery networks deploy web caches around the world to cache and serve web content to users with greater performance and reliability. A key application of Bloom filters is their use in efficiently determining which web objects to store in these web caches. Nearly three-quarters of the URLs accessed from a typical web cache are “one-hit-wonders” that are accessed by users only once and never again. It is clearly wasteful of disk resources to store one-hit-wonders in a web cache, since they will never be accessed again. To prevent caching one-hit-wonders, a Bloom filter is used to keep track of all URLs that are accessed by users. A web object is cached only when it has been accessed at least once before, i.e., the object is cached on its second request. The use of a Bloom filter in this fashion significantly reduces the disk write workload, since one-hit-wonders are never written to the disk cache. Further, filtering out the one-hit-wonders also saves cache space on disk, increasing the cache hit rates.[10]
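A minimal sketch of this admission policy follows. The simple filter class, with salted SHA-256 hashes and a Python int as the bit array, is an illustrative stand-in for whatever Bloom filter implementation a real cache would use, and fetch is a hypothetical origin-fetch callback:

import hashlib

class BloomFilter:
    # A plain m-bit Bloom filter with k hash functions derived from
    # salted SHA-256 digests (an illustrative hashing choice).
    def __init__(self, m, k):
        self.m, self.k, self.bits = m, k, 0
    def _positions(self, key):
        for i in range(self.k):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.m
    def add(self, key):
        for pos in self._positions(key):
            self.bits |= 1 << pos
    def __contains__(self, key):
        return all(self.bits >> pos & 1 for pos in self._positions(key))

seen_once = BloomFilter(m=1 << 20, k=7)
cache = {}

def request(url, fetch):
    # Cache on second request only: the filter remembers first hits,
    # so one-hit-wonders never enter the cache.
    if url in cache:
        return cache[url]
    body = fetch(url)
    if url in seen_once:      # second (or later) request: admit to cache
        cache[url] = body
    else:                     # first request: remember it, don't cache
        seen_once.add(url)
    return body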

Counting filters

Counting filters provide a way to implement a delete operation on a Bloom filter without recreating the filter afresh. In a counting filter the array positions (buckets) are extended from being a single bit to being an n-bit counter. In fact, regular Bloom filters can be considered as counting filters with a bucket size of one bit. Counting filters were introduced by Fan et al. (2000).

The insert operation is extended to increment the value of the buckets, and the lookup operation checks that each of the required buckets is non-zero. The delete operation then consists of decrementing the value of each of the respective buckets.

Arithmetic overflow of the buckets is a problem and the buckets should be sufficiently large to make this case rare. If it does occur then the increment and decrement operations must leave the bucket set to the maximum possible value in order to retain the properties of a Bloom filter.

The size of counters is usually 3 or 4 bits. Hence counting Bloom filters use 3 to 4 times more space than static Bloom filters. In contrast, the data structures of Pagh, Pagh & Rao (2005) and Fan et al. (2014) also allow deletions but use less space than a static Bloom filter.

Another issue with counting filters is limited scalability. Because the counting Bloom filter table cannot be expanded, the maximal number of keys to be stored simultaneously in the filter must be known in advance. Once the designed capacity of the table is exceeded, the false positive rate will grow rapidly as more keys are inserted.

Bonomi et al. (2006) introduced a data structure based on d-left hashing that is functionally equivalent but uses approximately half as much space as counting Bloom filters. The scalability issue does not occur in this data structure. Once the designed capacity is exceeded, the keys could be reinserted in a new hash table of double size.

The space-efficient variant by Putze, Sanders & Singler (2007) could also be used to implement counting filters by supporting insertions and deletions.

Rottenstreich, Kanizo & Keslassy (2012) introduced a new general method based on variable increments that significantly improves the false positive probability of counting Bloom filters and their variants, while still supporting deletions. Unlike counting Bloom filters, at each element insertion, the hashed counters are incremented by a hashed variable increment instead of a unit increment. To query an element, the exact values of the counters are considered, not just whether they are positive. If a sum represented by a counter value cannot be composed of the corresponding variable increment for the queried element, a negative answer can be returned to the query.
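A sketch of these operations, using a list of small saturating counters (a 4-bit maximum by default, so 15) and the same illustrative salted-hash scheme as in the earlier sketch:

import hashlib

class CountingBloomFilter:
    # Each position is a small saturating counter rather than a single
    # bit, which makes deletions possible.
    def __init__(self, m, k, counter_max=15):
        self.m, self.k, self.counter_max = m, k, counter_max
        self.counters = [0] * m
    def _positions(self, key):
        for i in range(self.k):
            d = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(d[:8], "big") % self.m
    def add(self, key):
        for p in self._positions(key):
            if self.counters[p] < self.counter_max:  # saturate, never wrap
                self.counters[p] += 1
    def remove(self, key):
        for p in self._positions(key):
            if 0 < self.counters[p] < self.counter_max:
                self.counters[p] -= 1   # saturated counters must stay put
    def __contains__(self, key):
        return all(self.counters[p] > 0 for p in self._positions(key))

Saturated counters are left untouched on both increment and decrement, which preserves the Bloom filter property (no false negatives) at the cost of occasionally undeletable entries, exactly the overflow trade-off described above.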

Decentralized aggregation

Bloom filters can be organized in distributed data structures to perform fully decentralized computations of aggregate functions. Decentralized aggregation makes collective measurements locally available in every node of a distributed network without involving a centralized computational entity for this purpose.[24]

Data synchronization

Bloom filters can be used for approximate data synchronization as in Byers et al. (2004). Counting Bloom filters can be used to approximate the number of differences between two sets and this approach is described in Agarwal & Trachtenberg (2006).

Bloomier filters

Chazelle et al. (2004) designed a generalization of Bloom filters that could associate a value with each element that had been inserted, implementing an associative array. Like Bloom filters, these structures achieve a small space overhead by accepting a small probability of false positives. In the case of “Bloomier filters”, a false positive is defined as returning a result when the key is not in the map. The map will never return the wrong value for a key that is in the map.

Compact approximators

Boldi & Vigna (2005) proposed a lattice-based generalization of Bloom filters. A compact approximator associates to each key an element of a lattice (the standard Bloom filters being the case of the Boolean two-element lattice). Instead of a bit array, they have an array of lattice elements. When adding a new association between a key and an element of the lattice, they compute the maximum of the current contents of the k array locations associated to the key with the lattice element. When reading the value associated to a key, they compute the minimum of the values found in the k locations associated to the key. The resulting value approximates from above the original value.

Stable Bloom filters

Deng & Rafiei (2006) proposed Stable Bloom filters as a variant of Bloom filters for streaming data. The idea is that since there is no way to store the entire history of a stream (which can be infinite), Stable Bloom filters continuously evict stale information to make room for more recent elements. Since stale information is evicted, the Stable Bloom filter introduces false negatives, which do not appear in traditional Bloom filters. The authors show that a tight upper bound of false positive rates is guaranteed, and the method is superior to standard Bloom filters in terms of false positive rates and time efficiency when a small space and an acceptable false positive rate are given.

Scalable Bloom filters

Almeida et al. (2007) proposed a variant of Bloom filters that can adapt dynamically to the number of elements stored, while assuring a minimum false positive probability. The technique is based on sequences of standard Bloom filters with increasing capacity and tighter false positive probabilities, so as to ensure that a maximum false positive probability can be set beforehand, regardless of the number of elements to be inserted.

Layered Bloom filters

A layered Bloom filter consists of multiple Bloom filter layers. Layered Bloom filters allow keeping track of how many times an item was added to the Bloom filter by checking how many layers contain the item. With a layered Bloom filter a check operation will normally return the deepest layer number the item was found in.[25]

Attenuated Bloom filters

[Figure: Attenuated Bloom Filter Example: Search for pattern 11010, starting from node n1.]

An attenuated Bloom filter of depth D can be viewed as an array of D normal Bloom filters. In the context of service discovery in a network, each node stores regular and attenuated Bloom filters locally. The regular or local Bloom filter indicates which services are offered by the node itself. The attenuated filter of level i indicates which services can be found on nodes that are i hops away from the current node. The i-th value is constructed by taking a union of local Bloom filters for nodes i hops away from the node.[26]

Let's take the small network shown in the figure as an example. Say we are searching for a service A whose id hashes to bits 0, 1, and 3 (pattern 11010). Let node n1 be the starting point. First, we check whether service A is offered by n1 by checking its local filter. Since the patterns don't match, we check the attenuated Bloom filter in order to determine which node should be the next hop. We see that n2 doesn't offer service A but lies on the path to nodes that do. Hence, we move to n2 and repeat the same procedure. We quickly find that n3 offers the service, and hence the destination is located.[27]

By using attenuated Bloom filters consisting of multiple layers, services at more than one hop distance can be discovered while avoiding saturation of the Bloom filter by attenuating (shifting out) bits set by sources further away.[26]

Chemical structure searching

Bloom filters are often used to search large chemical structure databases (see chemical similarity). In the simplest case, the elements added to the filter (called a fingerprint in this field) are just the atomic numbers present in the molecule, or a hash based on the atomic number of each atom and the number and type of its bonds. This case is too simple to be useful. More advanced filters also encode atom counts, larger substructure features like carboxyl groups, and graph properties like the number of rings. In hash-based fingerprints, a hash function based on atom and bond properties is used to turn a subgraph into a PRNG seed, and the first output values used to set bits in the Bloom filter.

Molecular fingerprints started in the late 1940s as a way to search for chemical structures on punched cards. However, it wasn't until around 1990 that Daylight introduced a hash-based method to generate the bits, rather than use a precomputed table. Unlike the dictionary approach, the hash method can assign bits for substructures which hadn't previously been seen. In the early 1990s, the term “fingerprint” was considered different from “structural keys”, but the term has since grown to encompass most molecular characteristics which can be used for a similarity comparison, including structural keys, sparse count fingerprints, and 3D fingerprints. Unlike Bloom filters, the Daylight hash method allows the number of bits assigned per feature to be a function of the feature size, but most implementations of Daylight-like fingerprints use a fixed number of bits per feature, which makes them a Bloom filter. The original Daylight fingerprints could be used for both similarity and screening purposes. Many other fingerprint types, like the popular ECFP2, can be used for similarity but not for screening because they include local environmental characteristics that introduce false negatives when used as a screen. Even if these are constructed with the same mechanism, these are not Bloom filters because they cannot be used to filter.
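As a rough illustration of the hash-based approach, the sketch below uses Python's built-in hash and PRNG as stand-ins for a chemically meaningful subgraph hash: each feature seeds a generator whose first outputs choose the bits to set, and similarity is then the Tanimoto (Jaccard) coefficient of the bit arrays:

import random

def fingerprint_bits(subgraphs, m=2048, bits_per_feature=4):
    # Each substructure (e.g. a canonical substructure string) seeds a
    # PRNG whose first few outputs choose the bits, Daylight-style; a
    # fixed bits_per_feature is what makes this a Bloom filter.
    bits = 0
    for subgraph in subgraphs:
        rng = random.Random(hash(subgraph))
        for _ in range(bits_per_feature):
            bits |= 1 << rng.randrange(m)
    return bits

def tanimoto(a, b):
    # Jaccard/Tanimoto similarity of two non-empty fingerprints
    # stored as Python ints.
    return bin(a & b).count("1") / bin(a | b).count("1")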

4.3.10 See also

• Count–min sketch

• Feature hashing

• MinHash

• Quotient filter

• Skip list

4.3.11 Notes

[1] Bloom (1970).

[2] Bonomi et al. (2006).

[3] Dillinger & Manolios (2004a); Kirsch & Mitzenmacher (2006).

[4] Mitzenmacher & Upfal (2005).

[5] Blustein & El-Maazawi (2002), pp. 21–22.

[6] Mitzenmacher & Upfal (2005), pp. 109–111, 308.

[7] Mitzenmacher & Upfal (2005), p. 308.

[8] Starobinski, Trachtenberg & Agarwal (2003).

[9] Goel & Gupta (2010).

[10] Maggs & Sitaraman (2015).

[11] “Bloom index contrib module”. Postgresql.org. 2016-04-01. Retrieved 2016-06-18.

[12] Chang et al. (2006); Apache Software Foundation (2012).

[13] Yakunin, Alex (2010-03-25). “Alex Yakunin's blog: Nice Bloom filter application”. Blog.alexyakunin.com. Retrieved 2014-05-31.

[14] “Issue 10896048: Transition safe browsing from bloom filter to prefix set. - Code Review”. Chromiumcodereview.appspot.com. Retrieved 2014-07-03.

[15] Wessels (2004).

[16] Bitcoin 0.8.0.

[17] “The Bitcoin Foundation - Supporting the development of Bitcoin”. bitcoinfoundation.org.

[18] “Plan 9 /sys/man/8/venti”. Plan9.bell-labs.com. Retrieved 2014-05-31.

[19] http://spinroot.com/

[20] Mullin (1990).

[21] “Exim source code”. github. Retrieved 2014-03-03.

[22] “What are Bloom filters?”. Medium. Retrieved 2015-11-01.

[23] Pagh, Pagh & Rao (2005).

[24] Pournaras, Warnier & Brazier (2013).

[25] Zhiwang, Jungang & Jian (2010).

[26] Koucheryavy et al. (2009).

[27] Kubiatowicz et al. (2000).

4.3.12 References

• Agarwal, Sachin; Trachtenberg, Ari (2006), “Approximating the number of differences between remote sets” (PDF), IEEE Information Theory Workshop, Punta del Este, Uruguay: 217, doi:10.1109/ITW.2006.1633815, ISBN 1-4244-0035-X

• Ahmadi, Mahmood; Wong, Stephan (2007), “A Cache Architecture for Counting Bloom Filters”, 15th International Conference on Networks (ICON-2007), p. 218, doi:10.1109/ICON.2007.4444089, ISBN 978-1-4244-1229-7

• Almeida, Paulo; Baquero, Carlos; Preguica, Nuno; Hutchison, David (2007), “Scalable Bloom Filters” (PDF), Information Processing Letters, 101 (6): 255–261, doi:10.1016/j.ipl.2006.10.007

• Apache Software Foundation (2012), “11.6. Schema Design”, The Apache HBase Reference Guide, Revision 0.94.27

• Bloom, Burton H. (1970), “Space/Time Trade-offs in Hash Coding with Allowable Errors”, Communications of the ACM, 13 (7): 422–426, doi:10.1145/362686.362692

• Blustein, James; El-Maazawi, Amal (2002), “optimal case for general Bloom filters”, Bloom Filters — A Tutorial, Analysis, and Survey, Dalhousie University Faculty of Computer Science, pp. 1–31

• Boldi, Paolo; Vigna, Sebastiano (2005), “Mutable strings in Java: design, implementation and lightweight text-search algorithms”, Science of Computer Programming, 54 (1): 3–23, doi:10.1016/j.scico.2004.05.003

• Bonomi, Flavio; Mitzenmacher, Michael; Panigrahy, Rina; Singh, Sushil; Varghese, George (2006), “An Improved Construction for Counting Bloom Filters”, Algorithms – ESA 2006, 14th Annual European Symposium (PDF), Lecture Notes in Computer Science, 4168, pp. 684–695, doi:10.1007/11841036_61, ISBN 978-3-540-38875-3

• Broder, Andrei; Mitzenmacher, Michael (2005), “Network Applications of Bloom Filters: A Survey” (PDF), Internet Mathematics, 1 (4): 485–509, doi:10.1080/15427951.2004.10129096

• Byers, John W.; Considine, Jeffrey; Mitzenmacher, Michael; Rost, Stanislav (2004), “Informed content delivery across adaptive overlay networks”, IEEE/ACM Transactions on Networking, 12 (5): 767, doi:10.1109/TNET.2004.836103

• Chang, Fay; Dean, Jeffrey; Ghemawat, Sanjay; Hsieh, Wilson; Wallach, Deborah; Burrows, Mike; Chandra, Tushar; Fikes, Andrew; Gruber, Robert (2006), “Bigtable: A Distributed Storage System for Structured Data”, Seventh Symposium on Operating System Design and Implementation

• Charles, Denis; Chellapilla, Kumar (2008), “Bloomier Filters: A second look”, The Computing Research Repository (CoRR), arXiv:0807.0928

• Chazelle, Bernard; Kilian, Joe; Rubinfeld, Ronitt; Tal, Ayellet (2004), “The Bloomier filter: an efficient data structure for static support lookup tables”, Proceedings of the Fifteenth Annual ACM-SIAM Symposium on Discrete Algorithms (PDF), pp. 30–39

• Cohen, Saar; Matias, Yossi (2003), “Spectral Bloom Filters”, Proceedings of the 2003 ACM SIGMOD International Conference on Management of Data (PDF), pp. 241–252, doi:10.1145/872757.872787, ISBN 158113634X

• Deng, Fan; Rafiei, Davood (2006), “Approximately Detecting Duplicates for Streaming Data using Stable Bloom Filters”, Proceedings of the ACM SIGMOD Conference (PDF), pp. 25–36

• Dharmapurikar, Sarang; Song, Haoyu; Turner, Jonathan; Lockwood, John (2006), “Fast packet classification using Bloom filters”, Proceedings of the 2006 ACM/IEEE Symposium on Architecture for Networking and Communications Systems (PDF), pp. 61–70, doi:10.1145/1185347.1185356, ISBN 1595935800, archived from the original (PDF) on 2007-02-02

• Dietzfelbinger, Martin; Pagh, Rasmus (2008), “Succinct Data Structures for Retrieval and Approximate Membership”, The Computing Research Repository (CoRR), arXiv:0803.3693

• Dillinger, Peter C.; Manolios, Panagiotis (2004a), “Fast and Accurate Bitstate Verification for SPIN”, Proceedings of the 11th International Spin Workshop on Model Checking Software, Springer-Verlag, Lecture Notes in Computer Science 2989

• Dillinger, Peter C.; Manolios, Panagiotis (2004b), “Bloom Filters in Probabilistic Verification”, Proceedings of the 5th International Conference on Formal Methods in Computer-Aided Design, Springer-Verlag, Lecture Notes in Computer Science 3312

• Donnet, Benoit; Baynat, Bruno; Friedman, Timur (2006), “Retouched Bloom Filters: Allowing Networked Applications to Flexibly Trade Off False Positives Against False Negatives”, CoNEXT 06 – 2nd Conference on Future Networking Technologies, archived from the original on 2009-05-17

• Eppstein, David; Goodrich, Michael T. (2007), “Space-efficient straggler identification in round-trip data streams via Newton's identities and invertible Bloom filters”, Algorithms and Data Structures, 10th International Workshop, WADS 2007, Springer-Verlag, Lecture Notes in Computer Science 4619, pp. 637–648, arXiv:0704.3313

• Fan, Bin; Andersen, Dave G.; Kaminsky, Michael; Mitzenmacher, Michael D. (2014), “Cuckoo filter: Practically better than Bloom”, Proc. 10th ACM Int. Conf. Emerging Networking Experiments and Technologies (CoNEXT '14), pp. 75–88, doi:10.1145/2674005.2674994. Open source implementation available on github.

• Fan, Li; Cao, Pei; Almeida, Jussara; Broder, Andrei (2000), “Summary Cache: A Scalable Wide-Area Web Cache Sharing Protocol”, IEEE/ACM Transactions on Networking, 8 (3): 281–293, doi:10.1109/90.851975. A preliminary version appeared at SIGCOMM '98.

• Goel, Ashish; Gupta, Pankaj (2010), “Small subset queries and bloom filters using ternary associative memories, with applications”, ACM Sigmetrics 2010, 38: 143, doi:10.1145/1811099.1811056

• Haghighat, Mohammad Hashem; Tavakoli, Mehdi; Kharrazi, Mehdi (2013), “Payload Attribution via Character Dependent Multi-Bloom Filters”, Transaction on Information Forensics and Security, IEEE, 99 (5): 705, doi:10.1109/TIFS.2013.2252341

• Kirsch, Adam; Mitzenmacher, Michael (2006), “Less Hashing, Same Performance: Building a Better Bloom Filter”, in Azar, Yossi; Erlebach, Thomas, Algorithms – ESA 2006, 14th Annual European Symposium (PDF), Lecture Notes in Computer Science, 4168, Springer-Verlag, pp. 456–467, doi:10.1007/11841036, ISBN 978-3-540-38875-3

• Koucheryavy, Y.; Giambene, G.; Staehle, D.; Barcelo-Arroyo, F.; Braun, T.; Siris, V. (2009), “Traffic and QoS Management in Wireless Multimedia Networks”, COST 290 Final Report, USA: 111

• Kubiatowicz, J.; Bindel, D.; Czerwinski, Y.; Geels, S.; Eaton, D.; Gummadi, R.; Rhea, S.; Weatherspoon, H.; et al. (2000), “Oceanstore: An architecture for global-scale persistent storage” (PDF), ACM SIGPLAN Notices, USA: 190–201

• Maggs, Bruce M.; Sitaraman, Ramesh K. (July 2015), “Algorithmic nuggets in content delivery”, SIGCOMM Computer Communication Review, New York, NY, USA: ACM, 45 (3): 52–66, doi:10.1145/2805789.2805800

• Mitzenmacher, Michael; Upfal, Eli (2005), Probability and computing: Randomized algorithms and probabilistic analysis, Cambridge University Press, pp. 107–112, ISBN 9780521835404

• Mortensen, Christian Worm; Pagh, Rasmus; Pătraşcu, Mihai (2005), “On dynamic range reporting in one dimension”, Proceedings of the Thirty-seventh Annual ACM Symposium on Theory of Computing, pp. 104–111, doi:10.1145/1060590.1060606, ISBN 1581139608

• Mullin, James K. (1990), “Optimal semijoins for distributed database systems”, Software Engineering, IEEE Transactions on, 16 (5): 558–560, doi:10.1109/32.52778

• Pagh, Anna; Pagh, Rasmus; Rao, S. Srinivasa (2005), “An optimal Bloom filter replacement”, Proceedings of the Sixteenth Annual ACM-SIAM Symposium on Discrete Algorithms (PDF), pp. 823–829

• Porat, Ely (2008), “An Optimal Bloom Filter Replacement Based on Matrix Solving”, The Computing Research Repository (CoRR), arXiv:0804.1845

• Pournaras, E.; Warnier, M.; Brazier, F. M. T. (2013), “A generic and adaptive aggregation service for large-scale decentralized networks”, Complex Adaptive Systems Modeling, 1:19, doi:10.1186/2194-3206-1-19. Prototype implementation available on github.

• Putze, F.; Sanders, P.; Singler, J. (2007), “Cache-, Hash- and Space-Efficient Bloom Filters”, in Demetrescu, Camil, Experimental Algorithms, 6th International Workshop, WEA 2007 (PDF), Lecture Notes in Computer Science, 4525, Springer-Verlag, pp. 108–121, doi:10.1007/978-3-540-72845-0, ISBN 978-3-540-72844-3

• Rottenstreich, Ori; Kanizo, Yossi; Keslassy, Isaac (2012), “The Variable-Increment Counting Bloom Filter”, 31st Annual IEEE International Conference on Computer Communications, 2012, Infocom 2012 (PDF), pp. 1880–1888, doi:10.1109/INFCOM.2012.6195563, ISBN 978-1-4673-0773-4

• Sethumadhavan, Simha; Desikan, Rajagopalan; Burger, Doug; Moore, Charles R.; Keckler, Stephen W. (2003), “Scalable hardware memory disambiguation for high ILP processors”, 36th Annual IEEE/ACM International Symposium on Microarchitecture, 2003, MICRO-36 (PDF), pp. 399–410, doi:10.1109/MICRO.2003.1253244, ISBN 0-7695-2043-X, archived from the original (PDF) on 2007-01-14

• Shanmugasundaram, Kulesh; Brönnimann, Hervé; Memon, Nasir (2004), “Payload attribution via hierarchical Bloom filters”, Proceedings of the 11th ACM Conference on Computer and Communications Security, pp. 31–41, doi:10.1145/1030083.1030089, ISBN 1581139616

• Starobinski, David; Trachtenberg, Ari; Agarwal, Sachin (2003), “Efficient PDA Synchronization”, IEEE Transactions on Mobile Computing, 2 (1): 40, doi:10.1109/TMC.2003.1195150

• Stern, Ulrich; Dill, David L. (1996), “A New Scheme for Memory-Efficient Probabilistic Verification”, Proceedings of Formal Description Techniques for Distributed Systems and Communication Protocols, and Protocol Specification, Testing, and Verification: IFIP TC6/WG6.1 Joint International Conference, Chapman & Hall, IFIP Conference Proceedings, pp. 333–348, CiteSeerX 10.1.1.47.4101

• Swamidass, S. Joshua; Baldi, Pierre (2007), “Mathematical correction for fingerprint similarity measures to improve chemical retrieval”, Journal of Chemical Information and Modeling, ACS Publications, 47 (3): 952–964, doi:10.1021/ci600526a, PMID 17444629

• Wessels, Duane (January 2004), “10.7 Cache Digests”, Squid: The Definitive Guide (1st ed.), O'Reilly Media, p. 172, ISBN 0-596-00162-2. Cache Digests are based on a technique first published by Pei Cao, called Summary Cache. The fundamental idea is to use a Bloom filter to represent the cache contents.

• Zhiwang, Cen; Jungang, Xu; Jian, Sun (2010), “A multi-layer Bloom filter for duplicated URL detection”, Proc. 3rd International Conference on Advanced Computer Theory and Engineering (ICACTE 2010), 1, pp. V1–586–V1–591, doi:10.1109/ICACTE.2010.5578947

4.3.13 External links

• Why Bloom filters work the way they do (Michael Nielsen, 2012)

• Bloom Filters — A Tutorial, Analysis, and Survey (Blustein & El-Maazawi, 2002) at Dalhousie University

• Table of false-positive rates for different configurations from a University of Wisconsin–Madison website

• Interactive Processing demonstration from ashcan.org

• “More Optimal Bloom Filters”, Ely Porat (Nov/2007) Google TechTalk video on YouTube

• “Using Bloom Filters” - Detailed Bloom Filter explanation using Perl

• “A Garden Variety of Bloom Filters” - Explanation and Analysis of Bloom filter variants

• “Bloom filters, fast and simple” - Explanation and example implementation in Python

4.4 MinHash

In computer science, MinHash (or the min-wise independent permutations locality sensitive hashing scheme) is a technique for quickly estimating how similar two sets are. The scheme was invented by Andrei Broder (1997),[1] and initially used in the AltaVista search engine to detect duplicate web pages and eliminate them from search results.[2] It has also been applied in large-scale clustering problems, such as clustering documents by the similarity of their sets of words.[1]

4.4.1 Jaccard similarity and minimum hash values

The Jaccard similarity coefficient is a commonly used indicator of the similarity between two sets. For sets A and B it is defined to be the ratio of the number of elements of their intersection and the number of elements of their union:

J(A, B) = |A ∩ B| / |A ∪ B|.

This value is 0 when the two sets are disjoint, 1 when they are equal, and strictly between 0 and 1 otherwise. Two sets are more similar (i.e. have relatively more members in common) when their Jaccard index is closer to 1. The goal of MinHash is to estimate J(A,B) quickly, without explicitly computing the intersection and union.

Let h be a hash function that maps the members of A and B to distinct integers, and for any set S define hmin(S) to be the minimal member of S with respect to h—that is, the member x of S with the minimum value of h(x). Now, if we apply hmin to both A and B, we will get the same value exactly when the element of the union A ∪ B with minimum hash value lies in the intersection A ∩ B. The probability of this being true is the ratio above, and therefore:

Pr[ hmin(A) = hmin(B) ] = J(A,B).

That is, the probability that hmin(A) = hmin(B) is true is equal to the similarity J(A,B), assuming randomly chosen sets A and B. In other words, if r is the random variable that is one when hmin(A) = hmin(B) and zero otherwise, then r is an unbiased estimator of J(A,B). r has too high a variance to be a useful estimator for the Jaccard similarity on its own—it is always zero or one. The idea of the MinHash scheme is to reduce this variance by averaging together several variables constructed in the same way.

4.4.2 Algorithm

Variant with many hash functions

The simplest version of the minhash scheme uses k different hash functions, where k is a fixed integer parameter, and represents each set S by the k values of hmin(S) for these k functions.

To estimate J(A,B) using this version of the scheme, let y be the number of hash functions for which hmin(A) = hmin(B), and use y/k as the estimate. This estimate is the average of k different 0-1 random variables, each of which is one when hmin(A) = hmin(B) and zero otherwise, and each of which is an unbiased estimator of J(A,B). Therefore, their average is also an unbiased estimator, and by standard Chernoff bounds for sums of 0-1 random variables, its expected error is O(1/√k).[3]

Therefore, for any constant ε > 0 there is a constant k = O(1/ε²) such that the expected error of the estimate is at most ε. For example, 400 hashes would be required to estimate J(A,B) with an expected error less than or equal to .05.
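A minimal sketch of this variant, with k salted SHA-1 hashes standing in for the k independent hash functions:

import hashlib

def minhash_signature(items, k=100):
    # Signature of a non-empty set: for each of k hash functions,
    # keep the minimum hash value over the set's members.
    return [
        min(int.from_bytes(hashlib.sha1(f"{i}:{x}".encode()).digest()[:8], "big")
            for x in items)
        for i in range(k)
    ]

def estimate_jaccard(sig_a, sig_b):
    # y/k: the fraction of hash functions whose minima agree.
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)

A = set("the quick brown fox jumps over the lazy dog".split())
B = set("the quick brown fox sleeps all day".split())
print(estimate_jaccard(minhash_signature(A), minhash_signature(B)))
# Compare against the exact value len(A & B) / len(A | B), about 0.36 here.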

Variant with a single hash function

It may be computationally expensive to compute multiple hash functions, but a related version of the MinHash scheme avoids this penalty by using only a single hash function and uses it to select multiple values from each set rather than selecting only a single minimum value per hash function. Let h be a hash function, and let k be a fixed integer. If S is any set of k or more values in the domain of h, define h₍k₎(S) to be the subset of the k members of S that have the smallest values of h. This subset h₍k₎(S) is used as a signature for the set S, and the similarity of any two sets is estimated by comparing their signatures.

Specifically, let A and B be any two sets. Then X = h₍k₎(h₍k₎(A) ∪ h₍k₎(B)) = h₍k₎(A ∪ B) is a set of k elements of A ∪ B, and if h is a random function then any subset of k elements is equally likely to be chosen; that is, X is a simple random sample of A ∪ B. The subset Y = X ∩ h₍k₎(A) ∩ h₍k₎(B) is the set of members of X that belong to the intersection A ∩ B. Therefore, |Y|/k is an unbiased estimator of J(A,B). The difference between this estimator and the estimator produced by multiple hash functions is that X always has exactly k members, whereas the multiple hash functions may lead to a smaller number of sampled elements due to the possibility that two different hash functions may have the same minima. However, when k is small relative to the sizes of the sets, this difference is negligible.

By standard Chernoff bounds for sampling without replacement, this estimator has expected error O(1/√k), matching the performance of the multiple-hash-function scheme.

Time analysis

The estimator |Y|/k can be computed in time O(k) from the two signatures of the given sets, in either variant of the scheme. Therefore, when ε and k are constants, the time to compute the estimated similarity from the signatures is also constant. The signature of each set can be computed in linear time on the size of the set, so when many pairwise similarities need to be estimated this method can lead to substantial savings in running time compared to doing a full comparison of the members of each set. Specifically, for set size n the many hash variant takes O(n k) time. The single hash variant is generally faster, requiring O(n) time to maintain the queue of minimum hash values assuming n >> k.[1]
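A minimal sketch of the single-hash variant. Here heapq.nsmallest keeps the k smallest hash values (in O(n log k) rather than the O(n) of a more careful bounded-queue implementation, but the estimator is identical):

import hashlib
import heapq

def h(x):
    # The single hash function, mapping items to 64-bit integers.
    return int.from_bytes(hashlib.sha1(str(x).encode()).digest()[:8], "big")

def bottom_k(items, k):
    # h_(k)(S): the k smallest hash values of the set.
    return set(heapq.nsmallest(k, (h(x) for x in items)))

def estimate_jaccard(sig_a, sig_b, k):
    # X = the k smallest values of the union of signatures;
    # Y = members of X present in both signatures; estimate is |Y|/k.
    X = set(heapq.nsmallest(k, sig_a | sig_b))
    return len(X & sig_a & sig_b) / k

A = bottom_k(range(0, 1000), 100)
B = bottom_k(range(500, 1500), 100)
print(estimate_jaccard(A, B, 100))   # close to the true Jaccard value 1/3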

4.4.3 Min-wise independent permutations

In order to implement the MinHash scheme as described above, one needs the hash function h to define a random permutation on n elements, where n is the total number of distinct elements in the union of all of the sets to be compared. But because there are n! different permutations, it would require Ω(n log n) bits just to specify a truly random permutation, an infeasibly large number for even moderate values of n. Because of this fact, by analogy to the theory of universal hashing, there has been significant work on finding a family of permutations that is “min-wise independent”, meaning that for any subset of the domain, any element is equally likely to be the minimum. It has been established that a min-wise independent family of permutations must include at least

lcm(1, 2, ..., n) ≥ e^(n−o(n))

different permutations, and therefore that it needs Ω(n) bits to specify a single permutation, still infeasibly large.[2]

Because of this impracticality, two variant notions of min-wise independence have been introduced: restricted min-wise independent permutation families, and approximate min-wise independent families. Restricted min-wise independence is the min-wise independence property restricted to certain sets of cardinality at most k.[4] Approximate min-wise independence has at most a fixed probability ε of varying from full independence.[5]

4.4.4 Applications

The original applications for MinHash involved clustering and eliminating near-duplicates among web documents, represented as sets of the words occurring in those documents.[1][2][6] Similar techniques have also been used for clustering and near-duplicate elimination for other types of data, such as images: in the case of image data, an image can be represented as a set of smaller subimages cropped from it, or as sets of more complex image feature descriptions.[7]

In data mining, Cohen et al. (2001) use MinHash as a tool for association rule learning. Given a database in which each entry has multiple attributes (viewed as a 0–1 matrix with a row per database entry and a column per attribute) they use MinHash-based approximations to the Jaccard index to identify candidate pairs of attributes that frequently co-occur, and then compute the exact value of the index for only those pairs to determine the ones whose frequencies of co-occurrence are below a given strict threshold.[8]

4.4.5 Other uses

The MinHash scheme may be seen as an instance of locality sensitive hashing, a collection of techniques for using hash functions to map large sets of objects down to smaller hash values in such a way that, when two objects have a small distance from each other, their hash values are likely to be the same. In this instance, the signature of a set may be seen as its hash value. Other locality sensitive hashing techniques exist for Hamming distance between sets and cosine distance between vectors; locality sensitive hashing has important applications in nearest neighbor search algorithms.[9] For large distributed systems, and in particular MapReduce, there exist modified versions of MinHash to help compute similarities with no dependence on the point dimension.[10]

4.4.6 Evaluation and benchmarks

A large scale evaluation was conducted by Google in 2006[11] to compare the performance of Minhash and SimHash[12] algorithms. In 2007 Google reported using Simhash for duplicate detection for web crawling[13] and using Minhash and LSH for Google News personalization.[14]

4.4.7 See also

• SimHash

• w-shingling

• Count–min sketch

4.4.8 References

[1] Broder, Andrei Z. (1997), “On the resemblance and containment of documents”, Compression and Complexity of Sequences: Proceedings, Positano, Amalfitan Coast, Salerno, Italy, June 11-13, 1997 (PDF), IEEE, pp. 21–29, doi:10.1109/SEQUEN.1997.666900.

[2] Broder, Andrei Z.; Charikar, Moses; Frieze, Alan M.; Mitzenmacher, Michael (1998), “Min-wise independent permutations”, Proc. 30th ACM Symposium on Theory of Computing (STOC '98), New York, NY, USA: Association for Computing Machinery, pp. 327–336, doi:10.1145/276698.276781.

[3] Vassilvitskii, Sergey (2011), COMS 6998-12: Dealing with Massive Data (lecture notes, Columbia University) (PDF).

[4] Matoušek, Jiří; Stojaković, Miloš (2003), “On restricted min-wise independence of permutations”, Random Structures and Algorithms, 23 (4): 397–408, doi:10.1002/rsa.10101.

[5] Saks, M.; Srinivasan, A.; Zhou, S.; Zuckerman, D. (2000), “Low discrepancy sets yield approximate min-wise independent permutation families”, Information Processing Letters, 73 (1–2): 29–32, doi:10.1016/S0020-0190(99)00163-5.

[6] Manasse, Mark (2012). On the Efficient Determination of Most Near Neighbors: Horseshoes, Hand Grenades, Web Search, and Other Situations when Close is Close Enough. Morgan & Claypool. p. 72. ISBN 9781608450886.

[7] Chum, Ondřej; Philbin, James; Isard, Michael; Zisserman, Andrew (2007), “Scalable near identical image and shot detection”, Proceedings of the 6th ACM International Conference on Image and Video Retrieval (CIVR'07), doi:10.1145/1282280.1282359; Chum, Ondřej; Philbin, James; Zisserman, Andrew (2008), “Near duplicate image detection: min-hash and tf-idf weighting”, Proceedings of the British Machine Vision Conference (PDF), 3, p. 4.

[8] Cohen, E.; Datar, M.; Fujiwara, S.; Gionis, A.; Indyk, P.; Motwani, R.; Ullman, J. D.; Yang, C. (2001), “Finding interesting associations without support pruning”, IEEE Transactions on Knowledge and Data Engineering, 13 (1): 64–78, doi:10.1109/69.908981.

[9] Andoni, Alexandr; Indyk, Piotr (2008), “Near-optimal hashing algorithms for approximate nearest neighbor in high dimensions”, Communications of the ACM, 51 (1): 117–122, doi:10.1145/1327452.1327494.

[10] Zadeh, Reza; Goel, Ashish (2012), Dimension Independent Similarity Computation, arXiv:1206.2082.

[11] Henzinger, Monika (2006), “Finding near-duplicate web pages: a large-scale evaluation of algorithms”, Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (PDF), doi:10.1145/1148170.1148222.

[12] Charikar, Moses S. (2002), “Similarity estimation techniques from rounding algorithms”, Proceedings of the 34th Annual ACM Symposium on Theory of Computing, doi:10.1145/509907.509965.

[13] Gurmeet Singh, Manku; Jain, Arvind; Das Sarma, Anish (2007), “Detecting near-duplicates for web crawling”, Proceedings of the 16th International Conference on World Wide Web (PDF), doi:10.1145/1242572.1242592.

[14] Das, Abhinandan S.; Datar, Mayur; Garg, Ashutosh; Rajaram, Shyam; et al. (2007), “Google news personalization: scalable online collaborative filtering”, Proceedings of the 16th International Conference on World Wide Web, doi:10.1145/1242572.1242610.

4.4.9 External links

• Mining of Massive Datasets, Ch. 3. Finding similar Items

• Set Similarity & MinHash - C# implementation

• Minhash with LSH for all-pair search (C# implementation)

• MinHash – Java implementation

• MinHash – Scala implementation and a duplicate detection tool

• All pairs similarity search (Google Research)

• Distance and Similarity Measures (Wolfram Alpha)

• Nilsimsa hash (Python implementation)

• Simhash

4.5 Disjoint-set data structure

[Figure: MakeSet creates 8 singletons.]

[Figure: After some operations of Union, some sets are grouped together.]

In computer science, a disjoint-set data structure, also called a union–find data structure or merge–find set, is a data structure that keeps track of a set of elements partitioned into a number of disjoint (nonoverlapping) subsets. It supports two useful operations:

• Find: Determine which subset a particular element is in. Find typically returns an item from this set that serves as its “representative”; by comparing the result of two Find operations, one can determine whether two elements are in the same subset.

• Union: Join two subsets into a single subset.

The other important operation, MakeSet, which makes a set containing only a given element (a singleton), is generally trivial. With these three operations, many practical partitioning problems can be solved (see the Applications section).

In order to define these operations more precisely, some way of representing the sets is needed. One common approach is to select a fixed element of each set, called its representative, to represent the set as a whole. Then, Find(x) returns the representative of the set that x belongs to, and Union takes two set representatives as its arguments.

4.5.1 Disjoint-set linked lists

A simple disjoint-set data structure uses a linked list for each set. The element at the head of each list is chosen as its representative.

MakeSet creates a list of one element. Union appends the two lists, a constant-time operation if the list carries a pointer to its tail. The drawback of this implementation is that Find requires O(n) or linear time to traverse the list backwards from a given element to the head of the list.

This can be avoided by including in each linked list node a pointer to the head of the list; then Find takes constant time, since this pointer refers directly to the set representative. However, Union now has to update each element of the list being appended to make it point to the head of the new combined list, requiring O(n) time.

When the length of each list is tracked, the required time can be improved by always appending the smaller list to the longer. Using this weighted-union heuristic, a sequence of m MakeSet, Union, and Find operations on n elements requires O(m + n log n) time.[2] For asymptotically faster operations, a different data structure is needed.

Analysis of the naive approach

We now explain the bound O(n log(n)) above.

Suppose you have a collection of lists and each node of each list contains an object, the name of the list to which it belongs, and the number of elements in that list. Also assume that the total number of elements in all lists is n (i.e. there are n elements overall). We wish to be able to merge any two of these lists, and update all of their nodes so that they still contain the name of the list to which they belong. The rule for merging the lists A and B is that if A is larger than B then merge the elements of B into A and update the elements that used to belong to B, and vice versa.

Choose an arbitrary element of list L, say x. We wish to count how many times, in the worst case, x will need to have the name of the list to which it belongs updated. The element x will only have its name updated when the list it belongs to is merged with another list of the same size or of greater size. Each time that happens, the size of the list to which x belongs at least doubles. So finally, the question is “how many times can a number double before it is the size of n?” (then the list containing x will contain all n elements). The answer is exactly log₂(n). So for any given element of any given list in the structure described, it will need to be updated log₂(n) times in the worst case. Therefore, updating a list of n elements stored in this way takes O(n log(n)) time in the worst case. A find operation can be done in O(1) for this structure because each node contains the name of the list to which it belongs.

A similar argument holds for merging the trees in the data structures discussed below. Additionally, it helps explain the time analysis of some operations in the binomial heap and Fibonacci heap data structures.
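A sketch of this list-based structure with the weighted-union heuristic. A Python list, with its head as the representative, stands in for the linked list with head and tail pointers:

class LinkedListDisjointSets:
    # set_of maps each element to the list representing its set, which
    # plays the role of the per-node "name of the list" described above.
    def __init__(self):
        self.set_of = {}
    def make_set(self, x):
        self.set_of[x] = [x]          # position 0 is the representative
    def find(self, x):
        return self.set_of[x][0]      # O(1): follow the stored set pointer
    def union(self, x, y):
        a, b = self.set_of[x], self.set_of[y]
        if a is b:
            return
        if len(a) < len(b):           # weighted union: relabel fewer items
            a, b = b, a
        for item in b:                # each element is relabeled O(log n) times
            self.set_of[item] = a
        a.extend(b)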

4.5.2 Disjoint-set forests

Disjoint-set forests are data structures where each set is represented by a tree data structure, in which each node holds a reference to its parent node (see parent pointer tree). They were first described by Bernard A. Galler and Michael J. Fischer in 1964,[3] although their precise analysis took years.

In a disjoint-set forest, the representative of each set is the root of that set's tree. Find follows parent nodes until it reaches the root. Union combines two trees into one by attaching the root of one to the root of the other.

Implementation

Naive One way of implementing these might be:

function MakeSet(x)
    x.parent := x

function Find(x)
    if x.parent == x
        return x
    else
        return Find(x.parent)

function Union(x, y)
    xRoot := Find(x)
    yRoot := Find(y)
    xRoot.parent := yRoot

In this naive form, this approach is no better than the linked-list approach, because the tree it creates can be highly unbalanced.

Union by rank The previous implementation can be enhanced in two ways.

The first way, called union by rank, is to always attach the smaller tree to the root of the larger tree. Since it is the depth of the tree that affects the running time, the tree with smaller depth gets added under the root of the deeper tree, which only increases the depth if the depths were equal. In the context of this algorithm, the term rank is used instead of depth since it stops being equal to the depth if path compression (described below) is also used. One-element trees are defined to have a rank of zero, and whenever two trees of the same rank r are united, the rank of the result is r+1. Just applying this technique alone yields a worst-case running-time of O(log n) for the Union or Find operation. Pseudocode for the improved MakeSet and Union:

function MakeSet(x)
    x.parent := x
    x.rank := 0

function Union(x, y)
    xRoot := Find(x)
    yRoot := Find(y)
    // if x and y are already in the same set (i.e., have the same root or representative)
    if xRoot == yRoot
        return
    // x and y are not in same set, so we merge them
    if xRoot.rank < yRoot.rank
        xRoot.parent := yRoot
    else if xRoot.rank > yRoot.rank
        yRoot.parent := xRoot
    else
        yRoot.parent := xRoot
        xRoot.rank := xRoot.rank + 1

Path compression The second improvement, called path compression, is a way of flattening the structure of the tree whenever Find is used on it. The idea is that each node visited on the way to a root node may as well be attached directly to the root node; they all share the same representative. To effect this, as Find recursively traverses up the tree, it changes each node's parent reference to point to the root that it found. The resulting tree is much flatter, speeding up future operations not only on these elements but on those referencing them, directly or indirectly. Here is the improved Find:

function Find(x)
    if x.parent != x
        x.parent := Find(x.parent)
    return x.parent

[Figure: A demo for Union-Find when using Kruskal's algorithm to find minimum spanning tree.]

These two techniques complement each other; applied together, the amortized time per operation is only O(α(n)), where α(n) is the inverse of the function n = f(x) = A(x, x), and A is the extremely fast-growing Ackermann function. Since α(n) is the inverse of this function, α(n) is less than 5 for all remotely practical values of n. Thus, the amortized running time per operation is effectively a small constant.

In fact, this is asymptotically optimal: Fredman and Saks showed in 1989 that Ω(α(n)) words must be accessed by any disjoint-set data structure per operation on average.[4]
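Combining the two heuristics gives the standard implementation; the following is a direct Python transcription of the pseudocode above:

class DisjointSetForest:
    def __init__(self):
        self.parent = {}
        self.rank = {}
    def make_set(self, x):
        self.parent[x] = x
        self.rank[x] = 0
    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])   # path compression
        return self.parent[x]
    def union(self, x, y):
        x_root, y_root = self.find(x), self.find(y)
        if x_root == y_root:
            return
        if self.rank[x_root] < self.rank[y_root]:        # union by rank
            x_root, y_root = y_root, x_root
        self.parent[y_root] = x_root
        if self.rank[x_root] == self.rank[y_root]:
            self.rank[x_root] += 1

dsf = DisjointSetForest()
for v in range(8):
    dsf.make_set(v)
dsf.union(0, 1)
dsf.union(1, 2)
print(dsf.find(0) == dsf.find(2))   # True: 0, 1, 2 share a representative
print(dsf.find(0) == dsf.find(3))   # False: 3 is still a singleton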

4.5.3 Applications

Disjoint-set data structures model the partitioning of a set, for example to keep track of the connected components of an undirected graph. This model can then be used to determine whether two vertices belong to the same component, or whether adding an edge between them would result in a cycle. The Union–Find algorithm is used in high-performance implementations of unification.[5]

This data structure is used by the Boost Graph Library to implement its Incremental Connected Components functionality. It is also used for implementing Kruskal's algorithm to find the minimum spanning tree of a graph.

Note that the implementation as disjoint-set forests doesn't allow deletion of edges—even without path compression or the rank heuristic.

4.5.4 History

While the ideas used in disjoint-set forests have long been familiar, Robert Tarjan was the first to prove the upper bound (and a restricted version of the lower bound) in terms of the inverse Ackermann function, in 1975.[6] Until this time the best bound on the time per operation, proven by Hopcroft and Ullman,[7] was O(log* n), the iterated logarithm of n, another slowly growing function (but not quite as slow as the inverse Ackermann function).

Tarjan and Van Leeuwen also developed one-pass Find algorithms that are more efficient in practice while retaining the same worst-case complexity.[6]

In 2007, Sylvain Conchon and Jean-Christophe Filliâtre developed a persistent version of the disjoint-set forest data structure, allowing previous versions of the structure to be efficiently retained, and formalized its correctness using the proof assistant Coq.[8] However, the implementation is only asymptotic if used ephemerally or if the same version of the structure is repeatedly used with limited backtracking.

4.5.5 See also

• Partition refinement, a different data structure for maintaining disjoint sets, with updates that split sets apart rather than merging them together

• Dynamic connectivity

4.5.6 References

[1] Tarjan, Robert Endre (1975). “Efficiency of a Good But Not Linear Set Union Algorithm”. Journal of the ACM. 22 (2): 215–225. doi:10.1145/321879.321884.

[2] Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2001), “Chapter 21: Data structures for Disjoint Sets”, Introduction to Algorithms (Second ed.), MIT Press, pp. 498–524, ISBN 0-262-03293-7

[3] Galler, Bernard A.; Fischer, Michael J. (May 1964), “An improved equivalence algorithm”, Communications of the ACM, 7 (5): 301–303, doi:10.1145/364099.364331. The paper originating disjoint-set forests.

[4] Fredman, M.; Saks, M. (May 1989), “The cell probe complexity of dynamic data structures”, Proceedings of the Twenty-First Annual ACM Symposium on Theory of Computing: 345–354, Theorem 5: Any CPROBE(log n) implementation of the set union problem requires Ω(m α(m, n)) time to execute m Find's and n−1 Union's, beginning with n singleton sets.

[5] Knight, Kevin (1989). “Unification: A multidisciplinary survey”. ACM Computing Surveys. 21: 93–124. doi:10.1145/62029.62030.

[6] Tarjan, Robert E.; van Leeuwen, Jan (1984), “Worst-case analysis of set union algorithms”, Journal of the ACM, 31 (2): 245–281, doi:10.1145/62.2160

[7] Hopcroft, J. E.; Ullman, J. D. (1973). “Set Merging Algorithms”. SIAM Journal on Computing. 2 (4): 294–303. doi:10.1137/0202024.

[8] Conchon, Sylvain; Filliâtre, Jean-Christophe (October 2007), “A Persistent Union-Find Data Structure”, ACM SIGPLAN Workshop on ML, Freiburg, Germany

4.5.7 External links

• C++ implementation, part of the Boost C++ libraries

• A Java implementation with an application to color image segmentation, Statistical Region Merging (SRM), IEEE Trans. Pattern Anal. Mach. Intell. 26(11): 1452–1458 (2004)

• Java applet: A Graphical Union–Find Implementation, by Rory L. P. McGuire

• Wait-free Parallel Algorithms for the Union–Find Problem, a 1994 paper by Richard J. Anderson and Heather Woll describing a parallelized version of Union–Find that never needs to block

• Python implementation

• Visual explanation and C# code

4.6 Partition refinement

In the design of algorithms, partition refinement is a technique for representing a partition of a set as a data structure that allows the partition to be refined by splitting its sets into a larger number of smaller sets. In that sense it is dual to the union-find data structure, which also maintains a partition into disjoint sets but in which the operations merge pairs of sets together.

Partition refinement forms a key component of several efficient algorithms on graphs and finite automata, including DFA minimization, the Coffman–Graham algorithm for parallel scheduling, and lexicographic breadth-first search of graphs.[1][2][3]

4.6.1 Data structure

A partition refinement algorithm maintains a family of disjoint sets Si. At the start of the algorithm, this family contains a single set of all the elements in the data structure. At each step of the algorithm, a set X is presented to the algorithm, and each set Si in the family that contains members of X is split into two sets, the intersection Si ∩ X and the difference Si \ X.

Such an algorithm may be implemented efficiently by maintaining data structures representing the following information:[4][5]

• The ordered sequence of the sets Si in the family, in a form such as a doubly linked list that allows new sets to be inserted into the middle of the sequence

• Associated with each set Si, a collection of the elements of Si, in a form such as a doubly linked list or array data structure that allows for rapid deletion of individual elements from the collection. Alternatively, this component of the data structure may be represented by storing all of the elements of all of the sets in a single array, sorted by the identity of the set they belong to, and by representing the collection of elements in any set Si by its starting and ending positions in this array.

• Associated with each element, the set it belongs to.

To perform a refinement operation, the algorithm loops through the elements of the given set X. For each such element x, it finds the set Si that contains x, and checks whether a second set for Si ∩ X has already been started. If not, it creates the second set and adds Si to a list L of the sets that are split by the operation. Then, regardless of whether a new set was formed, the algorithm removes x from Si and adds it to Si ∩ X. In the representation in which all elements are stored in a single array, moving x from one set to another may be performed by swapping x with the final element of Si and then decrementing the end index of Si and the start index of the new set. Finally, after all elements of X have been processed in this way, the algorithm loops through L, separating each current set Si from the second set that has been split from it, and reports both of these sets as being newly formed by the refinement operation.

The time to perform a single refinement operation in this way is O(|X|), independent of the number of elements in the family of sets and also independent of the total number of sets in the data structure. Thus, the time for a sequence of refinements is proportional to the total size of the sets given to the algorithm in each refinement step.
4.6. PARTITION REFINEMENT 127

To perform a refinement operation, the algorithm loops through the elements of the given set X. For each such element x, it finds the set Si that contains x, and checks whether a second set for Si ∩ X has already been started. If not, it creates the second set and adds Si to a list L of the sets that are split by the operation. Then, regardless of whether a new set was formed, the algorithm removes x from Si and adds it to Si ∩ X. In the representation in which all elements are stored in a single array, moving x from one set to another may be performed by swapping x with the final element of Si and then decrementing the end index of Si and the start index of the new set. Finally, after all elements of X have been processed in this way, the algorithm loops through L, separating each current set Si from the second set that has been split from it, and reports both of these sets as being newly formed by the refinement operation.

The time to perform a single refinement operation in this way is O(|X|), independent of the number of elements in the family of sets and also independent of the total number of sets in the data structure. Thus, the time for a sequence of refinements is proportional to the total size of the sets given to the algorithm in each refinement step.
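As an illustration of the operation just described, the following Python sketch maintains the family of sets with dictionaries instead of the linked-list bookkeeping above; the class and method names are illustrative, not from any standard library.

# A minimal partition refinement sketch (illustrative names, not a standard API).
# Each set in the partition has an integer id; 'members' maps a set id to a
# Python set of its elements, and 'owner' maps each element to its set id.
class PartitionRefinement:
    def __init__(self, elements):
        # Start with a single set containing all elements.
        self.members = {0: set(elements)}
        self.owner = {x: 0 for x in elements}
        self.next_id = 1

    def refine(self, X):
        # Split each set S with nonempty S ∩ X into S ∩ X and S \ X,
        # in time O(|X|); returns the (intersection_id, difference_id) pairs.
        split = {}  # maps old set id -> id of its new intersection set
        for x in X:
            if x not in self.owner:
                continue
            s = self.owner[x]
            if s not in split:                 # first element of S ∩ X seen
                split[s] = self.next_id
                self.members[self.next_id] = set()
                self.next_id += 1
            # move x from S into the new set S ∩ X
            self.members[s].discard(x)
            self.members[split[s]].add(x)
            self.owner[x] = split[s]
        result = []
        for s, inter in split.items():
            if not self.members[s]:            # S ⊆ X, so S was not actually split
                self.members[s] = self.members.pop(inter)
                for x in self.members[s]:
                    self.owner[x] = s
            else:
                result.append((inter, s))
        return result

For instance, refining the partition {{1, 2, 3, 4}} by X = {2, 4} yields the two sets {2, 4} and {1, 3}.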
4.6.2 Applications

An early application of partition refinement was in an algorithm by Hopcroft (1971) for DFA minimization. In this problem, one is given as input a deterministic finite automaton, and must find an equivalent automaton with as few states as possible. Hopcroft's algorithm maintains a partition of the states of the input automaton into subsets, with the property that any two states in different subsets must be mapped to different states of the output automaton. Initially, there are two subsets, one containing all the accepting states of the automaton and one containing the remaining states. At each step one of the subsets Si and one of the input symbols x of the automaton are chosen, and the subsets of states are refined into states for which a transition labeled x would lead to Si, and states for which an x-transition would lead somewhere else. When a set Si that has already been chosen is split by a refinement, only one of the two resulting sets (the smaller of the two) needs to be chosen again; in this way, each state participates in the sets X for O(s log n) refinement steps and the overall algorithm takes time O(ns log n), where n is the number of initial states and s is the size of the alphabet.[6]

Partition refinement was applied by Sethi (1976) in an efficient implementation of the Coffman–Graham algorithm for parallel scheduling. Sethi showed that it could be used to construct a lexicographically ordered topological sort of a given directed acyclic graph in linear time; this lexicographic topological ordering is one of the key steps of the Coffman–Graham algorithm. In this application, the elements of the disjoint sets are vertices of the input graph and the sets X used to refine the partition are sets of neighbors of vertices. Since the total number of neighbors of all vertices is just the number of edges in the graph, the algorithm takes time linear in the number of edges, its input size.[7]

Partition refinement also forms a key step in lexicographic breadth-first search, a graph search algorithm with applications in the recognition of chordal graphs and several other important classes of graphs. Again, the disjoint-set elements are vertices and the sets X represent sets of neighbors, so the algorithm takes linear time.[8][9]

4.6.3 See also

• Refinement (sigma algebra)

4.6.4 References

[1] Paige, Robert; Tarjan, Robert E. (1987), "Three partition refinement algorithms", SIAM Journal on Computing, 16 (6): 973–989, doi:10.1137/0216062, MR 917035.

[2] Habib, Michel; Paul, Christophe; Viennot, Laurent (1999), "Partition refinement techniques: an interesting algorithmic tool kit", International Journal of Foundations of Computer Science, 10 (2): 147–170, doi:10.1142/S0129054199000125, MR 1759929.

[3] Habib, Michel; Paul, Christophe; Viennot, Laurent (1998), "A synthesis on partition refinement: a useful routine for strings, graphs, Boolean matrices and automata", STACS 98 (Paris, 1998), Lecture Notes in Computer Science, 1373, Springer-Verlag, pp. 25–38, doi:10.1007/BFb0028546, MR 1650757.

[4] Valmari, Antti; Lehtinen, Petri (2008). "Efficient minimization of DFAs with partial transition functions". In Albers, Susanne; Weil, Pascal. 25th International Symposium on Theoretical Aspects of Computer Science (STACS 2008). Leibniz International Proceedings in Informatics (LIPIcs). 1. Dagstuhl, Germany: Schloss Dagstuhl: Leibniz-Zentrum fuer Informatik. pp. 645–656. doi:10.4230/LIPIcs.STACS.2008.1328. ISBN 978-3-939897-06-4. ISSN 1868-8969.

[5] Knuutila, Timo (2001). "Re-describing an algorithm by Hopcroft". Theoretical Computer Science. 250 (1–2): 333–363. doi:10.1016/S0304-3975(99)00150-4. ISSN 0304-3975.

[6] Hopcroft, John (1971), "An n log n algorithm for minimizing states in a finite automaton", Theory of machines and computations (Proc. Internat. Sympos., Technion, Haifa, 1971), New York: Academic Press, pp. 189–196, MR 0403320.

[7] Sethi, Ravi (1976), "Scheduling graphs on two processors", SIAM Journal on Computing, 5 (1): 73–82, doi:10.1137/0205005, MR 0398156.

[8] Rose, D. J.; Tarjan, R. E.; Lueker, G. S. (1976), "Algorithmic aspects of vertex elimination on graphs", SIAM Journal on Computing, 5 (2): 266–283, doi:10.1137/0205021.
[9] Corneil, Derek G. (2004), "Lexicographic breadth first search – a survey", Graph-Theoretic Methods in Computer Science, Lecture Notes in Computer Science, 3353, Springer-Verlag, pp. 1–19.
Chapter 5

Priority queues

5.1 Priority queue

In computer science, a priority queue is an abstract data type which is like a regular queue or stack data structure, but where additionally each element has a "priority" associated with it. In a priority queue, an element with high priority is served before an element with low priority. If two elements have the same priority, they are served according to their order in the queue.

While priority queues are often implemented with heaps, they are conceptually distinct from heaps. A priority queue is an abstract concept like "a list" or "a map"; just as a list can be implemented with a linked list or an array, a priority queue can be implemented with a heap or a variety of other methods such as an unordered array.

5.1.1 Operations

A priority queue must at least support the following operations:

• insert_with_priority: add an element to the queue with an associated priority.

• pull_highest_priority_element: remove the element from the queue that has the highest priority, and return it.

   This is also known as "pop_element(Off)", "get_maximum_element" or "get_front(most)_element".

   Some conventions reverse the order of priorities, considering lower values to be higher priority, so this may also be known as "get_minimum_element", and is often referred to as "get-min" in the literature.

   This may instead be specified as separate "peek_at_highest_priority_element" and "delete_element" functions, which can be combined to produce "pull_highest_priority_element".

In addition, peek (in this context often called find-max or find-min), which returns the highest-priority element but does not modify the queue, is very frequently implemented, and nearly always executes in O(1) time. This operation and its O(1) performance are crucial to many applications of priority queues.

More advanced implementations may support more complicated operations, such as pull_lowest_priority_element, inspecting the first few highest- or lowest-priority elements, clearing the queue, clearing subsets of the queue, performing a batch insert, merging two or more queues into one, incrementing the priority of any element, etc.

5.1.2 Similarity to queues

One can imagine a priority queue as a modified queue, but when one would get the next element off the queue, the highest-priority element is retrieved first.

Stacks and queues may be modeled as particular kinds of priority queues. As a reminder, here is how stacks and queues behave:

• stack – elements are pulled in last-in first-out order (e.g., a stack of papers)

• queue – elements are pulled in first-in first-out order (e.g., a line in a cafeteria)

In a stack, the priority of each inserted element is monotonically increasing; thus, the last element inserted is always the first retrieved. In a queue, the priority of each inserted element is monotonically decreasing; thus, the first element inserted is always the first retrieved.

5.1.3 Implementation

Naive implementations

There are a variety of simple, usually inefficient, ways to implement a priority queue. They provide an analogy to help one understand what a priority queue is. For instance, one can keep all the elements in an unsorted list. Whenever the highest-priority element is requested,
search through all elements for the one with the highest priority. (In big O notation: O(1) insertion time, O(n) pull time due to search.)
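A minimal sketch of this unsorted-list approach, with illustrative names:

# Naive priority queue: O(1) insertion into an unsorted list,
# O(n) pull because every element must be scanned for the maximum priority.
class UnsortedListPQ:
    def __init__(self):
        self._items = []            # list of (priority, element) pairs

    def insert_with_priority(self, element, priority):
        self._items.append((priority, element))

    def pull_highest_priority_element(self):
        best = max(range(len(self._items)), key=lambda i: self._items[i][0])
        return self._items.pop(best)[1]

q = UnsortedListPQ()
q.insert_with_priority("x", 1)
q.insert_with_priority("y", 9)
assert q.pull_highest_priority_element() == "y"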
Usual implementation

To improve performance, priority queues typically use a heap as their backbone, giving O(log n) performance for inserts and removals, and O(n log n) to build the heap initially. Variants of the basic heap data structure such as pairing heaps or Fibonacci heaps can provide better bounds for some operations.[1]

Alternatively, when a self-balancing binary search tree is used, insertion and removal also take O(log n) time, although building trees from existing sequences of elements takes O(n log n) time; this is typical where one might already have access to these data structures, such as with third-party or standard libraries.

From a computational-complexity standpoint, priority queues are congruent to sorting algorithms. See the next section for how efficient sorting algorithms can create efficient priority queues.

Specialized heaps

There are several specialized heap data structures that either supply additional operations or outperform heap-based implementations for specific types of keys, specifically integer keys.

• When the set of keys is {1, 2, ..., C}, and only insert, find-min and extract-min are needed, a bucket queue can be constructed as an array of C linked lists plus a pointer top, initially C. Inserting an item with key k appends the item to the k'th list, and updates top ← min(top, k), both in constant time. Extract-min deletes and returns one item from the list with index top, then increments top if needed until it again points to a non-empty list; this takes O(C) time in the worst case. These queues are useful for sorting the vertices of a graph by their degree.[2]:374

• For the set of keys {1, 2, ..., C}, a van Emde Boas tree would support the minimum, maximum, insert, delete, search, extract-min, extract-max, predecessor and successor operations in O(log log C) time, but has a space cost for small queues of about O(2^(m/2)), where m is the number of bits in the priority value.[3]

• The fusion tree algorithm by Fredman and Willard implements the minimum operation in O(1) time and the insert and extract-min operations in O(√(log n)) time; however, the authors state that "Our algorithms have theoretical interest only; the constant factors involved in the execution times preclude practicality."[4]

For applications that do many "peek" operations for every "extract-min" operation, the time complexity for peek actions can be reduced to O(1) in all tree and heap implementations by caching the highest priority element after every insertion and removal. For insertion, this adds at most a constant cost, since the newly inserted element is compared only to the previously cached minimum element. For deletion, this at most adds an additional "peek" cost, which is typically cheaper than the deletion cost, so overall time complexity is not significantly impacted.

Monotone priority queues are specialized queues that are optimized for the case where no item is ever inserted that has a lower priority (in the case of a min-heap) than any item previously extracted. This restriction is met by several practical applications of priority queues.

Summary of running times

In the following time complexities[5] O(f) is an asymptotic upper bound and Θ(f) is an asymptotically tight bound (see Big O notation). Function names assume a min-heap.

[1] Brodal and Okasaki later describe a persistent variant with the same bounds except for decrease-key, which is not supported. Heaps with n elements can be constructed bottom-up in O(n).[9]

[2] Amortized time.

[3] Lower bound of Ω(log log n),[12] upper bound of O(2^(2√(log log n))).[13]

[4] n is the size of the larger heap.
5.1.4 Equivalence of priority queues and sorting algorithms

Using a priority queue to sort

The semantics of priority queues naturally suggest a sorting method: insert all the elements to be sorted into a priority queue, and sequentially remove them; they will come out in sorted order. This is actually the procedure used by several sorting algorithms, once the layer of abstraction provided by the priority queue is removed. This sorting method is equivalent to the following sorting algorithms: heapsort if the priority queue is implemented with a heap, selection sort if it is implemented with an unordered array, insertion sort if it is implemented with an ordered array, and tree sort if it is implemented with a self-balancing binary search tree.
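A sketch of the generic pattern in Python; since heapq is a binary heap, this particular instance behaves as a heapsort:

import heapq

# Sorting with a priority queue: insert everything, then repeatedly pull the
# minimum. With a binary heap as the queue this is exactly heapsort, O(n log n).
def pqueue_sort(items):
    heap = []
    for item in items:
        heapq.heappush(heap, item)
    return [heapq.heappop(heap) for _ in range(len(heap))]

assert pqueue_sort([5, 1, 4, 2, 3]) == [1, 2, 3, 4, 5]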
Using a sorting algorithm to make a priority queue

A sorting algorithm can also be used to implement a priority queue. Specifically, Thorup says:[14]

   We present a general deterministic linear space reduction from priority queues to sorting implying that if we can sort up to n keys in

   S(n) time per key, then there is a priority queue supporting delete and insert in O(S(n)) time and find-min in constant time.

That is, if there is a sorting algorithm which can sort in O(S) time per key, where S is some function of n and word size,[15] then one can use the given procedure to create a priority queue where pulling the highest-priority element takes O(1) time, and inserting new elements (and deleting elements) takes O(S) time. For example, if one has an O(n log log n) sort algorithm, one can create a priority queue with O(1) pulling and O(log log n) insertion.
5.1.5 Libraries

A priority queue is often considered to be a "container data structure".

The Standard Template Library (STL), and the C++ 1998 standard, specifies priority_queue as one of the STL container adaptor class templates. However, it does not specify how two elements with the same priority should be served, and indeed, common implementations will not return them according to their order in the queue. It implements a max-priority-queue, and has three parameters: a comparison object for sorting such as a function object (defaults to less<T> if unspecified), the underlying container for storing the data structures (defaults to std::vector<T>), and two iterators to the beginning and end of a sequence. Unlike actual STL containers, it does not allow iteration of its elements (it strictly adheres to its abstract data type definition). STL also has utility functions for manipulating another random-access container as a binary max-heap. The Boost C++ libraries also have an implementation in the library heap.

Python's heapq module implements a binary min-heap on top of a list.

Java's library contains a PriorityQueue class, which implements a min-priority-queue.

Go's library contains a container/heap module, which implements a min-heap on top of any compatible data structure.

The Standard PHP Library extension contains the class SplPriorityQueue.

Apple's Core Foundation framework contains a CFBinaryHeap structure, which implements a min-heap.

5.1.6 Applications

Bandwidth management

Priority queuing can be used to manage limited resources such as bandwidth on a transmission line from a network router. In the event of outgoing traffic queuing due to insufficient bandwidth, all other queues can be halted to send the traffic from the highest priority queue upon arrival. This ensures that the prioritized traffic (such as real-time traffic, e.g. an RTP stream of a VoIP connection) is forwarded with the least delay and the least likelihood of being rejected due to a queue reaching its maximum capacity. All other traffic can be handled when the highest priority queue is empty. Another approach used is to send disproportionately more traffic from higher priority queues.

Many modern protocols for local area networks also include the concept of priority queues at the media access control (MAC) sub-layer to ensure that high-priority applications (such as VoIP or IPTV) experience lower latency than other applications which can be served with best-effort service. Examples include IEEE 802.11e (an amendment to IEEE 802.11 which provides quality of service) and ITU-T G.hn (a standard for high-speed local area networking over existing home wiring: power lines, phone lines and coaxial cables).

Usually a limitation (policer) is set to limit the bandwidth that traffic from the highest priority queue can take, in order to prevent high-priority packets from choking off all other traffic. This limit is usually never reached due to high-level control instances such as the Cisco CallManager, which can be programmed to inhibit calls which would exceed the programmed bandwidth limit.
Discrete event simulation

Another use of a priority queue is to manage the events in a discrete event simulation. The events are added to the queue with their simulation time used as the priority. The execution of the simulation proceeds by repeatedly pulling the top of the queue and executing the event thereon.

See also: Scheduling (computing), queueing theory
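A minimal sketch of such an event loop, assuming events are (time, action) pairs and using heapq; the names and the convention that an action may return follow-up events are illustrative, not from any particular simulation framework:

import heapq
import itertools

# A minimal discrete event simulation loop. Events are stored as
# (time, sequence, action) entries; the sequence number breaks ties so that
# action callables are never compared directly.
def simulate(schedule):
    counter = itertools.count()
    queue = [(t, next(counter), a) for t, a in schedule]
    heapq.heapify(queue)
    clock = 0.0
    while queue:
        clock, _, action = heapq.heappop(queue)
        # An action may return new (time, action) pairs to schedule.
        for t, a in action(clock) or []:
            heapq.heappush(queue, (t, next(counter), a))
    return clock

# Example: one event at t=1.0 that schedules a follow-up at t=2.5.
final = simulate([(1.0, lambda now: [(now + 1.5, lambda now: None)])])
assert final == 2.5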
Dijkstra's algorithm

When the graph is stored in the form of an adjacency list or matrix, a priority queue can be used to extract the minimum efficiently when implementing Dijkstra's algorithm, although one also needs the ability to alter the priority of a particular vertex in the priority queue efficiently.
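Library heaps such as Python's heapq do not support altering a vertex's priority in place; a common workaround, sketched below under that assumption, is to push a fresh entry whenever a distance improves and to skip stale entries when they are popped (sometimes called lazy deletion):

import heapq

# Dijkstra's algorithm with a binary heap and lazy deletion in place of a
# decrease-key operation: outdated queue entries are ignored when popped.
def dijkstra(graph, source):
    # graph: dict mapping vertex -> list of (neighbor, edge_weight) pairs
    dist = {source: 0}
    queue = [(0, source)]
    while queue:
        d, u = heapq.heappop(queue)
        if d > dist.get(u, float("inf")):
            continue                      # stale entry; a shorter path was found
        for v, w in graph[u]:
            alt = d + w
            if alt < dist.get(v, float("inf")):
                dist[v] = alt             # instead of decrease-key, push a new entry
                heapq.heappush(queue, (alt, v))
    return dist

g = {"a": [("b", 2), ("c", 5)], "b": [("c", 1)], "c": []}
assert dijkstra(g, "a") == {"a": 0, "b": 2, "c": 3}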
Huffman coding

Huffman coding requires one to repeatedly obtain the two lowest-frequency trees. A priority queue is one method of doing this.
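A sketch of that use with heapq, assuming symbol weights as input; the tuple-based tree representation and tie-breaking counter are illustrative choices:

import heapq
import itertools

# Build a Huffman tree by repeatedly merging the two lowest-frequency trees.
# Entries are (weight, sequence, tree); trees are symbols or (left, right) pairs.
def huffman_tree(weights):
    counter = itertools.count()
    heap = [(w, next(counter), sym) for sym, w in weights.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)   # lowest-frequency tree
        w2, _, t2 = heapq.heappop(heap)   # second lowest
        heapq.heappush(heap, (w1 + w2, next(counter), (t1, t2)))
    return heap[0][2]

tree = huffman_tree({"a": 5, "b": 2, "c": 1})
# 'c' and 'b' merge first (weights 1 and 2), then their tree joins with 'a'.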
Best-first search algorithms

Best-first search algorithms, like the A* search algorithm, find the shortest path between two vertices or nodes of a weighted graph, trying out the most promising routes first. A priority queue (also known as the fringe) is used to keep track of unexplored routes; the one for which the estimate (a lower bound in the case of A*) of the total path length is smallest is given highest priority. If memory limitations make best-first search impractical, variants like the SMA* algorithm can be used instead, with a double-ended priority queue to allow removal of low-priority items.

ROAM triangulation algorithm

The Real-time Optimally Adapting Meshes (ROAM) algorithm computes a dynamically changing triangulation of a terrain. It works by splitting triangles where more detail is needed and merging them where less detail is needed. The algorithm assigns each triangle in the terrain a priority, usually related to the error decrease if that triangle would be split. The algorithm uses two priority queues, one for triangles that can be split and another for triangles that can be merged. In each step the triangle from the split queue with the highest priority is split, or the triangle from the merge queue with the lowest priority is merged with its neighbours.

Prim's algorithm for minimum spanning tree

Using a min heap priority queue in Prim's algorithm to find the minimum spanning tree of a connected and undirected graph, one can achieve a good running time. This min heap priority queue uses the min heap data structure, which supports operations such as insert, minimum, extract-min, and decrease-key.[16] In this implementation, the weight of the edges is used to decide the priority of the vertices: the lower the weight, the higher the priority, and the higher the weight, the lower the priority.[17]
5.1.7 See also

• Batch queue

• Command queue

• Job scheduler

5.1.8 References

[1] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms, Second Edition. MIT Press and McGraw-Hill, 2001. ISBN 0-262-03293-7. Chapter 20: Fibonacci Heaps, pp. 476–497. Third edition p. 518.

[2] Skiena, Steven (2010). The Algorithm Design Manual (2nd ed.). Springer Science+Business Media. ISBN 1-849-96720-2.

[3] P. van Emde Boas. Preserving order in a forest in less than logarithmic time. In Proceedings of the 16th Annual Symposium on Foundations of Computer Science, pages 75–84. IEEE Computer Society, 1975.

[4] Michael L. Fredman and Dan E. Willard. Surpassing the information theoretic bound with fusion trees. Journal of Computer and System Sciences, 48(3): 533–551, 1994.

[5] Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L. (1990). Introduction to Algorithms (1st ed.). MIT Press and McGraw-Hill. ISBN 0-262-03141-8.

[6] Fredman, Michael Lawrence; Tarjan, Robert E. (July 1987). "Fibonacci heaps and their uses in improved network optimization algorithms" (PDF). Journal of the Association for Computing Machinery. 34 (3): 596–615. doi:10.1145/28869.28874.

[7] Iacono, John (2000), "Improved upper bounds for pairing heaps", Proc. 7th Scandinavian Workshop on Algorithm Theory, Lecture Notes in Computer Science, 1851, Springer-Verlag, pp. 63–77, arXiv:1110.4428, doi:10.1007/3-540-44985-X_5, ISBN 3-540-67690-2.

[8] Brodal, Gerth S. (1996), "Worst-Case Efficient Priority Queues" (PDF), Proc. 7th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 52–58.

[9] Goodrich, Michael T.; Tamassia, Roberto (2004). "7.3.6. Bottom-Up Heap Construction". Data Structures and Algorithms in Java (3rd ed.). pp. 338–341. ISBN 0-471-46983-1.

[10] Haeupler, Bernhard; Sen, Siddhartha; Tarjan, Robert E. (2009). "Rank-pairing heaps" (PDF). SIAM J. Computing: 1463–1485.

[11] Brodal, G. S. L.; Lagogiannis, G.; Tarjan, R. E. (2012). Strict Fibonacci heaps (PDF). Proceedings of the 44th symposium on Theory of Computing - STOC '12. p. 1177. doi:10.1145/2213977.2214082. ISBN 9781450312455.

[12] Fredman, Michael Lawrence (July 1999). "On the Efficiency of Pairing Heaps and Related Data Structures" (PDF). Journal of the Association for Computing Machinery. 46 (4): 473–501. doi:10.1145/320211.320214.

[13] Pettie, Seth (2005). Towards a Final Analysis of Pairing Heaps (PDF). FOCS '05 Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science. pp. 174–183. CiteSeerX 10.1.1.549.471. doi:10.1109/SFCS.2005.75. ISBN 0-7695-2468-0.

[14] Thorup, Mikkel (2007). "Equivalence between priority queues and sorting". Journal of the ACM. 54 (6). doi:10.1145/1314690.1314692.

[15] http://courses.csail.mit.edu/6.851/spring07/scribe/lec17.pdf

[16] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein (2009). Introduction to Algorithms. 3. MIT Press. p. 634. ISBN 978-81-203-4007-7. "In order to implement Prim's algorithm efficiently, we need a fast way to select a new edge to add to the tree formed by the edges in A. In the pseudo-code
[17] "Prim's Algorithm". Geek for Geeks. Retrieved 12 September 2014.

5.1.9 Further reading

• Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms, Second Edition. MIT Press and McGraw-Hill, 2001. ISBN 0-262-03293-7. Section 6.5: Priority queues, pp. 138–142.

5.1.10 External links

• C++ reference for std::priority_queue

• Descriptions by Lee Killough

• PQlib - Open source Priority Queue library for C

• libpqueue is a generic priority queue (heap) implementation (in C) used by the Apache HTTP Server project.

• Survey of known priority queue structures by Stefan Xenos

• UC Berkeley - Computer Science 61B - Lecture 24: Priority Queues (video) - introduction to priority queues using binary heap

5.2 Bucket queue

In the design and analysis of data structures, a bucket queue[1] (also called a bucket priority queue[2] or bounded-height priority queue[3]) is a priority queue for prioritizing elements whose priorities are small integers. It has the form of an array of buckets: an array data structure, indexed by the priorities, whose cells contain buckets of items with the same priority as each other.

The bucket queue is the priority-queue analogue of pigeonhole sort (also called bucket sort), a sorting algorithm that places elements into buckets indexed by their priorities and then concatenates the buckets. Using a bucket queue as the priority queue in a selection sort gives a form of the pigeonhole sort algorithm.

Applications of the bucket queue include computation of the degeneracy of a graph as well as fast algorithms for shortest paths and widest paths for graphs with weights that are small integers or are already sorted. Its first use[2] was in a shortest path algorithm by Dial (1969).[4]

5.2.1 Basic data structure

This structure can handle the insertions and deletions of elements with integer priorities in the range from 0 to some known bound C, as well as operations that find the element with minimum (or maximum) priority. It consists of an array A of container data structures, where array cell A[p] stores the collection of elements with priority p. It can handle the following operations:

• To insert an element x with priority p, add x to the container at A[p].

• To remove an element x with priority p, remove x from the container at A[p].

• To find an element with the minimum priority, perform a sequential search to find the first non-empty container, and then choose an arbitrary element from this container.

In this way, insertions and deletions take constant time, while finding the minimum priority element takes time O(C).[1][3]
was in a shortest path algorithm by Dial (1969).[4] 5.2.3 Applications
5.2.2 Optimizations

As an optimization, the data structure can also maintain an index L that lower-bounds the minimum priority of an element. When inserting a new element, L should be updated to the minimum of its old value and the new element's priority. When searching for the minimum priority element, the search can start at L instead of at zero, and after the search L should be left equal to the priority that was found in the search.[3] In this way the time for a search is reduced to the difference between the previous lower bound and its next value; this difference could be significantly smaller than C. For applications of monotone priority queues such as Dijkstra's algorithm, in which the minimum priorities form a monotonic sequence, the sum of these differences is at most C, so the total time for a sequence of n operations is O(n + C), rather than the slower O(nC) time bound that would result without this optimization.

Another optimization (already given by Dial 1969) can be used to save space when the priorities are monotonic and, at any point in time, fall within a range of r values rather than extending over the whole range from 0 to C. In this case, one can index the array by the priorities modulo r rather than by their actual values. The search for the minimum priority element should always begin at the previous minimum, to avoid priorities that are higher than the minimum but have lower moduli.[1]

5.2.3 Applications

A bucket queue can be used to maintain the vertices of an undirected graph, prioritized by their degrees, and repeatedly find and remove the vertex of minimum degree.[3] This greedy algorithm can be used to calculate the degeneracy of a given graph. It takes linear time, with or without the optimization that maintains a lower bound
on the minimum priority, because each vertex is found in time proportional to its degree and the sum of all vertex degrees is linear in the number of edges of the graph.[5]

In Dijkstra's algorithm for shortest paths in positively-weighted directed graphs, a bucket queue can be used to obtain a time bound of O(n + m + dc), where n is the number of vertices, m is the number of edges, d is the diameter of the network, and c is the maximum (integer) link cost.[6] This variant of Dijkstra's algorithm is also known as Dial's algorithm, after Robert B. Dial, who published it in 1969.[7] In this algorithm, the priorities will only span a range of width c + 1, so the modular optimization can be used to reduce the space to O(n + c).[1] A variant of the same algorithm can be used for the widest path problem, and (in combination with methods for quickly partitioning non-integer edge weights) leads to near-linear-time solutions to the single-source single-destination version of this problem.[8]
5.2.4 References

[1] Mehlhorn, Kurt; Sanders, Peter (2008), "10.5.1 Bucket Queues", Algorithms and Data Structures: The Basic Toolbox, Springer, p. 201, ISBN 9783540779773.

[2] Edelkamp, Stefan; Schroedl, Stefan (2011), "3.1.1 Bucket Data Structures", Heuristic Search: Theory and Applications, Elsevier, pp. 90–92, ISBN 9780080919737. See also p. 157 for the history and naming of this structure.

[3] Skiena, Steven S. (1998), The Algorithm Design Manual, Springer, p. 181, ISBN 9780387948607.

[4] Dial, Robert B. (1969), "Algorithm 360: Shortest-path forest with topological ordering [H]", Communications of the ACM, 12 (11): 632–633, doi:10.1145/363269.363610.

[5] Matula, D. W.; Beck, L. L. (1983), "Smallest-last ordering and clustering and graph coloring algorithms", Journal of the ACM, 30 (3): 417–427, doi:10.1145/2402.322385, MR 0709826.

[6] Varghese, George (2005), Network Algorithmics: An Interdisciplinary Approach to Designing Fast Networked Devices, Morgan Kaufmann, ISBN 9780120884773.

[7] Dial, Robert B. (1969), "Algorithm 360: Shortest-path forest with topological ordering [H]", Communications of the ACM, 12 (11): 632–633, doi:10.1145/363269.363610.

[8] Gabow, Harold N.; Tarjan, Robert E. (1988), "Algorithms for two bottleneck optimization problems", Journal of Algorithms, 9 (3): 411–417, doi:10.1016/0196-6774(88)90031-4, MR 955149.
5.3 Heap (data structure)

            100
          /     \
        19       36
       /  \     /  \
     17    3  25    1
    /  \
   2    7

Example of a complete binary max-heap with node keys being integers from 1 to 100

In computer science, a heap is a specialized tree-based data structure that satisfies the heap property: if A is a parent node of B, then the key (the value) of node A is ordered with respect to the key of node B, with the same ordering applying across the heap. A heap can be classified further as either a "max heap" or a "min heap". In a max heap, the keys of parent nodes are always greater than or equal to those of the children and the highest key is in the root node. In a min heap, the keys of parent nodes are less than or equal to those of the children and the lowest key is in the root node.

The heap is one maximally efficient implementation of an abstract data type called a priority queue, and in fact priority queues are often referred to as "heaps", regardless of how they may be implemented. A common implementation of a heap is the binary heap, in which the tree is a complete binary tree (see figure). The heap data structure, specifically the binary heap, was introduced by J. W. J. Williams in 1964, as a data structure for the heapsort sorting algorithm.[1] Heaps are also crucial in several efficient graph algorithms such as Dijkstra's algorithm. In a heap, the highest (or lowest) priority element is always stored at the root. A heap is not a sorted structure and can be regarded as partially ordered. As visible from the heap diagram, there is no particular relationship among nodes on any given level, even among the siblings. When a heap is a complete binary tree, it has the smallest possible height: a heap with N nodes always has O(log N) height. A heap is a useful data structure when you need to remove the object with the highest (or lowest) priority.

Note that, as shown in the graphic, there is no implied ordering between siblings or cousins and no implied sequence for an in-order traversal (as there would be in, e.g., a binary search tree). The heap relation mentioned above applies only between nodes and their parents, grandparents, etc. The maximum number of children each node can have depends on the type of heap, but in many types it is at most two, which is known as a binary heap.
5.3.1 Operations

The common operations involving heaps are:

Basic

• find-max or find-min: find a maximum item of a max-heap, or a minimum item of a min-heap, respectively (a.k.a. peek)

• insert: adding a new key to the heap (a.k.a. push[2])

• extract-min [or extract-max]: returns the node of minimum value from a min heap [or maximum value from a max heap] after removing it from the heap (a.k.a. pop[3])

• delete-max or delete-min: removing the root node of a max- or min-heap, respectively

• replace: pop root and push a new key. More efficient than pop followed by push, since it only needs to balance once, not twice, and is appropriate for fixed-size heaps.[4]

Creation

• create-heap: create an empty heap

• heapify: create a heap out of a given array of elements

• merge (union): joining two heaps to form a valid new heap containing all the elements of both, preserving the original heaps.

• meld: joining two heaps to form a valid new heap containing all the elements of both, destroying the original heaps.

Inspection

• size: return the number of items in the heap.

• is-empty: return true if the heap is empty, false otherwise.

Internal

• increase-key or decrease-key: updating a key within a max- or min-heap, respectively

• delete: delete an arbitrary node (followed by moving the last node and sifting to maintain the heap)

• sift-up: move a node up in the tree, as long as needed; used to restore the heap condition after insertion. Called "sift" because the node moves up the tree until it reaches the correct level, as in a sieve.

• sift-down: move a node down in the tree, similar to sift-up; used to restore the heap condition after deletion or replacement.
5.3.2 Implementation

Heaps are usually implemented in an array (fixed size or dynamic array), and do not require pointers between elements. After an element is inserted into or deleted from a heap, the heap property may be violated and the heap must be balanced by internal operations.

                  100
              /        \
            19          36
           /  \        /  \
         17    12    25    5
        / \   / \   / \   / \
       9  15 6  11 13  8 1   4

   100 19 36 17 12 25 5 9 15 6 11 13 8 1 4

Example of a complete binary max-heap with node keys being integers from 1 to 100 and how it would be stored in an array.

Full and almost full binary heaps may be represented in a very space-efficient way (as an implicit data structure) using an array alone. The first (or last) element will contain the root. The next two elements of the array contain its children. The next four contain the four children of the two child nodes, etc. Thus the children of the node at position n would be at positions 2n and 2n + 1 in a one-based array, or 2n + 1 and 2n + 2 in a zero-based array. This allows moving up or down the tree by doing simple index computations. Balancing a heap is done by sift-up or sift-down operations (swapping elements which are out of order). As we can build a heap from an array without requiring extra memory (for the nodes, for example), heapsort can be used to sort an array in-place.

Different types of heaps implement the operations in different ways, but notably, insertion is often done by adding the new element at the end of the heap in the first available free space. This will generally violate the heap property, and so the elements are then sifted up until the heap property has been reestablished. Similarly, deleting the root is done by removing the root and then putting the last element in the root and sifting down to rebalance. Thus replacing is done by deleting the root and putting the new element in the root and sifting down, avoiding a sifting-up step compared to pop (sift down of last element) followed by push (sift up of new element).

Construction of a binary (or d-ary) heap out of a given array of elements may be performed in linear time using the classic Floyd algorithm, with the worst-case number of comparisons equal to 2N − 2s2(N) − e2(N) (for a binary heap), where s2(N) is the sum of all digits of the binary representation of N and e2(N) is the exponent of 2 in the prime factorization of N.[5] This is faster than a sequence of consecutive insertions into an originally empty heap, which is log-linear (or linearithmic).[lower-alpha 1]
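A short sketch of these index computations for a zero-based array, with illustrative helper names:

# Navigating an array-backed binary heap with zero-based indexing:
# the children of the node at index i live at 2i + 1 and 2i + 2,
# and its parent lives at (i - 1) // 2.
def left(i):   return 2 * i + 1
def right(i):  return 2 * i + 2
def parent(i): return (i - 1) // 2

heap = [100, 19, 36, 17, 12, 25, 5]   # top of the max-heap from the figure above
assert heap[left(0)] == 19 and heap[right(0)] == 36
assert parent(right(1)) == 1          # node 12 (index 4) is a child of 19 (index 1)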
5.3.3 Variants

• 2–3 heap

• B-heap

• Beap

• Binary heap

• Binomial heap

• Brodal queue

• d-ary heap

• Fibonacci heap

• Leaf heap

• Leftist heap

• Pairing heap

• Radix heap

• Randomized meldable heap

• Skew heap

• Soft heap

• Ternary heap

• Treap

• Weak heap

5.3.4 Comparison of theoretic bounds for variants

In the following time complexities[6] O(f) is an asymptotic upper bound and Θ(f) is an asymptotically tight bound (see Big O notation). Function names assume a min-heap.

[1] Each insertion takes O(log k) time, where k is the existing size of the heap, so the total is Σ_{k=1}^{n} O(log k). Since log(n/2) = (log n) − 1, a constant fraction (half) of these insertions cost within a constant factor of the maximum, so asymptotically we can assume k = n; formally the time is n O(log n) − O(n) = O(n log n). This can also be readily seen from Stirling's approximation.

[2] Brodal and Okasaki later describe a persistent variant with the same bounds except for decrease-key, which is not supported. Heaps with n elements can be constructed bottom-up in O(n).[10]

[3] Amortized time.

[4] Lower bound of Ω(log log n),[13] upper bound of O(2^(2√(log log n))).[14]

[5] n is the size of the larger heap.

5.3.5 Applications

The heap data structure has many applications.

• Heapsort: One of the best sorting methods being in-place and with no quadratic worst-case scenarios.

• Selection algorithms: A heap allows access to the min or max element in constant time, and other selections (such as median or kth-element) can be done in sub-linear time on data that is in a heap.[15]

• Graph algorithms: By using heaps as internal traversal data structures, run time will be reduced by polynomial order. Examples of such problems are Prim's minimal-spanning-tree algorithm and Dijkstra's shortest-path algorithm.

• Priority queue: A priority queue is an abstract concept like "a list" or "a map"; just as a list can be implemented with a linked list or an array, a priority queue can be implemented with a heap or a variety of other methods.

• Order statistics: The heap data structure can be used to efficiently find the kth smallest (or largest) element in an array.

5.3.6 Implementations

• The C++ Standard Library provides the make_heap, push_heap and pop_heap algorithms for heaps (usually implemented as binary heaps), which operate on arbitrary random access iterators. It treats the iterators as a reference to an array, and uses the array-to-heap conversion. It also provides the container adaptor priority_queue, which wraps these facilities in a container-like class. However, there is no standard support for the decrease/increase-key operation.

• The Boost C++ libraries include a heaps library. Unlike the STL it supports decrease and increase operations, and supports additional types of heap: specifically, it supports d-ary, binomial, Fibonacci, pairing and skew heaps.

• There is a generic heap implementation for C and C++ with D-ary heap and B-heap support. It provides an STL-like API.

• The Java platform (since version 1.5) provides a binary heap implementation with the class java.util.PriorityQueue in the Java Collections Framework. This class implements by default a min-heap; to implement a max-heap, the programmer should write a custom comparator. There is no support for the decrease/increase-key operation.

• Python has a heapq module that implements a priority queue using a binary heap.
• PHP has both max-heap (SplMaxHeap) and min-heap (SplMinHeap) as of version 5.3 in the Standard PHP Library.

• Perl has implementations of binary, binomial, and Fibonacci heaps in the Heap distribution available on CPAN.

• The Go language contains a heap package with heap algorithms that operate on an arbitrary type that satisfies a given interface.

• Apple's Core Foundation library contains a CFBinaryHeap structure.

• Pharo has an implementation in the Collections-Sequenceable package along with a set of test cases. A heap is used in the implementation of the timer event loop.

• The Rust programming language has a binary max-heap implementation, BinaryHeap, in the collections module of its standard library.

5.3.7 See also

• Sorting algorithm

• Search data structure

• Stack (abstract data type)

• Queue (abstract data type)

• Tree (data structure)

• Treap, a form of binary search tree based on heap-ordered trees
5.3.8 References

[1] Williams, J. W. J. (1964), "Algorithm 232 - Heapsort", Communications of the ACM, 7 (6): 347–348, doi:10.1145/512274.512284.

[2] The Python Standard Library, 8.4. heapq — Heap queue algorithm, heapq.heappush

[3] The Python Standard Library, 8.4. heapq — Heap queue algorithm, heapq.heappop

[4] The Python Standard Library, 8.4. heapq — Heap queue algorithm, heapq.heapreplace

[5] Suchenek, Marek A. (2012), "Elementary Yet Precise Worst-Case Analysis of Floyd's Heap-Construction Program", Fundamenta Informaticae, IOS Press, 120 (1): 75–92, doi:10.3233/FI-2012-751.

[6] Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L. (1990). Introduction to Algorithms (1st ed.). MIT Press and McGraw-Hill. ISBN 0-262-03141-8.

[7] Fredman, Michael Lawrence; Tarjan, Robert E. (July 1987). "Fibonacci heaps and their uses in improved network optimization algorithms" (PDF). Journal of the Association for Computing Machinery. 34 (3): 596–615. doi:10.1145/28869.28874.

[8] Iacono, John (2000), "Improved upper bounds for pairing heaps", Proc. 7th Scandinavian Workshop on Algorithm Theory, Lecture Notes in Computer Science, 1851, Springer-Verlag, pp. 63–77, arXiv:1110.4428, doi:10.1007/3-540-44985-X_5, ISBN 3-540-67690-2.

[9] Brodal, Gerth S. (1996), "Worst-Case Efficient Priority Queues" (PDF), Proc. 7th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 52–58.

[10] Goodrich, Michael T.; Tamassia, Roberto (2004). "7.3.6. Bottom-Up Heap Construction". Data Structures and Algorithms in Java (3rd ed.). pp. 338–341. ISBN 0-471-46983-1.

[11] Haeupler, Bernhard; Sen, Siddhartha; Tarjan, Robert E. (2009). "Rank-pairing heaps" (PDF). SIAM J. Computing: 1463–1485.

[12] Brodal, G. S. L.; Lagogiannis, G.; Tarjan, R. E. (2012). Strict Fibonacci heaps (PDF). Proceedings of the 44th symposium on Theory of Computing - STOC '12. p. 1177. doi:10.1145/2213977.2214082. ISBN 9781450312455.

[13] Fredman, Michael Lawrence (July 1999). "On the Efficiency of Pairing Heaps and Related Data Structures" (PDF). Journal of the Association for Computing Machinery. 46 (4): 473–501. doi:10.1145/320211.320214.

[14] Pettie, Seth (2005). Towards a Final Analysis of Pairing Heaps (PDF). FOCS '05 Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science. pp. 174–183. CiteSeerX 10.1.1.549.471. doi:10.1109/SFCS.2005.75. ISBN 0-7695-2468-0.

[15] Frederickson, Greg N. (1993), "An Optimal Algorithm for Selection in a Min-Heap", Information and Computation (PDF), 104 (2), Academic Press, pp. 197–214, doi:10.1006/inco.1993.1030.

5.3.9 External links

• Heap at Wolfram MathWorld

• Explanation of how the basic heap algorithms work

5.4 Binary heap

A binary heap is a heap data structure that takes the form of a binary tree. Binary heaps are a common way of implementing priority queues.[1]:162–163 The binary heap was introduced by J. W. J. Williams in 1964, as a data structure for heapsort.[2]

A binary heap is defined as a binary tree with two additional constraints:[3]
              100
            /     \
          19       36
         /  \     /  \
       17    3  25    1
      /  \
     2    7

Example of a complete binary max heap

[Figure: example of a complete binary min heap.]

• Shape property: a binary heap is a complete binary tree; that is, all levels of the tree, except possibly the last one (deepest), are fully filled, and, if the last level of the tree is not complete, the nodes of that level are filled from left to right.

• Heap property: the key stored in each node is either greater than or equal to (≥) or less than or equal to (≤) the keys in the node's children, according to some total order.

Heaps where the parent key is greater than or equal to (≥) the child keys are called max-heaps; those where it is less than or equal to (≤) are called min-heaps. Efficient (logarithmic time) algorithms are known for the two operations needed to implement a priority queue on a binary heap: inserting an element, and removing the smallest (largest) element from a min-heap (max-heap). Binary heaps are also commonly employed in the heapsort sorting algorithm, which is an in-place algorithm owing to the fact that binary heaps can be implemented as an implicit data structure, storing keys in an array and using their relative positions within that array to represent child-parent relationships.

5.4.1 Heap operations

Both the insert and remove operations modify the heap to conform to the shape property first, by adding or removing from the end of the heap. Then the heap property is restored by traversing up or down the heap. Both operations take O(log n) time.

Insert

To add an element to a heap we must perform an up-heap operation (also known as bubble-up, percolate-up, sift-up, trickle-up, heapify-up, or cascade-up), by following this algorithm:

1. Add the element to the bottom level of the heap.

2. Compare the added element with its parent; if they are in the correct order, stop.

3. If not, swap the element with its parent and return to the previous step.

The number of operations required depends only on the number of levels the new element must rise to satisfy the heap property; thus, the insertion operation has a worst-case time complexity of O(log n) but an average-case complexity of O(1).[4]

As an example of binary heap insertion, say we have a max-heap

       11
      /  \
     5    8
    / \  /
   3  4 X

and we want to add the number 15 to the heap. We first place the 15 in the position marked by the X. However, the heap property is violated since 15 > 8, so we need to swap the 15 and the 8. So, we have the heap looking as follows after the first swap:

       11
      /  \
     5    15
    / \  /
   3  4 8

However the heap property is still violated since 15 > 11, so we need to swap again:

       15
      /  \
     5    11
    / \  /
   3  4 8
which is a valid max-heap. There is no need to check the left child after this final step: at the start, the max-heap was valid, meaning 11 > 5; if 15 > 11, and 11 > 5, then 15 > 5, because of the transitive relation.
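The same up-heap procedure for a zero-based array-backed max-heap, as a Python sketch reproducing the worked example (the array layout is described under Heap implementation below):

# Insert into an array-backed binary max-heap: append at the bottom level,
# then sift the new element up while it is larger than its parent.
def heap_insert(a, key):
    a.append(key)                        # step 1: add to the bottom level
    i = len(a) - 1
    while i > 0 and a[(i - 1) // 2] < a[i]:
        a[i], a[(i - 1) // 2] = a[(i - 1) // 2], a[i]   # step 3: swap with parent
        i = (i - 1) // 2

heap = [11, 5, 8, 3, 4]      # the example max-heap from the text
heap_insert(heap, 15)
assert heap == [15, 5, 11, 3, 4, 8]     # 15 bubbles past 8 and then 11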
Extract

The procedure for deleting the root from the heap (effectively extracting the maximum element in a max-heap or the minimum element in a min-heap) and restoring the properties is called down-heap (also known as bubble-down, percolate-down, sift-down, trickle-down, heapify-down, cascade-down, and extract-min/max).

1. Replace the root of the heap with the last element on the last level.

2. Compare the new root with its children; if they are in the correct order, stop.

3. If not, swap the element with one of its children and return to the previous step. (Swap with its smaller child in a min-heap and its larger child in a max-heap.)

So, if we have the same max-heap as before

       11
      /  \
     5    8
    / \
   3   4

we remove the 11 and replace it with the 4:

       4
      / \
     5   8
    /
   3

Now the heap property is violated since 8 is greater than 4. In this case, swapping the two elements, 4 and 8, is enough to restore the heap property and we need not swap elements further:

       8
      / \
     5   4
    /
   3

The downward-moving node is swapped with the larger of its children in a max-heap (in a min-heap it would be swapped with its smaller child), until it satisfies the heap property in its new position. This functionality is achieved by the Max-Heapify function, as defined below in pseudocode for an array-backed heap A of length heap_length[A]. Note that A is indexed starting at 1, not 0 as is common in many real programming languages.

Max-Heapify (A, i):
    left ← 2*i              // ← means "assignment"
    right ← 2*i + 1
    largest ← i
    if left ≤ heap_length[A] and A[left] > A[largest] then:
        largest ← left
    if right ≤ heap_length[A] and A[right] > A[largest] then:
        largest ← right
    if largest ≠ i then:
        swap A[i] and A[largest]
        Max-Heapify(A, largest)

For the above algorithm to correctly re-heapify the array, the subtrees rooted at the two direct children of node i must already satisfy the heap property; only node i itself may violate it. If no violation is present, the algorithm falls through with no change to the array. The down-heap operation (without the preceding swap) can also be used to modify the value of the root, even when an element is not being deleted. In the pseudocode above, what starts with // is a comment. Note that A is an array (or list) that is indexed from 1 up to length(A), according to the pseudocode.

In the worst case, the new root has to be swapped with its child on each level until it reaches the bottom level of the heap, meaning that the delete operation has a time complexity relative to the height of the tree, or O(log n).
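For comparison, a zero-based, iterative Python rendering of this pseudocode; a sketch, not a drop-in library routine:

# Zero-based, iterative equivalent of Max-Heapify: sift the element at index i
# down until the subtree rooted at i satisfies the max-heap property, assuming
# both of its child subtrees are already valid max-heaps.
def max_heapify(a, i):
    n = len(a)
    while True:
        left, right, largest = 2 * i + 1, 2 * i + 2, i
        if left < n and a[left] > a[largest]:
            largest = left
        if right < n and a[right] > a[largest]:
            largest = right
        if largest == i:
            return
        a[i], a[largest] = a[largest], a[i]   # swap and continue sifting down
        i = largest

heap = [4, 5, 8, 3]        # the root 4 violates the heap property
max_heapify(heap, 0)
assert heap == [8, 5, 4, 3]   # matches the worked example above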
5.4.2 Building a heap

Building a heap from an array of n input elements can be done by starting with an empty heap, then successively inserting each element. This approach, called Williams' method after the inventor of binary heaps, is easily seen to run in O(n log n) time: it performs n insertions at O(log n) cost each.[lower-alpha 1]

However, Williams' method is suboptimal. A faster method (due to Floyd[5]) starts by arbitrarily putting the elements on a binary tree, respecting the shape property (the tree could be represented by an array, see below). Then starting from the lowest level and moving upwards, sift the root of each subtree downward as in the deletion algorithm until the heap property is restored. More specifically, if all the subtrees starting at some height h have already been "heapified" (the bottommost level corresponding to h = 0), the trees at height h + 1 can be heapified by sending their root down along the path of maximum valued children when building a max-heap, or minimum valued children when building a min-heap. This process takes O(h) operations (swaps) per node. In this method most of the heapification takes place in the lower levels. Since the height of the heap is ⌊log n⌋, the number of nodes at height h is ≤ 2^⌊log n⌋ / 2^h ≤ n / 2^h. Therefore, the cost of heapifying all subtrees is:

    Σ_{h=0}^{⌊log n⌋} (n / 2^h) · O(h) = O( n · Σ_{h=0}^{⌊log n⌋} h / 2^h )
                                       = O( n · Σ_{h=0}^{∞} h / 2^h )
                                       = O(n)

This uses the fact that the given infinite series Σ_{i=0}^{∞} i / 2^i converges.[lower-alpha 2]
The exact value of the above (the worst-case number of comparisons during the heap construction) is known to be equal to

    2n − 2 s2(n) − e2(n),[6]

where s2(n) is the sum of all digits of the binary representation of n and e2(n) is the exponent of 2 in the prime factorization of n.

The Build-Max-Heap function that follows converts an array A, which stores a complete binary tree with n nodes, to a max-heap by repeatedly using Max-Heapify in a bottom-up manner. It is based on the observation that the array elements indexed by floor(n/2) + 1, floor(n/2) + 2, ..., n are all leaves for the tree (assuming that indices start at 1), thus each is a one-element heap. Build-Max-Heap runs Max-Heapify on each of the remaining tree nodes.

Build-Max-Heap (A):
    heap_length[A] ← length[A]
    for each index i from floor(length[A]/2) downto 1 do:
        Max-Heapify(A, i)
function extends the heap property to b−1, b, b+1, ..., e.
5.4.3 Heap implementation Only index i = b−1 can violate the heap property. Let j
be the index of the largest child of a[i] (for a max-heap,
or the smallest child for a min-heap) within the range b,
..., e. (If no such index exists because 2i > e then the heap
property holds for the newly extended range and nothing
needs to be done.) By swapping the values a[i] and a[j]
0 1 2 3 4 5 6 the heap property for position i is established. At this
point, the only problem is that the heap property might
A small complete binary tree stored in an array not hold for index j. The sift-down function is applied
tail-recursively to index j until the heap property is estab-
lished for all elements.
The sift-down function is fast. In each step it only needs
two comparisons and one swap. The index value where it
is working doubles in each iteration, so that at most log2
e steps are required.

Comparison between a binary heap and an array implementa- For big heaps and using virtual memory, storing elements
tion. in an array according to the above scheme is inefficient:
(almost) every level is in a different page. B-heaps are
Heaps are commonly implemented with an array. Any bi- binary heaps that keep subtrees in a single page, reducing
nary tree can be stored in an array, but because a binary the number of pages accessed by up to a factor of ten.[7]
5.4. BINARY HEAP 141

The operation of merging two binary heaps takes Θ(n) for equal-sized heaps. The best you can do is (in case of an array implementation) simply concatenate the two heap arrays and build a heap of the result.[8] A heap on n elements can be merged with a heap on k elements using O(log n log k) key comparisons, or, in case of a pointer-based implementation, in O(log n log k) time.[9] An algorithm for splitting a heap on n elements into two heaps on k and n−k elements, respectively, based on a new view of heaps as ordered collections of subheaps, was presented in [10]. The algorithm requires O(log n * log n) comparisons. The view also presents a new and conceptually simple algorithm for merging heaps. When merging is a common task, a different heap implementation is recommended, such as binomial heaps, which can be merged in O(log n).

Additionally, a binary heap can be implemented with a traditional binary tree data structure, but there is an issue with finding the adjacent element on the last level of the binary heap when adding an element. This element can be determined algorithmically or by adding extra data to the nodes, called "threading" the tree: instead of merely storing references to the children, we store the inorder successor of the node as well.

It is possible to modify the heap structure to allow extraction of both the smallest and largest element in O(log n) time.[11] To do this, the rows alternate between min heap and max heap. The algorithms are roughly the same, but, in each step, one must consider the alternating rows with alternating comparisons. The performance is roughly the same as a normal single direction heap. This idea can be generalised to a min-max-median heap.
throughout yields left = 2i and right = 2i + 1 for heaps
with their root at 1.
5.4.4 Derivation of index equations
Parent node
In an array-based heap, the children and parent of a node
can be located via simple arithmetic on the node’s index. Every node is either the left or right child of its parent, so
This section derives the relevant equations for heaps with we know that either of the following is true.
their root at index 0, with additional notes on heaps with
their root at index 1.
1. i = 2 × (parent) + 1
To avoid confusion, we'll define the level of a node as its
distance from the root, such that the root itself occupies 2. i = 2 × (parent) + 2
level 0.
Hence,

Child nodes

For a general node located at index i (beginning from 0), i−1 i−2
parent = or
we will first derive the index of its right child, right = 2 2
2i + 2 . ⌊ ⌋
i−1
Let node i be located in level L , and note that any level Now consider the expression .
l contains exactly 2l nodes. Furthermore, there are ex- 2
actly 2l+1 − 1 nodes contained in the layers up to and If node i is a left child, this gives the result immediately,
including layer l (think of binary arithmetic; 0111...111 however, it also gives the correct result if node i is a right
= 1000...000 - 1). Because the root is stored at 0, the k child. In this case, (i−2) must be even, and hence (i−1)
th node will be stored at index (k − 1) . Putting these must be odd.
Therefore,

    ⌊(i − 1)/2⌋ = ⌊(i − 2)/2 + 1/2⌋
                = (i − 2)/2
                = parent

Therefore, irrespective of whether a node is a left or right child, its parent can be found by the expression:

    parent = ⌊(i − 1)/2⌋
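To make the three index equations concrete, here is a small Python sketch for a heap with its root at index 0 (the function names are illustrative, not from the text):

    def parent(i):
        return (i - 1) // 2     # floor((i - 1) / 2), valid for any i > 0

    def left(i):
        return 2 * i + 1

    def right(i):
        return 2 * i + 2

    # For example, the children of node 4 are nodes 9 and 10,
    # and parent(9) == parent(10) == 4.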
5.4.5 Related structures

Since the ordering of siblings in a heap is not specified by the heap property, a single node’s two children can be freely interchanged unless doing so violates the shape property (compare with treap). Note, however, that in the common array-based heap, simply swapping the children might also necessitate moving the children’s sub-tree nodes to retain the heap property.

The binary heap is a special case of the d-ary heap in which d = 2.

5.4.6 Summary of running times

In the following time complexities[12] O(f) is an asymptotic upper bound and Θ(f) is an asymptotically tight bound (see Big O notation). Function names assume a min-heap.

[1] In fact, this procedure can be shown to take Θ(n log n) time in the worst case, meaning that n log n is also an asymptotic lower bound on the complexity.[1]:167 In the average case (averaging over all permutations of n inputs), though, the method takes linear time.[5]

[2] This does not mean that sorting can be done in linear time since building a heap is only the first step of the heapsort algorithm.

[3] Brodal and Okasaki later describe a persistent variant with the same bounds except for decrease-key, which is not supported. Heaps with n elements can be constructed bottom-up in O(n).[16]

[4] Amortized time.

[5] Lower bound of Ω(log log n),[19] upper bound of O(2^(2√(log log n))).[20]

[6] n is the size of the larger heap.

5.4.7 See also

• Heap
• Heapsort

5.4.8 References

[1] Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2009) [1990]. Introduction to Algorithms (3rd ed.). MIT Press and McGraw-Hill. ISBN 0-262-03384-4.

[2] Williams, J. W. J. (1964), “Algorithm 232 - Heapsort”, Communications of the ACM, 7 (6): 347–348, doi:10.1145/512274.512284

[3] eEL, CSA Dept., IISc, Bangalore, “Binary Heaps”, Data Structures and Algorithms

[4] http://wcipeg.com/wiki/Binary_heap

[5] Hayward, Ryan; McDiarmid, Colin (1991). “Average Case Analysis of Heap Building by Repeated Insertion” (PDF). J. Algorithms. 12: 126–153.

[6] Suchenek, Marek A. (2012), “Elementary Yet Precise Worst-Case Analysis of Floyd’s Heap-Construction Program”, Fundamenta Informaticae, IOS Press, 120 (1): 75–92, doi:10.3233/FI-2012-751.

[7] Poul-Henning Kamp. “You're Doing It Wrong”. ACM Queue. June 11, 2010.

[8] Chris L. Kuszmaul. “binary heap”. Dictionary of Algorithms and Data Structures, Paul E. Black, ed., U.S. National Institute of Standards and Technology. 16 November 2009.

[9] J.-R. Sack and T. Strothotte, “An Algorithm for Merging Heaps”, Acta Informatica 22, 171–186 (1985).

[10] J.-R. Sack and T. Strothotte, “A characterization of heaps and its applications”, Information and Computation, Volume 86, Issue 1, May 1990, Pages 69–86.

[11] Atkinson, M.D.; J.-R. Sack; N. Santoro & T. Strothotte (1 October 1986). “Min-max heaps and generalized priority queues” (PDF). Programming techniques and Data structures. Comm. ACM, 29(10): 996–1000.

[12] Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L. (1990). Introduction to Algorithms (1st ed.). MIT Press and McGraw-Hill. ISBN 0-262-03141-8.

[13] Fredman, Michael Lawrence; Tarjan, Robert E. (July 1987). “Fibonacci heaps and their uses in improved network optimization algorithms” (PDF). Journal of the Association for Computing Machinery. 34 (3): 596–615. doi:10.1145/28869.28874.

[14] Iacono, John (2000), “Improved upper bounds for pairing heaps”, Proc. 7th Scandinavian Workshop on Algorithm Theory, Lecture Notes in Computer Science, 1851, Springer-Verlag, pp. 63–77, arXiv:1110.4428, doi:10.1007/3-540-44985-X_5, ISBN 3-540-67690-2
[15] Brodal, Gerth S. (1996), “Worst-Case Efficient Priority Queues” (PDF), Proc. 7th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 52–58

[16] Goodrich, Michael T.; Tamassia, Roberto (2004). “7.3.6. Bottom-Up Heap Construction”. Data Structures and Algorithms in Java (3rd ed.). pp. 338–341. ISBN 0-471-46983-1.

[17] Haeupler, Bernhard; Sen, Siddhartha; Tarjan, Robert E. (2009). “Rank-pairing heaps” (PDF). SIAM J. Computing: 1463–1485.

[18] Brodal, G. S. L.; Lagogiannis, G.; Tarjan, R. E. (2012). Strict Fibonacci heaps (PDF). Proceedings of the 44th symposium on Theory of Computing - STOC '12. p. 1177. doi:10.1145/2213977.2214082. ISBN 9781450312455.

[19] Fredman, Michael Lawrence (July 1999). “On the Efficiency of Pairing Heaps and Related Data Structures” (PDF). Journal of the Association for Computing Machinery. 46 (4): 473–501. doi:10.1145/320211.320214.

[20] Pettie, Seth (2005). Towards a Final Analysis of Pairing Heaps (PDF). FOCS '05 Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science. pp. 174–183. CiteSeerX 10.1.1.549.471. doi:10.1109/SFCS.2005.75. ISBN 0-7695-2468-0.

5.4.9 External links

• Binary Heap Applet by Kubo Kovac
• Open Data Structures - Section 10.1 - BinaryHeap: An Implicit Binary Tree
• Implementation of binary max heap in C by Robin Thomas
• Implementation of binary min heap in C by Robin Thomas

5.5 d-ary heap

The d-ary heap or d-heap is a priority queue data structure, a generalization of the binary heap in which the nodes have d children instead of 2.[1][2][3] Thus, a binary heap is a 2-heap, and a ternary heap is a 3-heap. According to Tarjan[2] and Jensen et al.,[4] d-ary heaps were invented by Donald B. Johnson in 1975.[1]

This data structure allows decrease priority operations to be performed more quickly than binary heaps, at the expense of slower delete minimum operations. This trade-off leads to better running times for algorithms such as Dijkstra’s algorithm in which decrease priority operations are more common than delete min operations.[1][5] Additionally, d-ary heaps have better memory cache behavior than binary heaps, allowing them to run more quickly in practice despite having a theoretically larger worst-case running time.[6][7] Like binary heaps, d-ary heaps are an in-place data structure that uses no additional storage beyond that needed to store the array of items in the heap.[2][8]

5.5.1 Data structure

The d-ary heap consists of an array of n items, each of which has a priority associated with it. These items may be viewed as the nodes in a complete d-ary tree, listed in breadth first traversal order: the item at position 0 of the array forms the root of the tree, the items at positions 1 through d are its children, the next d² items are its grandchildren, etc. Thus, the parent of the item at position i (for any i > 0) is the item at position floor((i − 1)/d) and its children are the items at positions di + 1 through di + d. According to the heap property, in a min-heap, each item has a priority that is at least as large as its parent; in a max-heap, each item has a priority that is no larger than its parent.[2][3]

The minimum priority item in a min-heap (or the maximum priority item in a max-heap) may always be found at position 0 of the array. To remove this item from the priority queue, the last item x in the array is moved into its place, and the length of the array is decreased by one. Then, while item x and its children do not satisfy the heap property, item x is swapped with one of its children (the one with the smallest priority in a min-heap, or the one with the largest priority in a max-heap), moving it downward in the tree and later in the array, until eventually the heap property is satisfied. The same downward swapping procedure may be used to increase the priority of an item in a min-heap, or to decrease the priority of an item in a max-heap.[2][3]

To insert a new item into the heap, the item is appended to the end of the array, and then while the heap property is violated it is swapped with its parent, moving it upward in the tree and earlier in the array, until eventually the heap property is satisfied. The same upward-swapping procedure may be used to decrease the priority of an item in a min-heap, or to increase the priority of an item in a max-heap.[2][3]

To create a new heap from an array of n items, one may loop over the items in reverse order, starting from the item at position n − 1 and ending at the item at position 0, applying the downward-swapping procedure for each item.[2][3]
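The positional arithmetic and the downward-swapping removal just described can be sketched directly; the following Python code is a minimal illustration for a d-ary min-heap stored in a 0-based list (the names and error handling are my own, not from the cited sources):

    def parent(i, d):
        return (i - 1) // d

    def children(i, d):
        return range(d * i + 1, d * i + d + 1)

    def remove_min(a, d):
        # Remove and return the root of a non-empty d-ary min-heap a.
        top = a[0]
        a[0] = a[-1]                 # move the last item into the root's place
        a.pop()                      # decrease the length of the array by one
        i = 0
        while True:                  # swap downward until the heap property holds
            kids = [c for c in children(i, d) if c < len(a)]
            if not kids:
                break
            j = min(kids, key=lambda c: a[c])    # child with smallest priority
            if a[i] <= a[j]:
                break
            a[i], a[j] = a[j], a[i]
            i = j
        return top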
5.5.2 Analysis

In a d-ary heap with n items in it, both the upward-swapping procedure and the downward-swapping procedure may perform as many as log_d n = log n / log d swaps.
In the upward-swapping procedure, each swap involves a single comparison of an item with its parent, and takes constant time. Therefore, the time to insert a new item into the heap, to decrease the priority of an item in a min-heap, or to increase the priority of an item in a max-heap, is O(log n / log d). In the downward-swapping procedure, each swap involves d comparisons and takes O(d) time: it takes d − 1 comparisons to determine the minimum or maximum of the children and then one more comparison against the parent to determine whether a swap is needed. Therefore, the time to delete the root item, to increase the priority of an item in a min-heap, or to decrease the priority of an item in a max-heap, is O(d log n / log d).[2][3]

When creating a d-ary heap from a set of n items, most of the items are in positions that will eventually hold leaves of the d-ary tree, and no downward swapping is performed for those items. At most n/d + 1 items are non-leaves, and may be swapped downwards at least once, at a cost of O(d) time to find the child to swap them with. At most n/d² + 1 nodes may be swapped downward two times, incurring an additional O(d) cost for the second swap beyond the cost already counted in the first term, etc. Therefore, the total amount of time to create a heap in this way is

    Σ_{i=1}^{log_d n} (n/d^i + 1) · O(d) = O(n).[2][3]

The exact value of the above (the worst-case number of comparisons during the construction of a d-ary heap) is known to be equal to

    (d/(d−1))(n − s_d(n)) − (d − 1 − (n mod d))(e_d(⌊n/d⌋) + 1),[9]

where s_d(n) is the sum of all digits of the standard base-d representation of n and e_d(n) is the exponent of d in the factorization of n. This reduces to

    2n − 2s₂(n) − e₂(n),[9]

for d = 2, and to

    (3/2)(n − s₃(n)) − 2e₃(n) − e₃(n − 1),[9]

for d = 3.

The space usage of the d-ary heap, with insert and delete-min operations, is linear, as it uses no extra storage other than an array containing a list of the items in the heap.[2][8] If changes to the priorities of existing items need to be supported, then one must also maintain pointers from the items to their positions in the heap, which again uses only linear storage.[2]

5.5.3 Applications

Dijkstra’s algorithm for shortest paths in graphs and Prim’s algorithm for minimum spanning trees both use a min-heap in which there are n delete-min operations and as many as m decrease-priority operations, where n is the number of vertices in the graph and m is the number of edges. By using a d-ary heap with d = m/n, the total times for these two types of operations may be balanced against each other, leading to a total time of O(m log_{m/n} n) for the algorithm, an improvement over the O(m log n) running time of binary heap versions of these algorithms whenever the number of edges is significantly larger than the number of vertices.[1][5] An alternative priority queue data structure, the Fibonacci heap, gives an even better theoretical running time of O(m + n log n), but in practice d-ary heaps are generally at least as fast, and often faster, than Fibonacci heaps for this application.[10]

4-heaps may perform better than binary heaps in practice, even for delete-min operations.[2][3] Additionally, a d-ary heap typically runs much faster than a binary heap for heap sizes that exceed the size of the computer’s cache memory: a binary heap typically requires more cache misses and virtual memory page faults than a d-ary heap, each one taking far more time than the extra work incurred by the additional comparisons a d-ary heap makes compared to a binary heap.[6][7]
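The d = m/n balancing rule is easy to apply when setting up the heap; a hypothetical helper (not from the cited sources) might read:

    def arity_for_graph(num_vertices, num_edges):
        # d = m/n balances the n delete-min operations (O(d log n / log d) each)
        # against the m decrease-priority operations (O(log n / log d) each);
        # d must be at least 2 for the heap to be well formed.
        return max(2, num_edges // num_vertices)

    # e.g. 1,000 vertices and 50,000 edges suggest a 50-ary heap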
5.5.4 References

[1] Johnson, D. B. (1975), “Priority queues with update and finding minimum spanning trees”, Information Processing Letters, 4 (3): 53–57, doi:10.1016/0020-0190(75)90001-0.

[2] Tarjan, R. E. (1983), “3.2. d-heaps”, Data Structures and Network Algorithms, CBMS-NSF Regional Conference Series in Applied Mathematics, 44, Society for Industrial and Applied Mathematics, pp. 34–38.

[3] Weiss, M. A. (2007), “d-heaps”, Data Structures and Algorithm Analysis (2nd ed.), Addison-Wesley, p. 216, ISBN 0-321-37013-9.

[4] Jensen, C.; Katajainen, J.; Vitale, F. (2004), An extended truth about heaps (PDF).

[5] Tarjan (1983), pp. 77 and 91.

[6] Naor, D.; Martel, C. U.; Matloff, N. S. (1991), “Performance of priority queue structures in a virtual memory environment”, Computer Journal, 34 (5): 428–437, doi:10.1093/comjnl/34.5.428.

[7] Kamp, Poul-Henning (2010), “You're doing it wrong”, ACM Queue, 8 (6).

[8] Mortensen, C. W.; Pettie, S. (2005), “The complexity of implicit and space efficient priority queues”, Algorithms and Data Structures: 9th International Workshop, WADS 2005, Waterloo, Canada, August 15–17, 2005, Proceedings, Lecture Notes in Computer Science, 3608, Springer-Verlag, pp. 49–60, doi:10.1007/11534273_6, ISBN 978-3-540-28101-6.
[9] Suchenek, Marek A. (2012), “Elementary Yet Precise Worst-Case Analysis of Floyd’s Heap-Construction Program”, Fundamenta Informaticae, IOS Press, 120 (1): 75–92, doi:10.3233/FI-2012-751.

[10] Cherkassky, B. V.; Goldberg, A. V.; Radzik, T. (1996), “Shortest paths algorithms: Theory and experimental evaluation”, Mathematical Programming, 73 (2): 129–174, doi:10.1007/BF02592101.

5.5.5 External links

• C++ implementation of generalized heap with D-Heap support

5.6 Binomial heap

“Binomial tree” redirects here. For binomial price trees, see binomial options pricing model.

In computer science, a binomial heap is a heap similar to a binary heap but also supports quick merging of two heaps. This is achieved by using a special tree structure. It is important as an implementation of the mergeable heap abstract data type (also called meldable heap), which is a priority queue supporting the merge operation.

5.6.1 Binomial heap

A binomial heap is implemented as a collection of binomial trees (compare with a binary heap, which has the shape of a single binary tree), which are defined recursively as follows:

• A binomial tree of order 0 is a single node.
• A binomial tree of order k has a root node whose children are roots of binomial trees of orders k−1, k−2, ..., 2, 1, 0 (in this order).

A binomial tree of order k has 2^k nodes and height k.

[Figure: Binomial trees of order 0 to 3: Each tree has a root node with subtrees of all lower ordered binomial trees, which have been highlighted. For example, the order 3 binomial tree is connected to an order 2, 1, and 0 (highlighted as blue, green and red respectively) binomial tree.]

Because of its unique structure, a binomial tree of order k can be constructed from two trees of order k−1 trivially by attaching one of them as the leftmost child of the root of the other tree. This feature is central to the merge operation of a binomial heap, which is its major advantage over other conventional heaps.

The name comes from the shape: a binomial tree of order n has C(n, d) nodes at depth d. (See Binomial coefficient.)

5.6.2 Structure of a binomial heap

A binomial heap is implemented as a set of binomial trees that satisfy the binomial heap properties:

• Each binomial tree in a heap obeys the minimum-heap property: the key of a node is greater than or equal to the key of its parent.
• There can only be either one or zero binomial trees for each order, including zero order.

The first property ensures that the root of each binomial tree contains the smallest key in the tree, which applies to the entire heap.

The second property implies that a binomial heap with n nodes consists of at most log n + 1 binomial trees. In fact, the number and orders of these trees are uniquely determined by the number of nodes n: each binomial tree corresponds to one digit in the binary representation of the number n. For example, the number 13 is 1101 in binary, 2³ + 2² + 2⁰, and thus a binomial heap with 13 nodes will consist of three binomial trees of orders 3, 2, and 0 (see figure below).

[Figure: Example of a binomial heap containing 13 nodes with distinct keys. The heap consists of three binomial trees with orders 0, 2, and 3.]
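The correspondence between tree orders and the binary representation of n can be checked mechanically; a small illustrative Python sketch:

    def tree_orders(n):
        # One binomial tree of order k for every set bit 2**k of n.
        return [k for k in range(n.bit_length()) if (n >> k) & 1]

    print(tree_orders(13))   # [0, 2, 3]: 13 = 0b1101, as in the example above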
5.6.3 Implementation

Because no operation requires random access to the root nodes of the binomial trees, the roots of the binomial trees can be stored in a linked list, ordered by increasing order of the tree.

Merge

[Figure: To merge two binomial trees of the same order, first compare the root keys. Since 7 > 3, the black tree on the left (with root node 7) is attached to the grey tree on the right (with root node 3) as a subtree. The result is a tree of order 3.]

As mentioned above, the simplest and most important operation is the merging of two binomial trees of the same order within a binomial heap. Due to the structure of binomial trees, they can be merged trivially. As their root node is the smallest element within the tree, by comparing the two keys, the smaller of them is the minimum key, and becomes the new root node. Then the other tree becomes a subtree of the combined tree. This operation is basic to the complete merging of two binomial heaps.

function mergeTree(p, q)
    if p.root.key <= q.root.key
        return p.addSubTree(q)
    else
        return q.addSubTree(p)

The operation of merging two heaps is perhaps the most interesting and can be used as a subroutine in most other operations. The lists of roots of both heaps are traversed simultaneously in a manner similar to that of the merge algorithm.

[Figure: This shows the merger of two binomial heaps. This is accomplished by merging two binomial trees of the same order one by one. If the resulting merged tree has the same order as one binomial tree in one of the two heaps, then those two are merged again.]

If only one of the heaps contains a tree of order j, this tree is moved to the merged heap. If both heaps contain a tree of order j, the two trees are merged to one tree of order j+1 so that the minimum-heap property is satisfied. Note that it may later be necessary to merge this tree with some other tree of order j+1 present in one of the heaps. In the course of the algorithm, we need to examine at most three trees of any order (two from the two heaps we merge and one composed of two smaller trees).

Because each binomial tree in a binomial heap corresponds to a bit in the binary representation of its size, there is an analogy between the merging of two heaps and the binary addition of the sizes of the two heaps, from right-to-left. Whenever a carry occurs during addition, this corresponds to a merging of two binomial trees during the merge.

Each tree has order at most log n and therefore the running time is O(log n).

function merge(p, q)                    # heap is the merged output heap being built
    while not (p.end() and q.end())
        tree = mergeTree(p.currentTree(), q.currentTree())
        if not heap.currentTree().empty()
            tree = mergeTree(tree, heap.currentTree())
        heap.addTree(tree)
        heap.next(); p.next(); q.next()

Insert

Inserting a new element into a heap can be done by simply creating a new heap containing only this element and then merging it with the original heap. Due to the merge, insert takes O(log n) time. However, across a series of n consecutive insertions, insert has an amortized time of O(1) (i.e. constant).
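The two pseudocode fragments above translate readily into a concrete sketch. The following Python code is one possible rendering; the representation, a node class plus an {order: root} dictionary per heap, is my own, chosen to make the binary-addition analogy explicit, and is not the article's implementation:

    class Node:
        def __init__(self, key):
            self.key = key
            self.children = []     # subtrees of orders k-1, ..., 1, 0

    def merge_tree(p, q):
        # Link two binomial trees of the same order: the root with the
        # smaller key wins, and the other tree becomes its leftmost child.
        if p.key <= q.key:
            p.children.insert(0, q)
            return p
        q.children.insert(0, p)
        return q

    def merge(h1, h2):
        # Merge two heaps given as {order: root} dicts. The loop mimics
        # binary addition of the heap sizes: at each order there are at
        # most three trees (one from each heap, plus a carry), and any
        # pair of equal-order trees is linked into a carry one order up.
        result, carry = {}, None
        top = max(list(h1) + list(h2) + [-1])
        for k in range(top + 2):
            trees = [t for t in (h1.get(k), h2.get(k), carry) if t is not None]
            carry = None
            if len(trees) % 2 == 1:        # one tree of order k remains
                result[k] = trees.pop()
            if len(trees) == 2:            # a pair becomes the carry
                carry = merge_tree(trees[0], trees[1])
        return result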
Find minimum

To find the minimum element of the heap, find the minimum among the roots of the binomial trees. This can again be done easily in O(log n) time, as there are just O(log n) trees and hence roots to examine.

By using a pointer to the binomial tree that contains the minimum element, the time for this operation can be reduced to O(1). The pointer must be updated when performing any operation other than Find minimum. This can be done in O(log n) without raising the running time of any operation.

Delete minimum

To delete the minimum element from the heap, first find this element, remove it from its binomial tree, and obtain a list of its subtrees. Then transform this list of subtrees into a separate binomial heap by reordering them from smallest to largest order. Then merge this heap with the original heap. Since each root has at most log n children, creating this new heap is O(log n). Merging heaps is O(log n), so the entire delete minimum operation is O(log n).

function deleteMin(heap)                 # tmp is a fresh empty heap
    min = heap.trees().first()
    for each current in heap.trees()
        if current.root < min.root then min = current
    for each tree in min.subTrees()
        tmp.addTree(tree)
    heap.removeTree(min)
    merge(heap, tmp)

Decrease key

After decreasing the key of an element, it may become smaller than the key of its parent, violating the minimum-heap property. If this is the case, exchange the element with its parent, and possibly also with its grandparent, and so on, until the minimum-heap property is no longer violated. Each binomial tree has height at most log n, so this takes O(log n) time.

Delete

To delete an element from the heap, decrease its key to negative infinity (that is, some value lower than any element in the heap) and then delete the minimum in the heap.

5.6.4 Summary of running times

In the following time complexities[1] O(f) is an asymptotic upper bound and Θ(f) is an asymptotically tight bound (see Big O notation). Function names assume a min-heap.

[1] Brodal and Okasaki later describe a persistent variant with the same bounds except for decrease-key, which is not supported. Heaps with n elements can be constructed bottom-up in O(n).[5]

[2] Amortized time.

[3] Lower bound of Ω(log log n),[8] upper bound of O(2^(2√(log log n))).[9]

[4] n is the size of the larger heap.

5.6.5 Applications

• Discrete event simulation
• Priority queues

5.6.6 See also

• Fibonacci heap
• Skew binomial heap
• Soft heap
• Weak heap, an implicit variant of the binomial heap

5.6.7 References

• Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2001) [1990]. “Chapter 19: Binomial Heaps”. Introduction to Algorithms (2nd ed.). MIT Press and McGraw-Hill. pp. 455–475. ISBN 0-262-03293-7.

• Vuillemin, Jean (April 1978). “A data structure for manipulating priority queues” (PDF). Communications of the ACM. 21 (4): 309–314. CiteSeerX 10.1.1.309.9090. doi:10.1145/359460.359478.

5.6.8 External links

• Java applet simulation of binomial heap
• Python implementation of binomial heap
• Two C implementations of binomial heap (a generic one and one optimized for integer keys)
• Haskell implementation of binomial heap
• Common Lisp implementation of binomial heap

[1] Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L. (1990). Introduction to Algorithms (1st ed.). MIT Press and McGraw-Hill. ISBN 0-262-03141-8.

[2] Fredman, Michael Lawrence; Tarjan, Robert E. (July 1987). “Fibonacci heaps and their uses in improved network optimization algorithms” (PDF). Journal of the Association for Computing Machinery. 34 (3): 596–615. doi:10.1145/28869.28874.
[3] Iacono, John (2000), “Improved upper bounds for pairing heaps”, Proc. 7th Scandinavian Workshop on Algorithm Theory, Lecture Notes in Computer Science, 1851, Springer-Verlag, pp. 63–77, arXiv:1110.4428, doi:10.1007/3-540-44985-X_5, ISBN 3-540-67690-2

[4] Brodal, Gerth S. (1996), “Worst-Case Efficient Priority Queues” (PDF), Proc. 7th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 52–58

[5] Goodrich, Michael T.; Tamassia, Roberto (2004). “7.3.6. Bottom-Up Heap Construction”. Data Structures and Algorithms in Java (3rd ed.). pp. 338–341. ISBN 0-471-46983-1.

[6] Haeupler, Bernhard; Sen, Siddhartha; Tarjan, Robert E. (2009). “Rank-pairing heaps” (PDF). SIAM J. Computing: 1463–1485.

[7] Brodal, G. S. L.; Lagogiannis, G.; Tarjan, R. E. (2012). Strict Fibonacci heaps (PDF). Proceedings of the 44th symposium on Theory of Computing - STOC '12. p. 1177. doi:10.1145/2213977.2214082. ISBN 9781450312455.

[8] Fredman, Michael Lawrence (July 1999). “On the Efficiency of Pairing Heaps and Related Data Structures” (PDF). Journal of the Association for Computing Machinery. 46 (4): 473–501. doi:10.1145/320211.320214.

[9] Pettie, Seth (2005). Towards a Final Analysis of Pairing Heaps (PDF). FOCS '05 Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science. pp. 174–183. CiteSeerX 10.1.1.549.471. doi:10.1109/SFCS.2005.75. ISBN 0-7695-2468-0.

5.7 Fibonacci heap

In computer science, a Fibonacci heap is a data structure for priority queue operations, consisting of a collection of heap-ordered trees. It has a better amortized running time than many other priority queue data structures including the binary heap and binomial heap. Michael L. Fredman and Robert E. Tarjan developed Fibonacci heaps in 1984 and published them in a scientific journal in 1987. They named Fibonacci heaps after the Fibonacci numbers, which are used in their running time analysis.

For the Fibonacci heap, the find-minimum operation takes constant (O(1)) amortized time.[1] The insert and decrease key operations also work in constant amortized time.[2] Deleting an element (most often used in the special case of deleting the minimum element) works in O(log n) amortized time, where n is the size of the heap.[2] This means that starting from an empty data structure, any sequence of a insert and decrease key operations and b delete operations would take O(a + b log n) worst case time, where n is the maximum heap size. In a binary or binomial heap such a sequence of operations would take O((a + b) log n) time. A Fibonacci heap is thus better than a binary or binomial heap when b is smaller than a by a non-constant factor. It is also possible to merge two Fibonacci heaps in constant amortized time, improving on the logarithmic merge time of a binomial heap, and improving on binary heaps which cannot handle merges efficiently.

Using Fibonacci heaps for priority queues improves the asymptotic running time of important algorithms, such as Dijkstra’s algorithm for computing the shortest path between two nodes in a graph, compared to the same algorithm using other slower priority queue data structures.

5.7.1 Structure

[Figure 1: Example of a Fibonacci heap. It has three trees of degrees 0, 1 and 3. Three vertices are marked (shown in blue). Therefore, the potential of the heap is 9 (3 trees + 2 × (3 marked vertices)).]

A Fibonacci heap is a collection of trees satisfying the minimum-heap property, that is, the key of a child is always greater than or equal to the key of the parent. This implies that the minimum key is always at the root of one of the trees. Compared with binomial heaps, the structure of a Fibonacci heap is more flexible. The trees do not have a prescribed shape and in the extreme case the heap can have every element in a separate tree. This flexibility allows some operations to be executed in a lazy manner, postponing the work for later operations. For example, merging heaps is done simply by concatenating the two lists of trees, and operation decrease key sometimes cuts a node from its parent and forms a new tree.

However, at some point order needs to be introduced to the heap to achieve the desired running time. In particular, degrees of nodes (here degree means the number of children) are kept quite low: every node has degree at most O(log n) and the size of a subtree rooted in a node of degree k is at least F_(k+2), where F_k is the kth Fibonacci number. This is achieved by the rule that we can cut at most one child of each non-root node. When a second child is cut, the node itself needs to be cut from its parent and becomes the root of a new tree (see Proof of degree bounds, below). The number of trees is decreased in the operation delete minimum, where trees are linked together.
As a result of a relaxed structure, some operations can take a long time while others are done very quickly. For the amortized running time analysis we use the potential method, in that we pretend that very fast operations take a little bit longer than they actually do. This additional time is then later combined and subtracted from the actual running time of slow operations. The amount of time saved for later use is measured at any given moment by a potential function. The potential of a Fibonacci heap is given by

    Potential = t + 2m

where t is the number of trees in the Fibonacci heap, and m is the number of marked nodes. A node is marked if at least one of its children was cut since this node was made a child of another node (all roots are unmarked). The amortized time for an operation is given by the sum of the actual time and c times the difference in potential, where c is a constant (chosen to match the constant factors in the O notation for the actual time).

Thus, the root of each tree in a heap has one unit of time stored. This unit of time can be used later to link this tree with another tree at amortized time 0. Also, each marked node has two units of time stored. One can be used to cut the node from its parent. If this happens, the node becomes a root and the second unit of time will remain stored in it as in any other root.
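As a tiny worked check of the formula (illustrative code, not from the text):

    def potential(t, m):
        # Potential = t + 2m: one stored time unit per tree root,
        # two per marked node, as described above.
        return t + 2 * m

    print(potential(3, 3))   # 9, matching the Figure 1 caption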

5.7.2 Implementation of operations

To allow fast deletion and concatenation, the roots of all trees are linked using a circular, doubly linked list. The children of each node are also linked using such a list. For each node, we maintain its number of children and whether the node is marked. Moreover, we maintain a pointer to the root containing the minimum key.
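A sketch of this node layout in Python (the field names are mine): each node carries its key, degree, mark flag and parent/child pointers, and sibling lists are circular and doubly linked, so two root lists can be concatenated in constant time, which is all the merge operation has to do:

    class FibNode:
        def __init__(self, key):
            self.key = key
            self.degree = 0        # number of children
            self.mark = False      # lost a child since becoming a child itself?
            self.parent = None
            self.child = None      # pointer to any one child
            self.left = self       # circular, doubly linked sibling list
            self.right = self

    def splice(a, b):
        # Concatenate the circular lists containing a and b in O(1).
        a_next, b_prev = a.right, b.left
        a.right, b.left = b, a
        b_prev.right, a_next.left = a_next, b_prev

insert is then just splice applied to a fresh one-node list, followed by an update of the minimum pointer.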
Operation find minimum is now trivial because we keep the pointer to the node containing it. It does not change the potential of the heap, therefore both actual and amortized cost are constant.

As mentioned above, merge is implemented simply by concatenating the lists of tree roots of the two heaps. This can be done in constant time and the potential does not change, leading again to constant amortized time.

Operation insert works by creating a new heap with one element and doing merge. This takes constant time, and the potential increases by one, because the number of trees increases. The amortized cost is thus still constant.

Operation extract minimum (same as delete minimum) operates in three phases. First we take the root containing the minimum element and remove it. Its children will become roots of new trees. If the number of children was d, it takes time O(d) to process all new roots and the potential increases by d−1. Therefore, the amortized running time of this phase is O(d) = O(log n).

[Figure: Fibonacci heap from Figure 1 after first phase of extract minimum. Node with key 1 (the minimum) was deleted and its children were added as separate trees.]
However, to complete the extract minimum operation, we need to update the pointer to the root with minimum key. Unfortunately there may be up to n roots we need to check. In the second phase we therefore decrease the number of roots by successively linking together roots of the same degree. When two roots u and v have the same degree, we make one of them a child of the other so that the one with the smaller key remains the root. Its degree will increase by one. This is repeated until every root has a different degree. To find trees of the same degree efficiently we use an array of length O(log n) in which we keep a pointer to one root of each degree. When a second root is found of the same degree, the two are linked and the array is updated. The actual running time is O(log n + m) where m is the number of roots at the beginning of the second phase. At the end we will have at most O(log n) roots (because each has a different degree). Therefore, the difference in the potential function from before this phase to after it is O(log n) − m, and the amortized running time is then at most O(log n + m) + c(O(log n) − m). With a sufficiently large choice of c, this simplifies to O(log n).

[Figure: Fibonacci heap from Figure 1 after extract minimum is completed. First, nodes 3 and 6 are linked together. Then the result is linked with the tree rooted at node 2. Finally, the new minimum is found.]

In the third phase we check each of the remaining roots and find the minimum. This takes O(log n) time and the potential does not change. The overall amortized running time of extract minimum is therefore O(log n).

[Figure: Fibonacci heap from Figure 1 after decreasing key of node 9 to 0. This node as well as its two marked ancestors are cut from the tree rooted at 1 and placed as new roots.]

Operation decrease key will take the node, decrease the key and, if the heap property becomes violated (the new key is smaller than the key of the parent), cut the node from its parent. If the parent is not a root, it is marked. If it has been marked already, it is cut as well and its parent is marked. We continue upwards until we reach either the root or an unmarked node. Now we set the minimum pointer to the decreased value if it is the new minimum. In the process we create some number, say k, of new trees. Each of these new trees except possibly the first one was marked originally, but as a root it will become unmarked. One node can become marked. Therefore, the number of marked nodes changes by −(k − 1) + 1 = −k + 2. Combining these two changes, the potential changes by 2(−k + 2) + k = −k + 4. The actual time to perform the cutting was O(k), therefore (again with a sufficiently large choice of c) the amortized running time is constant.

Finally, operation delete can be implemented simply by decreasing the key of the element to be deleted to minus infinity, thus turning it into the minimum of the whole heap. Then we call extract minimum to remove it. The amortized running time of this operation is O(log n).

5.7.3 Proof of degree bounds

The amortized performance of a Fibonacci heap depends on the degree (number of children) of any tree root being O(log n), where n is the size of the heap. Here we show that the size of the (sub)tree rooted at any node x of degree d in the heap must be at least F_(d+2), where F_k is the kth Fibonacci number. The degree bound follows from this and the fact (easily proved by induction) that F_(d+2) ≥ φ^d for all integers d ≥ 0, where φ = (1 + √5)/2 ≈ 1.618. (We then have n ≥ F_(d+2) ≥ φ^d, and taking the log to base φ of both sides gives d ≤ log_φ n as required.)

Consider any node x somewhere in the heap (x need not be the root of one of the main trees). Define size(x) to be the size of the tree rooted at x (the number of descendants of x, including x itself). We prove by induction on the height of x (the length of a longest simple path from x to a descendant leaf) that size(x) ≥ F_(d+2), where d is the degree of x.

Base case: If x has height 0, then d = 0, and size(x) = 1 = F₂.

Inductive case: Suppose x has positive height and degree d > 0. Let y₁, y₂, ..., y_d be the children of x, indexed in order of the times they were most recently made children of x (y₁ being the earliest and y_d the latest), and let c₁, c₂, ..., c_d be their respective degrees. We claim that c_i ≥ i−2 for each i with 2 ≤ i ≤ d: Just before y_i was made a child of x, y₁, ..., y_(i−1) were already children of x, and so x had degree at least i−1 at that time. Since trees are combined only when the degrees of their roots are equal, it must have been that y_i also had degree at least i−1 at the time it became a child of x. From that time to the present, y_i can only have lost at most one child (as guaranteed by the marking process), and so its current degree c_i is at least i−2. This proves the claim.

Since the heights of all the y_i are strictly less than that of x, we can apply the inductive hypothesis to them to get size(y_i) ≥ F_(c_i + 2) ≥ F_((i−2)+2) = F_i. The nodes x and y₁ each contribute at least 1 to size(x), and so we have

    size(x) ≥ 2 + Σ_{i=2}^{d} size(y_i) ≥ 2 + Σ_{i=2}^{d} F_i = 1 + Σ_{i=0}^{d} F_i.

A routine induction proves that 1 + Σ_{i=0}^{d} F_i = F_(d+2) for any d ≥ 0, which gives the desired lower bound on size(x).

5.7.4 Worst case

Although Fibonacci heaps look very efficient, they have the following two drawbacks (as mentioned in the paper “The Pairing Heap: A new form of Self Adjusting Heap”): “They are complicated when it comes to coding them. Also they are not as efficient in practice when compared with the theoretically less efficient forms of heaps, since in their simplest version they require storage and manipulation of four pointers per node, compared to the two or three pointers per node needed for other structures.”[3] The other structures referred to are the binary heap, binomial heap, pairing heap, Brodal heap and rank-pairing heap.
Although the total running time of a sequence of operations starting with an empty structure is bounded by the bounds given above, some (very few) operations in the sequence can take very long to complete (in particular delete and delete minimum have linear running time in the worst case). For this reason Fibonacci heaps and other amortized data structures may not be appropriate for real-time systems. It is possible to create a data structure which has the same worst-case performance as the Fibonacci heap has amortized performance. One such structure, the Brodal queue,[4] is, in the words of the creator, “quite complicated” and "[not] applicable in practice.” Created in 2012, the strict Fibonacci heap[5] is a simpler (compared to Brodal’s) structure with the same worst-case bounds. It is unknown whether the strict Fibonacci heap is efficient in practice. The run-relaxed heaps of Driscoll et al. give good worst-case performance for all Fibonacci heap operations except merge.

5.7.5 Summary of running times

In the following time complexities[6] O(f) is an asymptotic upper bound and Θ(f) is an asymptotically tight bound (see Big O notation). Function names assume a min-heap.

[1] Brodal and Okasaki later describe a persistent variant with the same bounds except for decrease-key, which is not supported. Heaps with n elements can be constructed bottom-up in O(n).[9]

[2] Amortized time.

[3] Lower bound of Ω(log log n),[12] upper bound of O(2^(2√(log log n))).[13]

[4] n is the size of the larger heap.

5.7.6 Practical considerations

Fibonacci heaps have a reputation for being slow in practice[14] due to large memory consumption per node and high constant factors on all operations.[15] Recent experimental results suggest that Fibonacci heaps are more efficient in practice than most of their later derivatives, including quake heaps, violation heaps, strict Fibonacci heaps and rank-pairing heaps, but less efficient than either pairing heaps or array-based heaps.[16]

5.7.7 References

[1] Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2001) [1990]. “Chapter 20: Fibonacci Heaps”. Introduction to Algorithms (2nd ed.). MIT Press and McGraw-Hill. pp. 476–497. ISBN 0-262-03293-7. Third edition p. 518.

[2] Fredman, Michael Lawrence; Tarjan, Robert E. (July 1987). “Fibonacci heaps and their uses in improved network optimization algorithms” (PDF). Journal of the Association for Computing Machinery. 34 (3): 596–615. doi:10.1145/28869.28874.

[3] Fredman, Michael L.; Sedgewick, Robert; Sleator, Daniel D.; Tarjan, Robert E. (1986). “The pairing heap: a new form of self-adjusting heap” (PDF). Algorithmica. 1 (1): 111–129. doi:10.1007/BF01840439.

[4] Gerth Stølting Brodal (1996), “Worst-Case Efficient Priority Queues”, Proc. 7th ACM-SIAM Symposium on Discrete Algorithms, Society for Industrial and Applied Mathematics: 52–58, CiteSeerX 10.1.1.43.8133, doi:10.1145/313852.313883, ISBN 0-89871-366-8

[5] Brodal, G. S. L.; Lagogiannis, G.; Tarjan, R. E. (2012). Strict Fibonacci heaps (PDF). Proceedings of the 44th symposium on Theory of Computing - STOC '12. p. 1177. doi:10.1145/2213977.2214082. ISBN 978-1-4503-1245-5.

[6] Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L. (1990). Introduction to Algorithms (1st ed.). MIT Press and McGraw-Hill. ISBN 0-262-03141-8.

[7] Iacono, John (2000), “Improved upper bounds for pairing heaps”, Proc. 7th Scandinavian Workshop on Algorithm Theory, Lecture Notes in Computer Science, 1851, Springer-Verlag, pp. 63–77, arXiv:1110.4428, doi:10.1007/3-540-44985-X_5, ISBN 3-540-67690-2

[8] Brodal, Gerth S. (1996), “Worst-Case Efficient Priority Queues” (PDF), Proc. 7th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 52–58

[9] Goodrich, Michael T.; Tamassia, Roberto (2004). “7.3.6. Bottom-Up Heap Construction”. Data Structures and Algorithms in Java (3rd ed.). pp. 338–341. ISBN 0-471-46983-1.

[10] Haeupler, Bernhard; Sen, Siddhartha; Tarjan, Robert E. (2009). “Rank-pairing heaps” (PDF). SIAM J. Computing: 1463–1485.

[11] Brodal, G. S. L.; Lagogiannis, G.; Tarjan, R. E. (2012). Strict Fibonacci heaps (PDF). Proceedings of the 44th symposium on Theory of Computing - STOC '12. p. 1177. doi:10.1145/2213977.2214082. ISBN 9781450312455.

[12] Fredman, Michael Lawrence (July 1999). “On the Efficiency of Pairing Heaps and Related Data Structures” (PDF). Journal of the Association for Computing Machinery. 46 (4): 473–501. doi:10.1145/320211.320214.

[13] Pettie, Seth (2005). Towards a Final Analysis of Pairing Heaps (PDF). FOCS '05 Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science. pp. 174–183. CiteSeerX 10.1.1.549.471. doi:10.1109/SFCS.2005.75. ISBN 0-7695-2468-0.

[14] http://www.cs.princeton.edu/~wayne/kleinberg-tardos/pdf/FibonacciHeaps.pdf, p. 79

[15] http://web.stanford.edu/class/cs166/lectures/07/Small07.pdf, p. 72

[16] Larkin, Daniel; Sen, Siddhartha; Tarjan, Robert (2014). “A Back-to-Basics Empirical Study of Priority Queues”. Proceedings of the Sixteenth Workshop on Algorithm Engineering and Experiments: 61–72. arXiv:1403.0252. doi:10.1137/1.9781611973198.7.
5.7.8 External links

• Java applet simulation of a Fibonacci heap
• MATLAB implementation of Fibonacci heap
• De-recursived and memory efficient C implementation of Fibonacci heap (free/libre software, CeCILL-B license)
• Ruby implementation of the Fibonacci heap (with tests)
• Pseudocode of the Fibonacci heap algorithm
• Various Java Implementations for Fibonacci heap

5.8 Pairing heap

A pairing heap is a type of heap data structure with relatively simple implementation and excellent practical amortized performance, introduced by Michael Fredman, Robert Sedgewick, Daniel Sleator, and Robert Tarjan in 1986.[1] Pairing heaps are heap-ordered multiway tree structures, and can be considered simplified Fibonacci heaps. They are considered a “robust choice” for implementing such algorithms as Prim’s MST algorithm,[2] and support the following operations (assuming a min-heap):

• find-min: simply return the top element of the heap.
• merge: compare the two root elements; the smaller remains the root of the result, and the larger element and its subtree is appended as a child of this root.
• insert: create a new heap for the inserted element and merge into the original heap.
• decrease-key (optional): remove the subtree rooted at the key to be decreased, replace the key with a smaller key, then merge the result back into the heap.
• delete-min: remove the root and merge its subtrees. Various strategies are employed.

The analysis of pairing heaps’ time complexity was initially inspired by that of splay trees.[1] The amortized time per delete-min is O(log n), and the operations find-min, merge, and insert run in O(1) amortized time.[3]

Determining the precise asymptotic running time of pairing heaps when a decrease-key operation is needed has turned out to be difficult. Initially, the time complexity of this operation was conjectured on empirical grounds to be O(1),[4] but Fredman proved that the amortized time per decrease-key is at least Ω(log log n) for some sequences of operations.[5] Using a different amortization argument, Pettie then proved that insert, meld, and decrease-key all run in O(2^(2√(log log n))) amortized time, which is o(log n).[6] Elmasry later introduced a variant of pairing heaps for which decrease-key runs in O(log log n) amortized time and with all other operations matching Fibonacci heaps,[7] but no tight Θ(log log n) bound is known for the original data structure.[6][3] Moreover, it is an open question whether an o(log n) amortized time bound for decrease-key and an O(1) amortized time bound for insert can be achieved simultaneously.[8]

Although this is worse than other priority queue algorithms such as Fibonacci heaps, which perform decrease-key in O(1) amortized time, the performance in practice is excellent. Stasko and Vitter,[4] Moret and Shapiro,[9] and Larkin, Sen, and Tarjan[8] conducted experiments on pairing heaps and other heap data structures. They concluded that pairing heaps are often faster in practice than array-based binary heaps and d-ary heaps, and almost always faster in practice than other pointer-based heaps, including data structures like Fibonacci heaps that are theoretically more efficient.

5.8.1 Structure

A pairing heap is either an empty heap, or a pair consisting of a root element and a possibly empty list of pairing heaps. The heap ordering property requires that all the root elements of the subheaps in the list are not smaller than the root element of the heap. The following description assumes a purely functional heap that does not support the decrease-key operation.

type PairingHeap[Elem] = Empty
                       | Heap(elem: Elem, subheaps: List[PairingHeap[Elem]])

A pointer-based implementation for RAM machines, supporting decrease-key, can be achieved using three pointers per node, by representing the children of a node by a singly-linked list: a pointer to the node’s first child, one to its next sibling, and one to its previous sibling (or, for the leftmost sibling, to its parent). Alternatively, the previous-pointer can be omitted by letting the last child point back to the parent, if a single boolean flag is added to indicate “end of list”. This achieves a more compact structure at the expense of a constant overhead factor per operation.[1]

5.8.2 Operations

find-min

The function find-min simply returns the root element of the heap:

function find-min(heap: PairingHeap[Elem]) -> Elem
    if heap == Empty
        error
    else
        return heap.elem
merge

Merging with an empty heap returns the other heap; otherwise a new heap is returned that has the minimum of the two root elements as its root element and just adds the heap with the larger root to the list of subheaps:

function merge(heap1, heap2: PairingHeap[Elem]) -> PairingHeap[Elem]
    if heap1 == Empty
        return heap2
    elsif heap2 == Empty
        return heap1
    elsif heap1.elem < heap2.elem
        return Heap(heap1.elem, heap2 :: heap1.subheaps)
    else
        return Heap(heap2.elem, heap1 :: heap2.subheaps)

insert

The easiest way to insert an element into a heap is to merge the heap with a new heap containing just this element and an empty list of subheaps:

function insert(elem: Elem, heap: PairingHeap[Elem]) -> PairingHeap[Elem]
    return merge(Heap(elem, []), heap)

delete-min

The only non-trivial fundamental operation is the deletion of the minimum element from the heap. The standard strategy first merges the subheaps in pairs (this is the step that gave this data structure its name) from left to right and then merges the resulting list of heaps from right to left:

function delete-min(heap: PairingHeap[Elem]) -> PairingHeap[Elem]
    if heap == Empty
        error
    else
        return merge-pairs(heap.subheaps)

This uses the auxiliary function merge-pairs:

function merge-pairs(list: List[PairingHeap[Elem]]) -> PairingHeap[Elem]
    if length(list) == 0
        return Empty
    elsif length(list) == 1
        return list[0]
    else
        return merge(merge(list[0], list[1]), merge-pairs(list[2..]))

That this does indeed implement the described two-pass left-to-right then right-to-left merging strategy can be seen from this reduction:

merge-pairs([H1, H2, H3, H4, H5, H6, H7])
=> merge(merge(H1, H2), merge-pairs([H3, H4, H5, H6, H7]))
   # merge H1 and H2 to H12, then the rest of the list
=> merge(H12, merge(merge(H3, H4), merge-pairs([H5, H6, H7])))
   # merge H3 and H4 to H34, then the rest of the list
=> merge(H12, merge(H34, merge(merge(H5, H6), merge-pairs([H7]))))
   # merge H5 and H6 to H56, then the rest of the list
=> merge(H12, merge(H34, merge(H56, H7)))
   # switch direction, merge the last two resulting heaps, giving H567
=> merge(H12, merge(H34, H567))
   # merge the last two resulting heaps, giving H34567
=> merge(H12, H34567)
   # finally, merge the first merged pair with the result of merging the rest
=> H1234567
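The functional pseudocode above carries over almost term for term into Python if a heap is represented as a (root, list_of_subheaps) tuple and the empty heap as None (a representation of my choosing, not the article's):

    def merge(h1, h2):
        if h1 is None:
            return h2
        if h2 is None:
            return h1
        if h1[0] < h2[0]:
            return (h1[0], [h2] + h1[1])   # h2 becomes a subheap of h1
        return (h2[0], [h1] + h2[1])

    def insert(elem, heap):
        return merge((elem, []), heap)

    def find_min(heap):
        return heap[0]                     # undefined on the empty heap

    def delete_min(heap):
        subs = heap[1]
        # First pass: merge subheaps in pairs, left to right.
        pairs = [merge(subs[i], subs[i + 1]) if i + 1 < len(subs) else subs[i]
                 for i in range(0, len(subs), 2)]
        # Second pass: merge the resulting heaps, right to left.
        result = None
        for h in reversed(pairs):
            result = merge(h, result)
        return result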
5.8.3 Summary of running times

In the following time complexities[10] O(f) is an asymptotic upper bound and Θ(f) is an asymptotically tight bound (see Big O notation). Function names assume a min-heap.

[1] Brodal and Okasaki later describe a persistent variant with the same bounds except for decrease-key, which is not supported. Heaps with n elements can be constructed bottom-up in O(n).[14]

[2] Amortized time.

[3] Lower bound of Ω(log log n),[17] upper bound of O(2^(2√(log log n))).[18]

[4] n is the size of the larger heap.

5.8.4 References

[1] Fredman, Michael L.; Sedgewick, Robert; Sleator, Daniel D.; Tarjan, Robert E. (1986). “The pairing heap: a new form of self-adjusting heap” (PDF). Algorithmica. 1 (1): 111–129. doi:10.1007/BF01840439.

[2] Mehlhorn, Kurt; Sanders, Peter (2008). Algorithms and Data Structures: The Basic Toolbox (PDF). Springer. p. 231.

[3] Iacono, John (2000). Improved upper bounds for pairing heaps (PDF). Proc. 7th Scandinavian Workshop on Algorithm Theory. Lecture Notes in Computer Science. 1851. Springer-Verlag. pp. 63–77. arXiv:1110.4428. doi:10.1007/3-540-44985-X_5. ISBN 978-3-540-67690-4.

[4] Stasko, John T.; Vitter, Jeffrey S. (1987), “Pairing heaps: experiments and analysis”, Communications of the ACM, 30 (3): 234–249, CiteSeerX 10.1.1.106.2988, doi:10.1145/214748.214759

[5] Fredman, Michael L. (1999). “On the efficiency of pairing heaps and related data structures” (PDF). Journal of the ACM. 46 (4): 473–501. doi:10.1145/320211.320214.

[6] Pettie, Seth (2005), “Towards a final analysis of pairing heaps” (PDF), Proc. 46th Annual IEEE Symposium on Foundations of Computer Science, pp. 174–183, doi:10.1109/SFCS.2005.75, ISBN 0-7695-2468-0

[7] Elmasry, Amr (2009), “Pairing heaps with O(log log n) decrease cost” (PDF), Proc. 20th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 471–476, doi:10.1137/1.9781611973068.52

[8] Larkin, Daniel H.; Sen, Siddhartha; Tarjan, Robert E. (2014), “A back-to-basics empirical study of priority queues”, Proceedings of the 16th Workshop on Algorithm Engineering and Experiments, pp. 61–72, arXiv:1403.0252, doi:10.1137/1.9781611973198.7

[9] Moret, Bernard M. E.; Shapiro, Henry D. (1991), “An empirical analysis of algorithms for constructing a minimum spanning tree”, Proc. 2nd Workshop on Algorithms and Data Structures, Lecture Notes in Computer Science, 519, Springer-Verlag, pp. 400–411, CiteSeerX 10.1.1.53.5960, doi:10.1007/BFb0028279, ISBN 3-540-54343-0

[10] Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L. (1990). Introduction to Algorithms (1st ed.). MIT Press and McGraw-Hill. ISBN 0-262-03141-8.

[11] Fredman, Michael Lawrence; Tarjan, Robert E. (July 1987). “Fibonacci heaps and their uses in improved network optimization algorithms” (PDF). Journal of the Association for Computing Machinery. 34 (3): 596–615. doi:10.1145/28869.28874.

[12] Iacono, John (2000), “Improved upper bounds for pairing heaps”, Proc. 7th Scandinavian Workshop on Algorithm Theory, Lecture Notes in Computer Science, 1851, Springer-Verlag, pp. 63–77, arXiv:1110.4428, doi:10.1007/3-540-44985-X_5, ISBN 3-540-67690-2

[13] Brodal, Gerth S. (1996), “Worst-Case Efficient Priority Queues” (PDF), Proc. 7th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 52–58

[14] Goodrich, Michael T.; Tamassia, Roberto (2004). “7.3.6. Bottom-Up Heap Construction”. Data Structures and Algorithms in Java (3rd ed.). pp. 338–341. ISBN 0-471-46983-1.

[15] Haeupler, Bernhard; Sen, Siddhartha; Tarjan, Robert E. (2009). “Rank-pairing heaps” (PDF). SIAM J. Computing: 1463–1485.

[16] Brodal, G. S. L.; Lagogiannis, G.; Tarjan, R. E. (2012). Strict Fibonacci heaps (PDF). Proceedings of the 44th symposium on Theory of Computing - STOC '12. p. 1177. doi:10.1145/2213977.2214082. ISBN 9781450312455.

[17] Fredman, Michael Lawrence (July 1999). “On the Efficiency of Pairing Heaps and Related Data Structures” (PDF). Journal of the Association for Computing Machinery. 46 (4): 473–501. doi:10.1145/320211.320214.

[18] Pettie, Seth (2005). Towards a Final Analysis of Pairing Heaps (PDF). FOCS '05 Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science. pp. 174–183. CiteSeerX 10.1.1.549.471. doi:10.1109/SFCS.2005.75. ISBN 0-7695-2468-0.

5.8.5 External links

• Louis Wasserman discusses pairing heaps and their implementation in Haskell in The Monad Reader, Issue 16 (pp. 37–52).
• pairing heaps, Sartaj Sahni

5.9 Double-ended priority queue

Not to be confused with Double-ended queue.

In computer science, a double-ended priority queue (DEPQ)[1] or double-ended heap[2] is a data structure similar to a priority queue or heap, but allows for efficient removal of both the maximum and minimum, according to some ordering on the keys (items) stored in the structure. Every element in a DEPQ has a priority or value. In a DEPQ, it is possible to remove the elements in both ascending as well as descending order.[3]

5.9.1 Operations

A double-ended priority queue features the following operations:

isEmpty() Checks if the DEPQ is empty and returns true if empty.
size() Returns the total number of elements present in the DEPQ.
getMin() Returns the element having least priority.
getMax() Returns the element having highest priority.
put(x) Inserts the element x in the DEPQ.
removeMin() Removes an element with minimum priority and returns this element.
removeMax() Removes an element with maximum priority and returns this element.

If an operation is to be performed on two elements having the same priority, then the element inserted first is chosen. Also, the priority of any element can be changed once it has been inserted in the DEPQ.[4]

5.9.2 Implementation

Double-ended priority queues can be built from balanced binary search trees (where the minimum and maximum elements are the leftmost and rightmost leaves, respectively), or using specialized data structures like the min-max heap and pairing heap.

Generic methods of arriving at double-ended priority queues from normal priority queues are:[5]

Dual structure method

In this method two different priority queues for min and max are maintained. The same elements in both the PQs are shown with the help of correspondence pointers. Here, the minimum and maximum elements are the values contained in the root nodes of the min heap and max heap respectively.
5.9. DOUBLE-ENDED PRIORITY QUEUE 155

A dual structure with 14,12,4,10,8 as the members of DEPQ.[1]

contained in the root nodes of min heap and max heap


respectively. A leaf correspondence heap for the same elements as above.[1]

• Removing the min element: Perform removemin() necessary for non-leaf elements to be in a one-to-one
on the min heap and remove(node value) on the max correspondence pair.[1]
heap, where node value is the value in the corre-
sponding node in the max heap.
• Removing the max element: Perform remove- Interval heaps
max() on the max heap and remove(node value) on
the min heap, where node value is the value in the
corresponding node in the min heap.
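A minimal Python sketch of the dual structure idea follows. It is our illustration, not Sahni's implementation: the class and method names are ours, the correspondence pointers are replaced by records shared between the two heaps (with lazy purging instead of a pointer-based remove()), and the first-inserted-wins tie-breaking rule described above is omitted for brevity.

    import heapq

    class DualStructureDEPQ:
        def __init__(self):
            self._min = []      # min-heap of (key, id, record) entries
            self._max = []      # max-heap of (-key, id, record) entries
            self._size = 0

        def is_empty(self):
            return self._size == 0

        def size(self):
            return self._size

        def put(self, key):
            record = [key, True]               # [key, alive flag], shared by both heaps
            heapq.heappush(self._min, (key, id(record), record))
            heapq.heappush(self._max, (-key, id(record), record))
            self._size += 1

        def _purge(self, heap):
            while heap and not heap[0][2][1]:  # drop records removed via the other heap
                heapq.heappop(heap)

        def get_min(self):                     # assumes a non-empty DEPQ
            self._purge(self._min)
            return self._min[0][2][0]

        def get_max(self):                     # assumes a non-empty DEPQ
            self._purge(self._max)
            return self._max[0][2][0]

        def remove_min(self):
            self._purge(self._min)
            _, _, record = heapq.heappop(self._min)
            record[1] = False                  # lazily deleted from the max-heap
            self._size -= 1
            return record[0]

        def remove_max(self):
            self._purge(self._max)
            _, _, record = heapq.heappop(self._max)
            record[1] = False                  # lazily deleted from the min-heap
            self._size -= 1
            return record[0]

With this sketch, put(), remove_min() and remove_max() each cost O(log n) amortized, the behaviour expected of a dual-structure DEPQ.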

Total correspondence

[Figure: A total correspondence heap for the elements 3, 4, 5, 5, 6, 6, 7, 8, 9, 10, 11 with element 11 as buffer.[1]]

Half the elements are in the min PQ and the other half in the max PQ. Each element in the min PQ has a one-to-one correspondence with an element in the max PQ. If the number of elements in the DEPQ is odd, one of the elements is retained in a buffer.[1] The priority of every element in the min PQ will be less than or equal to that of the corresponding element in the max PQ.

Leaf correspondence

[Figure: A leaf correspondence heap for the same elements as above.[1]]

In this method only the leaf elements of the min and max PQ form corresponding one-to-one pairs. It is not necessary for non-leaf elements to be in a one-to-one correspondence pair.[1]

Interval heaps

[Figure: Implementing a DEPQ using an interval heap.]

Apart from the above-mentioned correspondence methods, DEPQs can be obtained efficiently using interval heaps.[6] An interval heap is like an embedded min-max heap in which each node contains two elements. It is a complete binary tree in which:[6]

• The left element is less than or equal to the right element.

• Both elements define a closed interval.

• The interval represented by any node except the root is a sub-interval of the parent node.

• Elements on the left hand side define a min heap.

• Elements on the right hand side define a max heap.

Depending on the number of elements, two cases are possible:[6]

1. Even number of elements: In this case, each node contains two elements, say p and q, with p ≤ q. Every node is then represented by the interval [p, q].

2. Odd number of elements: In this case, each node except the last contains two elements represented by the interval [p, q], whereas the last node contains a single element and is represented by the interval [p, p].
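These conditions can be checked mechanically. A small Python sketch (the array layout and the function name is_interval_heap are our assumptions, not from the text): the heap is stored as a list of (left, right) pairs with the children of node i at positions 2i + 1 and 2i + 2, and a single-element last node is stored as (p, p).

    def is_interval_heap(nodes):
        for i, (p, q) in enumerate(nodes):
            if p > q:                           # left element must not exceed right
                return False
            if i > 0:
                pp, pq = nodes[(i - 1) // 2]    # parent's interval
                if not (pp <= p and q <= pq):   # child interval must nest in parent's
                    return False
        return True

Note that the nesting test pp ≤ p and q ≤ pq simultaneously enforces the min heap property on the left elements and the max heap property on the right elements.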
Inserting an element

Depending on the number of elements already present in the interval heap, the following cases are possible:

• Odd number of elements: If the number of elements in the interval heap is odd, the new element is first inserted in the last node. Then, it is successively compared with the previous node elements and tested to satisfy the criteria essential for an interval heap as stated above. If the element does not satisfy any of the criteria, it is moved from the last node towards the root until all the conditions are satisfied.[6]

• Even number of elements: If the number of elements is even, then for the insertion of a new element an additional node is created. If the element falls to the left of the parent interval, it is considered to be in the min heap, and if the element falls to the right of the parent interval, it is considered to be in the max heap. Further, it is compared successively and moved from the last node towards the root until all the conditions for the interval heap are satisfied. If the element lies within the interval of the parent node itself, the process stops immediately and no elements are moved.[6]

The time required for inserting an element depends on the number of movements required to meet all the conditions and is O(log n).

Deleting an element

• Min element: In an interval heap, the minimum element is the element on the left hand side of the root node. This element is removed and returned. To fill in the vacancy created on the left hand side of the root node, an element from the last node is removed and reinserted into the root node. This element is then compared successively with all the left hand elements of the descending nodes, and the process stops when all the conditions for an interval heap are satisfied. If the left hand side element in a node becomes greater than the right side element at any stage, the two elements are swapped[6] and the comparisons continue. Finally, the root node will again contain the minimum element on the left hand side.

• Max element: In an interval heap, the maximum element is the element on the right hand side of the root node. This element is removed and returned. To fill in the vacancy created on the right hand side of the root node, an element from the last node is removed and reinserted into the root node. Further comparisons are carried out on a similar basis as discussed above. Finally, the root node will again contain the max element on the right hand side.

Thus, with interval heaps, both the minimum and maximum elements can be removed efficiently by traversing from root to leaf, and a DEPQ can be obtained[6] from an interval heap where the elements of the interval heap are the priorities of elements in the DEPQ.

5.9.3 Time Complexity

Interval heaps

When DEPQs are implemented using interval heaps consisting of n elements, the time complexities for the various functions are as follows:[1]

isEmpty(): O(1)
size(): O(1)
getMin(): O(1)
getMax(): O(1)
put(x): O(log n)
removeMin(): O(log n)
removeMax(): O(log n)

Pairing heaps

When DEPQs are implemented using heaps or pairing heaps consisting of n elements, corresponding bounds hold for the various functions; for pairing heaps, they are amortized complexities.[1]

5.9.4 Applications

External sorting

One example application of the double-ended priority queue is external sorting. In an external sort, there are more elements than can be held in the computer's memory. The elements to be sorted are initially on a disk and the sorted sequence is to be left on the disk. The external quick sort is implemented using the DEPQ as follows:

1. Read in as many elements as will fit into an internal DEPQ. The elements in the DEPQ will eventually be the middle group (pivot) of elements.

2. Read in the remaining elements. If the next element is ≤ the smallest element in the DEPQ, output this next element as part of the left group. If the next element is ≥ the largest element in the DEPQ, output this next element as part of the right group. Otherwise, remove either the max or min element from the DEPQ (the choice may be made randomly or alternately); if the max element is removed, output it as part of the right group; otherwise, output the removed element as part of the left group; insert the newly input element into the DEPQ.
3. Output the elements in the DEPQ, in sorted order, as the middle group.

4. Sort the left and right groups recursively.
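The partitioning pass of step 2 can be sketched in Python. This is an in-memory illustration only, with names of our choosing: the middle group is kept in a plain sorted list via the standard bisect module, standing in for the DEPQ (so inserts cost O(capacity) here where an interval heap would give O(log capacity)), and the eviction always removes the maximum; a real external sort would stream the groups to disk.

    import bisect
    from itertools import islice

    def external_quicksort_pass(stream, capacity):
        # Step 1: fill the "DEPQ" (a sorted list) with the first `capacity` elements.
        it = iter(stream)
        middle = sorted(islice(it, capacity))
        left, right = [], []
        # Step 2: route every remaining element.
        for x in it:
            if x <= middle[0]:
                left.append(x)                 # not larger than the DEPQ's minimum
            elif x >= middle[-1]:
                right.append(x)                # not smaller than the DEPQ's maximum
            else:
                right.append(middle.pop())     # evict the max (alternating also works)
                bisect.insort(middle, x)       # insert the new element
        return left, middle, right

Steps 3 and 4 then output middle (already sorted) and recurse on left and right.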
5.9.5 See also

• Queue (abstract data type)

• Priority queue

• Double-ended queue

5.9.6 References

[1] Data Structures, Algorithms, & Applications in Java: Double-Ended Priority Queues, Sartaj Sahni, 1999.

[2] Brass, Peter (2008). Advanced Data Structures. Cambridge University Press. p. 211. ISBN 9780521880374.

[3] “Depq – Double-Ended Priority Queue”.

[4] “depq”.

[5] Fundamentals of Data Structures in C++ – Ellis Horowitz, Sartaj Sahni and Dinesh Mehta.

[6] http://www.mhhe.com/engcs/compsci/sahni/enrich/c9/interval.pdf

5.10 Soft heap

For the Canterbury scene band, see Soft Heap.

In computer science, a soft heap is a variant on the simple heap data structure that has constant amortized time for five types of operations. This is achieved by carefully “corrupting” (increasing) the keys of at most a certain number of values in the heap. The constant-time operations are:

• create(S): Create a new soft heap

• insert(S, x): Insert an element into a soft heap

• meld(S, S′): Combine the contents of two soft heaps into one, destroying both

• delete(S, x): Delete an element from a soft heap

• findmin(S): Get the element with minimum key in the soft heap

Other heaps such as Fibonacci heaps achieve most of these bounds without any corruption, but cannot provide a constant-time bound on the critical delete operation. The amount of corruption can be controlled by the choice of a parameter ε, but the lower this is set, the more time insertions require (O(log 1/ε) for an error rate of ε).

More precisely, the guarantee offered by the soft heap is the following: for a fixed value ε between 0 and 1/2, at any point in time there will be at most εn corrupted keys in the heap, where n is the number of elements inserted so far. Note that this does not guarantee that only a fixed percentage of the keys currently in the heap are corrupted: in an unlucky sequence of insertions and deletions, it can happen that all elements in the heap will have corrupted keys. Similarly, we have no guarantee that in a sequence of elements extracted from the heap with findmin and delete, only a fixed percentage will have corrupted keys: in an unlucky scenario only corrupted elements are extracted from the heap.

The soft heap was designed by Bernard Chazelle in 2000. The term “corruption” in the structure is the result of what Chazelle called “carpooling” in a soft heap. Each node in the soft heap contains a linked list of keys and one common key. The common key is an upper bound on the values of the keys in the linked list. Once a key is added to the linked list, it is considered corrupted because its value is never again relevant in any of the soft heap operations: only the common keys are compared. This is what makes soft heaps “soft”; you can't be sure whether or not any particular value you put into it will be corrupted. The purpose of these corruptions is effectively to lower the information entropy of the data, enabling the data structure to break through information-theoretic barriers regarding heaps.

5.10.1 Applications

Despite their limitations and unpredictable nature, soft heaps are useful in the design of deterministic algorithms. They were used to achieve the best complexity to date for finding a minimum spanning tree. They can also be used to easily build an optimal selection algorithm, as well as near-sorting algorithms, which are algorithms that place every element near its final position, a situation in which insertion sort is fast.

One of the simplest examples is the selection algorithm. Say we want to find the kth largest of a group of n numbers. First, we choose an error rate of 1/3; that is, at most about 33% of the keys we insert will be corrupted. Now, we insert all n elements into the heap — we call the original values the “correct” keys, and the values stored in the heap the “stored” keys. At this point, at most n/3 keys are corrupted, that is, for at most n/3 keys is the “stored” key larger than the “correct” key; for all the others, the stored key equals the correct key.

Next, we delete the minimum element from the heap n/3 times (this is done according to the “stored” key). As the total number of insertions we have made so far is still n, there are still at most n/3 corrupted keys in the heap. Accordingly, at least 2n/3 − n/3 = n/3 of the keys remaining in the heap are not corrupted.
Let L be the element with the largest correct key among the elements we removed. The stored key of L is possibly larger than its correct key (if L was corrupted), and even this larger value is smaller than all the stored keys of the remaining elements in the heap (as we were removing minimums). Therefore, the correct key of L is smaller than the remaining n/3 uncorrupted elements in the soft heap. Thus, L divides the elements somewhere between 33%/66% and 66%/33%. We then partition the set about L using the partition algorithm from quicksort and apply the same algorithm again to either the set of numbers less than L or the set of numbers greater than L, neither of which can exceed 2n/3 elements. Since each insertion and deletion requires O(1) amortized time, the total deterministic time is T(n) = T(2n/3) + O(n). Using case 3 of the master theorem (with ε = 1 and c = 2/3), we know that T(n) = Θ(n).

The final algorithm looks like this:

    function softHeapSelect(a[1..n], k)
        if k = 1 then return minimum(a[1..n])
        create(S)
        for i from 1 to n
            insert(S, a[i])
        for i from 1 to n/3
            x := findmin(S)
            delete(S, x)
        xIndex := partition(a, x)   // Returns new index of pivot x
        if k < xIndex
            softHeapSelect(a[1..xIndex-1], k)
        else
            softHeapSelect(a[xIndex..n], k-xIndex+1)

5.10.2 References

• Chazelle, Bernard (November 2000). “The soft heap: an approximate priority queue with optimal error rate” (PDF). J. ACM. 47 (6): 1012–1027. CiteSeerX 10.1.1.5.9705. doi:10.1145/355541.355554.

• Kaplan, Haim; Zwick, Uri (2009). “A simpler implementation and analysis of Chazelle's soft heaps”. Proceedings of the Nineteenth Annual ACM–SIAM Symposium on Discrete Algorithms. Society for Industrial and Applied Mathematics. pp. 477–485. CiteSeerX 10.1.1.215.6250. doi:10.1137/1.9781611973068.53. ISBN 978-0-89871-680-1.
Chapter 6

Successors and neighbors

6.1 Binary search algorithm

This article is about searching a finite sorted array. For searching continuous function values, see bisection method.

In computer science, binary search, also known as half-interval search,[1] logarithmic search,[2] or binary chop,[3] is a search algorithm that finds the position of a target value within a sorted array.[4][5] Binary search compares the target value to the middle element of the array; if they are unequal, the half in which the target cannot lie is eliminated and the search continues on the remaining half until it is successful or the remaining half is empty.

Binary search runs in at worst logarithmic time, making O(log n) comparisons, where n is the number of elements in the array, the O is Big O notation, and log is the logarithm. Binary search takes only constant (O(1)) space, meaning that the space taken by the algorithm is the same for any number of elements in the array.[6] Although specialized data structures designed for fast searching—such as hash tables—can be searched more efficiently, binary search applies to a wider range of search problems.

Although the idea is simple, implementing binary search correctly requires attention to some subtleties about its exit conditions and midpoint calculation.

There exist numerous variations of binary search. In particular, fractional cascading speeds up binary searches for the same value in multiple arrays, efficiently solving a series of search problems in computational geometry and numerous other fields. Exponential search extends binary search to unbounded lists. The binary search tree and B-tree data structures are based on binary search.

6.1.1 Algorithm

Binary search works on sorted arrays. Binary search begins by comparing the middle element of the array with the target value. If the target value matches the middle element, its position in the array is returned. If the target value is less than or greater than the middle element, the search continues in the lower or upper half of the array, respectively, eliminating the other half from consideration.[7]

Procedure

Given an array A of n elements with values or records A₀, ..., Aₙ₋₁, sorted such that A₀ ≤ ... ≤ Aₙ₋₁, and target value T, the following subroutine uses binary search to find the index of T in A.[7]

1. Set L to 0 and R to n − 1.

2. If L > R, the search terminates as unsuccessful.

3. Set m (the position of the middle element) to the floor (the largest previous integer) of (L + R) / 2.

4. If Aₘ < T, set L to m + 1 and go to step 2.

5. If Aₘ > T, set R to m − 1 and go to step 2.

6. Now Aₘ = T, the search is done; return m.

This iterative procedure keeps track of the search boundaries via two variables. Some implementations may place the comparison for equality at the end of the algorithm, resulting in a faster comparison loop but costing one more iteration on average.[8]
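The procedure above translates directly into code. A minimal Python rendering (the function name binary_search is ours):

    def binary_search(A, T):
        L, R = 0, len(A) - 1
        while L <= R:             # step 2: L > R means the search failed
            m = (L + R) // 2      # step 3: floor of (L + R) / 2
            if A[m] < T:
                L = m + 1         # step 4
            elif A[m] > T:
                R = m - 1         # step 5
            else:
                return m          # step 6: target found
        return None

Python's integers do not overflow, but in fixed-width languages the midpoint should be computed as L + (R − L) / 2; this pitfall is discussed under Implementation issues below.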

Approximate matches

The above procedure only performs exact matches, finding the position of a target value. However, due to the ordered nature of sorted arrays, it is trivial to extend binary search to perform approximate matches. For example, binary search can be used to compute, for a given value, its rank (the number of smaller elements), predecessor (next-smallest element), successor (next-largest element), and nearest neighbor. Range queries seeking the number of elements between two values can be performed with two rank queries.[9]

• Rank queries can be performed using a modified version of binary search. By returning m on a successful search, and L on an unsuccessful search, the number of elements less than the target value is returned instead.[9]

• Predecessor and successor queries can be performed with rank queries. Once the rank of the target value is known, its predecessor is the element at the position given by its rank (as it is the largest element that is smaller than the target value). Its successor is the element after it (if it is present in the array) or at the next position after the predecessor (otherwise).[10] The nearest neighbor of the target value is either its predecessor or successor, whichever is closer.

• Range queries are also straightforward. Once the ranks of the two values are known, the number of elements greater than or equal to the first value and less than the second is the difference of the two ranks. This count can be adjusted up or down by one according to whether the endpoints of the range should be considered to be part of the range and whether the array contains keys matching those endpoints.[11]
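In Python, these approximate-match queries can be sketched with the standard bisect module (the function names are ours):

    import bisect

    def rank(A, T):
        return bisect.bisect_left(A, T)        # number of elements smaller than T

    def predecessor(A, T):
        i = bisect.bisect_left(A, T)
        return A[i - 1] if i > 0 else None     # largest element smaller than T

    def successor(A, T):
        i = bisect.bisect_right(A, T)
        return A[i] if i < len(A) else None    # smallest element larger than T

    def count_in_range(A, lo, hi):
        # Number of elements with lo <= x <= hi, via two rank-style queries.
        return bisect.bisect_right(A, hi) - bisect.bisect_left(A, lo)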
isons on average. A variation of the algorithm instead
checks for equality at the very end of the search, elim-
6.1.2 Performance inating on average half a comparison from each itera-
tion. This decreases the time taken per iteration very
slightly on most computers, while guaranteeing that the
search takes the maximum number of iterations, on aver-
age adding one iteration to the search. Because the com-
parison loop is performed only ⌊log2 n + 1⌋ times in the
worst case, for all but enormous n , the slight increase in
comparison loop efficiency does not compensate for the
extra iteration. Knuth 1998 gives a value of 266 (more
than 73 quintillion)[14] elements for this variation to be
faster.[lower-alpha 2][15][16]
A tree representing binary search. The array being searched here
Fractional cascading can be used to speed up searches of
is [20, 30, 40, 50, 90, 100], and the target value is 40.
the same value in multiple arrays. Where k is the num-
ber of arrays, searching each array for the target value
The performance of binary search can be analyzed by re- takes O(k log n) time; fractional cascading reduces this
ducing the procedure to a binary comparison tree, where to O(k + log n) .[17]
the root node is the middle element of the array; the mid-
dle element of the lower half is left of the root and the
middle element of the upper half is right of the root.
The rest of the tree is built in a similar fashion. This
model represents binary search; starting from the root 6.1.3 Binary search versus other schemes
node, the left or right subtrees are traversed depending
on whether the target value is less or more than the node Sorted arrays with binary search are a very inefficient
under consideration, representing the successive elimina- solution when insertion and deletion operations are in-
tion of elements.[6][12] terleaved with retrieval, taking O(n) time for each such
The worst case is ⌊log2 n + 1⌋ iterations (of the compar- operation, and complicating memory use.[18] Other data
ison loop), where the ⌊⌋ notation denotes the floor func- structures support much more efficient insertion and dele-
tion that rounds its argument down to an integer and log2 tion, and also fast exact matching. However, binary
is the binary logarithm. This is reached when the search search applies to a wide range of search problems, usually
reaches the deepest level of the tree, equivalent to a bi- solving them in O(log n) time regardless of the type or
nary search that has reduced to one element and, in each structure of the values themselves.
Hashing

For implementing associative arrays, hash tables, a data structure that maps keys to records using a hash function, are generally faster than binary search on a sorted array of records;[19] most implementations require only amortized constant time on average.[lower-alpha 3][21] However, hashing is not useful for approximate matches, such as computing the next-smallest, next-largest, and nearest key, as the only information given on a failed search is that the target is not present in any record.[22] Binary search is ideal for such matches, performing them in logarithmic time. In addition, all operations possible on a sorted array can be performed—such as finding the smallest and largest key and performing range searches.[23]

Trees

A binary search tree is a binary tree data structure that works based on the principle of binary search: the records of the tree are arranged in sorted order, and traversal of the tree is performed using a logarithmic time binary search-like algorithm. Insertion and deletion also require logarithmic time in binary search trees. This is faster than the linear time insertion and deletion of sorted arrays, and binary trees retain the ability to perform all the operations possible on a sorted array, including range and approximate queries.[24]

However, binary search is usually more efficient for searching as binary search trees will most likely be imperfectly balanced, resulting in slightly worse performance than binary search. This applies even to balanced binary search trees, binary search trees that balance their own nodes—as they rarely produce optimally-balanced trees—but to a lesser extent. Although unlikely, the tree may be severely imbalanced with few internal nodes with two children, resulting in the average and worst-case search time approaching n comparisons.[lower-alpha 4] Binary search trees take more space than sorted arrays.[26]

Binary search trees lend themselves to fast searching in external memory stored in hard disks, as binary search trees can effectively be structured in filesystems. The B-tree generalizes this method of tree organization; B-trees are frequently used to organize long-term storage such as databases and filesystems.[27][28]

Linear search

Linear search is a simple search algorithm that checks every record until it finds the target value. Linear search can be done on a linked list, which allows for faster insertion and deletion than an array. Binary search is faster than linear search for sorted arrays except if the array is short.[lower-alpha 5][30] If the array must first be sorted, that cost must be amortized over any searches. Sorting the array also enables efficient approximate matches and other operations.[31]

Mixed approaches

The Judy array uses a combination of approaches to provide a highly efficient solution.

Set membership algorithms

A related problem to search is set membership. Any algorithm that does lookup, like binary search, can also be used for set membership. There are other algorithms that are more specifically suited for set membership. A bit array is the simplest, useful when the range of keys is limited; it is very fast, requiring only O(1) time. The Judy1 type of Judy array handles 64-bit keys efficiently.

For approximate results, Bloom filters, another probabilistic data structure based on hashing, store a set of keys by encoding the keys using a bit array and multiple hash functions. Bloom filters are much more space-efficient than bit arrays in most cases and not much slower: with k hash functions, membership queries require only O(k) time. However, Bloom filters suffer from false positives.[lower-alpha 6][lower-alpha 7][33]

Other data structures

There exist data structures that may improve on binary search in some cases for both searching and other operations available for sorted arrays. For example, searches, approximate matches, and the operations available to sorted arrays can be performed more efficiently than binary search on specialized data structures such as van Emde Boas trees, fusion trees, tries, and bit arrays. However, while these operations can always be done at least efficiently on a sorted array regardless of the keys, such data structures are usually only faster because they exploit the properties of keys with a certain attribute (usually keys that are small integers), and thus will be time or space consuming for keys that lack that attribute.[23]

6.1.4 Variations

Uniform binary search

Uniform binary search stores, instead of the lower and upper bounds, the index of the middle element and the number of elements around the middle element that were not eliminated yet. Each step reduces the width by about half. This variation is uniform because the difference between the indices of middle elements and the preceding middle elements chosen remains constant between searches of arrays of the same length.[34]
Boundary search

For a sorted array with duplicates, we can find the boundary of the range of some target value in the array with two binary searches.

Given an array A of n elements with values A₀, ..., Aₙ₋₁, sorted such that A₀ ≤ ... ≤ Aₙ₋₁, and target value T, the following subroutine finds the left boundary index of elements equal to T in A.

1. Set L to −1 and R to n − 1.

2. While R − L > 1:

   • Set m (the position of the middle element) to the floor (the largest previous integer) of L + (R − L) / 2.

   • If Aₘ < T, set L to m; otherwise, set R to m.

3. Now if Aᵣ = T, the search is done and return R; otherwise, the target is not found.

And the following subroutine finds the right boundary index of T in A.

1. Set L to 0 and R to n.

2. While R − L > 1:

   • Set m (the position of the middle element) to the floor (the largest previous integer) of L + (R − L) / 2.

   • If Aₘ ≤ T, set L to m; otherwise, set R to m.

3. Now if Aₗ = T, the search is done and return L; otherwise, the target is not found.

Here we use L + (R − L) / 2 instead of (L + R) / 2 to avoid overflow.
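Both subroutines can be rendered in Python (a sketch; the function names are ours). Note the sentinel bounds −1 and n, and the overflow-safe midpoint from the text:

    def lower_boundary(A, T):
        L, R = -1, len(A) - 1
        while R - L > 1:
            m = L + (R - L) // 2
            if A[m] < T:
                L = m
            else:
                R = m
        return R if R >= 0 and A[R] == T else None   # leftmost index of T, if any

    def upper_boundary(A, T):
        L, R = 0, len(A)
        while R - L > 1:
            m = L + (R - L) // 2
            if A[m] <= T:
                L = m
            else:
                R = m
        return L if A and A[L] == T else None        # rightmost index of T, if any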

Fibonacci search

Main article: Fibonacci search technique

[Figure: Fibonacci search on the function f(x) = sin((x + 1/10)π) on the unit interval [0, 1]. The algorithm finds an interval containing the maximum of f with a length less than or equal to 1/10 in the above example. In three iterations, it returns the interval [5/13, 6/13], which is of length 1/13.]

Fibonacci search is a method similar to binary search that successively shortens the interval in which the maximum of a unimodal function lies. Given a finite interval, a unimodal function, and the maximum length of the resulting interval, Fibonacci search finds a Fibonacci number such that if the interval is divided equally into that many subintervals, the subintervals would be shorter than the maximum length. After dividing the interval, it eliminates the subintervals in which the maximum cannot lie until one or more contiguous subintervals remain.[35][36]

Exponential search

Main article: Exponential search

Exponential search extends binary search to unbounded lists. It starts by finding the first element with an index that is both a power of two and greater than the target value. Afterwards, it sets that index as the upper bound, and switches to binary search. A search takes ⌊log₂ x + 1⌋ iterations of the exponential search and at most ⌊log₂ x⌋ iterations of the binary search, where x is the position of the target value. Exponential search works on bounded lists, but becomes an improvement over binary search only if the target value lies near the beginning of the array.[37]
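A Python sketch of exponential search (the function name is ours), reusing the standard bisect module for the binary search phase:

    import bisect

    def exponential_search(A, T):
        if not A:
            return None
        bound = 1
        while bound < len(A) and A[bound] < T:
            bound *= 2                               # powers of two: 1, 2, 4, 8, ...
        lo, hi = bound // 2, min(bound, len(A) - 1)  # T, if present, lies in A[lo..hi]
        i = bisect.bisect_left(A, T, lo, hi + 1)
        return i if i < len(A) and A[i] == T else None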
Interpolation search

Main article: Interpolation search

Instead of merely calculating the midpoint, interpolation search estimates the position of the target value, taking into account the lowest and highest elements in the array and the length of the array. This is only possible if the array elements are numbers. It works on the basis that the midpoint is not the best guess in many cases; for example, if the target value is close to the highest element in the array, it is likely to be located near the end of the array.[38] When the distribution of the array elements is uniform or near uniform, it makes O(log log n) comparisons.[38][39][40]

In practice, interpolation search is slower than binary search for small arrays, as interpolation search requires extra computation, and the slower growth rate of its time complexity compensates for this only for large arrays.[38]
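A Python sketch of interpolation search (the function name is ours); the midpoint is replaced by a linear-interpolation estimate between the end values:

    def interpolation_search(A, T):
        L, R = 0, len(A) - 1
        while L <= R and A[L] <= T <= A[R]:
            if A[R] == A[L]:                # all remaining values are equal
                m = L
            else:
                # Estimate the position by interpolating between the end values.
                m = L + (T - A[L]) * (R - L) // (A[R] - A[L])
            if A[m] < T:
                L = m + 1
            elif A[m] > T:
                R = m - 1
            else:
                return m
        return None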
Fractional cascading

Main article: Fractional cascading

Fractional cascading is a technique that speeds up binary searches for the same element for both exact and approximate matching in “catalogs” (arrays of sorted elements) associated with vertices in graphs. Searching each catalog separately requires O(k log n) time, where k is the number of catalogs. Fractional cascading reduces this to O(k + log n) by storing specific information in each catalog about other catalogs.[17]

Fractional cascading was originally developed to efficiently solve various computational geometry problems, but it also has been applied elsewhere, in domains such as data mining and Internet Protocol routing.[17]

6.1.5 History

In 1946, John Mauchly made the first mention of binary search as part of the Moore School Lectures, the first ever set of lectures regarding any computer-related topic.[41] Every published binary search algorithm worked only for arrays whose length is one less than a power of two[lower-alpha 8] until 1960, when Derrick Henry Lehmer published a binary search algorithm that worked on all arrays.[43] In 1962, Hermann Bottenbruch presented an ALGOL 60 implementation of binary search that placed the comparison for equality at the end, increasing the average number of iterations by one, but reducing to one the number of comparisons per iteration.[8] The uniform binary search was presented to Donald Knuth in 1971 by A. K. Chandra of Stanford University and published in Knuth's The Art of Computer Programming.[41] In 1986, Bernard Chazelle and Leonidas J. Guibas introduced fractional cascading as a method to solve numerous search problems in computational geometry.[17][44][45]

6.1.6 Implementation issues

    Although the basic idea of binary search is comparatively straightforward, the details can be surprisingly tricky ... — Donald Knuth[2]

When Jon Bentley assigned binary search as a problem in a course for professional programmers, he found that ninety percent failed to provide a correct solution after several hours of working on it,[46] and another study published in 1988 shows that accurate code for it is only found in five out of twenty textbooks.[47] Furthermore, Bentley's own implementation of binary search, published in his 1986 book Programming Pearls, contained an overflow error that remained undetected for over twenty years. The Java programming language library implementation of binary search had the same overflow bug for more than nine years.[48]

In a practical implementation, the variables used to represent the indices will often be of fixed size, and this can result in an arithmetic overflow for very large arrays. If the midpoint of the span is calculated as (L + R) / 2, then the value of L + R may exceed the range of integers of the data type used to store the midpoint, even if L and R are within the range. If L and R are nonnegative, this can be avoided by calculating the midpoint as L + (R − L) / 2.[49]
If the target value is greater than the greatest value in the array, and the last index of the array is the maximum representable value of L, the value of L will eventually become too large and overflow. A similar problem will occur if the target value is smaller than the least value in the array and the first index of the array is the smallest representable value of R. In particular, this means that R must not be an unsigned type if the array starts with index 0.

An infinite loop may occur if the exit conditions for the loop are not defined correctly. Once L exceeds R, the search has failed and must convey the failure of the search. In addition, the loop must be exited when the target element is found, or in the case of an implementation where this check is moved to the end, checks for whether the search was successful or failed at the end must be in place. Bentley found that, in his assignment of binary search, this error was made by most of the programmers who failed to implement a binary search correctly.[8][50]

The algorithm is not cache friendly, i.e. it does not make efficient use of cache memory. Furthermore, it contains unpredictable code flow branches, which incur severe penalties on modern CPUs and make it not vectorizable, i.e. it is not possible to take advantage of parallel instruction sets (e.g. SIMD). Khuong and Morin[51] discuss various technical improvements to the algorithm aimed at removing code branches and improving cache-friendliness. Cannizzo[52] proposes further technical improvements to the algorithm aimed at removing more code branches and making the algorithm vectorizable.

6.1.7 Library support

Many languages' standard libraries include binary search routines:

• C provides the function bsearch() in its standard library, which is typically implemented via binary search (although the official standard does not require it so).[53]

• C++'s STL provides the functions binary_search(), lower_bound(), upper_bound() and equal_range().[54]

• COBOL provides the SEARCH ALL verb for performing binary searches on COBOL ordered tables.[55]

• Java offers a set of overloaded binarySearch() static methods in the classes Arrays and Collections in the standard java.util package for performing binary searches on Java arrays and on Lists, respectively.[56][57]

• Microsoft's .NET Framework 2.0 offers static generic versions of the binary search algorithm in its collection base classes. An example would be System.Array's method BinarySearch<T>(T[] array, T value).[58]

• Python provides the bisect module.[59]

• Ruby's Array class includes a bsearch method with built-in approximate matching.[60]

• Go's sort standard library package contains the functions Search, SearchInts, SearchFloat64s, and SearchStrings, which implement general binary search, as well as specific implementations for searching slices of integers, floating-point numbers, and strings, respectively.[61]

• For Objective-C, the Cocoa framework provides the NSArray -indexOfObject:inSortedRange:options:usingComparator: method in Mac OS X 10.6+.[62] Apple's Core Foundation C framework also contains a CFArrayBSearchValues() function.[63]

6.1.8 See also

• Bisection method – the same idea used to solve equations in the real numbers

• Multiplicative binary search – binary search variation with simplified midpoint calculation

6.1.9 Notes and references

Notes

[1] This happens as binary search will not always divide the array perfectly. Take for example the array [1, 2 ... 16]. The first iteration will select the midpoint of 8. On the left subarray are eight elements, but on the right are nine. If the search takes the right path, there is a higher chance that the search will make the maximum number of comparisons.[12]

[2] A formal time performance analysis by Knuth showed that the average running time of this variation for a successful search is 17.5 log₂ n + 17 units of time compared to 18 log₂ n − 16 units for regular binary search. The time complexity for this variation grows slightly more slowly, but at the cost of higher initial complexity.[15]

[3] It is possible to perform hashing in guaranteed constant time.[20]

[4] The worst binary search tree for searching can be produced by inserting the values in sorted or near-sorted order or in an alternating lowest-highest record pattern.[25]

[5] Knuth 1998 performed a formal time performance analysis of both of these search algorithms. On Knuth's hypothetical MIX computer, intended to represent an ordinary computer, binary search takes on average 18 log n − 16 units of time for a successful search, while linear search with a sentinel node at the end of the list takes 1.75n + 8.5 − (n mod 2)/(4n) units. Linear search has lower initial complexity because it requires minimal computation, but it quickly outgrows binary search in complexity. On the MIX computer, binary search only outperforms linear search with a sentinel if n > 44.[12][29]

[6] As simply setting all of the bits which the hash functions point to for a specific key can affect queries for other keys which have a common hash location for one or more of the functions.[32]

[7] There exist improvements of the Bloom filter which improve on its complexity or support deletion; for example, the cuckoo filter exploits cuckoo hashing to gain these advantages.[32]

[8] That is, arrays of length 1, 3, 7, 15, 31 ...[42]

Citations

[1] Williams, Jr., Louis F. (1975). A modification to the half-interval search (binary search) method. Proceedings of the 14th ACM Southeast Conference. pp. 95–101. doi:10.1145/503561.503582.

[2] Knuth 1998, §6.2.1 (“Searching an ordered table”), subsection “Binary search”.

[3] Imperial College London notes: Lecture 9: Binary chop.

[4] Cormen et al. 2009, p. 39.

[5] Weisstein, Eric W. “Binary Search”. MathWorld.

[6] Flores, Ivan; Madpis, George (1971). “Average binary search length for dense ordered lists”. CACM. 14 (9): 602–603. doi:10.1145/362663.362752.

[7] Knuth 1998, §6.2.1 (“Searching an ordered table”), subsection “Algorithm B”.

[8] Bottenbruch, Hermann (1962). “Structure and Use of ALGOL 60”. Journal of the ACM. 9 (2): 161–221. Procedure is described at p. 214 (§43), titled “Program for Binary Search”.

[9] Sedgewick & Wayne 2011, §3.1, subsection “Rank and selection”.

[10] Goldman & Goldman 2008, pp. 461–463.

[11] Sedgewick & Wayne 2011, §3.1, subsection “Range queries”.

[12] Knuth 1998, §6.2.1 (“Searching an ordered table”), subsection “Further analysis of binary search”.
[13] Chang 2003, p. 169.

[14] Sloane, Neil. Table of n, 2^n for n = 0..1000. Part of OEIS A000079. Retrieved 30 April 2016.

[15] Knuth 1998, §6.2.1 (“Searching an ordered table”), subsection “Exercise 23”.

[16] Rolfe, Timothy J. (1997). “Analytic derivation of comparisons in binary search”. ACM SIGNUM Newsletter. 32 (4): 15–19. doi:10.1145/289251.289255.

[17] Chazelle, Bernard; Liu, Ding (2001). Lower bounds for intersection searching and fractional cascading in higher dimension. 33rd ACM Symposium on Theory of Computing. pp. 322–329. doi:10.1145/380752.380818.

[18] Knuth 1997, §2.2.2 (“Sequential Allocation”).

[19] Knuth 1998, §6.4 (“Hashing”).

[20] Knuth 1998, §6.4 (“Hashing”), subsection “History”.

[21] Dietzfelbinger, Martin; Karlin, Anna; Mehlhorn, Kurt; Meyer auf der Heide, Friedhelm; Rohnert, Hans; Tarjan, Robert E. (August 1994). “Dynamic Perfect Hashing: Upper and Lower Bounds”. SIAM Journal on Computing. 23 (4): 738–761. doi:10.1137/S0097539791194094.

[22] Morin, Pat. “Hash Tables” (PDF). p. 1. Retrieved 28 March 2016.

[23] Beame, Paul; Fich, Faith E. (2001). “Optimal Bounds for the Predecessor Problem and Related Problems”. Journal of Computer and System Sciences. 65 (1): 38–72. doi:10.1006/jcss.2002.1822.

[24] Sedgewick & Wayne 2011, §3.2 (“Binary Search Trees”), subsection “Order-based methods and deletion”.

[25] Knuth 1998, §6.2.2 (“Binary tree searching”), subsection “But what about the worst case?”.

[26] Sedgewick & Wayne 2011, §3.5 (“Applications”), “Which symbol-table implementation should I use?”.

[27] Knuth 1998, §5.4.9 (“Disks and Drums”).

[28] Knuth 1998, §6.2.4 (“Multiway trees”).

[29] Knuth 1998, Answers to Exercises (§6.2.1) for “Exercise 5”.

[30] Knuth 1998, §6.2.1 (“Searching an ordered table”).

[31] Sedgewick & Wayne 2011, §3.2 (“Ordered symbol tables”).

[32] Fan, Bin; Andersen, Dave G.; Kaminsky, Michael; Mitzenmacher, Michael D. (2014). Cuckoo Filter: Practically Better Than Bloom. Proceedings of the 10th ACM International on Conference on emerging Networking Experiments and Technologies. pp. 75–88. doi:10.1145/2674005.2674994.

[33] Bloom, Burton H. (1970). “Space/time Trade-offs in Hash Coding with Allowable Errors”. CACM. 13 (7): 422–426. doi:10.1145/362686.362692.

[34] Knuth 1998, §6.2.1 (“Searching an ordered table”), subsection “An important variation”.

[35] Kiefer, J. (1953). “Sequential Minimax Search for a Maximum”. Proceedings of the American Mathematical Society. 4 (3): 502–506. doi:10.2307/2032161. JSTOR 2032161.

[36] Hassin, Refael (1981). “On Maximizing Functions by Fibonacci Search”. Fibonacci Quarterly. 19: 347–351.

[37] Moffat & Turpin 2002, p. 33.

[38] Knuth 1998, §6.2.1 (“Searching an ordered table”), subsection “Interpolation search”.

[39] Knuth 1998, §6.2.1 (“Searching an ordered table”), subsection “Exercise 22”.

[40] Perl, Yehoshua; Itai, Alon; Avni, Haim (1978). “Interpolation search—a log log n search”. CACM. 21 (7): 550–553. doi:10.1145/359545.359557.

[41] Knuth 1998, §6.2.1 (“Searching an ordered table”), subsection “History and bibliography”.

[42] “2^n − 1”. OEIS A000225. Retrieved 7 May 2016.

[43] Lehmer, Derrick (1960). Teaching combinatorial tricks to a computer. Proceedings of Symposia in Applied Mathematics. 10. pp. 180–181. doi:10.1090/psapm/010.

[44] Chazelle, Bernard; Guibas, Leonidas J. (1986). “Fractional cascading: I. A data structuring technique” (PDF). Algorithmica. 1 (1): 133–162. doi:10.1007/BF01840440.

[45] Chazelle, Bernard; Guibas, Leonidas J. (1986). “Fractional cascading: II. Applications” (PDF). Algorithmica. 1 (1): 163–191. doi:10.1007/BF01840441.

[46] Bentley 2000, §4.1 (“The Challenge of Binary Search”).

[47] Pattis, Richard E. (1988). “Textbook errors in binary searching”. SIGCSE Bulletin. 20: 190–194. doi:10.1145/52965.53012.

[48] Bloch, Joshua (2 June 2006). “Extra, Extra – Read All About It: Nearly All Binary Searches and Mergesorts are Broken”. Google Research Blog. Retrieved 21 April 2016.

[49] Ruggieri, Salvatore (2003). “On computing the semi-sum of two integers” (PDF). Information Processing Letters. 87 (2): 67–71. doi:10.1016/S0020-0190(03)00263-1.

[50] Bentley 2000, §4.4 (“Principles”).

[51] Khuong, Paul-Virak; Morin, Pat (2015-09-16). “Array Layouts for Comparison-Based Searching”. arXiv:1509.05053 [cs].

[52] Cannizzo, Fabio (2015-06-29). “Fast and Vectorizable Alternative to Binary Search in O(1) Applicable to a Wide Domain of Sorted Arrays of Floating Point Numbers”. arXiv:1506.08620 [cs].

[53] “bsearch – binary search a sorted table”. The Open Group Base Specifications (7th ed.). The Open Group. 2013. Retrieved 28 March 2016.
[54] Stroustrup 2013, §32.6.1 (“Binary Search”).

[55] “The Binary Search in COBOL”. The American Programmer. Retrieved 7 November 2016.

[56] “java.util.Arrays”. Java Platform Standard Edition 8 Documentation. Oracle Corporation. Retrieved 1 May 2016.

[57] “java.util.Collections”. Java Platform Standard Edition 8 Documentation. Oracle Corporation. Retrieved 1 May 2016.

[58] “List<T>.BinarySearch Method (T)”. Microsoft Developer Network. Retrieved 10 April 2016.

[59] “8.5. bisect — Array bisection algorithm”. The Python Standard Library. Python Software Foundation. Retrieved 10 April 2016.

[60] Fitzgerald 2007, p. 152.

[61] “Package sort”. The Go Programming Language. Retrieved 28 April 2016.

[62] “NSArray”. Mac Developer Library. Apple Inc. Retrieved 1 May 2016.

[63] “CFArray”. Mac Developer Library. Apple Inc. Retrieved 1 May 2016.

Works

• Alexandrescu, Andrei (2010). The D Programming Language. Upper Saddle River, NJ: Addison-Wesley Professional. ISBN 0-321-63536-1.

• Bentley, Jon (2000) [1986]. Programming Pearls (2nd ed.). Addison-Wesley. ISBN 0-201-65788-0.

• Chang, Shi-Kuo (2003). Data Structures and Algorithms. Software Engineering and Knowledge Engineering. 13. Singapore: World Scientific. ISBN 978-981-238-348-8.

• Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2009) [1990]. Introduction to Algorithms (3rd ed.). MIT Press and McGraw-Hill. ISBN 0-262-03384-4.

• Fitzgerald, Michael (2007). Ruby Pocket Reference. Sebastopol, CA: O'Reilly Media. ISBN 978-1-4919-2601-7.

• Goldman, Sally A.; Goldman, Kenneth J. (2008). A Practical Guide to Data Structures and Algorithms using Java. Boca Raton: CRC Press. ISBN 978-1-58488-455-2.

• Knuth, Donald (1997). Fundamental Algorithms. The Art of Computer Programming. 1 (3rd ed.). Reading, MA: Addison-Wesley Professional.

• Knuth, Donald (1998). Sorting and Searching. The Art of Computer Programming. 3 (2nd ed.). Reading, MA: Addison-Wesley Professional.

• Leiss, Ernst (2007). A Programmer's Companion to Algorithm Analysis. Boca Raton, FL: CRC Press. ISBN 1-58488-673-0.

• Moffat, Alistair; Turpin, Andrew (2002). Compression and Coding Algorithms. Hamburg, Germany: Kluwer Academic Publishers. doi:10.1007/978-1-4615-0935-6. ISBN 978-0-7923-7668-2.

• Sedgewick, Robert; Wayne, Kevin (2011). Algorithms (4th ed.). Upper Saddle River, NJ: Addison-Wesley Professional. ISBN 978-0-321-57351-3.

• Stroustrup, Bjarne (2013). The C++ Programming Language (4th ed.). Upper Saddle River, NJ: Addison-Wesley Professional. ISBN 978-0-321-56384-2.

6.1.10 External links

• NIST Dictionary of Algorithms and Data Structures: binary search

• FastBinarySearch: a software library containing high performance scalar, SSE and AVX implementations of various versions of binary search, written in support of Cannizzo's paper.[52]

6.2 Binary search tree

[Figure: A binary search tree of size 9 and depth 3, with 8 at the root. The leaves are not drawn.]
In computer science, binary search trees (BST), sometimes called ordered or sorted binary trees, are a particular type of container: data structures that store “items” (such as numbers, names etc.) in memory. They allow fast lookup, addition and removal of items, and can be used to implement either dynamic sets of items, or lookup tables that allow finding an item by its key (e.g., finding the phone number of a person by name).

Binary search trees keep their keys in sorted order, so that lookup and other operations can use the principle of binary search: when looking for a key in a tree (or a place to insert a new key), they traverse the tree from root to leaf, making comparisons to keys stored in the nodes of the tree and deciding, based on the comparison, to continue searching in the left or right subtrees. On average, this means that each comparison allows the operations to skip about half of the tree, so that each lookup, insertion or deletion takes time proportional to the logarithm of the number of items stored in the tree. This is much better than the linear time required to find items by key in an (unsorted) array, but slower than the corresponding operations on hash tables.

Several variants of the binary search tree have been studied in computer science; this article deals primarily with the basic type, making references to more advanced types when appropriate.

6.2.1 Definition

A binary search tree is a rooted binary tree, whose internal nodes each store a key (and optionally, an associated value) and each have two distinguished sub-trees, commonly denoted left and right. The tree additionally satisfies the binary search tree property, which states that the key in each node must be greater than or equal to any key stored in the left sub-tree, and less than or equal to any key stored in the right sub-tree.[1]:287 (The leaves (final nodes) of the tree contain no key and have no structure to distinguish them from one another. Leaves are commonly represented by a special leaf or nil symbol, a NULL pointer, etc.)

Generally, the information represented by each node is a record rather than a single data element. However, for sequencing purposes, nodes are compared according to their keys rather than any part of their associated records.

The major advantage of binary search trees over other data structures is that the related sorting algorithms and search algorithms such as in-order traversal can be very efficient; they are also easy to code.

Binary search trees are a fundamental data structure used to construct more abstract data structures such as sets, multisets, and associative arrays. Some of their disadvantages are as follows:

• The shape of the binary search tree depends entirely on the order of insertions and deletions, and can become degenerate.

• When inserting or searching for an element in a binary search tree, the key of each visited node has to be compared with the key of the element to be inserted or found.

• The keys in the binary search tree may be long and the run time may increase.

• After a long intermixed sequence of random insertion and deletion, the expected height of the tree approaches the square root of the number of keys, √n, which grows much faster than log n.

Order relation

Binary search requires an order relation by which every element (item) can be compared with every other element in the sense of a total preorder. The part of the element which effectively takes place in the comparison is called its key. Whether duplicates, i.e. different elements with the same key, shall be allowed in the tree or not does not depend on the order relation, but on the application only.

In the context of binary search trees a total preorder is realized most flexibly by means of a three-way comparison subroutine.

6.2.2 Operations

Binary search trees support three main operations: insertion of elements, deletion of elements, and lookup (checking whether a key is present).

Searching

Searching a binary search tree for a specific key can be programmed recursively or iteratively.

We begin by examining the root node. If the tree is null, the key we are searching for does not exist in the tree. Otherwise, if the key equals that of the root, the search is successful and we return the node. If the key is less than that of the root, we search the left subtree. Similarly, if the key is greater than that of the root, we search the right subtree. This process is repeated until the key is found or the remaining subtree is null. If the searched key is not found after a null subtree is reached, then the key is not present in the tree. This is easily expressed as a recursive algorithm (implemented in Python):

    def search_recursively(key, node):
        if node is None or node.key == key:
            return node
        elif key < node.key:
            return search_recursively(key, node.left)
        else:  # key > node.key
            return search_recursively(key, node.right)

The same algorithm can be implemented iteratively:

    def search_iteratively(key, node):
        current_node = node
        while current_node is not None:
            if key == current_node.key:
                return current_node
            elif key < current_node.key:
                current_node = current_node.left
            else:  # key > current_node.key
                current_node = current_node.right
        return None
These two examples rely on the order relation being a total order.

If the order relation is only a total preorder, a reasonable extension of the functionality is the following: also in case of equality, search down to the leaves in a direction specifiable by the user. A binary tree sort equipped with such a comparison function becomes stable.

Because in the worst case this algorithm must search from the root of the tree to the leaf farthest from the root, the search operation takes time proportional to the tree's height (see tree terminology). On average, binary search trees with n nodes have O(log n) height.[note 1] However, in the worst case, binary search trees can have O(n) height, when the unbalanced tree resembles a linked list (degenerate tree).

Insertion

Insertion begins as a search would begin; if the key is not equal to that of the root, we search the left or right subtrees as before. Eventually, we will reach an external node and add the new key-value pair (here encoded as a record 'newNode') as its right or left child, depending on the node's key. In other words, we examine the root and recursively insert the new node to the left subtree if its key is less than that of the root, or the right subtree if its key is greater than or equal to the root.

Here's how a typical binary search tree insertion might be performed in a binary tree in C++:

    Node* insert(Node*& root, int key, int value) {
        if (!root)
            root = new Node(key, value);
        else if (key < root->key)
            root->left = insert(root->left, key, value);
        else  // key >= root->key
            root->right = insert(root->right, key, value);
        return root;
    }

The above destructive procedural variant modifies the tree in place. It uses only constant heap space (and the iterative version uses constant stack space as well), but the prior version of the tree is lost. Alternatively, as in the following Python example, we can reconstruct all ancestors of the inserted node; any reference to the original tree root remains valid, making the tree a persistent data structure:

    def binary_tree_insert(node, key, value):
        if node is None:
            return NodeTree(None, key, value, None)
        if key == node.key:
            return NodeTree(node.left, key, value, node.right)
        if key < node.key:
            return NodeTree(binary_tree_insert(node.left, key, value),
                            node.key, node.value, node.right)
        else:
            return NodeTree(node.left, node.key, node.value,
                            binary_tree_insert(node.right, key, value))

The part that is rebuilt uses O(log n) space in the average case and O(n) in the worst case.

In either version, this operation requires time proportional to the height of the tree in the worst case, which is O(log n) time in the average case over all trees, but O(n) time in the worst case.

Another way to explain insertion is that in order to insert a new node in the tree, its key is first compared with that of the root. If its key is less than the root's, it is then compared with the key of the root's left child. If its key is greater, it is compared with the root's right child. This process continues, until the new node is compared with a leaf node, and then it is added as this node's right or left child, depending on its key: if the key is less than the leaf's key, then it is inserted as the leaf's left child, otherwise as the leaf's right child.

There are other ways of inserting nodes into a binary tree, but this is the only way of inserting nodes at the leaves and at the same time preserving the BST structure.

Deletion

When removing a node from a binary search tree it is mandatory to maintain the in-order sequence of the nodes. There are many possibilities to do this. However, the following method, which was proposed by T. Hibbard in 1962,[2] guarantees that the heights of the subject subtrees are changed by at most one. There are three possible cases to consider (a code sketch follows after this discussion):

• Deleting a node with no children: simply remove the node from the tree.

• Deleting a node with one child: remove the node and replace it with its child.

• Deleting a node with two children: call the node to be deleted D. Do not delete D. Instead, choose either its in-order predecessor node or its in-order successor node as replacement node E (see figure). Copy the user values of E to D.[note 2] If E does not have a child, simply remove E from its previous parent G. If E has a child, say F, it is a right child. Replace E with F at E's parent.

In all cases, when D happens to be the root, make the replacement node root again.

Broadly speaking, nodes with children are harder to delete. As with all binary trees, a node's in-order successor is its right subtree's left-most child, and a node's in-order predecessor is the left subtree's right-most child. In either case, this node will have only one or no child at all. Delete it according to one of the two simpler cases above.
[Figure: Deleting a node with two children from a binary search tree. First the leftmost node in the right subtree, the in-order successor E, is identified. Its value is copied into the node D being deleted. The in-order successor can then be easily deleted because it has at most one child. The same method works symmetrically using the in-order predecessor C.]

Consistently using the in-order successor or the in-order predecessor for every instance of the two-child case can lead to an unbalanced tree, so some implementations select one or the other at different times.
Runtime analysis: Although this operation does not always traverse the tree down to a leaf, this is always a possibility; thus in the worst case it requires time proportional to the height of the tree. It does not require more even when the node has two children, since it still follows a single path and does not visit any node twice.

def find_min(self):  # Gets minimum node in a subtree
    current_node = self
    while current_node.left_child:
        current_node = current_node.left_child
    return current_node

def replace_node_in_parent(self, new_value=None):
    if self.parent:
        if self == self.parent.left_child:
            self.parent.left_child = new_value
        else:
            self.parent.right_child = new_value
    if new_value:
        new_value.parent = self.parent

def binary_tree_delete(self, key):
    if key < self.key:
        self.left_child.binary_tree_delete(key)
    elif key > self.key:
        self.right_child.binary_tree_delete(key)
    else:  # delete the key here
        if self.left_child and self.right_child:  # if both children are present
            successor = self.right_child.find_min()
            self.key = successor.key
            successor.binary_tree_delete(successor.key)
        elif self.left_child:  # if the node has only a *left* child
            self.replace_node_in_parent(self.left_child)
        elif self.right_child:  # if the node has only a *right* child
            self.replace_node_in_parent(self.right_child)
        else:  # this node has no children
            self.replace_node_in_parent(None)

Traversal

Main article: Tree traversal

Once the binary search tree has been created, its elements can be retrieved in-order by recursively traversing the left subtree of the root node, accessing the node itself, then recursively traversing the right subtree of the node, continuing this pattern with each node in the tree as it is recursively accessed. As with all binary trees, one may conduct a pre-order traversal or a post-order traversal, but neither is likely to be useful for binary search trees. An in-order traversal of a binary search tree will always result in a sorted list of node items (numbers, strings or other comparable items).
The code for in-order traversal in Python is given below. It will call callback (some function the programmer wishes to call on the node's value, such as printing to the screen) for every node in the tree.

def traverse_binary_tree(node, callback):
    if node is None:
        return
    traverse_binary_tree(node.leftChild, callback)
    callback(node.value)
    traverse_binary_tree(node.rightChild, callback)

Traversal requires O(n) time, since it must visit every node. It is also Ω(n), since every node has to be visited, so it is asymptotically optimal.
Traversal can also be implemented iteratively. For certain applications, e.g. greater-or-equal search or approximate search, an operation for a single traversal step can be very useful. This is, of course, implemented without the callback construct and takes O(1) time on average and O(log n) in the worst case.
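Such a single-step operation can be sketched as follows, assuming nodes carry parent pointers as in the deletion code above (the attribute names left_child, right_child and parent match that snippet; the function name find_next is an illustrative choice, not from the text):

def find_next(node):
    # In-order successor of 'node': either the leftmost node of the
    # right subtree, or the nearest ancestor whose left subtree
    # contains 'node'.
    if node.right_child is not None:
        node = node.right_child
        while node.left_child is not None:
            node = node.left_child
        return node
    while node.parent is not None and node is node.parent.right_child:
        node = node.parent
    return node.parent  # None if 'node' held the largest key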
Verification

Sometimes we already have a binary tree, and we need to determine whether it is a BST. This problem has a simple recursive solution.
The BST property—every node in the right subtree has to be larger than the current node and every node in the left subtree has to be smaller than the current node (or equal to it, which should not be the case if only unique values are kept in the tree, and which also poses the question of whether such nodes should be placed to the left or to the right of the parent)—is the key to figuring out whether a tree is a BST or not. The greedy algorithm—simply traverse the tree and, at every node, check whether the node contains a value larger than the value at the left child and smaller than the value at the right child—does not work for all cases. Consider the following tree:

    20
   /  \
  10   30
      /  \
     5    40

In the tree above, each node meets the condition that it contains a value larger than its left child's and smaller than its right child's, and yet it is not a BST: the value 5 is in the right subtree of the node containing 20, a violation of the BST property.
Instead of making a decision based solely on the values of a node and its children, we also need information flowing down from the parent as well. In the case of the tree above, if we could remember the node containing the value 20, we would see that the node with value 5 is violating the BST property contract.
So the condition we need to check at each node is:

• if the node is the left child of its parent, then it must be smaller than (or equal to) the parent and it must pass down the value from its parent to its right subtree to make sure none of the nodes in that subtree is greater than the parent
• if the node is the right child of its parent, then it must be larger than the parent and it must pass down the value from its parent to its left subtree to make sure none of the nodes in that subtree is less than the parent.

A recursive solution in C can explain this further:

struct TreeNode {
    int key;
    int value;
    struct TreeNode *left;
    struct TreeNode *right;
};

bool isBST(struct TreeNode *node, int minKey, int maxKey) {
    if (node == NULL)
        return true;
    if (node->key < minKey || node->key > maxKey)
        return false;
    return isBST(node->left, minKey, node->key - 1) &&
           isBST(node->right, node->key + 1, maxKey);
}

node->key + 1 and node->key - 1 are used to allow only distinct elements in the BST. If we want equal elements to also be present, then we can use only node->key in both places.
The initial call to this function can be something like this:

if (isBST(root, INT_MIN, INT_MAX)) {
    puts("This is a BST.");
} else {
    puts("This is NOT a BST!");
}

Essentially we keep creating a valid range (starting from [MIN_VALUE, MAX_VALUE]) and keep shrinking it down for each node as we go down recursively.
As pointed out in section #Traversal, an in-order traversal of a binary search tree returns the nodes sorted. Thus we only need to keep the last visited node while traversing the tree and check whether its key is smaller (or smaller/equal, if duplicates are to be allowed in the tree) compared to the current key.
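A minimal sketch of this traversal-based check, assuming, for illustration, nodes with key, left and right attributes (the snippets in this section use varying attribute names):

def is_bst_inorder(root):
    # Iterative in-order traversal; the keys must appear in strictly
    # increasing order.  The test 'node.key <= last_key' rejects
    # duplicates; relax it to '<' if equal keys are to be allowed.
    stack, node, last_key = [], root, None
    while stack or node is not None:
        if node is not None:
            stack.append(node)
            node = node.left
        else:
            node = stack.pop()
            if last_key is not None and node.key <= last_key:
                return False
            last_key = node.key
            node = node.right
    return True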
6.2.3 Examples of applications

Some examples shall illustrate the use of the above basic building blocks.

Sort

Main article: Tree sort

A binary search tree can be used to implement a simple sorting algorithm. Similar to heapsort, we insert all the values we wish to sort into a new ordered data structure—in this case a binary search tree—and then traverse it in order.
The worst-case time of build_binary_tree is O(n²)—if you feed it a sorted list of values, it chains them into a linked list with no left subtrees. For example, build_binary_tree([1, 2, 3, 4, 5]) yields the tree (1 (2 (3 (4 (5))))).
There are several schemes for overcoming this flaw with simple binary trees; the most common is the self-balancing binary search tree. If this same procedure is done using such a tree, the overall worst-case time is O(n log n), which is asymptotically optimal for a comparison sort. In practice, the added overhead in time and space for a tree-based sort (particularly for node allocation) makes it inferior to other asymptotically optimal sorts such as heapsort for static list sorting. On the other hand, it is one of the most efficient methods of incremental sorting, adding items to a list over time while keeping the list sorted at all times.
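build_binary_tree is named in the text but not shown; the following self-contained sketch is one plausible reading of it, together with the in-order read-out (the Node class and tree_sort are illustrative names):

class Node:
    def __init__(self, key):
        self.key, self.left, self.right = key, None, None

def build_binary_tree(values):
    # Plain BST insertion: O(n log n) on random input, O(n^2) when the
    # input is already sorted (the degenerate, linked-list-shaped tree).
    root = None
    for v in values:
        node, parent = root, None
        while node is not None:
            parent, node = node, (node.left if v < node.key else node.right)
        if parent is None:
            root = Node(v)
        elif v < parent.key:
            parent.left = Node(v)
        else:
            parent.right = Node(v)
    return root

def tree_sort(values):
    out = []
    def inorder(node):
        if node is not None:
            inorder(node.left)
            out.append(node.key)
            inorder(node.right)
    inorder(build_binary_tree(values))
    return out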
Priority queue operations

Binary search trees can serve as priority queues: structures that allow insertion of arbitrary keys as well as lookup and deletion of the minimum (or maximum) key. Insertion works as previously explained. Find-min walks the tree, following left pointers as far as it can without hitting a leaf:

// Precondition: T is not a leaf
function find-min(T):
    while hasLeft(T):
        T ← left(T)
    return key(T)

Find-max is analogous: follow right pointers as far as possible. Delete-min (max) can simply look up the minimum (maximum), then delete it. This way, insertion and deletion both take logarithmic time, just as they do in a binary heap, but unlike a binary heap and most other priority queue implementations, a single tree can support all of find-min, find-max, delete-min and delete-max at the same time, making binary search trees suitable as double-ended priority queues.[3]:156
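In Python, the same walk and the matching delete-min might look as follows, sketched over the illustrative Node class from the tree-sort example above:

def find_min(root):
    # Precondition: root is not None.
    while root.left is not None:
        root = root.left
    return root.key

def delete_min(root):
    # Remove the leftmost node and return the (possibly new) root.
    if root.left is None:
        return root.right
    parent = root
    while parent.left.left is not None:
        parent = parent.left
    parent.left = parent.left.right
    return root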
6.2.4 Types

There are many types of binary search trees. AVL trees and red-black trees are both forms of self-balancing binary search trees. A splay tree is a binary search tree that automatically moves frequently accessed elements nearer to the root. In a treap (tree heap), each node also holds a (randomly chosen) priority, and the parent node has higher priority than its children. Tango trees are trees optimized for fast searches. T-trees are binary search trees optimized to reduce storage space overhead, widely used for in-memory databases.
A degenerate tree is a tree where, for each parent node, there is only one associated child node. It is unbalanced and, in the worst case, performance degrades to that of a linked list. If your add-node function does not handle re-balancing, then you can easily construct a degenerate tree by feeding it data that is already sorted. What this means is that in a performance measurement, the tree will essentially behave like a linked list data structure.

Performance comparisons

D. A. Heger (2004)[4] presented a performance comparison of binary search trees. The treap was found to have the best average performance, while the red-black tree was found to have the smallest number of performance variations.

Optimal binary search trees

Main article: Optimal binary search tree

[Figure: Tree rotations are very common internal operations in binary trees to keep perfect, or near-to-perfect, internal balance in the tree.]

If we do not plan on modifying a search tree, and we know exactly how often each item will be accessed, we can construct[5] an optimal binary search tree, which is a search tree where the average cost of looking up an item (the expected search cost) is minimized.
Even if we only have estimates of the search costs, such a system can considerably speed up lookups on average. For example, if you have a BST of English words used in a spell checker, you might balance the tree based on word frequency in text corpora, placing words like "the" near the root and words like "agerasia" near the leaves. Such a tree might be compared with Huffman trees, which similarly seek to place frequently used items near the root in order to produce a dense information encoding; however, Huffman trees store data elements only in leaves, and these elements need not be ordered.
If we do not know the sequence in which the elements in the tree will be accessed in advance, we can use splay trees, which are asymptotically as good as any static search tree we can construct for any particular sequence of lookup operations.
Alphabetic trees are Huffman trees with the additional constraint on order, or, equivalently, search trees with the modification that all elements are stored in the leaves. Faster algorithms exist for optimal alphabetic binary trees (OABTs).

6.2.5 See also

• Search tree
• Binary search algorithm
• Randomized binary search tree
• Tango tree
• Self-balancing binary search tree
• Red–black tree
• AVL tree
• Geometry of binary search trees
• Day–Stout–Warren algorithm

6.2.6 Notes

[1] The notion of an average BST is made precise as follows. Let a random BST be one built using only insertions out of a sequence of unique elements in random order (all permutations equally likely); then the expected height of the tree is O(log n). If deletions are allowed as well as insertions, "little is known about the average height of a binary search tree".[1]:300

[2] Of course, a generic software package has to work the other way around: it has to leave the user data untouched and to furnish E with all the BST links to and from D.

6.2.7 References

[1] Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2009) [1990]. Introduction to Algorithms (3rd ed.). MIT Press and McGraw-Hill. ISBN 0-262-03384-4.

[2] See Robert Sedgewick, Kevin Wayne: Algorithms, Fourth Edition. Pearson Education, 2011, ISBN 978-0-321-57351-3, p. 410.

[3] Mehlhorn, Kurt; Sanders, Peter (2008). Algorithms and Data Structures: The Basic Toolbox (PDF). Springer.

[4] Heger, Dominique A. (2004), "A Disquisition on The Performance Behavior of Binary Search Tree Data Structures" (PDF), European Journal for the Informatics Professional, 5 (5): 67–75

[5] Gonnet, Gaston. "Optimal Binary Search Trees". Scientific Computation. ETH Zürich. Retrieved 1 December 2013.

6.2.8 Further reading

• This article incorporates public domain material from the NIST document: Black, Paul E. "Binary Search Tree". Dictionary of Algorithms and Data Structures.

• Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2001). "12: Binary search trees, 15.5: Optimal binary search trees". Introduction to Algorithms (2nd ed.). MIT Press & McGraw-Hill. pp. 253–272, 356–363. ISBN 0-262-03293-7.

• Jarc, Duane J. (3 December 2005). "Binary Tree Traversals". Interactive Data Structure Visualizations. University of Maryland.

• Knuth, Donald (1997). "6.2.2: Binary Tree Searching". The Art of Computer Programming. 3: "Sorting and Searching" (3rd ed.). Addison-Wesley. pp. 426–458. ISBN 0-201-89685-0.

• Long, Sean. "Binary Search Tree" (PPT). Data Structures and Algorithms Visualization: A PowerPoint Slides Based Approach. SUNY Oneonta.

• Parlante, Nick (2001). "Binary Trees". CS Education Library. Stanford University.

6.2.9 External links

• Literate implementations of binary search trees in various languages on LiteratePrograms
• Binary Tree Visualizer (JavaScript animation of various BT-based data structures)
• Kovac, Kubo. "Binary Search Trees" (Java applet). Korešpondenčný seminár z programovania.
• Madru, Justin (18 August 2009). "Binary Search Tree". JDServer. C++ implementation.
• Binary Search Tree Example in Python
• "References to Pointers (C++)". MSDN. Microsoft. 2005. Gives an example binary tree implementation.

6.3 Random binary tree

In computer science and probability theory, a random binary tree is a binary tree selected at random from some probability distribution on binary trees. Two different distributions are commonly used: binary trees formed by inserting nodes one at a time according to a random permutation, and binary trees chosen from a uniform discrete distribution in which all distinct trees are equally likely. It is also possible to form other distributions, for instance by repeated splitting. Adding and removing nodes directly in a random binary tree will in general disrupt its random structure, but the treap and related randomized binary search tree data structures use the principle of binary trees formed from a random permutation in order to maintain a balanced binary search tree dynamically as nodes are inserted and deleted.
For random trees that are not necessarily binary, see random tree.

6.3.1 Binary trees from random permutations

For any set of numbers (or, more generally, values from some total order), one may form a binary search tree in which each number is inserted in sequence as a leaf of the tree, without changing the structure of the previously inserted numbers. The position into which each number should be inserted is uniquely determined by a binary search in the tree formed by the previous numbers. For instance, if the three numbers (1,3,2) are inserted into a tree in that sequence, the number 1 will sit at the root of the tree, the number 3 will be placed as its right child, and the number 2 as the left child of the number 3. There are six different permutations of the numbers (1,2,3), but only five trees may be constructed from them. That is because the permutations (2,1,3) and (2,3,1) form the same tree.

Expected depth of a node

For any fixed choice of a value x in a given set of n numbers, if one randomly permutes the numbers and forms a binary tree from them as described above, the expected value of the length of the path from the root of the tree to x is at most 2 log n + O(1), where "log" denotes the natural logarithm function and the O introduces big O notation. For, the expected number of ancestors of x is by linearity of expectation equal to the sum, over all other values y in the set, of the probability that y is an ancestor of x. And a value y is an ancestor of x exactly when y is the first element to be inserted from the elements in the interval [x,y]. Thus, the values that are adjacent to x in the sorted sequence of values have probability 1/2 of being an ancestor of x, the values one step away have probability 1/3, etc. Adding these probabilities for all positions in the sorted sequence gives twice a Harmonic number, leading to the bound above. A bound of this form holds also for the expected search length of a path to a fixed value x that is not part of the given set.[1]
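The ancestor-probability sum can be written out directly; a small illustrative computation (the function name is ours, not from the text):

from math import log

def expected_ancestors(n, i):
    # Expected number of ancestors of the i-th smallest of n keys
    # (1-based): a key at distance d in sorted order is an ancestor
    # with probability 1/(d+1), by the interval argument above.
    left = sum(1.0 / (d + 1) for d in range(1, i))
    right = sum(1.0 / (d + 1) for d in range(1, n - i + 1))
    return left + right

n = 1024
worst = max(expected_ancestors(n, i) for i in range(1, n + 1))
print(worst, 2 * log(n))  # the expected depth stays below 2 ln n + O(1)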
The longest path

Although not as easy to analyze as the average path length, there has also been much research on determining the expectation (or high probability bounds) of the length of the longest path in a binary search tree generated from a random insertion order.

It is now known that this length, for a tree with n nodes, is almost surely

    (1/β) log n ≈ 4.311 log n,

where β is the unique number in the range 0 < β < 1 satisfying the equation

    2βe^(1−β) = 1.[2]

Expected number of leaves

In the random permutation model, each of the numbers from the set of numbers used to form the tree, except for the smallest and largest of the numbers, has probability 1/3 of being a leaf in the tree, for it is a leaf exactly when it is inserted after its two neighbors, and all six orderings of it and its two neighbors are equally likely. By similar reasoning, the smallest and largest of the numbers have probability 1/2 of being a leaf. Therefore, the expected number of leaves is the sum of these probabilities, which for n ≥ 2 is exactly (n + 1)/3.

Treaps and randomized binary search trees

In applications of binary search tree data structures, it is rare for the values in the tree to be inserted without deletion in a random order, limiting the direct applications of random binary trees. However, algorithm designers have devised data structures that allow insertions and deletions to be performed in a binary search tree, at each step maintaining as an invariant the property that the shape of the tree is a random variable with the same distribution as a random binary search tree.
If a given set of ordered numbers is assigned numeric priorities (distinct numbers unrelated to their values), these priorities may be used to construct a Cartesian tree for the numbers, a binary tree that has as its inorder traversal sequence the sorted sequence of the numbers and that is heap-ordered by priorities. Although more efficient construction algorithms are known, it is helpful to think of a Cartesian tree as being constructed by inserting the given numbers into a binary search tree in priority order. Thus, by choosing the priorities either to be a set of independent random real numbers in the unit interval, or by choosing them to be a random permutation of the numbers from 1 to n (where n is the number of nodes in the tree), and by maintaining the heap ordering property using tree rotations after any insertion or deletion of a node, it is possible to maintain a data structure that behaves like a random binary search tree. Such a data structure is known as a treap or a randomized binary search tree.[3]

6.3.2 Uniformly random binary trees

The number of binary trees with n nodes is a Catalan number: for n = 1, 2, 3, ... these numbers of trees are

    1, 2, 5, 14, 42, 132, 429, 1430, 4862, 16796, … (sequence A000108 in the OEIS).

Thus, if one of these trees is selected uniformly at random, its probability is the reciprocal of a Catalan number. Trees in this model have expected depth proportional to the square root of n, rather than to the logarithm;[4] however, the Strahler number of a uniformly random binary tree, a more sensitive measure of the distance from a leaf in which a node has Strahler number i whenever it has either a child with that number or two children with number i − 1, is with high probability logarithmic.[5]
Due to their large heights, this model of equiprobable random trees is not generally used for binary search trees, but it has been applied to problems of modeling the parse trees of algebraic expressions in compiler design[6] (where the above-mentioned bound on Strahler number translates into the number of registers needed to evaluate an expression[7]) and for modeling evolutionary trees.[8] In some cases the analysis of random binary trees under the random permutation model can be automatically transferred to the uniform model.[9]

6.3.3 Random split trees

Devroye & Kruszewski (1996) generate random binary trees with n nodes by generating a real-valued random variable x in the unit interval (0,1), assigning the first xn nodes (rounded down to an integer number of nodes) to the left subtree, the next node to the root, and the remaining nodes to the right subtree, and continuing recursively in each subtree. If x is chosen uniformly at random in the interval, the result is the same as the random binary search tree generated by a random permutation of the nodes, as any node is equally likely to be chosen as root; however, this formulation allows other distributions to be used instead. For instance, in the uniformly random binary tree model, once a root is fixed each of its two subtrees must also be uniformly random, so the uniformly random model may also be generated by a different choice of distribution for x. As Devroye and Kruszewski show, by choosing a beta distribution on x and by using an appropriate choice of shape to draw each of the branches, the mathematical trees generated by this process can be used to create realistic-looking botanical trees.
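A sketch of that generation process, parameterized by the splitting distribution (random_split_tree and draw are our names, chosen for illustration; trees are encoded as nested (left, right) tuples):

import random

def random_split_tree(n, draw=random.random):
    # Draw x in [0,1); floor(x*n) nodes go to the left subtree, one
    # becomes the root, and the remaining n-1-floor(x*n) go right.
    # With a uniform draw every node is equally likely to become the
    # root, matching the random-permutation BST model; other draws
    # (e.g. a beta distribution) give other shapes.
    if n == 0:
        return None
    k = int(draw() * n)  # size of the left subtree, 0 <= k <= n-1
    return (random_split_tree(k, draw), random_split_tree(n - 1 - k, draw))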
6.3.4 Notes

[1] Hibbard (1962); Knuth (1973); Mahmoud (1992), p. 75.

[2] Robson (1979); Pittel (1985); Devroye (1986); Mahmoud (1992), pp. 91–99; Reed (2003).

[3] Martinez & Roura (1998); Seidel & Aragon (1996).

[4] Knuth (2005), p. 15.

[5] Devroye & Kruszewski (1995). That it is at most logarithmic is trivial, because the Strahler number of every tree is bounded by the logarithm of the number of its nodes.

[6] Mahmoud (1992), p. 63.

[7] Flajolet, Raoult & Vuillemin (1979).

[8] Aldous (1996).

[9] Mahmoud (1992), p. 70.

6.3.5 References

• Aldous, David (1996), "Probability distributions on cladograms", in Aldous, David; Pemantle, Robin, Random Discrete Structures, The IMA Volumes in Mathematics and its Applications, 76, Springer-Verlag, pp. 1–18.

• Devroye, Luc (1986), "A note on the height of binary search trees", Journal of the ACM, 33 (3): 489–498, doi:10.1145/5925.5930.

• Devroye, Luc; Kruszewski, Paul (1995), "A note on the Horton-Strahler number for random trees", Information Processing Letters, 56 (2): 95–99, doi:10.1016/0020-0190(95)00114-R.

• Devroye, Luc; Kruszewski, Paul (1996), "The botanical beauty of random binary trees", in Brandenburg, Franz J., Graph Drawing: 3rd Int. Symp., GD'95, Passau, Germany, September 20-22, 1995, Lecture Notes in Computer Science, 1027, Springer-Verlag, pp. 166–177, doi:10.1007/BFb0021801, ISBN 3-540-60723-4.

• Drmota, Michael (2009), Random Trees: An Interplay between Combinatorics and Probability, Springer-Verlag, ISBN 978-3-211-75355-2.

• Flajolet, P.; Raoult, J. C.; Vuillemin, J. (1979), "The number of registers required for evaluating arithmetic expressions", Theoretical Computer Science, 9 (1): 99–125, doi:10.1016/0304-3975(79)90009-4.

• Hibbard, Thomas N. (1962), "Some combinatorial properties of certain trees with applications to searching and sorting", Journal of the ACM, 9 (1): 13–28, doi:10.1145/321105.321108.

• Knuth, Donald E. (1973), "6.2.2 Binary Tree Searching", The Art of Computer Programming, III, Addison-Wesley, pp. 422–451.

• Knuth, Donald E. (2005), "Draft of Section 7.2.1.6: Generating All Trees", The Art of Computer Programming, IV.

• Mahmoud, Hosam M. (1992), Evolution of Random Search Trees, John Wiley & Sons.

• Martinez, Conrado; Roura, Salvador (1998), "Randomized binary search trees", Journal of the ACM, ACM Press, 45 (2): 288–323, doi:10.1145/274787.274812.

• Pittel, B. (1985), "Asymptotical growth of a class of random trees", Annals of Probability, 13 (2): 414–427, doi:10.1214/aop/1176993000.

• Reed, Bruce (2003), "The height of a random binary search tree", Journal of the ACM, 50 (3): 306–332, doi:10.1145/765568.765571.

• Robson, J. M. (1979), "The height of binary search trees", Australian Computer Journal, 11: 151–153.

• Seidel, Raimund; Aragon, Cecilia R. (1996), "Randomized Search Trees", Algorithmica, 16 (4/5): 464–497, doi:10.1007/s004539900061.

6.3.6 External links

• Open Data Structures - Chapter 7 - Random Binary Search Trees

6.4 Tree rotation

[Figure: Generic tree rotations.]

In discrete mathematics, tree rotation is an operation on a binary tree that changes the structure without interfering with the order of the elements. A tree rotation moves one node up in the tree and one node down. It is used to change the shape of the tree, and in particular to decrease its height by moving smaller subtrees down and larger subtrees up, resulting in improved performance of many tree operations.
There exists an inconsistency in different descriptions as to the definition of the direction of rotations. Some say that the direction of rotation reflects the direction that a node is moving upon rotation (a left child rotating into its parent's location is a right rotation) while others say that the direction of rotation reflects which subtree is rotating (a left subtree rotating into its parent's location is a left rotation, the opposite of the former). This article takes the approach of the directional movement of the rotating node.
6.4.1 Illustration

[Figure: Animation of tree rotations taking place.]

The right rotation operation as shown in the adjacent image is performed with Q as the root and hence is a right rotation on, or rooted at, Q. This operation results in a rotation of the tree in the clockwise direction. The inverse operation is the left rotation, which results in a movement in a counter-clockwise direction (the left rotation shown above is rooted at P). The key to understanding how a rotation functions is to understand its constraints. In particular, the order of the leaves of the tree (when read left to right, for example) cannot change (another way to think of it is that the order in which the leaves would be visited in an in-order traversal must be the same after the operation as before). Another constraint is the main property of a binary search tree, namely that the right child is greater than the parent and the left child is less than the parent. Notice that the right child of a left child of the root of a sub-tree (for example node B in the diagram for the tree rooted at Q) can become the left child of the root, which itself becomes the right child of the "new" root in the rotated sub-tree, without violating either of those constraints. As you can see in the diagram, the order of the leaves doesn't change. The opposite operation also preserves the order and is the second kind of rotation.
Assuming this is a binary search tree, as stated above, the elements must be interpreted as variables that can be compared to each other. The alphabetic characters to the left are used as placeholders for these variables. In the animation to the right, capital alphabetic characters are used as variable placeholders while lowercase Greek letters are placeholders for an entire set of variables. The circles represent individual nodes and the triangles represent subtrees. Each subtree could be empty, consist of a single node, or consist of any number of nodes.

6.4.2 Detailed illustration

[Figure: Pictorial description of how rotations are made.]

When a subtree is rotated, the subtree side upon which it is rotated increases its height by one node while the other subtree decreases its height. This makes tree rotations useful for rebalancing a tree.
Consider the terminology of Root for the parent node of the subtrees to rotate, Pivot for the node which will become the new parent node, RS for the side of rotation and OS for the opposite side of rotation. In the above diagram for the root Q, the RS is C and the OS is P. The pseudo code for the rotation is:

Pivot = Root.OS
Root.OS = Pivot.RS
Pivot.RS = Root
Root = Pivot

This is a constant time operation.
The programmer must also make sure that the root's parent points to the pivot after the rotation. Also, the programmer should note that this operation may result in a new root for the entire tree and take care to update pointers accordingly.

6.4.3 Inorder invariance

The tree rotation renders the inorder traversal of the binary tree invariant. This implies that the order of the elements is not affected when a rotation is performed in any part of the tree. Here are the inorder traversals of the trees shown above:

Left tree: ((A, P, B), Q, C)
Right tree: (A, P, (B, Q, C))

Computing one from the other is very simple. The following is example Python code that performs that computation:

def right_rotation(treenode):
    left, Q, C = treenode
    A, P, B = left
    return (A, P, (B, Q, C))
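The inverse transformation is symmetric; a sketch of the corresponding left rotation in the same tuple encoding (the function name is ours):

def left_rotation(treenode):
    # Inverse of right_rotation: (A, P, (B, Q, C)) -> ((A, P, B), Q, C)
    A, P, right = treenode
    B, Q, C = right
    return ((A, P, B), Q, C)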
Another way of looking at it is:

Right rotation of node Q:
    Let P be Q's left child.
    Set Q's left child to be P's right child.
    [Set P's right-child's parent to Q]
    Set P's right child to be Q.
    [Set Q's parent to P]

Left rotation of node P:
    Let Q be P's right child.
    Set P's right child to be Q's left child.
    [Set Q's left-child's parent to P]
    Set Q's left child to be P.
    [Set P's parent to Q]

All other connections are left as-is.
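The same steps in pointer form, assuming nodes with left and right attributes (the bracketed parent-pointer steps above are omitted here; rotate_right is an illustrative name):

def rotate_right(Q):
    # Returns the new subtree root P; the caller must re-attach P where
    # Q used to hang, which replaces the bracketed parent updates.
    P = Q.left
    Q.left = P.right
    P.right = Q
    return P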
There are also double rotations, which are combinations of left and right rotations. A double left rotation at X can be defined to be a right rotation at the right child of X followed by a left rotation at X; similarly, a double right rotation at X can be defined to be a left rotation at the left child of X followed by a right rotation at X.
Tree rotations are used in a number of tree data structures such as AVL trees, red-black trees, splay trees, and treaps. They require only constant time because they are local transformations: they only operate on 5 nodes, and need not examine the rest of the tree.

6.4.4 Rotations for rebalancing

[Figure: Pictorial description of how rotations cause rebalancing in an AVL tree.]

A tree can be rebalanced using rotations. After a rotation, the side of the rotation increases its height by 1 whilst the side opposite the rotation decreases its height similarly. Therefore, one can strategically apply rotations to nodes whose left child and right child differ in height by more than 1. Self-balancing binary search trees apply this operation automatically. A type of tree which uses this rebalancing technique is the AVL tree.

6.4.5 Rotation distance

The rotation distance between any two binary trees with the same number of nodes is the minimum number of rotations needed to transform one into the other. With this distance, the set of n-node binary trees becomes a metric space: the distance is symmetric, positive when given two different trees, and satisfies the triangle inequality.
It is an open problem whether there exists a polynomial time algorithm for calculating rotation distance.
Daniel Sleator, Robert Tarjan and William Thurston showed that the rotation distance between any two n-node trees (for n ≥ 11) is at most 2n − 6, and that some pairs of trees are this far apart as soon as n is sufficiently large.[1] Lionel Pournin showed that, in fact, such pairs exist whenever n ≥ 11.[2]

6.4.6 See also

• AVL tree, red-black tree, and splay tree, kinds of binary search tree data structures that use rotations to maintain balance.
• Associativity of a binary operation means that performing a tree rotation on it does not change the final result.
• The Day–Stout–Warren algorithm balances an unbalanced BST.
• Tamari lattice, a partially ordered set in which the elements can be defined as binary trees and the ordering between elements is defined by tree rotation.

6.4.7 References

[1] Sleator, Daniel D.; Tarjan, Robert E.; Thurston, William P. (1988), "Rotation distance, triangulations, and hyperbolic geometry", Journal of the American Mathematical Society, 1 (3): 647–681, doi:10.2307/1990951, JSTOR 1990951, MR 928904.

[2] Pournin, Lionel (2014), "The diameter of associahedra", Advances in Mathematics, 259: 13–42, arXiv:1207.6296, doi:10.1016/j.aim.2014.02.035, MR 3197650.

6.4.8 External links

• Java applets demonstrating tree rotations
• The AVL Tree Rotations Tutorial (RTF) by John Hargrove

6.5 Self-balancing binary search tree

[Figure: An example of an unbalanced tree; following the path from the root to a node takes an average of 3.27 node accesses.]

[Figure: The same tree after being height-balanced; the average path effort decreased to 3.00 node accesses.]

In computer science, a self-balancing (or height-balanced) binary search tree is any node-based binary search tree that automatically keeps its height (maximal number of levels below the root) small in the face of arbitrary item insertions and deletions.[1]
These structures provide efficient implementations for mutable ordered lists, and can be used for other abstract data structures such as associative arrays, priority queues and sets.
The red–black tree, which is a type of self-balancing binary search tree, was called a symmetric binary B-tree[2] and was renamed, but can still be confused with the generic concept of self-balancing binary search tree because of the initials.

6.5.1 Overview

[Figure: Tree rotations are very common internal operations on self-balancing binary trees to keep perfect or near-to-perfect balance.]

Most operations on a binary search tree (BST) take time directly proportional to the height of the tree, so it is desirable to keep the height small. A binary tree with height h can contain at most 2^0 + 2^1 + ··· + 2^h = 2^(h+1) − 1 nodes. It follows that for a tree with n nodes and height h:

    n ≤ 2^(h+1) − 1

And that implies:

    h ≥ ⌈log₂(n + 1) − 1⌉ ≥ ⌊log₂ n⌋.

In other words, the minimum height of a tree with n nodes is log₂(n), rounded down; that is, ⌊log₂ n⌋.[1]
However, the simplest algorithms for BST item insertion may yield a tree with height n in rather common situations. For example, when the items are inserted in sorted key order, the tree degenerates into a linked list with n nodes. The difference in performance between the two situations may be enormous: for n = 1,000,000, for example, the minimum height is ⌊log₂(1,000,000)⌋ = 19.
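A quick check of that arithmetic:

import math

def minimum_height(n):
    # Minimum height of a binary tree on n nodes: floor(log2(n)).
    return math.floor(math.log2(n))

print(minimum_height(1_000_000))  # prints 19, as in the example above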

If the data items are known ahead of time, the height can be kept small, in the average sense, by adding values in a random order, resulting in a random binary search tree. However, there are many situations (such as online algorithms) where this randomization is not viable.
Self-balancing binary trees solve this problem by performing transformations on the tree (such as tree rotations) at key insertion times, in order to keep the height proportional to log₂(n). Although a certain overhead is involved, it may be justified in the long run by ensuring fast execution of later operations.
Maintaining the height always at its minimum value ⌊log₂ n⌋ is not always viable; it can be proven that any insertion algorithm which did so would have an excessive overhead. Therefore, most self-balanced BST algorithms keep the height within a constant factor of this lower bound.

In the asymptotic ("Big-O") sense, a self-balancing BST structure containing n items allows the lookup, insertion, and removal of an item in O(log n) worst-case time, and ordered enumeration of all items in O(n) time. For some implementations these are per-operation time bounds, while for others they are amortized bounds over a sequence of operations. These times are asymptotically optimal among all data structures that manipulate the key only through comparisons.

6.5.2 Implementations

Popular data structures implementing this type of tree include:

• 2-3 tree
• AA tree
• AVL tree
• Red-black tree
• Scapegoat tree
• Splay tree
• Treap

6.5.3 Applications

Self-balancing binary search trees can be used in a natural way to construct and maintain ordered lists, such as priority queues. They can also be used for associative arrays; key-value pairs are simply inserted with an ordering based on the key alone. In this capacity, self-balancing BSTs have a number of advantages and disadvantages over their main competitor, hash tables. One advantage of self-balancing BSTs is that they allow fast (indeed, asymptotically optimal) enumeration of the items in key order, which hash tables do not provide. One disadvantage is that their lookup algorithms get more complicated when there may be multiple items with the same key. Self-balancing BSTs have better worst-case lookup performance than hash tables (O(log n) compared to O(n)), but have worse average-case performance (O(log n) compared to O(1)).
Self-balancing BSTs can be used to implement any algorithm that requires mutable ordered lists, to achieve optimal worst-case asymptotic performance. For example, if binary tree sort is implemented with a self-balanced BST, we have a very simple-to-describe yet asymptotically optimal O(n log n) sorting algorithm. Similarly, many algorithms in computational geometry exploit variations on self-balancing BSTs to solve problems such as the line segment intersection problem and the point location problem efficiently. (For average-case performance, however, self-balanced BSTs may be less efficient than other solutions. Binary tree sort, in particular, is likely to be slower than merge sort, quicksort, or heapsort, because of the tree-balancing overhead as well as cache access patterns.)
Self-balancing BSTs are flexible data structures, in that it's easy to extend them to efficiently record additional information or perform new operations. For example, one can record the number of nodes in each subtree having a certain property, allowing one to count the number of nodes in a certain key range with that property in O(log n) time. These extensions can be used, for example, to optimize database queries or other list-processing algorithms.

6.5.4 See also

• Search data structure
• Day–Stout–Warren algorithm
• Fusion tree
• Skip list
• Sorting

6.5.5 References

[1] Donald Knuth. The Art of Computer Programming, Volume 3: Sorting and Searching, Second Edition. Addison-Wesley, 1998. ISBN 0-201-89685-0. Section 6.2.3: Balanced Trees, pp. 458–481.

[2] Paul E. Black, "red-black tree", in Dictionary of Algorithms and Data Structures [online], Vreda Pieterse and Paul E. Black, eds. 13 April 2015. (accessed 03 October 2016) Available from: https://xlinux.nist.gov/dads/HTML/redblack.html

6.5.6 External links

• Dictionary of Algorithms and Data Structures: Height-balanced binary search tree
• GNU libavl, a LGPL-licensed library of binary tree implementations in C, with documentation

6.6 Treap

In computer science, the treap and the randomized binary search tree are two closely related forms of binary search tree data structures that maintain a dynamic set of ordered keys and allow binary searches among the keys. After any sequence of insertions and deletions of keys, the shape of the tree is a random variable with the same probability distribution as a random binary tree; in particular, with high probability its height is proportional to the logarithm of the number of keys, so that each search, insertion, or deletion operation takes logarithmic time to perform.

6.6.1 Description

[Figure: A treap with alphabetic key and numeric max heap order.]

The treap was first described by Cecilia R. Aragon and Raimund Seidel in 1989;[1][2] its name is a portmanteau of tree and heap. It is a Cartesian tree in which each key is given a (randomly chosen) numeric priority. As with any binary search tree, the inorder traversal order of the nodes is the same as the sorted order of the keys. The structure of the tree is determined by the requirement that it be heap-ordered: that is, the priority number for any non-leaf node must be greater than or equal to the priority of its children. Thus, as with Cartesian trees more generally, the root node is the maximum-priority node, and its left and right subtrees are formed in the same manner from the subsequences of the sorted order to the left and right of that node.
An equivalent way of describing the treap is that it could be formed by inserting the nodes highest-priority-first into a binary search tree without doing any rebalancing. Therefore, if the priorities are independent random numbers (from a distribution over a large enough space of possible priorities to ensure that two nodes are very unlikely to have the same priority) then the shape of a treap has the same probability distribution as the shape of a random binary search tree, a search tree formed by inserting the nodes without rebalancing in a randomly chosen insertion order. Because random binary search trees are known to have logarithmic height with high probability, the same is true for treaps.
Aragon and Seidel also suggest assigning higher priorities to frequently accessed nodes, for instance by a process that, on each access, chooses a random number and replaces the priority of the node with that number if it is higher than the previous priority. This modification would cause the tree to lose its random shape; instead, frequently accessed nodes would be more likely to be near the root of the tree, causing searches for them to be faster.
Naor and Nissim[3] describe an application in maintaining authorization certificates in public-key cryptosystems.

6.6.2 Operations

Treaps support the following basic operations:

• To search for a given key value, apply a standard binary search algorithm in a binary search tree, ignoring the priorities.
• To insert a new key x into the treap, generate a random priority y for x. Binary search for x in the tree, and create a new node at the leaf position where the binary search determines a node for x should exist. Then, as long as x is not the root of the tree and has a larger priority number than its parent z, perform a tree rotation that reverses the parent-child relation between x and z.
• To delete a node x from the treap, if x is a leaf of the tree, simply remove it. If x has a single child z, remove x from the tree and make z be the child of the parent of x (or make z the root of the tree if x had no parent). Finally, if x has two children, swap its position in the tree with the position of its immediate successor z in the sorted order, resulting in one of the previous cases. In this final case, the swap may violate the heap-ordering property for z, so additional rotations may need to be performed to restore this property.
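A sketch of the insertion operation in Python, combining leaf insertion with the rotations described above (the class and function names are illustrative, not from the text):

import random

class TreapNode:
    def __init__(self, key):
        self.key = key
        self.priority = random.random()  # the randomly chosen priority
        self.left = None
        self.right = None

def rotate_right(q):
    p = q.left
    q.left, p.right = p.right, q
    return p

def rotate_left(p):
    q = p.right
    p.right, q.left = q.left, p
    return q

def treap_insert(node, key):
    # Ordinary BST insertion at a leaf, then rotate the new node up
    # while its priority exceeds its parent's (max-heap order).
    if node is None:
        return TreapNode(key)
    if key < node.key:
        node.left = treap_insert(node.left, key)
        if node.left.priority > node.priority:
            node = rotate_right(node)
    else:
        node.right = treap_insert(node.right, key)
        if node.right.priority > node.priority:
            node = rotate_left(node)
    return node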
Bulk operations

In addition to the single-element insert, delete and lookup operations, several fast "bulk" operations have been defined on treaps: union, intersection and set difference. These rely on two helper operations, split and merge.

the treap with maximum priority—larger than the root of the tree, and otherwise it calls the insertion proce-
priority of any node in the treap. After this inser- dure recursively to insert x within the left or right subtree
tion, x will be the root node of the treap, all values (depending on whether its key is less than or greater than
less than x will be found in the left subtreap, and all the root). The numbers of descendants are used by the
values greater than x will be found in the right sub- algorithm to calculate the necessary probabilities for the
treap. This costs as much as a single insertion into random choices at each step. Placing x at the root of a
the treap. subtree may be performed either as in the treap by in-
serting it at a leaf and then rotating it upwards, or by an
• Merging two treaps that are the product of a former alternative algorithm described by Martínez and Roura
split, one can safely assume that the greatest value that splits the subtree into two pieces to be used as the
in the first treap is less than the smallest value in left and right children of the new node.
the second treap. Create a new node with value x,
such that x is larger than this max-value in the first The deletion procedure for a randomized binary search
treap, and smaller than the min-value in the second tree uses the same information per node as the insertion
treap, assign it the minimum priority, then set its left procedure, and like the insertion procedure it makes a se-
child to the first heap and its right child to the sec- quence of O(log n) random decisions in order to join the
ond heap. Rotate as necessary to fix the heap order. two subtrees descending from the left and right children
After that it will be a leaf node, and can easily be of the deleted node into a single tree. If the left or right
deleted. The result is one treap merged from the two subtree of the node to be deleted is empty, the join op-
original treaps. This is effectively “undoing” a split, eration is trivial; otherwise, the left or right child of the
and costs the same. deleted node is selected as the new subtree root with prob-
ability proportional to its number of descendants, and the
join proceeds recursively.
The union of two treaps t 1 and t 2 , representing sets A
and B is a treap t that represents A ∪ B. The following
recursive algorithm computes the union:
function union(t1 , t2 ): if t1 = nil: return t2 if t2 = nil:
6.6.4 Comparison
return t1 if priority(t1 ) < priority(t2 ): swap t1 and t2
t<, t> ← split t2 on key(t1 ) return new node(key(t1 ), The information stored per node in the randomized bi-
union(left(t1 ), t<), union(right(t1 ), t>)) nary tree is simpler than in a treap (a small integer rather
than a high-precision random number), but it makes a
Here, split is presumed to return two trees: one hold- greater number of calls to the random number generator
ing the keys less its input key, one holding the greater (O(log n) calls per insertion or deletion rather than one
keys. (The algorithm is non-destructive, but an in-place call per insertion) and the insertion procedure is slightly
destructive version exists as well.) more complicated due to the need to update the numbers
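A sketch of this insertion procedure, storing subtree sizes per node and placing the new root via splitting, in the style Martínez and Roura describe (all names here are illustrative):

import random

class RNode:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.size = 1  # number of descendants, counting itself

def size(node):
    return node.size if node else 0

def update(node):
    node.size = 1 + size(node.left) + size(node.right)
    return node

def split(node, key):
    # Split a subtree into (keys < key, keys >= key).
    if node is None:
        return None, None
    if node.key < key:
        less, geq = split(node.right, key)
        node.right = less
        return update(node), geq
    else:
        less, geq = split(node.left, key)
        node.left = geq
        return less, update(node)

def rbst_insert(node, key):
    # With probability 1/(n+1), place the new key at the root of this
    # subtree, splitting the old subtree into its two children;
    # otherwise recurse into the appropriate child.
    n = size(node)
    if random.randrange(n + 1) == 0:
        root = RNode(key)
        root.left, root.right = split(node, key)
        return update(root)
    if key < node.key:
        node.left = rbst_insert(node.left, key)
    else:
        node.right = rbst_insert(node.right, key)
    return update(node)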
The deletion procedure for a randomized binary search tree uses the same information per node as the insertion procedure, and like the insertion procedure it makes a sequence of O(log n) random decisions in order to join the two subtrees descending from the left and right children of the deleted node into a single tree. If the left or right subtree of the node to be deleted is empty, the join operation is trivial; otherwise, the left or right child of the deleted node is selected as the new subtree root with probability proportional to its number of descendants, and the join proceeds recursively.

6.6.4 Comparison

The information stored per node in the randomized binary tree is simpler than in a treap (a small integer rather than a high-precision random number), but it makes a greater number of calls to the random number generator (O(log n) calls per insertion or deletion rather than one call per insertion) and the insertion procedure is slightly more complicated due to the need to update the numbers of descendants per node. A minor technical difference is that, in a treap, there is a small probability of a collision (two keys getting the same priority), and in both cases there will be statistical differences between a true random number generator and the pseudo-random number generator typically used on digital computers. However, in any case the differences between the theoretical model of perfect random choices used to design the algorithm and the capabilities of actual random number generators are vanishingly small.
Although the treap and the randomized binary search tree both have the same random distribution of tree shapes after each update, the history of modifications to the trees performed by these two data structures over a sequence of insertion and deletion operations may be different. For instance, in a treap, if the three numbers 1, 2, and 3 are inserted in the order 1, 3, 2, and then the number 2 is deleted, the remaining two nodes will have the same parent-child relationship that they did prior to the insertion of the middle number. In a randomized binary search tree, the tree after the deletion is equally likely to be either of the two possible trees on its two nodes, independently of what the tree looked like prior to the insertion of the middle number.

6.6.5 See also

• Finger search

6.6.6 References

[1] Aragon, Cecilia R.; Seidel, Raimund (1989), "Randomized Search Trees" (PDF), Proc. 30th Symp. Foundations of Computer Science (FOCS 1989), Washington, D.C.: IEEE Computer Society Press, pp. 540–545, doi:10.1109/SFCS.1989.63531, ISBN 0-8186-1982-1

[2] Seidel, Raimund; Aragon, Cecilia R. (1996), "Randomized Search Trees", Algorithmica, 16 (4/5): 464–497, doi:10.1007/s004539900061

[3] Naor, M.; Nissim, K. (April 2000), "Certificate revocation and certificate update" (PDF), IEEE Journal on Selected Areas in Communications, 18 (4): 561–570, doi:10.1109/49.839932.

[4] Blelloch, Guy E.; Reid-Miller, Margaret (1998), "Fast set operations using treaps", Proc. 10th ACM Symp. Parallel Algorithms and Architectures (SPAA 1998), New York, NY, USA: ACM, pp. 16–26, doi:10.1145/277651.277660, ISBN 0-89791-989-0.

[5] Martínez, Conrado; Roura, Salvador (1997), "Randomized binary search trees", Journal of the ACM, 45 (2): 288–323, doi:10.1145/274787.274812

6.6.7 External links

• Collection of treap references and info by Cecilia Aragon
• Open Data Structures - Section 7.2 - Treap: A Randomized Binary Search Tree
• Treap Applet by Kubo Kovac
• Animated treap
• Randomized binary search trees. Lecture notes from a course by Jeff Erickson at UIUC. Despite the title, this is primarily about treaps and skip lists; randomized binary search trees are mentioned only briefly.
• A high performance key-value store based on treap by Junyi Sun
• VB6 implementation of treaps. Visual Basic 6 implementation of treaps as a COM object.
• ActionScript3 implementation of a treap
• Pure Python and Cython in-memory treap and duptreap
• Treaps in C#. By Roy Clemmons
• Pure Go in-memory, immutable treaps
• Pure Go persistent treap key-value storage library

6.7 AVL tree

[Fig. 1: AVL tree with balance factors (green)]

In computer science, an AVL tree is a self-balancing binary search tree. It was the first such data structure to be invented.[2] In an AVL tree, the heights of the two child subtrees of any node differ by at most one; if at any time they differ by more than one, rebalancing is done to restore this property. Lookup, insertion, and deletion all take O(log n) time in both the average and worst cases, where n is the number of nodes in the tree prior to the operation. Insertions and deletions may require the tree to be rebalanced by one or more tree rotations.
The AVL tree is named after its two Soviet inventors, Georgy Adelson-Velsky and Evgenii Landis, who published it in their 1962 paper "An algorithm for the organization of information".[3]
AVL trees are often compared with red–black trees because both support the same set of operations and take O(log n) time for the basic operations. For lookup-intensive applications, AVL trees are faster than red–black trees because they are more strictly balanced.[4] Similar to red–black trees, AVL trees are height-balanced. Both are, in general, neither weight-balanced nor μ-balanced for any μ ≤ 1/2;[5] that is, sibling nodes can have hugely differing numbers of descendants.

6.7.1 Definition

Balance factor

In a binary tree the balance factor of a node N is defined to be the height difference

    BalanceFactor(N) := Height(RightSubtree(N)) − Height(LeftSubtree(N))[6]

of its two child subtrees. A binary tree is defined to be an AVL tree if the invariant

    BalanceFactor(N) ∈ {−1, 0, +1}

holds for every node N in the tree.

A node N with BalanceFactor(N) < 0 is called "left-heavy", one with BalanceFactor(N) > 0 is called "right-heavy", and one with BalanceFactor(N) = 0 is sometimes simply called "balanced".
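For illustration, the definition can be written out directly as a naive checker that recomputes heights (real AVL implementations store the balance information instead; the names here are ours):

def height(node):
    # Convention: an empty subtree has height -1, a leaf height 0.
    if node is None:
        return -1
    return 1 + max(height(node.left), height(node.right))

def balance_factor(node):
    # BalanceFactor(N) = Height(RightSubtree(N)) - Height(LeftSubtree(N))
    return height(node.right) - height(node.left)

def is_avl(node):
    # The AVL invariant: balance factor in {-1, 0, +1} at every node.
    if node is None:
        return True
    return (balance_factor(node) in (-1, 0, 1)
            and is_avl(node.left) and is_avl(node.right))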
However, exploring all n nodes of the tree in this manner
would visit each link exactly twice: one downward visit
Remark to enter the subtree rooted by that node, another visit up-
ward to leave that node’s subtree after having explored it.
In the sequel, because there is a one-to-one correspon- And since there are n−1 links in any tree, the amortized
dence between nodes and the subtrees rooted by them, cost is found to be 2×(n−1)/n, or approximately 2.
we sometimes leave it to the context whether the name of
an object stands for the node or the subtree.
Insert

Properties When inserting an element into an AVL tree, you initially


follow the same process as inserting into a Binary Search
Balance factors can be kept up-to-date by knowing the Tree. After inserting a node, it is necessary to check each
previous balance factors and the change in height – it of the node’s ancestors for consistency with the invariants
is not necessary to know the absolute height. For hold- of AVL trees: this is called “retracing”. This is achieved
ing the AVL balance information, two bits per node are by considering the balance factor of each node.[9][10]
sufficient.[7]
The height h of an AVL tree with n nodes lies in the
interval:[8]

log2 (n+1) ≤ h < c log2 (n+2)+b


α γ
1
with the golden ratio φ := (1+√5) ⁄2 ≈ 1.618, c := ⁄ log2
φ ≈ 1.44, and b := c ⁄2 log2 5 – 2 ≈ –0.328. This is be-
cause an AVL tree of height h contains at least Fh₊₂ –
1 nodes where {Fh} is the Fibonacci sequence with the β γ α β
seed values F1 = 1, F2 = 1.
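As an illustration of this bound, the following short C program (an addition of this text, not part of the original article) tabulates the minimal number of nodes N(h) of an AVL tree of height h using the Fibonacci-style recurrence N(h) = 1 + N(h−1) + N(h−2), which equals F(h+2) − 1:

#include <stdio.h>

/* Minimal number of nodes of an AVL tree of height h:
   N(1) = 1, N(2) = 2, N(h) = 1 + N(h-1) + N(h-2) = F(h+2) - 1. */
int main(void) {
    long n_prev = 1, n_cur = 2; /* N(1), N(2) */
    printf("h = 1  minimal n = 1\n");
    printf("h = 2  minimal n = 2\n");
    for (int h = 3; h <= 20; h++) {
        long n_next = 1 + n_cur + n_prev;
        n_prev = n_cur;
        n_cur = n_next;
        printf("h = %d  minimal n = %ld\n", h, n_cur);
    }
    return 0;
}

Inverting the growth of N(h) is what yields the c·log2(n+2) + b upper bound on the height quoted above.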
6.7.2 Operations

Read-only operations of an AVL tree involve carrying out the same actions as would be carried out on an unbalanced binary search tree, but modifications have to observe and restore the height balance of the subtrees.

Searching

Searching for a specific key in an AVL tree can be done the same way as in a normal unbalanced binary search tree. In order for search to work effectively it has to employ a comparison function which establishes a total order (or at least a total preorder) on the set of keys. The number of comparisons required for a successful search is limited by the height h, and for an unsuccessful search it is very close to h, so both are in O(log n).

Traversal

Once a node has been found in an AVL tree, the next or previous node can be accessed in amortized constant time. Some instances of exploring these “nearby” nodes require traversing up to h ∝ log(n) links (particularly when navigating from the rightmost leaf of the root’s left subtree to the root or from the root to the leftmost leaf of the root’s right subtree; in the AVL tree of figure 1, moving from node P to the next but one node Q takes 3 steps). However, exploring all n nodes of the tree in this manner would visit each link exactly twice: one downward visit to enter the subtree rooted by that node, another visit upward to leave that node’s subtree after having explored it. And since there are n−1 links in any tree, the amortized cost is 2×(n−1)/n, or approximately 2.
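A sketch of the in-order successor that such a traversal relies on, reusing the parent-pointer node structure from the sketch above (again an illustrative addition, not the article's code):

/* In-order successor of n, or NULL if n holds the maximum key.
   Amortized over a full traversal, each link is crossed twice. */
struct node *successor(struct node *n) {
    if (n->right != NULL) { /* leftmost node of the right subtree */
        n = n->right;
        while (n->left != NULL) n = n->left;
        return n;
    }
    struct node *p = n->parent; /* climb until we arrive from a left child */
    while (p != NULL && n == p->right) {
        n = p;
        p = p->parent;
    }
    return p;
}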
Insert

When inserting an element into an AVL tree, you initially follow the same process as inserting into a binary search tree. After inserting a node, it is necessary to check each of the node’s ancestors for consistency with the invariants of AVL trees: this is called “retracing”. This is achieved by considering the balance factor of each node.[9][10]

Fig. 2: Tree rotations

Since with a single insertion the height of an AVL subtree cannot increase by more than one, the temporary balance factor of a node after an insertion will be in the range [–2, +2]. For each node checked, if the temporary balance factor remains in the range from –1 to +1 then only an update of the balance factor and no rotation is necessary. However, if the temporary balance factor becomes less than –1 or greater than +1, the subtree rooted at this node is AVL unbalanced, and a rotation is needed. The various cases of rotations are described in section Rebalancing.
By inserting the new node Z as a child of node X, the height of that subtree Z increases from 0 to 1.

Invariant of the retracing loop for an insertion

The height of the subtree rooted by Z has increased by 1. It is already in AVL shape.

for (X = parent(Z); X != null; X = parent(Z)) { // Loop (possibly up to the root)
    // BalanceFactor(X) has to be updated:
    if (Z == right_child(X)) { // The right subtree increases
        if (BalanceFactor(X) > 0) { // X is right-heavy
            // ===> the temporary BalanceFactor(X) == +2
            // ===> rebalancing is required.
            G = parent(X); // Save parent of X around rotations
            if (BalanceFactor(Z) < 0) // Right Left Case (see figure 5)
                N = rotate_RightLeft(X, Z); // Double rotation: Right(Z) then Left(X)
            else // Right Right Case (see figure 4)
                N = rotate_Left(X, Z); // Single rotation Left(X)
            // After rotation adapt parent link
        } else {
            if (BalanceFactor(X) < 0) {
                BalanceFactor(X) = 0; // Z's height increase is absorbed at X.
                break; // Leave the loop
            }
            BalanceFactor(X) = +1;
            Z = X; // Height(Z) increases by 1
            continue;
        }
    } else { // Z == left_child(X): the left subtree increases
        if (BalanceFactor(X) < 0) { // X is left-heavy
            // ===> the temporary BalanceFactor(X) == –2
            // ===> rebalancing is required.
            G = parent(X); // Save parent of X around rotations
            if (BalanceFactor(Z) > 0) // Left Right Case
                N = rotate_LeftRight(X, Z); // Double rotation: Left(Z) then Right(X)
            else // Left Left Case
                N = rotate_Right(X, Z); // Single rotation Right(X)
            // After rotation adapt parent link
        } else {
            if (BalanceFactor(X) > 0) {
                BalanceFactor(X) = 0; // Z's height increase is absorbed at X.
                break; // Leave the loop
            }
            BalanceFactor(X) = –1;
            Z = X; // Height(Z) increases by 1
            continue;
        }
    }
    // After a rotation adapt parent link:
    // N is the new root of the rotated subtree
    // Height does not change: Height(N) == old Height(X)
    parent(N) = G;
    if (G != null) {
        if (X == left_child(G)) left_child(G) = N;
        else right_child(G) = N;
        break;
    } else {
        tree->root = N; // N is the new root of the total tree
        break;
    }
    // There is no fall-through, only break or continue.
}
// Unless the loop is left via break, the height of the total tree increases by 1.

In order to update the balance factors of all nodes, first observe that all nodes requiring correction lie on the child-to-parent path from the inserted leaf towards the root. If the above procedure is applied to nodes along this path, starting from the leaf, then every node in the tree will again have a balance factor of −1, 0, or 1.
The retracing can stop if the balance factor becomes 0, implying that the height of that subtree remains unchanged.
If the balance factor becomes ±1 then the height of the subtree increases by one and the retracing needs to continue.
If the balance factor temporarily becomes ±2, this has to be repaired by an appropriate rotation, after which the subtree has the same height as before (and its root the balance factor 0).
The time required is O(log n) for lookup, plus a maximum of O(log n) retracing levels (O(1) on average) on the way back to the root, so the operation can be completed in O(log n) time.
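Many implementations store subtree heights instead of balance factors and perform the retracing implicitly while unwinding a recursive insertion. The following self-contained C sketch is equivalent in effect to the loop above, but it is this text's own illustration: the names ht, fix, rot_left, rot_right, rebalance and avl_insert are inventions of this example, not the article's code.

#include <stdlib.h>

struct node { struct node *left, *right; int key, height; };

static int ht(struct node *t) { return t ? t->height : 0; }
static int maxi(int a, int b) { return a > b ? a : b; }
static void fix(struct node *t) { t->height = 1 + maxi(ht(t->left), ht(t->right)); }

static struct node *rot_right(struct node *x) { /* left child becomes root */
    struct node *z = x->left;
    x->left = z->right; z->right = x;
    fix(x); fix(z);
    return z;
}
static struct node *rot_left(struct node *x) { /* right child becomes root */
    struct node *z = x->right;
    x->right = z->left; z->left = x;
    fix(x); fix(z);
    return z;
}

/* Restore the AVL invariant at t after one child changed height by at most 1. */
static struct node *rebalance(struct node *t) {
    fix(t);
    if (ht(t->left) - ht(t->right) > 1) {        /* left-heavy by 2 */
        if (ht(t->left->right) > ht(t->left->left))
            t->left = rot_left(t->left);         /* Left Right case */
        t = rot_right(t);                        /* Left Left case */
    } else if (ht(t->right) - ht(t->left) > 1) { /* right-heavy by 2 */
        if (ht(t->right->left) > ht(t->right->right))
            t->right = rot_right(t->right);      /* Right Left case */
        t = rot_left(t);                         /* Right Right case */
    }
    return t;
}

struct node *avl_insert(struct node *t, int key) {
    if (t == NULL) {
        t = calloc(1, sizeof *t);
        t->key = key; t->height = 1;
        return t;
    }
    if (key < t->key)      t->left  = avl_insert(t->left, key);
    else if (key > t->key) t->right = avl_insert(t->right, key);
    return rebalance(t); /* retracing happens while unwinding the recursion */
}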
Delete

The preliminary steps for deleting a node are described in section Binary search tree#Deletion. There, the effective deletion of the subject node or the replacement node decreases the height of the corresponding child tree either from 1 to 0 or from 2 to 1, if that node had a child.
Starting at this subtree, it is necessary to check each of the ancestors for consistency with the invariants of AVL trees. This is called “retracing”.
Since with a single deletion the height of an AVL subtree cannot decrease by more than one, the temporary balance factor of a node will be in the range from −2 to +2. If the balance factor remains in the range from −1 to +1, it can be adjusted in accord with the AVL rules. If it becomes ±2 then the subtree is unbalanced and needs to be rotated. The various cases of rotations are described in section Rebalancing.

Invariant of the retracing loop for a deletion

The height of the subtree rooted by N has decreased by 1. It is already in AVL shape.

for (X = parent(N); X != null; X = G) { // Loop (possibly up to the root)
    G = parent(X); // Save parent of X around rotations
    // BalanceFactor(X) has not yet been updated!
    if (N == left_child(X)) { // the left subtree decreases
        if (BalanceFactor(X) > 0) { // X is right-heavy
            // ===> the temporary BalanceFactor(X) == +2
            // ===> rebalancing is required.
            Z = right_child(X); // Sibling of N (higher by 2)
            b = BalanceFactor(Z);
            if (b < 0) // Right Left Case (see figure 5)
                N = rotate_RightLeft(X, Z); // Double rotation: Right(Z) then Left(X)
            else // Right Right Case (see figure 4)
                N = rotate_Left(X, Z); // Single rotation Left(X)
            // After rotation adapt parent link
        } else {
            if (BalanceFactor(X) == 0) {
                BalanceFactor(X) = +1; // N's height decrease is absorbed at X.
                break; // Leave the loop
            }
            N = X;
            BalanceFactor(N) = 0; // Height(N) decreases by 1
            continue;
        }
    } else { // (N == right_child(X)): The right subtree decreases
        if (BalanceFactor(X) < 0) { // X is left-heavy
            // ===> the temporary BalanceFactor(X) == –2
            // ===> rebalancing is required.
            Z = left_child(X); // Sibling of N (higher by 2)
            b = BalanceFactor(Z);
            if (b > 0) // Left Right Case
                N = rotate_LeftRight(X, Z); // Double rotation: Left(Z) then Right(X)
            else // Left Left Case
                N = rotate_Right(X, Z); // Single rotation Right(X)
            // After rotation adapt parent link
        } else {
            if (BalanceFactor(X) == 0) {
                BalanceFactor(X) = –1; // N's height decrease is absorbed at X.
                break; // Leave the loop
            }
            N = X;
            BalanceFactor(N) = 0; // Height(N) decreases by 1
            continue;
        }
    }
    // After a rotation adapt parent link:
    // N is the new root of the rotated subtree
    parent(N) = G;
    if (G != null) {
        if (X == left_child(G)) left_child(G) = N;
        else right_child(G) = N;
        if (b == 0) break; // Height does not change: Leave the loop
    } else {
        tree->root = N; // N is the new root of the total tree
        continue;
    }
    // Height(N) decreases by 1 (== old Height(X)−1)
}
// Unless the loop is left via break, the height of the total tree decreases by 1.

The retracing can stop if the balance factor becomes ±1, meaning that the height of that subtree remains unchanged.
If the balance factor becomes 0 then the height of the subtree decreases by one and the retracing needs to continue.
If the balance factor temporarily becomes ±2, this has to be repaired by an appropriate rotation. It depends on the balance factor of the sibling Z (the higher child tree) whether the height of the subtree decreases by one or does not change (the latter, if Z has the balance factor 0).
The time required is O(log n) for lookup, plus a maximum of O(log n) retracing levels (O(1) on average) on the way back to the root, so the operation can be completed in O(log n) time.

Set operations and bulk operations

In addition to the single-element insert, delete and lookup operations, several set operations have been defined on AVL trees: union, intersection and set difference. Fast bulk operations for insertion or deletion can then be implemented based on these set functions. These set operations rely on two helper operations, Split and Join. With the new operations, the implementation of AVL trees can be more efficient and highly parallelizable.[11]

• Join: The function Join takes two AVL trees t1 and t2 and a key k, and returns a tree containing all elements in t1 and t2 as well as k. It requires k to be greater than all keys in t1 and smaller than all keys in t2. If the two trees differ in height by at most one, Join simply creates a new node with left subtree t1, root k and right subtree t2. Otherwise, suppose that t1 is higher than t2 by more than one (the other case is symmetric). Join follows the right spine of t1 until a node c which is balanced with t2. At this point a new node with left child c, root k and right child t2 is created to replace c. The new node satisfies the AVL invariant, and its height is one greater than that of c. The increase in height can increase the height of its ancestors, possibly invalidating the AVL invariant of those nodes. This can be fixed either with a double rotation if invalid at the parent or a single left rotation if invalid higher in the tree, in both cases restoring the height for any further ancestor nodes. Join will therefore require at most two rotations. The cost of this function is the difference of the heights between the two input trees. (A sketch of Join in C follows this list.)

• Split: To split an AVL tree into two smaller trees, those smaller than key x and those larger than key x, first draw a path from the root by inserting x into the AVL tree. After this insertion, all values less than x will be found on the left of the path, and all values greater than x will be found on the right. By applying Join, all the subtrees on the left side are merged bottom-up, using keys on the path as intermediate nodes from bottom to top, to form the left tree; the right part is built symmetrically. The cost of Split is of order O(log n), the height of the tree.
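The following C sketch of Join uses stored subtree heights rather than balance factors. It is this text's illustration, not the book's code; it reuses struct node, ht, fix, rebalance and the rotation helpers from the recursive insertion sketch above, and adds the hypothetical helper mk:

/* Make a node with children l and r; assumes |ht(l) - ht(r)| <= 1. */
struct node *mk(struct node *l, int k, struct node *r) {
    struct node *t = calloc(1, sizeof *t);
    t->key = k;
    t->left = l;
    t->right = r;
    fix(t);
    return t;
}

/* join(t1, k, t2): every key in t1 < k < every key in t2. */
struct node *join(struct node *t1, int k, struct node *t2) {
    if (ht(t1) > ht(t2) + 1) {        /* descend the right spine of t1 */
        t1->right = join(t1->right, k, t2);
        return rebalance(t1);         /* at most height excess 2 per level */
    }
    if (ht(t2) > ht(t1) + 1) {        /* symmetric: descend the left spine of t2 */
        t2->left = join(t1, k, t2->left);
        return rebalance(t2);
    }
    return mk(t1, k, t2);             /* heights differ by at most one */
}

The recursion stops exactly at the node balanced with the shorter tree, so the work done is proportional to the height difference of the inputs, matching the cost stated in the Join bullet.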
The union of two AVL trees t1 and t2, representing sets A and B, is an AVL tree t that represents A ∪ B. The following recursive function computes this union:

function union(t1, t2):
    if t1 = nil: return t2
    if t2 = nil: return t1
    (t<, t>) ← split t2 on t1.root
    return join(t1.root, union(left(t1), t<), union(right(t1), t>))

Here, Split is presumed to return two trees: one holding the keys less than its input key, one holding the greater keys. (The algorithm is non-destructive, but an in-place destructive version exists as well.)
The algorithm for intersection or difference is similar, but requires the Join2 helper routine, which is the same as Join but without the middle key. Based on the new functions for union, intersection or difference, either one key or multiple keys can be inserted to or deleted from the AVL tree. Since Split calls Join but does not deal with the balancing criteria of AVL trees directly, such an implementation is usually called the “join-based” implementation.
The complexity of each of union, intersection and difference is O(m log(n/m + 1)) for AVL trees of sizes m and n (≥ m). More importantly, since the recursive calls to union, intersection or difference are independent of each other, they can be executed in parallel with a parallel depth O(log m log n).[11] When m = 1, the join-based implementation has the same computational DAG as single-element insertion and deletion.
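For concreteness, here is a matching sketch of Split and of the union routine above in the same illustrative C setting, reusing join and struct node from the previous sketch. The struct pair helper is an invention of this example, and tree_union is so named because union is a C keyword; this destructive variant consumes its inputs:

struct pair { struct node *lo, *hi; };

/* split(t, k): lo holds the keys of t less than k, hi the keys greater than k. */
struct pair split(struct node *t, int k) {
    struct pair p = { NULL, NULL };
    if (t == NULL) return p;
    if (k < t->key) {
        struct pair q = split(t->left, k);
        p.lo = q.lo;
        p.hi = join(q.hi, t->key, t->right);
    } else if (k > t->key) {
        struct pair q = split(t->right, k);
        p.lo = join(t->left, t->key, q.lo);
        p.hi = q.hi;
    } else {            /* k found: it belongs to neither side */
        p.lo = t->left;
        p.hi = t->right;
    }
    return p;
}

/* Union of two AVL trees, following the recursive scheme above. */
struct node *tree_union(struct node *t1, struct node *t2) {
    if (t1 == NULL) return t2;
    if (t2 == NULL) return t1;
    struct pair p = split(t2, t1->key);
    return join(tree_union(t1->left, p.lo), t1->key, tree_union(t1->right, p.hi));
}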
6.7.3 Rebalancing

If during a modifying operation (e.g. insert, delete) a (temporary) height difference of more than one arises between two child subtrees, the parent subtree has to be “rebalanced”. The given repair tools are the so-called tree rotations, because they move the keys only “vertically”, so that the (“horizontal”) in-order sequence of the keys is fully preserved (which is essential for a binary-search tree).[12][13]
Let Z be the child higher by 2 (see figures 4 and 5). Two flavors of rotations are required: simple and double. Rebalancing can be accomplished by a simple rotation (see figure 4) if the inner child of Z – that is, the child with a child direction opposite to that of Z (t23 in figure 4, Y in figure 5) – is not higher than its sibling, the outer child t4 in both figures. This situation is called “Right Right” or “Left Left” in the literature.
On the other hand, if the inner child (t23 in figure 4, Y in figure 5) of Z is higher than t4, then rebalancing can be accomplished by a double rotation (see figure 5). This situation is called “Right Left” because X is right- and Z left-heavy (or “Left Right” if X is left- and Z is right-heavy).
From a mere graph-theoretic point of view, the two rotations of a double are just single rotations. But they encounter and have to maintain other configurations of balance factors. So, in effect, it is simpler – and more efficient – to specialize, just as in the original paper, where the double rotation is called Большое вращение (lit. “big turn”) as opposed to the simple rotation, which is called Малое вращение (lit. “little turn”). But there are alternatives: one could e.g. update all the balance factors in a separate walk from leaf to root.
The cost of a rotation, both simple and double, is constant.
For both flavors of rotations a mirrored version, i.e. rotate_Right or rotate_LeftRight, respectively, is required as well.

Simple rotation

Figure 4 shows a Right Right situation. In its upper half, node X has two child trees whose heights differ by 2 (balance factor +2). Moreover, the inner child t23 of Z is not higher than its sibling t4. This can happen by a height increase of subtree t4 or by a height decrease of subtree t1. In the latter case, also the pale situation where t23 has the same height as t4 may occur.
The result of the left rotation is shown in the lower half of the figure. Three links (thick edges in figure 4) and two balance factors are to be updated.
As the figure shows, before an insertion, the leaf layer was at level h+1, temporarily at level h+2, and after the rotation again at level h+1. In case of a deletion, the leaf layer was at level h+2, where it is again when t23 and t4 were of the same height. Otherwise the leaf layer reaches level h+1, so that the height of the rotated tree decreases.

Fig. 4: Simple rotation rotate_Left(X,Z) (diagram: X with subtrees t1, t23, t4 before the rotation; Z as new root with children X and t4 after it)

Code snippet of a simple left rotation

node* rotate_Left(node* X, node* Z) {
    // Z is by 2 higher than its sibling
    t23 = left_child(Z); // Inner child of Z
    right_child(X) = t23;
    if (t23 != null) parent(t23) = X;
    left_child(Z) = X;
    parent(X) = Z;
    // 1st case, BalanceFactor(Z) == 0, only happens with deletion, not insertion:
    if (BalanceFactor(Z) == 0) { // t23 has been of same height as t4
        BalanceFactor(X) = +1; // t23 now higher
        BalanceFactor(Z) = –1; // t4 now lower than X
    } else { // 2nd case happens with insertion or deletion:
        BalanceFactor(X) = 0;
        BalanceFactor(Z) = 0;
    }
    return Z; // return new root of rotated subtree
}

Double rotation

Figure 5 shows a Right Left situation. In its upper third, node X has two child trees whose heights differ by 2 (balance factor +2). But unlike figure 4, the inner child Y of Z is higher than its sibling t4. This can happen by a height increase of subtree t2 or t3 (with the consequence that they are of different height) or by a height decrease of subtree t1. In the latter case, it may also occur that t2 and t3 are of the same height.
The result of the first, the right, rotation is shown in the middle third of the figure. (With respect to the balance factors, this rotation is not of the same kind as the other AVL single rotations, because the height difference between Y and t4 is only 1.) The result of the final left rotation is shown in the lower third of the figure. Five links (thick edges in figure 5) and three balance factors are to be updated.
As the figure shows, before an insertion, the leaf layer was at level h+1, temporarily at level h+2, and after the double rotation again at level h+1. In case of a deletion, the leaf layer was at level h+2 and after the double rotation it is at level h+1, so that the height of the rotated tree decreases.

Fig. 5: Double rotation rotate_RightLeft(X,Z) = rotate_Right around Z followed by rotate_Left around X (diagram: X, Z and Y with subtrees t1, t2, t3, t4 in the three stages)

Code snippet of a right-left double rotation
node* rotate_RightLeft(node* X, node* Z) {
    // Z is by 2 higher than its sibling
    Y = left_child(Z); // Inner child of Z
    // Y is by 1 higher than its sibling
    t3 = right_child(Y);
    left_child(Z) = t3;
    if (t3 != null) parent(t3) = Z;
    right_child(Y) = Z;
    parent(Z) = Y;
    t2 = left_child(Y);
    right_child(X) = t2;
    if (t2 != null) parent(t2) = X;
    left_child(Y) = X;
    parent(X) = Y;
    // 1st case, BalanceFactor(Y) > 0, happens with insertion or deletion:
    if (BalanceFactor(Y) > 0) { // t3 was higher
        BalanceFactor(X) = –1; // t1 now higher
        BalanceFactor(Z) = 0;
    } else if (BalanceFactor(Y) == 0) {
        // 2nd case, BalanceFactor(Y) == 0, only happens with deletion, not insertion:
        BalanceFactor(X) = 0;
        BalanceFactor(Z) = 0;
    } else { // 3rd case happens with insertion or deletion:
        BalanceFactor(X) = 0;  // t2 was higher
        BalanceFactor(Z) = +1; // t4 now higher
    }
    BalanceFactor(Y) = 0;
    return Y; // return new root of rotated subtree
}

6.7.4 Comparison to other structures

Both AVL trees and red–black (RB) trees are self-balancing binary search trees, and they are related mathematically. Indeed, every AVL tree can be colored red–black,[14] but there are RB trees which are not AVL balanced. For maintaining the AVL resp. RB tree’s invariants, rotations play an important role. In the worst case, even without rotations, AVL or RB insertions or deletions require O(log n) inspections and/or updates to AVL balance factors resp. RB colors. RB insertions and deletions and AVL insertions require from zero to three tail-recursive rotations and run in amortized O(1) time,[15][16] thus equally constant on average. AVL deletions requiring O(log n) rotations in the worst case are also O(1) on average. RB trees require storing one bit of information (the color) in each node, while AVL trees mostly use two bits for the balance factor, although, when stored at the children, one bit with the meaning «lower than sibling» suffices. The bigger difference between the two data structures is their height limit.
For a tree of size n ≥ 1:

• an AVL tree’s height is at most h ≤ c·log2(n + d) + b < c·log2(n + 2) + b, where φ := (1+√5)/2 ≈ 1.618 is the golden ratio, c := 1/log2 φ ≈ 1.440, b := (c/2)·log2 5 − 2 ≈ −0.328, and d := 1 + 1/(φ⁴·√5) ≈ 1.065;

• an RB tree’s height is at most 2·log2(n + 1).[17]

AVL trees are more rigidly balanced than RB trees, with an asymptotic relation AVL⁄RB ≈ 0.720 of the maximal heights. For insertions and deletions, Ben Pfaff shows in 79 measurements a relation of AVL⁄RB between 0.677 and 1.077, with median ≈ 0.947 and geometric mean ≈ 0.910.[18]

6.7.5 See also

• Trees
• Tree rotation
• Red–black tree
• Splay tree
• Scapegoat tree
• B-tree
• T-tree
• List of data structures

6.7.6 References

[1] Eric Alexander. “AVL Trees”.
[2] Robert Sedgewick, Algorithms, Addison-Wesley, 1983, ISBN 0-201-06672-6, page 199, chapter 15: Balanced Trees.
[3] Adelson-Velsky, Georgy; Landis, Evgenii (1962). “An algorithm for the organization of information”. Proceedings of the USSR Academy of Sciences (in Russian). 146: 263–266. English translation by Myron J. Ricci in Soviet Math. Doklady, 3:1259–1263, 1962.
[4] Pfaff, Ben (June 2004). “Performance Analysis of BSTs in System Software” (PDF). Stanford University.
[5] AVL trees are not weight-balanced? (meaning: AVL trees are not μ-balanced?) Thereby: A binary tree is called μ-balanced, with 0 ≤ μ ≤ ½, if for every node N the inequality ½ − μ ≤ |Nl|/(|N| + 1) ≤ ½ + μ holds and μ is minimal with this property, where |N| is the number of nodes below the tree with N as root (including the root) and Nl is the left child node of N.
[6] Knuth, Donald E. (2000). Sorting and Searching (2nd ed., 6th printing, newly updated and revised). Boston: Addison-Wesley. p. 459. ISBN 0-201-89685-0.
[7] More precisely: if the AVL balance information is kept in the child nodes – with the meaning “when going upward there is an additional increment in height” – this can be done with one bit. Nevertheless, the modifying operations can be programmed more efficiently if the balance information can be checked with one test.
[8] Knuth, Donald E. (2000). Sorting and Searching (2nd ed., 6th printing, newly updated and revised). Boston: Addison-Wesley. p. 460. ISBN 0-201-89685-0.
[9] Knuth, Donald E. (2000). Sorting and Searching (2nd ed., 6th printing, newly updated and revised). Boston: Addison-Wesley. pp. 458–481. ISBN 0-201-89685-0.
[10] Pfaff, Ben (2004). An Introduction to Binary Search Trees and Balanced Trees. Free Software Foundation, Inc. pp. 107–138.
[11] Blelloch, Guy E.; Ferizovic, Daniel; Sun, Yihan (2016). “Just Join for Parallel Ordered Sets”. Proc. 28th ACM Symp. Parallel Algorithms and Architectures (SPAA 2016). ACM. pp. 253–264. doi:10.1145/2935764.2935768. ISBN 978-1-4503-4210-0.
[12] Knuth, Donald E. (2000). Sorting and Searching (2nd ed., 6th printing, newly updated and revised). Boston: Addison-Wesley. pp. 458–481. ISBN 0-201-89685-0.
[13] Pfaff, Ben (2004). An Introduction to Binary Search Trees and Balanced Trees. Free Software Foundation, Inc. pp. 107–138.
[14] Paul E. Black (2015-04-13). “AVL tree”. Dictionary of Algorithms and Data Structures. National Institute of Standards and Technology. Retrieved 2016-07-02.
[15] Mehlhorn & Sanders 2008, pp. 158, 165.
[16] Dinesh P. Mehta, Sartaj Sahni (eds.), Handbook of Data Structures and Applications, section 10.4.2.
[17] Red–black tree#Proof of asymptotic bounds.
[18] Ben Pfaff: Performance Analysis of BSTs in System Software. Stanford University, 2004.

6.7.7 Further reading

• Donald Knuth. The Art of Computer Programming, Volume 3: Sorting and Searching, Third Edition. Addison-Wesley, 1997. ISBN 0-201-89685-0. Pages 458–475 of section 6.2.3: Balanced Trees.

6.7.8 External links

• This article incorporates public domain material from the NIST document: Black, Paul E. “AVL Tree”. Dictionary of Algorithms and Data Structures.
• AVL tree demonstration (HTML5/Canvas)
• AVL tree demonstration (requires Flash)
• AVL tree demonstration (requires Java)

6.8 Red–black tree

A red–black tree is a kind of self-balancing binary search tree. Each node of the binary tree has an extra bit, and that bit is often interpreted as the color (red or black) of the node. These color bits are used to ensure the tree remains approximately balanced during insertions and deletions.[2]
Balance is preserved by painting each node of the tree with one of two colors (typically called “red” and “black”) in a way that satisfies certain properties, which collectively constrain how unbalanced the tree can become in the worst case.

the worst case. When the tree is modified, the new tree is 6.8.2 Terminology
subsequently rearranged and repainted to restore the col-
oring properties. The properties are designed in such a A red–black tree is a special type of binary tree, used in
way that this rearranging and recoloring can be performed computer science to organize pieces of comparable data,
efficiently. such as text fragments or numbers.
The balancing of the tree is not perfect, but it is good The leaf nodes of red–black trees do not contain data.
enough to allow it to guarantee searching in O(log n) time, These leaves need not be explicit in computer memory—
where n is the total number of elements in the tree. The a null child pointer can encode the fact that this child is a
insertion and deletion operations, along with the tree re- leaf—but it simplifies some algorithms for operating on
arrangement and recoloring, are also performed in O(log red–black trees if the leaves really are explicit nodes. To
n) time.[3] save memory, sometimes a single sentinel node performs
Tracking the color of each node requires only 1 bit of the role of all leaf nodes; all references from internal
information per node because there are only two col- nodes to leaf nodes then point to the sentinel node.
ors. The tree does not contain any other data specific to Red–black trees, like all binary search trees, allow effi-
its being a red–black tree so its memory footprint is al- cient in-order traversal (that is: in the order Left–Root–
most identical to a classic (uncolored) binary search tree. Right) of their elements. The search-time results from the
In many cases, the additional bit of information can be traversal from root to leaf, and therefore a balanced tree
stored at no additional memory cost. of n nodes, having the least possible tree height, results in
O(log n) search time.

6.8.1 History
6.8.3 Properties
In 1972, Rudolf Bayer[4] invented a data structure that
was a special order-4 case of a B-tree. These trees main-
tained all paths from root to leaf with the same number of 13
nodes, creating perfectly balanced trees. However, they
were not binary search trees. Bayer called them a “sym- 8 17
metric binary B-tree” in his paper and later they became
popular as 2-3-4 trees or just 2-4 trees.[5] 1 11 15 25

In a 1978 paper, “A Dichromatic Framework for NIL


6 NIL NIL NIL NIL
22 27
Balanced Trees”,[6] Leonidas J. Guibas and Robert
Sedgewick derived the red-black tree from the symmet- NIL NIL NIL NIL NIL NIL

ric binary B-tree.[7] The color “red” was chosen because


it was the best-looking color produced by the color laser An example of a red–black tree
printer available to the authors while working at Xerox
PARC.[8] Another response from Guibas states that it was In addition to the requirements imposed on a binary
because of the red and black pens available to them to search tree the following must be satisfied by a red–black
draw the trees.[9] tree:[16]
In 1993, Arne Andersson introduced the idea of right
leaning tree to simplify insert and delete operations.[10] 1. Each node is either red or black.
In 1999, Chris Okasaki showed how to make insert op-
2. The root is black. This rule is sometimes omitted.
eration purely functional. Its balance function needed to
Since the root can always be changed from red to
take care of only 4 unbalanced cases and one default bal-
black, but not necessarily vice versa, this rule has
anced case.[11]
little effect on analysis.
The original algorithm used 8 unbalanced cases, but
Cormen et al. (2001) reduced that to 6 unbalanced 3. All leaves (NIL) are black.
cases.[2] Sedgewick showed that the insert operation can
be implemented in just 46 lines of Java code.[12][13] In 4. If a node is red, then both its children are black.
2008, Sedgewick proposed the left-leaning red–black
tree, leveraging Andersson’s idea that simplified algo- 5. Every path from a given node to any of its descen-
rithms. Sedgewick originally allowed nodes whose two dant NIL nodes contains the same number of black
children are red making his trees more like 2-3-4 trees but nodes. Some definitions: the number of black nodes
later this restriction was added making new trees more from the root to a node is the node’s black depth;
like 2-3 trees. Sedgewick implemented the insert algo- the uniform number of black nodes in all paths from
rithm in just 33 lines, significantly shortening his original root to the leaves is called the black-height of the
46 lines of code.[14][15] red–black tree.[17]
6.8. RED–BLACK TREE 189
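These properties translate directly into a checking routine. The following minimal C sketch is an illustration added to this text, not the book's code; it uses NULL pointers to encode the black NIL leaves:

#include <stddef.h>

enum color { RED, BLACK };

struct node {
    struct node *left, *right, *parent;
    enum color color;
    int key;
};

/* Returns the black-height of the subtree rooted at n if properties 4 and 5
   hold there, or -1 if they are violated. NULL encodes a black NIL leaf. */
static int check(const struct node *n) {
    if (n == NULL) return 1; /* NIL leaves are black (property 3) */
    const struct node *l = n->left, *r = n->right;
    if (n->color == RED && ((l && l->color == RED) || (r && r->color == RED)))
        return -1; /* property 4: a red node must have black children */
    int bl = check(l), br = check(r);
    if (bl == -1 || br == -1 || bl != br)
        return -1; /* property 5: equal black count on every path */
    return bl + (n->color == BLACK ? 1 : 0);
}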
These constraints enforce a critical property of red–black trees: the path from the root to the farthest leaf is no more than twice as long as the path from the root to the nearest leaf. The result is that the tree is roughly height-balanced. Since operations such as inserting, deleting, and finding values require worst-case time proportional to the height of the tree, this theoretical upper bound on the height allows red–black trees to be efficient in the worst case, unlike ordinary binary search trees.
To see why this is guaranteed, it suffices to consider the effect of properties 4 and 5 together. For a red–black tree T, let B be the number of black nodes in property 5. Let the shortest possible path from the root of T to any leaf consist of B black nodes. Longer possible paths may be constructed by inserting red nodes. However, property 4 makes it impossible to insert more than one consecutive red node. Therefore, ignoring any black NIL leaves, the longest possible path consists of 2*B nodes, alternating black and red (this is the worst case). Counting the black NIL leaves, the longest possible path consists of 2*B-1 nodes.
The shortest possible path has all black nodes, and the longest possible path alternates between red and black nodes. Since all maximal paths have the same number of black nodes, by property 5, this shows that no path is more than twice as long as any other path.

6.8.4 Analogy to B-trees of order 4

The same red–black tree as in the example above, seen as a B-tree. (figure)

A red–black tree is similar in structure to a B-tree of order[note 1] 4, where each node can contain between 1 and 3 values and (accordingly) between 2 and 4 child pointers. In such a B-tree, each node will contain only one value matching the value in a black node of the red–black tree, with an optional value before and/or after it in the same node, both matching an equivalent red node of the red–black tree.
One way to see this equivalence is to “move up” the red nodes in a graphical representation of the red–black tree, so that they align horizontally with their parent black node, creating together a horizontal cluster. In the B-tree, or in the modified graphical representation of the red–black tree, all leaf nodes are at the same depth.
The red–black tree is then structurally equivalent to a B-tree of order 4, with a minimum fill factor of 33% of values per cluster and a maximum capacity of 3 values.
This B-tree type is still more general than a red–black tree, though, as it allows ambiguity in a red–black tree conversion – multiple red–black trees can be produced from an equivalent B-tree of order 4. If a B-tree cluster contains only 1 value, it is the minimum, black, and has two child pointers. If a cluster contains 3 values, then the central value will be black and each value stored on its sides will be red. If the cluster contains two values, however, either one can become the black node in the red–black tree (and the other one will be red).
So the order-4 B-tree does not maintain which of the values contained in each cluster is the root black node for the whole cluster and the parent of the other values in the same cluster. Despite this, the operations on red–black trees are more economical in time because you don't have to maintain the vector of values.[18] It may be costly if values are stored directly in each node rather than being stored by reference. B-tree nodes, however, are more economical in space because you don't need to store the color attribute for each node. Instead, you have to know which slot in the cluster vector is used. If values are stored by reference, e.g. objects, null references can be used, and so the cluster can be represented by a vector containing 3 slots for value pointers plus 4 slots for child references in the tree. In that case, the B-tree can be more compact in memory, improving data locality.
The same analogy can be made with B-trees of larger orders that can be structurally equivalent to a colored binary tree: you just need more colors. Suppose that you add blue; then the blue–red–black tree, defined like red–black trees but with the additional constraint that no two successive nodes in the hierarchy will be blue and all blue nodes will be children of a red node, becomes equivalent to a B-tree whose clusters will have at most 7 values in the following colors: blue, red, blue, black, blue, red, blue. (For each cluster, there will be at most 1 black node, 2 red nodes, and 4 blue nodes.)
For moderate volumes of values, insertions and deletions in a colored binary tree are faster compared to B-trees because colored trees don't attempt to maximize the fill factor of each horizontal cluster of nodes (only the minimum fill factor is guaranteed in colored binary trees, limiting the number of splits or junctions of clusters). B-trees will be faster for performing rotations (because rotations will frequently occur within the same cluster rather than with multiple separate nodes in a colored binary tree). For storing large volumes, however, B-trees will be much faster, as they will be more compact by grouping several children in the same cluster where they can be accessed locally.
All optimizations possible in B-trees to increase the average fill factors of clusters are possible in the equivalent multicolored binary tree. Notably, maximizing the average fill factor in a structurally equivalent B-tree is the same as reducing the total height of the multicolored tree, by increasing the number of non-black nodes.
The worst case occurs when all nodes in a colored binary tree are black; the best case occurs when only a third of them are black (and the other two thirds are red nodes).

Notes

[1] Using Knuth’s definition of order: the maximum number of children.

6.8.5 Applications and related data structures

Red–black trees offer worst-case guarantees for insertion time, deletion time, and search time. Not only does this make them valuable in time-sensitive applications such as real-time applications, but it makes them valuable building blocks in other data structures which provide worst-case guarantees; for example, many data structures used in computational geometry can be based on red–black trees, and the Completely Fair Scheduler used in current Linux kernels uses red–black trees.
The AVL tree is another structure supporting O(log n) search, insertion, and removal. It is more rigidly balanced than red–black trees, leading to slower insertion and removal but faster retrieval. This makes it attractive for data structures that may be built once and loaded without reconstruction, such as language dictionaries (or program dictionaries, such as the opcodes of an assembler or interpreter).
Red–black trees are also particularly valuable in functional programming, where they are one of the most common persistent data structures, used to construct associative arrays and sets which can retain previous versions after mutations. The persistent version of red–black trees requires O(log n) space for each insertion or deletion, in addition to time.
For every 2-4 tree, there are corresponding red–black trees with data elements in the same order. The insertion and deletion operations on 2-4 trees are also equivalent to color-flipping and rotations in red–black trees. This makes 2-4 trees an important tool for understanding the logic behind red–black trees, and this is why many introductory algorithm texts introduce 2-4 trees just before red–black trees, even though 2-4 trees are not often used in practice.
In 2008, Sedgewick introduced a simpler version of the red–black tree called the left-leaning red–black tree,[19] eliminating a previously unspecified degree of freedom in the implementation. The LLRB maintains an additional invariant that all red links must lean left except during inserts and deletes. Red–black trees can be made isometric to either 2-3 trees[20] or 2-4 trees[19] for any sequence of operations. The 2-4 tree isometry was described in 1978 by Sedgewick. With 2-4 trees, the isometry is resolved by a “color flip”, corresponding to a split, in which the red color of two child nodes leaves the children and moves to the parent node. The tango tree, a type of tree optimized for fast searches, usually uses red–black trees as part of its data structure.
In version 8 of Java, the collection HashMap has been modified such that instead of using a LinkedList to store elements with colliding hashcodes, a red–black tree is used. This results in an improvement of the time complexity of searching for such an element from O(n) to O(log n).[21]

6.8.6 Operations

Read-only operations on a red–black tree require no modification from those used for binary search trees, because every red–black tree is a special case of a simple binary search tree. However, the immediate result of an insertion or removal may violate the properties of a red–black tree. Restoring the red–black properties requires a small number (O(log n) or amortized O(1)) of color changes (which are very quick in practice) and no more than three tree rotations (two for insertion). Although insert and delete operations are complicated, their times remain O(log n).

Insertion

Insertion begins by adding the node as any binary search tree insertion does and by coloring it red. Whereas in the binary search tree we always add a leaf, in the red–black tree leaves contain no information, so instead we add a red interior node, with two black leaves, in place of an existing black leaf.
What happens next depends on the color of other nearby nodes. The term uncle node will be used to refer to the sibling of a node’s parent, as in human family trees. Note that:

• property 3 (all leaves are black) always holds;
• property 4 (both children of every red node are black) is threatened only by adding a red node, repainting a black node red, or a rotation;
• property 5 (all paths from any given node to its leaf nodes contain the same number of black nodes) is threatened only by adding a black node, repainting a red node black (or vice versa), or a rotation.
Notes

1. The label N will be used to denote the current node (colored red). In the diagrams N carries a blue contour. At the beginning, this is the new node being inserted, but the entire procedure may also be applied recursively to other nodes (see case 3). P will denote N’s parent node, G will denote N’s grandparent, and U will denote N’s uncle. In between some cases, the roles and labels of the nodes are exchanged, but in each case, every label continues to represent the same node it represented at the beginning of the case.
2. If a node in the right (target) half of a diagram carries a blue contour, it will become the current node in the next iteration and there the other nodes will be newly assigned relative to it. Any color shown in the diagram is either assumed in its case or implied by those assumptions.
3. A numbered triangle represents a subtree of unspecified depth. A black circle atop a triangle means that the black-height of the subtree is greater by one compared to a subtree without this circle.

There are several cases of red–black tree insertion to handle:

• N is the root node, i.e., the first node of the red–black tree
• N’s parent (P) is black
• N’s parent (P) and uncle (U) are red
• N is added to the right of the left child of the grandparent, or N is added to the left of the right child of the grandparent (P is red and U is black)
• N is added to the left of the left child of the grandparent, or N is added to the right of the right child of the grandparent (P is red and U is black)

Each case will be demonstrated with example C code. The uncle and grandparent nodes can be found by these functions:

struct node *grandparent(struct node *n) {
    if ((n != NULL) && (n->parent != NULL)) return n->parent->parent;
    else return NULL;
}

struct node *uncle(struct node *n) {
    struct node *g = grandparent(n);
    if (g == NULL) return NULL; // No grandparent means no uncle
    if (n->parent == g->left) return g->right;
    else return g->left;
}

Case 1: The current node N is at the root of the tree. In this case, it is repainted black to satisfy property 2 (the root is black). Since this adds one black node to every path at once, property 5 (all paths from any given node to its leaf nodes contain the same number of black nodes) is not violated.

void insert_case1(struct node *n) {
    if (n->parent == NULL) n->color = BLACK;
    else insert_case2(n);
}

Case 2: The current node’s parent P is black, so property 4 (both children of every red node are black) is not invalidated. In this case, the tree is still valid. Property 5 (all paths from any given node to its leaf nodes contain the same number of black nodes) is not threatened either, because the current node N has two black leaf children; but because N is red, the paths through each of its children have the same number of black nodes as the path through the leaf it replaced, which was black, and so this property remains satisfied.

void insert_case2(struct node *n) {
    if (n->parent->color == BLACK) return; /* Tree is still valid */
    else insert_case3(n);
}

Note: In the following cases it can be assumed that N has a grandparent node G, because its parent P is red, and if it were the root, it would be black. Thus, N also has an uncle node U, although it may be a leaf in cases 4 and 5.

void insert_case3(struct node *n) {
    struct node *u = uncle(n), *g;
    if ((u != NULL) && (u->color == RED)) {
        n->parent->color = BLACK;
        u->color = BLACK;
        g = grandparent(n);
        g->color = RED;
        insert_case1(g);
    } else {
        insert_case4(n);
    }
}

Note: In the remaining cases, it is assumed that the parent node P is the left child of its parent. If it is the right child, left and right should be reversed throughout cases 4 and 5. The code samples take care of this.

void insert_case4(struct node *n) {
    struct node *g = grandparent(n);
    if ((n == n->parent->right) && (n->parent == g->left)) {
        rotate_left(n->parent);
        /* rotate_left can be the below because of already having *g = grandparent(n):
         *   struct node *saved_p = g->left, *saved_left_n = n->left;
         *   g->left = n;
         *   n->left = saved_p;
         *   saved_p->right = saved_left_n;
         * and modify the parent's nodes properly */
        n = n->left;
    } else if ((n == n->parent->left) && (n->parent == g->right)) {
        rotate_right(n->parent);
        /* rotate_right can be the below to take advantage of already having *g = grandparent(n):
         *   struct node *saved_p = g->right, *saved_right_n = n->right;
         *   g->right = n;
         *   n->right = saved_p;
         *   saved_p->left = saved_right_n; */
        n = n->right;
    }
    insert_case5(n);
}

void insert_case5(struct node *n) {
    struct node *g = grandparent(n);
    n->parent->color = BLACK;
    g->color = RED;
    if (n == n->parent->left) rotate_right(g);
    else rotate_left(g);
}

Note that inserting is actually in-place, since all the calls above use tail recursion.
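The rotation helpers rotate_left and rotate_right are used above but not listed in this section. The following sketch of rotate_left is an assumption of this text (consistent with the parent-pointer node structure of the examples); rotate_right is its mirror image:

/* Move p's right child r into p's place; p becomes r's left child. */
void rotate_left(struct node *p) {
    struct node *r = p->right, *g = p->parent;
    p->right = r->left;
    if (r->left != NULL) r->left->parent = p;
    r->left = p;
    p->parent = r;
    r->parent = g;
    if (g != NULL) {
        if (g->left == p) g->left = r;
        else g->right = r;
    }
    /* else: the caller must update the tree's root pointer to r. */
}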
In the algorithm above, all cases are chained in order, except in insert case 3, where it can recurse to case 1 back to the grandparent node: this is the only case where an iterative implementation will effectively loop. Because the problem of repair is escalated to the next higher level but one, it takes maximally h⁄2 iterations to repair the tree (where h is the height of the tree). Because the probability for escalation decreases exponentially with each iteration, the average insertion cost is constant.
Mehlhorn & Sanders (2008) point out: “AVL trees do not support constant amortized update costs”, but red–black trees do.[22]

Removal

In a regular binary search tree, when deleting a node with two non-leaf children, we find either the maximum element in its left subtree (which is the in-order predecessor) or the minimum element in its right subtree (which is the in-order successor) and move its value into the node being deleted. We then delete the node we copied the value from, which must have fewer than two non-leaf children. (Non-leaf children, rather than all children, are specified here because, unlike normal binary search trees, red–black trees can have leaf nodes anywhere, so that all nodes are either internal nodes with two children or leaf nodes with, by definition, zero children. In effect, internal nodes having two leaf children in a red–black tree are like the leaf nodes in a regular binary search tree.) Because merely copying a value does not violate any red–black properties, this reduces to the problem of deleting a node with at most one non-leaf child. Once we have solved that problem, the solution applies equally to the case where the node we originally want to delete has at most one non-leaf child as to the case just considered where it has two non-leaf children.
Therefore, for the remainder of this discussion we address the deletion of a node with at most one non-leaf child. We use the label M to denote the node to be deleted; C will denote a selected child of M, which we will also call “its child”. If M does have a non-leaf child, call that its child, C; otherwise, choose either leaf as its child, C.
If M is a red node, we simply replace it with its child C, which must be black by property 4. (This can only occur when M has two leaf children, because if the red node M had a black non-leaf child on one side but just a leaf child on the other side, then the count of black nodes on both sides would be different; thus the tree would violate property 5.) All paths through the deleted node will simply pass through one fewer red node, and both the deleted node’s parent and child must be black, so property 3 (all leaves are black) and property 4 (both children of every red node are black) still hold.
Another simple case is when M is black and C is red. Simply removing a black node could break properties 4 (“both children of every red node are black”) and 5 (“all paths from any given node to its leaf nodes contain the same number of black nodes”), but if we repaint C black, both of these properties are preserved.
The complex case is when both M and C are black. (This can only occur when deleting a black node which has two leaf children, because if the black node M had a black non-leaf child on one side but just a leaf child on the other side, then the count of black nodes on both sides would be different; thus the tree would have been an invalid red–black tree by violation of property 5.) We begin by replacing M with its child C. We will relabel this child C (in its new position) N, and its sibling (its new parent’s other child) S. (S was previously the sibling of M.) In the diagrams below, we will also use P for N’s new parent (M’s old parent), SL for S’s left child, and SR for S’s right child. (S cannot be a leaf, because if M and C were black, then P’s one subtree which included M counted two black-height, and thus P’s other subtree which includes S must also count two black-height, which cannot be the case if S is a leaf node.)

Notes

1. The label N will be used to denote the current node (colored black). In the diagrams N carries a blue contour. At the beginning, this is the replacement node and a leaf, but the entire procedure may also be applied recursively to other nodes (see case 3). In between some cases, the roles and labels of the nodes are exchanged, but in each case, every label continues to represent the same node it represented at the beginning of the case.
2. If a node in the right (target) half of a diagram carries a blue contour, it will become the current node in the next iteration and there the other nodes will be newly assigned relative to it. Any color shown in the diagram is either assumed in its case or implied by those assumptions. White represents an arbitrary color (either red or black), but the same in both halves of the diagram.
3. A numbered triangle represents a subtree of unspecified depth. A black circle atop a triangle means that the black-height of the subtree is greater by one compared to a subtree without this circle.

We will find the sibling using this function:

struct node *sibling(struct node *n) {
    if ((n == NULL) || (n->parent == NULL)) return NULL; // no parent means no sibling
    if (n == n->parent->left) return n->parent->right;
    else return n->parent->left;
}

Note: In order that the tree remains well-defined, we need every null leaf to remain a leaf after all transformations (that it will not have any children). If the node we are deleting has a non-leaf (non-null) child N, it is easy to see that the property is satisfied. If, on the other hand, N would be a null leaf, it can be verified from the diagrams (or code) for all the cases that the property is satisfied as well.
We can perform the steps outlined above with the following code, where the function replace_node substitutes child into n’s place in the tree. For convenience, code in this section will assume that null leaves are represented by actual node objects rather than NULL (the code in the Insertion section works with either representation).

void delete_one_child(struct node *n) {
    /* Precondition: n has at most one non-leaf child. */
    struct node *child = is_leaf(n->right) ? n->left : n->right;
    replace_node(n, child);
    if (n->color == BLACK) {
        if (child->color == RED) child->color = BLACK;
        else delete_case1(child);
    }
    free(n);
}

Note: If N is a null leaf and we do not want to represent null leaves as actual node objects, we can modify the algorithm by first calling delete_case1() on its parent (the node that we delete, n in the code above) and deleting it afterwards. We do this if the parent is black (red is trivial), so it behaves in the same way as a null leaf (and is sometimes called a “phantom” leaf). And we can safely delete it at the end, as n will remain a leaf after all operations, as shown above. In addition, the sibling tests in cases 2 and 3 require updating, as it is no longer true that the sibling will have children represented as objects.

If both N and its original parent are black, then deleting this original parent causes paths which proceed through N to have one fewer black node than paths that do not. As this violates property 5 (all paths from any given node to its leaf nodes contain the same number of black nodes), the tree must be rebalanced. There are several cases to consider:

Case 1: N is the new root. In this case, we are done. We removed one black node from every path, and the new root is black, so the properties are preserved.

void delete_case1(struct node *n) {
    if (n->parent != NULL) delete_case2(n);
}

Note: In cases 2, 5, and 6, we assume N is the left child of its parent P. If it is the right child, left and right should be reversed throughout these three cases. Again, the code examples take both cases into account.

void delete_case2(struct node *n) {
    struct node *s = sibling(n);
    if (s->color == RED) {
        n->parent->color = RED;
        s->color = BLACK;
        if (n == n->parent->left) rotate_left(n->parent);
        else rotate_right(n->parent);
    }
    delete_case3(n);
}

void delete_case3(struct node *n) {
    struct node *s = sibling(n);
    if ((n->parent->color == BLACK) && (s->color == BLACK) &&
        (s->left->color == BLACK) && (s->right->color == BLACK)) {
        s->color = RED;
        delete_case1(n->parent);
    } else
        delete_case4(n);
}

void delete_case4(struct node *n) {
    struct node *s = sibling(n);
    if ((n->parent->color == RED) && (s->color == BLACK) &&
        (s->left->color == BLACK) && (s->right->color == BLACK)) {
        s->color = RED;
        n->parent->color = BLACK;
    } else
        delete_case5(n);
}

void delete_case5(struct node *n) {
    struct node *s = sibling(n);
    if (s->color == BLACK) {
        /* this if statement is trivial, due to case 2 (even though case 2 changed
           the sibling to a sibling's child, the sibling's child can't be red, since
           no red parent can have a red child). */
        /* the following statements just force the red to be on the left of the left
           of the parent, or right of the right, so case six will rotate correctly. */
        if ((n == n->parent->left) && (s->right->color == BLACK) &&
            (s->left->color == RED)) {
            /* this last test is trivial too due to cases 2-4. */
            s->color = RED;
            s->left->color = BLACK;
            rotate_right(s);
        } else if ((n == n->parent->right) && (s->left->color == BLACK) &&
                   (s->right->color == RED)) {
            /* this last test is trivial too due to cases 2-4. */
            s->color = RED;
            s->right->color = BLACK;
            rotate_left(s);
        }
    }
    delete_case6(n);
}

void delete_case6(struct node *n) {
    struct node *s = sibling(n);
    s->color = n->parent->color;
    n->parent->color = BLACK;
    if (n == n->parent->left) {
        s->right->color = BLACK;
        rotate_left(n->parent);
    } else {
        s->left->color = BLACK;
        rotate_right(n->parent);
    }
}

Again, the function calls all use tail recursion, so the algorithm is in-place.
In the algorithm above, all cases are chained in order, except in delete case 3, where it can recurse to case 1 back to the parent node: this is the only case where an iterative implementation will effectively loop. No more than h loops back to case 1 will occur (where h is the height of the tree). And because the probability for escalation decreases exponentially with each iteration, the average removal cost is constant.
Additionally, no tail recursion ever occurs on a child node, so the tail recursion loop can only move from a child back to its successive ancestors. If a rotation occurs in case 2 (which is the only possibility of rotation within the loop of cases 1–3), then the parent of the node N becomes red after the rotation and we will exit the loop. Therefore, at most one rotation will occur within this loop. Since no more than two additional rotations will occur after exiting the loop, at most three rotations occur in total.

6.8.7 Proof of asymptotic bounds

A red–black tree which contains n internal nodes has a height of O(log n).

Definitions:

• h(v) = height of the subtree rooted at node v

• bh(v) = the number of black nodes from v to any leaf in the subtree, not counting v if it is black - called the black-height

Lemma: A subtree rooted at node v has at least 2^bh(v) − 1 internal nodes.

Proof of Lemma (by induction on height):

Basis: h(v) = 0

If v has a height of zero then it must be null, therefore bh(v) = 0. So:

2^bh(v) − 1 = 2^0 − 1 = 1 − 1 = 0

Inductive Step: v such that h(v) = k having at least 2^bh(v) − 1 internal nodes implies that v′ such that h(v′) = k + 1 has at least 2^bh(v′) − 1 internal nodes.

Since v′ has h(v′) > 0 it is an internal node. As such it has two children, each of which has a black-height of either bh(v′) or bh(v′) − 1 (depending on whether the child is red or black, respectively). By the inductive hypothesis each child has at least 2^(bh(v′)−1) − 1 internal nodes, so v′ has at least:

2^(bh(v′)−1) − 1 + 2^(bh(v′)−1) − 1 + 1 = 2^bh(v′) − 1

internal nodes.

Using this lemma we can now show that the height of the tree is logarithmic. Since at least half of the nodes on any path from the root to a leaf are black (property 4 of a red–black tree), the black-height of the root is at least h(root)/2. By the lemma we get:

n ≥ 2^(h(root)/2) − 1  ⇔  log2(n + 1) ≥ h(root)/2  ⇔  h(root) ≤ 2 log2(n + 1)

Therefore, the height of the root is O(log n).

6.8.8 Set operations and bulk operations

In addition to the single-element insert, delete and lookup operations, several set operations have been defined on red–black trees: union, intersection and set difference. Then fast bulk operations on insertions or deletions can be implemented based on these set functions. These set operations rely on two helper operations, Split and Join. With the new operations, the implementation of red–black trees can be more efficient and highly parallelizable.[23] This implementation allows a red root.

• Join: The function Join is on two red–black trees t1 and t2 and a key k, and will return a tree containing all elements in t1 and t2 as well as k. It requires k to be greater than all keys in t1 and smaller than all keys in t2. If the two trees have the same black height, Join simply creates a new node with left subtree t1, root k and right subtree t2. If both t1 and t2 have black roots, k is set to be red; otherwise k is set black. Suppose instead that t1 has larger black height than t2 (the other case is symmetric). Join follows the right spine of t1 until a black node c which is balanced with t2. At this point a new node with left child c, root k (set to be red) and right child t2 is created to replace c. The new node may invalidate the red–black invariant because at most three red nodes can appear in a row. This can be fixed with a double rotation. If the double-red issue propagates to the root, the root is then set to be black, restoring the properties. The cost of this function is the difference of the black heights between the two input trees.

• Split: To split a red–black tree into two smaller trees, those smaller than key x and those larger than key x, first draw a path from the root by inserting x into the red–black tree. After this insertion, all values less than x will be found on the left of the path, and all values greater than x will be found on the right. By applying Join, all the subtrees on the left side are merged bottom-up using keys on the path as intermediate nodes from bottom to top to form the left tree, and the right part is symmetric. The cost of Split is O(log n), the order of the height of the tree.

The union of two red–black trees t1 and t2, representing sets A and B, is a red–black tree t that represents A ∪ B. The following recursive function computes this union:

function union(t1, t2):
    if t1 = nil:
        return t2
    if t2 = nil:
        return t1
    t<, t> ← split t2 on t1.root
    return join(t1.root, union(left(t1), t<), union(right(t1), t>))

Here, Split is presumed to return two trees: one holding the keys less than its input key, one holding the greater keys. (The algorithm is non-destructive, but an in-place destructive version exists as well.)
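The shape of this recursion can be sketched in C++. The sketch below is an illustration only, not the implementation from the cited paper: it uses plain unbalanced BST nodes, and its join and split helpers ignore colors and rebalancing, which the genuine red–black Join and Split described above would restore.

struct Node {
    int key;
    Node *left, *right;
};

// Builds a tree from l, key k and r, assuming every key in l < k < every
// key in r.  A real red-black Join would also recolor and rebalance here.
Node* join(Node* l, int k, Node* r) {
    return new Node{k, l, r};
}

struct SplitResult { Node *lo, *hi; };

// Divides t into the keys < k and the keys > k (a duplicate of k is dropped).
SplitResult split(Node* t, int k) {
    if (!t) return {nullptr, nullptr};
    if (k == t->key) return {t->left, t->right};
    if (k < t->key) {
        SplitResult s = split(t->left, k);
        return {s.lo, join(s.hi, t->key, t->right)};
    }
    SplitResult s = split(t->right, k);
    return {join(t->left, t->key, s.lo), s.hi};
}

// Join-based union: split t2 by the root of t1 and recurse on the two halves.
// The two recursive calls operate on disjoint trees, which is what makes the
// parallel execution mentioned below possible.
Node* set_union(Node* t1, Node* t2) {
    if (!t1) return t2;
    if (!t2) return t1;
    SplitResult s = split(t2, t1->key);
    return join(set_union(t1->left, s.lo), t1->key,
                set_union(t1->right, s.hi));
}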
The algorithm for intersection or difference is similar, but requires the Join2 helper routine, which is the same as Join but without the middle key. Based on the new functions for union, intersection or difference, either one key or multiple keys can be inserted to or deleted from the red–black tree. Since Split calls Join but does not deal with the balancing criteria of red–black trees directly, such an implementation is usually called the "join-based" implementation.

The complexity of each of union, intersection and difference is O(m log(n/m + 1)) for two red–black trees of sizes m and n (≥ m). This complexity is optimal in terms of the number of comparisons. More importantly, since the recursive calls to union, intersection or

difference are independent of each other, they can be executed in parallel with a parallel depth O(log m log n).[23] When m = 1, the join-based implementation has the same computational DAG as single-element insertion and deletion if the root of the larger tree is used to split the smaller tree.

6.8.9 Parallel algorithms

Parallel algorithms for constructing red–black trees from sorted lists of items can run in constant time or O(log log n) time, depending on the computer model, if the number of processors available is asymptotically proportional to the number n of items, where n → ∞. Fast search, insertion, and deletion parallel algorithms are also known.[24]

6.8.10 Popular culture

A red–black tree was referenced correctly in an episode of Missing (Canadian TV series),[25] as noted by Robert Sedgewick in one of his lectures:[26]

Jess: "It was the red door again."
Pollock: "I thought the red door was the storage container."
Jess: "But it wasn't red anymore, it was black."
Antonio: "So red turning to black means what?"
Pollock: "Budget deficits, red ink, black ink."
Antonio: "It could be from a binary search tree. The red–black tree tracks every simple path from a node to a descendant leaf that has the same number of black nodes."
Jess: "Does that help you with the ladies?"

6.8.11 See also

• List of data structures
• Tree data structure
• Tree rotation
• AA tree, a variation of the red-black tree
• AVL tree
• B-tree (2-3 tree, 2-3-4 tree, B+ tree, B*-tree, UB-tree)
• Scapegoat tree
• Splay tree
• T-tree
• WAVL tree

6.8.12 References

[1] James Paton. "Red-Black Trees".

[2] Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L.; Stein, Clifford (2001). "Red–Black Trees". Introduction to Algorithms (second ed.). MIT Press. pp. 273–301. ISBN 0-262-03293-7.

[3] John Morris. "Red–Black Trees".

[4] Rudolf Bayer (1972). "Symmetric binary B-Trees: Data structure and maintenance algorithms". Acta Informatica. 1 (4): 290–306. doi:10.1007/BF00289509.

[5] Drozdek, Adam. Data Structures and Algorithms in Java (2nd ed.). Sams Publishing. p. 323. ISBN 0534376681.

[6] Leonidas J. Guibas and Robert Sedgewick (1978). "A Dichromatic Framework for Balanced Trees". Proceedings of the 19th Annual Symposium on Foundations of Computer Science. pp. 8–21. doi:10.1109/SFCS.1978.3.

[7] "Red Black Trees". eternallyconfuzzled.com. Retrieved 2015-09-02.

[8] Robert Sedgewick (2012). Red-Black BSTs. Coursera. A lot of people ask why did we use the name red–black. Well, we invented this data structure, this way of looking at balanced trees, at Xerox PARC which was the home of the personal computer and many other innovations that we live with today entering[sic] graphic user interfaces, ethernet and object-oriented programmings[sic] and many other things. But one of the things that was invented there was laser printing and we were very excited to have nearby color laser printer that could print things out in color and out of the colors the red looked the best. So, that's why we picked the color red to distinguish red links, the types of links, in three nodes. So, that's an answer to the question for people that have been asking.

[9] "Where does the term "Red/Black Tree" come from?". programmers.stackexchange.com. Retrieved 2015-09-02.

[10] Andersson, Arne (1993-08-11). Dehne, Frank; Sack, Jörg-Rüdiger; Santoro, Nicola; Whitesides, Sue, eds. "Balanced search trees made simple" (PDF). Algorithms and Data Structures (Proceedings). Lecture Notes in Computer Science. 709. Springer-Verlag Berlin Heidelberg: 60–71. doi:10.1007/3-540-57155-8_236. ISBN 978-3-540-57155-1. Archived from the original on 2000-03-17.

[11] Okasaki, Chris (1999-01-01). "Red-black trees in a functional setting" (PS). Journal of Functional Programming. 9 (4): 471–477. doi:10.1017/S0956796899003494. ISSN 1469-7653.

[12] Sedgewick, Robert (1983). Algorithms (1st ed.). Addison-Wesley. ISBN 0-201-06672-6.

[13] RedBlackBST code in Java

[14] Sedgewick, Robert (2008). "Left-leaning Red-Black Trees" (PDF).

[15] Sedgewick, Robert; Wayne, Kevin (2011). Algorithms (4th ed.). Addison-Wesley Professional. ISBN 978-0-321-57351-3.

[16] Cormen, Thomas; Leiserson, Charles; Rivest, Ronald; Stein, Clifford (2009). "13". Introduction to Algorithms (3rd ed.). MIT Press. pp. 308–309. ISBN 978-0-262-03384-8.

[17] Mehlhorn, Kurt; Sanders, Peter (2008). Algorithms and Data Structures: The Basic Toolbox (PDF). Springer, Berlin/Heidelberg. pp. 154–165. doi:10.1007/978-3-540-77978-0. ISBN 978-3-540-77977-3. p. 155.

[18] Sedgewick, Robert (1998). Algorithms in C++. Addison-Wesley Professional. pp. 565–575. ISBN 978-0201350883.

[19] http://www.cs.princeton.edu/~rs/talks/LLRB/RedBlack.pdf

[20] http://www.cs.princeton.edu/courses/archive/fall08/cos226/lectures/10BalancedTrees-2x2.pdf

[21] "How does a HashMap work in JAVA". coding-geek.com.

[22] Mehlhorn & Sanders 2008, pp. 165, 158.

[23] Blelloch, Guy E.; Ferizovic, Daniel; Sun, Yihan (2016), "Just Join for Parallel Ordered Sets", Proc. 28th ACM Symp. Parallel Algorithms and Architectures (SPAA 2016), ACM, pp. 253–264, doi:10.1145/2935764.2935768, ISBN 978-1-4503-4210-0.

[24] Park, Heejin; Park, Kunsoo (2001). "Parallel algorithms for red–black trees". Theoretical Computer Science. Elsevier. 262 (1–2): 415–435. doi:10.1016/S0304-3975(00)00287-5. Our parallel algorithm for constructing a red–black tree from a sorted list of n items runs in O(1) time with n processors on the CRCW PRAM and runs in O(log log n) time with n / log log n processors on the EREW PRAM.

[25] Missing (Canadian TV series). A, W Network (Canada); Lifetime (United States).

[26] Robert Sedgewick (2012). B-Trees. Coursera. 10:37 minutes in. So not only is there some excitement in that dialogue but it's also technically correct which you don't often find with math in popular culture of computer science. A red black tree tracks every simple path from a node to a descendant leaf with the same number of black nodes they got that right.

6.8.13 Further reading

• Mathworld: Red–Black Tree
• San Diego State University: CS 660: Red–Black tree notes, by Roger Whitney
• Pfaff, Ben (June 2004). "Performance Analysis of BSTs in System Software" (PDF). Stanford University.

6.8.14 External links

• A complete and working implementation in C
• Red–Black Tree Demonstration
• OCW MIT Lecture by Prof. Erik Demaine on Red Black Trees
• Binary Search Tree Insertion Visualization on YouTube – Visualization of random and pre-sorted data insertions, in elementary binary search trees, and left-leaning red–black trees
• An intrusive red-black tree written in C++
• Red-black BSTs in 3.3 Balanced Search Trees
• Red–black BST Demo

6.9 WAVL tree

In computer science, a WAVL tree or weak AVL tree is a self-balancing binary search tree. WAVL trees are named after AVL trees, another type of balanced search tree, and are closely related both to AVL trees and red–black trees, which all fall into a common framework of rank-balanced trees. Like other balanced binary search trees, WAVL trees can handle insertion, deletion, and search operations in time O(log n) per operation.[1][2]

WAVL trees are designed to combine some of the best properties of both AVL trees and red–black trees. One advantage of AVL trees over red–black trees is that they are more balanced: they have height at most logφ n ≈ 1.44 log2 n (for a tree with n data items, where φ is the golden ratio), while red–black trees have larger maximum height, 2 log2 n. If a WAVL tree is created using only insertions, without deletions, then it has the same small height bound that an AVL tree has. On the other hand, red–black trees have the advantage over AVL trees that they perform less restructuring of their trees. In AVL trees, each deletion may require a logarithmic number of tree rotation operations, while red–black trees have simpler deletion operations that use only a constant number of tree rotations. WAVL trees, like red–black trees, use only a constant number of tree rotations, and the constant is even better than for red–black trees.[1][2]

WAVL trees were introduced by Haeupler, Sen & Tarjan (2015). The same authors also provided a common view of AVL trees, WAVL trees, and red–black trees as all being a type of rank-balanced tree.[2]

6.9.1 Definition

As with binary search trees more generally, a WAVL tree consists of a collection of nodes, of two types: internal nodes and external nodes. An internal node stores a data

item, and is linked to its parent (except for a designated root node that has no parent) and to exactly two children in the tree, the left child and the right child. An external node carries no data, and has a link only to its parent in the tree. These nodes are arranged to form a binary tree, so that for any internal node x the parent of the left and right children of x is x itself. The external nodes form the leaves of the tree.[3] The data items are arranged in the tree in such a way that an inorder traversal of the tree lists the data items in sorted order.[4]

What distinguishes WAVL trees from other types of binary search tree is their use of ranks. These are numbers, stored with each node, that provide an approximation to the distance from the node to its farthest leaf descendant. The ranks are required to obey the following properties:[1][2]

• Every external node has rank 0.[5]

• If a non-root node has rank r, then the rank of its parent must be either r + 1 or r + 2.

• An internal node with two external children must have rank exactly 1.
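These three rules are easy to check mechanically. The following C++ sketch is illustrative only (the node layout and names are assumptions of this example, not taken from the cited sources); it treats a null pointer as an external node of rank 0 and verifies the constraints over a whole subtree:

struct WavlNode {
    int key;
    int rank;   // approximates the distance to the farthest external descendant
    WavlNode *left = nullptr, *right = nullptr;   // nullptr stands for an external node
};

int rank_of(const WavlNode* n) { return n ? n->rank : 0; }   // external nodes have rank 0

bool wavl_valid(const WavlNode* n) {
    if (!n) return true;                    // external nodes are trivially valid
    int dl = n->rank - rank_of(n->left);    // rank difference to each child
    int dr = n->rank - rank_of(n->right);
    if (dl < 1 || dl > 2 || dr < 1 || dr > 2) return false;   // parent rank must be r + 1 or r + 2
    if (!n->left && !n->right && n->rank != 1) return false;  // two external children force rank 1
    return wavl_valid(n->left) && wavl_valid(n->right);
}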

6.9.2 Operations

Searching

Searching for a key k in a WAVL tree is much the same as in any balanced binary search tree data structure. One begins at the root of the tree, and then repeatedly compares k with the data item stored at each node on a path from the root, following the path to the left child of a node when k is smaller than the value at the node or instead following the path to the right child when k is larger than the value at the node. When a node with value equal to k is reached, or an external node is reached, the search stops.[6]

If the search stops at an internal node, the key k has been found. If instead, the search stops at an external node, then the position where k would be inserted (if it were inserted) has been found.[6]

Insertion

Insertion of a key k into a WAVL tree is performed by performing a search for the external node where the key should be added, replacing that node by an internal node with data item k and two external-node children, and then rebalancing the tree. The rebalancing step can be performed either top-down or bottom-up,[2] but the bottom-up version of rebalancing is the one that most closely matches AVL trees.[1][2]

In this rebalancing step, one assigns rank 1 to the newly created internal node, and then follows a path upward from each node to its parent, incrementing the rank of each parent node if necessary to make it greater than the new rank of its child, until one of three stopping conditions is reached.

• If the path of incremented ranks reaches the root of the tree, then the rebalancing procedure stops, without changing the structure of the tree.

• If the path of incremented ranks reaches a node whose parent's rank previously differed by two, and (after incrementing the rank of the node) still differs by one, then again the rebalancing procedure stops without changing the structure of the tree.

• If the procedure increases the rank of a node x, so that it becomes equal to the rank of the parent y of x, but the other child of y has a rank that is smaller by two (so that the rank of y cannot be increased), then again the rebalancing procedure stops. In this case, by performing at most two tree rotations, it is always possible to rearrange the tree nodes near x and y in such a way that the ranks obey the constraints of a WAVL tree, leaving the rank of the root of the rotated subtree unchanged.

Thus, overall, the insertion procedure consists of a search, the creation of a constant number of new nodes, a logarithmic number of rank changes, and a constant number of tree rotations.[1][2]
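The promotion loop can be sketched in C++ as follows. This is a simplification written for this text, not the authors' code: it extends the node of the previous sketch with a parent pointer and mirrors the three stopping conditions above.

struct WavlNode {
    int key;
    int rank;
    WavlNode *left, *right, *parent;   // parent pointer added for the bottom-up walk
};

int rank_of(const WavlNode* n) { return n ? n->rank : 0; }

// Replace old_child of parent (possibly the root) with new_child.
void relink(WavlNode*& root, WavlNode* parent, WavlNode* old_child, WavlNode* new_child) {
    if (!parent) root = new_child;
    else if (parent->left == old_child) parent->left = new_child;
    else parent->right = new_child;
    if (new_child) new_child->parent = parent;
}

// Rotate the edge between n and its parent, moving n up one level.
void rotate_up(WavlNode*& root, WavlNode* n) {
    WavlNode* p = n->parent;
    relink(root, p->parent, p, n);
    if (p->left == n) {
        p->left = n->right;
        if (n->right) n->right->parent = p;
        n->right = p;
    } else {
        p->right = n->left;
        if (n->left) n->left->parent = p;
        n->left = p;
    }
    p->parent = n;
}

// Bottom-up rebalancing after inserting the leaf-level node n (rank 1).
void rebalance_insert(WavlNode*& root, WavlNode* n) {
    WavlNode* p = n->parent;
    while (p && p->rank == n->rank) {                 // rank rule violated at p
        WavlNode* sib = (p->left == n) ? p->right : p->left;
        if (p->rank - rank_of(sib) == 1) {            // promoting p keeps its other child legal
            p->rank++;                                // promote and continue up the path
            n = p;
            p = n->parent;
            continue;
        }
        // Rotation case: the sibling's rank is smaller by two, so p cannot be promoted.
        WavlNode* inner = (p->left == n) ? n->right : n->left;
        if (rank_of(inner) == n->rank - 2) {          // single rotation
            rotate_up(root, n);
            p->rank--;
        } else {                                      // double rotation through the inner child
            rotate_up(root, inner);
            rotate_up(root, inner);
            inner->rank++;
            n->rank--;
            p->rank--;
        }
        return;                                       // at most two rotations, then we are done
    }
    // Loop exit: the root was reached, or n's parent already has a larger rank.
}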
Deletion

As with binary search trees more broadly, deletion operations on an internal node x that has at least one external-node child may be performed directly, by removing x from the tree and reconnecting the other child of x to the parent of x. If, however, both children of a node x are internal nodes, then we may follow a path downward in the tree from x to the leftmost descendant of its right child, a node y that immediately follows x in the sorted ordering of the tree nodes. Then y has an external-node child (its left child). We may delete x by performing the same reconnection procedure at node y (effectively, deleting y instead of x) and then replacing the data item stored at x with the one that had been stored at y.[7]

In either case, after making this change to the tree structure, it is necessary to rebalance the tree and update its ranks. As in the case of an insertion, this may be done by following a path upwards in the tree and changing the ranks of the nodes along this path until one of three things happens: the root is reached and the tree is balanced, a node is reached whose rank does not need to be changed and again the tree is balanced, or a node is reached whose rank cannot be changed. In this last case a constant number of tree rotations completes the rebalancing stage of the deletion process.[1][2]

Overall, as with the insertion procedure, a deletion consists of a search downward through the tree (to find the node to be deleted), a continuation of the search farther downward (to find a node with an external child), the removal of a constant number of nodes, a logarithmic number of rank changes, and a constant number of tree rotations.[1][2]

6.9.3 Computational complexity

Each search, insertion, or deletion in a WAVL tree involves following a single path in the tree and performing a constant number of steps for each node in the path. In a WAVL tree with n items that has only undergone insertions, the maximum path length is logφ n ≈ 1.44 log2 n. If both insertions and deletions may have happened, the maximum path length is 2 log2 n. Therefore, in either case, the worst-case time for each search, insertion, or deletion in a WAVL tree with n data items is O(log n).

6.9.4 Related structures

WAVL trees are closely related to both AVL trees and red–black trees. Every AVL tree can have ranks assigned to its nodes in a way that makes it into a WAVL tree. And every WAVL tree can have its nodes colored red and black (and its ranks reassigned) in a way that makes it into a red–black tree. However, some WAVL trees do not come from AVL trees in this way and some red–black trees do not come from WAVL trees in this way.

AVL trees

An AVL tree is a kind of balanced binary search tree in which the two children of each internal node must have heights that differ by at most one.[8] The height of an external node is zero, and the height of any internal node is always one plus the maximum of the heights of its two children. Thus, the height function of an AVL tree obeys the constraints of a WAVL tree, and we may convert any AVL tree into a WAVL tree by using the height of each node as its rank.[1][2]

The key difference between an AVL tree and a WAVL tree arises when a node has two children with the same rank or height. In an AVL tree, if a node x has two children of the same height h as each other, then the height of x must be exactly h + 1. In contrast, in a WAVL tree, if a node x has two children of the same rank r as each other, then the rank of x can be either r + 1 or r + 2. This greater flexibility in ranks also leads to a greater flexibility in structures: some WAVL trees cannot be made into AVL trees even by modifying their ranks, because they include nodes whose children's heights differ by more than one.[2]

If a WAVL tree is created only using insertion operations, then its structure will be the same as the structure of an AVL tree created by the same insertion sequence, and its ranks will be the same as the ranks of the corresponding AVL tree. It is only through deletion operations that a WAVL tree can become different from an AVL tree. In particular this implies that a WAVL tree created only through insertions has height at most logφ n ≈ 1.44 log2 n.[2]

Red–black trees

A red–black tree is a balanced binary search tree in which each node has a color (red or black), satisfying the following properties:

• External nodes are black.

• If an internal node is red, its two children are both black.

• All paths from the root to an external node have equal numbers of black nodes.

Red–black trees can equivalently be defined in terms of a system of ranks, stored at the nodes, satisfying the following requirements (different than the requirements for ranks in WAVL trees):

• The rank of an external node is always 0 and its parent's rank is always 1.

• The rank of any non-root node equals either its parent's rank or its parent's rank minus 1.

• No two consecutive edges on any root-leaf path have rank difference 0.

The equivalence between the color-based and rank-based definitions can be seen, in one direction, by coloring a node black if its parent has greater rank and red if its parent has equal rank. In the other direction, colors can be converted to ranks by making the rank of a black node equal to the number of black nodes on any path to an external node, and by making the rank of a red node equal to that of its parent.[9]

The ranks of the nodes in a WAVL tree can be converted to a system of ranks of nodes, obeying the requirements for red–black trees, by dividing each rank by two and rounding up to the nearest integer.[10] Because of this conversion, for every WAVL tree there exists a valid red–black tree with the same structure. Because red–black trees have maximum height 2 log2 n, the same is true for WAVL trees.[1][2] However, there exist red–black trees that cannot be given a valid WAVL tree rank function.[2]
AVL trees even by modifying their ranks, because they that cannot be given a valid WAVL tree rank function.[2]
include nodes whose children’s heights differ by more Despite the fact that, in terms of their tree structures,
than one.[2] WAVL trees are special cases of red–black trees, their
If a WAVL tree is created only using insertion opera- update operations are different. The tree rotations used

in WAVL tree update operations may make changes that would not be permitted in a red–black tree, because they would in effect cause the recoloring of large subtrees of the red–black tree rather than making color changes only on a single path in the tree.[2] This allows WAVL trees to perform fewer tree rotations per deletion, in the worst case, than red–black trees.[1][2]

6.9.5 References

[1] Goodrich, Michael T.; Tamassia, Roberto (2015), "4.4 Weak AVL Trees", Algorithm Design and Applications, Wiley, pp. 130–138.

[2] Haeupler, Bernhard; Sen, Siddhartha; Tarjan, Robert E. (2015), "Rank-balanced trees" (PDF), ACM Transactions on Algorithms, 11 (4): Art. 30, 26, doi:10.1145/2689412, MR 3361215.

[3] Goodrich & Tamassia (2015), Section 2.3 Trees, pp. 68–83.

[4] Goodrich & Tamassia (2015), Chapter 3 Binary Search Trees, pp. 89–114.

[5] In this we follow Goodrich & Tamassia (2015). In the version described by Haeupler, Sen & Tarjan (2015), the external nodes have rank −1. This variation makes very little difference in the operations of WAVL trees, but it causes some minor changes to the formula for converting WAVL trees to red–black trees.

[6] Goodrich & Tamassia (2015), Section 3.1.2 Searching in a Binary Search Tree, pp. 95–96.

[7] Goodrich & Tamassia (2015), Section 3.1.4 Deletion in a Binary Search Tree, pp. 98–99.

[8] Goodrich & Tamassia (2015), Section 4.2 AVL Trees, pp. 120–125.

[9] Goodrich & Tamassia (2015), Section 4.3 Red–black Trees, pp. 126–129.

[10] In Haeupler, Sen & Tarjan (2015) the conversion is done by rounding down, because the ranks of external nodes are −1 rather than 0. Goodrich & Tamassia (2015) give a formula that also rounds down, but because they use rank 0 for external nodes their formula incorrectly assigns red–black rank 0 to internal nodes with WAVL rank 1.

6.10 Scapegoat tree

In computer science, a scapegoat tree is a self-balancing binary search tree, invented by Arne Andersson[1] and again by Igal Galperin and Ronald L. Rivest.[2] It provides worst-case O(log n) lookup time, and O(log n) amortized insertion and deletion time.

Unlike most other self-balancing binary search trees that provide worst-case O(log n) lookup time, scapegoat trees have no additional per-node memory overhead compared to a regular binary search tree: a node stores only a key and two pointers to the child nodes. This makes scapegoat trees easier to implement and, due to data structure alignment, can reduce node overhead by up to one-third.

6.10.1 Theory

A binary search tree is said to be weight-balanced if half the nodes are on the left of the root, and half on the right. An α-weight-balanced node is defined as meeting a relaxed weight balance criterion:

size(left) <= α*size(node)
size(right) <= α*size(node)

Where size can be defined recursively as:

function size(node)
    if node = nil
        return 0
    else
        return size(node->left) + size(node->right) + 1
end

An α of 1 therefore would describe a linked list as balanced, whereas an α of 0.5 would only match almost complete binary trees.

A binary search tree that is α-weight-balanced must also be α-height-balanced, that is

height(tree) <= log₁/α(NodeCount) + 1

Scapegoat trees are not guaranteed to keep α-weight-balance at all times, but are always loosely α-height-balanced in that

height(scapegoat tree) <= log₁/α(NodeCount) + 1

This makes scapegoat trees similar to red–black trees in that they both have restrictions on their height. They differ greatly though in their implementations of determining where the rotations (or in the case of scapegoat trees, rebalances) take place. Whereas red–black trees store additional 'color' information in each node to determine the location, scapegoat trees find a scapegoat which isn't α-weight-balanced to perform the rebalance operation on. This is loosely similar to AVL trees, in that the actual rotations depend on 'balances' of nodes, but the means of determining the balance differs greatly. Since AVL trees check the balance value on every insertion/deletion, it is typically stored in each node; scapegoat trees are able to calculate it only as needed, which is only when a scapegoat needs to be found.

Unlike most other self-balancing search trees, scapegoat trees are entirely flexible as to their balancing. They support any α such that 0.5 < α < 1. A high α value results in fewer balances, making insertion quicker but lookups and deletions slower, and vice versa for a low α. Therefore, in practical applications, an α can be chosen depending on how frequently these actions should be performed.
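As a concrete illustration (a minimal sketch with invented names, not code from the cited papers), the balance test above translates directly into C++:

struct SgNode {
    int key;
    SgNode *left = nullptr, *right = nullptr;   // scapegoat nodes store no balance data
};

// size() is computed on demand: scapegoat trees keep no per-node counts.
int size(const SgNode* n) {
    if (!n) return 0;
    return size(n->left) + size(n->right) + 1;
}

// A node is alpha-weight-balanced when neither subtree exceeds
// alpha * size(node); a scapegoat is any ancestor that fails this test.
bool alpha_weight_balanced(const SgNode* n, double alpha) {
    int s = size(n);
    return size(n->left) <= alpha * s && size(n->right) <= alpha * s;
}

Computing size on demand is exactly the trade-off described above: no per-node bookkeeping, at the cost of O(size) work when a scapegoat must be found.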
6.10.2 Operations

Insertion

Insertion is implemented with the same basic ideas as an unbalanced binary search tree, however with a few significant changes.

When finding the insertion point, the depth of the new node must also be recorded. This is implemented via a simple counter that gets incremented during each iteration of the lookup, effectively counting the number of edges between the root and the inserted node. If this node violates the α-height-balance property (defined above), a rebalance is required.

To rebalance, an entire subtree rooted at a scapegoat undergoes a balancing operation. The scapegoat is defined as being an ancestor of the inserted node which isn't α-weight-balanced. There will always be at least one such ancestor. Rebalancing any of them will restore the α-height-balanced property.

One way of finding a scapegoat is to climb from the new node back up to the root and select the first node that isn't α-weight-balanced.

Climbing back up to the root requires O(log n) storage space, usually allocated on the stack, or parent pointers. This can actually be avoided by pointing each child at its parent as you go down, and repairing on the walk back up.

To determine whether a potential node is a viable scapegoat, we need to check its α-weight-balanced property. To do this we can go back to the definition:

size(left) <= α*size(node)
size(right) <= α*size(node)

However, a large optimisation can be made by realising that we already know two of the three sizes, leaving only the third to be calculated.

Consider the following example to demonstrate this. Assuming that we're climbing back up to the root:

size(parent) = size(node) + size(sibling) + 1

But as:

size(inserted node) = 1,

the case is trivialized down to:

size[x+1] = size[x] + size(sibling) + 1

where x = this node, x + 1 = parent, and size(sibling) is the only function call actually required.

Once the scapegoat is found, the subtree rooted at the scapegoat is completely rebuilt to be perfectly balanced.[2] This can be done in O(n) time by traversing the nodes of the subtree to find their values in sorted order and recursively choosing the median as the root of the subtree.
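The rebuild step can be sketched as follows (an illustration under the SgNode type assumed earlier, not the authors' code): flatten the subtree with an in-order walk, then rebuild by repeatedly taking the median.

#include <vector>

// Collect the subtree's nodes in sorted (in-order) order.
void flatten(SgNode* n, std::vector<SgNode*>& out) {
    if (!n) return;
    flatten(n->left, out);
    out.push_back(n);
    flatten(n->right, out);
}

// Rebuild a perfectly balanced subtree from nodes[lo, hi) by
// recursively choosing the median as the root.
SgNode* build_balanced(std::vector<SgNode*>& nodes, int lo, int hi) {
    if (lo >= hi) return nullptr;
    int mid = lo + (hi - lo) / 2;
    SgNode* root = nodes[mid];
    root->left = build_balanced(nodes, lo, mid);
    root->right = build_balanced(nodes, mid + 1, hi);
    return root;
}

// Rebuild the subtree rooted at the scapegoat; the caller re-attaches
// the returned root in place of the scapegoat.  Runs in O(n) time.
SgNode* rebuild(SgNode* scapegoat) {
    std::vector<SgNode*> nodes;
    flatten(scapegoat, nodes);
    return build_balanced(nodes, 0, static_cast<int>(nodes.size()));
}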
As rebalance operations take O(n) time (dependent on the number of nodes of the subtree), insertion has a worst-case performance of O(n) time. However, because these worst-case scenarios are spread out, insertion takes O(log n) amortized time.

Sketch of proof for cost of insertion

Define the Imbalance of a node v to be the absolute value of the difference in size between its left node and its right node minus 1, or 0, whichever is greater. In other words:

I(v) = max(|left(v) − right(v)| − 1, 0)

Immediately after rebuilding a subtree rooted at v, I(v) = 0.

Lemma: Immediately before rebuilding the subtree rooted at v, I(v) = Ω(|v|) (Ω denotes Big Omega notation).

Proof of lemma:

Let v0 be the root of a subtree immediately after rebuilding. h(v0) = log(|v0| + 1). If there are Ω(|v0|) degenerate insertions (that is, where each inserted node increases the height by 1), then

I(v) = Ω(|v0|),
h(v) = h(v0) + Ω(|v0|), and
log(|v|) ≤ log(|v0| + 1) + 1.

Since I(v) = Ω(|v|) before rebuilding, there were Ω(|v|) insertions into the subtree rooted at v that did not result in rebuilding. Each of these insertions can be performed in O(log n) time. The final insertion that causes rebuilding costs O(|v|). Using aggregate analysis it becomes clear that the amortized cost of an insertion is O(log n):

(Ω(|v|) · O(log n) + O(|v|)) / Ω(|v|) = O(log n)

Deletion

Scapegoat trees are unusual in that deletion is easier than insertion. To enable deletion, scapegoat trees need to store an additional value with the tree data structure. This property, which we will call MaxNodeCount, simply represents the highest achieved NodeCount. It is set to NodeCount whenever the entire tree is rebalanced, and after insertion is set to max(MaxNodeCount, NodeCount).

To perform a deletion, we simply remove the node as we would in a simple binary search tree, but if

NodeCount <= α*MaxNodeCount

then we rebalance the entire tree about the root, remembering to set MaxNodeCount to NodeCount.
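In code, the deletion wrapper is a thin layer over plain BST deletion. This is again a sketch with invented names, reusing SgNode and rebuild from the sketches above; bst_erase stands in for the ordinary unbalanced deletion routine and is assumed given:

bool bst_erase(SgNode*& root, int key);   // ordinary BST delete; returns false if key absent

// Tree-level bookkeeping for deletions; alpha is fixed at construction.
struct ScapegoatTree {
    SgNode* root = nullptr;
    int node_count = 0;
    int max_node_count = 0;   // highest node_count since the last full rebuild
    double alpha = 0.7;
};

void erase(ScapegoatTree& t, int key) {
    if (!bst_erase(t.root, key)) return;   // key not present
    t.node_count--;
    if (t.node_count <= t.alpha * t.max_node_count) {
        t.root = rebuild(t.root);          // rebalance the entire tree about the root
        t.max_node_count = t.node_count;
    }
}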
This gives deletion its worst-case performance of O(n) time; however, it is amortized to O(log n) average time.

Sketch of proof for cost of deletion

Suppose the scapegoat tree has n elements and has just been rebuilt (in other words, it is a complete binary tree). At most n/2 − 1 deletions can be performed before the tree must be rebuilt. Each of these deletions takes O(log n) time

(the amount of time to search for the element and flag it as deleted). The (n/2)-th deletion causes the tree to be rebuilt and takes O(log n) + O(n) (or just O(n)) time. Using aggregate analysis it becomes clear that the amortized cost of a deletion is O(log n):

( Σ_{1}^{n/2} O(log n) + O(n) ) / (n/2) = ( (n/2) O(log n) + O(n) ) / (n/2) = O(log n)

Lookup

Lookup is not modified from a standard binary search tree, and has a worst-case time of O(log n). This is in contrast to splay trees which have a worst-case time of O(n). The reduced node memory overhead compared to other self-balancing binary search trees can further improve locality of reference and caching.

6.10.3 See also

• Splay tree
• Trees
• Tree rotation
• AVL tree
• B-tree
• T-tree
• List of data structures

6.10.4 References

[1] Andersson, Arne (1989). Improving partial rebuilding by using simple balance criteria. Proc. Workshop on Algorithms and Data Structures. Journal of Algorithms. Springer-Verlag. pp. 393–402. doi:10.1007/3-540-51542-9_33.

[2] Galperin, Igal; Rivest, Ronald L. (1993). "Scapegoat trees". Proceedings of the fourth annual ACM-SIAM Symposium on Discrete algorithms: 165–174.

6.10.5 External links

• Scapegoat Tree Applet by Kubo Kovac
• Scapegoat Trees: Galperin and Rivest's paper describing scapegoat trees
• On Consulting a Set of Experts and Searching (full version paper)
• Open Data Structures - Chapter 8 - Scapegoat Trees

6.11 Splay tree

A splay tree is a self-adjusting binary search tree with the additional property that recently accessed elements are quick to access again. It performs basic operations such as insertion, look-up and removal in O(log n) amortized time. For many sequences of non-random operations, splay trees perform better than other search trees, even when the specific pattern of the sequence is unknown. The splay tree was invented by Daniel Sleator and Robert Tarjan in 1985.[1]

All normal operations on a binary search tree are combined with one basic operation, called splaying. Splaying the tree for a certain element rearranges the tree so that the element is placed at the root of the tree. One way to do this is to first perform a standard binary tree search for the element in question, and then use tree rotations in a specific fashion to bring the element to the top. Alternatively, a top-down algorithm can combine the search and the tree reorganization into a single phase.

6.11.1 Advantages

Good performance for a splay tree depends on the fact that it is self-optimizing, in that frequently accessed nodes will move nearer to the root where they can be accessed more quickly. The worst-case height, though unlikely, is O(n), with the average being O(log n). Having frequently used nodes near the root is an advantage for many practical applications (also see Locality of reference), and is particularly useful for implementing caches and garbage collection algorithms.

Advantages include:

• Comparable performance: Average-case performance is as efficient as other trees.[2]

• Small memory footprint: Splay trees do not need to store any bookkeeping data.

6.11.2 Disadvantages

The most significant disadvantage of splay trees is that the height of a splay tree can be linear. For example, this will be the case after accessing all n elements in non-decreasing order. Since the height of a tree corresponds to the worst-case access time, this means that the actual cost of an operation can be high. However, the amortized access cost of this worst case is logarithmic, O(log n). Also, the expected access cost can be reduced to O(log n) by using a randomized variant.[3]

The representation of splay trees can change even when they are accessed in a 'read-only' manner (i.e. by find operations). This complicates the use of such splay trees in

a multi-threaded environment. Specifically, extra management is needed if multiple threads are allowed to perform find operations concurrently. This also makes them unsuitable for general use in purely functional programming, although even there they can be used in limited ways to implement priority queues.

6.11.3 Operations

Splaying

When a node x is accessed, a splay operation is performed on x to move it to the root. To perform a splay operation we carry out a sequence of splay steps, each of which moves x closer to the root. By performing a splay operation on the node of interest after every access, the recently accessed nodes are kept near the root and the tree remains roughly balanced, so that we achieve the desired amortized time bounds.

Each particular step depends on three factors:

• Whether x is the left or right child of its parent node, p,

• whether p is the root or not, and if not

• whether p is the left or right child of its parent, g (the grandparent of x).

It is important to remember to set gg (the great-grandparent of x) to now point to x after any splay operation. If gg is null, then x obviously is now the root and must be updated as such.

There are three types of splay steps, each of which has a left- and right-handed case. For the sake of brevity, only one of these two is shown for each type. These three types are:

Zig step: this step is done when p is the root. The tree is rotated on the edge between x and p. Zig steps exist to deal with the parity issue and will be done only as the last step in a splay operation and only when x has odd depth at the beginning of the operation.

      p               x
     / \             / \
    x   C    ->    A    p
   / \                 / \
  A   B               B   C

Zig-zig step: this step is done when p is not the root and x and p are either both right children or are both left children. In the case where x and p are both left children, the tree is rotated on the edge joining p with its parent g, then rotated on the edge joining x with p. Note that zig-zig steps are the only thing that differentiate splay trees from the rotate-to-root method introduced by Allen and Munro[4] prior to the introduction of splay trees.

Zig-zag step: this step is done when p is not the root and x is a right child and p is a left child or vice versa. The tree is rotated on the edge between p and x, and then rotated on the resulting edge between x and g.

Join

Given two trees S and T such that all elements of S are smaller than the elements of T, the following steps can be used to join them to a single tree:

• Splay the largest item in S. Now this item is in the root of S and has a null right child.

• Set the right child of the new root to T.

Split

Given a tree and an element x, return two new trees: one containing all elements less than or equal to x and the other containing all elements greater than x. This can be done in the following way:

• Splay x. Now it is in the root, so the tree to its left contains all elements smaller than x and the tree to its right contains all elements larger than x.

• Split the right subtree from the rest of the tree.

Insertion

To insert a value x into a splay tree:

• Insert x as with a normal binary search tree.

• When an item is inserted, a splay is performed.

• As a result, the newly inserted node x becomes the root of the tree.

ALTERNATIVE:

• Use the split operation to split the tree at the value of x to two sub-trees: S and T.

• Create a new tree in which x is the root, S is its left sub-tree and T its right sub-tree.

Deletion

To delete a node x, use the same method as with a binary search tree: if x has two children, swap its value with that of either the rightmost node of its left subtree (its in-order predecessor) or the leftmost node of its right subtree (its in-order successor). Then remove that node instead. In this way, deletion is reduced to the problem of removing a node with 0 or 1 children. Unlike a binary search tree, in a splay tree after deletion, we splay the parent of the removed node to the top of the tree.

ALTERNATIVE:

• The node to be deleted is first splayed, i.e. brought to the root of the tree, and then deleted. This leaves the tree with two sub-trees.

• The two sub-trees are then joined using a "join" operation.

6.11.4 Implementation and variants

Splaying, as mentioned above, is performed during a second, bottom-up pass over the access path of a node. It is possible to record the access path during the first pass for use during the second, but that requires extra space during the access operation. Another alternative is to keep a parent pointer in every node, which avoids the need for extra space during access operations but may reduce overall time efficiency because of the need to update those pointers.[1]

Another method which can be used is based on the argument that we can restructure the tree on our way down the access path instead of making a second pass. This top-down splaying routine uses three sets of nodes - left tree, right tree and middle tree. The first two contain all items of the original tree known to be less than or greater than the current item respectively. The middle tree consists of the sub-tree rooted at the current node. These three sets are updated down the access path while keeping the splay operations in check. Another method, semisplaying, modifies the zig-zig case to reduce the amount of restructuring done in all operations.[1][5]

Below there is an implementation of splay trees in C++, which uses pointers to represent each node on the tree. This implementation is based on the bottom-up splaying version and uses the second method of deletion on a splay tree. Also, unlike the above definition, this C++ version does not splay the tree on finds - it only splays on insertions and deletions.

#include <functional>
#ifndef SPLAY_TREE
#define SPLAY_TREE

template<typename T, typename Comp = std::less<T>>
class splay_tree {
private:
    Comp comp;
    unsigned long p_size;

    struct node {
        node *left, *right;
        node *parent;
        T key;
        node(const T& init = T()) : left(nullptr), right(nullptr), parent(nullptr), key(init) { }
        ~node() {
            // Recursively frees the owned subtrees (children only).
            delete left;
            delete right;
        }
    } *root;

    void left_rotate(node *x) {
        node *y = x->right;
        if (y) {
            x->right = y->left;
            if (y->left) y->left->parent = x;
            y->parent = x->parent;
        }
        if (!x->parent) root = y;
        else if (x == x->parent->left) x->parent->left = y;
        else x->parent->right = y;
        if (y) y->left = x;
        x->parent = y;
    }

    void right_rotate(node *x) {
        node *y = x->left;
        if (y) {
            x->left = y->right;
            if (y->right) y->right->parent = x;
            y->parent = x->parent;
        }
        if (!x->parent) root = y;
        else if (x == x->parent->left) x->parent->left = y;
        else x->parent->right = y;
        if (y) y->right = x;
        x->parent = y;
    }

    void splay(node *x) {
        while (x->parent) {
            if (!x->parent->parent) {
                // Zig step: the parent is the root.
                if (x->parent->left == x) right_rotate(x->parent);
                else left_rotate(x->parent);
            } else if (x->parent->left == x && x->parent->parent->left == x->parent) {
                // Zig-zig step (left-left).
                right_rotate(x->parent->parent);
                right_rotate(x->parent);
            } else if (x->parent->right == x && x->parent->parent->right == x->parent) {
                // Zig-zig step (right-right).
                left_rotate(x->parent->parent);
                left_rotate(x->parent);
            } else if (x->parent->left == x && x->parent->parent->right == x->parent) {
                // Zig-zag step.
                right_rotate(x->parent);
                left_rotate(x->parent);
            } else {
                // Zig-zag step (mirror case).
                left_rotate(x->parent);
                right_rotate(x->parent);
            }
        }
    }

    void replace(node *u, node *v) {
        if (!u->parent) root = v;
        else if (u == u->parent->left) u->parent->left = v;
        else u->parent->right = v;
        if (v) v->parent = u->parent;
    }

    node* subtree_minimum(node *u) {
        while (u->left) u = u->left;
        return u;
    }

    node* subtree_maximum(node *u) {
        while (u->right) u = u->right;
        return u;
    }

public:
    splay_tree() : p_size(0), root(nullptr) { }

    void insert(const T &key) {
        node *z = root;
        node *p = nullptr;
        while (z) {
            p = z;
            if (comp(z->key, key)) z = z->right;
            else z = z->left;
        }
        z = new node(key);
        z->parent = p;
        if (!p) root = z;
        else if (comp(p->key, z->key)) p->right = z;
        else p->left = z;
        splay(z);
        p_size++;
    }

    node* find(const T &key) {
        node *z = root;
        while (z) {
            if (comp(z->key, key)) z = z->right;
            else if (comp(key, z->key)) z = z->left;
            else return z;
        }
        return nullptr;
    }

    void erase(const T &key) {
        node *z = find(key);
        if (!z) return;
        splay(z);
        if (!z->left) replace(z, z->right);
        else if (!z->right) replace(z, z->left);
        else {
            node *y = subtree_minimum(z->right);
            if (y->parent != z) {
                replace(y, y->right);
                y->right = z->right;
                y->right->parent = y;
            }
            replace(z, y);
            y->left = z->left;
            y->left->parent = y;
        }
        z->left = z->right = nullptr;   // detach children so ~node does not free the re-linked subtrees
        delete z;
        p_size--;
    }

    const T& minimum() { return subtree_minimum(root)->key; }
    const T& maximum() { return subtree_maximum(root)->key; }

    bool empty() const { return root == nullptr; }
    unsigned long size() const { return p_size; }
};

#endif // SPLAY_TREE

6.11.5 Analysis

A simple amortized analysis of static splay trees can be carried out using the potential method. Define:

• size(r) = the number of nodes in the sub-tree rooted at node r (including r).

• rank(r) = log2(size(r)).

• Φ = the sum of the ranks of all the nodes in the tree.

Φ will tend to be high for poorly balanced trees and low for well-balanced trees.

To apply the potential method, we first calculate ΔΦ: the change in the potential caused by a splay operation. We check each case separately. Denote by rank′ the rank function after the operation. x, p and g are the nodes affected by the rotation operation (see figures above).

Zig step:

ΔΦ = rank′(x) + rank′(p) − rank(x) − rank(p) ≤ rank′(x) − rank(x),

since rank′(x) = rank(p) and rank′(p) ≤ rank′(x).

Zig-Zig step:

ΔΦ = rank′(x) + rank′(p) + rank′(g) − rank(x) − rank(p) − rank(g) ≤ 3(rank′(x) − rank(x)) − 2.

Zig-Zag step:

ΔΦ = rank′(x) + rank′(p) + rank′(g) − rank(x) − rank(p) − rank(g) ≤ 2(rank′(x) − rank(x)) − 2.

The amortized cost of any operation is ΔΦ plus the actual cost. The actual cost of any zig-zig or zig-zag operation is 2 since there are two rotations to make. Hence:

amortized cost = cost + ΔΦ ≤ 3(rank′(x) − rank(x)).

When summed over the entire splay operation, this telescopes to 3(rank(root) − rank(x)), which is O(log n). The Zig operation adds an amortized cost of 1, but there's at most one such operation.

So now we know that the total amortized time for a sequence of m operations is:

Tamortized(m) = O(m log n)

To go from the amortized time to the actual time, we must add the decrease in potential from the initial state before any operation is done (Φi) to the final state after all operations are completed (Φf):

Φi − Φf = Σ_x [ranki(x) − rankf(x)] = O(n log n),

where the last inequality comes from the fact that for every node x, the minimum rank is 0 and the maximum rank is log(n).

Now we can finally bound the actual time:

Tactual(m) = O(m log n + n log n)

Weighted analysis

The above analysis can be generalized in the following way:

• Assign to each node r a weight w(r).

• Define size(r) = the sum of weights of nodes in the sub-tree rooted at node r (including r).

• Define rank(r) and Φ exactly as above.

The same analysis applies and the amortized cost of a splaying operation is again:

rank(root) − rank(x) = O(log W − log w(x)) = O(log(W/w(x))),

where W is the sum of all weights.

The decrease from the initial to the final potential is bounded by:

Φi − Φf ≤ Σ_{x∈tree} log(W/w(x)),

since the maximum size of any single node is W and the minimum is w(x).

Hence the actual time is bounded by:

O( Σ_{x∈sequence} log(W/w(x)) + Σ_{x∈tree} log(W/w(x)) )

6.11.6 Performance theorems

There are several theorems and conjectures regarding the worst-case runtime for performing a sequence S of m accesses in a splay tree containing n elements.

Balance Theorem — The cost of performing the sequence S is O[m log n + n log n].

Proof: Take a constant weight, e.g. w(x) = 1 for every node x. Then W = n.

This theorem implies that splay trees perform as well as static balanced binary search trees on sequences of at least n accesses.[1]

Static Optimality Theorem — Let qx be the number of times element x is accessed in S. If every element is accessed at least once, then the cost of performing S is O[m + Σ_{x∈tree} qx log(m/qx)].

Proof: Let w(x) = qx. Then W = m.

This theorem implies that splay trees perform as well as an optimum static binary search tree on sequences of at least n accesses. They spend less time on the more frequent items.[1]

Static Finger Theorem — Assume that the items are numbered from 1 through n in ascending order. Let f be any fixed element (the 'finger'). Then the cost of performing S is O[m + n log n + Σ_{x∈sequence} log(|x − f| + 1)].

Proof: Let w(x) = 1/(|x − f| + 1)^2. Then W = O(1). The net potential drop is O(n log n) since the weight of any item is at least 1/n^2.[1]

Dynamic Finger Theorem — Assume that the 'finger' for each step accessing an element y is the element accessed in the previous step, x. The cost of performing S is O[m + n + Σ_{x,y∈sequence} log(|y − x| + 1)].[6][7]

Working Set Theorem — At any time during the sequence, let t(x) be the number of distinct elements accessed before the previous time element x was accessed. The cost of performing S is O[m + n log n + Σ_{x∈sequence} log(t(x) + 1)].

Proof: Let w(x) = 1/(t(x) + 1)^2. Note that here the weights change during the sequence. However, the sequence of weights is still a permutation of 1, 1/4, 1/9, ..., 1/n^2. So as before W = O(1). The net potential drop is O(n log n).

This theorem is equivalent to splay trees having key-independent optimality.[1]

Scanning Theorem — Also known as the Sequential Access Theorem or the Queue Theorem. Accessing the n elements of a splay tree in symmetric order takes O(n) time, regardless of the initial structure of the splay tree.[8] The tightest upper bound proven so far is 4.5n.[9]

6.11.7 Dynamic optimality conjecture

Main article: Optimal binary search tree

In addition to the proven performance guarantees for splay trees there is an unproven conjecture of great interest from the original Sleator and Tarjan paper. This conjecture is known as the dynamic optimality conjecture and it basically claims that splay trees perform as well as any other binary search tree algorithm up to a constant factor.

Dynamic Optimality Conjecture:[1] Let A be any binary search tree algorithm that accesses an element x by traversing the path from the root to x at a cost of d(x) + 1, and that between accesses can make any rotations in the tree at a cost of 1 per rotation. Let A(S) be the cost for A to perform the sequence S of accesses. Then the cost for a splay tree to perform the same accesses is O[n + A(S)].

There are several corollaries of the dynamic optimality conjecture that remain unproven:

Traversal Conjecture:[1] Let T1 and T2 be two splay trees containing the same elements. Let S be the sequence obtained by visiting the elements in T2 in preorder (i.e., depth first search order). The total cost of performing the sequence S of accesses on T1 is O(n).

Deque Conjecture:[8][10][11] Let S be a sequence of m double-ended queue operations (push, pop, inject, eject). Then the cost of performing S on a splay tree is O(m + n).

Split Conjecture:[5] Let S be any permutation of the elements of the splay tree. Then the cost of deleting the elements in the order S is O(n).

6.11.8 Variants

In order to reduce the number of restructuring operations, it is possible to replace the splaying with semi-splaying, in which an element is splayed only halfway towards the root.[1][12]

Another way to reduce restructuring is to do full splaying, but only in some of the access operations - only when the access path is longer than a threshold, or only in the first m access operations.[1]

6.11.9 See also

• Finger tree
• Link/cut tree
• Scapegoat tree
• Zipper (data structure)
• Trees
• Tree rotation
• AVL tree
• B-tree
• T-tree
• List of data structures
• Iacono's working set structure
• Geometry of binary search trees
• Splaysort, a sorting algorithm using splay trees

6.11.10 Notes

[1] Sleator & Tarjan 1985.
[2] Goodrich, Tamassia & Goldwasser 2014.
[3] Albers & Karpinski 2002.
[4] Allen & Munro 1978.
[5] Lucas 1991.
[6] Cole et al. 2000.
[7] Cole 2000.
[8] Tarjan 1985.
[9] Elmasry 2004.
[10] Pettie 2008.
[11] Sundar 1992.
[12] Brinkmann, Degraer & De Loof 2009.

6.11.11 References

• Albers, Susanne; Karpinski, Marek (28 February 2002). "Randomized Splay Trees: Theoretical and Experimental Results" (PDF). Information Processing Letters. 81 (4): 213–221. doi:10.1016/s0020-0190(01)00230-7.

• Allen, Brian; Munro, Ian (October 1978). "Self-organizing search trees". Journal of the ACM. 25 (4): 526–535. doi:10.1145/322092.322094.

• Brinkmann, Gunnar; Degraer, Jan; De Loof, Karel (January 2009). "Rehabilitation of an unloved child: semi-splaying" (PDF). Software—Practice and Experience. 39 (1): 33–45. CiteSeerX 10.1.1.84.790. doi:10.1002/spe.v39:1. The results show that semi-splaying, which was introduced in the same paper as splaying, performs better than splaying under almost all possible conditions. This makes semi-splaying a good alternative for all applications where normally splaying would be applied. The reason why splaying became so prominent while semi-splaying is relatively unknown and much less studied is hard to understand.

• Cole, Richard; Mishra, Bud; Schmidt, Jeanette; Siegel, Alan (January 2000). "On the Dynamic Finger Conjecture for Splay Trees. Part I: Splay Sorting log n-Block Sequences". SIAM Journal on Computing. 30 (1): 1–43. doi:10.1137/s0097539797326988.

• Cole, Richard (January 2000). "On the Dynamic Finger Conjecture for Splay Trees. Part II: The Proof". SIAM Journal on Computing. 30 (1): 44–85. doi:10.1137/S009753979732699X.

• Elmasry, Amr (April 2004). "On the sequential access theorem and Deque conjecture for splay trees" (PDF). Theoretical Computer Science. 314 (3): 459–466. doi:10.1016/j.tcs.2004.01.019.

• Goodrich, Michael; Tamassia, Roberto; Goldwasser, Michael (2014). Data Structures and Algorithms in Java (6th ed.). Wiley. p. 506. ISBN 978-1-118-77133-4.

• Knuth, Donald (1997). The Art of Computer Programming. 3: Sorting and Searching (3rd ed.). Addison-Wesley. p. 478. ISBN 0-201-89685-0.

• Lucas, Joan M. (1991). "On the Competitiveness of Splay Trees: Relations to the Union-Find Problem". On-line Algorithms: Proceedings of a DIMACS Workshop, February 11–13, 1991. Series in Discrete Mathematics and Theoretical Computer Science. 7. Center for Discrete Mathematics and Theoretical Computer Science. pp. 95–124. ISBN 0-8218-7111-0.

• Pettie, Seth (2008). "Splay Trees, Davenport-Schinzel Sequences, and the Deque Conjecture" (PDF). Proc. 19th ACM-SIAM Symposium on Discrete Algorithms. 0707: 1115–1124. arXiv:0707.2160. Bibcode:2007arXiv0707.2160P.

• Sleator, Daniel D.; Tarjan, Robert E. (1985). "Self-Adjusting Binary Search Trees" (PDF).

Journal of the ACM. 32 (3): 652–686. doi:10.1145/3828.3835.

• Sundar, Rajamani (1992). "On the Deque conjecture for the splay algorithm". Combinatorica. 12 (1): 95–124. doi:10.1007/BF01191208.

• Tarjan, Robert E. (1985). "Sequential access in splay trees takes linear time". Combinatorica. 5 (4): 367–378. doi:10.1007/BF02579253.

6.11.12 External links

• NIST's Dictionary of Algorithms and Data Structures: Splay Tree
• Implementations in C and Java (by Daniel Sleator)
• Pointers to splay tree visualizations
• Fast and efficient implementation of Splay trees
• Top-Down Splay Tree Java implementation
• Zipper Trees
• splay tree video

6.12 Tango tree

A tango tree is a type of binary search tree proposed by Erik D. Demaine, Dion Harmon, John Iacono, and Mihai Patrascu in 2004.[1] It is named after Buenos Aires, sometimes considered the Tango World Capital.

It is an online binary search tree that achieves an O(log log n) competitive ratio relative to the optimal offline binary search tree, while only using O(log log n) additional bits of memory per node. This improved upon the previous best known competitive ratio, which was O(log n).

6.12.1 Structure

Tango trees work by partitioning a binary search tree into a set of preferred paths, which are themselves stored in auxiliary trees (so the tango tree is represented as a tree of trees).

Reference Tree

To construct a tango tree, we simulate a complete binary search tree called the reference tree, which is simply a traditional binary search tree containing all the elements. This tree never shows up in the actual implementation, but is the conceptual basis behind the following pieces of a tango tree.

Preferred Paths

First, we define for each node its preferred child, which informally is the most recently touched child by a traditional binary search tree lookup. More formally, consider a subtree T, rooted at p, with children l (left) and r (right). We set r as the preferred child of p if the most recently accessed node in T is in the subtree rooted at r, and l as the preferred child otherwise. Note that if the most recently accessed node of T is p itself, then l is the preferred child by definition.

A preferred path is defined by starting at the root and following the preferred children until reaching a leaf node. Removing the nodes on this path partitions the remainder of the tree into a number of subtrees, and we recurse on each subtree (forming a preferred path from its root, which partitions the subtree into more subtrees).

Auxiliary Trees

To represent a preferred path, we store its nodes in a balanced binary search tree, specifically a red-black tree. Each non-leaf node n in a preferred path P has a non-preferred child c, which is the root of a new auxiliary tree. We attach this other auxiliary tree's root (c) to n in P, thus linking the auxiliary trees together. We also augment the auxiliary tree by storing at each node the minimum and maximum depth (depth in the reference tree, that is) of nodes in the subtree under that node.

6.12.2 Algorithm

Searching

To search for an element in the tango tree, we simply simulate searching the reference tree. We start by searching the preferred path connected to the root, which is simulated by searching the auxiliary tree corresponding to that preferred path. If the auxiliary tree doesn't contain the desired element, the search terminates on the parent of the root of the subtree containing the desired element (the beginning of another preferred path), so we simply proceed by searching the auxiliary tree for that preferred path, and so forth.

Updating

In order to maintain the structure of the tango tree (auxiliary trees correspond to preferred paths), we must do some updating work whenever preferred children change as a result of searches. When a preferred child changes, the top part of a preferred path becomes detached from the bottom part (which becomes its own preferred path) and reattached to another preferred path (which becomes the new bottom part). In order to do this efficiently, we'll define cut and join operations on our auxiliary trees.

Join

Our join operation will combine two auxiliary trees as long as they have the property that the top node of one (in the reference tree) is a child of the bottom node of the other (essentially, that the corresponding preferred paths can be concatenated). This will work based on the concatenate operation of red-black trees, which combines two trees as long as they have the property that all elements of one are less than all elements of the other, and split, which does the reverse. In the reference tree, note that there exist two nodes in the top path such that a node is in the bottom path if and only if its key-value is between them. Now, to join the bottom path to the top path, we simply split the top path between those two nodes, then concatenate the two resulting auxiliary trees on either side of the bottom path's auxiliary tree, and we have our final, joined auxiliary tree.

Cut

Our cut operation will break a preferred path into two parts at a given node, a top part and a bottom part. More formally, it'll partition an auxiliary tree into two auxiliary trees, such that one contains all nodes at or above a certain depth in the reference tree, and the other contains all nodes below that depth. As in join, note that the top part has two nodes that bracket the bottom part. Thus, we can simply split on each of these two nodes to divide the path into three parts, then concatenate the two outer ones so we end up with two parts, the top and bottom, as desired.

6.12.3 Analysis Competitive Ratio


In order to bound the competitive ratio for tango trees,
Tango trees are O(log log n) -competitive, because the
we must find a lower bound on the performance of the
work done by the optimal offline binary search tree is
optimal offline tree that we use as a benchmark. Once
at least linear in k (the total number of preferred child
we find an upper bound on the performance of the tango
switches), and the work done by the tango tree is at most
tree, we can divide them to bound the competitive ratio.
(k + 1)O(log log n) .
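The division the section refers to can be written out as a single display. This merely restates the argument above in symbols; it is not an additional result:

    \frac{(k+1)\,O(\log\log n)}{\Omega(k)} \;=\; O(\log\log n).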

6.12.4 See also

• Splay tree
• Optimal binary search tree
• Red-black tree
• Tree (data structure)

6.12.5 References

[1] Demaine, E. D.; Harmon, D.; Iacono, J.; Pătraşcu, M. (2007). "Dynamic Optimality—Almost". SIAM Journal on Computing. 37 (1): 240. doi:10.1137/S0097539705447347.

6.13 Skip list

In computer science, a skip list is a data structure that allows fast search within an ordered sequence of elements. Fast search is made possible by maintaining a linked hierarchy of subsequences, with each successive subsequence skipping over fewer elements than the previous one. Searching starts in the sparsest subsequence until two consecutive elements have been found, one smaller and one larger than or equal to the element searched for. Via the linked hierarchy, these two elements link to elements of the next sparsest subsequence, where searching is continued until finally we are searching in the full sequence. The elements that are skipped over may be chosen probabilistically[2] or deterministically,[3] with the former being more common.

6.13.1 Description

A schematic picture of the skip list data structure. Each box with an arrow represents a pointer and a row is a linked list giving a sparse subsequence; the numbered boxes at the bottom represent the ordered data sequence. Searching proceeds downwards from the sparsest subsequence at the top until consecutive elements bracketing the search element are found.

A skip list is built in layers. The bottom layer is an ordinary ordered linked list. Each higher layer acts as an "express lane" for the lists below, where an element in layer i appears in layer i+1 with some fixed probability p (two commonly used values for p are 1/2 or 1/4). On average, each element appears in 1/(1-p) lists, and the tallest element (usually a special head element at the front of the skip list) appears in all the lists. The skip list contains log1/p n lists.

A search for a target element begins at the head element in the top list, and proceeds horizontally until the current element is greater than or equal to the target. If the current element is equal to the target, it has been found. If the current element is greater than the target, or the search reaches the end of the linked list, the procedure is repeated after returning to the previous element and dropping down vertically to the next lower list. The expected number of steps in each linked list is at most 1/p, which can be seen by tracing the search path backwards from the target until reaching an element that appears in the next higher list or reaching the beginning of the current list. Therefore, the total expected cost of a search is (log1/p n)/p, which is O(log n) when p is a constant. By choosing different values of p, it is possible to trade search costs against storage costs.
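The layer structure and the search procedure above translate directly into code. The following is a minimal sketch in Python, not a production implementation; the names (Node, SkipList, MAX_LEVEL) and the fixed choice p = 1/2 are ours, chosen for illustration:

    import random

    class Node:
        def __init__(self, key, level):
            self.key = key
            self.forward = [None] * level   # forward[i]: next node in list i

    class SkipList:
        MAX_LEVEL = 16      # enough headroom for ~2^16 elements with p = 1/2
        P = 0.5             # probability of promoting an element one layer up

        def __init__(self):
            self.head = Node(None, self.MAX_LEVEL)  # head appears in every list
            self.level = 1                          # number of non-empty lists

        def _random_level(self):
            level = 1
            while random.random() < self.P and level < self.MAX_LEVEL:
                level += 1
            return level

        def search(self, key):
            node = self.head
            # Start in the sparsest list; drop down when the next key is too big.
            for i in reversed(range(self.level)):
                while node.forward[i] is not None and node.forward[i].key < key:
                    node = node.forward[i]
            node = node.forward[0]    # candidate element in the full (bottom) list
            return node is not None and node.key == key

        def insert(self, key):
            # Remember, per list, the rightmost node whose key is smaller than `key`.
            update = [self.head] * self.MAX_LEVEL
            node = self.head
            for i in reversed(range(self.level)):
                while node.forward[i] is not None and node.forward[i].key < key:
                    node = node.forward[i]
                update[i] = node
            level = self._random_level()
            self.level = max(self.level, level)
            new = Node(key, level)
            for i in range(level):    # splice the new node into each of its lists
                new.forward[i] = update[i].forward[i]
                update[i].forward[i] = new

For example, after creating s = SkipList() and calling s.insert(x) for x in range(100), s.search(50) returns True after an expected O(log n) number of comparisons.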
Implementation details

Inserting elements to skip list

The elements used for a skip list can contain more than one pointer since they can participate in more than one list.

Insertions and deletions are implemented much like the corresponding linked-list operations, except that "tall" elements must be inserted into or deleted from more than one linked list.

O(n) operations, which force us to visit every node in ascending order (such as printing the entire list), provide the opportunity to perform a behind-the-scenes derandomization of the level structure of the skip-list in an optimal way, bringing the skip list to O(log n) search time. (Choose the level of the i'th finite node to be 1 plus the number of times we can repeatedly divide i by 2 before it becomes odd. Also, i=0 for the negative infinity header as we have the usual special case of choosing the highest possible level for negative and/or positive infinite nodes.) However this also allows someone to know where all of the higher-than-level 1 nodes are and delete them.
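The parenthetical level rule is easy to state in code. A small Python helper (the function name is ours) that computes the derandomized level of the i'th node:

    def derandomized_level(i):
        # 1 plus the number of times i can be halved before it becomes odd,
        # i.e. one more than the number of trailing zero bits of i.
        level = 1
        while i > 0 and i % 2 == 0:
            i //= 2
            level += 1
        return level

    # [derandomized_level(i) for i in range(1, 9)] == [1, 2, 1, 3, 1, 2, 1, 4]

This is exactly the perfectly balanced layout: every second node reaches level 2, every fourth node reaches level 3, and so on.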
Alternatively, we could make the level structure quasi-random in the following way:

    make all nodes level 1
    j ← 1
    while the number of nodes at level j > 1 do
        for each i'th node at level j do
            if i is odd
                if i is not the last node at level j
                    randomly choose whether to promote it to level j+1
                else
                    do not promote
                end if
            else if i is even and node i-1 was not promoted
                promote it to level j+1
            end if
        repeat
        j ← j + 1
    repeat

Like the derandomized version, quasi-randomization is only done when there is some other reason to be running an O(n) operation (which visits every node).

The advantage of this quasi-randomness is that it doesn't give away nearly as much level-structure related information to an adversarial user as the de-randomized one. This is desirable because an adversarial user who is able to tell which nodes are not at the lowest level can pessimize performance by simply deleting higher-level nodes. (Bethea and Reiter however argue that nonetheless an adversary can use probabilistic and timing methods to force performance degradation.[4]) The search performance is still guaranteed to be logarithmic.

It would be tempting to make the following "optimization": in the part which says "for each i'th node at level j", forget about doing a coin-flip for each even-odd pair. Just flip a coin once to decide whether to promote only the even ones or only the odd ones. Instead of O(n log n) coin flips, there would only be O(log n) of them. Unfortunately, this gives the adversarial user a 50/50 chance of being correct upon guessing that all of the even numbered nodes (among the ones at level 1 or higher) are higher than level one. This is despite the property that he has a very low probability of guessing that a particular node is at level N for some integer N.

A skip list does not provide the same absolute worst-case performance guarantees as more traditional balanced tree data structures, because it is always possible (though with very low probability) that the coin-flips used to build the skip list will produce a badly balanced structure. However, they work well in practice, and the randomized balancing scheme has been argued to be easier to implement than the deterministic balancing schemes used in balanced binary search trees. Skip lists are also useful in parallel computing, where insertions can be done in different parts of the skip list in parallel without any global rebalancing of the data structure. Such parallelism can be especially advantageous for resource discovery in an ad-hoc wireless network because a randomized skip list can be made robust to the loss of any single node.[5]
Indexable skiplist

As described above, a skiplist is capable of fast O(log n) insertion and removal of values from a sorted sequence, but it has only slow O(n) lookups of values at a given position in the sequence (i.e. return the 500th value); however, with a minor modification the speed of random access indexed lookups can be improved to O(log n).

For every link, also store the width of the link. The width is defined as the number of bottom layer links being traversed by each of the higher layer "express lane" links. For example, here are the widths of the links in the example at the top of the page:

     1                                                            10
    o---> o---------------------------------------------------------> o   Top level
     1           3           2                    5
    o---> o---------------> o---------> o---------------------------> o   Level 3
     1        2        1        2              3           2
    o---> o---------> o---> o---------> o---------------> o---------> o   Level 2
     1     1     1     1     1     1     1     1     1     1     1
    o---> o---> o---> o---> o---> o---> o---> o---> o---> o---> o---> o   Bottom level

    Head  1st   2nd   3rd   4th   5th   6th   7th   8th   9th   10th  NIL
          Node  Node  Node  Node  Node  Node  Node  Node  Node  Node

Notice that the width of a higher level link is the sum of the component links below it (i.e. the width 10 link spans the links of widths 3, 2 and 5 immediately below it). Consequently, the sum of all widths is the same on every level (10 + 1 = 1 + 3 + 2 + 5 = 1 + 2 + 1 + 2 + 3 + 2).

To index the skiplist and find the i'th value, traverse the skiplist while counting down the widths of each traversed link. Descend a level whenever the upcoming width would be too large.

For example, to find the node in the fifth position (Node 5), traverse a link of width 1 at the top level. Now four more steps are needed but the next width on this level is ten, which is too large, so drop one level. Traverse one link of width 3. Since another step of width 2 would be too far, drop down to the bottom level. Now traverse the final link of width 1 to reach the target running total of 5 (1+3+1).

    function lookupByPositionIndex(i)
        node ← head
        i ← i + 1                            # don't count the head as a step
        for level from top to bottom do
            while i ≥ node.width[level] do   # if next step is not too far
                i ← i - node.width[level]    # subtract the current width
                node ← node.next[level]      # traverse forward at the current level
            repeat
        repeat
        return node.value
    end function

This method of implementing indexing is detailed in Section 3.4 Linear List Operations in "A skip list cookbook" by William Pugh.

6.13.2 History

Skip lists were first described in 1989 by William Pugh.[6]

To quote the author:

    Skip lists are a probabilistic data structure that seem likely to supplant balanced trees as the implementation method of choice for many applications. Skip list algorithms have the same asymptotic expected time bounds as balanced trees and are simpler, faster and use less space.

6.13.3 Usages

List of applications and frameworks that use skip lists:

• MemSQL uses skiplists as its prime indexing structure for its database technology.

• Cyrus IMAP server offers a "skiplist" backend DB implementation (source file)

• Lucene uses skip lists to search delta-encoded posting lists in logarithmic time.

• QMap (up to Qt 4) template class of Qt that provides a dictionary.

• Redis, an ANSI-C open-source persistent key/value store for Posix systems, uses skip lists in its implementation of ordered sets.[7]

• nessDB, a very fast key-value embedded Database Storage Engine (using log-structured-merge (LSM) trees), uses skip lists for its memtable.

• skipdb is an open-source database format using ordered key/value pairs.

• ConcurrentSkipListSet and ConcurrentSkipListMap in the Java 1.6 API.

• Speed Tables are a fast key-value datastore for Tcl that use skiplists for indexes and lockless shared memory.

• leveldb, a fast key-value storage library written at Google that provides an ordered mapping from string keys to string values

Skip lists are used for efficient statistical computations of running medians (also known as moving medians). Skip lists are also used in distributed applications (where the nodes represent physical computers, and pointers represent network connections) and for implementing highly scalable concurrent priority queues with less lock contention,[8] or even without locking,[9][10][11] as well as lockless concurrent dictionaries.[12] There are also several US patents for using skip lists to implement (lockless) priority queues and concurrent dictionaries.[13]

6.13.4 See also

• Bloom filter
• Skip graph

6.13.5 References

[1] http://www.cs.uwaterloo.ca/research/tr/1993/28/root2side.pdf

[2] Pugh, W. (1990). "Skip lists: A probabilistic alternative to balanced trees" (PDF). Communications of the ACM. 33 (6): 668. doi:10.1145/78973.78977.

[3] Munro, J. Ian; Papadakis, Thomas; Sedgewick, Robert (1992). "Deterministic skip lists". Proceedings of the third annual ACM-SIAM symposium on Discrete algorithms (SODA '92). Orlando, Florida, USA: Society for Industrial and Applied Mathematics, Philadelphia, PA, USA. pp. 367–375. alternative link

[4] Darrell Bethea and Michael K. Reiter, Data Structures with Unpredictable Timing, https://www.cs.unc.edu/~{}djb/papers/2009-ESORICS.pdf, section 4 "Skip Lists"

[5] Shah, Gauri (2003). Distributed Data Structures for Peer-to-Peer Systems (PDF) (Ph.D. thesis). Yale University.

[6] William Pugh (April 1989). "Concurrent Maintenance of Skip Lists", Tech. Report CS-TR-2222, Dept. of Computer Science, U. Maryland.

[7] "Redis ordered set implementation".

[8] Shavit, N.; Lotan, I. (2000). "Skiplist-based concurrent priority queues". Proceedings 14th International Parallel and Distributed Processing Symposium. IPDPS 2000 (PDF). p. 263. doi:10.1109/IPDPS.2000.845994. ISBN 0-7695-0574-0.

[9] Sundell, H.; Tsigas, P. (2003). "Fast and lock-free concurrent priority queues for multi-thread systems". Proceedings International Parallel and Distributed Processing Symposium. p. 11. doi:10.1109/IPDPS.2003.1213189. ISBN 0-7695-1926-1.

[10] Fomitchev, Mikhail; Ruppert, Eric (2004). Lock-free linked lists and skip lists (PDF). Proc. Annual ACM Symp. on Principles of Distributed Computing (PODC). pp. 50–59. doi:10.1145/1011767.1011776. ISBN 1581138024.

[11] Bajpai, R.; Dhara, K. K.; Krishnaswamy, V. (2008). "QPID: A Distributed Priority Queue with Item Locality". 2008 IEEE International Symposium on Parallel and Distributed Processing with Applications. p. 215. doi:10.1109/ISPA.2008.90. ISBN 978-0-7695-3471-8.

[12] Sundell, H. K.; Tsigas, P. (2004). "Scalable and lock-free concurrent dictionaries". Proceedings of the 2004 ACM symposium on Applied computing - SAC '04 (PDF). p. 1438. doi:10.1145/967900.968188. ISBN 1581138121.

[13] US patent 7937378

6.13.6 External links

• "Skip list" entry in the Dictionary of Algorithms and Data Structures
• Skip Lists: A Linked List with Self-Balancing BST-Like Properties on MSDN in C# 2.0
• Skip Lists lecture (MIT OpenCourseWare: Introduction to Algorithms)
• Open Data Structures - Chapter 4 - Skiplists
• Skip trees, an alternative data structure to skip lists in a concurrent approach
• Skip tree graphs, a distributed version of skip trees
• More on skip tree graphs, a distributed version of skip trees

Demo applets

• Skip List Applet by Kubo Kovac
• Thomas Wenger's demo applet on skiplists

Implementations

• Algorithm::SkipList, implementation in Perl on CPAN
• Raymond Hettinger's implementation in Python
• ConcurrentSkipListSet documentation for Java 6 (and sourcecode)

6.14 B-tree

Not to be confused with Binary tree.

In computer science, a B-tree is a self-balancing tree data structure that keeps data sorted and allows searches, sequential access, insertions, and deletions in logarithmic time. The B-tree is a generalization of a binary search tree in that a node can have more than two children (Comer 1979, p. 123). Unlike self-balancing binary search trees, the B-tree is optimized for systems that read and write large blocks of data. B-trees are a good example of a data structure for external memory. It is commonly used in databases and filesystems.

6.14.1 Overview

                [7 | 16]
               /    |    \
    [1 2 5 6]    [9 12]    [18 21]

A B-tree (Bayer & McCreight 1972) of order 5 (Knuth 1998).

In B-trees, internal (non-leaf) nodes can have a variable number of child nodes within some pre-defined range. When data is inserted or removed from a node, its number of child nodes changes. In order to maintain the pre-defined range, internal nodes may be joined or split. Because a range of child nodes is permitted, B-trees do not need re-balancing as frequently as other self-balancing search trees, but may waste some space, since nodes are not entirely full. The lower and upper bounds on the number of child nodes are typically fixed for a particular implementation. For example, in a 2-3 B-tree (often simply referred to as a 2-3 tree), each internal node may have only 2 or 3 child nodes.

Each internal node of a B-tree will contain a number of keys. The keys act as separation values which divide its subtrees. For example, if an internal node has 3 child nodes (or subtrees) then it must have 2 keys: a1 and a2. All values in the leftmost subtree will be less than a1, all values in the middle subtree will be between a1 and a2, and all values in the rightmost subtree will be greater than a2.

Usually, the number of keys is chosen to vary between d and 2d, where d is the minimum number of keys, and d + 1 is the minimum degree or branching factor of the tree. In practice, the keys take up the most space in a node. The factor of 2 will guarantee that nodes can be split or combined. If an internal node has 2d keys, then adding a key to that node can be accomplished by splitting the hypothetical 2d+1 key node into two d key nodes and moving the key that would have been in the middle to the parent node. Each split node has the required minimum number of keys. Similarly, if an internal node and its neighbor each have d keys, then a key may be deleted from the internal node by combining it with its neighbor. Deleting the key would make the internal node have d − 1 keys; joining the neighbor would add d keys plus one more key brought down from the neighbor's parent. The result is an entirely full node of 2d keys.
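The split arithmetic in the previous paragraph can be checked with a few lines of code. This is only an illustration of the counting argument; the function name and the sample keys are ours, with d = 2:

    def split_full_node(keys):
        # keys: a sorted list of 2d+1 keys -- a full 2d-key node plus the new key.
        d = (len(keys) - 1) // 2
        left, median, right = keys[:d], keys[d], keys[d + 1:]
        return left, median, right   # two d-key nodes; the median moves to the parent

    left, median, right = split_full_node(sorted([3, 7, 9, 12, 10]))
    # left == [3, 7], median == 9, right == [10, 12]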
The number of branches (or child nodes) from a node will be one more than the number of keys stored in the node. In a 2-3 B-tree, the internal nodes will store either one key (with two child nodes) or two keys (with three child nodes). A B-tree is sometimes described with the parameters (d+1) — (2d+1) or simply with the highest branching order, (2d+1).

A B-tree is kept balanced by requiring that all leaf nodes be at the same depth. This depth will increase slowly as elements are added to the tree, but an increase in the overall depth is infrequent, and results in all leaf nodes being one more node farther away from the root.

B-trees have substantial advantages over alternative implementations when the time to access the data of a node greatly exceeds the time spent processing that data, because then the cost of accessing the node may be amortized over multiple operations within the node. This usually occurs when the node data are in secondary storage such as disk drives. By maximizing the number of keys within each internal node, the height of the tree decreases and the number of expensive node accesses is reduced. In addition, rebalancing of the tree occurs less often. The maximum number of child nodes depends on the information that must be stored for each child node and the size of a full disk block or an analogous size in secondary storage. While 2-3 B-trees are easier to explain, practical B-trees using secondary storage need a large number of child nodes to improve performance.

Variants

The term B-tree may refer to a specific design or it may refer to a general class of designs. In the narrow sense, a B-tree stores keys in its internal nodes but need not store those keys in the records at the leaves. The general class includes variations such as the B+ tree and the B* tree.

• In the B+ tree, copies of the keys are stored in the internal nodes; the keys and records are stored in leaves; in addition, a leaf node may include a pointer to the next leaf node to speed sequential access (Comer 1979, p. 129).

• The B* tree balances more neighboring internal nodes to keep the internal nodes more densely packed (Comer 1979, p. 129). This variant requires non-root nodes to be at least 2/3 full instead of 1/2 (Knuth 1998, p. 488). To maintain this, instead of immediately splitting up a node when it gets full, its keys are shared with a node next to it. When both

nodes are full, then the two nodes are split into three. Deleting nodes is somewhat more complex than inserting, however.

• B-trees can be turned into order statistic trees to allow rapid searches for the Nth record in key order, or counting the number of records between any two records, and various other related operations.[1]

Etymology

Rudolf Bayer and Ed McCreight invented the B-tree while working at Boeing Research Labs in 1971 (Bayer & McCreight 1972), but they did not explain what, if anything, the B stands for. Douglas Comer explains:

    The origin of "B-tree" has never been explained by the authors. As we shall see, "balanced," "broad," or "bushy" might apply. Others suggest that the "B" stands for Boeing. Because of his contributions, however, it seems appropriate to think of B-trees as "Bayer"-trees. (Comer 1979, p. 123 footnote 1)

Donald Knuth speculates on the etymology of B-trees in his May, 1980 lecture on the topic "CS144C classroom lecture about disk storage and B-trees", suggesting the "B" may have originated from Boeing or from Bayer's name.[2]

Ed McCreight answered a question on the B-tree's name in 2013:

    Bayer and I were in a lunch time where we get to think a name. And we were, so, B, we were thinking… B is, you know… We were working for Boeing at the time, we couldn't use the name without talking to lawyers. So, there is a B. It has to do with balance, another B. Bayer was the senior author, who did have several years older than I am and had many more publications than I did. So there is another B. And so, at the lunch table we never did resolve whether there was one of those that made more sense than the rest. What really lives to say is: the more you think about what the B in B-trees means, the better you understand B-trees.[3]

6.14.2 B-tree usage in databases

Time to search a sorted file

Usually, sorting and searching algorithms have been characterized by the number of comparison operations that must be performed using order notation. A binary search of a sorted table with N records, for example, can be done in roughly ⌈log2 N⌉ comparisons. If the table had 1,000,000 records, then a specific record could be located with at most 20 comparisons: ⌈log2(1,000,000)⌉ = 20.

Large databases have historically been kept on disk drives. The time to read a record on a disk drive far exceeds the time needed to compare keys once the record is available. The time to read a record from a disk drive involves a seek time and a rotational delay. The seek time may be 0 to 20 or more milliseconds, and the rotational delay averages about half the rotation period. For a 7200 RPM drive, the rotation period is 8.33 milliseconds. For a drive such as the Seagate ST3500320NS, the track-to-track seek time is 0.8 milliseconds and the average reading seek time is 8.5 milliseconds.[4] For simplicity, assume reading from disk takes about 10 milliseconds.

Naively, then, the time to locate one record out of a million would take 20 disk reads times 10 milliseconds per disk read, which is 0.2 seconds.

The time won't be that bad because individual records are grouped together in a disk block. A disk block might be 16 kilobytes. If each record is 160 bytes, then 100 records could be stored in each block. The disk read time above was actually for an entire block. Once the disk head is in position, one or more disk blocks can be read with little delay. With 100 records per block, the last 6 or so comparisons don't need to do any disk reads—the comparisons are all within the last disk block read.

To speed the search further, the first 13 to 14 comparisons (which each required a disk access) must be sped up.

An index speeds the search

A significant improvement can be made with an index. In the example above, initial disk reads narrowed the search range by a factor of two. That can be improved substantially by creating an auxiliary index that contains the first record in each disk block (sometimes called a sparse index). This auxiliary index would be 1% of the size of the original database, but it can be searched more quickly. Finding an entry in the auxiliary index would tell us which block to search in the main database; after searching the auxiliary index, we would have to search only that one block of the main database—at a cost of one more disk read. The index would hold 10,000 entries, so it would take at most 14 comparisons. Like the main database, the last 6 or so comparisons in the aux index would be on the same disk block. The index could be searched in about 8 disk reads, and the desired record could be accessed in 9 disk reads.

The trick of creating an auxiliary index can be repeated to make an auxiliary index to the auxiliary index. That would make an aux-aux index that would need only 100 entries and would fit in one disk block.

Instead of reading 14 disk blocks to find the desired record, we only need to read 3 blocks. Reading and

searching the first (and only) block of the aux-aux index identifies the relevant block in the aux-index. Reading and searching that aux-index block identifies the relevant block in the main database. Instead of 150 milliseconds, we need only 30 milliseconds to get the record.

The auxiliary indices have turned the search problem from a binary search requiring roughly log2 N disk reads to one requiring only logb N disk reads, where b is the blocking factor (the number of entries per block: b = 100 entries per block; logb 1,000,000 = 3 reads).
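The arithmetic of this example is small enough to verify directly. A quick Python check, where the constants simply restate the example's assumptions (one million 160-byte records, 16 kB blocks, 10 ms per disk read):

    import math

    RECORDS = 1_000_000
    PER_BLOCK = 100          # 16 kB block / 160-byte record
    DISK_READ_MS = 10        # assumed cost of one disk read

    naive_reads = math.ceil(math.log2(RECORDS))          # 20 reads
    indexed_reads = round(math.log(RECORDS, PER_BLOCK))  # 3 reads: aux-aux, aux, data

    print(naive_reads * DISK_READ_MS)    # 200 ms without an index
    print(indexed_reads * DISK_READ_MS)  # 30 ms with the two-level sparse index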

In practice, if the main database is being frequently searched, the aux-aux index and much of the aux index may reside in a disk cache, so they would not incur a disk read.

Insertions and deletions

If the database does not change, then compiling the index is simple to do, and the index need never be changed. If there are changes, then managing the database and its index becomes more complicated.

Deleting records from a database is relatively easy. The index can stay the same, and the record can just be marked as deleted. The database remains in sorted order. If there are a large number of deletions, then searching and storage become less efficient.

Insertions can be very slow in a sorted sequential file because room for the inserted record must be made. Inserting a record before the first record requires shifting all of the records down one. Such an operation is just too expensive to be practical. One solution is to leave some spaces. Instead of densely packing all the records in a block, the block can have some free space to allow for subsequent insertions. Those spaces would be marked as if they were "deleted" records.

Both insertions and deletions are fast as long as space is available on a block. If an insertion won't fit on the block, then some free space on some nearby block must be found and the auxiliary indices adjusted. The hope is that enough space is available nearby, such that a lot of blocks do not need to be reorganized. Alternatively, some out-of-sequence disk blocks may be used.

Advantages of B-tree usage for databases

The B-tree uses all of the ideas described above. In particular, a B-tree:

• keeps keys in sorted order for sequential traversing
• uses a hierarchical index to minimize the number of disk reads
• uses partially full blocks to speed insertions and deletions
• keeps the index balanced with a recursive algorithm

In addition, a B-tree minimizes waste by making sure the interior nodes are at least half full. A B-tree can handle an arbitrary number of insertions and deletions.

Disadvantages of B-trees

• maximum key length cannot be changed without completely rebuilding the database. This led to many database systems truncating full human names to 70 characters.

(Other implementations of associative array, such as a ternary search tree or a separate-chaining hash table, dynamically adapt to arbitrarily long key lengths.)

6.14.3 Technical description

Terminology

The literature on B-trees is not uniform in its terminology (Folk & Zoellick 1992, p. 362).

Bayer & McCreight (1972), Comer (1979), and others define the order of a B-tree as the minimum number of keys in a non-root node. Folk & Zoellick (1992) points out that this terminology is ambiguous because the maximum number of keys is not clear. An order 3 B-tree might hold a maximum of 6 keys or a maximum of 7 keys. Knuth (1998, p. 483) avoids the problem by defining the order to be the maximum number of children (which is one more than the maximum number of keys).

The term leaf is also inconsistent. Bayer & McCreight (1972) considered the leaf level to be the lowest level of keys, but Knuth considered the leaf level to be one level below the lowest keys (Folk & Zoellick 1992, p. 363). There are many possible implementation choices. In some designs, the leaves may hold the entire data record; in other designs, the leaves may only hold pointers to the data record. Those choices are not fundamental to the idea of a B-tree.[5]

There are also unfortunate choices like using the variable k to represent the number of children when k could be confused with the number of keys.

For simplicity, most authors assume there are a fixed number of keys that fit in a node. The basic assumption is that the key size is fixed and the node size is fixed. In practice, variable length keys may be employed (Folk & Zoellick 1992, p. 379).

Definition

According to Knuth's definition, a B-tree of order m is a tree which satisfies the following properties:

1. Every node has at most m children.

2. Every non-leaf node (except root) has at least ⌈m/2⌉ children.

3. The root has at least two children if it is not a leaf node.

4. A non-leaf node with k children contains k−1 keys.

5. All leaves appear in the same level.

Each internal node's keys act as separation values which divide its subtrees. For example, if an internal node has 3 child nodes (or subtrees) then it must have 2 keys: a1 and a2. All values in the leftmost subtree will be less than a1, all values in the middle subtree will be between a1 and a2, and all values in the rightmost subtree will be greater than a2.

Internal nodes Internal nodes are all nodes except for leaf nodes and the root node. They are usually represented as an ordered set of elements and child pointers. Every internal node contains a maximum of U children and a minimum of L children. Thus, the number of elements is always 1 less than the number of child pointers (the number of elements is between L−1 and U−1). U must be either 2L or 2L−1; therefore each internal node is at least half full. The relationship between U and L implies that two half-full nodes can be joined to make a legal node, and one full node can be split into two legal nodes (if there's room to push one element up into the parent). These properties make it possible to delete and insert new values into a B-tree and adjust the tree to preserve the B-tree properties.

The root node The root node's number of children has the same upper limit as internal nodes, but has no lower limit. For example, when there are fewer than L−1 elements in the entire tree, the root will be the only node in the tree with no children at all.

Leaf nodes Leaf nodes have the same restriction on the number of elements, but have no children, and no child pointers.

A B-tree of depth n+1 can hold about U times as many items as a B-tree of depth n, but the cost of search, insert, and delete operations grows with the depth of the tree. As with any balanced tree, the cost grows much more slowly than the number of elements.

Some balanced trees store values only at leaf nodes, and use different kinds of nodes for leaf nodes and internal nodes. B-trees keep values in every node in the tree, and may use the same structure for all nodes. However, since leaf nodes never have children, B-trees benefit from improved performance if they use a specialized structure.

6.14.4 Best case and worst case heights

Let h be the height of the classic B-tree. Let n > 0 be the number of entries in the tree.[6] Let m be the maximum number of children a node can have. Each node can have at most m−1 keys.

It can be shown (by induction for example) that a B-tree of height h with all its nodes completely filled has n = m^(h+1) − 1 entries. Hence, the best case height of a B-tree is:

    ⌈logm(n + 1)⌉ − 1

Let d be the minimum number of children an internal (non-root) node can have. For an ordinary B-tree, d = ⌈m/2⌉.

Comer (1979, p. 127) and Cormen et al. (2001, pp. 383–384) give the worst case height of a B-tree (where the root node is considered to have height 0) as

    h ≤ ⌊logd((n + 1)/2)⌋ .
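A worked example may make these bounds concrete. The parameters below (m = 1024, hence d = 512, and n = 10^9 entries) are illustrative values chosen here, not taken from the sources above:

    h_{\min} \;=\; \lceil \log_{1024}(10^9 + 1) \rceil - 1 \;=\; 3 - 1 \;=\; 2,
    \qquad
    h \;\le\; \left\lfloor \log_{512}\!\left(\tfrac{10^9 + 1}{2}\right) \right\rfloor \;=\; 3.

So a billion entries fit in a tree of height 2 in the best case, and even in the worst case the height cannot exceed 3.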
6.14.5 Algorithms

Search

Searching is similar to searching a binary search tree. Starting at the root, the tree is recursively traversed from top to bottom. At each level, the search reduces its field of view to the child pointer (subtree) whose range includes the search value. A subtree's range is defined by the values, or keys, contained in its parent node. These limiting values are also known as separation values.

Binary search is typically (but not necessarily) used within nodes to find the separation values and child tree of interest.
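A minimal sketch of this search in Python (the BNode class and function names are ours; a real implementation would read nodes from disk blocks):

    from bisect import bisect_left

    class BNode:
        def __init__(self, keys=None, children=None):
            self.keys = keys or []          # sorted separation values
            self.children = children or []  # len(children) == len(keys) + 1, or [] at a leaf

        def is_leaf(self):
            return not self.children

    def btree_search(node, key):
        # Binary search inside the node for the child whose range covers `key`.
        i = bisect_left(node.keys, key)
        if i < len(node.keys) and node.keys[i] == key:
            return node, i              # the key is one of this node's separation values
        if node.is_leaf():
            return None                 # ran off the bottom of the tree: not present
        return btree_search(node.children[i], key)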
Insertion

All insertions start at a leaf node. To insert a new element, search the tree to find the leaf node where the new element should be added. Insert the new element into that node with the following steps:

1. If the node contains fewer than the maximum legal number of elements, then there is room for the new element. Insert the new element in the node, keeping the node's elements ordered.

2. Otherwise the node is full; evenly split it into two nodes so:

    (a) A single median is chosen from among the leaf's elements and the new element.

    (b) Values less than the median are put in the new left node and values greater than the median are put in the new right node, with the median acting as a separation value.

    (c) The separation value is inserted in the node's parent, which may cause it to be split, and so on. If the node has no parent (i.e., the node was the root), create a new root above this node (increasing the height of the tree).

A B-tree insertion example with each iteration. The nodes of this B-tree have at most 3 children (Knuth order 3).

If the splitting goes all the way up to the root, it creates a new root with a single separator value and two children, which is why the lower bound on the size of internal nodes does not apply to the root. The maximum number of elements per node is U−1. When a node is split, one element moves to the parent, but one element is added. So, it must be possible to divide the maximum number U−1 of elements into two legal nodes. If this number is odd, then U=2L and one of the new nodes contains (U−2)/2 = L−1 elements, and hence is a legal node, and the other contains one more element, and hence it is legal too. If U−1 is even, then U=2L−1, so there are 2L−2 elements in the node. Half of this number is L−1, which is the minimum number of elements allowed per node.

An improved algorithm supports a single pass down the tree from the root to the node where the insertion will take place, splitting any full nodes encountered on the way. This prevents the need to recall the parent nodes into memory, which may be expensive if the nodes are on secondary storage. However, to use this improved algorithm, we must be able to send one element to the parent and split the remaining U−2 elements into two legal nodes, without adding a new element. This requires U = 2L rather than U = 2L−1, which accounts for why some textbooks impose this requirement in defining B-trees.
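The steps above, together with the split rule, fit in a short sketch. This continues the illustrative BNode class from the search sketch; MAX_KEYS plays the role of U−1, and the code takes the simple "split after overflow" route rather than the single-pass variant:

    from bisect import bisect_left   # BNode as defined in the search sketch

    MAX_KEYS = 4    # U - 1; deliberately tiny so splits are easy to observe

    def insert(root, key):
        split = _insert(root, key)
        if split is None:
            return root
        median, right = split
        return BNode([median], [root, right])   # the tree grows at the root

    def _insert(node, key):
        i = bisect_left(node.keys, key)
        if node.is_leaf():
            node.keys.insert(i, key)
        else:
            split = _insert(node.children[i], key)
            if split is not None:
                median, right = split
                node.keys.insert(i, median)
                node.children.insert(i + 1, right)
        if len(node.keys) <= MAX_KEYS:
            return None
        # Overflow: push the median up, leaving two legal nodes behind.
        mid = len(node.keys) // 2
        median = node.keys[mid]
        right = BNode(node.keys[mid + 1:],
                      node.children[mid + 1:] if node.children else [])
        node.keys = node.keys[:mid]
        if node.children:
            node.children = node.children[:mid + 1]
        return median, right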

Deletion

There are two popular strategies for deletion from a B-tree.

1. Locate and delete the item, then restructure the tree to retain its invariants, OR

2. Do a single pass down the tree, but before entering (visiting) a node, restructure the tree so that once the key to be deleted is encountered, it can be deleted without triggering the need for any further restructuring.

The algorithm below uses the former strategy.

There are two special cases to consider when deleting an element:

1. The element in an internal node is a separator for its child nodes

2. Deleting an element may put its node under the minimum number of elements and children

The procedures for these cases are in order below.

Deletion from a leaf node

1. Search for the value to delete.

2. If the value is in a leaf node, simply delete it from the node.

3. If underflow happens, rebalance the tree as described in section "Rebalancing after deletion" below.

Deletion from an internal node Each element in an internal node acts as a separation value for two subtrees, therefore we need to find a replacement for separation. Note that the largest element in the left subtree is still less than the separator. Likewise, the smallest element in the right subtree is still greater than the separator. Both of those elements are in leaf nodes, and either one can be the new separator for the two subtrees. Algorithmically described below:

1. Choose a new separator (either the largest element in the left subtree or the smallest element in the right subtree), remove it from the leaf node it is in, and replace the element to be deleted with the new separator.

2. The previous step deleted an element (the new separator) from a leaf node. If that leaf node is now deficient (has fewer than the required number of nodes), then rebalance the tree starting from the leaf node.

Rebalancing after deletion Rebalancing starts from a leaf and proceeds toward the root until the tree is balanced. If deleting an element from a node has brought it under the minimum size, then some elements must be redistributed to bring all nodes up to the minimum. Usually, the redistribution involves moving an element from a sibling node that has more than the minimum number of nodes. That redistribution operation is called a rotation. If no sibling can spare an element, then the deficient node must be merged with a sibling. The merge causes the parent to lose a separator element, so the parent may become deficient and need rebalancing. The merging and rebalancing may continue all the way to the root. Since the minimum element count doesn't apply to the root, making the root be the only deficient node is not a problem. The algorithm to rebalance the tree is as follows:

• If the deficient node's right sibling exists and has more than the minimum number of elements, then rotate left

    1. Copy the separator from the parent to the end of the deficient node (the separator moves down; the deficient node now has the minimum number of elements)
    2. Replace the separator in the parent with the first element of the right sibling (right sibling loses one node but still has at least the minimum number of elements)
    3. The tree is now balanced

• Otherwise, if the deficient node's left sibling exists and has more than the minimum number of elements, then rotate right

    1. Copy the separator from the parent to the start of the deficient node (the separator moves down; the deficient node now has the minimum number of elements)
    2. Replace the separator in the parent with the last element of the left sibling (left sibling loses one node but still has at least the minimum number of elements)
    3. The tree is now balanced

• Otherwise, if both immediate siblings have only the minimum number of elements, then merge with a sibling, sandwiching their separator taken off from their parent

    1. Copy the separator to the end of the left node (the left node may be the deficient node or it may be the sibling with the minimum number of elements)
    2. Move all elements from the right node to the left node (the left node now has the maximum number of elements, and the right node is empty)
    3. Remove the separator from the parent along with its empty right child (the parent loses an element)
        • If the parent is the root and now has no elements, then free it and make the merged node the new root (tree becomes shallower)
        • Otherwise, if the parent has fewer than the required number of elements, then rebalance the parent

Note: The rebalancing operations are different for B+ trees (e.g., rotation is different because the parent has a copy of the key) and B*-trees (e.g., three siblings are merged into two siblings).
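As an illustration, the left-rotation step can be written out against the BNode sketch used earlier (the function name is ours; parent.children[i] is the deficient node):

    def rotate_left(parent, i):
        # Repair a deficient child by borrowing through the parent from
        # its right sibling, exactly as in steps 1-3 above.
        node, sibling = parent.children[i], parent.children[i + 1]
        node.keys.append(parent.keys[i])        # separator moves down
        parent.keys[i] = sibling.keys.pop(0)    # sibling's first key moves up
        if sibling.children:                    # transfer the orphaned subtree
            node.children.append(sibling.children.pop(0))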

Sequential access

While freshly loaded databases tend to have good sequential behavior, this behavior becomes increasingly difficult to maintain as a database grows, resulting in more random I/O and performance challenges.[7]

Initial construction

In applications, it is frequently useful to build a B-tree to represent a large existing collection of data and then update it incrementally using standard B-tree operations. In this case, the most efficient way to construct the initial B-tree is not to insert every element in the initial collection successively, but instead to construct the initial set of leaf nodes directly from the input, then build the internal nodes from these. This approach to B-tree construction is called bulkloading. Initially, every leaf but the last one has one extra element, which will be used to build the internal nodes.

For example, if the leaf nodes have maximum size 4 and the initial collection is the integers 1 through 24, we would initially construct 4 leaf nodes containing 5 values each and 1 which contains 4 values.

We build the next level up from the leaves by taking the last element from each leaf node except the last one. Again, each node except the last will contain one extra value. In the example, suppose the internal nodes contain at most 2 values (3 child pointers); the sketch below reconstructs the resulting levels. This process is continued until we reach a level with only one node and it is not overfilled. In the example, only the root level remains.
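The original figures for this example are images; the following Python sketch (names ours, reusing the illustrative BNode class from earlier) reproduces the construction and can be used to recover the intermediate levels. It assumes, as in the example, that the final top node ends up legal; a full implementation would also split an overfull root:

    def chunks(seq, size):
        for i in range(0, len(seq), size):
            yield seq[i:i + size]

    def bulkload(values, max_leaf=4, max_internal=2):
        # Leaves: every leaf but the last holds one extra key, pending promotion.
        nodes = [BNode(list(c)) for c in chunks(list(values), max_leaf + 1)]
        while len(nodes) > 1:
            seps = [n.keys.pop() for n in nodes[:-1]]  # promote each pending key
            children = nodes
            nodes = [BNode(list(c)) for c in chunks(seps, max_internal + 1)]
            it = iter(children)
            for j, parent in enumerate(nodes):
                # A non-last parent still carries one pending key, so it takes
                # len(keys) children for now; the last parent takes len(keys) + 1.
                take = len(parent.keys) + (1 if j == len(nodes) - 1 else 0)
                parent.children = [next(it) for _ in range(take)]
        return nodes[0]

    root = bulkload(range(1, 25))
    # Leaves: [1-4], [6-9], [11-14], [16-19], [21-24]
    # Internal level: [5, 10] and [20]; root: [15]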
6.14.6 In filesystems

Most modern filesystems use B-trees (or § Variants); alternatives such as extendible hashing are less common.[8] In addition to its use in databases, the B-tree is also used in filesystems to allow quick random access to an arbitrary block in a particular file. The basic problem is turning the file block i address into a disk block (or perhaps a cylinder-head-sector) address.

Some operating systems require the user to allocate the maximum size of the file when the file is created. The file can then be allocated as contiguous disk blocks. When converting to a disk block the operating system just adds the file block address to the starting disk block of the file. The scheme is simple, but the file cannot exceed its created size.

Other operating systems allow a file to grow. The resulting disk blocks may not be contiguous, so mapping logical blocks to physical blocks is more involved.

MS-DOS, for example, used a simple File Allocation Table (FAT). The FAT has an entry for each disk block,[note 1] and that entry identifies whether its block is used by a file and if so, which block (if any) is the next disk block of the same file. So, the allocation of each file is represented as a linked list in the table. In order to find the disk address of file block i, the operating system (or disk utility) must sequentially follow the file's linked list in the FAT. Worse, to find a free disk block, it must sequentially scan the FAT. For MS-DOS, that was not a huge penalty because the disks and files were small and the FAT had few entries and relatively short file chains. In the FAT12 filesystem (used on floppy disks and early hard disks), there were no more than 4,080[note 2] entries, and the FAT would usually be resident in memory. As disks got bigger, the FAT architecture began to confront penalties. On a large disk using FAT, it may be necessary to perform disk reads to learn the disk location of a file block to be read or written.

TOPS-20 (and possibly TENEX) used a 0 to 2 level tree that has similarities to a B-tree. A disk block was 512 36-bit words. If the file fit in a 512 (2^9) word block, then the file directory would point to that physical disk block. If the file fit in 2^18 words, then the directory would point to an aux index; the 512 words of that index would either be NULL (the block isn't allocated) or point to the physical address of the block. If the file fit in 2^27 words, then the directory would point to a block holding an aux-aux index; each entry would either be NULL or point to an aux index. Consequently, the physical disk block for a 2^27 word file could be located in two disk reads and read on the third.

Apple's filesystem HFS+, Microsoft's NTFS,[9] AIX (jfs2) and some Linux filesystems, such as btrfs and Ext4, use B-trees.

B*-trees are used in the HFS and Reiser4 file systems.

6.14.7 Variations

Access concurrency

Lehman and Yao[10] showed that all the read locks could be avoided (and thus concurrent access greatly improved) by linking the tree blocks at each level together with a "next" pointer. This results in a tree structure where both insertion and search operations descend from the root to the leaf. Write locks are only required as a tree block is modified. This maximizes access concurrency by multiple users, an important consideration for databases and/or other B-tree based ISAM storage methods. The cost associated with this improvement is that empty pages cannot be removed from the btree during normal operations. (However, see [11] for various strategies to implement node merging, and source code at.[12])

United States Patent 5283894, granted in 1994, appears to show a way to use a 'Meta Access Method'[13] to allow concurrent B+ tree access and modification without locks. The technique accesses the tree 'upwards' for both searches and updates by means of additional in-memory indexes that point at the blocks in each level in the block cache. No reorganization for deletes is needed and there are no 'next' pointers in each block as in Lehman and Yao.

6.14.8 See also

• B+ tree
• R-tree
• Red–black tree
• 2–3 tree
• 2–3–4 tree

6.14.9 Notes

[1] For FAT, what is called a "disk block" here is what the FAT documentation calls a "cluster", which is a fixed-size group of one or more contiguous whole physical disk sectors. For the purposes of this discussion, a cluster has no significant difference from a physical sector.

[2] Two of these were reserved for special purposes, so only 4078 could actually represent disk blocks (clusters).

6.14.10 References

[1] Counted B-Trees, retrieved 2010-01-25

[2] Knuth's video lectures from Stanford

[3] Video of the talk at CPM 2013 (24th Annual Symposium on Combinatorial Pattern Matching, Bad Herrenalb, Germany, June 17–19, 2013), retrieved 2014-01-17; see question asked by Martin Farach-Colton

[4] Seagate Technology LLC, Product Manual: Barracuda ES.2 Serial ATA, Rev. F., publication 100468393, 2008, page 6

[5] Bayer & McCreight (1972) avoided the issue by saying an index element is a (physically adjacent) pair of (x, a) where x is the key, and a is some associated information. The associated information might be a pointer to a record or records in a random access, but what it was didn't really matter. Bayer & McCreight (1972) states, "For this paper the associated information is of no further interest."

[6] If n is zero, then no root node is needed, so the height of an empty tree is not well defined.

[7] "Cache Oblivious B-trees". State University of New York (SUNY) at Stony Brook. Retrieved 2011-01-17.

[8] Mikuláš Patocka. "Design and Implementation of the Spad Filesystem". "Table 4.1: Directory organization in filesystems". 2006.

[9] Mark Russinovich. "Inside Win2K NTFS, Part 1". Microsoft Developer Network. Archived from the original on 13 April 2008. Retrieved 2008-04-18.

[10] "Efficient locking for concurrent operations on B-trees". Portal.acm.org. doi:10.1145/319628.319663. Retrieved 2012-06-28.

[11] http://www.dtic.mil/cgi-bin/GetTRDoc?AD=ADA232287&Location=U2&doc=GetTRDoc.pdf

[12] "Downloads - high-concurrency-btree - High Concurrency B-Tree code in C - GitHub Project Hosting". Retrieved 2014-01-27.

[13] Lockless Concurrent B+Tree

General

• Bayer, R.; McCreight, E. (1972), "Organization and Maintenance of Large Ordered Indexes" (PDF), Acta Informatica, 1 (3): 173–189, doi:10.1007/bf00288683

• Comer, Douglas (June 1979), "The Ubiquitous B-Tree", Computing Surveys, 11 (2): 123–137, doi:10.1145/356770.356776, ISSN 0360-0300.

• Cormen, Thomas; Leiserson, Charles; Rivest, Ronald; Stein, Clifford (2001), Introduction to Algorithms (Second ed.), MIT Press and McGraw-Hill, pp. 434–454, ISBN 0-262-03293-7. Chapter 18: B-Trees.

• Folk, Michael J.; Zoellick, Bill (1992), File Structures (2nd ed.), Addison-Wesley, ISBN 0-201-55713-4

• Knuth, Donald (1998), Sorting and Searching, The Art of Computer Programming, Volume 3 (Second ed.), Addison-Wesley, ISBN 0-201-89685-0. Section 6.2.4: Multiway Trees, pp. 481–491. Also, pp. 476–477 of section 6.2.3 (Balanced Trees) discusses 2-3 trees.

Original papers

• Bayer, Rudolf; McCreight, E. (July 1970), Organization and Maintenance of Large Ordered Indices, Mathematical and Information Sciences Report No. 20, Boeing Scientific Research Laboratories.

• Bayer, Rudolf (1971), Binary B-Trees for Virtual Memory, Proceedings of 1971 ACM-SIGFIDET Workshop on Data Description, Access and Control, San Diego, California.

6.14.11 External links

• B-tree lecture by David Scot Taylor, SJSU
• B-Tree animation applet by slady
• B-tree and UB-tree on Scholarpedia, Curator: Dr Rudolf Bayer
• B-Trees: Balanced Tree Data Structures
• NIST's Dictionary of Algorithms and Data Structures: B-tree
• B-Tree Tutorial
• The InfinityDB BTree implementation
• Cache Oblivious B(+)-trees
• Dictionary of Algorithms and Data Structures entry for B*-tree
• Open Data Structures - Section 14.2 - B-Trees
• Counted B-Trees
• B-Tree .Net, a modern, virtualized RAM & Disk implementation

6.15 B+ tree node, which is a leaf node. (The root is also the single
leaf, in this case.) This node is permitted to have as little
as one key if necessary, and at most b .

6.15.2 Algorithms
Search

The root of a B+ Tree represents the whole range of val-


ues in the tree, where every internal node is a subinterval.
We are looking for a value k in the B+ Tree. Starting from
A simple B+ tree example linking the keys 1–7 to data values
the root, we are looking for the leaf which may contain
d1 -d7 . The linked list (red) allows rapid in-order traversal. This
particular tree’s branching factor is b =4. the value k . At each node, we figure out which internal
pointer we should follow. An internal B+ Tree node has at
A B+ tree is an n-ary tree with a variable but often large most d ≤ b children, where every one of them represents a
number of children per node. A B+ tree consists of a root, different sub-interval. We select the corresponding node
internal nodes and leaves.[1] The root may be either a leaf by searching on the key values of the node.
or a node with two or more children.[2] Function: search (k) return tree_search (k, root); Func-
A B+ tree can be viewed as a B-tree in which each node tion: tree_search (k, node) if node is a leaf then return
contains only keys (not key–value pairs), and to which an node; switch k do case k < k_0 return tree_search(k,
additional level is added at the bottom with linked leaves. p_0); case k_i ≤ k < k_{i+1} return tree_search(k,
p_{i+1}); case k_d ≤ k return tree_search(k, p_{d+1});
The primary value of a B+ tree is in storing data for ef-
ficient retrieval in a block-oriented storage context — in This pseudocode assumes that no duplicates are allowed.
Prefix key compression

• It is important to increase fan-out, as this allows searches to be directed to the leaf level more efficiently.
• Index entries are only there to 'direct traffic'; thus we can compress them.

Insertion

Perform a search to determine which bucket the new record should go into.

• If the bucket is not full (at most b − 1 entries after the insertion), add the record.
• Otherwise, split the bucket.
  • Allocate a new leaf and move half the bucket's elements to the new bucket.
  • Insert the new leaf's smallest key and address into the parent.
  • If the parent is full, split it too.
    • Add the middle key to the parent node.
  • Repeat until a parent is found that need not split.
• If the root splits, create a new root which has one key and two pointers. (That is, the value that gets pushed to the new root gets removed from the original node.)

B-trees grow at the root and not at the leaves.[1]
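A hedged sketch of the leaf-level step of this procedure, reusing the hypothetical Node class and the bisect_right import from the search sketch above. Splitting internal nodes and maintaining the leaf linked list are left to the caller, so this illustrates the split rule rather than a complete implementation.

def leaf_insert(leaf, key, record, b):
    # Place the record in key order within the bucket.
    i = bisect_right(leaf.keys, key)
    leaf.keys.insert(i, key)
    leaf.children.insert(i, record)
    if len(leaf.keys) <= b - 1:
        return None                       # bucket not full: done
    # Overflow: split the bucket, moving the upper half to a new leaf.
    mid = len(leaf.keys) // 2
    right = Node(leaf.keys[mid:], leaf.children[mid:], is_leaf=True)
    leaf.keys, leaf.children = leaf.keys[:mid], leaf.children[:mid]
    # The new leaf's smallest key and its address go into the parent.
    return right.keys[0], right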
Deletion

• Start at the root and find the leaf L where the entry belongs.
• Remove the entry.
  • If L is at least half-full, done!
  • If L has fewer entries than it should,
    • If a sibling (an adjacent node with the same parent as L) is more than half-full, redistribute, borrowing an entry from it.
    • Otherwise, the sibling is exactly half-full, so we can merge L and the sibling.
• If a merge occurred, we must delete the entry (pointing to L or the sibling) from the parent of L.
• A merge could propagate to the root, decreasing the height.

Bulk-loading

Given a collection of data records, we want to create a B+ tree index on some key field. One approach is to insert each record into an empty tree. However, this is quite expensive, because each entry requires us to start from the root and go down to the appropriate leaf page. An efficient alternative is to use bulk-loading.

• The first step is to sort the data entries according to a search key in ascending order.
• We allocate an empty page to serve as the root, and insert a pointer to the first page of entries into it.
• When the root is full, we split the root, and create a new root page.
• Keep inserting entries into the right-most index page just above the leaf level, until all entries are indexed.

Note:

• when the right-most index page above the leaf level fills up, it is split;
• this action may, in turn, cause a split of the right-most index page one step closer to the root;
• splits only occur on the right-most path from the root to the leaf level.
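The following is a small, hypothetical Python sketch of this bottom-up construction, reusing the Node class from the search sketch above: sort once, pack the sorted records into leaves, then build each index level from the level below. It illustrates the idea rather than the incremental right-most-path algorithm just described, and it does not link the leaves together.

def subtree_min(node):
    # Smallest key stored below 'node' (walk down the left spine).
    while not node.is_leaf:
        node = node.children[0]
    return node.keys[0]

def bulk_load(records, b):
    records = sorted(records)                    # (key, value) pairs
    level = [Node([k for k, _ in grp], [v for _, v in grp], True)
             for grp in (records[i:i + b - 1]    # up to b-1 keys per leaf
                         for i in range(0, len(records), b - 1))]
    while len(level) > 1:                        # build index levels
        parents = []
        for i in range(0, len(level), b):
            grp = level[i:i + b]                 # up to b children each
            seps = [subtree_min(c) for c in grp[1:]]
            parents.append(Node(seps, grp, False))
        level = parents
    return level[0]                              # the root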
6.15.3 Characteristics

For a b-order B+ tree with h levels of index:

• The maximum number of records stored is n_max = b^h − b^(h−1)
• The minimum number of records stored is n_min = 2⌈b/2⌉^(h−1) − 2⌈b/2⌉^(h−2)
• The minimum number of keys is n_kmin = 2⌈b/2⌉^(h−1) − 1
• The maximum number of keys is n_kmax = b^h − 1
• The space required to store the tree is O(n)
• Inserting a record requires O(log_b n) operations
• Finding a record requires O(log_b n) operations
• Removing a (previously located) record requires O(log_b n) operations
• Performing a range query with k elements occurring within the range requires O(log_b n + k) operations

6.15.4 Implementation

The leaves (the bottom-most index blocks) of the B+ tree are often linked to one another in a linked list; this makes range queries or an (ordered) iteration through the blocks simpler and more efficient (though the aforementioned upper bound can be achieved even without this addition). This does not substantially increase space consumption or maintenance on the tree. This illustrates one of the significant advantages of a B+ tree over a B-tree; in a B-tree, since not all keys are present in the leaves, such an ordered linked list cannot be constructed. A B+ tree is thus particularly useful as a database system index, where the data typically resides on disk, as it allows the B+ tree to actually provide an efficient structure for housing the data itself (this is described in [4]:238 as index structure "Alternative 1").

If a storage system has a block size of B bytes, and the keys to be stored have a size of k, arguably the most efficient B+ tree is one where b = (B/k) − 1. Although theoretically the one-off is unnecessary, in practice there is often a little extra space taken up by the index blocks (for example, the linked list references in the leaf blocks). Having an index block which is slightly larger than the storage system's actual block represents a significant performance decrease; therefore erring on the side of caution is preferable.

If nodes of the B+ tree are organized as arrays of elements, then it may take considerable time to insert or delete an element, as half of the array will need to be shifted on average. To overcome this problem, elements inside a node can be organized in a binary tree or a B+ tree instead of an array.

B+ trees can also be used for data stored in RAM. In this case a reasonable choice for block size would be the size of the processor's cache line.
Space efficiency of B+ trees can be improved by using some compression techniques. One possibility is to use delta encoding to compress keys stored in each block. For internal blocks, space saving can be achieved by either compressing keys or pointers. For string keys, space can be saved by using the following technique: normally the i-th entry of an internal block contains the first key of block i+1. Instead of storing the full key, we could store the shortest prefix of the first key of block i+1 that is strictly greater (in lexicographic order) than the last key of block i. There is also a simple way to compress pointers: if we suppose that some consecutive blocks i, i+1, ..., i+k are stored contiguously, then it suffices to store only a pointer to the first block and the count of consecutive blocks.
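A minimal sketch of the separator-prefix technique for string keys (the function name is hypothetical); it assumes the keys arrive in sorted order, so a strictly greater prefix always exists within the next block's first key.

def shortest_separator(prev_last_key, next_first_key):
    # Shortest prefix of next_first_key that is strictly greater
    # (lexicographically) than prev_last_key; stored in the internal
    # block in place of the full key.
    for i in range(1, len(next_first_key) + 1):
        if next_first_key[:i] > prev_last_key:
            return next_first_key[:i]
    return next_first_key

print(shortest_separator("baggage", "balloon"))  # -> 'bal'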
All the above compression techniques have some drawbacks. First, a full block must be decompressed to extract a single element. One technique to overcome this problem is to divide each block into sub-blocks and compress them separately. In this case, searching for or inserting an element will only need to decompress or compress a sub-block instead of a full block. Another drawback of compression techniques is that the number of stored elements may vary considerably from one block to another, depending on how well the elements are compressed inside each block.
6.15.5 History

The B tree was first described in the paper Organization and Maintenance of Large Ordered Indices. Acta Informatica 1: 173–189 (1972) by Rudolf Bayer and Edward M. McCreight. There is no single paper introducing the B+ tree concept. Instead, the notion of maintaining all data in leaf nodes is repeatedly brought up as an interesting variant. An early survey of B trees, also covering B+ trees, is Douglas Comer.[8] Comer notes that the B+ tree was used in IBM's VSAM data access software, and he refers to an IBM published article from 1973.

6.15.6 See also

• Binary search tree
• B-tree
• Divide and conquer algorithm

6.15.7 References

[1] Navathe, Ramez Elmasri, Shamkant B. (2010). Fundamentals of database systems (6th ed.). Upper Saddle River, N.J.: Pearson Education. pp. 652–660. ISBN 9780136086208.
[2] http://www.seanster.com/BplusTree/BplusTree.html
[3] Giampaolo, Dominic (1999). Practical File System Design with the Be File System (PDF). Morgan Kaufmann. ISBN 1-55860-497-9.
[4] Ramakrishnan, Raghu; Gehrke, Johannes (2000). Database Management Systems (2nd ed.). McGraw-Hill Higher Education. p. 267.
[5] SQLite Version 3 Overview
[6] CouchDB Guide (see note after 3rd paragraph)
[7] Tokyo Cabinet reference. Archived September 12, 2009, at the Wayback Machine.
[8] "The Ubiquitous B-Tree", ACM Computing Surveys 11(2): 121–137 (1979).

6.15.8 External links

• B+ tree in Python, used to implement a list
• Dr. Monge's B+ Tree index notes
• Evaluating the performance of CSB+-trees on Multithreaded Architectures
• Effect of node size on the performance of cache conscious B+-trees
• Fractal Prefetching B+-trees
• Towards pB+-trees in the field: implementations, choices and performance
• Cache-Conscious Index Structures for Main-Memory Databases
• Cache Oblivious B(+)-trees
• The Power of B-Trees: CouchDB B+ Tree Implementation
• B+ Tree Visualization

Implementations

• Interactive B+ Tree Implementation in C
• Interactive B+ Tree Implementation in C++
• Memory based B+ tree implementation as C++ template library
• Stream based B+ tree implementation as C++ template library
• Open Source JavaScript B+ Tree Implementation
• Perl implementation of B+ trees
• Java/C#/Python implementations of B+ trees
• File based B+Tree in C# with threading and MVCC support
• JavaScript B+ Tree, MIT License
• JavaScript B+ Tree, Interactive and Open Source
Chapter 7

Integer and string searching

7.1 Trie

This article is about a tree data structure. For the French commune, see Trie-sur-Baïse.

[Figure] A trie for keys "A", "to", "tea", "ted", "ten", "i", "in", and "inn".

In computer science, a trie, also called digital tree and sometimes radix tree or prefix tree (as they can be searched by prefixes), is a kind of search tree: an ordered tree data structure that is used to store a dynamic set or associative array where the keys are usually strings. Unlike a binary search tree, no node in the tree stores the key associated with that node; instead, its position in the tree defines the key with which it is associated. All the descendants of a node have a common prefix of the string associated with that node, and the root is associated with the empty string. Values are not necessarily associated with every node. Rather, values tend only to be associated with leaves, and with some inner nodes that correspond to keys of interest. For the space-optimized presentation of prefix tree, see compact prefix tree.

In the example shown, keys are listed in the nodes and values below them. Each complete English word has an arbitrary integer value associated with it. A trie can be seen as a tree-shaped deterministic finite automaton. Each finite language is generated by a trie automaton, and each trie can be compressed into a deterministic acyclic finite state automaton.

Though tries are usually keyed by character strings, they need not be. The same algorithms can be adapted to serve similar functions on ordered lists of any construct, e.g. permutations on a list of digits or shapes. In particular, a bitwise trie is keyed on the individual bits making up any fixed-length binary datum, such as an integer or memory address.

7.1.1 History and etymology

Tries were first described by René de la Briandais in 1959.[1][2]:336 The term trie was coined two years later by Edward Fredkin, who pronounces it /ˈtriː/ (as "tree"), after the middle syllable of retrieval.[3][4] However, other authors pronounce it /ˈtraɪ/ (as "try"), in an attempt to distinguish it verbally from "tree".[3][4][5]

7.1.2 Applications

As a replacement for other data structures

As discussed below, a trie has a number of advantages over binary search trees.[6] A trie can also be used to replace a hash table, over which it has the following advantages:

• Looking up data in a trie is faster in the worst case, O(m) time (where m is the length of a search string), compared to an imperfect hash table. An imperfect hash table can have key collisions. A key collision is the hash function mapping of different keys to the same position in a hash table. The worst-case lookup speed in an imperfect hash table is O(N) time, but far more typically is O(1), with O(m) time spent evaluating the hash.
• There are no collisions of different keys in a trie.
• Buckets in a trie, which are analogous to hash table buckets that store key collisions, are necessary only if a single key is associated with more than one value.

• There is no need to provide a hash function or to change hash functions as more keys are added to a trie.
• A trie can provide an alphabetical ordering of the entries by key.

Tries do have some drawbacks as well:

• Tries can be slower in some cases than hash tables for looking up data, especially if the data is directly accessed on a hard disk drive or some other secondary storage device where the random-access time is high compared to main memory.[7]
• Some keys, such as floating point numbers, can lead to long chains and prefixes that are not particularly meaningful. Nevertheless, a bitwise trie can handle standard IEEE single and double format floating point numbers.
• Some tries can require more space than a hash table, as memory may be allocated for each character in the search string, rather than a single chunk of memory for the whole entry, as in most hash tables.
Dictionary representation

A common application of a trie is storing a predictive text or autocomplete dictionary, such as found on a mobile telephone. Such applications take advantage of a trie's ability to quickly search for, insert, and delete entries; however, if storing dictionary words is all that is required (i.e., storage of information auxiliary to each word is not required), a minimal deterministic acyclic finite state automaton (DAFSA) would use less space than a trie. This is because a DAFSA can compress identical branches from the trie which correspond to the same suffixes (or parts) of different words being stored.

Tries are also well suited for implementing approximate matching algorithms,[8] including those used in spell checking and hyphenation[4] software.

Term indexing

A discrimination tree term index stores its information in a trie data structure.[9]

7.1.3 Algorithms

Lookup and membership are easily described. The listing below implements a recursive trie node as a Haskell data type. It stores an optional value and a list of children tries, indexed by the next character:

import Data.Map

data Trie a = Trie { value    :: Maybe a,
                     children :: Map Char (Trie a) }

We can look up a value in the trie as follows:

find :: String -> Trie a -> Maybe a
find []     t = value t
find (k:ks) t = do
    ct <- Data.Map.lookup k (children t)
    find ks ct

In an imperative style, and assuming an appropriate data type in place, we can describe the same algorithm in Python (here, specifically for testing membership). Note that children is a dictionary of a node's children; and we say that a "terminal" node is one which contains a valid word.

def find(node, key):
    for char in key:
        if char in node.children:
            node = node.children[char]
        else:
            return None
    return node

Insertion proceeds by walking the trie according to the string to be inserted, then appending new nodes for the suffix of the string that is not contained in the trie. In imperative Pascal-style pseudocode:

algorithm insert(root : node, s : string, value : any):
    node = root
    i = 0
    n = length(s)
    while i < n:
        if node.child(s[i]) != nil:
            node = node.child(s[i])
            i = i + 1
        else:
            break
    (* append new nodes, if necessary *)
    while i < n:
        node.child(s[i]) = new node
        node = node.child(s[i])
        i = i + 1
    node.value = value

Sorting

Lexicographic sorting of a set of keys can be accomplished with a simple trie-based algorithm as follows (a runnable sketch follows this list):

• Insert all keys in a trie.
• Output all keys in the trie by means of pre-order traversal, which results in output that is in lexicographically increasing order. Pre-order traversal is a kind of depth-first traversal.

This algorithm is a form of radix sort.
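Below is a minimal, runnable Python sketch of both steps, together with the insert operation the pseudocode above describes. The TrieNode class is an assumed stand-in for the "appropriate data type" mentioned earlier: children as a dictionary, and a non-None value marking a terminal node.

class TrieNode:
    def __init__(self):
        self.value = None      # non-None marks a terminal node
        self.children = {}     # char -> TrieNode

def insert(node, key, value):
    # Walk existing children, creating nodes for the unmatched suffix.
    for char in key:
        node = node.children.setdefault(char, TrieNode())
    node.value = value

def sorted_keys(node, prefix=""):
    # Pre-order traversal in character order yields the keys in
    # lexicographically increasing order (a form of radix sort).
    if node.value is not None:
        yield prefix
    for char in sorted(node.children):
        yield from sorted_keys(node.children[char], prefix + char)

root = TrieNode()
for i, word in enumerate(["to", "tea", "ted", "ten", "A", "i", "in", "inn"]):
    insert(root, word, i)
print(list(sorted_keys(root)))
# -> ['A', 'i', 'in', 'inn', 'tea', 'ted', 'ten', 'to']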
A trie forms the fundamental data structure of Burstsort, which (in 2007) was the fastest known string sorting algorithm.[10] However, there are now faster string sorting algorithms.[11]

Full text search

A special kind of trie, called a suffix tree, can be used to index all suffixes in a text in order to carry out fast full-text searches.

7.1.4 Implementation strategies

There are several ways to represent tries, corresponding to different trade-offs between memory use and speed of the operations.
[Figure] A trie implemented as a doubly chained tree: vertical arrows are child pointers, dashed horizontal arrows are next pointers. The set of strings stored in this trie is {baby, bad, bank, box, dad, dance}. The lists are sorted to allow traversal in lexicographic order.

The basic form is that of a linked set of nodes, where each node contains an array of child pointers, one for each symbol in the alphabet (so for the English alphabet, one would store 26 child pointers, and for the alphabet of bytes, 256 pointers). This is simple but wasteful in terms of memory: using the alphabet of bytes (size 256) and four-byte pointers, each node requires a kilobyte of storage, and when there is little overlap in the strings' prefixes, the number of required nodes is roughly the combined length of the stored strings.[2]:341 Put another way, the nodes near the bottom of the tree tend to have few children and there are many of them, so the structure wastes space storing null pointers.[12]

The storage problem can be alleviated by an implementation technique called alphabet reduction, whereby the original strings are reinterpreted as longer strings over a smaller alphabet. E.g., a string of n bytes can alternatively be regarded as a string of 2n four-bit units and stored in a trie with sixteen pointers per node. Lookups need to visit twice as many nodes in the worst case, but the storage requirements go down by a factor of eight.[2]:347–352

An alternative implementation represents a node as a triple (symbol, child, next) and links the children of a node together as a singly linked list: child points to the node's first child, next to the parent node's next child.[12][13] The set of children can also be represented as a binary search tree; one instance of this idea is the ternary search tree developed by Bentley and Sedgewick.[2]:353

Another alternative, in order to avoid the use of an array of 256 pointers (ASCII), as suggested before, is to store the alphabet array as a bitmap of 256 bits representing the ASCII alphabet, dramatically reducing the size of the nodes.[14]

Bitwise tries

Bitwise tries are much the same as a normal character-based trie except that individual bits are used to traverse what effectively becomes a form of binary tree. Generally, implementations use a special CPU instruction to very quickly find the first set bit in a fixed-length key (e.g., GCC's __builtin_clz() intrinsic). This value is then used to index a 32- or 64-entry table which points to the first item in the bitwise trie with that number of leading zero bits. The search then proceeds by testing each subsequent bit in the key and choosing child[0] or child[1] appropriately until the item is found.

Although this process might sound slow, it is very cache-local and highly parallelizable due to the lack of register dependencies, and therefore in fact has excellent performance on modern out-of-order execution CPUs. A red-black tree, for example, performs much better on paper, but is highly cache-unfriendly and causes multiple pipeline and TLB stalls on modern CPUs, which makes that algorithm bound by memory latency rather than CPU speed. In comparison, a bitwise trie rarely accesses memory, and when it does, it does so only to read, thus avoiding SMP cache coherency overhead. Hence, it is increasingly becoming the algorithm of choice for code that performs many rapid insertions and deletions, such as memory allocators (e.g., recent versions of the famous Doug Lea's allocator (dlmalloc) and its descendants).
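To make the child[0]/child[1] traversal concrete, here is a toy, illustrative Python version over fixed-width integer keys. The class and its layout are invented for this sketch; a real implementation would add the leading-zero-count table described above, for which Python's int.bit_length() could stand in for the CLZ instruction.

class BitwiseTrie:
    WIDTH = 32                           # fixed key width in bits

    def __init__(self):
        self.root = [None, None, None]   # [child0, child1, value]

    def insert(self, key, value):
        node = self.root
        for shift in range(self.WIDTH - 1, -1, -1):
            bit = (key >> shift) & 1     # test one bit per level
            if node[bit] is None:
                node[bit] = [None, None, None]
            node = node[bit]
        node[2] = value

    def find(self, key):
        node = self.root
        for shift in range(self.WIDTH - 1, -1, -1):
            node = node[(key >> shift) & 1]
            if node is None:
                return None
        return node[2]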
Compressing tries

Compressing the trie and merging the common branches can sometimes yield large performance gains. This works best under the following conditions:

• The trie is mostly static (key insertions to or deletions from a pre-filled trie are disabled).
• Only lookups are needed.
• The trie nodes are not keyed by node-specific data, or the nodes' data are common.[15]
• The total set of stored keys is very sparse within their representation space.

For example, it may be used to represent sparse bitsets, i.e., subsets of a much larger, fixed enumerable set. In such a case, the trie is keyed by the bit element position within the full set. The key is created from the string of bits needed to encode the integral position of each element. Such tries have a very degenerate form with many missing branches. After detecting the repetition of common patterns or filling the unused gaps, the unique leaf nodes (bit strings) can be stored and compressed easily, reducing the overall size of the trie.
Such compression is also used in the implementation of the various fast lookup tables for retrieving Unicode character properties. These could include case-mapping tables (e.g. for the Greek letter pi, from Π to π), or lookup tables normalizing the combination of base and combining characters (like the a-umlaut in German, ä, or the dalet-patah-dagesh-ole in Biblical Hebrew). For such applications, the representation is similar to transforming a very large, unidimensional, sparse table (e.g. Unicode code points) into a multidimensional matrix of their combinations, and then using the coordinates in the hyper-matrix as the string key of an uncompressed trie to represent the resulting character. The compression will then consist of detecting and merging the common columns within the hyper-matrix to compress the last dimension in the key. For example, to avoid storing the full, multibyte Unicode code point of each element forming a matrix column, the groupings of similar code points can be exploited. Each dimension of the hyper-matrix stores the start position of the next dimension, so that only the offset (typically a single byte) need be stored. The resulting vector is itself compressible when it is also sparse, so each dimension (associated to a layer level in the trie) can be compressed separately.

Some implementations do support such data compression within dynamic sparse tries and allow insertions and deletions in compressed tries. However, this usually has a significant cost when compressed segments need to be split or merged. Some tradeoff has to be made between data compression and update speed. A typical strategy is to limit the range of global lookups for comparing the common branches in the sparse trie.

The result of such compression may look similar to trying to transform the trie into a directed acyclic graph (DAG), because the reverse transform from a DAG to a trie is obvious and always possible. However, the shape of the DAG is determined by the form of the key chosen to index the nodes, in turn constraining the compression possible.

Another compression strategy is to "unravel" the data structure into a single byte array.[16] This approach eliminates the need for node pointers, substantially reducing the memory requirements. This in turn permits memory mapping and the use of virtual memory to efficiently load the data from disk.

One more approach is to "pack" the trie.[4] Liang describes a space-efficient implementation of a sparse packed trie applied to automatic hyphenation, in which the descendants of each node may be interleaved in memory.

External memory tries

Several trie variants are suitable for maintaining sets of strings in external memory, including suffix trees. A combination of a trie and a B-tree, called the B-trie, has also been suggested for this task; compared to suffix trees, they are limited in the supported operations but are also more compact, while performing update operations faster.[17]

7.1.5 See also

• Suffix tree
• Radix tree
• Directed acyclic word graph (aka DAWG)
• Acyclic deterministic finite automata
• Hash trie
• Deterministic finite automata
• Judy array
• Search algorithm
• Extendible hashing
• Hash array mapped trie
• Prefix Hash Tree
• Burstsort
• Luleå algorithm
• Huffman coding
• Ctrie
• HAT-trie

7.1.6 References

[1] de la Briandais, René (1959). File searching using variable length keys. Proc. Western J. Computer Conf. pp. 295–298. Cited by Brass.
[2] Brass, Peter (2008). Advanced Data Structures. Cambridge University Press.
[3] Black, Paul E. (2009-11-16). "trie". Dictionary of Algorithms and Data Structures. National Institute of Standards and Technology. Archived from the original on 2010-05-19.
[4] Franklin Mark Liang (1983). Word Hy-phen-a-tion By Com-put-er (Doctor of Philosophy thesis). Stanford University. Archived from the original (PDF) on 2010-05-19. Retrieved 2010-03-28.
[5] Knuth, Donald (1997). "6.3: Digital Searching". The Art of Computer Programming Volume 3: Sorting and Searching (2nd ed.). Addison-Wesley. p. 492. ISBN 0-201-89685-0.
[6] Bentley, Jon; Sedgewick, Robert (1998-04-01). "Ternary Search Trees". Dr. Dobb's Journal. Archived from the original on 2008-06-23.
[7] Edward Fredkin (1960). "Trie Memory". Communications of the ACM. 3 (9): 490–499. doi:10.1145/367390.367400.
[8] Aho, Alfred V.; Corasick, Margaret J. (Jun 1975). "Efficient String Matching: An Aid to Bibliographic Search" (PDF). Communications of the ACM. 18 (6): 333–340. doi:10.1145/360825.360855.
[9] John W. Wheeler; Guarionex Jordan. "An Empirical Study of Term Indexing in the Darwin Implementation of the Model Evolution Calculus". 2004. p. 5.
[10] "Cache-Efficient String Sorting Using Copying" (PDF). Retrieved 2008-11-15.
[11] "Engineering Radix Sort for Strings". Lecture Notes in Computer Science: 3–14. doi:10.1007/978-3-540-89097-3_3.
[12] Allison, Lloyd. "Tries". Retrieved 18 February 2014.
[13] Sahni, Sartaj. "Tries". Data Structures, Algorithms, & Applications in Java. University of Florida. Retrieved 18 February 2014.
[14] Bellekens, Xavier (2014). A Highly-Efficient Memory-Compression Scheme for GPU-Accelerated Intrusion Detection Systems. Glasgow, Scotland, UK: ACM. pp. 302:302–302:309. ISBN 978-1-4503-3033-6. Retrieved 21 October 2015.
[15] Jan Daciuk; Stoyan Mihov; Bruce W. Watson; Richard E. Watson (2000). "Incremental Construction of Minimal Acyclic Finite-State Automata". Computational Linguistics. Association for Computational Linguistics. 26: 3. doi:10.1162/089120100561601. Archived from the original on 2006-03-13. Retrieved 2009-05-28. This paper presents a method for direct building of a minimal acyclic finite state automaton which recognizes a given finite list of words in lexicographical order. Our approach is to construct a minimal automaton in a single phase by adding new strings one by one and minimizing the resulting automaton on-the-fly.
[16] Ulrich Germann; Eric Joanis; Samuel Larkin (2009). "Tightly packed tries: how to fit large models into memory, and make them load fast, too" (PDF). ACL Workshops: Proceedings of the Workshop on Software Engineering, Testing, and Quality Assurance for Natural Language Processing. Association for Computational Linguistics. pp. 31–39. We present Tightly Packed Tries (TPTs), a compact implementation of read-only, compressed trie structures with fast on-demand paging and short load times. We demonstrate the benefits of TPTs for storing n-gram back-off language models and phrase tables for statistical machine translation. Encoded as TPTs, these databases require less space than flat text file representations of the same data compressed with the gzip utility. At the same time, they can be mapped into memory quickly and be searched directly in time linear in the length of the key, without the need to decompress the entire file. The overhead for local decompression during search is marginal.
[17] Askitis, Nikolas; Zobel, Justin (2008). "B-tries for Disk-based String Management" (PDF). VLDB Journal: 1–26. ISSN 1066-8888.

7.1.7 External links

• NIST's Dictionary of Algorithms and Data Structures: Trie

7.2 Radix tree

[Figure] An example of a radix tree

In computer science, a radix tree (also radix trie or compact prefix tree) is a data structure that represents a space-optimized trie in which each node that is the only child is merged with its parent. The result is that the number of children of every internal node is at least the radix r of the radix tree, where r is a positive integer and a power x of 2, having x ≥ 1. Unlike in regular tries, edges can be labeled with sequences of elements as well as single elements. This makes radix trees much more efficient for small sets (especially if the strings are long) and for sets of strings that share long prefixes.

Unlike regular trees (where whole keys are compared en masse from their beginning up to the point of inequality), the key at each node is compared chunk-of-bits by chunk-of-bits, where the quantity of bits in that chunk at that node is the radix r of the radix trie. When r is 2, the radix trie is binary (i.e., compare that node's 1-bit portion of the key), which minimizes sparseness at the expense of maximizing trie depth, i.e., maximizing up to conflation of nondiverging bit-strings in the key. When r is an integer power of 2 greater than or equal to 4, then the radix trie is an r-ary trie, which lessens the depth of the radix trie at the expense of potential sparseness.

As an optimization, edge labels can be stored in constant size by using two pointers to a string (for the first and last elements).[1]

Note that although the examples in this article show strings as sequences of characters, the type of the string elements can be chosen arbitrarily; for example, as a bit or byte of the string representation when using multibyte character encodings or Unicode.
7.2.1 Applications

Radix trees are useful for constructing associative arrays with keys that can be expressed as strings. They find particular application in the area of IP routing,[2] where the ability to contain large ranges of values with a few exceptions is particularly suited to the hierarchical organization of IP addresses.[3] They are also used for inverted indexes of text documents in information retrieval.

7.2.2 Operations

Radix trees support insertion, deletion, and searching operations. Insertion adds a new string to the trie while trying to minimize the amount of data stored. Deletion removes a string from the trie. Searching operations include (but are not necessarily limited to) exact lookup, find predecessor, find successor, and find all strings with a prefix. All of these operations are O(k) where k is the maximum length of all strings in the set, where length is measured in the quantity of bits equal to the radix of the radix trie.

Lookup

[Figure] Finding a string in a Patricia trie

The lookup operation determines if a string exists in a trie. Most operations modify this approach in some way to handle their specific tasks. For instance, the node where a string terminates may be of importance. This operation is similar to tries except that some edges consume multiple elements.

The following pseudocode assumes that these classes exist.

Edge
• Node targetNode
• string label

Node
• Array of Edges edges
• function isLeaf()

function lookup(string x) {
    // Begin at the root with no elements found
    Node traverseNode := root;
    int elementsFound := 0;
    // Traverse until a leaf is found or it is not possible to continue
    while (traverseNode != null && !traverseNode.isLeaf() && elementsFound < x.length) {
        // Get the next edge to explore based on the elements not yet found in x
        Edge nextEdge := select edge from traverseNode.edges
                         where edge.label is a prefix of x.suffix(elementsFound)
        // x.suffix(elementsFound) returns the last (x.length - elementsFound) elements of x
        // Was an edge found?
        if (nextEdge != null) {
            // Set the next node to explore
            traverseNode := nextEdge.targetNode;
            // Increment elements found based on the label stored at the edge
            elementsFound += nextEdge.label.length;
        } else {
            // Terminate loop
            traverseNode := null;
        }
    }
    // A match is found if we arrive at a leaf node and
    // have used up exactly x.length elements
    return (traverseNode != null && traverseNode.isLeaf() && elementsFound == x.length);
}

Insertion

To insert a string, we search the tree until we can make no further progress. At this point we either add a new outgoing edge labeled with all remaining elements in the input string, or, if there is already an outgoing edge sharing a prefix with the remaining input string, we split it into two edges (the first labeled with the common prefix) and proceed. This splitting step ensures that no node has more children than there are possible string elements.

Several cases of insertion are shown below, though more may exist; a short code sketch follows the examples. Note that r simply represents the root. It is assumed that edges can be labelled with empty strings to terminate strings where necessary and that the root has no incoming edge. (The lookup algorithm described above will not work when using empty-string edges.)
• Insert 'water' at the root
• Insert 'slower' while keeping 'slow'
• Insert 'test' which is a prefix of 'tester'
• Insert 'toast' while splitting 'te' and moving previous strings a level lower
• Insert 'team' while splitting 'test' and creating a new edge label 'st'
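The edge-splitting step can be made concrete with a small Python sketch (the class and helper names are invented here). radix_insert handles the three cases from the examples above: descend a fully matched edge, split a partially matched edge, or add a fresh leaf.

def common_prefix_len(a, b):
    i = 0
    while i < min(len(a), len(b)) and a[i] == b[i]:
        i += 1
    return i

class RadixNode:
    def __init__(self, is_word=False):
        self.edges = {}          # edge label -> child RadixNode
        self.is_word = is_word

def radix_insert(node, s):
    if not s:                     # consumed the whole string here
        node.is_word = True
        return
    for label in list(node.edges):
        p = common_prefix_len(label, s)
        if p == 0:
            continue              # no shared prefix with this edge
        if p == len(label):       # edge fully matched: descend
            radix_insert(node.edges[label], s[p:])
        else:                     # partial match: split the edge at p
            child = node.edges.pop(label)
            mid = RadixNode()
            mid.edges[label[p:]] = child
            node.edges[s[:p]] = mid
            radix_insert(mid, s[p:])
        return
    node.edges[s] = RadixNode(is_word=True)   # nothing matched: new leaf

root = RadixNode()
for w in ["water", "slow", "slower", "tester", "test", "toast", "team"]:
    radix_insert(root, w)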

Deletion

To delete a string x from a tree, we first locate the leaf representing x. Then, assuming x exists, we remove the corresponding leaf node. If the parent of our leaf node has only one other child, then that child's incoming label is appended to the parent's incoming label and the child is removed.

Additional operations

• Find all strings with a common prefix: Returns an array of strings which begin with the same prefix.
• Find predecessor: Locates the largest string less than a given string, by lexicographic order.
• Find successor: Locates the smallest string greater than a given string, by lexicographic order.

7.2.3 History

Donald R. Morrison first described what he called "Patricia trees" in 1968;[4] the name comes from the acronym PATRICIA, which stands for "Practical Algorithm To Retrieve Information Coded In Alphanumeric". Gernot Gwehenberger independently invented and described the data structure at about the same time.[5] PATRICIA tries are radix tries with radix equal to 2, which means that each bit of the key is compared individually and each node is a two-way (i.e., left versus right) branch.

7.2.4 Comparison to other data structures

(In the following comparisons, it is assumed that the keys are of length k and the data structure contains n members.)
Unlike balanced trees, radix trees permit lookup, insertion, and deletion in O(k) time rather than O(log n). This does not seem like an advantage, since normally k ≥ log n, but in a balanced tree every comparison is a string comparison requiring O(k) worst-case time, many of which are slow in practice due to long common prefixes (in the case where comparisons begin at the start of the string). In a trie, all comparisons require constant time, but it takes m comparisons to look up a string of length m. Radix trees can perform these operations with fewer comparisons, and require many fewer nodes.

Radix trees also share the disadvantages of tries, however: as they can only be applied to strings of elements or elements with an efficiently reversible mapping to strings, they lack the full generality of balanced search trees, which apply to any data type with a total ordering. A reversible mapping to strings can be used to produce the required total ordering for balanced search trees, but not the other way around. This can also be problematic if a data type only provides a comparison operation, but not a (de)serialization operation.

Hash tables are commonly said to have expected O(1) insertion and deletion times, but this is only true when considering computation of the hash of the key to be a constant-time operation. When hashing the key is taken into account, hash tables have expected O(k) insertion and deletion times, but may take longer in the worst case depending on how collisions are handled. Radix trees have worst-case O(k) insertion and deletion. The successor/predecessor operations of radix trees are also not implemented by hash tables.

7.2.5 Variants

A common extension of radix trees uses two colors of nodes, 'black' and 'white'. To check if a given string is stored in the tree, the search starts from the top and follows the edges of the input string until no further progress can be made. If the search string is consumed and the final node is a black node, the search has failed; if it is white, the search has succeeded. This enables us to add a large range of strings with a common prefix to the tree, using white nodes, then remove a small set of "exceptions" in a space-efficient manner by inserting them using black nodes.

The HAT-trie is a cache-conscious data structure based on radix trees that offers efficient string storage and retrieval, and ordered iterations. Performance, with respect to both time and space, is comparable to the cache-conscious hashtable.[6][7]

The adaptive radix tree is a radix tree variant that integrates adaptive node sizes into the radix tree. One major drawback of the usual radix trees is the use of space, because a constant node size is used in every level. The major difference between the radix tree and the adaptive radix tree is its variable size for each node, based on the number of child elements, which grows while adding new entries. Hence, the adaptive radix tree leads to a better use of space without reducing its speed.[8][9][10]

7.2.6 See also

• Prefix tree (also known as a Trie)
• Deterministic acyclic finite state automaton (DAFSA)
• Ternary search tries
• Acyclic deterministic finite automata
• Hash trie
• Deterministic finite automata
• Judy array
• Search algorithm
• Extendible hashing
• Hash array mapped trie
• Prefix hash tree
• Burstsort
• Luleå algorithm
• Huffman coding

7.2.7 References

[1] Morin, Patrick. "Data Structures for Strings" (PDF). Retrieved 15 April 2012.
[2] "rtfree(9)". www.freebsd.org. Retrieved 2016-10-23.
[3] Knizhnik, Konstantin. "Patricia Tries: A Better Index For Prefix Searches", Dr. Dobb's Journal, June, 2008.
[4] Morrison, Donald R. Practical Algorithm to Retrieve Information Coded in Alphanumeric.
[5] G. Gwehenberger, Anwendung einer binären Verweiskettenmethode beim Aufbau von Listen (Use of a binary pointer-chain method when building lists). Elektronische Rechenanlagen 10 (1968), pp. 223–226.
[6] Askitis, Nikolas; Sinha, Ranjan (2007). HAT-trie: A Cache-conscious Trie-based Data Structure for Strings. Proceedings of the 30th Australasian Conference on Computer Science. 62. pp. 97–105. ISBN 1-920682-43-0.
[7] Askitis, Nikolas; Sinha, Ranjan (October 2010). "Engineering scalable, cache and space efficient tries for strings". The VLDB Journal. 19 (5): 633–660. doi:10.1007/s00778-010-0183-9.
[8] Kemper, Alfons; Eickler, André (2013). Datenbanksysteme, Eine Einführung (Database Systems, An Introduction). 9. pp. 604–605. ISBN 978-3-486-72139-3.
[9] "armon/libart · GitHub". GitHub. Retrieved 17 September 2014.
[10] http://www-db.in.tum.de/~leis/papers/ART.pdf
7.2.8 External links

• Algorithms and Data Structures Research & Reference Material: PATRICIA, by Lloyd Allison, Monash University
• Patricia Tree, NIST Dictionary of Algorithms and Data Structures
• Crit-bit trees, by Daniel J. Bernstein
• Radix Tree API in the Linux Kernel, by Jonathan Corbet
• Kart (key alteration radix tree), by Paul Jarc

Implementations

• FreeBSD Implementation, used for paging, forwarding and other things.
• Linux Kernel Implementation, used for the page cache, among other things.
• GNU C++ Standard library has a trie implementation
• Java implementation of Concurrent Radix Tree, by Niall Gallagher
• C# implementation of a Radix Tree
• Practical Algorithm Template Library, a C++ library on PATRICIA tries (VC++ >=2003, GCC G++ 3.x), by Roman S. Klyujkov
• Patricia Trie C++ template class implementation, by Radu Gruian
• Haskell standard library implementation "based on big-endian patricia trees". Web-browsable source code.
• Patricia Trie implementation in Java, by Roger Kapsi and Sam Berlin
• Crit-bit trees forked from C code by Daniel J. Bernstein
• Patricia Trie implementation in C, in libcprops
• Patricia Trees: efficient sets and maps over integers in OCaml, by Jean-Christophe Filliâtre
• Radix DB (Patricia trie) implementation in C, by G. B. Versiani

7.3 Suffix tree

[Figure] Suffix tree for the text BANANA. Each substring is terminated with the special character $. The six paths from the root to the leaves (shown as boxes) correspond to the six suffixes A$, NA$, ANA$, NANA$, ANANA$ and BANANA$. The numbers in the leaves give the start position of the corresponding suffix. Suffix links, drawn dashed, are used during construction.

In computer science, a suffix tree (also called PAT tree or, in an earlier form, position tree) is a compressed trie containing all the suffixes of the given text as their keys and positions in the text as their values. Suffix trees allow particularly fast implementations of many important string operations.

The construction of such a tree for the string S takes time and space linear in the length of S. Once constructed, several operations can be performed quickly, for instance locating a substring in S, locating a substring if a certain number of mistakes are allowed, locating matches for a regular expression pattern, etc. Suffix trees also provide one of the first linear-time solutions for the longest common substring problem. These speedups come at a cost: storing a string's suffix tree typically requires significantly more space than storing the string itself.

7.3.1 History

The concept was first introduced by Weiner (1973), which Donald Knuth subsequently characterized as "Algorithm of the Year 1973". The construction was greatly simplified by McCreight (1976), and also by Ukkonen (1995).[1] Ukkonen provided the first online construction of suffix trees, now known as Ukkonen's algorithm, with running time that matched the then fastest algorithms. These algorithms are all linear-time for a constant-size alphabet, and have worst-case running time of O(n log n) in general.
Farach (1997) gave the first suffix tree construction algorithm that is optimal for all alphabets. In particular, this is the first linear-time algorithm for strings drawn from an alphabet of integers in a polynomial range. Farach's algorithm has become the basis for new algorithms for constructing both suffix trees and suffix arrays, for example, in external memory, compressed, succinct, etc.

7.3.2 Definition

The suffix tree for the string S of length n is defined as a tree such that:[2]

• The tree has exactly n leaves numbered from 1 to n.
• Except for the root, every internal node has at least two children.
• Each edge is labeled with a non-empty substring of S.
• No two edges starting out of a node can have string-labels beginning with the same character.
• The string obtained by concatenating all the string-labels found on the path from the root to leaf i spells out suffix S[i..n], for i from 1 to n.

Since such a tree does not exist for all strings, S is padded with a terminal symbol not seen in the string (usually denoted $). This ensures that no suffix is a prefix of another, and that there will be n leaf nodes, one for each of the n suffixes of S. Since all internal non-root nodes are branching, there can be at most n − 1 such nodes, and n + (n − 1) + 1 = 2n nodes in total (n leaves, n − 1 internal non-root nodes, 1 root).

Suffix links are a key feature for older linear-time construction algorithms, although most newer algorithms, which are based on Farach's algorithm, dispense with suffix links. In a complete suffix tree, all internal non-root nodes have a suffix link to another internal node. If the path from the root to a node spells the string χα, where χ is a single character and α is a string (possibly empty), it has a suffix link to the internal node representing α. See for example the suffix link from the node for ANA to the node for NA in the figure above. Suffix links are also used in some algorithms running on the tree.

7.3.3 Generalized suffix tree

A generalized suffix tree is a suffix tree made for a set of words instead of a single word. It represents all suffixes from this set of words. Each word must be terminated by a different termination symbol or word.

7.3.4 Functionality

A suffix tree for a string S of length n can be built in Θ(n) time, if the letters come from an alphabet of integers in a polynomial range (in particular, this is true for constant-sized alphabets).[3] For larger alphabets, the running time is dominated by first sorting the letters to bring them into a range of size O(n); in general, this takes O(n log n) time. The costs below are given under the assumption that the alphabet is constant.

Assume that a suffix tree has been built for the string S of length n, or that a generalised suffix tree has been built for the set of strings D = {S_1, S_2, ..., S_K} of total length n = n_1 + n_2 + ··· + n_K. You can:

• Search for strings:
  • Check if a string P of length m is a substring in O(m) time.[4]
  • Find the first occurrence of the patterns P_1, ..., P_q of total length m as substrings in O(m) time.
  • Find all z occurrences of the patterns P_1, ..., P_q of total length m as substrings in O(m + z) time.[5]
  • Search for a regular expression P in time expected sublinear in n.[6]
  • Find, for each suffix of a pattern P, the length of the longest match between a prefix of P[i...m] and a substring in D in Θ(m) time.[7] This is termed the matching statistics for P.
• Find properties of the strings:
  • Find the longest common substrings of the strings S_i and S_j in Θ(n_i + n_j) time.[8]
  • Find all maximal pairs, maximal repeats or supermaximal repeats in Θ(n + z) time.[9]
  • Find the Lempel–Ziv decomposition in Θ(n) time.[10]
  • Find the longest repeated substrings in Θ(n) time.
  • Find the most frequently occurring substrings of a minimum length in Θ(n) time.
  • Find the shortest strings from Σ that do not occur in D, in O(n + z) time, if there are z such strings.
  • Find the shortest substrings occurring only once in Θ(n) time.
  • Find, for each i, the shortest substrings of S_i not occurring elsewhere in D in Θ(n) time.

The suffix tree can be prepared for constant-time lowest common ancestor retrieval between nodes in Θ(n) time.[11] One can then also:
• Find the longest common prefix between the suffixes S_i[p..n_i] and S_j[q..n_j] in Θ(1) time.[12]
• Search for a pattern P of length m with at most k mismatches in O(kn + z) time, where z is the number of hits.[13]
• Find all z maximal palindromes in Θ(n) time,[14] or Θ(gn) time if gaps of length g are allowed, or Θ(kn) if k mismatches are allowed.[15]
• Find all z tandem repeats in O(n log n + z) time, and k-mismatch tandem repeats in O(kn log(n/k) + z) time.[16]
• Find the longest common substrings to at least k strings in D for k = 2, ..., K in Θ(n) time.[17]
• Find the longest palindromic substring of a given string (using the generalized suffix tree of the string and its reverse) in linear time.[18]

7.3.5 Applications

Suffix trees can be used to solve a large number of string problems that occur in text-editing, free-text search, computational biology and other application areas.[19] Primary applications include:[19]

• String search, in O(m) complexity, where m is the length of the sub-string (but with initial O(n) time required to build the suffix tree for the string)
• Finding the longest repeated substring
• Finding the longest common substring
• Finding the longest palindrome in a string

Suffix trees are often used in bioinformatics applications, searching for patterns in DNA or protein sequences (which can be viewed as long strings of characters). The ability to search efficiently with mismatches might be considered their greatest strength. Suffix trees are also used in data compression; they can be used to find repeated data, and can be used for the sorting stage of the Burrows–Wheeler transform. Variants of the LZW compression schemes use suffix trees (LZSS). A suffix tree is also used in suffix tree clustering, a data clustering algorithm used in some search engines.[20]

7.3.6 Implementation

If each node and edge can be represented in Θ(1) space, the entire tree can be represented in Θ(n) space. The total length of all the strings on all of the edges in the tree is O(n²), but each edge can be stored as the position and length of a substring of S, giving a total space usage of Θ(n) computer words. The worst-case space usage of a suffix tree is seen with a Fibonacci word, giving the full 2n nodes.

An important choice when making a suffix tree implementation is the parent-child relationships between nodes. The most common is using linked lists called sibling lists. Each node has a pointer to its first child, and to the next node in the child list it is a part of. Other implementations with efficient running time properties use hash maps, sorted or unsorted arrays (with array doubling), or balanced search trees. We are interested in:

• The cost of finding the child on a given character.
• The cost of inserting a child.
• The cost of enlisting all children of a node (divided by the number of children in the table below).

Let σ be the size of the alphabet. Then you have the following costs:

                             Lookup     Insertion   Traversal
Sibling lists (unsorted)     O(σ)       Θ(1)        Θ(1)
Bitwise sibling trees        O(log σ)   Θ(1)        Θ(1)
Hash maps                    Θ(1)       Θ(1)        O(σ)
Balanced search tree         O(log σ)   O(log σ)    O(1)
Sorted arrays                O(log σ)   O(σ)        O(1)
Hash maps + sibling lists    O(1)       O(1)        O(1)

The insertion cost is amortised, and the costs for hashing are given for perfect hashing.

The large amount of information in each edge and node makes the suffix tree very expensive, consuming about 10 to 20 times the memory size of the source text in good implementations. The suffix array reduces this requirement to a factor of 8 (for an array including LCP values built within 32-bit address space and 8-bit characters). This factor depends on the properties and may reach 2 with usage of 4-byte wide characters (needed to contain any symbol in some UNIX-like systems, see wchar_t) on 32-bit systems. Researchers have continued to find smaller indexing structures.

7.3.7 Parallel construction

Various parallel algorithms to speed up suffix tree construction have been proposed.[21][22][23][24][25] Recently, a practical parallel algorithm for suffix tree construction with O(n) work (sequential time) and O(log² n) span has been developed. The algorithm achieves good parallel scalability on shared-memory multicore machines and can index the 3GB human genome in under 3 minutes using a 40-core machine.[26]
7.3.8 External construction

Though linear, the memory usage of a suffix tree is significantly higher than the actual size of the sequence collection. For a large text, construction may require external memory approaches.

There are theoretical results for constructing suffix trees in external memory. The algorithm by Farach-Colton, Ferragina & Muthukrishnan (2000) is theoretically optimal, with an I/O complexity equal to that of sorting. However, the overall intricacy of this algorithm has prevented, so far, its practical implementation.[27]

On the other hand, there have been practical works for constructing disk-based suffix trees which scale to (few) GB/hours. The state of the art methods are TDD,[28] TRELLIS,[29] DiGeST,[30] and B²ST.[31]

TDD and TRELLIS scale up to the entire human genome (approximately 3GB), resulting in a disk-based suffix tree of a size in the tens of gigabytes.[28][29] However, these methods cannot handle efficiently collections of sequences exceeding 3GB.[30] DiGeST performs significantly better and is able to handle collections of sequences in the order of 6GB in about 6 hours.[30] All these methods can efficiently build suffix trees for the case when the tree does not fit in main memory, but the input does. The most recent method, B²ST,[31] scales to handle inputs that do not fit in main memory. ERA is a recent parallel suffix tree construction method that is significantly faster. ERA can index the entire human genome in 19 minutes on an 8-core desktop computer with 16GB RAM. On a simple Linux cluster with 16 nodes (4GB RAM per node), ERA can index the entire human genome in less than 9 minutes.[32]

7.3.9 See also

• Suffix array
• Generalised suffix tree
• Trie

7.3.10 Notes

[1] Giegerich & Kurtz (1997).
[2] http://www.cs.uoi.gr/~kblekas/courses/bioinformatics/Suffix_Trees1.pdf
[3] Farach (1997).
[4] Gusfield (1999), p. 92.
[5] Gusfield (1999), p. 123.
[6] Baeza-Yates & Gonnet (1996).
[7] Gusfield (1999), p. 132.
[8] Gusfield (1999), p. 125.
[9] Gusfield (1999), p. 144.
[10] Gusfield (1999), p. 166.
[11] Gusfield (1999), Chapter 8.
[12] Gusfield (1999), p. 196.
[13] Gusfield (1999), p. 200.
[14] Gusfield (1999), p. 198.
[15] Gusfield (1999), p. 201.
[16] Gusfield (1999), p. 204.
[17] Gusfield (1999), p. 205.
[18] Gusfield (1999), pp. 197–199.
[19] Allison, L. "Suffix Trees". Retrieved 2008-10-14.
[20] First introduced by Zamir & Etzioni (1998).
[21] Apostolico et al. (1988).
[22] Hariharan (1994).
[23] Sahinalp & Vishkin (1994).
[24] Farach & Muthukrishnan (1996).
[25] Iliopoulos & Rytter (2004).
[26] Shun & Blelloch (2014).
[27] Smyth (2003).
[28] Tata, Hankins & Patel (2003).
[29] Phoophakdee & Zaki (2007).
[30] Barsky et al. (2008).
[31] Barsky et al. (2009).
[32] Mansour et al. (2011).

7.3.11 References

• Apostolico, A.; Iliopoulos, C.; Landau, G. M.; Schieber, B.; Vishkin, U. (1988), "Parallel construction of a suffix tree with applications", Algorithmica, 3.
• Baeza-Yates, Ricardo A.; Gonnet, Gaston H. (1996), "Fast text searching for regular expressions or automaton searching on tries", Journal of the ACM, 43 (6): 915–936, doi:10.1145/235809.235810.
• Barsky, Marina; Stege, Ulrike; Thomo, Alex; Upton, Chris (2008), "A new method for indexing genomes using on-disk suffix trees", CIKM '08: Proceedings of the 17th ACM Conference on Information and Knowledge Management, New York, NY, USA: ACM, pp. 649–658.
• Barsky, Marina; Stege, Ulrike; Thomo, Alex; Upton, Chris (2009), "Suffix trees for very large genomic sequences", CIKM '09: Proceedings of the 18th ACM Conference on Information and Knowledge Management, New York, NY, USA: ACM.
• Farach, Martin (1997), "Optimal Suffix Tree Construction with Large Alphabets" (PDF), 38th IEEE Symposium on Foundations of Computer Science (FOCS '97), pp. 137–143.
• Farach, Martin; Muthukrishnan, S. (1996), "Optimal Logarithmic Time Randomized Suffix Tree Construction", International Colloquium on Automata Languages and Programming.
• Farach-Colton, Martin; Ferragina, Paolo; Muthukrishnan, S. (2000), "On the sorting-complexity of suffix tree construction", Journal of the ACM, 47 (6): 987–1011, doi:10.1145/355541.355547.
• Giegerich, R.; Kurtz, S. (1997), "From Ukkonen to McCreight and Weiner: A Unifying View of Linear-Time Suffix Tree Construction" (PDF), Algorithmica, 19 (3): 331–353, doi:10.1007/PL00009177.
• Gusfield, Dan (1999), Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology, Cambridge University Press, ISBN 0-521-58519-8.
• Hariharan, Ramesh (1994), "Optimal Parallel Suffix Tree Construction", ACM Symposium on Theory of Computing.
• Iliopoulos, Costas; Rytter, Wojciech (2004), "On Parallel Transformations of Suffix Arrays into Suffix Trees", 15th Australasian Workshop on Combinatorial Algorithms.
• Mansour, Essam; Allam, Amin; Skiadopoulos, Spiros; Kalnis, Panos (2011), "ERA: Efficient Serial and Parallel Suffix Tree Construction for Very Long Strings" (PDF), PVLDB, 5 (1): 49–60, doi:10.14778/2047485.2047490.
• McCreight, Edward M. (1976), "A Space-Economical Suffix Tree Construction Algorithm", Journal of the ACM, 23 (2): 262–272, CiteSeerX 10.1.1.130.8022, doi:10.1145/321941.321946.
• Phoophakdee, Benjarath; Zaki, Mohammed J. (2007), "Genome-scale disk-based suffix tree indexing", SIGMOD '07: Proceedings of the ACM SIGMOD International Conference on Management of Data, New York, NY, USA: ACM, pp. 833–844.
• Sahinalp, Cenk; Vishkin, Uzi (1994), "Symmetry breaking for suffix tree construction", ACM Symposium on Theory of Computing.
• Smyth, William (2003), Computing Patterns in Strings, Addison-Wesley.
• Shun, Julian; Blelloch, Guy E. (2014), "A Simple Parallel Cartesian Tree Algorithm and its Application to Parallel Suffix Tree Construction", ACM Transactions on Parallel Computing.
• Tata, Sandeep; Hankins, Richard A.; Patel, Jignesh M. (2003), "Practical Suffix Tree Construction", VLDB '03: Proceedings of the 30th International Conference on Very Large Data Bases, Morgan Kaufmann, pp. 36–47.
• Ukkonen, E. (1995), "On-line construction of suffix trees" (PDF), Algorithmica, 14 (3): 249–260, doi:10.1007/BF01206331.
• Weiner, P. (1973), "Linear pattern matching algorithms" (PDF), 14th Annual IEEE Symposium on Switching and Automata Theory, pp. 1–11, doi:10.1109/SWAT.1973.13.
• Zamir, Oren; Etzioni, Oren (1998), "Web document clustering: a feasibility demonstration", SIGIR '98: Proceedings of the 21st annual international ACM SIGIR conference on Research and development in information retrieval, New York, NY, USA: ACM, pp. 46–54.

7.3.12 External links

• Suffix Trees by Sartaj Sahni
• NIST's Dictionary of Algorithms and Data Structures: Suffix Tree
• Universal Data Compression Based on the Burrows-Wheeler Transformation: Theory and Practice, application of suffix trees in the BWT
• Theory and Practice of Succinct Data Structures, C++ implementation of a compressed suffix tree
• Ukkonen's Suffix Tree Implementation in C Part 1 Part 2 Part 3 Part 4 Part 5 Part 6

7.4 Suffix array

In computer science, a suffix array is a sorted array of all suffixes of a string. It is a data structure used, among others, in full-text indices, data compression algorithms, and within the field of bioinformatics.[1]

Suffix arrays were introduced by Manber & Myers (1990) as a simple, space-efficient alternative to suffix trees. They had independently been discovered by Gaston Gonnet in 1987 under the name PAT array (Gonnet, Baeza-Yates & Snider 1992).
236 CHAPTER 7. INTEGER AND STRING SEARCHING

7.4.1 Definition

Let S = S[1]S[2]...S[n] be a string and let S[i, j] denote the substring of S ranging from i to j.

The suffix array A of S is now defined to be an array of integers providing the starting positions of suffixes of S in lexicographical order. This means that an entry A[i] contains the starting position of the i-th smallest suffix in S, and thus for all 1 < i ≤ n: S[A[i − 1], n] < S[A[i], n].
7.4.2 Example

Consider the text S = banana$ to be indexed:

i    : 1 2 3 4 5 6 7
S[i] : b a n a n a $

The text ends with the special sentinel letter $ that is unique and lexicographically smaller than any other character. The text has the following suffixes:

Suffix    i
banana$   1
anana$    2
nana$     3
ana$      4
na$       5
a$        6
$         7

These suffixes can be sorted in ascending order:

Suffix    i
$         7
a$        6
ana$      4
anana$    2
banana$   1
na$       5
nana$     3

The suffix array A contains the starting positions of these sorted suffixes:

i    : 1 2 3 4 5 6 7
A[i] : 7 6 4 2 1 5 3

The suffix array with the suffixes written out underneath for clarity:

A[i]   : 7  6   4     2      1        5   3
suffix : $  a$  ana$  anana$ banana$  na$ nana$

So for example, A[3] contains the value 4, and therefore refers to the suffix starting at position 4 within S, which is the suffix ana$.
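The example can be checked with a few lines of Python (a sketch of our own, using the naive comparison-based construction discussed further below; the sort works with 0-based positions, which are then shifted to the 1-based convention of this section):

S = "banana$"
# Sort the suffix start positions by the suffixes they denote (0-based),
# then shift to the 1-based positions used in this section.
A = [i + 1 for i in sorted(range(len(S)), key=lambda i: S[i:])]
print(A)              # [7, 6, 4, 2, 1, 5, 3]
print(S[A[2] - 1:])   # ana$ -- the third entry (A[3] in 1-based terms) is 4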
7.4.3 Correspondence to suffix trees

Suffix arrays are closely related to suffix trees:

• Suffix arrays can be constructed by performing a depth-first traversal of a suffix tree. The suffix array corresponds to the leaf-labels given in the order in which these are visited during the traversal, if edges are visited in the lexicographical order of their first character.

• A suffix tree can be constructed in linear time by using a combination of suffix array and LCP array. For a description of the algorithm, see the corresponding section in the LCP array article.

It has been shown that every suffix tree algorithm can be systematically replaced with an algorithm that uses a suffix array enhanced with additional information (such as the LCP array) and solves the same problem in the same time complexity.[2] Advantages of suffix arrays over suffix trees include improved space requirements, simpler linear-time construction algorithms (e.g., compared to Ukkonen’s algorithm) and improved cache locality.[1]
7.4.4 Space Efficiency

Suffix arrays were introduced by Manber & Myers (1990) in order to improve over the space requirements of suffix trees: suffix arrays store n integers. Assuming an integer requires 4 bytes, a suffix array requires 4n bytes in total. This is significantly less than the 20n bytes which are required by a careful suffix tree implementation.[3]

However, in certain applications, the space requirements of suffix arrays may still be prohibitive. Analyzed in bits, a suffix array requires O(n log n) space, whereas the original text over an alphabet of size σ only requires O(n log σ) bits. For a human genome with σ = 4 and n = 3.4 × 10⁹, the suffix array would therefore occupy about 16 times more memory than the genome itself (roughly 32 bits per position versus the 2 bits per character needed for a four-letter alphabet).

Such discrepancies motivated a trend towards compressed suffix arrays and BWT-based compressed full-text indices such as the FM-index. These data structures require only space within the size of the text or even less.

7.4.5 Construction Algorithms

A suffix tree can be built in O(n) and can be converted into a suffix array by traversing the tree depth-first, also in O(n), so there exist algorithms that can build a suffix array in O(n).

A naive approach to construct a suffix array is to use a comparison-based sorting algorithm. These algorithms require O(n log n) suffix comparisons, but a suffix comparison runs in O(n) time, so the overall runtime of this approach is O(n² log n).
More advanced algorithms take advantage of the fact that the suffixes to be sorted are not arbitrary strings but related to each other. These algorithms strive to achieve the following goals:[4]

• minimal asymptotic complexity Θ(n)

• lightweight in space, meaning little or no working memory beside the text and the suffix array itself is needed

• fast in practice

One of the first algorithms to achieve all goals is the SA-IS algorithm of Nong, Zhang & Chan (2009). The algorithm is also rather simple (< 100 LOC) and can be enhanced to simultaneously construct the LCP array.[5] The SA-IS algorithm is one of the fastest known suffix array construction algorithms. A careful implementation by Yuta Mori outperforms most other linear or super-linear construction approaches.

Beside time and space requirements, suffix array construction algorithms are also differentiated by their supported alphabet: constant alphabets, where the alphabet size is bounded by a constant; integer alphabets, where characters are integers in a range depending on n; and general alphabets, where only character comparisons are allowed.[6]
Most suffix array construction algorithms are based on one of the following approaches:[4]

• Prefix doubling algorithms are based on a strategy of Karp, Miller & Rosenberg (1972). The idea is to find prefixes that honor the lexicographic ordering of suffixes. The assessed prefix length doubles in each iteration of the algorithm until a prefix is unique and provides the rank of the associated suffix.

• Recursive algorithms follow the approach of the suffix tree construction algorithm by Farach (1997) to recursively sort a subset of suffixes. This subset is then used to infer a suffix array of the remaining suffixes. Both of these suffix arrays are then merged to compute the final suffix array.

• Induced copying algorithms are similar to recursive algorithms in the sense that they use an already sorted subset to induce a fast sort of the remaining suffixes. The difference is that these algorithms favor iteration over recursion to sort the selected suffix subset. A survey of this diverse group of algorithms has been put together by Puglisi, Smyth & Turpin (2007).

A well-known recursive algorithm for integer alphabets is the DC3 / skew algorithm of Kärkkäinen & Sanders (2003). It runs in linear time and has successfully been used as the basis for parallel[7] and external-memory[8] suffix array construction algorithms.
Recent work by Salson et al. (2009) proposes an algorithm for updating the suffix array of a text that has been edited, instead of rebuilding a new suffix array from scratch. Even if the theoretical worst-case time complexity is O(n log n), it appears to perform well in practice: experimental results from the authors showed that their implementation of dynamic suffix arrays is generally more efficient than rebuilding when considering the insertion of a reasonable number of letters in the original text.
7.4.6 Applications

The suffix array of a string can be used as an index to quickly locate every occurrence of a substring pattern P within the string S. Finding every occurrence of the pattern is equivalent to finding every suffix that begins with the substring. Thanks to the lexicographical ordering, these suffixes will be grouped together in the suffix array and can be found efficiently with two binary searches. The first search locates the starting position of the interval, and the second one determines the end position:
def search(P):
    l = 0; r = n
    while l < r:
        mid = (l + r) // 2
        if P > suffixAt(A[mid]):
            l = mid + 1
        else:
            r = mid
    s = l; r = n
    while l < r:
        mid = (l + r) // 2
        if P < suffixAt(A[mid]):
            r = mid
        else:
            l = mid + 1
    return (s, r)
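Here suffixAt(A[mid]) denotes the suffix of S starting at position A[mid]. A runnable Python sketch of the same two binary searches (our own transliteration, not from the original article, using 0-based indices; each suffix is compared only on its first len(P) characters so that every suffix starting with P lands in the half-open interval [s, r)):

def search(P, S, A):
    n, m = len(A), len(P)
    l, r = 0, n
    while l < r:                        # first search: start of the interval
        mid = (l + r) // 2
        if P > S[A[mid]:A[mid] + m]:
            l = mid + 1
        else:
            r = mid
    s, r = l, n
    while l < r:                        # second search: end of the interval
        mid = (l + r) // 2
        if P < S[A[mid]:A[mid] + m]:
            r = mid
        else:
            l = mid + 1
    return (s, r)

S = "banana$"
A = sorted(range(len(S)), key=lambda i: S[i:])
s, r = search("an", S, A)
print([A[j] for j in range(s, r)])      # [3, 1]: "an" occurs at positions 3 and 1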
Finding the substring pattern P of length m in the string S of length n takes O(m log n) time, given that a single suffix comparison needs to compare m characters. Manber & Myers (1990) describe how this bound can be improved to O(m + log n) time using LCP information. The idea is that a pattern comparison does not need to re-compare certain characters when it is already known that these are part of the longest common prefix of the pattern and the current search interval. Abouelhoda, Kurtz & Ohlebusch (2004) improve the bound even further and achieve a search time of O(m), as known from suffix trees.

Suffix sorting algorithms can be used to compute the Burrows–Wheeler transform (BWT). The BWT requires sorting of all cyclic permutations of a string. If this string ends in a special end-of-string character that is lexicographically smaller than all other characters (i.e., $), then the order of the sorted rotated BWT matrix corresponds to the order of suffixes in a suffix array. The BWT can therefore be computed in linear time by first constructing a suffix array of the text and then deducing the BWT string: BWT[i] = S[A[i] − 1].
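In Python this identity looks as follows (a sketch of our own; with 0-based indices the −1 wraps around to the last character, which is the sentinel itself):

def bwt(S):
    # S must end with a unique sentinel such as '$'.
    A = sorted(range(len(S)), key=lambda i: S[i:])   # naive suffix array, 0-based
    # BWT[i] = S[A[i] - 1]; index -1 in Python is the last character ('$').
    return "".join(S[i - 1] for i in A)

print(bwt("banana$"))   # annb$aa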
Suffix arrays can also be used to look up substrings in Example-Based Machine Translation, demanding much less storage than a full phrase table as used in Statistical machine translation.

Many additional applications of the suffix array require the LCP array. Some of these are detailed in the application section of the latter.

7.4.7 Notes

[1] Abouelhoda, Kurtz & Ohlebusch 2002.

[2] Abouelhoda, Kurtz & Ohlebusch 2004.

[3] Kurtz 1999.

[4] Puglisi, Smyth & Turpin 2007.

[5] Fischer 2011.

[6] Burkhardt & Kärkkäinen 2003.

[7] Kulla & Sanders 2007.

[8] Dementiev et al. 2008.

7.4.8 References

• Abouelhoda, Mohamed Ibrahim; Kurtz, Stefan; Ohlebusch, Enno (2004). “Replacing suffix trees with enhanced suffix arrays”. Journal of Discrete Algorithms. 2 (1): 53–86. doi:10.1016/S1570-8667(03)00065-0.
• Manber, Udi; Myers, Gene (1990). Suffix arrays: a new method for on-line string searches. First Annual ACM-SIAM Symposium on Discrete Algorithms. pp. 319–327.

• Manber, Udi; Myers, Gene (1993). “Suffix arrays: a new method for on-line string searches”. SIAM Journal on Computing. 22: 935–948. doi:10.1137/0222058.

• Gonnet, G.H; Baeza-Yates, R.A; Snider, T (1992). “New indices for text: PAT trees and PAT arrays”. Information retrieval: data structures and algorithms.

• Kurtz, S (1999). “Reducing the space requirement of suffix trees”. Software—Practice and Experience. 29 (13): 1149. doi:10.1002/(SICI)1097-024X(199911)29:13<1149::AID-SPE274>3.0.CO;2-O.

• Abouelhoda, Mohamed Ibrahim; Kurtz, Stefan; Ohlebusch, Enno (2002). The Enhanced Suffix Array and Its Applications to Genome Analysis. Algorithms in Bioinformatics. Lecture Notes in Computer Science. 2452. p. 449. doi:10.1007/3-540-45784-4_35. ISBN 978-3-540-44211-0.

• Puglisi, Simon J.; Smyth, W. F.; Turpin, Andrew H. (2007). “A taxonomy of suffix array construction algorithms”. ACM Computing Surveys. 39 (2): 4. doi:10.1145/1242471.1242472.

• Nong, Ge; Zhang, Sen; Chan, Wai Hong (2009). Linear Suffix Array Construction by Almost Pure Induced-Sorting. 2009 Data Compression Conference. p. 193. doi:10.1109/DCC.2009.42. ISBN 978-0-7695-3592-0.

• Fischer, Johannes (2011). Inducing the LCP-Array. Algorithms and Data Structures. Lecture Notes in Computer Science. 6844. p. 374. doi:10.1007/978-3-642-22300-6_32. ISBN 978-3-642-22299-3.

• Salson, M.; Lecroq, T.; Léonard, M.; Mouchard, L. (2010). “Dynamic extended suffix arrays”. Journal of Discrete Algorithms. 8 (2): 241. doi:10.1016/j.jda.2009.02.007.

• Burkhardt, Stefan; Kärkkäinen, Juha (2003). Fast Lightweight Suffix Array Construction and Checking. Combinatorial Pattern Matching. Lecture Notes in Computer Science. 2676. p. 55. doi:10.1007/3-540-44888-8_5. ISBN 978-3-540-40311-1.

• Karp, Richard M.; Miller, Raymond E.; Rosenberg, Arnold L. (1972). Rapid identification of repeated patterns in strings, trees and arrays. Proceedings of the fourth annual ACM symposium on Theory of computing — STOC '72. p. 125. doi:10.1145/800152.804905.

• Farach, M. (1997). Optimal suffix tree construction with large alphabets. Proceedings 38th Annual Symposium on Foundations of Computer Science. p. 137. doi:10.1109/SFCS.1997.646102. ISBN 0-8186-8197-7.

• Kärkkäinen, Juha; Sanders, Peter (2003). Simple Linear Work Suffix Array Construction. Automata, Languages and Programming. Lecture Notes in Computer Science. 2719. p. 943. doi:10.1007/3-540-45061-0_73. ISBN 978-3-540-40493-4.

• Dementiev, Roman; Kärkkäinen, Juha; Mehnert, Jens; Sanders, Peter (2008). “Better external memory suffix array construction”. Journal of Experimental Algorithmics. 12: 1. doi:10.1145/1227161.1402296.

• Kulla, Fabian; Sanders, Peter (2007). “Scalable parallel suffix array construction”. Parallel Computing. 33 (9): 605. doi:10.1016/j.parco.2007.06.004.

7.4.9 External links

• Suffix Array in Java

• Suffix sorting module for BWT in C code

• Suffix Array Implementation in Ruby

• Suffix array library and tools

• Project containing various Suffix Array c/c++ Implementations with a unified interface

• A fast, lightweight, and robust C API library to construct the suffix array

• Suffix Array implementation in Python

• Linear Time Suffix Array implementation in C using suffix tree
7.5 Suffix automaton

Non-deterministic suffix automaton for the word “suffix”. Epsilon transitions are shown in grey.

In computer science, a suffix automaton or directed acyclic word graph is a finite automaton that recognizes the set of suffixes of a given string. It can be thought of as a compressed form of the suffix tree, a data structure that efficiently represents the suffixes of the string.
For example, a suffix automaton for the string “suffix” can be queried for other strings; it will report “true” for any of the strings “suffix”, “uffix”, “ffix”, “fix”, “ix” and “x”, and “false” for any other string.[1]

The suffix automaton of a set of strings U has at most 2Q − 2 states, where Q is the number of nodes of a prefix tree representing the strings in U.[2]

Suffix automata have applications in approximate string matching.[1]
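The figure’s automaton can be simulated with a minimal Python sketch (illustrative only; the function name and the brute-force simulation are our assumptions, not a published construction). The ε-transitions from the initial state allow a match to begin at any position of the word, so the automaton accepts exactly the suffixes:

def automaton_accepts(word, query):
    # Epsilon transitions from the start state reach every position i;
    # from there the query must follow the chain of letters to the end.
    return any(word[i:] == query for i in range(len(word) + 1))

assert automaton_accepts("suffix", "fix")        # reported "true"
assert not automaton_accepts("suffix", "suff")   # reported "false"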
7.5.1 See also

• GADDAG

• Suffix array

7.5.2 References

[1] Navarro, Gonzalo (2001), “A guided tour to approximate string matching” (PDF), ACM Computing Surveys, 33 (1): 31–88, doi:10.1145/375360.375365

[2] Mohri, Mehryar; Moreno, Pedro; Weinstein, Eugene (September 2009), “General suffix automaton construction algorithm and space bounds”, Theoretical Computer Science, 410 (37): 3553–3562, doi:10.1016/j.tcs.2009.03.034
7.5.3 Additional reading

• Inenaga, S.; Hoshino, H.; Shinohara, A.; Takeda, M.; Arikawa, S. (2001), “On-line construction of symmetric compact directed acyclic word graphs”, Proc. 8th Int. Symp. String Processing and Information Retrieval, 2001. SPIRE 2001, pp. 96–110, doi:10.1109/SPIRE.2001.989743, ISBN 0-7695-1192-9.

• Crochemore, Maxime; Vérin, Renaud (1997), “Direct construction of compact directed acyclic word graphs”, Combinatorial Pattern Matching, Lecture Notes in Computer Science, Springer-Verlag, pp. 116–129, doi:10.1007/3-540-63220-4_55.

• Epifanio, Chiara; Mignosi, Filippo; Shallit, Jeffrey; Venturini, Ilaria (2004), “Sturmian graphs and a conjecture of Moser”, in Calude, Cristian S.; Calude, Elena; Dineen, Michael J., Developments in language theory. Proceedings, 8th international conference (DLT 2004), Auckland, New Zealand, December 2004, Lecture Notes in Computer Science, 3340, Springer-Verlag, pp. 175–187, ISBN 3-540-24014-4, Zbl 1117.68454.

• Do, H.H.; Sung, W.K. (2011), “Compressed Directed Acyclic Word Graph with Application in Local Alignment”, Computing and Combinatorics, Lecture Notes in Computer Science, 6842, Springer-Verlag, pp. 503–518, doi:10.1007/978-3-642-22685-4_44, ISBN 978-3-642-22684-7.

7.6 Van Emde Boas tree

A Van Emde Boas tree (or Van Emde Boas priority queue; Dutch pronunciation: [vɑn 'ɛmdə 'boːɑs]), also known as a vEB tree, is a tree data structure which implements an associative array with m-bit integer keys. It performs all operations in O(log m) time, or equivalently in O(log log M) time, where M = 2^m is the maximum number of elements that can be stored in the tree. M is not to be confused with the actual number of elements stored in the tree, by which the performance of other tree data structures is often measured. The vEB tree has good space efficiency when it contains a large number of elements, as discussed below. It was invented by a team led by Dutch computer scientist Peter van Emde Boas in 1975.[1]

7.6.1 Supported operations

A vEB tree supports the operations of an ordered associative array, which includes the usual associative array operations along with two more order operations, FindNext and FindPrevious:[2]

• Insert: insert a key/value pair with an m-bit key

• Delete: remove the key/value pair with a given key

• Lookup: find the value associated with a given key

• FindNext: find the key/value pair with the smallest key at least a given k

• FindPrevious: find the key/value pair with the largest key at most a given k

A vEB tree also supports the operations Minimum and Maximum, which return the minimum and maximum element stored in the tree, respectively.[3] These both run in O(1) time, since the minimum and maximum element are stored as attributes in each tree.

7.6.2 How it works

For the sake of simplicity, let log2 m = k for some integer k. Define M = 2^m. A vEB tree T over the universe {0, ..., M−1} has a root node that stores an array T.children of length √M. T.children[i] is a pointer to a vEB tree that is responsible for the values {i√M, ..., (i+1)√M−1}. Additionally, T stores two values T.min and T.max as well as an auxiliary vEB tree T.aux.
An example Van Emde Boas tree with dimension 5 and the root’s aux structure after 1, 2, 3, 5, 8 and 10 have been inserted.

Data is stored in a vEB tree as follows: The smallest value currently in the tree is stored in T.min and the largest value is stored in T.max. Note that T.min is not stored anywhere else in the vEB tree, while T.max is. If T is empty then we use the convention that T.max = −1 and T.min = M. Any other value x is stored in the subtree T.children[i], where i = ⌊x/√M⌋. The auxiliary tree T.aux keeps track of which children are non-empty, so T.aux contains the value j if and only if T.children[j] is non-empty.
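A minimal Python sketch of this layout (the field names follow the text; building every child eagerly and treating M as a perfect square are simplifying assumptions, as the discussion below explains):

import math

class VEB:
    """Node of a van Emde Boas tree over the universe {0, ..., M-1}."""
    def __init__(self, M):
        self.M = M
        self.min = M          # convention for an empty tree
        self.max = -1
        if M > 2:
            root = math.isqrt(M)                  # sqrt(M), assuming M is a perfect square
            # children[i] is responsible for {i*root, ..., (i+1)*root - 1}
            self.children = [VEB(root) for _ in range(root)]
            self.aux = VEB(root)                  # tracks which children are non-empty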
FindNext

The operation FindNext(T, x), which searches for the successor of an element x in a vEB tree, proceeds as follows: If x ≤ T.min then the search is complete, and the answer is T.min. If x > T.max then the next element does not exist; return M. Otherwise, let i = ⌊x/√M⌋. If x ≤ T.children[i].max then the value being searched for is contained in T.children[i], so the search proceeds recursively in T.children[i]. Otherwise, we search for the value i in T.aux. This gives us the index j of the first subtree that contains an element larger than x. The algorithm then returns T.children[j].min. The element found on the children level needs to be composed with the high bits to form a complete next element.
function FindNext(T, x)
    if x ≤ T.min then
        return T.min
    if x > T.max then
        return M                       // no next element
    i = floor(x / √M)
    lo = x mod √M
    hi = x − lo
    if lo ≤ T.children[i].max then
        return hi + FindNext(T.children[i], lo)
    return hi + T.children[FindNext(T.aux, i)].min
end
Note that, in any case, the algorithm performs O(1) work and then possibly recurses on a subtree over a universe of size M^(1/2) (an m/2-bit universe). This gives a recurrence for the running time of T(m) = T(m/2) + O(1), which resolves to O(log m) = O(log log M).

Insert

The call Insert(T, x) that inserts a value x into a vEB tree T operates as follows:

1. If T is empty then we set T.min = T.max = x and we are done.

2. Otherwise, if x < T.min then we insert T.min into the subtree i responsible for T.min and then set T.min = x. If T.children[i] was previously empty, then we also insert i into T.aux.

3. Otherwise, if x > T.max then we insert x into the subtree i responsible for x and then set T.max = x. If T.children[i] was previously empty, then we also insert i into T.aux.

4. Otherwise, T.min < x < T.max, so we insert x into the subtree i responsible for x. If T.children[i] was previously empty, then we also insert i into T.aux.

In code:
function Insert(T, x)
    if T.min > T.max then              // T is empty
        T.min = T.max = x
        return
    if T.min == T.max then             // T holds exactly one element
        if x < T.min then
            T.min = x
        if x > T.max then
            T.max = x
        return
    if x < T.min then
        swap(x, T.min)                 // store the new minimum here; push the old one down
    if x > T.max then
        T.max = x
    i = floor(x / √M)
    Insert(T.children[i], x mod √M)
    if T.children[i].min == T.children[i].max then   // the subtree was empty before this insert
        Insert(T.aux, i)
end

The key to the efficiency of this procedure is that inserting an element into an empty vEB tree takes O(1) time. So, even though the algorithm sometimes makes two recursive calls, this only occurs when the first recursive call was into an empty subtree. This gives the same running-time recurrence, T(m) = T(m/2) + O(1), as before.
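With the pseudocode above, reproducing the example figure would look like this (a hypothetical run; Insert is the function just defined):

T = an empty vEB tree with dimension 5        // universe M = 2^5 = 32
for x in {1, 2, 3, 5, 8, 10}:
    Insert(T, x)
// afterwards T.min = 1 and T.max = 10, and T.aux records
// exactly the indices i for which T.children[i] is non-empty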
Delete

Deletion from vEB trees is the trickiest of the operations. The call Delete(T, x) that deletes a value x from a vEB tree T operates as follows:

1. If T.min = T.max = x then x is the only element stored in the tree and we set T.min = M and T.max = −1 to indicate that the tree is empty.

2. Otherwise, if x == T.min then we need to find the second-smallest value y in the vEB tree, delete it from its current location, and set T.min = y. The second-smallest value y is T.children[T.aux.min].min, so it can be found in O(1) time. We delete y from the subtree that contains it.

3. If x ≠ T.min and x ≠ T.max then we delete x from the subtree T.children[i] that contains x.

4. If x == T.max then we will need to find the second-largest value y in the vEB tree and set T.max = y. We start by deleting x as in the previous case. Then the value y is either T.min or T.children[T.aux.max].max, so it can be found in O(1) time.

5. In any of the above cases, if we delete the last element x or y from any subtree T.children[i] then we also delete i from T.aux.

In code:

function Delete(T, x)
    if T.min == T.max == x then        // x is the only element
        T.min = M
        T.max = −1
        return
    if x == T.min then                 // pull the second-smallest value up into T.min
        x = T.aux.min * √M + T.children[T.aux.min].min   // its full value, high bits included
        T.min = x
    i = floor(x / √M)
    Delete(T.children[i], x mod √M)
    if T.children[i] is empty then
        Delete(T.aux, i)
    if x == T.max then                 // recompute the maximum
        if T.aux is empty then
            T.max = T.min
        else
            T.max = T.children[T.aux.max].max
end
Again, the efficiency of this procedure hinges on the fact that deleting from a vEB tree that contains only one element takes only constant time. In particular, the last line of code only executes if x was the only element in T.children[i] prior to the deletion.

Discussion
The assumption that log m is an integer is unnecessary. The operations x/√M and x mod √M can be replaced by taking only the higher-order ⌈m/2⌉ and the lower-order ⌊m/2⌋ bits of x, respectively. On any existing machine, this is more efficient than division or remainder computations.

The implementation described above uses pointers and occupies a total space of O(M) = O(2^m). This can be seen as follows. The recurrence is S(M) = O(√M) + (√M + 1) · S(√M). Resolving it would lead to S(M) ∈ (1 + √M)^(log log M) + log log M · O(√M). One can, fortunately, also show that S(M) = M − 2 by induction.[4]
In practical implementations, especially on machines with shift-by-k and find-first-zero instructions, performance can further be improved by switching to a bit array once m equals the word size (or a small multiple thereof). Since all operations on a single word are constant time, this does not affect the asymptotic performance, but it does avoid the majority of the pointer storage and several pointer dereferences, achieving a significant practical saving in time and space with this trick.

An obvious optimization of vEB trees is to discard empty subtrees. This makes vEB trees quite compact when they contain many elements, because no subtrees are created until something needs to be added to them. Initially, each element added creates about log(m) new trees containing about m/2 pointers all together. As the tree grows, more and more subtrees are reused, especially the larger ones. In a full tree of 2^m elements, only O(2^m) space is used. Moreover, unlike a binary search tree, most of this space is being used to store data: even for billions of elements, the pointers in a full vEB tree number in the thousands.

However, for small trees the overhead associated with vEB trees is enormous: on the order of √M. This is one reason why they are not popular in practice. One way of addressing this limitation is to use only a fixed number of bits per level, which results in a trie. Alternatively, each table may be replaced by a hash table, reducing the space to O(n) (where n is the number of elements stored in the data structure) at the expense of making the data structure randomized. Other structures, including y-fast tries and x-fast tries, have been proposed that have comparable update and query times and also use randomized hash tables to reduce the space to O(n) or O(n log M).

7.6.3 References

[1] Peter van Emde Boas: Preserving order in a forest in less than logarithmic time (Proceedings of the 16th Annual Symposium on Foundations of Computer Science 10: 75–84, 1975)

[2] Gudmund Skovbjerg Frandsen: Dynamic algorithms: Course notes on van Emde Boas trees (PDF) (University of Aarhus, Department of Computer Science)

[3] Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, and Clifford Stein. Introduction to Algorithms, Third Edition. MIT Press, 2009. ISBN 978-0-262-53305-8. Chapter 20: The van Emde Boas tree, pp. 531–560.

[4] Rex, A. “Determining the space complexity of van Emde Boas trees”. Retrieved 2011-05-27.

Further reading

• Erik Demaine, Sam Fingeret, Shravas Rao, Paul Christiano. Massachusetts Institute of Technology. 6.851: Advanced Data Structures (Spring 2012). Lecture 11 notes. March 22, 2012.

• Van Emde Boas, P.; Kaas, R.; Zijlstra, E. (1976). “Design and implementation of an efficient priority queue”. Mathematical Systems Theory. 10: 99–127. doi:10.1007/BF01683268.

7.7 Fusion tree

In computer science, a fusion tree is a type of tree data structure that implements an associative array on w-bit integers. When operating on a collection of n key–value pairs, it uses O(n) space and performs searches in O(logw n) time, which is asymptotically faster than a traditional self-balancing binary search tree, and also better than the van Emde Boas tree for large values of w. It achieves this speed by exploiting certain constant-time operations that can be done on a machine word. Fusion trees were invented in 1990 by Michael Fredman and Dan Willard.[1]

Several advances have been made since Fredman and Willard’s original 1990 paper. In 1999[2] it was shown
how to implement fusion trees under a model of computation in which all of the underlying operations of the algorithm belong to AC0, a model of circuit complexity that allows addition and bitwise Boolean operations but disallows the multiplication operations used in the original fusion tree algorithm. A dynamic version of fusion trees using hash tables was proposed in 1996[3] which matched the original structure’s O(logw n) runtime in expectation. Another dynamic version using exponential trees was proposed in 2007[4] which yields worst-case runtimes of O(logw n + log log u) per operation, where u is the size of the largest key. It remains open whether dynamic fusion trees can achieve O(logw n) per operation with high probability.
Some preprocessing is needed to determine the correct
7.7.1 How it works multiplication constant. Each sketch bit in location ∑ bi will
r
get shifted to bi + mi via a multiplication by m = i=1
mi
A fusion tree is essentially a B-tree with branching factor 2 . For the approximate sketch to work, the following
of w1/5 (any small exponent is also possible), which gives three properties must hold:
it a height of O(logw n). To achieve the desired runtimes
for updates and queries, the fusion tree must be able to 1. bi + mj are distinct for all pairs (i, j). This will ensure
1/5
search a node containing up to w keys in constant time. that the sketch bits are uncorrupted by the multipli-
This is done by compressing (“sketching”) the keys so that cation.
all can fit into one machine word, which in turn allows 2. bi + mi is a strictly increasing function of i. That is,
comparisons to be done in parallel. the order of the sketch bits is preserved.
3. (br + mr) - (b1 + m1 ) ≤ r4 . That is, the sketch bits
Sketching are packed into a range of size at most r4 .

Sketching is the method by which each w-bit key at a node containing k keys is compressed into only k − 1 bits. Each key x may be thought of as a path in the full binary tree of height w starting at the root and ending at the leaf corresponding to x. To distinguish two paths, it suffices to look at their branching point (the first bit where the two keys differ). All k paths together have k − 1 branching points, so at most k − 1 bits are needed to distinguish any two of the k keys.

Visualization of the sketch function.

An important property of the sketch function is that it preserves the order of the keys. That is, sketch(x) < sketch(y) for any two keys x < y.

Approximating the sketch

If the locations of the sketch bits are b1 < b2 < ··· < br, then the sketch of the key xw−1···x1x0 is the r-bit integer xbr xbr−1 ··· xb1.

With only standard word operations, such as those of the C programming language, it is difficult to directly compute the sketch of a key in constant time. Instead, the sketch bits can be packed into a range of size at most r^4, using bitwise AND and multiplication. The bitwise AND operation serves to clear all non-sketch bits from the key, while the multiplication shifts the sketch bits into a small range. Like the “perfect” sketch, the approximate sketch preserves the order of the keys.

Some preprocessing is needed to determine the correct multiplication constant. Each sketch bit in location bi will get shifted to bi + mi via a multiplication by m = 2^m1 + 2^m2 + ··· + 2^mr. For the approximate sketch to work, the following three properties must hold:

1. bi + mj are distinct for all pairs (i, j). This will ensure that the sketch bits are uncorrupted by the multiplication.

2. bi + mi is a strictly increasing function of i. That is, the order of the sketch bits is preserved.

3. (br + mr) − (b1 + m1) ≤ r^4. That is, the sketch bits are packed into a range of size at most r^4.

An inductive argument shows how the mi can be constructed. Let m1 = w − b1. Suppose that 1 < t ≤ r and that m1, m2, ..., mt−1 have already been chosen. Then pick the smallest integer mt such that both properties (1) and (2) are satisfied. Property (1) requires that mt ≠ bi − bj + ml for all 1 ≤ i, j ≤ r and 1 ≤ l ≤ t − 1. Thus, there are fewer than tr^2 ≤ r^3 values that mt must avoid. Since mt is chosen to be minimal, (bt + mt) ≤ (bt−1 + mt−1) + r^3. This implies Property (3).

The approximate sketch is thus computed as follows:

1. Mask out all but the sketch bits with a bitwise AND.

2. Multiply the key by the predetermined constant m. This operation actually requires two machine words, but this can still be done in constant time.

3. Mask out all but the shifted sketch bits. These are now contained in a contiguous block of at most r^4 < w^(4/5) bits.
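The construction and the three steps can be exercised in a short Python sketch (entirely illustrative; Python’s unbounded integers stand in for the pair of machine words used in step 2):

def sketch_constants(bits, w):
    # Inductive construction of the m_i described above: m1 = w - b1; each
    # later mt is the smallest integer that keeps bt + mt strictly increasing
    # (property 2) while avoiding every value bi - bj + ml (property 1).
    ms = [w - bits[0]]
    for t in range(1, len(bits)):
        forbidden = {bi - bj + ml for bi in bits for bj in bits for ml in ms}
        m = bits[t - 1] + ms[t - 1] - bits[t] + 1
        while m in forbidden:
            m += 1
        ms.append(m)
    return ms

def approximate_sketch(x, bits, ms):
    key_mask = sum(1 << b for b in bits)                  # step 1: isolate the sketch bits
    product = (x & key_mask) * sum(1 << m for m in ms)    # step 2: multiply by the constant m
    out_mask = sum(1 << (b + m) for b, m in zip(bits, ms))
    return (product & out_mask) >> (bits[0] + ms[0])      # step 3: keep only the shifted bits

bits = [1, 3, 6]                          # example sketch-bit locations b1 < b2 < b3
ms = sketch_constants(bits, w=8)
a, b = 0b0001010, 0b1001000               # keys that differ in their sketch bits, a < b
assert approximate_sketch(a, bits, ms) < approximate_sketch(b, bits, ms)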
Parallel comparison

The purpose of the compression achieved by sketching is to allow all of the keys to be stored in one w-bit word. Let the node sketch of a node be the bit string

1sketch(x1)1sketch(x2)...1sketch(xk)
We can assume that the sketch function uses exactly b ≤ r^4 bits. Then each block uses 1 + b ≤ w^(4/5) bits, and since k ≤ w^(1/5), the total number of bits in the node sketch is at most w.

A brief notational aside: for a bit string s and nonnegative integer m, let s^m denote the concatenation of s to itself m times. If t is also a bit string, st denotes the concatenation of t to s.

The node sketch makes it possible to search the keys for any b-bit integer y. Let z = (0y)^k, which can be computed in constant time (multiply y by the constant (0^b 1)^k). Note that 1sketch(xi) − 0y is always positive, but preserves its leading 1 iff sketch(xi) ≥ y. We can thus compute the smallest index i such that sketch(xi) ≥ y as follows:
1. Subtract z from the node sketch.

2. Take the bitwise AND of the difference and the constant (10^b)^k. This clears all but the leading bit of each block.

3. Find the most significant bit of the result.

4. Compute i, using the fact that the leading bit of the i-th block has index i(b + 1).
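These four steps translate directly into a Python sketch (our own; an unbounded integer plays the role of the w-bit word holding the node sketch, and the function name is illustrative):

def smallest_at_least(sketches, y, b):
    """Smallest 1-based index i with sketches[i-1] >= y, or k + 1 if none.
    sketches must be sorted increasingly, each fitting in b bits."""
    k = len(sketches)
    node = 0
    for s in sketches:                    # node sketch: 1 sketch(x1) ... 1 sketch(xk)
        node = (node << (b + 1)) | (1 << b) | s
    z = 0
    for _ in range(k):                    # z = (0 y)^k
        z = (z << (b + 1)) | y
    lead_mask = sum(1 << (j * (b + 1) + b) for j in range(k))   # (1 0^b)^k
    lead = (node - z) & lead_mask         # steps 1-2: surviving leading bits
    if lead == 0:
        return k + 1                      # every sketch is smaller than y
    msb = lead.bit_length() - 1           # step 3: most significant set bit
    return k - msb // (b + 1)             # step 4: bit position -> block index

assert smallest_at_least([1, 4, 6], y=5, b=3) == 3
assert smallest_at_least([1, 4, 6], y=2, b=3) == 2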
n), where n is the number of hashed items. This chain size
is small enough that a fusion tree can handle searches and
Desketching updates within it in constant time per operation. There-
fore, the time for all operations in the data structure is
For an arbitrary query q, parallel comparison computes constant with high probability. More precisely, with this
the index i such that data structure, for every inverse-quasipolynomial proba-
bility p(n) = exp((log n)O(1) ), there is a constant C such
sketch(xi−₁) ≤ sketch(q) ≤ sketch(xi) that the probability that there exists an operation that ex-
ceeds time C is at most p(n).[5]
Unfortunately, the sketch function is not in general order-
preserving outside the set of keys, so it is not necessarily
7.7.3 References
the case that xi−₁ ≤ q ≤ xi. What is true is that, among
all of the keys, either xi−₁ or xi has the longest common [1] Fredman, M. L.; Willard, D. E. (1990), “BLAST-
prefix with q. This is because any key y with a longer ING Through the Information Theoretic Barrier with
common prefix with q would also have more sketch bits FUSION TREES”, Proceedings of the Twenty-second
in common with q, and thus sketch(y) would be closer to Annual ACM Symposium on Theory of Computing
sketch(q) than any sketch(xj). (STOC '90), New York, NY, USA: ACM, pp. 1–7,
doi:10.1145/100216.100217, ISBN 0-89791-361-2.
The length longest common prefix between two w-bit in-
tegers a and b can be computed in constant time by find- [2] Andersson, Arne; Miltersen, Peter Bro; Thorup, Mikkel
ing the most significant bit of the bitwise XOR between (1999), “Fusion trees can be implemented with AC0 in-
a and b. This can then be used to mask out all but the structions only”, Theoretical Computer Science, 215 (1-
longest common prefix. 2): 337–344, doi:10.1016/S0304-3975(98)00172-8, MR
1678804.
Note that p identifies exactly where q branches off from
the set of keys. If the next bit of q is 0, then the successor [3] Raman, Rajeev (1996), “Priority queues: small, mono-
of q is contained in the p1 subtree, and if the next bit of tone and trans-dichotomous”, Fourth Annual European
q is 1, then the predecessor of q is contained in the p0 Symposium on Algorithms (ESA '96), Barcelona, Spain,
subtree. This suggests the following algorithm: September 25–27, 1996, Lecture Notes in Computer
Science, 1136, Berlin: Springer-Verlag, pp. 121–137,
doi:10.1007/3-540-61680-2_51, MR 1469229.
1. Use parallel comparison to find the index i such that
sketch(xi−₁) ≤ sketch(q) ≤ sketch(xi). [4] Andersson, Arne; Thorup, Mikkel (2007), “Dynamic or-
dered sets with exponential search trees”, Journal of the
2. Compute the longest common prefix p of q and ei- ACM, 54 (3): A13, doi:10.1145/1236457.1236460, MR
ther xi−₁ or xi (taking the longer of the two). 2314255.
[5] Willard, Dan E. (2000), “Examining computational geometry, van Emde Boas trees, and hashing from the perspective of the fusion tree”, SIAM Journal on Computing, 29 (3): 1030–1049, doi:10.1137/S0097539797322425, MR 1740562.

7.7.4 External links

• MIT CS 6.897: Advanced Data Structures: Lecture 4, Fusion Trees, Prof. Erik Demaine (Spring 2003)

• MIT CS 6.897: Advanced Data Structures: Lecture 5, More fusion trees; self-organizing data structures, move-to-front, static optimality, Prof. Erik Demaine (Spring 2003)

• MIT CS 6.851: Advanced Data Structures: Lecture 13, Fusion Tree notes, Prof. Erik Demaine (Spring 2007)

• MIT CS 6.851: Advanced Data Structures: Lecture 12, Fusion Tree notes, Prof. Erik Demaine (Spring 2012)
Mecanismo, Mehrenberg, Indil, Kwamikagami, Chairboy, Spoon!, Robotje, Helix84, Zachlipton, Alansohn, Liao, Conan, Gunslinger47,
Mc6809e, Caesura, Jguk, Kenyon, Woohookitty, Mindmatrix, Peng~enwiki, MattGiuca, Ruud Koot, Graham87, Rachel1, Qwertyus, De-
Piep, Olivier Teuliere, Bruce1ee, W3bbo, Margosbot~enwiki, Wouter.oet, Ewlyahoocom, Jrtayloriv, Zotel, Roboto de Ajvol, PhilipR,
RussBot, J. M., SpuriousQ, Stephenb, Stassats, Howcheng, JohJak2, Caerwine, Mike1024, Carlosguitar, SmackBot, Honza Záruba,
M2MM4M, Dabear~enwiki, Skizzik, Chris the speller, Oli Filth, Wikibarista, Nbarth, DHN-bot~enwiki, OrphanBot, Zvar, Cyberco-
bra, Pissant, Mlpkr, Cdills, Kflorence, Almkglor, PseudoSudo, Ckatz, 16@r, Sharcho, Nutster, Penbat, VTBassMatt, Banditlord, A876,
Simenheg, Tawkerbot4, Christian75, X96lee15, Uruiamme, Thadius856, Hires an editor, Lperez2029, Egerhart, Deflective, Siobhan-
Hansa, Wikilolo, MikeDunlavey, David Eppstein, Gwern, GrahamDavies, Sanjay742, Contactbanish, NewEnglandYankee, Nwbeeson,
Bobo2000, AlnoktaBOT, JhsBot, Broadbot, Atiflz, BotKung, Jesin, Calliopejen1, BotMultichill, Ham Pastrami, Keilana, Thesuper-
slacker, Flyer22 Reborn, Hariva, Arsenic99, Chelseafan528, WikiBotas, ClueBot, Ggia, Vanmaple, Alexbot, Ksulli10, Jotterbot, Tobi-
asPersson, SensuiShinobu1234, DumZiBoT, Kletos, XLinkBot, SilvonenBot, Marry314113, Dsimic, Addbot, Some jerk on the Internet,
OliverTwisted, MrOllie, SoSaysChappy, ‫ماني‬, Loupeter, Yobot, Vanished user rt41as76lk, KamikazeBot, Materialscientist, LilHelpa,
Xqbot, Vegpuff, Joseph.w.s~enwiki, DSisyphBot, Ruby.red.roses, FrescoBot, Mark Renier, Miklcct, Arthur MILCHIOR, Gbduende,
PrometheeFeu~enwiki, Maxwellterry, John lindgren, Garfieldnate, EmausBot, Jasonanaggie, Akerans, Redhanker, Sorancio, Donner60,
Clehner~enwiki, Gralfca, ClueBot NG, Detonadorado, MahlerFive, Ztothefifth, Rahulghose, Iprathik, Zanaferx, Tlefebvre, Vasuakeel,
PhuksyWiki, Solomon7968, Fswangke, Dmitrysobolev, BattyBot, David.moreno72, Nemo Kartikeyan, Kushalbiswas777, DavidLeighEllis,
Sam Sailor, Tranzenic, ScottDNelson, Ishanalgorithm, InternetArchiveBot, Cakedy, NgYShung, GreenC bot and Anonymous: 206
• Double-ended queue Source: https://en.wikipedia.org/wiki/Double-ended_queue?oldid=772099980 Contributors: The Anome, Freck-
lefoot, Edward, Axlrosen, CesarB, Dcoetzee, Dfeuer, Zoicon5, Furrykef, Fredrik, Merovingian, Rasmus Faber, Tea2min, Smjg, Sj,
BenFrantzDale, Esrogs, Chowbok, Rosen, Andreas Kaufmann, Pt, Spoon!, Mindmatrix, Ruud Koot, Mandarax, Wikibofh, Drrngrvy,
Naraht, Ffaarr, Bgwhite, YurikBot, Fabartus, Jengelh, SpuriousQ, Fbergo, Schellhammer, Ripper234, Sneftel, Bcbell, SmackBot, Cparker,
Psiphiorg, Chris the speller, Kurykh, TimBentley, Oli Filth, Silly rabbit, Nbarth, Luder, Puetzk, Cybercobra, Offby1, Dicklyon, Cm-
drObot, Penbat, Funnyfarmofdoom, Mwhitlock, Omicronpersei8, Headbomb, VictorAnyakin, Felix C. Stegerman, David Eppstein, Mar-
tinBot, Huzzlet the bot, KILNA, VolkovBot, Anonymous Dissident, BotKung, Ramiromagalhaes, Kbrose, Hawk777, Ham Pastrami, Kr-
ishna.91, Hello71, Rdhettinger, Foxj, Alexbot, Rhododendrites, XLinkBot, Dekart, Wolkykim, Matěj Grabovský, Rrmsjp, Legobot, Yobot,
AnomieBOT, Sae1962, Arthur MILCHIOR, LittleWink, Woodlot, EmausBot, WikitanvirBot, Aamirlang, E Nocress, ClueBot NG, Ztothe-
fifth, Shire Reeve, Helpful Pixie Bot, BG19bot, IAPAAMMUABBU, Gauravi123, Mtnorthpoplar, RippleSax, Physics42, Vorhalas, Zdim
wiki and Anonymous: 94
• Circular buffer Source: https://en.wikipedia.org/wiki/Circular_buffer?oldid=772234418 Contributors: Damian Yerrick, Julesd, Malco-
hol, Chocolateboy, Tea2min, DavidCary, Andreas Kaufmann, Astronouth7303, Foobaz, Shabble, Cburnett, Qwertyus, Bgwhite, Pok148,
Cedar101, Mhi, WolfWings, SmackBot, Ohnoitsjamie, Chris the speller, KiloByte, Silly rabbit, Antonrojo, Rrelf, Frap, Cybercobra,
Bobamnertiopsis, Zoxc, Mike65535, Anonymi, Joeyadams, Mark Giannullo, Headbomb, Llloic, ForrestVoight, Marokwitz, Hosamaly,
Parthashome, Magioladitis, Indubitably, Amikake3, Strategist333, Billinghurst, Rhanekom, Calliopejen1, SiegeLord, BrightRoundCircle,
OlivierEM, DrZoomEN, Para15000, Niceguyedc, Lucius Annaeus Seneca, Apparition11, Dekart, Dsimic, Addbot, Shervinemami,
MrOllie, OrlinKolev, Matěj Grabovský, Yobot, Ptbotgourou, Tennenrishin, AnomieBOT, BastianVenthur, Materialscientist, ChrisCPear-
son, Serkan Kenar, Shirik, 78.26, Mayukh iitbombay 2008, Hoo man, Sysabod, Ybungalobill, Paulitex, Lipsio, Eight40, ZéroBot, Bloodust,
Pokbot, ClueBot NG, Asimsalam, Shengliangsong, Lemtronix, Exfuent, Tectu, Msoltyspl, MuhannadAjjan, Cerabot~enwiki, ScotXW, Ji-
jubin, Clubjustin, Hailu143, EUROCALYPTUSTREE, Agustinothadeus and Anonymous: 103
• Associative array Source: https://en.wikipedia.org/wiki/Associative_array?oldid=758659575 Contributors: Damian Yerrick, Robert
Merkel, Fubar Obfusco, Maury Markowitz, Hirzel, B4hand, Paul Ebermann, Edward, Patrick, Michael Hardy, Shellreef, Graue,
Minesweeper, Brianiac, Samuelsen, Bart Massey, Hashar, Dcoetzee, Dysprosia, Silvonen, Bevo, Robbot, Noldoaran, Fredrik, Alten-
mann, Wlievens, Catbar, Wikibot, Ruakh, EvanED, Jleedev, Tea2min, Ancheta Wis, Jpo, DavidCary, Mintleaf~enwiki, Inter, Wolfkeeper,
Jorge Stolfi, Macrakis, Pne, Neilc, Kusunose, Karol Langner, Bosmon, Int19h, Andreas Kaufmann, RevRagnarok, Ericamick, LeeHunter,
PP Jewel, Kwamikagami, James b crocker, Spoon!, Bobo192, TommyG, Minghong, Alansohn, Mt~enwiki, Krischik, Sligocki, Rtmyers,
Kdau, Tony Sidaway, RainbowOfLight, Forderud, TShilo12, Boothy443, Mindmatrix, RzR~enwiki, Apokrif, Kglavin, Bluemoose, Ob-
sidianOrder, Pfunk42, Qwertyus, Yurik, Swmcd, Scandum, Koavf, Agorf, Jeff02, RexNL, Alvin-cs, Wavelength, Fdb, Maerk, Dggoldst,
Cedar101, JLaTondre, Owl-syme, TuukkaH, SmackBot, KnowledgeOfSelf, MeiStone, Mirzabah, TheDoctor10, Sam Pointon, Brianski,
Hugo-cs, Jdh30, Zven, Cfallin, CheesyPuffs144, Malbrain, Nick Levine, Vegard, Radagast83, Cybercobra, Decltype, Paddy3118, YeMer-
ryPeasant, AvramYU, Doug Bell, AmiDaniel, Antonielly, EdC~enwiki, Tobe2199, Hans Bauer, Dreftymac, Pimlottc, George100, JForget,
Jokes Free4Me, Pgr94, MrSteve, Countchoc, Ajo Mama, WinBot, Oddity-, Alphachimpbot, Maslin, JonathanCross, Pfast, PhiLho, Wm-
bolle, Magioladitis, David Eppstein, Gwern, Doc aberdeen, Signalhead, VolkovBot, Chaos5023, Kyle the bot, TXiKiBoT, Anna Lincoln,
BotKung, Comet--berkeley, Jesdisciple, PanagosTheOther, Nemo20000, Jerryobject, CultureDrone, Anchor Link Bot, ClueBot, Copyed-
itor42, Irishjugg~enwiki, XLinkBot, Orbnauticus, Frostus, Dsimic, Deineka, Addbot, Debresser, Jarble, Bartledan, Davidwhite544, Mar-
gin1522, Legobot, Luckas-bot, Yobot, TaBOT-zerem, Pcap, Peter Flass, AnomieBOT, RibotBOT, January2009, Sae1962, Efadae, Neil
Schipper, Floatingdecimal, Tushar858, EmausBot, WikitanvirBot, Marcos canbeiro, AvicBot, ClueBot NG, JannuBl22t, Helpful Pixie
Bot, Shuisman, DoctorRad, Crh23, Mithrasgregoriae, JYBot, Dcsaba70, Pintoch, LTWoods, Myconix, Comp.arch, Suelru, Bad Dryer,
Alonsoguillenv, EDickenson and Anonymous: 197
• Association list Source: https://en.wikipedia.org/wiki/Association_list?oldid=728838162 Contributors: SJK, Dcoetzee, Dremora, Tony
Sidaway, Pmcjones, SMcCandlish, David Eppstein, Yobot, Helpful Pixie Bot and Anonymous: 2
• Hash table Source: https://en.wikipedia.org/wiki/Hash_table?oldid=772548618 Contributors: Damian Yerrick, AxelBoldt, Zundark, The
Anome, BlckKnght, Sandos, Rgamble, LapoLuchini, AdamRetchless, Imran, Mrwojo, Frecklefoot, Michael Hardy, Nixdorf, Pnm, Axl-
rosen, TakuyaMurata, Ahoerstemeier, Nanshu, Dcoetzee, Dysprosia, Furrykef, Omegatron, Wernher, Bevo, Tjdw, Pakaran, Secretlondon,
Robbot, Fredrik, Tomchiukc, R3m0t, Altenmann, Ashwin, UtherSRG, Miles, Giftlite, DavidCary, Wolfkeeper, BenFrantzDale, Everyk-
ing, Waltpohl, Jorge Stolfi, Wmahan, Neilc, Pgan002, CryptoDerk, Knutux, Bug~enwiki, Sonjaaa, Teacup, Beland, Watcher, DNewhall,
ReiniUrban, Sam Hocevar, Derek Parnell, Askewchan, Kogorman, Andreas Kaufmann, Kaustuv, Shuchung~enwiki, T Long, Hydrox,
Cfailde, Luqui, Wrp103, Antaeus Feldspar, Bender235, Khalid, Raph Levien, JustinWick, CanisRufus, Shanes, Iron Wallaby, Krakhan,
Bobo192, Davidgothberg, Larryv, Sleske, Helix84, Mdd, Varuna, Baka toroi, Anthony Appleyard, Sligocki, Drbreznjev, DSatz, Akuch-
ling, TShilo12, Nuno Tavares, Woohookitty, LOL, Linguica, Paul Mackay~enwiki, Davidfstr, GregorB, Meneth, Graham87, Kbdank71,
Tostie14, Rjwilmsi, Scandum, Koavf, Kinu, Pleiotrop3, Filu~enwiki, Nneonneo, FlaBot, Ecb29, Fragglet, Intgr, Fresheneesz, Antaeus
FeIdspar, YurikBot, Wavelength, RobotE, Mongol, RussBot, Me and, CesarB’s unpriviledged account, Lavenderbunny, Gustavb, Mi-
padi, Cryptoid, Mike.aizatsky, Gareth Jones, Piccolomomo~enwiki, CecilWard, Nethgirb, Gadget850, Bota47, Sebleblanc, Deeday-UK,
Sycomonkey, Ninly, Gulliveig, Th1rt3en, CWenger, JLaTondre, ASchmoo, Kungfuadam, Daivox, SmackBot, Apanag, Obakeneko, Pizza-
Margherita, Alksub, Eskimbot, RobotJcb, C4chandu, Gilliam, Arpitm, Neurodivergent, EncMstr, Cribe, Deshraj, Tackline, Frap, Mayrel,
Radagast83, Cybercobra, Decltype, HFuruseth, Rich.lewis, Esb, Acdx, MegaHasher, Doug Bell, Derek farn, IronGargoyle, Josephsieh, Pe-
ter Horn, Pagh, Saxton, Tawkerbot2, Ouishoebean, CRGreathouse, Ahy1, MaxEnt, Seizethedave, Cgma, Not-just-yeti, Headbomb, Ther-
mon, OtterSmith, Ajo Mama, Stannered, AntiVandalBot, Hosamaly, Thailyn, Pixor, JAnDbot, MER-C, Epeefleche, Dmbstudio, Siobhan-
Hansa, Wikilolo, Bongwarrior, QrczakMK, Josephskeller, Tedickey, Schwarzbichler, Cic, Allstarecho, David Eppstein, Oravec, Gwern,
Magnus Bakken, Glrx, Narendrak, Tikiwont, Mike.lifeguard, Luxem, NewEnglandYankee, Cobi, Cometstyles, Winecellar, VolkovBot,
Simulationelson, Floodyberry, Anurmi~enwiki, BotKung, Collin Stocks, JimJJewett, Nightkhaos, Spinningspark, Abatishchev, Helios2k6,
Kehrbykid, Kbrose, PeterCanthropus, Gerakibot, Triwbe, Digwuren, Svick, JL-Bot, ObfuscatePenguin, ClueBot, Justin W Smith, Imper-
fectlyInformed, Adrianwn, Mild Bill Hiccup, Niceguyedc, JJuran, Groxx, Berean Hunter, Eddof13, Johnuniq, Arlolra, XLinkBot, Het-
ori, Pichpich, Paulsheer, TheTraveler3, MystBot, Karuthedam, Wolkykim, Addbot, Gremel123, Scientus, CanadianLinuxUser, MrOl-
lie, Numbo3-bot, Om Sao, Zorrobot, Jarble, Frehley, Legobot, Luckas-bot, Yobot, Denispir, KamikazeBot, Peter Flass, Dmcomer,
AnomieBOT, Erel Segal, Jim1138, Sz-iwbot, Materialscientist, Citation bot, ArthurBot, Baliame, Drilnoth, Arbalest Mike, Ched, Shad-
owjams, Kracekumar, FrescoBot, Gbutler69, W Nowicki, X7q, Sae1962, Citation bot 1, Velociostrich, Simonsarris, Maggyero, Iekpo,
Trappist the monk, SchreyP, Grapesoda22, Patmorin, Cutelyaware, JeepdaySock, Shafigoldwasser, Kastchei, DuineSidhe, EmausBot, Su-
per48paul, Ibbn, DanielWaterworth, GoingBatty, Mousehousemd, ZéroBot, Purplie, Ticklemepink42, Paul Kube, Demonkoryu, Donner60,
Carmichael, Pheartheceal, Aberdeen01, Neil P. Quinn, Teapeat, Rememberway, ClueBot NG, Iiii I I I, Incompetence, Rawafmail, Frietjes,
Cntras, Rezabot, Jk2q3jrklse, Helpful Pixie Bot, BG19bot, Jan Spousta, MusikAnimal, SanAnMan, Pbruneau, AdventurousSquirrel, Tris-
ton J. Taylor, CitationCleanerBot, Happyuk, FeralOink, Spacemanaki, Aloksukhwani, Emimull, Deveedutta, Shmageggy, IgushevEdward,
AlecTaylor, Pintoch, Mcom320, Thomas J. S. Greenfield, Razibot, Djszapi, QuantifiedElf, Myconix, Chip Wildon Forster, Tmferrara,
Tuketu7, Whacks, Monkbot, Iokevins, Kjerish, Nitishch, Oleaster, MediKate, Micahsaint, Tourorist, Mtnorthpoplar, Dazappa, Luis150902,
Gou7214309, Dwemthy, Ushkin N, Earl King and Anonymous: 466
• Linear probing Source: https://en.wikipedia.org/wiki/Linear_probing?oldid=772550290 Contributors: Ubiquity, Bearcat, Enochlau, An-
dreas Kaufmann, Gazpacho, Discospinster, RJFJR, Linas, Tas50, The Rambling Man, CesarB’s unpriviledged account, SpuriousQ, Chris
the speller, JonHarder, MichaelBillington, Sbluen, Jeberle, Negrulio, Cryptic C62, Jngnyc, Alaibot, Thijs!bot, Headbomb, A3nm, David
Eppstein, STBot, Themania, OliviaGuest, Arjunaraoc, C. A. Russell, Addbot, Legobot, Yobot, Tedzdog, Patmorin, Infinity ive, Dixtosa,
Danmoberly, Dzf1992, Rubbish computer and Anonymous: 17
• Quadratic probing Source: https://en.wikipedia.org/wiki/Quadratic_probing?oldid=759799785 Contributors: Aragorn2, Dcoetzee,
Enochlau, Andreas Kaufmann, Rich Farmbrough, ZeroOne, Oleg Alexandrov, Ryk, Eubot, CesarB’s unpriviledged account, Robertvan1,
Mikeblas, SmackBot, InverseHypercube, Ian1000, Cybercobra, Wizardman, Jdanz, Simeon, Magioladitis, David Eppstein, R'n'B, Philip
Trueman, Hatmatbbat10, C. A. Russell, Addbot, Yobot, Bavla, Kmgpratyush, Donner60, ClueBot NG, Helpful Pixie Bot, Yashykt, Vaib-
hav1992, AndiPersti, Danielcamiel, EapenZhan and Anonymous: 40
• Double hashing Source: https://en.wikipedia.org/wiki/Double_hashing?oldid=722897281 Contributors: AxelBoldt, CesarB, Angela,
Dcoetzee, Usrnme h8er, Stesmo, RJFJR, Zawersh, Pfunk42, Gurch, CesarB’s unpriviledged account, Momeara, DasBrose~enwiki, Cob-
blet, SmackBot, Bluebot, Hashbrowncipher, JForget, Only2sea, Alaibot, Thijs!bot, David Eppstein, WonderPhil, Philip Trueman, Ox-
fordwang, Extensive~enwiki, Mild Bill Hiccup, Addbot, Tcl16, Smallman12q, Amiceli, Imposing, Jesse V., ClueBot NG, Exercisephys,
Bdawson1982, Kevin12xd and Anonymous: 36
• Cuckoo hashing Source: https://en.wikipedia.org/wiki/Cuckoo_hashing?oldid=772276433 Contributors: Arvindn, Dcoetzee, McKay,
Phil Boswell, Nyh, Pps, DavidCary, Neilc, Bender235, Unquietwiki, Zawersh, Ej, Nihiltres, Bgwhite, CesarB’s unpriviledged account,
Zr2d2, Zerodamage, Aaron Will, SmackBot, Mandyhan, Thumperward, Cybercobra, Pagh, Jafet, CRGreathouse, Alaibot, Headbomb,
Hermel, David Eppstein, S3000, Themania, Wjaguar, Mark cummins, LiranKatzir, Svick, Justin W Smith, Hetori, Addbot, Alquantor,
Lmonson26, Luckas-bot, Yobot, Valentas.Kurauskas, Thore Husfeldt, W Nowicki, Citation bot 1, Userask, Trappist the monk, EmausBot,
BuZZdEE.BuzZ, Rcsprinter123, Bomazi, Yoavt, BattyBot, Andrew Helwer, Dexbot, Usernameasdf, Monkbot, Cyberboys91, Harvi004
and Anonymous: 48
• Hopscotch hashing Source: https://en.wikipedia.org/wiki/Hopscotch_hashing?oldid=742861865 Contributors: Cybercobra, Svick, Im-
ageRemovalBot, Shafigoldwasser, QinglaiXiao, BG19bot, Alxradz and Anonymous: 9
• Hash function Source: https://en.wikipedia.org/wiki/Hash_function?oldid=771961351 Contributors: Damian Yerrick, Derek Ross, Taw,
BlckKnght, PierreAbbat, Miguel~enwiki, Imran, David spector, Dwheeler, Hfastedge, Michael Hardy, EddEdmondson, Ixfd64, Mde-
bets, Nanshu, J-Wiki, Jc~enwiki, Vanis~enwiki, Dcoetzee, Ww, The Anomebot, Doradus, Robbot, Noldoaran, Altenmann, Mikepelley,
Tea2min, Connelly, Giftlite, Paul Richter, DavidCary, KelvSYC, Wolfkeeper, Obli, Everyking, TomViza, Brona, Malyctenar, Jorge Stolfi,
Matt Crypto, Utcursch, Knutux, OverlordQ, Kusunose, Watcher, Karl-Henner, Talrias, Peter bertok, Quota, Eisnel, Shiftchange, Mormegil,
Jonmcauliffe, Rich Farmbrough, Antaeus Feldspar, Bender235, Chalst, Evand, PhilHibbs, Haxwell, Bobo192, Sklender, Davidgothberg,
Boredzo, Helix84, CyberSkull, Atlant, Jeltz, Mmmready, Apoc2400, InShaneee, Velella, Jopxton, ShawnVW, Kurivaim, MIT Trekkie,
Redvers, Blaxthos, Kazvorpal, Brookie, Linas, Mindmatrix, GVOLTT, LOL, TheNightFly, Drostie, Pfunk42, Graham87, Qwertyus,
Toolan, Rjwilmsi, Seraphimblade, Pabix, LjL, Ttwaring, Utuado, Nguyen Thanh Quang, FlaBot, Harmil, Gurch, Thenowhereman, Math-
rick, Intgr, M7bot, Chobot, Roboto de Ajvol, YurikBot, Wavelength, RattusMaximus, RobotE, CesarB’s unpriviledged account, Stephenb,
Pseudomonas, Andipi, Zeno of Elea, EngineerScotty, CecilWard, Mikeblas, Fender123, Bota47, Tachyon01, Ms2ger, Eurosong, Dfinkel,
Lt-wiki-bot, Ninly, Gulliveig, CharlesHBennett, StealthFox, Claygate, Snoops~enwiki, QmunkE, Emc2, Appleseed, Tobi Kellner, That
Guy, From That Show!, Jbalint, SmackBot, InverseHypercube, Bomac, KocjoBot~enwiki, BiT, Yamaguchi , Gilliam, Raghaw, Schmit-
eye, Mnbf9rca, JesseStone, Oli Filth, EncMstr, Octahedron80, Nbarth, Kmag~enwiki, Malbrain, Chlewbot, Shingra, Midnightcomm,
Lansey, Andrei Stroe, MegaHasher, Lambiam, Kuru, Alexcollins, Paulschou, RomanSpa, Chuck Simmons, KHAAAAAAAAAAN, Er-
win, Peyre, Vstarre, Pagh, MathStuf, ShakingSpirit, Iridescent, Agent X2, BrianRice, Courcelles, Owen214, Juhachi, Neelix, Mblumber,
SavantEdge, Adolphus79, Sytelus, Epbr123, Ultimus, Leedeth, Stualden, Folic Acid, AntiVandalBot, Xenophon (bot), JakeD409, Da-
vorian, Powerdesi, Dhrm77, JAnDbot, Epeefleche, Hamsterlopithecus, Kirrages, Stangaa, Steveprutz, Wikilolo, Coffee2theorems, Ma-
gioladitis, Pndfam05, Patelm, Tedickey, Nyttend, Kgfleischmann, Dappawit, Applrpn, STBot, GimliDotNet, R'n'B, Jfroelich, Francis
Tyers, Demosta, Thirdright, J.delanoy, Maurice Carbonaro, Svnsvn, Wjaguar, L337 kybldmstr, Globbet, Ontarioboy, Doug4, Meiskam,
Jrmcdaniel, VolkovBot, Sjones23, Boute, TXiKiBoT, Christofpaar, GroveGuy, A4bot, Nxavar, Noformation, Cuddlyable3, Crashthatch,
Wiae, Jediknil, Tastyllama, Skarz, LittleBenW, SieBot, WereSpielChequers, Laoris, KrizzyB, Xelgen, Flyer22 Reborn, Iamhigh, Dhb101,
BrightRoundCircle, OKBot, Svick, FusionNow, BitCrazed, ClueBot, Cab.jones, Ggia, Unbuttered Parsnip, Garyzx, Mild Bill Hiccup, शिव,
Dkf11, SamHartman, Rob Bednark, Alexbot, Erebus Morgaine, Diaa abdelmoneim, Wordsputtogether, Tonysan, Rishi.bedi, XLinkBot,
Kotha arun2005, Dthomsen8, MystBot, Karuthedam, SteveJothen, Addbot, Butterwell, TutterMouse, Dranorter, MrOllie, CarsracBot, An-
dersBot, Jeaise, Lightbot, Luckas-bot, Fraggle81, AnomieBOT, Erel Segal, Materialscientist, Citation bot, Twri, ArthurBot, Xqbot, Capri-
corn42, Matttoothman, M2millenium, Theclapp, RibotBOT, Alvin Seville, MerlLinkBot, FrescoBot, Nageh, MichealH, TruthIIPower,
Haeinous, Geoffreybernardo, Pinethicket, 10metreh, Cnwilliams, Mghgtg, Dinamik-bot, Vrenator, Keith Cascio, Phil Spectre, Jeffrd10,
Updatehelper, Kastchei, EmausBot, Timtempleton, Gfoley4, Mayazcherquoi, Timde, MarkWegman, Dewritech, Jachto, John Cline, White
Trillium, Fæ, Akerans, Paul Kube, Music Sorter, Donner60, Senator2029, Teapeat, Sven Manguard, Shi Hou, Mikhail Ryazanov, Re-
memberway, ClueBot NG, Incompetence, Neuroneutron, Monchoman45, Cntras, Widr, Mtking, Bluechimera0, HMSSolent, Wikisian,
BG19bot, JamesNZ, GarbledLecture933, Harpreet Osahan, Glacialfox, Winston Chuen-Shih Yang, ChrisGualtieri, Tech77, Jeff Erickson,
Jonahugh, Lindsaywinkler, Tmferrara, Cattycat95, Tolcso, Frogger48, Eddiearin123, Philnap, Kanterme, Laberinto15, MatthewBuch-
walder, Wkudrle, Computilizer, Mark22207, GlennLawyer, Gcarvelli, BlueFenixReborn, Some Gadget Geek, Siddharthgondhi, Surlycy-
borg, Ollie314, GSS-1987, Entranced98, KGirlTrucker81, John “Hannibal” Smith, WójcikBartosz, Hennerhubel, Lekkio and Anonymous:
497
• Perfect hash function Source: https://en.wikipedia.org/wiki/Perfect_hash_function?oldid=755582863 Contributors: Edward, Cimon
Avaro, Dcoetzee, Fredrik, Giftlite, Neilc, E David Moyer, Burschik, Bender235, LOL, Ruud Koot, JMCorey, ScottJ, Mathbot, Spl, Ce-
sarB’s unpriviledged account, Dtrebbien, Długosz, Gareth Jones, Salrizvy, Johndburger, SmackBot, Nbarth, Srchvrs, Otus, 4hodmt, Mega-
Hasher, Pagh, Mudd1, Krauss, Headbomb, Wikilolo, David Eppstein, Glrx, Cobi, Drkarger, Gajeam, PixelBot, Addbot, G121, Bbb23,
AnomieBOT, FrescoBot, Daoudamjad, John of Reading, Prvák, Maysak, Voomoo, Arka sett, BG19bot, SteveT84, Walrus068, Mcichelli,
Dexbot, Pintoch, Latin.ufmg and Anonymous: 34
• Universal hashing Source: https://en.wikipedia.org/wiki/Universal_hashing?oldid=768108046 Contributors: Mattflaschen, DavidCary,
Neilc, ArnoldReinhold, EmilJ, Pol098, Rjwilmsi, Sdornan, SeanMack, Chobot, Dmharvey, Gareth Jones, Guruparan18, Johndburger,
Twintop, CharlesHBennett, SmackBot, Cybercobra, Copysan, DanielLemire, Pagh, Dwmalone, Winxa, Jafet, Arnstein87, Marc W.
Abel, Sytelus, Francois.boutines, Headbomb, Golgofrinchian, David Eppstein, Copland Stalker, Danadocus, Cyberjoac, Ulamgamer, Ben-
der2k14, Rswarbrick, Addbot, RPHv, Yobot, Mpatrascu, Citation bot, LilHelpa, Citation bot 1, TPReal, Patmorin, RjwilmsiBot, Emaus-
Bot, Dewritech, ClueBot NG, Helpful Pixie Bot, Cleo, BG19bot, Walrus068, BattyBot, ChrisGualtieri, Zolgharnein, Jeff Erickson, Zen-
guine, Mikkel2thorup and Anonymous: 43
• K-independent hashing Source: https://en.wikipedia.org/wiki/K-independent_hashing?oldid=744752555 Contributors: Nandhp,
Rjwilmsi, CBM, David Eppstein, Iohannes Animosus, Mpatrascu, Mr Sheep Measham, BattyBot and Anonymous: 3
• Tabulation hashing Source: https://en.wikipedia.org/wiki/Tabulation_hashing?oldid=744532063 Contributors: RJFJR, DanielLemire,
Johnwbyrd, David Eppstein, Thomasda, Oranav, Thore Husfeldt, Tom.Reding, BG19bot, Cleanelephant, Eehcyl, Kbulgakov and Anony-
mous: 3
• Cryptographic hash function Source: https://en.wikipedia.org/wiki/Cryptographic_hash_function?oldid=769618249 Contributors:
Damian Yerrick, Bryan Derksen, Zundark, Arvindn, Imran, Paul Ebermann, Michael Hardy, Dan Koehl, Vacilandois, Dcljr, CesarB,
Ciphergoth, Feedmecereal, Charles Matthews, Ww, Amol kulkarni, Mrand, Taxman, Phil Boswell, Chuunen Baka, Robbot, Paranoid, As-
tronautics~enwiki, Fredrik, Lowellian, Pingveno, Aetheling, Mattflaschen, Javidjamae, Giftlite, Lunkwill, DavidCary, ShaunMacPherson,
Inkling, BenFrantzDale, Ianhowlett, Leonard G., Jorge Stolfi, Cloud200, Matt Crypto, Utcursch, CryptoDerk, Lightst, Antandrus, Tjwood,
Anirvan, Imjustmatthew, Rich Farmbrough, FT2, ArnoldReinhold, YUL89YYZ, Samboy, Mykhal, Chalst, Kyz, Sietse Snel, Schneier,
Bobo192, Myria, VBGFscJUn3, Davidgothberg, Boredzo, Quintus~enwiki, Sligocki, Ciphergoth2, Danhash, Pgimeno~enwiki, H2g2bob,
BDD, MIT Trekkie, PseudonympH, Simetrical, CygnusPius, Mindmatrix, Apokrif, Jok2000, Mandarax, Alienus, Ej, SMC, AndyKali,
Ruptor, Mathbot, Harmil, Maxal, Intgr, Fresheneesz, Wolfmankurd, Wigie, FrenchIsAwesome, CesarB’s unpriviledged account, Ted-
dyb, Gaius Cornelius, Rsrikanth05, Bachrach44, Froth, Guruparan18, Dbfirs, Ott2, Analoguedragon, Appleseed, Finell, DaishiHarada,
SmackBot, Mmernex, Tom Lougheed, Michaelfavor, Mdd4696, C4chandu, BiT, Yamaguchi , Ohnoitsjamie, Oli Filth, Nbarth, DHN-
bot~enwiki, Colonies Chris, Zsinj, Kotra, Deeb, Fuzzypeg, Lambiam, Twotwotwo, Twredfish, Brian Gunderson, Oswald Glinkmeyer, Dick-
lyon, Lee Carre, OnBeyondZebrax, Paul Foxworthy, Fils du Soleil, MoleculeUpload, Jafet, Chris55, Mellery, CmdrObot, Jesse Viviano,
Penbat, NormHardy, Cydebot, ST47, Optimist on the run, Bsmntbombdood, Bdragon, Jm3, N5iln, Strongriley, Dawnseeker2000, AntiVan-
dalBot, Nipisiquit, JAnDbot, BenjaminGittins, Instinct, Jimbobl, Coolhandscot, Gavia immer, Extropian314, VoABot II, NoDepositNoRe-
turn, Twsx, Firealwaysworks, David Eppstein, Vssun, WLU, Ratsbane, Gwern, JensAlfke, Maurice Carbonaro, Eliz81, Cpiral, Osndok,
83d40m, Robertgreer, SmallPotatoes, TreasuryTag, Sroc, TooTallSid, Oconnor663, Nxavar, Wordsmith, Jamelan, Enviroboy, Fltnsplr,
AP61, Arjun024, SieBot, Tehsha, Caltas, JuanPechiar, ArchiSchmedes, Jasonsewall, Wahrmund, Bpeps, ClueBot, JWilk, Ggia, Arakunem,
Avinava, CounterVandalismBot, Niceguyedc, DragonBot, Infomade, Cenarium, Leobold1, Erodium, Thinking Stone, DumZiBoT, Cmc-
queen1975, Pierzz, Mitch Ames, SteveJothen, Addbot, Non-dropframe, Laurinavicius, Cube444, Leszek Jańczuk, Wikipedian314, Down-
load, Maslen, Yobot, MarioS, Wurfmaul, Doctorhook, SwisterTwister, AnomieBOT, DemocraticLuntz, Materialscientist, Are you ready for
IPv6?, Xvsn, Clark89, Rabbler, Capricorn42, Oxwil, Marios.agathangelou, Sylletka, BrianWren, Daemorris, Amit 9b, Tsihonglau, Hymek,
MerlLinkBot, Maxiwheat, Bonev, FreeKnowledgeCreator, FrescoBot, Jsaenznoval, ‫תומר א‬., Haeinous, Doremo, Blotowij, Jandalhandler,
RobinK, Salvidrim!, LiberatorG, ‫קול ציון‬, Lotje, ATBS, Wedgefish, January, Eatnumber1, Plfernandez, Whisky drinker, Patriot8790, Trac-
erneo, RistoLaanoja, Mjd95, EmausBot, WikitanvirBot, AvicBot, ZéroBot, Quelrod, A930913, Erianna, Bomazi, ClueBot NG, Wcherowi,
MelbourneStar, Champloo11, Rezabot, Widr, Danwix, BG19bot, Lichtspiel, Garsd, ZipoBibrok5x10^8, Manoguru, RavelTwig, Luzm-
costa, David.moreno72, Darts123, Basisplan0815, JYBot, Pintoch, CuriousMind01, ‫مونا بشيري‬, Connorr89, Epicgenius, Tentinator,
Jianhui67, Musko47, TAKUMI YAMAWAKI, Claw of Slime, Monkbot, Maciej Czyżewski, TimMagee, Maths314, Chouhartem, Touror-
ist, Lover amethyst, Onlinetvnet, TheExceptionCloaker, Axlesoft, ‫ז‬62 and Anonymous: 305
• Set (abstract data type) Source: https://en.wikipedia.org/wiki/Set_(abstract_data_type)?oldid=757372624 Contributors: Damian Yer-
rick, William Avery, Mintguy, Patrick, Modster, TakuyaMurata, EdH, Mxn, Dcoetzee, Fredrik, Jorge Stolfi, Lvr, Urhixidur, Andreas
Kaufmann, CanisRufus, Spoon!, RJFJR, Ruud Koot, Pfunk42, Bgwhite, Roboto de Ajvol, Mlc, Cedar101, QmunkE, Incnis Mrsi, Blue-
bot, MartinPoulter, Nbarth, Gracenotes, Otus, Cybercobra, Dreadstar, Wizardman, MegaHasher, Hetar, Amniarix, CBM, Polaris408,
Peterdjones, Hosamaly, Hut 8.5, Wikilolo, Lt basketball, Gwern, Raise exception, Fylwind, Davecrosby uk, BotKung, Rhanekom, SieBot,
Oxymoron83, Casablanca2000in, Classicalecon, Linforest, Niceguyedc, UKoch, Quinntaylor, Addbot, SoSaysChappy, Loupeter, Legobot,
Luckas-bot, Denispir, Pcap, AnomieBOT, Citation bot, Twri, DSisyphBot, GrouchoBot, FrescoBot, Spindocter123, Tyamath, EmausBot,
Wikipelli, Elaz85, Mentibot, Nullzero, Helpful Pixie Bot, Poonam7393, Umasoni30, Vimalwatwani, Chmarkine, Irene31, Mark viking,
FriendlyCaribou, Brandon.heck, Aristiden7o, Bender the Bot and Anonymous: 46
• Bit array Source: https://en.wikipedia.org/wiki/Bit_array?oldid=772721016 Contributors: Awaterl, Boud, Pnm, Dcoetzee, Furrykef,
JesseW, AJim, Bovlb, Vadmium, Karol Langner, Sam Hocevar, Andreas Kaufmann, Notinasnaid, Paul August, CanisRufus, Spoon!,
R. S. Shaw, Rgrig, Forderud, Jacobolus, Bluemoose, Qwertyus, Hack-Man, StuartBrady, Intgr, RussBot, Cedar101, TomJF, JLaTondre,
Chris the speller, Bluebot, Doug Bell, Archimerged, DanielLemire, Glen Pepicelli, CRGreathouse, Gyopi, Neelix, Davnor, Kubanczyk,
Izyt, Gwern, Themania, R'n'B, Sudleyplace, TheChrisD, Cobi, Pcordes, Bvds, RomainThibaux, Psychless, Skwa, Onomou, MystBot, Ad-
dbot, IOLJeff, Tide rolls, Bluebusy, Peter Flass, AnomieBOT, Rubinbot, JnRouvignac, ZéroBot, Nomen4Omen, Cocciasik, ClueBot NG,
Snotbot, Minakshinajardhane, Chmarkine, Chip123456, BattyBot, Mogism, Thajdog10, User85734, François Robere, Carlos R Castro G,
Chadha.varun, Francisco Bajumuzi, Ushkin N and Anonymous: 54
• Bloom filter Source: https://en.wikipedia.org/wiki/Bloom_filter?oldid=772348327 Contributors: Damian Yerrick, The Anome, Edward,
Michael Hardy, Pnm, Wwwwolf, Thebramp, Charles Matthews, Dcoetzee, Doradus, Furrykef, Phil Boswell, Fredrik, Chocolateboy, Bab-
bage, Alan Liefting, Giftlite, DavidCary, ShaunMacPherson, Rchandra, Macrakis, Neilc, EvilGrin, James A. Donald, Mahemoff, Two
Bananas, Urhixidur, Andreas Kaufmann, Anders94, Subrabbit, Smyth, Agl~enwiki, CanisRufus, Susvolans, Giraffedata, Drangon, Ter-
rycojones, Mbloore, Yinotaurus, Dzhim, GiovanniS, Galaxiaad, Mindmatrix, Shreevatsa, RzR~enwiki, Tabletop, Payrard, Ryan Reich,
Pfunk42, Qwertyus, Ses4j, Rjwilmsi, Sdornan, Brighterorange, Vsriram, Quuxplusone, Chobot, Wavelength, Argav, Taejo, CesarB’s un-
priviledged account, Msikma, E123, Dtrebbien, Wirthi, Cconnett, Cedar101, HereToHelp, Rubicantoto, Sbassi, Daivox, SmackBot, Stev0,
MalafayaBot, Nbarth, Cybercobra, Xiphoris, Wikidrone, Drae, Galaad2, Jeremy Banks, Shakeelmahate, Requestion, Krauss, Farzaneh,
Hilgerdenaar, Lindsay658, Hanche, Headbomb, NavenduJain, QuiteUnusual, Marokwitz, Labongo, Bblfish, Igodard, ARSHA, Magiola-
ditis, Alexmadon, David Eppstein, STBot, Flexdream, Willpwillp, Osndok, Coolg49964, Jjldj, Hammersoft, VolkovBot, Ferzkopp, Lo-
kiClock, Trachten, Rlaufer, SieBot, Emorrissey, Sswamida, Nhahmada, Abbasgadhia, Svick, Justin W Smith, Gtoal, HowardBGolden,
Rhubbarb, Quanstro, Pointillist, Shabbychef, Bender2k14, Sun Creator, Jakouye, AndreasBWagner, Sharma337, Dsimic, SteveJothen,
Addbot, Mortense, Jerz4835, FrankAndProust, MrOllie, Lightbot, Legobot, Russianspy3, Luckas-bot, Yobot, Ptbotgourou, Amirobot,
Gharb, AnomieBOT, Materialscientist, Citation bot, Naufraghi, Tjayrush, Krj373, Osloom, X7q, Citation bot 1, Chenopodiaceous,
HRoestBot, Jonesey95, Kronos04, Trappist the monk, Chronulator, Mavam, Buddeyp, RjwilmsiBot, Liorithiel, Lesshaste, John of Read-
ing, Drafiei, GoingBatty, HiW-Bot, ZéroBot, Meng6, AManWithNoPlan, Ashish goel public, Jar354, Mikhail Ryazanov, ClueBot NG,
Gareth Griffith-Jones, Bpodgursky, Rezabot, Helpful Pixie Bot, BG19bot, DivineTraube, ErikDubbelboer, Solomon7968, Exercisephys,
Chmarkine, Williamdemeo, Akryzhn, Pintoch, Faizan, Lsmll, Everymorning, BloomFilterEditor, OriRottenstreich, Monkbot, Reddish-
mariposa, Queelius, Epournaras, InternetArchiveBot, Ushkin N, Satokoala, Bender the Bot, Aagorilla, Bruce Maggs and Anonymous:
198
• MinHash Source: https://en.wikipedia.org/wiki/MinHash?oldid=772804906 Contributors: AxelBoldt, Kku, Qwertyus, Rjwilmsi, Gareth
Jones, Johndburger, Cedar101, Ma8thew, Ebrahim, David Eppstein, SchreiberBike, XLinkBot, Yobot, Citation bot, JonDePlume, Foo-
barnix, Trappist the monk, EmausBot, Nomadz, Chire, Chirag101192, Frietjes, Leopd, RWMajeed, Xmutangzk, Linuxjava, NickGrattan,
Srednuas Lenoroc, ElizaLepine and Anonymous: 27
• Disjoint-set data structure Source: https://en.wikipedia.org/wiki/Disjoint-set_data_structure?oldid=767826900 Contributors: The
Anome, Michael Hardy, Dominus, LittleDan, Charles Matthews, Dcoetzee, Grendelkhan, Pakaran, Giftlite, Pgan002, Jonel, Deewiant,
Finog, Andreas Kaufmann, Qutezuce, SamRushing, Nyenyec, Beige Tangerine, Msh210, Bigaln2, ReyBrujo, LOL, Bkkbrad, Ruud Koot,
Qwertyus, Kasei-jin~enwiki, Rjwilmsi, Salix alba, Intgr, Fresheneesz, Wavelength, Sceptre, NawlinWiki, Spike Wilbury, Kevtrice, Spirko,
Ripper234, Cedar101, Tevildo, SmackBot, Izzynn, Oli Filth, Nikaustr, Lambiam, Archimerged, SpyMagician, IanLiu, Dr Greg, Super-
joe30, Edward Vielmetti, Gfonsecabr, Headbomb, Kenahoo, Stellmach, David Eppstein, Chkno, Glrx, Rbrewer42, Kyle the bot, Oshwah,
Jamelan, AHMartin, Oaf2, MasterAchilles, Boydski, Alksentrs, Adrianwn, Vanisheduser12a67, DumZiBoT, XLinkBot, Cldoyle, Dekart,
Addbot, Shmilymann, Lightbot, Chipchap, Tonycao, Yobot, Erel Segal, Rubinbot, Sz-iwbot, Citation bot, Fantasticfears, Backpackadam,
Williacb, HRoestBot, MathijsM, Akim Demaille, Rednas1234, EmausBot, Zhouji2010, ZéroBot, Wmayner, ChuispastonBot, Mankarse,
Nullzero, Aleskotnik, Andreschulz, FutureTrillionaire, Josef Kufner, Qunwangcs157, Andyhowlett, Faizan, Simonemainardi, William Di
Luigi, Kimi91, Sharma.illusion, Kbhat95, Kennysong, R.J.C.vanHaaften, Nbro, Ahg simon, Michaelovertolli, Shiyu Ji, TaerimKim, Zha-
haoyu, AYUSHI, Pranavr93, Refat khan pathan and Anonymous: 82
• Partition refinement Source: https://en.wikipedia.org/wiki/Partition_refinement?oldid=771055545 Contributors: Tea2min, Linas, Qw-
ertyus, Matt Cook, Chris the speller, Headbomb, David Eppstein, Watchduck, Noamz, RjwilmsiBot, Xsoameix, David N. Jansen and
Anonymous: 2
• Priority queue Source: https://en.wikipedia.org/wiki/Priority_queue?oldid=772679819 Contributors: Frecklefoot, Michael Hardy, Nix-
dorf, Bdonlan, Strebe, Dcoetzee, Sanxiyn, Robbot, Fredrik, Kowey, Bkell, Tea2min, Decrypt3, Giftlite, Zigger, Vadmium, Andreas Kauf-
mann, Byrial, BACbKA, El C, Spoon!, Bobo192, Nyenyec, Dbeardsl, Jeltz, Mbloore, Forderud, RyanGerbil10, Kenyon, Woohookitty,
Oliphaunt, Ruud Koot, Hdante, Pete142, Graham87, Qwertyus, Pdelong, Ckelloug, Vegaswikian, StuartBrady, Jeff02, Spl, Anders.Warga,
Stephenb, Gareth Jones, Lt-wiki-bot, PaulWright, SmackBot, Emeraldemon, Stux, Gilliam, Riedl, Oli Filth, Silly rabbit, Nbarth, Kostmo,
Zvar, Calbaer, Cybercobra, BlackFingolfin, A5b, Clicketyclack, Ninjagecko, Robbins, Rory O'Kane, Sabik, John Reed Riley, ShelfSkewed,
Chrisahn, Corpx, Omicronpersei8, Thijs!bot, LeeG, Mentifisto, AntiVandalBot, Wayiran, CosineKitty, Ilingod, VoABot II, David Eppstein,
Jutiphan, Umpteee, Squids and Chips, TXiKiBoT, Coder Dan, Red Act, RHaden, Rhanekom, SieBot, ThomasTenCate, EnOreg, Volkan
YAZICI, ClueBot, Niceguyedc, Thejoshwolfe, SchreiberBike, BOTarate, Krungie factor, DumZiBoT, XLinkBot, Ghettoblaster, Vield,
Jncraton, Lightbot, Legobot, Yobot, FUZxxl, Bestiasonica, AnomieBOT, 1exec1, Kimsey0, Xqbot, Redroof, Thore Husfeldt, FrescoBot,
Hobsonlane, Itusg15q4user, Arthur MILCHIOR, Orangeroof, ElNuevoEinstein, HenryAyoola, EmausBot, LastKingpin, Moswento, Arken-
flame, Meng6, GabKBel, ChuispastonBot, Highway Hitchhiker, ClueBot NG, Carandraug, Ztothefifth, Widr, FutureTrillionaire, Happyuk,
Chmarkine, J.C. Labbrev, Dexbot, Kushalbiswas777, MeekMelange, Lone boatman, Sriharsh1234, Theemathas, Dough34, Mydog333,
Luckysud4, Sammydre, Bladeshade2, Mtnorthpoplar, Kdhanas, GreenC bot, Bender the Bot and Anonymous: 153
• Bucket queue Source: https://en.wikipedia.org/wiki/Bucket_queue?oldid=766064665 Contributors: David Eppstein
• Heap (data structure) Source: https://en.wikipedia.org/wiki/Heap_(data_structure)?oldid=764292574 Contributors: Derek Ross,
LC~enwiki, Christian List, Boleslav Bobcik, DrBob, B4hand, Frecklefoot, Paddu, Jimfbleak, Notheruser, Kragen, Jll, Aragorn2, Charles
Matthews, Timwi, Dcoetzee, Dfeuer, Dysprosia, Doradus, Jogloran, Shizhao, Cannona, Robbot, Noldoaran, Fredrik, Sbisolo, Vikingstad,
Giftlite, DavidCary, Wolfkeeper, Mellum, Tristanreid, Pgan002, Beland, Two Bananas, Pinguin.tk~enwiki, Andreas Kaufmann, Abdull,
Oskar Sigvardsson, Wiesmann, Yuval madar, Qutezuce, Tristan Schmelcher, Ascánder, Mwm126, Iron Wallaby, Spoon!, Mdd, Musiphil,
Guy Harris, Sligocki, Suruena, Derbeth, Wsloand, Oleg Alexandrov, Mahanga, Mindmatrix, LOL, Prophile, Daira Hopwood, Ruud Koot,
Apokrif, Tom W.M., Graham87, Qwertyus, Drpaule, Psyphen, Mathbot, Quuxplusone, Krun, Fresheneesz, Chobot, YurikBot, Wave-
length, RobotE, Vecter, NawlinWiki, DarkPhoenix, B7j0c, Moe Epsilon, Mlouns, LeoNerd, Bota47, Schellhammer, Lt-wiki-bot, Abu
adam~enwiki, Ketil3, HereToHelp, Daivox, SmackBot, Reedy, Tgdwyer, Eskimbot, Took, Thumperward, Oli Filth, Silly rabbit, Nbarth,
Ilyathemuromets, Jmnbatista, Cybercobra, Mlpkr, Prasi90, Itmozart, Atkinson 291, Ninjagecko, SabbeRubbish, Loadmaster, Hiiiiiiiiiiiii-
iiiiiiii, Jurohi, Jafet, Ahy1, Eric Le Bigot, Flamholz, Cydebot, Max sang, Christian75, Grubbiv, Thijs!bot, OverLeg, Ablonus, Anka.213,
BMB, Plaga701, Jirka6, Magioladitis, 28421u2232nfenfcenc, David Eppstein, Inhumandecency, Kibiru, Bradgib, Andre.holzner, Jfroelich,
Theo Mark, Cobi, STBotD, Cool 1 love, VolkovBot, JhsBot, Wingedsubmariner, Billinghurst, Rhanekom, Quietbritishjim, SieBot, Ham
Pastrami, Flyer22 Reborn, Svick, Jonlandrum, Ken123BOT, AncientPC, VanishedUser sdu9aya9fs787sads, ClueBot, Garyzx, Uncle Milty,
Bender2k14, Kukolar, Xcez-be, Addbot, Psyced, Nate Wessel, Chzz, Jasper Deng, Numbo3-bot, Konryd, Chipchap, Bluebusy, Luckas-bot,
Timeroot, KamikazeBot, DavidHarkness, AnomieBOT, Alwar.sumit, Jim1138, Burakov, ArthurBot, DannyAsher, Xqbot, Control.valve,
GrouchoBot, Лев Дубовой, Mcmlxxxi, Kxx, C7protal, Mark Renier, Wikitamino, Sae1962, Gruntler, AaronEmi, ImPerfection, Patmorin,
CobraBot, Akim Demaille, Stryder29, RjwilmsiBot, EmausBot, John of Reading, Tuankiet65, WikitanvirBot, Sergio91pt, Hari6389, Maxi-
antor, Kirelagin, Ermishin, Jaseemabid, Chris857, ClueBot NG, Manizzle23, Incompetence, Softsundude, Joel B. Lewis, Samuel Marks,
Mediator Scientiae, BG19bot, Racerdogjack, Chmarkine, Hadi Payami, PatheticCopyEditor, Hupili, ChrisGualtieri, Rarkenin, Frosty,
DJB3.14, Clevera, FenixFeather, P.t-the.g, Theemathas, Sunny1304, Tim.sebring, Ginsuloft, Azx0987, Chaticramos, KCAuXy4p, Evo-
hunz, Nbro, Sequoia 42, Ougarcia, Danmejia1, CLCStudent, Deacon Vorbis, Hellotherespellbound, Maldosari and Anonymous: 204
• Binary heap Source: https://en.wikipedia.org/wiki/Binary_heap?oldid=768914237 Contributors: Derek Ross, Taw, Shd~enwiki, B4hand,
Pit~enwiki, Nixdorf, Snoyes, Notheruser, Kragen, Kyokpae~enwiki, Dcoetzee, Dfeuer, Dysprosia, Kbk, Espertus, Fredrik, Altenmann,
DHN, Vikingstad, Tea2min, DavidCary, Laurens~enwiki, Levin, Alexf, Bryanlharris, Sam Hocevar, Andreas Kaufmann, Rich Farm-
brough, Sladen, Hydrox, Antaeus Feldspar, CanisRufus, Iron Wallaby, Liao, Wsloand, Bsdlogical, Kenyon, Oleg Alexandrov, Mahanga,
LOL, Ruud Koot, Qwertyus, Pdelong, Brighterorange, Drpaule, Platyk, VKokielov, Fresheneesz, Mdouze, Tofergregg, CiaPan, Daev,
MonoNexo, Htonl, Schellhammer, HereToHelp, Ilmari Karonen, DomQ, Theone256, Oli Filth, Nbarth, Matt77, Cybercobra, Djcmackay,
Danielcer, Ohconfucius, Doug Bell, J Crow, Catphive, Dicklyon, Inquisitus, Hu12, Velle~enwiki, Cydebot, Codetiger, Headbomb, Win-
Bot, Kba, Alfchung~enwiki, JAnDbot, MSBOT, R27182818, Magioladitis, Seshu pv, Jessicapierce, Japo, David Eppstein, Scott tucker,
Pgn674, Applegrew, Foober, Phishman3579, Funandtrvl, Rozmichelle, Vektor330, Tdeoras, Nuttycoconut, Lourakis, Ctxppc, Cpflames,
Anchor Link Bot, ClueBot, Miquelmartin, Jaded-view, Kukolar, Amossin, XLinkBot, Addbot, Bluebusy, Luckas-bot, Yobot, Amirobot,
Davidshen84, AnomieBOT, DemocraticLuntz, Jim1138, Baliame, Xqbot, Surturpain, Smk65536, GrouchoBot, Speakus, Okras, Fres-
coBot, Tom.Reding, Trappist the monk, Indy256, Patmorin, Duoduoduo, Loftpo, Tim-J.Swan, Superlaza, JosephCatrambone, Racerx11,
Dcirovic, Chris857, EdoBot, Dakaminski, Rezabot, Ciro.santilli, O12, Helpful Pixie Bot, BG19bot, Crocodilesareforwimps, Chmarkine,
MiquelMartin, IgushevEdward, Harsh 2580, Drjackstraw, 22990atinesh, Msproul, Billyisyoung, Lilalas, Cbcomp, Aswincweety, Erro-
hitagg, Nbro, Missingdays, Stevenxiaoxiong, Dilettantest, Wattitude, PhilipWelch and Anonymous: 175
• D-ary heap Source: https://en.wikipedia.org/wiki/D-ary_heap?oldid=752476068 Contributors: Derek Ross, Greenrd, Phil Boswell, Rich
Farmbrough, Qwertyus, Fresheneesz, SmackBot, Shalom Yechiel, Cydebot, Alaibot, David Eppstein, Skier Dude, Slemm, M2Ys4U, LeaW,
Addbot, DOI bot, Yobot, Miyagawa, Citation bot 1, JanniePieters, DrilBot, Dude1818, RjwilmsiBot, ChuispastonBot, Helpful Pixie Bot,
Fragapanagos, Angelababy00, Deacon Vorbis and Anonymous: 19
• Binomial heap Source: https://en.wikipedia.org/wiki/Binomial_heap?oldid=759725052 Contributors: Michael Hardy, Poor Yorick,
Dcoetzee, Dysprosia, Doradus, Maximus Rex, Cdang, Fredrik, Brona, MarkSweep, TonyW, Creidieki, Klemen Kocjancic, Martin TB,
Lemontea, Bo Lindbergh, Karlheg, Arthena, Wsloand, Oleg Alexandrov, LOL, Qwertyus, NeonMerlin, Fragglet, Fresheneesz, CiaPan,
YurikBot, Hairy Dude, Vecter, Googl, SmackBot, Theone256, Peterwhy, Yuide, Nviladkar, Stebulus, Cydebot, Marqueed, Thijs!bot,
Magioladitis, Matt.smart, Gwern, Funandtrvl, VolkovBot, Wingedsubmariner, Biscuittin, YonaBot, Volkan YAZICI, OOo.Rax, Alexbot,
Npansare, Addbot, Alquantor, Alex.mccarthy, Download, Sapeur, LinkFA-Bot, ‫ماني‬, Aham1234, Materialscientist, Vmanor, DARTH
SIDIOUS 2, Josve05a, Templatetypedef, ClueBot NG, BG19bot, Dexbot, Mark L MacDonald, Boza s6, Oleksandr Shturmov and Anony-
mous: 65
• Fibonacci heap Source: https://en.wikipedia.org/wiki/Fibonacci_heap?oldid=771211390 Contributors: Michael Hardy, Zeno Gantner,
Poor Yorick, Charles Matthews, Dcoetzee, Dysprosia, Wik, Hao2lian, Phil Boswell, Fredrik, Eliashedberg, P0nc, Brona, Creidieki,
Qutezuce, Bender235, Aquiel~enwiki, Mkorpela, Wsloand, Oleg Alexandrov, Japanese Searobin, LOL, Ruud Koot, Rjwilmsi, Ravik, Fresh-
eneesz, Antiuser, YurikBot, SmackBot, Arkitus, Droll, MrBananaGrabber, Ninjagecko, Jrouquie, Hiiiiiiiiiiiiiiiiiiiii, Vanisaac, Myasuda,
AnnedeKoning, Cydebot, Gimmetrow, Headbomb, DekuDekuplex, Jirka6, JAnDbot, David Eppstein, The Real Marauder, DerHexer, An-
dre.holzner, Adam Zivner, Yecril, Funandtrvl, Aaron Rotenberg, Wingedsubmariner, Wbrenna36, Crashie, Bporopat, Arjun024, Thw1309,
ClueBot, Gene91, Mild Bill Hiccup, Nanobear~enwiki, RobinMessage, Peatar, Kaba3, Safenner1, Addbot, LatitudeBot, Mdk wiki~enwiki,
Luckas-bot, Yobot, Vonehrenheim, AnomieBOT, Erel Segal, Citation bot, Miym, Kxx, Novamo, Arthur MILCHIOR, MorganGreen,
Pinethicket, Lars Washington, Ereiniona, EmausBot, Coliso, Wikipelli, Trimutius, Lexusuns, Templatetypedef, ClueBot NG, Softsun-
dude, O.Koslowski, BG19bot, PatheticCopyEditor, ChrisGualtieri, Martin.carames, Dexbot, Jochen Burghardt, Faizan, Alexwho314,
Theemathas, Nvmbs, Oleksandr Shturmov, Mtnorthpoplar, Aayushdhir and Anonymous: 110
• Pairing heap Source: https://en.wikipedia.org/wiki/Pairing_heap?oldid=772018183 Contributors: Phil Boswell, Pgan002, Wsloand, Ruud
Koot, Qwertyus, Quale, Drdisque, Cedar101, Sneftel, Tgdwyer, Bluebot, SAMJAM, Jrouquie, Cydebot, Alaibot, Magioladitis, David
Eppstein, Wingedsubmariner, Celique, Geoffrey.foster, Yobot, Gilo1969, Kxx, Citation bot 1, Breaddawson, Hoofinasia, Dexbot, Pintoch,
Jeff Erickson, CV9933 and Anonymous: 14
• Double-ended priority queue Source: https://en.wikipedia.org/wiki/Double-ended_priority_queue?oldid=762669115 Contributors:
Dremora, Ruud Koot, Qwertyus, Quuxplusone, Wavelength, Sneftel, Racklever, Henning Makholm, PamD, David Eppstein, Julianhyde,
AvicAWB, Templatetypedef, Shire Reeve, 0milch0, BG19bot, Ramesh Ramaiah, Vibhave, BPositive, Mark Arsten, Loriendrew, Concep-
tualizing and Anonymous: 6
• Soft heap Source: https://en.wikipedia.org/wiki/Soft_heap?oldid=766017531 Contributors: Denny, Dcoetzee, Doradus, Fredrik, Just An-
other Dan, Pgan002, Wsloand, Ruud Koot, Agthorr, Eubot, Boticario, Bondegezou, SmackBot, Bluebot, Cydebot, Alaibot, Headbomb,
Cobi, AHMartin, Bender2k14, Addbot, LilHelpa, Ita140188, Agentex, FrescoBot, Lunae and Anonymous: 13
• Binary search algorithm Source: https://en.wikipedia.org/wiki/Binary_search_algorithm?oldid=771938724 Contributors: Peter
Winnberg, Taw, Dze27, Ed Poor, LA2, M~enwiki, Hannes Hirzel, Edward, Patrick, Robert Dober, Nixdorf, Pnm, Zeno Gantner, Takuya-
Murata, Loisel, Stan Shebs, EdH, Mxn, Hashar, Charles Matthews, Dcoetzee, Fuzheado, SirJective, McKay, Pakaran, Phil Boswell,
Fredrik, Altenmann, Tea2min, Giftlite, The Cave Troll, BenFrantzDale, Mboverload, Macrakis, Pne, DevilsAdvocate, Beland, Over-
lordQ, Maximaximax, Two Bananas, Pm215, Ukexpat, Sleepyrobot, Ericamick, Bfjf, Harriv, Shlomif, ESkog, Plugwash, El C, Diomidis
Spinellis, EmilJ, Baruneju, Spoon!, BrokenSegue, Photonique, Musiphil, Alansohn, Liao, Caesura, Andrewmu, Mr flea, Gpvos, HenryLi,
Forderud, Ericl234, Nuno Tavares, Pol098, Tabletop, Palica, Gerbrant, Ryajinor, Arjarj, Zzedar, GrundyCamellia, Coemgenus, Scandum,
Quale, XP1, Ligulem, R.e.b., FlaBot, Quuxplusone, Sioux.cz, CiaPan, Chobot, DVdm, Drtom, The Rambling Man, YurikBot, Wave-
length, Stephenb, Ewx, Hv, ColdFusion650, Kcrca, Black Falcon, Googl, Nikkimaria, Zachwlewis, Cedar101, Htmnssn, Messy Thinking,
SigmaEpsilon, JLaTondre, Fsiler, SmackBot, NickyMcLean, WilliamThweatt, TestPilot, KocjoBot~enwiki, Ieopo, BiT, Gene Thomas,
Amux, J4 james, Iain.dalton, Oli Filth, Jonny Diamond, TripleF, Oylenshpeegul, Sephiroth BCR, Mlpkr, Agcala~enwiki, Doug Bell,
Breno, Beetstra, Wstomv, Mr Stephen, David Souther, TwistOfCain, Lavaka, Devourer09, Fabian Steeg~enwiki, David Cooke, Svivian,
Ironmagma, Mike Christie, Solidpoint, Verdy p, Boemanneke, Ardnew, FrancoGG, Tmdean, Heineman, AntiVandalBot, Kylemcinnes,
Seaphoto, Donbraffitt, Kdakin, JAnDbot, FactoidCow, SiobhanHansa, Magioladitis, Soulbot, Chutzpan, Allstarecho, David Eppstein, Tod-
dcs, Gwern, MartinBot, Glrx, Userabc, Trusilver, Fylwind, Dodno, WhiteOak2006, Izno, SoCalSuperEagle, Mariolj, Oshwah, Vipinhari,
Kinkydarkbird, Merritt.alex, Swanyboy2, Don4of4, CanOfWorms, Dirkbb, Meters, Df747jet, Brianga, ICrann15, Scarian, Comp123,
Jan Winnicki, Psherm85, Jerryobject, Flyer22 Reborn, Joshgilkerson, Lourakis, Macy, Dillard421, Svick, Hariva, Rdhettinger, Vanishe-
dUser sdu9aya9fs787sads, ClueBot, Justin W Smith, Syhon, Garyzx, Arunsingh16, Tim32, JeffDonner, Dasboe, Predator106, Hasanadnan-
taha, Hkleinnl, Neuralwarp, XLinkBot, Muffincorp, Mitch Ames, Bob1312, Briandamgaard, NjardarBot, Balabiot, Legobot, Luckas-bot,
Yobot, MarioS, AnomieBOT, Andrewrp, 1exec1, Jim1138, Mangarah, Gankro, Materialscientist, Citation bot, Taeshadow, Lacis alfredo,
Melmann, SPTWriter, Jeffrey Mall, Mononomic, Pmlineditor, Shirik, Harry0xBd, WithWhich, FrescoBot, CarminPolitano, Ninaddb, At-
lantia, Biker Biker, BigDwiki, AANaimi, Nnarasimhakaushik, Aperisic, MoreNet, Jfmantis, RjwilmsiBot, JustAHappyCamper, EmausBot,
Msswp, Robrohan, Wikipelli, Dcirovic, John Cline, Checkingfax, ChaosCon, Midas02, Staszek Lem, DOwenWilliams, L Kensington, Bill
william compton, Ranching, Peter Karlsen, Mark Martinec, TYelliot, 28bot, Rocketrod1960, Haigee2007, ClueBot NG, MelbourneStar,
Gilderien, Imjooseo, Widr, Nullzero, Sangameshh, Jk2q3jrklse, Helpful Pixie Bot, Curb Chain, Wbm1058, BG19bot, Streaver91, Su-
peramin, Rodion Gork, Lambin~enwiki, Rynishere, Chmarkine, Njanani, BattyBot, Nithin.A.P, Timothy Gu, ChrisGualtieri, Daiyuda,
Wullschj, Aj8uppal, Codethinkers, Pintoch, Lugia2453, AlwaysAngry, Jamesx12345, Nero hu, NC4PK, Mark viking, I am One of Many,
Tentinator, IRockStone, DavidLeighEllis, Pappu0007, Bloghog23, Alex.koturanov, Rulnick, Benjohnbarnes, Peturb, Dalton Quinn, Kjer-
ish, KH-1, Esquivalience, CruiserAbhi, PJ Cabral, JJMC89, BiomolecularGraphics4All, Atlantic306, JindalApoorv, Sunflower42, Ro-
hit0303, ThePlatypusofDoom, Chrissymad, Divyanshj.16, Zaffy806, Fresal, SingSighSep and Anonymous: 432
• Binary search tree Source: https://en.wikipedia.org/wiki/Binary_search_tree?oldid=772628414 Contributors: Damian Yerrick, Bryan
Derksen, Taw, Mrwojo, Spiff~enwiki, PhilipMW, Michael Hardy, Chris-martin, Nixdorf, Ixfd64, Minesweeper, Darkwind, LittleDan,
Glenn, BAxelrod, Timwi, MatrixFrog, Dcoetzee, Havardk, Dysprosia, Doradus, Maximus Rex, Phil Boswell, Fredrik, Postdlf, Bkell,
Hadal, Tea2min, Enochlau, Awu, Giftlite, DavidCary, P0nc, Ezhiki, Maximaximax, Qleem, Karl-Henner, Qiq~enwiki, Shen, An-
dreas Kaufmann, Jin~enwiki, Grunt, Kate, Oskar Sigvardsson, D6, Ilana, Kulp, ZeroOne, Damotclese, Vdm, Func, LeonardoGre-
gianin, Runner1928, Nicolasbock, HasharBot~enwiki, Alansohn, Liao, RoySmith, Rudo.Thomas, Pion, Wtmitchell, Evil Monkey,
4c27f8e656bb34703d936fc59ede9a, Oleg Alexandrov, Mindmatrix, LOL, Oliphaunt, Ruud Koot, Trevor Andersen, GregorB, Mb1000,
MrSomeone, Qwertyus, Nneonneo, Hathawayc, VKokielov, Ecb29, Mathbot, BananaLanguage, DevastatorIIC, Quuxplusone, Sketch-The-
Fox, Butros, Banaticus, Roboto de Ajvol, YurikBot, Wavelength, Personman, Michael Slone, Hyad, Taejo, Gaius Cornelius, Oni Lukos,
TheMandarin, Salrizvy, Moe Epsilon, BOT-Superzerocool, Googl, Regnaron~enwiki, Abu adam~enwiki, Chery, Cedar101, Jogers, Leonar-
doRob0t, Richardj311, WikiWizard, SmackBot, Bernard François, Gilliam, Ohnoitsjamie, Theone256, Oli Filth, Neurodivergent, DHN-
bot~enwiki, Alexsh, Garoth, Mweber~enwiki, Allan McInnes, Calbaer, NitishP, Cybercobra, Underbar dk, Hcethatsme, MegaHasher,
Breno, Nux, Tachyon77, Beetstra, Dicklyon, Hu12, Vocaro, Konnetikut, JForget, James pic, CRGreathouse, Ahy1, WeggeBot, Mikeput-
nam, TrainUnderwater, Jdm64, AntiVandalBot, Jirka6, Lanov, Huttarl, Eapache, JAnDbot, Anoopjohnson, Magioladitis, Abednigo, All-
starecho, Tomt22, Gwern, S3000, MartinBot, Anaxial, Leyo, Mike.lifeguard, Phishman3579, Skier Dude, Joshua Issac, Mgius, Kewlito,
Danadocus, Vectorpaladin13, Labalius, BotKung, One half 3544, Spadgos, MclareN212, Nerdgerl, Rdemar, Davekaminski, Rhanekom,
SieBot, YonaBot, Xpavlic4, Casted, VVVBot, Ham Pastrami, Jerryobject, Flyer22 Reborn, Swapsy, Djcollom, Svick, Anchor Link Bot,
GRHooked, Loren.wilton, Xevior, ClueBot, ChandlerMapBot, Madhan virgo, Theta4, Splttingatms, Shailen.sobhee, AgentSnoop, Onomou,
XLinkBot, WikHead, Metalmax, MrOllie, Jdurham6, Nate Wessel, LinkFA-Bot, ‫ماني‬, Matekm, Legobot, Luckas-bot, Yobot, Dimchord,
AnomieBOT, The Parting Glass, Burakov, Ivan Kuckir, Tbvdm, LilHelpa, Shashi20008, Capricorn42, SPTWriter, Doctordiehard, Wtar-
reau, Shmomuffin, Dzikasosna, Smallman12q, Kurapix, Adamuu, FrescoBot, 4get, Citation bot 1, Golle95, Aniskhan001, Frankrod44,
Cochito~enwiki, MastiBot, Thesevenseas, Sss41, Vromascanu, Shuri org, Rolpa, Jayaneethatj, Avermapub, MladenWiki, Konstantin Pest,
Akim Demaille, Cyc115, WillNess, Nils schmidt hamburg, RjwilmsiBot, Ripchip Bot, X1024, Chibby0ne, Albmedina, Your Lord and Mas-
ter, Nomen4Omen, Meng6, Wmayner, Tolly4bolly, Snehalshekatkar, Dan Wang, ClueBot NG, SteveAyre, Jms49, Frietjes, Ontariolot, Sol-
san88, Nakarumaka, BG19bot, AlanSherlock, Rafikamal, BPositive, RJK1984, Phc1, WhiteNebula, IgushevEdward, Hdanak, JingguoYao,
Yaderbh, RachulAdmas, TwoTwoHello, Frosty, Josell2, SchumacherTechnologies, Farazbhinder, Wulfskin, Embanner, Mtahmed, Jihlim,
Kaidul, Cybdestroyer, Jabanabba, Gokayhuz, Mathgorges, Jianhui67, Paul2520, Super fish2, Ryuunoshounen, Dk1027, Azx0987, KH-1,
Tshubham, HarshalVTripathi, ChaseKR, Nbro, Filip Euler, Koolnik90, K-evariste, Enzoferber, Selecsosi, DaBrown95, Jonnypurgatory,
Peterc26, Mezhaka, SimoneBrigante, Cwowo, HarshKhatore and Anonymous: 379
• Random binary tree Source: https://en.wikipedia.org/wiki/Random_binary_tree?oldid=753780752 Contributors: Michael Hardy, Cyber-
cobra, David Eppstein, Cobi, Addbot, Cardel, Gilo1969, Citation bot 1, Patmorin, RjwilmsiBot, Helpful Pixie Bot, Marcocapelle, Dsp de
and Anonymous: 7
• Tree rotation Source: https://en.wikipedia.org/wiki/Tree_rotation?oldid=750774864 Contributors: Mav, BlckKnght, B4hand, Michael
Hardy, Kragen, Dcoetzee, Dysprosia, Altenmann, Michael Devore, Leonard G., Neilc, Andreas Kaufmann, Mr Bound, Chub~enwiki,
BRW, Oleg Alexandrov, Joriki, Graham87, Qwertyus, Wizzar, Pako, Mathbot, Peterl, Abarry, Trainra, Cedar101, SmackBot, DHN-
bot~enwiki, Ramasamy, Kjkjava, Hyperionred, Thijs!bot, Headbomb, Waylonflinn, Swpb, David Eppstein, Gwern, Vegasprof, STBotD,
Skaraoke, Mtanti, SCriBu, Castorvx, Salvar, SieBot, Woblosch, Svick, Xevior, Boykobb, LaaknorBot, ‫ماني‬, Legobot, Mangarah, Lil-
Helpa, GrouchoBot, Adamuu, Citation bot 1, Britannic124, Nomen4Omen, Alexey.kudinkin, ClueBot NG, Knowledgeofthekrell, Josell2,
Explorer512, Tar-Elessar, Javier Borrego Fernandez C-512, Fmadd, AdamBignell and Anonymous: 40
• Self-balancing binary search tree Source: https://en.wikipedia.org/wiki/Self-balancing_binary_search_tree?oldid=751986508 Contrib-
utors: Michael Hardy, Angela, Dcoetzee, Dysprosia, DJ Clayworth, Noldoaran, Fredrik, Diberri, Enochlau, Wolfkeeper, Jorge Stolfi, Neilc,
Pgan002, Jacob grace, Andreas Kaufmann, Shlomif, Baluba, Mdd, Alansohn, Jeltz, ABCD, Kdau, RJFJR, Japanese Searobin, Jacobolus,
Chochopk, Qwertyus, Moskvax, Intgr, YurikBot, Light current, Plyd, Daivox, MrDrBob, Cybercobra, Jon Awbrey, Ripe, Momet, Jafet,
CRGreathouse, Cydebot, Widefox, David Eppstein, Funandtrvl, VolkovBot, Sriganeshs, Lamro, Jruderman, Plastikspork, SteveJothen,
Addbot, Bluebusy, Yobot, Larrycz, Xqbot, Drilnoth, Steaphan Greene, FrescoBot, DrilBot, ActuallyRationalThinker, EmausBot, RA0808,
Larkinzhang1993, Azuris, ClueBot NG, Andreas4965, Solomon7968, Wolfgang42, Pintoch, Josell2, Jochen Burghardt, G PViB, Ollie314
and Anonymous: 51
• Treap Source: https://en.wikipedia.org/wiki/Treap?oldid=751556289 Contributors: Edward, Poor Yorick, Jogloran, Itai, Jleedev, Eequor,
Andreas Kaufmann, Qef, Milkmandan, Saccade, Wsloand, Oleg Alexandrov, Jörg Knappen~enwiki, Ruud Koot, Hdante, Behdad, Qw-
ertyus, Arbor, Gustavb, Regnaron~enwiki, James.nvc, SmackBot, KnowledgeOfSelf, Chris the speller, Cybercobra, MegaHasher, Pfh, J.
Finkelstein, Yzt, Jsaxton86, Cydebot, Blaisorblade, Escarbot, RainbowCrane, David Eppstein, AHMartin, Bajsejohannes, Justin W Smith,
Kukolar, Hans Adler, Addbot, Luckas-bot, Yobot, Erel Segal, Rubinbot, Citation bot, Bencmq, Gilo1969, Miym, Brutaldeluxe, Cshinyee,
C02134, ICEAGE, MaxDel, Patmorin, Cdb273, MoreNet, Allforrous, ChuispastonBot, BG19bot, Chmarkine, Naxik, Lsmll and Anony-
mous: 30
• AVL tree Source: https://en.wikipedia.org/wiki/AVL_tree?oldid=772933114 Contributors: Damian Yerrick, BlckKnght, M~enwiki, Ede-
maine, FvdP, Infrogmation, Michael Hardy, Nixdorf, Minesweeper, Jll, Poor Yorick, Dcoetzee, Dysprosia, Doradus, Greenrd, Topbanana,
Noldoaran, Fredrik, Altenmann, Merovingian, Tea2min, Andrew Weintraub, Mckaysalisbury, Neilc, Pgan002, Tsemii, Andreas Kauf-
mann, Safety Cap, Mike Rosoft, Guanabot, Byrial, Pavel Vozenilek, Shlomif, Lankiveil, Rockslave, Smalljim, Geek84, Axe-Lander,
Darangho, Kjkolb, Larryv, Obradovic Goran, HasharBot~enwiki, Orimosenzon, Kdau, Docboat, Evil Monkey, Tphyahoo, RJFJR, Kenyon,
Oleg Alexandrov, LOL, Ruud Koot, Gruu, Seyen, Graham87, Qwertyus, ErikHaugen, Toby Douglass, Mikm, Alex Kapranoff, Jeff02,
Gurch, Intgr, Chobot, YurikBot, Gaius Cornelius, NawlinWiki, Astral, Dtrebbien, Kain2396, Bkil, Pnorcks, Blackllotus, Bota47, Lt-wiki-
bot, Arthur Rubin, Cedar101, KGasso, Gulliveig, Danielx, LeonardoRob0t, Paul D. Anderson, SmackBot, Apanag, InverseHypercube,
David.Mestel, KocjoBot~enwiki, Gilliam, Tsoft, DHN-bot~enwiki, ChrisMP1, Tamfang, Cybercobra, Flyingspuds, Epachamo, Philvarner,
Dcamp314, Kuru, Euchiasmus, Michael miceli, Caviare, Babbling.Brook, Dicklyon, Yksyksyks, Momet, Nysin, Jac16888, Daewoollama,
Cyhawk, ST47, Zian, Joeyadams, Msanchez1978, Eleuther, AntiVandalBot, Ste4k, Jirka6, Gökhan, JAnDbot, Leuko, Magioladitis, Anant
sogani, Avicennasis, David Eppstein, Nguyễn Hữu Dung, MartinBot, J.delanoy, Pedrito, Phishman3579, Jeepday, Michael M Clarke, Un-
washedMeme, Binnacle, Adamd1008, DorganBot, Hwbehrens, Funandtrvl, BenBac, VolkovBot, Indubitably, Mtanti, Castorvx, AlexGreat,
Uw.Antony, Enviroboy, Srivesh, SieBot, Aent, Vektor330, Flyer22 Reborn, Hello71, Svick, Mauritsmaartendejong, Denisarona, Xevior,
ClueBot, Nnemo, CounterVandalismBot, Auntof6, Kukolar, Ksulli10, Moberg, Mellerho, XLinkBot, Gnowor, Njvinod, Resper~enwiki,
DOI bot, Dawynn, Ommiy-Pangaeus~enwiki, Leszek Jańczuk, Mr.Berna, West.andrew.g, Tide rolls, Matěj Grabovský, Bluebusy, MattyIX,
Legobot, Luckas-bot, Yobot, Fx4m, II MusLiM HyBRiD II, Agrawalyogesh, AnomieBOT, Jim1138, Royote, Kingpin13, Materialscien-
tist, Xqbot, Drilnoth, Oliversisson, VladimirReshetnikov, Greg Tyler, Shmomuffin, Adamuu, Mjkoo, FrescoBot, MarkHeily, Moham-
mad ahad, Ichimonji10, Maggyero, DrilBot, Sebculture, RedBot, Trappist the monk, MladenWiki, EmausBot, Benoit fraikin, Mzruya,
Iamnitin, AvicBot, Vlad.c.manea, Nomen4Omen, Geoff55, Mnogo, Chire, Compusense, ClueBot NG, MelbourneStar, Bulldog73, Mac-
donjo, G0gogcsc300, Codingrecipes, Helpful Pixie Bot, Titodutta, BG19bot, Northamerica1000, Solomon7968, Ravitkhurana, Crh23,
Proxyma, DmitriyVilkov, Zhaofeng Li, ChrisGualtieri, Eta Aquariids, Dexbot, Kushalbiswas777, CostinulAT, Akerbos, Josell2, Jochen
Burghardt, G PViB, Elfbw, Ppkhoa, Yelnatz, Dough34, Hibbarnt, Jasonchan1994, Jpopesculian, Skr15081997, Devsathish, Aviggiano,
Eeb379, Monkbot, Teetooan, HexTree, Henryy321, Badidipedia, Dankocevski, Esquivalience, StudentOfStones, Jmonty42, NNcNannara,
Ankitagrawalvit, Mhush12, Nirbhay c, NathanBierema, Saqibwahid and Anonymous: 359
• Red–black tree Source: https://en.wikipedia.org/wiki/Red%E2%80%93black_tree?oldid=768957805 Contributors: Dreamyshade, Jz-
cool, Ghakko, FvdP, Michael Hardy, Blow~enwiki, Minesweeper, Ahoerstemeier, Cyp, Strebe, Jerome.Abela, Notheruser, Kragen, Julesd,
Ghewgill, Timwi, MatrixFrog, Dcoetzee, Dfeuer, Dysprosia, Hao2lian, Shizhao, Phil Boswell, Robbot, Fredrik, Altenmann, Hump-
back~enwiki, Jleedev, Tea2min, Enochlau, Connelly, Giftlite, Sepreece, BenFrantzDale, Brona, Dratman, Leonard G., Pgan002, Li-
Daobing, Sebbe, Karl-Henner, Andreas Kaufmann, Tristero~enwiki, Perey, Spundun, Will2k, Haxwell, Aplusbi, SickTwist, Giraffedata,
Ryan Stone, Zetawoof, Iav, Hawke666, Cjcollier, Fawcett5, Denniss, Cburnett, RJFJR, H2g2bob, Kenyon, Silverdirk, Joriki, Mindma-
trix, Merlinme, Ruud Koot, Urod, Gimboid13, Jtsiomb, Marudubshinki, Graham87, Qwertyus, OMouse, Drebs~enwiki, Rjwilmsi, Hgka-
math, ErikHaugen, Toby Douglass, SLi, FlaBot, Margosbot~enwiki, Fragglet, Jameshfisher, Kri, Loading, SGreen~enwiki, YurikBot,
Wavelength, Jengelh, Rsrikanth05, Bovineone, Sesquiannual, Jaxl, Długosz, Coderzombie, Mikeblas, Blackllotus, Schellhammer, Reg-
naron~enwiki, Ripper234, JMBucknall, Lt-wiki-bot, Abu adam~enwiki, Smilindog2000, SmackBot, Pgk, Gilliam, Thumperward, Silly rab-
bit, DHN-bot~enwiki, Sct72, Khalil Sawant, Xiteer, Cybercobra, Philvarner, TheWarlock, Alexandr.Kara, SashatoBot, Mgrand, N3bulous,
Bezenek, Caviare, Dicklyon, Otac0n, Belfry, Pqrstuv, Pranith, Supertigerman, Ahy1, Jodawi, Pmussler, Linuxrocks123, Dantiston, Sytelus,
Epbr123, Ultimus, Abloomfi, Headbomb, AntiVandalBot, Widefox, Hermel, Roleplayer, .anacondabot, Stdazi, David Eppstein, Luna-
keet, Gwern, MartinBot, Glrx, Themania, IDogbert, Madhurtanwani, Phishman3579, Warut, Smangano, Binnacle, Lukax, Potatoswatter,
KylieTastic, Bonadea, Funandtrvl, DoorsAjar, Jozue, Simoncropp, Laurier12, Bioskope, Yakov1122~enwiki, YonaBot, Sdenn, Stone628,
Stanislav Nowak~enwiki, AlanUS, Hariva, Shyammurarka, Xevior, Uncle Milty, Nanobear~enwiki, Xmarios, Karlhendrikse, Kukolar,
MiniStephan, Uniwalk, Versus22, Johnuniq, XLinkBot, Consed, C. A. Russell, Addbot, Joshhibschman, Fcp2007, AgadaUrbanit, Tide
rolls, Lightbot, Luckas-bot, Yobot, Fraggle81, AnomieBOT, Narlami, Cababunga, Maxis ftw, ChrisCPearson, Storabled, Zehntor, Tb-
vdm, Xqbot, Nishantjr, RibotBOT, Kyle Hardgrave, Adamuu, FrescoBot, AstaBOTh15, Karakak, Kmels, Banej, Userask, Hnn79, Xsanda,
Trappist the monk, Gnathan87, MladenWiki, Pellucide, Belovedeagle, Patmorin, Sreeakshay, EmausBot, John of Reading, Dem1995, Hugh
Aguilar, K6ka, Nomen4Omen, Mnogo, Awakenrz, Card Zero, Grandphuba, KYLEMONGER, Kapil.xerox, Donner60, Wikipedian to the
max, 28bot, ClueBot NG, Xjianz, Spencer greg, Wittjeff, Ontariolot, Widr, Hagoth, BG19bot, Pratyya Ghosh, Deepakabhyankar, Naxik,
Dexbot, JingguoYao, Akerbos, Epicgenius, Mimibar, Kahtar, Kojikawano, Weishi Zeng, Suelru, Monkbot, Henryy321, Spasticcodemon-
key, Aureooms, HMSLavender, Freitafr, Nbro, Demagur, Equinox, Rubydragons, Jmonty42, Codedgeass, JamesBWatson3, Frankbryce,
Mar10dejong, Asgowrisankar, Jhnam88, Cristophercalo, Linkadvitch, Taozhijiang and Anonymous: 326
• WAVL tree Source: https://en.wikipedia.org/wiki/WAVL_tree?oldid=685567411 Contributors: David Eppstein and I dream of horses
• Scapegoat tree Source: https://en.wikipedia.org/wiki/Scapegoat_tree?oldid=753128418 Contributors: FvdP, Edward, Dcoetzee, Ruakh,
Dbenbenn, Tweenk, Sam Hocevar, Andreas Kaufmann, Rich Farmbrough, Jarsyl, Aplusbi, Oleg Alexandrov, Firsfron, Slike2, Qwertyus,
Mathbot, Wknight94, SmackBot, Chris the speller, Cybercobra, MegaHasher, Vanisaac, AbsolutBildung, Thijs!bot, Robert Ullmann, The-
mania, Danadocus, Joey Parrish, WillUther, Kukolar, SteveJothen, Addbot, Yobot, Citation bot, C.hahn, Patmorin, WikitanvirBot, Hankjo,
Mnogo, ClueBot NG, AlecTaylor, Tomer adar, Theemathas, Hqztrue and Anonymous: 37
• Splay tree Source: https://en.wikipedia.org/wiki/Splay_tree?oldid=771203455 Contributors: Mav, BlckKnght, Xaonon, Christopher Ma-
han, FvdP, Edward, Michael Hardy, Nixdorf, Pnm, Drz~enwiki, Dcoetzee, Dfeuer, Dysprosia, Silvonen, Tjdw, Phil Boswell, Fredrik,
Stephan Schulz, Giftlite, Wolfkeeper, CyborgTosser, Lqs, Wiml, Gscshoyru, Urhixidur, Karl Dickman, Andreas Kaufmann, Yonkel-
tron, Rich Farmbrough, Qutezuce, Bender235, Sietse Snel, Aplusbi, Chbarts, Phdye, Tabletop, VsevolodSipakov, Graham87, Qwertyus,
Rjwilmsi, Pako, Ligulem, Jameshfisher, Fresheneesz, Wavelength, Vecter, Romanc19s, Długosz, Abu adam~enwiki, Cedar101, Terber,
HereToHelp, That Guy, From That Show!, SmackBot, Honza Záruba, Unyoyega, Apankrat, Silly rabbit, Octahedron80, Axlape, Orphan-
Bot, Cybercobra, Philvarner, Just plain Bill, Ohconfucius, MegaHasher, Vanished user 9i39j3, Lim Wei Quan, Jamie King, Dicklyon,
Freeside3, Martlau, Momet, Ahy1, VTBassMatt, Escarbot, Atavi, Coldzero1120, Eapache, KConWiki, David Eppstein, Ahmad87, Gw-
ern, HPRappaport, Foober, Phishman3579, Dodno, Funandtrvl, Anna Lincoln, Rhanekom, Zuphilip, Russelj9, Svick, AlanUS, JP.Martin-
Flatin, Nanobear~enwiki, Pointillist, Safek, Kukolar, XLinkBot, Dekart, Maverickwoo, Addbot, ‫דוד שי‬, Legobot, Yobot, Roman Mu-
nich, AnomieBOT, Erel Segal, 1exec1, Josh Guffin, Citation bot, Winniehell, Shmomuffin, Dzikasosna, FrescoBot, Snietfeld, Citation bot
1, Jwillia3, Zetifree, Sss41, MladenWiki, Sihag.deepak, Ybungalobill, Crimer, Wyverald, Const86, EmausBot, Hannan1212, Dcirovic,
SlowByte, Mnogo, P2004a, Petrb, ClueBot NG, Wiki.ajaygautam, SteveAyre, Ontariolot, Antiqueight, Vagobot, Arunshankarbk, Harijec,
HueSatLum, FokkoDriesprong, Makecat-bot, Pintoch, Arunkumar nonascii, B.pradeep143, MazinIssa, Abc00786, Lfbarba, Craftbond-
pro, Mdburns, Fabio.pakk, BethNaught, Efortanely, BenedictEggers, Admodi, Havewish, Bender the Bot, Happyspace4ever, Haleal and
Anonymous: 138
• Tango tree Source: https://en.wikipedia.org/wiki/Tango_tree?oldid=766152766 Contributors: AnonMoos, Giraffedata, RHaworth, Qw-
ertyus, Rjwilmsi, Vecter, Jengelh, Grafen, Malcolma, Rayhe, SmackBot, C.Fred, Chris the speller, Iridescent, Alaibot, Headbomb, Nick
Number, Acroterion, Nyttend, Philg88, Inomyabcs, ImageRemovalBot, Sfan00 IMG, Nathan Johnson, Jasper Deng, Yobot, AnomieBOT,
Erel Segal, Anand Oza, FrescoBot, Σ, RenamedUser01302013, Card Zero, Ontariolot, Do not want, Tango tree, DoctorKubla, Dexbot,
Faizan, Pqqwetiqe and Anonymous: 17
• Skip list Source: https://en.wikipedia.org/wiki/Skip_list?oldid=765080073 Contributors: Mrwojo, Stevenj, Charles Matthews, Dcoet-
zee, Dysprosia, Doradus, Populus, Noldoaran, Fredrik, Jrockway, Altenmann, Jorge Stolfi, Two Bananas, Andreas Kaufmann, Antaeus
Feldspar, R. S. Shaw, Davetcoleman, Nkour, Ruud Koot, Qwertyus, MarSch, Drpaule, Intgr, YurikBot, Wavelength, Pi Delport, Bovineone,
Gareth Jones, Zr2d2, Cedar101, AchimP, SmackBot, Gilliam, Chadmcdaniel, Silly rabbit, Cybercobra, Viebel, Almkglor, Laurienne Bell,
Nsfmc, CRGreathouse, Nczempin, Thijs!bot, Dougher, Bondolo, Sanchom, Magioladitis, A3nm, JaGa, STBotD, Musically ut, Funandtrvl,
VolkovBot, Rhanekom, SieBot, Ivan Štambuk, MinorContributor, Menahem.fuchs, Cereblio, OKBot, Svick, Rdhettinger, Denisarona,
PuercoPop, Gene91, Jurassicstrain, Kukolar, Resuna, Xcez-be, Braddunbar, Addbot, DOI bot, Jim10701, Luckas-bot, Yobot, Wojciech
mula, AnomieBOT, SvartMan, Citation bot, Carlsotr, Alan Dawrst, RibotBOT, FrescoBot, Jamesooders, MastiBot, Devynci, Patmorin,
EmausBot, Pet3ris, Allforrous, Jaspervdg, Overred~enwiki, ClueBot NG, Vishalvishnoi, Rpk512, BG19bot, ChrisGualtieri, Dexbot, Mark
viking, Purealtruism, Dmx2010, Monkbot, ‫דובק‬1, Mtnorthpoplar, Corka94, Deacon Vorbis and Anonymous: 116
• B-tree Source: https://en.wikipedia.org/wiki/B-tree?oldid=767638022 Contributors: Kpjas, Bryan Derksen, FvdP, Mrwojo, Spiff~enwiki,
Edward, Michael Hardy, Rp, Chadloder, Minesweeper, JWSchmidt, Ciphergoth, BAxelrod, Alaric, Charles Matthews, Dcoetzee, Dys-
prosia, Evgeni Sergeev, Greenrd, Hao2lian, Ed g2s, Tjdw, AaronSw, Carbuncle, Wtanaka, Fredrik, Altenmann, Liotier, Bkell, Dmn,
Tea2min, Giftlite, DavidCary, Uday, Wolfkeeper, Lee J Haywood, Levin, Curps, Joconnor, Ketil, Jorge Stolfi, AlistairMcMillan, Nayuki,
Neilc, Pgan002, Gdr, Cbraga, Knutux, Stephan Leclercq, Peter bertok, Andreas Kaufmann, Chmod007, Kate, Ta bu shi da yu, Slady,
Rich Farmbrough, Guanabot, Leibniz, Qutezuce, Talldean, Slike, Dpotter, Mrnaz, SickTwist, Wipe, R. S. Shaw, HasharBot~enwiki, Alan-
sohn, Anders Kaseorg, ABCD, Wtmitchell, Wsloand, MIT Trekkie, Voxadam, Postrach, Mindmatrix, Decrease789, Ruud Koot, Qwertyus,
FreplySpang, Rjwilmsi, Kinu, Strake, Sandman@llgp.org, FlaBot, Psyphen, Ysangkok, Fragglet, Joe07734, Makkuro, Fresheneesz, Kri,
Antimatter15, CiaPan, Daev, Chobot, Vyroglyph, YurikBot, Bovineone, Ethan, PrologFan, Mikeblas, EEMIV, Cedar101, LeonardoRob0t,
SmackBot, Cutter, Ssbohio, Btwied, Danyluis, Mhss, Chris the speller, Bluebot, Oli Filth, Malbrain, Stevemidgley, Cybercobra, AlyM, Jeff
Wheeler, Battamer, Ck lostsword, Zearin, Bezenek, Flying Bishop, Loadmaster, Dicklyon, P199, Inquisitus, Norm mit, Noodlez84, Lamdk,
Amniarix, FatalError, Ahy1, Aubrey Jaffer, Beeson, Cydebot, PKT, ContivityGoddess, Headbomb, I do not exist, Alfalfahotshots, AntiVan-
dalBot, Luna Santin, Widefox, Jirka6, Lfstevens, Lklundin, The Fifth Horseman, MER-C, .anacondabot, Nyq, Yakushima, David Eppstein,
Hbent, MoA)gnome, Ptheoch, CarlFeynman, Glrx, Trusilver, Altes, Phishman3579, Jy00912345, Priyank bolia, GoodPeriodGal, Dorgan-
Bot, MartinRinehart, Michael Angelkovich, VolkovBot, Oshwah, Appoose, Kovianyo, Don4of4, Dlae, Jesin, Billinghurst, Uw.Antony,
Aednichols, Joahnnes, Ham Pastrami, JCLately, Jojalozzo, Ctxppc, Dravecky, Anakin101, Hariva, Wantnot, ClueBot, Rpajares, Simon04,
Junk98df, Abrech, Kukolar, Iohannes Animosus, Doprendek, XLinkBot, Paushali, Addbot, CanadianLinuxUser, AnnaFrance, LinkFA-
Bot, Jjdawson7, Verbal, Lightbot, Krano, Teles, Twimoki, Luckas-bot, Quadrescence, Yobot, AnomieBOT, Gptelles, Materialscientist,
MorgothX, Xtremejames183, Xqbot, Nishantjr, Matttoothman, Sandeep.a.v, Merit 07, Almabot, GrouchoBot, Eddvella, January2009,
Jacosi, SirSeal, Hobsonlane, Bladefistx2, Mfwitten, Redrose64, Fgdafsdgfdsagfd, Trappist the monk, Patmorin, Hjasud, RjwilmsiBot, Ma-
chineRebel, John lindgren, DASHBot, Wkailey, John of Reading, Wout.mertens, John ch fr, Pyschobbens, Ctail, Fabriciodosanjossilva,
TomYHChan, Mnogo, NGPriest, Tuolumne0, ClueBot NG, Betzaar, Oldsharp, Widr, DanielKlein24, Bor4kip, RMcPhillip, Meurondb,
BG19bot, WinampLlama, Erik.Bjareholt, Cp3149, Andytwigg, David.moreno72, JoshuSasori, Jimw338, YFdyh-bot, Dexbot, Pintoch,
Seanhalle, Lsmll, Enock4seth, Tentinator, TheWisestOfFools, DavidLeighEllis, M Murphy1993, JaconaFrere, Skr15081997, Audreyme-
ows, Utsavullas33, Nbro, IvayloS, CAPTAIN RAJU, Grecinto, SundeepBhuvan, GreenC bot, Bender the Bot and Anonymous: 387
• B+ tree Source: https://en.wikipedia.org/wiki/B%2B_tree?oldid=771110786 Contributors: Bryan Derksen, Cherezov, Tim Starling, Pnm,
Eurleif, CesarB, Cherkash, Marc omorain, Josh Cherry, Vikreykja, Lupo, Dmn, Giftlite, Inkling, WorldsApart, Neilc, Lightst, Ar-
row~enwiki, WhiteDragon, Two Bananas, Scrool, Leibniz, Zenohockey, Nyenyec, Cmdrjameson, TheProject, Obradovic Goran, Hap-
pyvalley, Mdd, Arthena, Yamla, TZOTZIOY, Stevestrange, Knutties, Oleg Alexandrov, RHaworth, LrdChaos, LOL, Decrease789, Gre-
gorB, PhilippWeissenbacher, Ash211, Penumbra2000, Gurch, Degeberg, Intgr, Fresheneesz, Chobot, Bornhj, Encyclops, Bovineone, Capi,
Luc4~enwiki, Mikeblas, Foeckler, Snarius, Cedar101, LeonardoRob0t, Jbalint, Jsnx, Arny, DomQ, Mhss, Hongooi, Rrburke, Cyber-
cobra, Itmozart, Nat2, Leksey, Tlesher, Julthep, Cychoi, UncleDouggie, Yellowstone6, Ahy1, Unixguy, CmdrObot, Leujohn, Jwang01,
Ubuntu2, I do not exist, Nuworld, Widefox, Ste4k, JAnDbot, Txomin, CommonsDelinker, Garciada5658, Afaviram, Mfedyk, Priyank bo-
lia, Mqchen, Mrcowden, VolkovBot, OliviaGuest, Mdmkolbe, Muro de Aguas, Singaldhruv, Highlandsun, Wiae, SheffieldSteel, MRLacey,
S.Örvarr.S, SieBot, Tresiden, YonaBot, Yungoe, Amarvashishth, Mogentianae, Imachuchu, ClueBot, Kl4m, Boing! said Zebedee, Tux-
thepenguin933, SchreiberBike, Max613, Raatikka, Addbot, TutterMouse, Thunderpenguin, Favonian, AgadaUrbanit, Matěj Grabovský,
Bluebusy, Twimoki, Luckas-bot, Matthew D Dillon, Yobot, ColinTempler, Vevek, AnomieBOT, Materialscientist, LilHelpa, Nishantjr,
Makeswell, Nqzero, Ajarov, Mydimle, Pinethicket, Eddie595, Reaper Eternal, MikeDierken, Holy-foek, Kastauyra, Igor Yalovecky, Gf
uip, EmausBot, Immunize, Wout.mertens, Tommy2010, K6ka, Entalpia2, James.walmsley, Bad Romance, Fabrictramp(public), QEDK,
Ysoroka, Grundprinzip, ClueBot NG, Vedantkumar, MaximalIdeal, Anchor89, Giovanni Kock Bonetti, BG19bot, Lowercase Sigma,
Chmarkine, BattyBot, NorthernSilencer, Cyberbot II, Michaelcomella, AshishMbm2012, Perkinsb1024, EvergreenFir, Alexjlockwood,
Graham477, Andylamp, Cowprophet, Kaartic, Ngkaho1234, Shubh-i sparkx, GreenC bot, Victor.scherbakov, Kushgrover, Linfeng371
and Anonymous: 250
• Trie Source: https://en.wikipedia.org/wiki/Trie?oldid=771766500 Contributors: Bryan Derksen, Taral, Bignose, Edward, Chris-martin, Rl,
Denny, Dcoetzee, Dysprosia, Evgeni Sergeev, Doradus, Fredrik, Altenmann, Mattflaschen, Tea2min, Matt Gies, Giftlite, Dbenbenn, David-
Cary, Sepreece, Wolfkeeper, Pgan002, Gdr, LiDaobing, Danny Rathjens, Teacup, Watcher, Andreas Kaufmann, Kate, Antaeus Feldspar,
BACbKA, JustinWick, Kwamikagami, Diomidis Spinellis, EmilJ, Shoujun, Giraffedata, BlueNovember, Hugowolf, CyberSkull, Diego
Moya, Loreto~enwiki, Stillnotelf, Velella, Blahedo, Runtime, Tr00st, Gmaxwell, Simetrical, MattGiuca, Gerbrant, Graham87, BD2412,
Qwertyus, Rjwilmsi, Drpaule, Sperxios, Hairy Dude, Me and, Pi Delport, Dantheox, Gaius Cornelius, Nad, Mikeblas, Danielx, TMott,
SmackBot, Slamb, Honza Záruba, InverseHypercube, Karl Stroetmann, Jim baker, BiT, Ennorehling, Eug, Chris the speller, Neurodiver-
gent, MalafayaBot, Drewnoakes, Otus, Malbrain, Kaimiddleton, Cybercobra, Leaflord, ThePianoGuy, Musashiaharon, Denshade, Edlee,
Johnny Zoo, MichaelPloujnikov, Cydebot, Electrum, Farzaneh, Bsdaemon, Deborahjay, Headbomb, Widefox, Maged918, KMeyer, Nos-
big, Deflective, Raanoo, Ned14, David Eppstein, FuzziusMaximus, Micahcowan, Francis Tyers, Pavel Fusu, 97198, Dankogai, Funandtrvl,
Bse3, Kyle the bot, Nissenbenyitskhak, Jmacglashan, C0dergirl, Sergio01, Ham Pastrami, Enrique.benimeli, Svick, AlanUS, Jludwig,
VanishedUser sdu9aya9fs787sads, Anupchowdary, Para15000, Niceguyedc, Pombredanne, JeffDonner, Estirabot, Mindstalk, Stepheng-
matthews, Johnuniq, Dscholte, XLinkBot, Dsimic, Deineka, Addbot, Cowgod14, MrOllie, Yaframa, OlEnglish, ‫ماني‬, Legobot, Luckas-bot,
Yobot, Nashn, AnomieBOT, AmritasyaPutra, Royote, Citation bot, Ivan Kuckir, Coding.mike, GrouchoBot, Modiashutosh, RibotBOT,
Shadowjams, Pauldinhqd, FrescoBot, Mostafa.vafi, X7q, Jonasbn, Citation bot 1, Chenopodiaceous, Base698, GeypycGn, Miracle Pen,
Pmdusso, Diannaa, Cutelyaware, WillNess, RjwilmsiBot, EmausBot, DanielWaterworth, Dcirovic, Bleakgadfly, Midas02, HolyCookie,
Let4time, ClueBot NG, Jbragadeesh, Adityasinghhhhhh, Atthaphong, ‫אנונימי‬17, Helpful Pixie Bot, Sangdol, Sboosali, Dvanatta, Dexbot,
Pintoch, Junkyardsparkle, Jochen Burghardt, Kirpo, Vsethuooo, RealFoxX, Averruncus, AntonDevil, Painted Fox, Ramiyam, *thing goes,
Bwegs14, Iokevins, Angelababy00, Tylerbittner, GreenC bot and Anonymous: 179
• Radix tree Source: https://en.wikipedia.org/wiki/Radix_tree?oldid=771855304 Contributors: Cwitty, Edward, CesarB, Dcoetzee,
AaronSw, Javidjamae, Gwalla, Bhyde, Andreas Kaufmann, Qutezuce, Bender235, Brim, Guy Harris, Noosphere, Daira Hopwood, Gre-
gorB, Qwertyus, Yurik, Adoniscik, Hairy Dude, Me and, Pi Delport, Dogcow, Gulliveig, Modify, C.Fred, DBeyer, Optikos, Srchvrs,
Malbrain, Frap, Cybercobra, MegaHasher, Khazar, Nausher, Makyen, Babbling.Brook, Dicklyon, Pjrm, DavidDecotigny, Ahy1, Cydebot,
Headbomb, Mortehu, Coffee2theorems, Tedickey, Rocchini, Phishman3579, Jy00912345, SparsityProblem, Burkeaj, Cobi, VolkovBot,
Jamelan, Abatishchev, Para15000, Sameemir, Arkanosis, Safek, XLinkBot, Hetori, Rgruian, Dsimic, Addbot, Ollydbg, Lightbot, Yobot,
Npgall, Drachmae, Citation bot, Kirigiri, Shy Cohen, Pauldinhqd, FrescoBot, Citation bot 1, SpmRmvBot, ICEAGE, MastiBot, Hesamwls,
Puffin, TYelliot, Helpful Pixie Bot, CitationCleanerBot, ChrisGualtieri, Saffles, Awnedion, Pintoch, Lugia2453, Crow, Simonfakir,
David.sippet, Sebivor, Warrenjharper and Anonymous: 80
• Suffix tree Source: https://en.wikipedia.org/wiki/Suffix_tree?oldid=747460668 Contributors: AxelBoldt, Michael Hardy, Delirium, Alfio,
Charles Matthews, Dcoetzee, Jogloran, Phil Boswell, Sho Uemura, Giftlite, P0nc, Sundar, Two Bananas, Andreas Kaufmann, Squash,
Kbh3rd, Bcat, Shoujun, Christian Kreibich, R. S. Shaw, Jemfinch, Blahma, Mechonbarsa, RJFJR, Wsloand, Oleg Alexandrov, Ruud Koot,
Dionyziz, Rjwilmsi, JMCorey, Ffaarr, Bgwhite, Vecter, Ru.spider, TheMandarin, Nils Grimsmo, Lt-wiki-bot, TheTaxman, Cedar101,
DmitriyV, Heavyrain2408, SmackBot, C.Fred, TripleF, Cybercobra, Ninjagecko, ThePianoGuy, Nux, Beetstra, MTSbot~enwiki, Reques-
tion, MaxEnt, Cydebot, Jleunissen, Thijs!bot, Jhclark, Headbomb, Leafman, MER-C, Sarahj2107, CobaltBlue, Johnbibby, A3nm, David
Eppstein, Bbi5291, Andre.holzner, Glrx, Doranchak, Dhruvbird, Jamelan, NVar, Xevior, ClueBot, Garyzx, Para15000, Safek, Xodarap00,
Stephengmatthews, XLinkBot, Addbot, Deselaers, DOI bot, RomanPszonka, Chamal N, Nealjc, Yobot, Npgall, Kilom691, Senvey, Cita-
tion bot, Eumolpo, Xqbot, Gilo1969, Sky Attacker, X7q, Citation bot 1, SpmRmvBot, Skyerise, Illya.havsiyevych, RedBot, Xutaodeng,
Luismsgomes, Mavam, RjwilmsiBot, 12hugo34, Grondilu, Ronni1987, EdoBot, ClueBot NG, Kasirbot, T.seppelt, Andrew Helwer, Pin-
toch, Jochen Burghardt, Cos2, Farach, Anurag.x.singh and Anonymous: 81
• Suffix array Source: https://en.wikipedia.org/wiki/Suffix_array?oldid=720145448 Contributors: Edward, Mjordan, BenRG, Kiwibird,
Tea2min, Giftlite, Mboverload, Viksit, Beland, Karol Langner, Andreas Kaufmann, MeltBanana, Malcolm rowe, Arnabdotorg, Nroets,
RJFJR, Ruud Koot, Qwertyus, Gaius Cornelius, Nils Grimsmo, Bkil, SmackBot, TripleF, Malbrain, Chris83, Thijs!bot, Headbomb,
JoaquinFerrero, Wolfgang-gerlach~enwiki, Joe Wiki, David Eppstein, Glrx, Cobi, Singleheart, Jwarhol, Garyzx, Alexbot, XLinkBot,
Addbot, EchoBlaze94, Matěj Grabovský, Yobot, AnomieBOT, Olivier Lartillot, FrescoBot, Libor Vilímek, Chuancong, Gailcarmichael,
Saketkc, ZéroBot, Dennis714, Solomon7968, T.seppelt, Andrew Helwer, ChrisGualtieri, JingguoYao, SteenthIWbot, StephanErb, Gauvain,
Cos2, Hvaara, Anurag.x.singh, Denidi and Anonymous: 51
• Suffix automaton Source: https://en.wikipedia.org/wiki/Suffix_automaton?oldid=706186025 Contributors: Qwertyus, David Eppstein,
CorenSearchBot, Tr00rle and Dexbot
• Van Emde Boas tree Source: https://en.wikipedia.org/wiki/Van_Emde_Boas_tree?oldid=771133203 Contributors: B4hand, Michael
Hardy, Kragen, Charles Matthews, Dcoetzee, Doradus, Phil Boswell, Dbenbenn, Bender235, BACbKA, Nickj, Qwertyus, Rjwilmsi, Jeff02,
Quuxplusone, Fresheneesz, Argav, Pi Delport, Cedar101, Gulliveig, SmackBot, Cybercobra, A5b, David Cooke, Neelix, Cydebot, Cy-
hawk, Snoopy67, David Eppstein, Panarchy, Brvman, Dangercrow, Svick, Adrianwn, Kaba3, Addbot, Lightbot, Luckas-bot, Yobot, Fx4m,
Mangarah, Brutaldeluxe, AMWJ, Patmorin, Gailcarmichael, EmausBot, ClueBot NG, Jackrae, ElhamKhodaee, MatthewIreland, Dexbot,
Theemathas, RandomSort, Peter238, Cewbot, Knife-in-the-drawer, Nutka7, Puma314 and Anonymous: 25
• Fusion tree Source: https://en.wikipedia.org/wiki/Fusion_tree?oldid=736555935 Contributors: Edemaine, CesarB, Charles Matthews,
Dcoetzee, ZeroOne, Oleg Alexandrov, SmackBot, Cybercobra, Cydebot, Alaibot, Nick Number, David Eppstein, Lamro, Czcollier, Dec-
oratrix, Gmharhar, Vladfi, Comp.arch and Anonymous: 7

8.2 Images
• File:8bit-dynamiclist_(reversed).gif Source: https://upload.wikimedia.org/wikipedia/commons/c/cc/8bit-dynamiclist_%28reversed%
29.gif License: CC-BY-SA-3.0 Contributors: This file was derived from: 8bit-dynamiclist.gif
Original artist: Seahen, User:Rezonansowy
• File:AVL-double-rl_K.svg Source: https://upload.wikimedia.org/wikipedia/commons/f/f9/AVL-double-rl_K.svg License: CC BY-SA
4.0 Contributors: This vector image was created with Inkscape. Original artist: Nomen4Omen
• File:AVL-simple-left_K.svg Source: https://upload.wikimedia.org/wikipedia/commons/7/76/AVL-simple-left_K.svg License: CC BY-
SA 4.0 Contributors: This vector image was created with Inkscape. Original artist: Nomen4Omen
• File:AVL-tree-delete.svg Source: https://upload.wikimedia.org/wikipedia/commons/3/36/AVL-tree-delete.svg License: CC BY-SA 3.0
de Contributors: commons Original artist: Nomen4Omen
• File:AVL-tree-wBalance_K.svg Source: https://upload.wikimedia.org/wikipedia/commons/a/ad/AVL-tree-wBalance_K.svg License:
CC BY-SA 4.0 Contributors: This vector image was created with Inkscape. Original artist: Nomen4Omen
• File:AVLtreef.svg Source: https://upload.wikimedia.org/wikipedia/commons/0/06/AVLtreef.svg License: Public domain Contributors:
Own work Original artist: User:Mikm
• File:Ambox_important.svg Source: https://upload.wikimedia.org/wikipedia/commons/b/b4/Ambox_important.svg License: Public do-
main Contributors: Own work, based off of Image:Ambox scales.svg Original artist: Dsmurat (talk · contribs)
• File:AmortizedPush.png Source: https://upload.wikimedia.org/wikipedia/commons/e/e5/AmortizedPush.png License: CC BY-SA 4.0
Contributors: Own work Original artist: ScottDNelson
• File:An_example_of_how_to_find_a_string_in_a_Patricia_trie.png Source: https://upload.wikimedia.org/wikipedia/commons/6/
63/An_example_of_how_to_find_a_string_in_a_Patricia_trie.png License: CC BY-SA 3.0 Contributors: Microsoft Visio Original artist:
Saffles
• File:Array_of_array_storage.svg Source: https://upload.wikimedia.org/wikipedia/commons/0/01/Array_of_array_storage.svg License:
Public domain Contributors: No machine-readable source provided. Own work assumed (based on copyright claims). Original artist: No
machine-readable author provided. Dcoetzee assumed (based on copyright claims).
• File:AttenuatedBloomFilter2.png Source: https://upload.wikimedia.org/wikipedia/commons/d/d8/AttenuatedBloomFilter2.png Li-
cense: CC BY-SA 4.0 Contributors: Own work Original artist: Satokoala
• File:B-tree.svg Source: https://upload.wikimedia.org/wikipedia/commons/6/65/B-tree.svg License: CC BY-SA 3.0 Contributors: Own
work based on [1]. Original artist: CyHawk
• File:B_tree_insertion_example.png Source: https://upload.wikimedia.org/wikipedia/commons/3/33/B_tree_insertion_example.png Li-
cense: Public domain Contributors: I drew it :) Original artist: User:Maxtremus
• File:BinaryTreeRotations.svg Source: https://upload.wikimedia.org/wikipedia/commons/4/43/BinaryTreeRotations.svg License: CC
BY-SA 3.0 Contributors: Own work Original artist: Josell7
• File:Binary_Heap_with_Array_Implementation.JPG Source: https://upload.wikimedia.org/wikipedia/commons/c/c4/Binary_Heap_
with_Array_Implementation.JPG License: CC0 Contributors: I (Chris857 (talk)) created this work entirely by myself.
Original artist: Chris857 (talk)
• File:Binary_search_tree.svg Source: https://upload.wikimedia.org/wikipedia/commons/d/da/Binary_search_tree.svg License: Public
domain Contributors: No machine-readable source provided. Own work assumed (based on copyright claims). Original artist: No machine-
readable author provided. Dcoetzee assumed (based on copyright claims).
• File:Binary_tree_in_array.svg Source: https://upload.wikimedia.org/wikipedia/commons/8/86/Binary_tree_in_array.svg License: Pub-
lic domain Contributors: No machine-readable source provided. Own work assumed (based on copyright claims). Original artist: No
machine-readable author provided. Dcoetzee assumed (based on copyright claims).
• File:Binomial-heap-13.svg Source: https://upload.wikimedia.org/wikipedia/commons/6/61/Binomial-heap-13.svg License: CC-BY-
SA-3.0 Contributors: de:Bild:Binomial-heap-13.png by de:Benutzer:Koethnig Original artist: User:D0ktorz
• File:Binomial_Trees.svg Source: https://upload.wikimedia.org/wikipedia/commons/c/cf/Binomial_Trees.svg License: CC-BY-SA-3.0
Contributors: No machine-readable source provided. Own work assumed (based on copyright claims). Original artist: No machine-readable
author provided. Lemontea~commonswiki assumed (based on copyright claims).
• File:Binomial_heap_merge1.svg Source: https://upload.wikimedia.org/wikipedia/commons/9/9f/Binomial_heap_merge1.svg License:
CC-BY-SA-3.0 Contributors: Own work Original artist: Lemontea
• File:Binomial_heap_merge2.svg Source: https://upload.wikimedia.org/wikipedia/commons/e/e8/Binomial_heap_merge2.svg License:
CC-BY-SA-3.0 Contributors: Own work Original artist: Lemontea
• File:BloomFilterDisk.png Source: https://upload.wikimedia.org/wikipedia/commons/6/61/BloomFilterDisk.png License: CC BY-SA
4.0 Contributors: https://people.cs.umass.edu/~{}ramesh/Site/PUBLICATIONS.html Original artist: Ramesh K. Sitaraman
• File:Bloom_filter.svg Source: https://upload.wikimedia.org/wikipedia/commons/a/ac/Bloom_filter.svg License: Public domain Contrib-
utors: self-made, originally for a talk at WADS 2007 Original artist: David Eppstein
• File:Bloom_filter_fp_probability.svg Source: https://upload.wikimedia.org/wikipedia/commons/e/ef/Bloom_filter_fp_probability.svg
License: CC BY 3.0 Contributors: Own work Original artist: Jerz4835
• File:Bloom_filter_speed.svg Source: https://upload.wikimedia.org/wikipedia/commons/c/c4/Bloom_filter_speed.svg License: Public
domain Contributors: Transferred from en.wikipedia to Commons by RMcPhillip using CommonsHelper. Original artist: Alexmadon
at English Wikipedia
• File:Bplustree.png Source: https://upload.wikimedia.org/wikipedia/commons/3/37/Bplustree.png License: CC BY 3.0 Contributors: Own
work Original artist: Grundprinzip
• File:Bstreesearchexample.jpg Source: https://upload.wikimedia.org/wikipedia/commons/f/fa/Bstreesearchexample.jpg License: Public
domain Contributors: ? Original artist: ?
• File:CPT-LinkedLists-addingnode.svg Source: https://upload.wikimedia.org/wikipedia/commons/4/4b/
CPT-LinkedLists-addingnode.svg License: Public domain Contributors:
• Singly_linked_list_insert_after.png Original artist: Singly_linked_list_insert_after.png: Derrick Coetzee
• File:CPT-LinkedLists-deletingnode.svg Source: https://upload.wikimedia.org/wikipedia/commons/d/d4/
CPT-LinkedLists-deletingnode.svg License: Public domain Contributors:
• Singly_linked_list_delete_after.png Original artist: Singly_linked_list_delete_after.png: Derrick Coetzee
• File:Circular_Buffer_Animation.gif Source: https://upload.wikimedia.org/wikipedia/commons/f/fd/Circular_Buffer_Animation.gif
License: CC BY-SA 4.0 Contributors: Own work Original artist: MuhannadAjjan
• File:Circular_buffer.svg Source: https://upload.wikimedia.org/wikipedia/commons/b/b7/Circular_buffer.svg License: CC-BY-SA-3.0
Contributors: This vector image was created with Inkscape. Original artist: en:User:Cburnett
• File:Circular_buffer_-_6789345.svg Source: https://upload.wikimedia.org/wikipedia/commons/6/67/Circular_buffer_-_6789345.svg
License: CC-BY-SA-3.0 Contributors: This vector image was created with Inkscape. Original artist: en:User:Cburnett
• File:Circular_buffer_-_6789AB5.svg Source: https://upload.wikimedia.org/wikipedia/commons/b/ba/Circular_buffer_-_6789AB5.
svg License: CC-BY-SA-3.0 Contributors: This vector image was created with Inkscape. Original artist: en:User:Cburnett
• File:Circular_buffer_-_6789AB5_with_pointers.svg Source: https://upload.wikimedia.org/wikipedia/commons/0/05/Circular_
buffer_-_6789AB5_with_pointers.svg License: CC-BY-SA-3.0 Contributors: This vector image was created with Inkscape. Original
artist: en:User:Cburnett
• File:Circular_buffer_-_X789ABX.svg Source: https://upload.wikimedia.org/wikipedia/commons/4/43/Circular_buffer_-_X789ABX.
svg License: CC-BY-SA-3.0 Contributors: This vector image was created with Inkscape. Original artist: en:User:Cburnett
• File:Circular_buffer_-_XX123XX.svg Source: https://upload.wikimedia.org/wikipedia/commons/d/d7/Circular_buffer_-_XX123XX.
svg License: CC-BY-SA-3.0 Contributors: This vector image was created with Inkscape. Original artist: en:User:Cburnett
• File:Circular_buffer_-_XX123XX_with_pointers.svg Source: https://upload.wikimedia.org/wikipedia/commons/0/02/Circular_
buffer_-_XX123XX_with_pointers.svg License: CC-BY-SA-3.0 Contributors: This vector image was created with Inkscape. Original
artist: en:User:Cburnett
• File:Circular_buffer_-_XX1XXXX.svg Source: https://upload.wikimedia.org/wikipedia/commons/8/89/Circular_buffer_-_
XX1XXXX.svg License: CC-BY-SA-3.0 Contributors: This vector image was created with Inkscape. Original artist: en:User:Cburnett
• File:Circular_buffer_-_XXXX3XX.svg Source: https://upload.wikimedia.org/wikipedia/commons/1/11/Circular_buffer_-_
XXXX3XX.svg License: CC-BY-SA-3.0 Contributors: This vector image was created with Inkscape. Original artist: en:User:Cburnett
• File:Circular_buffer_-_empty.svg Source: https://upload.wikimedia.org/wikipedia/commons/f/f7/Circular_buffer_-_empty.svg Li-
cense: CC-BY-SA-3.0 Contributors: This vector image was created with Inkscape. Original artist: en:User:Cburnett
• File:Circularly-linked-list.svg Source: https://upload.wikimedia.org/wikipedia/commons/d/df/Circularly-linked-list.svg License: Pub-
lic domain Contributors: Own work Original artist: Lasindi
• File:Closed_Access_logo_alternative.svg Source: https://upload.wikimedia.org/wikipedia/commons/c/c1/Closed_Access_logo_
alternative.svg License: CC0 Contributors: File:Open_Access_logo_PLoS_white.svg and own modification Original artist: Jakob Voß,
influenced by original art designed at PLoS, modified by Wikipedia users Nina and Beao
• File:Commons-logo.svg Source: https://upload.wikimedia.org/wikipedia/en/4/4a/Commons-logo.svg License: PD Contributors: ? Origi-
nal artist: ?
• File:Comparison_computational_complexity.svg Source: https://upload.wikimedia.org/wikipedia/commons/7/7e/Comparison_
computational_complexity.svg License: CC BY-SA 4.0 Contributors: Own work Original artist: Cmglee
• File:Crypto_key.svg Source: https://upload.wikimedia.org/wikipedia/commons/6/65/Crypto_key.svg License: CC-BY-SA-3.0 Contribu-
tors: Own work based on image:Key-crypto-sideways.png by MisterMatt originally from English Wikipedia Original artist: MesserWoland
• File:Cryptographic_Hash_Function.svg Source: https://upload.wikimedia.org/wikipedia/commons/2/2b/Cryptographic_Hash_
Function.svg License: Public domain Contributors: Original work for Wikipedia Original artist: User:Jorge Stolfi based on
Image:Hash_function.svg by Helix84
• File:Cuckoo.svg Source: https://upload.wikimedia.org/wikipedia/commons/d/de/Cuckoo.svg License: CC BY-SA 3.0 Contributors: File:
Cuckoo.png Original artist: Rasmus Pagh
• File:Data_Queue.svg Source: https://upload.wikimedia.org/wikipedia/commons/5/52/Data_Queue.svg License: CC BY-SA 3.0
Contributors: Own work Original artist: This image was created by User:Vegpuff.
• File:Doubly-linked-list.svg Source: https://upload.wikimedia.org/wikipedia/commons/5/5e/Doubly-linked-list.svg License: Public do-
main Contributors: Own work Original artist: Lasindi
• File:Dsu_disjoint_sets_final.svg Source: https://upload.wikimedia.org/wikipedia/commons/a/ac/Dsu_disjoint_sets_final.svg License:
CC BY-SA 3.0 Contributors: Own work Original artist: 93willy
• File:Dsu_disjoint_sets_init.svg Source: https://upload.wikimedia.org/wikipedia/commons/6/67/Dsu_disjoint_sets_init.svg License: CC
BY-SA 3.0 Contributors: Own work Original artist: 93willy
• File:Dual_heap.jpg Source: https://upload.wikimedia.org/wikipedia/commons/b/b7/Dual_heap.jpg License: CC BY-SA 3.0 Contribu-
tors: Own work Original artist: Pratiklahoti8004
• File:Dynamic_array.svg Source: https://upload.wikimedia.org/wikipedia/commons/3/31/Dynamic_array.svg License: CC0 Contributors:
Own work Original artist: Dcoetzee
• File:Edit-clear.svg Source: https://upload.wikimedia.org/wikipedia/en/f/f2/Edit-clear.svg License: Public domain Contributors: The
Tango! Desktop Project. Original artist:
The people from the Tango! project. And according to the meta-data in the file, specifically: “Andreas Nilsson, and Jakub Steiner (although
minimally).”
• File:Fibonacci_heap-decreasekey.png Source: https://upload.wikimedia.org/wikipedia/commons/0/09/Fibonacci_heap-decreasekey.
png License: CC-BY-SA-3.0 Contributors: ? Original artist: ?
• File:Fibonacci_heap.png Source: https://upload.wikimedia.org/wikipedia/commons/4/45/Fibonacci_heap.png License: CC-BY-SA-3.0
Contributors: ? Original artist: ?
• File:Fibonacci_heap_extractmin1.png Source: https://upload.wikimedia.org/wikipedia/commons/5/56/Fibonacci_heap_extractmin1.
png License: CC-BY-SA-3.0 Contributors: ? Original artist: ?
• File:Fibonacci_heap_extractmin2.png Source: https://upload.wikimedia.org/wikipedia/commons/9/95/Fibonacci_heap_extractmin2.
png License: CC-BY-SA-3.0 Contributors: ? Original artist: ?
• File:Fibonacci_search.png Source: https://upload.wikimedia.org/wikipedia/commons/e/e5/Fibonacci_search.png License: CC BY-SA
4.0 Contributors: Own work Original artist: Esquivalience
• File:Folder_Hexagonal_Icon.svg Source: https://upload.wikimedia.org/wikipedia/en/4/48/Folder_Hexagonal_Icon.svg License: Cc-by-
sa-3.0 Contributors: ? Original artist: ?
• File:FusionTreeSketch.gif Source: https://upload.wikimedia.org/wikipedia/commons/8/8a/FusionTreeSketch.gif License: CC BY-SA
3.0 Contributors: Own work Original artist: Vladfi
• File:HASHTB12.svg Source: https://upload.wikimedia.org/wikipedia/commons/9/90/HASHTB12.svg License: Public domain Contribu-
tors: ? Original artist: ?
• File:Hash_table_3_1_1_0_1_0_0_SP.svg Source: https://upload.wikimedia.org/wikipedia/commons/7/7d/Hash_table_3_1_1_0_1_0_
0_SP.svg License: CC BY-SA 3.0 Contributors: Own work Original artist: Jorge Stolfi
• File:Hash_table_4_1_0_0_0_0_0_LL.svg Source: https://upload.wikimedia.org/wikipedia/commons/2/2e/Hash_table_4_1_0_0_0_0_
0_LL.svg License: Public domain Contributors: Own work Original artist: Jorge Stolfi
• File:Hash_table_4_1_1_0_0_0_0_LL.svg Source: https://upload.wikimedia.org/wikipedia/commons/7/71/Hash_table_4_1_1_0_0_0_
0_LL.svg License: Public domain Contributors: Own work Original artist: Jorge Stolfi
• File:Hash_table_4_1_1_0_0_1_0_LL.svg Source: https://upload.wikimedia.org/wikipedia/commons/5/58/Hash_table_4_1_1_0_0_1_
0_LL.svg License: Public domain Contributors: Own work Original artist: Jorge Stolfi
• File:Hash_table_5_0_1_1_1_1_0_LL.svg Source: https://upload.wikimedia.org/wikipedia/commons/d/d0/Hash_table_5_0_1_1_1_1_
0_LL.svg License: CC BY-SA 3.0 Contributors: Own work Original artist: Jorge Stolfi
• File:Hash_table_5_0_1_1_1_1_0_SP.svg Source: https://upload.wikimedia.org/wikipedia/commons/b/bf/Hash_table_5_0_1_1_1_1_
0_SP.svg License: CC BY-SA 3.0 Contributors: Own work Original artist: Jorge Stolfi
• File:Hash_table_5_0_1_1_1_1_1_LL.svg Source: https://upload.wikimedia.org/wikipedia/commons/d/d0/Hash_table_5_0_1_1_1_1_
1_LL.svg License: CC BY-SA 3.0 Contributors: Own work Original artist: Jorge Stolfi
• File:Hash_table_average_insertion_time.png Source: https://upload.wikimedia.org/wikipedia/commons/1/1c/Hash_table_average_
insertion_time.png License: Public domain Contributors: Author’s Own Work. Original artist: Derrick Coetzee (User:Dcoetzee)
• File:Heap-as-array.svg Source: https://upload.wikimedia.org/wikipedia/commons/d/d2/Heap-as-array.svg License: CC BY-SA 4.0 Con-
tributors: Own work Original artist: Maxiantor
• File:Heap_add_step1.svg Source: https://upload.wikimedia.org/wikipedia/commons/a/ac/Heap_add_step1.svg License: Public domain
Contributors: Drawn in Inkscape by Ilmari Karonen. Original artist: Ilmari Karonen
• File:Heap_add_step2.svg Source: https://upload.wikimedia.org/wikipedia/commons/1/16/Heap_add_step2.svg License: Public domain
Contributors: Drawn in Inkscape by Ilmari Karonen. Original artist: Ilmari Karonen
• File:Heap_add_step3.svg Source: https://upload.wikimedia.org/wikipedia/commons/5/51/Heap_add_step3.svg License: Public domain
Contributors: Drawn in Inkscape by Ilmari Karonen. Original artist: Ilmari Karonen
• File:Heap_delete_step0.svg Source: https://upload.wikimedia.org/wikipedia/commons/1/1c/Heap_delete_step0.svg License: Public do-
main Contributors: http://en.wikipedia.org/wiki/File:Heap_add_step1.svg Original artist: Ilmari Karonen
• File:Heap_remove_step1.svg Source: https://upload.wikimedia.org/wikipedia/commons/e/ee/Heap_remove_step1.svg License: Public
domain Contributors: Drawn in Inkscape by Ilmari Karonen. Original artist: Ilmari Karonen
• File:Heap_remove_step2.svg Source: https://upload.wikimedia.org/wikipedia/commons/2/22/Heap_remove_step2.svg License: Public
domain Contributors: Drawn in Inkscape by Ilmari Karonen. Original artist: Ilmari Karonen
• File:Hopscotch-wiki-example.gif Source: https://upload.wikimedia.org/wikipedia/en/f/fa/Hopscotch-wiki-example.gif License: CC-
BY-3.0 Contributors: ? Original artist: ?
• File:Insert_'slower'_with_a_null_node_into_a_Patricia_trie.png Source: https://upload.wikimedia.org/wikipedia/commons/8/87/
Insert_%27slower%27_with_a_null_node_into_a_Patricia_trie.png License: CC BY-SA 3.0 Contributors: Microsoft Visio Original artist:
Saffles
• File:Insert_'test'_into_a_Patricia_trie_when_'tester'_exists.png Source: https://upload.wikimedia.org/wikipedia/commons/5/5e/
Insert_%27test%27_into_a_Patricia_trie_when_%27tester%27_exists.png License: CC BY-SA 3.0 Contributors: Microsoft Visio Orig-
inal artist: Saffles
• File:Insert_'toast'_into_a_Patricia_trie_with_a_split_and_a_move.png Source: https://upload.wikimedia.org/wikipedia/commons/
e/eb/Insert_%27toast%27_into_a_Patricia_trie_with_a_split_and_a_move.png License: CC BY-SA 3.0 Contributors: Microsoft Visio
Original artist: Saffles
• File:Inserting_the_string_'water'_into_a_Patricia_trie.png Source: https://upload.wikimedia.org/wikipedia/commons/3/30/
Inserting_the_string_%27water%27_into_a_Patricia_trie.png License: CC BY-SA 3.0 Contributors: Microsoft Visio Original artist:
Saffles
• File:Inserting_the_word_'team'_into_a_Patricia_trie_with_a_split.png Source: https://upload.wikimedia.org/wikipedia/commons/
0/01/Inserting_the_word_%27team%27_into_a_Patricia_trie_with_a_split.png License: CC BY-SA 3.0 Contributors: Microsoft Visio
Original artist: Saffles
• File:Internet_map_1024.jpg Source: https://upload.wikimedia.org/wikipedia/commons/d/d2/Internet_map_1024.jpg License: CC BY
2.5 Contributors: Originally from the English Wikipedia; description page is/was here. Original artist: The Opte Project
• File:Interval_heap_depq.jpg Source: https://upload.wikimedia.org/wikipedia/commons/e/ec/Interval_heap_depq.jpg License: CC BY-
SA 3.0 Contributors: Own work Original artist: Pratiklahoti8004
• File:LampFlowchart.svg Source: https://upload.wikimedia.org/wikipedia/commons/9/91/LampFlowchart.svg License: CC-BY-SA-3.0
Contributors: vector version of Image:LampFlowchart.png Original artist: svg by Booyabazooka
• File:Leaf_correspondence.jpg Source: https://upload.wikimedia.org/wikipedia/commons/a/a7/Leaf_correspondence.jpg License: CC
BY-SA 3.0 Contributors: Own work Original artist: Pratiklahoti8004
• File:Lifo_stack.png Source: https://upload.wikimedia.org/wikipedia/commons/b/b4/Lifo_stack.png License: CC0 Contributors: Own
work Original artist: Maxtremus
• File:Linear_Probing_Deletion.png Source: https://upload.wikimedia.org/wikipedia/commons/3/38/Linear_Probing_Deletion.png Li-
cense: CC BY-SA 4.0 Contributors: Own work Original artist: Cryptic C62
• File:Lock-green.svg Source: https://upload.wikimedia.org/wikipedia/commons/6/65/Lock-green.svg License: CC0 Contributors: en:File:
Free-to-read_lock_75.svg Original artist: User:Trappist the monk
• File:Max-Heap.svg Source: https://upload.wikimedia.org/wikipedia/commons/3/38/Max-Heap.svg License: CC BY-SA 3.0 Contributors:
Own work Original artist: Ermishin
• File:Merge-arrows.svg Source: https://upload.wikimedia.org/wikipedia/commons/5/52/Merge-arrows.svg License: Public domain Con-
tributors: ? Original artist: ?
• File:Merkle-Damgard_hash_big.svg Source: https://upload.wikimedia.org/wikipedia/commons/e/ed/Merkle-Damgard_hash_big.svg
License: Public domain Contributors: No machine-readable source provided. Own work assumed (based on copyright claims). Original
artist: No machine-readable author provided. Davidgothberg assumed (based on copyright claims).
• File:Min-heap.png Source: https://upload.wikimedia.org/wikipedia/commons/6/69/Min-heap.png License: Public domain Contributors:
Transferred from en.wikipedia to Commons by LeaW. Original artist: Vikingstad at English Wikipedia
• File:Nuvola_kdict_glass.svg Source: https://upload.wikimedia.org/wikipedia/commons/1/18/Nuvola_kdict_glass.svg License: LGPL
Contributors:
• Nuvola_apps_kdict.svg Original artist: Nuvola_apps_kdict.svg: *Nuvola_apps_kdict.png: user:David_Vignoni
• File:Office-book.svg Source: https://upload.wikimedia.org/wikipedia/commons/a/a8/Office-book.svg License: Public domain Contribu-
tors: This and myself. Original artist: Chris Down/Tango project
• File:Open_Access_logo_PLoS_transparent.svg Source: https://upload.wikimedia.org/wikipedia/commons/7/77/Open_Access_logo_
PLoS_transparent.svg License: CC0 Contributors: http://www.plos.org/ Original artist: art designer at PLoS, modified by Wikipedia users
Nina, Beao, and JakobVoss
• File:Patricia_trie.svg Source: https://upload.wikimedia.org/wikipedia/commons/a/ae/Patricia_trie.svg License: CC BY 2.5 Contributors:
Own work Original artist: Claudio Rocchini
• File:People_icon.svg Source: https://upload.wikimedia.org/wikipedia/commons/3/37/People_icon.svg License: CC0 Contributors: Open-
Clipart Original artist: OpenClipart
• File:Pointer_implementation_of_a_trie.svg Source: https://upload.wikimedia.org/wikipedia/commons/5/5d/Pointer_implementation_
of_a_trie.svg License: CC BY-SA 4.0 Contributors: Own work Original artist: Qwertyus
• File:Portal-puzzle.svg Source: https://upload.wikimedia.org/wikipedia/en/f/fd/Portal-puzzle.svg License: Public domain Contributors: ?
Original artist: ?
• File:ProgramCallStack2_en.png Source: https://upload.wikimedia.org/wikipedia/commons/8/8a/ProgramCallStack2_en.png License:
Public domain Contributors: Transferred from en.wikipedia to Commons. Original artist: Agateller at English Wikipedia
• File:Question_book-new.svg Source: https://upload.wikimedia.org/wikipedia/en/9/99/Question_book-new.svg License: Cc-by-sa-3.0
Contributors:
Created from scratch in Adobe Illustrator. Based on Image:Question book.png created by User:Equazcion Original artist:
Tkgd2007
• File:Question_dropshade.png Source: https://upload.wikimedia.org/wikipedia/commons/d/dd/Question_dropshade.png License: Public
domain Contributors: Image created by JRM Original artist: JRM
• File:Red-black_tree_delete_case_2_as_svg.svg Source: https://upload.wikimedia.org/wikipedia/commons/5/5c/Red-black_tree_
delete_case_2_as_svg.svg License: CC BY-SA 3.0 Contributors: Own work Original artist: Abloomfi
• File:Red-black_tree_delete_case_3_as_svg.svg Source: https://upload.wikimedia.org/wikipedia/commons/a/a0/Red-black_tree_
delete_case_3_as_svg.svg License: CC BY-SA 3.0 Contributors: Own work Original artist: Abloomfi
• File:Red-black_tree_delete_case_4_as_svg.svg Source: https://upload.wikimedia.org/wikipedia/commons/3/3d/Red-black_tree_
delete_case_4_as_svg.svg License: CC BY-SA 3.0 Contributors: Own work Original artist: Abloomfi
• File:Red-black_tree_delete_case_5_as_svg.svg Source: https://upload.wikimedia.org/wikipedia/commons/3/36/Red-black_tree_
delete_case_5_as_svg.svg License: CC BY-SA 3.0 Contributors: Own work Original artist: Abloomfi
• File:Red-black_tree_delete_case_6_as_svg.svg Source: https://upload.wikimedia.org/wikipedia/commons/9/99/Red-black_tree_
delete_case_6_as_svg.svg License: CC BY-SA 3.0 Contributors: Own work Original artist: Abloomfi
• File:Red-black_tree_example.svg Source: https://upload.wikimedia.org/wikipedia/commons/6/66/Red-black_tree_example.svg Li-
cense: CC-BY-SA-3.0 Contributors: Own work Original artist: Cburnett
• File:Red-black_tree_example_(B-tree_analogy).svg Source: https://upload.wikimedia.org/wikipedia/commons/7/72/Red-black_tree_
example_%28B-tree_analogy%29.svg License: CC-BY-SA-3.0 Contributors: This vector image was created with Inkscape. Original artist:
fr:Utilisateur:Verdy_p
• File:Red-black_tree_insert_case_3.svg Source: https://upload.wikimedia.org/wikipedia/commons/d/d6/Red-black_tree_insert_case_
3.svg License: CC BY-SA 3.0 Contributors: Own work Original artist: Abloomfi
• File:Red-black_tree_insert_case_4.svg Source: https://upload.wikimedia.org/wikipedia/commons/8/89/Red-black_tree_insert_case_
4.svg License: CC BY-SA 3.0 Contributors: Own work Original artist: Abloomfi
• File:Red-black_tree_insert_case_5.svg Source: https://upload.wikimedia.org/wikipedia/commons/d/dc/Red-black_tree_insert_case_
5.svg License: CC BY-SA 3.0 Contributors: Own work Original artist: Abloomfi
• File:Singly-linked-list.svg Source: https://upload.wikimedia.org/wikipedia/commons/6/6d/Singly-linked-list.svg License: Public do-
main Contributors: Own work Original artist: Lasindi
• File:Skip_list.svg Source: https://upload.wikimedia.org/wikipedia/commons/8/86/Skip_list.svg License: Public domain Contributors:
Own work Original artist: Wojciech Muła
• File:Skip_list_add_element-en.gif Source: https://upload.wikimedia.org/wikipedia/commons/2/2c/Skip_list_add_element-en.gif Li-
cense: CC BY-SA 3.0 Contributors: Own work Original artist: Artyom Kalinin
• File:Splay_tree_zig.svg Source: https://upload.wikimedia.org/wikipedia/commons/2/2c/Splay_tree_zig.svg License: CC-BY-SA-3.0
Contributors:
• Zig.gif Original artist: Zig.gif: User:Regnaron
• File:Suffix_automaton.svg Source: https://upload.wikimedia.org/wikipedia/commons/f/f4/Suffix_automaton.svg License: CC BY-SA
4.0 Contributors: Own work Original artist: Qwertyus
• File:Suffix_tree_BANANA.svg Source: https://upload.wikimedia.org/wikipedia/commons/d/d2/Suffix_tree_BANANA.svg License:
Public domain Contributors: own work (largely based on PNG version by Nils Grimsmo) Original artist: Maciej Jaros (commons: Nux,
wiki-pl: Nux) (PNG version by Nils Grimsmo)
• File:Text_document_with_red_question_mark.svg Source: https://upload.wikimedia.org/wikipedia/commons/a/a4/Text_document_
with_red_question_mark.svg License: Public domain Contributors: Created by bdesham with Inkscape; based upon Text-x-generic.svg
from the Tango project. Original artist: Benjamin D. Esham (bdesham)
• File:Total_correspondence_heap.jpg Source: https://upload.wikimedia.org/wikipedia/commons/2/2e/Total_correspondence_heap.jpg
License: CC BY-SA 3.0 Contributors: Own work Original artist: Pratiklahoti8004
• File:TreapAlphaKey.svg Source: https://upload.wikimedia.org/wikipedia/commons/4/4b/TreapAlphaKey.svg License: CC0 Contribu-
tors: Own work, with labels to match bitmap version Original artist: Qef
• File:Tree_Rebalancing.gif Source: https://upload.wikimedia.org/wikipedia/commons/c/c4/Tree_Rebalancing.gif License: CC-BY-SA-
3.0 Contributors: Transferred from en.wikipedia to Commons by Common Good using CommonsHelper. Original artist: Mtanti at English
Wikipedia
• File:Tree_Rotations.gif Source: https://upload.wikimedia.org/wikipedia/commons/1/15/Tree_Rotations.gif License: CC-BY-SA-3.0
Contributors: Transferred from en.wikipedia to Commons. Original artist: Mtanti at English Wikipedia
• File:Tree_rotation.png Source: https://upload.wikimedia.org/wikipedia/commons/2/23/Tree_rotation.png License: CC-BY-SA-3.0
Contributors: EN-Wikipedia Original artist: User:Ramasamy
• File:Tree_rotation_animation_250x250.gif Source: https://upload.wikimedia.org/wikipedia/commons/3/31/Tree_rotation_animation_
250x250.gif License: CC BY-SA 4.0 Contributors: Own work Original artist: Tar-Elessar
• File:Trie_example.svg Source: https://upload.wikimedia.org/wikipedia/commons/b/be/Trie_example.svg License: Public domain Con-
tributors: own work (based on PNG image by Deco) Original artist: Booyabazooka (based on PNG image by Deco). Modifications by
Superm401.
• File:Unbalanced_binary_tree.svg Source: https://upload.wikimedia.org/wikipedia/commons/a/a9/Unbalanced_binary_tree.svg License:
Public domain Contributors: Own work Original artist: Me (Intgr)
• File:UnionFindKruskalDemo.gif Source: https://upload.wikimedia.org/wikipedia/commons/a/a3/UnionFindKruskalDemo.gif License:
CC BY-SA 4.0 Contributors: Own work Original artist: Shiyu Ji
• File:VebDiagram.svg Source: https://upload.wikimedia.org/wikipedia/commons/6/6b/VebDiagram.svg License: Public domain Contrib-
utors: Own work Original artist: Gailcarmichael
• File:Wiki_letter_w_cropped.svg Source: https://upload.wikimedia.org/wikipedia/commons/1/1c/Wiki_letter_w_cropped.svg License:
CC-BY-SA-3.0 Contributors: This file was derived from Wiki_letter_w.svg.
Original artist: Derivative work by Thumperward
• File:Wikibooks-logo-en-noslogan.svg Source: https://upload.wikimedia.org/wikipedia/commons/d/df/Wikibooks-logo-en-noslogan.
svg License: CC BY-SA 3.0 Contributors: Own work Original artist: User:Bastique, User:Ramac et al.
• File:Wikibooks-logo.svg Source: https://upload.wikimedia.org/wikipedia/commons/f/fa/Wikibooks-logo.svg License: CC BY-SA 3.0
Contributors: Own work Original artist: User:Bastique, User:Ramac et al.
• File:Wikiquote-logo.svg Source: https://upload.wikimedia.org/wikipedia/commons/f/fa/Wikiquote-logo.svg License: Public domain
Contributors: Own work Original artist: Rei-artur
• File:Wikisource-logo.svg Source: https://upload.wikimedia.org/wikipedia/commons/4/4c/Wikisource-logo.svg License: CC BY-SA 3.0
Contributors: Rei-artur Original artist: Nicholas Moreau
• File:Wikiversity-logo-Snorky.svg Source: https://upload.wikimedia.org/wikipedia/commons/1/1b/Wikiversity-logo-en.svg License:
CC BY-SA 3.0 Contributors: Own work Original artist: Snorky
• File:Wiktionary-logo-v2.svg Source: https://upload.wikimedia.org/wikipedia/commons/0/06/Wiktionary-logo-v2.svg License: CC BY-
SA 4.0 Contributors: Own work Original artist: Dan Polansky based on work currently attributed to Wikimedia Foundation but originally
created by Smurrayinchester
• File:Zigzag.gif Source: https://upload.wikimedia.org/wikipedia/commons/6/6f/Zigzag.gif License: CC-BY-SA-3.0 Contributors: ? Orig-
inal artist: ?
• File:Zigzig.gif Source: https://upload.wikimedia.org/wikipedia/commons/f/fd/Zigzig.gif License: CC-BY-SA-3.0 Contributors: ? Origi-
nal artist: ?

8.3 Content license
• Creative Commons Attribution-Share Alike 3.0
