
Segment trees and interval trees

Lecture 11

Sergio Cabello
sergio.cabello@fmf.uni-lj.si
FMF
Univerza v Ljubljani

Includes slides by Antoine Vigneron

Sergio Cabello RC – More trees


Outline

I segment trees
• stabbing queries

• windowing problem

• rectangle intersection

• Klee’s measure problem

I interval trees
• improvement for some problems

I higher dimension



Data structure for stabbing queries

I orthogonal range searching: data is points, queries are rectangles


I stabbing problem: data is rectangles, queries are points
I in one dimension
• data: a set of n intervals

• query: report the k intervals that contain a query point q

I in R^d
• data: a set of n isothetic (axis-parallel) boxes

• query: report the k boxes that contain a query point q



Motivation

I in graphics and databases, objects are often stored according to their bounding box
[Figure: an object and its bounding box]

I query: which objects does point x belong to?


I first find objects whose bounding boxes contain x
I then perform screening



Data structure for windowing queries

I windowing queries
• data: a set of n disjoint segments in R^2

• query: report the k segments that intersect a query rectangle R
I motivation: zoom in maps



Rectangle intersection

I input: a set B of n isothetic boxes in R^2
I output: all the intersecting pairs in B^2
[Figure: five boxes b1, ..., b5]
I output: (b1, b3), (b2, b3), (b2, b4), (b3, b4)



Klee’s measure problem

I input: a set B of n isothetic boxes


I output: the area/volume of the union

I well understood in R^2 ⇒ O(n log n) time
I the union can have complexity Θ(n^2). Example?
I poorly understood in R^d for d > 2



Segment tree
I a data structure to store intervals, or segments
I supports stabbing queries
• in R^2: report the segments that intersect a query vertical line l
• in R: report the segments that contain a query point
[Figure: the segments crossed by the vertical line l are reported]
• query time: O(log n + k)
• space usage: O(n log n)
• preprocessing time: O(n log n)
Notations

I let S = {s_1, s_2, ..., s_n} be a set of segments in R
I let E be the set of the x-coordinates of the endpoints of the segments of S
I we assume general position, that is: |E| = 2n
I first sort E in increasing order
I E = {e_1 < e_2 < ... < e_{2n}}



Atomic intervals
I E splits R into 2n + 1 atomic intervals:
• [−∞, e_1]
• [e_i, e_{i+1}] for i ∈ {1, 2, ..., 2n − 1}
• [e_{2n}, ∞]
I these are the leaves of the segment tree
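
As a small illustration of the definition above, the following Python sketch lists the atomic intervals for a given set of segments. The function name `atomic_intervals` and the representation of a segment as a `(left, right)` pair are assumptions made here for illustration; they do not come from the slides.

```python
import math

def atomic_intervals(segments):
    """Return the 2n + 1 atomic intervals induced by the endpoints of `segments`.

    `segments` is a list of (left, right) pairs. General position is assumed:
    all 2n endpoint coordinates are distinct.
    """
    endpoints = sorted(x for seg in segments for x in seg)
    bounds = [-math.inf] + endpoints + [math.inf]
    # Consecutive pairs of bounds are exactly the atomic intervals.
    return list(zip(bounds, bounds[1:]))

intervals = atomic_intervals([(1, 4), (2, 6)])
```

With n = 2 segments the list has 2n + 1 = 5 entries, from (−∞, 1) to (6, ∞).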



Internal nodes
I the segment tree T is a balanced binary tree
I each internal node u with children v and v' is associated with an interval I_u = I_v ∪ I_{v'}
I an elementary interval is an interval associated with a node of T (it can be an atomic interval)
[Figure: node u with children v and v'; I_u is the union of I_v and I_{v'}]



Example
[Figure: example segment tree, root at the top]



Partitioning a segment
I let s ∈ S be a segment whose endpoints have x-coordinates e_i and e_j
I [e_i, e_j] is split into several elementary intervals
I they are chosen as close as possible to the root
I s is stored in each node associated with these elementary intervals
[Figure: segment tree over E, and the nodes that store s]



Canonical subsets

I each node u is associated with a canonical subset S(u) of segments
I let e_i < e_j be the x-coordinates of the endpoints of s ∈ S
I then s is stored in S(u) iff I_u ⊂ [e_i, e_j] and I_{parent(u)} ⊄ [e_i, e_j]
I standard segment tree: S(u) is stored as a list pointed from u
I we can also add more structure/data/pointers from u
I useful for multi-level data structures
I we will use it



Example

[Figure: example segment tree with canonical subsets, root at the top]



Answering a stabbing query

[Figure: segment tree, root at the top]



Answering a stabbing query

Algorithm ReportStabbing(u, x_l)
Input: root u of T, x-coordinate x_l of l
Output: segments in S that cross l
  if u == NULL
    then return
  output S(u), traversing the list pointed from u
  if x_l ∈ I_{u.left}
    then ReportStabbing(u.left, x_l)
  if x_l ∈ I_{u.right}
    then ReportStabbing(u.right, x_l)

I it takes O(k + log n) time: the search follows a root-to-leaf path of length O(log n), and the lists output along the way have total size k



Inserting a segment

[Figure: segment tree over E, and the nodes where s is inserted]



Insertion in a segment tree

Algorithm Insert(u, s)
Input: root u of T, segment s. Endpoints of s have x-coordinates x^- < x^+
  if I_u ⊂ [x^-, x^+]
    then insert s into the list storing S(u)
    else
      if [x^-, x^+] ∩ I_{u.left} ≠ ∅
        then Insert(u.left, s)
      if [x^-, x^+] ∩ I_{u.right} ≠ ∅
        then Insert(u.right, s)
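
The Insert and ReportStabbing procedures can be tried out with the Python sketch below. It is a minimal illustration under stated assumptions, not the implementation behind the slides: the names `Node`, `build`, `insert`, `report_stabbing` and `segment_tree` are chosen here, segments are `(x_minus, x_plus)` pairs in general position, and the query collects results in a set so that a query point equal to an endpoint (which lies in two adjacent closed intervals) does not yield duplicates.

```python
import math

class Node:
    def __init__(self, lo, hi, left=None, right=None):
        self.lo, self.hi = lo, hi            # the interval I_u = [lo, hi]
        self.left, self.right = left, right
        self.segments = []                   # canonical subset S(u)

def build(bounds, i, j):
    """Balanced tree over the atomic intervals [bounds[k], bounds[k+1]], i <= k < j."""
    if j - i == 1:
        return Node(bounds[i], bounds[j])
    mid = (i + j) // 2
    return Node(bounds[i], bounds[j], build(bounds, i, mid), build(bounds, mid, j))

def insert(u, s):
    """Store s at the highest nodes whose interval is contained in s."""
    x_minus, x_plus = s
    if x_minus <= u.lo and u.hi <= x_plus:
        u.segments.append(s)
    elif u.left is not None:
        for child in (u.left, u.right):
            # Recurse only where s overlaps the child's interval in more than a point.
            if child.lo < x_plus and x_minus < child.hi:
                insert(child, s)

def report_stabbing(u, x, out=None):
    """Collect the segments containing x while walking down from u."""
    if out is None:
        out = set()
    if u is None:
        return out
    out.update(u.segments)
    for child in (u.left, u.right):
        if child is not None and child.lo <= x <= child.hi:
            report_stabbing(child, x, out)
    return out

def segment_tree(segments):
    """Sort the endpoints, build the empty tree, then insert every segment."""
    endpoints = sorted(x for s in segments for x in s)
    bounds = [-math.inf] + endpoints + [math.inf]
    root = build(bounds, 0, len(bounds) - 1)
    for s in segments:
        insert(root, s)
    return root

segments = [(1, 6), (2, 4), (5, 9), (7, 8)]
root = segment_tree(segments)
hits = report_stabbing(root, 3)   # segments containing x = 3
```

In this example the segment (1, 6) ends up in three canonical subsets, those of the nodes with intervals [1, 2], [2, 5] and [5, 6].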



Main Property
Lemma
A segment s is stored at most twice at each level of T .
Proof.
I by contradiction
I suppose s is stored at more than 2 nodes at level i
I let u be the leftmost such node, u' be the rightmost
I let v be another node at level i containing s
[Figure: nodes u, v, u' at level i, and v.parent]
I s covers I_u, I_{u'} and everything in between, so I_{v.parent} ⊂ [x^-, x^+]
I so s cannot be stored at v
Analysis

I lemma of previous slide implies


• each segment stored in O(log n) nodes

• space usage: O(n log n)

I insertion in O(log n) time


• at most four nodes are visited at each level

I actually space usage is Θ(n log n) (example?)


I query time: O(k + log n)
I preprocessing
• sort endpoints: Θ(n log n) time

• build empty segment tree over these endpoints: O(n) time

• insert n segments into T : O(n log n) time

• overall: Θ(n log n) preprocessing time



Rectangle intersection

I input: a set B of n isothetic boxes in R^2
I output: all the intersecting pairs in B^2
I using segment trees, we give an O(n log n + k) time algorithm,
where k is the number of intersecting pairs
I this is optimal. Why?
I note: faster than our line segment intersection algorithm
I space usage: Θ(n log n) due to segment trees
I space usage is suboptimal



Two kinds of intersections
I overlap
• intersecting edges
• reduces to intersection reporting for isothetic segments
• done as exercise (first homework)

I inclusion
• we can find them using stabbing queries



Reporting overlaps

I equivalent to reporting intersecting edges


I plane sweep approach
I sweep line status: BBST containing the horizontal line segments that intersect the sweep line, by increasing y-coordinates
I each time a vertical line segment is encountered, report its intersections by range searching in the BBST
I preprocessing time: O(n log n) for sorting endpoints
I running time: O(k + n log n)
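
The sweep described above can be sketched in Python as follows. This is an illustration under stated assumptions: the function name and the tuple layouts are chosen here, and the sweep line status is a sorted Python list maintained with `bisect` rather than a balanced BST, so insertions and deletions cost O(n) in the worst case instead of O(log n). The order of events and the range search are the same as with a BBST.

```python
import bisect

def isothetic_intersections(horizontals, verticals):
    """Report intersecting (horizontal, vertical) pairs of isothetic segments.

    horizontals: tuples (x1, x2, y) with x1 <= x2.
    verticals:   tuples (x, y1, y2) with y1 <= y2.
    """
    events = []
    for h in horizontals:
        x1, x2, _ = h
        events.append((x1, 0, h))     # 0: insert h into the status
        events.append((x2, 2, h))     # 2: delete h from the status
    for v in verticals:
        events.append((v[0], 1, v))   # 1: range query for v
    # At equal x: insert first, then query, then delete, so touching counts.
    events.sort(key=lambda e: (e[0], e[1]))

    status = []   # sorted list of (y, horizontal) crossing the sweep line
    pairs = []
    for _, kind, seg in events:
        if kind == 0:
            bisect.insort(status, (seg[2], seg))
        elif kind == 2:
            status.pop(bisect.bisect_left(status, (seg[2], seg)))
        else:
            _, y1, y2 = seg
            # First entry with y >= y1, then walk while y <= y2.
            i = bisect.bisect_left(status, (y1,))
            while i < len(status) and status[i][0] <= y2:
                pairs.append((status[i][1], seg))
                i += 1
    return pairs

pairs = isothetic_intersections(
    horizontals=[(0, 4, 1), (1, 3, 3)],
    verticals=[(2, 0, 2), (5, 0, 4)],
)
```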



Reporting inclusions

I also using plane sweep: sweep a horizontal line from top to bottom
I sweep line status: the boxes that intersect the sweep line l, in a segment tree with respect to x-coordinates
• the endpoints are the x-coordinates of the horizontal edges of the boxes
• at a given time, only rectangles that intersect l are in the segment tree
• we can perform insertions and deletions in a segment tree in O(log n) time
I each time a vertex of a box is encountered, perform a stabbing query in the segment tree



Remarks

I at each step a box intersection can be reported several times
I in addition there can be overlap and vertex stabbing a box at the same time
I to obtain each intersecting pair only once, make some simple checks. How?



Stabbing queries for boxes

I in R^d, a set B of n boxes
I for a query point q find all the boxes that contain it
I we use a multi-level data structure, with a segment tree in each level
I inductive definition, induction on d
I first, we store B in a segment tree T with respect to the x_1-coordinate
I for each node u of T, associate a (d − 1)-dimensional multi-level segment tree for the boxes of S(u), with respect to (x_2, x_3, ..., x_d)



Performing queries

I search for q in T using the x_1-coordinate
I for all nodes in the search path, query recursively the (d − 1)-dimensional multi-level segment tree
I there are O(log n) such queries
I by induction on d, it follows that
• query time: O(k + log^d n)
• space usage: O(n log^d n)
• preprocessing time: O(n log^d n)

I can be slightly improved. . .
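
The induction can be written out as recurrences. This is a sketch: Q_d(n) denotes the query time without the output term k, and S_d(n) the space; both symbols are introduced here for illustration and do not appear in the slides.

```latex
\begin{align*}
Q_1(n) &= O(\log n), &
Q_d(n) &= O(\log n) + O(\log n)\cdot Q_{d-1}(n) = O(\log^d n),\\
S_1(n) &= O(n \log n), &
S_d(n) &= O(n) + \sum_{u \in T} S_{d-1}\bigl(|S(u)|\bigr) = O(n \log^d n).
\end{align*}
```

The space bound uses that each box is stored in O(log n) nodes of T, so the sizes |S(u)| sum to O(n log n), together with the inductive bound S_{d−1}(m) = O(m log^{d−1} n) for m ≤ n.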



Windowing queries

I in R^2, a set S of n disjoint segments
I for a query axis-aligned rectangle R, find all the segments intersecting R
I three types of segments intersect R:
• segments with an endpoint inside R
• segments that intersect a vertical side of R
• segments that intersect a horizontal side of R
I first type: range tree over the endpoints of the segments
I second type: multi-level data structure with segment tree
• store S in a segment tree T with respect to x-coordinate
• for each node u of T, store the segments of S(u) in a BST, sorted by their intersection with a vertical line



Windowing queries

I for segments of the second type:
• a query visits O(log n) nodes of the main tree
• the canonical subsets of those nodes are disjoint
• in each node we spend O(log n) time, plus time to report segments (1d range tree)
• each segment is reported once, because of disjointness
I each segment is reported at most twice: filter them
I for n disjoint segments:
• preprocessing: O(n log^2 n) time
• query: O(k + log^2 n) time

I where did we use that the segments are disjoint?



Klee’s measure problem
I in R^2, a set S of n axis-parallel rectangles
I compute area of the union
I solution using O(n log n) time
I sweep a vertical line l from left to right
• keep the length of l ∩ (∪S)
• events: length changes when rectangles start or stop intersecting l
• relevant values: distance between consecutive events and the length
• we compute the area to the left of l, updating it at each event
I use segment trees to maintain the length
I http://www.cgl.uwaterloo.ca/~krmoule/courses/cs760m/klee
Klee’s measure problem

I we need to maintain the length of the union of intervals under insertion and deletion of intervals
I make a segment tree (we know all endpoints in advance)
I at each node u we store
• list of S(u) (actually its cardinality is enough)
• length(u): the length of the part of I_u covered by segments stored in the subtree rooted at u
• note that length(u) only depends on the subtree rooted at u
• this allows quick updates
I length(root) is the real length we want
I insertion or deletion of an interval takes O(log n) time, recomputing length(u) at the visited nodes:
• if S(u) ≠ ∅, then length(u) = length(I_u)
• else, length(u) = length(u.left) + length(u.right) (0 at a leaf)
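
The sweep and the length(u) rule can be combined into the short Python sketch below. It is an illustration under stated assumptions rather than the slides' implementation: the function name `union_area`, the rectangle layout `(x1, y1, x2, y2)` and the array-based tree are choices made here, rectangles are assumed non-degenerate (x1 < x2 and y1 < y2), and only the cardinality of S(u) is stored in `count`.

```python
def union_area(rects):
    """Area of the union of axis-parallel rectangles (x1, y1, x2, y2)."""
    ys = sorted({y for _, y1, _, y2 in rects for y in (y1, y2)})
    index = {y: i for i, y in enumerate(ys)}
    m = len(ys) - 1                  # number of atomic y-intervals
    count = [0] * (4 * m)            # count[u] = |S(u)|
    length = [0] * (4 * m)           # length[u] = covered length inside I_u

    def update(u, lo, hi, a, b, delta):
        # Node u has I_u = [ys[lo], ys[hi]]; the interval is [ys[a], ys[b]].
        if a <= lo and hi <= b:
            count[u] += delta        # insertion (+1) or deletion (-1) at u
        else:
            mid = (lo + hi) // 2
            if a < mid:
                update(2 * u, lo, mid, a, b, delta)
            if b > mid:
                update(2 * u + 1, mid, hi, a, b, delta)
        # Recompute length(u) from S(u) and the children only.
        if count[u] > 0:
            length[u] = ys[hi] - ys[lo]
        elif hi - lo == 1:
            length[u] = 0
        else:
            length[u] = length[2 * u] + length[2 * u + 1]

    events = []
    for x1, y1, x2, y2 in rects:
        events.append((x1, 1, index[y1], index[y2]))    # rectangle starts
        events.append((x2, -1, index[y1], index[y2]))   # rectangle stops
    events.sort()

    area, prev_x = 0, None
    for x, delta, a, b in events:
        if prev_x is not None:
            # length[1] is length(root): covered length on the sweep line.
            area += length[1] * (x - prev_x)
        update(1, 0, m, a, b, delta)
        prev_x = x
    return area

area = union_area([(0, 0, 2, 2), (1, 1, 3, 3)])
```

The two 2 × 2 squares overlap in a unit square, so the area is 4 + 4 − 1 = 7.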



Klee’s measure problem

I in R^3 best known algorithm in O(n^{3/2}) time
I only lower bound: Ω(n log n)
I in R^3, recent progress for unit boxes



Interval trees

I interval trees support stabbing queries in one dimension
• query time: O(k + log n)
• preprocessing time: O(n log n)
• space: O(n)
I based on a different approach



Preliminary

I let x_med be the median of E
• S_l: segments of S that are completely to the left of x_med
• S_med: segments of S that contain x_med
• S_r: segments of S that are completely to the right of x_med
[Figure: the vertical line at x_med splits S into S_l, S_med and S_r]



Data structure

I recursive data structure
I left child of the root: interval tree storing S_l
I right child of the root: interval tree storing S_r
I at the root of the interval tree, we store S_med in two lists
• M_l is sorted according to the coordinate of the left endpoint (in increasing order)
• M_r is sorted according to the coordinate of the right endpoint (in decreasing order)



Example
[Figure: intervals s1, ..., s7; s1, s4 and s6 contain x_med]
M_l = (s4, s6, s1)
M_r = (s1, s4, s6)
Left child: interval tree on s3 and s5
Right child: interval tree on s2 and s7



Stabbing queries

I query: x_q, find the intervals that contain x_q
I if x_q < x_med then
• scan M_l in increasing order, and report segments that are stabbed. When x_q becomes smaller than the x-coordinate of the current left endpoint, stop.
• recurse on S_l
I if x_q > x_med
• analogous, but on the right side
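
The construction and the query can be sketched in Python as follows. The class name `IntervalTree`, the attribute names and the `(left, right)` representation of an interval are assumptions made here for illustration. Two details go beyond the slides: a query with x_q = x_med reports all of S_med without recursing, and the two lists are re-sorted at every node for brevity, which does not attain the O(n log n) preprocessing bound (presorting once by left and by right endpoint would).

```python
class IntervalTree:
    def __init__(self, intervals):
        """Build the tree for a non-empty list of (left, right) intervals."""
        # x_med: median of the endpoints of the intervals in this subtree.
        endpoints = sorted(x for iv in intervals for x in iv)
        self.x_med = endpoints[len(endpoints) // 2]
        s_l = [iv for iv in intervals if iv[1] < self.x_med]
        s_r = [iv for iv in intervals if iv[0] > self.x_med]
        s_med = [iv for iv in intervals if iv[0] <= self.x_med <= iv[1]]
        # S_med is stored twice: by left endpoint (increasing) and
        # by right endpoint (decreasing).
        self.m_l = sorted(s_med, key=lambda iv: iv[0])
        self.m_r = sorted(s_med, key=lambda iv: iv[1], reverse=True)
        self.left = IntervalTree(s_l) if s_l else None
        self.right = IntervalTree(s_r) if s_r else None

    def stab(self, x_q):
        """Return the intervals that contain x_q."""
        out = []
        if x_q <= self.x_med:
            for iv in self.m_l:
                if iv[0] > x_q:      # all later left endpoints are larger too
                    break
                out.append(iv)
            child = self.left if x_q < self.x_med else None
        else:
            for iv in self.m_r:
                if iv[1] < x_q:      # all later right endpoints are smaller too
                    break
                out.append(iv)
            child = self.right
        if child is not None:
            out.extend(child.stab(x_q))
        return out

tree = IntervalTree([(1, 3), (2, 8), (4, 6), (5, 9), (10, 12)])
found = tree.stab(5)
```

In this example x_med = 6 at the root, S_med holds (2, 8), (4, 6) and (5, 9), and the children store (1, 3) and (10, 12) respectively.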



Analysis

I query time
• size of the subtree divided by at least two at each level
• scanning through M_l or M_r: proportional to the number of reported intervals
• conclusion: O(k + log n) time

I space usage: O(n) (each segment is stored in two lists, and the
tree is balanced)
I preprocessing time: easy to do it in O(n log n) time
I pseudocode
