
Spatial Indexing

SAMs

Spatial Indexing

Point Access Methods (PAMs) can index only points. What about regions?

Z-ordering and quadtrees
Use the transformation technique and a PAM
New methods: Spatial Access Methods (SAMs)
R-tree and variations

Problem

Given a collection of geometric objects (points, lines, polygons, ...), organize them on disk to answer spatial queries (range, nearest neighbor, etc.)

Transformation Technique
Map a d-dim MBR into a point: e.g., [(xmin, xmax) (ymin, ymax)] => (xmin, xmax, ymin, ymax)
Use a PAM to index the resulting 2d-dimensional points (twice the dimensionality of the MBRs)
Given a range query, map the query into the 2d-dimensional space and use the PAM to answer it
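A minimal sketch of the transformation for d = 2, assuming the MBR layout shown above. The function names are hypothetical, and a plain membership test stands in for the PAM. An MBR intersects the query exactly when xmin <= q_xmax, xmax >= q_xmin, and likewise for y, so the query becomes a box in the 4-d point space.

```python
def mbr_to_point(mbr):
    """Map a 2-d MBR ((xmin, xmax), (ymin, ymax)) to a 4-d point."""
    (xmin, xmax), (ymin, ymax) = mbr
    return (xmin, xmax, ymin, ymax)


def intersection_query_box(query):
    """Translate 'MBR intersects query' into a box in the 4-d point space."""
    (qxmin, qxmax), (qymin, qymax) = query
    inf = float("inf")
    lows = (-inf, qxmin, -inf, qymin)    # lower bounds on (xmin, xmax, ymin, ymax)
    highs = (qxmax, inf, qymax, inf)     # upper bounds on (xmin, xmax, ymin, ymax)
    return lows, highs


def point_in_box(point, box):
    """Test one transformed point against the query box (stand-in for the PAM)."""
    lows, highs = box
    return all(lo <= v <= hi for v, lo, hi in zip(point, lows, highs))


box = intersection_query_box(((2, 5), (2, 5)))
hit = point_in_box(mbr_to_point(((1, 3), (1, 3))), box)    # overlaps the query
miss = point_in_box(mbr_to_point(((6, 7), (0, 1))), box)   # lies to the right of it
```

Note that the query box is unbounded on one side in every dimension, which is one reason the transformed points can be awkward for a PAM.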

R-trees

[Guttman 84] Main idea: allow parents to overlap!

=> guaranteed 50% utilization
=> easier insertion/split algorithms
(only deal with Minimum Bounding Rectangles, MBRs)

R-trees

A multi-way external memory tree
Index nodes and data (leaf) nodes
All leaf nodes appear on the same level
Every node except the root contains between m and M entries
The root node has at least 2 entries (children), unless it is a leaf

Example

e.g., with fanout 4: group nearby rectangles into parent MBRs; each group -> disk page

[Figure: rectangles A to J grouped into four parent MBRs P1 to P4 (F=4). The root node holds the entries P1 P2 P3 P4; the leaf level holds A to J.]

R-trees - format of nodes

{(MBR; obj_ptr)} for leaf nodes


Leaf entry layout: x-low, x-high; y-low, y-high; obj ptr

[Figure: a leaf node (entries A B C) below the root entries P1 P2 P3 P4.]

R-trees - format of nodes

{(MBR; node_ptr)} for non-leaf nodes

Non-leaf entry layout: x-low, x-high; y-low, y-high; node ptr

[Figure: the root node (entries P1 P2 P3 P4) pointing to a leaf node (entries A B C).]
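The two node formats can be sketched as plain records. This is only an illustration: the class and field names below are assumptions, and a real implementation would pack the entries into a fixed-size disk page.

```python
from dataclasses import dataclass, field
from typing import Any, List, Union


@dataclass
class MBR:
    x_low: float
    x_high: float
    y_low: float
    y_high: float


@dataclass
class LeafEntry:
    mbr: MBR
    obj_ptr: Any          # identifier of the stored object (assumed: any id)


@dataclass
class NodeEntry:
    mbr: MBR
    node_ptr: "Node"      # child node whose entries this MBR covers


@dataclass
class Node:
    is_leaf: bool
    entries: List[Union[LeafEntry, NodeEntry]] = field(default_factory=list)


def cover(mbrs):
    """Smallest MBR that encloses all the given MBRs."""
    return MBR(min(m.x_low for m in mbrs), max(m.x_high for m in mbrs),
               min(m.y_low for m in mbrs), max(m.y_high for m in mbrs))


leaf = Node(True, [LeafEntry(MBR(0, 2, 0, 1), "A"), LeafEntry(MBR(1, 4, 2, 3), "B")])
root = Node(False, [NodeEntry(cover([e.mbr for e in leaf.entries]), leaf)])
```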

R-trees:Search
[Figure: search on the example tree (root entries P1 P2 P3 P4, leaf entries A to J).]

Main points:

every parent node completely covers its children
a child MBR may be covered by more than one parent; it is stored under ONLY ONE of them (i.e., no need for duplicate elimination)
a point query may follow multiple branches
everything works for any(?) dimensionality
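A minimal search sketch that illustrates the points above. The node layout (a dict with a `leaf` flag and a list of `(rect, payload)` entries, with rectangles written as `(x_low, y_low, x_high, y_high)`) is an assumption made for brevity. The two parent MBRs overlap, so a point query inside the overlap descends into both branches.

```python
def intersects(a, b):
    """True if rectangles a and b, given as (x_low, y_low, x_high, y_high), overlap."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search(node, query):
    """Return the payloads of all leaf entries whose MBR intersects the query."""
    results = []
    for rect, payload in node["entries"]:
        if intersects(rect, query):
            if node["leaf"]:
                results.append(payload)                  # payload is an object id
            else:
                results.extend(search(payload, query))   # payload is a child node
    return results


leaf1 = {"leaf": True, "entries": [((0, 0, 2, 2), "A"), ((3, 3, 5, 5), "B")]}
leaf2 = {"leaf": True, "entries": [((4, 0, 6, 2), "C"), ((4, 4, 7, 7), "D")]}
root = {"leaf": False, "entries": [((0, 0, 5, 5), leaf1), ((4, 0, 7, 7), leaf2)]}

both_branches = search(root, (4.5, 4.5, 4.5, 4.5))   # point inside both parent MBRs
one_branch = search(root, (0, 0, 1, 1))
```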

R-trees:Insertion
Insert X

[Figure: the example tree after inserting X; X appears among the leaf entries.]

Insert Y: extend the parent MBR

[Figure: the example tree after inserting Y; the parent MBR is extended to cover Y.]

R-trees:Insertion

How to find the next node to insert the new object?

Using ChooseLeaf: at each node, find the entry whose MBR needs the least enlargement to include Y. Resolve ties by choosing the entry with the smallest area.

Other methods (later)
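A short sketch of the ChooseLeaf rule, assuming nodes are dicts with a `leaf` flag and `(rect, child)` entries, and rectangles are `(x_low, y_low, x_high, y_high)` tuples; these names are illustrative only. The sort key is the pair (enlargement, area), so ties on enlargement fall through to the smaller area.

```python
def area(r):
    return (r[2] - r[0]) * (r[3] - r[1])


def union(a, b):
    """MBR of two rectangles."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def choose_subtree(entries, new_rect):
    """Pick the entry needing the least enlargement; break ties by smallest area."""
    return min(entries,
               key=lambda e: (area(union(e[0], new_rect)) - area(e[0]), area(e[0])))


def choose_leaf(node, new_rect):
    """Descend from the given node to the leaf that should receive new_rect."""
    while not node["leaf"]:
        _, node = choose_subtree(node["entries"], new_rect)
    return node


p1 = {"leaf": True, "entries": []}
p2 = {"leaf": True, "entries": []}
root = {"leaf": False, "entries": [((0, 0, 4, 4), p1), ((5, 5, 6, 6), p2)]}
target = choose_leaf(root, (4, 4, 5, 5))   # P1 would grow by 9, P2 by 3
```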

R-trees:Insertion

If the node is full, then Split. Ex.: insert W

[Figure: inserting W overflows node P1, which is split into P1 and a new node P5. The parent then holds P1 P5 P2 P3 P4 and is split as well, into Q1 and Q2.]

R-trees:Split

Split node P1: partition the MBRs into two groups.

(A1: plane sweep, until 50% of rectangles)
A2: linear split
A3: quadratic split
A4: exponential split: 2^(M-1) choices

[Figure: the rectangles of node P1.]

R-trees:Split

pick two rectangles as seeds
assign each rectangle R to the closest seed
closest: the seed whose group needs the smallest increase in area to include R

[Figure: a rectangle R lying between seed1 and seed2.]
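The assignment step can be sketched as follows, with rectangles as `(x_low, y_low, x_high, y_high)` tuples and a hypothetical `distribute` helper. This is a simplified version: it processes the rectangles in the order given and does not enforce the minimum fill m, which a full split algorithm would have to do.

```python
def area(r):
    return (r[2] - r[0]) * (r[3] - r[1])


def union(a, b):
    """MBR of two rectangles."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def distribute(seed1, seed2, others):
    """Assign each rectangle to the group whose MBR grows least in area."""
    group1, group2 = [seed1], [seed2]
    mbr1, mbr2 = seed1, seed2
    for r in others:
        grow1 = area(union(mbr1, r)) - area(mbr1)
        grow2 = area(union(mbr2, r)) - area(mbr2)
        if grow1 <= grow2:
            group1.append(r)
            mbr1 = union(mbr1, r)    # the group MBR now also covers r
        else:
            group2.append(r)
            mbr2 = union(mbr2, r)
    return group1, group2


g1, g2 = distribute((0, 0, 1, 1), (10, 10, 11, 11),
                    [(1, 1, 2, 2), (9, 9, 10, 10), (0, 2, 1, 3)])
```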

R-trees:Split

How to pick seeds:
Linear: in each dimension, find the rectangle with the highest low side and the one with the lowest high side, normalize the separations, choose the pair with the greatest normalized separation
Quadratic: for each pair E1 and E2, calculate the rectangle J = MBR(E1, E2) and d = area(J) - area(E1) - area(E2). Choose the pair with the largest d
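Both seed-picking rules can be sketched in a few lines, with rectangles as `(x_low, y_low, x_high, y_high)` tuples; the function names are hypothetical. The linear version normalizes each separation by the extent of the whole set along that dimension, and this sketch does not handle the corner case where the same rectangle is selected for both roles.

```python
from itertools import combinations


def area(r):
    return (r[2] - r[0]) * (r[3] - r[1])


def union(a, b):
    """MBR of two rectangles."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def pick_seeds_quadratic(rects):
    """Pair with the largest d = area(J) - area(E1) - area(E2); checks all pairs."""
    return max(combinations(rects, 2),
               key=lambda p: area(union(p[0], p[1])) - area(p[0]) - area(p[1]))


def pick_seeds_linear(rects):
    """Pair with the greatest normalized separation over all dimensions."""
    best = None
    for lo, hi in ((0, 2), (1, 3)):              # tuple indices of low/high side: x, then y
        highest_low = max(rects, key=lambda r: r[lo])
        lowest_high = min(rects, key=lambda r: r[hi])
        width = max(r[hi] for r in rects) - min(r[lo] for r in rects)
        sep = (highest_low[lo] - lowest_high[hi]) / width if width > 0 else 0.0
        if best is None or sep > best[0]:
            best = (sep, lowest_high, highest_low)
    return best[1], best[2]


rects = [(0, 0, 1, 1), (2, 0, 3, 1), (9, 0, 10, 1), (4, 4, 5, 5)]
quadratic_seeds = pick_seeds_quadratic(rects)
linear_seeds = pick_seeds_linear(rects)
```

On this input the two rules disagree: the quadratic rule picks the pair that wastes the most area when put together, while the linear rule picks the pair that is farthest apart along x.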

R-trees:Insertion

Use ChooseLeaf to find the leaf node in which to insert an entry E
If the leaf node is full, then Split; otherwise insert there
Propagate the split upwards, if necessary
Adjust parent nodes
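The steps above can be sketched as a recursive insert. Everything here is an assumption made for illustration: the `Node` class, the constant `M = 4`, and in particular the split, which uses the simple plane-sweep option (sort on x-low, cut at 50%) rather than the linear or quadratic split. Rectangles are `(x_low, y_low, x_high, y_high)` tuples.

```python
M = 4  # assumed maximum number of entries per node


def area(r):
    return (r[2] - r[0]) * (r[3] - r[1])


def union(a, b):
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


class Node:
    def __init__(self, leaf, entries=None):
        self.leaf = leaf
        self.entries = entries if entries is not None else []  # (rect, obj or child)

    def rect(self):
        """MBR of all entries in this node."""
        out = self.entries[0][0]
        for r, _ in self.entries[1:]:
            out = union(out, r)
        return out


def split(node):
    """Plane-sweep split: sort on x-low, move the upper half to a new sibling."""
    node.entries.sort(key=lambda e: e[0][0])
    half = len(node.entries) // 2
    sibling = Node(node.leaf, node.entries[half:])
    node.entries = node.entries[:half]
    return sibling


def insert_rec(node, rect, obj):
    """Insert below node; return a new sibling if node had to be split."""
    if node.leaf:
        node.entries.append((rect, obj))
    else:
        # ChooseLeaf step: least enlargement, ties broken by smallest area
        i = min(range(len(node.entries)),
                key=lambda k: (area(union(node.entries[k][0], rect))
                               - area(node.entries[k][0]),
                               area(node.entries[k][0])))
        child = node.entries[i][1]
        sibling = insert_rec(child, rect, obj)
        node.entries[i] = (child.rect(), child)               # adjust parent MBR
        if sibling is not None:
            node.entries.append((sibling.rect(), sibling))    # propagate the split
    return split(node) if len(node.entries) > M else None


def insert(root, rect, obj):
    """Insert into the tree and return the (possibly new) root."""
    sibling = insert_rec(root, rect, obj)
    if sibling is not None:                                   # root split: tree grows
        root = Node(False, [(root.rect(), root), (sibling.rect(), sibling)])
    return root


root = Node(True)
for i in range(30):
    root = insert(root, (i % 6, i // 6, i % 6 + 0.5, i // 6 + 0.5), i)
```

Because the tree only grows by splitting the root, all leaves stay on the same level.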

R-Trees:Deletion

Find the leaf node that contains the entry E
Remove E from this node
If underflow:
Eliminate the node by removing the node entries and the parent entry
Reinsert the orphaned (other) entries into the tree using Insert

Other method (later)

R-trees: Variations

R+-tree: do not allow overlapping, so split the objects (similar to z-values)
R*-tree: change the insertion and deletion algorithms (minimize not only area but also perimeter; forced re-insertion)
Hilbert R-tree: use the Hilbert values to insert objects into the tree
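As an illustration of a Hilbert value, the sketch below computes the position of a grid cell along the Hilbert curve using the standard bit-by-bit conversion. The function name and the choice of an n x n integer grid (n a power of 2) are assumptions; how a particular Hilbert R-tree maps a rectangle to a cell (for example via its center) is not specified in these slides.

```python
def hilbert_value(n, x, y):
    """Position of cell (x, y) along the Hilbert curve on an n x n grid.

    n must be a power of 2 and 0 <= x, y < n.
    """
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)      # quadrant order: 0, 1, 2, 3
        if ry == 0:                       # rotate/flip so the sub-curve lines up
            if rx == 1:
                x, y = n - 1 - x, n - 1 - y
            x, y = y, x
        s //= 2
    return d


# Sorting cells by Hilbert value keeps consecutive cells adjacent in the grid.
cells = [(x, y) for x in range(4) for y in range(4)]
ordered = sorted(cells, key=lambda c: hilbert_value(4, c[0], c[1]))
```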
