
CREATING GEO ENABLED APPLICATIONS WITH MYSQL 5.6
Also covers new MySQL 5.7 features

Alexander Rubin
Principal Consultant, Percona
About Me

Alexander Rubin, Principal Consultant, Percona


Working with MySQL for over 10 years
Started at MySQL AB, Sun Microsystems, Oracle
(MySQL Consulting)
Joined Percona in 2013

http://www.mysqlperformanceblog.com/author/alexanderrubin/
Agenda

Creating a geo-enabled application
Calculating distance between 2 points
Spatial/GIS functions in MySQL
New features in MySQL 5.6 and 5.7
Other systems: SphinxSearch
Geo-enabled applications

What should it do?


Where is the data?
How to convert and store it?
How to query it?

Geo-enabled applications: examples

What should it do?


Common tasks: Geocoding

Find coordinates for the given address



Common tasks: Reverse geocoding

Find address for the given (lat, lon)


Common task: find Point of Interest
Geo-enabled applications

Where is the data?


Free GEO Data Sources

US Zip codes/boundaries
www.census.gov
Points of Interest, Roads, etc.
www.openstreetmap.org
Free GEO Data Sources

US Zip codes/boundaries
ftp://ftp2.census.gov/geo/tiger/TIGER2013/ZCTA5/tl_2013_us_zcta510.zip
Points of Interest, Roads, etc.
http://download.geofabrik.de/north-america-latest.osm.pbf
Free GEO Data Sources

Formats

Shapefile (.shp, .shx, .dbf)
http://en.wikipedia.org/wiki/Shapefile
OSM (OpenStreetMap)
http://en.wikipedia.org/wiki/OpenStreetMap#Data_format
Need to convert to MySQL spatial format
Geo-enabled applications

How to convert?
Converting to MySQL

ZIP codes and boundaries

GDAL server has a conversion utility
Install GDAL
Conversion is easy:
$ ogr2ogr -overwrite -progress -f "MySQL" MYSQL:zcta,user=root tl_2013_us_zcta510.shp
0...10...20...30...40...50...60...70...80...90...100
done.

(ogr2ogr will create all tables)


Converting to MySQL: OpenStreetMap

OSM.PBF conversion

trac.osgeo.org/gdal/wiki/DownloadingGdalBinaries
elgis.argeo.org
launchpad.net/~ubuntugis
GDAL on Linux: version 1.10 or newer is needed (OSM driver)
Current RPMs = 1.9
Ubuntu has unstable release for v1.10
or compile from source
http://www.gdal.org/ogr/drv_osm.html
Converting to MySQL

On Ubuntu
$ apt-add-repository ppa:ubuntugis/ubuntugis-unstable
$ apt-get update && apt-get install gdal-bin
$ ogr2ogr --version
GDAL 1.10.1, released 2013/08/26
$ ogrinfo --formats|grep OSM
-> "OSM" (readonly)

$ ogr2ogr -overwrite -progress -f "MySQL" MYSQL:osm,user=root north-america-latest.osm.pbf
0...10...20...30...40...50...60...70...80...90...100
done.
Test Data to Play with

Public Amazon AMI: GIS-MySQL-Ubuntu - ami-ddfdf5b4
GDAL server installed
OSM data converted
ZIP code data converted

http://www.mysqlperformanceblog.com/2014/03/24/creating-geo-enabled-applications-mysql-5-6/
Geo-enabled applications

How to store it?

MySQL with Non-Spatial

mysql> CREATE TABLE `poi` (
`ID` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(255) DEFAULT NULL,
`lat` decimal(17,14) NOT NULL,
`lng` decimal(17,14) NOT NULL,
PRIMARY KEY (`ID`)
);
How to store boundaries/polygons, lines, etc?
Spatial Data types in MySQL

Since one of the first versions

Geometry -> (any object)
Point
LineString
Polygon / MultiPolygon
MySQL with Spatial Extensions

ogr2ogr import from OSM

mysql> CREATE TABLE `points` (
`OGR_FID` int(11) NOT NULL AUTO_INCREMENT,
`SHAPE` geometry NOT NULL,
`osm_id` text,
`name` text,
...
`other_tags` text,
UNIQUE KEY `OGR_FID` (`OGR_FID`),
SPATIAL KEY `SHAPE` (`SHAPE`)
) ENGINE=MyISAM AUTO_INCREMENT=13660668 DEFAULT CHARSET=latin1
Spatial Data types in MySQL

MyISAM
Supports both SPATIAL types and indexes
InnoDB
Supports SPATIAL types only, not indexes
SPATIAL index support was added in the labs version of MySQL 5.7
http://labs.mysql.com
MySQL with Spatial Extensions

`SHAPE` geometry

mysql> select astext(SHAPE) from points limit 1;
+-------------------------------+
| astext(SHAPE)                 |
+-------------------------------+
| POINT(-87.9101245 41.7585879) |
+-------------------------------+
1 row in set (0.00 sec)
(the output format is Well-Known Text, WKT)
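A minimal sketch of reading such a WKT point on the application side. The helper name `parse_wkt_point` is hypothetical (not from the slides), and it assumes the OSM import stores X as longitude and Y as latitude, which matches the sample row.

```python
import re

# Hypothetical helper: accepts only a simple 2D "POINT(x y)" string.
_POINT_RE = re.compile(
    r"^\s*POINT\s*\(\s*(-?\d+(?:\.\d+)?)\s+(-?\d+(?:\.\d+)?)\s*\)\s*$",
    re.IGNORECASE,
)

def parse_wkt_point(wkt):
    """Return (x, y) from a WKT POINT; here x = longitude, y = latitude."""
    match = _POINT_RE.match(wkt)
    if match is None:
        raise ValueError("not a WKT POINT: %r" % (wkt,))
    return float(match.group(1)), float(match.group(2))

# The row returned by astext(SHAPE) in the slide
lon, lat = parse_wkt_point("POINT(-87.9101245 41.7585879)")
```

Note the order: WKT lists X first, so the longitude comes before the latitude.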
MySQL with Spatial Extensions

`SHAPE` geometry

mysql> select astext(shape) from tl_2013_us_zcta510
where zcta5ce10 = '27701'\G
********************** 1. row ************************
astext(shape): POLYGON((-78.902351 35.988107,-78.902436
35.988116,-78.902597 35.98814,-78.902725
35.988147,-78.902992 35.988143,
-78.902351 35.988107))
(Polygon for ZIP = 27701, Durham, NC)
SPATIAL Support in MySQL versions

1. MySQL 3.23+: Support for MyISAM only
2. MySQL 5.0: Spatial data types in InnoDB
3. MySQL 5.6: New functions added
4. MySQL 5.7: InnoDB support

Geo-enabled applications

How to query it?

Geo-enabled applications

Distance calculation
MySQL 4.1 to 5.5
Distance on the Earth
Task: Find 10 nearby hotels
and sort by distance
What do we have:
1. Given point on Earth: Latitude, Longitude
2. Hotels table: Hotel Name, Latitude, Longitude

Question: How to calculate the distance between us and the hotel?
Spatial Functions in MySQL

MySQL < 5.6 = no distance function
Need to calculate it manually
MySQL 5.6 introduced st_distance
Planar coordinates only
Manual calculation of the distance

Some math follows


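Distance between 2 points
on Sphere: The Haversine Formula

For two points on a sphere (of radius $R$) with latitudes $\varphi_1$ and $\varphi_2$, latitude separation $\Delta\varphi = \varphi_1 - \varphi_2$, and longitude separation $\Delta\lambda$, the distance $d$ between the two points is:

$$d = 2R \arcsin\left(\sqrt{\sin^2\left(\frac{\Delta\varphi}{2}\right) + \cos\varphi_1 \cos\varphi_2 \sin^2\left(\frac{\Delta\lambda}{2}\right)}\right)$$
The Haversine Formula in MySQL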

R = earth's radius
Δlat = lat2 - lat1; Δlong = long2 - long1
a = sin²(Δlat/2) + cos(lat1) * cos(lat2) * sin²(Δlong/2)
c = 2*atan2(√a, √(1-a)); d = R*c
(angles need to be in radians)
3956 * 2 * ASIN ( SQRT (
POWER(SIN((orig.lat - dest.lat)*pi()/180 / 2), 2) +
COS(orig.lat * pi()/180) * COS(dest.lat * pi()/180) *
POWER(SIN((orig.lon - dest.lon) * pi()/180 / 2), 2) ) ) as
distance
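The same expression as a minimal Python sketch, useful for checking the SQL by hand. The function name `haversine_miles` is an illustration, not part of the slides; 3956 is the Earth radius in miles used in the SQL expression.

```python
import math

EARTH_RADIUS_MILES = 3956  # same constant as in the SQL expression

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points given in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return EARTH_RADIUS_MILES * 2 * math.asin(math.sqrt(a))

# One degree of latitude comes out at roughly 69 miles
one_degree = haversine_miles(0.0, 0.0, 1.0, 0.0)
```

With this radius one degree of latitude is about 69 miles, which is the approximation used for the bounding rectangle later in the talk.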
MySQL Query: Find Nearby Hotels

mysql> set @orig_lat=121.9763;
mysql> set @orig_lon=37.40445;
mysql> set @dist=10;
-- The magic formula:
SELECT
*,3956 * 2 * ASIN(SQRT(
POWER(SIN((@orig_lat - abs(dest.lat)) * pi()/180 / 2), 2) +
COS(@orig_lat * pi()/180 ) * COS(abs(dest.lat) * pi()/180) *
POWER(SIN((@orig_lon - dest.lon) * pi()/180 / 2), 2) ))
as distance
FROM hotels dest
having distance < @dist
ORDER BY distance limit 10\G
Find Nearby Hotels: Results
+----------------+--------+-------+--------+
| hotel_name | lat | lon | dist |
+----------------+--------+-------+--------+
| Hotel Astori.. | 122.41 | 37.79 | 0.0054 |
| Juliana Hote.. | 122.41 | 37.79 | 0.0069 |
| Orchard Gard.. | 122.41 | 37.79 | 0.0345 |
| Orchard Gard.. | 122.41 | 37.79 | 0.0345 |
...
+----------------+--------+-------+--------+
10 rows in set (4.10 sec)

4 seconds: very slow for a web query!


MySQL Explain query

mysql> Explain
select_type: SIMPLE
table: dest
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 1787219
Extra: Using filesort
1 row in set (0.00 sec)
How to speed up the query

We only need hotels within a 10-mile radius:
no need to scan the whole table
How to calculate needed coordinates

1° of latitude ~= 69 miles
1° of longitude ~= cos(latitude)*69 miles

To calculate lon and lat for the rectangle:

set lon1 = mylon-dist/abs(cos(radians(mylat))*69);
set lon2 = mylon+dist/abs(cos(radians(mylat))*69);
set lat1 = mylat-(dist/69);
set lat2 = mylat+(dist/69);
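The rectangle calculation as a small Python sketch, assuming the same 69 miles per degree approximation. The function name `bounding_box` and the return order are illustrative choices; the sample coordinates are the San Francisco airport values used later in the slides.

```python
import math

MILES_PER_DEGREE = 69.0  # approximation from the slide

def bounding_box(lat, lon, dist_miles):
    """Return (lat1, lat2, lon1, lon2) of a rectangle that encloses the search circle."""
    dlat = dist_miles / MILES_PER_DEGREE
    # A degree of longitude shrinks with cos(latitude)
    dlon = dist_miles / abs(math.cos(math.radians(lat)) * MILES_PER_DEGREE)
    return (lat - dlat, lat + dlat, lon - dlon, lon + dlon)

lat1, lat2, lon1, lon2 = bounding_box(37.615223, -122.389979, 10)
```

The rectangle is slightly larger than the circle, so rows near the corners still need the exact distance check. The formula also breaks down close to the poles, where cos(latitude) approaches zero.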
Modify the query

SELECT destination.*,
<distance formula> as
distance
FROM users destination, users origin
WHERE origin.id=userid
and destination.longitude
between lon1 and lon2
and destination.latitude
between lat1 and lat2
order by distance limit 10
Geo-enabled applications

Distance calculation
MySQL 5.6+
Spatial Data types in MySQL 5.6

ST_DISTANCE (g1, g2)

Limitations:
Planar coordinates, no SRIDs
ST_DISTANCE will not use SPATIAL index
ST_DISTANCE: all restaurants around (lat, lon)

-- POINT(-78.90423, 36.004122) is the Percona Durham HQ
mysql> SELECT name,
st_distance(SHAPE,
POINT(-78.90423, 36.004122))
as distance
FROM points WHERE other_tags like '%"amenity"=>"restaurant"%'
ORDER BY distance LIMIT 5;
+--------------+-----------------------+
| name | distance |
+--------------+-----------------------+
| The Pit | 0.004491402659514203 |
| Piedmont | 0.005275827703781463 |
| Torero's | 0.0061142069085163425 |
| Fishmonger's | 0.006185844254434481 |
| Pop's | 0.006347849340528991 |
+--------------+-----------------------+
5 rows in set (6.68 sec)
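The distances in this result are not miles. With planar coordinates, st_distance on lon/lat points returns a plain Euclidean distance in coordinate units (degrees). The sketch below compares the two measures; the helper names and the two test points, 0.01 degrees north and east of the query point, are made up for illustration.

```python
import math

def planar_distance(x1, y1, x2, y2):
    """Euclidean distance in coordinate units, i.e. degrees for lon/lat data."""
    return math.hypot(x2 - x1, y2 - y1)

def haversine_miles(lat1, lon1, lat2, lon2, radius=3956):
    """Great-circle distance in miles for points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    a = (math.sin(math.radians(lat2 - lat1) / 2) ** 2
         + math.cos(phi1) * math.cos(phi2)
         * math.sin(math.radians(lon2 - lon1) / 2) ** 2)
    return radius * 2 * math.asin(math.sqrt(a))

lon0, lat0 = -78.90423, 36.004122           # query point from the slide
north = (lon0, lat0 + 0.01)                 # hypothetical point, 0.01 degrees north
east = (lon0 + 0.01, lat0)                  # hypothetical point, 0.01 degrees east

planar_north = planar_distance(lon0, lat0, *north)
planar_east = planar_distance(lon0, lat0, *east)
miles_north = haversine_miles(lat0, lon0, north[1], north[0])
miles_east = haversine_miles(lat0, lon0, east[1], east[0])
```

Both points are the same planar distance away, but the eastern one is only about 80% as far in miles, because a degree of longitude is shorter than a degree of latitude at this latitude. Sorting by planar distance can therefore order nearby points slightly differently than sorting by real distance.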
ST_DISTANCE explain plan

id: 1
select_type: SIMPLE
table: points
type: ALL
possible_keys: NULL
key: NULL
key_len: NULL
ref: NULL
rows: 13660667
Extra: Using where; Using filesort
1 row in set (0.00 sec)
Geo-enabled applications

MySQL does not use index for ST_DISTANCE

Geo-enabled applications

Same old trick (10-mile bounding rectangle),
now with the new function


ST_DISTANCE + ST_WITHIN example

-- San Francisco International Airport
set @lat= 37.615223;
set @lon = -122.389979;
set @dist = 10;
set @rlon1 = @lon-@dist/abs(cos(radians(@lat))*69);
set @rlon2 = @lon+@dist/abs(cos(radians(@lat))*69);
set @rlat1 = @lat-(@dist/69);
set @rlat2 = @lat+(@dist/69);
MySQL ENVELOPE function

SELECT envelope(linestring(point(@rlon1, @rlat1),
point(@rlon2, @rlat2)));
ST_DISTANCE + ST_WITHIN
mysql> select
st_distance(point(@lon, @lat), shape) as distance,
name from points where st_within(shape,
envelope(linestring(point(@rlon1, @rlat1), point(@rlon2,
@rlat2)))) order by distance limit 5;
+-----------------------+-------------------------------------------+
| distance | name |
+-----------------------+-------------------------------------------+
| 0.0011611322792736568 | Terminal A |
| 0.0013233041109197517 | Burger Joint |
| 0.0013656901881431793 | Terminal G |
| 0.0015500015032373087 | San Francisco International Airport (SFO) |
| 0.0016255441550510536 | San Francisco International Airport (SFO) |
+-----------------------+-------------------------------------------+
5 rows in set (0.04 sec)
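The two-step pattern (cheap rectangle filter first, exact distance only on the survivors) can be sketched in plain Python. In MySQL the rectangle filter is what the SPATIAL index accelerates; here it is just a list scan. The function names and the sample points A, B, C are made up for illustration.

```python
import math

def haversine_miles(lat1, lon1, lat2, lon2, radius=3956):
    """Great-circle distance in miles for points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    a = (math.sin(math.radians(lat2 - lat1) / 2) ** 2
         + math.cos(phi1) * math.cos(phi2)
         * math.sin(math.radians(lon2 - lon1) / 2) ** 2)
    return radius * 2 * math.asin(math.sqrt(a))

def nearest(points, lat, lon, dist_miles, limit=5):
    """points: iterable of (name, lat, lon). Returns [(miles, name), ...] sorted by distance."""
    dlat = dist_miles / 69.0
    dlon = dist_miles / abs(math.cos(math.radians(lat)) * 69.0)
    # Step 1: rectangle filter (the part an index can answer)
    candidates = [p for p in points
                  if abs(p[1] - lat) <= dlat and abs(p[2] - lon) <= dlon]
    # Step 2: exact distance only for the few candidates left
    scored = [(haversine_miles(lat, lon, p[1], p[2]), p[0]) for p in candidates]
    return sorted(s for s in scored if s[0] <= dist_miles)[:limit]

# Made-up sample points around the airport coordinates from the slide
sample = [("B", 37.70, -122.40), ("C", 40.0, -120.0), ("A", 37.616, -122.39)]
result = nearest(sample, 37.615223, -122.389979, 10)
```

Unlike the SQL above, this sketch sorts by great-circle miles rather than planar degrees, which is a deliberate substitution.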
ST_DISTANCE + ST_WITHIN explain plan
mysql> explain select
st_distance(point(@lon, @lat), shape) as distance,
name from points where st_within(shape,
envelope(linestring(point(@rlon1, @rlat1), point(@rlon2,
@rlat2)))) order by distance limit 5\G
************************ 1. row ***************************
id: 1
select_type: SIMPLE
table: points
type: range
possible_keys: SHAPE
key: SHAPE
key_len: 34
ref: NULL
rows: 20
Extra: Using where; Using filesort
1 row in set (0.00 sec)
Geo-enabled applications

Search inside Polygon

New ST_CONTAINS function in 5.6

MySQL < 5.6
CONTAINS(), WITHIN() functions
Not exact: they use the MBR (minimum bounding rectangle)
New ST_CONTAINS function in 5.6

MySQL 5.6+
New functions with st_ prefix
ST_CONTAINS(g1, g2)
exact calculations
Uses SPATIAL index
What is the difference?

http://www.mysqlperformanceblog.com/2013/10/21/using-the-new-spatial-functions-in-mysql-5-6-for-geo-enabled-applications/
MySQL's MBR false positives

CONTAINS function (older MySQL):

mysql> select zip from postalcodes where
contains(geom, point(-122.409153, 37.77765));
+-------+
| zip   |
+-------+
| 94102 |
| 94103 |
| 94158 |
+-------+
3 rows in set (0.00 sec)

94102 and 94158 are false positives.

ST_CONTAINS (MySQL 5.6):

mysql> select zip from postalcodes where
st_contains(geom, point(-122.409153, 37.77765));
+-------+
| zip   |
+-------+
| 94103 |
+-------+
1 row in set (0.00 sec)
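Why a rectangle test gives false positives can be shown with a toy polygon. The sketch below is an illustration only, not MySQL's implementation: it compares a bounding-rectangle test with an exact ray-casting test. The triangle and the test points are made up.

```python
def mbr(polygon):
    """Minimum bounding rectangle of a list of (x, y) vertices."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)

def mbr_contains(polygon, point):
    """Rectangle-only test, the kind of check the old CONTAINS() relies on."""
    min_x, min_y, max_x, max_y = mbr(polygon)
    x, y = point
    return min_x <= x <= max_x and min_y <= y <= max_y

def polygon_contains(polygon, point):
    """Exact test by ray casting: count edge crossings of a ray going right."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge spans the ray's y coordinate
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

triangle = [(0, 0), (4, 0), (0, 4)]  # made-up polygon
outside_point = (3, 3)               # inside the rectangle, outside the triangle
inside_point = (1, 1)
```

For `outside_point` the rectangle test says yes and the exact test says no, which is the same effect as the two extra ZIP codes returned by the older function.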
Geo-enabled applications

Planar vs. GEO

Geo Polygon
(picture from sphinxsearch.com/blog/2013/07/02/geo-distances-with-sphinx/comment-page-1/)

Planar Polygon
(picture from sphinxsearch.com/blog/2013/07/02/geo-distances-with-sphinx/comment-page-1/)

Geo-enabled applications

MySQL supports only planar coordinates
Geo-enabled applications

New MySQL 5.7 features


New in MySQL 5.7

Labs.mysql.com MySQL GIS/RTREE support
SPATIAL Indexes for InnoDB
New blog post:
http://mysqlserverteam.com/why-boost-geometry-in-mysql/
InnoDB SPATIAL index in MySQL 5.7
mysql> select version();
+------------------+
| version() |
+------------------+
| 5.7.4-labs-april |
+------------------+
mysql> show create table tl_2013_us_zcta510\G
CREATE TABLE `tl_2013_us_zcta510` (
`OGR_FID` int(11) NOT NULL AUTO_INCREMENT,
`SHAPE` geometry NOT NULL,
`zcta5ce10` varchar(5) DEFAULT NULL,
...
UNIQUE KEY `OGR_FID` (`OGR_FID`),
SPATIAL KEY `SHAPE` (`SHAPE`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1
InnoDB SPATIAL index in MySQL 5.7

mysql> explain SELECT .. FROM tl_2013_us_zcta510
WHERE st_contains(shape,
GeomFromText('POINT(-78.90423 36.004122)', 1)) limit 1\G
*********************** 1. row ***************************
id: 1
select_type: SIMPLE
table: tl_2013_us_zcta510
partitions: NULL
type: range
possible_keys: SHAPE
key: SHAPE
key_len: 34
ref: NULL
rows: 2
filtered: 100.00
Extra: Using where
Real world Examples

1. Find ZIP code for (lat, lon)
2. Find all coffee shops near ZIP
Example

Find ZIP code for (lat, lon)

ST_CONTAINS example
mysql> SELECT zcta5ce10 as ZIP
FROM tl_2013_us_zcta510
WHERE
st_contains(shape,
POINT(-78.90423, 36.004122));
+-------+
| ZIP |
+-------+
| 27701 |
+-------+
1 row in set (0.00 sec)
ST_CONTAINS explain plan
**************** 1. row **********************
id: 1
select_type: SIMPLE
table: tl_2013_us_zcta510
type: range
possible_keys: SHAPE
key: SHAPE
key_len: 34
ref: NULL
rows: 1
Extra: Using where
1 row in set (0.00 sec)
Geo-enabled applications

Find all coffee places in this ZIP code
Find places within ZIP code example

mysql> select shape into @shape
from zcta.tl_2013_us_zcta510
where zcta5ce10='27701';
Query OK, 1 row affected (0.00 sec)
Find places within ZIP code example

mysql> SELECT name, st_distance(shape,
centroid(@shape) ) as dist
FROM points
WHERE st_within(shape, @shape)
and other_tags like '%"amenity"=>"cafe"%'
limit 10;
+--------------------+----------------------+
| name | dist |
+--------------------+----------------------+
| Blue Coffee Cafe | 0.00473103443182092 |
| Amelia Cafe | 0.013825134250907745 |
| Serrano's Delicafe | 0.013472792849827055 |
| Blend | 0.009123578862847042 |
+--------------------+----------------------+
4 rows in set (0.09 sec)
Find places within ZIP code example

mysql> explain SELECT name, st_distance(shape,
centroid(@shape) ) as dist
FROM points
WHERE st_within(shape, @shape)
and other_tags like '%"amenity"=>"cafe"%' limit 10\G
************************** 1. row ***************************
id: 1
select_type: SIMPLE
table: points
type: range
possible_keys: SHAPE
key: SHAPE
key_len: 34
ref: NULL
rows: 10
Extra: Using where
Other Systems: SPATIAL/GIS support
Geosearch in the full text world
Three interesting things (details in Ballroom E)

One, GEODIST performance is not constant,
implementations might be severely different
Two, Sphinx (a search server that talks SQL)
has a good implementation, see benchmark
Three, a simple technique called geohashing
Geodistance combined with fulltext search

mysql> SELECT *, GEODIST(0.6283904731897259,
-1.3771386072508853,lat_radians,long_radians) as
distance FROM osm_rt WHERE match('restaurant') ORDER
BY distance ASC LIMIT 0,5;
+----------+-------------+--------------+------------+
| id | lat_radians | long_radians | distance |
+----------+-------------+--------------+------------+
| 13539964 | 0.628380 | -1.377061 | 406.587860 |
| 4197225 | 0.628308 | -1.377168 | 549.418457 |
| 8282856 | 0.628310 | -1.377094 | 563.830994 |
| 7815247 | 0.628311 | -1.377210 | 627.531677 |
| 4197271 | 0.628312 | -1.377213 | 629.719055 |
+----------+-------------+--------------+------------+
5 rows in set (0.03 sec)
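GEODIST here takes radians and reports meters, so the query point is the Durham coordinates from the earlier slides converted to radians. The sketch below shows the conversion and reproduces the first row approximately. The Earth radius constant is an assumption (Sphinx may use a different value), and the function name `geodist_m` is illustrative.

```python
import math

EARTH_RADIUS_M = 6371000.0  # assumption: mean Earth radius in meters

def geodist_m(lat1, lon1, lat2, lon2):
    """Haversine distance in meters; all inputs are already in radians."""
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

# Degrees from the MySQL examples, converted for the Sphinx query
query_lat = math.radians(36.004122)
query_lon = math.radians(-78.90423)

# First result row (id 13539964), coordinates as printed in the table
first_row_m = geodist_m(query_lat, query_lon, 0.628380, -1.377061)
```

The result lands within a few meters of the reported value; the remaining gap is expected, since the table prints the stored radians rounded to six decimals.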
Geo-enabled applications

Questions?

alexander.rubin@percona.com

http://www.mysqlperformanceblog.com/author/alexanderrubin/
