I’ve been looking hard at datastores recently, in particular PostgreSQL + PostGIS. In school, I did a few GIS related things in class and I assumed it was magic. It turns out there is no magic.
At work, we have an application that accepts polygons and returns objects whose location are within that polygon. In GIS-speak this is often called an Area of Interest or AOI and they’re often represented as a few data types: GeoJSON, Well-known Text and Well-known Binary (WKT/WKB).
So what happens when the request is accepted and how does PostGIS make this all work?
Here’s a graphical representation of my GeoJSON (the ‘raw’ link should show the GeoJSON document):
And the same as WKT:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
POLYGON ((-88.35548400878906 36.57362961247927, -88.24562072753906 36.57362961247927, -88.24562072753906 36.630406976495614, -88.35548400878906 36.630406976495614, -88.35548400878906 36.57362961247927)) |
In the PostGIS world, geospatial data is stored as a geometry type.
Here’s a query returning WKT->Geometry:
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
db # SELECT ST_GeometryFromText('POLYGON ((-88.35548400878906 36.57362961247927, -88.24562072753906 36.57362961247927, -88.24562072753906 36.630406976495614, -88.35548400878906 36.630406976495614, -88.35548400878906 36.57362961247927)) | |
'); | |
-[ RECORD 1 ]——-+——————————————————————————————————————————————————————————————- | |
st_geometryfromtext | 0103000000010000000500000000000040C01656C0CDCEF4B16C49424000000040B80F56C0CDCEF4B16C49424000000040B80F56C0059C012DB150424000000040C01656C0059C012DB150424000000040C01656C0CDCEF4B16C494240 |
The PostGIS geometry datatype is an lwgeom struct.
So, to find all of the objects in a given AOI, the SQL query would look something like this (PostGIS FAQ):
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
SELECT id FROM objects WHERE geom && ST_GeometryFromText('POLYGON ((-88.35548400878906 36.57362961247927, -88.24562072753906 36.57362961247927, -88.24562072753906 36.630406976495614, -88.35548400878906 36.630406976495614, -88.35548400878906 36.57362961247927))') |
OK, but what does the && operator do? Thanks to this awesome dba.stackexchange answer, we finally get some answers! In a nutshell:
- && is short for geometry_overlaps
- geometry_overlaps passes through several layers of abstraction (the answer did not show them all)
- in the end, floating point comparisons see if the object in the database’s polygon is within the requested polygon