Looking at PostGIS intersects queries

I’ve been looking hard at datastores recently, in particular PostgreSQL + PostGIS. In school, I did a few GIS related things in class and I assumed it was magic. It turns out there is no magic.

At work, we have an application that accepts polygons and returns objects whose location are within that polygon. In GIS-speak this is often called an Area of Interest or AOI and they’re often represented as a few data types: GeoJSON, Well-known Text and Well-known Binary (WKT/WKB).

So what happens when the request is accepted and how does PostGIS make this all work?

Here’s a graphical representation of my GeoJSON (the ‘raw’ link should show the GeoJSON document):


Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.

view raw
geojson.json
hosted with ❤ by GitHub

And the same as WKT:

POLYGON ((-88.35548400878906 36.57362961247927, -88.24562072753906 36.57362961247927, -88.24562072753906 36.630406976495614, -88.35548400878906 36.630406976495614, -88.35548400878906 36.57362961247927))

view raw
map.wkt
hosted with ❤ by GitHub

In the PostGIS world, geospatial data is stored as a geometry type.

Here’s a query returning WKT->Geometry:

db # SELECT ST_GeometryFromText('POLYGON ((-88.35548400878906 36.57362961247927, -88.24562072753906 36.57362961247927, -88.24562072753906 36.630406976495614, -88.35548400878906 36.630406976495614, -88.35548400878906 36.57362961247927))
');
-[ RECORD 1 ]——-+——————————————————————————————————————————————————————————————-
st_geometryfromtext | 0103000000010000000500000000000040C01656C0CDCEF4B16C49424000000040B80F56C0CDCEF4B16C49424000000040B80F56C0059C012DB150424000000040C01656C0059C012DB150424000000040C01656C0CDCEF4B16C494240

view raw
output
hosted with ❤ by GitHub

The PostGIS geometry datatype is an lwgeom struct.

So, to find all of the objects in a given AOI, the SQL query would look something like this (PostGIS FAQ):

SELECT id FROM objects WHERE geom && ST_GeometryFromText('POLYGON ((-88.35548400878906 36.57362961247927, -88.24562072753906 36.57362961247927, -88.24562072753906 36.630406976495614, -88.35548400878906 36.630406976495614, -88.35548400878906 36.57362961247927))')

view raw
intersection.sql
hosted with ❤ by GitHub

OK, but what does the && operator do? Thanks to this awesome dba.stackexchange answer, we finally get some answers! In a nutshell:

  • && is short for geometry_overlaps
  • geometry_overlaps passes through several layers of abstraction (the answer did not show them all)
  • in the end, floating point comparisons see if the object in the database’s polygon is within the requested polygon