Improve performance on a st_dwithin query (in PostGIS)

المشرف العام · 22 ديسمبر 2015

At the momenent I am working on a query, as described in an earlier question. I have the two tables

testme (tracked GPS-profile with Point Geometry) and
roads (geometry roads shapefile)

Besides the distance of each tracked point (row) in table pt in want to find the closest point using a combination of ST_Distance and ST_DWithin.

DROP TABLE IF EXISTS raw_2015_processed;EXPLAIN ANALYZECREATE TABLE raw_2015_processed ASSELECT pt.id, pt."DeliveryID", pt."VehicleID", pt."TrackID", pt."Longitude", pt."Latitude", pt."Altitude", pt."Heading", pt."Speed", pt."Satelites", pt."HDOP", pt."VDOP", pt."Xfcd", pt.ts, pt.received, pt.the_geom,(SELECT ST_ClosestPoint(line.geom,pt.the_geom) AS closest_geom FROM roads AS line WHERE ST_DWithin(line.geom,pt.the_geom, 0.5) LIMIT 1),(SELECT ST_Distance(line.geom,pt.the_geom) AS distance FROM roads AS line ORDER BY pt.the_geom line.geom LIMIT 1),(SELECT ST_AsText(ST_ClosestPoint(line.geom, pt.the_geom)) AS closest_coordinates FROM roads AS line ORDER BY pt.the_geom line.geom LIMIT 1)FROM raw_2015 AS ptORDER by pt.id;`Working with a reduced file 'Testme' of 144 rows EXPLAIN ANALYZE returns following Query Plan:

"Sort (cost=84207.04..84207.40 rows=144 width=131) (actual time=11797.677..11797.685 rows=144 loops=1)"" Sort Key: pt.id"" Sort Method: quicksort Memory: 54kB"" -> Seq Scan on testme pt (cost=0.00..84201.87 rows=144 width=131) (actual time=82.458..11797.268 rows=144 loops=1)"" SubPlan 1"" -> Limit (cost=9.46..578.16 rows=1 width=155) (actual time=81.508..81.508 rows=1 loops=144)"" -> Bitmap Heap Scan on roads line (cost=9.46..578.16 rows=1 width=155) (actual time=81.388..81.388 rows=1 loops=144)"" Recheck Cond: (geom && st_expand(pt.the_geom, 0.5::double precision))"" Rows Removed by Index Recheck: 0"" Filter: ((pt.the_geom && st_expand(geom, 0.5::double precision)) AND _st_dwithin(geom, pt.the_geom, 0.5::double precision))"" -> Bitmap Index Scan on geom_index_roads (cost=0.00..9.46 rows=139 width=0) (actual time=79.722..79.722 rows=450402 loops=144)"" Index Cond: (geom && st_expand(pt.the_geom, 0.5::double precision))"" SubPlan 2"However, when I run the query for a larger dataset with more data points (< 5 Million points) , it gets very slow (i.e. several hours / days). Do you guys see a way to increase the speed of the query? Is there an an alternative option to st_dwithin or a different query structure which can proof to be helpful?

أكثر...

Improve performance on a st_dwithin query (in PostGIS)

المشرف العام

Administrator