by wenc
408 days ago
Probably no difference for your use case (ST_Distance). If you already have data in Postgres, you should continue using PostGIS. In my case, I use DuckDB because of speed at scale. I have 600 GB of lat-longs in Parquet files on disk. If I wanted to use PostGIS, I would have to ingest all of this data into Postgres first. With DuckDB, I can drop into a Jupyter notebook and be set up in under 10 seconds, with no need to ingest any data ahead of time, and the results come back in a flash:
import duckdb
duckdb.query("INSTALL spatial; LOAD spatial;")
duckdb.query("select ST_DISTANCE(ST_POINT(lng1, lat1), ST_POINT(lng2, lat2)) dist from '/mydir/*.parquet'")
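One caveat worth sketching, on the assumption that ST_Distance on plain point geometries returns a planar distance in the units of the coordinates (degrees for lng/lat input): if you need meters, you would want a great-circle formula instead. The two helpers below are hypothetical illustrations in plain Python, not part of DuckDB, and the Earth radius is an assumed mean value.

```python
import math

def planar_distance(lng1, lat1, lng2, lat2):
    """Euclidean distance in coordinate units (degrees for lng/lat input)."""
    return math.hypot(lng2 - lng1, lat2 - lat1)

def haversine_m(lng1, lat1, lng2, lat2, radius_m=6_371_000.0):
    """Great-circle distance in meters on a sphere (radius is an assumed mean)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_m * math.asin(math.sqrt(a))

# One degree of longitude at the equator: 1.0 in planar units,
# roughly 111 km along the great circle.
print(planar_distance(0.0, 0.0, 1.0, 0.0))
print(haversine_m(0.0, 0.0, 1.0, 0.0))
```

For ranking nearby points the planar number may be good enough; for real-world distances the difference matters, especially away from the equator.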
|
|
Also, as a side note, is everyone just using DuckDB in memory? As soon as you want multi-session access, I'd assume you'd use DuckDB on top of a local database, so again I don't see the point, but I'm sure I'm missing something.