by chatmasta
1205 days ago
I'll make this same offer for Splitgraph :) If you feel like writing a Postgres FDW (foreign data wrapper), then we can add it to the engine on the backend, so that anyone with a Postgres client could connect to postgres://data.splitgraph.com:5432 and SELECT from a table backed by crul (either "mounted" for live querying, and/or ingested once or periodically for subsequent querying). The user just needs to provide parameters for the table; it's up to the FDW how to interpret those parameters.

It would take some thinking and planning, and it's possibly not even a good idea ;) But generally any "data source" is packageable as an FDW, as long as you can model it in such a way that you can reasonably implement the functions for operations like table scans. For most FDWs this is easy, and the cost of a large query is usually limited to excess bandwidth and latency while the query executor reads the result from the FDW. But with a live source pointing to a crawler instance, a table scan could in the worst case mean waiting for the crawler to parse the responses to hundreds of rate-limited network requests. So it's probably better to ingest the data once (and/or periodically) for a particular crul "table" (whatever you decide that means) rather than to query it live.

Fortunately, you can still write an FDW as the adapter layer, because Splitgraph ingests data on a schedule by querying the FDW of the live data source (while tolerating a long-running query). Alternatively (or additionally) you could write an Airbyte adapter, which we also support, but only for ingestion. If you want live queryable tables, then an FDW is necessary.

We've been interested in adding something like this (think Apify + Postgres) for a while. If done well it could be really cool. Let me know if you want to talk about it: miles@splitgraph.com
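To make the "table scan" idea concrete, here is a rough sketch of what such an FDW could look like in Python, assuming it is written against Multicorn's `ForeignDataWrapper` interface (`__init__(fdw_options, fdw_columns)` plus an `execute(quals, columns)` generator). Everything crawler-specific is made up for illustration: `fake_crawl`, the `urls` and `delay_seconds` table options, and the `CrawlerFdw` class are hypothetical and not part of crul or Splitgraph. A small stand-in base class lets it run outside Postgres.

```python
import time
from typing import Dict, List

try:
    # Multicorn is a Postgres extension for writing FDWs in Python.
    from multicorn import ForeignDataWrapper
except ImportError:
    # Stand-in base class so this sketch also runs outside Postgres.
    class ForeignDataWrapper:
        def __init__(self, fdw_options, fdw_columns):
            pass


def fake_crawl(url: str) -> List[Dict[str, str]]:
    """Hypothetical crawler call. A real one would issue network requests."""
    return [{"url": url, "title": "page at " + url}]


class CrawlerFdw(ForeignDataWrapper):
    """Hypothetical FDW: one foreign table backed by a list of URLs to crawl."""

    def __init__(self, fdw_options, fdw_columns):
        super().__init__(fdw_options, fdw_columns)
        # Table parameters supplied by the user; the option names are assumptions.
        self.urls = fdw_options["urls"].split(",")
        self.delay = float(fdw_options.get("delay_seconds", "0"))

    def execute(self, quals, columns):
        # A table scan: every URL is crawled, with a pause between requests
        # to mimic rate limiting. This is why a live scan can be very slow.
        for i, url in enumerate(self.urls):
            if i > 0 and self.delay > 0:
                time.sleep(self.delay)
            for row in fake_crawl(url):
                # Yield only the columns the query asked for.
                yield {col: row.get(col) for col in columns}


# "Ingest once": run the scan a single time and keep the rows as a snapshot,
# instead of re-crawling on every query.
fdw = CrawlerFdw(
    {"urls": "https://example.com/a,https://example.com/b"},
    {"url": None, "title": None},
)
snapshot = list(fdw.execute([], ["url", "title"]))
```

With a real crawler behind `fake_crawl`, the same `execute` would serve both modes: queried live it blocks for the whole crawl, while a scheduled ingestion job can run it once and store `snapshot` for later queries.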
Edit: just saw "airbyte" in a bunch of places, which I assume answers my question. So updated question: airbyte works well for ya?