by saurik 1004 days ago
When you say "produced locally", do you mean on the client? If so, does this mean you require me to use some alternate PostgreSQL driver locally, parsing the SQL to add your one feature?

(If it is, this really feels like it should be a separate general purpose local extension mechanism into which random functions can be added, instead of something tied to this use case... maybe I want to add some locally-executed string parsing function, for example...)

(...but, the entire concept of having some functions be "locally" executed also feels really awkward/limited and will involve a ridiculous amount of work to make something that, at the end of the day, only sort of works in some places in the query, so I bet you don't mean what I do when I say "locally", right?)

(But, like... doing it remotely--on the database server as part of the query plan--frankly seems kind of crazy to me, as it is going to be so slow and add a massive CPU load to what should be an I/O workload. Makes for good demos I bet, but otherwise unusable in a database context.)

(Regardless, the premise of seeing this as a feature kind of squicks me... like, it honestly gives me strong apprehensions about using your extension at all, as I can see--very clearly--the mission creep it is going to cause as you deal with demands to drag in more and more popular embedding models with lots of execution dependencies as part of the extension that has to be loaded into the server, as well as fielding distracting discussions about the performance of the embedding helpers...)

(...this frankly shouldn't be part of the same extension: it should be another extension that happens to return this extension's data type--or even potentially returns some more generic one, like an array of floats, making it drop-in compatible with other extensions for vector indexing--and there should then almost certainly be separate such extensions for each major model you want to support.)

1 comment

>When you say "produced locally", do you mean on the client?

Sorry for the confusion. By “produced locally” I meant “produced on your DB server” as opposed to being an API call to a third party service such as OpenAI or HuggingFace.

>(But, like... doing it remotely--on the database server as part of the query plan--frankly seems kind of crazy to me, as it is going to be so slow and add a massive CPU load to what should be an I/O workload. Makes for good demos I bet, but otherwise unusable in a database context.)

It seems like you’re worried about these workflows being on the Postgres server, which may lead to performance issues.

However, if performance becomes an issue, the functions can be executed on another server. With this approach, whether or not the functions run on the Postgres server, the end user gets a better development experience, as all the functions they need are available within SQL.

>...this frankly shouldn't be part of the same extension

We agree. These functions are already in another repository, and not part of the same extension. The repository is here: https://github.com/lanterndata/lantern_extras