

by fnikol 930 days ago

All things aside, it's a framework for building data workflows in Python.

For example: take data from a source (e.g., SQL), process it (e.g., with PyTorch or OpenAI), and store the results somewhere (e.g., data in MongoDB, metadata in SQL).
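To make the source / process / store shape concrete, here is a minimal sketch using only the Python standard library. It is not superduperdb's API: the table names, `process`, and `run_workflow` are hypothetical, SQLite stands in for both the source and the sink, and a word count stands in for a real model call.

```python
import sqlite3

def process(text: str) -> int:
    # Stand-in for a model call (e.g., PyTorch or OpenAI); here it only counts words.
    return len(text.split())

def run_workflow(conn: sqlite3.Connection) -> None:
    # 1. Take data from the source.
    rows = conn.execute("SELECT id, body FROM docs ORDER BY id").fetchall()
    # 2. Process each record.
    results = [(doc_id, process(body)) for doc_id, body in rows]
    # 3. Store the results next to the original data.
    conn.executemany(
        "INSERT INTO word_counts (doc_id, n_words) VALUES (?, ?)", results
    )
    conn.commit()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT)")
conn.execute("CREATE TABLE word_counts (doc_id INTEGER, n_words INTEGER)")
conn.executemany(
    "INSERT INTO docs (body) VALUES (?)", [("hello world",), ("one two three",)]
)
run_workflow(conn)
stored = conn.execute(
    "SELECT doc_id, n_words FROM word_counts ORDER BY doc_id"
).fetchall()
```

In a real setup each of the three steps would point at a different system; the sketch keeps them in one in-memory database so it runs as-is.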

It actually consists of the following:
1. Nifty abstractions for Data (e.g., sources, encoders, listeners), Metadata (e.g., vector indexes), and Compute (e.g., sync, async, parallel).
2. A gluing engine that transparently handles the interaction between components.
3. Out-of-the-box integrations with established tools (databases, AI models and APIs, compute engines).
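One way to picture how such pieces might fit together is the toy sketch below. Everything in it is an assumption made for illustration (`encode`, `Listener`, `run_sync`, `run_parallel` are hypothetical names, not the framework's real classes): a listener reacts to newly inserted rows, runs a model over them through a swappable compute backend, and keeps the outputs keyed by row id.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field
from typing import Callable, Dict, List

def encode(text: str) -> List[float]:
    # Toy encoder: [length, number of vowels]. A real one would call a model.
    vowels = sum(ch in "aeiou" for ch in text.lower())
    return [float(len(text)), float(vowels)]

def run_sync(fn: Callable, items: List[str]) -> List[List[float]]:
    # Synchronous compute backend: one item after another.
    return [fn(item) for item in items]

def run_parallel(fn: Callable, items: List[str]) -> List[List[float]]:
    # Parallel compute backend: a thread pool; map() preserves input order.
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(fn, items))

@dataclass
class Listener:
    # Hypothetical glue object: applies `model` to new rows via `compute`.
    model: Callable[[str], List[float]]
    compute: Callable = run_sync
    outputs: Dict[int, List[float]] = field(default_factory=dict)

    def on_insert(self, new_rows: Dict[int, str]) -> None:
        ids = list(new_rows)
        vectors = self.compute(self.model, [new_rows[i] for i in ids])
        self.outputs.update(zip(ids, vectors))

listener = Listener(model=encode, compute=run_parallel)
listener.on_insert({1: "data", 2: "workflow"})
```

Swapping `run_parallel` for `run_sync` changes how the work is scheduled without touching the listener or the model, which is the point of keeping compute as a separate abstraction.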

This way, you can build customized data layers that sit on top of your database and save you from moving the data to dedicated systems (e.g., vector databases or MLOps tools).
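As a rough illustration of "on top of your database" (a sketch under stated assumptions, not how the framework implements vector search): embeddings are stored as JSON in the same SQLite table as the documents, and similarity search is a brute-force cosine scan. `embed`, `cosine`, and `search` are hypothetical helpers, and the embedding is a toy letter count.

```python
import json
import math
import sqlite3
from typing import List

def embed(text: str) -> List[float]:
    # Toy embedding: counts of the letters a, b, c. A real one would call a model.
    lowered = text.lower()
    return [float(lowered.count(ch)) for ch in "abc"]

def cosine(u: List[float], v: List[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, body TEXT, embedding TEXT)")
for body in ["aaa", "bbb", "abc"]:
    # The embedding lives in the same row as the data it describes.
    conn.execute(
        "INSERT INTO docs (body, embedding) VALUES (?, ?)",
        (body, json.dumps(embed(body))),
    )

def search(conn: sqlite3.Connection, query: str, k: int = 1) -> List[str]:
    # Brute-force scan: score every stored embedding against the query.
    q = embed(query)
    rows = conn.execute("SELECT body, embedding FROM docs").fetchall()
    ranked = sorted(rows, key=lambda r: cosine(q, json.loads(r[1])), reverse=True)
    return [body for body, _ in ranked[:k]]

top = search(conn, "aab", k=2)
```

A full scan like this only scales to small tables; a real deployment would use an index, but the data still never has to leave the primary database.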

For further discussion, feel free to join our Slack: https://join.slack.com/t/superduperdb/shared_invite/zt-1zuoj...

1 comments

ed 930 days ago

Ah, it’s a langchain competitor, possibly with better DB support.

One of the nice things about langchain is the code examples, making it easy to get simple services up and running. And because it’s a toolkit I can take what I need and leave the rest.

However, the ecosystem around langchain is really exploding. Is there some way you could retool what you have to extend langchain with better DB support, rather than building your own thing?

fnikol 929 days ago

Indeed, both frameworks support model chaining.

However, achieving goals like "training your LLM" or enabling "real-time inference" requires more than just pipelines. For that, we have invested in enhancing compatibility with databases and facilitating parallel computing.

About your last point, I'm not sure I fully understand. Do you mean writing a guide for moving langchain models to superduper? Or creating superduper wrappers for langchain? Or moving the core functionality of superduper into langchain?

The guide is something we have in our immediate plans. The wrappers are under discussion. The last option I don't think is possible, due to architectural differences. For example, superduper is designed with multi-node environments in mind.