HN Mirror


by dluc 1026 days ago

We are also developing an open-source solution for those who would like to test it out and/or contribute. It can be consumed as a web service or embedded into .NET apps. The project is codenamed "Semantic Memory" (available on GitHub) and offers customizable external dependencies, such as Azure Queues, RabbitMQ, or other alternatives, with options for Azure Cognitive Search and Qdrant (and plans to include Weaviate and more). The architecture is similar, with queues and pipelines.

We believe that enabling custom dependencies and logic, as well as the ability to add/remove pipeline steps, is crucial. As of now, there is no definitive answer to the best chunk size or embedding model, so our project aims to provide the flexibility to inject and replace components and pipeline behavior.
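To make the idea of injectable components concrete, here is a minimal Python sketch of a pipeline whose steps can be added or removed, with the chunk size and the embedding function passed in from outside. It is not the project's API: `Pipeline`, `make_partitioner`, `make_embedder`, and `toy_embed` are hypothetical names invented for illustration, and the "embedding" is a trivial stand-in for a real model.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical types for illustration only; not the project's actual API.
Step = Callable[[dict], dict]


@dataclass
class Pipeline:
    steps: list[tuple[str, Step]] = field(default_factory=list)

    def add(self, name: str, step: Step) -> "Pipeline":
        self.steps.append((name, step))
        return self

    def remove(self, name: str) -> "Pipeline":
        self.steps = [(n, s) for n, s in self.steps if n != name]
        return self

    def run(self, doc: dict) -> dict:
        # Each step receives the document state and returns an updated copy.
        for _, step in self.steps:
            doc = step(doc)
        return doc


def make_partitioner(chunk_size: int) -> Step:
    """Build a partition step with an injectable chunk size (in characters)."""
    def partition(doc: dict) -> dict:
        text = doc["text"]
        chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
        return {**doc, "chunks": chunks}
    return partition


def make_embedder(embed: Callable[[str], list[float]]) -> Step:
    """Build an embedding step around any function mapping text to a vector."""
    def embed_chunks(doc: dict) -> dict:
        return {**doc, "embeddings": [embed(c) for c in doc["chunks"]]}
    return embed_chunks


def toy_embed(text: str) -> list[float]:
    # Stand-in for a real embedding model: text length and vowel count.
    return [float(len(text)), float(sum(ch in "aeiou" for ch in text))]


pipeline = (
    Pipeline()
    .add("partition", make_partitioner(chunk_size=8))
    .add("embed", make_embedder(toy_embed))
)
result = pipeline.run({"text": "queues and pipelines"})
```

Swapping the chunk size or the embedding model then means passing a different argument to a factory, and dropping a step is a single `remove("embed")` call, with no changes to the other steps.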

Regarding scalability, LLM text generators and GPUs remain a limiting factor in this area too. LLMs hold great potential for analyzing input data, and I believe the focus should be less on the speed of queues and storage and more on finding the optimal way to integrate LLMs into these pipelines.

3 comments

ddematheu 1026 days ago

The queues and storage are the foundation on which some of these other integrations can be built. Agree fully on the need for LLMs within the pipelines to help with data analysis.

Our current perspective has been on leveraging LLMs as part of async processes to help analyze data. This only really works when your data follows a template, so that the same analysis can be applied to a vast number of documents. Otherwise it becomes too expensive to do on a per-document basis.

What types of analysis are you doing with LLMs? Have you started to integrate some of these into your existing solution?

dluc 1026 days ago

Currently we use LLMs to generate a summary, which is used as an additional chunk. As you might guess, this can take time, so we postpone the summarization to the end (the current default pipeline is: extract, partition, generate embeddings, save embeddings, summarize, generate embeddings (of the summary), save embeddings).
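The step order described above can be sketched in Python as follows. This is only an illustration of the ordering, with the summary stored as one extra chunk at the end. The function names, the character-based partitioning, the one-number "embedding", and the first-sentence "summary" are all assumptions standing in for real components.

```python
Vector = list[float]


def extract(raw: bytes) -> str:
    return raw.decode("utf-8")


def partition(text: str, size: int = 40) -> list[str]:
    return [text[i:i + size] for i in range(0, len(text), size)]


def toy_embed(text: str) -> Vector:
    # Stand-in for an embedding model.
    return [float(len(text))]


def toy_summarize(text: str) -> str:
    # Stand-in for an LLM call: keep only the first sentence.
    return text.split(".")[0] + "."


def ingest(raw: bytes, embed, summarize, store: list[tuple[str, Vector]]) -> list[str]:
    """Run the steps in the order given in the comment; return the step names."""
    done: list[str] = []

    text = extract(raw)
    done.append("extract")

    chunks = partition(text)
    done.append("partition")

    vectors = [embed(c) for c in chunks]
    done.append("gen embeddings")

    store.extend(zip(chunks, vectors))
    done.append("save embeddings")

    # The slow LLM step runs last, after the regular chunks are already saved.
    summary = summarize(text)
    done.append("summarize")

    summary_vector = embed(summary)
    done.append("gen embeddings (summary)")

    store.append((summary, summary_vector))  # the summary is one more chunk
    done.append("save embeddings (summary)")
    return done


store: list[tuple[str, Vector]] = []
raw = b"Queues feed the pipeline. Each step handles one task."
steps = ingest(raw, toy_embed, toy_summarize, store)
```

Because the regular chunks are saved before summarization starts, they could already be made searchable while the summary is still being generated.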

Initial tests, though, are showing that summaries are affecting the quality of answers, so we'll probably remove summarization from the default flow and use it only for specific data types (e.g. chat logs).

There's a bunch of synthetic data scenarios we want to leverage LLMs for. Without going too much into detail: sometimes "reading between the lines", and some memory consolidation patterns (e.g. a "dream phase"), etc.

ddematheu 1026 days ago

Makes sense. Interesting that summaries sometimes affect quality.

For synthetic data scenarios, are you also thinking about synthetic queries over the data? (Trying to predict which chunks might be used more than others.)

dluc 1026 days ago

Yes, queries and also planning.

For instance, given the user "ask" (which could be any generic message in a copilot), decide how to query one or multiple storages. Ultimately, companies and users have different storages, and a few can be indexed with vectors (and additional fine-tuned models). But there's a lot of "legacy" structured data accessible only with SQL and similar languages, so a "planner" (in the SK sense of planners) could be useful to query vector indexes, text indexes and knowledge graphs, combining the results.
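As a rough Python illustration of that planning idea (not the SK planner itself), the sketch below routes an "ask" to whichever sources claim they can handle it and merges the results. Everything here is hypothetical: the `Source` type, the keyword rule standing in for an LLM-made routing decision, and the canned query results.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Source:
    name: str
    can_answer: Callable[[str], bool]   # stand-in for an LLM routing decision
    query: Callable[[str], list[str]]   # stand-in for a real index or database


def plan(ask: str, sources: list[Source]) -> list[Source]:
    """Pick the sources worth querying for this ask."""
    return [s for s in sources if s.can_answer(ask)]


def answer(ask: str, sources: list[Source]) -> list[str]:
    """Query every planned source and combine the results, tagged by origin."""
    results: list[str] = []
    for source in plan(ask, sources):
        results.extend(f"[{source.name}] {hit}" for hit in source.query(ask))
    return results


# Toy sources with canned data, for illustration only.
vector_index = Source(
    name="vector",
    can_answer=lambda ask: True,  # semantic search is always worth a try
    query=lambda ask: ["chunk about invoices"],
)
sql_store = Source(
    name="sql",
    can_answer=lambda ask: any(k in ask.lower() for k in ("how many", "total")),
    query=lambda ask: ["invoice count row"],
)

sources = [vector_index, sql_store]
combined = answer("How many invoices were sent?", sources)
semantic_only = answer("Summarize the onboarding guide", sources)
```

In a real system the `can_answer` decision and the query translation (for example, natural language to SQL) would be the hard parts. The sketch only shows where they would plug in.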

bradneuberg 1026 days ago

Really interesting library.

Is anyone aware of something similar but hooked into Google Cloud infra instead of Azure?

dluc 1025 days ago

We could easily add that if there's interest, e.g. using Pub/Sub and Cloud Storage. If there are .NET libraries, it should be straightforward to implement some interfaces. Similar considerations apply to the inference part, embedding and text generation.
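A minimal sketch of what "implementing some interfaces" could look like, written in Python rather than .NET for brevity. The `MessageQueue` interface and its method names are assumptions made for illustration, not the project's actual contracts. A Pub/Sub-backed class would implement the same two methods on top of the cloud client library, while the in-memory version below is enough for local tests.

```python
from abc import ABC, abstractmethod
from collections import deque
from typing import Optional


class MessageQueue(ABC):
    """Hypothetical queue contract the pipeline would depend on."""

    @abstractmethod
    def enqueue(self, message: str) -> None:
        ...

    @abstractmethod
    def dequeue(self) -> Optional[str]:
        """Return the next message, or None when the queue is empty."""


class InMemoryQueue(MessageQueue):
    """Local implementation; a cloud-backed one would expose the same methods."""

    def __init__(self) -> None:
        self._items: deque[str] = deque()

    def enqueue(self, message: str) -> None:
        self._items.append(message)

    def dequeue(self) -> Optional[str]:
        return self._items.popleft() if self._items else None


def drain(queue: MessageQueue) -> list[str]:
    """Consumer code sees only the interface, never the concrete backend."""
    messages: list[str] = []
    while (message := queue.dequeue()) is not None:
        messages.append(message)
    return messages


queue = InMemoryQueue()
queue.enqueue("doc-1: extract")
queue.enqueue("doc-1: partition")
processed = drain(queue)
```

Since `drain` depends only on `MessageQueue`, switching backends would not require touching the pipeline code.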

derekperkins 1014 days ago

GCP also has a hosted vector db https://cloud.google.com/vertex-ai/docs/vector-search/overvi...

CharlieDigital 1026 days ago

Why .NET apps specifically?

dluc 1025 days ago

Multiple reasons, some of them subjective, as usual with these choices: customers, performance, the existing SK community, experience, etc.

However, the recommended use is running it as a web service, so from a consumer perspective the language doesn't really matter.
