by data_maan 1140 days ago
Thanks for the insights.

I wonder if one needs even LlamaIndex?

From their site:

>Storing context in an easy-to-access format for prompt insertion.

>Dealing with prompt limitations (e.g. 4096 tokens for Davinci) when context is too big.

>Dealing with text splitting.

Not sure if it isn't easier to roll one's own for that...?
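For a sense of scale, here is a minimal sketch of what "rolling one's own" context packing could look like. The function names, the 4-characters-per-token estimate, and the prompt template are all assumptions for illustration; a real tokenizer would give exact counts.

```python
def estimate_tokens(text: str) -> int:
    # Assumption: roughly 4 characters per token. Use a real tokenizer
    # if you need exact counts.
    return max(1, len(text) // 4)


def build_prompt(question: str, chunks: list[str],
                 max_tokens: int = 4096, reserved_for_answer: int = 512) -> str:
    """Pack chunks (assumed already ranked by relevance) into the prompt
    until the estimated token budget runs out."""
    # The few tokens used by the template itself are ignored in this sketch.
    budget = max_tokens - reserved_for_answer - estimate_tokens(question)
    picked = []
    for chunk in chunks:
        cost = estimate_tokens(chunk)
        if cost > budget:
            break  # stop at the first chunk that no longer fits
        picked.append(chunk)
        budget -= cost
    context = "\n---\n".join(picked)
    return f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"


if __name__ == "__main__":
    document = "First paragraph.\n\nSecond paragraph.\n\nThird paragraph."
    paragraphs = document.split("\n\n")  # naive splitting on blank lines
    print(build_prompt("What is in the second paragraph?", paragraphs))
```

Ranking the chunks before packing is the part that actually needs thought; the packing itself is a dozen lines.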

I know a thing or two about the math behind LLMs, and all this software built around a few core ideas just seems like overkill...

When you mentioned PGVector, did you mean this repo, or is there a class within LangChain with the same name? https://github.com/pgvector/pgvector

3 comments

You’re almost certainly going to have to write your own splitting code for anything nontrivial. LlamaIndex breaks down hard when there’s a lot of markup in the document, for example. You’ll also want control over the vector search strategy (just using the query or chunk embedding may not be enough).
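As an illustration of what markup-aware splitting might involve, here is a rough sketch using only Python's standard-library `html.parser`. The tag sets, class name, and `split_markup` function are assumptions made up for this example, not anything LlamaIndex provides.

```python
from html.parser import HTMLParser

# Assumption: these tags mark block boundaries or non-content; adjust per corpus.
BLOCK_TAGS = {"p", "div", "li", "br", "tr", "pre", "blockquote", "section",
              "h1", "h2", "h3", "h4", "h5", "h6"}
SKIP_TAGS = {"script", "style"}


class BlockTextExtractor(HTMLParser):
    """Collects visible text, starting a new block at each block-level tag."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.blocks = []
        self._buf = []
        self._skip_depth = 0

    def _flush(self):
        text = " ".join("".join(self._buf).split())  # collapse whitespace
        if text:
            self.blocks.append(text)
        self._buf = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self._skip_depth += 1
        elif tag in BLOCK_TAGS:
            self._flush()

    def handle_endtag(self, tag):
        if tag in SKIP_TAGS:
            self._skip_depth = max(0, self._skip_depth - 1)
        elif tag in BLOCK_TAGS:
            self._flush()

    def handle_data(self, data):
        if not self._skip_depth:
            self._buf.append(data)

    def close(self):
        super().close()
        self._flush()


def split_markup(html: str, max_chars: int = 512) -> list[str]:
    """Greedily merge text blocks into chunks of at most max_chars characters."""
    parser = BlockTextExtractor()
    parser.feed(html)
    parser.close()
    chunks, current = [], ""
    for block in parser.blocks:
        # Hard-split any single block that is longer than max_chars.
        for i in range(0, len(block), max_chars):
            piece = block[i:i + max_chars]
            if current and len(current) + 1 + len(piece) > max_chars:
                chunks.append(current)
                current = piece
            else:
                current = f"{current} {piece}" if current else piece
    if current:
        chunks.append(current)
    return chunks
```

The point is that chunk boundaries follow the document's structure instead of cutting through tags; tables, nested lists, and code blocks would each need their own handling.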
In terms of the search store and engine, would you agree that pgvector is sufficient for most text-specific cases?
I agree. I mentioned in a thread below that these frameworks are useful for discovering the index-retrieval strategy that works best for your product.

On PGVector, I tried to use LangChain's class (https://python.langchain.com/en/latest/modules/indexes/vecto...), but it was highly opinionated, and it didn't make sense to subclass it or implement its interfaces, so in this particular project I did it myself.

As part of implementing with SQLModel I absolutely leaned on https://github.com/pgvector/pgvector :)
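For readers who have not used pgvector, the core of a do-it-yourself setup is only a few SQL statements. The sketch below holds them as Python strings; the table name, column names, embedding dimension, and psycopg-style `%s` placeholders are assumptions, and this is not the SQLModel code from the project described above.

```python
EMBEDDING_DIM = 1536  # assumption: depends on the embedding model you use

# One-time setup: enable the extension and create a table with a vector column.
SETUP_SQL = f"""
CREATE EXTENSION IF NOT EXISTS vector;
CREATE TABLE IF NOT EXISTS chunks (
    id bigserial PRIMARY KEY,
    content text NOT NULL,
    embedding vector({EMBEDDING_DIM})
);
"""

INSERT_SQL = "INSERT INTO chunks (content, embedding) VALUES (%s, %s::vector);"

# pgvector distance operators: <-> is L2 distance, <=> is cosine distance.
SEARCH_SQL = """
SELECT id, content
FROM chunks
ORDER BY embedding <=> %s::vector
LIMIT %s;
"""


def to_vector_literal(values) -> str:
    """Format a sequence of numbers as pgvector's text form, e.g. '[1.0,2.5]'."""
    return "[" + ",".join(repr(float(v)) for v in values) + "]"


# Hypothetical usage with a DB-API cursor (for example from psycopg):
#   cur.execute(INSERT_SQL, ("some text", to_vector_literal(embedding)))
#   cur.execute(SEARCH_SQL, (to_vector_literal(query_embedding), 5))
```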

Thanks for the observation.

FWIW, individual classes are generally tiny, so we found using langchain is fine and then for places we need to beef up (chunking, not calling 'eval', ...), we do our own class/subclass. That way we can align with community for broader pieces and patterns, and decrease technical risks from smaller fly-by-night repos.

At the same time, the underlying APIs are super simple, so just rolling your own entirely, with no framework, can make sense. We need to deal with businesses wanting to plug in their own APIs & models, so that happens to be less attractive to us.

That said, purpose-built frameworks can be great. Our data agent has a headless tier and we are building it fine with langchain, and benefiting from the ecosystem there, but I can imagine someone with more specific needs enjoying Rasa.

Splitting things is easy! Store the dense vectors of chunks of 512 characters or so, and use an overlaid index of terms to set the context of the current conversation.
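One possible reading of that, as a minimal sketch: fixed 512-character chunks plus an inverted term index that narrows down which chunks are relevant to the conversation so far. All function names here are made up for illustration, and computing the dense vectors is left out because it depends on the embedding model.

```python
import re
from collections import defaultdict


def chunk_text(text: str, size: int = 512) -> list[str]:
    # Fixed-size character chunks; words may be cut at chunk boundaries.
    return [text[i:i + size] for i in range(0, len(text), size)]


def tokenize(text: str) -> set[str]:
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def build_term_index(chunks: list[str]) -> dict[str, set[int]]:
    """Inverted index: term -> ids of the chunks that contain it."""
    index = defaultdict(set)
    for chunk_id, chunk in enumerate(chunks):
        for term in tokenize(chunk):
            index[term].add(chunk_id)
    return index


def candidate_chunks(index: dict[str, set[int]], conversation: str) -> list[int]:
    """Rank chunk ids by how many conversation terms they share."""
    scores = defaultdict(int)
    for term in tokenize(conversation):
        for chunk_id in index.get(term, ()):
            scores[chunk_id] += 1
    # Highest overlap first; ties broken by chunk id for stable output.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))
```

The candidate ids from the term index can then be used to filter or re-rank the results of the dense vector search.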

Use Weaviate Cloud for the vector engine…