| HN Mirror


by cmenge 304 days ago

We process tenders for the construction industry. That comes with a 'free' bucket sort from the start: people practically always work on only a single tender at a time.

Still, that single tender can be on the order of a billion tokens. Even if an LLM supported a context window that large, that's roughly 4GB that needs to be moved, and at current LLM prices, inference would cost thousands of dollars. I detailed this a bit more at https://www.tenderstrike.com/en/blog/billion-token-tender-ra...
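To make the arithmetic concrete, here is a rough back-of-envelope sketch. The 4 bytes per token figure is implied by the 1B tokens / 4GB numbers above; the price of $3 per million input tokens is purely an illustrative assumption, not a quote from any provider.

```python
def estimate_full_context_pass(num_tokens: int,
                               bytes_per_token: float = 4.0,
                               usd_per_million_tokens: float = 3.0):
    """Rough size and cost of sending an entire tender to an LLM once.

    bytes_per_token and usd_per_million_tokens are assumptions for
    illustration; plug in your own tokenizer stats and pricing.
    """
    size_gb = num_tokens * bytes_per_token / 1e9
    cost_usd = num_tokens / 1e6 * usd_per_million_tokens
    return size_gb, cost_usd


size_gb, cost_usd = estimate_full_context_pass(1_000_000_000)
print(f"{size_gb:.1f} GB, ${cost_usd:,.0f} per full pass")
```

Under these assumptions every single question against the full tender costs that amount again, which is why narrowing the context first matters.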

And that's just one (though granted, a very large) tender.

For the corpus of a larger company, you'd probably be looking at trillions of tokens.

While I agree that delivering tiny, chopped up parts of context to the LLM might not be a good strategy anymore, sending thousands of ultimately irrelevant pages isn't either, and embeddings definitely give you a much superior search experience compared to (only) classic BM25 text search.
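As a hedged illustration of combining embeddings with BM25 rather than using either alone, here is a minimal sketch using reciprocal rank fusion (RRF). RRF is my choice for the example, not necessarily what we run in production; the document IDs and the two input rankings are made up, and in practice they would come from a BM25 index and an embedding index.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of document IDs into one.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical results for one query from the two retrievers.
bm25_hits = ["spec-12", "drawing-3", "contract-7"]
embedding_hits = ["contract-7", "spec-12", "annex-9"]

fused = reciprocal_rank_fusion([bm25_hits, embedding_hits])
print(fused)
```

Documents that both retrievers rank well float to the top, while something only the embedding search found (here "annex-9") still makes the list.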

2 comments

elliotto 304 days ago

I work at an AI startup, and we've explored a solution where we preprocess documents to produce a short summary of each one, then give the bot these summaries along with a tool call instruction so it can decide which document is relevant. This seems to scale to a few hundred documents of 100k-1m tokens each, but then we run into issues with context window size and context rot. I've thought about extending this into a tree-based structure, kind of like an LLM file system, but have other priorities at the moment.
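Roughly, the summary-routing idea looks like the sketch below. This is a simplified illustration, not our actual code: `open_document` is a hypothetical tool name, the summaries are invented, and `choose` stands in for whatever model call returns the selected document ID.

```python
from typing import Callable, Dict


def build_catalog_prompt(summaries: Dict[str, str], question: str) -> str:
    """List every document summary and tell the model how to pick one."""
    catalog = "\n".join(f"- {doc_id}: {text}" for doc_id, text in summaries.items())
    return (
        "You can call open_document(doc_id) to read exactly one document.\n"
        "Available documents:\n"
        f"{catalog}\n\n"
        f"Question: {question}\n"
        "Reply with the doc_id to open."
    )


def route(summaries: Dict[str, str], question: str,
          choose: Callable[[str], str]) -> str:
    """Ask the model (via `choose`) which document to open and validate the answer."""
    doc_id = choose(build_catalog_prompt(summaries, question)).strip()
    if doc_id not in summaries:
        raise ValueError(f"model picked unknown document: {doc_id!r}")
    return doc_id


# Hypothetical summaries; a real system would generate these in preprocessing.
summaries = {
    "pump-manual": "Maintenance and troubleshooting for the P-200 pump.",
    "valve-manual": "Installation guide for series V control valves.",
}

# Stand-in for the LLM call so the sketch runs without any API.
picked = route(summaries, "How do I service the pump?", lambda prompt: "pump-manual")
print(picked)
```

The prompt grows linearly with the number of summaries, which is where the few-hundred-document ceiling comes from; the tree idea would apply the same routing step to summaries of groups of documents first.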

Embeddings had some context size limitations in our case - we were looking at large technical manuals. Gemini was the first to have a 1m context window, but for some reason its embedding window is tiny. I suspect the embeddings might start to break down when there's too much information.

codyb 303 days ago

For anyone unfamiliar, construction tenders are part of the project bidding process and appear to be a structured and formal manner in which contractors submit bids for large projects.