by turnsout 395 days ago
Yes, this is what we do as a RAG workflow. We created a list of all 32-bit unsigned integers and whether they were even or odd, and we pass that into the context. The future is amazing!

I'm new to RAG and have a question: how do you get all the numbers into the context window?

Does the RAG part look up just the needed number?

I think that Gemini has a million token window (yes?) - do you have access to a model with a larger window?

Regardless, I find your ideas intriguing and wish to subscribe to your Substack.

We have an agentic system that looks up the context size, and then summarizes the even/odd table if necessary. We lose a little bit of accuracy, but now we can handle any model. Be sure to like & subscribe!
Have you tried quantizing them down to 4 bits to save on RAM?
I have found that even 2 bit quantization works, but you have to make sure you only discard the LABs (that’s what we are calling the Left Aligned Bits internally). I have no idea why it works so well but it has cut our costs significantly.
I... can't tell if you're joking or not. Pretty sure someone out there is unironically doing something as stupid as this in production.
The good news is they're definitely joking. The bad news is that indeed, there's definitely someone out there doing this unironically.
Yes, there is.

A former co-worker had to print 5 lines of text. Sometimes, some of the lines were empty but he didn't want to print an empty line.

So he did the usual: he used 'if', a lot of 'if'. He handled every possible combination of empty vs. non-empty lines.
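
For contrast, here is a minimal sketch of the version that skips the case analysis (five lines that can each be empty or not give 2^5 = 32 combinations if you spell them all out). It assumes the lines are already in a list and that "empty" also covers whitespace-only lines; the names `non_empty` and `print_non_empty` are made up for illustration and have nothing to do with the co-worker's actual code.

```python
def non_empty(lines):
    """Return only the lines that contain something other than whitespace."""
    # Assumption: a whitespace-only line counts as empty.
    return [line for line in lines if line.strip()]


def print_non_empty(lines):
    """Print each non-empty line in order, skipping the empty ones."""
    for line in non_empty(lines):
        print(line)


if __name__ == "__main__":
    # Hypothetical input: five lines, two of them empty.
    print_non_empty(["first", "", "third", "   ", "fifth"])
```

One check per line, applied in a loop, and the number of lines no longer matters.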