| HN Mirror



	by bigfudge 842 days ago
	Although RAG is often implemented via vector databases to find 'relevant' content, I'm not sure that's a necessary component. I've been doing what I call RAG by finding 'relevant' content for the current prompt context via a number of different algorithms that don't use vectors. Would you define RAG only as 'prompt optimisation that involves embeddings'?
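To make the idea of non-vector retrieval concrete, here is a minimal sketch of one such approach: ranking candidate passages by IDF-weighted keyword overlap with the prompt. The comment does not say which algorithms are actually used, so the scoring scheme and the names `tokenize` and `rank_by_keyword_overlap` are illustrative assumptions, not the poster's method.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_by_keyword_overlap(query, documents, top_k=2):
    """Return up to top_k documents sharing rare terms with the query.

    Hypothetical helper: scores each document by summing, over the
    distinct query terms, (term count in document) * (IDF of the term).
    No embeddings or vector index are involved.
    """
    doc_counts = [Counter(tokenize(d)) for d in documents]
    n_docs = len(documents)
    query_terms = set(tokenize(query))

    scored = []
    for index, counts in enumerate(doc_counts):
        score = 0.0
        for term in query_terms:
            doc_freq = sum(1 for c in doc_counts if term in c)
            if doc_freq == 0:
                continue  # term appears nowhere, so it cannot discriminate
            idf = math.log(1 + n_docs / doc_freq)
            score += counts[term] * idf
        scored.append((score, index))

    # Highest score first; ties keep the original document order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [documents[i] for score, i in scored[:top_k] if score > 0]


docs = [
    "The cat sat on the mat.",
    "Vector databases store embeddings.",
    "Context windows limit prompt length.",
]
relevant = rank_by_keyword_overlap("prompt context length", docs)
```

The retrieved passages would then be prepended to the prompt exactly as they would be after a vector search; only the relevance function differs.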


eevmanu 842 days ago

Your RAG approach sounds intriguing, especially since you're sidestepping vector databases. But doesn't the cap on input context length still constrain it (ChatGPT Plus at 32K tokens [0], or GPT-4 via the OpenAI API at 128K tokens [1])? Cases that actually hit those limits seem like they'd be pretty rare, though.

[0]: https://openai.com/chatgpt/pricing#:~:text=8K-,32K,-32K

[1]: https://platform.openai.com/docs/models/gpt-4-and-gpt-4-turb...


bigfudge 840 days ago

Yes, context window is a limiting factor, but that's true however you identify the content to augment generation.
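As a rough illustration of why the limit applies regardless of retrieval method, the sketch below packs already-ranked snippets into a fixed budget. The function name `pack_context` and the whitespace word count used as a stand-in for a real tokenizer are assumptions made for this example; actual token counts depend on the model's tokenizer.

```python
def pack_context(snippets, budget, count_tokens=lambda s: len(s.split())):
    """Greedily keep ranked snippets while they fit in the budget.

    `snippets` is assumed to be ordered most relevant first, by whatever
    retrieval method produced it (vector search, keyword match, etc.).
    `count_tokens` defaults to a crude word count, not a real tokenizer.
    """
    chosen = []
    used = 0
    for snippet in snippets:
        cost = count_tokens(snippet)
        if used + cost > budget:
            continue  # too big for the remaining space; try smaller ones
        chosen.append(snippet)
        used += cost
    return chosen


packed = pack_context(["a b c", "d e f g", "h"], budget=4)
```

Here the second snippet is skipped because it would exceed the budget, while the shorter third snippet still fits.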
