Hacker News new | ask | show | jobs
by majormajor 18 days ago
I think if the dream of semantic search from vector embeddings had worked out as well as people had hoped then "grep over a bunch of text" would have some significant disadvantages.

But in practice I never saw anyone crack the embedding-generation-and-comparison problems well enough to actually get better results than grep for things like "find similar code and see what it does."

(You also don't need that advanced a model to use "grep over a pile of files", but the models today can run MUCH faster than GPT 3.5/4 were running over the APIs back then, making "summarize all five hundred of these matches from those files" much more usable.)

1 comments

I’ve had very good luck having my system search for available tool functions with natural language (ultimately against Qdrant). I’m surprised to hear that people are trying to grep files, instead.
People? No, that's what AI agents themselves do.

There are theoretical gains from using a vector search engine in an agentic loop, but grep is the lowest common denominator of agentic search.