by DanielVZ 1215 days ago
They went from indexing with embeddings + LLM to just using a biased embedding for their use case. This should save them most of their costs.
2 comments

Maybe this helps people understand what they are doing at index time.

* Version 1. Ask the LLM to describe the code snippet. Create an embedding of the description. LLM generation + embeddings required.

* Version 2. Run the code snippet directly through the embedding API, skipping the LLM text generation step. Then run the resulting embedding through the bias matrix and index the result.

I assume this only works b/c they fine-tuned a bias matrix on pairs of code snippets and text. Feels more like a light version of transfer learning to me.
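Here is a rough sketch of how I read Version 2, in case it helps. Everything in it is an assumption on my part: `embed` is a hash-based stand-in for the real embedding API call, `BIAS` is an identity placeholder for whatever matrix they actually trained, and the query side is left untransformed because the article does not say how queries are handled.

```python
import hashlib
import math

DIM = 8  # toy dimension; a real embedding API returns much larger vectors


def embed(text: str) -> list[float]:
    """Hypothetical stand-in for an embedding API call.

    Builds a deterministic unit vector from a hash of the text.
    """
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    raw = [b / 255.0 - 0.5 for b in digest[:DIM]]
    norm = math.sqrt(sum(x * x for x in raw))
    return [x / norm for x in raw]


def apply_bias(matrix: list[list[float]], vec: list[float]) -> list[float]:
    """Multiply the embedding by the bias matrix (plain matrix-vector product)."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity, returning 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


# Placeholder: identity matrix. The real one would be learned from
# (code snippet, text) pairs; its values are not given in the article.
BIAS = [[1.0 if i == j else 0.0 for j in range(DIM)] for i in range(DIM)]


def index_snippet(index: dict, snippet_id: str, code: str) -> None:
    """Index time: embed the raw code, apply the bias matrix, store the result."""
    index[snippet_id] = apply_bias(BIAS, embed(code))


def search(index: dict, query: str, k: int = 1) -> list[str]:
    """Query time (assumed): embed the query and rank stored vectors by cosine."""
    q = embed(query)
    ranked = sorted(index, key=lambda sid: cosine(index[sid], q), reverse=True)
    return ranked[:k]
```

The point is that index time costs one embedding call plus a cheap matrix multiply per snippet, with no text generation anywhere.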

The article was a little unclear about the actual approach for V2, so if I have anything wrong please correct me.

I wouldn’t say most—maybe a factor of 2. Getting the embedding is still an API call to an LLM.
I’m pretty sure they were using a high-cost LLM to summarize, and for embeddings you only need Ada, which is orders of magnitude cheaper.
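Both estimates can be right depending on one number: how much the summarization call costs relative to the embedding call for the same snippet. A quick sketch, with the ratio as a hypothetical parameter rather than any real price:

```python
def v1_over_v2_cost(llm_to_embed_ratio: float) -> float:
    """Cost of V1 (LLM summary + embedding) divided by cost of V2 (embedding only).

    llm_to_embed_ratio is hypothetical: the per-snippet cost of the
    summarization call divided by the per-snippet cost of the embedding call.
    """
    if llm_to_embed_ratio < 0:
        raise ValueError("ratio must be non-negative")
    return 1.0 + llm_to_embed_ratio


def fraction_saved(llm_to_embed_ratio: float) -> float:
    """Share of the V1 index-time cost that disappears when the LLM step is dropped."""
    return llm_to_embed_ratio / (1.0 + llm_to_embed_ratio)
```

If the two calls cost about the same (ratio 1), V1 is twice as expensive and you save half. If the summarizer is around 99 times the embedding cost, you save about 99% of the index-time cost, which is "most" by any reading.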