by simonw 974 days ago
You mean this table here?

    text-embedding-ada-002       53.3
    text-search-davinci-*-001    52.8
    text-search-curie-*-001      50.9
    text-search-babbage-*-001    50.4
    text-search-ada-*-001        49.0

That's not comparing it to the davinci/curie/babbage GPT3 models; it's comparing it to the "text-search-*" family.

Those were introduced in https://openai.com/blog/introducing-text-and-code-embeddings as the first public release of embeddings models from OpenAI.

> We’re releasing three families of embedding models, each tuned to perform well on different functionalities: text similarity, text search, and code search. The models take either text or code as input and return an embedding vector.

It's not at all clear to me if there's any relationship between those and the GPT3 davinci/curie/babbage/ada models.

My guess is that OpenAI's naming convention back then was "davinci is the best one, then curie, then babbage, then ada".
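
A rough way to check that guess against the table above. This is only a sketch: the scores are copied from the table, and the `tier` helper is a made-up name for illustration, not anything from OpenAI.

```python
import re

# Scores copied from the table quoted above (text-search family only).
scores = {
    "text-search-davinci-*-001": 52.8,
    "text-search-curie-*-001": 50.9,
    "text-search-babbage-*-001": 50.4,
    "text-search-ada-*-001": 49.0,
}

def tier(model_name):
    """Pull the tier codename (ada/babbage/curie/davinci) out of a model name."""
    match = re.search(r"(ada|babbage|curie|davinci)", model_name)
    return match.group(1) if match else None

# Rank the tiers from highest score to lowest.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
by_score = [tier(name) for name, _ in ranked]
print(by_score)  # ['davinci', 'curie', 'babbage', 'ada']
```

So within that one family the scores do follow the guessed order, though that says nothing about how these models relate to the GPT3 models with the same names.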


How interesting. I assumed that a consistent codename such as Ada/Davinci refers to the lineage/DNA of the OpenAI model from which a distinct product was created. But I can see how these codenames could be "just" a revision label of A/B/C/D (Ada/Babbage/Curie/Davinci), similar to "Pro/Max/Ultra". If true, a product named "M2 Ultra" could have nothing to do with another product called "Watch Ultra".
Wow, I genuinely hadn't noticed the A/B/C/D thing!