Hacker News new | ask | show | jobs
by yawnxyz 1168 days ago
Ok is this really "training" a new model on the data? Or are you breaking it down into vector embeddings, and then using an embeddings search against the relevant content and then passing that into the context window of the OpenAI API?

This is cool, thanks for putting it together — but I think we as a group of designers and engineers should get our language right. If we mean creating embeddings — we should say it, since "training a new model" is very different from using embeddings...

1 comments

It's pretty straightforward to build something like this.

Pseudo:

    embedding = OpenAI.generate_embedding(some_question)

    embedding_matches = pinecone.query(embedding)
    context_strings = embedding_matches.context_strings

    OpenAI.chat(some_question + context_strings)
You give OpenAI's chat API something like:

    """
    This is my user question, how old is James bond?
 
    using this context answer this question:

    {{from doc: james bon is 19 years old}}
    """
Really powerful, really useful - but really simple to create.