by ares623 239 days ago
> OpenAI never trains on your data by default.

Are the embeddings used for RAG (presumably computed by OpenAI) considered the company's data, or OpenAI's?

(I don’t know if that’s how RAG actually works)

1 comment

> (I don’t know if that’s how RAG actually works)

As I understand it, strictly speaking RAG is broader than what you describe, but in practice you're correct for most implementations: https://en.wikipedia.org/wiki/Retrieval-augmented_generation...
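For anyone unsure what "most implementations" look like, here is a minimal sketch of embedding-based retrieval. Everything in it is an assumption for illustration: `toy_embed` is a hypothetical bag-of-words stand-in for a real embedding model or API, and `retrieve` and `build_prompt` are made-up helper names, not any vendor's interface.

```python
import math
from collections import Counter


def toy_embed(text):
    """Hypothetical stand-in for an embedding model: a word-count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, docs, k=1):
    """Rank documents by similarity to the query and keep the top k."""
    q = toy_embed(query)
    ranked = sorted(docs, key=lambda d: cosine(q, toy_embed(d)), reverse=True)
    return ranked[:k]


def build_prompt(query, docs, k=1):
    """Put the retrieved text (not the vectors) in front of the question."""
    context = "\n".join(retrieve(query, docs, k))
    return f"Context:\n{context}\n\nQuestion: {query}"


docs = [
    "refunds are processed within five business days",
    "the office is closed on public holidays",
]
prompt = build_prompt("how long do refunds take", docs)
print(prompt)
```

In this sketch the vectors are only used to rank documents; what ends up in the prompt is the retrieved plain text. Whether a given provider computes, stores, or reuses those vectors is a separate policy question that the code does not answer.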

Could embeddings be used for training, then? It seems like they should be, since that’s what an LLM “sees” anyway. I wonder if that’s what’s going on behind the scenes.