| HN Mirror


by neoromantique 1254 days ago

I'm sorry to be THAT GUY, but it is addressed in the article :)

>GPT embeddings

To index these stories, I loaded up to 2000 tokens worth of comment text (ordered by score, max 2000 characters per comment) and the title of the article for each story and sent them to OpenAI's embedding endpoint, using the standard text-embedding-ada-002 model. This endpoint accepts bulk uploads and is fast, but creating embeddings for all 160k+ documents still took over two hours. Total cost for this part was around $70.
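For anyone curious what that step might look like, here is a rough sketch, not the article author's actual code. The function names (`build_document`, `embed_all`, `openai_embed`), the batch size of 100, and the 4-characters-per-token estimate are all my own assumptions; only the 2000-token budget, the 2000-character cap, the score ordering, and the model name come from the quote. The `openai_embed` helper assumes the `openai` Python package (v1+) and an `OPENAI_API_KEY` in the environment.

```python
from typing import Callable, Iterator, List, Sequence, Tuple

MAX_COMMENT_CHARS = 2000  # per-comment cap, from the quoted article
MAX_DOC_TOKENS = 2000     # per-story comment budget, from the quoted article


def approx_tokens(text: str) -> int:
    """Rough token estimate. Assumption: about 4 characters per token."""
    return max(1, len(text) // 4)


def build_document(
    title: str,
    comments: Sequence[Tuple[int, str]],
    count_tokens: Callable[[str], int] = approx_tokens,
) -> str:
    """Join the title with the highest-scored comments that fit the budget.

    `comments` is a sequence of (score, text) pairs.
    """
    parts = [title]
    budget = MAX_DOC_TOKENS
    for _score, text in sorted(comments, key=lambda c: c[0], reverse=True):
        snippet = text[:MAX_COMMENT_CHARS]
        cost = count_tokens(snippet)
        if cost > budget:
            break  # stop at the first comment that no longer fits
        parts.append(snippet)
        budget -= cost
    return "\n\n".join(parts)


def batched(items: Sequence[str], size: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def embed_all(
    docs: Sequence[str],
    embed_fn: Callable[[Sequence[str]], List[List[float]]],
    batch_size: int = 100,  # hypothetical batch size, not from the article
) -> List[List[float]]:
    """Send documents to `embed_fn` in batches and collect the vectors."""
    vectors: List[List[float]] = []
    for batch in batched(docs, batch_size):
        vectors.extend(embed_fn(batch))
    return vectors


def openai_embed(batch: Sequence[str]) -> List[List[float]]:
    """One bulk request to the embeddings endpoint (needs network + API key)."""
    from openai import OpenAI  # imported lazily so the helpers above work offline

    client = OpenAI()
    response = client.embeddings.create(
        model="text-embedding-ada-002", input=list(batch)
    )
    return [item.embedding for item in response.data]
```

Usage would be something like `embed_all([build_document(t, c) for t, c in stories], openai_embed)`, where `stories` is a hypothetical list of (title, comments) pairs. For real token counts you would swap `approx_tokens` for a proper tokenizer.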