by liampulles 780 days ago
This is a key use case for text embeddings. Essentially, embedding is the process of converting sentences or paragraphs into vectors, where the closeness of two vectors represents the semantic similarity of the texts they came from.

So you can convert all the paragraphs in your document into vectors, convert your question into a vector, and then find the closest vectors: e.g. the 10 nearest, or all that fall within a certain maximum distance, etc.
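
A rough sketch of that search, in case it helps. Everything here is an assumption for illustration: `embed` is a toy bag-of-words stand-in for a real embedding model (so it only matches shared words, not actual meaning), and `cosine_distance` and `search` are hypothetical helper names. With a real model you would swap out `embed` and keep the rest roughly as is.

```python
import math
from collections import Counter

def embed(text):
    """Toy placeholder for an embedding model: a sparse word-count vector.
    A real model would return a dense vector that captures meaning."""
    return Counter(w.strip(".,?!") for w in text.lower().split())

def cosine_distance(a, b):
    """1 - cosine similarity; 0.0 means same direction, 1.0 means no overlap."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)

def search(question, paragraphs, k=10, max_distance=None):
    """Return up to k (distance, paragraph) pairs, closest first.
    If max_distance is set, drop anything farther away than that."""
    q = embed(question)
    scored = sorted((cosine_distance(q, embed(p)), p) for p in paragraphs)
    if max_distance is not None:
        scored = [(d, p) for d, p in scored if d <= max_distance]
    return scored[:k]

paragraphs = [
    "The cat sat on the mat.",
    "Stock prices fell sharply today.",
    "A cat is a small animal.",
]
for distance, paragraph in search("Where did the cat sit?", paragraphs, k=2):
    print(round(distance, 3), paragraph)
```

The `k` and `max_distance` parameters correspond to the two options above: take the nearest few, or take everything within a cutoff.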

You can store the embeddings in a vector database to search across multiple documents.

1 comment

Thank you for the answer! Wouldn't it make sense to compare the potential answers to the questions against the parts of the documents?
If the potential answers are given, sure.