| HN Mirror


	by tablatom 1179 days ago
	As an LLM novice, can someone explain what these "your document" apps are doing? My understanding is that GPT-4 doesn't support fine-tuning, and 50MB is too large to add to the prompt (which would be too expensive anyway).

2 comments

naveedjanmo 1179 days ago

Hey! I'm the developer of Unriddle - it works using text embeddings. The document is split into small chunks and each chunk is assigned a numerical representation, or "vector", of its semantic meaning. When a user submits a prompt, it too is assigned a vector, which is then compared against the chunk vectors. The most similar chunks are then fed into GPT-4 along with the query, ensuring the total number of tokens doesn't exceed the context window limit.
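
[Editor's sketch] The chunk, embed, and compare steps described above could look roughly like the following. This is not Unriddle's code: `toy_embed` is a hypothetical word-count stand-in for a real embedding model, and the chunk size, overlap, and function names are all assumptions chosen for illustration.

```python
import math
import re
from collections import Counter


def split_into_chunks(text: str, chunk_words: int = 40, overlap: int = 10) -> list[str]:
    """Split text into word windows that overlap slightly, so an idea is less likely to be cut in half."""
    words = text.split()
    if not words:
        return []
    step = chunk_words - overlap
    starts = range(0, max(len(words) - overlap, 1), step)
    return [" ".join(words[i:i + chunk_words]) for i in starts]


def toy_embed(text: str) -> Counter:
    """Hypothetical stand-in for an embedding model: a sparse word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def top_chunks(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks whose vectors are closest to the query vector."""
    q = toy_embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, toy_embed(c)), reverse=True)
    return ranked[:k]
```

A real embedding model captures meaning rather than exact word overlap, so it would also match "cats" to "cat"; the ranking logic stays the same.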

petra 1179 days ago

//The similar chunks are then fed into GPT-4 along with the query

Since GPT can use things from its context arbitrarily, does it solve the hallucination issue, even for ebooks?

tablatom 1179 days ago

Awesome - I knew about vectorising/embeddings for semantic search, but I hadn't thought of using the search results as a prompt prefix - clever!

nemo44x 1179 days ago

Yeah, it's the pattern all these tools are using.

Use SentenceTransformers in Python to embed the chunks and write them to the database (Pinecone), and then embed the queries the same way. Use the results as context.

wufufufu 1179 days ago

What OpenAI API calls allow sending these small chunks?

When you query something like "What is this research about?", is it able to use data from all chunks?

naveedjanmo 1179 days ago

It's just the GPT-4 API - the chunks are sent as part of a prompt. In that case it won't use data from all chunks but it will try to find any chunks that provide descriptions of the document. I've found with research papers, for example, it fetches parts of the introduction and abstract.
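
[Editor's sketch] Sending chunks "as part of a prompt" might be assembled along these lines. The prompt wording, the `build_prompt` name, and the `max_words` budget are assumptions, and counting whitespace-separated words is only a rough proxy for the tokens a real context window is measured in.

```python
def build_prompt(question: str, ranked_chunks: list[str], max_words: int = 3000) -> str:
    """Greedily add chunks, best first, until the rough word budget is used up."""
    header = "Answer the question using only the context below.\n\nContext:\n"
    footer = f"\n\nQuestion: {question}\nAnswer:"
    # Reserve room for the fixed instructions and the question itself.
    budget = max_words - len((header + footer).split())
    picked = []
    for chunk in ranked_chunks:
        cost = len(chunk.split()) + 1  # +1 for the separator line
        if cost > budget:
            break  # stop at the first chunk that does not fit
        picked.append(chunk)
        budget -= cost
    return header + "\n---\n".join(picked) + footer
```

The resulting string would then be sent as an ordinary message to the chat model; only the chunks that made it into the prompt are visible to the model, which is why a whole-document question relies on retrieval finding summary-like passages.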

wufufufu 1179 days ago

Oh so there is pre-processing to find the useful portions? What are you using for the pre-processing?

I feel that it's inevitable that OpenAI et al. will be able to handle large PDF documents eventually. But until then I'm sure there's a lot of value in this kind of pre-processing/chunking.

naveedjanmo 1179 days ago

Yeah, I think you're right - the 32k context window for GPT-4 (not available for everyone yet) is already enough for research papers. I'm using a library called LangChain; there's also LlamaIndex.

shekhar101 1179 days ago

Can the vectorization of chunks and the search for context close to the query be done with any LLM, so that only the relevant chunks are sent to OpenAI?

naveedjanmo 1179 days ago

Vectorisation is done via OpenAI's embedding API, and the chunking/querying happens through the LangChain library. But there are a few different ways of doing it - another good library is LlamaIndex.

shekhar101 1179 days ago

Thanks a lot! Do you _have_ to do vectorization and querying with the same LLM? Can someone do vectorization with one model and do the querying with relevant chunks using another?

oneshoe 1179 days ago

Simply speaking - They chunk the document (make it smaller so that it can be sent to gpt) and then vectorize it (change it to numbers / vector array). From there that is stored in a vector store - now, when you query you first query your vector store for the context (part of the 50MB file) and then send the context along with the question to GPT.

You are right that GPT-4 doesn't support fine-tuning, but I think (in general) people might be misunderstanding what fine-tuning does.
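
[Editor's sketch] The store-then-query flow described in this comment can be outlined as below. `InMemoryVectorStore`, `letter_counts`, and `answer` are hypothetical names; the letter-frequency embedder is a toy stand-in for a real embedding model, and `generate` stands in for whatever call sends a prompt to the chat model.

```python
import math
from typing import Callable, Sequence

Vector = Sequence[float]


def _cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity between two dense vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryVectorStore:
    """Toy vector store: embeds chunks once on add, then ranks them per query."""

    def __init__(self, embed: Callable[[str], Vector]):
        self._embed = embed
        self._rows: list[tuple[str, Vector]] = []

    def add(self, chunks: list[str]) -> None:
        for chunk in chunks:
            self._rows.append((chunk, self._embed(chunk)))

    def query(self, text: str, k: int = 3) -> list[str]:
        q = self._embed(text)  # must be the same embedder used in add()
        ranked = sorted(self._rows, key=lambda row: _cosine(q, row[1]), reverse=True)
        return [chunk for chunk, _ in ranked[:k]]


def letter_counts(text: str) -> list[int]:
    """Hypothetical embedder for illustration only: 26 letter frequencies."""
    return [text.lower().count(ch) for ch in "abcdefghijklmnopqrstuvwxyz"]


def answer(question: str, store: InMemoryVectorStore,
           generate: Callable[[str], str], k: int = 3) -> str:
    """Fetch context from the store, then hand context plus question to the generator."""
    context = "\n".join(store.query(question, k))
    return generate(f"Context:\n{context}\n\nQuestion: {question}")
```

In this shape the embedder and the generator are separate callables, so they need not come from the same provider; what matters is that chunks and queries pass through the same embedder, since vectors from different embedding models are not comparable.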

shekhar101 1179 days ago

Good explanation. Thanks! Can the first part, i.e. vectorizing and finding relevant chunks be done with any LLM (e.g. a self hosted one) and the second part, i.e. querying relevant chunks be done with OpenAI?
