by maxbaines 953 days ago
Training a bespoke LLM takes a lot of effort and compute; you are probably better off using Retrieval Augmented Generation (RAG). Here's some information from LangChain:

https://js.langchain.com/docs/modules/data_connection/ https://python.langchain.com/docs/modules/data_connection/

Also, OpenAI last week released Assistants, which is an easy way to achieve RAG without needing new tools such as vector DBs, although 5000 docs is perhaps too large for Assistants.

The first decision is whether you would use an open model such as Llama 2 and host it yourself, or a hosted model such as GPT-4 from OpenAI or Claude 2 from Anthropic, etc.
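
For anyone new to RAG, here is a rough sketch of the retrieve-then-prompt idea in plain Python. It is not LangChain's or OpenAI's API: the function names (`retrieve`, `build_prompt`), the sample documents, and the word-count cosine similarity are all assumptions made for illustration. A real setup would swap in embeddings and a vector store, then send the prompt to whichever model you pick.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, docs, k=2):
    """Return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(
        docs, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True
    )
    return ranked[:k]


def build_prompt(query, docs, k=2):
    """Put the retrieved documents into the prompt as context for the LLM."""
    context = "\n\n".join(retrieve(query, docs, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )


# Hypothetical sample documents, standing in for the 5000 articles.
docs = [
    "The city council approved the new bridge budget.",
    "The football team won the regional cup.",
    "Bridge construction is delayed by steel shortages.",
]

print(build_prompt("bridge budget", docs))
```

The model never sees the whole corpus, only the few passages that score highest for the question, which is why no training is needed.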

2 comments

Thank you! I checked out Langchain last night and wow I am super impressed by how accessible it is.

Do you have any good resources on cleaning up / structuring data? The 5000 articles I have span multiple years, which means contextual information may be "spread out". The data I have contains the date each article was written; I'm pondering how to ensure the LLM doesn't talk about a fact from 2015 as if it's still true in the present day.
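
One idea for the date problem, sketched below as an untested assumption rather than a known best practice: keep the written date attached to every article and put it in front of the text when building the context, newest first, so the model can see how old each claim is. The `Article` class and `format_context` function are hypothetical names for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Article:
    written: date  # date the article was written, taken from the source data
    text: str


def format_context(articles, today):
    """Prefix each article with its date and rough age, newest first."""
    lines = []
    for a in sorted(articles, key=lambda a: a.written, reverse=True):
        age_years = (today - a.written).days // 365  # approximate age in years
        lines.append(
            f"[written {a.written.isoformat()}, ~{age_years}y ago] {a.text}"
        )
    return "\n".join(lines)


# Hypothetical sample data.
articles = [
    Article(date(2015, 6, 1), "The stadium is under construction."),
    Article(date(2023, 1, 15), "The stadium opened to the public."),
]

print(format_context(articles, today=date(2023, 11, 15)))
```

The prompt could then tell the model to prefer the most recent article when two of them disagree.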

I agree with this. The effort and time required to train a model are usually not worth it; using RAG with a model of 7B or more is usually more than sufficient.