by poomer
1094 days ago
This class of startup, "build domain-specific LLMs using your own data", is extremely crowded right now, but I am not optimistic about their future. For large companies, the actual modeling work is already easy for any ML team, thanks to existing FOSS work on things like PEFT and LoRA. The hard part is figuring out what data goes into the fine-tuning process and how to get that data into a usable form, but this is very business-specific and can't be automated in a SaaS process. For SMBs, the value would be in using the LLM to generate responses to customer Q&A/search queries. But these companies aren't going to integrate an external third-party service; they'll only use it if it's already baked into their CMS (WordPress/Shopify/Wix/etc.). I just don't see who the final consumer for this product would be.
|
It seems to me that the vast majority of these people would be better off just doing semantic search: chunk their documents, run the chunks through an embedding model, and store the vectors in a vector database, then pass the search query and the retrieved results through an LLM as the final step to create an actual "answer". For applications where this is not practical, I agree that LoRA should be the next approach. I have a hard time believing that the future is everyone training their own domain-specific LLMs from the ground up.
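To make the semantic search pipeline above concrete, here is a rough, stdlib-only sketch of the steps (chunk, embed, store, search, build the prompt). Everything in it is an assumption for illustration: `embed` is a toy bag-of-words stand-in for a real embedding model, `VectorStore` is a hypothetical in-memory list rather than an actual vector database, and the final LLM call is left out, so the sketch stops at the prompt you would send.

```python
import math
import re
from collections import Counter


def chunk(text, max_words=40):
    """Split a document into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text):
    """Toy stand-in for an embedding model: lowercase term counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class VectorStore:
    """Hypothetical in-memory store; a real setup would use a vector database."""

    def __init__(self):
        self.items = []  # list of (chunk_text, vector) pairs

    def add(self, document):
        for piece in chunk(document):
            self.items.append((piece, embed(piece)))

    def search(self, query, k=2):
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[1]), reverse=True)
        return [text for text, _ in ranked[:k]]


def build_prompt(query, passages):
    """Assemble the text that would be sent to the LLM at the final step."""
    context = "\n".join(f"- {p}" for p in passages)
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"


store = VectorStore()
store.add("Refunds are accepted within 30 days of purchase with a receipt.")
store.add("Standard shipping takes 3 to 5 business days within the US.")

query = "How long does shipping take?"
top = store.search(query, k=1)
prompt = build_prompt(query, top)
print(prompt)
```

Swapping the toy `embed` for a real embedding model and the list for a real vector database shouldn't change the shape of this much, and no training is involved at any point.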