| HN Mirror

Y	Hacker News new \| ask \| show \| jobs

by kcorbitt 716 days ago

(Disclaimer: I'm the founder of OpenPipe, one of the fine-tuning services OP tried and ultimately the one that produced the highest performing model, it appears.)

Data extraction is a use case that fine-tuned models are fantastic at, so I'm not surprised that OP got good results. That said, I've also found it's pretty easy to beat GPT-4 across many task types if you have a way of getting strong training data. We published some research[1] a week ago where we found that across 4 example tasks spanning creative summarization, question answering, data extraction and classification a fine-tuned Llama 3 8B was able to outperform GPT-4 on 3 of them. The key was to create a repeatable way of generating high-quality training data, which is also addressed in the post.

[1]: https://openpipe.ai/blog/mixture-of-agents

3 comments

GlassOwAter 715 days ago

Is this something, as a tech enthusiast that's no expert, I can easily fine tune are run?

My use case would be fine tuning on technical docs. Specific news, 2 years of blog posts, primary source material, and Twitter explainer thread. I want to gather all the niche information of a topic from the last two years, dump it into this and have an LLM that is a subject-matter expert.

afro88 715 days ago

Fine tuning doesn't quite work that way. You have to format the training data set as request/response. The idea of fine tuning is to get the model to output things in a specific format, style or structure.

Your use case is better suited to RAG. This is where you retrieve data from a large dataset and inject it into the user's request so the AI model has the context it needs to answer accurately.

But that's not a silver bullet and you would need to spend significant time on chunking strategy and ranking of results to hopefully get a decent response accuracy.

w4nderlust 715 days ago

Here is an example of the Predibase platform, referred in the article for the Solar model, but that can train also Llama-3, Phi-3 and Mistral. https://www.youtube.com/watch?v=R2JQhzfaOFw&themeRefresh=1 I think you can assess by yourself if it's easy enough to do for you. (Predibase founder here)

colordrops 716 days ago

Why isn't someone providing a "meta model" that uses an LLM to choose between various fine tuned models depending on the question to get overall better results than gpt4?

billmalarky 716 days ago

Founding AI Engineer at OpenPipe here, using a fine tuned "router LLM" to route between various specialized (inc fine tuned but not necessarily) applied models depending on the input is becoming a common pattern in more modern "graph like" LLM applications.

See LangGraph's "conditional edges" concept here: https://langchain-ai.github.io/langgraph/concepts/low_level/...

You can see how that "routing function" could include a call to a "Router LLM." And yes, fine tuning is a great method to better improve the routing intelligence of said Router LLM.

Great question btw!

anon373839 715 days ago

Worth mentioning that you don’t even need separate models to implement this. Dynamically loading LoRA adapters is much more efficient, and is the approach Apple took.

bashfulpup 715 days ago

Already a big thing. See the constellation architecture used here:

https://arxiv.org/html/2403.13313v1

sheepscreek 715 days ago

Very loosely, isn’t this what is happening inside most LLMs that have a “multi-head” mechanism?

drphilwinder 715 days ago

Check out https://unify.ai/chat if you're interested in a router optimised for cost/ttft/performance for commercial language models.

babelfish 715 days ago

Is using model responses to train a new model against the ToS for the major LLM providers (OpenAI, Anthropic, etc)?

yreg 715 days ago

There doesn't seem to be any restriction like that in OpenAI terms.

zepton 715 days ago

There is: "you may not... Use Output to develop models that compete with OpenAI"

(from https://openai.com/policies/terms-of-use/)

yreg 715 days ago

Thanks, I've missed that.

I suppose the Output could be washed by publishing it on the web and having another entity crawl it.

OpenAI doesn't treat anyone else's content any differently, acting like it's a fair game, so why should we care.

jaredhallen 715 days ago

Data laundering. What a time to be alive.

babelfish 715 days ago

It seems like you do not work for OpenPipe (OP), so it probably doesn't matter for you, but it could (should) matter a whole lot for OpenPipe and/or their customers