Hacker News
by Redster 811 days ago
What LLM are you hoping to use? Have you considered using HelixML? If I am reading you right, the primary concern is compute costs, not human-time costs?

We are finding there is a trade-off between model performance and hosting costs post-training. The ideal outcome is a model that performs well on next-token prediction (and some other in-house tasks we've defined) and that we can host on the lowest-cost provider rather than being locked in. I think we'd only go the proprietary-model route if the model really was that much better. We're just trying to save ourselves weeks or months of benchmarking time and cost if there is already an established option in this space.
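To make the trade-off concrete, here is a rough sketch of the selection rule described above: among candidates that clear a quality bar on next-token prediction, pick the one that is cheapest to host. The model names, perplexity values, costs, and the `cheapest_acceptable` helper are all hypothetical and made up for illustration; real numbers would come from your own benchmarks and provider quotes, and the in-house tasks would need their own thresholds.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    name: str             # hypothetical model label
    perplexity: float     # next-token perplexity on a held-out set (lower is better)
    cost_per_hour: float  # hypothetical hosting cost in USD per hour


def cheapest_acceptable(
    candidates: list[Candidate], max_perplexity: float
) -> Optional[Candidate]:
    """Return the cheapest candidate whose perplexity is within the threshold."""
    acceptable = [c for c in candidates if c.perplexity <= max_perplexity]
    if not acceptable:
        return None
    return min(acceptable, key=lambda c: c.cost_per_hour)


# Made-up numbers purely for illustration; these are not benchmark results.
candidates = [
    Candidate("small-open-model", perplexity=9.5, cost_per_hour=0.60),
    Candidate("medium-open-model", perplexity=7.8, cost_per_hour=1.40),
    Candidate("large-open-model", perplexity=7.1, cost_per_hour=4.00),
]

choice = cheapest_acceptable(candidates, max_perplexity=8.0)
print(choice.name if choice else "no candidate meets the threshold")
```

A hard threshold is only one way to frame it; a weighted score over quality and cost would work too, but the threshold version matches the "good enough, then cheapest" framing more directly.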
That said, I think dvt's comment is helpful: RAG is likely what you need rather than fine-tuning. I just wanted to offer something in case you know fine-tuning is what you need.