Hacker News
by koboll 1152 days ago
Ghostwriter is notably worse than GPT-4, so while it may be true in a sense that "Training a custom model allows us to tailor it to our specific needs and requirements", the reality is they'd be getting better results just using OpenAI right now. Probably true for almost every other use case.

That said, I am patiently waiting and champing at the bit for the day this isn't true anymore. Cool to see the groundwork being laid for it.

6 comments

Stable Diffusion 1.5 is not SOTA, but in reality the sea of augmentations makes SD kinda unbeatable, if you are willing to put in the work to use them.

I think LLMs could end up the same way, if the community consolidates around a good one.

Not everyone wants to depend on and trust a cloud service, and not everyone needs GPT-4 quality.

If there's a viable way to tune and run models locally, they could still be useful if you don't need them to play chess and imitate a Python interpreter at the same time.

Is it possible to add to an LLM without re-training it? My understanding was no.
The original “pre-training” is what’s expensive. The “fine-tuning” (also training, in that it modifies network weights) for instruction following or other tasks costs in the thousand-dollar range.
If one of your specific needs and requirements is that you do not share data with OpenAI then this is a viable option.
just my 5 cents. it should be easier to train a small custom model that works off a big pre-trained one, taking the big model's latent state as input while the big model does all the hard work. but getting the latent state means it has to be accessible. that's why open source models are so valuable, even if they are not that good in general. moreover, open source models can be used in other projects in various setups.
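For anyone wondering what "a small model working off a big one's latent state" could look like, here is a toy sketch in plain Python. Everything in it is an assumption for illustration: the "big model" is just a fixed random tanh layer standing in for a real pre-trained network, and the names (`frozen_latent`, `train_head`, `predict`) and the circle task are made up. The only point is that the big model's weights are never touched, and only the small head is trained on the latents it exposes.

```python
import math
import random

random.seed(0)

DIM_IN, DIM_LATENT = 2, 16

# Stand-in for a large pre-trained model (hypothetical): fixed random
# weights that are never updated during training.
FROZEN_W = [[random.gauss(0, 1) for _ in range(DIM_IN)] for _ in range(DIM_LATENT)]
FROZEN_B = [random.gauss(0, 1) for _ in range(DIM_LATENT)]


def frozen_latent(x):
    """Return the 'latent state' the frozen model produces for input x."""
    return [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(FROZEN_W, FROZEN_B)]


def sigmoid(z):
    z = max(-60.0, min(60.0, z))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))


def train_head(data, epochs=100, lr=0.1):
    """Train a small logistic-regression head on the frozen latents only."""
    w = [0.0] * DIM_LATENT
    b = 0.0
    # The big model runs once per example; its output is then reused.
    latents = [(frozen_latent(x), y) for x, y in data]
    for _ in range(epochs):
        for h, y in latents:
            p = sigmoid(sum(wi * hi for wi, hi in zip(w, h)) + b)
            g = p - y  # gradient of the log loss w.r.t. the logit
            w = [wi - lr * g * hi for wi, hi in zip(w, h)]
            b -= lr * g
    return w, b


def predict(head, x):
    """Classify x using the frozen latents plus the small trained head."""
    w, b = head
    h = frozen_latent(x)
    return 1 if sigmoid(sum(wi * hi for wi, hi in zip(w, h)) + b) > 0.5 else 0


# Toy task (made up): is the point inside the unit circle?
points = [(random.uniform(-2, 2), random.uniform(-2, 2)) for _ in range(300)]
data = [(p, 1 if p[0] ** 2 + p[1] ** 2 < 1 else 0) for p in points]

frozen_before = [row[:] for row in FROZEN_W]
head = train_head(data)
accuracy = sum(predict(head, x) == y for x, y in data) / len(data)
print(f"training accuracy of the small head: {accuracy:.2f}")
```

This only works if you can actually read the latents, which is the accessibility point above: with a hosted API that returns only text, there is nothing for `train_head` to consume.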
They’re competing directly with Microsoft (and getting crushed) because GitHub is their biggest competitor, so it makes sense that they wouldn’t want to use OpenAI products.

Agree that Ghostwriter is subpar though.

That could all change in a few months. We saw locally runnable, open source image generation catch up quick.