by salesynerd 818 days ago
GPT-4 seems to be the least biased of all the LLMs. As a newbie to the field, I wonder: does this mean that OpenAI has the most "balanced" data, and/or that it does a great job of training its model? If the training is the secret sauce of success, would it make sense for these companies to share their "best" data with each other?
2 comments

It could also mean that they are the ones who have so far put the most effort into "patching" the LLM.
Absolutely this. You can fill many holes in a ship if you have many fingers.

I think we quickly forget how silly the old models were compared to the newer ones.

OpenAI had a head start and a considerable amount of like/dislike and "what could be better" data - not to mention the "rewrite" button meaning the answer written by the LLM wasn't adequate enough.

Oh and the side by side comparisons etc. SO MANY DATAPOINTS.

This is low-hanging fruit in the realm of data science, and I haven't seen the other companies use it, which is confusing.

They have invested the most in preference alignment with special attention to DEI (for better or for worse).