by brucethemoose2 929 days ago
I'm not saying it's better than 70B, just that it's very strong from what others are saying.

Actually I am testing the 34B myself (not the 7B), and it seems good.


UNA: Uniform Neural Alignment. Haven't you noticed yet? Each model that I uniform behaves like a pre-trained one, and you can likely fine-tune it again without damaging it.

If you've chatted with them, you know that strange sensation, and you know what it is: intelligence. Xaberius-34B is the highest performer on the board, and it is NOT contaminated.

How much data do you need for UNA? Is a typical fine-tuning dataset needed, or can you get away with less than that?
In addition to what was said, if it's anything like DPO, you don't need a lot of data, just a good set. For instance, DPO requires "good" and "bad" responses for each given prompt.
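
To make the "good" and "bad" responses point concrete, here is a rough sketch of what one DPO preference record and the per-pair DPO loss look like. This is plain DPO only; nothing in this thread says how UNA works, so don't read it as UNA. The names `PreferencePair` and `dpo_loss`, the example texts, the log-probabilities, and `beta=0.1` are all made up for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class PreferencePair:
    """One DPO training record (hypothetical field names)."""
    prompt: str
    chosen: str    # the "good" response
    rejected: str  # the "bad" response


def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """Per-pair DPO loss: -log(sigmoid(beta * margin)).

    Each argument is the summed log-probability of a response under
    either the model being trained (policy) or a frozen reference model.
    """
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    x = beta * margin
    # -log(sigmoid(x)) == softplus(-x), written in a numerically stable form
    return max(-x, 0.0) + math.log1p(math.exp(-abs(x)))


# Illustrative record; the text is invented, not from any real dataset.
pair = PreferencePair(
    prompt="Explain what a mutex is.",
    chosen="A mutex is a lock that lets only one thread enter a critical section at a time.",
    rejected="A mutex is a kind of database.",
)

# Made-up log-probabilities: the policy prefers the chosen response more
# strongly than the reference does, so the loss falls below log(2).
print(dpo_loss(-1.0, -3.0, -2.0, -2.0))
```

The loss only rewards the policy for widening the gap between chosen and rejected relative to the reference model, which is why a small but clean set of pairs can go a long way.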
It doesn't require much data; on a 7B it can take around a couple of hours.
That’s cool. A couple of hours on a single GPU, or on something like 8x A100s?
> I'm not saying it's better than 70B, just that it's very strong from what others are saying.

Gotcha

> Actually I am testing the 34B myself (not the 7B), and it seems good.

I've heard good things about it.