by lolinder 859 days ago
Mistral's process for releasing new models is extremely low-information. After getting very confused by this link I tried looking for a link that has any better information, and there just isn't one.

I thought Mixtral's release was weird when they just pasted a magnet link [0] into Twitter with no information, but at least people could download and analyze it so we got some reasonable third-party commentary in between that and the official announcement. With this one there's nothing at all to go on besides the name and the black box.

[0] https://news.ycombinator.com/item?id=38570537

Company creates blackbox technology, and the company's communications are themselves like a blackbox... fitting

(I know that Mistral does a lot more stuff in the open than other companies, just couldn't resist the parallel between this and the blackbox limitations of LLMs in general)

> for releasing new models is extremely low-information

To be fair, this is not a release. This was the previous release https://mistral.ai/news/mixtral-of-experts/

It looks more like not trying very hard to hide things until release, rather than being a black box.

If this were the first incident like this I would agree, but they very intentionally dropped the magnet link for Mixtral on Twitter with no further context. That leaves me wondering if this was also weird on purpose rather than just them being casual.

Does it matter? You know that if you really want to play with things early, you may get an opportunity. And if you want to read more details, you'll get an announcement too. What's the problem with it being either on purpose or casual?

It's an observation, not a complaint. It does leave very little to go on for an HN discussion, though, besides a meta conversation like this one.

Well, what could they say? Given the lack of transparency on the data, it could well be:

“We’ve trained LLaMA MoE on a lot of GPT4 data. And it is not as good as GPT4. And this is our blob, so we can release it under any license. If someone is silly enough to use what this blob generates, this is not our problem.”