
	by evilduck 69 days ago
	I just wanted to express gratitude to you guys, you do great work. That said, it is a little annoying to have to redownload big models, and keeping up with the AI news and community sentiment is a full-time job. I wish there were some mechanism somewhere (on your site or Hugging Face or something) for displaying feedback or confidence in a model being "ready for general use" before kicking off 100+ GB model downloads.


danielhanchen 69 days ago

Hey thanks - yes agreed - for now we do:

1. Split metadata into shard 0 for huge models, so only 10B needs re-downloading for chat template fixes. However, fixes sometimes force a recalculation of the imatrix, which means all quants have to be re-made

2. Add HF discussion posts on each model describing what changed, and announce the changes on our Reddit and Twitter

3. Hugging Face XET now de-duplicates shard downloads, so redownloading 100GB models should generally be much faster: it splits the 100GB into small chunks, hashes them, and only downloads the chunks that have changed
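
The chunk-and-hash idea in point 3 can be sketched roughly as follows. This is a toy illustration, not XET's actual implementation: the fixed 4-byte chunk size, the use of SHA-256, and the function names `chunk_hashes` and `chunks_to_fetch` are all assumptions made for the example, and the real chunking scheme may differ.

```python
import hashlib

CHUNK_SIZE = 4  # tiny fixed size for illustration only (assumption, not XET's real value)


def chunk_hashes(data: bytes, chunk_size: int = CHUNK_SIZE) -> list[str]:
    """Split data into fixed-size chunks and return each chunk's SHA-256 hex digest."""
    return [
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def chunks_to_fetch(local: bytes, remote: bytes, chunk_size: int = CHUNK_SIZE) -> list[int]:
    """Return indices of remote chunks whose hash is not already present locally."""
    have = set(chunk_hashes(local, chunk_size))
    return [i for i, h in enumerate(chunk_hashes(remote, chunk_size)) if h not in have]


old = b"AAAABBBBCCCCDDDD"  # stand-in for the previously downloaded file
new = b"AAAAXXXXCCCCDDDD"  # stand-in for the re-uploaded file with one changed region
print(chunks_to_fetch(old, new))  # prints [1]: only the second chunk must be downloaded
```

With a real 100GB file the same principle applies: unchanged chunks are already on disk, so only the chunks with new hashes need to cross the network.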


ssrshh 68 days ago

Do you happen to know: is this also why LM Studio and Ollama model downloads often fail with a signature mismatch error?


danielhanchen 68 days ago

Probably yes


evilduck 68 days ago

Ah thanks, I wasn't aware of #3, that should be a huge boon.


danielhanchen 68 days ago

Oh yes! This only applies if one uses hf download / snapshot_download - other normal download methods sadly won't have XET
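
As a rough illustration of the download path mentioned here, the sketch below wraps `snapshot_download` from the `huggingface_hub` Python package. The wrapper name `download_quant`, its parameters, and the example file pattern are hypothetical, and whether the transfer actually goes through XET depends on the installed `huggingface_hub` version and its Xet support, which is assumed rather than checked.

```python
def download_quant(repo_id: str, pattern: str, local_dir: str) -> str:
    """Download only the files in `repo_id` matching `pattern`; return the local path.

    Hypothetical helper: repo_id, pattern and local_dir are supplied by the caller.
    """
    # Imported lazily so this module loads even when huggingface_hub is not installed.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=repo_id,
        allow_patterns=[pattern],  # e.g. "*Q4_K_M*" to fetch a single quant
        local_dir=local_dir,
    )
```

Restricting the download to one quant with `allow_patterns` avoids pulling every file in a large repository when only a single variant is needed.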


CamperBob2 69 days ago

Best policy is to just wait a couple of weeks after a major model is released. It's frustrating to have to re-download tens or hundreds of GB every few days, but the quant producers have no choice but to release early and often if they want to maintain their reputation.

Ideally the labs releasing the open models would work with Unsloth and the llama.cpp maintainers in advance to work out the bugs up front. That does sometimes happen, but not always.


danielhanchen 69 days ago

Yep, agreed, at least 1 week is a good idea :)

We do get early access to nearly all models, and we do find the most pressing issues sometimes. But sadly some issues are really hard to find and diagnose :(
