| HN Mirror


	by benenglish 1203 days ago
	Wondering how difficult this would be to get running on an M1 Max?

2 comments

ComplexSystems 1203 days ago

I got one token every 8 minutes or so.

popol12 1203 days ago

Using which model? On a pretty mid-range 11th-gen i5 I'm getting 0.35 tokens/s with the 7B model. Haven't tried the bigger models.

2Gkashmiri 1203 days ago

Is that good? Not good?

gorbypark 1203 days ago

A token is approximately 4 characters. So, four characters per 8 minutes is pretty slow. This comment would take about 300 minutes to generate, if I was an AI.

Tepix 1203 days ago

Usually you want tokens per second, not seconds per token. So it's a bad sign.
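
To put the two figures in this thread on the same scale, here is a rough sketch of the arithmetic. It assumes the 4-characters-per-token rule of thumb mentioned above, which is only an approximation; the function names are made up for illustration.

```python
# Rough throughput arithmetic for the numbers quoted in the thread.
# CHARS_PER_TOKEN is the rule of thumb from the discussion, not an exact value.
CHARS_PER_TOKEN = 4


def tokens_per_second(seconds_per_token: float) -> float:
    """Convert a latency (seconds per token) into a throughput (tokens per second)."""
    return 1.0 / seconds_per_token


def generation_minutes(num_chars: int, seconds_per_token: float,
                       chars_per_token: float = CHARS_PER_TOKEN) -> float:
    """Estimate how many minutes it takes to generate text of num_chars characters."""
    tokens = num_chars / chars_per_token
    return tokens * seconds_per_token / 60.0


slow = tokens_per_second(8 * 60)   # one token every 8 minutes
fast = 0.35                        # tokens/s reported for the 7B model on an i5
speedup = fast / slow

print(f"slow run: {slow:.5f} tokens/s")
print(f"fast run is about {speedup:.0f}x faster")
print(f"100 characters at the slow rate: {generation_minutes(100, 8 * 60):.0f} minutes")
```

Under these assumptions, one token every 8 minutes works out to roughly 0.002 tokens/s, so the 0.35 tokens/s figure is a couple of orders of magnitude faster.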

swyx 1203 days ago

Another commenter posted a fork that does it: https://news.ycombinator.com/item?id=35067469

Per the readme, it looks like there are a few bugs to figure out, in case anyone here is a PyTorch expert.
