HN Mirror



	by moonu 64 days ago
	Idk if you've seen this already, but Taalas does this interesting thing where they embed the model directly onto the chip. This leads to super-fast speeds (https://chatjimmy.ai), but the model they're using is an old, small Llama model, so the quality is pretty bad. They say it can scale, though, so if that's really true, that'd be pretty insane and would unlock the inference you're talking about.

2 comments

lachlan_gray 64 days ago

Robotics/control systems is exactly what came to mind when I saw this release! What struck me is the possibility of look-ahead search in real time, a bit like AlphaZero's MCTS.

pstuart 64 days ago

It's a fascinating proposition, and no doubt they'll get bigger models in there and likely be able to cluster multiple models for a mega MoE. One thing that would really be great is if they could take the power requirements down -- the chip requires 2.5 kW, which is modest in terms of what the big boys use but would be an issue on a battery-powered robot.

fennecfoxy 63 days ago

"The chip"? No, a whole rack/deployment they offer takes 2.5 kW, not just one chip. Squeezing 2.5 kW thru 1 chip would be mental.

pstuart 63 days ago

My bad, thanks for the correction. Skimming FTL.
