by msdz 3 hours ago
> although quality seems very bad

The weights they “etched” into the FPGA card that’s used for the ChatJimmy demo are those of a Llama 3-something 8B model.

The actually impressive and novel thing is that Taalas has managed to automate that process (clearly, since nobody transforms 8 billion numbers into a physical representation by hand).

So now they can work on scaling this process up. With low enough lead times (I’ll be convinced they have inside connections to TSMC if they can actually deliver on the promised turnaround of a mere 3-4 months), they will be able to offer 30-100B+ parameter models less than half a year after they’re released, at thousands of tokens per second, while probably using less energy per token (not sure about overall power draw).

Exciting times ahead, folks.