Show HN: I ran a language model on a PS2
(github.com)
46 points
by xaskasdf
90 days ago
The Emotion Engine has only 32 MB of main RAM to work with, so the trick is streaming weights from CD-ROM one matrix at a time during the forward pass; only the activations, KV cache and embeddings stay resident in RAM. This means models bigger than RAM can still run, they just read more from disc. I had to build a custom quantized format (PSNT), hack endianness, write a tokenizer pipeline, and write most of the PS2 SDK from scratch (releasing that separately). The model itself is also custom: a 10M param Llama-style architecture I trained specifically for this. And it works. On real hardware.
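To make the streaming idea concrete, here is a rough sketch in Python rather than PS2 code. It is not the PSNT format and not the project's actual loader: the record layout (a little-endian `rows, cols, scale` header followed by int8 weights), the per-matrix scale, and the names `write_matrix`, `stream_matvec` and `forward` are all made up for illustration. The point is only that each matrix is read from the stream, used for one matrix-vector product, and dropped before the next one is read, so only the activation vector stays in memory.

```python
import io
import struct

# Hypothetical record layout (NOT the real PSNT format):
#   uint32 rows, uint32 cols, float32 scale, then rows*cols int8 weights.
HEADER = "<IIf"
HEADER_SIZE = struct.calcsize(HEADER)


def write_matrix(f, rows):
    """Quantize a float matrix to int8 with one scale and append it to f."""
    n_rows, n_cols = len(rows), len(rows[0])
    max_abs = max(abs(v) for r in rows for v in r) or 1.0
    scale = max_abs / 127.0
    f.write(struct.pack(HEADER, n_rows, n_cols, scale))
    for r in rows:
        f.write(struct.pack(f"<{n_cols}b", *(round(v / scale) for v in r)))


def stream_matvec(f, x):
    """Read one matrix from f, one row at a time, and return W @ x."""
    n_rows, n_cols, scale = struct.unpack(HEADER, f.read(HEADER_SIZE))
    if n_cols != len(x):
        raise ValueError("matrix width does not match activation length")
    out = []
    for _ in range(n_rows):
        # Only a single quantized row is held in memory at any moment.
        q = struct.unpack(f"<{n_cols}b", f.read(n_cols))
        out.append(scale * sum(qi * xi for qi, xi in zip(q, x)))
    return out


def forward(f, x, n_layers):
    """Apply n_layers streamed matrices in sequence; only x stays resident."""
    for _ in range(n_layers):
        x = stream_matvec(f, x)
    return x


# Usage: an in-memory buffer stands in for the disc.
disc = io.BytesIO()
write_matrix(disc, [[1.0, -0.5], [0.25, 0.75]])
write_matrix(disc, [[0.5, 0.5]])
disc.seek(0)
print(forward(disc, [1.0, 2.0], n_layers=2))
```

A real transformer forward pass would also have attention, norms and nonlinearities between the matrix products; this sketch leaves all of that out and only shows the read-use-discard pattern.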
I doubt the VUs can help with inference given their small scratchpad sizes and instruction set though, haha.