
by verytrivial 618 days ago

You gain potential parallelism with an FPGA, so with very small "at the edge" models they could speed things up, right? But the models are always going to be large, so memory bandwidth is going to be a bottleneck unless some very fancy FPGA memory "fabric" is possible. Perhaps for extremely low latency classification tasks? I'm having trouble picturing that application though.

The code itself is surprisingly small/tight. I've been playing with llama.cpp for the last few days. The CPU-only archive is like 8 MB on gitlab, and there is no memory allocation at run time. My ancient laptop (as in 2014!) is sweating but producing spookily good output with quantized 7B models.
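To put a rough number on the memory-bandwidth point, here is a back-of-the-envelope sketch. It assumes decoding is purely bandwidth-bound, meaning every weight is read from memory once per generated token, and it ignores caches, activations, and compute. The function name and all figures (7B parameters, 4 bits per weight, 25.6 GB/s as a stand-in for dual-channel DDR3-1600) are illustrative assumptions, not measurements from anyone's setup.

```python
def max_tokens_per_second(n_params, bits_per_weight, bandwidth_bytes_per_s):
    """Upper bound on decode rate if each weight is read once per token."""
    model_bytes = n_params * bits_per_weight / 8  # total weight storage in bytes
    return bandwidth_bytes_per_s / model_bytes


# Assumed figures, for illustration only.
N_PARAMS = 7e9        # "7B" model
BANDWIDTH = 25.6e9    # bytes/s, hypothetical dual-channel DDR3-1600 laptop

for label, bits in [("4-bit quantized", 4), ("ternary (~1.58-bit)", 1.58)]:
    rate = max_tokens_per_second(N_PARAMS, bits, BANDWIDTH)
    print(f"{label}: at most {rate:.1f} tokens/s")
```

Under these assumptions the ceiling scales inversely with bits per weight, so shrinking the weights helps about as much as widening the memory interface.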

(I'm mainly commenting to have someone correct me, by the way, since I'm interested in this question too!)

2 comments

UncleOxidant 617 days ago

Lower latency, but also much lower power. This sort of thing would be of great interest to companies running AI datacenters (which is why Microsoft is doing this research, I'd think). Low latency is also quite useful for real-time tasks.

> The code itself is surprisingly small/tight. I've been playing with llama.cpp for the last few days.

Is there a BitNet model that runs on llama.cpp? (Looks like it: https://www.reddit.com/r/LocalLLaMA/comments/1dmt4v7/llamacp...) Which BitNet model did you use?


dailykoder 617 days ago

> Perhaps for extremely low latency classification tasks? I'm having trouble picturing that application though.

Possibly, yes. I have no concrete plans yet. Maybe language models are the wrong area though. Something more general, like image classification or object detection, would be neat (say, lane detection with a camera or something like that).


tgv 617 days ago

Real-time translation or speech transcription for the hearing-impaired onto AR-glasses? Now you've got a good reason to make it look like a Star Trek device.

Or glasses that can detect threats/opportunities in the environment and call them out via ear plugs, for the vision-impaired.
