Hacker News
by phh 1125 days ago
Well, I already said it :P RK3588's NPU can only do convolutions and a small list of activation functions. Realistically it's usable only on image-like (2D, 3D) data. The software side is pretty weird: it's amazing how many models they support despite the limited fixed-function hardware, but there has been literally 0 development in the last 6 months, and it has a lot of bugs that don't seem hard to fix. Even when it comes to images, it doesn't support changing the resolution of the input (you need to "recompile the model" for that), which is super weird since the hardware pipeline doesn't care much about the size.

Anyway, I really don't recommend it unless you're making your own model, you know beforehand what's supported and what isn't, and your input is fixed resolution, which is a pretty fair use case in an embedded system. (Fixed = doesn't change at every frame. Handling hotplug from one webcam with one resolution to another with a different resolution is fine.)
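To make the fixed-resolution point concrete, here's a minimal sketch in Python of how you'd typically cope: compute the scale and padding once per camera, so whatever the webcam delivers gets letterboxed into the one size the model was compiled for. The function name and the 640x640 target are my own assumptions for illustration, nothing here comes from Rockchip's toolkit.

```python
def letterbox_geometry(src_w, src_h, dst_w, dst_h):
    """Return (new_w, new_h, pad_left, pad_top) for fitting a source
    frame into a fixed destination size while keeping aspect ratio."""
    if min(src_w, src_h, dst_w, dst_h) <= 0:
        raise ValueError("dimensions must be positive")
    # Use the smaller ratio so the scaled frame fits in both axes.
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w = min(dst_w, max(1, round(src_w * scale)))
    new_h = min(dst_h, max(1, round(src_h * scale)))
    # Center the scaled frame; the remainder is padding.
    pad_left = (dst_w - new_w) // 2
    pad_top = (dst_h - new_h) // 2
    return new_w, new_h, pad_left, pad_top


# Hypothetical fixed model input size (an assumption, not an RKNN default).
MODEL_W, MODEL_H = 640, 640

# Recompute only when the camera changes (e.g. on hotplug), not per frame.
geometry = letterbox_geometry(1920, 1080, MODEL_W, MODEL_H)
```

The geometry only needs recomputing on hotplug, which is why a camera swap is fine while a per-frame resolution change is not.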

I think looking at the examples gives you a reasonable idea of what it can do: https://github.com/rockchip-linux/rknn-toolkit/tree/master/e... It's mobilenet, yolov3, resnet50. There aren't more examples because they didn't have more, and they didn't have more because that's pretty much all you can reasonably run.

As far as I can tell, modern image models using transformers/ViT won't be runnable on it. (It acts enough like a coprocessor that it's possible to run some parts on the CPU and some parts on the NPU, and Rockchip's framework handles that, so maybe it's somehow possible.)
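A toy sketch of what that CPU/NPU split looks like, in Python. To be clear, this is not how Rockchip's framework does it: the op names and the supported set are made up for illustration, and a real model is a graph rather than a flat list.

```python
from itertools import groupby

# Assumption: an illustrative set of NPU-friendly ops,
# not Rockchip's actual operator support list.
NPU_OPS = {"conv2d", "relu", "sigmoid", "maxpool"}


def partition(ops, npu_ops=NPU_OPS):
    """Split a linear op sequence into contiguous (device, [ops]) segments."""
    def device_for(op):
        return "npu" if op in npu_ops else "cpu"

    # groupby merges consecutive ops that land on the same device.
    return [(device, list(group)) for device, group in groupby(ops, key=device_for)]


segments = partition(["conv2d", "relu", "softmax", "matmul", "conv2d"])
```

In this toy model every boundary between segments is a handoff between CPU and NPU, so a network with attention blocks throughout would end up as many short segments instead of one long NPU run.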

(Note: I say this as a huge Rockchip lover. Their mainline support is top-notch, they make very durable products (their 2015 RK3288 is still far from obsolete), and I bought an RK3588 SBC to play with an NPU accelerator (whose full specification is publicly available, btw), in the hope of having a self-hosted LLM voice assistant.)