>> I have no problem imagining a security camera application needing to monitor quite a few video channels.
As a joke I sometimes tell people the automatic flushing toilets in public bathrooms work by having a little camera monitored by someone in a 3rd world country who remotely flushes as needed, while monitoring a whole lot of video feeds. They usually don't buy it, but will often acknowledge that our world is uncomfortably close to having stuff like that become reality.
On the inference accelerator? IIUC, the RAM is just to hold the model and whatever state it needs during a particular inference operation. I'm not an expert on ML but AFAIK 16 GiB is plenty. I suppose it'd also need to hold onto reference frames for the video decoding, but at 1080p with e.g. YUV420 (12 bits per pixel), you can hold a lot of those in 16 GiB. edit: e.g., 4 references for each of the 96 streams would take ~1 GiB.
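For anyone who wants to check that ~1 GiB figure, here's a rough back-of-the-envelope sketch. The function names are just made up for illustration, and it assumes tightly packed 1920x1080 frames at 12 bits per pixel; a real decoder may pad frames for alignment or keep more references, so treat it as a ballpark rather than an exact number.

```python
def frame_bytes(width, height, bits_per_pixel=12):
    """Size of one uncompressed frame in bytes (YUV420 is 12 bits per pixel)."""
    return width * height * bits_per_pixel // 8


def reference_memory_bytes(streams, refs_per_stream,
                           width=1920, height=1080, bits_per_pixel=12):
    """Total bytes needed to hold every stream's reference frames."""
    return streams * refs_per_stream * frame_bytes(width, height, bits_per_pixel)


# Numbers from the comment: 96 streams, 4 reference frames each, 1080p.
total = reference_memory_bytes(streams=96, refs_per_stream=4)
print(f"{frame_bytes(1920, 1080)} bytes per frame")  # 3110400 bytes per frame
print(f"{total / 2**30:.2f} GiB total")              # 1.11 GiB total
```

So the reference frames come to a bit over 1 GiB, which leaves most of the 16 GiB for the model and its working state.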
Even on the host, 16 GiB is fine for, say, an NVR. They don't need to keep a lot of state in RAM (or, for that matter, to do a lot of on-CPU computation). I can run an 8-stream NVR on a Raspberry Pi 2 without on-NVR analytics. That's about its limit because the network and disk are on the same USB2 bus, but there's spare CPU and RAM.