
The 1.0 Kinect contains several processors:

- A "DSP"-based camera system that just does transfers from the sensor system to USB, plus some other monitoring and housekeeping. It does no vision processing; it's just a pass-through of video from the chips to the host.

- An ARM-based audio system that handles mic data, runs an echo cancellation algorithm against host-provided speaker data, and provides both the raw and the echo-canceled mic data to the host.
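
As a rough illustration of what "echo cancellation against host-provided speaker data" means, here's a minimal sketch using a normalized LMS (NLMS) adaptive filter. NLMS is a common textbook approach and is purely an assumption here; nothing above says which algorithm the Kinect's ARM firmware actually runs. The function name, tap count, and step size are all made up for the example.

```python
import random


def nlms_echo_cancel(mic, speaker, taps=16, mu=0.5, eps=1e-8):
    """Subtract an adaptive estimate of the speaker echo from the mic signal.

    Hypothetical helper for illustration only, not the Kinect's algorithm.
    mic and speaker are equal-length sequences of float samples.
    """
    weights = [0.0] * taps   # current estimate of the speaker-to-mic echo path
    history = [0.0] * taps   # most recent speaker samples, newest first
    cleaned = []
    for m, s in zip(mic, speaker):
        history = [s] + history[:-1]
        echo_estimate = sum(w * x for w, x in zip(weights, history))
        error = m - echo_estimate            # what remains after removing the echo
        energy = sum(x * x for x in history) + eps
        step = mu * error / energy           # normalized step size
        weights = [w + step * x for w, x in zip(weights, history)]
        cleaned.append(error)
    return cleaned


if __name__ == "__main__":
    # Synthetic test signal: the mic hears only a delayed, attenuated speaker.
    random.seed(1)
    speaker = [random.uniform(-1.0, 1.0) for _ in range(2000)]
    mic = [0.5 * (speaker[i - 2] if i >= 2 else 0.0) for i in range(2000)]
    out = nlms_echo_cancel(mic, speaker)
    before = sum(x * x for x in mic[-200:])
    after = sum(x * x for x in out[-200:])
    print("echo energy before:", before, "after:", after)
```

The returned samples are the residual after the estimated echo is subtracted. With someone talking in the room, that residual is mostly the talker's voice, which is roughly the "echo-canceled" stream as opposed to the raw one.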

Early Kinect versions used yet another processor for managing the tilt motor and accelerometer. This stuff was moved to the ARM later (that a tilt-motor processor existed in year 1 units is a fine example of team structure affecting product structure).

All of these CPUs have their own USB interfaces, and there's an internal USB hub so there's only one wire going to the host. :-)

None of these chips use an RTOS; that would just get in the way.

The vision processing is done on the host, where you have heavy lifting capability with GPUs and tons of memory and so on. Doing that processing on the camera would be quite expensive in terms of hardware and power, and wouldn't be able to adapt as well to new algorithms.