by kjaer
2672 days ago
For finger tracking, version 1 used random forests [1] because of the trade-off between performance and hardware budget: they're harder to train than a typical deep learning model, but much cheaper to evaluate on the device (branching being basically free on a CPU). Version 2 uses a deep learning accelerator [2], which makes the heavier computation of DNNs feasible (they involve a lot of floating-point operations, which would be much more expensive on the CPU). From an engineering perspective, I just love seeing how this touches every abstraction layer of the stack, and the kinds of solutions that come out of thinking about the silicon and the high-level ML models at the same time.

[1] https://www.microsoft.com/en-us/research/wp-content/uploads/...
[2] https://homes.cs.washington.edu/~kklebeck/lebeck-tech18.pdf
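To make the "cheap to evaluate" point concrete, here is a minimal sketch of random forest inference. It is not the code from [1]; the flat `Node` layout, the integer thresholds, and the majority vote are assumptions chosen for illustration. The thing to notice is that each tree costs one comparison and one branch per level, with no multiplications at all.

```python
from dataclasses import dataclass


@dataclass
class Node:
    # Hypothetical flat node layout, assumed for this sketch.
    feature: int = -1   # index into the feature vector; -1 marks a leaf
    threshold: int = 0  # split threshold (integer features assumed)
    left: int = -1      # index of the child taken when x[feature] <= threshold
    right: int = -1     # index of the child taken otherwise
    value: int = 0      # class label stored at a leaf


def predict_tree(nodes, x):
    """Walk one tree from the root (index 0) down to a leaf."""
    i = 0
    while nodes[i].feature >= 0:
        n = nodes[i]
        # One comparison and one branch per level; no arithmetic on x.
        i = n.left if x[n.feature] <= n.threshold else n.right
    return nodes[i].value


def predict_forest(trees, x):
    """Majority vote over the leaf labels of all trees."""
    votes = {}
    for tree in trees:
        label = predict_tree(tree, x)
        votes[label] = votes.get(label, 0) + 1
    return max(votes, key=votes.get)


# Three tiny hand-built trees, purely for demonstration.
tree_a = [Node(feature=0, threshold=5, left=1, right=2), Node(value=0), Node(value=1)]
tree_b = [Node(feature=1, threshold=10, left=1, right=2), Node(value=0), Node(value=1)]
tree_c = [Node(value=1)]
forest = [tree_a, tree_b, tree_c]
```

With these toy trees, `predict_forest(forest, [7, 3])` gets two votes for label 1 and one for label 0. A real forest would be learned from data and would typically store its nodes in packed arrays rather than Python objects.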
In non-DNN image processing it's also quite common to use integer arithmetic (iDCT, FFT, etc.) for the potential performance gains over floating point.
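As a rough illustration of what "using ints" means in practice, here is a minimal fixed-point sketch. The Q15 format (15 fractional bits) and the helper names are assumptions for the example, not something taken from a particular codec or library. Fractions are stored as scaled integers, so a multiply becomes an integer multiply followed by a shift.

```python
Q = 15          # number of fractional bits (Q15 format, assumed for this sketch)
ONE = 1 << Q    # fixed-point representation of 1.0


def to_fixed(x: float) -> int:
    """Convert a float to its nearest Q15 integer representation."""
    return int(round(x * ONE))


def to_float(a: int) -> float:
    """Convert a Q15 integer back to a float, for inspection only."""
    return a / ONE


def fixed_mul(a: int, b: int) -> int:
    """Multiply two Q15 values using integer operations only.

    The raw product has 2*Q fractional bits, so add half an LSB for
    rounding and shift right by Q to get back to Q15.
    """
    return (a * b + (1 << (Q - 1))) >> Q


half = to_fixed(0.5)
quarter = fixed_mul(half, half)
```

Here `to_float(quarter)` gives 0.25. Python integers never overflow, so this sketch hides the part that matters on real hardware: picking word sizes and scaling so that intermediate products fit.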