by ramesh31 1652 days ago
Image similarity is a terrible way to do this: you lose all temporal data, which makes different speeds of motion impossible to discern. You're much better off feeding your XYZ + time series data into a hidden Markov model, then classifying gestures with SVM prediction. There's a really great C++ gesture recognition library that I've used to do just that for a VR game that featured custom spellcasting gestures [0].

[0] https://github.com/nickgillian/grt
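
To make the point about lost temporal data concrete, here is a minimal Python sketch. It is not the library's API and not a hidden Markov model or SVM. The helper names `rasterize` and `speeds`, the grid cell size, and the 2D `(x, y, t)` sample format are all assumptions made for illustration. It only shows that two strokes along the same path at different speeds collapse to the same image, while the timestamps still tell them apart.

```python
def rasterize(points, cell=1.0):
    """Collapse a trajectory into the set of grid cells it touches (a binary 'image')."""
    return {(int(x // cell), int(y // cell)) for x, y, _t in points}


def speeds(points):
    """Per-segment speed; this needs the timestamps that the image throws away."""
    result = []
    for (x0, y0, t0), (x1, y1, t1) in zip(points, points[1:]):
        dist = ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5
        result.append(dist / (t1 - t0))
    return result


# Hypothetical data: the same straight-line stroke sampled at the same
# positions, but the second one is performed four times faster.
slow = [(float(i), 0.0, i * 0.4) for i in range(5)]
fast = [(float(i), 0.0, i * 0.1) for i in range(5)]

same_image = rasterize(slow) == rasterize(fast)
mean_slow = sum(speeds(slow)) / len(speeds(slow))
mean_fast = sum(speeds(fast)) / len(speeds(fast))

print("identical images:", same_image)
print("mean speed slow/fast:", mean_slow, mean_fast)
```

Any classifier that only sees the rasterized cells gets the same input for both strokes, which is why a sequence model over the raw samples is the better fit here.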

1 comments

Hi, I totally agree, but I only had 1-2 hours left to implement the similarity check, so I took a simple image similarity approach. (Also, I only have images of the target spells for now; I still need to create target data that includes temporal data.)

Your idea for gesture recognition is interesting; I will look into it in more detail.

PS: I already convert 3D to 2D with filtering, so I can compare two sets of 2D+time data:

What I originally had in mind was to modify and use a similarity metric such as EMD (Earth Mover's Distance), which I believe would be a great fit for our case.
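
As a rough sketch of the EMD idea (not the modified metric itself, which is not worked out here), the Python below computes EMD for the simplest case: two 1D histograms over the same unit-width bins, where it reduces to the sum of absolute differences between the cumulative distributions. The function name `emd_1d` and the "time spent per zone" feature are hypothetical, chosen only for illustration.

```python
def emd_1d(p, q):
    """Earth Mover's Distance between two histograms over the same unit-width bins."""
    if len(p) != len(q):
        raise ValueError("histograms must have the same number of bins")
    total_p, total_q = sum(p), sum(q)
    if total_p <= 0 or total_q <= 0:
        raise ValueError("histograms must have positive total mass")
    distance = 0.0
    carried = 0.0  # normalized mass that still has to move past the current bin
    for a, b in zip(p, q):
        carried += a / total_p - b / total_q
        distance += abs(carried)
    return distance


# Hypothetical feature: how much of a gesture's duration is spent in each of
# four horizontal zones of the 2D projection.
target = [4, 3, 2, 1]
attempt_close = [4, 2, 3, 1]
attempt_far = [1, 2, 3, 4]

print("close attempt:", emd_1d(target, attempt_close))
print("far attempt:", emd_1d(target, attempt_far))
```

Unlike a bin-by-bin comparison, this penalizes mass by how far it has to move, so an attempt that is slightly shifted scores closer to the target than one that is reversed. Extending it to full 2D+time data would need a ground distance over both space and time, which is the part that would have to be designed.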