by jackbach
1226 days ago
Hello! I am the creator of this experiment. Glad to see the conversation about hand tracking in the browser over here. This demo is part of a series of creative experiments on using real-time hand tracking in the browser for creative interactions. I will be posting more experiments soon. Tech background: I am using MediaPipe to control the hand rig in threejs. MediaPipe provides landmarks that are used to control a threejs Skeleton (a hierarchy of bones with rotations). Feel free to ask, I will answer any questions!
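For readers wondering how landmarks might become bone rotations, here is a minimal sketch. It is not the demo's actual code. It assumes each bone has a known rest direction, and that the rotation for a bone is the shortest arc taking that rest direction onto the direction between two tracked landmarks (for example the wrist, index 0, and the index-finger MCP joint, index 5, in MediaPipe's 21-point hand model). `Vec3`, `Quat`, `boneRotation` and `rotate` are hypothetical helpers written out by hand so the snippet runs without three.js.

```typescript
// Minimal vector/quaternion helpers so the sketch runs without three.js.
type Vec3 = { x: number; y: number; z: number };
type Quat = { x: number; y: number; z: number; w: number };

const sub = (a: Vec3, b: Vec3): Vec3 => ({ x: a.x - b.x, y: a.y - b.y, z: a.z - b.z });
const dot = (a: Vec3, b: Vec3): number => a.x * b.x + a.y * b.y + a.z * b.z;
const cross = (a: Vec3, b: Vec3): Vec3 => ({
  x: a.y * b.z - a.z * b.y,
  y: a.z * b.x - a.x * b.z,
  z: a.x * b.y - a.y * b.x,
});
const normalize = (v: Vec3): Vec3 => {
  const len = Math.hypot(v.x, v.y, v.z);
  if (len === 0) throw new Error("zero-length vector");
  return { x: v.x / len, y: v.y / len, z: v.z / len };
};

// Shortest-arc quaternion that rotates unit vector `from` onto unit vector `to`.
function quatFromUnitVectors(from: Vec3, to: Vec3): Quat {
  const d = dot(from, to);
  if (d < -0.999999) {
    // Vectors are opposite: rotate 180 degrees around any axis perpendicular to `from`.
    const helper: Vec3 = Math.abs(from.x) > 0.9 ? { x: 0, y: 1, z: 0 } : { x: 1, y: 0, z: 0 };
    const axis = normalize(cross(from, helper));
    return { x: axis.x, y: axis.y, z: axis.z, w: 0 };
  }
  const c = cross(from, to);
  const len = Math.hypot(c.x, c.y, c.z, 1 + d);
  return { x: c.x / len, y: c.y / len, z: c.z / len, w: (1 + d) / len };
}

// Assumed approach: aim a bone's rest direction along the segment jointA -> jointB.
function boneRotation(restDir: Vec3, jointA: Vec3, jointB: Vec3): Quat {
  return quatFromUnitVectors(normalize(restDir), normalize(sub(jointB, jointA)));
}

// Apply a unit quaternion to a vector (used here only to check the result).
function rotate(q: Quat, v: Vec3): Vec3 {
  const u: Vec3 = { x: q.x, y: q.y, z: q.z };
  const uv = cross(u, v);
  const uuv = cross(u, uv);
  return {
    x: v.x + 2 * (q.w * uv.x + uuv.x),
    y: v.y + 2 * (q.w * uv.y + uuv.y),
    z: v.z + 2 * (q.w * uv.z + uuv.z),
  };
}

// Made-up landmark positions: wrist and index MCP, bone resting along +Y.
const wrist: Vec3 = { x: 0.5, y: 0.5, z: 0 };
const indexMcp: Vec3 = { x: 0.7, y: 0.5, z: 0 };
const q = boneRotation({ x: 0, y: 1, z: 0 }, wrist, indexMcp);
console.log(q, rotate(q, { x: 0, y: 1, z: 0 }));
```

In three.js the result could be copied onto a bone with `bone.quaternion.set(q.x, q.y, q.z, q.w)`, and `Quaternion.setFromUnitVectors` does the shortest-arc step for you. Bone rotations are local to the parent bone, so a real rig would also need to convert each direction into the parent's space and handle twist around the bone axis, which this sketch ignores.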
|
Fwiw, some things I've found fun: Clip-on fish-eye lens, intended for phone but fitting on laptop, for expanding webcam field of view. Additional cameras: on sticks above screen tips for high-res stereo positioning over kbd; asymmetric high-off-to-side to trade some resolution for some field of view (meh); high-overhead for whole-workspace tracking. Binocular periscope with webcam splitter and screen-tip mirrors (blech - low-res awkward fiddly). Look-down mirror on webcam, partial or full, to get kbd view (nice in VR). Look-down with curved mirror along top of keyboard to get "out along kbd surface view" and crufty touch detection for kbd-as-touch-surface (cute but fiddly - only makes sense to save a camera or two; caveat I had high-contrast white hands on black thinkpad kbd). Putting tracking markers on fingers (flats, a-frames, or cubes on velcro rings) makes for less jittery tracking, but is awkward (meh). Markers taped around keyboard help with calibration.
Magic wand. I found I could more-or-less manage to type while holding a chopstick. So I stuck a marker cube on one end, and an arc sliced off a small Xmas ball on the tip, so it slides smoothly across (thinkpad) keys. Barber-pole rotation marker. Anvil'ed tip pressure sensor, a finger microswitch, and very thin and soft ribbon cable to arduino. But I didn't actually get the pressure sensor working before I punted on all this. Chopstick was narrow enough to avoid breaking hand tracking.
Some gotchas: 2K camera resolution was painful for tracking. (Several years ago) mediapipe finger tracking was annoyingly noisy for doing stereo. You only get one usb2 camera per usb port, even if it's usb3 (maybe usb3 cameras allow working around that limit nowadays?). If you do hand, arm, face and marker tracking on several cameras, even with native gpu mediapipe, you're burning a lot of gpu just on the human interface device, before your likely-graphical-itself app even starts. If I had it to do over now, I'd punt mirrors, use 4K usb3 cameras, and at least with desktop, more cameras. Nicely merging high-latency camera tracking with lower-latency keyboard, touchpad, and graphics tablets requires changes to the input event pipeline, and adapting apps to deal with "oh my! That space key pressed several keys ago - it was pressed with a pointer finger at position 3!, so that means we roll back app state and then ...".
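To make the late-arriving-tracking problem concrete, here is a rough sketch of one possible shape for it, not something I actually built. Key events are buffered as they arrive. When a (delayed) camera frame shows up, any buffered key event close enough in time to the frame's capture time gets a finger attached and is handed back, so the app can roll back and replay. `KeyEvent`, `TrackingFrame`, `LateBindingInput` and the `fingerOnKey` map are all made-up names, and the sketch assumes frames arrive in capture order.

```typescript
// Hypothetical event shapes for illustration only.
type KeyEvent = { key: string; t: number; finger?: string };
// A tracking result: which finger was over which key at capture time.
type TrackingFrame = { captureTime: number; fingerOnKey: Record<string, string> };

class LateBindingInput {
  private pending: KeyEvent[] = [];
  private windowMs: number;

  constructor(windowMs: number) {
    this.windowMs = windowMs;
  }

  // Key events arrive with low latency and are buffered until tracking catches up.
  onKey(ev: KeyEvent): void {
    this.pending.push(ev);
  }

  // A tracking frame arrives late. Returns the key events it can now annotate,
  // so the app can revisit state it built from the un-annotated events.
  onFrame(frame: TrackingFrame): KeyEvent[] {
    const amended: KeyEvent[] = [];
    const stillPending: KeyEvent[] = [];
    for (const ev of this.pending) {
      const finger = frame.fingerOnKey[ev.key];
      const closeInTime = Math.abs(ev.t - frame.captureTime) <= this.windowMs;
      if (closeInTime && finger !== undefined) {
        amended.push({ ...ev, finger });
      } else if (ev.t >= frame.captureTime - this.windowMs) {
        stillPending.push(ev); // a later frame may still cover this event
      }
      // Otherwise the event is too old to ever match and is dropped unannotated.
    }
    this.pending = stillPending;
    return amended;
  }

  get pendingCount(): number {
    return this.pending.length;
  }
}

// Usage with made-up timestamps (milliseconds).
const input = new LateBindingInput(20);
input.onKey({ key: "space", t: 100 });
input.onKey({ key: "a", t: 130 });
const firstAmended = input.onFrame({ captureTime: 102, fingerOnKey: { space: "index" } });
console.log(firstAmended, input.pendingCount);
```

The hard part is what the app does with the amended events. This only shows the bookkeeping that would let it know a rollback is needed.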
Here we are a half-century later, still banging on glorified xerox altos. We're so broken.