by rowanG077 1357 days ago
I doubt that's a weekend project. You need to analyze a video stream of hundreds of fps in real time with very low latency requirements. You'd probably need an FPGA to do that, plus custom training an AI to only trigger on heads.
4 comments

No, it's trivial these days, and this very task is one of the introductory tutorials for Google's Coral AI TensorFlow boards:

https://coral.ai/models/object-detection/

Which you can buy as a USB stick for $59:

https://coral.ai/products/accelerator/

Sure, it's a little more complex if you want to train on specific heads as opposed to a generic model, but not hugely so. The biggest challenge would likely be downsampling the video stream into something the model can process quickly enough.

When you say it’s a weekend project for a college student it suggests it’s mundane or easy for most professional developers within a weekend.

This sounds exactly like the old “Twitter would be easy to implement, give me a month” naivety.

Development is hard, even small projects aren’t usually finished in a weekend, let alone something there’s no reference implementation for and that we’re merely speculating about.

If such a project were built in a weekend and worked reliably enough that others could use it successfully in gameplay, it would be highly impressive. Maybe we’d call it an extraordinary effort, if for no other reason than “because software”.

That's a generic AI. I would be very surprised if you could get it to work accurately for games, and specifically for the heads of (possibly non-human) models. The box you see in the marketing material is completely useless for this use case. Besides, how do you get the data to your second PC and scale the input to what the USB thing expects? You send it over USB and back, and then you have to send it to a custom mouse that receives the click command over Bluetooth? I don't think 7ms just for the inference step is going to cut it.
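To make the "chain of latencies" point concrete, here is a rough sketch that just adds up per-stage delays. Only the 7ms inference figure comes from this thread; every other number and every stage name below is a made-up placeholder, so treat the total as an illustration of the arithmetic, not a measurement.

```python
# Hypothetical per-stage latencies in milliseconds.
# Only the 7 ms inference figure is quoted in the thread; the rest are guesses.
STAGES_MS = {
    "capture": 8.0,           # assumed: grabbing a frame on the second PC
    "resize": 1.0,            # assumed: scaling the frame to the model input
    "usb_transfer": 2.0,      # assumed: round trip to the USB accelerator
    "inference": 7.0,         # figure quoted in the thread
    "bluetooth_click": 10.0,  # assumed: wireless hop to the mouse
}


def total_latency_ms(stages):
    """Sum the latency of every stage in the chain."""
    return sum(stages.values())


def share_of_total(stages, name):
    """Fraction of the end-to-end latency spent in one stage."""
    return stages[name] / total_latency_ms(stages)


if __name__ == "__main__":
    print(f"end-to-end: {total_latency_ms(STAGES_MS):.1f} ms")
    print(f"inference share: {share_of_total(STAGES_MS, 'inference'):.0%}")
```

With these placeholder numbers the inference step is only a quarter of the total, which is the shape of the objection: the model is one link, and the other links have to be measured too.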

A weekend project where you have to design an ultra low-latency software + hardware stack, build a custom mouse, and train a custom AI. It would be incredible if anyone managed this in a weekend.

If you want absolutely Olympic response times (110-120ms), it is a little tight but nowhere near impossible. 7ms, as the other poster says, is absolutely insane. Most videogames will already trigger on consistent peak human response times, so you'll want to tune your bot down to a more average person (160ms, maybe a bit higher) to fly under all the heuristics they might (but probably don't) use. Plenty of time to compute things.
7ms is enough. But that is one in a chain of latencies. 110ms is way too long if you want your trigger to be effective. In 110ms the enemy has moved out of your crosshair. Humans deal with that with dynamic adjustment, but the AI will just click the mouse. I'd think you'd need to hit less than 25ms end to end, and even that might not be enough for far away targets.
I mentioned earlier - a reaction time under 100ms in track events is considered a false start and the offender is disqualified.
You are completely misunderstanding the goal of such a cheat. It's a "triggerbot" as in it fires the mouse when an enemy head enters the crosshair. The classical use case is that the user camps at an edge and waits for the enemy to peek. The user has his crosshair at the correct offset from the edge so that when the enemy peeks he could hit them with a human reaction time.

However, a human sees the enemy BEFORE it enters the crosshair, and estimates and corrects for when the enemy will enter the crosshair. A trigger bot measures exactly WHEN the head is in the crosshair; it has no predictive power over enemy dynamics. This totally changes the latency game. A 150ms trigger latency from the time a head is in the crosshair basically means you shoot when either the crosshair or the enemy has moved a significant amount. This also means that you can very obviously cheat with a trigger bot if you use the trigger badly: the human needs to "hide" the trigger latency to make it appear human. You can't compare it to human reaction time at all. A trigger needs to be about an order of magnitude faster to be usable.
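A back-of-the-envelope sketch of that argument: how far does a target drift during the trigger latency, compared with how wide it is on screen? The speed (500 px/s) and head width (20 px) below are invented for illustration; nobody in this thread gave real numbers.

```python
def displacement_px(speed_px_per_s, latency_ms):
    """Distance a target moves on screen during the given latency."""
    return speed_px_per_s * latency_ms / 1000.0


def still_on_target(speed_px_per_s, latency_ms, target_width_px):
    """True if the target has moved less than its own width since it
    first touched the crosshair, i.e. a delayed click would still land."""
    return displacement_px(speed_px_per_s, latency_ms) < target_width_px


if __name__ == "__main__":
    SPEED = 500.0       # assumed on-screen speed of a peeking target, px/s
    HEAD_WIDTH = 20.0   # assumed on-screen head width, px
    for latency in (7, 25, 110, 150):
        moved = displacement_px(SPEED, latency)
        ok = still_on_target(SPEED, latency, HEAD_WIDTH)
        print(f"{latency:>3} ms -> moved {moved:5.1f} px, still on target: {ok}")
```

Under those assumed numbers a 25ms delay leaves the head under the crosshair while 110ms or 150ms does not. Different speeds or smaller (farther) targets shift the threshold, which is why the exact cutoff is debatable.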

7ms isn't going to be the bar. Human reaction times are in the 150+ms range, so you have roughly 20x that budget.

I agree it's still tough though, most I/O paths on generic hardware will have buffers at every step that you'll have to fight to eliminate.

7ms is enough. But that is one in a chain of latencies. 150ms is way too long if you want your trigger to be effective. In 150ms the enemy has moved out of your crosshair. Humans deal with that with dynamic adjustment, but the AI will just click the mouse. I'd think you'd need to hit less than 25ms end to end, and even that might not be enough for far away targets.
25ms would be quite obvious, and absolutely wouldn't be necessary.

Humans regularly manage to hit heads without that reaction time, and a human will still be involved in the aiming / predicting where to aim.

The AI would need to lead the target under any reasonable implementation. Just clicking the mouse when crosshairs are over a target would scarcely deserve the name AI.
Leading the targets makes this MUCH harder than just using an off-the-shelf object detection AI. In fact, building that and making it look human so it's not trivially detectable would blow this up from a weekend project you can do in a few hours (which I already doubt) to a weeks/months/years long project, since you need it to be really humanlike... Even now, cutting edge AI can be spotted by humans.
I was thinking of a decent proof-of-concept. All you'd really need is a generic object detector and a fake mouse.

Latency _could_ make it a lot more difficult, but I don't expect beating a human would be hard.

You don't need to do hundreds of fps. Just grab the newest frame, process it, repeat. Missing frames at most increases effective latency or means your cheat isn't 100% effective if you miss a head. It's a sliding scale of improvements, not a deal breaker.
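The "grab the newest frame, process it, repeat" idea is the usual single-slot (latest-value-wins) buffer from real-time video processing in general. A minimal sketch, with `LatestFrame` being a name made up here rather than something from any library:

```python
import threading


class LatestFrame:
    """Single-slot buffer: the producer overwrites, the consumer always
    receives the newest unread item. Older unread items are dropped."""

    def __init__(self):
        self._cond = threading.Condition()
        self._frame = None
        self._seq = 0      # number of frames written so far
        self._taken = 0    # sequence number of the last frame handed out
        self.dropped = 0   # frames overwritten before anyone read them

    def put(self, frame):
        with self._cond:
            if self._seq > self._taken:
                self.dropped += 1  # previous frame was never consumed
            self._frame = frame
            self._seq += 1
            self._cond.notify()

    def get(self, timeout=None):
        """Block until an unread frame exists; return None on timeout."""
        with self._cond:
            if not self._cond.wait_for(lambda: self._seq > self._taken, timeout):
                return None
            self._taken = self._seq
            return self._frame


if __name__ == "__main__":
    slot = LatestFrame()
    for frame_id in (1, 2, 3):   # producer runs ahead of the consumer
        slot.put(frame_id)
    print(slot.get(), slot.dropped)
```

The trade-off matches what the comment says: a slow consumer never builds up a backlog, so latency stays bounded by one processing pass, at the cost of skipping frames.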

You also don't even necessarily have to process the whole frame. Just the bit actually _at_ the crosshairs is probably going to be enough for a crappy version.

And a fake mouse is just USB HID, USB gadget, or whatever the search terms are. It's not like you'd have to break any new ground there.

I consider it a "weekend project" because it's just throwing together a couple of existing libraries in a fairly standard way. Like most things, cleaning it up enough to be perfect could/would take far longer.

You don't need 'hundreds' of fps - you can do it w/ 10/sec and get better reaction time than humans (a reaction under 100ms in track events is considered a false start). It should not be so difficult with a separate GPU and a capture card. Serial mouse/keyboard inputs are trivial as well.
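For what it's worth, the frame rate sets a floor on detection delay that is easy to work out. The sketch below assumes "10sec" above means about ten frames per second, and borrows the 7ms processing figure from elsewhere in the thread; both are assumptions, not measurements.

```python
def frame_interval_ms(fps):
    """Time between consecutive sampled frames."""
    return 1000.0 / fps


def worst_case_delay_ms(fps, processing_ms):
    """Event happens just after a frame was sampled: wait a full
    interval for the next frame, then process it."""
    return frame_interval_ms(fps) + processing_ms


def average_delay_ms(fps, processing_ms):
    """On average the event lands halfway between two samples."""
    return frame_interval_ms(fps) / 2.0 + processing_ms


if __name__ == "__main__":
    PROCESSING_MS = 7.0  # assumed per-frame processing time
    for fps in (10, 60, 240):
        print(f"{fps:>3} fps: avg {average_delay_ms(fps, PROCESSING_MS):.1f} ms, "
              f"worst {worst_case_delay_ms(fps, PROCESSING_MS):.1f} ms")
```

At ten frames per second the average comes out under the 100ms figure but the worst case lands slightly over it, so whether that counts as "enough" depends on which of the two arguments in this thread you buy.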
How is serial mouse/keyboard trivial? You'd need to make a custom mouse that receives data from your second PC. You'd probably want it to be wireless, or else you will have 2 cables coming out of it. And no, just having a second fake mouse is not fair. That's easily detectable by software and would definitely raise a lot of red flags.
This is how: take an ESP32 - it has Bluetooth, WiFi and a serial port all integrated, and it costs a couple of dollars. It's small and it runs on 3.3V, and it has a voltage regulator that allows it to be connected to 5V.

I can do the code and place the ESP32 in a mouse on a Saturday (my C is always rusty, as I'm not using it professionally). And I am a hobbyist at best. So the original serial mouse/keyboard are connected to the ESP32, which normally proxies the signal of the hand movement to the PC.

If I press a pedal (w/ foot), the second computer would receive a signal, calculate a human-like trajectory to move the mouse, and send it to the ESP32. The latter will execute it along with the left click to shoot.

Like I said - trivial. The same can be done with the USB port as well, and it's not harder. It's just that for PS/2 I have the tools lying around.

It is trivial. All you need is a device that can work as a USB gadget that you can plug your keyboard into. I could do it with my phone, many Raspberry Pi-like SBCs, or even Arduino-like boards...
YOLOv3 can do it