by supergeek 1027 days ago
Exciting! I do a lot of FPV drone racing and it always felt ripe for AI control to take over. Because the video system is so low quality, the course is intentionally very high contrast, which makes it almost perfect for a vision system. The relatively limited and constant set of inputs helps too.

I will say, flying indoors on a relatively simple track like theirs is a lot easier than flying outdoors on more real world tracks.

It's a bummer that they didn't fly the standardized 2023 MultiGP global qualifier track. Then we could rank their AI against every pilot more objectively. You can see that track design here: https://www.multigp.com/global-qualifier/

It may well be because it simply can't complete that track yet. As soon as it can complete it at all it will probably be near the top performers. But if the situation changes considerably from the training conditions, or even with relatively minor (but unexpected) changes to the track, I would still expect it to fail.
I think the idea of using the residual model (trained on high-resolution outside-in pose estimation) to correct the inside-out model is particularly interesting because it might actually generalize. If the kinematics residuals (basically the delta between simulation training and the real-world aerodynamic and inertial behavior of the aircraft) can be made general enough to reflect the flight dynamics of the drone, rather than the specific actions on a specific course, this is a promising approach for general-purpose flight.

It's a really fascinating approach compared to most "AI-guided" drones which use models for vision and pathfinding but a traditional IMU+PID loop for kinematic control.
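
For anyone unfamiliar with what a "traditional PID loop" looks like, here is a minimal sketch in Python. To be clear, this is not the paper's controller or any real flight firmware: the gains, the `simulate` helper, and the first-order "plant" standing in for the drone's roll-rate response are all made-up assumptions, just enough to show the feedback structure.

```python
from dataclasses import dataclass


@dataclass
class PID:
    """Textbook PID controller; gains here are illustrative, not tuned for a real quad."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: float = 0.0

    def update(self, setpoint: float, measured: float, dt: float) -> float:
        error = setpoint - measured
        self.integral += error * dt                    # accumulated error (I term)
        derivative = (error - self.prev_error) / dt    # rate of change of error (D term)
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate(steps: int = 2000, dt: float = 0.001, target_rate: float = 200.0) -> float:
    """Drive a toy first-order roll-rate model toward target_rate (deg/s)."""
    pid = PID(kp=0.02, ki=0.5, kd=0.0)   # hypothetical gains; D left at zero for simplicity
    rate = 0.0                           # the "gyro reading" of the toy model
    for _ in range(steps):
        command = pid.update(target_rate, rate, dt)
        # Assumed plant: rate relaxes toward 1000 * command with a 50 ms time constant.
        rate += dt * (1000.0 * command - rate) / 0.05
    return rate


if __name__ == "__main__":
    print(f"final roll rate: {simulate():.2f} deg/s")
```

The point of the contrast: in a setup like this the learned parts only decide where to go, and a hand-tuned loop like `PID.update` turns that into motor commands, whereas the approach discussed here pushes learning down into the control layer as well.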

It reminds me of the way the Mario Brothers game changed overnight: for the longest time it was considered impossible and then suddenly it was unbeatable. But both benefit from the static course. If it generalizes it will be a game changer; otherwise, I'm not that impressed. But there may well be some gold to be found here and I applaud them for making it work this far. Maybe they could purposefully improve their performance in real world situations by making this one harder, for instance by changing the lighting from one run to another, introducing or removing obstacles or moving goals around. That might force the model to come out more general. We've seen similar strategies used with good results in image classification problems. In fact I used them myself when building the lego sorter: as long as everything was always lined up perfectly it worked a lot worse than when introducing various complications. During the real world runs those would show up all by themselves anyway, and where before they were classified wrong or ended up in the recycling bin for another shot, they were suddenly classified right.

Of course a setup where you can gather your training data with thousands of images per hour has some advantages over one where if you get it wrong you have to rebuild your drone...
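
A rough sketch of what "making the course harder on purpose" could look like in a simulator, in Python. Everything here is hypothetical: `randomize_course`, its parameters, and the course representation are invented for illustration and are not taken from the paper or from any real simulator.

```python
import random


def randomize_course(base_gates, rng, max_shift_m=0.5,
                     brightness_range=(0.6, 1.4), max_obstacles=3, bounds_m=10.0):
    """Return a perturbed copy of a course for one training run.

    base_gates: list of (x, y, z) gate centres in metres (assumed representation).
    rng: a random.Random instance, so runs are reproducible from a seed.
    """
    # Move every gate a little in each axis ("moving goals around").
    gates = [
        tuple(coord + rng.uniform(-max_shift_m, max_shift_m) for coord in gate)
        for gate in base_gates
    ]
    # Scatter a random number of obstacles ("introducing or removing obstacles").
    obstacles = [
        (rng.uniform(-bounds_m, bounds_m), rng.uniform(-bounds_m, bounds_m))
        for _ in range(rng.randint(0, max_obstacles))
    ]
    # Scale scene brightness ("changing the lighting from one run to another").
    brightness = rng.uniform(*brightness_range)
    return {"gates": gates, "obstacles": obstacles, "brightness": brightness}


if __name__ == "__main__":
    rng = random.Random(42)
    base = [(0.0, 0.0, 1.5), (5.0, 2.0, 1.5), (10.0, -1.0, 2.0)]
    for run in range(3):
        course = randomize_course(base, rng)
        print(run, len(course["obstacles"]), round(course["brightness"], 2))
```

Each training run would draw a fresh course from `randomize_course` instead of reusing the fixed layout, which is the same idea as augmenting images for a classifier.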

A changing course is one of the biggest impacts of flying outside. You have the wind directly acting on your drone, and you have the wind acting on the gates and flags. Flags will spin around in the wind, and double gates will lean, sometimes quite a lot.

There's a whole skill to feeling the wind on your body and anticipating how the drone will behave. When I feel a big gust of wind I'm going to slow down out on the course to get my bearings.

Is the Mario Brothers reference to some AI "solving" Mario by auto-playing it successfully?
How much do human drone racers memorize the individual standardized track layouts vs reacting on the fly?
Navigating between gates is usually done by flying what you can see and reacting real-time to conditions.

When you get to a gate, or perform a turn or a split-S, these manoeuvres can be executed faster than you can see and process, so instead you go faster than your brain by memorising the timing of your inputs and executing them from memory rather than flying what you can see.

For example, if I want to do a barrel roll, I don’t watch the horizon. Instead I rely on my memory of how fast it rotates at full stick deflection.

Sometimes these set pieces go wrong (you clip a gate, for example), and it’s usually my hearing that processes that a fraction of a second before my vision does.

It’s quite common to fly without vision for short periods, since the analogue video feed is not reliable.

I'd say it's a combination of both. Just like in auto racing, some people are really good at flying "seat of the pants" or "reading the course" and adapting to a new layout gate-to-gate, while other pilots are better at a course with practice and memorization. At the top, it's about both together. Most races are conducted with analog video at a very low output power to reduce interference between racers, so the visuals are pretty weak and most pilots do rely on a fair amount of memorization. But there's also a not-inconsequential conditions angle that comes in when flying outside, so pilots need to be adaptable.

In terms of "fairness" to the computer, I think this approach isn't up to snuff with a human pilot until it can fly an outdoor course in changing conditions. Still, I find this particular approach very interesting since it's inside-out (self-contained) flying once it's trained, with guided learning to start. I found the earlier purely outside-in guidance approaches to be rather ho-hum as they weren't very practical and basically skipped the "hard parts."

In auto racing (other than rally), pro drivers typically know the track incredibly well, down to every bump that could possibly affect traction. Often there are blind crests where they have to know the track perfectly to be able to go through a corner at maximum attack. F1 drivers know the tracks like the back of their hand, except when they go to new tracks, and even then they can practice in the simulator.
What drone/controller/headset are you using?