by Cyph0n 3436 days ago
Nicely done! But I'm assuming that this is more of an exercise rather than a real-world application of ML? I say this because the task of keeping a car between two lines is trivially done using control algorithms. Of course, the CV part -- "seeing" the lines -- requires some form of ML to work in the real world.
5 comments

Obviously, "Lane Following Autopilot using my brain and control theory" would not make it to the top of HN. Welcome to the new era where TensorFlow replaces Lyapunov and ML spares you the need to understand hard problems... until you need guarantees and safety... but it's OK, let's add more data.
I agree with you. If you can leverage control theory from the 1950s to solve your problem, what's the point?

However, I will state that using e.g. Lyapunov functions to prove the stability of the system requires a model of the system. And even if you need a guarantee for your system, that guarantee is only as good as the fidelity of your model. For an inexpensive RC car, with slippage and saturation, without torque control or inertial sensing, you're going to have a hard time doing something that sounds as principled as what you suggest.

You seem to be forgetting the entire vision pipeline that automatically extracts "lanes", with that information incorporated in an end-to-end manner that requires only the true steering angles and nothing else. It's easy to comment, but it's not as straightforward or trivial as one might assume.
Indeed, I was not really talking about the vision pipeline. But once you decouple the problem (use ML for vision, planning for the trajectory, controls for the rest), you'll get much more stability, guarantees and insight into how to improve your problem. These kinds of end-to-end approaches are very hard to evaluate, they have zero educational value, are not parsimonious and tend to reduce people's analytical skills.
But to be able to decouple the vision pipeline you need a lot of manual annotation work which is tedious.
Tedious, and also solved for a decade already. Also, it's much easier to just find lanes using traditional CV and simply use annotators to verify the lane labels.
You can never use 1950s control theory to solve your problem? I think you didn't understand my comment, so please let me clarify: I was claiming that even the control problems in this RC-car-lane-keeping domain can benefit from learning approaches.
Are you disagreeing with my comment? Or stating that I should have included additional points in my comment?

In any case I think I understand your comment, that in addition to the control problem, there's a perception problem.

Rather than demeaning someone's effort, learn why specific methods are considered appropriate/state-of-the-art when solving certain problems.

This is an unbelievably wrong comment. All the Lyapunov and traditional process control theory in the world won't help you solve autonomous driving. Also, regarding "Guarantees and Safety": they don't magically appear out of thin air when you use traditional process control, especially in noisy domains like autonomous driving. This comment is the equivalent of "I can write code to solve Atari Pong in any programming language deterministically, so any post showing Deep Reinforcement Learning is stupid"...

So control theory is ok for unmanned aerial drones but autonomous driving is just too far? Control theory can't handle noisy domains?

Guarantees of safety (more accurately, stability) are the entire point of Lyapunov analysis, and it's used on noisy systems all the time (https://www.mathematik.hu-berlin.de/~imkeller/research/paper...). Can you point to a specific noisy system that control theory is ill-suited for?
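
For readers who have not met it, the standard textbook form of Lyapunov's direct method is sketched below. The symbols (f, V) are generic and are not taken from any system in this thread. Note that the condition is stated in terms of the model f, which is why the guarantee is only as good as the model.

```latex
\begin{align*}
&\text{System: } \dot{x} = f(x), \qquad f(0) = 0 \\
&\text{If a continuously differentiable } V \text{ satisfies} \\
&\qquad V(0) = 0, \qquad V(x) > 0 \ \text{ for } x \neq 0, \\
&\qquad \dot{V}(x) = \nabla V(x)^{\top} f(x) < 0 \ \text{ for } x \neq 0, \\
&\text{then } x = 0 \text{ is asymptotically stable.} \\[4pt]
&\text{Example: } \dot{x} = -x, \quad V(x) = \tfrac{1}{2}x^{2}
  \;\Rightarrow\; \dot{V}(x) = x\,\dot{x} = -x^{2} < 0 \ \text{ for } x \neq 0.
\end{align*}
```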

Once you have a path to follow, classical control theory can be used to control the steering angle to follow it.

But classical control theory hasn't been able to extract, from camera pixels, the open path in a road with cars, bicycles, and pedestrians. Camera inputs are million-dimensional, and there aren't accurate theoretical models for them.

Unmanned drones are orders of magnitude easier, since there is nothing you can just fly into once you are above a few hundred feet. They also don't have to rely on any vision-based sensing. E.g., a drone has altitude, current speed and heading, all of which, while noisy, can be represented easily as a small set of values.

The whole of Lyapunov and control theory assumes perfect knowledge of sensors. Even though the signal itself might be error-prone, you have a signal. In the case of autonomous driving, even in simple cases such as the one described in the blog post, knowing the exact position of the markers and then using them to tune the controller is not as easy as you might think.

The end-to-end system shown here solves three problems: it processes the images to derive the signal, it then represents that signal optimally to the controller, and it then tunes the controller using the provided training labels.

I cited Lyapunov more as the ABC of nonlinear controls. Much more can be done in an analytical fashion; the "end-to-end" system here does not "solve" anything. It is a trained steering-command regressor, nothing fancy. It's likely to work in this guy's living room under certain lighting conditions, but there is no way of predicting its accuracy, sensitivity or anything else. Engineers have been breaking down systems into subsystems for a reason: tractability of testing and improvement. End-to-end systems like that have close to zero value if you need something reliable.
Obviously, my message was slightly provocative. Deep learning methods and classical controls (which, by the way, are able to quantify robustness to plant uncertainties and noisy signals) are all very useful but should be used in combination. End-to-end techniques that bundle perception, planning and control in an opaque net are fun to play with (like in this article); it's just very sad to see people believing this produces robust, safety-critical systems, and we see too many such articles on HN.
I agree with you in the sense that if there is a known and reliable way to map knowledge and information from one domain to another (e.g. from desired trajectory + perceived current position to steering inputs), I'd much prefer that to black-box-ish neural nets. Neural nets aren't meant to be the silver bullet.

In this case, though, any kind of state-space control also requires rather precise knowledge of the physical laws that govern the dynamics of the vehicle. When such information is not available, can neural nets do a decent job at mimicking an analytical control algorithm? I think that's an interesting problem worth exploring.

Why learn to walk when crawling is effective? When crawling you have hard guarantees that you won't fall down.
When falling down actually corresponds to killing a pedestrian, I'd rather try to understand the complexity of robust walking than observe a bunch of humans, mimic their behavior and hope for the best.
Also, general autonomous driving isn't as hard as it sounds. Given a list of time-dependent coordinates of obstacles, it should be pretty easy to navigate around such that no collision occurs. The hardest part is testing, but this is just a matter of tedious work and doesn't require great intellectual effort.
> Given a list of time-dependent coordinates of obstacles

A tree falls in an intersection because some carpenter ants chewed through the trunk. Cars swerve to miss the tree and collide in an inelastic ball of nonlinearity, showering debris everywhere. You approach this at 65 mph and have 23 ft to decide what to do. Fear not, you have a list, a perfect list with coordinates, velocities, and material properties of every solid body in the area. Furthermore, without great intellectual effort, you can solve the millions of coupled differential equations that govern the dynamics of the entire system in near real time. Oh, and your list also has a measure of importance of each bit of mass, whether it is human, animal, or inert. And your list also accounts for the degrees of freedom introduced by every other car approaching the intersection, also using their own respective lists and perfect knowledge of the world to miss each other?

Do you happen to have a PhD in Robotics?

I think rockets are straightforward too: just a bottle with expanding gases going through a series of nozzles, pointed at different angles at the correct time. But since I know I am not a rocket scientist, I don't go around claiming the moon landing was not a "great intellectual" effort.

Yep, this was an exercise to compete in the DIYRobocars race in West Oakland last weekend. There were 9ish cars, with 7 running end-to-end TensorFlow autopilots and the others using OpenCV/line detection. OpenCV won the race.
This is primarily a fun toy problem.

It uses a Raspberry Pi and ~50 lines of code. So I don't think anyone should expect it to do something that's impossible with other approaches.

> trivially done using control algorithms

Is it really trivial? Honest question... Which control algorithms are you speaking of?

As u/gumby said, I was thinking of a PID controller. Basically, the car would continuously measure how far away it is from the line and compare that with the "expected" (computed) value. Based on the error between the two values, the controller would adjust some variable (e.g., wheel angle).

Computing how much of an adjustment is required is where the PID part comes in. The controller uses the Derivative (rate of change) of the error as well as the Integral of the error to improve its estimate. These two values can intuitively be thought of as the predicted error and the history of the error, respectively.
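
To make the three terms concrete, here is a minimal sketch of a textbook discrete PID loop. The class name, the gains and the toy "plant" (the steering command directly sets the lateral velocity) are all assumptions for illustration, not anything taken from the project under discussion.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PID:
    """Textbook discrete PID controller; the gains used below are illustrative, not tuned."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: Optional[float] = None

    def update(self, error: float, dt: float) -> float:
        # Integral term: accumulated history of the error.
        self.integral += error * dt
        # Derivative term: rate of change of the error (zero on the first call).
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def simulate_offset(pid: PID, offset: float = 1.0, dt: float = 0.1, steps: int = 200) -> float:
    """Toy plant (assumed): the steering command directly sets the lateral velocity."""
    for _ in range(steps):
        error = 0.0 - offset  # we want zero distance from the line
        offset += pid.update(error, dt) * dt
    return offset


final_offset = simulate_offset(PID(kp=2.0, ki=0.5, kd=0.1))
```

In this toy setup the offset decays toward zero. A real car adds heading dynamics, actuator delay and saturation, which the plant above deliberately ignores.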

[1]: https://www.wikiwand.com/en/PID_controller

I think it's important to ponder that a PID controller is, in almost every case where it is used, a heuristic controller which achieves pretty mediocre performance. Unless the plant is a second-order linear system, with constant gains, and maybe some non-constant biasing (what the I term is supposed to handle), a PID controller is theoretically inappropriate, and requires tuning.

The handful of parameters you need to adjust in a PID controller parameterizes a very small class of controllers. For a given control problem, the controller you want might fall outside that class. People try to expand the class of "PID" controllers in all sorts of ways (e.g. anti-windup), but from where I stand it's just hacks on top of hacks.

It makes sense to consider a much wider class of controllers, with many more parameters, to possibly achieve better performance, or at least to avoid having an expert tune some gains in place of collecting bucket-loads of data.

As someone who's tried to control a car with PID, there is more to it than that. The angle of the car relative to the lines and the distance from the line need to be taken into account separately, since they each independently contribute to the distance from the lines as you move forward, and the PID controller can't separate them by itself. Think of how hard it is to keep straight in heavy fog when you can only see a few feet in front of the car, and you'll get the idea.

The delay in your steering response (how fast you can measure the error and turn the wheel) is large enough here that if you aren't actively taking the non-linearities of the problem into account you will oscillate off the road. The other way to fix this is to aim for a point far ahead of you such that your steering response time is significantly less than your "following distance", but that results in cutting corners.
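
The "aim for a point far ahead" idea resembles what is commonly called pure pursuit. The sketch below is that textbook geometry for a bicycle-model car, not necessarily what was actually implemented here; the function name, the coordinate convention and the wheelbase value are assumptions.

```python
import math


def lookahead_steering(target_x: float, target_y: float, wheelbase: float) -> float:
    """Steering angle (radians) that puts a bicycle-model car on a circular arc
    through a target point given in the car's frame (x forward, y to the left).
    """
    lookahead = math.hypot(target_x, target_y)
    if lookahead == 0.0:
        return 0.0  # already at the target; nothing to steer toward
    # Angle between the car's heading and the line to the target point.
    alpha = math.atan2(target_y, target_x)
    # Curvature of the arc that passes through the target point.
    curvature = 2.0 * math.sin(alpha) / lookahead
    return math.atan(wheelbase * curvature)


WHEELBASE_M = 0.25  # hypothetical RC-car wheelbase in meters
near = lookahead_steering(1.0, 0.2, WHEELBASE_M)
far = lookahead_steering(3.0, 0.2, WHEELBASE_M)
```

For the same lateral offset, the farther target gives a gentler command (`far` is smaller than `near`). That damps oscillation, and it is also why a long look-ahead cuts corners.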

Cool. I'd never heard of a PID controller... although it sounds vaguely related to a Kalman filter. Indeed: https://www.quora.com/Is-there-any-intrinsic-connection-betw...

Thanks also for the link to wikiwand!

Huh, I'll need to read up on Kalman filters then!

Yeah, Wikiwand is amazing. Be sure to grab the browser plugin: it automatically redirects any Wikipedia links to Wikiwand.

This is part of the reason I love HN.. I just spent my lunch learning about PID controllers! Thank you!
For context, a hardware PID controller is a commodity part you can buy for a couple of dollars.
umm...say, a PID loop?
> But I'm assuming that this is more of an exercise rather than a real-world application of ML?

While this example is simplified - and I wouldn't recommend it for a real-world full-size vehicle trial - it does implement everything (scaled down) described in NVidia's paper:

https://images.nvidia.com/content/tegra/automotive/images/20...

In short, the project uses OpenCV for the vision aspect, a small CNN for the model, uses "behavioral cloning" (where the driver drives the vehicle, taking images of the "road" and other sensor data like steering - as features and labels respectively - then trains on that data), and augmentation of the data to add more training examples, plus training data for "off course" correction examples...

If you read the NVidia paper, you'll find that's virtually all the same things they did, too! Now - they gathered a butt-ton (that's a technical measurement) more data, and their CNN was bigger and more complex (and probably couldn't be trained in reasonable time without a GPU), plus they used multiple cameras (to simulate the "off-lane" modes), and they gathered other label data (not just steering, but throttle, braking, and other bits)...but ultimately, the author of the smaller system captured everything.
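
As a rough illustration of how side cameras can supply "off-lane" examples, the sketch below labels each left and right image with the recorded steering plus or minus a fixed correction. The `Frame` type, the sign convention (positive means steer right) and the 0.2 correction are assumptions made for the example; a fixed offset is a simplification and is not claimed to match either the paper or this project.

```python
from typing import List, NamedTuple, Tuple


class Frame(NamedTuple):
    center_image: str  # path or ID of the center-camera image
    left_image: str
    right_image: str
    steering: float    # recorded command in [-1, 1]; positive = steer right (assumed)


def clamp(value: float) -> float:
    return max(-1.0, min(1.0, value))


def expand_with_side_cameras(frames: List[Frame],
                             correction: float = 0.2) -> List[Tuple[str, float]]:
    """Turn each recorded frame into three (image, steering) training samples."""
    samples = []
    for frame in frames:
        samples.append((frame.center_image, frame.steering))
        # The left camera sees the road as if the car had drifted left,
        # so its label steers a little more to the right.
        samples.append((frame.left_image, clamp(frame.steering + correction)))
        # The right camera is the mirror case.
        samples.append((frame.right_image, clamp(frame.steering - correction)))
    return samples


samples = expand_with_side_cameras([Frame("c.jpg", "l.jpg", "r.jpg", 0.0)])
```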

Furthermore, NVidia's system was used on a real-world car, and performed quite well; there are videos out there of it in action.

This is virtually the same kind of example system that the "behavioral cloning" lab of Udacity's Self-Driving Car Engineer Nanodegree is using. We're free to select what and how to implement things, of course, but I am pretty certain we all understand that this form of system works fairly well in a real-world situation, and so most of us are going down the same route (i.e., behavioral cloning, CNN, OpenCV, etc). Our "car" though is a simulation vehicle on a track, built using Unity3D.

> Of course, the CV part -- "seeing" the lines -- requires some form of ML to work in the real world.

Actually, it doesn't. The first lab we did in the Udacity course used OpenCV and Numpy exclusively to "find and highlight lane-lines" (key part was to convert the image from BGR to HSV, and mask using the hue). No ML was required.
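
A minimal sketch of the hue-masking idea follows. In OpenCV the two steps would be `cv2.cvtColor(image, cv2.COLOR_BGR2HSV)` followed by `cv2.inRange(...)`; the version below uses only the standard library's `colorsys` so that it runs without dependencies. The function name, the hue range for yellow paint and the saturation and brightness thresholds are assumptions, not values from the lab.

```python
import colorsys


def hue_mask(image_bgr, hue_range=(40 / 360, 70 / 360), min_sat=0.4, min_val=0.4):
    """Return a boolean mask (list of rows) marking pixels whose hue is in hue_range.

    image_bgr is a list of rows of (blue, green, red) tuples with 0-255 values,
    matching OpenCV's channel order. Hue is a fraction of the color wheel (0-1).
    """
    lo, hi = hue_range
    mask = []
    for row in image_bgr:
        mask_row = []
        for blue, green, red in row:
            hue, sat, val = colorsys.rgb_to_hsv(red / 255, green / 255, blue / 255)
            # Require some saturation and brightness so gray asphalt and
            # dark shadows are not matched by hue alone.
            mask_row.append(lo <= hue <= hi and sat >= min_sat and val >= min_val)
        mask.append(mask_row)
    return mask


# Tiny synthetic "road": yellow paint, white paint, gray asphalt (all BGR).
road = [[(0, 255, 255), (255, 255, 255), (80, 80, 80)]]
lane_pixels = hue_mask(road)
```

Only the yellow pixel is selected. White paint has no meaningful hue, so a hue mask alone misses it and a separate saturation/brightness threshold would be needed for white lines.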

That said - I wouldn't trust it for real-world vehicle driving use - but it possibly could be used as part of a system; however, as NVidia has shown, a CNN works much better, without needing to do any pre-processing with OpenCV to extract features of the image - the CNN learns to do this on its own.