Hacker News
by hliyan 2257 days ago
Really impressed with the use of the Monte Carlo method. A while back I ran into some resistance trying to advocate simpler statistical methods for a problem domain similar to this, while the team in question repeatedly wanted to reach for a machine learning solution. I'd love to know if I was wrong here. In my mind, when an algorithmic or heuristic path to a solution is available, we should attempt it first before reaching for ML.
4 comments

A common trap in tech is to reach for the fanciest tool instead of the simplest. The best engineering often comes from mastery of the simple tools, and the most beautiful engineering is the one that makes you say 'that's so obvious, why didn't anyone ever come up with that before?'
One of my favorite software solutions that I came up with was “human learning” powered. A few years ago I was working as the engineering manager at a small company. I had 8 other people who I was in charge of and we had moved to a new office in the middle of this. Our team took up two rooms in the office and I had to figure out who would be sitting where. I had some preconceived notions of who would be productive together, who would annoy each other, etc., but there were enough possible combinations to make this a large enough search space.

So I wrote a very simple Python script that would randomly generate layouts of who would sit in each room and next to whom. Every time it gave me a result, I scanned it for conditions that would make it not work and added a rule to skip such configurations. After about six such edits I got a layout I thought was acceptable. The team as far as I know was happy and nobody questioned it for the entire time we were there. This saved me time because I didn’t have to pre-program all the conditions, only add ones I had already seen not work. Saved both CPU and brain cycles, so to speak.
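For anyone curious what that kind of script might look like, here is a minimal sketch, not the original. The names, the 5/4 room split, and both rules are placeholders I made up for illustration; the idea is just to shuffle, check against the rules collected so far, and retry.

```python
import random

# Hypothetical team: the post names no one, so these are placeholders.
PEOPLE = ["Ana", "Ben", "Cho", "Dev", "Eli", "Fay", "Gus", "Hana", "Ivo"]
ROOM_SIZES = (5, 4)  # assumed split of nine people across two rooms


def random_layout(rng):
    """Shuffle everyone, then cut the list into rooms; list order is seat order."""
    people = PEOPLE[:]
    rng.shuffle(people)
    rooms, start = [], 0
    for size in ROOM_SIZES:
        rooms.append(people[start:start + size])
        start += size
    return rooms


def same_room(layout, a, b):
    """True if a and b were placed in the same room."""
    return any(a in room and b in room for room in layout)


def adjacent(layout, a, b):
    """True if a and b sit in neighbouring seats of the same room."""
    for room in layout:
        if a in room and b in room:
            return abs(room.index(a) - room.index(b)) == 1
    return False


# Rules get appended by hand, one at a time, whenever a generated layout
# shows a problem. Each rule returns True when the layout is acceptable.
# Both rules below are invented examples.
RULES = [
    lambda layout: not adjacent(layout, "Ben", "Gus"),
    lambda layout: same_room(layout, "Ana", "Cho"),
]


def propose(rng, max_tries=10_000):
    """Generate random layouts until one passes every rule written so far."""
    for _ in range(max_tries):
        layout = random_layout(rng)
        if all(rule(layout) for rule in RULES):
            return layout
    raise RuntimeError("no layout satisfies the current rules")


if __name__ == "__main__":
    for room_number, room in enumerate(propose(random.Random()), start=1):
        print(f"Room {room_number}: {', '.join(room)}")
```

The loop is: run it, read the printed layout, and if something looks wrong, append one more lambda to RULES and run it again. The rules list only ever contains problems you have actually seen.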

This is great! In a way it's like the output was a question rather than an answer.

This is quite similar to education research on Teachable Agents [1], which is an embodiment of the idea that to know something you must be able to teach it to someone else. In Teachable Agents you teach the computer rules (e.g. how ecosystems work), and then it gets a quiz where it compares the output to the right answer. When it's wrong, you as the teacher must figure out whether the rules you taught it were correct and/or if it needs more rules.

Teachable Agents works for things where there's a right or wrong answer, because the computer is doing the test proctoring. But in your method the human is doing that, and I think it works quite well for things of a more qualitative nature like the arts, with the human playing the role of the critic or curator.

[1]: https://slate.com/technology/2015/04/teachable-agents-making...

That is super cool, thank you!
There's a great talk from StrangeLoop[0] where a presenter uses Alloy modelling in exactly the same manner:

- generate a couple examples

- see something wrong

- add a rule

- repeat

[0] https://www.youtube.com/watch?v=FvNRlE4E9QQ

Is this how sports schedules (NFL, NBA, etc.) are made?
They are heavily based on divisions and recent seasons.

(For example, in the NFL, the teams in each division play each other twice, and then get the rest of their games by rotating through teams from the rest of the league)

This is a pretty cool video about a couple that made the MLB schedule for years https://youtu.be/yT0CMOGKKhU
Another common trap is to believe a widely known simple method must be better just because you are unfamiliar with state of the art research.

Many times people have specifically researched why a certain method is better than the status quo in practice, on real world data with real world operating constraints.

A good example is the paper “Let’s Put the Garbage-Can Regressions and Garbage-Can Probits Where They Belong”, which explains an extremely common and severe problem with “garbage can” regression models.

http://www.saramitchell.org/achen04.pdf

The first arduous task of an ML solution is to reach parity with computing basic statistics on the incoming data, and there are many failure points before and after that point. Saying 'we need an ML solution' is a lot like saying 'we need to write this in assembly'.
He wants an ML project on his résumé.
Absolutely!

Try writing a simple calculator app with ML and you won't even have marked up enough training data by the time I have written it algorithmically. When it's finished, the ML one won't be as reliable.

I've been involved in a few projects where ML was a candidate approach. Mostly the answer was just to write down what we already know about the problem domain instead.