Hacker News new | ask | show | jobs
by nraynaud 2070 days ago
A few years back I had a traveling salesperson problem, I googled a bit, and somehow didn't find anything interesting, so I just created a random loop, and randomly mutated it until it got better. A few months ago I discovered a neat little heuristic: the best loop never self-intersect. And a few days ago, I found that there is a definite algorithm that gives a result whose worst case is bounded WRT to the optimal.

My question is how do you find all those algorithms? Wikipedia never really feel like a good introductory nor discovery place. In particular, some problems have been perfectly studied by scholars, but you don't find them because you have not found the keywords that will direct google in your search. And I am not a researcher, so I don't keep a tab on a domain, I'm a jack of all trades, I work on a wide set of things.

6 comments

Your method is similar to a popular probabilistic technique called simulated annealing [1].

The general setting for this type of algorithm is that you have a large number of states and you want to find the state that maximizes a particular score. In the TSP, the states are paths and the score is the (negative of the) path length. The idea is to define a Markov Chain whose stationary distribution is proportional to the score. This means that if you jump from state to state according to the probabilities defined by the MC, eventually the probability of ending up at a given state is proportional to that states score.

In the case of TSP the states can be represented as permutations of the cities and the neighbors of a state are given by swapping the positions of two cities. The probability of moving to a neighbor is high if the length of the neighbor is shorter than the current path.

[1] https://en.wikipedia.org/wiki/Simulated_annealing

> My question is how do you find all those algorithms? Wikipedia never really feel like a good introductory nor discovery place.

I know a place.

From the Github repo Coding Interview University[0] there is a link to a Jupyter notebook[1] on various ways to solve the traveling salesman problem - it is a very good, detailed, resource.

[0] https://github.com/jwasham/coding-interview-university

[1] https://nbviewer.jupyter.org/url/norvig.com/ipython/TSP.ipyn...

> you have not found the keywords

Knowing the right keywords and roughly how they fit together is a huge component of domain expertise. Building general background knowledge through experience and readings goes a long way for this reason.

I think that most of the times they are particular to a domain to solve particular problem. For example Some researcher would have a grant for developing an optical inspection of a circuit board using a camera moving on the 2d euclidean space efficiently is a TSP. You can develop a Polynomial Time approximation scehne(PTAS) to solve it.

At first it's novel then st some point it appears in a book specific to polynomial time approximation schems in books dedicated to the cause. Look at the references in wikipedia. They often point to a larger body of work than what the wiki can possibly hope to cover.

I would try to look for academic papers on the topic. Even if you can’t find an exact match, papers typically have a section that describes other related work. You can also scan through the citations to see if anything might be a match. If there’s some relevant paper, you’ll probably find it after searching through a few papers.

When reading a paper I would just scan the abstract, conclusion, introduction, related work section (possibly in that order) to see if it’s relevant.

For finding papers I like semantic scholar and Google scholar.

OOC: how did you create your initial random loop? This was a Kaggle problem about seven years ago. The competitive solutions found a good initial guess by breaking up the domain into a grid. They would solve TSP in each grid cell, and then stitch these solutions together to get a reasonable initial global solution. If you created your initial loop greedily (as I did) you got a garbage solution, and also wasted a lot of time that you could have spent doing random swaps or simulated annealing.
sorry, I misremembered, I used a Z-order path in the end:

https://github.com/nraynaud/webgcode/blob/gh-pages/webapp/cn...

I guess I decided that some order was better than random.