| HN Mirror

	by monort 496 days ago
	Talk by the inventor: https://www.youtube.com/watch?v=ArQNyOU1hyE

3 comments

kristopolous 496 days ago

This strikes me as something that many people probably figured out a non-rigorous version of and didn't think it was special.

It's kind of one of those resource management hacks you do when you're constrained and screwed by limitations. Splitting things up by priority is a common go-to for resource allocation. This is a spin on that.

I wonder how many other "in the trenches hacks" people have done that overturn widely accepted things the inventors didn't realize were a big deal: "well I usually have a bunch of deliveries to make and I've figured out a clever way to map out the quickest path... "

Don't get me wrong - recognizing it and then formalizing it, doing the work, publishing the paper - that's a lot of effort. I'm not taking that away.

vanderZwan 496 days ago

Also relevant: in this particular case the authors themselves note that the result has better theoretical behavior in the worst case, but no practical uses yet. So I think any software engineer exploring this direction would have abandoned it pretty quickly, for the same reason that galactic algorithms aren't typically invented by them either (unless they also do compsci as a hobby of course). In fact the Wiki page for galactic algorithm mentions another optimal-but-impractical hash table as one of its examples[0][1].

[0] https://en.wikipedia.org/wiki/Galactic_algorithm

[1] https://www.quantamagazine.org/scientists-find-optimal-balan...

josh-sematic 496 days ago

Relevant xkcd: https://xkcd.com/664/

optimalsolver 496 days ago

Also:

https://xkcd.com/1425/

br1 495 days ago

Leapfrog Triejoin is an example of the trenches contributing to academia and academia valuing it: https://x.com/RelationalAI/status/1836115579133939752

vanderZwan 496 days ago

> I wonder how many other "in the trenches hacks" people have done that overturn widely accepted things the inventors didn't realize were a big deal: "well I usually have a bunch of deliveries to make and I've figured out a clever way to map out the quickest path... "

A lot of them. Having said that: yes, I can imagine that others would have thought up Dijkstra's shortest path algorithm, since he himself said it came to him while shopping, and that it only took him twenty minutes to reason through the original O(n²) algorithm. (edit: oh wait, that's what you're alluding to isn't it? Heh, that went straight over my head).

On the other hand, I don't think the faster versions of Dijkstra's algorithm would have been invented by anyone without at least some understanding of priority queues and big-O behavior. And at that point I hope people realize that they possess some specialized knowledge that might not be entirely common.
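For readers who have not seen it, here is a minimal sketch of the priority-queue variant being referred to, using a binary heap from Python's standard library. The `dijkstra` function name and the adjacency-list graph format are assumptions made for illustration; they are not from the thread.

```python
import heapq

def dijkstra(graph, source):
    """Return shortest distances from source.

    graph maps each node to a list of (neighbor, weight) pairs
    (a hypothetical format chosen for this sketch); weights must be >= 0.
    """
    dist = {source: 0}
    heap = [(0, source)]  # (distance so far, node)
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry; a shorter path was already found
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

graph = {"a": [("b", 1), ("c", 4)], "b": [("c", 2)], "c": []}
print(dijkstra(graph, "a"))
```

The heap replaces the linear scan for the closest unvisited node, which is where the improvement over the original O(n²) version comes from on sparse graphs.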

In fact, I'd argue that the true strength of Dijkstra's write-up is that it gives us a vocabulary to reason about it and come up with specialized data structures for particular situations.

Anyway, what you're touching on is the difference between engineering and science: engineering works with confidence built from tests, rules of thumb that reflect lessons learned from historical results, and (in modern times) verified predictions from science. Those rules of thumb might be used when lacking a deeper scientific understanding of why it works. The tests might exist to work around the limitations of scientific knowledge (e.g. modelling turbulence). Science creates insights and predictions through modelling of empirical results. At least that's the difference according to Bill Hammack[0].

In an ideal world the two professions work together and build on each other's results to propel each other forward of course.

[0] https://www.youtube.com/playlist?list=PL0INsTTU1k2X4kCPqmi1e...

chrisweekly 496 days ago

> "some specialized knowledge that might now be entirely common"

now -> not, right?

great comment

I'm not being pedantic about a typo, but it reverses the point I think you're making about UNcommon knowledge...

vanderZwan 496 days ago

Yes, that was a typo that made it look like I contradicted myself, thank you for catching that :)

kristopolous 495 days ago

I was referring to the general TSP being solved.

vanderZwan 494 days ago

Eh, the traveling salesman problem is more like the Collatz conjecture: it looks simple but there's a lot of complexity hiding under the surface, and it requires some expertise to truly understand why it's really hard. So then we're talking about the opposite problem.

Note that your informal description did not match the TSP since there's no reason to disallow backtracking or visiting the same place twice.

abetusk 496 days ago

Thanks so much for this link. I remain convinced that papers are so much more understandable with an accompanying talk by the creators. I wish papers would just come with a video talk included.

elcritch 496 days ago

Exactly, the authors get to eschew the formalism required in papers. Often the core ideas of research are simple in themselves and the real complexity lies in formally proving the results.

Also, I'd not be surprised if someone already invented and used this funnel hashing technique in, say, the '80s in some game or whatnot but just never realized what they had stumbled onto. Not to diminish the research, it's very ingenious.

forrestthewoods 496 days ago

Academic papers are terrible at knowledge transfer. A more casually spoken blog post is 100% more effective at communicating ideas imho.

Academia is a weird and broken place.

Disclaimer: work in a research lab full of awesome PhDs who largely agree with me!

abetusk 496 days ago

I think papers make good references. I think of it more like the equivalent of a "datasheet" for an electronic part, say. Once you understand the intricacies, it's a valuable reference but more often than not, it's not very good at conveying motivation or intuition.

IshKebab 496 days ago

They're usually not very good as a reference either - they miss out key steps due to oversight or lack of time.

federiconafria 496 days ago

Great way to see it, papers should not be your first point of contact.

amelius 496 days ago

> Academic papers are terrible at knowledge transfer.

Well, at least they are better than patents.

joaohaas 496 days ago

Thanks for the video, def a lot better than the article.

I do find it a bit weird that this is somehow better than just over-allocating (and thus reducing the chances of key collisions, which also makes worst case 'less worse') given his approach also allocates more memory through the aux arrays.

yencabulator 495 days ago

I don't think anybody is really saying it is. Academics treat big-Oh performance on very very full hash tables like a sport. Real world code on real world CPUs often has a more complex cost function than what the academics considered; cache sizes, fitting in a cacheline, memory bandwidth, icache pressure, ...

sigbottle 496 days ago

He's not allocating through aux arrays, he's splitting the already allocated memory into log(n) layers. You can just track those aux arrays with math in the implementation.

It's probably not better than over-allocating except in memory-constrained scenarios. But the overhead of funnel hashing is not high - it requires 0 extra memory.
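To illustrate what "tracking the layers with math" could look like, here is a rough sketch that carves one already-allocated array into roughly log(n) layers using only offsets. The halving ratio, the `min_size` cutoff, and the names `layer_bounds` and `slot_index` are assumptions made for this example; the paper's actual layer sizes and probing rules may differ.

```python
def layer_bounds(capacity, min_size=4):
    """Split indices 0..capacity-1 into layers of shrinking size.

    Returns a list of (start, size) pairs. Each layer takes half of what
    remains (an assumed ratio), and the last layer takes the leftover.
    """
    bounds = []
    start = 0
    while start < capacity:
        remaining = capacity - start
        if remaining <= min_size:
            size = remaining  # final layer absorbs the rest
        else:
            size = max(1, remaining // 2)
        bounds.append((start, size))
        start += size
    return bounds

def slot_index(bounds, layer, hash_value):
    """Map a hash value to a slot inside the given layer of the shared array."""
    start, size = bounds[layer]
    return start + hash_value % size

bounds = layer_bounds(64)
print(bounds)
print(slot_index(bounds, 1, 12345))
```

No per-layer arrays are allocated here: each layer is just a (start, size) window into the same backing storage, which is the point being made about zero extra memory.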

hydroreadsstuff 496 days ago

Overallocation has a limit. You only have so much RAM/storage. Beyond that you start swapping. I could really use a hash table (or similar structure) that degrades less with higher occupancy.
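As a rough illustration of that degradation, the classical textbook bound for open addressing under an idealized uniform-probing assumption puts the expected number of probes for an unsuccessful lookup at about 1/(1 - α), where α is the load factor. The sketch below just evaluates that formula; it is not the analysis from the paper, and the `expected_probes` name is made up for this example.

```python
def expected_probes(load_factor):
    """Classical 1/(1 - load_factor) estimate for an unsuccessful lookup,
    assuming idealized uniform probing (a textbook model, not funnel hashing)."""
    if not 0 <= load_factor < 1:
        raise ValueError("load_factor must be in [0, 1)")
    return 1.0 / (1.0 - load_factor)

for alpha in (0.5, 0.9, 0.99):
    print(f"load {alpha:.2f}: about {expected_probes(alpha):.0f} probes")
```

Under this model the cost grows slowly at moderate occupancy and then blows up as the table approaches full, which is the regime where a structure that degrades more gracefully would matter.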

rocqua 496 days ago

Could it be that overallocation means you need a bigger search to find empty places or answer queries?