| HN Mirror

Y	Hacker News new \| ask \| show \| jobs


	by scentoni 1115 days ago
	Those are examples of useful functions. They are not examples of useful hash functions.

3 comments

derefr 1115 days ago

I think you're conflating "functions used to build a hash table" with "functions used to build a dictionary ADT." A hash table doesn't have to serve the purpose of storing items in a dictionary ADT. A hash table where everything that's similar collides into the same buckets is just as useful for finding "a linked list of similar items" in O(1), as a hash table where everything gets its own unique bucket is useful for finding items in O(1).

link

dimatura 1115 days ago

It's true that these functions are in some sense very different to traditional hash functions, because they are designed to create "collisions" (or at least be similar) for elements that are similar. But I still think it is intuitive to use the term, maybe qualified as a "similarity hash" (I've also seen "perceptual hash" or "semantic hash"). In both cases we are describing a (potentially) large object with a short code. In normal hashes, the hash is supposed to be the same for objects that are exactly the same - in similarity hashes, that is softened to similar hashes for similar objects.

link

pcthrowaway 1115 days ago

Functions that create collisions can still hashes can't they?

Things like CRCs, Checksums are loosely considered hash functions: https://en.wikipedia.org/wiki/List_of_hash_functions

MD5SUM is a classic example of a hash function with a low (but still reasonable) chance of a collision

link

gabetax 1115 days ago

Specifically, these functions to not provide "uniform distribution".

link

jagged-chisel 1115 days ago

That isn’t a requirement of an algorithm called a hash function

link

gabetax 1115 days ago

From the article:

> For a function to be useful as a hash function, it must exhibit the property of uniform distribution

It's also listed as the first property on the Wikipedia page:

https://en.wikipedia.org/wiki/Hash_function#Uniformity

link

jagged-chisel 1115 days ago

I can’t find the text you quoted. The article starts off:

> A hash function is any function that can be used to map data of arbitrary size to fixed-size values

And the first sentence in the section you link says

> A good hash function should map the expected inputs as evenly as possible over its output range.

It doesn’t have to be “good” to be a hash function.

link

anamexis 1115 days ago

That's not a meaningful definition of a hash function, then.

    def hash(val):
        return 0

link

jagged-chisel 1115 days ago

Yes, this is a hash function. Whether it’s “useful” or “good” is the purview of the architect.

Tangential edit: https://xkcd.com/221/

link

throwmeaway1212 1115 days ago

That has a pretty uniform distribution..

link