If the hash is secure independent of s-box selection, I'd much rather bet on pi being normal than "the NIST beacon values aren't generated by AES in CTR mode" ;-)
"These fears can be allayed by using numbers created in a way that leaves little room for adjustment. An example would be the use of initial digits from the number π as the constants. Using digits of π millions of places after the decimal point would not be considered trustworthy because the algorithm designer might have selected that starting point because it created a secret weakness the designer could later exploit."
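To make the quoted idea concrete, here is a minimal sketch of deriving constants from the initial digits of π. It is only an illustration of the nothing-up-my-sleeve approach, not how Beamsplitter's S-boxes were actually produced; the function names `arctan_inv` and `pi_constants` are made up for this example, and it uses Machin's formula with plain integer arithmetic.

```python
def arctan_inv(x, one):
    """Return arctan(1/x) scaled by `one`, using the Taylor series in integers."""
    total = term = one // x
    x2 = x * x
    n = 3
    sign = -1
    while term:
        term //= x2                  # next odd power of 1/x
        total += sign * (term // n)
        sign = -sign
        n += 2
    return total


def pi_constants(count, bits=64):
    """Return the first `count` words of `bits` bits from the fractional part of pi."""
    guard = 64                       # extra bits to absorb truncation error
    frac_bits = count * bits + guard
    one = 1 << frac_bits
    # Machin's formula: pi = 16*arctan(1/5) - 4*arctan(1/239)
    pi_fixed = 16 * arctan_inv(5, one) - 4 * arctan_inv(239, one)
    frac = (pi_fixed - 3 * one) >> guard   # drop the leading 3 and the guard bits
    mask = (1 << bits) - 1
    return [(frac >> (bits * (count - 1 - i))) & mask for i in range(count)]


constants = pi_constants(4)
print([hex(c) for c in constants])   # first entry is 0x243f6a8885a308d3
```

Anyone can rerun this and get the same words, which is the whole point: the designer has no freedom to pick a convenient starting offset.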
That's a good idea. If you want to post some code turning some historical NIST randomness beacon data into 1024 64-bit integers, then test it and run it against SMHasher using the included utilities script (to run tests in parallel), I'm happy to include the results in the readme.
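A rough sketch of the conversion step, assuming the beacon values have already been downloaded and are available as hex strings. Nothing here talks to the beacon itself; the function name `sbox_words_from_hex` and the little-endian byte order are my assumptions, and the project may expect a different layout.

```python
import struct


def sbox_words_from_hex(hex_values, count=1024):
    """Concatenate hex-encoded values and unpack them into `count` 64-bit integers.

    Assumption: words are read little-endian; switch "<" to ">" for big-endian.
    """
    raw = b"".join(bytes.fromhex(v) for v in hex_values)
    needed = count * 8               # 1024 words * 8 bytes = 8,192 bytes
    if len(raw) < needed:
        raise ValueError(f"need {needed} bytes, got {len(raw)}")
    return list(struct.unpack(f"<{count}Q", raw[:needed]))


# Placeholder input: 128 made-up 64-byte values standing in for real beacon data.
fake_values = ["0100000000000000" * 8] * 128
words = sbox_words_from_hex(fake_values)
print(len(words), hex(words[0]))
```

Whatever byte order is chosen should be written down next to the S-box, otherwise nobody else can reproduce it from the same beacon records.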
It should probably just default the s-box to all zeroes or some other method of deriving it ex-nihilo, but a default is practical just so you don't need to synchronize s-boxes between uses.
I would imagine anyone interested in using this for serious business™ would start with a new s-box hierarchy.
I'm confused. Supercop is a benchmark for cryptographic hash functions, but SMHasher is a test for non-cryptographic hash functions. The use cases list cryptography, but also universal hash functions which are generally not crypto-grade. It compares itself to the SHA hashes, but only has 64 bit output.
Is Beamsplitter supposed to be cryptography grade or not?
No. It's a quality hash, like crypto hashes, but among those it's one of the slowest. Extremely slow, like SipHash. But it's useful for JavaScript, I guess.
I should warn any reader not to use this or any other novel cryptographic algorithm in production. Don't use anything crypto in production until it has been very heavily analyzed for years by professional cryptographers.
The source is using `.cpp`, though it does not appear to be using any C++ features.
Would it be reasonable to move to `.c` so that it can be integrated in all sorts of things?
Aside, when something is Apache licensed, and someone wants to make, say an Erlang NIF with something, what effects does that embedding have on the NIF library and users of the NIF library?
That's the wrong attitude. Universities are a place where you should be doing much of the learning yourself. There is not enough time in a class for a lecturer to recite every word or idea that is present in a large textbook, but there is definitely enough time outside of class to read it.
> Universities are a place where you should be doing much of the learning yourself.
The professors in the first three years of my schooling definitely did everything wrong, then. Passing and failing classes had next to nothing to do with independent learning.
That’s dumb. I’m not asking for a professor to read to me. I’m asking for the school to provide a curriculum that introduces me to these subjects since I PAID them to educate me. Cryptography was not introduced/required for my CS degree.
My point was that you need to do reading and research on your own in order to get the most out of an education; there is not enough time for a professor to go over everything. I'm not sure why you are surprised you didn't get much out of it if you only did the minimum. You have the opportunity in university to do and learn almost anything you want.
You are making assumptions about people that you shouldn’t - that’s my point. I was working full time as a Software Engineer and was a new father while I was going to school. My ability to learn outside of the school’s curriculum was limited. The school did not emphasize this area of CS. I don’t expect my professor to hold my hand, but I do expect the school to establish a challenging curriculum.
I see that you ran out of particle names for your projects. May I introduce you to super-symmetry, then?
What's the issue with picking names that do not exist already? It has got the upside that millions of webpages will not appear in the results when people are searching for your project's name.
The italics section is just a thing I add to comments sometimes: a few sentences that loosely relate to the thing I am talking about. It can be a quote, an imagined dialog, a flippant comment, ... other examples are at [0][1].
I was annoyed.
I'm a physicist, and computer projects have the annoying custom of picking names from physics, engineering or whatever else. Other people come up with their own names, so why shouldn't computer tech people do this too?
Atom editor, Electron framework, Neutrino.js, Crankshaft, ...
Thanks for the explanation. I think the italics is cool and fresh.
hmmm, really interesting how you feel about the names. It sounds like that is super annoying.
I never thought about how naming would affect people invested in the names like this.
I don't think I need to defend it, so I'm not trying to here, just sharing that for me, beamsplitter sounds like such a cool word, as if a beam were a physical thing like a rock that could be split. Also something solid in itself, and it connotes advanced, possibly war, tech. Lasers. I'm going for that connotation. Hash functions are usually very pathetically named.
also there's more to this name in this project because my initial design imagined the "beam" of the input, ricocheting around a network of s-boxes getting mixed. It seemed to me like the perfect hash, aesthetically and efficiently, and universal. but to my disappointment, I couldn't get a pure, s-box only design to work. I had to include some "traditional mixing function hacks" like multiplication, rotation and xor. But I wanted to keep the name because it was aspirational.
I can imagine that it must feel like all these annoying computer software people taking all these names that are not from their area, but from your area, and not leaving anything good for the rest. And when they have such high profile already! Like nobody will listen to the poor physicists, especially once all their names are taken, and then it will be more lonely. A nameless space, with nothing left. Sounds pretty sad.
The funny thing is that, for me, physics stands above software, so using such names is a way to increase perceived value. But from your view, software has the higher profile.
Let me apologize for my complaints. It didn't occur to me that these names might be used out of admiration for a field.
I also have to admit that physics needs relatively few new names in general, which would make picking one a lot easier. There are naming patterns as well, e.g. for superpartners (new particles in supersymmetry are either prefixed with s- or suffixed with -ino [0] in a predictable way).
That was cool to read your reply. actually looking back over my code, I see my achievement was better than I thought. I only used addition, and rotation. no mult, nor xor.
technically tho rotation can be thought of as including multiplication and xor. but also not. so I don't know.
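For what it's worth, here is a small sketch of what "rotation includes multiplication and xor" could mean on 64-bit words. This is a generic illustration, not code from Beamsplitter, and the function names are made up: a left shift is a multiplication by a power of two modulo 2^64, and because the two halves of a rotation never overlap, OR-ing them gives the same result as XOR-ing them.

```python
MASK64 = (1 << 64) - 1


def rotl64(x, r):
    """Plain 64-bit rotate left using shifts and OR."""
    r %= 64
    return ((x << r) | (x >> (64 - r))) & MASK64


def rotl64_mul_xor(x, r):
    """Same rotation written as a multiplication plus an XOR."""
    r %= 64
    high = (x * (1 << r)) & MASK64   # shift left == multiply by 2**r mod 2**64
    low = x >> (64 - r)              # the bits that wrapped around
    return high ^ low                # halves are disjoint, so XOR equals OR


x = 0x0123456789ABCDEF
print(all(rotl64(x, r) == rotl64_mul_xor(x, r) for r in range(64)))
```

So it depends on how you count: the rotate instruction itself is neither, but it can be rewritten in terms of both.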
A universal set of hash functions is a set of hash functions such that choosing a hash function at random from the set guarantees an upper bound on the expected number of collisions, regardless of which keys from the universe are input to it (the keys do not need to be random; only the choice of function does).
Basically it makes it more difficult for an adversary to exploit collisions from your hash function.
I don't get how this could be used. I tried to imagine, and ended up with something wrong. This is what I imagined:
You have a list of hash functions, and choose one at random, then hash a password. Later a hacker gets these hashed passwords, and has an extra hard time? But this wouldn't work for checking passwords because you wouldn't know what hash.
I may be wrong, but after doing a bit of research, here's one example:
Alice is storing keys in a hash table. Since this is a hash-table, the hash (H) that Alice will choose must be fast. However, real-world hash-tables will use a relatively small number of bits from the output of H, because even if you have a table sized to 4 billion, that's only 32 bits.
Let's say that Alice does this by taking the lowest N bits of the output of H (this works in practice regardless of which bits Alice uses), where 2^N is the size of the table. N may change as elements are added.
Eve wants to mess with Alice by sending a bunch of keys that all have the same bottom M bits, where M is the largest expected value for N. Since the hash H is very fast, this is very computationally cheap to brute-force, particularly if you have access to very parallel hardware like a GPU.
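A quick sketch of Eve's side, to show how cheap this is. I'm using 64-bit FNV-1a as a stand-in for the fast hash H (my choice for illustration, nothing in the scenario says Alice uses it), and the key format and function names are made up.

```python
from itertools import count

FNV_OFFSET = 0xCBF29CE484222325
FNV_PRIME = 0x100000001B3
MASK64 = (1 << 64) - 1


def fnv1a_64(data):
    """64-bit FNV-1a, standing in for Alice's fast, fixed hash H."""
    h = FNV_OFFSET
    for byte in data:
        h ^= byte
        h = (h * FNV_PRIME) & MASK64
    return h


def colliding_keys(how_many, m_bits, target=0):
    """Brute-force keys whose hash has its bottom `m_bits` bits equal to `target`."""
    mask = (1 << m_bits) - 1
    found = []
    for i in count():
        key = b"key-%d" % i
        if fnv1a_64(key) & mask == target:
            found.append(key)
            if len(found) == how_many:
                return found


# About 2**10 tries per hit on average, so this finishes almost instantly.
keys = colliding_keys(how_many=16, m_bits=10)
print(keys[:3])
```

Every key found this way lands in the same bucket for any table of 2^N slots with N up to 10, so the table degrades into one long chain.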
Now consider that instead of using hash H, Alice uses hash-family U. Whenever a hash table is created (or rehashed,) Alice selects a random hash from U. Eve can no longer easily generate keys that will collide in the hash table.
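Here is a minimal sketch of what "selects a random hash from U" could look like, using the classic Carter-Wegman family h(x) = ((a*x + b) mod p) mod m for integer keys. This is a textbook construction picked for illustration, not Beamsplitter; the names `pick_hash` and `evil_keys` are made up, and keys are assumed to be integers below p.

```python
import random

P = (1 << 61) - 1   # a Mersenne prime; keys are assumed to be integers in [0, P)


def pick_hash(num_buckets, rng=None):
    """Pick one member of the family at random; a and b are Alice's secret."""
    rng = rng or random.SystemRandom()
    a = rng.randrange(1, P)
    b = rng.randrange(0, P)

    def h(key):
        return ((a * key + b) % P) % num_buckets

    return h


m = 1024
# Keys Eve crafted against a fixed hash "key mod m": they all land in bucket 0.
evil_keys = [i * m for i in range(1000)]
fixed_buckets = {k % m for k in evil_keys}

# The same keys under a randomly chosen member of the family.
h = pick_hash(m)
spread = {h(k) for k in evil_keys}
print(len(fixed_buckets), len(spread))
```

Eve would have to know a and b to aim her keys, and Alice draws new ones every time the table is created or rehashed.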
From what I can tell, for password hashing, this is not appreciably better than salting, if the size of the set of possible salts and the size of the set U are the same.
This also allows hash tables to have expected insertion and search times of O(1), which as far as I'm aware is the major motivation for it. Also, the set of functions can be huge: because there are infinitely many primes to build a family from, the number of available hash functions is effectively unbounded, which means that brute force quickly becomes impractical for cracking the hash.
> This was obtained from random.org by requesting 8,192 random bytes, as were all S-boxes tested so far.
https://en.wikipedia.org/wiki/Nothing-up-my-sleeve_number