by kefka 3645 days ago
Animats has a great way to describe it (https://news.ycombinator.com/item?id=11820643):

It has the cult-like approach of Xanadu - use new terminology, tie it to an economic model, and re-invent everything. Also, there seems to be a cult leader.

Jargon: noun (data), nock (interpreter), mint (compiler), span (type), twig (expression), gate (function), mold (constructor), core (object), mark (protocol). It's Newspeak for programmers.

The overall concept seems to be a federated social system, like Diaspora. Everybody has an online presence which they own. But you can take your ball and go home, moving your online presence somewhere else, and it still gets found by others. Somehow. (That's a hard problem at scale.)

There's a claim that nobody can create vast numbers of identities for spam purposes because there are only 2^32 possible human identities. (That number should have been at least as big as the population of the planet. 2^36, maybe.) Apparently you have to buy address space, which is a profit center for somebody.

There's a download, which gets you their "OS" (which runs on top of another OS), an interpreter for their Hoon language and access to their chat environment.

Not sure what to think of this, but someone put in a lot of work.


Re: the jargon, there's a technical reason for it. Urbit atoms—which are the Nock runtime's only non-product "type"—are a lot of things, but they're basically a union-type between a bit-string and a bignum (i.e. a way to talk about a set of packed bits and an arbitrary-length number interchangeably.)
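
To make the "bit-string and bignum at once" idea concrete, here is a minimal Python sketch. The helper names `text_to_atom` and `atom_to_text` are made up for illustration, and the little-endian byte order is an assumption of this sketch, not something stated above.

```python
def text_to_atom(text: str) -> int:
    """Read the ASCII bytes of `text` as one unsigned integer.

    Assumption for this sketch: the first character is the low byte.
    """
    return int.from_bytes(text.encode("ascii"), "little")


def atom_to_text(atom: int) -> str:
    """Inverse of text_to_atom: split the integer back into bytes."""
    length = (atom.bit_length() + 7) // 8  # number of bytes needed
    return atom.to_bytes(length, "little").decode("ascii")


n = text_to_atom("noun")
print(hex(n))           # the same bits, viewed as a number
print(atom_to_text(n))  # the same bits, viewed as text
```

The point is only that the same value can be read either way; nothing here is Urbit's actual code.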

Bignums themselves are somewhat expensive data structures to pass around, so runtimes that use them usually don't use them for everything; instead, they have an Integer type that has Bignum (data-structure on the stack, or pointer to data structure on the heap) and Fixnum (regular integer machine-register) implementations, and write a glue layer to treat the two interchangeably.

So, you've got a runtime where your only "type" is an integer, but it breaks down into "cheap integers" and "expensive integers."

Now, in most dynamic-language runtimes (i.e. runtimes where type-tagging and pattern-matching don't all happen at compile-time), you want a type that "represents itself" to use for type-tagging, dictionary keys, function references, etc. This is usually a Symbol or Interned String (or Atom!) type, with the runtime keeping a symbol table mapping fixnums ('cheap integers') to strings (which are effectively, in Nock, 'expensive integers'). You need this table, because dynamic language functions manipulate/build/pass around/compare a lot of symbols, and the interpreter would be really slow if you were passing and comparing 'expensive integers' by value.
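
A rough sketch of the kind of symbol table described here, in Python. `SymbolTable`, `intern`, and `name_of` are hypothetical names chosen for this example; real runtimes differ in the details.

```python
class SymbolTable:
    """Maps each distinct string to a small integer ID and back."""

    def __init__(self):
        self._ids = {}     # string -> small integer ("cheap" handle)
        self._names = []   # small integer -> string

    def intern(self, name: str) -> int:
        """Return the ID for `name`, allocating a new one on first use."""
        if name not in self._ids:
            self._ids[name] = len(self._names)
            self._names.append(name)
        return self._ids[name]

    def name_of(self, symbol_id: int) -> str:
        """Look up the original string for an ID."""
        return self._names[symbol_id]


table = SymbolTable()
a = table.intern("constructor")
b = table.intern("constructor")
print(a == b)  # comparing symbols is now a single integer comparison
```

Note that the IDs are only meaningful relative to this one table, which is what makes the approach awkward across independent runtimes.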

But Nock can't really keep a symbol-table, because every "atom" ID embedded in a Nock program is a universal ID, a handle any other Nock runtime might be passed and attempt to manipulate. (And you can't even use a global symbol-table that uses hashing—i.e. a DHT—because Nock doesn't differentiate the "Symbol" atoms from any other atoms, so you'd have to treat every atom the same and hash them all—and, even pretending that's cheap, you're eventually going to get collisions that way.) The only symbol table that works in a globally-distributed system is one that maps bitstrings to themselves. And that's as good as no symbol table at all.

So what do you do when you can't make your strings ('expensive integers') into interned strings ('cheap integers')? Just use the cheap integers from the start! In Nock, this means that all the runtime-defined symbols—all the atoms in the language and the stdlib—are bitstrings that are short enough to be encoded directly as fixnums. Since Nock has the requirement of running on 32-bit machines, that means 32-bit bitstrings: four-ASCII-character strings. Thus most of the jargon.
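
As a quick check of the arithmetic (four 8-bit ASCII characters fill exactly 32 bits), here is a small Python sketch. `FIXNUM_BITS` and `fits_in_fixnum` are illustrative names, and the 32-bit width is simply the figure given above.

```python
FIXNUM_BITS = 32  # assumption: the fixnum width described in this comment


def fits_in_fixnum(name: str) -> bool:
    """True if the ASCII bytes of `name`, read as one integer, fit in a fixnum."""
    value = int.from_bytes(name.encode("ascii"), "little")
    return value.bit_length() <= FIXNUM_BITS


for word in ["gate", "mold", "function", "constructor"]:
    print(word, fits_in_fixnum(word))
```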

Do note, it's not required to use four-letter names for "symbols" you manipulate in the Nock runtime; but doing otherwise means passing around 'expensive' handles instead of 'cheap' ones.

---

I find Urbit's solution here interesting—but they probably chose it for purity's sake.

I would like to contrast it to the implementation in Erlang's BEAM runtime (which also, coincidentally, has 'atoms', but where these are simply its particular symbol type):

• an atom table is kept by each "node", with no need for the atoms in each node to have identical mappings;

• atom IDs are mapped back to their original bitstrings when serializing any data structure including them, even if just for internal persistence;

• each distributed-RPC connection between nodes is stateful, keeping its own atom table (kept synchronized on both sides by additional messages) as a cache to speed up communication between nodes. This can be seen to be similar to a streaming compression algorithm's Huffman tree cache, but it's interesting to look at it as the connection being its own virtual 'node' that has this particular atom table, which both nodes are then modelling and translating to/from.
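
To illustrate the per-connection cache in the last bullet, here is a toy Python sketch. It is not the actual Erlang distribution wire format; the class names and the `("new", ...)` / `("ref", ...)` message shapes are invented for this example.

```python
class AtomCacheSender:
    """Sender side: sends an atom's text once, then only its slot number."""

    def __init__(self):
        self._slots = {}  # atom text -> slot number

    def encode(self, atom: str):
        if atom in self._slots:
            return ("ref", self._slots[atom])
        slot = len(self._slots)
        self._slots[atom] = slot
        return ("new", slot, atom)  # first use: ship the full text


class AtomCacheReceiver:
    """Receiver side: mirrors the sender's table from the messages it sees."""

    def __init__(self):
        self._slots = {}  # slot number -> atom text

    def decode(self, message) -> str:
        if message[0] == "new":
            _, slot, atom = message
            self._slots[slot] = atom
            return atom
        _, slot = message
        return self._slots[slot]


sender, receiver = AtomCacheSender(), AtomCacheReceiver()
for atom in ["ok", "error", "ok"]:
    msg = sender.encode(atom)
    print(msg, "->", receiver.decode(msg))
```

Both ends keep the same table in step purely from the message stream, so neither node's own atom table has to match the other's.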

None of that strikes me as an explanation for the obtuse jargon in particular.
In summary: the runtime wants you to give things four-letter names. So the creators have decided to make up new four-letter words for everything they have to refer to in code, rather than just, say, abbreviating perfectly good non-four-letter words.
You only covered atoms, actually. Nouns are a union between an atom and an ordered pair of nouns, which means that they are shaped like binary trees. A struct of `[a b c d]` is actually laid out as `[a [b [c d]]]`.
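
A minimal Python sketch of that right-nesting, using tuples for pairs. `to_noun` is a made-up helper for illustration, not an Urbit function.

```python
def to_noun(*items):
    """Right-nest a flat sequence into pairs: (a, b, c, d) -> (a, (b, (c, d)))."""
    if not items:
        raise ValueError("a noun needs at least one item")
    if len(items) == 1:
        return items[0]  # a lone item stays as it is
    return (items[0], to_noun(*items[1:]))


print(to_noun("a", "b", "c", "d"))
```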
>That number should have been at least as big as the population of the planet.

Addresses can actually be up to 128-bit. I think the idea is that addresses longer than 32-bit are just assumed to be bots, but that convention would probably be changed if Urbit gets popular enough for the 32-bit limit to matter.

>...if Urbit gets popular enough for the 32-bit limit to matter.

I'm trying to think of a 32-bit address space that was found to be too small... I know there was one, it's on the tip of my tongue...

That was an address space for computers, not people.
The stated reason is that Urbit addresses are for "responsible adults", and that each responsible adult will have one address. Each address has 96 bits worth of subordinate addresses, so you should be able to give all of your devices, dependants, etc. an address.
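
The numbers work out as follows, using only the figures from this thread (128-bit addresses, 32 bits for the top-level ones). The variable names are just for this sketch.

```python
TOTAL_BITS = 128   # maximum address length mentioned in the thread
HUMAN_BITS = 32    # top-level ("responsible adult") addresses

SUBORDINATE_BITS = TOTAL_BITS - HUMAN_BITS

human_addresses = 2 ** HUMAN_BITS
subordinates_per_human = 2 ** SUBORDINATE_BITS

print(SUBORDINATE_BITS)        # bits left over under each top-level address
print(human_addresses)         # number of top-level addresses
print(subordinates_per_human)  # subordinate addresses under each one
```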
>It's Newspeak for programmers.

No, it's much uglier than Newspeak.[0]

[0] http://www.newspeaklanguage.org