Hacker News new | ask | show | jobs
by ohyes 5515 days ago
A list structure is already a 'dictionary tree', except it has lousy performance in access. (only really visible on large data sets).

* (second (assoc 5 '((1 2) (3 4) (5 6))))

> 6

The lousy performance doesn't matter, most of the time, because you have other data-structures for data, and your compiler generally is only interested in sequential access (constructing a parse tree out of it). The lisp code already being a tree makes parsing fairly simple...

So, I guess my point is that I don't see the point of this?

The only reason to really upgrade to a dictionary would be if you wanted to have non-sequential access to the contents of of your large-scale code-data-structure... generally the lists aren't large enough to merit it. You can do all of the dictionary operations on a list, and they are 'fast enough'. ----

I feel like the rationale here is created by conflating the idea of a cons and a list. I think generally, a list already accomplishes what the author sets out to accomplish.

They are a single step higher level than conses, and give you a hierarchical structure with the possibility for named associations. I don't really see how using something more complicated and memory intensive to build this basic data structure qualifies as an improvement...

2 comments

> The lousy performance doesn't matter, most of the time, because you have other data-structures for data, and your compiler generally is only interested in sequential access ...

Exactly. Much of the time you don't care. The rest of the time, you can still use assoc-lists, except you create a branching structure on the keys such that no key in the list is a prefix of any other. The first entry of a node contains the value "at" that node, and the tail of the node is a list of branches, where each branch is a pair consisting of a string key and a child node.

The "get" and "put" functions in this structure are very simple recursive functions, and you can maintain very large dictionaries ("maps") this way. The get and put operations are essentially O(1), constant time, if you ignore the sheer length of the keys (which are usually reasonably short anyway). I've done tests where I insert hundreds of thousands of pseudo-random keys and values, and read them back, all very quickly and reliably. The code splits the keys into branching structures automatically. It's also a purely functional structure, so you can keep old versions of the maps.

The value at each node can be anything you like, not just a string, though to detect the absence of a value you need to use a "Maybe" type. In Fexl you can define constructors:

  \absent  = (       \absent\present absent)
  \present = (\value \absent\present present value)
So then the default value of a newly created branch is always absent, but a "put" operation replaces it with (present value). When you delete a key, it automatically detects if the node is empty (i.e. its value is absent and it has no branches) and then eliminates the node from its parent.

You also detect singleton nodes, which have an absent value and only one branch. In that case you can concatenate the key of that branch with the parent key upstairs, eliminating the extra branch.

Exactly. Rose by any name smells just as good.