by hliyan 33 days ago
Incredibly beautiful, possibly because it maps so well to the mental model we typically use to organize knowledge in our heads. I don't know how we lost the folder/container vs. document/content iconography, and other things (like layout of items, sorting) during the shift to web applications.
5 comments

Knowledge doesn’t neatly align to a nested hierarchy. Especially written knowledge.

Language is an imperfect means to convey knowledge, and people store that knowledge in subjective and highly personal ways.

You may mentally recall balloons within “entertainment” or “party”, whereas I might store that knowledge under “horror”.

Add onto that the massive focus on using graph theory to scale social networking technologically, and you effectively lose any motivation for rigid hierarchy.

A folder system doesn't have to be strictly rigid: you can still have "symlinks", so the same article appears in different folders. These behave like labels if you can easily duplicate content inside folders, but you retain the nested, drill-down approach.
Wikimedia Commons has this feature. Editors can manually bless certain combinations of traits as "subcategories".

For example, https://commons.wikimedia.org/wiki/Category:Paintings_of_cas... contains the subcategories "Paintings of castles by country" (nested hierarchy), "Frescos of castles" (a medium), "Paintings of Château de Chillon" (a subject), and "Young Knight in a Landscape by Carpaccio" (multiple views onto a specific item). Each item may appear in multiple subcategories. As far as I can tell, the UI won't let you search for frescos of Italian castles (unless somebody's made a subcategory for that), or view all paintings of castles regardless of their subcategory.

I'm not very fond of this approach. I'd prefer for each item to have an unstructured set of tags ("fresco", "depiction of a castle", "depiction of Italy"), with automatic derivation of parent tags ("fresco" implies "painting") and the option to search by multiple tags. It should be possible to automatically discover tags which best refine a search, so that the UI can still suggest them to the user, as it does today.
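
A minimal sketch of what I mean, in Python. The item names, the `IMPLIES` table, and the "most common first" ranking are all assumptions made up for illustration; none of this is how Commons or any real system does it.

```python
from collections import Counter

# Hypothetical "implies" relation: child tag -> parent tags.
IMPLIES = {
    "fresco": {"painting"},
    "painting": {"artwork"},
}

def expand(tags):
    """Return the tags plus every parent tag they imply, transitively."""
    result = set()
    stack = list(tags)
    while stack:
        tag = stack.pop()
        if tag not in result:
            result.add(tag)
            stack.extend(IMPLIES.get(tag, ()))
    return result

# Hypothetical items, each with an unstructured set of tags.
ITEMS = {
    "item_a": {"fresco", "depiction of a castle", "depiction of Italy"},
    "item_b": {"painting", "depiction of a castle"},
    "item_c": {"fresco", "depiction of Italy"},
}
# Derive parent tags once, so searches see "painting" on every fresco.
INDEX = {name: expand(tags) for name, tags in ITEMS.items()}

def search(*wanted):
    """Items whose expanded tag set contains every wanted tag."""
    need = set(wanted)
    return sorted(name for name, tags in INDEX.items() if need <= tags)

def suggest_refinements(*wanted):
    """Tags that would narrow the current results, most common first."""
    hits = search(*wanted)
    counts = Counter(t for name in hits for t in INDEX[name] if t not in wanted)
    # A tag carried by every hit would not narrow anything, so skip it.
    return [t for t, n in counts.most_common() if n < len(hits)]
```

With this, `search("painting", "depiction of a castle")` also finds the fresco, because "fresco" implies "painting", and `suggest_refinements("painting")` offers the tags that split the result set. A real UI would probably want a smarter ranking than raw frequency.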

> I'd prefer for each item to have an unstructured set of tags ("fresco", "depiction of a castle", "depiction of Italy"), with automatic derivation of parent tags ("fresco" implies "painting") and the option to search by multiple tags.

It's definitely possible to do this. IMSLP (a large repository of freely available sheet music, which differs by cross-cutting features such as genre, historical period, contributors (composers and others), instrumentation etc.) is MediaWiki-based and has a plugin that does exactly that. These days they would probably want to host all the tags on Wikidata so that they can be multilingual and queryable out of the box, though.

Which is actually done on Commons, it just isn't very popular (on images, click the structured data tab and then look at "depicts") [admittedly I think a big part of the problem is implementation choices and UI decisions].
That's only "depicts" claims and is nowhere near comprehensive. It doesn't even come close to matching what's currently stated using categories. Running searches on that data is also hard compared to what IMSLP gives you for their own system.
The Library of Congress uses both approaches, to an extent.

The cataloguing system uses a hierarchical classification, based on one originally developed by Thomas Jefferson, on whose initial donation the Library of Congress is based. This is known as the Library of Congress Classification, and is used to specifically locate a given title or work within the stacks, that is, each item has one and only one location.

There are also subject headings, which are more tag-based, though also drawn from a controlled vocabulary. A given work is assigned a (relatively small) number of subjects with which it's associated. These are not hierarchical, though of course the listing of subject headings itself follows a sequence. Unlike the classification, which assigns a single location to each work, the headings are a search aid for patrons looking for a set of related works within a subject heading, and they also facilitate branching of a search to possibly related subjects.

Tagging systems, especially ad hoc tags supplied by untrained users, are popular but tend to produce numerous issues over time. Not that formal systems (as with the LoC systems mentioned here) are immune to same. One feature of the LoC systems is that they've evolved processes for managing change over time. Examples would be terminology or classifications which are now deprecated, or of regions and polities which have changed or no longer exist (e.g., the Austro-Hungarian empire, the USSR), or of changes in underlying classifications (e.g., of chemical elements or of biological classifications, both of which have evolved significantly over the life of the Library of Congress).

The history of hierarchical information classifications is long and IMO fascinating, dating at least to Aristotle and his Categories, as well as numerous variants used in classifications of knowledge (such as Francis Bacon's) or encyclopedias, including Diderot's and Britannica.

> Knowledge doesn’t neatly align to a nested hierarchy. Especially written knowledge.

The category tree being displayed comes directly from Wikipedia. E.g. Wikipedia has pages like https://en.wikipedia.org/wiki/Category:Art

Which isn't a hierarchy, it's a tagging system. The tags have some hierarchy but that's not uncommon. The distinguishing characteristic of "nested hierarchy" is that a particular thing should only appear exactly once in the hierarchy.

Since this is so terribly impossible, most systems almost immediately make it possible for things to show up in more than one place, which means it's actually hierarchical tagging, whether or not the organizer(s) realize it.

You could also make a distinction based on how many tags things end up with; if it's almost always one, you could call it a nested hierarchy with some exceptions, but if it's almost always more than one, and often much, much more than one, it's a tagging system. Even by that criterion, which creates a spectrum rather than a binary distinction, Wikipedia is very much organized by tags and not hierarchies. I don't know what the average is, but every Wikipedia page whose tags I've looked at has quite a few.
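
To make that criterion concrete, here is a rough sketch in Python. The page names, the sample data, and the two statistics are assumptions chosen for illustration, not measurements of Wikipedia.

```python
from statistics import mean

def is_strict_hierarchy(categories_by_item):
    """True only if every item appears in exactly one place."""
    return all(len(cats) == 1 for cats in categories_by_item.values())

def placement_stats(categories_by_item):
    """Summarize how many categories each item has been filed under."""
    counts = [len(cats) for cats in categories_by_item.values()]
    return {
        "mean": mean(counts),
        # Share of items with a single placement: near 1.0 looks like a
        # hierarchy with exceptions, near 0.0 looks like a tagging system.
        "share_single": sum(1 for n in counts if n == 1) / len(counts),
    }

# Hypothetical sample: page -> set of categories it is filed under.
SAMPLE = {
    "page_1": {"cat_x"},
    "page_2": {"cat_x", "cat_y", "cat_z"},
    "page_3": {"cat_y", "cat_z"},
}
```

For `SAMPLE`, only one page in three has a single placement, so by this criterion it sits toward the tagging end of the spectrum.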

The ICD-11 has an interesting "nested hierarchy with some exceptions": each entry has a primary location, and secondary locations are implemented a bit like symlinks.
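
A small Python sketch of that shape, under the assumption that each entry stores one primary parent plus a set of secondary parents. The `Entry` class and all names below are made up for illustration and are not the actual ICD-11 data model.

```python
from dataclasses import dataclass, field

@dataclass
class Entry:
    name: str
    primary: str                                  # the one canonical parent
    secondary: set = field(default_factory=set)   # symlink-like extra parents

def children(entries, parent):
    """Return sorted (name, is_link) pairs for everything shown under parent."""
    out = []
    for entry in entries:
        if entry.primary == parent:
            out.append((entry.name, False))       # real location
        elif parent in entry.secondary:
            out.append((entry.name, True))        # shown here, lives elsewhere
    return sorted(out)

# Hypothetical entries and chapters.
ENTRIES = [
    Entry("entry_1", primary="chapter_A", secondary={"chapter_B"}),
    Entry("entry_2", primary="chapter_B"),
    Entry("entry_3", primary="chapter_A"),
]
```

Browsing "chapter_B" lists both `entry_1` (flagged as a link) and `entry_2`, while every entry still has exactly one primary location, which is what keeps the tree a tree.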
Yes, and it’s sad that the search in this UI doesn’t work…
I agree, for some reason I have always alternated between wanting not just the universal search box but a browsable hierarchy to mentally run my fingers over and discover in a structured way.

We let go of the manual index somewhere along the way since it doesn’t scale like search, obviously, but for the same reason I keep a library and enjoy traversing others’ private ones and visiting public ones, I keep coming back to browse.

I guess this model doesn't maximize engagement
This is why I frequently post about how I miss Gopher. It kind of forced this hierarchy.
Did you prefer the Yahoo/internet frontpage approach to Google search though? I didn't, but I remember a time when it was a live debate. It has been interesting to see some sites like YouTube or Wikipedia evolve a quasi-hierarchical frontpage though.
I did. I made my own custom start page back in the day with frequently used searches, along with headlines, weather, stocks, etc. Instead of menus, it just divided topics using horizontal rules, much like an actual newspaper.
I dunno, I never had a "Sheep Looking at Viewer" category in my mental model until I randomly clicked around the media folder.