Hacker News
by hossbeast 3530 days ago
Quoting,

"Keywords are a common method of accessing data for which one does not have the exact coordinates. The usual problem with keywords, however, is that two people never chose the same keywords. The keywords then become useful only to people who already know the application well."

Interesting that some problems have been known for so long, with no solution in sight.

The whole article is super interesting in the context of everything the author did not yet know.

2 comments

One solution for this comes from the field of librarianship -- the use of standard ontologies for classifying information. The two most widely used are the Dewey Decimal Classification (proprietary) and the Library of Congress Classification (nonproprietary). I've seen arguments that Dewey is more logically consistent, but the LCC's open nature gives it a strong advantage.

Even such ontologies aren't entirely stable -- they change over time, and as with other bits of knowledge, reflect cultural fads and fashions. But I've been recommending to several systems (Pocket, Ello, blogging platforms) that a classification / tagging system based on these might actually be a fairly reasonable start, if only in that there's a very large, mostly-well-considered basis to start from.
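To make that concrete, here is a rough sketch of what tagging against a fixed classification might look like. The class letters are real Library of Congress Classification top-level classes, but the synonym table and the `classify` helper are made up for illustration; a real system would need a far larger, curated vocabulary.

```python
# Top-level Library of Congress Classification classes (a small subset).
LCC_CLASSES = {
    "K": "Law",
    "M": "Music",
    "Q": "Science",
    "R": "Medicine",
    "T": "Technology",
}

# Hypothetical synonym table: free-form keywords that different people
# might choose, each mapped onto one shared class code.
TAG_TO_CLASS = {
    "law": "K", "legal": "K",
    "music": "M", "songs": "M",
    "math": "Q", "maths": "Q", "mathematics": "Q", "physics": "Q",
    "medicine": "R", "health": "R",
    "tech": "T", "engineering": "T",
}


def classify(tags):
    """Map free-form tags to class codes.

    Returns (sorted list of class codes, list of tags with no known mapping).
    """
    codes = set()
    unknown = []
    for tag in tags:
        key = tag.strip().lower()  # normalise case and stray whitespace
        code = TAG_TO_CLASS.get(key)
        if code is None:
            unknown.append(tag)
        else:
            codes.add(code)
    return sorted(codes), unknown
```

Two users who type "Maths" and "mathematics" both end up under class Q, which is the whole point: the shared vocabulary absorbs the disagreement over keywords. Tags outside the table come back in the second list so they can be reviewed rather than silently dropped.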

Wouldn't word2vec solve the keywords problem?
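A minimal sketch of the idea, as I understand it: if each keyword has a vector, a query can match nearby keywords rather than only exact ones. The vectors below are toy values invented for illustration, not real word2vec output (real embeddings are learned from a corpus and have far more dimensions), and `nearest_keywords` is a hypothetical helper.

```python
import math

# Toy 3-dimensional vectors, invented for illustration only.
TOY_VECTORS = {
    "car": (0.9, 0.1, 0.0),
    "automobile": (0.8, 0.2, 0.0),
    "vehicle": (0.7, 0.3, 0.1),
    "banana": (0.0, 0.1, 0.9),
}


def cosine(a, b):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest_keywords(query, k=2):
    """Return the k keywords whose vectors are most similar to the query's."""
    q = TOY_VECTORS[query]
    scored = [
        (cosine(q, vec), word)
        for word, vec in TOY_VECTORS.items()
        if word != query
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [word for _, word in scored[:k]]
```

With these toy values, a search for "car" also surfaces "automobile" and "vehicle" while leaving "banana" out. Whether learned embeddings do this reliably enough for keywords from very different users is a separate question.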