by xcvbxzas 2996 days ago
I think many of these issues can be addressed with mechanisms proposed by the author.

Mainly, the more complex tags which can themselves refer to other tags.

1. This is probably the trickiest one. You may be able to do some sort of translation between a hierarchical system and the tag system using tags themselves. You could have a series of tags that refer to each other, such that the hierarchical location is essentially encoded in the tags themselves.

2. Again, maybe just special tags?

3. Yeah, again, tags. Just tag the thing with the media it's on.

4. Aside from the basic UI side of things which should help, there is the idea of shared tagging systems. I don't recall if that came from the author or another commenter on HN. And you can basically ask the same question about hierarchical systems. It's not exactly a solved problem there either.

5. Again, the complex tags. Just make a tag for the project.

6. Obviously UI is a big question. I'm not sure how it relates so much to media-specific browsers though. They basically present a different view of a section of a filesystem. You have to do some work to let them do this, or else use a system like iTunes and buy all of your media through them.

7. Although I feel this is well addressed by the author, one thing I think you aren't considering is that each of these applications requires its own setup in order to provide that view. You often can't just take the directory from one of these programs and use a different program to view it and have it all work properly. If you only have one program for each media type and never want to use anything else, that works, sort of. Many years ago I directed iTunes to redo the file layout for my music collection, and that rendered it effectively useless for direct browsing. I never really recovered from that due to the time involved to sort it out.

And mutability isn't totally handwaved away: again, with the complex tag system, you could tag mutated works with a reference back to the original. This doesn't cover the case where you don't wish to retain the original, but then, in the simplest case, you could just do a find/replace of the old hash with the new one.
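
As a rough illustration of both cases, here is a minimal sketch. It assumes files are identified by a content hash and that tags are simple (key, value) pairs; the in-memory `store`, the `derived-from` tag name, and all function names are hypothetical and not taken from the article.

```python
import hashlib

# Hypothetical in-memory tag store: content hash -> set of (key, value) tags.
store = {}

def content_hash(data: bytes) -> str:
    """Identify a file by the SHA-256 of its contents."""
    return hashlib.sha256(data).hexdigest()

def add(data: bytes, tags) -> str:
    """Register content under its hash with the given tags."""
    h = content_hash(data)
    store[h] = set(tags)
    return h

def add_derived(original: str, new_data: bytes) -> str:
    """Keep the original; tag the new version with a reference back to it."""
    tags = set(store[original]) | {("derived-from", original)}
    return add(new_data, tags)

def replace(original: str, new_data: bytes) -> str:
    """Drop the original: move its tags to the new hash, then
    find/replace the old hash wherever another entry refers to it."""
    new = add(new_data, store.pop(original))
    for h in list(store):
        store[h] = {(k, new if v == original else v) for k, v in store[h]}
    return new
```

With this, `add_derived` covers the "keep the original" case, and `replace` covers the "don't retain the original" case by rewriting every tag value that pointed at the old hash.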


I think you're missing some of the subtleties of solving these problems using "just more tags."

In a hierarchical system, a lot of these organizational issues are local. If I have one directory that consists of a project organized one way, and another directory that consists of a different project organized a different way, those different organizations don't really interact with each other in any way.

If you are using tags for everything, in order to avoid weird mishmashes of different ways of using tags, you would need to either have a completely standardized tagging system that everything used consistently, or you'd have to always include various contextual information in your queries or in your browsing in order for the queries to make sense. For instance: [mount: my-hd][project: my-project][type: jpeg]
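
To make the point about contextual information concrete, here is a small sketch of such a query. It assumes a flat set of (key, value) tags per file; the file names, the `FILES` table, and the `query` helper are made up for illustration.

```python
# Hypothetical files, each described by a flat set of (key, value) tags.
FILES = {
    "a.jpg": {("mount", "my-hd"), ("project", "my-project"), ("type", "jpeg")},
    "b.jpg": {("mount", "my-hd"), ("project", "other-project"), ("type", "jpeg")},
    "c.jpg": {("mount", "backup"), ("project", "my-project"), ("type", "jpeg")},
}

def query(files, *required):
    """Return the names of files that carry every required (key, value) tag."""
    wanted = set(required)
    return sorted(name for name, tags in files.items() if wanted <= tags)

# Without context, the query mixes unrelated projects and drives.
everything = query(FILES, ("type", "jpeg"))

# With context, it is as specific as a directory path would be.
specific = query(
    FILES, ("mount", "my-hd"), ("project", "my-project"), ("type", "jpeg")
)
```

Here `everything` contains all three files, while `specific` contains only `a.jpg`. The extra tags in the second query play the role that the enclosing directory plays in a hierarchy.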

I think you overstate the problem with different applications as well. For a large amount of the metadata that is relevant for these applications, there is a standard tagging system. ID3 for music, EXIF for images, XMP for various image and video formats. It's true that there is some metadata that these applications store in proprietary databases, but that's mostly an issue of it being difficult to come to a consensus on standards that meet everyone's needs, and it's easier to just write some proprietary metadata somewhere. With tagging systems, if there wasn't agreement on the schema of tags, you'd still have the same issue.

I don't think it's a bad idea to consider alternatives that are more general and more flexible than what we're doing now. But it's pretty easy to handwave about how nice a tag-based system would be, and a lot harder to solve all of the little problems that are going to come up, turn it into a real, coherent, working whole, and then get enough critical mass that it is used outside of a small niche with a handful of applications.

I'm sure you're right that there are a lot of overlooked subtleties. That said, I'm not sure some of the problems you mentioned would exist, or at least I'm not sure they would be any worse with a tag system than with a hierarchical one.

For example, how is that example query any worse than the current situation? Right now you'd navigate to the project directory (requires specifying more than your example already) and then use some search method depending on OS/WM/etc. And then you still end up with a big list of jpegs to look through. This is sort of a worst-case example for both systems, and still I think the tag system comes out ahead here - by a little - just because it would give you the ability to spread the project across multiple drives without requiring you to do two searches if you don't know which drive the desired image is on. You can improve the situation for either system by manually specifying more information. Put better tags on the images or put them in more specific directories or title them.

As for specific applications, it's not the metadata encoded into the files that I'm talking about. It's as simple as the directory structure itself that is used to store all of this. I can't have one application organize everything and then trivially point another application at the directory and have it work.

With a tag-based system this starts to change. I don't need to tell a new music player where my music is, and then go through whatever process is needed to let it properly work with the current directory organization. At worst I tell it which tags to include or perhaps exclude. From there many options exist. Maybe it pulls in metadata from the files themselves. Maybe I provide an external file in whatever format. Maybe I tell it which tags to associate with which fields. You could do a lot of things here.
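
One of those options, telling the player which tags to include or exclude and which tags map to which fields, might look roughly like this. It is only a sketch under simplifying assumptions: each file has one value per tag key, and the `Track` fields, the `FIELD_MAP` contents, and the tag names are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Track:
    title: str
    artist: str
    album: str

# Hypothetical user-supplied mapping: player field -> tag key to read it from.
FIELD_MAP = {"title": "name", "artist": "performer", "album": "release"}

def load_library(files, include, exclude=None, field_map=FIELD_MAP):
    """Build the player's library from tagged files.

    files:   list of dicts mapping tag key -> value
    include: tag key/value pairs a file must have
    exclude: tag key/value pairs that disqualify a file
    """
    exclude = exclude or {}
    tracks = []
    for tags in files:
        if any(tags.get(k) != v for k, v in include.items()):
            continue
        if any(tags.get(k) == v for k, v in exclude.items()):
            continue
        fields = {f: tags.get(key, "Unknown") for f, key in field_map.items()}
        tracks.append(Track(**fields))
    return tracks
```

The player never needs to know where the files live or how they are laid out; a different player could be pointed at the same tags with a different `field_map`.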

I also won't end up telling the application to reorganize things as I did many years ago with iTunes, which promptly made it nearly impossible to wade through my music manually. I had it sort everything into directories based on the artist with subdirectories for albums. It sounded great, until I remembered just how much music I had off OCRemix, where an album is a large collaboration between many people. All of those albums were ripped apart. Ironically, I also had some standardization issues with things like artist names which caused more trouble. Once I stopped using iTunes I basically abandoned that collection because of the work required to fix it.

Yeah, standardization is going to be sort of a problem, but I don't think it's quite as big of a deal as you think. For one, the OS is going to ship with a bunch of standard tags just for itself to work. There will also just be a lot of really standard stuff people are interested in that can be shipped with them. You also have file extensions, both for specific extensions and for the general kind of information they contain. And finally there are just good old translations. The hierarchical system basically utilizes all these methods and suffers from the same problem: namely, you can put directories wherever you want and name them whatever you want. Same problem, different manifestation.

I think the biggest benefit would come from a system that can present itself either hierarchically or tag-based. They both have merits. I've already presented some ideas on how you could store the hierarchical structure in the tags. I'm not so sure how you store the tags in a hierarchical system directly. You could probably fake it with a separate datastore easily enough though.
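
For the first direction, storing the hierarchy in tags that refer to each other, a minimal sketch could look like the following. The tag names, the `TAG_PARENT` table, and the helper functions are hypothetical, and the sketch assumes the parent references contain no cycles.

```python
# Hypothetical tag definitions: each tag names its parent tag (None = top level).
TAG_PARENT = {
    "music": None,
    "ocremix": "music",
    "album-x": "ocremix",
    "photos": None,
}

# Files only carry flat tags; the hierarchy lives in the tags themselves.
FILE_TAGS = {
    "track1.mp3": {"album-x"},
    "track2.mp3": {"ocremix"},
    "cat.jpg": {"photos"},
}

def path_of(tag):
    """Follow parent references up to the top and return a path-like string."""
    parts = []
    while tag is not None:
        parts.append(tag)
        tag = TAG_PARENT[tag]  # assumes no cycles in the parent references
    return "/" + "/".join(reversed(parts))

def hierarchical_view(file_tags):
    """Present the tagged files as if they lived in directories."""
    return sorted(
        f"{path_of(tag)}/{name}"
        for name, tags in file_tags.items()
        for tag in tags
    )
```

A file with several tags simply shows up under several paths, which is something a plain directory tree can only approximate with links.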

Finally, when did this discussion of general design goals turn into one of a real-world implementation, much less widespread adoption? I'm not sure how this is relevant.