by jl6
4312 days ago
Hi there. I'd love to hear a little more about what kind of methods of organisation you employ for managing such a large volume of data (particularly the parts which aren't downloaded from somewhere else). Not so much in terms of the storage infrastructure, but in terms of directory structures, links, indexes, etc. Do you have millions++ of files or just a lot of very large files? I'm asking because I feel that not enough is published on the subject of personal filing/archiving systems, whereas it's something we all do and there's a lot of best practice sitting out there uncaptured.
A lot of my older files, sadly, are stored in "SORT/Sort Me/To be sorted/Old computer/Sort again/Miscellaneous..." and the like. My server has an mlocate index, so I'll use mlocate, and I'll use find sometimes. I make sure to preserve metadata like last-modified/created dates, so I can use that to narrow things down.
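For anyone wanting to try the "narrow it down by date" approach without a locate index, here is a minimal sketch in Python. It is not the setup described above, just an illustration of the idea; the function name `files_modified_between` and its parameters are made up for this example, and it only looks at last-modified time.

```python
import os
from datetime import datetime
from pathlib import Path


def files_modified_between(root, start, end):
    """Yield files under `root` whose last-modified time is in [start, end).

    `start` and `end` are naive datetimes interpreted in local time.
    This is a hypothetical helper, roughly comparable to filtering
    `find` output by modification date.
    """
    lo, hi = start.timestamp(), end.timestamp()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = Path(dirpath) / name
            try:
                mtime = path.stat().st_mtime
            except OSError:
                # Broken symlink or unreadable entry: skip it.
                continue
            if lo <= mtime < hi:
                yield path
```

Something like `list(files_modified_between("SORT", datetime(2005, 1, 1), datetime(2006, 1, 1)))` would then list everything last touched in 2005, which only works if the timestamps were preserved when the files were copied around.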
Newer stuff, I try to keep a bit more organized, but I still have lots of unmanaged stuff floating around. For big projects, or big files, that's easy enough; my photos are sorted into a Y/M/D hierarchy, my VHS digitization projects are fairly well organized, some other things have their own structure. For my scanned documents, I just dump them all into a mess of folders, but then have a custom Django app with a management command that indexes them and gives me a nice "document management" website, and then I just search based on OCR'd text or title or date.
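As a rough illustration of the Y/M/D photo layout, a sketch like the following could file a photo by date. It assumes the file's last-modified time is a good enough stand-in for the capture date (a real tool would more likely read EXIF data), and the names `ymd_destination` and `file_photo` are hypothetical, not taken from any existing tool.

```python
import shutil
from datetime import datetime
from pathlib import Path


def ymd_destination(photo, library_root):
    """Return library_root/YYYY/MM/DD/<filename> based on the file's mtime.

    Assumption: mtime approximates the capture date.
    """
    photo = Path(photo)
    taken = datetime.fromtimestamp(photo.stat().st_mtime)
    return (
        Path(library_root)
        / f"{taken:%Y}"
        / f"{taken:%m}"
        / f"{taken:%d}"
        / photo.name
    )


def file_photo(photo, library_root):
    """Move a photo into the Y/M/D hierarchy, refusing to overwrite."""
    dest = ymd_destination(photo, library_root)
    dest.parent.mkdir(parents=True, exist_ok=True)
    if dest.exists():
        raise FileExistsError(dest)
    shutil.move(str(photo), str(dest))
    return dest
```

Refusing to overwrite is a deliberate choice here: two cameras can easily produce the same filename on the same day, and silently replacing one of them is worse than stopping.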
I really hate hierarchical filesystems. After using computers for this long, I'm convinced that hierarchy-optional, metadata-driven stuff is the only future I'll be happy in. I long for the ability to save things without really having to say anything about where they're saved, and still be able to find them... So, sorry, I don't think I have a satisfactory answer for you, as I don't think there's a good solution to this problem as long as we have filesystems where the organizational primitive is a hierarchy. Even with tag-based systems that build on top of that, it's usually clunky and you still fundamentally have to figure out where to save something "first", even if you plan to access it via tag/metadata later. Such a pain.