by dunham
1343 days ago
I should write this stuff up, but I haven't. I do have some brief notes on the "Notes.app" format here: https://github.com/dunhamsteve/notesutils/blob/master/notes.... But I didn't discuss my methodology there: generic decoding of protobuf, building up a schema as you go. The tricky part is that a byte array and a substructure look the same on the wire, so you have to try to decode the bytes as a message and, if that succeeds, try the resulting schema on the next example. Another fun technique is scanning through a disassembly of an Apple framework for assembly patterns that match the protobuf compiler's output (the patterns depend on which language protobuf was targeting): https://gist.github.com/dunhamsteve/224e26a7f56689c33cea4f0f... That way you find the serializer/deserializer code and work out what the original protobuf spec looked like.
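To make the "try to decode it" step concrete, here is a minimal sketch in Python. It is not the code from the notesutils repo; the function names (`read_varint`, `decode_message`, `guess_nested`) are made up for illustration, and it only covers the first half of the idea: speculatively parsing each length-delimited field as a nested message and falling back to raw bytes when that fails. Checking a guessed schema against further examples is left out.

```python
def read_varint(buf, pos):
    """Decode a base-128 varint at pos; return (value, new_pos)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7
        if shift > 63:
            raise ValueError("varint too long")


def guess_nested(chunk):
    """Try a length-delimited payload as a message; fall back to bytes."""
    if not chunk:
        return chunk
    try:
        return decode_message(chunk)
    except ValueError:
        return chunk


def decode_message(buf):
    """Schema-less decode: a list of (field_number, wire_type, value).

    Raises ValueError if buf is not a well-formed message, which is
    the signal used to decide that a payload is plain bytes.
    """
    fields = []
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field, wire = key >> 3, key & 7
        if field == 0:
            raise ValueError("field number 0 is invalid")
        if wire == 0:                      # varint
            value, pos = read_varint(buf, pos)
        elif wire in (1, 5):               # fixed 64-bit / fixed 32-bit
            size = 8 if wire == 1 else 4
            if pos + size > len(buf):
                raise ValueError("truncated fixed-width field")
            value = buf[pos:pos + size]
            pos += size
        elif wire == 2:                    # length-delimited: bytes OR message
            length, pos = read_varint(buf, pos)
            if pos + length > len(buf):
                raise ValueError("truncated length-delimited field")
            value = guess_nested(buf[pos:pos + length])
            pos += length
        else:                              # groups (3, 4) and unknown types
            raise ValueError("unsupported wire type %d" % wire)
        fields.append((field, wire, value))
    return fields


# Hand-built sample: field 1 = 150, field 2 = "testing",
# field 3 = nested message whose field 1 = 150.
sample = bytes.fromhex("089601" "120774657374696e67" "1a03089601")
decoded = decode_message(sample)
```

The heuristic can misfire: short strings sometimes happen to parse as valid messages, which is why a guess for one example needs to be checked against the next one before you trust it.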
SQLite got me a little experience with b-trees (as did couch), and I got to write a little query planner.
Lucene was interesting because it was compact, had some skip lists for fast lookup, and was a log-structured merge tree. I borrowed bits of it for an index in a binary file format for work.
For realmdb / couchdb, I looked at the source code.
I did realm so I could extract my Craft.app docs. It's interesting because it's a column structured database, so I got to learn a little bit about that. I also learned that C++ had changed a bit since I last used it (lambdas!).
And couch is an append-only btree. I got to learn to read Erlang with that project.
I also have a web scraper that reads from the Chrome cache (whose format keeps changing). I archive things like recipes that show up in the cache.
And I've got code on GitHub that decodes iOS desktop backups, which some people have found useful. (Written mainly so I could poke around in various applications' data and extract stuff from my keychain.)