by dunham 1343 days ago
I should write this stuff up, but I haven't.

I do have some brief notes on the "Notes.app" format here:

https://github.com/dunhamsteve/notesutils/blob/master/notes....

But I didn't discuss my methodology: generic decoding of protobuf, building up a schema as you go. The tricky part there is that a byte array and a substructure look the same on the wire, so you have to try to decode the bytes as a message and, if that succeeds, try that schema on the next example.
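A minimal sketch of that try-to-decode idea, in Python. This is not the notesutils code; the function names (`read_varint`, `decode_message`, `guess`) are made up for illustration, and it only handles wire types 0, 1, 2 and 5 (no groups).

```python
def read_varint(buf, pos):
    """Decode a base-128 varint starting at pos; return (value, new_pos)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        b = buf[pos]
        pos += 1
        result |= (b & 0x7F) << shift
        if not b & 0x80:
            return result, pos
        shift += 7
        if shift > 63:
            raise ValueError("varint too long")


def decode_message(buf):
    """Decode buf as one protobuf message without a schema.

    Returns a list of (field_number, wire_type, value) and raises
    ValueError if the bytes are not a well-formed message.
    """
    fields = []
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        field, wire = key >> 3, key & 7
        if field == 0:
            raise ValueError("field number 0 is invalid")
        if wire == 0:                      # varint
            value, pos = read_varint(buf, pos)
        elif wire in (1, 5):               # fixed 64-bit / fixed 32-bit
            size = 8 if wire == 1 else 4
            if pos + size > len(buf):
                raise ValueError("truncated fixed-width field")
            value = buf[pos:pos + size]
            pos += size
        elif wire == 2:                    # length-delimited
            length, pos = read_varint(buf, pos)
            if pos + length > len(buf):
                raise ValueError("truncated length-delimited field")
            value = buf[pos:pos + length]
            pos += length
        else:
            raise ValueError("unsupported wire type %d" % wire)
        fields.append((field, wire, value))
    return fields


def guess(buf):
    """Recursively decode, treating a length-delimited field as a nested
    message whenever its bytes happen to parse as one."""
    out = []
    for field, wire, value in decode_message(buf):
        if wire == 2 and value:
            try:
                value = guess(value)
            except ValueError:
                pass                       # not a message: keep the raw bytes
        out.append((field, value))
    return out
```

The guess can be wrong on a single sample: the two-byte string "hi" in field 2 (`12 02 68 69`) also parses as a nested message with field 13 set to 105. That is why a guessed schema has to be checked against the next example before it is trusted.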

Here is another fun technique - scanning through a disassembly of an Apple framework looking for assembly patterns that match the protobuf compiler output (this was dependent on which language was targeted by protobuf):

https://gist.github.com/dunhamsteve/224e26a7f56689c33cea4f0f...

So you find the serializer / deserializer code and figure out what the original protobuf spec looked like.


For lucene / sqlite, I used the docs on the web site.

SQLite got me a little experience with b-trees (as did couch), and I got to write a little query planner.
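For a sense of what the b-tree part looks like on disk, here is a rough sketch that reads the 100-byte file header and the b-tree header of page 1, following the layout in the SQLite file format docs. The function name `parse_first_page` and the returned dict keys are my own, not from any project mentioned here.

```python
import struct

SQLITE_MAGIC = b"SQLite format 3\x00"

# B-tree page type byte, as listed in the SQLite file format docs.
PAGE_KINDS = {
    2: "interior index",
    5: "interior table",
    10: "leaf index",
    13: "leaf table",
}


def parse_first_page(data):
    """Parse the file header and the page-1 b-tree header from raw bytes."""
    if len(data) < 108 or data[:16] != SQLITE_MAGIC:
        raise ValueError("not a SQLite 3 database")

    # Page size is a big-endian 16-bit value at offset 16; 1 means 65536.
    (page_size,) = struct.unpack(">H", data[16:18])
    if page_size == 1:
        page_size = 65536

    # On page 1 the b-tree page header starts right after the 100-byte
    # file header: type, first freeblock, cell count, start of the cell
    # content area, fragmented free bytes.
    page_type, first_freeblock, cell_count, content_start, fragmented = (
        struct.unpack(">BHHHB", data[100:108])
    )
    return {
        "page_size": page_size,
        "page_kind": PAGE_KINDS.get(page_type, "unknown"),
        "cell_count": cell_count,
        "content_start": content_start,
    }
```

Page 1 is the root of the schema table, so in a small database `cell_count` is the number of schema entries (tables, indexes, and so on). Walking the cell pointer array that follows this header is the next step toward reading actual rows.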

Lucene was interesting because it was compact, had some skip lists for fast lookup, and was a log-structured merge tree. I borrowed bits of it for an index in a binary file format for work.

For realmdb / couchdb, I looked at the source code.

I did realm so I could extract my Craft.app docs. It's interesting because it's a column structured database, so I got to learn a little bit about that. I also learned that C++ had changed a bit since I last used it (lambdas!).

And couch is an append-only btree. I got to learn to read Erlang with that project.

I also have a web scraper that reads from the Chrome cache (whose format keeps changing). I archive things like recipes that show up in the cache.

And I've got code on github that decodes iOS desktop backups, which some people have found useful. (Written mainly so I could poke around in various applications' data and extract stuff from my keychain.)

Re keychain, you're probably aware of it, but https://github.com/ptoomey3/Keychain-Dumper/ is very thorough in extracting keychain data (including data that one would expect to no longer be in there).
Thanks for the notes on Notes!

As I said, it's way out of my wheelhouse, but over the next few months I'm planning to merge a decrypted backup of my old iOS Signal chat history into a decrypted backup of my current Android chat history, and then restore from the re-encrypted result (there is no native iOS to Android transfer on Signal yet). So I'm starting to look into any learning material that will allow me to not fail within the first 2 minutes of trying :).

Definitely do start writing this stuff up!