by dmd 3500 days ago
So there are 23407 email messages, but only one file containing all of them?

Yep! All messages are in a single mbox file and it's 3.2GB.

  cdubz@professor-farnsworth ~/data $ du -h Mail-chris.mbox 
  3.2G	Mail-chris.mbox
Wow. I stand corrected and that's awful. Yet another reason to use gmvault!
Why is it awful? As an archive, it seems decent.
(1) inconsistent escaping rules across mbox variants (dealing with the literal string "\nFrom " when it shows up inside a message body)

(2) easy to corrupt
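
To make point (1) concrete, here is a rough sketch of two common quoting conventions, usually called mboxo and mboxrd. The function names are made up for illustration and are not taken from any mail library or export tool; the point is only that the simpler scheme cannot be reversed reliably, while the other can.

```python
import re

# Hypothetical helpers for illustration; not part of any mail library.

def mboxo_escape(body: str) -> str:
    """mboxo-style: put '>' in front of body lines that start with 'From '."""
    return re.sub(r"^From ", ">From ", body, flags=re.MULTILINE)

def mboxo_unescape(body: str) -> str:
    """Naive reverse: strip one '>' from lines that start with '>From '."""
    return re.sub(r"^>From ", "From ", body, flags=re.MULTILINE)

def mboxrd_escape(body: str) -> str:
    """mboxrd-style: also quote lines that already look quoted ('>From ', '>>From ', ...)."""
    return re.sub(r"^(>*From )", r">\1", body, flags=re.MULTILINE)

def mboxrd_unescape(body: str) -> str:
    """Reverse of mboxrd_escape: strip exactly one leading '>' from such lines."""
    return re.sub(r"^>(>*From )", r"\1", body, flags=re.MULTILINE)

original = "Hi\n>From my notes\nFrom here on it breaks\n"

# mboxo loses information: the line that was already '>From ' comes back changed.
print(mboxo_unescape(mboxo_escape(original)) == original)    # False

# mboxrd round-trips the same body unchanged.
print(mboxrd_unescape(mboxrd_escape(original)) == original)  # True
```

A reader that does not know which convention the writer used has to guess, which is where the inconsistency bites.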

Worth noting that Google provides some Python sample code for parsing the file which works great.
Interesting - could you point out where to find this? I poked around a bit but didn't come up with anything.