by elibarzilay 6304 days ago
Something that could be done for now is to write a piece of mzscheme code that "marshals" the data into (UTF-8-encoded) byte strings. Assuming that most of the 2 GB is made up of strings, and that these strings are mostly ASCII, this should reduce memory consumption by close to a factor of 4, since mzscheme strings take 4 bytes per character while ASCII text in UTF-8 takes 1.
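
To make the factor-of-4 estimate concrete, here is a rough sketch in Python rather than mzscheme. It assumes the in-memory string form costs a fixed 4 bytes per character (modelled with UTF-32) and ignores per-object overhead; the function names are made up for illustration.

```python
# Sketch only: compares a fixed 4-bytes-per-character representation
# (modelled here as UTF-32, an assumption) with UTF-8 byte strings.

def fixed_width_size(s: str) -> int:
    """Bytes needed at 4 bytes per character (UTF-32 without a BOM)."""
    return len(s.encode("utf-32-le"))


def utf8_size(s: str) -> int:
    """Bytes needed once the string is marshalled to UTF-8."""
    return len(s.encode("utf-8"))


def saving_ratio(s: str) -> float:
    """How many times smaller the UTF-8 form is than the fixed-width form."""
    return fixed_width_size(s) / utf8_size(s)


if __name__ == "__main__":
    ascii_text = "Ask HN: how do you store comments?"
    mixed_text = "naïve café"  # two characters need 2 bytes each in UTF-8
    print(saving_ratio(ascii_text))
    print(saving_ratio(mixed_text))
```

Pure ASCII gives the full factor of 4; any non-ASCII characters pull the ratio down, which is why the estimate is "close to" 4 rather than exactly 4.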

(I can imagine an interface that is transparent at the Arc level, where strings are just passed to the backend and retrieved from it, and the backend converts them to and from byte strings. Later on it could change to use a FS or a DB or whatever.)
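
A minimal sketch of what such a backend might look like, written in Python for illustration only. The class and method names are hypothetical, and the in-memory dict is an assumption standing in for whatever storage ends up behind it. In mzscheme the conversions would be `string->bytes/utf-8` and `bytes->string/utf-8`.

```python
# Sketch only: callers pass and receive ordinary strings; the backend
# keeps UTF-8 byte strings internally. All names here are hypothetical.

class ByteStringStore:
    def __init__(self) -> None:
        # Assumption: a plain dict stands in for the real storage.
        self._items: dict[object, bytes] = {}

    def put(self, key: object, text: str) -> None:
        """Accept a string and store it as UTF-8 bytes."""
        self._items[key] = text.encode("utf-8")

    def get(self, key: object) -> str:
        """Decode the stored bytes back into a string for the caller."""
        return self._items[key].decode("utf-8")

    def stored_bytes(self) -> int:
        """Total payload size currently held, in bytes."""
        return sum(len(b) for b in self._items.values())


if __name__ == "__main__":
    store = ByteStringStore()
    store.put(1, "first comment")
    print(store.get(1))
    print(store.stored_bytes())
```

Because callers only ever see `put` and `get` with plain strings, the dict could later be swapped for a file or a database without touching the calling code.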