by psanford
795 days ago
The only benefit this format provides is the ability to read some columns without reading all of them. Unfortunately, it is not a seekable format, which is a pretty big miss. It also wouldn't be that hard to make it seekable: all you would have to do is make each TSV file two columns, record-id and value.
> ZIP files are a collection of individually compressed files, with a directory as a footer to the file, which makes it easy to seek to a specific file without reading the whole file... The nature of .zip files makes it possible to seek and read just the columns required without having to read/decode the other columns.
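For anyone who wants to poke at this, here is a rough sketch of the two-column idea using Python's standard `zipfile` module. The `build_archive` and `lookup` helpers, the `<column>.tsv` member naming, and the use of the row index as record-id are all my own assumptions for illustration, not anything the format defines.

```python
import io
import zipfile


def build_archive(columns):
    """Pack {column_name: [values]} into zip bytes, one TSV member per column.

    Each member has two tab-separated fields per row: record-id, value.
    The record-id here is just the row index (an assumption of this sketch).
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name, values in columns.items():
            rows = "".join(f"{i}\t{v}\n" for i, v in enumerate(values))
            zf.writestr(f"{name}.tsv", rows)
    return buf.getvalue()


def lookup(archive_bytes, column, record_id):
    """Open only the requested column's member and scan it for record_id."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        # The central directory at the end of the zip tells zipfile where
        # this member starts, so the other columns are never decompressed.
        with zf.open(f"{column}.tsv") as member:
            for line in io.TextIOWrapper(member, encoding="utf-8"):
                rid, value = line.rstrip("\n").split("\t", 1)
                if int(rid) == record_id:
                    return value
    return None


archive = build_archive({"name": ["ada", "bob", "cy"], "age": [36, 41, 29]})
print(lookup(archive, "age", 1))
```

Note that the scan inside a member is still linear in this sketch, since a deflate stream can't be entered partway through. What the record-id buys you is that each row identifies itself, so a column can be read, filtered, or joined on its own without lining it up against the others by position.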