Hacker News
by mcdeltat 310 days ago
Another thing is human readable is typically synonymous with unindexed, which becomes a problem when you have large files and care about performance. In bioinformatics we often distribute sidecar index files with the actual data, which is janky and inefficient. Why not have a decent format to begin with?

Further, when the file is unindexed it's even harder for a human to read, because you can't easily skip to a particular section. I often run into this: my code can efficiently access the data once it's loaded, but a human-eye check is tedious/impossible because you have to scroll through gigabytes to find what you want.

1 comment

> Another thing is human readable is typically synonymous with unindexed

Indexing is not directly related to binary vs. text. Many text formats in bioinformatics are indexed, and many binary formats are not, because they were not designed with indexing in mind.
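
To illustrate that a plain-text file can be indexed, here is a rough Python sketch of a sparse line-offset index. It is not how any particular bioinformatics tool does it; the function names and the `every` parameter are made up for the example, and the in-memory "file" stands in for a large file on disk.

```python
import io

def build_line_index(f, every=1000):
    """Record the byte offset of every `every`-th line of a binary file object."""
    offsets = []
    pos = 0
    for lineno, line in enumerate(f):
        if lineno % every == 0:
            offsets.append(pos)
        pos += len(line)  # bytes consumed so far = offset of the next line
    return offsets

def read_line(f, offsets, lineno, every=1000):
    """Seek to the nearest indexed line at or before `lineno`, then scan forward."""
    f.seek(offsets[lineno // every])
    for _ in range(lineno % every):
        f.readline()  # skip at most `every - 1` lines
    return f.readline()

# Hypothetical demo data: 10,000 text records held in memory.
data = io.BytesIO(b"".join(b"record %d\n" % i for i in range(10_000)))
index = build_line_index(data, every=1000)
print(read_line(data, index, 4321, every=1000))
```

The index here is just a list of integers, so it could be written out as a small sidecar file or kept in memory; either way the data itself stays readable text.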

> a human-eye check is tedious/impossible because you have to scroll through gigabytes to find what you want.

Yes, indexing is better, but without indexing you can still use command-line tools to extract the portion you want to look at and then pipe it to "more" or "less".
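
The same idea as a minimal Python sketch, for cases where a script is handier than a shell one-liner: stream the file and keep only a window of lines, without loading everything. The helper name `extract_lines` and the demo data are invented for illustration; with a real file you would open it and pipe the script's output to "less".

```python
import io
from itertools import islice

def extract_lines(f, start, stop):
    """Lazily yield lines start..stop-1 (0-based) from a text file object."""
    # islice still reads the skipped lines, but never holds them in memory.
    return islice(f, start, stop)

# Hypothetical stand-in for a large text file.
demo = io.StringIO("".join(f"record {i}\n" for i in range(100_000)))
window = list(extract_lines(demo, 50_000, 50_003))
print("".join(window), end="")
```

This is a linear scan up to the window, so it gets slower the deeper into the file you look, which is exactly the cost an index avoids.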