by djhworld 2524 days ago
Do you have an example of alternatives to BufferedReader using the NIO APIs?

I do a lot of work with large GZIP files that are read line by line using standard IO (i.e. a GZIPInputStream wrapping a FileInputStream, etc.), but your comment has really made me second-guess that choice...

2 comments

I would check out the compression handlers in Netty, which underlies Vert.x and many other projects that need high IO performance.

You should be able to hack something together that feeds zero-copy buffers into a Netty compression handler, maybe using the Netty or Vert.x file API, or maybe just raw NIO2.

I'm not sure how fast this would be, but my gut says "very". Netty can easily saturate 40 Gigabit Ethernet links, and file IO should have less overhead. At 8 bits per byte, 40 Gbit/s is ~5 gigabytes a second.

It's going to be a good bit of coding for sure. Vert.x/Netty/NIO2 are all async and pretty low level. They generally run 1 thread per core, and given SSD read patterns you're probably best off reading files in parallel, one per core. Might not be worth the effort.
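The "one file per core" idea can be tried with plain blocking streams before reaching for Netty. This is only a rough sketch under my own assumptions: the class name `ParallelGzipCount` and the helpers `countLines` and `countAll` are made up for illustration, and it uses the standard `GZIPInputStream` rather than any async API.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.zip.GZIPInputStream;

public class ParallelGzipCount {
    // Counts the lines of one gzip file using the standard blocking streams.
    static long countLines(Path path) throws IOException {
        try (var reader = new BufferedReader(new InputStreamReader(
                new GZIPInputStream(Files.newInputStream(path)), StandardCharsets.UTF_8))) {
            long count = 0;
            while (reader.readLine() != null) {
                count++;
            }
            return count;
        }
    }

    // Hypothetical helper: one task per file, thread pool sized to the core count.
    static long countAll(List<Path> files) throws InterruptedException, ExecutionException {
        ExecutorService pool =
                Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());
        try {
            List<Future<Long>> results = new ArrayList<>();
            for (Path file : files) {
                Callable<Long> task = () -> countLines(file);
                results.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> result : results) {
                total += result.get(); // blocks until that file is done
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Path> files = new ArrayList<>();
        for (String arg : args) {
            files.add(Path.of(arg));
        }
        System.out.println(countAll(files));
    }
}
```

Since gzip decompression is CPU-bound and a single gzip stream can't easily be split, parallelism across files is the easy win here; whether it beats a single thread depends on the disk and the file sizes, so measure before committing to it.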

You may want to look into Zstandard as well. It's superior to Gzip in most ways when you need something fast but decent.

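The NIO API uses channels and buffers:

  import java.nio.ByteBuffer;
  import java.nio.channels.FileChannel;
  import java.nio.file.Path;
  Path path = ...
  var count = 0;
  try (var channel = FileChannel.open(path)) {
    var buffer = ByteBuffer.allocateDirect(8192);
    while (channel.read(buffer) != -1) {
      buffer.flip();  // switch the buffer from being filled to being drained
      while (buffer.hasRemaining()) {
        if (buffer.get() == '\n') {
          count++;
        }
      }
      buffer.clear();  // reset position and limit for the next read
    }
  }
  System.out.println(count);

In your particular case, though, I don't think there is a Gzip decoder that works on ByteBuffer.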
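A possible middle ground is to open the file as an NIO channel and bridge it back to a stream with `Channels.newInputStream`, so the usual `GZIPInputStream` and `BufferedReader` still work. This is a minimal sketch, not a ByteBuffer-native decoder: the class name `GzipLines`, the helper `countLines`, and the 64 KiB buffer sizes are my own illustrative choices, and I haven't measured whether it is any faster than wrapping a `FileInputStream`.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.zip.GZIPInputStream;

public class GzipLines {
    // Hypothetical helper: counts lines in a gzip file opened through an NIO channel.
    static long countLines(Path path) throws IOException {
        try (FileChannel channel = FileChannel.open(path, StandardOpenOption.READ);
             // Bridge the channel to a stream; 1 << 16 is an assumed buffer size.
             var gzip = new GZIPInputStream(Channels.newInputStream(channel), 1 << 16);
             var reader = new BufferedReader(
                     new InputStreamReader(gzip, StandardCharsets.UTF_8), 1 << 16)) {
            long count = 0;
            while (reader.readLine() != null) {
                count++;
            }
            return count;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println(countLines(Path.of(args[0])));
    }
}
```

Note that `readLine` also counts a final line with no trailing newline, so the result can differ by one from counting `'\n'` bytes.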