by technicolor 864 days ago
Author here. I wasn't suggesting it would have any perf impact. Just that it was an interesting change set.

I feel I'm missing something here, but it feels like the intro is not only suggesting there was a performance impact, but very explicitly stating there was:

> profile of a sample word count program I was writing, which showed the program was spending way too much time in the syscall module. That in this context can only mean one thing: way too many read syscalls were getting called.

I find it hard to believe that the profile would look any different with 1 vs 2 syscalls per 2GB chunk. The syscall overhead is going to be insignificant compared to actually copying the data. The program is going to be spending a lot of time doing syscalls no matter how many there are, because the individual syscalls will just start to take more time as you increase the size of the buffer.

Edit:

Compare:

  strace -e read -T perl -MFcntl -e 'sysopen FD, "foo", Fcntl::O_RDONLY; while (sysread FD, $buf, 1*1024*1024*1024) {} '

  read(3, ..., 1073741824) = 1073741824 <0.455901>
  read(3, ..., 1073741824) = 1073741824 <0.219711>
  read(3, ..., 1073741824) = 1073741824 <0.213923>
  read(3, ..., 1073741824) = 1073741824 <0.211783>

Vs.

  strace -e read -T perl -MFcntl -e 'sysopen FD, "foo", Fcntl::O_RDONLY; while (sysread FD, $buf, 2*1024*1024*1024) {} '

  read(3, ..., 2147483648) = 2147479552 <0.921789>
  read(3, ..., 2147483648) = 2147479552 <0.487007>
  read(3, ..., 2147483648) = 8192 <0.000031>
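
For anyone wondering why the 2GB request comes back in three reads: per the read(2) man page, Linux transfers at most 0x7ffff000 (2,147,479,552) bytes per call. Here is a rough model of that, assuming that cap and a 4GB file. The function name is made up for illustration, and it only does arithmetic, not actual I/O:

```python
# Per-call transfer cap documented in the Linux read(2) man page.
MAX_RW_COUNT = 0x7FFFF000  # 2,147,479,552 bytes

GIB = 1024 ** 3


def read_sizes(file_size, request):
    """Model the byte counts returned by successive read() calls.

    Hypothetical helper: assumes every call returns as much as it can,
    limited by the request size, the kernel cap, and the bytes left.
    """
    sizes = []
    remaining = file_size
    while remaining > 0:
        n = min(request, MAX_RW_COUNT, remaining)
        sizes.append(n)
        remaining -= n
    return sizes


print(read_sizes(4 * GIB, 1 * GIB))  # four reads of 1073741824
print(read_sizes(4 * GIB, 2 * GIB))  # [2147479552, 2147479552, 8192]
```

So a 2GB request costs one extra short read per 4GB, which matches the 8192-byte read in the trace.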
I tried your strace commands on my Linux desktop and I don't get anything like your results. I just get a few lines like 'read(3, "\177ELF\2\1\1...'. Are you using bash (as I am) or some other shell? Could there be some escaping that I'm missing? Thanks.
It's reading from a file named foo, which in this case was 4GB in size and was created with:

  dd if=/dev/zero of=foo bs=1024 count=$((1024*1024*4))
You probably don't have a file of that name in the current working directory?
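
If you'd rather not go through strace and Perl, here is a minimal sketch of the same read loop in Python, assuming a file named foo in the current directory. The function name is hypothetical. The timings include Python allocating the buffer for each call, so they won't line up exactly with what strace -T reports:

```python
import os
import time


def timed_reads(path, chunk_size):
    """Read a file to EOF with raw os.read calls.

    Returns a list of (bytes_returned, seconds) tuples, one per call
    that returned data. Hypothetical helper for illustration only.
    """
    results = []
    fd = os.open(path, os.O_RDONLY)
    try:
        while True:
            start = time.perf_counter()
            data = os.read(fd, chunk_size)
            elapsed = time.perf_counter() - start
            if not data:  # empty result means end of file
                break
            results.append((len(data), elapsed))
    finally:
        os.close(fd)
    return results


# Only runs if the test file from the dd command exists.
if __name__ == "__main__" and os.path.exists("foo"):
    for size, secs in timed_reads("foo", 2 * 1024 * 1024 * 1024):
        print(f"read returned {size} bytes in {secs:.6f}s")
```

Swap the chunk size for 1 * 1024 * 1024 * 1024 to compare against the 1GB case.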
I missed that, thanks.
Makes sense, I guess I misunderstood the premise of the article. It's a good dive.