by wahern 3160 days ago
But file offsets _aren't_ per process. File offsets (aka I/O position cursors) are kept in the file table entry data structure, and those entries are shared by descriptors that have been dup'd or fork'd. If the cursor weren't shared, then this program

  #include <stdio.h>
  #include <stdlib.h>
  
  #include <err.h>
  #include <unistd.h>
  
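  int
  main(void) {
  	/* Anonymous temp file; fd is the descriptor underneath the stream. */
  	FILE *fh = tmpfile();
  	if (!fh)
  		err(1, "tmpfile");
  	int fd = fileno(fh);
  	if (fd == -1)
  		errx(1, "fileno: no descriptor");
  	/* Write the digits (plus the trailing NUL), then rewind to offset 0. */
  	const char digits[] = "0123456789";
  	if (sizeof digits != write(fd, digits, sizeof digits))
  		err(1, "write");
  	if (-1 == lseek(fd, 0, SEEK_SET))
  		err(1, "lseek");
  	/* Parent and child both continue from here with the same fd. */
  	if (-1 == fork())
  		err(1, "fork");
  	/* Each process reads a single byte at the current offset. */
  	char ch;
  	switch (read(fd, &ch, 1)) {
  	case -1:
  		err(1, "read");
  	case 0:
  		errx(1, "read: EOF");
  	}
  	printf("%ld: %c\n", (long)getpid(), ch);
  	return 0;
  }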
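would print '0' twice. However, it actually prints '0' and '1', one from each process: whichever process reads first advances the shared cursor, so the other one reads the next byte.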
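
The same sharing can be seen without fork at all. Here is a minimal sketch that contrasts a dup'd descriptor with a second open(2) of the same file. The function names and the /tmp path template are my own inventions for illustration, and it assumes /tmp exists and is writable.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700	/* for mkstemp(3) under strict -std=c11 */
#endif
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: write "0123456789" to a fresh temp file, rewind it,
 * and read 3 bytes through the first descriptor. The second descriptor comes
 * from dup(2) if use_dup is nonzero, otherwise from a separate open(2).
 * Returns the offset seen through the second descriptor, or -1 on error.
 */
static long
offset_seen_by_second_fd(int use_dup)
{
	char path[] = "/tmp/offset-demo-XXXXXX";	/* assumed writable */
	char buf[3];
	long off = -1;
	int fd = mkstemp(path);
	if (fd == -1)
		return -1;
	if (write(fd, "0123456789", 10) == 10 && lseek(fd, 0, SEEK_SET) == 0) {
		int fd2 = use_dup ? dup(fd) : open(path, O_RDONLY);
		if (fd2 != -1) {
			/* Advance the cursor via fd, then query it via fd2. */
			if (read(fd, buf, sizeof buf) == (ssize_t)sizeof buf)
				off = (long)lseek(fd2, 0, SEEK_CUR);
			close(fd2);
		}
	}
	close(fd);
	unlink(path);
	return off;
}

/* dup'd descriptors share one file table entry, so this returns 3. */
long
offset_seen_by_dup(void)
{
	return offset_seen_by_second_fd(1);
}

/* A second open(2) creates its own file table entry, so this returns 0. */
long
offset_seen_by_reopen(void)
{
	return offset_seen_by_second_fd(0);
}
```

Calling both from a small main() and printing the results should show the dup'd descriptor already sitting at offset 3, while the separately opened one is still at 0.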

Descriptor tables are per process, but the only things a descriptor table entry stores are a flags field (basically O_CLOEXEC/FD_CLOEXEC, plus maybe some esoteric platform-specific flags) and a pointer to a file table entry data structure. Most state, like the O_NONBLOCK flag and the file offset, is kept in the [often shared] file table entry. The file table and its entries are completely independent of any particular process; in fact, traditionally there's only one global file table, just like there's only one process table.
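
The split between the two tables can be checked the same way. This is only a sketch, with function names I made up: it sets a flag through one descriptor of a dup'd pair and asks whether the other descriptor sees it. It uses a pipe simply because that is an easy way to get a descriptor.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: set O_NONBLOCK (a file status flag, kept in the file
 * table entry) through one descriptor and check whether a dup(2) of it
 * observes the change. Returns 1 if shared, 0 if not, -1 on error.
 */
int
dup_shares_nonblock(void)
{
	int p[2], result = -1;
	if (pipe(p) == -1)
		return -1;
	int fd2 = dup(p[0]);
	if (fd2 != -1) {
		int fl = fcntl(p[0], F_GETFL);
		if (fl != -1 && fcntl(p[0], F_SETFL, fl | O_NONBLOCK) != -1) {
			int fl2 = fcntl(fd2, F_GETFL);
			if (fl2 != -1)
				result = (fl2 & O_NONBLOCK) != 0;
		}
		close(fd2);
	}
	close(p[0]);
	close(p[1]);
	return result;
}

/*
 * Same experiment with FD_CLOEXEC (a descriptor flag, kept in the
 * per-process descriptor table entry). Returns 1 if shared, 0 if not,
 * -1 on error.
 */
int
dup_shares_cloexec(void)
{
	int p[2], result = -1;
	if (pipe(p) == -1)
		return -1;
	int fd2 = dup(p[0]);
	if (fd2 != -1) {
		int fl = fcntl(p[0], F_GETFD);
		if (fl != -1 && fcntl(p[0], F_SETFD, fl | FD_CLOEXEC) != -1) {
			int fl2 = fcntl(fd2, F_GETFD);
			if (fl2 != -1)
				result = (fl2 & FD_CLOEXEC) != 0;
		}
		close(fd2);
	}
	close(p[0]);
	close(p[1]);
	return result;
}
```

On a POSIX system the first function should return 1 and the second 0: O_NONBLOCK follows the shared file table entry, while FD_CLOEXEC stays with the individual descriptor.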

These errors can usually be avoided if one always cites a primary source (e.g. the POSIX standard, vendor source code) for every assertion and/or validates the assertion with actual code. Maybe it's my legal training, but whenever I make an assertion, especially a technical assertion, I make a habit of following those two rules, even when posting comments. And quite often I end up learning something new in the process.