by copyconstruct 3159 days ago
I'm the author of that post. The goal wasn't to mislead - like I mentioned, I'm learning these things myself and definitely could've gotten several things wrong.

I meant file offsets are per process, not that every process gets its own table entry.

> when two processes perform reads through a shared file table entry for a regular file, the order of reads will matter as between the two processes because of the shared cursor; the sequence of data each process reads could differ based on random scheduling latencies.

Not sure I follow. Won't the two processes still have their own descriptors, which point to the same file entry but maintain their own offsets? I think what I understood from your comment is that descriptors are _shared_ by the parent and child with share-by-reference semantics? So both the parent and the child _are using the same descriptor_, which in turn has an offset in the file table entry.

1 comment

But file offsets _aren't_ per process. File offsets (aka I/O position cursors) are kept in the file table entry data structure, and those entries are shared by descriptors that have been dup'd or fork'd. If the cursor weren't shared, then this program

  #include <stdio.h>
  #include <stdlib.h>
  
  #include <err.h>
  #include <unistd.h>
  
  int
  main(void) {
  	/* anonymous temp file; we only use its underlying descriptor */
  	FILE *fh = tmpfile();
  	if (!fh)
  		err(1, "tmpfile");
  	int fd = fileno(fh);
  	if (fd == -1)
  		errx(1, "fileno: no descriptor");
  	const char digits[] = "0123456789";
  	if ((ssize_t)sizeof digits != write(fd, digits, sizeof digits))
  		err(1, "write");
  	/* rewind so the first read starts at '0' */
  	if (-1 == lseek(fd, 0, SEEK_SET))
  		err(1, "lseek");
  	/* parent and child both continue from here with the same fd */
  	if (-1 == fork())
  		err(1, "fork");
  	char ch;
  	switch (read(fd, &ch, 1)) {
  	case -1:
  		err(1, "read");
  	case 0:
  		errx(1, "read: EOF");
  	}
  	printf("%ld: %c\n", (long)getpid(), ch);
  	return 0;
  }
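would print '0' twice. However, it actually prints '0' then '1': whichever process reads first advances the shared cursor, so the other process starts at the next byte.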
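
The same point can be checked without fork. Here is a minimal sketch, assuming a POSIX system with a writable /tmp; the helper name read_pair and the temp file template are made up for illustration. It reads one byte through an original descriptor and one through a second descriptor obtained either with dup() or with a fresh open() of the same path.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): write "0123456789" to a temp
 * file, obtain a second descriptor either via dup() or via a fresh open(),
 * then read one byte through each descriptor in turn.
 * Returns 0 on success and -1 on any error.
 */
int
read_pair(int use_dup, char out[2]) {
	char path[] = "/tmp/offset-demo-XXXXXX";
	int fd2 = -1, rc = -1;
	int fd = mkstemp(path);
	if (fd == -1)
		return -1;

	if (10 != write(fd, "0123456789", 10))
		goto done;
	if (-1 == lseek(fd, 0, SEEK_SET))
		goto done;

	/* dup() yields a new descriptor for the SAME file table entry;
	 * open() creates a NEW file table entry with its own offset. */
	fd2 = use_dup ? dup(fd) : open(path, O_RDONLY);
	if (fd2 == -1)
		goto done;

	/* first read goes through fd, second through fd2 */
	if (1 == read(fd, &out[0], 1) && 1 == read(fd2, &out[1], 1))
		rc = 0;
done:
	if (fd2 != -1)
		close(fd2);
	close(fd);
	unlink(path);
	return rc;
}
```

Called from a main of your own, read_pair(1, buf) should fill buf with '0' and '1' (shared cursor), while read_pair(0, buf) should give '0' and '0' (independent cursors), if my reading of POSIX is right.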

Descriptor tables are per process, but the only things a descriptor table entry stores are a flags field (basically, O_CLOEXEC/FD_CLOEXEC, plus maybe some esoteric platform-specific flags) and a pointer to a file table entry data structure. Most state, like the O_NONBLOCK flag and the file offset, is kept in the [often shared] file table entry. The file table and its entries are completely independent from any particular process; in fact, traditionally there's only one global file table, just like there's only one process table.
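
To validate the split between descriptor flags and file table entry state, here is a rough sketch, again assuming a POSIX system; the helper name flag_sharing is hypothetical. It sets FD_CLOEXEC and O_NONBLOCK through one descriptor of a pipe and then checks which of the two is visible through a dup'd descriptor.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): set a flag through descriptor
 * `a` and report whether it shows up on a dup()'d descriptor `b`.
 * Stores 1 (visible) or 0 (not visible) in each output parameter.
 * Returns 0 on success and -1 on any error.
 */
int
flag_sharing(int *cloexec_shared, int *nonblock_shared) {
	int p[2], a, b, fdflags, fl, flb, rc = -1;
	if (-1 == pipe(p))
		return -1;
	a = p[0];
	b = dup(a);
	if (b == -1)
		goto done;

	/* FD_CLOEXEC is a descriptor flag: per descriptor table entry. */
	if (-1 == fcntl(a, F_SETFD, FD_CLOEXEC))
		goto done;
	if (-1 == (fdflags = fcntl(b, F_GETFD)))
		goto done;

	/* O_NONBLOCK is a file status flag: kept in the file table entry. */
	if (-1 == (fl = fcntl(a, F_GETFL)))
		goto done;
	if (-1 == fcntl(a, F_SETFL, fl | O_NONBLOCK))
		goto done;
	if (-1 == (flb = fcntl(b, F_GETFL)))
		goto done;

	*cloexec_shared = (fdflags & FD_CLOEXEC) != 0;
	*nonblock_shared = (flb & O_NONBLOCK) != 0;
	rc = 0;
done:
	if (b != -1)
		close(b);
	close(p[0]);
	close(p[1]);
	return rc;
}
```

On a conforming system I'd expect *cloexec_shared to come back 0 and *nonblock_shared to come back 1, which matches the description above: the close-on-exec bit belongs to the descriptor, the non-blocking bit to the shared file table entry.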

These errors can usually be avoided if one always cites a primary source (e.g. the POSIX standard, vendor source code) for every assertion and/or validates the assertion with actual code. Maybe it's my legal training, but whenever I make an assertion, especially a technical assertion, I make a habit of following those two rules, even when posting comments. And quite often I end up learning something new in the process.