by exDM69 5318 days ago
GCC will probably not use any specific hardware facilities, which means this is probably going to be implemented with regular atomic operations.

Within a transaction block, the results of all reads are stored (in a local, hidden variable). When the transaction is about to finish, all reads are repeated, and if any of them yields a different result, the transaction is restarted. When the transaction is committed, there will likely be some kind of global lock (held for a very short time).
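To make the read-validate-commit cycle concrete, here is a minimal sketch in C11. It is only an illustration of the scheme described above, not how GCC actually implements transactions. Every name in it (`txn`, `txn_read`, `txn_commit`, `stm_increment`, `MAX_READS`) is made up for the example, and it assumes a single global commit lock and a transaction that writes just one location.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_READS 16 /* hypothetical fixed capacity of the read log */

typedef struct {
    _Atomic int *addr; /* location that was read */
    int seen;          /* value observed at read time */
} read_entry;

typedef struct {
    read_entry reads[MAX_READS];
    size_t nreads;
} txn;

/* One global lock, held only while validating and writing. */
static atomic_flag commit_lock = ATOMIC_FLAG_INIT;

static void txn_begin(txn *t) { t->nreads = 0; }

/* Read a location and remember what we saw in the hidden log. */
static int txn_read(txn *t, _Atomic int *addr) {
    assert(t->nreads < MAX_READS);
    int v = atomic_load(addr);
    t->reads[t->nreads].addr = addr;
    t->reads[t->nreads].seen = v;
    t->nreads++;
    return v;
}

/* Repeat every logged read; write only if nothing changed.
 * Returns false when the caller has to restart the transaction. */
static bool txn_commit(txn *t, _Atomic int *dst, int value) {
    while (atomic_flag_test_and_set_explicit(&commit_lock,
                                             memory_order_acquire)) {
        /* spin until the lock is free */
    }
    bool ok = true;
    for (size_t i = 0; i < t->nreads; i++) {
        if (atomic_load(t->reads[i].addr) != t->reads[i].seen) {
            ok = false;
            break;
        }
    }
    if (ok)
        atomic_store(dst, value);
    atomic_flag_clear_explicit(&commit_lock, memory_order_release);
    return ok;
}

/* Example transaction: increment a counter, restarting on conflict. */
static void stm_increment(_Atomic int *counter) {
    txn t;
    int v;
    do {
        txn_begin(&t);
        v = txn_read(&t, counter);
    } while (!txn_commit(&t, counter, v + 1));
}
```

Note that comparing values cannot detect a location that changed and then changed back (the ABA problem), which is one reason real STMs tend to validate against version numbers instead.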

As GCC probably doesn't require any kind of threading or locking, it's most likely that the write lock will be a spinlock using an atomic read-modify-write and some kind of yield instruction (monitor/mwait on new CPUs, pause on older ones).
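A spinlock of that kind might look roughly like the sketch below, written with C11 atomics. The names `spinlock`, `spin_lock`, `spin_trylock`, `spin_unlock` and `cpu_relax` are invented for the example. It uses only `pause` (through the `_mm_pause` intrinsic) as the yield hint, since `monitor`/`mwait` are normally restricted to kernel mode, and it falls back to a plain busy loop on non-x86 targets.

```c
#include <stdatomic.h>
#include <stdbool.h>

#if defined(__x86_64__) || defined(__i386__)
#include <immintrin.h>
#define cpu_relax() _mm_pause() /* emits the x86 pause instruction */
#else
#define cpu_relax() ((void)0)   /* assumption: no hint on other targets */
#endif

typedef struct {
    atomic_flag held;
} spinlock;

#define SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Atomic read-modify-write: set the flag and learn its previous value. */
static bool spin_trylock(spinlock *l) {
    return !atomic_flag_test_and_set_explicit(&l->held,
                                              memory_order_acquire);
}

/* Keep retrying the test-and-set, hinting the CPU between attempts. */
static void spin_lock(spinlock *l) {
    while (!spin_trylock(l))
        cpu_relax();
}

static void spin_unlock(spinlock *l) {
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

Nothing here needs a threading library or any OS support, which is the point of the argument above.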

As far as I can see, there really aren't lots of other methods to implement STM, especially from within the C compiler.


The article hints that GCC uses a combination of HTM and STM, if HTM is available.
The paper I linked to claims they have an STM, and a hybrid HTM-STM system if hardware support is available.
Since you can't do all reads in one atomic instruction and you also need to make it atomic with the write (CAS), wouldn't that still require a lock for the whole operation?
As long as you write to separate parts of memory and are cautious with freeing memory you don't need locks for reads.
How can you make sure there are no writes to the memory you are reading from?
Don't write to the same location. Everything needs to be a pointer, but updating a pointer is an atomic operation. For example, assume a is an integer:

  a->(0x00010001)->5
  a->(0x00030001)->6
You can keep reading 0x00010001 and getting 5 all day even as a is "actually" 6. This also works with strings, objects, etc. The only downside is that you tend to eat up a fair amount of memory, and you need to avoid freeing 0x00010001 while something still thinks a's value is stored there.
You still cannot atomically read or write more than 2 pointers on x86_64, so my question remains.
All you need to do is read the location that pointer points to as an atomic operation. So allocating a 50kb string would work in the same way, as long as you could store it in a specific process's memory.

  PX a=(0x00010001) //which points to 5
  P0 x0=a=(0x00010001) //which points to 5
  P1 y0=a
  P1 pointer y1=0
  P1 y1 = malloc(sizeof(int)) //say this returns 0x00030001
  P2 x1=a=(0x00010001) //which points to 5
  P1 *y1 = *y0 + 1 //aka 6
  P3 x2=a=(0x00010001) //which points to 5
  P1 a=y1 //you could take a lock to verify that a == (0x00010001), but if you don't care about dirty writes then you can also do this as a single atomic operation.
  P3 x3=a=(0x00030001) //which points to 6
And once x0, x1 and x2 stop pointing to (0x00010001) you can free that memory. The assumption is that xN and yN are process-specific local variables, preferably registers.
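For what it's worth, the trace above maps fairly directly onto C11 atomics. The following is a rough sketch rather than a complete scheme: the names `snapshot` and `publish_increment` are invented for the example, and it uses a compare-and-swap for the final `a=y1` step (instead of a lock or a plain atomic store) so that a concurrent update is detected rather than overwritten. Deciding when the old block is safe to free is left to the caller, which is the hard part in practice.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Shared pointer to the current value ("a" in the trace above). */
static _Atomic(int *) a;

/* Reader: a single atomic load yields a pointer to an immutable value. */
static const int *snapshot(void) {
    return atomic_load(&a);
}

/* Writer: copy the value, modify the private copy, then publish it.
 * On success, *retired receives the old block, which must not be freed
 * until no reader still holds it. Returns false if allocation fails or
 * another writer published first. */
static bool publish_increment(int **retired) {
    int *y0 = atomic_load(&a);       /* P1 y0=a */
    int *y1 = malloc(sizeof *y1);    /* P1 y1 = malloc(sizeof(int)) */
    if (y1 == NULL)
        return false;
    *y1 = *y0 + 1;                   /* P1 *y1 = *y0 + 1 */
    if (!atomic_compare_exchange_strong(&a, &y0, y1)) {
        free(y1);                    /* lost the race; nothing published */
        return false;
    }
    *retired = y0;                   /* old value, still readable */
    return true;
}
```

A reader that called `snapshot()` before the update keeps seeing 5 through its old pointer, while any later `snapshot()` sees 6, which is exactly the x0/x3 situation in the trace.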