Well, in most cases where I've looked at the generated assembly (not that often), the xmm registers are used even for scalar operations. I thought that was the default for gcc on x86-64, but I suppose it might differ between systems (or perhaps 32-bit mode was used for some reason).
Right, if you're using xmm regs, every operation is rounded straight to double or single precision. Sometimes the x87 regs get used instead, and that's when "accidentally computed with extra precision and then double-rounded" comes up.
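For what it's worth, here's a minimal C sketch of that double-rounding case. The function names (`add_at_runtime`, `sum_rounded_once`, `report`) are made up for illustration. It assumes round-to-nearest and, for the x87 case, a unit left at its 64-bit-mantissa default (as on Linux; some platforms set the x87 precision control to 53 bits, and then the effect won't show).

```c
#include <float.h>
#include <stdio.h>

/* Add two doubles at run time. The volatile locals keep the compiler from
   folding the sum at compile time and force the result to be stored as a
   double, which is where a second rounding happens on x87. */
double add_at_runtime(double a, double b) {
    volatile double x = a, y = b;
    volatile double sum = x + y;
    return sum;
}

/* The exact value 1 + 2^-53 + 2^-78 lies just above the midpoint between
   1.0 and the next double, so a single rounding gives 1 + 2^-52.
   If the sum is first rounded to x87 extended precision it becomes exactly
   1 + 2^-53 (a tie), and the second rounding to double goes to 1.0.
   Returns 1 if the sum was rounded once, 0 if it ended up at 1.0. */
int sum_rounded_once(void) {
    const double tiny = 0x1p-53 + 0x1p-78;   /* exactly representable */
    double sum = add_at_runtime(1.0, tiny);
    return sum == 1.0 + DBL_EPSILON;
}

/* Print what the compiler claims about intermediate precision and what
   actually happened to the sum. */
void report(void) {
    /* FLT_EVAL_METHOD: 0 = evaluate in the declared type,
       2 = evaluate in long double (typical for x87 code). */
    printf("FLT_EVAL_METHOD = %d\n", (int)FLT_EVAL_METHOD);
    printf("sum = %a\n", add_at_runtime(1.0, 0x1p-53 + 0x1p-78));
    printf("rounded once: %s\n", sum_rounded_once() ? "yes" : "no");
}
```

Calling `report()` from `main` and building once with plain `gcc -O2` and once with `gcc -O2 -mfpmath=387` (or `-m32`, if the toolchain has 32-bit support) should show the difference, under the assumptions above.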