On Fri, 14 Jan 2005, Andi Kleen wrote:

> > Looked at arch/i386/lib/mmx.c. It avoids the mmx ops in an interrupt
> > context, but the rest of the prep for mmx only saves the fpu state if it's
> > in use. So that code would only be used rarely. The mmx 64 bit
> > instructions seem to be quite fast according to the manual: double the
> > cycles of the 32 bit instructions on Pentium M (somewhat higher on Pentium 4).
> With all the other overhead (disabling exceptions, saving registers etc.)
> it will likely be slower. Also you would need fallback paths for CPUs
> without MMX but with PAE (like PPro). You can benchmark
> it if you want, but I wouldn't be very optimistic.

So the PentiumPro is a cpu with atomic 64 bit operations in a cmpxchg but
no instruction to do an atomic 64 bit store or load, although the
architecture conceptually supports 64 bit atomic stores and loads? Wild.
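
For what it's worth, a 64 bit atomic load and store can be emulated with
nothing but the 64 bit compare-and-exchange. The following is only a rough
userspace sketch, not kernel code: it assumes gcc's
__sync_val_compare_and_swap builtin (which needs -march=i586 or later on
32 bit x86 so that it can emit cmpxchg8b), and the names load64_via_cas and
store64_via_cas are made up for illustration.

```c
#include <stdint.h>

/*
 * Atomic 64 bit load built from compare-and-swap. We compare against an
 * arbitrary guess (0) and offer the same value as the replacement. If the
 * location holds 0, a 0 is written back (no visible change); otherwise the
 * exchange fails. Either way the builtin returns the value it saw, read
 * atomically.
 */
static inline uint64_t load64_via_cas(volatile uint64_t *p)
{
    return __sync_val_compare_and_swap(p, 0, 0);
}

/*
 * Atomic 64 bit store built from compare-and-swap. Start from a guess of
 * the old value and retry with whatever the failed exchange reported until
 * the exchange succeeds.
 */
static inline void store64_via_cas(volatile uint64_t *p, uint64_t val)
{
    uint64_t old = 0;   /* initial guess; corrected by the loop below */

    for (;;) {
        uint64_t seen = __sync_val_compare_and_swap(p, old, val);
        if (seen == old)
            break;      /* exchange succeeded, val is now stored */
        old = seen;     /* retry with the value actually observed */
    }
}
```

Note that even the load performs a locked read-modify-write cycle here, so
it is much more expensive than a plain 32 bit load and needs the location
to be writable.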

