But of course. Consider, however, an eight-CPU system where the other seven CPUs also have LOCK# asserted. You then have to wait for at least 7 memory latency delays, which is almost forever for a modern CPU.
Also, the memory access that is accompanied by LOCK tends to involve physical RAM, not cache (since reading the data from the processor local cache would defeat the purpose), which on a P4 takes _at least_ 200 cycles.
Now, consider having a LOCK for each reference count change in Pike (this is basically what hubbe implemented). You get severe slowdowns.
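To make the cost concrete, here is a minimal sketch in C11 of the two kinds of reference count update being compared. The struct and function names are made up for illustration and are not Pike's actual object layout or hubbe's implementation. The assumption is an x86 target, where a compiler typically emits a LOCK-prefixed instruction for the atomic variant and an ordinary add for the plain one.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical object headers, for illustration only. */
struct obj_plain  { int refs; };
struct obj_atomic { atomic_int refs; };

/* Plain increment: an ordinary read-modify-write with no bus locking.
 * Only safe if a single thread touches the object at a time. */
static void ref_inc_plain(struct obj_plain *o)
{
    o->refs++;
}

/* Atomic increment: on x86 this is typically compiled to a
 * LOCK-prefixed instruction, so every call pays the locking cost. */
static void ref_inc_atomic(struct obj_atomic *o)
{
    atomic_fetch_add_explicit(&o->refs, 1, memory_order_relaxed);
}

/* Atomic decrement: returns 1 when the last reference was dropped,
 * 0 otherwise. */
static int ref_dec_atomic(struct obj_atomic *o)
{
    int old = atomic_fetch_sub_explicit(&o->refs, 1, memory_order_acq_rel);
    assert(old > 0); /* dropping a reference nobody holds is a bug */
    return old == 1;
}
```

An interpreter changes reference counts on nearly every value copy, so the atomic variants end up on the hottest paths. That is where the slowdown comes from.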
/ Per Hedbor ()
Previous text:
2004-02-03 12:23: Subject: Re: Default backend and thread backends?
On Tue, Feb 03, 2004 at 12:05:02PM +0100, Per Hedbor () @ Pike (-) developers forum wrote:
However, the lock instruction can take forever to execute (the actual delay depends on the system architecture).
No, it won't. From the manual (Intel):
"Causes the processor's LOCK# signal to be asserted during execution of the accompanying instruction (turns the instruction into an atomic instruction). In a multiprocessor environment, the LOCK# signal insures that the processor has exclusive use of any shared memory while the signal is asserted."
That's all. Implementation on other architectures is similar, AFAIK.
Regards, /Al
/ Brevbäraren