Lock-free Interprocess Communication
The title of this blog entry is also the title of an excellent article found at Dr Dobb’s. If you’re interested in low-level programming, I recommend that you take a look at it. An Alexander Dokumentov (seriously, that has to be a pseudonym… “I am Alexander Dokumentov – guru of manuals and specifications! Bow before me!”) presents an intriguing way of passing information between processes or threads, without the need for a single lock at all.
The basic idea is so simple that it hurts; and at the same time it goes against everything I learned in my real-time systems programming courses (yes, this is a perfect opportunity to retort with a comment about how little I must have learned):
“I propose a communication type that requires only atomic writing of processor word from processor cache into main memory and atomic processor word reading from main memory into the processor register or processor cache. To the best of my knowledge, all platforms satisfy these requirements if the address in memory is properly aligned.”
And he continues to explain this a bit further:
“In our case, it is important that shared memory which is used to transmit data between processes is aligned by the size of the processor word. Aligning guarantees that read/write operations to the main memory will be atomic. Atomic in this case is assumed to mean that if a value of a word initially was V1, and a process writes another value V2, then any other process or thread even without any synchronisation will read either the old value V1 or the new value V2.”
The rest of the article brings up excellent examples of how to send data in various ways using this technique, but the quotes above appears to be the article’s main idea. (According to the references, it seems that an Andrei Alexandrescu might have already presented this idea in his article Lock-free data structures from 2004, but I haven’t checked the article in question.)
What are the practical benefits, then? Well, single-processor systems won’t really see much of an improvement - if any at all - as long as the lock-based method is implemented in a decent way. It’s a whole different ball game for multi-processor systems, though: a test from the article shows a performance boost by a factor of 30! Given the explosion of multi-processor systems nowadays, this definitely seems like a great benefit.
On the downside, this method of communication demands that the CPU alignment is known in advance. I don’t know, but I get a feeling that this method can’t be used on both 32 bit and 64 bit CPUs, for example. No backward compatibility? Me no likey. But then again, I’m sure that the CPU capabilities can be ascertained during run-time, so a dynamic atomic size ought to be able to implement.

December 15th, 2006 at 5:03 am
Thank you. It was fun to read
Alexander Dokumentov is not a pseudonym
December 20th, 2006 at 7:22 pm
[...] “Alexander Dokumentov” I think Mr. Dokumentov himself has left a comment in an old post of mine! So I guess that it really isn’t a pseudonym at all. Oh well, I still giggle when I think of my little footnote there: (seriously, that has to be a pseudonym… “I am Alexander Dokumentov – guru of manuals and specifications! Bow before me!â€) [...]