The atomic_counter template provides for incrementing and decrementing values in a thread-safe manner.

More on the concepts involved and a detailed description of the implementation can be found in the August 1998 issue of Visual C++ Developer. A similar treatment appears as a whitepaper on this site.

The template allows you to declare a thread-safe value of any type.
atomic_counter<int> x1;
atomic_counter<unsigned long> x2;
atomic_counter<byte> x3 = 5;
A selection of read/modify/write operations is provided (see Members by Category). They allow modification of the value and testing of the value (either new or old, depending on the operator) in an unbroken operation, so other threads won't change things between the “read” and the “write”, and the result is in sync with this particular operation even if other threads are changing the value too.
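
The difference between testing the old value and testing the new value can be sketched as follows. This is only an illustration and uses std::atomic<int> from standard C++ as a stand-in, not the atomic_counter class itself. The helper names claim_ticket and release_ref are hypothetical.

```cpp
#include <atomic>
#include <cassert>

// Stand-in for the class described above: std::atomic<int> from standard
// C++, not the atomic_counter template itself.

// Post-increment reports the OLD value, so every caller gets a distinct
// ticket number even when several threads call this at once.
inline int claim_ticket(std::atomic<int>& next)
{
    return next++;
}

// Pre-decrement reports the NEW value, so exactly one caller sees the
// count reach zero, no matter how the threads interleave.
inline bool release_ref(std::atomic<int>& refs)
{
    return --refs == 0;
}
```

In both helpers the modification and the test are one unbroken operation. Writing the decrement and the comparison as two separate statements would let another thread change the value in between.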

A conversion operator allows the atomic_counter to be used as a plain value.

The template is designed with a dual personality. It may be used with any type that supports the underlying operations, for example a floating point type or a user-defined BCD class. In the general case (which is not implemented), it uses a critical section to wrap the underlying operation in a straightforward manner.
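
Since the general case is described but not implemented, the following is only a rough sketch of what such a wrapper might look like. The class name locked_counter is hypothetical, and std::mutex is used as a portable stand-in for a Win32 critical section. It assumes the type supports construction from 1 and the += operator.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical general-case wrapper: every operation takes a lock around
// the underlying operation on T. std::mutex stands in for a Win32
// critical section here.
template <typename T>
class locked_counter {
public:
    explicit locked_counter(T initial = T()) : value_(initial) {}

    // Pre-increment: returns the new value.
    T operator++()
    {
        std::lock_guard<std::mutex> guard(lock_);
        value_ += T(1);
        return value_;
    }

    // Post-increment: returns the old value.
    T operator++(int)
    {
        std::lock_guard<std::mutex> guard(lock_);
        T old = value_;
        value_ += T(1);
        return old;
    }

    // Conversion operator: read the value as a plain T.
    operator T() const
    {
        std::lock_guard<std::mutex> guard(lock_);
        return value_;
    }

private:
    mutable std::mutex lock_;
    T value_;
};
```

Note that this wrapper is larger than a plain T because it carries its own lock, which is one reason a specialized form for integral types is attractive.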

More interestingly, the template is explicitly specialized for the integral types in the following variations: signed or unsigned, 1, 2 or 4 bytes. In these cases, the atomic_counter takes up the same amount of room as the underlying variable (so it fits into structures where a plain number was expected), and uses special functions to perform the operations. This form is two to three times faster than the non-blocking case of using a Win32 Critical Section, and has the advantage of never having to block.

This code uses the assembly language primitives that are the basis of all such atomic operations. This is as efficient as you can get to accomplish this result, but be warned that it's not a cheap instruction. Benchmarks show that it's orders of magnitude slower than normal non-synchronized arithmetic on integers (about 100 nanoseconds compared to 2 nanoseconds). So, don't use atomic counters gratuitously. They are cheaper than general purpose synchronization primitives, but if you have 3 or more atomic counters it would be more efficient to use a Critical Section instead.

This class works for .NET “managed extensions” in VC++ version 7.1. That is, the entire Classics DLL (the same DLL used normally) may be linked with code compiled “using the CLR”, and this class is no exception. However, it uses __fastcall in the normal case, which is not available in “managed” code. So the DLL contains both a __fastcall function and a regular function, and the class definition is conditionally compiled to use the normal function signature when compiled with the /CLR switch.
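
The conditional compilation might look something like the sketch below. The macro name COUNTER_CALL and the function sketch_add are hypothetical and are not taken from the Classics headers. The sketch relies on the _MANAGED macro, which Visual C++ predefines when compiling with /clr.

```cpp
#include <cassert>

// Use __fastcall only under Visual C++ and only for native code.
// _MANAGED is predefined by the compiler when /clr is in effect.
#if defined(_MSC_VER) && !defined(_MANAGED)
#  define COUNTER_CALL __fastcall
#else
#  define COUNTER_CALL
#endif

// Hypothetical placeholder function. It only demonstrates where the
// calling convention appears in the signature; it is not an atomic
// primitive.
int COUNTER_CALL sketch_add(int value, int delta);

int COUNTER_CALL sketch_add(int value, int delta)
{
    return value + delta;
}
```

Callers write sketch_add(x, 1) in either build, and the macro picks the signature that matches how the including file is being compiled.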

This class detects at run time when only one CPU is present and disables the locking, which speeds up access. The atomic instructions are in fact single instructions, and an interrupt (or time-slice change) will not occur in the middle of one. The read-modify-write instruction is all that is needed when there is only one CPU. Only with multithreading under multiple CPUs is the bus-locking and pipeline-flushing necessary. Note, however, that a bus-mastering device other than a CPU causes the same problem, so when the single-CPU code is selected this class will not cooperate with non-CPU access to the memory location.
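
The run-time selection can be sketched roughly as follows. How the class actually queries the CPU count is not stated here, so std::thread::hardware_concurrency from standard C++ is used as an assumed stand-in, and the names increment_path, choose_path and detect_path are hypothetical. Only the decision is shown, because portable C++ has no way to spell the unlocked instruction itself.

```cpp
#include <cassert>
#include <thread>

// Which flavor of read-modify-write instruction to use.
enum class increment_path {
    plain_rmw,   // single instruction, no bus lock (one CPU only)
    bus_locked   // bus-locked instruction (required with multiple CPUs)
};

// Pick the path for a given CPU count. A count of 0 means "unknown"
// (hardware_concurrency may return 0), so it is treated conservatively.
inline increment_path choose_path(unsigned cpu_count)
{
    return cpu_count == 1 ? increment_path::plain_rmw
                          : increment_path::bus_locked;
}

// Query the machine and decide; intended to be called once at startup.
inline increment_path detect_path()
{
    return choose_path(std::thread::hardware_concurrency());
}
```

Treating an unknown count as multi-CPU costs some speed but never gives up correctness, which matches the caution expressed above about non-CPU bus masters.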