Selectively emulating sequential consistency in software improves efficiency in a multiprocessing computing environment. A writing CPU uses a high priority inter-processor interrupt to force each CPU in the system to execute a memory barrier. This step invalidates old data in the system. Each CPU that has executed a memory barrier instruction registers completion and sends an indicator to a memory location to indicate completion of the memory barrier instruction. Prior to updating the data, the writing CPU must check the register to ensure completion of the memory barrier execution by each CPU. The register may be in the form of an array, a bitmask, or a combining tree, or a comparable structure. This step ensures that all invalidates are removed from the system and that deadlock between two competing CPUs is avoided. Following validation that each CPU has executed the memory barrier instruction, the writing CPU may update the pointer to the data structure.
Read-copy-update (RCU) is performed within real-time and other types of systems, such that memory barrier usage within RCU is reduced. A computerized system includes processors, memory, updaters, and readers. The updaters update contents of a section of the memory by using first and second sets of per-processor counters, first and second sets of per-processor need-memory-barrier bits, and a global flip-counter bit. The global flip-counter bit specifies which of the first or second set of the per-processor counters and the per-processor need-memory-barrier bits is a current set, and which is a last set. The readers read the contents of the section of the memory by using the first and second sets of per-processor counters, the first and second sets of per-processor need-memory-barrier bits, and the global flip-counter bit, in a way that significantly reduces the need for memory barriers during such read operations.