Non-blocking Synchronization in Hardware and Software

Michael Greenwald
Distributed Systems Group
Stanford University

Non-blocking synchronization (NBS) has significant advantages over blocking synchronization; however, it has not been widely deployed to date. This talk presents techniques used to build a multi-processor operating system kernel, the CacheKernel, in which synchronization is done exclusively with NBS. One key feature was the availability of a richer hardware primitive: while, in theory, single-word atomic primitives (LL/SC or Compare-And-Swap) are {\em universal}, in practice we have found significant advantages to using {\em two}-word primitives that can atomically update two non-contiguous words (e.g. Double-Compare-And-Swap (DCAS) or, equivalently, a pipelined extension to LL/SC). Finally, we briefly discuss how NBS can be used to support synchronized methods in Java, simplifying their implementation and increasing system robustness.
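
To make the contrast between single-word and two-word primitives concrete, the following is a minimal Java sketch; it is not code from the CacheKernel. The method `casIncrement` is a genuine non-blocking retry loop built on the standard `AtomicInteger.compareAndSet`. The method `dcas` only specifies what a Double-Compare-And-Swap is expected to do: since Java exposes no such instruction, it is emulated here with a lock, so the emulation itself is blocking and serves purely to illustrate the semantics. The names `DcasSketch`, `Word`, `dcas`, `casIncrement`, and `EMULATION_LOCK` are all hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class DcasSketch {
    /** One mutable word; stands in for a single memory location. */
    static final class Word {
        int value;
        Word(int value) { this.value = value; }
    }

    // Assumption: a global lock stands in for the atomicity that a
    // hardware DCAS instruction would provide. This is NOT non-blocking.
    private static final Object EMULATION_LOCK = new Object();

    /**
     * DCAS semantics: if both words hold their expected values, write both
     * new values and return true; otherwise leave both words unchanged.
     * The two words need not be adjacent in memory.
     */
    static boolean dcas(Word a, Word b, int oldA, int oldB, int newA, int newB) {
        synchronized (EMULATION_LOCK) {
            if (a.value == oldA && b.value == oldB) {
                a.value = newA;
                b.value = newB;
                return true;
            }
            return false;
        }
    }

    /** Single-word CAS retry loop: a non-blocking increment. */
    static int casIncrement(AtomicInteger counter) {
        while (true) {
            int seen = counter.get();
            // Succeeds only if no other thread changed the counter meanwhile.
            if (counter.compareAndSet(seen, seen + 1)) {
                return seen + 1;
            }
        }
    }

    public static void main(String[] args) {
        AtomicInteger counter = new AtomicInteger(41);
        System.out.println(casIncrement(counter));              // prints 42

        Word head = new Word(1);
        Word version = new Word(7);
        System.out.println(dcas(head, version, 1, 7, 2, 8));    // prints true
        System.out.println(dcas(head, version, 1, 7, 3, 9));    // prints false
    }
}
```

The second `dcas` call fails because `head` no longer holds the expected value, and in that case neither word is modified. Checking two independent locations in one atomic step (for example, a pointer together with a version counter) is the kind of update that a single-word primitive can only achieve through extra indirection.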