Lock-Free Programming

Most C++ developers have heard of `std::mutex` and `std::shared_mutex`, and some have implemented simplified versions of them using `std::atomic`. However, there have been few attempts to design a scalable shared mutex that works efficiently on modern [NUMA](https://en.wikipedia.org/wiki/Non-uniform_memory_access) systems with 100+ cores.

To understand the underlying problems behind these limitations, study memory contention and profile how the system behaves when many threads attempt to read or modify the same memory address. Before designing high-level STL-like abstractions, it helps to understand how some key instructions operate:

- On x86:
  - `LOCK XADD`, `LOCK CMPXCHG`, and `PAUSE` for spinlocks and synchronization.
  - Memory barriers: `MFENCE`, `SFENCE`, `LFENCE`.
  - Transactional memory primitives (TSX): `XBEGIN` / `XEND`.
- On AArch64:
  - Load/store-exclusive: `LDXR` / `STXR` and atomic compare-and-swap (CAS).
  - Memory barriers: `DMB`, `DSB`, `ISB`.
  - Spinlock optimization: `YIELD`.
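
As a starting point for such profiling, here is a minimal sketch of a test-and-test-and-set spinlock built on `std::atomic`. The names `cpu_relax`, `spin_mutex`, and `contended_sum` are hypothetical and chosen for illustration. The instruction mapping is an assumption about typical compiler output, not a guarantee: on x86 the `exchange` usually lowers to an implicitly locked `XCHG`, while `fetch_add` and `compare_exchange_*` usually lower to `LOCK XADD` and `LOCK CMPXCHG`; on AArch64 without LSE the same operations usually become `LDXR` / `STXR` retry loops. Inspect the generated assembly on your own toolchain to confirm.

```cpp
#include <atomic>
#include <cassert>
#include <thread>
#include <vector>

#if defined(__x86_64__) || defined(__i386__) || defined(_M_X64) || defined(_M_IX86)
#include <immintrin.h>  // _mm_pause
#endif

// Hint to the core that the thread is in a spin-wait loop.
inline void cpu_relax() noexcept {
#if defined(__x86_64__) || defined(__i386__) || defined(_M_X64) || defined(_M_IX86)
    _mm_pause();                                 // PAUSE on x86
#elif defined(__aarch64__) && (defined(__GNUC__) || defined(__clang__))
    __asm__ __volatile__("yield" ::: "memory");  // YIELD on AArch64
#else
    std::this_thread::yield();                   // portable fallback
#endif
}

// Hypothetical test-and-test-and-set spinlock, for illustration only.
class spin_mutex {
    std::atomic<bool> locked_{false};

public:
    void lock() noexcept {
        for (;;) {
            // Try to take the lock with a single atomic read-modify-write.
            if (!locked_.exchange(true, std::memory_order_acquire)) return;
            // Spin on plain loads so waiters share the cache line
            // instead of repeatedly requesting exclusive ownership.
            while (locked_.load(std::memory_order_relaxed)) cpu_relax();
        }
    }

    bool try_lock() noexcept {
        return !locked_.load(std::memory_order_relaxed) &&
               !locked_.exchange(true, std::memory_order_acquire);
    }

    void unlock() noexcept { locked_.store(false, std::memory_order_release); }
};

// Hypothetical helper: `threads` workers each increment a plain counter
// `iterations` times under the spinlock and the final value is returned.
inline long long contended_sum(unsigned threads, long long iterations) {
    assert(threads > 0);
    spin_mutex mutex;
    long long counter = 0;
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (long long i = 0; i < iterations; ++i) {
                mutex.lock();
                ++counter;
                mutex.unlock();
            }
        });
    }
    for (auto& worker : pool) worker.join();
    return counter;
}
```

Timing `contended_sum` with a growing thread count, and comparing it against the same loop guarded by `std::mutex`, is one simple way to observe how contention on a single cache line scales.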

Once the low-level profiling is complete, the next step could be to explore implementations of advanced concurrency primitives proposed for the Concurrency TS, like the [Distributed Counters in P0261](https://github.com/cplusplus/papers/issues/563) or [Byte-wise atomic `memcpy` in P1478](https://github.com/cplusplus/papers/issues/370).
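
To make the distributed-counter idea concrete, here is a minimal sketch of a sharded counter. It is not the interface proposed in P0261: the names `sharded_counter`, `kCacheLine`, and `count_with_threads` are hypothetical, the 64-byte cache-line size is an assumption, and hashing the thread id to pick a shard is just one simple placement policy. The point is only that spreading increments over several cache lines trades a cheaper write path for a more expensive read.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Assumed cache-line size; 64 bytes is common but not universal.
inline constexpr std::size_t kCacheLine = 64;

// Hypothetical sharded counter, for illustration only.
template <std::size_t Shards = 16>
class sharded_counter {
    static_assert(Shards > 0, "at least one shard is required");

    // Each shard sits on its own cache line to avoid false sharing.
    struct alignas(kCacheLine) slot {
        std::atomic<long long> value{0};
    };
    std::array<slot, Shards> slots_{};

    // Pick a shard once per thread by hashing the thread id.
    static std::size_t shard_index() noexcept {
        static thread_local const std::size_t index =
            std::hash<std::thread::id>{}(std::this_thread::get_id()) % Shards;
        return index;
    }

public:
    // Writers touch only their own shard.
    void add(long long delta) noexcept {
        slots_[shard_index()].value.fetch_add(delta, std::memory_order_relaxed);
    }

    // Readers sum every shard; the result is exact only when no
    // writer is running concurrently.
    long long load() const noexcept {
        long long total = 0;
        for (const auto& s : slots_) total += s.value.load(std::memory_order_relaxed);
        return total;
    }
};

// Hypothetical helper: `threads` workers each add 1 `iterations` times.
inline long long count_with_threads(unsigned threads, long long iterations) {
    sharded_counter<> counter;
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (long long i = 0; i < iterations; ++i) counter.add(1);
        });
    }
    for (auto& worker : pool) worker.join();  // join makes all adds visible
    return counter.load();
}
```

Two threads may hash to the same shard, which is still correct because each shard is atomic; it only reduces the benefit. A NUMA-aware design would likely choose shards by core or node rather than by thread id.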

---

This topic could also serve as the foundation for a research paper on concurrency primitives, especially for those pursuing a master’s or PhD in Systems Programming.
