-
Notifications
You must be signed in to change notification settings - Fork 984
Description
Hi @daanx,
First thanks again for this awesome library!
My team (COD devs @ Activsion) have spotted some issues with hanging akin to #890 but on Windows. Very rare, real hard to repro, but given our fleet of test infra it happens quite a lot.
We found that it maybe due to improper use of atomics, and compiling to C++ possibly fixes it (comparing the assembly generated .cod files from MSVC) shows the C++ version would add memory barriers, and the nop
(npad 1) instruction, and possibly other changes.
So this was hopeful, but wanted to check what else changes by moving to C++, and I've found that now MSVCP140.dll loads too (which is not unexpected in general, but before it wasn't). And here are the two functions it needed.

(we have special custom build of mimalloc with switchable flavors at runtime controlled by env, but essentially each "mimalloc-flavor-xxxx.dll" is a mimalloc compiled with special flags, in case you are wondering about the naming).
So from the MSVCP140.dll it needs _Thrd_yield
an std::get_new_handler
- now the latter is expected I guess, but I got curious about the former.
Why would changing to C++ would need this now, and not before, and this led me to this discovery:
mimalloc/include/mimalloc/atomic.h
Line 361 in 09a2709
#if defined(__cplusplus) |
#if defined(__cplusplus)
#include <thread>
static inline void mi_atomic_yield(void) {
std::this_thread::yield();
}
#elif defined(_WIN32)
static inline void mi_atomic_yield(void) {
YieldProcessor();
}
#elif defined(__SSE2__)
#include <emmintrin.h>
...
and then to the std::this_thread::yield
implementation -
https://github.com/microsoft/STL/blob/5f8b52546480a01d1d9be6c033e31dfce48d4f13/stl/src/cthread.cpp#L86
which seems to call SwitchToThread
_CRTIMP2_PURE void __cdecl _Thrd_yield() noexcept { // surrender remainder of timeslice
SwitchToThread();
}
So the difference is that: YieldProcessor
emits pause
(same as what _mm_pause
would do), while std::this_thread::yield
would call SwitchToProcessor()
that may sometimes call into kernel to reschedule the current thread. I've asked ChatGPT here - https://chatgpt.com/share/689beeff-69c0-800a-be43-c16de7a405e3
I didn't want to change too much the behavior, so I've decided to stick for these two cases the way "C" did it (and also did not want to load the MSVC140.dll - it gets loaded anyway, but didn't want to disturb this part).
But it raises the question - what is really intended to be called here - The "pause" (YieldProcessor), or std::this_thread::yield SwitchToProcessor
?
Thank you!