
Commit dec92b4

Fix running of synchronous_loop.sh on macOS (Metal) (#1070)
* Fix train.py to support mps (Metal Performance Shaders)
* Update Compiling.md with info about `synchronous_loop.sh` on macOS
1 parent 4097a8a commit dec92b4

2 files changed (+9, -1 lines)

Compiling.md

Lines changed: 4 additions & 0 deletions
@@ -151,3 +151,7 @@ As also mentioned in the instructions below but repeated here for visibility, if
  * Pre-trained neural nets are available at [the main training website](https://katagotraining.org/).
  * You will probably want to edit `configs/gtp_example.cfg` (see "Tuning for Performance" above).
  * If using OpenCL, you will want to verify that KataGo is picking up the correct device when you run it (e.g. some systems may have both an Intel CPU OpenCL and GPU OpenCL, if KataGo appears to pick the wrong one, you can correct this by specifying `openclGpuToUse` in `configs/gtp_example.cfg`).
+ * If you want to run `synchronous_loop.sh` on macOS, do the following steps:
+   * Install GNU coreutils (`brew install coreutils`) to get a `head` tool that accepts negative line counts (`head -n -5` is used in `train.sh`).
+   * Install GNU findutils (`brew install findutils`) to get a `find` tool that supports the `-printf` option, which is used by `export_model_for_selfplay.sh`. After that, replace `find` with `gfind` in that script.
+     Note: you can try to avoid editing `export_model_for_selfplay.sh` by putting the installed findutils on your `PATH` (`export PATH="/opt/homebrew/opt/findutils/libexec/gnubin:$PATH"`) or by defining `alias find="gfind"`, but this does not always work (a quick way to check which tool variants are picked up is sketched below).
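As noted in the added documentation, the `PATH` and alias workarounds are not always picked up by the scripts. Below is a small diagnostic sketch, not part of the KataGo scripts, that checks whether the `head` and `find` currently on your `PATH` provide the GNU behaviors that `train.sh` and `export_model_for_selfplay.sh` rely on; it assumes Python 3.7+ for `subprocess.run(..., capture_output=True)`.

```python
# Hypothetical diagnostic (not part of KataGo): probes the `head` and `find`
# binaries found on PATH for the GNU features the selfplay scripts need.
import subprocess

def head_supports_negative_count() -> bool:
    # GNU head accepts `head -n -5` (print all but the last 5 lines);
    # the BSD head shipped with macOS rejects a negative count.
    proc = subprocess.run(["head", "-n", "-5"], input=b"x\n" * 10, capture_output=True)
    return proc.returncode == 0

def find_supports_printf() -> bool:
    # GNU find supports -printf; the BSD find shipped with macOS does not.
    proc = subprocess.run(["find", ".", "-maxdepth", "0", "-printf", "%p\\n"], capture_output=True)
    return proc.returncode == 0

if __name__ == "__main__":
    print("head -n -5 supported:", head_supports_negative_count())
    print("find -printf supported:", find_supports_printf())
```

If both checks report `True` after adjusting `PATH`, the scripts should work unmodified; otherwise, edit `export_model_for_selfplay.sh` to call `gfind` directly as described above.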

python/train.py

Lines changed: 5 additions & 1 deletion
@@ -260,11 +260,15 @@ def main(rank: int, world_size: int, args, multi_gpu_device_ids, readpipes, writ
          atexit.register(multiprocessing_cleanup)
          assert torch.cuda.is_available()

-     if True or torch.cuda.is_available():
+     if torch.cuda.is_available():
          my_gpu_id = multi_gpu_device_ids[rank]
          torch.cuda.set_device(my_gpu_id)
          logging.info("Using GPU device: " + torch.cuda.get_device_name())
          device = torch.device("cuda", my_gpu_id)
+     elif torch.backends.mps.is_available():  # Check for Apple Metal Performance Shaders
+         my_gpu_id = multi_gpu_device_ids[rank]
+         logging.info("Using MPS device")
+         device = torch.device("mps", my_gpu_id)
      else:
          logging.warning("WARNING: No GPU, using CPU")
          device = torch.device("cpu")
