"No such file or directory" crash for build scripts of 3rd party crates #3201

Open
bradzacher opened this issue Jan 23, 2025 · 25 comments

@bradzacher

I'm working to upgrade my company's codebase to a newer rules_rust version.
We're currently on 0.49.3.

I originally tried jumping straight to 0.57.0 as a bit of a yolo. I dealt with the various breaking changes just fine and found a few issues along the way, but I ran into one crash that I can't seem to figure out. All 3rd party crates with build scripts crash with the following error:

thread 'main' panicked at external/rules_rust/cargo/cargo_build_script_runner/lib.rs:131:33:
Unable to start command:
Command {
    program: "/var/lib/engflow/worker/work/9/exec/bazel-out/k8-fastbuild-ST-921160b699d3/bin/external/crate_index__crossbeam-utils-0.8.20/_bs-",
    args: [
        "/var/lib/engflow/worker/work/9/exec/bazel-out/k8-fastbuild-ST-921160b699d3/bin/external/crate_index__crossbeam-utils-0.8.20/_bs-",
    ],
    env: CommandEnv {
        clear: false,
        vars: {
            "AR": Some(
                "/usr/bin/ar",
            ),
            "CARGO_CFG_TARGET_ABI": Some(
                "",
            ),
            "CARGO_CFG_TARGET_ARCH": Some(
                "x86_64",
            ),
            "CARGO_CFG_TARGET_ENDIAN": Some(
                "little",
            ),
            "CARGO_CFG_TARGET_ENV": Some(
                "gnu",
            ),
            "CARGO_CFG_TARGET_FAMILY": Some(
                "unix",
            ),
            "CARGO_CFG_TARGET_FEATURE": Some(
                "fxsr,sse,sse2",
            ),
            "CARGO_CFG_TARGET_HAS_ATOMIC": Some(
                "16,32,64,8,ptr",
            ),
            "CARGO_CFG_TARGET_OS": Some(
                "linux",
            ),
            "CARGO_CFG_TARGET_POINTER_WIDTH": Some(
                "64",
            ),
            "CARGO_CFG_TARGET_VENDOR": Some(
                "unknown",
            ),
            "CARGO_CFG_UNIX": Some(
                "",
            ),
            "CARGO_ENCODED_RUSTFLAGS": Some(
                "--sysroot=/var/lib/engflow/worker/work/9/exec/bazel-out/k8-fastbuild-ST-921160b699d3/bin/external/rust_linux_x86_64__x86_64-unknown-linux-gnu__stable_tools/rust_toolchain\u{1f}--cap-lints=allow",
            ),
            "CARGO_MANIFEST_DIR": Some(
                "/var/lib/engflow/worker/work/9/exec/bazel-out/k8-fastbuild-ST-921160b699d3/bin/external/crate_index__crossbeam-utils-0.8.20/_bs.cargo_runfiles/crate_index__crossbeam-utils-0.8.20/",
            ),
            "CC": Some(
                "/usr/bin/clang-13",
            ),
            "CXX": Some(
                "/usr/bin/clang-13",
            ),
            "LD": Some(
                "/usr/bin/clang-13",
            ),
            "OUT_DIR": Some(
                "/var/lib/engflow/worker/work/9/exec/bazel-out/k8-fastbuild-ST-921160b699d3/bin/external/crate_index__crossbeam-utils-0.8.20/_bs.out_dir",
            ),
            "RUSTC": Some(
                "/var/lib/engflow/worker/work/9/exec/bazel-out/k8-fastbuild-ST-921160b699d3/bin/external/rust_linux_x86_64__x86_64-unknown-linux-gnu__stable_tools/rust_toolchain/bin/rustc",
            ),
            "RUST_BACKTRACE": Some(
                "full",
            ),
        },
    },
    cwd: Some(
        "/var/lib/engflow/worker/work/9/exec/bazel-out/k8-fastbuild-ST-921160b699d3/bin/external/crate_index__crossbeam-utils-0.8.20/_bs.cargo_runfiles/crate_index__crossbeam-utils-0.8.20/",
    ),
    create_pidfd: false,
}
Os { code: 2, kind: NotFound, message: "No such file or directory" }
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace

I spent a while poking at our setup and tried things like updating the packages that appear broken -- but I can't seem to puzzle it out or fix it.

I have inspected the bazel output and it looks like the required file exists on disk according to our tooling. In the above example I can clearly see bazel-out/k8-fastbuild-ST-921160b699d3/bin/external/crate_index__crossbeam-utils-0.8.20/_bs- is listed as an input, it exists on disk, and it is an executable (I can even execute it locally and it works).

I've done some bisecting and it looks like the 0.50.0 release is the first version that started breaking.

@PixelDust22
Contributor

PixelDust22 commented Feb 24, 2025

Bump. Running into a similar issue.

It was clearly documented that for build scripts, if rundir is unset, the current directory should be the cargo manifest directory. On my machine, the build script current directory was set to /private/var/tmp/_bazel_neo/534701a8e32353681ff0b2a029b72ba0/sandbox/darwin-sandbox/191/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/external/rules_rust++crate+crates__spirt-0.4.0/_bs.cargo_runfiles/rules_rust++crate+crates__spirt-0.4.0/ which doesn't exist.

/private/var/tmp/_bazel_neo/534701a8e32353681ff0b2a029b72ba0/sandbox/darwin-sandbox/191/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/external/rules_rust++crate+crates__spirt-0.4.0/ does exist, but there's no _bs.cargo_runfiles in it. The directories that do exist are: _bs-, _bs-.runfiles, _bs.out_dir-0.params.
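
The NotFound error that spawning a command returns does not say which path is missing: the program or the working directory. A minimal sketch of a pre-flight check that reports them separately is below. `missing_paths` is a hypothetical helper for debugging, not part of rules_rust.

```rust
use std::path::Path;

/// Hypothetical debugging helper (not part of rules_rust): report which of
/// the paths handed to `Command` is absent, since the OS error alone does
/// not distinguish a missing program from a missing working directory.
fn missing_paths(program: &Path, cwd: &Path) -> Vec<String> {
    let mut missing = Vec::new();
    // `exists` follows symlinks, so a dangling symlink counts as missing.
    if !program.exists() {
        missing.push(format!("program: {}", program.display()));
    }
    if !cwd.is_dir() {
        missing.push(format!("cwd: {}", cwd.display()));
    }
    missing
}
```

Calling `missing_paths` with the `program` and `cwd` values from the panic output, from inside the same sandbox, would show whether the failure is about the script or about its run directory.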

I was unable to bisect it, but looking at the commit history I suspect #2891. You can reproduce by adding https://github.com/EmbarkStudios/spirt as a dependency.

cc @UebelAndre

@erenon
Contributor

erenon commented Mar 4, 2025

pq-sys is also affected as of 0.7.0 (see build.rs in this commit). I think the cause is:

  • build.rs embeds a cargo envvar: PathBuf::from(env!("CARGO_MANIFEST_DIR")).
  • when the buildscript actually runs, the embedded envvar points to the previous sandbox (the one of the compilation), that is not present any more.
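
The difference between the two lookups can be sketched as follows. This illustrates the suspected cause and is not the pq-sys code: `manifest_dir` and `current_manifest_dir` are hypothetical helpers, and `option_env!` stands in for `env!` so the snippet also compiles outside Cargo.

```rust
use std::env;
use std::path::PathBuf;

/// Value baked into the binary when build.rs was *compiled*. Under Bazel this
/// can be a path inside a sandbox that is gone by the time the script runs.
const COMPILE_TIME_DIR: Option<&str> = option_env!("CARGO_MANIFEST_DIR");

/// Hypothetical helper: prefer the value visible when the script *runs*,
/// and fall back to the compile-time value only if nothing is set.
fn manifest_dir(runtime: Option<String>, compile_time: Option<&str>) -> Option<PathBuf> {
    runtime
        .or_else(|| compile_time.map(str::to_owned))
        .map(PathBuf::from)
}

/// What a build script could call instead of `env!("CARGO_MANIFEST_DIR")`.
fn current_manifest_dir() -> Option<PathBuf> {
    manifest_dir(env::var("CARGO_MANIFEST_DIR").ok(), COMPILE_TIME_DIR)
}
```

Reading the variable at run time avoids depending on the directory in which the script happened to be compiled.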

I verified this by adding some eprintln!s into build.rs:

CARGO_MANIFEST_DIR: .../sandbox/linux-sandbox/14/execroot/_main/external/rules_rust~~crate~crates__pq-sys-0.7.0
source_path: ".../sandbox/linux-sandbox/14/execroot/_main/external/rules_rust~~crate~crates__pq-sys-0.7.0/src/bindings_linux.rs"
out_path: ".../sandbox/linux-sandbox/15/execroot/_main/bazel-out/k8-fastbuild/bin/external/rules_rust~~crate~crates__pq-sys-0.7.0/_bs.out_dir/bindings.rs"

There's one thing I do not understand. An aquery of the build script compilation shows:

$ bazel aquery @@rules_rust~~crate~crates__pq-sys-0.7.0//:_bs_
action 'Compiling Rust bin _bs_ (8 files) [for tool]'
  Mnemonic: Rustc
  Target: @@rules_rust~~crate~crates__pq-sys-0.7.0//:_bs_
[...]
CARGO_MANIFEST_DIR=${pwd}/external/rules_rust~~crate~crates__pq-sys-0.7.0

What replaces ${pwd} with the actual absolute path? There's such a transformation in cargo/cargo_build_script_runner/bin.rs, but that is for the envvars passed to the buildscript runtime.
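
For reference, the kind of substitution being asked about can be sketched as below. This is an assumption about its general shape, not the rules_rust implementation, and `expand_pwd` is a hypothetical name.

```rust
use std::path::Path;

/// Illustrative only: expand the literal `${pwd}` placeholder in an
/// environment value to an absolute directory. Not the rules_rust code.
fn expand_pwd(value: &str, pwd: &Path) -> String {
    value.replace("${pwd}", &pwd.to_string_lossy())
}
```

If an expansion like this runs before the compile action, any `env!` in build.rs would capture the absolute path of that action's sandbox.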

@aran

aran commented Mar 6, 2025

Anyone found a workaround in the meantime?

@UebelAndre
Collaborator

What happens if you build with --remote_download_all?

@erenon
Contributor

erenon commented Mar 6, 2025

In my case, this happens even without remote execution.

@UebelAndre
Collaborator

In my case, this happens even without remote execution.

--remote_download_all would eliminate Builds without the bytes as a factor, right? My understanding is you could still see the same issue locally or in remote execution.

@UebelAndre
Collaborator

And does this occur on 0.58.0?

@aran

aran commented Mar 6, 2025

What happens if you build with --remote_download_all

For me, no difference.

And does this occur on 0.58.0?

I ran into it on 0.58.0.

@bradzacher
Author

@UebelAndre I just tested this against 0.58.0 and the bug still exists.
The bug is occurring only in our CI environment which does use RBE.

@erenon
Contributor

erenon commented Mar 11, 2025

This issue is reproducible locally with these files:

# MODULE.bazel
bazel_dep(name = "rules_rust", version = "0.58.0")

crate = use_extension("@rules_rust//crate_universe:extension.bzl", "crate")
crate.spec(package = "pq-sys", version = "=0.7.0", features = [], default_features = False)

crate.from_specs(
  isolated = False,
  supported_platform_triples = [
    "x86_64-unknown-linux-gnu",
  ],
)
use_repo(crate, "crates")

BUILD.bazel is empty.
.bazelversion is "8.1.1".

Then:

bazel build @crates//:pq-sys

Produces:

thread 'main' panicked at external/rules_rust++crate+crates__pq-sys-0.7.0/build.rs:118:46:
Couldn't write bindings: Os { code: 2, kind: NotFound, message: "No such file or directory" }

Same with --remote_download_all. No remote exec or any other special config is needed.

@taj-p
Contributor

taj-p commented Mar 11, 2025

I reproduced this on 0.59.0 too

@UebelAndre
Collaborator

re: @erenon #3201 (comment)

This seems like a different issue. This is the pq-sys build script missing data, whereas the original report is about the Bazel wrapper failing to launch the build script itself.

@UebelAndre
Collaborator

@Neo-Zhixing

/private/var/tmp/_bazel_neo/534701a8e32353681ff0b2a029b72ba0/sandbox/darwin-sandbox/191/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/external/rules_rust++crate+crates__spirt-0.4.0/ does exist, but there's no _bs.cargo_runfiles in it. The directories that do exist are: _bs-, _bs-.runfiles, _bs.out_dir-0.params.

Interesting you say _bs- is a directory. That should be a file/symlink. Are you sure it's really a directory in your repro?

@PixelDust22
Contributor

@UebelAndre you are right, these are symlinks.

lrwxr-xr-x  1 neo  wheel  164 Mar 12 08:27 _bs- -> /private/var/tmp/_bazel_neo/534701a8e32353681ff0b2a029b72ba0/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/external/rules_rust++crate+crates__spirt-0.4.0/_bs-
drwxr-xr-x  5 neo  wheel  160 Mar 12 08:27 _bs-.runfiles
lrwxr-xr-x  1 neo  wheel  180 Mar 12 08:27 _bs.out_dir-0.params -> /private/var/tmp/_bazel_neo/534701a8e32353681ff0b2a029b72ba0/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/external/rules_rust++crate+crates__spirt-0.4.0/_bs.out_dir-0.params

@bradzacher
Author

@UebelAndre is there any information that I can provide to help debug this?
I don't have the expertise to dive into this codebase, but I can provide whatever info that might help investigate.

This is sadly blocking us from upgrading to newer rust versions so I'd love to help in any way I can.

@UebelAndre
Collaborator

I could use a repro that 100% reproduces the issue. We definitely have tests for cargo_build_script which don’t seem to hit this and I can’t find a delta 😞

@PixelDust22
Contributor

Try this:
MODULE.bazel:

module(
    name = "mtl",
    version = "0.1.0",
)

bazel_dep(name = "rules_rust", version = "0.59.1")

crate = use_extension("@rules_rust//crate_universe:extension.bzl", "crate")
crate.spec(package = "spirt", version = "=0.4.0")

crate.from_specs()
use_repo(crate, "crates")

BUILD: empty

build command:

bazel build @crates//:spirt

Output:

--stderr:
 error: /private/var/tmp/_bazel_neo/cb279c75e79ad3fd7c269d61af4255a0/sandbox/darwin-sandbox/43/execroot/_main/bazel-out/darwin_arm64-fastbuild/bin/external/rules_rust++crate+crates__spirt-0.4.0/_bs.cargo_runfiles/rules_rust++crate+crates__spirt-0.4.0/khronos-spec/SPIRV-Headers/include/spirv/unified1 is not a directory
  help: git submodules are required to build from a git checkout
  help: run `git submodule update --init`
  note: if the error persists, please open an issue

Target @@rules_rust++crate+crates__spirt-0.4.0//:spirt failed to build
Use --verbose_failures to see the command lines of failed build steps.
ERROR: /private/var/tmp/_bazel_neo/cb279c75e79ad3fd7c269d61af4255a0/external/rules_rust++crate+crates__spirt-0.4.0/BUILD.bazel:27:13 Compiling Rust rlib spirt v0.4.0 (32 files) failed: (Exit 1): runner failed: error executing CargoBuildScriptRun command (from target @@rules_rust++crate+crates__spirt-0.4.0//:_bs) bazel-out/darwin_arm64-opt-exec-ST-d57f47055a04/bin/external/rules_rust+/cargo/cargo_build_script_runner/runner ... (remaining 10 arguments skipped)

@UebelAndre
Collaborator

re @Neo-Zhixing #3201 (comment)

This looks like a different error than what was originally posted. If you download the spirit crate at that version there's no SPIRV-Headers in ./khronos-spec and that directory is missing entirely.

spirit-0.4.0
├── Cargo.lock
├── Cargo.toml
├── Cargo.toml.orig
├── README.md
└── src
    ├── app.rs
    ├── bodies.rs
    ├── cfg_loader.rs
    ├── empty.rs
    ├── error.rs
    ├── extension.rs
    ├── fragment
    │   ├── driver.rs
    │   ├── mod.rs
    │   └── pipeline.rs
    ├── lib.rs
    ├── macro_support.rs
    ├── spirit.rs
    ├── utils.rs
    └── validation.rs

3 directories, 18 files

I'm not sure how this crate is intended to be fetched but seems like the crates.io publish is insufficient.

That being said, this problem exists past the original issue of build scripts being completely unable to start so if anyone has a repro for that I can try to take a look.

@aran

aran commented Mar 21, 2025

@UebelAndre would it be better to file a separate issue for what @erenon mentioned above, i.e. Couldn't write bindings: Os { code: 2, kind: NotFound, message: "No such file or directory" } ? (#3201 (comment))

I think it has a reliable reproduction at least.

@UebelAndre
Collaborator

@UebelAndre would it be better to file a separate issue for what @erenon mentioned above, i.e. Couldn't write bindings: Os { code: 2, kind: NotFound, message: "No such file or directory" } ? (#3201 (comment))

I think it has a reliable reproduction at least.

Yeah, that’s a separate issue that I think is just unique to the crate

@erenon
Contributor

erenon commented Mar 21, 2025

Filed a new ticket: #3369

@PixelDust22
Contributor

PixelDust22 commented Apr 2, 2025

@UebelAndre The name of the crate was spirt, not spirit. spirit is an entirely different crate.

I manually downloaded spirt 0.4.0 from https://static.crates.io/crates/spirt/spirt-0.4.0.crate and the files are indeed there. I do believe that spirt 0.4.0 is a valid repro.

@lamcw

lamcw commented Apr 7, 2025

Hey! 👋 I've done some extensive research into OP's issue (OP and I are on the same team 😄) and came to the conclusion that #2826 is the root cause.

The premise of this problem is that we have more than 1 execution platform registered in our setup, via

  • register_execution_platforms and/or
  • --extra_execution_platforms

With this setup we intend to use different libc/gcc toolchains, registered at different paths, on different execution platforms.

Back to the PR above: the change introduces a cargo_build_script_runfiles rule so that the data dependency is built using the target configuration, while the build script binary itself is built and symlinked using the exec configuration. The binary is then consumed by cargo_build_script to execute the build script.

This becomes a problem for Bazel workspaces/modules with more than one execution platform registered, because the execution platform can differ between the cargo_build_script_runfiles rule

cargo_build_script_runfiles(
    name = name + "-",
    script = ":{}_".format(name),
    data = data,
    tools = tools,
    tags = binary_tags,
    **wrapper_kwargs
)

and the cargo_build_script rule (a.k.a _build_script_run)

_build_script_run(
    name = name,
    script = ":{}-".format(name),
    crate_features = crate_features,
    version = version,
    build_script_env = build_script_env,
    use_default_shell_env = sanitized_use_default_shell_env,
    links = links,
    deps = deps,
    link_deps = link_deps,
    rundir = rundir,
    rustc_flags = rustc_flags,
    visibility = visibility,
    tags = tags,
    pkg_name = pkg_name,
    **kwargs
)

This is because the cargo_build_script rule specifies the toolchain it depends on whereas the cargo_build_script_runfiles rule depends on no toolchain, so Bazel is free to select whatever execution platform is registered. In our case it simply selects the first execution platform registered in our setup, which is the wrong platform to use because it will have a different libc than the libc resolved for the execution platform used in cargo_build_script. Therefore the script runner is built with an incorrect RUNPATH pointing to a libc that does not exist when invoked in the CargoBuildScriptRun action. The "No such file or directory" error in OP actually is referring to a missing libc. This is NOT something to do with missing data as @UebelAndre has previously suggested.
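
A file that exists and is executable can still fail to start with "No such file or directory" when something it needs in order to launch is missing. The Unix-only sketch below shows this with a script whose `#!` interpreter does not exist, used here as a stand-in for a binary whose loader or libc is absent. Whether that is the exact mechanism in the setup described above is an assumption, and `spawn_with_missing_interpreter` is a hypothetical name.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::PermissionsExt;
use std::path::Path;
use std::process::Command;

/// Write an executable script whose interpreter does not exist, then try to
/// run it. Returns the kind of error that `spawn` reports.
fn spawn_with_missing_interpreter(dir: &Path) -> io::Result<io::ErrorKind> {
    let script = dir.join("missing_interp_demo.sh");
    // The interpreter path below is deliberately bogus (assumed not to exist).
    fs::write(&script, "#!/nonexistent/interpreter-3201\n")?;
    fs::set_permissions(&script, fs::Permissions::from_mode(0o755))?;

    // The script itself is present on disk at this point.
    assert!(script.exists());

    let result = Command::new(&script).spawn();
    fs::remove_file(&script)?;

    match result {
        Err(err) => Ok(err.kind()),
        Ok(mut child) => {
            // Not expected; reap the child so it does not linger.
            child.wait()?;
            Err(io::Error::new(io::ErrorKind::Other, "spawn unexpectedly succeeded"))
        }
    }
}
```

On Linux the returned kind should be `NotFound`, the same kind shown in the panic at the top of this issue, even though the file being launched is on disk.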

Question for @UebelAndre -- what problem is #2826 trying to solve exactly? If it is simply trying to avoid the need for multiple configurations while keeping the data built in the target configuration, would it not suffice to add a data attribute to the cargo_build_script rule and forward it directly into the output? I suspect the original behavior is probably the correct behavior. If we wish to avoid duplicated builds due to differences in configurations, we should probably use path mapping instead?

@jesses-canva

This is similar to an issue we found in rules_go: bazel-contrib/rules_go#4127.

Bazel resolves the execution platform per target, per configuration, so you can't reliably pass information about "the" exec platform between targets.

That is what's happening with cargo_build_script_runfiles: it returns a combination of outputs for the target and exec platforms, but nothing requires its exec platform to match the exec platform of cargo_build_script.

@lamcw

lamcw commented May 29, 2025

@UebelAndre have you had a chance to take a look at this?
