fix(py_venv): work in terms of bytes when patching shebang lines #606

Open. plobsing wants to merge 2 commits into main.

Conversation

plobsing

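As [uv's implementation](https://github.com/astral-sh/uv/blob/db14cc3005d2cd53802cb04c2f1e177a22c934ac/crates/uv-install-wheel/src/wheel.rs#L425) notes:

> scripts might be binaries, so we read an exact number of bytes instead of the first line as string

Indeed, one wheel that contains a binary "script" is [`uv` itself](https://pypi.org/project/uv/).

Constructing a venv that happens to include `uv` was previously failing with: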

ERROR: /Users/peter/tecton/sdk/pypi/BUILD.bazel:97:8: Action sdk/pypi/.venv failed: (Exit 1): sandbox-exec failed: error executing Action command
  (cd /private/var/tmp/_bazel_peter/dfecb8ec3f6f433d8509be7ebe017232/sandbox/darwin-sandbox/589/execroot/_main && \
  exec env - \
    TMPDIR=/var/folders/9_/p2d_shr10b91_464_3jfl5t80000gn/T/ \
  /usr/bin/sandbox-exec -f /private/var/tmp/_bazel_peter/dfecb8ec3f6f433d8509be7ebe017232/sandbox/darwin-sandbox/589/sandbox.sb /var/tmp/_bazel_peter/install/96e26d97222159f904e14600d7490eb0/process-wrapper '--timeout=0' '--kill_delay=15' '--stats=/private/var/tmp/_bazel_peter/dfecb8ec3f6f433d8509be7ebe017232/sandbox/darwin-sandbox/589/stats.out' bazel-out/darwin_arm64-opt-exec-ST-2adb5a2e0ae2/bin/external/aspect_rules_py~/py/tools/venv_bin/venv_macos_aarch64_build '--location=bazel-out/darwin_arm64-fastbuild/bin/sdk/pypi/.venv' '--venv-shim=bazel-out/darwin_arm64-fastbuild-ST-2adb5a2e0ae2/bin/external/aspect_rules_py~/py/tools/venv_shim/shim_macos_aarch64_build' '--python=python_3.8_macos_aarch64_runtime/python/install/bin/python3.8' '--pth-file=bazel-out/darwin_arm64-fastbuild/bin/sdk/pypi/venv.pth' '--env-file=bazel-out/darwin_arm64-fastbuild/bin/sdk/pypi/venv.env' '--bin-dir=bazel-out/darwin_arm64-fastbuild/bin' '--collision-strategy=error' '--venv-name=.venv' '--mode=static-copy' '--version=3.8')
Error:   × Unable to run command:
  ╰─▶ stream did not contain valid UTF-8
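
The failure and the byte-oriented approach can be illustrated with a minimal sketch. It is not the code from this PR: `decode_as_text` and `starts_with_placeholder` are hypothetical helpers, and the sample bytes are made up to resemble a binary header. Decoding such bytes with `read_to_string` fails with an `InvalidData` error, while comparing a fixed-size prefix with `read_exact` never decodes anything.

```rust
use std::io::{self, Cursor, ErrorKind, Read, Seek};

/// Placeholder shebang, mirroring the constant in this PR's diff.
const PLACEHOLDER_SHEBANG: &[u8] = b"#!/dev/null";

/// Hypothetical helper: decodes the whole content as UTF-8 text.
/// Returns an `InvalidData` error for content that is not valid UTF-8.
fn decode_as_text(bytes: &[u8]) -> io::Result<String> {
    let mut text = String::new();
    Cursor::new(bytes).read_to_string(&mut text)?;
    Ok(text)
}

/// Hypothetical helper: reports whether the stream begins with the
/// placeholder shebang, comparing raw bytes only.
fn starts_with_placeholder<R: Read + Seek>(reader: &mut R) -> io::Result<bool> {
    let mut prefix = [0u8; PLACEHOLDER_SHEBANG.len()];
    let matched = match reader.read_exact(&mut prefix) {
        Ok(()) => &prefix[..] == PLACEHOLDER_SHEBANG,
        // A file shorter than the placeholder cannot start with it.
        Err(e) if e.kind() == ErrorKind::UnexpectedEof => false,
        Err(e) => return Err(e),
    };
    // Rewind so that a later copy starts from the beginning of the file.
    reader.rewind()?;
    Ok(matched)
}
```

With these helpers, `decode_as_text` rejects bytes such as `[0xCF, 0xFA, 0xED, 0xFE]`, whereas `starts_with_placeholder` simply returns `false` for them.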

Changes are visible to end-users: yes

  • Searched for relevant documentation and updated as needed: yes
  • Breaking change (forces users to change their own code or config): no
  • Suggested release notes appear below: yes

fix(py_venv): binaries in the scripts folder no longer crash venv builder

Test plan

  • Manual testing; please provide instructions so we can reproduce:

Minimized repro including only a single, problematic package (uv) in the venv:
uv_repro.zip

bazel build :venv

Ok(())
}

const PLACEHOLDER_SHEBANG: &[u8] = b"#!/dev/null";
Author

@plobsing plobsing Jun 28, 2025

BTW, I would love to know where this `#!/dev/null` placeholder comes from (so I can document it with a comment). FWICT, PEP 427 only recommends recognizing `#!python` and `#!pythonw`. @arrdem, you seem to have added this; do you recall?

Collaborator

This shebang is unspecified behavior of rules_python. To install packages, rules_python must specify an interpreter path, but that path is chosen at module/workspace setup time, when there is no way to know the Bazel label, the relative path, or anything else about the interpreter that will eventually invoke the script. So rules_python does the "reasonable" (insane) thing and uses `/dev/null` as the shebang. It could use `/bin/false` or any other value.

I don't think it's reasonable or future-proof to hardcode this or to use the `read_exact` strategy here. The protocol should be: read the first 512 bytes, check whether they start with `#!` and contain a `\n`, and if so replace that first line.

As this PR stands, I think your `rewind()` machinery fails to strip the shebang from the copy source.
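
A minimal sketch of the sniffing protocol proposed in this comment, under stated assumptions: `replace_shebang` and `SNIFF_LEN` are hypothetical names, the content is already in memory, and the replacement shebang is supplied by the caller. It is not code from rules_py or from this PR.

```rust
/// Number of leading bytes to inspect (the 512 bytes suggested above).
const SNIFF_LEN: usize = 512;

/// Hypothetical helper: if the first `SNIFF_LEN` bytes start with `#!` and
/// contain a newline, returns the content with that first line replaced by
/// `new_shebang`. Otherwise returns the content unchanged.
fn replace_shebang(content: &[u8], new_shebang: &[u8]) -> Vec<u8> {
    let head = &content[..content.len().min(SNIFF_LEN)];
    if head.starts_with(b"#!") {
        if let Some(newline) = head.iter().position(|&b| b == b'\n') {
            let mut patched =
                Vec::with_capacity(new_shebang.len() + content.len() - newline);
            patched.extend_from_slice(new_shebang);
            // Keep everything from the newline onwards, including the '\n'.
            patched.extend_from_slice(&content[newline..]);
            return patched;
        }
    }
    // Binaries and scripts without a recognizable shebang pass through as-is.
    content.to_vec()
}
```

Because the check works on bytes and does not depend on a particular placeholder value, it would treat `#!/dev/null`, `#!/bin/false`, or any other shebang the same way.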

Author

Thanks for the info; I've added it to the PR. It is a bit wild: I wouldn't have expected rules_python to be the source.

The two behaviours identified as defects were preserved exactly from the prior implementation. This PR fixes only the defect it claims to: the venv builder choking on binary files in the scripts directory.

I do agree that it's a little odd to do things this way, and I'd be happy to work with you towards more correct shebang logic. (BTW, do you know of any packages that trigger the shebang substitution logic, so that we can cover all of this with a test?) But those behaviours are not what causes the issue I am seeking to address, so I do not think they should be part of this PR.

Author

If this patch set is unacceptable, the issue of binaries triggering `stream did not contain valid UTF-8` could alternatively be addressed by handling that specific error. That looks like this: main...plobsing:rules_py:ignore_invalid_utf8.

I like that solution less because, while it handles more cases than are handled today, including the one I care about, it just feels less correct. In principle, a Python source file is not required to be UTF-8 (PEP 263 is still current and documented for recent Pythons, even if the feature is little used); the encoding assumption made by using `read_to_string` to process bin files, even only Python sources, just isn't great in general.
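
To make the trade-off concrete, here is a rough sketch of the "handle that specific error" alternative. It is not the code on the linked branch; `read_text_if_utf8` is a hypothetical helper that treats invalid UTF-8 as "not a text script" rather than as a fatal error.

```rust
use std::io::{self, ErrorKind, Read};

/// Hypothetical helper: reads the content as text, returning `Ok(None)`
/// when it is not valid UTF-8 instead of failing.
fn read_text_if_utf8<R: Read>(mut reader: R) -> io::Result<Option<String>> {
    let mut text = String::new();
    match reader.read_to_string(&mut text) {
        Ok(_) => Ok(Some(text)),
        // Not valid UTF-8: the caller skips shebang patching for this file.
        Err(e) if e.kind() == ErrorKind::InvalidData => Ok(None),
        Err(e) => Err(e),
    }
}
```

Under this approach a binary is skipped, but so is a legitimate Python source that declares a non-UTF-8 encoding per PEP 263 (for example `# -*- coding: latin-1 -*-` with a raw `0xE9` byte), so its shebang would never be patched.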

@plobsing plobsing changed the title fix: work in terms of bytes when patching shebang lines fix(py_venv): work in terms of bytes when patching shebang lines Jun 28, 2025

aspect-workflows bot commented Jun 28, 2025

Test

12 test targets passed

Targets
//examples/multi_version:py_version_default_test [k8-fastbuild]                         1s
//examples/multi_version:py_version_test [k8-fastbuild-ST-494921797612]                 2s
//examples/pytest:pytest_test [k8-fastbuild]                                            2s
//examples/pytest:sharded/test [k8-fastbuild]                                           3s
//examples/virtual_deps:pytest_test [k8-fastbuild]                                      1s
//py/tests/cc-deps:test_smoke [k8-fastbuild]                                            586ms
//py/tests/external-deps:test_can_import_runfiles_helper [k8-fastbuild]                 601ms
//py/tests/internal-deps:assert [k8-fastbuild]                                          464ms
//py/tests/py-binary:runfiles_from_pip_test [k8-fastbuild]                              714ms
//py/tests/py-test:test_env_vars [k8-fastbuild]                                         585ms
//py/tests/py_image_layer:py_image_test [k8-fastbuild]                                  5s
//py/tests/repo_relative_imports/test:test [k8-fastbuild]                               629ms

Total test execution time was 17s. 29 tests (70.7%) were fully cached saving 53s.

@plobsing plobsing requested a review from arrdem July 2, 2025 19:45