fix(py_venv): work in terms of bytes when patching shebang lines #606
As [Uv's implementation](https://github.com/astral-sh/uv/blob/db14cc3005d2cd53802cb04c2f1e177a22c934ac/crates/uv-install-wheel/src/wheel.rs#L425) notes:

> scripts might be binaries, so we read an exact number of bytes instead of the first line as string

Indeed, one wheel that contains a binary "script" is [`uv` itself](https://pypi.org/project/uv/). Constructing a venv that happens to include `uv` was previously failing with:

```
ERROR: /Users/peter/tecton/sdk/pypi/BUILD.bazel:97:8: Action sdk/pypi/.venv failed: (Exit 1): sandbox-exec failed: error executing Action command
  (cd /private/var/tmp/_bazel_peter/dfecb8ec3f6f433d8509be7ebe017232/sandbox/darwin-sandbox/589/execroot/_main && \
  exec env - \
    TMPDIR=/var/folders/9_/p2d_shr10b91_464_3jfl5t80000gn/T/ \
  /usr/bin/sandbox-exec -f /private/var/tmp/_bazel_peter/dfecb8ec3f6f433d8509be7ebe017232/sandbox/darwin-sandbox/589/sandbox.sb /var/tmp/_bazel_peter/install/96e26d97222159f904e14600d7490eb0/process-wrapper '--timeout=0' '--kill_delay=15' '--stats=/private/var/tmp/_bazel_peter/dfecb8ec3f6f433d8509be7ebe017232/sandbox/darwin-sandbox/589/stats.out' bazel-out/darwin_arm64-opt-exec-ST-2adb5a2e0ae2/bin/external/aspect_rules_py~/py/tools/venv_bin/venv_macos_aarch64_build '--location=bazel-out/darwin_arm64-fastbuild/bin/sdk/pypi/.venv' '--venv-shim=bazel-out/darwin_arm64-fastbuild-ST-2adb5a2e0ae2/bin/external/aspect_rules_py~/py/tools/venv_shim/shim_macos_aarch64_build' '--python=python_3.8_macos_aarch64_runtime/python/install/bin/python3.8' '--pth-file=bazel-out/darwin_arm64-fastbuild/bin/sdk/pypi/venv.pth' '--env-file=bazel-out/darwin_arm64-fastbuild/bin/sdk/pypi/venv.env' '--bin-dir=bazel-out/darwin_arm64-fastbuild/bin' '--collision-strategy=error' '--venv-name=.venv' '--mode=static-copy' '--version=3.8')
Error:   × Unable to run command:
  ╰─▶ stream did not contain valid UTF-8
```
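To illustrate the byte-oriented approach described above, here is a minimal sketch, not the code in this PR. It reads exactly as many bytes as the placeholder shebang is long and compares them as raw bytes, so a binary "script" never goes through UTF-8 decoding. The function name `starts_with_placeholder` is hypothetical; the constant mirrors the one in the diff.

```rust
use std::io::{self, Read};

/// Placeholder shebang as it appears in the diff for this PR.
const PLACEHOLDER_SHEBANG: &[u8] = b"#!/dev/null";

/// Hypothetical helper: returns true if `reader` begins with the placeholder
/// shebang. The comparison is on raw bytes, so non-UTF-8 input is not an error.
fn starts_with_placeholder<R: Read>(reader: &mut R) -> io::Result<bool> {
    let mut prefix = [0u8; PLACEHOLDER_SHEBANG.len()];
    match reader.read_exact(&mut prefix) {
        Ok(()) => Ok(&prefix[..] == PLACEHOLDER_SHEBANG),
        // Input shorter than the placeholder cannot match it.
        Err(e) if e.kind() == io::ErrorKind::UnexpectedEof => Ok(false),
        Err(e) => Err(e),
    }
}
```

A caller that goes on to copy the file would still need to account for the bytes already consumed from the reader, for example by seeking back to the start.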
    Ok(())
}

const PLACEHOLDER_SHEBANG: &[u8] = b"#!/dev/null";
This shebang is unspecified behavior of `rules_python`. As part of how `rules_python` implements installing packages, an interpreter path must be specified, but since that path is specified at module/workspace setup time, there's no way to know the Bazel label, the relative path, or anything else about the interpreter with which the script may eventually be invoked. So `rules_python` does the "reasonable" (insane) thing and uses `/dev/null` as the shebang. It could use `/bin/false` or any other value.
I don't think it's reasonable or future-proof to hardcode this or use the `read_exact` strategy here. The protocol should be to read the first 512 bytes, check whether they start with `#!` and contain a `\n`, and if so replace that first line.
I think your `rewind()` machinery fails to strip the shebang from the copy source as this PR stands.
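The protocol proposed in this comment could be sketched roughly as follows. This is an illustration under assumptions, not code from rules_py: the helper name `replace_shebang_line` and the in-memory `&[u8]` interface are hypothetical, and the 512-byte probe length is taken from the comment.

```rust
/// Hypothetical helper: if the first 512 bytes of `content` start with `#!`
/// and contain a newline, replace that first line with `new_shebang`.
/// Otherwise the content is returned unchanged.
fn replace_shebang_line(content: &[u8], new_shebang: &[u8]) -> Vec<u8> {
    const PROBE_LEN: usize = 512;
    let probe = &content[..content.len().min(PROBE_LEN)];
    if !probe.starts_with(b"#!") {
        // Not a script with a shebang (possibly a binary): leave it alone.
        return content.to_vec();
    }
    match probe.iter().position(|&b| b == b'\n') {
        Some(newline) => {
            let mut out = Vec::with_capacity(new_shebang.len() + content.len() - newline);
            out.extend_from_slice(new_shebang);
            // Keep the newline and everything after it.
            out.extend_from_slice(&content[newline..]);
            out
        }
        // No newline within the probe window: treat as not patchable.
        None => content.to_vec(),
    }
}
```

Because it never decodes the bytes as text, this variant treats binaries and non-UTF-8 sources the same way: anything that does not look like a shebang line is passed through untouched.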
Thanks for the info. I've added it to the PR. It is a bit wild; I wouldn't have expected `rules_python` to be the source.
The two behaviours identified as defects were exactly preserved from the prior implementation. This PR only fixes the defect it claims to — the venv builder choking on binary files in the scripts directory.
- Only inspecting the first 11 bytes, not looking for a newline (rules_py/py/tools/py/src/venv.rs, line 623 in 3ff3b51):
  `if content.starts_with("#!/dev/null") {`
- Adding our own shebang as a prefix, not replacing the existing one (rules_py/py/tools/py/src/venv.rs, line 624 in 3ff3b51):
  `content.replace_range(..0, &RELOCATABLE_SHEBANG);`
I do agree that it's a little odd to do things this way, and I'd be happy to work with you towards getting more correct shebang logic in place. (BTW, do you know of any packages that trigger the shebang substitution logic, so that we can cover all of this with a test?) But these behaviours are not what causes the issue I am seeking to address, so I do not think they should be part of this PR.
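For readers unfamiliar with `replace_range(..0, ...)`, the sketch below shows why the prior behaviour is a prefix rather than a replacement: `..0` is the empty range at the start of the string, so the original first line is kept. The value of `RELOCATABLE_SHEBANG` here is a stand-in chosen for illustration, since the real value is not shown in this thread, and `prefix_shebang` is a hypothetical wrapper around the two quoted lines.

```rust
/// Stand-in value for illustration only; the real RELOCATABLE_SHEBANG in
/// rules_py is not shown in this thread.
const RELOCATABLE_SHEBANG: &str = "#!/usr/bin/env python3\n";

/// Hypothetical wrapper mirroring the two quoted lines from venv.rs.
fn prefix_shebang(mut content: String) -> String {
    if content.starts_with("#!/dev/null") {
        // `..0` selects nothing, so this inserts at the front and leaves
        // the existing `#!/dev/null` line in place.
        content.replace_range(..0, RELOCATABLE_SHEBANG);
    }
    content
}
```

With the stand-in value, a script starting with `#!/dev/null` ends up with two shebang-looking lines, the new one first.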
If this patch set is unacceptable, the issue of binaries triggering `stream did not contain valid UTF-8` could alternatively be addressed by handling that error specifically. That looks like this: main...plobsing:rules_py:ignore_invalid_utf8.
I like that solution less because, while it handles more cases than are handled today, including the one I care about, it just feels less correct. In principle, a Python source file is not required to be UTF-8 (PEP 263 is still current and documented for recent Pythons, even if the feature is little used); the encoding assumption/assertion made by using `read_to_string` to process `bin` files, even only Python sources, just isn't great in general.
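As a rough idea of what "handling specifically that error" can look like, here is a sketch that is not taken from the linked branch. It relies on `read_to_string` reporting non-UTF-8 input as `io::ErrorKind::InvalidData`; the helper name `read_text_if_utf8` and the choice to return `None` for such files are assumptions made for illustration.

```rust
use std::io::{self, Read};

/// Hypothetical helper: read a script as text, returning Ok(None) when the
/// content is not valid UTF-8 instead of failing the whole operation.
fn read_text_if_utf8<R: Read>(reader: &mut R) -> io::Result<Option<String>> {
    let mut content = String::new();
    match reader.read_to_string(&mut content) {
        Ok(_) => Ok(Some(content)),
        // Non-UTF-8 input surfaces as InvalidData; treat it as "not text".
        Err(e) if e.kind() == io::ErrorKind::InvalidData => Ok(None),
        Err(e) => Err(e),
    }
}
```

This also shows the drawback raised above: a Python source in a non-UTF-8 encoding is classified the same way as a binary, so its shebang would be skipped.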
Changes are visible to end-users: yes
fix(py_venv): binaries in the scripts folder no longer crash venv builder
Test plan
Minimized repro including only a single, problematic package (`uv`) in the venv: uv_repro.zip