FR?: ctx.actions.symlink action time scales w/ size of input; should be constant #14125

@ghost

Description of the problem / feature request:

Action 1 generates a 2 GiB output, 2gb.out. Writing the output takes 0.75s, but postprocessing (checksumming?) in actuallyCompleteAction takes 5s.
Action 2 is a ctx.actions.symlink with target_file = ctx.file.2gb_out. Creating the symlink is seemingly instantaneous, but postprocessing in actuallyCompleteAction costs another 5s.

Feature requests: what underlying problem are you trying to solve with this feature?

rules_pkg's pkg_zip deprecated the out attr and now manages outputs internally with an implicit ctx.actions.symlink. In my build, that call to symlink adds 15s to the critical path.

Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

defs.bzl

def _dummy_impl(ctx):
    out_phys = ctx.actions.declare_file(ctx.outputs.out.basename + ".real")
    ctx.actions.run_shell(
        outputs = [out_phys],
        command = "dd if=/dev/zero of=$1 bs=1M count=2K",  # 2 GiB
        arguments = [out_phys.path],
        execution_requirements = {"local": ""},  # don't bother w/ sandboxing
    )

    ctx.actions.symlink(
        output = ctx.outputs.out,
        target_file = out_phys,
        # execution_requirements = {"local": ""},
    )

dummy = rule(
    implementation = _dummy_impl,
    attrs = {
        "out": attr.output(mandatory = True),
    },
)

BUILD

load(":defs.bzl", "dummy")

dummy(
    name = "dummy",
    out = "dummy.out",
    tags = ["no-cache"],
)

$ bazelisk build --profile=symlink.profile.gz :dummy
Starting local Bazel server and connecting to it...
INFO: Analyzed target //:dummy (4 packages loaded, 6 targets configured).
INFO: Found 1 target...
INFO: From Action dummy.out.real:
2048+0 records in
2048+0 records out
2147483648 bytes (2.1 GB, 2.0 GiB) copied, 0.755435 s, 2.8 GB/s
Target //:dummy up-to-date:
  .bazel/bin/dummy.out
INFO: Elapsed time: 13.959s, Critical Path: 11.40s
INFO: 3 processes: 2 internal, 1 local.
INFO: Build completed successfully, 3 total actions
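
To see where the critical-path time in the run above goes, the profile written by --profile can be summarized per event name. The following is a minimal sketch, assuming the profile is gzip-compressed JSON in the Chrome trace event layout (a top-level "traceEvents" list whose entries carry "name" and a "dur" in microseconds). The helper names total_seconds_by_name and slowest are hypothetical and not part of Bazel.

```python
import gzip
import json
from collections import defaultdict


def total_seconds_by_name(profile_path):
    """Sum the 'dur' field (assumed microseconds) of events, grouped by event name."""
    with gzip.open(profile_path, "rt", encoding="utf-8") as f:
        profile = json.load(f)
    # Assumption: events live under "traceEvents"; some trace files are a bare list.
    events = profile["traceEvents"] if isinstance(profile, dict) else profile
    totals = defaultdict(float)
    for event in events:
        # Metadata events have no duration and are skipped.
        if "dur" in event:
            totals[event.get("name", "?")] += event["dur"] / 1e6
    return dict(totals)


def slowest(profile_path, limit=10):
    """Return the (name, seconds) pairs with the largest summed duration."""
    totals = total_seconds_by_name(profile_path)
    return sorted(totals.items(), key=lambda kv: kv[1], reverse=True)[:limit]
```

Calling slowest("symlink.profile.gz") after the build above should list the most expensive spans first, which is one way to check whether the postprocessing step dominates both actions.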

What operating system are you running Bazel on?

Linux, CentOS 8

What's the output of bazel info release?

release 4.2.1 and release 5.0.0-pre.20210929.1 (via USE_BAZEL_VERSION=rolling)

If bazel info release returns "development version" or "(@Non-Git)", tell us how you built Bazel.

n/a

What's the output of git remote get-url origin ; git rev-parse master ; git rev-parse HEAD ?

n/a

Have you found anything relevant by searching the web?

No.

(Edit: Found issue #12158 and discovered undocumented options digest_function and unix_digest_hash_attribute_name. Perhaps these options or code adjacent to them could be leveraged for this?)

Any other information, logs, or outputs that you want to share?

  • I assume that the cost of actuallyCompleteAction boils down to checksumming the target of the output symlink. Since Bazel already has a checksum for the input, I would hope that this value could be reused.
  • In addition to the rules_pkg use case, skylib's copy_file is optionally a wrapper for ctx.actions.symlink, so more users may encounter this than you'd think.
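
The reuse idea in the first bullet can be illustrated outside Bazel. This is only a sketch of the concept, not Bazel's implementation: hashing through a symlink reads every byte of the target, so the cost grows with file size, while a cache keyed on the resolved file's identity lets the symlink reuse the digest already computed for its target. DigestCache and its key (device, inode, size, mtime) are hypothetical choices made for this example.

```python
import hashlib
import os


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file's contents; open() follows symlinks, so every target byte is read."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


class DigestCache:
    """Hypothetical cache: reuse a digest while the resolved file is unchanged."""

    def __init__(self):
        self._by_identity = {}
        self.hashes_computed = 0

    def digest(self, path):
        # os.stat follows symlinks, so a link and its target map to the same key.
        st = os.stat(path)
        key = (st.st_dev, st.st_ino, st.st_size, st.st_mtime_ns)
        if key not in self._by_identity:
            self._by_identity[key] = sha256_of(path)
            self.hashes_computed += 1
        return self._by_identity[key]
```

With such a cache, asking for the digest of the generated file and then of a symlink pointing at it hashes the contents once; the second lookup costs a single stat call regardless of file size.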

Labels

  • P2: We'll consider working on this in future. (Assignee optional)
  • team-Rules-API: API for writing rules/aspects: providers, runfiles, actions, artifacts
  • type: feature request