Add build rules for Python arrow APIs #4

@CodingCanuck

I'm experimenting with Arrow's Python APIs for its Plasma Object Store: https://arrow.apache.org/docs/python/plasma.html

I don't currently know how to get this building with Bazel. I suspect that the build target produced by https://github.com/3rdparty/bazel-rules-arrow/blob/main/arrow.BUILD only provides the dependencies needed for Arrow's C++ Plasma API.

I'm working on a branch based on Ben's earlier prototype, which predates this Arrow rules repo. Naively depending on the target produced by this repo's pattern from a py3_image target produces the following error:

in deps attribute of py_binary rule //tests/test_plasma/client_python:client.binary: '@com_github_apache_arrow//:arrow' does not have mandatory providers: 'py' or 'PyInfo'. Since this rule was created by the macro 'py3_image', the error might have been caused by the macro implementation

I suspect we'll need to add new build targets for Python, though I have no idea what that looks like.

I hope that Arrow's Python docs can get us started: https://arrow.apache.org/docs/developers/python.html#building-on-linux-and-macos

Note that because Arrow / Plasma use Cython, the first step in building their Python code is to build their C++ code: https://arrow.apache.org/docs/developers/python.html#build-and-test

After building their C++ code (which their Python / Cython code depends on), their docs say that you can then build their pyarrow code (see: "Now, build pyarrow" in the link to their Python docs above).

At a glance, their recommended build process looks like it will require some work to bazelify: they build their Python code using a custom build step in a setup.py that they provide.
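
To make the ordering concrete, here is a rough Python sketch of the two-stage build described above, written as data so it could later be turned into a Bazel genrule command or a repository rule. It is only a sketch under assumptions: the helper names (`pyarrow_build_steps`, `render_shell`) are hypothetical, the `cpp/build` directory layout is assumed, and the CMake options and environment variables (`ARROW_PYTHON`, `ARROW_PLASMA`, `ARROW_HOME`, `PYARROW_WITH_PLASMA`) are taken from a reading of the Arrow developer docs and should be checked against whichever Arrow version is pinned.

```python
import shlex
from pathlib import Path


def pyarrow_build_steps(arrow_src, install_prefix):
    """Return (cwd, env_overrides, argv) tuples for the two-stage build.

    Hypothetical helper: nothing is executed here, the steps are only described.
    """
    src = Path(arrow_src)
    cpp_build = src / "cpp" / "build"  # assumed out-of-tree build directory
    prefix = str(install_prefix)
    return [
        # Stage 1: configure, build, and install the C++ libraries first,
        # because the Cython extensions link against them.
        (src / "cpp", {}, ["mkdir", "-p", "build"]),
        (cpp_build, {}, ["cmake", f"-DCMAKE_INSTALL_PREFIX={prefix}",
                         "-DARROW_PYTHON=ON", "-DARROW_PLASMA=ON", ".."]),
        (cpp_build, {}, ["make", "install"]),
        # Stage 2: build pyarrow via the custom build step in setup.py,
        # pointing it at the C++ install from stage 1.
        (src / "python",
         {"ARROW_HOME": prefix, "PYARROW_WITH_PLASMA": "1"},
         ["python", "setup.py", "build_ext", "--inplace"]),
    ]


def render_shell(steps):
    """Render the steps as one shell line per step, e.g. for a genrule cmd."""
    lines = []
    for cwd, env, argv in steps:
        env_prefix = "".join(
            f"{key}={shlex.quote(value)} " for key, value in sorted(env.items())
        )
        lines.append(
            f"(cd {shlex.quote(str(cwd))} && {env_prefix}{shlex.join(argv)})"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_shell(pyarrow_build_steps("arrow", "/tmp/arrow-install")))
```

Even if this runs, the result is a built extension on disk, not something that carries the `PyInfo` provider; a Bazel wrapper would presumably still need to expose the outputs through a `py_library` (or similar) before `py3_image` would accept it.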

@ArthurBandaryk do you have any experience with parts of this (building python libraries / cython / the non-C++ parts of Arrow, or wrapping custom build processes in bazel)? I'll scratch at this, but if you have any experience here I'd love a hand :)
