@ilevkivskyi
Contributor

No description provided.

ilevkivskyi referenced this pull request in python/mypy Nov 17, 2025
Vendor optimized base64 implementation from
https://github.com/aklomp/base64.
This is based on commit 9e8ed65048ff0f703fad3deb03bf66ac7f78a4d7 (May
2025).

Enable SIMD on macOS (64-bit ARM only). Other platforms probably use a
generic version. I'll look into enabling SIMD more generally in a
follow-up PR.

A `b64encode` micro-benchmark ran up to 11 times faster than the
stdlib `base64` module (on a MacBook Pro).
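
For readers who want to reproduce a comparison like this, here is a minimal timing sketch. It is not the benchmark used for the numbers above: the payload size, repeat counts, and the `bench` helper are all assumptions, and it only times the stdlib encoder. Any other encoder with the same `bytes -> bytes` signature can be passed in the same way.

```python
import base64
import timeit
from typing import Callable


def bench(encode: Callable[[bytes], bytes], payload: bytes, number: int = 200) -> float:
    """Return the best observed per-call time in seconds (hypothetical helper)."""
    timings = timeit.repeat(lambda: encode(payload), number=number, repeat=5)
    # The minimum is the least noisy estimate; divide by calls per repeat.
    return min(timings) / number


# 64 KiB of sample data; the size is an arbitrary choice for illustration.
payload = bytes(range(256)) * 256

stdlib_time = bench(base64.b64encode, payload)
print(f"stdlib base64.b64encode: {stdlib_time * 1e6:.1f} us per call")

# To compare against another implementation, time it with the same payload,
# e.g. bench(other_b64encode, payload), and divide the two results.
```

Speedups measured this way depend heavily on payload size and hardware, so a single ratio should be read as an upper bound for one machine rather than a general figure.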
if compiler.compiler_type == "unix":
    cflags += ["-O3"]
    if X86_64:
        cflags.append("-msse4.2")  # Enable SIMD (see also mypyc/build.py)
Contributor

No auto-detection? I'll have to patch this out for Debian; the amd64 baseline is SSE2.

Contributor

Suggested change (remove this line):
cflags.append("-msse4.2")  # Enable SIMD (see also mypyc/build.py)

Auto-detection is already in place, so -msse4.2 is not needed: https://github.com/mypyc/librt/pull/14/files#diff-25f140cc37c41f9eb31ca08bca258a6bb6f86e404efde9344762e6aaac4d5494R188-R203

Contributor Author

Looking at the original mypy PR that added this (python/mypy#20244), I guess it was necessary for some compilers. Maybe @JukkaL can clarify.

Contributor

The build fails without this, since SSE4.2 features can't be used without the flag, at least on Ubuntu 24.04. In particular, the C files whose path contains arch/sse may need this flag. The underlying base64 library supports auto-detection, but we'd still need to compile the SSE-related files with this flag, and setuptools doesn't make it straightforward to use different C compiler flags for specific files.
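
One common workaround for the setuptools limitation is to wrap the compiler object's per-file compile step and append flags based on the source path. The sketch below illustrates that idea only; it is not taken from this PR or from python/mypy#20253. The `ARCH_FLAGS` table, the `sse42` directory name, and both helper functions are hypothetical, and the wrapped signature assumes the private `_compile(obj, src, ext, cc_args, extra_postargs, pp_opts)` method of the Unix compiler class, which may change between setuptools versions.

```python
from pathlib import PurePosixPath
from typing import Callable, Dict, List

# Hypothetical table: directory name under arch/ -> flags its sources need.
# Only the pairing of SSE4.2 code with -msse4.2 comes from the discussion.
ARCH_FLAGS: Dict[str, List[str]] = {"sse42": ["-msse4.2"]}


def extra_flags_for(source: str) -> List[str]:
    """Return extra compiler flags for one source file, based on its path."""
    parts = PurePosixPath(source.replace("\\", "/")).parts
    # Look for an "arch" directory followed by a known subdirectory name.
    for i, part in enumerate(parts[:-1]):
        if part == "arch" and parts[i + 1] in ARCH_FLAGS:
            return list(ARCH_FLAGS[parts[i + 1]])
    return []


def with_per_file_flags(compile_one: Callable) -> Callable:
    """Wrap a per-file compile callable so it appends path-specific flags."""

    def wrapper(obj, src, ext, cc_args, extra_postargs, pp_opts):
        postargs = list(extra_postargs or []) + extra_flags_for(src)
        return compile_one(obj, src, ext, cc_args, postargs, pp_opts)

    return wrapper


# Assumed usage inside a build_ext subclass, before building extensions:
#     self.compiler._compile = with_per_file_flags(self.compiler._compile)
```

With this approach the global `cflags` list stays at the baseline (for example SSE2 on Debian amd64), and only the files that contain SSE4.2 intrinsics are built with `-msse4.2`. Runtime auto-detection in the base64 library then decides whether that code path is actually taken.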

Contributor

Here is a fix for per-file compile flags: python/mypy#20253

Contributor

@ilevkivskyi @JukkaL The fix for per-file base64 compilation flags is ready for review and merging: python/mypy#20253
