Tracking FASTCALL/VECTORCALL improvements (3324) #1647

@GalacticEmperor1

Description

Issue №3324 opened by ankith26 at 2022-07-20 05:13:10

Recently, a new contributor @itzpr3d4t0r has expressed a lot of interest in using FASTCALL to speed up some pygame functions (thanks a lot for doing this!). I thought it would be a good idea to open an issue to track progress on this, and also to put out some info and have discussions about it.

To explain what FASTCALL is in short: it is a newer and faster Python calling convention. In the older VARARGS calling convention, C API functions expect to receive their arguments as a tuple. On the CPython side, this adds a small overhead to each call, because the interpreter has to build a temporary tuple object from the arguments the user passes. With FASTCALL, the arguments come in directly as a C array of Python object pointers.
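As a rough illustration of the difference described above, here is a toy model in pure Python. It is not how CPython is implemented; `call_varargs`, `call_fastcall`, the two `add_*` callees, and the `stack` list are all hypothetical names standing in for the interpreter's call machinery and its argument storage.

```python
# Toy model only: the real conventions live in C (METH_VARARGS vs METH_FASTCALL).

def add_varargs(args):
    """VARARGS-style callee: receives one tuple holding the arguments."""
    a, b = args
    return a + b

def add_fastcall(stack, start, nargs):
    """FASTCALL-style callee: receives the caller's storage plus a count."""
    if nargs != 2:
        raise TypeError("expected 2 arguments")
    return stack[start] + stack[start + 1]

def call_varargs(func, stack, nargs):
    # Extra step: build a temporary tuple from the last nargs stack entries.
    args = tuple(stack[len(stack) - nargs:])
    return func(args)

def call_fastcall(func, stack, nargs):
    # No temporary container: the callee reads the arguments in place.
    return func(stack, len(stack) - nargs, nargs)

stack = ["unrelated", 3, 4]
result_varargs = call_varargs(add_varargs, stack, 2)
result_fastcall = call_fastcall(add_fastcall, stack, 2)
```

Both calls compute the same sum; the only difference in this model is whether a temporary tuple is built before the callee runs.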

There are also a couple of issues here. The key one is that there is no clean C API for dealing with this. Handling positional args becomes a bit more tedious, and handling keyword args becomes a huge pain. Handling keyword args also adds a bit of its own overhead (if we do get to the stage where we have a custom API to handle this).
A more minor issue, which can be worked around, is that FASTCALL is Python 3.7+ only, and at the time of writing we still support Python 3.6. We can put 3.6-only wrappers around our FASTCALL functions and export them as regular VARARGS functions. Obviously, this means the optimisations won't actually take effect on Python 3.6 (and in a few cases performance may also regress a tiny bit there), but we don't have to worry about this as long as the functions keep working on Python 3.6 while we support it.
When we do finally drop 3.6 support, the wrapper macros should be easy enough to remove.

Recently pygame has been porting more and more of its functions to support keyword args, but by applying FASTCALL we might be going against that. So I propose that we should not try to make everything FASTCALL, or make everything support kwargs, but instead strike a balance between both. There are many places where kwargs support is a no-brainer. But there are places where FASTCALL has some good benefits.

Here are a few things off the top of my head that should help decide where we can use this, and where we shouldn't:

  • In general, the FASTCALL optimisation best benefits short-running pygame functions that are called many times in your average game loop. If a function is long-running, or is only expected to be called at the start of programs (like resource loading functions), applying the FASTCALL optimisation tricks does not have much of an impact.
  • For functions dealing with a lot of arguments or many heterogeneous arguments (places where adding keyword arguments makes things cleaner), we should (in general) refrain from applying FASTCALL without keyword support. If the function happens to fit the first criterion and users are rooting for optimisations on it, we can consider implementing more specialised versions of the function that can be FASTCALLed more elegantly.
  • If a function already supports keyword args and applying FASTCALL to it would mean an API break, we should obviously refrain from doing so. As mentioned above, we can consider implementing a new and specialised version of the function if needed.
  • Another obvious thing: we should not apply FASTCALL in places where doing so will slow things down. A good example is functions that expect a tuple object and handle it best. It does not make sense to use FASTCALL if we are going to convert the C array to a Python tuple anyway.

This issue has been opened for us to list functions where applying FASTCALL would be acceptable (and hopefully to get input from more people on which functions that fit the above criteria need optimising).


Comments

# # itzpr3d4t0r commented at 2022-07-20 08:07:37

I totally agree! I'll link some PRs to this in a moment. Also I'd like to make a little list for where to use FASTCALL, as it might be helpful. To follow this list you'll have to take into consideration everything said in the previous comment.

  • Short-running, frequently called functions, or new specialized ones
  • Functions with a small number of positional-only arguments (2-3)
  • Functions that do not support kwargs (METH_VARARGS only)
  • Functions with an arbitrary number of positional parameters (still a small number, otherwise keywords would be better)
  • Longer-running, performance-intensive functions that are frequently called and could be hindered by the lack of FASTCALL

Remember that functions with a single non-keyword parameter are best suited to METH_O, a Python calling convention in which the C function receives a pointer to self and a single PyObject * representing the one argument. Refrain from using FASTCALL in those cases.
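Seen from Python, a single-argument function of this kind behaves like the sketch below. `len` is used as a stand-in here, on the assumption that CPython implements it with METH_O; `call_rejected` is a hypothetical helper. The point is only that such a function takes exactly one positional argument and rejects keywords.

```python
def call_rejected(func, *args, **kwargs):
    """Return True if calling func with these arguments raises TypeError."""
    try:
        func(*args, **kwargs)
    except TypeError:
        return True
    return False

one_positional = len([1, 2, 3])                       # the normal call
keyword_rejected = call_rejected(len, obj=[1, 2, 3])  # keywords are refused
two_args_rejected = call_rejected(len, [1], [2])      # so is a second argument
```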

Right now it's possible to have a METH_FASTCALL | METH_KEYWORDS function that also accepts keyword arguments, but at the time of writing there's no standard way to parse those keyword arguments, so refrain from using that. If and when it becomes possible to parse those keywords effectively, it will make sense to have some functions use this calling convention.


# # illume commented at 2022-08-20 19:13:00

The keyword arguments issue: pygame/pygame#808

I have these questions to help us decide if the trade off of not using keyword arguments is worth it for us:

  • is usability affected if we include named function arguments in .pyi files, but have the actual functions be positional only? Do IDEs like vscode/sublime anaconda/vim/russiabrains/mu-editor/Thonny/IDLE/etc work ok?
  • are there any micro benchmarks yet for any of the changes?
  • any macro benchmarks for games that use any of the sped up functions? (slow-ARM, and desktop/laptop CPUs)
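To make the first question concrete, here is a small sketch of what a user would run into. `move` is a hypothetical function, and the `/` marker (Python 3.8+) stands in for a C function that only accepts positional arguments. If a stub advertised `x` and `y` as ordinary named parameters, an editor could suggest the keyword call, which would then fail at runtime.

```python
def move(x, y, /):
    """Hypothetical stand-in for a positional-only C function."""
    return (x, y)

# The positional call works as expected.
positional_result = move(1, 2)

# The keyword call that a misleading stub would invite raises TypeError.
try:
    move(x=1, y=2)
except TypeError as exc:
    keyword_error = str(exc)
else:
    keyword_error = None
```

A stub can express the same restriction by using the `/` marker in its own signature, so that type checkers which understand positional-only parameters can flag the keyword call before it runs.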

There are readability benefits for people using keyword arguments, often in cases where there are many arguments, or the arguments aren't extremely obvious. When the arguments are not obvious, and there are more than 2... the readability benefit is very large.


# # itzpr3d4t0r commented at 2022-08-20 19:38:57

> I have these questions to help us decide if the trade off of not using keyword arguments is worth it for us:

We are not saying to stop using keyword arguments, nor to change existing keyword functions to FASTCALL, just to use keyword functions a bit more judiciously. They are indeed useful, but not everywhere, and they can really hinder performance where they are not needed. Generally speaking, my view is that the most popular programming languages don't even have this keyword functionality, and functions are taught as positional, so while keywords are cool, they should be used where it's clear they are useful.

> are there any micro benchmarks yet for any of the changes?
> any macro benchmarks for games that use any of the sped up functions? (slow-ARM, and desktop/laptop CPUs)

About the concern that these optimizations would not have a great impact on overall game performance in terms of framerate: the only thing to say is that every little step counts, pyramids weren't built in a day! They are not even small steps. FASTCALL alone can bring 30-40% faster performance for short-running functions, not to mention that it can also simplify the C implementation. Many VARARGS functions parse arguments out of the tuple, or exploit that tuple in a lazy way (see the rect functions), instead of having an implementation that's based on the number of arguments passed.
We'll for sure have benchmark results and code published for each case.
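A micro benchmark for one of these changes could look roughly like the sketch below. `bench` is a hypothetical helper and `math.hypot` is only a stand-in for a short-running function; a real comparison would run the same script against builds from before and after the change.

```python
import math
import timeit

def bench(func, *args, number=100_000, repeat=5):
    """Return the best per-call time in seconds for func(*args)."""
    timings = timeit.repeat(lambda: func(*args), number=number, repeat=repeat)
    # The minimum is the run least disturbed by other activity on the machine.
    return min(timings) / number

per_call = bench(math.hypot, 3.0, 4.0)
print(f"math.hypot: {per_call * 1e9:.1f} ns per call")
```

The lambda adds its own constant overhead to every measurement, so only the difference between two runs of the same script is meaningful, not the absolute number.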

> When the arguments are not obvious, and there are more than 2... the readability benefit is very large.

Indeed, we should definitely have that, but you should also consider whether the arguments from the third onwards are really worth making keyword arguments, as they could be "must pass it" arguments. Generally speaking, if readability is an issue, pretty much every IDE today lets you see the list of parameter names by hovering the mouse over the function call.

> ...where there are many arguments, or the arguments aren't extremely obvious

Can you elaborate? Is there really a case where an argument is not obvious and adding a name= to the function call makes it instantly obvious? I struggle to see a case like that, especially because there's documentation for functions and objects, and if something isn't clear you go read the docs; making it a keyword won't help.


# # MyreMylar commented at 2022-08-29 09:26:35

Caught up on some of the background here briefly.

From what I can make out, it seems FASTCALL was added a while ago, then the CPython devs had a bit of a rethink, didn't like the name and came up with Vectorcall. See:

https://docs.python.org/3/c-api/call.html

FASTCALL is in 3.7+ and the Vectorcall API is fully available from 3.9 (but is actually in 3.8 too). It seems they aren't getting rid of FASTCALL either. I'm not clear on whether the FASTCALL functions do some things Vectorcall doesn't, but in general they are definitely related to the overall Python speedup project, with Vectorcall being the newer (and preferred) API from 3.9 onwards.

I suspect we should revisit the places where we are using FASTCALL once we have a minimum version of 3.8, because it sounds like Vectorcall will be less likely to have issues in the future and may receive more attention/improvements.

Metadata

Labels: Performance (related to the speed or resource usage of the project)
