
🏷️ added additional overloads and improved existing stubs of block-related sparse construction functions #775


Merged
merged 15 commits into scipy:master from improved-block-functions on Aug 3, 2025

Conversation

JulVandenBroeck
Contributor

I added all formats for the final functions in _construct.pyi.

Let me know if you spot any errors! 😄

@JulVandenBroeck
Contributor Author

Looks like I've got some mypy errors. I'll fix them later.

@JulVandenBroeck
Contributor Author

Most errors are related to mypy not recognizing [[<sparse array>], [None]] as type list[list[<sparse array> | None]]; instead, the type is inferred as list[object]. It does work for pyright, though.
Removing the None entries from the test cases is a simple workaround.

There is still an error left that I have to investigate.
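For reference, this is a minimal sketch of the kind of expression that trips mypy up; the names are illustrative, not the actual test fixtures:

```python
from scipy.sparse import csr_array

csr_arr = csr_array([[1, 2], [3, 4]])

# The block-construction stubs receive nested lists like this one. pyright
# keeps the element types separate, so the argument stays compatible with a
# Sequence[Sequence[csr_array | None]]-style parameter; mypy joins the
# element types and infers list[object], which no longer matches the
# intended overload.
blocks = [[csr_arr], [None]]
```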

@JulVandenBroeck
Contributor Author

Okay, the last error is basically the same issue. [[csr_arr], [dok_arr]] is recognized as list[object] again. I can also just remove this test case.

@JulVandenBroeck
Contributor Author

JulVandenBroeck commented Jul 30, 2025

mypy < pyright? 👀

@jorenham
Member

jorenham commented Jul 30, 2025

mypy < pyright? 👀

In my experience that's indeed often the case, but as we've recently seen, pyright has its issues too. I have high hopes for ty, though. That said, mypy has improved a lot over the last year, which is certainly appreciated.

@jorenham
Member

Most errors are related to mypy not recognizing [[<sparse array>], [None]] as type list[list[<sparse array> | None]]; instead, the type is inferred as list[object]. It does work for pyright, though. Removing the None entries from the test cases is a simple workaround.

There is still an error left that I have to investigate.

That's probably a classic case of "join vs union", which the (based)pyright docs explain quite nicely: https://docs.basedpyright.com/v1.18.2/usage/mypy-comparison/#unions-vs-joins

I believe that mypy intends to address this class of issues though: python/mypy#12056.
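Stripped of the sparse-specific details, the difference boils down to something like this (a minimal illustration, not code from this PR):

```python
from typing import assert_type

# Without an expected type to guide inference, the two checkers disagree on
# the element type of a heterogeneous list literal:
x = [1, "a"]

assert_type(x, list[int | str])  # accepted by pyright, which infers the union
# mypy infers list[object] (the *join* of int and str) instead, so the same
# assertion fails there with an [assert-type] error.
```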

@jorenham
Member

Okay, the last error is basically the same issue. [[csr_arr], [dok_arr]] is recognized as list[object] again. I can also just remove this test case.

You could also leave it in with a # type: ignore[assert-type], and a comment explaining why mypy is behaving incorrectly (i.e. join vs union).
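Something along these lines, presumably; block_array and its coo_array return type are used here only as an illustration, and the real test case may target a different function:

```python
from typing import assert_type

from scipy.sparse import block_array, coo_array, csr_array, dok_array

csr_arr = csr_array([[1.0, 0.0], [0.0, 2.0]])
dok_arr = dok_array((2, 2))

# mypy joins the nested element types to `object` (join vs union, see the
# basedpyright docs linked above), so it can't match the sparse-array
# overload; keep the test case and suppress just that one error.
assert_type(block_array([[csr_arr], [dok_arr]]), coo_array)  # type: ignore[assert-type]
```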

@JulVandenBroeck
Contributor Author

Ok, I like that approach. I'll restore the problematic tests

Member

@jorenham jorenham left a comment


Does the block_diag return type (matrix vs array) still depend on the input, or did that maybe change in the meantime?

@JulVandenBroeck
Contributor Author

JulVandenBroeck commented Jul 30, 2025

Does the block_diag return type (matrix vs array) still depend on the input, or did that maybe change in the meantime?

The documentation states the following:

If at least one input is a sparse array, the output is a sparse array. Otherwise the output is a sparse matrix.

So I thought it would be most accurate to create an overload for all possible combinations of matrices, numpy arrays, and sequences, while sparse arrays are handled in a separate overload. I haven't found a way to get the right behavior for "a list with at least one element of type x"; see also my comment here: #733 (comment).
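Schematically, the idea looks like the stub-style sketch below; the parameter and return types are simplified placeholders rather than the actual scipy-stubs aliases:

```python
from collections.abc import Sequence
from typing import overload

import numpy as np
import numpy.typing as npt

from scipy.sparse import coo_array, coo_matrix, sparray, spmatrix

# "All inputs are sparse arrays" -> sparse array out. A true "at least one
# sparse array" rule can't be expressed, so mixed inputs fall through to the
# matrix overload below.
@overload
def block_diag(
    mats: Sequence[sparray], format: str | None = None, dtype: npt.DTypeLike | None = None
) -> coo_array: ...
@overload
def block_diag(
    mats: Sequence[spmatrix | npt.NDArray[np.generic] | Sequence[float]],
    format: str | None = None,
    dtype: npt.DTypeLike | None = None,
) -> coo_matrix: ...
```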

@jorenham
Member

I haven't found a way to get the right behavior for "a list with at least one element of type x"

Yea, I don't think that's something we can express in the Python type system. I suppose you could brute-force-engineer something like that for tuples, but that wouldn't be very pretty.
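For illustration, the tuple brute force could look something like this, and it indeed gets ugly fast (the alias and return types are placeholders):

```python
from typing import TypeAlias, overload

import numpy as np
import numpy.typing as npt

from scipy.sparse import coo_array, sparray, spmatrix

_Other: TypeAlias = spmatrix | npt.NDArray[np.generic]

# One overload per position where the sparse array may appear, for every
# tuple length you want to support -- the combinations pile up quickly.
@overload
def block_diag(mats: tuple[sparray]) -> coo_array: ...
@overload
def block_diag(mats: tuple[sparray, sparray | _Other]) -> coo_array: ...
@overload
def block_diag(mats: tuple[_Other, sparray]) -> coo_array: ...
# ... and so on for lengths 3, 4, ...
```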

@JulVandenBroeck
Contributor Author

I did try creating a Protocol that contains __getitem__ returning x | y, but that approach only works for (based)pyright, again due to the union vs. join behavior.
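Roughly, the attempt looked like this; it's a reconstruction with placeholder names, not the exact code I tried:

```python
from typing import Protocol

from scipy.sparse import coo_array, sparray, spmatrix


class _CanGetBlock(Protocol):
    def __getitem__(self, index: int, /) -> sparray | spmatrix: ...


# pyright infers a mixed literal like [csr_arr, coo_mat] as a list of the
# union of its element types, whose __getitem__ return is assignable to
# sparray | spmatrix, so it satisfies the protocol. mypy joins the elements
# to list[object], whose __getitem__ returns object, so the match fails.
def block_diag(mats: _CanGetBlock) -> coo_array: ...
```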

@JulVandenBroeck JulVandenBroeck requested a review from jorenham July 30, 2025 13:39
@jorenham
Member

jorenham commented Jul 30, 2025

#777 and #778 seem to have caused some merge conflicts

@JulVandenBroeck
Contributor Author

I think I did something wrong with git... Hopefully it didn't screw anything up.

@JulVandenBroeck
Contributor Author

Looks good, but np.int32 gets interpreted as np.signedinteger[Any] in some Python/NumPy combinations. This no longer seems to occur from NumPy 2.2 onwards.

@jorenham
Member

I think I did something wrong with git... Hopefully it didn't screw anything up.

Hmm yea, those commits don't look right. Did you by any chance rebase and then git push instead of git push -f?

@JulVandenBroeck JulVandenBroeck force-pushed the improved-block-functions branch from 779f04a to cfa2252 Compare July 31, 2025 17:11
@JulVandenBroeck
Contributor Author

JulVandenBroeck commented Jul 31, 2025

Hmm yea, those commits don't look right. Did you by any chance rebase and then git push instead of git push -f?

I think you nailed it, but now I'm in too deep apparently 😆.
Will it cause problems with the merge? Otherwise, I'll make a new branch and corresponding PR...

@jorenham
Member

Hmm yea, those commits don't look right. Did you by any chance rebase and then git push instead of git push -f?

I think you nailed it, but now I'm in too deep apparently 😆.
Will it cause problems with the merge? Otherwise, I'll make a new branch and corresponding PR...

Haha no worries, I'll squash merge them

@jorenham jorenham self-requested a review August 2, 2025 15:56
@jorenham
Member

jorenham commented Aug 2, 2025

Hmm, I just compared the timing results of uv run basedpyright --stats --verbose scipy-stubs/sparse, and this change increases the analysis time of scipy-stubs/sparse/_construct.pyi from 1038ms to 1316ms (+27%). Similarly, it increases the analysis time of tests/sparse/test_construct.pyi from 847ms to 964ms (+14%). There's little variation when repeating the measurements (±25ms at most), so this looks like a genuine performance regression rather than measurement noise.

I don't think the test slowdown is any reason for concern, especially considering the tests that were added. But the +300ms for _construct.pyi itself can be pretty annoying for DX: it sends my laptop fan into helicopter mode for a couple of seconds each time I edit _construct.pyi with the basedpyright extension enabled, which wasn't the case before.

For context, these are the top 10 slowest analysis times per file in scipy-stubs:

1314ms: scipy-stubs/sparse/_construct.pyi
 919ms: scipy-stubs/signal/_signaltools.pyi
 460ms: scipy-stubs/linalg/_basic.pyi
 279ms: scipy-stubs/sparse/_base.pyi
 278ms: scipy-stubs/special/_ufuncs.pyi
 252ms: scipy-stubs/stats/_distribution_infrastructure.pyi
 227ms: scipy-stubs/sparse/linalg/_dsolve/linsolve.pyi
 221ms: scipy-stubs/sparse/_dok.pyi
 197ms: scipy-stubs/stats/_entropy.pyi
 190ms: scipy-stubs/ndimage/_filters.pyi
 190ms: scipy-stubs/special/_logsumexp.pyi

So even before this change, _construct.pyi was already the bottleneck, and this increases its bottlenecknessitude by +25% or so.

I'm also worried that this will affect scipy-stubs users who use scipy.sparse a lot. These kinds of things tend to snowball into additional seconds or minutes of CI runtime, and that can be pretty costly (and very annoying in general).


I'll see if I can figure out an easy way to improve the analysis time. Sometimes it's as simple as tweaking a single type alias or swapping some overloads, so let's hope that's also the case here. I suspect it has to do with the overload-overlap detection algorithm, which has exponential time complexity in the number of overloads and the number of protocol members of the input types. So it might help to replace a couple of onp.CanArrayND with onp.CanArray. But to be honest, optimizing for type-checker performance isn't something I have much experience with, so I might very well be wrong here 🤷🏻.

Anyway, it's just something to keep in mind.


It would be pretty cool to have a script that warns about performance regressions in CI 🤔. I guess I'll open an issue for it.

@jorenham
Member

jorenham commented Aug 2, 2025

Ok, so there are two easy ways to bring down the analysis time a bit:

  1. Set _ToArray1D2D: TypeAlias = onp.CanArray[tuple[int] | tuple[int, int], np.dtype[_SCT]] | Seq[_SCT | Seq[_SCT]], which shaves off ~170 ms or so.
  2. Create a new _ToDType: TypeAlias = type[complex | _Numeric] | np.dtype[_Numeric] | str, and replace all npt.DTypeLike with _ToDType, for an improvement of ~120 ms.

Since these changes aren't directly related to this PR, I first thought I'd create a separate PR for this. But fix 2. in particular would've caused nasty merge conflicts. And even though it's not very clean, it's fine by me to put these performance tweaks in this PR. Those two should be enough to counteract the +300 ms :)
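Spelled out, those two tweaks look roughly like this. The helper names _SCT, _Numeric, and Seq live elsewhere in _construct.pyi, so plausible placeholder definitions are included here:

```python
from collections.abc import Sequence as Seq
from typing import Any, TypeAlias, TypeVar

import numpy as np
import optype.numpy as onp

_SCT = TypeVar("_SCT", bound=np.generic)          # placeholder bound
_Numeric: TypeAlias = np.number[Any] | np.bool_   # placeholder definition

# 1. array-likes with a 1-D or 2-D shape, or a (nested) sequence of scalars
_ToArray1D2D: TypeAlias = (
    onp.CanArray[tuple[int] | tuple[int, int], np.dtype[_SCT]] | Seq[_SCT | Seq[_SCT]]
)

# 2. a narrower replacement for npt.DTypeLike
_ToDType: TypeAlias = type[complex | _Numeric] | np.dtype[_Numeric] | str
```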

@JulVandenBroeck
Contributor Author

Thanks for the detailed analysis!

It would be pretty cool to have a script that warns about performance regressions in CI 🤔. I guess I'll open an issue for it.

Indeed; I didn't notice this slowdown, so this warning would be quite nice!

Since these changes aren't directly related to this PR, I first thought I'd create a separate PR for this. But fix 2. in particular would've caused nasty merge conflicts. And even though it's not very clean, it's fine by me to put these performance tweaks in this PR. Those two should be enough to counteract the +300 ms :)

Sounds good! I'll start implementing these suggestions soon :)

@jorenham jorenham added this to the 1.16.1.1 milestone Aug 3, 2025
@jorenham jorenham merged commit 5b09c4c into scipy:master Aug 3, 2025
18 checks passed
@jorenham
Member

jorenham commented Aug 3, 2025

Ok great; thanks Jul!

@JulVandenBroeck JulVandenBroeck deleted the improved-block-functions branch August 6, 2025 09:09