-
Notifications
You must be signed in to change notification settings - Fork 39
Linting and Formatting: adding Pylint checks to Ruff #1025
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Signed-off-by: Laurynas Jagutis <laurynas.jagutis@alliander.com>
Signed-off-by: Laurynas Jagutis <laurynas.jagutis@alliander.com>
Signed-off-by: Laurynas Jagutis <laurynas.jagutis@alliander.com>
Signed-off-by: Laurynas Jagutis <laurynas.jagutis@alliander.com>
Signed-off-by: Laurynas Jagutis <laurynas.jagutis@alliander.com>
Signed-off-by: Laurynas Jagutis <laurynas.jagutis@alliander.com>
Signed-off-by: Laurynas Jagutis <laurynas.jagutis@alliander.com>
Signed-off-by: Laurynas Jagutis <laurynas.jagutis@alliander.com>
Signed-off-by: Laurynas Jagutis <laurynas.jagutis@alliander.com>
Signed-off-by: Laurynas Jagutis <laurynas.jagutis@alliander.com>
Signed-off-by: Laurynas Jagutis <laurynas.jagutis@alliander.com>
Signed-off-by: Laurynas Jagutis <laurynas.jagutis@alliander.com>
Signed-off-by: Laurynas Jagutis <laurynas.jagutis@alliander.com>
Signed-off-by: Laurynas Jagutis <laurynas.jagutis@alliander.com>
Signed-off-by: Laurynas Jagutis <laurynas.jagutis@alliander.com>
Signed-off-by: Laurynas Jagutis <laurynas.jagutis@alliander.com>
batch_ndim = 2 | ||
|
||
actual_is_batch = actual_ndim == batch_ndim | ||
actual_batch_size = shape[0] if actual_is_batch else 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe a better name would be something explicitly suggesting 2d? and then in line 192, use the same name but with 1d explicit instead of ... else 1
? I don't know of an exact name, but reflecting the value in it would be nice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now I still kept batch_ndim = 2, but I have added singular_ndim = 1.
If you think that for example these would be more clear:
batch_2dim = 2 and singular_1dim = 1 I could edit them
src/power_grid_model/_core/utils.py
Outdated
if array.ndim == 2: | ||
if array.ndim == batch_ndim: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar to above, maybe we can have some "globals" indicating the dimension of batch data, regular data, etc. I imagine these magic numbers (1, 2, 3) referring to the possible dimensions are spread in many places.
If in the end batch_dim = 2
is just clear, let's define it that way then and put in somewhere. And then come with something for the other two.
Also, what does the return 1 below mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as you may see in this PR, even giving magic values a localized name does not necessarily make the code nicer. However, it does highlight the underlying design issues, and provides a way out and a first step towards refactoring the code in the future.
Signed-off-by: Laurynas Jagutis <laurynas.jagutis@alliander.com>
Signed-off-by: Laurynas Jagutis <laurynas.jagutis@alliander.com>
Signed-off-by: Laurynas Jagutis <laurynas.jagutis@alliander.com>
Signed-off-by: Laurynas Jagutis <laurynas.jagutis@alliander.com>
Signed-off-by: Laurynas Jagutis <laurynas.jagutis@alliander.com>
|
supported_dim = 3 | ||
if schema.dtype[attribute].shape == (supported_dim,) and data.shape[-1] != supported_dim: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is both not descriptive enough and also not correct. This is exactly why you should be very careful with magic values. I myself did not understand what it was about until I dug more deeply into this.
3
is NOT a supported amount of dimensions, UNLESS a very specific case. Therefore,supported_dim
is not a good name.- Only 1-dimensional and 2-dimensional datastructures are supported. There is only 1 exception and that is when we're dealing with asymmetric values in dense batch attribute data, in which case you have 3 dimensions and the last dimension contains exactly 3 values
- This also means that
if
statement is incorrect (i.e. we found a bug)
Here's my proposal:
supported_dim = 3 | |
if schema.dtype[attribute].shape == (supported_dim,) and data.shape[-1] != supported_dim: | |
dense_batch_ndim = 2 | |
dense_batch_asym_attr_ndim = dense_batch_ndim + 1 | |
n_asym_values = 3 | |
shape = schema.dtype[attribute].shape | |
if len(shape) > batch_ndim and not (len(shape) == dense_batch_asym_attr_ndim and shape[-1] != n_asym_values): |
you can also invert the if statement:
dense_batch_ndim = 2
dense_batch_asym_attr_ndim = dense_batch_ndim + 1
n_asym_values = 3
shape = schema.dtype[attribute].shape
if len(shape) <= batch_ndim or (len(shape) == dense_batch_asym_attr_ndim and shape[-1] != n_asym_values):
return _get_raw_data_view(data, dtype=schema.dtype[attribute].base)
raise ValueError("Given data has a different schema than supported.")
Please do this fix in a separate PR off main
together with a unit test that tries to trigger the issue from the lowest-level public function that calls this function (i.e., get_buffer_view
).
I recommend first creating the unit tests (don't forget to commit when if fails correctly) and then applying the fix
@@ -42,6 +42,10 @@ | |||
from power_grid_model._core.power_grid_meta import initialize_array, power_grid_meta_data | |||
from power_grid_model._core.typing import ComponentAttributeMapping, _ComponentAttributeMappingDict | |||
|
|||
SINGULAR_NDIM = 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not a fan of the name SINGULAR
. Maybe something like SINGLE_DATASET_NDIM
(and BATCH_DATASET_NDIM
, resp.)?
@@ -42,6 +42,10 @@ | |||
from power_grid_model._core.power_grid_meta import initialize_array, power_grid_meta_data | |||
from power_grid_model._core.typing import ComponentAttributeMapping, _ComponentAttributeMappingDict | |||
|
|||
SINGULAR_NDIM = 1 | |||
BATCH_NDIM = 2 | |||
UNSUPPORTED_NDIM = 3 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
creating this on a global level doesn't make sense. In fact, all amount of dimensions above BATCH_NDIM
(or below 0, for all that matters) are unsupported and it is very prone to bugs (as is shown in my other comment).
return 1 | ||
return array.shape[0] | ||
sym_array = next(iter(batch_data.values())) | ||
|
||
sym_array = cast(DenseBatchArray | BatchColumn, sym_array) | ||
if sym_array.ndim == 3: | ||
if sym_array.ndim == UNSUPPORTED_NDIM: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this seems like a bug to me
if sym_array.ndim == UNSUPPORTED_NDIM: | |
if sym_array.ndim > BATCH_NDIM: |
or (but please double-check whether ndim == 0
is supported, or we may just introduce a new bug)
if sym_array.ndim == UNSUPPORTED_NDIM: | |
if sym_array.ndim not in (SINGULAR_NDIM, BATCH_NDIM): |
This implementation is based on a version of the repository before moving to Ruff, which can be accessed here:
https://github.com/PowerGridModel/power-grid-model/tree/c127b2cf42ff179fe985f91688e10a7cf2671a90
As per the previous implementation, Pylint only ran on src files, thus I tried to mimic the behavior:
https://github.com/PowerGridModel/power-grid-model/blob/c127b2cf42ff179fe985f91688e10a7cf2671a90/.pre-commit-config.yaml
Most of the noqa comments alligned with the pylint version:
https://github.com/PowerGridModel/power-grid-model/blob/c127b2cf42ff179fe985f91688e10a7cf2671a90/src/power_grid_model/validation/_rules.py
Most misalignments happened in src/power_grid_model/_core/power_grid_model.py, where the old pylint version did not have comments but now there was a need for them. A possible explanation could be: pylint-dev/pylint#8667