(refactor) remove redundant logic in _check_valid_index_key #7490

suzyahyah · 2025-03-30T11:45:42Z

This PR is a minor refactor of a small function, _check_valid_index_key, in src/datasets/formatting/formatting.py. No change in logic.

In the original code, there are separate if-conditionals for isinstance(key, range) and isinstance(key, Iterable), with essentially the same logic.

This PR combines these two using a single if statement.
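As a rough illustration of the merged branch, here is a minimal sketch. The function name check_valid_index_key_sketch and the exact error messages are assumptions made for this example; it is not a copy of the code in formatting.py.

```python
from collections.abc import Iterable


def check_valid_index_key_sketch(key, size):
    """Illustrative stand-in for _check_valid_index_key (hypothetical, not the library code)."""
    if isinstance(key, int):
        # A single index must fall inside [-size, size).
        if (key < 0 and key + size < 0) or key >= size:
            raise IndexError(f"Invalid key: {key} is out of bounds for size {size}")
    elif isinstance(key, slice):
        pass  # slices are accepted without a bounds check in this sketch
    elif isinstance(key, (range, Iterable)):
        # One branch for both range and general iterables: only the extremes
        # need checking. int() is a no-op for values that are already ints.
        check_valid_index_key_sketch(int(max(key)), size)
        check_valid_index_key_sketch(int(min(key)), size)
    else:
        raise TypeError(f"Wrong key type: {type(key).__name__}")


check_valid_index_key_sketch(range(0, 5), 5)   # passes
check_valid_index_key_sketch([0, 2, 4], 5)     # passes
```

Both calls go through the same branch, which is the point of the refactor: the range case no longer needs its own copy of the min/max check.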

Considerations

Although a Python range is guaranteed to contain only integers, the cost of calling int() on an object that is already an int is negligible. (In CPython it returns the original object; it doesn't create a new integer object or perform any conversion.)
Technically a range is already an Iterable, so isinstance(key, Iterable) alone would suffice. I explicitly wrote isinstance(key, (range, Iterable)) to make it obvious that both cases are handled, consistent with the slice, range, Iterable pattern used throughout formatting.py.
This PR removes the if len(key)>0 conditional. I think it is cleaner this way for three reasons.

There was originally no else branch, so the check would have failed silently anyway.
An empty key should be caught much earlier, rather than in formatting.py.
There are actually multiple cases where this would fail: if the key is empty, if it is non-numeric or float, or if it is a list of lists. It's clunky to enumerate all of these, so the error is left to be thrown during max or indexing.
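The considerations above can be checked with a few lines of plain Python. This is only a sketch: the helper failure_of is hypothetical and exists just to report which exception int(max(key)) raises for each kind of key. The identity check on int() reflects CPython behavior.

```python
from collections.abc import Iterable

# In CPython, int() on an exact int returns the same object, so the extra call is cheap.
n = 12345678901234567890
same_object = int(n) is n

# A range already satisfies the Iterable check on its own.
range_is_iterable = isinstance(range(3), Iterable)


def failure_of(key):
    """Return the name of the exception raised by int(max(key)), or None (hypothetical helper)."""
    try:
        int(max(key))
    except Exception as exc:
        return type(exc).__name__
    return None


empty_key = failure_of([])               # max() rejects an empty sequence
nested_key = failure_of([[0, 1], [2]])   # int() cannot convert a list
text_key = failure_of(["a", "b"])        # int() cannot parse "b"
float_key = failure_of([0.5, 1.5])       # no exception here: int() truncates 1.5

print(same_object, range_is_iterable)
print(empty_key, nested_key, text_key, float_key)
```

Note that a float key does not raise inside int(max(key)); any problem with it would only surface later, which is consistent with leaving the error to max or indexing rather than enumerating every case up front.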

Previous PR and Issues Checks

No known PRs or issues (open or closed) in the hf datasets repository.

Tests

Tested using a Dataset (load_dataset("wikitext", "wikitext-103-raw-v1")) and a PyTorch DataLoader with a PyTorch BatchSampler (which returns a list of indexes instead of a single index).

(refactor) remove redundant logic in _check_valid_index_key

fa83a6c
