Export bboxes dataset as VIA-tracks file #580

sfmig · 2025-05-08T16:28:31Z

Description

What is this PR

Bug fix
Addition of a new feature
Other

Why is this PR needed?
To support exporting bounding box datasets in "VIA-tracks" .csv format.

What does this PR do?

Adds a save_bboxes.py module
Adds tests
Updates the relevant docs (API reference and user guide)

It also:

renames _validate_dataset --> validate_pose_dataset for clarity
factors out the function _validate_file_path into a new movement.io.utils module, which holds functions shared across io submodules (right now, only this function).
adds tests for the _validate_file_path function

This PR used PR #497 as starting point.

References

Closes #495

The issue recommends that after this functionality is implemented, we should extend the following tests (introduced in PR #503) to run on bbox data too:

test_dimension_slider_with_nans
test_dimension_slider_multiple_files

Since this would increase the size of this PR (which is already quite chunky), I opened a separate issue for this --> issue #591

How has this PR been tested?

Tests pass locally and in CI.

Is this a breaking change?

No.

Does this PR require an update to the documentation?

Yes, it is included.

Checklist:

The code has been tested locally
Tests have been added to cover all new functionality
The documentation has been updated to reflect any changes
The code has been formatted with pre-commit

codecov · 2025-05-08T16:33:57Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (57b5cbd) to head (3c2dbc7).

Additional details and impacted files

@@            Coverage Diff             @@
##              main      #580    +/-   ##
==========================================
  Coverage   100.00%   100.00%            
==========================================
  Files           31        33     +2     
  Lines         1615      1719   +104     
==========================================
+ Hits          1615      1719   +104


sonarqubecloud · 2025-05-14T17:03:52Z

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code

See analysis details on SonarQube Cloud

niksirbi

Looking good @sfmig!

I'm approving this for your convenience.
I left some comments, many of them optional.

My main two quibbles are:

I find the name and description of the extract_track_id_from_individuals argument confusing (I couldn't understand what exactly it was doing before reading the source code). I've left some suggestions for alternative ways to describe it.
Some error messages are not logged.

niksirbi · 2025-05-23T13:25:32Z

docs/source/user_guide/input_output.md

+By default the {func}`movement.io.save_bboxes.to_via_tracks_file` function will try to extract the track IDs from the individuals' names, but you can also select to extract them from the sorted list of individuals with `extract_track_id_from_individuals=True`.
+
+
+Alternatively, you can save the bounding box tracks to a .csv file with a custom header using the standard Python library `csv`. Below is an example of how you can do this:


Now that we have built-in support for saving to VIA, I wonder if it's worth putting this "alternative" custom csv option under a dropdown, in the spirit of reducing the page length.
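For reference, the custom-csv route mentioned in the guide excerpt could look roughly like the sketch below. It uses only the standard library; the record layout, the column names and the `write_bboxes_csv` helper are assumptions made for illustration and are not taken from the guide or from movement.

```python
import csv
import io

# Hypothetical per-frame records: (frame, track ID, x, y, width, height).
# In practice these values would be pulled out of the bounding boxes dataset.
records = [
    (0, 1, 10.0, 20.0, 50.0, 40.0),
    (0, 2, 110.0, 35.0, 48.0, 42.0),
    (1, 1, 12.5, 21.0, 50.0, 40.0),
]


def write_bboxes_csv(rows, file_obj, header):
    """Write bounding box rows to an open text file under a custom header."""
    writer = csv.writer(file_obj)
    writer.writerow(header)  # the custom header goes first
    writer.writerows(rows)  # then one row per bounding box


# Column names are illustrative; any header of matching length works.
header = ["frame", "track_id", "x", "y", "width", "height"]
buffer = io.StringIO()  # stands in for open("bboxes.csv", "w", newline="")
write_bboxes_csv(records, buffer, header)
csv_text = buffer.getvalue()
```

Writing to an in-memory buffer keeps the sketch free of side effects; swapping it for a real file handle opened with `newline=""` gives the on-disk version.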

niksirbi · 2025-05-23T13:42:08Z

tests/test_unit/test_io/test_utils.py

@@ -0,0 +1,117 @@
+"""Unit tests for the movement.io.utils module."""


I like how neatly you've structured the tests here.

You may be aware that some of these cases are also covered by the tests in test_validators/test_files_validator, especially this one.

The "duplication" occurs because _validate_file_path is a convenience wrapper around ValidFile. I think the partial overlap is fine in this case, but I was wondering what your thoughts are on such cases in general, as best practice suggests "separation of concerns".

I'm not sure if you've also seen the fixtures in fixtures/files.py. Would it make sense to re-use them here? I'm just asking because in the early days of movement we went through some back and forth on figuring out the correct way to test for permission errors. For example, removing "write" permissions from files resulted in some weird errors down the line (e.g. pytest not being able to delete the tmp files during cleanup). I don't think your implementation will lead to the same error, but it may be safer to rely on existing (and long battle-tested) fixtures.

niksirbi · 2025-05-23T13:44:18Z

movement/io/utils.py

+from movement.validators.files import ValidFile
+
+
+def _validate_file_path(


Not very important, but I am wondering whether it makes more conceptual sense to put this function into validators/files.py as it's just a convenience wrapper around ValidFile. Happy for you to make the call.
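To make the "convenience wrapper around a validator" pattern concrete, here is a minimal sketch. `FileCheck` is a hypothetical stand-in written for this illustration; it is not movement's ValidFile and does not reproduce its attributes or checks. The wrapper name and signature are likewise assumptions.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class FileCheck:
    """Hypothetical stand-in for a file validator (NOT movement's ValidFile)."""

    path: Path
    expected_suffix: tuple[str, ...] = ()

    def __post_init__(self):
        # Normalise to a Path, then check the suffix if one is required
        self.path = Path(self.path)
        if self.expected_suffix and self.path.suffix not in self.expected_suffix:
            raise ValueError(
                f"Expected file with suffix(es) {self.expected_suffix}, "
                f"got '{self.path.suffix}'."
            )


def validate_file_path(file_path, expected_suffix):
    """Convenience wrapper: run the validator and return the checked path."""
    return FileCheck(file_path, expected_suffix=tuple(expected_suffix)).path
```

Because the wrapper adds no logic of its own, it can live next to the validator or next to its callers without changing behaviour, which is why the placement is mostly a matter of taste.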

niksirbi · 2025-05-23T14:02:44Z

movement/io/save_bboxes.py

+    image_file_prefix: str | None = None,
+    image_file_suffix: str = ".png",
+) -> Path:
+    """Save a movement bounding boxes dataset to a VIA tracks .csv file.


To achieve a uniform docstring style across I/O functions.

Suggested change

-    """Save a movement bounding boxes dataset to a VIA tracks .csv file.
+    """Save a ``movement`` bounding boxes dataset to a VIA tracks .csv file.

niksirbi · 2025-05-23T14:07:10Z

movement/io/save_bboxes.py

+        The movement bounding boxes dataset to export.
+    file_path : str or pathlib.Path
+        Path where the VIA tracks .csv file [1]_ will be saved.
+    extract_track_id_from_individuals : bool, optional


I find the name of this argument quite confusing, because track IDs are always derived from individuals' names in some form, even if this is False.

I struggle to come up with a better name, however. Maybe use_trailing_numbers_as_track_ids? This may make more sense together with my suggestion for this argument's description.

Remember to also update the Input/Output docs page if you change this.
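The two behaviours under discussion can be sketched as follows: either the trailing number in each individual's name becomes the track ID, or the ID is the name's position in the sorted list of names. The function name, the `use_trailing_numbers` flag and the error handling are assumptions for illustration; the PR's actual implementation may differ.

```python
import re


def map_individuals_to_track_ids(individuals, use_trailing_numbers=True):
    """Return a dict mapping each individual's name to an integer track ID."""
    if use_trailing_numbers:
        mapping = {}
        for name in individuals:
            # Take the run of digits at the very end of the name
            match = re.search(r"(\d+)$", name)
            if match is None:
                raise ValueError(f"No trailing number in '{name}'.")
            mapping[name] = int(match.group(1))
        # Assumption: two names ending in the same number is an error
        if len(set(mapping.values())) != len(mapping):
            raise ValueError("Trailing numbers are not unique.")
        return mapping
    # Otherwise, use each name's position in the sorted list of names
    return {name: idx for idx, name in enumerate(sorted(individuals))}
```

In both modes the IDs come from the names, which is the source of the naming confusion: for `["id_10", "id_2"]` the first mode yields 10 and 2, while the second yields 0 and 1 because "id_10" sorts before "id_2" as a string.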

niksirbi · 2025-05-23T15:07:43Z

movement/io/save_bboxes.py

+        # Find the first non-digit character starting from the end
+        last_idx = len(individual) - 1
+        first_non_digit_idx = last_idx
+        while (
+            first_non_digit_idx >= 0
+            and individual[first_non_digit_idx].isdigit()
+        ):
+            first_non_digit_idx -= 1
+
+        # Extract track ID from (first_non_digit_idx+1) until the end
+        if first_non_digit_idx < last_idx:
+            track_id = int(individual[first_non_digit_idx + 1 :])


It might be much more straightforward to do this with a regex.

Suggested change

-        # Find the first non-digit character starting from the end
-        last_idx = len(individual) - 1
-        first_non_digit_idx = last_idx
-        while (
-            first_non_digit_idx >= 0
-            and individual[first_non_digit_idx].isdigit()
-        ):
-            first_non_digit_idx -= 1
-
-        # Extract track ID from (first_non_digit_idx+1) until the end
-        if first_non_digit_idx < last_idx:
-            track_id = int(individual[first_non_digit_idx + 1 :])
+        # Find last sequence of digits in the name
+        if match := re.search(r'(\d+)$', individual):
+            track_id = int(match.group(1))
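As a quick sanity check on the suggestion, the sketch below wraps the original backwards scan and the regex in two standalone functions and runs both on a few sample names. The function names and the sample names are made up for this comparison, and returning None when no trailing digits are found is an assumption; note that the regex version needs `import re`.

```python
import re


def trailing_number_loop(individual):
    """Scan backwards from the end while characters are digits."""
    last_idx = len(individual) - 1
    first_non_digit_idx = last_idx
    while first_non_digit_idx >= 0 and individual[first_non_digit_idx].isdigit():
        first_non_digit_idx -= 1
    # Digits were found only if the index moved away from the end
    if first_non_digit_idx < last_idx:
        return int(individual[first_non_digit_idx + 1 :])
    return None


def trailing_number_regex(individual):
    """Match the run of digits anchored at the end of the name."""
    if match := re.search(r"(\d+)$", individual):
        return int(match.group(1))
    return None


# Sample names, including all-digit, no-digit and empty cases
names = ["id_1", "mouse007", "42", "a1b22", "no_digits", ""]
results = {
    name: (trailing_number_loop(name), trailing_number_regex(name))
    for name in names
}
```

Both versions agree on these names. They can differ on unusual input: `str.isdigit()` accepts some characters that `int()` rejects (such as superscript digits), and `$` also matches just before a trailing newline, so the comparison is not a proof of equivalence.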

niksirbi · 2025-05-23T15:09:19Z

movement/io/save_bboxes.py

+        return frame_n_digits
+
+
+def _get_map_individuals_to_track_ids(


Very minor, but I find _map_individuals_to_track_ids much more natural (I read "map" as a verb).

niksirbi · 2025-05-23T15:19:14Z

movement/io/save_bboxes.py

+    Parameters
+    ----------
+    ds : xarray.Dataset
+        The movement bounding boxes dataset to export.


Suggested change

-        The movement bounding boxes dataset to export.
+        The ``movement`` bounding boxes dataset to export.

niksirbi · 2025-05-23T15:21:27Z

movement/io/save_bboxes.py

+        filenames (including leading zeros). If None, the number of digits is
+        automatically determined from the largest frame number in the dataset,
+        plus one (to have at least one leading zero). Default is None.
+    image_file_prefix : str, optional


What I found a bit confusing is that we require image filenames, even though the movement dataset itself doesn't contain any images. I understand why that is, but maybe we should add a "Note" in the docstring?
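The padding rule described in the docstring excerpt (digits of the largest frame number, plus one) can be illustrated with a short sketch. The helper name and the way the prefix is joined to the frame number are assumptions; only the "plus one" default and the ".png" default suffix come from the excerpt above.

```python
def build_image_filename(
    frame_number,
    max_frame_number,
    frame_n_digits=None,
    prefix=None,
    suffix=".png",
):
    """Build a zero-padded image filename for one frame (hypothetical helper)."""
    if frame_n_digits is None:
        # Digits of the largest frame number, plus one, so that every
        # filename has at least one leading zero
        frame_n_digits = len(str(max_frame_number)) + 1
    stem = f"{frame_number:0{frame_n_digits}d}"
    if prefix:
        # Assumption: the prefix is prepended as-is, with no separator added
        stem = f"{prefix}{stem}"
    return f"{stem}{suffix}"
```

For example, with a largest frame number of 120 the default width is 4, so frame 7 becomes "0007.png". The filenames are synthesised purely so that the exported file has the filename column VIA expects; no images are read or written.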

niksirbi · 2025-05-23T15:47:03Z

tests/test_unit/test_io/test_save_bboxes.py

+    if image_file_suffix is not None:
+        assert df["filename"].str.endswith(image_file_suffix).all()
+    else:
+        assert df["filename"].str.endswith(".png").all()


I don't understand how this is always .png, given that you also have .jpg in the parametrisation. What am I missing?

niksirbi · 2025-06-03T17:22:11Z

Note: this PR will need to be rebased and updated after #606 is merged, mostly because the Input/Output guide has been restructured.

sfmig mentioned this pull request May 8, 2025

Added support for exporting bboxes in VIA-tracks (issue #495) #497

Closed

7 tasks

sfmig force-pushed the smg/save-via-tracks-from-pr497 branch 3 times, most recently from f1bed45 to 24d3e19 Compare May 14, 2025 09:57

harsh-bhanushali-05 and others added 25 commits May 14, 2025 10:57

Add files via upload

3b72492

Add support for exporting bboxes in VIA-tracks

9ae5b89

[pre-commit.ci] auto fixes from pre-commit.com hooks

d6ab206

for more information, see https://pre-commit.ci

Bug fixes

5fbe43c

[pre-commit.ci] auto fixes from pre-commit.com hooks

be489d2

for more information, see https://pre-commit.ci

More big fixes

7a74ce7

[pre-commit.ci] auto fixes from pre-commit.com hooks

d51f08b

for more information, see https://pre-commit.ci

big fix

0a31390

[pre-commit.ci] auto fixes from pre-commit.com hooks

a00d6da

for more information, see https://pre-commit.ci

Pre commit changes

3ff9ab6

[pre-commit.ci] auto fixes from pre-commit.com hooks

a93a1cc

for more information, see https://pre-commit.ci

Pre-commit error changes

da2aa29

[pre-commit.ci] auto fixes from pre-commit.com hooks

37dd94b

for more information, see https://pre-commit.ci

fix

5fe2fca

[pre-commit.ci] auto fixes from pre-commit.com hooks

fa3c97d

for more information, see https://pre-commit.ci

fix.

45c7e16

[pre-commit.ci] auto fixes from pre-commit.com hooks

e8a8965

for more information, see https://pre-commit.ci

Fixing code to resolve CI testcases

aa9d4b5

Corrected the export format.

ac25ba4

Updated testcases

830e5ad

Updated the testcase

eab3584

Replace logging with loguru

f5ebd0d

Updated the logging

5d7a72d

Rename file and small edits

4801ec8

Export confidence optionally and pad with max digits plus one

227aab8

sfmig added 6 commits May 14, 2025 10:57

Fix quotes for loadable in VIA (WIP, test fix pending)

038b012

Loadable in VIA

db88119

Replace with json approach

767c2cf

Fix json serialisation issue

1524a22

Add preliminary test for double quotes

8db30ea

Fix frame_ being interpreted as a cross-reference by sphinx

842bbbc

sfmig force-pushed the smg/save-via-tracks-from-pr497 branch from 24d3e19 to 842bbbc Compare May 14, 2025 09:57

sfmig added 14 commits May 14, 2025 11:06

Small edits to test

21d11f8

Allow user to set padding for frame number

c4c1fa2

Rename variables. Add helper function for tests

d83d7b9

Add region id and count id

b2c645a

Check two lines literally

8e18070

Combine image filename argus to _write_single_row into one

b652dd3

Add test to check datasets are recoverable

0a43e1c

Add tests for region and count ID

35396e0

Change default extract_track_id_from_individuals to True

a628c67

Small edits

4d74049

Fixes for API reference

3eb9c75

Consistency in naming the file

0ad84b2

Add reference

4f66088

Update guide

76a0eee

sfmig added 2 commits May 14, 2025 18:20

Add reference

61c5ecd

Update guide

3c2dbc7

sfmig mentioned this pull request May 14, 2025

Extend some napari tests to bboxes data #591

Open

sfmig marked this pull request as ready for review May 14, 2025 17:34

sfmig requested a review from niksirbi May 14, 2025 17:34

niksirbi approved these changes May 23, 2025


niksirbi mentioned this pull request Jun 5, 2025

Remove "easily" from some longer examples #612

Merged
