-
Notifications
You must be signed in to change notification settings - Fork 3.6k
Fix: filesystem edge case in ModelCheckpoint
#18252
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Fix: filesystem edge case in ModelCheckpoint
#18252
Conversation
ModelCheckpoint
ModelCheckpoint
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for sending the PR 🎉
if final_path is None: | ||
assert trainer.ckpt_path == final_path | ||
else: | ||
assert mc._fs._strip_protocol(trainer.ckpt_path) == Path(final_path).as_posix() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test here isn't failing without the fix. It never gets a path with a protocol. We need to write a new test where we pretend to be loading from a path with a protocol.
A unit test in test_model_checkpoint.py
that calls _find_last_checkpoints
should be enough to cover this case. Should I help with that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I definitely got caught up here, if you can tell from the commit history 😂 Any help would be much appreciated.
I was also concerned about the impact on ModelCheckpoint.file_exists
. I think the core question is: should the selected fsspec
filesystem
be an attribute of ModelCheckpoint
or of Trainer
? Maybe I'm thinking about it wrong...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test here isn't failing without the fix. It never gets a path with a protocol.
Hmm, it does fail for me locally. For local filepaths, LocalFileSystem._unstrip_protocol
adds the file://
prefix to paths.
@awaelchli I could take a stab at adding a test case if you'd prefer? Though I may need a little additional detail on your idea for it! |
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## master #18252 +/- ##
==========================================
- Coverage 84% 49% -35%
==========================================
Files 439 431 -8
Lines 34484 34341 -143
==========================================
- Hits 28877 16890 -11987
- Misses 5607 17451 +11844 |
@schmidt-ai @awaelchli how is it going here? :) |
️✅ There are no secrets present in this pull request anymore.If these secrets were true positive and are still valid, we highly recommend you to revoke them. 🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request. |
Fixes #17912.