enable xpu in test_trainer #37774
base: main
Conversation
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Hi 👋, thank you for opening this pull request! The pull request is converted to draft by default. The CI will be paused while the PR is in draft mode. When it is ready for review, please click the "Ready for review" button.
@@ -2993,6 +3008,9 @@ def _device_agnostic_dispatch(device: str, dispatch_table: dict[str, Callable],
BACKEND_EMPTY_CACHE["xpu"] = torch.xpu.empty_cache
BACKEND_MANUAL_SEED["xpu"] = torch.xpu.manual_seed
BACKEND_DEVICE_COUNT["xpu"] = torch.xpu.device_count
BACKEND_RESET_MAX_MEMORY_ALLOCATED["xpu"] = torch.xpu.reset_peak_memory_stats
XPU maps `reset_peak_memory_stats` to `RESET_MAX_MEMORY_ALLOCATED`, since the two have the same functionality; CUDA's implementation is the same. The difference is that CUDA keeps the `reset_max_memory_allocated` API for backward compatibility, while XPU does not expose this API.
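For context, here is a minimal sketch (not the exact transformers implementation) of how such a device-agnostic dispatch table can route the reset call to the right backend; the helper name `backend_reset_max_memory_allocated` is illustrative:

```python
# Minimal sketch of a device-agnostic dispatch table for resetting peak-memory
# stats. CUDA keeps reset_max_memory_allocated as a backward-compatible alias of
# reset_peak_memory_stats; XPU only exposes reset_peak_memory_stats.
import torch

BACKEND_RESET_MAX_MEMORY_ALLOCATED = {
    "cuda": torch.cuda.reset_max_memory_allocated,
    "cpu": None,  # nothing to reset on CPU
}
if hasattr(torch, "xpu"):
    # XPU has no reset_max_memory_allocated alias, so map the equivalent API.
    BACKEND_RESET_MAX_MEMORY_ALLOCATED["xpu"] = torch.xpu.reset_peak_memory_stats


def backend_reset_max_memory_allocated(device: str) -> None:
    """Reset the peak-memory counter on the given backend, if it supports one."""
    fn = BACKEND_RESET_MAX_MEMORY_ALLOCATED.get(device)
    if fn is not None:
        fn()
```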
@@ -1243,7 +1246,6 @@ def test_mixed_bf16(self):

# will add more specific tests once there are some bugs to fix

@require_non_xpu
@require_torch_gpu
`require_torch_gpu` is enough to skip the test on XPU.
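As a rough illustration (not the actual `testing_utils` code), a decorator like `require_torch_gpu` only runs the test when the active device is CUDA, so an extra `require_non_xpu` guard is redundant:

```python
# Illustrative sketch: a CUDA-only test decorator. The real transformers helper
# differs in details, but the idea is that anything other than "cuda"
# (including "xpu") is skipped, making a separate non-XPU guard unnecessary.
import unittest

import torch

# Pick the active test device the way a device-agnostic test suite might.
if torch.cuda.is_available():
    torch_device = "cuda"
elif hasattr(torch, "xpu") and torch.xpu.is_available():
    torch_device = "xpu"
else:
    torch_device = "cpu"


def require_torch_gpu(test_case):
    """Skip the decorated test unless a CUDA GPU is the active device."""
    return unittest.skipUnless(torch_device == "cuda", "test requires CUDA")(test_case)
```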
@@ -3992,7 +3994,7 @@ def test_fp16_full_eval(self):
# perfect world: fp32_init/2 == fp16_eval
self.assertAlmostEqual(fp16_eval, fp32_init / 2, delta=5_000)

@require_non_xpu
@require_torch_gpu
This is a GPU-only test since it's related to nvfuser; the decorator change makes it reflect that.
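For reference, the assertion in the diff above relies on fp16 weights taking half the bytes of fp32 weights; a tiny self-contained check of that relationship (names are illustrative, not from the test suite):

```python
# Toy illustration of the fp32_init/2 == fp16_eval relationship checked above:
# an fp16 copy of a model's weights occupies half the bytes of the fp32 original.
import torch


def weight_bytes(model: torch.nn.Module) -> int:
    """Total bytes occupied by the model's parameters."""
    return sum(p.numel() * p.element_size() for p in model.parameters())


tiny_model = torch.nn.Linear(1024, 1024)      # parameters are fp32 by default
fp32_bytes = weight_bytes(tiny_model)
fp16_bytes = weight_bytes(tiny_model.half())  # fp16 "full eval" copy

# perfect world: fp32 / 2 == fp16 (the real test allows a small delta)
assert abs(fp16_bytes - fp32_bytes / 2) < 5_000
```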
@ydshieh, pls help review, thx.
128 PASSED, 27 FAILED due to the missing `optimizer_update_32bit` XPU op in BNB; its implementation is under way, and those tests will pass once the corresponding PR is merged into bnb.