Skip to content

[Ref Mode] PyTorch reference mode (eager only) #339

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jul 31, 2025
Merged

Conversation

yf225
Copy link
Contributor

@yf225 yf225 commented Jul 20, 2025

Stacked PRs:


[Ref Mode] PyTorch reference mode (eager only)

This PR focuses on enabling ref mode for APIs like hl.grid / hl.tile /
hl.dot. A follow-up PR will enable ref mode for reduction / scan / indexing APIs.
Part of #77.
Please see inline code comments on the PR.

yf225 added a commit that referenced this pull request Jul 20, 2025
stack-info: PR: #339, branch: yf225/stack/34
@meta-cla meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Jul 20, 2025
yf225 added a commit that referenced this pull request Jul 20, 2025
Fixes #77.

stack-info: PR: #339, branch: yf225/stack/34
yf225 added a commit that referenced this pull request Jul 20, 2025
Fixes #77.

stack-info: PR: #339, branch: yf225/stack/34
yf225 added a commit that referenced this pull request Jul 20, 2025
Fixes #77.

stack-info: PR: #339, branch: yf225/stack/34
yf225 added a commit that referenced this pull request Jul 20, 2025
Fixes #77.

stack-info: PR: #339, branch: yf225/stack/34
)

# Step 3: Handle block_size (in ref mode, full dim size is always used as block_size)
block_size_list = [None] * len(end_list)
Copy link
Contributor Author

@yf225 yf225 Jul 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Always use full dim size in ref modes regardless of block_size value

@yf225 yf225 requested review from jansel, oulgen, drisspg and joydddd July 20, 2025 23:50
class TestExamplesRefCompile(test_examples.TestExamples):
"""Run all TestExamples tests in reference torch.compile mode via HELION_REF_COMPILE=1."""

# NOTE: All tests in TestExamples are run in ref torch.compile(fullgraph=True) mode by default in this test file.
Copy link
Contributor Author

@yf225 yf225 Jul 20, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Currently all examples in TestExamples pass with ref eager mode and ref compile mode.

Planning to add more ref mode unit tests to cover test_reduce.py, test_associative_scan.py etc. in the next PR.

yf225 added a commit that referenced this pull request Jul 20, 2025
Fixes #77.

stack-info: PR: #339, branch: yf225/stack/34
yf225 added a commit that referenced this pull request Jul 20, 2025
Fixes #77. Please see
inline code comments on the PR.

stack-info: PR: #339, branch: yf225/stack/34
yf225 added a commit that referenced this pull request Jul 20, 2025
Fixes #77.
Please see inline code comments on the PR.

stack-info: PR: #339, branch: yf225/stack/34
yf225 added a commit that referenced this pull request Jul 21, 2025
Fixes #77.
Please see inline code comments on the PR.

stack-info: PR: #339, branch: yf225/stack/34
yf225 added a commit that referenced this pull request Jul 21, 2025
Fixes #77.
Please see inline code comments on the PR.

stack-info: PR: #339, branch: yf225/stack/34
yf225 added a commit that referenced this pull request Jul 21, 2025
Fixes #77.
Please see inline code comments on the PR.

stack-info: PR: #339, branch: yf225/stack/34
yf225 added a commit that referenced this pull request Jul 21, 2025
Fixes #77.
Please see inline code comments on the PR.

stack-info: PR: #339, branch: yf225/stack/34
yf225 added a commit that referenced this pull request Jul 31, 2025
This PR focuses on enabling ref mode for APIs like hl.grid / hl.tile /
hl.dot. A follow-up PR will enable ref mode for reduction / scan / indexing APIs.
Part of #77.
Please see inline code comments on the PR.

stack-info: PR: #339, branch: yf225/stack/34
yf225 added a commit that referenced this pull request Jul 31, 2025
This PR focuses on enabling ref mode for APIs like hl.grid / hl.tile /
hl.dot. A follow-up PR will enable ref mode for reduction / scan / indexing APIs.
Part of #77.
Please see inline code comments on the PR.

stack-info: PR: #339, branch: yf225/stack/34
@yf225 yf225 requested a review from jansel July 31, 2025 05:30
yf225 added a commit that referenced this pull request Jul 31, 2025
This PR focuses on enabling ref mode for APIs like hl.grid / hl.tile /
hl.dot. A follow-up PR will enable ref mode for reduction / scan / indexing APIs.
Part of #77.
Please see inline code comments on the PR.

stack-info: PR: #339, branch: yf225/stack/34
yf225 added a commit that referenced this pull request Jul 31, 2025
This PR focuses on enabling ref mode for APIs like hl.grid / hl.tile /
hl.dot. A follow-up PR will enable ref mode for reduction / scan / indexing APIs.
Part of #77.
Please see inline code comments on the PR.

stack-info: PR: #339, branch: yf225/stack/34
yf225 added a commit that referenced this pull request Jul 31, 2025
This PR focuses on enabling ref mode for APIs like hl.grid / hl.tile /
hl.dot. A follow-up PR will enable ref mode for reduction / scan / indexing APIs.
Part of #77.
Please see inline code comments on the PR.

stack-info: PR: #339, branch: yf225/stack/34
yf225 added a commit that referenced this pull request Jul 31, 2025
This PR focuses on enabling ref mode for APIs like hl.grid / hl.tile /
hl.dot. A follow-up PR will enable ref mode for reduction / scan / indexing APIs.
Part of #77.
Please see inline code comments on the PR.

stack-info: PR: #339, branch: yf225/stack/34
yf225 added a commit that referenced this pull request Jul 31, 2025
This PR focuses on enabling ref mode for APIs like hl.grid / hl.tile /
hl.dot. A follow-up PR will enable ref mode for reduction / scan / indexing APIs.
Part of #77.
Please see inline code comments on the PR.

stack-info: PR: #339, branch: yf225/stack/34
Copy link
Contributor

@jansel jansel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Take a step back, it seems like most test files are using RefEagerTestDisabled and even for the tests that are enabled we are pretty aggressive disabling stuff like assertTrue and assertEqual.

How complete is this implementation?

self.assertGreater( # type: ignore[attr-defined]
total_assertions,
0,
f"Test {self._testMethodName} did not call torch.testing.assert_close, assertRaises, or skipTest", # type: ignore[attr-defined] # pyright: ignore[reportAttributeAccessIssue]
Copy link
Contributor Author

@yf225 yf225 Jul 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

even for the tests that are enabled we are pretty aggressive disabling stuff like assertTrue and assertEqual

@jansel we have a requirement here that all enabled tests must either call torch.testing.assert_close or assertRaises or self.skipTest, so I believe even though assertTrue and assertEqual etc. are disabled, we are still checking correctness via those enabled assertions

Updated to not disable assertTrue and assertEqual, and added assertEqualCode / assertNotEqualCode

Copy link
Contributor Author

@yf225 yf225 Jul 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it seems like most test files are using RefEagerTestDisabled

yes I'll work on polishing the next PR (#404) that enables all the test files. (At one time I had the combined PR passing all tests, but it's 3000+ lines so I split into two PRs 😆)

yf225 added a commit that referenced this pull request Jul 31, 2025
This PR focuses on enabling ref mode for APIs like hl.grid / hl.tile /
hl.dot. A follow-up PR will enable ref mode for reduction / scan / indexing APIs.
Part of #77.
Please see inline code comments on the PR.

stack-info: PR: #339, branch: yf225/stack/34
yf225 added a commit that referenced this pull request Jul 31, 2025
This PR focuses on enabling ref mode for APIs like hl.grid / hl.tile /
hl.dot. A follow-up PR will enable ref mode for reduction / scan / indexing APIs.
Part of #77.
Please see inline code comments on the PR.

stack-info: PR: #339, branch: yf225/stack/34
@yf225 yf225 requested a review from jansel July 31, 2025 18:30
This PR focuses on enabling ref mode for APIs like hl.grid / hl.tile /
hl.dot. A follow-up PR will enable ref mode for reduction / scan / indexing APIs.
Part of #77.
Please see inline code comments on the PR.

stack-info: PR: #339, branch: yf225/stack/34
self, first: str, second: str, msg: str | None = None
) -> None:
if not self._in_ref_eager_mode:
super().assertNotEqual(first, second, msg) # type: ignore[misc]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New state: we only disable journal / in / not-in / code comparison in ref mode

@yf225 yf225 merged commit 4c0f7f3 into main Jul 31, 2025
10 checks passed
path: |
~/.cache/uv
~/.venv
key: ${{ runner.os }}-deps-${{ matrix.python-version }}-${{ hashFiles('.github/workflows/test.yml', 'requirements.txt') }}-${{ steps.date.outputs.month }}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yf225 i think you broke the cache key with this. Why did you need to use templates? we could just do this with strategy matrix instead

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i can clean it up

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stamped, sorry for breakage and thanks so much for catching this!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Meta Open Source bot.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants