[PyTorch Debug] NVFP4 debug stats support #2296
base: main
Signed-off-by: Pawel Gadzinski <[email protected]>
/te-ci pytorch
 @Registry.register_feature(namespace="transformer_engine")
-class DisableFP8Layer:
+class DisableFP8Layer(DisableQuantizationLayer):
It may be worth raising a deprecation warning in the constructor or something. DisableFP8GEMM would also benefit from this.
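A minimal sketch of what such a warning could look like, assuming the deprecated class simply subclasses the new one as in the diff above. The class bodies and the `enabled` attribute are hypothetical stand-ins, not the actual Transformer Engine implementation; only `warnings.warn` and `DeprecationWarning` are standard Python.

```python
import warnings


class DisableQuantizationLayer:
    """Hypothetical stand-in for the new feature class."""

    def __init__(self):
        self.enabled = True  # placeholder state, for illustration only


class DisableFP8Layer(DisableQuantizationLayer):
    """Deprecated name kept for backward compatibility."""

    def __init__(self, *args, **kwargs):
        # stacklevel=2 attributes the warning to the caller's line,
        # not to this constructor.
        warnings.warn(
            "DisableFP8Layer is deprecated; use DisableQuantizationLayer instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        super().__init__(*args, **kwargs)
```

The same pattern would apply to DisableFP8GEMM. Note that Python hides `DeprecationWarning` by default outside of `__main__` and test runners, so a `FutureWarning` may be preferable if end users should see it.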
    return total_elements - first_zeros - second_zeros


def add_nvfp4_underflows_stats():
With RHT, it's possible that the NVFP4 data has fewer zeros than the high-precision data, so this stat would be negative. It's not possible yet (we currently only apply RHT to the NVFP4 column-wise data) and it's probably beyond the scope of this PR, but it's something to consider if we ever generalize.
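To illustrate the concern, here is a small sketch that assumes the underflow stat is the number of zeros in the quantized data minus the number of zeros in the high-precision data, expressed as a percentage of elements. That definition is inferred from this comment, not taken from the PR code, and `hadamard2`, `count_zeros`, and `underflow_pct` are hypothetical helpers. The sketch uses a plain 2-point Hadamard transform and omits the random sign flips of a real RHT.

```python
import math


def count_zeros(values):
    """Number of exactly-zero elements."""
    return sum(1 for v in values if v == 0.0)


def underflow_pct(high_precision, quantized):
    """Assumed stat: extra zeros introduced by quantization, in percent."""
    extra_zeros = count_zeros(quantized) - count_zeros(high_precision)
    return 100.0 * extra_zeros / len(high_precision)


def hadamard2(pair):
    """Normalized 2-point Hadamard transform (no random signs)."""
    a, b = pair
    s = 1.0 / math.sqrt(2.0)
    return [s * (a + b), s * (a - b)]


# Without a rotation: one small value is flushed to zero by quantization.
hp = [0.0, 1e-9, 0.5, 0.0]
quantized = [0.0, 0.0, 0.5, 0.0]  # toy quantized values
plain_stat = underflow_pct(hp, quantized)

# With a rotation: the zero in the input is mixed with its neighbor,
# so the transformed data has fewer zeros than the input.
hp_sparse = [1.0, 0.0]
rotated = hadamard2(hp_sparse)
rotated_stat = underflow_pct(hp_sparse, rotated)
```

In the first case the stat is positive; in the second it is negative, because the rotation removed a zero that was present in the high-precision data.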
Description
This PR adds support for NVFP4 statistics: underflows and MSE. I add them in a separate feature, because we may want to add many more NVFP4-specific features later.
Also, I renamed a few variables from "fp8"-like names to "quantization"-like names. I cannot rename all of them (for example, "is_fp8_gemm_enabled" is an API call), so I left some unchanged.
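For reference, a minimal sketch of the MSE statistic mentioned in the description, assuming it compares a high-precision tensor against its dequantized NVFP4 counterpart element by element. The `mse` helper and the toy values are hypothetical and do not reflect how the PR computes the stat.

```python
def mse(reference, dequantized):
    """Mean squared error between two equal-length sequences."""
    if len(reference) != len(dequantized):
        raise ValueError("inputs must have the same length")
    squared_errors = [(r - d) ** 2 for r, d in zip(reference, dequantized)]
    return sum(squared_errors) / len(reference)


# Toy values: only the last element differs after a quantize/dequantize round trip.
reference = [0.5, 1.0, 1.5, 2.0]
dequantized = [0.5, 1.0, 1.5, 3.0]
error = mse(reference, dequantized)
```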