Skip to content

Remove preserve_zero and zero_point_domain from choose_qparams_affine #2149

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 13 commits into
base: main
Choose a base branch
from

Conversation

jainapurva
Copy link
Contributor

No description provided.

Copy link

pytorch-bot bot commented Apr 29, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/2149

Note: Links to docs will display an error until the docs builds have been completed.

❌ 9 New Failures

As of commit 414df66 with merge base 137b079 (image):

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Apr 29, 2025
@jainapurva jainapurva added topic: not user facing Use this tag if you don't want this PR to show up in release notes topic: for developers Use this tag if this PR is mainly developer facing labels Apr 29, 2025
def choose_qparams_affine_tiny_gemm(
input: torch.Tensor,
mapping_type: MappingType,
block_size: Tuple[int, ...],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: change this to Tuple[int] as well to be consistent, assuming it means the same thing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Block size is tuple with multiple integers, hence will need to do Tuple[int, ...]

Comment on lines +780 to +785
target_dtype: torch.dtype,
quant_min: Optional[Union[int, float]] = None,
quant_max: Optional[Union[int, float]] = None,
eps: Optional[float] = None,
scale_dtype: Optional[torch.dtype] = None,
zero_point_dtype: Optional[torch.dtype] = None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could probably simplify this list as well, only configurable things are needed, this can be a separate PR

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed

Comment on lines +850 to +855
target_dtype: torch.dtype,
quant_min: Optional[Union[int, float, bool]] = None,
quant_max: Optional[Union[int, float, bool]] = None,
eps: Optional[float] = None,
scale_dtype: Optional[torch.dtype] = None,
zero_point_dtype: Optional[torch.dtype] = None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

MappingType.SYMMETRIC.name,
MappingType.SYMMETRIC_NO_CLIPPING_ERR.name,
MappingType.ASYMMETRIC.name,
MappingType.SYMMETRIC,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if this op has to be lowered, we'd need to use str instead of enum

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For all the new ops I've used MappingType enum. Should I update them to str?

@jainapurva jainapurva marked this pull request as ready for review April 30, 2025 17:36
@jainapurva jainapurva marked this pull request as draft April 30, 2025 18:10
@@ -301,12 +305,6 @@ def quantize_affine(
output_dtype (torch.dtype): requested dtype (e.g. torch.uint8) for output Tensor
quant_min (Optional[int]): minimum quantized value for output Tensor, if not specified, it will be derived from dtype
quant_max (Optional[int]): maximum quantized value for output Tensor, if not specified, it will be derived from dtype
zero_point_domain (ZeroPointDomain): the domain that zero_point is in, should be either integer or float
Copy link
Contributor

@jerryzh168 jerryzh168 May 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we should probably preserve these for now, and move to quant_api, same for the doc for peserve_zero arg

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. topic: for developers Use this tag if this PR is mainly developer facing topic: not user facing Use this tag if you don't want this PR to show up in release notes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants