
Add boilerplate code #1635

Closed
jainapurva wants to merge 17 commits

Conversation

Contributor @jainapurva commented Jan 29, 2025


pytorch-bot bot commented Jan 29, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1635

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

❌ 9 New Failures

As of commit 23f4a1c with merge base b2fb664:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed label Jan 29, 2025
* Fix ZeroPointDomain.NONE support & make it default for da8w8 weights

* Fix bug & apply review recommendations

* Throw exceptions when None zero_point_domain is used

* Use ZeroPointDomain.NONE for weight in int8_dynamic_activation_int8_weight

* Rebase with the latest main branch

* Fix typo
torchao/utils.py Outdated
aten = torch.ops.aten


@implements(aten.detach.default)
Contributor

One thing to consider here: what happens when a tensor subclasses TorchAOBaseTensor and tries to override these functions? To allow child tensor classes to override them, I think we'd have to copy the table into a fresh `cls._ATEN_OP_OR_TORCH_FN_TABLE` when we detect that a new child class is created, e.g.

def _implements(cls, aten_ops_or_torch_fns):
    # make sure we are querying the attribute on the current class, not a parent;
    # dir(cls) also sees inherited attributes, so check cls.__dict__ instead
    # (please check if this works)
    if "_ATEN_OP_OR_TORCH_FN_TABLE" not in cls.__dict__:
        # copy the table from the parent so child registrations don't mutate it
        cls._ATEN_OP_OR_TORCH_FN_TABLE = dict(getattr(cls, "_ATEN_OP_OR_TORCH_FN_TABLE", {}))
    ...

def _dispatch__torch_function__(cls, func, types, args=(), kwargs=None):
    # same check before dispatching (please check if this works)
    if "_ATEN_OP_OR_TORCH_FN_TABLE" not in cls.__dict__:
        # copy the table from the parent if it exists
        cls._ATEN_OP_OR_TORCH_FN_TABLE = dict(getattr(cls, "_ATEN_OP_OR_TORCH_FN_TABLE", {}))
    ...
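For reference, a self-contained sketch of this copy-on-first-registration pattern; the class and decorator names below are illustrative, not the actual torchao API:

import torch

class BaseTensorSketch(torch.Tensor):
    # per-class dispatch table: op or torch function -> handler
    _ATEN_OP_OR_TORCH_FN_TABLE = {}

    @classmethod
    def implements(cls, op):
        # on the first registration from a subclass, give it its own copy
        # of the table so it can shadow parent entries without mutating them
        if "_ATEN_OP_OR_TORCH_FN_TABLE" not in cls.__dict__:
            cls._ATEN_OP_OR_TORCH_FN_TABLE = dict(cls._ATEN_OP_OR_TORCH_FN_TABLE)

        def decorator(fn):
            cls._ATEN_OP_OR_TORCH_FN_TABLE[op] = fn
            return fn

        return decorator

class ChildTensorSketch(BaseTensorSketch):
    pass

@BaseTensorSketch.implements("detach")
def base_detach(*args, **kwargs): ...

@ChildTensorSketch.implements("detach")
def child_detach(*args, **kwargs): ...

# the child's override does not leak into the parent's table
assert BaseTensorSketch._ATEN_OP_OR_TORCH_FN_TABLE["detach"] is base_detach
assert ChildTensorSketch._ATEN_OP_OR_TORCH_FN_TABLE["detach"] is child_detach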

torchao/utils.py Outdated
class TorchAOBaseTensor(torch.Tensor):
    """A util tensor subclass that provides commonly used functions
    new tensor subclass can inherit it to get all the utility functions, and
Contributor

What are the pros and cons of using inheritance here versus just having utility functions, with tensors using what they need from them, without inheritance? It might be hard to come up with a TorchAOBaseTensor that is generic enough to truly handle all the important use cases in torchao.

Contributor @jerryzh168 commented Jan 29, 2025

Makes sense. We could have both, I think; e.g. `_get_to_kwargs` can be a standalone util function.

Contributor Author

As per my understanding:
Option 1: We can move all the util functions out and make an independent tensor subclass (inheriting from torch.Tensor); the developer will then have to define/inherit the util functions themselves.
Option 2: We can add the minimum needed util functions to TorchAOBaseTensor, so that it gives the developer a base to start from and build on top of.

Contributor

Depends on how much can be reused by other tensor subclasses, I think. If TorchAOBaseTensor is very small, it may not make sense to have it anymore, and it will not be useful for the existing inference tensor subclasses. I feel it might be better to have both the current TorchAOBaseTensor and some utils that can be reused by other tensor subclasses as a starting point, and then adapt based on use cases.
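For illustration, a sketch of what a reusable standalone util could look like; the free-function name `get_to_kwargs` is hypothetical, and the argument parsing mirrors what `nn.Module.to` does rather than the exact torchao implementation:

import torch

def get_to_kwargs(tensor: torch.Tensor, *args, **kwargs):
    # standalone util: normalize .to(...) positional/keyword arguments into
    # an explicit kwargs dict, usable by any tensor subclass without inheritance
    device, dtype, _, memory_format = torch._C._nn._parse_to(*args, **kwargs)
    return {
        "device": device if device is not None else tensor.device,
        "dtype": dtype if dtype is not None else tensor.dtype,
        "memory_format": memory_format,
    }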

janeyx99 and others added 9 commits January 29, 2025 15:24
Pass all args to pytest.main to propagate user options like -k

Tested locally with `python test/test_ops.py -k test_dequantize_tensor_core_tiled_layout_correctness_quant_dequant`, which previously just ran all the tests; after this PR it runs 60 tests, the same number as `pytest test/test_ops.py -k test_dequantize_tensor_core_tiled_layout_correctness_quant_dequant` (see the sketch after this commit list).
only run docs CI jobs when docs have changed
…sion

Differential Revision: D68726705

Pull Request resolved: #1630
There's a lot of content in the contributor guide that belongs better in "Quantization Overview", so here we split the content and put it on the right pages.
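The pytest.main change above is presumably along these lines; a sketch, since the exact entry point in test/test_ops.py may differ:

import sys
import pytest

if __name__ == "__main__":
    # forward the user's CLI options (e.g. -k <pattern>) to pytest
    # instead of unconditionally running every test in this file
    pytest.main([__file__] + sys.argv[1:])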
@jainapurva jainapurva marked this pull request as draft January 30, 2025 22:27
[ghstack-poisoned]
@jainapurva jainapurva force-pushed the gh/jainapurva/3/head branch from 49c8514 to d42c725 Compare January 30, 2025 22:56
@jainapurva jainapurva added the topic: for developers and topic: improvement labels Jan 30, 2025
vkuzo and others added 4 commits January 30, 2025 20:06
Summary:

Adds the workaround from
pytorch/pytorch#141881 to the torchao float8
rowwise recipe, to reduce memory usage when FSDP is on.

Test Plan: tested in torchtitan, LLaMa 3 8B 8xH100 training with the rowwise recipe; peak memory decreased from 67GiB to 59GiB.

* more stringent test for CPUOffloadOptimizer

* fix missing sync
* synchronize param H2D

* let CPU offload inherit Optimizer

* add scheduler to test
stack-info: PR: #1658, branch: drisspg/stack/32

def _get_to_kwargs(self, *args, **kwargs):
Contributor @jerryzh168 commented Feb 4, 2025

This should be preserved, I think, since it's called in child classes; we can just call the util function above.
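i.e., roughly the following; a sketch reusing the hypothetical `get_to_kwargs` util from the earlier comment:

class TorchAOBaseTensor(torch.Tensor):
    def _get_to_kwargs(self, *args, **kwargs):
        # keep the method so child classes that call it keep working,
        # but delegate the actual work to the standalone util
        return get_to_kwargs(self, *args, **kwargs)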

@jainapurva jainapurva closed this Feb 10, 2025
Labels: CLA Signed, topic: for developers, topic: improvement