
Add boilerplate code #1635

Closed
jainapurva wants to merge 17 commits

Conversation

Contributor @jainapurva commented Jan 29, 2025


pytorch-bot bot commented Jan 29, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1635

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

❌ 9 New Failures

As of commit 23f4a1c with merge base b2fb664:

NEW FAILURES - The following jobs have failed:

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot facebook-github-bot added the CLA Signed label Jan 29, 2025
* Fix ZeroPointDomain.NONE support & make it default for da8w8 weights

* Fix bug & apply review recommendations

* Throw exceptions when None zero_point_domain is used

* Use ZeroPointDomain.NONE for weight in int8_dynamic_activation_int8_weight

* Rebase with the latest main branch

* Fix typo
torchao/utils.py Outdated
aten = torch.ops.aten


@implements(aten.detach.default)
Contributor

One thing to consider here: what happens when a tensor subclasses TorchAOBaseTensor and tries to override these functions? To allow child tensor classes to override them, I think we'd have to copy the table into a fresh `cls._ATEN_OP_OR_TORCH_FN_TABLE` when we detect that a new child class is created, e.g.

def _implements(cls, aten_ops_or_torch_fns):
    # make sure we are querying the attribute on the current class, not a parent;
    # dir(cls) also sees inherited attributes, so check cls.__dict__ instead
    # (please check if this works)
    if "_ATEN_OP_OR_TORCH_FN_TABLE" not in cls.__dict__:
        # copy the table from the parent so child registrations don't mutate it
        cls._ATEN_OP_OR_TORCH_FN_TABLE = dict(getattr(cls, "_ATEN_OP_OR_TORCH_FN_TABLE", {}))
    ...

def _dispatch__torch_function__(cls, func, types, args=(), kwargs=None):
    # same check before dispatching (please check if this works)
    if "_ATEN_OP_OR_TORCH_FN_TABLE" not in cls.__dict__:
        # copy the table from the parent if it exists
        cls._ATEN_OP_OR_TORCH_FN_TABLE = dict(getattr(cls, "_ATEN_OP_OR_TORCH_FN_TABLE", {}))
    ...
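For reference, a self-contained sketch of this copy-on-first-registration pattern; the class and decorator names below are illustrative, not the actual torchao API:

import torch

class BaseTensorSketch(torch.Tensor):
    # per-class dispatch table: op or torch function -> handler
    _ATEN_OP_OR_TORCH_FN_TABLE = {}

    @classmethod
    def implements(cls, op):
        # on the first registration from a subclass, give it its own copy
        # of the table so it can shadow parent entries without mutating them
        if "_ATEN_OP_OR_TORCH_FN_TABLE" not in cls.__dict__:
            cls._ATEN_OP_OR_TORCH_FN_TABLE = dict(cls._ATEN_OP_OR_TORCH_FN_TABLE)

        def decorator(fn):
            cls._ATEN_OP_OR_TORCH_FN_TABLE[op] = fn
            return fn

        return decorator

class ChildTensorSketch(BaseTensorSketch):
    pass

@BaseTensorSketch.implements("detach")
def base_detach(*args, **kwargs): ...

@ChildTensorSketch.implements("detach")
def child_detach(*args, **kwargs): ...

# the child's override does not leak into the parent's table
assert BaseTensorSketch._ATEN_OP_OR_TORCH_FN_TABLE["detach"] is base_detach
assert ChildTensorSketch._ATEN_OP_OR_TORCH_FN_TABLE["detach"] is child_detach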

torchao/utils.py Outdated
class TorchAOBaseTensor(torch.Tensor):
    """A util tensor subclass that provides commonly used functions
    new tensor subclass can inherit it to get all the utility functions, and
Contributor

What are the pros and cons of using inheritance here versus just having utility functions, with tensors using what they need from them, without inheritance? It might be hard to come up with a TorchAOBaseTensor that is generic enough to truly handle all the important use cases in torchao.

Contributor @jerryzh168 commented Jan 29, 2025

Makes sense. We could have both, I think; e.g. `_get_to_kwargs` can be a standalone util function.

Contributor Author

As per my understanding:
Option 1: We can move all the util functions out and make an independent tensor subclass (inheriting from torch.Tensor); the developer will then have to define/inherit the util functions themselves.
Option 2: We can add the minimum needed util functions to TorchAOBaseTensor, so that it gives the developer a base to start from and build on top of.

Contributor

Depends on how much can be reused by other tensor subclasses, I think. If TorchAOBaseTensor is very small, it may not make sense to have it anymore, and it will not be useful for the existing inference tensor subclasses. I feel it might be better to have both the current TorchAOBaseTensor and some utils that can be reused by other tensor subclasses as a starting point, and then adapt based on use cases.
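For illustration, a sketch of what a reusable standalone util could look like; the free-function name `get_to_kwargs` is hypothetical, and the argument parsing mirrors what `nn.Module.to` does rather than the exact torchao implementation:

import torch

def get_to_kwargs(tensor: torch.Tensor, *args, **kwargs):
    # standalone util: normalize .to(...) positional/keyword arguments into
    # an explicit kwargs dict, usable by any tensor subclass without inheritance
    device, dtype, _, memory_format = torch._C._nn._parse_to(*args, **kwargs)
    return {
        "device": device if device is not None else tensor.device,
        "dtype": dtype if dtype is not None else tensor.dtype,
        "memory_format": memory_format,
    }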

janeyx99 and others added 9 commits January 29, 2025 15:24
Pass all args to pytest.main to propagate user options like -k

Tested locally with `python test/test_ops.py -k test_dequantize_tensor_core_tiled_layout_correctness_quant_dequant`, which previously just ran all the tests; after this PR it runs 60 tests, the same number as `pytest test/test_ops.py -k test_dequantize_tensor_core_tiled_layout_correctness_quant_dequant` (see the sketch after this commit list).
only run docs CI jobs when docs have changed
…sion

Differential Revision: D68726705

Pull Request resolved: #1630
There's a lot of content in the contributor guide that belongs better in "Quantization Overview", so here we split the content and put it on the right pages.
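The pytest.main change above is presumably along these lines; a sketch, since the exact entry point in test/test_ops.py may differ:

import sys
import pytest

if __name__ == "__main__":
    # forward the user's CLI options (e.g. -k <pattern>) to pytest
    # instead of unconditionally running every test in this file
    pytest.main([__file__] + sys.argv[1:])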
@jainapurva jainapurva marked this pull request as draft January 30, 2025 22:27
[ghstack-poisoned]
@jainapurva jainapurva force-pushed the gh/jainapurva/3/head branch from 49c8514 to d42c725 Compare January 30, 2025 22:56
@jainapurva jainapurva added the topic: for developers and topic: improvement labels Jan 30, 2025
vkuzo and others added 4 commits January 30, 2025 20:06
Summary:

Adds the workaround from
pytorch/pytorch#141881 to the torchao float8
rowwise recipe, to reduce memory usage when FSDP is on.

Test Plan: tested in torchtitan, LLaMa 3 8B 8xH100 training with the rowwise recipe; peak memory decreased from 67GiB to 59GiB.

* more stringent test for CPUOffloadOptimizer

* fix missing sync
* synchronize param H2D

* let CPU offload inherit Optimizer

* add scheduler to test
stack-info: PR: #1658, branch: drisspg/stack/32

def _get_to_kwargs(self, *args, **kwargs):
Contributor @jerryzh168 commented Feb 4, 2025

This should be preserved, I think, since it's called in child classes; we can just call the util function above.
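i.e., roughly the following; a sketch reusing the hypothetical `get_to_kwargs` util from the earlier comment:

class TorchAOBaseTensor(torch.Tensor):
    def _get_to_kwargs(self, *args, **kwargs):
        # keep the method so child classes that call it keep working,
        # but delegate the actual work to the standalone util
        return get_to_kwargs(self, *args, **kwargs)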

@jainapurva jainapurva closed this Feb 10, 2025
Labels: CLA Signed, topic: for developers, topic: improvement