Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use regular division inside Scale Estimation #3210

Merged

Conversation

nikita-savelyevv
Copy link
Collaborator

@nikita-savelyevv nikita-savelyevv commented Jan 23, 2025

Changes

Compute division inside SE algorithm always as a/b instead of a*(1/b) in some cases.

Reason for changes

During implementation #2727 some choices were made regarding how division operation is computed in order for the changes to be completely aligned with the previous implementation. Namely, before #2727 some divisions were computed as a*(1/b), and this is currently still the case.

The way these divisions are computed originally was not intended. Now, all divisions are aligned to the a/b form.

Compression time and memory are roughly the same.

Model Compression Compression Time Develop (sec.) Compression Time Branch (sec.) Peak Memory Develop (MiB) Peak Memory Branch (MiB)
tiny-llama int4, SE 222* 228* 3030 3032
phi4-mini in4, SE 789* 790* 10817 10768
llama-3.1-8b int4, SE 1776* 1801* 17756 18224

*time column includes PT -> OV conversion time.

Related tickets

163286

Tests

@github-actions github-actions bot added NNCF OpenVINO Pull requests that updates NNCF OpenVINO NNCF PTQ Pull requests that updates NNCF PTQ labels Jan 23, 2025
Comment on lines 38 to 42
tinyllama_scale_estimation_per_channel_backend_TORCH:
metric_value: 0.81389
metric_value: 0.80873
num_int4: 188
num_int8: 124
atol: 0.006 # difference across devices: 0.80873 vs 0.81389
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A reference value for PT is changed to stay aligned with OV. Interestingly, the new value for OV (0.80873) is also the value that PT backend gets on some devices.

@nikita-savelyevv nikita-savelyevv marked this pull request as ready for review February 28, 2025 10:55
@nikita-savelyevv nikita-savelyevv requested a review from a team as a code owner February 28, 2025 10:55
@MaximProshin
Copy link
Collaborator

@nikita-savelyevv , please provide time compression data with these changes before we merge it

@nikita-savelyevv
Copy link
Collaborator Author

@nikita-savelyevv , please provide time compression data with these changes before we merge it

Added

@alexsu52 alexsu52 merged commit 73590b0 into openvinotoolkit:develop Mar 10, 2025
17 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NNCF OpenVINO Pull requests that updates NNCF OpenVINO NNCF PTQ Pull requests that updates NNCF PTQ
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants