
Our perf thresholds have changed #178

Merged · 4 commits merged into apple:main from cb-fix-performance-thresholds on Jun 7, 2024

Conversation

@Lukasa (Contributor) commented Jun 7, 2024

No description provided.

@Lukasa added the semver/none (No version bump required.) label on Jun 7, 2024
@Lukasa requested a review from dnadoba on Jun 7, 2024 at 14:57
@Lukasa (Contributor, Author) commented Jun 7, 2024

The retain/release counts in sync code have become very unstable, and I don't think they're providing us much value in comparison to the time I'm spending trying to dial them in. So I propose we just remove them.

@dnadoba (Member) commented Jun 7, 2024

Fair! If they are not stable, even in sync code, we need to disable them. It is surprising that this now affects sync code. @hassila, are there any known issues?
I'm only aware of ordo-one/package-benchmark#189, but I don't think that should make it non-deterministic.
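For reference, "disabling them" in a package-benchmark suite amounts to leaving the ARC counters out of the configured metrics. A minimal sketch, assuming a package-benchmark benchmark target; the benchmark name and workload below are placeholders, not code from this PR:

```swift
import Benchmark

// Placeholder standing in for the real sync request/response workload.
func runSyncWorkload() -> Int {
    (0..<64).reduce(0, +)
}

let benchmarks = {
    // Hypothetical benchmark: track allocations and syscalls, but omit
    // .retainCount / .releaseCount so no ARC thresholds are recorded or checked.
    Benchmark(
        "SyncRequestResponse",
        configuration: .init(metrics: [.mallocCountTotal, .syscalls])
    ) { benchmark in
        for _ in benchmark.scaledIterations {
            blackHole(runSyncWorkload())
        }
    }
}
```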

Review comment on lines 5 to 6:

    "releaseCount" : 120271,
    "retainCount" : 109425,

Member commented:

Can we remove the retain/release counts from the output as well please?

Contributor Author replied:

Done.
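For context, after this change a per-benchmark threshold file would simply omit the two ARC keys. A sketch of the shape, with placeholder values rather than the actual numbers from this PR:

```json
{
  "mallocCountTotal" : 1000,
  "syscalls" : 50
}
```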

@Lukasa merged commit 4688f24 into apple:main on Jun 7, 2024
6 checks passed
@Lukasa deleted the cb-fix-performance-thresholds branch on Jun 7, 2024 at 15:27
@hassila commented Jun 7, 2024

Nothing that I know of should make them unstable (possibly if you run any sampling metric like thread counts etc.? If you run with only ARC metrics enabled, is it still unstable?). Then of course they don't always add up, as mentioned in the related issue: some objects are also created with an initial ref count, which is another reason we sometimes see more releases.

@Lukasa (Contributor, Author) commented Jun 10, 2024

I don't think we had any sampling metrics enabled: we recorded syscalls, allocations, and retains/releases. And the issue wasn't an adding-up problem: it was that the values changed from run-to-run.

@hassila commented Jun 10, 2024

Yeah, that's very strange then. That they don't add up is known, but they should be stable.

@hassila commented Jun 10, 2024

(To isolate it, you might try running with only ARC metrics enabled; then I can't really see how any other part of the benchmark infrastructure could affect it much.)
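A minimal sketch of that isolation run, again assuming package-benchmark; only the ARC counters are configured, so nothing else in the harness is being sampled (names and workload are placeholders):

```swift
import Benchmark

// Placeholder standing in for the real sync workload.
func runSyncWorkload() -> Int {
    (0..<64).reduce(0, +)
}

let benchmarks = {
    // Hypothetical isolation run: collect only the ARC counters, so any
    // remaining run-to-run variance cannot come from other metrics.
    Benchmark(
        "SyncRequestResponse-ARCOnly",
        configuration: .init(metrics: [.retainCount, .releaseCount])
    ) { benchmark in
        for _ in benchmark.scaledIterations {
            blackHole(runSyncWorkload())
        }
    }
}
```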

@dnadoba (Member) commented Aug 10, 2024

@Lukasa have you investigated why the thresholds have changed? How have you updated them?

I'm running ./dev/update-benchmark-thresholds locally, which produces stable output; I have just opened a PR with it: #184.
However, CI doesn't agree with those numbers, even though the values CI produces also seem to be stable.
Docker is used both locally and on CI, so I wouldn't expect any difference.

@Lukasa (Contributor, Author) commented Aug 12, 2024

My current theory is that because we aren't scaling the iterations, we're extremely sensitive to minor variations. We should have things settle down if we scale the iteration count by kilo or mega.
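A sketch of that scaling with package-benchmark's scalingFactor, assuming the inner loop is driven by scaledIterations (benchmark name and workload are placeholders):

```swift
import Benchmark

// Placeholder standing in for the real sync workload.
func runSyncWorkload() -> Int {
    (0..<64).reduce(0, +)
}

let benchmarks = {
    // Hypothetical: with .kilo the inner loop is scaled up, so each reported
    // value covers many iterations and small per-run variations average out.
    Benchmark(
        "SyncRequestResponse",
        configuration: .init(metrics: [.mallocCountTotal, .syscalls],
                             scalingFactor: .kilo)
    ) { benchmark in
        for _ in benchmark.scaledIterations {   // inner loop scaled by the configured factor
            blackHole(runSyncWorkload())
        }
    }
}
```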

Labels: semver/none (No version bump required.)
3 participants