Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add mfence before issuing rdtsc #30

Merged
merged 6 commits into from
Jan 12, 2024

Conversation

itzmeanjan
Copy link
Contributor

Given the fact that a x86 CPU doesn't guarantee when it'll actually issue RDTSC instruction, if used in isolation, we must use it with a memory fence instruction such as MFENCE so that all memory load/ store operations, which appear before RDTSC instruction, in the program order (i.e. source code or compiler generated assembly), are globally visible.

It's necessary to use a memory fence either lfence or mfence to enforce ordering constraints on CPU issuing rdtsc instruction.

See

Note, I use mfence (serializes both load and stores) instead of lfence (serializes only loads), just to be safer. It shouldn't hurt.
More about memory fencing @ https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.htm#text=fence&ig_expand=3977,4395,4395.

This should make all memory load/ store operations happening before RDTSC (in program order) globally visible. Read https://stackoverflow.com/a/12634857.

Signed-off-by: Anjan Roy <[email protected]>
…d on different optimization settings

Signed-off-by: Anjan Roy <[email protected]>
@oreparaz
Copy link
Owner

Thanks for this! I'm positive about the change.

Have you witness any difference in behavior? From a quick search I see some slight difference between mfence and lfence, do you think we should see if they cause any difference?

Comparison of unsigned expression in `>= 0` is always **true**.

Signed-off-by: Anjan Roy <[email protected]>
@itzmeanjan
Copy link
Contributor Author

Thanks for this! I'm positive about the change.

Thanks :)

Have you witness any difference in behavior?

One of the motivation behind submitting this patch is, when using dudect in production environment, we're seeing several intermittent false positive reports. False positives are mostly reported when the machine which is running these tests, is under heavy load. But with this memory fence barrier (mfence) added right before rdtsc, I've not seen any false positive reports yet.

From a quick search I see some slight difference between mfence and lfence, do you think we should see if they cause any difference?

I don't think using mfence would hurt anyway. I'll just stick to mfence because it should ensure both load and store operations appearing before (in program order) that fence must be serialized.

@itzmeanjan
Copy link
Contributor Author

We're getting some CI failures @ https://github.com/oreparaz/dudect/actions/runs/7466176204/job/20402300482, but they are failing due to different reasons.

@oreparaz
Copy link
Owner

[...] with this memory fence barrier (mfence) added right before rdtsc, I've not seen any false positive reports yet.

Nice to see that by introducing mfence you're getting less false positives.

I'll just stick to mfence because it should ensure both load and store operations appearing before (in program order) that fence must be serialized

Let's keep the mfence if that works OK for you. I'm reading a bit and there's some indication that mfence may not actually serialize the instruction stream [1] but someone else says that it does in Skylake [2]. I'll open an issue in case someone wants to do some more research here.

[1] https://www.felixcloutier.com/x86/mfence : "MFENCE does not serialize the instruction stream".
[2] https://hadibrais.wordpress.com/2018/05/14/the-significance-of-the-x86-lfence-instruction/#comment-310 "MFENCE also serializes the instruction stream on Skylake the same way LFENCE does. It doesn’t guarantee that on paper, though"

We're getting some CI failures

Fixing in #31 . I'll rebase this, add a bit of documentation and merge. Thanks!

@oreparaz
Copy link
Owner

See #32 for more comments on this.

Copy link
Owner

@oreparaz oreparaz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Merging...

src/dudect.h Show resolved Hide resolved
@oreparaz oreparaz merged commit a18fdee into oreparaz:master Jan 12, 2024
5 checks passed
@itzmeanjan itzmeanjan deleted the add-mfence-before-rdtsc branch January 13, 2024 08:48
@itzmeanjan
Copy link
Contributor Author

Let's keep the mfence if that works OK for you. I'm reading a bit and there's some indication that mfence may not actually serialize the instruction stream [1] but someone else says that it does in Skylake [2]. I'll open an issue in case someone wants to do some more research here.

Makes sense, just another resource link from GNU/Linux kernel https://github.com/torvalds/linux/blob/9f8413c4a66f2fb776d3dc3c9ed20bf435eb305e/arch/x86/include/asm/msr.h#L201-L220.

itzmeanjan added a commit to itzmeanjan/ml-kem that referenced this pull request Jan 13, 2024
Why it is important ? Read oreparaz/dudect#30

Signed-off-by: Anjan Roy <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants