-
Notifications
You must be signed in to change notification settings - Fork 295
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Order-preserving CommonDictionaryStorage #1908
base: main
Are you sure you want to change the base?
Conversation
Oh, it is not simple code. Looking at the diffs on GitHub I can't see how the thread safety is ensured without locking. I'll have to check it out locally. |
I don't think I altered the locking strategy that was currently in place, however now that I look at it I think switching Anyway, I need to review this myself since I wrote it two years ago and am a bit foggy on the details. But I would still appreciate a review if you feel like providing one. Side note, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, unfortunately, I think the way it currently is, it is not thread-safe. The main challenge is because of the change from one structure containing data (Bucket[] _buckets
) to two (int[] _indices
and List<Bucket> _buckets
). What was before an atomic operation on _buckets
, now is sometimes split into two that are not atomic.
For instance, the call TryGetValue(_buckets, key, out value)
would atomically grab field _buckets
, and as long as operations that may disrupt readers were done on a new array, which then would be atomically assigned to the field, things were fine.
Now, the call TryGetValue(_indices, _buckets, key, out value)
grabs two fields: _indices
and _buckets
in indeterminate order, that may not be in sync witch each other.
There are various ways of handling it; in my edit suggestions I propose the way based on the following rules:
- There is never a situation that
_indices
point to some bucket that does not exist. The opposite is OK - there may be a bucket that is not indexed yet.- If something is added, it is first added to
_buckets
, and only after to_indices
. - If something is removed, it is first removed from
_indices
, only then from_buckets
, and not really removed but marked asDUMMY
/Bucket.Removed
. - List
_buckets
never gets shorter (except forClear()
- special case).
- If something is added, it is first added to
- If both
_indices
and_buckets
have to be replaced by new objects (e.g. to shorten them like inClear()
), it is done in a predictable order to ensure Rule 1 is always satisfied.- If they are reset,
_indices
are reset first, then_buckets
. - If they are read,
_buckets
are read first, then_indices
.
- If they are reset,
Rule 2 is to ensure that we never get _indices
that refer to _buckets
that are not (or no longer) around. In other words, if we get a torn read of _indices
and _buckets
for the lookup, it is always new _indices
and old _buckets
.
To ensure proper ordering, Thread.MemoryBarrier()
was needed in more places than before.
In my code suggestions I tried to demonstrate what I mean by all this. I hope I got all the relevant places, but don't take my word for it.
return false; | ||
} | ||
public override bool TryGetValue(object key, out object value) | ||
=> TryGetValue(_indices, _buckets, key, out value); | ||
|
||
/// <summary> | ||
/// Static helper to try and get the value from the dictionary. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not static anymore.
/// Used so the value lookup can run against a buckets while a writer | ||
/// replaces the buckets. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This description seems obsolete.
@@ -226,7 +209,7 @@ private void UpdateHelperFunctions(Type t, object key) { | |||
|
|||
// we need to clone the buckets so any lock-free readers will only see | |||
// the old buckets which are homogeneous | |||
_buckets = (Bucket[])_buckets.Clone(); | |||
_indices = (int[])_indices.Clone(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_indices = (int[])_indices.Clone(); | |
_buckets = new List<Bucket>(_buckets); |
_indices = new int[(int)(size / Load) + 1]; | ||
_indices.AsSpan().Fill(FREE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_indices = new int[(int)(size / Load) + 1]; | |
_indices.AsSpan().Fill(FREE); | |
var newIndices = new int[(int)(size / Load) + 1]; | |
newIndices.AsSpan().Fill(FREE); | |
_indices = newIndices; |
indices[pair.Key] = buckets.Count; | ||
buckets.Add(bucket); | ||
_count++; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
indices[pair.Key] = buckets.Count; | |
buckets.Add(bucket); | |
_count++; | |
_count++; | |
buckets.Add(bucket); | |
Thread.MemoryBarrier(); | |
indices[pair.Key] = buckets.Count; |
_indices[pair.Key] = DUMMY; | ||
_buckets[pair.Value] = Bucket.Removed; | ||
Thread.MemoryBarrier(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_indices[pair.Key] = DUMMY; | |
_buckets[pair.Value] = Bucket.Removed; | |
Thread.MemoryBarrier(); | |
_indices[pair.Key] = DUMMY; | |
Thread.MemoryBarrier(); | |
_buckets[pair.Value] = Bucket.Removed; |
public override bool TryGetValue(object key, out object value) | ||
=> TryGetValue(_indices, _buckets, key, out value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
public override bool TryGetValue(object key, out object value) | |
=> TryGetValue(_indices, _buckets, key, out value); | |
public override bool TryGetValue(object key, out object value) { | |
var buckets = _buckets; | |
Thread.MemoryBarrier(); | |
var indices = _indices; | |
return TryGetValue(indices, buckets, key, out value); | |
} |
_indices = new int[8]; | ||
_indices.AsSpan().Fill(FREE); | ||
_buckets.Clear(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
_buckets.Clear()
is not enough. A new list object is needed not to mess up readers in progress.
_indices = new int[8]; | |
_indices.AsSpan().Fill(FREE); | |
_buckets.Clear(); | |
var newIndices = new int[InitialBucketSize]; | |
newIndices.AsSpan().Fill(FREE); | |
_indices = newIndices; | |
Thread.MemoryBarrier(); | |
_buckets = new List<Bucket>(); |
When I looked at the error messages from the failing test, it surely looked like a typical example of a concurrent execution of a test that is clearly not parallel-safe: the test creates a test file with a fixed name, uses it, and then deletes it. The error was "file not found" when trying to use the test file. It is exactly the same cause in all failures of Why it is happening intermittently, and why only on some OS and some instances could have been a pure incidental timing issue. |
Cherry-picked my PR for 3.6. It's a nice feature to have so why not include it in 3.4...