nogil `len` and `eq` break atomicity on sets and dicts #129519

luisggpina · 2025-01-31T14:33:45Z

Bug report

Bug description:

Hi,

We're a research group focused on testing concurrent runtimes. Our work-in-progress prototype found a violation of atomicity on the current nogil build when using set __len__ or __eq__ with other concurrent operations on the same set/dict. The program below shows the wrong behavior for __len__ and __ixor__:

import threading
import sys
def t0(b1,s,res):
    b1.wait()
    res.append(s.__len__())

def t1(b1,s,res):
    b1.wait()
    s.__ixor__({0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 'a'})

def Test():
  s =  {17, 'a', 'b', 18, 'c', 'd', 'e'}
  threads=[]
  barrier = threading.Barrier(2)
  res = []
  threads.append(threading.Thread(target= t0, args=(barrier, s,res)))
  threads.append(threading.Thread(target= t1, args=(barrier, s,res)))
  for i in range(0, len(threads)):
      threads[i].start()
  for i in range(0, len(threads)):
      threads[i].join()
  if res[0] not in { 7 , 32 }:
      print("found bug: " + str(res[0]))

print("test begin...")
for i in range(0,50000):
  threads = []
  if i % 1000 == 0:
      print(i)
  for i in range(0,100):
      threads.append(threading.Thread(target= Test))
  for t in threads:
      t.start()
  for t in threads:
      t.join()
print("test Done")

Operation __len__ should see the initial set and return 7 or the set after the __ixor__ completes and return 32. However, it can see internal changes made by __ixor__ and return different numbers. Here's an example execution:

test begin...
0
found bug: 9
found bug: 17
found bug: 22
found bug: 30
found bug: 9
found bug: 9
found bug: 9
found bug: 9
found bug: 9

We have observed the same problem when __len__ runs concurrently with other operations: symmetric_difference_update, __isub__, and update. We'll share these test below in a comment.

We observed the same problem when dict's __len__ runs concurrently with __ior__:

import threading
import sys
def t0(b1,s,res):
    b1.wait()
    res.append(s.__len__())

def t1(b1,s,res):
    b1.wait()
    s.__ior__({0: 0, 1: 1, 2: 2, 3: 3, 4: 4})

def Test():
  d =  {'k1': 'v1', 'k2': 'v2', 'k3': 3, 10: 10}
  threads=[]
  barrier = threading.Barrier(2)
  res = []
  threads.append(threading.Thread(target= t0, args=(barrier, d,res)))
  threads.append(threading.Thread(target= t1, args=(barrier, d,res)))
  for i in range(0, len(threads)):
      threads[i].start()
  for i in range(0, len(threads)):
      threads[i].join()
  if res[0] not in { 4, 9 }:
      print("found bug: " + str(res[0]))

print("test begin...")
for i in range(0,50000):
  threads = []
  if i % 1000 == 0:
      print(i)
  for i in range(0,100):
      threads.append(threading.Thread(target= Test))
  for t in threads:
      t.start()
  for t in threads:
      t.join()
print("test Done")

Again, the length of the dict should be either 4 or 9, depending on whether __ior__ took place. However, __len__ can see internal changes from __ior__ as shown by the output:

test begin...
0
found bug: 5
found bug: 6
found bug: 5
found bug: 5
found bug: 8
found bug: 7

Finally, we observed set's __eq__ to be able to access intermediate results of __isub__, as shown by the program below:

import threading
import sys
def t0(b1,s,res):
    b1.wait()
    s.__isub__({1, 'b', 'd', 17})

def t1(b1,s,res):
    b1.wait()
    res.append(s.__eq__({'a', 'b', 18, 'c', 'd', 'e'}))

def Test():
  s =  {'a', 17, 'b', 18, 'c', 'd', 'e'}
  threads=[]
  barrier = threading.Barrier(2)
  res = []
  threads.append(threading.Thread(target= t0, args=(barrier, s,res)))
  threads.append(threading.Thread(target= t1, args=(barrier, s,res)))
  for i in range(0, len(threads)):
      threads[i].start()
  for i in range(0, len(threads)):
      threads[i].join()
  if res[0] != False:
      print("found bug: " + str(res[0]))

print("test begin...")
for i in range(0,50000):
  threads = []
  if i % 1000 == 0:
      print(i)
  for i in range(0,100):
      threads.append(threading.Thread(target= Test))
  for t in threads:
      t.start()
  for t in threads:
      t.join()
print("test Done")

The original and resulting sets are never equal to the argument passed to __eq__, however the operation can see internal changes from __isub__ as shown by the ouput:

test begin...
0
found bug: True
found bug: True
found bug: True
found bug: True
found bug: True
found bug: True

@flypoodles and @overlorde are part of the team, adding them so they get notified about further discussion.

We note that we observed these outputs in several x86_64 machines and one ARM machine.

CPython versions tested on:

CPython main branch, 3.14

Operating systems tested on:

Linux

The text was updated successfully, but these errors were encountered:

luisggpina · 2025-01-31T14:38:47Z

Adding the tests mentioned in the main issue here, with sample outputs.

__len__ and symmetric_difference_update:

import threading
import sys
def t0(b1,s,res):
    b1.wait()
    res.append(s.__len__())

def t1(b1,s,res):
    b1.wait()
    s.symmetric_difference_update({3, 4, 'b', 'e', 'f'})

def Test():
  s =  {1, 2, 'a', 'b', 'c', 'd', 'e'}
  threads=[]
  barrier = threading.Barrier(2)
  res = []
  threads.append(threading.Thread(target= t0, args=(barrier, s,res)))
  threads.append(threading.Thread(target= t1, args=(barrier, s,res)))
  for i in range(0, len(threads)):
      threads[i].start()
  for i in range(0, len(threads)):
      threads[i].join()
  if res[0] not in { 7, 8 }:
      print("found bug: " + str(res))

print("test begin...")
for i in range(0,50000):
  threads = []
  if i % 1000 == 0:
      print(i)
  for i in range(0,100):
      threads.append(threading.Thread(target= Test))
  for t in threads:
      t.start()
  for t in threads:
      t.join()
print("test Done")

Output:

test begin...
0
found bug: [9]
found bug: [9]
found bug: [9]
found bug: [9]
found bug: [9]
found bug: [9]
found bug: [9]
found bug: [9]
found bug: [9]
found bug: [9]
found bug: [9]
found bug: [9]
found bug: [9]

__len__ and __isub__ (this one is rare, you may have to run it for a while to observe the behavior we mention; ARM hardware seems to observe it more often):

import threading
import sys
def t0(b1,s,res):
    b1.wait()
    s.__isub__({1, 'b', 17, 'd'})

def t1(b1,s,res):
    b1.wait()
    res.append(s.__len__())

def Test():
  s =  {'a', 17, 18, 'b', 'c', 'd', 'f'}
  threads=[]
  barrier = threading.Barrier(2)
  res = []
  threads.append(threading.Thread(target= t0, args=(barrier, s,res)))
  threads.append(threading.Thread(target= t1, args=(barrier, s,res)))
  for i in range(0, len(threads)):
      threads[i].start()
  for i in range(0, len(threads)):
      threads[i].join()
  if res[0] not in { 4, 7 }:
      print("found bug: " + str(res[0]))

print("test begin...")
for i in range(0,50000):
  threads = []
  if i % 1000 == 0:
      print(i)
  for i in range(0,100):
      threads.append(threading.Thread(target= Test))
  for t in threads:
      t.start()
  for t in threads:
      t.join()
print("test Done")

Output:

test begin...
0
1000
2000
3000
found bug: 5
4000
found bug: 5

__len__ and update:

import threading
import sys
def t0(b1,s,res):
    b1.wait()
    res.append(s.__len__())

def t1(b1,s,res):
    b1.wait()
    s.update({0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 'a'})

def Test():
  s =  {17, 18, 'a', 'b', 'c', 'd', 'e'}
  threads=[]
  barrier = threading.Barrier(2)
  res = []
  threads.append(threading.Thread(target= t0, args=(barrier, s,res)))
  threads.append(threading.Thread(target= t1, args=(barrier, s,res)))
  for i in range(0, len(threads)):
      threads[i].start()
  for i in range(0, len(threads)):
      threads[i].join()
  if res[0] not in { 7, 35 }:
      print("found bug: " + str(res[0]))

print("test begin...")
for i in range(0,50000):
  threads = []
  if i % 1000 == 0:
      print(i)
  for i in range(0,100):
      threads.append(threading.Thread(target= Test))
  for t in threads:
      t.start()
  for t in threads:
      t.join()
print("test Done")

Output:

test begin...
0
found bug: 24
found bug: 24
found bug: 21
found bug: 24
found bug: 32
found bug: 25
found bug: 28

ZeroIntensity · 2025-01-31T19:35:30Z

There's two possibilities here, as far as I can tell:

The lockless-ness of the set implementation is causing issues--the length isn't protected by a lock, so it's free to be read by another thread before the set finishes the update. Fixing this would involve extra locking, which would hurt performance; I'm not convinced it'd be worth doing that for this single use case.
There might be an actual race on the used field of set objects. An atomic load/store is used for most cases, but some functions (e.g. set_merge_lock_held) read it non-atomically and rely on the critical section. I don't think that's thread-safe, because not all functions (including __len__ in this case) hold the lock.

luisggpina added the type-bug An unexpected behavior, bug, or error label Jan 31, 2025

luisggpina mentioned this issue Jan 31, 2025

nogil violation of atomicity on set operations #126136

Open

Eclips4 added the topic-free-threading label Jan 31, 2025

picnixz added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Jan 31, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

nogil `len` and `eq` break atomicity on sets and dicts #129519

nogil `len` and `eq` break atomicity on sets and dicts #129519

luisggpina commented Jan 31, 2025 •

edited

Loading

luisggpina commented Jan 31, 2025 •

edited

Loading

ZeroIntensity commented Jan 31, 2025

nogil __len__ and __eq__ break atomicity on sets and dicts #129519

nogil __len__ and __eq__ break atomicity on sets and dicts #129519

Comments

luisggpina commented Jan 31, 2025 • edited Loading

Bug report

Bug description:

CPython versions tested on:

Operating systems tested on:

luisggpina commented Jan 31, 2025 • edited Loading

Adding the tests mentioned in the main issue here, with sample outputs.

ZeroIntensity commented Jan 31, 2025

nogil `len` and `eq` break atomicity on sets and dicts #129519

nogil `len` and `eq` break atomicity on sets and dicts #129519

luisggpina commented Jan 31, 2025 •

edited

Loading

luisggpina commented Jan 31, 2025 •

edited

Loading