support free-threaded Python #165
base: main
Conversation
Codecov Report
Attention: Patch coverage is
Additional details and impacted files:

@@            Coverage Diff             @@
##             main     #165      +/-   ##
==========================================
- Coverage   88.92%   88.74%   -0.19%
==========================================
  Files          13       13
  Lines        2195     2203       +8
  Branches     2195     2203       +8
==========================================
+ Hits         1952     1955       +3
- Misses        148      153       +5
  Partials      95       95

... and 1 file with indirect coverage changes. Continue to review the full report in Codecov by Sentry.
CodSpeed Performance Report
Merging #165 will improve performance by 11.93%.
@@ -86,28 +85,34 @@ impl StringMaybeCache for StringNoCache {
     }
 }

-static STRING_CACHE: GILOnceCell<GILProtected<RefCell<PyStringCache>>> = GILOnceCell::new();
+static STRING_CACHE: OnceLock<Mutex<PyStringCache>> = OnceLock::new();
Despite the use of a mutex here, the single-threaded benchmark is not meaningfully impacted.
We can worry about multithreaded performance later, if highly parallel uses of jiter arise; and if users hit a pathological case before we do anything fancy, they can always turn off the string cache.
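For reference, turning the cache off is just a keyword argument on the Python side. A minimal sketch (the payload here is made up, but cache_mode="none" is the option referred to later in this thread):

import jiter

# Skip the shared string cache entirely, so no mutex is taken while parsing.
parsed = jiter.from_json(b'{"name": "value"}', cache_mode="none")
assert parsed == {"name": "value"}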
import json
from concurrent.futures import ThreadPoolExecutor
from typing import Any

import jiter

# JITER_BENCH_DATAS is the list of benchmark JSON payloads defined elsewhere in the test suite.


def test_multithreaded_parsing():
    """Basic sanity check that running a parse in multiple threads is fine."""
    expected_datas = [json.loads(data) for data in JITER_BENCH_DATAS]

    def assert_jiter_ok(data: bytes, expected: Any) -> bool:
        return jiter.from_json(data) == expected

    with ThreadPoolExecutor(8) as pool:
        results = []
        for _ in range(1000):
            for data, expected_result in zip(JITER_BENCH_DATAS, expected_datas):
                results.append(pool.submit(assert_jiter_ok, data, expected_result))

        for result in results:
            assert result.result()
Can confirm with this simple test that the Mutex basically stops parallelism when the cache is enabled; using jiter.from_json(data, cache_mode="none") leads to about an 8x speedup on my machine.
I still don't mind that really; this PR just gets the free-threaded mode working, and we can worry about that sort of optimization later.
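For anyone wanting to reproduce the comparison, a harness along these lines is roughly what the numbers above refer to. This is a sketch only: the helper name and payload are hypothetical, cache_mode="all" is assumed to be the cache-enabled default, and actual timings depend on the machine and the Python build.

import time
from concurrent.futures import ThreadPoolExecutor

import jiter


def timed_parallel_parse(data: bytes, cache_mode: str, n: int = 1000) -> float:
    """Parse `data` n times across 8 threads and return elapsed seconds."""
    start = time.perf_counter()
    with ThreadPoolExecutor(8) as pool:
        # list() drains the iterator so every parse finishes before we stop the clock.
        list(pool.map(lambda _: jiter.from_json(data, cache_mode=cache_mode), range(n)))
    return time.perf_counter() - start


data = b'{"id": 1, "name": "example"}'
print("cache on: ", timed_parallel_parse(data, "all"))
print("cache off:", timed_parallel_parse(data, "none"))

On a free-threaded build, the second call avoids contention on the cache mutex, which is where a difference like the ~8x figure above comes from.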
@samuelcolvin I think this is good to ship; the main thing for you to be aware of is the mutex on the cache, as per the above, so parallelism is not as good as it could be with the cache enabled.
Add free-threaded Python support.
Pushing a bit early so that I can see the results of benchmarks & CI.