Don't use input.count in PrefixThrough and PrefixUpTo for speedup. #354

BjornRuud · 2024-10-09T14:54:45Z

I wrote a simple HTML tokenizer and, while profiling it with Instruments, noticed that PrefixThrough was spending a lot of time in Substring.count, which in turn calls distance(from:to:). I assume this is because a substring's character count isn't stored and has to be computed by walking the string, due to how Unicode works. This patch replaces the substring count check with a check against the match's end index.

I was unable to make sense of the results from the benchmark app, but in my HTML project (linked below) there was a substantial speedup. That repo has a benchmark app in the swift-parsing branch, and you can switch between swift-parsing versions in Package.swift.

https://github.com/BjornRuud/HTMLLexer.git

Calling Substring.count computes the length by walking the string, which can be slow for large strings.
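To illustrate the difference between the two checks, here is a minimal sketch. It is not the library's code or the actual diff: `prefixUpToCounting` and `prefixUpToIndexing` are hypothetical free functions that only mimic the shape of a "prefix up to a match" scan over a Substring. The first measures the remaining input with `count` at every candidate position, while the second asks for the match's end index with `index(_:offsetBy:limitedBy:)`, which advances at most as many characters as the match is long.

```swift
// Hypothetical helpers for illustration only; not swift-parsing's implementation.

/// Count-based check: `rest.count` walks the entire remainder
/// each time a candidate first character is found.
func prefixUpToCounting(_ needle: Substring, in input: Substring) -> Substring? {
    guard let first = needle.first else { return input[..<input.startIndex] }
    let needleCount = needle.count
    var rest = input
    while let index = rest.firstIndex(of: first) {
        rest = rest[index...]
        if rest.count >= needleCount, rest.starts(with: needle) {
            return input[..<index]
        }
        rest.removeFirst()
    }
    return nil
}

/// Index-based check: advance at most `needleCount` characters from the
/// candidate position; the result is nil if the remainder is too short.
func prefixUpToIndexing(_ needle: Substring, in input: Substring) -> Substring? {
    guard let first = needle.first else { return input[..<input.startIndex] }
    let needleCount = needle.count
    var rest = input
    while let index = rest.firstIndex(of: first) {
        rest = rest[index...]
        if let matchEnd = rest.index(index, offsetBy: needleCount, limitedBy: rest.endIndex),
           rest[index..<matchEnd] == needle {
            return input[..<index]
        }
        rest.removeFirst()
    }
    return nil
}

let page: Substring = "<p>hello</p><!-- note -->"
print(prefixUpToIndexing("<!--", in: page) ?? "no match")  // prints "<p>hello</p>"
```

Both functions return the same results; the difference is only in how much of the input each candidate check has to traverse, which matters when the input is large and the first character of the match occurs often.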

stephencelis

Good find, thanks!

BjornRuud · 2024-10-09T17:25:48Z

The CI Ubuntu test failure is the same error I get for both the benchmark and tests with Swift 6.

Don't use input.count in PrefixThrough and PrefixUpTo for speedup.

d3fa32f

Calling Substring.count computes the length by walking the string, which can be slow for large strings.

stephencelis approved these changes Oct 9, 2024


Merge branch 'main' into prefix-faster

a6ae32c

stephencelis merged commit b23c636 into pointfreeco:main Oct 9, 2024
2 checks passed
