Don't use input.count in PrefixThrough and PrefixUpTo for speedup. #354

Merged on Oct 9, 2024 (2 commits)

Conversation

BjornRuud
Contributor

I wrote a simple HTML tokenizer and noticed in Instruments that PrefixThrough was spending a lot of time in Substring.count, which in turn calls distance(from:to:). I assume this is because the character count of a substring isn't stored and has to be computed by walking the string, due to how Unicode works. This patch replaces the substring count check with a check against the match's end index.
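For readers who want to see the idea in isolation, here is a minimal sketch of the two approaches. It is not the swift-parsing implementation: the function names `prefixUpToUsingCount` and `prefixUpToUsingIndex` are hypothetical and exist only for this illustration. The first version checks `count` on the remaining input at every position, which walks every Character left in the string. The second asks only whether an end index `delimiter.count` Characters ahead exists, which walks at most the length of the delimiter.

```swift
// Hypothetical helpers for illustration only; not the library's code.

/// Count-based check: `remaining.count` is O(n) in the rest of the input,
/// so the whole scan can degrade to O(n^2) on large inputs.
func prefixUpToUsingCount(_ input: Substring, _ delimiter: Substring) -> Substring? {
    let delimiterCount = delimiter.count
    var index = input.startIndex
    while index < input.endIndex {
        let remaining = input[index...]
        // Walks every Character in the remainder just to compare lengths.
        guard remaining.count >= delimiterCount else { return nil }
        if remaining.starts(with: delimiter) {
            return input[..<index]
        }
        input.formIndex(after: &index)
    }
    return nil
}

/// Index-based check: only advances `delimiterCount` Characters from the
/// current position, and returns nil if that would pass `endIndex`.
func prefixUpToUsingIndex(_ input: Substring, _ delimiter: Substring) -> Substring? {
    let delimiterCount = delimiter.count  // delimiter is small; computed once
    var index = input.startIndex
    while index < input.endIndex {
        guard let matchEnd = input.index(
            index, offsetBy: delimiterCount, limitedBy: input.endIndex
        ) else { return nil }  // not enough input left for a match
        if input[index..<matchEnd] == delimiter {
            return input[..<index]
        }
        input.formIndex(after: &index)
    }
    return nil
}

let html: Substring = "<!-- note --><p>"
if let prefix = prefixUpToUsingIndex(html, "-->") {
    print(prefix)  // prints "<!-- note "
}
```

Both functions return the same result; the difference is only in how much of the input each position check has to traverse.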

I was unable to make sense of the results from the benchmark app, but my HTML project (linked below) showed a substantial speedup. That repo has a benchmark app in the swift-parsing branch, and you can switch between swift-parsing versions in Package.swift.

https://github.com/BjornRuud/HTMLLexer.git

Calling Substring.count will calculate the length of the string the first
time it is called which for large strings can be slow.
Member

stephencelis left a comment

Good find, thanks!

@BjornRuud
Contributor Author

The CI Ubuntu test failure is the same error I get for both the benchmark and tests with Swift 6.

stephencelis merged commit b23c636 into pointfreeco:main on Oct 9, 2024
2 checks passed