Due to typos or OCR errors, regex patterns may not always match when they probably should, e.g. a capital O typed instead of a zero in a British postcode, where letters and digits are not usually interchangeable.
Yes, that should be a really good approach. It seems `regex` is backwards compatible with `re`, so we can swap it in as a replacement!
We have to figure out exactly how many errors we will allow, and perhaps default to 0 to stay backwards compatible, but I can visualise that every detector that detects RegexFilth should be able to have an 'exact' regex and its approximate counterpart.
It might be interesting to allow regexes to be matched fuzzily, and the `regex` package allows this: https://pypi.org/project/regex/#approximate-fuzzy-matching-hg-issue-12-hg-issue-41-hg-issue-109
We should investigate its use instead of the built-in `re`.
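To make the idea concrete, here is a minimal sketch of what an exact pattern and its approximate counterpart might look like with the third-party `regex` package. The helper name `fuzzy_fullmatch`, the `max_substitutions` parameter, and the simplified postcode pattern are all assumptions made up for illustration; none of them come from the existing detectors, and the real postcode format is more involved.

```python
import regex  # third-party package: pip install regex

# Simplified, illustrative UK postcode pattern (an assumption, not the
# pattern used by any existing detector).
POSTCODE = r"[A-Z]{1,2}[0-9][0-9A-Z]? ?[0-9][A-Z]{2}"


def fuzzy_fullmatch(pattern, text, max_substitutions=0):
    """Match `text` against `pattern`, tolerating a few substituted characters.

    Hypothetical helper: with the default of 0 the pattern is used unchanged,
    so behaviour stays the same as an exact match.
    """
    if max_substitutions > 0:
        # `{s<=N}` is the regex package's fuzzy constraint for
        # "at most N substitutions" on the preceding group.
        pattern = f"(?:{pattern}){{s<={max_substitutions}}}"
    # BESTMATCH asks for the match with the fewest errors rather than the
    # first one that satisfies the constraint.
    return regex.fullmatch(pattern, text, flags=regex.BESTMATCH)


exact = fuzzy_fullmatch(POSTCODE, "W1A OAX")                        # capital O typo
approx = fuzzy_fullmatch(POSTCODE, "W1A OAX", max_substitutions=1)
two_typos = fuzzy_fullmatch(POSTCODE, "WIA OAX", max_substitutions=1)

print(exact)                # None: the exact pattern rejects the typo
print(approx.fuzzy_counts)  # (substitutions, insertions, deletions)
print(two_typos)            # None: two substitutions exceed the limit
```

Defaulting to zero substitutions keeps the exact behaviour unless a caller opts in, which is one way to get the backwards compatibility discussed above. Insertions and deletions could be allowed as well with the more general `{e<=N}` constraint, at the cost of more false positives.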