Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generated MS search lacks body_is_regex=1 #10228

Open
tripleee opened this issue Mar 4, 2024 · 3 comments
Open

Generated MS search lacks body_is_regex=1 #10228

tripleee opened this issue Mar 4, 2024 · 3 comments

Comments

@tripleee
Copy link
Member

tripleee commented Mar 4, 2024

What problem has occurred? What issues has it caused?

When you create a PR from chat, SmokeDetector follows up with search results after a while. However, the generated search fails to specify that the search is for a regular expression. Thus e.g. #10227 has a search which could never work.

(It didn't work anyway for other reasons, but without the "regex" check box, it couldn't have.)

What would you like to happen/not happen?

The generated search should have the regex box ticked. That is,

MS search [here](https://metasmoke.erwaysoftware.com/search?utf8=%E2%9C%93&body=85275+87805%28%3F%23NO+NorAm%29)

should be

MS search [here](https://metasmoke.erwaysoftware.com/search?utf8=%E2%9C%93&body_is_regex=1&body=85275+87805%28%3F%23NO+NorAm%29)
@tripleee tripleee changed the title Generated MS search lacks is_regex=1 Generated MS search lacks body_is_regex=1 Mar 4, 2024
Andrew5057 added a commit to Andrew5057/Anerdw-SmokeDetector that referenced this issue Nov 21, 2024
Add "body_is_regex=1" to the MS searches that SmokeDetector
automatically creates for new PRs. Should resolve
[Charcoal-SE#10228](Charcoal-SE#10228).
@Andrew5057
Copy link

[field]_is_regex already gets added to website, keyword, and username watches/blacklists in gitmanager.py. Number watches/blacklists are the only exception. How common are comments for number blacklists? Would a hard-coded exception for (?# NO NorAm) be enough?

@Andrew5057
Copy link

Andrew5057 commented Nov 21, 2024

Maybe we could just strip comments from number blacklists entirely. Something like item_to_search = regex.sub("\(\?#.*\)", "", item_to_blacklist)?

@makyen
Copy link
Contributor

makyen commented Nov 21, 2024

That regex is incorrect and will, potentially, strip things other than just comments. Please use the already existing regex to strip such comments. Actually searching for something that approximates what the phone number code does is substantially more involved than just stripping the comment and searching for the number. An approximation that's reasonably easy to code would be to use the same strategy as is used with the number search bookmarklets. Those don't attempt to handle potential obfuscations within the text. However, they do try to get at least some of those by also looking in the why data for the number, but the number showing up in the why data requires that it's already watched or blacklisted.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging a pull request may close this issue.

3 participants