Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Edit regex to exclude comments (to avoid false positives) #860

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

DecimalTurn
Copy link
Contributor

This PR addresses #817

Regex demo: https://regex101.com/r/QUnpRh/2

@decalage2 decalage2 self-requested a review June 10, 2024 10:49
@decalage2 decalage2 self-assigned this Jun 10, 2024
@decalage2 decalage2 added this to the oletools 0.60 milestone Jun 10, 2024
@decalage2
Copy link
Owner

Is it useful to include devcontainer.json in this PR? (it doesn't look related)

@DecimalTurn
Copy link
Contributor Author

@decalage2 I needed to add this so that I could more easily use a Codespace to debug and find why the test was failling, but I forgot to create a new branch to avoid updating the PR, sorry. I'll remove that change when I've confirmed all the tests are passing.

I've only started using Codespaces recently, but I find it nice to be able to run code online without having to set it up on my computer. If you think it can be useful for you or others to have that possibility, I can always add that in a seperate PR.

@DecimalTurn
Copy link
Contributor Author

DecimalTurn commented Jun 27, 2024

The problem comes from the fact that the detect_autoexec and detect_suspicious functions don't just look at VBA code, but also Excel 4.0 macro information stored in the form of comments. In that case, we can't just ignore the comments since they contain important information.

I'm not sure what's the ideal approach to fix that, but I see some options:

  • Make a rule to check the content of vba_code for XLM information and if present, keep the comments.
  • Filter the VBA code earlier in the process by replacing line-comments with blank lines (could be done in filter_vba I imagine)
  • Remove the single quote in front of the XLM information (unless the code has to be valid VBA).
  • If the code needs to be valid VBA code, we could always use the legacy REM keyword at the start of each line of XLM information to make them a comment, but wouldn't be affected by the regex.

Let me know what you think is best.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants