Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Align "reference" field value in JSON transaction log with standard log. #3093

Open
wants to merge 1 commit into
base: v3/master
Choose a base branch
from

Conversation

asamuta
Copy link

@asamuta asamuta commented Feb 23, 2024

Transaction JSON log of "reference" is too verbose for some rules. There is a limit for "ref" field in "standard" log.

file: rule_message.cc

...
msg.append(" [ref \"" + utils::string::limitTo(200, rm->m_reference) + "\"]");
...

but there is no such limit in file: transaction.cc

...
LOGFY_ADD("reference", a.m_reference.c_str());
...

With this fix JSON out will be truncated as standard log.

Before:

...
"messages": [
      {
        ....
        "details": {
          "match": "Matched \"Operator `Gt' with parameter `10000' against variable `ARGS_COMBINED_SIZE' (Value: `971714.000000' )",
          "reference": "v0,31v32,32v0,31v32,74v0,32v33,166v0,33v34,5v0,31v32,32v0,30v31,165v0,33v34,193v0,33v34,227v0,33v34,41v0,33v34,47v0,33v34,222v0,35v36,9v0,28v29,10v0,25v26,237v0,35v36,62v0,31v32,20v0,31v32,20v0,31v32,20v0,31v32,19v0,31v32,19v0,31v32,20v0,31v32,20v0,31v32,20v0,31v32,19v0,32v33,19v0,25v26,26v0,25v26,39v0,25v26,321v0,25v26,216v0,25v26,147v0,25v26,229v0,25v26,295v0,25v26,147v0,25v26,219v0,26v27,412v0,26v27,357v0,26v27,254v0,26v27,232v0,26v27,309v0,26v27,76v0,26v27,116v0,26v27,32v0,26v27,346v0,26v27,211v0,26v27,467v0,26v27,363v0,26v27,79v0,26v27,190v0,26v27,205v0,26v27,217v0,33v34,17v0,33v34,17v0,33v34,16v0,33v34,17v0,33v34,17v0,33v34,17v0,33v34,16v0,33v34,17v0,33v34,17v0,34v35,17v0,..... 75 kilobytes of text here,
          "ruleId": "1100",
          "file": "/etc/nginx/modsec/owasp-modsecurity-crs/rules/REQUEST-900-EXCLUSION-RULES-BEFORE-CRS.conf",
          "lineNumber": "209",
         ...
        }
      }
...

after the fix:

...
"reference":"v0,31v32,32v0,31v32,74v0,32v33,166v0,33v34,5v0,31v32,32v0,30v31,165v0,33v34,193v0,33v34,227v0,33v34,41v0,33v34,47v0,33v34,222v0,35v36,9v0,28v29,10v0,25v26,237v0,35v36,62v0,31v32,20v0,31v32,20v0,31v32, (79218 characters omitted)",
...

As long as there is no way to disable the "reference" field in JSON transaction log and the information it contains doesn't look useful in production logs please approve this change. I think the JSON log approach may be reworked later but meanwhile, we need a way to truncate it.

Copy link

Quality Gate Passed Quality Gate passed

Issues
0 New issues

Measures
0 Security Hotspots
No data about Coverage
No data about Duplication

See analysis details on SonarCloud

@airween airween added the 3.x Related to ModSecurity version 3.x label Feb 25, 2024
@airween airween self-assigned this Feb 25, 2024
@airween
Copy link
Member

airween commented Feb 25, 2024

Hi @asamuta,

first of all: thank your for your contribution!

Could you share with us some example, how can we reproduce this message in audit log? (Eg. share some details (eg. used rule set, special settings) and a curl example).

@asamuta
Copy link
Author

asamuta commented Feb 26, 2024

Hi @airween,

NGINX
MOD_SECURITY v3.0.12
OWASP CRS V4.0 https://github.com/coreruleset/coreruleset/releases/tag/v4.0.0

Additional rule that must be placed in REQUEST-900-EXCLUSION-RULES-BEFORE-CRS.conf

SecRule ARGS_COMBINED_SIZE "@gt 1000"\
     "id:400001,\
     phase:2,\
     pass,\
     nolog,\
     auditlog,\
     msg:'REQ_SIZE',\
     logdata:'%{ARGS_COMBINED_SIZE}',\
     ctl:ruleRemoveTargetByTag=attack-xss;ARGS"

This is a simplified rule to demonstrate the issue.

Payload - a big json (the example is attached).
example.json

CURL:
curl --location 'localhost:8080/test' --header 'Content-Type: application/json' --data-binary "@./example.json"

NOTE: Nginx in my case configured as a reverse proxy with upstream that targets a mock server that responds with simple valid JSON to all requests.

LOGFY_ADD("reference", a.m_reference.c_str());
LOGFY_ADD("reference", utils::string::limitTo(200, a.m_reference).c_str());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This changes the semantic (which might be ok) but it also might break automated parsing tools. I think when we cut of data, the remaining stuff should still be parsable (e.g. cut of in a way that all the remaining data is valid and follow the expected format)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I asked the example to check the behavior on mod_security2.

Unfortunately I can't find any documentation about the expected behavior, but as I know this is not a bug, but a feature - I mean the audit.log contains the whole log entry.

In fact, the limitTo() method is justified, because Nginx has a limit for the length of error.log line (NB: Apache has it too). But audit.log does not have any limit, so if we truncate the line in audit.log, we can lost relevant information.

In addition: I can imagine this PR can be a useful feature, but only with a configure option: if anyone wants to use this, then it can be optional in build time. Then we can be sure that user knows what he does.

Let me know how mod_security2 works.

@asamuta
Copy link
Author

asamuta commented Feb 26, 2024

Hi @MirkoDziadzka , @airween ,
I have tried to find any documentation about what the reference field value is. Unfortunately without any luck. Could you please guide me to any documentation about what this reference value means?

"reference": "v0,31v32,32v0,31v32,74v0,32v33,166v0 ....

How to interpret these numbers? How do these numbers help a SOC team? Is it worth storing tens of kilobytes?

Ideally of course it should be possible to switch it on/off using "audit log parts" letters. But it will be a much more complicated task... I think the build time option is also a good point.

Thanks for your efforts on this.
Waiting for your decision.

@zimmerle
Copy link
Contributor

Hi @asamuta, I developed this feature to pinpoint the exact location (within the request body or any other data) where the operator returned true. It was intended to highlight the relevant fragments related to the match, making visualization easier in the web console for specific use cases. The issue you've encountered seems to be a bug, although I haven't delved into the details. If it's not necessary, you can easily disable this feature with a build-time flag, or even at runtime if needed. Over the years, I've customized this for various consulting clients. If it's not being utilized, it might indeed be superfluous to retain this data. I understand your concern; even a few bytes can add up to significant costs over a billion requests.

As per v2, I don't think this feature is available there.

@MirkoDziadzka
Copy link
Contributor

MirkoDziadzka commented Feb 27, 2024

I agree that this feature might not be used. Unfortunately, (if I remember correctly), many regression tests are using the debug output of the variable origin.

My concern with this PR: Assuming somebody is using this and parsing the line and it fails (e.g. simple python scripts would raise an exception) and some workflow is breaking.

I'm not sure if this is a concern for the modec team, but long term ago I worked as a system admin and I would not like such a change. But not my decision. It might ok to just remove it.

@asamuta regarding the format:

  • it is a list of v{offset},{length} of all the variable locations in the request.
  • smashed together without separator
  • "documented" in the code in variable_origin.h

If we care to not break external parsers, we should probably split before "v"

@asamuta
Copy link
Author

asamuta commented Feb 28, 2024

Hi All,
As far as I understand the easiest option here is to create a build-time flag and cut the value before 'v' to keep parsers working. Please confirm that such a change will be accepted.

If so, please point me to such "build-time flag" so that I can implement this based on that example. Also, it would be great if you could advise me the name of the new flag according to the naming convention used in the project.
Thanks in advance.

@airween
Copy link
Member

airween commented Feb 28, 2024

Please confirm that such a change will be accepted.

I can agree an optional, not mandatory modification.

If so, please point me to such "build-time flag" so that I can implement this based on that example. Also, it would be great if you could advise me the name of the new flag according to the naming convention used in the project. Thanks in advance.

As I know libmodsecurity3 does not have any kind of configure option which can be used to demonstrate the example. But mod_security2 has. You should check out the branch v2/master (or it would be better if you make a new clone and check out there), run the ./autogen.sh command, and then the ./configure --help. There you can see some similar solution, eg. --enable-pcre-match-limit or --enable-pcre-match-limit-recursion. For the syntax, check the configure.ac. Please find the references in source code for symbol MODSEC_PCRE_MATCH_LIMIT or MODSEC_PCRE_MATCH_LIMIT_RECURSION.

If you don't want to check the source tree and the build system's behavior, just see the configure.ac file relevant parts.

For the name, I can imagine the --enable-truncate-reference-in-auditlog or something similar, with a numeric argument, which indicates how much is the maximum length of that field, eg --enable-truncate-reference-in-auditlog=200 which means the maximum length can be 200 characters (or the last occurrence of v{offset},{length} pattern). Please make an accurate note for that option.

The option name is not the best choice, so please help me to figure out the correct one.

@airween
Copy link
Member

airween commented Mar 29, 2024

To everyone: what do you think about to make this feature optional? Do you think we can use the configure flag --enable-truncate-reference-in-auditlog or --enable-truncate-reference-in-auditlog=NNN?

@asamuta could you prepare this patch with this suggestion? May be you should replace the name of the flag, but that's it, I guess.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.x Related to ModSecurity version 3.x
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants