Search Multiple logical patterns in single ugrep command #406
-
Use option -%%:
$ ugrep -%% 'A | B | (C AND D)' file
The short form is:
$ ugrep -%% 'A | B | (C D)' file
because a space between patterns means the same as AND. Your query in Boolean form:
$ ugrep -%% -HPnrz -o -J=8 --format='%f----%n----%o----%G%~' '(?<error_85>"This is error 100" | "This is Error 101") | ((?<error_86>"This is error 500" AND "This is Error 501"))' file.log
Notes:
-
I tried this out on three test files:
$ ugrep -%% -P '(?<a>foo) | (?<b>bar) | ((?<c>foo) AND (?<d>baz))' file?.txt --format='%f----%n----%o----%G%~' --stats
file3.txt----1----foo----c
file3.txt----2----bar----b
file1.txt----1----foo----c
file1.txt----2----baz----d
file2.txt----1----bar----b
file2.txt----2----baz----d
-
Pattern matches can be found with ugrep, but it will be difficult to extract which AND-pattern matched, because a named capture group is associated with a single term that can be expressed as a single regex. Again, the reason for this difficulty is that regex pattern matching in general doesn't support an AND constraint, and certainly not one that applies anywhere in a file. ugrep matches each individual regex term and then combines the results to check the AND constraints. Furthermore, if the dictionary of these AND-patterns is large, you may want to implement a custom solution: with many AND-patterns, the normalization into AND constraints can become huge and slower than expected.
-
To create a custom solution, here is what I would do:
For the last part, it is more efficient to check each line of the input against all terms, so that the file is scanned only once. Writing these code parts can be automated with a code generator; if the dictionary changes, run the code generator again. For example, I would use RE/flex to write a scanner and parser that extract the patterns from the dictionary and generate the code. If the dictionary is JSON, use a JSON parser to do the same.
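To illustrate the single-pass idea, here is a minimal Python sketch (not the RE/flex generator suggested above). It assumes a hypothetical rule dictionary in which each named rule is either an OR ("any") or an AND ("all") over regex terms; the names `RULES` and `scan` and the rule format are made up for this example and are not part of ugrep.

```python
import re

# Hypothetical rule dictionary (an assumption, not a ugrep format):
# each named rule is an OR ("any") or an AND ("all") over regex terms.
RULES = {
    "error_85": ("any", [r"This is error 100", r"This is Error 101"]),
    "error_86": ("all", [r"This is error 500", r"This is Error 501"]),
}


def scan(lines, rules=RULES):
    """Read the input once and return {rule name: [(line number, text), ...]}
    for every rule whose constraint holds over the whole input."""
    compiled = {
        name: (mode, [re.compile(term) for term in terms])
        for name, (mode, terms) in rules.items()
    }
    # hits[name][i] collects the matches of the i-th term of that rule.
    hits = {name: [[] for _ in terms] for name, (_, terms) in compiled.items()}

    # Single pass: every line is tested against every term exactly once.
    for lineno, line in enumerate(lines, start=1):
        for name, (_, terms) in compiled.items():
            for i, regex in enumerate(terms):
                m = regex.search(line)
                if m:
                    hits[name][i].append((lineno, m.group(0)))

    # Combine the per-term results into the OR / AND decision per rule.
    report = {}
    for name, (mode, _) in compiled.items():
        found = [bool(term_hits) for term_hits in hits[name]]
        satisfied = all(found) if mode == "all" else any(found)
        if satisfied:
            report[name] = sorted(h for term_hits in hits[name] for h in term_hits)
    return report
```

Because the AND decision is made only after the whole input has been read, the matches of an "all" rule can be reported under the rule's own name, which is the part that is hard to get from a single regex.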
-
Thanks for taking the time to help with the approach. Much appreciated.
-
The fastest implementation that I can think of is in C++ with RE/flex: find the patterns in lines and mark them when found, e.g. in a bit array. The patterns can be found per line of input using regex.
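The marking idea can be sketched in a few lines. This is a Python illustration under stated assumptions, not the C++/RE/flex implementation described above: the term list `TERMS`, the rule masks in `RULES`, and the function names are hypothetical, and a Python integer stands in for the bit array.

```python
import re

# Hypothetical term list; the position of a term is its bit index.
TERMS = [
    "This is error 100",
    "This is Error 101",
    "This is error 500",
    "This is Error 501",
]

# Hypothetical rules as (kind, bit mask): "any" needs at least one marked bit
# of the mask, "all" needs every bit of the mask.
RULES = {
    "error_85": ("any", 0b0011),
    "error_86": ("all", 0b1100),
}

# One alternation with a named group per term, so each match names its term.
SCANNER = re.compile("|".join(f"(?P<t{i}>{t})" for i, t in enumerate(TERMS)))


def mark_terms(lines):
    """Return an integer used as a bit array: bit i is set if term i was seen."""
    seen = 0
    for line in lines:
        for m in SCANNER.finditer(line):
            seen |= 1 << int(m.lastgroup[1:])  # group "t2" sets bit 2
    return seen


def matched_rules(seen, rules=RULES):
    """Return the names of the rules satisfied by the marked bits."""
    result = []
    for name, (kind, mask) in rules.items():
        if kind == "all":
            ok = (seen & mask) == mask
        else:
            ok = (seen & mask) != 0
        if ok:
            result.append(name)
    return result
```

One caveat of this sketch: `finditer` reports non-overlapping matches only, so if two terms can overlap in the same text, one of them may go unmarked. Checking each term separately per line avoids that at the cost of more regex calls.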
-
I am working on a log analysis task and need to search for multiple logical patterns within log files using ugrep. I want to combine all logical patterns in a single ugrep command and display which pattern was matched in the output. Here's an example of the command I am currently using:
ugrep -HPnrz -o -J=8 --format='%f----%n----%o----%G%~' -e '(?<error_85>version = 9.13.16604.0|version = 9.11.16404|version = 9.11.16405|version = 9.11.16407)' file.log
This command searches for any of the specified versions and outputs the results like this:
However, I have some patterns that need to use logical AND as well. For example, I want to search all files that contain all of these versions:
ugrep -HPnrz -o -J=8 --format='%f----%n----%o----%G%~' -e '(?<error_85>version = 9.13.16604.0 AND version = 9.11.16404 AND version = 9.11.16405 AND version = 9.11.16407)' file.log
How can I create a command that combines multiple logical patterns using -e pattern1 -e pattern2 and so on, and also includes in the output which pattern was matched?
I am expecting a command something like the one below:
ugrep -HPnrz -o -J=8 --format='%f----%n----%o----%G%~' -e '(?<error_85>"This is error 100" | "This is Error 101")' -e '(?<error_86>"This is error 500" AND "This is Error 501")' file.log
Requirements: