Replies: 26 comments 36 replies
-
Implement the
-
Allow modules that unpack/uncompress data
A mechanism for unpacking or decompressing files within YARA itself would be great. For example, a module could implement ZIP decompression and uncompress any ZIP file scanned by YARA, so that the ZIP's contents get scanned as well.
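To make the idea concrete, here is a minimal Python sketch of what "unpack, then scan the contents" could look like from the outside. It is not how YARA or any YARA module works today: `scan_with_unpacking` and `toy_scan` are hypothetical names, and the substring check merely stands in for a real scan.

```python
import io
import zipfile
from typing import Callable, Dict, List


def scan_with_unpacking(
    data: bytes, scan: Callable[[bytes], List[str]]
) -> Dict[str, List[str]]:
    """Scan `data`; if it is a ZIP archive, scan every member as well."""
    results = {"<root>": scan(data)}
    buffer = io.BytesIO(data)
    if zipfile.is_zipfile(buffer):
        with zipfile.ZipFile(buffer) as archive:
            for name in archive.namelist():
                results[name] = scan(archive.read(name))
    return results


def toy_scan(data: bytes) -> List[str]:
    # Hypothetical stand-in for a real scan: one "rule" that looks for a marker.
    return ["has_marker"] if b"EVIL" in data else []


# Build a small archive in memory. The member is DEFLATE-compressed, so a scan
# of the raw archive bytes is unlikely to see the marker.
packed = io.BytesIO()
with zipfile.ZipFile(packed, "w", zipfile.ZIP_DEFLATED) as archive:
    archive.writestr("payload.txt", b"EVIL" * 50)

report = scan_with_unpacking(packed.getvalue(), toy_scan)
```

Here `report` holds one entry for the raw archive and one per member, so a rule can hit on `payload.txt` even when the compressed bytes alone give it nothing to match.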
-
Composable modifiers
Implement composable modifiers as described in: https://gist.github.com/wxsBSD/44aa8b8133e3ea96e738b66ec1c600f2
-
Actions Clause
I've got a friend who has repeatedly tried to get me to implement an "actions" clause in rules. The idea is that we could tie detections together with the actions to be taken, directly in the rule. Currently this is implemented outside of the language in various systems I've helped build (or know about) over the years, and while it works well, it would be great if we supported this in the language. I'm thinking of something like this:
I've always thought that being able to ship rules together with a script that does something with them automatically is a great idea, but I have not implemented it because I thought it was a stretch for the current design. Obviously there are safety considerations to be made around this too. I think supporting this might help take YARA to a new level for power users.
-
Compiler Improvements
One of the issues with current YARA is that it is difficult to use the compiler as anything other than a way to output the structures and bytecode needed for execution by libyara. Having programmatic access to the rules as an AST would help with managing complex rulesets, and could even allow optimizations that are specific to certain user environments: if you know you are only going to scan PE files in this run, and you have a way to programmatically recognize ELF-specific rules, you can remove them just for this run. This would also allow us to have one true compiler instead of splitting efforts between gyp and the main compiler. I don't have specific ideas here, but I am hoping others can share theirs.
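As a rough illustration of the PE/ELF pruning example, here is a small Python sketch over a toy rule representation. It assumes an AST that records which modules each rule's condition uses; `RuleNode` and `prune_for_format` are hypothetical names and do not correspond to any existing YARA or gyp API.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, List


@dataclass(frozen=True)
class RuleNode:
    """Hypothetical AST node: a rule name plus the modules its condition uses."""
    name: str
    modules_used: FrozenSet[str] = field(default_factory=frozenset)


def prune_for_format(
    rules: List[RuleNode], format_modules: FrozenSet[str], target: str
) -> List[RuleNode]:
    """Drop rules that depend on a file-format module other than `target`."""
    other_formats = format_modules - {target}
    return [rule for rule in rules if not (rule.modules_used & other_formats)]


ruleset = [
    RuleNode("pe_packer", frozenset({"pe"})),
    RuleNode("elf_backdoor", frozenset({"elf"})),
    RuleNode("generic_strings"),
    RuleNode("pe_with_hash", frozenset({"pe", "hash"})),
]

# This run only scans PE files, so ELF- and Mach-O-specific rules are dropped.
pe_only = prune_for_format(ruleset, frozenset({"pe", "elf", "macho"}), target="pe")
kept = [rule.name for rule in pe_only]
```

With the sample ruleset, `kept` contains every rule except `elf_backdoor`; format-agnostic rules and rules that use non-format modules such as `hash` survive.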
-
Loadable modules
Modules that extend YARA's core pattern matching functionality should be decoupled from the library, so they can be developed separately.
For the YARA 4 codebase, I have begun working towards such a goal (see #1772).
-
Local variables
In its current form, YARA already supports variables at a global level, through a list of external variables. I think having local variables on a per-rule basis would bring some benefits, such as:
A local variable is valid only within the scope of its rule. One local variable can reference another local variable. Example:
An implementation was already presented in this PR, but it has been abandoned for a long time.
-
External variables should be defined within the rule language
Working with external variables (
-
Namespaces on the language level
As with the proposal that external variables should be defined within the rule language, I think the same about namespaces. Namespaces defined on the language level would make it much easier to share rules and avoid name clashes, without having to rely on the user compiling everything in a separate namespace. Example:
-
Avoid data scanning if not necessary
The current implementation scans the data first and then evaluates all the rule conditions. This has been a source of frustration for many users, who expect that a condition like
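The kind of short-circuiting being asked for can be sketched in a few lines of Python. This is a simplified model, not YARA's evaluation strategy: `LazyRule` and `evaluate` are hypothetical, each rule has a single cheap precondition on the file size, and a substring test stands in for the actual pattern scan.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class LazyRule:
    name: str
    precondition: Callable[[int], bool]  # cheap check that needs only the size
    pattern: bytes


def evaluate(rules: List[LazyRule], data: bytes) -> Tuple[List[str], bool]:
    """Return (matching rule names, whether the data was scanned at all)."""
    size = len(data)
    candidates = [rule for rule in rules if rule.precondition(size)]
    if not candidates:
        return [], False  # no rule can match, so the data is never scanned
    matched = [rule.name for rule in candidates if rule.pattern in data]
    return matched, True


rules = [LazyRule("small_only", lambda size: size < 16, b"foo")]

big_result = evaluate(rules, b"foo" * 100)   # precondition fails, scan skipped
small_result = evaluate(rules, b"xxfoo")     # precondition holds, data scanned
```

The point of the sketch is the early return: when every rule is ruled out by cheap checks, the expensive pass over the data does not happen.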
-
User-defined arrays and their unification with iterators
One thing that is very hard (but not impossible) to express in YARA is "N of M expressions are evaluated to At the same time, I see an opportunity to unify it with existing concepts like integer enumerations, which are now denoted the same way as string sets and rule sets. When you even combine it with operator
-
YARA as a data extraction tool
YARA does complex data processing through modules like .NET, PE, ELF, Mach-O, etc. Some of the data these modules extract is currently available through
I can see this being achieved by each module being its own standalone module, which you can use to obtain all the information. In the same way This kind of resembles what the loadable modules proposal describes, but because the use case is a little different, I wanted to list it as a separate thing.
-
Extensible rule sections
What if modules could extend the rule language by providing new sections in rules? Sections The reason I would like to see something like this is that I never really liked the fact that all the Cuckoo module features have to be functions and have to live in the condition. I would welcome a more declarative approach, with the magic happening in the backend. I focus mainly on the
I realize that this is maybe too much, but I am throwing it out there to see the reactions :)
-
Better Callback API
The current callback mechanism could be easier to use. Currently you provide a single callback and have to handle a lot of logic in that single callback. It would be nicer if libyara took a structure of function pointers for the callbacks and called them automatically. I'm thinking of something like this:
This would make the interface more explicit for users, allowing them to handle only the events they care about, at the cost of a bit more logic inside libyara to figure out when to call each callback (if non-null).
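A minimal Python model of the "structure of callbacks" idea might look like the following. It is only a sketch of the dispatch logic, not a proposal for the actual C interface: `ScanCallbacks` and `dispatch` are hypothetical, and the three events are loosely modelled on libyara's existing rule-matching, rule-not-matching and scan-finished callback messages.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class ScanCallbacks:
    """Hypothetical callback table; every entry is optional."""
    on_rule_matching: Optional[Callable[[str], None]] = None
    on_rule_not_matching: Optional[Callable[[str], None]] = None
    on_scan_finished: Optional[Callable[[], None]] = None


def dispatch(results: List[Tuple[str, bool]], callbacks: ScanCallbacks) -> None:
    """Invoke only the callbacks the user actually provided."""
    for rule_name, did_match in results:
        handler = (
            callbacks.on_rule_matching if did_match
            else callbacks.on_rule_not_matching
        )
        if handler is not None:
            handler(rule_name)
    if callbacks.on_scan_finished is not None:
        callbacks.on_scan_finished()


# A user who only cares about matches registers a single handler.
hits: List[str] = []
dispatch(
    [("rule_a", True), ("rule_b", False)],
    ScanCallbacks(on_rule_matching=hits.append),
)
```

The "if non-null" logic lives in `dispatch`, which is the extra work the library would take on so that each handler stays small.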
-
Pluggable Output
Having different output options built into yara (the CLI) would be awesome. By separating formatting from destination, we can start to support things like "log in JSON to stdout" or "log in text to the Windows event log". Ideally we would provide an API that lets users of libyara register their own formatting and destination plugins so they can better route output. It is a little unclear to me how this would fit in with the callback API discussed in a different comment, but having different options for formatting and destination would be very powerful for places that embed YARA in their analysis pipelines/products.
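The formatting/destination split can be sketched like this in Python. Everything here is hypothetical: the `FORMATTERS` registry, the `emit` function and the shape of the match record are assumptions made for illustration, and an in-memory buffer stands in for stdout or an event log.

```python
import io
import json
from typing import Callable, Dict, TextIO

Formatter = Callable[[Dict[str, str]], str]


def format_text(match: Dict[str, str]) -> str:
    return f"{match['rule']} {match['file']}"


def format_json(match: Dict[str, str]) -> str:
    return json.dumps(match, sort_keys=True)


# Hypothetical registry: users could add their own formatters here.
FORMATTERS: Dict[str, Formatter] = {"text": format_text, "json": format_json}


def emit(match: Dict[str, str], formatter_name: str, destination: TextIO) -> None:
    """Format a match with the chosen formatter and write it to any destination."""
    destination.write(FORMATTERS[formatter_name](match) + "\n")


sink = io.StringIO()  # stands in for stdout, a log file, an event log, ...
record = {"rule": "demo_rule", "file": "sample.bin"}
emit(record, "json", sink)
emit(record, "text", sink)
```

Because `emit` only needs a name from the registry and something with a `write` method, any formatter can be paired with any destination.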
-
Debugging of YARA
Finding out why your YARA rule doesn't hit can be kinda hard. Before
What I think YARA would benefit from is full debugging support, which would allow you to:
The debugger itself would provide a Debug Adapter Protocol server so that it can be easily integrated into Visual Studio Code and other IDEs/editors supporting DAP. I already have a working prototype for everything mentioned above (besides point 5), including DAP support, but all of it is being shoehorned into YARA. If YARA were to be basically reimplemented, I would really welcome this being considered when designing the new core, and I can certainly provide guidance on what is required for DAP to work properly.
-
A way for obtaining the length of an array/iterator
Some modules expose arrays like
-
Standardization of the language/changes to the language
I know that this isn't really a feature request for YARA per se, but since big changes are coming to YARA, I think it is worth discussing. As development of YARA gains more and more traction, and more and more people contribute, I think it's becoming really hard to keep up with all the changes and to plan any kind of update to YARA in production environments. A lot of bug fixes are interleaved with big breaking changes, sometimes on a day-by-day basis. Since the language itself is dictated by the implementation, a change that reaches the code also changes the language, and it's sometimes too much to track. I really think a lightweight form of Rust RFCs might be a good example of how to deal with changes to the language. Things I could see it bringing:
I am not calling for a separate RFC repository or for complicated processes for every single change to the language. Even having
-
Add possibility to use full set of data types for Externals
At $dayjob we use YARA's Externals feature to scan objects with YARA, passing in a lot of metadata about those objects. One limitation is that the Externals feature only supports str/int fields, which sometimes means fudging the data to fit these limited data types: e.g. when the object is really a list, we have to change it to a pipe-delimited string for matching purposes. It would be nice if lists/dictionaries were available to use in conjunction with externals.
-
Allow disabling/enabling specific types of warnings
Currently YARA has the One solution is following the model used by
-
The future of the Aho-Corasick algorithm and regex engine
This is more an open question than a proposal for some future version of YARA. I read through the suggestions in this thread, which are amazing and exciting. Most of them are, understandably, additional features. I would also like to discuss the future of the pattern-matching implementation, the algorithms used, and even the heuristics. @plusvic, are there any plans for this?
-
Changes to support printing of matched data irrespective of source.
Currently it is possible to see the data that a rule matched, using the -s flag, if strings are the cause of the match. If the match was the result of module data matching or of uint()-type condition logic, it can be hard to triage which entries are matching as expected. I don't have a proposed solution, but ideally it would be possible to specify a single flag when running yara via the CLI that prints out any matched data regardless of the source of the match.
-
Accessing an instance of a string match via $s[i]
Right now we can access the number of matches of a string $s via #s, and the offset and length of each match via @s[i] and !s[i]. I would like to be able to iterate through the matches and do different things with $s[i]. This might open up new possibilities for iterating over, processing and checking multiple instances of a string, especially when dealing with regular expression matches.
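To show what per-instance access could enable, here is a small Python sketch that treats each match as an object carrying its offset, length and matched bytes. `MatchInstance` and `match_instances` are hypothetical, Python's `re` module stands in for YARA's regex engine, and the list is 0-indexed whereas YARA's `@s[i]` is 1-indexed.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MatchInstance:
    offset: int   # roughly what @s[i] gives today
    length: int   # roughly what !s[i] gives today
    data: bytes   # the matched bytes, i.e. what $s[i] could expose


def match_instances(pattern: bytes, data: bytes) -> List[MatchInstance]:
    """Collect every non-overlapping match of `pattern` in `data`."""
    return [
        MatchInstance(m.start(), len(m.group()), m.group())
        for m in re.finditer(pattern, data)
    ]


instances = match_instances(rb"id=\d+", b"id=7;id=42;name=x;id=123")

# With the matched bytes available, a condition can reason about each instance,
# for example keeping only the ids whose numeric part is greater than 40.
large_ids = [inst.data for inst in instances if int(inst.data[3:]) > 40]
```

The last line is the interesting part: it depends on the content of each individual match, which offsets and lengths alone do not give you directly.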
-
Implement chained comparisons
In many instances, it can be useful to use chained comparisons, like:
-
Enable wildcard with
-
Allow x of () to accept expressions in the brackets.
I think a lot of the things people currently find hard to express in conditions could be resolved this way. To illustrate, in the following rule, if you want to match when any two of 2 of ($a*), 2 of ($b*) and 2 of ($c*) hold, it's hard to write the condition (unless I'm missing something), whereas with the condition shown here (which currently does not work) it is simple:
rule myrule
{
    strings:
        $a1 = "foo"
        $a2 = "bar"
        $a3 = "rae"
        $b1 = "123"
        $b2 = "456"
        $b3 = "789"
        $c1 = {00 01 02}
        $c2 = {03 04 05}
        $c3 = {06 07 08}
    condition:
        2 of (2 of ($a*), 2 of ($b*), 2 of ($c*))
}
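The intended semantics of the nested condition can be modelled in a few lines of Python. This is only a sketch of the logic, assuming that "N of" means "at least N": `n_of` and `myrule` are hypothetical helpers, and plain substring tests stand in for YARA's string matching.

```python
from typing import Iterable


def n_of(n: int, expressions: Iterable[bool]) -> bool:
    """True when at least `n` of the given boolean expressions are true."""
    return sum(1 for value in expressions if value) >= n


def myrule(data: bytes) -> bool:
    # Each list mirrors one string set from the rule: $a*, $b* and $c*.
    a = [s in data for s in (b"foo", b"bar", b"rae")]
    b = [s in data for s in (b"123", b"456", b"789")]
    c = [s in data for s in (bytes([0, 1, 2]), bytes([3, 4, 5]), bytes([6, 7, 8]))]
    # 2 of (2 of ($a*), 2 of ($b*), 2 of ($c*))
    return n_of(2, [n_of(2, a), n_of(2, b), n_of(2, c)])


two_groups = myrule(b"foo bar 123 456")   # both the $a* and $b* groups hold
one_group = myrule(b"foo bar 123")        # only the $a* group holds
```

Since `n_of` takes arbitrary booleans, the outer call does not care whether its arguments come from strings, other `n_of` results or any other expression, which is the generality the proposal asks for.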
-
YARA is about to become 15 years old. During all these years a lot of decisions were made, many features got implemented, and many ideas got discarded for one reason or another. The project is now mature and stable, and a great community of users and contributors has blossomed around it. I feel it's time for laying out the plans for the next 15 years of YARA. It's time for dreaming big again.
I'm opening this discussion with the intention of starting a collective brain-dump that helps us shape the future of YARA. This is a place for describing the features and ideas that you would like to see in a future incarnation of YARA, no matter how crazy they sound. I'm particularly interested in the following topics:
The main purpose of this discussion is to get a grasp of how the community is interacting with YARA, what they miss, and what their main sources of frustration are. Don't be shy, all contributions are welcome.