@rabbitstack rabbitstack commented Jan 14, 2025

What is the purpose of this PR / why it is needed?

foreach adds iteration capabilities to the rule language. The decision to keep the implementation outside the functions package is deliberate.

The function mostly operates with raw expressions, and if it lived in the functions package, it would create a cyclic import and likely unleash more painful side effects. For the sake of simplicity, it is better to keep the function close to the parser and AST evaluation.

foreach accepts three required arguments and any number of optional ones. The first argument is the iterable value, typically yielded by a pseudo field. The function recognizes process internal state collections such as modules, threads, memory mappings, or thread stack frames. It is also possible to iterate over simple string slices. The second argument is the bound variable, which refers to the current element of the slice during iteration. The bound variable is accessed in the third argument, the predicate, where it is usually followed by a segment that denotes the accessed value. The predicate is commonly a binary expression that can be composed of not/paren expressions, other functions, and so on. The predicate is executed on every item in the slice. If the predicate evaluates to true, the function also returns true.
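The evaluation model described above can be sketched in Go. This is only an illustrative model and not the code from this PR: the `Module` type, the generic `foreach` helper, and the suffix check standing in for a predicate are all hypothetical, and returning at the first matching item is an assumption about how the result is produced.

```go
package main

import (
	"fmt"
	"strings"
)

// Module is a hypothetical stand-in for one element of an iterable
// such as the process modules collection.
type Module struct {
	Path string
}

// foreach models the described semantics: the predicate runs against
// each item in turn, with the item playing the role of the bound variable.
// Assumption: evaluation stops and yields true at the first match.
func foreach[T any](items []T, predicate func(item T) bool) bool {
	for _, item := range items {
		if predicate(item) {
			return true
		}
	}
	return false
}

func main() {
	modules := []Module{
		{Path: `C:\Windows\System32\kernel32.dll`},
		{Path: `C:\Windows\System32\user32.dll`},
	}
	// The closure parameter mod corresponds to the bound variable, and
	// mod.Path corresponds to the segment that denotes the accessed value.
	matched := foreach(modules, func(mod Module) bool {
		return strings.HasSuffix(strings.ToLower(mod.Path), `\user32.dll`)
	})
	fmt.Println(matched)
}
```

In the rule language the predicate is an expression rather than a closure, so the closure here only mirrors the role of the third argument.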

Lastly, the foreach function can receive an optional list of fields from the outer context, i.e. outside the predicate loop. Therefore, for the predicate to access a field that is not defined within the scope of the iterable, the field must be captured first.
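The capture rule can be illustrated with a small Go model. It is a sketch under stated assumptions, not the actual evaluator: `Scope`, `foreachWithCaptures`, and the map-based items are hypothetical names, and treating an uncaptured field as simply absent from the predicate scope is an assumption about the behavior.

```go
package main

import "fmt"

// Scope maps the names visible to the predicate to their values.
// Hypothetical type used only for this illustration.
type Scope map[string]any

// foreachWithCaptures builds the predicate scope from the bound variable
// plus the explicitly captured outer fields. Outer fields that are not
// listed in captures never reach the predicate.
func foreachWithCaptures(items []any, boundVar string, outer Scope,
	captures []string, predicate func(Scope) bool) bool {
	scope := make(Scope, len(captures)+1)
	for _, name := range captures {
		if v, ok := outer[name]; ok {
			scope[name] = v
		}
	}
	for _, item := range items {
		scope[boundVar] = item // rebind the variable on every iteration
		if predicate(scope) {
			return true
		}
	}
	return false
}

func main() {
	outer := Scope{"ps.is_protected": true, "ps.name": "svchost.exe"}
	ancestors := []any{
		map[string]any{"name": "wininit.exe"},
		map[string]any{"name": "services.exe"},
	}
	predicate := func(s Scope) bool {
		proc, _ := s["$proc"].(map[string]any)
		protected, _ := s["ps.is_protected"].(bool)
		return proc["name"] == "services.exe" && protected
	}
	// With the capture, the outer field is visible inside the predicate.
	fmt.Println(foreachWithCaptures(ancestors, "$proc", outer,
		[]string{"ps.is_protected"}, predicate))
	// Without it, the field is missing in this model and the predicate fails.
	fmt.Println(foreachWithCaptures(ancestors, "$proc", outer, nil, predicate))
}
```

Building a fresh scope from an explicit capture list keeps the predicate independent of whatever else the outer context happens to hold, which is one plausible reason for requiring captures.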

Note that a side effect of introducing the foreach function is the deprecation of the previous segment/path fields. This trend will continue in subsequent pull requests, untangling and considerably simplifying the accessor codebase.

Some examples of foreach usage:

  • Traverse process modules and return true if a module path matches the pattern
foreach(ps._modules, $mod, $mod.path imatches '?:\\Windows\\System32\\us?r32.dll')
  • For each process ancestor, check if the ancestor is services.exe and the current process is protected. In this example, the ps.is_protected field is captured before its usage in the predicate
foreach(ps._ancestors, $proc, $proc.name = 'services.exe' and ps.is_protected, ps.is_protected)

What type of change does this PR introduce?


Uncomment one or more /kind <> lines:

/kind feature (non-breaking change which adds functionality)

/kind bug-fix (non-breaking change which fixes an issue)

/kind refactor (non-breaking change that restructures the code, while not changing the original functionality)

/kind breaking (fix or feature that would cause existing functionality to not work as expected)

/kind cleanup

/kind improvement

/kind design

/kind documentation

/kind other (change that doesn't pertain to any of the above categories)

Any specific area of the project related to this PR?


Uncomment one or more /area <> lines:

/area instrumentation

/area telemetry

/area rule-engine

/area filters

/area yara

/area event

/area captures

/area alertsenders

/area outputs

/area rules

/area filaments

/area config

/area cli

/area tests

/area ci

/area build

/area docs

/area deps

/area other

Special notes for the reviewer


Does this PR introduce a user-facing change?


Yes, the foreach function must be properly documented and exposed to the end user.

@rabbitstack rabbitstack added the scope: filters Anything related to filters label Jan 15, 2025
@rabbitstack rabbitstack changed the title feat(filter): foreach function feat(filter): foreach function Jan 15, 2025
@rabbitstack rabbitstack force-pushed the foreach-function branch 2 times, most recently from ccb3616 to f78ea76 Compare January 15, 2025 17:55
@rabbitstack rabbitstack merged commit 9a14aa9 into master Jan 16, 2025
6 checks passed
@rabbitstack rabbitstack deleted the foreach-function branch January 16, 2025 17:07