YaraX Analyzer with Yara-Forge Rule Repository integration by spoiicy · Pull Request #2980 · intelowlproject/IntelOwl

spoiicy · 2025-08-29T07:31:49Z

Description

This PR adds a new YaraX analyzer along with integration of the yara-forge rule repository for enhanced ruleset selection.

Type of change

Please delete options that are not relevant.

Bug fix (non-breaking change which fixes an issue).
New feature (non-breaking change which adds functionality).
Breaking change (fix or feature that would cause existing functionality to not work as expected).

Checklist

Important Rules

If you fail to complete the Checklist properly, your PR won't be reviewed by the maintainers.
Every time you make changes to the PR and you think the work is done, you should explicitly ask for a review using GitHub's reviewing system.

spoiicy · 2025-08-29T08:10:25Z

JSON Result when valid match is found

{
  "report": [
    {
      "rule_metadata": {
        "id": "09a400f5-e837-58c2-9b51-9213c8ab0883",
        "date": "2024-01-01",
        "hash": "3a9ee09ed965e3aee677043ba42c7fdbece0150ef9d1382c518b4b96bbd0e442",
        "tags": "FILE",
        "score": 50,
        "author": "Jonathan Peters",
        "quality": 80,
        "modified": "2024-01-03",
        "reference": "https://www.gapotchenko.com/eazfuscator.net",
        "logic_hash": "5f3f3358e3cfb274aa2e8465dde58a080f9fb282aa519885b9d39429521db6d9",
        "source_url": "https://github.com/cod3nym/detection-rules//blob/5939dadd34ebd3c111f97ba0bc0085b639e142a5/yara/dotnet/obf_eazfuscator.yar#L1-L28",
        "description": "Detects .NET images obfuscated with Eazfuscator string encryption. Eazfuscator is a widely used commercial obfuscation solution used by both legitimate software and malware.",
        "license_url": "https://github.com/cod3nym/detection-rules//blob/5939dadd34ebd3c111f97ba0bc0085b639e142a5/LICENSE.md"
      },
      "pattern_details": [
        {
          "match_details": [
            {
              "match_length": 10,
              "match_offset": 249641,
              "match_xor_key": null
            }
          ],
          "pattern_identifier": "$sa1"
        },
        {
          "match_details": [
            {
              "match_length": 10,
              "match_offset": 249652,
              "match_xor_key": null
            }
          ],
          "pattern_identifier": "$sa2"
        },
        {
          "match_details": [
            {
              "match_length": 5,
              "match_offset": 271822,
              "match_xor_key": null
            }
          ],
          "pattern_identifier": "$sa3"
        },
        {
          "match_details": [
            {
              "match_length": 4,
              "match_offset": 263900,
              "match_xor_key": null
            }
          ],
          "pattern_identifier": "$sa4"
        },
        {
          "match_details": [
            {
              "match_length": 27,
              "match_offset": 44444,
              "match_xor_key": null
            },
            {
              "match_length": 27,
              "match_offset": 44873,
              "match_xor_key": null
            }
          ],
          "pattern_identifier": "$op1"
        },
        {
          "match_details": [],
          "pattern_identifier": "$op2"
        },
        {
          "match_details": [
            {
              "match_length": 9,
              "match_offset": 43142,
              "match_xor_key": null
            },
            {
              "match_length": 9,
              "match_offset": 43165,
              "match_xor_key": null
            },
            {
              "match_length": 9,
              "match_offset": 43202,
              "match_xor_key": null
            },
            {
              "match_length": 9,
              "match_offset": 43232,
              "match_xor_key": null
            },
            {
              "match_length": 9,
              "match_offset": 43262,
              "match_xor_key": null
            },
            {
              "match_length": 9,
              "match_offset": 43292,
              "match_xor_key": null
            },
            {
              "match_length": 9,
              "match_offset": 43322,
              "match_xor_key": null
            }
          ],
          "pattern_identifier": "$op3"
        },
        {
          "match_details": [],
          "pattern_identifier": "$op4"
        }
      ],
      "rule_identifier": "COD3NYM_SUSP_OBF_NET_Eazfuscator_String_Encryption_Jan24"
    }
  ],
  "data_model": null,
  "errors": [],
  "parameters": {
    "rule_set": "full"
  }
}
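
For readers who want to consume this output, the structure above can be walked as in the following minimal sketch. It assumes the report has exactly the shape shown in this comment; the function name `summarize_matches` is hypothetical and not part of the PR.

```python
from typing import Any


def summarize_matches(report: list[dict[str, Any]]) -> dict[str, dict[str, int]]:
    """Map each matching rule to the number of hits per pattern.

    Assumes the layout shown in the sample result: a list of rule entries,
    each with "rule_identifier" and "pattern_details".
    """
    summary: dict[str, dict[str, int]] = {}
    for rule in report:
        hits = {
            pattern["pattern_identifier"]: len(pattern["match_details"])
            for pattern in rule.get("pattern_details", [])
        }
        summary[rule["rule_identifier"]] = hits
    return summary


# Trimmed-down copy of the sample result above (only $op1 and $op2 kept).
sample_report = [
    {
        "rule_identifier": "COD3NYM_SUSP_OBF_NET_Eazfuscator_String_Encryption_Jan24",
        "pattern_details": [
            {
                "pattern_identifier": "$op1",
                "match_details": [
                    {"match_length": 27, "match_offset": 44444, "match_xor_key": None},
                    {"match_length": 27, "match_offset": 44873, "match_xor_key": None},
                ],
            },
            {"pattern_identifier": "$op2", "match_details": []},
        ],
    }
]

summary = summarize_matches(sample_report)
```

Note that patterns with an empty `match_details` list (such as `$op2` and `$op4` in the sample) still appear in the output, so a consumer has to count hits rather than rely on a pattern's presence.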

mlodic

:) good job, few things to address

mlodic · 2025-08-29T08:27:48Z

api_app/analyzers_manager/migrations/0164_analyzer_config_yarax.py

+    "name": "YaraX",
+    "description": "[YaraX](https://virustotal.github.io/yara-x/docs/intro/getting-started/) is a re-incarnation of YARA, a pattern matching tool designed with malware researchers in mind. This new incarnation intends to be faster, safer and more user-friendly than its predecessor.",
+    "disabled": False,
+    "soft_time_limit": 60,


I don't know the performance, it should be fast but we could put this to a higher value to be cautious

In most of my testing, the analyzer finished in around 30 seconds. But sure, I can raise this to a higher value.

yeah please do that if you can

mlodic · 2025-08-29T08:29:33Z

api_app/analyzers_manager/file_analyzers/yarax.py

+            return True
+
+        except Exception as e:
+            logger.error(f"Failed to update yara-forge rules. Error: {e}")


Use logger.exception so that we can have the traceback. Also, please add this message to self.report as suggested in the other PRs. (The addition to self.report is automatically handled by AnalyzerRunException. Another option could be to just raise the AnalyzerRunException directly here.)
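
The two options mentioned in this comment might look roughly like the sketch below. `AnalyzerRunException` is defined here only as a stand-in so the snippet runs on its own; in IntelOwl it is the project's own exception, and per the comment above the framework takes care of adding the message to the report. `download_rules` is a hypothetical placeholder for whatever fetches the rules.

```python
import logging

logger = logging.getLogger(__name__)


class AnalyzerRunException(Exception):
    """Stand-in for IntelOwl's exception of the same name (assumption)."""


def download_rules() -> None:
    """Hypothetical download step that always fails, for illustration."""
    raise ConnectionError("network unreachable")


def update_rules() -> bool:
    try:
        download_rules()
        return True
    except Exception as e:
        # logger.exception logs at ERROR level and appends the traceback,
        # which logger.error does not do unless exc_info=True is passed.
        logger.exception("Failed to update yara-forge rules")
        # Re-raise as the analyzer exception; `from e` keeps the original cause.
        raise AnalyzerRunException(
            f"Failed to update yara-forge rules. Error: {e}"
        ) from e
```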

mlodic · 2025-08-29T08:30:49Z

api_app/analyzers_manager/file_analyzers/yarax.py

+
+        rule_dir = f"{BASE_RULES_LOCATION}/{self.rule_set}"
+        if not os.path.isdir(rule_dir) and not self.update(rule_set=self.rule_set):
+            logger.info(f"Failed to update {self.rule_set} rule set")


this logger.info is not necessary because the AnalyzerRunException will already trigger a log.error message

Removed it.

mlodic · 2025-08-29T08:32:47Z

api_app/analyzers_manager/file_analyzers/yarax.py

+            )
+
+        rule_dir = f"{BASE_RULES_LOCATION}/{self.rule_set}"
+        if not os.path.isdir(rule_dir) and not self.update(rule_set=self.rule_set):


Is there a way to know whether these rules have a new version, as there is for yara, or is that not possible?

In case it is not possible, it would be nice to add an additional parameter called "force packages download" or something similar to force the download of these packages even if they are already present. Otherwise there is no chance to get the new content. This parameter would be false as default. Thoughts?

I think this can be done by creating an entry in a simple model to track the last_downloaded_version and then when in future the rules are supposed to be updated, it can be cross-checked with the db entry.

If you visit this URL, you can see the tag_name, which we can use to achieve this.

Let me know your thoughts on this implementation. :)

good proposal, feel free to do that thanks ;)
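
The proposal above can be sketched as follows, under stated assumptions: a plain dict stands in for the database model holding `last_downloaded_version`, the callables `fetch_latest_tag` and `download` are hypothetical placeholders for the release lookup and the actual download, and the `force` flag mirrors the "force packages download" idea from the review. The tags `"v1"` and `"v2"` are made-up values.

```python
from typing import Callable, Optional


def needs_update(stored_tag: Optional[str], latest_tag: str, force: bool = False) -> bool:
    """Return True when the rules should be downloaded again."""
    if force or stored_tag is None:
        return True
    return stored_tag != latest_tag


def update_if_needed(
    store: dict,
    fetch_latest_tag: Callable[[], str],
    download: Callable[[str], None],
    force: bool = False,
) -> bool:
    """Download rules only when the release tag changed; return True if downloaded."""
    latest = fetch_latest_tag()
    if not needs_update(store.get("last_downloaded_version"), latest, force):
        return False
    download(latest)
    # Record the version only after the download call returned without raising.
    store["last_downloaded_version"] = latest
    return True


store: dict = {}
downloads: list = []
first = update_if_needed(store, lambda: "v1", downloads.append)   # nothing stored yet
second = update_if_needed(store, lambda: "v1", downloads.append)  # same tag, skipped
third = update_if_needed(store, lambda: "v2", downloads.append)   # new tag
forced = update_if_needed(store, lambda: "v2", downloads.append, force=True)
```

Storing the tag only after the download step means a failed download is retried on the next run instead of being silently recorded as current.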

mlodic · 2025-08-29T08:34:15Z

api_app/analyzers_manager/file_analyzers/yarax.py

+            rules = compiler.build()
+            logger.info("Successfully compiled and built rules")
+
+            logger.info(f"Starting scanning {self.filename} with {self.rule_set} rules")


can you log the self.md5 too please?

mlodic · 2025-08-29T08:36:37Z

api_app/analyzers_manager/file_analyzers/yarax.py

+
+            logger.info(f"Successfully scanned {self.filename} with hash {self.md5}")
+
+            return "No Match" if not result else result


I would always return the same type and, to be more specific, a JSON. You can create a dict with a single key called "results"; in case there are no matches, an empty list would be enough.

I can definitely create a dict with key "results" but I would like to return the following

{"results": "No Match"}

as the result, since this would explicitly inform the user that there were no matches instead of passing an empty list, which can be vague.
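
For comparison, the reviewer's suggestion (a single key whose value is always a list) could look like the sketch below. The function names are hypothetical, and this thread does not show which variant was finally merged.

```python
from typing import Any


def build_report(matches: list[dict[str, Any]]) -> dict[str, list[dict[str, Any]]]:
    """Always return a dict whose "results" value is a list, even with no matches."""
    return {"results": list(matches)}


def has_match(report: dict[str, list[dict[str, Any]]]) -> bool:
    """Consumer-side check that needs no special case for a sentinel string."""
    return len(report["results"]) > 0


empty_report = build_report([])
hit_report = build_report([{"rule_identifier": "demo_rule"}])
```

With a fixed type, downstream code can iterate over `report["results"]` unconditionally. If the value were sometimes the string "No Match", the same loop would iterate over its characters unless the caller checked the type first.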

fgibertoni

Small comments from me also, nice work overall!

api_app/analyzers_manager/migrations/0164_analyzer_config_yarax.py

api_app/analyzers_manager/file_analyzers/yarax.py

fgibertoni · 2025-08-29T09:45:46Z

api_app/analyzers_manager/file_analyzers/yarax.py

+            for rule in scan_results.matching_rules:
+                logger.info(f"Rule Identifier: {rule.identifier}")
+                rule_metadata = {}
+                for detail in rule.metadata:


Maybe unpacking them in the for loop to improve readability?

Suggested change

- for detail in rule.metadata:
+ for identifier, value in rule.metadata:

Plus, if rule.metadata is a list of tuples with two elements each, you can remove the for loop by calling dict() directly:

>>> test = [("value", "data"), ("value2", "data2")]
>>> dict(test)
{'value': 'data', 'value2': 'data2'}

Sounds good, I'll do it this way.

code-review-doctor

Some food for thought. View full project report here.

api_app/analyzers_manager/file_analyzers/yarax.py

fgibertoni

Great improvement! I'll also let @mlodic and @drosetti have a look at this before merging 😄

api_app/analyzers_manager/models.py

mlodic

There is some shared code with the Capa PR. It could make sense to merge that one first, then import the changes here and re-use the same functions (with a Mixin as helper). Other than that, the changes are fine, good job!

api_app/analyzers_manager/models.py

spoiicy · 2025-09-15T14:07:15Z

There is some shared code with the Capa PRs. It could make sense to merge that one first, then import the changes here and re-use the same functions (with a Mixin as helper). Other than that, the changes are fine, good job!

Sure this can definitely be done. I'll create the helper methods and make adequate changes to the existing code, so that we can have more modular code.
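
As a rough illustration of the mixin idea discussed here, the sketch below shares rule-directory helpers between two analyzer-like classes. Every name in it (`RulesDirectoryMixin`, the two analyzer classes, the `/tmp/rules` path, the `"capa"` rule set) is a hypothetical placeholder and does not reflect the helper that was actually merged.

```python
import os


class RulesDirectoryMixin:
    """Hypothetical helper shared by analyzers that keep rules on disk."""

    base_rules_location: str = "/tmp/rules"  # placeholder path (assumption)
    rule_set: str = "default"

    def rule_dir(self) -> str:
        """Directory where this analyzer's rule set is expected to live."""
        return os.path.join(self.base_rules_location, self.rule_set)

    def rules_present(self) -> bool:
        """True when the rule directory already exists on disk."""
        return os.path.isdir(self.rule_dir())


class YaraXLikeAnalyzer(RulesDirectoryMixin):
    rule_set = "full"


class CapaLikeAnalyzer(RulesDirectoryMixin):
    rule_set = "capa"


yarax_dir = YaraXLikeAnalyzer().rule_dir()
capa_dir = CapaLikeAnalyzer().rule_dir()
```

Each analyzer only overrides the class attributes it needs, while the path handling lives in one place.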

github-actions · 2025-09-26T09:24:38Z

This pull request has been marked as stale because it has had no activity for 10 days. If you are still working on this, please provide some updates or it will be closed in 5 days.

code-review-doctor

Worth considering. View full project report here.

code-review-doctor · 2025-11-02T19:22:39Z

api_app/analyzers_manager/file_analyzers/yarax.py

+        rules_file_path = self.get_rule_location()
+        logger.info(f"Found rules at {rules_file_path}")
+
+        with open(rules_file_path, mode="r") as f:


UnicodeDecodeError can occur if the content of the file has characters incompatible with the OS's default encoding. Python uses the OS's default text encoding on the content because encoding is not set.
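
A minimal sketch of the fix this bot comment points at is to pass `encoding` explicitly when opening the rules file. The example assumes the rules are stored as UTF-8, which the thread does not confirm; the temporary file and its content are made up for the demonstration.

```python
import os
import tempfile

# Write a rule-like file containing a non-ASCII character, encoded as UTF-8.
content = 'rule demo { meta: author = "Jos\u00e9" condition: true }\n'
with tempfile.NamedTemporaryFile(
    "w", encoding="utf-8", suffix=".yar", delete=False
) as tmp:
    tmp.write(content)
    rules_file_path = tmp.name

# Passing encoding explicitly makes the read independent of the OS default.
with open(rules_file_path, mode="r", encoding="utf-8") as f:
    loaded = f.read()

os.remove(rules_file_path)
```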

mlodic requested changes Aug 29, 2025

fgibertoni requested changes Aug 29, 2025

spoiicy force-pushed the yarax branch from 5914e50 to 1766797 September 7, 2025 13:16

spoiicy marked this pull request as ready for review September 7, 2025 13:44

code-review-doctor bot suggested changes Sep 7, 2025

spoiicy requested review from fgibertoni and mlodic September 7, 2025 13:45

fgibertoni reviewed Sep 8, 2025

mlodic requested changes Sep 15, 2025

github-actions bot added the stale label Sep 26, 2025

fgibertoni added the keep-open label ("To avoid workflow closing PRs") and removed the stale label Sep 29, 2025

Akshit Maheshwary and others added 5 commits November 2, 2025 15:13

added yarax analyzer

151d4fe

dumped migration and minor changes to analyzer

0cc1b9e

tracking rules latest version and update analyzer config

8a811d5

created new mixin for helper methods for managing rules

8aaf8d0

refactored yarax analyzer to helper methods from RulesUtilityMixin

9e65c97

spoiicy force-pushed the yarax branch from 1766797 to 9e65c97 November 2, 2025 19:22

code-review-doctor bot suggested changes Nov 2, 2025

spoiicy requested a review from fgibertoni November 3, 2025 13:13

mlodic self-requested a review November 4, 2025 14:49

mlodic approved these changes Nov 4, 2025

mlodic merged commit eb813e3 into develop Nov 5, 2025
11 checks passed

