Conversation
JSON Result when valid match is found |
mlodic
left a comment
There was a problem hiding this comment.
:) good job, few things to address
| "name": "YaraX", | ||
| "description": "[YaraX](https://virustotal.github.io/yara-x/docs/intro/getting-started/) is a re-incarnation of YARA, a pattern matching tool designed with malware researchers in mind. This new incarnation intends to be faster, safer and more user-friendly than its predecessor.", | ||
| "disabled": False, | ||
| "soft_time_limit": 60, |
There was a problem hiding this comment.
I don't know the performance, it should be fast but we could put this to a higher value to be cautious
There was a problem hiding this comment.
Though, for the most part of my testing, the analyzer finished in around 30 seconds. But sure I can raise this to a higher value.
| return True | ||
|
|
||
| except Exception as e: | ||
| logger.error(f"Failed to update yara-forge rules. Error: {e}") |
There was a problem hiding this comment.
logger.exception so that we can have the traceback. Also please add message this to self.report as suggested in the other PRs. (the addition in self.report is automatically handled by AnalyzerRunException. Another option could be to just raise the AnalyzerRunException directly here.)
|
|
||
| rule_dir = f"{BASE_RULES_LOCATION}/{self.rule_set}" | ||
| if not os.path.isdir(rule_dir) and not self.update(rule_set=self.rule_set): | ||
| logger.info(f"Failed to update {self.rule_set} rule set") |
There was a problem hiding this comment.
this logger.info is not necessary because the AnalyzerRunException will already trigger a log.error message
| ) | ||
|
|
||
| rule_dir = f"{BASE_RULES_LOCATION}/{self.rule_set}" | ||
| if not os.path.isdir(rule_dir) and not self.update(rule_set=self.rule_set): |
There was a problem hiding this comment.
is there a way to know whether these rules have a new version or not like for yara or is not possible?
In case it is not possible, it would be nice to add an additional parameter called "force packages download" or something similar to force the download of these packages even if they are already present. Otherwise there is no chance to get the new content. This parameter would be false as default. Thoughts?
There was a problem hiding this comment.
I think this can be done by creating an entry in a simple model to track the last_downloaded_version and then when in future the rules are supposed to be updated, it can be cross-checked with the db entry.
If you visit this URL, you can see the tag_name , using which we can achieve this.
Let me know what are your thoughts on this implementation? :)
There was a problem hiding this comment.
good proposal, feel free to do that thanks ;)
| rules = compiler.build() | ||
| logger.info("Successfully compiled and built rules") | ||
|
|
||
| logger.info(f"Starting scanning {self.filename} with {self.rule_set} rules") |
There was a problem hiding this comment.
can you log the self.md5 too please?
|
|
||
| logger.info(f"Successfully scanned {self.filename} with hash {self.md5}") | ||
|
|
||
| return "No Match" if not result else result |
There was a problem hiding this comment.
I would always return the same type and, to be more specific, a JSON, You can create a dict with a single key called "results" and in case there are no matches it would be enough to have an empty list
There was a problem hiding this comment.
I can definitely create a dict with key "results" but I would like to return the following
{"results": "No Match"}
as the result, since this would explicitly inform the user that there were no matches instead of passing an empty list, which can be vague.
fgibertoni
left a comment
There was a problem hiding this comment.
Small comments from me also, nice work overall!
api_app/analyzers_manager/migrations/0164_analyzer_config_yarax.py
Outdated
Show resolved
Hide resolved
| for rule in scan_results.matching_rules: | ||
| logger.info(f"Rule Identifier: {rule.identifier}") | ||
| rule_metadata = {} | ||
| for detail in rule.metadata: |
There was a problem hiding this comment.
Maybe unpacking them in the for loop to improve readability?
| for detail in rule.metadata: | |
| for identifier, value in rule.metadata: |
There was a problem hiding this comment.
Plus, if rule.metadata is a list of tuple with two elements each you can remove the for loop by calling dict() directly:
>>> test = [("value", "data"), ("value2", "data2")]
>>> dict(test)
{'value': 'data', 'value2': 'data2'}There was a problem hiding this comment.
Sounds good, I'll do it this way.
There was a problem hiding this comment.
Some food for thought. View full project report here.
Sure this can definitely be done. I'll create the helper methods and make adequate changes to the existing code, so that we can have more modular code. |
|
This pull request has been marked as stale because it has had no activity for 10 days. If you are still working on this, please provide some updates or it will be closed in 5 days. |
There was a problem hiding this comment.
Worth considering. View full project report here.
| rules_file_path = self.get_rule_location() | ||
| logger.info(f"Found rules at {rules_file_path}") | ||
|
|
||
| with open(rules_file_path, mode="r") as f: |
There was a problem hiding this comment.
UnicodeDecodeError can occur if the content of the file has characters incompatible with the OS's default encoding. Python uses the OS's default text encoding on the content because encoding is not set. Read more.
closes #2592 and #2589
Description
This PR aims to add new YaraX analyzer alongwith integration of yara-forge rule repository for enhanced ruleset selection.
Type of change
Please delete options that are not relevant.
Checklist
developdumpplugincommand and added it in the project as a data migration. ("How to share a plugin with the community")test_files.zipand you added the default tests for that mimetype in test_classes.py.FREE_TO_USE_ANALYZERSplaybook by following this guide.urlthat contains this information. This is required for Health Checks (HEAD HTTP requests)._monkeypatch()was used in its class to apply the necessary decorators.MockUpResponseof the_monkeypatch()method. This serves us to provide a valid sample for testing.DataModelfor the new analyzer following the documentation# This file is a part of IntelOwl https://github.com/intelowlproject/IntelOwl # See the file 'LICENSE' for copying permission.Black,Flake,Isort) gave 0 errors. If you have correctly installed pre-commit, it does these checks and adjustments on your behalf.testsfolder). All the tests (new and old ones) gave 0 errors.DeepSource,Django Doctorsor other third-party linters have triggered any alerts during the CI checks, I have solved those alerts.Important Rules