Skip to content

Allow metadata to contain a list of values#201

Merged
plusvic merged 3 commits intoVirusTotal:masterfrom
cccs-rs:master
May 20, 2022
Merged

Allow metadata to contain a list of values#201
plusvic merged 3 commits intoVirusTotal:masterfrom
cccs-rs:master

Conversation

@cccs-rs
Copy link
Contributor

@cccs-rs cccs-rs commented Mar 9, 2022

Derivation of: #74 😍

This allows the Match.meta values to becomes lists if there are rule meta with the same name but different values. This is also to bring output from the Python library more inline with the output from the commandline.

ie. Suppose there's a match given for the following rule:

rule myRule :
{
	meta:
              ...
		malware = "BAD THING"
		malware = "REALLY BAD THING"
         ...
}

The corresponding values in Match.meta['malware'] will be:
["BAD THING", "REALLY BAD THING"] instead of just "REALLY BAD THING"

@google-cla
Copy link

google-cla bot commented Mar 9, 2022

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

For more information, open the CLA check for this pull request.

@plusvic
Copy link
Member

plusvic commented Mar 10, 2022

But this breaks backward compatibility, right? People that are already using Match.meta do not expect to receive an array when they do Match.meta['malware']. We should keep with the approach followed in #74, add a new function for those that want to retrieve metadata with multiple values.

@cccs-rs
Copy link
Contributor Author

cccs-rs commented Mar 10, 2022

But this breaks backward compatibility, right? People that are already using Match.meta do not expect to receive an array when they do Match.meta['malware']. We should keep with the approach followed in #74, add a new function for those that want to retrieve metadata with multiple values.

Could a flag in Rules.match() control this behaviour, say via a param overwrite_meta_values?
The flag will have the default state to maintain backwards compatibility, but for those that want this feature, they can change the flag.

Not sure if it's ideal to have a special function that outputs arrays for all metadata since it could just be a flag that changes the output from Dict[str, str] to Dict[str, List[str]].

@plusvic
Copy link
Member

plusvic commented Mar 10, 2022

Yes, an argument allow_duplicate_metadata to Rules.match() that changes that behaviour should be fine. The default value would be False, which means that Match.meta will be a dictionary a Dict[key, value] as it is now. If the value is True , Match.meta will be a Dict[key, [value1, value2, ....]].

@cccs-rs
Copy link
Contributor Author

cccs-rs commented Mar 14, 2022

bump?

@cccs-rs
Copy link
Contributor Author

cccs-rs commented Mar 22, 2022

bump

4 similar comments
@cccs-rs
Copy link
Contributor Author

cccs-rs commented Mar 31, 2022

bump

@cccs-rs
Copy link
Contributor Author

cccs-rs commented Apr 5, 2022

bump

@cccs-rs
Copy link
Contributor Author

cccs-rs commented Apr 12, 2022

bump

@cccs-rs
Copy link
Contributor Author

cccs-rs commented Apr 19, 2022

bump

@cccs-rs
Copy link
Contributor Author

cccs-rs commented Apr 26, 2022

bump

@plusvic
Copy link
Member

plusvic commented Apr 26, 2022

@cccs-rs, adding the same comment every week is not going to speed up things. It's ok to ping the maintainers from time to time, but not at this rate.

yara-python.c Outdated
PyDict_SetItemString(meta_list, meta->identifier, object);
Py_DECREF(object);

if (CALLBACK_ALLOW_DUPLICATES){
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The use of the global variable CALLBACK_ALLOW_DUPLICATES is not safe with concurrent code. A better solution is adding a new field allow_duplicate_metadata to the CALLBACK_DATA structure. This field would be initialized with the value passed to match, and the callback function receives a pointer to CALLBACK_DATA where all the context about the current scan can be found.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated, thanks for the feedback! 😀

@plusvic
Copy link
Member

plusvic commented May 19, 2022

I also miss a test case for this new feature.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants