Display extra-words in detection_log if present #4402
AyanSinhaMahapatra
left a comment
@alok1304 thanks! See comments for changes.
Please always run locally the subset of tests that could be affected and inspect the failures before raising a PR. Also make sure you regenerate the test expectations for these failures and verify that the changes are intended. See https://scancode-toolkit.readthedocs.io/en/stable/contribute/contrib_dev.html#running-tests
Please also add a test from #4400 focused just on extra words. Without tests, it's much more work for reviewers to verify that this is working correctly.
Still, we are not getting extra-words in I am trying to find example files where extra-words are present.
@AyanSinhaMahapatra I added a test for when we get
These failing test cases pass on my system, see
AyanSinhaMahapatra
left a comment
Thanks @alok1304, this is looking mostly good and is ready to merge after a couple of changes. See the comments for more details.
Don't worry about the extra test failures; they are there because of #4369, in the tests where we perform a force upgrade of all our dependencies and run the tests there. See https://github.com/aboutcode-org/scancode-toolkit/blob/develop/azure-pipelines.yml#L224
src/licensedcode/detection.py
      return DetectionCategory.LICENSE_CLUES.value

    - # Case where all matches have `matcher` as `1-hash` or `4-spdx-id`
    + # Case where all matches have `matcher` as `1-hash` or `4-spdx-id` or 2-aho
This is not correct, could you revert is_correct_detection back to:
all(matcher in ("1-hash", "1-spdx-id") for matcher in matchers)
The 2-aho cases are meant to be caught below here, only if all other cases in between are not present:
# Cases where Match Coverage is a perfect 100 for all matches
else:
    return DetectionCategory.PERFECT_DETECTION.value
The only bug we needed to fix was in get_detected_license_expression, where we were not handling the case analysis == DetectionCategory.EXTRA_WORDS.value, so the detection log was not being populated.
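To make the ordering argument in this comment concrete, here is a minimal, self-contained sketch. It is not the scancode-toolkit implementation: the `analyze` and `build_detection_log` functions, the `has_extra_words` key, and the enum string values are assumptions made up for illustration. Only `is_correct_detection`, the matcher names, and the `DetectionCategory` member names come from the discussion above.

```python
from enum import Enum


class DetectionCategory(Enum):
    # Illustrative subset; the string values are assumptions for this sketch.
    PERFECT_DETECTION = "perfect-detection"
    EXTRA_WORDS = "extra-words"


def is_correct_detection(matchers):
    # Only hash and SPDX-id matches short-circuit; "2-aho" is deliberately excluded.
    return all(matcher in ("1-hash", "1-spdx-id") for matcher in matchers)


def analyze(matches):
    """Hypothetical, simplified stand-in for the ordered category checks."""
    matchers = [m["matcher"] for m in matches]
    if is_correct_detection(matchers):
        # Assumed return value for the short-circuit case in this sketch.
        return DetectionCategory.PERFECT_DETECTION.value
    # Intermediate checks run before the final fallback.
    if any(m["has_extra_words"] for m in matches):
        return DetectionCategory.EXTRA_WORDS.value
    # Reached by "2-aho" matches only when no earlier case applied.
    return DetectionCategory.PERFECT_DETECTION.value


def build_detection_log(analysis):
    """Hypothetical helper: record the category that was previously not handled."""
    detection_log = []
    if analysis == DetectionCategory.EXTRA_WORDS.value:
        detection_log.append(DetectionCategory.EXTRA_WORDS.value)
    return detection_log


aho_with_extra_words = [{"matcher": "2-aho", "has_extra_words": True}]
hash_match = [{"matcher": "1-hash", "has_extra_words": False}]

print(build_detection_log(analyze(aho_with_extra_words)))
print(build_detection_log(analyze(hash_match)))
```

In this sketch, adding "2-aho" to the tuple in `is_correct_detection` would make the first call return early and leave the log empty, which is the behaviour the review comment asks to avoid.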
These test failures happen when I change this
AyanSinhaMahapatra
left a comment
Thanks, please see comment below.
@AyanSinhaMahapatra I made all the changes you requested, in case of Also, I squashed all my commits into a single commit.
@alok1304 could you add the
Hi @AyanSinhaMahapatra, I did these things,
@AyanSinhaMahapatra The file that you provided is a case in which we don't receive This text is present in one of the files in https://registry.npmjs.org/source-map/-/source-map-0.6.1.tgz. Here, due to I think we have to modify this.
Reference: aboutcode-org#4400 Signed-off-by: Alok Kumar <[email protected]>
Also, I regenerated the tests for https://github.com/aboutcode-org/scancode-toolkit/blob/develop/tests/packagedcode/test_license_detection.py#L265. Previously, `extra-words` was not detected there because of the `referenced_filenames` tag in the license rule. Signed-off-by: Alok Kumar <[email protected]>
This is a separate case, which is correctly detected as a We cannot use
36a5bc2 into aboutcode-org:develop
extra_words in secondary types of Detections.
2-aho from matcher because previously, due to this, we get perfect-detection because of this https://github.com/aboutcode-org/scancode-toolkit/blob/develop/src/licensedcode/detection.py#L1729 , here 2-aho matcher included in this.
update: when I do this many test cases are failing.
these work fine when our matcher is 3-seq.
Fixes #4400
Tasks
Run tests locally to check for errors.
Signed-off-by: Alok Kumar [email protected]