[WIP] Add new file categorization plugin #2945#2958
[WIP] Add new file categorization plugin #2945#2958johnmhoran wants to merge 13 commits intodevelopfrom
Conversation
|
This PR is related to issue #2945. |
pombredanne
left a comment
There was a problem hiding this comment.
Thanks! See my nits inline.
Also the test file jarDependImg.png likely comes from http://svn.apache.org/repos/asf/xmlgraphics/batik/branches/webapi/documentation-sources/resources/images/jarDependImg.png under an Apache license that we did not document and is also too big for a test file. Can you instead create an extra small PNG that is not from a third-party?
| from commoncode.datautils import String | ||
| from plugincode.post_scan import PostScanPlugin | ||
| from plugincode.post_scan import post_scan_impl | ||
| from commoncode.cliutils import PluggableCommandLineOption |
| from commoncode.cliutils import POST_SCAN_GROUP | ||
|
|
||
| """ | ||
| Categorize files. |
There was a problem hiding this comment.
Could you expand the documentation to explain the workings of this module at a high level?
|
|
||
| options = [ | ||
| PluggableCommandLineOption( | ||
| ("--file-cat",), |
There was a problem hiding this comment.
We need a more explicit, non-abbreviated command line option, like --categorize-file or --file-category to stay consistent with other options.
| @classmethod | ||
| def categorize(cls, resource): | ||
| """ | ||
| Return True if this Categorizer applies to this resource or False or None |
There was a problem hiding this comment.
| Return True if this Categorizer applies to this resource or False or None | |
| Return True if this Categorizer category applies to this ``resource``. |
But what is resource ? a path string? or a location? some object? we need to be precise there.
| resource.save(codebase) | ||
|
|
||
|
|
||
| class Categorizer: |
There was a problem hiding this comment.
Add some docstring to document what is this class used for and how to extend it
| class Categorizer: | |
| class Categorizer: | |
| """ | |
| A categorizer can categorize a file Resource based on its path, name, | |
| extension and other attributes such as file or mime type. | |
| Subclasses must override the ``categorize`` class method. | |
| """ |
|
|
||
| class Categorizer: | ||
| order = 0 | ||
| analysis_priority = None |
There was a problem hiding this comment.
Could this be named importance or prominence?
analysis_priority seems to imply this is only use to drive some analysis?
| analysis_priority = "1" | ||
| file_category = "archive" | ||
| file_subcategory = "debian" | ||
| category_notes = "special type of archive" |
There was a problem hiding this comment.
why not just call this notes?
| order = 0 | ||
| analysis_priority = None | ||
| file_category = None | ||
| file_subcategory = None |
There was a problem hiding this comment.
We should always enforce these to be lowercase: we have a mix or capital case or lowercase subcategories at the moment.
| @classmethod | ||
| def categorize(cls, resource): | ||
| return extension_in(resource.extension, [".bzl"]) or ( | ||
| name_in(resource.name, ["build.bazel"]) and resource.type == "file" |
There was a problem hiding this comment.
is resource_type == file needed here? Are these not all about files?
|
|
||
| @classmethod | ||
| def categorize(cls, resource): | ||
| return extension_uppercase_in(resource.extension, [".S"]) |
Signed-off-by: John M. Horan <[email protected]>
Signed-off-by: John M. Horan <[email protected]>
Signed-off-by: John M. Horan <[email protected]>
Signed-off-by: John M. Horan <[email protected]>
Signed-off-by: John M. Horan <[email protected]>
Signed-off-by: John M. Horan <[email protected]>
Signed-off-by: John M. Horan <[email protected]>
Signed-off-by: John M. Horan <[email protected]>
Reference: #2945 Signed-off-by: John M. Horan <[email protected]>
Reference: #2945 Signed-off-by: John M. Horan <[email protected]>
Reference: #2945 Signed-off-by: John M. Horan <[email protected]>
Reference: #2945 Signed-off-by: John M. Horan <[email protected]>
1f7d78f to
e4f64d2
Compare
* Add in postscan plugin entry points Signed-off-by: John M. Horan <[email protected]>
This PR add a new file categorization plugin.