Adds export report funtionality#870
Adds export report funtionality#870viyatb merged 4 commits intoowtf:developfrom saganshul:export_report_docx
Conversation
DePierre
left a comment
There was a problem hiding this comment.
A few suggestions after a first read-through. I didn't review the UI part.
| if not target_id: | ||
| raise tornado.web.HTTPError(400) | ||
| try: | ||
| filter_data = dict(self.request.arguments) # IMPORTANT!! |
There was a problem hiding this comment.
Useless comment. It would be much more helpful to specify why it is important.
| except exceptions.InvalidErrorReference: | ||
| raise tornado.web.HTTPError(400) | ||
|
|
||
| class ReportExportHandler(custom_handlers.APIRequestHandler): |
There was a problem hiding this comment.
A docstring giving an overview of the API would be nice. For instance, it would be interesting to know what formats of report are supported, etc.
| grouped_plugin_outputs = collections.OrderedDict(sorted(grouped_plugin_outputs.items())) | ||
|
|
||
| # Get mappings | ||
| if self.get_argument("mapping", None): |
There was a problem hiding this comment.
Could be shortened:
mappings = self.get_argument("mapping", None)
if mappings:
mappings = self.get_component("mapping_db").GetMappings(mappings)| for test_group in self.get_component("db_plugin").GetAllTestGroups(): | ||
| test_group["mapped_code"] = test_group["code"] | ||
| test_group["mapped_descrip"] = test_group["descrip"] | ||
| if mappings: |
There was a problem hiding this comment.
Could be shortened:
if mappings and test_group['code'] in mappings:
test_group["mapped_code"] = mappings[test_group['code']][0]
test_group["mapped_descrip"] = mappings[test_group['code']][1]Also, it would be more readable to name [0] and [1], for instance:
code, description = mappings[test_group['code']]
test_group["mapped_code"] = code
test_group["mapped_descrip"] = description| test_groups[test_group['code']] = test_group | ||
|
|
||
| vulnerabilities = [] | ||
| for key, value in grouped_plugin_outputs.iteritems(): |
There was a problem hiding this comment.
Just a note: iteritems does not exist in python 3.x. If OWTF is planing on supporting python3.x, it would be time to stop adding more incompatible code. Would items be enough? Or will it add to much overhead in python2.x?
|
|
||
| vulnerabilities = [] | ||
| for key, value in grouped_plugin_outputs.iteritems(): | ||
| obj = test_groups[key] |
There was a problem hiding this comment.
Is there a need for the temporary variable (for creating a copy and not a reference)? Or would the following work as well?
for key, value in grouped_plugin_outputs.items():
test_groups[key]['data'] = value
vulnerabilities.append(test_groups[key])Also in your current version, are we sure that test_groups[key] exists?
| parsed_ip = IPAddress(ip) | ||
| return parsed_ip.is_private | ||
|
|
||
| def get_rank(user_rank, owtf_rank): |
There was a problem hiding this comment.
Are you sure that such function is not already available? Either in OWTF or even in PTP?
There was a problem hiding this comment.
Apparently no, there is no other reference where the ranking scale is defined (except hardcoded in https://github.com/owtf/owtf/blob/master/framework/interface/templates/plugin_report.html)
There was a problem hiding this comment.
No there is not such a function exists.
There was a problem hiding this comment.
As such, it would be better to separate the get_rank function from the ranking scale. By splitting the two, the ranking scale could be used somewhere else in the framework, in case another feature needs to translate the rankings.
Also, get_rank should check that the index is defined in the dict and default to a default value if it is not. For instance, something like:
return RANKS.get(rank, 'Unknown')The string Unknown should be defined somewhere else, or Unranked could be used. You get the idea.
I would suggest specifying what is the return value in the docstring. Also, I am not sure to understand why the function should make the decision of which ranking to use. I believe that this should be done before calling it. Something like:
# From the API
rank_str(max(user_rank, owtf_rank))Finally, the name of the function could be a bit more specific in what it does with something like rank_human_readable, rank_str. From what I read, the function does not get anything, it simply translates a ranking from numerical to string value.
There was a problem hiding this comment.
Now I am realizing that we don't need such function. Instead, all we need is the dict translating the numerical ranking to a string.
The get_rank function can be removed and instead we could do the following:
# In some util file
from ptp.libptp.constants import INFO, LOW, MEDIUM, HIGH
OWTF_UNRANKED = -1
OWTF_PASSING = 0
OWTF_CRITICAL = 5
RANKS = {
OWTF_UNRANKED: 'Unranked',
OWTF_PASSING: 'Passing',
INFO: 'Informational',
LOW: 'Low',
MEDIUM: 'Medium',
HIGH: 'High',
OWTF_CRITICAL: 'Critical'
}
# In the API
poutput["rank"] = RANKS.get(max(user_rank, owtf_rank), RANKS[OWTF_UNRANKED])Of course this could be polished a bit more and the string translated value could also be defined as constants, etc. It is just a draft to show the alternative I am seeing.
There was a problem hiding this comment.
@DePierre Can I get all INFO, LOW, MEDIUM, HIGH, UNRANKED, PASSING, CRITICAL variable from ptp.
There was a problem hiding this comment.
@saganshul ptp does not have CRITICAL by design, since it deemed that only user could have the context to rank an issue as CRITICAL.
PASSING does not exist, since ptp does not evaluate if a report is passing or not. However, UNKNOWN is used when no ranking could be found. It should be noted though that UNKNOWN is 0, while OWTF uses 0 for PASSING.
|
@saganshul Thank you very much for implementing that very useful feature! One thing that bothers me a little bit is to understand how easy/painful it would be to support other templates. In your opinion, do your changes allow supporting multiple report templates? If not, wouldn't it be better to spend a little more time on the feature and include that thought in the code? I would personally imagine a configuration file defining the required information in a report (vuln title, description, ranking, etc.). Then I would imagine OWTF supporting at least 2 or 3 templates (one with highest rating first for instance, one ordered by plugin groups, etc.). And finally I would imagine having the ability as a user to select among the different templates. |
|
@DePierre Anshul and I thought about creating a repository of community supported templates so that users can have the choice as many as variations they want. The configuration file would not be necessary since the user can trivially move around the parts in the DOCX/ODT file. |
|
@viyatb How will OWTF fill the fields in the report templates? Is there a well-defined list of elements that are required, optional, and so on for OWTF templates? I am 100% for that community repository writing the multiple templates. It's really an awesome idea! But then it would require very strong documentations about what can/should/may/might be included in the template. Going a bit further, it might be required to implement the report generation using a plugin system, since it will have to deal with potentially 10s of different templates. As far as I can read in this PR, everything is hardcoded for one template. This is a great PoC but not something that is maintainable and can be improved easily for multiple formats. I am not trying to diminish any work already done here. However more work should be put in the design of the feature to be both easily maintainable and improved. It should also be properly documented, which doesn't seem to be the case in this PR (regarding what fields are available, required, optional, etc.). |
|
@DePierre yeah, documentation :sighs: lol. All your suggestions are on point and is going to be done in iterative commits. 👍 |
|
@saganshul the JS part LGTM. |
|
I changed commit message because currently it is very basic implementation.
@DePierre The same API will work for all templates. API throws all the information it has in OWTF DB about scan. So we just need to write template. Same API works for all template. Also multiple formats can be supported since what we can convert docx to different formats. |
|
@saganshul Sorry maybe I was not clear in my previous comment. Regarding the multiple templates, I meant that currently, only one .docx template is supported (the one you linked) and that currently, the PR does not easily allow using another one. For instance, there could be a second template that sorts the findings by plugin type (web active first, web semi-passive second, web passive third, etc.). In that case, it will require some changes in the feature to easily support that second template. Maybe I am missing something that I am not seeing. Regarding the everything is hardcoded for one template, you just mentioned the issue: Template only needs to know keys and JSON structure. In other, if I want to add the duration time for each plugin, I will have to modify the feature to add one additional key. Or is it exposed? Once again, I am maybe missing something that I am not seeing. This last remark is also related to the lack of documentation. What does OWTF offer in terms of data? You say API throws all the information it has in OWTF DB about scan. In the PR, I see that it offers the vulnerability title, its data, etc. But I only understand that by reading the source code or know the inner working of its DB. For instance, with your PR, is it possible to know if a ranking is coming from OWTF or the user? I would believe that we would need to clearly define all the data that one would need in a report and expose all of them. Then, each template would have to follow that documentation and pick the data it needs. Does it make more sense? |
DePierre
left a comment
There was a problem hiding this comment.
I added a few more comments that, I believe, could improve the quality of the code.
| filter_data = dict(self.request.arguments) | ||
| plugin_outputs = self.get_component("plugin_output").GetAll(filter_data, target_id=target_id, inc_output=True) | ||
| # Group the plugin outputs to make it easier in template | ||
| grouped_plugin_outputs = {} |
There was a problem hiding this comment.
I believe that this could be shortened with:
grouped_plugin_outputs = {poutput['plugin_code']: [] for poutput in plugin_outputs}There was a problem hiding this comment.
for poutput in plugin_outputs:
if grouped_plugin_outputs.get(poutput['plugin_code']) is None:
grouped_plugin_outputs[poutput['plugin_code']] = []
poutput["rank"] = get_rank(poutput["user_rank"], poutput["owtf_rank"])
grouped_plugin_outputs[poutput['plugin_code']].append(poutput)
This total thing can't be replaced with
grouped_plugin_outputs = {poutput['plugin_code']: [] for poutput in plugin_outputs}
This thing only does this part:
for poutput in plugin_outputs:
if grouped_plugin_outputs.get(poutput['plugin_code']) is None:
grouped_plugin_outputs[poutput['plugin_code']] = []
There was a problem hiding this comment.
@saganshul I understand, I misread the indentation and was only considering the first three lines.
Would the following structure be better in your opinion?
from collections import defaultdict
grouped_plugin_outputs = defaultdict(list)
for output in plugin_outputs:
output['rank'] = RANKS.get(max(output['user_rank'], output['owtf_rank']))
grouped_plugin_outputs[output['plugin_code']].append(output)| else: | ||
| raise tornado.web.HTTPError(400) | ||
|
|
||
| except exceptions.InvalidTargetReference as e: |
There was a problem hiding this comment.
Could we reduce the scope of the try/except statement? It is usually recommended to have the smallest scope for try/except.
| grouped_plugin_outputs = {} | ||
| for poutput in plugin_outputs: | ||
| if grouped_plugin_outputs.get(poutput['plugin_code']) is None: | ||
| # No problem of overwriting |
There was a problem hiding this comment.
I am not sure to understand the the point of this comment. What is overridden? Maybe I am not understanding the code snippet but I believe that the line below this comment does not override anything.
| """ | ||
| Class handling APIs related to export report funtionality. | ||
| In OWTF, export report funtionality is implemented using docxtemplater. | ||
| docxtemplater takes two inputs: |
There was a problem hiding this comment.
I don't believe that this belongs to the docstring of the API. As far as I can see, docxtemplater is only being used in a JS file that should be abstracted from this API. It does not and should not be a concern for the API whether docxtemplater or .docx format is used or not.
The docstring should focus on what the API does. As far as I can tell, it fetches data from the DB to prepare some JSON structure that contain information about a target, plugins run, rankings, etc. What information does it fetch? What does it do with it? What return value does it return?
I would prefer to have a docstring that explains what it does rather than a docstring that explains what another piece of code somewhere else can do with the data it created.
| tornado.web.url(r'/api/targets/([0-9]+)/transactions/hrt/?([0-9]+)?/?$', api_handlers.TransactionHrtHandler, name='transactions_hrt_api_url'), | ||
| tornado.web.url(r'/api/targets/([0-9]+)/poutput/?' + plugin_group_re + '/?' + plugin_type_re + '/?' + plugin_code_re + '/?$', api_handlers.PluginOutputHandler, name='poutput_api_url'), | ||
| tornado.web.url(r'/api/targets/([0-9]+)/poutput/names/?' + plugin_group_re + '/?' + plugin_type_re + '/?' + plugin_code_re + '/?$', api_handlers.PluginNameOutput, name='plugin_name_api_url'), | ||
| tornado.web.url(r'/api/targets/([0-9]+)/export/?$', api_handlers.ReportExportHandler, name='report_export_api_url'), |
There was a problem hiding this comment.
It might befar-fetched but wouldn't it be more modular to have the API set to /api/targets/{id}/export/{format} or /api/targets/{id}/export/{template} instead?
It would free the /api/targets/{id}/export/ API that could later default to a specific template/format, or be used to show a dialog box where the user can specify more options.
What do you think?
There was a problem hiding this comment.
@DePierre Actually, we have plans to separate the React UI from Tornado API server. Means Render everything using React rather than Jinja2. We will make backend as simple API server.
Now, as you suggested API should be something like this -> /api/targets/{id}/export/{template} , but backend server will not have any information about templates, as the export functionality is completely client side and only React or UI will have information about available templates.
Use of /api/targets/{id}/export/ is to just return data from DB, docxtemplater will do the rest. I mean we don't need to have seperate api for template, because the same API will work for all template.
There was a problem hiding this comment.
@saganshul Thanks for the additional info. I believe you already thought about it and it's fine by me. It was simply a remark.
Did you have time to consider the other review comments I made?
There was a problem hiding this comment.
yeah! Just doing that :)
| parsed_ip = IPAddress(ip) | ||
| return parsed_ip.is_private | ||
|
|
||
| def get_rank(user_rank, owtf_rank): |
There was a problem hiding this comment.
As such, it would be better to separate the get_rank function from the ranking scale. By splitting the two, the ranking scale could be used somewhere else in the framework, in case another feature needs to translate the rankings.
Also, get_rank should check that the index is defined in the dict and default to a default value if it is not. For instance, something like:
return RANKS.get(rank, 'Unknown')The string Unknown should be defined somewhere else, or Unranked could be used. You get the idea.
I would suggest specifying what is the return value in the docstring. Also, I am not sure to understand why the function should make the decision of which ranking to use. I believe that this should be done before calling it. Something like:
# From the API
rank_str(max(user_rank, owtf_rank))Finally, the name of the function could be a bit more specific in what it does with something like rank_human_readable, rank_str. From what I read, the function does not get anything, it simply translates a ranking from numerical to string value.
|
@DePierre @viyatb I updated the PR.
TODO:
|
|
@saganshul do you have any screenshot on how you choose a specific template? |
| OWTF_CRITICAL = 5 | ||
|
|
||
| # Maps `int` value of ranks with `string` value. | ||
| RANKS = {OWTF_UNRANKED: 'Unranked', OWTF_PASSING: 'Passing', OWTF_INFO: 'Informational', OWTF_LOW: 'Low', OWTF_MEDIUM: 'Medium', OWTF_HIGH: 'High', OWTF_CRITICAL: 'Critical'} |
There was a problem hiding this comment.
Maybe format this in pretty print format.
RANKS = {
OWTF_UNRANKED: 'Unranked',
OWTF_PASSING: 'Passing',
OWTF_INFO: 'Informational',
OWTF_LOW: 'Low',
OWTF_MEDIUM: 'Medium',
OWTF_HIGH: 'High',
OWTF_CRITICAL: 'Critical',
}

Uses docxtemplater to create export report functionality.
Description
Rest API - /api/targets/<target_id>/export/
Google DOC (Input Template)- https://docs.google.com/document/d/1lVWs0hV9SROAaqbReV0FJHfKphrv1dH7nS9K11Vi3X4/edit?usp=sharing
Related Issue
#695
Reviewers
@viyatb @7a
How Has This Been Tested?
Tested on Google Chrome browser.
Screenshots (if appropriate):
Types of changes