Skip to content

Adds export report funtionality#870

Merged
viyatb merged 4 commits intoowtf:developfrom
saganshul:export_report_docx
Aug 9, 2017
Merged

Adds export report funtionality#870
viyatb merged 4 commits intoowtf:developfrom
saganshul:export_report_docx

Conversation

@saganshul
Copy link
Copy Markdown
Contributor

@saganshul saganshul commented Aug 4, 2017

Uses docxtemplater to create export report functionality.

Description

Rest API - /api/targets/<target_id>/export/
Google DOC (Input Template)- https://docs.google.com/document/d/1lVWs0hV9SROAaqbReV0FJHfKphrv1dH7nS9K11Vi3X4/edit?usp=sharing

Related Issue

#695

Reviewers

@viyatb @7a

How Has This Been Tested?

Tested on Google Chrome browser.

Screenshots (if appropriate):

screenshot from 2017-08-04 19-52-54
screenshot from 2017-08-04 19-53-05

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)
  • Other

@saganshul saganshul changed the title Adds exports report funtionality Adds export report funtionality Aug 4, 2017
Copy link
Copy Markdown
Contributor

@DePierre DePierre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few suggestions after a first read-through. I didn't review the UI part.

Comment thread framework/interface/api_handlers.py Outdated
if not target_id:
raise tornado.web.HTTPError(400)
try:
filter_data = dict(self.request.arguments) # IMPORTANT!!
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Useless comment. It would be much more helpful to specify why it is important.

except exceptions.InvalidErrorReference:
raise tornado.web.HTTPError(400)

class ReportExportHandler(custom_handlers.APIRequestHandler):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A docstring giving an overview of the API would be nice. For instance, it would be interesting to know what formats of report are supported, etc.

Comment thread framework/interface/api_handlers.py Outdated
grouped_plugin_outputs = collections.OrderedDict(sorted(grouped_plugin_outputs.items()))

# Get mappings
if self.get_argument("mapping", None):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be shortened:

mappings = self.get_argument("mapping", None)
if mappings:
    mappings = self.get_component("mapping_db").GetMappings(mappings)

Comment thread framework/interface/api_handlers.py Outdated
for test_group in self.get_component("db_plugin").GetAllTestGroups():
test_group["mapped_code"] = test_group["code"]
test_group["mapped_descrip"] = test_group["descrip"]
if mappings:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be shortened:

if mappings and test_group['code'] in mappings:
    test_group["mapped_code"] = mappings[test_group['code']][0]
    test_group["mapped_descrip"] = mappings[test_group['code']][1]

Also, it would be more readable to name [0] and [1], for instance:

code, description = mappings[test_group['code']]
test_group["mapped_code"] = code
test_group["mapped_descrip"] = description

Comment thread framework/interface/api_handlers.py Outdated
test_groups[test_group['code']] = test_group

vulnerabilities = []
for key, value in grouped_plugin_outputs.iteritems():
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a note: iteritems does not exist in python 3.x. If OWTF is planing on supporting python3.x, it would be time to stop adding more incompatible code. Would items be enough? Or will it add to much overhead in python2.x?

Comment thread framework/interface/api_handlers.py Outdated

vulnerabilities = []
for key, value in grouped_plugin_outputs.iteritems():
obj = test_groups[key]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a need for the temporary variable (for creating a copy and not a reference)? Or would the following work as well?

for key, value in grouped_plugin_outputs.items():
    test_groups[key]['data'] = value
    vulnerabilities.append(test_groups[key])

Also in your current version, are we sure that test_groups[key] exists?

Comment thread framework/utils.py Outdated
parsed_ip = IPAddress(ip)
return parsed_ip.is_private

def get_rank(user_rank, owtf_rank):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure that such function is not already available? Either in OWTF or even in PTP?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No there is not such a function exists.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As such, it would be better to separate the get_rank function from the ranking scale. By splitting the two, the ranking scale could be used somewhere else in the framework, in case another feature needs to translate the rankings.

Also, get_rank should check that the index is defined in the dict and default to a default value if it is not. For instance, something like:

return RANKS.get(rank, 'Unknown')

The string Unknown should be defined somewhere else, or Unranked could be used. You get the idea.

I would suggest specifying what is the return value in the docstring. Also, I am not sure to understand why the function should make the decision of which ranking to use. I believe that this should be done before calling it. Something like:

# From the API
rank_str(max(user_rank, owtf_rank))

Finally, the name of the function could be a bit more specific in what it does with something like rank_human_readable, rank_str. From what I read, the function does not get anything, it simply translates a ranking from numerical to string value.

Copy link
Copy Markdown
Contributor

@DePierre DePierre Aug 5, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now I am realizing that we don't need such function. Instead, all we need is the dict translating the numerical ranking to a string.

The get_rank function can be removed and instead we could do the following:

# In some util file
from ptp.libptp.constants import INFO, LOW, MEDIUM, HIGH
OWTF_UNRANKED = -1
OWTF_PASSING = 0
OWTF_CRITICAL = 5
RANKS = {
    OWTF_UNRANKED: 'Unranked',
    OWTF_PASSING: 'Passing',
    INFO: 'Informational',
    LOW: 'Low',
    MEDIUM: 'Medium',
    HIGH: 'High',
    OWTF_CRITICAL: 'Critical'
}

# In the API
poutput["rank"] = RANKS.get(max(user_rank, owtf_rank), RANKS[OWTF_UNRANKED])

Of course this could be polished a bit more and the string translated value could also be defined as constants, etc. It is just a draft to show the alternative I am seeing.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DePierre Can I get all INFO, LOW, MEDIUM, HIGH, UNRANKED, PASSING, CRITICAL variable from ptp.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@saganshul ptp does not have CRITICAL by design, since it deemed that only user could have the context to rank an issue as CRITICAL.

PASSING does not exist, since ptp does not evaluate if a report is passing or not. However, UNKNOWN is used when no ranking could be found. It should be noted though that UNKNOWN is 0, while OWTF uses 0 for PASSING.

See https://github.com/owtf/ptp/blob/develop/ptp/libptp/constants.py

@DePierre
Copy link
Copy Markdown
Contributor

DePierre commented Aug 4, 2017

@saganshul Thank you very much for implementing that very useful feature!

One thing that bothers me a little bit is to understand how easy/painful it would be to support other templates. In your opinion, do your changes allow supporting multiple report templates? If not, wouldn't it be better to spend a little more time on the feature and include that thought in the code?

I would personally imagine a configuration file defining the required information in a report (vuln title, description, ranking, etc.). Then I would imagine OWTF supporting at least 2 or 3 templates (one with highest rating first for instance, one ordered by plugin groups, etc.). And finally I would imagine having the ability as a user to select among the different templates.

@viyatb
Copy link
Copy Markdown
Member

viyatb commented Aug 4, 2017

@DePierre Anshul and I thought about creating a repository of community supported templates so that users can have the choice as many as variations they want. The configuration file would not be necessary since the user can trivially move around the parts in the DOCX/ODT file.

@DePierre
Copy link
Copy Markdown
Contributor

DePierre commented Aug 4, 2017

@viyatb How will OWTF fill the fields in the report templates? Is there a well-defined list of elements that are required, optional, and so on for OWTF templates?

I am 100% for that community repository writing the multiple templates. It's really an awesome idea! But then it would require very strong documentations about what can/should/may/might be included in the template.

Going a bit further, it might be required to implement the report generation using a plugin system, since it will have to deal with potentially 10s of different templates. As far as I can read in this PR, everything is hardcoded for one template. This is a great PoC but not something that is maintainable and can be improved easily for multiple formats.

I am not trying to diminish any work already done here. However more work should be put in the design of the feature to be both easily maintainable and improved. It should also be properly documented, which doesn't seem to be the case in this PR (regarding what fields are available, required, optional, etc.).

@viyatb
Copy link
Copy Markdown
Member

viyatb commented Aug 4, 2017

@DePierre yeah, documentation :sighs: lol. All your suggestions are on point and is going to be done in iterative commits. 👍

@viyatb
Copy link
Copy Markdown
Member

viyatb commented Aug 4, 2017

@saganshul the JS part LGTM.

@saganshul
Copy link
Copy Markdown
Contributor Author

@viyatb @DePierre PR updated. Please review changes :)

@saganshul
Copy link
Copy Markdown
Contributor Author

saganshul commented Aug 5, 2017

I changed commit message because currently it is very basic implementation.
TODO:

  • Add multi template support(Using community supported templates)(that would be very easy)
  • Support other formats like PDF, HTML etc.

@DePierre The same API will work for all templates. API throws all the information it has in OWTF DB about scan. So we just need to write template. Same API works for all template. Also multiple formats can be supported since what we can convert docx to different formats.
I am not getting what do you mean by this everything is hardcoded for one template. There is nothing hard coded all data is coming from API (Template only needs to know keys and JSON structure). Please correct me if I am wrong somewhere or if you are expecting something different.

@DePierre
Copy link
Copy Markdown
Contributor

DePierre commented Aug 5, 2017

@saganshul Sorry maybe I was not clear in my previous comment.

Regarding the multiple templates, I meant that currently, only one .docx template is supported (the one you linked) and that currently, the PR does not easily allow using another one. For instance, there could be a second template that sorts the findings by plugin type (web active first, web semi-passive second, web passive third, etc.). In that case, it will require some changes in the feature to easily support that second template. Maybe I am missing something that I am not seeing.

Regarding the everything is hardcoded for one template, you just mentioned the issue: Template only needs to know keys and JSON structure. In other, if I want to add the duration time for each plugin, I will have to modify the feature to add one additional key. Or is it exposed? Once again, I am maybe missing something that I am not seeing.

This last remark is also related to the lack of documentation. What does OWTF offer in terms of data? You say API throws all the information it has in OWTF DB about scan. In the PR, I see that it offers the vulnerability title, its data, etc. But I only understand that by reading the source code or know the inner working of its DB. For instance, with your PR, is it possible to know if a ranking is coming from OWTF or the user? I would believe that we would need to clearly define all the data that one would need in a report and expose all of them. Then, each template would have to follow that documentation and pick the data it needs.

Does it make more sense?

Copy link
Copy Markdown
Contributor

@DePierre DePierre left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a few more comments that, I believe, could improve the quality of the code.

Comment thread framework/interface/api_handlers.py Outdated
filter_data = dict(self.request.arguments)
plugin_outputs = self.get_component("plugin_output").GetAll(filter_data, target_id=target_id, inc_output=True)
# Group the plugin outputs to make it easier in template
grouped_plugin_outputs = {}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe that this could be shortened with:

grouped_plugin_outputs = {poutput['plugin_code']: [] for poutput in plugin_outputs}

Copy link
Copy Markdown
Contributor Author

@saganshul saganshul Aug 8, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for poutput in plugin_outputs:
    if grouped_plugin_outputs.get(poutput['plugin_code']) is None:
        grouped_plugin_outputs[poutput['plugin_code']] = []

    poutput["rank"] = get_rank(poutput["user_rank"], poutput["owtf_rank"])
    grouped_plugin_outputs[poutput['plugin_code']].append(poutput)

This total thing can't be replaced with

grouped_plugin_outputs = {poutput['plugin_code']: [] for poutput in plugin_outputs}

This thing only does this part:

for poutput in plugin_outputs:
    if grouped_plugin_outputs.get(poutput['plugin_code']) is None:
        grouped_plugin_outputs[poutput['plugin_code']] = []

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@saganshul I understand, I misread the indentation and was only considering the first three lines.

Would the following structure be better in your opinion?

from collections import defaultdict
 
grouped_plugin_outputs = defaultdict(list)
for output in plugin_outputs:
    output['rank'] = RANKS.get(max(output['user_rank'], output['owtf_rank']))
    grouped_plugin_outputs[output['plugin_code']].append(output)

else:
raise tornado.web.HTTPError(400)

except exceptions.InvalidTargetReference as e:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we reduce the scope of the try/except statement? It is usually recommended to have the smallest scope for try/except.

Comment thread framework/interface/api_handlers.py Outdated
grouped_plugin_outputs = {}
for poutput in plugin_outputs:
if grouped_plugin_outputs.get(poutput['plugin_code']) is None:
# No problem of overwriting
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure to understand the the point of this comment. What is overridden? Maybe I am not understanding the code snippet but I believe that the line below this comment does not override anything.

Comment thread framework/interface/api_handlers.py Outdated
"""
Class handling APIs related to export report funtionality.
In OWTF, export report funtionality is implemented using docxtemplater.
docxtemplater takes two inputs:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't believe that this belongs to the docstring of the API. As far as I can see, docxtemplater is only being used in a JS file that should be abstracted from this API. It does not and should not be a concern for the API whether docxtemplater or .docx format is used or not.

The docstring should focus on what the API does. As far as I can tell, it fetches data from the DB to prepare some JSON structure that contain information about a target, plugins run, rankings, etc. What information does it fetch? What does it do with it? What return value does it return?

I would prefer to have a docstring that explains what it does rather than a docstring that explains what another piece of code somewhere else can do with the data it created.

tornado.web.url(r'/api/targets/([0-9]+)/transactions/hrt/?([0-9]+)?/?$', api_handlers.TransactionHrtHandler, name='transactions_hrt_api_url'),
tornado.web.url(r'/api/targets/([0-9]+)/poutput/?' + plugin_group_re + '/?' + plugin_type_re + '/?' + plugin_code_re + '/?$', api_handlers.PluginOutputHandler, name='poutput_api_url'),
tornado.web.url(r'/api/targets/([0-9]+)/poutput/names/?' + plugin_group_re + '/?' + plugin_type_re + '/?' + plugin_code_re + '/?$', api_handlers.PluginNameOutput, name='plugin_name_api_url'),
tornado.web.url(r'/api/targets/([0-9]+)/export/?$', api_handlers.ReportExportHandler, name='report_export_api_url'),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It might befar-fetched but wouldn't it be more modular to have the API set to /api/targets/{id}/export/{format} or /api/targets/{id}/export/{template} instead?

It would free the /api/targets/{id}/export/ API that could later default to a specific template/format, or be used to show a dialog box where the user can specify more options.

What do you think?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DePierre Actually, we have plans to separate the React UI from Tornado API server. Means Render everything using React rather than Jinja2. We will make backend as simple API server.

Now, as you suggested API should be something like this -> /api/targets/{id}/export/{template} , but backend server will not have any information about templates, as the export functionality is completely client side and only React or UI will have information about available templates.

Use of /api/targets/{id}/export/ is to just return data from DB, docxtemplater will do the rest. I mean we don't need to have seperate api for template, because the same API will work for all template.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@saganshul Thanks for the additional info. I believe you already thought about it and it's fine by me. It was simply a remark.

Did you have time to consider the other review comments I made?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah! Just doing that :)

Comment thread framework/utils.py Outdated
parsed_ip = IPAddress(ip)
return parsed_ip.is_private

def get_rank(user_rank, owtf_rank):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As such, it would be better to separate the get_rank function from the ranking scale. By splitting the two, the ranking scale could be used somewhere else in the framework, in case another feature needs to translate the rankings.

Also, get_rank should check that the index is defined in the dict and default to a default value if it is not. For instance, something like:

return RANKS.get(rank, 'Unknown')

The string Unknown should be defined somewhere else, or Unranked could be used. You get the idea.

I would suggest specifying what is the return value in the docstring. Also, I am not sure to understand why the function should make the decision of which ranking to use. I believe that this should be done before calling it. Something like:

# From the API
rank_str(max(user_rank, owtf_rank))

Finally, the name of the function could be a bit more specific in what it does with something like rank_human_readable, rank_str. From what I read, the function does not get anything, it simply translates a ranking from numerical to string value.

@saganshul
Copy link
Copy Markdown
Contributor Author

saganshul commented Aug 8, 2017

@DePierre @viyatb I updated the PR.

TODO:

  • API documentation.

@viyatb
Copy link
Copy Markdown
Member

viyatb commented Aug 8, 2017

@saganshul do you have any screenshot on how you choose a specific template?

Comment thread framework/constants.py Outdated
OWTF_CRITICAL = 5

# Maps `int` value of ranks with `string` value.
RANKS = {OWTF_UNRANKED: 'Unranked', OWTF_PASSING: 'Passing', OWTF_INFO: 'Informational', OWTF_LOW: 'Low', OWTF_MEDIUM: 'Medium', OWTF_HIGH: 'High', OWTF_CRITICAL: 'Critical'}
Copy link
Copy Markdown
Member

@viyatb viyatb Aug 8, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe format this in pretty print format.

RANKS = {
    OWTF_UNRANKED: 'Unranked',
    OWTF_PASSING: 'Passing',
    OWTF_INFO: 'Informational',
    OWTF_LOW: 'Low',
    OWTF_MEDIUM: 'Medium',
    OWTF_HIGH: 'High',
    OWTF_CRITICAL: 'Critical',
}

@saganshul
Copy link
Copy Markdown
Contributor Author

@viyatb
screenshot from 2017-08-09 13-12-31

Copy link
Copy Markdown
Member

@viyatb viyatb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@viyatb viyatb merged commit 8d176c7 into owtf:develop Aug 9, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants