Check the content type before returning the json value.#2177
Check the content type before returning the json value.#2177fduxiao wants to merge 6 commits intoaio-libs:masterfrom fduxiao:content_type_check
Conversation
Codecov Report
@@ Coverage Diff @@
## master #2177 +/- ##
==========================================
- Coverage 97.2% 97.15% -0.05%
==========================================
Files 39 39
Lines 7945 7956 +11
Branches 1378 1382 +4
==========================================
+ Hits 7723 7730 +7
- Misses 98 100 +2
- Partials 124 126 +2
Continue to review full report at Codecov.
|
aiohttp/web_request.py
Outdated
| You can also handle it manually by passing the | ||
| `wrong_format` parameter. | ||
| """ | ||
| if if_check and self.content_type != 'application/json': |
There was a problem hiding this comment.
How would I use, say, jsonapi if it will never match this hardcoded content-type?
aiohttp/web_request.py
Outdated
| @asyncio.coroutine | ||
| def json(self, *, loads=json.loads): | ||
| """Return BODY as JSON.""" | ||
| def json(self, *, loads=json.loads, wrong_format=None, if_check=True): |
There was a problem hiding this comment.
I think signature is quite overloaded. I suggest just to allow to pass what exact content-type should be considered as valid. No match - throw an exception, nothing more. For BC, default value could be None until some release when we change it to 'application/json'. Thoughts?
aiohttp/web_request.py
Outdated
| wrong_format_flag = False | ||
| if isinstance(allowed_types, str): | ||
| wrong_format_flag = allowed_types != self.content_type | ||
| elif isinstance(allowed_types, list): |
There was a problem hiding this comment.
What if allowed_types be a set or tuple or frosenset or whatever container?
There was a problem hiding this comment.
So will iterable make sense? I mean how about hasattr(alowed_types, '__iter__')?
There was a problem hiding this comment.
May be do like it was don't in ClientResponse.json() ? It's quite abstract from what kind of type was passed and the behaviour would be consistent between client and server.
aiohttp/web_request.py
Outdated
| if if_check and self.content_type != 'application/json': | ||
| if wrong_format is None: | ||
| wrong_format_flag = False | ||
| if isinstance(allowed_types, str): |
There was a problem hiding this comment.
What if I receive application/json in legacy form when it will have a charset option defined explicitly like application/json;charset=utf-8?
There was a problem hiding this comment.
How about just checking the type, subtype and suffix? And should json be decoded according to the charset?
There was a problem hiding this comment.
I suggest just supporting a container (list, tuple, whatever) only and raising explicit TypeError on single string.
By default it should be None for backward compatibility. That's sad but we cannot change it.
Application might have a parameter for enabling checking json types in the application by default though.
There was a problem hiding this comment.
How about just checking the type, subtype and suffix?
That's correct, but will make things too much complicated.
And should json be decoded according to the charset?
In modern RFC's it's mandatory should be UTF-8. However, there are lot of legacy around which forces charset via params. I don't thing we should care about non-UTF8 JSON, but if client asks about it explicitly, there is no reason to deny him, imho.
I suggest just supporting a container (list, tuple, whatever) only and raising explicit TypeError on single string.
It's good idea, but it conflicts with how client's json() acts. I don't think we should have different behaviour of a similar things - that would annoy people all around.
There was a problem hiding this comment.
So I think we can use either Container or str, and None is for wildcard (or should I use a '*' as in add_route, since allowed_types=None looks weird for wildcard); otherwise a TypeError is raised
There was a problem hiding this comment.
I don't like global switches. You have argument and it's default. None disables the check. For awhile it will be default to preserve BC. After some release it becomes application/json in defaults and may be we could prohibit None usage, but that's another story.
There was a problem hiding this comment.
Could I suggest a function(mapping a string to boolean) be passed as an argument to filter the Content-Type and str, None are also accepted and nothing more is supported?
There was a problem hiding this comment.
I would recommend to avoid callback style. This is the road to make aiohttp acts weird depending on what functions user passed around. There is pretty simple rules to find out if MIME type is json-like. In the rest cases user could override that behavior by explicit string value.
There was a problem hiding this comment.
So that means the user won't want to behavior weirdly? Then I just need to check whether the type is str or tuple or list or set and compare them directly, or that it is something that means json-like or wildcard. Also I need to derive the charset from Content-type and pass it to loads.
There was a problem hiding this comment.
Why he should? .json() provides stable interface and guarantee about what it does. User may customize default expected content-type via string argument. Why there is need to do more?
If user need to do more complex content type check, it should make it at middleware level, not in .json() call.
aiohttp/web_request.py
Outdated
| raise HTTPNotAcceptable() | ||
| elif callable(wrong_format): | ||
| yield from wrong_format() | ||
| elif callable(err_handler): |
There was a problem hiding this comment.
You expect callable, but below err_handler is a coroutine actually.
There was a problem hiding this comment.
I can only use module inspect to decide whether an object is a coroutine. Should I decide whether it is a coroutine or not or even something else(says a string) and make calling correspondingly. I think more details should be given.
There was a problem hiding this comment.
I would suggest to not pass exc_handler at all. No argument - no problems (:
aiohttp/web_request.py
Outdated
| elif callable(wrong_format): | ||
| yield from wrong_format() | ||
| elif callable(err_handler): | ||
| yield from err_handler() |
There was a problem hiding this comment.
Why to pass here err_handler after all? You can wrap req.json() call with try/except and do the rest on user code side.
There was a problem hiding this comment.
I meant to have reduced the use of try/except so one could get rid of the fact that if you want to handle the error yourself, you need the try/except block i.e. one doesn't need to raise the error, catch it and handle it since catch an error is costly.
There was a problem hiding this comment.
No need to worry about try/except usage. If you do it, you explicitly sign about that you're aware about that things here could go wrong. Some custom argument that does the same makes it not so oblivious.
As about speed, you shouldn't worry about catch performance. I bet, your app business logic will be million times slower than that.
aiohttp/web_request.py
Outdated
| then the request is rejected. | ||
| You can also handle it manually by passing the | ||
| `wrong_format` parameter. | ||
| You can also handle it manually by passing the `err_handler` parameter. |
There was a problem hiding this comment.
I think there is need to note what will happens if content-type won't match.
aiohttp/web_request.py
Outdated
| """ | ||
| wrong_format_flag = False | ||
| if isinstance(allowed_types, str): | ||
| wrong_format_flag = allowed_types != self.content_type |
There was a problem hiding this comment.
The self.content_type may be application/json; charset=UTF-8. Am I right ? So, use startswith or handle that in a different way.
There was a problem hiding this comment.
Yes. So how do you think of these codes:
content_type = self.content_type
(mime_type, subtype, suffix, parameters) = parse_mimetype(content_type)
content_type = mime_type + '/' + subtype
if suffix != '':
content_type += '+' + suffix
charset = parameters.get('charset', None)There was a problem hiding this comment.
seems OK, I want to review this part in context of whole path.
aiohttp/web_request.py
Outdated
| wrong_format_flag = False | ||
| if isinstance(allowed_types, str): | ||
| wrong_format_flag = allowed_types != self.content_type | ||
| elif isinstance(allowed_types, list): |
There was a problem hiding this comment.
what about tuple and set ?
There was a problem hiding this comment.
Maybe all Containers should be supported, says
if isinstance(allowed_types, str):
wrong_format_flag = allowed_types != self.content_type
elif isinstance(allowed_types, collections.Container):
wrong_format_flag = self.content_type not in allowed_types
elif allowed_types is None: # do not check the content_type
wrong_format_flag = False
else: # for other type
raise TypeError("Unsupported container of `Content-Type`")There was a problem hiding this comment.
-1 for any containers, though. Even list.
aiohttp/web_request.py
Outdated
| @asyncio.coroutine | ||
| def json(self, *, loads=json.loads): | ||
| """Return BODY as JSON.""" | ||
| def json(self, *, loads=json.loads, allowed_types=None, err_handler=None): |
There was a problem hiding this comment.
I vote for allowed_types='application/json'
There was a problem hiding this comment.
That would be breaking change. See previous discussions.
There was a problem hiding this comment.
Shall we directly check if the mime type starts with 'application/' and ends with '+json' i.e. flask's way to deal with it? Hence no need to consider about how should allowed_type behavior
There was a problem hiding this comment.
If we stay with string only argument as @socketpair suggested, that's would be friendly default behaviour with almost no cost.
aiohttp/web_request.py
Outdated
| if err_handler is None: | ||
| raise HTTPNotAcceptable() | ||
| elif callable(err_handler): | ||
| yield from err_handler() |
There was a problem hiding this comment.
- Please do not check if it is callable. Instead, check return value of call. if it is awaitable -- await it. (see signal handlers in aiohttp)
- In current code, with
err_handler=42-- error will be just ignored, which is a bug. - errors must not be handled as callbacks or so. There are exceptions for that.
There was a problem hiding this comment.
As we discussed above, there's no need to give callbacks. Instead, only the 406 error is raised.
aiohttp/web_request.py
Outdated
| charset = parameters.get('charset', None) | ||
|
|
||
| if isinstance(allowed_types, str): | ||
| if allowed_types == 'json': |
There was a problem hiding this comment.
Useless. Users must supply full contenttype like xxxx/yyyyy
There was a problem hiding this comment.
This means 'json-like'. View my code after that. I will then check whether the Content-Type is 'application/json' or starts with 'application' and ends with '+json'. If it is a string and is not 'json', I will compare it with Content-Type directly. This may be convenient i.e. one won't need to always specify the argument for different json format
aiohttp/web_request.py
Outdated
| raise HTTPNotAcceptable() | ||
| elif allowed_types != content_type: | ||
| raise HTTPNotAcceptable() | ||
| elif isinstance(allowed_types, set): |
There was a problem hiding this comment.
-2 for containers. Why they are intended for ? Which user-story (case) do you suggest ?
There was a problem hiding this comment.
Previously, we discussed what if one sends 'application/json; charset=utf-8' and 'application/json' is hard-coded and now I strip the charset. My concern is that will there be something else the frontend will send fail to pass the examination, says the handler need to handle both application/json and application/javascript or some uncontrollable i.e. for compatibility of an old frontend framework.
There was a problem hiding this comment.
We should keep code of aiohttp simple and clear. Simple thing should be done in a simple way, and difficult ones should be possible.
If an specific application needs to handle such very obscure situation, it may control content-type by it's own before decoding of json. And also, I think, situation you have provided is just an imagination, not the real one. If no, please describe deeper.
aiohttp/web_request.py
Outdated
|
|
||
| body = yield from self.text() | ||
| return loads(body) | ||
| return loads(body, encoding=charset) |
There was a problem hiding this comment.
The other arguments have the same meaning as in load(), except encoding which is ignored and deprecated.
There was a problem hiding this comment.
Charset should go as additional argument of the function. And should be passed to .text() if defined.
There was a problem hiding this comment.
What if the frontend sends a request specifying charset? And I think a json object should be parsed with loads, with encoding supported by python itself. Should .text() decode something like "\\u4e16\\u754c\\u4f60\\u597d"?
There was a problem hiding this comment.
PLEASE look at the aiohttp sources before asking and before changing the code. .text() will analyze conten-type header (regarding charset) before converting bytes to text.
There was a problem hiding this comment.
JSON-decoding is two-step process. First, it needs to convert from bytes to characters (symbols). Thi is done by UTF-8 (typically) decoder in .text(). Second-step -- is to decode slash-escapes according to JSON-format. this is done using loads(). So, encoding you are talking about is for first step only. As you can see, letter u after every slash signifies that every item is UNICODE symbol (as opposed to some byte value). And so encoding is not applicable.
There was a problem hiding this comment.
Yes. I did a search and found it so I removed these codes.
socketpair
left a comment
There was a problem hiding this comment.
- Please remove containers
- Make Support of either None or full content type
- Remove encoding from loads
aiohttp/web_request.py
Outdated
| """ Return BODY as JSON. If the `Content-Type` is not in allowed_types, | ||
| then the request is rejected. | ||
| """ | ||
| content_type = self.content_type |
There was a problem hiding this comment.
it's not required to parse self.content_type if allowed_type is None
aiohttp/web_request.py
Outdated
| if suffix != '': | ||
| content_type += '+' + suffix | ||
|
|
||
| if isinstance(allowed_type, str): |
There was a problem hiding this comment.
Much simplier:
if allowed_type is not None:
parse content type
if parsed_ct != allowed_type:
raise ...
without any else
aiohttp/web_request.py
Outdated
| def json(self, *, loads=json.loads): | ||
| """Return BODY as JSON.""" | ||
| def json(self, *, loads=json.loads, allowed_type=None): | ||
| """ Return BODY as JSON. If the `Content-Type` is not in allowed_types, |
There was a problem hiding this comment.
is not in is wrong now.
| allowed_type, then the request is rejected. | ||
| """ | ||
|
|
||
| if allowed_type is None: # do not check the content_type |
There was a problem hiding this comment.
I vote for replacing this chain with only one condition like:
if allowed_type is not None:
...| content_type += '+' + suffix | ||
|
|
||
| if allowed_type != content_type: | ||
| raise HTTPNotAcceptable() |
There was a problem hiding this comment.
So, what do you guys think now? Should the message be passed as an argument? I think the error should be raised directly since that's exactly what I want for a JSON API in most cases and I think we've reached the agreement that for special requirements one should write a middleware or catch the exception as we discussed before.
There was a problem hiding this comment.
What would be wrong with just
def json(self, *, loads=json.loads, allowed_type='application/json'):
if allowed_type and self.content_type != allow_type:
raise HTTPNotAcceptable(text='invalid content-type: "{}" not "{}"'.format(self.content_type, allowed_type))
...?
If users really want more sophisticated behaviour than this they can add their own logic, there's nothing very complicated here that couldn't be replicated in a custom coroutine.
|
also is from wikipedia:
That doesn't sound quite right. |
|
Fixed by another PR #3889 |
What do these changes do?
Check the content type before returning the json value.
Are there changes in behavior for the user?
When getting json requests one could directly
await request.json().If one'd like to ignore the
Content-Type, he/she should specifyif_checkmanually i.e.await request.json(if_check=False).Related issue number
referencing #2174
Checklist
CONTRIBUTORS.txtchangesfolder2174.bugfixfor example (588.bug)issue_idchange it to the pr id after creating the pr.feature: Signifying a new feature..bugfix: Signifying a bug fix..doc: Signifying a documentation improvement..removal: Signifying a deprecation or removal of public API..misc: A ticket has been closed, but it is not of interest to users.