Python add support for utf-8 error messages#16946
Conversation
|
|
|
mehrdada
left a comment
There was a problem hiding this comment.
This is an area we really need tests for; ideally ones that failed previously.
|
The unit test I added will fail if my code are removed. And about the decoding part, I tried to decode to unicode object instead of passing the byte string to the user. It seems more consistent with previous logic. |
|
You are right that I can't reproduce this client-side error... That's why I renamed the title to server-side. |
|
|
|
|
|
|
|
@mehrdada Can you review the code? I added the unit test for UTF-8 error message, which will fail both the client side and server side without updated |
|
Please also add a test for invalid |
src/python/grpcio/grpc/_common.py
Outdated
|
|
||
| def decode(b): | ||
| if isinstance(b, str): | ||
| def decode(b): #pylint: disable=inconsistent-return-statements |
There was a problem hiding this comment.
maybe smth simple like this?
if isinstance(b, bytes):
return b.decode('utf-8', 'replace')
return b
similar for def encode, just always encoding it as utf-8 should make a deal
There was a problem hiding this comment.
If possible, I want to keep the message content. The replace will get rid of character that utf-8 can't understand. It will fail some cases, if people transmit pure binary data.
There was a problem hiding this comment.
gRPC HTTP/2 spec explicitly states that message is an utf-8 message, so pure binary data is not an acceptable input (and therefore it should be sanitized, but not honored).
Please see https://github.com/grpc/grpc/blob/master/doc/PROTOCOL-HTTP2.md#responses for more details.
It will be also consistent to grpc-go behavior, which currently sanitizes status messages by replacing invalid bytes sequences with the Unicode replacement character.
There was a problem hiding this comment.
Please review the latest change. I have adopted your advices and added test cases.
|
@euroelessar Thanks for your suggestion. Since we don't need to worry about other codec, the logic is much simpler. And I added your test case to the unit test. |
|
|
@lidizheng, if this is going to affect metadata encoding/decoding, we should keep in mind that binary encoded metadata needs to be support based on this proposal https://github.com/grpc/proposal/blob/master/G1-true-binary-metadata.md. |
|
@srini100
As for the metadata in your concern, it is built in Cython layer which have another set of And if we want to proceed to the unicode metadata... I guess we can open another issue for it. |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
@nicolasnoble Can you take a look at my latest changes? I added interop tests and they are passed in CI. But other tests that I didn't touch are failing. I tried to de-flake them by force-running multiple times, but they still fail... |
* Both server and client should be fine with utf-8 error messages now * Adding an interop test: special status message
|
|
|
|
#16102