Skip to content

Conversation

@laanwj
Copy link
Member

@laanwj laanwj commented Apr 16, 2016

(the univalue patch should be taken upstream - the RPC test and doc change should end up here. Posting it here for visibility and because this fixes an ancient issue: #2127. I think this is important to many non-English users)

This adds full UTF-8 support for both on input and output through JSON.

bitcoin-cli usage

Before:

$ src/bitcoin-cli -datadir=/store/tmp/testbtc getnewaddress "рыба"
1HQCE3H87fZ1er1ExLiF5V4a1Kxf46r3J2
$ src/bitcoin-cli -datadir=/store/tmp/testbtc listaccounts
{
...  "\u00c3\u0083\u00c2\u0091\u00c3\u0082\u00c2\u0080\u00c3\u0083\u00c2\u0091\u00c3\u0082\u00c2\u008b\u00c3\u0083\u00c2\u0090\u00c3\u0082\u00c2\u00b1\u00c3\u0083\u00c2\u0090\u00c3\u0082\u00c2\u00b0": 0.00000000
}
$ src/bitcoin-cli -datadir=/store/tmp/testbtc getaccount 1HQCE3H87fZ1er1ExLiF5V4a1Kxf46r3J2
�����±�°

After:

$ src/bitcoin-cli -regtest getaccountaddress "рыба"
mrVZ8GaURJmL3LEUu6ygU2CrUrZ8979iK2
$ src/bitcoin-cli -regtest listaccounts
{
...
  "рыба": 0.00000000
}
$ src/bitcoin-cli -regtest getaccount mrVZ8GaURJmL3LEUu6ygU2CrUrZ8979iK2                                       
рыба

GUI

This also affects the GUI debug console.

Before:

i18n_before

When strings are passed directly (such as for getaccount's return argument) it works fine, but when they go through JSON formatting/parsing, it fails.

After:

i18n_after

RPC test

A test for this new functionality has been added to the wallet.py test.

See the commit message of the first commit for details on what exactly had to be changed in univalue.

@jonasschnelli
Copy link
Contributor

Nice!
Concept ACK.

Not sure how we should treat the UniValue changes. Should we try to keep the "upstream" repository bitcoin/univalue in sync?

@maflcko
Copy link
Member

maflcko commented Apr 16, 2016

Concept ACK. Tests look good.

"upstream" repository bitcoin/univalue in sync

The patch goes to jgarzik/univalue as bitcoin/univalue is only for bitcoin related patches. (Still in the end we need to merge jgarzik/univalue into bitcoin/univalue because we use this as subtree.)

@laanwj laanwj force-pushed the 2016_04_i18n_unicode_rpc branch from 7a7afa2 to 09cce63 Compare April 18, 2016 13:30
@laanwj laanwj force-pushed the 2016_04_i18n_unicode_rpc branch from 09cce63 to 0ef6abf Compare April 28, 2016 13:18
@gmaxwell
Copy link
Contributor

Neat. What happens with characters like unicode direction modifiers? Can you end up with your terminal in a weird state?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs rebase.

@laanwj
Copy link
Member Author

laanwj commented May 18, 2016

Neat. What happens with characters like unicode direction modifiers? Can you end up with your terminal in a weird state?

I don't think that's possible with just unicode?

ANSI escape sequences are a whole different story. In JSON objects they'll be escaped, but when printing strings (say, for help, when there is a single string result) everything is let through. Not sure whether this should be changed, but this would be a change local to bitcoin-cli, not the unicode framework. I'd suggest doing so in a separate pull.

Nothing new here though: It's always been possible to pass any kind of crap characters from the server (just look at my OP) to bitcoin-cli, this just makes the round-trip sane and conform to UTF-8.

@dcousens
Copy link
Contributor

dcousens commented May 19, 2016

concept ACK, once-over utACK (did not cross-verify UTF-8 impl.).

Could probably use more tests for that JSONUTF8Writer...

@laanwj
Copy link
Member Author

laanwj commented May 20, 2016

Could probably use more tests for that JSONUTF8Writer...
ing

There are few direct tests for it, but all the other tests that parses JSON with strings in it is testing it.
What would be nice is doing some fuzzing using https://github.com/laanwj/univalue/tree/2015_11_unifuzz and see that everything is accepted is valid utf-8.

@gmaxwell
Copy link
Contributor

@pstratem Any interest in fuzzing this?

@sipa
Copy link
Member

sipa commented May 25, 2016

Needs rebase.

@laanwj laanwj force-pushed the 2016_04_i18n_unicode_rpc branch from 0ef6abf to 0912569 Compare June 9, 2016 06:21
@laanwj
Copy link
Member Author

laanwj commented Jun 9, 2016

Rebased, done and removed the Python3 TODOs

laanwj added 5 commits June 10, 2016 15:19
f32df99 Merge branch '2016_04_unicode' into bitcoin
280b191 Merge remote-tracking branch 'jgarzik/master' into bitcoin
c9a716c Handle UTF-8
bed8dd9 Version 1.0.2.
5e7985a Merge pull request bitcoin#14 from laanwj/2015_11_escape_plan

git-subtree-dir: src/univalue
git-subtree-split: f32df99e96d99ab49e5eeda16cac93747d388245
Add a setting ensure_ascii to AuthServiceProxy. This setting,
defaulting to True (backwards compatible),
is passed through to json.dumps. If set to False, non-ASCII characters
>0x80 are not escaped. This is useful for testing server
input processing, as well as slightly more bandwidth friendly in case of
heavy unicode usage.
@laanwj laanwj force-pushed the 2016_04_i18n_unicode_rpc branch from 0912569 to 7982fce Compare June 10, 2016 13:22
@laanwj
Copy link
Member Author

laanwj commented Jun 10, 2016

Ok:

This should be ready now.

@gmaxwell
Copy link
Contributor

ACK

@laanwj laanwj merged commit 7982fce into bitcoin:master Jun 16, 2016
laanwj added a commit that referenced this pull request Jun 16, 2016
7982fce doc: Mention full UTF-8 support in release notes (Wladimir J. van der Laan)
6bbb4ef test: test utf-8 for labels in wallet (Wladimir J. van der Laan)
a406fcb test: add ensure_ascii setting to AuthServiceProxy (Wladimir J. van der Laan)
60ab9b2 Squashed 'src/univalue/' changes from 2740c4f..f32df99 (Wladimir J. van der Laan)
@laanwj
Copy link
Member Author

laanwj commented Jun 16, 2016

It looks like the build system doesn't detect changes to univalue source files and will not automatically rebuild the library: if you get RPC test failures concerning unicode, please build from a clean tree

@maflcko
Copy link
Member

maflcko commented Jun 16, 2016

make clean
make distclean

should take care of this, I assume.

@laanwj
Copy link
Member Author

laanwj commented Jun 16, 2016

Yes, make clean should be enough.

codablock added a commit to codablock/dash that referenced this pull request Dec 28, 2017
This was probably missed while backporting Bitcoin PR bitcoin#7892
andvgal pushed a commit to energicryptocurrency/gen2-energi that referenced this pull request Jan 6, 2019
This was probably missed while backporting Bitcoin PR bitcoin#7892
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Sep 8, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants