Skip to content

Conversation

@practicalswift
Copy link
Contributor

@practicalswift practicalswift commented Apr 29, 2020

Document differences in bitcoind and bitcoin-qt locale handling.

Since this seems to be the root cause to the locale dependency issues we've seen over the years I thought it was worth documenting :)

Note that 1.) QLocale (used by Qt), 2.) C locale (used by locale-sensitive C standard library functions/POSIX functions and some parts of the C++ standard library such as std::to_string) and 3.) C++ locale (used by the C++ input/output library) are three separate things. This comment is about the perhaps surprising interference with the C locale (2) that takes place as part of the Qt initialization.

@hebasto
Copy link
Member

hebasto commented May 3, 2020

@practicalswift practicalswift force-pushed the bitcoin-qt-vs-bitcoind-locale branch from 040cda1 to aac53b7 Compare May 3, 2020 05:59
@practicalswift
Copy link
Contributor Author

@hebasto Thanks a lot for reviewing! Good points: I've adjusted accordingly. Please re-review :)

@hebasto
Copy link
Member

hebasto commented May 4, 2020

@practicalswift Why do you think that the BitcoinApplication constructor is a proper place to document locale issues? Did you consider other options, e.g., doc/ directory?

@practicalswift
Copy link
Contributor Author

practicalswift commented May 4, 2020

@hebasto I put the comment in the ctor where the QApplication call is made and thus where the implicit setlocale invocation is made, but I don't have any strong opinion about where to put it. You suggestion is putting it in the developer notes instead? I can do that :)

I'm fine as long as we have it documented somewhere so that future generations of Bitcoin Core developers don't have to wonder where the heck the "invisible" setlocale call is made that is the root cause of all locale issues :)

@hebasto
Copy link
Member

hebasto commented May 4, 2020

@hebasto I put the comment in the ctor where the QApplication call is made and thus where the implicit setlocale invocation is made, but I don't have any strong opinion about where to put it. You suggestion is putting it in the developer notes instead? I can do that :)

It seems better to get a broader consensus :)

@DrahtBot
Copy link
Contributor

DrahtBot commented May 29, 2020

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

No conflicts as of last run.

@practicalswift practicalswift force-pushed the bitcoin-qt-vs-bitcoind-locale branch from aac53b7 to 52d83bf Compare June 2, 2020 09:22
Copy link
Member

@jonatack jonatack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK.

Perhaps add a paragraph similar to the second one in your PR description.

I'm not sure, but this might be a bit long for the developer notes. Maybe either in the codebase like here, or in its own file in /doc, the main thing being that a developer git grepping on the topic finds it easily.

Some suggestions below; feel free to pick and choose or ignore.

@practicalswift practicalswift force-pushed the bitcoin-qt-vs-bitcoind-locale branch 2 times, most recently from 3ec8452 to 1bd43d1 Compare July 3, 2020 12:08
@practicalswift
Copy link
Contributor Author

@jonatack Thanks for excellent and detailed copy review! ❤️ Updated accordingly.

@jonatack @hebasto I've now added a small note in the developer notes that refers to the more detailed description in the code comment. Looks good? :)

@practicalswift
Copy link
Contributor Author

The CI failure is most likely unrelated. Can someone restart the Cirrus CI job? :)

@fjahr
Copy link
Contributor

fjahr commented Jul 15, 2020

Concept ACK

FWIW, I agree with @hebasto and would only put a much shorter version of the comment in the code. My perspective is that a lot of people look at src/qt/bitcoin.cpp but only very few of them care about localization. While it may be the best place to put it technically, it feels oddly specific there. A short hint and the link to the docs would be enough to let people know what is happening. The long-form could go into the developer notes or another place in the docs IMO.

@practicalswift practicalswift force-pushed the bitcoin-qt-vs-bitcoind-locale branch 2 times, most recently from 200bf0d to 26fa249 Compare July 29, 2020 17:18
@practicalswift
Copy link
Contributor Author

@hebasto @jonatack @fjahr Thanks a lot for reviewing. Feedback addressed: the documentation has now been moved to a more appropriate home (test/lint/lint-locale-dependence.sh). Should be ready for final review :)

Copy link
Member

@hebasto hebasto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ACK 26fa24996e18b9ecaa1b21fcfd049af6c468d59d, I have reviewed the code and it looks OK, I agree it can be merged.

@fjahr
Copy link
Contributor

fjahr commented Jul 30, 2020

ACK 26fa24996e18b9ecaa1b21fcfd049af6c468d59d

Restarted failing CI job.

@practicalswift
Copy link
Contributor Author

I think this one should be ready for merge now :)

@practicalswift
Copy link
Contributor Author

@fanquake I think two ACK:s (hebasto, fjahr) and one Concept ACK (jonatack) should make this totally uncontroversial documentation only change ready for merge. What do you think? :)

@practicalswift
Copy link
Contributor Author

Ready for merge? :)

@maflcko
Copy link
Member

maflcko commented Aug 27, 2020

Since this seems to be the root cause to the locale dependency issues we've seen over the years I thought it was worth documenting :)

Do you have one example handy where the different handling was the root cause for an issue?

Also I am wondering what the motivation here is. While this difference in handling may be theoretically correct, we should never obverse it in practice. If we did, it would be a bug, as any locale dependent functions should be avoided.

@practicalswift
Copy link
Contributor Author

practicalswift commented Aug 27, 2020

@MarcoFalke

Do you have one example handy where the different handling was the root cause for an issue?

There are plenty but look at #6443: bug in Bitcoin Qt due to Qt's highly surprising "unsolicited" setlocale call on initialization. Not a bug in bitcoind due to the discussed difference.

The thing is that literally all locale bugs when using POSIX functions have been due to this difference. If it wasn't for the setlocale call which Qt does as part of initialization we wouldn't have seen a single POSIX locale bug.

We could literally kill the locale dependency bug class for good in Bitcoin Core by simply following the recommendation of the Qt documentation:

On Unix/Linux Qt is configured to use the system locale settings by default. This can cause a conflict when using POSIX functions, for instance, when converting between data types such as floats and strings, since the notation may differ between locales. To get around this problem, call the POSIX function setlocale(LC_NUMERIC,"C") right after initializing QApplication, QGuiApplication or QCoreApplication to reset the locale that is used for number formatting to "C"-locale. [https://doc.qt.io/qt-5/qcoreapplication.html]

(Aside: AFAICT there is not a single place in our code base where we want the POSIX functions to use a locale other than C when parsing or formatting numbers. (Obviously we want Qt to be localized but that is handled by QLocale.))

So, why documenting this?

A new developer coming to the project doing git grep setlocale won't see a single place (outside of the fuzz tests) where we opt in to POSIX locale dependence.

He or she might thus incorrectly assume that we don't risk running in to POSIX locale bugs.

If this PR is merged he/she will hopefully fully understand why we run such risk despite the apparent lack of setlocale call in our code base.

While this difference in handling may be theoretically correct, we should never obverse it in practice.

If consensus critical code would use a POSIX locale dependent function the worst case scenario would be a chain split between bitcoind clients and a subset of Bitcoin Qt clients (depending on their locale). For bitcoind the POSIX locale is guaranteed to be C whereas it can be whatever funky locale the user chooses in Bitcoin Qt.

Such a worst-case chain split would certainly be observable in practice :)

If we did, it would be a bug, as any locale dependent functions should be avoided.

I share the long-term goal of removing all usages of locale dependent functions, but I don't see how that would be in conflict with wanting to document how locales are currently used in our project, and the risks such use entails :)

Should I perhaps drop or rephrase the developer note comment?

As long as we document this somehow and somewhere in a greppable way in our project I'm happy :)

@maflcko
Copy link
Member

maflcko commented Aug 28, 2020

Ok, sounds reasonable. About the dev notes: I see them mostly as instructions on how we write code. "Avoid using locale dependent functions if possible" is already mentioned and the qt documentation seems a bit misplaced.

Concept ACK

@practicalswift practicalswift force-pushed the bitcoin-qt-vs-bitcoind-locale branch from 26fa249 to ca185cf Compare August 29, 2020 01:55
@practicalswift
Copy link
Contributor Author

practicalswift commented Aug 29, 2020

@MarcoFalke Thanks for reviewing! No longer touching the developer notes. Should hopefully be ready for merge :)

Copy link
Member

@hebasto hebasto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re-ACK ca185cf

@maflcko maflcko merged commit baf9ced into bitcoin:master Aug 29, 2020
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Aug 30, 2020
…n-qt locale handling

ca185cf doc: Document differences in bitcoind and bitcoin-qt locale handling (practicalswift)

Pull request description:

  Document differences in `bitcoind` and `bitcoin-qt` locale handling.

  Since this seems to be the root cause to the locale dependency issues we've seen over the years I thought it was worth documenting :)

  Note that 1.) `QLocale` (used by Qt), 2.) C locale (used by locale-sensitive C standard library functions/POSIX functions and some parts of the C++ standard library such as `std::to_string`) and 3.) C++ locale (used by the C++ input/output library) are three separate things. This comment is about the perhaps surprising interference with the C locale (2) that takes place as part of the Qt initialization.

ACKs for top commit:
  hebasto:
    re-ACK ca185cf

Tree-SHA512: e51c32f3072c506b0029a001d8b108125e1acb4f2b6a48a6be721ddadda9da0ae77a9b39ff33f9d9eebabe2244c1db09e8502e3e7012d7a5d40d98e96da0dc44
@practicalswift practicalswift deleted the bitcoin-qt-vs-bitcoind-locale branch April 10, 2021 19:42
Fabcien pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Sep 17, 2021
Summary:
> Document differences in bitcoind and bitcoin-qt locale handling.
>
> Since this seems to be the root cause to the locale dependency issues we've seen over the years I thought it was worth documenting

This is a backport of [[bitcoin/bitcoin#18817 | core#18817]]

Test Plan: proofreading

Reviewers: #bitcoin_abc, Fabien

Reviewed By: #bitcoin_abc, Fabien

Subscribers: boardgamesaz, Fabien

Differential Revision: https://reviews.bitcoinabc.org/D10136
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Aug 18, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants