Skip to content

Conversation

@achow101
Copy link
Member

@achow101 achow101 commented May 7, 2020

Instead of removing the database transaction logs and retrying the
wallet loading, just return an error message to the user. Additionally,
speciically for DB_RUNRECOVERY, notify the user that this could be due
to different BDB versions.

Kind of implements the suggestion from #18870 (comment)

@DrahtBot DrahtBot added the Wallet label May 7, 2020
@ryanofsky
Copy link
Contributor

Concept ACK c1029c6. This removes database loading with no logs behavior implemented in #2558 1859aaf, which hopefully should be unnecessary now that the wallet flushes very frequently.

This change should reduce chances for data loss and provide better error feedback if there's an unsuccessful attempt to load newer bdb logs with an older version of bdb. It also simplifies the wallet opening code.

It's possible this PR could cause new errors for users who weren't seeing errors before. If this happens we could add an easier to use manual recovery option, or implement a middle-ground automatic approach like #18870 (comment) that will retry loading the wallet without log files as long as the log files came from a compatible bdb version.

@DrahtBot
Copy link
Contributor

DrahtBot commented May 14, 2020

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK c1029c6. I wrote a test for this (feel free to steal) in f68911f (branch)

@ryanofsky
Copy link
Contributor

ryanofsky commented May 19, 2020

Some travis failures here. One in rpc_rawtransaction appears to be spurious https://travis-ci.org/github/bitcoin/bitcoin/jobs/684393890#L5036:

2020-05-07T18:54:10.222000Z TestFramework (ERROR): Unexpected exception caught during testing
Traceback (most recent call last):
  File "/home/travis/build/bitcoin/bitcoin/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/test_framework/test_framework.py", line 114, in main
    self.run_test()
  File "/home/travis/build/bitcoin/bitcoin/ci/scratch/build/bitcoin-x86_64-pc-linux-gnu/test/functional/rpc_rawtransaction.py", line 299, in run_test
    vout = next(o for o in rawTx['vout'] if o['value'] == Decimal('2.20000000'))
StopIteration

Other two failures in wallet_tool test seem to be the result of changed error strings https://travis-ci.org/github/bitcoin/bitcoin/jobs/684393889#L3118 and https://travis-ci.org/github/bitcoin/bitcoin/jobs/684393892#L4350:

AssertionError: not(Error loading wallet.dat. Is wallet being used by other process? == Error initializing wallet database environment "/home/travis/build/bitcoin/bitcoin/ci/scratch/test_runner/test_runner_₿_🏃_20200507_184719/tool_wallet_117/node0/regtest/wallets"!
  File "/home/travis/build/bitcoin/bitcoin/ci/scratch/build/bitcoin-i686-pc-linux-gnu/test/functional/tool_wallet.py", line 39, in assert_raises_tool_error
    assert_equal(stderr.strip(), error)

@achow101 achow101 force-pushed the dont-retry-bdbenv branch from c1029c6 to 38ac29c Compare May 19, 2020 19:15
@achow101
Copy link
Member Author

Cherrypicked the test and I think I've fixed the other test failures.

Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK 38ac29c1e8e862669c2c1fe8d9ddbced8fb53392. Changes since last review were removing bdb error strings from error messages to deal with failing travis tests and adding new python test (thanks). It'd be nice if we could include detailed bdb error messages and fix the broken travis tests to be less sensitive, but no need for that here.

Travis is failing again currently because test_runner test list is out of date. Errors are "WARNING! The following scripts are not being run: ['wallet_db.py']. Check the test lists in test_runner.py." https://travis-ci.org/github/bitcoin/bitcoin/jobs/688954958#L3332

@achow101
Copy link
Member Author

Travis is failing again currently because test_runner test list is o

Fixed

@achow101 achow101 force-pushed the dont-retry-bdbenv branch from 38ac29c to c6a2c26 Compare May 19, 2020 21:49
@maflcko
Copy link
Member

maflcko commented May 20, 2020

ci is failing. You might have to add a suppression to ./test/sanitizer_suppressions/lsan

@ryanofsky
Copy link
Contributor

Apparently the new wallet_db.py test exposes a memory leak that happens when BerkeleyEnvironment::Open fails: https://travis-ci.org/github/bitcoin/bitcoin/jobs/689013660#L4659

I didn't look closely, but in case the leak does not seem to be something obvious, I'd forgive you if you just want to drop the test here and pretend the leak isn't happening. (I'm assuming it is pre-existing.) I can follow up with the test in another PR.

@achow101
Copy link
Member Author

I think I've fixed the memory leak.

@achow101 achow101 force-pushed the dont-retry-bdbenv branch from e78c682 to 274b990 Compare May 20, 2020 20:02
@achow101
Copy link
Member Author

Hmm. Guess that didn't fix it. I think I'll just ignore this leak for now. I think this leak also exists on master, so it's not a regression.

Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK 274b9909f64aa617eb322850b799bc38c04f5f0c. Only change since last review is dropping the test which exposed the memory leak. Agree it looks like the memory leak is probably in current code and just exposed by the test. This PR is just basically just removing code, and nothing that's been removed looks like it is freeing memory. I'll file an issue for fixing the memory leak and adding back the test

@ryanofsky
Copy link
Contributor

re: #18907 (review)

I'll file an issue for fixing the memory leak and adding back the test

Filed #19034

Copy link
Member

@maflcko maflcko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left a nit

Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK c2faa3432241d15f8d0a9968ba46cc0eef7fe21f. No changes since last review except rebase and translation string tweak

Copy link
Member

@Sjors Sjors left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tACK c2faa34 on macOS 10.15.5

I tested a few scenario's. First I created a fresh wallet with bdb 18.

  1. It closed it normally, recompiled with bdb4.8 and loaded it. That fails with Wallet file verification fails. wallet.dat corrupt. Try using the wallet tool bitcoin-wallet to salvage or restoring a backup.. Same error with or without this commit; bdb 18 is probably just too new.

  2. While the bdb18 wallet was open, I copied the database directory. I then closed Bitcoin QT and copied the directory back. I then opened it, without this PR, and it renamed the directory to .bak. Same error as above.

  3. Same as (2), but with this PR, now I get the new error message.

The commit message could be more clear that this error is only triggered for users who manually compiled with a newer BDB version.

Nit: speciically typo in commit message

@achow101
Copy link
Member Author

achow101 commented Jun 29, 2020

Same error with or without this commit; bdb 18 is probably just too new.

It is. The best version to test with is probably 5.x as it will produce compatible/identical wallet.dat files but incompatible logs.

Copy link
Contributor

@meshcollider meshcollider left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK c2faa3432241d15f8d0a9968ba46cc0eef7fe21f

With three ACKs this can go in despite the annoyingly obvious typo ;)

EDIT:
there is a build error so you can fix the typo now

wallet/salvage.cpp: In function ‘bool RecoverDatabaseFile(const boost::filesystem::path&)’:
wallet/salvage.cpp:23:36: error: no matching function for call to ‘BerkeleyEnvironment::Open(bool)’
     if (!env->Open(true /* retry */)) {

@achow101 achow101 force-pushed the dont-retry-bdbenv branch from c2faa34 to c356251 Compare July 11, 2020 15:18
@achow101
Copy link
Member Author

Had to rebase. Fixed the typo and added a bit to the commit message.

Instead of removing the database transaction logs and retrying the
wallet loading, just return an error message to the user. Additionally,
specifically for DB_RUNRECOVERY, notify the user that this could be due
to different BDB versions. This error is pretty much only caused by
compiling with a newer version of BDB and then trying to open the wallet
with a version compiled with an older version of BDB.
@achow101 achow101 force-pushed the dont-retry-bdbenv branch from c356251 to d0ea9ba Compare July 13, 2020 15:01
@Sjors
Copy link
Member

Sjors commented Jul 14, 2020

re-utACK d0ea9ba

Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK d0ea9ba. Only changes since last review are rebase and expanding error and commit messages.

@maflcko
Copy link
Member

maflcko commented Jul 22, 2020

Tested:

$  ./src/bitcoin-cli -regtest getnewaddress
2020-07-22T06:50:11Z [default wallet] keypool reserve 2
2020-07-22T06:50:11Z [default wallet] keypool keep 2
bitcoind: wallet/rpcwallet.cpp:278: UniValue getnewaddress(const JSONRPCRequest&): Assertion `false' failed.
error: Could not connect to the server 127.0.0.1:18443 (error code 1 - "EOF reached")

Make sure the bitcoind server is running and that you are connecting to the correct RPC port.
[2]-  Aborted                 (core dumped) ./src/bitcoind -regtest



$ ls ~/.bitcoin/regtest/wallets/
database  db.log  wallet.dat


$ ./src/bitcoind-with-older-bdb -regtest  
...
2020-07-22T06:55:42Z Using wallet directory /root/.bitcoin/regtest/wallets
2020-07-22T06:55:42Z init message: Verifying wallet(s)...
2020-07-22T06:55:42Z Using BerkeleyDB version Berkeley DB 4.8.30: (April  9, 2010)
2020-07-22T06:55:42Z Using wallet /root/.bitcoin/regtest/wallets/wallet.dat
2020-07-22T06:55:42Z BerkeleyEnvironment::Open: LogDir=/root/.bitcoin/regtest/wallets/database ErrorFile=/root/.bitcoin/regtest/wallets/db.log
2020-07-22T06:55:42Z BerkeleyEnvironment::Open: Error -30974 opening database environment: DB_RUNRECOVERY: Fatal error, run database recovery
2020-07-22T06:55:42Z Error: Error initializing wallet database environment "/root/.bitcoin/regtest/wallets"! This error could occur if this wallet was not shutdown cleanly and was last loaded using a build with a newer version of Berkeley DB. If so, please use the software that last loaded this wallet
Error: Error initializing wallet database environment "/root/.bitcoin/regtest/wallets"! This error could occur if this wallet was not shutdown cleanly and was last loaded using a build with a newer version of Berkeley DB. If so, please use the software that last loaded this wallet
2020-07-22T06:55:42Z Shutdown: In progress...
2020-07-22T06:55:42Z scheduler thread exit
2020-07-22T06:55:42Z Shutdown: done

LOCK(cs_db);
if (!env->Open(false /* retry */))
bilingual_str open_err;
if (!env->Open(open_err))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

style-nit: Could upgrade to the new style while touching the line

Suggested change
if (!env->Open(open_err))
if (!env->Open(open_err)) {

@maflcko maflcko merged commit c7007ba into bitcoin:master Jul 22, 2020
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Jul 24, 2020
… and instead error

d0ea9ba walletdb: Don't remove database transaction logs and instead error (Andrew Chow)

Pull request description:

  Instead of removing the database transaction logs and retrying the
  wallet loading, just return an error message to the user. Additionally,
  speciically for DB_RUNRECOVERY, notify the user that this could be due
  to different BDB versions.

  Kind of implements the suggestion from bitcoin#18870 (comment)

ACKs for top commit:
  Sjors:
    re-utACK d0ea9ba
  ryanofsky:
    Code review ACK d0ea9ba. Only changes since last review are rebase and expanding error and commit messages.

Tree-SHA512: f6e67dc70f58188742a5c8af7cdc63a2b58779aa0d26ae7f1e75805a239f1a342433860e5a238d6577fae5ab04b9d15e7f11c55b867065dfd13781a6a62e4958
deadalnix pushed a commit to Bitcoin-ABC/bitcoin-abc that referenced this pull request Jun 8, 2021
Summary:
Instead of removing the database transaction logs and retrying the
wallet loading, just return an error message to the user. Additionally,
specifically for DB_RUNRECOVERY, notify the user that this could be due
to different BDB versions. This error is pretty much only caused by
compiling with a newer version of BDB and then trying to open the wallet
with a version compiled with an older version of BDB.

This is a backport of [[bitcoin/bitcoin#18907 | core#18907]]

Test Plan: `ninja all check-all`

Reviewers: #bitcoin_abc, Fabien

Reviewed By: #bitcoin_abc, Fabien

Differential Revision: https://reviews.bitcoinabc.org/D9640
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Feb 15, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants