-
Notifications
You must be signed in to change notification settings - Fork 38.8k
wallet: Refactor the classes in wallet/db.{cpp/h} #18971
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Concept ACK |
|
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
|
Strong concept ACK for the effort and end result, but I'm scared it looks like the intermediate commits here are moving and changing code at the same time instead of just leaving code where it is or moving it in trivial to review I would definitely appreciate it (can't speak for other reviewers) if this PR could be updated to use MOVEONLY commits (multiple small ones throughout or a big one at the beginning or end), but I'll give this a try and maybe the review will be easier than I'm expecting 👍 |
cb030f6 to
0f991e7
Compare
I only found one commit which was moving and changing behavior at the same time. This was |
|
The patch changes ref counting of db files. Now |
|
|
@achow101 re @ryanofsky's comment, I set out to expand and clarify 67e30b2 into several simpler move-only commits. You can use achow101/bitcoin@refactor-storage...Empact:refactor-storage-moveonly |
0f991e7 to
e91544a
Compare
Got'cha, sorry about that, i've not notice it. I still have concern about touching db env in multi threading. |
|
Concept ACK |
e91544a to
6c47d41
Compare
Make Rewrite actually a member of BerkeleyDatabase instead of a static function in BerkeleyBatch
efe9763 to
0f0fa49
Compare
|
I get similar errors as Travis on macOS ( And: |
Instead of having WalletBatch construct the BerkeleyBatch, have BerkeleyDatabase do it and return a std::unique_ptr<BerkeleyBatch>
0f0fa49 to
e6f62ee
Compare
Should be fixed now. |
Instead of having Flush optionally shutdown the database and environment, add a Close() function that does that.
Refactor mapFileUseCount increment and decrement to separate functions AddRef and RemoveRef
Adds an Open function for the class abstraction that does nothing for now.
Make WalletDatabase actually an abstract class and not just a typedef for BerkeleyDatabase. Have BerkeleyDatabase inherit this class.
Instead of having BerkeleyEnvironment track the file use count, make BerkeleyDatabase do it itself.
Instead of opening the Db handle in BerkeleyBatch, make BerkeleyDatabase do that.
File ids were basically global anyways, it doesn't make sense to have them be environment specific if we're going to loop through all the environments anyways.
e6f62ee to
a2a30a5
Compare
|
This PR is now superseded by the PRs listed in the OP. Even though the final one, #19335, is the same diff and commits as this one, I wanted to have it be separate without the history from this PR as it is conceptually unrelated to this refactor. The refactoring into the abstract classes occurs in #19334. All of the PRs that were dependent on this one will be rebased onto #19334 |
ca24edf walletdb: Handle cursor internally (Andrew Chow) Pull request description: Instead of returning a Dbc (BDB cursor object) and having the caller deal with the cursor, make BerkeleyBatch handle the cursor internally. Split from #18971 ACKs for top commit: ryanofsky: Code review ACK ca24edf. Changes since last review: StartCursor rename, moving CloseCursor calls near returns promag: Code review ACK ca24edf. Tree-SHA512: f029b498c7f275aedca53ce7ade7cb99c82975fd6cad17346a4990fb3bcc54e2a5309b32053bd13def9ee464d331b036ac79abb8fc4fa561170c6cfc85283447
…Database d8e9ca6 walletdb: Move Rewrite into BerkeleyDatabase (Andrew Chow) 91d1091 walletdb: Move PeriodicFlush into WalletDatabase (Andrew Chow) 8f1bcf8 walletdb: Combine VerifyDatabaseFile and VerifyEnvironment (Andrew Chow) Pull request description: The `BerkeleyBatch` class has 4 static functions that operate on `BerkeleyDatabase` or `BerkeleyEnvironment`. It doesn't make sense for these to be standalone nor for them to be static functions. So instead, move them from `BerkeleyBatch` into `BerkeleyDatabase` and make them member functions instead of static. `BerkeleyBatch::VerifyEnvironment` and `BerkeleyBatch::VerifyDatabaseFile` are combined into a single `BerkeleyDatabase::Verify` function that operates on that `BerkeleyDatabase` object. `BerkeleyBatch::Rewrite` and `BerkeleyBatch::PeriodicFlush` both took a `BerkeleyDatabase` as an argument and did stuff on it. So we just make it a member function so it doesn't need to take a database as an argument. Part of #18971 ACKs for top commit: MarcoFalke: re-ACK d8e9ca6 only change is test fixup 🤞 promag: Code review ACK d8e9ca6, good stuff. Tree-SHA512: 9847e55b13d98bf4e5636cc14bc3f5351d56737f7e320fafffaed128606240765599e5400382c5aecac06690f7e36265ca3e1031f3f6d8a9688f6d5cb1bacd2a
…Batch abstract class b82f0ca walletdb: Add MakeBatch function to BerkeleyDatabase and use it (Andrew Chow) eac9200 walletdb: Refactor DatabaseBatch abstract class from BerkeleyBatch (Andrew Chow) Pull request description: In order to support alternative database systems, we need to have a generic `Batch` class. This PR adds a `DatabaseBatch` abstract class which is implemented by `BerkeleyBatch`. `DatabaseBatch` is now the class that is used by `WalletBatch` to interact with the database. To be able to get the correct type of `DatabaseBatch`, `BerkeleyDatabase` now has a `MakeBatch` function which returns a newly constructed `std::unique_ptr<DatabaseBatch>`. For `BerkeleyDatabase`, that will be `std::unique_ptr<BerkeleyBatch>`. The `Read`, `Write`, `Erase`, and `Exists` template functions are moved from `BerkeleyBatch`. Part of #18971 Requires #19308 and #19324 ACKs for top commit: Sjors: re-utACK b82f0ca MarcoFalke: ACK b82f0ca 🌘 meshcollider: LGTM, utACK b82f0ca Tree-SHA512: 6d2d41631c0983391dbecd702e881c6775b155c90b275df97f7157e42608ed251744f9d7ce5173d02a6c5cc38d90b611880fac7fa635d3d8c4d590681f56ac6a
…atabaseBatch abstract class b82f0ca walletdb: Add MakeBatch function to BerkeleyDatabase and use it (Andrew Chow) eac9200 walletdb: Refactor DatabaseBatch abstract class from BerkeleyBatch (Andrew Chow) Pull request description: In order to support alternative database systems, we need to have a generic `Batch` class. This PR adds a `DatabaseBatch` abstract class which is implemented by `BerkeleyBatch`. `DatabaseBatch` is now the class that is used by `WalletBatch` to interact with the database. To be able to get the correct type of `DatabaseBatch`, `BerkeleyDatabase` now has a `MakeBatch` function which returns a newly constructed `std::unique_ptr<DatabaseBatch>`. For `BerkeleyDatabase`, that will be `std::unique_ptr<BerkeleyBatch>`. The `Read`, `Write`, `Erase`, and `Exists` template functions are moved from `BerkeleyBatch`. Part of bitcoin#18971 Requires bitcoin#19308 and bitcoin#19324 ACKs for top commit: Sjors: re-utACK b82f0ca MarcoFalke: ACK b82f0ca 🌘 meshcollider: LGTM, utACK b82f0ca Tree-SHA512: 6d2d41631c0983391dbecd702e881c6775b155c90b275df97f7157e42608ed251744f9d7ce5173d02a6c5cc38d90b611880fac7fa635d3d8c4d590681f56ac6a
d416ae5 walletdb: Introduce WalletDatabase abstract class (Andrew Chow) 2179dbc walletdb: Add BerkeleyDatabase::Open dummy function (Andrew Chow) 71d28e7 walletdb: Introduce AddRef and RemoveRef functions (Andrew Chow) 27b2766 walletdb: Move BerkeleyDatabase::Flush(true) to Close() (Andrew Chow) Pull request description: A `WalletDatabase` abstract class is created from `BerkeleyDatabase` and is implemented by `BerkeleyDatabase`. First, to get to the point that this is possible, 4 functions need to be added to `BerkeleyDatabase`: `AddRef`, `RemoveRef`, `Open`, and `Close`. First the increment and decrement of `mapFileUseCount` is refactored into separate functions `AddRef` and `RemoveRef`. `Open` is introduced as a dummy function. This will raise an exception so that it always fails. `Close` is refactored from `Flush`. The `shutdown` argument in `Flush` is removed and instead `Flush(true)` is now the `Close` function. Split from #18971 Requires #19325 ACKs for top commit: ryanofsky: Code review ACK d416ae5. Only changes since last review were rebasing after base PR #19334 merge, and adding cs_db lock in BerkeleyDatabase destructor, which should avoid races accessing env->m_databases and env->m_fileids fjahr: Code review ACK d416ae5 meshcollider: Code review & test run ACK d416ae5 Tree-SHA512: 98d05ec093d7446c4488e2b0914584222a331e9a2f4d5be6af98e3f6d78fdd8e75526c12f91a8a52d4820c25bce02aa02aabe92d38bee7eb2fce07d0691b7b0d
d416ae5 walletdb: Introduce WalletDatabase abstract class (Andrew Chow) 2179dbc walletdb: Add BerkeleyDatabase::Open dummy function (Andrew Chow) 71d28e7 walletdb: Introduce AddRef and RemoveRef functions (Andrew Chow) 27b2766 walletdb: Move BerkeleyDatabase::Flush(true) to Close() (Andrew Chow) Pull request description: A `WalletDatabase` abstract class is created from `BerkeleyDatabase` and is implemented by `BerkeleyDatabase`. First, to get to the point that this is possible, 4 functions need to be added to `BerkeleyDatabase`: `AddRef`, `RemoveRef`, `Open`, and `Close`. First the increment and decrement of `mapFileUseCount` is refactored into separate functions `AddRef` and `RemoveRef`. `Open` is introduced as a dummy function. This will raise an exception so that it always fails. `Close` is refactored from `Flush`. The `shutdown` argument in `Flush` is removed and instead `Flush(true)` is now the `Close` function. Split from bitcoin#18971 Requires bitcoin#19325 ACKs for top commit: ryanofsky: Code review ACK d416ae5. Only changes since last review were rebasing after base PR bitcoin#19334 merge, and adding cs_db lock in BerkeleyDatabase destructor, which should avoid races accessing env->m_databases and env->m_fileids fjahr: Code review ACK d416ae5 meshcollider: Code review & test run ACK d416ae5 Tree-SHA512: 98d05ec093d7446c4488e2b0914584222a331e9a2f4d5be6af98e3f6d78fdd8e75526c12f91a8a52d4820c25bce02aa02aabe92d38bee7eb2fce07d0691b7b0d
…leyBatch 74507ce walletdb: Remove BerkeleyBatch friend class from BerkeleyDatabase (Andrew Chow) 00f0041 No need to check for duplicate fileids in all dbenvs (Andrew Chow) d86efab walletdb: Move Db->open to BerkeleyDatabase::Open (Andrew Chow) 4fe4b3b walletdb: track database file use as m_refcount within BerkeleyDatabase (Andrew Chow) 65fb880 Combine BerkeleyEnvironment::Verify into BerkeleyDatabase::Verify (Andrew Chow) Pull request description: `BerkeleyBatch` and `BerkeleyDatabase` are kind of messy. The goal of this is to clean up them up so that they are logically separated. `BerkeleyBatch` currently handles the creation of the `BerkeleyDatabase`'s `Db` handle. This is instead moved into `BerkeleyDatabase` and is called by `BerkeleyBatch`. Instead of having `BerkeleyEnvironment` track each database's usage, have `BerkeleyDatabase` track this usage itself with the `m_refcount` variable that is present in `WalletDatabase`. Lastly, instead of having each `BerkeleyEnvironment` store the fileids of the databases open in it, have a global `g_fileids` to track those fileids. We were already checking fileid uniqueness globally (by checking the fileids in every environment when opening a database) so it's cleaner to do this with a global variable. All of these changes allow us to make `BerkeleyBatch` and `BerkeleyDatabase` no longer be friend classes. The diff of this PR is currently the same as in ##18971 Requires #19334 ACKs for top commit: laanwj: Code review ACK 74507ce ryanofsky: Code review ACK 74507ce. No changes since last review other than rebase Tree-SHA512: 845d84ee1a470e2bf5d2e2e3d7738183d8ce43ddd06a0bbd57edecf5779b2f55d70728b1b57f5daab0f078650a8d60c3e19dc30b75b36e7aa952ce268399d5f6
…d BerkeleyBatch 74507ce walletdb: Remove BerkeleyBatch friend class from BerkeleyDatabase (Andrew Chow) 00f0041 No need to check for duplicate fileids in all dbenvs (Andrew Chow) d86efab walletdb: Move Db->open to BerkeleyDatabase::Open (Andrew Chow) 4fe4b3b walletdb: track database file use as m_refcount within BerkeleyDatabase (Andrew Chow) 65fb880 Combine BerkeleyEnvironment::Verify into BerkeleyDatabase::Verify (Andrew Chow) Pull request description: `BerkeleyBatch` and `BerkeleyDatabase` are kind of messy. The goal of this is to clean up them up so that they are logically separated. `BerkeleyBatch` currently handles the creation of the `BerkeleyDatabase`'s `Db` handle. This is instead moved into `BerkeleyDatabase` and is called by `BerkeleyBatch`. Instead of having `BerkeleyEnvironment` track each database's usage, have `BerkeleyDatabase` track this usage itself with the `m_refcount` variable that is present in `WalletDatabase`. Lastly, instead of having each `BerkeleyEnvironment` store the fileids of the databases open in it, have a global `g_fileids` to track those fileids. We were already checking fileid uniqueness globally (by checking the fileids in every environment when opening a database) so it's cleaner to do this with a global variable. All of these changes allow us to make `BerkeleyBatch` and `BerkeleyDatabase` no longer be friend classes. The diff of this PR is currently the same as in #bitcoin#18971 Requires bitcoin#19334 ACKs for top commit: laanwj: Code review ACK 74507ce ryanofsky: Code review ACK 74507ce. No changes since last review other than rebase Tree-SHA512: 845d84ee1a470e2bf5d2e2e3d7738183d8ce43ddd06a0bbd57edecf5779b2f55d70728b1b57f5daab0f078650a8d60c3e19dc30b75b36e7aa952ce268399d5f6
The data storage layer of the wallet is fairly complicated and not well encapsulated or modularized. This makes it difficult for us to introduce new data storage methods. So this PR refactors it such that there is a clear way to introduce other storage options. This is also intended to clean up the logic of the database layer so that it is easier to follow and reason about.
In the database, there are 3 classes:
BerkeleyBatch,BerkeleyDatabase, andBerkeleyEnvironment.BerkeleyDatabaseis typenamed toWalletDatabase. The goal is to introduce two abstract classes:WalletDatabaseandDatabaseBatch. These will be implemented byBerkeleyDatabaseandBerkeleyBatch. This abstract classes can be inherited by other storage methods. All of the database specific things should be hidden in those classes.To achieve this, we move the database handling things in
BerkeleyBatchand turn it purely into a data accessor for theBerkeleyDatabase. Particularly, various static functions that operated on aBerkeleyDatabaseare changed to be member functions ofBerkeleyDatabase.Read,Write,Erase, andExistsare refactored to have separate functions that can be overridden by other classes. Instead ofGetCursorreturning a BDB specificDbcobject, newCreateCursorandCloseCursorfunctions are added and the cursor is handled internally within theBerkeleyBatch. We are then able to introduce theDatabaseBatchabstract class.Functionality is further moved out of
BerkeleyBatchintoBerkeleyDatabase. Notably theDbhandle creation is moved fromBerkeleyBatch's constructor and to a newOpenfunction inBerkeleyDatabase.BerkeleyDatabasewill also handle closing with a newClosefunction instead of having this be part ofFlush.Additionally, the existence of
BerkeleyEnvironmentis hidden inside ofBerkeleyDatabase.mapFileUseCountis removed. InsteadWalletDatabasewill track it's use count withm_refcount. This is incremented by newly introducedAddRefandRemoveReffunctions.m_refcountis used to ensure that the database and environment are not closed while the database may still be in use.Instead of each
BerkeleyEnvironmenttracking the fileids of the databases in that environment, a global mapg_fileidsis used. Since we were already going through every open fileid for the uniqeness check, it doesn't matter whatBerkeleyEnvironmenthas a database with a particular fileid. All we care about is that there is a database in any of the environments that has the same fileid of the database that is currently being opened.Lastly,
BerkeleyBatch,BerkeleyDatabase, andBerkeleyEnvironmentare moved into new fileswallet/bdb.{cpp/h}.WalletDatabaseandDatabaseBatchare inwallet/db.{cpp/h}`. This just gives better organization of the code so they aren't all mixed together.I've started breaking this down into separate PRs. Some of the simpler stuff is moved up to the front so they can be merged first.
These PRs are independent of each other and can be merged out of order after #19290 is merged
These PRs require the previous listed to be merged first and need to be merged in order
DatabaseBatch: wallet: Refactor BerkeleyDatabase to introduce DatabaseBatch abstract class #19325WalletDatabase: wallet: Introduce WalletDatabase abstract class #19334I will continue to break this down as things get merged.