-
Notifications
You must be signed in to change notification settings - Fork 38.7k
refactor: Rewrite AcceptToMemoryPoolWorker() using smaller parts #16400
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
The following sections might be updated with supplementary metadata relevant to reviewers and maintainers. ConflictsReviewers, this pull request conflicts with the following ones:
If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first. |
maflcko
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src/validation.cpp
Outdated
| // Compare a package's feerate against minimum allowed. | ||
| bool CheckFeeRate(size_t package_size, CAmount package_fee, CValidationState& state) | ||
| { | ||
| CAmount mempoolRejectFee = pool.GetMinFee(gArgs.GetArg("-maxmempool", DEFAULT_MAX_MEMPOOL_SIZE) * 1000000).GetFee(package_size); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are some of the settings parsed in the constructor and saved as a member and others are not?
Could do the same here.
| CAmount mempoolRejectFee = pool.GetMinFee(gArgs.GetArg("-maxmempool", DEFAULT_MAX_MEMPOOL_SIZE) * 1000000).GetFee(package_size); | |
| CAmount mempoolRejectFee = pool.GetMinFee(m_max_pool_size).GetFee(package_size); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason is that I end up needing to access the package size limits directly when adding AcceptMultipleTransactions() in #16401, whereas I didn't need local copies of the reject fee. So my main motivation behind not saving these as members is to avoid cluttering the state, which was already complex to untangle.
I don't feel strongly about this though; I just wanted to make it as easy as possible for reviewers and future code maintainers to be able to understand what data is required by each function.
src/validation.cpp
Outdated
| bool& fReplacementTransaction = ws.fReplacementTransaction; | ||
| CAmount& nModifiedFees = ws.nModifiedFees; | ||
| CAmount& nConflictingFees = ws.nConflictingFees; | ||
| size_t& nConflictingSize = ws.nConflictingSize; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that the local name and the name of the member are identical, does it really make sense to alias it? Looks like you already changed every line in this function due to whitespace fixups, so might as well prefix what we need with ws. or args. in line. No strong opinion, though.
If you decide to keep the current approach, I think you can make review easier by not duplicating all the types (use auto&, which should keep the type and constness as is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did think it was going to be easier for review if I didn't end up changing all the variables out at the same time. I think my preferred approach would be to fix the variable names in MemPoolAccept, Workspace, and ATMPArgs to conform to the style guide, but then continue to use these aliases so that we're not also changing all the variables in each of these functions.
One benefit of the aliases is that it makes it easier to tell what variables we're not using as well. I could also achieve that by changing the functions so that we pass in less data, such as by directly passing in only those members of the Workspace and Args structs that each function needs. Perhaps that is a better approach?
|
|
||
| // Check against previous transactions | ||
| // This is done last to help prevent CPU exhaustion denial-of-service attacks. | ||
| PrecomputedTransactionData txdata(tx); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason this is not stored in the workspace at this point? Maybe as a unique ptr or something, like the entry?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a version of this code where it was a unique_ptr, but thought this was an easy way to avoid a heap allocation that we don't really need.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems a bit weird to have m_entry as a unique_ptr, but not have txdata in there as well just to avoid it being on the heap. I think making txdata a unique_ptr in Workspace would simplify #16401 a bit too.
sdaftuar
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MarcoFalke Thanks for the quick review! I've pushed many updates to address feedback.
I plan to squash many of the cleanup commits down, but I wonder if the variable rename is easier to review by itself?
src/validation.cpp
Outdated
| // Compare a package's feerate against minimum allowed. | ||
| bool CheckFeeRate(size_t package_size, CAmount package_fee, CValidationState& state) | ||
| { | ||
| CAmount mempoolRejectFee = pool.GetMinFee(gArgs.GetArg("-maxmempool", DEFAULT_MAX_MEMPOOL_SIZE) * 1000000).GetFee(package_size); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason is that I end up needing to access the package size limits directly when adding AcceptMultipleTransactions() in #16401, whereas I didn't need local copies of the reject fee. So my main motivation behind not saving these as members is to avoid cluttering the state, which was already complex to untangle.
I don't feel strongly about this though; I just wanted to make it as easy as possible for reviewers and future code maintainers to be able to understand what data is required by each function.
src/validation.cpp
Outdated
| bool& fReplacementTransaction = ws.fReplacementTransaction; | ||
| CAmount& nModifiedFees = ws.nModifiedFees; | ||
| CAmount& nConflictingFees = ws.nConflictingFees; | ||
| size_t& nConflictingSize = ws.nConflictingSize; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did think it was going to be easier for review if I didn't end up changing all the variables out at the same time. I think my preferred approach would be to fix the variable names in MemPoolAccept, Workspace, and ATMPArgs to conform to the style guide, but then continue to use these aliases so that we're not also changing all the variables in each of these functions.
One benefit of the aliases is that it makes it easier to tell what variables we're not using as well. I could also achieve that by changing the functions so that we pass in less data, such as by directly passing in only those members of the Workspace and Args structs that each function needs. Perhaps that is a better approach?
|
|
||
| // Check against previous transactions | ||
| // This is done last to help prevent CPU exhaustion denial-of-service attacks. | ||
| PrecomputedTransactionData txdata(tx); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a version of this code where it was a unique_ptr, but thought this was an easy way to avoid a heap allocation that we don't really need.
ca6c291 to
2bfcbc6
Compare
|
I went ahead and squashed the commits up to the variable rename, unsquashed version is here: 16400.1 |
93e0451 to
a372e03
Compare
ajtowns
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Concept ACK. Looks pretty good; ATMPArgs seems helpful. I've only looked at the structure, not checked the logic.
src/validation.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like m_coins_to_uncache could just be a member var of MemPoolAccept directly; with either a UncacheCoins() method invoked by the caller on failure, or having a ~MemPoolAccept() destructor that automatically uncaches, with AcceptSingleTransaction calling m_coins_to_uncache.clear() before returning true to avoid uncaching things that should remain cached.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure it makes sense to mix ATMP logic at this layer with managing the utxo cache -- I think that's better handled by callers in the long run, even we're just doing something very simple now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I was thinking of it more as "ATMP logic messes up the cache when it decides it can't accept, so cleaning up that mess should also be part of ATMP logic". But having it be done in ATMPWithTime is still part of ATMP in my book, so either way works.
src/validation.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should have EXCLUSIVE_LOCKS_REQUIRED(cs_main)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
src/validation.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you could make it even easier by having const ATMPArgs m_args as a member of MemPoolAccept, and not passing it around at all. That doesn't work for m_bypass_limits in AcceptMultipleTransactions, but you're basically manually checking the limits there anyway, so maybe could do false for all of them; otherwise could use Workspace for per-tx logic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'm going to leave this as-is for the moment. I do think that holistically, we can think about doing one of two things with this refactor:
(a) go all in for storing state in the class and not pass around data between member functions
(b) dump this class and pass state around explicitly instead
I did something weird in this PR where I started to introduce a class (mostly to function as a namespace, I guess!), but I mostly clung to the historical style of passing everything around anyway (except for the mempool/coinsview related things, and the chain limits). So this is probably a pretty weird state to leave things, and I'm happy to dive down either direction so that the design is more cohesive, but I'd like feedback on which way to go.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess I think either way is fine, but my preference is to keep the class for namespacing (so it's clear the private methods are only called from the public ones), and to keep the number of explicit params to the functions pretty small -- pulling the stuff you actually need from the Args/Workspace structs like you do looks good to me.
Maybe it might make sense to only pass things as args if they aren't known got MemPoolAccept's constructor or can be deallocated before the end -- like txdata -- or are per-transaction -- like Workspace?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @ajtowns for the review. I've addressed most comments; the main conceptual point I would like feedback on is whether wrapping the state in this MemPoolAccept() class is a good idea, or if I should go back to the style we have historically used of passing state explicitly to these functions (as I mention below).
Also, I've updated the related pull over at #16401 to be rebased on the current version of this PR, so we can track how these changes affect the goal. Seems to work fine so far.
src/validation.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
src/validation.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'm going to leave this as-is for the moment. I do think that holistically, we can think about doing one of two things with this refactor:
(a) go all in for storing state in the class and not pass around data between member functions
(b) dump this class and pass state around explicitly instead
I did something weird in this PR where I started to introduce a class (mostly to function as a namespace, I guess!), but I mostly clung to the historical style of passing everything around anyway (except for the mempool/coinsview related things, and the chain limits). So this is probably a pretty weird state to leave things, and I'm happy to dive down either direction so that the design is more cohesive, but I'd like feedback on which way to go.
src/validation.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure it makes sense to mix ATMP logic at this layer with managing the utxo cache -- I think that's better handled by callers in the long run, even we're just doing something very simple now.
ec87639 to
0840a1d
Compare
|
Rebased |
|
ACK 0840a1d2e2270c06342710c4ebfaac1c41652dab (read diff with --ignore-all-space) I like that this splits up the massivley large ATMP into logical chunks Show signature and timestampSignature: Timestamp of file with hash |
src/validation.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
note to other reviewers. This says "package", but in the current code this is a single tx (the one that is added)
src/validation.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doc-nit: Should this say that this method fully initializes the ws? Or is this only a coincidence not worth to mention?
|
utACK 0840a1d2e2270c06342710c4ebfaac1c41652dab. |
ajtowns
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ACK 0840a1d2e2270c06342710c4ebfaac1c41652dab ; added some suggestions, but this is already a good improvement. re-reviewed the code, checked it compiled and tests passed.
|
|
||
| // Check against previous transactions | ||
| // This is done last to help prevent CPU exhaustion denial-of-service attacks. | ||
| PrecomputedTransactionData txdata(tx); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems a bit weird to have m_entry as a unique_ptr, but not have txdata in there as well just to avoid it being on the heap. I think making txdata a unique_ptr in Workspace would simplify #16401 a bit too.
src/validation.cpp
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would make for a slightly smaller diff to say CTxMemPoolEntry& entry = *ws.m_entry; and only declare it after m_entry has been reset (and assert(wx.m_entry); beforehand in Finalize to keep the static analysis tools happy, I suppose)
0840a1d to
6e43129
Compare
|
Squashed |
Bumped the milestone to 0.20: as noted by @fanquake, as this is preparation work for a feature, it doesn't make a lot of sense to merge this last-minute for 0.19, so I think we should merge this early in the 0.20 release cycle. |
|
I think it makes sense to merge this for 0.19 to make policy backports to 0.19 easier @sdaftuar Are you still working on this? |
|
I independently figured we’d wait until 0.20 for the same reason @laanwj gave, but I can get this rebased sooner for consideration in 0.19. |
This is in preparation for re-using these validation components for a new
version of AcceptToMemoryPool() that can operate on multiple transactions
("package relay").
ef79e05 to
4a87c5c
Compare
|
Rebased (old version is 16400.2) |
|
re-ACK 4a87c5c (did the rebase myself and arrived at the same result, mod whitespace) Show signature and timestampSignature: Timestamp of file with hash |
|
ACK 4a87c5c |
…ler parts 4a87c5c [refactor] Rewrite AcceptToMemoryPoolWorker() using smaller parts (Suhas Daftuar) Pull request description: This is in preparation for re-using these validation components for a new version of AcceptToMemoryPool() that can operate on multiple transactions ("package relay"). ACKs for top commit: MarcoFalke: re-ACK 4a87c5c (did the rebase myself and arrived at the same result, mod whitespace) laanwj: ACK 4a87c5c Tree-SHA512: b0495c026ffe06146258bace3d5e0c9aaf23fa65f89f258abc4af5980812e68e63a799f1d923e78ac1ee6bcafaf1222b2c2690a527df9b65dff7b48a013f154e
Merge the upstream refactoring to AcceptToMemoryPoolWorker from bitcoin/bitcoin#16400 with the Namecoin-specific changes made to it.
…maller parts
Summary:
4a87c5cfdf7dd72d999ebeaf17db6695a7c6298d [refactor] Rewrite AcceptToMemoryPoolWorker() using smaller parts (Suhas Daftuar)
Pull request description:
This is in preparation for re-using these validation components for a new version of AcceptToMemoryPool() that can operate on multiple transactions ("package relay").
---
Because of how our code handles SigChecks I could not remove the code commented under "Validate input
scripts against standard script flags" from the new PreChecks function into the PR's new PolicyScriptChecks
because since D5128 anything related to that number can only be run after CheckInputs.
The other code in the above new function was related to SegWit therefore I just removed the
PolicyScriptChecks function altogether.
Other missing members of the Workspace and ATMPArgs that were unused due to us not having RBF
were also taken out.
Backport of Core [[bitcoin/bitcoin#16400 | PR16400]]
Test Plan:
ninja all check check-functional
Reviewers: deadalnix, #bitcoin_abc
Reviewed By: deadalnix, #bitcoin_abc
Differential Revision: https://reviews.bitcoinabc.org/D8203
ATMPW stands for AcceptToMemoryPoolWorker, which was removed in bitcoin#16400.
This is in preparation for re-using these validation components for a new version of AcceptToMemoryPool() that can operate on multiple transactions ("package relay").