Skip to content

Conversation

@BitcoinTsunami
Copy link

@BitcoinTsunami BitcoinTsunami commented Dec 7, 2020

The function ComputeTimeSmart in wallet.cpp assume that transaction are discovered in the right order.
Moreover the 'smarttime' determination algorithm is coded with realtime scenario in mind and not rescanning of old block.

The functional test demonstrate that if the user import a wallet, then rescan only recent history, and then rescan the entire history, the older transaction discovered would have an incorrect time determination.
In the context of rescanning old block, the only time value that as a meaning is the blocktime.

That's why I've fixed the problem with a simple separation between rescanning of old block and realtime time determination. The fix is written to have no impact on every realtime scenario and only impact the behaviour during a rescanning process.
This PR Fixes #20181.

To be fair, I don't think that this bug could be triggered with the wallet GUI, because it always proceed with a proper rescan.
But RPC API provide the possibility to trigger it. I've discovered it, because Specter desktop v0.10.0 was impacted. (cryptoadvance/specter-desktop#680).

@DrahtBot
Copy link
Contributor

DrahtBot commented Dec 7, 2020

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #21206 (refactor: Make CWalletTx sync state type-safe by ryanofsky)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@jonatack
Copy link
Member

jonatack commented Dec 7, 2020

If this PR is intended to address issue #20181, can you add "Closes #20181" or "Fixes #20181" to the PR description instead of the PR title?

@BitcoinTsunami BitcoinTsunami changed the title wallet, bugfix: fix ComputeTimeSmart function during rescanning process. fixes #20181 wallet, bugfix: fix ComputeTimeSmart function during rescanning process. Dec 7, 2020
@BitcoinTsunami
Copy link
Author

Is it right this way ?

@jonatack
Copy link
Member

jonatack commented Dec 7, 2020

Yes, thank you.

@sipa
Copy link
Member

sipa commented Dec 7, 2020

Concept ACK.

Copy link
Member

@jonatack jonatack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK

@BitcoinTsunami
Copy link
Author

Thanks for all the feedback. I'll correct everything and ask question if needed.

@decryp2kanon
Copy link
Contributor

Concept ACK. Thanks!

@jonasschnelli
Copy link
Contributor

Concept ACK - code looks good (haven't looked at the tests). Code is relatively simple, basically just an if to bypass the smart-time calculation.

@BitcoinTsunami
Copy link
Author

I've push a new version with the rework marked as resolved done.
What is not done yet:

  • remove hardcoded pubkey in the functionnal test : wait for feedback on the way to do it to limit useless code.
  • update ComputeTimeSmart doxygen documentation : wait for feedback on my proposal in comment.
  • update AddToWalletIfInvolvingMe doxygen documentation : wait for feedback on my proposal in comment.

@BitcoinTsunami
Copy link
Author

I've push the full rework and add documentation update proposal in a separated commit.

@luke-jr
Copy link
Member

luke-jr commented Dec 11, 2020

Weak NACK. Seems like a hacky "fix" for what is pretty much a user error...?

@BitcoinTsunami
Copy link
Author

BitcoinTsunami commented Dec 12, 2020

Thanks for your feedback.

I'll try to give a broader view of the problem and of my fix, from my perspective.

First of all, the current behaviour seems to be present for years, and no recent modification has changed this.
It's rare that users encounter this one and there's clearly no emergency here.

The issue that triggered my investigation is this one : #20181
But by looking at open issue, I've now found two more open issue on Github that should be related to the same behaviour :
#11703
#6662
And one pull request abandoned due to the lack of activity :
#12024
This pull request proposed in 2018 by @achow101 is more radical than the way I fix it, because it remove the SmartTime computation, but this raises an interesting point for me on which I would like to come back later.

The thing is that when someone encounters this behavior from Bitcoin Core wallet, it clearly has no clue on what's going on, because it's hard to reproduce it and the date seems so random and nonsense, that there's no easy way to request for similar problems on search engines. And when you try to investigate it, nothing points you in the right direction.

If it's considered as an error from the RPC API user to proceed first with a recent scan, and after with an older one, it must be documented in the RPC API Reference.
And maybe it's the right solution for this problem : simply inform that rescanblockchain function should be manipulated with some rules in mind.
In that case, the documentation should precise that a miss handling of this method is fatal. Because if the API user proceeds to an improper rescan, there's no way to correct transaction datation later : the only way is to destroy the wallet.

Here why I think it might be a good idea to fix it : the API RPC users are mainly teams working around Bitcoin Core to enrich the ecosystem with more tools.
This part of the ecosystem is far more experimental (and less strict) than Bitcoin Core itself, and that's a good thing from my point of the view.
The specter desktop team has exploited the RPC rescanblockchain function to enable faster wallet import :

  • If it's a segwit wallet, the rescanning process starts automatically on the first segwit enable block.
  • During the backup process, the first transaction blockheight is saved, to enable auto optimized reimport of this wallet later.
  • If the user is a power user and knows the height of his first transaction, it could restart manually from here to restore faster his wallet.

I found all these ideas interesting for a faster and better user experience and guess what, they had a bug that broke my wallet, and pushed me to look at the problem.
This type of experimentation has not his place in the Bitcoin Core wallet, but is, for me, a net positive for the ecosystem.
We could imagine others optimisation for an external wallet project to improve the user experience with this RPC rescanblockchain function, for example : start with a fast utxo rescanning process with UI blocked, then unblock the interface and rescan only the current year, then the previous one, then the previous one,...
From an user perspective, I think that the oldest transactions are way less important than the more recents one.
It's just some random thought, but my core point is that RPC API functions are not going to be used the way the internal wallet uses them, and it's a good thing because it enables leveraged creativity outside the Bitcoin Core project and high quality and reference code for the Bitcoin Core wallet.
That's why I think it might be good to enable a non restricted usage of RPC rescanblockchain function between the two points (start_height and stop_height), even if it's not used by the wallet internally.

Now let's go back to the fix itself.
I've tried to limit the impact on my fix on the existing code, but I'll try to give a broader view of the way I've thought about it to feed the conversation.
As I've said in the description of my PR : In the context of rescanning old blocks, the only time value that has a meaning is the blocktime.
That's why my first reaction was to look for a way not to call the ComputeTimeSmart function during a rescanning of old blocks, because you don't want to launch a complex time evaluation function in the first place when you already know that the blocktime is the only time that means something.
I then discovered that the ComputeTimeSmart call was a very deep call in the hierarchy of functions and that a lot of wallet logic was shared in the middle part whatever the context.
The separation of the code between block rescan processing and realtime processing would have been an incorrect approach, due to this middleware logic.
The main wrong thing here, to me, is to compute the time of a specific context (realtime processing) in a shared function called deep in the hierarchy.
A good way to fix it, is to push this responsibility outside the hierarchy (at least near the surface of the hierarchy).
But this way will require a bigger refactoring for the wallet.cpp file. If someone finds this interesting, I could try to write a proof of concept in a branch outside a pull request. But this patch would have a far bigger impact on the code.

Finally this PR approach was a way of keeping the patch simple while preserving functionality.
But If this seems too hacky, there's 3 other ways :

  • Keep it as it's today, don't touch the wallet code, but update the RPC API Reference to explain restrictions on rescanblockchain function.
  • Make a bigger refactoring to push ComputeTimeSmart responsibility in the right place in the functions hierarchy.
  • Fix it like achow101 in his previous PR, by removing the part of ComputeTimeSmart that is only valid in a realtime context. This way, ComputeTimeSmart has not to be moved because it's not context dependent, but you've restricted a part of the current functionality. It's clearly an interesting approach too, from my point of view because it removes the root cause of the misplacement of the function in the hierarchy call.

Happy to pursue the discussion here.

@kristapsk
Copy link
Contributor

Seems like a hacky "fix" for what is pretty much a user error...?

How is rescanning blockchain a user error?

@luke-jr
Copy link
Member

luke-jr commented Dec 14, 2020

Manually scanning out of order, to be specific, is a user error because the wallet is assumed to always be scanned up to its sync-point.

@sipa
Copy link
Member

sipa commented Dec 14, 2020

@luke-jr I'd agree the wallet is written with that assumption, but I also don't see a downside to changing the behavior when the wallet can know it's dealing with an unusual situation. I haven't dug into the details, but I suspect this change makes things match expectations in strictly more situations.

@decryp2kanon
Copy link
Contributor

Concept ACK

Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK ace3f77597a04785c2ab98708f31970934197044. This is great! A trivial fix replacing currently insane behavior (using latest transaction times to estimate times of transactions in old blocks) with the simple and obvious behavior of using the block timestamp as the transaction timestamp.

The only suggestion I would make is to maybe use block max time instead of block time for these transactions, to ensure that times within the set of rescanned transactions are monotonically increasing. (You can get the max time by chaining the FoundBlock expression FoundBlock().time(blocktime).maxTime(block_max_time)).

I will say I don't actually understand Luke's objection to this, and would be curious for an explanation. I understand (despite disagreeing with) the idea that it may be good to alert users that something is wrong by showing bad timestamps when they do something that the wallet can't support. But it's not clear what the exact user mistake is here, and if there is a better way to alert users about the bad usage, or just add code supporting the usage.

@ryanofsky
Copy link
Contributor

Another thing I think could be done to improve this is to avoid adding these transactions to wtxOrdered when rescanning_old_block is true:

wtx.nOrderPos = IncOrderPosNext(&batch);
wtx.m_it_wtxOrdered = wtxOrdered.insert(std::make_pair(wtx.nOrderPos, &wtx));

If we want to exclude these transactions from smart time calculation, better to fully exclude instead of half-excluding them. Maybe there is a way to adjust the smart time calculations to include them, but that would require more thought and might not be worth the effort.

Another code cleanup that would be nice to see is moving nTimeReceived, nOrderPos, and wtxOrdered updates inside the ComputeTimeSmart function instead of doing them before. ComputeTimeSmart is only called one place and the current way time calculation is split up outside and inside this function is probably unnecessarily confusing.

IMO, previous ACK is still ok and switching from block time to max block time, excluding these transactions from wtxOrdered, and making ComputeTimeSmart more self-contained are all optional improvements that could be followed up later.

@BitcoinTsunami
Copy link
Author

I tried this pull request today, and I still have the same problem. It's pulling in the wrong date after a rescan. I am now attempting to reload the entire blockchain. I need this fixed for my IRS audit.

You must encounter a similar problem as the one that make me code this PR. You can't recovered a wallet with incorrect datation. It's not possible yet, even with this PR : the only way is to destroy and recreate it. This PR only prevent the datation to be corrupted in the first place.

Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK bb6bc61bea3098d4dab895a73c2c0f7b8d293e2b. No significant changes since last review other than rebase, optimizing test, and switching to max block time instead of block time.

Thanks for looking into txordered suggestions! It'd be nice to clean up this code more but I can see how listtransaction constrains changing txordered, and I can see upsides of keeping ComputeTimeSmart functional even I might still prefer to see time logic less spread out over two method. There's space for more improvements later but I think this PR is clearly better than status quo.

One thing I don't know is what problem may cause the other reported issue #20591 (comment). Might be worth filing separate bug if there are known steps to reproduce.

Copy link
Contributor

@promag promag left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about nTimeSmart = min(nTimeSmart, block_max_time) and avoid rescanning_old_block argument?

@BitcoinTsunami
Copy link
Author

BitcoinTsunami commented Mar 9, 2021

How about nTimeSmart = min(nTimeSmart, block_max_time) and avoid rescanning_old_block argument?

Could you elaborate where do you imagine using it ?

Because min function between nTimeSmart and blocktime is already present in the current Bitcoin Core code and doesn't prevent the original problem.

I'm refering to this line :
nTimeSmart = std::max(latestEntry, std::min(blocktime, latestNow));

If you imagine introduce another minimum function after the previous one, you'll invalidate the maximum function with lastestEntry and change the original behavior of the ComputeTimeSmart function.
In the past achow101 has already purpose a previous PR with this type of approach (#12024). It's an elegant way to solve the issue, but it restrict existing functionality.

This PR solve the issue without removing existing functionality.

@BitcoinTsunami
Copy link
Author

Code review ACK bb6bc61. No significant changes since last review other than rebase, optimizing test, and switching to max block time instead of block time.

Thanks for looking into txordered suggestions! It'd be nice to clean up this code more but I can see how listtransaction constrains changing txordered, and I can see upsides of keeping ComputeTimeSmart functional even I might still prefer to see time logic less spread out over two method. There's space for more improvements later but I think this PR is clearly better than status quo.

One thing I don't know is what problem may cause the other reported issue #20591 (comment). Might be worth filing separate bug if there are known steps to reproduce.

Thanks ryanofsky for your feedback and your code review !

For the [#20591 (comment)], I need some confirmation from @charlierlee on my analysis :
I think that @charlierlee encountered an incorrect datation in a wallet.
After that, he searched for a solution online, found this PR and try it with a rescan of the wallet.
His incorrect datation was not corrected after that and he give his feedback here.

My point is that in that case : this PR won't save you because you can't recover an incorrect datation. It's an unrecoverable state. I mention in a previous comment "there's no way to correct transaction datation later : the only way is to destroy the wallet.". The rescan function never change a previous datation (even with a full rescan). That's why I think this PR is important, because it prevent the bad datation to occured in the first place.

@maddadder
Copy link

For the [#20591 (comment)], @BitcoinTsunami yes that is correct.

@ryanofsky
Copy link
Contributor

@promag did want to follow up on #20591 (comment)?

To me this PR seems like a straightforward improvement that gets the rescan context taken into account in a simple way and adds test coverage. I'm sure there are other ways it could be improved beyond this, but I think this change should only help make future improvements and not get in the way.

@BitcoinTsunami
Copy link
Author

Same code. I've just rebased from the latest master code to resolve conflict.

Copy link
Contributor

@meshcollider meshcollider left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK b92d5522782a8d56f71f9ed4d04dbb068b9d65e6

Verified the test fails on current master but passes with this PR. Nits inline are non-blocking. Will merge if @ryanofsky or someone else re-ACKs.

Thanks @BitcoinTsunami!

Copy link
Member

@jonatack jonatack left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review re-ACK b92d5522782a8d56f71f9ed4d04dbb068b9d65e6

Edit, also re-verified the debug build is clean and the test passes on the branch and fails on master.

2021-09-28T10:46:57.258000Z TestFramework (INFO): Check transactions time after restoration
2021-09-28T10:46:57.260000Z TestFramework (ERROR): Assertion failed
  File "/home/jon/projects/bitcoin/bitcoin/test/functional/wallet_transactiontime_rescan.py", line 154, in run_test
    assert_equal(tx['time'], cur_time + ten_days)
AssertionError: not(1635418013 == 1633690013)

Happy to re-review quickly for the suggestions.

Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code review ACK b92d5522782a8d56f71f9ed4d04dbb068b9d65e6. No changes since previous review other than rebase (conflicts on nearby lines)

@BitcoinTsunami
Copy link
Author

Thanks meshcollider and jonatack for your feedback.
All your recommendations are included in the new code.

@jonatack
Copy link
Member

ACK 240ea29 per git diff b92d552 240ea29, re-verified rebase to latest master + debug build clean + new test passes on the branch and fails on master, only change since my review a few hours ago is incorporation of latest review suggestions

Copy link
Contributor

@meshcollider meshcollider left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

re-utACK 240ea29

@meshcollider meshcollider merged commit 6a5381a into bitcoin:master Sep 28, 2021
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Sep 28, 2021
…ring rescanning process.

240ea29 doc: update doxygen documention of ComputeTimeSmart() and AddToWalletIfInvolvingMe() regarding rescanning_old_block parameter (BitcoinTsunami)
d6eb39a test: add functional test to check transaction time determination during block rescanning (BitcoinTsunami)
07b44f1 wallet: fix ComputeTimeSmart algorithm to use blocktime during old block rescanning (BitcoinTsunami)

Pull request description:

  The function ComputeTimeSmart in wallet.cpp assume that transaction are discovered in the right order.
  Moreover the 'smarttime' determination algorithm is coded with realtime scenario in mind and not rescanning of old block.

  The functional test demonstrate that if the user import a wallet, then rescan only recent history, and then rescan the entire history, the older transaction discovered would have an incorrect time determination.
  In the context of rescanning old block, the only time value that as a meaning is the blocktime.

  That's why I've fixed the problem with a simple separation between rescanning of old block and realtime time determination. The fix is written to have no impact on every realtime scenario and only impact the behaviour during a rescanning process.
  This PR Fixes bitcoin#20181.

  To be fair, I don't think that this bug could be triggered with the wallet GUI, because it always proceed with a proper rescan.
  But RPC API provide the possibility to trigger it. I've discovered it, because Specter desktop v0.10.0 was impacted. (cryptoadvance/specter-desktop#680).

ACKs for top commit:
  jonatack:
    ACK 240ea29 per `git diff b92d552 240ea29`, re-verified rebase to latest master + debug build clean + new test passes on the branch and fails on master, only change since my review a few hours ago is incorporation of latest review suggestions
  meshcollider:
    re-utACK 240ea29

Tree-SHA512: 514b02e41d011ddfa325f5e8080b93800e1ea4ed5853fa420670a6ac700e8b463000dbea65f8ced8565cfb950c7f51b69154034dcb111e67aca3b964a0061494
@bitcoin bitcoin locked and limited conversation to collaborators Oct 30, 2022
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Wrong time when importing transactions with rescanblockchain RPC command