Problem: Tendermint chain migration plan is incomplete.#68
Problem: Tendermint chain migration plan is incomplete.#68kansi merged 14 commits intobigchaindb:masterfrom
Conversation
| votes_recieved=<Sum_of_votes_recieved> | ||
| votes_allocated=<Sum_of_votes_allocated_in_election> | ||
| network_size=<Total_network_power> | ||
| ``` |
There was a problem hiding this comment.
I think it is possible to definitely state that the migration concluded or not. We can just state if the migration has concluded or not.
There was a problem hiding this comment.
Makes sense. Here I wanted it to look similar to how it looks for upsert-validator. We can add an additional election_concluded= line for both, what do you think?
There was a problem hiding this comment.
Thinking about conclusion (also related to this comment), we need to display election status, one of started, concluded, failed.
Let's agree on the names for these statuses and document them here and in BEP-21.
There was a problem hiding this comment.
Instead of started I would suggest ongoing? i.e. once an election transaction has been committed its status could be described as ongoing
There was a problem hiding this comment.
Ok, I will add an election_status= line but there is no need to enumerate the statuses here. You can just do it in BEP-21.
|
|
||
| A CLI command can be used to instruct the node to stop building blocks at a chosen height. | ||
|
|
||
| To make sure no blocks are committed after the agreed height, we propose an election process: |
There was a problem hiding this comment.
If the chosen height is reached before the election concludes, then I guess the election should be shut down or rendered obsolete somehow. The spec should say exactly what should happen.
There was a problem hiding this comment.
Following up on that, when an election concludes before the proposed migration height h and there are no new transactions in the system such that blockchain can reach height h then what is supposed to happen?
There was a problem hiding this comment.
Good points! I will document these cases.
|
|
||
| #### 2. Replay the blockchain | ||
|
|
||
| In order to replay the chain up to the migration height, we need to either keep the old Tendermint version running or skip replaying the corresponding part of the chain = archive the chain. We consider the former to be a huge hassle so we conceive an archiving scheme further in this chapter. |
There was a problem hiding this comment.
The last sentence ends ambiguously with, "...so we conceive an archiving scheme further in this chapter." Conceive? Chapter?
Maybe, "...so we describe an archiving scheme elsewhere in this BEP."
|
|
||
| Validators have to install and launch the new version of Tendermint. They need to prepare a new `genesis.json`. The new genesis file has to contain the validator set, the application hash (both at the migration height), and the identifier of the new Tendermint chain (`chain_id`). | ||
|
|
||
| The application hash is calculated by computing the Merkle tree root of the current UTXO. |
There was a problem hiding this comment.
That's not how the application hash is calculated today, but I guess that's an idea for how to calculate it in the future?
Maybe the details of how to compute the application hash should be left out of this BEP, because that's one of the things that can change over time. Vanshdeep started another BEP (in a pull request) to specify how to compute the application hash, in the future.
There was a problem hiding this comment.
This way of computing hash was described here prior to this patch. We do need to know how to compute the hash to be able to migrate.
There was a problem hiding this comment.
You definitely need to know how to compute the application hash to migrate, but the method of computing it can change from one chain to the next.
There was a problem hiding this comment.
This BEP should advise on a reasonable way to hash to use for the implementation. I think, the proposed way is more reasonable than the existing one. I will note it down that it might change.
|
|
||
| #### 4. Join the network after a migration | ||
|
|
||
| When a validator joins the network after a migration, it does not suffice for him to know `genesis.json` - he also needs the archive of the old chain (heights from 0 to the latest migration height). |
There was a problem hiding this comment.
There might be more than one migration, so the initial chain might go from height 0 to 33277, the second chain might go to height 88234, the third chain...
|
|
||
| To make sure no blocks are committed after the agreed height, we propose an election process: | ||
|
|
||
| - The initiator creates an election, which includes a block height. |
There was a problem hiding this comment.
Is the purpose of adding specific block height to add a kind of timeout for the election process? So that if the election concludes before or at the given block height the network halts and prepares to migrate or else the election is rendered obsolete?
There was a problem hiding this comment.
I will update this part - there is no need to ask for a height there.
| $ bigchaindb migration apply <dump> | ||
| ``` | ||
|
|
||
| After the dump is applied, Tendermint can be started. |
There was a problem hiding this comment.
what happens if tendermint is started before the migration dump is applied? it will still sync with the current state of the new chain. I supposes tendermint can be started before or after the migration dump is applied but the bigchaindb server can only be started after the dump has been applied
There was a problem hiding this comment.
Both need to be started after the dump is applied because without the dump you are not able to validate incoming transaction properly. I will try to make it explicit in the BEP.
| Validators have to install and launch the new version of Tendermint. They need to prepare a new `genesis.json`. The new genesis file has to contain the validator set, the application hash (both at the migration height), and the identifier of the new Tendermint chain (`chain_id`). | ||
|
|
||
| The application hash is calculated by computing the Merkle tree root of the current UTXO. | ||
| The application hash is calculated by computing the Merkle tree root of the current UTXO. Note that hash calculation is a subject to design in a separate BEP and can be changed in the future. |
There was a problem hiding this comment.
I don't know understand why you kept a specific method of computing the application hash in this BEP.
Moreover, we don't calculate the application hash like that today, and I'm not convinced we'll compute it that way in the future. There's a lot more to the current BigchainDB state than the UTXO set. The full state includes all asset.data and metadata information, for example.
There was a problem hiding this comment.
I used UTXO root since it can be used to do some useful verifications in the future, unlike the current hashing.
It's not indeed important at the moment because we will instruct the users to take the hash and the dump from the same validator - we do not anyway offer tools to verify the dump-hash integrity yet. So I can as well just recommend to use the already implemented way.
| When a validator joins the network after a migration, it does not suffice for him to know `genesis.json` - he also needs the archive of the old chain. | ||
| There might be more than one migration, so the initial chain might go from height 0 to 33277, the second chain might go to height 88234, the third chain can go from 88234 up to the recent height. In this case the validator has to get an archive containing archived blocks from height 0 to height 88234. | ||
|
|
||
| Such a validator needs to take a snapshot of the old chain from an existing member of the network. The snapshot can be created via a CLI command: |
There was a problem hiding this comment.
In the first iteration of this implementation, we can just recommend the user to use mongodump and mongorestore for achiving and loading data.
There was a problem hiding this comment.
Yes, we need to be explicit about the importance of keeping/publishing the dump together with genesis.json.
There was a problem hiding this comment.
I think Tendermint stores more information than MongoDB stores, such as all the block headers. Shouldn't that be "dumped" as well?
There was a problem hiding this comment.
After a migration, Tendermint starts from scratch so no - there is nothing to dump.
|
|
||
| After the snapshot is applied, Tendermint can be started. | ||
|
|
||
| Note that although the new node can join the network and work to some extent without applying the snapshot, it is not able to properly validate transactions so applying the snapshot is a must. |
There was a problem hiding this comment.
What effect will this have on the network? I suppose this node be treated as byzantine node?
There was a problem hiding this comment.
Yes, it will. Without sufficient data, it may claim invalid transactions to be valid.
|
I'm still a bit unclear about how the initial Tendermint state in a new chain gets set to the final state of the previous chain, but I suppose those details will get resolved once we start implementing (and we can revise this BEP to reflect what we do). I will approve this PR. |
The initial Tendermint state is described by |
|
|
||
| #### Migration election specs | ||
|
|
||
| We introduce two transaction specs, with operation types [`TENDERMINT_MIGRATION_ELECTION`][spec_tendermint_migration_election] and [`TENDERMINT_MIGRATION_ELECTION_VOTE`][spec_tendermint_migration_election_vote], for the purpose of implementing migration elections. `TENDERMINT_MIGRATION_ELECTION` is an extension of `CREATE`; `TENDERMINT_MIGRATION_ELECTION_VOTE` is an extension of `TRANSFER`. |
There was a problem hiding this comment.
I wouldn't call those two things "specs." They're two new operation types, to be added to the existing BigchainDB Transactions Spec v2 (BEP-13).
Yes, thanks. |
| title: Tendermint Migration Election Schema - Updgrade Tendermint after a breaking change in the chain | ||
| required: | ||
| - operation | ||
| - asset |
There was a problem hiding this comment.
The asset field is not defined?
Solution: Draft a migration plan. Address the problem of switching Tendermint chains. Describe how to replay the chain after a migration. Mention the implications on HTTP API.
Also, include election status into the command output doc.
Add some clarifications.
Rename snapshot -> archive. Use MongoDB tools. Document how to manage and validate archives.
97a49d3 to
c8f8170
Compare
|
Why is there is no transaction spec for creating a migration proposal? |
It matches |
|
But then what is the value of the asset field? |
Follows the same constraints as in |
|
|
||
| BigchainDB operators need a convenient way to do the following: | ||
|
|
||
| 1. Stop building blocks when the old Tendermint chain is at a jointly chosen height. |
There was a problem hiding this comment.
This line makes it seem like the operators have to agree on a migration height (e.g. 4592) ahead of time, but later on, you make it clear that the migration height is just whatever the current height happens to be when the election is concluded. Maybe change this line to read:
- Stop building blocks for the old Tendermint chain at the same height as all the other operators.
| ``` | ||
| votes_recieved=<Sum_of_votes_recieved> | ||
| votes_allocated=<Sum_of_votes_allocated_in_election> | ||
| network_size=<Total_network_power> |
There was a problem hiding this comment.
Is this line correct? Network size and total network power don't seem like the same thing to me.
|
|
||
| Therefore, after the migration election is concluded, each validator has to create a chain archive and keep it together with `genesis.json`. If `genesis.json` is published somewhere, the archive should be published alongside. | ||
|
|
||
| To generate the archive, validators can use the `mongodump` command (comes together with `mongod`): |
There was a problem hiding this comment.
We now know that the blockchain stored by MongoDB and the blockchain stored by Tendermint are different because Tendermint also stores invalid transactions. Does this matter? Don't the invalid transactions go into calculating some block header hashes on the Tendermint side?
In other words, shouldn't we be archiving Tendermint's LevelDB? Maybe just archive both MongoDB and LevelDB?
(We're considering replacing LevelDB with one MongoDB for everyone. If we do that, this will all become much easier.)
There was a problem hiding this comment.
No, we do not care about Tendermint storage here by design. After the migration, Tendermint starts from scratch while BigchainDB simply keeps the old data for querying but not for replay.
There was a problem hiding this comment.
Oh, I see, it makes sense now. Maybe you could write that in BEP-42 somewhere.
|
Edit: It seems that chain migration spec is exactly the same as the base election transaction spec. So |
| To offer a convenient way to get the data, we propose a CLI command: | ||
|
|
||
| ``` | ||
| $ bigchaindb status |
There was a problem hiding this comment.
This should be updated to bigchaindb election status <election_id> and the expected output (apart from status=) should be defined.
|
|
||
| #### Migration election specs | ||
|
|
||
| We introduce a new [transaction operation](./tendermint_migration_election.yaml), `TENDERMINT_MIGRATION_ELECTION`, for the purpose of implementing migration elections. `TENDERMINT_MIGRATION_ELECTION` follows the base election spec documented in [the TEP](../18). |
There was a problem hiding this comment.
./tendermint_migration_election.yaml is not accessible
|
|
||
| Note that new validators joining a permissioned network inherently have to trust the place they are getting data from - there is no generic way to assess the validity of `genesis.json` and the archive upon joining the network. | ||
|
|
||
| At the moment, there are no tools to verify the integrity between `genesis.json` and the archive. Such tools are subject to work on separately. In the future, one might need less trust in particular parties by downloading `genesison.json` and the archive from two different places and verifying the integrity. At the moment it is strongly recommended to download them from a single most trusted place. |
There was a problem hiding this comment.
genesison.json -> genesis.json
| The initiator executes: | ||
|
|
||
| ``` | ||
| $ bigchaindb election chain-migration new --private-key /home/user/.tendermint/config/priv_validator.json |
There was a problem hiding this comment.
according to the current description and implementation the command is bigchaindb election new chain-migration ...
Solution: Draft a migration plan. Address the problem of switching Tendermint chains. Describe how to replay the chain after a migration. Mention the implications on HTTP API.