
Conversation

@codablock (Author) commented Sep 21, 2018:

This is my current state of the LLMQ implementation. It's for review only and not meant for merging for now. It also contains #2296. A few things remain to be done, including in regard to compatibility with the actual DIPs: some things are not implemented yet, and in other cases it turned out that the DIPs themselves need to be modified.

If you want to experiment with it, the ./qa/rpc-tests/quorums.py script is available; it locally starts up a few MNs and leaves them running until you kill the script. It generates the necessary blocks to fund the MNs and to activate DIP3. After all MNs have been started, sporks are put in place so that you can start testing LLMQs. At this point, no LLMQs have been created yet.

Regtest has a single LLMQ type configured, which creates an LLMQ of size 10 every 24 blocks. Give the DKG enough time (a second per block) when you generate blocks yourself.
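
For orientation, below is a minimal sketch of what such a per-type LLMQ parameter set could look like. The struct name, field names and the threshold value are illustrative assumptions, not taken from this PR.

```cpp
// A minimal sketch of the regtest LLMQ parameters described above. The struct and
// field names are illustrative assumptions, not the exact definitions used in this PR.
struct LLMQParamsSketch {
    int size;         // number of members in the quorum (10 on regtest)
    int threshold;    // signing threshold; placeholder value below, not taken from the PR
    int dkgInterval;  // a new DKG starts every N blocks (24 on regtest)
};

static const LLMQParamsSketch regtestLLMQSketch{10, 6, 24};
```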

After an LLMQ has been activated/mined, send some TXs around to see LLMQ signing happen.

Connect to the MN network with either RPC or (as I do) a dash-qt instance with this configuration:

datadir=<mydatadir>
regtest=1

# needed if you want to control sporks from dash-qt
sporkkey=cP4EKFyJsHT39LDqgdcB43Y3YXjNyjb5Fuas1GQSeAtjnZWmZEQK

# listen as this dash-qt instance will also act as a MN
listen=1
bind=0.0.0.0
port=31001
externalip=127.0.0.1

allowprivatenet=1

# This one might be different on your machine, check out the running MNs and their configs (somewhere in /tmp) to find the correct port
addnode=127.0.0.1:12001

# Activates the budget system faster (probably not needed, but then it must be removed from quorums.py as well)
budgetparams=240:100:240

# Let's watch all quorums
watchquorums=1

# Let's run dash-qt as masternode
masternode=1

# probably not needed for these tests? We immediately activate DIP3 which then requires BLS keys
masternodeprivkey=cUP3knDebUNdFDRNAivB1Qcw4SQscmQgv7hUTYnuiwk39r3wDieE

######
# chia BLS12-381 (reversed)
#masternodeblsprivkey=3b7724d56e02f7948db20bc85e8b72c2fa4e0febd2d8b9315200835f1f54e5cf

@codablock force-pushed the llmq branch 2 times, most recently from bf36161 to 1008a2c on September 21, 2018 at 15:45.
@PastaPastaPasta (Member) left a comment:

Wow, that was a lot. I'm sorry... Lots of formatting fixes. A few (maybe) actual code change requests.

Inline review comments from the same review (the associated code context is not shown here):

- \n
- \n
- same line or brackets
- same line or brackets
- \n
- src/util.h (Outdated): brace on newline for namespaces
- alphabetize

@codablock (Author) commented Nov 17, 2018:

Just force-pushed my current version. This version is now based on the current develop branch (v13).

A lot has changed in the code, especially the way I propagate and validate signature shares between LLMQ members. In the previous version, I was directly pushing sigshare batches to intra-quorum connected nodes, without checking whether they already had these shares. I did this because the classical INV system would already send around messages of at least 32 bytes in size, so I assumed directly sending the shares themselves would actually reduce the overhead of the INV system. It turned out it still produces too much overhead and chatter between nodes, especially now that we have switched to BLS12-381 with 96 byte signatures. With this switch, even the classical INV system would probably perform better than what I did initially (direct push).

Instead of switching to the INV system, I implemented a specialized inventory system for signature shares. It should perform MUCH better and with much less overhead than the general-purpose INV system.

Instead of announcing INV items to the other nodes, LLMQ members will now send CSigSharesInv messages to other members. Such a message contains some information to identify the signing session to which the shares belong, plus a bitset that represents the shares being offered. Each bit represents a share from an LLMQ member; for example, the bit at index 10 means it's a share from member 10.
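
A rough sketch of such an announcement message is given below; the field names, types and especially the session identifier are stand-ins and may differ from the actual CSigSharesInv in this PR.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Rough sketch of the announcement message described above. The real CSigSharesInv may
// differ in field names, types and serialization; sessionId is a stand-in for however
// the signing session is actually identified.
struct SigSharesInvSketch {
    uint8_t llmqType{0};                  // which LLMQ type the session belongs to
    std::array<uint8_t, 32> sessionId{};  // identifies the signing session (stand-in for a hash)
    std::vector<bool> inv;                // bit i set => a share from quorum member i is offered

    explicit SigSharesInvSketch(size_t quorumSize = 0) : inv(quorumSize, false) {}

    bool IsSet(size_t memberIdx) const { return memberIdx < inv.size() && inv[memberIdx]; }
    void Set(size_t memberIdx) { if (memberIdx < inv.size()) inv[memberIdx] = true; }
    size_t Count() const { size_t c = 0; for (bool b : inv) c += b ? 1 : 0; return c; }
};
```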

Members now regularly (every 100ms) check which shares were announced by other members and which shares are missing locally, and request those from other members. They also check what other nodes requested and send them those shares. All communication leverages the CSigSharesInv object to keep overhead to a minimum and to request/announce as many shares as possible in one go. Shares are still sent via CBatchedSigShares to reduce overhead even further.
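
As a sketch of the "figure out what to request" step of that 100ms loop, assuming hypothetical helper and container names (the real code tracks more state, such as in-flight requests per peer and timeouts):

```cpp
#include <cstddef>
#include <vector>

// Assumed helper illustrating how a "what to request from this peer" bitset could be
// derived from the peer's announced inventory and what we already have or requested.
std::vector<bool> CalcSharesToRequest(const std::vector<bool>& peerAnnounced,
                                      const std::vector<bool>& weHave,
                                      const std::vector<bool>& requestedElsewhere)
{
    std::vector<bool> toRequest(peerAnnounced.size(), false);
    for (size_t i = 0; i < peerAnnounced.size(); i++) {
        // Only request a share the peer actually offered, which we don't have yet
        // and which we haven't already requested from another member.
        bool have = i < weHave.size() && weHave[i];
        bool requested = i < requestedElsewhere.size() && requestedElsewhere[i];
        if (peerAnnounced[i] && !have && !requested) {
            toRequest[i] = true;
        }
    }
    return toRequest;
}
```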

Also, in the old version I extensively used the CBLSWorker to speed things up in the handling of signature shares. I removed this and will later also remove its use in other places. It turned out that the overhead of moving BLS primitives around between threads destroys most of the performance gains, especially now that we have switched to BLS12-381, where the primitives got a lot larger.

Instead, I'm now leveraging (single-threaded) batched verification as much as possible. Incoming shares are put into a pending batch and then processed every 100ms. Verification is done in sub-batches to avoid DoS attacks (e.g. an attacker sending 100 valid shares and one last invalid one). I first try to batch together the shares from all nodes, and if that succeeds, all shares are accepted. If it fails, I fall back to per-node batching to figure out which node sent the invalid shares.
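
The control flow of that strategy could look roughly like the sketch below; PendingShare and VerifyBatch() are stand-ins for the actual batched BLS verification, not the PR's real types.

```cpp
#include <map>
#include <set>
#include <vector>

// Sketch of the "one big batch first, per-node batches as fallback" strategy described
// above. Only the control flow is meant to be illustrative.
struct PendingShare { /* member index, signature share, signed hash, ... */ };

template <typename NodeIdT, typename VerifyBatchFn>
std::set<NodeIdT> ProcessPendingShares(const std::map<NodeIdT, std::vector<PendingShare>>& pendingByNode,
                                       VerifyBatchFn VerifyBatch)
{
    // First attempt: verify all shares from all nodes as a single batch.
    std::vector<PendingShare> all;
    for (const auto& entry : pendingByNode) {
        all.insert(all.end(), entry.second.begin(), entry.second.end());
    }
    if (VerifyBatch(all)) {
        return {}; // everything valid, all shares can be accepted
    }

    // Fallback: verify per node to find out who sent invalid shares.
    std::set<NodeIdT> badNodes;
    for (const auto& entry : pendingByNode) {
        if (!VerifyBatch(entry.second)) {
            badNodes.insert(entry.first);
        }
    }
    return badNodes;
}
```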

I'll document all this in-code when I find time, as reviewing it will otherwise not be fun.

Another change is how I determine which nodes to announce shares to. Previously, I only sent shares to members determined by the intra-quorum connection selection algorithm. This had the disadvantage that it only announced shares to outgoing connections and did not take incoming connections from other members into account. As there is no easy way to determine which incoming connection is from an LLMQ member, I implemented something like: "if someone ever sends me a share belonging to my quorum, he must be interested in my shares as well, so I announce all the shares I have to him from now on". It doesn't necessarily mean it's really a member of the same LLMQ, but it's 99% likely.
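
A minimal sketch of that heuristic, assuming simplified stand-in types for node and quorum identifiers:

```cpp
#include <cstdint>
#include <map>
#include <set>

// Sketch of the "whoever sends me a share for my quorum is probably a member and wants
// my shares too" heuristic described above. NodeId and QuorumId are simplified
// stand-ins for the real types used in the PR.
using NodeId = int64_t;
using QuorumId = uint64_t;

std::map<NodeId, std::set<QuorumId>> nodesInterestedInQuorum;

void OnSigShareReceived(NodeId from, QuorumId quorum, bool weAreMemberOfThisQuorum)
{
    if (weAreMemberOfThisQuorum) {
        // From now on, announce all shares of this quorum to that peer, even though it
        // was not selected by the outgoing intra-quorum connection algorithm.
        nodesInterestedInQuorum[from].insert(quorum);
    }
}
```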

@codablock (Author) commented:
Also updated the example dash.conf to work with the changes we did in develop

Commit message excerpts from the pushed commits:

- When many nodes have to be interconnected, waiting for the handshake dramatically slows down the tests. This allows disabling the wait for the handshake.
- Useful when many sigs need to be deserialized and, at the same time, the hash of these is never used.
- Distributed Key Generation (DKG)
- Mining of final quorum commitments
- These are quite important, and waiting for 2 minutes when the first peer did not send it is not acceptable.
- IsMasternodeOrDisconnectRequested will internally lock cs_vNodes, so we must ensure correct lock ordering.
@codablock (Author) commented:

Closing this PR as the code has been merged through smaller PRs.

@codablock closed this on Jan 22, 2019
@codablock deleted the llmq branch on January 22, 2019 at 15:43
@UdjinM6 removed this from the 14.0 milestone on Mar 1, 2021