Reduce latency in network processing #3180
Conversation
I'd really rather be never sleeping (except when waiting for IO, then event driven). Otherwise the delay will still delay block propagation.
The problem is this isn't entirely about IO. SendMessages is actually largely a polling function with timers and other logic not dependent on IO.
I understand this is important and it should be validated and pulled. I took a look at the code and, as I understand it, it does the following:

Although I'm not particularly well versed in the bitcoin codebase, this commit has my ACK on a code basis. I have built and tested it on my laptop and a VM which I -connect=, and it synced the chain just fine.
src/net.cpp
Outdated
I suspect that this is being set erroneously given I'm getting 100% CPU pretty reliably and not when the patch is removed.
https://github.com/pstratem/bitcoin/blob/3c08b7a23fe3afc8025658316b99f750c9a90fbf/src/main.cpp#L3980
leaves vRecvMsg non-empty if the send buffer is full, which will cause an infinite-loop condition when a buffer fills up.
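The hazard described above can be reduced to a small simulation. This is a hedged sketch with hypothetical stand-in names (FakeNode, ProcessOneMessage, DrainLoopIterations), not the actual bitcoin code: if the handler skips its sleep whenever the receive queue is non-empty, but message processing bails out without consuming anything while the send buffer is full, the thread spins.

```cpp
#include <deque>
#include <string>

// Illustrative stand-in for a peer: a queue of received messages and a
// simulated "send buffer is full" condition.
struct FakeNode {
    std::deque<std::string> vRecvMsg;  // pending received messages
    bool fSendBufferFull = false;      // simulated full send buffer
};

// Consume at most one message. If the send buffer is full, bail out and
// leave vRecvMsg untouched -- this is the problematic early exit.
bool ProcessOneMessage(FakeNode& node) {
    if (node.vRecvMsg.empty()) return false;
    if (node.fSendBufferFull) return false;  // queue left non-empty
    node.vRecvMsg.pop_front();
    return true;
}

// Model the handler loop that skips sleeping while vRecvMsg is non-empty.
// Returns the number of iterations until the queue drains, capped so the
// demonstration terminates; with a full send buffer it always hits the cap,
// i.e. the loop busy-spins at 100% CPU.
int DrainLoopIterations(FakeNode& node, int cap) {
    int iters = 0;
    while (!node.vRecvMsg.empty() && iters < cap) {
        ProcessOneMessage(node);
        ++iters;  // no sleep here: non-empty queue means "keep going"
    }
    return iters;
}
```

With an open send buffer the loop terminates after one iteration per message; with a full one it only stops at the safety cap, which matches the 100% CPU symptom reported above.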
My assumption 1. turns out to have been wrong. This does not actually change the behavior of the node WRT how it handles requests with multiple getdata(block) in them. It simply drops out and leaves the balance of the requests till later, which will be at most 100ms later when ProcessMessages() is next called. You'll want to add another
The goal here is for the loop to process 1 item from vRecvGetData and/or 1 item from vRecvMsg per peer. Probably ProcessGetData should be moved outside of ProcessMessages entirely, but that is outside the scope of this patch (I think).
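The round-robin idea above can be sketched in a few lines. This is an illustrative model with invented names (Peer, RoundRobinPass), not the patch itself: each pass over the peers handles at most one queued getdata item and one queued message per peer, so a single busy peer cannot starve the others.

```cpp
#include <deque>
#include <string>
#include <vector>

// Minimal peer model: two work queues plus a record of what was handled,
// kept only so the example's behavior is observable.
struct Peer {
    std::deque<std::string> vRecvGetData;
    std::deque<std::string> vRecvMsg;
    std::vector<std::string> processed;
};

// One scheduling pass: at most one getdata item and one message per peer.
// Returns true if any peer still has work queued, so the caller knows
// whether to run another pass instead of sleeping.
bool RoundRobinPass(std::vector<Peer>& peers) {
    bool moreWork = false;
    for (Peer& p : peers) {
        if (!p.vRecvGetData.empty()) {
            p.processed.push_back(p.vRecvGetData.front());
            p.vRecvGetData.pop_front();
        }
        if (!p.vRecvMsg.empty()) {
            p.processed.push_back(p.vRecvMsg.front());
            p.vRecvMsg.pop_front();
        }
        moreWork |= !p.vRecvGetData.empty() || !p.vRecvMsg.empty();
    }
    return moreWork;
}
```

A peer with a deep queue takes several passes to drain, but every other peer gets serviced once per pass in the meantime, which is the fairness property being discussed.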
I see what you're getting at; I have made the suggested change.
ACK on the general idea of round-robin-processing requests. I haven't had time to code review or test.
ACK on approach and implementation. Haven't tested. Some commits can be squashed together, in particular the 3rd, which is a bugfix for the second.
Tested the patched version, still getting occasional 100% CPU. I think perhaps we should return a different value from ProcessMessages() if we do not want the calling routine to sleep; that allows us to be more explicit than checking that the queues are not empty. It's just difficult to reliably prove that there is no other way out of ProcessMessages() which leaves non-empty queues.
There are only 2 paths where ProcessMessages can run without consuming a message in vRecvMsg: either the send buffer is full, or the next message in the queue is not complete. I agree that a more general solution should be applied, but probably later as part of a larger reorganization of the networking code.
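The suggestion above — let ProcessMessages itself report whether to skip the sleep, instead of having the caller inspect queues — can be sketched as follows. Hypothetical names (NodeState, fSendFull); the point is that the "more work to do" decision lives in exactly one place, and both bail-out paths deliberately return "sleep":

```cpp
#include <deque>
#include <string>

// Illustrative node state: a receive queue and a full-send-buffer flag.
struct NodeState {
    std::deque<std::string> vRecvMsg;
    bool fSendFull = false;
};

// Returns true when the caller should run again without sleeping.
// The two paths that consume nothing (empty queue, full send buffer)
// both return false, so they can never cause a busy loop.
bool ProcessMessages(NodeState& node) {
    if (node.vRecvMsg.empty()) return false;  // nothing to do: sleep
    if (node.fSendFull) return false;         // blocked on send: sleep, don't spin
    node.vRecvMsg.pop_front();                // consume one message
    return !node.vRecvMsg.empty();            // more queued: skip the sleep
}
```

This makes the invariant checkable at the return statements rather than having to prove that no exit path leaves a non-empty queue behind.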
Rebase please?
Automatic sanity-testing: PASSED, see http://jenkins.bluematt.me/pull-tester/75ef87dd936cd9c84d9a9fd3afce6198409636c4 for binaries and test log.
Tested on OSX, behaves properly. Message handling should eventually be reworked to use boost::asio (never sleep, be I/O event driven as @gmaxwell suggests) and to use some sort of active queue management to mitigate denial-of-service attacks. But this code is better than what we've got, so I am going to merge.
Ah, I already had a similar change in my bitcoind for a year or so (though in my case, I simply reduced this 100ms to 10ms). I guess that's why blockchain.info always picked up so much stuff. I thought it may be because of the 20ms latency. It would explain the four blocks picked up by blockchain.info from 198.12.127.2, which I'm running w/ similar settings (just sending out far fewer transactions). Didn't think it'd have such an impact.
#2125 was raised ages ago to address this. Glad to see it finally implemented!
```cpp
LogPrintf("ProcessMessage(%s, %u bytes) FAILED\n", strCommand.c_str(), nMessageSize);
break;
}
```
We don't really need this while loop any longer, since we break out after the first iteration of it.
In order to sleep accurately, the message handler needs to know if _any_ node has more processing it should do before the entire thread sleeps. Rather than returning a value that represents whether ProcessMessages encountered a message that should trigger a disconnect, interpret the return value as whether or not that node has more work to do. Also, use a global fProcessWake value that can be set by other threads, which takes precedence (for one cycle) over the messagehandler's decision. Note that the previous behavior was to only process one message per loop (except in the case of a bad checksum or invalid header). That was changed in PR bitcoin#3180. The only change here in that regard is that the current node now falls to the back of the processing queue for the bad checksum/invalid header cases.
…1586)

* net: fix typo causing the wrong receive buffer size
  Surprisingly this hasn't been causing me any issues while testing, probably because it requires lots of large blocks to be flying around. Send/Recv corks need tests!
* net: make vRecvMsg a list so that we can use splice()
* net: make GetReceiveFloodSize public
  This will be needed so that the message processor can cork incoming messages.
* net: only disconnect if fDisconnect has been set
  These conditions are problematic to check without locking, and we shouldn't be relying on the refcount to disconnect.
* net: wait until the node is destroyed to delete its recv buffer
  When vRecvMsg becomes a private buffer, it won't make sense to allow other threads to mess with it anymore.
* net: set message deserialization version when it's actually time to deserialize
  We'll soon no longer have access to vRecvMsg, and this is more intuitive anyway.
* net: handle message accounting in ReceiveMsgBytes
  This allows locking to be pushed down to only where it's needed. Also reuse the current time rather than checking multiple times.
* net: record bytes written before notifying the message processor
* net: Add a simple function for waking the message handler
  This may be used publicly in the future.
* net: remove useless comments
* net: remove redundant max sendbuffer size check
  This is left-over from before there was proper accounting. Hitting 2x the sendbuffer size should not be possible.
* net: rework the way that the messagehandler sleeps
  In order to sleep accurately, the message handler needs to know if _any_ node has more processing it should do before the entire thread sleeps. Rather than returning a value that represents whether ProcessMessages encountered a message that should trigger a disconnect, interpret the return value as whether or not that node has more work to do. Also, use a global fProcessWake value that can be set by other threads, which takes precedence (for one cycle) over the messagehandler's decision.
  Note that the previous behavior was to only process one message per loop (except in the case of a bad checksum or invalid header). That was changed in PR #3180. The only change here in that regard is that the current node now falls to the back of the processing queue for the bad checksum/invalid header cases.
* net: add a new message queue for the message processor
  This separates the storage of messages from the net and queued messages for processing, allowing the locks to be split.
* net: add a flag to indicate when a node's process queue is full
  Messages are dumped very quickly from the socket handler to the processor, so it's the depth of the processing queue that's interesting. The socket handler checks the process queue's size during the brief message hand-off and pauses if necessary, and the processor possibly unpauses each time a message is popped off of its queue.
* net: add a flag to indicate when a node's send buffer is full
  Similar to the recv flag, but this one indicates whether or not the net's send buffer is full. The socket handler checks the send queue when a new message is added and pauses if necessary, and possibly unpauses after each message is drained from its buffer.
* net: remove cs_vRecvMsg
  vRecvMsg is now only touched by the socket handler thread. The accounting vars (nRecvBytes/nLastRecv/mapRecvBytesPerMsgCmd) are also only used by the socket handler thread, with the exception of queries from rpc/gui. These accesses are not threadsafe, but they never were. This needs to be addressed separately. Also, update comment describing data flow.
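The fProcessWake mechanism described in the commit message — other threads can cut the handler's timed sleep short, and the wake takes precedence for one cycle — can be sketched with a condition variable. This is an illustrative stand-in (the class and method names are invented), not the actual bitcoin implementation:

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

class MessageHandlerSleep {
    std::mutex m;
    std::condition_variable cv;
    bool fProcessWake = false;  // set by other threads to wake the processor

public:
    // Called by e.g. the socket handler when new work arrives.
    void Wake() {
        {
            std::lock_guard<std::mutex> lock(m);
            fProcessWake = true;
        }
        cv.notify_one();
    }

    // Sleep up to `ms` milliseconds, returning early if Wake() was (or is)
    // called. Returns true if woken explicitly, false on timeout.
    bool SleepOrWake(int ms) {
        std::unique_lock<std::mutex> lock(m);
        bool woken = cv.wait_for(lock, std::chrono::milliseconds(ms),
                                 [&] { return fProcessWake; });
        fProcessWake = false;  // the wake takes precedence for one cycle only
        return woken;
    }
};
```

Checking the flag inside the predicate (under the mutex) avoids the lost-wakeup race where Wake() fires between the handler's decision to sleep and the actual wait.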
Currently there is a 100ms sleep in the message processing thread that adds delay to processing messages.
This patch skips the sleep when there is additional work to do.
Additionally, ProcessMessages and ProcessGetData are modified so that their maximum run time is reduced; this improves the average latency peers see under high load. (This has a significant impact on high-latency networks during initial block synchronization.)
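The overall shape of the patched handler loop can be sketched briefly. This is a hedged model with invented names (P, ProcessAllPeersOnce, PassesToDrain): each pass does a bounded amount of work per peer and reports whether anything remains, and the fixed 100ms sleep happens only when no peer reported outstanding work.

```cpp
#include <deque>
#include <vector>

// Minimal peer model: just a queue of pending work items.
struct P {
    std::deque<int> q;
};

// One pass over all peers, handling at most one item each.
// Returns true if any work remains, i.e. the caller should skip the sleep.
bool ProcessAllPeersOnce(std::vector<P>& peers) {
    bool fMoreWork = false;
    for (P& p : peers) {
        if (!p.q.empty()) p.q.pop_front();
        fMoreWork |= !p.q.empty();
    }
    return fMoreWork;
}

// Count how many back-to-back passes (each one a skipped 100ms sleep)
// are needed before the thread would go back to sleeping.
int PassesToDrain(std::vector<P>& peers) {
    int passes = 1;
    while (ProcessAllPeersOnce(peers)) ++passes;  // would sleep 100ms otherwise
    return passes;
}
```

Under the old behavior each of those extra passes would have cost a 100ms sleep; with the patch they run immediately, which is where the latency reduction comes from.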