-
Notifications
You must be signed in to change notification settings - Fork 38.7k
[net] [validation] Call ProcessNewBlock() asynchronously #12934
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Thanks, yes I have looked through those. This is more focused on separation between net_processing (PeerLogicValidation) and validation, whereas those primarily tackle socket handling and other ConnMan stuff. I don't think there's anything here that's redundant or incompatible with those refactors |
|
cc @TheBlueMatt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@TheBlueMatt, could you take a quick look at the last commit ("Call ProcessNewBlock() asynchronously in a separate thread"), and give a concept ACK/NACK if you think the approach makes sense? (All the commits before the last one are pretty straightforward util code.)
It seems like a clean change that disentangles network & validation code and could make bitcoin more responsive to other network requests when blocks are coming in.
src/core/consumerthread.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you could replace this class with using WorkItem = std::function<void()>;. Less code and would make queue interface more generic.
src/core/consumerthread.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could maybe use TraceThread from util.h to make the thread name visible to the os.
src/core/consumerthread.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ShutdownPill seems a little complicated. What advantages does it provide over just adding bool m_active to ProducerConsumerQueue with a simple method to set it to false and cancel blocked Pop() calls?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah - it is pretty complicated. I didn't want to poke through the queue api just to enable shutdown - you'd have to have T Pop() potentially not return a T (i.e. throw an exception) - which seemed less desirable and maybe equally complicated.
Having said that - a lot of the complexity here is introduced to handle:
- being able to shutdown a specific ConsumerThread without shutting down others (in reality you probably only ever want to shut down all of them when you terminate the process)
- allowing for a queue with fewer slots than the number of threads servicing it (unlikely)
Only reason I allowed for these is so that the APIs to the queue work the way they sound like they should work and for unit test completeness. If I discard the above the code gets much simpler and it will just be broken for two use cases that seem pretty unlikely to ever happen right now (but you never know - and then maybe somebody gets frustrated one day).
src/validation_layer.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could use make_shared here, MakeUnique below
src/validation_layer.h
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably should take unique_ptr instead of raw argument to clarify ownership
|
Thank you for the review - one thing (general design related) to add to the discussion here: Since I've submitted this request - I happened to stumble upon two race conditions in validation that stem from concurrent calls to ProcessNewBlock (#12988, #13023) Explicitly simplifying the concurrency model hopefully reduces a bit the cognitive burden of future code changes in validation and I don't think makes anything substantially less efficient - much of validation is already single threaded (because of cs_main), and certain pieces fundamentally cannot be concurrent (i.e. connecttip). Validation is already complicated enough to understand on its own without worrying about concurrency. Seems like the clarity gains will outweigh the minor efficiency hit here - +the async api into should allow all the stuff around validation to be more easily be parallelized with less risk of inadvertently introducing a consensus bug. And it makes process separation / alternate p2p more natural if that's ever to be a thing in the future. If this design seems useful - my intention is to finish this pr up (some stuff around compact blocks that I still have to work through + refit the couple of places in rpc that call ProcessNewBlock) and explore subsequent prs to put a similar model in place around the mempool. I'd also like to explore feasibility for header processing. |
|
I do think the general approach is fine. It's not going to be really at all useful until we do a ton of locking cleanups in net_processing (it'll let us run ahead to the next message, which almost always requires cs_main, and we'll end up blocking on validation anyway). It's probably simpler than multiple net_processing threads (with the same cleanups required) which was most of my previous work, but they'll end up looking pretty similar on the net_processing end, we were gonna need this blocking-on-response logic either way. Should definitely get brought up at a meeting, though, to get wider feedback. @theuni probably has some thoughts, too. |
01d79ea to
0798b88
Compare
c3bd4ee to
bcab2cf
Compare
|
PR updated with latest commits, ready for review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(edited 2018-08-16)
utACK 3d6f038. Code looks good, and from the linked IRC discussion, this seems like a promising direction to move in. All of my comments are just suggestions which you should feel free to ignore.
- c6396d9 Implement a thread-safe FIFO (producer/consumer style) queue (1/9)
- 3f09adf Unit tests for ProducerConsumerQueue (2/9)
- 7f8a888 Add ConsumerThread: to consumer and operate on work from a ProducerConsumerQueue (3/9)
- 134d47a ConsumerThread unit tests (4/9)
- b471728 ValidationLayer() - interface for calls into block validation (5/9)
- 1c9f741 Call ProcessNewBlock() asynchronously in a separate thread from p2p layer (6/9)
- ce1d845 Replace all instances of ProcessNewBlock() with ValidationLayer.Validate() (7/9)
- 3778736 Limit available scope of ProcessNewBlock to ValidationLayer (move-only) (8/9)
- 3d6f038 Fix whitespace in test_bitcoin.cpp (whitespace,move-only) (9/9)
| m_producer_cv.wait(l, [&]() { return m_data.size() < m_capacity; }); | ||
| } | ||
|
|
||
| m_data.push_back(std::forward<TT>(data)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "Implement a thread-safe FIFO (producer/consumer style) queue" (d810614)
Should probably emplace_back to avoid creating a temporary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure i completely understand - data is already constructed when it is passed into the function - thought the most economical thing to do was std::forward which should use the move constructor when possible? is that different from what emplace_back would do?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Talked about this offline with @ryanofsky, @skeees; this form is equivalent (in terms of constructor calls) to what @ryanofsky initially suggested so no need to change.
| * | ||
| * @see WorkerMode | ||
| */ | ||
| template <typename T, WorkerMode m_producer_mode = WorkerMode::BLOCKING, WorkerMode m_consumer_mode = WorkerMode::BLOCKING> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "Implement a thread-safe FIFO (producer/consumer style) queue" (d810614)
I don't think it's a good idea for blocking and nonblocking defaults to be attributes of the Queue data structure, instead of arguments to (or variants of) the push and pop methods. Advantages to dropping these template arguments:
- Readability. It would be nice to be able to see if a push or pop call is blocking just by looking at the call, without have to check another part of the code to see how the queue data structure was initially declared.
- Code size. Dropping these arguments would avoid compiler potentially having to instantiate many copies of this code for different combinations of template arguments.
- Extensibility. There could be other useful blocking methods added in the future (like methods to wait for low/high water marks or for empty/full events) and it would either be verbose to have to add new classwide blocking/nonblocking defaults for new methods, or confusing to have to somehow tie existing defaults to new methods.
- Consistency. If you look at other C++ objects that support optional blocking like std::mutex or std::future, the blocking behaviour is determined only by the particular method call, not by template arguments from where the object was declared.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can see this either way - i wrote it with defaults essentially as constructor args because - at least for the use cases i can imagine you almost always want a default mode of operation for a given queue except for certain edge cases (shutdown is the most apparent one) - and i've seen data structures that handle this sort of initialization both ways (defaults on construction vs with every method call) but i'm also happy to change this up if the prevailing opinion is the other way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re: #12934 (comment)
I can see this either way - i wrote it with defaults essentially as constructor args because - at least for the use cases i can imagine you almost always want a default mode of operation for a given queue except for certain edge cases (shutdown is the most apparent one) - and i've seen data structures that handle this sort of initialization both ways (defaults on construction vs with every method call) but i'm also happy to change this up if the prevailing opinion is the other way.
What about having Push, Pop, TryPush, and TryPop methods and dropping the enum entirely? I think this would make code using the queue easier to understand, since blocking would be explicit, instead of based on an enum value determined by a combination of method argument, class template argument, method argument default, and class template argument default values. It could also make code using the queue easier to write, since there would no longer be any chance of hitting various compile and runtime checks for invalid enum values.
I do understand the enum is useful in tests for listing all possible combinations of behavior, but for that purpose, you could just define the enum in test code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Readability. It would be nice to be able to see if a push or pop call is blocking just by looking at the call
I think this comment is spot on.
If you look at other C++ objects that support optional blocking like std::mutex or std::future, the blocking behaviour is determined only by the particular method call
What about having Push, Pop, TryPush, and TryPop methods and dropping the enum entirely?
Agree.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re: #12934 (comment)
Another advantage to add to list above:
- If you make blocking an attribute of push/pop methods rather than an attibutes of the queue you can drop the consumer/producer terminology, which I'm finding confusing now that I'm looking at downstream code. E.g. if I push an item into the queue, that seems like producing from my perspective, but it's consuming from the queue/worker thread perspective and makes that code a bit strange, IMO.
| */ | ||
| T Pop() | ||
| { | ||
| static_assert(m_consumer_mode == WorkerMode::BLOCKING, ""); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "Implement a thread-safe FIFO (producer/consumer style) queue" (d810614)
Could return std::future<T> in the case of a non-blocking Pop() to support it instead of having this asymmetry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doing that would complicate internal implementation a bit - you'd have to hang on to the promise that satisfies the future internally - and either you'd have too many of those and need to block - or potentially allow your buffer holding promises to grow unbounded - so actually it might not be safely implementable.
Also I don't really see an immediate use case for this - you'd have to later wait on the future (blocking or non-blocking) - but you could just alternately wait on the queue. I can't think of any use cases right now where you'd want to reserve a place in line in a non-blocking fashion and then later claim that item.
efdb111 to
2f52e4d
Compare
src/Makefile.am
Outdated
| policy/policy.h \ | ||
| policy/rbf.h \ | ||
| pow.h \ | ||
| core/producerconsumerqueue.h \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "Implement a thread-safe FIFO (producer/consumer style) queue" (c6396d9)
Just noticed this PR is creating a new src/core/ subdirectory to hold the queue and thread code. This seems good, but it may also be good to add a short core/README.md to say what the directory is supposed to be for. For example, if it's meant to hold utility code that isn't bitcoin specific, or if it might make sense in the future to move other code there.
|
|
||
| private: | ||
| ShutdownPill(ConsumerThread<MODE>& consumer) : m_consumer(consumer){}; | ||
| void operator()() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "Add ConsumerThread: to consumer and operate on work from a ProducerConsumerQueue" (7f8a888)
Should this be marked "override?"
|
|
||
| protected: | ||
| WorkItem(){}; | ||
| virtual void operator()(){}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "Add ConsumerThread: to consumer and operate on work from a ProducerConsumerQueue" (7f8a888)
It might be a good idea to make this abstract (= 0) to trigger a compile error in case a subclass declares this the wrong way and fails to override.
| * | ||
| * @see WorkerMode | ||
| */ | ||
| template <typename T, WorkerMode m_producer_mode = WorkerMode::BLOCKING, WorkerMode m_consumer_mode = WorkerMode::BLOCKING> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Re: #12934 (comment)
Another advantage to add to list above:
- If you make blocking an attribute of push/pop methods rather than an attibutes of the queue you can drop the consumer/producer terminology, which I'm finding confusing now that I'm looking at downstream code. E.g. if I push an item into the queue, that seems like producing from my perspective, but it's consuming from the queue/worker thread perspective and makes that code a bit strange, IMO.
| }; | ||
|
|
||
| template <WorkerMode PRODUCER_MODE> | ||
| class WorkQueue : public BlockingConsumerQueue<std::unique_ptr<WorkItem<PRODUCER_MODE>>, PRODUCER_MODE> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "Add ConsumerThread: to consumer and operate on work from a ProducerConsumerQueue" (7f8a888)
Calling this variable PRODUCER_MODE here, but passing it as the consumer_mode argument to BlockingConsumerQueue/ProducerConsumerQueue is a little unexpected. Could you maybe note this in a short comment to avoid confusion? Alternately it might be clearer just to use ProducerConsumerQueue directly, and not have BlockingConsumerQueue as a thing.
TBH, also, I don't actually understand why PRODUCER_MODE exists as a parameter. Why would code constructing WorkQueue want to control the default consumer mode of the queue when it doesn't consume from the queue (ConsumerThread does)?
| } | ||
| } | ||
|
|
||
| BOOST_AUTO_TEST_CASE(foo) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "ConsumerThread unit tests" (134d47a)
Should replace foo.
|
|
||
| for (int i = 0; i < n_elements; i++) { | ||
| work[i] = i; | ||
| queue->Push(std::unique_ptr<TestWorkItem>(new TestWorkItem(work[i]))); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "ConsumerThread unit tests" (134d47a)
I think there is a race here between work[i] being assigned and then incremented in the worker thread. You could avoid it by setting work values after creating the vector but before creating the queue.
|
|
||
| BOOST_CHECK_LT(queue->size(), n_threads + 1); | ||
| for (int i = 0; i < n_elements; i++) { | ||
| BOOST_CHECK_EQUAL(work[i], i + 2); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "ConsumerThread unit tests" (134d47a)
Might be good to move this check above Terminate() to make the test more strict.
|
|
||
| //! A special WorkItem() that is used to interrupt a blocked ConsumerThread() so that it can terminate | ||
| template <WorkerMode MODE> | ||
| class ShutdownPill : public WorkItem<MODE> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "Add ConsumerThread: to consumer and operate on work from a ProducerConsumerQueue" (7f8a888)
I think I agree with James. Even if you want to support shutting down specific threads, an approach more like "Everybody wake up, check if you are supposed to exit, and if not go back to sleep again" might be simpler than "Random thread wake up, check if you are supposed to exit, and if not wake up another random thread, but not if you already seen this particular notification before," or whatever the correct description is. (If you do want to stick with the current approach, it could be nice to add a high level comment explaining it like this.)
|
|
||
| // if the same pill has been seen by the same thread previously then it can safely be discarded | ||
| // the intended thread has either terminated or is currently processing a work item and will terminate | ||
| // after completing that item and before blocking on the queue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "Add ConsumerThread: to consumer and operate on work from a ProducerConsumerQueue" (7f8a888)
Why couldn't the intended thread just be blocked calling Pop() and not terminated or currently processing anything? It seems like this is assuming threads are notified in a circular order.
| void Stop(); | ||
|
|
||
| //! Submit a block for asynchronous validation | ||
| std::future<BlockValidationResponse> SubmitForValidation(const std::shared_ptr<const CBlock> block, bool force_processing, std::function<void()> on_ready = []() {}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "ValidationLayer() - interface for calls into block validation" (b471728)
Would be slightly more efficient to make default on_ready value nullptr instead of a no-op lambda.
It's also kind of unclear what m_ready is supposed to be used for in this context. You might want to move down your other comment from m_on_ready about c++11 futures here.
|
|
||
| void BlockValidationRequest::operator()() | ||
| { | ||
| LogPrint(BCLog::VALIDATION, "%s: validating request=%s\n", __func__, GetId()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "ValidationLayer() - interface for calls into block validation" (b471728)
Maybe add the word "block" somewhere in here to make it clear what this is validating and obvious this is a block hash.
|
|
||
| void ValidationLayer::Start() | ||
| { | ||
| assert(!m_thread || !m_thread->IsActive()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "ValidationLayer() - interface for calls into block validation" (b471728):
I think it would be better to just assert !m_thread to simplify and be more conservative. If m_thread is allowed to be non-null, then I think you would need to add more synchronization here to make sure join is called before the thread is destroyed to prevent a crash in the destructor: https://en.cppreference.com/w/cpp/thread/thread/%7Ethread
| } | ||
|
|
||
| //! Waits until this thread terminates | ||
| //! RequestTerminate() must have been previously called or be called by a different thread |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "Add ConsumerThread: to consumer and operate on work from a ProducerConsumerQueue" (7f8a888)
I think it would be clearer to just say this will block until RequestTerminate() is called. It should be perfectly fine to call RequestTerminate before or after this call and from any thread.
|
|
||
| std::future<BlockValidationResponse> ValidationLayer::SubmitForValidation(const std::shared_ptr<const CBlock> block, bool force_processing, std::function<void()> on_ready) | ||
| { | ||
| BlockValidationRequest* req = new BlockValidationRequest(*this, block, force_processing, on_ready); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "ValidationLayer() - interface for calls into block validation" (b471728)
Would be safer / more efficient to use std::make_shared here.
| const std::shared_ptr<ValidationQueue> m_validation_queue; | ||
|
|
||
| //! the validation thread - sequentially processes validation requests from m_validation_queue | ||
| std::unique_ptr<ValidationThread> m_thread; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "ValidationLayer() - interface for calls into block validation" (b471728)
Can this just be a ValidationThread instead of a pointer to one? The extra indirection doesn't seem helpful.
| public: | ||
| ValidationLayer(const CChainParams& chainparams) | ||
| : m_chainparams(chainparams), m_validation_queue(std::make_shared<ValidationQueue>(100)) {} | ||
| ~ValidationLayer(){}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "ValidationLayer() - interface for calls into block validation" (b471728)
Maybe assert thread is not joinable here, to help debugging in case this is not shut down correctly.
| // using the other before destroying them. | ||
| if (peerLogic) UnregisterValidationInterface(peerLogic.get()); | ||
| if (g_connman) g_connman->Stop(); | ||
| if (g_validation_layer) g_validation_layer->Stop(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "Call ProcessNewBlock() asynchronously in a separate thread from p2p layer" (1c9f741)
Should this also free g_validation_layer? (or have a comment saying why it shouldn't be freed)
| bool request_was_queued = pnode->IsAwaitingInternalRequest(); | ||
|
|
||
| // If an internal request was queued and it's not done yet, skip this node | ||
| if (request_was_queued && !pnode->ProcessInternalRequestResults(m_msgproc)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "Call ProcessNewBlock() asynchronously in a separate thread from p2p layer" (1c9f741)
I think this code might be clearer if the IsAwaitingInternalRequest were call were dropped and ProcessInternalRequestResults just returned requested_was_queued directly. It seems awkward how IsAwaitingInternalRequest and ProcessInternalRequestResults are checking some of the same things and then this code is combining their return values.
|
|
||
| #include <boost/thread.hpp> | ||
|
|
||
| class ValidationLayer; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "Call ProcessNewBlock() asynchronously in a separate thread from p2p layer" (1c9f741)
Probably remove, doesn't look like this is used right now.
| } | ||
|
|
||
| bool static ProcessMessage(CNode* pfrom, const std::string& strCommand, CDataStream& vRecv, int64_t nTimeReceived, const CChainParams& chainparams, CConnman* connman, const std::atomic<bool>& interruptMsgProc, bool enable_bip61) | ||
| bool static ProcessMessage(CNode* pfrom, const std::string& strCommand, CDataStream& vRecv, int64_t nTimeReceived, const CChainParams& chainparams, CConnman* connman, ValidationLayer& validation_layer, const std::atomic<bool>& interruptMsgProc, bool enable_bip61) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "Call ProcessNewBlock() asynchronously in a separate thread from p2p layer" (1c9f741)
Unintended indent?
| // process from some other peer. We do this after calling | ||
| // ProcessNewBlock so that a malleated cmpctblock announcement | ||
| // can't be used to interfere with block relay. | ||
| if (!pindex || pindex->IsValid(BLOCK_VALID_TRANSACTIONS)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "Call ProcessNewBlock() asynchronously in a separate thread from p2p layer" (1c9f741)
Maybe drop the !pindex check here, or say in a comment whether this would ever be expected? It seems surprising to treat null pindex like the block is valid.
| MarkBlockAsReceived(pblock->GetHash()); | ||
| } | ||
|
|
||
| if (validation_response.is_new) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "Call ProcessNewBlock() asynchronously in a separate thread from p2p layer" (1c9f741)
This seems ok, but I wanted to note that previous code in the fBlockReconstructed case updated nLastBlockTime/mapBlockSource before calling MarkBlockAsReceived, instead of after.
| // out to be invalid. | ||
| mapBlockSource.emplace(resp.blockhash, std::make_pair(pfrom->GetId(), false)); | ||
| } | ||
| } // Don't hold cs_main when we call into ProcessNewBlock |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "Call ProcessNewBlock() asynchronously in a separate thread from p2p layer" (1c9f741)
Should s/ProcessNewBlock/SubmitBlock/
| // though the block was successfully read, and rely on the | ||
| // handling in ProcessNewBlock to ensure the block index is | ||
| // updated, reject messages go out, etc. | ||
| MarkBlockAsReceived(resp.blockhash); // it is now an empty pointer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "Call ProcessNewBlock() asynchronously in a separate thread from p2p layer" (1c9f741)
I guess with this line removed, the block will be marked received later, from the worker thread, after it is processed. This seems ok, though I could see why you might want to mark the block received when its received but before it's processed, so it's clearer what "received" and "processed" actually refer to.
|
|
||
| void CNode::SetPendingInternalRequest(const std::shared_ptr<const CBlock> block, std::future<BlockValidationResponse>&& pending_response, const CBlockIndex* pindex) | ||
| { | ||
| m_block_validating = block; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "Call ProcessNewBlock() asynchronously in a separate thread from p2p layer" (1c9f741)
Would it be possible to assert m_block_validating variables are null/invalid here before overwriting them? It seems like it would help debugging if SetPendingInternalRequest were called for a new request before the previous request completed.
| * @param[out] fNewBlock A boolean which is set to indicate if the block was first received via this call | ||
| * @return True if state.IsValid() | ||
| */ | ||
| bool ProcessNewBlock(const CChainParams& chainparams, const std::shared_ptr<const CBlock> pblock, bool fForceProcessing, bool* fNewBlock); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "Limit available scope of ProcessNewBlock to ValidationLayer (move-only)" (3778736)
It's unusual that ProcessNewBlock is documented and declared in validation_layer.cpp but defined in src/validation.cpp. It seems like it will make the documentation hard to find. There are a bunch of other options that seem like they would be better to me:
- Moving
ProcessNewBlockdeclaration tovalidate_layer.cppbut moving the documentation tovalidate.cppnear the definition. - Leaving
ProcessNewBlockwhere it is but renaming it toProcessNewBlockinternal. - Making
ProcessNewBlocka private method of a class that is friends withValidationLayer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "Limit available scope of ProcessNewBlock to ValidationLayer (move-only)" (3778736)
LOCKS_EXCLUDED(cs_main) annotation seems to have been dropped here.
| if (!ActivateBestChain(state, chainparams)) { | ||
| throw std::runtime_error(strprintf("ActivateBestChain failed. (%s)", FormatStateMessage(state))); | ||
| } | ||
| // Ideally we'd move all the RPC tests to the functional testing framework |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In commit "Fix whitespace in test_bitcoin.cpp (whitespace,move-only)" (3d6f038)
Maybe drop "move-only" from commit description, since this is actually just a whitespace change.
|
|
||
| // resubmit it so that it gets a chance to get to the right thread | ||
| // when resubmitting, do not block and do not care about failures | ||
| // theres a potential deadlock where we try to push this to a queue thats |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo found by codespell: “theres” should be “there is” :-)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo found by codespell: “thats” should be “that is” :-)
|
|
||
| T ret; | ||
|
|
||
| // use a temporary so theres no side effecting code inside an assert which could be disabled |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo found by codespell: “theres” should be “there is” :-)
| There hasn't been much activity lately and the patch still needs rebase, so I am closing this for now. Please let me know when you want to continue working on this, so the pull request can be re-opened. |
3339ba2 Make g_enable_bip61 a member variable of PeerLogicValidation (Jesse Cohen) 6690a28 Restrict as much as possible in net_processing to translation unit (Jesse Cohen) 1d4df02 [move-only] Move things only referenced in net_processing out of header file (Jesse Cohen) 02bbc05 Rescope g_enable_bip61 to net_processing (Jesse Cohen) Pull request description: As part of a larger effort to decouple net_processing and validation a bit, these are a bunch of simple scope cleanups. I've moved things out of the header file that are only referenced in net_processing and added static (or anonymous namespace) modifiers to everything possible in net_processing. There are a handful of functions which could be static except that they are exposed for the sake of unit testing - these are explicitly commented. There has been some discussion of a compile time annotation, but no conclusion has been reached on that yet. This is somewhat related to other prs bitcoin#12934 bitcoin#13413 bitcoin#13407 and will be followed by prs that reduce reliance on cs_main to synchronize data structures which are translation unit local to net_processing Tree-SHA512: 46c9660ee4e06653feb42ba92189565b0aea17aac2375c20747c0d091054c63829cbf66d2daddf65682b58ce1d6922e23aefea051a7f2c8abbb6db253a609082 Signed-off-by: Pasta <[email protected]> # Conflicts: # src/init.cpp # src/net_processing.cpp # src/net_processing.h # src/test/test_dash.cpp
Update
I think this is now in a state for code review
Summary of discussion of overall design as well as some concept acks: https://bitcoincore.org/en/meetings/2018/05/03/
Description
This is still in progress and not fully completed yet, but wanted to put it out for review in terms of overall design/architectureThe high level goal here (in this, and if accepted, subsequent PRs) is to allow for net and validation to live on separate threads and communicate mostly via message passing - both for the efficiency benefits that further parallelism in the net layer might provide, but also perhaps moreso as a step towards the goal of reducing the amount of shared state and forcing a cleaner separation between the net and validation layers in the core node.
To keep this PR as self contained as possible - this set of commits does the following:
ProducerConsumerQueue() / ConsumerThread(): infrastructure to facilitate async communication between the net and validation layersCBlockfor now) validation can be submitted and processed asynchronouslyProcessNewBlock()in net_processing with the new async interfaceValidationLayer::SubmitForValidation(CBlock) -> std::future<BlockValidationResult>Because the P2P layer assumes that for a given node every message is fully processed before any subsequent messages are processed, when an asynchronous validation request is submitted for a block coming from a node - that node is "frozen" until that request has been fully validated. In the meantime - the net layer may continue servicing other nodes that do not have pending asynchronous validation requests.
The ProducerConsumerQueue() was left sufficiently generic so that it may be interposed in other places where separation of components via asynchronous message passing might make sense from a design perspective.