Skip to content

Conversation

@gavinandresen
Copy link
Contributor

Threads have noticeable overhead on 32-bit systems, so using fewer threads for simple tasks is better than starting up threads willy-nilly.

The first commit implements a simple task queue class (CScheduler) which can be serviced by one or more threads.

The second and third port the networking code's DumpAddresses thread to be a task instead.

Porting other threads (ThreadFlushWalletDB is a good candidate) is left as an exercise for future pull requests.

The CScheduler class was developed and tested with code here: https://github.com/gavinandresen/scheduler (see main.cpp)

I tested the DumpAddresses change running with -debug=net and a hacked DEBUG_ADDRESS_INTERVAL, and made sure I got more than one "Flushed %d addresses to peers.dat" in my debug.log.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this work when the timestamp of the first item in the queue is in the past?

@jgarzik
Copy link
Contributor

jgarzik commented Apr 2, 2015

concept ACK
CScheduler ACK

  • Yes, DumpAddresses thread is a prime candidate for this
  • Ditto ThreadFlushWalletDB
  • Disagree on ThreadImport -- ThreadImport is a prime candidate for something to be executed in a long running thread of its own.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.wait() may wake spuriously, so it's better to turn the surrounding if into a while loop. I believe that means you also won't have to deal with the queue-empty case in the loop below.

@sipa
Copy link
Member

sipa commented Apr 2, 2015

Agree with @jgarzik; threadimport should not run in the scheduler, as it is expected to take up to hours - blocking several other operations. Wallet flushing is a good candidate.

@gavinandresen gavinandresen force-pushed the scheduler branch 3 times, most recently from 166bb79 to c7922ac Compare April 3, 2015 15:53
@gavinandresen
Copy link
Contributor Author

Fixed @sipa's nits, including running scheduler.cpp/.h through clang-format.

And reverted the ThreadImport change, so it is it's own thread again.

@theuni
Copy link
Member

theuni commented Apr 4, 2015

boost::condition_variable::wait_until and boost::cv_status didn't show up until 1.50. Can they be worked around?

Otherwise, we'll need to enforce >= 1.50 as a build requirement.

@sipa
Copy link
Member

sipa commented Apr 4, 2015

They're easy to work with using timed_wait, I think.

@jgarzik
Copy link
Contributor

jgarzik commented Apr 5, 2015

ut ACK latest rev

@gavinandresen gavinandresen force-pushed the scheduler branch 4 times, most recently from 0ceb336 to e395785 Compare April 7, 2015 20:05
@gavinandresen
Copy link
Contributor Author

Three changes:

  1. I decided I didn't like the the asymmetry of CScheduler creating the thread, but the caller managing it. So I replaced CScheduler::run() with CScheduler::ServiceQueue, which the caller thread-ifys. And added code and a destructor that asserts if thread management isn't done correctly. That made it easy to wrap the scheduler trhead in util.h's TraceThread, to get debug.log tracing and handling of exceptions that might be thrown from the scheduled tasks.
  2. Added compatibility code for boost version 1.49 and below.
  3. I fixed a race-condition bug when there are multiple threads servicing the queue.

I think I'm all done tweaking this now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was wrong earlier. If you want to support multiple runner threads, you need to do this check inside the while loops above, as the item can disappear while they wait.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mmm. I convinced myself an empty check isn't necessary there, but belt&suspenders says I should just put it back.

(not needed because: if the queue is emptied during timeout, then the wait loop is timed out and exited and the empty queue is caught here. And since there is no unschedule-task, the only other way the wait can exit is if a new task is added to the queue, which wakes up only one thread and so the queue can't be empty).

I'll add the empty condition to the wait loops.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, you convinced me... but indeed, belt & suspenders.

You could even combine the two loops: while (taskQueue.empty() || wait_until(taskQueue.begin()->first)) ?

EDIT: nevermind. Sorry, ignore my nits here, I'm not thinking clearly.

@sipa
Copy link
Member

sipa commented Apr 7, 2015

extremely-lightly-tested ACK (as in: does not crash).

@gavinandresen
Copy link
Contributor Author

Final (?) nits picked (got rid of the obsolete run from the .h, added belt&suspenders checks for empty queue, ran through clang-format again), this should be ready for merge.

I also did some more testing, writing a stress-test that created 4,000 tasks serviced by 100 threads over a couple of milliseconds ( https://github.com/gavinandresen/scheduler/blob/master/main.cpp ), and all is well, including running under valgrind.

@jgarzik
Copy link
Contributor

jgarzik commented Apr 8, 2015

lightly tested ACK

@luke-jr
Copy link
Member

luke-jr commented May 13, 2015

Debian 7:

test/scheduler_tests.cpp: In member function ‘void scheduler_tests::manythreads::test_method()’:
test/scheduler_tests.cpp:68:5: error: ‘sleep_for’ is not a member of ‘boost::this_thread’
test/scheduler_tests.cpp:84:5: error: ‘sleep_for’ is not a member of ‘boost::this_thread’

@gavinandresen
Copy link
Contributor Author

@luke-jr : fixed. I'd fixed that before, I think it got lost in a rebase / shuffle between working on home/work trees.

The fix is in the unit test commit, and is:

static void MicroSleep(uint64_t n)
{
#if defined(HAVE_WORKING_BOOST_SLEEP_FOR)
    boost::this_thread::sleep_for(boost::chrono::microseconds(n));
#elif defined(HAVE_WORKING_BOOST_SLEEP)
    boost::this_thread::sleep(boost::posix_time::microseconds(n));
#else
    //should never get here
    #error missing boost sleep implementation
#endif
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy to see that this gets rid of the awful LoopForever() in util.h.

@laanwj
Copy link
Member

laanwj commented May 14, 2015

ACK apart from the nit

Philip Kaufmann and others added 4 commits May 14, 2015 10:37
- replaces some hard-coded values for the default confirmation target
- also simplify code that is using the new constant
Simple class to manage a task queue that is serviced by one or
more threads.
previously it was only used with certain boost versions. Now all versions
require it.
@gavinandresen
Copy link
Contributor Author

Nits picked, will merge as soon as Travis gives the green light.

Instead of starting Yet Another Thread to dump addresses,
use CScheduler to do it.
@gavinandresen
Copy link
Contributor Author

Travis was unhappy-- one of the builds was taking more than 50 minutes to complete.

I toned down the unit test by a factor of ten, in case that is what was causing the problem (it now services hundreds of tasks using 10 threads, instead of thousands using 100 threads).... if the builds STILL take longer than normal I'll have to recruit @theuni to help debug.

@laanwj
Copy link
Member

laanwj commented May 14, 2015

@gavinandresen It initially failed; so I merged #6136 and tried again. It's expected to take somewhat longer but should at least succeed.

@theuni
Copy link
Member

theuni commented May 14, 2015

@gavinandresen I think it's partially bad timing. #6136 forced travis to rebuild all dependencies and build up a new cache. Should be back to normal in an hour or so.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tiny nit, is the indenting here off?

@gavinandresen gavinandresen merged commit 9a1dcea into bitcoin:master May 14, 2015
gavinandresen added a commit that referenced this pull request May 14, 2015
9a1dcea Use CScheduler for net's DumpAddresses (Gavin Andresen)
ddd0acd Create a scheduler thread for lightweight tasks (Gavin Andresen)
68d370b CScheduler unit test (Gavin Andresen)
cfefe5b scheduler: fix with boost <= 1.50 (Cory Fields)
ca66717 build: make libboost_chrono mandatory (Cory Fields)
928b950 CScheduler class for lightweight task scheduling (Gavin Andresen)
e656560 [Qt] add defaultConfirmTarget constant to sendcoinsdialog (Philip Kaufmann)
@luke-jr
Copy link
Member

luke-jr commented May 14, 2015

Build and unit tests pass on Debian 7: ACK

@theuni
Copy link
Member

theuni commented May 15, 2015

@gavinandresen
Copy link
Contributor Author

gavinandresen commented May 15, 2015 via email

maflcko pushed a commit that referenced this pull request Apr 15, 2020
808ef36 [doc] Update thread information in developer docs (John Newbery)

Pull request description:

  - DumpAddresses thread was removed in #5964
  - Shutdown thread was removed in #5679
  - Add new threads (scheduler, RPC worker, indexer, tor control)
  - Small changes to documentation of other threads

ACKs for top commit:
  MarcoFalke:
    ACK 808ef36
  hebasto:
    ACK 808ef36.

Tree-SHA512: 85b6ace7bcc4dee030c63461bef1ded1a9581d4fa249c59f6fcd5d33d89c4357a6b8b35888ce0960f276d397b5e38a21e6c5d4b7b79544827a28c950e097b36d
sidhujag pushed a commit to syscoin/syscoin that referenced this pull request Apr 16, 2020
808ef36 [doc] Update thread information in developer docs (John Newbery)

Pull request description:

  - DumpAddresses thread was removed in bitcoin#5964
  - Shutdown thread was removed in bitcoin#5679
  - Add new threads (scheduler, RPC worker, indexer, tor control)
  - Small changes to documentation of other threads

ACKs for top commit:
  MarcoFalke:
    ACK 808ef36
  hebasto:
    ACK 808ef36.

Tree-SHA512: 85b6ace7bcc4dee030c63461bef1ded1a9581d4fa249c59f6fcd5d33d89c4357a6b8b35888ce0960f276d397b5e38a21e6c5d4b7b79544827a28c950e097b36d
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Sep 8, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants