Skip to content

Conversation

@jamesob
Copy link
Contributor

@jamesob jamesob commented Oct 19, 2018

Adds a context manager to easily (and selectively) profile node performance during functional test execution using perf.

While writing some tests, I encountered some odd bitcoind slowness. I wrote up a utility (TestNode.profile_with_perf) that generates performance diagnostics for a node by running perf during the execution of a particular region of test code.

perf usage is detailed in the excellent (and sadly unmerged) #12649; all due props to @eklitzke.

Example

with node.profile_with_perf("large-msgs"):
    for i in range(200):
        node.p2p.send_message(some_large_msg)
    node.p2p.sync_with_ping()

This generates a perf data file in the test node's datadir (/tmp/testtxmpod0y/node0/node-0-TestName-large-msgs.perf.data).

Running perf report generates nice output about where the node spent most of its time while running that part of the test:

$ perf report -i /tmp/testtxmpod0y/node0/node-0-TestName-large-msgs.perf.data --stdio \
  | c++filt \
  | less

# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 135  of event 'cycles:pp'
# Event count (approx.): 1458205679493582
#
# Children      Self  Command          Shared Object        Symbol
# ........  ........  ...............  ...................  ........................................................................................................................................................................................................................................................................
#
    70.14%     0.00%  bitcoin-net      bitcoind             [.] CNode::ReceiveMsgBytes(char const*, unsigned int, bool&)                                                                                                                                                                                                                        
                |
                ---CNode::ReceiveMsgBytes(char const*, unsigned int, bool&)

    70.14%     0.00%  bitcoin-net      bitcoind             [.] CNetMessage::readData(char const*, unsigned int)                                                                                                                                                                                                                                
                |
                ---CNetMessage::readData(char const*, unsigned int)
                   CNode::ReceiveMsgBytes(char const*, unsigned int, bool&)

    35.52%     0.00%  bitcoin-net      bitcoind             [.] std::vector<char, zero_after_free_allocator<char> >::_M_fill_insert(__gnu_cxx::__normal_iterator<char*, std::vector<char, zero_after_free_allocator<char> > >, unsigned long, char const&)                                                                                      
                |
                ---std::vector<char, zero_after_free_allocator<char> >::_M_fill_insert(__gnu_cxx::__normal_iterator<char*, std::vector<char, zero_after_free_allocator<char> > >, unsigned long, char const&)
                   CNetMessage::readData(char const*, unsigned int)
                   CNode::ReceiveMsgBytes(char const*, unsigned int, bool&)

...

@jamesob jamesob changed the title test: add utility to easily profile node performance with perf tests: add utility to easily profile node performance with perf Oct 19, 2018
@jamesob jamesob changed the title tests: add utility to easily profile node performance with perf test: add utility to easily profile node performance with perf Oct 19, 2018
@jamesob jamesob force-pushed the 2018-10-func-test-profiling branch from 629de1e to b93a17c Compare October 19, 2018 17:57
@jamesob jamesob changed the title test: add utility to easily profile node performance with perf tests: add utility to easily profile node performance with perf Oct 19, 2018
@jamesob
Copy link
Contributor Author

jamesob commented Oct 19, 2018

An example of running the generated perf data file through hotspot:

selection_109

Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK b93a17c3153f050eb37292b51463b0aaa5860f1a. Context manager seems nice for writing new tests. This is just an idea, but maybe it would be useful if there were an option to collect perf data for all nodes from start to stop to gain insight into performance of existing tests.

@jamesob jamesob force-pushed the 2018-10-func-test-profiling branch 3 times, most recently from d80a092 to ed471df Compare October 19, 2018 20:36
@jamesob
Copy link
Contributor Author

jamesob commented Oct 19, 2018

Thanks for the review, @ryanofsky. I really like your suggestions and so I've incorporated them all into a rebase.

You can now pass --perf when running an individual test file and all nodes will be profiled for the duration of the test. Here's an example run:

$ ./test/functional/p2p_unrequested_blocks.py --perf

2018-10-19T20:36:06.542000Z TestFramework (INFO): Initializing test directory /tmp/test71_cfvsc
...
2018-10-19T20:36:09.206000Z TestFramework (INFO): Stopping nodes
2018-10-19T20:36:09.384000Z TestFramework.node0 (INFO): Wrote perf output to '/tmp/test71_cfvsc/node0/node-0-AcceptBlockTest.perf.data'
2018-10-19T20:36:09.558000Z TestFramework.node1 (INFO): Wrote perf output to '/tmp/test71_cfvsc/node1/node-1-AcceptBlockTest.perf.data'
2018-10-19T20:36:09.608000Z TestFramework (WARNING): Not cleaning up dir /tmp/test71_cfvsc due to perf data
2018-10-19T20:36:09.608000Z TestFramework (INFO): Tests successful

@jamesob jamesob force-pushed the 2018-10-func-test-profiling branch 2 times, most recently from 2468b4a to 358e757 Compare October 19, 2018 20:44
@promag
Copy link
Contributor

promag commented Oct 20, 2018

Concept ACK, very nice!

@DrahtBot
Copy link
Contributor

DrahtBot commented Oct 20, 2018

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

No conflicts as of last run.

@jtimon
Copy link
Contributor

jtimon commented Oct 22, 2018

Concept ACK

Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK 358e757d566d5f659901acc6f15fa9eb6eb3e741

@jamesob jamesob force-pushed the 2018-10-func-test-profiling branch 2 times, most recently from 2cd3185 to dd22df8 Compare October 23, 2018 15:27
@jamesob
Copy link
Contributor Author

jamesob commented Oct 23, 2018

Addressed feedback from @ryanofsky @conscott.

Copy link
Member

@maflcko maflcko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK

@jamesob jamesob force-pushed the 2018-10-func-test-profiling branch from dd22df8 to ac6a6b3 Compare October 24, 2018 19:11
@jamesob
Copy link
Contributor Author

jamesob commented Oct 24, 2018

Pushed feedback from @MarcoFalke.

@jamesob jamesob force-pushed the 2018-10-func-test-profiling branch from c978948 to a94c68b Compare November 9, 2018 20:37
Copy link
Contributor

@jnewbery jnewbery left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Concept ACK.

This seems cool. I haven't had a chance to play around with it yet. I've given the code a cursory glance and will review more thoroughly next week.

Copy link
Contributor

@jnewbery jnewbery left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is really cool. I've now had a chance to play around with it and it's great. I think doing the following would make it even more useful and accessible for developers:

  • move the documentation about installing and using perf to doc/developer-notes.md
  • move the documentation about using profile_with_perf() to test/README.md, with a reference to doc/developer-notes.md for notes on perf.

@laanwj
Copy link
Member

laanwj commented Jan 2, 2019

Concept ACK

@jamesob jamesob force-pushed the 2018-10-func-test-profiling branch from a94c68b to 1ae07ec Compare January 2, 2019 19:04
@jamesob
Copy link
Contributor Author

jamesob commented Jan 2, 2019

Thanks for the review, @jnewbery. I've pushed a rebased tip that addresses the feedback.

Here's the diff (excluding the rebase).

@jamesob jamesob force-pushed the 2018-10-func-test-profiling branch 5 times, most recently from 11bc2f3 to 27a895d Compare January 2, 2019 19:24
Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK 27a895d72a385bcaa861e08b03bb5c817f8e253f. Changes since last review: rebase, moving & adding documentation, tweaking output paths.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In commit "docs: add perf section to developer docs" (27a895d72a385bcaa861e08b03bb5c817f8e253f)

Slightly strange the link text is "understand the security tradeoffs" instead of just "security tradeoffs."

@maflcko maflcko added this to the 0.18.0 milestone Jan 5, 2019
Copy link
Contributor

@jnewbery jnewbery left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great James. A few comments inline.

Introduces `TestNode.profile_with_perf()` context manager which
samples node execution to produce profiling data.

Also introduces a test framework flag, `--perf`, which will run
perf on all nodes for the duration of a given test.
@jamesob jamesob force-pushed the 2018-10-func-test-profiling branch from 27a895d to 13782b8 Compare January 22, 2019 13:56
@jamesob
Copy link
Contributor Author

jamesob commented Jan 22, 2019

Pushed an update addressing feedback from @jnewbery.

Copy link
Contributor

@ryanofsky ryanofsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK 13782b8. Main change is removing code in example test, and adding to test README instead.

@maflcko maflcko merged commit 13782b8 into bitcoin:master Feb 5, 2019
maflcko pushed a commit that referenced this pull request Feb 5, 2019
…ith perf

13782b8 docs: add perf section to developer docs (James O'Beirne)
58180b5 tests: add utility to easily profile node performance with perf (James O'Beirne)

Pull request description:

  Adds a context manager to easily (and selectively) profile node performance during functional test execution using `perf`.

  While writing some tests, I encountered some odd bitcoind slowness. I wrote up a utility (`TestNode.profile_with_perf`) that generates performance diagnostics for a node by running `perf` during the execution of a particular region of test code.

  `perf` usage is detailed in the excellent (and sadly unmerged) #12649; all due props to @eklitzke.

  ### Example

  ```python
  with node.profile_with_perf("large-msgs"):
      for i in range(200):
          node.p2p.send_message(some_large_msg)
      node.p2p.sync_with_ping()
  ```

  This generates a perf data file in the test node's datadir (`/tmp/testtxmpod0y/node0/node-0-TestName-large-msgs.perf.data`).

  Running `perf report` generates nice output about where the node spent most of its time while running that part of the test:

  ```bash
  $ perf report -i /tmp/testtxmpod0y/node0/node-0-TestName-large-msgs.perf.data --stdio \
    | c++filt \
    | less

  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 135  of event 'cycles:pp'
  # Event count (approx.): 1458205679493582
  #
  # Children      Self  Command          Shared Object        Symbol
  # ........  ........  ...............  ...................  ........................................................................................................................................................................................................................................................................
  #
      70.14%     0.00%  bitcoin-net      bitcoind             [.] CNode::ReceiveMsgBytes(char const*, unsigned int, bool&)
                  |
                  ---CNode::ReceiveMsgBytes(char const*, unsigned int, bool&)

      70.14%     0.00%  bitcoin-net      bitcoind             [.] CNetMessage::readData(char const*, unsigned int)
                  |
                  ---CNetMessage::readData(char const*, unsigned int)
                     CNode::ReceiveMsgBytes(char const*, unsigned int, bool&)

      35.52%     0.00%  bitcoin-net      bitcoind             [.] std::vector<char, zero_after_free_allocator<char> >::_M_fill_insert(__gnu_cxx::__normal_iterator<char*, std::vector<char, zero_after_free_allocator<char> > >, unsigned long, char const&)
                  |
                  ---std::vector<char, zero_after_free_allocator<char> >::_M_fill_insert(__gnu_cxx::__normal_iterator<char*, std::vector<char, zero_after_free_allocator<char> > >, unsigned long, char const&)
                     CNetMessage::readData(char const*, unsigned int)
                     CNode::ReceiveMsgBytes(char const*, unsigned int, bool&)

  ...
  ```

Tree-SHA512: 9ac4ceaa88818d5eca00994e8e3c8ad42ae019550d6583972a0a4f7b0c4f61032e3d0c476b4ae58756bc5eb8f8015a19a7fc26c095bd588f31d49a37ed0c6b3e
pravblockc pushed a commit to pravblockc/dash that referenced this pull request Aug 19, 2021
…mance with perf

13782b8 docs: add perf section to developer docs (James O'Beirne)
58180b5 tests: add utility to easily profile node performance with perf (James O'Beirne)

Pull request description:

  Adds a context manager to easily (and selectively) profile node performance during functional test execution using `perf`.

  While writing some tests, I encountered some odd bitcoind slowness. I wrote up a utility (`TestNode.profile_with_perf`) that generates performance diagnostics for a node by running `perf` during the execution of a particular region of test code.

  `perf` usage is detailed in the excellent (and sadly unmerged) bitcoin#12649; all due props to @eklitzke.

  ### Example

  ```python
  with node.profile_with_perf("large-msgs"):
      for i in range(200):
          node.p2p.send_message(some_large_msg)
      node.p2p.sync_with_ping()
  ```

  This generates a perf data file in the test node's datadir (`/tmp/testtxmpod0y/node0/node-0-TestName-large-msgs.perf.data`).

  Running `perf report` generates nice output about where the node spent most of its time while running that part of the test:

  ```bash
  $ perf report -i /tmp/testtxmpod0y/node0/node-0-TestName-large-msgs.perf.data --stdio \
    | c++filt \
    | less

  # To display the perf.data header info, please use --header/--header-only options.
  #
  #
  # Total Lost Samples: 0
  #
  # Samples: 135  of event 'cycles:pp'
  # Event count (approx.): 1458205679493582
  #
  # Children      Self  Command          Shared Object        Symbol
  # ........  ........  ...............  ...................  ........................................................................................................................................................................................................................................................................
  #
      70.14%     0.00%  bitcoin-net      bitcoind             [.] CNode::ReceiveMsgBytes(char const*, unsigned int, bool&)
                  |
                  ---CNode::ReceiveMsgBytes(char const*, unsigned int, bool&)

      70.14%     0.00%  bitcoin-net      bitcoind             [.] CNetMessage::readData(char const*, unsigned int)
                  |
                  ---CNetMessage::readData(char const*, unsigned int)
                     CNode::ReceiveMsgBytes(char const*, unsigned int, bool&)

      35.52%     0.00%  bitcoin-net      bitcoind             [.] std::vector<char, zero_after_free_allocator<char> >::_M_fill_insert(__gnu_cxx::__normal_iterator<char*, std::vector<char, zero_after_free_allocator<char> > >, unsigned long, char const&)
                  |
                  ---std::vector<char, zero_after_free_allocator<char> >::_M_fill_insert(__gnu_cxx::__normal_iterator<char*, std::vector<char, zero_after_free_allocator<char> > >, unsigned long, char const&)
                     CNetMessage::readData(char const*, unsigned int)
                     CNode::ReceiveMsgBytes(char const*, unsigned int, bool&)

  ...
  ```

Tree-SHA512: 9ac4ceaa88818d5eca00994e8e3c8ad42ae019550d6583972a0a4f7b0c4f61032e3d0c476b4ae58756bc5eb8f8015a19a7fc26c095bd588f31d49a37ed0c6b3e
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Dec 16, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

10 participants