Skip to content

Conversation

@ldm5180
Copy link
Contributor

@ldm5180 ldm5180 commented Nov 4, 2018

Fixing warnings reported by GCC: memset of non-trivial type

@fanquake
Copy link
Member

fanquake commented Nov 4, 2018

Please send the leveldb changes upstream.

@ldm5180
Copy link
Contributor Author

ldm5180 commented Nov 4, 2018

Please send the leveldb changes upstream.

leveldb changes sent upstream. Rebased to only include the non-leveldb patch.

@DrahtBot
Copy link
Contributor

DrahtBot commented Nov 4, 2018

The following sections might be updated with supplementary metadata relevant to reviewers and maintainers.

Conflicts

Reviewers, this pull request conflicts with the following ones:

  • #14224 (Document intentional and unintentional unsigned integer overflows (wraparounds) using annotations by practicalswift)

If you consider this pull request important, please also help to review the conflicting pull requests. Ideally, start with the one that should be merged first.

@practicalswift
Copy link
Contributor

@ldm5180 Can you post the warnings emitted? What gcc version are you using?

@maflcko
Copy link
Member

maflcko commented Nov 4, 2018

What are the performance impacts of this change?

@ldm5180
Copy link
Contributor Author

ldm5180 commented Nov 4, 2018

@ldm5180 Can you post the warnings emitted? What gcc version are you using?

gcc version 8.2.0 (Ubuntu 8.2.0-7ubuntu1)

In file included from bench/prevector.cpp:6:
./prevector.h: In instantiation of ‘void prevector<N, T, Size, Diff>::fill(T*, ptrdiff_t) [with unsigned int N = 28; T = nontrivial_t; Size = unsigned int; Diff = int; ptrdiff_t = long int]’:                                                                                
./prevector.h:340:9:   required from ‘void prevector<N, T, Size, Diff>::resize(prevector<N, T, Size, Diff>::size_type) [with unsigned int N = 28; T = nontrivial_t; Size = unsigned int; Diff = int; prevector<N, T, Size, Diff>::size_type = unsigned int]’                   
bench/prevector.cpp:47:13:   required from ‘void PrevectorClear(benchmark::State&) [with T = nontrivial_t]’
bench/prevector.cpp:102:1:   required from here
./prevector.h:205:21: warning: ‘void* memset(void*, int, size_t)’ clearing an object of non-trivial type ‘struct nontrivial_t’; use assignment or value-initialization instead [-Wclass-memaccess]                                                                             
             ::memset(dst, 0, count * sizeof(T));
             ~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~
bench/prevector.cpp:12:8: note: ‘struct nontrivial_t’ declared here
 struct nontrivial_t {
        ^~~~~~~~~~~~

@ldm5180
Copy link
Contributor Author

ldm5180 commented Nov 4, 2018

What are the performance impacts of this change?

Expect, no impact. Is there a performance analysis test suite?

There are a few different cases. Here are them shown in the Compiler Explorer:

  1. Default fill w/ trivial type: https://gcc.godbolt.org/z/qzIKUJ (no difference)
  2. Default fill w/ nontrivial type: https://gcc.godbolt.org/z/h38Swd (difference discussed below)
  3. Copy fill w/ trivial type: https://gcc.godbolt.org/z/a2z3MJ (possibly better w/ std::fill)
  4. Copy fill w/ nontrivial type: https://gcc.godbolt.org/z/gB50DT (difference discussed below)

In both functions, when a trivial type is used there is no difference or the std::fill version is slightly better.

In both functions, when a nontrivial type is used, it gets better optimized with the std::fill version. You can see this by changing the optimization level from -O2 to -O3 and notice them both turn into almost the same code, but with the std::fill version still being about 10 instructions shorter. Without diving deeper into this, my estimation is that the std::fill version can be "seen" more clearly by the optimizer which is why it results in the same code with fewer optimizers enabled.

@sipa
Copy link
Member

sipa commented Nov 4, 2018

Concept ACK; from what I read is that for character types std::fill is generally compiled similar to memset would, so I expect no performance regression here.

Copy link
Contributor

@kallewoof kallewoof left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

utACK aa6f3d7db443c1154eda78f6fb006a69aa20ce3a

Problem:
- IS_TRIVIALLY_CONSTRUCTIBLE macro does not work correctly resulting
  in `memset()` usage to set a non-trivial type to 0 when
  `nontrivial_t` is passed in from the tests.
- Warning reported by GCC when compiling with `--enable-werror`.

Solution:
- Use the standard algorithm `std::fill_n()` and let the compiler
  determine the optimal way of looping or using `memset()`.
@maflcko maflcko changed the title Fix Compiler Warnings Refactor: Fix compiler warning in prevector.h Nov 5, 2018
@laanwj
Copy link
Member

laanwj commented Nov 6, 2018

Expect, no impact. Is there a performance analysis test suite?

Yes, there is src/bench/bench_bitcoin.
As the whole point of prevector is to be an optimization, I'd really prefer to see benchmarks here (if possible, both of gcc's and clang's standard library) and not make guesses.

@practicalswift
Copy link
Contributor

practicalswift commented Nov 6, 2018

@ldm5180 Very nice first time contribution! Hope to see more contributions from you!

Echoing @laanwj - I'd also be interested in benchmark results.

@ldm5180
Copy link
Contributor Author

ldm5180 commented Nov 12, 2018

I think the important benchmarks are the PrevectorResize*, but I just included all the Prevector tests.

The differences between g++ and clang++ are pretty stark, but the numbers within each version are very similar. Running them several times shows that they are not super stable number (at least on my machine). I don't know what your target is for acceptable, but here are some results:
PrevectorResizeNontrivial: clang gets ~4.8% faster, gcc gets ~0.4% slower
PrevectorResizeTrivial: clang gets ~3.7% slower, gcc gets ~0.1% slower

For clang, these might be outside the margin of error. For gcc, I don't think it is.

Baseline GCC:

# Benchmark, evals, iterations, total, min, max, median
PrevectorClearNontrivial, 10, 28300, 11.1891, 3.92909e-05, 4.03267e-05, 3.94721e-05
PrevectorClearTrivial, 10, 88600, 12.7673, 1.43817e-05, 1.44577e-05, 1.44081e-05
PrevectorDeserializeNontrivial, 10, 6800, 4.29737, 6.28875e-05, 6.33766e-05, 6.32723e-05
PrevectorDeserializeTrivial, 10, 52000, 4.91249, 9.38555e-06, 9.63038e-06, 9.44697e-06
PrevectorDestructorNontrivial, 10, 28800, 11.379, 3.93948e-05, 3.99467e-05, 3.94788e-05
PrevectorDestructorTrivial, 10, 88900, 12.5935, 1.41116e-05, 1.42736e-05, 1.41555e-05
PrevectorResizeNontrivial, 10, 28900, 7.21061, 2.46638e-05, 2.51662e-05, 2.49783e-05
PrevectorResizeTrivial, 10, 90300, 8.14905, 8.97413e-06, 9.25092e-06, 9.00301e-06

New version GCC:

# Benchmark, evals, iterations, total, min, max, median
PrevectorClearNontrivial, 10, 28300, 11.1975, 3.94149e-05, 3.97891e-05, 3.95875e-05
PrevectorClearTrivial, 10, 88600, 12.814, 1.43741e-05, 1.46315e-05, 1.4468e-05
PrevectorDeserializeNontrivial, 10, 6800, 4.3056, 6.29083e-05, 6.38494e-05, 6.33461e-05
PrevectorDeserializeTrivial, 10, 52000, 4.93494, 9.41098e-06, 9.55156e-06, 9.50646e-06
PrevectorDestructorNontrivial, 10, 28800, 11.3737, 3.92276e-05, 4.03218e-05, 3.93623e-05
PrevectorDestructorTrivial, 10, 88900, 12.5981, 1.41316e-05, 1.42107e-05, 1.41824e-05
PrevectorResizeNontrivial, 10, 28900, 7.25885, 2.47887e-05, 2.59588e-05, 2.50821e-05
PrevectorResizeTrivial, 10, 90300, 8.1371, 8.97296e-06, 9.07207e-06, 9.01449e-06

Baseline Clang:

# Benchmark, evals, iterations, total, min, max, median
PrevectorClearNontrivial, 10, 28300, 0.000571983, 2.00424e-09, 2.0771e-09, 2.0183e-09
PrevectorClearTrivial, 10, 88600, 0.00176638, 1.9287e-09, 2.15058e-09, 2.00778e-09
PrevectorDeserializeNontrivial, 10, 6800, 10.3682, 0.000151424, 0.000155099, 0.00015202
PrevectorDeserializeTrivial, 10, 52000, 7.87747, 1.50409e-05, 1.55236e-05, 1.51269e-05
PrevectorDestructorNontrivial, 10, 28800, 0.000430928, 1.25281e-09, 2.40497e-09, 1.27898e-09
PrevectorDestructorTrivial, 10, 88900, 0.00119125, 1.26731e-09, 1.63589e-09, 1.26902e-09
PrevectorResizeNontrivial, 10, 28900, 2.05184, 7.05283e-06, 7.19934e-06, 7.09096e-06
PrevectorResizeTrivial, 10, 90300, 6.48936, 7.11218e-06, 7.44356e-06, 7.17658e-06

New version Clang:

# Benchmark, evals, iterations, total, min, max, median                                     
PrevectorClearNontrivial, 10, 28300, 0.000853285, 2.86014e-09, 3.48901e-09, 3.00078e-09     
PrevectorClearTrivial, 10, 88600, 0.00264027, 2.52998e-09, 3.13509e-09, 3.02476e-09         
PrevectorDeserializeNontrivial, 10, 6800, 10.4545, 0.000152216, 0.000155233, 0.000153983    
PrevectorDeserializeTrivial, 10, 52000, 9.68222, 1.85139e-05, 1.89764e-05, 1.85454e-05      
PrevectorDestructorNontrivial, 10, 28800, 0.000431799, 1.25712e-09, 2.39365e-09, 1.27613e-09
PrevectorDestructorTrivial, 10, 88900, 0.00118995, 1.25501e-09, 1.64991e-09, 1.27141e-09    
PrevectorResizeNontrivial, 10, 28900, 1.95072, 6.57969e-06, 7.14017e-06, 6.75173e-06        
PrevectorResizeTrivial, 10, 90300, 6.70796, 7.38159e-06, 7.47863e-06, 7.44035e-06           

@laanwj
Copy link
Member

laanwj commented Nov 12, 2018

@ldm5180 thank you for doing benchmarks; so it's not significantly slower with gcc, surprisingly even faster with clang, looks good to me!

utACK 76e13b5

@promag
Copy link
Contributor

promag commented Nov 12, 2018

utACK 76e13b5, nice code simplification and tiny performance bonus (with clang).

@laanwj laanwj merged commit 76e13b5 into bitcoin:master Nov 12, 2018
laanwj added a commit that referenced this pull request Nov 12, 2018
76e13b5 warnings: Compiler warning on memset usage for non-trivial type (Lenny Maiorani)

Pull request description:

  Fixing warnings reported by GCC: memset of non-trivial type

Tree-SHA512: 357aeac60acfb922851daaf0bd8d4b81e377da7c9b31c2942b54cfdd4129dae61e577fc0a6aa430348cb07abd16ae32f986a64dbb2c1d90ec148f53e7451a229
laanwj added a commit that referenced this pull request Nov 22, 2018
69ca487 Implement prevector::fill once (Ben Woosley)
7bad78c Drop defunct IS_TRIVIALLY_CONSTRUCTIBLE handling from prevector.h (Ben Woosley)

Pull request description:

  This is clean-up post #14651:
  * Use one implementation of `prevector::fill`, as it's possible now that the implementations are identical.
  * Only apply the `IS_TRIVIALLY_CONSTRUCTIBLE` handling to the bench file where it is used, and drop the now-unnecessary associated compat includes.

Tree-SHA512: 5930b3a17fccd39af10add40202ad97a297aebecc049af72ca920d0d55b3e4c3c30ce864c8a683355895f0196396d4ea56ba9f9637bdc7d16964cdf66c195485
furszy added a commit to PIVX-Project/PIVX that referenced this pull request Dec 20, 2020
d856989 warnings: Compiler warning on memset usage for non-trivial type (Lenny Maiorani)
4cb188e Drop defunct IS_TRIVIALLY_CONSTRUCTIBLE handling from prevector.h (Ben Woosley)
c412d5f Reduce redundant code of prevector and speed it up (random-zebra)
e6741d0 Add new prevector benchmarks. (random-zebra)

Pull request description:

  Backports from Bitcoin a series of optimizations for `prevector` operations, especially around resizing, which takes a significative portion of the time spent by of `ReadBlockFromDisk()`.

  Running the benchmark tool before the patch:
  ```
  #Benchmark,                    count,  min,    max,    avg,    minc,   maxc,   avgc
  -----------                    ------  ----    ----    ----    -----   -----   ----
  PrevectorClearTrivial,         3328,   312631, 329431, 320311, 812854, 856536, 832823
  PrevectorDestructorTrivial,    3328,   317011, 353344, 325141, 824243, 918711, 845382
  PrevectorResizeTrivial,        3328,   321333, 327672, 324713, 835482, 851963, 844269
  ```

  and after:
  ```
  #Benchmark,                    count, min,    max,    avg,    minc,   maxc,   avgc
  -----------                    ------ ----    ----    ----    -----   -----   ----
  PrevectorClearTrivial,         65536, 15726,  16368,  16067,  40889,  42559,  41775
  PrevectorDestructorTrivial,    65536, 15865,  16416,  16141,  41249,  42684,  41968
  PrevectorResizeTrivial,        65536, 16036,  16320,  16213,  41696,  42433,  42156
  ```

  shows that `prevector::clear()`, `prevector::~prevector()`, and `prevector::resize()` become (on avg) **20X** faster.

  Based on:
  - bitcoin#12549
  - bitcoin#14651
  - bitcoin#14715

ACKs for top commit:
  Fuzzbawls:
    utACK d856989
  furszy:
    Quite cool. utACK d856989

Tree-SHA512: 8060458bfd5373448d83a3d905fbc95379cfc1c1aed19b750c98b06a1a6a4ba85a6715865388752a5bd579676e2b2e26d3584ec5b26d51e9ba1380bc4fa73f33
PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Jul 1, 2021
76e13b5 warnings: Compiler warning on memset usage for non-trivial type (Lenny Maiorani)

Pull request description:

  Fixing warnings reported by GCC: memset of non-trivial type

Tree-SHA512: 357aeac60acfb922851daaf0bd8d4b81e377da7c9b31c2942b54cfdd4129dae61e577fc0a6aa430348cb07abd16ae32f986a64dbb2c1d90ec148f53e7451a229
PastaPastaPasta pushed a commit to PastaPastaPasta/dash that referenced this pull request Jul 1, 2021
76e13b5 warnings: Compiler warning on memset usage for non-trivial type (Lenny Maiorani)

Pull request description:

  Fixing warnings reported by GCC: memset of non-trivial type

Tree-SHA512: 357aeac60acfb922851daaf0bd8d4b81e377da7c9b31c2942b54cfdd4129dae61e577fc0a6aa430348cb07abd16ae32f986a64dbb2c1d90ec148f53e7451a229
UdjinM6 pushed a commit to UdjinM6/dash that referenced this pull request Jul 5, 2021
76e13b5 warnings: Compiler warning on memset usage for non-trivial type (Lenny Maiorani)

Pull request description:

  Fixing warnings reported by GCC: memset of non-trivial type

Tree-SHA512: 357aeac60acfb922851daaf0bd8d4b81e377da7c9b31c2942b54cfdd4129dae61e577fc0a6aa430348cb07abd16ae32f986a64dbb2c1d90ec148f53e7451a229
@bitcoin bitcoin locked as resolved and limited conversation to collaborators Sep 8, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants