DRBG: Use the EVP layer to do AES encryption#5580
DRBG: Use the EVP layer to do AES encryption#5580kroeckx wants to merge 2 commits intoopenssl:masterfrom
Conversation
|
Can you add in your equivalent timing for 1.0.2 - that would help. |
|
Interesting test results! But why is master much faster than 1.1.0? How does this fit together with the regression reported in #5559? |
|
Note that that was one of the regressions, he started at 24K, and that commit changed it from 19K to 15K, so there are clearly other regressions too. |
crypto/rand/drbg_ctr.c
Outdated
There was a problem hiding this comment.
I don't see you free this context anywhere, and that might explain why Travis thinks this leaks...
There was a problem hiding this comment.
I was thinking I needed to free that and then forgot all about it. :)
|
On 1.0.2 it takes 41.48 s |
|
On a 1.0.2 shared build it takes 26.862, the 1.1 numers where all for a shared build too |
6a4c2e4 to
568a2d1
Compare
|
The 1.1.0 numbers I've reported before were from the Debian version which is at least build with -O2 instead of -O3. All build with a "./config shared" currently gives me: |
|
I have two questions: There was a discussion with @p-steuer about pre-increment vs. post-increment in #4530 (comment). Did you have that in your mind? How was this problem addressed?
Patrick added the CAVS tests in 7aba2e4. Did you run them before and after your modifications? IMHO adding the CAVS tests to the testsuite and verifying them should be part of this pull request. |
|
I used the ECB mode instead of a counter mode. Everything related to the counter is still in the DRBG, it didn't move to the EVP layer. I'll look into those CAVS tests. |
|
Ah, I see! |
|
I've added the CAVS tests. It also had extra kat tests in that pull request by they fail both before and after my patch. |
|
Switching back to aes-128-ctr gives this: |
levitte
left a comment
There was a problem hiding this comment.
Picking on style for a bit...
crypto/rand/drbg_ctr.c
Outdated
crypto/rand/drbg_ctr.c
Outdated
crypto/rand/drbg_ctr.c
Outdated
crypto/rand/drbg_ctr.c
Outdated
crypto/rand/drbg_ctr.c
Outdated
There was a problem hiding this comment.
intent? :) I at least got some right!
crypto/rand/drbg_ctr.c
Outdated
crypto/rand/drbg_ctr.c
Outdated
crypto/rand/drbg_ctr.c
Outdated
289fda5 to
a48b63f
Compare
crypto/rand/drbg_ctr.c
Outdated
There was a problem hiding this comment.
What makes you think that this is an error?
There was a problem hiding this comment.
No idea why that's there, fixed.
There was a problem hiding this comment.
Now all execution branches return success. So you might want to change the return value of ctr_XOR() back to void again and remove the return value checks in ctr_update().
Signed-off-by: Patrick Steuer <[email protected]>
|
I compared the performances with and without e6f5902 on my computer and strangely performance was exactly the same, about 25 sec. Does anyone see my mistake, or did I miss something? Here's my current git log: I tested 419a3af vs. 78a50c7, using the following commands: Performance test for 419a3afPerformance test for 78a50c7Is hardware support enabled?I checked |
|
Try using "apps/openssl" instead of "openssl"? My guess is that the
installed version is not a 1.1.0 version, so it's not picking up
your new shared library.
|
|
Thanks! I new there must have been some stupid mistake. But I didn't see that I forgot the relative path. Now the result looks much better! |
|
Heres some perf data i collected on my thinkpad ... master (b38fa98) master+this PR Copied from my original PR So for x86 this PR is a pretty good improvement. But it should be evaluated on other architectures as well .. For s390 it shows a 20% / 35% performance degeneration for longer / shorter buffers. This is because AES_encrypt already has hardware support on s390, so this PR only adds EVP overhead: master: master+this PR: The main reason this PR's performance gain (on x86) stays much behind the original PR's is that the generate operation is still a loop, encrypting one AES block at a time. If this PR is not planned to be included in coming 1.1.1 release, i would suggest to take the time to evaluate the pre-inc vs post-inc thing and possibly go for a ctr-mode-based generate operation. |
Commit a93ba40 breaks those kat tests. Didnt look into detail but its most likely due to DRBG instances reseeding automatically: The kats check for correct 32-to-128-bit counter overflow during generate operation and thus need to perform a lot of "deterministic" generate ops i.e, w/o reseeding. |
|
I can try that test again after disabling the reseeding. |
|
We're moving the beta soon, but I guess this is not a feature and so could possible be merged after we've moved to beta, but I would like to have something before we release and this looks like the most simple way for now. |
|
The extra kat test still fails after disabling reseeding |
My guess was wrong: The kats use callbacks returning all zero buffs to to provide a deterministic result. The first commit breaking the kats is a93ba40 : The second commit breaking the kats is 4917e91 : So the kat failure is not a bug, its just due to changes in the reseed behavior. The test cases could be adapted to reflect the new behavior, but since they only made sense in the ctr-mode based original implementation, they can just be ignored in the context of this PR. |
Okay. Long-term i would welcome a CTR-mode based/optimized*** solution. @mspncp :
Regarding the "pre-increment vs. post-increment" w.r.t. FIPS compliance discussion: The GCM specs (NIST SP 800-38D) feature a pre-inc pseudo code. However, libraries (including libcrypto) implement GCM using CTR-mode* based encryption, instead of using an "inc/ECB"-loop to match said pseudocode. My PR tried to do the same optimization for DRBG. Regarding the "EVP-interface must be used" discussion: I dont agree that libcrypto submodules should be limited** by only being allowed to use exported interfaces. *(which is usually implemented as post-inc, i.e. cipher's context structure keeps the counter updated for next operation) |
|
Is there something that still needs to happen? Or can someone review this? |
richsalz
left a comment
There was a problem hiding this comment.
Yes, this gets back some big performance loss.
Reviewed-by: Rich Salz <[email protected]> GH: #5580
Signed-off-by: Patrick Steuer <[email protected]> Reviewed-by: Kurt Roeckx <[email protected]> Reviewed-by: Rich Salz <[email protected]> GH: #5580
This is WIP because there are still style issues (line too long) in it, and I want to properly check errors. But it passes the regression tests.
To generate 1000000000 bytes of random data with "openssl rand", I need this amount of time:
1.1.0: 20.233s
HEAD: 12.944s
This PR: 3.024s