Performance issues with OpenSSL 1.1.1c caused by DEVRANDOM_WAIT #9078

@mspncp

Description

In pull request #8251, a select() call on /dev/random was added before reading from /dev/urandom. Let's call it the DEVRANDOM_WAIT feature, named after the preprocessor macro involved.

#ifdef DEVRANDOM_WAIT
    static int wait_done = 0;

    /*
     * On some implementations reading from /dev/urandom is possible
     * before it is initialized. Therefore we wait for /dev/random
     * to be readable to make sure /dev/urandom is initialized.
     */
    if (!wait_done && bytes_needed > 0) {
        int f = open(DEVRANDOM_WAIT, O_RDONLY);

        if (f >= 0) {
            fd_set fds;

            FD_ZERO(&fds);
            FD_SET(f, &fds);
            while (select(f+1, &fds, NULL, NULL, NULL) < 0
                   && errno == EINTR);
            close(f);
        }
        wait_done = 1;
    }
#endif

The intention of this change was to mitigate a weakness of the /dev/urandom device, namely that it does not block until it has been initially seeded, in contrast to the getrandom() system call; see issue #8215. Note that this change does not affect seeding via the getrandom() source, which is the preferred method on newer operating systems.
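
For illustration only (this sketch is not from the issue or the OpenSSL sources), the difference in behaviour can be seen with a few lines of C: getrandom() with flags == 0 blocks until the kernel CSPRNG has been initially seeded, whereas a plain read from /dev/urandom would return data even before that point.

#include <stdio.h>
#include <sys/types.h>
#include <sys/random.h>        /* getrandom(), glibc >= 2.25 */

int main(void)
{
    unsigned char buf[32];

    /* With flags == 0 this blocks (only at early boot) until the
     * kernel entropy pool has been initialized, then never again. */
    ssize_t n = getrandom(buf, sizeof(buf), 0);

    if (n < 0) {
        perror("getrandom");
        return 1;
    }
    printf("received %zd bytes of seeded randomness\n", n);
    return 0;
}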

There was some discussion on the issue thread before PR #8251 was raised. While people unanimously agreed that using getentropy() is fine, because it blocks at boot time (see also @kroeckx's comment) until the kernel CSPRNG is properly seeded, there was some dissent about whether the /dev/urandom weakness should be addressed by OpenSSL at all.

@t8m and I agreed that it is actually the business of the OS (or rather its init system) to take care of this, not OpenSSL's (see #8215 (comment) and #8215 (comment)). Pull request #8251 emerged as a compromise; the idea was to wait for /dev/random to become readable (without reading any bytes) before reading from /dev/urandom. Nobody else objected anymore, except for @t8m (here and here), who was not yet a committer at that time, and his objections went unnoticed.

However, there is a recent report on the openssl-users mailing list that this change introduced a regression with unwanted performance impacts:

I built OpenSSL 1.1.1c from the recent release, but have noticed what seems like a significant performance drop compared with 1.1.1b. I notice this when starting lighttpd. With 1.1.1b, lighttpd starts in a few seconds, but with 1.1.1c, it takes several minutes.

I think I have tracked down the change in 1.1.1c that is causing this. It is the addition of the DEVRANDOM_WAIT functionality for linux in e_os.h and crypto/rand/rand_unix.c. lighttpd (libcrypto) is waiting in a select() call on /dev/random. After this eventually wakes up, it then reads from /dev/urandom. OpenSSL 1.1.1b did not do this, but instead just read from /dev/urandom. Is there more information about this change (i.e., a rationale)? I did not see anything in the CHANGES file about it.

Jay
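
To verify that the startup delay really comes from this select() on /dev/random, one can time that call in isolation. The following stand-alone program is only a reproducer sketch (it is not part of Jay's report); it mirrors the wait that libcrypto performs once per process when DEVRANDOM_WAIT is defined:

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/select.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/dev/random", O_RDONLY);
    fd_set fds;
    struct timespec start, end;

    if (fd < 0) {
        perror("open /dev/random");
        return 1;
    }

    clock_gettime(CLOCK_MONOTONIC, &start);

    /* Same wait loop as in crypto/rand/rand_unix.c: block until
     * /dev/random becomes readable, retrying on EINTR. */
    FD_ZERO(&fds);
    FD_SET(fd, &fds);
    while (select(fd + 1, &fds, NULL, NULL, NULL) < 0 && errno == EINTR)
        ;

    clock_gettime(CLOCK_MONOTONIC, &end);
    close(fd);

    printf("select() on /dev/random returned after %.3f s\n",
           (end.tv_sec - start.tv_sec) + (end.tv_nsec - start.tv_nsec) / 1e9);
    return 0;
}

On a machine whose entropy pool is already initialized the program returns immediately; on an entropy-starved system it should show the same kind of multi-minute wait described above.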

I think we need to reconsider pull request #8251 and decide whether it was really a good idea to work around a problem of the kernel (or rather the init system) in the OpenSSL library, or whether the negative side effects outweigh the benefits. Currently, I'm leaning towards reverting the DEVRANDOM_WAIT commits on master and 1.1.1.

What's your opinion @openssl/committers?

Metadata


    Labels

    branch: 1.1.1 (Applies to OpenSSL_1_1_1-stable branch (EOL))
    branch: master (Applies to master branch)
