Fix reseeding issues introduced by new RAND_OpenSSL() method by mspncp · Pull Request #4328 · openssl/openssl

mspncp · 2017-09-03T21:45:48Z

Checklist

documentation is added or updated (RAND_add() needs to be updated)

Introduction

The broken 'RAND_add()/RAND_bytes()' pattern

In the thread [openssl-dev] Plea for a new public OpenSSL RNG API I explained in detail that the different reseeding concepts of the classic OpenSSL RNG and the new RAND_DRBG API (pushing entropy via RAND_add() vs. pulling entropy via get_entropy() callback) make it so difficult to get both APIs work together in harmony, at least as far as reseeding is concerned. The following is a brief excerpt from that thread:

In OpenSSL, the classical way for the RNG consumer to add his own randomness is to call RAND_add() before calling RAND_bytes(). If the new RAND_OpenSSL() method (the "compatibility layer" hiding the public RAND_DRBG instance) is the default, then this does not work as expected anymore:

The reason is that a call to RAND_add() adds the provided randomness only to a global buffer (rand_bytes), from which it will be pulled during the next reseed. But no reseed is triggered. So the next RAND_bytes() call will be unaffected from the RAND_add(), which is not what the consumer expected. (The same holds for RAND_seed(), since drbg_seed() only calls into drbg_add())

Reseeding of DRBGs occurs only at the following occasions:

immediately after a fork() (new)

if the reseed_counter exceeds the reseed_interval

if RAND_DRBG_generate() is called requesting prediction_resistance

RAND_DRBG_reseed() is called explicitely

Note: Currently it looks like the situation is even worse: if RAND_add() is called multiple times before a reseed occurs, then the result of the previous call is overwritten.

In this pull request I propose a solution to this problem for general discussion.

Some background information on seeding the DRBG

NIST SP 800-90Ar1 has the concept of security_strength. Currently there are three different security strengths for the CTR_DRBG, specified by the NIDs NID_aes_128_ctr, NID_aes_192_ctr, NID_aes_256_ctr. Their parameter values are hardcoded (in ctr_init()) and taken from 'Table 3: Definitions for the CTR_DRBG' on page 49 of NIST SP 800-90Ar1. In particular, there is no need to have an extra RANDOMNESS_NEEDED constant.

The role of the derivation function for seeding

The internal state of the CTR_DRBG consists of two vectors

    unsigned char K[<keylen>];
    unsigned char V[<blocklen>];

so the total number of seed bytes (seedlen in Table 3) equals seedlen = keylen + blocklen. (As mentioned earlier: in the NIST document, everything is in bits, whereas in OpenSSL, entropy is in bits, buffer lengths are in bytes).

Seeding without derivation function

If no derivation function is used, the unmodified input is used to seed (K,V). This means, one has to provide exactly seedlen=keylen+blocken bytes, which in the case of a high quality entropy source contains much more entropy than needed: 8 * seedlen > security_strength.

Seeding with derivation function

If a derivation function is used (which we always do), then the input can be of variable length between drbg->min_entropylen and drbg->max_entropylen. The DF takes a variable length input and produces a pseudorandom output from which (K,V) is seeded. This is done using AES-CTR. In some sense, the DF acts as a meat grinder, compressing the entropy of the input.

Using the derivation has two advantages: 1) min_entropylen = keylen < seedlen , so you need less input if your input has full entropy and 2) The input can be of variable size min_entropylen < inputsize < max_entropylen , which is good if your input has low entropy (say, only 2 bits per byte).

All values can be found in Table 3 and are hardcoded in ctr_init():

	int ctr_init(RAND_DRBG *drbg)
	{
		...

		ctr->keylen = keylen;
		drbg->strength = keylen * 8;
		drbg->seedlen = keylen + 16;

		if (drbg->flags & RAND_DRBG_FLAG_CTR_USE_DF) {
			/* df initialisation */
			static unsigned char df_key[32] = {
				0x00,0x01,0x02,0x03,0x04,0x05,0x06,0x07,
				0x08,0x09,0x0a,0x0b,0x0c,0x0d,0x0e,0x0f,
				0x10,0x11,0x12,0x13,0x14,0x15,0x16,0x17,
				0x18,0x19,0x1a,0x1b,0x1c,0x1d,0x1e,0x1f
			};
			/* Set key schedule for df_key */
			AES_set_encrypt_key(df_key, drbg->strength, &ctr->df_ks);

			drbg->min_entropylen = ctr->keylen;
			drbg->max_entropylen = DRBG_MAX_LENGTH;
			drbg->min_noncelen = drbg->min_entropylen / 2;
			drbg->max_noncelen = DRBG_MAX_LENGTH;
			drbg->max_perslen = DRBG_MAX_LENGTH;
			drbg->max_adinlen = DRBG_MAX_LENGTH;
		} else {
			drbg->min_entropylen = drbg->seedlen;
			drbg->max_entropylen = drbg->seedlen;
			/* Nonce not used */
			drbg->min_noncelen = 0;
			drbg->max_noncelen = 0;
			drbg->max_perslen = drbg->seedlen;
			drbg->max_adinlen = drbg->seedlen;
		}

		...
	}

The easiest way would be if get_entropy() would allocate a buffer of length max_entropylen and fill it with bytes until the entropy threshold has been reached. However, max_entropylen = DRBG_MAX_LENGTH is an astronomically high value. But this value is only a theoretical upper limit from the NIST document. I changed this members value to a more realistic value:

	drbg->max_entropylen = 32 * drbg_min_entropylen    /* similar for adinlen, etc. */

This corresponds to the very pessimistic estimate that the entropy of the input lies between 1/4 bits-per-byte and 8 bits-per-byte. That should be enough.

Now in the callback, the buffer is filled until the required entropy threshold is reached. If the buffer overflows (which should never happen), get_entropy() fails. To facilitate this, I added a new "class" RAND_POOL which acts as container for this variable sized random input.

This is my vision of how get_entropy() should be used. Even if get_entropy() pulls from /dev/urandom or other full-entropy sources, the implementers should have the freedom to return more than the requested entropy from alternative entropy sources and/or be allowed to return cumulate low entropy input.

Summary of the pull request

The solution is split in two parts:

Commit 1 fixes the entropy pushing part, i.e. drbg_add()
Commit 2 fixes the entropy pulling part, i.e. drbg_get_entropy(), and restores the original semantics of RAND_poll()

This splitting is only for the moment to keep the changes overseeable and facilitate reviewing. Also, to keep things simple as long this is still WIP, I will avoid force pushing and address comments by adding new commits. I will squash the commits with a proper commit message when this goes out of WIP.

Commit 1: Make drbg_add() reseed the RAND_DRBG immediately

The obvious solution to use RAND_DRBG_reseed() creates a circular dependency, because now RAND_add() is involved both in pushing and pulling randomness:

RAND_add() -> drbg_add() -> RAND_DRBG_reseed() -> get_entropy()
                       -> get_entropy_from_system() -> RAND_poll_ex() -> RAND_add()

Using RAND_DRBG_generate() with additional input (generating and discarding some dummy output) instead of RAND_DRBG_reseed() does not solve the problem, because RAND_DRBG_generatE() can autoreseed occcasionally.

Instead, we add a new function rand_drbg_add() which is identical to RAND_DRBG_reseed(), except that it does not pull fresh entropy (and consequently does not reset the reseed_counter). This function is used only by the RAND_METHOD drbg_add().

Now the left hand side becomes:

RAND_add() -> drbg_add() -> rand_drbg_add()

The first commit breaks RAND_poll_ex(), but this function is rewritten and replaced by RAND_pool_fill() the second commit.

Commit 2: Cutting the Gordian knot: Decouple the reseeding code from the two RNG APIs

We do this by adding a new API for collecting entropy: The static fixed size RAND_BYTES_BUFFER is replaced by a more elaborate RAND_POOL "class" which acts as a container for collecting variable length randomness input for the derivation function.

The functions RAND_poll() and drbg_get_entropy() manage their own instances of such a RAND_POOL object, which they allocate and pass down the stack to the function RAND_POOL_fill() (formerly: RAND_poll_ex(). Since the lifetime and scope of these RAND_POOL objects is limited to the stack frame of RAND_poll() and drbg_get_entropy(), respectively, there is no need for locking.

Note also, that the RAND_POOL is completely decoupled from the RNG, the RAND_POOL_fill() callback has no notion which kind of RNG is reseeded from the data fed into the pool. Instead, the RAND_POOL class provides a rich API (@richsalz: no pun intended ;-) ) which appears a little oversized if one adheres to the principle that the operating system's RNG is the optimal entropy source with "full entropy" (8 bits per byte) and that it is not necessary to consider additional entropy sources or sources with lower entropy rates.

However, if one seeks for a flexible way of mixing in various entropy resources, as requested by multiple contributers, then such an API may come in handy. Currently, its just a blueprint and proof-of-concept. It provides a set of accessors providing information to the callbacks on which they could base there decisions on how much entropy should be added from the availabe resources.

I don't have a concept yet, how such a variable reseeding scheme could be managed and configured. But that's not part of this PR anyway. Input welcome, here or on openssl-dev.

Remarks

Randomness vs. Entropy

Many occurrances of the term "entropy" have been replaced by the more vague term "randomness" only recently. Although being mathematician, I have no problems using "entropy" synonymously to "randomness" in colloquial language. So originally I considered that discussion vain.

However, now this recent change change turns out to be really handy, because it enables me to consistently distinguish between a double randomness argument and a int entropy argument. The former is a floating point value measured in bytes, whereas the other is an integer value measured in bits and has a strict meaning in the sense of NIST SP800-90Ar1.

int RAND_POOL_add(RAND_POOL *pool, const void *buffer, int num, double randomness)
{
    size_t len  = (size_t)num;
    int entropy = (int)(randomness * 8.0);
    ...
}

TODO(DRBG)

Some places where I seek particular feedback or closer review have been marked with TODO(DRBG). These TODO's need to be resolved and removed before the PR can go out of WIP.

In particular, please have a closer look a the TODO(DRBG) comment preceding rand_drbg_add().

Reviewing and Testing

Since the RNG is a critical part of OpenSSL, any change to the code is like an open heart surgery. I'm not a seasoned committer and don't have the resources to test it on all platforms and under all circumstances. So I am grateful for any independent testing, in particular the DRBG chaining and forking stuff, since I a am only familiar with libcrypto and not such with libssl.

What I tested: I successfully ran the standard tests and did a little debugging with gdb to check the main code paths:

# global DRBG, fetching entropy from os for instantiation     (break at rbg_get_entropy)
./util/shlib_wrap.sh gdb --args ./apps/openssl rand -hex 32

# chained DRBG, fetching entropy from parent DRBGfor instantiation   (break at drbg_get_entropy, then add breakpoint inside 'if (drbg->parent)'-clause)
./util/shlib_wrap.sh gdb --args ./apps/openssl s_client -host www.openssl.org -port 443

# global DRBG, RAND_add()         (break at drbg_add)
echo 'This is a random number' > /tmp/randfile
./util/shlib_wrap.sh gdb --args ./apps/openssl rand -hex -rand /tmp/randfile  12

# test
./util/shlib_wrap.sh gdb --args ./test/drbgtest

mspncp · 2017-09-03T22:53:01Z

Ok, the windows build fails because the polling code in rand_win.c was not updated. I will fix this and have a look at the travis builds, too.

paulidale

I like where this is going. Great stuff.

paulidale · 2017-09-03T22:54:48Z

crypto/rand/rand_lcl.h

+ * This input will be sent through the derivation function which 'compresses'
+ * the low quality input into a high quality output
+ */
+# define DRBG_MINMAX_FACTOR              32


Could the factor be made larger? Allowing down to 0.1 bits per byte perhaps? There is no harm in making the buffer bigger, a higher quality source will always terminate early I believe.

I have some test data from /proc/net/dev on an embedded ARM processor that gave 0.14 bits per byte when assessed by the NIST SP 800-90B second draft estimates. Small processors like these are notorious for having difficulty finding entropy, and a number of poor sources are often the only recourse.

I actually have far worse data sets, one at 0.033 and one at 0.010 bit per byte but they've problems passing some of the second draft tests. My worst ever collected is an impressive 0.0022 bits per byte (using the first draft SP 800-90B tests), I didn't bother running the second draft tests for this one.

Of course. We're not talking about a huge amount of memory here. It was just some initial value and I didn't want to be too wasteful.

I like powers of two, so how about taking 128?

# define DRBG_MINMAX_FACTOR 128

This would correspond to an entropy ratio of 0.0625 bits per byte. One caveat however is that the resolution of RAND_POOL_bytes_needed() only goes down to 1 bit per byte:

int RAND_POOL_bytes_needed(RAND_POOL *pool, unsigned int entropy_per_byte)

I would object to replacing the integer by a double, or using something like bits-per-hundert-bytes instead. However, RAND_POOL_bytes_needed() is only a convenience method and you can always do the easy calculations by yourself, using the other accessors.

This sounds reasonable, I don't mind if it is a power or two or not. I agree, avoid doubles unless we go all in using them (which isn't going to happen).

mspncp · 2017-09-04T19:48:25Z

Phew! That was a hard job to stop all jobs from complaining. I had a hard time struggling with the subtleties of the OpenSSL build system and with long forgotten differences between C++ and C. Now all CI jobs build, except for the travis job #13671.5, which fails with the message

make: *** No rule to make target `build_tests'.  Stop.

There is nothing much I can do about this error :-|

kroeckx · 2017-09-04T21:46:17Z

crypto/rand/drbg_lib.c

Could you make it more clear that this is about additional data?

Agreed, thanks for the suggestion!

kroeckx · 2017-09-04T21:51:00Z

crypto/rand/drbg_rand.c

I would like to see some check to verify that it's not larger than what the NIST documentation has, which can probably be a compile time error.

Really? Normally, I am a great friend of static asserts, but in this case I would say that it is overkill: Both factors, DRBG_MINMAX_FACTOR and drbg->min_entropylen are hardcoded and not user input. And we are talking about values like 128 * 256 in the worst case. So I would say, since the 16bit era is history, we can trust the compiler to calculate this value correctly. ;-)

kroeckx · 2017-09-04T21:53:45Z

crypto/rand/rand_lib.c

I think we should just add the whole TSC, I see little value in only taking the last 8 bits.

I guess you can leave it like this for now.

I prefer adding the whole of it. There is some randomness in the bits outside of the last 8. Not a lot but some. My gathering xors the bytes together to produce a single byte of output but it isn't necessary.

If you don't mind, I would prefer if you could postpone this question to a followup PR, because it's only indirectly related to the subject of this PR.

mattcaswell · 2017-09-05T07:06:39Z

There is nothing much I can do about this error :-|

Yes - that is a problem unrelated to this PR. Are you able to submit a CLA though to make cla-check happy?

mattcaswell · 2017-09-05T07:08:46Z

Incidentally - is this submission personal or on behalf of NCP? If the latter then we will need a corporate CLA too (we will need an Individual CLA either way). If the former, it would probably be better to submit from a personal email address if you have one.

mspncp · 2017-09-05T11:59:53Z

A CLA for this company address has been signed long time ago. According to @richsalz, there seems to be a case sensitivity issue that prevents the bot from recognizing it.

A for the second question: This PR has been done entirely in my free time, so in that sense it could be considered personal. On the other hand, it is related to my work because we are currently using the FIPS DRBG with OpenSSL 1.0.2 and have an interest to continue using it after the migration to 1.1.

I will discuss this with my boss and let you know.

mattcaswell · 2017-09-05T13:04:25Z

A CLA for this company address has been signed long time ago. According to @richsalz, there seems to be a case sensitivity issue that prevents the bot from recognizing it.

Right - thanks. I found the CLA. There was a case-sensitivity issue which I have corrected. However I could only find an Individual CLA for yourself, not a corporate CLA for ncp-e.com. We need both unless this is a purely personal contribution. Ideally if its purely personal then you should use a different (personal) email address in your commits so its always clear who owns the code that is being contributed. For now I've suspended the CLA in our database until we figure out who is doing the contribution!

richsalz · 2017-09-05T20:25:22Z

I think this is good work, and am glad to see it getting thorough review feedback. I'm a little busy right now; don't wait for me :)

mspncp · 2017-09-06T09:28:10Z

Simplify RAND_POOL_add_begin()/RAND_POOL_add_end() calling sequence

Calling RAND_POOL_add_begin() with num == 0 is now considered a no-op, not an error anymore. As a consequence, some nested if statements like

#   ifdef OPENSSL_RAND_SEED_GETRANDOM
    bytes_needed = RAND_POOL_bytes_needed(pool, 8 /*entropy_per_byte*/);
    if (bytes_needed > 0) {
        buffer = RAND_POOL_add_begin(pool, bytes_needed);
        if (buffer != NULL) {
            int bytes = getrandom(buffer, bytes_needed, 0);
            entropy_available = RAND_POOL_add_end(pool, bytes, bytes);
        }
        if (entropy_available > 0)
            return entropy_available;
    }
#   endif

can be simplified to:

#   ifdef OPENSSL_RAND_SEED_GETRANDOM
    bytes_needed = RAND_POOL_bytes_needed(pool, 8 /*entropy_per_byte*/);
    buffer = RAND_POOL_add_begin(pool, bytes_needed);
    if (buffer != NULL) {
        int bytes = getrandom(buffer, bytes_needed, 0);
        entropy_available = RAND_POOL_add_end(pool, bytes, bytes);
    }
    if (entropy_available > 0)
        return entropy_available;
#   endif

mspncp · 2017-09-06T09:51:02Z

That's it for the moment.

Three TODO's are remaining. If no one has any more comments, than I will clean up the code on friday and put it out of WIP. My company signalled that they are willing to sign a corporate cla, but it will take a few more days until I get it. Probably beginning of next week.

TODO 1

Looks like the implementation of rand_drbg_add() requires less discussion than I thought. If nobody objects, I will just remove the comment and stick to variant 1).

    /*
     * TODO(DRBG): looking at the source code of ctr_update() it
     * becomes obvious that |adin| needs to be passed as |in1| and
     * not  as |in2| parameter of ctr_update():
     *
     *       ctr_update(drbg, adin, adinlen, NULL, 0, NULL, 0);
     *
     * Otherwise the line
     *
     *            ctr_XOR(ctr, ctr->KX, drbg->seedlen);
     *
     * would not get executed. So we have three options
     *
     *  1) pass |adin| as |entropy| argument to ctr_reseed():
     *
     *        ctr_reseed(drbg, adin, adinlen, NULL, 0);
     *
     *  2) call ctr_generate() with |adin| to generate dummy output:
     *
     *        ctr_generate(drbg, dummy, sizeof(dummy), adin, adinlen)
     *
     *  3) make ctr_update() non-static and call it directly
     *
     *        ctr_update(drbg, adin, adinlen, NULL, 0, NULL, 0);
     *
     * Variants 1) and 2) both invoke 3), but 2) is more heavyweigth
     * Variant 3) would be the simplest solution, but the function
     * is currently static. So I chose variant 1), even though
     * using |adin| for |entropy| looks like a hack.
     *
     * Any comments? Suggestions?
     */

TODO 2

I leave this TODO in the code, passing on the torch to others. Anybody with ideas for a good concept may feel free to come up with a PR.

/*
 * Try the various seeding methods in turn, exit when successful.
 *
 * TODO(DRBG): If more than one entropy source is available, is it
 * preferrable to stop as soon as enough entropy has been collected 
 * (as favored by @rsalz) or should one rather be defensive and add 
 * more entropy than requested and/or from different sources?
 * If yes, how can this be made configurable?
 */

TODO 3

This looks like a simple task for a seasoned OpenSSL contributor, so I'll leave it up to somebody else to post a followup PR.

/*
 * Default security strength (in the sense of [NIST SP 800-90Ar1])
 * of the default OpenSSL DRBG, and the corresponding NID.
 * 
 * Currently supported values: 128, 192, 256
 * 
 * TODO(DRBG): would be nice to have the strength configurable
 */
# define RAND_DRBG_STRENGTH             128
# define RAND_DRBG_NID                  NID_aes_128_ctr

kroeckx · 2017-09-06T21:15:11Z

So I have some comments, but that doesn't mean anything needs to change (yet):

RAND_add() will now be used as additional data for the global public DRBG. It's not used for the private global DRBG. We might need an new API that can say to which one it should go to, but I think we should deprecate those push type APIs, so no need to add an other one.
RAND_add() used to add entropy, now it's only additional data. When OpenSSL doesn't know how to get entropy it was possible to work around this by calling RAND_add(). But with the default changed to the DRBG it will now be impossible instantiate it, and it might stop working for someone who just upgrades from 1.1.0 to 1.1.1. But since we want to have a pull API for the entropy (and additional data), this push API isn't going to work anyway. We might need to do something like fall back to the old RAND method in that case.

kroeckx · 2017-09-06T21:19:01Z

crypto/rand/drbg_lib.c

Should that just say number of bytes (in pout)? It seems to give the impression that you want to know how much entropy it has as bytes.

Can you also say that it returns 0 on error?

The formulation was not very comprehensible. Second attempt:

/* * Set the RAND_DRBG callbacks for obtaining entropy and nonce. * * get_entropy() * * This is a request to allocate and fill a buffer of size |min_len| <= size <= |max_len| * (in bytes) which contains at least |entropy| bits of entropy. The buffers address is * to be returned in |*pout| and the number of collected randomness bytes as return value. * * If the callback succeeds to acquire |entropy| bits of entropy, it returns the number of * bytes written to the buffer (which may be less than the allocated size of the buffer). * If the callback fails, it returns a buffer length of 0. *

kroeckx · 2017-09-06T21:21:55Z

crypto/rand/drbg_lib.c

Could you document this function, and say that randomness is ignored because it's used for additional data?

I would like to add that normally we don't document those functions, because this one is documented by RAND_add()

Added comments to all DRBG RAND_METHOD callbacks:

/* * Implements the default OpenSSL RAND_add() method * * Since RAND_add() is only used to push additional input, * the |randomness| parameter is ignored. */ static int drbg_add(const void *buf, int num, double randomness) { ... } /*Implements the default OpenSSL RAND_seed() method */ static int drbg_seed(const void *buf, int num) { ... } etc.

kroeckx · 2017-09-06T21:45:14Z

crypto/rand/rand_lib.c

Please document this function, at least what the return value means.

I didn't implement this function ;-) Nevertheless, here is my documentation:

/* * Acquire entropy using intel specific cpu instructions * * Uses the RDSEED instruction if available, otherwise uses * RDRAND if available. * * For a difference between RDSEED and RDRAND, and why RDSEED * is the preferred choice, see * https://software.intel.com/en-us/blogs/2012/11/17/the-difference-between-rdrand-and-rdseed * * Returns the total entropy count, if it exceeds the requested * entropy count. Otherwise, returns an entropy count of 0. */ int rand_acquire_entropy_from_cpu(RAND_POOL *pool) { ... }

But it was rewritten, and the return value changed.

You're right. So is the documentation of this function ok for you now?

kroeckx · 2017-09-06T21:48:41Z

crypto/rand/rand_lib.c

This seems to be duplicate documentation that is also at RAND_DRBG_set_callbacks()?

kroeckx · 2017-09-06T21:50:49Z

crypto/rand/rand_lib.c

I see little point in documenting how it used to be done.

Changed to:

/* * Implements the get_entropy() callback * * If the DRBG has a parent, then the required amount of entropy input * is fetched using the parent's RAND_DRBG_generate(). * * Otherwise, the entropy is polled from the system entropy sources * using RAND_POOL_acquire_entropy(). */

paulidale

Reconfirmed.

One possible indentation nit.

paulidale · 2017-10-17T22:26:24Z

crypto/rand/rand_lcl.h

+/* DRBG helpers */
+int rand_drbg_restart(RAND_DRBG *drbg,
+                     const unsigned char *buffer, size_t len, size_t entropy);



kaduk · 2017-10-17T23:35:11Z

As a heads-up/last call, I intend to merge this in about 13 hours, unless objections or other volunteers to merge appear before then.

mspncp · 2017-10-18T10:45:02Z

This travis job fails to boot: https://travis-ci.org/openssl/openssl/jobs/289370926

Certainly not related to my last change ;-)

Reseeding is handled very differently by the classic RAND_METHOD API and the new RAND_DRBG api. These differences led to some problems when the new RAND_DRBG was made the default OpenSSL RNG. In particular, RAND_add() did not work as expected anymore. These issues are discussed on the thread '[openssl-dev] Plea for a new public OpenSSL RNG API' and in Pull Request #4328. This commit fixes the mentioned issues, introducing the following changes: - Replace the fixed size RAND_BYTES_BUFFER by a new RAND_POOL API which facilitates collecting entropy by the get_entropy() callback. - Don't use RAND_poll()/RAND_add() for collecting entropy from the get_entropy() callback anymore. Instead, replace RAND_poll() by RAND_POOL_acquire_entropy(). - Add a new function rand_drbg_restart() which tries to get the DRBG in an instantiated state by all means, regardless of the current state (uninstantiated, error, ...) the DRBG is in. If the caller provides entropy or additional input, it will be used for reseeding. - Restore the original documented behaviour of RAND_add() and RAND_poll() (namely to reseed the DRBG immediately) by a new implementation based on rand_drbg_restart(). - Add automatic error recovery from temporary failures of the entropy source to RAND_DRBG_generate() using the rand_drbg_restart() function. Reviewed-by: Paul Dale <[email protected]> Reviewed-by: Kurt Roeckx <[email protected]> Reviewed-by: Rich Salz <[email protected]> Reviewed-by: Ben Kaduk <[email protected]> (Merged from #4328)

The DRBG_RESEED state plays an analogue role to the |reseed_required_flag| in Appendix B.3.4 of [NIST SP 800-90A Rev. 1]. The latter is a local variable, the scope of which is limited to the RAND_DRBG_generate() function. Hence there is no need for a DRBG_RESEED state outside of the generate function. This state was removed and replaced by a local variable |reseed_required|. Reviewed-by: Paul Dale <[email protected]> Reviewed-by: Kurt Roeckx <[email protected]> Reviewed-by: Rich Salz <[email protected]> Reviewed-by: Ben Kaduk <[email protected]> (Merged from #4328)

The drbg's lock must be held across calls to RAND_DRBG_generate() to prevent simultaneous modification of internal state. This was observed in practice with simultaneous SSL_new() calls attempting to seed the (separate) per-SSL RAND_DRBG instances from the global rand_drbg instance; this eventually led to simultaneous calls to ctr_BCC_update() attempting to increment drbg->bltmp_pos for their respective partial final block, violating the invariant that bltmp_pos < 16. The AES operations performed in ctr_BCC_blocks() makes the race window quite easy to trigger. A value of bltmp_pos greater than 16 induces catastrophic failure in ctr_BCC_final(), with subtraction overflowing and leading to an attempt to memset() to zero a very large range, which eventually reaches an unmapped page and segfaults. Provide the needed locking in get_entropy_from_parent(), as well as fixing a similar issue in RAND_priv_bytes(). There is also an unlocked call to RAND_DRBG_generate() in ssl_randbytes(), but the requisite serialization is already guaranteed by the requirements on the application's usage of SSL objects, and no further locking is needed for correct behavior. In that case, leave a comment noting the apparent discrepancy and the reason for its safety (at present). Reviewed-by: Paul Dale <[email protected]> Reviewed-by: Kurt Roeckx <[email protected]> Reviewed-by: Rich Salz <[email protected]> (Merged from #4328)

kaduk · 2017-10-18T13:56:02Z

Pushed, at long last.
@mspncp thank you for adding those last few fixup commits and saving me some work.
Also thank you again for doing this, and your patience through the exhaustive review process.
Maybe we can move on to #4402 now...

mspncp · 2017-10-18T15:14:12Z

Thanks!

Thanks in return to all reviewers for the great effort you put in reviewing this PR. You gave me a hard time and kept me busy for quite a while, but your reviews also give me a good feeling about the final result. The random generator is such a crucial part of the crypto library and requires very much care. And I really don't want to be the guy who is blamed sometimes in the future for having introduced some RANDBLEED bug. ;-)

What's next

Now that pull request #4328 has finally been merged, I have regained new drive to follow my long-term goal which is to make the RAND_DRBG public. I added some thoughts about my private roadmap to Issue #4403. You are welcome to participate and comment.

kroeckx · 2017-10-18T18:57:09Z

Because I think it's so important to get right, I'm taking my time to review and understand everything.

mspncp · 2018-10-09T14:05:13Z

To all early reviewers of this pr: I would like to recover commit 7a6a26a, which I lost in my local repo a while ago by garbage collection. It is still displayed here on GitHub, together with the entire branch, but github doesn't let me fetch it, because it is not reachable from one of the advertised references.

$ git fetch github 7a6a26a0284199f1d0d8494cf6a46c9bd36968dc
error: Server does not allow request for unadvertised object 7a6a26a0284199f1d0d8494cf6a46c9bd36968dc

(I know that I can download the patches and recreate the branch, but I'd rather have the original commits)

If one of you checked it out for reviewing (or even if you only have a fetch rule for refs/pull/*/head), it is likely that you still have it lying around in your local copy. If yes, could you please be so kind and try to recover it and push it to your github clone for me?

git checkout -b for-msp  a6a26a0284199f1d0d8494cf6a46c9bd36968dc
git push github for-msp

(If you named your github clone differently, you have to replace github accordingly.)

If you succeed, please send me the link. Thank you in advance!

mattcaswell · 2018-10-09T14:15:31Z

To all early reviewers of this pr: I would like to recover commit 7a6a26a

Get it from this branch:

https://github.com/mattcaswell/openssl/tree/decouple-reseed-code

mspncp · 2018-10-09T16:11:46Z

Cool! Thank you very much, Matt! You can delete the branch in your repository again, I have pushed it to my repository as msp-github-4328-fix-rand-add-ori, together with the snapshot of the final version msp-github-4328-fix-rand-add.

In pull request openssl#4328 the seeding of the DRBG via RAND_add()/RAND_seed() was implemented by buffering the data in a random pool where it is picked up later by the rand_drbg_get_entropy() callback. This buffer was limited to the size of 4096 bytes. When a larger input was added via RAND_add() or RAND_seed() to the DRBG, the reseeding failed, but the error returned by the DRBG was ignored by the two calling functions, which both don't return an error code. As a consequence, the data provided by the application was effectively ignored. This commit fixes the problem by a more efficient implementation which does not copy the data in memory and by raising the buffer the size limit to INT32_MAX (2 gigabytes). This is less than the NIST limit of 2^35 bits but it was chosen intentionally to avoid platform dependent problems like integer sizes and/or signed/unsigned conversion. Additionally, the DRBG is now less permissive on errors: In addition to pushing a message to the openssl error stack, it enters the error state, which forces a reinstantiation on next call. Thanks go to Dr. Falko Strenzke for reporting this issue to the openssl-security mailing list. After internal discussion the issue has been categorized as not being security relevant, because the DRBG reseeds automatically and is fully functional even without additional randomness provided by the application. Fixes openssl#7381

In pull request #4328 the seeding of the DRBG via RAND_add()/RAND_seed() was implemented by buffering the data in a random pool where it is picked up later by the rand_drbg_get_entropy() callback. This buffer was limited to the size of 4096 bytes. When a larger input was added via RAND_add() or RAND_seed() to the DRBG, the reseeding failed, but the error returned by the DRBG was ignored by the two calling functions, which both don't return an error code. As a consequence, the data provided by the application was effectively ignored. This commit fixes the problem by a more efficient implementation which does not copy the data in memory and by raising the buffer the size limit to INT32_MAX (2 gigabytes). This is less than the NIST limit of 2^35 bits but it was chosen intentionally to avoid platform dependent problems like integer sizes and/or signed/unsigned conversion. Additionally, the DRBG is now less permissive on errors: In addition to pushing a message to the openssl error stack, it enters the error state, which forces a reinstantiation on next call. Thanks go to Dr. Falko Strenzke for reporting this issue to the openssl-security mailing list. After internal discussion the issue has been categorized as not being security relevant, because the DRBG reseeds automatically and is fully functional even without additional randomness provided by the application. Fixes #7381 Reviewed-by: Paul Dale <[email protected]> (Merged from #7382)

In pull request #4328 the seeding of the DRBG via RAND_add()/RAND_seed() was implemented by buffering the data in a random pool where it is picked up later by the rand_drbg_get_entropy() callback. This buffer was limited to the size of 4096 bytes. When a larger input was added via RAND_add() or RAND_seed() to the DRBG, the reseeding failed, but the error returned by the DRBG was ignored by the two calling functions, which both don't return an error code. As a consequence, the data provided by the application was effectively ignored. This commit fixes the problem by a more efficient implementation which does not copy the data in memory and by raising the buffer the size limit to INT32_MAX (2 gigabytes). This is less than the NIST limit of 2^35 bits but it was chosen intentionally to avoid platform dependent problems like integer sizes and/or signed/unsigned conversion. Additionally, the DRBG is now less permissive on errors: In addition to pushing a message to the openssl error stack, it enters the error state, which forces a reinstantiation on next call. Thanks go to Dr. Falko Strenzke for reporting this issue to the openssl-security mailing list. After internal discussion the issue has been categorized as not being security relevant, because the DRBG reseeds automatically and is fully functional even without additional randomness provided by the application. Fixes #7381 Reviewed-by: Paul Dale <[email protected]> (Merged from #7382) (cherry picked from commit 3064b55)

openssl-machine added the hold: cla required The contributor needs to submit a license agreement label Sep 3, 2017

mspncp force-pushed the pr-fix-rand-add branch from 766ac66 to 7a6a26a Compare September 3, 2017 22:11

openssl-machine added the hold: cla required The contributor needs to submit a license agreement label Sep 3, 2017

paulidale reviewed Sep 3, 2017

View reviewed changes

mspncp force-pushed the pr-fix-rand-add branch from 866967d to dbfe75f Compare September 4, 2017 19:03

openssl-machine added the hold: cla required The contributor needs to submit a license agreement label Sep 4, 2017

mspncp force-pushed the pr-fix-rand-add branch from dbfe75f to 45fe9bc Compare September 4, 2017 19:14

openssl-machine added the hold: cla required The contributor needs to submit a license agreement label Sep 4, 2017

mspncp force-pushed the pr-fix-rand-add branch from 45fe9bc to 6219d3d Compare September 4, 2017 19:27

openssl-machine added the hold: cla required The contributor needs to submit a license agreement label Sep 4, 2017

kroeckx reviewed Sep 4, 2017

View reviewed changes

openssl-machine added the hold: cla required The contributor needs to submit a license agreement label Sep 6, 2017

kroeckx reviewed Sep 6, 2017

View reviewed changes

paulidale approved these changes Oct 17, 2017

View reviewed changes

Fix two more nits, indentation + spelling [fixup 0ee1a83]

68e1f14

kaduk closed this Oct 18, 2017

mspncp deleted the pr-fix-rand-add branch October 18, 2017 16:21

mspncp mentioned this pull request Nov 4, 2017

When to use the private and public RNGs #4641

Closed

This was referenced Mar 4, 2018

Use a nonce for the ctr drbg #5503

Closed

Make the public and private DRBG thread local #5547

Closed

mspncp mentioned this pull request Jun 22, 2018

Remove __cplusplus preamble from internal headers #6554

Closed

mspncp restored the pr-fix-rand-add branch October 9, 2018 14:08

mspncp mentioned this pull request Oct 10, 2018

DRBG: fix reseeding via RAND_add()/RAND_seed() with large input #7382

Closed

mspncp deleted the pr-fix-rand-add branch October 22, 2018 19:46

mspncp mentioned this pull request Oct 23, 2018

Fix data race in RAND_DRBG_generate #7399

Closed

mspncp mentioned this pull request Apr 11, 2019

Ignore entropy from RAND_add()/RAND_seed() in FIPS mode #8722

Closed

2 tasks

Uh oh!

Comments

Conversation

mspncp commented Sep 3, 2017 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Checklist

Introduction

The broken 'RAND_add()/RAND_bytes()' pattern

Some background information on seeding the DRBG

The role of the derivation function for seeding

Seeding without derivation function

Seeding with derivation function

Summary of the pull request

Commit 1: Make drbg_add() reseed the RAND_DRBG immediately

Commit 2: Cutting the Gordian knot: Decouple the reseeding code from the two RNG APIs

Remarks

Randomness vs. Entropy

TODO(DRBG)

Reviewing and Testing

Uh oh!

mspncp commented Sep 3, 2017

Uh oh!

paulidale left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mspncp commented Sep 4, 2017

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mattcaswell commented Sep 5, 2017

Uh oh!

mattcaswell commented Sep 5, 2017

Uh oh!

mspncp commented Sep 5, 2017

Uh oh!

mattcaswell commented Sep 5, 2017

Uh oh!

richsalz commented Sep 5, 2017

Uh oh!

mspncp commented Sep 6, 2017

Simplify RAND_POOL_add_begin()/RAND_POOL_add_end() calling sequence

Uh oh!

mspncp commented Sep 6, 2017

That's it for the moment.

TODO 1

TODO 2

TODO 3

Uh oh!

kroeckx commented Sep 6, 2017

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

mspncp commented Sep 3, 2017 •

edited

Loading