Skip to content

Native C types for OSSL_PARAM#8377

Closed
richsalz wants to merge 5 commits intoopenssl:masterfrom
richsalz:native-c-types
Closed

Native C types for OSSL_PARAM#8377
richsalz wants to merge 5 commits intoopenssl:masterfrom
richsalz:native-c-types

Conversation

@richsalz
Copy link
Contributor

This is an alternative to #8320. It changes the OSSL_PARAM structure to support native C integer types and OpenSSL's int64, as well as size_t. It is reasonably efficient (although you could expand the code to not use memcpy, and instead just do inline code like

*(int *)val = *(int *)p->buffer

if you want. It would make the code slightly bigger (losing most of the common code into separate functions).

I removed float and double; the one real place it's needed is for estimating entropy, and I think it's much simpler to use a scaled number (0-100) rather than a percentage. YMMV on that, and adding double is simple enough.

This PR is also more type-safe than what it is replacing; that code scares me around overflow/underflow and I am pretty convinced it doesn't do the right thing.

I also believe this code is simpler, more straightforward, and easier for developers to comprehend. Of course, having written that I would think that :)

There are a couple of tests. In particular they catch mis-matched types.

I don't have time now to add proper error codes, but will do that Friday morning (Eastern time).

@richsalz
Copy link
Contributor Author

richsalz commented Mar 1, 2019

Added the missing safety checks and support for double, @paulidale :)

@paulidale
Copy link
Contributor

Perhaps there is scope to meet in the middle between this and #8320

I think I'd be happy with sized integers (only int32_t, int64_t, uint32_t, uint64_t), BN and double. Plus integer widening from thirty two bit to sixty four bit (I don't see narrowing being as useful -- when was the last time you thought drat, I wish I'd used a shorter integer).

I'll prototype something which does this using casts/type conversion rather than the memcpys.

@paulidale paulidale mentioned this pull request Mar 1, 2019
2 tasks
@paulidale
Copy link
Contributor

Middle ground in #8379

@levitte
Copy link
Member

levitte commented Mar 1, 2019

I see where you'd going here... however, there's one thing that I find real troubling with this proposal, and it's that the size of an integer can now be determined in two different ways, 1) from the data type, and 2) from the buffer size. What happens if they don't match?

@richsalz
Copy link
Contributor Author

richsalz commented Mar 1, 2019

If they don't match, it's a programming error. It's the equivalent of doing this:

short s = ...someval...;
int ip = (int *)&s;
int i = *ip;

If you use the published API and macros it won't happen.
Try this code:

; cat a.c ; gcc a.c ; ./a.out

#include <stddef.h>
#include <stdio.h>
int main() {
    printf("int %d long %d float %d pointer %d\n",
	    (int)sizeof(int),
	    (int)sizeof(long),
	    (int)sizeof(double),
	    (int)sizeof(char *));
    return 0;
}
int 4 long 8 float 8 pointer 8

@richsalz
Copy link
Contributor Author

richsalz commented Mar 1, 2019

Please look at commit 1b6af2041d866dac5b8df2742c497373e80f55ba which handles BIGNUM, in particular the comments. Is this okay?

@levitte
Copy link
Member

levitte commented Mar 3, 2019

I remain doubtful about the need for several integer data types for size purposes. If we were talking differences in how the bits are organized, it would make much more sense.
(for example, reals have a different organization, so it makes complete sense to have a different data type for them)

@richsalz
Copy link
Contributor Author

richsalz commented Mar 3, 2019

I think the PARAM should exactly match the API. If an API takes an int allowing the user to put in a variable-length integer seems wrong. It also adds a new source of run-time errors. Finally, sacrificing clarity and efficiency for ... what, some architectural purity? Not sure what to call it; I use the term "geek esthetics."

@levitte
Copy link
Member

levitte commented Mar 4, 2019

It also adds a new source of run-time errors.

We do that either way.

@richsalz
Copy link
Contributor Author

richsalz commented Mar 4, 2019

We do this anyway

NO. It does not. If the native API takes an int, putting anything other than an int fails at compile-time.

@richsalz
Copy link
Contributor Author

richsalz commented Mar 4, 2019

So you say BN_set should put the size used in return_size, and BN_get should use that field?

@richsalz
Copy link
Contributor Author

richsalz commented Mar 4, 2019

I moved much of the return_size setting. It is not clear to me what return_size means for a bignum. Is it the number of bytes used in the serialization buffer? Or is it the number of bytes in the bignum? The current serialization cannot handle negative numbers so making both numbers always be the same seems wrong.

@levitte
Copy link
Member

levitte commented Mar 4, 2019

For big numbers, the return size is supposed to be the space it's serialization takes up in the buffer. BN_num_bytes() gives you the answer for OpenSSL bignums.

@levitte
Copy link
Member

levitte commented Mar 4, 2019

Regarding the sign, we can probably get away with only supporting unsigned numbers.

@levitte
Copy link
Member

levitte commented Mar 4, 2019

We do this anyway

NO. It does not. If the native API takes an int, putting anything other than an int fails at compile-time.

Huh?

You are setting an arbitrary number (data_type) and having a void pointer (buffer) point at the data.

Please do tell how your numbers provide better type security.

@richsalz
Copy link
Contributor Author

richsalz commented Mar 4, 2019

Let me turn that question back to you. If everything is a multi-byte number, you have a run-time error when someone puts too big a value there. In C with a header file in scope of course, this doesn't happen, putting a long/int64_t/size_t generates a warning. We don't have that situation here. But when using the macro's and the public get/set int API, you can only put an INT there

@richsalz
Copy link
Contributor Author

richsalz commented Mar 4, 2019

Then the documentation should say that only unsigned BN's are supported and that BN_num_bytes only works for non-negative values. It is inconsistent that we are supporting all sorts of integer types that openssl itself does not use, but you're willing to live with only partial BN support.

@levitte
Copy link
Member

levitte commented Mar 4, 2019

But when using the macro's and the public get/set int API

That's a condition for your claim.

you can only put an INT there

And the numbers in the OSSL_PARAM are still just numbers. In the base of base, i.e. the raw OSSL_PARAM, you have traded the data_type + buffer_size combo for a data_type-and-let's-hope-buffer_size-matches.

An API that helps make things safer is absolutely fine, I have nothing against that, and as long as all parts of the API are consistent with each other, that's absolutely fine.

However, the claim that exchanging a set of numbers in the underlying data type for another set of numbers would somehow enhance C type safety is simply false. But that doesn't change the added safety in an API added on type, regardless of the numbers. Please see the difference.

@richsalz
Copy link
Contributor Author

richsalz commented Mar 4, 2019

I didn't swap "one number for another." I have a type and a size. Both are necessary as I tried to show in #8377 (comment)

@levitte
Copy link
Member

levitte commented Mar 4, 2019

I didn't swap "one number for another." I have a type and a size. Both are necessary as I tried to show in #8377 (comment)

I'm done arguing this point. We're obviously each in our personal trench, I see no reason continuing.

@levitte
Copy link
Member

levitte commented Mar 4, 2019

Speaking of something different, have you noticed this:

https://travis-ci.org/openssl/openssl/jobs/501671755#L3785-L3807

ccache gcc  -Iinclude -Iapps/include  -pthread -m64 -Wall -O3 -DDEBUG_UNUSED -DPEDANTIC -pedantic -Wno-long-long -Wall -Wextra -Wno-unused-parameter -Wno-missing-field-initializers -Wswitch -Wsign-compare -Wshadow -Wformat -Wtype-limits -Wundef -Werror -Wmissing-prototypes -Wstrict-prototypes -DNDEBUG -D_DEFAULT_SOURCE -ansi  -c -o test/params_api_test-bin-params_api_test.o test/params_api_test.c
test/params_api_test.c: In function 'test_params':
test/params_api_test.c:31:9: error: initializer element is not computable at load time [-Werror]
         OSSL_PARAM_uint64("u64param", &u64),
         ^
test/params_api_test.c:31:9: error: initializer element is not computable at load time [-Werror]
test/params_api_test.c:32:9: error: initializer element is not computable at load time [-Werror]
         OSSL_PARAM_int("intparam", &i),
         ^
test/params_api_test.c:32:9: error: initializer element is not computable at load time [-Werror]
test/params_api_test.c:33:9: error: initializer element is not computable at load time [-Werror]
         OSSL_PARAM_size_t("--unused--", &sz),
         ^
test/params_api_test.c:33:9: error: initializer element is not computable at load time [-Werror]
test/params_api_test.c:34:9: error: initializer element is not computable at load time [-Werror]
         OSSL_PARAM_double("doubleparam", &d1),
         ^
test/params_api_test.c:34:9: error: initializer element is not computable at load time [-Werror]
test/params_api_test.c:35:9: error: initializer element is not computable at load time [-Werror]
         OSSL_PARAM_bignum("bignumparam", buffer, sizeof(buffer)),
         ^
test/params_api_test.c:35:9: error: initializer element is not computable at load time [-Werror]
cc1: all warnings being treated as errors

It seems to only happen when building with -ansi

@richsalz
Copy link
Contributor Author

richsalz commented Mar 4, 2019

Travis error fixed

Copy link
Member

@levitte levitte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should give you some ideas on how to handle bignums without the additional bn_size

static int i;
static uint64_t u64 = U64VAL;
static double d1 = DOUBLE;
static char buffer[200];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To get to know how many bytes the actual data is, you need this too:

    size_t buffer_l;

OSSL_PARAM_int("intparam", &i),
OSSL_PARAM_size_t("--unused--", &sz),
OSSL_PARAM_double("doubleparam", &d1),
OSSL_PARAM_bignum("bignumparam", buffer, sizeof(buffer)),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OSSL_PARAM_bignum is still incomplete, so you're forced to do this instead:

        OSSL_PARAM_DEF("bignumparam", OSSL_PARAM_BIGNUM, buffer, sizeof(buffer), &buffer_l),

@richsalz
Copy link
Contributor Author

richsalz commented Mar 5, 2019

I changed locate to not set an error. This caused some const complaints, not thrilled with the cast but it's the least invasive option I think.

crypto/params.c Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we can only handle positive bignums? Not sure that's a good idea.

Copy link
Member

@levitte levitte Mar 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to look at BN_bn2bin and friends, and especially this from the manual:

BN_bn2bin() converts the absolute value of a into big-endian form and stores it at to. to must point to BN_num_bytes(a) bytes of memory.

I think that #8400 handles it by only supporting bignums as OSSL_PARAM_UNSIGNED_INTEGER

So really, if we want to be able to represent negative numbers too, we need to add supporting BN functions.

@mattcaswell
Copy link
Member

On the discussion on type safety I think there is some confusion here about what can be provided by this type of an API.

We can think about compile time type safety and run time type safety. We're not going to get compile time type safety because ultimately everything gets converted to void * pointers and there are casts. That's a fundamental property of the OSSL_PARAM structure, and its an accepted design decision.

What we can do is put run time layers around the underlying structure to give us back some of that safety at run time. I guess that's what this PR and #8451 are all about.

So when we think about run time type safety we have to consider what is the format and size of the data as it is stored in an OSSL_PARAM, and is it consistent with the type of the data that is expected. A run time type safety check would validate that the two are consistent, e.g. if we were expecting an unsigned 32-bit integer, then we need to make sure that the value stored in the OSSL_PARAM isn't negative and it is less than or equal to 32-bits in length. As long as that is true then our run time type safety check has passed.

It makes no difference to the run time type safety check how that data is physically stored in the OSSL_PARAM. As long as we know the format of the underlying data and its size then that is enough. If we are expecting an unsigned 32 bit integer, and the data we've got is an unsigned integer and it fits in less than (or equal) to 32-bits then we're good.

It seems to me that both this PR and #8451 achieve that goal. For example, both have the ability to take a size_t and encode it into an OSSL_PARAM structure, and both have the ability to decode a size_t from an OSSL_PARAM structure and complain if the format/size isn't right.

I don't see a particular benefit of this PR over #8451 in terms of type safety. In fact I think this PR confuses the types of what we are expecting to send/receive with how those types are encoded within an OSSL_PARAM structure. So I think we should proceed with #8451 over this PR.

In my mind neither of the PRs fully address defining what types we are expecting to send/receive in any particular scenario. But I think that aspect can be covered in a follow on PR. I've written a little of my thoughts on that aspect in issue #8454.

@richsalz
Copy link
Contributor Author

Thank you for the thoughtful review. Have you read the revised manpage? I believe this is now also easier to use, and a more natural fit for C programmers. I will finish up my implementation, anyway.

@mattcaswell
Copy link
Member

Have you read the revised manpage?

Yes.

@kaduk
Copy link
Contributor

kaduk commented Mar 11, 2019

Github apparently won't let me reply inline to:

I think we have to have a clear format and not just say "whatever the format of BIGNUM bn2bin is" and definitely not a BIGNUM directly. That format of bn2bin is (presumably) defined somewhere so this shouldn't be hard. We don't want to require Provider authors to have to link to libcrypto, so if they want to use gmp instead, then that's fine. Yes it should preserve the sign.

so I'll note that any sort of public-visible "BIGNUM-equivalent" thing seems like it's going to necessitate exposing BN_ULONG, which I somehow thought we were trying to move away from.

@mattcaswell
Copy link
Member

so I'll note that any sort of public-visible "BIGNUM-equivalent" thing seems like it's going to necessitate exposing BN_ULONG, which I somehow thought we were trying to move away from.

Um, no? It's just an array of bytes?

Copy link
Contributor Author

@richsalz richsalz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have rewritten this PR. Most importantly the manpage is rewritten. I believe this is simpler to use and more correct than the alternatives. The current "core param" tests do not compile because I changed the OSSL_PARAM struture. I simplified things so that a PARAM is in our out, and cannot be both.

Please take a look.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implies that the OSSL_PARAM itself is modifiable by called functions. The reason I would rather not is to try to protect against badly written code that might corrupt the OSSL_PARAM.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you use the API those mistakes won't happen. If you dn't use the API, well, it's a public structure so you can't prevent them anyway.

crypto/params.c Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In some functions, you pass OpenSSL BIGNUMs directly, in others you pass intermediary export/import buffers. That's quite confusing...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As it says in the manpage, you can pass pointers to objects as INPUT parameters, but you must pass things exported into buffers for OUTPUT parameters. This is to maintain proper memory lifetimes (e.g., across windows dll's)

crypto/params.c Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should happen before the check of p->size, to give the caller a fighting chance to resize appropriately.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand the rationale, but r will be -1 if there's no size<bytes. I changed things around to include an output "guess" but I'm not thrilled with that. New commit pushed.

Copy link
Member

@levitte levitte Mar 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This suggested code implies that OSSL_PARAM would be used as a generic replacement for normal C function parameters.

I am firmly opposed to that idea, and that alone will get a -1 from me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment above called_function tries to imply otherwise. How can I make that more clear?

Also move <bn.h> from header to params.c
Rewrite documentation.  Calling function uses
    set -- to set an input parameter
    reserve -- to reserve space for an output param
    retrieve -- to getch an output param
Called function uses
    get -- to get an input parameter
    return -- to set na output parameter
@levitte
Copy link
Member

levitte commented Mar 12, 2019

Since #8451 is now merged, this PR serves no direct purpose. Closing (in agreement with @richsalz)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants