Introduce hash thunking functions to do proper casting#23192
Introduce hash thunking functions to do proper casting#23192nhorman wants to merge 8 commits intoopenssl:masterfrom
Conversation
| if (lh->hashw) | ||
| hash = lh->hashw(data, lh->hash); | ||
| else | ||
| hash = lh->hash(data); |
There was a problem hiding this comment.
It's unclear why lh->hashw() shouldn't exist if lh->hash() does.
There was a problem hiding this comment.
I guess backwards compatibility?
There was a problem hiding this comment.
Either doall_util_fn() also needs to be changed to choice between the functions, or we need to say that the thunk functions are now required. I can't find any open source code that would have a problem with the requirement.
There was a problem hiding this comment.
Regarding the checking of hashw vs hash, yes, backwards compatibility was my concern - i.e. if a user compiles against openssl 3.1 and then runs with openssl 3.2, the DEFINE_LHASH_OF[_EX] wouldn't assign the thunking function, resulting in hashw being null.
Regarding doal_util_fn, that function need not be changed because the proper functions are selected in the calling points (OPENSSL_LH_doall / OPENSSL_LH_doall_arg / OPENSSL_LH_doall_arg_thunk). I could refactor that if needed, but I wanted to minimize the code change
|
I don't see approval from a committer here, so adding the appropriate |
|
A nitpick on the commit message:
This is a pretty misleading. While a void pointer may be converted to a pointer of another type, under various constraints, that does not imply that a function pointer of one type signature may be called with another type signature. And indeed the C language specification says it cannot, per the discussion on the bug. That is, OpenSSL is triggering undefined behavior here, and this PR is fixing it. It's unlikely OpenSSL was getting miscompiled because of it, but if the commit message cites the C language specification, it should be clear that this is fixing something the spec did not allow. |
|
@slontis yes, that's correct, as well as STACK_OF, PRIORITY_QUEUE, etc. If you look my ubsan-thunk branch you can see everything I've tried to address so far |
AFAIK no, we use generic pointers in OSSL_DISPATCH tables but we then cast them when they are called. |
Not quite. We assign them to diverse method structure members (which are function pointers), which have the intended signature (given by function signature typedefs from core_dispatch.h), and that's where we do the casting to the signature specified in core_dispatch.h. Then we simply call those functions by method structure member. |
That's what I meant. The needed cast is there - even earlier than just during the call. |
|
The easiest way to get a feel for what exactly needs to get fixed up is to build with clang17 with ubsan enabled |
Sashan
left a comment
There was a problem hiding this comment.
I could find only single typo in manpage. it looks good otherwise.
mattcaswell
left a comment
There was a problem hiding this comment.
Looks good. Just some minor nits.
|
Oh...it also needs a rebase |
ubsan on clang17 has started warning about the following undefined
behavior:
crypto/lhash/lhash.c:299:12: runtime error: call to function err_string_data_hash through pointer to incorrect function type 'unsigned long (*)(const void *)'
[...]/crypto/err/err.c:184: note: err_string_data_hash defined here
#0 0x7fa569e3a434 in getrn [...]/crypto/lhash/lhash.c:299:12
#1 0x7fa569e39a46 in OPENSSL_LH_insert [...]/crypto/lhash/lhash.c:119:10
#2 0x7fa569d866ee in err_load_strings [...]/crypto/err/err.c:280:15
[...]
The issue occurs because, the generic hash functions (OPENSSL_LH_*) will
occasionaly call back to the type specific registered functions for hash
generation/comparison/free/etc, using functions of the (example)
prototype:
[return value] <hash|cmp|free> (void *, [void *], ...)
While the functions implementing hash|cmp|free|etc are defined as
[return value] <fnname> (TYPE *, [TYPE *], ...)
The compiler, not knowing the type signature of the function pointed to
by the implementation, performs no type conversion on the function
arguments
While this is nominally not an issue for general run time usage (a
pointer to void can be used as a pointer to a specific type, as long as
the pointer respects the constraints of the C language specification),
it can be problematic for ancilliary tools that expect the compiler to
detect when a type conversion occurs (even if it is a no-op, CFI being
an expemplar case).
As such, ubsan warns us about this issue to avoid breakages in such
conditions.
This is an potential fix for the issue, implemented using, in effect,
thunking macros. For each hash type, an additional set of wrapper
funtions is created (currently for compare and hash, but more will be
added for free/doall/etc). The corresponding thunking macros for each
type cases the actuall corresponding callback to a function pointer of
the proper type, and then calls that with the parameters appropriately
cast, avoiding the ubsan warning
This approach is adventageous as it maintains a level of type safety,
but comes at the cost of having to implement several additional
functions per hash table type.
Related to openssl#22896
Introduce hash thunking functions to do proper casting
ubsan on clang17 has started warning about the following undefined
behavior:
crypto/lhash/lhash.c:299:12: runtime error: call to function err_string_data_hash through pointer to incorrect function type 'unsigned long (*)(const void *)'
[...]/crypto/err/err.c:184: note: err_string_data_hash defined here
#0 0x7fa569e3a434 in getrn [...]/crypto/lhash/lhash.c:299:12
#1 0x7fa569e39a46 in OPENSSL_LH_insert [...]/crypto/lhash/lhash.c:119:10
#2 0x7fa569d866ee in err_load_strings [...]/crypto/err/err.c:280:15
[...]
The issue occurs because, the generic hash functions (OPENSSL_LH_*) will
occasionaly call back to the type specific registered functions for hash
generation/comparison/free/etc, using functions of the (example)
prototype:
[return value] <hash|cmp|free> (void *, [void *], ...)
While the functions implementing hash|cmp|free|etc are defined as
[return value] <fnname> (TYPE *, [TYPE *], ...)
The compiler, not knowing the type signature of the function pointed to
by the implementation, performs no type conversion on the function
arguments
While the C language specification allows for pointers to data of one
type to be converted to pointers of another type, it does not
allow for pointers to functions with one signature to be called
while pointing to functions of another signature. Compilers often allow
this behavior, but strictly speaking it results in undefined behavior
As such, ubsan warns us about this issue
This is an potential fix for the issue, implemented using, in effect,
thunking macros. For each hash type, an additional set of wrapper
funtions is created (currently for compare and hash, but more will be
added for free/doall/etc). The corresponding thunking macros for each
type cases the actuall corresponding callback to a function pointer of
the proper type, and then calls that with the parameters appropriately
cast, avoiding the ubsan warning
This approach is adventageous as it maintains a level of type safety,
but comes at the cost of having to implement several additional
functions per hash table type.
Related to openssl#22896
|
rebased and nits fixed |
|
This pull request is ready to merge |
ubsan on clang17 has started warning about the following undefined
behavior:
crypto/lhash/lhash.c:299:12: runtime error: call to function err_string_data_hash through pointer to incorrect function type 'unsigned long (*)(const void *)'
[...]/crypto/err/err.c:184: note: err_string_data_hash defined here
#0 0x7fa569e3a434 in getrn [...]/crypto/lhash/lhash.c:299:12
#1 0x7fa569e39a46 in OPENSSL_LH_insert [...]/crypto/lhash/lhash.c:119:10
#2 0x7fa569d866ee in err_load_strings [...]/crypto/err/err.c:280:15
[...]
The issue occurs because, the generic hash functions (OPENSSL_LH_*) will
occasionaly call back to the type specific registered functions for hash
generation/comparison/free/etc, using functions of the (example)
prototype:
[return value] <hash|cmp|free> (void *, [void *], ...)
While the functions implementing hash|cmp|free|etc are defined as
[return value] <fnname> (TYPE *, [TYPE *], ...)
The compiler, not knowing the type signature of the function pointed to
by the implementation, performs no type conversion on the function
arguments
While the C language specification allows for pointers to data of one
type to be converted to pointers of another type, it does not
allow for pointers to functions with one signature to be called
while pointing to functions of another signature. Compilers often allow
this behavior, but strictly speaking it results in undefined behavior
As such, ubsan warns us about this issue
This is an potential fix for the issue, implemented using, in effect,
thunking macros. For each hash type, an additional set of wrapper
funtions is created (currently for compare and hash, but more will be
added for free/doall/etc). The corresponding thunking macros for each
type cases the actuall corresponding callback to a function pointer of
the proper type, and then calls that with the parameters appropriately
cast, avoiding the ubsan warning
This approach is adventageous as it maintains a level of type safety,
but comes at the cost of having to implement several additional
functions per hash table type.
Related to #22896
Reviewed-by: Sasa Nedvedicky <[email protected]>
Reviewed-by: Tomas Mraz <[email protected]>
Reviewed-by: Matt Caswell <[email protected]>
(Merged from #23192)
ubsan on clang17 has started warning about the following undefined behavior:
crypto/lhash/lhash.c:299:12: runtime error: call to function err_string_data_hash through pointer to incorrect function type 'unsigned long (*)(const void *)' [...]/crypto/err/err.c:184: note: err_string_data_hash defined here
#0 0x7fa569e3a434 in getrn [...]/crypto/lhash/lhash.c:299:12
#1 0x7fa569e39a46 in OPENSSL_LH_insert [...]/crypto/lhash/lhash.c:119:10
#2 0x7fa569d866ee in err_load_strings [...]/crypto/err/err.c:280:15
[...]
The issue occurs because, the generic hash functions (OPENSSL_LH_*) will occasionaly call back to the type specific registered functions for hash generation/comparison/free/etc, using functions of the (example) prototype:
[return value] <hash|cmp|free> (void *, [void *], ...)
While the functions implementing hash|cmp|free|etc are defined as [return value] (TYPE *, [TYPE *], ...)
The compiler, not knowing the type signature of the function pointed to by the implementation, performs no type conversion on the function arguments
While this is nominally not an issue for general run time usage (a pointer to void can be used as a pointer to a specific type, as long as the pointer respects the constraints of the C language specification), it can be problematic for ancilliary tools that expect the compiler to detect when a type conversion occurs (even if it is a no-op, CFI being an expemplar case).
As such, ubsan warns us about this issue to avoid breakages in such conditions.
This is an potential fix for the issue, implemented using, in effect, thunking macros. For each hash type, an additional set of wrapper funtions is created (currently for compare and hash, but more will be added for free/doall/etc). The corresponding thunking macros for each type cases the actuall corresponding callback to a function pointer of the proper type, and then calls that with the parameters appropriately cast, avoiding the ubsan warning
This approach is adventageous as it maintains a level of type safety, but comes at the cost of having to implement several additional functions per hash table type.
Related to #22896