Skip to content

Comments

Cleanup the s_time command#3952

Closed
bernd-edlinger wants to merge 2 commits intoopenssl:masterfrom
bernd-edlinger:cleanup_s_time_command
Closed

Cleanup the s_time command#3952
bernd-edlinger wants to merge 2 commits intoopenssl:masterfrom
bernd-edlinger:cleanup_s_time_command

Conversation

@bernd-edlinger
Copy link
Member

Various code-cleanups.
Handle WANT_READ/WRITE/CONNECT like other errors,
because all sockets are blocking, and this should not happen,
Turn off the linger option on connected sockets to avoid failure.
Continue test even without -cipher option as in 1.0.2.

@bernd-edlinger bernd-edlinger added 1.1.0 branch: master Applies to master branch labels Jul 17, 2017
apps/s_time.c Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code only just got added recently. Why are you taking it out again?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, sorry that I did not notice earlier, but the sockets here are all blocking.
implementing non-blocking sockets would be much more effort.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for instance the SSL_write above...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand. SSL_ERROR_WANT_READ can occur with blocking sockets - and does so in the TLSv1.3 case (which is why this change was made).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently I have not tried with TLSv1.3, sorry.
Can you please explain why the SSL not simply reads the data?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm...

I tried this but it did not help:

diff --git a/apps/s_socket.c b/apps/s_socket.c
index 804ab5b..5b0e141 100644
--- a/apps/s_socket.c
+++ b/apps/s_socket.c
@@ -179,7 +179,8 @@ int do_server(int *accept_sock, const char *host, const char
     asock = BIO_socket(BIO_ADDRINFO_family(res), BIO_ADDRINFO_socktype(res),
                        BIO_ADDRINFO_protocol(res), 0);
     if (asock == INVALID_SOCKET
-        || !BIO_listen(asock, BIO_ADDRINFO_address(res), BIO_SOCK_REUSEADDR)) {
+        || !BIO_listen(asock, BIO_ADDRINFO_address(res),
+                       BIO_SOCK_REUSEADDR | BIO_SOCK_NODELAY)) {
         BIO_ADDRINFO_free(res);
         ERR_print_errors(bio_err);
         if (asock != INVALID_SOCKET)

still tls1.2 -www -reuse
28410 connections in 13.43s; 2115.41 connections/user sec, bytes read 104177885
28410 connections in 31 real seconds, 3666 bytes read per connection

and tls1.3 -www -reuse
8615 connections in 15.03s; 573.19 connections/user sec, bytes read 38285060
8615 connections in 31 real seconds, 4444 bytes read per connection

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so my conclusion is that reused sessions in tls1.3
cause the same cpu cycles for the client than new tls1.3 sessions,
just the server has significantly less cpu cycles.
OTOH tls1.2 session reuse may cause more round trips,
but much less cpu load on both server and client, and if
the network latency is very low as in this case the numbers
are not as convincing as they should be.
Is this by design, or could the TLS session have an
estimation of the network delay and choose another
strategy, if the round trips are cheap?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

./openssl s_server -www
vs.
./openssl s_time -new
=>
5134 connections in 8.30s; 618.55 connections/user sec, bytes read 0
5134 connections in 31 real seconds, 0 bytes read per connection
./openssl s_time -reuse
=> (as expected SSL_session_reused() = 0)
5049 connections in 8.36s; 603.95 connections/user sec, bytes read 0
5049 connections in 31 real seconds, 0 bytes read per connection

These numbers are as expected, i.e. we get less connections/sec with the attempted reuse than without. In both cases we are doing a full handshake (we're not actually reusing)...but we've gone through some of the hoops of attempting to reuse. At least that's my guess.

./openssl s_time -www /test -new
=>
4435 connections in 8.31s; 533.69 connections/user sec, bytes read 21953250
4435 connections in 31 real seconds, 4950 bytes read per connection

Also expected. Less connections/sec because now we are actually transferring data.

Note: the connections/user sec are almost identical, does that mean the client
has no less work to do for a reuse, but the server ?

I'm not sure what you mean. So far in the above numbers no reuse has actually happened (SSL_session_reused() == 0)

./openssl s_time -www /test -reuse
=> (as expected SSL_session_reused() = 1)
8734 connections in 14.90s; 586.17 connections/user sec, bytes read 38840098
8734 connections in 31 real seconds, 4447 bytes read per connection

For the first time connections/sec goes up because we are now really doing reuse, so this is as expected.

now I start the server with tls1_2
./openssl s_server -www -tls1_2

./openssl s_time -new
=>
4626 connections in 7.92s; 584.09 connections/user sec, bytes read 0
4626 connections in 31 real seconds, 0 bytes read per connection

./openssl s_time -reuse
=>
40253 connections in 16.34s; 2463.46 connections/user sec, bytes read 0
40253 connections in 31 real seconds, 0 bytes read per connection
Note: This is the only data point which is how I would expect it.

./openssl s_time -www /test -new
=>
4527 connections in 7.85s; 576.69 connections/user sec, bytes read 22413177
4527 connections in 31 real seconds, 4951 bytes read per connection

All of the above are largely as expected. Session reuse is faster than no session reuse. TLSv1.3 without session reuse is out performing TLSv1.2 without session reuse.

Slightly surprising is that TLSv1.2 www without session reuse is marginally better than the TLSv1.3 equivalent.

./openssl s_time -www /test -reuse
=> (yes SSL_session_reused() = 1 here)
725 connections in 0.87s; 833.33 connections/user sec, bytes read 2657125
725 connections in 31 real seconds, 3665 bytes read per connection

This one I cannot explain. TLSv1.2 www with session reuse is much worse than without???

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Slightly surprising is that TLSv1.2 www without session reuse is marginally better than the TLSv1.3 equivalent.

Perhaps this could be to do with the cipher that is being selected in each case?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

./openssl s_time -www /test -reuse
=> (yes SSL_session_reused() = 1 here)
725 connections in 0.87s; 833.33 connections/user sec, bytes read 2657125
725 connections in 31 real seconds, 3665 bytes read per connection

This one I cannot explain. TLSv1.2 www with session reuse is much worse than without???

Yes, that was before I added the NODELAY flags.
with NODELAY the same test results in much better numbers:

still tls1.2 -www -reuse
28410 connections in 13.43s; 2115.41 connections/user sec, bytes read 104177885
28410 connections in 31 real seconds, 3666 bytes read per connection

that was the only data point that changed due to NODELAY.

apps/s_time.c Outdated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not make portion that makes the call to a block and declare no_linger there?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean this way there would be one #ifdef...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

apps/s_time.c Outdated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Formally speaking it might be more appropriate to implement it as BIO_CTRL. I mean whole point with BIO is to arrange implementation-neutral interface, and this kind of very specific. On the other hand this is special case...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, kids, don't do that at home ;-)

@bernd-edlinger bernd-edlinger force-pushed the cleanup_s_time_command branch 2 times, most recently from ce7588d to aa1ee31 Compare July 18, 2017 17:11
apps/s_time.c Outdated
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't really need else here. Starting with straight { works and I'd say is more readable in this case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, a matter of taste, but if you like it better without else it's ok for me.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some even "hate the else after return" :-) But to me, in this case, it's really else in #if. I'm not actually a stranger to such construct (I mean I've used it myself), but it's always "unless actually necessary". And it's not actually necessary in this case...

@bernd-edlinger bernd-edlinger force-pushed the cleanup_s_time_command branch from e275faf to 5f1ce53 Compare July 19, 2017 11:31
@kaduk
Copy link
Contributor

kaduk commented Jul 19, 2017

Is this by design, or could the TLS session have an
estimation of the network delay and choose another
strategy, if the round trips are cheap?

I would prefer to not go down that route. TLS 1.3 is pretty finalized, and the extra crypto has beneficial security properties.

@richsalz
Copy link
Contributor

Generic TLS is not a time protocol. :) Does it even make sense to use TLS 1.3 for this, since some timestamps were removed? Can we just error-out if only TLS 1.3 is available?

@bernd-edlinger
Copy link
Member Author

@mattcaswell I wonder if this return WANT_READ thing for TLS1.3 will not
break a few non-blocking legacy applications, because previously that was
possible, but only when the server initiated a renegotiation which is rarely done
and if it is done it will not be right in the beginning of the conversation.

So I have two questions:

  1. what is the benefit from returning WANT_READ for blocking sockets?
  2. wouldn't it be better to implicitly enable SSL_MODE_AUTO_RETRY if
    TLS1.3 gets negotiated?

@mattcaswell
Copy link
Member

  1. what is the benefit from returning WANT_READ for blocking sockets?
  2. wouldn't it be better to implicitly enable SSL_MODE_AUTO_RETRY if
    TLS1.3 gets negotiated?

So, when writing with blocking sockets you may actually be seeking to write your code in such a way as to avoid blocking unnecessarily. This is actually how s_client/s_server are written. We only actually attempt to read data when we know there is data on the socket waiting to be read. If the socket indicates it is readable then we go and read it, otherwise we loop around waiting for some potential keyboard input. In some cases though (and particularly this happens in TLSv1.3 NST), the socket has data to be read but when you read it it doesn't actually yield any application data. In that case signalling WANT_READ is the right thing to do. Implicitly enabling SSL_MODE_AUTO_RETRY is likely to break such applications so I think it is better to stick with the documented behaviour.

@mattcaswell mattcaswell added this to the 1.1.1 milestone Jan 24, 2018
@mattcaswell
Copy link
Member

What is the status of this PR?

@bernd-edlinger
Copy link
Member Author

I'll re-base.
I was not really satisfied with the performance numbers,
TLS1.2 was seemingly outperforming TLS 1.3 as far as I remember...

@mattcaswell
Copy link
Member

TLS1.2 was seemingly outperforming TLS 1.3 as far as I remember...

Could be the DRBG issues (i.e. using AES256 and not using the platform optimised version)

@bernd-edlinger bernd-edlinger force-pushed the cleanup_s_time_command branch from 5f1ce53 to d968c9e Compare March 20, 2018 18:02
@bernd-edlinger
Copy link
Member Author

Rebased.

I had to include <internal/sockets.h> because apparently socket headers were removed from e_os.h.

I think it was somewhat controversial that I prefer SSL_MODE_AUTO_RETRY
over handling SSL_ERROR_WANT_READ in a loop.
But I think this is fine, and we have not much test coverage for SSL_MODE_AUTO_RETRY
anyways.

@bernd-edlinger
Copy link
Member Author

The change log mentions:
"Continue test even without -cipher option as in 1.0.2."
But that part was already fixed by another PR.
I would just update the commit message in a moment.

Various code-cleanups.
Use SSL_CTX_set_mode(ctx, SSL_MODE_AUTO_RETRY) insead of handling
SSL_ERROR_WANT_READ everywhere.
Turn off the linger option on connected sockets to avoid failure.
Add BIO_set_conn_mode(conn, BIO_SOCK_NODELAY) to improve thruput.
@bernd-edlinger
Copy link
Member Author

Hmm. removed 1.1.0 label for now, I think that won't cherry-pick.
at least the part with missing cipher option is not fixed in 1.1.0

[to be squashed]
@bernd-edlinger
Copy link
Member Author

Sorry Matt, there was a build failure in the windows build because of a warning.

..\apps\s_time.c(396): error C2220: warning treated as error - no 'object' file generated
..\apps\s_time.c(396): warning C4133: 'function': incompatible types - from 'linger *' to 'const char *'

could you please re-confirm?

levitte pushed a commit that referenced this pull request Mar 21, 2018
Various code-cleanups.
Use SSL_CTX_set_mode(ctx, SSL_MODE_AUTO_RETRY) insead of handling
SSL_ERROR_WANT_READ everywhere.
Turn off the linger option on connected sockets to avoid failure.
Add BIO_set_conn_mode(conn, BIO_SOCK_NODELAY) to improve thruput.

Reviewed-by: Matt Caswell <[email protected]>
(Merged from #3952)
@bernd-edlinger
Copy link
Member Author

Merged to master, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

branch: master Applies to master branch

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants