Conversation
I'd like to see the input and output interfaces consumed by the TX packetiser more clearly delineated. e.g. How is the packetiser instantiated? How and what dependencies are injected (e.g. the network BIO). Is this module the exclusive permitted user of BIO_sendmmsg? (Presumably it is.)
Will the TX packetiser need to handle timer events and have the event queue provided to it (e.g. frame/packet coalescing delays)?
> Is this module the exclusive permitted user of BIO_sendmmsg? (Presumably it is.)
At least in the original design the TX QUIC record layer would call BIO_sendmmsg(). However, that does not really work well with one call to the record layer per QUIC packet, and it also does not work for packet coalescing. So... this is still unclear.
On the other hand I suppose some clever filtering BIO inserted in front of the real BIO_dgram might work.
I suppose you've meant to link some different PR.
I expected the packetiser to determine the length of application data frames - not be told by something external what that length is. Presumably the packetiser knows the maximum size that the packet can be, and we also (presumably) want to send maximum-sized packets wherever possible. The packetiser will have to take the data input from various sources and form the complete packet from it. For example, let's say that the packetiser decides to send some ack frames and a crypto frame, and it's got 1k of "space" left in the packet which it can fill with application data. It needs to take that much data from the stream buffers. Surely, only the packetiser knows the amount of space available and can form the application data frames accordingly?
The design specifies that complete frames are passed into the packetiser. Either we need to change this or the stream send buffers need to know the length (which means the packetiser should have a call to make this available).
I'm imagining that several packets will be on the go at any one time.
Packetising is equivalent to the bin packing problem, so this should ease things. Also, there are frames that are best sent separately (e.g. ack frames don't need an ack if sent without any frames that do need an ack; likewise they don't count towards bytes in flight when sent alone).
I was envisaging that the packetiser would have access to a set of stream send buffers. Each stream buffer would know how much data it has available to send - and the packetiser can query that. The packetiser would query each stream buffer in turn and fill the "current" packet with data from each buffer. We will have to have some algorithm to determine how much to take from each one - I can think of a few different strategies, e.g. 2 possibilities might be:
- Take as much data as possible from the first stream buffer, until either we've run out of data or the packet is full. If the packet is not yet full move onto the next stream buffer and repeat. Next time we create a packet continue where we left off, and iterate through each stream buffer in a round-robin fashion.
or
- Work out how many stream buffers we have. Divide the amount of space available by the number of stream buffers and fill the packet with a small amount of data from every buffer. If any one stream buffer has less data than its allowance, then increase the allowance of all the other stream buffers accordingly.
Possibly different stream buffers have different priorities. You could envisage a "very high priority" that means "just send the data from this stream buffer as soon as possible and don't mix it with other stream data", which could be used for the ack frames scenario you describe. Or something like that anyway.
IMO the best strategy (to avoid stalling multiple streams when packets are lost) is to fill in a packet with as much data available from a single stream as possible. Then in another packet continue with next stream to provide fairness to streams and so on. Of course if a stream does not have enough data to fill a full packet, add data from another stream. So this is something in-between 1 and 2.
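As a toy model of that in-between strategy (all types and names below are invented for illustration, not the actual OpenSSL internals): drain as much as possible from the current stream, top the packet up from the others only if it runs dry, and start the next packet from the next stream for fairness.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_STREAMS 3

/* Hypothetical stand-in for a stream send buffer. */
typedef struct {
    size_t avail;   /* bytes the stream send buffer has ready to send */
} toy_stream_buf;

/*
 * Fill one packet of capacity `cap`, starting from the stream at *cursor.
 * Takes as much as possible from that stream, then tops up from the others.
 * Advances *cursor so the next packet starts with the following stream.
 * Returns the number of payload bytes placed in the packet.
 */
static size_t fill_packet(toy_stream_buf *s, size_t n, size_t *cursor,
                          size_t cap)
{
    size_t filled = 0, i;

    for (i = 0; i < n && filled < cap; i++) {
        size_t idx = (*cursor + i) % n;
        size_t take = s[idx].avail < cap - filled ? s[idx].avail
                                                  : cap - filled;

        s[idx].avail -= take;
        filled += take;
    }
    *cursor = (*cursor + 1) % n;    /* round-robin start for fairness */
    return filled;
}
```

The point of advancing the cursor even when one stream filled the whole packet is that a burst of losses then stalls at most one stream per packet, rather than a slice of every stream.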
At least in the MVP I would not try to add priority to streams as prioritization can be done by the application itself by blocking adding more stream data to lower priority streams until it has sent all the high priority stream data.
I thought the TXP creates the frames (i.e., the framer is part of the TXP). The calling modules just ask it to wrap their content into appropriate frames. I.e., ACK manager asks it to create ACK frames, the TLS handshake RL asks it to create CRYPTO frames, and so on. The TXP then buffers/sends the frames as appropriate.
Retries should be done at higher priority.
For lost frames? Well, they just need to be sent before sending any other data for the same stream; IMO it is not that complicated.
The design speaks otherwise:
> TX Packetizer
> This module creates frames from the application data obtained from
> the application. It also receives CRYPTO frames from the TLS Handshake
> Record Layer and ACK frames from the ACK Handling And Loss Detector
> subsystem.
Which means the TXP should be creating the application data frames.
I also remember something about frames being passed to the TXP but I could easily be misremembering.
Yeah, the design is vague in this regard. Perhaps it should be updated? I.e. "It also receives data for CRYPTO frames from the TLS Handshake Record Layer and data for ACK frames from the ACK Handling And Loss Detector subsystem."? I do not know what makes better sense.
@mattcaswell said:
> I was envisaging that the packetiser would have access to a set of stream send buffers. Each stream buffer would know how much data it has available to send - and the packetiser can query that. The packetiser would query each stream buffer in turn and fill the "current" packet with data from each buffer.
So I wonder, who would call the appropriate function to get the packetiser to start working?
Me, I was always thinking that it would be other parts that would essentially call the packetiser function to say "hey, I have something for you"... and while doing so, they could as well include "and here's my data buffer". As far as I understand, that's more or less exactly how this function should be called? I'm not sure I understand why the packetiser should do the extra churn of running around all the SSL objects to ask "hey, do you have something for me?" in this case.
So, in the scenario where the packetiser just gets the signal to start doing something and go asking all the SSL objects it can reach, how does that process get started?
I envisaged a single API call that asks "how much data can I send at the moment?" That API would then have to query the flow controller and congestion controller to find the answer
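A rough sketch of that single query (hypothetical names; this assumes the flow controller exposes a byte credit and the congestion controller exposes a window and the bytes currently in flight): the answer is simply whichever limit is smaller.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Toy version of "how much data can I send at the moment?".
 * flow_credit     - bytes the flow controller currently permits
 * cwnd            - congestion window in bytes
 * bytes_in_flight - unacknowledged bytes already sent
 * Returns the number of bytes the packetiser may send right now.
 */
static size_t toy_send_allowance(size_t flow_credit,
                                 size_t cwnd, size_t bytes_in_flight)
{
    size_t cc_room = bytes_in_flight >= cwnd ? 0 : cwnd - bytes_in_flight;

    return flow_credit < cc_room ? flow_credit : cc_room;
}
```

The MTU then bounds each individual packet separately; this query only bounds the total amount across packets.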
I envisage two (configurable) modes of operation:
- Send as soon as we have application/crypto data available to send.
- If we have a full packet worth of data, then send immediately - otherwise send after a configurable delay period.
In the first mode I imagine somewhere in the implementation of SSL_write there will be a call to send the packets.
In the second mode I envisage this being invoked by a callback as a result of a time based event firing.
We might need 2 API calls available, e.g. ossl_quic_send_packets_now and ossl_quic_send_packets_if_full... or something like that.
I was thinking of a single mode:
- if there is no data in flight or immediately after the handshake send now
- if there is data in flight, accumulate data but also start a short timeout and send if it expires
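A toy decision function for this single mode (all names hypothetical; the timer arming and firing are left outside the sketch) could look like:

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    TOY_SEND_NOW,       /* emit the pending data immediately */
    TOY_ACCUMULATE      /* keep buffering, a short timeout is running */
} toy_tx_action;

/*
 * Decide whether the packetiser should send now or keep accumulating.
 * bytes_in_flight - unacknowledged bytes currently on the wire
 * pending         - bytes accumulated but not yet packetised
 * full_packet     - payload size of a full packet
 * timeout_expired - nonzero if the short accumulation timer fired
 */
static toy_tx_action toy_tx_decide(size_t bytes_in_flight, size_t pending,
                                   size_t full_packet, int timeout_expired)
{
    if (bytes_in_flight == 0)                       /* nothing in flight */
        return TOY_SEND_NOW;
    if (pending >= full_packet || timeout_expired)  /* full packet or timer */
        return TOY_SEND_NOW;
    return TOY_ACCUMULATE;
}
```

The "send immediately after the handshake" case from the list above would be folded in as another send-now condition once the handshake state is visible to the packetiser.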
Sending data now ought to take a stream SSL object -- presumably one is interested in flushing a stream rather than a connection. Accumulating data would take a connection SSL object (which is implicit in the TXP instance).
This would lead to:
```c
int ossl_quic_send_packets(SSL_QUIC_TX_PACKETISER *tx, SSL *stream);
```
which flushes and sends the stream now if it is not NULL, and accumulates & sends if desirable otherwise.
I agree with @mattcaswell that it should be the TXP composing packets, pulling frames from available sources as appropriate.
This is already how things work for the ACKM, you call ossl_ackm_is_ack_desired and ossl_ackm_get_ack_frame to get a structure logically expressing the desired ACK ranges. So TXP has to poll ACKM for this. However, the number of ACK ranges returned by ACKM may be more than TXP can or wants to fit in a packet, it is for TXP to decide what size it wants to make ACK frames. It would make sense to me to use a similar design for STREAM frames.
I see the TXP here as a kitchen chef which pulls from available sources of frames in order to generate full packets, since the TXP is in the best position to know how much space is available, etc.
I imagine the core mechanic would just be a loop filling a packet buffer which calls different possible frame sources until no more will fit. There might be some exceptions to this, like CRYPTO frames, where it makes more sense for other code to push to the TXP, but this doesn't seem like it would affect the design much (and if one really wanted to avoid further complicating the TXP, could be a small queue maintained outside the TXP which it polls).
While we could have dynamic registration of frame sources using callbacks, I think this is overkill since the possible frame sources are statically known.
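The core loop described above could be sketched as follows. Function pointers are used here purely to keep the sketch compact; as noted, the real frame sources would be statically known, and all names are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/*
 * A frame source is offered the remaining space in the packet and returns
 * how many bytes of frames it appended (0 = nothing that fits right now).
 */
typedef size_t (*toy_frame_source)(void *ctx, size_t space);

/*
 * Fill one packet of capacity `cap` by repeatedly polling each source
 * until no source can contribute anything more or the packet is full.
 * Returns the number of bytes placed in the packet.
 */
static size_t toy_fill(toy_frame_source *srcs, void **ctxs, size_t n,
                       size_t cap)
{
    size_t used = 0, i;
    int progress = 1;

    while (progress && used < cap) {
        progress = 0;
        for (i = 0; i < n && used < cap; i++) {
            size_t got = srcs[i](ctxs[i], cap - used);

            used += got;
            if (got > 0)
                progress = 1;
        }
    }
    return used;
}

/* Example source: drains a counter of pending bytes (e.g. a stream). */
static size_t toy_drain_source(void *ctx, size_t space)
{
    size_t *pending = ctx;
    size_t take = *pending < space ? *pending : space;

    *pending -= take;
    return take;
}
```

In a real packetiser each source would be a specific module query (ACKM, CRYPTO queue, stream buffers) rather than a uniform callback, but the fill loop shape is the same.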
This has never been in doubt. The question is who/what builds the frames. The design currently has the TXP building application frames but receiving completed frames from everything else. The question being posed is: should the TXP also compose the other types of frame too?
My very rough take would be:
(This reminds me that I've been meaning to create an issue to create functions for wire format encoding/decoding of specific frame types, since we'll need them.)
Why would the TXP need to worry about PING frames? It seems pretty disconnected from packet assembly. I'm also not convinced that variable length means the TXP is responsible. ACKs e.g. could be sent as a separate packet if they don't fit the current one -- which also has the advantage of being non-ack-eliciting.
It's a bit rough around the edges and there are some outstanding questions, but I think this is good enough for the moment.
From discussions both here and in #18610, it seems that we have different views on whether the TX and RX Frame handlers should work with a pull model or a push model. For a pull model, it also remains a bit unclear when and how these Frame handlers are supposed to get started, or by whom they should be called to do their work.
State machine drives most things?
> The MTU limits the size of an individual packet, the other two limit the
> total amount of data that can be sent. The packetiser needs to query the
> current limits using the `ossl_quic_flow_maximum_size()`,
I am unsure whether this isn't too late - IMO the flow control limits should already be applied when writing to the stream send buffers, i.e. on the SSL_ API front-end call.
Getting an annoying error from GitHub when trying to comment on one of the threads above... so I'll post my comment here instead:
The record layer design does not have one call to the record layer per QUIC packet. A "record" here means a QUIC packet. With these functions the packetizer can figure out the total amount of data it can send in one go, and the maximum amount of data that can go into a single record. From there it is expected to form the record templates accordingly and make one call to the record layer.
ping for second review
```c
int ossl_quic_free_app_data(OSSL_QUIC_FRAME *frame,
                            void *data, size_t data_len);
```
I suspect there will need to be some running total held somewhere of the total amount of data not sent yet across all buffers. The packetiser will need to create a new packet either when some timeout has expired, or alternatively when sufficient data is pending.
The running total would be the stats manager's or ACK manager's concern. I really don't think it belongs here.
It's the ACK manager's job to tell the TXP that retransmission is required and what needs to be retransmitted. The TXP then rebuilds packets based on this information. This complicates the coordination between the TXP and the send buffers because data block boundaries can be expected to change. I've not got a way to say "forget about these blocks, they need to be retransmitted", and the boundaries could change.
Just a heads up, #18949 now has the TX record layer API and implementation.
Updated.
```c
void OSSL_QUIC_STREAM_set_priority(SSL *stream, uint32_t priority);
uint32_t OSSL_QUIC_STREAM_get_priority(SSL *stream);
```
This still doesn't look quite right. Either these are public functions in which case I would expect them to be:
```c
void SSL_set_priority(SSL *stream, uint32_t priority);
uint32_t SSL_get_priority(SSL *stream);
```
In the above case we need to make some decision about how these functions behave in the case that a TLS SSL object is passed to them (I'd expect they just "work" - the priority is a hint only that is ignored in TLS).
The document needs to be clear about which functions that are presented are public and which are internal.
If, on the other hand, these are non-public functions I would expect them to look like this:
```c
void QUIC_STREAM_set_priority(QUIC_STREAM *stream, uint32_t priority);
uint32_t QUIC_STREAM_get_priority(QUIC_STREAM *stream);
```
Aside: Are we using OSSL_QUIC_STREAM or QUIC_STREAM for the stream object. I've seen both used.
> Aside: Are we using OSSL_QUIC_STREAM or QUIC_STREAM for the stream object. I've seen both used.
In what context? For structures and types, QUIC_ is okay. For functions, they should begin ossl_ IMO.
So, the object would be QUIC_STREAM, but any function that uses it would be OSSL_QUIC_STREAM_whatever(QUIC_STREAM *stream, ...) (or should that be ossl_quic_stream_whatever(QUIC_STREAM *stream, ...) for internal things)?
```c
int ossl_quic_retransmitting_app_data(QUIC_STREAM *stream, size_t offset,
                                      void *data, size_t data_len);
```
I'm not clear how this is interpreted. IIUC this is telling a stream send buffer that data we previously requested to be sent now needs to be retransmitted - so remove any record of the fact that we sent it before and make the data eligible to be sent again. Is that correct? What does offset mean here?
offset is the data offset in the stream. I'll change stream offset to something more verbose in the description.
I'm not convinced that offset is required but I included it for the moment.
Your interpretation is correct. I've reworded it a little to hopefully clarify.
I'm unconvinced about the need for offset. Or at least I don't understand why we need both data and offset. They both seem to be being used to identify the data that needs to be retransmitted.
Me either. It depends on how the send buffers maintain the information about what has been transmitted and what hasn't. Once that's known, we'll know which is required.
I think you should pick one (data or offset) and run with just that in this design - not have both. We can always change it later. Having both is definitely wrong.
by calling:

```c
void ossl_qtx_flush_net(OSSL_QTX *qtx);
```
What is an OSSL_QTX? Isn't this supposed to be OSSL_QUIC_TX_PACKETISER?
It's defined in #18949, I assumed it was the TX record layer.
mattcaswell left a comment:
This looks good. I think it's nearly there.
@t8m still okay?
t8m left a comment:
Good for now. Might need some fine-tuning later.
Merged, finally.
Reviewed-by: Tomas Mraz <[email protected]> Reviewed-by: Matt Caswell <[email protected]> (Merged from #18570)
Intended as a companion for #18564, although less advanced.