drivers/net: Add netdev_driver_t::confirm_send by maribu · Pull Request #14660 · RIOT-OS/RIOT

maribu · 2020-07-30T18:45:37Z

Contribution description

Changed the API of netdev_driver_t:

The send() function should no longer return the number of bytes and should not block
The upper layer now must call the new confirm_send() function after calling send(); either busy waiting until something different to -EBUSY is returned, or after NETDEV_EVENT_TX_COMPLETE was signaled

During transition to the new API, the upper layer must remain backward compatible and must assume the legacy API if netdev_driver_t::confirm_send() is NULL.

Testing procedure

Just reading the doc and Murdock should be sufficient. Without any driver and no upper layer code being touched, this PR will just increase RAM consumption by one function pointer per network driver.

Issues/PRs references

Alternative to: #14625

Depends on and includes: drivers/net: Fix netdev_driver_t doc #14657

maribu · 2020-07-30T18:54:24Z

drivers/include/net/netdev.h

+     * @return  Number of bytes transmitted. (The size of the transmitted
+     *          frame including all overhead, such as frame check sequence,
+     *          bit stuffing, escaping, headers, trailers, preambles, start of
+     *          frame delimiters, etc. May be an estimate for performance
+     *          reasons.)


@miri64: We had a discussion on what this size should be in an issue. This interpretation is different to what most drivers implement (in the send(), not the confirm_send() function), but as porting would result in work anyway, I figured we should rather go for the best solution than the lazy solution.

IMO there is little reason to just return the result of iolist_size(iolist) on what was passed in send(), as the upper layer could just call iolist_size() directly to get this. So if we really want additional data for statistics, it should be IMO a value that has the most value when the value of iolist_size(iolist) is known anyway. And to me this is the number of bytes physically transmitted; as this would allow to calculate the layer 2 throughput, while iolist_size(iolist) can provide the layer 2 goodput.

The last sentence is wrong. The netif can easily calculate the sizes needed for goodput calculation, but it would need to make sure to call iolist_size(iolist) prior to adding the L2 header (and trailer, if applicable). But the reasoning should still be valid.

maribu · 2020-07-30T18:57:19Z

drivers/include/net/netdev.h

+     * @param[in]   iolist  IO vector list to send. Elements of this list may
+     *                      have iolist_t::iol_size == 0 and (in this case only)
+     *                      iolist_t::iol_data == 0.


Note this change. Previously the API doc read as if a driver would need to expect elements in the iolist that have iolist_t::iol_data == NULL but iolist_t::iol_size > 0. I think this was just a mistake in formulation and not intentionally.

jia200x · 2020-07-31T08:06:09Z

nice one!

IMO we could also specify that the frame from the send function is always a PSDU as well (Full L2 frame). What do you think?

drivers/include/net/netdev.h

maribu · 2020-07-31T08:27:48Z

IMO we could also specify that the frame from the send function is always a PSDU as well (Full L2 frame). What do you think?

I mostly agree, but I would like to leave the FCS calculation to the driver, as hardware support for that is very common.

maribu · 2020-07-31T08:38:02Z

IMO we could also specify that the frame from the send function is always a PSDU as well (Full L2 frame). What do you think?

I mostly agree, but I would like to leave the FCS calculation to the driver, as hardware support for that is very common.

And stuff like escaping (e.g. slipdev) or bitstuffing is also be better left to the driver, as doing this on the fly can be implemented quite efficiently, while doing so in advance involves a lot of memmove(), and resizing of the buffer.

jia200x · 2020-08-04T08:51:35Z

I guess this should be rebased on top of #14657

maribu · 2020-08-04T10:08:12Z

I guess this should be rebased on top of #14657

Yes. Done :-)

jia200x · 2020-08-04T10:21:58Z

drivers/include/net/netdev.h

+     *              via the `info` parameter
+     */
+    NETDEV_EVENT_TX_COMPLETE_DATA_PENDING,
    NETDEV_EVENT_TX_NOACK,                  /**< ACK requested but not received */


Shoudln't NOACK and TX_MEDIUM_BUSY be included as well? They are resolved at the same time

Done. I changed the return values to have -EGAIN in confirm_send() when called to early. This matches the semantics of the error code better. And this allows to return -EBUSY when the medium was busy; which is consistent with the interpretation of the same error code issued by send().

jia200x · 2020-08-04T12:47:50Z

I'm +1 with the fundamentals. Before going into style/wording reviews, @miri64 @bergzand do you agree with the proposed changes?

maribu · 2020-08-12T10:23:42Z

It would be nice to get some feedback on this.

I have now implemented a PTP client on top of the sock_udp that uses the timestamp in the auxiliary data for synchronization. The integration for the RX timestamp was pretty straight forward:

Use the void *rx_info in netdev_driver_t::recv() to pass the timestamp from the netdev to the netif
Extend gnrc_netif_hdr_t to contain a timestamp (if a pseudo-module is used, so that overhead only applies if the timestamping is actually needed)
Extend gnrc_sock_recv() to also extract the timestamp from the gnrc_netif_hdr_t
Extend the gnrc_sock_udp & gnrc_sock_ip to pass the timestamp from gnrc_sock_recv() to the user

With that, PTP synchronization does already work. Without TX timestamps, the network delay cannot yet be estimated and thus not yet be compensated.

I would now like to also implement the missing blocks for TX timestamp to also estimate the network delay. However, I would like to have a rough agreement on whether this approach or #14625 will make the race.

jia200x · 2020-08-12T10:28:09Z

I totally agree with the proposed change. We need more feedback though since it's an API change.

@miri64 @bergzand @PeterKietzmann @kaspar030 ?

miri64 · 2020-08-26T14:12:35Z

If it helps, I am all for it :-).

benpicco · 2020-09-10T16:46:41Z

drivers/include/net/netdev.h

+     *          NETDEV_EVENT_TX_COMPLETE event. This function must not return
+     *          `-EAGAIN` after that event was received.
+     */
+    int (*confirm_send)(netdev_t *dev, void *info);


Who is going to call that function and what are they going to do with the number of bytes sent?

The 802.15.4 sub-mac will call a callback on TX done, can we do this here too?

Who is going to call that function

E.g. in the current GNRC network stack, I would implement this in the gnrc_netif. That way the gnrc_netif could turn the asynchronous non-blocking API into a "blocking" API. (I'm not sure if blocking is the right word, but the gnrc_netif will only act upon a message when the transmission is fully complete. That way if e.g. the 6lowpan implementation would send 3 frames as fragments of a single IPv6 packet, the gnrc_netif would send them one by one. That way the driver does no longer need to block on send() for link layer fragmentation to work.)

and what are they going to do with the number of bytes sent?

The number of bytes sent is only useful to gather L2 stats IMO. So unless the l2 stats module is used, the caller will only check for the return value to be non-negative.

The 802.15.4 sub-mac will call a callback on TX done, can we do this here too?

In the long term I would like to adopt those ideas of the 802.15.4 radio HAL that can be generically applied to all network devices for the netdev API. As said, my personal goal is that in the end a compatible subset of the radio HAL becomes the "generic network HAL". That way code duplication between different network technologies could be avoided, while still being able to exploit technology specific features through the technology specific API additions on top of the "generic network HAL".

But I think it would be better to change the netdev API in small steps.

So would this do

dev->send(); while (dev->confirm_send() == -EAGAIN) {}

?

Yes. But more likely you want to block until NETDEV_EVENT_TX_COMPLETE is issued.

When would we get NETDEV_EVENT_TX_COMPLETE but have confirm_send() return -EAGAIN?

Shouldn’t one be enough?
I see also have .confirm_tx in the radio HAL.
I’d like to have @jia200x‘s 2 ct on this so we don’t end up with two slightly incompatible mechanisms for the same thing.

The only mechanism that tells the upper layer that a packet was received is when confirm_transmit() returns 0.
The thing is, some radios have an "TX DONE" IRQ (and impressively, some like CC2420 don't). If the cap IEEE802154_CAP_IRQ_TX_DONE is available, the upper layer knows that it can e.g lock a mutex and unlock it on IEEE802154_CONFIRM_TX_DONE. But e.g for CC2420 the lower layer needs to poll until confirm_transmit return 0.

So in the Radio HAL officially there's only one mechanism (the confirm function).

Maybe we should update the documentation of NETDEV_EVENT_TX_COMPLETE to reflect that's not a mandatory event.

Couldn't we instead use a timer in the CC2420 driver and keep the API easier instead? At least the driver should be in an excellent position to optimize the timeout duration.

Nothing blocks the driver to implement such a feature. however, the timeout should make sure that confirm_send returns 0. Otherwise the API is not respected

benpicco

I'm all for taking the burden of blocking .send away from the driver.
I take that this is not in conflict with the radio HAL work, so have my ACK.

jia200x · 2020-09-11T11:49:16Z

I'm all for taking the burden of blocking .send away from the driver.
I take that this is not in conflict with the radio HAL work, so have my ACK.

+1

Note that although not ideal, now it's possible to make all radios that return -EBUSY on send work with GNRC without needing gnrc_netif_pktq. This could quickly enable sending 6lowpan fragments without adding the overhead of the queue. But note this is absolutely not intended for the long term.

benpicco · 2020-09-11T11:51:08Z

Yea ideally the thread that produces those fragments should block instead of filling up a queue.

benpicco · 2020-12-21T15:19:21Z

+1

Is this an ACK 😉 ?

maribu · 2020-12-21T16:26:38Z

Let me squash and fix the whitespace errors. By now, the history of fixups is no useful information anymore, as I assume the memory of the reviewers of this PR has faded to much, so that they need to re-review the whole PR anyway.

jia200x · 2021-01-05T12:50:55Z

what's the status of this one? is it close to go? :)

maribu · 2021-01-05T13:05:52Z

what's the status of this one? is it close to go? :)

I'd hit the ACK button right away once you provide the second ACK ;-)

jia200x · 2021-01-05T13:42:37Z

I think it would be nice to add the deprecation release (21.10?). Otherwise I'm fine.
For me it's ok if you want to add it now or directly after in a follow up. As you wish :)

Otherwise, I ACK upon reply :)

Changed the API of `netdev_driver_t`: - The `send()` function should no longer return the number of bytes and should not block - The upper layer now must call the new `confirm_send()` function after calling `send()`; either busy waiting until something different to `-EBUSY` is returned, or after `NETDEV_EVENT_TX_COMPLETE` was signaled During transition to the new API, the upper layer must remain backward compatible and must assume the legacy API if `netdev_driver_t::confirm_send()` is `NULL`.

maribu · 2021-01-05T14:36:51Z

I think it would be nice to add the deprecation release (21.10?). Otherwise I'm fine.

My idea was to update gnrc_netif as a follow up to make use of the new behavior and add a pseudo-module that enables backward compatibility. Without this module presence of send_confirm could be assert()ed to ease debugging. And then we could deprecate the backward compatibility pseudo-module. Via makefiles/deprecated_modules.inc.mk out of tree users would get a motivating warning if they use drivers with legacy behavior :-)

maribu · 2021-01-05T14:38:14Z

I directly amended a typo fix ("An this case" --> "In this case")

jia200x

Alright! ACK

maribu · 2021-01-05T15:11:30Z

Thanks everyone :-)

Teufelchen1 · 2024-04-02T11:00:31Z

Just a short reminder that the deprecated enums are still in use within RIOT.

maribu requested review from PeterKietzmann, bergzand, jia200x and miri64 July 30, 2020 18:45

maribu commented Jul 30, 2020

View reviewed changes

jia200x reviewed Jul 31, 2020

View reviewed changes

drivers/include/net/netdev.h Show resolved Hide resolved

maribu force-pushed the netdev_confirm_send branch from a272b86 to 64b85cd Compare August 4, 2020 10:08

jia200x reviewed Aug 4, 2020

View reviewed changes

maribu mentioned this pull request Aug 4, 2020

core/msg: Add submodule core_msg_prio #14677

Closed

maribu removed the State: waiting for other PR State: The PR requires another PR to be merged first label Aug 12, 2020

benpicco reviewed Sep 10, 2020

View reviewed changes

benpicco approved these changes Sep 11, 2020

View reviewed changes

maribu mentioned this pull request Dec 21, 2020

drivers/nrf24l01p: Netdev driver for nrf24l01p #13743

Merged

benpicco added the CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR label Dec 21, 2020

maribu force-pushed the netdev_confirm_send branch from 0e97732 to da15774 Compare December 21, 2020 16:28

maribu mentioned this pull request Dec 23, 2020

sys/net/gnrc/tx_sync: new module #15694

Merged

2 tasks

maribu added CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR and removed CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR labels Jan 5, 2021

maribu force-pushed the netdev_confirm_send branch from da15774 to 53fcb97 Compare January 5, 2021 14:34

jia200x approved these changes Jan 5, 2021

View reviewed changes

maribu added Process: needs >1 ACK Integration Process: This PR requires more than one ACK and removed Process: needs >1 ACK Integration Process: This PR requires more than one ACK labels Jan 5, 2021

maribu merged commit e5a0154 into RIOT-OS:master Jan 5, 2021

maribu deleted the netdev_confirm_send branch January 5, 2021 15:11

jia200x mentioned this pull request Feb 24, 2021

esp32/eth: Don't overwrite queued event with RX packet #16084

Merged

benpicco mentioned this pull request Aug 10, 2022

sys/net/gnrc/netif: allow checking if a netdev is legacy or new API #18426

Merged

Conversation

maribu commented Jul 30, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Contribution description

Testing procedure

Issues/PRs references

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jia200x commented Jul 31, 2020

Uh oh!

Uh oh!

maribu commented Jul 31, 2020

Uh oh!

maribu commented Jul 31, 2020

Uh oh!

jia200x commented Aug 4, 2020

Uh oh!

maribu commented Aug 4, 2020

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jia200x commented Aug 4, 2020

Uh oh!

maribu commented Aug 12, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jia200x commented Aug 12, 2020

Uh oh!

miri64 commented Aug 26, 2020

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

benpicco Sep 10, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

benpicco left a comment

Choose a reason for hiding this comment

Uh oh!

jia200x commented Sep 11, 2020 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

benpicco commented Sep 11, 2020

Uh oh!

benpicco commented Dec 21, 2020

Uh oh!

maribu commented Dec 21, 2020

Uh oh!

jia200x commented Jan 5, 2021

Uh oh!

maribu commented Jan 5, 2021

Uh oh!

jia200x commented Jan 5, 2021

Uh oh!

maribu commented Jan 5, 2021

Uh oh!

maribu commented Jan 5, 2021

Uh oh!

jia200x left a comment

Choose a reason for hiding this comment

Uh oh!

maribu commented Jul 30, 2020 •

edited

Loading

maribu commented Aug 12, 2020 •

edited

Loading

benpicco Sep 10, 2020 •

edited

Loading

jia200x commented Sep 11, 2020 •

edited

Loading