Skip to content

driver/kw2xrf: added ng_netdev implementation for the Freescale kw2x radio#2756

Merged
OlegHahm merged 2 commits intoRIOT-OS:masterfrom
jonas-rem:pr@ng_kw2xrf
May 14, 2015
Merged

driver/kw2xrf: added ng_netdev implementation for the Freescale kw2x radio#2756
OlegHahm merged 2 commits intoRIOT-OS:masterfrom
jonas-rem:pr@ng_kw2xrf

Conversation

@jonas-rem
Copy link
Copy Markdown
Contributor

This pr replaces #2329 and implements the driver for the new ng_netdev interface. The transceiver interface is not supported anymore in this pr.

Driver can be used for all version of the Freescale kw2x SiP and maybe for future Devices.

Current Status:

@haukepetersen
Copy link
Copy Markdown
Contributor

This driver (same is true for the Atmel AT86RF2xx driver) has a fundamental design flaw. In the current form it is NOT possible to use this driver on a shared SPI bus.

Reason: The driver does an SPI transfer in interrupt context -> see kw2xrf_irq_handler(). So say another device has an ongoing SPI transfer and the radio triggers it's external interrupt. This will lead to 2 enabled chip selects and undefined behavior on the SPI bus.

On top of that, this implementation is calling a mutex_lock() inside an ISR -> I am not quite sure, but I think this also leads to more or less undefined behavior. And even if it does not break anything, it does not make sense...

So for the Atmel driver I started to solve it for now by removing the spi_aquire() and spi_release() calls and putting in the documentation, that the driver has to have exclusive access to the SPI bus. An alternative I see could be to remove any SPI calls from interrupt context, but this could lead to timing difficulties.

@gebart @thomaseichinger @jfischer-phytec-iot @jremmert-phytec-iot : Any ideas how we could proceed with this?

@jremmert-phytec-iot: otherwise the driver looks good to me, will test when I am back in Office in two weeks....

@jonas-rem
Copy link
Copy Markdown
Contributor Author

I had exactly this problem during my tests of the csma mac layer, this led to unexpected behaviour.

The problem is "solved" in the same way you are doing in the atmel driver.

  • SPI device is only used by the radio-module -> exclusive access.
  • We make sure that an interrupt can only be triggered by the radio during "safe" periods. (Deactivate interrupt during SPI read/write).

Currently I see the following solutions:

  1. I had the idea to enable only one radio interrupt source simultaneously. The isr-handler would have to decide accordingly to the current driver state. For normal RX-TX operation this would work, but for some states there are several possible results, E.g. CCA_TX state. Results could be (i) CCA-failed, tx canceled and (ii) CCA-passed, tx done.
    So this is no real solution as it does not cover all states.
  2. The current solution (exclusive access for spi, remove spi_aquire() and spi_release()).
  3. Just send an indication message in isr-context and figure out the reason later in thread-context. As drawback, time critical actions could lead to problems due to slower and unpredictable task switching time spans. This concerns acknowledgement after a successful receive. But I would not expect problems there, as the maxACK_timeout is in the range of 700-900us in 2.5Ghz devices. Do we have other examples that suffers among them? What did you think of when you mentioned the timing difficulties? We could introduce a new netdev event such as NETDEV_EVENT_UNKNOWN.

If I understood it correctly, the mechanism used in the Linux Kernel implementation of the AT86RF2xx drivers is not really comparable to our problem. They trigger the SPI interaction in the ISR; the interaction is then requested and queued; ISR context ends here. If the interaction is ready, a callback function is invoked.

@jnohlgard
Copy link
Copy Markdown
Member

I had exactly this problem during my tests of the csma mac layer, this
led to unexpected behaviour.

The problem is "solved" in the same way you are doing in the atmel driver.

SPI device is only used by the radio-module -> exclusive access.

This is true for the kw22, and if we only use a single thread to talk to
the radio we won't need spi_acquire. If we ever use more than one thread to
talk to the radio, or a thread and an ISR, then there can be collisions and
race conditions. There may also be future devices with even more integrated
peripherals on the internal spi bus, or an external version of the same
radio.

We make sure that an interrupt can only be triggered by the radio during
"safe" periods. (Deactivate interrupt during SPI read/write).

This method can still miss deadlines for ack packets of you take too long
to reenable the IRQ.

Currently I see the following solutions:

I had the idea to enable only one radio interrupt source simultaneously.
The isr-handler would have to decide accordingly to the current driver
state. For normal RX-TX operation this would work, but for some states
there are several possible results, E.g. CCA_TX state. Results could be (i)
CCA-failed, tx canceled and (ii) CCA-passed, tx done.
So this is no real solution as it does not cover all states.

The current solution (exclusive access for spi, remove spi_aquire() and
spi_release()).

Just send an indication message in isr-context and figure out the reason
later in thread-context. As drawback, time critical actions could lead to
problems due to slower and unpredictable task switching time spans. This
concerns acknowledgement after a successful receive. But I would not expect
problems there, as the maxACK_timeout is in the range of 700-900us in
2.5Ghz devices. Do we have other examples that suffers among them? What did
you think of when you mentioned the timing difficulties? We could introduce
a new netdev event such as NETDEV_EVENT_UNKNOWN.

If I understood it correctly, the mechanism used in the Linux Kernel
implementation of the AT86RF2xx drivers is not really comparable to our
problem. They trigger the SPI interaction in the ISR; the interaction is
then requested and queued; ISR context ends here. If the interaction is
ready, a callback function is invoked.

I suggest using a thread with a higher priority to handle the radio and its
IRQs and trigger a context switch from inside the ISR. That way it will
still "work" if you are not the exclusive user of the spi bus, reducing the
deadlock situation to a missed deadline for the ACK packet.

Btw, how is the priority inversion handled by RIOT when a lower priority
process is holding a mutex that a higher priority process wants to take? Do
we raise the priority of the low prio task?
(sorry for not searching this myself. I'm on my phone and it's a hassle to
search through the github sources on a tiny screen)

@jonas-rem
Copy link
Copy Markdown
Contributor Author

Sounds good. In case of an ongoing SPI interaction on the low prio task it would behave like 3 (see former post), correct? The higher prio task gets runnable for a short time, tries to aquire SPI access which should fail and goes to TASK_BLOCKED-state until the SPI is released respectively the low prio task is ready.

I will test this next week.

@jonas-rem
Copy link
Copy Markdown
Contributor Author

Introduced an irq-worker thread that runs the spi-transaction in thread-context. A first test worked well, but I will test this in more detail later the day. In the first run I have chosen a stack-size of 128byte, which leads to isr-hard fault. Now it is running with 256byte.

@haukepetersen
Copy link
Copy Markdown
Contributor

i think the additional spi-worker-thread is not the way to go. I think i introduces too much overhead -> we need one additional task switch when receiving packets for example, and this we should try to work around. Also this approach is quite expensive in terms of memory usage.

However, I have not come up with a clear idea of how to handle the spi problem. I will spend some time on the Atmel radio and see if I have better ideas...

@PeterKietzmann
Copy link
Copy Markdown
Member

Hi guys. I opened issue #2769 as I think this is a general discussion not only concerning this PR. Please correct/extend the issue if necessary and move the discussion to that place if this if ok for you. Also (if we won't find a solution quickly) I guess we could further this PR with the "exclusive-access-method" and improve the driver when a good solution is available. What do you think?

@jonas-rem
Copy link
Copy Markdown
Contributor Author

Hi Peter, thats a good idea!
I will revert the irq-worker thread commit and proceed with the exclusive access method. I think the exclusive access approach is reasonable in this situation, but more on that in the Issue you created.

@jonas-rem jonas-rem force-pushed the pr@ng_kw2xrf branch 2 times, most recently from 0f19a72 to 0db0145 Compare April 14, 2015 15:41
@jonas-rem
Copy link
Copy Markdown
Contributor Author

Reverted the irq-worker thread. Will now wait for @haukepetersen to udpate the Atmel driver and adapt this accordingly.

Short guide what is needed to test this:

@jonas-rem jonas-rem force-pushed the pr@ng_kw2xrf branch 2 times, most recently from c91f5aa to eaf8d6f Compare April 24, 2015 15:32
@OlegHahm OlegHahm modified the milestone: Release 2015.06 Apr 29, 2015
@jonas-rem
Copy link
Copy Markdown
Contributor Author

@haukepetersen @kaspar030 How to handle the initialization of this one? I have not proceed to implement the initialization process, since #2901 appeared. Moreover do we need a seperate test function for each driver or will there be one for all netdev drivers.

In my eyes the initialization process needs some rework to make it board independent. Excepting this issue this driver should be close to be ready. I have not seperated the netdev-stuff from the non-netdev stuff as in the atmel driver. What do you think? Is the seperation mandatory or can we merge this first and postpone the separation?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you're only using the first byte anyway, why not !=?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_set_tx_power() is called from _set(netdev-option) which receives an 16bit variable, accordingly the value_len of *val is 2. But to be consistent with the atmel driver I will move this check directly to the netconf_option setter and check for <sizeof(uint16_t)

@jonas-rem
Copy link
Copy Markdown
Contributor Author

Sure, I changed radio_packet_length_t to uint8_t because the packet size in this device can never get bigger then 127. radio_packet_length_t is not defined in all boards.

@OlegHahm
Copy link
Copy Markdown
Member

Sure, I changed radio_packet_length_t to uint8_t because the packet size in this device can never get bigger then 127. radio_packet_length_t is not defined in all boards.

@authmillenon, that's not required for the nameless stack any more anyway, right?

Makefile.dep Outdated
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still seems tautological.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you mean, because this is already added in the auto-init mechanism? But the atmel-driver also includes this here.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean that this creates a dependency from A to A. The Atmel driver maps from a wild card (2%%) to 2xx.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh true, will remove

@OlegHahm
Copy link
Copy Markdown
Member

Okay for squashing.

squash: removed tab indent
@jonas-rem
Copy link
Copy Markdown
Contributor Author

squashed and rebased

@OlegHahm
Copy link
Copy Markdown
Member

Is preloading actually supported by this driver? Couldn't find the function.

@jonas-rem
Copy link
Copy Markdown
Contributor Author

Yes, preloading is supported. The flag will be set here:

 732         case NETCONF_OPT_PRELOADING:
 733             kw2xrf_set_option(dev, KW2XRF_OPT_PRELOADING,
 734                               ((bool *)value)[0]);
 735             return sizeof(ng_netconf_enable_t);

Before sending, this flag is checked:

1066     if ((dev->option & KW2XRF_OPT_PRELOADING) == NETCONF_DISABLE) {
1067         DEBUG("kw2xrf: Sending now.\n");
1068         _set_sequence(dev, XCVSEQ_TRANSMIT);
1069     }

squash: driver: fixed typo to pass doccheck

squash: driver: bug fixes after test

squash: adapted Makefiles for correct build behaviour

restructured, no spi interaction in isr anymore

major bugfixes and restructurization

comments addressed

introduce new netconf option, ..AUTOCCA

squash: minor bugfix and add auto-init mechanism

squash: minor fixes to make travis happy
@OlegHahm
Copy link
Copy Markdown
Member

Sorry, might be that I'm a bit confused right now, but what do I have to call to load data in the buffer and what do I have to call to trigger the immediate sending?

@jonas-rem
Copy link
Copy Markdown
Contributor Author

To send data using the netdev-if and preloading:

  • set up preloading: dev->driver->set (dev, NETCONF_OPT_PRELOADING, NETCONF_ENABLE, sizeof(ng_netconf_enable_t);
  • preload using the send-call: dev->driver->send_data(dev, (ng_pktsnip_t *)msg.content.ptr);
  • and finally send out using: dev->driver->set(dev, NETCONF_OPT_STATE, NETCONF_STATE_TX);

Send data out immediately:

  • set up preloading: dev->driver->set (dev, NETCONF_OPT_PRELOADING, NETCONF_DISABLE, sizeof(ng_netconf_enable_t);
  • send out using: dev->driver->send_data(dev, (ng_pktsnip_t *)msg.content.ptr);

@OlegHahm
Copy link
Copy Markdown
Member

Ah, thanks for explanation. I wasn't aware of this mechanism.

@OlegHahm
Copy link
Copy Markdown
Member

How do I need to configure the test to run with the pba-d-01-kw22?

@OlegHahm OlegHahm closed this May 13, 2015
@OlegHahm OlegHahm reopened this May 13, 2015
@OlegHahm
Copy link
Copy Markdown
Member

Sorry, had a shaky hand...

@jonas-rem
Copy link
Copy Markdown
Contributor Author

You can find a branch on Johann´s git-repository: https://github.com/jfischer-phytec-iot/RIOT/tree/wip%40mkw2xd%2Bauto-init

  • Just checkout the branch wip@mkw2xd+auto-init and pull this one onto it.

@OlegHahm
Copy link
Copy Markdown
Member

Thanks. I cannot always receive on the IoT-LAB M3, but this seems in general a bit flakey. ACK

@jonas-rem
Copy link
Copy Markdown
Contributor Author

Travis failed on native because of coap test app:

    Building application: coap (3/78) .
        failed .. retrying: native

This seems totally unrelated, any idea why this failed? When I try this manually, it builds the coap app.

@OlegHahm
Copy link
Copy Markdown
Member

I guess it is unrelated. Kicked Travis.

@jonas-rem
Copy link
Copy Markdown
Contributor Author

Thanks, since Travis is happy now we can merge this? @OlegHahm who is intendet to do that? I have no merging priviledges.

@OlegHahm
Copy link
Copy Markdown
Member

I can and will do.

OlegHahm added a commit that referenced this pull request May 14, 2015
driver/kw2xrf: added ng_netdev implementation for the Freescale kw2x radio
@OlegHahm OlegHahm merged commit 4d64b98 into RIOT-OS:master May 14, 2015
@OlegHahm
Copy link
Copy Markdown
Member

And Congrats to your first commits that went into upstream RIOT! 🎆

@jfischer-no
Copy link
Copy Markdown
Contributor

🎆 🎉 🎆 🎉 🎆 🎉

@jonas-rem
Copy link
Copy Markdown
Contributor Author

Very cool! Thanks for reviewing!

@miri64 miri64 added the Area: network Area: Networking label Sep 30, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Area: drivers Area: Device drivers Area: network Area: Networking

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants