Skip to content

cortexm_common: add FPU support for cortex-m4f and cortex-m7#6630

Merged
MrKevinWeiss merged 3 commits intoRIOT-OS:masterfrom
OTAkeys:pr/cortex-m4f-fpu
Jan 4, 2019
Merged

cortexm_common: add FPU support for cortex-m4f and cortex-m7#6630
MrKevinWeiss merged 3 commits intoRIOT-OS:masterfrom
OTAkeys:pr/cortex-m4f-fpu

Conversation

@vincent-d
Copy link
Copy Markdown
Member

Fixes #1366.

Not sure if every cases are covered, but it works with nucleo-f401 and nucleo-f413.
I added a test app based on thread_msg which uses float.

No hard fault so far and float results seem OK with different threads.

@haukepetersen
Copy link
Copy Markdown
Contributor

Nice! Looks great. I think it might make sense to add a fpu feature, so we can do nicer filtering when building the test?!

@kaspar030
Copy link
Copy Markdown
Contributor

The test runs those three threads in parallel, right?

Maybe it would be useful to crazily switch between them, e.g., using a periodic timer?

@vincent-d
Copy link
Copy Markdown
Member Author

@haukepetersen I don't really know what it implies, so maybe we could. But I don't really like the Makefile stuff to disable the feature, so if you have a better idea, don't hesitate to give it.

@kaspar030 you're right, we should try to switch in the middle of the float calculations, I'll try to improve the test app

@vincent-d vincent-d added the Platform: ARM Platform: This PR/issue effects ARM-based platforms label Feb 22, 2017
@vincent-d
Copy link
Copy Markdown
Member Author

I added a patch to cmsis-dsp to include cpu_conf.h in order to fix the build issues

@vincent-d
Copy link
Copy Markdown
Member Author

Test app updated to switch threads more agressively

@vincent-d
Copy link
Copy Markdown
Member Author

Fixed CI issues

@kYc0o
Copy link
Copy Markdown
Contributor

kYc0o commented Mar 3, 2017

@vincent-d Can you squash to enable the CI?

@vincent-d vincent-d force-pushed the pr/cortex-m4f-fpu branch from b2b5037 to fc51f8c Compare March 3, 2017 15:55
@vincent-d
Copy link
Copy Markdown
Member Author

squashed

@kYc0o kYc0o added this to the Release 2017.04 milestone Mar 3, 2017
@kYc0o kYc0o added the CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR label Mar 3, 2017
endif
export MCPU := cortex-m4
else
CFLAGS_FPU ?= -mfloat-abi=soft
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default behaviour is then soft float?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this else is for everything not cortex-m4f

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I misread the ifneq (llvm,$(TOOLCHAIN)). Thanks!

nucleo-f334 opencm9-04 pca10000 pca10005 spark-core \
stm32f0discovery weio yunjia-nrf51822

DISABLE_MODULE += auto_init
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why auto_init should be disabled?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually a good question :D

I copied/paste another test, so I guess we could remove that line and enable auto_init.

export CFLAGS_FPU ?= -mfloat-abi=soft
else
# clang assumes there is an FPU
ifneq (llvm,$(TOOLCHAIN))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's there any reason to not give these flags to clang, regardless of the default setting? Does clang give an error if you specify hard float? Toolchains are usually quite configurable and the defaults may not always be the same.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wihtout this ifneq, the following error is generated:

src/RIOT/cpu/cortexm_common/include/cmsis_gcc.h:344:54: error: unknown register name 'vfpcc' in asm
  __ASM volatile ("VMSR fpscr, %0" : : "r" (fpscr) : "vfpcc");
                                                     ^
1 error generated.

clang version : 3.8.1-12ubuntu1

@vincent-d
Copy link
Copy Markdown
Member Author

Auto_init no longer disabled.

@OlegHahm OlegHahm added CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR and removed CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR labels Mar 6, 2017
@vincent-d
Copy link
Copy Markdown
Member Author

  • rebased
  • improved Makefile.onclude.cortexm_common to avoid having a warning when disabling cortexm_fpu module

@kYc0o kYc0o added Type: new feature The issue requests / The PR implemements a new feature for RIOT CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR labels Apr 3, 2017
@vincent-d vincent-d force-pushed the pr/cortex-m4f-fpu branch from 687fc5e to 9106721 Compare May 24, 2017 16:17
f = init;

while (1) {
for (int i = 0; i < 100000; i++) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems murdock doesn't like this too much, maybe specify uint32_t or limit it to 32767. If you do that I will test then hopefully merge now that I am a maintainer ;)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be fixed, haven't retested though

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you hit the wrong for loop, do the one on line 83.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I had fixed the other one, this one should be fixed now as well

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the AVR (among others) an int is only 16 bits, so this comparison is always true for those architectures.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@bergzand I switch to unsigned long

@vincent-d vincent-d force-pushed the pr/cortex-m4f-fpu branch from f65f277 to 17e2c26 Compare May 30, 2018 07:25
@vincent-d vincent-d force-pushed the pr/cortex-m4f-fpu branch from 5e5fafb to 22e337e Compare May 30, 2018 13:54
@MrKevinWeiss
Copy link
Copy Markdown
Contributor

Well it's probably a silly mistake for me but I am trying to run the test on the nucleo-f446ze and it appears that it cannot switch threads. The MCU is a cortex M4 with FPU. It stays in thread pid == 3 and never switches.

2018-05-31 15:16:39,540 - INFO # THREADS CREATED
2018-05-31 15:16:39,541 - INFO # 
2018-05-31 15:16:39,542 - INFO # THREAD 3 start
2018-05-31 15:16:39,592 - INFO # T(3): 141.466812
2018-05-31 15:16:39,641 - INFO # T(3): 141.490616
2018-05-31 15:16:39,694 - INFO # T(3): 141.521576
2018-05-31 15:16:39,742 - INFO # T(3): 141.559769
2018-05-31 15:16:39,791 - INFO # T(3): 141.605148
2018-05-31 15:16:39,844 - INFO # T(3): 141.657516
2018-05-31 15:16:39,892 - INFO # T(3): 141.717255
2018-05-31 15:16:39,941 - INFO # T(3): 141.783844
2018-05-31 15:16:39,995 - INFO # T(3): 141.857513
2018-05-31 15:16:40,042 - INFO # T(3): 141.938278
2018-05-31 15:16:40,096 - INFO # T(3): 142.026108
2018-05-31 15:16:40,144 - INFO # T(3): 142.120789

If I set the while loops to while(0) is appears that the threads are there.

2018-05-31 15:18:04,680 - INFO # THREADS CREATED
2018-05-31 15:18:04,680 - INFO # 
2018-05-31 15:18:04,680 - INFO # THREAD 3 start
2018-05-31 15:18:04,684 - INFO # THREAD 4 start
2018-05-31 15:18:04,685 - INFO # THREAD 5 start

Is that the intended behavior for the test or is there a problem with the sched_switch() not yielding?

(I can also confirm it is getting into the timer_cb)

(void) arg;

xtimer_set(&timer, OFFSET);
sched_switch(THREAD_PRIORITY_MAIN - 2);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't "thread_yield()``` do the trick here?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested with thread_yield() and it seems to work ok!

2018-06-01 15:58:12,754 - INFO # THREADS CREATED
2018-06-01 15:58:12,755 - INFO # 
2018-06-01 15:58:12,756 - INFO # THTTRHHERREEAAADDD   55 5 s ststatraatrr
2018-06-01 15:58:12,757 - INFO # tt
2018-06-01 15:58:12,758 - INFO # 
2018-06-01 15:58:12,898 - INFO # T(5): 141.521576
2018-06-01 15:58:12,899 - INFO # T(3): 141.466812
2018-06-01 15:58:13,042 - INFO # T(5): 141.559769
2018-06-01 15:58:13,049 - INFO # T(3): 141.490616
2018-06-01 15:58:13,192 - INFO # T(5): 141.605148
2018-06-01 15:58:13,198 - INFO # T(3): 141.521576
2018-06-01 15:58:13,343 - INFO # T(5): 141.657516
2018-06-01 15:58:13,344 - INFO # T(3): 141.559769
2018-06-01 15:58:13,492 - INFO # T(5): 141.717255
2018-06-01 15:58:13,494 - INFO # T(3): 141.605148

@vincent-d vincent-d force-pushed the pr/cortex-m4f-fpu branch from 22e337e to cdb890d Compare July 17, 2018 14:33
@MrKevinWeiss
Copy link
Copy Markdown
Contributor

@kaspar030 @vincent-d Should we try to push this one forward?

@kaspar030
Copy link
Copy Markdown
Contributor

Yes! @vincent-d could you rebase once more?

@aabadie
Copy link
Copy Markdown
Contributor

aabadie commented Jan 3, 2019

@vincent-d, just a rebase is missing. I guess you can also squash.

@vincent-d
Copy link
Copy Markdown
Member Author

Rebased and squashed

@MrKevinWeiss
Copy link
Copy Markdown
Contributor

@vincent-d looks like arduino-uno and arduino-duemilanove needs insufficient memory to make murdock happy...

I would also like to retest quickly.

@MrKevinWeiss
Copy link
Copy Markdown
Contributor

Test is the same. @kaspar030 are the changes sufficient for you?

@vincent-d
Copy link
Copy Markdown
Member Author

Fixed and amended directly

@MrKevinWeiss
Copy link
Copy Markdown
Contributor

@vincent-d Thanks for the fast response. 👍

We wait for @kaspar030 to dismiss at this point.

@MrKevinWeiss MrKevinWeiss merged commit 9554f75 into RIOT-OS:master Jan 4, 2019
@kaspar030
Copy link
Copy Markdown
Contributor

I quickly checked code size, memory consumption and context switch performance. code & memory is not worth talking about, context switch performance is down 2%, which is a lot less than I expected and completely acceptable.

ACK & go.

Thanks @ALL and especially to @vincent-d for the work and patience!

@benpicco
Copy link
Copy Markdown
Contributor

It would be great if the test had a tests/01-run.py to verify the correct output of the test.
I'm a bit at a loss trying to figure out what is the correct behavior to expect from the test.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CI: ready for build If set, CI server will compile all applications for all available boards for the labeled PR Platform: ARM Platform: This PR/issue effects ARM-based platforms Type: new feature The issue requests / The PR implemements a new feature for RIOT

Projects

None yet

Development

Successfully merging this pull request may close these issues.

FPU support for the Cortex-M4F family