Opencv4 tests and blas provider#208351
Conversation
|
Seems like xvfb-run is unsupported on darwin, and one tests fails on arm 😞 |
3a1c153 to
0472758
Compare
dd72636 to
5b00354
Compare
5b00354 to
222fe87
Compare
|
Ok, x86_64-darwin first had a timeout while building openblas, 2nd try it had a timeout during run of opencv tests. I suspect something is not right with build agent itself and I would disable tests for this platform, because I don't have a machine to test them myself. |
There was a problem hiding this comment.
| testRunner = if stdenv.isDarwin then "" else "${xvfb-run}/bin/xvfb-run -a "; | |
| testRunner = if stdenv.isDarwin then "" else "${lib.getExe xvfb-run} -a "; |
also for other reviewers, testRunner is only used when build == host so no need to use buildPackages
222fe87 to
9c647f0
Compare
|
aarch64-linux failed because of cargo: Not sure what happened with darwin builds, they displayed as queued and no direct link to logs present. But I've found this log link https://logs.nix.ci/?attempt_id=f8174375-f5f7-4aee-be64-1ea47874c035&key=nixos%2Fnixpkgs.208351 which probably is the last commits build, and it failed because of qimgv-1.0.3-alpha not supported on darwin and also because because of llvm build timeout. The same problem with qimv is also present for aarch64-darwin - https://logs.nix.ci/?attempt_id=a5ce9a8a-281c-4a10-a47e-6c7754ee5c51&key=nixos%2Fnixpkgs.208351 . So, looks like that qimv should not be included in darwin builds. But I don't understand how PR with opencv 4.7 and qimv got merged in the first place, it should've failed CI for the same reason ? |
|
as far as i can tell the darwin failures are unrelated |
|
this seems to double the build time of opencv so i think it would be better to run these tests in passthru.tests |
c016192 to
77a0192
Compare
3bec2cc to
6781d43
Compare
|
Oookey, I've refactored out tests into a separate |
There was a problem hiding this comment.
He is not active anymore. In the nixpkgs and elsewhere over github, it seems. Last PR in nixpkgs in 2017, no activity in his github profile over the last year. Personally, I think we better to not decieve ourselves and remove maintainers after some period (e.g. a year) of inactivity.
There was a problem hiding this comment.
added a comment about removing him to the commit msg so people don't think it was an accident
6781d43 to
e3a8995
Compare
removed mdaiter from maintainers, his last PR in nixpkgs was in 2017
…opencv and openblas
e3a8995 to
66efeba
Compare
|
Hi, @Artturin 👋 . Is there something else left preventing this PR from being merged ? I'd like to make a new release of my Stable Diffusion flake soon, hopefully with this fixes present in nixpkgs 😉 |
Description of changes
This PR is derived from my attempt to fix #196653 . I've figured out that by default blas provider uses openblas under the hood, so the change was trivial albeit a little ugly because of
blas.provider.pnamecheck. Everything seemed fine, but I decided to throw in some unit tests just to be sure I didn't break anything. Initially for testing I overrided blas.provider with mkl, ran tests with it and managed to enable 80% of the opencv's tests. After that I switched blas.provider to openblas, ran tests and ... they broke, hung indefinitely, to be precise 😑 . I tried to set --test_threads=1 , turned off several test groups fromtestNamesbut different tests hung anyway. Ok, I've tried to ran tests without my blas.provider changes - result was the same 🤯 . Then I sawsingleThreadedin openblas, and soon found link to the openblas docs and matched it with-DWITH_OPENMP=ONflag we use here. I've made experiments with this two flags, and results are quite logical - you can either use singleThreaded openblas, or build opencv with WITH_OPENMP=OFF, and then tests pass. So, I've figured out we have 3 options:I've ruled out 1st option, because concurrency bugs could be really hard to spot and reproduce, but when they finally hit you it could cause you a lot of headache. Between 2nd and 3rd option I've intuitively inclined toward 2nd option, because I think that linear algebra calculations should be only a small part of what opencv does. Moreover, it seems that blas support was initially added to fix perfomance bottlenecks, but later optimizations were applied and the need has vanished. But anyway, I decided to enable perf tests as well and compare both variants - in my unscientific benchmark (6C 12T i5-11400F + 32Gb) difference of total time between two versions was negligible. So, I decided that 2nd option is "good enough" (however, documentation on multithreading support of opencv and openblas contain hints to other possible approaches with pluggable MT backends, if anyone interested to dig this rabbit hole even more)
Things done
sandbox = trueset innix.conf? (See Nix manual)nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage./result/bin/)nixos/doc/manual/md-to-db.shto update generated release notes