Replace bli_dlamch with something less archaic#498
Conversation
Since it's not 1970 anymore the standard library helpfully provides all of the floating point limit constants needed for `[sd]lamch`.
|
I seem to recall trying this many years ago, and, IIRC, I found that the more modern/straightforward implementation of dlamch caused issues somewhere down the line, probably in the LAPACK test suite. But before we can even consider its effects on LAPACK, we should get the patch working with our own test suite. :) |
|
Yeah, that should teach me not to author commits in the browser. I'll get it to compile and pass checks locally. Re LAPACK tests, is there an easy way to check compliance (e.g. putting this in libflame and running the LAPACK testsuite)? I can certainly print out the old and new values on a few archs and see where the differences lie. In particular, I think the "safe minimum" values may be slightly less than |
- Fix typos (missing ''). - DBL_ROUNDS -> FLT_ROUNDS. - Add includes for ctype.h and fenv.h.
|
BLAS tests pass now. |
It happens. At least it was just a branch.
I think the easiest way is to download netlib LAPACK and plug it in. (There should be a place to specify the BLAS library for testing purposes.) libflame's LAPACK code hasn't been refreshed in some time, so I don't think that's the best vehicle for testing anymore.
Thanks, that would be very interesting to see. No matter what your initial tests show, though, I'm nervous enough about this in the short term that I think you should add your new implementation as a preprocessor branch (guarded by something like |
|
@fgvanzee for IEE754 floating point all values match the "legacy" |
|
Looking at the LAPACK version, I'm actually convinced that, with the |
No idea. I have a distant memory of looking at this maybe 5-6 years ago, so perhaps it was all due to a bug that got fixed. (I mean, I don't want them to be different. So I'm glad you're seeing consistent numbers now.) Nevertheless, I don't see any harm in keeping the old code disabled via a preprocessor guard. |
Default is the new implementation, define `BLIS_ENABLE_LEGACY_LAMCH` to use the old version.
|
Preprocessor guard added. Define |
|
Thanks, much better. Out of curiosity, why does the new case 'E': return DBL_EPSILON;
case 'S': return safe_min;
case 'B': return FLT_RADIX;
case 'P': return FLT_RADIX*DBL_EPSILON;
case 'N': return DBL_MANT_DIG;
case 'R': return FLT_ROUNDS == FE_TONEAREST ? 1.0 : 0.0;
case 'M': return DBL_MIN_EXP;
case 'U': return DBL_MIN;
case 'L': return DBL_MAX_EXP;
case 'O': return DBL_MAX; |
@devinamatthews In case you missed my question from earlier. |
|
I could have sworn I responded... oh well. This is because things like the radix and rounding behavior don't depend on the precision, so there is only one constant instead of three (float, double, long double). |
|
Okay, thanks. |
* commit 'e366665c': Fixed stale API calls to membrk API in gemmlike. Fixed bli_init.c compile-time error on OSX clang. Fixed configure breakage on OSX clang. Fixed one-time use property of bli_init() (flame#525). CREDITS file update. Added Graviton2 Neoverse N1 performance results. Remove unnecesary windows/zen2 directory. Add vzeroupper to Haswell microkernels. (flame#524) Fix Win64 AVX512 bug. Add comment about make checkblas on Windows CREDITS file update. Test installation in Travis CI Add symlink to blis.pc.in for out-of-tree builds Revert "Always run `make check`." Always run `make check`. Fixed configure script bug. Details: - Fixed kernel list string substitution error by adding function substitute_words in configure script. if the string contains zen and zen2, and zen need to be replaced with another string, then zen2 also be incorrectly replaced. Update POWER10.md Rework POWER10 sandbox Skip clearing temp microtile in gemmlike sandbox. Fix asm warning Sandbox header edits trigger full library rebuild. Add vhsubpd/vhsubpd. Fixed bugs in cpackm kernels, gemmlike code. Armv8A Rename Regs for Safe Darwin Compile Armv8A Rename Regs for Clang Compile: FP32 Part Armv8A Rename Regs for Clang Compile: FP64 Part Asm Flag Mingling for Darwin_Aarch64 Added a new 'gemmlike' sandbox. Updated Fugaku (a64fx) performance results. Add explicit compiler check for Windows. Remove `rm-dupls` function in common.mk. Travis CI Revert Unnecessary Extras from 91d3636 Adjust TravisCI Travis Support Arm SVE Added 512b SVE-based a64fx subconfig + SVE kernels. Replace bli_dlamch with something less archaic (flame#498) Allow clang for ThunderX2 config AMD-Internal: [CPUPL-2698] Change-Id: I561ca3959b7049a00cc128dee3617be51ae11bc4
Since it's not 1970 anymore the standard library helpfully provides all of the floating point limit constants needed for
[sd]lamch. Fixes #497.