forked from apache/mxnet
sync from incubator-mxnet #1
Merged
Conversation
#17324) As of jemalloc 5, the default jemalloc build cannot be used in libraries that are dlopened. However, libmxnet.so is dlopened by Python (ctypes). To use MXNet with jemalloc 5, users must not link against the system libjemalloc.so but must instead link against a libjemalloc compiled with DISABLE_INITIAL_EXEC_TLS.
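For context, the dlopen in question happens when Python loads the MXNet shared library through ctypes. A minimal sketch of that load path (the real loader in MXNet's Python package has more platform-specific logic):

```python
import ctypes

# Python loads libmxnet.so via ctypes, which dlopen()s the shared library.
# With jemalloc 5 built in its default configuration this dlopen path is
# unsupported; linking MXNet against a jemalloc built with
# DISABLE_INITIAL_EXEC_TLS (jemalloc's --disable-initial-exec-tls configure
# option) avoids the problem, as described in the commit message above.
lib = ctypes.CDLL("libmxnet.so")
print(lib)  # ctypes handle to the dlopened library
```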
* license under bsd * fix rat exclude
* add batch_axis in validation handler * Add test case for batch axis support * change test class name
* fix build error due to #17128 * whitelist all *MX* API * Update libmxnet.sym * Update libmxnet.ver Co-authored-by: Sheng Zha <[email protected]>
- Add missing iot urlParam
- Fix use of potentially undefined variables. For example, if the &processor= URL parameter was unspecified, this previously caused an undefined-variable JavaScript error that led to an empty page.
* Update FindNCCL.cmake: if the CUDA root dir is not specified, search for the symlinked Unix default
* Update FindNCCL.cmake
* fix the cmake variable name: FindCUDAToolkit exposes result variables (cmake variables), use those. https://github.com/apache/incubator-mxnet/blob/28e053edb4f2079743458bf087557bcac7e58c62/cmake/Modules/FindCUDAToolkit.cmake#L427-L464
* search for library in /usr/local/cuda
* comment if
* indent, comment fix
* fix indent, add nccl_root alongside nccl_root_dir; also add warning about deprecation
* fix indent, remove nccl_root checks
* Update FindNCCL.cmake
* add api * make doc compatible with c_api.h INT64 docs
…her_nd error test (#17360)
* ndims and shape of dims
* add c_api definition and update invoking of capi in sparse
* minor fixes
* fix c_api for sparse64
* fix lint, add test
* change test, indext for thread
* fix test
* replace user-defined dtype with standard dtype
* revert dtype to int for ndim of the 64-bit sparse API, retain uint32 for the 32-bit API
* Sparse 32-bit as well as 64-bit API ndim dtype made int
* revert index_t to IType as it is int64 anyway, no need to hardcode
* fix sparseEx API
* fix the API-breaking change for the 32-bit API
* remove template parameter for ndim, aux_ndim since internally it is always int
…ed memory footprint (#17410)
* Add event handlers in validation handler * update doc string
* fix template param as ndim is internally int * Update c_api.cc
* use temp file * fix dependency * Update model_store.py * Update test_gluon_model_zoo.py * remove NamedTempFile
* norm
* full test
* add default behaviour
* add col norm forward
* add matrix col row norms forward
* row col norm backward
* improve tests
* beautify cpp
* update broadcast_op
* C1002
* billie holiday even told it even better
* probing for windows unittest numpy error
* update test
* update test
* fix style
* retrigger unix ci
* update according to reviews
* fix backward set_num_input
* fix CI
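For reference, the column and row matrix norms being added reduce over one axis of the matrix; plain NumPy is used below purely to illustrate the semantics, not the exact MXNet operator names:

```python
import numpy as np

x = np.arange(6, dtype=np.float32).reshape(2, 3)

# Column norms: reduce over the row axis (axis=0), one norm per column.
col_norms = np.linalg.norm(x, axis=0)

# Row norms: reduce over the column axis (axis=1), one norm per row.
row_norms = np.linalg.norm(x, axis=1)

print(col_norms.shape, row_norms.shape)  # (3,) (2,)
```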
* fix seed for mkldnn test * use fixed seed for mkldnn test
* fix cpp predict license * add white list (#210) * fix white list (#211) Co-authored-by: Lai Wei <[email protected]>
Includes ps-lite CMakeLists.txt refactor
* Add config.cmake * Update guide * Address comments * Remove undocumented include(build/private/local_config.cmake) * Fix typo
* safe load for yaml * Trigger notification
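A minimal illustration of the change, assuming "safe load for yaml" refers to the usual PyYAML idiom of using yaml.safe_load instead of yaml.load:

```python
import yaml

# yaml.load with the default Loader can construct arbitrary Python objects
# from untrusted input; yaml.safe_load restricts parsing to plain YAML tags.
config_text = "optimizer: sgd\nlearning_rate: 0.01\n"
config = yaml.safe_load(config_text)
print(config["optimizer"], config["learning_rate"])
```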
Build from source only works with JDK < 9 due to a dependency on maven-compiler-plugin, so `make ratcheck` is broken on newer systems.
* Initial commit - added first batch of misc ops
* Initial commit - added first batch of misc ops
* Added remaining misc ops, including Custom op logic
* Added more test cases, fixed lint errors
* Update documentation
* Added run_backward=True for ops supporting backwards runs
* Added issue link for bilinear UpSampling
* Added remaining misc ops, including Custom op logic
* Update documentation
* Updated alias map
* Fixed missing and incorrect alias issues
* Added remaining missing aliases
* Fixed Custom profile dump parsing and alias
* Switched to using sets for O(1) op membership checks
* Added fix for dtype issue in master
* fix storage type of softmax backward * remove unused variable
* Drop _cy2 * Drop Python 2 specific code in mxnet and tests * Replace io.open with open * Drop from __future__ imports * Fix lint
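A sketch of the kind of Python 2 compatibility code this cleanup removes (illustrative only, not a specific file from the change):

```python
# Before (Python 2 compatible):
#   from __future__ import absolute_import, print_function
#   import io
#   with io.open("README.md", encoding="utf-8") as f:
#       text = f.read()
#
# After (Python 3 only): the __future__ imports are dropped and io.open is
# replaced by the builtin open, which already accepts an encoding argument.
with open("README.md", encoding="utf-8") as f:
    text = f.read()
print(len(text))
```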
* support broadcasting on the indexed axis * support extra cases for broadcasting setitem, fixing a few other bugs in indexing
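Roughly the assignment pattern this enables, illustrated with plain NumPy indexing semantics (the PR's tests cover the MXNet ndarray equivalents):

```python
import numpy as np

a = np.zeros((3, 4), dtype=np.float32)
idx = np.array([0, 2])

# Broadcasting on the indexed axis: a value of shape (4,) is broadcast across
# the two selected rows instead of requiring an exact (2, 4) right-hand side.
a[idx] = np.arange(4, dtype=np.float32)
print(a[idx])
```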
cublasGemmBatchedEx is only supported on GPUs with compute capability 5.0 or greater. Fixes a bug in #16408.
* add op insert; fix lint and compile error; fix pylint error
* fix slice bug
* modify code logic and patterns; fix sanity error
* reuse existing function / modify alignment
* support inserting a single number
* use range_fwd kernel
* reduce code
* fix a small mistake / remove redundant function
* reduce code again / resolve function conflict
* fix pylint error / decompose main function into small pieces
* test CI
* test CI 2
* test CI 3
* test CI 4; fix sanity error
* op: insert tensor
* op: insert scalar; fix sanity error
* op: insert by slice; fix compile error; fix compile error again
* fix code style problem
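The operator follows NumPy's insert semantics; the three supported index forms mentioned above (scalar, tensor of indices, slice) look like this in plain NumPy, used here only to illustrate the intended behaviour:

```python
import numpy as np

a = np.array([1, 2, 4, 5])

print(np.insert(a, 2, 3))                 # scalar index -> [1 2 3 4 5]
print(np.insert(a, [1, 3], [10, 20]))     # sequence of indices -> [1 10 2 4 20 5]
print(np.insert(a, slice(0, 2), [7, 8]))  # slice index -> [7 1 8 2 4 5]
```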
* enable mkldnn by default in pip packages
* add make file for native build
* rm *mkl build scripts
* remove *mkl variants
* Change readme/cd
* clear native build configurations
* fix static build test in ci
* git mv linux_cu102mkl.mk linux_cu102.mk
* fix merge conflict
* add gpu doc portion * resolve sam comments * add lib_api and resolve comments * resolve sam comments * try remove italic * fix typo and add mutable example * retrigger ci
* Add Bfloat16
* mshadow support bf16
* rebase bf16 mkldnn1.0
* support bf16 gemm
* resolve fp32 ip bwd bug
* add other bf16 ops
* change func name from fp16 to lp16 (low precision 16), to include bf16
* add amp_cast bf16 support for ndarray
* fix executor copy_params
* add test case for bf16
* remove numpy dtype hook for bf16
* add bf16 type support
* rebase to mxnet master
* add single conv test
* fix symbolic inference
* add dtype check when copy
* add single conv and bn test
* skip fp16 amp_cast test in cpu
* Fix resnet50 first convolution
* Skip first convolution for bfloat16
* support bf16 fallback compute
* recover origin test
* add some bf16 unittests
* fix bf16 bn test, enhance assert_almost_equal_with_err
* using assert_almost_equal_with_err for fallback bn test
* add relu6 bf16 support
* fix lint
* fix subgraph conv with data=0
* mkldnn doesn't support 0-dim tensor
* rm dtype check when copy
* using bf16 tvm
* rm bf16 mnist demo
* use official tvm
* change function name; fix lint error
* fix clang check error: conditional expression is ambiguous; 'float' can be converted to 'mshadow::bfloat::bf16_t' and vice versa
* nvcc compiler build pass
* fix gpu amp cast symbol error
* fix mnist training error
* fix cpp test: Engine.VarVersion error
* workaround cpp failed test mkldnn fc bwd
* to fix mkldnn test_mkldnn_ndarray_slice error
* 1. move some code from np_broadcast_reduce_op_value.cc to np_broadcast_reduce_op_value_part2.cc to pass Win CPU/GPU build (fatal error C1002: compiler is out of heap space in pass 2) 2. rm debug code
* use official dlpack
* rename np_broadcast_reduce_op_value_part2.cc and add some description
* 1. update dlpack url in .gitmodule 2. disable mkldnn fc bwd
* fix remaining NodePtr due to tvm update
* mv some code from mxnet_op.h to mxnet_op_kernel_assign.h to avoid WIN compiler error 'fatal error C1002: compiler is out of heap space in pass 2'
* fix WIN CPU build fail: compiler is out of heap space in pass 2
* fix WIN build fail
* fix lint
* add print for test bf16_concat
* fix bf16 test fail
* disable bf16 concat test
* tmp skip to root cause edge test halt
* fix bf16_bn test error
* enable test_bulk
* tmp rm bf16 to locate edge error
* Revert "tmp rm bf16 to locate edge error" This reverts commit 7360246.
* add Apache license header
* trigger CI
* add robust for test bf16 bn
Co-authored-by: Zhennan Qin <[email protected]>
Co-authored-by: YixinBao <[email protected]>
Co-authored-by: Xinyu Chen <[email protected]>
Co-authored-by: Wuxun Zhang <[email protected]>
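A very rough sketch of the ndarray-level cast mentioned in "add amp_cast bf16 support for ndarray". The exact target dtype string ('bfloat16') is an assumption made for illustration, not something stated in the commit list:

```python
import mxnet as mx

x = mx.nd.random.uniform(shape=(2, 3))  # float32 input

# Assumption: with this change, the existing amp_cast operator can target
# bfloat16 on CPU (MKL-DNN path). The dtype string below is illustrative.
y = mx.nd.amp_cast(x, dtype='bfloat16')
print(y.dtype)
```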
* update symbol-to-json: add remove_amp_cast argument to keep it consistent with symbol.save * retrigger CI Co-authored-by: JackieWu <[email protected]>
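A sketch of the new argument, assuming it mirrors the flag already present on symbol.save as the commit message says:

```python
import mxnet as mx

data = mx.sym.Variable('data')
sym = mx.sym.FullyConnected(data, num_hidden=10)

# Existing behaviour referenced above: save() already exposes remove_amp_cast.
sym.save('model-symbol.json', remove_amp_cast=True)

# New in this change (per the commit message): tojson() accepts the same flag,
# stripping amp_cast nodes from the serialized graph.
json_str = sym.tojson(remove_amp_cast=True)
print(len(json_str))
```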
* Additional fix for vector access. See 9634786 for the original. * CI * ci * ci * retrigger CI * ci Co-authored-by: JackieWu <[email protected]>
* Fix OS X staticbuild and add tests * Update OS X build from source documentation
* Fix * Try to fix...
…17608) Fixes `gzip: stdin: not in gzip format` errors caused by CI trying to gunzip an HTML page containing the forwarding notice.
…d vsplit, add interoperability tests for h/v/dsplit (#17478)
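For reference, the split variants covered by the interoperability tests follow NumPy's semantics; plain NumPy is shown below, with the MXNet numpy module expected to behave the same way:

```python
import numpy as np

x = np.arange(24).reshape(2, 3, 4)

h = np.hsplit(x, 3)  # split along axis 1 -> 3 arrays of shape (2, 1, 4)
v = np.vsplit(x, 2)  # split along axis 0 -> 2 arrays of shape (1, 3, 4)
d = np.dsplit(x, 4)  # split along axis 2 -> 4 arrays of shape (2, 3, 1)
print(h[0].shape, v[0].shape, d[0].shape)
```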
* C++ * rayleigh * exponential * c++
* fix mkldnn fc bwd bug due to data inplace
* enable mkldnn fc bwd
* fix cpp tests
* try: fix random seed
* fix cpp test
* loose rtol for fc cpp test
* improve error message
* limit max value for mkldnn tensors
* limit the max value of test tensors
* fix lint
* remove fixed random seed
* address review comments
* Revert "address review comments" This reverts commit 56d873f.
Co-authored-by: rongzha1 <[email protected]>
* Add fallback ops
* More fallback ops
* Add more fallback ops
* Fix astype
* Support mixed type of mx.np.ndarray and np.ndarray in binary ops
* More fix
* Add result_type
* Remove implemented ops
* Fix
* Add more tests
* fallback---------------
* 8
* all fallback
* ok
* delete repetition
Co-authored-by: reminisce <[email protected]>
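A rough illustration of the mixed-type case mentioned above; this assumes a binary op can take an official NumPy array on one side and keeps the result as an MXNet ndarray, which is the behaviour the commit describes:

```python
import numpy as onp
import mxnet as mx

mx.npx.set_np()  # enable NumPy-compatible semantics

a = mx.np.arange(6).reshape(2, 3)  # mx.np.ndarray
b = onp.ones((2, 3))               # plain numpy.ndarray

# Per this change, binary ops may mix the two array types; the NumPy operand
# is converted and the computation stays on the MXNet side.
c = a + b
print(type(c), c.shape)
```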