@heaseny (Owner) commented Feb 19, 2020

Description

(Brief description on what this PR is about)

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • The PR title starts with [MXNET-$JIRA_ID], where $JIRA_ID refers to the relevant JIRA issue created (except PRs with tiny changes)
  • Changes are complete (i.e. I finished coding on this PR)
  • All changes have test coverage:
  • Unit tests are added for small changes to verify correctness (e.g. adding a new operator)
  • Nightly tests are added for complicated/long-running ones (e.g. changing distributed kvstore)
  • Build tests will be added for build configuration changes (e.g. adding a new build option with NCCL)
  • Code is well-documented:
  • For user-facing API changes, API doc string has been updated.
  • For new C++ functions in header files, their functionalities and arguments are documented.
  • For new examples, README.md is added to explain what the example does, the source of the dataset, expected performance on test set and reference to the original paper if applicable
  • Check the API doc at https://mxnet-ci-doc.s3-accelerate.dualstack.amazonaws.com/PR-$PR_ID/$BUILD_ID/index.html
  • To the best of my knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

  • Feature1, tests, (and when applicable, API doc)
  • Feature2, tests, (and when applicable, API doc)

Comments

  • If this change is a backward incompatible change, why must this change be made.
  • Interesting edge cases to note here

leezu and others added 30 commits January 20, 2020 11:56
#17324)

As of jemalloc 5, the default jemalloc build cannot be used in libraries that are dlopened. However, libmxnet.so is dlopened by Python (ctypes). To use MXNet with jemalloc 5, users must not link to the system libjemalloc.so but must instead link to a libjemalloc compiled with DISABLE_INITIAL_EXEC_TLS.
* license under bsd

* fix rat exclude
* add batch_axis in validation handler

* Add test case for batch axis support

* change test class name
* fix build error due to #17128

* whitelist all *MX* API

* Update libmxnet.sym

* Update libmxnet.ver

Co-authored-by: Sheng Zha <[email protected]>
- Add missing iot urlParam
- Fix use of potentially undefined variables. For example, if the &processor=
  url parameter was unspecified, this previously caused an undefined-variable
  JavaScript error, leading to an empty page.
* Update FindNCCL.cmake

If Cuda root dir is not specified, search for the symlinked unix default

* Update FindNCCL.cmake

* fix the cmake variable name

FindCUDAToolkit exposes result variables (cmake variables). Use those.

https://github.com/apache/incubator-mxnet/blob/28e053edb4f2079743458bf087557bcac7e58c62/cmake/Modules/FindCUDAToolkit.cmake#L427-L464

* search for library in /usr/local/cuda

* comment if

* indent, comment fix

* fix indent, add nccl_root alongside nccl_root_dir

Also add warning about deprecation

* fix indent, remove nccl_root checks

* Update FindNCCL.cmake
* add api

* make doc compatible with c_api.h INT64 docs
* ndims and shape of dims

* add c_api definition and update invoking of capi in sparse

* minor fixes

* fix c_api for sparse64

* fix lint, add test

* change test, index_t for thread

* fix test

* replace userdefined dtype with standard dtype

* revert dtype to int for ndim of 64bit SparseAPI, retain uint32 for 32bit API

* Sparse 32bit as well as 64bit API ndim dtype made int

* revert index_t to IType; it is int64 anyway, so there is no need to hardcode it

* fix sparseEx API

* fix the api breaking change for 32bit API

* remove template parameter for ndim, aux_ndim since internally it's always int
* Add event handlers in validation handler

* update doc string
* fix template param as ndim is internally int

* Update c_api.cc
* use temp file

* fix dependency

* Update model_store.py

* Update test_gluon_model_zoo.py

* remove NamedTempFile
* norm

* full test

* add default behaviour

* add col norm forward

* add matrix col row norms forward

* row col norm backward

* improve tests

* beautify cpp

* update broadcast_op

* C1002

* billie holiday even told it even better

* probing for windows unittest numpy error

* update test

* update test

* fix style

* retrigger unix ci

* update according to reviews

* fix backward set_num_input

* fix CI
* fix seed for mkldnn test

* use fixed seed for mkldnn test
* fix cpp predict license

* add white list (#210)

* fix white list (#211)

Co-authored-by: Lai Wei <[email protected]>
Includes ps-lite CMakeLists.txt refactor
* Add config.cmake

* Update guide

* Address comments

* Remove undocumented include(build/private/local_config.cmake)

* Fix typo
* safe load for yaml

* Trigger notification
leezu and others added 29 commits February 13, 2020 18:35
Build from source only works with JDK < 9 due to dependency on
maven-compiler-plugin and `make ratcheck` is thus broken on newer systems.
)

* Add cmake/upstream folder

* Add upstream select_compute_arch.cmake

* Change CUDA_COMMON_GPU_ARCHITECTURES

* Include License

* Workaround rat-exclude problems
* Initial commit - added first batch of misc ops

* Initial commit - added first batch of misc ops

* Added remaining misc ops, including Custom op logic

* Added more test cases, fixed lint errors

* Update documentation

* Added run_backward=True for ops supporting backwards runs

* Added issue link for bilinear UpSampling

* Added remaining misc ops, including Custom op logic

* Update documentation

* Updated alias map

* Fixed missing and incorrect alias issues

* Added remaining missing aliases

* Fixed Custom profile dump parsing and alias

* Switched to using sets for O(1) op membership checks

* Added fix for dtype issue in master
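The switch to sets noted in the list above matters because a membership test on a Python list scans elements one by one, while a set uses hashing. A minimal sketch of the idea follows; the op names and the helper `filter_supported` are hypothetical and are not taken from the MXNet code base.

```python
# Hypothetical op names for illustration only; not the actual MXNet lists.
SUPPORTED_OPS = {"reshape", "squeeze", "Custom", "UpSampling"}


def filter_supported(op_names, supported=SUPPORTED_OPS):
    """Return the ops from op_names that appear in `supported`, keeping order."""
    # `name in supported` is an average O(1) hash lookup for a set,
    # versus an O(n) linear scan if `supported` were a list.
    return [name for name in op_names if name in supported]


selected = filter_supported(["squeeze", "unknown_op", "Custom"])
```

The result keeps the input order because only the lookup structure is a set; the list being filtered is still iterated in sequence.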
* fix storage type of softmax backward

* remove unused variable
* Drop _cy2

* Drop Python 2 specific code in mxnet and tests

* Replace io.open with open

* Drop from __future__ imports

* Fix lint
* support broadcasting on the indexed axis

* support extra cases for broadcasting setitem, fixing a few other bugs in indexing
cublasGemmBatchedEx is only supported on GPUs with compute capability 5.0 or greater.

Fixes a bug in #16408
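The capability threshold stated in this commit message can be expressed as a simple version comparison. The sketch below is illustrative only: the actual fix lives in MXNet's C++/CUDA code, and the function name `can_use_gemm_batched_ex` is hypothetical.

```python
# Threshold taken from the commit message: compute capability 5.0 or greater.
MIN_CAPABILITY = (5, 0)


def can_use_gemm_batched_ex(major: int, minor: int) -> bool:
    """Return True if a GPU with compute capability major.minor meets the threshold."""
    # Tuples compare element by element, so (5, 2) >= (5, 0) and (3, 7) < (5, 0).
    return (major, minor) >= MIN_CAPABILITY
```

A caller would presumably fall back to a different GEMM path when this returns False.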
* add op insert

fix lint and compile error

fix pylint error

* fix slice bug

* modify code logic and patterns

fix sanity error

* reuse existing function / modify alignment

* support inserting a single number

* use range_fwd kernel

* reduce code

* fix a small mistake / remove redundant function

* reduce code again / resolve function conflict

* fix pylint error / decompose main function into small pieces

* test CI

* test CI 2

* test CI 3

* test CI 4

fix sanity error

* op : insert tensor

* op : insert scalar

fix sanity error

* op : insert by slice

fix compile error

fix compile error again

* fix code style problem
* enable mkldnn by default in pip packages

* add make file for native build

* rm *mkl build scripts

* remove *mkl variants

* Change readme/cd

* clear native build configurations

* fix static build test in ci

* git mv linux_cu102mkl.mk linux_cu102.mk

* fix merge conflict
* add gpu doc portion

* resolve sam comments

* add lib_api and resolve comments

* resolve sam comments

* try remove italic

* fix typo and add mutable example

* retrigger ci
* Add Bfloat16

* mshadow support bf16

* rebase bf16 mkldnn1.0

* support bf16 gemm

* resolve fp32 ip bwd bug

* add other bf16 ops

* change func name from fp16 to lp16 (low precision 16), to include bf16

* add amp_cast bf16 support for ndarray

* fix executor copy_params

* add test case for bf16

* remove numpy dtype hook for bf16

* add bf16 type support

* rebase to mxnet master

* add single conv test

* fix symbolic inference

* add dtype check when copy

* add single conv and bn test

* skip fp16 amp_cast test in cpu

* Fix resnet50 first convolution

* Skip first convolution for bfloat16

* support bf16 fallback compute

* recover origin test

* add some bf16 unittests

* fix bf16 bn test, enhance assert_almost_equal_with_err

* using assert_almost_equal_with_err for fallback bn test

* add relu6 bf16 support

* fix lint

* fix subgraph conv with data=0

* mkldnn doesn't support 0 dim tensor

* rm dtype check when copy

* using bf16 tvm

* rm bf16 mnist demo

* use official tvm

* change function name; fix lint error

* fix clang check error:conditional expression is ambiguous; 'float' can be converted to 'mshadow::bfloat::bf16_t' and vice versa

* nvcc compiler build pass

* fix gpu amp cast symbol error

* fix mnist training error

* fix cpp test: Engine.VarVersion error

* workaround cpp failed test mkldnn fc bwd

* to fix mkldnn test_mkldnn_ndarray_slice error

* 1. move some code from np_broadcast_reduce_op_value.cc to np_broadcast_reduce_op_value_part2.cc to pass Win CPU/GPU build (fatal error C1002: compiler is out of heap space in pass 2)
2. rm debug code

* use official dlpack

* rename np_broadcast_reduce_op_value_part2.cc and add some description

* 1. update dlpack url in .gitmodule
2. disable mkldnn fc bwd

* fix remaining NodePtr due to tvm update

* mv some code from mxnet_op.h to mxnet_op_kernel_assign.h to avoid WIN compiler error 'fatal error C1002: compiler is out of heap space in pass 2'

* fix WIN CPU build fail:compiler is out of heap space in pass 2

* fix WIN build fail

* fix lint

* add print for test bf16_concat

* fix bf16 test fail

* disable bf16 concat test

* tmp skip to root cause edge test halt

* fix bf16_bn test error

* enable test_bulk

* tmp rm bf16 to locate edge error

* Revert "tmp rm bf16 to locate edge error"

This reverts commit 7360246.

* add Apache license header

* trigger CI

* make bf16 bn test more robust

Co-authored-by: Zhennan Qin <[email protected]>
Co-authored-by: YixinBao <[email protected]>
Co-authored-by: Xinyu Chen <[email protected]>
Co-authored-by: Wuxun Zhang <[email protected]>
* update symbol to json

add remove_amp_cast argument to stay consistent with symbol.save

* retrigger CI

Co-authored-by: JackieWu <[email protected]>
* Additional fix for vector access. See 9634786 for the original.

* CI

* ci

* ci

* retrigger CI

* ci

Co-authored-by: JackieWu <[email protected]>
* Fix OS X staticbuild and add tests

* Update OS X build from source documentation
…17608)

Fixes `gzip: stdin: not in gzip format` errors due to CI trying to unzip HTML page containing the forward notice.
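This kind of failure, where an HTML notice page is served in place of an archive, can be detected before decompression by checking the gzip magic bytes. The sketch below shows one way to do that in Python; it is not the change made in this commit, and the helper name `decompress_if_gzip` and the sample payloads are hypothetical.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream (RFC 1952)


def decompress_if_gzip(payload: bytes) -> bytes:
    """Decompress payload, raising a clear error if it is not gzip data."""
    if payload[:2] != GZIP_MAGIC:
        # For example, an HTML redirect or notice page served instead of the archive.
        raise ValueError("payload is not gzip data; first bytes: %r" % payload[:16])
    return gzip.decompress(payload)


# Hypothetical payloads: a real gzip stream and an HTML page standing in for it.
archive = gzip.compress(b"hello")
html_page = b"<html><body>This download has moved</body></html>"
restored = decompress_if_gzip(archive)
```

Failing early with the first bytes of the payload in the message makes the cause easier to spot in CI logs than a bare "not in gzip format" error.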
…d vsplit, add interoperability tests for h/v/dsplit (#17478)
* C++

* rayleigh

* exponential

* c++
* fix mkldnn fc bwd bug due to data inplace

* enable mkldnn fc bwd

* fix cpp tests

* try: fix random seed

* fix cpp test

* loosen rtol for fc cpp test

* improve error message

* limit max value for mkldnn tensors

* limit the max value of test tensors

* fix lint

* remove fixed random seed

* address review comments

* Revert "address review comments"

This reverts commit 56d873f.

Co-authored-by: rongzha1 <[email protected]>
* Add fallback ops

* More fallback ops

* Add more fallback ops

* Fix astype

* Support mixed type of mx.np.ndarray and np.ndarray in binary ops

* More fix

* Add result_type

* Remove implemented ops

* Fix

* Add more tests

* fallback---------------

* 8

* all fallback

* ok

* delete repetition

Co-authored-by: reminisce <[email protected]>
@heaseny heaseny merged commit 999d78b into heaseny:master Feb 19, 2020