forked from apache/mxnet
sync from incubator-mxnet #1
Merged
Conversation
#17324) As of jemalloc 5, the default jemalloc build cannot be used in libraries that are dlopened. However, libmxnet.so is dlopened by Python (ctypes). To use MXNet with jemalloc 5, users must not link against the system libjemalloc.so but must instead link against a libjemalloc compiled with DISABLE_INITIAL_EXEC_TLS.
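For context, the dlopen in question happens when Python loads the MXNet shared library through ctypes. A minimal sketch of that load path (the real loader in MXNet's Python package has more platform-specific logic):

```python
import ctypes

# Python loads libmxnet.so via ctypes, which dlopen()s the shared library.
# With jemalloc 5 built in its default configuration this dlopen path is
# unsupported; linking MXNet against a jemalloc built with
# DISABLE_INITIAL_EXEC_TLS (jemalloc's --disable-initial-exec-tls configure
# option) avoids the problem, as described in the commit message above.
lib = ctypes.CDLL("libmxnet.so")
print(lib)  # ctypes handle to the dlopened library
```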
* license under bsd * fix rat exclude
* add batch_axis in validation handler * Add test case for batch axis support * change test class name
* fix build error due to #17128 * whitelist all *MX* API * Update libmxnet.sym * Update libmxnet.ver Co-authored-by: Sheng Zha <[email protected]>
- Add missing iot urlParam
- Fix use of potentially undefined variables. For example, if the &processor= URL parameter was unspecified, this previously caused an undefined-variable JavaScript error that led to an empty page.
* Update FindNCCL.cmake: if the CUDA root dir is not specified, search for the symlinked Unix default
* Update FindNCCL.cmake
* fix the cmake variable name: FindCUDAToolkit exposes result variables (cmake variables), use those. https://github.com/apache/incubator-mxnet/blob/28e053edb4f2079743458bf087557bcac7e58c62/cmake/Modules/FindCUDAToolkit.cmake#L427-L464
* search for library in /usr/local/cuda
* comment if
* indent, comment fix
* fix indent, add nccl_root alongside nccl_root_dir; also add warning about deprecation
* fix indent, remove nccl_root checks
* Update FindNCCL.cmake
* add api * make doc compatible with c_api.h INT64 docs
…her_nd error test (#17360)
* ndims and shape of dims
* add c_api definition and update invoking of capi in sparse
* minor fixes
* fix c_api for sparse64
* fix lint, add test
* change test, indext for thread
* fix test
* replace user-defined dtype with standard dtype
* revert dtype to int for ndim of the 64-bit sparse API, retain uint32 for the 32-bit API
* Sparse 32-bit as well as 64-bit API ndim dtype made int
* revert index_t to IType as it is int64 anyway, no need to hardcode
* fix sparseEx API
* fix the API-breaking change for the 32-bit API
* remove template parameter for ndim, aux_ndim since internally it is always int
…ed memory footprint (#17410)
* Add event handlers in validation handler * update doc string
* fix template param as ndim is internally int * Update c_api.cc
* use temp file * fix dependency * Update model_store.py * Update test_gluon_model_zoo.py * remove NamedTempFile
* norm
* full test
* add default behaviour
* add col norm forward
* add matrix col row norms forward
* row col norm backward
* improve tests
* beautify cpp
* update broadcast_op
* C1002
* billie holiday even told it even better
* probing for windows unittest numpy error
* update test
* update test
* fix style
* retrigger unix ci
* update according to reviews
* fix backward set_num_input
* fix CI
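For reference, the column and row matrix norms being added reduce over one axis of the matrix; plain NumPy is used below purely to illustrate the semantics, not the exact MXNet operator names:

```python
import numpy as np

x = np.arange(6, dtype=np.float32).reshape(2, 3)

# Column norms: reduce over the row axis (axis=0), one norm per column.
col_norms = np.linalg.norm(x, axis=0)

# Row norms: reduce over the column axis (axis=1), one norm per row.
row_norms = np.linalg.norm(x, axis=1)

print(col_norms.shape, row_norms.shape)  # (3,) (2,)
```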
* fix seed for mkldnn test * use fixed seed for mkldnn test
* fix cpp predict license * add white list (#210) * fix white list (#211) Co-authored-by: Lai Wei <[email protected]>
Includes ps-lite CMakeLists.txt refactor
* Add config.cmake * Update guide * Address comments * Remove undocumented include(build/private/local_config.cmake) * Fix typo
* safe load for yaml * Trigger notification
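A minimal illustration of the change, assuming "safe load for yaml" refers to the usual PyYAML idiom of using yaml.safe_load instead of yaml.load:

```python
import yaml

# yaml.load with the default Loader can construct arbitrary Python objects
# from untrusted input; yaml.safe_load restricts parsing to plain YAML tags.
config_text = "optimizer: sgd\nlearning_rate: 0.01\n"
config = yaml.safe_load(config_text)
print(config["optimizer"], config["learning_rate"])
```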
Build from source only works with JDK < 9 due to a dependency on maven-compiler-plugin, so `make ratcheck` is broken on newer systems.
* Initial commit - added first batch of misc ops
* Initial commit - added first batch of misc ops
* Added remaining misc ops, including Custom op logic
* Added more test cases, fixed lint errors
* Update documentation
* Added run_backward=True for ops supporting backwards runs
* Added issue link for bilinear UpSampling
* Added remaining misc ops, including Custom op logic
* Update documentation
* Updated alias map
* Fixed missing and incorrect alias issues
* Added remaining missing aliases
* Fixed Custom profile dump parsing and alias
* Switched to using sets for O(1) op membership checks
* Added fix for dtype issue in master
* fix storage type of softmax backward * remove unused variable
* Drop _cy2 * Drop Python 2 specific code in mxnet and tests * Replace io.open with open * Drop from __future__ imports * Fix lint
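A sketch of the kind of Python 2 compatibility code this cleanup removes (illustrative only, not a specific file from the change):

```python
# Before (Python 2 compatible):
#   from __future__ import absolute_import, print_function
#   import io
#   with io.open("README.md", encoding="utf-8") as f:
#       text = f.read()
#
# After (Python 3 only): the __future__ imports are dropped and io.open is
# replaced by the builtin open, which already accepts an encoding argument.
with open("README.md", encoding="utf-8") as f:
    text = f.read()
print(len(text))
```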
* support broadcasting on the indexed axis * support extra cases for broadcasting setitem, fixing a few other bugs in indexing
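Roughly the assignment pattern this enables, illustrated with plain NumPy indexing semantics (the PR's tests cover the MXNet ndarray equivalents):

```python
import numpy as np

a = np.zeros((3, 4), dtype=np.float32)
idx = np.array([0, 2])

# Broadcasting on the indexed axis: a value of shape (4,) is broadcast across
# the two selected rows instead of requiring an exact (2, 4) right-hand side.
a[idx] = np.arange(4, dtype=np.float32)
print(a[idx])
```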
cublasGemmBatchedEx is only supported on GPUs with compute capability 5.0 or greater. Fixes a bug in #16408.
* add op insert; fix lint and compile error; fix pylint error
* fix slice bug
* modify code logic and patterns; fix sanity error
* reuse existing function / modify alignment
* support inserting a single number
* use range_fwd kernel
* reduce code
* fix a small mistake / remove redundant function
* reduce code again / resolve function conflict
* fix pylint error / decompose main function into small pieces
* test CI
* test CI 2
* test CI 3
* test CI 4; fix sanity error
* op: insert tensor
* op: insert scalar; fix sanity error
* op: insert by slice; fix compile error; fix compile error again
* fix code style problem
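The operator follows NumPy's insert semantics; the three supported index forms mentioned above (scalar, tensor of indices, slice) look like this in plain NumPy, used here only to illustrate the intended behaviour:

```python
import numpy as np

a = np.array([1, 2, 4, 5])

print(np.insert(a, 2, 3))                 # scalar index -> [1 2 3 4 5]
print(np.insert(a, [1, 3], [10, 20]))     # sequence of indices -> [1 10 2 4 20 5]
print(np.insert(a, slice(0, 2), [7, 8]))  # slice index -> [7 1 8 2 4 5]
```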
* enable mkldnn by default in pip packages
* add make file for native build
* rm *mkl build scripts
* remove *mkl variants
* Change readme/cd
* clear native build configurations
* fix static build test in ci
* git mv linux_cu102mkl.mk linux_cu102.mk
* fix merge conflict
* add gpu doc portion * resolve sam comments * add lib_api and resolve comments * resolve sam comments * try remove italic * fix typo and add mutable example * retrigger ci
* Add Bfloat16
* mshadow support bf16
* rebase bf16 mkldnn1.0
* support bf16 gemm
* resolve fp32 ip bwd bug
* add other bf16 ops
* change func name from fp16 to lp16 (low precision 16), to include bf16
* add amp_cast bf16 support for ndarray
* fix executor copy_params
* add test case for bf16
* remove numpy dtype hook for bf16
* add bf16 type support
* rebase to mxnet master
* add single conv test
* fix symbolic inference
* add dtype check when copy
* add single conv and bn test
* skip fp16 amp_cast test in cpu
* Fix resnet50 first convolution
* Skip first convolution for bfloat16
* support bf16 fallback compute
* recover origin test
* add some bf16 unittests
* fix bf16 bn test, enhance assert_almost_equal_with_err
* using assert_almost_equal_with_err for fallback bn test
* add relu6 bf16 support
* fix lint
* fix subgraph conv with data=0
* mkldnn doesn't support 0-dim tensor
* rm dtype check when copy
* using bf16 tvm
* rm bf16 mnist demo
* use official tvm
* change function name; fix lint error
* fix clang check error: conditional expression is ambiguous; 'float' can be converted to 'mshadow::bfloat::bf16_t' and vice versa
* nvcc compiler build pass
* fix gpu amp cast symbol error
* fix mnist training error
* fix cpp test: Engine.VarVersion error
* workaround cpp failed test mkldnn fc bwd
* to fix mkldnn test_mkldnn_ndarray_slice error
* 1. move some code from np_broadcast_reduce_op_value.cc to np_broadcast_reduce_op_value_part2.cc to pass Win CPU/GPU build (fatal error C1002: compiler is out of heap space in pass 2) 2. rm debug code
* use official dlpack
* rename np_broadcast_reduce_op_value_part2.cc and add some description
* 1. update dlpack url in .gitmodule 2. disable mkldnn fc bwd
* fix remaining NodePtr due to tvm update
* mv some code from mxnet_op.h to mxnet_op_kernel_assign.h to avoid WIN compiler error 'fatal error C1002: compiler is out of heap space in pass 2'
* fix WIN CPU build fail: compiler is out of heap space in pass 2
* fix WIN build fail
* fix lint
* add print for test bf16_concat
* fix bf16 test fail
* disable bf16 concat test
* tmp skip to root cause edge test halt
* fix bf16_bn test error
* enable test_bulk
* tmp rm bf16 to locate edge error
* Revert "tmp rm bf16 to locate edge error" This reverts commit 7360246.
* add Apache license header
* trigger CI
* add robust for test bf16 bn
Co-authored-by: Zhennan Qin <[email protected]>
Co-authored-by: YixinBao <[email protected]>
Co-authored-by: Xinyu Chen <[email protected]>
Co-authored-by: Wuxun Zhang <[email protected]>
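A very rough sketch of the ndarray-level cast mentioned in "add amp_cast bf16 support for ndarray". The exact target dtype string ('bfloat16') is an assumption made for illustration, not something stated in the commit list:

```python
import mxnet as mx

x = mx.nd.random.uniform(shape=(2, 3))  # float32 input

# Assumption: with this change, the existing amp_cast operator can target
# bfloat16 on CPU (MKL-DNN path). The dtype string below is illustrative.
y = mx.nd.amp_cast(x, dtype='bfloat16')
print(y.dtype)
```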
* update symbol-to-json: add remove_amp_cast argument to keep it consistent with symbol.save * retrigger CI Co-authored-by: JackieWu <[email protected]>
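A sketch of the new argument, assuming it mirrors the flag already present on symbol.save as the commit message says:

```python
import mxnet as mx

data = mx.sym.Variable('data')
sym = mx.sym.FullyConnected(data, num_hidden=10)

# Existing behaviour referenced above: save() already exposes remove_amp_cast.
sym.save('model-symbol.json', remove_amp_cast=True)

# New in this change (per the commit message): tojson() accepts the same flag,
# stripping amp_cast nodes from the serialized graph.
json_str = sym.tojson(remove_amp_cast=True)
print(len(json_str))
```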
* Additional fix for vector access. See 9634786 for the original. * CI * ci * ci * retrigger CI * ci Co-authored-by: JackieWu <[email protected]>
* Fix OS X staticbuild and add tests * Update OS X build from source documentation
* Fix * Try to fix...
…17608) Fixes `gzip: stdin: not in gzip format` errors caused by CI trying to gunzip an HTML page containing the forwarding notice.
…d vsplit, add interoperability tests for h/v/dsplit (#17478)
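For reference, the split variants covered by the interoperability tests follow NumPy's semantics; plain NumPy is shown below, with the MXNet numpy module expected to behave the same way:

```python
import numpy as np

x = np.arange(24).reshape(2, 3, 4)

h = np.hsplit(x, 3)  # split along axis 1 -> 3 arrays of shape (2, 1, 4)
v = np.vsplit(x, 2)  # split along axis 0 -> 2 arrays of shape (1, 3, 4)
d = np.dsplit(x, 4)  # split along axis 2 -> 4 arrays of shape (2, 3, 1)
print(h[0].shape, v[0].shape, d[0].shape)
```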
* C++ * rayleigh * exponential * c++
* fix mkldnn fc bwd bug due to data inplace
* enable mkldnn fc bwd
* fix cpp tests
* try: fix random seed
* fix cpp test
* loose rtol for fc cpp test
* improve error message
* limit max value for mkldnn tensors
* limit the max value of test tensors
* fix lint
* remove fixed random seed
* address review comments
* Revert "address review comments" This reverts commit 56d873f.
Co-authored-by: rongzha1 <[email protected]>
* Add fallback ops
* More fallback ops
* Add more fallback ops
* Fix astype
* Support mixed type of mx.np.ndarray and np.ndarray in binary ops
* More fix
* Add result_type
* Remove implemented ops
* Fix
* Add more tests
* fallback---------------
* 8
* all fallback
* ok
* delete repetition
Co-authored-by: reminisce <[email protected]>
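A rough illustration of the mixed-type case mentioned above; this assumes a binary op can take an official NumPy array on one side and keeps the result as an MXNet ndarray, which is the behaviour the commit describes:

```python
import numpy as onp
import mxnet as mx

mx.npx.set_np()  # enable NumPy-compatible semantics

a = mx.np.arange(6).reshape(2, 3)  # mx.np.ndarray
b = onp.ones((2, 3))               # plain numpy.ndarray

# Per this change, binary ops may mix the two array types; the NumPy operand
# is converted and the computation stays on the MXNet side.
c = a + b
print(type(c), c.shape)
```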