Automate javadocs by natke · Pull Request #27 · natke/onnxruntime

natke · 2022-08-08T22:53:48Z

No description provided.

* fix mpi build for gcc8 or higher * fix memory profile for partial graph run * Revert "fix mpi build for gcc8 or higher" This reverts commit fb60beb. * remove debug code * fix build * fix build * fix cpplint and python black format

…icrosoft#11491) * Using vectorized loads (float2) for fp16 to improve performance * Fix a few warnings from cpplint * Fix a few warnings from cpplint * Use __float2half2_rn and fix some cpplint warnings * Move some computations to LaunchFastGeluKernel * Fix some Lint C++ warning * Using vectorized loads (float4) for fp16 to improve performance * Switch whether to optimize FastGelu with float4 vectorization * Switch to float4 memory access based on input_length in FastGelu * Comment how to set the threshold of float2 and float4 vectorized kernels * Add FastGelu fp16 unit tests for bias_length = 2 and 8 * Make vectorized kernels generic with aligned_vector * Unify the vectorized kernels with/without bias * Refactor the code to suppress cpplint warnings * Solve formatting issues * Remove cudaDeviceProp from FastGeluKernel and LaunchFastGeluKernel * Move fast_gelu_impl.h to rocm/bert * Fix some Lint C++ warnings and code alignment

* Register signal ops for op set 17

Note code is mostly being moved, not added. These ops were previously only registered as Microsoft contrib ops and only built if `BUILD_MS_EXPERIMENTAL_OPS=1`. They've been added to the ai.onnx standard op set in version 17.

Main components of this change:
* Move the kernels from the contrib_ops directory to the core directory.
* Add function bodies for ms experimental ops. This will allow old models that use the contrib ops to continue to function. All the function bodies consist of a single op (the new standard op), so performance overhead should be minimal.

Minor clean-up also in this change:
* De-duplicate get_scalar_value_from_tensor: put it in a new utils.h.
* Fix some bugs that caused compilation errors with the experimental ops. Tested with `build.sh --ms_experimental`
* Fix some spelling errors and lint violations.
* Replace a couple of switch statements with `MLTypeCallDispatcher`.
* Use `InlineVector` instead of `std::vector`.

Unblocks microsoft#11640

…ft#11935) Improve performance of BiasGelu on OneDNN execution provider This modifies how BiasGelu is handled by the OneDNN execution provider by executing the gelu_erf primitive as a postop of the binary_add primitive. Also fixes extra data copies made when running on GPU. Signed-off-by: George Nash <[email protected]>

Bumps [async](https://github.com/caolan/async) from 2.6.3 to 2.6.4. - [Release notes](https://github.com/caolan/async/releases) - [Changelog](https://github.com/caolan/async/blob/v2.6.4/CHANGELOG.md) - [Commits](caolan/async@v2.6.3...v2.6.4) --- updated-dependencies: - dependency-name: async dependency-type: indirect ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

* create op from ep * read input count from context * create holder to host nodes * fix typo * cast type before comparison * throw error on API fail * silence warning from minimal build * switch to unique_ptr with deleter to host nodes * fix typo * fix build err for minimal * fix build err for minimal * add UT for conv * enable test on CUDA * add comment * fix typo * use gsl::span and string view for Node constructor * Added two APIs - CopyKernelInfo and ReleaseKernelInfo * pass gsl::span by value * switch to span<NodeArg* const> to allow for reference to const containers * fix typo * fix reduced build err * fix reduced build err * refactoring node construction logic * rename exceptions * add input and output count as arguments for op creation * refactor static member * use ORT_CATCH instead of catch * cancel try catch * add static value name map * format input definition and set err code * fix comments * fix typo

(1) add --run_shape_inference to make shape inference optional
(2) add --vocab_mask to make the input optional
(3) add --overwrite in gpt2 convert_to_onnx to allow overwriting an existing raw ONNX model exported from PyTorch
(4) save gpt2 model tensors to one external data file by default
(5) group convert_beam_search arguments into multiple groups
(6) make --decoder_onnx optional for gpt2 model
(7) replace print with logger
(8) update shape inference function to support external data
(9) when saving external data, show a warning if onnx version < 1.12

Bumps [electron](https://github.com/electron/electron) from 13.6.6 to 15.5.5. - [Release notes](https://github.com/electron/electron/releases) - [Changelog](https://github.com/electron/electron/blob/main/docs/breaking-changes.md) - [Commits](electron/electron@v13.6.6...v15.5.5) --- updated-dependencies: - dependency-name: electron dependency-type: direct:development ... Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

…microsoft#11968) * [ROCm] disable expected failure tests PoolTest.MaxPool_10_DilationPadding_?d * [ROCm] Add AveragePool, GlobalAveragePool, MaxPool, GlobalMaxPool Ops * (To squash after review) Replace rocm/nn/pool.cc with amd_hipify.py changes * [ROCM] Replace miCompat with Helper functions * (to squash) fix the compiling error of SetPoolingNdDescriptorHelper

…t updates (microsoft#12330) * Update to handle multiline declarations for the kernels which are typical these days. * Update to new path for the cpu contrib_op kernel registrations. * Update tools/python/find_optimizer_opset_version_updates_required.py Co-authored-by: Justin Chu <[email protected]>

) * Split GemmBase RocBlasGemm * Add composable kernel GEMM baseline * Make linter happy * Address review comment * Update bert cases with batchsize * Adjust includes to fix IWYU lint * Only builds and links used ck kernels to improve building time * Remove warmup run on SelectImpl * Add comment to utility function * Mute cpplint * Make RocBlasGemm<T>::SelectImpl semantically correct * Add reduced basic test cases for ck gemm * More robust gemm testing * Fix warnings * Fix grammar

…auto keyword (microsoft#12483) * Work around a false positive error produced by clang. ROCm's hip clang complains "use 'template' keyword to treat 'Foo' as a dependent template name" where Foo is not a dependent template name. Avoiding the auto keyword fixes the error here.

* adding conditional variable again * Adding split test cases in python * Adding python cases for split * Enable s8s8 split * Optimize input * Revert "Remove git and python packages from the docker images used by Zip-Nuget-Java-Nodejs Packaging Pipeline (microsoft#11651)" This reverts commit d5e34ac * Revert "Revert "Remove git and python packages from the docker images used by Zip-Nuget-Java-Nodejs Packaging Pipeline (microsoft#11651)"" This reverts commit 3c1a330. * format file * Update c-api-linux-cpu.yml * Update c-api-linux-cpu.yml * Update c-api-linux-cpu.yml * Reformat file * Reformat file * format file * Optimize input * Remove unused import * Remove useless init * Format split.py with black

baijumeswani and others added 30 commits June 22, 2022 10:27

Add support for gradient clipping, AdamWOptimizer and tensorseq as inputs (microsoft#11697)

fac8dae

Dll version fix ovep4.1 (microsoft#11953)

f54476a

* Setting default version values for ovep dlls as well * Update backend_manager.cc Co-authored-by: mayavijx <[email protected]> Co-authored-by: mohsin <[email protected]>

MeanVarianceNormalization CPU EP axes attribute validation (microsoft#11925)

f6d2fe8

Validate axes attribute parameter properly rather than silently returning incorrect results

Optimize t5 encoder in beam search (microsoft#11926)

e24349b

* optimize t5 encoder * update * update * update * refactor expand impl * cuda tests passed * update * alignment * more alignments * review comments

fix win build errors (on device training) (microsoft#11844)

17a8ece

* fix win build errors * fix linux build * fix typo * minor fix * fix win in c api * fix linux build complaining bool * fix ORT_RETURN_ON_ERROR

Fix orttraining-linux-ci-pipeline - Symbolic shape infer (microsoft#11965)

c398ad5

fix symbolic shape error due to upgraded numpy + legacy sympy

Allow saving on CPU usage for infrequent inference requests by reducing thread spinning (microsoft#11841)

607b7df

Introduce a Start/Stop threadpool spinning switch. Add a session config option to force spinning to stop at the end of Run().

MT5 onnx conversion for beam search (microsoft#11958)

2c4e4b6

* support mt5 * save external data to one file * update default value of --model_name_or_path and --decoder_onnx

Eager mode: Argmax and fixup max and min. (microsoft#11861)

fa7f80c

* Eager mode ArgMax support. * Fix basic max and min functionality with minor generator update. Note this does not address all max and min api scope. * Add addmm test.

Redesign InPlaceAccumulator op (microsoft#11842)

c2fd5cc

* op changes * review comments * shape consolidation, test trigger, cleanup * review comments

Restructure function inliner (microsoft#11731)

b1411c8

* Add nested function call tests * Add overload for Specialize * Pass symboltable to onnx shape inference * Avoid renaming empty names * Enable sequence_map tests which failed before this change

Deprecate APIs returning raw ptrs and provide replacements (microsoft#11922)

088bc74

Provide better documentation

Merge branch 'master' into training_dev/on_device_poc

d25cf4d

Fix a couple of typos (microsoft#11943)

f72288b

Fix a couple of typos

Include opset 15 in Conv+BatchNormalization fusion (microsoft#11960)

8cd0250

[js/rn] upgrade dependencies for e2e test (microsoft#11863)

bd973bc

* [js/rn] upgrade dependencies for e2e test * use JDK11 only for gradle * expand variable

Fix WinML tests still targeting deprecated (deleted) experimental signal op definitions (microsoft#12006)

7d712c8

* fix winml tests * remove legacy test * switch idft -> dft+inverse attr * upgrade opset 13->17 for signal ops tests

[C# Tests] Add support for double tensor output in TestPreTrainedModels. (microsoft#12008)

466b2d9

Add support for double tensor output in TestPreTrainedModels.

[NNAPI EP] Update NNAPI headers (microsoft#11954)

f045994

Update the NNAPI headers to a more recent version (copied from TF Lite v2.9.1).

DML EP ResNet50 opset 15 fails in ONNX checker for FusedBatchNormalization lacking training_mode attribute (microsoft#12010)

fc0143f

Include training_mode attribute in FusedBatchNormalization

[DML EP] Pad operator: Handle negative pad counts (microsoft#11974)

4552dd3

* Pad fallback to CPU * Added queryPad in operatorRegistration.cpp * Acknowledged PR comments * Used any_of * used none_of instead of any_of Co-authored-by: Sumit Agarwal <[email protected]>

[js/web][bugfix] fix negative axes for unsqueeze (microsoft#11944)

0702364

[js/web] fix negative axes for unsqueeze

wangyems and others added 29 commits August 3, 2022 10:10

Support vocab_mask/prefix_vocab_mask/no_repeat_number in greedysearch op (microsoft#12327)

b622e5f

* support more inputs for greedy search * fix docs * refactor test * lint * review comments

Update publish-python-apidocs.yml (microsoft#12433)

44ec2cf

Fix integer overflow in LongformerAttention (microsoft#12435)

97a340b

fix integer overflow

Container and memory allocation guidelines (microsoft#12387)

dc984a0

Container and memory allocation guidelines. Re-org and add code samples. Clarify the wording on returning gsl::span.

Perform graph transformations during offline tooling (microsoft#12422)

7f58bd7

[oneDNN EP] Optimized DynamicQuantizeLinear operator (microsoft#12403)

d1497bd

* Removed unnecessary reorders * Removed unnecessary element wise clip

dev notes for layout transformer (microsoft#12396)

97268e0

* first draft * plus fixes * plus more links * Plus updates per review * plus more clarifications * plus updates * plus more nit fixes * plus some additions

Refactor InferenceSession Load member functions. (microsoft#12430)

3efd9a7

Fix comparison of path characters when checking for ".ort" suffix. Some clean up of InferenceSession Load functions. - Reduce duplication between std::string/std::wstring versions. - Renaming for clarity.

Minor doc fixes (microsoft#12388)

52d4699

Enable quant op to share quantization parameter between input and output (microsoft#12408)

ac10f33

* share quant param between tensors

Remove dynamic allocation for ThreadPool ParallelSection (microsoft#12429)

a4ef0e7

Use InlinedVector in a TP. Store per thread parallel section in std::optional and avoid memory allocation.

Lironkesem/unsqueeze_and_squeeze (microsoft#12421)

d452462

[CUDA] BiasSoftmax Supporting New Pattern (microsoft#12361)

37995a7

Rework parts of Graph::Resolve to reduce memory usage (microsoft#12176)

8d830ad

* Rework some aspects of Graph::Resolve to reduce memory usage.

Fix Python Packaging CI (Rocm) (microsoft#12477)

b879dca

Fix Python Packaging CI Co-authored-by: Ethan Tao <[email protected]@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>

[DELETE] delete python package rocm4.3.1 (microsoft#12480)

3e1b0ac

[delete] delete rocm4.3.1

CUDA kernel for ClipGradNorm for TensorSeq gradients (microsoft#12412)

a7d6290

Update ORTModule Default Opset Version to 15 (microsoft#12419)

e85e31e

* update ortmodule opset to 15 * update torch version * fix ut * fix ut * rollback * rollback for orttrainer

DML EP fix training build error (microsoft#12461)

eb90b52

Fix onnxruntime_training.cmake missing linkage issue

set zero point to 0 if all values are 0.0 (microsoft#12470)

bdd6b00

* set zero point to 0 if all values are 0.0 * fix bug: lower versions of numpy.finfo don't have smallest_subnormal * check scale to make sure it is not subnormal
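
The idea in this commit can be illustrated with a small sketch of asymmetric quantization parameters. This is not the onnxruntime quantization tool's code: the function name `compute_scale_zero_point`, the fallback scale of 1.0, and the use of Python doubles (the commit itself refers to `numpy.finfo`) are all assumptions made for illustration.

```python
import sys


def compute_scale_zero_point(rmin, rmax, qmin=0, qmax=255):
    """Sketch of asymmetric quantization parameters for a value range.

    Hypothetical helper. When the range collapses (for example all values
    are 0.0) the computed scale is zero or subnormal, so dividing by it is
    unsafe; in that case fall back to scale 1.0 and zero point 0.
    """
    # The representable range must contain 0.0 so that zero maps exactly.
    rmin = min(rmin, 0.0)
    rmax = max(rmax, 0.0)

    scale = (rmax - rmin) / (qmax - qmin)

    # sys.float_info.min is the smallest positive *normal* double, so any
    # scale below it is zero or subnormal (assumed threshold for this sketch).
    if scale < sys.float_info.min:
        return 1.0, 0

    zero_point = round(qmin - rmin / scale)
    # Clamp into the quantized range.
    zero_point = max(qmin, min(qmax, zero_point))
    return scale, zero_point
```

Comparing against the smallest normal value avoids relying on a `smallest_subnormal` attribute, which the commit notes is missing from `numpy.finfo` in older NumPy versions.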

Fix various warnings in kernel explorer (microsoft#12501)

9c05577

Fix various warnings

[Java] JNI refactor for ONNX Tensor (microsoft#12281)

8a86b34

Working on JNI refactor for OnnxTensor. Simplifying the error handling logic in createTensor. Collapsing casting branches and migrating to ONNX element type enum. Disable cpplint for JNI C files.

remove the link in the comments (microsoft#12510)

730240d

Add workflow for generating Java docs

0b0eedf

natke closed this Aug 8, 2022

natke wants to merge 1021 commits into master from automate-javadocs
