Automate javadocs #27

Closed
natke wants to merge 1021 commits into master from automate-javadocs

Conversation

Owner

@natke natke commented Aug 8, 2022

No description provided.

baijumeswani and others added 30 commits June 22, 2022 10:27
* Setting default version values for ovep dlls as well

* Update backend_manager.cc

Co-authored-by: mayavijx <[email protected]>
Co-authored-by: mohsin <[email protected]>
…#11925)

Validate axes attribute parameter properly rather than silently returning incorrect results
* optimize t5 encoder

* update

* update

* update

* refactor expand impl

* cuda tests passed

* update

* alignment

* more alignments

* review comments
* fix win build errors

* fix linux build

* fix typo

* minor fix

* fix win in c api

* fix linux build complaining bool

* fix ORT_RETURN_ON_ERROR
…1965)

fix symbolic shape error due to upgraded numpy + legacy sympy
…ng thread spinning (microsoft#11841)

Introduce Start/Stop threadpool spinning switch
Add a session config option to force spinning stop at the end of the Run()
* support mt5
* save external data to one file
* update default value of --model_name_or_path and --decoder_onnx
* Eager mode ArgMax support.

* Fix basic max and min functionality with minor generator update. Note this does not address all max and min api scope.

* Add addmm test.
* fix mpi build for gcc8 or higher

* fix memory profile for partial graph run

* Revert "fix mpi build for gcc8 or higher"

This reverts commit fb60beb.

* remove debug code

* fix build

* fix build

* fix cpplint and python black format
* op changes

* review comments

* shape consolidation, test trigger, cleanup

* review comments
* Add nested function call tests

* Add overload for Specialize

* Pass symboltable to onnx shape inference

* Avoid renaming empty names

* Enable sequence_map tests which failed before this change
…icrosoft#11491)

* Using vectorized loads (float2) for fp16 to improve performance

* Fix a few warnings from cpplint

* Fix a few warnings from cpplint

* Use __float2half2_rn and fix some cpplint warnings

* Move some computations to LaunchFastGeluKernel

* Fix some Lint C++ warning

* Using vectorized loads (float4) for fp16 to improve performance

* Switch whether to optimize FastGelu with float4 vectorization

* Switch to float4 memory access based on input_length in FastGelu

* Comment how to set the threshold of float2 and float4 vectorized kernels

* Add FastGelu fp16 unit tests for bias_length = 2 and 8

* Make vectorized kernels generic with aligned_vector

* Unify the vectorized kernels with/without bias

* Refactor the code to suppress cpplint warnings

* Solve formatting issues

* Remove cudaDeviceProp from FastGeluKernel and LaunchFastGeluKernel

* Move fast_gelu_impl.h to rocm/bert

* Fix some Lint C++ warnings and code alignment
* Register signal ops for op set 17

Note code is mostly being moved, not added. These ops were previously
only registered as Microsoft contrib ops and only built if
`BUILD_MS_EXPERIMENTAL_OPS=1`. They've been added to the ai.onnx
standard op set in version 17.

Main components of this change:

* Move the kernels from the contrib_ops directory to the
  core directory.
* Add function bodies for ms experimental ops. This will allow
  old models that use the contrib ops to continue to function.
  All the function bodies consist of a single op (the
  new standard op), so performance overhead should be minimal.

Minor clean-up also in this change:

* De-duplicate get_scalar_value_from_tensor: put it in a new utils.h.
* Fix some bugs that caused compilation errors with the experimental
  ops. Tested with `build.sh --ms_experimental`
* Fix some spelling errors and lint violations.
* Replace a couple of switch statements with `MLTypeCallDispatcher`.
* Use `InlineVector` instead of `std::vector`.

Unblocks microsoft#11640
…ft#11935)

Improve performance of BiasGelu on OneDNN execution provider

This modifies how BiasGelu is handled by the OneDNN execution provider
by executing the gelu_erf primitive as a postop of the binary_add primitive.

Also fixes extra data copies made when running on GPU.

Signed-off-by: George Nash <[email protected]>
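
As a rough illustration of the fusion described in this commit, the sketch below applies erf-based GELU directly to the result of the bias add in a single pass, and compares that with an unfused two-pass version. This is plain Python, not OneDNN code; the function names `bias_gelu_unfused` and `bias_gelu_fused` are hypothetical and exist only for this example.

```python
import math


def gelu_erf(x: float) -> float:
    """Exact (erf-based) GELU: 0.5 * x * (1 + erf(x / sqrt(2)))."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def bias_gelu_unfused(xs, bias):
    # Two passes: materialize the intermediate sum, then apply GELU to it.
    summed = [x + b for x, b in zip(xs, bias)]
    return [gelu_erf(s) for s in summed]


def bias_gelu_fused(xs, bias):
    # One pass: GELU is applied as a "post-op" of the add,
    # so no intermediate buffer is created.
    return [gelu_erf(x + b) for x, b in zip(xs, bias)]
```

Both functions return the same values; the fused form only avoids the intermediate list, which is the same idea as attaching the activation as a post-op of the add.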
* [js/rn] upgrade dependencies for e2e test

* use JDK11 only for gradle

* expand variable
…l signal op definitions (microsoft#12006)

* fix winml tests

* remove legacy test

* switch idft -> dft+inverse attr

* upgrade opset 13->17 for signal ops tests
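
The "idft -> dft+inverse attr" change relies on the inverse transform being the same computation as the forward one with the exponent sign flipped and a 1/n scale. A minimal sketch of that relationship, in plain Python rather than ONNX Runtime code, with the hypothetical function name `dft` and an `inverse` flag standing in for the attribute:

```python
import cmath


def dft(signal, inverse=False):
    """Naive O(n^2) DFT.

    With inverse=True the exponent sign is flipped and the result is
    scaled by 1/n, so dft(dft(x), inverse=True) recovers x.
    """
    n = len(signal)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = sum(
            x * cmath.exp(sign * 2j * cmath.pi * k * t / n)
            for t, x in enumerate(signal)
        )
        out.append(acc / n if inverse else acc)
    return out
```

A round trip such as `dft(dft([1, 2, 3, 4]), inverse=True)` returns the original samples up to floating-point error.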
…ls. (microsoft#12008)

Add support for double tensor output in TestPreTrainedModels.
Update the NNAPI headers to a more recent version (copied from TF Lite v2.9.1).
…ation lacking training_mode attribute (microsoft#12010)

FusedBatchNormalization include training_mode attribute
* create op from ep

* read input count from context

* create holder to host nodes

* fix typo

* cast type before comparison

* throw error on API fail

* silence warning from minimal build

* switch to unique_ptr with deleter to host nodes

* fix typo

* fix build err for minimal

* fix build err for minimal

* add UT for conv

* enable test on CUDA

* add comment

* fix typo

* use gsl::span and string view for Node constructor

* Added two APIs - CopyKernelInfo and ReleaseKernelInfo

* pass gsl::span by value

* switch to span<NodeArg* const> to allow for reference to const containers

* fix typo

* fix reduced build err

* fix reduced build err

* refactoring node construction logic

* rename exceptions

* add input and output count as arguments for op creation

* refactor static member

* use ORT_CATCH instead of catch

* cancel try catch

* add static value name map

* format input definition and set err code

* fix comments

* fix typo
* Pad fallback to CPU

* Added queryPad in operatorRegistration.cpp

* Acknowledged PR comments

* Used any_of

* used none_of instead of any_of

Co-authored-by: Sumit Agarwal <[email protected]>
(1) add --run_shape_inference to make shape inference optional
(2) add --vocab_mask to make the input optional
(3) add --overwrite in gpt2 convert_to_onnx to allow overwriting an existing raw ONNX model exported from PyTorch
(4) save gpt2 model tensors to one external data file by default
(5) group convert_beam_search arguments to multiple groups
(6) make --decoder_onnx optional for gpt2 model
(7) replace print with logger
(8) update shape inference function to support external data.
(9) when saving external data, show warning if onnx version < 1.12
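
Items (1), (2), (3), (5) and (6) above can be pictured with a small `argparse` sketch. The flag names come from the list; everything else (group titles, defaults, help text, whether each flag is a boolean switch, and the function name `build_parser`) is an assumption made for illustration and is not taken from the actual scripts.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Hypothetical conversion CLI sketch")

    # Group titles are hypothetical; grouping only affects --help output.
    model_group = parser.add_argument_group("Model options")
    model_group.add_argument(
        "--decoder_onnx",
        default="",
        help="Optional path to a decoder ONNX model (assumed to be a path)",
    )
    model_group.add_argument(
        "--overwrite",
        action="store_true",
        help="Overwrite an existing exported model instead of reusing it",
    )

    graph_group = parser.add_argument_group("Graph options")
    graph_group.add_argument(
        "--run_shape_inference",
        action="store_true",
        help="Run shape inference (off unless requested)",
    )
    graph_group.add_argument(
        "--vocab_mask",
        action="store_true",
        help="Add the optional vocab_mask input (assumed boolean switch)",
    )
    return parser
```

Calling `build_parser().parse_args(["--overwrite"])` leaves every other option at its default, which is what making these inputs optional amounts to from the caller's side.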
wangyems and others added 29 commits August 3, 2022 10:10
… op (microsoft#12327)

* support more inputs for greedy search

* fix docs

* refactor test

* lint

* review comments
 Container and memory allocation guidelines
  Re-org and add code samples
  Clarify the wording on returning gsl::span
* Removed unnecessary reorders
* Removed unnecessary element wise clip
…microsoft#11968)

* [ROCm] disable expected failure tests PoolTest.MaxPool_10_DilationPadding_?d

* [ROCm] Add AveragePool, GlobalAveragePool, MaxPool, GlobalMaxPool Ops

* (To squash after review) Replace rocm/nn/pool.cc with amd_hipify.py changes

* [ROCM] Replace miCompat with Helper functions

* (to squash) fix the compiling error of SetPoolingNdDescriptorHelper
…t updates (microsoft#12330)

* Update to handle multiline declarations for the kernels which are typical these days.
* Update to new path for the cpu contrib_op kernel registrations.
* Update tools/python/find_optimizer_opset_version_updates_required.py

Co-authored-by: Justin Chu <[email protected]>
* first draft

* plus fixes

* plus more links

* Plus updates per review

* plus more clarifications

* plus updates

* plus more nit fixes

* plus some additions
Fix comparison of path characters when checking for ".ort" suffix.

Some clean up of InferenceSession Load functions.
- Reduce duplication between std::string/std::wstring versions.
- Renaming for clarity.
…2429)

Use InlinedVector in a TP
Store per thread parallel section in std::optional and avoid memory allocation
)

* Split GemmBase RocBlasGemm

* Add composable kernel GEMM baseline

* Make linter happy

* Address review comment

* Update bert cases with batchsize

* Adjust includes to fix IWYU lint

* Only builds and links used ck kernels to improve building time

* Remove warmup run on SelectImpl

* Add comment to utility function

* Mute cpplint

* Make RocBlasGemm<T>::SelectImpl semantically correct

* Add reduced basic test cases for ck gemm

* More robust gemm testing

* Fix warnings

* Fix grammar
* Rework some aspects of Graph::Resolve to reduce memory usage.
Fix Python Packaging CI

Co-authored-by: Ethan Tao <[email protected]@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
* update ortmodule opset to 15

* update torch version

* fix ut

* fix ut

* rollback

* rollback for orttrainer
Fix onnxruntime_training.cmake missing linkage issue
…auto keyword (microsoft#12483)

* Workaround false positive error produced by clang

ROCm's hip clang complains "use 'template' keyword to treat 'Foo' as a dependent template name"
where Foo is not a dependent template name. Avoiding the auto keyword works around the error
here.
* set zero point to 0 if all values are 0.0

* fix bug: lower version of numpy.finfo doesn't have smallest_subnormal

* check scale to make sure it is not subnormal
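
The three quantization fixes above can be read together as a guard around the scale computation. The sketch below is an assumption-laden illustration, not the code from this commit: it uses double precision via `sys.float_info.min` (the smallest positive normal double) instead of a NumPy dtype, the fallback scale of 1.0 is a guess, and the function name `compute_scale_zero_point` is hypothetical.

```python
import sys


def compute_scale_zero_point(rmin: float, rmax: float, qmin: int = 0, qmax: int = 255):
    """Asymmetric quantization parameters with a guard for degenerate ranges."""
    # The real range must contain 0.0 so that zero is exactly representable.
    rmin = min(rmin, 0.0)
    rmax = max(rmax, 0.0)
    scale = (rmax - rmin) / (qmax - qmin)

    # Anything below the smallest positive normal value is zero or subnormal
    # and is not safe to divide by. This covers the all-zeros tensor.
    if scale < sys.float_info.min:
        return 1.0, 0  # assumed fallback: unit scale, zero point 0

    zero_point = round(qmin - rmin / scale)
    return scale, zero_point
```

Using the smallest normal value as the threshold avoids depending on a `smallest_subnormal` attribute, which the commit notes is missing from older versions of `numpy.finfo`.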
* adding conditional variable again

* Adding split test cases in python

* Adding python cases for split

* Enable s8s8 split

* Optimize input

* Revert "Remove git and python packages from the docker images used by Zip-Nuget-Java-Nodejs Packaging Pipeline (microsoft#11651)"

This reverts commit d5e34ac

* Revert "Revert "Remove git and python packages from the docker images used by Zip-Nuget-Java-Nodejs Packaging Pipeline (microsoft#11651)""

This reverts commit 3c1a330.

* format file

* Update c-api-linux-cpu.yml

* Update c-api-linux-cpu.yml

* Update c-api-linux-cpu.yml

* Reformat file

* Reformat file

* format file

* Optimize input

* Remove unused import

* Remove useless init

* Format split.py with black
 Working on JNI refactor for OnnxTensor.
  Simplifying the error handling logic in createTensor.
  Collapsing casting branches and migrating to ONNX element type enum.
  Disable cpplint for JNI C files.
@natke natke closed this Aug 8, 2022