Convert graph initializers into OrtValue Phase I #23979
Conversation
```cpp
// can not trace string tensor
ORT_ENFORCE(entry->second->data_type() != ONNX_NAMESPACE::TensorProto_DataType_STRING, "Can not trace string tensor");
ORT_RETURN_IF_ERROR(planner.Trace(entry->first, entry->second));
const auto [_, tensor_proto] = *entry;
```
Check notice (Code scanning / CodeQL): Unused local variable

Copilot Autofix (AI, 10 months ago):
To fix the problem, remove the unused variable `_`. Since only the `tensor_proto` half of the entry is actually used, the simplest fix that preserves existing functionality is to access `entry->second` directly instead of destructuring the whole pair.
```diff
@@ -269,3 +269,3 @@
   "OrtValue index: ", ort_value_index, " from initializer_allocation_order not found among initialized tensors");
-  const auto [_, tensor_proto] = *entry;
+  const auto& tensor_proto = entry->second;
```
You can commit the suggested changes from lintrunner.
Cc: @ranjitshs please take a look and perhaps build as well. Thanks!

Yes, I will try to build and let you know in 1-2 days.
Review thread on `onnxruntime/core/providers/shared_library/provider_wrappedtypes.h` (outdated, resolved).
Hi @yuslepukhin

Which commit did you try?

@yuslepukhin

Below are some stack traces for the failures. The crash is in `onnxruntime::Graph::AddInitializedTensor`. Let me know if you find any useful info in them; I will try to understand the changes in this PR. :)

@yuslepukhin BTW, any idea when the next release is planned?
(#25159)

### Description
Updates the `OrtGraph` implementation to take advantage of the work done in PR #23979, which sets up the infrastructure to store initializers as `OrtValue` instances in `onnxruntime::Graph`. A second part to the [aforementioned PR](#23979) is still needed to ensure that all initializers are stored as `OrtValue`s in the Graph.
### Description
Make protobuf weights refer to OrtValues on load. Create OrtValues for initializers loaded from ORT format as well, for uniformity. Adjust exporting in `Graph::ToGraphProto()` so it does not export in-memory references in external data. Make CoreML process external data, including in-memory references, so it can copy it.

### Motivation and Context
Follow-up to #23979.
### Description
Related to #25320 and #23979. Enable tensor raw-data sharing for externalized tensor protos with `kTensorProtoMemoryAddressTag`.

### Motivation and Context
With #25320 and #23979, all initialized tensor protos are associated with an OrtValue; the VitisAI EP needs to adapt to this change.

Co-authored-by: mingyue <[email protected]>
…lues early (#26345)

### Description
Converts weights early and reverts "Properly remove in-memory references (#25652)". This reverts commit 3ca49d8 and makes the appropriate adjustments for the current state of the code.

This PR builds on #26263 and #25833. Previous history: #23979, #25320, #25626, #25652.

The first change (#26263) allows us to convert initializers to OrtValues early and save a lot of memory at model loading time. For the Phi-4-mini-instruct-INT4 model, before and after look like this:

**Before**
<img width="1204" height="124" alt="Before change DEBUG 2025-10-16 144819" src="https://github.com/user-attachments/assets/674ff75b-057f-498a-a906-0140d59d46e6" />

**After**
<img width="997" height="114" alt="After change DEBUG 2025-10-16 144819" src="https://github.com/user-attachments/assets/df1783af-7f50-4cd2-b3ad-6868f23be53f" />

The two peaks represent memory usage at optimization time (8.1 GB before) and after weights memory mapping (6.5 GB). After this change, the corresponding numbers are 3.5 GB and 4.7 GB respectively. Most of the savings during the optimization phase come from `ConstantFolding`, where we are able to reuse the resulting OrtValues directly for the new initializers.

This PR concludes a series of PRs converting initializers to OrtValues. Memory consumption before the conversion began was 9.3 GB and 6.7 GB respectively, so we are saving almost 6 GB during optimization and 2 GB in the steady state.

<img width="1175" height="139" alt="image" src="https://github.com/user-attachments/assets/80e7d228-8a8e-4316-8e04-b02c2be30f04" />

The model also loads about 12 seconds faster.

An example of ConstantFolding being one of the top contributors, where we duplicate memory for a higher peak before Resolve takes care of no-longer-used initializers:

<img width="1100" height="558" alt="Snapshot 3 Peak on ConstantFolding Transpose Optimizer" src="https://github.com/user-attachments/assets/95545abd-3f99-46d9-862e-bbf27cbb5b40" /> <img width="1060" height="600" alt="Snapshot 4 Peak AddInitializer from ConstantFolding" src="https://github.com/user-attachments/assets/dd457ec6-23ee-4efd-8c60-625d5faad61e" /> <img width="325" height="160" alt="image" src="https://github.com/user-attachments/assets/37c1194d-f683-49a7-afb1-073dfbb9bbfc" />

### Motivation and Context
Reduce memory usage.
### Description
This PR converts TensorProto graph initializers to TensorProto/OrtValue pairs.
Currently, we only split the output for some optimizers into the above pairs.
Eventually, we should be able to convert all initializers to OrtValues on load.
Small weights will continue to be an exception, as they are sometimes required by ONNX inference functions.
Some graph APIs leak to EPs, so we are not able to remove them at present; this constrains our ability to convert everything at once.

### Motivation and Context
Lay the ground for proper layer separation. Eventually eliminate weight copies in the EPs.