Support expand op #4

Merged

Honry merged 1 commit into Honry:stable-diffusion from BruceDai:support_expand on May 15, 2023

Conversation

@BruceDai

@Honry PTAL, thanks.

@Honry force-pushed the stable-diffusion branch 2 times, most recently from 5990333 to 3bbd08f on May 12, 2023 at 17:57
const auto& shape_tensor = *initializers.at(input_defs[1]->Name());
std::vector<int64_t> raw_shape;
// We already checked the raw_shape (new_shape) in IsOpSupportedImpl.
GetExpandShape(shape_tensor, raw_shape, logger);

GetExpandShape return value not checked.
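If the helper can fail, the fix might look like the following sketch. The stub signature and names here are assumptions for illustration, not the actual ORT helper (the logger parameter is elided):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Illustrative stub of the assumed helper: it parses the shape initializer
// and reports failure via its return value.
bool GetExpandShape(const std::vector<int64_t>& shape_tensor,
                    std::vector<int64_t>& raw_shape) {
  if (shape_tensor.empty()) return false;  // e.g. a malformed shape input
  raw_shape = shape_tensor;
  return true;
}

// The suggested pattern: check the result and bail out instead of silently
// continuing with an unpopulated raw_shape.
bool AddExpandOp(const std::vector<int64_t>& shape_tensor) {
  std::vector<int64_t> raw_shape;
  if (!GetExpandShape(shape_tensor, raw_shape)) {
    return false;
  }
  // ... build the WebNN expand using raw_shape ...
  return true;
}
```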


Can we enlarge raw_shape to input.rank, right-aligned with leading ones, before passing it to WebNN? My implementation of expand requires that the input rank and the new shape length match (like slice).

@BruceDai (author) commented May 15, 2023:

> Can we enlarge raw_shape to input.rank, right aligned with leading ones before passing to WebNN?

In the above case, the size of the new shape should be less than or equal to the size of the input shape. So the size of the new shape being greater than the size of the input shape isn't allowed currently, right?

@fdwr commented May 15, 2023:

> So the size of new shape being greater than the size of input shape isn't allowed currently, right?

Yes, the old input shape and new output shape must have the same dimension count (or rank) in my WebNN implementation, where raw_shape.size() == input.shape().size() == output.shape().size(). Did you encounter a case in the model where the input shape and the new output shape had differing rank? (The cases I've seen match: if the input is 3D, then the new shape has 3 elements.)

@BruceDai (author):

Is there a case where the size of the new shape is greater than that of the input shape?
For example, 2D to 3D:

input_shape = [a, b];
new_shape = [1, a, b];

@fdwr:

> Is there a case where the size of the new shape is greater than that of the input shape?

Although I haven't seen it in real models, ONNX allows it (https://github.com/onnx/onnx/blob/main/docs/Operators.md#expand in the dim_changed example), and so probably there exists at least one model out there which does. I would prefer to see the WebNN implementation be simple in its rules (like slice which requires starts and sizes to match rank with input) and for frameworks to resolve these higher layer policies before reaching the lower level WebNN. Is it easy to call reshape on the input with leading 1's if the newShape has more dimensions?
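For reference, the dim_changed case follows numpy-style broadcasting: the two shapes are right-aligned, missing leading dims count as 1, and each dimension pair must be equal or contain a 1. A minimal sketch of the output-shape rule (the function name is illustrative, not an ORT or WebNN API):

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Compute the ONNX Expand output shape: right-align input and new_shape,
// treat missing leading dims as 1, and take the larger of each pair.
std::vector<int64_t> ExpandOutputShape(const std::vector<int64_t>& input,
                                       const std::vector<int64_t>& new_shape) {
  const size_t rank = std::max(input.size(), new_shape.size());
  std::vector<int64_t> out(rank);
  for (size_t i = 0; i < rank; ++i) {
    const int64_t a =
        (i < rank - input.size()) ? 1 : input[i - (rank - input.size())];
    const int64_t b =
        (i < rank - new_shape.size()) ? 1 : new_shape[i - (rank - new_shape.size())];
    assert(a == b || a == 1 || b == 1);  // must be broadcast-compatible
    out[i] = std::max(a, b);
  }
  return out;
}
```

E.g. in the dim_changed example from the ONNX docs, input shape [3, 1] with new shape [2, 1, 6] yields output shape [2, 3, 6].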


> input_shape = [a, b];
> new_shape = [1, a, b];

🤔 We'll likely need to add a Reshape call in this case, because otherwise the ONNX backend tests in the check-in suite will fail on Expand (but we can complete this CR and track that when merging to main too).

LOGS(logger, VERBOSE) << "The shape of expand must be 1D.";
return false;
}
const int64_t* shape_data = reinterpret_cast<const int64_t*>(unpacked_tensor.data());

We should sanity-check that the TensorProto data type is actually ONNXTensorElementDataType::ONNX_TENSOR_ELEMENT_DATA_TYPE_INT64 before attempting to dereference it.

@BruceDai (author):

Would you please guide me on how to do the sanity check? Is there some sample code? Thanks.

@fdwr commented May 15, 2023:

I haven't tried actually compiling this, but I think it would just be:

if (tensor.data_type() != ONNXTensorElementDataType::ONNX_TENSOR_ELEMENT_DATA_TYPE_INT64) {
    LOGS(logger, VERBOSE) << "The shape element data type must be INT64.";
    return false;
}
const int64_t* shape_data = reinterpret_cast<const int64_t*>(unpacked_tensor.data());

@BruceDai (author):

Verified, successfully compiled.

const auto& input_defs = node.InputDefs();
const auto& shape_name = input_defs[1]->Name();
if (!Contains(initializers, shape_name)) {
LOGS(logger, VERBOSE) << "The shape must be a constant initializer";
@fdwr commented May 12, 2023:

Suggested change:

- LOGS(logger, VERBOSE) << "The shape must be a constant initializer";
+ LOGS(logger, VERBOSE) << "The shape must be a constant initializer.";

(nit) period

@fdwr left a comment:

Some comments. Thanks.

@BruceDai (author):

Thanks @fdwr, I've updated the commit to address your comments. Please take another look.

@fdwr left a comment:

Much appreciated, Bruce.

@fdwr commented May 15, 2023:

Btw, I fear force pushing is making it hard to find the previous comments in the code, because when I try to view them later in the Files tab, I just see this...

[screenshot omitted]

...and then I can't compare the small delta to what I've previously reviewed :(. Rather than force pushing, can you push a merge commit and just squash at the end before completing? TY.

@BruceDai (author):

> Btw, I fear force pushing is making it hard to find the previous comments in the code, because when I try to view them later in the Files tab, I just see this...

Thanks. I will follow it later.

@Honry (owner) left a comment:

Thanks @fdwr, @BruceDai !


@Honry merged commit 43aa5b3 into Honry:stable-diffusion on May 15, 2023.

Honry pushed a commit that referenced this pull request on Aug 28, 2023:
### Description
Release OrtEnv before the main function returns. Before this change, OrtEnv was deleted when the C/C++ runtime destroyed all global variables in ONNX Runtime's core framework. The call stack looked like this:
```
  * frame #0: 0x00007fffee39f5a6 libonnxruntime.so.1.16.0`onnxruntime::Environment::~Environment(this=0x00007fffee39fbf2) at environment.h:20:7
    frame #1: 0x00007fffee39f614 libonnxruntime.so.1.16.0`std::default_delete<onnxruntime::Environment>::operator()(this=0x00007ffff4c30e50, __ptr=0x0000000005404b00) const at unique_ptr.h:85:2
    frame #2: 0x00007fffee39edca libonnxruntime.so.1.16.0`std::unique_ptr<onnxruntime::Environment, std::default_delete<onnxruntime::Environment>>::~unique_ptr(this=0x5404b00) at unique_ptr.h:361:17
    frame #3: 0x00007fffee39e2ab libonnxruntime.so.1.16.0`OrtEnv::~OrtEnv(this=0x00007ffff4c30e50) at ort_env.cc:43:1
    frame #4: 0x00007fffee39fa96 libonnxruntime.so.1.16.0`std::default_delete<OrtEnv>::operator()(this=0x00007fffefff8f78, __ptr=0x00007ffff4c30e50) const at unique_ptr.h:85:2
    frame #5: 0x00007fffee39f394 libonnxruntime.so.1.16.0`std::unique_ptr<OrtEnv, std::default_delete<OrtEnv>>::~unique_ptr(this=0x7ffff4c30e50) at unique_ptr.h:361:17
    frame #6: 0x00007ffff78574b5 libc.so.6`__run_exit_handlers + 261
    frame #7: 0x00007ffff7857630 libc.so.6`exit + 32
    frame #8: 0x00007ffff783feb7 libc.so.6`__libc_start_call_main + 135
    frame #9: 0x00007ffff783ff60 libc.so.6`__libc_start_main@@GLIBC_2.34 + 128
    frame #10: 0x0000000000abbdee node`_start + 46
```
After this change, OrtEnv is deleted before the main function returns, while nodejs is still alive.
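The gist of the fix can be illustrated with a toy example (this is not ORT code; Env and the global pointer merely stand in for OrtEnv):

```cpp
#include <cassert>
#include <memory>

// A resource held in a global smart pointer is normally destroyed during
// C/C++ runtime teardown (__run_exit_handlers), after main() has returned.
struct Env {
  bool* alive;
  explicit Env(bool* a) : alive(a) { *alive = true; }
  ~Env() { *alive = false; }
};

std::unique_ptr<Env> g_env;  // analogous to the global OrtEnv

// Resetting the pointer explicitly gives a deterministic destruction point
// while the host process (here, nodejs) is still fully alive.
void ReleaseEnvBeforeMainReturns() {
  g_env.reset();
}
```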