Skip to content

Comments

expose graph node name returning non-zero status code#714

Merged
hariharans29 merged 23 commits intomasterfrom
errorMessageNodeName
Apr 5, 2019
Merged

expose graph node name returning non-zero status code#714
hariharans29 merged 23 commits intomasterfrom
errorMessageNodeName

Conversation

@hariharans29
Copy link
Member

Helps debugging if the graph node name that returns non-zero status code is available.

In case of exceptions from within node kernels (by using ORT_THROW), the stack trace should contain atleast the kernel name it failed at (if not the graph node name), so avoiding the (potentially) costly try...catch for now

@hariharans29 hariharans29 requested a review from a team as a code owner March 27, 2019 03:05
if (!compute_status.IsOK()) {
LOGS(logger, ERROR) << "Non-zero status code returned while running Node: ",
p_op_kernel->Node().Name(), " Status Message: ", compute_status.ErrorMessage();
return ORT_MAKE_STATUS(ONNXRUNTIME, FAIL, "Non-zero status code returned while running Node: ",
Copy link
Contributor

@skottmckay skottmckay Mar 27, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if anything external depends on the failure category or code. You may want to maintain those instead of always overwriting with ONNXRUNTIME and FAIL. #Closed

Copy link
Member Author

@hariharans29 hariharans29 Mar 27, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense, thanks, fixed. #Closed

skottmckay
skottmckay previously approved these changes Mar 27, 2019
snnn
snnn previously approved these changes Mar 27, 2019
@snnn
Copy link
Contributor

snnn commented Mar 27, 2019

what if the name is empty?

@hariharans29 hariharans29 dismissed stale reviews from snnn and skottmckay via cfeabfe March 27, 2019 19:49
@hariharans29
Copy link
Member Author

what if the name is empty?

@snnn - I actually assumed handling was already done while resolving the graph (my bad), I have a light-weight mechanism to generate a node name if empty (based on op type). If op type is empty, it should break elsewhere as kernel resolving cannot happen, although it might be worthwhile to check missing op type at graph resolution time as well perhaps

@pranavsharma
Copy link
Contributor

what if the name is empty?

@snnn - I actually assumed handling was already done while resolving the graph (my bad), I have a light-weight mechanism to generate a node name if empty (based on op type). If op type is empty, it should break elsewhere as kernel resolving cannot happen, although it might be worthwhile to check missing op type at graph resolution time as well perhaps

The scope of this PR is to print the node name when available.


std::string node_name = node_proto.name();
if (node_name.empty())
node_name = GenerateNodeName("unnamed_" + op_type + "_" +
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this? This name won't be visible on Netron any way when debugging.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe not, but atleast it helps if we can visualize the serialized proto (not sure if this is possible) and the information here would atleast tell us that a specific op type (offset by the specific count) malfunctioned (as opposed to having no information at all)

Copy link
Contributor

@pranavsharma pranavsharma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM cc @linkerzhang

else
current_op_type_count = ++iter->second;

std::string node_name = node_proto.name();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

std::string&


return AddNode(node_proto.name(),
size_t current_op_type_count = 1;
const auto op_type = node_proto.op_type();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

const auto&

Node& AddNode(const ONNX_NAMESPACE::NodeProto& node_proto,
const ArgNameToTypeMap& name_to_type);
const ArgNameToTypeMap& name_to_type,
TypeToCountMap& type_to_count_map);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TypeToCountMap& type_to_count_map [](start = 16, length = 33)

Not a huge fan of the API taking this in order to create a hopefully unique name for the node as it's a bit obtuse. Possibly not expected behaviour for the name to be changed during AddNode either.

Why do we need this new approach?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand your concern regarding this. But don't you think it adds a little bit of value in terms of node debuggability in case of missing node names ? -

  1. It gives information regarding the malfunctioning node (in terms of op_type + chronological order in the serialized proto).

  2. It is almost always guaranteed to be unique as the name is a function of op_type + order it is seen.

  3. As far as API change goes, correct me if I am wrong - I think the graph building methods are not currently being exposed to end-users anyway

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reverted the this aspect of the change

@hariharans29 hariharans29 reopened this Apr 4, 2019

const auto& compute_status = p_op_kernel->Compute(&op_kernel_context);
if (!compute_status.IsOK()) {
std::string msg_string =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's better to use ostringstream when you're constructing a string from multiple strings. the + operators will make a copy. even if it doesn't matter so much in this case since we're doing it in the error scenario, it's a good habit.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@hariharans29 hariharans29 merged commit ffd9071 into master Apr 5, 2019
@hariharans29 hariharans29 deleted the errorMessageNodeName branch April 5, 2019 19:51
jnagi-intel pushed a commit to jnagi-intel/onnxruntime that referenced this pull request Jan 5, 2026
…icrosoft#714)

* Added support for 2025.2 and SimplifiedLayerNormalization op

* [OVEP] Update OV version to 2025.2.0

* Revert "[OVEP] Update OV version to 2025.2.0"

This reverts commit d129250.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants