This repository was archived by the owner on Nov 17, 2023. It is now read-only.

Gluon-based image-classification failed to run with dummy data on CPU machine when set as "symbolic" mode #10765

@juliusshufan

Description

example/gluon/image_classification.py fails on a CPU machine when run in "symbolic" mode,
i.e., with a command like the following:
python image_classification.py --model=alexnet --mode=symbolic --dataset=dummy
When the mode is NOT set to "symbolic", the script runs without errors.
Please note: a build without MKLDNN has no issue. (Build command: make -j$(nproc) USE_OPENCV=1 USE_BLAS=openblas)
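
To compare modes side by side, the reproduction can be scripted. The following is a minimal sketch, not part of the original report: it assumes it is run from the example/gluon directory, and the mode names other than "symbolic" ("imperative", "hybrid") are an assumption about what the script accepts. The helper names `build_command` and `run_all` are hypothetical.

```python
import subprocess
import sys

# "symbolic" comes from the report; the other two mode names are an assumption.
MODES = ("symbolic", "imperative", "hybrid")


def build_command(mode, model="alexnet", dataset="dummy",
                  script="image_classification.py"):
    """Return the argv list for one run of the example script."""
    return [sys.executable, script,
            "--model=" + model, "--mode=" + mode, "--dataset=" + dataset]


def run_all(modes=MODES):
    """Run the script once per mode and return a {mode: exit code} dict."""
    results = {}
    for mode in modes:
        # subprocess.call waits for the child and returns its exit code.
        results[mode] = subprocess.call(build_command(mode))
    return results


if __name__ == "__main__":
    for mode, code in run_all().items():
        status = "ok" if code == 0 else "failed (exit code %d)" % code
        print(mode, status)
```

A non-zero exit code only for "symbolic" would match the behaviour described above.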

Environment info (Required)

CentOS-7.2

What to do:
Run python image_classification.py --model=alexnet --mode=symbolic --dataset=dummy


Package used (Python/R/Scala/Julia):
Python

Build info (Required if built from source)

GCC 4.8.5

MXNet commit hash:
9f8f042

Build config:
Builds both with and without MKLDNN trigger this issue.
make -j$(nproc) USE_MKLDNN=1 USE_OPENCV=1 USE_BLAS=mkl

Error Message:

Traceback (most recent call last):
File "image_classification.py", line 290, in <module>
main()
File "image_classification.py", line 269, in main
initializer = mx.init.Xavier(magnitude=2))
File "/ec/fm/disks/nrv_algo_home01/shufanwu/pythonenv/py2.7_1/lib/python3.4/site-packages/mxnet-1.2.0-py3.4.egg/mxnet/module/base_module.py", line 520, in fit
self.update_metric(eval_metric, data_batch.label)
File "/ec/fm/disks/nrv_algo_home01/shufanwu/pythonenv/py2.7_1/lib/python3.4/site-packages/mxnet-1.2.0-py3.4.egg/mxnet/module/module.py", line 757, in update_metric
self.exec_group.update_metric(eval_metric, labels)
File "/ec/fm/disks/nrv_algo_home01/shufanwu/pythonenv/py2.7_1/lib/python3.4/site-packages/mxnet-1.2.0-py3.4.egg/mxnet/module/executor_group.py", line 616, in update_metric
eval_metric.update_dict(labels, preds)
File "/ec/fm/disks/nrv_algo_home01/shufanwu/pythonenv/py2.7_1/lib/python3.4/site-packages/mxnet-1.2.0-py3.4.egg/mxnet/metric.py", line 132, in update_dict
self.update(label, pred)
File "/ec/fm/disks/nrv_algo_home01/shufanwu/pythonenv/py2.7_1/lib/python3.4/site-packages/mxnet-1.2.0-py3.4.egg/mxnet/metric.py", line 418, in update
pred_label = pred_label.asnumpy().astype('int32')
File "/ec/fm/disks/nrv_algo_home01/shufanwu/pythonenv/py2.7_1/lib/python3.4/site-packages/mxnet-1.2.0-py3.4.egg/mxnet/ndarray/ndarray.py", line 1876, in asnumpy
ctypes.c_size_t(data.size)))
File "/ec/fm/disks/nrv_algo_home01/shufanwu/pythonenv/py2.7_1/lib/python3.4/site-packages/mxnet-1.2.0-py3.4.egg/mxnet/base.py", line 149, in check_call
raise MXNetError(py_str(LIB.MXGetLastError()))
mxnet.base.MXNetError: [07:30:00] src/operator/tensor/./././elemwise_unary_op.h:302: Check failed: inputs[0].dptr_ == outputs[0].dptr_ (0x7ff485a590c0 vs. 0x7ff485b79100)

Stack trace returned 10 entries:
[bt] (0) /nfs/site/home/shufanwu/workspace/mxnet/v2/lib/libmxnet.so(dmlc::StackTrace()+0x3f) [0x7ff591c05edf]
[bt] (1) /nfs/site/home/shufanwu/workspace/mxnet/v2/lib/libmxnet.so(dmlc::LogMessageFatal::~LogMessageFatal()+0x21) [0x7ff591c062b1]
[bt] (2) /nfs/site/home/shufanwu/workspace/mxnet/v2/lib/libmxnet.so(void mxnet::op::UnaryOp::IdentityCompute<mshadow::cpu>(nnvm::NodeAttrs const&, mxnet::OpContext const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&, std::vector<mxnet::OpReqType, std::allocator<mxnet::OpReqType> > const&, std::vector<mxnet::TBlob, std::allocator<mxnet::TBlob> > const&)+0x940) [0x7ff59238b6f0]
[bt] (3) /nfs/site/home/shufanwu/workspace/mxnet/v2/lib/libmxnet.so(mxnet::exec::FComputeExecutor::Run(mxnet::RunContext, bool)+0xe9) [0x7ff5945baaf9]
[bt] (4) /nfs/site/home/shufanwu/workspace/mxnet/v2/lib/libmxnet.so(+0x2f40023) [0x7ff594584023]
[bt] (5) /nfs/site/home/shufanwu/workspace/mxnet/v2/lib/libmxnet.so(mxnet::engine::ThreadedEngine::ExecuteOprBlock(mxnet::RunContext, mxnet::engine::OprBlock*)+0x589) [0x7ff5945142a9]
[bt] (6) /nfs/site/home/shufanwu/workspace/mxnet/v2/lib/libmxnet.so(std::_Function_handler<void (std::shared_ptr<dmlc::ManualEvent>), mxnet::engine::ThreadedEnginePerDevice::PushToExecute(mxnet::engine::OprBlock*, bool)::{lambda()#1}::operator()() const::{lambda(std::shared_ptr<dmlc::ManualEvent>)#1}>::_M_invoke(std::_Any_data const&, std::shared_ptr<dmlc::ManualEvent>)+0x92) [0x7ff594524e62]
[bt] (7) /nfs/site/home/shufanwu/workspace/mxnet/v2/lib/libmxnet.so(std::thread::_Impl<std::_Bind_simple<std::function<void (std::shared_ptr<dmlc::ManualEvent>)> (std::shared_ptr<dmlc::ManualEvent>)> >::_M_run()+0x44) [0x7ff594521e54]
[bt] (8) /lib64/libstdc++.so.6(+0xb52b0) [0x7ff636eb22b0]
[bt] (9) /lib64/libpthread.so.0(+0x7e25) [0x7ff64372ae25]
