avoid skipping validation stage when data read iterator exists #18289
sync from incubator-mxnet
Synchronize from apache mxnet
sync from base repo
Sync from official MXNet master
Hey @heaseny, thanks for submitting the PR
CI supported jobs: [sanity, miscellaneous, unix-cpu, website, edge, centos-gpu, clang, centos-cpu, windows-cpu, windows-gpu, unix-gpu] Note:
@mxnet-bot run ci [unix-cpu]
Jenkins CI successfully triggered : [unix-cpu]
Jenkins CI successfully triggered : [unix-gpu, windows-gpu, centos-cpu, unix-cpu]
@szha Could you help review this pull request and merge it if there are no problems? Thanks a lot.
Apologies for the spam. @heaseny, we deprecated module in the master branch and will continue to support it in v1.x. It would be great if you could rebase your change onto the v1.x branch.
@szha My work has changed, so I am no longer focusing on MXNet testing and have no environment for it. If the change is needed in v1.x, could you help rebase it, or should I close this PR?
Picked up the change for v1.x in the above PR.
Description
Avoid skipping the validation stage when a data read iterator exists.
Test commands:
cd example/image-classification
python train_cifar10.py --network='resnet' --num-layers=50 --image-shape='3,28,28' --num-epochs=1
Abnormal test log in current master branch:
INFO:root:Epoch[0] Train-accuracy=0.386069
INFO:root:Epoch[0] Train-top_k_accuracy_5=0.875060
INFO:root:Epoch[0] Time cost=38.553
Normal test log which is expected:
INFO:root:Epoch[0] Batch [360-380] Speed: 1302.17 samples/sec accuracy=0.497266 top_k_accuracy_5=0.935156
INFO:root:Epoch[0] Train-accuracy=0.386069
INFO:root:Epoch[0] Train-top_k_accuracy_5=0.875060
INFO:root:Epoch[0] Time cost=39.096
INFO:root:Epoch[0] Validation-accuracy=0.448378
INFO:root:Epoch[0] Validation-top_k_accuracy_5=0.918018
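The difference between the two logs above can be checked mechanically. The following is a minimal sketch, not part of this PR: the helper name `epochs_missing_validation` is hypothetical, and it assumes the log lines keep the `Epoch[N] Train-...` / `Epoch[N] Validation-...` shape shown in the logs above. It reports every epoch that logged training metrics but no validation metrics.

```python
import re

# Matches lines such as "INFO:root:Epoch[0] Validation-accuracy=0.448378".
# Per-batch lines ("Epoch[0] Batch [...]") are deliberately not matched.
METRIC_LINE = re.compile(r"Epoch\[(\d+)\] (Train|Validation)-")


def epochs_missing_validation(log_text):
    """Return sorted epochs that logged Train- metrics but no Validation- metrics.

    Hypothetical helper for illustration; assumes the log format shown above.
    """
    trained, validated = set(), set()
    for line in log_text.splitlines():
        match = METRIC_LINE.search(line)
        if match is None:
            continue
        epoch = int(match.group(1))
        if match.group(2) == "Train":
            trained.add(epoch)
        else:
            validated.add(epoch)
    return sorted(trained - validated)


# Log excerpts copied from the PR description.
abnormal_log = """\
INFO:root:Epoch[0] Train-accuracy=0.386069
INFO:root:Epoch[0] Train-top_k_accuracy_5=0.875060
INFO:root:Epoch[0] Time cost=38.553
"""

normal_log = """\
INFO:root:Epoch[0] Batch [360-380] Speed: 1302.17 samples/sec accuracy=0.497266 top_k_accuracy_5=0.935156
INFO:root:Epoch[0] Train-accuracy=0.386069
INFO:root:Epoch[0] Train-top_k_accuracy_5=0.875060
INFO:root:Epoch[0] Time cost=39.096
INFO:root:Epoch[0] Validation-accuracy=0.448378
INFO:root:Epoch[0] Validation-top_k_accuracy_5=0.918018
"""

print(epochs_missing_validation(abnormal_log))  # [0]
print(epochs_missing_validation(normal_log))    # []
```

A check like this could be run against the output of the test command to confirm that the validation stage is no longer skipped.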