Skip to content
This repository was archived by the owner on Nov 17, 2023. It is now read-only.

Conversation

@ankkhedia
Copy link
Contributor

Description

This PR adds a test framework for examples.

Checklist

Essentials

Please feel free to remove inapplicable items for your PR.

  • To the my best knowledge, examples are either not affected by this change, or have been fixed to be compatible with this change

Changes

Add a placeholder to test examples.

@ankkhedia
Copy link
Contributor Author

@mxnet-label-bot add [pr-work-in-progress, Example]

@marcoabreu marcoabreu added Example pr-work-in-progress PR is still work in progress labels Nov 14, 2018
Copy link
Member

@roywei roywei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for starting this! It will help detecting broken examples.
How about first try out a simple one for 5 epoch before adding more examples.
We need to see if it's correctly triggered in nightly tests

return False
return True

def test_cifar():
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For our first test, maybe start with train_mnis with mlp network for 5 epochs?

"""
errors = []
try:
check_call(command)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check_call -> subprocess.check_call

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code is not being used or run anywhere. Not getting tested either. I would prefer if these changes and PR - #13270 were merged into a single PR.

The above comment is just one instance of an error that might go unnoticed if we have separate PRs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR will go together. Its still a WIP and has not been tested completely. It will be tested before merging

check_call(command)
except Exception as err:
err_msg = str(err)
errors.append(err_msg)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could combine these two statements - errors.append(str(err_msg))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

err_msg = str(err)
errors.append(err_msg)
finally:
if len(errors) > 0:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a more pythonic suggestion if errors:

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done!

errors.append(err_msg)
finally:
if len(errors) > 0:
logging.error('\n'.join(errors))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this happening in the finally block. Why not just log it in the except block and return false. And remove the finally block.

Copy link
Contributor Author

@ankkhedia ankkhedia Nov 16, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed.

errors = []
try:
check_call(command)
except Exception as err:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what sort of exception are we expecting here? I think a CalledProcess(Exception/Error).

Better to know what exception we are catching than using a generic Exception to catch all errors.

Copy link
Contributor Author

@ankkhedia ankkhedia Nov 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If any exception gets caught it means a failure of example run so I think it makes sense to catch all the exceptions


#pylint: disable=no-member, too-many-locals, too-many-branches, no-self-use, broad-except, lost-exception, too-many-nested-blocks, too-few-public-methods, invalid-name
"""
This file tests and ensures that the examples run correctly.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it does not ensure the correctness of the example. it only performs a sanity check of the example. can you please change this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The discussion on the dev-list suggest that we start small just to check if examples are running fine. The individual examples test will have to take care of what correctness atrribute they want to check and should be taken care by the test owners. This PR just provide a template for putting tests for examples. The test owners are free to have their own helper functions and tests which can be generalised later. As we see the tests because some examples do inference, some do training and everything has a different form of output , it cannot be generalised at this point.

Please feel free to add additional helper functions over here if you feel it can be used across multiple examples.

Copy link
Contributor Author

@ankkhedia ankkhedia Nov 15, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will change the wordings here that the examples run without errors if that makes more sense

@ankkhedia ankkhedia changed the title [WIP] [Example] adding tests for examples [Test] [Example] adding tests for examples Nov 16, 2018
@ankkhedia
Copy link
Contributor Author

@marcoabreu @kalyc @roywei @nswamy
Could you please take a look . This PR just defines basic structure and common functionalities can be abstracted out for reuse as we add more tests.

@marcoabreu Also we would require help for test in CI setup.

Copy link
Contributor

@kalyc kalyc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Has this file been tested locally?


Returns
-------
True if there are no warnings or errors.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[minor] unindent

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

if not os.path.isdir(working_dir):
os.makedirs(working_dir)
os.chdir(working_dir)
assert _run_command(example_name , ['python',os.path.join(example_dir,'train_cifar10.py')])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add condition that is being tested here -
assert True, _run_command(example_name , ['python',os.path.join(example_dir,'train_cifar10.py')])

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you elaborate? I'm having trouble understanding what exactly you mean

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marcoabreu Could you please elaborate if this is for the code or comment?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

@marcoabreu marcoabreu Nov 16, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah I see :) in python you generally omit booleans during comparisons.
But what's important here is that actually the return value is being evaluated. If the test succeeded, it's true, otherwise it's false. Your example is for an exception case, which has been handled a bit differently in this approach

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am going with Marco's suggestion on this :)

@marcoabreu
Copy link
Contributor

@jlcontreras please assist with the CI setup


def test_cifar_default():
example_dir = os.path.join(os.path.dirname(__file__), '..', '..', 'examples','image_classification')
temp_dir = 'tmpdir'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@marcoabreu The teamp file gets created in some other folder structure whereas I want a temp file within same folder as there are some data and other artifacts required to run the example in this folder.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are the paths in the examples hardcoded or what's the issue here? Could you point me to some links?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

@roywei roywei left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can test this script locally using `nosetests test_examples.py'

For me adding the following test works, I don't need to create or delete tmp directories as these two tests don't generate checkpoint files.


def test_mnist_mlp():
    assert _run_command('mnist_mlp' , ['python', 'example/image-classification/train_mnist.py', '--num-epochs', '5'])

def test_mnist_resnet():
    assert _run_command('mnist_resnet' , ['python', 'example/image-classification/train_mnist.py', '--network','resnet','--num-layers','110', '--gpus', '0', --num-epochs', '1'])

example_dir = os.path.join(os.path.dirname(__file__), '..', '..', 'examples','image_classification')
temp_dir = 'tmpdir'
example_name = 'test_cifar10'
working_dir = os.path.join(*([temp_dir] + example_name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

missing ) here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolved.

@ankkhedia
Copy link
Contributor Author

@jlcontreras @marcoabreu Could someone help us with the CI runs ?

shutil.rmtree(temp_dir, ignore_errors=True)
if not os.path.isdir(working_dir):
os.makedirs(working_dir)
os.chdir(working_dir)
Copy link
Contributor

@piyushghai piyushghai Nov 20, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this chdir be outside the if check ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing out

@ankkhedia
Copy link
Contributor Author

ankkhedia commented Nov 21, 2018

@kalyc The file has been tested locally

@ankkhedia
Copy link
Contributor Author

@mxnet-label-bot add [pr-awaiting-review]
@mxnet-label-bot remove [pr-work-in-progress]

@kalyc
Copy link
Contributor

kalyc commented Nov 21, 2018

@mxnet-label-bot update [pr-awaiting-review]

@marcoabreu marcoabreu added pr-awaiting-review PR is waiting for code review and removed Example pr-work-in-progress PR is still work in progress labels Nov 21, 2018
@vandanavk
Copy link
Contributor

LGTM.

@mxnet-label-bot update [pr-awaiting-testing]

@Chancebair for review/help with the CI

@marcoabreu marcoabreu added pr-awaiting-testing PR is reviewed and waiting CI build and test and removed pr-awaiting-review PR is waiting for code review labels Nov 27, 2018
@vandanavk
Copy link
Contributor

@marcoabreu @Chancebair @jlcontreras could you help with the CI for this PR?

@roywei
Copy link
Member

roywei commented Dec 11, 2018

Can we fix the CI failure?

@kalyc
Copy link
Contributor

kalyc commented Dec 18, 2018

As discussed offline we should merge the PR into the jenkins setup for testing examples to be able to test here - http://jenkins.mxnet-ci-dev.amazon-ml.com/job/test-kalyc-NightlyTestsForBinaries/job/jenkins_setup_testing_examples/

@ankkhedia please close the PR I will pull your changes here #13270

@kalyc
Copy link
Contributor

kalyc commented Dec 18, 2018

Updated PR #13270 with your changes @ankkhedia
Please feel free to close this one.

@ankkhedia
Copy link
Contributor Author

@kalyc Thanks for taking it forward.

@ankkhedia ankkhedia closed this Dec 19, 2018
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

pr-awaiting-testing PR is reviewed and waiting CI build and test

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants