Description
Currently we have the following concepts in our CI scripts/test structure:
- The idea of a job prefix, used when trying to consolidate test time stats between build and test jobs. Relevant code, from pytorch/tools/testing/test_selections.py (lines 25 to 35 in f2c47cf):

  ```python
  def _get_stripped_CI_job() -> str:
      """E.g. convert 'pytorch_windows_vs2019_py36_cuda10.1_build' to 'pytorch_windows_vs2019_py36_cuda10.1'.
      """
      job = os.environ.get("JOB_BASE_NAME", "").rstrip('0123456789')
      if job.endswith('_slow_test'):
          job = job[:len(job) - len('_slow_test')]
      elif job.endswith('_test') or job.endswith('-test'):
          job = job[:len(job) - len('_test')]
      elif job.endswith('_build') or job.endswith('-build'):
          job = job[:len(job) - len('_build')]
      return job
  ```

- The idea of a job name, which is the most specific one could go with test names, used here, in pytorch/tools/stats/print_test_stats.py (lines 782 to 793 in f2c47cf):

  ```python
  def send_report_to_s3(head_report: Version2Report) -> None:
      job = os.getenv('JOB_BASE_NAME', os.environ.get('CIRCLE_JOB'))
      # SHARD_NUMBER is specific to GHA jobs, as the shard number would be included in CIRCLE_JOB already
      shard = os.environ.get('SHARD_NUMBER', '')
      sha1 = os.environ.get('CIRCLE_SHA1')
      branch = os.environ.get('CIRCLE_BRANCH', '')
      now = datetime.datetime.utcnow().isoformat()
      if branch not in ['master', 'nightly'] and not branch.startswith("release/"):
          pr = os.environ.get('CIRCLE_PR_NUMBER', 'unknown')
          key = f'pr_test_time/{pr}/{sha1}/{job}{shard}/{now}Z.json.bz2'  # Z meaning UTC
      else:
          key = f'test_time/{sha1}/{job}{shard}/{now}Z.json.bz2'  # Z meaning UTC
  ```

- The idea of a build environment, which is used everywhere in our build and test scripts in .jenkins/pytorch.
To address those concepts, we use the following environment variables:
- BUILD_ENVIRONMENT: this is sometimes used for ALL three of the concepts above.
- JOB_BASE_NAME: this attempts to be a job name, but it is not very specific on GHA. It is specific on Circle only because we set JOB_BASE_NAME to CIRCLE_JOB there. This variable is also used to derive a BUILD_ENVIRONMENT-like job prefix, as in the first code example above.
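To make the overloading concrete, here is a minimal sketch of how one JOB_BASE_NAME value feeds both the job prefix and the job-name part of the S3 key. `stripped_ci_job` mirrors the suffix-stripping logic of `_get_stripped_CI_job` but takes the name as an argument so it can be run without touching the environment. `s3_job_component` and the `_test2` job name are hypothetical and exist only for illustration.

```python
def stripped_ci_job(job_base_name: str) -> str:
    """Mirror of _get_stripped_CI_job's logic, parameterized for illustration."""
    # Trailing digits (e.g. a shard number baked into the name) are dropped first.
    job = job_base_name.rstrip("0123456789")
    # Order matters: '_slow_test' must be checked before '_test'.
    for suffix in ("_slow_test", "_test", "-test", "_build", "-build"):
        if job.endswith(suffix):
            return job[: len(job) - len(suffix)]
    return job


def s3_job_component(job_base_name: str, shard: str = "") -> str:
    """Hypothetical helper: the '{job}{shard}' part of the S3 key above."""
    return f"{job_base_name}{shard}"


build_job = "pytorch_windows_vs2019_py36_cuda10.1_build"
test_job = "pytorch_windows_vs2019_py36_cuda10.1_test2"  # hypothetical job name

# Both jobs collapse to the same prefix, which is the intended behavior.
print(stripped_ci_job(build_job))
print(stripped_ci_job(test_job))

# The same variable, unstripped, is what ends up in the S3 key.
print(s3_job_component(test_job))

# Fragility: a name that already looks like a prefix loses its trailing digit,
# because the digit stripping cannot tell a shard number from a version number.
print(stripped_ci_job("pytorch_windows_vs2019_py36_cuda10.1"))
```

The last call shows why deriving one concept from another by string manipulation is brittle: the result depends on naming conventions that the variable itself does not guarantee.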
All this is not fun to deal with. We should clean it up. A good idea is to revamp what we mean by BUILD_ENVIRONMENT and JOB_BASE_NAME.
Example proposal:
- BUILD_ENVIRONMENT: should represent the environment in which pytorch was built! This could also function as a great job prefix.
- JOB_NAME: should represent the most specific description of a job. This would include test config and shard number and all that good stuff.
- We may want other variables to represent other ideas, and I am not opposed to more variables, but we should clarify the existing ones and stick to our definitions of them.
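One way the proposal could look in code is sketched below. This is only an illustration under stated assumptions: the `JobIdentity` class, the `-` separator, the `shardN` format, and the example values are all hypothetical and not part of any existing script. The point is that JOB_NAME is composed from BUILD_ENVIRONMENT plus more specific parts, so the job prefix never has to be recovered by stripping suffixes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class JobIdentity:
    """Hypothetical container for the two proposed concepts."""

    build_environment: str             # environment in which pytorch was built
    test_config: Optional[str] = None  # assumed extra specificity for test jobs
    shard: Optional[int] = None        # assumed shard number, if sharded

    @property
    def job_prefix(self) -> str:
        # The build environment doubles as the job prefix; no string stripping.
        return self.build_environment

    @property
    def job_name(self) -> str:
        # Most specific description: build environment + test config + shard.
        parts = [self.build_environment]
        if self.test_config:
            parts.append(self.test_config)
        if self.shard is not None:
            parts.append(f"shard{self.shard}")
        return "-".join(parts)


# Hypothetical values, for illustration only.
build = JobIdentity("linux-py3-gcc")
test = JobIdentity("linux-py3-gcc", test_config="default", shard=2)

# Build and test jobs share a prefix by construction.
assert build.job_prefix == test.job_prefix

print(test.job_name)
```

With a scheme like this, test time stats could be grouped by `job_prefix` and uploaded under `job_name`, with each variable keeping a single meaning.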
cc @ezyang @seemethere @malfet @walterddr @lg20987 @pytorch/pytorch-dev-infra