Skip to content

Remove pmi from the fabric list.#7226

Merged
alalazo merged 12 commits intospack:developfrom
SJTU-HPC:openmpi
Mar 26, 2018
Merged

Remove pmi from the fabric list.#7226
alalazo merged 12 commits intospack:developfrom
SJTU-HPC:openmpi

Conversation

@weijianwen
Copy link
Copy Markdown
Contributor

This commit intents to:

  1. Remove pmi from the fabrics list since Process Manager Interface is not a fabric.
  2. Add configuration flags --with-slurm --with-pmi for OpenMPI (>1.5.4) when schedulers=slurm is specified.

@davydden davydden requested a review from alalazo February 13, 2018 06:42
alalazo
alalazo previously approved these changes Feb 13, 2018
default=None if _verbs_dir() is None else 'verbs',
description='List of fabrics that are enabled',
values=('psm', 'psm2', 'pmi', 'verbs', 'mxm'),
values=('psm', 'psm2', 'verbs', 'mxm'),
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked the docs, and I think I was wrong putting pmi under fabrics. Thanks for fixing this.

@alalazo
Copy link
Copy Markdown
Member

alalazo commented Mar 6, 2018

Can you please fix flake8?

@weijianwen
Copy link
Copy Markdown
Contributor Author

Flake8 fixed.

@alalazo alalazo closed this Mar 8, 2018
@alalazo alalazo reopened this Mar 8, 2018
@alalazo
Copy link
Copy Markdown
Member

alalazo commented Mar 8, 2018

@lee218llnl @healther @adamjstewart This is fine with me to merge, but since openmpi is widely used I'd like at least a second opinion before pushing the button.

@weijianwen
Copy link
Copy Markdown
Contributor Author

@alalazo Could you wait 1 day before I confirming the version checking in self.spec.satisfies('@1.5.4:')? It doesn't seem to check if the version is higher than 1.5.4 as expected because I found no --with-pmi in my openmpi configuration line.

@alalazo alalazo dismissed their stale review March 9, 2018 06:38

Changed variants, I'll review again later

@weijianwen
Copy link
Copy Markdown
Contributor Author

@alalazo This PR intents to move pmi away from the fabrics list and enable PMI with SLURM in proper versions. There is a limitation on with_or_without cluase -- it returns a string instad of an array. A return value like --with-slurm --with-pmi from with_or_without leads to configuration error no package slurm --with-pmi. So I've added a new variant pmi.

I've tested OpenMPI versions 1.10.7 and 2.0.4 compiled against GCC and ICC. All work well with SLURM 's PMI mechanism in which jobs are launched via srun --mpi=pmi2. Pure-MPI and hybrid-MPI-OpenMP samples are tested. Bugs in versions 2.1.x and 3.0.x prevent OpenMPI working with SLURM via PMI open-mpi/ompi#4338 . The pmi variant is turned off by default and users can lanch jobs via mpirun in this case.

I think this PR is reday to merge.

@adamjstewart
Copy link
Copy Markdown
Member

@alalazo Can you review one more time and make sure the latest PMI changes look good to you?

Copy link
Copy Markdown
Member

@alalazo alalazo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a comment on making the runtime error a conflict and a question on pmi.

return '--without-slurm'
elif spec.satisfies('@1.5.4:') and '+pmi' not in spec:
raise SpackError(
"+pmi is required for openmpi(>=1.5.5) to work with SLURM.")
Copy link
Copy Markdown
Member

@alalazo alalazo Mar 21, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can pmi be used without slurm? If so I think making it a variant is fine, otherwise we can just pass --with-pmi here and avoid the variant. In any case we can move this check above and make it a conflict:

conflicts('schedulers=slurm ~pmi', when='@1.5.4:', msg='+pmi is required for openmpi(>=1.5.5) to work with SLURM.')

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems PMI hasn't been adopted in schedulers other than SLURM. So I replaced pmi checking with the conflict in my latest commit.

@alalazo alalazo merged commit 658896a into spack:develop Mar 26, 2018
baberlevi added a commit to ResearchIT/spack that referenced this pull request Mar 26, 2018
@weijianwen weijianwen deleted the openmpi branch March 29, 2018 03:15
ch741 pushed a commit to ch741/spack that referenced this pull request Apr 20, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants