A fully configurable HPC web portal for managing Slurm jobs


Patrice Calegari

Slurm User Group SLUG’19


Salt Lake City, USA - September 18, 2019
© Atos
We will talk about…

Context of the projects

XCS - eXtreme factory Computing Studio

BEM - Bull Efficiency Manager

Conclusion and future work

2 | 18-09-2019 | Patrice Calegari | © Atos


HPC & AI R&D Software
Context of the projects
Bull/Atos HPC & AI Software R&D

▶ Our division, Atos BDS (Big Data & Security), is in charge of developing
supercomputing hardware and middleware.

▶ Our domains of interest: HPC, AI and Quantum simulations.

▶ User experience (UX) is extremely important.

▶ Security is critical in all our activities (and those of our clients).

▶ We have contributed to the Slurm community and integrated Slurm into our HPC
stack for more than 10 years.

XCS: eXtreme factory Computing Studio

eXtreme factory Computing Studio v3 (XCS3): Introduction
▶ Modular HPC, AI & Quantum portal
– as-a-Service cornerstone application
– supports Slurm (and other schedulers)
– Role Based Access Control (RBAC)
– supports AD, LDAP (with Kerberos)
– XCS = REST API service + GUI
▶ Fully customizable user interface
– Responsive Web Design (RWD) GUI
– Single Page Application (SPA) with configurable dashboards: layout,
components, languages, themes
Latest release: XCS 3.8.0 (April 5, 2019)
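
Role Based Access Control, listed among the portal features, can be pictured with a minimal sketch. The role names, the permission strings and the `is_allowed` helper are hypothetical illustrations; the slides do not describe the actual XCS RBAC model.

```python
# Hypothetical roles and permissions, invented for illustration only.
ROLE_PERMISSIONS = {
    "user": {"job:submit", "job:view_own"},
    "app_admin": {"job:submit", "job:view_own", "app:edit_form"},
    "admin": {"job:submit", "job:view_own", "job:view_all", "app:edit_form"},
}


def is_allowed(roles, permission):
    """Return True if at least one of the given roles grants the permission."""
    return any(permission in ROLE_PERMISSIONS.get(role, set()) for role in roles)


if __name__ == "__main__":
    print(is_allowed(["user"], "job:submit"))
    print(is_allowed(["user"], "app:edit_form"))
```

Unknown roles grant nothing in this sketch, so access is denied by default.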

XCS REST API
https://public.extremefactory.com/demo/api/doc/api-full.html

XCS REST API
https://public.extremefactory.com/demo/app/api/doc/api-full.html

XCS user dashboard
Example 1: 8 components

XCS user dashboard
Example 2: 1 component

XCS user dashboard
Example 3: 6 components with edited theme

XCS dashboard main menu
import/export dashboards
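
Exporting and re-importing a dashboard can be sketched as a JSON round trip. The `Dashboard` fields mirror the configurable items named in the introduction (layout, components, languages, themes), but the schema, the field names and the component identifiers are assumptions; the real XCS export format is not shown in the slides.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class Dashboard:
    """Hypothetical dashboard description, not the actual XCS schema."""
    name: str
    layout: str = "grid"
    language: str = "en"
    theme: str = "default"
    components: list = field(default_factory=list)


def export_dashboard(dashboard):
    """Serialize a dashboard to a JSON string."""
    return json.dumps(asdict(dashboard), indent=2, sort_keys=True)


def import_dashboard(text):
    """Rebuild a dashboard from a JSON string produced by export_dashboard."""
    return Dashboard(**json.loads(text))


if __name__ == "__main__":
    demo = Dashboard("demo", components=["job_submission", "data_management"])
    print(export_dashboard(demo))
```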

XCS dashboard main menu
REST API documentation

XCS Fundamental concepts
Key software product for HPCaaS solutions

Give users and admins access to resources through web services
• Use of a GUI in a web browser that relies on a REST API

Be compatible with “all possible” environments
• Software, frameworks, middleware

Never be intrusive
• The solution should be used in existing environments without modifying them

Keep all the intelligence in the REST API server
• The goal of the GUI is only to be the HMI (Human Machine Interface)
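
The last principle can be illustrated with a small sketch in which the server side decides everything and the GUI side only displays the answer. Both functions, the job data and the `actions` field are hypothetical stand-ins written for this illustration; they are not XCS endpoints.

```python
import json


def api_get_job(job_id):
    """Stand-in for a REST API handler: all decisions are made here."""
    jobs = {42: {"id": 42, "state": "RUNNING"}}  # made-up data
    job = jobs.get(job_id)
    if job is None:
        return 404, json.dumps({"error": "job not found"})
    # The server, not the GUI, decides which actions are offered.
    actions = ["cancel"] if job["state"] in ("PENDING", "RUNNING") else []
    return 200, json.dumps({**job, "actions": actions})


def render_job(status, body):
    """Stand-in for the GUI: it only formats what the API returned."""
    data = json.loads(body)
    if status != 200:
        return f"Error: {data['error']}"
    buttons = ", ".join(data["actions"]) or "none"
    return f"Job {data['id']} [{data['state']}] actions: {buttons}"


if __name__ == "__main__":
    print(render_job(*api_get_job(42)))
```

Because the rule "a running job can be cancelled" lives only in `api_get_job`, any other client of the same API would get the same behavior without duplicating logic.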

XCS architecture
current v3

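[Architecture diagram] Components shown:
• XCS web user interface (dashboards, web design), served by the XCS GUI web server
• Dashboard components (DCs) for job submission and data management, calling the XCS REST API web server over HTTPS
• XCS REST API web server with its XCS database, a security service and a directory service
• XCS integration layer, reached over SSH, on the HPC cluster (Slurm, HPC applications)

DC = Dashboard Component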
Slurm job submission workflow with XCS

sbatch … Appli.sh $arg1 …
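
As a rough illustration of how web form values could end up in such an `sbatch` command line, here is a minimal sketch. The function name, the form fields and the example values are hypothetical; only the script name `Appli.sh` comes from the slide, and the options used (`--job-name`, `--ntasks`, `--time`) are standard `sbatch` options. The parts elided by "…" on the slide are not reconstructed here.

```python
import shlex


def build_sbatch_command(script, script_args, job_name=None, ntasks=None, time_limit=None):
    """Build an sbatch argument list from (hypothetical) web form values."""
    cmd = ["sbatch"]
    if job_name:
        cmd.append(f"--job-name={job_name}")
    if ntasks:
        cmd.append(f"--ntasks={int(ntasks)}")
    if time_limit:
        cmd.append(f"--time={time_limit}")
    cmd.append(script)
    # Application arguments follow the script name, as on the slide.
    cmd.extend(str(arg) for arg in script_args)
    return cmd


if __name__ == "__main__":
    cmd = build_sbatch_command(
        "Appli.sh", ["input file.dat"], job_name="demo", ntasks=4, time_limit="00:10:00"
    )
    # shlex.join quotes each argument for display or logging.
    print(shlex.join(cmd))
```

Keeping the command as a list of arguments, rather than one concatenated string, means user-supplied form values are never interpreted by a shell when the list is handed to a process launcher.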

XCS application administrator dashboard
HPC application general information

XCS application administrator dashboard
HPC application form definition

BEM: Bull Efficiency Manager

Bull Efficiency Manager (BEM): Introduction

▶ Slurm has been enhanced by Bull/Atos to provide additional functionality,
including topology-aware resource allocation and advanced placement policies

▶ Bull Efficiency Manager (BEM) is the web application running on top of the
Slurm workload manager to show cluster details interactively

▶ BEM dashboards show information in graphs and tables for both current and
archived historical data about cluster resources
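
To give an idea of what "topology-aware" means, the sketch below picks idle nodes so that a job touches as few leaf switches as possible. It is a generic illustration, not the Bull/Atos enhancement and not Slurm's own topology plugin logic; the function name, the switch names and the node names are invented.

```python
def pick_nodes(idle_by_switch, count):
    """Pick `count` idle nodes while touching as few leaf switches as possible.

    `idle_by_switch` maps a switch name to the list of idle nodes behind it.
    Returns a list of node names, or None if there are not enough idle nodes.
    """
    # Best fit: the smallest single switch that can hold the whole job.
    fitting = [(len(nodes), sw) for sw, nodes in idle_by_switch.items() if len(nodes) >= count]
    if fitting:
        _, sw = min(fitting)
        return sorted(idle_by_switch[sw])[:count]
    # Otherwise fill from the switches with the most idle nodes first.
    chosen = []
    for sw in sorted(idle_by_switch, key=lambda s: (-len(idle_by_switch[s]), s)):
        for node in sorted(idle_by_switch[sw]):
            if len(chosen) == count:
                return chosen
            chosen.append(node)
    return chosen if len(chosen) == count else None


if __name__ == "__main__":
    idle = {"sw1": ["n1", "n2"], "sw2": ["n3", "n4", "n5"], "sw3": ["n6"]}
    print(pick_nodes(idle, 2))
    print(pick_nodes(idle, 4))
```

Preferring the smallest switch that fits keeps larger groups of idle nodes free for bigger jobs, which is one simple example of a placement policy.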

XCS architecture
current v3

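[Architecture diagram] Components shown:
• BEM web user interface (dashboards, web design), served by the XCS GUI web server
• BEM dashboard components (DCs): Switch Topology DC and Slurm usage history DC, calling the BEM REST API web server over HTTPS
• BEM REST API web server with its BEM database, a security service and a directory service
• BEM integration layer, reached over SSH, on the HPC cluster (Slurm)

DC = Dashboard Component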
BEM
Login Page

BEM
Current resource usage 1/3

BEM
Current resource usage 2/3

BEM
Current resource usage 3/3

BEM
Historical resource usage

BEM
Topology resource allocation 1/3

BEM
Topology resource allocation 2/3

BEM
Topology resource allocation 3/3

Conclusion & Future Work
Conclusions

▶ XCS has been used successfully in production at many sites for several years,
and it continues to evolve

▶ BEM is still under development, and the first Minimum Viable Product (MVP) is
very promising

▶ Mobile devices are becoming a standard way of doing “everything”, so
such a web portal approach will soon be mandatory for new users
(inexperienced users, young scientists of the new generation, non-technical
managers, etc.)

Ongoing and future work

▶ Unify both interfaces (XCS & BEM) and share a single security service

▶ Add new features to administer Slurm

▶ We are developing a new web portal framework to federate all our HPC, AI & Quantum
tools/microservices. It is an evolution of our current XCS solution with:
– a generic web GUI framework
– a security service (with flexible identity, authentication with SSO, and
authorization management)
– global services (reverse proxy, gateway, discovery service, etc.)

XCS and BEM architecture
Complete solution to be developed in 2020
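[Architecture diagram] Components shown:
• NEW unified web user interface (dashboards, web design), served by the unified GUI web server
• XCS dashboard components (DCs) for job submission and data management, calling the XCS REST API web server over HTTPS
• XCS REST API web server with its XCS database, and the XCS integration layer reached over SSH on the HPC cluster (Slurm, HPC applications)
• BEM dashboard components (DCs): Switch Topology DC and Slurm usage history DC, calling the BEM REST API web server over HTTPS
• BEM REST API web server with its BEM database, and the BEM integration layer reached over SSH (Slurm)
• Security service and directory service

DC = Dashboard Component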
XCS and Slurm native REST service architecture
Possible evolution…
[Architecture diagram] Components shown:
• NEW unified web user interface (dashboards, web design), served by the unified GUI web server
• XCS dashboard components (DCs) for job submission and data management, calling the XCS REST API web server over HTTPS
• XCS REST API web server with its XCS database, a security service and a directory service
• XCS integration layer, reached over SSH, on the HPC cluster (Slurm, HPC applications)
• Slurm DCs (Slurm job specific DC, Slurm admin specific DC), calling the Slurm REST API server (slurm.restd) over HTTPS
• Slurm daemon and Slurm database behind the Slurm REST API server

DC = Dashboard Component
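
A dashboard component talking directly to a Slurm REST service could prepare its request along the following lines. This sketch assumes the `slurmrestd` daemon that ships with later Slurm releases and its token headers `X-SLURM-USER-NAME` and `X-SLURM-USER-TOKEN`. The base URL, the port, the API version string, the user name and the token are placeholders that depend on the installation, and the helper name is hypothetical. The request is only built, not sent.

```python
import urllib.request


def build_jobs_request(base_url, api_version, user, token):
    """Prepare (without sending) a GET request for the job list of a Slurm REST service."""
    # The version segment (for example "v0.0.40") varies with the Slurm release.
    url = f"{base_url}/slurm/{api_version}/jobs"
    headers = {
        "X-SLURM-USER-NAME": user,
        "X-SLURM-USER-TOKEN": token,
        "Accept": "application/json",
    }
    return urllib.request.Request(url, headers=headers, method="GET")


if __name__ == "__main__":
    # All values below are placeholders, not real credentials or hosts.
    req = build_jobs_request("http://localhost:6820", "v0.0.40", "alice", "example-token")
    print(req.get_method(), req.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would perform the call against a running service.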
Thank you
For more information please contact:
Mathis Clayer for Slurm topics
Patrice Calegari for GUI topics

Atos, the Atos logo, Atos Syntel, Unify, and Worldline are registered trademarks of the
Atos group. May 2019. © 2019 Atos. Confidential information owned by Atos, to be used
by the recipient only. This document, or any part of it, may not be reproduced, copied,
circulated and/or distributed nor quoted without prior written approval from Atos.
More on HPC web portals

▶ Web Portals for High-performance Computing: A Survey
– 36-page journal paper published by ACM
– https://dl.acm.org/citation.cfm?id=3197385

▶ Democratization of HPC through the Use of Web Portals: Different Strategies
– Panel at SC’19 in Denver, November 20th, 3:30pm-5pm
– https://sc19.supercomputing.org/presentation/?id=pan102&sess=sess223
