A Grid job monitoring system

Dumitrescu, Catalin; Nowack, Andreas; Padhi, Sanjay; Sarkar, Subir

2010, Journal of Physics: Conference Series

Abstract

This paper presents a web-based job monitoring framework for individual Grid sites that allows users to follow their jobs in detail in quasi-real time. The framework consists of several independent components: (a) a set of sensors that run on the site CE and worker nodes and update a database, (b) a simple yet extensible web services framework, and (c) an Ajax-powered web interface with the look-and-feel and control of a desktop application. The monitoring framework supports LSF, Condor and PBS-like batch systems. This is one of the first monitoring systems in which an X.509-authenticated web interface can be accessed seamlessly by both end users and site administrators. While a site administrator has access to all available information, a user can view only the jobs of the Virtual Organizations (VOs) to which he or she belongs. The framework design supports several deployment scenarios. For a site running a supported batch system, the system may be deployed as a whole, or existing site sensors can be adapted and reused with the web services components. A site may even prefer to build the web server independently and use only the Ajax-powered web interface. Finally, the system is being used to monitor a glideinWMS instance. This broadens its scope significantly, allowing it to monitor jobs across multiple sites.
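The access rule stated in the abstract (administrators see every job, users see only the jobs of their own VOs) can be illustrated with a minimal Python sketch. The `Job` and `Viewer` types, their fields, and the `visible_jobs` function are hypothetical names introduced here for illustration; they are not taken from the framework, and the sketch assumes the user's identity and VO memberships have already been established from the X.509 certificate.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class Job:
    # Hypothetical job record; the field names are assumptions.
    job_id: str
    owner_dn: str
    vo: str
    status: str


@dataclass(frozen=True)
class Viewer:
    # Hypothetical authenticated viewer, assumed to be derived
    # from the X.509 certificate presented to the web interface.
    dn: str
    vos: FrozenSet[str]
    is_admin: bool = False


def visible_jobs(viewer: Viewer, jobs: Iterable[Job]) -> List[Job]:
    """Return the jobs this viewer may see.

    A site administrator sees every job; any other viewer sees
    only jobs that belong to a VO the viewer is a member of.
    """
    if viewer.is_admin:
        return list(jobs)
    return [job for job in jobs if job.vo in viewer.vos]
```

In this sketch the filter is applied on the server side, so a non-administrator never receives records for other VOs, whichever client is used.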

Key takeaways

A number of monitoring tools exist in the Grid world, but none of them comes close to the real-time monitoring that we are used to with local batch systems.
The monitor can be deployed fully at a site that runs a supported batch system; existing site sensors could be adapted to fill the database defined by the monitoring framework and reused with the web services components; sites may build the web server independently and use only the web interface.
An XMLHttpRequest object is at the heart of the Ajax engine: it prepares and sends the request to the server and automatically unpacks the server response, which is then processed by the JavaScript callback functions.
We also hope that the CMS Dashboard will eventually link jobs from its own monitoring page to the site monitoring.
The client side uses the jQuery JavaScript framework extensively for basic effects, event handling, the user interface (UI) and Ajax calls.
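The path from sensor to database to web service described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `jobs` table layout, the function names, the use of SQLite, and the JSON response format are all hypothetical and are not taken from the framework itself.

```python
import json
import sqlite3
from typing import Dict, Iterable


def create_schema(conn: sqlite3.Connection) -> None:
    # Hypothetical table; the real database layout is not given in the source.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        "job_id TEXT PRIMARY KEY, vo TEXT NOT NULL, "
        "status TEXT NOT NULL, updated INTEGER NOT NULL)"
    )


def sensor_update(conn: sqlite3.Connection,
                  records: Iterable[Dict[str, str]], now: int) -> None:
    # A sensor pass: insert new jobs and overwrite the state of known ones.
    conn.executemany(
        "INSERT OR REPLACE INTO jobs (job_id, vo, status, updated) "
        "VALUES (?, ?, ?, ?)",
        [(r["job_id"], r["vo"], r["status"], now) for r in records],
    )
    conn.commit()


def jobs_as_json(conn: sqlite3.Connection, vo: str) -> str:
    # What a web service handler might return to the Ajax client
    # (JSON is an assumption; the source does not name the format).
    rows = conn.execute(
        "SELECT job_id, status, updated FROM jobs "
        "WHERE vo = ? ORDER BY job_id",
        (vo,),
    ).fetchall()
    return json.dumps(
        [{"job_id": j, "status": s, "updated": u} for j, s, u in rows]
    )
```

Keeping the sensor and the web service coupled only through the database is what would let a site swap in its own sensors, or its own web server, as in the deployment scenarios listed above.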
