Papers by Richard McClatchey

HAL (Le Centre pour la Communication Scientifique Directe), Jun 28, 2012
From n-Tier client/server applications, to more complex academic Grids, or even the most recent and promising industrial Clouds, the last decade has witnessed significant developments in distributed computing. In spite of this conceptual heterogeneity, Service-Oriented Architectures (SOA) seem to have emerged as the common underlying abstraction paradigm. Suitable access to data and applications resident in SOAs via so-called 'Science Gateways' has thus become a pressing need in various fields of science, in order to realize the benefits of Grid and Cloud infrastructures. In this context, the authors have consolidated work from three complementary experiences in European projects, which have developed and deployed large-scale, production-quality infrastructures as Science Gateways to support research in breast cancer, paediatric diseases and neurodegenerative pathologies respectively. In analysing the requirements from these biomedical applications, the authors were able to elaborate on commonly faced Grid development issues, while proposing an adaptable and extensible engineering framework for Science Gateways. This paper thus proposes the application of an architecture-centric Model-Driven Engineering (MDE) approach to service-oriented developments, making it possible to define Science Gateways that satisfy quality-of-service requirements, execution platform and distribution criteria at design time. A novel investigation is presented on the applicability of the resulting grid MDE (gMDE) to specific examples, and conclusions are drawn on the benefits of this approach and its possible application to other areas, in particular that of Distributed Computing Infrastructures (DCI) interoperability.
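To make the design-time idea concrete, here is a minimal, hypothetical sketch of an architecture model that carries quality-of-service, platform and distribution criteria and can be validated before any code is generated. All class names, platforms and thresholds are illustrative assumptions, not gMDE's actual metamodel.

```python
# Hypothetical sketch of the gMDE idea: a gateway's architecture model carries
# QoS, platform and distribution criteria that can be checked at design time,
# before deployment. Names and values are illustrative, not gMDE's API.
from dataclasses import dataclass, field

@dataclass
class ServiceModel:
    name: str
    max_latency_ms: int          # quality-of-service requirement
    platform: str                # execution platform, e.g. "gLite" or "cloud"
    replicas: int = 1            # distribution criterion

@dataclass
class GatewayModel:
    services: list = field(default_factory=list)

    def validate(self):
        """Design-time check: each service must satisfy the model constraints."""
        supported = {"gLite", "ARC", "cloud"}          # assumed platform set
        for s in self.services:
            if s.platform not in supported:
                raise ValueError(f"{s.name}: unsupported platform {s.platform!r}")
            if s.max_latency_ms <= 0:
                raise ValueError(f"{s.name}: QoS latency bound must be positive")

gateway = GatewayModel([ServiceModel("imageQuery", 500, "gLite", replicas=3)])
gateway.validate()  # fails fast at design time, before deployment
```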

Business systems today need to be agile to address the needs of a changing world. Business modelling requires business process management to be highly adaptable, with the ability to support dynamic workflows, inter-application integration (potentially between businesses) and process reconfiguration. Designing systems with the in-built ability to cater for evolution is also becoming critical to their success. To handle change, systems need the capability to adapt as and when necessary to changes in users' requirements. Allowing systems to be self-describing is one way to facilitate this. Using our implementation of a self-describing system, a so-called description-driven approach, new versions of data structures or processes can be created alongside older versions, providing a log of changes to the underlying data schema and enabling the gathering of traceable ("provenance") data. The CRISTAL software, which originated at CERN for handling physics data, uses versions of stored descriptions to define versions of data and workflows which can be evolved over time, thereby handling evolving system needs. It has been customised for use in business applications as the Agilium-NG product. This paper reports on how the Agilium-NG software has enabled the deployment of a unique business process management solution that can be dynamically evolved to cater for changing user requirements.
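The following is a minimal sketch (not CRISTAL's or Agilium-NG's actual API) of the description-driven idea: new versions of a description are stored alongside older ones, so items created against version 1 keep resolving while new items use version 2, and the version history itself doubles as a traceable log of schema change.

```python
# Minimal sketch of description-driven versioning, assuming an in-memory store;
# CRISTAL's real implementation persists descriptions differently.
import datetime

class DescriptionStore:
    def __init__(self):
        self._versions = {}   # name -> list of (version, description, timestamp)

    def add_version(self, name, description):
        """Store a new version alongside the old ones instead of replacing them."""
        history = self._versions.setdefault(name, [])
        version = len(history) + 1
        history.append((version, description, datetime.datetime.utcnow()))
        return version

    def get(self, name, version=None):
        """Resolve the latest version, or a pinned older one."""
        history = self._versions[name]
        return history[-1] if version is None else history[version - 1]

store = DescriptionStore()
v1 = store.add_version("OrderWorkflow", {"steps": ["create", "approve"]})
v2 = store.add_version("OrderWorkflow", {"steps": ["create", "review", "approve"]})
# Items created against v1 still resolve; the history is a change ("provenance") log.
print(store.get("OrderWorkflow", v1))
```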

Research traceability using provenance services for biomedical analysis and the neuGRID consortium
IOS Press eBooks, 2010
We outline the approach being developed in the neuGRID project to use provenance management techniques for the purposes of capturing and preserving the provenance data that emerges in the specification and execution of workflows in biomedical analyses. In the neuGRID project a provenance service has been designed and implemented that is intended to capture, store, retrieve and reconstruct the workflow information needed to facilitate users in conducting their analyses. We describe the architecture of the neuGRID provenance service, discuss how the CRISTAL system from CERN is being adapted to address the requirements of the project, and then consider how a generalised approach for provenance management could emerge for more generic application to the (Health)Grid community.
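As an illustration of the four operations the abstract names, here is a small sketch of a provenance service that captures, stores, retrieves and reconstructs the records describing a workflow run. The interface and file-based storage are assumptions for illustration, not the neuGRID service's real design.

```python
# Illustrative sketch of capture/store/retrieve/reconstruct for workflow
# provenance, assuming append-only JSON-lines storage.
import json, time

class ProvenanceService:
    def __init__(self, path="provenance.jsonl"):
        self.path = path

    def capture(self, workflow_id, step, inputs, outputs):
        """Capture one step of a workflow run and store it durably."""
        record = {"workflow": workflow_id, "step": step,
                  "inputs": inputs, "outputs": outputs, "time": time.time()}
        with open(self.path, "a") as f:
            f.write(json.dumps(record) + "\n")

    def reconstruct(self, workflow_id):
        """Retrieve all records for a run, time-ordered, so it can be replayed."""
        with open(self.path) as f:
            records = [json.loads(line) for line in f]
        return sorted((r for r in records if r["workflow"] == workflow_id),
                      key=lambda r: r["time"])

svc = ProvenanceService()
svc.capture("run-7", "normalise", inputs=["scan-01"], outputs=["scan-01-norm"])
svc.capture("run-7", "segment", inputs=["scan-01-norm"], outputs=["mask-01"])
print([r["step"] for r in svc.reconstruct("run-7")])  # ['normalise', 'segment']
```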
Version and State Management in a Distributed Workflow Application
Database and Expert Systems Applications, 1997
Most applications of workflow management in business are based on well-defined, repetitively executed processes. Recently, workflow management principles have been applied to the scientific and engineering communities, where activities may dynamically change as the workflows are executed and may often evolve through multiple versions. These domains present new problems, including tracking the progress of parts on which those activities are…

arXiv (Cornell University), Sep 26, 2006
The Health-e-Child project aims to develop an integrated healthcare platform for European paediatrics. In order to achieve a comprehensive view of children's health, a complex integration of biomedical data, information, and knowledge is necessary. Ontologies will be used to formally define this domain knowledge and will form the basis for the medical knowledge management system. This paper introduces an innovative methodology for the vertical integration of biomedical knowledge. This approach will be largely clinician-centered and will enable the definition of ontology fragments, connections between them (semantic bridges) and enriched ontology fragments (views). The strategy for the specification and capture of fragments, bridges and views is outlined, with preliminary examples demonstrated in the collection of biomedical information from hospital databases, biomedical ontologies, and biomedical public databases.
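A sketch of the three constructs the abstract introduces may help: ontology fragments, semantic bridges between them, and views as bridge-enriched fragments. The representation below is an assumption for illustration; the paper does not prescribe this encoding, and the fragment and concept names are invented.

```python
# Illustrative data structures for fragments, semantic bridges and views.
from dataclasses import dataclass, field

@dataclass
class Fragment:
    name: str                       # e.g. a hospital-database-derived fragment
    concepts: set = field(default_factory=set)

@dataclass
class Bridge:
    source: str                     # concept in one fragment
    target: str                     # corresponding concept in another fragment
    relation: str = "equivalent"    # kind of semantic correspondence

@dataclass
class View:
    base: Fragment
    bridges: list = field(default_factory=list)

    def reachable_concepts(self):
        """Concepts visible through the view: its fragment plus bridged terms."""
        return self.base.concepts | {b.target for b in self.bridges}

genetics = Fragment("Genetics", {"GeneVariant", "Phenotype"})   # second fragment
clinical = View(Fragment("Clinical", {"Diagnosis"}),
                [Bridge("Diagnosis", "Phenotype")])             # bridge into genetics
print(clinical.reachable_concepts())    # {'Diagnosis', 'Phenotype'}
```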

arXiv (Cornell University), Apr 10, 2005
Selecting optimal resources for submitting jobs on a computational Grid, or accessing data from a data grid, is one of the most important tasks of any Grid middleware. Most modern Grid software satisfies this responsibility with best-effort performance. Almost all decisions regarding scheduling and data access are made by the software automatically, giving users little or no control over the process. To address this, a more interactive set of services and middleware is desirable: one that provides users with more information about Grid weather and gives them more control over the decision-making process. This paper presents a set of services that have been developed to provide more interactive resource management capabilities within the Grid Analysis Environment (GAE), being developed collaboratively by Caltech, NUST and several other institutes. These include a steering service, a job monitoring service and an estimator service that have been designed and written using a common Grid-enabled Web Services framework named Clarens. The paper also presents a performance analysis of the developed services, showing that they have indeed resulted in a more interactive and powerful system for user-centric Grid-enabled physics analysis.
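A hypothetical client-side sketch of the kind of interactivity described: query an estimator service for a completion estimate, check the job through a monitoring service, and steer it elsewhere if the outlook is poor. The base URL, service names, methods and response fields below are all invented for illustration; they are not the real Clarens or GAE interfaces.

```python
# Hypothetical client for estimator/monitoring/steering services over HTTP.
# Endpoint and API shape are assumptions, not the actual Clarens framework.
import json, urllib.parse, urllib.request

BASE = "http://gae.example.org"     # placeholder endpoint

def call(service, method, **params):
    url = f"{BASE}/{service}/{method}?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)

estimate = call("estimator", "completion_time", job_id="job42")
status = call("monitor", "status", job_id="job42")
if estimate["seconds"] > 3600 and status["state"] == "queued":
    # Steering: the user, not the middleware, decides to move the job.
    call("steering", "migrate", job_id="job42", site="site-B")
```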

arXiv (Cornell University), Mar 19, 2018
In complex data analyses it is increasingly important to capture information about the usage of data sets, in addition to preserving them over time, in order to ensure reproducibility of results, to verify the work of others and to confirm that data have been used under appropriate conditions for specific analyses. Scientific workflow-based studies are beginning to realize the benefit of capturing this provenance of data and of the activities used to process, transform and carry out studies on those data. This is especially true in biomedicine, where the collection of data through experiment is costly and/or difficult to reproduce and where that data needs to be preserved over time. One way to support the development of workflows and their use in (collaborative) biomedical analyses is through the use of a Virtual Research Environment. The dynamic and distributed nature of Grid/Cloud computing, however, makes the capture and processing of provenance information a major research challenge. Furthermore, most workflow provenance management services are designed only for data-flow-oriented workflows, and researchers are now realising that tracking data or workflows alone, or separately, is insufficient to support the scientific process. What is required for collaborative research is traceable and reproducible provenance support in a fully orchestrated Virtual Research Environment (VRE) that enables researchers to define their studies in terms of the datasets and processes used, to monitor and visualize the outcome of their analyses, and to log their results so that other users can call upon that acquired knowledge to support subsequent studies. We have extended the work carried out in the neuGRID and N4U projects in providing a so-called Virtual Laboratory, to provide the foundation for a generic VRE in which sets of biomedical data (images, laboratory test results, patient records, epidemiological analyses etc.) and the workflows (pipelines) used to process those data, together with their provenance data and result sets, are captured in the CRISTAL software. This paper outlines the functionality provided for a VRE by the Open Source CRISTAL software and examines how that can provide the foundations for a practice-based knowledge base for biomedicine and, potentially, for a wider research community.
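To illustrate the study-level record the abstract describes, here is a sketch of a VRE entry tying together the datasets, the processing pipeline and the results, with a fingerprint supporting reproducibility checks. This is an assumption for illustration, not CRISTAL's actual data model; the study and step names are invented.

```python
# Illustrative study record: datasets + pipeline + results in one traceable unit.
from dataclasses import dataclass, field
import hashlib, json

@dataclass
class StudyRecord:
    study_id: str
    datasets: list          # e.g. image or lab-result identifiers
    pipeline: list          # ordered processing steps ("pipeline")
    results: dict = field(default_factory=dict)

    def fingerprint(self):
        """Stable hash of inputs + pipeline: two runs with the same fingerprint
        should be comparable/reproducible against each other."""
        payload = json.dumps({"d": self.datasets, "p": self.pipeline},
                             sort_keys=True).encode()
        return hashlib.sha256(payload).hexdigest()

study = StudyRecord("mri-atrophy-01",
                    datasets=["scan-001", "scan-002"],
                    pipeline=["skull-strip", "segment", "measure-cortex"])
print(study.fingerprint()[:12])   # short identifier for cross-study lookup
```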

arXiv (Cornell University), Jan 28, 2006
This paper studies the differences and similarities between domain ontologies and conceptual data models, and the role that ontologies can play in establishing conceptual data models during the process of information systems development. A mapping algorithm has been proposed and embedded in a special-purpose Transformation Engine to generate a conceptual data model from a given domain ontology. Both quantitative and qualitative methods have been adopted to critically evaluate this new approach. In addition, this paper focuses on evaluating the quality of the generated conceptual data model elements using the Bunge-Wand-Weber and OntoClean ontologies. The results of this evaluation indicate that the generated conceptual data model provides a high degree of accuracy in identifying the substantial domain entities, along with their attributes and relationships, derived from the consensual semantics of domain knowledge. The results are encouraging and support the potential role that this approach can play in the process of information systems development.
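A condensed sketch of the shape of such a mapping follows: ontology classes become entities, datatype properties become attributes, and object properties become relationships. The full rule set is in the paper; the rules and the toy ontology below are simplified assumptions for illustration.

```python
# Simplified ontology-to-conceptual-data-model mapping:
# class -> entity, datatype property -> attribute, object property -> relationship.
ontology = {
    "classes": ["Patient", "Hospital"],
    "datatype_properties": [("Patient", "birthDate"), ("Hospital", "name")],
    "object_properties": [("Patient", "treatedAt", "Hospital")],
}

def to_conceptual_model(onto):
    entities = {c: {"attributes": []} for c in onto["classes"]}
    for cls, attr in onto["datatype_properties"]:
        entities[cls]["attributes"].append(attr)        # property -> attribute
    relationships = [{"from": s, "name": p, "to": o}    # property -> relationship
                     for s, p, o in onto["object_properties"]]
    return {"entities": entities, "relationships": relationships}

print(to_conceptual_model(ontology))
```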

Asset pipeline patterns
Proceedings of the 24th European Conference on Pattern Languages of Programs
Interactive real-time visualizations consist of both the navigation and viewing application, executing in a runtime environment, and the digital content providing the data source for the visualization. As visualizations become richer and more interactive, the breadth and amount of digital content required has increased. The designers of the digital content need to view their designs in the runtime to ensure they appear as expected. This creative process is iterative in nature: create, review and modify. However, a barrier exists within this workflow: the format of the content is substantially different from the expected runtime format. In this paper, four patterns are presented that share solutions to this challenge and improve efficiency in the design and implementation of the visualization workflow.
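Here is a small sketch of the core move behind such asset-pipeline patterns: an offline build step converts the designer's source format into the runtime format, caching on content hash so the create-review-modify loop stays fast. The formats, cache layout and the stand-in conversion are assumptions for illustration, not patterns quoted from the paper.

```python
# Content-hash-keyed asset build step: rebuild only when the source changes.
import hashlib, pathlib

CACHE = pathlib.Path("asset_cache")

def convert(data: bytes) -> bytes:
    # Stand-in for a real conversion (e.g. mesh optimisation, texture packing).
    return data[::-1]

def build_asset(src: pathlib.Path) -> pathlib.Path:
    data = src.read_bytes()
    key = hashlib.sha1(data).hexdigest()       # cache key from content, not mtime
    out = CACHE / f"{key}.runtime"
    if not out.exists():                       # unchanged content: reuse the cache
        CACHE.mkdir(exist_ok=True)
        out.write_bytes(convert(data))         # source format -> runtime format
    return out
```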

A prototype distributed mammographic database for Europe
This paper concludes the series of publications on MammoGrid. A companion paper, 'A comparison of some anthropometric parameters between an Italian and a UK population: proof of principle of a European project using MammoGrid', including Solomonides as a co-author, has also been published in Clinical Radiology Vol. 62.11, pp. 1052-60 (available online since 6th June 2007). Many ideas have been taken forward in large projects (Health-e-Child, neuGRID). There has also been knowledge transfer by a Spanish company in an application covering the entire region of Extremadura in Spain. Solomonides has spoken on knowledge transfer from the project at 'InnovAction' (Friuli-Venezia Giulia, Udine, 2006) and at 'Le frontiere tecnologiche in sanità' (INFN, Lecce, 2006).

Intelligent IoT System Requirements to Support Self-Management for People with Learning Disabilities – A Study with Care Providers
2021 17th International Conference on Intelligent Environments (IE), 2021
Internet of Things (IoT) technology and Smart Home (SH) solutions are a growing area of research and development, particularly within health and social care, where they have the potential to offer information and support for self-management and independent living. However, successful design and deployment of these technologies is predicated on a clear understanding of user aspirations, needs, and requirements. This paper presents findings from seven one-to-one interviews with care staff, who contributed experiences representing a diverse range of job roles and categories of service-users supported. Our participants were from a regional supported-living provider with facilities for people with learning disabilities. The interviews provided insight into care service provision and service-users' needs, as well as helping to identify areas of the current service that can be supported by IoT-based intelligent systems. The feedback that staff provided was based on their knowledge and understanding of the needs of specific service-user groups, and included information regarding the appropriateness and applicability of the technologies, as well as the appeal, acceptability and usability of these solutions.

Neural Computing and Applications, 2020
This paper investigates the utility of unsupervised machine learning and data visualisation for tracking changes in user activity over time. This is done through analysing unlabelled data generated from passive and ambient smart home sensors, such as motion sensors, which are considered less intrusive than video cameras or wearables. The challenge in using unlabelled passive and ambient sensor data for activity recognition is to find practical methods that can provide meaningful information to support timely interventions based on changing user needs, without the overhead of having to label the data over long periods of time. The paper addresses this challenge by discovering patterns in unlabelled sensor data using kernel density estimation (KDE) for pre-processing the data, together with t-distributed stochastic neighbour embedding and uniform manifold approximation and projection for visualising changes. The methodology is developed and tested on the Aruba CASAS smart home dataset…
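The following is a runnable sketch of the pipeline the abstract outlines: KDE turns raw, unlabelled sensor event times into daily activity-density features, then t-SNE (with UMAP as the commented alternative) projects those days to 2D so drift over time becomes visible. The synthetic data stands in for the Aruba CASAS dataset, and the parameter choices are illustrative assumptions, not the paper's settings.

```python
# KDE feature extraction + t-SNE/UMAP projection of daily activity profiles.
import numpy as np
from sklearn.neighbors import KernelDensity
from sklearn.manifold import TSNE
# from umap import UMAP                  # pip install umap-learn

rng = np.random.default_rng(0)
# One row per day: unlabelled motion-sensor event times in hours, with a slow
# drift in daily routine standing in for changing user activity.
days = [rng.normal(loc=8 + d * 0.05, scale=1.5, size=80) % 24
        for d in range(60)]

grid = np.linspace(0, 24, 48)[:, None]    # 48 half-hour evaluation points
features = []
for events in days:
    kde = KernelDensity(bandwidth=0.75).fit(events[:, None])
    features.append(np.exp(kde.score_samples(grid)))   # daily density profile
X = np.array(features)

embedding = TSNE(n_components=2, perplexity=10, random_state=0).fit_transform(X)
# embedding = UMAP(n_components=2).fit_transform(X)    # alternative projection
print(embedding.shape)   # (60, 2) -- plot, coloured by day index, to see drift
```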