Papers by Hamed Khorasgani

arXiv (Cornell University), Sep 27, 2021
Traditionally, the fault detection and isolation community has used system dynamic equations to generate diagnosers and to analyze the detectability and isolability of dynamic systems. Model-based fault detection and isolation methods use a system model to generate a set of residuals as the basis for fault detection and isolation. However, in many complex systems it is not feasible to develop highly accurate models and to keep them updated during the system lifetime. Recently, data-driven solutions have received immense attention in industrial systems for several practical reasons. First, these methods do not require the initial investment and expertise needed to develop accurate models. Moreover, it is possible to automatically update and retrain the diagnosers as the system or the environment changes over time. Finally, unlike the model-based methods, it is straightforward to combine time series measurements such as pressure and voltage with other sources of information such as system operating hours to achieve a higher accuracy. In this paper, we extend traditional model-based fault detection and isolation concepts, such as residuals and detectable and isolable faults, to the data-driven domain. We then propose an algorithm to automatically generate residuals from the normal operating data. We present the performance of our proposed approach through a comparative case study. Khorasgani et al. This is an open-access article distributed under the terms of the Creative Commons Attribution 3.0 United States License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
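As a rough illustration of what a residual learned from normal operating data can look like, the sketch below fits a least-squares line between two measurements recorded during normal operation and treats the prediction error on a new sample as the residual. This is not the algorithm proposed in the paper: the linear model, the 3-sigma threshold, and the voltage and pressure values are all assumptions made for the example.

```python
from statistics import mean, pstdev


def fit_line(x, y):
    """Ordinary least-squares fit of y ~ a * x + b."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, my - a * mx


def make_residual(x_normal, y_normal, k=3.0):
    """Learn a residual function and a threshold from normal data only."""
    a, b = fit_line(x_normal, y_normal)
    errors = [yi - (a * xi + b) for xi, yi in zip(x_normal, y_normal)]
    threshold = k * pstdev(errors)  # assumed k-sigma rule, not from the paper

    def residual(x, y):
        # Difference between the measurement and what normal behavior predicts.
        return y - (a * x + b)

    return residual, threshold


# Hypothetical normal operating data (voltage in, pressure out).
voltage = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
pressure = [2.1, 3.9, 6.0, 8.1, 9.9, 12.0]
residual, threshold = make_residual(voltage, pressure)


def is_faulty(x, y):
    """Flag a sample whose residual magnitude exceeds the learned threshold."""
    return abs(residual(x, y)) > threshold
```

A sample that follows the learned relation yields a residual near zero, while a sample that departs from it by more than the threshold is flagged.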

arXiv (Cornell University), Sep 27, 2021
Deep reinforcement learning (RL) algorithms can learn complex policies to optimize agent operation over time. RL algorithms have shown promising results in solving complicated problems in recent years. However, their application to real-world physical systems remains limited. Despite the advancements in RL algorithms, industry often prefers traditional control strategies. Traditional methods are simple, computationally efficient, and easy to adjust. In this paper, we first propose a new Q-learning algorithm for continuous action spaces, which can bridge control and RL algorithms and bring us the best of both worlds. Our method can learn complex policies to achieve long-term goals, and at the same time it can be easily adjusted to address short-term requirements without retraining. Next, we present an approximation of our algorithm which can be applied to address short-term requirements of any pre-trained RL algorithm. The case studies demonstrate that both our proposed method and its practical approximation can achieve short-term and long-term goals without complex reward functions.
2021 IEEE International Conference on Big Data (Big Data)
Traditionally, the performance of multi-agent deep reinforcement learning algorithms is demonstrated and validated in gaming environments where we often have a fixed number of agents. In many industrial applications, the number of available agents can change on any given day, and even when the number of agents is known ahead of time, it is common for an agent to break during operation and become unavailable for a period of time. In this paper, we propose a new deep reinforcement learning algorithm for multi-agent collaborative tasks with a variable number of agents. We demonstrate the application of our algorithm using a fleet management simulator developed by Hitachi to generate realistic scenarios in a production site. Index Terms: Multi-agent deep reinforcement, fleet traffic control, variable number of agents.

Annual Conference of the PHM Society, 2016
In this paper, we propose a mixed method for analyzing telemetry data from a robotic space mission. The idea is to first apply unsupervised learning methods to the telemetry data divided into temporal segments. The large clusters that ensue typically represent the nominal operations of the spacecraft and are not of interest from an anomaly detection viewpoint. However, the smaller clusters and outliers that result from this analysis may represent specialized modes of operation, e.g., conduct of a specialized experiment on board the spacecraft, or they may represent true anomalous or unexpected behaviors. To differentiate between specialized modes and anomalies, we employ a supervised method of consulting human mission experts in the approach presented in this paper. Our longer-term goal is to develop more automated methods for detecting anomalies in time series data, and once anomalies are identified, use feature selection methods to build online detectors that can be used in future missions, thus contributing to making operations more effective and improving overall safety of the mission.

International Journal of Prognostics and Health Management, 2020
This paper discusses a mixed method that combines unsupervised learning methods and human expert input for analyzing telemetry data from long-duration robotic space missions. Our goal is to develop more automated methods for detecting anomalies in time series data. Once anomalies are identified using unsupervised learning methods, we use feature selection methods followed by expert input to derive the knowledge required for building on-line detectors. These detectors can be used in later phases of the current mission and in future missions for improving operations and overall safety of the mission. Whereas the primary focus in this paper is on developing data-driven anomaly detection methods, we also present a computational platform for data mining and analytics that can operate on historical data offline, as well as on incoming telemetry data on-line.
at - Automatisierungstechnik, 2018
Fault detection and isolation schemes are designed to detect the onset of adverse events during operations of complex systems, such as aircraft, power plants, and industrial processes. In this paper, we combine unsupervised learning techniques with expert knowledge to develop an anomaly detection method to find previously undetected faults from a large database of flight operations data. The unsupervised learning technique combined with a feature extraction scheme applied to the clusters labeled as anomalous facilitates expert analysis in characterizing relevant anomalies and faults in flight operations. We present a case study using a large flight operations data set, and discuss results to demonstrate the effectiveness of our approach. Our method is general, and equally applicable to manufacturing processes and other industrial applications.

Fault Diagnosis of Dynamic Systems, 2019
We present our temporal and spectral analyses of 29 bursts from SGR J0501+4516, detected with the Gamma-ray Burst Monitor onboard the Fermi Gamma-ray Space Telescope during the 13 days of the source activation in 2008 (August 22 to September 3). We find that the T90 durations of the bursts can be fit with a log-normal distribution with a mean value of ∼123 ms. We also estimate for the first time event durations of Soft Gamma Repeater (SGR) bursts in photon space (i.e., using their deconvolved spectra) and find that these are very similar to the T90s estimated in count space (following a log-normal distribution with a mean value of ∼124 ms). We fit the time-integrated spectra for each burst and the time-resolved spectra of the five brightest bursts with several models. We find that a single power law with an exponential cutoff model fits all 29 bursts well, while 18 of the events can also be fit with two black body functions. We expand on the physical interpretation of these two models and we compare their parameters and discuss their evolution. We show that the time-integrated and time-resolved spectra reveal that E_peak decreases with energy flux (and fluence) to a minimum of ∼30 keV at F = 8.7 × 10⁻⁶ erg cm⁻² s⁻¹, increasing steadily afterwards. Two more sources exhibit a similar trend: SGRs J1550-5418 and 1806-20. The isotropic luminosity, L_iso, corresponding to these flux values is roughly similar for all sources (0.4-1.5 × 10⁴⁰ erg s⁻¹).

IEEE Transactions on Automation Science and Engineering, 2018
This paper develops a structural diagnosis approach for fault detection and isolation in hybrid systems. Hybrid systems are characterized by continuous behaviors that are interspersed with discrete mode changes in the system, making the analysis of behaviors quite complex. In this paper, we address the mode detection problem in hybrid systems as the first step in diagnoser design. The proposed method uses analytic redundancy methods to detect the operating mode of the system even in the presence of system faults. We define hybrid minimal structurally overdetermined (HMSO) sets for hybrid systems. For residual generation, we develop the HMSO selection problem, formulated as a binary integer linear programming optimization problem to minimize the number of selected HMSOs and reduce online computational costs of the diagnosis algorithm. The proposed structural approach does not require pre-enumeration of all possible modes in the diagnoser design step. Therefore, our approach is feasible for hybrid systems with a large number of switching elements, implying that the system can have a large number of operating modes. The case study demonstrates the effectiveness of our approach. We discuss the results of our case study, and present directions for future work.
Note to Practitioners: Developing feasible approaches for online monitoring, fault detection, and fault isolation of complex hybrid and embedded systems, such as automobiles, aircraft, power plants, and manufacturing processes, is essential in securing their safe, reliable, and efficient operation. Frequent changes in the operational modes of these systems because of operator actions, such as changing gears in an automobile, or environmental changes, such as driving on a wet or icy road, make the fault detection and isolation task in these systems challenging. It is important to detect and isolate faults in all the operating modes, and at the same time, not mistake a mode change for a fault in the system. In this paper, we propose an approach that exploits the equation structure of hybrid systems behavior to combine mode detection and diagnosis in nonlinear hybrid systems. The proposed algorithm is scalable and efficient. We demonstrate its effectiveness using a case study of a reverse osmosis subsystem in an advanced life support system for long-duration manned space missions. Important challenges that can affect the success of our approach include the need for sufficiently detailed hybrid models that capture nominal and faulty behavior, and a sufficient number of sensors to make simultaneous mode detection and fault isolation possible.
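The idea of choosing as few residual candidates as possible while keeping every fault detectable and isolable can be sketched on a toy example. The code below is only an illustration under stated assumptions: it uses exhaustive search in place of a binary integer linear programming solver, it ignores operating modes entirely, and the candidate names (r1 to r4) and fault signatures are hypothetical rather than taken from the paper.

```python
from itertools import combinations


def covers(selected, signatures, faults):
    """Check detectability and pairwise isolability for a candidate selection."""
    # Detectability: every fault affects at least one selected residual.
    detected = set().union(*(signatures[r] for r in selected))
    if not set(faults) <= detected:
        return False
    # Isolability: for every pair of faults, some selected residual is
    # sensitive to exactly one of the two.
    for f, g in combinations(faults, 2):
        if not any((f in signatures[r]) != (g in signatures[r]) for r in selected):
            return False
    return True


def select_min_residuals(signatures, faults):
    """Return the smallest selection that satisfies covers(), or None."""
    names = sorted(signatures)
    for size in range(1, len(names) + 1):
        for selected in combinations(names, size):
            if covers(selected, signatures, faults):
                return selected
    return None


# Hypothetical fault signatures: which faults each candidate is sensitive to.
signatures = {
    "r1": {"f1", "f2"},
    "r2": {"f2", "f3"},
    "r3": {"f1"},
    "r4": {"f1", "f2", "f3"},
}
faults = ["f1", "f2", "f3"]
best = select_min_residuals(signatures, faults)
```

Exhaustive search grows exponentially with the number of candidates, which is one reason an integer programming formulation is attractive for larger systems.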

Applied Soft Computing, 2018
This paper combines a residual-based diagnosis approach and an unsupervised anomaly detection method for monitoring and fault diagnosis in smart buildings. Typically, buildings are very complex, and it is computationally intractable to build accurate diagnosis models for large buildings. However, complete and fairly accurate models can be constructed for components like supply and exhaust fans in buildings. Our proposed method combines a model-based diagnosis approach that uses available models and a data-driven approach that uses machine learning techniques along with the additional sensor data available from buildings to update a diagnosis reference model and build more complete diagnosis systems for buildings. To estimate the likelihood of each potential fault in complex systems, the dependencies between components and, therefore, between the sensor measurements need to be considered for accurate diagnosis. In this work, we employ the tree augmented naive Bayesian learning algorithm (TAN) to develop classifiers for fault detection and isolation. TAN structures can accommodate some dependencies between the measurements. We demonstrate and validate the proposed approach using a dataset from an outdoor air unit (OAU) system in the Lentz Public Health Center in Nashville.

Applied Sciences, 2019
The increasing complexity and size of cyber-physical systems (e.g., aircraft, manufacturing processes, and power generation plants) are making it hard to develop centralized diagnosers that are reliable and efficient. In addition, advances in networking technology, along with the availability of inexpensive sensors and processors, are causing a shift in focus from centralized to more distributed diagnosers. This paper develops two structural approaches for distributed fault detection and isolation. The first method uses redundant equation sets for residual generation, referred to as minimal structurally-over-determined sets, and the second is based on the original model equations. We compare the diagnosis performance of the two algorithms and clarify the pros and cons of each method. A case study is used to demonstrate the two methods, and the results are discussed together with directions for future work.
Datenanalyse in der intelligenten Fabrik (Data Analysis in the Intelligent Factory)
Handbuch Industrie 4.0, 2016

Reliability Engineering & System Safety, 2016
While most prognostics approaches focus on accurate computation of the degradation rate and the remaining useful life (RUL) of individual components, it is the rate at which the performance of subsystems and systems degrades that is of greater interest to the operators and maintenance personnel of these systems. We develop a comprehensive methodology for system-level prognostics under different forms of uncertainty in this paper. Our approach combines an estimation scheme with a prediction scheme to compute the RUL as a stochastic distribution over the life of the system. We compare two prediction methods: (1) stochastic simulation and (2) the inverse first order reliability method (inverse-FORM). We compare the computational complexity and the accuracy of the two approaches using a case study of a system with several degrading components.
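To make the notion of RUL as a distribution concrete, here is a minimal stochastic-simulation sketch. It is a simplification, not the methodology of the paper: it assumes a single scalar health indicator that degrades linearly at an uncertain, normally distributed rate, and every parameter value is hypothetical.

```python
import random


def simulate_rul(health, failure_level, rate_mean, rate_std,
                 n_samples=2000, seed=0):
    """Monte Carlo samples of the time until health reaches failure_level."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    ruls = []
    for _ in range(n_samples):
        # Sample an uncertain degradation rate; clamp to keep it positive.
        rate = max(rng.gauss(rate_mean, rate_std), 1e-9)
        ruls.append((health - failure_level) / rate)
    return sorted(ruls)


def percentile(sorted_values, q):
    """Simple empirical percentile on pre-sorted data (0 <= q <= 1)."""
    idx = min(int(q * len(sorted_values)), len(sorted_values) - 1)
    return sorted_values[idx]


# Hypothetical system: health 1.0 now, failure declared at 0.6,
# degradation rate 0.01 +/- 0.002 health units per hour.
ruls = simulate_rul(health=1.0, failure_level=0.6,
                    rate_mean=0.01, rate_std=0.002)
p05, p50, p95 = (percentile(ruls, q) for q in (0.05, 0.50, 0.95))
```

Reporting the 5th, 50th, and 95th percentiles rather than a single number keeps the uncertainty visible; the median here should sit near 0.4 / 0.01 = 40 hours.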

53rd IEEE Conference on Decision and Control, 2014
A number of residual generation methods have been developed for robust model-based fault detection and isolation (FDI). There have also been a number of offline (i.e., design-time) methods that focus on optimizing FDI performance (e.g., trading off detection performance versus cost). However, design-time algorithms are not tuned to optimize performance for different operating regions of system behavior. To do this, one would need to define online measures of sensitivity and robustness, and use them to select the best residual set online as system behavior transitions between operating regions. In this paper, we develop a quantitative measure of residual performance, called the detectability ratio, that applies to additive and multiplicative uncertainties when determining the best residual set in different operating regions. We discuss this methodology and demonstrate its effectiveness using a case study.
This paper presents a framework for distributed fault detection and isolation in dynamic systems. Our approach uses the dynamic model of each subsystem to derive a set of independent, local diagnosers. If needed, the subsystem model is extended to include measurements and model equations from its immediate neighbors to compute its diagnosis. Our approach is designed to ensure that each subsystem diagnoser provides the correct results; therefore, a local diagnosis result is equivalent to the result that would be produced by a global system diagnoser. We discuss the distributed diagnosis algorithm, and illustrate its application using a multi-tank system.
Long-term planning, short-term adjustments
Model-based approaches to fault detection and isolation (FDI) rely on accurate models of the plant and a sufficient number of reliable measurements for residual generation and analysis. However, in ...

Degradation Modeling and Remaining Useful Life Prediction of Electrolytic Capacitors under Thermal Overstress Condition Using Particle Filters
Prognostic and remaining useful life (RUL) predictions for electrolytic capacitors under thermal overstress conditions are investigated in this paper. In the first step, the degradation process is modeled as a physics-of-failure process. All of the relevant parameters and states of the capacitor are considered during the degradation process. A particle filter approach is utilized to derive the dynamic form of the degradation model and estimate the current state of capacitor health. This model is then used to obtain a more accurate estimate of the RUL of the capacitors as they are subjected to thermal stress conditions. The paper includes an experimental study, where the degradation of a set of identical capacitors under thermal overstress conditions is studied to demonstrate and validate the performance of the degradation modeling approach.
ArXiv, 2020
Dynamic dispatching aims to smartly allocate the right resources to the right place at the right time. Dynamic dispatching is one of the core problems for operations optimization in the mining industry. Theoretically, deep reinforcement learning (RL) should be a natural fit to solve this problem. However, the industry relies on heuristics or even human intuition, which often yield short-sighted and sub-optimal solutions. In this paper, we review the main challenges in using deep RL to address the dynamic dispatching problem in the mining industry.

Explosive growth in spatio-temporal data and its wide range of applications have attracted increasing interest from researchers in the statistical and machine learning fields. The spatio-temporal regression problem is of paramount importance from both the methodology development and real-world application perspectives. Given the observed spatially encoded time series covariates and real-valued response data samples, the goal of spatio-temporal regression is to leverage the temporal and spatial dependencies to build a mapping from covariates to response with minimized prediction error. Prior art, including the convolutional Long Short-Term Memory (ConvLSTM) and variations of the functional linear models, cannot learn the spatio-temporal information in a simple and efficient format for proper model building. In this work, we propose two novel extensions of the Functional Neural Network (FNN), a temporal regression model whose effectiveness and superior performance over alternative sequ...