2018, Annals of Mathematics and Artificial Intelligence
Probabilistic methods for causal discovery detect patterns of correlation between variables. Grounded in statistical theory, they have revolutionised the study of causality. However, when correlation itself is unreliable, so are probabilistic methods: unusual data can lead to spurious causal links, while nonmonotonic functional relationships between variables can prevent the detection of causal links. We describe a new heuristic method for inferring causality between two continuous variables, based on randomness and unimodality tests and making few assumptions about the data. We evaluate the method against probabilistic and additive noise algorithms on real and artificial datasets, and show that it performs competitively.
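A minimal sketch of how such a heuristic could be instantiated. The specific tests below (a Wald-Wolfowitz runs test for randomness and a crude histogram-based unimodality check on residuals) are illustrative assumptions, not necessarily the paper's exact choices:

```python
import numpy as np
from scipy.stats import norm

def runs_test_pvalue(signs):
    """Wald-Wolfowitz runs test on a sign sequence; high p = 'random'."""
    n_pos = np.sum(signs > 0)
    n_neg = np.sum(signs <= 0)
    if n_pos == 0 or n_neg == 0:
        return 0.0
    runs = 1 + np.sum(signs[1:] != signs[:-1])
    mu = 2.0 * n_pos * n_neg / (n_pos + n_neg) + 1
    var = (mu - 1) * (mu - 2) / (n_pos + n_neg - 1)
    return 2 * norm.sf(abs((runs - mu) / np.sqrt(var)))

def is_unimodal(v, bins=20):
    """Crude unimodality proxy: at most one interior peak in a histogram."""
    h, _ = np.histogram(v, bins=bins)
    peaks = sum(1 for i in range(1, bins - 1) if h[i] > h[i - 1] and h[i] >= h[i + 1])
    return peaks <= 1

def direction_score(cause, effect, window=10):
    """Higher = residuals of a moving-average fit look more like noise."""
    y = effect[np.argsort(cause)]
    resid = y - np.convolve(y, np.ones(window) / window, mode="same")
    return runs_test_pvalue(np.sign(resid)) + (1.0 if is_unimodal(resid) else 0.0)

def infer_direction(x, y):
    return "x->y" if direction_score(x, y) >= direction_score(y, x) else "y->x"
```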
Lecture Notes in Statistics, 1994
The discovery of causal relationships from empirical data is an important problem in machine learning. In this paper the attention is focused on the inference of probabilistic causal relationships, for which two different approaches, namely Glymour et al.'s approach based on constraints on correlations and Pearl and Verma's approach based on conditional independencies, have been proposed. These methods differ both in the kind of constraints they consider while selecting a causal model and in the way they search for the model that best fits the sample data. Preliminary experiments show that they are complementary in several aspects. Moreover, the method of conditional independence can be easily extended to the case in which variables have a nominal or ordinal domain. In this case, symbolic learning algorithms can be exploited in order to derive the causal law from the causal model.
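Both approaches ultimately test constraints against the sample. A minimal sketch of the conditional-independence primitive, using a partial-correlation test with Fisher's z transform (valid under joint Gaussianity, an assumption of this sketch):

```python
import numpy as np
from scipy.stats import norm

def partial_corr(x, y, Z):
    """Correlation of x and y after regressing out the columns of Z."""
    D = np.column_stack([np.ones(len(x)), Z])
    rx = x - D @ np.linalg.lstsq(D, x, rcond=None)[0]
    ry = y - D @ np.linalg.lstsq(D, y, rcond=None)[0]
    return np.corrcoef(rx, ry)[0, 1]

def ci_test(x, y, Z, alpha=0.05):
    """True if x and y are judged independent given the columns of Z."""
    r = partial_corr(x, y, Z)
    z = 0.5 * np.log((1 + r) / (1 - r)) * np.sqrt(len(x) - Z.shape[1] - 3)
    return 2 * norm.sf(abs(z)) > alpha
```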
2014
The relationship between statistical dependency and causality lies at the heart of all statistical approaches to causal inference. Recent results in the ChaLearn cause-effect pair challenge have shown that causal directionality can be inferred with good accuracy, even in Markov-indistinguishable configurations, thanks to data-driven approaches. This paper proposes a supervised machine learning approach to infer the existence of a directed causal link between two variables in multivariate settings with $n>2$ variables. The approach relies on the asymmetry of some conditional (in)dependence relations between the members of the Markov blankets of two causally connected variables. Our results show that supervised learning methods may be successfully used to extract causal information on the basis of asymmetric statistical descriptors, also for $n>2$-variate distributions.
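A hypothetical sketch of the supervised recipe: turn each ordered pair of variables into a vector of asymmetric descriptors and train a standard classifier on pairs with known direction. The descriptors below are illustrative stand-ins, not the paper's exact Markov-blanket features:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def pair_descriptors(xi, xj, rest):
    """Asymmetric features for the ordered pair (xi, xj); rest holds the
    remaining variables as columns (a crude Markov-blanket surrogate)."""
    feats = []
    for a, b in ((xi, xj), (xj, xi)):
        coef, *_ = np.linalg.lstsq(rest, b, rcond=None)
        resid = b - rest @ coef            # b with the other variables removed
        feats.append(np.corrcoef(a, resid)[0, 1])
    return np.array(feats)

# Training on pairs whose direction is known (labels: 1 if i -> j, else 0):
# X_train = np.array([pair_descriptors(xi, xj, rest) for ...])
# clf = RandomForestClassifier(n_estimators=500).fit(X_train, y_train)
```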
2014
The task of attributing cause and effect is prevalent and pervasive in history as well as in everyday life. Understanding the factors affecting the economy, human health, and global warming can help us solve many problems. The standard way to detect causal direction is to perform randomised experiments, which can be expensive, unethical, or even impossible to perform. Inferring the same from already collected data can solve many problems. In this project we dealt with finding the causal direction between two variables, given their distribution. This problem was part of the Cause-Effect Pair challenge. We used functional noise models and deterministic relational models to compute the cost associated with each causal direction for the two variables. We determined the existence of a relation using an SVM classifier on features extracted from the distribution of the data.
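A minimal sketch of the functional (additive) noise model ingredient: fit a nonparametric regression in each direction and take the dependence between putative cause and residual as that direction's cost. The paper may use a stronger dependence measure such as HSIC; plain correlation is used here for brevity, an assumption of this sketch:

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

def anm_cost(cause, effect, k=10):
    """Cost of the direction cause -> effect: dependence of the residual
    of a k-NN regression on the putative cause (lower is better)."""
    reg = KNeighborsRegressor(n_neighbors=k).fit(cause.reshape(-1, 1), effect)
    resid = effect - reg.predict(cause.reshape(-1, 1))
    return abs(np.corrcoef(cause, resid)[0, 1])

def infer_direction(x, y):
    return "x->y" if anm_cost(x, y) < anm_cost(y, x) else "y->x"
```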
Discovering interdependencies and causal relationships is one of the most relevant challenges raised by the information era. As more and better data become available, there is an urgent need for data-driven techniques capable of efficiently detecting hidden interactions. As such, this important issue is receiving increasing attention in the recent literature. The aim of the Learning Causality Special Session is to bring together theorists and practitioners of this fascinating discipline. The main streams of causality detection by Computational Intelligence will be covered, namely the probabilistic, information-theoretic, and Granger approaches.
2010
The rapid spread of interest in the last two decades in principled methods of search or estimation of causal relations has been driven in part by technological developments, especially the changing nature of modern data collection and storage techniques, and the increases in the speed and storage capacities of computers. Statistics books from 30 years ago often presented examples with fewer than 10 variables, in domains where some background knowledge was plausible.
2010
We consider two variables that are related to each other by an invertible function. While it has previously been shown that the dependence structure of the noise can provide hints to determine which of the two variables is the cause, we presently show that even in the deterministic (noise-free) case, there are asymmetries that can be exploited for causal inference. Our method is based on the idea that if the function and the probability density of the cause are chosen independently, then the distribution of the effect will, in a certain sense, depend on the function. We provide a theoretical analysis of this method, showing that it also works in the low noise regime, and link it to information geometry. We report strong empirical results on various real-world data sets from different domains.
2004
Causality and discovering causal relations are of interest because they allow us to explain and control systems and phenomena. There have been many debates on causality and whether it is possible to discover causal relations automatically. Different approaches to solving the problem of mining causality have been tried, such as utilising conditional probability or temporal approaches. Discussing, evaluating, and comparing these methods can add perspective to the efforts of all the people involved in this research area. The aim of this workshop is to bring researchers from different backgrounds together to discuss the latest work being done in this domain. This workshop is the result of the joint efforts of the authors, the programme committee members, and the Canadian AI organisers. This volume would not have been possible without the help of the members of the programme committee, who reviewed the papers attentively. The Canadian AI'2004 organisers, General Chair Kay Wiese (Simon Fraser University), Program Co-Chairs Scott Goodwin and Ahmed Tawfik (both from the University of Windsor), and Local Organiser Bob Mercer (University of Western Ontario), supported the workshop from beginning to end. Thanks to Weiming Shen for hosting the workshops at National Research Council Canada (NRC) facilities and helping with the co-ordination. The Department of Computer Science at the University of Regina, and especially Howard Hamilton, contributed their time and resources towards the preparation of this volume. The efforts of all the people not mentioned by name, who in any way helped in making this workshop possible, are greatly appreciated.
Machine Learning for Pharma and Healthcare Applications ECML PKDD 2020 Workshop (PharML 2020), 2020
At the heart of causal structure learning from observational data lies a deceptively simple question: given two statistically dependent random variables, which one has a causal effect on the other? This is impossible to answer using statistical dependence testing alone and requires that we make additional assumptions. We propose fast and simple criteria for distinguishing cause and effect in pairs of discrete or continuous random variables. The intuition behind them is that predicting the effect variable using the cause variable should be 'simpler' than the reverse, with different notions of 'simplicity' giving rise to different criteria. We demonstrate the accuracy of the criteria on synthetic data generated under a broad family of causal mechanisms and types of noise.
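One illustrative way to instantiate such a criterion (an assumption of this sketch, not necessarily one of the paper's): fit a deliberately low-capacity regressor in both directions and call the direction with the smaller scale-free error the causal one:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def fit_error(cause, effect, depth=3):
    """Scale-free error of a deliberately simple regressor for this direction."""
    reg = DecisionTreeRegressor(max_depth=depth).fit(cause.reshape(-1, 1), effect)
    resid = effect - reg.predict(cause.reshape(-1, 1))
    return np.var(resid) / np.var(effect)

def simpler_direction(x, y):
    return "x->y" if fit_error(x, y) < fit_error(y, x) else "y->x"
```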
Neural Information Processing Systems, 2008
Causal structure-discovery techniques usually assume that all causes of more than one variable are observed. This is the so-called causal sufficiency assumption. In practice, it is untestable, and often violated. In this paper, we present an efficient causal structure-learning algorithm, suited for causally insufficient data. Similar to algorithms such as IC* and FCI, the proposed approach drops the causal sufficiency assumption and learns a structure that indicates (potential) latent causes for pairs of observed variables. Assuming a constant local density of the data-generating graph, our algorithm makes a quadratic number of conditional-independence tests w.r.t. the number of variables. We show with experiments that our algorithm is comparable to the state-of-the-art FCI algorithm in accuracy, while being several orders of magnitude faster on large problems. We conclude that MBCS* makes a new range of causally insufficient problems computationally tractable.
ArXiv, 2019
The discovery of causal relationships is a fundamental problem in science and medicine. In recent years, many elegant approaches to discovering causal relationships between two variables from uncontrolled data have been proposed. However, most of these deal only with purely directed causal relationships and cannot detect latent common causes. Here, we devise a general method which takes a purely directed causal discovery algorithm and modifies it so that it can also detect latent common causes. The identifiability of the modified algorithm depends on the identifiability of the original, as well as an assumption that the strength of noise be relatively small. We apply our method to two directed causal discovery algorithms, the Information Geometric Causal Inference of Daniusis et al. (2010) and the Kernel Conditional Deviance for Causal Inference of Mitrovic, Sejdinovic, and Teh (2018), and extensively test on synthetic data, detecting latent common causes in additive, multiplicative...
Advances in Neural …, 2009
Lecture Notes in Computer Science, 2007
We present an algorithm for causal structure discovery suited to data with continuous variables. We test a version based on partial correlation that is able to recover the structure of a recursive linear equations model, and compare it to the well-known PC algorithm on large networks. PC is generally outperformed in run time and number of structural errors.
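A sketch of the partial-correlation machinery such an algorithm can build on for linear-Gaussian data: the full-order partial correlations of all variable pairs, read off the inverse covariance (precision) matrix in one shot:

```python
import numpy as np

def partial_corr_matrix(X):
    """X: (n_samples, n_vars). Entry (i, j) is corr(X_i, X_j | all others)."""
    P = np.linalg.inv(np.cov(X, rowvar=False))   # precision matrix
    d = np.sqrt(np.diag(P))
    R = -P / np.outer(d, d)
    np.fill_diagonal(R, 1.0)
    return R

# A PC-style skeleton step would then delete edge (i, j) whenever the
# partial correlation is close enough to zero for the sample size at hand.
```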
Applied informatics
This paper aims to give a broad coverage of central concepts and principles involved in automated causal inference and emerging approaches to causal discovery from i.i.d. data and from time series. After reviewing concepts including manipulations, causal models, sample predictive modeling, causal predictive modeling, and structural equation models, we present the constraint-based approach to causal discovery, which relies on the conditional independence relationships in the data, and discuss the assumptions underlying its validity. We then focus on causal discovery based on structural equation models, in which a key issue is the identifiability of the causal structure implied by appropriately defined structural equation models: in the two-variable case, under what conditions (and why) is the causal direction between the two variables identifiable? We show that the independence between the error term and causes, together with appropriate structural constraints on the structural equations...
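A two-variable illustration of the identifiability argument, under an assumed linear model with non-Gaussian noise: only in the true direction is the regression residual independent of the regressor, so a cheap dependence proxy on both residuals can reveal the direction. This sketch is a simplification; the survey treats the general structural-equation setting:

```python
import numpy as np

def dependence(a, b):
    """Cheap proxy for dependence: correlation of a with b and with b^2."""
    b = (b - b.mean()) / b.std()
    return abs(np.corrcoef(a, b)[0, 1]) + abs(np.corrcoef(a, b ** 2)[0, 1])

def linear_direction(x, y):
    x = x - x.mean()
    y = y - y.mean()
    r_y_on_x = y - (np.dot(x, y) / np.dot(x, x)) * x   # residual of y ~ x
    r_x_on_y = x - (np.dot(x, y) / np.dot(y, y)) * y   # residual of x ~ y
    # the true direction leaves a residual independent of the regressor
    return "x->y" if dependence(x, r_y_on_x) < dependence(y, r_x_on_y) else "y->x"
```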
Computer, 2011
As data becomes invisible, emerging technologies can help human analysts and decision makers understand, model, and visualize causal relationships.
Proceedings of the Seventh Conference on Uncertainty …, 1991
The presence of latent variables can greatly complicate inferences about causal relations between measured variables from statistical data. In many cases, the presence of latent variables makes it impossible to determine for two measured variables A and B, ...
Discovering causal relationships in large databases of observational data is challenging. The pioneering work in this area was rooted in the theory of Bayesian network (BN) learning, which, however, is an NP-complete problem. Hence several constraint-based algorithms have been developed to efficiently discover causal relationships in large databases. These methods usually use the idea of BN learning, directly or indirectly, and are focused on causal relationships with single cause variables. In this paper, we propose an approach to mine causal rules in large databases of binary variables. Our method expands the scope of causality discovery to causal relationships with multiple cause variables, and we utilise partial association tests to exclude non-causal associations and ensure the high reliability of discovered causal rules. Furthermore, an efficient algorithm is designed for the tests in large databases. We assess the method on a set of real-world diagnostic data. The results show that our method can effectively discover interesting causal rules in large databases.
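A sketch of a partial-association check in this spirit: a Mantel-Haenszel-style statistic testing whether a binary association survives stratification on possible confounders. The authors' exact test is not specified here, so treat this as one plausible instantiation:

```python
import numpy as np
from scipy.stats import chi2

def mantel_haenszel_pvalue(tables):
    """tables: 2x2 arrays [[a, b], [c, d]], one per stratum of the
    conditioning variables; small p = association survives stratification."""
    num, var = 0.0, 0.0
    for t in tables:
        a, b, c, d = np.asarray(t, dtype=float).ravel()
        n = a + b + c + d
        num += a - (a + b) * (a + c) / n
        var += (a + b) * (c + d) * (a + c) * (b + d) / (n ** 2 * (n - 1))
    stat = (abs(num) - 0.5) ** 2 / var   # continuity-corrected, 1 df
    return chi2.sf(stat, df=1)
```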
Journal of Artificial Intelligence Research, 2022
We introduce in this survey the major concepts, models, and algorithms proposed so far to infer causal relations from observational time series, a task usually referred to as causal discovery in time series. To do so, after a description of the underlying concepts and modelling assumptions, we present different methods according to the family of approaches they belong to: Granger causality, constraint-based approaches, noise-based approaches, score-based approaches, logic-based approaches, topology-based approaches, and difference-based approaches. We then evaluate several representative methods to illustrate the behaviour of different families of approaches. This illustration is conducted on both artificial and real datasets, with different characteristics. The main conclusions one can draw from this survey are that causal discovery in time series is an active research field in which new methods (in every family of approaches) are regularly proposed, and that no family or method stands out...
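As a concrete taste of the oldest family listed, a minimal bivariate Granger-causality check with statsmodels: does the past of y improve the prediction of x beyond x's own past? The synthetic series here is a toy example of this sketch:

```python
import numpy as np
from statsmodels.tsa.stattools import grangercausalitytests

rng = np.random.default_rng(0)
src = rng.normal(size=502)
x = src[:-2] + 0.5 * rng.normal(size=500)   # x_t driven by src two steps back
y = src[2:]                                 # so past y helps predict x

# the second column is tested as the Granger cause of the first column
res = grangercausalitytests(np.column_stack([x, y]), maxlag=3, verbose=False)
print({lag: round(r[0]["ssr_ftest"][1], 4) for lag, r in res.items()})
```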
Journal of Machine Learning Research
The goal of many sciences is to understand the mechanisms by which variables came to take on the values they have (that is, to find a generative model), and to predict what the values of those variables would be if the naturally occurring mechanisms were subject to outside manipulations. The past 30 years have seen a number of conceptual developments that are partial solutions to the problem of causal inference from observational sample data or a mixture of observational sample and experimental data, particularly in the area of graphical causal modeling. However, in many domains, problems such as the large numbers of variables, small sample sizes, and possible presence of unmeasured causes remain serious impediments to practical applications of these developments. The articles in the Special Topic on Causality address these and other problems in applying graphical causal modeling algorithms. This introduction to the Special Topic on Causality provides a brief introduction to graphical causal modeling, places the articles in a broader context, and describes the differences between causal inference and ordinary machine learning classification and prediction problems.
We are interested in learning causal relationships between pairs of random variables, purely from observational data. To effectively address this task, the state-of-the-art relies on strong assumptions on the mechanisms mapping causes to effects, such as invertibility or the existence of additive noise, which only hold in limited situations. In contrast, this short paper proposes to learn how to perform causal inference directly from data, without the need for feature engineering. In particular, we pose causality as a kernel mean embedding classification problem, where inputs are samples from arbitrary probability distributions on pairs of random variables, and labels are types of causal relationships. We validate the performance of our method on synthetic and real-world data against the state-of-the-art. Moreover, we submitted our algorithm to ChaLearn's "Fast Causation Coefficient Challenge" competition, in which we won the fastest code prize and ranked third in the overall leaderboard.
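A sketch of the kernel mean embedding idea with random Fourier features standing in for the kernel (a simplification; the paper's exact featurization may differ): each scatter of (x, y) samples becomes one fixed-length vector, its empirical mean embedding, and an ordinary classifier learns to map it to a direction label:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def rff_mean_embedding(x, y, W, b):
    """Empirical kernel mean embedding of the sample {(x_i, y_i)} with
    random Fourier features approximating a Gaussian kernel."""
    Z = np.column_stack([x, y])
    return np.cos(Z @ W + b).mean(axis=0)

rng = np.random.default_rng(0)
W = rng.normal(size=(2, 200))               # frequencies ~ kernel spectrum
b = rng.uniform(0, 2 * np.pi, size=200)

# features = np.array([rff_mean_embedding(x, y, W, b) for (x, y) in pairs])
# clf = LogisticRegression(max_iter=1000).fit(features, labels)  # direction labels
```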