Lecture Notes in Statistics, 1993
A discovery problem is composed of a set of alternative structures, one of which is the source of data, but any of which, for all the investigator knows before the inquiry, could be the structure from which the data are obtained. There is something to be found out about the actual structure, whichever it is. It may be that we want to settle a particular hypothesis that is true in some of the possible structures and false in others, or it may be that we want to know the complete theory of a certain sort of phenomenon. In this book, and in much of the social sciences and epidemiology, the alternative structures in a discovery problem are typically directed acyclic graphs paired with joint probability distributions on their vertices. We usually want to know something about the structure of the graph that represents causal influences, and we may also want to know about the distribution of values of variables in the graph for a given population. A discovery problem also includes a characterization of a kind of evidence; for example, data may be available for some of the variables but not others, and the data may include the actual probability or conditional independence relations or, more realistically, simply the values of the variables for random samples. Our theoretical discussions will usually consider discovery problems in which the data include the true conditional independence relations among the measured variables, but our examples and applications will always involve inferences from statistical samples. A method solves a discovery problem in the limit if, as the sample size increases without bound, the method converges to the true answer to the question or to the true theory, whatever it is.
Lecture Notes in Statistics, 1993
Behaviormetrika, 2022
Existing causal discovery algorithms are often evaluated using two success criteria, one that is typically unachievable and the other too weak for practical purposes. The unachievable criterion, uniform consistency, requires that a discovery algorithm identify the correct causal structure at a known sample size. The weak but achievable criterion, pointwise consistency, requires only that one identify the correct causal structure in the limit. We investigate two intermediate success criteria, decidability and progressive solvability, that are stricter than mere consistency but weaker than uniform consistency. To do so, we review several topological theorems characterizing which discovery problems are decidable and/or progressively solvable. These theorems apply to any problem of statistical model selection, but in this paper, we apply the theorems only to selection of causal models. We show, under several common modeling assumptions, that there is no uniformly consistent procedure for identifying the direction of a causal edge, but there are statistical decision procedures and progressive solutions. We focus on linear models in which the error terms are either non-Gaussian or contain no Gaussian components; the latter modeling assumption is novel to this paper. We focus especially on which success criteria remain feasible when confounders are present.
Proceedings of the 22nd Conference in …, 2006
Causal discovery from observational data in the presence of unobserved variables is challenging. Identification of so-called Y substructures is a sufficient condition for ascertaining some causal relations in the large sample limit, without the assumption of no hidden common causes. An example of a Y substructure is A → C, B → C, C → D. This paper describes the first asymptotically reliable and computationally feasible score-based search for discrete Y structures that does not assume that there are no unobserved common causes. The search applies to any parameterization of a directed acyclic graph (DAG) whose scores have the property that any DAG that can represent the distribution beats any DAG that cannot, and, of two DAGs that both represent the distribution, the one with fewer parameters wins. In this framework there is no need to assign scores to causal structures with unobserved common causes. The paper also describes how the existence of a Y structure shows the presence of an unconfounded causal relation, without assuming that there are no hidden common causes.
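The collider pattern at the heart of a Y substructure can be checked on simulated data. Below is a minimal sketch, assuming a linear-Gaussian parameterization and using plain (partial) correlations rather than the paper's score-based search; all variable and helper names here are illustrative, not from the paper:

```python
import math
import random

random.seed(0)

def corr(u, v):
    # Pearson correlation of two equal-length samples.
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((a - mv) ** 2 for a in v))
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (su * sv)

def partial_corr(u, v, w):
    # Partial correlation of u and v given a single conditioning variable w.
    ruv, ruw, rvw = corr(u, v), corr(u, w), corr(v, w)
    return (ruv - ruw * rvw) / math.sqrt((1 - ruw ** 2) * (1 - rvw ** 2))

# Simulate the Y substructure A -> C <- B, C -> D with independent noise.
n = 20000
A = [random.gauss(0, 1) for _ in range(n)]
B = [random.gauss(0, 1) for _ in range(n)]
C = [a + b + random.gauss(0, 0.5) for a, b in zip(A, B)]
D = [c + random.gauss(0, 0.5) for c in C]

print(abs(corr(A, B)))             # near 0: A and B marginally independent
print(abs(partial_corr(A, B, C)))  # large: conditioning on the collider C links them
```

The marginal independence of A and B, combined with their dependence given the collider C, is the signature that lets the Y pattern orient edges without assuming away hidden common causes.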
Studies in Fuzziness and Soft Computing, 2006
We examine the Bayesian approach to the discovery of directed acyclic causal models and compare it to the constraint-based approach. Both approaches rely on the Causal Markov assumption, but the two differ significantly in theory and practice. An important difference between the approaches is that the constraint-based approach uses categorical information about conditional-independence constraints in the domain, whereas the Bayesian approach weighs the degree to which such constraints hold. As a result, the Bayesian approach has three distinct advantages over its constraint-based counterpart. One, conclusions derived from the Bayesian approach are not susceptible to incorrect categorical decisions about independence facts that can occur with data sets of finite size. Two, using the Bayesian approach, finer distinctions among model structures, both quantitative and qualitative, can be made. Three, information from several models can be combined to make better inferences and to better account for modeling uncertainty. In addition to describing the general Bayesian approach to causal discovery, we review approximation methods for missing data and hidden variables, and illustrate differences between the Bayesian and constraint-based methods using artificial and real examples.
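The contrast between categorical independence decisions and score-based weighing can be illustrated with a toy model comparison. Below is a minimal sketch, assuming binary variables and using the BIC approximation to the marginal likelihood; the helper names are hypothetical and not from the paper:

```python
import math
import random

random.seed(2)

def log_lik_independent(pairs):
    # Log-likelihood under "X and Y independent": fit P(X) and P(Y) separately.
    n = len(pairs)
    px = sum(x for x, _ in pairs) / n
    py = sum(y for _, y in pairs) / n
    ll = 0.0
    for x, y in pairs:
        ll += math.log(px if x else 1 - px) + math.log(py if y else 1 - py)
    return ll

def log_lik_edge(pairs):
    # Log-likelihood under "X -> Y": fit P(X) and P(Y | X) by frequencies.
    n = len(pairs)
    px = sum(x for x, _ in pairs) / n
    n1 = sum(1 for x, _ in pairs if x)
    p1 = sum(y for x, y in pairs if x) / n1
    p0 = sum(y for x, y in pairs if not x) / (n - n1)
    ll = 0.0
    for x, y in pairs:
        p = p1 if x else p0
        ll += math.log(px if x else 1 - px) + math.log(p if y else 1 - p)
    return ll

def bic(ll, k, n):
    # BIC score: log-likelihood penalized by parameter count k.
    return ll - 0.5 * k * math.log(n)

# Simulate binary data where Y depends strongly on X.
n = 5000
data = []
for _ in range(n):
    x = random.random() < 0.5
    y = random.random() < (0.8 if x else 0.2)
    data.append((x, y))

# The dependent model uses 3 parameters, the independent model 2.
print(bic(log_lik_edge(data), 3, n) - bic(log_lik_independent(data), 2, n))
```

Rather than a binary accept/reject of the independence constraint, the score difference quantifies how strongly the data favor one structure, which is the sense in which the Bayesian approach "weighs" constraints.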
Applied informatics
This paper aims to give a broad coverage of central concepts and principles involved in automated causal inference and emerging approaches to causal discovery from i.i.d. data and from time series. After reviewing concepts including manipulations, causal models, sample predictive modeling, causal predictive modeling, and structural equation models, we present the constraint-based approach to causal discovery, which relies on the conditional independence relationships in the data, and discuss the assumptions underlying its validity. We then focus on causal discovery based on structural equation models, in which a key issue is the identifiability of the causal structure implied by appropriately defined structural equation models: in the two-variable case, under what conditions (and why) is the causal direction between the two variables identifiable? We show that the independence between the error term and causes, together with appropriate structural constraints on the structural equations ...
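The two-variable identifiability argument can be illustrated numerically. Below is a minimal sketch under an assumed linear model with uniform (hence non-Gaussian) noise, using the correlation of squared residuals as a crude stand-in for a proper independence test such as HSIC; all names are illustrative:

```python
import math
import random

random.seed(1)

def corr(u, v):
    # Pearson correlation of two equal-length samples.
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((a - mv) ** 2 for a in v))
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (su * sv)

def ols_residuals(x, y):
    # Residuals of the least-squares regression of y on x.
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    var = sum((a - mx) ** 2 for a in x) / n
    slope = cov / var
    return [b - my - slope * (a - mx) for a, b in zip(x, y)]

def residual_dependence(cause, effect):
    # Crude dependence measure between regressor and residual:
    # correlation of their squares (zero if truly independent).
    r = ols_residuals(cause, effect)
    return abs(corr([v ** 2 for v in r], [v ** 2 for v in cause]))

# Linear model with non-Gaussian (uniform) noise: X -> Y.
n = 20000
X = [random.uniform(-1, 1) for _ in range(n)]
Y = [x + random.uniform(-0.5, 0.5) for x in X]

fwd = residual_dependence(X, Y)  # correct direction: residual ~ independent of X
bwd = residual_dependence(Y, X)  # reverse direction: residual stays dependent on Y
print(fwd, bwd)
```

In the causal direction the residual recovers the independent noise term, while in the anti-causal direction residual and regressor remain dependent; with Gaussian noise both directions would look symmetric, which is why non-Gaussianity buys identifiability.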
Lecture Notes in Computer Science, 1994
The discovery of causal relationships from empirical data is an important problem in machine learning. In this paper attention is focused on the inference of probabilistic causal relationships, for which two different approaches, namely Glymour et al.'s approach based on constraints on correlations and Pearl and Verma's approach based on conditional independencies, have been proposed. These methods differ both in the kind of constraints they consider while selecting a causal model and in the way they search for the model that best fits the sample data. Preliminary experiments show that they are complementary in several aspects. Moreover, the method of conditional independence can be easily extended to the case in which variables have a nominal or ordinal domain. In this case, symbolic learning algorithms can be exploited in order to derive the causal law from the causal model.
Statistica, 2010
38th Conference on Uncertainty in Artificial Intelligence, 2022
Inferring causal relationships from observational data is rarely straightforward, but the problem is especially difficult in high dimensions. For these applications, causal discovery algorithms typically require parametric restrictions or extreme sparsity constraints. We relax these assumptions and focus on an important but more specialized problem, namely recovering a directed acyclic subgraph of variables known to be causally descended from some (possibly large) set of confounding covariates, i.e. a confounder blanket. This is useful in many settings, for example when studying a dynamic biomolecular subsystem with genetic data providing causally relevant background information. Under a structural assumption that, we argue, must be satisfied in practice if informative answers are to be found, our method accommodates graphs of low or high sparsity while maintaining polynomial time complexity. We derive a sound and complete algorithm for identifying causal relationships under these conditions and implement testing procedures with provable error control for linear and nonlinear systems. We demonstrate our approach on a range of simulation settings.
Machine Learning for Pharma and Healthcare Applications ECML PKDD 2020 Workshop (PharML 2020), 2020
At the heart of causal structure learning from observational data lies a deceptively simple question: given two statistically dependent random variables, which one has a causal effect on the other? This is impossible to answer using statistical dependence testing alone and requires that we make additional assumptions. We propose fast and simple criteria for distinguishing cause and effect in pairs of discrete or continuous random variables. The intuition behind them is that predicting the effect variable using the cause variable should be 'simpler' than the reverse, with different notions of 'simplicity' giving rise to different criteria. We demonstrate the accuracy of the criteria on synthetic data generated under a broad family of causal mechanisms and types of noise.
arXiv preprint arXiv:1206.3260, 2012
An important task in data analysis is the discovery of causal relationships between observed variables. For continuous-valued data, linear acyclic causal models are commonly used to model the data-generating process, and the inference of such models is a well-studied problem. However, existing methods have significant limitations. Methods based on conditional independencies (Spirtes et al. 1993; Pearl 2000) cannot distinguish between independence-equivalent models, whereas approaches purely based on Independent ...
Cornell University - arXiv, 2022
Causal identification is at the core of the causal inference literature, where complete algorithms have been proposed to identify causal queries of interest. The validity of these algorithms hinges on the restrictive assumption of having access to a correctly specified causal structure. In this work, we study the setting where a probabilistic model of the causal structure is available. Specifically, the edges in a causal graph are assigned probabilities which may, for example, represent degrees of belief from domain experts. Alternatively, the uncertainty about an edge may reflect the confidence of a particular statistical test. The question that naturally arises in this setting is: given such a probabilistic graph and a specific causal effect of interest, what is the subgraph which has the highest plausibility and for which the causal effect is identifiable? We show that answering this question reduces to solving an NP-hard combinatorial optimization problem which we call the edge ID problem. We propose efficient algorithms to approximate this problem, and evaluate them against real-world networks and randomly generated graphs.
International Conference on Artificial Intelligence and Statistics, 2020
The discovery of causal relationships is a core part of scientific research. Accordingly, over the past several decades, algorithms have been developed to discover the causal structure for a system of variables from observational data. Learning ancestral graphs is of particular interest due to their ability to represent latent confounding implicitly with bi-directed edges. The well-known FCI algorithm provably recovers an ancestral graph for a system of variables encoding the sound and complete set of causal relationships identifiable from observational data. Additional causal relationships become identifiable with the incorporation of background knowledge; however, it is not known for what types of knowledge FCI remains complete. In this paper, we define tiered background knowledge and show that FCI is sound and complete with the incorporation of this knowledge.
2011
We present two inference rules, based on so-called minimal conditional independencies, that are sufficient to find all invariant arrowheads in a single causal DAG, even when selection bias may be present. It turns out that the set of seven graphical orientation rules that are usually employed to identify these arrowheads are, in fact, just different instances/manifestations of these two rules. The resulting algorithm to obtain the definite causal information is elegant and fast, once the (often surprisingly small) set of minimal independencies is found. (This research was funded by NWO Vici grant 639.023.604.)
National Conference on Artificial Intelligence, 2006
This paper addresses the problem of identifying causal effects from nonexperimental data in a causal Bayesian network, i.e., a directed acyclic graph that represents causal relationships. The identifiability question asks whether it is possible to compute the probability of some set of (effect) variables given intervention on another set of (intervention) variables, in the presence of non-observable (i.e., hidden) variables.
Journal of Applied Logic, 2009
In this paper, I want to substantiate three related claims regarding causal discovery from non-experimental data. Firstly, in scientific practice, the problem of ignorance is ubiquitous, persistent, and far-reaching. Intuitively, the problem of ignorance bears upon the following situation. A set of random variables V is studied but only partly tested for (conditional) independencies; i.e. for some variables A and B it is not known whether they are (conditionally) independent. Secondly, Judea Pearl’s most meritorious and influential algorithm for causal discovery (the IC algorithm) cannot be applied in cases of ignorance. It presupposes that a full list of (conditional) independence relations is on hand and it would lead to unsatisfactory results when applied to partial lists. Finally, the problem of ignorance is successfully treated by means of ALIC, the adaptive logic for causal discovery presented in this paper.
Proceedings of AISTATS, 2001
The Fast Causal Inference (FCI) algorithm searches for features common to observationally equivalent sets of causal directed acyclic graphs. It is correct in the large sample limit with probability one even if there is a possibility of hidden variables and selection bias. In the worst case, the ...
Constraint-based causal discovery algorithms, such as the PC algorithm, rely on conditional independence tests and are otherwise independent of the actual distribution of the data. In the case of continuous variables, the most popular conditional independence test used in practice is the partial correlation test, applicable to variables that are multivariate Normal. Many researchers assume multivariate Normal distributions when dealing with continuous variables, based on a common wisdom that minor departures from multivariate Normality do not have too strong an effect on the reliability of partial correlation tests. We subject this common wisdom to a test in the context of causal discovery and show that partial correlation tests are indeed quite insensitive to departures from multivariate Normality. They therefore provide conditional independence tests that are applicable to a wide range of multivariate continuous distributions.
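The partial correlation test is conventionally carried out with Fisher's z-transform. Below is a minimal sketch on a simulated chain X → Z → Y; the helper names are illustrative, but the test statistic is the standard one:

```python
import math
import random

random.seed(3)

def corr(u, v):
    # Pearson correlation of two equal-length samples.
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    su = math.sqrt(sum((a - mu) ** 2 for a in u))
    sv = math.sqrt(sum((a - mv) ** 2 for a in v))
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (su * sv)

def partial_corr(u, v, w):
    # Partial correlation of u and v given a single conditioning variable w.
    ruv, ruw, rvw = corr(u, v), corr(u, w), corr(v, w)
    return (ruv - ruw * rvw) / math.sqrt((1 - ruw ** 2) * (1 - rvw ** 2))

def fisher_z_stat(r, n, k):
    # Test statistic for H0: (partial) correlation r = 0, from n samples
    # with k conditioning variables; compare to 1.96 for alpha = 0.05.
    z = 0.5 * math.log((1 + r) / (1 - r))
    return abs(z) * math.sqrt(n - k - 3)

# Chain X -> Z -> Y: X and Y are dependent, but independent given Z.
n = 2000
X = [random.gauss(0, 1) for _ in range(n)]
Z = [x + random.gauss(0, 0.5) for x in X]
Y = [z + random.gauss(0, 0.5) for z in Z]

print(fisher_z_stat(corr(X, Y), n, 0))             # large: reject X _||_ Y
print(fisher_z_stat(partial_corr(X, Y, Z), n, 1))  # typically below 1.96: retain X _||_ Y | Z
```

A constraint-based algorithm such as PC calls exactly this kind of test for every candidate independence; the robustness result above says the statistic remains usable even when the data depart moderately from multivariate Normality.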
We present a novel approach to constraint-based causal discovery that takes the form of straightforward logical inference, applied to a list of simple, logical statements about causal relations derived directly from observed (in)dependencies. It is both sound and complete, in the sense that all invariant features of the corresponding partial ancestral graph (PAG) are identified, even in the presence of latent variables and selection bias. The approach shows that every identifiable causal relation corresponds to one of just two fundamental forms. More importantly, as the basic building blocks of the method do not rely on the detailed (graphical) structure of the corresponding PAG, it opens up a range of new opportunities, including more robust inference, detailed accountability, and application to large models.
Uncertainty Proceedings 1991, 1991
The presence of latent variables can greatly complicate inferences about causal relations between measured variables from statistical data. In many cases, the presence of latent variables makes it impossible to determine for two measured variables A and B, whether A causes B, B causes A, or there is some common cause. In this paper I present several theorems that state conditions under which it is possible to reliably infer the causal relation between two measured variables, regardless of whether latent variables are acting or not.