2006, Proceedings of the 22nd Conference in …
Causal discovery from observational data in the presence of unobserved variables is challenging. Identification of so-called Y substructures is a sufficient condition for ascertaining some causal relations in the large-sample limit, without the assumption that there are no hidden common causes. An example of a Y substructure is A → C, B → C, C → D. This paper describes the first asymptotically reliable and computationally feasible score-based search for discrete Y structures that does not assume the absence of unobserved common causes. The search applies to any parameterization of a directed acyclic graph (DAG) whose scores have two properties: any DAG that can represent the distribution beats any DAG that cannot, and of two DAGs that both represent the distribution, the one with fewer parameters wins. In this framework there is no need to assign scores to causal structures with unobserved common causes. The paper also describes how the existence of a Y structure shows the presence of an unconfounded causal relation, without assuming that there are no hidden common causes.
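The Y substructure described above can be checked purely graphically. The sketch below is a minimal, hypothetical illustration (the function name and the simplified criterion — the two parents non-adjacent to each other and to the child of C — are ours, not the paper's exact definition): it enumerates node quadruples of a directed graph given as a set of edges.

```python
from itertools import permutations

def find_y_structures(edges):
    """Enumerate quadruples forming a Y substructure:
    a -> c, b -> c, c -> d, with a and b non-adjacent, and neither
    a nor b adjacent to d (simplified, illustrative criterion)."""
    nodes = {v for e in edges for v in e}
    adjacent = lambda x, y: (x, y) in edges or (y, x) in edges
    found = set()
    for a, b, c, d in permutations(nodes, 4):
        if ((a, c) in edges and (b, c) in edges and (c, d) in edges
                and not adjacent(a, b)
                and not adjacent(a, d) and not adjacent(b, d)):
            found.add((frozenset({a, b}), c, d))  # a/b order irrelevant
    return found

# The paper's example: A -> C, B -> C, C -> D.
edges = {("A", "C"), ("B", "C"), ("C", "D")}
print(find_y_structures(edges))  # one Y structure: ({A, B}, C, D)
```

The point of the structure is that the collider at C orients the A → C and B → C arrowheads, which in turn licenses the causal reading of C → D.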
Lecture Notes in Statistics, 1993
A discovery problem is composed of a set of alternative structures, one of which is the source of the data, but any of which, for all the investigator knows before the inquiry, could be the structure from which the data are obtained. There is something to be found out about the actual structure, whichever it is. It may be that we want to settle a particular hypothesis that is true in some of the possible structures and false in others, or it may be that we want to know the complete theory of a certain sort of phenomenon. In this book, and in much of the social sciences and epidemiology, the alternative structures in a discovery problem are typically directed acyclic graphs paired with joint probability distributions on their vertices. We usually want to know something about the structure of the graph that represents causal influences, and we may also want to know about the distribution of values of variables in the graph for a given population. A discovery problem also includes a characterization of a kind of evidence; for example, data may be available for some of the variables but not others, and the data may include the actual probability or conditional independence relations or, more realistically, simply the values of the variables for random samples. Our theoretical discussions will usually consider discovery problems in which the data include the true conditional independence relations among the measured variables, but our examples and applications will always involve inferences from statistical samples. A method solves a discovery problem in the limit if, as the sample size increases without bound, the method converges to the true answer to the question or to the true theory, whatever it is.
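The conditional independence relations a DAG implies can be read off mechanically with d-separation. A minimal sketch (our own helper, using the standard moral-graph criterion: restrict to ancestors of the query sets, marry parents, drop edge directions, delete the conditioning set, and check connectivity):

```python
def d_separated(dag, xs, ys, zs):
    """Test whether disjoint node sets xs and ys are d-separated
    given zs in a DAG, given as {node: set_of_parents}."""
    # 1. Ancestral subgraph of xs | ys | zs.
    anc, stack = set(), list(set(xs) | set(ys) | set(zs))
    while stack:
        v = stack.pop()
        if v not in anc:
            anc.add(v)
            stack.extend(dag.get(v, ()))
    # 2. Moralize: undirected parent-child and parent-parent edges.
    nbrs = {v: set() for v in anc}
    for v in anc:
        ps = [p for p in dag.get(v, ()) if p in anc]
        for p in ps:
            nbrs[v].add(p); nbrs[p].add(v)
        for i, p in enumerate(ps):
            for q in ps[i + 1:]:
                nbrs[p].add(q); nbrs[q].add(p)
    # 3. Remove zs, then search for any xs-ys path.
    seen, stack = set(), [x for x in xs if x not in zs]
    while stack:
        v = stack.pop()
        if v in ys:
            return False
        if v in seen or v in zs:
            continue
        seen.add(v)
        stack.extend(nbrs.get(v, ()))
    return True

# Chain A -> C -> B: A and B dependent, but independent given C.
dag = {"A": set(), "C": {"A"}, "B": {"C"}}
print(d_separated(dag, {"A"}, {"B"}, set()))   # False
print(d_separated(dag, {"A"}, {"B"}, {"C"}))   # True
```

In the idealized discovery problems discussed above, such implied independencies are exactly the evidence the investigator is given.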
2011
We present two inference rules, based on so-called minimal conditional independencies, that are sufficient to find all invariant arrowheads in a single causal DAG, even when selection bias may be present. It turns out that the set of seven graphical orientation rules usually employed to identify these arrowheads are, in fact, just different manifestations of these two rules. The resulting algorithm for obtaining the definite causal information is elegant and fast, once the (often surprisingly small) set of minimal independencies is found. * This research was funded by NWO Vici grant 639.023.604.
arXiv preprint arXiv:1206.3260, 2012
Abstract: An important task in data analysis is the discovery of causal relationships between observed variables. For continuous-valued data, linear acyclic causal models are commonly used to model the data-generating process, and the inference of such models is a well-studied problem. However, existing methods have significant limitations. Methods based on conditional independencies (Spirtes et al. 1993; Pearl 2000) cannot distinguish between independence-equivalent models, whereas approaches purely based on Independent ...
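The independence-based limitation mentioned above is what non-Gaussianity can break: in a linear acyclic model with non-Gaussian disturbances, the regression residual is independent of the regressor only in the causal direction. A minimal simulated sketch (the coefficient 0.8, the uniform noise, and the crude squared-correlation dependence measure are all illustrative choices of ours, not a specific published estimator):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Linear acyclic model x2 = 0.8 * x1 + e2 with non-Gaussian
# (uniform) disturbances.
e1 = rng.uniform(-1, 1, n)
e2 = rng.uniform(-1, 1, n)
x1 = e1
x2 = 0.8 * x1 + e2

def residual_dependence(cause, effect):
    """Crude dependence score between regressor and regression
    residual: correlation of their squares (zero if independent)."""
    b = np.cov(cause, effect)[0, 1] / np.var(cause)
    resid = effect - b * cause
    return abs(np.corrcoef(cause**2, resid**2)[0, 1])

print(residual_dependence(x1, x2))  # near zero: causal direction
print(residual_dependence(x2, x1))  # clearly nonzero: reverse direction
```

Both regressions produce uncorrelated residuals, so second moments alone cannot distinguish the two directions; the asymmetry only appears in higher-order dependence, which is why Gaussian noise (fully characterized by second moments) is the unidentifiable case.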
Behaviormetrika, 2022
Existing causal discovery algorithms are often evaluated using two success criteria, one that is typically unachievable and the other too weak for practical purposes. The unachievable criterion, uniform consistency, requires that a discovery algorithm identify the correct causal structure at a known sample size. The weak but achievable criterion, pointwise consistency, requires only that one identify the correct causal structure in the limit. We investigate two intermediate success criteria, decidability and progressive solvability, that are stricter than mere consistency but weaker than uniform consistency. To do so, we review several topological theorems characterizing which discovery problems are decidable and/or progressively solvable. These theorems apply to any problem of statistical model selection, but in this paper, we apply the theorems only to selection of causal models. We show, under several common modeling assumptions, that there is no uniformly consistent procedure for identifying the direction of a causal edge, but there are statistical decision procedures and progressive solutions. We focus on linear models in which the error terms are either non-Gaussian or contain no Gaussian components; the latter modeling assumption is novel to this paper. We focus especially on which success criteria remain feasible when confounders are present.
2009
This paper (which is mainly expository) sets up graphical models for causation, having a bit less than the usual complement of hypothetical counterfactuals. Assuming the invariance of error distributions may be essential for causal inference, but the errors themselves need not be invariant. Graphs can be interpreted using conditional distributions, so that we can better address connections between the mathematical framework and causality in the world. The identification problem is posed in terms of conditionals. As will be seen, causal relationships cannot be inferred from a data set by running regressions unless there is substantial prior knowledge about the mechanisms that generated the data. There are few successful applications of graphical models, mainly because few causal pathways can be excluded on a priori grounds. The invariance conditions themselves remain to be assessed.
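The claim that regressions alone cannot deliver causal conclusions is easy to demonstrate by simulation. In this hypothetical setup (coefficients and the confounder U are our illustrative choices), X has no causal effect on Y, yet a naive regression of Y on X reports a strong coefficient; only adjusting for the confounder, which requires prior knowledge of the data-generating mechanism, removes it:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# U confounds X and Y; X has NO causal effect on Y.
u = rng.normal(size=n)
x = u + rng.normal(size=n)
y = 2.0 * u + rng.normal(size=n)

# Naive regression of Y on X finds a strong "effect" anyway.
b_naive = np.cov(x, y)[0, 1] / np.var(x)
print(round(b_naive, 2))  # ~1.0, despite zero causal effect

# Adjusting for the (here conveniently observed) confounder removes it.
X = np.column_stack([np.ones(n), x, u])
b_adj = np.linalg.lstsq(X, y, rcond=None)[0][1]
print(round(b_adj, 2))  # ~0.0
```

The catch, of course, is that in practice U is often unobserved, which is exactly why the prior knowledge the paper insists on cannot be replaced by fitting more regressions.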
Recent approaches to causal modeling rely upon the Causal Markov Condition, which specifies which probability distributions are compatible with a Directed Acyclic Graph (DAG). Further principles are required in order to choose among the large number of DAGs compatible with a given probability distribution. Here we present a principle that we call frugality. This principle tells one to choose the DAG with the fewest causal arrows. We argue that frugality has several desirable properties compared to the other principles that have been suggested, including the well-known Causal Faithfulness Condition.
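Frugality can be stated operationally: among the DAGs compatible with the distribution under the Causal Markov Condition (every independence the DAG implies actually holds), choose one with the fewest arrows. A toy three-variable sketch, with each candidate's implied independencies derived by hand via d-separation (the candidate list and notation are ours, for illustration only):

```python
# Observed: A and B are marginally independent, dependent given C.
observed = {"A _||_ B"}

# (arrow count, independencies the DAG implies) for each candidate.
candidates = {
    "A->C<-B (collider)":      (2, {"A _||_ B"}),
    "A->C->B (chain)":         (2, {"A _||_ B | C"}),
    "A<-C->B (common cause)":  (2, {"A _||_ B | C"}),
    "A->C, B->C, A->B (full)": (3, set()),
}

# Causal Markov Condition: every implied independence must hold.
compatible = {g: arrows for g, (arrows, implied) in candidates.items()
              if implied <= observed}

# Frugality: among compatible DAGs, pick the fewest arrows.
print(min(compatible, key=compatible.get))  # the collider
```

The chain and common-cause structures are ruled out by the Markov Condition (they imply an independence that fails), while the complete DAG is compatible but not frugal; frugality alone then selects the collider.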
The Journal of Machine …, 2006
This paper provides an overview of structural modeling in its close relation to explanation and causation. It stems from previous works by the authors and stresses the role and importance of the notions of invariance, recursive decomposition, exogeneity and background knowledge. It closes with some considerations about the importance of the structural approach for practicing scientists.
We present a novel approach to constraint-based causal discovery, that takes the form of straightforward logical inference, applied to a list of simple, logical statements about causal relations that are derived directly from observed (in)dependencies. It is both sound and complete, in the sense that all invariant features of the corresponding partial ancestral graph (PAG) are identified, even in the presence of latent variables and selection bias. The approach shows that every identifiable causal relation corresponds to one of just two fundamental forms. More importantly, as the basic building blocks of the method do not rely on the detailed (graphical) structure of the corresponding PAG, it opens up a range of new opportunities, including more robust inference, detailed accountability, and application to large models.
Lecture Notes in Computer Science, 2007
We present an algorithm for causal structure discovery suited to the presence of continuous variables. We test a version based on partial correlation that is able to recover the structure of a recursive linear equations model, and compare it to the well-known PC algorithm on large networks. Our algorithm generally outperforms PC in both run time and number of structural errors.
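The key primitive for such continuous-variable methods is the partial correlation, which for jointly Gaussian data vanishes exactly when the corresponding conditional independence holds. A minimal sketch (our own illustration, not the paper's algorithm), reading partial correlations off the inverse covariance matrix for a simulated chain X → Y → Z:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50_000

# Recursive linear equations model (a chain): X -> Y -> Z.
x = rng.normal(size=n)
y = x + rng.normal(size=n)
z = y + rng.normal(size=n)

data = np.column_stack([x, y, z])
prec = np.linalg.inv(np.cov(data, rowvar=False))

def partial_corr(prec, i, j):
    """Partial correlation of variables i and j given all the rest,
    read off the precision (inverse covariance) matrix."""
    return -prec[i, j] / np.sqrt(prec[i, i] * prec[j, j])

print(round(partial_corr(prec, 0, 2), 2))  # ~0: X _||_ Z given Y
print(round(np.corrcoef(x, z)[0, 1], 2))   # clearly nonzero marginally
```

The contrast between the vanishing partial correlation and the substantial marginal correlation is what lets such algorithms delete the X–Z edge while keeping the chain.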