2009, Synthese
While standard procedures of causal reasoning, such as procedures analyzing causal Bayesian networks, are custom-built for (non-deterministic) probabilistic structures, this paper introduces a Boolean procedure that uncovers deterministic causal structures. Contrary to existing Boolean methodologies, the procedure advanced here successfully analyzes structures of arbitrary complexity. It roughly involves three parts: first, deterministic dependencies are identified in the data; second, these dependencies are suitably minimalized in order to eliminate redundancies; and third, one or, in case of ambiguities, more than one causal structure is assigned to the minimalized deterministic dependencies.
Kybernetika, 2014
Institute of Mathematics of the Czech Academy of Sciences provides access to digitized documents strictly for personal use. Each copy of any part of this document must contain these Terms of use.
IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans, 1996
This material is posted here with permission of the IEEE. Internal or personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution must be obtained from the IEEE by writing to [email protected].
National Conference on Artificial Intelligence, 2006
This paper addresses the problem of identifying causal effects from nonexperimental data in a causal Bayesian network, i.e., a directed acyclic graph that represents causal relationships. The identifiability question asks whether it is possible to compute the probability of some set of (effect) variables given intervention on another set of (intervention) variables, in the presence of non-observable (i.e., hidden
Studies in Fuzziness and Soft Computing, 2006
We examine the Bayesian approach to the discovery of directed acyclic causal models and compare it to the constraint-based approach. Both approaches rely on the Causal Markov assumption, but the two differ significantly in theory and practice. An important difference between the approaches is that the constraint-based approach uses categorical information about conditional-independence constraints in the domain, whereas the Bayesian approach weighs the degree to which such constraints hold. As a result, the Bayesian approach has three distinct advantages over its constraint-based counterpart. One, conclusions derived from the Bayesian approach are not susceptible to incorrect categorical decisions about independence facts that can occur with data sets of finite size. Two, using the Bayesian approach, finer distinctions among model structures, both quantitative and qualitative, can be made. Three, information from several models can be combined to make better inferences and to better account for modeling uncertainty. In addition to describing the general Bayesian approach to causal discovery, we review approximation methods for missing data and hidden variables, and illustrate differences between the Bayesian and constraint-based methods using artificial and real examples.
Probabilistic abduction extends conventional symbolic abductive reasoning with Bayesian inference methods. This allows for the uncertainty underlying implications to be expressed with probabilities as well as assumptions, thus complementing the symbolic approach in situations where the use of a complete list of assumptions underlying inferences is not practical. However, probabilistic abduction has been of little use in first principle-based applications, such as abductive diagnosis, largely because no methods are available to automate the construction of probabilistic models, such as Bayesian networks (BNs). This paper addresses this issue by proposing a compositional modelling method for BNs.
Probabilistic graphical models are useful tools for modeling systems governed by probabilistic structure. Bayesian networks are one class of probabilistic graphical model that has proven useful both for characterizing formal systems and for reasoning with those systems. Probabilistic dependencies in Bayesian networks are graphically expressed in terms of directed links from parents to their children. Causal Bayesian networks are a generalization of Bayesian networks that allow one to "intervene" and perform "graph surgery", that is, cutting nodes off from their parents. Causal theories are a formal framework for generating causal Bayesian networks. This report provides a brief introduction to the formal tools needed to comprehend Bayesian networks, including probability theory and graph theory. Then, it describes Bayesian networks and causal Bayesian networks. It introduces some of the most basic functionality of the extensive NetworkX Python package for working with complex graphs and networks [HSS08]. I introduce some utilities I have built on top of NetworkX, including conditional graph enumeration and sampling from discrete-valued Bayesian networks encoded in NetworkX graphs [Pac15]. I call this Causal Bayesian NetworkX, or CBNX. I conclude by introducing a formal framework for generating causal Bayesian networks called theory-based causal induction [GT09], out of which these utilities emerged. I discuss the background motivations for frameworks of this sort, their use in computational cognitive science, and the use of computational cognitive science for the machine learning community at large.
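The "graph surgery" idea in the abstract above can be illustrated with a minimal sketch. It deliberately uses plain Python dictionaries rather than NetworkX or CBNX, so it does not reflect the API of either package. The node names and the helpers `intervene` and `edges` are hypothetical and chosen only for illustration.

```python
# Hypothetical example graph: each node maps to the set of its parents.
parents = {
    "rain": set(),
    "sprinkler": {"rain"},
    "wet_grass": {"rain", "sprinkler"},
}


def intervene(parents, targets):
    """Return a new parent map in which every target node has no parents.

    This is the graph-surgery step: incoming links of intervened nodes are
    cut, while all other links are left untouched.
    """
    return {
        node: (set() if node in targets else set(ps))
        for node, ps in parents.items()
    }


def edges(parents):
    """List directed (parent, child) links, sorted for stable output."""
    return sorted((p, child) for child, ps in parents.items() for p in ps)


# Intervening on "sprinkler" removes the link rain -> sprinkler only.
surgered = intervene(parents, {"sprinkler"})
print(edges(parents))
print(edges(surgered))
```

The original parent map is left unmodified, so the observational and the intervened graph can be compared side by side.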
Causal independence modelling is a well-known method both for reducing the size of probability tables and for explaining the underlying mechanisms in Bayesian networks. Many Bayesian network models incorporate causal independence assumptions; however, only the noisy OR and noisy AND, two examples of causal independence models, are used in practice. Their underlying assumption that either at least one cause, or all causes together, give rise to an effect, however, seems unnecessarily restrictive. In the present paper a new, more flexible, causal independence model is proposed, based on the Boolean threshold function. A connection is established between conditional probability distributions based on the noisy threshold model and Poisson binomial distributions, and the basic properties of this probability distribution are studied in some depth. We present and analyse recursive methods as well as approximation and bounding techniques to assess the conditional probabilities in the noisy threshold models.
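The link between a threshold-based causal independence model and the Poisson binomial distribution can be sketched as follows. This is one plausible reading of the abstract, not the paper's own formulation: it assumes each present cause independently "fires" with its own probability and that the effect occurs when at least `k` causes fire. The function names and the example probabilities are hypothetical.

```python
def poisson_binomial_pmf(probs):
    """pmf[j] = P(exactly j of the independent Bernoulli(probs[i]) trials succeed).

    Built recursively: each new trial either leaves the count unchanged
    or raises it by one.
    """
    pmf = [1.0]
    for p in probs:
        nxt = [0.0] * (len(pmf) + 1)
        for j, mass in enumerate(pmf):
            nxt[j] += mass * (1.0 - p)
            nxt[j + 1] += mass * p
        pmf = nxt
    return pmf


def noisy_threshold(probs, k):
    """P(effect) when the effect requires at least k firing causes."""
    return sum(poisson_binomial_pmf(probs)[k:])


probs = [0.9, 0.8, 0.7]  # hypothetical firing probabilities of three present causes
print(noisy_threshold(probs, 1))  # k = 1 coincides with the noisy OR
print(noisy_threshold(probs, 3))  # k = n coincides with the noisy AND
```

Under these assumptions, the thresholds `k = 1` and `k = len(probs)` recover the noisy OR and noisy AND as the two extreme cases, and intermediate values of `k` fill the range between them.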
We discuss a decision-theoretic approach to learning causal Bayesian networks from observational data and experiments. We use the information in observational data to learn a completed partially directed acyclic graph using a structure learning technique, and try to discover the directions of the remaining edges by means of experiment. We show that our approach allows a causal Bayesian network to be learned optimally with respect to a number of decision criteria.
Cybernetics and Systems, 2008
In this paper we describe an important structure used to model causal theories and a related problem of great interest to semi-empirical scientists. A causal Bayesian network is a pair consisting of a directed acyclic graph (called a causal graph) that represents causal relationships and a set of probability tables that, together with the graph, specify the joint probability of the variables represented as nodes in the graph. We briefly describe the probabilistic semantics of causality proposed by Pearl for this graphical probabilistic model, and how unobservable variables greatly complicate models and their application. A common question about causal Bayesian networks is the problem of identifying causal effects from nonexperimental data, which is called the identifiability problem. In the basic version of this problem, a semi-empirical scientist postulates a set of causal mechanisms and uses them, together with a probability distribution on the observable set of variables in a domain of interest, to predict the effect of a manipulation on some variable of interest. We explain this problem, provide several examples, and direct the readers to recent work that provides a solution to the problem and some of its extensions. We assume that the Bayesian network structure is given to us and do not address the problem of learning it from data and the related statistical inference and testing issues.
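As a small illustration of how a graph plus probability tables specifies a joint distribution, and of how the effect of a manipulation differs from ordinary conditioning, consider the sketch below. It assumes a hypothetical three-variable network Z -> X, Z -> Y, X -> Y in which every variable is binary and observed; all table values are invented. It does not address the harder case with unobservable variables that the abstract is concerned with.

```python
from itertools import product

# Hypothetical probability tables for the graph Z -> X, Z -> Y, X -> Y.
p_z = {0: 0.6, 1: 0.4}
p_x_given_z = {0: {0: 0.8, 1: 0.2}, 1: {0: 0.3, 1: 0.7}}  # p_x_given_z[z][x]
p_y1_given_xz = {(0, 0): 0.1, (0, 1): 0.5, (1, 0): 0.4, (1, 1): 0.9}  # P(Y=1 | x, z)


def p_y_given_xz(y, x, z):
    p1 = p_y1_given_xz[(x, z)]
    return p1 if y == 1 else 1.0 - p1


def joint(x, y, z):
    """Joint probability as the product of each node's table given its parents."""
    return p_z[z] * p_x_given_z[z][x] * p_y_given_xz(y, x, z)


def observational(y, x):
    """P(Y=y | X=x), obtained by conditioning on the joint distribution."""
    num = sum(joint(x, y, z) for z in (0, 1))
    den = sum(joint(x, yy, z) for yy, z in product((0, 1), repeat=2))
    return num / den


def interventional(y, x):
    """P(Y=y | do(X=x)): the table for X is dropped from the product."""
    return sum(p_z[z] * p_y_given_xz(y, x, z) for z in (0, 1))


print(observational(1, 1))
print(interventional(1, 1))
```

With these invented numbers the two quantities differ, because Z influences both X and Y; setting X by intervention removes the dependence of X on Z.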
Discrete Bayesian Networks (BNs) have been very successful as a framework both for inference and for expressing certain causal hypotheses. In this paper we present a class of graphical models called the chain event graph (CEG) models, that generalises the class of discrete BN models. This class is suited for representing conditional independence and sample space structures of asymmetric models. It retains many useful properties of discrete BNs, in particular admitting conjugate estimation. It provides a flexible and expressive framework for representing and analysing the implications of causal hypotheses, expressed in terms of the effects of a manipulation of the generating underlying system. We prove that, as for a BN, identifiability analyses of causal effects can be performed through examining the topology of the CEG graph, leading to theorems analogous to the Backdoor theorem for the BN.
Lecture Notes in Computer Science, 1994
The discovery of causal relationships from empirical data is an important problem in machine learning. In this paper the attention is focused on the inference of probabilistic causal relationships, for which two different approaches, namely Glymour et al.'s approach based on constraints on correlations and Pearl and Verma's approach based on conditional independencies, have been proposed. These methods differ both in the kind of constraints they consider while selecting a causal model and in the way they search for the model that better fits the sample data. Preliminary experiments show that they are complementary in several aspects. Moreover, the method of conditional independence can be easily extended to the case in which variables have a nominal or ordinal domain. In this case, symbolic learning algorithms can be exploited in order to derive the causal law from the causal model.
Lecture Notes in Statistics, 1994
The discovery of causal relationships from empirical data is an important problem in machine learning. In this paper the attention is focused on the inference of probabilistic causal relationships, for which two different approaches, namely Glymour et al.'s approach based on constraints on correlations and Pearl and Verma's approach based on conditional independencies, have been proposed. These methods differ both in the kind of constraints they consider while selecting a causal model and in the way they search for the model that better fits the sample data. Preliminary experiments show that they are complementary in several aspects. Moreover, the method of conditional independence can be easily extended to the case in which variables have a nominal or ordinal domain. In this case, symbolic learning algorithms can be exploited in order to derive the causal law from the causal model.
We present two inference rules, based on so-called minimal conditional independencies, that are sufficient to find all invariant arrowheads in a single causal DAG, even when selection bias may be present. It turns out that the seven graphical orientation rules that are usually employed to identify these arrowheads are, in fact, just different instances/manifestations of these two rules. The resulting algorithm to obtain the definite causal information is elegant and fast, once the (often surprisingly small) set of minimal independencies is found. * This research was funded by NWO Vici grant 639.023.604.
The PC algorithm learns maximally oriented causal Bayesian networks. However, there is no equivalent complete algorithm for learning the structure of relational models, a more expressive generalization of Bayesian networks. Recent developments in the theory and representation of relational models support lifted reasoning about conditional independence. This enables a powerful constraint for orienting bivariate dependencies and forms the basis of a new algorithm for learning structure. We present the relational causal discovery (RCD) algorithm that learns causal relational models. We prove that RCD is sound and complete, and we present empirical results that demonstrate effectiveness.
One practical problem with building large-scale Bayesian network models is the exponential growth of the number of numerical parameters in conditional probability tables. Obtaining a large number of probabilities from domain experts is too expensive and too time-consuming in practice. A widely accepted solution to this problem is the assumption of independence of causal influences (ICI), which allows for parametric models that define conditional probability distributions using a number of parameters that is only linear in the number of causes. ICI models, such as the noisy-OR and the noisy-AND gates, have been widely used by practitioners. In this paper we propose PICI, probabilistic ICI, an extension of the ICI assumption that leads to more expressive parametric models. We provide examples of three PICI models and demonstrate how they can cope with a combination of positive and negative influences, something that is hard for noisy-OR and noisy-AND gates.
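The parameter saving that ICI models offer can be seen in a minimal noisy-OR sketch. It illustrates the standard noisy-OR gate only, not the PICI models proposed in the paper. The function names, the optional `leak` parameter, and the example probabilities are assumptions made for illustration.

```python
from itertools import product


def noisy_or(link_probs, present, leak=0.0):
    """P(effect = 1 | causes) under a noisy-OR gate.

    link_probs[i] is the probability that cause i alone produces the effect;
    present[i] says whether cause i is active. The effect is absent only if
    the leak and every active cause all fail independently.
    """
    p_all_fail = 1.0 - leak
    for p, on in zip(link_probs, present):
        if on:
            p_all_fail *= 1.0 - p
    return 1.0 - p_all_fail


def full_cpt(link_probs, leak=0.0):
    """Expand the n link parameters into all 2**n rows of the probability table."""
    n = len(link_probs)
    return {cfg: noisy_or(link_probs, cfg, leak) for cfg in product((0, 1), repeat=n)}


link_probs = [0.8, 0.6, 0.5]  # three parameters, hypothetical values
cpt = full_cpt(link_probs)
print(len(cpt))        # number of table rows generated from three parameters
print(cpt[(1, 1, 0)])  # probability of the effect when causes 1 and 2 are active
```

Here three numbers determine an eight-row table; with n binary causes the table has 2**n rows while the gate needs only n parameters (plus the optional leak).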
Ai Communications, 1997
Causal concepts play a crucial role in many reasoning tasks. Organised as a model revealing the causal structure of a domain, they can guide inference through relevant knowledge. This is an especially difficult kind of knowledge to acquire, so some methods for automating the induction of causal models from data have been put forth. Here we review those that have a graph representation. Most work has been done on the problem of recovering belief nets from data but some extensions are appearing that claim to exhibit a true causal semantics. We will review the analogies between belief networks and "true" causal networks and to what extent methods for learning belief networks can be used in learning causal representations. Some new results in recovering possibilistic causal networks will also be presented.
Reasoning in Bayesian networks is exponential in a graph parameter w* known as induced width (also known as tree-width and max-clique size). In this paper, we investigate the potential of causal independence (CI) for improving this performance.
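To make the notion of induced width concrete, the sketch below computes the width induced by a given elimination ordering on a small undirected graph. It assumes the input edges already describe the undirected (moralized) graph of the network; the function name and the example graph are hypothetical, and the brute-force minimum over orderings is only practical for tiny graphs.

```python
from itertools import combinations, permutations


def induced_width(edges, order):
    """Width induced by eliminating nodes in the given order.

    Eliminating a node connects all of its remaining neighbours to each
    other; the width is the largest neighbour count seen at elimination time.
    """
    adj = {v: set() for v in order}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    width = 0
    for v in order:
        nbrs = adj.pop(v)
        width = max(width, len(nbrs))
        for a, b in combinations(nbrs, 2):
            adj[a].add(b)
            adj[b].add(a)
        for n in nbrs:
            adj[n].discard(v)
    return width


chain = [("A", "B"), ("B", "C")]  # hypothetical three-node chain
print(induced_width(chain, ["A", "B", "C"]))
print(induced_width(chain, ["B", "A", "C"]))

# The induced width of the graph is the minimum over all orderings.
best = min(induced_width(chain, list(p)) for p in permutations("ABC"))
print(best)
```

The two orderings give different widths for the same graph, which is why the choice of elimination order matters for the cost of inference.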
Arxiv preprint cs/9612101, 1996
A new method is proposed for exploiting causal independencies in exact Bayesian network inference. A Bayesian network can be viewed as representing a factorization of a joint probability into the multiplication of a set of conditional probabilities. We present a notion of causal independence that enables one to further factorize the conditional probabilities into a combination of even smaller factors and consequently obtain a finer-grain factorization of the joint probability. The new formulation of causal independence lets us specify the conditional probability of a variable given its parents in terms of an associative and commutative operator, such as "or", "sum" or "max", on the contribution of each parent. We start with a simple algorithm VE for Bayesian network inference that, given evidence and a query variable, uses the factorization to find the posterior distribution of the query. We show how this algorithm can be extended to exploit causal independence. Empirical studies, based on the CPCS networks for medical diagnosis, show that this method is more efficient than previous methods and allows for inference in larger networks than previous algorithms.
We consider two variables that are related to each other by an invertible function. While it has previously been shown that the dependence structure of the noise can provide hints to determine which of the two variables is the cause, we presently show that even in the deterministic (noise-free) case, there are asymmetries that can be exploited for causal inference. Our method is based on the idea that if the function and the probability density of the cause are chosen independently, then the distribution of the effect will, in a certain sense, depend on the function. We provide a theoretical analysis of this method, showing that it also works in the low noise regime, and link it to information geometry. We report strong empirical results on various real-world data sets from different domains.
Demonstratio Mathematica, 1999
Spirtes, Glymour and Scheines [19] formulated a Conjecture that a direct dependence test and a head-to-head meeting test would suffice to construe directed acyclic graph decompositions of a joint probability distribution (Bayesian network) for which Pearl's d-separation [2] applies. This Conjecture was later shown to be a direct consequence of a result of Pearl and Verma [21] (cited as Theorem 1 in [13]; see also Theorem 3.4 in [20]). This paper is intended to prove this Conjecture in a new way, by introducing the concept of p-d-separation (partial dependency separation). While Pearl's d-separation works with Bayesian networks, p-d-separation is intended to apply to causal networks, that is, partially oriented networks in which orientations are given only to those edges that express statistically confirmed causal influence, whereas undirected edges express the existence of direct influence without the possibility of determining the direction of causation. As a consequence of the particular way of proving the validity of this Conjecture, an algorithm for construction of all the directed acyclic graphs (dags) carrying the available independence information is also presented. The notion of a partially oriented graph (pog) is introduced and within this graph the notion of p-d-separation is defined. It is demonstrated that the p-d-separation within the pog is equivalent to d-separation in all derived dags.