Modelling that exploits visual elements and information visualisation are important areas that have contributed immensely to understanding and to computerisation advances in many domains, yet they remain unexplored for the benefit of the law and legal practice. This paper investigates the challenge of modelling and expressing structures and processes in legislation and the law by using visual modelling and information visualisation (InfoVis) to assist accessibility of legal knowledge, practice and knowledge formalisation as a basis for legal AI. The paper uses a subset of the well-defined Unified Modelling Language (UML) to visually express the structure and process of legislation and the law, creating visual flow diagrams called lawmaps, which form the basis of further formalisation. A lawmap development methodology is presented and evaluated by creating a set of lawmaps for the practice of conveyancing and the Landlord and Tenant Act 1954 of the United Kingdom. This paper...
The graph of a Bayesian Network (BN) can be machine learned, determined by causal knowledge, or a combination of both. In disciplines like bioinformatics, applying BN structure learning algorithms can reveal new insights that would otherwise remain unknown. However, these algorithms are less effective when the input data are limited in terms of sample size, which is often the case when working with real data. This paper focuses on purely machine learned and purely knowledge-based BNs and investigates their differences in terms of graphical structure and how well the implied statistical models explain the data. The tests are based on four previous case studies whose BN structure was determined by domain knowledge. Using various metrics, we compare the knowledge-based graphs to the machine learned graphs generated from various algorithms implemented in TETRAD spanning all three classes of learning. The results show that, while the algorithms produce graphs with much higher model selection scores, the knowledge-based graphs are more accurate predictors of variables of interest. Maximising score fitting is ineffective in the presence of limited sample size because the fitting becomes increasingly distorted with limited data, guiding algorithms towards graphical patterns that share higher fitting scores and yet deviate considerably from the true graph. This highlights the value of causal knowledge in these cases, as well as the need for more appropriate fitting scores suitable for limited data. Lastly, the experiments also provide new evidence that supports the notion that results from simulated data tell us little about actual real-world performance.
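As a minimal sketch of what score-based structure comparison involves (the function name, toy data and graph encoding below are illustrative assumptions, not TETRAD's implementation), the BIC of a candidate discrete BN structure can be computed directly from counts: a maximum-likelihood fit term minus a penalty that grows with the number of free parameters and the sample size.

```python
import math
from collections import Counter

def bic(data, structure, card):
    """BIC score of a discrete BN structure.
    data: list of dicts {var: state}; structure: {var: tuple_of_parents};
    card: {var: number of states}. Higher BIC = better score fit."""
    n = len(data)
    score = 0.0
    for var, parents in structure.items():
        # counts of (parent configuration, child state) and of parent configuration
        joint = Counter((tuple(r[p] for p in parents), r[var]) for r in data)
        marg = Counter(tuple(r[p] for p in parents) for r in data)
        for (pcfg, _), c in joint.items():
            score += c * math.log(c / marg[pcfg])      # max-likelihood term
        n_params = (card[var] - 1) * math.prod(card[p] for p in parents)
        score -= 0.5 * n_params * math.log(n)          # complexity penalty
    return score

# toy data in which A influences B (B equals A in 75% of records)
data = [{"A": a, "B": (a if i % 4 else 1 - a)} for i in range(40) for a in (0, 1)]
card = {"A": 2, "B": 2}
print(bic(data, {"A": (), "B": ("A",)}, card))  # graph A -> B
print(bic(data, {"A": (), "B": ()}, card))      # empty graph scores lower here
```

With ample data like this, the score correctly prefers A -> B; the paper's point is that with limited samples the same score can favour structures far from the true graph.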
When presenting forensic evidence, such as a DNA match, experts often use the likelihood ratio (LR) to explain the impact of evidence. The LR measures the probative value of the evidence with respect to a single hypothesis such as "DNA comes from the suspect", and is defined as the probability of the evidence if the hypothesis is true divided by the probability of the evidence if the hypothesis is false. The LR is a valid measure of probative value because, by Bayes' theorem, the higher the LR, the more our belief in the probability that the hypothesis is true increases after observing the evidence. The LR is popular because it measures the probative value of evidence without having to make any explicit assumptions about the prior probability of the hypothesis. However, whereas the LR can in principle be easily calculated for a distinct single piece of evidence that relates directly to a specific hypothesis, in most realistic situations 'the evidence' is made up of multiple dependent components that impact multiple different hypotheses. In such situations the LR cannot be calculated. However, once the multiple pieces of evidence and hypotheses are modelled as a causal Bayesian network (BN), any relevant LR can be automatically derived using any BN software application.
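The LR definition and the Bayes' theorem update it licenses can be sketched in a few lines (the DNA numbers are illustrative, not from any real case):

```python
def likelihood_ratio(p_e_given_h: float, p_e_given_not_h: float) -> float:
    """LR = P(E | H) / P(E | not H)."""
    return p_e_given_h / p_e_given_not_h

def posterior(prior: float, lr: float) -> float:
    """Update P(H) to P(H | E) via Bayes' theorem in odds form:
    posterior odds = LR * prior odds."""
    prior_odds = prior / (1 - prior)
    post_odds = lr * prior_odds
    return post_odds / (1 + post_odds)

# hypothetical DNA match: P(match | suspect is source) = 1,
# random-match probability = 1 in a million
lr = likelihood_ratio(1.0, 1e-6)   # LR = 1,000,000
print(posterior(0.01, lr))         # a 1% prior becomes ~0.9999
```

The odds form makes the abstract's point concrete: the LR itself needs no prior, but turning it into a posterior probability does.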
Proceedings of the 16th edition of the International Conference on Artificial Intelligence and Law
One of the greatest impediments to the use of probabilistic reasoning in legal arguments is the difficulty in agreeing on an appropriate prior probability for the ultimate hypothesis (in criminal cases this is normally "Defendant is guilty of the crime for which he/she is accused"). Even strong supporters of a Bayesian approach prefer to ignore priors and focus instead on considering only the likelihood ratio (LR) of the evidence. But the LR still requires the decision maker (be it a judge or juror during trial, or anybody helping to determine beforehand whether a case should proceed to trial) to consider their own prior; without it the LR has limited value. We show that, in a large class of cases, it is possible to arrive at a realistic prior that is also as consistent as possible with the legal notion of 'innocent until proven guilty'. The approach can be considered as a formalisation of the 'island problem', whereby if it is known the crime took place on an island where n people were present, then each of the people on the island has an equal prior probability 1/n of having carried out the crime. Our prior is based on simple location and time parameters that determine both a) the crime scene/time (within which it is certain the crime took place) and b) the extended crime scene/time, which is the 'smallest' within which it is certain the suspect was known to have been 'closest' in location/time to the crime scene. The method applies to cases where we assume a crime has taken place and that it was committed by one person against one other person (e.g. murder, assault, robbery). The paper considers both the practical and legal implications of the approach. We demonstrate how the opportunity prior probability is naturally incorporated into a generic Bayesian network model that allows us to integrate other evidence about the case.
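The island-problem prior and its combination with an LR can be illustrated as follows (the population size and LR are hypothetical, and the function names are mine, not the paper's):

```python
def opportunity_prior(n_people: int) -> float:
    """Island problem: n people present at the (extended) crime scene/time,
    exactly one of whom committed the crime -> equal prior 1/n each."""
    return 1.0 / n_people

def posterior(prior: float, lr: float) -> float:
    """Bayes' theorem in odds form: posterior odds = LR * prior odds."""
    post_odds = lr * prior / (1 - prior)
    return post_odds / (1 + post_odds)

# 100 people known to have been in the extended crime scene/time,
# and evidence against the suspect with LR = 1000
p0 = opportunity_prior(100)   # prior = 0.01
print(posterior(p0, 1000))    # posterior ~0.91
```

This shows why the prior matters: the same LR of 1000 yields a very different posterior if the opportunity set had been 10 people rather than 100.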
No comprehensive review of Bayesian networks (BNs) in healthcare has been published in the past, making it difficult to organize the research contributions in the present and identify challenges and neglected areas that need to be addressed in the future. This unique and novel scoping review of BNs in healthcare provides an analytical framework for comprehensively characterizing the domain and its current state. The review shows that: (1) BNs in healthcare are not used to their full potential; (2) a generic BN development process is lacking; (3) limitations exist in the way BNs in healthcare are presented in the literature, which impacts understanding, consensus towards systematic methodologies, practice and adoption of BNs; and (4) a gap exists between having an accurate BN and a useful BN that impacts clinical practice. This review empowers researchers and clinicians with an analytical framework and findings that will enable understanding of the need to address the problems of restricted aims of BNs, ad hoc BN development methods, and the lack of BN adoption in practice. To map the way forward, the paper proposes future research directions and makes recommendations regarding BN development methods and adoption in practice.
Performing efficient inference on high dimensional discrete Bayesian Networks (BNs) is challenging. When using exact inference methods the space complexity can grow exponentially with the tree-width, thus making computation intractable. This paper presents a general purpose approximate inference algorithm, based on a new region belief approximation method, called Triplet Region Construction (TRC). TRC reduces the cluster space complexity for factorized models from worst-case exponential to polynomial by performing graph factorization and producing clusters of limited size. Unlike previous generations of region-based algorithms, TRC is guaranteed to converge and effectively addresses the region choice problem that bedevils other region-based algorithms used for BN inference. Our experiments demonstrate that it also achieves significantly more accurate results than competing algorithms.
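A back-of-the-envelope sketch of the space-complexity contrast described above (the helper names are mine, not the paper's): a junction-tree clique over treewidth-plus-one variables needs a potential table exponential in the tree-width, whereas a TRC-style cluster bounded to three variables has constant size per cluster.

```python
def clique_table_size(treewidth: int, states: int = 2) -> int:
    """Exact (junction-tree) inference stores a potential over each clique;
    a clique of treewidth + 1 variables with `states` states each needs
    states ** (treewidth + 1) entries."""
    return states ** (treewidth + 1)

def triplet_cluster_size(states: int = 2) -> int:
    """A cluster limited to three variables needs only states ** 3 entries,
    regardless of the network's tree-width."""
    return states ** 3

for w in (5, 20, 40):
    print(f"treewidth {w:2d}: clique table {clique_table_size(w):,} entries, "
          f"triplet cluster {triplet_cluster_size()} entries")
```

At tree-width 40 the clique table already has over 2 trillion entries while the bounded cluster stays at 8, which is the intuition behind moving from worst-case exponential to polynomial total space.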
2020 IEEE International Conference on Healthcare Informatics (ICHI), 2020
Advances in both computing power and novel Bayesian inference algorithms have enabled Bayesian Networks (BN) to be applied for decision-support in healthcare and other domains. This work presents CardiPro, a flexible, online application for interfacing with non-trivial causal BN models. Designed especially to make BN use easy for less-technical users like patients and clinicians, CardiPro provides near real-time probabilistic computation. CardiPro was developed as part of the PamBayesian research project (www.pambayesian.org) and represents the first of a new generation of online BN-based applications that may benefit adoption of AI-based clinical decision-support.
This document describes the basic principles of good writing. It targets primarily students and researchers who have to write technical and business reports. However, the principles are relevant to any form of writing, including letters and memos. Therefore, the document contains valuable lessons for anybody wishing to improve their writing skills. The ideas described here are, apart from minor exceptions, not original. They draw on the ideas in a range of excellent books and on the work of various outstanding authors with whom I have worked. Thus, the approach represents a kind of modern consensus. This approach is very different from the style that was promoted by the traditional English schools' system, which encouraged students to write in an unnecessarily complex and formal way. The approach described here emphasises simplicity ('plain English') and informality. For example, it encourages shorter sentences and use of the simplest words and phrases possible. It explains how you c...
This document describes the basic principles of good writing. It is primarily targeted at students and researchers writing technical and business reports, but the principles are relevant to any form of writing, including letters and memos. Therefore, the document contains valuable lessons for anybody wishing to improve their writing skills. The ideas described here are, apart from fairly minor exceptions, not original. They are drawn from a range of excellent books and have also been influenced by various outstanding authors I have worked with. Thus, the approach represents a kind of modern consensus. This approach is very different to the style that was promoted by the traditional English schools' system, which encouraged students to write in an unnecessarily complex and formal way. The approach described here emphasises simplicity ('plain English') and informality. For example, it encourages shorter sentences and use of the simplest words and phrases possible. It explains how you...
Information visualisation creates visual representations that more easily convey meaningful patterns and trends hidden within large and otherwise abstract datasets. Despite potential benefits for understanding and communicating health data, information visualisation in medicine is underdeveloped. This is especially true in midwifery, where no qualitative research exists regarding the impact of different graphs on clinicians' and patients' understanding. This position paper is part of ongoing work investigating this gap and its potential impact. This work reviews a collection of literature from within the midwifery domain. We found almost two-thirds do not use data visualisation approaches to present knowledge realised from data, and those that did were generally restricted to basic bar charts and line graphs. Without effective information visualisation midwives will continue to be constrained by the challenge of trying to see what datasets. Keywords— data visualisation, information ...
An important recent preprint by Griffith et al. highlights how 'collider bias' in studies of COVID19 undermines our understanding of the disease risk and severity. This is typically caused by the data being restricted to people who have undergone COVID19 testing, among whom healthcare workers are overrepresented. For example, collider bias caused by smokers being underrepresented in the dataset may (at least partly) explain empirical results that suggest smoking reduces the risk of COVID19. We extend the work of Griffith et al., making more explicit use of graphical causal models to interpret observed data. We show that their smoking example can be clarified and improved using Bayesian network models with realistic data and assumptions. We show that there is an even more fundamental problem for risk factors like 'stress' which, unlike smoking, is more rather than less prevalent among healthcare workers; in this case, because of a combination of collider bias from the bi...
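The collider mechanism can be reproduced with a toy simulation (every probability below is an illustrative assumption, not an estimate from the Griffith et al. data): smoking and COVID19 are generated independently, yet conditioning on being tested, a variable influenced by both, induces a spurious protective association.

```python
import random

random.seed(0)

def simulate(n=200_000):
    tested = []
    for _ in range(n):
        smoker = random.random() < 0.25
        covid = random.random() < 0.05   # independent of smoking by construction
        # testing is the collider: COVID19 symptoms prompt a test, and (in this
        # toy scenario) smokers are tested more often for other respiratory
        # complaints -- all rates are made up for illustration
        p_test = 0.02 + 0.6 * covid + 0.3 * smoker
        if random.random() < p_test:
            tested.append((smoker, covid))
    rate = lambda s: (sum(c for sm, c in tested if sm == s) /
                      sum(1 for sm, c in tested if sm == s))
    return rate(True), rate(False)

smoker_rate, nonsmoker_rate = simulate()
print(f"COVID19 rate among tested smokers:     {smoker_rate:.3f}")
print(f"COVID19 rate among tested non-smokers: {nonsmoker_rate:.3f}")
# smoking looks strongly 'protective' in the tested subset, despite having
# no effect on COVID19 in the full population
```

Because non-smokers enter the tested subset mainly via symptoms while smokers enter via either route, the positivity rate is diluted for smokers only, which is exactly the kind of artefact the abstract warns about.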
In an era of big data, the general consensus is that relationships between variables of interest surface almost by themselves. Sufficient amounts of data can nowadays reveal new insights that would otherwise have remained unknown. Inferring knowledge from data, however, imposes further challenges. For example, the 2007-08 financial crisis revealed that big-data models used by investment banks and rating agencies for decision making failed to predict real-world financial risk. This is because while such big-data models are excellent at predicting past events, they may fail to predict similar future events that are influenced by new and hence previously unseen factors.
But the real story here is that the paper's accusation of racial bias (specifically that the algorithm is biased against black people) is based on a fundamental misunderstanding of causation and statistics. The algorithm is no more 'biased' against black people than it is biased against white single parents [6], old people [7], people living in Beattyville Kentucky [8], or women called 'Amber' [9]. In fact, as we show below, if you choose any factor that correlates with poverty you will inevitably replicate the statistical 'bias' claimed in the paper. And if you accept the validity of the claims in the paper then you must also accept, for example, that a charity which uses poverty as a factor to identify and help homeless people is being racist because it is biased against white people (and also, interestingly, Indian Americans [10]).
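The claim that any factor correlated with poverty replicates the statistical 'bias' can be checked with a toy simulation (the group label and all probabilities are hypothetical): the outcome depends only on poverty, never on group membership, yet a large rate disparity between groups still appears.

```python
import random

random.seed(1)

def simulate(n=100_000):
    rows = []
    for _ in range(n):
        poor = random.random() < 0.3
        # a group label that is merely CORRELATED with poverty
        group = random.random() < (0.5 if poor else 0.2)
        # the outcome is driven by poverty only -- group is never consulted
        flagged = random.random() < (0.4 if poor else 0.1)
        rows.append((group, flagged))
    rate = lambda g: (sum(f for gr, f in rows if gr == g) /
                      sum(1 for gr, f in rows if gr == g))
    return rate(True), rate(False)

in_group, out_group = simulate()
print(f"flag rate, group members:  {in_group:.3f}")
print(f"flag rate, non-members:    {out_group:.3f}")
# the disparity arises purely because group correlates with poverty
```

Since the simulated decision rule literally cannot see the group label, the observed disparity is evidence of correlation with poverty, not of bias against the group.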
There has been great concern in the UK that people from the BAME (Black And Minority Ethnic) community have a far higher risk of dying from Covid19 than those of other ethnicities. However, the overall fatalities data from the Government's most recent Office for National Statistics (ONS) report on deaths by religion shows that Jews (very few of whom are classified as BAME) have a much higher risk than those of religions (Hindu, Sikh, Muslim) with predominantly BAME people. This apparently contradictory result is, according to the ONS statistical analysis, implicitly explained by age, as the report claims that, when 'adjusted for age', Muslims have the highest fatality risk. However, the report fails to provide the raw data to support this. There are many factors other than just age that must be incorporated into any analysis of the observed data before making definitive conclusions about risk based on religion/ethnicity. We propose the need for a causal model for this. If w...
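The age-adjustment reversal described above is an instance of Simpson's paradox and can be reproduced with made-up numbers (illustrative only, not ONS figures): a group with an older age profile can have the higher crude fatality rate even though its rate within every age band is lower.

```python
# per group: {age_band: (population share, fatality rate in that band)}
group_a = {"under_65": (0.60, 0.001), "over_65": (0.40, 0.020)}  # older profile
group_b = {"under_65": (0.90, 0.002), "over_65": (0.10, 0.030)}  # younger profile

def crude_rate(g):
    """Overall fatality rate: band rates weighted by the group's own age mix."""
    return sum(share * rate for share, rate in g.values())

def age_standardised(g, ref_shares):
    """Apply the group's band rates to a common reference age structure."""
    return sum(ref_shares[band] * g[band][1] for band in g)

ref = {"under_65": 0.75, "over_65": 0.25}
print(crude_rate(group_a), crude_rate(group_b))                        # A looks worse
print(age_standardised(group_a, ref), age_standardised(group_b, ref))  # B is worse
```

Group A's crude rate is higher only because more of its members are in the high-risk age band; once both groups are projected onto the same reference age structure, the ordering reverses. This is why 'adjusted for age' conclusions can differ from the raw fatality counts, and why the abstract argues the underlying data and a causal model are needed.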
Papers by Norman Fenton