1. Conceptions of Evidence: What does Evidence Refer to in Law?
Stephen (1872: 3–4, 6–7) long ago noted that legal usage of the term
“evidence” is ambiguous. It sometimes refers to that which is adduced
by a party at the trial as a means of establishing factual claims.
(“Adducing evidence” is the legal term for presenting or producing
evidence in court for the purpose of establishing proof.) This meaning of
evidence is reflected in the definitional section of the Indian Evidence
Act (Stephen 1872: 149).[2] When lawyers use the term “evidence” in
this way, they have in mind what epistemologists would think of as
“objects of sensory evidence” (Haack 2004: 48). Evidence, in this sense,
is divided conventionally into three main categories:[3] oral evidence (the
testimony given in court by witnesses), documentary evidence
(documents produced for inspection by the court), and “real evidence”;
the first two are self-explanatory and the third captures things other than
documents such as a knife allegedly used in committing a crime.
The term “evidence” can, secondly, refer to a proposition of fact that is
established by evidence in the first sense.[4] This is sometimes called an
“evidential fact”. That the accused was at or about the scene of the crime
at the relevant time is evidence in the second sense of his possible
involvement in the crime. But the accused’s presence must be proved by
producing evidence in the first sense. For instance, the prosecution may
call a witness to appear before the court and get him to testify that he
saw the accused in the vicinity of the crime at the relevant time. Success
in proving the presence of the accused (the evidential fact) will depend
on the fact-finder’s assessment of the veracity of the witness and the
reliability of his testimony. (The fact-finder is the person or body
responsible for ascertaining where the truth lies on disputed questions of
fact and in whom the power to decide on the verdict vests. The fact-
finder is also called “trier of fact” or “judge of fact”. Fact-finding is the
task of the jury or, for certain types of cases and in countries without a
jury system, the judge.) Sometimes the evidential fact is directly
accessible to the fact-finder. If the knife allegedly used in committing the
crime in question (a form of “real evidence”) is produced in court, the
fact-finder can see for himself the shape of the knife; he does not need to
learn of it through the testimony of an intermediary.
A third conception of evidence is an elaboration or extension of the
second. On this conception, evidence is relational. A factual proposition
(in Latin, factum probans) is evidence in the third sense only if it can
serve as a premise for drawing an inference (directly or indirectly) to a
matter that is material to the case (factum probandum) (see section
2.2 below for the concept of materiality). The fact that the accused’s
fingerprints were found in a room where something was stolen is
evidence in the present sense because one can infer from this that he was
in the room, and his presence in the room is evidence of his possible
involvement in the theft. On the other hand, the fact that the accused’s
favorite color is blue would, in the absence of highly unusual
circumstances, be rejected as evidence of his guilt: ordinarily, what a
person’s favorite color happens to be cannot serve as a premise for any
reasonable inference towards his commission of a crime and, as such, it
is irrelevant (see discussion of relevance in section 2.1 below). In the
third sense of “evidence”, which conceives of evidence as a premise for
a material inference, “irrelevant evidence” is an oxymoron: it is simply
not evidence. Hence, this statement of Bentham (1825: 230):[5]
To say that testimony is not pertinent, is to say that it is foreign to the
case, has no connection with it, and does not serve to prove the fact in
question; in a word, it is to say, that it is not evidence.
There can be evidence in the first sense without evidence in the second
or third sense. To pursue our illustration, suppose that the fingerprint
match was reported by an expert witness and that it emerges during
cross-examination that his testimony of having found a match was a lie.
Lawyers would describe this situation as one where the “evidence” (the
testimony of the expert) fails to prove the fact that it was originally
produced to prove, and not as one where no “evidence”
was adduced on the matter. Here “evidence” is used in the first sense—
evidence as testimony—and the testimony remains in the court’s record
whether it is believed or not. But lawyers would also say that, in the
circumstances, there is no “evidence” that the accused was in the room,
assuming that there was nothing apart from the discredited expert
testimony of a fingerprint match to establish his presence there. Here,
the expert’s testimony is shown to be false and fails to establish that the
accused’s fingerprints were found in the room, and there is no (other)
factual basis for believing that he was in the room. The factual premise
from which an inference is sought to be drawn towards the accused’s
guilt is not established.
Fourthly, the conditions for something to be received (or, in technical
terms, “admitted”) as evidence at the trial are sometimes included in the
legal concept of evidence. (These conditions are discussed in section
2 below.) On this conception, legal evidence is that which counts as
evidence in law. Something may ordinarily be treated as evidence and
yet be rejected by the court. Hearsay is often cited as an example. It is
pointed out that reliance on hearsay is a commonplace in ordinary life.
We frequently rely on hearsay in forming our factual beliefs. In contrast,
“hearsay is not evidence” in legal proceedings (Stephen 1872: 4–5). As a
general rule, the court will not rely on hearsay as a premise for an
inference towards the truth of what is asserted. It will not allow a
witness to testify in court that another person X (who is not brought
before the court) said that p on a certain occasion (an out-of-court
statement) for the purpose of proving that p.
In summary, at least four possible conceptions of legal evidence are in
currency: as an object of sensory evidence, as a fact, as an inferential
premise and as that which counts as evidence in law. The sense in which
the term “evidence” is being used is seldom made explicit in legal
discourse although the intended meaning will often be clear from the
context.
2. Conditions for Receiving Evidence: What Counts as Evidence in
Law?
This section picks up on the fourth conception of evidence. To recall,
something will be accepted by the court as evidence—it is, to use
Montrose’s term, receivable as evidence in legal proceedings—only if
three basic conditions are
satisfied: relevance, materiality and admissibility (Montrose 1954).
These three conditions of receivability are discussed in turn below.
2.1 Relevance
2.1.1 Legal Significance of Relevance
The concept of relevance plays a pivotal role in legal fact-finding.
Thayer (1898: 266, 530) articulates its significance in terms of two
foundational principles of the law of evidence: first, without exception,
nothing which is not relevant may be received as evidence by the court
and secondly, subject to many exceptions and qualifications, whatever is
relevant is receivable as evidence by the court. Thayer’s view has been
influential and finds expression in sources of law, for example, in Rule
402 of the Federal Rules of Evidence in the United States.[6] Thayer
claims, and it is now widely accepted, that relevance is a “logical” and
not a legal concept; in section 2.1.3, we will examine this claim and the
dissent expressed by Wigmore. Leaving aside the dissenting view for the
moment, we will turn first to consider possible conceptions of relevance
in the conventional sense of logical relevance.
2.1.2 Conceptions of Logical Relevance
Evidence may be adduced in legal proceedings to prove a fact only if the
fact is relevant. Relevance is a relational concept. No fact is relevant in
itself; it is relevant only in relation to another fact. The term “probable”
is often used to describe this relation. We see two instances of this in the
following well-known definitions. According to Stephen (1886: 2,
emphasis added):
The word “relevant” means that any two facts to which it is applied are
so related to each other that according to the common course of events
one either taken by itself or in connection with other facts proves or
renders probable the past, present, or future existence or non-existence
of the other.
The second definition is contained in the United States’ Federal Rule of
Evidence 401 which (in its restyled version) states that evidence is
relevant if “it has any tendency to make a fact more or less probable than it
would be without the evidence” (emphasis added). The word “probable”
in these and other standard definitions is sometimes construed as
carrying the mathematical meaning of probability.[7] In a leading article,
Lempert gave this example to show how relevance turns on the
likelihood ratio. The prosecution produces evidence that the
perpetrator’s blood found at the scene of the crime is type A. The
accused has the same blood type. Suppose fifty percent of the suspect
population has type A blood. If the accused is in fact guilty, the
probability that the blood found at the scene will be type A is 1.0. But if
he is in fact innocent, the probability of finding type A blood at the
scene is 0.5—that is, it matches the background probability of type A
blood from the suspect population. The likelihood ratio is the ratio of the
first probability to the second—1.0:0.5 or, more simply, 2:1. Evidence is
considered relevant so long as the likelihood ratio is other than 1:1
(Lempert 1977). If the ratio is 1:1, that means that the probability of the
evidence is the same whether the accused is guilty or innocent.
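The arithmetic in Lempert's example can be checked with a short sketch. The Python below is illustrative only: the function names likelihood_ratio and is_relevant are hypothetical helpers, not part of any legal or statistical library, and the probabilities are those stipulated in the example above.

```python
from fractions import Fraction


def likelihood_ratio(p_given_guilt, p_given_innocence):
    """Ratio of P(evidence | guilt) to P(evidence | innocence)."""
    if p_given_innocence == 0:
        raise ValueError("P(evidence | innocence) must be non-zero")
    return Fraction(p_given_guilt) / Fraction(p_given_innocence)


def is_relevant(ratio):
    """On the likelihood ratio view, any ratio other than 1:1 counts as relevant."""
    return ratio != 1


# Type A blood at the scene: certain if the accused is guilty,
# as likely as the background rate (one half) if he is innocent.
blood_lr = likelihood_ratio(Fraction(1), Fraction(1, 2))
print(blood_lr, is_relevant(blood_lr))  # 2 True

# Evidence that is equally probable on guilt and on innocence gives 1:1.
neutral_lr = likelihood_ratio(Fraction(1, 2), Fraction(1, 2))
print(neutral_lr, is_relevant(neutral_lr))  # 1 False
```

Exact fractions are used so that the comparison with 1:1 is not affected by floating-point rounding.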
The conventional view is that relevance in law is a binary concept:
evidence is either relevant or it is not. So long as the likelihood ratio is
other than 1:1, the evidence is considered relevant.[8] However, the
more the likelihood ratio deviates from 1:1, the higher the so-called
probative value of the evidence (that is, on one interpretation of
probative value). We will take a closer look at probative value in section
3.1 below.
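One way to picture the difference between binary relevance and graded probative value is to measure how far a likelihood ratio lies from 1:1. The sketch below uses the absolute value of the logarithm of the ratio. This particular measure is an assumption made for illustration, chosen because it treats 2:1 and 1:2 as equally distant from 1:1; it is not put forward as the legal or the standard definition of probative value. The ratios other than 2:1 are made-up inputs.

```python
import math


def deviation_from_even(ratio):
    """Distance of a likelihood ratio from 1:1, as the absolute log of the ratio.

    Illustrative measure only (an assumption): 0 at 1:1, and growing as the
    ratio moves away from 1:1 in either direction.
    """
    if ratio <= 0:
        raise ValueError("likelihood ratio must be positive")
    return abs(math.log(ratio))


# Hypothetical ratios, written as guilt:innocence.
ratios = {"1:1": 1.0, "2:1": 2.0, "1:2": 0.5, "10:1": 10.0}
scores = {label: deviation_from_even(r) for label, r in ratios.items()}

for label, score in scores.items():
    relevant = score != 0  # binary relevance: anything other than 1:1
    print(f"{label}: relevant={relevant}, deviation={score:.3f}")
```

On this measure every ratio except 1:1 is relevant, while the deviation score separates the weakly probative 2:1 from the more strongly probative 10:1.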
While the likelihood ratio may be useful as a heuristic device in
analysing evidential reasoning, it is controversial whether it correctly
captures the concept of relevance. In the first place, it is
unclear that the term “probable” in the standard definitions of relevance
was ever intended as a reference to mathematical probability. Some have
argued that relevance should be understood broadly such that any
evidence would count as relevant so long as it provides some reason in
support of the conclusion that a proposition of fact material to the case is
true or false (Pardo 2013: 576–577).
The mathematical conception of relevance has been disputed. At a trial,
it is very common for the opposing sides to present competing accounts
of events that share certain features. To use Allen’s example, the fact
that the accused drove to a particular town on a particular day and time
is consistent with the prosecution’s case that he was driving there to
commit a murder and also with the defence’s case that he was driving
there to visit his mother. This fact, being a common feature of both
sides’ explanations of the material events, is as consistent with the
hypothesis of guilt as with the hypothesis of innocence. On the
likelihood ratio conception of relevance, this fact should be irrelevant
and hence evidence of it should not be allowed to be adduced. But in
such cases, the court will let the evidence in (Park et al. 2010: 10). The
mathematical theory of relevance cannot account for this. It is argued
that an alternative theory of relevance better fits legal practice and is
thus to be preferred. On an explanatory conception of relevance,
evidence is relevant if it is explained by or provides a reason for
believing the particular explanation of the material events offered by the
side adducing the evidence, and it remains relevant even where, as in our
example, the evidence also supports or forms part of the explanation
offered by the opponent (Pardo and Allen 2008: 241–2; Pardo 2013:
600).
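The contrast between the two conceptions in Allen's example can be made concrete with a deliberately crude sketch. The modelling choices are assumptions: each side's account is reduced to a set of stipulated facts, the drive is given probability 1 under both hypotheses because both accounts include it, and the function names are hypothetical. The explanatory test is simplified here to membership in the proponent's account, which is far thinner than the conception defended by Pardo and Allen.

```python
# Each side's account of the material events, reduced to stipulated facts.
prosecution_account = {"drove to the town", "went there to commit a murder"}
defence_account = {"drove to the town", "went there to visit his mother"}


def relevant_on_likelihood_ratio(p_if_guilty, p_if_innocent):
    """Relevant only if the evidence is more or less probable on guilt than on innocence."""
    return p_if_guilty != p_if_innocent


def relevant_on_explanation(fact, proponent_account):
    """Relevant if the fact forms part of the account offered by the side adducing it."""
    return fact in proponent_account


fact = "drove to the town"

# Assumed probabilities: the drive is certain on any account that includes it.
p_if_guilty = 1.0 if fact in prosecution_account else 0.0
p_if_innocent = 1.0 if fact in defence_account else 0.0

lr_verdict = relevant_on_likelihood_ratio(p_if_guilty, p_if_innocent)
explanatory_verdict = relevant_on_explanation(fact, prosecution_account)

print(lr_verdict)           # False
print(explanatory_verdict)  # True
```

The two tests disagree on the shared fact, which is the point of the objection: only the second verdict matches the court's practice of letting the evidence in.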
One possible response to the above challenge to the likelihood ratio
theory of relevance is to deny that it was ever meant to be
the exclusive test of relevance. Evidence is relevant if the likelihood
ratio is other than 1:1. But evidence may also be relevant on other
grounds, such as when it provides for a richer narrative or helps the
court in understanding other evidence. It is for these reasons that
witnesses are routinely allowed to give their names and parties may
present diagrams, charts and floor plans (so-called “demonstrative
evidence”) at the trial (McCormick 2013: 995). The admission of
evidence in the scenario painted by Allen above has been explained
along a similar line (Park et al. 2010: 16).
2.1.3 Logical Relevance versus Legal Relevance
The concept of relevance examined in the preceding section is
commonly known as “logical relevance”. This is something of a
misnomer: “Relevance is not a matter of logic, but depends on matters of
fact” (Haack 2004: 46). In our earlier example, the relevance of the fact
that the accused has type A blood depends obviously on the state of the
world. On the understanding that relevance is a probabilistic relation, it
is tempting to think that in describing relevance as “logical”, one is
subscribing to a logical theory of probability (cf. Franklin 2011).
However, the term “logical relevance” was not originally coined with
this connotation in mind. In the forensic context, “logic” is used loosely
and refers to the stock of background beliefs or generalisations and the
type of reasoning that judges and lawyers are fond of labelling as
“commonsense” (MacCrimmon 2001–2002; Twining 2006: 334–335).
A key purpose of using the adjective “logical” is to flag the non-legal
character of relevance. As Thayer (1898: 269) famously claimed,
relevance “is an affair of logic and not of law.” This is not to say that
relevance has no legal dimension. The law distinguishes between
questions of law and questions of fact. An issue of relevance poses a
question of law that is for the judge to decide and not the jury, and so far
as relevance is defined in legal sources (for example, in Federal Rule of
Evidence 401 mentioned above), the judge must pay heed to the legal
definition. But legal definitions of relevance are invariably very broad.
Relevance is said to be a logical, and non-legal, concept in the sense that
in answering a question of relevance and in applying the definition of
relevance, the judge has necessarily to rely on extra-legal resources and
is not bound by legal precedents. Returning to Federal Rule of Evidence
401, it states generally that evidence is relevant if “it has any tendency to
make a fact more or less probable than it would be without the
evidence”. In deciding whether the evidence sought to be adduced does
have this tendency, the judge has to look outside the law. Thayer was
most insistent on this. As he put it, “[t]he law furnishes no test of
relevancy. For this, it tacitly refers to logic and general experience”
(Thayer 1898: 265). That the accused’s favorite color is blue is, barring
extraordinary circumstances, irrelevant to the question of his intention to
commit theft. It is not the law that tells us so but “logic and general
experience”. On Thayer’s view, the law does not control or regulate the
assessment of relevance; it assumes that judges are already in possession
of the (commonsense) resources to undertake this assessment.
Wigmore adopts a different position. He argues, against Thayer, that
relevance is a legal concept. There are two strands to his contention. The
first is that for evidence to be relevant in law, “a generally higher degree
of probative value” is required “than would be asked in ordinary
reasoning”;
legal relevance denotes…something more than a minimum of probative
value. Each single piece of evidence must have a plus value. (cf.
Pattenden 1996–7: 373)
As Wigmore sees it, the requirement of “plus value” guards against the
jury “being satisfied by matters of slight value, capable of being
exaggerated by prejudice and hasty reasoning” (Wigmore 1983b: 969,
cf. 1030–1031). Opponents of Wigmore acknowledge that there may be
sound policy reasons for excluding evidence of low probative value.
Receiving the evidence at the trial might raise a multiplicity of issues,
take too much time and expense, confuse the jurors or produce undue
prejudice in their mind. When the judge excludes evidence for any of
these reasons, and the judge has the discretion to do so in many
countries, the evidence is excluded despite it being relevant (e.g., United
States’ Federal Rule of Evidence 403). Relevance is a relation between
facts and the aforesaid reasons for exclusion are extrinsic to that relation;
they are grounded in considerations such as limitation of judicial
resources and jury psychology. The notion of “plus value” confuses
relevance with extraneous considerations (James 1941; Trautman 1952).
There is a second strand to Wigmore’s contention that relevance is a
legal concept. Relevance is legal in the sense that the judge is bound by
previously decided cases (“judicial precedents”) when he has to make a
ruling on the relevance of a proposed item of evidence.
So long as Courts continue to declare…what their notions of logic are,
just so long will there be rules of law which must be observed.
(Wigmore 1983a: 691)
Wigmore cites in support the judgment of Cushing C.J. in State v
LaPage where it was remarked:
[T]here are many instances in which the evidence of particular facts as
bearing on particular issues has been so often the subject of discussion in
courts of law, and so often ruled upon, that the united logic of a great
many judges and lawyers may be said to furnish…the best evidence of
what may be properly called common-sense, and thus to acquire the
authority of law. (1876 57 N.H. 245 at 288 [Supreme Court, New
Hampshire])
Wigmore’s position on relevance is strangely at odds with his strong
stand against the judge being bound by judicial precedents in assessing
the weight or credibility of evidence (Wigmore 1913). More
importantly, the second strand of his argument also does not sit well
with the first strand. If, as Wigmore contends, evidence must have a plus
value to make it legally relevant, the court has to consider the probative
value of the evidence and to weigh it against the amount of time and
expense likely to be incurred in receiving the evidence, the availability
of other evidence, the risk of the evidence misleading or confusing the
trier of fact and so forth. Given that the assessment of plus value and,
hence, legal relevance is so heavily contextual, it is difficult to see how a
judicial precedent can be of much value in another case in determining a
point of legal relevance (James 1941: 702).
2.2 Materiality and Facts-in-issue
We have just considered the first condition of receivability, namely,
relevance. That fact A is relevant to fact B is not sufficient to make
evidence of fact A receivable in court. In addition, B must be a
“material” fact. The materiality of facts in a particular case is determined
by the law applicable to that case. In a criminal prosecution, it depends
on the law which defines the offence with which the accused is charged
and at a civil trial, the law which sets out the elements of the legal claim
that is being brought against the defendant (Wigmore 1983a: 15–19;
Montrose 1954: 536–537).
Imagine that the accused is prosecuted for the crime of rape and the
alleged victim’s behaviour (fact A) increases the probability that she had
consented to have sexual intercourse with the accused (fact B). On the
probabilistic theory of relevance that we have considered, A is relevant
to B. Now suppose that the alleged victim is a minor. Under criminal
law, it does not matter whether she had consented to the sexual
intercourse. If B is of no legal consequence, the court will not allow
evidence of A to be adduced for the purpose of proving B: the most
obvious reason is that it is a waste of time to receive the evidence.
Not all material facts are necessarily in dispute. Suppose the plaintiff
sues the defendant for breach of contract. Under the law of contract, to
succeed in this action, the plaintiff must prove the following three
elements: that there was a contract between the parties, that the
defendant was in breach of the contract, and that the plaintiff had
suffered loss as a result of that breach. The defendant may concede that
there was a contract and that he was in breach of it but deny that the
plaintiff had suffered any loss as a result of that breach. In such a
situation, only the last of the material facts is disputed. Following
Stephen’s terminology, a disputed material fact is called a “fact in issue”
(Stephen 1872: 28).
The law does not allow evidence to be adduced to prove facts that are
immaterial or that are not in issue. “Relevance” is often used in the
broader sense that encompasses the concepts under discussion. Evidence
is sometimes described as “irrelevant” not for the reason that no logical
inference can be drawn to the proposition that is sought to be proved (in
our example, A is strictly speaking relevant to B) but because that
proposition is not material or not disputed (in our example, B is not
material).[9] This broader usage of the term “relevance”, though
otherwise quite harmless, does not promote conceptual clarity because it
runs together different concepts (see James 1941: 690–691; Trautman
1952: 386; Montrose 1954: 537).
2.3 Admissibility
2.3.1 Admissibility and Relevance
A further condition must be satisfied for evidence to be received in legal
proceedings. There are legal rules that prohibit evidence from being
presented at a trial even though it is relevant to a factual proposition that
is material and in issue. These rules render the evidence to which they
apply “inadmissible” and require the judge to “exclude” it. Two
prominent examples of such rules of admissibility or rules of exclusion
are the rule against hearsay evidence and the rule against character
evidence. This section considers the relation between the concept of
relevance and the concept of admissibility. The next section (section
2.3.2) discusses general arguments for and against exclusionary or
admissibility rules.
Here, again, the terminology is imprecise. Admissibility and
receivability are not clearly distinguished. It is common
for irrelevant evidence, or evidence of an immaterial fact, to be
described as “inadmissible”. What this means is that the court will refuse
to receive evidence if it is irrelevant or immaterial. But, importantly, the
court also excludes evidence for reasons other than irrelevance and
immateriality. For Montrose, there is merit in restricting the concept of
“inadmissibility” to the exclusion of evidence based on those other
reasons (Montrose 1954: 541–543). If evidence is rejected on the ground
of irrelevance, it is, as Thayer (1898: 515) puts it, “the rule of reason that
rejects it”; if evidence is rejected under an admissibility or exclusionary
rule, the rejection is by force of law. The concepts of admissibility and
materiality should also be kept apart. This is because admissibility or
exclusionary rules serve purposes and rationales that are distinct from
the law defining the crime or civil claim that is before the court and it is
this law that determines the materiality of facts in the dispute.
Thayer (1898: 266, 530) was influential in his view that the law of
evidence has no say on logical relevance and that its main business is in
dealing with admissibility. If the evidence is logically irrelevant, it must
for that reason be excluded. If the evidence is logically relevant, it will
be received by the court unless the law—in the form of an exclusionary
or admissibility rule—requires its exclusion. In this scheme, the concept
of relevance and the concept of admissibility are distinct: indeed,
admissibility rules presuppose the relevance of the evidence to which
they apply.
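Read together with the materiality condition of section 2.2, Thayer's scheme resembles a simple decision procedure. The sketch below is only a schematic rendering: the Proffer class, its field names and the receivable function are hypothetical, and the boolean inputs stand in for judgments that the judge must make by drawing on logic, general experience and the applicable law. The code does not make those judgments itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Proffer:
    """An item of evidence offered to prove a fact (hypothetical model)."""
    description: str
    relevant: bool          # supports an inference to the fact to be proved
    material: bool          # that fact matters under the applicable law
    in_issue: bool          # that fact is disputed between the parties
    excluded_by_rule: bool  # an exclusionary rule bars this use of the evidence


def receivable(p: Proffer) -> bool:
    """Apply the three conditions in turn: relevance, materiality, admissibility."""
    if not p.relevant:
        return False  # rejected by "the rule of reason"
    if not (p.material and p.in_issue):
        return False  # the fact to be proved is immaterial or undisputed
    return not p.excluded_by_rule  # rejected, if at all, by force of law


# Inputs are stipulated to mirror the examples used in the text.
examples = [
    Proffer("favorite color is blue, offered to prove guilt", False, True, True, False),
    Proffer("behaviour suggesting consent, complainant a minor", True, False, False, False),
    Proffer("out-of-court statement that p, offered to prove p", True, True, True, True),
    Proffer("fingerprints found in the room, offered to prove presence", True, True, True, False),
]

results = [receivable(p) for p in examples]
for p, ok in zip(examples, results):
    print(f"{p.description}: {'received' if ok else 'not received'}")
```

Placing the relevance check before the exclusionary-rule check reflects the point that admissibility rules presuppose the relevance of the evidence to which they apply.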
Stephen appears to hold a different view, one in which the concept of
admissibility is apparently absorbed by the concept of relevance. Take,
for example, Stephen’s analysis of the rule that in general no evidence
may be adduced to prove “statements as to facts made by persons not
called as witnesses”, in short, hearsay (Stephen 1872: 122). As a general
rule, no evidence may be given of hearsay because the law prohibits it.
The question then arises as to the rationale for this prohibition.
Stephen’s answer to this question is often taken to be that hearsay is not
“relevant” and he is criticised for failing to see the difference between
relevance and admissibility (Whitworth 1881: 3; Thayer 1898: 266–268;
Pollock 1876, 1899; Wigmore 1983a: §12). His critics point out that
hearsay has or can have probative value and evidence of hearsay is
excluded despite or regardless of its relevance. On the generalisation that
there is no smoke without fire, the fact that a person claimed that p in a
statement made out-of-court does or can have a bearing on the
probability that p, and p may be (logically relevant to) a material fact in
the dispute.
Interestingly, Stephen seemed to have conceded as much. He
acknowledged that a policeman or a lawyer engaged in preparing a case
would be negligent if he were to shut his ears to hearsay. Hearsay is one
of those facts that are “apparently relevant but not really so” (Stephen
1872: 122; see also Stephen 1886: xi). In claiming that hearsay is
irrelevant, Stephen appears to be merely stating the effect of the law: the
law requires that hearsay be treated as irrelevant. He offered a variety of
justifications for excluding hearsay evidence: its admissibility would
“present a great temptation to indolent judges to be satisfied with
second-hand reports” and “open a wide door to fraud”, and “[e]veryone
would be at the mercy of people who might tell a lie, and whose
evidence could neither be tested nor contradicted” (Stephen 1872: 124–
125). For his detractors, these are reasons of policy and fairness and it
disserves clarity to sneak such considerations into the concept of
relevance.
Although there is force to the criticism that Stephen had unhelpfully
conflated admissibility and relevance (understood as logical relevance),
something can perhaps be said in his defence. Exclusionary rules or
rules of admissibility—at any rate, many of them—are more accurately
seen as excluding forms of reasoning rather than prohibiting proof of
certain types of facts (McNamara 1986). This is certainly true of the
hearsay rule. On one authoritative definition of the rule (decision of the
Privy Council in Subramaniam v PP, (1956) 1 Weekly Law
Reports 965), what it prohibits is the use of a hearsay statement to prove
the truth of the facts asserted therein.[10] The objection is to the drawing
of the inference that p from X’s out-of-court statement that p where X is
not available to be examined in court. But the court will allow the
evidence of X’s hearsay statement to be admitted (it will allow proof of
the statement) where the purpose of adducing the evidence is to
persuade the court that X did make the statement and this fact is relevant
for some other purpose. For instance, it may be relevant as to the state of
mind of the person hearing the statement, and his state of mind may be
material to his defence of having acted under duress. Hence, two writers
have commented that “there is no such thing as hearsay evidence, only
hearsay uses” (Roberts and Zuckerman 2010: 385).
Other admissibility rules are also more accurately seen as targeted at
forms of reasoning and not types of facts. In the United States, Federal
Rule of Evidence 404(a)(1) bars the use of evidence of a person’s
character “to prove that on a particular occasion the person acted in
accordance with the character” and Federal Rule of Evidence 404(b)(1)
provides that evidence of a crime or wrong
is not admissible to prove a person’s character in order to show that on a
particular occasion the person acted in accordance with the character.
It is doubtful that evidence of a person’s character and past behaviour
can have no probabilistic bearing on his behaviour on a particular
occasion; on a probabilistic conception of relevance, it is difficult to see
why the evidence is not relevant. Even so, there may be policy, moral or
other reasons for the law to prohibit certain uses of character evidence.
In declaring a fact as irrelevant for a particular purpose, we are not
necessarily saying or implying anything about probability. We may be
expressing a normative judgment. For policy, moral or other reasons, the
law takes the position that hearsay or the accused’s character or previous
misconduct must not be used as the premise for a particular line of
reasoning. The line of reasoning might be morally objectionable (“give a
dog a bad name and hang him for it”) or it might be unfair to permit the
drawing of the inference when the opponent was not given a fair
opportunity to challenge it (as in the hearsay situation) (Ho 2008: chs. 5,
6). If we take a normative conception of relevance instead of a logical or
probabilistic one, it is not an abuse of language to describe inadmissible
evidence as irrelevant if what is meant is that the evidence ought not to
be taken into account in a certain way.
2.3.2 Admissibility or Exclusionary Rules
On one historical account, admissibility or exclusionary rules are the
product of the jury system where citizens untrained in assessing
evidence sit as judges of fact. These rules came about because it was
thought necessary to keep away from inexperienced jurors certain types
of evidence that may mislead or be mishandled by them—for instance,
evidence to which they are likely to give too much weight or that carries
the risk of creating unfair prejudice in their minds (Thayer 1898;
Wigmore 1935: 4–5). Epistemic paternalism is supposedly at play
(Leiter 1997: 814–5; Allen and Leiter 2001: 1502). Subscription to this
theory has generated pressure for the abolition of exclusionary rules with
the decline of the jury system and the replacement of lay persons with
professional judges as triers of fact. There is doubt as to the historical
accuracy of this account; at any rate, it does not appear capable of
explaining the growth of all exclusionary rules (Morgan 1936–37; Nance
1988: 278–294).
Even if the theory is right, it does not necessarily follow that
exclusionary rules should be abolished once the jury system is removed.
Judges may be susceptible to the same cognitive and other failings as
the jury and there may be the additional risk that judges may
overestimate their own cognitive and intellectual abilities in their
professional domain. Hence, there remains a need for the constraints of
legal rules (Schauer 2006: 185–193). But the efficacy of these rules in a
non-jury system is questionable. The procedural reality is that judges
will have to be exposed to the evidence in order to decide on its
admissibility. Since a judge cannot realistically be expected to erase the
evidence from his mind once he has decided to exclude it, there seems
little point in excluding the evidence; we might as well let the evidence
in and allow the judge to give the evidence the probative value that it
deserves (Mnookin 2006; Damaška 2006; cf. Ho 2008: 44–46).
Bentham was a strong critic of exclusionary rules. He was much in
favour of “freedom of proof” understood as free access to information
and the absence of formal rules that restrict such access (Twining 2006:
232, n 65). The direct object of legal procedure is the “rectitude of
decision”, by which he means the correct application of substantive
law to the true facts. The exclusion of relevant evidence—evidence
capable of casting light on the truth—is detrimental to this end. Hence,
no relevant evidence should be excluded; the only exceptions he would
allow are where the evidence is superfluous or its production would
involve preponderant delay, expense or vexation (Bentham 1827: Book
IX; Bentham 1825: Book VII; Twining 1985: ch. 2). Bentham’s
argument has been challenged on various fronts. It is said that he
overvalued the pursuit of truth, undervalued procedural fairness and
procedural rights, and placed too much faith in officials, underestimating
the risk of abuse when they are given discretion unfettered by rules
(Twining 1985: 70–71).
Even if we agree with Bentham that rectitude of decision is the aim of
legal procedure and that achieving accuracy in fact-finding is necessary
to attain this aim, it is not obvious that a rule-based approach to
admissibility will undermine this aim in the long run. Schauer has
defended exclusionary rules of evidence along a rule-consequentialist
line. Having the triers of fact follow rules on certain matters instead of
allowing them the discretion to exercise judgment on a case-by-case
basis may produce the greatest number of favourable outcomes in the
aggregate. It is in the nature of a formal rule that it has to be followed
even when doing so might not serve the background reason for the rule.
If hearsay evidence is thought to be generally unreliable, the interest of
accuracy may be better served overall by requiring such evidence to be
excluded without regard to its reliability in individual cases. Given the
imperfection of human reason and our suspicion about the reasoning
ability of the fact-finder, allowing decisions to be taken individually on
the reliability and admissibility of hearsay evidence might over time
produce a larger proportion of misjudgements than on the rule-based
approach (Schauer 2006: 180–185; Schauer 2008). However, this
argument is based on a large assumption about the likely effects of
having exclusionary rules and not having them, and there is no strong
empirical basis for thinking that the consequences are or will be as
alleged (Goldman 1999: 292–295; Laudan 2006: 121–122).
Other supporters of exclusionary rules build their arguments on a wide
range of considerations. The literature is too vast to survey in
detail; only a few arguments are mentioned here. On one theory, some
exclusionary rules are devices that serve as incentives for lawyers to
produce the epistemically best evidence that is reasonably available
(Nance 1988). For example, if lawyers are not allowed to rely on
second-hand (hearsay) evidence, they will be forced to seek out better
(first-hand) evidence. On another theory, exclusionary rules allocate the
risks of error. Again, consider hearsay. The problem with allowing a
party to rely on hearsay evidence is that the opponent has no opportunity
to cross-examine the original maker of the statement and is thus
deprived of an important means of attacking the reliability of the
evidence. Exclusionary rules in general insulate the party against whom
the evidence is sought to be adduced from the risks of error that the
evidence, if admitted, would have introduced. The distribution of such
risks is said to be a political decision that should not be left to the
discretion of individual fact-finders (Stein 2005; cf. Redmayne 2006 and
Nance 2007a: 154–164). It has also been argued that the hearsay rule
and the accompanying right to confront witnesses promote the public
acceptance and stability of legal verdicts. If the court relies on direct
evidence, it can claim superior access to the facts (having heard from the
horse’s mouth, so to speak) and this also reduces the risk of new
information emerging after the trial to discredit the inference that was
drawn from the hearsay evidence (the original maker of the statement
might turn up after the trial to deny the truth of the statement that was
attributed to him) (Nesson 1985: 1372–1375; cf. Park 1986; Goldman
1999: 282; Goldman 2005: 166–167).
3. Strength of Evidence: the Property of Weight
Section 2 above dealt with the conditions that must be satisfied for a
witness’s testimony, a document or an object to be received as evidence
in legal proceedings. Suppose the judge decides to let the evidence be
presented at the trial. Having heard or seen the evidence, the fact-finder
now has to weigh it in reaching the verdict. Weight can refer to any of
the following three properties of evidence: (a) the probative value of
individual items of evidence, (b) the sufficiency of the whole body of
evidence adduced at the trial in meeting the standard of proof, or (c) the
relative completeness of this body of evidence. The first two aspects of
weight are familiar to legal practitioners but the third has been confined
to academic discussions. These three ideas are discussed in the same
order below.
3.1 Probative Value of Specific Items of Evidence
In reaching the verdict, the trier of fact has to assess the probative value
of the individual items of evidence which have been received at the trial.
The concept of probative value can also play a role at the prior stage
(which was the focus in section 2) where the judge has to make a ruling
on whether to receive the evidence in the first place. In many legal
systems, if the judge finds the probative value of a proposed item of
evidence to be low and substantially outweighed by countervailing
considerations, such as the risk of causing unfair prejudice or confusion,
the judge can refuse to let the jury hear or see the evidence (see, e.g.,
Rule 403 of the United States’ Federal Rules of Evidence).
The concept of probative value (or, as it is also called, probative force) is
related to the concept of relevance. Section 2.1.2 above introduced and
examined the claim that the likelihood ratio is the measure of relevance.
To recapitulate, the likelihood of an item of evidence, E (in our previous
example, the likelihood of a blood type match) given a
hypothesis H (that the accused is in fact guilty) is compared with the
likelihood of E given the negation of H (that the accused is in fact
innocent). Prior to the introduction of E, one may have formed some
belief about H based on other evidence that one already has. This prior
belief does not affect the likelihood ratio since its computation is based
on the alternative assumptions that H is true and that H is false (Kaye
1986a; Kaye and Koehler 2003; cf. Davis and Follette 2002 and 2003).
Rulings on relevance are typically made in the course of the trial as and
when an objection is raised to the introduction of evidence. The
relevance of an item of evidence is supposedly assessed on its own,
without consideration of other evidence (Mnookin 2013: 1544–5).[11]
Probative value, as with relevance, has been explained in terms of the
likelihood ratio (for detailed examples, see Nance and Morris 2002 and
Finkelstein and Levin 2003). It was noted earlier that evidence is either
relevant or not, and, on the prevailing understanding, it is relevant so
long as the likelihood ratio deviates from 1:1. But evidence can be more
or less probative depending on the value of the likelihood ratio. In our
earlier example, the probative value of a blood type match was 1.0:0.5
(or 2:1) as 50% of the suspect population had the same blood type as the
accused. But suppose the blood type is less common and only 25% of
the suspect population has it. The probative value of the evidence is now
1.0:0.25 (or 4:1). In both cases, the evidence is relevant; but the
probative value is greater in the latter than in the former scenario. It is
tempting to describe probative value as the degree of relevance but this
would be misleading as relevance in law is a binary concept.
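The two blood type figures can be checked with a short Python sketch. It
assumes, as the example does, that a guilty accused is certain to match
(probability 1.0) and that an innocent accused matches at the frequency
found in the suspect population; the function names are illustrative
only.

```python
def likelihood_ratio(p_e_given_h: float, p_e_given_not_h: float) -> float:
    """Return P(E|H) / P(E|not-H), the likelihood ratio for evidence E."""
    if not (0 <= p_e_given_h <= 1 and 0 < p_e_given_not_h <= 1):
        raise ValueError("probabilities must lie in [0, 1], with P(E|not-H) > 0")
    return p_e_given_h / p_e_given_not_h


def is_relevant(ratio: float) -> bool:
    """On the prevailing view, evidence is relevant if the ratio is not 1:1."""
    return ratio != 1.0


# Assumption carried over from the example: a guilty accused is certain to match.
lr_common = likelihood_ratio(1.0, 0.50)  # blood type shared by 50% of suspects
lr_rare = likelihood_ratio(1.0, 0.25)    # blood type shared by 25% of suspects

print(f"50% frequency: {lr_common:g}:1, relevant = {is_relevant(lr_common)}")
print(f"25% frequency: {lr_rare:g}:1, relevant = {is_relevant(lr_rare)}")
```

Both ratios differ from 1:1, so the evidence is relevant in each
scenario; only the size of the ratio, and hence the probative value on
this first view, changes.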
There is a second way of thinking about probative value. On the second
view, but not on the first, the probative value of an item of evidence is
assessed contextually. The probative value of E may be low given one
state of the other evidence and substantial given a different body of other
evidence (Friedman 1986; Friedman and Park 2003; cf. Davis and
Follette 2002, 2003). Where the other evidence shows that a woman had
died from falling down an escalator at a mall while she was out
shopping, her husband’s history of spousal battery is unlikely to have
any probative value in proving that he was responsible for her death. But
where the other evidence shows that the wife had died of injuries in the
matrimonial home, and the question is whether the injuries were
sustained from an accidental fall from the stairs or inflicted by the
husband, the same evidence of spousal battery will now have significant
probative value.
On the second view, the probative value of an item of evidence (E) is not
measured simply by the likelihood ratio as it is on the first view.
Probative value is understood as the degree to which E increases (or
decreases) the probability of the proposition or hypothesis (H) in support
of (or against) which E is led. The probative value of E is measured by
the difference between the probability of H given E (the posterior
probability) and the probability of H absent E (the prior probability)
(Friedman 1986; James 1941: 699).
\[ \text{Probative value of } E = P(H \mid E) - P(H) \]
P(H|E) (the posterior probability) is derived by applying Bayes’
theorem, which in its odds form multiplies the prior odds by the
likelihood ratio to give the posterior odds (see discussion in section
3.2.2 below). On the present view, while
the likelihood ratio does not itself measure the probative value of E, it is
nevertheless a crucial component in the assessment.
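A minimal Python sketch of this second measure follows. It uses the odds
form of Bayes’ theorem (posterior odds = prior odds × likelihood ratio)
and then converts back to a probability. The prior probabilities of 0.5
and 0.01 are hypothetical values chosen only to show that the same
likelihood ratio can yield different probative values; the function
names are likewise illustrative.

```python
def posterior_probability(prior: float, likelihood_ratio: float) -> float:
    """Update P(H) to P(H|E) using the odds form of Bayes' theorem."""
    if not 0 < prior < 1:
        raise ValueError("prior must lie strictly between 0 and 1")
    if likelihood_ratio < 0:
        raise ValueError("likelihood ratio must be non-negative")
    prior_odds = prior / (1 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)


def probative_value(prior: float, likelihood_ratio: float) -> float:
    """Return P(H|E) - P(H), the second measure of probative value."""
    return posterior_probability(prior, likelihood_ratio) - prior


# Same likelihood ratio (4:1), two hypothetical states of the other evidence.
for prior in (0.5, 0.01):
    value = probative_value(prior, 4.0)
    print(f"prior = {prior:.2f}, probative value = {value:.4f}")
```

With a likelihood ratio held fixed at 4:1, the difference between
posterior and prior is much larger when the prior is 0.5 than when it is
0.01, which is one way of picturing the contextual character of
probative value on this view.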
A major difficulty with both of the mathematical conceptions of
probative value that we have just examined is that for most evidence,
obtaining the figures necessary for computing the likelihood ratio is
problematic (Allen 1991: 380). Exceptionally, quantitative base-rate
data exist, as in our blood type example. Where objective data is
unavailable, the fact-finder has to draw on background experience and
knowledge to come up with subjective values. In our blood type
example, a critical factor in computing the likelihood ratio was the
percentage of the “suspect population” who had the same blood type as
the accused. “Reference class” is the general statistical term for the role
that the suspect population plays in this analysis. How should the
reference class of “suspect population” be defined? Should we look at
the population of the country as a whole or of the town or the street
where the alleged murder occurred? What if it occurred at an
international airport where most of the people around are foreign visitors?
Or what if it is shown that both the accused and the victim were at the
time of the alleged murder inmates of the same prison? Should we then
take the prison population as the reference class? The distribution of
blood types may differ according to which reference class is selected.
Sceptics of mathematical modelling of probative value emphasize that
data from different reference classes will have different explanatory
power and the choice of the reference class is open to—and should be
subjected to—contextual argument and requires the exercise of
judgment; there is no a priori way of determining the correct reference
class. (On the reference class problem in legal fact finding, see, in
addition to references cited in the rest of this section, Colyvan, Regan
and Ferson 2001; Tillers 2005; Allen and Roberts 2007.)
Some writers have proposed quantifiable ways of selecting, or assisting
in the selection of, the appropriate reference class. On one suggestion,
the court does not have to search for the optimal reference class. A
general characteristic of an adversarial system of trial is that the judge
plays a passive role; it is up to the parties to come up with the arguments
on which they want to rely and to produce evidence in support of their
respective arguments. This adversarial setting makes the reference class
problem more manageable as the court need only decide which of the
reference classes relied upon by the parties is to be preferred. And this
can be done by applying one of a variety of technical criteria that
statisticians have developed for comparing and selecting statistical
models (Cheng 2009). Another suggestion is to use the statistical method
of “feature selection” instead. The ideal reference class is defined by the
intersection of all relevant features of the case, and a feature is relevant
if it is correlated to the matter under enquiry (Franklin 2010, 2011: 559–
561). For instance, if the amount of drug likely to be smuggled is
reasonably believed to co-vary with the airport through which it is
smuggled, the country of origin and the time period, and there is no
evidence that any other feature on which data is available is relevant, the
ideal reference class is the class of drug smugglers passing through that
airport originating from that country and during that time period. Both
suggestions have self-acknowledged limitations: not least, they depend
on the availability of suitable data. Also, as Franklin stresses, while
statistical methods “have advice to offer on how courts should judge
quantitative evidence”, they do so “in a way that supplements normal
intuitive legal argument rather than replacing it by a formula” (Franklin
2010: 22).
The reference class problem is not confined to the probabilistic
assessment of the probative value of individual items of evidence. It is a
general difficulty with a mathematical approach to legal proof. In
particular, the same problem arises on a probabilistic interpretation of
the standard of proof when the court has to determine whether the
standard is met based on all the evidence adduced in the case. This topic
is explored in section 3.2 below but it is convenient at this juncture to
illustrate how the reference class problem can also arise in this
connection. Let it be that the plaintiff sues Blue Bus Company to recover
compensation for injuries sustained in an accident. The plaintiff testifies,
and the court believes on the basis of his testimony, that he was run
down by a recklessly driven bus. Unfortunately, it was dark at the time
and he cannot tell whether the bus belonged to Blue Bus Company.
Assume further that there is also evidence which establishes that Blue
Bus Company owns 75% of the buses in the town where the accident
occurred and the remaining 25% is owned by Red Bus Company. No
other evidence is presented. To use the data as the basis for inferring that
there is 0.75 probability that the bus involved in the accident was owned
by Blue Bus Company would seem to privilege the reference class of
“buses operating in the town” over other possible reference classes such
as “buses plying the street where the accident occurred” or “buses
operating at the time in question” (Allen and Pardo 2007a: 109).
Different reference classes may produce very different likelihood ratios.
It is crucial how the reference class is chosen and this is ultimately a
matter of argument and judgment. Any choice of reference class (other
than the class that shares every feature of the particular incident, which
is, in effect, the unique incident itself) is in principle contestable.
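The sensitivity of the inference to the choice of reference class can be
pictured with a small Python sketch. Only the town-wide figure of 0.75
comes from the example; the shares for the other two reference classes
are invented purely for illustration, as no such data is given.

```python
# Share of buses owned by Blue Bus Company under different reference classes.
# Only the town-wide figure is taken from the example; the other two values
# are hypothetical and serve only to show how the inference can shift.
blue_share_by_reference_class = {
    "buses operating in the town": 0.75,
    "buses plying the street (hypothetical figure)": 0.40,
    "buses operating at the time (hypothetical figure)": 0.90,
}


def meets_balance_of_probabilities(probability: float) -> bool:
    """Probabilistic reading of the civil standard: more likely than not."""
    return probability > 0.5


for reference_class, share in blue_share_by_reference_class.items():
    verdict = "met" if meets_balance_of_probabilities(share) else "not met"
    print(f"{reference_class}: P = {share:.2f}, civil standard {verdict}")
```

Nothing in the arithmetic indicates which row is the right one to use;
that choice remains a matter of argument and judgment.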
Critics of the mathematization of legal proof raise this point as an
example of inherent limitations to the mathematical modelling of
probative value (Allen and Pardo 2007a).[12] Allen and Pardo propose an
alternative, the explanatory theory of legal proof. They claim that this
theory has the advantage of avoiding the reference class problem
because it does not attempt to quantify probative value (Pardo 2005:
374–383; Pardo and Allen 2008: 261, 263; Pardo 2013: 600–601).
Suppose a man is accused of killing his wife. Evidence is produced of
his extra-marital affair. The unique probative value of the accused’s
infidelity cannot be mathematically computed from statistical base rates
of infidelity and uxoricides (husbands murdering wives). In assessing its
probative value, the court should look instead at how strongly the
evidence of infidelity supports the explanation of the material events put
forward by the side adducing the evidence and how strongly it
challenges the explanation offered by the opponent. For instance, the
prosecution may be producing the evidence to buttress its case that the
accused wanted to get rid of his wife so that he could marry his mistress,
and the defence may be advancing the alternative theory that the couple
was unusual in that they condoned extra-marital affairs and had never let
it affect their loving relationship. How much probative value the
evidence of infidelity has depends on the strength of the explanatory
connections between it and the competing hypotheses, and this is not
something that can be quantified.
But the disagreement in this debate is not as wide as it might appear. The
critics concede that formal models for evaluating evidence in law may
be useful. What they object to is
scholarship arguing … that such models establish the correct or accurate
probative value of evidence, and thus implying that any deviations from
such models lead to inaccurate or irrational outcomes. (Allen and Pardo
2007b: 308)
On the other side, it is acknowledged that there are limits to
mathematical formalisation of evidential reasoning in law (Franklin
2012: 238–9) and that context, argument and judgment do play a role in
identifying the reference class (Nance 2007b).
3.2 Sufficiency of Evidence and the Standards of Proof
3.2.1 Mathematical Probability and the Standards of Proof
In section 3.1 above, we concentrated on the weight of evidence in
the sense of probative value of individual items of evidence. The
concept of weight can also apply to the total body of evidence presented
at the trial; here “weight” is commonly referred to as the “sufficiency of
evidence”. The law assigns the legal burden of proof between parties to
a dispute. For instance, at a criminal trial, the accused is presumed
innocent and the burden is on the prosecution to prove that he is guilty
as charged. To secure a conviction, the body of evidence presented at the
trial must be sufficient to meet the standard of proof. Putting this
generally, a verdict will be given in favour of the side bearing the legal
burden of proof only if, having considered all of the evidence, the fact-
finder is satisfied that the applicable standard of proof is met. The
standard of proof has been given different interpretations.
On one interpretation, the standard of proof is a probabilistic threshold.
In civil cases, the standard is the “balance of probabilities” or, as it is
more popularly called in the United States, the “preponderance of
evidence”. The plaintiff will satisfy this standard and succeed in his
claim only if there is, on all the evidence adduced in the case, more than
0.5 probability of his claim being true. At criminal trials, the standard for
a guilty verdict is “proof beyond a reasonable doubt”. Here the
probabilistic threshold is thought to be much higher than 0.5 but courts
have eschewed any attempt at authoritative quantification. Typically, a
notional value, such as 0.9 or 0.95, is assumed by writers for the sake of
discussion. For the prosecution to secure a guilty verdict, the evidence
adduced at the trial must establish the criminal charge to a degree of
probability that crosses this threshold. Where, as in the United States,
there is an intermediate standard of “clear and convincing evidence”
which is reserved for special cases, the probabilistic threshold is said to
lie somewhere between 0.5 and the threshold for proof beyond
reasonable doubt.
Kaplan was among the first to employ decision theory to develop a
framework for setting the probabilistic threshold that represents the
standard of proof. Since the attention in this area of the law tends to be
on the avoidance of errors and their undesirable consequences, he finds
it convenient to focus on disutility rather than utility. The expected
disutility of an outcome is the product of the disutility (broadly, the
social costs) of that outcome and the probability of that outcome. Only
two options are generally available to the court: in criminal cases, it
must either convict or acquit the accused and in civil cases, it has to give
judgment either for the plaintiff or for the defendant. At a criminal trial,
the decision should be made to convict where the expected disutility of a
decision to acquit is greater than the expected disutility of a decision to
convict. This is so as to minimize the expected disutilities. To put this in
the form of an equation:
\[ P \cdot D_{ag} > (1 - P) \cdot D_{ci} \]
P is the probability that the accused is guilty on the basis of all the
evidence adduced in the case, Dag is the disutility of acquitting a guilty
person and Dci is the disutility of convicting an innocent person. A
similar analysis applies to civil cases: the defendant should be found
liable where the expected disutility of finding him not liable when he is
in fact liable exceeds the expected disutility of finding him liable when
he is in fact not liable.
On this approach, a person should be convicted of a crime only
where P is greater than:
\[ \frac{1}{1 + \dfrac{D_{ag}}{D_{ci}}} \]
The same formula applies in civil cases except that the two disutilities
(Dag and Dci) will have to be replaced by their civil equivalents (framed
in terms of the disutility of awarding the judgment to a plaintiff who in
fact does not deserve it and disutility of awarding the judgment to a
defendant who in fact does not deserve it). On this formula, the crucial
determinant of the standard of proof is the ratio of the two disutilities. In
the civil context, the disutility of an error in one direction is deemed
equal to the disutility of an error in the other direction. Hence, a
probability of liability of greater than 0.5 would suffice for a decision to
enter judgment against the defendant (see Redmayne 1996: 171). The
situation is different at a criminal trial. Dci, the disutility of convicting
an innocent person is considered far greater than Dag, the disutility of
acquitting a guilty person.[13] Hence, the probability threshold for a
conviction should be much higher than 0.5 (Kaplan 1968: 1071–1073;
see also Cullison 1969).
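The relation between the ratio of disutilities and the threshold, on
which conviction requires P to exceed 1 / (1 + Dag/Dci), can be sketched
in Python. The disutility figures below are hypothetical units chosen
for illustration; the analysis itself does not assign any particular
values.

```python
def conviction_threshold(d_ag: float, d_ci: float) -> float:
    """Return the probability that P must exceed: 1 / (1 + d_ag / d_ci).

    d_ag: disutility of acquitting a guilty person
    d_ci: disutility of convicting an innocent person
    """
    if d_ag <= 0 or d_ci <= 0:
        raise ValueError("disutilities must be positive")
    return 1 / (1 + d_ag / d_ci)


def should_convict(p_guilt: float, d_ag: float, d_ci: float) -> bool:
    """Convict when the expected disutility of acquitting is the greater one."""
    return p_guilt * d_ag > (1 - p_guilt) * d_ci


# Hypothetical ratios: equal disutilities, then a wrongful conviction
# treated as 9 and as 19 times worse than a wrongful acquittal.
for d_ci in (1.0, 9.0, 19.0):
    print(f"Dci/Dag = {d_ci:g}: threshold = {conviction_threshold(1.0, d_ci):.2f}")
```

Equal disutilities give the civil threshold of 0.5, while ratios of 9:1
and 19:1 correspond to the notional criminal thresholds of 0.9 and 0.95
mentioned earlier.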
An objection to this analysis is that it is incomplete. It is not enough to
compare the costs of erroneous verdicts. The utility of an accurate
conviction and the utility of an accurate acquittal should also be
considered and factored into the equation (Lillquist 2002: 108).[14] This
results in the following modification of the formula for setting the
standard of proof:
\[ \frac{1}{1 + \dfrac{U_{cg} - U_{ag}}{U_{ai} - U_{ci}}} \]
Ucg is the utility of convicting the guilty, Uag is the utility of acquitting
the guilty, Uai is the utility of acquitting the innocent and Uci the utility
of convicting the innocent.
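The modified threshold, 1 / (1 + (Ucg − Uag)/(Uai − Uci)), can be
sketched in Python in the same way. The utility values used are
hypothetical and serve only to show the arithmetic; the sketch also
assumes that convicting the guilty is preferred to acquitting them and
that acquitting the innocent is preferred to convicting them, so that
both differences are positive.

```python
def threshold_with_utilities(u_cg: float, u_ag: float,
                             u_ai: float, u_ci: float) -> float:
    """Return 1 / (1 + (u_cg - u_ag) / (u_ai - u_ci)).

    u_cg: utility of convicting the guilty
    u_ag: utility of acquitting the guilty
    u_ai: utility of acquitting the innocent
    u_ci: utility of convicting the innocent
    """
    gain_if_guilty = u_cg - u_ag      # what is gained by convicting a guilty person
    gain_if_innocent = u_ai - u_ci    # what is gained by acquitting an innocent person
    if gain_if_guilty <= 0 or gain_if_innocent <= 0:
        raise ValueError("accurate verdicts are assumed to be preferred")
    return 1 / (1 + gain_if_guilty / gain_if_innocent)


# Hypothetical utilities, in arbitrary units.
print(f"{threshold_with_utilities(u_cg=1, u_ag=-1, u_ai=1, u_ci=-17):.2f}")

# With no utility assigned to accurate verdicts, the formula reduces to the
# simpler one based on the two disutilities alone (here Dag = 1, Dci = 9).
print(f"{threshold_with_utilities(u_cg=0, u_ag=-1, u_ai=0, u_ci=-9):.2f}")
```

Setting the utilities of accurate verdicts to zero recovers the simpler
formula, which shows that the modification extends the earlier analysis
rather than replacing it.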
Since the relevant utilities depend on the individual circumstances, such
as the seriousness of the crime and the severity of the punishment, the
decision-theoretic account of the standard of proof would seem, on both
the simple and the modified version, to lead to the conclusion that the
probabilistic threshold should vary from case to case (Lillquist 2002;
Bartels 1981; Laudan and Saunders 2009). In other words, the standard
of proof should be a flexible or floating one. This view is perceived to be
problematic.
First, it falls short descriptively. The law requires the court to apply a
fixed standard of proof for all cases within the relevant category. In
theory, all criminal cases are governed by the same high standard and all
civil cases are governed by the same lower standard. That said, it is
unclear whether factfinders in reality adhere strictly to a fixed standard
of proof (see Kaplow 2012: 805–809).
The argument is better interpreted as a normative argument—as
advancing the claim about what the law ought to be and not what it is.
The standard of proof ought to vary from case to case. But this proposal
faces a second objection. For convenience, the objection will be
elaborated in the criminal setting; in principle, civil litigants have the
same two rights that we shall identify. According to Dworkin (1981),
moral harm arises as an objective moral fact when a person is
erroneously convicted of a crime. Moral harm is distinguished from the
bare harm (in the form of pain, frustration, deprivation of liberty and so
forth) that is suffered by a wrongfully convicted and punished person.
While accused persons have the right not to be convicted if innocent,
they do not have the right to the most accurate procedure possible for
ascertaining their guilt or innocence. However, they do have the right
that a certain weight or importance be attached to the risk of moral harm
in the design of procedural and evidential rules that affect the level of
accuracy. Accused persons have the further right to a consistent
weighting of the importance of moral harm and this further right stems
from their right to equal concern and respect. Dworkin’s theory carries
an implication bearing on the present debate. It is arguable that to adopt
a floating standard of proof would offend the second right insofar as it
means treating accused persons differently with respect to the evaluation
of the importance of avoiding moral harm. This difference in treatment
is reflected in the different level of the risk of moral harm to which they
are exposed.
There is a third objection to a floating standard of proof. Picinali (2013)
sees fact-finding as a theoretical exercise that engages the question of
what to believe about the disputed facts. What counts as “reasonable”
for the purposes of applying the standard of proof beyond reasonable
doubt is accordingly a matter for theoretical as opposed to practical
reasoning. Briefly, theoretical reasoning is concerned with what to
believe whereas practical reasoning is about what to do. Only reasons
for belief are germane in theoretical reasoning. While considerations that
bear on the assessment of utility and disutility provide reasons for
action, they are not reasons for believing in the accused’s guilt. Decision
theory cannot therefore be used to support a variable application of the
standard of proof beyond reasonable doubt.
The third criticism of a flexible standard of proof does not directly
challenge the decision-theoretic analysis of the standard of proof. On
that analysis, it would seem that the maximisation of expected utility is
the criterion for selecting the appropriate probabilistic threshold to apply
and it plays no further role in deciding whether that threshold, once
selected, is met on the evidence adduced in the particular case. It is not
incompatible with the decision-theoretic analysis to insist that the
question of whether the selected threshold is met should be governed
wholly by epistemic considerations. However, it is arguable that what
counts as good or strong enough theoretical reason for judging, and
hence believing, that something is true is dependent on the context, such
as what is at stake in believing that it is true. More is at stake at a trial
involving the death penalty than in a case of petty shop-lifting;
accordingly, there should be stronger epistemic justification for a finding
of guilt in the first than in the second case. Philosophical literature on
epistemic contextualism and on interest-relative accounts of knowledge
and justified belief has been drawn upon to support a variant standard of
proof (Ho 2008: ch. 4; see also Amaya 2015: 525–531).[15]
The premise of the third criticism is that the trier of fact has to make a
finding on a disputed factual proposition based on his belief in the
proposition. This is contentious. Beliefs are involuntary; we cannot
believe something by simply deciding to believe it. The dominant view
is that beliefs are context-independent; at any given moment, we cannot
believe something in one context and not believe it in another. On the
other hand, legal fact-finding involves choice and decision making and it
is dependent on the context; for example, evidence that is strong enough
to justify a finding of fact in a civil case may not be strong enough to
justify the same finding in a criminal case where the standard of proof is
higher. It has been argued that the fact-finder has to base his findings not
on what he believes but on what he accepts (Cohen 1991, 1992: 117–125;
Beltrán 2006; cf. Picinali 2013: 868–869). Belief and acceptance are
propositional attitudes: they are different attitudes that one can have in
relation to a proposition. As Cohen (1992: 4) explains,
to accept that p is to have or adopt a policy of deeming, positing or
postulating that p—i.e. of including that proposition or rule among one’s
premises for deciding what to do or think in a particular context.
3.2.2 Objections to Using Mathematical Probability to Interpret
Standards of Proof
Understanding standards of proof in terms of mathematical probabilities
is controversial. It is said to raise a number of paradoxes (Cohen 1977;
Allen 1986, 1991; Allen and Leiter 2001; Redmayne 2008). Let us
return to our previous example. The defendant, Blue Bus Company,
owns 75% of the buses in the town where the plaintiff was injured by a
recklessly driven bus and the remaining 25% is owned by Red Bus
Company. No other evidence is presented. Leaving aside the reference
class problem discussed above, there is a 0.75 probability that the
accident was caused by a bus owned by the defendant. On the
probabilistic interpretation of the applicable standard of proof (that is,
the balance of probabilities), the evidence should be sufficient to justify
a verdict in the plaintiff’s favour. But all lawyers would agree that the
evidence is insufficient. The puzzle is why this is so. Various attempts
have been made to solve this puzzle (for surveys of these attempts, see
Enoch and Fisher 2015: 565–571; Redmayne 2008; Ho 2008: 135–143,
168–170). On the solution offered by Thomson, the statistical evidence
(the 75% ownership of buses) is not causally connected with the fact
sought to be proved (the accident) and as such cannot justify belief in or
knowledge of the fact (Thomson 1986). But it is questionable that the
court should aim at knowledge of the disputed fact and not simply at
accuracy in its finding (Enoch, Spectre and Fisher 2012; Enoch and
Fisher 2015).
There is another paradox in the mathematical interpretation of the
standard of proof. This is the “conjunction paradox”. To succeed in a
civil claim (or a criminal prosecution), the plaintiff (or the prosecution)
will have to prove the material facts—or “elements”—that constitute the
civil claim (or criminal charge) that is before the court (see discussion of
“materiality” in section 2.2 above). Imagine a claim under the law of
negligence that rests on two elements: a breach of duty of care by the
defendant (element A) and causation of harm to the plaintiff (element B).
To win the case, the plaintiff is legally required to prove A and B. For
the sake of simplicity, let A and B be mutually independent events.
Suppose the evidence establishes A to a probability of 0.6 and B to a
probability of 0.7. On the mathematical interpretation of the civil
standard of proof, the plaintiff should succeed in his claim since the
probability with respect to each of the elements exceeds 0.5. However,
according to the multiplication rule of conventional probability calculus,
the probability that A and B are both true is the product of their
respective probabilities; in this example, it is only 0.42 (obtained by
multiplying 0.6 by 0.7). Thus, it is more probable that the
defendant deserves to win than that the plaintiff deserves to win, and yet
the verdict is awarded in favour of the plaintiff.
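The arithmetic behind the paradox can be set out in a few lines of
Python. The sketch takes the probabilities of 0.6 and 0.7 from the
example and assumes, as the example does, that the two elements are
independent; the variable names are illustrative.

```python
from math import prod

CIVIL_THRESHOLD = 0.5  # probabilistic reading of the balance of probabilities

# Probabilities to which each element is established in the example.
element_probabilities = {
    "breach of duty (A)": 0.6,
    "causation (B)": 0.7,
}

# Element-by-element test: each element taken on its own.
each_element_passes = all(
    p > CIVIL_THRESHOLD for p in element_probabilities.values()
)

# Multiplication rule for independent events: P(A and B) = P(A) * P(B).
conjunction = prod(element_probabilities.values())
case_as_whole_passes = conjunction > CIVIL_THRESHOLD

print(f"every element exceeds 0.5: {each_element_passes}")
print(f"probability of A and B together: {conjunction:.2f}")
print(f"case as a whole exceeds 0.5: {case_as_whole_passes}")
```

The element-by-element test is passed while the conjunction falls below
the threshold, which is the divergence that generates the paradox.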
One way of avoiding the conjunction paradox is to take the position that
it should not be enough for each element to cross the probabilistic
threshold; the plaintiff (or the prosecution) should win only if the
probability of the plaintiff’s (or prosecution’s) case as a whole exceeds
the applicable probabilistic threshold. So, in our example, the plaintiff
should lose since the overall probability is below 0.5. But this suggested
solution is unsatisfactory. The required level of overall probability
would then turn on how many elements the civil claim or criminal
charge happens to have. The greater the number of elements, the higher
the level of probability to which, on average, each of them must be
proved. This is thought to be arbitrary and hence objectionable. As two
commentators noted, the legal definition of theft contains more elements
than that for murder. Criminal law is not the same in all countries. We
may take the following as a convenient approximation of what the law is
in some countries: murder is (1) an act that caused the death of a person
(2) that was done with the intention of causing the death, and to
constitute theft, there must be (1) an intention to take property, (2)
dishonesty in taking the property, (3) removal of the property from the
possession of another person and (4) lack of consent by that person.
Since the offence of theft contains twice the number of elements as
compared to murder, the individual elements for theft would have to be
proved to a much higher level of probability (in order for the probability
of their conjunction to cross the overall threshold) than the individual
elements for the much more serious crime of murder (Allen and Leiter
2001: 1504–5). This is intuitively unacceptable.
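The dependence on the number of elements can be made concrete with a rough sketch. It assumes that the elements are independent and that each is proved to the same probability p, so that the conjunction reaches an overall threshold T only when p is at least the n-th root of T. The threshold of 0.9 is a hypothetical figure chosen for illustration; the law does not fix a number for the criminal standard.

```python
# Hypothetical sketch: per-element probability needed for the conjunction
# of n independent, equally probable elements to reach an overall threshold.
def required_per_element(overall_threshold, n_elements):
    """Return p such that p ** n_elements equals overall_threshold."""
    return overall_threshold ** (1.0 / n_elements)


HYPOTHETICAL_THRESHOLD = 0.9  # assumed for illustration only

murder = required_per_element(HYPOTHETICAL_THRESHOLD, 2)  # two elements
theft = required_per_element(HYPOTHETICAL_THRESHOLD, 4)   # four elements

print(round(murder, 3))  # 0.949
print(round(theft, 3))   # 0.974
```

On these assumptions, each element of theft must be proved to a higher probability than each element of murder, whatever overall threshold is chosen.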
Another proposal for resolving the conjunction paradox is to move away
from thinking of the standard of proof as a quantified threshold of
absolute probability and to construe it, instead, as a probability ratio.
The fact-finder has to compare the probability of the evidence adduced
at the trial under the plaintiff’s theory of the case with the probability of
the evidence under the defendant’s theory of the case (the two need not
add to 1), and award the verdict to the side with a higher probability
(Cheng 2013). One criticism of this interpretation of the standard of
proof is that it ignores, and does not provide a basis for ignoring, the
margin by which one probability exceeds the other, and the difference in
probability may vary significantly for different elements of the case
(Allen and Stein 2013: 598).
There is a deeper problem with the probabilistic conception of the
standard of proof. There does not seem to be a satisfactory interpretation
of probability that suits the forensic context. The only plausible
candidate is the subjective meaning of probability according to which
probability is construed as the strength of belief. The evidence is
sufficient to satisfy the legal standard of proof on a disputed question of
fact—for example, it is sufficient to justify the positive finding of fact
that the accused killed the victim—only if the fact-finder, having
considered the evidence, forms a sufficiently strong belief that the
accused killed the victim. Guidance on how to process evidence and
form beliefs can be found in a mathematical theorem known as Bayes’
theorem; it is the method by which an ideal rational fact-finder would
revise or update his beliefs in the light of new evidence.[16] To return to
our earlier hypothetical scenario, suppose the fact-finder initially
believes the odds of the accused being guilty are 1:1 (“prior odds”) or,
putting this differently, that there is a 0.5 probability of guilt. The fact-
finder then receives evidence that blood of type A was found at the
scene of the crime and that the accused has type A blood. Fifty percent
of the population has this blood type. On the Bayesian approach, the
posterior odds are calculated by multiplying the prior odds (1:1) by the
likelihood ratio (which, as we saw in section 2.1.2 above, is 2:1). The
fact-finder’s belief in the odds of guilt should now be revised to 2:1; the
probability of guilt is now increased to 0.67 (Lempert 1977).
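The update in this example can be reproduced in a few lines. The sketch uses the odds form of Bayes' theorem (posterior odds = prior odds × likelihood ratio) and assumes, as the 2:1 likelihood ratio implies, that a match is certain if the accused is guilty and has a probability of 0.5 if he is innocent. The function names are illustrative.

```python
from fractions import Fraction


def update_odds(prior_odds, likelihood_ratio):
    """Odds form of Bayes' theorem: posterior odds = prior odds x likelihood ratio."""
    return prior_odds * likelihood_ratio


def odds_to_probability(odds):
    """Convert odds in favour of a hypothesis into a probability."""
    return odds / (1 + odds)


prior_odds = Fraction(1, 1)  # 1:1, that is, a 0.5 probability of guilt

# Assumed inputs: P(type A match | guilty) = 1, P(type A match | innocent) = 1/2.
likelihood_ratio = Fraction(1) / Fraction(1, 2)

posterior_odds = update_odds(prior_odds, likelihood_ratio)
posterior_probability = odds_to_probability(posterior_odds)

print(posterior_odds)                          # 2
print(posterior_probability)                   # 2/3
print(round(float(posterior_probability), 2))  # 0.67
```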
The subjectivist Bayesian theory of legal fact-finding has come under
attack (see generally Amaya 2015: 82–93; Pardo 2013: 591). First, as we
already saw in section 3.1, ascertainment of the likelihood ratios is
highly problematic. Secondly, the Bayesian theory is not sensitive to the
weight of evidence which, roughly put, is the amount of evidence that is
available. This criticism and the concept of weight are further explored
in section 3.3.
Thirdly, while Bayes’ theorem offers a method for updating
probabilities in the light of new evidence, it is silent on what the initial
probability should be. In a trial setting, the initial probability cannot be
set at zero since this means certainty in the innocence of the accused. No
new evidence can then make any difference; whatever the likelihood
ratio of the evidence, multiplying it by zero (the prior probability) will
still end up with a posterior probability of zero. On the other hand,
starting with a non-zero initial probability is also problematic. This is especially
so in a criminal case. To start a trial with some probability of guilt is to
have the fact-finder harbouring some initial belief that the accused is
guilty and this is not easy to reconcile with the presumption of
innocence. (Tribe 1971: 1368–1372; cf. Posner 1999: 1514, suggesting
starting the trial with prior odds of 50:50, criticized by Friedman 2000.)
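The point about a zero prior can be verified directly. The sketch below uses the probability form of Bayes' theorem for a hypothesis and its negation; the function name and the likelihood ratios tried are illustrative assumptions.

```python
def bayes_update(prior, likelihood_ratio):
    """Posterior probability of a hypothesis, given its prior probability and
    the likelihood ratio of the evidence (hypothesis versus its negation)."""
    numerator = likelihood_ratio * prior
    return numerator / (numerator + (1.0 - prior))


# With a prior of zero, no likelihood ratio, however large, moves the posterior.
for lr in (2, 1_000, 1_000_000):
    print(bayes_update(0.0, lr))  # 0.0 each time

# With a non-zero prior the same evidence does make a difference.
print(round(bayes_update(0.5, 2), 2))  # 0.67
```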
Fourthly, we have thus far relied for ease of illustration on highly
simplified—and therefore unrealistic—examples. In real cases, there are
normally multiple and dependent items of evidence and the probabilities
of all possible conjunctions of these items, which are numerous, will
have to be computed. These computations are far too complex to be
undertaken by human beings (Callen 1982: 10–15). The impossibility of
complying with the Bayesian model undermines its prescriptive value.
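One way to get a sense of the scale involved is to count the joint possibilities. The sketch assumes, purely for illustration, that each item of evidence is treated as a proposition that is either true or false; n such items then have 2 to the power n possible combinations, each of which would need a probability if the items cannot be treated as independent.

```python
def joint_combinations(n_items):
    """Number of true/false combinations of n binary propositions."""
    return 2 ** n_items


for n in (5, 10, 20, 30):
    print(n, joint_combinations(n))
# 5 32
# 10 1024
# 20 1048576
# 30 1073741824
```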
Fifthly, according to Haack, the Bayesian theory has it the wrong way
round. What matters is not the strength of the fact-finder’s belief itself.
The standard of proof should be understood instead in terms of what it is
reasonable for the fact-finder to believe in the light of the evidence
presented, and this is a matter of the degree to which the belief is
warranted by the evidence. Evidence is legally sufficient where it
warrants the contested factual claim to the degree required by law.
Whether a factual claim is warranted by the evidence turns on how
strongly the evidence supports the claim, on how independently secure
the evidence is, and on how much of the relevant evidence is available to
the fact-finder (that is, the comprehensiveness of the evidence—see
further discussion in section 3.3 below). Haack is against identifying
degrees of warrant with mathematical probabilities. Degrees of warrant
do not conform to the axioms of the standard probability calculus. For
instance, where the evidence is weak, neither p nor not-p may be
warranted; in contrast, the probability of p and the probability of not-
p must add up to 1. Further, where the probability of p and the
probability of q are both less than 1, the probability of p and q, being the
product of the probability of p and the probability of q, is less than the
probability of either. On the other hand, the degree of warrant for the
conjunction of p and q may be higher than the warrant for either.[17] (See
Haack 2004, 2008a,b, 2012, 2014 for the legal application of her general
theory of epistemology. For her general theory of epistemology, see
Haack 1993: ch. 4; Haack 2009: ch. 4; Haack 2003: ch. 3.)
Sixthly, research in experimental psychology suggests that fact-finders
do not evaluate pieces of evidence one-by-one and in the unidirectional
manner required under the mathematical model (Amaya 2015: 114–5).
A holistic approach is taken instead where the discrete items of evidence
are integrated into large cognitive structures (variously labelled as
“mental models”, “stories”, “narratives” and “theories of the case”), and
they are assessed globally against the legal definition of the crime or
civil claim that is in dispute (Pennington and Hastie 1991, 1993; Pardo
2000). The reasoning does not progress linearly from evidence to a
conclusion; it is bi-directional, going forward and backward: as the fact-
finder’s consideration of the evidence inclines him towards a particular
verdict, his leaning towards that conclusion will often produce a revision
of his original perception and his assessment of the evidence (Simon
2004, 2011).
The holistic nature of evidential reasoning as revealed by these studies
has inspired alternative theories that are of a non-mathematical nature.
One alternative, already mentioned, is the “explanatory” or “relative
plausibility” theory advanced by Allen together with Pardo and other
collaborators (Allen 1986, 1991, 1994; Pardo 2000; Allen and Leiter
2001; Allen and Jehl 2003; Pardo and Allen 2008; cf. Nance 2001,
Friedman 2001).[18] They contend that fact-finders do not reason in the
fashion portrayed by the Bayesian model. Instead, they engage in
generating explanations or hypotheses on the available evidence by a
process of abductive reasoning or drawing “inferences to the best
explanation”, and these competing explanations or hypotheses are
compared in the light of the evidence.[19] The comparison is not of a
hypothesis with the negation of that hypothesis, where the probability of
a hypothesis is compared with the probability of its negation. Instead,
the comparison is of one hypothesis with one or more particular
alternative hypotheses as advocated by a party or as constructed by the
fact-finder himself. On this approach, the plausibility of X, the factual
account of the case that establishes the accused’s guilt or defendant’s
liability, is compared with the plausibility of a hypothesis Y, a specific
alternative account that points to the accused’s innocence or the
defendant’s non-liability, and there may be more than one such specific
alternative account.
On this theory, the evidence is sufficient to satisfy the preponderance of
proof standard when the best-available hypothesis that explains the
evidence and the underlying events includes all of the elements of the
claim. Thus, in a negligence case, the best-available hypothesis would
have to include a breach of duty of care by the defendant and causation of
harm to the plaintiff as these are the elements that must be proved to
succeed in the legal claim. For the intermediate “clear-and-convincing”
standard of proof, the best-available explanation must be substantially
better than the alternatives. To satisfy the standard of proof beyond
reasonable doubt, there must be a plausible explanation of the evidence
that includes all of the elements of the crime and, in addition, there must
be no plausible explanation that is consistent with innocence (Pardo and
Allen 2008: 238–240; Pardo 2013: 603–604).
The relative plausibility theory itself is perceived to have a number of
shortcomings. First, the theory portrays the assessment of plausibility as
an exercise of judgment that involves employment of various criteria
such as coherence, consistency, simplicity, consilience and more.
However, the theory is sketchy on the meaning of plausibility and the
criteria for evaluating plausibility are left largely unanalyzed.[20]
A second criticism of the relative plausibility theory is that, despite the
purported utilisation of “inference to the best explanation” reasoning, the
verdict is not controlled by the best explanation. For instance, even if the
prosecution’s hypothesis is better than the defence’s hypothesis, neither
may be very good. In these circumstances, the court must reject the
prosecution’s hypothesis even though it is the best of alternatives
(Laudan 2007). One suggested mitigation of this criticism is to place
some demand on the epistemic effort that the trier of fact must take (for
example, by being sufficiently diligent and thorough) in constructing the
set of hypotheses from which the best is to be chosen (Amaya 2009:
155).
The third criticism is targeted at holistic theories of evidential reasoning
in general and not specifically at the relative plausibility theory. While it
may be descriptively true that fact-finders decide verdicts by holistic
evaluation of the plausibility of competing explanations, hypotheses,
narratives or factual theories that are generated from the evidence, such
forms of reasoning may conceal bias and prejudice that stand greater
chances of exposure under a systematic approach such as Bayesian
analysis (Twining 2006: 319; Simon 2004, 2011; Griffin 2013). A
hypothesis constructed by the fact-finder may be shaped subconsciously
by a prejudicial generalisation or background belief about the accused
based on a certain feature, say, his race or sexual history. Individuating
this feature and subjecting it to Bayesian scrutiny has the desirable effect
of putting the generalisation or background belief under the spotlight
and forcing the fact-finder to confront the problem of prejudice.
3.3 The Weight of Evidence as the Degree of Evidential Completeness
A third idea of evidential weight is prompted by this insight from
Keynes (1921: 71):
As the relevant evidence at our disposal increases, the magnitude of the
probability of the argument may either decrease or increase, according
as the new knowledge strengthens the unfavourable or the favourable
evidence; but something seems to have increased in either case,—we
have a more substantial basis upon which to rest our conclusion. I
express this by saying that an accession of new evidence increases
the weight of an argument. New evidence will sometimes decrease the
probability of an argument, but it will always increase its “weight”.
At its simplest, we may think of weight in the context of legal fact-
finding as the amount of evidence before the court. Weight is
distinguishable from probability. The weight of evidence may be high
and the mathematical probability low, as in the situation where the
prosecution adduces a great deal of evidence tending to incriminate the
accused but the defence has an unshakeable alibi (Cohen 1986: 641).
Conversely, the state of evidence adduced in a case might establish a
sufficient degree of probability—high enough to cross the supposed
threshold of proof on the mathematical conception of the standard of
proof—and yet lack adequate weight. In the much-discussed gate-
crasher’s paradox, the only available evidence shows that the defendant
was one of a thousand spectators at a rodeo show and that only four
hundred and ninety-nine tickets were issued. The defendant is sued by
the show organiser for gate-crashing. The mathematical probability that
the defendant was a gate-crasher is 0.501 and this meets the probabilistic
threshold for civil liability. But, according to the negation principle of
mathematical probability, there is a probability of 0.499 that the defendant
did pay for his entrance. In these circumstances, it is intuitively unjust to
find him liable (Cohen 1977: 75). A possible explanation for not finding
him liable is that the evidence is too flimsy or of insufficient weight.
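The figures in the gate-crasher example follow from simple counting, as the sketch below shows. It assumes that every ticket issued was used by a spectator and that, on the evidence available, the defendant is equally likely to be any one of the spectators; the variable names are illustrative.

```python
from fractions import Fraction

spectators = 1000
tickets_issued = 499

# Assumption: every ticket holder attended, so the rest entered without paying.
gate_crashers = spectators - tickets_issued

p_gate_crasher = Fraction(gate_crashers, spectators)
p_paid = 1 - p_gate_crasher  # negation principle

print(float(p_gate_crasher))            # 0.501
print(float(p_paid))                    # 0.499
print(p_gate_crasher > Fraction(1, 2))  # True
```

The threshold is crossed by the narrowest of margins, and the calculation rests on nothing specific to the defendant.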
Proponents of the mathematical conception of the standard of proof have
stood their ground even while acknowledging that weight has a role to
play in the Bayesian analysis of probative value and the sufficiency of
evidence. If a party does not produce relevant evidence that is in his
possession, leaving the court with an evidential deficiency, the court may
draw an adverse inference against him when computing the posterior
probability (Kaye 1986b: 667; Friedman 1997). One criticism of this
approach is that, in the absence of information about the missing
evidence, the drawing of the adverse inference is open to the objection
of arbitrariness (Nance 2008: 274). A further objection is that the
management of parties’ conduct relating to evidence preservation and
presentation should be left to judges and not to the jury. What a judge
may do to optimize evidential weight is to impose a burden of producing
evidence on a party and to make the party suffer an adverse finding of
fact if he fails to produce the evidence. This will serve as an incentive
for the party to act in a manner that promotes the interest in evidential
completeness (Nance 2008, 2010).
Cohen suggests that the standard of proof should be conceived entirely
as a matter of evidential weight which, on his theory, is a matter of the
number of tests or challenges to which a factual hypothesis is subjected
in court. He offers an account of legal fact-finding in terms of an
account of inductive probability that was inspired by the work of writers
such as Francis Bacon and J.S. Mill. Inductive probability operates
differently from the classical calculus of probability. It is based on
inductive support for the common-sense generalisation that licenses the
drawing of the relevant inference. Inductive support for a generalisation
is graded according to the number of tests that it has passed, or, putting
this in another way, by the degree of its resistance to falsification by
relevant variables. The inductive probability of an argument is equal to
the reliability grade of the inductive support for the generalisation which
covers the argument.
Proof beyond reasonable doubt represents the maximum level of
inductive probability. The prosecution may try to persuade the court to
infer that the accused was guilty of burglary by producing evidence to
establish that he was found in the vicinity of the victim’s house late at
night with the stolen object on him. This inference is licensed by the
generalisation that normally if a stranger is found immediately after a
burglary in possession of the stolen object, he intentionally removed it
himself. The defence may try to defeat the inference by showing that the
generalisation does not apply in the particular case, for example, by
presenting evidence to show that the accused had found the object on the
street. The prosecution’s hypothesis is now challenged or put to the test.
As a counter-move, it may produce evidence to establish that the object
could not have been lying in the street as alleged. If the generalisations
on which the prosecution’s case rests survive challenges by the defence at
every possible point, then guilt is proved beyond reasonable doubt.[21]
The same reasoning structure applies in the civil context except that
in a civil case, the plaintiff succeeds in proof on the preponderance of
evidence so long as the conclusion to be proved by him is more
inductively probable than its negation. (Cohen 1977, 1986; cf. Schum
1979.)[22]
Cohen’s theory seems to require that each test to which a hypothesis is
put can be unequivocally and objectively resolved. But usually this is
not the case. In our example, we may not be entirely convinced that the
accused found or did not find the object on the street, and our evaluation
would involve the exercise of judgment that is no less subjective than the
sort of judgments required when applying the standard probabilistic
conception of proof (Nance 2008: 275–6; Schum 1994: 261).