Showing posts with label analytic confidence.

Monday, November 18, 2019

Chapter 2: In Which The Brilliant Hypothesis Is Confounded By Damnable Data

"Stop it, Barsdale!  You're introducing confounds into my experiment!"
A little over a month ago, I wrote a post that asked if the form of an estimative statement mattered in terms of communicating its content with regard to analytic confidence.  Specifically, I asked people to determine which of the following was "more clear" in response to the question, "Do you think the Patriots will win this week?":
  • "It's a low confidence estimate, but the Patriots are very likely to win this week."
  • "The Patriots are very likely to win this week.  This is a low confidence estimate, however."
I posted this as an informal survey and 72 people kindly took the time to take it.  Here are the results:

[Chart: survey results]

At first glance, the results appear to be less than robust.  The difference measured here is unlikely to be statistically significant.  Even if it is, the effect size does not appear to be that large.  The one thing that seems clear is that there is no clear preference.

Or is there?


Just like every PhD candidate who ever got disappointing results from an experiment, I have spent the last several weeks trying to rationalize the results away--to find some damn lipstick and get it on this pig!


I think I finally found something which soothes my aching ego a bit.  The fundamental assumption of these kinds of survey questions is that, in theory, both answers are equally likely.  Indeed, this sort of A/B testing is done precisely because the asker does not know which one the client/customer/etc. will prefer.

This assumption might not hold in this case.  Statements of analytic confidence are, in my experience, rare in any kind of estimative work (although they have become a bit more common in recent years).  When they are included, however, it is almost always at the end of the estimate.  Indeed, one of those who took the survey (and preferred the first statement above) commented that putting the statement of analytic confidence at the end "is actually how it would be presented in most IC agencies, but whipsaws the reader."

How might the comfort of this familiarity change the results?  On the one hand, I have no knowledge of who took my survey (though most of my readers seem to be at least acquainted in passing with intelligence and estimates).  On the other hand, there is some pretty good evidence (and some common sense thinking) that documents the power of the familiarity heuristic, or our preference for the familiar over the unfamiliar.  In experiments, the kind of thing that can throw your results off is known as a confound.

More important than familiarity with where the statement of analytic confidence traditionally goes in an estimate, however, might be another rule of estimative writing and another confound:  BLUF.

Bottom Line Up Front (or BLUF) style writing is a staple of virtually every course on estimative or analytic writing.  "Answer the question and answer it in the first sentence" is something that is drummed into most analysts' heads from birth (or shortly thereafter).  Indeed, the single most common type of comment from those who preferred the version with the statement of analytic confidence at the end was, as this one survey taker said, "You asked about the Patriots winning - the...response mentions the Patriots - the topic - within the first few words."
Note:  Ellipses seem important these days; the ones in the sentence above mark where I took out the word "first."  I randomized the two statements in the survey so that they did not always come up in the same order.  Thus, this particular respondent saw the second statement above (the one with the statement of analytic confidence at the end) first.
If the base rate of the two answers is not 50-50 but rather 40-60 (or worse, in favor of the more familiar, more BLUFy answer), then these results could easily become statistically significant.  It would be like winning a football game you were expected to lose by 35 points!
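
For the statistically minded, the base-rate point is easy to make concrete.  The sketch below is hypothetical (the 40-of-72 split is made up; the real numbers are in the chart above) and just shows how the same result can be unremarkable against a 50-50 null yet significant against a 40-60 one:

```python
# A minimal sketch, not the actual survey analysis: the 40/72 split below
# is hypothetical.  The question is how the same split fares against two
# different null hypotheses about the a priori base rate.
from scipy.stats import binomtest

n = 72  # survey respondents (from the post)
k = 40  # hypothetical votes for the confidence-first statement

# Null 1: both forms equally likely a priori (the usual A/B assumption).
print(binomtest(k, n, p=0.5).pvalue)  # a modest, unremarkable edge

# Null 2: the familiar, BLUF-style form is favored 60-40 a priori, so the
# confidence-first form should draw only 40% of the votes.
print(binomtest(k, n, p=0.4).pvalue)  # the same edge now looks like an upset
```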

Thus, like all good dissertations, the only real conclusion I have come to is that the "topic needs more study."

Joking aside, it is an important topic.  As you likely know, it is not enough to just make an estimate.  It is also important to include a statement of analytic confidence.  To do anything less in formal estimates is to be intellectually dishonest to whoever is making real decisions based on your analysis.  I don't think that anyone would disagree that form can have a significant impact on how the content is received.  The real questions are how does form impact content and to what degree?  Getting at those questions in the all-important area of formal estimative writing is truly something well worth additional study.

Tuesday, October 1, 2019

Analytic Confidence And The New England Patriots: A Hypothesis

"Don't try to stop me!  I'm having a thought!"  (Image Source)
I was driving to work this morning, thinking about analytic confidence (as one does), and I had a thought.  An interesting thought.  Before I tell you what it was, you need to take the one-question survey at the link below to see if my thought has any merit (I will post the results as a follow-up to this post):

Which statement seems more clear to you?

Did you take the survey?  If not, go back and take it!

And now?  

OK!  Thanks!

People are often confused by the difference between an estimate and confidence in that estimate.  This confusion is driven, in very large part, by the way the terms are often (mis)used in formal analytic writing.  It is not uncommon to see someone talk about their confidence when they are really making an estimate or, less commonly, to use estimative language to convey confidence.  

The two concepts, however, are very different.  The estimate communicates what you think is likely (or unlikely) to happen in the future.  Confidence speaks to the likelihood that something is mucking up the process used to establish that estimate.  

This is where the New England Patriots come in.  For example, I think it is very likely that the New England Patriots will win their next game (Note:  I am using the term "very likely" here the same way the DNI does).  I watch football but am by no means an expert.  I don't even know who the Patriots are playing next week.  I just know that they are usually a good team, and that they usually win a lot of games.  So, while I still think it is very likely that the Patriots will win, my confidence in that estimate is low.  The process I used for deriving that estimate was so weak, I won't be surprised to find out that they have a bye next week.

On the other hand, it is easy to imagine a forecaster who is steeped in football lore.  This hypothetical forecaster has an excellent track record derived in large part from a highly structured and efficient process for determining the odds of a victory.  This forecaster might say exactly the same thing I did--the Patriots are very likely to win their next game--but, because of a superior process, this forecaster has high confidence in their estimate.

It is important to convey both--the estimate itself and analytic confidence--when communicating the results of analysis to a decisionmaker.  To do anything less runs the risk of the decisionmaker misinterpreting the findings or assuming things about the process that are not true.  

It is also important to note that the "analytic confidence" mentioned here differs significantly from the far more commonly discussed notion of psychological confidence.  Psychological confidence is a statement about how one "feels" and can often be caused by cognitive bias or environmental factors.  There is no reliable relationship between forecasting accuracy and psychological confidence.  

Analytic confidence, on the other hand, is based on legitimate reasons why the analysis is more likely to be correct.  For example, analysis derived from facts presented by reliable sources is more likely to be correct than analysis derived from sketchy or disreputable sources.  In fact, there are a number of legitimate reasons for more rather than less analytic confidence (you can read about them here).

It is, of course, possible for analytic and psychological notions of confidence to be consistent, at least in the context of an individual forecast.  I, for example, "feel" that I have no reason to be confident in my estimate about the Patriots.  I also know, as I go down the list of elements responsible for legitimate analytic confidence, that very few are present.  In this case, "low" applies to both the psychological and analytic variants of my confidence.

That kind of alignment is not the norm, though.  Overconfidence bias typically causes feelings of confidence to outpace a more rational assessment of the quality of the analytic process.  Underconfidence, on the other hand, is typically caused by over-thinking a problem and is more common among experts than you might think.

Now to my thought.  Finally.

One of the big problems with analytic confidence is communicating it to decisionmakers in an intuitive way.  Part of this problem occurs, no doubt, because of the different meanings the word "confidence" can have.  Most people, when they hear the word "confidence" used in casual conversation, assume you mean the psychological kind.  Adding the word "analytic" in front of "confidence" doesn't seem to help much, as most people don't really have a notion of what analytic confidence is or how it differs from the more commonly used, psychological type of confidence (They don't want to know, either.  They have enough to remember).

The classic solution has been to ignore analytic confidence completely.  This is wrong for all the reasons discussed above.  Occasionally, however, analysts elect to include a statement of analytic confidence, typically at the end of the analysis.  Part of this is due to the "Bottom Line Up Front (BLUF)" style of writing that is common to analysis.  The logic here is that the most important thing is the estimate.  That becomes the bottom line and, therefore, the first thing mentioned in the paper or briefing.

What if we flip that on its head?  What if we go, at least in casual conversation, with the analytic confidence first?  

Thus you had my two formulations:
  • "It's a low confidence estimate, but the Patriots are very likely to win this week."
  • "The Patriots are very likely to win this week.  This is a low confidence estimate, however."
These two statements say exactly the same thing in terms of content.  However, I think the form of the first statement better communicates what the analyst actually intends.  In other words, I think the first statement establishes a slightly different context.  Furthermore, I think this context will likely help the listener interpret my use of the word "confidence" correctly.  That is, the first statement is better than the second at suggesting that I am using confidence as a way to highlight the process I used to derive the estimate and not just how I feel about it.  

Another reason I think the second statement is inferior is that it sounds confusing to the casual listener.  It is theoretically better (the bottom line is definitely up front) but, unless you are steeped in the arcana of analytic writing, it is not easily interpreted.

That's the reason for the quick poll.  I just wanted to see what you thought--to see, in the words of Gertrude Stein, if there was any there there.  

Thanks and I will post what I found (and my inevitably shocked reaction to it) in a later post.

Sunday, April 13, 2008

IntelFusion Remixes Peterson's Table Of Analytic Confidence Assessment (IntelFusion)

Jeff Carr over at IntelFusion has added his own twist to Josh Peterson's recent work on analytic confidence. Jeff's remix is a good visualization (see below) of what Josh was getting at in his thesis. See the full post here and the IntelFusion main page here.

[Image: IntelFusion's remix of Peterson's analytic confidence table]

Thursday, April 3, 2008

Analytic Confidence Defined...Finally! (Original Research)

One of the great benefits of teaching at Mercyhurst is having the opportunity to work with a bunch of dedicated and very intelligent students. Another advantage is having enough students in the intelligence studies program to be able to conduct meaningful experiments.

Both factors come into play in the newly published thesis, Appropriate Factors To Consider When Assessing Analytic Confidence In Intelligence Analysis, by one of our grad students, Josh Peterson (click here to download the full text. The full text will also be permanently available in the Mercyhurst Student Projects link list in the right-hand column of this blog). Josh presented some of his research and findings at the recently completed ISA Conference but the full study is presented here for the first time.

I have written about the problems with the concept of analytic confidence before. Some analysts interpret analytic confidence psychologically (i.e. how the analyst feels about his or her analysis) while some use it estimatively (e.g. "I am confident that X will happen"). I have also argued that neither makes much sense in the context of modern best practices for intelligence analysis.

Analytic confidence really has to do with how well calibrated a particular estimate is. Both an experienced analyst and a beginner might say that "X is likely" but one would hope (vainly, perhaps) that the expert would have a tighter shot group around the target than the newbie.

Josh takes this as his starting point and asks, "What, then, are the relevant, legitimate elements of analytic confidence? What should analysts consider when they are asked to state how confident they are in their analysis?" Through a good bit of research (laid out in his lit review), he identified seven elements that seemed to legitimately underpin the concept of analytic confidence. They are: source reliability, source corroboration, use of a structured method to do the analysis, analyst level of expertise, amount of collaboration between analysts, task complexity, and time pressure.

Josh then created scenarios that were similar but, in one case, all of the elements mentioned above were extremely negative (low source reliability, high time pressure, etc.) and, in another case, extremely positive (high source reliability, use of a structured method, etc.). Josh also established a control group to help establish the validity of his experiment.

Josh used the students here at Mercyhurst as subjects for the studies. The students here get a good bit of real-world experience in doing analysis, both in our classes and in the internships and contract work we do, so I think they are good proxies for entry-level analysts in the Intelligence Community.

Josh found that students could accurately distinguish high confidence scenarios from low confidence ones but that they were doing this largely through intuition. He suggested that we probably needed to update our curriculum in order to better teach our students those elements that should legitimately raise or lower an analyst's confidence in his or her work (suggestions we have already adopted).

Finally, Josh took his results and his research and combined them into a rubric (see below) that analysts can use to help score their confidence. Josh did not have time to test this rubric and the weighting in it represents his interpretation of the relative importance of each factor based on his read of the literature. Given how far he had already come, he wisely left these tasks to future researchers.
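
To give a flavor of how a rubric like this might work, here is a toy version in Python. The seven factors are Josh's, but the 1-to-5 ratings, the weights, and the cut lines are placeholders of my own invention, not values from his table:

```python
# A toy, illustrative version of a confidence rubric.  The factors come
# from Peterson's thesis; the weights and thresholds are hypothetical.
FACTORS = {
    "source_reliability":    0.20,
    "source_corroboration":  0.20,
    "structured_method":     0.15,
    "analyst_expertise":     0.10,
    "analyst_collaboration": 0.10,
    "task_complexity":       0.15,  # score inversely: simpler task = higher rating
    "time_pressure":         0.10,  # score inversely: less pressure = higher rating
}

def confidence_level(ratings: dict[str, int]) -> str:
    """Map per-factor ratings (1 = worst, 5 = best) to low/moderate/high."""
    weighted = sum(FACTORS[f] * ratings[f] for f in FACTORS) / 5  # 0.0 to 1.0
    if weighted < 0.45:
        return "low"
    if weighted < 0.70:
        return "moderate"
    return "high"

# Weak sources, no structured method, little expertise or collaboration:
print(confidence_level({
    "source_reliability": 2, "source_corroboration": 1,
    "structured_method": 1, "analyst_expertise": 2,
    "analyst_collaboration": 1, "task_complexity": 3, "time_pressure": 3,
}))  # -> low
```

Testing whether weights like these actually track accuracy is, of course, exactly the future work Josh left on the table.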

[Table: Josh Peterson's analytic confidence rubric]

Analytic confidence is a tough nut to crack. It is hard to explain and even more difficult to research and test. Josh has taken a good first cut at it, though, and I think his work deserves some attention if only as a step upon which others can build.

Sunday, March 23, 2008

The Revolution Begins On Page Five: The Changing Nature Of The NIE And Its Implications For Intelligence (Final Version With Abstract)

Abstract:

There has been a good bit of discussion in the press and elsewhere concerning the recently released National Intelligence Estimate (NIE) on Iran’s nuclear program.
Virtually all of this commentary has focused on the facts, sources and logic – the content – of the estimate. It is my position that, while the content is fascinating, the most interesting story behind the NIE has to do with the changes in form that this latest NIE has adopted; that what the National Intelligence Council (NIC) has said is, in many ways, less interesting than the way it has decided to say it. This shift in form implies a new, emerging theory of intelligence – what intelligence is and how to do it – that is likely to influence intelligence communities worldwide. “Emerging”, however, is the key term here. As this article will highlight, the revolution may have begun but it is far from complete.

PDF Version (Pre-pub/Complete)

HTML Version:
Part 1 -- Welcome To The Revolution
Part 2 -- Some History
Part 3 -- The Revolution Begins
Part 4 -- Page Five In Detail
Part 5 -- Enough Exposition, Let's Get Down To It...
Part 6 -- Digging Deeper
Part 7 -- Looking At The Fine Print
Part 8 -- Confidence Is Not the Only Issue
Part 9 -- Waffle Words And Intel-Speak
Part 10 -- The Problem With “If”
Part 11 -- One More Thing
Part 12 -- Final Thoughts
Epilogue

Thursday, January 10, 2008

Part 7: Looking At The Fine Print (The Revolution Begins On Page Five: The Changing Nature Of The NIE And Its Implications For Intelligence)

Part 1 -- Welcome To The Revolution
Part 2 -- Some History
Part 3 -- The Revolution Begins
Part 4 -- Page Five In Detail
Part 5 -- Enough Exposition, Let's Get Down To It...
Part 6 -- Digging Deeper

Part 7 -- Looking At The Fine Print

Let’s take a look at an example from the Iran National Intelligence Estimate (NIE) and see if we can figure out what is going on here.

  • We judge with high confidence that Iran will not be technically capable of producing and reprocessing enough plutonium for a weapon before about 2015.

What is missing in this example and many other statements in the NIE, of course, is an estimate of likelihood (or Word Of Estimative Probability (WEP), if you prefer). The estimate does not say “…will not likely be technically capable …” Instead, the verb phrase “will not be technically capable” implies certainty about a future event – which is, by definition, uncertain.

The same problem arises even where the event happened in the past but the information regarding it contains inconsistencies or uncertainties (in other words, where the event is not definitively factual). Consider this statement, also from the Iran NIE: “We assess with high confidence that until fall 2003, Iranian military entities were working under government direction to develop nuclear weapons.” Here, too, it seems inappropriate not to use estimative language in conjunction with a statement of confidence.

In other words, if the Intelligence Community (IC) knew for certain that Iranian military entities were working under government direction to develop nuclear weapons then they should not be indicating a probabilistic statement by saying “we assess”. If, on the other hand, they are not entirely certain, then they should not say “Iranian military entities were working” but rather “it is virtually certain that Iranian military entities were working” or whatever the analysts believe is the appropriate estimate of likelihood. Mixing the formulations makes the definitions laid out in the Explanation of Estimative Language (EEL) page -- the "page five" in the title -- meaningless.

This is problematic for several additional reasons. First, it is bound to be confusing to the reader. Having carefully explained that “We judge” is an indicator of estimation but then phrasing the statement in terms of certainty makes the attentive reader wonder what the IC really means; is this statement a fact or an assessment? It could be read both ways. Second, another graduate student with whom I have worked, Mike Lyden, has done some very interesting research comparing NIE estimative statements against historical fact (his thesis, which contains the research, is currently available only through inter-library loan). Across the last 40 years, estimates that used WEPs tended to be about 75% accurate. Statements that used words of certainty hovered around 50% accuracy (the sample size was large enough that this difference was statistically significant to several decimal places, as I recall). Mike speculates that this difference may be tied up with psychological notions of confidence (explained later) but, whatever the reason, the evidence is pretty compelling: the Intelligence Community makes better estimates when it does not use words of certainty.

Another possibility, of course, is that I have got it all wrong; that I have mischaracterized what the IC intended to do when they defined “confidence” the way they did. Indeed, there are several other ways that the word "confidence" could be interpreted that would work in this sentence.

First, confidence could refer to psychological confidence or the way the analyst "feels" about the assessment. Psychologists have long known that the more information you get, the more confident you feel in your assessment of a situation. Up to a point, this increasing confidence is warranted. Fairly quickly, however, your mind forms a more or less rigid conceptual model of the problem you are facing, so that your mind takes each new fact and tends to either force it into the existing model or discard it as irrelevant. The net effect of this is that, while you feel increasingly confident, your chances of being correct stay about the same. Psychologists call this Overconfidence Bias and it is generally considered a bad thing in analysis. Moreover, it is well known within intelligence circles, having been covered extensively by Richards Heuer in his classic, Psychology Of Intelligence Analysis. It is, therefore, unlikely to be what the IC means when it talks about confidence on the EEL page.

Second, confidence is often used as a synonym for likelihood, as in “I am highly confident that New England will win the Super Bowl.” While this works in casual speech, it certainly makes no sense in the context of this NIE. The EEL page defines an entirely different way of ascribing levels of likelihood to its assessments and specifically states that the level of confidence language applies “to our assessments (italics mine).” To use confidence as a synonym for likelihood would be tantamount to the IC saying one thing and doing another which, well, they have already done. I don’t, however, think they would be that silly again. For the same reason, the introductory phrases, “we assess”, “we judge” and “we estimate” can’t be considered to be expressions of likelihood either.

Third, and likely most closely related to what the IC means, is a statistical notion of confidence, commonly expressed as a margin of error. The form of the statement is quite familiar to most of us: “Candidate X leads in the polls, 61 to 39% (plus or minus 3 percent).” This means (typically at the 95% confidence level – yet another statistical term) that Candidate X’s true share of support could be as low as 58% or as high as 64%. This form certainly seems to mirror the form examined in Part 4. High confidence under this interpretation would mean that the margin of error is low, that the true probability hovers near the estimate made by the authors of the estimate. The problem here comes in the way the IC has actually used confidence in these phrases. If they mean it to be interpreted statistically it makes no sense to then say something that would be functionally equivalent to “…plus or minus 3 percent, Iran will not be technically capable…”. This kind of statement and others like it only make sense when associated with a probability or, in the case of the NIE, an Estimate Of Likelihood.
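
As an aside, the arithmetic behind that polling example is simple enough to check. The sketch below uses the standard normal approximation; the sample size of 1,000 is my assumption, chosen because it makes the margin come out to roughly 3 percent:

```python
# Margin of error for a poll proportion under the normal approximation.
# The sample size is assumed; the example above does not give one.
import math

p, n, z = 0.61, 1000, 1.96  # Candidate X's share, sample size, z for 95%
moe = z * math.sqrt(p * (1 - p) / n)

print(f"{p:.0%} +/- {moe:.1%}")                                # 61% +/- 3.0%
print(f"true support between {p - moe:.0%} and {p + moe:.0%}")  # 58% and 64%
```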

This, in turn, brings me back to the more general notion of analytic confidence that I discussed in Part 4. Certainly the IC does not want to convey numerical certainty and has said so (at least in early forms of the EEL page) but this idea of analytic confidence seems similar to the idea of statistical confidence. By using words (not numbers) that express likelihood and then using words (not numbers) to express its confidence in an expression of likelihood, the IC’s implied definition of analytic confidence would resonate with, but not mirror, what many people already generally understand, i.e. the statistical notion of confidence. Just as with statistical notions of confidence, however, this idea of analytic confidence only makes sense if there is an expression of likelihood to go with it.

Which leaves me with a problem. I don’t know what the IC means when they talk about confidence. The EEL page implies they intend to use it one way. Then they do something entirely different in the text and none of the possible variations in meaning makes any sense. They do it so many times that I can’t ascribe it to accident.

I am just an average Joe. The first alternative is that I just don’t understand. I am prepared to admit that. I would suggest, however, that the current form of the EEL page needs to be changed so that it is clearer. I guarantee that if I cannot understand what it means, there are many more average Joes that are struggling with it (or just ignoring it) as well.

The second alternative – and one that is a bit more unnerving – is that the IC does not know what it means when it says high, moderate or low confidence. Perhaps sometimes they are using it to describe how they feel about their position, sometimes they may be using it as a synonym for an estimate and sometimes they may mean it more statistically, leaving it up to the reader to figure out which it is from the context.

Tomorrow: Part 8 -- Confidence Is Not the Only Issue

Wednesday, January 9, 2008

Part 6: Digging Deeper (The Revolution Begins On Page Five: The Changing Nature Of The NIE And Its Implications for Intelligence)

Part 1 -- Welcome To The Revolution
Part 2 -- Some History
Part 3 -- The Revolution Begins
Part 4 -- Page Five In Detail
Part 5 -- Enough Exposition, Let's Get Down To It...

Part 6 -- Digging Deeper

There are some disturbing trends in other numbers collected from the Iran National Intelligence Estimate. For example, 71% of the sentences in the Iran NIE contain one of the three phrases “we assess”, “we judge” or “we estimate”. As you will recall, this is the way the Intelligence Community (IC) indicated it would preface its estimative conclusions. Compare this with the number of sentences containing statements of confidence: 61%. I could see there being fewer sentences beginning with these three phrases (it would get tedious to constantly see “we assess", "we estimate" or "we judge” all the time) but how do you get more? That means that at least some sentences marked as estimates (with the very words the community said it would use to mark them) do not also contain statements of confidence.

Not that big of a deal, you say. OK, I agree, but consider this: Only 29% of the sentences in the Iran NIE contain Words of Estimative Probability (WEPs)! That means that there are some, perhaps many, sentences that indicate that they are estimative in nature but are missing one and perhaps both of the other two elements (WEPs or an assessment of confidence) that the Intelligence Community itself said it would use.

It makes my head hurt.

Let’s review the bidding: Up until the Iran NIE was released only several weeks ago, the IC was saying one thing and then doing another with regard to statements of confidence in their estimates. The Iran NIE dramatically reversed this trend and included statements of confidence in almost two-thirds of its sentences… but, while this is an undeniable improvement, there are still numbers that don’t add up.

Tomorrow: Part 7 -- Looking At The Fine Print

Tuesday, January 8, 2008

Part 5 -- Enough Exposition! Let’s Get Down To It… (The Revolution Begins On Page Five: The Changing Nature Of The NIE And Its Implications For Intel)

Part 1 -- Welcome To The Revolution
Part 2 -- Some History
Part 3 -- The Revolution Begins
Part 4 -- Page Five In Detail

Part 5 -- Enough Exposition! Let’s Get Down To It…

Having laid out this vision of a “theoretically complete estimate” (my words not theirs), how then does the Intelligence Community (IC) use it? To what extent do these carefully crafted words defined on page five (and in one case, page six) of the most recent National Intelligence Estimates (NIEs) actually get used in these documents? The answer is really going to surprise you.

Let me start with the issue of confidence in assessments in the recent NIEs as this is the place where the change is most dramatic.

In order to benchmark the estimates, I started by counting the number of sentences in each of the last several NIEs. For comparison purposes, I included the 2002 WMD NIE and the 1990 Yugoslavia NIE as well. I only looked at the Key Judgments of each NIE. While I understand that there is a good bit more information in the NIE than in the Key Judgments, I am virtually certain that the bulk of the NIE is consistent with the Key Judgments whether they are made public or not.

[Chart: sentence counts for the Key Judgments of recent NIEs]

As you can see from the chart above, the numbers are fairly consistent across the NIEs. The minimum number of sentences is 25 with a maximum of 52. The average is 38. I don’t consider these differences in length to be important. They are likely explained by the nature of the subject matter and the level of detail of the full text NIE. I wanted, however, to be able to compare words and phrases defined in the Explanation of Estimative Language (EEL) pages (See Part 4 for more information on these) across multiple NIEs and I knew that mere numbers of uses of the word “likely”, for example, could be skewed by the length of the estimate (i.e. the longer the estimate, the more times a certain word would probably be used). I also considered it unlikely that Words of Estimative Probability (WEPs) and other special words (as defined by the EEL page) would be used multiple times in a single sentence. Number of sentences, therefore, while not perfect, seemed to me to be a useful denominator.

If the number of sentences is the denominator, what about statements of confidence, the numerator? I generated this number by searching each set of Key Judgments for each of the words or phrases highlighted in the EELs and then going back through and reading for words that might serve the same purpose but were not specifically mentioned in the EEL. Looking at the number of sentences with explicit levels of confidence in each of the identified estimates tells a startling story:

[Chart: sentences containing statements of confidence, by NIE]

What immediately jumps out is the number of sentences that contain statements of confidence in the Iran NIE – 19 out of the 31 total sentences, or almost two-thirds – compared with the other NIEs. None come even close, and the majority of them (including the three other NIEs that contained EEL pages that specifically said expressions of confidence were going to be used) contain none. Not one.
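
Incidentally, a tally like this is easy to mechanize. Here is a rough sketch of the counting approach; the phrase list is trimmed to the explicit EEL confidence terms, whereas my actual pass also looked for equivalent wording:

```python
# A rough sketch of the sentence-counting method, not the exact procedure.
import re

CONFIDENCE_PHRASES = ["high confidence", "moderate confidence", "low confidence"]

def confidence_tally(key_judgments: str) -> tuple[int, int]:
    """Return (sentences containing a confidence phrase, total sentences)."""
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", key_judgments) if s.strip()]
    hits = sum(
        any(phrase in s.lower() for phrase in CONFIDENCE_PHRASES)
        for s in sentences
    )
    return hits, len(sentences)

sample = "We judge with high confidence that X. We assess that Y is likely."
print(confidence_tally(sample))  # -> (1, 2)
```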

Frankly (and abandoning any pretense of academic detachment for a minute) this stuns me. I have a good deal of respect for the analysts in the NIC and throughout the IC, but what were they thinking when they put the previous NIEs in 2007 together? That no one would notice? If that is the case, they were probably correct. Certainly I have not seen a single critique that highlights the absolute inconsistency of stating, “We intend to assess confidence,” and then not doing it – three times in a row. That they never said they had to use statements of confidence? That is quibbling. The whole purpose of the EEL is to define “What We Mean When We Say...”. It’s like including a Spanish glossary at the front of a Chinese textbook and then saying, “We never said we had to write the textbook in Spanish.” Why make a point of including an explanation of statements of confidence and then not using them at all? That the expressions of confidence are in the classified but not the unclassified version? That makes even less sense and I don’t think the IC is that dumb. That they had never done it before? Nonsense. The Iraq WMD estimate contained explicit references to confidence. I'm sorry, but three times in a row is too strong a pattern to ignore. The failure to do what the Intelligence Community said it would do was intentional.

The worst case scenario is that the IC suspects that no one is reading these things anyway or, if they are, that the readers are only going to cherry-pick the parts that serve their policy or political purposes. In this context, carefully nuancing your statements and enforcing strict consistency, indeed any consistency, in your use of words is just wasted effort. This is a cynic's-eye view yet, sadly, the evidence seems to support it.

That is a shame. Many very smart, dedicated people work in the IC. Many of them are putting their lives on the line to collect, process and analyze the information our decisionmakers need to make good decisions. Certainly the taxpayer has borne a not insignificant burden funding it. The IC's work should be applause-worthy but saying one thing and doing another is not a cause for either confidence or approbation.

Water under the bridge, at this point, of course. The Iran NIE obviously fixed all that, you might say.

Not so fast.

Tomorrow: Part 6 -- Digging Deeper

Monday, January 7, 2008

Part 4 -- Page Five In Detail (The Revolution Begins On Page Five: The Changing Nature Of The NIE And Its Implications For Intelligence)

Part 1 -- Welcome To The Revolution
Part 2 -- Some History
Part 3 -- The Revolution Begins

Part 4 -- Page Five In Detail

What, then, is so darn unique about page five? While the format and language of the “Explanation of Estimative Language” page (hereinafter the "EEL") has undergone some changes (for the better) over the last four publicly released National Intelligence Estimates (NIEs), all of the estimates that contain such a page make the same three key points:

First, the NIE is…well…an estimate. The authors intend this to be a probabilistic judgment, not a statement of "facts". This may seem obvious but, to many casual readers, there may still be a lingering impression that the CIA, NSA and the other 14 agencies that make up the Intelligence Community are omniscient. Sorry, not the case, and the authors of the NIEs at the National Intelligence Council (NIC) want us to know it.

Second, there is a discussion of Estimates of Likelihood. Specifically, this section talks about what the intelligence community commonly calls Words of Estimative Probability (WEPs -- after the Sherman Kent article of the same name) and what linguistics professionals usually refer to as Verbal Uncertainty Expressions (Thanks, Rachel!). These are words, such as "likely", "probably", or "almost certainly", that convey a sense of probability without coming right out and saying “60%” or whatever.

Noted MIT scholar Michael Schrage came out quite forcefully against this type of estimative language in a Washington Post editorial in 2005. In the same article he spoke very favorably of using percentages and Bayesian statistical methods to get them. Despite this kind of criticism, the NIC, in the early versions of the EEL page, noted that, “Assigning precise numerical ratings to such judgments would imply more rigor than we intended”. While this language was dropped in the Iran NIE (probably due to space constraints), it likely continues to represent the NIC's position.

Regardless of its desire to avoid numbers, the NIC still effectively benchmarks its WEPs in two ways. First, it makes it clear that words such as "probably", "likely", "very likely" and "almost certainly" indicate a greater than even chance (above 50%) while words like "unlikely" and "remote" indicate a less than even chance (below 50%). In addition, the NIC also provides a handy scale that, while devoid of numbers, clearly rank orders the WEPs in regular increments. While the rank ordering is more important than the actual increments, early versions have five increments, implying roughly 20% intervals for each word. The most recent version in the Iran NIE has seven intervals (see the chart below), implying intervals of approximately 14%.
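
Laid out as a simple lookup table, the scale might look like the sketch below. Keep in mind that the NIE chart is deliberately word-only; the numeric bands here are just the roughly 14% intervals inferred above, not anything the NIC has published:

```python
# The seven WEPs from the Iran NIE's scale.  The numeric bands are inferred,
# evenly spaced intervals; the actual chart deliberately contains no numbers.
WEP_SCALE = [
    ("remote",           (0.00, 0.14)),
    ("very unlikely",    (0.14, 0.29)),
    ("unlikely",         (0.29, 0.43)),
    ("even chance",      (0.43, 0.57)),
    ("probably/likely",  (0.57, 0.71)),
    ("very likely",      (0.71, 0.86)),
    ("almost certainly", (0.86, 1.00)),
]

def wep_for(probability: float) -> str:
    """Pick the first WEP whose (assumed) band contains the probability."""
    for word, (low, high) in WEP_SCALE:
        if low <= probability <= high:
            return word
    raise ValueError("probability must be between 0 and 1")

print(wep_for(0.65))  # -> probably/likely
```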

[Chart: the Iran NIE's seven-increment WEP scale]

The EEL page also identifies the language the authors will use for improbable but potentially important events. These words and phrases include such old standards as "possible/possibly", "may", and "might" and phrases such as "we cannot dismiss" and "we cannot rule out".

I intend to write quite a bit about WEPs later on but one point is absolutely clear: This move towards consistency in the use of language is an incredibly positive step forward but the “poets” in the IC have only been defeated, not routed. Kent defined poets as the type of analysts who “… appear to believe the most a writer can achieve when working in a speculative area of human affairs is communication in only the broadest general sense. If he gets the wrong message across or no message at all -- well, that is life.” There has been, as we will see in later posts in this series, either a real hesitancy or a real lack of understanding of the value of consistent terminology on the part of many analysts in the intelligence community.

Consistent terminology, however, is something that decisionmakers have been requesting from intelligence professionals for decades. Mercyhurst alumna Jen Wozny wrote a wonderful thesis on the topic, exploring what over 40 decisionmakers said they wanted from intelligence (currently you can obtain it only through inter-library loan with the Hammermill Library at Mercyhurst). One of the key requests, of course, was consistent terminology. I consider it likely that the potential for broader distribution brought on by the recent Congressional requests and the public scrutiny of these latest NIEs essentially forced the Intelligence Community to adopt the more or less consistent series of terms described above.

While it may seem ludicrous to many (especially in the business or scientific communities) that this was a real debate in the intelligence community, it was and, based on the differences between what the EEL page says and what was actually done (which will make up the bulk of the remaining posts in this series), it still is.

Third and finally, the EEL page explains what the NIC means when it talks about “confidence in assessments”. This concept is difficult to explain to most people, and the NIC's brief discussion of it is not much help.

Confidence in an assessment is a very different thing than the assessment itself. Imagine two analysts working on the same problem. One is young, inexperienced, working on what is generally considered a tough problem on a tight time schedule. He is unfamiliar with a number of key sources and cannot adequately judge the reliability of the ones he does have. When pressed to make an estimate regarding this problem, he states that he thinks that “X is likely to happen”.

The second analyst is a seasoned analyst with adequate time to think about the problem and considerable experience in the subject in question. He knows where all the sources are and knows which ones are good and which ones are to be taken with a large grain of salt. He, too, states that he thinks, “X is likely to happen.” Both analysts have given the same assessment of the same problem. The level of confidence of the first analyst is likely much lower than the level of confidence of the second analyst, however.

The important thing to note is that the analyst is expressing confidence in his probabilistic assessment. In the first case, the young analyst is essentially saying, “I think X is likely but, for a number of reasons, not the least of which is my own inexperience, I think that this assessment could be way off. If I knew just a little bit more, I could come back to you saying that X is anything from remote to virtually certain.” In the second case, the senior analyst would say, “I think X is likely but, because I know a lot about this problem and how to do analysis, I am fairly comfortable that X is likely, and even if I went out and did more research, my estimate would still probably be ‘X is likely.’”

How does one determine a level of analytic confidence, though? What are the appropriate elements and how are they measured? How do you know when you have crossed the line from low to moderate and the line from moderate to high (the three levels of confidence used on the EEL page)? The discussion above suggests that there are a number of legitimate factors that analysts should consider before making a statement of analytic confidence. The EEL page, strangely, does not see it that way, preferring to tie it only to the quality of the information and the nature of the problem (presumably some sort of scale running from easy to hard).

Recent research by a Mercyhurst grad student (Thanks, Josh!) suggests that a number of things legitimately influence analytic confidence including, among others, subject matter expertise (though it is likely not as important as some people think), time on target, the use of structured methods in the analysis, the degree and way in which analysts collaborate on the product, etc. I suspect that the IC is well aware of at least some of these other elements of analytic confidence (I am hard-pressed to imagine, for example, senior officials in the IC stating that the subject matter expertise of their analysts doesn’t matter in their calculation of confidence, yet it is not mentioned as an element in the EEL page). I find it disingenuous that they do not list these broader elements that could impact analytic confidence.

Despite these caveats and the minor weaknesses, the EEL implies a fairly comprehensive vision of what I have begun calling a theoretically complete estimate. How might such an estimate appear? Something like, “We estimate that X is likely to happen and our confidence in this assessment is high.” Translated, this might look like, “We are willing to make a rough probabilistic statement (Point 1 in the EEL) indicating that we think alternative X has about a 60-75% chance of occurring (Point 2 in the EEL). Because we have pretty good sources and this problem is not that difficult we are very comfortable that the actual range might be a bit broader but we don't think it is by much (Point 3 in the EEL).”
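
For the programmers in the audience, the three-part structure is compact enough to express as a data type. This is a sketch of my "theoretically complete estimate", not anything the NIC actually uses:

```python
# A sketch of the three EEL elements as a single structure.
from dataclasses import dataclass

@dataclass
class CompleteEstimate:
    statement: str   # the judged alternative, e.g. "X will happen"
    likelihood: str  # a WEP from the EEL scale, e.g. "likely"
    confidence: str  # "low", "moderate", or "high"

    def render(self) -> str:
        return (f"We estimate that {self.statement} is {self.likelihood}, "
                f"and our confidence in this assessment is {self.confidence}.")

print(CompleteEstimate("X", "likely", "high").render())
# -> We estimate that X is likely, and our confidence in this assessment is high.
```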

Ideally, decisionmakers want to know the future with certainty. Despite what the cynics in the IC might say, realistic decisionmakers understand that intelligence professionals deal with unstructured and incomplete data, some of which is deliberately deceptive, concerning difficult and even intractable problems and that certainty, as an intelligence judgment, is impossible. Under these circumstances, the structure outlined in the EEL pages of these recent NIE's seems both reasonable and useful.

Tomorrow: Part 5 -- Enough Exposition! Let’s Get Down To It…