
Friday, February 6, 2009

Part 7 -- The Iraq WMD Estimate And Other Iraq Pre-War Assessments (Evaluating Intelligence)

Part 1 -- Introduction
Part 2 -- A Tale Of Two Weathermen
Part 3 -- A Model For Evaluating Intelligence
Part 4 -- The Problems With Evaluating Intelligence Products
Part 5 -- The Problems With Evaluating Intelligence Processes
Part 6 -- The Decisionmaker's Perspective

Perhaps the most famous document leading up to the war in Iraq is the much-maligned National Intelligence Estimate (NIE) titled Iraq's Continuing Programs for Weapons of Mass Destruction, completed in October 2002 and made public (in part) in April 2004. Subjected to extensive scrutiny by the Commission on the Intelligence Capabilities of the United States Regarding Weapons of Mass Destruction, this NIE was judged "dead wrong" in almost all of its major estimates.

Far less well known are the two Intelligence Community Assessments (ICAs), both completed in January 2003. The first, Regional Consequences of Regime Change in Iraq, was made public in April 2007, as was the second, Principal Challenges in Post-Saddam Iraq. Both documents were part of the US Senate Select Committee on Intelligence report on Pre-War Intelligence Assessments About Post-War Iraq, and both (heavily redacted) documents are available as appendices to the committee's final report.

The difference between an NIE and an ICA seems modest to an outsider. Both types of documents are produced by the National Intelligence Council, and both are coordinated within the US national security intelligence community and, if appropriate, with cleared experts outside the community. The principal differences appear to be the degree of high-level approval (NIEs are approved at a higher level than ICAs) and the intended audiences (NIEs are aimed at high-level policymakers while ICAs are geared more toward the desk-analyst level). (Thanks, Elizabeth!)

In this case, there appears to be at least some overlap among the actual drafters of the three documents. Paul Pillar, National Intelligence Officer (NIO) for the Near East and South Asia at the time, was primarily responsible for coordinating (and, presumably, drafting) both of the ICAs. Pillar also assisted Robert D. Walpole, NIO for Strategic and Nuclear Programs, in the preparation of the NIE (along with Lawrence K. Gershwin, NIO for Science and Technology, and Major General John R. Landry, NIO for Conventional Military Issues).

Despite the differences in the purposes of these documents, it is likely safe to say that the fundamental analytic processes -- the tradecraft and evaluative norms -- were largely the same. It is highly unlikely, for example, that standards such as "timeliness" and "objectivity" were maintained in NIEs but abandoned in ICAs.

Why is this important? As discussed in detail in Part 3 of this series, it is important, in evaluating intelligence, to cast as broad a net as possible: to look not only at examples where the intelligence product was false but also at cases where it was true, and, in turn, to examine the process in both cases to determine whether the analysts were good or just lucky, bad or just unlucky. These three documents, prepared at roughly the same time, under roughly the same conditions, with roughly the same resources, on roughly the same target, allow the accuracy of their estimative conclusions to be compared with some assurance that doing so may help get at any underlying flaws or successes in the analytic process.

Monday: The Score

Friday, January 9, 2009

Cruise Missive: NIC Publishes New Paper On Global Health, Avoids Detection (NIC)

I love the National Intelligence Council (NIC). Really. I do. These guys are some of the brightest analysts working some of the toughest problems in the world.

But they don't know PR from boo-diddly...

Yesterday, they published an excellent paper on the Strategic Implications of Global Health, complete with some truly outstanding charts and graphics (like the one below -- click on it to see the larger version). You would think they would get the word out. Maybe post something on their home page, maybe issue a press release, maybe put something on the DNI's home page.

Nada. Zip. Zero.

I should be happy, I guess (this is what journalists call a "scoop", I think). The analysts at the NIC have done good work, though, and it deserves a broader audience than what this tiny blog can provide. So, if you are reading this, pass it on...

(For all the fans of Mercyhurst, we get a shout-out in the Scope Note for our strategic intelligence project on the implications of chronic and infectious diseases on US national interests. Hooo-ahhh!!)

Monday, May 19, 2008

Saying One Thing And Doing Another: A Look Back At Nearly 60 Years Of Estimative Language (Original Research)

US News and World Report has an interesting story about the current state of intelligence reform. According to the article, CIA Director Mike Hayden said,

  • "Some months ago, I met with a small group of investment bankers and one of them asked me, 'On a scale of 1 to 10, how good is our intelligence today?'" recalled Hayden. "I said the first thing to understand is that anything above 7 isn't on our scale. If we're at 8, 9, or 10, we're not in the realm of intelligence—no one is asking us the questions that can yield such confidence. We only get the hard sliders on the corner of the plate. Our profession deals with subjects that are inherently ambiguous, and often deliberately hidden. Even when we're at the top of our game, we can offer policymakers insight, we can provide context, and we can give them a clearer picture of the issue at hand, but we cannot claim certainty for our judgments."
(For those of you keeping score at home, Hayden said much the same thing last year during an interview with CSPAN...)

Frankly, I don't know anyone knowledgeable about the strengths and weaknesses of intelligence who doesn't agree with this statement. Certitude is impossible. That is what makes the chart below so darn interesting:


The chart is from Rachel Kesselman's recently completed thesis, Verbal Probability Expressions In National Intelligence Estimates: A Comprehensive Analysis Of Trends From The Fifties Through Post 9/11. The chart shows the number of times the word "will" was used in an estimative sense (e.g., "X will happen") in the Key Judgments of the 120 National Intelligence Estimates (NIEs) she examined, 20 per decade over the last 58 years.

In fact, at 717 times, the word "will" was the single most commonly used estimative word, by a very large margin, in NIEs. Not only was it the single most commonly used word, it was also one of the most consistently used words across the decades (tests Rachel ran showed that the variances across the decades were not statistically significant).
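Kesselman's actual significance tests are not reproduced here, but the kind of across-decades comparison described above can be sketched with a chi-square goodness-of-fit statistic. The per-decade counts below are invented placeholders for illustration, not her data.

```python
# Chi-square goodness-of-fit sketch: does a word's usage count vary
# across decades more than chance would allow? (Illustrative counts
# only -- these are NOT Kesselman's actual figures.)
observed = [118, 125, 114, 122, 119, 121]  # invented per-decade counts of "will"
expected = sum(observed) / len(observed)   # null hypothesis: uniform usage

chi2 = sum((o - expected) ** 2 / expected for o in observed)

# With len(observed) - 1 = 5 degrees of freedom, the 5% critical value
# is about 11.07; a statistic below that is consistent with "no
# statistically significant variation across the decades".
consistent_across_decades = chi2 < 11.07
```

A statistic this far below the critical value is exactly the shape of result the thesis reports: heavy, steady use of the same estimative word decade after decade.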

So...if certitude is impossible, why does the Intelligence Community use "will" -- a word that reeks of certitude -- so often in its estimates? Such a result is absolutely inconsistent with statements, such as Hayden's above, made by virtually everyone who has ever jumped up to defend intelligence's predictive track record.

This was only one of the many fascinating results that came out of Rachel's exhaustive study of the words that analysts have used over the years to verbally express probabilities.

Rachel's lit review, for example, makes for very interesting reading. She has done a thorough search of not only the intelligence but also the business, linguistics and other literatures in order to find out how other disciplines have dealt with the problem of "What do we mean when we say something is 'likely'..." She uncovered, for example, that, in medicine, words of estimative probability such as "likely", "remote" and "probably" have taken on more or less fixed meanings due primarily to outside intervention or, as she put it, "legal ramifications". Her comparative analysis of the results and approaches taken by these other disciplines is required reading for anyone in the Intelligence Community trying to understand how verbal expressions of probability are actually interpreted.

Another of my favorite charts is the one below:


This chart examines the use of the NIC's nine currently "approved" words of estimative probability (see page 5 of this document for additional discussion) across the decades. The NIC's list only became final in the last several years, so it is arguable whether this list of nine words really captures the breadth of estimative word usage across the decades. Or rather, it would be arguable if this chart didn't make it crystal clear that the Intelligence Community has relied on just two words, "probably" and "likely", to express its estimates of probabilities for the last 60 years. All other words are used rarely or not at all.

Based on her research of what works and what doesn't and which words seem to have the most consistent meanings to users, Rachel even offers her own list of estimative words along with their associated probabilities:


Rachel's work tracks well with my own examination of word usage in recent NIEs and with some of the findings in Mike Lyden's thesis on Accelerated Analysis, but her thesis really stands on its own and my brief description and summary of some of the highlights does not do it justice. It is a first-of-its-kind, longitudinal study of estimative word usage by the intelligence community and has contributed significantly to my own understanding of where the Intelligence Community has been over the last 58 years. I think readers of this blog will be more than a little interested in her results and recommendations as well.

Related Posts:
The Revolution Begins On Page Five...
Accelerated Analysis: A New And Promising Intelligence Process
What Do Words Of Estimative Probability Mean?

Friday, April 4, 2008

Iraq NIE Released, Countdown Begins For Unclass Version (NYT)

The New York Times is reporting this morning that the latest National Intelligence Estimate on the situation in Iraq has been released. The report also quotes anonymous sources (of course) concerning the content: "The new intelligence estimate cites slow but steady progress by Iraqi politicians on forging alliances between Shiites and Sunnis in Iraq" and "several factors that could reverse these trends: including a campaign of violence by Shiite splinter groups and the possibility that the government would not carry out a series of reconciliation laws Iraq’s Parliament passed recently."

The Times report goes on to state that Senators Carl Levin and Edward Kennedy have already asked for an unclassified version of the NIE, stating, “Without a current unclassified assessment of the situation in Iraq, Congress and the American people will not have the essential information needed for an informed public debate" (Note: Press release confirming letter is here). "Authorized leak" versus "unclassified version" is a distinction without a difference. I say there will be an unclass version on the streets before the Pennsylvania primary. Any takers?

Sunday, March 23, 2008

The Revolution Begins On Page Five: The Changing Nature Of The NIE And Its Implications For Intelligence (Final Version With Abstract)

Abstract:

There has been a good bit of discussion in the press and elsewhere concerning the recently released National Intelligence Estimate (NIE) on Iran’s nuclear program.
Virtually all of this commentary has focused on the facts, sources and logic – the content – of the estimate. It is my position that, while the content is fascinating, the most interesting story behind the NIE has to do with the changes in form that this latest NIE has adopted; that what the National Intelligence Council (NIC) has said is, in many ways, less interesting than the way it has decided to say it. This shift in form implies a new, emerging theory of intelligence – what intelligence is and how to do it – that is likely to influence intelligence communities worldwide. "Emerging", however, is the key term here. As this article will highlight, the revolution may have begun but it is far from complete.

PDF Version (Pre-pub/Complete)

HTML Version:
Part 1 -- Welcome To The Revolution
Part 2 -- Some History
Part 3 -- The Revolution Begins
Part 4 -- Page Five In Detail
Part 5 -- Enough Exposition, Let's Get Down To It...
Part 6 -- Digging Deeper
Part 7 -- Looking At The Fine Print
Part 8 -- Confidence Is Not the Only Issue
Part 9 -- Waffle Words And Intel-Speak
Part 10 -- The Problem With “If”
Part 11 -- One More Thing
Part 12 -- Final Thoughts
Epilogue

Friday, February 22, 2008

Must See TV On Recent NIEs, Intel Processes (Fora.tv)

DDNI for Analysis Thomas Fingar, in a 14 February speech in front of the Commonwealth Club in San Francisco, talks in detail about the National Intelligence Estimate on Iraq WMDs, the recent Iran Nuclear Capabilities and Intentions NIE and reform efforts in the intelligence community. While Dr. Fingar's pacing and rhetorical style take a little bit of time to get used to, I consider it must-see TV for any Intel Studies student.

A few of my favorite quotes and factoids:

Washington is a "political theme park surrounded by reality".

The current officials in the intel community have an opportunity to do what "42 studies and commissions failed to accomplish".

"The best way to get more money and more people is to screw up."

"We are dealing in a realm that can be likened to a thousand piece puzzle. You've got eight pieces and someone lost the box top with the picture."

"...the goal is not to make us smarter but to make policy better..."

"...roughly 55% of the community" has joined since 9/11.

The Iraq WMD estimate was like "having your yearbook photo taken on your worst bad hair day ever."

"We need to move beyond a federation of agencies coming together, to build a community of analysts. Analysts who don't pay any attention to the agency lanyard around their neck, that engage and mix it up."

"...I had learned long ago in Washington that there are only two possibilities. There are policy successes and intelligence failures."

On the Iran NIE: "...there were in excess of 100 people who worked on it..."

On the Iran NIE: "No need to deal with the substance of the product if you can have an ad hominem attack that discredits the product."

Intelligence Priorities: Terrorism, counterproliferation, cyberthreat, Iran, instability, military modernization of Russia and China

"Reputations matter" and then later, "We want people to have the equivalent of an EBay reputation."

"We're right most of the time...and that bothers me. Not because I don't like being right but because I think we ask too many easy questions."

The first two clips are excerpts on the Iraq WMD Estimate and the Iran Estimate. The third link is to the full speech and Q and A.

Flaws In the Iraq WMD Estimate




Iran's Nuclear National Intelligence Estimate




Intelligence Reform and the Iran NIE



Monday, January 28, 2008

Rank Ordering The 26 Risks From The 2008 Global Risk Report (WEF)

The World Economic Forum just concluded in Davos, Switzerland. The annual meeting of many of the world's economic and political leaders is always worth watching but the most interesting document to come out this year, for me at least, was the Global Risk Report for 2008 (Download the full text here).

Put together by some pretty high-speed, low-drag thinkers from places like Citigroup, Swiss Re and the Wharton School Risk Center, the report is roughly equivalent to an NIE for the Davos crowd. The document makes estimates regarding the likelihood and the severity (in terms of lives and money) of 26 separate events over the next 10 years. Some, like the risk of a Katrina-like disaster in another major metropolitan area somewhere in the world, are based on what the report refers to as "traditional actuarial models". Other risks, such as the seven identified geopolitical risks, have much greater uncertainty associated with them. All of the risks are rated on a 1-5 scale, where 1 = least likely or least severe and 5 = most likely or most severe. The table below gives a summary of the risks, the scores associated with those risks and what exactly the numbers mean.



A couple of interesting things here. First, no single risk has a greater than 20% chance of occurring, according to the authors. Second, the severity scales are not set at regular intervals for either lives or money. They spike upward very quickly, almost logarithmically, so that the differences between a 2 and a 3, or a 3 and a 4, for example, are quite stark -- almost as if the economists of Davos were making a Richter Scale for risk. Third, some of the titles are a little opaque. You will have to see the full text to understand what some of them mean.

Finally, though, I found it fascinating what the authors of the document didn't do. They didn't look at the risks from the standpoint of overall severity. In other words, they did not do what I did in the chart above -- create a single chart with all three variables in it. They chose to look at likelihood versus severity in terms of lives separate from likelihood versus severity in terms of cost. They created two charts and two tables to show how these variables worked together.

While these views were very interesting, they raised the question, "How should we rank order the risks?" Decisionmakers have to allocate resources (for a more detailed discussion of the central importance of this idea to notions of strategy and strategic intelligence, see this article by me and Diane Chido). Typically, decisionmakers allocate more resources to attempt to mitigate or take advantage of the most important risks and fewer resources to the less important risks. All of this seems like common sense until you try to rank order the risks. There are lots of methods for making this type of determination, but one of the most common is expected value.

The concept behind expected value is simple: if I have a 30% chance of making 10 dollars, my expected value is 0.3 x 10 = 3 dollars. It can also lead to what might seem like a counterintuitive answer. Quick: which would you rather have, a 30% chance at 10 dollars or a 1% chance at 1000 dollars? While many people pick the first deal, its expected value is only 3 dollars, while the expected value of the second deal is 10 dollars. Expected values can be either positive or negative. In the case of the Davos data, we are clearly talking about expected costs.

We already have the likelihood values and, since the World Economic Forum was kind enough to use 5-point scales for both of its severity ratings, the equation is pretty simple: the expected cost of a Davos risk is the likelihood of occurrence multiplied by the average of the two severity ratings (cash and lives). In terms of rank ordering, it makes no difference whether I use the likelihood scale or the actual percentages associated with that scale. To keep things simple, I used the likelihood scale.
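The arithmetic above is simple enough to sketch in a few lines of code. The risk names and scores below are invented placeholders, not the WEF's actual figures; only the formulas (probability times payoff, and likelihood times the average of the two severity scores) come from the text.

```python
def expected_value(probability, payoff):
    """Expected value of a single outcome: probability times payoff."""
    return probability * payoff

# The counterintuitive comparison from the text: a 30% chance at $10
# (expected value $3) is worth less than a 1% chance at $1000
# (expected value $10).
deal_a = expected_value(0.30, 10)
deal_b = expected_value(0.01, 1000)

def expected_cost(likelihood, severity_cash, severity_lives):
    """Davos-style expected cost: likelihood score (1-5) multiplied by
    the average of the two severity scores (cash and lives, each 1-5)."""
    return likelihood * (severity_cash + severity_lives) / 2

# Invented placeholder risks (NOT the WEF's data), just to show the ranking.
risks = {
    "hypothetical risk A": (3, 4, 2),  # likelihood, severity ($), severity (lives)
    "hypothetical risk B": (2, 3, 5),
    "hypothetical risk C": (4, 1, 1),
}
ranked = sorted(risks, key=lambda name: expected_cost(*risks[name]),
                reverse=True)
# ranked: risk A (9.0), then risk B (8.0), then risk C (4.0)
```

Note how the averaging step is exactly where the "you can't equate cash and lives" objection, discussed below, bites: the formula silently weights the two severity scales equally.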

OK, OK, I hear you. You can't equate cash and lives (even if the fact that the authors put the two on the same five-point scale almost begs you to do so...). This is, of course, the reason the authors of the document would likely give for not doing the expected value calculation that I am about to do. I think they are being a little too coy. Many of the people who wrote this are actuaries or come from the insurance field. They put a dollar value on life every day. It is their job. I suspect that explaining how they do this and justifying their decision just seemed too difficult, so they left us with a less useful document.

Which is a shame, because I think there are at least two ways they could have explained themselves that would have made this document more useful to decisionmakers. First, the intent behind using the combined severity scores is rank ordering, not valuing. What we want here is some sense of which risk is most critical overall and, in general terms, how much worse one risk is than another. Second, there appears to be an inadvertent bias toward lives in the assessment of overall severity. There are five total category 4 or 5 risks with respect to severity in terms of cost. Only one of those has any associated cost in lives. On the other hand, there are also five category 4 or 5 risks with respect to severity in terms of lives lost. The severity in terms of cost for each of these, however, runs between 3 and 4, virtually guaranteeing that, assuming equal likelihoods, events with more lives lost will rank as more important than events where more money is lost.

In the end, I wish they had taken a crack at it because I know they are better mathematicians than I and would have done a better job of it. Anyway, here goes! When you do the math, the 26 risks for the next 10 years break down along these lines:



Lots of interesting stuff here as well. First, none of the risks gets close to the maximum score of 25. Second, the top two risks are both what the World Economic Forum categorizes as "society" risks -- chronic disease and a pandemic. It is also interesting to see that terrorism is #18 out of 26, roughly equal to the collapse of the nuclear non-proliferation treaty in terms of its expected cost. Finally, it is amazing to me how many of these risks could be readily and accurately assessed using exclusively open sources.

It is also instructive to look at the averages for each of the five sectors the WEF evaluated:



Clearly, society risks have the greatest potential cost while risks associated with technology pose the least expected cost. The other three categories, geopolitics, economy and environment, have roughly the same expected cost and lie between the two extremes.

One final comment; a thought experiment, really. Let's assume that the list is generally accurate, that these are the risks the world will face in the next ten years and that they are ranked more or less according to their level of importance. How do the intelligence resources that should be watching these risks stack up against the risks themselves? How much is spent on intelligence concerning chronic disease in the developed world and is that amount weighted appropriately given this issue's importance?

Clearly every one of these global risks has national security implications but not every one translates into a traditional national security priority. Which ones do the intelligence community cover and which ones do other agencies cover? Given that they are all interconnected (the full document even has a network diagram showing how they are all interconnected), where does the integrated picture emerge? Who is responsible for producing it?

Even in those areas where the intelligence community clearly has a stake, have resources been allocated appropriately? For example, failed and failing states have an expected cost that is roughly twice as much as international terrorism. Do we spend twice as much on intelligence concerning failed and failing states as we do on intelligence concerning terrorism? If we don't, are we being rational?

I can only speculate about the answers to these questions but I would be interested in your comments.

(Note: The full spreadsheets for the data above are available here)

Sunday, January 20, 2008

Epilogue -- The Revolution Begins On Page Five: The Changing Nature Of The NIE And Its Implications For Intelligence

Part 1 -- Welcome To The Revolution
Part 2 -- Some History
Part 3 -- The Revolution Begins
Part 4 -- Page Five In Detail
Part 5 -- Enough Exposition, Let's Get Down To It...
Part 6 -- Digging Deeper
Part 7 -- Looking At The Fine Print
Part 8 -- Confidence Is Not the Only Issue
Part 9 -- Waffle Words And Intel-Speak
Part 10 -- The Problem With “If”
Part 11 -- One More Thing
Part 12 -- Final Thoughts


Epilogue

One of the reasons I decided to post this "article" as a series on this blog was to experiment with the blog format; to see how it might work (or not) with more academic style articles. The purpose of this post is to discuss what I have learned through this process.

First, though, I want to thank all of the people who took the time to comment on the blog or to drop me a line. The responses, even when they took exception to my findings, were overwhelmingly positive regarding this "experiment".

If this was an experiment, what, then, were the results? I will start by laying out some of the facts and then discuss what I think they mean in the context of academic publication and scholarship.

This blog, which has been around since just before Thanksgiving last year, currently gets about 1000-1500 unique visitors a week and about twice as many page views. As I understand it, "unique visitor" is a term of art for a "hit" from a single person, and "page views" describes how many pages a unique visitor looks at before he or she departs the site. Confusingly, different unique visitors can be the same person if they come back at different times. If, for example, someone hits the site, looks at 5 pages (clicks on links to 5 posts), departs, comes back at a later date and looks at 3 pages, then the site has had 2 unique visitors and 8 page views.
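The worked example above can be reduced to a couple of lines of code, assuming (as the text does) that each visit counts once as a unique visitor and every page loaded during a visit counts as a page view:

```python
# Each entry is one visit; its value is the number of pages viewed
# during that visit. A returning reader shows up as a second visit,
# which is why "unique visitors" can overcount actual people.
visits = [5, 3]  # same person: 5 pages on the first visit, 3 on the second

unique_visitors = len(visits)  # 2 "unique" visitors (really one person)
page_views = sum(visits)       # 8 page views
```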

Clearly, given the nature of the series, the number of people who actually read any or all of the posts in the series is something well below the approximately 3000-4500 unique visitors the site had over the 2.5 weeks the series ran.

While there are all sorts of packages available, I have installed only the most basic analytics software on the site. This software allows me to know who is hitting the site and what they are reading in only the grossest possible sense. My own estimation, based on an extrapolation of the numbers I do have, is that the series had 1000-1500 unique visitors representing no fewer than 100 and no more than 300 real live people who read all or most of the posts in the series. It is likely that another 100-300 people read at least one part of the series over the last 2.5 weeks.

One of the main difficulties with Intelligence Studies as an academic discipline is that there are relatively few journals in it. Moreover, since the economics of journal publishing are driven more by subscriptions than the popularity of the articles in the journal, comparing the number of readers from journal to journal and from discipline to discipline is fraught with difficulty.

That said, I have tracked down two interesting numbers for comparison purposes. The first indicates that journal articles in the British Medical Journal average 1168 hits (not all of which will be unique visitors) per article in the week after publication. The second, from the British Library, indicates that the average number of readers per annum of an average journal article ranges from 500-1500 with an average of 900 (this presumably includes people who will read only part of the article).

I was, frankly, pretty surprised to see these comparison figures, and they suggest that the way forward for the intelligence studies discipline is online journal publishing. If this modest effort can get roughly the same number of hits (largely from professionals in the discipline) over generally the same time period as the British Medical Journal, and roughly the same number of readers in 2.5 weeks that the less widely read (but far more widely distributed) journal articles analyzed by the British Library get in a year, then people in the profession appear quite comfortable with the electronic distribution of scholarship.

Beyond the question of readership, there is also the more important question of scholarship. This is much more difficult to get at by looking at the numbers, however. On the face of it, the series falls well within normal limits for journal articles. Putting all the pieces together adds up to about 9000 total words, which would, with charts, graphs and bibliography, make a respectably substantial journal article (about 30-40 pages, depending on the journal). This is far longer than mathematics articles (which average about 12 pages) and far shorter than law review articles (which have recently begun to impose page limits in order to bring the number of pages down to 70 or so...). A quick review of page lengths in Intelligence and National Security suggests that 30-40 pages is within the range of "normal" for that publication.

Likewise the methodology and collection of the data were well within the norm for academic articles. I took a discrete but logically connected subset of National Intelligence Estimates and analyzed the way in which they were written, looking for patterns that emerged from the data analysis. The results are easily open to verification -- anyone else can do the same thing I did -- and the documents analyzed were all primary sources. I also tried to indicate where I saw weaknesses in my method and why I thought I could still make the evaluation I was making.

I also tried to be suitably "academic" in the tone of the article. I think I largely succeeded, while noting an occasional descent into "blog-speak". Many of the readers of this series of posts are students at Mercyhurst College and I wanted to make the series as interesting as possible. Having read many academic articles over the years, I also recognize that a purely academic tone is certainly not required...

Citing sources proved particularly easy with this form of publication. My intent was to turn my endnotes into hyperlinks within the posts. The paper contains 75 hyperlinks to sources outside of the document. In a print journal, all of these would have to be endnotes. I am not entirely satisfied with this method for sourcing, however. For one thing, the hyperlink cannot take the reader to the exact place in a lengthy document to which the post refers. I think if I do such an experiment again (and I think I am likely to), I will include the page number in a parenthetical immediately after the hyperlink. While this article was fairly easy to write without reference to non-web-based sources, I expect that this will not always be the case. I had planned to just include an endnote and put it at the bottom of each post if I had to refer to something that was not on the web but, thanks to the DNI and the CIA Center for the Study of Intelligence, all of the major references were readily available.

The fundamental element of traditional academic scholarship that was missing from the process was peer review prior to publication. I would make three comments here. First, the problems with the peer review system are well known to any academic. Some have gone as far as to claim that it is less a system for determining quality than it is a system for enforcing acceptability. Second, the notion of what constitutes adequate peer review is changing dramatically with any number of experiments on-going. Finally, the ability to comment that is provided by blogging technology changes and adds both depth and nuance to traditional notions of peer review.

Consider the traditional process. An article goes in and it is assigned to various referees who make independent and anonymous reviews of the work prior to publication. The readers rarely get any insight into reviewer comments or questions. Comments from the readers likewise have to go back to the editors and may or may not show up in a later issue of the journal. The best indication of the quality of an article is likely the number of times it is cited in other works -- something that is not known for years after the article is published.

With blogging technology, the peer review process becomes an integral part of the writing process. It happens simultaneously, in more or less real time. A variety of different metrics (including the ones discussed above) are more or less immediately available.

All in all, the process of writing this article and posting it as a series on a blog was extremely gratifying. I enjoyed the research, writing and review processes much more than I normally do.
Again, thank you all for your comments. They were genuinely appreciated.

Thursday, January 17, 2008

Part 12 -- Final Thoughts (The Revolution Begins On Page Five: The Changing Nature Of The NIE And Its Implications For Intelligence)

Part 1 -- Welcome To The Revolution
Part 2 -- Some History
Part 3 -- The Revolution Begins
Part 4 -- Page Five In Detail
Part 5 -- Enough Exposition, Let's Get Down To It...
Part 6 -- Digging Deeper
Part 7 -- Looking At The Fine Print
Part 8 -- Confidence Is Not the Only Issue
Part 9 -- Waffle Words And Intel-Speak
Part 10 -- The Problem With “If”
Part 11 -- One More Thing

Part 12 -- Final Thoughts

The Pontiac Solstice, when it came out last year, marked a significant change in the way most people think about the Pontiac brand. The picture below, taken from the Wikimedia Commons, does not do the car justice. If you haven't seen one, go to the Solstice website or, even better, park next to one and get out and take a look. If you like automobiles at all, it is hard not to like the look of the Solstice. More importantly, the Solstice was something that no one could imagine coming out of the stodgy and decidedly old-fashioned workshops at Pontiac. It was revolutionary.


It did not take long, however, before the automobile critics got the car and tore it apart. They were not kind: the engine was noisy, the interior styling was poor, the ragtop was hard to work with and the car's reliability (based on past Pontiac performance) was expected to be much worse than that of other sports cars. My guess is the critics figured that anything that looked that good on the outside needed to be just as good on the inside.

The parallel with the previous 11 parts of this experiment in blog-based "scholarship" (which has unquestionably descended into editorializing at times) is undeniable. What the Intelligence Community has done with the Iran National Intelligence Estimate (NIE) is unquestionably revolutionary. The process of taking what was likely a Top Secret codeword document and revising it into an unclassified version for the world to see is, by itself, extremely difficult. To complicate matters by then requiring the IC to come up with a one-page explanation of what amounts to the IC's current theory of the intelligence estimate seems to ask almost too much of the system.

Revolutions are not linear, however; they are iterative. Washington needed Valley Forge before he could have Yorktown, physics needed Newton before it could have Einstein, and I am sure the Pontiac designers are working to fix any of the criticisms they consider legitimate. The distance covered from previous NIEs to the Iran NIE is massive but, as the previous 11 parts of this analysis suggest, there is more that should be considered. Specifically:
  • Take advantage of the theory already articulated to make estimates clearer and more nuanced. To put it more simply: Actually do what you said you were going to do on Page 5.
The decision to make the Key Judgments from at least some of the NIEs public had to have been difficult, but the benefits are tangible. Not only does it inform the electorate, it prevents the elected from ignoring inconvenient assessments. In a day and age when massive information flows threaten to swallow us all, it makes intelligence, and the intelligence community that produces it, more relevant, not less. While there is clearly work still left to be done, the IC has accomplished much in a very short time. The revolution has begun; long live the revolution!

Wednesday, January 16, 2008

Part 11 -- One More Thing (The Revolution Begins On Page Five: The Changing Nature Of The NIE And Its Implications For Intelligence)

Part 1 -- Welcome To The Revolution
Part 2 -- Some History
Part 3 -- The Revolution Begins
Part 4 -- Page Five In Detail
Part 5 -- Enough Exposition, Let's Get Down To It...
Part 6 -- Digging Deeper
Part 7 -- Looking At The Fine Print
Part 8 -- Confidence Is Not the Only Issue
Part 9 -- Waffle Words And Intel-Speak
Part 10 -- The Problem With “If”

Part 11 -- One More Thing

The other thing that changed with the release of last month's Iran National Intelligence Estimate (NIE), at least in the publicly available Key Judgments, is the nature of the Scope Note (see page 4 of the Iran NIE).

Prior to the Iran NIE, the Scope Note was either a list of additional analytic cautions or was not released at all. The Iran NIE, as with many of the other factors outlined in the previous 10 parts of this series, changed all that. The Scope Note still contains some "administrative" data and additional caveats, but it is now primarily concerned with the specific questions the Intelligence Community (IC) has been asked to answer and with some of the assumptions made in preparing the document. These were likely taken from a formal Terms of Reference (TOR) document, which normally precedes the creation of an NIE and tells the analysts in the National Intelligence Council (NIC), in broad terms, what questions they are supposed to answer. As the NIC puts it in the prefatory comments to the Iran NIE: “The TOR defines the key estimative questions, determines drafting responsibilities, and sets the drafting and publication schedule.”

The Scope Note from the Iran NIE asked five questions:

  • What are Iran’s intentions toward developing nuclear weapons?
  • What domestic factors affect Iran’s decisionmaking on whether to develop nuclear weapons?
  • What external factors affect Iran’s decisionmaking on whether to develop nuclear weapons?
  • What is the range of potential Iranian actions concerning the development of nuclear weapons, and the decisive factors that would lead Iran to choose one course of action over another?
  • What is Iran’s current and projected capability to develop nuclear weapons? What are our key assumptions, and Iran’s key chokepoints/vulnerabilities?

If these are the questions, what then are the answers? Take the first question, for example: "What are Iran's intentions toward developing nuclear weapons?" Read through the NIE yourself. Where is that question clearly and unambiguously answered? Is this the answer: “…we also assess with moderate-to-high confidence that Tehran at a minimum is keeping open the option to develop nuclear weapons”? If so, it is not much of an answer. "[A]t a minimum is keeping open the option..." not only sounds vague but also borders on plain common sense. Maybe this is the answer: “We do not have sufficient intelligence to judge confidently whether Tehran is willing to maintain the halt of its nuclear weapons program indefinitely while it weighs its options, or whether it will or already has set specific deadlines or criteria that will prompt it to restart the program.” I may be wrong, but that sounds to me like "We don't know," and, I would argue, saying so up front and clearly (instead of in the middle of the Key Judgments) would have significantly changed the tone and content of the post-release discussion concerning this estimate.

Some of the other questions suffer from the same lack of a clear answer, while the form of the Key Judgments makes finding the answers to these questions as difficult as possible. Search the document for the words "intent" or "intention". Outside the title and the Scope Note, these words are never used again. Search for "factor" or "domestic" or "external". Wouldn't you expect these words, so prominent in the questions the decisionmakers asked, to be mentioned somewhere in the Key Judgments? Wouldn't the use of these words signal to the decisionmakers reading this document that "here is the answer" to the questions they asked? Yes, of course, the sophisticated readers for whom these NIEs are primarily written can figure this all out themselves, but why should they have to? What fundamental intelligence principle would be abandoned by making the relationship between the question and the answer clearer?

If you are going to state up front that there are five questions to be answered, what then is wrong with organizing your answers around those five questions? Doesn’t it make way too much sense to say, in response to the first question, something like, "With X degree of confidence we assess that Iran’s intentions towards developing nuclear weapons likely are…”? Such a structure makes it clear what question is being answered and follows the guidelines laid down on page 5 of this same estimate.

I am not suggesting that the Intelligence Community (IC) turn into an "answer service." I strongly believe that the IC has an obligation to not only answer the questions it has been asked but also to address those questions people should be asking. I think it is the IC’s duty to look broadly, deeply, at these questions; to get at the nuance that only their expertise allows. That said, you still need to answer the question. Clearly. And if the answer is “I don’t know” then say so up front.

Part 12 – Final Thoughts

Tuesday, January 15, 2008

Part 10 -- The Problem With "If" (The Revolution Begins On Page Five: The Changing Nature Of The NIE And Its Implications For Intelligence)

Part 1 -- Welcome To The Revolution
Part 2 -- Some History
Part 3 -- The Revolution Begins
Part 4 -- Page Five In Detail
Part 5 -- Enough Exposition, Let's Get Down To It...
Part 6 -- Digging Deeper
Part 7 -- Looking At The Fine Print
Part 8 -- Confidence Is Not the Only Issue
Part 9 -- Waffle Words And Intel-Speak

Part 10 -- The Problem With “If”

So, “could” doesn't work. Nor do “may”, “might” and “possible” (if I had a nickel for every time a decisionmaker has said to me, “Son, anything is possible”, I would be wealthy). Even the only occasionally used “we cannot dismiss” or “hard pressed” convey such a lack of definition that analysts should restrict or eliminate them from their vocabulary as well. What’s more, they are all unnecessary if the guidelines laid down in the Explanation Of Estimative Language – the Page 5 of the title – are followed more closely. What, then, is the problem with “if” and the other words on the list? Consider this statement from the recent Iran National Intelligence Estimate (NIE):

  • We assess with high confidence that Iran has the scientific, technical and industrial capacity eventually to produce nuclear weapons if it decides to do so.
Lots of problems here, of course: the use of a statement of confidence without a corresponding Word of Estimative Probability, and the use of the word “eventually” (“Hell, son, the world will come to an end…eventually.” Another one I could have funded my retirement with). The “if” clause is particularly problematic, though. "If" clauses have a tendency to dodge the real question. What is the real question here? Isn’t it “Will Iran decide to build a nuclear weapon or not?” That is a much more important and interesting question than the one this sentence actually answers concerning Iran's scientific, technical and industrial capacity. It is sort of like your doctor coming in and saying, “We assess with high confidence that eventually you will not be able to drive your car, if you have cancer.” Glad to hear it, Doc, but can you elaborate on that last part a bit…

“Unless” and “Should” clauses used like “if” are equally worrisome. Consider these two sentences, the first from the Global Terror NIE and the second from the 2nd Iraq Stability NIE:
  • Should al-Zarqawi continue to evade capture and scale back attacks against Muslims, we assess he could broaden his popular appeal and present a global threat.
And
  • Broadly accepted political compromises required for sustained security, long-term political progress, and economic development are unlikely to emerge unless there is a fundamental shift in the factors driving Iraqi political and security developments.
In the first instance, the unanswered intel question is whether or not Zarqawi will scale back his attacks. In the second, the real question is whether there is likely to be the fundamental shift that the IC has identified as necessary. Admittedly, both questions contain elements that are probably outside of the NIC's purview. In the first instance, the issue of whether or not Zarqawi will continue to evade capture falls largely within the realm of those charged with hunting him, and going beyond this carefully phrased clause might somehow jeopardize those operations. In the second case, it is less clear whether the "fundamental shift in factors" is a euphemism for the potential results of planned US and allied action. The IC is, I think, rightly cautious about commenting on the possible success or failure of US plans. While the IC is well aware, in general, of the capabilities and limitations of the US government, it spends most of its time and energy focused externally, on threats to and opportunities for the United States. It is not, and should not try to be, the expert in applying diplomatic, informational, military or economic pressure outside of the narrow bounds traditionally labeled "covert action" (and maybe not there either...).

Despite this, there are clear intelligence questions here that have gone unanswered through the use of "should" and "unless". Is Zarqawi likely to scale back his attacks or not? Are the factors driving Iraqi political and security developments that are independent of US action likely to favor the broadly accepted political compromises deemed necessary? It seems clear that, using the guidance from the EEL page, the IC could make these types of estimates more useful to decisionmakers.

Not all “if” clauses are awful, though. There are some, like this one from the Iran NIE, “Barring such acquisitions, if Iran wants to have nuclear weapons it would need to produce sufficient amounts of fissile material indigenously—which we judge with high confidence it has not yet done” where the analysis actually answers the question implied by the “if” clause. Therefore, in computing the percentages in the table in Part 9, I only included “if” clauses that fit the “waffle-word” category.

Part 11 -- One More Thing

Monday, January 14, 2008

Part 9 -- Waffle Words And Intel-Speak (The Revolution Begins On Page Five: The Changing Nature Of The NIE And Its Implications For Intelligence)

Part 1 -- Welcome To The Revolution
Part 2 -- Some History
Part 3 -- The Revolution Begins
Part 4 -- Page Five In Detail
Part 5 -- Enough Exposition, Let's Get Down To It...
Part 6 -- Digging Deeper
Part 7 -- Looking At The Fine Print
Part 8 -- Confidence Is Not the Only Issue

Part 9 -- Waffle Words And Intel-Speak

The Words Of Estimative Probability (WEPs) outlined in Part 8 are not the only terms that analysts use to express probabilistic judgments. The recent National Intelligence Estimates (NIEs) also specifically refer to a series of words such as "might" or "may" and phrases such as “we cannot dismiss” or “we cannot rule out” that are meant to signify events of undetermined probability or events that are remote but significant if they do occur. Analysts often perceive the use of some of these types of words as unavoidable if they wish to convey the full range of possibilities inherent in an estimate. Decisionmakers have another attitude about these words. They call them "waffle words" or "intel-speak" and often believe that the primary reason for their inclusion is to cover the analyst’s backside.





The six public estimates reviewed in this sample contain a number of these types of words. Again, I counted only examples where the analyst was making an estimate (excluding, for example, those times where “might” was used as a noun). I included a broader range of waffle words than some might agree to at first blush. I hope to outline the problems with each, but three conclusions jump out from the chart above.

First, there appear to be roughly the same number of waffle words as there are statements containing WEPs (in fact, in the Iran NIE, the number was exactly the same) across the entire set of NIEs examined. I think this is a bad thing and will argue (hopefully convincingly) later on that, using the system the Intelligence Community (IC) has already laid out, there is not only a good reason not to use waffle words but also a simple way to keep from ever having to use intel-speak again.

Second, it seems likely that the ratio of waffle-word sentences to WEP sentences is a good indicator of overall confidence in the estimate. I find it fascinating, for example, that the first Iraq Stability NIE contained over three times as many sentences using waffle words such as "could" and "might" as sentences containing more meaningful WEPs such as "likely". Such a strong preference for one type of formulation over another sends a strong signal that the analysts involved were (perhaps unconsciously) hedging their bets in a very real way.

The third conclusion is that there is a moderately strong preference for the waffle words “could” and “if”. It is easy to see what is wrong with “could”. Anything "could" happen. To tell a decisionmaker that something could happen is to increase his or her uncertainty, not reduce it. Certainly, with no WEP or confidence statement to assign even a rough probability to the described event, decisionmakers are left on their own to figure out an appropriate level of time or other resources to devote to thwarting this nebulous threat or taking advantage of this ephemeral opportunity. This is tantamount to asking the decisionmaker to be the analyst! “All I can figure out is that something could happen, boss,” the IC seems to be saying. “You need to figure out how likely it is so you can assign the appropriate resources to deal with it. Oh, and by the way, if you guess wrong, I will still be able to say that I warned you.” It is easy to see why decisionmakers don’t like “could”.

Take a look at this statement from the August 2007 Prospects for Iraq’s Stability NIE:

  • A multi-stage process involving the Iraqi Government providing support and legitimacy for such initiatives could foster over the longer term political reconciliation between the participating Sunni Arabs and the national government.

Or it could not. The use of the word “could” here is really not very helpful. I suppose it is possible that the policymakers who were involved with the Iraq situation at the time were not aware of this possibility, but I kinda doubt it. Now the policymakers themselves have to figure out whether they should pursue a policy that supports the Iraqi Government along these lines or not. The IC has pointed out the obvious (to the decisionmakers, anyway) and has not given them any sense of which way it will go.

How might it have been better phrased? What about:

  • We assess with low confidence that a multi-stage process involving the Iraqi Government providing support and legitimacy for such initiatives will likely foster over the longer term political reconciliation between the participating Sunni Arabs and the national government.

While the specific terms I have used here ("low confidence" and "likely") are clearly notional (I don't know what words the authors would have used had they adopted this formulation), this statement is entirely consistent with the guidelines laid out in the Explanation of Estimative Language (EEL) page. It does what the intel community should do: make the call, while still being properly caveated. It does not ask the decisionmaker to be the analyst as well. Finally, the analyst, by using consistent terminology and providing a more useful estimate, is less open to unjustified criticism in any sort of after-action review.

The point is that appropriately using confidence combined with a WEP allows an analyst to make the best call with the facts he or she has at the time yet still send a signal to the decisionmaker about the firmness of the estimate without having to resort to useless waffle-words. It is a better system. It is better because it is clearer. It is better because it is consistent. It is better because it stands up to after-action scrutiny. It seems to be what the IC has in mind on page 5 but it is clearly not what the IC is doing (at least not with its most recent public NIEs).

Tomorrow: Part 10 -- The Problem With “If”

Friday, January 11, 2008

Part 8 -- Confidence Is Not the Only Issue (The Revolution Begins On Page Five: The Changing Nature Of The NIE And Its Implications For Intelligence)

Part 1 -- Welcome To The Revolution
Part 2 -- Some History
Part 3 -- The Revolution Begins
Part 4 -- Page Five In Detail
Part 5 -- Enough Exposition, Let's Get Down To It...
Part 6 -- Digging Deeper
Part 7 -- Looking At The Fine Print

Part 8 -- Confidence Is Not the Only Issue

Some 29% of the sentences in the Iran National Intelligence Estimate (NIE) do contain Words of Estimative Probability (WEPs), however. As the chart below shows, this is pretty much in line with other NIEs. The chart outlines the number of uses of a particular word in an estimative sense in each of the eight NIEs I examined. Again, I only looked at the words in the Key Judgments (not in any of the prefatory matter or in any of the full text or appendices). The column on the far right shows the percent of the time a particular WEP showed up in NIEs generally. In other words, "probably" was used in 33 sentences and there were 263 sentences total in the 7 NIEs examined, so it showed up about 13% of the time. I am also well aware that such a simple review is fraught with difficulty given the complexity of the English language but, since I am only looking for broad trends, I believe that such a review is an appropriate method for analyzing the way in which these estimates were written and the way in which they are changing.
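The sentence-counting method described here is simple enough to sketch in code. The snippet below is a notional illustration only: the WEP list is a subset of the terms on the EEL page, the sample text is invented, and the crude substring matching is exactly the kind of simplification acknowledged above (it cannot, for instance, exclude "might" used as a noun without extra work).

```python
import re

# A subset of the "authorized" WEPs from the Explanation of
# Estimative Language (EEL) page (illustrative, not exhaustive)
WEPS = ["remote", "very unlikely", "unlikely", "even chance",
        "probably", "likely", "very likely", "almost certainly"]

def wep_sentence_share(text):
    """Count the sentences in `text` that contain at least one WEP."""
    sentences = [s for s in re.split(r"(?<=[.?!])\s+", text) if s.strip()]
    matching = sum(1 for s in sentences
                   if any(w in s.lower() for w in WEPS))
    return matching, len(sentences)

# Invented sample text, not actual Key Judgments
sample = ("Iran probably will continue enrichment. "
          "The program was halted in 2003. "
          "A restart is very unlikely before 2010.")
matching, total = wep_sentence_share(sample)
print(f"{matching} of {total} sentences contain a WEP "
      f"({100 * matching / total:.0f}%)")  # 2 of 3 sentences (67%)
```

Applied to the Key Judgments of the seven estimates, this is the arithmetic behind the figures quoted above: 33 "probably" sentences out of 263 total is roughly 13%.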





In fact, the Iran NIE is well within the range of other NIEs with respect to the percent of sentences containing WEPs. Furthermore, the Iran NIE does not use any “unauthorized” WEPs. That is to say, only WEPs specifically listed on the Explanation of Estimative Language (EEL) page are used in the Iran NIE. This was not the case in previous NIEs, which used (though not often) statements that were undefined at best and misleading at worst. Consider the use of “most likely” in the August 2007 update to “Prospects for Iraq’s Stability”:

  • We judge such initiatives are most likely to succeed in predominantly Sunni Arab areas, where the presence of AQI elements has been significant, tribal networks and identities are strong, the local government is weak, sectarian conflict is low, and the ISF tolerate Sunni initiatives, as illustrated by Al Anbar Province.

“Most likely” could mean many things in this context since there is no baseline probability with which to compare it. The initiatives referenced in the report could be likely to succeed or unlikely to succeed; the reader cannot know from the text. All we can know is that they are "most likely" to succeed in the predominantly Sunni areas. Other formulations, such as “much less likely” and “increasingly likely”, suffer from the same problem. “Not likely” is the only place where I am clearly quibbling, as it is obviously synonymous with “unlikely”. I just think it is silly to state that the authors intend to use “unlikely” on page 5 (the EEL page) and then ignore that and use “not likely” in the text. If the two are truly synonymous, then use the one you said you were going to use. If they aren’t synonymous, then explain the difference. You can’t have it both ways.

Beyond the mere use of WEPs, there also appears to be an issue with which WEPs predominate. Again, there is a strong pattern: a clear preference over the last six public NIEs for the word “probably”. In fact, 73% of authorized WEPs and 62% of all WEPs used in the last six NIEs are “probably”. It is also interesting to note that the only non-millennial NIE examined, the 1990 Yugo NIE, did not use “probably” at all (whether this pattern holds, and whether it was a good thing, I will leave to other researchers).

If the analysts involved in these estimates genuinely believe that all these events are “probable” and not somewhat more or less likely then there is little to discuss. The extreme overuse of the term suggests other explanations, however. "Probably" is arguably one of the broadest WEPs in terms of meaning (see Figure 1 in the paper linked here). Fairly clearly it means that the odds are above even chance but it seems open to interpretation from there.

Thus, analysts could be using "probably" as an analytic safe haven. Relatively certain that the odds are above 50% but unwilling to be more aggressive and use a phrase such as “highly likely” or “virtually certain” and unaware or unable to use expressions of confidence to appropriately nuance these more aggressive terms, these analysts default to “probably”. Since the NIE is a consensus estimate combining input from all 16 intelligence agencies, it is also possible that "probably" was the one word upon which everyone could agree; that it represents, essentially, a compromise position. Either way, such a move is “safe” in terms of getting the answer broadly correct but hurts the decisionmaker who, in the end, must take action and allocate resources. If analysts are more certain than they are willing to put in writing, the decisionmaker is deprived of the analysts’ best judgment and will arguably make less informed decisions.

(Note: The statistical analogy to the issue described above is the classic problem of calibration versus discrimination. For additional insights into this issue, I refer you to Philip Tetlock’s book Expert Political Judgment or to this site.)

Monday: Part 9 -- Waffle Words And Intel-Speak

Thursday, January 10, 2008

Part 7 -- Looking At The Fine Print (The Revolution Begins On Page Five: The Changing Nature Of The NIE And Its Implications For Intelligence)

Part 1 -- Welcome To The Revolution
Part 2 -- Some History
Part 3 -- The Revolution Begins
Part 4 -- Page Five In Detail
Part 5 -- Enough Exposition, Let's Get Down To It...
Part 6 -- Digging Deeper

Part 7 -- Looking At The Fine Print

Let’s take a look at an example from the Iran National Intelligence Estimate (NIE) and see if we can figure out what is going on here.

  • We judge with high confidence that Iran will not be technically capable of producing and reprocessing enough plutonium for a weapon before about 2015.

What is missing in this example and many other statements in the NIE, of course, is an estimate of likelihood (or Word Of Estimative Probability (WEP), if you prefer). The estimate does not say “…will not likely be technically capable …” Instead, the verb phrase “will not be technically capable” implies certainty about a future event – which is, by definition, uncertain.

Even in cases where the event happened in the past but the information regarding it contains inconsistencies or uncertainties (in other words, the event is not definitively factual), it seems inappropriate not to use estimative language in conjunction with a statement of confidence. Consider this statement, also from the Iran NIE: “We assess with high confidence that until fall 2003, Iranian military entities were working under government direction to develop nuclear weapons.”

In other words, if the Intelligence Community (IC) knew for certain that Iranian military entities were working under government direction to develop nuclear weapons then they should not be indicating a probabilistic statement by saying “we assess”. If, on the other hand, they are not entirely certain, then they should not say “Iranian military entities were working” but rather “it is virtually certain that Iranian military entities were working” or whatever the analysts believe is the appropriate estimate of likelihood. Mixing the formulations makes the definitions laid out in the Explanation of Estimative Language (EEL) page -- the "page five" in the title -- meaningless.

This is problematic for several additional reasons. First, it is bound to be confusing to the reader. Having carefully explained that “We judge” is an indicator of estimation, but then phrasing the statement in terms of certainty, makes the attentive reader wonder what the IC really means: is this statement a fact or an assessment? It could be read both ways. Second, another graduate student with whom I have worked, Mike Lyden, has done some very interesting research comparing NIE estimative statements against historical fact (the thesis containing the research is currently available only through interlibrary loan). Across the last 40 years, estimates that used WEPs were about 75% accurate, while statements that used words of certainty hovered around 50% accuracy (the sample size was large enough that this difference was statistically significant to several decimal places, as I recall). Mike speculates that this difference may be tied up with psychological notions of confidence (explained later) but, whatever the reason, the evidence is pretty compelling: the Intelligence Community makes better estimates when it does not use words of certainty.
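For readers who want a feel for when a gap like 75% versus 50% becomes statistically significant, the standard two-proportion z-test applies. The sample sizes below are hypothetical, invented purely for illustration (Lyden's actual counts are not reproduced here); only the 75% and 50% accuracy figures come from the discussion above.

```python
import math

def two_proportion_z(p1, n1, p2, n2):
    """z statistic for the difference between two sample proportions,
    using the pooled estimate of the common proportion."""
    p_pool = (p1 * n1 + p2 * n2) / (n1 + n2)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Hypothetical: 300 WEP statements judged 75% accurate versus
# 300 certainty statements judged 50% accurate
z = two_proportion_z(0.75, 300, 0.50, 300)
print(round(z, 2))  # 6.32, far beyond the 1.96 threshold for p < 0.05
```

Even at these modest (assumed) sample sizes, the difference is overwhelmingly significant, which is consistent with the "several decimal places" recollection above.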

Another possibility, of course, is that I have got it all wrong; that I have mischaracterized what the IC intended to do when they defined “confidence” the way they did. Indeed, there are several other ways that the word "confidence" could be interpreted that would work in this sentence.

First, confidence could refer to psychological confidence or the way the analyst "feels" about the assessment. Psychologists have long known that the more information you get the more confident you feel in your assessment of a situation. Up to a point, this increasing confidence is warranted. Fairly quickly, however, your mind forms a more or less rigid conceptual model of the problem you are facing so that your mind takes each new fact and tends to either force it into the existing model or discard it as irrelevant. The net effect of this is that, while you feel increasingly confident, your chances of being correct stay about the same. Psychologists call this Overconfidence Bias and it is generally considered a bad thing in analysis. Moreover, it is well known within intelligence circles, having been covered extensively by Richards Heuer in his classic, Psychology Of Intelligence Analysis. It is, therefore, unlikely to be what the IC means when it talks about confidence on the EEL page.

Second, confidence is often used as a synonym for likelihood as in “I am highly confident that New England will win the Super Bowl.” While this works in casual speech, this certainly makes no sense in the context of this NIE. The EEL page defines an entirely different way of ascribing levels of likelihood to its assessments and specifically states that the level of confidence language applies “to our assessments (italics mine).” To use confidence as a synonym for likelihood would be tantamount to the IC saying one thing and doing another which, well, they have already done. I don’t, however, think they would be that silly again. For the same reason, the introductory phrases, “we assess”, “we judge” and “we estimate” can’t be considered to be expressions of likelihood either.

Third, and likely most closely related to what the IC means, is a statistical notion of confidence, commonly expressed as a margin of error. The form of the statement is quite familiar to most of us: “Candidate X leads in the polls, 61 to 39% (plus or minus 3 percent).” This means (typically at the 95% confidence level, yet another statistical term) that Candidate X’s true support could be as low as 58% or as high as 64%. This form certainly seems to mirror the form examined in Part 4. High confidence under this interpretation would mean that the margin of error is low, that the true probability hovers near the estimate made by the authors of the estimate. The problem here comes in the way the IC has actually used confidence in these phrases. If they mean it to be interpreted statistically, it makes no sense to then say something that would be functionally equivalent to “…plus or minus 3 percent, Iran will not be technically capable…”. Statements like this only make sense when associated with a probability or, in the case of the NIE, an Estimate Of Likelihood.
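The polling form of statistical confidence is easy to reproduce. The sketch below uses the standard normal-approximation formula for a 95% margin of error; the sample size of 1,000 respondents is an assumption for illustration (the polling example above does not specify one).

```python
import math

def margin_of_error(p, n, z=1.96):
    """Normal-approximation margin of error for a sample proportion.

    p: observed proportion, n: sample size, z: critical value
    (1.96 corresponds to the 95% confidence level).
    """
    return z * math.sqrt(p * (1 - p) / n)

# Candidate X at 61% support in a hypothetical poll of 1,000 respondents
moe = margin_of_error(0.61, 1000)
print(f"61% plus or minus {100 * moe:.1f} percent")  # 3.0 percent
```

A poll of roughly 1,000 respondents yields the familiar "plus or minus 3 percent", which is why that figure shows up so often in news coverage; the point is that the margin attaches to a stated probability, which is exactly what is missing when the NIE pairs confidence with a statement of certainty.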

This, in turn, brings me back to the more general notion of analytic confidence that I discussed in Part 4. Certainly the IC does not want to convey numerical certainty and has said so (at least in early forms of the EEL page) but this idea of analytic confidence seems similar to the idea of statistical confidence. By using words (not numbers) that express likelihood and then using words (not numbers) to express its confidence in an expression of likelihood, the IC’s implied definition of analytic confidence would resonate with, but not mirror, what many people already generally understand, i.e. the statistical notion of confidence. Just as with statistical notions of confidence, however, this idea of analytic confidence only makes sense if there is an expression of likelihood to go with it.

Which leaves me with a problem. I don’t know what the IC means when they talk about confidence. The EEL page implies they intend to use it one way. Then they do something entirely different in the text and none of the possible variations in meaning makes any sense. They do it so many times that I can’t ascribe it to accident.

I am just an average Joe. The first alternative is that I just don’t understand. I am prepared to admit that. I would suggest, however, that the current form of the EEL page needs to be changed so that it is clearer. I guarantee that if I cannot understand what it means, there are many more average Joes who are struggling with it (or just ignoring it) as well.

The second alternative – and one that is a bit more unnerving – is that the IC does not know what it means when it says high, moderate or low confidence. Perhaps sometimes they are using it to describe how they feel about their position, sometimes they may be using it as a synonym for an estimate and sometimes they may mean it more statistically, leaving it up to the reader to figure out which it is from the context.

Tomorrow: Part 8 -- Confidence Is Not the Only Issue