Showing posts with label IBM. Show all posts

Sunday, March 21, 2010

In Search of Encrypted Search

Last December Moxie Marlinspike announced a $34 cloud service for auditing the strength of WPA passwords. The service tries to reconstruct the password from the user-supplied WPA session material and a hand-crafted dictionary of 135 million passwords. Marlinspike's cloud service searches through the optimised dictionary in just 20 minutes, a computation that would otherwise take 5 days or so on a standard desktop. Notice then that Moxie learns the password if it is found in his dictionary. This is an example of where customers might prefer Moxie to perform an encrypted search on their behalf – the parameters would be encrypted and Moxie would learn just a simple yes or no as to whether the corresponding password was found.

This is an encrypted search problem that could be solved in principle by the “fully homomorphic” encryption system devised by IBM researcher Craig Gentry in June of last year. There was much fanfare around the breakthrough and googling “ibm homomorphic encryption” returns over 21,000 results today. There is a long and informative article in Forbes under the headline IBM's Blindfolded Calculator, and you can also see a lecture by Gentry on his result here, given at the Fields Institute in Toronto last December.

Homomorphisms in RSA

But what is homomorphic encryption? Well, RSA gives us a nice example. Recall that the “textbook” RSA encryption of a message M is given as

E(M) = M^e mod N

It is not hard to verify that

E(M1) * E(M2) = E(M1 * M2)

which in words states that the product of two encryptions is equal to the encryption of their product. And this is the defining property of a homomorphism – given a function (here encryption) and two inputs (here messages) with a binary operation defined on those inputs (here multiplication), we can interchange the order of applying the function and the operation to get the same result. So we say that the RSA function is a homomorphism with respect to multiplication, or more simply, just a multiplicative homomorphism.

So for an unknown message M2, if we are given E(M2) we can multiply E(M2) by E(M1) to produce E(M1*M2) without needing to decrypt M2. In the early days of RSA this homomorphic property was deemed undesirable, due to posited scenarios like the following one (which in fact seems more likely today than it did 20 years ago). Let’s say Alice is going to transfer $M to the account of Bob and sends an encrypted message E(M) instructing her bank to perform the transfer. But Bob intercepts the message and multiplies it by E(2), the encryption of 2. The new message becomes E(2)*E(M) = E(2*M) and Bob doubles his money without having to decrypt E(M). The PKCS #1 format has removed the threat of manipulating messages in this way by adding padding and fixed fields to messages which can detect such (multiplicative) manipulation.
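Both the homomorphic property and Bob's doubling attack can be checked with a toy textbook RSA example (the tiny primes and the transfer amount below are purely illustrative; real RSA moduli run to thousands of bits):

```python
# Toy textbook RSA: tiny parameters for illustration only.
p, q = 61, 53
N = p * q                          # modulus, 3233
e = 17                             # public exponent, coprime to (p-1)(q-1)
d = pow(e, -1, (p - 1) * (q - 1))  # private exponent

def E(m):  # textbook RSA encryption: M^e mod N
    return pow(m, e, N)

def D(c):  # textbook RSA decryption: C^d mod N
    return pow(c, d, N)

# The multiplicative homomorphism: E(M1) * E(M2) = E(M1 * M2) mod N
M1, M2 = 7, 11
assert (E(M1) * E(M2)) % N == E(M1 * M2)

# Bob's attack: double Alice's amount without ever decrypting it.
M = 100                            # Alice's amount, unknown to Bob
doubled = (E(2) * E(M)) % N        # Bob computes E(2*M)
assert D(doubled) == 2 * M         # the bank decrypts to 200
```

The PKCS #1 padding mentioned above defeats this precisely by destroying the clean multiplicative structure of the raw exponentiation.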

While RSA is multiplicatively homomorphic it is not additively homomorphic, since in general

E(M1) + E(M2) != E(M1 + M2).

Getting back to the encrypted search problem, does RSA being a multiplicative homomorphism help us? Well not really. Say we are searching for a string S in some data D, where we are only given E(D). We could encrypt S to get E(S), and then compute

E(S) * E(D)  =  E(S*D)

but E(S*D) won’t tell us that S is somewhere in D because treating S and D as numbers then multiplying them is not related to searching. So we need to find an encryption scheme that has more homomorphic properties than just multiplication if we are to perform encrypted search. And this is what Mr. Gentry did.

Fully Homomorphic Encryption

This elusive property is what Gentry has created in his “fully homomorphic” encryption system, a system which supports the secure homomorphic evaluation of any function on encrypted inputs, such as search over data, and not just simple arithmetic operations like multiplication. The underlying computational problem that furnished this property came from neither RSA nor discrete logarithm systems but rather a geometric construction referred to as an ideal lattice. Lattices are special vector spaces that furnish computationally difficult problems; they have been used in several other cryptosystems such as NTRU, and provide an alternative to the “standard” hard problems from number theory.

Using the ideal lattice Gentry was able to show how to compute a “circuit” for problems such as encrypted search that can be evaluated without decrypting the data. Both the encrypted term and data are used as inputs to the circuit, and the trick is to devise a method for securely and correctly evaluating the gates of the circuit. Your computer uses built-in gates and circuits to evaluate computations, such as performing the multiplications and divisions to evaluate an RSA encryption or decryption during an SSL session for example. And as such the computations are small and fast. However for encrypted search the corresponding circuit needs to be built and represented in software, which leads to large and inefficient computations. So this is why the result is a breakthrough but not a practical one at the moment. Gentry has estimated that building a circuit to perform an encrypted Google search with encrypted keywords would multiply the current computing time by around 1 trillion.

The Outlook for Encrypted Search

It remains to be seen whether Gentry’s theoretical breakthrough can be converted into a practical service, in the cloud or otherwise. Gentry has worked with some other researchers from IBM and MIT to remove the requirement of using lattices and to just treat the fully homomorphic encryption as operating over the integers, which provides a simpler conceptual framework. Making the central ideas more accessible will increase the chances of finding more practical solutions. There is a recent review of homomorphic encryption by Craig Stuntz, giving more details and promising code in an upcoming post. Stuntz also points to an article by Gentry in the respected Communications of the ACM which begins

Suppose that you want to delegate the ability to process your data, without giving away access to it. We show that this separation is possible: we describe a "fully homomorphic" encryption scheme that keeps data private, but that allows a worker that does not have the secret decryption key to compute any (still encrypted) result of the data, even when the function of the data is very complex. In short, a third party can perform complicated processing of data without being able to see it. Among other things, this helps make cloud computing compatible with privacy.

Reading further requires a subscription, but it is clear that the work of Gentry and his co-authors is being positioned as a core service for cloud platforms, IBM’s platform in particular I suspect. However, just at the moment, you will have to trust Moxie with your WPA password.

Sunday, March 7, 2010

Passwords for USB Keypads

Bruce Schneier recently posted about a new USB stick that comes with its own on-board numeric keypad, permitting a password consisting of digits to be entered directly into the USB device to authorize unlocking. Such a stick and keypad would circumvent the recent USB password vulnerability that was derived from a poor implementation of password verification on the desktop.

[image: the Corsair USB stick with its on-board numeric keypad]

The stick in question from Corsair (shown above) also uses AES-256 encryption to protect the data on the stick. The AES-256 key for the stick is then likely to be derived from the user-supplied password (say using PBKDF2, as specified in PKCS #5 / RFC 2898), or used to protect a file which contains a full-length 256-bit key. In either case the 256-bit key will be derived from, or protected by, a password which has much lower entropy.
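As a sketch of that derivation step, here is a 256-bit key derived from a keypad password with PBKDF2 from Python's standard library (the password, salt and iteration count are illustrative assumptions; Corsair's actual scheme is not public):

```python
import hashlib
import os

password = b"314159"   # a digits-only password entered on the keypad
salt = os.urandom(16)  # per-device random salt (assumed)

# PBKDF2 (PKCS #5 / RFC 2898): stretch the password into a 256-bit AES key
key = hashlib.pbkdf2_hmac("sha256", password, salt, 100_000, dklen=32)
assert len(key) == 32  # 32 bytes = 256 bits
```

Note that no amount of iteration adds entropy: an attacker still only has to search the password space, not the 2^{256} keyspace.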

Bruce points out that a 77-digit password would be needed to produce the same entropy as a 256-bit key (since the logarithm to the base 10 of 2^{256} is about 77). I made the same point in Are AES 256-bit keys too large? where I calculated that a password based on the 94 printable ASCII characters would need to be 40 characters in length to achieve the same entropy as a 256-bit key (since the logarithm to the base 94 of 2^{256} is about 40). Deriving or bootstrapping AES keys from passwords is really an exercise in self-deception, especially when considering 256-bit keys. The discrepancy between the low entropy of passwords and the astronomical keyspace of AES-256 simply cannot be reconciled.
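Both figures are the same logarithm in different bases, as a quick check shows:

```python
import math

# Password length n such that b^n matches 2^256: n = 256 * log(2) / log(b)
digits = 256 * math.log(2) / math.log(10)  # base 10: decimal digits
chars  = 256 * math.log(2) / math.log(94)  # base 94: printable ASCII
print(f"{digits:.1f} digits, {chars:.1f} characters")  # ≈ 77.1 and 39.1
```

Rounding up gives the password lengths needed in practice.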

Perhaps the situation would improve if a biometric such as a fingerprint was used to bootstrap a 256-bit key. I did some research about a year ago and posted what I found in On the Entropy of Fingerprints. Some work has been done by IBM researchers who estimate the entropy of fingerprints to be at most 85 bits, or approximately the same as a length 13 password based on the 94 printable ASCII characters. An improvement, but still a long way from 256 bits of entropy.

Sunday, November 22, 2009

FUDgeddaboudit

I first came across the term fuhgeddaboudit in writing while reading The Black Swan, where Taleb was answering the question as to whether journalists can be relied on to unearth all of the silent evidence on a given topic – fuhgeddaboudit! The term is short for “forget about it”, popularized in US gangster media such as The Sopranos, and which Google defines as

  • An issue is not worth the time, energy, mental effort, or emotional resources
  • Whatever the current topic of discussion may be, regardless of who has stated it (even the speaker), it is thereby declared null and void and without merit

Both of these sentiments were called forth when I read the recent post from Anton Chuvakin on FUD-based security. Anton was reminding us that FUD is alive and well in IT Security, and actually it has nowhere to go but up in terms of mindshare since more sophisticated methods, such as ROSI, have nowhere to go but down.

Even though FUD is a blunt instrument, Anton argues that it is very effective when it comes to getting things done: it allows real issues to be brought to the table, and it limits reliance on decision makers to do the right thing (which they often don’t). He even jokes that FUD is a more pragmatic triad for security than the venerated CIA.

The whole post was ethically stomped on by RThomas (Russell Thomas) from the New School of Information Security blog (NSOIS) who stated in a comment that

FUD is the distorted and irrational exaggeration of fears and uncertainties for the sole purpose of manipulating the decision-maker.

The term "FUD" originated in the 1970s regarding IBM's selling tactics against competitors. The FUD technique was used to destabilize the decision-maker's thinking process regarding potentially viable alternatives. FUD issues raised could not really be answered by the decision-maker or the competitor, and so nagged at the back of the mind. They had the effect of causing the decision-maker to retreat to the safe decision, which was IBM. "Nobody ever got fired for buying IBM" was one famous phrase embodying the effects of FUD …

There are substantial reasons for framing risks in a way that goes beyond simple statement of facts and statistics, namely to deal with the psychology of risk. The ethical security or risk professional will take pains to present scenarios that are feared in a way that the decision-maker can understand and, most important, to see those scenarios in perspective relative to other possibilities and probabilities.

and Russ further drove home his point in an additional post over at the NSOIS, concluding that

Security is always a secondary objective to some other (upside) enterprise objectives. Security investments are always subject to evaluation relative to other investment alternatives, both inside and outside of IT. These are the realities of enterprise performance and leadership. Some security people may stomp their feet in protest, or resort to unethical tactics like FUD, but don’t delude yourself that you are making the world (or the enterprise) a better place.

This is the same sentiment that I addressed in my The Relegation of Security to NFR Status post. NFR stands for non-functional requirement and includes things like ensuring that there is sufficient network capacity, that the servers are adequately sized for peak loads, that help desk support is created, that back-up and recovery is deployed, that the web interface is friendly, and so on. FUD is not really IT Security’s opportunity to get some skin back in the functional (i.e. business) requirements game, as we will still look like uninvited gate crashers at best, and bullies at worst.

At the recent CSI meeting in Washington, as reported by Dark Reading, with my take here in Security Muggles, several CSOs opined that we need better communication with business people on their terms so that Security people can earn a seat at the decision-making table. They want to do more RSVP-ing than crashing.

Wade Baker over on the Verizon blog recently asked how people make security decisions, beginning from the frank assumption that

In most cases, it is impossible to precisely formulate all factors in the decision, so we abandon the “scientific” route and revert to some other method of making it (see below). This is where our predominantly engineering mindset hurts us. Instead, we should realize that organizations have always made decisions using varying amounts of information of varying quality. Our dilemma is not new. Valid and vetted approaches exist for structured decision problems with an abundance of precise data and also for unstructured problems with sparse amounts of “fuzzy” data. These approaches are out there and are eagerly waiting for us to apply them to problems in our domain.

FUD can be seen as a response to this reality, but not a very structured response, and one that ignores the methods and techniques developed in other fields for coping with decisions under uncertainty. Wade also ran a little survey on the approaches that security people use for decision-making and he received just over 100 responses. You can read his summary of the response here, and his summary graph is below.

[graph: summary of survey responses on security decision-making approaches]

Even given the small sample size, it seems that some people are opting away from FUD, far away in fact. I don’t think IT Security as a profession, or any profession (except maybe politics), has a long-run future based on FUD, since you don’t need much technical skill or experience to pursue this approach, and there are probably plenty of people for hire to carry out such campaigns who are not particularly well-qualified in security.

So ethical considerations aside, I have never considered FUD a long term strategy. Its persistence I imagine can be attributed largely to regular changes in the ranks of security decision makers, and a mind-numbing churn in technology and the IT sector as a whole. The same “new fears” are being presented to new people, as FUD has gone into heavy syndication in the IT Security industry and it’s always showing in re-runs somewhere. Put your time and energy somewhere else.

In short, fuhgeddaboudit!

Tuesday, April 14, 2009

On the Entropy of Fingerprints

A biometric is just a long password, that is easy to remember and easy to enter (with the right hardware support). But just how long a password? Can we measure and compare the “something you are” against the “something you know” authentication criteria? I went looking on the web and yes there are some answers.

In An Analysis of Minutiae Matching Strength three IBM researchers outline how to measure the entropy of fingerprints and their resistance to brute force attacks as compared to passwords. The authors state that sampled biometrics are much longer than passwords (several hundred bytes to over a megabyte) and typically have a high information content. A password of equivalent length would be difficult to remember.

The authors use two models to arrive at these conclusions. In both models they assume that an extracted fingerprint sample can be represented as an image of 300 x 300 pixels, which can be divided into 400 non-overlapping sites of 20 x 20 pixels. Each site holds a minutia detailing a ridge and valley pattern of a fingerprint, and each minutia point has an angle of orientation represented by d = 4, 8 or 16 values. A sample fingerprint is considered a match against a template if a minimum number of N sites match where N is 10, 12, 14, 16 or 18.

[image: a fingerprint sample divided into a grid of minutiae sites]

So this is like saying that you have a password of length 400 where each character takes on at least d values, and you accept a candidate password as correct if it matches the true password in at least N positions. Letting N = 10 and d = 4 yields just over 2^{85} possible fingerprint configurations. So attempting to randomly guess a correct fingerprint template in this model succeeds with a probability of only 2^{-85}. This is very low indeed and corresponds to a random length 13 password based on the 94 printable ASCII characters.
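A naive version of this count – choose N of the 400 sites and then one of d orientations for each, ignoring the dependency corrections the authors apply – lands in the same ballpark as the quoted figure:

```python
import math

sites, N, d = 400, 10, 4
# Configurations: choose N matching sites, then an orientation for each
configs = math.comb(sites, N) * d ** N
bits = math.log2(configs)
print(f"about 2^{bits:.1f} configurations")
```

This naive count comes out in the mid-80s of bits; the paper's exact figure differs slightly because of how it models the minutia distributions.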

What we have described is called the simple model by the authors, which does not account for certain minutia dependencies. A more complex model is proposed to compensate which also shows that the entropy is still as high as 80 bits with additional matches. Even with the complex model there were quite a few caveats, and a revised model was reported in the excellent 2008 survey paper Biometrics: A Tool for Information Security.

In section V.A of the survey paper the amount of discriminating information in a fingerprint is discussed. The revised model is somewhat more conservative in its comparisons to passwords. The authors now state that randomly matching on at least 20 from 36 minutiae is at least as difficult as guessing a length 6 case-sensitive alphanumeric password (about 10^{11} possibilities in total).

The revised model was motivated by the desire to quantify the uniqueness of fingerprints due to their importance in determining guilt in court cases. And just like DNA tests, the assumed power of fingerprints to uniquely discriminate between individuals is being downgraded.

So in summary a biometric is just a long password that is easy to remember and easy to enter (with the right hardware support). But you need to check the parameters of the matching algorithm and its assumptions to determine how strong your fingerprint is as compared to a password.


Wednesday, April 8, 2009

The Data Centric Security Model (DCSM)

(Repost from July 2007)

While at IBM I worked on a concept which we called the Data Centric Security Model (DCSM), with the basic idea being that security people will have more fruitful interaction with IT managers if discussions centered on their data rather than on our technology. After I left IBM, several people continued to work on the DCSM resulting in a paper presented at the Business Driven IT Workshop in May 2007, which has now been posted on Scribd.

A Data Centric Security Model – Luke O'Connor. An IBM proposal for a data centric approach to IT security.

Thursday, January 29, 2009

Data Centric Security Model hot on Scribd

My paper on Data Centric Security has made it to the hotlist on Scribd, which means that the paper is getting quite a few hits of late. I blogged about this work in one of my first posts to No Tricks back in September 2007. If you Google the topic further you will find that IBM has taken the idea a lot further since this initial work, and I believe that a whole consulting practice has been established around this concept.

A Data Centric Security Model