Lecture 10-11: General attacks on LFSR based stream
ciphers
Thomas Johansson
T. Johansson (Lund University) 1 / 23
Introduction
z = z1 , z2 , . . . , zN is a known keystream sequence
find a distinguishing attack, or
find a key recovery attack
with complexity lower than exhaustive key search.
T. Johansson (Lund University) 2 / 23
Linear complexity and Berlekamp-Massey algorithm
the keystream sequence z = z1 , z2 , . . . will be periodic (after possibly
removing some of the first symbols).
Any such sequence can be generated by an LFSR.
One possible approach would then be to replace the entire generator
with an (in general very long) LFSR.
T. Johansson (Lund University) 3 / 23
I
Definition
The linear complexity of a sequence s (finite or semi-infinite), denoted
L(s), is the length of the shortest LFSR that can produce the sequence.
To find the shortest LFSR producing a certain sequence we use the
Berlekamp-Massey algorithm.
T. Johansson (Lund University) 4 / 23
Berlekamp-Massey algorithm
IN: A sequence s = (s0 , s1 , . . . , sN −1 ) of length N .
OUT: The shortest LFSR < C(D), L > generating s.
1. Initilization C(D) = 1, L = 0, C ∗ (D) = 1, d∗ = 1, m = −1, n = 0.
2. While (n < N ) do the following:
2.1 Compute the discrepancy
L
X
d = sn − −ci sn−i .
i=1
2.2 If d 6= 0 do the following:
T (D) = C(D), C(D) = C(D) − d · (d∗ )−1 · C ∗ (D)Dn−m .
If L ≤ n/2 then L = n + 1 − L, C ∗ (D) = T (D), d∗ = d, m = n.
2.3 n = n + 1.
3. Return < C(D), L >
T. Johansson (Lund University) 5 / 23
Massey's algorithm
C (z) 1
L 0
C0(z) 1
d0 1
e 1
N 0
N N +1
d sN ,
X ,c s ,
L
( i) N i
i=1
e e+1 Ja
d = 0?
Nej
C (z ) C (z) , dd,0 1z,e C0(z) Ja
N < 2L?
Nej
C1(z) C (z)
C (z) C (z) , dd,0 1z,eC0(z)
L N +1,L
C0(z) C1(z)
d0 d
c 1
T. Johansson (Lund University) 6 / 23
Properties of Berlekamp-Massey algorithm
The running time of the Berlekamp-Massey algorithm for determining
the linear complexity of a length n sequence is O(n2 ) operations.
Delivers one connection polynomial and the length L of the LFSR.
If (and only if) L ≤ N/2 there is a unique connection polynomial.
The proof is left out...
T. Johansson (Lund University) 7 / 23
Properties of Berlekamp-Massey algorithm
We want to find the shortest LFSR generating s, a periodic sequence
with period T .
Berlekamp-Massey algorithm can provide the solution if the input is
the length 2T sequence (s0 , s1 , . . . , sT −1 , s0 , s1 , . . . , sT −1 ).
You only need to process the first T + k symbols of the sequence,
where k is the first positive integer such that
(s0 , s1 , . . . , sT −1 , s0 , s1 , . . . , sk−1 ) has linear complexity ≤ k.
T. Johansson (Lund University) 8 / 23
Linear complexity
Let s1 = s10 , s11 , s12 , . . . and s2 = s20 , s21 , s22 , . . .
f (x1 , x2 ) be a function in two variables, x1 , x2 ∈ Fq .
By
s = f (s1 , s2 )
we mean the sequence s = f (s10 , s20 ), f (s11 , s21 ), f (s12 , s22 ), . . ..
T. Johansson (Lund University) 9 / 23
Linear complexity
Theorem
Let s1 and s2 be two sequences with linear complexity L(s1 ) and L(s2 )
respectively. Then
If f (x1 , x2 ) = x1 + x2 then L(f (s1 , s2 )) ≤ L(s1 ) + L(s2 ).
If f (x1 , x2 ) = x1 x2 then L(f (s1 , s2 )) ≤ L(s1 )L(s2 ).
T. Johansson (Lund University) 10 / 23
Example
z = f (s1 , s2 , s3 , s4 ),
where si , i = 1, . . . , 4 are LFSR sequences with period 2Li − 1
(m-sequences). Let
f (x1 , x2 , x3 , x4 ) = x1 + x2 x3 + x3 x4 + x2 x4 .
Find a bound on the linear complexity of the keystream sequence L(z).
Solution: Using Theorem 2 we get
L(z) = L(f (s1 , s2 , s3 , s4 )) ≤ L(s1 )+L(s2 )L(s3 )+L(s3 )L(s4 )+L(s2 )L(s4 ) ≤
T. Johansson (Lund University) 11 / 23
Correlation attacks - idea
A common method to build a keystream generator is to combine
several linear feedback shift registers to get a keystream with desired
statistical properties.
Correlation attack: If one can detect a correlation between z and the
output of one individual LFSR, this can be used in a
“divide-and-conquer” attack on the individual LFSR.
T. Johansson (Lund University) 12 / 23
Project 3
T. Johansson (Lund University) 13 / 23
Project 3
The values of Li and Ci (D) for the different LFSRs are,
L1 = 13, C1 (D) = 1 + D1 + D2 + D4 + D6 + D7 + D10 + D11 + D13 ,
L2 = 15, C2 (D) = 1 + D2 + D4 + D6 + D7 + D10 + D11 + D13 + D15 ,
L3 = 17, C3 (D) = 1 + D2 + D4 + D5 + D8 + D10 + D13 + D16 + D17 .
The secret key K is the initial state of the three LFSRs.
K = (K1 , K2 , K3 ), where Ki is the initial state of the ith LFSR.
T. Johansson (Lund University) 14 / 23
Project 3
Exercise 1:
Each group is given a keystream z1 , z2 , . . . , zN of some length N . Find the
key K that was used to produce this keystream.
Exercise 2:
Assume that the attack takes T seconds. How long would it take to attack
by an exhaustive search over the entire keyspace?
T. Johansson (Lund University) 15 / 23
Project 3 - the correlation attack
a correlation between the output of one of the shift registers and the
keystream, i.e., P ui = zi 6= 0.5,
T. Johansson (Lund University) 16 / 23
Project 3 - the correlation attack
(j)
Let ui be the output of the jth LFSR and assume that
(j)
P ui = zi = p, where p 6= 0.5.
What is the exact value of p?
Guess that the initial state of the jth LFSR is
(j) (j) (j)
û0 = (û1 , û2 , . . . , ûLj ).
We can calculate an LFSR output sequence û = (û1 , û2 , . . . , ûN ), where
(j)
ûi =ûi , 0 < i ≤ Lj ,
Lj
X
ûi = cl ûi−l , Lj < i ≤ N.
l=1
T. Johansson (Lund University) 17 / 23
Project 3 - the correlation attack
The Hamming distance between x and y, dH (x, y), is defined to be
the number of coordinates in which x and y differ.
Estimate the correlation p with p∗ , where
dH (û, z)
p∗ = 1 − .
N
If the guessed initial state, û0 , is correct, we get p∗ ≈ p, otherwise
p∗ ≈ 0.5.
T. Johansson (Lund University) 18 / 23
Algorithm
1. For each possible initial state, calculate
p∗ ;
2. Output the initial state for which p∗ is
maximal.
T. Johansson (Lund University) 19 / 23
Linear distinguishing attacks
Introduce linear approximations of all nonlinear operations in a specific
“path” of the cipher.
The path should connect some know values, i.e., key stream symbols.
If the linear approximation is true, this leads to a linear relationship
among the known key stream symbols.
If it is not true, we can think of the error introduced by the linear
approximation as truly random noise.
A linear combination of key stream symbols above can be viewed as a
sample from a very noisy (but not uniform) distribution. By collecting
many such samples, we can eventually distinguish the distribution they
are drawn from, from the uniform distribution.
T. Johansson (Lund University) 20 / 23
Example
A long LFSR with connection polynomial C(D) = 1 + D34 + D69 is
generating a sequence s = (s0 , s1 , s2 , . . .) in F2 . The output of the
generator is generated as
zi = f (si , si+1 , si+2 , si+3 ), i = 0, 1, . . . ,
where f is the Boolean function
f (x1 , x2 , x3 , x4 ) = x1 + x2 + x3 + x1 x2 x3 x4 . filter generator
Describe a linear distinguishing attack on the generator.
T. Johansson (Lund University) 21 / 23
Example
The LFSR sequence obeys the recursion
si + si+35 + si+69 = 0, i = 0, 1, . . . .
Replace f with the linear function g(x1 , x2 , x3 , x4 ) = x1 + x2 + x3 .
After replacement, we have
zi = g(si , si+1 , si+2 , si+3 ) = si + si+1 + si+2 , i = 0, 1, . . . .
Now we try to find a set of dependent linear equations.
As si + si+35 + si+69 = 0 we also have
si +si+1 +si+2 +si+35 +si+36 +si+37 +si+69 +si+70 +si+71 = 0, i = 0, 1, . . .
T. Johansson (Lund University) 22 / 23
Example cont’
But zi = si + si+1 + si+2 , so we must have zi + zi+35 + zi+69 = 0
(assuming that g always gives the result of f ).
So in our attack we create a sequence of sample values
Qi = zi + zi+35 + zi+69 , i = 0, 1, . . .. Calculating P (Qi ) gives
P (Qi = 0) = 0.835.
The number of zeros in Q has a binomial distribution with probability
0.835, whereas a random Q probability 0.5. Apply hypothesis testing!
T. Johansson (Lund University) 23 / 23