
Some Results on Stream Ciphers

A THESIS

submitted by

SABYASACHI DEY

for the award of the degree

of

DOCTOR OF PHILOSOPHY

DEPARTMENT OF MATHEMATICS
INDIAN INSTITUTE OF TECHNOLOGY MADRAS.
February 2018
THESIS CERTIFICATE

This is to certify that the thesis Some Results on Stream Ciphers submitted by Sabyasachi
Dey (MA15D016) to the Indian Institute of Technology, Madras, for the award of the
degree of Doctor of Philosophy, is a bona fide record of the research work done by him
under my supervision. The contents of this thesis, in full or in parts, have not been
submitted to any other Institute or University for the award of any degree or diploma.

Dr. Santanu Sarkar


Research Guide
Assistant Professor
Dept. of Mathematics
IIT Madras, 600036
Chennai
February 2018

ACKNOWLEDGEMENTS

First, I would like to express my gratitude to my guide, Dr. Santanu Sarkar, for
his relentless support and guidance throughout the last few years of my research. I believe
that it is only because of his constant support that the journey of my PhD life has been so
much easier. He never let me lose focus on my research, and was always there for me
whenever I needed any help.

In my few years at IIT Madras, I have come in contact with several faculty members
who influenced me to pursue my research in Mathematics. I want to thank them for their
contribution to my studies and career.

I have made some great friends in my hostels during school life, college life and
finally here at IIT. Not only have they contributed hugely to my academic career, but
they are also the reason why I have led such a happy hostel life. I would like to take this
opportunity to show my gratitude to all my friends who made me feel that the hostel is
a second home for me.

I would also like to acknowledge the influence of Mr. Samiran Gupta, my childhood
teacher, on my academic career; he was the first person to nurture my interest in
Mathematics. I also express my gratitude to my aunt Mrs. Shyamali Majumdar, who
has always been a second mother to me, for her affection, inspiration, support and faith
in me.

Finally, I thank my parents, whose influence and support in my life cannot be
expressed in words. They have made me what I am today.

ABSTRACT

This thesis presents some results on the stream ciphers RC4, Salsa, ChaCha and
Fruit. We provide theoretical proofs of a few well-known biases observed in the RC4
algorithm. These biases contribute significantly to recovering plaintexts from the
knowledge of ciphertexts. For Salsa and ChaCha, we improve the existing attacks. Fruit,
which is an ultra-lightweight stream cipher proposed very recently, has not yet faced many
attacks. Here we cryptanalyse Fruit and provide a time-memory tradeoff
attack.

RC4, one of the most widely used stream ciphers of the last two decades, is now
considered weak because multiple biases have been reported in it. RC4 has gone
through rigorous analysis in the last twenty years. In 1995, Roos observed a bias in the
keystream bits of RC4. In this thesis, we generalise that work. We also provide theoretical
justification of the bias of the keystream byte Zi towards i − K[0], which was observed by
Paterson et al. in Asiacrypt 2014. This bias has been used in a broadcast attack on WEP.
Another useful bias observed experimentally in RC4 is the bias of Zi = i. Here we
prove this bias.

Salsa and ChaCha are two ciphers which are considered replacements for
old stream ciphers like RC4. In FSE 2008, Aumasson et al. introduced the idea of
probabilistically neutral bits to mount differential attacks against these two ciphers. Using
that idea, Salsa can be attacked up to the 8th round and ChaCha up to the 7th round.
Afterwards, those attacks were improved further. Here, we first provide an algorithm to
construct the set of probabilistically neutral bits in a better way to improve the attack.
Our construction of the probabilistically neutral bit set is able to reduce the attack
complexity, both for Salsa and for ChaCha.

Fruit, compared to the previously discussed ciphers, is much newer. Fruit is very
interesting because of its ultra-lightweight structure. Its design is inspired by the design
principle used in Sprout, which involves the use of key bits in the NFSR update function.
Fruit has a state of size 80, which is the same as its key size. We provide a time-memory
tradeoff attack against Fruit. This attack is based on several kinds of sieving applied to
the possible states. Our attack finds the state with complexity around 2^75 for the first 80-bit
version of Fruit and complexity 2^76.66 for the second 80-bit version of Fruit.

Keywords: Bias, ChaCha, Differential Attack, Fruit, RC4, Salsa, Stream Cipher, Time-
Memory Tradeoff.

TABLE OF CONTENTS

THESIS CERTIFICATE

ACKNOWLEDGEMENTS

ABSTRACT

1 Introduction
  1.1 Encryption and Decryption
    1.1.1 Asymmetric Key Cryptosystems
    1.1.2 Symmetric key cryptography
  1.2 Perfect Secrecy and Stream Cipher
  1.3 Cryptanalysis
  1.4 Organization of the Thesis
  1.5 Experimental Framework of this Thesis
  1.6 Prerequisites

2 Generalization of Roos bias in RC4
  2.1 Negative bias of Zi towards i − K[0]
  2.2 Generalization of Roos Bias and bias of Zi = i − fy
    2.2.1 Probability of Zi = i − fi
  2.3 Biases of Zi towards fi−1
  2.4 Conclusion

3 Settling the mystery of Zr = r in RC4
  3.1 Introduction
  3.2 Probability Transition Matrix and its application
    3.2.1 Idea of Probability Transition in RC4
    3.2.2 Explanation of the probabilities after the KSA phase and during the PRGA of RC4
  3.3 Theoretical Explanation of Zr = r
  3.4 Conclusion

4 Some results on reduced round Salsa and ChaCha
  4.1 Structure of the ciphers
    4.1.1 Structure of Salsa
    4.1.2 ChaCha
  4.2 Idea of attack on Salsa and ChaCha
    4.2.1 Technique of Attack
    4.2.2 Concept of PNB
    4.2.3 Chaining Distinguishers
    4.2.4 Choosing proper IV
  4.3 Improving the way of constructing the PNB set: Our algorithm
    4.3.1 Algorithm for Salsa
  4.4 Experimental Results
    4.4.1 Our results on Salsa
    4.4.2 Experimental Results for ChaCha
  4.5 How to assign values to PNBs
  4.6 Experimental Results
  4.7 Conclusion

5 Some results on Fruit
  5.1 Description of Fruit version 1
  5.2 Key recovery attack on Fruit version 1
    5.2.1 First phase of the attack
    5.2.2 Second phase of the attack: Guessing a middle state
  5.3 Second Version of Fruit
    5.3.1 Structure
    5.3.2 Cryptanalysis
    5.3.3 Weak key class
  5.4 Conclusion

6 Conclusion
  6.1 Summary of Technical Results
  6.2 Open Problems
    6.2.1 RC4
    6.2.2 Salsa and ChaCha
    6.2.3 Fruit

List of Papers Based on Thesis



CHAPTER 1

Introduction

Cryptology is the study of keeping information secret. The subject originated long
ago to fulfill the human need to transfer and store valuable information with
privacy and security. Initially, cryptology was considered an art, because designing a
proper way of hiding information requires creativity. Throughout world history, we can see
many artists delivering secret messages through their creations. However, this approach
changed later: as the subject developed, the need for science and technology
was realised. History shows applications of cryptology in many forms.

Julius Caesar used cryptology to protect messages during military warfare. His
method of encrypting messages is called the Caesar cipher or Caesar code. This was
basically a shift cipher, i.e., each letter of the actual message was substituted by the
letter at some fixed distance from it in the alphabet. Though this cipher is very ordinary
compared to modern ones, considering the technology and knowledge people had at that
time, it can be expected to have been quite effective for secure communication.
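The shift described above can be sketched in a few lines of Python; here the shift amount plays the role of the secret key (the value 3 is the one traditionally attributed to Caesar), and decryption is simply a shift in the opposite direction.

```python
def caesar(text, shift):
    # Shift each letter by a fixed amount, wrapping around the alphabet;
    # non-letter characters are left unchanged.
    out = []
    for ch in text:
        if ch.isalpha():
            base = ord('A') if ch.isupper() else ord('a')
            out.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            out.append(ch)
    return ''.join(out)

ciphertext = caesar("ATTACK AT DAWN", 3)   # -> "DWWDFN DW GDZQ"
plaintext = caesar(ciphertext, -3)         # shifting back recovers the message
```

Since there are only 25 useful shifts, the cipher falls immediately to exhaustive search, which is why it is of historical interest only.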

The famous artist Leonardo da Vinci used cryptology in his creations. In some of
his masterpieces, like The Last Supper and the Mona Lisa, he hid secret messages
which were discovered only after centuries.

Around 800 AD, an Arabian mathematician named Al-Kindi invented various cryptanalytic
methods. He wrote a book named Risāla fī Istikhrāj al-Kutub al-Muʿammāh, in
which he provided different cryptanalytic strategies. Later, in the fourteenth century,
another mathematician named Ahmad al-Qalqashandi described some techniques of
cryptology in his book. In fact, it is believed that modern cryptology is influenced by the
works of the Arabians in this area.

In ancient India, an encoding scheme called Katapayadi Sankhya was used by
scientists. This scheme was used to represent numbers by letters. By producing a
meaningful word through this conversion from digits to letters, long numbers could be
remembered very easily. In some ancient books, the positions of and distances between
planets and stars are presented using this scheme.

During the Second World War, cryptology played a vital role in deciding the fate of the
war. A machine called Enigma was used by the Germans for military communication.
This machine contributed significantly to the initial success of the Germans in the war.

Now we can see applications of cryptology everywhere. MasterCard, Gmail, ATM
card passwords etc. are some of its most common applications. As the subject developed,
different aspects of security emerged. To secure a piece of secret information,
we first have to construct a suitable design which will ensure the secrecy of the
information. Then we need a detailed study of that design, so that we have an idea of how
strong the security is. The first aspect of the subject is constructive, and it needs creativity.
The second aspect is destructive: the aim there is to break the security. Both
aspects are equally important, because without a proper test of its security,
a design cannot be used to protect information. Based on this, the modern study of
cryptology has two directions:

1. Cryptography: it deals with the construction of proper designs to provide security
to information.

2. Cryptanalysis: it deals with the detailed study of a design to test the strength of the
security the design provides.

1.1 Encryption and Decryption

Modern cryptography follows a general pattern to provide security to information. There
is an algorithm which takes some input, applies some operations on it, and at the end
gives some output. Another important component is the key. The key is a string of 0s and
1s which is unknown to any outsider. An important thing to observe here is that though
the key is private, the algorithm used to provide security is public. The point of keeping
the algorithm public is this: if the algorithm itself is private, a detailed analysis of the
design can be done by the designer only. On the other hand, if it is public, then
anybody can analyse the design, and if there is any fault, it can be found much more quickly.
This helps in taking proper countermeasures, or in discarding the design if no countermeasure
is possible.

Definition 1 A cryptosystem is defined as a 5-tuple (P, C, K, E, D), where P is the set of
plaintexts, C the set of ciphertexts, K the set of keys, E the set of encryption rules and D
the set of decryption rules. Corresponding to each key k ∈ K, there exist an encryption rule
ek ∈ E such that ek : P → C, and a decryption rule dk ∈ D such that dk : C → P. For any
x ∈ P, dk(ek(x)) = x.
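As a deliberately tiny illustration of Definition 1, the sketch below builds a byte-substitution cryptosystem in Python: each key is a permutation of the 256 byte values, ek applies it, dk applies its inverse, and dk(ek(x)) = x holds by construction. The function names are ours, chosen only for this illustration; this is not a secure cipher.

```python
import random

def keygen(seed):
    # A key is a random permutation of the 256 byte values.
    rng = random.Random(seed)
    perm = list(range(256))
    rng.shuffle(perm)
    return perm

def e_k(key, plaintext):
    # Encryption rule e_k: substitute every byte through the permutation.
    return bytes(key[b] for b in plaintext)

def d_k(key, ciphertext):
    # Decryption rule d_k: substitute back through the inverse permutation.
    inv = [0] * 256
    for i, v in enumerate(key):
        inv[v] = i
    return bytes(inv[b] for b in ciphertext)

k = keygen(42)
x = b"stream ciphers"
assert d_k(k, e_k(k, x)) == x   # d_k(e_k(x)) = x, as Definition 1 requires
```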

1.1.1 Asymmetric Key Cryptosystems

These cryptosystems are also known as public key cryptosystems. The name 'asymmetric'
comes from the fact that in this technique, the key used for encryption is not the same as
the key used for decryption. Instead of a single key, a pair of keys is used. The first
one is called the private key, which is known to the receiver only. The other key is public.
The sender encrypts the message with the public key and sends it to the receiver, who
decrypts the ciphertext with the private key. Since the private key is secret,
anybody can encrypt a message, but nobody can decrypt it except the receiver. These
cryptosystems are mostly based on hard mathematical problems. For example, the famous
RSA cryptosystem [100] is based on the factorisation problem, elliptic curve cryptosystems
[72, 88] are based on the discrete log problem, and the NTRU cryptosystem [60] is based on
the shortest vector problem of a lattice.

Asymmetric key cryptosystems have various uses. For example, public key cryptography
is very useful for sharing the key of some other cipher between two parties. Suppose two
parties wish to use a new key in a cipher that they usually use for communication,
because they feel their old key is no longer secure. To share the new key, they cannot
use the same cipher, for the same security reason. In this context, public key cryptography
is used: one party encrypts the new key using the public key of the other party, and the
other party decrypts and recovers it using their private key. Public key cryptography is
also used for authentication, in the form of digital signatures.
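The key-transport idea above can be sketched with textbook RSA on artificially small parameters. This is an illustration only: the primes below are toy values, there is no padding, and real deployments use randomized padding and moduli of 2048 bits or more.

```python
# Textbook RSA with toy parameters -- illustration only, not secure.
p, q = 61, 53
n = p * q                  # public modulus
phi = (p - 1) * (q - 1)
e = 17                     # public exponent, coprime to phi
d = pow(e, -1, phi)        # private exponent: e*d = 1 (mod phi); Python 3.8+

# One party encodes a fresh symmetric key as a number m and encrypts it
# with the other party's public key (e, n).
m = 65
c = pow(m, e, n)

# Only the holder of the private exponent d can recover m.
assert pow(c, d, n) == m
```

The public pair (e, n) can travel over an insecure channel; only d, kept by the receiver, undoes the encryption.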

1.1.2 Symmetric key cryptography

Symmetric key cryptography deals with the idea where the key used for encryption is
same as the key used for decryption. The branch of study consists of two sub-branches:
block cipher and stream cipher. In this thesis, we will discuss cryptanalysis of some
stream ciphers.

Block cipher

Block ciphers are a category of symmetric key cryptosystems. In a block cipher, a group of
plaintext bits, called a block, is taken as input. The whole block is encrypted
together with the help of the key, and a block of ciphertext is produced. Let x be a
plaintext block of size n and k be a key of k bits. The encryption is:

E : {0, 1}^n × {0, 1}^k → {0, 1}^n

such that E(x, k) = y, where y is the ciphertext. Similarly, the decryption function D
takes an n-bit ciphertext and the k-bit key as input, applies the algorithm, and
produces an n-bit plaintext. For a fixed key k, the encryption function and the decryption
function are inverses of each other:

D(E(x, k), k) = x

Some popular block ciphers are DES [36] and AES [1]. The Data Encryption Standard,
or DES, is a cipher which was used by the US Government for a long period. It was
made public in 1976 and went through rigorous analysis. Later, DES was replaced by
the Advanced Encryption Standard, or AES. AES is possibly the most widely used
cipher in recent times.
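The inverse relationship D(E(x, k), k) = x can be made concrete with a toy two-round Feistel network, the classical skeleton behind DES. Everything here (the round function, the subkeys, the 16-bit block size) is an arbitrary toy choice for illustration, not a secure design; the point is that decryption is the same network run with the subkeys in reverse order, regardless of whether the round function itself is invertible.

```python
def round_fn(half, subkey):
    # Toy round function on 8-bit halves; real ciphers use far stronger ones.
    return (half * 31 + subkey) & 0xFF

def feistel_encrypt(block, subkeys):
    # block is a 16-bit integer, split into two 8-bit halves.
    left, right = block >> 8, block & 0xFF
    for sk in subkeys:
        left, right = right, left ^ round_fn(right, sk)
    return (left << 8) | right

def feistel_decrypt(block, subkeys):
    # The same network, with subkeys applied in reverse order.
    left, right = block >> 8, block & 0xFF
    for sk in reversed(subkeys):
        right, left = left, right ^ round_fn(left, sk)
    return (left << 8) | right

subkeys = [0x3A, 0xC5, 0x17]   # would be derived from the key in a real cipher
ct = feistel_encrypt(0xBEEF, subkeys)
assert feistel_decrypt(ct, subkeys) == 0xBEEF   # D(E(x, k), k) = x
```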

Stream ciphers

This is another category of symmetric key ciphers. Based on the design principle,
stream ciphers can be divided further into two categories: synchronous stream ciphers
and asynchronous stream ciphers. In this thesis, we focus on synchronous stream ciphers
only. So, whenever we mention the term 'stream cipher' from here on, we mean a
synchronous stream cipher.

Figure 1.1: Pseudorandom keystream generation using a stream cipher (the key is fed to
the stream cipher algorithm, which outputs a pseudorandom keystream).

Let us discuss the general idea of synchronous stream ciphers. Suppose Alice wants
to send a message m to Bob. In the stream cipher principle, there is an algorithm which
takes the key k as input. After applying some operations on k, it produces a stream
of 0s and 1s. This is called the pseudorandom keystream (see Fig. 1.1). This keystream is
XORed with the actual message m to produce c, which is called the ciphertext. Alice then
sends c to the receiver Bob. Now, Bob has the same key k. With k, he generates the
same keystream as Alice. By XORing this keystream with the ciphertext c, he gets back
the actual message m.
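This scheme can be sketched with a toy keystream generator. Here we use a small linear feedback shift register (LFSR), a classical stream cipher building block; a bare 16-bit LFSR like this one is linear and far too weak for real use, so treat the code purely as an illustration of the keystream-XOR principle.

```python
def keystream(key16, nbits):
    # Toy 16-bit Fibonacci LFSR with taps 16, 14, 13, 11 (a maximal-length
    # polynomial); the secret key is the initial state.  Illustration only.
    state = key16 & 0xFFFF
    bits = []
    for _ in range(nbits):
        bits.append(state & 1)
        fb = ((state >> 0) ^ (state >> 2) ^ (state >> 3) ^ (state >> 5)) & 1
        state = (state >> 1) | (fb << 15)
    return bits

def xor_crypt(message_bits, key16):
    # Encryption and decryption are the same operation: XOR with the keystream.
    ks = keystream(key16, len(message_bits))
    return [m ^ k for m, k in zip(message_bits, ks)]

msg = [1, 0, 1, 1, 0, 0, 1, 0]
ct = xor_crypt(msg, 0xACE1)          # Alice encrypts with her key
assert xor_crypt(ct, 0xACE1) == msg  # Bob, holding the same key, decrypts
```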

NESSIE project: As the use of ciphers increased, the cryptology community felt the
necessity of new, promising ciphers. In the last twenty years, several projects have been
organised in an attempt to find new ciphers for widespread adoption. In the period 1997 to
2000, a project called the Advanced Encryption Standard process was run by NIST in
order to find a successor to DES. In Japan, another project called CRYPTREC was
started in 2000 in search of quality cryptographic designs. NESSIE was a project which
took place during the period 2000 to 2003, run by the European cryptography community.
NESSIE stands for New European Schemes for Signatures, Integrity and Encryption.
The purpose of this project was to obtain new ciphers for future adoption. Many
famous cryptographers participated in this project. Forty-two ciphers were submitted to
this competition. These designs were analysed rigorously by cryptographers worldwide
for the next few years. At the end, twelve ciphers were selected as secure enough for
future adoption. Also, five more ciphers, which were not part of this competition, were
declared promising for use. Six stream ciphers were submitted to this project.
Unfortunately, none of them was selected in the end, because all of them were shown
to be insecure.

Portfolio 1 (Software)    Portfolio 2 (Hardware)
HC-128 [114]              Grain [76]
Rabbit [24]               MICKEY [6]
Salsa20/12 [13]           Trivium [29]
Sosemanuk [12]

Table 1.1: Final eSTREAM ciphers

eSTREAM project: The NESSIE project failed to deliver good stream ciphers. This
led to the setup of another project by the EU ECRYPT network in 2004, called eSTREAM.
This project aimed at finding new stream ciphers only. The project was divided into two
categories: a software portfolio and a hardware portfolio. In total, thirty-four ciphers were
submitted. The competition took place in three phases. In the first phase of the
project, the submitted stream ciphers went through scrutiny based on their security,
flexibility and simplicity. The first phase ended in March 2006. In August 2006, the second
phase of the project started, and the ciphers underwent further analysis by cryptologists. In
2007, the third phase of the project started; eight ciphers from the software portfolio and
seven from the hardware portfolio reached this phase. Finally, in April 2008, the third phase
ended. Four ciphers from the software category and three from the hardware category were
announced as the finalists; see Table 1.1.

1.2 Perfect Secrecy and Stream Cipher

A concept called perfect secrecy [69] was introduced by Shannon to measure the security
of a cipher. The primary goal of a cipher is to encrypt a message in such a way
that from the ciphertext, an adversary does not gain any extra information about the
plaintext. Shannon made this idea precise using probability in his definition of perfect
secrecy. We assume that there is an adversary who knows the probability of occurrence
of all possible plaintexts, and who also knows the ciphertext. A cipher is then secure if
the knowledge of the ciphertext does not change the probabilities of occurrence of the
plaintexts, i.e., the ciphertext does not help the adversary in any way.
The definition of perfect secrecy is as follows:

Definition 2 A cryptosystem is said to have perfect secrecy if for any plaintext x ∈ P
and ciphertext y ∈ C, Pr(x) = Pr(x | y).

Theorem 1.1 Let (P, C, K, E, D) be a cryptosystem which attains perfect secrecy. Then
|K| ≥ |P|.

Proof 1.2 Suppose, if possible, that |K| < |P|. Assume that the set of plaintexts P
is uniformly distributed. Let y be a ciphertext such that there exist at least one plaintext x
and one key k for which ek(x) = y. Let

Py = {x ∈ P | ∃ k ∈ K for which ek(x) = y}.

So Py is the set of all possible decryptions of y. Clearly Py ≠ ∅. Decrypting y with all
possible keys can produce at most |K| different plaintexts, so certainly

|Py| ≤ |K|.

Hence we have
|Py| ≤ |K| < |P|.

This means there exists at least one plaintext x0 such that x0 ∈ P but x0 ∉ Py. So x0
cannot be obtained by decrypting y with any key, and therefore Pr(P = x0 | C = y) = 0. This
contradicts perfect secrecy, since Pr(P = x0) ≠ 0 because of the uniform
distribution.

So our assumption that |K| < |P| is wrong. This proves the theorem. □
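The counting argument can be checked mechanically on a toy system of our own making: take P = C = all 3-bit strings (8 plaintexts) but only 4 keys, with XOR as the encryption rule. Decrypting any fixed ciphertext under every key reaches at most 4 of the 8 plaintexts, so some x0 has Pr(x0 | y) = 0 and perfect secrecy fails.

```python
# Toy system: P = C = {0,...,7} (3-bit strings), but only 4 keys.
P = range(8)
K = [0b000, 0b011, 0b101, 0b110]           # |K| = 4 < |P| = 8

def enc(k, x):
    # Encryption rule: XOR with the key (XOR is its own inverse,
    # so decryption under k is the same operation).
    return x ^ k

y = 0b010                                   # some observed ciphertext
P_y = {enc(k, y) for k in K}                # all possible decryptions of y
unreachable = set(P) - P_y

assert len(P_y) <= len(K) < 8               # |P_y| <= |K| < |P|
assert unreachable                          # some x0 can never encrypt to y,
# so Pr(x0 | y) = 0 while Pr(x0) > 0: the system cannot be perfectly secret.
```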

Shannon’s Theorem: Shannon’s theorem is one of the most fundamental works in the
development of modern cryptography. After giving security a mathematical form in his
definition of perfect secrecy, Shannon also provided a theorem giving necessary and
sufficient conditions for a cryptosystem to attain perfect secrecy. The conditions are stated
under the assumption that the key space, plaintext space and ciphertext space have the
same size.

Theorem 1.3 Let (P, C, K, E, D) be a cryptosystem where |P| = |C| = |K| = n for some
n. Then (P, C, K, E, D) attains perfect secrecy if and only if:

• The keys follow the uniform distribution, i.e., the probability that a particular key
is chosen is 1/|K|.

• For any m ∈ P and c ∈ C, there exists exactly one key k ∈ K which encrypts m to c.

Proof 1.4 First, assume that the system has perfect secrecy. We show that the two
given conditions hold.
Let m be a message, and let Cm be the set of all possible ciphertexts that can be
generated from m. So Cm ⊆ C. Now, as shown in the previous theorem, for any c ∈ C there
exists at least one key which encrypts m to c. So, for any c ∈ C, c ∈ Cm, and hence Cm = C.
This implies |Cm| = |K| (since, by assumption, |C| = |K|).

Next, let Em be the mapping from K to Cm such that for any k ∈ K, Em(k) is
the ciphertext to which m is mapped under the key k. Clearly this map is surjective. Since
|K| = |Cm|, the map must be bijective. So there do not exist two distinct keys k1 and k2
which encrypt m to the same ciphertext c. This implies that for any m and c there exists
exactly one k which encrypts m to c. This proves the second condition.

Let m1, m2, · · · , mn be the plaintexts, and let c be a fixed ciphertext. For
any mi, denote by ki the key which encrypts mi to c. Then, from the definition of
perfect secrecy, we have

Pr(M = mi) = Pr(M = mi | C = c).

Now, by Bayes’ theorem,

Pr(M = mi | C = c) = Pr(C = c | M = mi) Pr(M = mi) / Pr(C = c)
                   = Pr(K = ki) Pr(M = mi) / Pr(C = c).

This implies Pr(K = ki) = Pr(C = c). This is true for every ki. So the probability of each ki
is the same, i.e., the keys follow the uniform distribution, each with probability 1/|K|.

Conversely, suppose the two conditions hold. For any m ∈ P and c ∈ C, we have exactly
one key k which encrypts m to c. So

Pr(C = c | M = m) = Pr(K = k) = 1/|K|.

Hence

Pr(M = m | C = c) = Pr(C = c | M = m) Pr(M = m) / Pr(C = c)
                  = (1/|K|) · Pr(M = m) / Pr(C = c).

Now,

Pr(C = c) = ∑_{i=1}^{n} Pr(C = c | M = mi) Pr(M = mi)
          = (1/|K|) ∑_{i=1}^{n} Pr(M = mi) = 1/|K|.

So

Pr(M = m | C = c) = (1/|K|) Pr(M = m) / (1/|K|) = Pr(M = m).

Therefore, the system has perfect secrecy. □
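Both conditions of Theorem 1.3 are easy to verify exhaustively on a small example of our own choosing: take P = C = K = all 2-bit strings with XOR as encryption. Then k = m ⊕ c is the unique key sending m to c, and drawing keys uniformly gives each key probability 1/|K|.

```python
from itertools import product

# P = C = K = {0,1}^2, encryption by XOR; encoded as integers 0..3.
space = [2 * a + b for a, b in product((0, 1), repeat=2)]

# Condition 2: for every (m, c) there is exactly one key with m ^ k == c.
for m in space:
    for c in space:
        keys = [k for k in space if m ^ k == c]
        assert len(keys) == 1 and keys[0] == m ^ c

# Condition 1: choosing the key uniformly gives each key probability 1/|K|.
pr_key = {k: 1 / len(space) for k in space}
assert all(abs(p - 0.25) < 1e-12 for p in pr_key.values())
```

With both conditions confirmed, Theorem 1.3 says this toy XOR system is perfectly secret, which is exactly the one-time pad discussed next (here with l = 2).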

One-time pad: In 1917, Vernam proposed a cipher called the one-time pad. Though the
idea had been given earlier by Miller in 1882, Vernam patented it. The encryption
technique of the one-time pad is important because, though it was invented long before the
idea of perfect secrecy evolved, the one-time pad satisfies perfect
secrecy. The technique of encryption is very simple. In the one-time pad, the key space K,
plaintext space P and ciphertext space C are all {0, 1}^l, for some integer l. This means
all of them are nothing but the set of all possible strings of 0s and 1s of length l. So P, K, C
are of the same size. The probability of occurrence of any key is uniform, i.e., for any key
k, Pr(k) = 1/2^l. Now, for a plaintext x and key k, the encryption is c = x ⊕ k, where ⊕
is bitwise XOR. Decryption is exactly the opposite: for any ciphertext c and key k,
x = c ⊕ k. It can quite easily be verified that

dk(ek(x)) = dk(x ⊕ k) = (x ⊕ k) ⊕ k = x ⊕ (k ⊕ k) = x.

Since the key is uniformly random, the adversary does not gain any information about
the plaintext. Shannon proved that the one-time pad is perfectly secure [108].

Theorem 1.5 The one-time pad is perfectly secure.

Proof 1.6 Let X, Y, K denote the plaintext, the ciphertext and the key. For any x ∈ P
and y ∈ C, we want to show that Pr(X = x) = Pr(X = x | Y = y).

By Bayes’ theorem, Pr(X = x | Y = y) = Pr(X = x) Pr(Y = y | X = x) / Pr(Y = y).

Now,

Pr(Y = y | X = x) = Pr(X ⊕ K = y | X = x)
                  = Pr(x ⊕ K = y) = Pr(K = x ⊕ y) = 1/2^l.

And

Pr(Y = y) = ∑_x Pr(X = x) Pr(Y = y | X = x) = ∑_x Pr(X = x) · (1/2^l)
          = (1/2^l) ∑_x Pr(X = x) = 1/2^l.

So Pr(X = x | Y = y) = Pr(X = x) · (1/2^l) / (1/2^l) = Pr(X = x). □
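In code, a one-time pad is just an XOR with a fresh, uniformly random key of the same length as the message. Here `os.urandom` stands in for a perfect random source, which no real-world generator fully achieves.

```python
import os

def otp(key, data):
    # c = m XOR k, byte by byte; decryption is the very same operation.
    # The key must be as long as the message and must never be reused.
    return bytes(m ^ k for m, k in zip(data, key))

message = b"attack at dawn"
key = os.urandom(len(message))      # stand-in for a perfectly random source
ciphertext = otp(key, message)
recovered = otp(key, ciphertext)    # (m XOR k) XOR k = m
assert recovered == message
```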

Despite having perfect secrecy, the one-time pad has several drawbacks. In fact, some of
its properties have made it impractical to use.
• The one-time pad requires a perfectly random number generator. This requirement is
not practical, because in the classical world we still do not have any procedure
for generating perfectly random numbers.
• In the one-time pad, the key length is the same as the message length. This is a serious
drawback, because for sending a very long message, both the sender and the
receiver have to securely store an equally long key. This is difficult to achieve.
Also, when the two parties share the key between them, they may not have any
idea about the length of the message. In that case, they do not have any upper
bound on the size of the message, and naturally they cannot decide on the key, since
the required size is unknown.
• This technique is called the one-time pad because a single key cannot be
used to encrypt more than one message. Suppose the same key k has been used
to encrypt two messages x1 and x2, and suppose the corresponding ciphertexts are
y1 and y2. So y1 = x1 ⊕ k and y2 = x2 ⊕ k. Now, y1 and y2 are known to the
adversary. Also, y1 ⊕ y2 = (x1 ⊕ k) ⊕ (x2 ⊕ k) = (x1 ⊕ x2) ⊕ (k ⊕ k) = x1 ⊕ x2,
which is independent of k. So, from the information of one plaintext the adversary
can find out information about the other plaintext. For example, suppose the key is
k = 0001, and consider two messages x1 = 0000 and x2 = 0001. The corresponding
ciphertexts are y1 = 0001 and y2 = 0000. By XORing y1 and y2, the adversary
gets 0001. Since this is the same as x1 ⊕ x2, the adversary knows that x2 = x1 ⊕ 0001.
So, if a single bit of x1 is known to him, he can find out the corresponding bit of
x2.
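The key-reuse problem above is easy to demonstrate in code: XORing two ciphertexts produced under the same key cancels the key entirely. The key and plaintexts below are arbitrary values chosen for illustration.

```python
def xor_bytes(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

key = b"\x13\x37\x42\x99\x51"   # the SAME pad, wrongly used twice
x1 = b"HELLO"
x2 = b"WORLD"
y1 = xor_bytes(x1, key)
y2 = xor_bytes(x2, key)

# The adversary sees only y1 and y2, yet their XOR no longer involves the key:
assert xor_bytes(y1, y2) == xor_bytes(x1, x2)
# Knowing x1 therefore reveals x2 completely:
assert xor_bytes(xor_bytes(y1, y2), x1) == x2
```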

Recall the stream cipher principle described earlier: Alice derives a pseudorandom
keystream from the key, XORs it with the plaintext m to produce the ciphertext c, and
sends c to Bob; Bob, holding the same key, regenerates the same keystream and XORs it
with c to recover m. If any third party wants to get the plaintext, he needs to know the
keystream bits. But since the key is unknown to the third party, it is not possible for him
to generate the same keystream bits.

So, from the principle of the one-time pad it is clear that though it theoretically provides
the best possible security for the message, it is not practical; it can be considered just
a hypothetical construction. The principle of the stream cipher has
many similarities with the one-time pad. A stream cipher does not provide perfect
secrecy as defined by Shannon. The one-time pad requires the generation of perfectly
random numbers; stream ciphers do not generate truly random bits. Instead, the keystream
generation of stream ciphers achieves a property called "pseudorandomness".
Pseudorandomness is the property of something which is not random at all, but appears
random. In a stream cipher, the keystream is generated by applying an algorithm to the
key. The actual key is small in size, and it is not XORed with the message directly.
Rather, by applying the algorithm to the key, keystream bits are produced, and these are
XORed with the message.

Now, suppose the length of the key is ℓk and the length of the message is ℓm, where
ℓk is much smaller than ℓm. Then we need a keystream of length ℓm. There are 2^ℓm
possible strings of 0s and 1s of that length. So, if the keystream generation were perfectly
random, each of those 2^ℓm strings would have an equal probability of occurrence. But in
our stream cipher, since the key size is only ℓk, there are only 2^ℓk possible values for the
key. By applying the algorithm to them, at most 2^ℓk different strings can be produced,
which is negligible compared to 2^ℓm. So, quite naturally, the keystream is not random.
But due to the design of the cipher, an adversary cannot guess or obtain any information
about any keystream bit.
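The gap between 2^ℓk and 2^ℓm can be seen directly at toy scale: expand an 8-bit key into a 16-bit keystream with any deterministic algorithm, and at most 2^8 = 256 of the 2^16 = 65536 possible keystreams can ever occur. The expansion map below is an arbitrary toy of ours; the counting argument holds for any such map.

```python
def expand(key8):
    # Toy deterministic expansion of an 8-bit key into a 16-bit keystream:
    # iterate a small nonlinear map and collect the low bit each step.
    state, bits = key8, 0
    for i in range(16):
        state = (state * state + 0x5B) % 257   # arbitrary toy update
        bits |= (state & 1) << i
    return bits

distinct = {expand(k) for k in range(256)}
# However clever the expansion, 256 keys can yield at most 256 keystreams,
# a vanishing fraction of the 65536 possible 16-bit strings.
assert len(distinct) <= 256 < 2**16
```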

Unlike the one-time pad, in a stream cipher the key size does not depend on the size of
the message. It can be any fixed value, and an arbitrary number of keystream bits can be
generated from it using the generation algorithm. So a small key can easily be
stored by the sender and the receiver. Based on the message length, both of them can
generate the required number of keystream bits from the key.

Also, in a stream cipher, a single key may be used again and again for encryption.
Stream ciphers use another component besides the key, called the IV. The IV is also
a binary string, decided by the sender and the receiver. But unlike the key, the
IV is public, so the two parties do not have to share it over some private channel.
The advantage of having an IV is that if the same key is used for more than one encryption,
but with different IV values, the output keystream bits are different. Suppose k is a key
which is used more than once for encryption, and for two different encryptions the
IVs used are v1 and v2. The key-IV pair (k, v1) generates keystream bits different from
those of the pair (k, v2). Suppose these generated keystreams are k′1 and k′2 respectively. For
plaintexts x1 and x2, the ciphertexts are y1 = x1 ⊕ k′1 and y2 = x2 ⊕ k′2. So the XOR of
the ciphertexts gives y1 ⊕ y2 = (x1 ⊕ k′1) ⊕ (x2 ⊕ k′2) = (x1 ⊕ x2) ⊕ (k′1 ⊕ k′2). The adversary
cannot extract any information about one plaintext even if the other plaintext is known,
since k′1 ⊕ k′2 is unknown to him. So, using the same key, more than one encryption is
possible using different IVs. Thus the two parties can communicate for a long period with the
same key, just by changing the IV.

1.3 Cryptanalysis

Since stream ciphers do not have perfect secrecy, there is obviously some non-randomness
in the keystream generation. The strength of a stream cipher rests on the fact that this
non-randomness should not be identifiable by any feasible computation. Any non-randomness
is a weakness of a cipher. So, the aim of the cryptanalysis of a stream cipher is to find some
non-randomness in it. In cryptanalysis of a cipher, some fundamental assumptions
are followed by cryptographers. These assumptions were given by the Dutch cryptographer
Kerckhoffs. According to him, a secure cryptosystem should have the property that even
if everything about the system, except the key, is public, still the system cannot be
broken. So, the design of a cipher is not kept private. The adversary has knowledge
of the structure of the cipher and the IVs. This is the fundamental assumption of the
cryptanalysis of any stream cipher. After that, the further analysis is done based on

some more assumptions decided by the analyst. These assumptions depend on the power
of the adversary, i.e., how much control the adversary has over the cipher. Based on these
further assumptions, attacks on stream ciphers are divided into some categories [111,
86].

• Known Ciphertext Attack: In this assumption, the adversary knows some of
the ciphertext outputs, but does not know the corresponding plaintexts. This is a
very mild assumption, and with it, it is difficult to produce a strong attack. If a
cipher succumbs to such an attack, the cipher is certainly very weak.

• Known Plaintext Attack: Here the adversary is more powerful. He knows some
of the plaintexts and the corresponding ciphertexts, which have been produced by
encrypting those plaintexts.

• Chosen Plaintext Attack: In this model, the adversary has control over the plain-
texts. He not only knows the plaintexts, but can choose the plaintexts himself,
encrypt them with the cipher, and obtain the corresponding ciphertexts.

• Chosen IV Attack: In this model, the attacker not only can choose the plain-
texts, but also can control the IVs. He chooses the IVs, applies the algorithm and
obtains the keystream bits for some key.

• Chosen Ciphertext Attack: This model has the strongest assumption of all.
Here the adversary can choose the ciphertexts according to his wish and get back
the corresponding plaintext.

Based on the aim of the adversary, attacks are divided into different categories.

• Key Recovery: This is the highest aim of the adversary. He finds out the whole
secret key from the output keystream bits. Several key recovery attacks are known
against some famous ciphers.

• State Recovery: Even if the adversary can not achieve the key, he can find out
any intermediate state during the running of the cipher. If a particular state is
known, all the output keystreams generated further can be found. So, without
knowing the key, the adversary can achieve the output keystream bits. In some
cases, state recovery attack leads to key recovery attack.

• Distinguishing Attack: The attacker may not aim to find the key or any state.
Rather he can distinguish the output keystream bits produced by the cipher from
an actual random keystream bit. For this he has to find out some non-randomness
property of those keystream bits. This non-randomness property is called distin-
guisher. Sometimes distinguisher leads to key recovery attack.

• Weak Key: In this attack, adversary tries to find out a class of keys which pro-
vides huge non-randomness in the output keystream bits. For any key in this set,
the key recovery is very easy.

Now we describe some of the most common and popular attacks proposed against
stream ciphers.

• Algebraic Attack: This is an attack which involves solving algebraic equations.
This is a known plaintext attack. The attacker tries to find polynomial equations
over a finite field relating the key bits and keystream bits. After that, the system of
equations is solved by some equation-solving method (e.g., a SAT solver). From
this solution, the actual key can be recovered partially or fully. Some examples of
algebraic attacks on stream ciphers are [9, 10, 33, 34].

• Time Memory Data Tradeoff Attack: In this attack idea, the attacker mostly
tries to reduce the time complexity by using memory. This attack is mostly used
for state recovery, though that may lead to key recovery also. It has two phases.
The first phase is called precomputation, where the attacker studies the cipher
thoroughly, finds out some properties of it, and records them in some tables. This
procedure takes a long time, but that does not count in the complexity estimation
of the attack. Once the record is complete, it can be used again and again to
attack the cipher for different keys. In the second phase, which is the actual attack,
keystream bits are observed from the output, and based on the recordings done
in the first phase, the output keystreams are matched against the tables formed in the
first phase. Availability of the tables reduces the time complexity drastically, at the
cost of some space on the machine. This attack was first applied on
a block cipher by Hellman [58]. Later, it was applied on the stream cipher A5
by Golic [49]. Some famous TMDTO attacks are presented in [62, 102, 20, 22].

• Differential Attack: This is an interesting attack where the adversary has
control over the IVs. He can assign values to the IVs according to his wish. So,
the adversary takes two values for the IV, say v1 and v2, where except for one or a few
positions, all the bits of v1 and v2 are the same. Now he runs the algorithm for both
the IVs. At the end, two different keystreams z1 and z2 are obtained. Now, the
adversary tries to find some correlations between the bits of z1 and z2. Using those
correlations, the cipher can be analysed. Biham and Shamir [19] analysed DES using
the differential idea. Later, ideas of differential cryptanalysis on stream ciphers were
presented in [115, 116, 17].

• Correlation Attack: This is another kind of known plaintext attack model. This
attack is used on stream ciphers where the output keystream is produced by com-
bining the output of more than one LFSR by some non-linear Boolean function.
Instead of searching all the LFSR states together, a divide-and-conquer approach
is used where the adversary finds out the LFSR states one by one. For exam-
ple, suppose L1, L2, . . . , Lr are the LFSR states, and the output of each Li is xi.
Suppose Z = f(x1, x2, · · · , xr), where f is a non-linear Boolean function. The ad-
versary tries to find a correlation between x1 and Z. Now, instead of guessing
all the LFSR states, the adversary guesses only L1. If the correlation is satisfied by
the guess, then the guess is considered to be correct. After that he repeats the
same for L2 and so on. Finding each of the LFSR states separately reduces the
time complexity by a huge margin. Some of the famous correlation attacks are pre-
sented in [68, 85, 47, 48, 50, 51, 31]. Very recently, Zhang et al. [120] used this
idea to attack the stream cipher Fruit [3].

• Cube Attack: This attack idea was suggested by Dinur and Shamir [41] in Eu-
rocrypt 2009. Here, some positions of the IV bits (say, v positions) are chosen and
the remaining ones are assigned some fixed value. These v positions are assigned all 2^v
possible values and the algorithm is run for a few rounds. Suppose the initial state
is S and after r rounds the output is given by f(S). Now the values of f(S), produced
for all 2^v IVs, are XORed. Based on this, one tries to find some non-randomness of
f(S) by repeating the procedure with different keys. This non-randomness can be used
as a distinguisher for the stream cipher.

• Fault Attack: The fault attack is a recently emerged attack idea which needs the injection
of a fault during the running of the algorithm. With suitable tampering of the
hardware, the adversary changes some data in the middle of the algorithm. As a
result, the cipher produces faulty output. However, the actual output can also be
obtained by repeating the process without any tampering. Now, by comparing the
original output to the faulty one, the adversary can find some relations, which may
help to recover some secret information. This type of attack was first introduced
by Boneh et al. [26]. Some important results in this direction are presented
in [61, 26, 18, 59, 90].

1.4 Organization of the Thesis

The thesis presents cryptanalytic results on some stream ciphers, namely RC4, Salsa,
Chacha and Fruit. It is recommended that the chapters be read in the order they are
presented. However, the reader may choose to move quickly to a chapter of his/her
choice. A short summary of each chapter is presented as follows.

Chapter 1: In the current chapter, we have discussed some introductory materials re-
garding cryptography, and its major classifications.

Chapter 2: In this chapter, we generalize Roos bias which was observed by Roos in
1995. The materials of this chapter are based on our publication [40].

Chapter 3: Here we prove the bias of Zr = r in RC4. The materials of this chapter are
based on our work [39].

Chapter 4: In this chapter, we analyse Salsa and Chacha for reduced rounds. The
materials of this chapter are based on our publication [38].

Chapter 5: Here we analyse Fruit for full round. The materials of this chapter are
based on our publication [37].

Chapter 6: This chapter concludes the thesis. Here we present a comprehensive sum-
mary of our work that has been discussed throughout the thesis. We discuss open prob-
lems which might be interesting for further investigation along this line of research.

1.5 Experimental Framework of this Thesis

Throughout this thesis, we have furnished numerous experimental results supporting
our claims. We have performed all experiments using the following computing frame-
work.

• Operating System: Linux Ubuntu 16.04

• System Configuration: Intel(R) Xeon(R) CPU E5-2670 v3 @ 2.30GHz, 3 GB
RAM and 3 MB Cache.

• Coding Platform: C with gcc compiler.

1.6 Prerequisites

Cryptography is a highly mathematical endeavor. To understand the intricate details
presented in this thesis, one requires a strong foundation in Mathematics. We frequently
use involved results from combinatorics, probability, statistics and data structures in this
thesis, and expect the reader to possess a good grasp of these topics. A graduate-level
training in mathematics is recommended to read the material comfortably.

CHAPTER 2

Generalization of Roos bias in RC4

RC4 has attracted many cryptologists due to its simple structure. In Asiacrypt 2014, Pa-
terson et al. reported the results of a large-scale computation of RC4 biases. Among the
biases reported by them, we try to theoretically analyze a few which show very inter-
esting visual patterns. We first study the bias which relates the keystream byte Z_i with
i − K[0], where K[0] is the first byte of the secret key. We then present a generalization
of the Roos bias. In 1995, Roos observed the bias of the initial bytes S[i] of the permutation
after KSA towards f_i = ∑_{r=1}^{i} r + ∑_{r=0}^{i} K[r]. Here we study the probability of S[i] being
equal to f_y = ∑_{r=1}^{y} r + ∑_{r=0}^{y} K[r] for i ≠ y. Our generalization provides the complete
correlation between Z_i and i − f_y. We also analyze another key-keystream relation Z_i = f_{i−1},
which was studied by Maitra and Paul in FSE 2008. We provide more accurate formulas than the
existing works for the probability of both Z_i = i − f_i and Z_i = f_{i−1} for different i's.

RC4 is a stream cipher which has been widely used and has remained one
of the most popular ciphers in the world for the last 25 years. RC4 is a very simple
cipher and can be implemented in only a few lines of code. This cipher was designed
by Ron Rivest in 1987 at RSA Data Security, and it was used in products such as
Lotus Notes.

Though RC4 was a trade secret in the beginning, its description became public in 1994.
The cipher was then adopted by several network protocols: it has been used in SSL in 1995,
WEP [63] in 1997, TLS, and WPA [64] in 2003.

At first, we describe the design of RC4 briefly. It has two components. The
first component is the Key Scheduling Algorithm (KSA) and the other one is the Pseudo-
Random Generation Algorithm (PRGA). Here, all the operations are done modulo
256. The KSA takes the identity permutation S of 0 to 255. Using an ℓ-byte secret key,
it scrambles the identity permutation over Z_N and derives another permutation. Af-
ter completion of the KSA, the PRGA generates a pseudo-random sequence of keystream
bytes Z_1, Z_2, . . ., using the scrambled permutation of the KSA. In each iteration of the
PRGA, one output byte Z_i is produced. These are bitwise XOR-ed with the plaintext to
produce the ciphertext. Both the KSA and the PRGA use two indices i and j into
the permutation, and in both of them, a swap between S[i] and S[j] takes place.

KSA
N = 256;
Initialization:
  For i = 0, . . . , N − 1
    S[i] = i;
  j = 0;
Scrambling:
  For i = 0, . . . , N − 1
    j = (j + S[i] + K[i]);
    Swap(S[i], S[j]);

PRGA
Initialization:
  i = j = 0;
Keystream Generation Loop:
  i = i + 1;
  j = j + S[i];
  Swap(S[i], S[j]);
  t = S[i] + S[j];
  Output Z = S[t];

We use S_r^{KSA}, i_r^{KSA}, j_r^{KSA} to denote the permutation and the two indices after the r-th round
of the RC4 KSA. Hence S_N^{KSA} is the permutation after the complete key scheduling. Similarly,
S_r, i_r, j_r denote the permutation and the two indices after the r-th round of the RC4 PRGA.
So S_N^{KSA} = S_0. We use I_{a,b} to denote the indicator function:

  I_{a,b} = 1 for a = b, and I_{a,b} = 0 for a ≠ b.

Also, by the notation f_y, we denote the expression y(y+1)/2 + ∑_{r=0}^{y} K[r] (0 ≤ y ≤ N − 1),
which plays a vital role in most of the proposed attacks on RC4.

Owing to such a simple design, many cryptologists have been attracted to this
cipher. Throughout the last 25 years, multiple weaknesses of RC4 have been found. One of
the most remarkable attacks was presented by Fluhrer, Mantin and Shamir [44] in 2001.
This attack was based on the weaknesses in the Key Scheduling Algorithm. In 1995,
Roos [101] observed that after the KSA, the most likely value of S_N^{KSA}[y] for the first few
values of y is given by S_N^{KSA}[y] = f_y. The experimentally found values of the probabilities
Pr(S_N^{KSA}[y] = f_y) decrease from 0.37 to 0.006 as y increases from 0 to 47. Later, the
theoretical proof of this was given by Paul et al. in SAC 2007 [95]. Recently Sarkar et
al. [38] improved the analysis of [95]. In [95], the authors also discussed a reconstruction
algorithm to find the key from the final permutation S_N after KSA using Roos biases.

Klein [70] observed correlations between keystreams and the key using Roos biases. In
FSE 2008, Maitra et al. [78] showed that not only the permutation bytes S_N^{KSA}[y], but also
the bytes S_N^{KSA}[S_N^{KSA}[y]], S_N^{KSA}[S_N^{KSA}[S_N^{KSA}[y]]] etc. are biased towards f_y. Then in SAC 2010,
Sepehrdad et al. [106] showed some biases on the state variables, initial keystream
bytes and secret key of RC4. They also gave a key recovery attack on RC4 in WPA.
In Eurocrypt 2011, Sepehrdad et al. [107] presented an attack on WEP by using all the
previously known attacks in the literature and by introducing a few new correlations.

In USENIX 2013, AlFardan et al. [2] used a Bayesian statistical method that recov-
ers plaintexts in a broadcast attack model, i.e., plaintexts that are repeatedly encrypted
with different keys under RC4. AlFardan et al. successfully used their idea to at-
tack the cryptographic protocol TLS by exploiting biases in RC4 keystreams. In FSE
2014, Paterson et al. [94] and Sengupta et al. [54] independently exploited keystream
and key correlations to recover plaintext in WPA, since the first three bytes of the RC4
key in WPA are public. In Asiacrypt 2014, Paterson et al. [93] improved the attack
of [94]. They performed large-scale computations using the Amazon EC2 cloud com-
puting infrastructure to obtain accurate estimates of the single-byte and double-byte
distributions.

The recent attacks on RC4-based protocols have led to the consensus that RC4 is
insecure and should be phased out. For example, Vanhoef et al. [113] presented an
attack on TLS and WPA using RC4 (USENIX'15). Also, Banik et al. [67] presented
some works on the joint distribution of keystream biases. These works show that RC4 is
still an active area of research.

2.1 Negative bias of Zi towards i − K[0]

Let us start with the following lemma.

Lemma 2.1 After KSA, Pr(S_N^{KSA}[i] = K[0]) = (1/N)(1 − 1/N)^{N−1−i} for i ≥ 1.

Proof 2.2 If S_i^{KSA}[j_{i+1}^{KSA}] = K[0], after the swap we have S_{i+1}^{KSA}[i] = K[0]. Now
Pr(S_i^{KSA}[j_{i+1}^{KSA}] = K[0]) = 1/N, since j_{i+1}^{KSA} is random. Also, S_N^{KSA}[i] will be K[0]
only if the later j^{KSA}'s do not touch i again, i.e., if all of j_{i+2}^{KSA}, . . . , j_N^{KSA} are
different from i, then S_N^{KSA}[i] will be K[0]. The probability of j_{i+2}^{KSA}, j_{i+3}^{KSA}, . . . , j_N^{KSA} ≠ i
is (1 − 1/N)^{N−1−i}. Therefore,

  Pr(S_N^{KSA}[i] = K[0]) = (1/N)(1 − 1/N)^{N−1−i}  for i ≥ 1.

Now we have the following result.

Lemma 2.3 In PRGA, for i ≥ 1,

  Pr(S_{i−1}[i] = K[0]) = p_i (1 − 1/N)^{i−1} + (1/N)(1 − 1/N)^{i−2} ∑_{l=1}^{i−1} p_l
    + ∑_{r=2}^{i−1} (1/N^r)(1 − 1/N)^{i−r−1} ∑_{l=1}^{i−1} p_l · (i−l−1 choose r−1),

where p_i = (1/N)(1 − 1/N)^{N−1−i}.

Proof 2.4 We find the probability of this event by breaking it into mutually disjoint
events and finding their probabilities separately.

• Event 1: After the completion of KSA, K[0] is in the i-th location of the array
(whose probability is p_i by Lemma 2.1) and this position is not touched
by j_1, . . . , j_{i−1}. The probability of this event is p_i (1 − 1/N)^{i−1}.

• Event 2: After the completion of KSA, K[0] is in some l-th location of the array
(whose probability is p_l) where 1 ≤ l ≤ i − 1. This position is not touched by
j_1, . . . , j_{l−1}. Then j_l = i. After that j_{l+1}, . . . , j_{i−1} ≠ i. Since l can vary from 1 to
i − 1, the total probability of the above path is ∑_{l=1}^{i−1} (1/N)(1 − 1/N)^{i−2} p_l.

• Event 3: After the completion of KSA, K[0] is in the l-th location of the array where
1 ≤ l ≤ i − 1. This position is not touched by j_1, . . . , j_{l−1}. Then j_l = t for some
l + 1 ≤ t ≤ i − 1. After that j_{l+1}, . . . , j_{t−1} ≠ t. Then j_t = i. Also j_{t+1}, . . . , j_{i−1} ≠ i.
The total probability of this path is ∑_{l=1}^{i−1} ∑_{t=l+1}^{i−1} (1/N²)(1 − 1/N)^{i−3} p_l.
Similarly, K[0] can come to the i-th location through more than two jumps. If it comes
through r jumps, the total probability is

  (1/N^r)(1 − 1/N)^{i−r−1} ∑_{l_1=1}^{i−1} ∑_{l_2=l_1+1}^{i−1} · · · ∑_{l_r=l_{r−1}+1}^{i−1} p_{l_1}
  = (1/N^r)(1 − 1/N)^{i−r−1} ∑_{l_1=1}^{i−1} p_{l_1} ( ∑_{l_2=l_1+1}^{i−1} · · · ∑_{l_r=l_{r−1}+1}^{i−1} 1 )
  = (1/N^r)(1 − 1/N)^{i−r−1} ∑_{l_1=1}^{i−1} p_{l_1} · (i−l_1−1 choose r−1).

Thus, adding the probabilities of these disjoint events, we have

  Pr(S_{i−1}[i] = K[0]) = p_i (1 − 1/N)^{i−1} + (1/N)(1 − 1/N)^{i−2} ∑_{l=1}^{i−1} p_l
    + ∑_{r=2}^{i−1} (1/N^r)(1 − 1/N)^{i−r−1} ∑_{l=1}^{i−1} p_l · (i−l−1 choose r−1).

We can use this lemma to find the probability Pr(Zi = i − K[0]). The following theorem
gives a bias of Zi towards (i − K[0]).

Theorem 2.5 We have

  Pr(Z_i = i − K[0]) =
    Pr(S_0[1] = K[0]) · (1/N)(1 − 1/N) + (1 − 1/N + 1/N²)(1/N),  for i = 1,
    Pr(S_{i−1}[i] = K[0]) · (1/N) + (1 − 1/N)(1/N),  for i > 1.

Proof 2.6 First consider i > 1.



1. Consider the event A : (S_{i−1}[i] ≠ K[0]) ∩ (S_{i−1}[j_i] = i − K[0]). So after the swap
S_i[i] = i − K[0] and S_i[j_i] ≠ K[0]. So Z_i = S_i[S_i[i] + S_i[j_i]] ≠ S_i[i] = i − K[0].

2. Next consider the event B : (S_{i−1}[i] = K[0]) ∩ (S_{i−1}[j_i] = i − K[0]). Then Z_i =
S_i[S_i[i] + S_i[j_i]] = S_i[i] = i − K[0].

3. Now consider the event C = (A ∪ B)^c. In this case Pr(Z_i = i − K[0]) = 1/N, con-
sidering random association. Also Pr(C) = 1 − Pr(A ∪ B) = 1 − Pr(S_{i−1}[j_i] =
i − K[0]) = 1 − 1/N.

Thus

  Pr(Z_i = i − K[0]) = Pr(Z_i = i − K[0] | A) Pr(A) + Pr(Z_i = i − K[0] | B) Pr(B)
      + Pr(Z_i = i − K[0] | C) Pr(C)
    = 0 · Pr(A) + 1 · Pr(B) + (1/N) · Pr(C)
    = Pr(S_{i−1}[i] = K[0]) · (1/N) + (1 − 1/N)(1/N).

Now for i = 1, j_1 = 1 when S_0[1] = 1. In this case B is an impossible event. So for
i = 1, we take

  A : (S_0[1] ≠ K[0]) ∩ (S_0[j_1] = 1 − K[0]) ∩ (K[0] ≠ 1),
  B : (S_0[1] = K[0]) ∩ (S_0[j_1] = 1 − K[0]) ∩ (K[0] ≠ 1).

In this case

  Pr(Z_1 = 1 − K[0]) = Pr(S_0[1] = K[0]) · (1/N)(1 − 1/N) + (1 − 1/N + 1/N²)(1/N).

In Fig. 2.1, we plot the theoretical as well as experimental values of Pr(Zi = i−K[0])
with key length 16, where the experiments have been run over 100 billion trials of RC4
PRGA with randomly generated keys.

2.2 Generalization of Roos Bias and bias of Zi = i − fy

Theoretical justification of the Roos bias first appeared in [95]. Recently the work
of [95] has been revisited in [38]. We need the following result of [38, Lemma 2].
Lemma 2.7 In KSA, the probability Pr(S_{i+1}^{KSA}[i] = f_i) can be computed by the
expression of [38, Lemma 2], which involves the quantities p_1 and p_2 given by

  p_1 = ∑_{c=1} [1/(Φ((b−μ)/σ) − Φ(−μ/σ))] · ∫_{cN−0.5}^{min{cN+0.5, i(i+1)/2}} (1/σ) φ((x−μ)/σ) dx,

  p_2 = ∑_{c=0} [1/(Φ((b−μ)/σ) − Φ(−μ/σ))] · ∫_{0.5+cN}^{min{(c+1)N−0.5, i(i+1)/2}} (1/σ) φ((x−μ)/σ) dx,

  μ = ∑_{p=0}^{i} ∑_{x=0}^{p−1} x · (1/N)(1 − 1/N)^{p−x},

  σ² = ∑_{p=0}^{i} [ ∑_{x=0}^{p−1} x² · (1/N)(1 − 1/N)^{p−x} − ( ∑_{x=0}^{p−1} x · (1/N)(1 − 1/N)^{p−x} )² ],

where φ(x) = (1/√(2π)) e^{−x²/2} is the density function of the standard normal distribution.

Also the following result is proved in [38, Theorem 2].

Lemma 2.8

  Pr(S_N^{KSA}[i] = f_i) = Pr(S_{i+1}^{KSA}[i] = f_i) · (1 − 1/N)^{N−1−i}
    + (1 − Pr(S_{i+1}^{KSA}[i] = f_i)) · ∑_{t=i+1}^{N−1} (1/N²)(1 − 1/N)^{N−1−t}.

Now we find Pr(S_N^{KSA}[i] = f_y) for 0 ≤ i ≤ N − 1 and 1 ≤ y ≤ N − 1 with i ≠ y.
Lemma 2.9 For i ≠ y with y ≥ 1, we have

  Pr(S_N^{KSA}[i] = f_y) = (1/N)(1 − 1/N)^{N−i−1}
    + (1 − Pr(S_{y+1}^{KSA}[y] = f_y) − 1/N) · ∑_{t=i+1}^{N−1} (1/N²)(1 − 1/N)^{N−1−t}.

Proof 2.10 We have two cases.

1. Case I: Let S_i^{KSA}[j_{i+1}^{KSA}] = f_y. This happens with probability 1/N. So after the
swap, S_{i+1}^{KSA}[i] becomes f_y. Also j_{i+2}^{KSA}, . . . , j_N^{KSA} ≠ i. So the probability of this
path is (1/N)(1 − 1/N)^{N−i−1}. On the other hand, if S_i^{KSA}[j_{i+1}^{KSA}] = f_y and
i ∈ {j_{i+2}^{KSA}, . . . , j_N^{KSA}}, S_N^{KSA}[i] will always be different from f_y.

2. Case II: If i < y and S_{y+1}^{KSA}[y] = f_y, then S_N^{KSA}[i] cannot be f_y, as the y-th location of
the S array cannot move to the left side when the running index is greater than y. On
the other hand, if i > y and S_{y+1}^{KSA}[y] = f_y, then S_N^{KSA}[i] can be f_y only through the
first event. So we need S_{y+1}^{KSA}[y] ≠ f_y. Let us consider the scenario where S_t^{KSA}[t] =
f_y for some t > i. This holds with probability 1/N. Suppose that j_{t+1}^{KSA} = i and
j_{t+2}^{KSA}, · · · , j_N^{KSA} are all different from i. Hence after the swap we get S_{t+1}^{KSA}[i] = f_y
and this location is not disturbed in further rounds of KSA. This path holds with
probability (1/N²)(1 − 1/N)^{N−1−t}.

Thus if i ≠ y,

  Pr(S_N^{KSA}[i] = f_y) = (1/N)(1 − 1/N)^{N−i−1} · 1 + (1/N)(1 − (1 − 1/N)^{N−i−1}) · 0
    + (1 − Pr(S_{y+1}^{KSA}[y] = f_y) − 1/N) · ∑_{t=i+1}^{N−1} (1/N²)(1 − 1/N)^{N−1−t}.

In Figure 2.2, we present both theoretical and experimental results for Pr(S_N^{KSA}[i] = f_y)
for 0 ≤ i, y ≤ 50 with i ≠ y. From the figure it is clear that there are some anomalies
when the key length is 16. This is because there are some f_y's whose parities are the
same when the key length is 16. We will discuss this issue for key-keystream relations
in Theorem 2.21.

Figure 2.2: Probability Pr(S_N^{KSA}[i] = f_y) for 0 ≤ i, y ≤ 50 with i ≠ y. Here (a) theoretical
values, (b) experimental results with 16 byte key and (c) experimental results with 256 byte key.

Lemma 2.11 In PRGA,

  Pr(S_{i−1}[i] = f_y) = Pr(S_N^{KSA}[i] = f_y) (1 − 1/N)^{i−1}
    + ∑_{r=1}^{i−1} (1/N^r)(1 − 1/N)^{i−r−1} ∑_{l=1}^{i−1} Pr(S_N^{KSA}[l] = f_y) · (i−l−1 choose r−1)

for 1 ≤ i ≤ N − 1 and 1 ≤ y ≤ N − 1.

Proof 2.12 Similar to Lemma 2.3. 

Now consider the following event C1 for the occurrence of Z_i = i − f_i for i ≥ 1.

1. S_N^{KSA}[i] = f_i
2. j_1, . . . , j_{i−1} ≠ i.
3. S_{i−1}[j_i] ≠ i − f_i

Since S_i[i] + S_i[j_i] ≠ f_i + i − f_i = i, under this event we have Pr(Z_i = i − f_i) = 1/(N−1).
The above path holds with probability a_i = Pr(S_N^{KSA}[i] = f_i)(1 − 1/N)^i.

Now we will prove the following theorems.

Theorem 2.13

  Pr(Z_1 = 1 − f_y) =
    Pr(S_0[1] = f_y) · (1/N)(1 − 1/N) + a_1 · (1/(N−1)) · I_{1,y}
      + (1 − 1/N + 1/N² − a_1 I_{1,y}) · (1/N),  for y ≠ 2,
    Pr(S_0[1] = f_y) · (1/N)(1 − 1/N)
      + (1 − 1/N + 1/N² − (2/N − 1/N²) · Pr(S_0[2] = f_2)) · (1/N),  for y = 2,

where a_1 = Pr(S_N^{KSA}[1] = f_1)(1 − 1/N).


Proof 2.14 Here the events are A : (S_0[1] ≠ f_y) ∩ (S_0[j_1] = 1 − f_y) ∩ (f_y ≠ 0) and
B : (S_0[1] = f_y) ∩ (S_0[j_1] = 1 − f_y) ∩ (f_y ≠ 0). One can see that Pr(Z_1 = 1 − f_y | A) = 0
and Pr(Z_1 = 1 − f_y | B) = 1.

Also, if S_0[1] + S_0[S_0[1]] = 2 and S_0[2] = f_2, Z_1 will always be different from 1 − f_2.
Also Pr(S_0[1] + S_0[S_0[1]] = 2) = 2/N − 1/N², as one path comes from S_0[1] = 1. Hence the
required result.

Similarly, we find the bias of Z2 towards 2 − fy in the next theorem.

Theorem 2.15 We have

  Pr(Z_2 = 2 − f_y) =
    Pr(S_1[2] = f_y) · (1/N) + a_2 · (1/(N−1)) · I_{2,y} + (1 − 1/N − a_2 I_{2,y}) · (1/N),  for y ≤ 2,
    Pr(S_1[2] = f_y) · (1/N) + β · (1/(N−1)) + (1 − 1/N − α − β) · (1/N),  for y > 2,

where

1. α = (2/N − 1/N²) · [η + (1/N)(1 − η)(1 − 1/N)]

2. β = (1 − 2/N + 1/N²) · [η + (1/N)(1 − η)(1 − 1/N)]

3. η = ∏_{i=1}^{y} (1 − i/N) · (1 − y/N) · (1 − 1/N)^N

4. a_2 = Pr(S_N^{KSA}[2] = f_2)(1 − 1/N)²

Proof 2.16 For y ≤ 2, the paths are the same as in Theorem 2.5. But for y > 2, we have
two more paths:

1. C : (S_1[y] = f_y) ∩ (f_y ≠ 2) ∩ (Z_2 = 0),

2. D : (S_1[y] = f_y) ∩ (f_y ≠ 2) ∩ (Z_2 ≠ 0).

We have Pr(Z_2 = 2 − f_y | C) = 0. Also Pr(Z_2 = 2 − f_y | D) = 1/(N−1), as Z_2 ≠ 0 and f_y ≠ 2.

Now consider the events j_t^{KSA} ∉ {t, . . . , y} for 1 ≤ t ≤ y, f_y ∉ {0, 1, · · · , y−1} and
j_t^{KSA} ≠ f_y for 1 ≤ t ≤ y. Then S_{y+1}^{KSA}[y] = f_y. Also if j_{y+2}^{KSA}, . . . , j_N^{KSA}, j_1 ≠ f_y, we have
S_1[y] = f_y. Call this path E. Here Pr(E) = ∏_{i=1}^{y} (1 − i/N) · (1 − y/N) · (1 − 1/N)^N.
One can see [95] that Pr(S_1[y] = f_y | E) = 1. Also assume Pr(S_1[y] = f_y | E^c) = 1/N.
Again from [83], we know Pr(Z_2 = 0) = 2/N − 1/N². We have

  Pr(C) = Pr(S_1[y] = f_y ∩ f_y ≠ 2) Pr(Z_2 = 0)
    = (2/N − 1/N²) [ Pr(S_1[y] = f_y ∩ f_y ≠ 2 ∩ E) + Pr(S_1[y] = f_y ∩ f_y ≠ 2 ∩ E^c) ]
    = (2/N − 1/N²) [ Pr(E) + Pr(S_1[y] = f_y | E^c) · Pr(E^c) · Pr(f_y ≠ 2) ]
    = (2/N − 1/N²) [ Pr(E) + (1/N)(1 − Pr(E))(1 − 1/N) ].

Similarly, Pr(D) = (1 − 2/N + 1/N²) [ Pr(E) + (1/N)(1 − Pr(E))(1 − 1/N) ].


 


For all i greater than 2, the following theorem gives the probability Pr(Zi = i − fy ).

Theorem 2.17 We have

  Pr(Z_i = i − f_y) = Pr(S_{i−1}[i] = f_y) · (1/N) + a_i · (1/(N−1)) · I_{i,y}
    + (1 − 1/N − a_i I_{i,y}) · (1/N),

for 3 ≤ i ≤ N − 1 and 1 ≤ y ≤ N − 1, where a_i = Pr(S_N^{KSA}[i] = f_i)(1 − 1/N)^{i−1}(1 − 1/N).

Proof 2.18 Similar to Theorem 2.5, we consider the events A : (S_{i−1}[i] ≠ f_y) ∩
(S_{i−1}[j_i] = i − f_y) and B : (S_{i−1}[i] = f_y) ∩ (S_{i−1}[j_i] = i − f_y). In these cases,
Pr(Z_i = i − f_y) is 0 and 1 respectively. Next we consider C = (A ∪ B)^c. Then
Pr(C) = (1 − 1/N). But in the case i = y, C can be divided into two mutually disjoint
events C1 and C1^c (as mentioned just before Theorem 2.13). Evaluating the
probabilities of all these events, we get the result.

In Figure 2.3, we present both theoretical and experimental results for Pr(Z_i = i − f_y) for
1 ≤ i ≤ 50, 0 ≤ y ≤ 50 with i ≠ y. From the figure it is clear that there are some anomalies.
Among them, the probability of Z_2 = 2 − f_31 is the most significant. We observe
Pr(Z_2 = 2 − f_31) = 1/N + 0.82/N². However, if the key length is 256, we get
Pr(Z_2 = 2 − f_31) = 1/N − 0.11/N², which matches exactly with the theoretical value.
When the key length is 16, we have the following result.
Theorem 2.19 Pr(Z_2 = 2 − f_31) = (2/N)(2/N − 1/N²) + (1 − 2/N + 1/N²) · ((N/2 − 1)/(N−1)) · (2/N)
when the length of the key is 16.

Figure 2.3: Probability Pr(Z_i = i − f_y) for 1 ≤ i ≤ 50, 0 ≤ y ≤ 50 with i ≠ y. Here (a)
theoretical values and (b) experimental results with 16 byte key.

Proof 2.20 We divide it into two disjoint events, A : (Z_2 = 0) and B : (Z_2 ≠ 0). We
know that Pr(A) = 2/N − 1/N² and Pr(B) = 1 − 2/N + 1/N². Also, one can see that if the
length of the key is 16, f_31 = 496 + ∑_{i=0}^{31} K[i] = 496 + 2 ∑_{i=0}^{15} K[i] is always even.
Hence Pr(f_31 = 2) = 2/N. So,

  Pr(Z_2 = 2 − f_31) = Pr(Z_2 = 2 − f_31 ∩ Z_2 = 0) + Pr(Z_2 = 2 − f_31 ∩ Z_2 ≠ 0)
    = Pr(Z_2 = 2 − f_31 | Z_2 = 0) Pr(Z_2 = 0) + Pr(Z_2 = 2 − f_31 | Z_2 ≠ 0) Pr(Z_2 ≠ 0)
    = Pr(f_31 = 2 | Z_2 = 0) · Pr(Z_2 = 0) + Pr(Z_2 = 2 − f_31 | Z_2 ≠ 0) Pr(Z_2 ≠ 0)
    = (2/N)(2/N − 1/N²) + (1 − 2/N + 1/N²) · ((N/2 − 1)/(N−1)) · (2/N).

Theorem 2.19 gives Pr(Z_2 = 2 − f_31) = 1/N + 1/N², which matches closely with the exper-
imental value. We also have another set of biases when the key length is 16.

Theorem 2.21

  Pr(Z_{3+r} = 3 + r − f_{35+r}) = [ (2/N − 1/N²)(2/N) + ((1 − 2/N)/(N−1))(1 − 2/N) ] · Pr(S_{3+r−1}[3+r] = f_{3+r})
    + [ ((1 − 2/N)/(N−1))(2/N) + (1/N)(1 − 2/N) ] · (1 − Pr(S_{3+r−1}[3+r] = f_{3+r}))

for r ≥ 0, when the length of the key is 16.

Proof 2.22 We have

  f_{35+r} − f_{3+r} = ( ∑_{i=0}^{35+r} i + ∑_{i=0}^{35+r} K[i] ) − ( ∑_{i=0}^{3+r} i + ∑_{i=0}^{3+r} K[i] )
    = ( ∑_{i=0}^{35+r} i − ∑_{i=0}^{3+r} i ) + ( ∑_{i=0}^{35+r} K[i] − ∑_{i=0}^{3+r} K[i] )
    = 624 + 32r + ∑_{i=4+r}^{35+r} K[i]
    = 624 + 32r + ∑_{i=4+r}^{19+r} K[i] + ∑_{i=20+r}^{35+r} K[i]
    = 624 + 32r + ∑_{i=4+r}^{19+r} K[i] + ∑_{j=4+r}^{19+r} K[j+16]    [j = i − 16]
    = 624 + 32r + ∑_{i=4+r}^{19+r} K[i] + ∑_{j=4+r}^{19+r} K[j]    (since the key length is 16, K[j+16] = K[j])
    = 624 + 32r + 2 ∑_{i=4+r}^{19+r} K[i].

One can see that f_{35+r} − f_{3+r} will always be even, which means f_{3+r} and f_{35+r}
will be of the same parity for r ≥ 0, i.e., either both are even or both are odd,
when the length of the key is 16. So, for one value of f_{3+r}, there are N/2 possible values
for f_{35+r}. So Pr(f_{35+r} = f_{3+r}) = 2/N. Also Pr(Z_r = r − S_{r−1}[r]) = 2/N − 1/N² by
Jenkins' correlation [66].

Now

  Pr(Z_{3+r} = 3 + r − f_{35+r})
    = Pr(Z_{3+r} = 3 + r − f_{35+r} | S_{3+r−1}[3+r] = f_{3+r}) Pr(S_{3+r−1}[3+r] = f_{3+r})
      + Pr(Z_{3+r} = 3 + r − f_{35+r} | S_{3+r−1}[3+r] ≠ f_{3+r}) Pr(S_{3+r−1}[3+r] ≠ f_{3+r}).

Splitting each conditional probability over the disjoint events f_{3+r} = f_{35+r} (probability 2/N)
and f_{3+r} ≠ f_{35+r} (probability 1 − 2/N), and using Jenkins' correlation together with the
residual probabilities in the remaining cases, we obtain

  Pr(Z_{3+r} = 3 + r − f_{35+r}) = [ (2/N − 1/N²)(2/N) + ((1 − 2/N)/(N−1))(1 − 2/N) ] Pr(S_{3+r−1}[3+r] = f_{3+r})
    + [ ((1 − 2/N)/(N−1))(2/N) + (1/N)(1 − 2/N) ] (1 − Pr(S_{3+r−1}[3+r] = f_{3+r})).

Using Lemma 2.11, one can find Pr(S_{3+r−1}[3+r] = f_{3+r}). From Theorem 2.21, we
calculate Pr(Z_{3+r} = 3 + r − f_{35+r}), which is (1/N + 0.31/N²) when r = 0 and decreases as r
increases.

Remark 2.23 In Theorem 2.19 and Theorem 2.21, we justified two biases observed in
experiments for key length 16. However, using the same argument, we can generalise the
results to any key length. If the key length is ℓ, we observe a similar bias in Pr(Z_2 =
2 − f_{2ℓ−1}) and Pr(Z_{3+r} = 3 + r − f_{3+2ℓ+r}). These biases can be explained similarly,
i.e., f_{2ℓ−1} and (f_{3+2ℓ+r} − f_{3+r}) are always even. So this increases the probabilities
Pr(f_{2ℓ−1} = 2) and Pr(f_{3+2ℓ+r} = f_{3+r}) to 2/N.

2.2.1 Probability of Z_i = i − f_i

Let us first start with y = i. In this case, the results were discovered in [70] and proved
rigorously in [78]. It was shown in [78, Theorem 3] that

1. Pr(Z_1 = 1 − f_1) = (1/N) (1 + ((N−1)/N)^{N+2} + 1/N),

2. Pr(Z_i = i − f_i) = (1/N) [ 1 + ( ((N−i)/N) ((N−1)/N)^{i(i+1)/2 + N} + 1/N ) ( ((N−1)/N)^{i−1} − 1/N ) + 1/N ]
for i ∈ [2, N − 1].

In Table 2.1 we present our comparative study of the correlation probabilities. We present the theoretical values of Pr(Z_i = i − f_i) for 1 ≤ i ≤ 64 according to Theorem 2.13 and also according to the formula of [78]. We have calculated the values p_i, which are required to find the a_i in Pr(Z_i = i − f_i), using numerical methods available in Sage [103]. The experimental values are averaged over 100 billion key schedulings, where the keys are of length 16 and randomly generated.

From Table 2.1, it is clear that our estimation gives a much better approximation than [78]. One can also note from Table 2.1 that Pr(Z_i = i − f_i) < 1/N for i ∈ [52, 64]. The formula of [78] cannot capture this negative bias. For example, when y = 64, the formula of [78] gives Pr(Z_64 = 64 − f_64) = 1/N + 1.82/N², but actually Pr(Z_64 = 64 − f_64) < 1/N.

Remark 2.24 In [54], the authors studied linear relations between keystream bytes and the key. They used these relations to recover plaintexts of WPA, as the first three bytes of the key are public. To recover the first byte of plaintext, they used the relation Z_1 = 1 − f_1. From Table 2.1, one can note that our theoretical estimation of Pr(Z_1 = 1 − f_1) is better than that of the existing work [78].

i Pr(Zi = i − fi )
[78] 0.005367 0.005332 0.005305 0.005273 0.005237 0.005196 0.005153 0.005106
1-8 Exp. 0.005264 0.005298 0.005280 0.005241 0.005211 0.005169 0.005127 0.005077
Thm. 2.13 0.005320 0.005298 0.005270 0.005238 0.005202 0.005161 0.005117 0.005070
[78] 0.005056 0.005005 0.004951 0.004897 0.004842 0.004787 0.004732 0.004677
9-16 Exp. 0.005028 0.004974 0.004921 0.004864 0.004808 0.004751 0.004697 0.004639
Thm. 2.13 0.005020 0.004968 0.004914 0.004859 0.004803 0.004747 0.004691 0.004636
[78] 0.004624 0.004572 0.004521 0.004473 0.004426 0.004382 0.00434 0.004301
17-24 Exp. 0.004586 0.004532 0.004481 0.004431 0.004385 0.004338 0.004298 0.004256
Thm. 2.13 0.004582 0.004529 0.004478 0.004429 0.004382 0.004338 0.004291 0.004252
[78] 0.004264 0.004230 0.004198 0.004169 0.004142 0.004117 0.004095 0.004075
25-32 Exp. 0.004220 0.004184 0.004154 0.004123 0.004097 0.004073 0.004050 0.004031
Thm. 2.13 0.004215 0.004181 0.004149 0.004121 0.004094 0.004070 0.004049 0.004029
[78] 0.004057 0.004041 0.004026 0.004014 0.004002 0.003993 0.003984 0.003976
33-40 Exp. 0.004013 0.003998 0.003985 0.003972 0.003962 0.003953 0.003945 0.003938
Thm. 2.13 0.004012 0.003997 0.003983 0.003971 0.003961 0.003952 0.003944 0.003937
[78] 0.003970 0.003964 0.003959 0.003955 0.003952 0.003949 0.003946 0.003944
41-48 Exp. 0.003932 0.003927 0.003922 0.003919 0.003916 0.003914 0.003911 0.003910
Thm. 2.13 0.003931 0.003926 0.003922 0.003919 0.003916 0.003913 0.003911 0.003909
[78] 0.003942 0.003940 0.003939 0.003938 0.003937 0.003937 0.003936 0.003935
49-56 Exp. 0.003908 0.003907 0.003906 0.003906 0.003905 0.003905 0.003904 0.003904
Thm. 2.13 0.003908 0.003907 0.003906 0.003905 0.003905 0.003904 0.003904 0.003904
[78] 0.003935 0.003935 0.003934 0.003934 0.003934 0.003934 0.003934 0.003934
57-64 Exp. 0.003904 0.003904 0.003904 0.003904 0.003904 0.003905 0.003905 0.003905
Thm. 2.13 0.003904 0.003904 0.003904 0.003904 0.003904 0.003905 0.003905 0.003905

Table 2.1: Comparison of our work with the work of [78] and experimental values.
Pr(Z1 = 1 − f2 ) Pr(Z1 = 1 − f3 ) Pr(Z1 = 1 − f4 ) Pr(Z1 = 1 − f5 ) Pr(Z1 = 1 − f6 )
Thm. Exp. Thm. Exp. Thm. Exp. Thm. Exp. Thm. Exp.
0.003886 0.003882 0.003897 0.003897 0.003897 0.003998 0.003898 0.003998 0.003898 0.003998
Pr(Z2 = 2 − f3 ) Pr(Z2 = 2 − f4 ) Pr(Z2 = 2 − f5 ) Pr(Z2 = 2 − f6 ) Pr(Z2 = 2 − f7 )
Thm. Exp. Thm. Exp. Thm. Exp. Thm. Exp. Thm. Exp.
0.003892 0.003891 0.003892 0.003892 0.003892 0.003892 0.003893 0.003892 0.003893 0.003893
Pr(Z3 = 3 − f4 ) Pr(Z3 = 3 − f5 ) Pr(Z3 = 3 − f6 ) Pr(Z3 = 3 − f7 ) Pr(Z3 = 3 − f8 )
Thm. Exp. Thm. Exp. Thm. Exp. Thm. Exp. Thm. Exp.
0.003897 0.003897 0.003898 0.003897 0.003898 0.003898 0.003898 0.003898 0.003898 0.009899
Pr(Z4 = 4 − f5 ) Pr(Z4 = 4 − f6 ) Pr(Z4 = 4 − f7 ) Pr(Z4 = 4 − f8 ) Pr(Z4 = 4 − f9 )
Thm. Exp. Thm. Exp. Thm. Exp. Thm. Exp. Thm. Exp.
0.003898 0.003897 0.003898 0.003898 0.003898 0.003898 0.003898 0.003898 0.003899 0.003898

Table 2.2: Theoretical and experimental values of a few probabilities Pr(Zi = i − fy) for y > i.

Theorem 2.17 also gives a negative bias of Pr(Z_i = i − f_y) for y > i. In Table 2.2, we present a few theoretical and experimental values. The experimental values are averaged over 100 billion different keys, where the keys are of length 16 and randomly generated.

2.3 Biases of Zi towards fi−1

In this section we study the probability Pr(Z_i = f_{i−1}). In FSE 2008, Maitra and Paul [78] observed this type of bias. In [78, Theorem 6], it is claimed that

\[
\Pr(Z_i = f_{i-1}) = \frac{N-1}{N}\cdot\frac{N-i}{N}\cdot\frac{N-i+1}{N}\left(\frac{N-1}{N}\right)^{\frac{i(i-1)}{2}+i}
\gamma_i\left(\frac{N-2}{N}\right)^{N-i}\left(\frac{N-3}{N}\right)^{i-2} + \frac{1}{N},
\]
where \(\gamma_i = \frac{1}{N}\left(\frac{N-1}{N}\right)^{N-1-i} - \frac{1}{N}\left(\frac{N-1}{N}\right)^{N-i} + \frac{1}{N}\). From [82], we know that γ_i is the probability that S_N^{KSA}[i] equals zero after the KSA.

Let us start with the following lemma.

Lemma 2.25 In PRGA,
\[
\Pr(S_{i-1}[i] = 0) =
\begin{cases}
\gamma_i\left(1-\frac{1}{N}\right)^{i-1} + \sum\limits_{s=1}^{i-3}\frac{1}{N^s}\left(1-\frac{1}{N}\right)^{i-1-s}\sum\limits_{l=2}^{i-1}\gamma_l\binom{i-l-2}{s-1}, & \text{for } i > 3,\\[6pt]
\gamma_i\left(1-\frac{1}{N}\right)^{i-1}, & \text{for } 1 < i \le 3.
\end{cases}
\]

Proof 2.26 For i > 3, we have the following paths:

1. Let S_N^{KSA}[i] = 0. This holds with probability γ_i. Also all j_1, …, j_{i−1} are different from i, which happens with probability (1 − 1/N)^{i−1}.

2. If S_N^{KSA}[0] = 0 or S_N^{KSA}[1] = 0, then S_{i−1}[i] will always be different from zero. Again, if S_N^{KSA}[l] = 0 with 1 < l < i − 1, the zero can move forward through s jumps with 1 ≤ s ≤ i − 3, one jump in each step (the zero cannot move forward through i − 2 jumps). This happens with probability \(\frac{1}{N^s}\left(1-\frac{1}{N}\right)^{i-1-s}\sum_{l=2}^{i-1}\gamma_l\binom{i-l-2}{s-1}\). So the total probability for this path is \(\sum_{s=1}^{i-3}\frac{1}{N^s}\left(1-\frac{1}{N}\right)^{i-1-s}\sum_{l=2}^{i-1}\gamma_l\binom{i-l-2}{s-1}\).

For 1 < i ≤ 3, we have only the first path.

Hence the result. 
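As a quick numerical sanity check, the lemma can be evaluated directly. The sketch below computes γ_l from Mantin's KSA distribution Pr(S_N^{KSA}[l] = 0) as given in [82] (an assumption carried over from the previous section), and uses the convention that a binomial coefficient is zero outside its usual range:

```python
from math import comb

N = 256

def gamma(l):
    # Mantin's estimate of Pr(S^KSA_N[l] = 0) after the KSA (from [82]; assumption here)
    return ((1 - 1/N) ** (N - 1 - l) + 1 - (1 - 1/N) ** (N - l)) / N

def binom(n, k):
    # convention: binomial coefficient is zero outside the usual range
    return comb(n, k) if 0 <= k <= n else 0

def p_S_zero(i):
    # Lemma 2.25: Pr(S_{i-1}[i] = 0) during PRGA
    total = gamma(i) * (1 - 1/N) ** (i - 1)
    if i > 3:
        for s in range(1, i - 2):                 # 1 <= s <= i - 3
            inner = sum(gamma(l) * binom(i - l - 2, s - 1) for l in range(2, i))
            total += (1 / N ** s) * (1 - 1/N) ** (i - 1 - s) * inner
    return total

for i in (2, 3, 16, 64):
    p = p_S_zero(i)
    assert 0 < p < 2 / N     # the probability stays close to the uniform value 1/N
    print(i, round(p, 7))
```

The s-sum converges very fast, since each additional jump costs a factor of 1/N.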

Now we will prove the following bias of Zi towards fi−1 .

Theorem 2.27 In PRGA, for i > 2,
\[
\Pr(Z_i = f_{i-1}) = \tau\rho\delta\eta\psi + \frac{1}{N}\big(1 - \tau\rho\delta\eta\psi - \tau\rho\delta(1-\eta)\psi - \tau\rho(1-\delta)\eta\psi - \tau(1-\rho)\delta\eta\psi\big),
\]
where τ = Pr(S_{i−1}[i] = 0), ρ = Pr(S_N^{KSA}[S_N^{KSA}[i−1]] = f_{i−1}), δ = (1 − 1/N)^{i−2}, η = 1 − i/N and ψ = (1 − 1/N)^{i−1}.

Proof 2.28 Consider the following five events.

1. First event A_1 is S_{i−1}[i] = 0.

2. Second event A_2 is S_N^{KSA}[S_N^{KSA}[i−1]] = f_{i−1}.

3. Event A_3 = (j_1 ≠ i−1) ∩ ··· ∩ (j_{i−2} ≠ i−1).

4. A_4 = (1 ≠ S_N[i−1]) ∩ ··· ∩ (i ≠ S_N[i−1]).

5. A_5 = (j_1 ≠ S_N[i−1]) ∩ ··· ∩ (j_{i−1} ≠ S_N[i−1]).

i Pr(Zi = fi−1 )
[78] 0.004413 0.004400 0.004384 0.004368 0.004350 0.004331 0.004312 0.004292
3-10 Exp. 0.004400 0.004386 0.004376 0.004356 0.004339 0.004321 0.004301 0.004281
Thm. 2.27 0.004400 0.004387 0.004372 0.004356 0.004339 0.004320 0.004301 0.004281
[78] 0.004271 0.00425 0.004229 0.004209 0.004188 0.004168 0.004148 0.004129
11-18 Exp. 0.004261 0.004241 0.004220 0.004200 0.004179 0.004162 0.004139 0.004120
Thm. 2.27 0.004261 0.004240 0.004220 0.004199 0.004179 0.004159 0.004139 0.004120
[78] 0.004111 0.004093 0.004076 0.004061 0.004046 0.004032 0.004019 0.004007
19-26 Exp. 0.004102 0.004085 0.004068 0.004052 0.004038 0.004024 0.004011 0.003999
Thm. 2.27 0.004102 0.004085 0.004068 0.004053 0.004038 0.004024 0.004011 0.004000
[78] 0.003996 0.003986 0.003976 0.003968 0.003960 0.003954 0.003948 0.003942
27-34 Exp. 0.003988 0.003978 0.003969 0.003961 0.003954 0.003950 0.003941 0.003937
Thm. 2.27 0.003989 0.003979 0.003970 0.003962 0.003954 0.003948 0.003942 0.003937
[78] 0.003937 0.003933 0.003929 0.003926 0.003923 0.003921 0.003919 0.003917
35-42 Exp. 0.003932 0.003928 0.003924 0.003922 0.003919 0.003917 0.003915 0.003913
Thm. 2.27 0.003932 0.003929 0.003925 0.003922 0.00392 0.003917 0.003915 0.003914
[78] 0.003915 0.003914 0.003913 0.003912 0.003911 0.003911 0.003910 0.003910
43-50 Exp. 0.003912 0.003911 0.003910 0.003909 0.003908 0.003907 0.003907 0.003907
Thm. 2.27 0.003912 0.003911 0.003910 0.003910 0.003909 0.003908 0.003908 0.003908

Table 2.3: Comparison of our work with the work of [78] and experimental values for
Zi = fi−1 .

Now one can see that
\[
\Pr(Z_i = f_{i-1} \mid A_1 \cap A_2 \cap A_3 \cap A_4 \cap A_5)=1,\qquad \Pr(Z_i = f_{i-1} \mid A_1 \cap A_2 \cap A_3 \cap A_4^c \cap A_5)=0,
\]
\[
\Pr(Z_i = f_{i-1} \mid A_1 \cap A_2 \cap A_3^c \cap A_4 \cap A_5)=0,\qquad \Pr(Z_i = f_{i-1} \mid A_1 \cap A_2^c \cap A_3 \cap A_4 \cap A_5)=0.
\]
Also Pr(A_1) = Pr(S_{i−1}[i] = 0), Pr(A_2) = Pr(S_N^{KSA}[S_N^{KSA}[i−1]] = f_{i−1}), Pr(A_3) = (1 − 1/N)^{i−2}, Pr(A_4) = 1 − i/N and Pr(A_5) = (1 − 1/N)^{i−1}. Assuming that Z_i = f_{i−1} occurs with probability 1/N in the other cases, we have the required result. 

Now one can find Pr(SNKSA [SNKSA [i − 1]] = fi−1 ) using the following theorem of [38].

Theorem 2.29 After the completion of KSA, the probability Pr(S_N^{KSA}[S_N^{KSA}[i]] = f_i) is
\[
\left[\left(1-\frac{1}{N}\right)^{N-1-i} + \beta\right]\Pr(S_{i+1}^{KSA}[i] = f_i) + \left[\alpha + \frac{1-\alpha-\beta}{N}\right]\Pr(S_{i+1}^{KSA}[i] \neq f_i),
\]
where
1. \(\alpha = \frac{2}{N}\left[1-\prod\limits_{r=1}^{N-i-1}\left(1-\frac{i}{N}\right)^{r}\right]\left(1-\frac{i}{N}\right)\left(1-\frac{1}{N}\right)^{i-1}\cdot\frac{1}{N}\sum\limits_{s=1}^{i}\left(1-\frac{1}{N}\right)^{i-s}\)
2. \(\beta = \frac{N-i-1}{N}\left(1-\frac{1}{N}\right)^{i+1}\left(1-\frac{2}{N}\right)^{N-i-2}\)

In Table 2.3 we present our comparative study of the correlation probabilities. We present the theoretical values of Pr(Z_i = f_{i−1}) for 3 ≤ i ≤ 64 according to Theorem 2.27 and also according to the formula of [78]. The experimental values are averaged over 100 billion key schedulings, where the keys are of length 16 and randomly generated. From Table 2.3, it is clear that our estimation gives a much better approximation than [78].

2.4 Conclusion

In this chapter, we have justified the negative bias between Z_i and i − K[0], which was observed experimentally by Paterson et al. Next we considered a generalization of the Roos bias. We have also presented the complete correlation between Z_i and i − f_y. Our formulas for the probabilities of Z_i = i − f_i and Z_i = f_{i−1} give better approximations than the existing works.

CHAPTER 3

Settling the mystery of Zr = r in RC4

Here, using a probability transition matrix, we first revisit the work of Mantin on finding the probability distribution of the RC4 permutation after the completion of KSA. After that, we extend the same idea to analyse the probabilities during any iteration of the Pseudo Random Generation Algorithm. Next, we study the bias Z_r = r (where Z_r is the r-th output keystream byte), which is one of the significant biases observed in the RC4 output keystream. This bias has played an important role in the plaintext recovery attack proposed by Isobe et al. in FSE 2013. However, an accurate theoretical explanation of the bias Z_r = r has remained a mystery. Though several attempts have been made to prove this bias, none of them provides an accurate justification. Here, using the results found with the help of the probability transition matrix, we justify this bias of Z_r = r accurately and settle this issue. The bias obtained from our proof matches perfectly with the experimental observations.

3.1 Introduction

RC4 has been one of the most famous ciphers for research in the last twenty years. Since 1994, when it was made public, it has gone through rigorous cryptanalysis from cryptologists around the world [2, 15, 78, 67, 107, 106, 105]. Several weaknesses of this cipher have been found, and some of them still do not have a proper theoretical justification. Due to so many weaknesses, RC4 has recently been dropped by Google, but it is still an active area of research. The importance of research on this cipher can be observed in the recently published works [113, 67, 93, ?, 98]. In 2017, two works [27, 97] on RC4 are going to appear in Designs, Codes and Cryptography.

RC4 has been the most used stream cipher of the last two decades, deployed in different areas by different companies. It was designed by Ronald Rivest in 1987, but was made public only in 1994. First adopted by TLS, RC4 was used in various applications later. In 1997, it was used in WEP. After that, it was used by Microsoft Lotus, Oracle Secure, and WPA.

Due to its huge application and very simple structure, RC4 became a source of attention in the last two decades. Many attacks have been proposed against it; here we mention only a few. The attacks have several directions, for example distinguishing attacks [45, 89, 79] and state recovery attacks [71, 84]. They are mostly based on correlations found between the keystream and the key, or between the keystream and some constant values. In FSE 2001, Mantin and Shamir presented a broadcast attack using a bias of Z_2 [?]. Another influential attack was provided by Fluhrer et al. [44], which was based on biases in the Key Scheduling Algorithm. Some more interesting results and attacks are provided in [113, 54, 93, ?, 95, 104]. The biases obtained in RC4 keystreams resulted in attacks on the WEP protocol [44, 71]. This led to the introduction of a new protocol, WPA, which was designed to block the attacks against WEP. Though both of them used RC4, WPA had better key mixing features. But WPA also faced attacks after a period. Based on the attacks proposed against RC4, in CRYPTO 2014, Rivest and Schuldt proposed a variant of RC4 named Spritz [99]. It was designed mostly to defend against the attacks on RC4. The proposal of ciphers like Spritz, even so many years after the proposal of RC4, shows the usefulness of RC4-like design models. However, in FSE 2015, Banik et al. [?] attacked Spritz based on a short-term bias and a long-term bias of the keystream.

Among all the biases used in attacks against RC4, most have been theoretically explained. However, the biases of Z_r = 0 and Z_r = r did not have a proper justification for a long period, though both have made significant contributions to attacks against RC4. In FSE 2013, Isobe et al. [?] provided a full plaintext recovery attack where they used the bias of Z_r = r. Also, the bias of Z_r = 0 has been used by Maitra et al. [79] in attacks on broadcast RC4.

After extensive analysis, in Journal of Cryptology (2014), an explanation of Z_r = 0 was given by Sen Gupta et al. [55], which matched the experimental result very closely. But the bias of Z_r = r is still not properly explained.

We describe the structure of the RC4 cipher here in short. It has two phases, namely the Key Scheduling Algorithm (KSA) and the Pseudo Random Generation Algorithm (PRGA). In KSA, the secret key is given as input. The algorithm starts with the identity permutation of 0 to 255. A scrambling is performed over this permutation using the key, and finally another permutation of 0 to 255 is achieved. In this phase, no output keystream is generated. After this, the scrambled permutation of KSA goes to the PRGA phase. Here, the output keystream bytes Z_1, Z_2, … are produced using the scrambled permutation. Table 3.1 briefly describes the KSA and PRGA, where all operations are over Z_N.

Table 3.1: Description of the RC4 Algorithm – KSA and PRGA.

KSA:
  Initialization:
    For i = 0, …, N − 1: S[i] = i;
    j = 0;
  Scrambling:
    For i = 0, …, N − 1:
      j = (j + S[i] + K[i]);
      Swap(S[i], S[j]);

PRGA:
  Initialization:
    i = j = 0;
  Keystream Generation Loop:
    i = i + 1;
    j = j + S[i];
    Swap(S[i], S[j]);
    t = S[i] + S[j];
    Output Z = S[t];

Our contribution: As already mentioned, the reason behind this bias of Z_r = r was not properly known. In [?], Isobe et al. provided a theoretical justification (Theorem 8) for it. The theoretical result is plotted against the experimental result in a graph, but the probability Pr(Z_r = r) achieved by their theory does not match the experimental result properly. As mentioned in that paper:

"Since the theoretical values do not exactly coincide with the experimental values, we do not claim that Theorem 8 completely prove this bias".

After this, in FSE 2014, Sen Gupta et al. [54] gave another theoretical explanation of this bias. Their values provided a better result than [?]. In our work, we further improve this result, which then matches perfectly with experiment.

In 2001, Mantin [82] found an expression for the probability Pr(S[u] = v) after the completion of KSA. We analyse this probability using a matrix form. Though both ideas are essentially the same, our presentation is different: we use the matrix form so that one can visualize the transition probabilities easily. Though the probability Pr(S[u] = v) after the completion of KSA was found by Mantin, the probability Pr(S[u] = v) during the iterations of PRGA was not studied in his work. Here, we also study these probabilities using the same idea.

In Journal of Cryptology 2014 [55], Sen Gupta et al. attempted to find the probability of S_{u−1}[u] = v. Applying our probability transition matrix, we can find the probability Pr(S_r[u] = v) for any u, v at any iteration r of PRGA. After finding the probability during any iteration of PRGA, we use it here to prove the bias of Z_r = r.

3.2 Probability Transition Matrix and its application

3.2.1 Idea of Probability Transition in RC4

For any N, let S be a permutation of the integers from 0 to N − 1. The value at the r-th position of permutation S is denoted by S[r] (starting from the 0-th position S[0]). Now, suppose we choose a particular position i of the permutation. Next, we randomly choose a number j from 0 to N − 1 and interchange S[i] and S[j], i.e., we interchange the values located at the i-th and j-th positions. We call this new permutation S′. Using the transition matrix we find the change of probability for the presence of v at the u-th position from the initial permutation S to the final permutation S′, i.e., from Pr(S[u] = v) to Pr(S′[u] = v) for any u and v after the interchange.

Let p_{u,v} be the probability Pr(S[u] = v), and p′_{u,v} be the probability Pr(S′[u] = v). Let M_S be an N × N matrix, with rows and columns numbered from 0 to N − 1. In this matrix, at the (u, v)-th cell, i.e., the cell located at the u-th row and v-th column, we put the probability Pr(S[u] = v) = p_{u,v}. Similarly, M_{S′} is the respective matrix for the probabilities of the final permutation S′. So, we fill the (u, v)-th cell of M_{S′} with p′_{u,v}. Now, we try to find the relation between the entries of M_S and M_{S′}.

   
\[
M_S = \begin{pmatrix}
p_{0,0} & p_{0,1} & \cdots & p_{0,N-1}\\
p_{1,0} & p_{1,1} & \cdots & p_{1,N-1}\\
\vdots & \vdots & \ddots & \vdots\\
p_{N-1,0} & p_{N-1,1} & \cdots & p_{N-1,N-1}
\end{pmatrix}
\xrightarrow{\text{transition}}
M_{S'} = \begin{pmatrix}
p'_{0,0} & p'_{0,1} & \cdots & p'_{0,N-1}\\
p'_{1,0} & p'_{1,1} & \cdots & p'_{1,N-1}\\
\vdots & \vdots & \ddots & \vdots\\
p'_{N-1,0} & p'_{N-1,1} & \cdots & p'_{N-1,N-1}
\end{pmatrix}.
\]

Lemma 3.1 For any chosen position i which interchanges its value with some j, the probabilities p′_{u,v} are of the form:
\[
p'_{u,v} = \begin{cases}
p_{u,v}\left(1-\frac{1}{N}\right) + \frac{1}{N}\,p_{i,v}, & \text{if } u \neq i,\\[4pt]
\frac{1}{N}, & \text{if } u = i.
\end{cases}
\]

Proof 3.2 Let i be the chosen position. So, we focus on the i-th row of M_S, which contains the probabilities of presence of any v ∈ [0, N − 1] at the i-th position. Now, since j is arbitrary, for any j_0 ∈ [0, N − 1], Pr(j = j_0) = 1/N. Suppose we want to find p′_{j_0,v_0} for some v_0. For this, we consider the following two cases:

Case 1: j_0 ≠ i. After the interchange, v_0 can come to position j_0 in two possible disjoint ways:

1. S[j_0] = v_0 and j ≠ j_0: If in the initial permutation S, v_0 is located at position j_0 and j ≠ j_0, then the swap between positions i and j does not affect j_0. So, v_0 remains at j_0. The probability of this event is
\[
\Pr(S[j_0] = v_0) \cdot \Pr(j \neq j_0) = p_{j_0,v_0}\left(1-\frac{1}{N}\right).
\]

2. S[i] = v_0 and j = j_0: In this case, in the initial permutation S, v_0 was at position i. Since j = j_0, due to the swap, S′[j_0] becomes v_0. The probability of this event is
\[
\Pr(S[i] = v_0) \cdot \Pr(j = j_0) = p_{i,v_0}\cdot\frac{1}{N}.
\]

So, the total probability is p′_{j_0,v_0} = p_{j_0,v_0}(1 − 1/N) + p_{i,v_0}·(1/N).

Case 2: j_0 = i. For any j, if S[j] = v_0, then after the swap, S′[i] becomes v_0. We know that for any j_0 ∈ {0, 1, …, N − 1}, Pr(j = j_0) = 1/N, since j is random. Now, Pr(S[j] = v_0) = p_{j,v_0}. So, the total probability is
\[
p'_{i,v_0} = \frac{1}{N}\sum_{j=0}^{N-1} p_{j,v_0} = \frac{1}{N} \qquad \left(\text{since } \sum_{j=0}^{N-1} p_{j,v_0} = 1\right).
\]

So, the entries p′_{u,v} of the matrix M_{S′} can be expressed by the entries of the matrix M_S as follows:
\[
M_{S'} = \begin{pmatrix}
p_{0,0}\left(1-\frac{1}{N}\right)+\frac{1}{N}p_{i,0} & p_{0,1}\left(1-\frac{1}{N}\right)+\frac{1}{N}p_{i,1} & \cdots & p_{0,N-1}\left(1-\frac{1}{N}\right)+\frac{1}{N}p_{i,N-1}\\
\vdots & \vdots & & \vdots\\
p_{i-1,0}\left(1-\frac{1}{N}\right)+\frac{1}{N}p_{i,0} & p_{i-1,1}\left(1-\frac{1}{N}\right)+\frac{1}{N}p_{i,1} & \cdots & p_{i-1,N-1}\left(1-\frac{1}{N}\right)+\frac{1}{N}p_{i,N-1}\\
\frac{1}{N} & \frac{1}{N} & \cdots & \frac{1}{N}\\
p_{i+1,0}\left(1-\frac{1}{N}\right)+\frac{1}{N}p_{i,0} & p_{i+1,1}\left(1-\frac{1}{N}\right)+\frac{1}{N}p_{i,1} & \cdots & p_{i+1,N-1}\left(1-\frac{1}{N}\right)+\frac{1}{N}p_{i,N-1}\\
\vdots & \vdots & & \vdots\\
p_{N-1,0}\left(1-\frac{1}{N}\right)+\frac{1}{N}p_{i,0} & p_{N-1,1}\left(1-\frac{1}{N}\right)+\frac{1}{N}p_{i,1} & \cdots & p_{N-1,N-1}\left(1-\frac{1}{N}\right)+\frac{1}{N}p_{i,N-1}
\end{pmatrix}.
\]

3.2.2 Explanation of the probabilities after the KSA phase and during the PRGA of RC4

Using the idea of the probability transition matrix, we can obtain the probability of S[u] = v for any u, v ∈ {0, 1, 2, …, N − 1} during any iteration of KSA in RC4 and also after any iteration of PRGA. For this, we start with a general matrix M_0 with the initial probabilities p_{i,j} and check how the entries of the matrix change with each iteration. For convenience, we study only a single column of the matrix. During the transition, each column changes independently, i.e., the transition of an entry is not affected by any entry of another column. So, we can study the change for a single column, and the other columns change in a similar manner. So, suppose C_0 is a particular column of the initial matrix.

   
\[
C_0 = \begin{pmatrix} p_0^{(0)}\\ p_0^{(1)}\\ p_0^{(2)}\\ \vdots\\ p_0^{(N-1)} \end{pmatrix} = \begin{pmatrix} p^{(0)}\\ p^{(1)}\\ p^{(2)}\\ \vdots\\ p^{(N-1)} \end{pmatrix}.
\]

The entries of the 0-th iteration, i.e., the p_0^{(u)}'s, are also denoted here by p^{(u)}'s. Afterwards, whenever we write p^{(u)}, we mean p_0^{(u)}.

Now, let C_{(i)} be the respective column after i iterations. Then, the entries of C_{(i)} can be given as follows:

Theorem 3.3 Let p_i^{(u)} be the u-th entry of C_{(i)}, where u ∈ [0, N − 1]. Then
\[
p_i^{(u)} = \begin{cases}
p^{(u)}\left(1-\frac{1}{N}\right)^i + \frac{1}{N}\sum\limits_{r=0}^{i-1} p^{(r)}\left(1-\frac{1}{N}\right)^r, & \text{if } u \ge i,\\[6pt]
\frac{1}{N}, & \text{if } u = i-1,\\[6pt]
\frac{1}{N}\left(1-\frac{1}{N}\right)^{i-u-1} + \frac{1}{N}\sum\limits_{r=u+1}^{i-1} p^{(r)}\left(1-\frac{1}{N}\right)^r + \sum\limits_{r=0}^{u}\frac{p^{(r)}}{N^2}\left(1-\frac{1}{N}\right)^r\sum\limits_{j=0}^{i-u-1}\left(1-\frac{1}{N}\right)^j, & \text{if } u < i-1.
\end{cases}
\]
Proof 3.4 We prove it by induction on i.

For u ≥ i: When i = 0, the given expression for p_i^{(u)} becomes p^{(u)}. So, for i = 0, it is true. Now, suppose for some i = k we have p_k^{(u)} = p^{(u)}(1 − 1/N)^k + (1/N)∑_{r=0}^{k−1} p^{(r)}(1 − 1/N)^r for all u ≥ k. We show that this is also true for the next iteration i = k + 1.

Now from Lemma 3.1, p_{k+1}^{(u)} = p_k^{(u)}(1 − 1/N) + (1/N)p_k^{(k)}. Here, p_k^{(u)} = p^{(u)}(1 − 1/N)^k + (1/N)∑_{r=0}^{k−1} p^{(r)}(1 − 1/N)^r and p_k^{(k)} = p^{(k)}(1 − 1/N)^k + (1/N)∑_{r=0}^{k−1} p^{(r)}(1 − 1/N)^r. For convenience of the reader and to shorten the calculations, we introduce variables x and y, where x denotes the term 1 − 1/N and y denotes 1/N. So, x + y = 1.

Therefore,
\[
\begin{aligned}
p_{k+1}^{(u)} &= x\Big(p^{(u)}x^k + y\sum_{r=0}^{k-1} p^{(r)}x^r\Big) + y\Big(p^{(k)}x^k + y\sum_{r=0}^{k-1} p^{(r)}x^r\Big)\\
&= p^{(u)}x^{k+1} + xy\sum_{r=0}^{k-1} p^{(r)}x^r + y\,p^{(k)}x^k + y^2\sum_{r=0}^{k-1} p^{(r)}x^r\\
&= p^{(u)}x^{k+1} + (xy+y^2)\sum_{r=0}^{k-1} p^{(r)}x^r + y\,p^{(k)}x^k\\
&= p^{(u)}x^{k+1} + y\sum_{r=0}^{k-1} p^{(r)}x^r + y\,p^{(k)}x^k\\
&= p^{(u)}x^{k+1} + y\sum_{r=0}^{k} p^{(r)}x^r\\
&= p^{(u)}\left(1-\frac{1}{N}\right)^{k+1} + \frac{1}{N}\sum_{r=0}^{k} p^{(r)}\left(1-\frac{1}{N}\right)^r.
\end{aligned}
\]
So, the result is true for i = k + 1.

For u = (i − 1): It comes directly from Lemma 3.1.

For u < i − 1: When i = u + 1, p_i^{(u)} = p_i^{(i-1)} = 1/N. So the result is true for u = i − 1. Next, when i = u + 2, we know from Lemma 3.1,
\[
p_{u+2}^{(u)} = p_{u+1}^{(u)}\left(1-\frac{1}{N}\right) + \frac{1}{N}\,p_{u+1}^{(u+1)}
= \frac{1}{N}\left(1-\frac{1}{N}\right) + \frac{1}{N}\left[p^{(u+1)}\left(1-\frac{1}{N}\right)^{u+1} + \frac{1}{N}\sum_{r=0}^{u} p^{(r)}\left(1-\frac{1}{N}\right)^r\right].
\]
So, it is satisfied for i = u + 2. Now, suppose for some i = k it is true. This means
\[
p_k^{(u)} = \frac{1}{N}\left(1-\frac{1}{N}\right)^{k-u-1} + \frac{1}{N}\sum_{r=u+1}^{k-1} p^{(r)}\left(1-\frac{1}{N}\right)^r + \sum_{r=0}^{u}\frac{p^{(r)}}{N^2}\left(1-\frac{1}{N}\right)^r\sum_{j=0}^{k-u-1}\left(1-\frac{1}{N}\right)^j.
\]

So, for i = k + 1,
\[
p_{k+1}^{(u)} = p_k^{(u)}x + y\,p_k^{(k)},
\]
where
\[
\begin{aligned}
p_k^{(u)}x &= x\left[yx^{k-u-1} + y\sum_{r=u+1}^{k-1} p^{(r)}x^r + \sum_{r=0}^{u} p^{(r)}y^2x^r\sum_{j=0}^{k-u-1}x^j\right]\\
&= yx^{k-u} + xy\sum_{r=u+1}^{k-1} p^{(r)}x^r + x\sum_{r=0}^{u} p^{(r)}y^2x^r\sum_{j=0}^{k-u-1}x^j
\end{aligned}
\]
and
\[
\begin{aligned}
y\,p_k^{(k)} &= y\left[p^{(k)}x^k + y\sum_{r=0}^{k-1} p^{(r)}x^r\right]\\
&= y\left[p^{(k)}x^k + y\sum_{r=0}^{u} p^{(r)}x^r + y\sum_{r=u+1}^{k-1} p^{(r)}x^r\right]\\
&= y\,p^{(k)}x^k + y^2\sum_{r=0}^{u} p^{(r)}x^r + y^2\sum_{r=u+1}^{k-1} p^{(r)}x^r.
\end{aligned}
\]

42
Adding these two, we have:
\[
\begin{aligned}
p_{k+1}^{(u)} &= yx^{k-u} + xy\sum_{r=u+1}^{k-1} p^{(r)}x^r + x\sum_{r=0}^{u} p^{(r)}y^2x^r\sum_{j=0}^{k-u-1}x^j + y\,p^{(k)}x^k + y^2\sum_{r=0}^{u} p^{(r)}x^r + y^2\sum_{r=u+1}^{k-1} p^{(r)}x^r\\
&= yx^{k-u} + (xy+y^2)\sum_{r=u+1}^{k-1} p^{(r)}x^r + y\,p^{(k)}x^k + \sum_{r=0}^{u} p^{(r)}y^2x^r\sum_{j=1}^{k-u}x^j + y^2\sum_{r=0}^{u} p^{(r)}x^r \qquad [\text{rearranging the terms}]\\
&= yx^{k-u} + y(x+y)\sum_{r=u+1}^{k-1} p^{(r)}x^r + y\,p^{(k)}x^k + \sum_{r=0}^{u} p^{(r)}y^2x^r\sum_{j=1}^{k-u}x^j + \sum_{r=0}^{u} p^{(r)}y^2x^r\\
&= yx^{k-u} + y\sum_{r=u+1}^{k} p^{(r)}x^r + \sum_{r=0}^{u} p^{(r)}y^2x^r\sum_{j=0}^{k-u}x^j \qquad [\text{since } x+y=1]\\
&= \frac{1}{N}\left(1-\frac{1}{N}\right)^{(k+1)-u-1} + \frac{1}{N}\sum_{r=u+1}^{k} p^{(r)}\left(1-\frac{1}{N}\right)^r + \sum_{r=0}^{u}\frac{p^{(r)}}{N^2}\left(1-\frac{1}{N}\right)^r\sum_{j=0}^{(k+1)-u-1}\left(1-\frac{1}{N}\right)^j.
\end{aligned}
\]

Pr(S[u] = v) after KSA: In the key scheduling algorithm, j is updated as j = j + S[i] + K[i]. Since a key byte is involved in the sum and key bytes are random, j can be treated as random, without caring about the other variables involved in the sum. This is because for any j_0 ∈ [0, N − 1], Pr(j = j_0) = Pr(j + S[i] + K[i] = j_0) = Pr(K[i] = j_0 − j − S[i]) = 1/N, since K[i] is random. Now, in KSA, i starts from 0 and at each iteration increases by 1. Here we find the probability transition matrix for the permutation S after each round of KSA. The permutation obtained after the r-th iteration is denoted by S_r. We denote the probability matrix corresponding to the initial permutation S_0 as M(S_0) and the matrix corresponding to any S_r as M(S_r). Also, the entries of the matrix M(S_r) are denoted as p_{u,v}^{(r)}. After each iteration, the probability transition matrix is updated by the probability transition formula given in Lemma 3.1. We denote this transition operation as TR. So, TR(M(S_r)) = M(S_{r+1}).

Since KSA starts with the identity permutation, we can express the probability Pr(S[u] = v) for any u, v initially as follows:

1. Pr(S[u] = v) = 1 if u = v,
2. Pr(S[u] = v) = 0 if u ≠ v.

So, the matrix M(S_0) is simply the identity matrix.

Initial Matrix:
\[
M_{S_0} = \begin{pmatrix}
1 & 0 & 0 & \cdots & 0\\
0 & 1 & 0 & \cdots & 0\\
0 & 0 & 1 & \cdots & 0\\
\vdots & \vdots & \vdots & \ddots & \vdots\\
0 & 0 & 0 & \cdots & 1
\end{pmatrix}.
\]

Now, after each iteration, we update the matrix by the transition operation. After the first transition, TR(M_{S_0}) = M_{S_1}.

In the next iteration, i = 1, and then by the same transition formula (Lemma 3.1) applied to M_{S_1}, we obtain the matrix M_{S_2}. Thus, by consecutive application of the transition for each iteration, at the end we achieve the final transition matrix M_{S_N}:
\[
M_{S_0} \xrightarrow{TR} M_{S_1} \xrightarrow{TR} M_{S_2} \cdots \xrightarrow{TR} M_{S_N}.
\]
Therefore, the entries of the matrix obtained after any number of iterations can be directly found by Theorem 3.3. Here, in particular, we find the entries after the final iteration and show that they match Mantin's result [82].

One important point to note is that, in every transition update, each entry is affected by the entries of the same column only; the entries of other columns have no influence on it. So, to find any entry p_{u,v}^{(r)} of the final matrix M_{S_N}, we can concentrate on the respective column only, i.e., the v-th column. Let us denote the v-th column of any transition matrix M_{S_r} as C_v(M_{S_r}). Now, in the initial matrix M_{S_0}, the v-th column C_v(M_{S_0}) is as follows:

\[
C_v(M_{S_0}) = \begin{pmatrix} p_{0,v}^{(0)}\\ p_{1,v}^{(0)}\\ \vdots\\ p_{v-1,v}^{(0)}\\ p_{v,v}^{(0)}\\ p_{v+1,v}^{(0)}\\ \vdots\\ p_{N-1,v}^{(0)} \end{pmatrix} = \begin{pmatrix} 0\\ 0\\ \vdots\\ 0\\ 1\\ 0\\ \vdots\\ 0 \end{pmatrix},
\]
with the single 1 at the v-th row (since S_0[v] = v).

Now, after N iterations, the probability Pr(S[u] = v) can be directly found by Theorem 3.3. So, we use the formula (for u < i − 1):
\[
p_i^{(u)} = \frac{1}{N}\left(1-\frac{1}{N}\right)^{i-u-1} + \frac{1}{N}\sum_{r=u+1}^{i-1} p^{(r)}\left(1-\frac{1}{N}\right)^r + \sum_{r=0}^{u}\frac{p^{(r)}}{N^2}\left(1-\frac{1}{N}\right)^r\sum_{j=0}^{i-u-1}\left(1-\frac{1}{N}\right)^j.
\]

Here, i = N and p^{(v)} = 1. So, if v > u, the third term in the sum becomes 0 (since all p^{(r)} for r = 0, 1, …, u are 0). So,
\[
\begin{aligned}
\Pr(S[u]=v) &= \frac{1}{N}\left(1-\frac{1}{N}\right)^{N-u-1} + \frac{1}{N}\sum_{r=u+1}^{N-1} p^{(r)}\left(1-\frac{1}{N}\right)^r\\
&= \frac{1}{N}\left(1-\frac{1}{N}\right)^{N-u-1} + \frac{1}{N}\,p^{(v)}\left(1-\frac{1}{N}\right)^v\\
&= \frac{1}{N}\left[\left(1-\frac{1}{N}\right)^{N-u-1} + \left(1-\frac{1}{N}\right)^v\right].
\end{aligned}
\]

For v ≤ u, the second term in the sum vanishes, since for all r > v, p^{(r)} = 0. So,
\[
\begin{aligned}
\Pr(S[u]=v) &= \frac{1}{N}\left(1-\frac{1}{N}\right)^{N-u-1} + \sum_{r=0}^{u}\frac{p^{(r)}}{N^2}\left(1-\frac{1}{N}\right)^r\sum_{j=0}^{N-u-1}\left(1-\frac{1}{N}\right)^j\\
&= \frac{1}{N}\left(1-\frac{1}{N}\right)^{N-u-1} + \frac{p^{(v)}}{N^2}\left(1-\frac{1}{N}\right)^v\sum_{j=0}^{N-u-1}\left(1-\frac{1}{N}\right)^j\\
&= \frac{1}{N}\left(1-\frac{1}{N}\right)^{N-u-1} + \frac{1}{N^2}\left(1-\frac{1}{N}\right)^v\sum_{j=0}^{N-u-1}\left(1-\frac{1}{N}\right)^j\\
&= \frac{1}{N}\left(1-\frac{1}{N}\right)^{N-u-1} + \frac{1}{N}\left(1-\frac{1}{N}\right)^v\left(1-\left(1-\frac{1}{N}\right)^{N-u}\right)\\
&= \frac{1}{N}\left[\left(1-\frac{1}{N}\right)^{N-u-1} + \left(1-\frac{1}{N}\right)^v\left(1-\left(1-\frac{1}{N}\right)^{N-u}\right)\right].
\end{aligned}
\]

So, we have:
\[
\Pr(S[u]=v) = \begin{cases}
\frac{1}{N}\left[\left(1-\frac{1}{N}\right)^{N-u-1} + \left(1-\frac{1}{N}\right)^v\right], & \text{if } v \ge u,\\[6pt]
\frac{1}{N}\left[\left(1-\frac{1}{N}\right)^{N-u-1} + \left(1-\frac{1}{N}\right)^v\left(1-\left(1-\frac{1}{N}\right)^{N-u}\right)\right], & \text{if } v < u.
\end{cases}
\]
This matches exactly with the result obtained by Mantin [82]. Here, we show the transition of the column in the diagram.

Probabilities during PRGA: Using the idea of the probability transition matrix, we can find the probability Pr(S_r[u] = v) for any u and v after the r-th round. However, here the procedure is slightly tricky. In PRGA, the iteration starts with i = 1, unlike KSA, and j is updated as j = j + S[i]. So j_1 = S[1], which cannot be taken as uniformly distributed. However, in FSE 2011 [79], Maitra et al. showed that as r increases, the distribution of j_r gets closer to uniform. They showed that j_2 has much more randomness than j_1, and from j_3 onwards an almost uniform distribution is observed. So for the first two iterations we take care of the distribution of j, and from the third iteration onwards we take its distribution to be uniform, i.e., 1/N for each value.

First Iteration: We start with the matrix achieved after the first iteration. The probabilities Pr(S[u] = v) after the first iteration can be found in [55], in the following lemma.

Lemma 3.5 After the first round of RC4 PRGA, the probability Pr(S_1[u] = v) is:
\[
\Pr(S_1[u]=v) = \begin{cases}
\Pr(S_0[1]=1) + \sum\limits_{X\neq 1}\Pr(S_0[1]=X \wedge S_0[X]=1), & u=1,\ v=1;\\[6pt]
\sum\limits_{X\neq 1,v}\Pr(S_0[1]=X \wedge S_0[X]=v), & u=1,\ v\neq 1;\\[6pt]
\Pr(S_0[1]=u) + \sum\limits_{X\neq u}\Pr(S_0[1]=X \wedge S_0[u]=u), & u\neq 1,\ v=u;\\[6pt]
\sum\limits_{X\neq u,v}\Pr(S_0[1]=X \wedge S_0[u]=v), & u\neq 1,\ v\neq u.
\end{cases}
\]

From this, we find the entries of the matrix after the first iteration. The second iteration has i = 2. To deal with an iteration starting from i = 2, we just change the positions of the rows of the matrix: the row corresponding to i = 2 comes first, each row is shifted upwards by 2 rows, and the 0-th and 1-st rows go to the bottom. So, in this new matrix the iteration starts from the first row.

Figure 3.1: Probability Pr(S[u] = v) for 1 ≤ u ≤ 255, 0 ≤ v ≤ 255 in PRGA. Here (a)
Round i = 0 (b) Round i = 1 (c) Round i = 256 (d) Round i = 512 .

Second iteration: In [79], the probability distribution of j_2 is given as follows:
\[
\Pr(j_2 = v) = \begin{cases}
\Pr(S_0[1]=2) + \sum\limits_{\substack{w=0\\ w\neq 2}}^{N-1}\Pr(S_0[1]=w)\Pr(S_0[2]=v-w), & \text{if } v = 4,\\[6pt]
\sum\limits_{\substack{w=0\\ w\neq 2}}^{N-1}\Pr(S_0[1]=w)\Pr(S_0[2]=v-w), & \text{if } v \neq 4.
\end{cases}
\]

So, instead of using the values 1/N and (1 − 1/N), we use the expressions given in the above equations to update the matrix. From the third iteration, since j_3 behaves almost uniformly at random, we can apply the formulas achieved in Theorem 3.3 to find the probabilities after any round. Thus, using the idea of the probability transition matrix, we find the probability of S[u] = v after any iteration of KSA and PRGA. Probability distributions of a few j values are given in Figure ??.

We provide the heat maps in Figure 3.1 for the probabilities for PRGA for round
i = 0, 1, 256 and 512.

Recently in 2017, Paul et al. [98] did a detailed study of the probabilities at every
iteration of KSA and PRGA. In [98], for the analysis of the PRGA distribution, the authors
take j to be uniformly random. But this is not the case in reality, as the authors
themselves mention. The value of j in the first iteration is a function of the KSA
permutation and cannot be taken as random. The value of j2 is also not random.

However, in the subsequent iterations, the distribution of j becomes very close to random. In
the conclusion of [98], the authors clearly mention that their rigorous analysis of the PRGA
distribution is based on the assumption that j is random, and they raise an open problem
to find the actual distribution of the PRGA. In our matrix approach, we are able to deal
with this very easily. So, this approach improves the result on the PRGA distribution
from [98].

3.3 Theoretical Explanation of Zr = r

Here we prove the bias of Zr = r for r ≥ 3. In the following lemmas we describe some
events. In a few of them Zr = r is the only possible output; in some paths Zr can never
be equal to r. After discussing these paths, we find their respective probabilities of oc-
currence. Finally, in Theorem 3.14, we find the probability of Zr = r. For convenience,
we denote by KSA(u, v) the probability of SKSA [u] = v after the completion of KSA.
Notations:

• Sr [u] : value at u-th position after r-th round of PRGA.

• KSA(u, v) : Probability of occurrence of v at u-th position after KSA.

• jr : j at r-th iteration.

• SKSA [u] : value at u-th position after KSA.

Lemma 3.6 During PRGA,

Pr(Zr = r | Sr−2 [r − 1] = r ∩ Sr−2 [r] = 0 ∩ jr−1 ≠ r) = 1,
Pr(Zr = r | Sr−1 [r] ≠ 0 ∩ Sr−1 [ jr ] = r) = 0.

Proof 3.7 Here we have Sr−2 [r − 1] = r, Sr−2 [r] = 0 and jr−1 ≠ r. Since jr−1 ≠ r and
Sr−2 [r] = 0, we have jr = jr−1 . Thus when i = r, after the swap, we have Sr [r] = r and
Sr [ jr ] = 0. Thus

Zr = Sr [Sr [r] + Sr [ jr ]] = Sr [r] = r.


Figure 3.2: Path for Zr = r given Sr−2 [r − 1] = r, Sr−2 [r] = 0 and jr−1 6= r.

Please see the path in Figure 3.2. Thus

Pr(Zr = r | Sr−2 [r − 1] = r ∩ Sr−2 [r] = 0 ∩ jr−1 ≠ r) = 1.

Also

Pr(Sr−2 [r − 1] = r ∩ Sr−2 [r] = 0 ∩ jr−1 ≠ r) = Pr(Sr−2 [r − 1] = r) Pr(Sr−2 [r] = 0) (1 − 1/N),

where Pr(Sr−2 [r − 1] = r), Pr(Sr−2 [r] = 0) can be calculated using the idea of Sec-
tion 3.2.

Similarly

Pr(Zr = r | Sr−1 [r] ≠ 0 ∩ Sr−1 [ jr ] = r) = 0

and

Pr(Sr−1 [r] ≠ 0 ∩ Sr−1 [ jr ] = r) = (1 − Pr(Sr−1 [r] = 0)) · (1/N),

assuming jr is random.

Lemma 3.8 Consider the events:

1. E1 : SKSA [1] = r ≥ 3

2. E2 : j2 ∉ [3, r]

3. E3 : jl ≠ j2 , l ∈ [3, r − 1]

4. E4 : jl ≠ r, l ∈ [3, r − 1]

5. E5 : jr = j2

6. E6 : SKSA [2] ≠ jr − r

Then Pr(Zr = r | ∩_{i=1}^{5} Ei ) = 1, Pr(Zr = r | E1 ∩ E2 ∩ E3c ∩ E4 ∩ E5 ) = 0, and

Pr(Zr = r | E1 ∩ (E2 ∩ E3 )c ∩ E4 ∩ E6 ) =
  [KSA( jr , jr − r)/(1 − KSA( jr , r))] (1 − 1/N)^{r−3},  if jr > r and jr ≠ 2r;
  [1/(N − 1)] (1 − 1/N)^{r − jr − 1},                      if jr < r;
  0,                                                       if jr = r or 2r.

Here, for any event E, by E c we mean the complement of E, i.e., the event that E does
not occur.

The probabilities are as follows:

1. Pr(E1 ) = KSA(1, r)

2. Pr(E2 ) = (N − r − 2)/N

3. Pr(E3 ) = Pr(E4 ) = (1 − 1/N)^{r−3}

4. Pr(E5 ) = 1/N

5. Pr(E6 ) = 1 − KSA(2, jr − r)

Proof 3.9 Due to the event E1 , j1 = r. After the swap, S1 [r] = r. Now, j2 = j1 + S1 [2] =
r + SKSA [2] (since r > 2, the first swap cannot involve the position SKSA [2]). Let us
denote SKSA [2] by w. So, j2 = r + w, and after the next swap, S2 [r] = r and S2 [r + w] = w.
Then, due to event E3 , the positions r and r + w are not affected up to the (r − 1)-th iteration.
Next, at the r-th iteration, jr = j2 = r + w due to event E5 . So, after the swap, Sr [r] = w and
Sr [r + w] = r. So, Zr = Sr [Sr [r] + Sr [r + w]] = Sr [r + w] = r.

Now, the probabilities of the events are Pr(E1 ) = KSA(1, r), Pr(E2 ) = (N − r − 2)/N,
Pr(E3 ) = Pr(E4 ) = (1 − 1/N)^{r−3} , Pr(E5 ) = 1/N.

Assuming the Ei ’s are independent,

Pr(∩_{i=1}^{5} Ei ) ≈ KSA(1, r) · [(N − r − 2)/N] · (1 − 1/N)^{2(r−3)} · (1/N).

Now, on the other side, if E3c occurs, some jl is equal to j2 for l ∈ [3, r − 1]. As a
result, the value at position j2 changes. Once it changes, there is no chance of getting
back that value up to the (r − 1)-th iteration, because i moves towards the right at each
iteration and cannot reach the position where the value has been swapped. As a result,
the output Zr cannot be r.

The probability is Pr(E3c ) = 1 − (1 − 1/N)^{r−3} .

Now if E1 and E4 hold, Sr−1 [r] = r. Further, if Sr−1 [ jr ] = jr − r, then Zr = r. Now we have
two cases:

Case 1: jr > r. The only possibility is that after KSA, position jr is occupied by
jr − r, and j3 , j4 , . . . , jr−1 do not touch this position. In this case, the probability is

[KSA( jr , jr − r)/(1 − KSA( jr , r))] (1 − 1/N)^{r−3} ,

as by the condition E1 , SKSA [1] = r.

In any other case, this cannot occur. Suppose, at the end of KSA, jr is not
occupied by jr − r. Then, in order to bring jr − r to the jr -th position, at some iteration
between 1 and r, jr − r has to come to the jr -th position by a swap. This is possible only if
at some iteration either i or j becomes equal to jr . Since jr > r, i cannot be equal to
jr in the first r iterations. Suppose, at some iteration m < r, jm becomes equal to jr . This
means, when i = m, the m-th position contains jr − r and after the swap between m
and jm , it comes to position jm . But, according to the update rule, jm = jm−1 + S[m] =
jm−1 + jr − r. Since jm = jr , we have jm−1 = r, which is not possible by assumption E4 .
So, this event is not possible.

Case 2: jr < r. In this situation, when i = jr , due to the swap, S jr [ jr ] = jr − r. This
happens with probability 1/(N − 1), as S jr [r] = r and jr ≠ 2r. Also, the remaining jl
cannot be jr for l = jr + 1, . . . , r − 1. Thus the total probability is

[1/(N − 1)] (1 − 1/N)^{r − jr − 1} .

Lemma 3.10 Consider the events:

1. E7 : SKSA [r] = r ≥ 3

2. E8 : jl ≠ r, l ∈ [2, r − 1]

Then

Pr(Zr = r | E7 ∩ E8 ) =
  [KSA( jr , jr − r)/(1 − KSA( jr , r))] (1 − 1/N)^{r−1},  if jr > r and jr ≠ 2r;
  [1/(N − 1)] (1 − 1/N)^{r − jr − 1},                      if jr < r;
  0,                                                       if jr = r or 2r.

Proof 3.11 The proof is similar to the second part of the proof of Lemma 3.8. Also
Pr(E7 ) = KSA(r, r) and Pr(E8 ) = (1 − 1/N)^{r−2} .

Lemma 3.12 Consider the events:

1. E9x : SKSA [x] = r ≥ 3, for x ∈ [2, r − 2]

2. E10x : j1 , j2 , . . . , jx−1 ≠ x

3. E11x : jx = r

4. E12x : jx+1 ∉ [x + 2, r]

5. E13x : jl ≠ r, l ∈ [x + 2, r − 1]

6. E14x : jl ≠ jx+1 , l ∈ [x + 2, r − 1]

7. E15x : jr = jx+1

Then Pr(Zr = r | ∩_{i=9}^{15} Eix ) = 1 and
Pr(Zr = r | E9x ∩ E10x ∩ E11x ∩ E12x ∩ E13x ∩ (E14x )c ∩ E15x ) = 0.

Proof 3.13 The proof is similar to the first part of the proof of Lemma 3.8. Also
Pr(E9x ) = KSA(x, r), Pr(E10x ) = (1 − 1/N)^{x−1} , Pr(E11x ) = 1/N,
Pr(E12x ) = 1 − (r − x − 1)/N, Pr(E13x ) = (1 − 1/N)^{r−x−2} ,
Pr(E14x ) = (1 − 1/N)^{r−x−2} , Pr(E15x ) = 1/N.

Now we will prove the main result.

Theorem 3.14 In the PRGA phase of RC4, the probability Pr(Zr = r) for 3 ≤ r ≤ 255 is
given by

Pr(Zr = r) = ∏_{i=1}^{5} Pr(Ei )

  + [ ∑_{ jr = r+1, jr ≠ 2r}^{N−1} (KSA( jr , jr − r)/(1 − KSA( jr , r))) (1 − 1/N)^{r−3}
    + ∑_{ jr = 0}^{r−1} (1/(N − 1)) (1 − 1/N)^{r − jr − 1} ] · Pr(E1 ) (1 − Pr(E2 ) Pr(E3 )) Pr(E4 ) Pr(E6 )

  + [ ∑_{ jr = r+1, jr ≠ 2r}^{N−1} (KSA( jr , jr − r)/(1 − KSA( jr , r))) (1 − 1/N)^{r−1}
    + ∑_{ jr = 0}^{r−1} (1/(N − 1)) (1 − 1/N)^{r − jr − 1} ] · Pr(E7 ) Pr(E8 )

  + ∑_{x=2}^{r−2} ∏_{i=9}^{15} Pr(Eix ) + Pr(Sr−2 [r − 1] = r) Pr(Sr−2 [r] = 0) (1 − 1/N)

  + [ 1 − ∏_{i=1, i≠4}^{5} Pr(Ei ) − Pr(E1 ) (1 − Pr(E2 ) Pr(E3 )) Pr(E4 ) Pr(E6 ) − Pr(E7 ) Pr(E8 )
    − ∑_{x=2}^{r−2} ∏_{i=9, i≠14}^{15} Pr(Eix ) − Pr(Sr−2 [r − 1] = r) Pr(Sr−2 [r] = 0) (1 − 1/N)
    − (1 − Pr(Sr−1 [r] = 0)) (1/N) ] · (1/N).

Proof 3.15 The major paths come from Lemma 3.6, Lemma 3.8, Lemma 3.10 and
Lemma 3.12. The first term ∏_{i=1}^{5} Pr(Ei ) comes from Lemma 3.8, where we assume that

Pr(∩_{i=1}^{5} Ei ) = ∏_{i=1}^{5} Pr(Ei )

due to independence.

Similarly, in the other cases also we assume independence and find the probability
of the intersection of events by the product. In the complementary path, we assume that
Zr = r holds with probability 1/N. Hence the proof.

Experimental results: We ran our experiment for 2^41 random 256-bit keys. The graph
obtained in the experiment is shown in Figure 3.3. We compare our theoretical result
with the experimental result as well as with the theories provided by [? ] and [54]. While
the graphs of [? ] and [54] differ significantly from the experimental curve, our
theory matches the curve exactly. Thus, our work provides an accurate justification of
the bias observed for Zr = r.

[Figure 3.3 plots Pr(Zr = r) against r for the experimental data, the random value 1/N,
the theoretical values of Sen Gupta et al. and Isobe et al., and our theoretical values.]

Figure 3.3: Pr(Zr = r) against the index r of RC4 keystream bytes.

3.4 Conclusion

In this chapter, we accurately justified the bias of Zr = r theoretically. In our proof, we used
the probability distribution of the RC4 permutation during PRGA, which we obtained via the
idea of the transition matrix. Proofs of this bias were attempted before in FSE 2013 and
FSE 2015, but the previous theoretical curves did not accurately match the experimental
curve. Our work settles this question with an exact explanation of the bias.

CHAPTER 4

Some results on reduced round Salsa and Chacha

Salsa20 and ChaCha20 are two of the most promising stream ciphers of recent years. The most
significant step in the cryptanalysis of Salsa and ChaCha is the idea of Probabilistic
Neutral Bits, introduced by Aumasson et al. (FSE 2008). Since then, no
significant improvement has been achieved in the procedure of choosing Probabilistic Neutral
Bits; works in this direction have mostly been concerned with forward probabilities.
In this chapter, we give a new algorithm to construct Probabilistic Neutral Bits. We
use this algorithm to improve the existing attacks on reduced rounds of both Salsa and
ChaCha. Our attacks on Salsa and ChaCha are respectively around 2.27 and 5.39 times
faster than the existing works of Choudhuri and Maitra [32].

In 2005, a project called eSTREAM was organised by EU ECRYPT to identify
suitable stream ciphers for adoption in the near future. Various attacks on RC4 had made it
weak and risky to use, which led to the rejection of RC4 by different companies;
this was the purpose behind organising the project. eSTREAM, which was
organised in three phases, was essentially a competition between newly proposed
ciphers. Salsa20 is a stream cipher submitted by D. J. Bernstein [13] to the eSTREAM
project. In 2008, it was selected as a Phase 3 design for software by eSTREAM,
receiving the highest number of votes. The original Salsa has 20 rounds, but later Salsa with
8 rounds (Salsa20/8) and 12 rounds (Salsa20/12) was also proposed.

From the beginning, Salsa has received serious attention from cryptanalysts. So far,
quite a few differential attacks have been proposed against Salsa. The main idea of a dif-
ferential attack is to input some difference at the initial stage and obtain a bias in the out-
put after a few rounds. The first differential attack was proposed by Crowley in 2005 [35].
It could break 5-round Salsa with time complexity 2^165. Then, in Indocrypt 2006,
Fischer et al. [43] reported an attack on the 6-round version of Salsa with time complexity
2^177. This attack was further extended to 7 rounds by Tsunoo et al. [118] with around
2^190 trials. In FSE 2008, Aumasson et al. [5] suggested an improvement in the back-
ward inversion to 4 rounds, which led to an attack on 8-round Salsa with 2^251 trials.
Table 4.1: Existing attack complexities for reduced-round Salsa and ChaCha

Cipher   Round   Attack complexity   Reference
Salsa    8       2^251.0             [5]
                 2^250.0             [109]
                 2^247.2             [80]
                 2^245.5             [77]
                 2^244.9             [32]
                 2^243.7             Our
ChaCha   7       2^248.0             [5]
                 2^246.5             [109]
                 2^238.9             [77]
                 2^237.7             [32]
                 2^235.2             Our

This attack uses the concept of probabilistic neutral key bits (PNBs) for detecting the
differential. The attack was improved by Shi et al. [109] in ICISC 2012, which reduced
the complexity to 2^250. Later, Maitra et al. [80] revisited the PNB concept and gave some
new ideas to reduce the complexity; for 8-round Salsa20 they achieved a complexity
of 2^247.2. After that, Maitra [77] improved the attack complexity to 2^245.5. Recently,
using a multibit approach, Choudhuri et al. [32] improved it to 2^244.9.

ChaCha [14] is a variant of Salsa, published by Bernstein in 2008 to
achieve better performance than Salsa. The 256-bit ChaCha6 and ChaCha7 were
attacked by Aumasson et al. [5]. Later, Maitra [77] improved the attack complexity to
2^238.9 by choosing the IVs properly. Recently, Choudhuri et
al. [32] suggested using multiple-bit output instead of single-bit output, improving
the complexity to 2^237.7, the best among the existing results.

So far, no attack has been reported against the full-round versions of Salsa and ChaCha.
ChaCha is in the process of standardisation. ChaCha20 has been adopted by Google [92] for
use in OpenSSL, replacing RC4. It is also used in the OpenBSD and NetBSD operating
systems.

Notations: In this chapter we use a few notations. Let us present all of them
before proceeding to the next section.

• Xi denotes the word in the i-th cell of the matrix X, as given in the introduction.

• By Xi, j , we denote the j-th bit of Xi , counting from the right (i.e., Xi,0 is the least
significant bit of Xi ). So Xi, j denotes the j-th bit of the i-th cell of X. We also
represent it by ‘position (i, j)’.

• By X ′ we denote the matrix obtained by injecting a difference at some intended
position of X.

• X r denotes the output matrix after the r-th round of X.

• Xir is the i-th word of X r .

• Xi,r j is the j-th bit of Xir .

• By ∆ri, j we represent Xi,r j ⊕ X ′ri, j . In particular, when r = 0, we write ∆0i, j = ∆i, j .

• By |X|, we denote the cardinality of a set X.

4.1 Structure of the ciphers

4.1.1 Structure of Salsa

This cipher considers a 4 × 4 matrix, where each cell is of 32 bits. The 16 cells include
8 key cells, 4 constant cells, 2 IV cells and 2 counter cells. The 256-bit Salsa20 divides
the 256-bit input key into 8 cells of 32 bits each. The 128-bit Salsa20 replicates another copy
of the 128-bit key to make it 256 bits, and then does the same. It also takes a 64-bit counter
and a 64-bit IV as input.

      X0  X1  X2  X3        c0  k0  k1  k2
X  =  X4  X5  X6  X7    =   k3  c1  v0  v1
      X8  X9  X10 X11       t0  t1  c2  k4
      X12 X13 X14 X15       k5  k6  k7  c3

In the above matrix, c0 = 0x61707865, c1 = 0x3320646e, c2 = 0x79622d32 and c3 =
0x6b206574 are the constant cells, ki the key cells, vi the IV cells and ti the counter cells.

Quarterround Function: This is a nonlinear function operating on a 4-tuple (a, b, c, d)
to give an output 4-tuple (a, b, c, d), where each of a, b, c, d is a 32-bit word. The
function is as follows:

b = b ⊕ ((a + d) ≪ 7),

c = c ⊕ ((b + a) ≪ 9),

d = d ⊕ ((c + b) ≪ 13),

a = a ⊕ ((d + c) ≪ 18).

Note that here the ‘+’ sign denotes addition modulo 2^32, ⊕ is the usual XOR operation
and ≪ is left cyclic rotation.
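The four update steps above translate directly into code. The following is a minimal sketch (the helper names are ours); the assertions we checked it against are the quarterround example from Bernstein's Salsa20 specification.

```python
MASK = 0xFFFFFFFF  # all arithmetic is on 32-bit words

def rotl(x, n):
    """Left cyclic rotation of a 32-bit word by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK

def quarterround(a, b, c, d):
    """The Salsa quarterround described above; '+' is addition mod 2**32."""
    b ^= rotl((a + d) & MASK, 7)
    c ^= rotl((b + a) & MASK, 9)
    d ^= rotl((c + b) & MASK, 13)
    a ^= rotl((d + c) & MASK, 18)
    return a, b, c, d
```

Note that each step uses the already-updated words before it, e.g. the update of c uses the new b, which is what makes the function nonlinear across the tuple.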

The quarterround function is applied to each column (from 1st to 4th) of the matrix, one
by one, and each of these operations is called a columnround. Each columnround is fol-
lowed by a corresponding rowround, where this function is applied to the respective rows.
Here, an important point is that, in a columnround, the order of the cells taken as (a, b, c, d)
is not the same for each column: it is respectively (X0 , X4 , X8 , X12 ), (X5 , X9 , X13 , X1 ), (X10 ,
X14 , X2 , X6 ) and (X15 , X3 , X7 , X11 ). Each pair of a columnround and a rowround together is
called a doubleround.

In Salsa20, 20 rounds are performed. By X r , we denote the output matrix after r-th
round. And by R, we denote the total number of rounds. So, the initial matrix is X 0 and
the final output matrix is X R . Since Salsa20 has 20 rounds, here R = 20. Finally, we get
a keystream of 512 bits as Z = X + X R .

Reverse Salsa: Since each state transition function in Salsa is reversible, each round of
Salsa20 is reversible. We call the reverse algorithm ReverseSalsa and each round of the
reverse algorithm a reverseround. So, the application of a reverseround on X r+1 gives X r , and
using the ReverseSalsa algorithm, we can get back X 0 from X 20 . In each round of Revers-
eSalsa, the inverse quarterround functions of Salsa are applied first as a rowround, and
then the rowround is followed by the respective columnround. This application starts from
the fourth row and fourth column, and ends at the first row and first column. The quarterround
function of ReverseSalsa is as follows:

a = a ⊕ ((d + c) ≪ 18),

d = d ⊕ ((c + b) ≪ 13),

c = c ⊕ ((b + a) ≪ 9),

b = b ⊕ ((a + d) ≪ 7).
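Because every step of the quarterround is an XOR with a value computed from words that are still available, it can be undone by applying the same steps in reverse order, which is exactly what a reverseround does. A minimal sketch (function names are ours) that also shows the roundtrip:

```python
MASK = 0xFFFFFFFF

def rotl(x, n):
    """Left cyclic rotation of a 32-bit word by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK

def quarterround(a, b, c, d):
    """Forward Salsa quarterround, as in Section 4.1.1."""
    b ^= rotl((a + d) & MASK, 7)
    c ^= rotl((b + a) & MASK, 9)
    d ^= rotl((c + b) & MASK, 13)
    a ^= rotl((d + c) & MASK, 18)
    return a, b, c, d

def inverse_quarterround(a, b, c, d):
    """Undo the steps in reverse order, as in ReverseSalsa."""
    a ^= rotl((d + c) & MASK, 18)
    d ^= rotl((c + b) & MASK, 13)
    c ^= rotl((b + a) & MASK, 9)
    b ^= rotl((a + d) & MASK, 7)
    return a, b, c, d
```

Applying `inverse_quarterround` to the output of `quarterround` returns the original tuple, which is the per-round reversibility the attack relies on when coming back R − r rounds.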

4.1.2 ChaCha

As a variant of Salsa, ChaCha has a structure almost similar to that of Salsa. Here, in the
initial matrix, the positions of the cells are different:

      X0  X1  X2  X3        c0  c1  c2  c3
X  =  X4  X5  X6  X7    =   k0  k1  k2  k3
      X8  X9  X10 X11       k4  k5  k6  k7
      X12 X13 X14 X15       t0  t1  v0  v1

Here c0 = 0x61707865, c1 = 0x3320646e, c2 = 0x79622d32, c3 = 0x6b206574. Also
ki , vi and ti denote the key cells, IV cells and counter cells respectively.

Round Function: In ChaCha, the nonlinear round function is slightly different from that of
Salsa:

a = a + b, d = ((d ⊕ a) ≪ 16),

c = c + d, b = ((b ⊕ c) ≪ 12),

a = a + b, d = ((d ⊕ a) ≪ 8),

c = c + d, b = ((b ⊕ c) ≪ 7).
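A minimal sketch of this round function (helper names are ours); we checked it against the quarterround test vector given in RFC 7539.

```python
MASK = 0xFFFFFFFF

def rotl(x, n):
    """Left cyclic rotation of a 32-bit word by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK

def chacha_quarterround(a, b, c, d):
    """The ChaCha quarterround listed above: add, XOR, rotate."""
    a = (a + b) & MASK; d = rotl(d ^ a, 16)
    c = (c + d) & MASK; b = rotl(b ^ c, 12)
    a = (a + b) & MASK; d = rotl(d ^ a, 8)
    c = (c + d) & MASK; b = rotl(b ^ c, 7)
    return a, b, c, d
```

Unlike Salsa, where only one word per step is XOR-updated, here every word is both added and rotated twice per quarterround, which is one reason ChaCha diffuses differences faster.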

The way the nonlinear function is applied is not the same in every round. Un-
like the columnrounds and rowrounds in Salsa, ChaCha applies the function along columns
and diagonals. Along the columns, the order is (X0 , X4 , X8 , X12 ), (X1 , X5 , X9 , X13 ), (X2 , X6 ,
X10 , X14 ) and (X3 , X7 , X11 , X15 ). Along the diagonals, the order is (X0 , X5 , X10 , X15 ),
(X1 , X6 , X11 , X12 ), (X2 , X7 , X8 , X13 ) and (X3 , X4 , X9 , X14 ). Like Salsa, each round of
ChaCha is also reversible.

4.2 Idea of attack on Salsa and ChaCha

4.2.1 Technique of Attack

Several attacks have been proposed so far against Salsa and ChaCha, and their basic ideas
are similar. We input a difference at some intended bit Xi, j of the initial
matrix X and call the new matrix X ′ . We then try to obtain a bias of the output difference
at some particular bit, or combination of bits, of the output matrix at some r-th round.
We can compute Pr(∆rp,q = 1 | ∆i, j = 1); suppose this value is (1/2)(1 + εd ). The term εd is
the measure of the bias of the output difference. Similarly, from the final state, we can
come backwards by ReverseSalsa.

4.2.2 Concept of PNB

The concept of PNB was introduced in 2008 by Aumasson et al. [5]. This idea was later
revisited by Maitra et al. [80] to provide an improved attack. At first, we give a brief
idea of Probabilistic Neutral Bits or PNB as given in [5, 80].
The main aim of this idea is to reduce the complexity of searching 256 bits of the
unknown key. We try to partition the set of keybits into two parts:

1. Significant Keybits: keybits which have high influence on the output.

2. Non-significant Keybits: keybits which have low influence on the output.

To be more precise, we find a set of keybits such that, if the values of the keybits of
this set are changed arbitrarily, the probability that the output changes too is lower
than usual. These keybits are considered to have low influence on the output (non-
significant keybits). If we can find a set of such keybits, we try to find the values of the
remaining keybits, i.e., the significant keybits, by guessing randomly and using a
distinguisher to identify the correct set of values. After finding the significant bit values,
we can find the values of the non-significant bits by similar guessing and identification. The
advantage of this idea is that, since the number of significant keybits is much less than
the total size of the key (256), the maximum number of guesses required is significantly
less than 2^256.

Let us explain it more formally. Suppose X and X ′ are the initial matrices with input
difference ∆i, j = 1 at position (i, j). After r < R rounds of Salsa, we obtain a significant
bias εd in the output difference at position (p, q), which we denote by ∆rp,q . At the end
of R rounds, we obtain Z = X + X R and Z ′ = X ′ + X ′R . Now, after the completion of R
rounds, we change any one keybit, say k, of the initial matrices X and X ′ , and call the
new matrices X̃ and X̃ ′ . Subtracting these from Z and Z ′ respectively, we obtain Z − X̃
and Z ′ − X̃ ′ . Next we apply the ReverseSalsa algorithm on Z − X̃ and Z ′ − X̃ ′ for R − r
rounds to obtain Y and Y ′ , and then find their difference at the (p, q)-th position. Suppose
the difference is Γ p,q = Yp,q ⊕ Y ′p,q . We compare this difference to the difference ∆rp,q
obtained after r rounds of Salsa. If these two differences are equal with high probability,
i.e., Pr(Γ p,q = ∆rp,q | ∆i, j = 1) is high, we consider the keybit k to be a
non-significant bit and call it a Probabilistically Neutral Bit or PNB.

To identify PNBs, we consider a predetermined threshold bias γ. We
run the experiment many times using different IVs and calculate the probability
Pr(Γ p,q = ∆rp,q | ∆i, j = 1). If Pr(Γ p,q = ∆rp,q | ∆i, j = 1) = (1/2)(1 + γk ) ≥ (1/2)(1 + γ), then k is
a PNB. The bias γk is called the neutrality measure of the keybit k; γk ≥ γ
implies that the keybit k is a PNB. In this way, we check this bias for every keybit
and finally obtain a set of PNBs. So, the whole set of keybits is divided into two
sets, PNBs and non-PNBs, whose sizes we suppose to be m and n respectively
(m + n = 256).

Actual attack after PNB construction: In our main attack, our aim is to find the
values of the non-PNBs without knowing the correct values of the PNBs. Since changing
PNBs affects the output with low probability, we take a random value for each PNB and
set it to that fixed value. Now we guess a value for each of the non-PNBs and denote
the matrices by X̃ and X̃ ′ . We compute Z − X̃ and Z ′ − X̃ ′ respectively and apply Revers-
eSalsa for R − r rounds on both of them to obtain the states Ỹ and Ỹ ′ . Also, r
rounds of forward Salsa yield X r and X ′r respectively. Let Pr(X rp,q ⊕ X ′rp,q = 0) = (1/2)(1 + εd ).

Suppose Γ̃ p,q = Ỹp,q ⊕ Ỹ ′p,q and Pr(Γ̃ p,q = ∆rp,q ) = (1/2)(1 + εa ). Then
Pr(Γ̃ p,q = 0) = (1/2)(1 + ε), where ε = εa · εd , provided the two events are independent.
Now, if Pr(Γ̃ p,q = 0 | ∆i, j = 1) shows a significant bias ε, we can conclude that our
guessed non-PNB values are correct. Thus, we can find the non-PNB set. After this, we
can guess the values of the PNBs, fixing the non-PNBs to their original values.

Instead of an exhaustive search over all 2^256 possible values of the keybits, the
concept of PNBs helps to reduce the search complexity. If the size of the PNB set is m,
then the number of non-PNBs is n = 256 − m.

Complexity Estimation: Here we briefly repeat the estimation provided by [5] for
the reader’s convenience. We have 2^n possible sequences of random values for the n
non-PNBs. Of these, only 1 sequence is correct and the remaining 2^n − 1 sequences
are incorrect. In our hypothesis testing, we consider the null hypothesis H0 : the
chosen sequence is incorrect. So, 2^n − 1 sequences satisfy the null hypothesis and only
1 sequence satisfies the alternative hypothesis H1 (the chosen sequence is correct).

Two possible errors can occur in this attack:

1. Error of Non-Detection: The chosen sequence A is correct, i.e., A ∈ H1 , but it is
not detected. The probability of this error is Pnd .

2. False Alarm Error: The chosen sequence A is incorrect, i.e., A ∈ H0 , but it shows
a significant bias. As a result, a wrong sequence is accepted. The probability of
this event is Pf a .

Now, to achieve a bound on these probabilities, the authors of [5] used a result from
Neyman–Pearson decision theory. According to this result, the number of samples required is

N ≈ ( (√(α log 4) + 3 √(1 − ε∗2 )) / ε∗ )^2 .

These samples can be used to achieve the bound Pnd = 1.3 × 10^−3 and to bound
Pf a by 2^−α . Here ε∗ is the median of all the ε’s. Based on these values, the complexity is

2^n (N + 2^m Pf a ) = 2^n · N + 2^{256−α} .
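The estimate above can be evaluated mechanically. A small sketch (function names and any parameter values are ours, purely for illustration, not the thesis's attack parameters):

```python
from math import log, log2, sqrt

def samples_needed(alpha, eps_star):
    """N ~ ((sqrt(alpha * log 4) + 3 * sqrt(1 - eps*^2)) / eps*)^2,
    the Neyman-Pearson sample-count estimate quoted above."""
    return ((sqrt(alpha * log(4)) + 3.0 * sqrt(1.0 - eps_star ** 2)) / eps_star) ** 2

def log2_attack_complexity(n, alpha, eps_star):
    """log2 of the total cost 2^n * N + 2^(256 - alpha)."""
    N = samples_needed(alpha, eps_star)
    return log2(2.0 ** n * N + 2.0 ** (256 - alpha))
```

As expected, a smaller median bias ε∗ needs more samples, and for fixed n the complexity is driven by whichever of the two terms 2^n · N and 2^{256−α} dominates.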


4.2.3 Chaining Distinguishers

In 2012, Shi et al. [109] presented a new approach to reduce the complexity of the
actual attack. Instead of finding the values of the n non-PNBs by 2^n random guesses, this
approach [109] searches for the non-PNBs through a step-by-step procedure. We denote the set
of keybits involved in a subkey K by S(K). A few subkeys K1′ , K2′ , . . . , Kr′ are formed from
the non-PNBs such that for each i ∈ [1, r − 1], Ki′ is a subkey of K′i+1 , and Kr′ is equal
to the whole non-PNB set. For each Ki′ , there is a distinguisher Di for 1 ≤ i ≤ r. In this
approach, we first guess the keybits of S(K1′ ) and verify our guesses by the distinguisher
D1 . After that, we try to guess the keybits of S(K2′ ) \ S(K1′ ). Along with K1′ found in
the previous step, the guessed values of S(K2′ ) \ S(K1′ ) give us a possible candidate for K2′ ,
which we verify by D2 . Similarly, we proceed to the next iteration: at the i-th iteration,
we guess the keybits of S(Ki′ ) \ S(K′i−1 ), attach them to K′i−1 (found in the previous step),
and verify by Di . At the r-th step, we get the whole non-PNB set Kr′ .

Now, we discuss the complexity of this approach. Suppose si = |S(Ki′ )| and the
number of samples required to guess Ki′ is Ni for i ∈ [1, r]. Also, suppose that the proba-
bility that an incorrect subkey passes the distinguisher Di (false alarm error) is (Pf a )i =
2^{−αi} . In our guess of K1′ , we need 2^{s1} attempts, each requiring N1 samples; here
(Pf a )1 = 2^{−α1} . Step 2 searches for s2 − s1 keybits, with false alarm error probability
(Pf a )2 = 2^{−α2} . Calculating the complexity of each step, the total complexity, as given
in [109], is

2^{s1} · N1 + 2^{s1} · (Pf a )1 · 2^{s2 − s1} · N2 + · · · + 2^{s1} · (Pf a )1 · 2^{s2 − s1} (Pf a )2 · · · 2^{sr − sr−1} · Nr
+ 2^{s1} · (Pf a )1 · 2^{s2 − s1} (Pf a )2 · · · 2^{sr − sr−1} · (Pf a )r · 2^{256 − sr}

= 2^{s1} · N1 + 2^{s2 − α1} · N2 + · · · + 2^{sr − α1 − α2 − ··· − αr−1} · Nr + 2^{256 − α1 − α2 − ··· − αr} .
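The chained cost above can be computed step by step. A sketch of that computation (interface and names are ours):

```python
from math import log2

def chained_log2_complexity(s, alpha, N, keybits=256):
    """log2 of  sum_i 2^(s_i - alpha_1 - ... - alpha_{i-1}) * N_i  +  2^(keybits - sum_i alpha_i).

    s[i]     = |S(K'_{i+1})|, the keybits guessed up to step i+1,
    alpha[i] = -log2 of the false-alarm probability of D_{i+1},
    N[i]     = samples used by the (i+1)-th distinguisher.
    """
    total, acc = 0.0, 0.0
    for s_i, a_i, N_i in zip(s, alpha, N):
        total += 2.0 ** (s_i - acc) * N_i  # cost of guessing at this step
        acc += a_i                          # false alarms thin out the next step
    total += 2.0 ** (keybits - acc)         # final check over the survivors
    return log2(total)
```

With a single step (r = 1) this collapses to the non-chained estimate 2^{s1} · N1 + 2^{256−α1}, so the chained formula is a strict generalisation.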

4.2.4 Choosing proper IV

After putting the input difference at some position, the difference propagates in each
round. If the difference after the first round can be minimised, one can find a
better bias at the end. Maitra [77] gave a nice procedure to choose the IVs so that the
difference after the first round is minimal.

1. In Salsa: For any IV and key, at least four output differences will occur. So, our
aim is to choose those IVs for which the number of output differences is not more
than four. According to [77], if the values of the key cells k2 and k4 are fixed, then just
by choosing the 12-th bit of v1 properly, we can make the IV suitable for the minimum
(four) output differences, even if the remaining 31 bits are arbitrary. This implies
that for any combination of k2 and k4 , we have 2^31 IVs available that give the minimum
output difference.

2. In ChaCha: Due to its more complicated structure, minimising the difference prop-
agation in ChaCha is much more difficult. Even after replacing the addition modulo
2^32 by XOR, one can see that the input difference ∆13,13 generates output differ-
ences at 10 different places. Experiments were performed in [77] to find a position
for the input difference so that the output difference appears at only 10 places. The values
of k1 and k5 were fixed, and the IVs suitable for the minimum output difference of 10 were
recorded. These experiments were performed over 2^11 different sets of values for
k1 and k5 . In 373 cases, not a single IV was available; on average, 2^27 IVs
were available for each set of values of k1 , k5 .

4.3 Improving the way of constructing the PNB set: Our algorithm

Motivation and basic idea: While finding the PNB set, our aim is to find a set of m keybits
of the matrix X such that, even if the values at those positions are arbitrarily assigned,
the value of ε is high. To achieve such a set, in previous works, one single keybit is
changed and the probability of the two differences being equal is calculated. If this
probability is more than a threshold value, that keybit is included in the PNB set. In
other words, if the PNB set is of size m, this means we have chosen the m
keybits, say x1 , x2 , . . . , xm , for which the above-mentioned probability is maximal. That is,
as single PNBs, these m bits give the best possible result, and it is assumed that
as a set, KPNB = {x1 , x2 , . . . , xm } will also give the best possible result.

But, in reality, this may not be the case. It is possible that as a single PNB each
of x1 , x2 , . . . , xm gives the best result, but as a PNB set {x1 , x2 , . . . , xm } does not give the
best result. There are (256 choose m) possible subsets of size m of the 256 keybits, and some
subset other than {x1 , x2 , . . . , xm } may give a better result as a PNB set. In other words,
there may be a subset K′PNB = {y1 , y2 , . . . , ym } of keybits, where a few of the yi ’s are not from
KPNB , but the value of ε is larger if K′PNB is considered as the PNB set. So, the procedure
of choosing PNBs can be improved further to give a better result. Here, we give an
algorithm to find a better PNB set. Our main approach is to find a combination of
keybits which acts as a good PNB set as a whole, rather than choosing those keybits
which act as good PNBs alone. In this PNB set construction, we use the idea provided
by Maitra [77] to choose the IV in such a way that the difference is minimal.

4.3.1 Algorithm for Salsa

The first difference between our approach and the existing idea is that in our case we do
not declare any threshold value for the probability to include a keybit in the PNB set. Rather,
we fix the size of the PNB set from the beginning. Suppose this predetermined size
is m. Our algorithm returns a PNB set {k1 , k2 , . . . , km } in m iterations: in the i-th iteration,
we include ki in the PNB set.

To find the PNB set, we first define a set PNB0 = ∅.

Choosing and recording IVs: Our first aim is to find a suitable IV v1 for each keybit.
Using Maitra’s idea [77], we fix the two key cells k2 and k4 and go on changing the value
of the IV v1 to find the values for which the number of output differences after the first round
is the minimum, which is 4. According to [77], 2^31 such values of v1 are available for each
possible pair of values of k2 and k4 . So, for each combination of values of k2 and k4 , we
record those 2^31 possible IVs.

First Iteration: Next we put the input difference ∆i, j = 1 into the matrix X and obtain X ′ .
We run the Salsa algorithm and obtain Z = X + X R and Z ′ = X ′ + X ′R . Next, we change
a single keybit in X and X ′ to obtain X̃ and X̃ ′ respectively. Now we apply the Revers-
eSalsa algorithm on Z − X̃ and Z ′ − X̃ ′ for R − r rounds and obtain the states Y and Y ′ .
Let Γ p,q = Yp,q ⊕ Y ′p,q . Our aim is to find the keybit which maximises the probability
Pr(Γ p,q = 0 | ∆i, j = 1). Let k1 be the keybit position which gives the maximum probability.
Define a new set PNB1 = PNB0 ∪ {k1 } = {k1 }.

Second iteration: In the next step of the algorithm, we choose another keybit position
k which is not k1 , i.e., we choose k from the remaining 255 keybits. Take
x1 ∈ {0, 1} uniformly at random. If x1 = 1, complement both keybit positions k and k1 ;
if x1 = 0, complement only keybit position k. Repeat the whole process and calculate
the probability Pr(Γ p,q = 0 | ∆i, j = 1). Our second PNB is the k2 = k which maximises this
probability. Define a new set PNB2 = {k1 , k2 }.

General Iteration: Suppose at the t-th step our PNB set is PNBt = {k1 , k2 , . . . , kt }. At the
(t + 1)-th step, choose a keybit position k ∉ PNBt , i.e., from the remaining 256 − t keybit po-
sitions. Take (x1 , . . . , xt ) ∈ {0, 1}t uniformly at random. If xi = 1, complement the
keybit position ki , for 1 ≤ i ≤ t. Also complement the keybit position k. Repeat the
whole process and calculate the probability Pr(Γ p,q = 0 | ∆i, j = 1). Our (t + 1)-th PNB
is the kt+1 = k which maximises this probability. Define a new set
PNBt+1 = {k1 , k2 , . . . , kt+1 }.

Proceeding in this way, when we achieve a set PNBm = {k1, k2, . . . , km}, we stop
and declare PNBm as our intended PNB set.
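The iterations above amount to a greedy forward-selection loop. A minimal sketch of that loop follows; note that estimate_bias is a hypothetical stub standing in for the real measurement (running Salsa forward R rounds, complementing the candidate keybits under random masks, reversing R − r rounds, and estimating Pr(Γp,q = 0 | ∆i,j = 1) over many key-IV pairs), so only the selection structure is being illustrated.

```python
import random

def estimate_bias(candidate_set):
    # Hypothetical stub for the real experiment, which would measure
    # Pr(Gamma_{p,q} = 0 | Delta_{i,j} = 1) with the bits of candidate_set
    # complemented under a uniformly random mask. Deterministic dummy
    # values keep the sketch runnable.
    random.seed(str(sorted(candidate_set)))
    return 0.5 + random.random() / 100

def greedy_pnb_selection(m, total_keybits=256):
    """Grow the PNB set greedily: at each step add the keybit that, together
    with the bits already chosen, maximizes the estimated probability."""
    pnb = []
    for _ in range(m):
        remaining = [k for k in range(total_keybits) if k not in pnb]
        best = max(remaining, key=lambda k: estimate_bias(pnb + [k]))
        pnb.append(best)
    return pnb

pnbs = greedy_pnb_selection(5)   # in the attack m is fixed in advance
```

In the actual attack the stub is replaced by the measured probability, and m is predetermined (for example 42 for Salsa below).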

4.4 Experimental Results

4.4.1 Our results on Salsa

Based on the above mentioned algorithm, we run our program and obtain a set of Probabilistic
Neutral Bits. Similar to [5, 80], we take ∆7,31 = 1. That is, we put the difference at
the most significant bit of the 7-th cell. We consider the output difference ∆1,14 after 4 rounds. We take
Z = X + X^8 and Z′ = X′ + X′^8. After that we come back 4 rounds. According to the
notation used in Section 4.3.1, we have R = 8 and r = 4. So we consider Pr(Γ1,14 = 0 | ∆7,31 = 1).

We predetermine the size of the PNB set to be 42, i.e., m = 42. Below in Table 4.2, we
give the set of the 42 PNBs, which is generated by choosing 2^32 random key-IV pairs. We
list them in three side-by-side column groups, each with four columns. The first column
gives the name of the PNB. The second column gives the cell of the matrix where this
keybit is located, denoting the cells by X0, X1, X2, . . . , X15. The third column gives the
position of the bit in the corresponding cell. Here, bit number i denotes the i-th bit from
right to left, the rightmost bit being the 0-th bit. The final column gives the keybit
number. One can observe from Table 4.2 that we include the keybit location P41 = 40 in
our PNB set but keybit location 73 is not included. However, as a single PNB, 73 gives
probability 0.5100 whereas 40 gives 0.5056. So, the existing idea [5] tells us to include

Table 4.2: PNB set achieved by the algorithm for single bit output difference in Salsa

PNB Cell No. Bit No. Keybit no. PNB Cell No. Bit No. Keybit no. PNB Cell No. Bit No. Keybit no.
P1 12 5 165 P16 1 26 26 P31 4 24 120
P2 12 6 166 P17 1 27 27 P32 4 25 121
P3 12 7 167 P18 13 19 211 P33 14 23 247
P4 12 8 168 P19 1 28 28 P34 4 26 122
P5 12 9 169 P20 14 21 245 P35 3 7 71
P6 12 10 170 P21 12 16 176 P36 3 8 72
P7 12 11 171 P22 1 29 29 P37 12 18 178
P8 12 12 172 P23 13 20 212 P38 14 2 226
P9 12 13 173 P24 14 0 224 P39 4 27 123
P10 14 18 242 P25 1 30 30 P40 13 22 214
P11 12 14 174 P26 1 31 31 P41 2 8 40
P12 14 19 243 P27 14 22 246 P42 14 24 248
P13 13 18 210 P28 13 21 213
P14 14 20 244 P29 14 1 225
P15 12 15 175 P30 12 17 177

Table 4.3: Our attack complexity for different size of PNB set

Size of PNB set   ε∗ (median)   ε̄ (mean)   Complexity   Optimum α   N

36   0.001346   0.001358   2^245.00   15.20   2^24.92
37   0.000978   0.000986   2^244.93   15.28   2^25.85
38   0.000698   0.000708   2^244.90   15.31   2^26.82
39   0.000502   0.000504   2^244.86   15.36   2^27.78
40   0.000354   0.000364   2^244.86   15.35   2^28.78
41   0.000252   0.000256   2^244.85   15.37   2^29.77
42   0.000180   0.000184   2^244.82   15.40   2^30.74

73 instead of 40. Here our algorithm gives a different PNB than the existing approach.

We perform our experiment over 1024 randomly chosen keys. For each key, we
experiment over 2^30 random IVs and the probability is calculated.

From Table 4.3, we get the best attack complexity 2^244.82 for m = 42.

Explanation for choosing the PNBs in k2: In [77], the probabilistically neutral bits
located in k2 and k4 have been discarded. The reason is that, while choosing
the suitable IVs for minimum output difference, k2 and k4 are fixed. So, if we assign
arbitrary values to the PNBs located in k2 and k4, then the corresponding set of IVs may
differ from the set of IVs of the original key. If no IV is common between them, we cannot
proceed to the actual attack. In our experimental result, we get two PNBs which are
located at the 7-th and 8-th bits of k2. Following [77], one may think that these two PNBs
should be discarded. But in the following Theorem 4.1 we show that we can include
those keybits as well into the PNB set.

Theorem 4.1 If an arbitrary IV value gives minimum output differences for some value
of k2 , then the probability that the same IV also gives minimum output differences for
all four possible values of k2 achieved by changing the 7-th and 8-th bits (and keeping
the remaining bits same), is greater than 0.72.

We give the proof of this theorem using a few lemmas. Assigning arbitrary values
to the 7-th and 8-th keybits, we have a total of 4 possible values for k2. According to this
theorem, we have 0.72 × 2^31 IVs which give minimum output difference for all four
values. From Theorem 4.1, it is clear that a huge number of IVs are available such that
whichever value we assign to those two keybits, those IVs suit all of them. So, we can
easily include those PNBs into our PNB set.

Proof of Theorem 4.1:

For any 32-bit binary number x, by xn we denote the n-th bit of that number, so the LSB
is the 0-th bit. While adding two numbers a and b, at any bit, a carry of value 1 may be
generated. If the carry is generated while adding the i-th bits (i.e., ai and bi) and is added
to the sum of the next bits, we denote it by ci+1. For convenience, we consider this carry
variable for every bit: we assign ci+1 = 0 if no carry is generated, and ci+1 = 1 if a carry
of 1 is generated. So, while adding a and b, the n-th bit of a + b is the sum of an, bn and
cn, where cn is 1 if a carry is generated in the previous bit addition and 0 otherwise.
In the sum S = a + b mod 2^32, if one bit of b is changed, the corresponding bit of a + b
also changes. Now, due to carry, this difference may propagate further to the next bits.
The following lemma gives a probabilistic measure on how far the difference may
propagate in a + b.

Lemma 4.2 Let a = a31 a30 a29 · · · a0 and b = b31 b30 b29 · · · b0 be two arbitrarily chosen
32-bit numbers. Let b′ = b′31 b′30 b′29 · · · b′0 be a number which differs from b at exactly
one bit (say the n-th, n ≤ 31). Consider S = a + b mod 2^32 and S′ = a + b′ mod 2^32. Then
for any k ≥ 0 such that n + k ≤ 31, the probability that S and S′ differ at the (n + k)-th
bit is 1/2^k.

Proof 4.3 Without loss of generality let us assume that bn = 0 and b′n = 1, so b < b′.
If k = 0, then n + k = n. So the n-th bits of S and S′ are the LSBs of (cn + an + bn) and
(c′n + an + b′n) respectively. Now cn and c′n are the same, since bi = b′i for 0 ≤ i ≤ n − 1.
We know that bn differs from b′n. So cn + an + bn mod 2 ≠ c′n + an + b′n mod 2. So,
Pr(Sn ≠ S′n) = 1 = 1/2^0, and the result is true for k = 0.

For k ≥ 1, since bn+k and b′n+k are the same, the (n + k)-th bits of S and S′ can differ only if
one of them receives a carry generated by the sum of the previous bits and the other does not.
According to our assumption, since b < b′, the sum at the (n + k − 1)-th bit of S′ has to
generate the carry while S does not, i.e., c′n+k = 1 and cn+k = 0.

Again, if (k − 1) is not 0, then bn+k−1 = b′n+k−1 and by the same argument, the sums at the
(n + k − 1)-th bit can differ only if the sum at the (n + k − 2)-th bit of S′ generates a carry
and that of S does not, i.e., c′n+k−1 = 1 and cn+k−1 = 0. In this way, each of the bits (n + k − 1),
(n + k − 2), (n + k − 3), · · · , n of S′ must generate a carry. At the same time, each of the
(n + k − 1), (n + k − 2), (n + k − 3), · · · , n-th bits of S must not generate any carry.

We show that the probability of the above mentioned event is 1/2^k. Since the n-th bit of
S′ generates a carry and the n-th bit of S does not, an = 1 iff there is no carry received from
the (n − 1)-th bit; that is, if a carry is received, an must be 0. We call this event A0. Since our
chosen numbers are arbitrary, Pr(A0) = 1/2. Suppose the i-th bit of S′ generates a carry and
the i-th bit of S does not. Also, we have bi+1 = b′i+1. Now (ai+1, bi+1) can have four
possible values: (1, 0), (1, 1), (0, 1) and (0, 0). Among these, (1, 0) and (0, 1) are the
only pairs for which a carry is generated in S′i+1 and is not generated in Si+1. We call
this event Ai+1. The probability that (ai+1, bi+1) is either (0, 1) or (1, 0) is 1/2, so Pr(Ai+1) = 1/2.
Now, to make our event possible, each (ai, bi) for i = (n + 1) to (n + k − 1) must be of
the form (1, 0) or (0, 1), i.e., Ai should occur for i = (n + 1) to (n + k − 1). Since the
numbers are arbitrary, we can assume the Ai's to be independent. So,

Pr(An+1 ∩ · · · ∩ An+k−1) = Pr(An+1) · · · Pr(An+k−1) = 1/2^{k−1}.

So, the probability

Pr(Sn+k ≠ S′n+k) = Pr(A0) · Pr(An+1) · · · Pr(An+k−1) = (1/2) · (1/2^{k−1}) = 1/2^k.
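Since the event in Lemma 4.2 depends only on the bits up to position n + k, the statement can be checked exactly by exhaustive enumeration on a reduced word size. A sketch over 8-bit words (arithmetic mod 2^8 instead of mod 2^32), with n = 2 and k = 3:

```python
MASK = 0xFF     # 8-bit words; the carry structure is the same as for 32 bits
n, k = 2, 3     # flip bit n of b, observe bit n + k of the sum

differ = total = 0
for a in range(256):
    for b in range(256):
        b2 = b ^ (1 << n)                      # b' differs from b at bit n only
        s, s2 = (a + b) & MASK, (a + b2) & MASK
        differ += ((s ^ s2) >> (n + k)) & 1    # 1 iff the sums differ at bit n+k
        total += 1

prob = differ / total    # matches 1/2**k exactly
```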

Looking at the proof, we can see that since the bi's are the same as the b′i's for 0 ≤ i ≤ n − 1,
the generated carries cn and c′n are the same (0 or 1). So, only the difference between bn and
b′n can influence the difference at the (n + k)-th bit. Now, we generalise Lemma 4.2 in
the following lemma.

Lemma 4.4 Let a = a31 a30 a29 · · · a0 and b = b31 b30 b29 · · · b0 be two arbitrarily chosen
32-bit numbers. Let b′ = b′31 b′30 b′29 · · · b′0 be a number such that

1. bi = b′i for i ≥ (n + 1),

2. bn ≠ b′n,

3. cn = c′n.

Suppose S = a + b mod 2^32 and S′ = a + b′ mod 2^32. Then for any k ≥ 0 such that
n + k ≤ 31, the probability that S and S′ differ at the (n + k)-th bit is 1/2^k.

Proof 4.5 Same as the proof of Lemma 4.2. □

Lemma 4.6 Let a = a31 a30 a29 · · · a0 be a 32-bit number and
b = b31 b30 b29 · · · bn+1 x y bn−2 · · · b0 be a number of which the n-th and (n − 1)-th positions
(i.e., x, y) are variables. Let S = a + b mod 2^32. Then the probability of the event E that
for all four possible values of (x, y), the (n + k)-th bit of S (k ≥ 1 and n + k ≤ 31) is the same,
is 1 − 3/2^{k+1}.

Proof 4.7 The four possible values for (x, y) are (1, 1), (0, 1), (1, 0) and (0, 0). Without
loss of generality, we pick one of them, say (1, 1), and call the corresponding value b(1). So, the other
values are b(2), b(3) and b(4). Also, we denote the respective sums S = (a + b) as S(1), S(2),
S(3), S(4). So, two of the b's differ from b(1) at one position only (in this case, b(2) and b(3)),
and one b differs at exactly two positions (in this case b(4)). At first, we divide the analysis into
disjoint cases.

Case 1: We assume that an−1 = 0 and cn−1 = 0. In this case, no carry c(i)n is
generated at the (n − 1)-th bit of any S(i), i.e., c(i)n = 0 for all i ∈ [1, 4]. So, the existence
of differences at the (n + k)-th bit depends only on the differences at the n-th bit. So, by
Lemma 4.2, the probability that a difference exists is 1/2^k, and the probability that the
(n + k)-th bit of S is the same for all (x, y) is (1 − 1/2^k).

Case 2: Now assume an−1 = 0, cn−1 = 1. We divide this case into two subcases.

• Let an = 0. In this case, the n-th bits of S(2), S(3) and S(4) do not generate any
carry. As a result, their (n + k)-th bits have the same value. In S(1), a carry is
generated which reaches the (n + 1)-th bit. By the idea of Lemma 4.2, we can say
that the (n + k)-th bit of S(1) will differ from the other S(i)'s with probability 1/2^{k−1}. So,
the probability that the (n + k)-th bits of all S(i)'s are equal is (1 − 1/2^{k−1}).

• Let an = 1. In this case, the n-th bits of S(1), S(2) and S(3) generate a carry, but
that of S(4) does not. So, the (n + k)-th bits of S(1), S(2) and S(3) have the same value. Now,
using the idea of Lemma 4.2, the (n + k)-th bit of S(4) differs from the others with
probability 1/2^{k−1}. So, in this case also, the probability that the (n + k)-th bits of all
S(i)'s are equal is (1 − 1/2^{k−1}).

Case 3: Now let an−1 = 1, cn−1 = 0. In this case the argument is similar to Case 2. So,
here also, the probability is (1 − 1/2^{k−1}).

Case 4: Let an−1 = 1, cn−1 = 1. In this case, c(i)n is 1 for all i. So, using Lemma 4.4,
the probability that the (n + k)-th bit is not the same for all S(i)'s is 1/2^k.
Therefore, the probability that they are equal is (1 − 1/2^k).

Now, since a is arbitrary, Pr(an−1 = 0) = Pr(an−1 = 1) = 1/2. Suppose Pr(cn−1 = 0) = c.
So, the probability

Pr(E)

= Pr(E | an−1 = 0, cn−1 = 0) · Pr(an−1 = 0) · Pr(cn−1 = 0) + Pr(E | an−1 = 1, cn−1 = 1) · Pr(an−1 = 1) · Pr(cn−1 = 1)

+ Pr(E | an−1 = 1, cn−1 = 0) · Pr(an−1 = 1) · Pr(cn−1 = 0) + Pr(E | an−1 = 0, cn−1 = 1) · Pr(an−1 = 0) · Pr(cn−1 = 1)

= (1 − 1/2^k) · (1/2) · c + (1 − 1/2^k) · (1/2) · (1 − c) + (1 − 1/2^{k−1}) · (1/2) · c + (1 − 1/2^{k−1}) · (1/2) · (1 − c)

= (1/2) · [(1 − 1/2^k) + (1 − 1/2^{k−1})]

= (1/2) · (2 − 3/2^k)

= 1 − 3/2^{k+1}.
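Lemma 4.6 admits the same kind of exhaustive check on 8-bit words. With n = 3 and k = 3, the fraction of pairs (a, b) for which bit n + k of a + b is the same under all four values of (x, y) should be 1 − 3/2^4 = 13/16:

```python
MASK = 0xFF
n, k = 3, 3

agree = total = 0
for a in range(256):
    for b in range(256):
        # force bits n and n-1 of b to each of the four (x, y) values
        base = b & ~((1 << n) | (1 << (n - 1)))
        variants = [base | (x << n) | (y << (n - 1))
                    for x in (0, 1) for y in (0, 1)]
        bits = {(((a + v) & MASK) >> (n + k)) & 1 for v in variants}
        agree += (len(bits) == 1)    # event E: bit n+k identical in all four
        total += 1

prob = agree / total    # matches 1 - 3/2**(k+1) exactly
```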

Lemma 4.8 Suppose

1. a = a31 a30 · · · an+1 an an−1 · · · a0,

2. a′ = a′31 a′30 · · · a′n+1 a′n a′n−1 · · · a′0,

3. b = b31 b30 · · · bn+1 bn bn−1 · · · b0,

4. b′ = b′31 b′30 · · · b′n+1 b′n b′n−1 · · · b′0

are such that for all i > n, ai = a′i and bi = b′i. Let S = a + b mod 2^32 and S′ =
a′ + b′ mod 2^32. Then for any k > 0 such that n + k ≤ 31, Pr(Sn+k = S′n+k) > 1 − 1/2^{k−1}.

Proof 4.9 The sum at the n-th bit may or may not give a carry. As usual, let us denote the
carries produced at the n-th bits of S and S′ by cn+1 and c′n+1. Let E be the event that cn+1 and
c′n+1 are equal. In this case, since all the further bits of a and b are respectively the same as
those of a′ and b′, all further bits Si are the same as S′i. So, Sn+k = S′n+k with probability 1. Now we
consider the event E^c, where cn+1 ≠ c′n+1. In this case, by the same arguments used in
Lemma 4.2, Sn+1 ≠ S′n+1 and the difference between Si and S′i propagates further only
if the pairs (ai, bi) = (a′i, b′i) are either (1, 0) or (0, 1). So for k ≥ 2, Sn+k ≠ S′n+k only if
for all i ∈ [n + 1, n + k − 1], (ai, bi) = (a′i, b′i) is either (0, 1) or (1, 0). This has probability
1/2^{k−1}, so Pr(Sn+k = S′n+k | E^c) = 1 − 1/2^{k−1}. So, the total probability is Pr(Sn+k = S′n+k) =
Pr(Sn+k = S′n+k | E) · Pr(E) + Pr(Sn+k = S′n+k | E^c) · Pr(E^c) = 1 · Pr(E) + (1 − 1/2^{k−1}) · Pr(E^c) >
1 − 1/2^{k−1}. □
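A quick Monte Carlo sanity check of Lemma 4.8 on 8-bit words, with n = 3 and k = 3. The lemma's bound is 1 − 1/2^2 = 0.75; the observed probability sits noticeably above it, because the carries cn+1 and c′n+1 differ in only about half of the samples:

```python
import random

random.seed(1)
MASK, n, k = 0xFF, 3, 3
low_mask = (1 << (n + 1)) - 1      # bits 0..n are free to differ

TRIALS = 200_000
same = 0
for _ in range(TRIALS):
    a, b = random.randrange(256), random.randrange(256)
    # a', b' agree with a, b on every bit above n
    a2 = (a & ~low_mask) | random.randrange(low_mask + 1)
    b2 = (b & ~low_mask) | random.randrange(low_mask + 1)
    s, s2 = (a + b) & MASK, (a2 + b2) & MASK
    same += 1 - (((s ^ s2) >> (n + k)) & 1)

prob = same / TRIALS    # stays above the bound 1 - 1/2**(k-1)
```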

According to the notation we used, in the columnround on the 4-th column we have a = c3, b = k2,
c = v1 and d = k4. Using the above lemmas, we now prove Theorem 4.1, which shows
that in the single bit output difference attack on Salsa, the two PNBs located in k2 can be included
in the PNB set, because a huge number of IVs are available which work for any values
assigned to the 7-th and 8-th bits of k2.

Proof of Theorem 4.1.

Proof 4.10 According to [77], only by choosing the 12-th bit of the IV properly, we can
make it eligible for minimum output differences. If the sum of the first 11 bits of c and d
gives a carry 1, we have to choose the 12-th bit of c (= v1) as 1, and if there is no carry,
i.e., the carry is 0, then we choose the 12-th bit as 0. We denote the 7-th and 8-th bits as x
and y. Suppose the four possible values of k2 are k2(1), k2(2), k2(3), k2(4), where the (x, y)'s
are (1, 1), (1, 0), (0, 1), (0, 0) respectively. Now, suppose v1 suits minimum output
difference for k2(1). This means that v1 follows the above mentioned rule. Now, if for
k2(2), k2(3) and k2(4), the sum of the first 11 bits of c and d gives the same carry as in the case of
k2(1), then the same v1 suits all k2(i)'s.

We show that the sum of the first 11 bits gives the same carry for all k2(i)
with high probability. In the first step of the quarterround, b = b ⊕ ((a + d) ≪ 7). Since
a and d are the same in all four cases, the output b's have differences at the 7-th and 8-th bits
only, which are due to the differences in the input b's. In the second step of the quarterround,
c = c ⊕ ((b + a) ≪ 9). Since b is different in all four cases, the 16-th and 17-th bits of
the output c differ. Due to carry, the differences may propagate further. Let us consider
the event E1 that the differences do not propagate more than 4 bits, i.e., beyond the 21-st
bit, so the 22-nd bit is the same in all four cases. By Lemma 4.6, this probability is (1 − 3/2^6).

In the third step, both b and c differ. In the sum b + c mod 2^32, the 7-th and 8-th bits
will differ due to the difference in the b's. Now, let E2 be the event that this difference
does not propagate more than 5 bits. This probability is (1 − 1/2^7).
After rotation by 13 bits, these differences shift to the bits from the 20-th to the 26-th.

Again, due to the difference in the c's, in the sum (b + c), differences appear at the bits
from the 16-th to the 21-st. This difference may propagate. In this case, we consider the event
E3 that the difference does not propagate more than 4 bits. Using Lemma 4.8, we can say
Pr(E3) > (1 − 1/2^3) = 7/8.

After the left rotation, these differences shift to the bits from the 29-th to the 31-st and from the 0-th to
the 7-th. So, d differs from the 0-th bit to the 7-th bit, the 20-th to the 26-th, and the 29-th to the 31-st.
Finally, in the fourth step, a = a ⊕ ((c + d) ≪ 18). Now, c + d mod 2^32 differs from the 0-th
bit to the 7-th bit. Assuming that all of E1, E2, E3 occur, let E4 be the event that this difference
propagates at most to the 10-th bit. Using Lemma 4.8, we find the probability of E4 to
be greater than 7/8.

Now, if E1, E2, E3, E4 occur simultaneously, the carries of the sums of the first 11 bits are
the same in all four cases. So, the same IV works for all k2(i). Now, assuming the Ei's independent,
Pr(E1 ∩ E2 ∩ E3 ∩ E4) = Pr(E1) · Pr(E2) · Pr(E3) · Pr(E4) > (1 − 3/2^6) · (1 − 1/2^7) · (7/8)^2 > 0.72. □

Use Of multi bit output difference in the algorithm: In [32], a new approach has
been introduced to further increase the bias in differential attack against Salsa. Instead
of observing the output difference in one single position, they suggested to observe

Table 4.4: PNB set achieved for multibit output difference in Salsa

PNB Cell No. Bit No. Keybit no. PNB Cell No. Bit No. Keybit no. PNB Cell No. Bit No. Keybit no.
P1 12 4 164 P16 12 14 174 P31 4 24 120
P2 12 5 165 P17 1 25 25 P32 14 22 246
P3 12 6 166 P18 1 26 26 P33 3 6 70
P4 12 7 167 P19 13 18 210 P34 4 25 121
P5 12 8 168 P20 1 27 27 P35 1 30 30
P6 12 9 169 P21 14 20 244 P36 1 31 31
P7 12 10 170 P22 12 15 175 P37 14 1 225
P8 14 31 255 P23 13 19 211 P38 3 7 71
P9 12 11 171 P24 1 28 28 P39 2 7 39
P10 12 12 172 P25 14 21 245 P40 3 8 72
P11 14 17 241 P26 12 16 176 P41 13 21 213
P12 12 13 173 P27 1 29 29 P42 4 26 122
P13 14 18 242 P28 14 0 224 P43 4 11 107
P14 13 17 209 P29 13 20 212 P44 3 9 73
P15 14 19 243 P30 4 23 119 P45 2 8 40

multiple positions. They provided a theoretical way of choosing a proper combination
of output bits so that the bias becomes significantly large. Using this improvement, they
achieved much better results for reduced-round Salsa. For some linear combinations of
bits, they obtained high biases in 6-round Salsa. This result helps in the cryptanalysis of
6-round Salsa in practical time.

Based on the theoretical results, they provided a list of input difference bits and,
for each of them, a combination of three output difference bits which gives a huge
bias after 5 rounds. For Salsa, they put the difference at position (7, 0), i.e., the 0-th bit
of the 7-th cell. The maximum bias is observed for the combination of locations (9, 0), (13, 0)
and (1, 13), i.e., they observed the output difference ∆9,0 ⊕ ∆13,0 ⊕ ∆1,13 after 5 rounds. Though we
discussed our algorithm for the single bit output difference of Salsa, the same algorithm
works for multi-bit output differences. In our experiment, we use the theory of [32] to
improve our result. Using their idea, we observe the output difference at 3 bits and run
our algorithm. Here we take R = 8 and r = 5. That is, we consider Z = X + X^8 and
Z′ = X′ + X′^8. After that we come back R − r = 3 rounds and consider the probability
Pr(Γ9,0 ⊕ Γ13,0 ⊕ Γ1,13 = 0 | ∆7,0 = 1). We present our PNB set in Table 4.4, which is
generated by choosing 2^32 random key-IV pairs.

Here, the input difference is put at (7, 0), which is in the 4-th column. The corresponding
columnround involves k2 and k4. From Table 4.4, we see that 4 PNBs lie in
k2: at (3, 6), (3, 7), (3, 8) and (3, 9). Now, assigning arbitrary values to these
4 positions, we have 2^4 = 16 different key values. So, for the actual attack, we use only

Table 4.5: Complexity for different size of PNB set in Salsa

Size of PNB set   ε∗ (median)   ε̄ (mean)   Complexity   Optimum α   N

36   -0.001542   -0.001538   2^244.63   15.60   2^24.55
37   -0.001164   -0.001168   2^244.45   15.79   2^25.38
38   -0.000856   -0.000862   2^244.35   15.91   2^26.27
39   -0.000630   -0.000628   2^244.24   16.02   2^27.16
40   -0.000466   -0.000468   2^244.11   16.16   2^28.04
41   -0.000344   -0.000358   2^244.00   16.28   2^28.92
42   -0.000246   -0.000250   2^243.96   16.32   2^29.89
43   -0.000176   -0.000176   2^243.93   16.35   2^30.86
44   -0.000118   -0.000118   2^244.08   16.19   2^32.00
45   -0.000086   -0.000086   2^244.00   16.28   2^32.92
46   -0.000060   -0.000060   2^244.03   16.24   2^33.96

those IVs which give the minimum output difference (which is 4) after the first round for all
16 possible values of the key. If no such IV is available, then we discard that key value and
choose another one.

We experiment over 1024 randomly chosen keys. For each key, the experiment is
performed over 2^30 random IVs, and finally the probability is calculated. From
Table 4.5, one can see that the lowest complexity 2^243.93 is attained when the PNB size is 43.
So, if we predetermine our PNB set size to be 43 and then run the algorithm, we attain
the best result. This result beats the previous best result attained by [32], which is 2^244.85.

Results using Chaining Distinguishers: Now we provide our results using the approach
of chaining distinguishers [109], as given in Section 4.2.3. We experiment over 128
randomly chosen keys. For each key, the experiment is performed over 2^33 random
IVs, and finally the probability is calculated. In previous works [5, 80], the median
of all ε values was used to calculate N so that the success probability is at least 50%.
In the chaining distinguisher approach, we find the non-PNB keybits in two steps. So,
according to the notation used in Section 4.2.3, we have r = 2, and we have two biases
ε1 and ε2 in the two steps. If we use the median values for both ε1 and ε2, the final success
probability decreases to 50% × 50% = 25%. To increase this probability, instead of
using the median values of ε1 and ε2, we use the 29th percentile values. So we
choose the value ε1+ such that 29% of the ε1 values are less than ε1+. So, in the first step, our success
probability is 71%. Similarly, in the second step we take ε2+ such that 29% of the ε2 values are less

Table 4.6: Value of ε + (29 percentile) for different size of PNB set for Salsa using
Chaining Distinguishers

Size of PNB 36 37 38 39 40 41
ε + (29 percentile) 0.001340 0.001014 0.000734 0.000548 0.000383 0.000269
Size of PNB 42 43 44 45 46
ε + (29 percentile) 0.000190 0.000138 0.000092 0.000066 0.000044

than ε2+. So, the total success probability remains 71% × 71% ≈ 50%.
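The percentile choice is a direct computation on the sampled biases. A minimal sketch, with synthetic ε samples standing in for the measured ones:

```python
def percentile_low(samples, q):
    """Value such that a fraction q of the samples lies below it (nearest rank)."""
    s = sorted(samples)
    return s[round(q * len(s))]

# synthetic eps_1 samples for illustration; in the attack these are the
# biases measured over the sampled keys
eps1 = [i / 1000 for i in range(100)]      # 0.000, 0.001, ..., 0.099
eps1_plus = percentile_low(eps1, 0.29)     # 29% of the samples fall below this
success_per_step = 1 - 0.29                # keys whose eps_1 >= eps1_plus
combined = success_per_step ** 2           # two chained steps: about 50%
```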

In Table 4.6, for different sizes of the PNB set of Salsa, we give the respective values of
ε+. The best result is achieved for the PNB size pair (39, 43). Hence in the first
step, we consider our PNB set size to be 43, and find the remaining 213 non-PNBs. In
the second step, we search for 4 more keybits. Finally, by exhaustive key search, we search the
remaining 39 PNBs. Using the notation in Section 4.2.3, we have optimum α1 = 3.76
and N1 = 2^30.45. In the second step, α2 = 15.62, N2 = 2^27.54. The complexity is 2^243.67,
which is 2.27 times faster than the existing best result 2^244.85.

4.4.2 Experimental Result For ChaCha

Like Salsa, we run the same algorithm on ChaCha also, with a multi-bit output difference.
In this case we achieve an even better result than for Salsa. For ChaCha we give the input
difference at (13, 13) as in [32]. We consider the following output difference after 4.5
rounds, as given in [32, Table 8]:

∆0,0 ⊕ ∆0,8 ⊕ ∆1,0 ⊕ ∆5,12 ⊕ ∆11,0 ⊕ ∆9,0 ⊕ ∆15,0 ⊕ ∆12,16 ⊕ ∆12,24.

Like [32], we take R = 7 and r = 4.5 in our approach. In Table 4.7, we present our
PNB set of size 55, which is generated by choosing 2^32 random key-IV pairs. Here, the input
difference is put at (13, 13), which is in the 2-nd column. The corresponding columnround
involves k1 and k5. From Table 4.7, we see that 3 PNBs are from k1 ((5, 3), (5, 6)
and (5, 4)), and 1 PNB is from k5, which is (9, 31). We use those IVs which give the minimum
output difference (which is 10) after the first round for all possible arbitrary values at those
4 bits. For a key, if there is no such IV available which gives the minimum difference for
all 16 key values achieved by assigning arbitrary values to the PNBs, we discard that key

Table 4.7: PNB set achieved for ChaCha for multibit output difference

PNB Cell No. Bit No. Keybit no. PNB Cell No. Bit No. Keybit no. PNB Cell No. Bit No. Keybit no.
P1 7 31 127 P20 6 3 67 P39 6 7 71
P2 11 24 248 P21 4 15 15 P40 6 4 68
P3 11 25 249 P22 8 8 136 P41 6 8 72
P4 11 26 250 P23 7 7 103 P42 5 3 35
P5 11 27 251 P24 11 0 224 P43 11 3 227
P6 11 28 252 P25 7 8 104 P44 4 3 3
P7 11 29 253 P26 11 1 225 P45 8 28 156
P8 11 30 254 P27 4 16 16 P46 7 11 107
P9 11 31 255 P28 7 9 105 P47 5 6 38
P10 7 0 96 P29 11 2 226 P48 6 9 73
P11 7 1 97 P30 8 9 137 P49 4 18 18
P12 6 27 91 P31 9 31 191 P50 8 11 139
P13 6 28 92 P32 4 31 31 P51 11 4 228
P14 6 29 93 P33 7 4 100 P52 4 7 7
P15 7 2 98 P34 4 17 17 P53 10 0 192
P16 6 30 94 P35 8 31 159 P54 7 5 101
P17 6 31 95 P36 8 10 138 P55 5 4 36
P18 10 31 223 P37 4 6 6
P19 7 3 99 P38 7 10 106

Table 4.8: Complexity for different size of PNB set for ChaCha

Size of PNB set   ε∗ (median)   ε̄ (mean)   Complexity   Optimum α   N

46   0.000768   0.000768   2^237.00   23.70   2^26.95
47   0.000584   0.000584   2^236.80   23.91   2^27.74
48   0.000446   0.000448   2^236.59   24.14   2^28.53
49   0.000340   0.000346   2^236.38   24.36   2^29.32
50   0.000256   0.000260   2^236.20   24.54   2^30.15
51   0.000186   0.000186   2^236.13   24.62   2^31.07
52   0.000140   0.000144   2^235.95   24.80   2^31.90
53   0.000100   0.000100   2^235.93   24.83   2^32.87
54   0.000070   0.000072   2^235.95   24.80   2^33.90
55   0.000050   0.000050   2^235.93   24.83   2^34.87

and choose another one.

We experiment over 1024 randomly chosen keys. For each key, the experiment is
performed over 2^30 random IVs, and finally the probability is calculated. From Table
4.8, it is clear that our result is better than the existing best result 2^237.65 of [32].
The lowest complexity is 2^235.93, which is attained when the PNB set size is 53.

In Table 4.9, for different sizes of the PNB set, we give the respective values of ε+
with 29 percentile. Here, we experiment over 128 randomly chosen keys. For each key,
the experiment is performed over 2^33 random IVs. The best result is achieved for the
PNB size pair (52, 53). This means we assume our PNB set size to be 53 in the first
step and find the remaining 203 bits. Then, in the second step, we find one more keybit.

Table 4.9: Value of ε + (29 percentile) for different size of PNB set for ChaCha using
Chaining Distinguishers.

Size of PNB 46 47 48 49 50
ε + (29 percentile) 0.000660 0.000491 0.000381 0.000293 0.000225
Size of PNB 51 52 53 54 55
ε + (29 percentile) 0.000149 0.000116 0.000083 0.000054 0.000037

So, now we have the values of 204 keybits. At last, we find the remaining 52 PNBs by exhaustive search. The
complexity is 2^235.22, which is 5.39 times faster than the existing best result 2^237.65. For
this pair, we have optimum α1 = 4.16, N1 = 2^32.42 and α2 = 24.25, N2 = 2^32.01.

4.5 How to assign values to PNBs

The differential attacks against ChaCha and Salsa involve two probability biases: the forward
probability bias (εd) and the backward probability bias (εa). The product of these
two biases is ε. A higher value of ε results in a reduction of the attack complexity.
So far, most works have tried to increase the forward probability bias εd. Here
we suggest a method to increase the backward probability bias. The attacks against
these two ciphers put arbitrary values in the probabilistic neutral bits and then
find the remaining bits by guessing. Here, we suggest putting some fixed values
at the probabilistic neutral bits, instead of arbitrary values, so that we can have a
higher backward probability bias (εa). As a result, the complexity of the attack can
be improved further. In our idea, we focus on the values assigned to the PNBs.
Suppose there are m probabilistic neutral bits. As a tuple they have 2^m possible values,
among which only one is correct. We observe that there are a few sets of values which
give a better bias of the backward probability on average, even if the values are not
correct. Suppose X and X̄ are respectively the initial matrix and the matrix obtained by
putting some arbitrary value at the PNBs. Now, we compute both Z − X and Z − X̄. If
the differences between Z − X and Z − X̄ are at only a few positions, then after applying
R − r rounds of the reverse algorithm of ChaCha on Z − X̄, the backward probability bias εa
becomes high. Due to this high bias, from the Neyman-Pearson formula, we can achieve a
lower value of N, which helps to reduce the complexity. On the other hand, if Z − X
and Z − X̄ have differences at many positions, the bias εa becomes low, which increases N.

So, if the values of the PNBs can be chosen in such a way that the differences
between Z − X and Z − X̄ are minimized, we can achieve a high εa. Of course, this
difference depends on the original values of the PNBs. If some guessed values of the PNBs
give a very low difference between Z − X and Z − X̄ for some key, the same guessed
values may give a large difference for some other key. But, considering all possible keys,
there are some values for the PNBs which give a low difference on average. If those values
are assigned to the PNBs instead of arbitrary values, we get an advantage in our
attack in the average case.

In ChaCha and Salsa, we have 8 cells which contain keybits. We work on a single
key cell at a time. For convenience, let us assume that we work on key cell k. To find the
values of the PNBs located in k, we consider all possible values for those bits. Suppose
k contains m PNBs, denoted p1, p2, · · · , pm. So, the block p1 p2 · · · pm
has 2^m possible values; let them be v0, v1, · · · , v2^m−1. When we compute Z − X,
by W we denote the 32-bit block of Z from which k is subtracted. Now, we choose
random values for W and k. For each j from 0 to 2^m − 1, we construct a 32-bit block kj by
replacing the original value of the PNB block p1 p2 · · · pm by the value vj. Next we compute
W − k and W − kj for all j. Then, for each j, we count the number of differences between
W − kj and W − k, i.e., we count the number of 1's appearing in (W − k) ⊕ (W − kj).
Let this value be cj1. After this, we again choose random values for W and k, repeat
the above operations, and count the number of differences between W − k and W − kj;
let it be cj2. We repeat this for a large number ℓ of arbitrary values of W and k. We add all the
cj1, cj2, cj3, · · · , cjℓ to get the total number of differences, say cj. Thus, for each vj we have a
corresponding cj. Now suppose cj0 = min_j {cj}. Then we assign vj0 to the PNB block p1 p2 · · · pm. We
repeat the same operation for each keycell and obtain a value for the PNB block of that
cell.
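For a single keycell the counting procedure can be sketched as follows. The PNB positions and the number of samples are illustrative placeholders, not the ones from our experiments:

```python
import random

def best_pnb_value(pnb_positions, trials=10_000, seed=0):
    """Return the PNB-block value v_j minimizing the total difference count
    c_j between W - k and W - k_j over random choices of (W, k)."""
    rng = random.Random(seed)
    m = len(pnb_positions)
    counts = [0] * (1 << m)              # c_j for each candidate value v_j
    for _ in range(trials):
        W = rng.randrange(1 << 32)
        k = rng.randrange(1 << 32)
        for j in range(1 << m):
            kj = k
            for t, pos in enumerate(pnb_positions):
                bit = (j >> t) & 1       # write bit t of v_j into position pos
                kj = (kj & ~(1 << pos)) | (bit << pos)
            diff = ((W - k) ^ (W - kj)) & 0xFFFFFFFF
            counts[j] += bin(diff).count("1")
    best = min(range(1 << m), key=lambda j: counts[j])
    return best, counts

best, counts = best_pnb_value([6, 7, 8])   # hypothetical 3-bit PNB block
```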

Theorem 4.11 If there is a PNB block consisting of consecutive bits ending at the MSB
of any cell, i.e., of the form p31 p30 p29 · · · p32−i, then any arbitrary value can be assigned
to the PNBs of that block. This means that all 2^i possible values for
p31 p30 p29 · · · p32−i give the same bias on average.

Proof 4.12 Suppose k = k31 k30 · · · k0 is a keycell, of which the first i most significant
bits, i.e., k31, k30, · · ·, k32−i, are PNBs. Suppose Z = z31 z30 · · · z0 is the corresponding
output word, and k′31 k′30 · · · k′32−i is an arbitrary value that we assign to the PNBs; we
call the resulting 32-bit value k′. Now, we compare the differences between Z − k and
Z − k′.

By Z1, k1, k′1 we denote the most significant i bits of Z, k, k′ respectively. Since the
last 32 − i bits are not PNBs, they are the same in k and k′. As a result, the last 32 − i
bits of Z − k and Z − k′ are the same. So the number of positions where Z − k mod 2^32
and Z − k′ mod 2^32 differ is the same as the number of positions where (Z1 − k1) mod 2^i
and (Z1 − k′1) mod 2^i differ.

Now, consider all possible values of k1. As k1 ranges over all 2^i possible i-bit values,
Z1 − k1 also ranges over all 2^i possible i-bit values. For each k1, we count the number of
positions where Z1 − k1 and Z1 − k′1 differ and sum these counts. The number of differences
between Z1 − k1 and Z1 − k′1 is the number of 1's appearing in (Z1 − k1) ⊕ (Z1 − k′1).
Now, the set {(Z1 − k1) ⊕ (Z1 − k′1) : k1 is an i-bit number} is simply the set of all
i-bit numbers. So the sum is ∑_{j=0}^{i} j · C(i, j), because there are C(i, j) i-bit numbers
which contain exactly j 1's. This value is the same for any value of k′1, i.e.,
∑_{j=0}^{i} j · C(i, j) does not depend on k′1. So the total number of differences
is the same for any value of the block p31 p30 · · · p32−i.
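The counting argument above can be checked directly. Below is a minimal Python sketch, with hypothetical small parameters i and Z1, that sums the Hamming distance between Z1 − k1 and Z1 − k′1 over all i-bit values k1 and confirms the total i·2^(i−1) does not depend on the assigned PNB value k′1.

```python
# Sketch of the counting argument in Proof 4.12: for i PNB bits at the top
# of a cell, sum the bit differences between Z1 - k1 and Z1 - k1' (mod 2^i)
# over all i-bit values k1. The total is the same for every choice of k1'.

def total_differences(i, Z1, k1_prime):
    m = 1 << i
    return sum(
        bin(((Z1 - k1) % m) ^ ((Z1 - k1_prime) % m)).count("1")
        for k1 in range(m)
    )

# The sum equals i * 2^(i-1) = sum_j j*C(i,j), independent of k1_prime.
i, Z1 = 5, 13  # hypothetical small parameters
assert all(total_differences(i, Z1, c) == i * 2 ** (i - 1) for c in range(1 << i))
```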

4.6 Experimental Results

ChaCha: We run our experiment on ChaCha. We use the idea of Maitra [77] to
minimize the number of differences after the first round by choosing a proper IV. To
find the best value for the PNB block of each keycell, we experimented on 10^7 keys. We
provide the values for 52 PNBs in Table 4.10. The PNB blocks where arbitrary values
give the same bias are denoted by x. Like [32], we put the input difference at position
(13, 13), i.e., at the 13th bit of the 13th word. The output difference is observed after
4.5 rounds at ∆^{4.5}_{0,0} ⊕ ∆^{4.5}_{0,8} ⊕ ∆^{4.5}_{1,0} ⊕ ∆^{4.5}_{5,12} ⊕ ∆^{4.5}_{11,0} ⊕ ∆^{4.5}_{9,0} ⊕ ∆^{4.5}_{15,0} ⊕ ∆^{4.5}_{12,16} ⊕ ∆^{4.5}_{12,24}. The average
bias ε̄ observed for random assignment of values to the PNBs is 0.000144, whereas our
selected values for the PNBs give bias 0.000318. We observe that for 67% of keys, the
special PNB values give higher ε than random PNB values. For 10% of the keys, our
complexity is around 10 times faster

Position:   3   6   7  15  16  17  18  31  35  38  67  68
Value:      1   1   0   1   1   1   0   x   1   0   0   1
Position:  71  72  73  91  92  93  94  95  96  97  98  99
Value:      1   1   0   x   x   x   x   x   1   1   1   1
Position: 100 103 104 105 106 107 127 136 137 138 139 156
Value:      0   1   1   1   1   0   x   0   0   0   1   x
Position: 159 191 223 224 225 226 227 228 248 249 250 251
Value:      x   x   x   1   1   1   1   0   x   x   x   x
Position: 252 253 254 255
Value:      x   x   x   x

Table 4.10: Values for Probabilistic Neutral Bits of ChaCha

Percentage of keys   bias (existing)   bias (our)   existing complexity   our complexity
10                   0.000200          0.000648     2^234.97              2^231.71
20                   0.000178          0.000433     2^235.28              2^232.80
30                   0.000165          0.000314     2^235.50              2^233.70
40                   0.000152          0.000231     2^235.73              2^234.74
50                   0.000139          0.000182     2^235.96              2^235.22

Table 4.11: Comparison of bias ε and complexities between existing and our method
for ChaCha

than the existing complexity. The comparison between our complexity and the existing
complexity is provided in Table 4.11, and a graphical representation of the bias ε is
provided in Figure 4.1.

[Plot: bias ε against key percentile, comparing randomly assigned PNB values ("Random PNB") with our assigned values ("Special PNB").]
Figure 4.1: Comparison between the bias achieved by random values and our values for
ChaCha

Using Column Chaining Distinguisher: We attack ChaCha with our idea using the col-
umn chaining distinguisher [109, 38]. This improves the result even further. Here,
initially we consider the size of our PNB set to be 53. Then we find the remaining 203
non-PNBs. In this step we assign arbitrary values to the PNBs. We choose our bias ε1 to be

Position:  25  26  27  28  29  30  31  39  70  71  72 107
Value:      x   x   x   x   x   x   x   x   1   1   0   1
Position: 119 120 121 122 164 165 166 167 168 169 170 171
Value:      1   1   1   0   0   1   1   1   1   0   0   0
Position: 172 173 174 175 176 209 210 211 212 213 224 225
Value:      0   0   0   0   1   1   1   1   1   0   0   1
Position: 241 242 243 244 245 246 255
Value:      1   1   1   1   1   0   x

Table 4.12: Values for the probabilistic neutral bits of Salsa

the 10th-percentile value, which is 0.000047. In the second step, we find the value
of another bit, considering the remaining 52 bits to be PNBs. Here we use our attack
idea of assigning our fixed values to the PNBs. We take the bias ε2 to be the
45th-percentile value, which is 0.000164. Thus our attack has a success probability of 50%.
In this method, the attack complexity comes down to 2^234.50.

Salsa: In Table 4.12, we provide the values of the PNB blocks which give the best bias for
Salsa. The PNB blocks where arbitrary values give the same bias are denoted by x. The
input difference is put at position (7, 0) and the output difference is observed at the XOR
of (9, 0), (13, 0) and (1, 13). The average bias ε̄ observed for random assignment of values
to the PNBs is −0.000170, whereas our selected values for the PNBs give bias −0.000308.
We observe that for 57% of keys, the special PNB values give higher ε than random PNB
values. In Table 4.13 we provide the comparison between our result and the existing
result up to 50% of keys. From Table 4.13, it is clear that for around 10% of the keys, the
complexity is significantly less than (5.35 times faster than) the existing result. However,
as the percentage of keys increases, our result gets closer to the existing result, but it is
still much better. In Figure 4.2, we present the bias ε for randomly chosen PNB values
and our chosen PNB values. From the graph it is clear that for a small fraction of keys
our procedure gives a significantly better result. As the fraction of keys increases, the
difference between our result and the existing result decreases.

4.7 Conclusion

We analyse Salsa and ChaCha for reduced rounds. We have proposed a new algorithm
to construct the set of probabilistic neutral bits. Using this algorithm, we show that
one can cryptanalyse 8-round Salsa with a key search complexity of 2^243.67 and 7-round

Percentage of keys   bias (existing)   bias (our)   existing complexity   our complexity
10                   −0.000232         −0.000667    2^243.18              2^240.23
20                   −0.000207         −0.000397    2^243.48              2^241.70
30                   −0.000192         −0.000305    2^243.69              2^242.42
40                   −0.000181         −0.000226    2^243.86              2^243.25
50                   −0.000167         −0.000192    2^244.07              2^243.69

Table 4.13: Comparison of bias and complexities between existing and our method for
Salsa

[Plot: bias ε against key percentile, comparing randomly assigned PNB values ("Random PNB") with our assigned values ("Special PNB").]

Figure 4.2: Comparison between the bias achieved by random values and our values for
Salsa

ChaCha with a complexity of 2^235.22. Our attacks on Salsa and ChaCha are around 2.27 and
5.39 times faster than the existing results respectively. Next, we aim at increasing the
backward probability bias of the differential attack against reduced-round Salsa and
ChaCha. Instead of assigning random values to the probabilistic neutral bits, we found
fixed values for the PNB blocks of the keycells. These values give the minimum difference
between Z − X and Z − X′ in the average case. As a result, the backward probability
bias increases significantly. This helps to reduce the complexity of the attack slightly.

CHAPTER 5

Some results on Fruit

In FSE 2015, Armknecht et al. proposed a new technique for designing stream ciphers,
which involves repeated use of the key bits in each round of keystream bit generation. This
technique showed the possibility of designing stream ciphers whose internal state size is
significantly smaller than twice the key size. They proposed a new cipher based on this
idea, named Sprout. But soon Sprout was proved to be insecure. In Crypto 2015,
Lallemand et al. proposed an attack which was 2^10 times faster than exhaustive
search. But the new idea used in Sprout showed a new direction in the design of stream
ciphers, which led to the proposal of several new ciphers with a small internal state.

Fruit is a recently proposed cipher where both the key size and the state size are 80 bits.
Here, we attack full-round Fruit by a divide-and-conquer method. Our attack is equiv-
alent to 2^74.95 Fruit encryptions, which is around 16.95 times faster than average
exhaustive key search. Our idea also works for the second version of Fruit.

In modern cryptography, stream ciphers play a vital role because of the need for
lightweight cryptosystems. Over the last few years, lightweight stream ciphers have
drawn serious attention. Reducing the area of a cipher helps to install other
protection mechanisms for the cipher. Also, it reduces the power consumption of the
machine. Recent applications like WSN, RFID etc. require the use of lightweight ciphers.

A stream cipher is a symmetric key cipher which generates a pseudorandom
keystream. It starts with a secret key and an IV. In the first phase of a stream
cipher, the secret key and the IV are loaded and the Key Scheduling Algorithm is per-
formed, without generating any output keystream. In the second phase, which is called
the PRGA, the keystream is generated. This keystream is directly XOR-ed with the plaintext
bit by bit. There are many stream ciphers which are currently in use.
RC4 and Fish [23] are among the most used ciphers of the last decade. However, most of these
ciphers have shown severe weaknesses in recent times. At the same time, lightweight ciphers
are in high demand from industry. As a result, many new ciphers have been proposed
in the last decade. Grain [76], Led [53], Twine [112], Lblock [117], Present [25], Ktan-
tan [28], Klein [52], Trivium [30], Clefia [110] etc. are some of the promising new
ciphers.
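The keystream usage described above, XOR-ing the keystream with the plaintext, can be sketched in a few lines; the values below are purely illustrative, with a stand-in byte sequence in place of a real cipher's output.

```python
# Sketch of keystream use in a stream cipher: encryption and decryption are
# the same XOR operation, shown here byte-wise for illustration.

def xor_stream(data: bytes, keystream: bytes) -> bytes:
    return bytes(d ^ k for d, k in zip(data, keystream))

plaintext = b"attack at dawn"               # illustrative value
keystream = bytes(range(len(plaintext)))    # stand-in for a cipher's output

ciphertext = xor_stream(plaintext, keystream)
assert xor_stream(ciphertext, keystream) == plaintext  # XOR is an involution
```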

The failure of the NESSIE project, launched in 1999 to develop a secure stream cipher,
led to the launch of eSTREAM by EU ECRYPT in 2004. This multi-year project had
two portfolios: hardware and software. In total, 34 ciphers were proposed in the first phase
of eSTREAM, among which only a few made it to the later phases of the project. After
the last revision, in the hardware category, only Mickey [6], Grain v1 and Trivium [30]
remain part of this portfolio. However, all these ciphers use a large internal state
to generate the keystream bits. The design of a lightweight stream cipher requires a
reduction of the internal state. Unfortunately, for the usual design structure of stream
ciphers, to resist Time Memory Data tradeoff attacks [11, 21], the common principle
is to keep the internal state size at least twice the key size. This made the construction
of more lightweight stream ciphers challenging.

In 2015, Armknecht et al. [4] suggested a slight change in the basic design of stream
ciphers to reduce the internal state without harming security against TMD tradeoff
attacks. Though in previous designs the secret key was usually involved in the
initialisation process only, they suggested using it repeatedly during encryption. In a ci-
pher, the key is initially stored in Non-Volatile Memory (NVM), which means that the
key values in these locations do not change while the cipher is running. After the process
starts, the key is loaded into Volatile Memory (VM) and used. After the key is used, the
contents of the Volatile Memory change, but the key is still stored in Non-Volatile Memory.
In the classical design of stream ciphers, the NVM is of no use after this initialisation. Ac-
cording to the new idea of Armknecht et al. [4], unlike in classical stream ciphers, the
involvement of the key is not over after the initialisation process. Rather, in each clock,
the key is loaded from the NVM to the VM and used in keystream generation. Based on
this principle they proposed a new cipher, named Sprout, which uses an 80-bit key, while
its internal state size is also 80 bits. This new idea attracted cryptographers. Surpris-
ingly, Sprout was not secure. Lallemand et al. [73] attacked the cipher with complexity
around 2^70, which was 2^10 times faster than exhaustive key search. This attack was
based on a divide-and-conquer method. In [81], Maitra et al. provided a fault attack on
Sprout. Then, Esgin et al. [42] presented a tradeoff attack with time complexity 2^33
using 770 terabytes of memory. In Indocrypt 2015, Banik [7] attacked Sprout with low
data complexity. In Asiacrypt 2015, Zhang et al. [119] attacked Sprout with complexity
2^20 times less than [73].

Despite the attacks against Sprout, the new idea used in it showed the possibility of
designing stream ciphers with a significantly smaller internal state. Several new ciphers with
a small internal state have been proposed recently. In 2016, Hamann et al. [56] presented
Lizard, which uses a 120-bit key and a 121-bit inner state. Mikhalev et al. proposed a
cipher named Plantlet [87]. It uses an 80-bit key, and its LFSR and NFSR sizes are 61 and
40 respectively.

Fruit is also a cipher inspired by the same idea of repeated use of the key. Ghafari,
Hu and Chen designed this new ultra-lightweight cipher [3]. Its internal state size is
80 bits, which is the same as the key size. To resist the attack ideas proposed against Sprout,
Ghafari et al. used some new techniques. Most of the attacks against Sprout exploited
the bias of the round key function. To protect Fruit from these attacks, they used
a different and more complicated round key function. A larger NFSR has been used.
They also prevent the NFSR bits from becoming all zero after initialisation.

According to the authors [3], Fruit is much more secure than ciphers like Grain and
Sprout against the cube attack, TMD tradeoff attack etc. It also has no weak key-IV.
The authors also compared its area size and hardware implementation results with Sprout
and Grain. These comparisons show that Fruit is much more lightweight than those
ciphers. As given in [3], the area size of Grain is around 20% more than that of Fruit, and
the number of gate equivalents used by Fruit is less than 80% of that of Grain. In this chapter we
cryptanalyse Fruit. We present an attack inspired by the divide-and-conquer idea
of [73] against Sprout. Our attack recovers the whole key with complexity 2^74.95 for
Fruit version 1 and 2^76.66 for version 2.

[Diagram: the key register k0 k1 · · · k78 k79 and the counter feed the round key function, producing k′t; g and f are the feedback functions of the NFSR and LFSR, whose selected bits combine through ht to produce the keystream bit zt.]

Figure 5.1: Structure of Stream Cipher Fruit

5.1 Description of Fruit version 1

Here we briefly describe the design of Fruit. The designers of this cipher aim at keeping
the state size small, while at the same time resisting time memory data tradeoff
attacks. Here, the internal state is of 80 bits, which is the same as the size of the secret key.
It is composed of an LFSR of 43 bits and an NFSR of 37 bits. Along with the 80-bit secret
key, an IV of 70 bits is also given as input. Under a single IV, a maximum of 2^43 keystream
bits can be produced. For security, the authors also suggest using each key fewer than 2^15
times with different IVs and not reusing the same IV with different keys. At first, we
provide some common notations:

1. t: the clock number.

2. Lt: the LFSR state (lt, lt+1, lt+2, · · ·, lt+42) at clock t.

3. Nt: the NFSR state (nt, nt+1, nt+2, · · ·, nt+36) at clock t.

4. Cr: the 7-bit counter (c^0_t, c^1_t, c^2_t, · · ·, c^6_t).

5. Cc: the 8-bit counter (c^7_t, c^8_t, c^9_t, · · ·, c^14_t).

6. k: the secret key (k0, k1, · · ·, k79).

7. k′t: the round key bit generated at clock t.

8. IV = (v0, v1, · · ·, v69).

9. zt: the keystream bit generated at clock t.

Counters: Unlike most other similar ciphers, Fruit breaks its 15-bit counter into
two parts. The first part (Cr) is of 7 bits. It is allocated to round key generation. The
next 8 bits (Cc) are used in keystream generation. Both counters start from 0 and
increase by one at each clock. These two counters are independent.

Round key function: The bits of the round key are generated from 6 bits of the key by
the following function:

k′t = ks · ky+64 ⊕ ku+72 · kp ⊕ kq+32 ⊕ kr+64.

Here, the indices s, y, u, p, q, r are formed from the counter bits as s = (c^0_t c^1_t c^2_t c^3_t c^4_t c^5_t),
y = (c^3_t c^4_t c^5_t), u = (c^4_t c^5_t c^6_t), p = (c^0_t c^1_t c^2_t c^3_t c^4_t),
q = (c^1_t c^2_t c^3_t c^4_t c^5_t), r = (c^3_t c^4_t c^5_t c^6_t).
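The round key computation can be sketched as below. The bit ordering used when forming each index from the counter bits is our assumption (taken MSB-first here); the check at the end uses Cr = 1111111, for which every ordering gives the same indices.

```python
# Round key bit k't of Fruit, as a function of the key and the 7-bit counter Cr.
# Assumption: the listed counter bits (c^0_t ... c^6_t) are taken MSB-first
# when forming each index; for Cr = 1111111 any ordering agrees.

def bits_to_int(bits):
    v = 0
    for b in bits:
        v = (v << 1) | b
    return v

def round_key_bit(k, c):
    # k: 80 key bits; c: counter bits (c[0] = c^0_t, ..., c[6] = c^6_t)
    s = bits_to_int(c[0:6])   # 6 bits
    y = bits_to_int(c[3:6])   # 3 bits
    u = bits_to_int(c[4:7])   # 3 bits
    p = bits_to_int(c[0:5])   # 5 bits
    q = bits_to_int(c[1:6])   # 5 bits
    r = bits_to_int(c[3:7])   # 4 bits
    return (k[s] & k[y + 64]) ^ (k[u + 72] & k[p]) ^ k[q + 32] ^ k[r + 64]

# For Cr = 1111111: k't = k63*k71 ^ k79*k31 ^ k63 ^ k79.
c = [1] * 7
k = [0] * 80
k[63] = 1            # k63=1, k71=k79=k31=0  ->  0 ^ 0 ^ 1 ^ 0 = 1
assert round_key_bit(k, c) == 1
k[79] = 1            # now k63=k79=1         ->  0 ^ 0 ^ 1 ^ 1 = 0
assert round_key_bit(k, c) == 0
```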

LFSR: The LFSR is of 43 bits. The feedback rule of LFSR is

lt+43 = f (Lt ) = lt ⊕ lt+8 ⊕ lt+18 ⊕ lt+23 ⊕ lt+28 ⊕ lt+37 .

NFSR: In Fruit, the length of the NFSR is 37 bits. The feedback function uses a counter
bit of Cc, the round key bit k′t, lt and a non-linear function g over Nt. Here

nt+37 = g(Nt) ⊕ k′t ⊕ lt ⊕ c^10_t,

where g is given by

g(Nt ) = nt ⊕ nt+10 ⊕ nt+20 ⊕ nt+12 · nt+3 ⊕ nt+14 · nt+25 ⊕ nt+5 · nt+23 · nt+31

⊕nt+8 · nt+18 ⊕ nt+28 · nt+30 · nt+32 · nt+34 .

Output function: The output bit zt is computed by applying a function over a few
selected bits of the NFSR and LFSR. One bit of the LFSR and 7 bits of the NFSR
are XORed with the value of a non-linear function h over LFSR and NFSR bits. The output bit
is
zt = ht ⊕ nt ⊕ nt+7 ⊕ nt+13 ⊕ nt+19 ⊕ nt+24 ⊕ nt+29 ⊕ nt+36 ⊕ lt+38 ,

where ht is

ht = nt+1 · lt+15 ⊕ lt+1 · lt+22 ⊕ nt+35 · lt+27 ⊕ nt+33 · lt+11 ⊕ lt+6 · lt+33 · lt+42 .
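The keystream-generation formulas above can be collected into a single-clock sketch, a plain transcription of f, g, h and zt, with the round key bit k′t and the counter bit c^10_t passed in as parameters:

```python
# One keystream-generation clock of Fruit: compute z_t, then shift the LFSR
# and NFSR with their feedback bits. kt is the round key bit k't and c10 is
# the counter bit c^10_t.

def fruit_clock(l, n, kt, c10):
    h = (n[1] & l[15]) ^ (l[1] & l[22]) ^ (n[35] & l[27]) \
        ^ (n[33] & l[11]) ^ (l[6] & l[33] & l[42])
    z = h ^ n[0] ^ n[7] ^ n[13] ^ n[19] ^ n[24] ^ n[29] ^ n[36] ^ l[38]
    fb_l = l[0] ^ l[8] ^ l[18] ^ l[23] ^ l[28] ^ l[37]
    g = (n[0] ^ n[10] ^ n[20] ^ (n[12] & n[3]) ^ (n[14] & n[25])
         ^ (n[5] & n[23] & n[31]) ^ (n[8] & n[18])
         ^ (n[28] & n[30] & n[32] & n[34]))
    fb_n = g ^ kt ^ l[0] ^ c10
    return z, l[1:] + [fb_l], n[1:] + [fb_n]

z, l2, n2 = fruit_clock([0] * 43, [0] * 37, 0, 1)
assert (z, len(l2), len(n2)) == (0, 43, 37)
assert n2[-1] == 1  # with an all-zero state, the new NFSR bit is just c^10_t
```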

Initialisation: A single 1 bit and 9 zero bits are prepended to the IV, and 50 zero bits
are appended at the end. So the expanded IV is of 130 bits (we call it IV′),
which looks like:

IV′ = 1000000000 v0 v1 · · · v68 v69 000 · · · 00.
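The padding of the 70-bit IV into the 130-bit IV′ can be sketched as:

```python
# Build IV' = 1, nine 0s, the 70 IV bits, then fifty 0s (130 bits in total).

def expand_iv(iv):
    assert len(iv) == 70
    return [1] + [0] * 9 + list(iv) + [0] * 50

iv_prime = expand_iv([1] * 70)
assert len(iv_prime) == 130
assert iv_prime[:10] == [1] + [0] * 9 and iv_prime[-50:] == [0] * 50
```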

At the initial stage, the 80 key bits are loaded into the LFSR and NFSR. The first
37 key bits are loaded into the NFSR and the remaining 43 key bits are loaded into the
LFSR. All ci's are taken as 0 initially. In the first stage, the cipher is clocked 130 times, and
the keystream bits zt are not given as output. Rather, each is XOR-ed with the
corresponding bit of IV′ and fed back to both the NFSR and the LFSR. So,

lt+43 = zt ⊕ v′t ⊕ f(Lt),

nt+37 = zt ⊕ v′t ⊕ g(Nt) ⊕ k′t ⊕ lt ⊕ c^10_t.

In the second stage, the cipher sets all bits of Cr equal to the LSB of the NFSR, except the
last bit, which is set equal to the LSB of the LFSR. Also, l130 is set to 1. In this stage, zt ⊕ v′t
is not fed to the NFSR and LFSR. The cipher still does not give zt as output.

Keystream generation: At the end of the 210 rounds of the first and second stages, the cipher
starts generating output. This output zt is XORed with the plaintext to get the ciphertext.

Inverse Operation: Suppose at some clock t, the state (Lt, Nt) is known. To find
(Lt−1, Nt−1), we only need the values of lt−1 and nt−1, which are given as follows:

lt−1 = lt+42 ⊕ lt+7 ⊕ lt+17 ⊕ lt+22 ⊕ lt+27 ⊕ lt+36,

nt−1 = nt+36 ⊕ k′t−1 ⊕ lt−1 ⊕ c^10_{t−1} ⊕ nt+9 ⊕ nt+19 ⊕ nt+11 · nt+2 ⊕ nt+13 · nt+24 ⊕ nt+4 · nt+22 · nt+30 ⊕ nt+7 · nt+17 ⊕ nt+27 · nt+29 · nt+31 · nt+33.
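For the LFSR part, the inverse rule can be checked to undo one forward clock exactly. Below is a small sketch with a randomly chosen state; the NFSR inverse works the same way once k′t−1 is known.

```python
import random

# Forward LFSR clock and its inverse, per the feedback rule
# l_{t+43} = l_t ^ l_{t+8} ^ l_{t+18} ^ l_{t+23} ^ l_{t+28} ^ l_{t+37}.

def lfsr_forward(l):
    return l[1:] + [l[0] ^ l[8] ^ l[18] ^ l[23] ^ l[28] ^ l[37]]

def lfsr_backward(l):
    # After the shift, tap j of the old state sits at position j - 1.
    l_prev = l[42] ^ l[7] ^ l[17] ^ l[22] ^ l[27] ^ l[36]
    return [l_prev] + l[:42]

random.seed(1)
state = [random.randint(0, 1) for _ in range(43)]
assert lfsr_backward(lfsr_forward(state)) == state
```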

5.2 Key recovery attack on Fruit version 1

In this section, we describe an attack on full-round Fruit. The first phase of our attack is
based on a divide-and-conquer approach. Using this idea, we reduce our search space. In
the next phase, we prune further by using a clever guess-and-determine approach. Let us
start with some simple observations.

Linear register: Note that the LFSR state is totally independent of the rest of the cipher
during the keystream generation phase. Thus, once its 43-bit value at time t is guessed,
we can compute all of its future states during keystream generation.

Counter: After the 130 rounds of the initialisation process, the first part of the counter,
Cr, is fed from the LFSR and NFSR. Thus after 130 rounds, the attacker does not know Cr.
But the second part of the counter, Cc, is deterministic and independent of the key. So
c^10_t, used in the NFSR feedback, is known to the attacker.

5.2.1 First phase of the attack

At any time t in the keystream generation process, we guess the state. Since the total size
of the LFSR and NFSR is 43 + 37 = 80, the number of possible states is 2^80. Now we reduce
this size using two ideas. Our first idea is deterministic in nature, whereas the second
one is probabilistic. Let us start with the first idea.

Sieving of 1 bit: Let us consider the output function

zt = nt+1 · lt+15 ⊕ lt+1 · lt+22 ⊕ nt+35 · lt+27 ⊕ nt+33 · lt+11 ⊕ lt+6 · lt+33 · lt+42

⊕nt ⊕ nt+7 ⊕ nt+13 ⊕ nt+19 ⊕ nt+24 ⊕ nt+29 ⊕ nt+36 ⊕ lt+38 .

We can see that no key bit is used to compute zt. So, at any clock t, if we know
the internal state, we can compute the output zt without knowing any key bit. As zt is
already known, this gives us a sieving of one bit, i.e., knowing only 79 bits of the state,
we can compute the remaining bit from zt. So, the number of possible state candidates
is reduced by half, i.e., to 2^79. However, we cannot continue this sieving for the
next output keystream bits, because for i ≥ 1, k′t+i−1 is involved in nt+36+i, which is
required to compute zt+i.
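The 1-bit sieve can be sketched directly: since nt appears linearly in zt, knowing zt and the other 79 state bits determines nt. The random states below are purely for illustration.

```python
import random

# z_t is key-free and linear in n_t, so 79 state bits plus z_t determine n_t.

def output_bit(l, n):
    h = (n[1] & l[15]) ^ (l[1] & l[22]) ^ (n[35] & l[27]) \
        ^ (n[33] & l[11]) ^ (l[6] & l[33] & l[42])
    return h ^ n[0] ^ n[7] ^ n[13] ^ n[19] ^ n[24] ^ n[29] ^ n[36] ^ l[38]

def recover_n0(z, l, n):
    # The same expression with n[0] dropped; XOR-ing with z isolates n[0].
    h = (n[1] & l[15]) ^ (l[1] & l[22]) ^ (n[35] & l[27]) \
        ^ (n[33] & l[11]) ^ (l[6] & l[33] & l[42])
    return z ^ h ^ n[7] ^ n[13] ^ n[19] ^ n[24] ^ n[29] ^ n[36] ^ l[38]

random.seed(7)
for _ in range(100):
    l = [random.randint(0, 1) for _ in range(43)]
    n = [random.randint(0, 1) for _ in range(37)]
    assert recover_n0(output_bit(l, n), l, n) == n[0]
```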

[Diagram: a guessed state bit together with a known keystream bit z yields one bit of the NFSR.]

Figure 5.2: Our attack idea on Fruit.

Probabilistic Sieving: bias of k′t: In Fruit, the round key bit k′t is a combination of 6 key
bits. However, we observe a positive bias of k′t towards 0. The value of k′t depends on
the counter value Cr through

k′t = ks · ky+64 ⊕ ku+72 · kp ⊕ kq+32 ⊕ kr+64.

Now, for some values of Cr, k′t is 0 with high probability. For example, if Cr is
1111111, then k′t = k63 k71 ⊕ k79 k31 ⊕ k63 ⊕ k79 = k63 (k71 ⊕ 1) ⊕ k79 (k31 ⊕ 1). Among the 16 pos-
sible values of (k63, k71, k79, k31), only six give k′t = 1. So Pr(k′t = 0) = 5/8. We observe
that among the 128 possible values of Cr, there are 32 values for which this probabil-
ity is 5/8, whereas only for four values is this probability 3/8. All remaining counter values
give probability 1/2. Overall, k′t takes the value 0 with probability 135/256 = 52.7%. In Table 5.1,
we give, in decimal form, the counter values for which the probabilities show a bias.
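The stated probability for Cr = 1111111 can be verified by enumerating the four involved key bits; this particular counter value is convenient because the index bit-ordering does not matter.

```python
from itertools import product

# For Cr = 1111111: k't = k63*k71 ^ k79*k31 ^ k63 ^ k79
#                       = k63*(k71 ^ 1) ^ k79*(k31 ^ 1).
ones = sum(
    (k63 & (k71 ^ 1)) ^ (k79 & (k31 ^ 1))
    for k63, k71, k79, k31 in product((0, 1), repeat=4)
)
assert ones == 6                    # six of the 16 assignments give k't = 1
assert (16 - ones) / 16 == 5 / 8    # so Pr(k't = 0) = 5/8
```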

Reducing complexity using this bias: Note that if we know k′t, we can find nt+37 from
the knowledge of zt. This reduces the guessed size of the state using sieving. Since k′t is
biased towards 0, we use this fact to guess the value of k′t. Suppose we want to guess the
Counter values with Pr(k′t = 0) = 3/8    Counter values with Pr(k′t = 0) = 5/8
64                                       72-79
80                                       88-95
96                                       104-111
112                                      120-127

Table 5.1: Distribution of k′t for different counter values.

r 6 8 10 12 14
E(Xr ) 30.777 121.527 484.527 1936.527 7744.526
Reduction factor 2.079 2.107 2.113 2.115 2.116

Table 5.2: Reduction factor for different r consecutive guesses.

values of k′t−1, k′t−2, · · ·, k′t−r. We start with the strings which have higher probabilities
of occurrence. This gives a higher probability of making our guess correct. For example, when
r = 4, the string 0000 has probability (0.527)^4 of being correct, whereas the probability
of 1111 is (0.473)^4, if the events k′i = 0 and k′j = 0 are independent for i ≠ j.

So, while guessing an r-bit string for k′t, k′t+1, · · ·, k′t+r−1, we arrange all possible
strings in decreasing order of their probabilities and form a list. We take our first guess
as the all-zero string 00 · · · 0 of length r, because this string has the maximum probability.
Our next guess is the second string in the list. In this way we go on attempting the
strings one by one from the list.

Suppose Xr is a random variable which denotes the number of guesses required to
find the correct k′t, · · ·, k′t+r−1. The expected value of Xr is

E(Xr) = ∑_{i=0}^{2^r − 1} (i + 1) (135/256)^{count(i,0)} (121/256)^{r − count(i,0)},

where count(i,0) denotes the number of 0's when i is represented as an r-bit binary number.

Since E(Xr) < 2^{r−1}, the complexity of the search is reduced by a factor of 2^r / E(Xr).
In Table 5.2, we give the expected reduction for different r. As we see from the table,
increasing the value of r increases the reduction factor. When r = 12, the reduction
factor almost reaches a constant value.
reaches a constant value.

In Table 5.2, we assume that the events {k′t+i = 0} and {k′t+j = 0} are independent
for 0 ≤ i, j < r with i ≠ j. But this is not the case. There are 32 counter values for which

92
r 6 8 10 12 14 16 18 20
E(Xr ) 26.622 100.377 382.818 1465.256 5623.352 21625.896 79072.808 277169.368
Reduction factor 2.404 2.550 2.675 2.795 2.913 3.030 3.315 3.783

Table 5.3: Actual reduction factor for different r consecutive guesses.

Pr(k′t = k′t+1) = 3/4. Similarly, there are 16 counter values for which Pr(k′t = k′t+2) = 9/16.
So k′t+i and k′t+j are highly correlated for i ≠ j. Thus we consider an ordering for
guessing k′t, k′t+1, · · ·, k′t+r−1. An experiment with 2^20 random keys shows that

Pr[(k′t, k′t+1, · · ·, k′t+11) = (0, 0, · · ·, 0)] = 3.70 × (135/256)^12.

This means that the influence of the correlations of k′t makes the probability of
(k′t, k′t+1, · · ·, k′t+11) = (0, 0, · · ·, 0) 3.70 times larger than our initial prediction (135/256)^12.
In Table 5.3, we present the actual reduction factor. The experiment is done over 2^27
random keys.

Building Lists for LFSR and NFSR: To reduce the complexity, we construct separate
lists for all possible values of the LFSR and NFSR. At any clock t0, instead of guessing
the whole 80 bits of the internal state, we guess the LFSR bits and NFSR bits sepa-
rately, and the complexity decreases significantly. For this purpose, we build independent
lists for the LFSR and NFSR, which we denote by LL and LN respectively. Since the LFSR
is of length 43, LL contains all 2^43 possible state values of the LFSR. For the NFSR, the list
LN contains a few more columns in addition to the 2^37 possible state values. Let us have a
look at the inverse NFSR function:

nt−1 = nt+36 ⊕ k′t−1 ⊕ lt−1 ⊕ c^10_{t−1} ⊕ nt+9 ⊕ nt+19 ⊕ nt+11 · nt+2 ⊕ nt+13 · nt+24 ⊕ nt+4 · nt+22 · nt+30 ⊕ nt+7 · nt+17 ⊕ nt+27 · nt+29 · nt+31 · nt+33.

Here, the term c^10_{t−1} is known, but k′t−1 and lt−1 are unknown. So we include a new
column for k′t in LN. Similarly, since we have to consider both possible values of lt
(0 and 1), we create another column for lt. Let us consider r + 1 consecutive backward
rounds, beginning from the clock t0, i.e., rounds t0, t0 − 1, t0 − 2, · · ·, t0 − r. Then, in each
of these rounds, k′t is unknown. One cannot compute the keystream bit zt−1 from the
knowledge of the state (Lt, Nt) without knowing k′t−1. Thus, to sieve further, we first guess
the values k′t0−1, k′t0−2, · · ·, k′t0−r and use the knowledge of the keystream bits zt0−1, · · ·, zt0−r.
So we have to consider both values (0 and 1) of each of k′t0−1, k′t0−2, · · ·, k′t0−r, which gives
2^r possible cases. But one can see from Table 5.3 that, to find the correct value of
the tuple (k′t0−1, k′t0−2, · · ·, k′t0−r), the average required number of guesses is much smaller
than 2^r.

For each such case, we consider the 2^43 possible LFSR states and 2^37 possible
NFSR states. The lists LL and LN are also sorted according to the values of some
expressions. A detailed explanation of this is given next.

Detailed Explanation of Sieving of 1 bit using Lists LL and LN: Here we give a de-
tailed explanation of how to construct the lists LL and LN and reduce the complexity
by half. Since we do not consider the probabilistic sieving here, we do not use the
column for k′t. So we assume LN to contain the 2^37 possible state values only. Look at
the function

zt = nt+1 · lt+15 ⊕ lt+1 · lt+22 ⊕ nt+35 · lt+27 ⊕ nt+33 · lt+11 ⊕ lt+6 · lt+33 · lt+42
⊕ nt ⊕ nt+7 ⊕ nt+13 ⊕ nt+19 ⊕ nt+24 ⊕ nt+29 ⊕ nt+36 ⊕ lt+38.

We can divide the terms into three types

1. Terms having only LFSR bits.

2. Terms having only NFSR bits.

3. Terms having both NFSR and LFSR bits.

We denote the sum of the terms involving only LFSR bits, together with zt, by τl, and
the sum of the terms involving only NFSR bits by τn. So τl = lt+1 lt+22 ⊕ lt+6 lt+33 lt+42 ⊕ lt+38 ⊕ zt and
τn = nt ⊕ nt+7 ⊕ nt+13 ⊕ nt+19 ⊕ nt+24 ⊕ nt+29 ⊕ nt+36. So we have τl ⊕ τn ⊕ nt+1 lt+15 ⊕
nt+35 lt+27 ⊕ nt+33 lt+11 = 0. Among the LFSR bits, lt+15, lt+27 and lt+11 are involved in
products with NFSR bits. So, once we have the value of zt, we sort our list LL by the
values of τl, lt+15, lt+11, lt+27. For this, consider the tuple (τl, lt+15, lt+11, lt+27). This
tuple can take 16 possible values. Based on these values, we divide the list LL into 16
parts. Similarly, we sort LN according to τn, nt+33, nt+35, nt+1. So, using the equation
τl ⊕ τn ⊕ nt+1 lt+15 ⊕ nt+35 lt+27 ⊕ nt+33 lt+11 = 0, we can choose our LFSR and NFSR
states by gradual matching [91]. This helps us to reduce the complexity by half.
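The bucketed matching can be sketched on random candidate lists: partition LL and LN by their sort tuples, combine only the bucket pairs consistent with the output equation, and check the result against brute force. The toy list sizes and the value z = 0 below are hypothetical.

```python
import random
from collections import defaultdict

random.seed(3)
z = 0  # known keystream bit (illustrative)
Ls = [[random.randint(0, 1) for _ in range(43)] for _ in range(64)]
Ns = [[random.randint(0, 1) for _ in range(37)] for _ in range(64)]

def tau_l(l):
    return (l[1] & l[22]) ^ (l[6] & l[33] & l[42]) ^ l[38] ^ z

def tau_n(n):
    return n[0] ^ n[7] ^ n[13] ^ n[19] ^ n[24] ^ n[29] ^ n[36]

def consistent(l, n):
    return (tau_l(l) ^ tau_n(n) ^ (n[1] & l[15])
            ^ (n[35] & l[27]) ^ (n[33] & l[11])) == 0

# Bucket each list by its 4-bit sort tuple.
buckets_l, buckets_n = defaultdict(list), defaultdict(list)
for i, l in enumerate(Ls):
    buckets_l[(tau_l(l), l[15], l[11], l[27])].append(i)
for j, n in enumerate(Ns):
    buckets_n[(tau_n(n), n[33], n[35], n[1])].append(j)

# Merge only the bucket pairs that satisfy the output equation.
matched = set()
for (tl, l15, l11, l27), li in buckets_l.items():
    for (tn, n33, n35, n1), nj in buckets_n.items():
        if tl ^ tn ^ (n1 & l15) ^ (n35 & l27) ^ (n33 & l11) == 0:
            matched.update((i, j) for i in li for j in nj)

brute = {(i, j) for i, l in enumerate(Ls) for j, n in enumerate(Ns)
         if consistent(l, n)}
assert matched == brute  # bucketing keeps exactly the consistent pairs
```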

We give a simple example. Suppose we are checking the states for which lt+15, lt+11
and lt+27 are zero. Then, to satisfy the equation

τl ⊕ τn ⊕ nt+1 lt+15 ⊕ nt+35 lt+27 ⊕ nt+33 lt+11 = 0,

we combine the part of LL for which τl = 0 only with the parts of LN for which τn = 0,
and not with the parts for which τn = 1. Similarly, with the LFSR states where τl = 1,
we combine the NFSR states for which τn = 1. In particular, the part LL(0,0,0,0) is
combined with LN(1,1,1,0) but not with LN(1,1,1,1), whereas for LL(1,0,0,0) we do the
opposite.

Probabilistic Sieving Using lists LL and LN: We can perform probabilistic sieving
using the two lists LL and LN. In this case, the attacker is provided a number of
consecutive keystream bits and has to guess the value of k′t for those clocks. Now,
instead of guessing the k′t values arbitrarily, following a pattern will help to find the correct
guess with a smaller number of attempts, as mentioned before.

Suppose the attacker tries to find the state (Lt0, Nt0) at time t = t0. She will use the
keystream bits zt0, zt0−1, · · ·, zt0−r for this. For a state (Lt0, Nt0), the bit zt0 can be com-
puted directly, without any value of k′t. But the subkey k′t0−1 is involved in zt0−1, because
zt0−1 depends on nt0−1, and nt0−1 depends on k′t0−1 and lt0−1. So the attacker has to
guess k′t0−1, k′t0−2, · · ·, k′t0−r. Now, instead of considering all possible values for the tuple
(k′t0−1, k′t0−2, · · ·, k′t0−r), we arrange the values according to their probabilities. Let the
expected number of attempts to find (k′t0−1, k′t0−2, · · ·, k′t0−r) be E(Xr). So we consider
the first E(Xr) choices for (k′t0−1, k′t0−2, · · ·, k′t0−r). For each such value we
have to consider the 2^37 NFSR states. Again, at each (t0 − j)-th round (j = 1 to
r), we need to consider both possible values of lt0−j. So, the final size of the list LN is
2^37 · E(Xr) · 2^r = 2^{37+r} · E(Xr). After that, the list is sorted in the following way.

1. Based on the output keystream bit zt0 at the first round, we first sort the lists just
the same way as we mention in 1-bit sieving. LL is sorted based on the val-
ues of τ0l = (lt0+1 lt0+22 ⊕ lt0+6 lt0+33 lt0+42 ⊕ lt0+38 ⊕ zt0), lt0+15, lt0+11, lt0+27 and
LN is sorted according to τ0n = (nt0 ⊕ nt0+7 ⊕ nt0+13 ⊕ nt0+19 ⊕ nt0+24 ⊕ nt0+29 ⊕

r                     10        12        14        16        18        20
2^{79−r} × E(Xr)      2^77.58   2^77.52   2^77.46   2^77.40   2^77.27   2^77.08

Table 5.4: Size of the final set of possible states for different r.

nt0+36), nt0+33, nt0+35, nt0+1. So LL and LN are both divided into 2^4 sublists.
Among these, only 2^{4+4−1} = 2^7 combinations are eligible for the correct state,
by 1-bit sieving. For each eligible combination of sublists from LL and LN, we
consider the next step.

2. For all t = t0 − 1 to t0 − r, we further subdivide (sort) the sublists formed in the previ-
ous step. Based on the value of zt, LL is sorted according to τl, lt+15, lt+11, lt+27
and also lt. Similarly, LN is sorted according to τn, nt+33, nt+35, nt+1. While merging,
corresponding to the LFSR states where lt = 0, we consider the NFSR states in
LN with lt = 0; we do the same for lt = 1. Again, for each eligible combination
of sublists from LL and LN, we consider the next t.

Thus, due to the direct 1-bit sieving and the probabilistic sieving, the total number of
possible states is now 2^80 / 2 = 2^79. If we guess (k′t0−1, k′t0−2, · · ·, k′t0−r) E(Xr) times, the
total number of possible states is 2^{79−r} × E(Xr). In Table 5.4, we provide the total
number of states for different r.

5.2.2 Second phase of the attack: Guessing a middle state

Using our first approach, we have a total of 2^{77.08} possible states. Now the problem is how
to reduce this size further. At any time t0 > 0, we guess an 80-bit vector for the
internal state (Lt0, Nt0). Now, since

zt+1 = ht+1 ⊕ nt+1 ⊕ nt+8 ⊕ nt+14 ⊕ nt+20 ⊕ nt+25 ⊕ nt+30 ⊕ nt+37 ⊕ lt+39,

we have

nt+37 = ht+1 ⊕ nt+1 ⊕ nt+8 ⊕ nt+14 ⊕ nt+20 ⊕ nt+25 ⊕ nt+30 ⊕ zt+1 ⊕ lt+39,

which we compute using the output bit zt+1 and the guessed NFSR and LFSR bits.
Now, from the equations nt+37 = g(Nt) ⊕ k′t ⊕ lt ⊕ c^10_t and k′t = ks · ky+64 ⊕ ku+72 ·

Eq. No. Equation Eq. No. Equation
1 k0 k64 + k0 k72 + k32 + k64 + α0 2 k0 k64 + k0 k73 + k32 + k65 + α1
3 k0 k74 + k0 k65 + k33 + k66 + α2 4 k0 k75 + k0 k65 + k33 + k66 + α3
5 k1 k76 + k0 k72 + k32 + k64 + α4 6 k1 k77 + k0 k73 + k32 + k65 + α5
7 k1 k78 + k0 k65 + k33 + k66 + α6 8 k1 k79 + k0 k65 + k33 + k66 + α7
9 k2 k72 + k4 k68 + k36 + k72 + α8 10 k2 k73 + k4 k68 + k36 + k73 + α9
11 k2 k74 + k5 k69 + k37 + k74 + α10 12 k2 k75 + k5 k69 + k37 + k75 + α11
13 k3 k76 + k4 k68 + k36 + k72 + α12 14 k3 k77 + k4 k68 + k36 + k73 + α13
15 k3 k78 + k5 k69 + k37 + k74 + α14 16 k3 k79 + k5 k69 + k37 + k75 + α15
17 k4 k72 + k8 k64 + k40 + k64 + α16 18 k4 k73 + k8 k64 + k40 + k65 + α17
19 k4 k74 + k9 k65 + k41 + k66 + α18 20 k4 k75 + k9 k65 + k41 + k67 + α19
21 k5 k76 + k10 k66 + k42 + k68 + α20 22 k5 k77 + k10 k66 + k42 + k69 + α21
23 k5 k78 + k11 k67 + k43 + k70 + α22 24 k5 k79 + k11 k67 + k43 + k71 + α23

Table 5.5: Set of 24 equations

k p ⊕ k(q+32) ⊕ k(r+64) , we form the equation:

ks · ky+64 ⊕ ku+72 · k p ⊕ kq+32 ⊕ kr+64 ⊕ g(Nt ) ⊕ lt ⊕ ct10 ⊕ nt+37 = 0.

Expressing the sum g(Nt ) ⊕ lt ⊕ ct10 ⊕ nt+37 as αt , the equation becomes

ks · ky+64 ⊕ ku+72 · k p ⊕ kq+32 ⊕ kr+64 ⊕ αt = 0.

For any t ≥ t0 , this equation must be satisfied if the guess of the internal state (Lt0 , Nt0 ) is correct. Now, we observe experimentally that, for a random counter, we can discard 52% of the wrong states if we take 24 keystream bits. The experiment was done over 100000 random key-IV pairs.

Below we explain this with an example for the counter Cr = 0. We show that for the counter Cr = 0, 24 keystream bits are sufficient to discard around 60% of the wrongly guessed states.

Set of 24 equations beginning with counter 0:

We divide the first 24 equations into two disjoint sets E1 and E2 , each containing 12 equations. In Table 5.5, the 24 equations are given in order. Instead of using XOR (⊕) we have used '+', since the terms are only single bits; this addition (+) is of course modulo 2. Also, the right-hand side of each equation is zero; we do not write it in the table, and only the left-hand side expression is given. Here, we list the equations in E1 , and from these 12 equations we derive three conditions such that the correct candidate cannot satisfy any one of them. So, if a guessed candidate satisfies any one of these three conditions, it is not the correct candidate, and we can discard it. For convenience, in these 12 equations of E1 we denote the αi 's as β j , where j = 0 to 11.
First set of equations:

k0 k64 + k0 k72 + k32 + k64 + β0 (5.1)

k0 k64 + k0 k73 + k32 + k65 + β1 (5.2)

k0 k74 + k0 k65 + k33 + k66 + β2 (5.3)

k0 k75 + k0 k65 + k33 + k66 + β3 (5.4)

k2 k72 + k4 k68 + k36 + k72 + β4 (5.5)

k2 k73 + k4 k68 + k36 + k73 + β5 (5.6)

k2 k74 + k5 k69 + k37 + k74 + β6 (5.7)

k2 k75 + k5 k69 + k37 + k75 + β7 (5.8)

k4 k72 + k8 k64 + k40 + k64 + β8 (5.9)

k4 k73 + k8 k64 + k40 + k65 + β9 (5.10)

k4 k74 + k9 k65 + k41 + k66 + β10 (5.11)

k4 k75 + k9 k65 + k41 + k67 + β11 (5.12)

Condition A:

(a) β4 + β5 = 1,

(b) β2 + β3 + β10 + β11 = 0,

(c) β0 + β1 + β8 + β9 = 1:

Since β4 + β5 = 1, adding equations (5.5) and (5.6), we have (k2 + 1) = 1 = k74 + k75 . We add equations (5.3), (5.4), (5.7), (5.8) and put k74 + k75 = 1, which together with the fact that β2 + β3 + β10 + β11 = 0 gives (k0 + k4 ) = 0. Finally, adding equations (5.1), (5.2), (5.9), (5.10) and putting (k0 + k4 ) = 0, we have β0 + β1 + β8 + β9 = 0, which contradicts (c).
Condition B:

(a) β4 + β5 + β6 + β7 = 1,

(b) β0 + β1 + β8 + β9 = 1,

(c) β2 + β3 + β10 + β11 = 1

Adding equations (5.5), (5.6), (5.7), (5.8) and using the fact that β4 + β5 + β6 + β7 = 1, we have (k2 + 1)(k74 + k75 + k72 + k73 ) = 1. This implies that (k74 + k75 + k72 + k73 ) = 1. But, adding equations (5.1), (5.2), (5.9) and (5.10) and using (b), we have (k72 + k73 ) = 1. Also, adding equations (5.3), (5.4), (5.11) and (5.12), we have (k74 + k75 ) = 1 (since β2 + β3 + β10 + β11 = 1). So, we have (k74 + k75 + k72 + k73 ) = 0, which is a contradiction.
Condition C:

(a) β6 + β7 = 1,

(b) β0 + β1 + β8 + β9 = 0,

(c) β2 + β3 + β10 + β11 = 1

Adding equations (5.7) and (5.8) and using (a), we have k72 + k73 = 1. Using this fact and adding (5.1), (5.2), (5.9), (5.10), we have (k0 + k4 + β0 + β1 + β8 + β9 ) = 0. From (b), we then have k0 + k4 = 0. Now, using k0 + k4 = 0 and adding equations (5.3), (5.4), (5.11), (5.12), we get a contradiction to (c).

Let A, B, C denote the sets of candidates satisfying conditions A, B, C respectively. Looking at the conditions, one can easily observe that the sets A, B, C are mutually disjoint. Since each condition consists of three equations, the number of candidates satisfying a condition is a 1/2^3 fraction of the total candidates. So, |A ∪ B ∪ C| is a 3/8 fraction of the total. Similarly, for the second set of equations, we can find three conditions, namely conditions P, Q, R, each consisting of three equations. Denoting the corresponding sets of candidates by P, Q, R, we have |P ∪ Q ∪ R| = 3/8 of the total. Since the conditions A, B, C are independent of P, Q, R, we have

|A ∪ B ∪ C ∪ P ∪ Q ∪ R| = 3/8 + 3/8 − (3/8)^2 ≈ 0.60 fraction of the total.
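The inclusion-exclusion count above can be checked numerically (a minimal sketch; the variable names are ours):

```python
# Each condition fixes three parities of the beta's, so a wrong candidate
# satisfies it with probability 1/2^3; A, B, C (and P, Q, R) are disjoint.
per_condition = 1 / 2**3
abc = 3 * per_condition          # |A ∪ B ∪ C| as a fraction of all candidates
pqr = 3 * per_condition          # |P ∪ Q ∪ R| likewise
# The two families are independent, so combine them by inclusion-exclusion.
total = abc + pqr - abc * pqr
print(round(total, 4))           # 0.6094, i.e. roughly a 0.60 fraction
```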

99
Average number of output keystream bits required to eliminate a wrong state:
We give an approximate measure of the average number of zi 's required to discard an incorrect state. A set of 24 keystream bits can discard at least 50% of the wrong guesses. Now, suppose the total number of possible guesses is S. For 24 clocks (24 zi 's), on average around S/2 guesses will be eliminated. From the remaining S/2 guesses, again half will be eliminated when we use the equations derived from the next 24 zi 's. So, we are left with S/4 guesses. In this way, in general, for any i, around S/2^i guesses survive the first 24(i − 1) equations but get eliminated when we use 24i equations. For these guesses, the number of zi 's required for elimination is 24i. To calculate the average number of clocks required for elimination, we compute ∑i (S/2^i ) × 24i and divide by S. So, the required average is

∑i (S/2^i ) × 24i / S ≈ (2S × 24)/S ≈ 48.
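The geometric sum above can be verified numerically (a minimal sketch; S cancels, so it is omitted):

```python
# A fraction 1/2^i of the wrong guesses survives the first 24(i-1)
# equations and is eliminated at block i, at a cost of 24*i keystream bits.
avg = sum((1 / 2**i) * 24 * i for i in range(1, 60))   # tail beyond i = 59 is negligible
print(round(avg, 6))   # 48.0
```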

Based on this idea, we construct some tables in preprocessing phase.

Construction of Table:

In the preprocessing phase, we construct r tables, say T1 , T2 , · · · , Tr . In each table, the first column contains all possible 24-bit binary strings, in increasing order of their values. There are in total 2^24 such strings. In the first table (T1 ), each of the strings denotes a possible sequence αi αi+1 · · · αi+23 . Corresponding to each such string αi αi+1 · · · αi+23 , in the second column we record the possible values of Cr at the i-th clock. There are at most 128 values of Cr. So, the data complexity of this table is 2^31 . Similarly, in table T2 , the first column contains all possible 24-bit binary strings, corresponding to αi+24 αi+25 · · · αi+47 . In the second column, for each string αi+24 αi+25 · · · αi+47 , we record the possible values of Cr at the i-th clock. We do the same for all r tables.

Processing Phase:

1. After guessing a middle state at the t0 -th clock, we compute the αi 's from the output keystream bits zi . We compare αt0 αt0+1 αt0+2 · · · αt0+23 with the strings in the first column by binary search to find the exact match, and check the corresponding possible Cr values in the second column. Let us denote this set of Cr values by T1 (Cr).

100
2. If T1 (Cr) = φ , we discard the state. Otherwise we go to the next table T2 . We find the match αi+24 αi+25 · · · αi+47 in the first column and check the corresponding Cr values. We denote this set of Cr values by T2 (Cr). If T1 (Cr) ∩ T2 (Cr) = φ , we discard the state. Otherwise we go to table T3 and find T1 (Cr) ∩ T2 (Cr) ∩ T3 (Cr). We do the same for each table Ti . If at any table the intersection of the Ti (Cr)'s becomes φ , we discard the state.

3. If T (Cr) = ∩ri=1 Ti (Cr) ≠ φ , we list the state along with its T (Cr) = ∩ri=1 Ti (Cr). So, at the end of this, we have a list of possible candidates for the state (Lt , Nt ), and for each such candidate we have a set of possible counter (Cr) values. Since on average 48 keystream bits are sufficient to discard a wrong state, the average number of required tables is r = 2.
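The processing phase amounts to a chain of set intersections. A minimal sketch (the tables here are hypothetical toy dictionaries mapping a 24-bit α-string, encoded as an integer, to its set of admissible Cr values):

```python
def surviving_counters(alpha_strings, tables):
    """Intersect the candidate Cr sets table by table.

    Returns the set of surviving Cr values, or None as soon as an
    intersection becomes empty (i.e. the guessed state is discarded).
    """
    candidates = None
    for s, table in zip(alpha_strings, tables):
        cr = table.get(s, set())
        candidates = cr if candidates is None else candidates & cr
        if not candidates:          # empty intersection: wrong state
            return None
    return candidates

# toy example with two hypothetical tables T1 and T2
t1 = {0b101: {3, 17, 42}}
t2 = {0b110: {17, 99}}
print(surviving_counters([0b101, 0b110], [t1, t2]))   # {17}
print(surviving_counters([0b101, 0b100], [t1, t2]))   # None
```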

Time complexity of our attack: After 1-bit sieving and probabilistic sieving, we have in total 2^77.08 possible states. Then, to discard a wrong state, we have to run Fruit for 48 rounds on average (2 tables). Thus our total time complexity is 2^77.08 × 48 rounds, which is equivalent to 2^77.08 × 48 × (1/210) = 2^74.95 many Fruit encryptions. This is 16.56 times faster than the average exhaustive search attack complexity of 2^79 Fruit encryptions.
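The complexity arithmetic can be checked directly (a minimal sketch; taking 210 as the number of rounds in one full Fruit encryption, i.e. 130 initialization rounds plus 80 keystream rounds, is our assumption behind the conversion factor):

```python
import math

log_states = 77.08            # log2 of the surviving states
rounds_per_check = 48         # average rounds run to discard a wrong state
rounds_per_encryption = 210   # assumed: 130 init + 80 keystream rounds

log_attack = log_states + math.log2(rounds_per_check / rounds_per_encryption)
print(round(log_attack, 2))                 # 74.95
print(round(2 ** (79 - log_attack), 2))     # 16.56-fold speed-up over 2^79
```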

5.3 Second Version of Fruit

After the proposal of Fruit, we reported our attack in [37]. Hamann et al. [57, 74, 75] also attacked this version of Fruit independently. To block these attack approaches, the designers made some changes in the design and proposed a new version of 80-bit Fruit (Fruit v2). Here we briefly discuss the design of the second version and our cryptanalysis of this version.

5.3.1 Structure

Round key generation: In the first version of Fruit, we observed a significant bias in the round key, which we used in our attack idea. In the second version, the authors have changed the round key generation function and the variables used in it:

k′t = ks · ky+32 ⊕ ku+64 · kp ⊕ kq+16 ⊕ kr+48 .

The variables used in it are defined as follows: s = (ct0 ct1 ct2 ct3 ct4 ), y = (ct5 ct6 ct0 ct1 ct2 ),
u = (ct3 ct4 ct5 ct6 ), p = (ct0 ct1 ct2 ct3 ), q = (ct4 ct5 ct6 ct0 ct1 ), r = (ct2 ct3 ct4 ct5 ct6 ).

LFSR and NFSR: The sizes of the LFSR and NFSR are 43 and 37 respectively, the same as in the first version. The LFSR update function is also the same as in the first version. However, the update function of the NFSR has been changed slightly: the counter value ct10 has been replaced by the counter value ct3 . So, in the second version:

nt+37 = g(Nt ) ⊕ k′t ⊕ lt ⊕ ct3 .

However, the function g is the same as in the previous version.

Output Function: The new output generation function is:

zt = ht ⊕ nt ⊕ nt+7 ⊕ nt+13 ⊕ nt+19 ⊕ nt+24 ⊕ nt+29 ⊕ nt+36 ⊕ lt+38 ,

where ht = lt+6 lt+15 ⊕ lt+1 lt+22 ⊕ nt+35 lt+27 ⊕ lt+33 lt+11 ⊕ nt+1 nt+33 lt+42 .
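Written as code, the v2 output computation looks as follows (a minimal sketch; `l` and `n` hold the LFSR and NFSR contents at time t, with index 0 corresponding to lt and nt):

```python
def output_bit(l, n):
    """Fruit v2 output bit z_t from the 43-bit LFSR l and the 37-bit NFSR n."""
    # h_t: the five nonlinear (AND) terms of the filter function
    h = ((l[6] & l[15]) ^ (l[1] & l[22]) ^ (n[35] & l[27])
         ^ (l[33] & l[11]) ^ (n[1] & n[33] & l[42]))
    # the linear NFSR taps plus one LFSR tap
    linear = n[0] ^ n[7] ^ n[13] ^ n[19] ^ n[24] ^ n[29] ^ n[36] ^ l[38]
    return h ^ linear

print(output_bit([0] * 43, [0] * 37))   # 0
print(output_bit([1] * 43, [1] * 37))   # 1
```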

5.3.2 Cryptanalysis

1-bit sieving: As in the first version, 1-bit sieving can be applied in the second version as well. As we see in the output function, zt does not use any round key bit, so zt is a function of LFSR and NFSR bits only. From the knowledge of zt , one bit of the state can be sieved. So, the number of possible states comes down to 2^79 .

Bias of round key: After our attack on the first version of Fruit, the designers have improved the round key generation function. In the new version, the huge bias we observed in the previous round key generation has been removed. As a result, the probabilistic sieving that we applied to the first version is not applicable to the new one.

However, we still observe a small bias in this function. Among the 128 different values of the counter, for only one value does the round key function show a bias in generating 0 and 1. The use of this single bias in the key recovery does not provide any significant improvement in the attack.

Second Phase of Attack: Due to the change in design, the reduction in attack complexity achieved by the second phase of our attack is not as large as in the first version. In this phase, at any iteration t0 of the algorithm, we guess the whole state (Lt0 , Nt0 ). Now, from the output generation function, we have

zt+1 = lt+7 lt+16 ⊕ lt+2 lt+23 ⊕ nt+36 lt+28 ⊕ lt+34 lt+12 ⊕ nt+2 nt+34 lt+43 ⊕
nt+1 ⊕ nt+8 ⊕ nt+14 ⊕ nt+20 ⊕ nt+25 ⊕ nt+30 ⊕ nt+37 ⊕ lt+39 .

So, we can express nt+37 as

zt+1 ⊕ lt+7 lt+16 ⊕ lt+2 lt+23 ⊕ nt+36 lt+28 ⊕ lt+34 lt+12 ⊕ nt+2 nt+34 lt+43

⊕nt+1 ⊕ nt+8 ⊕ nt+14 ⊕ nt+20 ⊕ nt+25 ⊕ nt+30 ⊕ lt+39 .

Since zt+1 is known, we can find nt+37 from the bits of the t-th state. Now, from the NFSR update function and the round key generation function, we have:

nt+37 = g(Nt ) ⊕ lt ⊕ ct3 ⊕ ks ky+32 ⊕ ku+64 kp ⊕ kq+16 ⊕ kr+48 .

Now, from this equation and the already computed value of nt+37 , we can form the equation:

ku+64 kp ⊕ kq+16 ⊕ kr+48 ⊕ αt = 0,

where αt is g(Nt ) ⊕ lt ⊕ ct3 ⊕ nt+37 . This equation should be satisfied for any t ≥ t0 if the guess of the state is correct. In the first version, we observed that 24 equations are sufficient to sieve 50% of the wrong states. However, in the second version, due to the further improvement in the design, we had to change our table construction: in this version we observe that we need 83 equations to sieve 99% of the wrong states.

So, here we construct each table using 42 equations, instead of 24 as in the first version. Then, using the same idea, we can attack this version with complexity 2^76.66 .

5.3.3 Weak key class

In [57], Hamann et al. were able to find a class of weak keys in Fruit v1. They showed that for the subset of keys {k0 , k1 , · · · , k63 , 0, 0, · · · , 0}, k′t is a function of the key bits k0 , k1 , · · · , k31 only, and does not depend on k32 , k33 , · · · , k63 . As a result, for any such key, they were able to recover the inner state at t = 130, i.e., before the completion of initialization, with complexity much less than exhaustive key search. Then, by running the cipher backward, the full 80-bit key can be obtained.

However, in the second version of Fruit, the same set of keys is not weak any more, and no class of such weak keys had been found in the second version so far. In this work, we are able to find classes of weak keys of Fruit v2.

Observation:

1. Set of keys with k32 = k33 = · · · = k79 = 0: For this set of keys, ky+32 = ku+64 = kr+48 = 0. So, k′t reduces to

k′t = kq+16 .

Now, (q + 16) can take values from 16 to 47. Since all ki with i ≥ 32 are 0, kq+16 depends only on the key bits k16 , k17 , · · · , k31 . So, the key bits k0 to k15 are never involved in the computation of k′t in this case.
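This independence from k0, ..., k15 can be checked exhaustively over all 128 counter values (a minimal sketch of the v2 round key function; reading each index string with ct0 as the most significant bit is our assumption about the bit order):

```python
import random

def round_key_v2(k, c):
    """Fruit v2 round key k't from key bits k[0..79] and counter bits c[0..6]."""
    def idx(bits):                      # pack the listed counter bits into an index
        v = 0
        for b in bits:
            v = (v << 1) | c[b]
        return v
    s, y, u = idx([0, 1, 2, 3, 4]), idx([5, 6, 0, 1, 2]), idx([3, 4, 5, 6])
    p, q, r = idx([0, 1, 2, 3]), idx([4, 5, 6, 0, 1]), idx([2, 3, 4, 5, 6])
    return (k[s] & k[y + 32]) ^ (k[u + 64] & k[p]) ^ k[q + 16] ^ k[r + 48]

random.seed(0)
k1 = [random.randint(0, 1) for _ in range(32)] + [0] * 48   # weak class: k32..k79 = 0
k2 = [b ^ 1 for b in k1[:16]] + k1[16:]                     # flip only k0..k15
for val in range(128):                                      # every counter value
    c = [(val >> i) & 1 for i in range(7)]
    assert round_key_v2(k1, c) == round_key_v2(k2, c)       # k't unchanged
print("k't never depends on k0..k15 for this key class")
```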

2. Set of keys with k16 = k17 = · · · = k63 = 0: In this case, k′t is calculated as

k′t = kr+48 ⊕ ku+64 · kp .

Now, the second term is a product, which is 0 with probability 3/4. The first term is also 0 with probability 3/4. So, this expression becomes 0 with probability 9/16. We are not claiming any attack for this kind of key, but we think this bias should be taken care of while designing for improved security.

Countermeasures for our attack against Fruit: Recently the designers of Fruit have proposed a 128-bit version of Fruit. In this design, they have taken care of our attack strategy. So far we have not found any weakness in this version, at least with the strategies that we used on the 80-bit versions. Based on this new design pattern, here we suggest a slight change in the 80-bit versions of Fruit so that our attack can be prevented. In the 128-bit version, the designers have used another round key function, which is used in the output keystream generation. A similar idea can be used in the 80-bit versions as well. Even without defining a new round key function, we can use the same k′t that we use in the NFSR update in the output keystream generation. For example, in Fruit v2, the output function can be modified as follows:

zt = ht ⊕ nt ⊕ nt+7 ⊕ nt+13 ⊕ nt+19 ⊕ nt+24 ⊕ nt+29 ⊕ nt+36 ⊕ lt+38 ⊕ k′t .

5.4 Conclusion

Here we have provided attacks 16.95 times and 10.07 times faster than average exhaustive search for versions 1 and 2 of Fruit respectively. Our idea mostly uses the bias of the round key (k′t ) generation, together with some other pruning methods. We have explained our attack with various diagrams and tables. Though the structure of Fruit shows the possibility of designing ultra-lightweight ciphers with promising security, this attack shows that the design should be analysed more carefully to improve the security without increasing the state size. We have also provided some suggestions on the design pattern. We hope our work will help in designing lightweight ciphers with secure round key generation functions in the future.

CHAPTER 6

Conclusion

In this chapter, we conclude the thesis. We have analysed the stream ciphers RC4, Salsa, Chacha and Fruit.

We revisit the chapters one by one to summarize the thesis. We mention the existing results and prior work (if any) in each direction. Most importantly, we present the crux of each chapter, that is, our contributions, improvements and extensions to existing methods. Finally, we also discuss the future scope for research and potential open problems in the respective fields of study.

6.1 Summary of Technical Results

Chapter 1 provided the introduction to the thesis. The main technical results of the thesis
are discussed in Chapters 2 to 5, and the highlights of these chapters are as follows.

Chapter 2: Generalised Roos bias in RC4

In this chapter, we have given a justification of the negative bias between the i-th keystream byte Zi and i − K[0], which was observed experimentally by Paterson et al. in Asiacrypt 2014. In 1995, Roos observed a bias between Zi and fi = ∑_{r=1}^{i} r + ∑_{r=0}^{i} K[r]. We have generalized this bias and presented two correlations: the first one between Zi and i − fy , and the second one between Zi and fi−1 .

Chapter 3: Settling the proof of Zr = r in RC4

In this chapter, we have first calculated the probability distribution of the RC4 state after any number of iterations of the KSA or PRGA, using a probability transition matrix. After that, we have used these probabilities to prove the biases between Zr and r accurately. Proofs of these biases had been attempted before in FSE 2013 and FSE 2015 without much success.

Chapter 4: Cryptanalysis for reduced round Salsa and Chacha

In this chapter, we have analysed Salsa and Chacha for reduced rounds. We have proposed a new algorithm to construct the set of Probabilistic Neutral Bits (PNBs). Using this algorithm, we have estimated that one can cryptanalyse 8-round Salsa with a key search complexity of 2^243.67 and 7-round Chacha with complexity 2^235.22 . Our attacks on Salsa and Chacha are around 2.27 and 5.39 times faster than the existing results. Next, we aim to increase the backward probability bias. Instead of assigning random values to the probabilistic neutral bits, we have assigned fixed values to the PNB blocks. These values give the minimum differences between Z − X and Z − X′ on average. As a result, the backward probability bias increases significantly. This helps to reduce the complexity of the attack slightly.

Chapter 5: Results on Fruit

In this chapter, we have studied the ultra-lightweight stream cipher Fruit. We have analysed full-round Fruit by a divide-and-conquer method. Our attack is equivalent to 2^74.95 many Fruit encryptions, which is around 16.95 times faster than average exhaustive key search. Our idea also works for the second version of Fruit.

6.2 Open Problems

In this section, we propose a few open problems related to our work. These may lead
to new interesting research topics in the related field.

6.2.1 RC4

In Crypto 2008 [84], Maximov and Khovratovich showed that one can recover the RC4 state from the knowledge of the keystream with time complexity 2^241 . However, in many applications a 16-byte key is used in RC4, so one can find the key exhaustively with complexity 2^128 . Hence the following problem is very interesting in the context of 16-byte RC4 keys.

Problem 6.1 Is it possible to find the state with complexity less than 2^128 ?

In FSE 2008 [16], Biham and Carmeli proposed an algorithm to find the key from the knowledge of the state. This algorithm is indirectly based on the Roos biases. Before this work, Paul and Maitra [96] studied the same problem. However, the success probability of both these works is very low for 16-byte keys.

Problem 6.2 Is it possible to find a 16-byte key from the knowledge of the state?

6.2.2 Salsa and Chacha

Cryptanalysis of reduced-round Salsa and Chacha is based on the idea of Probabilistic Neutral Bits (PNBs). Here, a differential is given to a particular bit position of the IV. After a few rounds, the forward probability is computed. If this probability is significantly different from 0.5, one then tries to find a few key bit positions which are probabilistically neutral in the backward direction. But all these probabilities are calculated experimentally; no theoretical justification has been given in the literature. Hence we have the following problem.

Problem 6.3 Is it possible to find the forward probability theoretically? Why are some key bits probabilistically neutral during backward computation?

The PNB idea for Salsa and Chacha was introduced in [5]. In that work, Salsa was attacked up to 8 rounds and Chacha up to 7 rounds. After almost a decade, these are still the maximum attacked rounds. So we have the following problem.

Problem 6.4 Can we analyse Salsa 256 for more than 8 rounds and Chacha for more
than 7 rounds?

6.2.3 Fruit

After our attack [37] and the attack of Zhang et al. [120], Ghafari et al. [46] proposed a new version of Fruit which has a 128-bit key. In this new version, the key is involved both in the NFSR update function and in the keystream generation function. It seems the attacks [37, 120] cannot work directly on this version of Fruit. Hence we have the following question.

Problem 6.5 Can one find the key with complexity less than 2^128 in Fruit 128?

REFERENCES
[1] AES (2001). Advanced Encryption Standard. National Institute of Standards and
Technology. Available at http://csrc.nist.gov/CryptoToolkit/
aes/rijndael/.

[2] AlFardan, N. J., D. J. Bernstein, K. G. Paterson, B. Poettering, and


J. C. N. Schuldt, On the Security of RC4 in TLS. In Proceedings
of the 22th USENIX Security Symposium, Washington, DC, USA, August
14-16, 2013. 2013. URL https://www.usenix.org/conference/
usenixsecurity13/technical-sessions/paper/alFardan.

[3] Aminghafari, V. and H. Hu (2016). Fruit: ultra-lightweight stream cipher with


shorter internal state. IACR Cryptology ePrint Archive, 2016, 355. URL http:
//eprint.iacr.org/2016/355.

[4] Armknecht, F. and V. Mikhalev, On Lightweight Stream Ciphers with Shorter


Internal States. In Fast Software Encryption - 22nd International Workshop, FSE
2015, Istanbul, Turkey, March 8-11, 2015. 2015.

[5] Aumasson, J., S. Fischer, S. Khazaei, W. Meier, and C. Rechberger, New


Features of Latin Dances: Analysis of Salsa, ChaCha, and Rumba. In Fast Soft-
ware Encryption, 15th International Workshop, FSE 2008, Lausanne, Switzer-
land, February 10-13, 2008. 2008. URL https://doi.org/10.1007/
978-3-540-71039-4_30.

[6] Babbage, S. and M. Dodd, The MICKEY Stream Ciphers. In New Stream Ci-
pher Designs - The eSTREAM Finalists. 2008, 191–209.

[7] Banik, S., Some Results on Sprout. In Progress in Cryptology - INDOCRYPT


2015 - 16th International Conference on Cryptology in India, Bangalore, In-
dia, December 6-9, 2015, Proceedings. 2015. URL https://doi.org/10.
1007/978-3-319-26617-6_7.

[8] Banik, S. and T. Isobe, Cryptanalysis of the Full Spritz Stream Cipher. In
Fast Software Encryption - 23rd International Conference, FSE 2016, Bochum,
Germany, March 20-23, 2016. 2016. URL https://doi.org/10.1007/
978-3-662-52993-5_4.

[9] Barkan, E., E. Biham, and N. Keller, Instant Ciphertext-Only Cryptanalysis of


GSM Encrypted Communication. In Advances in Cryptology - CRYPTO 2003,
23rd Annual International Cryptology Conference, Santa Barbara, California,
USA, August 17-21, 2003, Proceedings. 2003.

[10] Barkan, E., E. Biham, and N. Keller (2008). Instant Ciphertext-Only Crypt-
analysis of GSM Encrypted Communication. J. Cryptology, 21(3), 392–429.

[11] Barkan, E., E. Biham, and A. Shamir, Rigorous Bounds on Cryptanalytic
Time/Memory Tradeoffs. In Advances in Cryptology - CRYPTO 2006, 26th An-
nual International Cryptology Conference, Santa Barbara, California, USA, Au-
gust 20-24, 2006, Proceedings. 2006. URL https://doi.org/10.1007/
11818175_1.

[12] Berbain, C., O. Billet, A. Canteaut, N. Courtois, H. Gilbert, L. Goubin,


A. Gouget, L. Granboulan, C. Lauradoux, M. Minier, T. Pornin, and H. Sib-
ert, Sosemanuk, a Fast Software-Oriented Stream Cipher. In New Stream Cipher
Designs - The eSTREAM Finalists. 2008, 98–118.

[13] Bernstein, D. J. (2005). Salsa20 specification. eStream Project. Available at http://www.ecrypt.eu.org/stream/salsa20pf.html.

[14] Bernstein, D. J. (2008). ChaCha, a variant of Salsa20. Workshop Record, SASC.

[15] Biham, E. and Y. Carmeli, Efficient Reconstruction of RC4 Keys from Internal
States. In Fast Software Encryption, 15th International Workshop, FSE 2008,
Lausanne, Switzerland, February 10-13, 2008. 2008. URL https://doi.
org/10.1007/978-3-540-71039-4_17.

[16] Biham, E. and Y. Carmeli, Efficient Reconstruction of RC4 Keys from Internal
States. In Fast Software Encryption, 15th International Workshop, FSE 2008,
Lausanne, Switzerland, February 10-13, 2008. 2008.

[17] Biham, E. and O. Dunkelman (2007). Differential Cryptanalysis in Stream


Ciphers. IACR Cryptology ePrint Archive, 2007, 218. URL http://eprint.
iacr.org/2007/218.

[18] Biham, E., L. Granboulan, and P. Q. Nguyen, Impossible Fault Analysis of


RC4 and Differential Fault Analysis of RC4. In Fast Software Encryption: 12th
International Workshop, FSE 2005, Paris, France, February 21-23, 2005. 2005.
URL https://doi.org/10.1007/11502760_24.

[19] Biham, E. and A. Shamir, Differential Cryptanalysis of the Full 16-Round DES.
In Advances in Cryptology - CRYPTO ’92, 12th Annual International Cryptology
Conference, Santa Barbara, California, USA, August 16-20, 1992, Proceedings.
1992.

[20] Biryukov, A. and A. Shamir, Cryptanalytic Time/Memory/Data Tradeoffs for


Stream Ciphers. In Advances in Cryptology - ASIACRYPT 2000, 6th Interna-
tional Conference on the Theory and Application of Cryptology and Informa-
tion Security, Kyoto, Japan, December 3-7, 2000, Proceedings. 2000. URL
https://doi.org/10.1007/3-540-44448-3_1.

[21] Biryukov, A. and A. Shamir, Cryptanalytic Time/Memory/Data Tradeoffs for


Stream Ciphers. In Advances in Cryptology - ASIACRYPT 2000, 6th Interna-
tional Conference on the Theory and Application of Cryptology and Information
Security, Kyoto, Japan, December 3-7, 2000, Proceedings. 2000.

[22] Biryukov, A., A. Shamir, and D. A. Wagner, Real Time Cryptanalysis of A5/1
on a PC. In Fast Software Encryption, 7th International Workshop, FSE 2000,
New York, NY, USA, April 10-12, 2000, Proceedings. 2000. URL https://
doi.org/10.1007/3-540-44706-7_1.

[23] Blöcher, U. and M. Dichtl, Fish: A Fast Software Stream Cipher. In Fast
Software Encryption, Cambridge Security Workshop, Cambridge, UK, Decem-
ber 9-11, 1993, Proceedings. 1993. URL https://doi.org/10.1007/
3-540-58108-1_4.

[24] Boesgaard, M., M. Vesterager, T. Pedersen, J. Christiansen, and O. Scave-


nius, Rabbit: A New High-Performance Stream Cipher. In Fast Software En-
cryption, 10th International Workshop, FSE 2003, Lund, Sweden, February 24-
26, 2003. 2003.

[25] Bogdanov, A., L. R. Knudsen, G. Leander, C. Paar, A. Poschmann,


M. J. B. Robshaw, Y. Seurin, and C. Vikkelsoe, PRESENT: An Ultra-
Lightweight Block Cipher. In Cryptographic Hardware and Embedded Sys-
tems - CHES 2007, 9th International Workshop, Vienna, Austria, September
10-13, 2007, Proceedings. 2007. URL https://doi.org/10.1007/
978-3-540-74735-2_31.

[26] Boneh, D., R. A. DeMillo, and R. J. Lipton, On the Importance of Checking


Cryptographic Protocols for Faults (Extended Abstract). In Advances in Cryptol-
ogy - EUROCRYPT ’97, International Conference on the Theory and Application
of Cryptographic Techniques, Konstanz, Germany, May 11-15, 1997, Proceed-
ing. 1997. URL https://doi.org/10.1007/3-540-69053-0_4.

[27] Bricout, R., S. Murphy, K. G. Paterson, and T. van der Merwe (2016).
Analysing and exploiting the mantin biases in RC4. IACR Cryptology ePrint
Archive, 2016, 63. URL http://eprint.iacr.org/2016/063.

[28] Cannière, C. D., O. Dunkelman, and M. Knezevic, KATAN and KTANTAN


- A Family of Small and Efficient Hardware-Oriented Block Ciphers. In Cryp-
tographic Hardware and Embedded Systems - CHES 2009, 11th International
Workshop, Lausanne, Switzerland, September 6-9, 2009, Proceedings. 2009.
URL https://doi.org/10.1007/978-3-642-04138-9_20.

[29] Cannière, C. D. and B. Preneel (2008). Trivium. New Stream Cipher De-
signs - The eSTREAM Finalists. URL https://link.springer.com/
chapter/10.1007/11836810_13.

[30] Cannière, C. D. and B. Preneel (2008). Trivium. New Stream Cipher Designs -
The eSTREAM Finalists. The eSTREAM Finalists.

[31] Canteaut, A. and M. Trabbia, Improved Fast Correlation Attacks Using Parity-
Check Equations of Weight 4 and 5. In Advances in Cryptology - EURO-
CRYPT 2000, International Conference on the Theory and Application of Cryp-
tographic Techniques, Bruges, Belgium, May 14-18, 2000, Proceeding. 2000.
URL https://doi.org/10.1007/3-540-45539-6_40.

[32] Choudhuri, A. R. and S. Maitra (2016). Significantly Improved Multi-bit
Differentials for Reduced Round Salsa and ChaCha. IACR Trans. Symmetric
Cryptol., 2016(2), 261–287. URL https://doi.org/10.13154/tosc.
v2016.i2.261-287.

[33] Courtois, N., Fast Algebraic Attacks on Stream Ciphers with Linear Feed-
back. In Advances in Cryptology - CRYPTO 2003, 23rd Annual Inter-
national Cryptology Conference, Santa Barbara, California, USA, August
17-21, 2003, Proceedings. 2003. URL https://doi.org/10.1007/
978-3-540-45146-4_11.

[34] Courtois, N. and W. Meier, Algebraic Attacks on Stream Ciphers with Linear
Feedback. In Advances in Cryptology - EUROCRYPT 2003, International Con-
ference on the Theory and Applications of Cryptographic Techniques, Warsaw,
Poland, May 4-8, 2003, Proceedings. 2003. URL https://doi.org/10.
1007/3-540-39200-9_21.

[35] Crowley, P. (2005). Truncated differential cryptanalysis of five rounds of


Salsa20. IACR Cryptology ePrint Archive, 2005, 375. URL http://eprint.
iacr.org/2005/375.

[36] DES (1999). Data Encryption Standard. National Institute of Standards and
Technology. Available at http://csrc.nist.gov/publications/
fips/fips46-3/fips46-3.pdf.

[37] Dey, S. and S. Sarkar (2017). Cryptanalysis of full round Fruit. The Tenth
International Workshop on Coding and Cryptography 2017 September 18-22,
2017 Saint-Petersburg, Russia.

[38] Dey, S. and S. Sarkar (2017). Improved analysis for reduced round Salsa and
Chacha. Discrete Applied Mathematics, 227, 58–69. URL https://doi.
org/10.1016/j.dam.2017.04.034.

[39] Dey, S. and S. Sarkar (2017). Settling the mystery of Zr = r in RC4. IACR
Cryptology ePrint Archive, 2017, 1072.

[40] Dey, S. and S. Sarkar (2018). Generalization of Roos bias in RC4 and some
results on key-keystream relations. Journal of Mathematical Cryptology.

[41] Dinur, I. and A. Shamir, Cube Attacks on Tweakable Black Box Polynomials.
In Advances in Cryptology - EUROCRYPT 2009, 28th Annual International Con-
ference on the Theory and Applications of Cryptographic Techniques, Cologne,
Germany, April 26-30, 2009. Proceedings. 2009.

[42] Esgin, M. F. and O. Kara, Practical Cryptanalysis of Full Sprout with TMD
Tradeoff Attacks. In Selected Areas in Cryptography - SAC 2015 - 22nd Inter-
national Conference, Sackville, NB, Canada, August 12-14, 2015. 2015. URL
https://doi.org/10.1007/978-3-319-31301-6_4.

[43] Fischer, S., W. Meier, C. Berbain, J. Biasse, and M. J. B. Robshaw, Non-


randomness in eSTREAM Candidates Salsa20 and TSC-4. In Progress in
Cryptology - INDOCRYPT 2006, 7th International Conference on Cryptology

in India, Kolkata, India, December 11-13, 2006, Proceedings. 2006. URL
https://doi.org/10.1007/11941378_2.

[44] Fluhrer, S. R., I. Mantin, and A. Shamir, Weaknesses in the Key Scheduling
Algorithm of RC4. In Selected Areas in Cryptography, 8th Annual International
Workshop, SAC 2001 Toronto, Ontario, Canada, August 16-17, 2001. 2001.

[45] Fluhrer, S. R. and D. A. McGrew, Statistical Analysis of the Alleged RC4


Keystream Generator. In Fast Software Encryption, 7th International Workshop,
FSE 2000, New York, NY, USA, April 10-12, 2000, Proceedings. 2000.

[46] Ghafari, V. A., H. Hu, and M. Alizadeh (2017). Necessary conditions for designing secure stream ciphers with the minimal internal states. IACR Cryptology ePrint Archive, 2017, 765. URL http://eprint.iacr.org/2017/765.

[47] Golic, J. D., Towards Fast Correlation Attacks on Irregularly Clocked Shift Reg-
isters. In Advances in Cryptology - EUROCRYPT ’95, International Confer-
ence on the Theory and Application of Cryptographic Techniques, Saint-Malo,
France, May 21-25, 1995, Proceeding. 1995. URL https://doi.org/10.
1007/3-540-49264-X_20.

[48] Golic, J. D. (1996). Correlation Properties of a General Binary Combiner


with Memory. J. Cryptology, 9(2), 111–126. URL https://doi.org/10.
1007/BF00190805.

[49] Golic, J. D., Cryptanalysis of Alleged A5 Stream Cipher. In Advances in Cryptology - EUROCRYPT ’97, International Conference on the Theory and Application of Cryptographic Techniques, Konstanz, Germany, May 11-15, 1997. 1997.

[50] Golic, J. D., Correlation Analysis of the Shrinking Generator. In Advances in Cryptology - CRYPTO 2001, 21st Annual International Cryptology Conference, Santa Barbara, California, USA, August 19-23, 2001, Proceedings. 2001.

[51] Golic, J. D. and M. J. Mihaljevic (1991). A Generalized Correlation Attack on a Class of Stream Ciphers Based on the Levenshtein Distance. J. Cryptology, 3(3), 201–212. URL https://doi.org/10.1007/BF00196912.

[52] Gong, Z., S. Nikova, and Y. W. Law, KLEIN: A New Family of Lightweight
Block Ciphers. In RFID. Security and Privacy - 7th International Workshop,
RFIDSec 2011, Amherst, USA, June 26-28, 2011. 2011. URL https://doi.
org/10.1007/978-3-642-25286-0_1.

[53] Guo, J., T. Peyrin, A. Poschmann, and M. J. B. Robshaw (2012). The LED
Block Cipher. IACR Cryptology ePrint Archive, 2012, 600. URL http://
eprint.iacr.org/2012/600.

[54] Gupta, S. S., S. Maitra, W. Meier, G. Paul, and S. Sarkar, Dependence in IV-Related Bytes of RC4 Key Enhances Vulnerabilities in WPA. In Fast Software Encryption - 21st International Workshop, FSE 2014, London, UK, March 3-5, 2014. 2014. URL https://doi.org/10.1007/978-3-662-46706-0_18.

[55] Gupta, S. S., S. Maitra, G. Paul, and S. Sarkar (2014). (Non-)Random
Sequences from (Non-)Random Permutations - Analysis of RC4 Stream Ci-
pher. J. Cryptology, 27(1), 67–108. URL https://doi.org/10.1007/
s00145-012-9138-1.

[56] Hamann, M., M. Krause, and W. Meier (2017). LIZARD - A Lightweight Stream Cipher for Power-constrained Devices. IACR Trans. Symmetric Cryptol., 2017(1), 45–79. URL https://doi.org/10.13154/tosc.v2017.i1.45-79.

[57] Hamann, M., M. Krause, W. Meier, and B. Zhang (2017). Time-Memory-Data Tradeoff Attacks against Small-State Stream Ciphers. IACR Cryptology ePrint Archive, 2017, 384. URL http://eprint.iacr.org/2017/384.

[58] Hellman, M. E. (1980). A cryptanalytic time-memory trade-off. IEEE Trans. Information Theory, 26(4), 401–406. URL https://doi.org/10.1109/TIT.1980.1056220.

[59] Hoch, J. J. and A. Shamir, Fault Analysis of Stream Ciphers. In Cryptographic Hardware and Embedded Systems - CHES 2004: 6th International Workshop, Cambridge, MA, USA, August 11-13, 2004. Proceedings. 2004. URL https://doi.org/10.1007/978-3-540-28632-5_18.

[60] Hoffstein, J., J. Pipher, and J. H. Silverman, NTRU: A Ring-Based Public Key Cryptosystem. In Proceedings of ANTS’98, volume 1423 of Lecture Notes in Computer Science. 1998.

[61] Hojsík, M. and B. Rudolf, Differential Fault Analysis of Trivium. In Fast Soft-
ware Encryption, 15th International Workshop, FSE 2008, Lausanne, Switzer-
land, February 10-13, 2008. 2008. URL https://doi.org/10.1007/
978-3-540-71039-4_10.

[62] Hong, J. and P. Sarkar, New Applications of Time Memory Data Tradeoffs. In
Advances in Cryptology - ASIACRYPT 2005, 11th International Conference on
the Theory and Application of Cryptology and Information Security, Chennai,
India, December 4-8, 2005, Proceedings. 2005. URL https://doi.org/
10.1007/11593447_19.

[63] IEEE (1997). IEEE Standard for Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications. IEEE. Available at http://ieeexplore.ieee.org/document/654749/.

[64] IEEE (2004). Wireless LAN Medium Access Control (MAC) and Physical Layer (PHY) Specifications - Amendment 8: Medium Access Control (MAC) Quality of Service Enhancements. IEEE. Available at http://ieeexplore.ieee.org/document/1541572/versions.

[65] Isobe, T., T. Ohigashi, Y. Watanabe, and M. Morii, Full Plaintext Recovery
Attack on Broadcast RC4. In Fast Software Encryption - 20th International
Workshop, FSE 2013, Singapore, March 11-13, 2013. 2013. URL https://
doi.org/10.1007/978-3-662-43933-3_10.

[66] Jenkins, R. J. (1996). ISAAC and RC4. URL http://burtleburtle.
net/bob/rand/isaac.html.

[67] Jha, S., S. Banik, T. Isobe, and T. Ohigashi, Some Proofs of Joint Distributions
of Keystream Biases in RC4. In Progress in Cryptology - INDOCRYPT 2016 -
17th International Conference on Cryptology in India, Kolkata, India, Decem-
ber 11-14, 2016, Proceedings. 2016. URL https://doi.org/10.1007/
978-3-319-49890-4_17.

[68] Johansson, T. and F. Jönsson, Fast Correlation Attacks Based on Turbo Code
Techniques. In Advances in Cryptology - CRYPTO ’99, 19th Annual In-
ternational Cryptology Conference, Santa Barbara, California, USA, August
15-19, 1999, Proceedings. 1999. URL https://doi.org/10.1007/
3-540-48405-1_12.

[69] Katz, J. and Y. Lindell, Introduction to Modern Cryptography. CRC Press, 2007.

[70] Klein, A. (2008). Attacks on the RC4 stream cipher. Des. Codes
Cryptography, 48(3), 269–286. URL https://doi.org/10.1007/
s10623-008-9206-6.

[71] Knudsen, L. R., W. Meier, B. Preneel, V. Rijmen, and S. Verdoolaege, Analysis Methods for (Alleged) RC4. In Advances in Cryptology - ASIACRYPT ’98, International Conference on the Theory and Applications of Cryptology and Information Security, Beijing, China, October 18-22, 1998, Proceedings. 1998. URL https://doi.org/10.1007/3-540-49649-1_26.

[72] Koblitz, N. (1987). Elliptic curve cryptosystems. Mathematics of Computation, 48, 203–209.

[73] Lallemand, V. and M. Naya-Plasencia, Cryptanalysis of Full Sprout. In Advances in Cryptology - CRYPTO 2015 - 35th Annual Cryptology Conference, Santa Barbara, CA, USA, August 16-20, 2015, Proceedings, Part I. 2015. URL https://doi.org/10.1007/978-3-662-47989-6_32.

[74] Hamann, M., M. Krause, W. Meier, and B. Zhang (2017). Design and Analysis of Small-State Grain-like Stream Ciphers. Cryptography and Communications.

[75] Hamann, M., M. Krause, W. Meier, and B. Zhang (2017). On Stream Ciphers with Small State. URL https://www.cryptolux.org/mediawiki-esc2017/images/c/c2/Smallstate.pdf.

[76] Hell, M., T. Johansson, and W. Meier (2007). Grain: A Stream Cipher for Constrained Environments. IJWMC. URL http://dx.doi.org/10.1504/IJWMC.2007.013798.

[77] Maitra, S. (2016). Chosen IV cryptanalysis on reduced round ChaCha and Salsa.
Discrete Applied Mathematics, 208, 88–97. URL https://doi.org/10.
1016/j.dam.2016.02.020.

[78] Maitra, S. and G. Paul, New Form of Permutation Bias and Secret Key Leakage
in Keystream Bytes of RC4. In Fast Software Encryption, 15th International
Workshop, FSE 2008, Lausanne, Switzerland, February 10-13, 2008. 2008. URL
https://doi.org/10.1007/978-3-540-71039-4_16.

[79] Maitra, S., G. Paul, and S. S. Gupta, Attack on Broadcast RC4 Revisited.
In Fast Software Encryption - 18th International Workshop, FSE 2011, Lyn-
gby, Denmark, February 13-16, 2011. 2011. URL https://doi.org/10.
1007/978-3-642-21702-9_12.

[80] Maitra, S., G. Paul, and W. Meier (2015). Salsa20 Cryptanalysis: New Moves
and Revisiting Old Styles. IACR Cryptology ePrint Archive, 2015, 217. URL
http://eprint.iacr.org/2015/217.

[81] Maitra, S., S. Sarkar, A. Baksi, and P. Dey (2015). Key Recovery from State
Information of Sprout: Application to Cryptanalysis and Fault Attack. IACR
Cryptology ePrint Archive, 2015, 236. URL http://eprint.iacr.org/
2015/236.

[82] Mantin, I. (2001). Analysis of the stream cipher RC4. URL https://www.researchgate.net/publication/239062799_Analysis_of_the_Stream_Cipher_RC4.

[83] Mantin, I. and A. Shamir, A Practical Attack on Broadcast RC4. In Fast Soft-
ware Encryption, 8th International Workshop, FSE 2001 Yokohama, Japan, April
2-4, 2001. 2001. URL https://doi.org/10.1007/3-540-45473-X_
13.

[84] Maximov, A. and D. Khovratovich (2008). New State Recovery Attack on RC4. IACR Cryptology ePrint Archive, 2008, 17. URL http://eprint.iacr.org/2008/017.

[85] Meier, W. and O. Staffelbach (1989). Fast Correlation Attacks on Certain Stream Ciphers. J. Cryptology, 1(3), 159–176. URL https://doi.org/10.1007/BF02252874.

[86] Menezes, A. J., P. C. van Oorschot, and S. A. Vanstone, Handbook of Applied Cryptography. CRC Press, 2001. Available at http://www.cacr.math.uwaterloo.ca/hac/.

[87] Mikhalev, V., F. Armknecht, and C. Müller (2016). On Ciphers that Continu-
ously Access the Non-Volatile Key. IACR Trans. Symmetric Cryptol., 2016(2),
52–79. URL https://doi.org/10.13154/tosc.v2016.i2.52-79.

[88] Miller, V. S., Use of elliptic curves in cryptography. In Proceedings of Crypto’85, volume 218 of Lecture Notes in Computer Science. 1986.

[89] Mironov, I., (Not So) Random Shuffles of RC4. In Advances in Cryptology -
CRYPTO 2002, 22nd Annual International Cryptology Conference, Santa Bar-
bara, California, USA, August 18-22, 2002, Proceedings. 2002. URL https:
//doi.org/10.1007/3-540-45708-9_20.

[90] Mukhopadhyay, D., An Improved Fault Based Attack of the Advanced Encryp-
tion Standard. In Progress in Cryptology - AFRICACRYPT 2009, Second Inter-
national Conference on Cryptology in Africa, Gammarth, Tunisia, June 21-25,
2009. Proceedings. 2009.

[91] Naya-Plasencia, M., How to Improve Rebound Attacks. In Advances in Cryptology - CRYPTO 2011 - 31st Annual Cryptology Conference, Santa Barbara, CA, USA, August 14-18, 2011. 2011.

[92] Infosecurity Magazine (2014). Google Swaps Out Crypto Ciphers in OpenSSL. Available at www.infosecurity-magazine.com/news/google-swaps-out-crypto-ciphers-in-openssl/.

[93] Paterson, K. G., B. Poettering, and J. C. N. Schuldt, Big Bias Hunting in Ama-
zonia: Large-Scale Computation and Exploitation of RC4 Biases (Invited Paper).
In Advances in Cryptology - ASIACRYPT 2014 - 20th International Conference
on the Theory and Application of Cryptology and Information Security, Kaoshi-
ung, Taiwan, R.O.C., December 7-11, 2014. Proceedings, Part I. 2014. URL
https://doi.org/10.1007/978-3-662-45611-8_21.

[94] Paterson, K. G., B. Poettering, and J. C. N. Schuldt, Plaintext recovery attacks against WPA/TKIP. In Fast Software Encryption - 21st International Workshop, FSE 2014, London, UK, March 3-5, 2014. 2014.

[95] Paul, G. and S. Maitra, Permutation After RC4 Key Scheduling Reveals the
Secret Key. In Selected Areas in Cryptography, 14th International Workshop,
SAC 2007, Ottawa, Canada, August 16-17, 2007. 2007.

[97] Paul, G. and S. Ray (2015). On Data Complexity of Distinguishing Attacks vs. Message Recovery Attacks on Stream Ciphers. IACR Cryptology ePrint Archive, 2015, 1174. URL https://eprint.iacr.org/2015/1174.

[98] Paul, G. and S. Ray (2017). Analysis of Burn-in period for RC4 State Transition.
IACR Cryptology ePrint Archive, 2017, 175. URL http://eprint.iacr.
org/2017/175.

[99] Rivest, R. L. and J. C. N. Schuldt (2016). Spritz - a spongy RC4-like stream cipher and hash function. IACR Cryptology ePrint Archive, 2016, 856. URL http://eprint.iacr.org/2016/856.

[100] Rivest, R. L., A. Shamir, and L. M. Adleman (1978). A Method for Obtain-
ing Digital Signatures and Public-Key Cryptosystems. Communications of the
Association for Computing Machinery, 21(2), 120–126.

[101] Roos, A. (1995). A Class of Weak Keys in the RC4 Stream Cipher. URL http:
//www.impic.org/papers/WeakKeys-report.pdf.

[102] Saarinen, M. O., A Time-Memory Tradeoff Attack Against LILI-128. In
Fast Software Encryption, 9th International Workshop, FSE 2002, Leuven, Bel-
gium, February 4-6, 2002. 2002. URL https://doi.org/10.1007/
3-540-45661-9_18.

[103] SageMath. Sage: Open Source Mathematical Software. URL http://www.sagemath.org/.

[104] Sepehrdad, P., P. Susil, S. Vaudenay, and M. Vuagnoux, Smashing WEP in a passive attack. In Fast Software Encryption - 20th International Workshop, FSE 2013, Singapore, March 11-13, 2013. 2013. URL https://doi.org/10.1007/978-3-662-43933-3_9.

[105] Sepehrdad, P., P. Susil, S. Vaudenay, and M. Vuagnoux (2015). Tornado Attack on RC4 with Applications to WEP & WPA. IACR Cryptology ePrint Archive, 2015, 254. URL http://eprint.iacr.org/2015/254.

[106] Sepehrdad, P., S. Vaudenay, and M. Vuagnoux, Discovery and Exploitation of New Biases in RC4. In Selected Areas in Cryptography - 17th International Workshop, SAC 2010, Waterloo, Ontario, Canada, August 12-13, 2010. 2010. URL https://doi.org/10.1007/978-3-642-19574-7_5.

[107] Sepehrdad, P., S. Vaudenay, and M. Vuagnoux, Statistical Attack on RC4 - Distinguishing WPA. In Advances in Cryptology - EUROCRYPT 2011 - 30th Annual International Conference on the Theory and Applications of Cryptographic Techniques, Tallinn, Estonia, May 15-19, 2011. Proceedings. 2011. URL https://doi.org/10.1007/978-3-642-20465-4_20.

[108] Shannon, C. E. (1949). Communication theory of secrecy systems. Bell System Technical Journal, 28(4), 656–715.

[109] Shi, Z., B. Zhang, D. Feng, and W. Wu, Improved Key Recovery At-
tacks on Reduced-Round Salsa20 and ChaCha. In Information Security
and Cryptology - ICISC 2012 - 15th International Conference, Seoul, Ko-
rea, November 28-30, 2012. 2012. URL https://doi.org/10.1007/
978-3-642-37682-5_24.

[110] Shirai, T., K. Shibutani, T. Akishita, S. Moriai, and T. Iwata, The 128-bit
blockcipher CLEFIA (extended abstract). In Fast Software Encryption, 14th
International Workshop, FSE 2007, Luxembourg, Luxembourg, March 26-28,
2007. 2007.

[111] Stinson, D. R., Cryptography: Theory and Practice. CRC Press, third edition, 1995.

[112] Suzaki, T., K. Minematsu, S. Morioka, and E. Kobayashi, TWINE: A Lightweight Block Cipher for Multiple Platforms. In Selected Areas in Cryptography, 19th International Conference, SAC 2012, Windsor, ON, Canada, August 15-16, 2012. 2012. URL https://doi.org/10.1007/978-3-642-35999-6_22.

[113] Vanhoef, M. and F. Piessens, All Your Biases Belong to Us: Breaking RC4 in WPA-TKIP and TLS. In 2016 USENIX Annual Technical Conference, USENIX ATC 2016, Denver, CO, USA, June 22-24, 2016. 2016. URL https://www.usenix.org/conference/atc16/technical-sessions/presentation/vanhoef.

[114] Wu, H., The Stream Cipher HC-128. In New Stream Cipher Designs - The
eSTREAM Finalists. 2008, 39–47.

[115] Wu, H. and B. Preneel, Differential Cryptanalysis of the Stream Ciphers Py,
Py6 and Pypy. In Advances in Cryptology - EUROCRYPT 2007, 26th An-
nual International Conference on the Theory and Applications of Cryptographic
Techniques, Barcelona, Spain, May 20-24, 2007, Proceedings. 2007. URL
https://doi.org/10.1007/978-3-540-72540-4_16.

[116] Wu, H. and B. Preneel, Differential-Linear Attacks Against the Stream Cipher
Phelix. In Fast Software Encryption, 14th International Workshop, FSE 2007,
Luxembourg, Luxembourg, March 26-28, 2007. 2007. URL https://doi.
org/10.1007/978-3-540-74619-5_6.

[117] Wu, W. and L. Zhang, LBlock: A Lightweight Block Cipher. In Applied Cryptography and Network Security - 9th International Conference, ACNS 2011, Nerja, Spain, June 7-10, 2011. Proceedings. 2011. URL https://doi.org/10.1007/978-3-642-21554-4_19.

[118] Tsunoo, Y., T. Saito, H. Kubo, T. Suzaki, and H. Nakashima, Differential Cryptanalysis of Salsa20/8. 2007. URL http://www.ecrypt.eu.org/stream/papersdir/2007/010.pdf.

[119] Zhang, B. and X. Gong, Another Tradeoff Attack on Sprout-Like Stream Ci-
phers. In Advances in Cryptology - ASIACRYPT 2015 - 21st International Con-
ference on the Theory and Application of Cryptology and Information Security,
Auckland, New Zealand, November 29 - December 3, 2015, Proceedings, Part II.
2015. URL https://doi.org/10.1007/978-3-662-48800-3_23.

[120] Zhang, B., X. Gong, and W. Meier (2017). Fast Correlation Attacks on Grain-
like Small State Stream Ciphers. IACR Trans. Symmetric Cryptol., 2017(4), 58–
81. URL https://doi.org/10.13154/tosc.v2017.i4.58-81.

LIST OF PAPERS BASED ON THESIS

Journal Publication

• Sabyasachi Dey and Santanu Sarkar. Improved analysis for reduced round Salsa
and Chacha. Discrete Applied Mathematics. Volume 227, pp. 58–69, 2017.

• Sabyasachi Dey and Santanu Sarkar. Generalization of Roos bias in RC4 and
some results on key-keystream relations. Accepted in Journal of Mathematical
Cryptology.

Workshop Publication

• Sabyasachi Dey and Santanu Sarkar. Cryptanalysis of full round Fruit. The Tenth
International Workshop on Coding and Cryptography (WCC) 2017 September
18-22, 2017 Saint-Petersburg, Russia.

Preprint

• Sabyasachi Dey and Santanu Sarkar. How to assign values to the PNBs in Chacha
and Salsa. (Communicated for publication).

• Sabyasachi Dey and Santanu Sarkar. Settling the mystery of Z_r = r in RC4. (Communicated for publication).
