0% found this document useful (0 votes)
4 views60 pages

Chapter 2

This document is a preface to a graduate course on applied cryptography, emphasizing the importance of cryptography in securing information systems. It outlines the book's structure, which includes symmetric encryption, public-key encryption, and cryptographic protocols, and highlights the need for mathematical proofs to ensure security. The book is designed for both beginners and advanced readers, providing a comprehensive understanding of cryptographic systems and their practical applications.

Uploaded by

ktnkings09042004
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
4 views60 pages

Chapter 2

This document is a preface to a graduate course on applied cryptography, emphasizing the importance of cryptography in securing information systems. It outlines the book's structure, which includes symmetric encryption, public-key encryption, and cryptographic protocols, and highlights the need for mathematical proofs to ensure security. The book is designed for both beginners and advanced readers, providing a comprehensive understanding of cryptographic systems and their practical applications.

Uploaded by

ktnkings09042004
Copyright
© All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd

A Graduate Course in Applied Cryptography

Dan Boneh and Victor Shoup

Version 0.6, Jan. 2023


Preface

Cryptography is an indispensable tool used to protect information in computing systems. It is


used everywhere and by billions of people worldwide on a daily basis. It is used to protect data at
rest and data in motion. Cryptographic systems are an integral part of standard protocols, most
notably the Transport Layer Security (TLS) protocol, making it relatively easy to incorporate
strong encryption into a wide range of applications.
While extremely useful, cryptography is also highly brittle. The most secure cryptographic
system can be rendered completely insecure by a single specification or programming error. No
amount of unit testing will uncover a security vulnerability in a cryptosystem.
Instead, to argue that a cryptosystem is secure, we rely on mathematical modeling and proofs
to show that a particular system satisfies the security properties attributed to it. We often need to
introduce certain plausible assumptions to push our security arguments through.
This book is about exactly that: constructing practical cryptosystems for which we can argue
security under plausible assumptions. The book covers many constructions for different tasks in
cryptography. For each task we define a precise security goal that we aim to achieve and then
present constructions that achieve the required goal. To analyze the constructions, we develop a
unified framework for doing cryptographic proofs. A reader who masters this framework will be
capable of applying it to new constructions that may not be covered in the book.
Throughout the book we present many case studies to survey how deployed systems operate.
We describe common mistakes to avoid as well as attacks on real-world systems that illustrate the
importance of rigor in cryptography. We end every chapter with a fun application that applies the
ideas in the chapter in some unexpected way.

Intended audience and how to use this book


The book is intended to be self contained. Some supplementary material covering basic facts from
probability theory and algebra is provided in the appendices. The book is divided into three parts.
• Part I develops symmetric encryption which explains how two parties, Alice and Bob, can
securely exchange information when they have a shared key unknown to the attacker. We
discuss data confidentiality, data integrity, and the important concept of authenticated en-
cryption.

• Part II develops the concepts of public-key encryption and digital signatures, which allow
Alice and Bob to communicate securely, without having a pre-shared secret key.

• Part III is about cryptographic protocols, such as protocols for user identification, key ex-
change, zero knowledge, and secure computation.

ii
A beginning reader can read though the book to learn how cryptographic systems work and
why they are secure. Every security theorem in the book is followed by a proof idea that explains
at a high level why the scheme is secure. On a first read one can skip over the detailed proofs
without losing continuity. A beginning reader may also skip over the mathematical details sections
that explore nuances of certain definitions.
An advanced reader may enjoy reading the detailed proofs to learn how to do proofs in cryp-
tography. At the end of every chapter you will find many exercises that explore additional aspects
of the material covered in the chapter. Some exercises rehearse what was learned, but many ex-
ercises expand on the material and present additional ideas that are not covered in the body. We
recommend that readers read through the exercises, even if they do not intend to solve them.

Status of the book


The current draft is mostly complete, although there are a few missing sections here and there.
Those sections, as well as the appendices, are forthcoming. We hope you enjoy this write-up. Please
send us comments and let us know if you find typos or mistakes. We are very grateful to all the
readers who have already sent us comments.

Citations: While the current draft is mostly complete, we have not yet included citations and
references to the many works on which this book is based. Those will be coming soon and will be
presented in the Notes section at the end of every chapter.

Dan Boneh and Victor Shoup


Jan. 2023

iii
Contents

1 Introduction 1
1.1 Historic ciphers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.2 Terminology used throughout the book . . . . . . . . . . . . . . . . . . . . . . . . . 1

I Secret key cryptography 3

2 Encryption 4
2.1 Shannon ciphers and perfect security . . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.1.1 Definition of a Shannon cipher . . . . . . . . . . . . . . . . . . . . . . . . . 4
2.1.2 Perfect security . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.1.3 The bad news . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
2.2 Computational ciphers and semantic security . . . . . . . . . . . . . . . . . . . . . . 13
2.2.1 Definition of a computational cipher . . . . . . . . . . . . . . . . . . . . . . 13
2.2.2 Definition of semantic security . . . . . . . . . . . . . . . . . . . . . . . . . 15
2.2.3 Connections to weaker notions of security . . . . . . . . . . . . . . . . . . . 18
2.2.4 Consequences of semantic security . . . . . . . . . . . . . . . . . . . . . . . 22
2.2.5 Bit guessing: an alternative characterization of semantic security . . . . . . 25
2.3 Mathematical details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.3.1 Negligible, super-poly, and poly-bounded functions . . . . . . . . . . . . . . 28
2.3.2 Computational ciphers: the formalities . . . . . . . . . . . . . . . . . . . . . 29
2.3.3 Efficient adversaries and attack games . . . . . . . . . . . . . . . . . . . . . 32
2.3.4 Semantic security: the formalities . . . . . . . . . . . . . . . . . . . . . . . . 34
2.4 A fun application: anonymous routing . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.5 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
2.6 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

3 Stream ciphers 45
3.1 Pseudo-random generators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.1.1 Definition of a pseudo-random generator . . . . . . . . . . . . . . . . . . . . 46
3.1.2 Mathematical details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.2 Stream ciphers: encryption with a PRG . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.3 Stream cipher limitations: attacks on the one time pad . . . . . . . . . . . . . . . . 52
3.3.1 The two-time pad is insecure . . . . . . . . . . . . . . . . . . . . . . . . . . 53
3.3.2 The one-time pad is malleable . . . . . . . . . . . . . . . . . . . . . . . . . . 53

iv
3.4 Composing PRGs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
3.4.1 A parallel construction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
3.4.2 A sequential construction: the Blum-Micali method . . . . . . . . . . . . . 59
3.4.3 Mathematical details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
3.5 The next bit test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
3.6 Case study: the Salsa and ChaCha PRGs . . . . . . . . . . . . . . . . . . . . . . . . 67
3.7 Case study: linear generators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
3.7.1 An example cryptanalysis: the linear congruential generator . . . . . . . . . 70
3.7.2 The subset sum generator . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
3.8 Case study: cryptanalysis of the DVD encryption system . . . . . . . . . . . . . . . 74
3.9 Case study: cryptanalysis of the RC4 stream cipher . . . . . . . . . . . . . . . . . . 77
3.9.1 Security of RC4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
3.10 Generating random bits in practice . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
3.11 A broader perspective: computational and statistical indistinguishability . . . . . . 82
3.11.1 Mathematical details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
3.12 A fun application: coin flipping and bit commitment . . . . . . . . . . . . . . . . . . 88
3.13 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
3.14 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89

4 Block ciphers 96
4.1 Block ciphers: basic definitions and properties . . . . . . . . . . . . . . . . . . . . . 96
4.1.1 Some implications of security . . . . . . . . . . . . . . . . . . . . . . . . . . 98
4.1.2 Efficient implementation of random permutations . . . . . . . . . . . . . . . 101
4.1.3 Strongly secure block ciphers . . . . . . . . . . . . . . . . . . . . . . . . . . 101
4.1.4 Using a block cipher directly for encryption . . . . . . . . . . . . . . . . . . 102
4.1.5 Mathematical details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
4.2 Constructing block ciphers in practice . . . . . . . . . . . . . . . . . . . . . . . . . . 107
4.2.1 Case study: DES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
4.2.2 Exhaustive search on DES: the DES challenges . . . . . . . . . . . . . . . . 113
4.2.3 Strengthening ciphers against exhaustive search: the 3E construction . . . . 115
4.2.4 Case study: AES . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
4.3 Sophisticated attacks on block ciphers . . . . . . . . . . . . . . . . . . . . . . . . . . 122
4.3.1 Algorithmic attacks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
4.3.2 Side-channel attacks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
4.3.3 Fault injection attacks on AES . . . . . . . . . . . . . . . . . . . . . . . . . 130
4.3.4 Quantum exhaustive search attacks . . . . . . . . . . . . . . . . . . . . . . . 131
4.4 Pseudo-random functions: basic definitions and properties . . . . . . . . . . . . . . 132
4.4.1 Definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132
4.4.2 Efficient implementation of random functions . . . . . . . . . . . . . . . . . 133
4.4.3 When is a secure block cipher a secure PRF? . . . . . . . . . . . . . . . . . 134
4.4.4 Constructing PRGs from PRFs . . . . . . . . . . . . . . . . . . . . . . . . . 138
4.4.5 Mathematical details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
4.5 Constructing block ciphers from PRFs . . . . . . . . . . . . . . . . . . . . . . . . . . 141
4.6 The tree construction: from PRGs to PRFs . . . . . . . . . . . . . . . . . . . . . . . 147
4.6.1 Variable length tree construction . . . . . . . . . . . . . . . . . . . . . . . . 151
4.7 The ideal cipher model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154

v
4.7.1 Formal definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 154
4.7.2 Exhaustive search in the ideal cipher model . . . . . . . . . . . . . . . . . . 155
4.7.3 The Even-Mansour block cipher and the EX construction . . . . . . . . . . 158
4.7.4 Proof of the Even-Mansour and EX theorems . . . . . . . . . . . . . . . . . 159
4.8 A fun application: comparing information without revealing it . . . . . . . . . . . . 165
4.9 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167
4.10 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 167

5 Chosen Plaintext Attack 177


5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
5.2 Security against multi-key attacks . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179
5.3 Semantic security against chosen plaintext attack . . . . . . . . . . . . . . . . . . . 181
5.4 Building CPA secure ciphers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
5.4.1 A generic hybrid construction . . . . . . . . . . . . . . . . . . . . . . . . . . 183
5.4.2 Randomized counter mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
5.4.3 CBC mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
5.4.4 Case study: CBC padding in TLS 1.0 . . . . . . . . . . . . . . . . . . . . . 199
5.4.5 Concrete parameters and a comparison of counter and CBC modes . . . . . 199
5.5 Nonce-based encryption . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 201
5.5.1 Nonce-based generic hybrid encryption . . . . . . . . . . . . . . . . . . . . . 203
5.5.2 Nonce-based Counter mode . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
5.5.3 Nonce-based CBC mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
5.6 A fun application: revocable broadcast encryption . . . . . . . . . . . . . . . . . . . 205
5.7 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208
5.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208

6 Message integrity 215


6.1 Definition of a message authentication code . . . . . . . . . . . . . . . . . . . . . . . 217
6.1.1 Mathematical details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 220
6.2 MAC verification queries do not help the attacker . . . . . . . . . . . . . . . . . . . 220
6.3 Constructing MACs from PRFs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 223
6.4 Prefix-free PRFs for long messages . . . . . . . . . . . . . . . . . . . . . . . . . . . . 225
6.4.1 The CBC prefix-free secure PRF . . . . . . . . . . . . . . . . . . . . . . . . 226
6.4.2 The cascade prefix-free secure PRF . . . . . . . . . . . . . . . . . . . . . . . 229
6.4.3 Extension attacks: CBC and cascade are insecure MACs . . . . . . . . . . . 230
6.5 From prefix-free secure PRF to fully secure PRF (method 1): encrypted PRF . . . 231
6.5.1 ECBC and NMAC: MACs for variable length inputs . . . . . . . . . . . . . 232
6.6 From prefix-free secure PRF to fully secure PRF (method 2): prefix-free encodings . 235
6.6.1 Prefix free encodings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 235
6.7 From prefix-free secure PRF to fully secure PRF (method 3): CMAC . . . . . . . . 236
6.8 Converting a block-wise PRF to bit-wise PRF . . . . . . . . . . . . . . . . . . . . . 239
6.9 Case study: ANSI CBC-MAC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 240
6.10 Case study: CMAC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241
6.11 PMAC: a parallel MAC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 243
6.12 A fun application: searching on encrypted data . . . . . . . . . . . . . . . . . . . . . 245
6.13 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 245

vi
6.14 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 246

7 Message integrity from universal hashing 252


7.1 Universal hash functions (UHFs) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 252
7.1.1 Multi-query UHFs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 254
7.1.2 Mathematical details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255
7.2 Constructing UHFs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 255
7.2.1 Construction 1: UHFs using polynomials . . . . . . . . . . . . . . . . . . . 255
7.2.2 Construction 2: CBC and cascade are computational UHFs . . . . . . . . . 258
7.2.3 Construction 3: a parallel UHF from a small PRF . . . . . . . . . . . . . . 260
7.3 PRF(UHF) composition: constructing MACs using UHFs . . . . . . . . . . . . . . . 262
7.3.1 Using PRF(UHF) composition: ECBC and NMAC security . . . . . . . . . 265
7.3.2 Using PRF(UHF) composition with polynomial UHFs . . . . . . . . . . . . 265
7.3.3 Using PRF(UHF) composition: PMAC0 security . . . . . . . . . . . . . . . 266
7.4 The Carter-Wegman MAC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 266
7.4.1 Using Carter-Wegman with polynomial UHFs . . . . . . . . . . . . . . . . . 273
7.5 Nonce-based MACs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 273
7.5.1 Secure nonce-based MACs . . . . . . . . . . . . . . . . . . . . . . . . . . . . 274
7.6 Unconditionally secure one-time MACs . . . . . . . . . . . . . . . . . . . . . . . . . 275
7.6.1 Pairwise unpredictable functions . . . . . . . . . . . . . . . . . . . . . . . . 275
7.6.2 Building unpredictable functions . . . . . . . . . . . . . . . . . . . . . . . . 275
7.6.3 From PUFs to unconditionally secure one-time MACs . . . . . . . . . . . . 276
7.7 A fun application: timing attacks . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276
7.8 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 277
7.9 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 277

8 Message integrity from collision resistant hashing 287


8.1 Definition of collision resistant hashing . . . . . . . . . . . . . . . . . . . . . . . . . 290
8.1.1 Mathematical details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 290
8.2 Building a MAC for large messages . . . . . . . . . . . . . . . . . . . . . . . . . . . 291
8.3 Birthday attacks on collision resistant hash functions . . . . . . . . . . . . . . . . . 293
8.4 The Merkle-Damgård paradigm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 295
8.4.1 Joux’s attack . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 298
8.5 Building Compression Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 299
8.5.1 A simple but inefficient compression function . . . . . . . . . . . . . . . . . 299
8.5.2 Davies-Meyer compression functions . . . . . . . . . . . . . . . . . . . . . . 300
8.5.3 Collision resistance of Davies-Meyer . . . . . . . . . . . . . . . . . . . . . . 301
8.6 Case study: SHA256 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 303
8.6.1 Other Merkle-Damgård hash functions . . . . . . . . . . . . . . . . . . . . . 305
8.7 Case study: HMAC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 307
8.7.1 Security of two-key nest . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 308
8.7.2 The HMAC standard . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 309
8.7.3 Davies-Meyer is a secure PRF in the ideal cipher model . . . . . . . . . . . 310
8.8 The Sponge Construction and SHA3 . . . . . . . . . . . . . . . . . . . . . . . . . . . 313
8.8.1 The sponge construction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 314
8.8.2 Case study: SHA3, SHAKE128, and SHAKE256 . . . . . . . . . . . . . . . 319

vii
8.9 Merkle trees: proving properties of a hashed list . . . . . . . . . . . . . . . . . . . . 320
8.9.1 Authenticated data structures . . . . . . . . . . . . . . . . . . . . . . . . . . 323
8.10 Key derivation and the random oracle model . . . . . . . . . . . . . . . . . . . . . . 324
8.10.1 The key derivation problem . . . . . . . . . . . . . . . . . . . . . . . . . . . 325
8.10.2 Random oracles: a useful heuristic . . . . . . . . . . . . . . . . . . . . . . . 327
8.10.3 Random oracles: safe modes of operation . . . . . . . . . . . . . . . . . . . 332
8.10.4 The leftover hash lemma . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 334
8.10.5 Case study: HKDF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 335
8.11 Security without collision resistance . . . . . . . . . . . . . . . . . . . . . . . . . . . 336
8.11.1 Second preimage resistance . . . . . . . . . . . . . . . . . . . . . . . . . . . 336
8.11.2 Randomized hash functions: target collision resistance . . . . . . . . . . . . 337
8.11.3 TCR from 2nd-preimage resistance . . . . . . . . . . . . . . . . . . . . . . . 338
8.11.4 Using target collision resistance . . . . . . . . . . . . . . . . . . . . . . . . . 341
8.12 A fun application: commitments and auctions . . . . . . . . . . . . . . . . . . . . . 343
8.13 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 347
8.14 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 348

9 Authenticated Encryption 357


9.1 Authenticated encryption: definitions . . . . . . . . . . . . . . . . . . . . . . . . . . 358
9.1.1 One-time authenticated encryption . . . . . . . . . . . . . . . . . . . . . . . 359
9.2 Implications of authenticated encryption . . . . . . . . . . . . . . . . . . . . . . . . 360
9.2.1 Chosen ciphertext attacks: a motivating example . . . . . . . . . . . . . . . 360
9.2.2 Chosen ciphertext attacks: definition . . . . . . . . . . . . . . . . . . . . . . 362
9.2.3 Authenticated encryption implies chosen ciphertext security . . . . . . . . . 363
9.3 Encryption as an abstract interface . . . . . . . . . . . . . . . . . . . . . . . . . . . 365
9.4 Authenticated encryption ciphers from generic composition . . . . . . . . . . . . . . 367
9.4.1 Encrypt-then-MAC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 367
9.4.2 MAC-then-encrypt is not generally secure: padding oracle attacks on SSL . 369
9.4.3 More padding oracle attacks. . . . . . . . . . . . . . . . . . . . . . . . . . . 372
9.4.4 Secure instances of MAC-then-encrypt . . . . . . . . . . . . . . . . . . . . . 373
9.4.5 Encrypt-then-MAC or MAC-then-encrypt? . . . . . . . . . . . . . . . . . . 377
9.5 Nonce-based authenticated encryption with associated data . . . . . . . . . . . . . . 377
9.6 One more variation: CCA-secure ciphers with associated data . . . . . . . . . . . . 380
9.7 Case study: Galois counter mode (GCM) . . . . . . . . . . . . . . . . . . . . . . . . 381
9.8 Case study: the TLS 1.3 record protocol . . . . . . . . . . . . . . . . . . . . . . . . 383
9.9 Case study: an attack on non-atomic decryption in SSH . . . . . . . . . . . . . . . . 386
9.10 Case study: 802.11b WEP, a badly broken system . . . . . . . . . . . . . . . . . . . 389
9.11 A fun application: private information retrieval . . . . . . . . . . . . . . . . . . . . . 391
9.12 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 391
9.13 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 391

II Public key cryptography 398

10 Public key tools 399


10.1 A toy problem: anonymous key exchange . . . . . . . . . . . . . . . . . . . . . . . . 399

viii
10.2 One-way trapdoor functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 400
10.2.1 Key exchange using a one-way trapdoor function scheme . . . . . . . . . . . 401
10.2.2 Mathematical details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 402
10.3 A trapdoor permutation scheme based on RSA . . . . . . . . . . . . . . . . . . . . . 403
10.3.1 Key exchange based on the RSA assumption . . . . . . . . . . . . . . . . . 405
10.3.2 Mathematical details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 405
10.4 Diffie-Hellman key exchange . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 406
10.4.1 The key exchange protocol . . . . . . . . . . . . . . . . . . . . . . . . . . . 407
10.4.2 Security of Diffie-Hellman key exchange . . . . . . . . . . . . . . . . . . . . 407
10.5 Discrete logarithm and related assumptions . . . . . . . . . . . . . . . . . . . . . . . 408
10.5.1 Random self-reducibility . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 411
10.5.2 Mathematical details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 413
10.6 Collision resistant hash functions from number-theoretic primitives . . . . . . . . . . 414
10.6.1 Collision resistance based on DL . . . . . . . . . . . . . . . . . . . . . . . . 414
10.6.2 Collision resistance based on RSA . . . . . . . . . . . . . . . . . . . . . . . 415
10.7 Attacks on the anonymous Diffie-Hellman protocol . . . . . . . . . . . . . . . . . . . 417
10.8 Merkle puzzles: a partial solution to key exchange using block ciphers . . . . . . . . 418
10.9 A fun application: accumulators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 420
10.10 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 423
10.11 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 424

11 Public key encryption 434


11.1 Two further example applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . 435
11.1.1 Sharing encrypted files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 435
11.1.2 Key escrow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 435
11.2 Basic definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 436
11.2.1 Mathematical details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 437
11.3 Implications of semantic security . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 438
11.3.1 The need for randomized encryption . . . . . . . . . . . . . . . . . . . . . . 438
11.3.2 Semantic security against chosen plaintext attack . . . . . . . . . . . . . . . 439
11.4 Encryption based on a trapdoor function scheme . . . . . . . . . . . . . . . . . . . . 441
11.4.1 Instantiating ETDF with RSA . . . . . . . . . . . . . . . . . . . . . . . . . . 444
11.5 ElGamal encryption . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 445
11.5.1 Semantic security of ElGamal in the random oracle model . . . . . . . . . . 446
11.5.2 Semantic security of ElGamal without random oracles . . . . . . . . . . . . 448
11.6 A fun application: oblivious transfer based on Diffie-Hellman . . . . . . . . . . . . . 451
11.6.1 A secure OT from ElGamal encryption . . . . . . . . . . . . . . . . . . . . . 452
11.6.2 Adaptive oblivious transfer . . . . . . . . . . . . . . . . . . . . . . . . . . . 453
11.6.3 Oblivious PRFs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 454
11.6.4 A simple adaptive OT from an oblivious PRF . . . . . . . . . . . . . . . . . 456
11.7 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 457
11.8 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 458

ix
12 Chosen ciphertext secure public key encryption 467
12.1 Basic definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 467
12.2 Understanding CCA security . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 469
12.2.1 CCA security and ciphertext malleability . . . . . . . . . . . . . . . . . . . 469
12.2.2 CCA security vs. authentication . . . . . . . . . . . . . . . . . . . . . . . . 470
12.2.3 CCA security and key escrow . . . . . . . . . . . . . . . . . . . . . . . . . . 471
12.2.4 Encryption as an abstract interface . . . . . . . . . . . . . . . . . . . . . . . 472
12.3 CCA-secure encryption from trapdoor function schemes . . . . . . . . . . . . . . . . 474
0
12.3.1 Instantiating ETDF with RSA . . . . . . . . . . . . . . . . . . . . . . . . . . 479
12.4 CCA-secure ElGamal encryption . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 479
12.5 CCA security from DDH without random oracles . . . . . . . . . . . . . . . . . . . 484
12.5.1 Universal projective hash functions . . . . . . . . . . . . . . . . . . . . . . . 484
12.5.2 Universal2 projective hash functions . . . . . . . . . . . . . . . . . . . . . . 487
12.5.3 The ECS scheme . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 488
12.6 CCA security via a generic transformation . . . . . . . . . . . . . . . . . . . . . . . 494
12.6.1 A generic instantiation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 499
12.6.2 A concrete instantiation with ElGamal . . . . . . . . . . . . . . . . . . . . . 499
12.7 CCA-secure public-key encryption with associated data . . . . . . . . . . . . . . . . 501
12.7.1 AD-only CCA security . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 502
12.8 Case study: PKCS1, OAEP, OAEP+, and SAEP . . . . . . . . . . . . . . . . . . . 503
12.8.1 Padding schemes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 504
12.8.2 PKCS1 padding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 504
12.8.3 Bleichenbacher’s attack on the RSA-PKCS1 encryption scheme . . . . . . . 505
12.8.4 Optimal Asymmetric Encryption Padding (OAEP) . . . . . . . . . . . . . . 508
12.8.5 OAEP+ and SAEP+ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 510
12.9 A fun application: private set intersection . . . . . . . . . . . . . . . . . . . . . . . . 511
12.10 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 512
12.11 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 512

13 Digital signatures 526


13.1 Definition of a digital signature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 528
13.1.1 Secure signatures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 529
13.1.2 Mathematical details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 532
13.2 Extending the message space with collision resistant hashing . . . . . . . . . . . . . 532
13.2.1 Extending the message space using TCR functions . . . . . . . . . . . . . . 533
13.3 Signatures from trapdoor permutations: the full domain hash . . . . . . . . . . . . . 534
13.3.1 Signatures based on the RSA trapdoor permutation . . . . . . . . . . . . . 536
13.4 Security analysis of full domain hash . . . . . . . . . . . . . . . . . . . . . . . . . . . 538
13.4.1 Repeated one-way functions: a useful lemma . . . . . . . . . . . . . . . . . 538
13.4.2 Proofs of Theorems 13.3 and 13.4 . . . . . . . . . . . . . . . . . . . . . . . . 543
13.5 An RSA-based signature scheme with a tight security proof . . . . . . . . . . . . . . 544
13.6 Case study: PKCS1 signatures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 546
13.6.1 Bleichenbacher’s attack on PKCS1 signatures . . . . . . . . . . . . . . . . . 548
13.7 Signcryption: combining signatures and encryption . . . . . . . . . . . . . . . . . . 549
13.7.1 Secure signcryption . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 551
13.7.2 Signcryption as an abstract interface . . . . . . . . . . . . . . . . . . . . . . 554

x
13.7.3 Constructions: encrypt-then-sign and sign-then-encrypt . . . . . . . . . . . 556
13.7.4 A construction based on Diffie-Hellman key exchange . . . . . . . . . . . . . 560
13.7.5 Additional desirable properties: forward secrecy and non-repudiation . . . . 562
13.8 Certificates and the public-key infrastructure . . . . . . . . . . . . . . . . . . . . . . 566
13.8.1 Coping with malicious or negligent certificate authorities . . . . . . . . . . . 568
13.8.2 Certificate revocation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 571
13.9 Case study: legal aspects of digital signatures . . . . . . . . . . . . . . . . . . . . . . 573
13.10 A fun application: forward secure signatures . . . . . . . . . . . . . . . . . . . . . . 574
13.11 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 574
13.12 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 575

14 Fast hash-based signatures 583


14.1 Basic Lamport signatures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 584
14.1.1 Shrinking the signature using an enhanced TCR . . . . . . . . . . . . . . . 585
14.2 A general Lamport framework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 586
14.2.1 An explicit containment free function . . . . . . . . . . . . . . . . . . . . . 588
14.3 Winternitz one-time signatures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 589
14.3.1 A domination free function for Winternitz signatures . . . . . . . . . . . . . 592
14.4 HORS: short Lamport signatures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 593
14.4.1 Shrinking the public-key using a Merkle tree . . . . . . . . . . . . . . . . . 594
14.5 Applications of one-time signatures . . . . . . . . . . . . . . . . . . . . . . . . . . . 595
14.5.1 Online/offline signatures from one-time signatures . . . . . . . . . . . . . . 595
14.5.2 Authenticating streamed data with one-time signatures . . . . . . . . . . . 596
14.6 From one-time signatures to many-time signatures . . . . . . . . . . . . . . . . . . . 596
14.6.1 Indexed signatures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 597
14.6.2 A many-time signature scheme from an indexed signature . . . . . . . . . . 598
14.6.3 The complete Merkle stateless signature system . . . . . . . . . . . . . . . . 600
14.6.4 Nonce-based Merkle signatures . . . . . . . . . . . . . . . . . . . . . . . . . 602
14.7 A fun application: the TESLA broadcast MAC . . . . . . . . . . . . . . . . . . . . . 603
14.8 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 606
14.9 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 606

15 Elliptic curve cryptography and pairings 615


15.1 The group of points of an elliptic curve . . . . . . . . . . . . . . . . . . . . . . . . . 615
15.2 Elliptic curves over finite fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 617
15.2.1 Montgomery and Edwards curves . . . . . . . . . . . . . . . . . . . . . . . . 618
15.3 Elliptic curve cryptography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 619
15.3.1 The curves secp256r1 and secp256k1 . . . . . . . . . . . . . . . . . . . . . . 620
15.3.2 A security twist . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 623
15.3.3 Curve25519 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 624
15.4 Pairing based cryptography . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 624
15.5 Signature schemes from pairings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 627
15.5.1 The BLS signature scheme . . . . . . . . . . . . . . . . . . . . . . . . . . . . 627
15.5.2 Signature aggregation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 629
15.5.3 Secure BLS aggregation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 633
15.5.4 Signature schemes secure without random oracles . . . . . . . . . . . . . . . 639

xi
15.6 Advanced encryption schemes from pairings . . . . . . . . . . . . . . . . . . . . . . . 643
15.6.1 Identity based encryption . . . . . . . . . . . . . . . . . . . . . . . . . . . . 643
15.6.2 Related security notions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 646
15.6.3 Identity based encryption from pairings . . . . . . . . . . . . . . . . . . . . 648
15.6.4 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 654
15.7 The functional encryption paradigm . . . . . . . . . . . . . . . . . . . . . . . . . . . 658
15.7.1 Sample functional encryption schemes from pairings . . . . . . . . . . . . . 662
15.7.2 Variations on functional encryption . . . . . . . . . . . . . . . . . . . . . . . 665
15.8 Multilinear maps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 666
15.9 A fun application: fair exchange of signatures . . . . . . . . . . . . . . . . . . . . . 668
15.10 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 671
15.11 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 671

16 Attacks on number theoretic assumptions 682


16.1 Analyzing the DL, CDH, and DDH assumptions . . . . . . . . . . . . . . . . . . . . 682
16.1.1 Square root time algorithms for discrete log . . . . . . . . . . . . . . . . . . 682
16.1.2 Discrete log in groups of composite order . . . . . . . . . . . . . . . . . . . 684
16.1.3 Information leakage in composite order groups . . . . . . . . . . . . . . . . 686
16.1.4 An attack on static Diffie-Hellman . . . . . . . . . . . . . . . . . . . . . . . 688
16.1.5 The relation between DL, CDH, and DDH . . . . . . . . . . . . . . . . . . . 690
16.2 Discrete log in Z∗p : the general number field sieve . . . . . . . . . . . . . . . . . . . . 690
16.2.1 Discrete log records in Z∗p . . . . . . . . . . . . . . . . . . . . . . . . . . . . 691
16.2.2 A preprocessing attack on discrete log in Z∗p . . . . . . . . . . . . . . . . . . 692
16.3 A lower bound on discrete log in generic prime order groups . . . . . . . . . . . . . 692
16.4 Analyzing the factoring and RSA assumptions . . . . . . . . . . . . . . . . . . . . . 696
16.4.1 Factoring algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 696
16.4.2 Attacks on RSA resulting from poor key generation . . . . . . . . . . . . . 698
16.4.3 A fault injection attack on optimized RSA . . . . . . . . . . . . . . . . . . . 701
16.4.4 An attack on low secret exponent RSA . . . . . . . . . . . . . . . . . . . . . 702
16.5 Quantum attacks on factoring and discrete log . . . . . . . . . . . . . . . . . . . . . 705
16.6 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 707
16.7 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 707

17 Post-quantum cryptography from lattices 712


17.1 Integer lattices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 713
17.2 Hard problems on lattices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 713
17.2.1 The SIS problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 713
17.2.2 The learning with errors (LWE) problem . . . . . . . . . . . . . . . . . . . . 713
17.2.3 The ring LWE problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 713
17.3 Trapdoor sampling from a lattice . . . . . . . . . . . . . . . . . . . . . . . . . . . . 713
17.4 Signatures from lattice problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 713
17.5 Public-key encryption from lattices . . . . . . . . . . . . . . . . . . . . . . . . . . . 713
17.6 Fully homomorphic encryption . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 713
17.7 A fun application: factoring integers using lattices . . . . . . . . . . . . . . . . . . . 713
17.8 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 713
17.9 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 713

xii
III Protocols 714

18 Protocols for identification and login 715


18.1 Interactive protocols: general notions . . . . . . . . . . . . . . . . . . . . . . . . . . 717
18.1.1 Mathematical details . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 718
18.2 ID protocols: definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 718
18.3 Password protocols: security against direct attacks . . . . . . . . . . . . . . . . . . . 719
18.3.1 Password cracking using a dictionary attack . . . . . . . . . . . . . . . . . . 720
18.4 Making dictionary attacks harder . . . . . . . . . . . . . . . . . . . . . . . . . . . . 724
18.4.1 Public salts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 724
18.4.2 Secret salts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 726
18.4.3 Slow hash functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 726
18.4.4 Slow memory-hard hash functions . . . . . . . . . . . . . . . . . . . . . . . 728
18.4.5 More password management issues . . . . . . . . . . . . . . . . . . . . . . . 732
18.5 One time passwords: security against eavesdropping . . . . . . . . . . . . . . . . . . 733
18.5.1 PRF-based one-time passwords: HOTP and TOTP . . . . . . . . . . . . . . 735
18.5.2 The S/key system . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 737
18.6 Challenge-response: security against active attacks . . . . . . . . . . . . . . . . . . . 738
18.6.1 Challenge-response protocols . . . . . . . . . . . . . . . . . . . . . . . . . . 740
18.7 A fun application: rainbow tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . 742
18.8 Another fun application: hardening password storage . . . . . . . . . . . . . . . . . 746
18.9 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 746
18.10 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 747

19 Identification and signatures from Sigma protocols 755


19.1 Schnorr’s identification protocol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 755
19.1.1 Honest verifier zero knowledge and security against eavesdropping . . . . . 760
19.2 From identification protocols to signatures . . . . . . . . . . . . . . . . . . . . . . . 762
19.2.1 A useful abstraction: repeated impersonation attacks . . . . . . . . . . . . . 763
19.2.2 Security analysis of Schnorr signatures . . . . . . . . . . . . . . . . . . . . . 764
19.2.3 A concrete implementation and an optimization . . . . . . . . . . . . . . . . 769
19.3 Case study: ECDSA signatures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 770
19.4 Sigma protocols: basic definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 771
19.4.1 Special soundness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 773
19.4.2 Special honest verifier zero knowledge . . . . . . . . . . . . . . . . . . . . . 774
19.5 Sigma protocols: examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 775
19.5.1 Okamoto’s protocol for representations . . . . . . . . . . . . . . . . . . . . . 775
19.5.2 The Chaum-Pedersen protocol for DH-triples . . . . . . . . . . . . . . . . . 777
19.5.3 A Sigma protocol for arbitrary linear relations . . . . . . . . . . . . . . . . 778
19.5.4 A Sigma protocol for the pre-image of a homomorphism . . . . . . . . . . . 780
19.5.5 A Sigma protocol for RSA . . . . . . . . . . . . . . . . . . . . . . . . . . . . 781
19.6 Identification and signatures from Sigma protocols . . . . . . . . . . . . . . . . . . . 782
19.6.1 The Fiat-Shamir heuristic for signatures . . . . . . . . . . . . . . . . . . . . 784
19.7 Combining Sigma protocols: AND and OR proofs . . . . . . . . . . . . . . . . . . . 787
19.7.1 The AND-proof construction . . . . . . . . . . . . . . . . . . . . . . . . . . 787
19.7.2 The OR-proof construction . . . . . . . . . . . . . . . . . . . . . . . . . . . 788

xiii
19.8 Witness independence and applications . . . . . . . . . . . . . . . . . . . . . . . . . 789
19.8.1 Definition of witness independence . . . . . . . . . . . . . . . . . . . . . . . 790
19.8.2 Special HVZK implies witness independence . . . . . . . . . . . . . . . . . . 791
19.8.3 Actively secure identification protocols . . . . . . . . . . . . . . . . . . . . . 792
19.8.4 Okamoto’s identification protocol . . . . . . . . . . . . . . . . . . . . . . . . 794
19.9 Multi-extractability: another notion of “proof of knowledge” . . . . . . . . . . . . . 796
19.9.1 Multi-extractable Sigma protocols . . . . . . . . . . . . . . . . . . . . . . . 796
19.9.2 Applications and limitations . . . . . . . . . . . . . . . . . . . . . . . . . . . 801
19.10 A fun application: a two round witness independent protocol . . . . . . . . . . . . . 808
19.11 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 808
19.12 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 808

20 Proving properties in zero-knowledge 823


20.1 Languages and soundness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 823
20.2 Proving properties on encrypted data . . . . . . . . . . . . . . . . . . . . . . . . . . 824
20.2.1 A generic protocol for non-linear relations . . . . . . . . . . . . . . . . . . . 829
20.3 Non-interactive proof systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 831
20.3.1 Example: a voting protocol . . . . . . . . . . . . . . . . . . . . . . . . . . . 831
20.3.2 Non-interactive proofs: basic syntax . . . . . . . . . . . . . . . . . . . . . . 833
20.3.3 The Fiat-Shamir transform . . . . . . . . . . . . . . . . . . . . . . . . . . . 833
20.3.4 Non-interactive soundness . . . . . . . . . . . . . . . . . . . . . . . . . . . . 834
20.3.5 Non-interactive zero knowledge . . . . . . . . . . . . . . . . . . . . . . . . . 834
20.3.6 An example: applying the Fiat-Shamir transform to the Chaum-Pedersen
protocol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 837
20.4 Computational zero-knowledge and applications . . . . . . . . . . . . . . . . . . . . 838
20.4.1 Example: range proofs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 839
20.4.2 Special computational HVZK . . . . . . . . . . . . . . . . . . . . . . . . . . 840
20.4.3 An unconstrained generic protocol for non-linear relations . . . . . . . . . . 841
20.5 Bulletproofs: compressed Sigma protocols . . . . . . . . . . . . . . . . . . . . . . . . 842
20.6 Succinct non-interactive zero-knowledge proofs (SNARKs) . . . . . . . . . . . . . . 842
20.7 A fun application: everything that can be proved, can be proved in zero knowledge 842
20.8 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 842
20.9 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 843

21 Authenticated Key Exchange 855


21.1 Identification and AKE . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 857
21.2 An encryption-based protocol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 858
21.2.1 Insecure variations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 861
21.2.2 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 866
21.3 Perfect forward secrecy and a protocol based on ephemeral encryption . . . . . . . . 867
21.3.1 Assuming only semantically secure encryption . . . . . . . . . . . . . . . . . 869
21.4 HSM security . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 869
21.4.1 A technical requirement: strongly unpredictable ciphertexts . . . . . . . . . 872
21.4.2 Insecure variations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 872
21.5 Identity protection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 876
21.6 One-sided authenticated key exchange . . . . . . . . . . . . . . . . . . . . . . . . . . 878

xiv
21.6.1 A one-sided authenticated variant of AKE4 . . . . . . . . . . . . . . . . . . . 879
21.7 Deniability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 880
21.7.1 Deniability without identity protection . . . . . . . . . . . . . . . . . . . . . 881
21.7.2 Deniability with identity protection . . . . . . . . . . . . . . . . . . . . . . . 882
21.8 Channel bindings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 884
21.9 Formal definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 885
21.9.1 Understanding the definition . . . . . . . . . . . . . . . . . . . . . . . . . . 889
21.9.2 Security of protocol AKE1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . 890
21.9.3 Modeling perfect forward secrecy . . . . . . . . . . . . . . . . . . . . . . . . 891
21.9.4 Modeling HSM security . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 893
21.9.5 Modeling one-sided authentication . . . . . . . . . . . . . . . . . . . . . . . 896
21.9.6 Modeling channel bindings . . . . . . . . . . . . . . . . . . . . . . . . . . . . 897
21.10 Case study: TLS session setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 897
21.10.1 Authenticated key exchange with preshared keys . . . . . . . . . . . . . . . 900
21.11 Password authenticated key exchange . . . . . . . . . . . . . . . . . . . . . . . . . . 903
21.11.1 Phishing attacks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 903
21.11.2 PAKE: an introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 906
21.11.3 Protocol PAKE0 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 906
21.11.4 Protocol PAKE1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 907
21.11.5 Protocol PAKE2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 909
21.11.6 Protocol PAKE+ 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 911
21.11.7 Explicit key confirmation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 913
21.11.8 Phishing again . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 913
21.11.9 Case study: PAKE used in the WiFi WPA3 protocol . . . . . . . . . . . . . 914
21.12 Key exchange using an online trusted third party . . . . . . . . . . . . . . . . . . . 914
21.12.1 A key exchange protocol with an online TTP . . . . . . . . . . . . . . . . . 914
21.12.2 Insecure variations of protocol OnlineTTP . . . . . . . . . . . . . . . . . . . 916
21.12.3 Security for protocol OnlineTTP . . . . . . . . . . . . . . . . . . . . . . . . 921
21.13 A fun application: establishing Tor channels . . . . . . . . . . . . . . . . . . . . . . 921
21.14 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 921
21.15 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 921

22 Threshold cryptography 924


22.1 Shamir’s secret sharing scheme . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 926
22.1.1 Shamir secret sharing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 927
22.1.2 Security of Shamir secret sharing . . . . . . . . . . . . . . . . . . . . . . . . 929
22.2 Threshold signatures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 930
22.2.1 A generic threshold signature scheme . . . . . . . . . . . . . . . . . . . . . . 932
22.2.2 BLS threshold signing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 932
22.2.3 Threshold signature security . . . . . . . . . . . . . . . . . . . . . . . . . . . 935
22.2.4 Security of threshold BLS . . . . . . . . . . . . . . . . . . . . . . . . . . . . 937
22.2.5 Accountability versus privacy . . . . . . . . . . . . . . . . . . . . . . . . . . 938
22.2.6 BLS accountable threshold signatures . . . . . . . . . . . . . . . . . . . . . 942
22.3 Threshold decryption schemes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 944
22.3.1 A generic threshold decryption scheme . . . . . . . . . . . . . . . . . . . . . 946
22.3.2 An insecure threshold decryption scheme . . . . . . . . . . . . . . . . . . . 947

xv
22.3.3 A secure threshold decryption scheme based on CDH . . . . . . . . . . . . . 949
22.3.4 Threshold decryption security . . . . . . . . . . . . . . . . . . . . . . . . . . 951
22.3.5 Security of threshold GS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 954
22.3.6 A pairing-free version of EthGS . . . . . . . . . . . . . . . . . . . . . . . . . . 956
22.4 Distributed key generation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 959
22.4.1 Defining the problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 959
22.4.2 A simple DKG protocol . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 965
22.5 Beyond threshold: monotone access structures . . . . . . . . . . . . . . . . . . . . . 975
22.5.1 The generic secret sharing construction . . . . . . . . . . . . . . . . . . . . 976
22.5.2 Linear secret sharing schemes and monotone span programs . . . . . . . . . 977
22.5.3 A signature scheme from linear secret sharing . . . . . . . . . . . . . . . . . 984
22.6 Gap security for threshold cryptosystems . . . . . . . . . . . . . . . . . . . . . . . . 987
22.6.1 Gap security for threshold signature schemes . . . . . . . . . . . . . . . . . 988
22.6.2 Gap security for threshold decryption schemes . . . . . . . . . . . . . . . . 990
22.7 A fun application: a randomness beacon . . . . . . . . . . . . . . . . . . . . . . . . 992
22.8 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 992
22.9 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 992

23 Secure multi-party computation 997


23.1 The basic idea of MPC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 998
23.1.1 Informal notions of security . . . . . . . . . . . . . . . . . . . . . . . . . . . 998
23.1.2 Assumptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 999
23.1.3 How to define security formally . . . . . . . . . . . . . . . . . . . . . . . . . 1001
23.1.4 Other applications of MPC . . . . . . . . . . . . . . . . . . . . . . . . . . . 1001
23.2 Securely evaluating arithmetic circuits . . . . . . . . . . . . . . . . . . . . . . . . . . 1003
23.2.1 Arithmetic circuit evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . 1003
23.2.2 Beaver’s protocol: an honest-but-curious 2.5-party protocol . . . . . . . . . 1005
23.2.3 Abstracting Beaver’s 2.5-party protocol . . . . . . . . . . . . . . . . . . . . 1009
23.2.4 A maliciously secure version of Beaver’s 2.5-party protocol . . . . . . . . . . 1011
23.3 Garbled circuits: another approach to MPC . . . . . . . . . . . . . . . . . . . . . . 1018
23.3.1 Boolean circuit evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1018
23.3.2 Yao’s 2-party garbled circuit technique: basic ideas . . . . . . . . . . . . . . 1020
23.3.3 Garbling schemes: an application to outsourcing computation . . . . . . . . 1021
23.3.4 Garble0: a simple but efficient garbling scheme . . . . . . . . . . . . . . . . 1022
23.3.5 Garbling schemes: formally defining security properties . . . . . . . . . . . 1027
23.3.6 A 2-party garbling-based protocol secure against honest-but-curious adver-
saries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1029
23.3.7 A 3-party garbling-based protocol secure against malicious adversaries . . . 1031
23.4 Multi-party computation based on a secure distributed core . . . . . . . . . . . . . . 1034
23.4.1 Processing inputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1035
23.4.2 Processing outputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1037
23.5 Formal models for multi-party computation: the universal composability framework 1037
23.5.1 The real protocol and its execution . . . . . . . . . . . . . . . . . . . . . . . 1038
23.5.2 The ideal protocol and its execution . . . . . . . . . . . . . . . . . . . . . . 1041
23.5.3 Example: the ideal functionality for secure function evaluation . . . . . . . 1043
23.5.4 Secure implementation: a strong security notion . . . . . . . . . . . . . . . 1044

xvi
23.5.5 Consequences of secure implementation . . . . . . . . . . . . . . . . . . . . 1046
23.5.6 Defining honest-but-curious security . . . . . . . . . . . . . . . . . . . . . . 1054
23.5.7 A warmup honest-but-curious security proof: a simple OT protocol . . . . . 1056
23.5.8 A warmup malicious security proof: a simple OT protocol . . . . . . . . . . 1061
23.5.9 An example malicious security proof: Beaver’s 2.5-party protocol . . . . . . 1068
23.5.10 An example malicious security proof: multi-party computation based on a
secure distributed core . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1072
23.5.11 An example malicious security proof: the 3-party garbled circuit protocol . 1074
23.6 Distributed key generation: ideal functionalities and extension to threshold MPC . 1077
23.6.1 Formal models for secure DKG . . . . . . . . . . . . . . . . . . . . . . . . . 1077
23.6.2 A threshold MPC protocol . . . . . . . . . . . . . . . . . . . . . . . . . . . 1079
23.7 OT extension . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1087
23.8 A fun application: another stab at private set intersection . . . . . . . . . . . . . . . 1091
23.9 Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1091
23.10 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1091

IV Appendices 1095

A Basic number theory 1096


A.1 Cyclic groups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1096
A.2 Arithmetic modulo primes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1096
A.2.1 Basic concepts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1096
A.2.2 Structure of Z∗p . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1097
A.2.3 Quadratic residues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1097
A.2.4 Computing in Zp . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1098
A.2.5 Summary: arithmetic modulo primes . . . . . . . . . . . . . . . . . . . . . . 1099
A.3 Arithmetic modulo composites . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1099

B Basic probability theory 1101


B.1 The birthday Paradox . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1101
B.1.1 More collision bounds . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1103
B.1.2 A simple distinguisher . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1103

C Basic complexity theory 1105

D Probabilistic algorithms 1106

xvii
xviii
Part I

Secret key cryptography

3
Chapter 2

Encryption

Suppose Alice and Bob share a secret key k. Alice wants to transmit a message m to Bob over a
network while maintaining the secrecy of m in the presence of an eavesdropping adversary. This
chapter begins the development of basic techniques to solve this problem. Besides transmitting a
message over a network, these same techniques allow Alice to store a file on a disk so that no one
else with access to the disk can read the file, but Alice herself can read the file at a later time.
We should stress that while the techniques we develop in this chapter to solve this fundamental
problem are important and interesting, they do not by themselves solve all problems related to
“secure communication.”

• The techniques only provide secrecy in the situation where Alice transmits a single message
per key. If Alice wants to secretly transmit several messages using the same key, then she
must use methods developed in Chapter 5.

• The techniques do not provide any assurances of message integrity: if the attacker has the
ability to modify the bits of the ciphertext while it travels from Alice to Bob, then Bob may
not realize that this happened, and accept a message other than the one that Alice sent. We
will discuss techniques for providing message integrity in Chapter 6.

• The techniques do not provide a mechanism that allow Alice and Bob to come to share a secret
key in the first place. Maybe they are able to do this using some secure network (or a physical,
face-to-face meeting) at some point in time, while the message is sent at some later time when
Alice and Bob must communicate over an insecure network. However, with an appropriate
infrastructure in place, there are also protocols that allow Alice and Bob to exchange a secret
key even over an insecure network: such protocols are discussed in Chapter 21.

2.1 Shannon ciphers and perfect security


2.1.1 Definition of a Shannon cipher
The basic mechanism for encrypting a message using a shared secret key is called a cipher (or
encryption scheme). In this section, we introduce a slightly simplified notion of a cipher, which we
call a Shannon cipher.
A Shannon cipher is a pair E = (E, D) of functions.

4
• The function E (the encryption function) takes as input a key k and a message m (also
called a plaintext), and produces as output a ciphertext c. That is,

c = E(k, m),

and we say that c is the encryption of m under k.

• The function D (the decryption function) takes as input a key k and a ciphertext c, and
produces a message m. That is,
m = D(k, c),
and we say that m is the decryption of c under k.

• We require that decryption “undoes” encryption; that is, the cipher must satisfy the following
correctness property: for all keys k and all messages m, we have

D(k, E(k, m) ) = m.

To be slightly more formal, let us assume that K is the set of all keys (the key space), M is the
set of all messages (the message space), and that C is the set of all ciphertexts (the ciphertext
space). With this notation, we can write:

E : K × M → C,
D : K × C → M.

Also, we shall say that E is defined over (K, M, C).


Suppose Alice and Bob want to use such a cipher so that Alice can send a message to Bob.
The idea is that Alice and Bob must somehow agree in advance on a key k ∈ K. Assuming this is
done, then when Alice wants to send a message m ∈ M to Bob, she encrypts m under k, obtaining
the ciphertext c = E(k, m) ∈ C, and then sends c to Bob via some communication network. Upon
receiving c, Bob decrypts c under k, and the correctness property ensures that D(k, c) is the same
as Alice’s original message m. For this to work, we have to assume that c is not tampered with in
transit from Alice to Bob. Of course, the goal, intuitively, is that an eavesdropper, who may obtain
c while it is in transit, does not learn too much about Alice’s message m — this intuitive notion is
what the formal definition of security, which we explore below, will capture.
In practice, keys, messages, and ciphertexts are often sequences of bytes. Keys are usually
of some fixed length; for example, 16-byte (i.e., 128-bit) keys are very common. Messages and
ciphertexts may be sequences of bytes of some fixed length, or of variable length. For example, a
message may be a 1GB video file, a 10MB music file, a 1KB email message, or even a single bit
encoding a “yes” or “no” vote in an electronic election.
Keys, messages, and ciphertexts may also be other types of mathematical objects, such as
integers, or tuples of integers (perhaps lying in some specified interval), or other, more sophisticated
types of mathematical objects (polynomials, matrices, or group elements). Regardless of how fancy
these mathematical objects are, in practice, they must at some point be represented as sequences
of bytes for purposes of storage in, and transmission between, computers.
For simplicity, in our mathematical treatment of ciphers, we shall assume that K, M, and C
are sets of finite size. While this simplifies the theory, it means that if a real-world system allows

5
messages of unbounded length, we will (somewhat artificially) impose a (large) upper bound on
legal message lengths.
To exercise the above terminology, we take another look at some of the example ciphers discussed
in Chapter 1.
Example 2.1. A one-time pad is a Shannon cipher E = (E, D), where the keys, messages, and
ciphertexts are bit strings of the same length; that is, E is defined over (K, M, C), where

K := M := C := {0, 1}L ,

for some fixed parameter L. For a key k ∈ {0, 1}L and a message m ∈ {0, 1}L the encryption
function is defined as follows:
E(k, m) := k ⊕ m,
and for a key k ∈ {0, 1}L and ciphertext c ∈ {0, 1}L , the decryption function is defined as follows:

D(k, c) := k ⊕ c.

Here, “⊕” denotes bit-wise exclusive-OR, or in other words, component-wise addition modulo 2,
and satisfies the following algebraic laws: for all bit vectors x, y, z ∈ {0, 1}L , we have

x ⊕ y = y ⊕ x, x ⊕ (y ⊕ z) = (x ⊕ y) ⊕ z, x ⊕ 0L = x, and x ⊕ x = 0L .

These properties follow immediately from the corresponding properties for addition modulo 2.
Using these properties, it is easy to check that the correctness property holds for E: for all k, m ∈
{0, 1}L , we have

D(k, E(k, m) ) = D(k, k ⊕ m) = k ⊕ (k ⊕ m) = (k ⊕ k) ⊕ m = 0L ⊕ m = m.

The encryption and decryption functions happen to be the same in this case, but of course, not all
ciphers have this property. 2
Example 2.2. A variable length one-time pad is a Shannon cipher E = (E, D), where the
keys are bit strings of some fixed length L, while messages and ciphertexts are variable length bit
strings, of length at most L. Thus, E is defined over (K, M, C), where

K := {0, 1}L and M := C := {0, 1}≤L .

for some parameter L. Here, {0, 1}≤L denotes the set of all bit strings of length at most L (including
the empty string). For a key k ∈ {0, 1}L and a message m ∈ {0, 1}≤L of length `, the encryption
function is defined as follows:
E(k, m) := k[0 . . ` − 1] ⊕ m,
and for a key k ∈ {0, 1}L and ciphertext c ∈ {0, 1}≤L of length `, the decryption function is defined
as follows:
D(k, c) := k[0 . . ` − 1] ⊕ c.
Here, k[0 . . ` − 1] denotes the truncation of k to its first ` bits. The reader may verify that the
correctness property holds for E. 2

6
Example 2.3. A substitution cipher is a Shannon cipher E = (E, D) of the following form. Let
Σ be a finite alphabet of symbols (e.g., the letters A–Z, plus a space symbol, ). The message space
M and the ciphertext space C are both sequences of symbols from Σ of some fixed length L:

M := C := ΣL .

The key space K consists of all permutations on Σ; that is, each k ∈ K is a one-to-one function from
Σ onto itself. Note that K is a very large set; indeed, |K| = |Σ|! (for |Σ| = 27, |K| ≈ 1.09 · 1028 ).
Encryption of a message m ∈ ΣL under a key k ∈ K (a permutation on Σ) is defined as follows

E(k, m) := k(m[0]), k(m[1]), . . . , k(m[L − 1]) ,

where m[i] denotes the ith entry of m (counting from zero), and k(m[i]) denotes the application
of the permutation k to the symbol m[i]. Thus, to encrypt m under k, we simply apply the
permutation k component-wise to the sequence m. Decryption of a ciphertext c ∈ ΣL under a key
k ∈ K is defined as follows:

D(k, c) := k −1 (c[0]), k −1 (c[1]), . . . , k −1 (c[L − 1]) .




Here, k −1 is the inverse permutation of k, and to decrypt c under k, we simply apply k −1 component-
wise to the sequence c. The correctness property is easily verified: for a message m ∈ ΣL and key
k ∈ K, we have

D(k, E(k, m) ) = D(k, (k(m[0]), k(m[1]), . . . , k(m[L − 1]) )


= (k −1 (k(m[0])), k −1 (k(m[1])), . . . , k −1 (k(m[L − 1])))
= (m[0], m[1], . . . , m[L − 1]) = m. 2

Example 2.4 (additive one-time pad). We may also define a “addition mod n” variation of
the one-time pad. This is a cipher E = (E, D), defined over (K, M, C), where K := M := C :=
{0, . . . , n − 1}, where n is a positive integer. Encryption and decryption are defined as follows:

E(k, m) := m + k mod n D(k, c) := c − k mod n.

The reader may easily verify that the correctness property holds for E. 2

2.1.2 Perfect security


So far, we have just defined the basic syntax and correctness requirements of a Shannon cipher.
Next, we address the question: what is a “secure” cipher? Intuitively, the answer is that a secure
cipher is one for which an encrypted message remains “well hidden,” even after seeing its encryp-
tion. However, turning this intuitive answer into one that is both mathematically meaningful and
practically relevant is a real challenge. Indeed, although ciphers have been used for centuries, it
is only in the last few decades that mathematically acceptable definitions of security have been
developed.
In this section, we develop the mathematical notion of perfect security — this is the “gold
standard” for security (at least, when we are only worried about encrypting a single message and
do not care about integrity). We will also see that it is possible to achieve this level of security;
indeed, we will show that the one-time pad satisfies the definition. However, the one-time pad is

7
not very practical, in the sense that the keys must be as long as the messages: if Alice wants to
send a 1GB file to Bob, they must already share a 1GB key! Unfortunately, this cannot be avoided:
we will also prove that any perfectly secure cipher must have a key space at least as large as its
message space. This fact provides the motivation for developing a definition of security that is
weaker, but that is acceptable from a practical point of view, and which allows one to encrypt long
messages using short keys.
If Alice encrypts a message m under a key k, and an eavesdropping adversary obtains the
ciphertext c, Alice only has a hope of keeping m secret if the key k is hard to guess, and that
means, at the very least, that the key k should be chosen at random from a large key space. To
say that m is “well hidden” must at least mean that it is hard to completely determine m from
c, without knowledge of k; however, this is not really enough. Even though the adversary may
not know k, we assume that he does know the encryption algorithm and the distribution of k. In
fact, we will assume that when a message is encrypted, the key k is always chosen at random,
uniformly from among all keys in the key space. The adversary may also have some knowledge of
the message encrypted — because of circumstances, he may know that the set of possible messages
is quite small, and he may know something about how likely each possible message is. For example,
suppose he knows the message m is either m0 = "ATTACK AT DAWN" or m1 = "ATTACK AT DUSK",
and that based on the adversary’s available intelligence, Alice is equally likely to choose either one
of these two messages. Without seeing the ciphertext c, the adversary would only have a 50%
chance of guessing which message Alice sent. But we are assuming the adversary does know c.
Even with this knowledge, both messages may be possible; that is, there may exist keys k0 and
k1 such that E(k0 , m0 ) = c and E(k1 , m1 ) = c, so he cannot be sure if m = m0 or m = m1 .
However, he can still guess. Perhaps it is a property of the cipher that there are 800 keys k0 such
that E(k0 , m0 ) = c, and 600 keys k1 such that E(k1 , m1 ) = c. If that is the case, the adversary’s
best guess would be that m = m0 . Indeed, the probability that this guess is correct is equal to
800/(800 + 600) ≈ 57%, which is better than the 50% chance he would have without knowledge
of the ciphertext. Our formal definition of perfect security expressly rules out the possibility that
knowledge of the ciphertext increases the probability of guessing the encrypted message, or for that
matter, determining any property of the message whatsoever.
Without further ado, we formally define perfect security. In this definition, we will consider a
probabilistic experiment in which the key is drawn uniformly from the key space. We write k to
denote the random variable representing this random key. For a message m, E(k, m) is another
random variable, which represents the application of the encryption function to our random key
and the message m. Thus, every message m gives rise to a different random variable E(k, m).
Definition 2.1 (perfect security). Let E = (E, D) be a Shannon cipher defined over (K, M, C).
Consider a probabilistic experiment in which the random variable k is uniformly distributed over K.
If for all m0 , m1 ∈ M, and all c ∈ C, we have

Pr[E(k, m0 ) = c] = Pr[E(k, m1 ) = c],

then we say that E is a perfectly secure Shannon cipher.


There are a number of equivalent formulations of perfect security that we shall explore. We
state a couple of these here.
Theorem 2.1. Let E = (E, D) be a Shannon cipher defined over (K, M, C). The following are
equivalent:

8
(i) E is perfectly secure.

(ii) For every c ∈ C, there exists an integer Nc (possibly depending on c) such that for all m ∈ M,
we have
{k ∈ K : E(k, m) = c} = Nc .

(iii) If the random variable k is uniformly distributed over K, then each of the random variables
E(k, m), for m ∈ M, has the same distribution.

Proof. To begin with, let us restate (ii) as follows: for every c ∈ C, there exists a number Pc
(depending on c) such that for all m ∈ M, we have Pr[E(k, m) = c] = Pc . Here, k is a random
variable uniformly distributed over K. Note that Pc = Nc /|K|, where Nc is as in the original
statement of (ii).
This version of (ii) is clearly the same as (iii).
(i) =⇒ (ii). We prove (ii) assuming (i). To prove (ii), let c ∈ C be some fixed ciphertext.
Pick some arbitrary message m0 ∈ M, and let Pc := Pr[E(k, m0 ) = c]. By (i), we know that for all
m ∈ M, we have Pr[E(k, m) = c] = Pr[E(k, m0 ) = c] = Pc . That proves (ii).
(ii) =⇒ (i). We prove (i) assuming (ii). Consider any fixed m0 , m1 ∈ M and c ∈ C. (ii) says
that Pr[E(k, m0 ) = c] = Pc = Pr[E(k, m1 ) = c], which proves (i). 2
As promised, we give a proof that the one-time pad (see Example 2.1) is perfectly secure.
Theorem 2.2. The one-time pad is a perfectly secure Shannon cipher.
Proof. Suppose that the Shannon cipher E = (E, D) is a one-time pad, and is defined over (K, M, C),
where K := M := C := {0, 1}L . For any fixed message m ∈ {0, 1}L and ciphertext c ∈ {0, 1}L ,
there is a unique key k ∈ {0, 1}L satisfying the equation

k ⊕ m = c,

namely, k := m ⊕ c. Therefore, E satisfies condition (ii) in Theorem 2.1 (with Nc = 1 for each c).
2
Example 2.5. Consider again the variable length one-time pad, defined in Example 2.2. This
does not satisfy our definition of perfect security, since a ciphertext has the same length as the
corresponding plaintext. Indeed, let us choose an arbitrary string of length 1, call it m0 , and an
arbitrary string of length 2, call it m1 . In addition, suppose that c is an arbitrary length 1 string,
and that k is a random variable that is uniformly distributed over the key space. Then we have

Pr[E(k, m0 ) = c] = 1/2 and Pr[E(k, m1 ) = c] = 0,

which provides a direct counter-example to Definition 2.1.


Intuitively, the variable length one-time pad cannot satisfy our definition of perfect security
simply because any ciphertext leaks the length of the corresponding plaintext. However, in some
sense (which we do not make precise right now), this is the only information leaked. It is perhaps not
clear whether this should be viewed as a problem with the cipher or with our definition of perfect
security. On the one hand, one can imagine scenarios where the length of a message may vary
greatly, and while we could always “pad” short messages to effectively make all messages equally
long, this may be unacceptable from a practical point of view, as it is a waste of bandwidth. On

9
the other hand, one must be aware of the fact that in certain applications, leaking just the length
of a message may be dangerous: if you are encrypting a “yes” or “no” answer to a question, just
the length of the obvious ASCII encoding of these strings leaks everything, so you better pad “no”
out to three characters. 2
Example 2.6. Consider again the substitution cipher defined in Example 2.3. There are a couple
of different ways to see that this cipher is not perfectly secure.
For example, choose a pair of messages m0 , m1 ∈ ΣL such that the first two components of m0
are equal, yet the first two components of m1 are not equal; that is,
m0 [0] = m0 [1] and m1 [0] 6= m1 [1].
Then for each key k, which is a permutation on Σ, if c = E(k, m0 ), then c[0] = c[1], while if
c = E(k, m1 ), then c[0] 6= c[1]. In particular, it follows that if k is uniformly distributed over the
key space, then the distributions of E(k, m0 ) and E(k, m1 ) will not be the same.
Even the weakness described in the previous paragraph may seem somewhat artificial. Another,
perhaps more realistic, type of attack on the substitution cipher works as follows. Suppose the
substitution cipher is used to encrypt email messages. As anyone knows, an email starts with a
“standard header,” such as "FROM". Suppose the ciphertext is c ∈ ΣL is intercepted by an adversary.
The secret key is actually a permutation k on Σ. The adversary knows that
c[0 . . . 3] = (k(F), k(R), k(O), k(M)).
Thus, if the original message is m ∈ ΣL , the adversary can now locate all positions in m where
an F occurs, where an R occurs, where an O occurs, and where an M occurs. Based just on this
information, along with specific, contextual information about the message, together with general
information about letter frequencies, the adversary may be able to deduce quite a bit about the
original message. 2
Example 2.7. Consider the additive one-time pad, defined in Example 2.4. It is easy to verify
that this is perfectly secure. Indeed, it satisfies condition (ii) in Theorem 2.1 (with Nc = 1 for each
c). 2
The next two theorems develop two more alternative characterizations of perfect security. For
the first, suppose an eavesdropping adversary applies some predicate φ to a ciphertext he has
obtained. The predicate φ (which is a boolean-valued function on the ciphertext space) may be
something very simple, like the parity function (i.e., whether the number of 1 bits in the ciphertext
is even or odd), or it might be some more elaborate type of statistical test. Regardless of how clever
or complicated the predicate φ is, perfect security guarantees that the value of this predicate on
the ciphertext reveals nothing about the message.
Theorem 2.3. Let E = (E, D) be a Shannon cipher defined over (K, M, C). Consider a probabilistic
experiment in which k is a random variable uniformly distributed over K. Then E is perfectly secure
if and only if for every predicate φ on C, for all m0 , m1 ∈ M, we have
Pr[φ(E(k, m0 ))] = Pr[φ(E(k, m1 ))].
Proof. This is really just a simple calculation. On the one hand, suppose E is perfectly secure, and
let φ, m0 , and m1 be given. Let S := {c ∈ C : φ(c)}. Then we have
X X
Pr[φ(E(k, m0 ))] = Pr[E(k, m0 ) = c] = Pr[E(k, m1 ) = c] = Pr[φ(E(k, m1 ))].
c∈S c∈S

10
Here, we use the assumption that E is perfectly secure in establishing the second equality. On the
other hand, suppose E is not perfectly secure, so there exist m0 , m1 , and c such that

Pr[E(k, m0 ) = c] 6= Pr[E(k, m1 ) = c].

Defining φ to be the predicate that is true for this particular c, and false for all other ciphertexts,
we see that

Pr[φ(E(k, m0 ))] = Pr[E(k, m0 ) = c] 6= Pr[E(k, m1 ) = c] = Pr[φ(E(k, m1 ))]. 2

The next theorem states in yet another way that perfect security guarantees that the ciphertext
reveals nothing about the message. Suppose that m is a random variable distributed over the
message space M. We do not assume that m is uniformly distributed over M. Now suppose k
is a random variable uniformly distributed over the key space K, independently of m, and define
c := E(k, m), which is a random variable distributed over the ciphertext space C. The following
theorem says that perfect security guarantees that c and m are independent random variables.
One way of characterizing this independence is to say that for each ciphertext c ∈ C that occurs
with nonzero probability, and each message m ∈ M, we have

Pr[m = m | c = c] = Pr[m = m].

Intuitively, this means that after seeing a ciphertext, we have no more information about the
message than we did before seeing the ciphertext.
Another way of characterizing this independence is to say that for each message m ∈ M that
occurs with nonzero probability, and each ciphertext c ∈ C, we have

Pr[c = c | m = m] = Pr[c = c].

Intuitively, this means that the choice of message has no impact on the distribution of the ciphertext.
The restriction that m and k are independent random variables is sensible: in using any cipher,
it is a very bad idea to choose the key in a way that depends on the message, or vice versa (see
Exercise 2.16).
Theorem 2.4. Let E = (E, D) be a Shannon cipher defined over (K, M, C). Consider a random
experiment in which k and m are random variables, such that

• k is uniformly distributed over K,

• m is distributed over M, and

• k and m are independent.

Define the random variable c := E(k, m). Then we have:

• if E is perfectly secure, then c and m are independent;

• conversely, if c and m are independent, and each message in M occurs with nonzero proba-
bility, then E is perfectly secure.

11
Proof. For the first implication, assume that E is perfectly secure. Consider any fixed m ∈ M and
c ∈ C. We want to show that

Pr[c = c ∧ m = m] = Pr[c = c] Pr[m = m].

We have

Pr[c = c ∧ m = m] = Pr[E(k, m) = c ∧ m = m]
= Pr[E(k, m) = c ∧ m = m]
= Pr[E(k, m) = c] Pr[m = m] (by independence of k and m).

So it will suffice to show that Pr[E(k, m) = c] = Pr[c = c]. But we have

Pr[c = c] = Pr[E(k, m) = c]
X
= Pr[E(k, m) = c ∧ m = m0 ] (by total probability)
m0 ∈M
X
= Pr[E(k, m0 ) = c ∧ m = m0 ]
m0 ∈M
X
= Pr[E(k, m0 ) = c] Pr[m = m0 ] (by independence of k and m)
m0 ∈M
X
= Pr[E(k, m) = c] Pr[m = m0 ] (by definition of perfect security)
m0 ∈M
X
= Pr[E(k, m) = c] Pr[m = m0 ]
m0 ∈M
= Pr[E(k, m) = c] (probabilities sum to 1).

For the second implication, assume that c and m are independent, and each message in M occurs
with nonzero probability. Let m ∈ M and c ∈ C. We will show that Pr[E(k, m) = c] = Pr[c = c],
from which perfect security immediately follows. Since Pr[m = m] 6= 0, this is seen thusly:

Pr[E(k, m) = c] Pr[m = m] = Pr[E(k, m) = c ∧ m = m] (by independence of k and m)


= Pr[E(k, m) = c ∧ m = m]
= Pr[c = c ∧ m = m]
= Pr[c = c] Pr[m = m] (by independence of c and m). 2

2.1.3 The bad news


We have saved the bad news for last. The next theorem shows that perfect security is such a
powerful notion that one can really do no better than the one-time pad: keys must be at least as
long as messages. As a result, it is almost impossible to use perfectly secure ciphers in practice: if
Alice wants to send Bob a 1GB video file, then Alice and Bob have to agree on a 1GB secret key
in advance.
Theorem 2.5 (Shannon’s theorem). Let E = (E, D) be a Shannon cipher defined over
(K, M, C). If E is perfectly secure, then |K| ≥ |M|.

12
Proof. Assume that |K| < |M|. We want to show that E is not perfectly secure. To this end, we
show that there exist messages m0 and m1 , and a ciphertext c, such that

Pr[E(k, m0 ) = c] > 0, and (2.1)


Pr[E(k, m1 ) = c] = 0. (2.2)

Here, k is a random variable, uniformly distributed over K.


To do this, choose any message m0 ∈ M, and any key k0 ∈ K. Let c := E(k0 , m0 ). It is clear
that (2.1) holds.
Next, let
S := {D(k1 , c) : k1 ∈ K}.
Clearly,
|S| ≤ |K| < |M|,
and so we can choose a message m1 ∈ M \ S.
To prove (2.2), we need to show that there is no key k1 such that E(k1 , m1 ) = c. Assume to
the contrary that E(k1 , m1 ) = c for some k1 ; then for this key k1 , by the correctness property for
ciphers, we would have
D(k1 , c) = D(k1 , E(k1 , m1 ) ) = m1 ,
which would imply that m1 belongs to S, which is not the case. That proves (2.2), and the theorem
follows. 2

2.2 Computational ciphers and semantic security


As we have seen in Shannon’s theorem (Theorem 2.5), the only way to achieve perfect security is
to have keys that are as long as messages. However, this is quite impractical: we would like to be
able to encrypt a long message (say, a document of several megabytes) using a short key (say, a few
hundred bits). The only way around Shannon’s theorem is to relax our security requirements. The
way we shall do this is to consider not all possible adversaries, but only computationally feasible
adversaries, that is, “real world” adversaries that must perform their calculations on real computers
using a reasonable amount of time and memory. This will lead to a weaker definition of security
called semantic security. Furthermore, our definition of security will be flexible enough to allow
ciphers with variable length message spaces to be considered secure so long as they do not leak
any useful information about an encrypted message to an adversary other than the length of the
message. Also, since our focus is now on the “practical,” instead of the “mathematically possible,”
we shall also insist that the encryption and decryption functions are themselves efficient algorithms,
and not just arbitrary functions.

2.2.1 Definition of a computational cipher


A computational cipher E = (E, D) is a pair of efficient algorithms, E and D. The encryption
algorithm E takes as input a key k, along with a message m, and produces as output a ciphertext c.
The decryption algorithm D takes as input a key k, a ciphertext c, and outputs a message m. Keys
lie in some finite key space K, messages lie in a finite message space M, and ciphertexts lie in some
finite ciphertext space C. Just as for a Shannon cipher, we say that E is defined over (K, M, C).

13
Although it is not really necessary for our purposes in this chapter, we will allow the encryption
function E to be a probabilistic algorithm. This means that for fixed inputs k and m, the output of
E(k, m) may be one of many values. Probabilistic algorithms are discussed further in Appendix D.
To emphasize the probabilistic nature of encryption we write
R
c← E(k, m)

to denote the process of executing E(k, m) and assigning the output to the program variable c. We
shall use this notation throughout the book whenever we use probabilistic algorithms. Similarly,
we write
R
k← K
to denote the process of assigning to the program variable k a random, uniformly distributed
element from the key space K. We shall use the analogous notation to sample uniformly from any
finite set.
We will not see any examples of probabilistic encryption algorithms in this chapter (we will see
our first examples of this in Chapter 5). Although one could allow the decryption algorithm to
be probabilistic, we will have no need for this, and so will only discuss ciphers with deterministic
decryption algorithms. However, it will be occasionally be convenient to allow the decryption
algorithm to return a special reject value (distinct from all messages), indicating some kind of error
occurred during the decryption process.
Since the encryption algorithm is probabilistic, for a given key k and message m, the encryption
algorithm may output one of many possible ciphertexts; however, each of these possible ciphertexts
should decrypt to m. We can state this correctness requirement more formally as follows: for
all keys k ∈ K and messages m ∈ M, if we execute
R
c← E(k, m), m0 ← D(k, c),

then m = m0 with probability 1.

From now on, whenever we refer to a cipher, we shall mean a computational cipher,
as defined above. Moreover, if the encryption algorithm happens to be deterministic, then
we may call the cipher a deterministic cipher.

Observe that any deterministic cipher is a Shannon cipher; however, a computational cipher
need not be a Shannon cipher (if it has a probabilistic encryption algorithm), and a Shannon
cipher need not be a computational cipher (if its encryption or decryption operations have no
efficient implementations).
Example 2.8. The one-time pad (see Example 2.1) and the variable length one-time pad (see
Example 2.2) are both deterministic ciphers, since their encryption and decryption operations may
be trivially implemented as efficient, deterministic algorithms. The same holds for the substitution
cipher (see Example 2.3), provided the alphabet Σ is not too large. Indeed, in the obvious imple-
mentation, a key — which is a permutation on Σ — will be represented by an array indexed by Σ,
and so we will require O(|Σ|) space just to store a key. This will only be practical for reasonably
sized Σ. The additive one-time pad discussed in Example 2.4 is also a deterministic cipher, since
both encryption and decryption operations may be efficiently implemented (if n is large, special
software to do arithmetic with large integers may be necessary). 2

14
2.2.2 Definition of semantic security
To motivate the definition of semantic security, consider a deterministic cipher E = (E, D), defined
over (K, M, C). Consider again the formulation of perfect security in Theorem 2.3. This says that
for all predicates φ on the ciphertext space, and all messages m0 , m1 , we have

Pr[φ(E(k, m0 ))] = Pr[φ(E(k, m1 ))], (2.3)

where k is a random variable uniformly distributed over the key space K. Instead of insisting that
these probabilities are equal, we shall only require that they are very close; that is,

Pr[φ(E(k, m0 ))] − Pr[φ(E(k, m1 ))] ≤ , (2.4)

for some very small, or negligible, value of . By itself, this relaxation does not help very much
(see Exercise 2.5). However, instead of requiring that (2.4) holds for every possible φ, m0 , and
m1 , we only require that (2.4) holds for all messages m0 and m1 that can be generated by some
efficient algorithm, and all predicates φ that can be computed by some efficient algorithm (these
algorithms could be probabilistic). For example, suppose it were the case that using the best
possible algorithms for generating m0 and m1 , and for testing some predicate φ, and using (say)
10,000 computers in parallel for 10 years to perform these calculations, (2.4) holds for  = 2−100 .
While not perfectly secure, we might be willing to say that the cipher is secure for all practical
purposes.
Also, in defining semantic security, we address an issue raised in Example 2.5. In that example,
we saw that the variable length one-time pad did not satisfy the definition of perfect security.
However, we want our definition to be flexible enough so that ciphers like the variable length one-
time pad, which effectively leak no information about an encrypted message other than its length,
may be considered secure as well.
Now the details. To precisely formulate the definition of semantic security, we shall describe
an attack game played between two parties, a challenger and an adversary. Throughout this
book, we shall formulate many attack games that capture various notions of security for different
types of cryptographic primitives. In general, the challenger follows a very simple, fixed protocol.
However, an adversary may follow an arbitrary (but still efficient) protocol. The challenger and the
adversary send messages back and forth to each other, as specified by their protocols, and at the
end of the game, the adversary may output some value. The attack game also defines a probability
space, and this in turn defines the adversary’s advantage, which is determined by the probability
of one or more events.
Some attack games, such as the one defining the semantic security of a cipher, comprise two
alternative “sub-games,” or “experiments” — in both experiments, the adversary follows the same
protocol, while the challenger behaves differently in each experiment. Specifically, in our attack
game defining the semantic security of a cipher, the adversary generates two messages m0 and m1
(of the same length), and sends both messages to the challenger. In one experiment, the challenger
encrypts m0 under a random key, while in the other experiment, the challenger encrypts m1 , again,
under a random key. In both experiments, the challenger sends the resulting ciphertext c back to
the adversary. After examining c, the adversary outputs a bit b̂ ∈ {0, 1}. The advantage of the
adversary is defined to be the absolute difference between the probability the adversary outputs
b̂ = 1 in one experiment and the probability that it outputs b̂ = 1 the other experiment. We state
this more formally as follows:

15
Challenger A
m0 , m 1 2 M
(Experiment b)

R
k K
R c
c E(k, mb )

b̂ 2 {0, 1}

Figure 2.1: Experiment b of Attack Game 2.1

Attack Game 2.1 (semantic security). For a given cipher E = (E, D), defined over (K, M, C),
and for a given adversary A, we define two experiments, Experiment 0 and Experiment 1. For
b = 0, 1, we define
Experiment b:

• The adversary computes m0 , m1 ∈ M, of the same length, and sends them to the challenger.

• The challenger computes k ←


R R
K, c ← E(k, mb ), and sends c to the adversary.

• The adversary outputs a bit b̂ ∈ {0, 1}.

For b = 0, 1, let Wb be the event that A outputs 1 in Experiment b. We define A’s semantic
security advantage with respect to E as

SSadv[A, E] := Pr[W0 ] − Pr[W1 ] . 2

Note that in the above game, the events W0 and W1 are defined with respect to the probability
space determined by the random choice of k, the random choices made (if any) by the encryption
algorithm, and the random choices made (if any) by the adversary. The value SSadv[A, E] is a
number between 0 and 1.
See Fig. 2.1 for a schematic diagram of Attack Game 2.1. As indicated in the diagram, A’s
“output” is really just a final message to the challenger.
Definition 2.2 (semantic security). A cipher E is semantically secure if for all efficient
adversaries A, the value SSadv[A, E] is negligible.
As a formal definition, this is not quite complete, as we have yet to define what we mean by
“messages of the same length”, “efficient adversaries”, and “negligible”. We will come back to this
shortly.
Let us relate this formal definition to the discussion preceding it. Suppose that the adversary
A in Attack Game 2.1 is deterministic. First, the adversary computes in a deterministic fashion

16
messages m0 , m1 , and then evaluates a predicate φ on the ciphertext c, outputting 1 if true and
0 if false. Semantic security says that the value  in (2.4) is negligible. In the case where A is
probabilistic, we can view A as being structured as follows: it generates a random value r from
(r) (r)
some appropriate set, and deterministically computes messages m0 , m1 , which depend on r, and
evaluates a predicate φ(r) on c, which also depends on r. Here, semantic security says that the value
(r) (r)
 in (2.4), with m0 , m1 , φ replaced by m0 , m1 , φ(r) , is negligible — but where now the probability
is with respect to a randomly chosen key and a randomly chosen value of r.
Remark 2.1. Let us now say a few words about the requirement that the messages m0 and m1
computed by the adversary Attack Game 2.1 be of the same length.

• First, the notion of the “length” of a message is specific to the particular message space M;
in other words, in specifying a message space, one must specify a rule that associates a length
(which is a non-negative integer) with any given message. For most concrete message spaces,
this will be clear: for example, for the message space {0, 1}≤L (as in Example 2.2), the length
of a message m ∈ {0, 1}≤L is simply its length, |m|, as a bit string. However, to make our
definition somewhat general, we leave the notion of length as an abstraction. Indeed, some
message spaces may have no particular notion of length, in which case all messages may be
viewed as having length 0.

• Second, the requirement that m0 and m1 be of the same length means that the adversary is not
deemed to have broken the system just because he can effectively distinguish an encryption
of a message of one length from an encryption of a message of a different length. This is how
our formal definition captures the notion that an encryption of a message is allowed to leak
the length of the message (but nothing else).
We already discussed in Example 2.5 how in certain applications, leaking the length of the
message can be catastrophic. However, since there is no general solution to this problem, most
real-world encryption schemes (for example, TLS) do not make any attempt at all to hide the
length of the message. This can lead to real attacks. For example, Chen et al. [41] show that
the lengths of encrypted messages can reveal considerable information about private data that
a user supplies to a cloud application. They use an online tax filing system as their example,
but other works show attacks of this type on many other systems. 2

Example 2.9. Let E be a deterministic cipher that is perfectly secure. Then it is easy to see that
for every adversary A (efficient or not), we have SSadv[A, E] = 0. This follows almost immediately
from Theorem 2.3 (the only slight complication is that our adversary A in Attack Game 2.1 may
be probabilistic, but this is easily dealt with). In particular, E is semantically secure. Thus, if E is
the one-time pad (see Example 2.1), we have SSadv[A, E] = 0 for all adversaries A; in particular,
the one-time pad is semantically secure. Because the definition of semantic security is a bit more
forgiving with regard to variable length message spaces, it is also easy to see that if E is the variable
length one-time pad (see Example 2.2), then SSadv[A, E] = 0 for all adversaries A; in particular,
the variable length one-time pad is also semantically secure. 2
We need to say a few words about the terms “efficient” and “negligible”. Below in Section 2.3
we will fill in the remaining details (they are somewhat tedious, and not really very enlightening).
Intuitively, negligible means so small as to be “zero for all practical purposes”: think of a number
like 2−100 — if the probability that you spontaneously combust in the next year is 2−100 , then you

17
would not worry about such an event occurring any more than you would an event that occurred
with probability 0. We also use the following terms:

• An efficient adversary is one that runs in a “reasonable” amount time.

• A value N is called super-poly if 1/N is negligible.

• A poly-bounded value is a “reasonably” sized number. In particular, we can say that the
running time of an efficient adversary is poly-bounded.

Fact 2.6. If  and 0 are negligible values, and Q and Q0 are poly-bounded values, then:

(i)  + 0 is a negligible value,

(ii) Q + Q0 and Q · Q0 are poly-bounded values, and

(iii) Q ·  is a negligible value.

For now, the reader can just take these facts as axioms. Instead of dwelling on these technical
issues, we discuss an example that illustrates how one typically uses this definition in analyzing the
security of a larger system that uses a semantically secure cipher.

2.2.3 Connections to weaker notions of security


[Link] Message recovery attacks
Intuitively, in a message recovery attack, an adversary is given an encryption of a random message,
and is able to recover the message from the ciphertext with probability significantly better than
random guessing, that is, probability 1/|M|. Of course, any reasonable notion of security should
rule out such an attack, and indeed, semantic security does.
While this may seem intuitively obvious, we give a formal proof of this. One of our motivations
for doing this is to illustrate in detail the notion of a security reduction, which is the main technique
used to reason about the security of systems. Basically, the proof will argue that any efficient
adversary A that can effectively mount a message recovery attack on E can be used to build an
efficient adversary B that breaks the semantic security of E; since semantic security implies that no
such B exists, we may conclude that no such A exists.
To formulate this proof in more detail, we need a formal definition of a message recovery
attack. As before, this is done by giving attack game, which is a protocol between a challenger and
an adversary.
Attack Game 2.2 (message recovery). For a given cipher E = (E, D), defined over (K, M, C),
and for a given adversary A, the attack game proceeds as follows:

• The challenger computes m ←


R R
M, k ← R
K, c ← E(k, m), and sends c to the adversary.

• The adversary outputs a message m̂ ∈ M.

Let W be the event that m̂ = m. We say that A wins the game in this case, and we define A’s
message recovery advantage with respect to E as

MRadv[A, E] := Pr[W ] − 1/|M| . 2

18
Definition 2.3 (security against message recovery). A cipher E is secure against message
recovery if for all efficient adversaries A, the value MRadv[A, E] is negligible.
Theorem 2.7. Let E = (E, D) be a cipher defined over (K, M, C). If E is semantically secure then
E is secure against message recovery.
Proof. Assume that E is semantically secure. Our goal is to show that E is secure against message
recovery.
To prove that E is secure against message recovery, we have to show that every efficient ad-
versary A has negligible advantage in Attack Game 2.2. To show this, we let an arbitrary but
efficient adversary A be given, and our goal now is to show that A’s message recovery advantage,
MRadv[A, E], is negligible. Let p denote the probability that A wins the message recovery game,
so that
MRadv[A, E] = p − 1/|M| .
We shall show how to construct an efficient adversary B whose semantic security advantage in
Attack Game 2.1 is related to A’s message recovery advantage as follows:
MRadv[A, E] ≤ SSadv[B, E]. (2.5)
Since B is efficient, and since we are assuming that E is semantically secure, the right-hand side of
(2.5) is negligible, and so we conclude that MRadv[A, E] is negligible.
So all that remains to complete the proof is to show how to construct an efficient B that satisfies
(2.5). The idea is to use A as a “black box” — we do not have to understand the inner workings
of A at all.
Here is how B works. Adversary B generates two random messages, m0 and m1 in M, and
sends these to its own SS challenger. This challenger sends B a ciphertext c, which B forwards
to A, as if it were coming from A’s MR challenger. When A outputs a message m̂, our adversary B
outputs b̂ = 1 if m̂ = m1 , and outputs b̂ = 0 otherwise.
That completes the description of B. Note that the running time of B is essentially the same
as that of A. We now analyze the B’s SS advantage, and relate this to A’s MR advantage.
For b = 0, 1, let pb be the probability that B outputs 1 if B’s SS challenger encrypts mb . So by
definition
SSadv[B, E] = |p1 − p0 |.
On the one hand, when c is an encryption of m1 , the probability p1 is precisely equal to A’s
probability of winning the message recovery game, so p1 = p. On the other hand, when c is an
encryption of m0 , the adversary A’s output is independent of m1 , and so p0 = 1/|M|. It follows
that
SSadv[B, E] = |p1 − p0 | = p − 1/|M| = MRadv[A, E].
This proves (2.5). In fact, equality holds in (2.5), but that is not essential to the proof. 2
The reader should make sure that he or she understands the logic of this proof, as this type of
proof will be used over and over again throughout the book. We shall review the important parts
of the proof here, and give another way of thinking about it.
The core of the proof was establishing the following fact: for every efficient MR adversary A
that attacks E as in Attack Game 2.2, there exists an efficient SS adversary B that attacks E as in
Attack Game 2.1 such that
MRadv[A, E] ≤ SSadv[B, E]. (2.6)

19
We are trying to prove that if E is semantically secure, then E is secure against message recovery.
In the above proof, we argued that if E is semantically secure, then the right-hand side of (2.6)
must be negligible, and hence so must the left-hand side; since this holds for all efficient A, we
conclude that E is secure against message recovery.
Another way to approach the proof of the theorem is to prove the contrapositive: if E is not
secure against message recovery, then E is not semantically secure. So, let us assume that E is not
secure against message recovery. This means there exists an efficient adversary A whose message
recovery advantage is non-negligible. Using A we build an efficient adversary B that satisfies (2.6).
By assumption, MRadv[A, E] is non-negligible, and (2.6) implies that SSadv[B, E] is non-negligible.
From this, we conclude that E is not semantically secure.
Said even more briefly: to prove that semantic security implies security against message recovery,
we show how to turn an efficient adversary that breaks message recovery into an efficient adversary
that breaks semantic security.
We also stress that the adversary B constructed in the proof just uses A as a “black box.” In
fact, almost all of the constructions we shall see are of this type: B is essentially just a wrapper
around A, consisting of some simple and efficient “interface layer” between B’s challenger and a
single running instance of A. Ideally, we want the computational complexity of the interface layer
to not depend on the computational complexity of A; however, some dependence is unavoidable:
if an attack game allows A to make multiple queries to its challenger, the more queries A makes,
the more work must be performed by the interface layer, but this work should just depend on the
number of such queries and not on the running time of A.
Thus, we will say adversary B is an elementary wrapper around adversary A when it can be
structured as above, as an efficient interface interacting with A. The salient properties are:
• If B is an elementary wrapper around A, and A is efficient, then B is efficient.

• If C is an elementary wrapper around B and B is an elementary wrapper around A, then C is


an elementary wrapper around A.
These notions are formalized in Section 2.3 (but again, they are extremely tedious).

[Link] Computing individual bits of a message


If an encryption scheme is secure, not only should it be hard to recover the whole message, but it
should be hard to compute any partial information about the message.
We will not prove a completely general theorem here, but rather, consider a specific example.
Suppose E = (E, D) is a cipher defined over (K, M, C), where M = {0, 1}L . For m ∈ M, we
define parity(m) to be 1 if the number of 1’s in m is odd, and 0 otherwise. Equivalently, parity(m)
is the exclusive-OR of all the individual bits of m.
We will show that if E is semantically secure, then given an encryption c of a random message
m, it is hard to predict parity(m). Now, since parity(m) is a single bit, any adversary can predict
this value correctly with probability 1/2 just by random guessing. But what we want to show is
that no efficient adversary can do significantly better than random guessing.
As a warm up, suppose there were an efficient adversary A that could predict parity(m) with
probability 1. This means that for every message m, every key k, and every encryption c of m,
when we give A the ciphertext c, it outputs the parity of m. So we could use A to build an SS
adversary B that works as follows. Our adversary chooses two messages, m0 and m1 , arbitrarily,

20
but with parity(m0 ) = 0 and parity(m1 ) = 1. Then it hands these two messages to its own SS
challenger, obtaining a ciphertext c, which it then forwards to A. After receiving c, adversary A
outputs a bit b̂, and B outputs this same bit b̂ as its own output. It is easy to see that B’s SS
advantage is precisely 1: when its SS challenger encrypts m0 , it always outputs 0, and when its SS
challenger encrypts m1 , it always outputs 1.
This shows that if E is semantically secure, there is no efficient adversary that can predict
parity with probability 1. However, we can say even more: if E is semantically secure, there is no
efficient adversary that can predict parity with probability significantly better than 1/2. To make
this precise, we give an attack game:
Attack Game 2.3 (parity prediction). For a given cipher E = (E, D), defined over (K, M, C),
and for a given adversary A, the attack game proceeds as follows:

• The challenger computes m ←


R R
M, k ← R
K, c ← E(k, m), and sends c to the adversary.

• The adversary outputs b̂ ∈ {0, 1}.

Let W be the event that b̂ = parity(m). We define A’s parity prediction advantage with
respect to E as
Parityadv[A, E] := Pr[W ] − 1/2 . 2

Definition 2.4 (parity prediction). A cipher E is secure against parity prediction if for all
efficient adversaries A, the value Parityadv[A, E] is negligible.
Theorem 2.8. Let E = (E, D) be a cipher defined over (K, M, C), and M = {0, 1}L . If E is
semantically secure, then E is secure against parity prediction.
Proof. As in the proof of Theorem 2.7, we give a proof by reduction. In particular, we will show
that for every parity prediction adversary A that attacks E as in Attack Game 2.3, there exists an
SS adversary B that attacks E as in Attack Game 2.1, where B is an elementary wrapper around
A, such that
1
Parityadv[A, E] = · SSadv[B, E].
2
Let A be a parity prediction adversary that predicts parity with probability 1/2 + , so
Parityadv[A, E] = ||.
Here is how we construct our SS adversary B.
Our adversary B generates a random message m0 , and sets m1 ← m0 ⊕ (0L−1 k 1); that is,
m1 is the same as m0 , except that the last bit is flipped. In particular, m0 and m1 have opposite
parity.
Our adversary B sends the pair m0 , m1 to its own SS challenger, receives a ciphertext c from
that challenger, and forwards c to A. When A outputs a bit b̂, our adversary B outputs 1 if
b̂ = parity(m0 ), and outputs 0, otherwise.
For b = 0, 1, let pb be the probability that B outputs 1 if B’s SS challenger encrypts mb . So by
definition
SSadv[B, E] = |p1 − p0 |.
We claim that p0 = 1/2 +  and p1 = 1/2 − . This is because regardless of whether m0
or m1 is encrypted, the distribution of mb is uniform over M, and so in case b = 0, our parity
predictor A will output parity(m0 ) with probability 1/2 + , and when b = 1, our parity predictor

21
A will output parity(m1 ) with probability 1/2 + , and so outputs parity(m0 ) with probability
1 − (1/2 + ) = 1/2 − .
Therefore,
SSadv[B, E] = |p1 − p0 | = 2|| = 2 · Parityadv[A, E],
which proves the theorem. 2
We have shown that if an adversary can effectively predict the parity of a message, then it can
be used to break semantic security. Conversely, it turns out that if an adversary can break semantic
security, he can effectively predict some predicate of the message (see Exercise 3.16).

2.2.4 Consequences of semantic security


In this section, we examine the consequences of semantic security in the context of a specific
example, namely, electronic gambling. The specific details of the example are not so important, but
the example illustrates how one typically uses the assumption of semantic security in applications.
Consider the following extremely simplified version of roulette, which is a game between the
house and a player. The player gives the house 1 dollar. He may place one of two kinds of bets:
• “high or low,” or
• “even or odd.”
After placing his bet, the house chooses a random number r ∈ {0, 1, . . . , 36}. The player wins if
r 6= 0, and if
• he bet “high” and r > 18,
• he bet “low” and r ≤ 18,
• he bet “even” and r is even,
• he bet “odd” and r is odd.
If the player wins, the house pays him 2 dollars (for a net win of 1 dollar), and if the player loses, the
house pays nothing (for a net loss of 1 dollar). Clearly, the house has a small, but not insignificant
advantage in this game: the probability that the player wins is 18/37 ≈ 48.65%.
Now suppose that this game is played over the Internet. Also, suppose that for various technical
reasons, the house publishes an encryption of r before the player places his bet (perhaps to be
decrypted by some regulatory agency that shares a key with the house). The player is free to analyze
this encryption before placing his bet, and of course, by doing so, the player could conceivably
increase his chances of winning. However, if the cipher is any good, the player’s chances should not
increase by much. Let us prove this, assuming r is encrypted using a semantically secure cipher
E = (E, D), defined over (K, M, C), where M = {0, 1, . . . , 36} (we shall view all messages in M
as having the same length in this example). Also, from now on, let us call the player A, to stress
the adversarial nature of the player, and assume that A’s strategy can be modeled as an efficient
algorithm. The game is illustrated in Fig. 2.2. Here, bet denotes one of “high,” “low,” “even,”
“odd.” Player A sends bet to the house, who evaluates the function W (r, bet), which is 1 if bet is a
winning bet with respect to r, and 0 otherwise. Let us define
IRadv[A] := Pr[W (r, bet) = 1] − 18/37 .
Our goal is to prove the following theorem.

22
House A
r ← {0, 1, . . . , 36}
R

R
k←K
R
c ← E(k, r) c

bet

outcome ← W (r, bet)

outcome

Figure 2.2: Internet roulette

Theorem 2.9. If E is semantically secure, then for every efficient player A, the quantity IRadv[A]
is negligible.
As we did in Section 2.2.3, we prove this by reduction. More concretely, we shall show that for
every player A, there exists an SS adversary B, where B is an elementary wrapper around A, such
that
IRadv[A] = SSadv[B, E]. (2.7)
Thus, if there were an efficient player A with a non-negligible advantage, we would obtain an
efficient SS adversary B that breaks the semantic security of E, which we are assuming is impossible.
Therefore, there is no such A.
To motivate and analyze our new adversary B, consider an “idealized” version of Internet
roulette, in which instead of publishing an encryption of the actual value r, the house instead
publishes an encryption of a “dummy”value, say 0. The logic of the ideal Internet roulette game is
illustrated in Fig. 2.3. Note, however, that in the ideal Internet roulette game, the house still uses
the actual value of r to determine the outcome of the game. Let p0 be the probability that A wins
at Internet roulette, and let p1 be the probability that A wins at ideal Internet roulette.
Our adversary B is designed to play in Attack Game 2.1 so that if b̂ denotes B’s output in that
game, then we have:
• if B is placed in Experiment 0, then Pr[b̂ = 1] = p0 ;
• if B is placed in Experiment 1, then Pr[b̂ = 1] = p1 .
The logic of adversary B is illustrated in Fig. 2.4. It is clear by construction that B satisfies the
properties claimed above, and so in particular,
SSadv[B, E] = |p1 − p0 |. (2.8)
Now, consider the probability p1 that A wins at ideal Internet roulette. No matter how clever
A’s strategy is, he wins with probability 18/37, since in this ideal Internet roulette game, the value

23
House A
r ← {0, 1, . . . , 36}
R

R
k←K
c ← E(k, 0)
R
c

bet

outcome ← W (r, bet)

outcome

Figure 2.3: ideal Internet roulette

B
Challenger
r
R
{0, 1, . . . , 36} A
(Experiment b) m0 r
m0 , m 1 m1 0
R
k K
R
c
c E(k, mb )
bet


b̂ W (r, bet)

Figure 2.4: The SS adversary B in Attack Game 2.1

24
of bet is computed from c, which is statistically independent of the value of r. That is, ideal Internet
roulette is equivalent to physical roulette. Therefore,
IRadv[A] = |p1 − p0 |. (2.9)
Combining (2.8) and (2.9), we obtain (2.7).
The approach we have used to analyze Internet roulette is one that we will see again and again.
The basic idea is to replace a system component by an idealized version of that component, and
then analyze the behavior of this new, idealized version of the system.
Another lesson to take away from the above example is that in reasoning about the security of
a system, what we view as “the adversary” depends on what we are trying to do. In the above
analysis, we cobbled together a new adversary B out of several components: one component was
the original adversary A, while other components were scavenged from other parts of the system
(the algorithm of “the house,” in this example). This will be very typical in our security analyses
throughout this text. Intuitively, if we imagine a diagram of the system, at different points in the
security analysis, we will draw a circle around different components of the system to identify what
we consider to be “the adversary” at that point in the analysis.

2.2.5 Bit guessing: an alternative characterization of semantic security


The example in Section 2.2.4 was a typical example of how one could use the definition of semantic
security to analyze the security properties of a larger system that makes use of a semantically
secure cipher. However, there is another characterization of semantic security that is typically more
convenient to work with when one is trying to prove that a given cipher satisfies the definition. In
this alternative characterization, we define a new attack game. The role played by the adversary
is exactly the same as before. However, instead of having two different experiments, there is just
a single experiment. In this bit-guessing version of the attack game, the challenger chooses
b ∈ {0, 1} at random and runs Experiment b of Attack Game 2.1; it is the adversary’s goal to guess
the bit b with probability significantly better than 1/2. Here are the details:
Attack Game 2.4 (semantic security: bit-guessing version). For a given cipher E = (E, D),
defined over (K, M, C), and for a given adversary A, the attack game runs as follows:
• The adversary computes m0 , m1 ∈ M, of the same length, and sends them to the challenger.
• The challenger computes b ←
R R
{0, 1}, k ← R
K, c ← E(k, mb ), and sends c to the adversary.
• The adversary outputs a bit b̂ ∈ {0, 1}.
We say that A wins the game if b̂ = b. 2
Fig. 2.5 illustrates Attack Game 2.4. Note that in this game, the event that the A wins the
game is defined with respect to the probability space determined by the random choice of b and k,
the random choices made (if any) of the encryption algorithm, and the random choices made (if
any) by the adversary.
Of course, any adversary can win the game with probability 1/2, simply by ignoring c completely
and choosing b̂ at random (or alternatively, always choosing b̂ to be 0, or always choosing it to be
1). What we are interested in is how much better than random guessing an adversary can do. If
W denotes the event that the adversary wins the bit-guessing version of the attack game, then we
are interested in the quantity |Pr[W ] − 1/2|, which we denote by SSadv∗ [A, E]. Then we have:

25
Challenger A
m0 , m 1 2 M
R
b {0, 1}
R
k K
R
c E(k, mb ) c

b̂ 2 {0, 1}

Figure 2.5: Attack Game 2.4

Theorem 2.10. For every cipher E and every adversary A, we have


SSadv[A, E] = 2 · SSadv∗ [A, E]. (2.10)
Proof. This is just a simple calculation. Let p0 be the probability that the adversary outputs 1 in
Experiment 0 of Attack Game 2.1, and let p1 be the probability that the adversary outputs 1 in
Experiment 1 of Attack Game 2.1.
Now consider Attack Game 2.4. From now on, all events and probabilities are with respect to
this game. If we condition on the event that b = 0, then in this conditional probability space, all
of the other random choices made by the challenger and the adversary are distributed in exactly
the same way as the corresponding values in Experiment 0 of Attack Game 2.1. Therefore, if b̂ is
the output of the adversary in Attack Game 2.4, we have
Pr[b̂ = 1 | b = 0] = p0 .
By a similar argument, we see that
Pr[b̂ = 1 | b = 1] = p1 .
So we have
Pr[b̂ = b] = Pr[b̂ = b | b = 0] Pr[b = 0] + Pr[b̂ = b | b = 1] Pr[b = 1]
= Pr[b̂ = 0 | b = 0] · 12 + Pr[b̂ = 1 | b = 1] · 12
 
= 12 1 − Pr[b̂ = 1 | b = 0] + Pr[b̂ = 1 | b = 1]
= 12 (1 − p0 + p1 ).
Therefore,
SSadv∗ [A, E] = Pr[b̂ = b] − 1
2 = 12 |p1 − p0 | = 1
2 · SSadv[A, E].
That proves the theorem. 2
Just as it is convenient to refer to SSadv[A, E] as A’s “SS advantage,” we shall refer to
SSadv∗ [A, E] as A’s “bit-guessing SS advantage.”

26
[Link] A generalization
As it turns out, the above situation is quite generic. Although we do not need it in this chapter, for
future reference we indicate here how the above situation generalizes. There will be a number of
situations we shall encounter where some particular security property, call it “X,” for some crypto-
graphic system, call it “S,” can be defined in terms of an attack game involving two experiments,
Experiment 0 and Experiment 1, where the adversary A’s protocol is the same in both experiments,
while that of the challenger is different. For b = 0, 1, we define Wb to be the event that A outputs
1 in Experiment b, and we define

Xadv[A, S] := Pr[W0 ] − Pr[W1 ]

to be A’s “X advantage.” Just as above, we can always define a “bit-guessing” version of the attack
game, in which the challenger chooses b ∈ {0, 1} at random, and then runs Experiment b as its
protocol. If W is the event that the adversary’s output is equal to b, then we define

Xadv∗ [A, S] := Pr[W ] − 1/2

to be A’s “bit-guessing X advantage.”


Using exactly the same calculation as in the proof of Theorem 2.10, we have

Xadv[A, S] = 2 · Xadv∗ [A, S]. (2.11)

2.3 Mathematical details


Up until now, we have used the terms efficient and negligible rather loosely, without a formal
mathematical definition:

• we required that a computational cipher have efficient encryption and decryption algorithms;

• for a semantically secure cipher, we required that any efficient adversary have a negligible
advantage in Attack Game 2.1.

The goal of this section is to provide precise mathematical definitions for these terms. While
these definitions lead to a satisfying theoretical framework for the study of cryptography as a
mathematical discipline, we should warn the reader:

• the definitions are rather complicated, requiring an unfortunate amount of notation; and

• the definitions model our intuitive understanding of these terms only very crudely.

We stress that the reader may safely skip this section without suffering a significant loss in under-
standing. Before marching headlong into the formal definitions, let us remind the reader of what
we are trying to capture in these definitions.

• First, when we speak of an efficient encryption or decryption algorithm, we usually mean one
that runs very quickly, encrypting data at a rate of, say, 10–100 computer cycles per byte of
data.

27
• Second, when we speak of an efficient adversary, we usually mean an algorithm that runs in
some large, but still feasible amount of time (and other resources). Typically, one assumes
that an adversary that is trying to break a cryptosystem is willing to expend many more
resources than a user of the cryptosystem. Thus, 10,000 computers running in parallel for
10 years may be viewed as an upper limit on what is feasibly computable by a determined,
patient, and financially well-off adversary. However, in some settings, like the Internet roulette
example in Section 2.2.4, the adversary may have a much more limited amount of time to
perform its computations before they become irrelevant.

• Third, when we speak of an adversary’s advantage as being negligible, we mean that it is so


small that it may as well be regarded as being equal to zero for all practical purposes. As
we saw in the Internet roulette example, if no efficient adversary has an advantage better
than 2−100 in Attack Game 2.1, then no player can in practice improve his odds at winning
Internet roulette by more than 2−100 relative to physical roulette.
Even though our intuitive understanding of the term efficient depends on the context, our
formal definition will not make any such distinction. Indeed, we shall adopt the computational
complexity theorist’s habit of equating the notion of an efficient algorithm with that of a (proba-
bilistic) polynomial-time algorithm. For better and for worse, this gives us a formal framework that
is independent of the specific details of any particular model of computation.

2.3.1 Negligible, super-poly, and poly-bounded functions


We begin by defining the notions of negligible, super-poly, and poly-bounded functions.
Intuitively, a negligible function f : Z≥0 → R is one that not only tends to zero as n → ∞, but
does so faster than the inverse of any polynomial.
Definition 2.5. A function f : Z≥1 → R is called negligible if for all c ∈ R>0 there exists
n0 ∈ Z≥1 such that for all integers n ≥ n0 , we have |f (n)| < 1/nc .
An alternative characterization of a negligible function, which is perhaps easier to work with,
is the following:
Theorem 2.11. A function f : Z≥1 → R is negligible if and only if for all c > 0, we have

lim f (n)nc = 0.
n→∞

Proof. Exercise. 2
Example 2.10. Some examples of negligible functions:

2−n , 2− n
, n− log n .

Some examples of non-negligible functions:


1 1
, . 2
1000n4 2
+ n log n n 100

Once we have the term “negligible” formally defined, defining “super-poly” is easy:
Definition 2.6. A function f : Z≥1 → R is called super-poly if 1/f is negligible.

28
Essentially, a poly-bounded function f : Z≥1 → R is one that is bounded (in absolute value) by
some polynomial. Formally:
Definition 2.7. A function f : Z≥1 → R is called poly-bounded, if there exists c, d ∈ R>0 such
that for all integers n ≥ 0, we have |f (n)| ≤ nc + d.
Note that if f is a poly-bounded function, then 1/f is definitely not a negligible function.
However, as the following example illustrates, one must take care not to draw erroneous inferences.
Example 2.11. Define f : Z≥1 → R so that f (n) = 1/n for all even integers n and f (n) = 2−n
for all odd integers n. Then f is not negligible, and 1/f is neither poly-bounded nor super-poly. 2

2.3.2 Computational ciphers: the formalities


Now the formalities. We begin by admitting a lie: when we said a computational cipher E = (E, D)
is defined over (K, M, C), where K is the key space, M is the message space, and C is the ciphertext
space, and with each of these spaces being finite sets, we were not telling the whole truth. In the
mathematical model (though not always in real-world systems), we associate with E families of key,
message, and ciphertext spaces, indexed by

• a security parameter, which is a positive integer, and is denoted by λ, and

• a system parameter, which is a bit string, and is denoted by Λ.

Thus, instead of just finite sets K, M, and C, we have families of finite sets

{Kλ,Λ }λ,Λ , {Mλ,Λ }λ,Λ , and {Cλ,Λ }λ,Λ ,

which for the purposes of this definition, we view as sets of bit strings (which may represent
mathematical objects by way of some canonical encoding functions).
The idea is that when the cipher E is deployed, the security parameter λ is fixed to some value.
Generally speaking, larger values of λ imply higher levels of security (i.e., resistance against adver-
saries with more computational resources), but also larger key sizes, as well as slower encryption
and decryption speeds. Thus, the security parameter is like a “dial” we can turn, setting a trade-off
between security and efficiency.
Once λ is chosen, a system parameter Λ is generated using an algorithm specific to the cipher.
The idea is that the system parameter Λ (together with λ) gives a detailed description of a fixed
instance of the cipher, with
(K, M, C) = (Kλ,Λ , Mλ,Λ , Cλ,Λ ).
This one, fixed instance may be deployed in a larger system and used by many parties — the values
of λ and Λ are public and known to everyone (including the adversary).
Example 2.12. Consider the additive one-time pad discussed in Example 2.4. This cipher was
described in terms of a modulus n. To deploy such a cipher, a suitable modulus n is generated,
and is made public (possibly just “hardwired” into the software that implements the cipher). The
modulus n is the system parameter for this cipher. Each specific value of the security parameter
determines the length, in bits, of n. The value n itself is generated by some algorithm that may be
probabilistic and whose output distribution may depend on the intended application. For example,
we may want to insist that n is a prime in some applications. 2

29
Before going further, we define the notion of an efficient algorithm. For the purposes of this
definition, we shall only consider algorithms A that take as input a security parameter λ, as well as
other parameters whose total length is bounded by some fixed polynomial in λ. Basically, we want
to say that the running time of A is bounded by a polynomial in λ, but things are complicated if
A is probabilistic:
Definition 2.8 (efficient algorithm). Let A be an algorithm (possibly probabilistic) that takes
as input a security parameter λ ∈ Z≥1 , as well as other parameters encoded as a bit string x ∈
{0, 1}≤p(λ) for some fixed polynomial p. We call A an efficient algorithm if there exist a poly-
bounded function t and a negligible function  such that for all λ ∈ Z≥1 , and all x ∈ {0, 1}≤p(λ) ,
the probability that the running time of A on input (λ, x) exceeds t(λ) is at most (λ).
We stress that the probability in the above definition is with respect to the coin tosses of A:
this bound on the probability must hold for all possible inputs x.1

Here is a formal definition that captures the basic requirements of systems that are parameter-
ized by a security and system parameter, and introduces some more terminology. In the following
definition we use the notation Supp(P (λ)) to refer to the support of the distribution P (λ), which
is the set of all possible outputs of algorithm P on input λ.
Definition 2.9. A system parameterization is an efficient probabilistic algorithm P that given
a security parameter λ ∈ Z≥1 as input, outputs a bit string Λ, called a system parameter, whose
length is always bounded by a polynomial in λ. We also define the following terminology:

• A collection S = {Sλ,Λ }λ,Λ of finite sets of bits strings, where λ runs over Z≥1 and Λ runs over
Supp(P (λ)), is called a family of spaces with system parameterization P , provided the
lengths of all the strings in each of the sets Sλ,Λ are bounded by some polynomial p in λ.

• We say that S is efficiently recognizable if there is an efficient deterministic algorithm


that on input λ ∈ Z≥1 , Λ ∈ Supp(P (λ)), and s ∈ {0, 1}≤p(λ) , determines if s ∈ Sλ,Λ .

• We say that S is efficiently sampleable if there is an efficient probabilistic algorithm that


on input λ ∈ Z≥1 and Λ ∈ Supp(P (λ)), outputs an element uniformly distributed over Sλ,Λ .

• We say that S has an effective length function if there is an efficient deterministic


algorithm that on input λ ∈ Z≥1 , Λ ∈ Supp(P (λ)), and s ∈ Sλ,Λ , outputs a non-negative
integer, called the length of s.

We can now state the complete, formal definition of a computational cipher:


1
By not insisting that a probabilistic algorithm halts in a specified time bound with probability 1, we give ourselves
a little “wiggle room,” which allows us to easily do certain types of random sampling procedure that have no a priori
running time bound, but are very unlikely to run for too long (e.g., think of flipping a coin until it comes up “heads”).
An alternative approach would be to bound the expected running time, but this turns out to be somewhat problematic
for technical reasons.
Note that this definition of an efficient algorithm does not require that the algorithm halt with probability 1 on
all inputs. An algorithm that with probability 2−λ entered an infinite loop would satisfy the definition, even though
it does not halt with probability 1. These issues are rather orthogonal. In general, we shall only consider algorithms
that halt with probability 1 on all inputs: this can more naturally be seen as a requirement on the output distribution
of the algorithm, rather than on its running time.

30
Definition 2.10 (computational cipher). A computational cipher consists of a pair of algo-
rithms E and D, along with three families of spaces with system parameterization P :

K = {Kλ,Λ }λ,Λ , M = {Mλ,Λ }λ,Λ , and C = {Cλ,Λ }λ,Λ ,

such that

1. K, M, and C are efficiently recognizable.

2. K is efficiently sampleable.

3. M has an effective length function.

4. Algorithm E is an efficient probabilistic algorithm that on input λ, Λ, k, m, where λ ∈ Z≥1 ,


Λ ∈ Supp(P (λ)), k ∈ Kλ,Λ , and m ∈ Mλ,Λ , always outputs an element of Cλ,Λ .

5. Algorithm D is an efficient deterministic algorithm that on input λ, Λ, k, c, where λ ∈ Z≥1 ,


Λ ∈ Supp(P (λ)), k ∈ Kλ,Λ , and c ∈ Cλ,Λ , outputs either an element of Mλ,Λ , or a special
symbol reject ∈
/ Mλ,Λ .

6. For all λ, Λ, k, m, c, where λ ∈ Z≥1 , Λ ∈ Supp(P (λ)), k ∈ Kλ,Λ , m ∈ Mλ,Λ , and c ∈


Supp(E(λ, Λ; k, m)), we have D(λ, Λ; k, c) = m.

Note that in the above definition, the encryption and decryption algorithms take λ and Λ
as auxiliary inputs. So as to be somewhat consistent with the notation already introduced in
Section 2.2.1, we write this as E(λ, Λ; · · · ) and D(λ, Λ; · · · ).
Example 2.13. Consider the additive one-time pad (see Example 2.12). In our formal framework,
the security parameter λ determines the bit length L(λ) of the modulus n, which is the system
parameter. The system parameter generation algorithm takes as input λ and generates a modulus
n of length L(λ). The function L(·) should be polynomially bounded. With this assumption, it is
clear that the system parameter generation algorithm satisfies its requirements. The requirements
on the key, message, and ciphertext spaces are also satisfied:

1. Elements of these spaces have polynomially bounded lengths: this again follows from our
assumption that L(·) is polynomially bounded.
R
2. The key space is efficiently sampleable: just choose k ← {0, . . . , n − 1}.

3. The key, message, and ciphertext spaces are efficiently recognizable: just test if a bit string s
is the binary encoding of an integer between 0 and n − 1.

4. The message space also has an effective length function: just output (say) 0. 2

We note that some ciphers (for example the one-time pad) may not need a system parameter.
In this case, we can just pretend that the system parameter is, say, the empty string. We also note
that some ciphers do not really have a security parameter either; indeed, many industry-standard
ciphers simply come ready-made with a fixed key size, with no security parameter that can be
tuned. This is simply mismatch between theory and practice — that is just the way it is.

31
That completes our formal mathematical description of a computational cipher, in all its glo-
rious detail.2 The reader should hopefully appreciate that while these formalities may allow us
to make mathematically precise and meaningful statements, they are not very enlightening, and
mostly serve to obscure what is really going on. Therefore, in the main body of the text, we will
continue to discuss ciphers using the simplified terminology and notation of Section 2.2.1, with the
understanding that all statements made have a proper and natural interpretation in the formal
framework discussed in this section. This will be a pattern that is repeated in the sequel: we shall
mainly discuss various types of cryptographic schemes using a simplified terminology, without men-
tion of security parameters and system parameters — these mathematical details will be discussed
in a separate section, but will generally follow the same general pattern established here.

2.3.3 Efficient adversaries and attack games


In defining the notion of semantic security, we have to define what we mean by an efficient adversary.
Since this concept will be used extensively throughout the text, we present a more general framework
here.
For any type of cryptographic scheme, security will be defined using an attack game, played
between an adversary A and a challenger: A follows an arbitrary protocol, while the challenger
follows some simple, fixed protocol determined by the cryptographic scheme and the notion of
security under discussion. Furthermore, both adversary and challenger take as input a common
security parameter λ, and the challenger starts the game by computing a corresponding system
parameter Λ, and sending this to the adversary.
To model these types of interactions, we introduce the notion of an interactive machine.
Before such a machine M starts, it always gets the security parameter λ written in a special buffer,
and the rest of its internal state is initialized to some default value. Machine M has two other
special buffers: an incoming message buffer and an outgoing message buffer. Machine M may be
invoked many times: each invocation starts when M ’s external environment writes a string to M ’s
incoming message buffer; M reads the message, performs some computation, updates its internal
state, and writes a string on its outgoing message buffer, ending the invocation, and the outgoing
message is passed to the environment. Thus, M interacts with its environment via a simple message
passing system. We assume that M may indicate that it has halted by including some signal in its
last outgoing message, and M will essentially ignore any further attempts to invoke it.
We shall assume messages to and from the machine M are restricted to be of constant length.
This is not a real restriction: we can always simulate the transmission of one long message by
sending many shorter ones. However, making a restriction of this type simplifies some of the
technicalities. We assume this restriction from now on, for adversaries as well as for any other type
of interactive machine.
For any given environment, we can measure the total running time of M by counting the
number of steps it performs across all invocations until it signals that it has halted. This running
time depends not only on M and its random choices, but also on the environment in which M
runs.3
2
Note that the definition of a Shannon cipher in Section 2.1.1 remains unchanged. The claim made at the end of
Section 2.2.1 that any deterministic computational cipher is also a Shannon cipher needs to be properly interpreted:
for each λ and Λ, we get a Shannon cipher defined over (Kλ,Λ , Mλ,Λ , Cλ,Λ ).
3
Analogous to the discussion in footnote 1 on page 30, our definition of an efficient interactive machine will not
require that it halts with probability 1 for all environments. This is an orthogonal issue, but it will be an implicit

32
Definition 2.11 (efficient interactive machine). We say that M is an efficient interactive
machine if there exist a poly-bounded function t and a negligible function , such that for all
environments (not even computationally unbounded ones), the probability that the total running
time of M exceeds t(λ) is at most (λ).
We naturally model an adversary as an interactive machine. An efficient adversary is simply
an efficient interactive machine.
We can connect two interactive machines together, say M 0 and M , to create a new interactive
machine M 00 = hM 0 , M i. Messages from the environment to M 00 always get routed to M 0 . The
machine M 0 may send a message to the environment, or to M ; in the latter case, the output
message sent by M gets sent to M 0 . We assume that if M halts, then M 0 does not send it any
more messages.
Thus, when M 00 is invoked, its incoming message is routed to M 0 , and then M 0 and M may
interact some number of times, and then the invocation of M 00 ends when M 0 sends a message to
the environment. We call M 0 the “open” machine (which interacts with the outside world), and M
the “closed” machine (which interacts only with M 0 ).

Naturally, we can model the interaction of a challenger and an adversary by connecting two
such machines together as above: the challenger becomes the open machine, and the adversary
becomes the closed machine.
In our security reductions, we typically show how to use an adversary A that breaks some
system to build an adversary B that breaks some other system. The essential property that we
want is that if A is efficient, then so is B. However, our reductions are almost always of a very
special form, where B is a wrapper around A, consisting of some simple and efficient “interface
layer” between B’s challenger and a single running instance of A.
Ideally, we want the computational complexity of the interface layer to not depend on the
computational complexity of A; however, some dependence is unavoidable: the more queries A
makes to its challenger, the more work must be performed by the interface layer, but this work
should just depend on the number of such queries and not on the running time of A.
To formalize this, we build B as a composed machine hM 0 , M i, where M 0 represents the interface
layer (the “open” machine), and M represents the instance of A (the “closed” machine). This leads
us to the following definition.
Definition 2.12 (elementary wrapper). An interactive machine M 0 is called an efficient
interface if there exists a poly-bounded function t and a negligible function , such that for all
M (not necessarily computationally bounded), when we execute the composed machine hM 0 , M i in
an arbitrary environment (again, not necessarily computationally bounded), the following property
holds:

at every point in the execution of hM 0 , M i, if I is the number of interactions between


M 0 and M up to at that point, and T is the total running time of M 0 up to that point,
then the probability that T > t(λ + I) is at most (λ).

If M 0 is an efficient interface, and M is any machine, then we say hM 0 , M i is an elementary


wrapper around M .
requirement of any machines we consider.

33
Thus, we will say adversary B is an elementary wrapper around adversary A when it can be
structured as above, as an efficient interface interacting with A. Our definitions were designed to
work well together. The salient properties are:

• If B is an elementary wrapper around A, and A is efficient, then B is efficient.

• If C is an elementary wrapper around B and B is an elementary wrapper around A, then C is


an elementary wrapper around A.

Also note that in our attack games, the challenger typically satisfies our definition of an efficient
interface. For such a challenger and any efficient adversary A, we can view their entire interaction
as a that of a single, efficient machine.

Query bounded adversaries. In the attack games we have seen so far, the adversary makes
just a fixed number of queries. Later in the text, we will see attack games in which the adversary
A is allowed to make many queries — even though there is no a priori bound on the number of
queries it is allowed to make, if A is efficient, the number of queries will be bounded by some
poly-bounded value Q (at least with all but negligible probability). In proving security for such
attack games, in designing an elementary wrapper B from A, it will usually be convenient to tell
B in advance an upper bound Q on how many queries A will ultimately make. To fit this into our
formal framework, we can set things up so that A starts out by sending a sequence of Q special
messages to “signal” this query bound to B. If we do this, then not only can B use the value Q in its
logic, it is also allowed to run in time that depends on Q, without violating the time constraints in
Definition 2.12. This is convenient, as then B is allowed to initialize data structures whose size may
depend on Q. Of course, all of this is just a legalistic “hack” to work around technical constraints
that would otherwise be too restrictive, and should not be taken too seriously. We will never make
this “signaling” explicit in any of our presentations.

2.3.4 Semantic security: the formalities


In defining any type of security, we will define the adversary’s advantage in the attack game as
a function Adv(λ). This will be defined in terms of probabilities of certain events in the attack
game: for each value of λ we get a different probability space, determined by the random choices of
the challenger, and the random choices of the adversary. Security will mean that for every efficient
adversary, the function Adv(·) is negligible.
Turning now to the specific situation of semantic security of a cipher, in Attack Game 2.1, we
defined the value SSadv[A, E]. This value is actually a function of the security parameter λ. The
proper interpretation of Definition 2.2 is that E is secure if for all efficient adversaries A (modeled as
an interactive machine, as described above), the function SSadv[A, E](λ) in the security parameter
λ is negligible (as defined in Definition 2.5). Recall that both challenger and adversary receive λ
as a common input. Control begins with the challenger, who sends the system parameter to the
adversary. The adversary then sends its query to the challenger, which consists of two plaintexts,
who responds with a ciphertext. Finally, the adversary outputs a bit (technically, in our formal
machine model, this “output” is a message sent to the challenger, and then the challenger halts).
The value of SSadv[A, E](λ) is determined by the random choices of the challenger (including the
choice of system parameter) and the random choices of the adversary. See Fig. 2.6 for a complete
picture of Attack Game 2.1.

34
Challenger A
(Experiment b)
R

⇤ P( )

m0 , m1 2 M ,⇤
R
k K ,⇤
R
c E( , ⇤; k, mb ) c

b̂ 2 {0, 1}

Figure 2.6: The fully detailed version of Attack Game 2.1

Also, in Attack Game 2.1, the requirement that the two messages presented by the adversary
have the same length means that the length function provided in part 3 of Definition 2.10 evaluates
to the same value on the two messages.
It is perhaps useful to see what it means for a cipher E to be insecure according to this formal
definition. This means that there exists an adversary A such that SSadv[A, E] is a non-negligible
function in the security parameter. This means that SSadv[A, E](λ) ≥ 1/λc for some c > 0 and for
infinitely many values of the security parameter λ. So this does not mean that A can “break” E
for all values of the security parameter, but only infinitely many values of the security parameter.
In the main body of the text, we shall mainly ignore security parameters, system parameters,
and the like, but it will always be understood that all of our “shorthand” has a precise mathematical
interpretation. In particular, we will often refer to certain values v as being negligible (resp., poly-
bounded), which really means that v is a negligible (resp., poly-bounded) function of the security
parameter.

2.4 A fun application: anonymous routing


Our friend Alice wants to send a message m to Bob, but she does not want Bob or anyone else to
know that the message m is from Alice. For example, Bob might be running a public discussion
forum and Alice wants to post a comment anonymously on the forum. Posting anonymously lets
Alice discuss health issues or other matters without identifying herself. In this section we will
assume Alice only wants to post a single message to the forum.
One option is for Alice to choose a proxy, Carol, send m to Carol, and ask Carol to forward
the message to Bob. This clearly does not provide anonymity for Alice since anyone watching the

35
network will see that m was sent from Alice to Carol and then from Carol to Bob. By tracing the
path of m through the network anyone can see that the post came from Alice.
A better approach is for Alice to establish a shared key k with Carol and send c := E(k, m) to
Carol, where E = (E, D) is a semantically secure cipher. Carol decrypts c and forwards m to Bob.
Now, someone watching the network will see one message sent from Alice to Carol and a different
message sent from Carol to Bob. Nevertheless, this method still does not ensure anonymity for
Alice: if on a particular day the only message that Carol receives is the one from Alice and the only
message she sends goes to Bob, then an observer can link the two and still learn that the posted
message came from Alice.
We solve this problem by having Carol provide a mixing service, that is, a service that mixes
incoming messages from many different parties A1 , . . . , An . For i = 1, . . . , n, Carol establishes
a secret key ki with party  Ai and each party Ai sends to Carol an encrypted message ci :=
E ki , hdestinationi , mi i . Carol collects all n incoming ciphertexts, decrypts each of them with
the correct key, and forwards the resulting plaintexts in some random order to their destinations.
Now an observer examining Carol’s traffic sees n messages going in and n messages going out, but
cannot tell which message was sent where. Alice’s message is one of the n messages sent out by
Carol, but the observer cannot tell which one. We say that Alice’s anonymity set is of size n.
The remaining problem is that Carol can still tell that Alice is the one who posted a specific
message on the discussion forum. To eliminate this final risk Alice uses multiple mixing services,
say, Carol and David. She establishes a secret key kc with Carol and a secret key kd with David.
To send her message to Bob she constructs the following nested ciphertext c2 :

c2 := E kc , E(kd , m) . (2.12)

For completeness Alice may want to embed routing information inside the ciphertext so that c2 is
actually constructed as:
 
c2 := E kc , hDavid, c1 i where c1 := E kd , hBob, mi .

Next, Alice sends c2 to Carol. Carol decrypts c2 and obtains the plaintext hDavid, c1 i which tells
her to send c1 to David. David decrypts c1 and obtains the plaintext hBob, mi which tells him to
send m to Bob. This process of decrypting a nested ciphertext, illustrated in Fig. 2.7, is similar to
peeling an onion one layer at a time. For this reason this routing procedure is often called onion
routing.
Now even if Carol observes all network traffic she cannot tell with certainty who posted a
particular message on Bob’s forum. The same holds for David. However, if Carol and David
collude they can figure it out. For this reason Alice may want to route her message through more
than two mixes. As long as one of the mixes does not collude with the others, Alice’s anonymity
will be preserved.
One complication is that when Alice establishes her shared secret key kd with David, she must
do so without revealing her identity to David. Otherwise, David will know that c1 came from
Alice, which we do not want. This is not difficult to do, and we will see how later in the book
(Section 21.13).

Security of nested encryption. To preserve Alice’s anonymity it is necessary that Carol, who
knows kc , learn no information about m from the nested ciphertext c2 in (2.12). Otherwise, Carol
could potentially use the information she learns about m from c2 to link Alice to her post on Bob’s

36
mix& mix&
c2& c1& m&
Alice& Carol& David& Bob&

Figure 2.7: An example onion routing using two mixes

discussion forum. For example, suppose Carol could learn the first few characters of m from c2 and
later find that there is only one post on Bob’s forum starting with those characters. Carol could
then link the entire post to Alice because she knows that c2 came from Alice.
The same should hold for David. David has kd , and by observing network traffic, knows that
Alice sent c2 . As in the previous paragraph, it is important that David learn nothing about m
from c2 .
Let us argue that if E is semantically secure, then no efficient adversary can learn information
about m given c2 and one of kc or kd . More generally, for a cipher E = (E, D) defined over
(K, M, C), let us define the n-way nested cipher En = (En , Dn ) as
 
En (k1 , . . . , kn ), m := E kn , E(kn−1 , · · · E(k1 , m)) .
Decryption applies the keys in the reverse order:
 
Dn (k1 , . . . , kn ), c := D k1 , D(k2 , · · · D(kn , c)) .
Our goal is to show that if E is semantically secure then En is semantically secure even if the
adversary is given all but one of the keys k1 , . . . , kn . To make this precise, we define two experiments,
Experiment 0 and Experiment 1, where for b = 0, 1, Experiment b is:
• The adversary gives the challenger (m0 , m1 , d) where m0 , m1 ∈ M are equal length messages
and 1 ≤ d ≤ n.
• The challenger chooses n keys k1 , . . . , kn ←
R R

K and computes c ← En (k1 , . . . , kn ), mb . It
sends c to the adversary along with all keys k1 , . . . , kn , but excluding the key kd .
• The adversary outputs a bit b̂ ∈ {0, 1}.
This game captures the fact that the adversary sees all keys k1 , . . . , kn except for kd and tries to
break semantic security.
We define the adversary’s advantage, NE(n) adv[A, E], as in the definition of semantic security:
NE(n) adv[A, E] := Pr[W0 ] − Pr[W1 ]
where Wb is the event that A outputs 1 in Experiment b, for b = 0, 1. We say that E is semantically
secure for n-way nesting if NE(n) adv[A, E] is negligible.
Theorem 2.12. For every constant n > 0, if E = (E, D) is semantically secure then E is seman-
tically secure for n-way nesting.
In particular, for every n-way nested adversary A attacking En , there exists a semantic security
adversary B attacking E, where B is an elementary wrapper around A, such that
NE(n) adv[A, E] = SSadv[B, E] .

The proof of this theorem is a good exercise in security reductions. We leave it for Exercise 2.15.

37
2.5 Notes
The one time pad is due to Gilbert Vernam in 1917, although there is evidence that it was discovered
earlier [15].
Citations to the literature to be added.

2.6 Exercises

2.1 (multiplicative one-time pad). We may also define a “multiplication mod p” variation of
the one-time pad. This is a cipher E = (E, D), defined over (K, M, C), where K := M := C :=
{1, . . . , p − 1}, where p is a prime. Encryption and decryption are defined as follows:

E(k, m) := k · m mod p D(k, c) := k −1 · c mod p.

Here, k −1 denotes the multiplicative inverse of k modulo p. Verify the correctness property for this
cipher and prove that it is perfectly secure.
2.2 (A good substitution cipher). Consider a variant of the substitution cipher E = (E, D)
defined in Example 2.3 where every symbol of the message is encrypted using an independent
permutation. That is, let M = C = ΣL for some a finite alphabet of symbols Σ and some L. Let
the key space be K = S L where S is the set of all permutations on Σ. The encryption algorithm
E(k, m) is defined as

E(k, m) := k[0](m[0]), k[1](m[1]), . . . , k[L − 1](m[L − 1])

Show that E is perfectly secure.


2.3 (A broken one-time pad). Consider a variant of the one time pad with message space
{0, 1}L where the key space K is restricted to all L-bit strings with an even number of 1’s. Give an
efficient adversary whose semantic security advantage is 1.
2.4 (Encryption chain). Let E = (E, D) be a cipher defined over (K, M, C) where K = M. Let
E 0 = (E 0 , D0 ) be a cipher where encryption is defined as
 
E 0 (k1 , k2 ), m := E(k1 , k2 ), E(k2 , m) ∈ C 2 .


Show that if E is perfectly secure then so is E 0 . Exercise 3.2 describes an application for this
encryption scheme.
2.5 (A stronger impossibility result). This exercise generalizes Shannon’s theorem (Theo-
rem 2.5). Let E be a cipher defined over (K, M, C). Suppose that SSadv[A, E] ≤  for all adversaries
A, even including computationally unbounded ones. Show that |K| ≥ (1 − )|M|.
2.6 (A matching bound). This exercise develops a converse of sorts for the previous exercise.
For j = 1, . . . , L, let  := 1/2j . Consider the L-bit one-time pad variant E defined over (K, M, C)
where M = C = {0, 1}L . The key space K is restricted to all L-bit strings whose first j bits are
not all zero, so that |K| = (1 − )|M|. Show that:
(a) there is an efficient adversary A such that SSadv[A, E] = /(1 − );

38
(b) for all adversaries A, even including computationally unbounded ones, SSadv[A, E] ≤ /(1 − ).
Note: Since the advantage of A in part (a) is non-zero, the cipher E cannot be perfectly secure.
2.7 (Deterministic ciphers). In this exercise, you are asked to prove in detail the claims made
in Example 2.9. Namely, show that if E is a deterministic cipher that is perfectly secure, then
SSadv[A, E] = 0 for every adversary A (bearing in mind that A may be probabilistic); also show
that if E is the variable length one-time pad, then SSadv[A, E] = 0 for all adversaries A.
2.8 (Roulette). In Section 2.2.4, we argued that if value r is encrypted using a semantically
secure cipher, then a player’s odds of winning at Internet roulette are very close to those of real
roulette. However, our “roulette” game was quite simple. Suppose that we have a more involved
game, where different outcomes may result in different winnings. The rules are not so important,
but assume that the rules are easy to evaluate (given a bet and the number r) and that every bet
results in a payout of 0, 1, . . . , n dollars, where n is poly-bounded. Let µ be the expected winnings
in an optimal strategy for a real version of this game (with no encryption). Let µ0 be the expected
winnings of some (efficient) player in an Internet version of this game (with encryption). Show that
µ0 ≤ µ + , where  is negligible, assuming the cipher is semantically secure.
Hint: You may want to use the fact that if XPis a random variable taking values in the set
{0, 1, . . . , n}, the expected value of X is equal to ni=1 Pr[X ≥ i].
2.9. Prove Fact 2.6, using the formal definitions in Section 2.3.
2.10 (Exercising the definition of semantic security). Let E = (E, D) be a semantically
secure cipher defined over (K, M, C), where M = C = {0, 1}L . Which of the following encryption
algorithms yields a semantically secure scheme? Either give an attack or provide a security proof
via an explicit reduction.
(a) E1 (k, m) := 0 k E(k, m)

(b) E2 (k, m) := E(k, m) k parity(m)

(c) E3 (k, m) := reverse(E(k, m))

(d) E4 (k, m) := E(k, reverse(m))


Here, for a bit string s, parity(s) is 1 if the number of 1’s in s is odd, and 0 otherwise; also,
reverse(s) is the string obtained by reversing the order of the bits in s, e.g., reverse(1011) = 1101.
2.11 (Key recovery attacks). Let E = (E, D) be a cipher defined over (K, M, C). A key recovery
attack is modeled by the following game between a challenger and an adversary A: the challenger
R
chooses a random key k in K, a random message m in M, computes c ← E(k, m), and sends (m, c)
to A. In response A outputs a guess k̂ in K. We say that A wins the game if D(k̂, c) = m and define
KRadv[A, E] to be the probability that A wins the game. As usual, we say that E is secure against
key recovery attacks if for all efficient adversaries A the advantage KRadv[A, E] is negligible.
(a) Show that the one-time pad is not secure against key recovery attacks.

(b) Show that if E is semantically secure and  = |K|/|M| is negligible, then E is secure against key
recovery attacks. In particular, show that for every efficient key-recovery adversary A there

39
is an efficient semantic security adversary B, where B is an elementary wrapper around A,
such that
KRadv[A, E] ≤ SSadv[B, E] + 
Hint: Your semantic security adversary B will output 1 with probability KRadv[A, E] in the
semantic security Experiment 1, and output 1 with probability at most  in Experiment 0.
Deduce from this a lower bound on SSadv[B, E] in terms of  and KRadv[A, E] from which
the result follows.

(c) Deduce from part (b) that if E is semantically secure and |M| is super-poly, then |K| cannot
be poly-bounded.
Note: |K| can be poly-bounded when |M| is poly-bounded, as in the one-time pad.
2.12 (Security against message recovery). In Section [Link] we developed the notion of
security against message recovery. Construct a cipher that is secure against message recovery, but
is not semantically secure.
2.13 (Advantage calculations in simple settings). Consider the following two experiments
Experiment 0 and Experiment 1:
• In Experiment 0 the challenger flips a fair coin (probability 1/2 for HEADS and 1/2 for
TAILS) and sends the result to the adversary A.

• In Experiment 1 the challenger always sends TAILS to the adversary.


The adversary’s goal is to distinguish these two experiments: at the end of each experiment the
adversary outputs a bit 0 or 1 for its guess for which experiment it is in. For b = 0, 1 let Wb
be the event that in experiment b the adversary output 1. The adversary tries to maximize its
distinguishing advantage, namely the quantity

Pr[W0 ] − Pr[W1 ] ∈ [0, 1] .

If the advantage is negligible for all efficient adversaries then we say that the two experiments are
indistinguishable.
(a) Calculate the advantage of each of the following adversaries:
(i) A1 : Always output 1.
(ii) A2 : Ignore the result reported by the challenger, and randomly output 0 or 1 with even
probability.
(iii) A3 : Output 1 if HEADS was received from the challenger, else output 0.
(iv) A4 : Output 0 if HEADS was received from the challenger, else output 1.
(v) A5 : If HEADS was received, output 1. If TAILS was received, randomly output 0 or 1
with even probability.

(b) What is the maximum advantage possible in distinguishing these two experiments? Explain
why.

40
2.14 (Permutation cipher). Consider the following cipher (E, D) defined over (K, M, C) where
C = M = {0, 1}` and K is the set of all `! permutations of the set {0, . . . , ` − 1}. For a key k ∈ K
and message m ∈ M define E(k, m) to be result of permuting the bits of m using the permutation
k, namely E(k, m) = m[k(0)]...m[k(` − 1)]. Show that this cipher is not semantically secure by
showing an adversary that achieves advantage 1.
2.15 (Nested encryption). For a cipher E = (E, D) with key space K define the nested cipher
E 0 = (E 0 , D0 ) as
E 0 (k0 , k1 ), m := E k1 , E(k0 , m) and D0 (k0 , k1 ), c := D(k0 , D(k1 , c)) .
  

Our goal is to show that if E is semantically secure then E 0 is semantically secure even if the
adversary is given one of the keys k0 or k1 .
(a) Consider the following two semantic security experiments, Experiments 0 and 1. For b = 0, 1,
in Experiment b the adversary first sends to the challenger two messages m0 and m1 . The
R
challenger chooses keys k0 , k1 ← R
K and sends back the pair (k1 , c) where c ← E 0 (k0 , k1 ), mb ).
Finally, the adversary outputs b̂ in {0, 1}. We define the adversary’s advantage, NEadv[A, E],
as in the usual the definition of semantic security. Show that for every nested encryption
adversary A attacking E 0 , there is a semantic security adversary B attacking E, where B is an
elementary wrapper around A, such that
NEadv[A, E] = SSadv[B, E] .
Draw a diagram with A on the right, B in the middle, and B’s challenger on the left. Show
the message flow between these three parties that takes place in your proof of security.
(b) Repeat part (a), but now when the adversary gets back the pair (k0 , c) from the challenger
R
(i.e., k1 is replaced by k0 ), where c ← E 0 (k0 , k1 ), mb ) as before. Draw a diagram describing
the message flow in your proof of security as you did in part (a).
This problem comes up in the context of anonymous routing on the Internet as discussed in Sec-
tion 2.4.
2.16 (Self referential encryption). Let us show that encrypting a key under itself can be
dangerous. Let E be a semantically secure cipher defined over (K, M, C), where K ⊆ M, and let
R
k ← K. A ciphertext c∗ := E(k, k), namely encrypting k using k, is called a self referential
encryption.
(a) Construct a cipher Ẽ = (Ẽ, D̃) derived from E such that Ẽ is semantically secure, but becomes
insecure if the adversary is given Ẽ(k, k). You have just shown that semantic security does
not imply security when one encrypts one’s key.
(b) Construct a cipher Ê = (Ê, D̂) derived from E such that Ê is semantically secure and remains
semantically secure (provably) even if the adversary is given Ê(k, k). To prove that Ê is
semantically secure, you should show the following: for every adversary A that attacks Ê,
there exists and adversary B that attacks E such that (i) the running time B is about the
same as that of A, and (ii) SSadv[A, Ê] ≤ SSadv[B, E].
2.17 (Compression and encryption). Two standards committees propose to save bandwidth
by combining compression (such as the Lempel-Ziv algorithm used in the zip and gzip programs)
with encryption. Both committees plan on using the variable length one time pad for encryption.

41
• One committee proposes to compress messages before encrypting them. Explain why this is
a bad idea.
Hint: Recall that compression can significantly shrink the size of some messages while having
little impact on the length of other messages.
• The other committee proposes to compress ciphertexts after encryption. Explain why this is
a bad idea.
Over the years many problems have surfaced when combining encryption and compression. The
CRIME [136] and BREACH [131] attacks are good representative examples.
2.18 (Voting protocols). This exercise develops a simple voting protocol based on the additive
one-time pad (Example 2.4). Suppose we have t voters and a counting center. Each voter is going
to vote 0 or 1, and the counting center is going to tally the votes and broadcast the total sum S.
However, they will use a protocol that guarantees that no party (voter or counting center) learns
anything other than S (but we shall assume that each party faithfully follows the protocol).
The protocol works as follows. Let n > t be an integer. The counting center generates an encryption
R
of 0: c0 ← {0, . . . , n − 1}, and passes c0 to voter 1. Voter 1 adds his vote v1 to c0 , computing
c1 ← c0 + v1 mod n, and passes c1 to voter 2. This continues, with each voter i adding vi to ci−1 ,
computing ci ← ci−1 + vi mod n, and passing ci to voter i + 1, except that voter t passes ct to the
counting center. The counting center computes the total sum as S ← ct − c0 mod n, and broadcasts
S to all the voters.
(a) Show that the protocol correctly computes the total sum.
(b) Show that the protocol is perfectly secure in the following sense. For voter i = 1, . . . , t, define
View i := (S, ci−1 ), which represents the “view” of voter i. We also define View 0 := (c0 , ct ),
which represents the “view” of the counting center. Show that for each i = 0, . . . , t and
S = 0, . . . , t, the following holds:

P of votes v1 , . . . , vt varies, subject to the restrictions that each vj ∈


as the choice
{0, 1} and tj=1 vj = S, the distribution of View i remains the same.
(c) Show that if two voters i, j collude, they can determine the vote of a third voter k. You are
free to choose the indices i, j, k.
2.19 (Two-way split keys). Let E = (E, D) be a semantically secure cipher defined over
(K, M, C) where K = {0, 1}d . Suppose we wish to split the ability to decrypt ciphertexts across
two parties, Alice and Bob, so that both parties are needed to decrypt ciphertexts. For a random
key k in K choose a random r in K and define ka := r and kb := k ⊕ r. Now if Alice and Bob get
together they can decrypt a ciphertext c by first reconstructing the key k as k = ka ⊕ kb and then
computing D(k, c). Our goal is to show that neither Alice nor Bob can decrypt ciphertexts on their
own.
(a) Formulate a security notion that captures the advantage that an adversary has in break-
ing semantic security given Bob’s key kb . Denote this 2-way key splitting advantage by
2KSadv[A, E].
(b) Show that for every 2-way key splitting adversary A there is a semantic security adversary B
such that 2KSadv[A, E] = SSadv[B, E].

42
2.20 (Simple secret sharing). Let E = (E, D) be a semantically secure cipher with key space
K = {0, 1}L . A bank wishes to split a decryption key k ∈ {0, 1}L into three shares p0 , p1 , and p2
so that two of the three shares are needed for decryption. Each share can be given to a different
bank executive, and two of the three must contribute their shares for decryption to proceed. This
way, decryption can proceed even if one of the executives is out sick, but at least two executives
are needed for decryption.
(a) To do so the bank generates two random pairs (k0 , k00 ) and (k1 , k10 ) so that k0 ⊕k00 = k1 ⊕k10 = k.
How should the bank assign shares so that any two shares enable decryption using k, but no
single share can decrypt?
Hint: The first executive will be given the share p0 := (k0 , k1 ).

(b) Generalize the scheme from part (a) so that 3-out-of-5 shares are needed for decryption.
Reconstituting the key only uses XOR of key shares. Two shares should reveal nothing about
the key k.

(c) More generally, we can design a t-out-of-w system this way for any t < w. How does the size
of each share scale with t? We will see a much better way to do this in Chapter 22
2.21 (Simple threshold decryption). Let E = (E, D) be a semantically secure cipher with key
space K. In this exercise we design a system that lets a bank split a key k into three shares p0 , p1 ,
and p2 so that two of the three shares are needed for decryption, as in Exercise 2.20. However,
decryption is done without ever reconstituting the complete key at a single location.
We use nested encryption from Exercise 2.15. Choose a random key k := (k0 , k1 , k2 , k3 ) in K4 and
encrypt a message m as:
 
R
 
c ← E k1 , E(k0 , m) , E k3 , E(k2 , m) .

(a) Construct the shares p0 , p1 , p2 so that any two shares enable decryption, but no single share
can decrypt. Hint: the first share is p0 := (k0 , k3 ).
Discussion: Suppose the entities holding shares p0 and p2 are available to decrypt. To
decrypt a ciphertext c, first send c to the entity holding p2 to partially decrypt c. Then
forward the result to the entity holding p0 to complete the decryption. This way, decryption
is done without reconstituting the complete key k at a single location.

(b) Generalize the scheme from part (a) so that 3-out-of-5 shares are needed for decryption.
Explain how decryption can be done without reconstituting the key in a single location.
An encryption scheme where the key can be split into shares so that t-out-of-w shares are needed
for decryption, and decryption does not reconstitute the key at a single location, is said to provide
threshold decryption. We will see a much better way to do this in Chapter 22.
2.22 (Bias correction). Consider again the bit-guessing version of the semantic security attack
game (i.e., Attack Game 2.4). Suppose an efficient adversary A wins the game (i.e., guesses the
hidden bit b) with probability 1/2 + , where  is non-negligible. Note that  could be positive or
negative (the definition of negligible works on absolute values). Our goal is to show that there is
another efficient adversary B that wins the game with probability 1/2+0 , where 0 is non-negligible
and positive.

43
(a) Consider the following adversary B that uses A as a subroutine in Attack Game 2.4 in the
following two-stage attack. In the first stage, B plays challenger to A, but B generates its
own hidden bit b0 , its own key k0 , and eventually A outputs its guess-bit b̂0 . Note that in
this stage, B’s challenger in Attack Game 2.4 is not involved at all. In the second stage, B
restarts A, and lets A interact with the “real” challenger in Attack Game 2.4, and eventually
A outputs a guess-bit b̂. When this happens, B outputs b̂ ⊕ b̂0 ⊕ b0 . Note that this run of A
is completely independent of the first — the coins of A and also the system parameters are
generated independently in these two runs.
Show that B wins Attack Game 2.4 with probability 1/2 + 22 .

(b) One might be tempted to argue as follows. Just construct an adversary B that runs A, and
when A outputs b̂, adversary B outputs b̂ ⊕ 1. Now, we do not know if  is positive or
negative. If it is positive, then A satisfies our requirements. If it is negative, then B satisfies
our requirements. Although we do not know which one of these two adversaries satisfies our
requirements, we know that one of them definitely does, and so existence is proved.
What is wrong with this argument? The explanation requires an understanding of the math-
ematical details regarding security parameters (see Section 2.3).

(c) Can you come up with another efficient adversary B 0 that wins the bit-guessing game with
probability at least 1/2 + ||/2? Your adversary B 0 will be less efficient than B.
Hint: try running the first stage of adversary B multiple times.

44

You might also like