DATA AND COMPUTER COMMUNICATIONS,EIGHTH EDITION
A comprehensive survey that has become the standard in the field,covering
(1) data communications,including transmission, media, signal encoding,link
control,and multiplexing; (2) communication networks, including circuit- and
packet-switched,frame relay, ATM,and LANs; (3) the TCP/IP protocol suite,
including IPv6,TCP,MIME, and HTTP,as well as a detailed treatment of
network security.Received the 2007 Text and Academic Authors Association
(TAA) award for the best Computer Science and Engineering Textbook of the
year.ISBN 0-13-243310-9
COMPUTER ORGANIZATION AND ARCHITECTURE,
EIGHTH EDITION
A unified view of this broad field.Covers fundamentals such as CPU, control
unit,microprogramming, instruction set, I/O,and memory.Also covers
advanced topics such as RISC,superscalar, and parallel organization. Fourth
and fifth editions received the TAA award for the best Computer Science and
Engineering Textbook of the year.ISBN 978-0-13-607373-4
OPERATING SYSTEMS,SIXTH EDITION
A state-of-the art survey of operating system principles.Covers fundamental
technology as well as contemporary design issues,such as threads,
microkernels,SMPs, real-time systems,multiprocessor scheduling, embedded
OSs,distributed systems, clusters,security, and object-oriented design.
Received the 2009 Text and Academic Authors Association (TAA) award
for the best Computer Science and Engineering Textbook of the year.
ISBN 978-0-13-600632-9
BUSINESS DATA COMMUNICATIONS,SIXTH EDITION
A comprehensive presentation of data communications and
telecommunications from a business perspective.Covers voice, data, image,
and video communications and applications technology and includes a number
of case studies.ISBN 978-0-13-606741-2
COMPUTER NETWORKS WITH INTERNET PROTOCOLS
AND TECHNOLOGY
An up-to-date survey of developments in the area of Internet-based protocols
and algorithms.Using a top-down approach, this book covers applications,
transport layer,Internet QoS, Internet routing, data link layer and computer
networks,security, and network management.ISBN 0-13141098-9
THE WILLIAM STALLINGS BOOKS ON COMPUTER
NETWORK SECURITY ESSENTIALS,FOURTH EDITION
A tutorial and survey on network security technology.The book covers
important network security tools and applications,including S/MIME, IP
Security,Kerberos,SSL/TLS, SET,and X509v3. In addition, methods for
countering hackers and viruses are explored.
COMPUTER SECURITY (with Lawrie Brown)
A comprehensive treatment of computer security technology,including
algorithms,protocols, and applications.Covers cryptography, authentication,
access control,database security, intrusion detection and prevention, malicious
software,denial of service, firewalls,software security, physical security,human
factors,auditing, legal and ethical aspects,and trusted systems. Received the
2008 Text and Academic Authors Association (TAA) award for the best
Computer Science and Engineering Textbook of the year.ISBN 0-13-600424-5
WIRELESS COMMUNICATIONS AND NETWORKS,Second Edition
A comprehensive,state-of-the art survey. Covers fundamental wireless
communications topics,including antennas and propagation, signal encoding
techniques,spread spectrum, and error correction techniques.Examines
satellite,cellular, wireless local loop networks and wireless LANs,including
Bluetooth and 802.11.Covers Mobile IP and WAP.ISBN 0-13-191835-4
HIGH-SPEED NETWORKS AND INTERNETS,SECOND EDITION
A state-of-the art survey of high-speed networks.Topics covered include TCP
congestion control,ATM traffic management, Internet traffic management,
differentiated and integrated services,Internet routing protocols and multicast
routing protocols,resource reservation and RSVP,and lossless and lossy
compression.Examines important topic of self-similar data traffic.
ISBN 0-13-03221-0
AND DATA COMMUNICATIONS TECHNOLOGY
CRYPTOGRAPHY AND
NETWORK SECURITY
PRINCIPLES AND PRACTICE
FIFTH EDITION
William Stallings
Prentice Hall
Boston Columbus Indianapolis New York San Francisco
Upper Saddle River Amsterdam Cape Town Dubai London Madrid
Milan Munich Paris Montreal Toronto Delhi Mexico City Sao Paulo
Sydney Hong Kong Seoul Singapore Taipei Tokyo
Vice President and Editorial Director,ECS:
Marcia Horton
Executive Editor:Tracy Dunkelberger
Associate Editor: Melinda Haggerty
Editorial Assistant:Allison Michael
Senior Managing Editor: Scott Disanno
Production Editor: Rose Kernan
Senior Operations Supervisor:Alan Fischer
Operations Specialist: Lisa McDowell
Cover Design:Black Horse Designs
Art Director:Kr istine Carney
Director,Image Resource Center: Melinda
Patelli
Manager,Rights and Permissions: Zina Arabia
Senior Marketing Manager:Er in Davis
Manager,Visual Research:Beth Brenzel
Manager,Cover Visual Research & Permissions:
Karen Sanatar
Composition:Integ ra
Printer/Binder: Edwards Brothers
Credits and acknowledgments borrowed from other sources and reproduced,with permission,in this textbook
appear on appropriate page within text.
If you purchased this book within the United States or Canada you should be aware that it has been
wrongfully imported without the approval of the Publisher or the Author.
Copyright © 2011,2006 Pearson Education, Inc., publishing as Prentice Hall.All rights reserved.
Manufactured in the United States of America.This publication is protected by Copyright,and permission
should be obtained from the publisher prior to any prohibited reproduction,storage in a retrieval system,or
transmission in any form or by any means,electronic,mechanical, photocopying, recording,or likewise.To
obtain permission(s) to use material from this work,please submit a written request to Pearson Education, Inc.,
Permissions Department,1 Lake Street, Upper Saddle River,NY 07458
Many of the designations by manufacturers and seller to distinguish their products are claimed as trademarks.
Where those designations appear in this book,and the publisher was aware of a trademark claim,the
designations have been printed in initial caps or all caps.
10987654321
ISBN 10: 0-13-609704-9
ISBN 13:978-0-13-609704-4
Library of Congress Cataloging-in-Publication Data On File
To Antigone never
dull never boring
the smartest
person I know
This page intentionally left blank
CONTENTS
Notation xiii
Preface xv
About the Author xxiii
Chapter 0 Reader’s Guide 1
0.1 Outline of This Book 2
0.2 A Roadmap for Readers and Instructors 2
0.3 Internet and Web Resources4
0.4 Standards 5
Chapter 1 Overview 7
1.1 Computer Security Concepts 9
1.2 The OSI Security Architecture14
1.3 Security Attacks 15
1.4 Security Services 19
1.5 Security Mechanisms 23
1.6 A Model for Network Security 25
1.7 Recommended Reading and Web Sites27
1.8 Key Terms,Review Questions, and Problems 29
PART ONE SYMMETRIC CIPHERS31
Chapter 2 Classical Encryption Techniques31
2.1 Symmetric Cipher Model 33
2.2 Substitution Techniques38
2.3 Transposition Techniques 53
2.4 Rotor Machines 55
2.5 Steganography 57
2.6 Recommended Reading and Web Sites59
2.7 Key Terms,Review Questions, and Problems 60
Chapter 3 Block Ciphers and the Data Encryption Standard 66
3.1 Block Cipher Principles 68
3.2 The Data Encryption Standard (DES) 77
3.3 A DES Example 85
3.4 The Strength of DES 88
3.5 Differential and Linear Cryptanalysis 89
3.6 Block Cipher Design Principles 92
3.7 Recommended Reading and Web Site96
3.8 Key Terms,Review Questions, and Problems 97
Chapter 4 Basic Concepts in Number Theory and Finite Fields 101
4.1 Divisibility and the Division Algorithm103
4.2 The Euclidean Algorithm105
v
vi CONTENTS
4.3 Modular Arithmetic 108
4.4 Groups,Rings, and Fields 116
4.5 Finite Fields of the Form GF(p) 120
4.6 Polynomial Arithmetic 122
4.7 Finite Fields of the Form GF(2
n
) 129
4.8 Recommended Reading and Web Sites141
4.9 Key Terms,Review Questions, and Problems 141
Appendix 4AThe Meaning of mod 144
Chapter 5 Advanced Encryption Standard 47
5.1 The Origins AES 148
5.2 AES Structure 150
5.3 AES Round Functions 155
5.4 AES Key Expansion166
5.5 An AES Example 169
5.6 AES Implementation 174
5.7 Recommended Reading and Web Sites178
5.8 Key Terms,Review Questions, and Problems 179
Appendix 5A Polynomials with Coefficients in GF(2
8
) 180
Appendix 5B Simplified AES 183
Chapter 6 Block Cipher Operation 192
6.1 Multiple Encryption and Triple DES193
6.2 Electronic Codebook Mode198
6.3 Cipher Block Chaining Mode 201
6.4 Cipher Feedback Mode 203
6.5 Output Feedback Mode 205
6.6 Counter Mode 206
6.7 XTS Mode for Block-Oriented Storage Devices 210
6.8 Recommended Web Site 214
6.9 Key Terms,Review Questions, and Problems 214
Chapter 7 Pseudorandom Number Generation and Stream Ciphers 218
7.1 Principles of Pseudorandom Number Generation 219
7.2 Pseudorandom Number Generators 226
7.3 Pseudorandom Number Generation Using a Block Cipher 229
7.4 Stream Ciphers 232
7.5 RC4 234
7.6 True Random Numbers237
7.7 Recommended Reading 238
7.8 Key Terms,Review Questions, and Problems 239
PART TWO ASYMMETRIC CIPHERS 243
Chapter 8 More Number Theory 243
8.1 Prime Numbers 245
8.2 Fermat’s and Euler’s Theorems248
8.3 Testing for Primality251
8.4 The Chinese Remainder Theorem254
CONTENTS vii
8.5 Discrete Logarithms 257
8.6 Recommended Reading and Web Sites262
8.7 Key Terms,Review Questions, and Problems 263
Chapter 9 Public-Key Cryptography and RSA 266
9.1 Principles of Public-Key Cryptosystems 269
9.2 The RSA Algorithm277
9.3 Recommended Reading and Web Sites291
9.4 Key Terms,Review Questions, and Problems 291
Appendix 9A Proof of the RSA Algorithm296
Appendix 9B The Complexity of Algorithms297
Chapter 10 Other Public-Key Cryptosystems 300
10.1 Diffie-Hellman Key Exchange 301
10.2 ElGamal Cryptosystem 305
10.3 Elliptic Curve Arithmetic 308
10.4 Elliptic Curve Cryptography 317
10.5 Pseudorandom Number Generation Based on an Asymmetric Cipher 321
10.6 Recommended Reading and Web Sites323
10.7 Key Terms,Review Questions,and Problems 324
PART THREE CRYPTOGRAPHIC DATA INTEGRITY ALGORITHMS327
Chapter 11 Cryptographic Hash Functions 327
11.1 Applications of Cryptographic Hash Functions 329
11.2 Two Simple Hash Functions 333
11.3 Requirements and Security 335
11.4 Hash Functions Based on Cipher Block Chaining 341
11.5 Secure Hash Algorithm (SHA) 342
11.6 SHA-3 352
11.7 Recommended Reading and Web Sites353
11.8 Key Terms,Review Questions,and Problems 353
Appendix 11A Mathematical Basis of Birthday Attack356
Chapter 12 Message Authentication Codes 362
12.1 Message Authentication Requirements 364
12.2 Message Authentication Functions 365
12.3 Message Authentication Codes 372
12.4 Security of MACs 374
12.5 MACs Based on Hash Functions:HMAC 375
12.6 MACs Based on Block Ciphers:DAA and CMAC 380
12.7 Authenticated Encryption:CCM and GCM 383
12.8 Pseudorandom Number Generation Using Hash Functions and MACs389
12.9 Recommended Reading 392
12.10 Key Terms,Review Questions,and Problems 393
Chapter 13 Digital Signatures 395
13.1 Digital Signatures 396
13.2 ElGamal Digital Signature Scheme 400
viii CONTENTS
13.3 Schnorr Dig ital Signature Scheme402
13.4 Digital Signature Standard (DSS) 403
13.5 Recommended Reading and Web Sites406
13.6 Key Terms,Review Questions,and Problems 407
PART FOUR MUTUAL TRUST410
Chapter 14 Key Management and Distribution 410
14.1 Symmetric Key Distribution Using Symmetr ic Encryption 412
14.2 Symmetric Key Distribution Using Asymmetric Encryption 421
14.3 Distribution of Public Keys 423
14.4 X.509 Certificates 428
14.5 Public Key Infrastructure 436
14.6 Recommended Reading and Web Sites438
14.7 Key Terms,Review Questions,and Problems 439
Chapter 15 User Authentication Protocols 444
15.1 Remote User Authentication Principles 445
15.2 Remote User Authentication Using Symmetric Encryption 448
15.3 Kerberos 452
15.4 Remote User Authentication Using Asymmetric Encryption 470
15.5 Federated Identity Management 472
15.6 Recommended Reading and Web Sites478
15.7 Key Terms,Review Questions,and Problems 479
Appendix 15A Kerberos Encryption Techniques481
PART FIVE NETWORK AND INTERNET SECURITY 485
Chapter 16 Transport-Level Security 485
16.1 Web Security Issues 486
16.2 Secure Sockets Layer (SSL) 489
16.3 Transport Layer Security (TLS) 502
16.4 HTTPS 506
16.5 Secure Shell (SSH) 508
16.6 Recommended Reading and Web Sites519
16.7 Key Terms,Review Questions,and Problems 519
Chapter 17 Wireless Network Security 521
17.1 IEEE 802.11 Wireless LAN Overview 523
17.2 IEEE 802.11i Wireless LAN Security 529
17.3 Wireless Application Protocol Overview 543
17.4 Wireless Transport Layer Security550
17.5 WAP End-to-End Security 560
17.6 Recommended Reading and Web Sites563
17.7 Key Terms,Review Questions,and Problems 563
Chapter 18 Electronic Mail Security 567
18.1 Pretty Good Privacy (PGP) 568
18.2 S/MIME 587
CONTENTS ix
18.3 DomainKeys Identified Mail (DKIM) 603
18.4 Recommended Web Sites 610
18.5 Key Terms,Review Questions,and Problems 611
Appendix 18A Radix-64 Conversion612
Chapter 19 IP Security 615
19.1 IP Security Overview 616
19.2 IP Security Policy 622
19.3 Encapsulating Security Payload 627
19.4 Combining Security Associations 634
19.5 Internet Key Exchange 638
19.6 Cryptographic Suites 647
19.7 Recommended Reading and Web Sites648
19.8 Key Terms,Review Questions,and Problems 649
APPENDICES 651
Appendix A Projects for Teaching Cryptography and Network Security 651
A.1 Sage Computer Algebra Projects652
A.2 Hacking Project653
A.3 Block Cipher Projects653
A.4 Laboratory Exercises654
A.5 Research Projects654
A.6 Programming Projects655
A.7 Practical Security Assessments655
A.8 Writing Assignments655
A.9 Reading/Report Assignments656
Appendix B Sage Examples 657
B.1 Chapter 2:Classical Encryption Techniques659
B.2 Chapter 3:Block Ciphers and the Data Encryption Standard 662
B.3 Chapter 4:Basic Concepts in Number Theory and Finite Fields 666
B.4 Chapter 5:Advanced Encryption Standard 673
B.5 Chapter 6:Pseudorandom Number Generation and Stream Ciphers 678
B.6 Chapter 8:Number Theory 680
B.6 Chapter 9:Public-Key Cryptography and RSA 685
B.7 Chapter 10:Other Public-Key Cryptosystems 688
B.8 Chapter 11:Cryptographic Hash Functions 693
B.9 Chapter 13:Digital Signatures 695
References 699
Index 711
ONLINE CHAPTERS
PART SIX SYSTEM SECURITY
Chapter 20 Intruders
20.1 Intruders
20.2 Intrusion Detection
x CONTENTS
20.3 Password Management
20.4 Recommended Reading and Web Sites
20.5 Key Terms,Review Questions,and Problems
Appendix 20A The Base-Rate Fallacy
Chapter 21 Malicious Software
21.1 Types of Malicious Software
21.2 Viruses
21.3 Virus Counter measures
21.4 Wo r m s
21.5 Distributed Denial of Service Attacks
21.6 Recommended Reading and Web Sites
21.7 Key Terms,Review Questions,and Problems
Chapter 22 Firewalls
22.1 The Need for Firewalls
22.2 Firewall Characteristics
22.3 Types of Firewalls
22.4 Firewall Basing
22.5 Firewall Location and Configurations
22.6 Recommended Reading and Web Sites
22.7 Key Terms,Review Questions,and Problems
PART SEVEN LEGAL AND ETHICAL ISSUES
Chapter 23 Legal and Ethical Issues
23.1 Cybercrime and Computer Cr ime
23.2 Intellectual Property
23.3 Privacy
23.4 Ethical Issues
23.5 Recommended Reading and Web Sites
23.6 Key Terms,Review Questions,and Problems
ONLINE APPENDICES
WilliamStallings.com/Crypto/Crypto5e.html
Appendix C Sage Problems
C.1 Getting Started with Sage
C.2 Programming with Sage
C.3 Chapter 2: Classical Encryption Techniques
C.4 Chapter 3: Block Ciphers and the Data Encryption Standard
C.5 Chapter 4: Basic Concepts in Number Theory and Finite Fields
C.6 Chapter 5: Advanced Encryption Standard
C.7 Chapter 7: Pseudorandom Number Generation and Stream Ciphers
C.8 Chapter 8: Number Theory
C.9 Chapter 9: Public-Key Cryptography and RSA
C.10 Chapter 10:Other Public-Key Cryptosystems
C.11 Chapter 11:Cr yptographic Hash Functions
C.12 Chapter 13:Dig ital Signatures
CONTENTS xi
Appendix D Standards and Standards-Setting Organizations
D.1 The Importance of Standards
D.2 Internet Standards and the Internet Society
D.3 National Institute of Standards and Technology
Appendix E Basic Concepts from Linear Algebra
E.1 Operations on Vectors and Matrices
E.2 Linear Algebra Operations over Z
n
Appendix F Measures of Security and Secrecy
F.1 Perfect Secrecy
F.2 Information and Entropy
F.3 Entropy and Secrecy
Appendix G Simplified DES
G.1 Overview
G.2 S-DES Key Generation
G.3 S-DES Encryption
G.4 Analysis of Simplified DES
G.5 Relationship to DES
Appendix H Evaluation Criteria for AES
H.1 The Origins of AES
H.2 AES Evaluation
Appendix I More on Simplified AES
I.1 Arithmetic in GF(2
4
)
I.2 The Mix Column Function
Appendix J Knapsack Public-Key Algorithm
J.1 The Knapsack Problem
J.2 The Knapsack Cryptosystem
J.3 Example
Appendix K Proof of the Digital Signature Algorithm
Appendix L TCP/IP and OSI
L.1 Protocols and Protocol Architectures
L.2 The TCP/IP Protocol Architecture
L.3 The Role of an Internet Protocol
L.4 IPv4
L.5 IPv6
L.6 The OSI Protocol Architecture
Appendix M Java Cryptographic APIs
M.1 Introduction
M.2 JCA and JCE Architecture
M.3 JCA Classes
M.4 JCE Classes
M.5 Conclusion and References
xii CONTENTS
M.6 Using the Cryptog raphic Application
M.7 JCA/JCE Cryptog raphy Example
Appendix N The Whirlpool Hash Function
N.1 Whirlpool Hash Structure
N.2 Block Cipher W
N.3 Performance of Whirlpool
Appendix O Data Compression Using ZIP
O.1 Compression Algorithm
O.2 Decompression Algorithm
Appendix P PGP Random Number Generation
P.1 True Random Numbers
P.2 Pseudorandom Numbers
Appendix Q International Reference Alphabet
Glossary
NOTATION
Symbol Expression Meaning
D, K D1K, Y2
Symmetric decryption of ciphertext using secret key KY
D, PR
a
D1PR
a
, Y2 Asymmetric decryption of ciphertext using A’s private key PR
a
Y
D, PU
a
D1PU
a
, Y2 Asymmetric decryption of ciphertext using A’s public key PU
a
Y
E, K E1K, X2
Symmetric encryption of plaintext using secret key KX
E, PR
a
E( , )XPR
a
Asymmetric encryption of plaintext using A’s private key PR
a
X
E,PU
a
E( , )XPU
a
Asymmetric encryption of plaintext using A’s public key PU
a
X
K
Secret key
PR
a
Private key of user A
PU
a
Public key of user A
MAC,
K
MAC(K,X)
Message authentication code of message using secret key
KX
GF(p)
The finite field of order ,where is prime.The field is defined as
the set together with the arithmetic operations modulo .pZ
p
pp
GF(2
n
)
The finite field of order 2
n
Z
n
Set of nonnegative integers less than n
gcd
gcd( )i, j
Greatest common divisor;the largest positive integer that divides
both and with no remainder on division.ji
mod mod ma Remainder after division of by ma
mod, K
(mod )ma K b mod mod mm = ba
mod, [
(mod )ma [ b mod mod mm Z ba
dlog
dlog
a, p
(b) Discrete logarithm of the number for the base (mod )pab
w
f(n)
The number of positive integers less than and relatively prime to .
This is Euler’s totient function.
nn
©
a
n
i=1
a
i
a
1
+ a
2
+
Á
+ a
n
ß
q
n
i=1
a
i
a
1
* a
2
*
Á
* a
n
Even the natives have difficulty mastering this peculiar vocabulary.
The Golden Bough, Sir James George Frazer
xiii
xiv NOTATION
| i | j
idivides ,which means that there is no remainder when is divided
by i
jj
|, | | a |
Absolute value of a
|| x || y
concatenated with yx
L x L y
is approximately equal to yx
x
y
Exclusive-OR of and for single-bit variables;
Bitwise exclusive-OR of and for multiple-bit variablesyx
yx
:,;:x;
The largest integer less than or equal to x
x S
The element is contained in the set S.x
·
Á, a
k
2
A
·
(a
1
, a
2
,
The integer A corresponds to the sequence of integers
(, )a
2
, Á , a
k
a
1
xv
PREFACE
“The tie,if I might suggest it, sir,a shade more tightly knotted. One aims at the
perfect butterfly effect.If you will permit me —”
“What does it matter, Jeeves, at a time like this? Do you realize that
Mr.Little’s domestic happiness is hanging in the scale?”
“There is no time,sir, at which ties do not matter.”
Very Good,Jeeves! P.G.Wodehouse
In this age of universal electronic connectivity,of viruses and hackers, of electronic eaves-
dropping and electronic fraud,there is indeed no time at which security does not matter.Two
trends have come together to make the topic of this book of vital interest.First, the explosive
growth in computer systems and their interconnections via networks has increased the
dependence of both organizations and individuals on the information stored and communi-
cated using these systems.This, in turn, has led to a heightened awareness of the need to
protect data and resources from disclosure, to guarantee the authenticity of data and
messages, and to protect systems from network-based attacks.Second, the disciplines of
cryptography and network security have matured,leading to the development of practical,
readily available applications to enforce network security.
OBJECTIVES
It is the purpose of this book to provide a practical survey of both the principles and practice
of cryptography and network security.In the first part of the book, the basic issues to be
addressed by a network security capability are explored by providing a tutorial and survey of
cryptography and network security technology.The latter part of the book deals with the
practice of network security: practical applications that have been implemented and are in
use to provide network security.
The subject,and therefore this book, draws on a variety of disciplines.In particular, it is
impossible to appreciate the significance of some of the techniques discussed in this book
without a basic understanding of number theory and some results from probability theory.
Nevertheless,an attempt has been made to make the book self-contained.The book presents
not only the basic mathematical results that are needed but provides the reader with an
intuitive understanding of those results.Such background material is introduced as needed.
This approach helps to motivate the material that is introduced,and the author considers
this preferable to simply presenting all of the mathematical material in a lump at the begin-
ning of the book.
INTENDED AUDIENCE
The book is intended for both academic and a professional audiences.As a textbook, it is
intended as a one-semester undergraduate course in cryptography and network security for
computer science, computer engineering,and electrical engineering majors. It covers the
xvi PREFACE
material in IAS2 Security Mechanisms,a core area in the Information Technology body of
knowledge;NET4 Security,another core area in the Information Technology body of knowl-
edge; and IT311, Cryptography,an advanced course; these subject areas are part of the
ACM/IEEE Computer Society Computing Curricula 2005.
The book also serves as a basic reference volume and is suitable for self-study.
PLAN OF THE BOOK
The book is divided into seven parts (see Chapter 0for an overview):
Symmetric Ciphers
Asymmetric Ciphers
Cryptographic Data Integrity Algorithms
Mutual Trust
Network and Internet Security
System Security
Legal and Ethical Issues
The book includes a number of pedagogic features,including the use of the computer
algebra system Sage and numerous figures and tables to clarify the discussions.Each chapter
includes a list of key words,review questions, homework problems, suggestions for further
reading, and recommended Web sites. The book also includes an extensive glossary,a list
of frequently used acronyms, and a bibliography.In addition, a test bank is available to
instructors.
ONLINE DOCUMENTS FOR STUDENTS
For this new edition,a tremendous amount of original supporting material has been made
available online,in the following categories.
Online chapters: To limit the size and cost of the book, four chapters of the book
are provided in PDF format.This includes three chapters on computer security and
one on legal and ethical issues.The chapters are listed in this book’s table of
contents.
Online appendices: There are numerous interesting topics that support material
found in the text but whose inclusion is not warranted in the printed text.A total of
fifteen online appendices cover these topics for the interested student.The appen-
dices are listed in this book’s table of contents.
Homework problems and solutions: To aid the student in understanding the material,
a separate set of homework problems with solutions are available.These enable the
students to test their understanding of the text.
Key papers: Twenty-four papers from the professional literature,many hard to find,
are provided for further reading.
Supporting documents: A variety of other useful documents are referenced in the text
and provided online.
Sage code: The Sage code from the examples in Appendix B in case the student wants
to play around with the examples.
PREFACE xvii
Purchasing this textbook now grants the reader six months of access to this online
material.See the access card bound into the front of this book for details.
INSTRUCTIONAL SUPPORT MATERIALS
To support instructors,the following materials are provided:
Solutions Manual: Solutions to end-of-chapter Review Questions and Problems.
Projects Manual: Suggested project assignments for all of the project categories listed
below.
PowerPoint Slides:A set of slides covering all chapters, suitable for use in lecturing.
PDF Files: Reproductions of all figures and tables from the book.
Test Bank: A chapter-by-chapter set of questions.
All of these support materials are available at the Instructor Resource Center
(IRC) for this textbook, which can be reached via personhighered.com/stallings or by
clicking on the button labeled “Book Info and More Instructor Resources”at this book’s
Web Site WilliamStallings.com/Crypto/Crypto5e.html. To gain access to the IRC,
please contact your local Prentice Hall sales representative via pearsonhighered.com/
educator/replocator/requestSalesRep.page or call Prentice Hall Faculty Services at
1-800-526-0485.
INTERNET SERVICES FOR INSTRUCTORS AND STUDENTS
There is a Web site for this book that provides support for students and instructors.The site
includes links to other relevant sites,transparency masters of figures and tables in the book
in PDF (Adobe Acrobat) format, and PowerPoint slides. The Web page is at
WilliamStallings.com/Crypto/Crypto5e.html.For more information, see Chapter 0.
New to this edition is a set of homework problems with solutions available at this Web
site.Students can enhance their understanding of the material by working out the solutions
to these problems and then checking their answers.
An Internet mailing list has been set up so that instructors using this book can
exchange information, suggestions,and questions with each other and with the author. As
soon as typos or other errors are discovered,an errata list for this book will be available at
WilliamStallings.com. In addition, the Computer Science Student Resource site at
WilliamStallings.com/StudentSupport.html provides documents, information, and useful
links for computer science students and professionals.
PROJECTS AND OTHER STUDENT EXERCISES
For many instructors,an important component of a cryptography or security course is a pro-
ject or set of projects by which the student gets hands-on experience to reinforce concepts
from the text. This book provides an unparalleled degree of support,including a projects
component in the course.The IRC not only includes guidance on how to assign and structure
xviii PREFACE
the projects,but it also includes a set of project assignments that covers a broad range of
topics from the text.
Sage Projects: Described in the next section.
Hacking Project: This exercise is designed to illuminate the key issues in intrusion
detection and prevention.
Block Cipher Projects: This is a lab that explores the operation of the AES encryption
algorithm by tracing its execution,computing one round by hand, and then exploring
the various block cipher modes of use.The lab also covers DES.In both cases, an
online Java applet is used (or can be downloaded) to execute AES or DES.
Lab Exercises: A series of projects that involve programming and experimenting with
concepts from the book.
Research Projects: A series of research assignments that instruct the student to
research a particular topic on the Internet and write a report.
Programming Projects: A series of programming projects that cover a broad range of
topics and that can be implemented in any suitable language on any platform.
Practical Security Assessments: A set of exercises to examine current infrastructure
and practices of an existing organization.
Writing Assignments: A set of suggested writing assignments organized by chapter.
Reading/Report Assignments: A list of papers in the literature — one for each
chapter — that can be assigned for the student to read and then write a short report.
See Appendix Afor details.
THE SAGE COMPUTER ALGEBRA SYSTEM
One of the most important new features for this edition is the use of Sage for cryptographic
examples and homework assignments.Sage is an open-source,multiplatform, freeware pack-
age that implements a very powerful,flexible, and easily learned mathematics and computer
algebra system. Unlike competing systems (such as Mathematica,Maple, and MATLAB),
there are no licensing agreements or fees involved. Thus,Sage can be made available on
computers and networks at school,and students can individually download the software to
their own personal computers for use at home.Another advantage of using Sage is that
students learn a powerful, flexible tool that can be used for virtually any mathematical
application,not just cryptography.
The use of Sage can make a significant difference to the teaching of the mathematics
of cryptographic algorithms.This book provides a large number of examples of the use of
Sage covering many cryptographic concepts in Appendix B.
Appendix C lists exercises in each of these topic areas to enable the student to gain
hands-on experience with cryptographic algorithms.This appendix is available to instruc-
tors at the IRC for this book.Appendix C includes a section on how to download and get
started with Sage,a section on programming with Sage, and includes exercises that can be
assigned to students in the following categories:
Chapter 2 — Classical Encryption: Affine ciphers and the Hill cipher.
Chapter 3 — Block Ciphers And The Data Encryption Standard: Exercises based on
SDES.
PREFACE xix
Chapter 4 — Basic Concepts In Number Theory And Finite Fields: Euclidean and
extended Euclidean algorithms,polynomial arithmetic, and GF(24).
Chapter 5 — Advanced Encryption Standard: Exercise based on SAES.
Chapter 6 — Pseudorandom Number Generation And Stream Ciphers: Blum Blum
Shub,linear congruential generator, and ANSI X9.17 PRNG.
Chapter 8 — Number Theory: Euler’s Totient function,Miller Rabin, factoring, modular
exponentiation,discrete logarithm, and Chinese remainder theorem.
Chapter 9 — Public-Key Cryptography And RSA: RSA encrypt/decrypt and signing.
Chapter 10 — Other Public-Key Cryptosystems: Diffie-Hellman, elliptic curve
Chapter 11 — Cryptographic Hash Functions: Number-theoretic hash function.
Chapter 13 — Digital Signatures: DSA.
WHAT’S NEW IN THE FIFTH EDITION
The changes for this new edition of Cryptography and Network Securityare more substantial
and comprehensive than those for any previous revision.
In the three years since the fourth edition of this book was published,the field has seen
continued innovations and improvements.In this new edition, I try to capture these changes
while maintaining a broad and comprehensive coverage of the entire field. To begin this
process of revision, the fourth edition was extensively reviewed by a number of professors
who teach the subject.In addition, a number of professionals working in the field reviewed
individual chapters.The result is that, in many places, the narrative has been clarified and
tightened,and illustrations have been improved. Also, a large number of new “field-tested”
problems have been added.
One obvious change to the book is a revision in the organization, which makes for a
clearer presentation of related topics.There is a new Part Three,which pulls together all of
the material on cryptographic algorithms for data integrity,including cryptographic hash
functions,message authentication codes,and digital signatures. The material on key manage-
ment and exchange,previously distributed in several places in the book, is now organized in
a single chapter,as is the material on user authentication.
Beyond these refinements to improve pedagogy and user friendliness,there have been
major substantive changes throughout the book.Highlights include:
Euclidean and extended Euclidean algorithms (revised): These algorithms are impor-
tant for numerous cryptographic functions and algorithms.The material on the
Euclidean and extended Euclidean algorithms for integers and for polynomials has
been completely rewritten to provide a clearer and more systematic treatment.
Advanced Encryption Standard (revised): AES has emerged as the dominant symmetric
encryption algorithm,used in a wide variety of applications. Accordingly,this edition has
dramatically expanded the resources for learning about and understanding this impor-
tant standard.The chapter on AES has been revised and expanded,with additional illus-
trations and a detailed example,to clarify the presentation. Examples and assignments
using Sage have been added.And the book now includes an AES cryptography lab,
which enables the student to gain hands-on experience with AES cipher internals and
modes of use.The lab makes use of an AES calculator applet,available at this book’s
Web site,that can encrypt or decrypt test data values using the AES block cipher.
xx PREFACE
Block Cipher Modes of Operation (revised): The material in Chapter 6 on modes of
operation has been expanded and the illustrations redrawn for greater clarity.
Pseudorandom number generation and pseudorandom functions (revised): The treat-
ment of this important topic has been expanded,with the addition of new material on
the use of symmetric encryption algorithms and cryptographic hash functions to con-
struct pseudorandom functions.
ElGamal encryption and digital signature (new): New sections have been added on
this popular public-key algorithm.
Cryptographic hash functions and message authentication codes (revised): The mater-
ial on hash functions and MAC has been revised and reorganized to provide a clearer
and more systematic treatment.
SHA-3 (new): Although the SHA-3 algorithm has yet to be selected, it is important
for the student to have a grasp of the design criteria for this forthcoming cryptograph-
ic hash standard.
Authenticated encryption (new): The book covers the important new algorithms,
CCM and GCM,which simultaneously provide confidentiality and data integrity.
Key management and distribution (revised): In the fourth edition,these topics were
scattered across three chapters.In the fifth edition, the material is revised and consoli-
dated into a single chapter to provide a unified,systematic treatment.
Remote user authentication (revised): In the fourth edition, this topic was covered in
parts of two chapters.In the fifth edition the material is revised and consolidated into
a single chapter to provide a unified,systematic treatment.
Federated identity (new): A new section covers this common identity management
scheme across multiple enterprises and numerous applications and supporting many
thousands,even millions, of users.
HTTPS (new): A new section covers this protocol for providing secure communica-
tion between Web browser and Web server.
Secure shell (new): SSH, one of the most pervasive applications of encryption tech-
nology,is covered in a new section.
DomainKeys Identified Mail (new): A new section covers DKIM,which has become
the standard means of authenticating e-mail to counter spam.
Wireless network security (new): A new chapter covers this important area of net-
work security.The chapter deals with the IEEE 802.11 (WiFi) security standard for
wireless local area networks;and the Wireless Application Protocol (WAP) security
standard for communication between a mobile Web browser and a Web server.
IPsec (revised): The chapter on IPsec has been almost completely rewritten. It now
covers IPsecv3 and IKEv2.In addition, the presentation has been revised to improve
clarity and breadth.
Legal and ethical issues (new): A new online chapter covers these important
topics.
Online appendices (new): Fifteen online appendices provide additional breadth and
depth for the interested student on a variety of topics.
Sage examples and problems (new): As mentioned, this new edition makes use of the
open-source,freeware Sage computer algebra application to enable students to have
hands-on experience with a variety of cryptographic algorithms.
PREFACE xxi
With each new edition it is a struggle to maintain a reasonable page count while adding
new material.In part, this objective is realized by eliminating obsolete material and tighten-
ing the narrative.For this edition, chapters and appendices that are of less general interest
have been moved online as individual PDF files.This has allowed an expansion of material
without the corresponding increase in size and price.
ACKNOWLEDGEMENTS
This new edition has benefited from review by a number of people who gave generously of
their time and expertise.The following people reviewed all or a large part of the manuscript:
Marius Zimand (Towson State University),Shambhu Upadhyaya (University of Buffalo),
Nan Zhang (George Washington University),Dongwan Shin (New Mexico Tech),Michael
Kain (Drexel University),William Bard (University of Texas), David Arnold (Baylor Uni-
versity), Edward Allen (Wake Forest University),Michael Goodrich (UC-Irvine), Xunhua
Wang (James Madison University),Xianyang Li (Illinois Institute of Technology),and Paul
Jenkins (Brigham Young University).
Thanks also to the many people who provided detailed technical reviews of one or
more chapters: Martin Bealby,Martin Hlavac (Department of Algebra, Charles University
in Prague,Czech Republic), Martin Rublik (BSP Consulting and University of Economics in
Bratislava),Rafael Lara (President of Venezuela’s Association for Information Security and
Cryptography Research), Amitabh Saxena,and Michael Spratte (Hewlett-Packard Com-
pany). I would especially like to thank Nikhil Bhargava (IIT Delhi) for providing detailed
reviews of various chapters of the book.
Joan Daemen kindly reviewed the chapter on AES.Vincent Rijmen reviewed the
material on Whirlpool.Edward F.Schaefer reviewed the material on simplified AES.
Nikhil Bhargava (IIT Delhi) developed the set of online homework problems and
solutions.Dan Shumow of Microsoft and the University of Washington developed all of the
Sage examples and assignments in Appendices B and C. Professor Sreekanth Malladi of
Dakota State University developed the hacking exercises.Lawrie Brown of the Australian
Defence Force Academy provided the AES/DES block cipher projects and the security
assessment assignments.
Sanjay Rao and Ruben Torres of Purdue University developed the laboratory exercis-
es that appear in the IRC.The following people contributed project assignments that appear
in the instructor’s supplement:Henning Schulzrinne (Columbia University); Cetin Kaya Koc
(Oregon State University);and David Balenson (Trusted Information Systems and George
Washington University).Kim McLaughlin developed the test bank.
Finally,I would like to thank the many people responsible for the publication of the
book,all of whom did their usual excellent job. This includes my editor Tracy Dunkelberger,
her assistant Melinda Hagerty,and production manager Rose Kernan. Also,Jake Warde of
Warde Publishers managed the reviews.
With all this assistance,little remains for which I can take full credit. However, I am
proud to say that,with no help whatsoever, I selected all of the quotations.
This page intentionally left blank
ABOUT THE AUTHOR
William Stallings has made a unique contribution to understanding the broad sweep of tech-
nical developments in computer security,computer networking and computer architecture.
He has authored 17 titles, and counting revised editions,a total of 42 books on various
aspects of these subjects.His writings have appeared in numerous ACM and IEEE publica-
tions,including the Proceedings of the IEEE and ACM Computing Reviews.
He has 11 times received the award for the best Computer Science textbook of the
year from the Text and Academic Authors Association.
In over 30 years in the field, he has been a technical contributor,technical manager,
and an executive with several high-technology firms.He has designed and implemented both
TCP/IP-based and OSI-based protocol suites on a variety of computers and operating
systems, ranging from microcomputers to mainframes.As a consultant, he has advised
government agencies,computer and software vendors, and major users on the design, selec-
tion,and use of networking software and products.
He created and maintains the Computer Science Student Resource Siteat WilliamStalI-
ings.com/StudentSupport.html.This site provides documents and links on a variety of subjects
of general interest to computer science students (and professionals).He is a member of the
editorial board of Cryptologia,a scholarly journal devoted to all aspects of cryptology.
Dr. Stallings holds a PhD from M.I.T.in Computer Science and a B.S. from Notre
Dame in electrical engineering.
xxiii
This page intentionally left blank
READERS GUIDE
0.1 Outline of This Book
0.2 A Roadmap for Readers and Instructors
Subject Matter
Topic Ordering
0.3 Internet and Web Resources
Web Sites for This Book
Other Web Sites
Newsgroups and Forums
0.4 Standards
CHAPTER
1
2 CHAPTER 0 / READER’S GUIDE
The art of war teaches us to rely not on the likelihood of the enemy’s not coming,but
on our own readiness to receive him;not on the chance of his not attacking, but rather
on the fact that we have made our position unassailable.
The Art of War,Sun Tzu
This book,with its accompanying Web site, covers a lot of material.Here we give the
reader an overview.
0.1 OUTLINE OF THIS BOOK
Following an introductory chapter,Chapter 1, the book is organized into seven parts:
Part One: Symmetric Ciphers: Provides a survey of symmetric encryption, including clas-
sical and modern algorithms.The emphasis is on the two most important algo-
rithms,the Data Encryption Standard (DES) and the Advanced Encryption
Standard (AES).This part also covers the most important stream encryption
algorithm,RC4, and the important topic of pseudorandom number generation.
Part Two: Asymmetric Ciphers: Provides a survey of public-key algorithms,
including RSA (Rivest-Shamir-Adelman) and elliptic curve.
Part Three: Cryptographic Data Integrity Algorithms: Begins with a survey of crypto-
graphic hash functions.This part then covers two approaches to data
integrity that rely on cryptographic hash functions: message authentica-
tion codes and digital signatures.
Part Four: Mutual Trust: Covers key management and key distribution topics and
then covers user authentication techniques.
Part Five: Network Security and Internet Security:Examines the use of cryptographic
algorithms and security protocols to provide security over networks and the
Internet.Topics covered include transport-level security,wireless network
security,e-mail security,and IP security.
Part Six: System Security: Deals with security facilities designed to protect a
computer system from security threats,including intruders, viruses,and
worms.This part also looks at firewall technology.
Part Seven: Legal and Ethical Issues: Deals with the legal and ethical issues related
to computer and network security.
A number of online appendices at this book’s Web site cover additional topics
relevant to the book.
0.2 A ROADMAP FOR READERS AND INSTRUCTORS
Subject Matter
The material in this book is organized into four broad categories:
Cryptographic algorithms: This is the study of techniques for ensuring the
secrecy and/or authenticity of information.The three main areas of study in
0.2 / A ROADMAP FOR READERS AND INSTRUCTORS 3
this category are: (1) symmetric encryption,(2) asymmetric encryption, and
(3) cryptographic hash functions,with the related topics of message authenti-
cation codes and digital signatures.
Mutual trust: This is the study of techniques and algorithms for providing
mutual trust in two main areas.First, key management and distribution deals
with establishing trust in the encryption keys used between two communicating
entities.Second, user authentication deals with establishing trust in the identity
of a communicating partner.
Network security: This area covers the use of cryptographic algorithms in
network protocols and network applications.
Computer security: In this book, we use this term to refer to the security of
computers against intruders (e.g., hackers) and malicious software (e.g.,
viruses).Typically, the computer to be secured is attached to a network,and
the bulk of the threats arise from the network.
The first two parts of the book deal with two distinct cryptographic approaches:
symmetric cryptographic algorithms and public-key,or asymmetric, cryptographic
algorithms.Symmetric algorithms make use of a single key shared by two parties.
Public-key algorithms make use of two keys:a private key known only to one party
and a public key available to other parties.
Topic Ordering
This book covers a lot of material.For the instructor or reader who wishes a shorter
treatment,there are a number of opportunities.
To thoroughly cover the material in the first three parts,the chapters should be
read in sequence.With the exception of the Advanced Encryption Standard (AES),
none of the material in Part Onerequires any special mathematical background. To
understand AES,it is necessary to have some understanding of finite fields.In turn,
an understanding of finite fields requires a basic background in prime numbers and
modular arithmetic.Accordingly,Chapter 4 covers all of these mathematical prelim-
inaries just prior to their use in Chapter 5on AES. Thus,if Chapter 5 is skipped, it is
safe to skip Chapter 4 as well.
Chapter 2 introduces some concepts that are useful in later chapters of Part
One.However, for the reader whose sole interest is contemporary cryptography,
this chapter can be quickly skimmed.The two most important symmetric crypto-
graphic algorithms are DES and AES,which are covered in Chapters 3 and 5,
respectively.
Chapter 6 covers specific techniques for using what are known as block sym-
metric ciphers.Chapter 7 covers stream ciphers and random number generation.
These two chapters may be skipped on an initial reading,but this material is refer-
enced in later parts of the book.
For Part Two,the only additional mathematical background that is needed is in
the area of number theory,which is covered in Chapter 8.The reader who has skipped
Chapters 4and 5 should first review the material on Sections 4.1 through 4.3.
The two most widely used general-purpose public-key algorithms are RSA
and elliptic curve,with RSA enjoying wider acceptance.The reader may wish to skip
the material on elliptic curve cryptography in Chapter 10,at least on a first reading.
4 CHAPTER 0 / READER’S GUIDE
InPart Three, the topics in Sections 12.6 and 12.7 are of lesser importance.
Parts Four,Five, and Six are relatively independent of each other and can be
read in any order.These three parts assume a basic understanding of the material in
Parts One,Two,and Three.The four chapters of Part Five, on network and Internet
security,are relatively independent of one another and can be read in any order.
0.3 INTERNET AND WEB RESOURCES
There are a number of resources available on the Internet and the Web to support
this book and to help readers keep up with developments in this field.
Web Sites for This Book
There is a Web page for this book at WilliamStallings.com/Crypto/Crypto5e.html.
The site includes the following:
Useful Web sites: There are links to other relevant Web sites,organized by
chapter,including the sites listed throughout this book.
Errata sheet: An errata list for this book will be maintained and updated as
needed. Please e-mail any errors that you spot to me.Errata sheets for my
other books are at WilliamStallings.com.
Figures: All of the figures in this book are provided in PDF (Adobe Acrobat)
format.
Tables:All of the tables in this book are provided in PDF format.
Slides: A set of PowerPoint slides are provided,organized by chapter.
Cryptography and network security courses: There are links to home pages for
courses based on this book;these pages may be useful to other instructors in
providing ideas about how to structure their course.
I also maintain the Computer Science Student Resource Site, at William
Stallings.com/StudentSupport.html.The purpose of this site is to provide docu-
ments,information, and links for computer science students and professionals.Links
and documents are organized into six categories:
Math: Includes a basic math refresher, a queuing analysis primer, a number
system primer,and links to numerous math sites
How-to: Advice and guidance for solving homework problems,writing techni-
cal reports,and preparing technical presentations
Research resources: Links to important collections of papers, technical
reports,and bibliographies
Miscellaneous: A variety of other useful documents and links.
Computer science careers: Useful links and documents for those considering a
career in computer science.
Humor and other diversions: You have to take your mind off your work once
in a while.
0.4 / STANDARDS 5
Other Web Sites
There are numerous Web sites that provide information related to the topics of this
book. In subsequent chapters,pointers to specific Web sites can be found in the
Recommended Reading and Web Sitessection. Because the addresses for Web sites
tend to change frequently,the book does not provide URLs.For all of the Web sites
listed in the book,the appropriate link can be found at this book’s Web site. Other
links not mentioned in this book will be added to the Web site over time.
Newsgroups and Forums
A number of USENET newsgroups are devoted to some aspect of cryptography or
network security.As with virtually all USENET groups,there is a high noise-to-signal
ratio,but it is worth experimenting to see if any meet your needs. The most relevant
are as follows:
sci.crypt.research: The best group to follow.This is a moderated newsgroup
that deals with research topics; postings must have some relationship to the
technical aspects of cryptology.
sci.crypt: A general discussion of cryptology and related topics.
sci.crypt.random-numbers: A discussion of cryptographic-strength random
number generators.
alt.security: A general discussion of security topics.
comp.security.misc:A general discussion of computer security topics.
comp.security.firewalls:A discussion of firewall products and technology.
comp.security.announce:News and announcements from CERT.
comp.risks: A discussion of risks to the public from computers and users.
comp.virus: A moderated discussion of computer viruses.
In addition,there are a number of forums dealing with cryptography available
on the Internet.Among the most worthwhile are
Security and Cryptography forum: Sponsored by DevShed. Discusses issues
related to coding, server applications,network protection, data protection,
firewalls,ciphers, and the like.
Cryptography forum: On Topix.Fairly good focus on technical issues.
Security forums: On WindowsSecurity.com.Broad range of forums, including
cryptographic theory,cryptographic software, firewalls,and malware.
Links to these forums are provided at this book’s Web site.
0.4 STANDARDS
Many of the security techniques and applications described in this book have
been specified as standards.Additionally, standards have been developed to
cover management practices and the overall architecture of security mechanisms
6 CHAPTER 0 / READER’S GUIDE
and services.Throughout this book, we describe the most important standards in
use or being developed for various aspects of cryptography and network security.
Various organizations have been involved in the development or promotion of
these standards.The most important (in the current context) of these organiza-
tions are as follows:
National Institute of Standards and Technology:NIST is a U.S.federal agency
that deals with measurement science,standards,and technology related to U.S.
government use and to the promotion of U.S.private-sector innovation.
Despite its national scope,NIST Federal Information Processing Standards
(FIPS) and Special Publications (SP) have a worldwide impact.
Internet Society: ISOC is a professional membership society with worldwide
organizational and individual membership.It provides leadership in address-
ing issues that confront the future of the Internet and is the organization
home for the groups responsible for Internet infrastructure standards,
including the Internet Engineering Task Force (IETF) and the Internet
Architecture Board (IAB).These organizations develop Internet standards
and related specifications, all of which are published as Requests for
Comments (RFCs).
ITU-T: The International Telecommunication Union (ITU) is an international
organization within the United Nations System in which governments and the
private sector coordinate global telecom networks and services The ITU
Telecommunication Standardization Sector (ITU-T) is one of the three sectors
of the ITU.ITU-T’s mission is the production of standards covering all fields of
telecommunications.ITU-T standards are referred to as Recommendations.
ISO: The International Organization for Standardization (ISO)
1
is a world-
wide federation of national standards bodies from more than 140 countries,
one from each country.ISO is a nongovernmental organization that pro-
motes the development of standardization and related activities with a view
to facilitating the international exchange of goods and services and to devel-
oping cooperation in the spheres of intellectual,scientific, technological, and
economic activity.ISO’s work results in international agreements that are
published as International Standards.
A more detailed discussion of these organizations is contained in Appendix D.
1
ISO is not an acronym (in which case it would be IOS),but it is a word derived from the Greek, meaning
equal.
OVERVIEW
1.1 Computer Security Concepts
A Definition of Computer Security
Examples
The Challenges of Computer Security
1.2 The OSI Security Architecture
1.3 Security Attacks
Passive Attacks
Active Attacks
1.4 Security Services
Authentication
Access Control
Data Confidentiality
Data Integrity
Nonrepudiation
Availability Service
1.5 Security Mechanisms
1.6 A Model for Network Security
1.7 Recommended Reading and Web Sites
1.8 Key Terms,Review Questions,and Problems
CHAPTER
7
8 CHAPTER 1 / OVERVIEW
KEY POINTS
The Open Systems Interconnection (OSI) security architecture provides
a systematic framework for defining security attacks, mechanisms,and
services.
Security attacks are classified as either passive attacks, which include
unauthorized reading of a message of file and traffic analysis or active
attacks,such as modification of messages or files, and denial of service.
A security mechanism is any process (or a device incorporating such a
process) that is designed to detect,prevent, or recover from a security attack.
Examples of mechanisms are encryption algorithms,digital signatures, and
authentication protocols.
Security services include authentication, access control,data confidentiality,
data integrity,nonrepudiation, and availability.
This book focuses on two broad areas:cryptographic algorithms and protocols, which
have a broad range of applications; and network and Internet security,which rely
heavily on cryptographic techniques.
Cryptographic algorithms and protocolscan be grouped into four main areas:
Symmetric encryption: Used to conceal the contents of blocks or streams of
data of any size,including messages, files,encryption keys, and passwords.
Asymmetric encryption: Used to conceal small blocks of data,such as encryption
keys and hash function values,which are used in digital signatures.
Data integrity algorithms: Used to protect blocks of data, such as messages,
from alteration.
Authentication protocols:These are schemes based on the use of cryptographic
algorithms designed to authenticate the identity of entities.
The field of network and Internet securityconsists of measures to deter, prevent,
detect, and correct security violations that involve the transmission of informa-
tion.That is a broad statement that covers a host of possibilities. To give you a
feel for the areas covered in this book, consider the following examples of
security violations:
1. User A transmits a file to user B.The file contains sensitive information
(e.g.,payroll records) that is to be protected from disclosure. User C, who is
The combination of space, time, and strength that must be considered as the
basic elements of this theory of defense makes this a fairly complicated matter.
Consequently,it is not easy to find a fixed point of departure.
On War,Carl Von Clausewitz
1.1 / COMPUTER SECURITY CONCEPTS 9
not authorized to read the file,is able to monitor the transmission and capture
a copy of the file during its transmission.
2. A network manager, D, transmits a message to a computer, E,under its man-
agement.The message instructs computer E to update an authorization file to
include the identities of a number of new users who are to be given access to
that computer. User F intercepts the message,alters its contents to add or
delete entries,and then forwards the message to computer E, which accepts
the message as coming from manager D and updates its authorization file
accordingly.
3. Rather than intercept a message, user F constructs its own message with the
desired entries and transmits that message to computer E as if it had come
from manager D.Computer E accepts the message as coming from manager D
and updates its authorization file accordingly.
4. An employee is fired without warning. The personnel manager sends a
message to a server system to invalidate the employee’s account.When the
invalidation is accomplished,the server is to post a notice to the employee’s
file as confirmation of the action.The employee is able to intercept the
message and delay it long enough to make a final access to the server to
retrieve sensitive information.The message is then forwarded, the action
taken, and the confirmation posted. The employee’s action may go
unnoticed for some considerable time.
5. A message is sent from a customer to a stockbroker with instructions for
various transactions. Subsequently, the investments lose value and the
customer denies sending the message.
Although this list by no means exhausts the possible types of network security
violations,it illustrates the range of concerns of network security.
1.1 COMPUTER SECURITY CONCEPTS
A Definition of Computer Security
The NIST Computer Security Handbook[NIST95] defines the term computer security
as follows:
COMPUTER SECURITY
The protection afforded to an automated information system in order to attain the
applicable objectives of preserving the integrity,availability, and confidentiality of
information system resources (includes hardware,software, firmware, information/
data,and telecommunications).
10 CHAPTER 1 / OVERVIEW
This definition introduces three key objectives that are at the heart of
computer security:
Confidentiality: This term covers two related concepts:
Data
1
confidentiality: Assures that private or confidential information is
not made available or disclosed to unauthorized individuals.
Privacy: Assures that individuals control or influence what information
related to them may be collected and stored and by whom and to whom
that information may be disclosed.
Integrity: This term covers two related concepts:
Data integrity: Assures that information and programs are changed only in
a specified and authorized manner.
System integrity:Assures that a system performs its intended function in an
unimpaired manner, free from deliberate or inadvertent unauthorized
manipulation of the system.
Availability:Assures that systems work promptly and service is not denied to
authorized users.
These three concepts form what is often referred to as the CIA triad
(Figure 1.1).The three concepts embody the fundamental security objectives for
both data and for information and computing services.For example, the NIST
standard FIPS 199 (Standards for Security Categorization of Federal Information
Confidentiality
Data
and
services
Integrity
Availability
Figure 1.1 The Security Requirements
Triad
1
RFC 2828 defines information as “facts and ideas,which can be represented (encoded) as various
forms of data,”and data as “information in a specific physical representation, usually a sequence
of symbols that have meaning;especially a representation of information that can be processed or
produced by a computer.”Security literature typically does not make much of a distinction, nor does
this book.
1.1 / COMPUTER SECURITY CONCEPTS 11
and Information Systems) lists confidentiality,integrity,and availability as the three
security objectives for information and for information systems.FIPS 199 provides a
useful characterization of these three objectives in terms of requirements and the
definition of a loss of security in each category:
Confidentiality: Preserving authorized restrictions on information access
and disclosure,including means for protecting personal privacy and propri-
etary information.A loss of confidentiality is the unauthorized disclosure of
information.
Integrity: Guarding against improper information modification or destruc-
tion, including ensuring information nonrepudiation and authenticity.
A loss of integrity is the unauthorized modification or destruction of
information.
Availability: Ensuring timely and reliable access to and use of information.
A loss of availability is the disruption of access to or use of information or an
information system.
Although the use of the CIA triad to define security objectives is well established,
some in the security field feel that additional concepts are needed to present a complete
picture.Two of the most commonly mentioned are as follows:
Authenticity: The property of being genuine and being able to be verified and
trusted; confidence in the validity of a transmission,a message, or message
originator.This means verifying that users are who they say they are and that
each input arriving at the system came from a trusted source.
Accountability: The security goal that generates the requirement for actions
of an entity to be traced uniquely to that entity.This supports nonrepudia-
tion, deterrence,fault isolation, intrusion detection and prevention, and
after-action recovery and legal action.Because truly secure systems are not
yet an achievable goal, we must be able to trace a security breach to
a responsible party.Systems must keep records of their activities to permit
later forensic analysis to trace security breaches or to aid in transaction
disputes.
Examples
We now provide some examples of applications that illustrate the requirements just
enumerated.
2
For these examples,we use three levels of impact on organizations or
individuals should there be a breach of security (i.e., a loss of confidentiality,
integrity,or availability).These levels are defined in FIPS PUB 199:
Low: The loss could be expected to have a limited adverse effect on organiza-
tional operations,organizational assets,or individuals. A limited adverse effect
means that, for example,the loss of confidentiality, integrity,or availability
2
These examples are taken from a security policy document published by the Information Technology
Security and Privacy Office at Purdue University.
12 CHAPTER 1 / OVERVIEW
might (i) cause a degradation in mission capability to an extent and duration
that the organization is able to perform its primary functions,but the effec-
tiveness of the functions is noticeably reduced;(ii) result in minor damage to
organizational assets;(iii) result in minor financial loss; or (iv) result in minor
harm to individuals.
Moderate: The loss could be expected to have a serious adverse effect on
organizational operations,organizational assets, or individuals. A serious
adverse effect means that,for example, the loss might (i) cause a significant
degradation in mission capability to an extent and duration that the organi-
zation is able to perform its primary functions, but the effectiveness of
the functions is significantly reduced; (ii) result in significant damage to
organizational assets;(iii) result in significant financial loss; or (iv) result in
significant harm to individuals that does not involve loss of life or serious,
life-threatening injuries.
High: The loss could be expected to have a severe or catastrophic adverse
effect on organizational operations,organizational assets, or individuals.
A severe or catastrophic adverse effect means that, for example,the loss
might (i) cause a severe degradation in or loss of mission capability to an
extent and duration that the organization is not able to perform one or
more of its primary functions;(ii) result in major damage to organizational
assets; (iii) result in major financial loss;or (iv) result in severe or cata-
strophic harm to individuals involving loss of life or serious,life-threatening
injuries.
C
ONFIDENTIALITY Student grade information is an asset whose confidentiality
isconsidered to be highly important by students. In the United States, the release
of such information is regulated by the Family Educational Rights and Privacy
Act (FERPA).Grade information should only be available to students, their
parents,and employees that require the information to do their job. Student
enrollment information may have a moderate confidentiality rating.While still
covered by FERPA,this information is seen by more people on a daily basis, is
less likely to be targeted than grade information, and results in less damage if
disclosed. Directory information, such as lists of students or faculty or
departmental lists,may be assigned a low confidentiality rating or indeed no
rating.This information is typically freely available to the public and published
on a school’s Web site.
INTEGRITY Several aspects of integrity are illustrated by the example of a hospital
patient’s allergy information stored in a database.The doctor should be able to trust
that the information is correct and current.Now suppose that an employee (e.g., a
nurse) who is authorized to view and update this information deliberately falsifies
the data to cause harm to the hospital.The database needs to be restored to a
trusted basis quickly,and it should be possible to trace the error back to the person
responsible. Patient allergy information is an example of an asset with a high
requirement for integrity.Inaccurate information could result in serious harm or
death to a patient and expose the hospital to massive liability.
1.1 / COMPUTER SECURITY CONCEPTS 13
An example of an asset that may be assigned a moderate level of integrity
requirement is a Web site that offers a forum to registered users to discuss some
specific topic. Either a registered user or a hacker could falsify some entries or
deface the Web site.If the forum exists only for the enjoyment of the users,brings in
little or no advertising revenue,and is not used for something important such as
research, then potential damage is not severe.The Web master may experience
some data,financial, and time loss.
An example of a low integrity requirement is an anonymous online poll.Many
Web sites,such as news organizations, offer these polls to their users with very few
safeguards.However, the inaccuracy and unscientific nature of such polls is well
understood.
A
VAILABILITY The more critical a component or service, the higher is the level of
availability required.Consider a system that provides authentication services for
critical systems,applications, and devices.An interruption of service results in the
inability for customers to access computing resources and staff to access
the resources they need to perform critical tasks. The loss of the service
translates into a large financial loss in lost employee productivity and potential
customer loss.
An example of an asset that would typically be rated as having a moderate
availability requirement is a public Web site for a university;the Web site provides
information for current and prospective students and donors.Such a site is not a
critical component of the university’s information system,but its unavailability will
cause some embarrassment.
An online telephone directory lookup application would be classified as a
lowavailability requirement. Although the temporary loss of the application may be
an annoyance,there are other ways to access the information, such as a hardcopy
directory or the operator.
The Challenges of Computer Security
Computer and network security is both fascinating and complex.Some of the reasons
follow:
1. Security is not as simple as it might first appear to the novice. The require-
ments seem to be straightforward; indeed,most of the major requirements
for security services can be given self-explanatory,one-word labels: confi-
dentiality,authentication, nonrepudiation, or integrity.But the mechanisms
used to meet those requirements can be quite complex, and understanding
them may involve rather subtle reasoning.
2. In developing a particular security mechanism or algorithm, one must always
consider potential attacks on those security features.In many cases, successful
attacks are designed by looking at the problem in a completely different way,
therefore exploiting an unexpected weakness in the mechanism.
3. Because of point 2, the procedures used to provide particular services are often
counterintuitive.Typically,a security mechanism is complex,and it is not obvious
from the statement of a particular requirement that such elaborate measures are
14 CHAPTER 1 / OVERVIEW
needed. It is only when the various aspects of the threat are considered that
elaborate security mechanisms make sense.
4. Having designed various security mechanisms,it is necessary to decide where
to use them.This is true both in terms of physical placement (e.g., at what
points in a network are certain security mechanisms needed) and in a logical
sense [e.g., at what layer or layers of an architecture such as TCP/IP
(Transmission Control Protocol/Internet Protocol) should mechanisms be
placed].
5. Security mechanisms typically involve more than a particular algorithm or
protocol.They also require that participants be in possession of some secret
information (e.g., an encryption key), which raises questions about the
creation, distribution,and protection of that secret information. There also
may be a reliance on communications protocols whose behavior may compli-
cate the task of developing the security mechanism.For example, if the proper
functioning of the security mechanism requires setting time limits on the
transit time of a message from sender to receiver, then any protocol or
network that introduces variable,unpredictable delays may render such time
limits meaningless.
6. Computer and network security is essentially a battle of wits between a perpetrator
who tries to find holes and the designer or administrator who tries to close them.
The great advantage that the attacker has is that he or she need only find a single
weakness,while the designer must find and eliminate all weaknesses to achieve
perfect security.
7. There is a natural tendency on the part of users and system managers to perceive
little benefit from security investment until a security failure occurs.
8. Security requires regular,even constant, monitoring, and this is difficult in today’s
short-term,overloaded environment.
9. Security is still too often an afterthought to be incorporated into a system after
the design is complete rather than being an integral part of the design process.
10. Many users and even security administrators view strong security as an imped-
iment to efficient and user-friendly operation of an information system or use
of information.
The difficulties just enumerated will be encountered in numerous ways as we
examine the various security threats and mechanisms throughout this book.
1.2 THE OSI SECURITY ARCHITECTURE
To assess effectively the security needs of an organization and to evaluate and choose
various security products and policies,the manager responsible for security needs
some systematic way of defining the requirements for security and characterizing the
approaches to satisfying those requirements.This is difficult enough in a centralized
data processing environment; with the use of local and wide area networks,the
problems are compounded.
1.3 / SECURITY ATTACKS 15
ITU-T
3
Recommendation X.800,Security Architecture for OSI, defines such a
systematic approach.
4
The OSI security architecture is useful to managers as a way
of organizing the task of providing security.Furthermore, because this architecture
was developed as an international standard,computer and communications vendors
have developed security features for their products and services that relate to this
structured definition of services and mechanisms.
For our purposes,the OSI security architecture provides a useful, if abstract,
overview of many of the concepts that this book deals with.The OSI security archi-
tecture focuses on security attacks,mechanisms, and services.These can be defined
briefly as
Security attack: Any action that compromises the security of information
owned by an organization.
Security mechanism: A process (or a device incorporating such a process) that
is designed to detect,prevent, or recover from a security attack.
Security service: A processing or communication service that enhances the
security of the data processing systems and the information transfers of an
organization.The services are intended to counter security attacks, and they
make use of one or more security mechanisms to provide the service.
In the literature,the terms threat and attack are commonly used to mean more
or less the same thing.Table 1.1 provides definitions taken from RFC 2828, Internet
Security Glossary.
1.3 SECURITY ATTACKS
A useful means of classifying security attacks,used both in X.800 and RFC 2828, is
in terms of passive attacks and active attacks.A passive attack attempts to learn or
make use of information from the system but does not affect system resources.An
active attack attempts to alter system resources or affect their operation.
3
The International Telecommunication Union (ITU) Telecommunication Standardization Sector (ITU-T)
is a United Nations-sponsored agency that develops standards,called Recommendations, relating to
telecommunications and to open systems interconnection (OSI).
4
The OSI security architecture was developed in the context of the OSI protocol architecture,which is
described in Appendix L.However,for our purposes in this chapter, an understanding of the OSI protocol
architecture is not required.
Table 1.1 Threats and Attacks (RFC 2828)
Threat
A potential for violation of security,which exists when there is a circumstance,capability, action,
or event that could breach security and cause harm.That is,a threat is a possible danger that
might exploit a vulnerability.
Attack
An assault on system security that derives from an intelligent threat;that is, an intelligent act that is a
deliberate attempt (especially in the sense of a method or technique) to evade security services and
violate the security policy of a system.
16 CHAPTER 1 / OVERVIEW
Passive Attacks
Passive attacks are in the nature of eavesdropping on,or monitoring of, transmis-
sions.The goal of the opponent is to obtain information that is being transmitted.
Two types of passive attacks are the release of message contents and traffic
analysis.
The release of message contents is easily understood (Figure 1.2a).A telephone
conversation,an electronic mail message, and a transferred file may contain sensitive
or confidential information.We would like to prevent an opponent from learning the
contents of these transmissions.
A second type of passive attack, traffic analysis, is subtler (Figure 1.2b).
Suppose that we had a way of masking the contents of messages or other
information traffic so that opponents,even if they captured the message, could
not extract the information from the message. The common technique for
masking contents is encryption. If we had encryption protection in place,an
opponent might still be able to observe the pattern of these messages. The
opponent could determine the location and identity of communicating hosts and
could observe the frequency and length of messages being exchanged. This
information might be useful in guessing the nature of the communication that
was taking place.
Passive attacks are very difficult to detect, because they do not involve any
alteration of the data.Typically,the message traffic is sent and received in an appar-
ently normal fashion,and neither the sender nor receiver is aware that a third party
has read the messages or observed the traffic pattern.However, it is feasible to pre-
vent the success of these attacks,usually by means of encryption. Thus,the empha-
sis in dealing with passive attacks is on prevention rather than detection.
Active Attacks
Active attacks involve some modification of the data stream or the creation of a false
stream and can be subdivided into four categories:masquerade, replay,modification
of messages,and denial of service.
A masquerade takes place when one entity pretends to be a different entity
(Figure 1.3a).A masquerade attack usually includes one of the other forms of active
attack.For example, authentication sequences can be captured and replayed after a
valid authentication sequence has taken place,thus enabling an authorized entity
with few privileges to obtain extra privileges by impersonating an entity that has
those privileges.
Replay involves the passive capture of a data unit and its subsequent retrans-
mission to produce an unauthorized effect (Figure 1.3b).
Modification of messages simply means that some portion of a legitimate
message is altered,or that messages are delayed or reordered, to produce an unau-
thorized effect (Figure 1.3c).For example, a message meaning “Allow John Smith
to read confidential file accounts”is modified to mean “Allow Fred Brown to read
confidential file accounts.
The denial of service prevents or inhibits the normal use or management of
communications facilities (Figure 1.3d).This attack may have a specific target; for
example,an entity may suppress all messages directed to a particular destination
1.3 / SECURITY ATTACKS 17
(a) Release of message contents
Bob
Darth
Alice
Read contents of
message from Bob
to Alice
(b) Traffic analysis
Bob
Darth
Alice
Observe pattern of
messages from Bob
to Alice
Internet or
other comms facility
Internet or
other comms facility
Figure 1.2 Passive Attacks
(e.g.,the security audit service).Another form of service denial is the disruption of an
entire network,either by disabling the network or by overloading it with messages so
as to degrade performance.
Active attacks present the opposite characteristics of passive attacks.Whereas
passive attacks are difficult to detect,measures are available to prevent their success.
18 CHAPTER 1 / OVERVIEW
On the other hand,it is quite difficult to prevent active attacks absolutely because of
the wide variety of potential physical,software, and network vulnerabilities. Instead,
the goal is to detect active attacks and to recover from any disruption or
delays caused by them.If the detection has a deterrent effect, it may also contribute
to prevention.
(a) Masquerade
Bob
Darth
Alice
Alice
Message from Darth
that appears to be
from Bob
(b) Replay
Bob
Darth
Capture message from
Bob to Alice; later
replay message to Alice
Internet or
other comms facility
Internet or
other comms facility
Figure 1.3 Active attacks (Continued)
1.4 / SECURITY SERVICES 19
(c) Modification of messages
Bob
Darth
Alice
Darth modifies
message from Bob
to Alice
(d) Denial of service
Bob
Darth
Server
Darth disrupts service
provided by server
Internet or
other comms facility
Internet or
other comms facility
Figure 1.3 Active attacks
1.4 SECURITY SERVICES
X.800 defines a security service as a service that is provided by a protocol layer of
communicating open systems and that ensures adequate security of the systems or
of data transfers.Perhaps a clearer definition is found in RFC 2828, which provides
the following definition:a processing or communication service that is provided by
20 CHAPTER 1 / OVERVIEW
a system to give a specific kind of protection to system resources;security services
implement security policies and are implemented by security mechanisms.
X.800 divides these services into five categories and fourteen specific services
(Table 1.2).We look at each category in turn.
5
5
There is no universal agreement about many of the terms used in the security literature.For example,
the term integrityis sometimes used to refer to all aspects of information security.The term authentication
is sometimes used to refer both to verification of identity and to the various functions listed under
integrity in this chapter.Our usage here agrees with both X.800 and RFC 2828.
Table 1.2 Security Services (X.800)
AUTHENTICATION
The assurance that the communicating entity is the
one that it claims to be.
Peer Entity Authentication
Used in association with a logical connection to
provide confidence in the identity of the entities
connected.
Data-Origin Authentication
In a connectionless transfer,provides assurance that
the source of received data is as claimed.
ACCESS CONTROL
The prevention of unauthorized use of a resource
(i.e.,this service controls who can have access to a
resource,under what conditions access can occur,
and what those accessing the resource are allowed
to do).
DATA CONFIDENTIALITY
The protection of data from unauthorized
disclosure.
Connection Confidentiality
The protection of all user data on a connection.
Connectionless Confidentiality
The protection of all user data in a single data block
Selective-Field Confidentiality
The confidentiality of selected fields within the user
data on a connection or in a single data block.
Traffic-Flow Confidentiality
The protection of the information that might be
derived from observation of traffic flows.
DATA INTEGRITY
The assurance that data received are exactly as
sent by an authorized entity (i.e.,contain no
modification,insertion, deletion, or replay).
Connection Integrity with Recovery
Provides for the integrity of all user data on a
connection and detects any modification,insertion,
deletion,or replay of any data within an entire data
sequence,with recovery attempted.
Connection Integrity without Recovery
As above,but provides only detection without recovery.
Selective-Field Connection Integrity
Provides for the integrity of selected fields within the
user data of a data block transferred over a connec-
tion and takes the form of determination of whether
the selected fields have been modified,inserted,
deleted,or replayed.
Connectionless Integrity
Provides for the integrity of a single connectionless
data block and may take the form of detection of
data modification.Additionally,a limited form of
replay detection may be provided.
Selective-Field Connectionless Integrity
Provides for the integrity of selected fields within a single
connectionless data block;takes the form of determina-
tion of whether the selected fields have been modified.
NONREPUDIATION
Provides protection against denial by one of the
entities involved in a communication of having
participated in all or part of the communication.
Nonrepudiation,Origin
Proof that the message was sent by the specified party.
Nonrepudiation,Destination
Proof that the message was received by the specified
party.
1.4 / SECURITY SERVICES 21
Authentication
The authentication service is concerned with assuring that a communication is
authentic. In the case of a single message,such as a warning or alarm signal, the
function of the authentication service is to assure the recipient that the message is
from the source that it claims to be from.In the case of an ongoing interaction, such
as the connection of a terminal to a host,two aspects are involved. First, at the time
of connection initiation,the service assures that the two entities are authentic, that
is,that each is the entity that it claims to be. Second,the service must assure that the
connection is not interfered with in such a way that a third party can masquerade as
one of the two legitimate parties for the purposes of unauthorized transmission or
reception.
Two specific authentication services are defined in X.800:
Peer entity authentication: Provides for the corroboration of the identity
of a peer entity in an association. Two entities are considered peers if
they implement to same protocol in different systems; e.g.,two TCP mod-
ules in two communicating systems.Peer entity authentication is provided
for use at the establishment of,or at times during the data transfer phase
of,a connection. It attempts to provide confidence that an entity is not
performing either a masquerade or an unauthorized replay of a previous
connection.
Data origin authentication: Provides for the corroboration of the source
of a data unit. It does not provide protection against the duplication or
modification of data units.This type of service supports applications like
electronic mail,where there are no prior interactions between the commu-
nicating entities.
Access Control
In the context of network security,access control is the ability to limit and control
the access to host systems and applications via communications links.To achieve
this,each entity trying to gain access must first be identified, or authenticated, so
that access rights can be tailored to the individual.
Data Confidentiality
Confidentiality is the protection of transmitted data from passive attacks.With
respect to the content of a data transmission, several levels of protection can be
identified.The broadest service protects all user data transmitted between two users
over a period of time.For example, when a TCP connection is set up between two
systems,this broad protection prevents the release of any user data transmitted over
the TCP connection.Narrower forms of this service can also be defined, including
the protection of a single message or even specific fields within a message.These
refinements are less useful than the broad approach and may even be more complex
and expensive to implement.
The other aspect of confidentiality is the protection of traffic flow from analysis.
This requires that an attacker not be able to observe the source and destination,
frequency,length, or other characteristics of the traffic on a communications facility.
22 CHAPTER 1 / OVERVIEW
Data Integrity
As with confidentiality,integrity can apply to a stream of messages,a single message,
or selected fields within a message.Again, the most useful and straightforward
approach is total stream protection.
A connection-oriented integrity service, one that deals with a stream of
messages,assures that messages are received as sent with no duplication, inser-
tion, modification,reordering, or replays. The destruction of data is also covered
under this service.Thus, the connection-oriented integrity service addresses both
message stream modification and denial of service.On the other hand, a connec-
tionless integrity service,one that deals with individual messages without regard
to any larger context,generally provides protection against message modification
only.
We can make a distinction between service with and without recovery.Because
the integrity service relates to active attacks,we are concerned with detection rather
than prevention. If a violation of integrity is detected,then the service may simply
report this violation, and some other portion of software or human intervention is
required to recover from the violation.Alternatively,there are mechanisms available
to recover from the loss of integrity of data, as we will review subsequently.
The incorporation of automated recovery mechanisms is, in general,the more
attractive alternative.
Nonrepudiation
Nonrepudiation prevents either sender or receiver from denying a transmitted
message.Thus, when a message is sent, the receiver can prove that the alleged
sender in fact sent the message.Similarly,when a message is received, the sender can
prove that the alleged receiver in fact received the message.
Availability Service
Both X.800 and RFC 2828 define availability to be the property of a system or a
system resource being accessible and usable upon demand by an authorized
system entity, according to performance specifications for the system (i.e.,a
system is available if it provides services according to the system design whenever
users request them).A variety of attacks can result in the loss of or reduction in
availability.Some of these attacks are amenable to automated countermeasures,
such as authentication and encryption, whereas others require some sort of
physical action to prevent or recover from loss of availability of elements of a
distributed system.
X.800 treats availability as a property to be associated with various security
services.However, it makes sense to call out specifically an availability service. An
availability service is one that protects a system to ensure its availability.This service
addresses the security concerns raised by denial-of-service attacks.It depends on
proper management and control of system resources and thus depends on access
control service and other security services.
1.5 / SECURITY MECHANISMS 23
1.5 SECURITY MECHANISMS
Table 1.3 lists the security mechanisms defined in X.800.The mechanisms are
divided into those that are implemented in a specific protocol layer,such as TCP or
an application-layer protocol, and those that are not specific to any particular
protocol layer or security service.These mechanisms will be covered in the appro-
priate places in the book. So we do not elaborate now,except to comment on the
Table 1.3 Security Mechanisms (X.800)
SPECIFIC SECURITY MECHANISMS
May be incorporated into the appropriate protocol
layer in order to provide some of the OSI security
services.
Encipherment
The use of mathematical algorithms to transform
data into a form that is not readily intelligible.The
transformation and subsequent recovery of the
datadepend on an algorithm and zero or more
encryption keys.
Digital Signature
Data appended to,or a cryptographic transformation
of,a data unit that allows a recipient of the data unit
to prove the source and integrity of the data unit and
protect against forgery (e.g.,by the recipient).
Access Control
A variety of mechanisms that enforce access rights to
resources.
Data Integrity
A variety of mechanisms used to assure the integrity
of a data unit or stream of data units.
Authentication Exchange
A mechanism intended to ensure the identity of an
entity by means of information exchange.
Traffic Padding
The insertion of bits into gaps in a data stream to
frustrate traffic analysis attempts.
Routing Control
Enables selection of particular physically secure
routes for certain data and allows routing changes,
especially when a breach of security is suspected.
Notarization
The use of a trusted third party to assure certain
properties of a data exchange.
PERVASIVE SECURITY MECHANISMS
Mechanisms that are not specific to any particular
OSI security service or protocol layer.
Trusted Functionality
That which is perceived to be correct with respect to
some criteria (e.g.,as established by a security policy).
Security Label
The marking bound to a resource (which may be a
data unit) that names or designates the security
attributes of that resource.
Event Detection
Detection of security-relevant events.
Security Audit Trail
Data collected and potentially used to facilitate a
security audit,which is an independent review and
examination of system records and activities.
Security Recovery
Deals with requests from mechanisms,such as event
handling and management functions,and takes
recovery actions.
1.6 / A MODEL FOR NETWORK SECURITY 25
definition of encipherment. X.800 distinguishes between reversible encipherment
mechanisms and irreversible encipherment mechanisms.A reversible encipherment
mechanism is simply an encryption algorithm that allows data to be encrypted and
subsequently decrypted. Irreversible encipherment mechanisms include hash algo-
rithms and message authentication codes,which are used in digital signature and
message authentication applications.
Table 1.4,based on one in X.800, indicates the relationship between security
services and security mechanisms.
1.6 A MODEL FOR NETWORK SECURITY
A model for much of what we will be discussing is captured,in very general terms, in
Figure 1.4.A message is to be transferred from one party to another across some sort
of Internet service.The two parties, who are the principals in this transaction, must
cooperate for the exchange to take place.A logical information channel is estab-
lished by defining a route through the Internet from source to destination and by the
cooperative use of communication protocols (e.g.,TCP/IP) by the two principals.
Security aspects come into play when it is necessary or desirable to protect
the information transmission from an opponent who may present a threat to confi-
dentiality,authenticity, and so on. All the techniques for providing security have
two components:
A security-related transformation on the information to be sent. Examples
include the encryption of the message,which scrambles the message so that it is
unreadable by the opponent,and the addition of a code based on the contents
of the message,which can be used to verify the identity of the sender.
Information
channel
Security-related
transformation
Sender
Secret
information
Message
Message
Secure
message
Secure
message
Recipient
Opponent
Trusted third party
(e.g., arbiter, distributer
of secret information)
Security-related
transformation
Secret
information
Figure 1.4 Model for Network Security
26 CHAPTER 1 / OVERVIEW
Some secret information shared by the two principals and, it is hoped,
unknown to the opponent.An example is an encryption key used in conjunc-
tion with the transformation to scramble the message before transmission and
unscramble it on reception.
6
A trusted third party may be needed to achieve secure transmission. For
example,a third party may be responsible for distributing the secret information to
the two principals while keeping it from any opponent. Or a third party may be
needed to arbitrate disputes between the two principals concerning the authenticity
of a message transmission.
This general model shows that there are four basic tasks in designing a particular
security service:
1. Design an algorithm for performing the security-related transformation.
Thealgorithm should be such that an opponent cannot defeat its purpose.
2. Generate the secret information to be used with the algorithm.
3. Develop methods for the distribution and sharing of the secret information.
4. Specify a protocol to be used by the two principals that makes use of the security
algorithm and the secret information to achieve a particular security service.
Parts One through Five of this book concentrate on the types of security mecha-
nisms and services that fit into the model shown in Figure 1.4.However,there are other
security-related situations of interest that do not neatly fit this model but are consid-
ered in this book.A general model of these other situations is illustrated by Figure 1.5,
which reflects a concern for protecting an information system from unwanted access.
Most readers are familiar with the concerns caused by the existence of hackers,who
attempt to penetrate systems that can be accessed over a network.The hacker can
be someone who,with no malign intent, simply gets satisfaction from breaking and
entering a computer system.The intruder can be a disgruntled employee who wishes
to do damage or a criminal who seeks to exploit computer assets for financial gain
(e.g.,obtaining credit card numbers or performing illegal money transfers).
6
Part Two discusses a form of encryption,known as a symmetric encryption,in which only one of the two
principals needs to have the secret information.
Computing resources
(processor, memory, I/O)
Data
Processes
Software
Internal security controls
Information system
Gatekeeper
function
Opponent
—human (e.g., hacker)
—software (e.g., virus, worm)
Access channel
Figure 1.5 Network Access Security Model
1.7 / RECOMMENDED READING AND WEB SITES 27
Another type of unwanted access is the placement in a computer system of
logic that exploits vulnerabilities in the system and that can affect application pro-
grams as well as utility programs,such as editors and compilers. Programs can pre-
sent two kinds of threats:
Information access threats: Intercept or modify data on behalf of users who
should not have access to that data.
Service threats: Exploit service flaws in computers to inhibit use by legitimate
users.
Viruses and worms are two examples of software attacks.Such attacks can be
introduced into a system by means of a disk that contains the unwanted logic con-
cealed in otherwise useful software.They can also be inserted into a system across a
network;this latter mechanism is of more concern in network security.
The security mechanisms needed to cope with unwanted access fall into two
broad categories (see Figure 1.5).The first category might be termed a gatekeeper
function. It includes password-based login procedures that are designed to deny
access to all but authorized users and screening logic that is designed to detect and
reject worms,viruses, and other similar attacks. Once either an unwanted user or
unwanted software gains access,the second line of defense consists of a variety of
internal controls that monitor activity and analyze stored information in an
attempt to detect the presence of unwanted intruders.These issues are explored in
Part Six.
1.7 RECOMMENDED READING AND WEB SITES
[STAL02] provides a broad introduction to both computer and network security.[SCHN00]
is valuable reading for any practitioner in the field of computer or network security: It
discusses the limitations of technology,and cryptography in particular, in providing security
and the need to consider the hardware,the software implementation, the networks, and the
people involved in providing and attacking security.
It is useful to read some of the classic tutorial papers on computer security; these
provide a historical perspective from which to appreciate current work and thinking.The
papers to read are [WARE79],[BROW72], [SALT75],[SHAN77], and [SUMM84]. Two
more recent, short treatments of computer security are [ANDR04] and [LAMP04].
[NIST95] is an exhaustive (290 pages) treatment of the subject.Another good treatment is
[NRC91].Also useful is [FRAS97].
ANDR04 Andrews, M., and Whittaker, J.“Computer Security.” IEEE Security and
Privacy,September/October 2004.
BROW72 Browne,P. “Computer Security—A Survey.”ACM SIGMIS Database, Fall
1972.
FRAS97 Fraser,B. Site Security Handbook. RFC 2196,September 1997.
LAMP04 Lampson,B. “Computer Security in the Real World,”Computer,June 2004.
NIST95 National Institute of Standards and Technology.An Introduction to Computer
Security:The NIST Handbook. Special Publication 800–12, October 1995.
28 CHAPTER 1 / OVERVIEW
7
Because URLs sometimes change,they are not included. For all of the Web sites listed in this and
subsequent chapters,the appropriate link is at this book’s Web site at williamstallings.com/Crypto/
Crypto5e.html.
NRC91 National Research Council. Computers at Risk: Safe Computing in the
Information Age.Washington, D.C.:National Academy Press, 1991.
SALT75 Saltzer,J., and Schroeder, M.“The Protection of Information in Computer
Systems.”Proceedings of the IEEE, September 1975.
SCHN00 Schneier,B. Secrets and Lies:Digital Security in a Networked World. New York:
Wiley,2000.
SHAN77 Shanker,K. “The Total Computer Security Problem:An Overview.”Computer,
June 1977.
STAL08 Stallings,W., and Brown, L. Computer Security. Upper Saddle River,NJ:
Prentice Hall,2008.
SUMM84 Summers, R.“An Overview of Computer Security.”IBM Systems Journal,
Vol.23, No.4, 1984.
WARE79 Ware,W.,ed. Security Controls for Computer Systems. RAND Report 609–1.
October 1979.http://www.rand.org/pubs/reports/R609-1/R609.1.html
Recommended Web Sites:
The following Web sites
7
are of general interest related to cryptography and network security:
IETF Security Area:Material related to Internet security standardization efforts.
The Cryptography FAQ:Lengthy and worthwhile FAQ covering all aspects of cryptog-
raphy.
Tom Dunigan’s Security page:An excellent list of pointers to cryptography and net-
work security Web sites.
Peter Gutmann’s home page:Good collection of cryptography material.
Helgar Lipma’s Cryptology Pointers:Another excellent list of pointers to cryptography
and network security Web sites.
Cryptology ePrint archive: Provides rapid access to recent research in cryptology;con-
sists of a collection of unrefereed papers.
IEEE Technical Committee on Security and Privacy:Copies of their newsletter and
information on IEEE-related activities.
Computer Security Resource Center: Maintained by the National Institute of
Standards and Technology (NIST);contains a broad range of information on security
threats,technology,and standards.
Computer and Network Security Reference Index:A good index to vendor and commer-
cial products,FAQs,newsgroup archives,papers, and other Web sites.
Security Focus: A wide variety of security information,with an emphasis on vendor
products and end-user concerns.
SANS Institute: Similar to Security Focus.Extensive collection of white papers.
1.8 / KEY TERMS,REVIEW QUESTIONS,AND PROBLEMS 29
Risks Digest: Forum on risks to the public in computers and related systems.
Institute for Security and Open Methodologies: An open, collaborative security
research community.Lots of interesting information.
Center for Internet Security: Provides freeware benchmark and scoring tools for eval-
uating security of operating systems,network devices, and applications. Includes case
studies and technical papers.
1.8 KEY TERMS,REVIEW QUESTIONS, AND PROBLEMS
Key Terms
access control
active threat
authentication
authenticity
availability
data confidentiality
data integrity
denial of service
encryption
integrity
intruder
masquerade
nonrepudiation
OSI security architecture
passive threat
replay
security attacks
security mechanisms
security services
traffic analysis
Review Questions
1.1 What is the OSI security architecture?
1.2 What is the difference between passive and active security threats?
1.3 List and briefly define categories of passive and active security attacks.
1.4 List and briefly define categories of security services.
1.5 List and briefly define categories of security mechanisms.
Problems
1.1 Consider an automated teller machine (ATM) in which users provide a personal
identification number (PIN) and a card for account access.Give examples of confi-
dentiality,integrity, and availability requirements associated with the system and, in
each case,indicate the degree of importance of the requirement.
1.2 Repeat Problem 1.1 for a telephone switching system that routes calls through a
switching network based on the telephone number requested by the caller.
1.3 Consider a desktop publishing system used to produce documents for various
organizations.
a. Give an example of a type of publication for which confidentiality of the stored
data is the most important requirement.
b. Give an example of a type of publication in which data integrity is the most
important requirement.
c. Give an example in which system availability is the most important requirement.
1.4 For each of the following assets, assign a low,moderate, or high impact level for the
loss of confidentiality,availability,and integrity, respectively.Justify your answers.
a. An organization managing public information on its Web server.
b. A law enforcement organization managing extremely sensitive investigative
information.
30 CHAPTER 1 / OVERVIEW
c. A financial organization managing routine administrative information (not privacy-
related information).
d. An information system used for large acquisitions in a contracting organization
contains both sensitive,pre-solicitation phase contract information and routine
administrative information.Assess the impact for the two data sets separately and
the information system as a whole.
e. A power plant contains a SCADA (supervisory control and data acquisition) system
controlling the distribution of electric power for a large military installation.The
SCADA system contains both real-time sensor data and routine administrative
information.Assess the impact for the two data sets separately and the information
system as a whole.
1.5 Draw a matrix similar to Table 1.4 that shows the relationship between security services
and attacks.
1.6 Draw a matrix similar to Table 1.4 that shows the relationship between security
mechanisms and attacks.
CLASSICAL ENCRYPTION TECHNIQUES
2.1 Symmetric Cipher Model
Cryptography
Cryptanalysis and Brute-Force Attack
2.2 Substitution Techniques
Caesar Cipher
Monoalphabetic Ciphers
Playfair Cipher
Hill Cipher
Polyalphabetic Ciphers
One-Time Pad
2.3 Transposition Techniques
2.4 Rotor Machines
2.5 Steganography
2.6 Recommended Reading and Web Sites
2.7 Key Terms,Review Questions,and Problems
31
PART 1: SYMMETRIC CIPHERS
CHAPTER
32 CHAPTER 2 / CLASSICAL ENCRYPTION TECHNIQUES
“I am fairly familiar with all the forms of secret writings, and am myself the
author of a trifling monograph upon the subject,in which I analyze one hundred
and sixty separate ciphers,”said Holmes.
The Adventure of the Dancing Men,Sir Arthur Conan Doyle
KEY POINTS
Symmetric encryption is a form of cryptosystem in which encryption and
decryption are performed using the same key.It is also known as conven-
tional encryption.
Symmetric encryption transforms plaintext into ciphertext using a secret
key and an encryption algorithm. Using the same key and a decryption
algorithm,the plaintext is recovered from the ciphertext.
The two types of attack on an encryption algorithm are cryptanalysis,
based on properties of the encryption algorithm, and brute-force,which
involves trying all possible keys.
Traditional (precomputer) symmetric ciphers use substitution and/or
transposition techniques.Substitution techniques map plaintext elements
(characters,bits) into ciphertext elements. Transposition techniques sys-
tematically transpose the positions of plaintext elements.
Rotor machines are sophisticated precomputer hardware devices that use
substitution techniques.
Steganography is a technique for hiding a secret message within a larger
one in such a way that others cannot discern the presence or contents of the
hidden message.
Symmetric encryption, also referred to as conventional encryption or single-key
encryption,was the only type of encryption in use prior to the development of public-
key encryption in the 1970s.It remains by far the most widely used of the two types of
encryption. Part One examines a number of symmetric ciphers.In this chapter, we
begin with a look at a general model for the symmetric encryption process; this will
enable us to understand the context within which the algorithms are used. Next,we
examine a variety of algorithms in use before the computer era.Finally,we look briefly
at a different approach known as steganography.Chapters 3 and 5 examine the two
most widely used symmetric cipher:DES and AES.
Before beginning,we define some terms. An original message is known as the
plaintext,while the coded message is called the ciphertext. The process of converting
from plaintext to ciphertext is known as encipheringor encryption; restoring the plain-
text from the ciphertext is deciphering or decryption. The many schemes used for
encryption constitute the area of study known as cryptography. Such a scheme is
known as a cryptographic system or a cipher.Techniques used for deciphering a
2.1 / SYMMETRIC CIPHER MODEL 33
message without any knowledge of the enciphering details fall into the area of
cryptanalysis.Cryptanalysis is what the layperson calls “breaking the code.”The areas
of cryptography and cryptanalysis together are called cryptology.
2.1 SYMMETRIC CIPHER MODEL
A symmetric encryption scheme has five ingredients (Figure 2.1):
Plaintext: This is the original intelligible message or data that is fed into the
algorithm as input.
Encryption algorithm: The encryption algorithm performs various substitu-
tions and transformations on the plaintext.
Secret key: The secret key is also input to the encryption algorithm.The key is
a value independent of the plaintext and of the algorithm.The algorithm will
produce a different output depending on the specific key being used at the
time.The exact substitutions and transformations performed by the algorithm
depend on the key.
Ciphertext: This is the scrambled message produced as output. It depends on
the plaintext and the secret key.For a given message, two different keys will
produce two different ciphertexts.The ciphertext is an apparently random
stream of data and,as it stands, is unintelligible.
Decryption algorithm: This is essentially the encryption algorithm run in
reverse.It takes the ciphertext and the secret key and produces the original
plaintext.
There are two requirements for secure use of conventional encryption:
1. We need a strong encryption algorithm.At a minimum, we would like the
algorithm to be such that an opponent who knows the algorithm and has
access to one or more ciphertexts would be unable to decipher the ciphertext
or figure out the key.This requirement is usually stated in a stronger form:The
Plaintext
input
Y = E(K, X) X = D[K, Y]
X
KK
Transmitted
ciphertext
Plaintext
output
Secret key shared by
sender and recipient
Secret key shared by
sender and recipient
Encryption algorithm
(e.g., AES)
Decryption algorithm
(reverse of encryption
algorithm)
Figure 2.1 Simplified Model of Symmetric Encryption
34 CHAPTER 2 / CLASSICAL ENCRYPTION TECHNIQUES
opponent should be unable to decrypt ciphertext or discover the key even if he
or she is in possession of a number of ciphertexts together with the plaintext
that produced each ciphertext.
2. Sender and receiver must have obtained copies of the secret key in a secure
fashion and must keep the key secure.If someone can discover the key and
knows the algorithm,all communication using this key is readable.
We assume that it is impractical to decrypt a message on the basis of the
ciphertext plus knowledge of the encryption/decryption algorithm. In other words,
we do not need to keep the algorithm secret;we need to keep only the key secret.
This feature of symmetric encryption is what makes it feasible for widespread use.
The fact that the algorithm need not be kept secret means that manufacturers can
and have developed low-cost chip implementations of data encryption algorithms.
These chips are widely available and incorporated into a number of products.With
the use of symmetric encryption, the principal security problem is maintaining the
secrecy of the key.
Let us take a closer look at the essential elements of a symmetric encryp-
tion scheme, using Figure 2.2. A source produces a message in plaintext,
.The elements of are letters in some finite alphabet.
Traditionally,the alphabet usually consisted of the 26 capital letters. Nowadays,
the binary alphabet {0, 1} is typically used.For encryption, a key of the form
is generated.If the key is generated at the message source,
then it must also be provided to the destination by means of some secure chan-
nel.Alternatively, a third party could generate the key and securely deliver it to
both source and destination.
K = [K
1
,K
2
, Á , K
J
]
XMX = [X
1
,X
2
, Á , X
M
]
Message
source
Cryptanalyst
Key
source
Destination
X
X
X
^
K
^
Y = E(K, X)
Secure channel
K
Encryption
algorithm
Decryption
algorithm
Figure 2.2 Model of Symmetric Cryptosystem
2.1 / SYMMETRIC CIPHER MODEL 35
With the message and the encryption key as input, the encryption algo-
rithm forms the ciphertext .We can write this as
This notation indicates that is produced by using encryption algorithm E as a
function of the plaintext , with the specific function determined by the value of
the key .
The intended receiver, in possession of the key, is able to invert the
transformation:
An opponent, observing but not having access to or ,may attempt to
recover or or both and . It is assumed that the opponent knows the encryption
(E) and decryption (D) algorithms.If the opponent is interested in only this particular
message,then the focus of the effort is to recover by generating a plaintext estimate
.Often, however, the opponent is interested in being able to read future messages as
well,in which case an attempt is made to recover by generating an estimate .
Cryptography
Cryptographic systems are characterized along three independent dimensions:
1. T he type of operations used for transforming plaintext to ciphertext. All
encryption algorithms are based on two general principles: substitution,in
which each element in the plaintext (bit, letter,group of bits or letters) is
mapped into another element, and transposition,in which elements in the
plaintext are rearranged.The fundamental requirement is that no informa-
tion be lost (that is, that all operations are reversible). Most systems,
referred to as product systems, involve multiple stages of substitutions and
transpositions.
2. The number of keys used. If both sender and receiver use the same key,the
system is referred to as symmetric,single-key,secret-key, or conventional encryp-
tion. If the sender and receiver use different keys,the system is referred to as
asymmetric,two-key, or public-key encryption.
3. The way in which the plaintext is processed. A block cipher processes the
input one block of elements at a time,producing an output block for each
input block. A stream cipher processes the input elements continuously,
producing output one element at a time,as it goes along.
Cryptanalysis and Brute-Force Attack
Typically,the objective of attacking an encryption system is to recover the key in use
rather than simply to recover the plaintext of a single ciphertext.There are two gen-
eral approaches to attacking a conventional encryption scheme:
Cryptanalysis: Cryptanalytic attacks rely on the nature of the algorithm plus
perhaps some knowledge of the general characteristics of the plaintext or
K
N
K
X
N
X
KXKX
XKY
X = D(K, Y)
K
X
Y
Y = E(K, X)
Y = [Y
1
,Y
2
, Á , Y
N
]
KX
36 CHAPTER 2 / CLASSICAL ENCRYPTION TECHNIQUES
even some sample plaintext–ciphertext pairs.This type of attack exploits the
characteristics of the algorithm to attempt to deduce a specific plaintext or to
deduce the key being used.
Brute-force attack: The attacker tries every possible key on a piece of cipher-
text until an intelligible translation into plaintext is obtained.On average, half
of all possible keys must be tried to achieve success.
If either type of attack succeeds in deducing the key,the effect is catastrophic:
All future and past messages encrypted with that key are compromised.
We first consider cryptanalysis and then discuss brute-force attacks.
Table 2.1summarizes the various types of cryptanalytic attacks based on the
amount of information known to the cryptanalyst.The most difficult problem is pre-
sented when all that is available is the ciphertext only. In some cases,not even the
encryption algorithm is known, but in general,we can assume that the opponent
does know the algorithm used for encryption.One possible attack under these cir-
cumstances is the brute-force approach of trying all possible keys.If the key space is
very large,this becomes impractical. Thus,the opponent must rely on an analysis of
the ciphertext itself,generally applying various statistical tests to it. To use this
approach,the opponent must have some general idea of the type of plaintext that is
concealed, such as English or French text,an EXE file, a Java source listing, an
accounting file,and so on.
Table 2.1 Types of Attacks on Encrypted Messages
Type of Attack Known to Cryptanalyst
Ciphertext Only
• Encryption algorithm
• Ciphertext
Known Plaintext • Encryption algorithm
• Ciphertext
• One or more plaintext–ciphertext pairs formed with the secret key
Chosen Plaintext • Encryption algorithm
• Ciphertext
• Plaintext message chosen by cryptanalyst,together with its corresponding ciphertext
generated with the secret key
Chosen Ciphertext • Encryption algorithm
• Ciphertext
• Ciphertext chosen by cryptanalyst,together with its corresponding decrypted
plaintext generated with the secret key
Chosen Text • Encryption algorithm
• Ciphertext
• Plaintext message chosen by cryptanalyst,together with its corresponding
ciphertext generated with the secret key
• Ciphertext chosen by cryptanalyst,together with its corresponding decrypted
plaintext generated with the secret key
2.1 / SYMMETRIC CIPHER MODEL 37
The ciphertext-only attack is the easiest to defend against because the oppo-
nent has the least amount of information to work with.In many cases, however,the
analyst has more information.The analyst may be able to capture one or more
plaintext messages as well as their encryptions.Or the analyst may know that cer-
tain plaintext patterns will appear in a message.For example, a file that is encoded
in the Postscript format always begins with the same pattern,or there may be a
standardized header or banner to an electronic funds transfer message,and so on.
All these are examples of known plaintext.With this knowledge,the analyst may be
able to deduce the key on the basis of the way in which the known plaintext is
transformed.
Closely related to the known-plaintext attack is what might be referred to as a
probable-word attack.If the opponent is working with the encryption of some gen-
eral prose message,he or she may have little knowledge of what is in the message.
However,if the opponent is after some very specific information, then parts of the
message may be known.For example,if an entire accounting file is being transmitted,
the opponent may know the placement of certain key words in the header of the file.
As another example,the source code for a program developed by Corporation X
might include a copyright statement in some standardized position.
If the analyst is able somehow to get the source system to insert into the
system a message chosen by the analyst, then a chosen-plaintext attack is possible.
An example of this strategy is differential cryptanalysis,explored in Chapter 3. In
general, if the analyst is able to choose the messages to encrypt,the analyst may
deliberately pick patterns that can be expected to reveal the structure of the key.
Table 2.1lists two other types of attack: chosen ciphertext and chosen text.
These are less commonly employed as cryptanalytic techniques but are nevertheless
possible avenues of attack.
Only relatively weak algorithms fail to withstand a ciphertext-only attack.
Generally,an encryption algorithm is designed to withstand a known-plaintext
attack.
Two more definitions are worthy of note. An encryption scheme is
unconditionally secure if the ciphertext generated by the scheme does not con-
tain enough information to determine uniquely the corresponding plaintext, no
matter how much ciphertext is available.That is, no matter how much time an
opponent has,it is impossible for him or her to decrypt the ciphertext simply
because the required information is not there.With the exception of a scheme
known as the one-time pad (described later in this chapter), there is no encryp-
tion algorithm that is unconditionally secure.Therefore, all that the users of an
encryption algorithm can strive for is an algorithm that meets one or both of the
following criteria:
The cost of breaking the cipher exceeds the value of the encrypted information.
The time required to break the cipher exceeds the useful lifetime of the
information.
An encryption scheme is said to be computationally secure if either of the
foregoing two criteria are met. Unfortunately,it is very difficult to estimate the
amount of effort required to cryptanalyze ciphertext successfully.
38 CHAPTER 2 / CLASSICAL ENCRYPTION TECHNIQUES
All forms of cryptanalysis for symmetric encryption schemes are designed to
exploit the fact that traces of structure or pattern in the plaintext may survive
encryption and be discernible in the ciphertext.This will become clear as we exam-
ine various symmetric encryption schemes in this chapter.We will see in Part Two
that cryptanalysis for public-key schemes proceeds from a fundamentally different
premise,namely, that the mathematical properties of the pair of keys may make it
possible for one of the two keys to be deduced from the other.
A brute-force attack involves trying every possible key until an intelligible
translation of the ciphertext into plaintext is obtained.On average, half of all possi-
ble keys must be tried to achieve success.Table 2.2shows how much time is involved
for various key spaces.Results are shown for four binary key sizes. The 56-bit key
size is used with the Data Encryption Standard (DES) algorithm, and the 168-bit
key size is used for triple DES.The minimum key size specified for Advanced
Encryption Standard (AES) is 128 bits.Results are also shown for what are called
substitution codes that use a 26-character key (discussed later),in which all possible
permutations of the 26 characters serve as keys.For each key size, the results are
shown assuming that it takes 1 μs to perform a single decryption,which is a reason-
able order of magnitude for today’s machines.With the use of massively parallel
organizations of microprocessors,it may be possible to achieve processing rates
many orders of magnitude greater.The final column of Table 2.2 considers the
results for a system that can process 1 million keys per microsecond.As you can see,
at this performance level,DES can no longer be considered computationally secure.
2.2 SUBSTITUTION TECHNIQUES
In this section and the next,we examine a sampling of what might be called classical
encryption techniques.A study of these techniques enables us to illustrate the basic
approaches to symmetric encryption used today and the types of cryptanalytic
attacks that must be anticipated.
The two basic building blocks of all encryption techniques are substitution and
transposition.We examine these in the next two sections.Finally,we discuss a system
that combines both substitution and transposition.
Table 2.2 Average Time Required for Exhaustive Key Search
Key Size (bits)
Number of
Alternative Keys
Time Required at 1
Decryption/μs
Time Required at
10
6
Decryptions/μs
32
2
32
= 4.3 * 10
9
2
31
ms = 35.8 minutes
2.15 milliseconds
56
2
56
= 7.2 * 10
16
2
55
ms = 1142 years
10.01 hours
128
2
128
= 3.4 * 10
38
2
127
ms = 5.4 * 10
24
years 5.4 * 10
18
years
168
2
168
= 3.7 * 10
50
2
167
ms = 5.9 * 10
36
years 5.9 * 10
30
years
26 characters
(permutation)
26! = 4 * 10
26
2 * 10
26
ms = 6.4 * 10
12
years 6.4 * 10
6
years
2.2 / SUBSTITUTION TECHNIQUES 39
A substitution technique is one in which the letters of plaintext are replaced
by other letters or by numbers or symbols.
1
If the plaintext is viewed as a sequence
of bits,then substitution involves replacing plaintext bit patterns with ciphertext bit
patterns.
Caesar Cipher
The earliest known,and the simplest, use of a substitution cipher was by Julius
Caesar.The Caesar cipher involves replacing each letter of the alphabet with the let-
ter standing three places further down the alphabet.For example,
plain: meet me after the toga party
cipher: PHHW PH DIWHU WKH WRJD SDUWB
Note that the alphabet is wrapped around,so that the letter following Z is A.
We can define the transformation by listing all possibilities,as follows:
plain: a b c d e f g h i j k l m n o p q r s t u v w x y z
cipher: D E F G H I J K L M N O P Q R S T U V W X Y Z A B C
Let us assign a numerical equivalent to each letter:
1
When letters are involved,the following conventions are used in this book.Plaintext is always in lowercase;
ciphertext is in uppercase;key values are in italicized lowercase.
abcde fghij k lm
0 1 2 3 4 5 6 7 8 9 10 11 12
nopqrstuvwx yz
13 14 15 16 17 18 19 20 21 22 23 24 25
Then the algorithm can be expressed as follows.For each plaintext letter ,substi-
tute the ciphertext letter :
2
A shift may be of any amount,so that the general Caesar algorithm is
(2.1)
where takes on a value in the range 1 to 25.The decryption algorithm is simply
(2.2)p = D(k, C) = (C - k) mod 26
k
C = E(k, p) = (p + k) mod 26
C = E(3, p) = (p + 3) mod 26
C
p
2
We define amod n to be the remainder when a is divided by n. For example,11 mod 7 = 4. See Chapter 4
for a further discussion of modular arithmetic.
40 CHAPTER 2 / CLASSICAL ENCRYPTION TECHNIQUES
If it is known that a given ciphertext is a Caesar cipher, then a brute-force
cryptanalysis is easily performed: simply try all the 25 possible keys.Figure 2.3
shows the results of applying this strategy to the example ciphertext.In this case, the
plaintext leaps out as occupying the third line.
Three important characteristics of this problem enabled us to use a brute-
force cryptanalysis:
1. The encryption and decryption algorithms are known.
2. There are only 25 keys to try.
3. The language of the plaintext is known and easily recognizable.
In most networking situations,we can assume that the algorithms are known.
What generally makes brute-force cryptanalysis impractical is the use of an algo-
rithm that employs a large number of keys.For example, the triple DES algorithm,
examined in Chapter 6, makes use of a 168-bit key,giving a key space of or
greater than possible keys.3.7 * 10
50
2
168
PHHW PH DIWHU WKH WRJD SDUWB
KEY
1 oggv og chvgt vjg vqic rctva
2 nffu nf bgufs uif uphb qbsuz
3 meet me after the toga party
4 ldds ld zesdq sgd snfz ozqsx
5 kccr kc ydrcp rfc rmey nyprw
6 jbbq jb xcqbo qeb qldx mxoqv
7 iaap ia wbpan pda pkcw lwnpu
8 hzzo hz vaozm ocz ojbv kvmot
9 gyyn gy uznyl nby niau julns
10 fxxm fx tymxk max mhzt itkmr
11 ewwl ew sxlwj lzw lgys hsjlq
12 dvvk dv rwkvi kyv kfxr grikp
13 cuuj cu qvjuh jxu jewq fqhjo
14 btti bt puitg iwt idvp epgin
15 assh as othsf hvs hcuo dofhm
16 zrrg zr nsgre gur gbtn cnegl
17 yqqf yq mrfqd ftq fasm bmdfk
18 xppe xp lqepc esp ezrl alcej
19 wood wo kpdob dro dyqk zkbdi
20 vnnc vn jocna cqn cxpj yjach
21 ummb um inbmz bpm bwoi xizbg
22 tlla tl hmaly aol avnh whyaf
23 skkz sk glzkx znk zumg vgxze
24 rjjy rj fkyjw ymj ytlf ufwyd
25 qiix qi ejxiv xli xske tevxc
Figure 2.3 Brute-Force Cryptanalysis of Caesar
Cipher
2.2 / SUBSTITUTION TECHNIQUES 41
The third characteristic is also significant.If the language of the plaintext is
unknown, then plaintext output may not be recognizable.Furthermore, the
input may be abbreviated or compressed in some fashion,again making recogni-
tion difficult. For example,Figure 2.4 shows a portion of a text file compressed
using an algorithm called ZIP.If this file is then encrypted with a simple substi-
tution cipher (expanded to include more than just 26 alphabetic characters),
then theplaintext may not be recognized when it is uncovered in the brute-force
cryptanalysis.
Monoalphabetic Ciphers
With only 25 possible keys,the Caesar cipher is far from secure.A dramatic increase
in the key space can be achieved by allowing an arbitrary substitution.Before pro-
ceeding,we define the term permutation.A permutationof a finite set of elements
is an ordered sequence of all the elements of , with each element appearing exactly
once.For example, if ,there are six permutations of :
In general,there are ! permutations of a set of elements,because the first
element can be chosen in one of nways, the second in ways, the third in
ways,and so on.
Recall the assignment for the Caesar cipher:
plain: a b c d e f g h i j k l m n o p q r s t u v w x y z
cipher: D E F G H I J K L M N O P Q R S T U V W X Y Z A B C
If,instead, the “cipher”line can be any permutation of the 26 alphabetic characters,
then there are 26! or greater than possible keys.This is 10 orders of magni-
tude greater than the key space for DES and would seem to eliminate brute-force
techniques for cryptanalysis.Such an approach is referred to as a monoalphabetic
substitution cipher,because a single cipher alphabet (mapping from plain alphabet
to cipher alphabet) is used per message.
There is,however, another line of attack. If the cryptanalyst knows the
nature of the plaintext (e.g.,noncompressed English text), then the analyst can
exploit the regularities of the language.To see how such a cryptanalysis might
4 * 10
26
n - 2n - 1
nn
abc, acb, bac, bca, cab, cba
SS = {a, b, c}
S
S
Figure 2.4 Sample of Compressed Text
42 CHAPTER 2 / CLASSICAL ENCRYPTION TECHNIQUES
proceed, we give a partial example here that is adapted from one in [SINK66].
The ciphertext to be solved is
UZQSOVUOHXMOPVGPOZPEVSGZWSZOPFPESXUDBMETSXAIZ
VUEPHZHMDZSHZOWSFPAPPDTSVPQUZWYMXUZUHSX
EPYEPOPDZSZUFPOMBZWPFUPZHMDJUDTMOHMQ
As a first step,the relative frequency of the letters can be determined and
compared to a standard frequency distribution for English, such as is shown in
Figure 2.5 (based on [LEWA00]).If the message were long enough, this technique
alone might be sufficient, but because this is a relatively short message,we cannot
expect an exact match. In any case,the relative frequencies of the letters in the
ciphertext (in percentages) are as follows:
0
2
4
6
8
10
12
14
A
8.167
1.492
2.782
4.253
12.702
2.228
2.015
6.094
6.996
0.153
0.772
4.025
2.406
6.749
7.507
1.929
0.095
5.987
6.327
9.056
2.758
0.978
2.360
0.150
1.974
0.074
B C D E F G H I J K L M N
Relative frequency (%)
O P Q R S T U V W X Y Z
Figure 2.5 Relative Frequency of Letters in English Text
P 13.33 H 5.83 F 3.33 B 1.67 C 0.00
Z 11.67 D 5.00 W 3.33 G 1.67 K 0.00
S 8.33 E 5.00 Q 2.50 Y 1.67 L 0.00
U 8.33 V 4.17 T 2.50 I 0.83 N 0.00
O 7.50 X 4.17 A 1.67 J 0.83 R 0.00
M 6.67
2.2 / SUBSTITUTION TECHNIQUES 43
Comparing this breakdown with Figure 2.5,it seems likely that cipher letters
P and Z are the equivalents of plain letters e and t, but it is not certain which is
which.The letters S,U, O,M, and H are all of relatively high frequency and probably
correspond to plain letters from the set {a,h, i, n, o,r, s}. The letters with the lowest
frequencies (namely,A,B, G,Y, I,J) are likely included in the set {b, j, k, q,v, x, z}.
There are a number of ways to proceed at this point.We could make some
tentative assignments and start to fill in the plaintext to see if it looks like a reasonable
“skeleton”of a message. A more systematic approach is to look for other regularities.
For example,certain words may be known to be in the text. Or we could look for
repeating sequences of cipher letters and try to deduce their plaintext equivalents.
A powerful tool is to look at the frequency of two-letter combinations,known
asdigrams. A table similar to Figure 2.5 could be drawn up showing the relative fre-
quency of digrams.The most common such digram is th.In our ciphertext, the most
common digram is ZW,which appears three times.So we make the correspondence
of Z with t and W with h.Then, by our earlier hypothesis, we can equate P with e.
Now notice that the sequence ZWP appears in the ciphertext,and we can translate
that sequence as “the.”This is the most frequent trigram (three-letter combination)
in English,which seems to indicate that we are on the right track.
Next,notice the sequence ZWSZ in the first line. We do not know that these
four letters form a complete word, but if they do,it is of the form th_t. If so, S
equates with a.
So far,then, we have
UZQSOVUOHXMOPVGPOZPEVSGZWSZOPFPESXUDBMETSXAIZ
t a e e te a that e e a a
VUEPHZHMDZSHZOWSFPAPPDTSVPQUZWYMXUZUHSX
e t ta t ha e ee a e th t a
EPYEPOPDZSZUFPOMBZWPFUPZHMDJUDTMOHMQ
e e e tat e the t
Only four letters have been identified,but already we have quite a bit of the
message.Continued analysis of frequencies plus trial and error should easily yield a
solution from this point.The complete plaintext, with spaces added between words,
follows:
it was disclosed yesterday that several informal but
direct contacts have been made with political
representatives of the viet cong in moscow
Monoalphabetic ciphers are easy to break because they reflect the frequency
data of the original alphabet.A countermeasure is to provide multiple substitutes,
known as homophones,for a single letter. For example,the letter e could be assigned
a number of different cipher symbols,such as 16, 74, 35, and 21, with each homo-
phone assigned to a letter in rotation or randomly.If the number of symbols assigned
to each letter is proportional to the relative frequency of that letter,then single-letter
frequency information is completely obliterated.The great mathematician Carl
44 CHAPTER 2 / CLASSICAL ENCRYPTION TECHNIQUES
Friedrich Gauss believed that he had devised an unbreakable cipher using homo-
phones.However, even with homophones,each element of plaintext affects only one
element of ciphertext, and multiple-letter patterns (e.g.,digram frequencies) still
survive in the ciphertext,making cryptanalysis relatively straightforward.
Two principal methods are used in substitution ciphers to lessen the extent to
which the structure of the plaintext survives in the ciphertext: One approach is to
encrypt multiple letters of plaintext,and the other is to use multiple cipher alphabets.
We briefly examine each.
Playfair Cipher
The best-known multiple-letter encryption cipher is the Playfair,which treats digrams
in the plaintext as single units and translates these units into ciphertext digrams.
3
The Playfair algorithm is based on the use of a 5 × 5 matrix of letters con-
structed using a keyword. Here is an example,solved by Lord Peter Wimsey in
Dorothy Sayers’s Have His Carcase:
4
3
This cipher was actually invented by British scientist Sir Charles Wheatstone in 1854,but it bears the name
of his friend Baron Playfair of St.Andrews,who championed the cipher at the British foreign office.
4
The book provides an absorbing account of a probable-word attack.
MON AR
CHYBD
E F G I/J K
LPQST
UVWXZ
In this case,the keyword is monarchy. The matrix is constructed by filling in
the letters of the keyword (minus duplicates) from left to right and from top to bot-
tom, and then filling in the remainder of the matrix with the remaining letters in
alphabetic order.The letters I and J count as one letter. Plaintext is encrypted two
letters at a time,according to the following rules:
1. Repeating plaintext letters that are in the same pair are separated with a filler
letter,such as x, so that balloon would be treated as ba lx lo on.
2. Two plaintext letters that fall in the same row of the matrix are each replaced by
the letter to the right, with the first element of the row circularly following the
last.For example, ar is encrypted as RM.
3. Two plaintext letters that fall in the same column are each replaced by the letter
beneath, with the top element of the column circularly following the last.For
example,mu is encrypted as CM.
4. Otherwise,each plaintext letter in a pair is replaced by the letter that lies in its
own row and the column occupied by the other plaintext letter. Thus,hs
becomes BP and ea becomes IM (or JM,as the encipherer wishes).
The Playfair cipher is a great advance over simple monoalphabetic ciphers.
For one thing,whereas there are only 26 letters, there are 26 × 26 = 676 digrams,so
2.2 / SUBSTITUTION TECHNIQUES 45
that identification of individual digrams is more difficult.Furthermore, the relative
frequencies of individual letters exhibit a much greater range than that of digrams,
making frequency analysis much more difficult. For these reasons,the Playfair
cipher was for a long time considered unbreakable.It was used as the standard field
system by the British Army in World War I and still enjoyed considerable use by the
U.S.Army and other Allied forces during World War II.
Despite this level of confidence in its security,the Playfair cipher is relatively
easy to break,because it still leaves much of the structure of the plaintext language
intact.A few hundred letters of ciphertext are generally sufficient.
One way of revealing the effectiveness of the Playfair and other ciphers is shown
in Figure 2.6,based on [SIMM93]. The line labeled plaintext plots the frequency distri-
bution of the more than 70,000 alphabetic characters in the Encyclopaedia Britannica
article on cryptology.
5
This is also the frequency distribution of any monoalphabetic
substitution cipher,because the frequency values for individual letters are the same,
just with different letters substituted for the original letters.The plot was developed in
the following way:The number of occurrences of each letter in the text was counted
and divided by the number of occurrences of the letter e (the most frequently used
letter).As a result, e has a relative frequency of 1,t of about 0.76, and so on. The points
on the horizontal axis correspond to the letters in order of decreasing frequency.
Figure 2.6 also shows the frequency distribution that results when the text is
encrypted using the Playfair cipher.To normalize the plot,the number of occurrences
of each letter in the ciphertext was again divided by the number of occurrences of e
0
10
20
30
40
50
60
70
80
90
100
246810121416 22242618 20
Frequency ranked letters
Plaintext
Playfair cipher
Vigenère cipher
Random polyalphabetic cipher
Figure 2.6 Relative Frequency of Occurrence of Letters
5
I am indebted to Gustavus Simmons for providing the plots and explaining their method of construction.
46 CHAPTER 2 / CLASSICAL ENCRYPTION TECHNIQUES
in the plaintext.The resulting plot therefore shows the extent to which the frequency
distribution of letters,which makes it trivial to solve substitution ciphers, is masked
by encryption.If the frequency distribution information were totally concealed in the
encryption process,the ciphertext plot of frequencies would be flat, and cryptanalysis
using ciphertext only would be effectively impossible.As the figure shows, the
Playfair cipher has a flatter distribution than does plaintext, but nevertheless,it
reveals plenty of structure for a cryptanalyst to work with.
Hill Cipher
6
Another interesting multiletter cipher is the Hill cipher, developed by the mathe-
matician Lester Hill in 1929.
CONCEPTS FROM L INEAR ALGEBRA Before describing the Hill cipher, let us briefly
review some terminology from linear algebra.In this discussion, we are concerned
with matrix arithmetic modulo 26.For the reader who needs a refresher on matrix
multiplication and inversion,see Appendix E.
We define the inverse of a square matrix M by the equation
,where I is the identity matrix. I is a square matrix that is all
zeros except for ones along the main diagonal from upper left to lower right.The
inverse of a matrix does not always exist,but when it does, it satisfies the preceding
equation.For example,
To explain how the inverse of a matrix is computed,we begin by with the con-
cept of determinant.For any square matrix (m × ),the determinant equals the sum
of all the products that can be formed by taking exactly one element from each row
and exactly one element from each column,with certain of the product terms pre-
ceded by a minus sign.For a 2 × 2 matrix,
the determinant is . For a matrix, the value of the determi-
nant is .
If a square matrix A has a nonzero determinant, then the inverse of the matrix is
k
11
k
22
k
33
+ k
21
k
32
k
13
+ k
31
k
12
k
23
- k
31
k
22
k
13
- k
21
k
12
k
33
- k
11
k
32
k
23
3 * 3k
11
k
22
- k
12
k
21
a
k
11
k
12
k
21
k
22
b
m
= a
53 130
156 79
b mod 26 = a
10
01
b
AA
-1
= a
(5 * 9) + (8 * 1) (5 * 2) + (8 * 15)
(17 * 9) + (3 * 1) (17 * 2) + (3 * 15)
b
A = a
58
17 3
b A
-1
mod 26 = a
92
115
b
M(M
-1
) = M
-1
M = I
M
-1
6
This cipher is somewhat more difficult to understand than the others in this chapter,but it illustrates an
important point about cryptanalysis that will be useful later on.This subsection can be skipped on a first
reading.
2.2 / SUBSTITUTION TECHNIQUES 47
computed as , where ( ) is the subdeterminant
formed by deleting the th row and the th column of A, det(A) is the determinant
of A,and (det is the multiplicative inverse of (det A)mod 26.
Continuing our example,
We can show that , because (see
Chapter 4 or Appendix E).Therefore, we compute the inverse of A as
THEHILL ALGORITHM This encryption algorithm takes successive plaintext letters
and substitutes for them ciphertext letters.The substitution is determined by
linear equations in which each character is assigned a numerical value
.For ,the system can be described as
This can be expressed in terms of row vectors and matrices:
7
or
where C and P are row vectors of length 3 representing the plaintext and ciphertext,
and K is a matrix representing the encryption key. Operations are performed
mod 26.
For example,consider the plaintext “paymoremoney” and use the encryp-
tion key
K =
£
17 17 5
21 18 21
2219
3 * 3
C = PK mod 26
(c
1
c
2
c
3
) = (p p
2
p
3
)
£
k
11
k
12
k
13
k
21
k
22
k
23
k
31
k
32
k
33
mod 26
c
3
= (k
31
p
1
+ k
32
p
2
+ k
33
p
3
)mod 26
c
2
= (k
21
p
1
+ k
22
p
2
+ k
23
p
3
)mod 26
c
1
= (k
11
p
1
+ k
12
p
2
+ k
13
p
3
)mod 26
m = 3(a = 0, b = 1, Á , z = 25)
mm
m
A
-1
mod 26 = 3 a
3 - 8
-17 5
b = 3a
318
95
b = a
954
27 15
b = a
92
115
b
A = a
58
17 3
b
9 * 3 = 27 mod 26 = 19
-1
mod 26 = 3
det a
58
17 3
b = (5 * 3) - (8 * 17) =-121 mod 26 = 9
A)
-1
ij
D
ji
[A
-1
]
ij
= (det A)
-1
(-1)
i+j
(D
ji
)
7
Some cryptography books express the plaintext and ciphertext as column vectors,so that the column
vector is placed after the matrix rather than the row vector placed before the matrix.Sage uses row vectors,
so we adopt that convention.
48 CHAPTER 2 / CLASSICAL ENCRYPTION TECHNIQUES
The first three letters of the plaintext are represented by the vector . Then
. Continuing in this fash-
ion,the ciphertext for the entire plaintext is RRLMWBKASPDH.
Decryption requires using the inverse of the matrix K. We can compute
, and therefore, . We can then compute the
inverse as
This is demonstrated as
It is easily seen that if the matrix is applied to the ciphertext, then the
plaintext is recovered.
In general terms,the Hill system can be expressed as
As with Playfair, the strength of the Hill cipher is that it completely hides
single-letter frequencies.Indeed, with Hill,the use of a larger matrix hides more fre-
quency information.Thus, a Hill cipher hides not only single-letter but also
two-letter frequency information.
Although the Hill cipher is strong against a ciphertext-only attack, it is
easily broken with a known plaintext attack. For an Hill cipher,suppose
we have plaintext–ciphertext pairs, each of length . We label the pairs
such that for 1 …… and
for some unknown key matrix K. Now define two matrices and
.Then we can form the matrix equation Y = XK. If X has an inverse,then
we can determine .If X is not invertible, then a new version of X can
be formed with additional plaintext–ciphertext pairs until an invertible X is
obtained.
Consider this example.Suppose that the plaintext “hillcipher” is encrypted
using a Hill cipher to yield the ciphertext HCRZSSXNSP.Thus,we know that
; ; and so on.Using the first two
plaintext–ciphertext pairs,we have
The inverse of Xcan be computed:
a
78
11 11
b
-1
= a
25 22
123
b
a
72
17 25
b = a
78
11 11
bKmod 26
(1111)K mod 26 = (17 25)(78)K mod 26 = (72)
2 * 2
K = X
-1
Y
Y = (c
ij
)
X = (p
ij
)m * m
mjC
j
= P
j
KP
j
= (p
1j
p
1j
Á
p
mj
)and C
j
= (c
1j
c
1j
Á
c
mj
)
mm
m * m
3 * 3
P = D(K, C) = CK
-1
mod 26 = PKK
-1
= P
C = E(K, P) = PK mod 26
K
-1
£
17 17 5
21 18 21
2219
£
4915
15 17 6
24 0 17
=
£
443 442 442
858 495 780
494 52 365
mod 26 =
£
100
010
001
K
-1
=
4 9 15
£
15 17 6
24 0 17
(detK)
-1
mod 26 = 17 det K = 23
(15 0 24)K = (303 303 531) mod 26 = (17 17 11) = RRL
(15 0 24)
2.2 / SUBSTITUTION TECHNIQUES 49
so
This result is verified by testing the remaining plaintext–ciphertext pairs.
Polyalphabetic Ciphers
Another way to improve on the simple monoalphabetic technique is to use different
monoalphabetic substitutions as one proceeds through the plaintext message.The
general name for this approach is polyalphabetic substitution cipher.All these tech-
niques have the following features in common:
1. A set of related monoalphabetic substitution rules is used.
2. A key determines which particular rule is chosen for a given transformation.
VIGEN
`
ERE CIPHER The best known, and one of the simplest, polyalphabetic ciphers
is the Vigenère cipher.In this scheme,the set of related monoalphabetic substitution
rules consists of the 26 Caesar ciphers with shifts of 0 through 25. Each cipher
is denoted by a key letter, which is the ciphertext letter that substitutes for
the plaintext letter a.Thus, a Caesar cipher with a shift of 3 is denoted by the key
value .
We can express the Vigenère cipher in the following manner.Assume a
sequence of plaintext letters and a key consisting of the
sequence of letters ,where typically < . The sequence of
ciphertext letters is calculated as follows:
Thus,the first letter of the key is added to the first letter of the plaintext,mod 26, the
second letters are added,and so on through the first letters of the plaintext.For
the next letters of the plaintext,the key letters are repeated. This process contin-
ues until all of the plaintext sequence is encrypted. A general equation of the
encryption process is
(2.3)
Compare this with Equation (2.1) for the Caesar cipher. In essence,each
plaintext character is encrypted with a different Caesar cipher,depending on the
corresponding key character. Similarly, decryption is a generalization of
Equation (2.2):
(2.4)
To encrypt a message,a key is needed that is as long as the message. Usually,
the key is a repeating keyword. For example,if the keyword is deceptive, the
message “we are discovered save yourself”is encrypted as
p
i
= (C
i
- k
imod m
)mod 26
C
i
= (p
i
+ k
imod m
)mod 26
m
m
C = C
0
,C
1
,C
2
, Á , C
n-1
= E(K, P) = E[(k
0
,k
1
,k
2
, Á , k
m-1
), (p
0
,p
1
,p
2
, Á , p
n-1
)]
= (p
0
+ k
0
)mod 26, (p
1
+ k
1
)mod 26, Á , (p
m-1
+ k
m-1
)mod 26,
(p
m
+ k
0
)mod 26, (p
m+1
+ k
1
)mod 26, Á , (p
2m-1
+ k
m-1
)mod 26, Á
C = C
0
,C
1
,C
2
, Á , C
n-1
nmK = k
0
,k
1
,k
2
, Á , k
m-1
P = p
0
,p
1
,p
2
, Á , p
n-1
d
K = a
25 22
123
ba
72
17 25
b = a
549 600
398 577
bmod 26 = a
32
85
b
50 CHAPTER 2 / CLASSICAL ENCRYPTION TECHNIQUES
key:
deceptivedeceptivedeceptive
plaintext: wearediscoveredsaveyourself
ciphertext: ZICVTW
QNGRZGVTWAVZHCQYGLMGJ
Expressed numerically,we have the following result.
key 3 4 2 4 15 19 8 21 4 3 4 2 4 15
plaintext 22 4 0 17 4 3 8 18 2 14 21 4 17 4
ciphertext 25 8 2 21 19 22 16 13 6 17 25 6 21 19
key 19 8 21 4 3 4 2 4 15 19 8 21 4
plaintext 3 18 0 21 4 24 14 20 17 18 4 11 5
ciphertext 22 0 21 25 7 2 16 24 6 11 12 6 9
The strength of this cipher is that there are multiple ciphertext letters for each
plaintext letter,one for each unique letter of the keyword. Thus,the letter frequency
information is obscured. However,not all knowledge of the plaintext structure is
lost.For example, Figure 2.6 shows the frequency distribution for a Vigenère cipher
with a keyword of length 9.An improvement is achieved over the Playfair cipher,
but considerable frequency information remains.
It is instructive to sketch a method of breaking this cipher,because the method
reveals some of the mathematical principles that apply in cryptanalysis.
First,suppose that the opponent believes that the ciphertext was encrypted
using either monoalphabetic substitution or a Vigenère cipher.A simple test can
be made to make a determination.If a monoalphabetic substitution is used, then
the statistical properties of the ciphertext should be the same as that of the
language of the plaintext.Thus, referring to Figure 2.5,there should be one cipher
letter with a relative frequency of occurrence of about 12.7%, one with about
9.06%, and so on.If only a single message is available for analysis, we would not
expect an exact match of this small sample with the statistical profile of the plain-
text language. Nevertheless,if the correspondence is close, we can assume a
monoalphabetic substitution.
If,on the other hand, a Vigenère cipher is suspected, then progress depends on
determining the length of the keyword,as will be seen in a moment. For now,let us con-
centrate on how the keyword length can be determined.The important insight that leads
to a solution is the following:If two identical sequences of plaintext letters occur at a dis-
tance that is an integer multiple of the keyword length, they will generate identical
ciphertext sequences.In the foregoing example,two instances of the sequence “red” are
separated by nine character positions.Consequently, in both cases,r is encrypted using
key letter , e is encrypted using key letter ,and d is encrypted using key letter . Thus,
in both cases,the ciphertext sequence is VTW.We indicate this above by underlining the
relevant ciphertext letters and shading the relevant ciphertext numbers.
An analyst looking at only the ciphertext would detect the repeated sequences
VTW at a displacement of 9 and make the assumption that the keyword is either
three or nine letters in length.The appearance of VTW twice could be by chance
tpe
2.2 / SUBSTITUTION TECHNIQUES 51
and not reflect identical plaintext letters encrypted with identical key letters.
However, if the message is long enough,there will be a number of such repeated
ciphertext sequences.By looking for common factors in the displacements of the
various sequences,the analyst should be able to make a good guess of the keyword
length.
Solution of the cipher now depends on an important insight. If the keyword
length is , then the cipher,in effect, consists of monoalphabetic substitution
ciphers.For example, with the keyword DECEPTIVE,the letters in positions 1, 10,
19,and so on are all encrypted with the same monoalphabetic cipher. Thus,we can
use the known frequency characteristics of the plaintext language to attack each of
the monoalphabetic ciphers separately.
The periodic nature of the keyword can be eliminated by using a nonrepeating
keyword that is as long as the message itself.Vigenère proposed what is referred to
as an autokey system,in which a keyword is concatenated with the plaintext itself to
provide a running key.For our example,
key:
deceptivewearediscoveredsav
plaintext: wearediscoveredsaveyourself
ciphertext: ZICVTWQNGKZEIIGASXSTSLVVWLA
Even this scheme is vulnerable to cryptanalysis.Because the key and the plain-
text share the same frequency distribution of letters,a statistical technique can be
applied.For example,e enciphered by , by Figure 2.5, can be expected to occur with a
frequency of , whereas t enciphered by would occur only about half
as often.These regularities can be exploited to achieve successful cryptanalysis.
8
VERNAM CIPHER The ultimate defense against such a cryptanalysis is to choose a
keyword that is as long as the plaintext and has no statistical relationship to it.Such
a system was introduced by an AT&T engineer named Gilbert Vernam in 1918.His
system works on binary data (bits) rather than letters.The system can be expressed
succinctly as follows (Figure 2.7):
where
th binary digit of plaintext
th binary digit of key
th binary digit of ciphertext
= exclusive-or (XOR) operation
Compare this with Equation (2.3) for the Vigenère cipher.
c
i
= i
k
i
= i
p
i
= i
c
i
= p
i
k
i
t(0.127)
2
L 0.016
e
mm
8
Although the techniques for breaking a Vigen`ere cipher are by no means complex, a 1917 issue of
Scientific Americancharacterized this system as “impossible of translation.”This is a point worth remem-
bering when similar claims are made for modern algorithms.
52 CHAPTER 2 / CLASSICAL ENCRYPTION TECHNIQUES
Thus,the ciphertext is generated by performing the bitwise XOR of the plain-
text and the key.Because of the properties of the XOR, decryption simply involves
the same bitwise operation:
which compares with Equation (2.4).
The essence of this technique is the means of construction of the key.
Vernam proposed the use of a running loop of tape that eventually repeated the
key,so that in fact the system worked with a very long but repeating keyword.
Although such a scheme,with a long key, presents formidable cryptanalytic diffi-
culties,it can be broken with sufficient ciphertext, the use of known or probable
plaintext sequences,or both.
One-Time Pad
An Army Signal Corp officer,Joseph Mauborgne,proposed an improvement to the
Vernam cipher that yields the ultimate in security.Mauborgne suggested using a
random key that is as long as the message,so that the key need not be repeated. In
addition,the key is to be used to encrypt and decrypt a single message, and then is
discarded.Each new message requires a new key of the same length as the new mes-
sage.Such a scheme, known as a one-time pad, is unbreakable.It produces random
output that bears no statistical relationship to the plaintext.Because the ciphertext
contains no information whatsoever about the plaintext, there is simply no way to
break the code.
An example should illustrate our point. Suppose that we are using a
Vigenère scheme with 27 characters in which the twenty-seventh character is the
space character,but with a one-time key that is as long as the message. Consider
the ciphertext
ANKYODKYUREPFJBYOJDSPLREYIUNOFDOIUERFPLUYTS
We now show two different decryptions using two different keys:
ciphertext: ANKYODKYUREPFJBYOJDSPLREYIUNOFDOIUERFPLUYTS
key:
pxlmvmsydofuyrvzwc tnlebnecvgdupahfzzlmnyih
plaintext: mr mustard with the candlestick in the hall
p
i
= c
i
k
i
Key stream
generator
Cryptographic
bit stream ( k
i
)
Cryptographic
bit stream ( k
i
)
Plaintext
(p
i
)
Plaintext
(p
i
)
Ciphertext
(c
i
)
Key stream
generator
Figure 2.7 Vernam Cipher
2.3 / TRANSPOSITION TECHNIQUES 53
ciphertext: ANKYODKYUREPFJBYOJDSPLREYIUNOFDOIUERFPLUYTS
key:
mfugpmiydgaxgoufhklllmhsqdqogtewbqfgyovuhwt
plaintext: miss scarlet with the knife in the library
Suppose that a cryptanalyst had managed to find these two keys.Two plausible
plaintexts are produced. How is the cryptanalyst to decide which is the correct
decryption (i.e.,which is the correct key)? If the actual key were produced in a truly
random fashion,then the cryptanalyst cannot say that one of these two keys is more
likely than the other.Thus,there is no way to decide which key is correct and there-
fore which plaintext is correct.
In fact,given any plaintext of equal length to the ciphertext, there is a key that
produces that plaintext.Therefore, if you did an exhaustive search of all possible
keys,you would end up with many legible plaintexts, with no way of knowing which
was the intended plaintext.Therefore, the code is unbreakable.
The security of the one-time pad is entirely due to the randomness of the key.
If the stream of characters that constitute the key is truly random,then the stream of
characters that constitute the ciphertext will be truly random.Thus, there are no
patterns or regularities that a cryptanalyst can use to attack the ciphertext.
In theory,we need look no further for a cipher. The one-time pad offers com-
plete security but,in practice, has two fundamental difficulties:
1. There is the practical problem of making large quantities of random keys.Any
heavily used system might require millions of random characters on a regular
basis.Supplying truly random characters in this volume is a significant task.
2. Even more daunting is the problem of key distribution and protection. For
every message to be sent,a key of equal length is needed by both sender and
receiver.Thus,a mammoth key distribution problem exists.
Because of these difficulties,the one-time pad is of limited utility and is useful
primarily for low-bandwidth channels requiring very high security.
The one-time pad is the only cryptosystem that exhibits what is referred to as
perfect secrecy.This concept is explored in Appendix F.
2.3 TRANSPOSITION TECHNIQUES
All the techniques examined so far involve the substitution of a ciphertext symbol
for a plaintext symbol.A very different kind of mapping is achieved by performing
some sort of permutation on the plaintext letters.This technique is referred to as a
transposition cipher.
The simplest such cipher is the rail fence technique,in which the plaintext is
written down as a sequence of diagonals and then read off as a sequence of rows.For
example,to encipher the message “meet me after the toga party” with a rail fence of
depth 2,we write the following:
m e m a t r h t g p r y
e t e f e t e o a a t
54 CHAPTER 2 / CLASSICAL ENCRYPTION TECHNIQUES
The encrypted message is
This sort of thing would be trivial to cryptanalyze.A more complex scheme is
to write the message in a rectangle,row by row,and read the message off, column by
column, but permute the order of the columns.The order of the columns then
becomes the key to the algorithm.For example,
Key: 4 3 1 2 5 6 7
Plaintext: a t t a c k p
o s t p o n e
d u n t i l t
w o a m x y z
Ciphertext: TTNAAPTMTSUOAODWCOIXKNLYPETZ
Thus,in this example, the key is 4312567.To encrypt, start with the column that
is labeled 1,in this case column 3. Write down all the letters in that column.Proceed to
column 4,which is labeled 2, then column 2, then column 1, then columns 5, 6,and 7.
A pure transposition cipher is easily recognized because it has the same letter
frequencies as the original plaintext. For the type of columnar transposition just
shown,cryptanalysis is fairly straightforward and involves laying out the ciphertext in
a matrix and playing around with column positions.Digram and trigram frequency
tables can be useful.
The transposition cipher can be made significantly more secure by performing
more than one stage of transposition.The result is a more complex permutation that
is not easily reconstructed.Thus, if the foregoing message is reencrypted using the
same algorithm,
Key: 4 3 1 2 5 6 7
Input: t t n a a p t
m t s u o a o
d w c o i x k
n l y p e t z
Output: NSCYAUOPTTWLTMDNAOIEPAXTTOKZ
To visualize the result of this double transposition,designate the letters in
the original plaintext message by the numbers designating their position.Thus,
with 28 letters in the message,the original sequence of letters is
01 02 03 04 05 06 07 08 09 10 11 12 13 14
15 16 17 18 19 20 21 22 23 24 25 26 27 28
After the first transposition,we have
03 10 17 24 04 11 18 25 02 09 16 23 01 08
15 22 05 12 19 26 06 13 20 27 07 14 21 28
MEMATRHTGPRYETEFETEOAAT
2.4 / ROTOR MACHINES 55
which has a somewhat regular structure.But after the second transposition, we have
17 09 05 27 24 16 12 07 10 02 22 20 03 25
15 13 04 23 19 14 11 01 26 21 18 08 06 28
This is a much less structured permutation and is much more difficult to cryptanalyze.
2.4 ROTOR MACHINES
The example just given suggests that multiple stages of encryption can produce an
algorithm that is significantly more difficult to cryptanalyze.This is as true of substi-
tution ciphers as it is of transposition ciphers.Before the introduction of DES, the
most important application of the principle of multiple stages of encryption was a
class of systems known as rotor machines.
9
The basic principle of the rotor machine is illustrated in Figure 2.8.The
machine consists of a set of independently rotating cylinders through which electri-
cal pulses can flow.Each cylinder has 26 input pins and 26 output pins,with internal
wiring that connects each input pin to a unique output pin.For simplicity,only three
of the internal connections in each cylinder are shown.
If we associate each input and output pin with a letter of the alphabet,then a
single cylinder defines a monoalphabetic substitution. For example,in Figure 2.8,
if an operator depresses the key for the letter A,an electric signal is applied to the
first pin of the first cylinder and flows through the internal connection to the
twenty-fifth output pin.
Consider a machine with a single cylinder.After each input key is depressed,the
cylinder rotates one position,so that the internal connections are shifted accordingly.
Thus,a different monoalphabetic substitution cipher is defined. After 26 letters of
plaintext, the cylinder would be back to the initial position.Thus, we have a poly-
alphabetic substitution algorithm with a period of 26.
A single-cylinder system is trivial and does not present a formidable cryptana-
lytic task.The power of the rotor machine is in the use of multiple cylinders,in which
the output pins of one cylinder are connected to the input pins of the next.Figure 2.8
shows a three-cylinder system.The left half of the figure shows a position in which the
input from the operator to the first pin (plaintext letter a) is routed through the three
cylinders to appear at the output of the second pin (ciphertext letter B).
With multiple cylinders,the one closest to the operator input rotates one pin
position with each keystroke.The right half of Figure 2.8 shows the system’s config-
uration after a single keystroke.For every complete rotation of the inner cylinder,
the middle cylinder rotates one pin position. Finally,for every complete rotation
of the middle cylinder, the outer cylinder rotates one pin position.This is the
same type of operation seen with an odometer. The result is that there are
different substitution alphabets used before the system26 * 26 * 26 = 17,576
9
Machines based on the rotor principle were used by both Germany (Enigma) and Japan (Purple)
inWorld War II.The breaking of both codes by the Allies was a significant factor in the war’s outcome.
2.5 / STEGANOGRAPHY 57
repeats. The addition of fourth and fifth rotors results in periods of 456,976
and 11,881,376 letters,respectively. As David Kahn eloquently put it,referring to a
five-rotor machine [KAHN96,page 413]:
A period of that length thwarts any practical possibility of a
straightforward solution on the basis of letter frequency.This
general solution would need about 50 letters per cipher alphabet,
meaning that all five rotors would have to go through their com-
bined cycle 50 times.The ciphertext would have to be as long as
all the speeches made on the floor of the Senate and the House
of Representatives in three successive sessions of Congress.No
cryptanalyst is likely to bag that kind of trophy in his lifetime;
even diplomats,who can be as verbose as politicians, rarely scale
those heights of loquacity.
The significance of the rotor machine today is that it points the way to the most
widely used cipher ever: the Data Encryption Standard (DES).This we examine in
Chapter 3.
2.5 STEGANOGRAPHY
We conclude with a discussion of a technique that (strictly speaking),is not encryp-
tion,namely, steganography.
A plaintext message may be hidden in one of two ways.The methods of
steganography conceal the existence of the message,whereas the methods of cryp-
tography render the message unintelligible to outsiders by various transformations
of the text.
10
A simple form of steganography,but one that is time-consuming to construct,
is one in which an arrangement of words or letters within an apparently innocuous
text spells out the real message.For example, the sequence of first letters of each
word of the overall message spells out the hidden message.Figure 2.9 shows an
example in which a subset of the words of the overall message is used to convey the
hidden message.See if you can decipher this; it’s not too hard.
Various other techniques have been used historically;some examples are the
following [MYER91]:
Character marking: Selected letters of printed or typewritten text are over-
written in pencil.The marks are ordinarily not visible unless the paper is held
at an angle to bright light.
Invisible ink: A number of substances can be used for writing but leave no
visible trace until heat or some chemical is applied to the paper.
10
Steganography was an obsolete word that was revived by David Kahn and given the meaning it has
today [KAHN96].
58 CHAPTER 2 / CLASSICAL ENCRYPTION TECHNIQUES
Pin punctures: Small pin punctures on selected letters are ordinarily not visi-
ble unless the paper is held up in front of a light.
Typewriter correction ribbon:Used between lines typed with a black ribbon, the
results of typing with the correction tape are visible only under a strong light.
Although these techniques may seem archaic,they have contemporary equiv-
alents.[WAYN93] proposes hiding a message by using the least significant bits of
frames on a CD.For example,the Kodak Photo CD format’s maximum resolution is
20483072 pixels, with each pixel containing 24 bits of RGB color information.The
least significant bit of each 24-bit pixel can be changed without greatly affecting the
quality of the image.The result is that you can hide a 2.3-megabyte message in a
single digital snapshot.There are now a number of software packages available that
take this type of approach to steganography.
Steganography has a number of drawbacks when compared to encryption. It
requires a lot of overhead to hide a relatively few bits of information,although using a
scheme like that proposed in the preceding paragraph may make it more effective.Also,
once the system is discovered,it becomes virtually worthless. This problem,too, can be
overcome if the insertion method depends on some sort of key (e.g.,see Problem 2.20).
Alternatively,a message can be first encrypted and then hidden using steganography.
The advantage of steganography is that it can be employed by parties who
have something to lose should the fact of their secret communication (not necessar-
ily the content) be discovered.Encryption flags traffic as important or secret or may
identify the sender or receiver as someone with something to hide.
Figure 2.9 A Puzzle for Inspector Morse
(FromThe Silent World of Nicholas Quinn, by Colin Dexter)
2.6 / RECOMMENDED READING AND WEB SITES 59
2.6 RECOMMENDED READING AND WEB SITES
For anyone interested in the history of code making and code breaking,the book to read is
[KAHN96].Although it is concerned more with the impact of cryptology than its technical
development,it is an excellent introduction and makes for exciting reading. Another excel-
lent historical account is [SING99].
A short treatment covering the techniques of this chapter,and more, is [GARD72].
There are many books that cover classical cryptography in a more technical vein;one of the
best is [SINK66].[KORN96] is a delightful book to read and contains a lengthy section on
classical techniques.Two cryptography books that contain a fair amount of technical material
on classical techniques are [GARR01] and [NICH99].For the truly interested reader, the
two-volume [NICH96] covers numerous classical ciphers in detail and provides many cipher-
texts to be cryptanalyzed,together with the solutions.
An excellent treatment of rotor machines,including a discussion of their cryptanalysis
is found in [KUMA97].
[KATZ00] provides a thorough treatment of steganography.Another good source is
[WAYN96].
GARD72 Gardner,M. Codes, Ciphers,and Secret Writing. New York:Dover, 1972.
GARR01 Garrett, P.Making, Breaking Codes: An Introduction to Cryptology. Upper
Saddle River,NJ: Prentice Hall, 2001.
KAHN96 Kahn,D. The Codebreakers: The Story of Secret Writing. New York:Scribner,
1996.
KATZ00 Katzenbeisser,S., ed. Information Hiding Techniques for Steganography and
Digital Watermarking.Boston: Artech House,2000.
KORN96 Korner, T. The Pleasures of Counting. Cambridge, England: Cambridge
University Press,1996.
KUMA97 Kumar,I. Cryptology. Laguna Hills,CA: Aegean Park Press,1997.
NICH96 Nichols, R. Classical Cryptography Course. Laguna Hills,CA: Aegean Park
Press,1996.
NICH99 Nichols,R., ed. ICSA Guide to Cryptography. New York:McGraw-Hill, 1999.
SING99 Singh, S. The Code Book: The Science of Secrecy from Ancient Egypt to
Quantum Cryptography.New York:Anchor Books,1999.
SINK66 Sinkov,A. Elementary Cryptanalysis: A Mathematical Approach. Washington,
D.C.:The Mathematical Association of America,1966.
WAYN96 Wayner, P.Disappearing Cryptography. Boston: AP Professional Books,1996.
Recommended Web Sites:
American Cryptogram Association: An association of amateur cryptographers.
The Web site includes information and links to sites concerned with classical
cryptography.
60 CHAPTER 2 / CLASSICAL ENCRYPTION TECHNIQUES
Review Questions
2.1 What are the essential ingredients of a symmetric cipher?
2.2 What are the two basic functions used in encryption algorithms?
2.3 How many keys are required for two people to communicate via a cipher?
2.4 What is the difference between a block cipher and a stream cipher?
2.5 What are the two general approaches to attacking a cipher?
2.6 List and briefly define types of cryptanalytic attacks based on what is known to the
attacker.
2.7 What is the difference between an unconditionally secure cipher and a computation-
ally secure cipher?
2.8 Briefly define the Caesar cipher.
2.9 Briefly define the monoalphabetic cipher.
2.10 Briefly define the Playfair cipher.
2.11 What is the difference between a monoalphabetic cipher and a polyalphabetic
cipher?
2.12 What are two problems with the one-time pad?
2.13 What is a transposition cipher?
2.14 What is steganography?
Problems
2.1 A generalization of the Caesar cipher, known as the affine Caesar cipher, has the
following form:For each plaintext letter , substitute the ciphertext letter :
C = E([a, b], p) = (ap + b) mod 26
Cp
block cipher
brute-force attack
Caesar cipher
cipher
ciphertext
computationally secure
conventional encryption
cryptanalysis
cryptographic system
cryptography
cryptology
deciphering
decryption
digram
enciphering
encryption
Hill cipher
monoalphabetic cipher
one-time pad
plaintext
Playfair cipher
polyalphabetic cipher
rail fence cipher
single-key encryption
steganography
stream cipher
symmetric encryption
transposition cipher
unconditionally secure
Vigenère cipher
Crypto Corner:Simon Singh’s Web site.Lots of good information, plus interactive tools
for learning about cryptography.
Steganography: Good collection of links and documents.
2.7 KEY TERMS,REVIEW QUESTIONS, AND PROBLEMS
Key Terms
2.7 / KEY TERMS,REVIEW QUESTIONS,AND PROBLEMS 61
A basic requirement of any encryption algorithm is that it be one-to-one.That is, if
, then . Otherwise, decryption is impossible, because more
than one plaintext character maps into the same ciphertext character.The affine
Caesar cipher is not one-to-one for all values of . For example, for and ,
then .
a. Are there any limitations on the value of ? Explain why or why not.
b. Determine which values of are not allowed.
c. Provide a general statement of which values of are and are not allowed. Justify
your statement.
2.2 How many one-to-one affine Caesar ciphers are there?
2.3 A ciphertext has been generated with an affine cipher.The most frequent letter of the
ciphertext is ‘B’,and the second most frequent letter of the ciphertext is ‘U’. Break
this code.
2.4 The following ciphertext was generated using a simple substitution algorithm.
53‡‡†305))6*;4826)4‡.)4‡);806*;48†8¶60))85;;]8*;:‡*8†83
(88)5*†;46(;88*96*?;8)*‡(;485);5*†2:*‡(;4956*2(5*—4)8¶8*
;4069285);)6†8)4‡‡;1(‡9;48081;8:8‡1;48†85;4)485†528806*81
(‡9;48;(88;4(‡?34;48)4‡;161;:188;‡?;
Decrypt this message.
Hints:
1. As you know, the most frequently occurring letter in English is e.Therefore, the
first or second (or perhaps third?) most common character in the message is likely
to stand for e.Also, e is often seen in pairs (e.g., meet,fleet, speed, seen, been,
agree,etc.). Try to find a character in the ciphertext that decodes to e.
2. The most common word in English is “the.”Use this fact to guess the characters
that stand for t and h.
3. Decipher the rest of the message by deducing additional words.
Warning:The resulting message is in English but may not make much sense on a first
reading.
2.5 One way to solve the key distribution problem is to use a line from a book that both
the sender and the receiver possess.Typically,at least in spy novels,the first sentence
of a book serves as the key.The particular scheme discussed in this problem is from
one of the best suspense novels involving secret codes,Talking to Strange Men,by
Ruth Rendell.Work this problem without consulting that book!
Consider the following message:
SIDKHKDM AF HCRKIABIE SHIMC KD LFEAILA
This ciphertext was produced using the first sentence of The Other Side of Silence
(abook about the spy Kim Philby):
The snow lay thick on the steps and the snowflakes driven by the wind
looked black in the headlights of the cars.
A simple substitution cipher was used.
a. What is the encryption algorithm?
b. How secure is it?
c. To make the key distribution problem simple,both parties can agree to use the
first or last sentence of a book as the key.To change the key,they simply need to
agree on a new book.The use of the first sentence would be preferable to the use
of the last.Why?
2.6 In one of his cases, Sherlock Holmes was confronted with the following message.
534 C2 13 127 36 31 4 17 21 41
DOUGLAS 109 293 5 37 BIRLSTONE
26 BIRLSTONE 9 127 171
a
a
b
E([a,b], 13) = 3E([a, b], 0) =
b = 3a = 2a
E(k,p) Z E(k, q)p Z q
62 CHAPTER 2 / CLASSICAL ENCRYPTION TECHNIQUES
Although Watson was puzzled,Holmes was able immediately to deduce the type of
cipher.Can you?
2.7 This problem uses a real-world example,from an old U.S.Special Forces manual (public
domain).A copy is available at this book’s Web site.
a. Using the two keys (memory words) cryptographicand network security, encrypt
the following message:
Be at the third pillar from the left outside the lyceum theatre tonight at seven.
If you are distrustful bring two friends.
Make reasonable assumptions about how to treat redundant letters and excess
letters in the memory words and how to treat spaces and punctuation.Indicate
what your assumptions are.Note: The message is from the Sherlock Holmes
novel,The Sign of Four.
b. Decrypt the ciphertext. Show your work.
c. Comment on when it would be appropriate to use this technique and what its
advantages are.
2.8 A disadvantage of the general monoalphabetic cipher is that both sender and
receiver must commit the permuted cipher sequence to memory.A common tech-
nique for avoiding this is to use a keyword from which the cipher sequence can be
generated. For example,using the keyword CIPHER, write out the keyword
followed by unused letters in normal order and match this against the plaintext
letters:
plain: a b c d e f g h i j k l m n o p q r s t u v w x y z
cipher: C I P H E R A B D F G J K L M N O Q S T U V W X Y Z
If it is felt that this process does not produce sufficient mixing,write the remaining
letters on successive lines and then generate the sequence by reading down the
columns:
C I P H E R
A B D F G J
K L M N O Q
S T U V W X
Y Z
This yields the sequence:
C A K S Y I B L T Z P D M U H F N V E G O W R J Q X
Such a system is used in the example in Section 2.2 (the one that begins “it was dis-
closed yesterday”).Determine the keyword.
2.9 When the PT-109 American patrol boat,under the command of Lieutenant John F.
Kennedy,was sunk by a Japanese destroyer,a message was received at an Australian
wireless station in Playfair code:
KXJEY UREBE ZWEHE WRYTU HEYFS
KREHE GOYFI WTTTU OLKSY CAJPO
BOTEI ZONTX BYBNT GONEY CUZWR
GDSON SXBOU YWRHE BAAHY USEDQ
The key used was royal new zealand navy.Decrypt the message. Translate TT
intott.
2.10 a. Construct a Playfair matrix with the key largest.
b. Construct a Playfair matrix with the key occurrence. Make a reasonable assump-
tion about how to treat redundant letters in the key.
2.7 / KEY TERMS,REVIEW QUESTIONS,AND PROBLEMS 63
2.11 a. Using this Playfair matrix:
M F H I/J K
UNOP Q
ZVWXY
ELARG
DSTBC
Encrypt this message:
Must see you over Cadogan West.Coming at once.
Note:The message is from the Sherlock Holmes story, The Adventure of the Bruce-
Partington Plans.
b. Repeat part (a) using the Playfair matrix from Problem 2.10a.
c. How do you account for the results of this problem? Can you generalize your
conclusion?
2.12 a. How many possible keys does the Playfair cipher have? Ignore the fact that some
keys might produce identical encryption results. Express your answer as an
approximate power of 2.
b. Now take into account the fact that some Playfair keys produce the same encryp-
tion results.How many effectively unique keys does the Playfair cipher have?
2.13 What substitution system results when we use a 25 × 1 Playfair matrix?
2.14 a. Encrypt the message “meet me at the usual place at ten rather than eight oclock”
using the Hill cipher with the key . Show your calculations and the result.
b. Show the calculations for the corresponding decryption of the ciphertext to
recover the original plaintext.
2.15 We have shown that the Hill cipher succumbs to a known plaintext attack if sufficient
plaintext–ciphertext pairs are provided.It is even easier to solve the Hill cipher if a
chosen plaintext attack can be mounted.Describe such an attack.
2.16 It can be shown that the Hill cipher with the matrix requires that
is relatively prime to 26;that is, the only common positive integer factor of
and 26 is 1.Thus, if or is even,the matrix is not allowed. Determine
the number of different (good) keys there are for a 2 ×2 Hill cipher without counting
them one by one,using the following steps:
a. Find the number of matrices whose determinant is even because one or both rows
are even.(A row is “even” if both entries in the row are even.)
b. Find the number of matrices whose determinant is even because one or both
columns are even.(A column is “even” if both entries in the column are even.)
c. Find the number of matrices whose determinant is even because all of the entries
are odd.
d. Taking into account overlaps, find the total number of matrices whose determi-
nant is even.
e. Find the number of matrices whose determinant is a multiple of 13 because the
first column is a multiple of 13.
f. Find the number of matrices whose determinant is a multiple of 13 where the
first column is not a multiple of 13 but the second column is a multiple of the first
modulo 13.
g. Find the total number of matrices whose determinant is a multiple of 13.
h. Find the number of matrices whose determinant is a multiple of 26 because they
fit cases parts (a) and (e),(b) and (e), (c) and (e), (a) and (f), and so on.
i. Find the total number of matrices whose determinant is neither a multiple of 2 nor
a multiple of 13.
(ad - bc) = 13
(ad - bc)
(ad - bc)a
ab
cd
b
a
94
57
b
64 CHAPTER 2 / CLASSICAL ENCRYPTION TECHNIQUES
2.17 Using the Vigenère cipher,encrypt the word “explanation” using the key leg.
2.18 This problem explores the use of a one-time pad version of the Vigenère cipher.In
this scheme,the key is a stream of random numbers between 0 and 26. For example,if
the key is 3 19 5...,then the first letter of plaintext is encrypted with a shift of 3 letters,
the second with a shift of 19 letters,the third with a shift of 5 letters, and so on.
a. Encrypt the plaintext sendmoremoney with the key stream
9 0 1 7 23 15 21 14 11 11 2 8 9
b. Using the ciphertext produced in part (a), find a key so that the cipher text
decrypts to the plaintext cashnotneeded.
2.19 What is the message embedded in Figure 2.9?
2.20 In one of Dorothy Sayers’s mysteries, Lord Peter is confronted with the message
shown in Figure 2.10.He also discovers the key to the message,which is a sequence of
integers:
a. Decrypt the message.Hint:What is the largest integer value?
b. If the algorithm is known but not the key, how secure is the scheme?
c. If the key is known but not the algorithm,how secure is the scheme?
Programming Problems
2.21 Write a program that can encrypt and decrypt using the general Caesar cipher, also
known as an additive cipher.
2.22 Write a program that can encrypt and decrypt using the affine cipher described in
Problem 2.1.
787656543432112343456567878878765654
3432112343456567878878765654433211234
Figure 2.10 A Puzzle for Lord Peter
2.7 / KEY TERMS,REVIEW QUESTIONS,AND PROBLEMS 65
2.23 Write a program that can perform a letter frequency attack on an additive cipher
without human intervention.Your software should produce possible plaintexts in
rough order of likelihood.It would be good if your user interface allowed the user to
specify “give me the top 10 possible plaintexts.”
2.24 Write a program that can perform a letter frequency attack on any monoalphabetic
substitution cipher without human intervention.Your software should produce possi-
ble plaintexts in rough order of likelihood.It would be good if your user interface
allowed the user to specify “give me the top 10 possible plaintexts.”
2.25 Create software that can encrypt and decrypt using a 2 × 2 Hill cipher.
2.26 Create software that can perform a fast known plaintext attack on a Hill cipher, given
the dimension .How fast are your algorithms,as a function of ?mm
BLOCK CIPHERS AND THE DATA
ENCRYPTION STANDARD
3.1 Block Cipher Principles
Stream Ciphers and Block Ciphers
Motivation for the Feistel Cipher Structure
The Feistel Cipher
3.2 The Data Encryption Standard
DES Encryption
DES Decryption
3.3 A Des Example
Results
The Avalanche Effect
3.4 The Strength of Des
The Use of 56-Bit Keys
The Nature of the DES Algorithm
Timing Attacks
3.5 Differential and Linear Cryptanalysis
Differential Cryptanalysis
Linear Cryptanalysis
3.6 Block Cipher Design Principles
DES Design Criteria
Number of Rounds
Design of Function F
Key Schedule Algorithm
3.7 Recommended Reading and Web Site
3.8 Key Terms,Review Questions,and Problems
CHAPTER
66
3.1 / BLOCK CIPHER PRINCIPLES 67
All the afternoon Mungo had been working on Stern’s code,principally with the
aid of the latest messages which he had copied down at the Nevin Square drop.
Stern was very confident. He must be well aware London Central knew about
that drop. It was obvious that they didn’t care how often Mungo read their
messages,so confident were they in the impenetrability of the code.
Talking to Strange Men,Ruth Rendell
KEY POINTS
A block cipher is an encryption/decryption scheme in which a block of
plaintext is treated as a whole and used to produce a ciphertext block of
equal length.
Many block ciphers have a Feistel structure. Such a structure consists of a
number of identical rounds of processing.In each round, a substitution is
performed on one half of the data being processed,followed by a permu-
tation that interchanges the two halves.The original key is expanded so
that a different key is used for each round.
The Data Encryption Standard (DES) has been the most widely used
encryption algorithm until recently.It exhibits the classic Feistel structure.
DES uses a 64-bit block and a 56-bit key.
Two important methods of cryptanalysis are differential cryptanalysis and
linear cryptanalysis.DES has been shown to be highly resistant to these two
types of attack.
The objective of this chapter is to illustrate the principles of modern symmetric
ciphers.For this purpose, we focus on the most widely used symmetric cipher: the
Data Encryption Standard (DES).Although numerous symmetric ciphers have been
developed since the introduction of DES,and although it is destined to be replaced by
the Advanced Encryption Standard (AES),DES remains the most important such
algorithm. Furthermore,a detailed study of DES provides an understanding of the
principles used in other symmetric ciphers.
This chapter begins with a discussion of the general principles of symmetric
block ciphers,which are the type of symmetric ciphers studied in this book (with the
exception of the stream cipher RC4 in Chapter 7). Next, we cover full DES.
Following this look at a specific algorithm,we return to a more general discussion of
block cipher design.
Compared to public-key ciphers,such as RSA, the structure of DES and most
symmetric ciphers is very complex and cannot be explained as easily as RSA and simi-
lar algorithms.Accordingly, the reader may wish to begin with a simplified version of
DES,which is described in Appendix G. This version allows the reader to perform
encryption and decryption by hand and gain a good understanding of the working of
68 CHAPTER 3 / BLOCK CIPHERS AND THE DATA ENCRYPTION STANDARD
the algorithm details.Classroom experience indicates that a study of this simplified
version enhances understanding of DES.
1
3.1 BLOCK CIPHER PRINCIPLES
Many symmetric block encryption algorithms in current use are based on a struc-
ture referred to as a Feistel block cipher [FEIS73].For that reason, it is important to
examine the design principles of the Feistel cipher.We begin with a comparison of
stream ciphers and block ciphers.Then we discuss the motivation for the Feistel
block cipher structure.Finally,we discuss some of its implications.
Stream Ciphers and Block Ciphers
Astream cipher is one that encrypts a digital data stream one bit or one byte at a time.
Examples of classical stream ciphers are the autokeyed Vigenère cipher and the
Vernam cipher.In the ideal case, a one-time pad version of the Vernam cipher would
be used (Figure 2.7),in which the keystream is as long as the plaintext bit stream
.If the cryptographic keystream is random, then this cipher is unbreakable by any
means other than acquiring the keystream.However, the keystream must be provided
to both users in advance via some independent and secure channel.This introduces
insurmountable logistical problems if the intended data traffic is very large.
Accordingly,for practical reasons, the bit-stream generator must be imple-
mented as an algorithmic procedure,so that the cryptographic bit stream can be
produced by both users.In this approach (Figure 3.1a), the bit-stream generator is a
key-controlled algorithm and must produce a bit stream that is cryptographically
strong.Now,the two users need only share the generating key, and each can produce
the keystream.
A block cipher is one in which a block of plaintext is treated as a whole and
used to produce a ciphertext block of equal length.Typically, a block size of 64 or
128 bits is used.As with a stream cipher, the two users share a symmetric encryption
key (Figure 3.1b).Using some of the modes of operation explained in Chapter 6, a
block cipher can be used to achieve the same effect as a stream cipher.
Far more effort has gone into analyzing block ciphers.In general, they seem
applicable to a broader range of applications than stream ciphers.The vast majority
of network-based symmetric cryptographic applications make use of block ciphers.
Accordingly,the concern in this chapter,and in our discussions throughout the book
of symmetric encryption,will primarily focus on block ciphers.
Motivation for the Feistel Cipher Structure
A block cipher operates on a plaintext block of n bits to produce a ciphertext
block of n bits. There are possible different plaintext blocks and, for
the encryption to be reversible (i.e.,for decryption to be possible), each must
2
n
(p
i
)
(k
i
)
1
However,you may safely skip Appendix G, at least on a first reading. If you get lost or bogged down in
the details of DES,then you can go back and start with simplified DES.
3.1 / BLOCK CIPHER PRINCIPLES 69
produce a unique ciphertext block.Such a transformation is called reversible, or
nonsingular.The following examples illustrate nonsingular and singular transfor-
mations for .n = 2
Bit-stream
generation
algorithm
Bit-stream
generation
algorithm
(a) Stream cipher using algorithmic bit-stream generator
(b) Block cipher
Cryptographic
bit stream ( k
i
)
Cryptographic
bit stream ( k
i
)
Plaintext
(p
i
)
Plaintext
(p
i
)
Ciphertext
(c
i
)
Key
(K )
Encryption
algorithm
Plaintext
b bits
b bits
Key
(K)
Key
(K )
Ciphertext
Figure 3.1 Stream Cipher and Block Cipher
2
The reasoning is as follows:For the first plaintext, we can choose any of ciphertext blocks. For the
second plaintext,we choose from among remaining ciphertext blocks,and so on.2
n
- 1
2
n
Reversible Mapping Irreversible Mapping
Plaintext Ciphertext Plaintext Ciphertext
00 11 00 11
01 10 01 10
10 00 10 01
11 01 11 01
In the latter case,a ciphertext of 01 could have been produced by one of two plain-
text blocks.So if we limit ourselves to reversible mappings, the number of different
transformations is !.
2
Figure 3.2 illustrates the logic of a general substitution cipher for .
A 4-bit input produces one of 16 possible input states, which is mapped by the
substitution cipher into a unique one of 16 possible output states,each of which is
represented by 4 ciphertext bits.The encryption and decryption mappings can be
n = 4
2
n
70 CHAPTER 3 / BLOCK CIPHERS AND THE DATA ENCRYPTION STANDARD
defined by a tabulation,as shown in Table 3.1. This is the most general form of block
cipher and can be used to define any reversible mapping between plaintext and
ciphertext.Feistel refers to this as the ideal block cipher, because it allows for the max-
imum number of possible encryption mappings from the plaintext block [FEIS75].
4-bit input
4 to 16 decoder
16 to 4 encoder
4-bit output
0123456789101112131415
0123456789101112131415
Figure 3.2 General n-bit-n-bit Block Substitution (shown with n = 4)
Table 3.1 Encryption and Decryption Tables for Substitution
Cipher of Figure 3.2
Plaintext Ciphertext
0000
1110
0001 0100
0010 1101
0011 0001
0100 0010
0101 1111
0110 1011
0111 1000
1000 0011
1001 1010
1010 0110
1011 1100
1100 0101
1101 1001
1110 0000
1111 0111
Ciphertext Plaintext
0000 1110
0001 0011
0010 0100
0011 1000
0100 0001
0101 1100
0110 1010
0111 1111
1000 0111
1001 1101
1010 1001
1011 0110
1100 1011
1101 0010
1110 0000
1111 0101
3.1 / BLOCK CIPHER PRINCIPLES 71
But there is a practical problem with the ideal block cipher.If a small block
size,such as ,is used, then the system is equivalent to a classical substitution
cipher.Such systems, as we have seen,are vulnerable to a statistical analysis of the
plaintext.This weakness is not inherent in the use of a substitution cipher but
rather results from the use of a small block size.If is sufficiently large and an
arbitrary reversible substitution between plaintext and ciphertext is allowed,then
the statistical characteristics of the source plaintext are masked to such an extent
that this type of cryptanalysis is infeasible.
An arbitrary reversible substitution cipher (the ideal block cipher) for a large
block size is not practical,however, from an implementation and performance point
of view.For such a transformation, the mapping itself constitutes the key.Consider
again Table 3.1,which defines one particular reversible mapping from plaintext to
ciphertext for .The mapping can be defined by the entries in the second
column, which show the value of the ciphertext for each plaintext block.This, in
essence,is the key that determines the specific mapping from among all possible
mappings.In this case, using this straightforward method of defining the key, the
required key length is . In general,for an -bit ideal
block cipher,the length of the key defined in this fashion is bits.For a 64-bit
block, which is a desirable length to thwart statistical attacks,the required key
length is bits.
In considering these difficulties,Feistel points out that what is needed is an
approximation to the ideal block cipher system for large n,built up out of compo-
nents that are easily realizable [FEIS75]. But before turning to Feistel’s approach,
let us make one other observation.We could use the general block substitution
cipher but,to make its implementation tractable, confine ourselves to a subset of the
! possible reversible mappings.For example, suppose we define the mapping in
terms of a set of linear equations.In the case of ,we have
where the are the four binary digits of the plaintext block, the are the four
binary digits of the ciphertext block,the are the binary coefficients, and arith-
metic is mod 2.The key size is just , in this case 16 bits.The danger with this
kind of formulation is that it may be vulnerable to cryptanalysis by an attacker
that is aware of the structure of the algorithm. In this example,what we have is
essentially the Hill cipher discussed in Chapter 2, applied to binary data rather
than characters.As we saw in Chapter 2, a simple linear system such as this is
quite vulnerable.
The Feistel Cipher
Feistel proposed [FEIS73] that we can approximate the ideal block cipher by utilizing
the concept of a product cipher,which is the execution of two or more simple ciphers
in sequence in such a way that the final result or product is cryptographically stronger
than any of the component ciphers.The essence of the approach is to develop a block
n
2
k
ij
y
i
x
i
y
1
= k
11
x
1
+ k
12
x
2
+ k
13
x
3
+ k
14
x
4
y
2
= k
21
x
1
+ k
22
x
2
+ k
23
x
3
+ k
24
x
4
y
3
= k
31
x
1
+ k
32
x
2
+ k
33
x
3
+ k
34
x
4
y
4
= k
41
x
1
+ k
42
x
2
+ k
43
x
3
+ k
44
x
4
n = 4
2
n
64 * 2
64
= 2
70
L 10
21
n * 2
n
n(4 bits) * (16 rows) = 64 bits
n = 4
n
n = 4
72 CHAPTER 3 / BLOCK CIPHERS AND THE DATA ENCRYPTION STANDARD
cipher with a key length of k bits and a block length of bits, allowing a total of
possible transformations,rather than the ! transformations available with the ideal
block cipher.
In particular,Feistel proposed the use of a cipher that alternates substitutions
and permutations,where these terms are defined as follows:
Substitution: Each plaintext element or group of elements is uniquely
replaced by a corresponding ciphertext element or group of elements.
Permutation: A sequence of plaintext elements is replaced by a permutation
of that sequence.That is, no elements are added or deleted or replaced in the
sequence,rather the order in which the elements appear in the sequence is
changed.
In fact,Feistel’s is a practical application of a proposal by Claude Shannon to
develop a product cipher that alternates confusion and diffusion functions
[SHAN49].
3
We look next at these concepts of diffusion and confusion and then
present the Feistel cipher.But first, it is worth commenting on this remarkable fact:
The Feistel cipher structure,which dates back over a quarter century and which, in
turn,is based on Shannon’s proposal of 1945, is the structure used by many signifi-
cant symmetric block ciphers currently in use.
DIFFUSION AND CONFUSION The terms diffusionand confusion were introduced by
Claude Shannon to capture the two basic building blocks for any cryptographic
system [SHAN49]. Shannon’s concern was to thwart cryptanalysis based on
statistical analysis.The reasoning is as follows. Assume the attacker has some
knowledge of the statistical characteristics of the plaintext. For example,in a
human-readable message in some language, the frequency distribution of the
various letters may be known.Or there may be words or phrases likely to appear in
the message (probable words). If these statistics are in any way reflected in the
ciphertext, the cryptanalyst may be able to deduce the encryption key,part of the
key,or at least a set of keys likely to contain the exact key. In what Shannon refers
to as a strongly ideal cipher, all statistics of the ciphertext are independent of the
particular key used.The arbitrary substitution cipher that we discussed previously
(Figure 3.2) is such a cipher,but as we have seen, it is impractical.
4
Other than recourse to ideal systems,Shannon suggests two methods for frus-
trating statistical cryptanalysis: diffusion and confusion.In diffusion, the statistical
structure of the plaintext is dissipated into long-range statistics of the ciphertext.
This is achieved by having each plaintext digit affect the value of many ciphertext
digits; generally,this is equivalent to having each ciphertext digit be affected by
2
n
2
k
n
3
The paper is available at this book’s Web site.Shannon’s 1949 paper appeared originally as a classified
report in 1945.Shannon enjoys an amazing and unique position in the history of computer and informa-
tion science.He not only developed the seminal ideas of modern cryptography but is also responsible for
inventing the discipline of information theory.Based on his work in information theory,he developed a
formula for the capacity of a data communications channel,which is still used today. In addition, he
founded another discipline,the application of Boolean algebra to the study of digital circuits; this last he
managed to toss off as a master’s thesis.
4
Appendix F expands on Shannon’s concepts concerning measures of secrecy and the security of
cryptographic algorithms.
3.1 / BLOCK CIPHER PRINCIPLES 73
many plaintext digits. An example of diffusion is to encrypt a message
of characters with an averaging operation:
adding successive letters to get a ciphertext letter . One can show that the statis-
tical structure of the plaintext has been dissipated.Thus,the letter frequencies in the
ciphertext will be more nearly equal than in the plaintext; the digram frequencies
will also be more nearly equal,and so on. In a binary block cipher, diffusion can be
achieved by repeatedly performing some permutation on the data followed by
applying a function to that permutation; the effect is that bits from different
positions in the original plaintext contribute to a single bit of ciphertext.
5
Every block cipher involves a transformation of a block of plaintext into a
block of ciphertext,where the transformation depends on the key. The mechanism
of diffusion seeks to make the statistical relationship between the plaintext and
ciphertext as complex as possible in order to thwart attempts to deduce the key.On
the other hand, confusion seeks to make the relationship between the statistics of
the ciphertext and the value of the encryption key as complex as possible,again to
thwart attempts to discover the key.Thus,even if the attacker can get some handle
on the statistics of the ciphertext,the way in which the key was used to produce that
ciphertext is so complex as to make it difficult to deduce the key.This is achieved by
the use of a complex substitution algorithm.In contrast, a simple linear substitution
function would add little confusion.
As [ROBS95b] points out,so successful are diffusion and confusion in captur-
ing the essence of the desired attributes of a block cipher that they have become the
cornerstone of modern block cipher design.
FEISTEL CIPHER STRUCTURE The left-hand side of Figure 3.3 depicts the structure
proposed by Feistel.The inputs to the encryption algorithm are a plaintext block of
length bits and a key .The plaintext block is divided into two halves, and .
The two halves of the data pass through rounds of processing and then combine to
produce the ciphertext block.Each round has as inputs and derived from
the previous round, as well as a subkey derived from the overall . In general,
the subkeys are different from and from each other. In Figure 3.3, 16 rounds
are used,although any number of rounds could be implemented.
All rounds have the same structure.A substitution is performed on the left
half of the data.This is done by applying a round function F to the right half of the
data and then taking the exclusive-OR of the output of that function and the left
half of the data.The round function has the same general structure for each round
but is parameterized by the round subkey . Another way to express this is to say
that F is a function of right-half block of bits and a subkey of bits,which pro-
duces an output value of length w bits: . Following this substitution, aF(RE
i
,K
i+1
)
yw
K
i
KK
i
KK
i
R
i-1
L
i-1
i
n
R
0
L
0
K2w
y
n
k
y
n
= a
a
k
i=1
m
n+i
bmod 26
M = m
1
,m
2
,m
3
, ...
5
Some books on cryptography equate permutation with diffusion.This is incorrect.Permutation, by itself,
does not change the statistics of the plaintext at the level of individual letters or permuted blocks.For
example,in DES, the permutation swaps two 32-bit blocks, so statistics of strings of 32 bits or less are
preserved.
74 CHAPTER 3 / BLOCK CIPHERS AND THE DATA ENCRYPTION STANDARD
permutation is performed that consists of the interchange of the two halves of the
data.
6
This structure is a particular form of the substitution-permutation network
(SPN) proposed by Shannon.
Output (ciphertext)
K
1
LD
0
= RE
16
RD
0
= LE
16
LD
2
= RE
14
RD
2
= LE
14
LD
14
= RE
2
RD
14
= LE
2
LD
16
= RE
0
LD
17
= RE
0
RD
16
= LE
0
RD
17
= LE
0
RD
1
= LE
15
LD
1
= RE
15
RD
15
= LE
1
LD
15
= RE
1
Input (ciphertext)
Output (plaintext)
Round 1
K
1
K
2
K
15
K
16
K
2
K
15
K
16
F
LE
0
RE
0
Input (plaintext)
LE
1
RE
1
LE
2
RE
2
F
F
LE
14
RE
14
LE
15
RE
15
LE
16
RE
16
LE
17
RE
17
F
F
F
F
F
Round 2Round 15Round 16
Round 16Round 15Round 2Round 1
Figure 3.3 Feistel Encryption and Decryption (16 rounds)
6
The final round is followed by an interchange that undoes the interchange that is part of the final round.
One could simply leave both interchanges out of the diagram,at the sacrifice of some consistency of
presentation.In any case, the effective lack of a swap in the final round is done to simplify the implemen-
tation of the decryption process,as we shall see.
3.1 / BLOCK CIPHER PRINCIPLES 75
The exact realization of a Feistel network depends on the choice of the following
parameters and design features:
Block size: Larger block sizes mean greater security (all other things being
equal) but reduced encryption/decryption speed for a given algorithm.The
greater security is achieved by greater diffusion.Traditionally, a block size of
64 bits has been considered a reasonable tradeoff and was nearly universal in
block cipher design.However, the new AES uses a 128-bit block size.
Key size:Larger key size means greater security but may decrease encryption/
decryption speed.The greater security is achieved by greater resistance to
brute-force attacks and greater confusion.Key sizes of 64 bits or less are now
widely considered to be inadequate,and 128 bits has become a common size.
Number of rounds: The essence of the Feistel cipher is that a single round
offers inadequate security but that multiple rounds offer increasing security.
Atypical size is 16 rounds.
Subkey generation algorithm: Greater complexity in this algorithm should
lead to greater difficulty of cryptanalysis.
Round function F: Again,greater complexity generally means greater resistance
to cryptanalysis.
There are two other considerations in the design of a Feistel cipher:
Fast software encryption/decryption: In many cases,encryption is embedded
in applications or utility functions in such a way as to preclude a hardware
implementation. Accordingly, the speed of execution of the algorithm
becomes a concern.
Ease of analysis: Although we would like to make our algorithm as difficult as
possible to cryptanalyze,there is great benefit in making the algorithm easy to
analyze.That is, if the algorithm can be concisely and clearly explained, it is
easier to analyze that algorithm for cryptanalytic vulnerabilities and therefore
develop a higher level of assurance as to its strength.DES, for example, does
not have an easily analyzed functionality.
F
EISTEL DECRYPTION ALGORITHM The process of decryption with a Feistel cipher is
essentially the same as the encryption process.The rule is as follows: Use the
ciphertext as input to the algorithm,but use the subkeys in reverse order.That is,
use in the first round, in the second round,and so on,until is used in the
last round.This is a nice feature, because it means we need not implement two
different algorithms;one for encryption and one for decryption.
To see that the same algorithm with a reversed key order produces the correct
result, Figure 3.3shows the encryption process going down the left-hand side and
the decryption process going up the right-hand side for a 16-round algorithm. For
clarity,we use the notation and for data traveling through the encryption
algorithm and and for data traveling through the decryption algorithm.
The diagram indicates that,at every round, the intermediate value of the decryption
process is equal to the corresponding value of the encryption process with the two
halves of the value swapped.To put this another way, let the output of the ith
RD
i
LD
i
RE
i
LE
i
K
1
K
n-1
K
n
K
i
76 CHAPTER 3 / BLOCK CIPHERS AND THE DATA ENCRYPTION STANDARD
encryption round be ( concatenated with ).Then the corresponding
output of the (16 – i)th decryption round is or,equivalently, .
Let us walk through Figure 3.3 to demonstrate the validity of the preceding
assertions.After the last iteration of the encryption process, the two halves of the
output are swapped,so that the ciphertext is .The output of that round is
the ciphertext.Now take that ciphertext and use it as input to the same algorithm.
The input to the first round is ,which is equal to the 32-bit swap of the
output of the sixteenth round of the encryption process.
Now we would like to show that the output of the first round of the decryption
process is equal to a 32-bit swap of the input to the sixteenth round of the encryption
process.First, consider the encryption process.We see that
On the decryption side,
The XOR has the following properties:
Thus,we have and . Therefore, the output of the first
round of the decryption process is ,which is the 32-bit swap of the input
to the sixteenth round of the encryption.This correspondence holds all the way
through the 16 iterations,as is easily shown. We can cast this process in general
terms.For the ith iteration of the encryption algorithm,
Rearranging terms:
Thus,we have described the inputs to the ith iteration as a function of the outputs,and
these equations confirm the assignments shown in the right-hand side of Figure 3.3.
Finally,we see that the output of the last round of the decryption process is
.A 32-bit swap recovers the original plaintext, demonstrating the validity
of the Feistel decryption process.
Note that the derivation does not require that F be a reversible function.To
see this,take a limiting case in which F produces a constant output (e.g., all ones)
regardless of the values of its two arguments.The equations still hold.
RE
0
7
LE
0
LE
i-1
= RE
i
{
F(RE
i-1
,K
i
) = RE
i
{
F(LE
i
,K
i
)
RE
i-1
= LE
i
RE
i
= LE
i-1
{
F(RE
i-1
,K
i
)
LE
i
= RE
i-1
RE
15
7
LE
15
RD
1
= LE
15
LD
1
= RE
15
E
{
0 = E
D
{
D = 0
[A
{
B]
{
C = A
{
[B
{
C]
= [LE
15
{
F(RE
15
,K
16
)]
{
F(RE
15
, K
16
)
= RE
16
{
F(RE
15
,K
16
)
RD
1
= LD
0
{
F(RD
0
,K
16
)
LD
1
= RD
0
= LE
16
= RE
15
RE
16
= LE
15
{
F(RE
15
,K
16
)
LE
16
= RE
15
RE
16
7
LE
16
RE
16
7
LE
16
LD
16-i
7
RD
16-i
RE
i
7
LE
i
RE
i
LE
i
LE
i
7
RE
i
3.2 / THE DATA ENCRYPTION STANDARD 77
To help clarify the preceding concepts,let us look at a specific example (Figure 3.4)
and focus on the fifteenth round of encryption,corresponding to the second round of
decryption.Suppose that the blocks at each stage are 32 bits (two 16-bit halves) and that
the key size is 24 bits.Suppose that at the end of encryption round fourteen, the value of
the intermediate block (in hexadecimal) is DE7F03A6. Then and
.Also assume that the value of is 12DE52.After round 15, we have
= 03A6 and .
Now let’s look at the decryption.We assume that and ,
as shown in Figure 3.3, and we want to demonstrate that and
.So, we start with and .
Then, from Figure 3.3, and
[F(03A6,12DE52) DE7F]= DE7F = LE
14
.
3.2 THE DATA ENCRYPTION STANDARD
The most widely used encryption scheme is based on the Data Encryption Standard
(DES) adopted in 1977 by the National Bureau of Standards,now the National
Institute of Standards and Technology (NIST),as Federal Information Processing
Standard 46 (FIPS PUB 46). The algorithm itself is referred to as the Data
Encryption Algorithm (DEA).
7
For DES,data are encrypted in 64-bit blocks using
a 56-bit key.The algorithm transforms 64-bit input in a series of steps into a 64-bit
output.The same steps,with the same key, are used to reverse the encryption.
The DES enjoys widespread use.It has also been the subject of much controversy
concerning how secure the DES is.To appreciate the nature of the controversy,let us
quickly review the history of the DES.
{
RD
2
= F(03A6, 12DE52)
{
LD
2
= 03A6 = RE
14
RD
1
= 03A6LD
1
= F(03A6, 12DE52)
{
DE7FRD
2
= LE
14
LD
2
= RE
14
RD
1
= LE
15
LD
1
= RE
15
RE
15
= F(03A6, 12DE52)
{
DE7FLE
15
K
15
RE
14
= 03A6
LE
14
= DE7F
12DE52
12DE52
F
DE7F 03A6
Encryption round Decryption round
03A6
03A6 03A6F(03A6, 12DE52) DE7F
F(03A6, 12DE52) DE7F
F(03A6, 12DE52)
[F(03A6, 12DE52) DE7F]
= DE7F
F
Round 15
Round 2
Figure 3.4 Feistel Example
7
The terminology is a bit confusing.Until recently,the terms DES and DEAcould be used interchange-
ably.However, the most recent edition of the DES document includes a specification of the DEA
described here plus the triple DEA (TDEA) described in Chapter 6.Both DEA and TDEA are part of
the Data Encryption Standard.Further, until the recent adoption of the official term TDEA, the triple
DEA algorithm was typically referred to as triple DESand written as 3DES. For the sake of convenience,
we will use the term 3DES.
78 CHAPTER 3 / BLOCK CIPHERS AND THE DATA ENCRYPTION STANDARD
In the late 1960s,IBM set up a research project in computer cryptography led by
Horst Feistel.The project concluded in 1971 with the development of an algorithm
with the designation LUCIFER [FEIS73], which was sold to Lloyd’s of London for
use in a cash-dispensing system,also developed by IBM. LUCIFER is a Feistel block
cipher that operates on blocks of 64 bits,using a key size of 128 bits. Because of the
promising results produced by the LUCIFER project,IBM embarked on an effort to
develop a marketable commercial encryption product that ideally could be imple-
mented on a single chip.The effort was headed by Walter Tuchman and Carl Meyer,
and it involved not only IBM researchers but also outside consultants and technical
advice from the National Security Agency (NSA).The outcome of this effort was a
refined version of LUCIFER that was more resistant to cryptanalysis but that had a
reduced key size of 56 bits,in order to fit on a single chip.
In 1973,the National Bureau of Standards (NBS) issued a request for proposals
for a national cipher standard. IBM submitted the results of its Tuchman–Meyer
project.This was by far the best algorithm proposed and was adopted in 1977 as the
Data Encryption Standard.
Before its adoption as a standard,the proposed DES was subjected to intense
criticism, which has not subsided to this day.Two areas drew the critics’ fire.First,
the key length in IBM’s original LUCIFER algorithm was 128 bits,but that of the
proposed system was only 56 bits,an enormous reduction in key size of 72 bits.
Critics feared that this key length was too short to withstand brute-force attacks.
The second area of concern was that the design criteria for the internal structure of
DES,the S-boxes, were classified. Thus,users could not be sure that the internal
structure of DES was free of any hidden weak points that would enable NSA to
decipher messages without benefit of the key.Subsequent events, particularly the
recent work on differential cryptanalysis,seem to indicate that DES has a very
strong internal structure.Furthermore, according to IBM participants, the only
changes that were made to the proposal were changes to the S-boxes,suggested by
NSA,that removed vulnerabilities identified in the course of the evaluation process.
Whatever the merits of the case,DES has flourished and is widely used, especially
in financial applications.In 1994, NIST reaffirmed DES for federal use for another five
years;NIST recommended the use of DES for applications other than the protection of
classified information. In 1999,NIST issued a new version of its standard (FIPS PUB
46-3) that indicated that DES should be used only for legacy systems and that triple
DES (which in essence involves repeating the DES algorithm three times on the plain-
text using two or three different keys to produce the ciphertext) be used.We study triple
DES in Chapter 6. Because the underlying encryption and decryption algorithms are
the same for DES and triple DES,it remains important to understand the DES cipher.
DES Encryption
The overall scheme for DES encryption is illustrated in Figure 3.5.As with any
encryption scheme,there are two inputs to the encryption function: the plaintext to
be encrypted and the key.In this case,the plaintext must be 64 bits in length and the
key is 56 bits in length.
8
8
Actually,the function expects a 64-bit key as input. However, only 56 of these bits are ever used; the
other 8 bits can be used as parity bits or simply set arbitrarily.
3.2 / THE DATA ENCRYPTION STANDARD 79
Looking at the left-hand side of the figure,we can see that the processing of
the plaintext proceeds in three phases.First, the 64-bit plaintext passes through an
initial permutation (IP) that rearranges the bits to produce the permuted input.
This is followed by a phase consisting of sixteen rounds of the same function,
which involves both permutation and substitution functions.The output of the last
(sixteenth) round consists of 64 bits that are a function of the input plaintext and
the key.The left and right halves of the output are swapped to produce the
preoutput.Finally,the preoutput is passed through a permutation that is the
inverse of the initial permutation function,to produce the 64-bit ciphertext. With
the exception of the initial and final permutations,DES has the exact structure of
a Feistel cipher,as shown in Figure 3.3.
The right-hand portion of Figure 3.5shows the way in which the 56-bit key
is used.Initially, the key is passed through a permutation function.Then, for each
of the sixteen rounds,a subkey ( ) is produced by the combination of a leftK
i
[IP
-1
]
Initial permutation
Permuted choice 2Round 1
32-bit swap
Inverse initial
permutation
Permuted choice 1
Round 2
Round 16
64-bit plaintext
64-bit key
K
1
K
2
K
16
64-bit ciphertext
Left circular shift
Permuted choice 2
Left circular shift
Permuted choice 2
Left circular shift
64 56
56
56
56
48
48
48
56
64
64 bits
••••••••• •••••••••
•••••••••
Figure 3.5 General Depiction of DES Encryption Algorithm
80 CHAPTER 3 / BLOCK CIPHERS AND THE DATA ENCRYPTION STANDARD
circular shift and a permutation.The permutation function is the same for each
round, but a different subkey is produced because of the repeated shifts of the
key bits.
I
NITIAL PERMUTATION The initial permutation and its inverse are defined by tables,
as shown in Tables 3.2aand 3.2b, respectively. The tables are to be interpreted as
follows.The input to a table consists of 64 bits numbered from 1 to 64.The 64 entries
in the permutation table contain a permutation of the numbers from 1 to 64.Each
Table 3.2 Permutation Tables for DES
(a) Initial Permutation (IP)
58 50 42 34 26 18 10 2
60 52 44 36 28 20 12 4
62 54 46 38 30 22 14 6
64 56 48 40 32 24 16 8
57 49 41 33 25 17 9 1
59 51 43 35 27 19 11 3
61 53 45 37 29 21 13 5
63 55 47 39 31 23 15 7
(b) Inverse Initial Permutation (IP
–1
)
40 8 48 16 56 24 64 32
39 7 47 15 55 23 63 31
38 6 46 14 54 22 62 30
37 5 45 13 53 21 61 29
36 4 44 12 52 20 60 28
35 3 43 11 51 19 59 27
34 2 42 10 50 18 58 26
33 1 41 9 49 17 57 25
(c) Expansion Permutation (E)
32 1 2 3 45
4 5 6 7 89
8 9 10 11 12 13
12 13 14 15 16 17
16 17 18 19 20 21
20 21 22 23 24 25
24 25 26 27 28 29
28 29 30 31 32 1
(d) Permutation Function (P)
16 7 20 21 29 12 28 17
1 15 23 26 5 18 31 10
2 8 24 14 32 27 3 9
19 13 30 6 22 11 4 25
3.2 / THE DATA ENCRYPTION STANDARD 81
entry in the permutation table indicates the position of a numbered input bit in the
output,which also consists of 64 bits.
To see that these two permutation functions are indeed the inverse of each
other,consider the following 64-bit input :
where is a binary digit.Then the permutation is as follows:
If we then take the inverse permutation ,it can be
seen that the original ordering of the bits is restored.
D
ETAILSOF SINGLE ROUND Figure 3.6 shows the internal structure of a single round.
Again, begin by focusing on the left-hand side of the diagram.The left and right
halves of each 64-bit intermediate value are treated as separate
32-bit quantities,labeled L (left) and R (right). As in any classic Feistel cipher, the
overall processing at each round can be summarized in the following formulas:
The round key is 48 bits.The input is 32 bits. This input is first
expanded to 48 bits by using a table that defines a permutation plus an expansion
that involves duplication of 16 of the bits (Table 3.2c).The resulting 48 bits are
XORed with . This 48-bit result passes through a substitution function that
produces a 32-bit output,which is permuted as defined by Table 3.2d.
The role of the S-boxes in the function F is illustrated in Figure 3.7.The substi-
tution consists of a set of eight S-boxes,each of which accepts 6 bits as input and
produces 4 bits as output.These transformations are defined in Table 3.3, which is
K
i
R
RRK
i
R
i
= L
i-1
{
F(R
i-1
,K
i
)
L
i
= R
i-1
Y = IP
-1
(X) = IP
-1
(IP(M))
M
58
M
50
M
42
M
34
M
26
M
18
M
10
M
2
M
60
M
52
M
44
M
36
M
28
M
20
M
12
M
4
M
62
M
54
M
46
M
38
M
30
M
22
M
14
M
6
M
64
M
56
M
48
M
40
M
32
M
24
M
16
M
8
M
57
M
49
M
41
M
33
M
25
M
17
M
9
M
1
M
59
M
51
M
43
M
35
M
27
M
19
M
11
M
3
M
61
M
53
M
45
M
37
M
29
M
21
M
13
M
5
M
63
M
55
M
47
M
39
M
31
M
23
M
15
M
7
X = (IP(M)M
i
M
1
M
2
M
3
M
4
M
5
M
6
M
7
M
8
M
9
M
10
M
11
M
12
M
13
M
14
M
15
M
16
M
17
M
18
M
19
M
20
M
21
M
22
M
23
M
24
M
25
M
26
M
27
M
28
M
29
M
30
M
31
M
32
M
33
M
34
M
35
M
36
M
37
M
38
M
39
M
40
M
41
M
42
M
43
M
44
M
45
M
46
M
47
M
48
M
49
M
50
M
51
M
52
M
53
M
54
M
55
M
56
M
57
M
58
M
59
M
60
M
61
M
62
M
63
M
64
M
82 CHAPTER 3 / BLOCK CIPHERS AND THE DATA ENCRYPTION STANDARD
interpreted as follows:The first and last bits of the input to box form a 2-bit binary
number to select one of four substitutions defined by the four rows in the table for .
The middle four bits select one of the sixteen columns.The decimal value in the cell
selected by the row and column is then converted to its 4-bit representation to pro-
duce the output.For example, in S
1
,for input 011001, the row is 01 (row 1) and the
column is 1100 (column 12).The value in row 1, column 12 is 9,so the output is 1001.
Each row of an S-box defines a general reversible substitution.Figure 3.2 may
be useful in understanding the mapping.The figure shows the substitution for row 0
of box .
The operation of the S-boxes is worth further comment.Ignore for the moment
the contribution of the key ( ). If you examine the expansion table, you see that the
32bits of input are split into groups of 4 bits and then become groups of 6 bits by taking
the outer bits from the two adjacent groups.For example,if part of the input word is
... efgh ijkl mnop .. .
this becomes
... defghi hijklm lmnopq .. .
K
i
S
1
S
i
S
i
28 bits
Expansion/permutation
(E table)
C
i1
R
i1
L
i1
D
i1
32 bits
28 bits
Left shift(s)
Permutation/contraction
(Permuted choice 2)
XOR
48
48
Substitution/choice
(S-box)
Permutation
(P)
32
XOR
Left shift(s)
L
i
R
i
C
i
D
i
48
32
K
i
F
32 bits
Figure 3.6 Single Round of DES Algorithm
3.2 / THE DATA ENCRYPTION STANDARD 83
The outer two bits of each group select one of four possible substitutions (one row
of an S-box).Then a 4-bit output value is substituted for the particular 4-bit input
(the middle four input bits). The 32-bit output from the eight S-boxes is then
permuted, so that on the next round,the output from each S-box immediately
affects as many others as possible.
KEY GENERATION Returning to Figures 3.5 and 3.6, we see that a 64-bit key is used
as input to the algorithm.The bits of the key are numbered from 1 through 64; every
eighth bit is ignored,as indicated by the lack of shading in Table 3.4a.The key is first
subjected to a permutation governed by a table labeled Permuted Choice One
(Table 3.4b).The resulting 56-bit key is then treated as two 28-bit quantities,labeled
and . At each round, and are separately subjected to a circular left
shift or (rotation) of 1 or 2 bits,as governed by Table 3.4d.These shifted values serve
as input to the next round.They also serve as input to the part labeled Permuted
Choice Two (Table 3.4c),which produces a 48-bit output that serves as input to the
function .
DES Decryption
As with any Feistel cipher,decryption uses the same algorithm as encryption, except
that the application of the subkeys is reversed.
F(R
i-1
,K
i
)
D
i-1
C
i-1
D
0
C
0
S
1
S
2
S
3
S
4
S
5
S
6
S
7
S
8
R (32 bits)
48 bits
E
K (48 bits)
P
32 bits
Figure 3.7 Calculation of F(R,K)
84 CHAPTER 3 / BLOCK CIPHERS AND THE DATA ENCRYPTION STANDARD
Table 3.3 Definition of DES S-Boxes
14 4 13 1 2 15 11 8 3 10 6 12 5 9 0 7
S
1
0 15 7 4 14 2 13 1 10 6 12 11 9 5 3 8
4 1 14 8 13 6 2 11 15 12 9 7 3 10 5 0
15 12 8 2 4 9 1 7 5 11 3 14 10 0 6 13
15 1 8 14 6 11 3 4 9 7 2 13 12 0 5 10
S
2
3 13 4 7 15 2 8 14 12 0 1 10 6 9 11 5
0 14 7 11 10 4 13 1 5 8 12 6 9 3 2 15
13 8 10 1 3 15 4 2 11 6 7 12 0 5 14 9
10 0 9 14 6 3 15 5 1 13 12 7 11 4 2 8
S
3
13 7 0 9 3 4 6 10 2 8 5 14 12 11 15 1
13 6 4 9 8 15 3 0 11 1 2 12 5 10 14 7
1 10 13 0 6 9 8 7 4 15 14 3 11 5 2 12
7 13 14 3 0 6 9 10 1 2 8 5 11 12 4 15
S
4
13 8 11 5 6 15 0 3 4 7 2 12 1 10 14 9
10 6 9 0 12 11 7 13 15 1 3 14 5 2 8 4
3 15 0 6 10 1 13 8 9 4 5 11 12 7 2 14
2 12 4 1 7 10 11 6 8 5 3 15 13 0 14 9
S
5
14 11 2 12 4 7 13 1 5 0 15 10 3 9 8 6
4 2 1 11 10 13 7 8 15 9 12 5 6 3 0 14
11 8 12 7 1 14 2 13 6 15 0 9 10 4 5 3
12 1 10 15 9 2 6 8 0 13 3 4 14 7 5 11
S
6
10 15 4 2 7 12 9 5 6 1 13 14 0 11 3 8
9 14 15 5 2 8 12 3 7 0 4 10 1 13 11 6
4 3 2 12 9 5 15 10 11 14 1 7 6 0 8 13
4 11 2 14 15 0 8 13 3 12 9 7 5 10 6 1
S
7
13 0 11 7 4 9 1 10 14 3 5 12 2 15 8 6
1 4 11 13 12 3 7 14 10 15 6 8 0 5 9 2
6 11 13 8 1 4 10 7 9 5 0 15 14 2 3 12
13 2 8 4 6 15 11 1 10 9 3 14 5 0 12 7
S
8
1 15 13 8 10 3 7 4 12 5 6 11 0 14 9 2
7 11 4 1 9 12 14 2 0 6 10 13 15 3 5 8
2 1 14 7 4 10 8 13 15 12 9 0 3 5 6 11
3.3 / A DES EXAMPLE 85
3.3 A DES EXAMPLE
We now work through an example and consider some of its implications.Although
you are not expected to duplicate the example by hand,you will find it informative
to study the hex patterns that occur from one step to the next.
For this example,the plaintext is a hexadecimal palindrome.The plaintext, key,
and resulting ciphertext are as follows:
Plaintext:
02468aceeca86420
Key:
0f1571c947d9e859
Ciphertext:
da02ce3a89ecac3b
Table 3.4 DES Key Schedule Calculation
(a) Input Key
12345678
910111213141516
17 18 19 20 21 22 23 24
25 26 27 28 29 30 31 32
33 34 35 36 37 38 39 40
41 42 43 44 45 46 47 48
49 50 51 52 53 54 55 56
57 58 59 60 61 62 63 64
(b) Permuted Choice One (PC-1)
57 49 41 33 25 17 9
158 50 42 34 2618
10 2 59 51 43 35 27
19 11 3 60 52 44 36
63 55 47 39 31 23 15
762 54 46 38 3022
14 6 61 53 45 37 29
21 13 5 28 20 12 4
(c) Permuted Choice Two (PC-2)
14 17 11 24 1 5 3 28
1562110231912 4
26 8 16 7 27 20 13 2
41 52 31 37 47 55 30 40
51 45 33 48 44 49 39 56
34 53 46 42 50 36 29 32
(d) Schedule of Left Shifts
Round Number 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16
Bits Rotated 1 1 2 2 2 2 2 2 1 2 2 2 2 2 2 1
86 CHAPTER 3 / BLOCK CIPHERS AND THE DATA ENCRYPTION STANDARD
Results
Table 3.5shows the progression of the algorithm. The first row shows the 32-bit
values of the left and right halves of data after the initial permutation.The next
16 rows show the results after each round.Also shown is the value of the 48-bit
subkey generated for each round.Note that .The final row shows the left-
and right-hand values after the inverse initial permutation.These two values com-
bined form the ciphertext.
The Avalanche Effect
A desirable property of any encryption algorithm is that a small change in either the
plaintext or the key should produce a significant change in the ciphertext.In partic-
ular, a change in one bit of the plaintext or one bit of the key should produce a
change in many bits of the ciphertext.This is referred to as the avalanche effect. If
the change were small,this might provide a way to reduce the size of the plaintext or
key space to be searched.
Using the example from Table 3.5,Table 3.6 shows the result when the fourth
bit of the plaintext is changed, so that the plaintext is 12468aceeca86420.The
second column of the table shows the intermediate 64-bit values at the end of each
round for the two plaintexts.The third column shows the number of bits that differ
between the two intermediate values.The table shows that, after just three rounds,
18 bits differ between the two blocks.On completion, the two ciphertexts differ in
32 bit positions.
L
i
= R
i-1
Table 3.5 DES Example
Round K
i
L
i
R
i
IP
5a005a00 3cf03c0f
1
1e030f03080d2930 3cf03c0f bad22845
2
0a31293432242318 bad22845 99e9b723
3
23072318201d0c1d 99e9b723 0bae3b9e
4
05261d3824311a20 0bae3b9e 42415649
5
3325340136002c25 42415649 18b3fa41
6
123a2d0d04262a1c 18b3fa41 9616fe23
7
021f120b1c130611 9616fe23 67117cf2
8
1c10372a2832002b 67117cf2 c11bfc09
9
04292a380c341f03 c11bfc09 887fbc6c
10
2703212607280403 887fbc6c 600f7e8b
11
2826390c31261504 600f7e8b f596506e
12
12071c241a0a0f08 f596506e 738538b8
13
300935393c0d100b 738538b8 c6a62c4e
14
311e09231321182a c6a62c4e 56b0bd75
15
283d3e0227072528 56b0bd75 75e8fd8f
16
2921080b13143025 75e8fd8f 25896490
IP
–1
da02ce3a 89ecac3b
Note:DES subkeys are shown as eight 6-bit values in hex format
3.3 / A DES EXAMPLE 87
Table 3.6 Avalanche Effect in DES:Change in Plaintext
Round
02468aceeca86420
12468aceeca86420
1
1
3cf03c0fbad22845
3cf03c0fbad32845
1
2
bad2284599e9b723
bad3284539a9b7a3
5
3
99e9b7230bae3b9e
39a9b7a3171cb8b3
18
4
0bae3b9e42415649
171cb8b3ccaca55e
34
5
4241564918b3fa41
ccaca55ed16c3653
37
6
18b3fa419616fe23
d16c3653cf402c68
33
7
9616fe2367117cf2
cf402c682b2cefbc
32
8
67117cf2c11bfc09
2b2cefbc99f91153
33
Table 3.7shows a similar test using the original plaintext of with two keys that
differ in only the fourth bit position: the original key,0f1571c947d9e859, and the
altered key,1f1571c947d9e859.Again, the results show that about half of the bits in
the ciphertext differ and that the avalanche effect is pronounced after just a few rounds.
Round
9
c11bfc09887fbc6c
99f911532eed7d94
32
10
887fbc6c600f7e8b
2eed7d94d0f23094
34
11
600f7e8bf596506e
d0f23094455da9c4
37
12
f596506e738538b8
455da9c47f6e3cf3
31
13
738538b8c6a62c4e
7f6e3cf34bc1a8d9
29
14
c6a62c4e56b0bd75
4bc1a8d91e07d409
33
15
56b0bd7575e8fd8f
1e07d4091ce2e6dc
31
16
75e8fd8f25896490
1ce2e6dc365e5f59
32
IP
–1
da02ce3a89ecac3b
057cde97d7683f2a
32
Table 3.7 Avalanche Effect in DES:Change in Key
Round
02468aceeca86420
02468aceeca86420
0
1
3cf03c0fbad22845
3cf03c0f9ad628c5
3
2
bad2284599e9b723
9ad628c59939136b
11
3
99e9b7230bae3b9e
9939136b768067b7
25
4
0bae3b9e42415649
768067b75a8807c5
29
5
4241564918b3fa41
5a8807c5488dbe94
26
6
18b3fa419616fe23
488dbe94aba7fe53
26
7
9616fe2367117cf2
aba7fe53177d21e4
27
8
67117cf2c11bfc09
177d21e4548f1de4
32
Round
9
c11bfc09887fbc6c
548f1de471f64dfd
34
10
887fbc6c600f7e8b
71f64dfd4279876c
36
11
600f7e8bf596506e
4279876c399fdc0d
32
12
f596506e738538b8
399fdc0d6d208dbb
28
13
738538b8c6a62c4e
6d208dbbb9bdeeaa
33
14
c6a62c4e56b0bd75
b9bdeeaad2c3a56f
30
15
56b0bd7575e8fd8f
d2c3a56f2765c1fb
33
16
75e8fd8f25896490
2765c1fb01263dc4
30
IP
–1
da02ce3a89ecac3b
ee92b50606b62b0b
30
88 CHAPTER 3 / BLOCK CIPHERS AND THE DATA ENCRYPTION STANDARD
3.4 THE STRENGTH OF DES
Since its adoption as a federal standard, there have been lingering concerns about
the level of security provided by DES.These concerns, by and large,fall into two
areas:key size and the nature of the algorithm.
The Use of 56-Bit Keys
With a key length of 56 bits,there are possible keys, which is approximately
keys.Thus, on the face of it, a brute-force attack appears impractical.
Assuming that,on average, half the key space has to be searched, a single machine
performing one DES encryption per microsecond would take more than a thousand
years (see Table 2.2) to break the cipher.
However,the assumption of one encryption per microsecond is overly conser-
vative.As far back as 1977, Diffie and Hellman postulated that the technology
existed to build a parallel machine with 1 million encryption devices,each of which
could perform one encryption per microsecond [DIFF77].This would bring the
average search time down to about 10 hours.The authors estimated that the cost
would be about $20 million in 1977 dollars.
DES finally and definitively proved insecure in July 1998,when the Electronic
Frontier Foundation (EFF) announced that it had broken a DES encryption using a
special-purpose “DES cracker” machine that was built for less than $250,000.
Theattack took less than three days. The EFF has published a detailed description of
the machine,enabling others to build their own cracker [EFF98]. And,of course, hard-
ware prices will continue to drop as speeds increase,making DES virtually worthless.
It is important to note that there is more to a key-search attack than simply
running through all possible keys.Unless known plaintext is provided, the analyst
must be able to recognize plaintext as plaintext. If the message is just plain text in
English, then the result pops out easily,although the task of recognizing English
would have to be automated. If the text message has been compressed before
encryption,then recognition is more difficult. And if the message is some more gen-
eral type of data,such as a numerical file, and this has been compressed, the prob-
lem becomes even more difficult to automate.Thus, to supplement the brute-force
approach, some degree of knowledge about the expected plaintext is needed,and
some means of automatically distinguishing plaintext from garble is also needed.
The EFF approach addresses this issue as well and introduces some automated
techniques that would be effective in many contexts.
Fortunately,there are a number of alternatives to DES,the most important of
which are AES and triple DES,discussed in Chapters 5 and 6, respectively.
The Nature of the DES Algorithm
Another concern is the possibility that cryptanalysis is possible by exploiting the
characteristics of the DES algorithm.The focus of concern has been on the eight
substitution tables,or S-boxes, that are used in each iteration. Because the design
criteria for these boxes,and indeed for the entire algorithm, were not made public,
there is a suspicion that the boxes were constructed in such a way that cryptanalysis
7.2 * 10
16
2
56
3.5 / DIFFERENTIAL AND LINEAR CRYPTANALYSIS 89
9
At least,no one has publicly acknowledged such a discovery.
is possible for an opponent who knows the weaknesses in the S-boxes.This assertion
is tantalizing,and over the years a number of regularities and unexpected behaviors
of the S-boxes have been discovered. Despite this,no one has so far succeeded in
discovering the supposed fatal weaknesses in the S-boxes.
9
Timing Attacks
We discuss timing attacks in more detail in Part Two, as they relate to public-key
algorithms. However, the issue may also be relevant for symmetric ciphers.
Inessence, a timing attack is one in which information about the key or the plaintext
is obtained by observing how long it takes a given implementation to perform
decryptions on various ciphertexts.A timing attack exploits the fact that an encryp-
tion or decryption algorithm often takes slightly different amounts of time on
different inputs.[HEVI99] reports on an approach that yields the Hamming weight
(number of bits equal to one) of the secret key.This is a long way from knowing the
actual key,but it is an intriguing first step. The authors conclude that DES appears
to be fairly resistant to a successful timing attack but suggest some avenues to
explore.Although this is an interesting line of attack, it so far appears unlikely that
this technique will ever be successful against DES or more powerful symmetric
ciphers such as triple DES and AES.
3.5 DIFFERENTIAL AND LINEAR CRYPTANALYSIS
For most of its life, the prime concern with DES has been its vulnerability
tobrute-force attack because of its relatively short (56 bits) key length. However,
there has also been interest in finding cryptanalytic attacks on DES. With
the increasing popularity of block ciphers with longer key lengths,including
triple DES, brute-force attacks have become increasingly impractical.Thus,
there has been increased emphasis on cryptanalytic attacks on DES and other
symmetric block ciphers.In this section, we provide a brief overview of the two
most powerful and promising approaches: differential cryptanalysis and linear
cryptanalysis.
Differential Cryptanalysis
One of the most significant advances in cryptanalysis in recent years is differential
cryptanalysis.In this section, we discuss the technique and its applicability to DES.
HISTORY Differential cryptanalysis was not reported in the open literature until
1990.The first published effort appears to have been the cryptanalysis of a block
cipher called FEAL by Murphy [MURP90].This was followed by a number of
papers by Biham and Shamir,who demonstrated this form of attack on a variety of
encryption algorithms and hash functions; their results are summarized in
[BIHA93].
90 CHAPTER 3 / BLOCK CIPHERS AND THE DATA ENCRYPTION STANDARD
The most publicized results for this approach have been those that have
application to DES.Differential cryptanalysis is the first published attack that is
capable of breaking DES in less than encryptions.The scheme, as reported in
[BIHA93], can successfully cryptanalyze DES with an effort on the order of
encryptions,requiring chosen plaintexts. Although is certainly significantly
less than , the need for the adversary to find chosen plaintexts makes this
attack of only theoretical interest.
Although differential cryptanalysis is a powerful tool,it does not do very well
against DES.The reason, according to a member of the IBM team that designed
DES [COPP94],is that differential cryptanalysis was known to the team as early as
1974.The need to strengthen DES against attacks using differential cryptanalysis
played a large part in the design of the S-boxes and the permutation P.As evidence
of the impact of these changes, consider these comparable results reported in
[BIHA93]. Differential cryptanalysis of an eight-round LUCIFER algorithm
requires only 256 chosen plaintexts,whereas an attack on an eight-round version of
DES requires chosen plaintexts.
D
IFFERENTIAL CRYPTANALYSISATTACK The differential cryptanalysis attack is complex;
[BIHA93] provides a complete description. The rationale behind differential
cryptanalysis is to observe the behavior of pairs of text blocks evolving along each
round of the cipher,instead of observing the evolution of a single text block. Here,we
provide a brief overview so that you can get the flavor of the attack.
We begin with a change in notation for DES.Consider the original plaintext
block to consist of two halves . Each round of DES maps the right-hand
input into the left-hand output and sets the right-hand output to be a function of the
left-hand input and the subkey for this round.So, at each round,only one new 32-bit
block is created.If we label each new block , then the intermediate
message halves are related as follows:
In differential cryptanalysis,we start with two messages, and , with a
known XOR difference , and consider the difference between the
intermediate message halves: .Then we have
Now,suppose that many pairs of inputs to f with the same difference yield the
same output difference if the same subkey is used.To put this more precisely,let us
say that may cause Y with probability ,if for a fraction of the pairs in which the
input XOR is ,the output XOR equals . We want to suppose that there are a
number of values of that have high probability of causing a particular output
difference.Therefore, if we know and with high probability,then we
know with high probability.Furthermore,if a number of such differences are
determined,it is feasible to determine the subkey used in the function f.
The overall strategy of differential cryptanalysis is based on these considerations
for a single round.The procedure is to begin with two plaintext messages and m¿m
¢m
i+1
¢m
i
¢m
i-1
X
YX
ppX
m
i-1
{
[f(m
i
,K
i
)
{
f(m¿
i
,K
i
)]
= [m
i-1
{
f(m
i
,K
i
)]
{
[m¿
i-1
{
f(m¿
i
,K
i
)]
¢m
i+1
= m
i+1
{
m¿
i+1
¢m
i
= m
i
{
m¿
i
¢m = m
{
m¿
m¿m
m
i+1
= m
i-1
{
f(m
i
,K
i
), i = 1, 2, Á , 16
m
i
(2 i 17)
m
0
,m
1
m
2
14
2
47
2
55
2
47
2
47
2
47
2
55
3.5 / DIFFERENTIAL AND LINEAR CRYPTANALYSIS 91
f
f(m
i
) 40 08 00 00
f(m
i1
) 00 00 00 00
f(m
i2
) 40 08 00 00
m
i3
||m
i2
40 08 00 00 04 00 00 00
m
i1
00 00 00 00
m
i2
04 00 00 00
m
i
04 00 00 00
f
p 0.25
p 1.0
f
p 0.25
m
i1
|| m
i
40 08 00 00 04 00 00 00
Figure 3.8 Differential Propagation through Three Rounds of DES
(numbers in hexadecimal)
with a given difference and trace through a probable pattern of differences after each
round to yield a probable difference for the ciphertext.Actually,there are two proba-
ble patterns of differences for the two 32-bit halves: .Next, we submit
and for encryption to determine the actual difference under the unknown key and
compare the result to the probable difference.If there is a match,
then we suspect that all the probable patterns at all the intermediate rounds are
correct.With that assumption, we can make some deductions about the key bits.
This procedure must be repeated many times to determine all the key bits.
Figure 3.8,based on a figure in [BIHA93], illustrates the propagation of differ-
ences through three rounds of DES.The probabilities shown on the right refer to
the probability that a given set of intermediate differences will appear as a function
E(K,m)
{
E(K,m ¿) = (¢m
17
7
¢m
16
)
m¿
m(¢m
17
7
¢m
16
)
92 CHAPTER 3 / BLOCK CIPHERS AND THE DATA ENCRYPTION STANDARD
of the input differences.Overall, after three rounds, the probability that the output
difference is as shown is equal to .
Linear Cryptanalysis
A more recent development is linear cryptanalysis,described in [MATS93].This attack
is based on finding linear approximations to describe the transformations performed in
DES.This method can find a DES key given known plaintexts,as compared to
chosen plaintexts for differential cryptanalysis.Although this is a minor improvement,
because it may be easier to acquire known plaintext rather than chosen plaintext,it still
leaves linear cryptanalysis infeasible as an attack on DES.So far, little work has been
done by other groups to validate the linear cryptanalytic approach.
We now give a brief summary of the principle on which linear cryptanalysis is
based.For a cipher with -bit plaintext and ciphertext blocks and an -bit key,let
the plaintext block be labeled , the cipher text block ,
and the key .Then define
The objective of linear cryptanalysis is to find an effective linear equation of
the form:
(where ; and where the terms represent
fixed,unique bit locations) that holds with probability .The further is from
0.5,the more effective the equation. Once a proposed relation is determined, the pro-
cedure is to compute the results of the left-hand side of the preceding equation for a
large number of plaintext–ciphertext pairs.If the result is 0 more than half the time,
assume .If it is 1 most of the time, assume .
This gives us a linear equation on the key bits.Try to get more such relations so that we
can solve for the key bits.Because we are dealing with linear equations, the problem
can be approached one round of the cipher at a time,with the results combined.
3.6 BLOCK CIPHER DESIGN PRINCIPLES
Although much progress has been made in designing block ciphers that are crypto-
graphically strong, the basic principles have not changed all that much since the
work of Feistel and the DES design team in the early 1970s.It is useful to begin this
discussion by looking at the published design criteria used in the DES effort.Then
we look at three critical aspects of block cipher design:the number of rounds, design
of the function F,and key scheduling.
DES Design Criteria
The criteria used in the design of DES,as reported in [COPP94], focused on the
design of the S-boxes and on the P function that takes the output of the S-boxes
(Figure 3.7).The criteria for the S-boxes are as follows.
K[g
1
,g
2
, Á , g
c
] = 1K[g
1
,g
2
, Á , g
c
] = 0
pp Z 0.5
a, b, and gx = 0 or 1; 1 a; b n; c m
P[a
1
,a
2
, Á , a
a
]
{
C[b
1
, b
2
, Á , b
b
] = K[g
1
,g
2
, Á , g
c
]
A[i, j,Á , k] = A[i]
{
A[j]
{
Á
{
A[k]
K[1], Á , K[m]
C[1], Á C[n]P[1], Á P[n]
mn
2
47
2
43
0.25 * 1 * 0.25 = 0.0625
3.6 / BLOCK CIPHER DESIGN PRINCIPLES 93
1. No output bit of any S-box should be too close a linear function of the
input bits.Specifically, if we select any output bit and any subset of the
six input bits, the fraction of inputs for which this output bit equals
the XOR of these input bits should not be close to 0 or 1,but rather should
be near 1/2.
2. Each row of an S-box (determined by a fixed value of the leftmost and right-
most input bits) should include all 16 possible output bit combinations.
3. If two inputs to an S-box differ in exactly one bit, the outputs must differ in at
least two bits.
4. If two inputs to an S-box differ in the two middle bits exactly,the outputs must
differ in at least two bits.
5. If two inputs to an S-box differ in their first two bits and are identical in their last
two bits,the two outputs must not be the same.
6. For any nonzero 6-bit difference between inputs, no more than eight of the
32 pairs of inputs exhibiting that difference may result in the same output
difference.
7. This is a criterion similar to the previous one, but for the case of three
S-boxes.
Coppersmith pointed out that the first criterion in the preceding list was
needed because the S-boxes are the only nonlinear part of DES.If the S-boxes
were linear (i.e.,each output bit is a linear combination of the input bits), the
entire algorithm would be linear and easily broken.We have seen this phenome-
non with the Hill cipher, which is linear.The remaining criteria were primarily
aimed at thwarting differential cryptanalysis and at providing good confusion
properties.
The criteria for the permutation P are as follows.
1. The four output bits from each S-box at round are distributed so that two of
them affect (provide input for) “middle bits”of round and the other
two affect end bits.The two middle bits of input to an S-box are not shared
with adjacent S-boxes.The end bits are the two left-hand bits and the two
right-hand bits,which are shared with adjacent S-boxes.
2. The four output bits from each S-box affect six different S-boxes on the next
round,and no two affect the same S-box.
3. For two S-boxes , ,if an output bit from affects a middle bit of on the
next round, then an output bit from cannot affect a middle bit of .This
implies that,for , an output bit from must not affect a middle bit of .
These criteria are intended to increase the diffusion of the algorithm.
Number of Rounds
The cryptographic strength of a Feistel cipher derives from three aspects of the
design:the number of rounds, the function F,and the key schedule algorithm. Let us
look first at the choice of the number of rounds.
S
j
S
j
j = k
S
j
S
k
S
k
S
j
kj
(i + 1)
i
94 CHAPTER 3 / BLOCK CIPHERS AND THE DATA ENCRYPTION STANDARD
The greater the number of rounds,the more difficult it is to perform crypt-
analysis,even for a relatively weak F. In general, the criterion should be that the
number of rounds is chosen so that known cryptanalytic efforts require greater
effort than a simple brute-force key search attack.This criterion was certainly used
in the design of DES.Schneier [SCHN96] observes that for 16-round DES, a differ-
ential cryptanalysis attack is slightly less efficient than brute force:The differential
cryptanalysis attack requires operations,
10
whereas brute force requires . If
DES had 15 or fewer rounds,differential cryptanalysis would require less effort
than a brute-force key search.
This criterion is attractive,because it makes it easy to judge the strength of an
algorithm and to compare different algorithms.In the absence of a cryptanalytic
breakthrough, the strength of any algorithm that satisfies the criterion can be
judged solely on key length.
Design of Function F
The heart of a Feistel block cipher is the function F.As we have seen,in DES, this
function relies on the use of S-boxes.This is also the case for many other symmetric
block ciphers.However, we can make some general comments about the criteria for
designing F.After that,we look specifically at S-box design.
DESIGN CRITERIA FOR F The function F provides the element of confusion in a
Feistel cipher.Thus,it must be difficult to “unscramble” the substitution performed
by F.One obvious criterion is that F be nonlinear, as we discussed previously.The
more nonlinear F,the more difficult any type of cryptanalysis will be. There are
several measures of nonlinearity,which are beyond the scope of this book. In rough
terms,the more difficult it is to approximate F by a set of linear equations, the more
nonlinear F is.
Several other criteria should be considered in designing F.We would like
the algorithm to have good avalanche properties. Recall that, in general,
thismeans that a change in one bit of the input should produce a change in many
bits of the output. A more stringent version of this is the strict avalanche
criterion (SAC)[WEBS86], which states that any output bit of an S-box should
change with probability 1/2 when any single input bit is inverted for all , .
Although SAC is expressed in terms of S-boxes,a similar criterion could be
applied to F as a whole.This is important when considering designs that do not
include S-boxes.
Another criterion proposed in [WEBS86] is the bit independence criterion
(BIC),which states that output bits and should change independently when any
single input bit is inverted for all , , and .The SAC and BIC criteria appear to
strengthen the effectiveness of the confusion function.
S-BOX DESIGN One of the most intense areas of research in the field of symmetric
block ciphers is that of S-box design. The papers are almost too numerous to
kjii
kj
jii
j
2
55
2
55.1
10
Recall that differential cryptanalysis of DES requires chosen plaintext.If all you have to work with
is known plaintext,then you must sort through a large quantity of known plaintext–ciphertext pairs look-
ing for the useful ones.This brings the level of effort up to .2
55.1
2
47
3.6 / BLOCK CIPHER DESIGN PRINCIPLES 95
count.
11
Here we mention some general principles.In essence, we would like any
change to the input vector to an S-box to result in random-looking changes to the
output.The relationship should be nonlinear and difficult to approximate with
linear functions.
One obvious characteristic of the S-box is its size.An S-box has input
bits and output bits.DES has 6 4 S-boxes. The encryption algorithm Blowfish,
has 8 32 S-boxes.Larger S-boxes, by and large, are more resistant to differential
and linear cryptanalysis [SCHN96].On the other hand, the larger the dimension ,
the (exponentially) larger the lookup table.Thus, for practical reasons,a limit of
equal to about 8 to 10 is usually imposed.Another practical consideration is that the
larger the S-box,the more difficult it is to design it properly.
S-boxes are typically organized in a different manner than used in DES.An
S-box typically consists of rows of bits each. The bits of input select
one of the rows of the S-box,and the bits in that row are the output. For example,
in an S-box,if the input is 00001001, the output consists of the 32 bits in
row 9 (the first row is labeled row 0).
Mister and Adams [MIST96] propose a number of criteria for S-box design.
Among these are that the S-box should satisfy both SAC and BIC.They also suggest
that all linear combinations of S-box columns should be bent.Bent functions are a
special class of Boolean functions that are highly nonlinear according to certain
mathematical criteria [ADAM90].There has been increasing interest in designing
and analyzing S-boxes using bent functions.
A related criterion for S-boxes is proposed and analyzed in [HEYS95].The
authors define the guaranteed avalanche (GA)criterion as follows: An S-box satisfies
GA of order if,for a 1-bit input change, at least output bits change. The authors
conclude that a GA in the range of order 2 to order 5 provides strong diffusion
characteristics for the overall encryption algorithm.
For larger S-boxes,such as , the question arises as to the best method
of selecting the S-box entries in order to meet the type of criteria we have been
discussing. Nyberg,who has written a lot about the theory and practice of S-box
design,suggests the following approaches (quoted in [ROBS95b]):
Random: Use some pseudorandom number generation or some table of random
digits to generate the entries in the S-boxes.This may lead to boxes with undesir-
able characteristics for small sizes (e.g., ) but should be acceptable for large
S-boxes (e.g., ).
Random with testing: Choose S-box entries randomly,then test the results
against various criteria,and throw away those that do not pass.
Human-made: This is a more or less manual approach with only simple mathe-
matics to support it.It is apparently the technique used in the DES design. This
approach is difficult to carry through for large S-boxes.
Math-made: Generate S-boxes according to mathematical principles. By using
mathematical construction,S-boxes can be constructed that offer proven security
against linear and differential cryptanalysis,together with good diffusion.
8 * 32
6 * 4
8 * 32
gg
8 * 32
m
nm2
n
n * m
n
n
m
nn * m
11
A good summary of S-box design studies through early 1996 can be found in [SCHN96].
96 CHAPTER 3 / BLOCK CIPHERS AND THE DATA ENCRYPTION STANDARD
A variation on the first technique is to use S-boxes that are both random and key
dependent.An example of this approach is Blowfish, which starts with S-boxes filled
with pseudorandom digits and then alters the contents using the key.A tremendous
advantage of key-dependent S-boxes is that,because they are not fixed, it is impossible
to analyze the S-boxes ahead of time to look for weaknesses.
Key Schedule Algorithm
A final area of block cipher design, and one that has received less attention than
S-box design,is the key schedule algorithm. With any Feistel block cipher,the key is
used to generate one subkey for each round. In general,we would like to select
subkeys to maximize the difficulty of deducing individual subkeys and the difficulty
of working back to the main key.No general principles for this have yet been
promulgated.
Hall suggests [ADAM94] that,at minimum, the key schedule should guarantee
key/ciphertext Strict Avalanche Criterion and Bit Independence Criterion.
3.7 RECOMMENDED READING AND WEB SITE
There is a wealth of information on symmetric encryption.Some of the more worthwhile
references are listed here.An essential reference work is [SCHN96]. This remarkable work
contains descriptions of virtually every cryptographic algorithm and protocol published up
to the time of the writing of the book.The author pulls together results from journals, con-
ference proceedings,government publications, and standards documents and organizes
these into a comprehensive and comprehensible survey.Another worthwhile and detailed
survey is [MENE97].A rigorous mathematical treatment is [STIN06].
The foregoing references provide coverage of public-key as well as symmetric encryption.
Perhaps the most detailed description of DES is [SIMO95];the book also contains an
extensive discussion of differential and linear cryptanalysis of DES.[BARK91] provides a
readable and interesting analysis of the structure of DES and of potential cryptanalytic
approaches to DES.[EFF98] details the most effective brute-force attack on DES.[COPP94]
looks at the inherent strength of DES and its ability to stand up to cryptanalysis.The reader
may also find the following document useful:“The DES Algorithm Illustrated” by J.Orlin
Grabbe,which is available at this book’s Web site.An excellent description of linear and
differential cryptanalysis is in [HEYS02].
BARK91 Barker, W.Introduction to the Analysis of the Data Encryption Standard
(DES).Laguna Hills, CA:Aegean Park Press, 1991.
COPP94 Coppersmith, D.“The Data Encryption Standard (DES) and Its Strength
Against Attacks.”IBM Journal of Research and Development,May 1994.
EFF98 Electronic Frontier Foundation.Cracking DES: Secrets of Encryption Research,
Wiretap Politics,and Chip Design. Sebastopol,CA: O’Reilly, 1998.
HEYS02 Heys,H. “A Tutorial on Linear and Differential Cryptanalysis.”Cryptologia,
July 2002.
MENE97 Menezes, A.; van Oorschot,P.; and Vanstone, S.Handbook of Applied
Cryptography.Boca Raton, FL: CRC Press,1997.
3.8 / KEY TERMS,REVIEW QUESTIONS,AND PROBLEMS 97
SCHN96 Schneier,B. Applied Cryptography. New York:Wiley,1996.
SIMO95 Simovits,M. The DES: An Extensive Documentation and Evaluation. Laguna
Hills,CA:Aegean Park Press, 1995.
STIN06 Stinson, D.Cryptography: Theory and Practice. Boca Raton, FL: Chapman &
Hall,2006.
Recommended Web Site:
Block Cipher Hospital:Contains links to papers and theses on block cipher cryptanalysis.
3.8 KEY TERMS,REVIEW QUESTIONS, AND PROBLEMS
Key Terms
avalanche effect
block cipher
confusion
Data Encryption Standard
(DES)
differential cryptanalysis
diffusion
Feistel cipher
irreversible mapping
key
linear cryptanalysis
permutation
product cipher
reversible mapping
round
round function
subkey
substitution
Review Questions
3.1 Why is it important to study the Feistel cipher?
3.2 What is the difference between a block cipher and a stream cipher?
3.3 Why is it not practical to use an arbitrary reversible substitution cipher of the kind
shown in Table 3.1?
3.4 What is a product cipher?
3.5 What is the difference between diffusion and confusion?
3.6 Which parameters and design choices determine the actual algorithm of a Feistel cipher?
3.7 What is the purpose of the S-boxes in DES?
3.8 Explain the avalanche effect.
3.9 What is the difference between differential and linear cryptanalysis?
Problems
3.1 a. In Section 3.1,under the subsection on the motivation for the Feistel cipher
structure, it was stated that, for a block of bits,the number of different
reversible mappings for the ideal block cipher is .Justify.
b. In that same discussion, it was stated that for the ideal block cipher, which allows all
possible reversible mappings,the size of the key is bits.But, if there are 2
n
!n * 2
n
2
n
!
n
98 CHAPTER 3 / BLOCK CIPHERS AND THE DATA ENCRYPTION STANDARD
possible mappings,it should take bits to discriminate among the different
mappings,and so the key length should be . However, .
Explain the discrepancy.
3.2 Consider a Feistel cipher composed of sixteen rounds with a block length of 128 bits
and a key length of 128 bits.Suppose that, for a given , the key scheduling algorithm
determines values for the first eight round keys, , and then sets
Suppose you have a ciphertext . Explain how,with access to an encryption oracle,
you can decrypt and determine using just a single oracle query. This shows that
such a cipher is vulnerable to a chosen plaintext attack.(An encryption oracle can be
thought of as a device that,when given a plaintext, returns the corresponding cipher-
text.The internal details of the device are not known to you and you cannot break
open the device.You can only gain information from the oracle by making queries to
it and observing its responses.)
3.3 Consider a block encryption algorithm that encrypts blocks of length , and let .
Say we have plaintext–ciphertext pairs ,where we assume that the
key selects one of the possible mappings.Imagine that we wish to find by exhaus-
tive search.We could generate key and test whether for . If 1 i tC
i
= E(K¿ , P
i
)K¿
KN!K
P
i
, C
i
= E(K, P
i
)t
N = 2
n
n
mc
c
k
9
= k
8
,k
10
= k
7
,k
11
= k
6
,Á , k
16
= k
1
k
1
,k
2
,Á k
8
k
log
2
2
n
! 6 n * 2
n
log
2
2
n
!
log
2
2
n
!
encrypts each to its proper ,then we have evidence that . However, it may K = K ¿C
i
P
i
K¿
b. What is the probability that and agree on another plaintext–
ciphertext pairs where
3.4 Let be a permutation of the integers ,such that gives the
permuted value of . Put another way, maps the set of -bit integers
into itself and no two integers map into the same integer.DES is such a permutation
for 64-bit integers.We say that has a fixed point at if .That is,if is an
encryption mapping,then a fixed point corresponds to a message that encrypts to
itself.We are interested in the probability that has no fixed points.Show the some-
what unexpected result that over 60% of mappings will have at least one fixed point.
3.5 Consider the substitution defined by row 1 of S-box in Table 3.3.Show a block dia-
gram similar to Figure 3.2that corresponds to this substitution.
3.6 Compute the bits number 1, 16, 33,and 48 at the output of the first round of the DES
decryption,assuming that the ciphertext block is composed of all ones and the exter-
nal key is composed of all ones.
3.7 Suppose the DES F function mapped every 32-bit input R, regardless of the value of
the input K,to
a. 32-bit string of ones
b. bitwise complement of R
Hint:Use the following properties of the XOR operation:
1. What function would DES then compute?
2. What would the decryption look like?
where
are -bit strings of bits
0is an -bit string of zeros
1is an -bit string of onen
n
nA,B,C
A
{
1 = bitwise complement of A
A
{
0 = A
A
{
A = 0
(A
{
B)
{
C = A
{
(B
{
C)
S
1
p
pp(m) = mmp
npm, 0 m 6 2
n
p(m)0, 1, 2, Á , (2
n
- 1)p
0 t¿…N - t?
t¿E(K¿,
#
)E(K,
#
)
be the case that the mappings and exactly agree on the plaintext–
cipher text pairs , and agree on no other pairs.
a. What is the probability that and are in fact distinct mappings?E(K¿,
#
)E(K,
#
)
C
i
P
i
tE(K¿ ,
#
)E(K,
#
)
3.8 / KEY TERMS,REVIEW QUESTIONS,AND PROBLEMS 99
3.8 This problem provides a numerical example of encryption using a one-round version
of DES.We start with the same bit pattern for the key and the plaintext, namely:
0123456789ABCDEF
a. Derive ,the first-round subkey.
b. Derive , .
c. Expand to get ,where is the expansion function of Table 3.2.
d. Calculate .
e. Group the 48-bit result of (d) into sets of 6 bits and evaluate the corresponding
S-box substitutions.
f. Concatenate the results of (e) to get a 32-bit result, .
g. Apply the permutation to get .
h. Calculate .
i. Write down the ciphertext.
3.9 Show that DES decryption is, in fact,the inverse of DES encryption.
3.10 The 32-bit swap after the sixteenth iteration of the DES algorithm is needed to make
the encryption process invertible by simply running the ciphertext back through the
algorithm with the key order reversed. This was demonstrated in Problem 3.7.
However,it still may not be entirely clear why the 32-bit swap is needed. To demon-
strate why,solve the following exercises.First, some notation:
a. Show that the composition is equivalent to the
transformation that interchanges the 32-bit halves, and .That is,show that
b. Now suppose that we did away with the final 32-bit swap in the encryption algo-
rithm.Then we would want the following equality to hold:
Does it?
3.11 Compare the initial permutation table (Table 3.2a) with the permuted choice one
table (Table 3.4b).Are the structures similar? If so, describe the similarities.What
conclusions can you draw from this analysis?
3.12 When using the DES algorithm for decryption, the 16 keys are
used in reverse order.Therefore, the right-hand side of Figure 3.5 is not valid for
decryption. Design a key-generation scheme with the appropriate shift schedule
(analogous to Table 3.4d) for the decryption process.
3.13 a. Let be the bitwise complement of .Prove that if the complement of the plain-
text block is taken and the complement of an encryption key is taken,then the
result of DES encryption with these values is the complement of the original
ciphertext.That is,
XX¿
(K
1
, K
2
, Á , K
16
)
TD
1
(IP(IP
-1
(T
16
(L
15
7
R
15
)))) = L
15
7
R
15
TD
1
(IP(IP
-1
(T
17
(T
16
(L
15
7
R
15
))))) = R
15
7
L
15
R
15
L
15
TD
1
(IP(IP
-1
(T
17
(T
16
(L
15
7
R
15
)))))
of the encryption algorithm
T
17
(R
7
L) = L
7
R,where this transformation occurs after the sixteenth iteration
algorithm for 1 I 16
TD
i
(R
7
L) = the transformation defined by the ith iteration of the encryption
algorithm for 1 I 16
T
i
(R
7
L) = the transformation defined by the ith iteration of the encryption
A
7
B = the concatenation of the bit strings A and B
R
1
= P(B)
{
L
0
P(B)
B
A = E[R
0
]
{
K
1
E[
#
]E[R
0
]R
0
R
0
L
0
K
1
Binary notation: 0000 0001 0010 0011 0100 0101 0110 0111
1000 1001 1010 1011 1100 1101 1110 1111
Hexadecimal notation:
K
Hint:Begin by showing that for any two bit strings of equal length, and ,
.
b. It has been said that a brute-force attack on DES requires searching a key space
of keys. Does the result of part (a) change that?
3.14 Show that in DES the first 24 bits of each subkey come from the same subset of 28
bits of the initial key and that the second 24 bits of each subkey come from a disjoint
subset of 28 bits of the initial key.
3.15 For any block cipher,the fact that it is a nonlinear function is crucial to its security. To
see this,suppose that we have a linear block cipher EL that encrypts 128-bit blocks of
plaintext into 128-bit blocks of ciphertext.Let denote the encryption of a
128-bit message under a key (the actual bit length of is irrelevant).Thus,
Describe how,with 128 chosen ciphertexts, an adversary can decrypt any ciphertext
without knowledge of the secret key .(A “chosen ciphertext” means that an adversary
has the ability to choose a ciphertext and then obtain its decryption.Here, you have 128
plaintext/ciphertext pairs to work with and you have the ability to chose the value of the
ciphertexts.)
Note:The following problems refer to simplified DES, described in Appendix G.
3.16 Refer to Figure G.2,which depicts key generation for S-DES.
a. How important is the initial P10 permutation function?
b. How important are the two LS-1 shift functions?
3.17 The equations for the variables and for S-DES are defined in the section on
S-DES analysis.Provide the equations for and .
3.18 Using S-DES,decrypt the string (
10100010) using the key (0111111101) by hand.
Show intermediate results after each function . Then decode the
first 4 bits of the plaintext string to a letter and the second 4 bits to another letter where
we encode A through P in base 2 (i.e.,A = 0000, B = 0001,..., P = 1111).Hint: As a mid-
way check,after the application of SW,the string should be (
00010011).
Programming Problems
3.19 Create software that can encrypt and decrypt using a general substitution block
cipher.
3.20 Create software that can encrypt and decrypt using S-DES. Test data:use plaintext,
ciphertext,and key of Problem 3.18.
(IP,
F
K
,SW,F
K
,IP
-1
)
ts
rq
k
EL(k,[m
1
{
m
2
]) = EL(k,m
1
)
{
EL(k,m
2
)for all 128-bit patterns m
1
,m
2
kkm
EL (k,m)
2
56
(A
{
B)¿=A¿
{
B
BA
If Y = E(K, X)
Then Y ¿=E(K¿,X¿ )
100 CHAPTER 3 / BLOCK CIPHERS AND THE DATA ENCRYPTION STANDARD
BASIC CONCEPTS IN NUMBER THEORY
AND
FINITE FIELDS
4.1 Divisibility and The Division Algorithm
4.2 The Euclidean Algorithm
4.3 Modular Arithmetic
4.4 Groups,Rings, and Fields
4.5 Finite Fields of the Form
4.6 Polynomial Arithmetic
4.7 Finite Fields of the Form
4.8 Recommended Reading and Web Site
4.9 Key Terms,Review Questions,and Problems
Appendix 4A The Meaning of Mod
GF(2
n
)
GF(p)
101
CHAPTER
102 CHAPTER 4 / BASIC CONCEPTS IN NUMBER THEORY AND FINITE FIELDS
The next morning at daybreak, Star flew indoors,seemingly keen for a lesson.
Isaid, “Tap eight.”She did a brilliant exhibition, first tapping it in 4,4, then giving
me a hasty glance and doing it in 2,2, 2, 2, before coming for her nut.
It is astonishing that Star learned to count up to 8 with no difficulty,and of
her own accord discovered that each number could be given with various differ-
ent divisions,this leaving no doubt that she was consciously thinking each num-
ber.In fact, she did mental arithmetic, although unable,like humans, to name the
numbers.But she learned to recognize their spoken names almost immediately
and was able to remember the sounds of the names.Star is unique as a wild bird,
who of her own free will pursued the science of numbers with keen interest and
astonishing intelligence.
Living with Birds,Len Howard
KEY POINTS
Modular arithmetic is a kind of integer arithmetic that reduces all numbers
to one of a fixed set for some number . Any integer outside
this range is reduced to one in this range by taking the remainder after divi-
sion by .
The greatest common divisor of two integers is the largest positive integer
that exactly divides both integers.
A field is a set of elements on which two arithmetic operations (addition and
multiplication) have been defined and which has the properties of ordinary
arithmetic, such as closure,associativity, commutativity, distributivity,and
having both additive and multiplicative inverses.
Finite fields are important in several areas of cryptography.A finite field is
simply a field with a finite number of elements.It can be shown that the
order of a finite field (number of elements in the field) must be a power of
a prime , where is a positive integer.
Finite fields of order can be defined using arithmetic mod .
Finite fields of order , for , can be defined using arithmetic over
polynomials.
n > 1p
n
pp
np
n
n
n[0, . . . , n - 1]
Finite fields have become increasingly important in cryptography.A number of crypto-
graphic algorithms rely heavily on properties of finite fields,notably the Advanced
Encryption Standard (AES) and elliptic curve cryptography.Other examples include
the message authentication code CMAC and the authenticated encryption scheme
GMC.
This chapter provides the reader with sufficient background on the concepts
of finite fields to be able to understand the design of AES and other cryptographic
algorithms that use finite fields.The first three sections introduce basic concepts
4.1 / DIVISIBILITY AND THE DIVISION ALGORITHM 103
from number theory that are needed in the remainder of the chapter; these
include divisibility,the Euclidian algorithm, and modular arithmetic.Next comes a
brief overview of the concepts of group,ring, and field. This section is somewhat
abstract;the reader may prefer to quickly skim this section on a first reading. We
are then ready to discuss finite fields of the form , where is a prime num-
ber. Next,we need some additional background, this time in polynomial arith-
metic.The chapter concludes with a discussion of finite fields of the form ,
where is a positive integer.
The concepts and techniques of number theory are quite abstract,and it is often
difficult to grasp them intuitively without examples.Accordingly, this chapter and
Chapter 8include a number of examples, each of which is highlighted in a shaded box.
4.1 DIVISIBILITY AND THE DIVISION ALGORITHM
Divisibility
We say that a nonzero divides if for some ,where , , and are
integers.That is, divides if there is no remainder on division. The notation is
commonly used to mean divides . Also, if ,we say that is a divisor of .abb
ƒ
aab
b
ƒ
aab
mbama = mbab
n
GF(2
n
)
p GF(p)
The positive divisors of 24 are 1,2, 3, 4,6, 8, 12, and 24.
13
ƒ
182; -5
ƒ
30; 17
ƒ
289; -3
ƒ
33; 17
ƒ
0
Subsequently,we will need some simple properties of divisibility for integers,
which are as follows:
If , then .
If and ,then .
Any 0 divides 0.
If and ,then :a
ƒ
cb
ƒ
ca
ƒ
b
Zb
a =;bb
ƒ
aa
ƒ
b
a =;1a
ƒ
1
11
ƒ
66 and 66
ƒ
198 = 11
ƒ
198
If and , then for arbitrary integers and .
To see this last point,note that
If ,then is of the form for some integer .
If ,then is of the form for some integer .
So
mg + nh = mbg
1
+ nbh
1
= b * (mg
1
+ nh
1
)
h
1
h = b * h
1
hb
ƒ
h
g
1
g = b * g
1
gb
ƒ
g
nmb
ƒ
(mg + nh)b
ƒ
hb
ƒ
g
104 CHAPTER 4 / BASIC CONCEPTS IN NUMBER THEORY AND FINITE FIELDS
.
To show ,
we have ,
and it is obvious that .7
ƒ
(7(3 * 2 + 2 * 9))
(3 * 14 + 2 * 63) = 7(3 * 2 + 2 * 9)
7
ƒ
(3 * 14 + 2 * 63)
7
ƒ
14 and 7
ƒ
63
b = 7; g = 14; h = 63; m = 3; n = 2
The Division Algorithm
Given any positive integer and any nonnegative integer , if we divide by , we
get an integer quotient and an integer remainder that obey the following rela-
tionship:
(4.1)
where is the largest integer less than or equal to . Equation (4.1) is referred to
as the division algorithm.
1
Figure 4.1ademonstrates that, given and positive , it is always possible to
find and that satisfy the preceding relationship. Represent the integers on the
number line; will fall somewhere on that line (positive is shown, a similar
demonstration can be made for negative ). Starting at 0, proceed to ,2 , up to
, such that . The distance from to is , and we
have found the unique values of and . The remainder is often referred to as a
residue.
rrq
raqnqn a and (q + 1)n 7 aqn
nna
aa
rq
na
x:x;
a = qn + r 0 r 6 n; q = : a/n;
rq
naan
and therefore .b divides mg + nh
1
Equation 4.1 expresses a theorem rather than an algorithm,but by tradition, this is referred to as the
division algorithm.
0
n 2n 3n qn (q + 1)na
n
r
(a) General relationship
015
15
10
30
= 2 15
70
(b) Example: 70 = (4 15) + 10
45
= 3 15
60
= 4 15
75
= 5 15
Figure 4.1 The Relationship a = qn + r;0 r < n
4.2 / THE EUCLIDEAN ALGORITHM 105
Figure 4.1bprovides another example.
a = 11; n = 7; 11 = 1 * 7 + 4; r = 4 q = 1
a =-11; n = 7; - 11 = (- 2) * 7 + 3; r = 3 q =-2
4.2 THE EUCLIDEAN ALGORITHM
One of the basic techniques of number theory is the Euclidean algorithm, which is a
simple procedure for determining the greatest common divisor of two positive integers.
First, we need a simple definition:Two integers are relatively prime if their only
common positive integer factor is 1.
Greatest Common Divisor
Recall that nonzero is defined to be a divisor of if for some ,where , ,
and are integers.We will use the notation gcd( , ) to mean the greatest common
divisor of and . The greatest common divisor of and is the largest integer that
divides both and . We also define gcd(0,0) = 0.
More formally,the positive integer is said to be the greatest common divisor
of and if
1. is a divisor of and of .
2. Any divisor of and is a divisor of .
An equivalent definition is the following:
Because we require that the greatest common divisor be positive,
.In general, .gcd(a, b) = gcd(
ƒ
a
ƒ
,
ƒ
b
ƒ
)gcd(- a, b) = gcd( -a, -b)gcd(a, - b) =
gcd(a, b) =
gcd(a, b) = max[k, such that k
ƒ
a and k
ƒ
b]
cba
bac
ba
c
ba
baba
bam
bama = mbab
gcd(60, 24) = gcd(60, -24) = 12
Also,because all nonzero integers divide 0, we have .
We stated that two integers and are relatively prime if their only common
positive integer factor is 1.This is equivalent to saying that and are relatively
prime if .gcd(a, b)= 1
ba
ba
gcd(a, 0) =
ƒ
a
ƒ
8 and 15 are relatively prime because the positive divisors of 8 are 1,2, 4, and 8,and
the positive divisors of 15 are 1,3, 5, and 15. So 1 is the only integer on both lists.
Finding the Greatest Common Divisor
We now describe an algorithm credited to Euclid for easily finding the greatest
common divisor of two integers.This algorithm has significance subsequently in
this chapter. Suppose we have integers , such that . Becaused = gcd(a, b)ba
106 CHAPTER 4 / BASIC CONCEPTS IN NUMBER THEORY AND FINITE FIELDS
,there is no harm in assuming . Now dividing
by and applying the division algorithm,we can state:
(4.2)
If it happens that ,then and .But if , we can state
that . This is due to the basic properties of divisibility:the relations and
together imply that ,which is the same as . Before proceeding with the
Euclidian algorithm,we need to answer the question: What is the ? We know
that and . Now take any arbitrary integer that divides both and .
Therefore, . Because divides both and , we must have ,
which is the greatest common divisor of and . Therefore .
Let us now return to Equation (4.2) and assume that .Because ,
we can divide by and apply the division algorithm to obtain:
As before, if ,then and if , then .The division
process continues until some zero remainder appears,say, at the th stage
where is divided by .The result is the following system of equations:
(4.3)
At each iteration, we have until finally .
Thus,we can find the greatest common divisor of two integers by repetitive
application of the division algorithm.This scheme is known as the Euclidean
algorithm.
We have essentially argued from the top down that the final result is the
.We can also argue from the bottom up. The first step is to show that
divides and . It follows from the last division in Equation (4.3) that
. The next to last division shows that because it
divides both terms on the right. Successively,one sees that divides all ’s and
finally and . It remains to show that is the largest divisor that divides and
. If we take any arbitrary integer that divides and , it must also divide , as
explained previously.We can follow the sequence of equations in Equation (4.3)
down and show that must divide all ’s. Therefore must divide ,so that
.r
n
= gcd(a, b)
r
n
cr
i
c
r
1
bab
ar
n
ba
r
i
r
n
r
n
divides r
n-2
r
n
divides r
n-1
bar
n
gcd(a, b)
d = gcd(r
n
, 0) = r
n
d = gcd(r
i
,r
i+1
)
a = q
1
b + r
1
0 6 r
1
6 b
b = q
2
r
1
+ r
2
0 6 r
2
6 r
1
r
1
= q
3
r
2
+ r
3
0 6 r
3
6 r
2
##
##
##
r
n-2
= q
n
r
n-1
+ r
n
0 6 r
n
6 r
n-1
r
n-1
= q
n+1
r
n
+ 0
d = gcd(a, b) = r
n
y
r
n
r
n-1
(n + 1)
d = gcd(r
1
,r
2
)r
2
Z 0d = r
1
r
2
= 0
b = q
2
r
1
+ r
2
0 r
2
6 r
1
r
1
b
b 7 r
1
r
1
Z 0
d = gcd(b, r
1
)ba
c dbacc
ƒ
(q
1
b + r
1
) = a
r
1
bcd
ƒ
r
1
d
ƒ
b
gcd(b,r
1
)
d
ƒ
r
1
d
ƒ
(a - q
1
b)
d
ƒ
bd
ƒ
ad
ƒ
r
1
r
1
Z 0d = gcd(a, b) = bb
ƒ
ar
1
= 0
a = q
1
b + r
1
0 r
1
6 b
b
aa Ú b 7 0gcd(
ƒ
a
ƒ
,
ƒ
b
ƒ
) = gcd(a, b)
Table 4.1 Euclidean Algorithm Example
Dividend Divisor Quotient Remainder
a = 1160718174
b = 316258250 q
1
=
3 r
1
= 211943424
b = 316258250
r
1
=
211943434 q
2
=
1 r
2
= 104314826
r
1
= 211943424
r
2
=
104314826 q
3
=
2 r
3
=
3313772
r
2
= 104314826
r
3
=
3313772 q
4
= 31 r
4
=
1587894
r
3
= 3313772
r
4
=
1587894 q
5
=
2 r
5
=
137984
r
4
= 1587894
r
5
=
137984 q
6
= 11 r
6
=
70070
r
5
= 137984
r
6
=
70070 q
7
=
1 r
7
=
67914
r
6
= 70070
r
7
=
67914 q
8
=
1 r
8
=
2156
r
7
= 67914
r
8
=
2156 q
9
= 31 r
9
=
1078
r
8
= 2156
r
9
=
1078 q
10
=
2 r
10
=
0
4.2 / THE EUCLIDEAN ALGORITHM 107
Let us now look at an example with relatively large numbers to see the power
of this algorithm:
In this example,we begin by dividing 1160718174 by 316258250, which gives 3
with a remainder of 211943424.Next we take 316258250 and divide it by 211943424.
The process continues until we get a remainder of 0,yielding a result of 1078.
It will be helpful in what follows to recast the above computation in tabular form.
For every step of the iteration,we have ,where is the dividend,
is the divisor, is the quotient,and is the remainder.Table 4.1 summarizes the results.r
i
q
i
r
i-1
r
i-2
r
i-2
= q
i
r
i-1
+ r
i
To find d = gcd (a,b) = gcd (1160718174, 316258250)
a = q
1
b + r
1
1160718174 = 3 * 316258250 + 211943424
211943424)
d = gcd(316258250,
b = q
2
r
1
+ r
2
316258250 = 1 * 211943424 + 104314826
104314826)
d = gcd(211943424,
r
1
= q
3
r
2
+ r
3
211943424 = 2 * 104314826 + 3313772
3313772)
d = gcd(104314826,
r
2
= q
4
r
3
+ r
4
104314826 = 31 * 3313772 + 1587894
1587894)
d = gcd(3313772,
r
3
= q
5
r
4
+ r
5
3313772 = 2 * 1587894 + 137984
137984)
d = gcd(1587894,
r
4
= q
6
r
5
+ r
6
1587894 = 11 * 137984 + 70070
d = gcd(137984, 70070)
r
5
= q
7
r
6
+ r
7
137984 = 1 * 70070 + 67914
d = gcd(70070, 67914)
r
6
= q
8
r
7
+ r
8
70070 = 1 * 67914 + 2156
d = gcd(67914, 2156)
r
7
= q
9
r
8
+ r
9
67914 = 31 * 2516 + 1078
d = gcd(2156, 1078)
r
8
= q
10
r
9
+ r
10
2156 = 2 * 1078 + 0
d = gcd(1078, 0) = 1078
Therefore,d = gcd(1160718174,
316258250) = 1078
108 CHAPTER 4 / BASIC CONCEPTS IN NUMBER THEORY AND FINITE FIELDS
4.3 MODULAR ARITHMETIC
The Modulus
If is an integer and is a positive integer,we define mod to be the remainder
when is divided by . The integer is called the modulus. Thus,for any integer ,
we can rewrite Equation (4.1) as follows:
a = :a/n; * n + (a mod n)
a= qn + r 0 r 6 n; q = : a/n;
anna
nana
11 mod 7 = 4; - 11 mod 7 = 3
Two integers and are said to be congruent modulo n, if
.This is written as .
2
a K b (mod n)(b mod n)
(a mod n) =ba
73
4 (mod 23); 21
-9 (mod 10)
Note that if , then .
Properties of Congruences
Congruences have the following properties:
1. if .
2. implies .
3. and imply .
To demonstrate the first point,if , then for some .
So we can write Therefore,
.divided by n) = (remainder when b is divided by n) = (b mod n)
(a mod n) = (remainder when b + kn isa = b + kn.
k(a - b) = knn
ƒ
(a - b)
a K c (mod n)b K c (mod n)a K b (mod n)
b K a (mod n)aK b (mod n)
n
ƒ
(a - b)a K b (mod n)
n
ƒ
aa K 0 (mod n)
2
We have just used the operator in two different ways:first as a binary operator that produces a
remainder,as in the expression mod ; second as a congruence relation that shows the equivalence of
two integers,as in the expression . See Appendix 4A for a discussion.K b(mod n)a
ba
mod
23 K 8 (mod 5) because 23 - 8 = 15 = 5 * 3
-11 K 5 (mod 8) because -11 - 5 =-16= 8 * ( -2)
81 K 0 (mod 27) because 81 - 0 = 81 = 27 * 3
The remaining points are as easily proved.
Modular Arithmetic Operations
Note that,by definition (Figure 4.1), the (mod ) operator maps all integers into the set
of integers { }.This suggests the question: Can we perform arithmetic0, 1, Á , (n - 1)
n
4.3 / MODULAR ARITHMETIC 109
operations within the confines of this set? It turns out that we can; this technique is
known as modular arithmetic.
Modular arithmetic exhibits the following properties:
1.
2.
3.
We demonstrate the first property.Define and .
Then we can write for some integer and for some integer
.Then
The remaining properties are proven as easily.Here are examples of the three
properties:
= [(a mod n) + (b mod n)]mod n
= (r
a
+ r
b
) mod n
= (r
a
+ r
b
+ (k + j)n) mod n
(a + b) mod n = (r
a
+ jn + r
b
+ kn) mod n
k
b = r
b
+ knja = r
a
+ jn
(b mod n) = r
b
(a mod n) = r
a
[(a mod n) * (b mod n)] mod n = (a * b) mod n
[(a mod n) - (b mod n)] mod n = (a - b) mod n
[(a mod n) + (b mod n)] mod n = (a + b) mod n
(11 * 15) mod 8 = 165 mod 8 = 5
[(11 mod 8) * (15 mod 8)] mod 8 = 21 mod 8 = 5
(11 - 15) mod 8 =-4 mod 8 = 4
[(11 mod 8) - (15 mod 8)] mod 8 =-4 mod 8 = 4
(11 + 15) mod 8 = 26 mod 8 = 2
[(11 mod 8) + (15 mod 8)] mod 8 = 10 mod 8 = 2
11 mod 8 = 3; 15 mod 8 = 7
Exponentiation is performed by repeated multiplication,as in ordinary arith-
metic.(We have more to say about exponentiation in Chapter 8.)
To find 11
7
mod 13,we can proceed as follows:
11
7
K 11 * 4 * 3 K 132 K 2 (mod 13)
11
4
= (11
2
)
2
K 4
2
K 3 (mod 13)
11
2
= 121 K 4 (mod 13)
Thus,the rules for ordinary arithmetic involving addition, subtraction, and
multiplication carry over into modular arithmetic.
Table 4.2 provides an illustration of modular addition and multiplication
modulo 8.Looking at addition, the results are straightforward, and there is a regular
pattern to the matrix. Both matrices are symmetric about the main diagonal in
conformance to the commutative property of addition and multiplication.As in
ordinary addition,there is an additive inverse, or negative, to each integer in modu-
lar arithmetic. In this case,the negative of an integer is the integer such thatyx
110 CHAPTER 4 / BASIC CONCEPTS IN NUMBER THEORY AND FINITE FIELDS
mod 8 = 0.To find the additive inverse of an integer in the left-hand column,
scan across the corresponding row of the matrix to find the value 0; the integer at
the top of that column is the additive inverse;thus, . Similarly,the
entries in the multiplication table are straightforward.In ordinary arithmetic, there
is a multiplicative inverse,or reciprocal, to each integer.In modular arithmetic mod 8,
the multiplicative inverse of is the integer such that .
Now,to find the multiplicative inverse of an integer from the multiplication table,
scan across the matrix in the row for that integer to find the value 1;the integer at
the top of that column is the multiplicative inverse;thus, .Note
that not all integers mod 8 have a multiplicative inverse;more about that later.
Properties of Modular Arithmetic
Define the set as the set of nonnegative integers less than :
This is referred to as the set of residues,or residue classes . To be more
precise,each integer in represents a residue class.We can label the residue classes
,where
[r] = {a: a is an integer, a K r (mod n)}
(mod n)a s[0], [1], [2], Á , [n - 1]
Z
n
(mod n)
Z
n
= {0, 1, Á , (n - 1)}
nZ
n
(3 * 3) mod 8 = 1
(x * y) mod 8 = 1 mod 8yx
(2 + 6) mod 8 = 0
(x + y)
Table 4.2 Arithmetic Modulo 8
+
01234567
001234567
112345670
223456701
334567012
445670123
556701234
667012345
770123456
(a) Addition modulo 8
× 01234567
000000000
101234567
202460246
303614725
404040404
505274163
606420642
707654321
(b) Multiplication modulo 8
w -w
w
-1
0 0—
1 7 1
2 6—
3 5 3
4 4—
5 3 5
6 2—
7 1 7
(c) Additive and multiplicative
inverses modulo 8
4.3 / MODULAR ARITHMETIC 111
Of all the integers in a residue class,the smallest nonnegative integer is the
one used to represent the residue class.Finding the smallest nonnegative integer to
which is congruent modulo is called reducing kmodulo n.
If we perform modular arithmetic within , the properties shown in Table 4.3
hold for integers in .We show in the next section that this implies that is a com-
mutative ring with a multiplicative identity element.
There is one peculiarity of modular arithmetic that sets it apart from ordinary
arithmetic.First, observe that (as in ordinary arithmetic) we can write the following:
(4.4)if (a + b) K (a + c) (mod n) then b K c (mod n)
Z
n
Z
n
Z
n
nk
The residue classes (mod 4) are
[3] = { Á , -13, - 9, - 5, -1, 3, 7, 11, 15, 19, Á }
[2] = { Á , -14, - 10, - 6, -2, 2, 6, 10, 14, 18, Á }
[1] = { Á , -15, - 11, - 7, -3, 1, 5, 9, 13, 17, Á }
[0] = { Á , -16, - 12, - 8, -4, 0, 4, 8, 12, 16, Á }
(5 + 23) K (5 + 7) (mod 8); 23 K 7(mod 8)
Equation (4.4) is consistent with the existence of an additive inverse.Adding
the additive inverse of to both sides of Equation (4.4),we have
However,the following statement is true only with the attached condition:
(4.5)
Recall that two integers are relatively prime if their only common positive integer
factor is 1. Similar to the case of Equation (4.4),we can say that Equation (4.5) is
if (a * b) K (a * c)
(mod n) then b K c (mod n) if a is relatively prime to n
((-a) + a + b) K ((- a) + a + c) (mod n)
b K c (mod n)
a
Table 4.3 Properties of Modular Arithmetic for Integers in Z
n
Property
Expression
Commutative Laws
(w * x) mod n = (x + w) mod n
(w + x) mod n = (x + w) mod n
Associative Laws
[(w * x) * y] mod n = [w * (x * y)] mod n
[(w + x) + y] mod n = [w + (x + y)] mod n
Distributive Law
[w * (x + y)] mod n = [(w * x) + (w * y)] mod n
Identities
(1 * w) mod n = w mod n
(0 + w) mod n = w mod n
Additive Inverse (– )w
For each ,there exists a such that w + z K 0 mod na zw H Z
n
112 CHAPTER 4 / BASIC CONCEPTS IN NUMBER THEORY AND FINITE FIELDS
consistent with the existence of a multiplicative inverse.Applying the multiplicative
inverse of to both sides of Equation (4.5),we have
((a
-1
)ab) K ((a
-1
)ac) (mod n)
b K c
(mod n)
a
The reason for this strange result is that for any general modulus ,a multiplier
that is applied in turn to the integers 0 through will fail to produce a complete
set of residues if and have any factors in common.na
(n - 1)
an
To see this,consider an example in which the condition of Equation (4.5) does
not hold.The integers 6 and 8 are not relatively prime, since they have the
common factor 2.We have the following:
Yet 3 7 (mod 8).[
6 * 7 = 42 K 2 (mod 8)
6 * 3 = 18 K 2 (mod 8)
With and ,
Because we do not have a complete set of residues when multiplying by 6,
more than one integer in maps into the same residue. Specifically,
; and so on. Because
this is a many-to-one mapping, there is not a unique inverse to the multiply
operation.
However,if we take and ,whose only common factor is 1,
The line of residues contains all the integers in , in a different order.Z
8
Z
8
01 2 3 4 5 6 7
Multiply by 505101520253035
Residues 0 5 2 7 4 1 6 3
n = 8a = 5
6 * 0 mod 8 = 6 * 4 mod 8; 6 * 1 mod 8 = 6 * 5 mod 8
Z
8
Z
8
01 2 3 4 5 6 7
Multiply by 606 12 18 24 30 36 42
Residues 0 6 4 2 0 6 4 2
n = 8a = 6
In general,an integer has a multiplicative inverse in if that integer is relatively
prime to . Table 4.2cshows that the integers 1, 3, 5,and 7 have a multiplicative inverse
in ;but 2, 4, and 6 do not.
Euclidean Algorithm Revisited
The Euclidean algorithm can be based on the following theorem:For any nonnegative
integer and any positive integer ,
(4.6)gcd(a, b) = gcd(b, a mod b)
ba
Z
8
n
Z
n
4.3 / MODULAR ARITHMETIC 113
To see that Equation (4.6) works,let . Then,by the definition of
gcd, and .For any positive integer ,we can express as
with , integers.Therefore, for some integer . But because ,it
also divides kb.We also have . Therefore,d .This shows that is a common
divisor of band (a mod b). Conversely,if is a common divisor of and ( mod ), then
and thus , which is equivalent to . Thus,the set of common
divisors of and is equal to the set of common divisors of and ( mod ).Therefore,the
gcd of one pair is the same as the gcd of the other pair,proving the theorem.
babba
d
ƒ
ad
ƒ
[kb + (a mod b)]d
ƒ
kb
babd
d
ƒ
(a mod b)d
ƒ
a
d
ƒ
bk(a mod b) = a - kbrk
a = kb + r K r (mod b)
a mod b = r
abd
ƒ
bd
ƒ
a
d = gcd(a, b)
gcd(55, 22) = gcd(22, 55 mod 22) = gcd(22, 11) = 11
gcd(11, 10) = gcd(10, 1) = gcd(1, 0) = 1
gcd(18, 12) = gcd(12, 6) = gcd(6, 0) = 6
This is the same scheme shown in Equation (4.3),which can be rewritten in the
following way.
Euclidean Algorithm
Calculate Which satisfies
r
1
= a mod b a = q
1
b + r
1
r
2
= b mod r
1
b = q
2
r
1
+ r
2
r
3
= r
1
mod r
2
r
1
= q
3
r
2
+ r
3
r
n
= r
n-2
mod r
n-1
r
n-2
= q
n
r
n-1
+ r
n
r
n+1
= r
n-1
mod r
n
= 0
d = gcd(a, b) = r
n
r
n-1
= q
n+1
r
n
+ 0
We can define the Euclidean algorithm concisely as the following recursive
function.
Euclid(a,b)
if (b=0) then return a;
else return Euclid(b, a mod b);
The Extended Euclidean Algorithm
We now proceed to look at an extension to the Euclidean algorithm that will be impor-
tant for later computations in the area of finite fields and in encryption algorithms,
suchas RSA. For given integers and , the extended Euclidean algorithm not only
calculate the greatest common divisor but also two additional integers and that
satisfy the following equation.
yxd
ba
Equation (4.6) can be used repetitively to determine the greatest common divisor.