Postquantum Cryptography 2008

The document presents the proceedings of the Second International Workshop on Post-Quantum Cryptography (PQCrypto 2008), which focused on developing cryptographic systems resilient to quantum computing threats. It highlights the urgency for new public-key cryptosystems as traditional ones may be compromised by quantum algorithms like Shor's. The workshop included contributions from various researchers and aimed to facilitate the exchange of ideas and advancements in post-quantum cryptography.

Lecture Notes in Computer Science 5299

Commenced Publication in 1973


Founding and Former Series Editors:
Gerhard Goos, Juris Hartmanis, and Jan van Leeuwen

Editorial Board
David Hutchison
Lancaster University, UK
Takeo Kanade
Carnegie Mellon University, Pittsburgh, PA, USA
Josef Kittler
University of Surrey, Guildford, UK
Jon M. Kleinberg
Cornell University, Ithaca, NY, USA
Alfred Kobsa
University of California, Irvine, CA, USA
Friedemann Mattern
ETH Zurich, Switzerland
John C. Mitchell
Stanford University, CA, USA
Moni Naor
Weizmann Institute of Science, Rehovot, Israel
Oscar Nierstrasz
University of Bern, Switzerland
C. Pandu Rangan
Indian Institute of Technology, Madras, India
Bernhard Steffen
University of Dortmund, Germany
Madhu Sudan
Massachusetts Institute of Technology, MA, USA
Demetri Terzopoulos
University of California, Los Angeles, CA, USA
Doug Tygar
University of California, Berkeley, CA, USA
Gerhard Weikum
Max-Planck Institute of Computer Science, Saarbruecken, Germany
Johannes Buchmann Jintai Ding (Eds.)

Post-Quantum
Cryptography

Second International Workshop, PQCrypto 2008


Cincinnati, OH, USA, October 17-19, 2008
Proceedings

Volume Editors

Johannes Buchmann
Technische Universität Darmstadt
Fachbereich Informatik
Hochschulstraße 10, 64289 Darmstadt, Germany
E-mail: buchmann@[Link]

Jintai Ding
The University of Cincinnati
Department of Mathematical Sciences
P.O. Box 210025, Cincinnati, OH 45221-0025, USA
E-mail: [Link]@[Link]

Library of Congress Control Number: 2008936091

CR Subject Classification (1998): E.3, D.4.6, K.6.5, F.2.1-2, C.2, H.4.3

LNCS Sublibrary: SL 4 – Security and Cryptology

ISSN 0302-9743
ISBN-10 3-540-88402-5 Springer Berlin Heidelberg New York
ISBN-13 978-3-540-88402-6 Springer Berlin Heidelberg New York

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is
concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting,
reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication
or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965,
in its current version, and permission for use must always be obtained from Springer. Violations are liable
to prosecution under the German Copyright Law.
Springer is a part of Springer Science+Business Media
[Link]
© Springer-Verlag Berlin Heidelberg 2008
Printed in Germany
Typesetting: Camera-ready by author, data conversion by Scientific Publishing Services, Chennai, India
Printed on acid-free paper SPIN: 12538829 06/3180 543210
Preface

Three decades ago public-key cryptosystems made a revolutionary breakthrough


in cryptography. They have developed into an indispensable part of our mod-
ern communication system. In practical applications RSA, DSA, ECDSA, and
similar public key cryptosystems are commonly used. Their security depends on
assumptions about the difficulty of certain problems in number theory, such as
the Integer Prime Factorization Problem or the Discrete Logarithm Problem.
However, in 1994 Peter Shor showed that quantum computers could break
any public-key cryptosystem based on these hard number theory problems. This
means that if a reasonably powerful quantum computer could be built, it would
put essentially all modern communication into peril. In 2001, Isaac Chuang and
Neil Gershenfeld implemented Shor’s algorithm on a 7-qubit quantum computer.
In 2007 a 16-qubit quantum computer was demonstrated by a start-up company
with the prediction that a 512-qubit or even a 1024-qubit quantum computer
would become available in 2008. Some physicists predicted that within the next
10 to 20 years quantum computers will be built that are sufficiently powerful to
implement Shor’s ideas and to break all existing public key schemes. Thus we
need to look ahead to a future of quantum computers, and we need to prepare
the cryptographic world for that future.
The research community has put much effort into developing quantum com-
puters and at the same time searching for alternative public-key cryptosystems
that could resist these quantum computers. Post-quantum cryptography is a
new fast developing area, where public key cryptosystems are studied that could
resist these emerging attacks. Currently there are four families of public-key cryp-
tosystems that have the potential to resist quantum computers: the code-based
public-key cryptosystems, the hash-based public-key cryptosystems, the lattice-
based public-key cryptosystems and the multivariate public-key cryptosystems.
Clearly there is a need to organize an event for researchers working in this
area to present their results, to exchange ideas and, most importantly, to allow
the world to know the state of the art of research in this area. In May 2006,
the First International Workshop on Post-Quantum Cryptography was held at
the Catholic University of Leuven in Belgium with support from the European
Network of Excellence for Cryptology (ECRYPT), funded within the Information
Societies Technology Programme (IST) of the European Commission’s Sixth
Framework Programme. This workshop did not have formal proceedings.
PQCrypto 2008, the Second International Workshop on Post-Quantum Cryp-
tography, was held at the University of Cincinnati in Cincinnati, USA, October
17–19. This meeting was sponsored by the University of Cincinnati, the Taft
Research Center, and FlexSecure® GmbH. This workshop had a devoted inter-
national Program Committee, who worked very hard to evaluate and select the
high-quality papers for presentation. Each paper was anonymously reviewed by
at least three Program Committee members. Revised versions of the accepted
papers are published in these proceedings.
We would like to thank all the authors for their support in submitting their
papers and the authors of the accepted papers for their efforts in making these
proceedings possible on time. We are very grateful to the Program Committee
members for devoting their time and effort to reviewing and selecting the
papers, and we are also very grateful to the external reviewers for their efforts.
We would like to thank the local Organization Committee for their efforts, in
particular Timothy Hodges and Dieter Schmidt, without whose support this
workshop would not have been possible. We would also like to thank Richard Harknett,
Chair of the Taft Research Center, for his support.
We would like to thank the EasyChair electronic conference system, which
made the handling of the submission and reviewing process easy and efficient. In
addition, we would like to thank Daniel Cabarcas for managing all the electronic
processes.
We would also like to thank Springer, in particular Alfred Hofmann and Anna
Kramer for their support in publishing these proceedings.

August 2008 Johannes Buchmann


Jintai Ding
Organization

PQCrypto 2008 was organized by the Department of Mathematical Sciences,


University of Cincinnati.

Executive Committee
Program Chairs Johannes Buchmann (Technical University of
Darmstadt)
Jintai Ding (University of Cincinnati)
General Chair Timothy Hodges (University of Cincinnati)
Local Committee Timothy Hodges (University of Cincinnati)
Jintai Ding (University of Cincinnati)
Dieter Schmidt (University of Cincinnati)

Program Committee

Gernot Albert, Germany
Koichiro Akiyama, Japan
Daniel J. Bernstein, USA
Claude Crepeau, Canada
Cunshen Ding, China
Bao Feng, Singapore
Louis Goubin, France
Tor Helleseth, Norway
Tanja Lange, The Netherlands
Christof Paar, Germany
Louis Salvail, Denmark
Werner Schindler, Germany
Nicolas Sendrier, France
Alice Silverberg, USA
Martijn Stam, Switzerland
Michael Szydlo, USA
Shigeo Tsujii, Japan
Thomas Walther, Germany
Chaoping Xing, Singapore
Bo-yin Yang, Taipei

Referees
G. Albert J. Ding W. Schindler
K. Akiyama B. Feng N. Sendrier
R. Avanzi R. Fujita A. Silverberg
J. Baena P. Gaborit M. Stam
D. Bernstein M. Gotaishi M. Szydlo
J. Buchmann L. Goubin K. Tanaka
D. Cabarcas T. Helleseth S. Tsujii
C. Clough T. Lange T. Walther
C. Crepeau X. Nie C. Xing
A. Diene C. Paar B. Yang
C. Ding L. Salvail

Sponsors
The Taft Research Center at the University of Cincinnati
Department of Mathematical Sciences, University of Cincinnati
FlexSecure® GmbH, Darmstadt, Germany
Table of Contents

A New Efficient Threshold Ring Signature Scheme Based on Coding Theory . . . . . 1
Carlos Aguilar Melchor, Pierre-Louis Cayrel, and Philippe Gaborit

Square-Vinegar Signature Scheme . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17


John Baena, Crystal Clough, and Jintai Ding

Attacking and Defending the McEliece Cryptosystem . . . . . . . . . . . . . . . . . 31


Daniel J. Bernstein, Tanja Lange, and Christiane Peters

McEliece Cryptosystem Implementation: Theory and Practice . . . . . . . . . 47


Bhaskar Biswas and Nicolas Sendrier

Merkle Tree Traversal Revisited . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63


Johannes Buchmann, Erik Dahmen, and Michael Schneider

Explicit Hard Instances of the Shortest Vector Problem . . . . . . . . . . . . . . . 79


Johannes Buchmann, Richard Lindner, and Markus Rückert

Practical-Sized Instances of Multivariate PKCs: Rainbow, TTS, and IC-Derivatives . . . . . 95
Anna Inn-Tung Chen, Chia-Hsin Owen Chen, Ming-Shing Chen,
Chen-Mou Cheng, and Bo-Yin Yang

Digital Signatures Out of Second-Preimage Resistant Hash Functions . . . 109


Erik Dahmen, Katsuyuki Okeya, Tsuyoshi Takagi, and
Camille Vuillaume

Cryptanalysis of Rational Multivariate Public Key Cryptosystems . . . . . . 124


Jintai Ding and John Wagner

Syndrome Based Collision Resistant Hashing . . . . . . . . . . . . . . . . . . . . . . . . 137


Matthieu Finiasz

Nonlinear Piece In Hand Perturbation Vector Method for Enhancing Security of Multivariate Public Key Cryptosystems . . . . . 148
Ryou Fujita, Kohtaro Tadaki, and Shigeo Tsujii

On the Power of Quantum Encryption Keys . . . . . . . . . . . . . . . . . . . . . . . . . 165


Akinori Kawachi and Christopher Portmann

Secure PRNGs from Specialized Polynomial Maps over Any Fq . . . . . . . . 181


Feng-Hao Liu, Chi-Jen Lu, and Bo-Yin Yang

MXL2: Solving Polynomial Equations over GF(2) Using an Improved Mutant Strategy . . . . . 203
Mohamed Saied Emam Mohamed,
Wael Said Abd Elmageed Mohamed, Jintai Ding, and
Johannes Buchmann

Side Channels in the McEliece PKC . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216


Falko Strenzke, Erik Tews, H. Gregor Molter,
Raphael Overbeck, and Abdulhadi Shoufan

Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 231


A New Efficient Threshold Ring Signature
Scheme Based on Coding Theory

Carlos Aguilar Melchor, Pierre-Louis Cayrel, and Philippe Gaborit

Université de Limoges, XLIM-DMI,


123, Av. Albert Thomas 87060 Limoges Cedex, France
{[Link],[Link],[Link]}@[Link]

Abstract. Ring signatures were introduced by Rivest, Shamir and Tau-


man in 2001. Bresson, Stern and Szydlo extended the ring signature con-
cept to t-out-of-N threshold ring signatures in 2002. We present in this
paper a generalization of Stern’s code based authentication (and signa-
ture) scheme to the case of t-out-of-N threshold ring signature. The size
of our signature is in O(N ) and does not depend on t. Our protocol is
anonymous and secure in the random oracle model, it has a very short
public key and has a complexity in O(N ). This protocol is the first effi-
cient code-based ring signature scheme and the first code-based threshold
ring signature scheme. Moreover it has a better complexity than number-
theory based schemes which have a complexity in O(N t).

Keywords: Threshold ring signature, code-based cryptography, Stern’s


Scheme,syndrome decoding.

1 Introduction
In 1978, McEliece published a work in which he proposed to use the theory of
error-correcting codes for confidentiality purposes. More precisely, he designed
an asymmetric encryption algorithm whose principle may be summed up as follows:
Alice applies a secret encoding mechanism to a message and adds to it a large
number of errors, which can only be corrected by Bob, who has information about
the secret encoding mechanism. The zero-knowledge authentication scheme pro-
posed by Stern in [24] is based on a well-known error-correcting code problem
usually referred to as the Syndrome Decoding Problem (SD for short). It is therefore
considered as a good alternative to the numerous authentication schemes whose
security relies on number theory problems, like the factorization and the discrete
logarithm problems.
The concept of ring signature was introduced by Rivest, Shamir and Tau-
man [20] (called RST in the following). A ring signature is considered to be
a simplified group signature without group managers. Ring signatures are re-
lated, but incomparable, to the notion of group signatures in [8]. On one hand,
group signatures have the additional feature that the anonymity of a signer can
be revoked (i.e. the signer can be traced) by a designated group manager, on
the other hand, ring signatures allow greater flexibility: no centralized group

J. Buchmann and J. Ding (Eds.): PQCrypto 2008, LNCS 5299, pp. 1–16, 2008.
© Springer-Verlag Berlin Heidelberg 2008

manager or coordination among the various users is required (indeed, users may
be unaware of each other at the time they generate their public keys). The orig-
inal motivation was to allow secrets to be leaked anonymously. For example, a
high-ranking government official can sign information with respect to the ring of
all similarly high-ranking officials, the information can then be verified as coming
from someone reputable without exposing the actual signer.
Bresson et al. [5] extended the ring signature scheme into a threshold ring sig-
nature scheme using the concept of partitioning and combining functions. Assume
that t users want to leak some secret information, so that any verifier will be con-
vinced that t users among a selected group vouch for its validity. Simply constructing t
ring signatures clearly does not prove that the message has been signed by differ-
ent signers. A threshold ring signature scheme effectively proves that a minimum
number of users of a certain group must have actually collaborated to produce the
signature, while hiding the precise membership of the subgroup (for example the
ring of public keys of all members of the President’s Cabinet).

Contribution. In this paper, we present a generalization of Stern’s authenti-


cation and signature scheme [24] for ring and threshold ring signature schemes.
Our scheme’s performance does not depend on the number t of signers in the
ring, the overall complexity and length of signatures only depend linearly in
the maximum number of signers N . Our protocol also guarantees computational
anonymity in the random oracle model. Besides these features and its efficiency,
our protocol is also the first non generic coding theory based ring signature (and
threshold ring signature) protocol and may constitute an interesting alternative
to number theory based protocols. Overall our protocol has a very short pub-
lic key size, a signature length linear in N , and the best known complexity,
O(N ), whereas other number theory based threshold ring signature schemes have a
complexity in O(N t).

Organization of the paper. The rest of this paper is organized as follows. In


Section 2, we give a state of the art of ring signature and threshold ring signature.
In Section 3, we describe Stern’s authentication and signature scheme and give
some backround and notation. In Section 4, we present our new generalization
of Stern’s scheme in a threshold ring signature context. In Section 5, we study
the security of the proposed scheme. In Section 6 we consider a variation of the
protocol with double circulant matrices. In Section 7 we discuss the signature
cost and length. Finally, we conclude in Section 8.

2 Overview of Ring Signatures


2.1 Ring Signature
Following the formalization about ring signatures proposed in [20], we explain
in this section the basic definitions and the properties eligible to ring signature
schemes. One assumes that each user has received (via a PKI or a certificate) a
public key pki , for which the corresponding secret key is denoted ski . A regular
ring signature scheme consists of the following triple (Key-Gen, Sign and Verify):

– Key-Gen is a probabilistic polynomial algorithm that takes a security pa-


rameter(s) and returns the system, private, and public parameters.
– Sign is a probabilistic polynomial algorithm that takes system parameters,
a private parameter, a list of public keys pk1 , . . . , pkN of the ring, and a message
M . The output of this algorithm is a ring signature σ for the message M .
– Verify is a deterministic algorithm that takes as input a message M , a ring
signature σ, and the public keys of all the members of the corresponding
ring, then outputs True if the ring signature is valid, or False otherwise.
Most of the existing ring signature schemes have a signature length linear in
N , the size of the ring. Many schemes have been proposed; one can cite the work
of Bender, Katz and Morselli in [2], where they present three ring signature
schemes which are provably secure in the standard model. Recently, Shacham
and Waters [22] proposed a ring signature where for N members the signature
consists of 2N + 2 group elements and requires 2N + 3 pairings to verify.
A breakthrough on the size of ring signatures was obtained in [10], in which
the authors proposed the first (and unique up to now) constant-size scheme
based on accumulator functions and the Fiat-Shamir zero-knowledge identifica-
tion scheme. However, the signature derived from the Fiat-Shamir scheme has
a size of at least 160 kbits. Another construction, proposed by Chandran, Groth
and Sahai ([7]), has a size in O(√N ).
Recently in [32], Zheng, Li and Chen presented a code-based ring signature
scheme with a signature length of 144 + 126N bits, but this scheme is based on
the signature of [9] which remains very slow in comparison with other schemes.
Eventually, a generalization of ring signature schemes to mesh signatures was
proposed by Boyen in [4].

2.2 Threshold Ring Signature


In [5], Bresson, Stern and Szydlo introduced the notion of threshold ring sig-
nature. We explain in this section the basic definitions and the properties of
threshold ring signature schemes.
One assumes that each user has created or received a secret key ski and that
a corresponding public key pki is available to everyone.
Let A1 , . . . , AN be the N potential signers of the ring with their public keys
pk1 , . . . , pkN . Then t of the N members form a group of signers; one of them, L,
is the leader of the t-subgroup.
- Setup : initializes the state of the system. On input a security parameter
1l , create a public database pk1 , . . . , pkN , choose a leader L of the group and
generate the system's parameters;
- Make-GPK : the Group Public Key construction algorithm;
- Commitment-Challenge-Answer : an electronic way to temporarily hide a
sequence of bits that cannot be changed;
- Verification : takes as input the answers to the challenges, verifies the
honesty of the computation, and returns a boolean.

In [5], the size of the signature grows with the number of users N and the
number of signers t. More precisely, producing such a t-out-of-N signature requires
2^{O(t)} ⌈log2 N⌉ × (tl + N l) computations in the easy direction, where l is the
security parameter.
Later, Liu et al. [15] proposed another threshold ring signature based on
Shamir's secret sharing scheme. Their scheme is separable, with a signature
length linear in N but a complexity in O(N^2) for t ≈ N/2 (the cost of the secret
sharing scheme). The mesh signatures of [4] can also be used in that case: the
signature length is also linear in N , but verification requires N t bilinear pairing
evaluations.
A variation on ring signatures was introduced in [26], where the authors in-
troduced the notion of linkable ring signature, by which a signer can sign anony-
mously only once, since a verifier can link a second signature produced by
the same signer. Although this property may have interesting applications (in
particular for e-voting), it does not provide full anonymity (in the sense that
signing cannot be repeated). Later their scheme was extended to threshold ring
signatures with a complexity in O(N ), but again only as a linkable ring signature,
which does not provide the feature originally sought in [20] and [5], a fully
anonymous scheme.

3 Notation and Background on Coding Theory and Stern's Signature Scheme
3.1 Permutation Notation
We first introduce two notions of block permutation that we will use in our
protocol. Consider n and N two integers.

Definition 3.1. A constant n-block permutation Σ on N blocks is a permutation
which permutes N blocks of length n block by block, each block being treated as
a single position, as for usual permutations.

A more general type of permutation is the n-block permutation Σ on N blocks.


Definition 3.2. An n-block permutation Σ on N blocks is a permutation which
maps each block of length n (among the N blocks) exactly onto a block of length n.
A constant n-block permutation is a particular n-block permutation in which
the blocks are permuted as such. For instance, the permutation (6, 5, 4, 3, 2, 1) is a
2-block permutation on 3 blocks, and the permutation (3, 4, 5, 6, 1, 2) is a constant
2-block permutation on 3 blocks, since the order within each block ((1, 2), (3, 4) and
(5, 6)) is preserved by the block permutation.
The notion of product permutation is then straightforward. Let us define σ,
a family of N permutations (σ1 , . . . , σN ) of {1, . . . , n} on n positions, and Σ a
constant n-block permutation on N blocks defined on {1, . . . , N }. We consider
a vector v of size nN of the form
v = (v1 , v2 , . . . , vn , vn+1 , . . . , v2n , v2n+1 , . . . , vnN );
we denote by V1 the first n coordinates of v, by V2 the n following coordinates, and
so on, to obtain v = (V1 , V2 , . . . , VN ). We can then define an n-block permutation
on N blocks, Π = Σ ◦ σ, as
Π(v) = Σ ◦ σ(v) = (σ1 (VΣ(1) ), . . . , σN (VΣ(N ) )) = Σ(σ1 (V1 ), . . . , σN (VN )).
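To make the two definitions concrete, here is a small Python sketch of applying Π = Σ ◦ σ to a vector; the function name and the list-based encoding of permutations are ours, not the paper's:

```python
# Sketch of the n-block permutation Pi = Sigma ∘ sigma: Sigma permutes whole
# blocks, the sigmas permute positions inside each block (illustrative only).

def apply_block_permutation(v, Sigma, sigmas, n):
    """Split v into N blocks of length n, move block Sigma[j] into slot j,
    then permute positions inside each block with sigmas[j]."""
    N = len(v) // n
    blocks = [v[i * n:(i + 1) * n] for i in range(N)]
    return [blocks[Sigma[j]][sigmas[j][i]] for j in range(N) for i in range(n)]

v = [1, 2, 3, 4, 5, 6]

# (6, 5, 4, 3, 2, 1) as a 2-block permutation on 3 blocks: reverse the block
# order and reverse inside each block.
Sigma = [2, 1, 0]           # block order reversed
swap2 = [[1, 0]] * 3        # swap the two positions inside each block
assert apply_block_permutation(v, Sigma, swap2, 2) == [6, 5, 4, 3, 2, 1]

# (3, 4, 5, 6, 1, 2) as a *constant* 2-block permutation: identity inside
# each block, so only the blocks move.
Sigma2 = [1, 2, 0]
id2 = [[0, 1]] * 3
assert apply_block_permutation(v, Sigma2, id2, 2) == [3, 4, 5, 6, 1, 2]
```

The two asserts reproduce exactly the two worked examples given after Definition 3.2.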

3.2 Difficult Problems in Coding Theory


Let us recall that a linear binary code C of length n and dimension k
is a vector subspace of dimension k of GF(2)^n . The weight of an element x
of GF(2)^n is the number of non-zero coordinates of x. The minimum distance
of a linear code is the minimum weight of any non-zero vector of the code. For
any code one can define the scalar product x.y = Σ_{i=1}^{n} x_i y_i . A generator
matrix G of a code is a matrix whose rows form a basis of the code; the dual of
a code C is defined by C^⊥ = {y ∈ GF(2)^n | x.y = 0, ∀x ∈ C}. Usually a
generator matrix of the dual of a code C is denoted by H. Remark that
x ∈ C ⇔ Hx^t = 0. For x ∈ GF(2)^n , the value Hx^t is called the syndrome of
x for H.
The usual hard problem considered in coding theory is the following Syndrome
Decoding (SD) problem, proven NP-complete in [3] in 1978.
Problem: (SD) Syndrome decoding of a random code:
Instance: An (n − k) × n random matrix H over GF(2), a non-null target vector
y ∈ GF(2)^{n−k} and an integer ω.
Question: Is there x ∈ GF(2)^n of weight ≤ ω such that Hx^t = y^t ?
This problem was used by Stern for his protocol [24], but in fact a few years
later a variation on this problem called the Minimum Distance (MD) problem
was also proven NP-complete in [27]:

Problem: (MD) Minimum Distance:


Instance: A binary (n − k) × n matrix H and an integer ω > 0.
Question: Is there a non-zero x ∈ GF(2)^n of weight ≤ ω such that Hx^t = 0 ?

It was remarked in [12] that this problem could also be used with Stern’s
scheme; the proof works exactly the same. Notice that the practical difficulty of
the SD and MD problems is the same: the difficulty of finding a word of small
weight in a random code. The intractability assumptions associated with
these problems are denoted the SD assumption and the MD assumption; see [25]
for a precise formal definition of the SD assumption related to the SD problem.
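The objects involved in the SD and MD problems are easy to illustrate on a toy code; the sketch below uses the [7, 4] Hamming code (the helper names are ours, and real cryptographic parameters are of course far larger):

```python
# Toy illustration of syndromes over GF(2); H is a 3 x 7 parity-check matrix.

def syndrome(H, x):
    """Compute H x^t over GF(2): one parity bit per row of H."""
    return [sum(h_i * x_i for h_i, x_i in zip(row, x)) % 2 for row in H]

def weight(x):
    return sum(x)

# Parity-check matrix of the [7, 4] Hamming code (column j is j in binary).
H = [[1, 0, 1, 0, 1, 0, 1],
     [0, 1, 1, 0, 0, 1, 1],
     [0, 0, 0, 1, 1, 1, 1]]

c = [1, 1, 1, 0, 0, 0, 0]          # a codeword: null syndrome (the MD setting)
assert syndrome(H, c) == [0, 0, 0]

e = [0, 0, 0, 0, 1, 0, 0]          # a weight-1 error vector
y = [(a + b) % 2 for a, b in zip(c, e)]
# The syndrome of the noisy word depends only on the error (the SD setting:
# find a low-weight x with H x^t equal to a given target syndrome).
assert syndrome(H, y) == syndrome(H, e)
assert weight(e) <= 1
```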

3.3 Stern’s Authentication Scheme


This scheme was developed in 1993 (see [24]). It provides a zero-knowledge au-
thentication scheme not based on number theory problems. Let h be a hash
function and H a public random matrix of size (n − k) × n over F2 . Each user
receives a secret key s of n bits and of weight ω. A user's public identifier is the
secret key's syndrome iL = Hs^t . It is calculated once in the lifetime of H and can
thus be used for several future identifications. Let us suppose that L wants to
prove to V that he is indeed the person corresponding to the public identifier
iL . L has his own private key sL such that the public identifier is its syndrome
iL = Hs^t_L .
Our two protagonists run the following protocol :

1. [Commitment Step] L randomly chooses y ∈ F2^n and a permutation σ of
{1, 2, . . . , n}. Then L sends to V the commitments c1 , c2 and c3 such that:

c1 = h(σ|Hy^t ); c2 = h(σ(y)); c3 = h(σ(y ⊕ s))

where h(a|b) denotes the hash of the concatenation of the sequences a and b.
2. [Challenge Step] V sends b ∈ {0, 1, 2} to L.
3. [Answer Step] Three possibilities :
– if b = 0 : L reveals y and σ.
– if b = 1 : L reveals (y ⊕ s) and σ.
– if b = 2 : L reveals σ(y) and σ(s).
4. [Verification Step] Three possibilities :
– if b = 0 : V verifies that c1 , c2 have been honestly calculated.
– if b = 1 : V verifies that c1 , c3 have been honestly calculated.
– if b = 2 : V verifies that c2 , c3 have been honestly calculated, and that the
weight of σ(s) is ω.
5. Iterate the steps 1,2,3,4 until the expected security level is reached.

Fig. 1. Stern’s protocol

Remark 1. During the fourth step, when b equals 1, it can be noticed that Hy^t
derives directly from H(y ⊕ s)^t since we have:

Hy^t = H(y ⊕ s)^t ⊕ iL = H(y ⊕ s)^t ⊕ Hs^t .

It is proven in [24] that this scheme is a zero-knowledge, Fiat-Shamir-like
scheme with a cheating probability of 2/3 (rather than 1/2 for Fiat-Shamir).

Remark 2. In [12] the authors propose a variation on the scheme by taking
the secret key to be a small-weight word of the code associated with H; the
scheme then relies on the Minimum Distance problem MD defined in the previous
section. This results in exactly the same protocol except that, as the secret key
is a codeword with null syndrome, the public key is no longer the pair formed by
the matrix H and the syndrome, but only the matrix H. The protocol remains
zero-knowledge with the same features. The problem of finding a small-weight
codeword in a code has the same type of complexity as the syndrome decoding
problem (and is also NP-complete).
The only drawback of this point of view is that it ties the secret key to the
matrix H, but in our case we will be able to take advantage of that.

4 Our Threshold Ring Signature Scheme


In this section, we describe a new efficient threshold ring identification scheme
based on coding theory. This scheme is a generalization of Stern's scheme. Fur-
thermore, by applying the Fiat-Shamir heuristic [11] to our threshold ring iden-
tification scheme, we immediately get a t-out-of-N threshold ring signature whose
size is in O(N ).

4.1 High-Level Overview


Consider a ring of N members (P1 , · · · , PN ) and among them t users who want
to prove that they have been cooperating to produce a ring signature. Each user
Pi computes a public matrix Hi of (n − k) × n bits. A user's public key consists
of the public matrix Hi and an integer ω (common to all public keys). The
associated secret key is si , a word of weight ω of the code Ci associated with the
dual of Hi .
The general idea of our protocol is that each of the t signers performs by
himself an instance of Stern’s scheme using matrix Hi and a null syndrome
as parameters (as in the scheme’s variation proposed in [12]). The results are
collected by a leader L among the signers in order to form, with the addition of
the simulation of the N − t non-signers, a new interactive Stern protocol with
the verifier V . The master public matrix H is created as the direct sum of the
ring members’ public matrices. Eventually, the prover P , formed by the set of t

Fig. 2. Threshold ring signature scheme in the case where the t signers are P1 , · · · , Pt
and the leader L = P1 , for a group of N members
8 C. Aguilar Melchor, P.-L. Cayrel, and P. Gaborit

signers among N (see Fig 2 ), proves (by a slightly modified Stern’s scheme - one
adds a condition on the form of the permutation) to the verifier V that he knows
a codeword s of weight tω with a particular structure:s has a null syndrome for
H and a special form on its N blocks of length n: each block of length has
weight 0 or ω. In fact this particular type of word can only be obtained by a
cooperation processus between t members of the ring. Eventually the complexity
is hence the cost of N times the cost of a Stern authentication for a single prover
(the multiplication factor obtained on the length of the matrix H used in the
protocol) and this for any value of t.
Besides the combination of two Stern protocols (one done individually by each
signer Pi with the leader, and one slightly modified done by the leader with the
verifier), our scheme relies on the three following main ideas:

1. The master public key H is obtained as the direct sum of all the public
matrices Hi of each of the N users.
2. Indistinguishability among the members of the ring is obtained, first, by
taking a common syndrome value for all the members of the ring (the null syn-
drome) and, second, by taking secret keys si with the same weight ω (a public
value) associated with the public matrices Hi .
3. Permutation constraint: a constraint is added in Stern's scheme on the type
of permutation used: instead of an arbitrary permutation of size N n we use an
n-block permutation on N blocks, which guarantees that the prover knows a word
with a special structure, one which can only be obtained by the interaction of t
signers.
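As a toy illustration of the permutation constraint (point 3), the following Python sketch (the encoding of vectors as flat lists of N·n bits and the names Sigma/sigma_i are our own choices) builds a random n-block permutation and shows that it preserves the multiset of block weights:

```python
import random

def random_n_block_permutation(N, n, rng=random):
    """Sample a permutation of N*n positions that first permutes the N
    blocks (the block permutation Sigma) and then applies an independent
    permutation sigma_i inside each block -- never mixing two blocks."""
    Sigma = rng.sample(range(N), N)                       # where block b is sent
    sigmas = [rng.sample(range(n), n) for _ in range(N)]  # in-block shuffles
    def apply(v):
        out = [0] * (N * n)
        for b in range(N):
            for j in range(n):
                out[Sigma[b] * n + sigmas[b][j]] = v[b * n + j]
        return out
    return apply

# A word whose blocks each have weight 0 or w keeps that structure under
# any n-block permutation: blocks are moved and shuffled, never mixed.
N, n = 3, 5
Pi = random_n_block_permutation(N, n)
v = [1, 1, 0, 0, 0,  0, 0, 0, 0, 0,  1, 0, 1, 0, 0]   # block weights 2, 0, 2
w = Pi(v)
block_weights = sorted(sum(w[b * n:(b + 1) * n]) for b in range(N))
```

This is exactly the property the verifier exploits in the b = 2 check: an arbitrary permutation would destroy the block-weight pattern, an n-block permutation cannot.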

4.2 Setup
The Setup algorithm is run to obtain the values of the parameters l, n, k, t, ω:
l is the security parameter, n and n − k the matrix parameters, ω the weight of
the secret keys si , and t the number of signers. This algorithm also creates a
public database pk1 , . . . , pkN (here the matrices Hi ). Remark that the
parameters n, k and ω are fixed once and for all, and that any new user knowing
these public parameters can join the ring. The parameter t only has to be
specified at the beginning of the protocol.
The matrices Hi are constructed in the following way: choose a random vector
si of weight ω, generate k − 1 random vectors and consider the code Ci
generated by these k words (the operation can be reiterated until the dimension
is exactly k). The matrix Hi is then an (n − k) × n generator matrix of the dual
code of Ci . Remark that this construction leads to a rather large public matrix
Hi ; we consider in Section 6 an interesting variation of the construction.

4.3 Make-GPK
Each user owns an (n − k) × n matrix Hi (public) and an n-vector si (secret) of
small weight ω (public) such that

Hi sti = 0.
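A minimal GF(2) sketch of this key relation (our own shortcut, not the dual-code construction of Sect. 4.2: we simply patch random rows so that they are orthogonal to a chosen weight-ω secret):

```python
import random

def toy_keypair(n, k, omega, seed=0):
    """Toy GF(2) sketch: draw a weight-omega secret s_i, then build an
    (n-k) x n matrix H_i with H_i s_i^t = 0 by flipping one support
    coordinate of any row that is not orthogonal to s_i.  (Sect. 4.2
    instead takes H_i as a generator matrix of the dual of a random code
    containing s_i; this shortcut only illustrates the key relation.)"""
    rng = random.Random(seed)
    support = rng.sample(range(n), omega)
    s = [1 if j in set(support) else 0 for j in range(n)]
    H = []
    for _ in range(n - k):
        row = [rng.randint(0, 1) for _ in range(n)]
        if sum(row[j] for j in support) % 2:   # <row, s> = 1 over GF(2)
            row[support[0]] ^= 1               # flip once to force <row, s> = 0
        H.append(row)
    return H, s

H, s = toy_keypair(n=12, k=6, omega=4)
syndrome = [sum(r * x for r, x in zip(row, s)) % 2 for row in H]  # H s^t
```

After this, `syndrome` is the all-zero vector, i.e. Hi sti = 0 holds for the toy pair.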
A New Efficient Threshold Ring Signature Scheme Based on Coding Theory 9

1. [Commitment Step]
– Each of the signers chooses yi ∈ Fn 2 randomly and a random permutation σi of
{1, 2, . . . , n} and sends to L the commitments c1,i , c2,i and c3,i such that :

c1,i = h(σi |Hi yit ); c2,i = h(σi (yi )); c3,i = h(σi (yi ⊕ si ))

where h(a1 | · · · |aj ) denotes the hash of the concatenation of the sequence formed by
a1 , · · · , a j .
– L fixes the secrets si of the N − t missing users at 0 and computes the N − t
corresponding commitments by choosing random yi ∈ Fn2 and random permutations σi
of {1, 2, . . . , n} (t + 1 ≤ i ≤ N ).
– L chooses a random constant n-block permutation Σ on N blocks {1, · · · , N } in
order to obtain the master commitments:

C1 = h(Σ|c1,1 | . . . |c1,N ), C2 = h(Σ(c2,1 , . . . , c2,N )), C3 = h(Σ(c3,1 , . . . , c3,N )).

– L sends C1 , C2 and C3 to V .
2. [Challenge Step] V sends a challenge b ∈ {0, 1, 2} to L which sends b to the t signers.
3. [Answer Step] Let Pi be one of the t signers. The first part of the step is between each
signer and L.
– Three possibilities :
• if b = 0 : Pi reveals yi and σi .
• if b = 1 : Pi reveals (yi ⊕ si ) (denoted by (y ⊕ s)i ) and σi .
• if b = 2 : Pi reveals σi (yi ) (denoted by (σ(y))i ) and σi (si ) (denoted by (σ(s))i ).
– L simulates the N − t other Stern protocols with si = 0 and t + 1 ≤ i ≤ N .
– L computes the answer for V (and sends it):
• if b = 0 : L constructs y = (y1 , · · · , yN ) and Π = Σ ◦ σ (for σ = (σ1 , · · · , σN ))
and reveals y and Π.
• if b = 1 : L constructs y ⊕ s = ((y ⊕ s)1 , · · · , (y ⊕ s)N ) and reveals y ⊕ s and Π.
• if b = 2 : L constructs Π(y) and Π(s) and reveals them.
4. [Verification Step] Three possibilities :
– if b = 0 : V verifies that Π is an n-block permutation and that C1 , C2 have been
honestly calculated.
– if b = 1 : V verifies that Π is an n-block permutation and that C1 , C3 have been
honestly calculated.
– if b = 2 : V verifies that C2 , C3 have been honestly calculated, and that the weight
of Π(s) is tω and that Π(s) is formed of N blocks of length n and of weight ω or 0.
5. Iterate the steps 1,2,3,4 until the expected security level is reached.

Fig. 3. Generalized Stern’s protocol

The problem of finding si of weight ω is an instance of the MD problem defined
earlier.

The t signers choose a leader L among them, who sends the set of public matrices
H1 , · · · , HN .

Remark: In order to simplify the description of the protocol (and to avoid double
indexes), we consider in the following that the t signers correspond to the first t
matrices Hi (1 ≤ i ≤ t), although more generally their order can be considered
random in {1, . . . , N } since it depends on the order of the N matrices sent
by the leader.

Construction of a public key for the ring


The RPK (Ring Public Key) is constructed by considering the matrix H described
as follows:

H = \begin{pmatrix}
H_1 & 0 & \cdots & 0 \\
0 & H_2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & H_N
\end{pmatrix} .

H, ω and the Hi , ∀i ∈ {1, . . . , N } are public. The si , ∀i ∈ {1, . . . , N } are private.
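A small Python sketch of the direct-sum assembly above and of the syndrome check for a structured word (the toy matrices and sizes are our own):

```python
def direct_sum(blocks):
    """Block-diagonal (direct sum) assembly of the ring public key H
    from the members' matrices H_i."""
    rows = sum(len(B) for B in blocks)
    cols = sum(len(B[0]) for B in blocks)
    H = [[0] * cols for _ in range(rows)]
    r = c = 0
    for B in blocks:
        for i, row in enumerate(B):
            H[r + i][c:c + len(row)] = row
        r, c = r + len(B), c + len(B[0])
    return H

H1 = [[1, 1, 0], [0, 1, 1]]          # toy H_i: H1 * (1,1,1)^t = 0 over GF(2)
H = direct_sum([H1, H1, H1])         # N = 3 members (same toy matrix reused)
s = [1, 1, 1,  0, 0, 0,  1, 1, 1]    # t = 2 signer blocks, one zero block
syndrome = [sum(h * x for h, x in zip(row, s)) % 2 for row in H]
```

Because H is block diagonal, the syndrome of s splits into the N local syndromes Hi sti , so the structured word with t nonzero blocks and N − t zero blocks has a null syndrome for H.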

4.4 Commitment-Challenge-Answer and Verification Steps


We now describe formally our scheme.
The leader L collects the commitments from the t − 1 other signers, simulates
the N − t non-signers and chooses a random n-block permutation Σ on N blocks.
From all these commitments L creates the master commitments C1 , C2 and C3 ,
which are sent to the verifier V , who answers with a challenge b in {0, 1, 2}.
Then L sends the challenge to each of the other t − 1 signers and collects their
answers to create a global answer for V . Upon reception of the global answer,
V verifies that it is correct by checking the commitments as in the regular
Stern scheme.
All the details of the protocol are given in Fig. 3. Recall that, in order to
avoid complex double indexes in the description of the protocol, we considered
that the t signers correspond to the first t matrices Hi .
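A toy Python version of one signer's commitment step from Fig. 3, with SHA-256 standing in for the hash h and a repr-based byte encoding of our own choosing:

```python
import hashlib
import random

def h(*parts):
    """h(a_1 | ... | a_j): hash of the concatenated encodings (encoding is ours)."""
    dgst = hashlib.sha256()
    for p in parts:
        dgst.update(repr(p).encode() + b"|")
    return dgst.hexdigest()

# One signer's commitments c1, c2, c3 for a toy instance, following Fig. 3.
random.seed(1)
n = 6
H_i = [[random.randint(0, 1) for _ in range(n)] for _ in range(3)]
s_i = [1, 0, 1, 0, 0, 0]                       # toy secret of weight 2
y_i = [random.randint(0, 1) for _ in range(n)]
sigma_i = random.sample(range(n), n)           # random permutation of {0..n-1}

Hy = [sum(a * b for a, b in zip(row, y_i)) % 2 for row in H_i]
c1 = h(sigma_i, Hy)                            # h(sigma_i | H_i y_i^t)
c2 = h([y_i[j] for j in sigma_i])              # h(sigma_i(y_i))
c3 = h([y_i[j] ^ s_i[j] for j in sigma_i])     # h(sigma_i(y_i XOR s_i))
```

The leader then combines the N triples (c1,i , c2,i , c3,i ), together with the block permutation Σ, into the master commitments C1 , C2 , C3 .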

5 Security
5.1 Our Security Model
The security of our protocol relies on the two notions of unforgeability and
anonymity, which we prove secure under the Minimum Distance problem assumption
in the random oracle model.
To prove the first notion we show that our protocol is an Honest-Verifier Zero-
Knowledge (HVZK) proof of knowledge. It has been proven in [11] that every
HVZK protocol can be turned into a signature scheme by setting the challenge
to the hash value of the commitment together with the message to be signed. Such
a scheme has been proven secure against existential forgery under adaptively
chosen message attacks in the random oracle model in [19].
The second notion of anonymity for our scheme in a threshold context is
defined as follows:
Definition 5.1 (Threshold ring signature anonymity). Let R = {Rk (·, ·)}
be a family of threshold ring signature schemes.
We denote by SIG ← S(G, M, Rk ) a random choice among the signatures of a
t-user group G on a message M using the ring signature scheme Rk .

R is said to be anonymous if for any c > 0, there is a K such that for any
k > K, any two different subgroups G1 , G2 of t users, any message M and any
polynomial-size probabilistic circuit family C = {Ck (·, ·)},

P r(Ck (SIG, G1 , G2 , P(k)) = G | SIG ← S(G, M, Rk )) < 1/2 + k −c

G being randomly chosen among {G1 , G2 }, and P(k) being the set of all the
public information about the ring signature scheme.

5.2 Security of Our Scheme


We first prove that our scheme is HVZK with a probability of cheating of 2/3.
We begin with a simple lemma.

Lemma 1. Finding a vector v of length nN such that the global weight of v is


tω, the weight of v for each of the N blocks of length n is 0 or ω and such that
v has a null syndrome for H, is hard under the MD assumption.

Proof. The particular structure of H (a direct sum of the Hi , all of the same
length n) implies that finding such an n-block vector of length nN is exactly
equivalent to finding, for each nonzero block, a solution si of weight ω to the
local hard problem Hi sti = 0, which is not possible under our assumption.

Theorem 1. Our scheme is a proof of knowledge, with a probability of cheating


2/3, that the group of signers P knows a vector v of length nN such that the
global weight of v is tω, the weight of v for each of the N blocks of length n is
0 or ω and such that v has a null syndrome for H. The scheme is secure under
the MD assumption in the random oracle model.

Proof. (sketch) We need to prove the usual three properties of completeness,
soundness and zero-knowledge. Completeness is straightforward: for instance, for
b = 0, the knowledge of y and Π makes it possible to recover Σ, the σi and the
yi , so that the verifier can recover all the commitments c1,i and c2,i and
hence the master commitments C1 and C2 . The cases b = 1 and b = 2 work the same
way. The proofs of soundness and zero-knowledge follow the original proof of
Stern in [25] for the problem defined in the previous lemma, by remarking that
the structure of our generalized protocol mirrors the original structure of the
protocol, with Σ in Fig. 3 playing the role of σ in Fig. 1, and with the fact
that for the answers b = 0 and b = 1 one checks in the protocol that the
permutation Π is an n-block permutation on N blocks.

Remark. No information can be leaked between signers during the protocol either,
since each signer only gives information to L (for instance) as in a regular
Stern scheme, which is zero-knowledge.

We now consider the anonymity of our protocol. The idea of the proof is that if
an adversary had the possibility to obtain more information on who, among the N
potential signers, is a signer and who is not, it would mean in our case that
the adversary is able to know, with probability better than 2/3, whether a block
si of s = (s1 , · · · , sN ) of size n, among the N such blocks associated to
the created common secret s, is completely zero or not. But since we saw that
our protocol is zero-knowledge, being based on a light modification of the Stern
protocol, it would mean that the adversary is able to get information on the
secret s during the interaction between L and V , which is not possible since
the protocol is zero-knowledge. Formally we obtain:

Theorem 2. Our protocol satisfies threshold ring signature anonymity.

Proof. Suppose that for a given M , a given c > 0 and two given subgroups G1 , G2
of t users there is a family of circuits C = {Ck (·, ·)} such that for any K there
is a k > K such that

P r(Ck (SIG, G1 , G2 , P(k)) = G | SIG ← S(G, M, Rk )) > 1/2 + k −c .

Consider a user Pi ∈ G1 such that Pi ∉ G2 (such a user exists as the groups
are different), and the following circuit: whenever the circuit Ck outputs G1 ,
output that the i-th (out of N ) block of size n of the secret s associated to
the matrix H is not null; whenever the circuit Ck outputs G2 , output that this
block is null. Such a circuit guesses with non-negligible advantage whether a
part of the secret s associated to the ring key matrix H is null or not, and
therefore breaks the zero-knowledge property of the protocol.

5.3 Practical Security of Stern’s Scheme from [24]


The security of Stern's scheme relies on three properties of random linear codes:
1. Random linear codes satisfy a Gilbert-Varshamov type lower bound [16],
2. For large n almost all linear codes lie on the Gilbert-Varshamov bound
[18],
3. Solving the syndrome decoding problem for random codes is NP-complete
[3].
In practice, Stern proposed in [24] to use rate-1/2 codes with ω just below the
Gilbert-Varshamov bound associated to the code. For such codes the exponential
cost of the best known attack [6] is ≈ O(n) · \binom{n}{ω}/\binom{n-k}{ω},
which for today's security level (2^80 ) gives a code with n = 634, rate 1/2 and
ω = 69.
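The work factor can be checked numerically for the proposed parameters (the exact size of the polynomial factor is our interpretation):

```python
from math import comb, log2

# Work factor of the best known decoding attack, up to the polynomial
# factor: binom(n, w) / binom(n - k, w), for a rate-1/2 code with w just
# below the Gilbert-Varshamov bound.
n, k, w = 634, 317, 69
bits = log2(comb(n, w)) - log2(comb(n - k, w))
# bits comes out in the mid-70s; together with the O(n) polynomial factor
# this lands close to the targeted 2^80 security level.
```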

6 An Interesting Variation of the Scheme Based on Double-Circulant Matrices
In Section 4 we described a way to create the public matrices Hi ; this method,
as in the original Stern paper, leads to a large public key size of n^2 /2 bits
per Hi . It was recently proposed in [12] to use double-circulant random
matrices rather than purely random matrices. A double-circulant matrix is a
matrix of the form Hi = (I|C) for C a random n/2 × n/2 cyclic matrix and I the
identity matrix. Following this idea one can construct the matrices Hi as
follows: consider si = (a|b), where a and b are random vectors of length n/2
and weight ≈ ω/2, then consider the matrix (A|B) obtained for A and B the
square (n/2 × n/2) matrices formed by the n/2 cyclic shifts of a and b
respectively (each row of A or B is a shift of the previous row, beginning with
first row a or b).
Now consider the code Gi generated by the matrix (A|B); the matrix Hi can
then be taken as Hi = (I|C) such that Hi is a dual matrix of Gi . The matrix C
is cyclic since A and B are cyclic, and hence can be described with only its
first row. It is explained in [12] that this construction does not decrease the
difficulty of decoding but dramatically decreases the size of the description
of Hi : n/2 bits against n^2 /2.
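A short Python sketch of the circulant structure (the toy first row is ours; in the scheme the blocks are n/2 × n/2):

```python
def circulant(first_row):
    """All cyclic shifts of first_row: an m x m circulant block is fully
    described by its m-bit first row (n/2 bits here, versus n^2/2 bits
    for a purely random H_i)."""
    m = len(first_row)
    return [first_row[-i:] + first_row[:-i] for i in range(m)]

a = [1, 0, 1, 0]      # toy first row (weight ~ w/2 each in the scheme)
A = circulant(a)      # each row is the cyclic shift of the previous one
```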
It is then possible to define a new problem:

Problem: (MD-DC) Minimum Distance of Double circulant codes:


Instance: A binary n/2 × n double circulant matrix H and an integer ω > 0.
Question: Is there a non zero x ∈ GF (2)n of weight ≤ ω, such that Hxt = 0 ?

It is not known whether this problem is NP-complete or not, but it is probably
as hard as the MD problem, and from a practical point of view (see [12] for
details) the security against the best known attacks is almost the same as for
the MD problem. In practice the authors of [12] propose n = 347.
All the security proofs we considered in this paper can also be adapted to the
MD-DC problem, since the generalized Stern protocol we introduced works with
any kind of Hi associated with the same type of problem, namely knowing a
small-weight vector associated to Hi (in fact only the problem assumption
changes).

7 Length and Complexity


In this section we examine the complexity of our protocol and compare it to
other protocols.

7.1 The Case t = 1


This case corresponds to a classical ring signature scheme. Our scheme is then
not so attractive in terms of signature length, since the length is linear in N ,
more precisely ≈ 20 kB × N (20 kB being the length of one Stern signature).
Meanwhile, since the Stern protocol is fast, our protocol is faster than all
other protocols for N = 2 or 3, which may have some applications.

7.2 The General Case


Signature length
It is straightforward to see that the signature length of our protocol is in
O(N ), more precisely ≈ 20 kB × N , for 20 kB the length of one signature by the
Fiat-Shamir paradigm applied to the Stern scheme (a security of 2^−80 is obtained

by 140 repetitions of the protocol). For instance, consider a particular example
with N = 100 and t = 50: we obtain a 2 MB signature length, which is quite
large but still tractable. Of course, other number theory based protocols like [4]
or [15] have shorter signature lengths (8 kB or 25 kB) but are slower.

Public key size


If we use the double-circulant construction described in Section 6, we obtain a
public key size of 347N bits, which is a factor 2 or 3 better than [15] and of
the same order as [4].

Complexity of the protocol


The cost of the protocol is N times the cost of one Stern signature protocol,
hence in O(N ) (more precisely about 140n^2 N operations), and this for any t,
whereas all other fully anonymous threshold ring signature protocols have a
complexity in O(tN ) operations (multiplications or modular exponentiations in
large integer rings, or pairings). Hence on that particular point our algorithm
is faster than other protocols.

8 Conclusion
In this paper we presented a new (fully anonymous) t-out-of-N threshold ring
signature scheme based on coding theory. Our protocol is a very natural general-
ization of the Stern authentication scheme and our proof is based on the original
proof of Stern. We showed that the notion of vector weight fits particularly
well in the context of ring signatures, since the notion of an ad hoc group
corresponds well to the notion of a direct sum of generator matrices and is
compatible with the notion of a sum of vectors of small weight. Eventually we
obtain a fully anonymous protocol based on a proof of knowledge in the random
oracle model. Our protocol is the first non-generic protocol based on coding
theory and (as usual for code-based protocols) is very fast compared to other
number theory based protocols.
Moreover, the protocol we described can easily be generalized to the case of a
general access scenario. Eventually, the fact that our construction is based not
on number theory but on coding theory may represent an interesting alternative.
We hope this work will enhance the potential of coding theory in public key
cryptography.

References
1. Abe, M., Ohkubo, M., Suzuki, K.: 1-out-of-N signatures from a variety of keys. In:
Zheng, Y. (ed.) ASIACRYPT 2002. LNCS, vol. 2501. Springer, Heidelberg (2002)
2. Bender, A., Katz, J., Morselli, R.: Ring Signatures: Stronger Definitions, and Con-
structions Without Random Oracles. In: Halevi, S., Rabin, T. (eds.) TCC 2006.
LNCS, vol. 3876, pp. 60–79. Springer, Heidelberg (2006)
3. Berlekamp, E., McEliece, R., van Tilborg, H.: On the inherent intractability of cer-
tain coding problems. IEEE Transactions on Information Theory IT-24(3) (1978)

4. Boyen, X.: Mesh Signatures. In: Naor, M. (ed.) EUROCRYPT 2007. LNCS,
vol. 4515, pp. 210–227. Springer, Heidelberg (2007)
5. Bresson, E., Stern, J., Szydlo, M.: Threshold ring signatures and applications to
ad-hoc groups. In: Yung, M. (ed.) CRYPTO 2002. LNCS, vol. 2442. Springer,
Heidelberg (2002)
6. Canteaut, A., Chabaud, F.: A new algorithm for finding minimum-weight words
in a linear code: application to primitive narrow-sense BCH codes of length 511.
IEEE Transactions on Information Theory IT-44, 367–378 (1998)
7. Chandran, N., Groth, J., Sahai, A.: Ring signatures of sub-linear size without
random oracles. In: Arge, L., Cachin, C., Jurdziński, T., Tarlecki, A. (eds.) ICALP
2007. LNCS, vol. 4596, pp. 423–434. Springer, Heidelberg (2007)
8. Chaum, D., van Heyst, E.: Group signatures. In: Davies, D.W. (ed.) EUROCRYPT
1991. LNCS, vol. 547, pp. 257–265. Springer, Heidelberg (1991)
9. Courtois, N., Finiasz, M., Sendrier, N.: How to achieve a McEliece based digital
signature scheme. In: Boyd, C. (ed.) ASIACRYPT 2001. LNCS, vol. 2248. Springer,
Heidelberg (2001)
10. Dodis, Y., Kiayias, A., Nicolosi, A., Shoup, V.: Anonymous identification in ad-
hoc groups. In: Cachin, C., Camenisch, J.L. (eds.) EUROCRYPT 2004. LNCS,
vol. 3027. Springer, Heidelberg (2004)
11. Fiat, A., Shamir, A.: How to Prove Yourself: Practical Solutions to Identification
and Signature Problems. In: Odlyzko, A.M. (ed.) CRYPTO 1986. LNCS, vol. 263,
pp. 186–194. Springer, Heidelberg (1987)
12. Gaborit, P., Girault, M.: Lightweight code-based authentication and signature.
In: ISIT 2007 (2007)
13. Herranz, J., Saez, G.: Forking lemmas for ring signature schemes. In: Johansson,
T., Maitra, S. (eds.) INDOCRYPT 2003. LNCS, vol. 2904, pp. 266–279. Springer,
Heidelberg (2003)
14. Kuwakado, H., Tanaka, H.: Threshold Ring Signature Scheme Based on the Curve.
Transactions of Information Processing Society of Japan 44(8), 2146–2154 (2003)
15. Liu, J.K., Wei, V.K., Wong, D.S.: A Separable Threshold Ring Signature Scheme.
In: Lim, J.-I., Lee, D.-H. (eds.) ICISC 2003. LNCS, vol. 2971, pp. 352–369. Springer,
Heidelberg (2004)
16. MacWilliams, F.J., Sloane, N.J.A.: The Theory of Error Correcting Codes. North-
Holland, Amsterdam (1977)
17. Naor, M.: Deniable Ring Authentication. In: Yung, M. (ed.) CRYPTO 2002. LNCS,
vol. 2442, pp. 481–498. Springer, Heidelberg (2002)
18. Pierce, J.N.: Limit distributions of the minimum distance of random linear codes.
IEEE Trans. Inf. theory IT-13, 595–599 (1967)
19. Pointcheval, D., Stern, J.: Security proofs for signature schemes. In: Maurer, U.M.
(ed.) EUROCRYPT 1996. LNCS, vol. 1070, pp. 387–398. Springer, Heidelberg
(1996)
20. Rivest, R.L., Shamir, A., Tauman, Y.: How to leak a secret. In: Boyd, C. (ed.)
ASIACRYPT 2001. LNCS, vol. 2248, pp. 552–565. Springer, Heidelberg (2001)
21. Sendrier, N.: Cryptosystèmes à clé publique basés sur les codes correcteurs
d’erreurs, Mémoire d’habilitation, Inria 2002 (2002),
[Link]
22. Shacham, H., Waters, B.: Efficient Ring Signatures without Random Oracles. In:
Okamoto, T., Wang, X. (eds.) PKC 2007. LNCS, vol. 4450, pp. 166–180. Springer,
Heidelberg (2007)
23. Shamir, A.: How to share a secret. Com. of the ACM 22(11), 612–613 (1979)

24. Stern, J.: A new identification scheme based on syndrome decoding. In: Stinson,
D.R. (ed.) CRYPTO 1993. LNCS, vol. 773. Springer, Heidelberg (1994)
25. Stern, J.: A new paradigm for public key identification. IEEE Transactions on
Information Theory 42(6), 2757–2768 (1996),
[Link]
26. Tsang, P.P., Wei, V.K., Chan, T.K., Au, M.H., Liu, J.K., Wong, D.S.: Separa-
ble Linkable Threshold Ring Signatures. In: Canteaut, A., Viswanathan, K. (eds.)
INDOCRYPT 2004. LNCS, vol. 3348, pp. 384–398. Springer, Heidelberg (2004)
27. Vardy, A.: The intractability of computing the minimum distance of a code. IEEE
Transactions on Information Theory 43(6), 1757–1766 (1997)
28. Véron, P.: A fast identification scheme. In: Proceedings of IEEE International Sym-
posium on Information Theory 1995, Whistler, Canada (September 1995)
29. Wong, D.S., Fung, K., Liu, J.K., Wei, V.K.: On the RSCode Construction of Ring
Signature Schemes and a Threshold Setting of RST. In: Qing, S., Gollmann, D.,
Zhou, J. (eds.) ICICS 2003. LNCS, vol. 2836, pp. 34–46. Springer, Heidelberg (2003)
30. Xu, J., Zhang, Z., Feng, D.: A ring signature scheme using bilinear pairings. In:
Lim, C.H., Yung, M. (eds.) WISA 2004. LNCS, vol. 3325. Springer, Heidelberg
(2005)
31. Zhang, F., Kim, K.: ID-Based Blind Signature and Ring Signature from Pairings.
In: Zheng, Y. (ed.) ASIACRYPT 2002. LNCS, vol. 2501. Springer, Heidelberg
(2002)
32. Zheng, D., Li, X., Chen, K.: Code-based Ring Signature Scheme. International
Journal of Network Security 5(2), 154–157 (2007),
[Link]
[Link]
Square-Vinegar Signature Scheme

John Baena1,2 , Crystal Clough1 , and Jintai Ding1


1
Department of Mathematical Sciences,
University of Cincinnati,
Cincinnati, OH, 45220, USA
{baenagjb,cloughcl}@[Link],ding@[Link]
[Link]
2
Department of Mathematics,
National University of Colombia,
Medellin, Colombia

Abstract. We explore ideas for speeding up HFE-based signature


schemes. In particular, we propose an HFEv− system with odd char-
acteristic and a secret map of degree 2. Changing the characteristic of
the system has a profound effect, which we attempt to explain and also
demonstrate through experiment. We discuss known attacks which could
possibly topple such systems, especially algebraic attacks. After testing
the resilience of these schemes against F4, we suggest parameters that
yield acceptable security levels.

Keywords: Multivariate Cryptography, HFEv− , Signature Scheme,


Odd Characteristic.

1 Introduction
Multivariate public-key cryptosystems (MPKCs) stand among the systems
thought to have the potential to resist quantum computer attacks [4]. This is
because their main security assumption is based on the problem of solving a
system of multivariate polynomial equations, a problem which is believed to be
as hard for a quantum computer to solve as for a conventional one [12,22].
The area of multivariate public-key cryptography essentially began in 1988
with an encryption scheme proposed by Matsumoto and Imai [17]. This system
has since been broken [19], but has inspired many new encryption and signature
schemes. One of these is HFE (Hidden Field Equations), proposed in 1996 by
Patarin [20].
An HFE scheme could still be secure, but the parameters required would
make it so inefficient as to be practically unusable. Many variants of HFE have
been proposed and analyzed, in particular one called HFEv− , a signature scheme
which combines HFE with another system called Oil-Vinegar and also uses the
“−” construction. More about HFEv− in Sect. 2.2. A recent proposal is Quartz,
a signature scheme with HFEv− at its core. Quartz-7m, with slightly differ-
ent parameter choices, is believed secure. These schemes have enticingly short
signatures.

J. Buchmann and J. Ding (Eds.): PQCrypto 2008, LNCS 5299, pp. 17–30, 2008.
c Springer-Verlag Berlin Heidelberg 2008

However, the problem with HFE-based signature schemes is that until now,
they were quite slow. In this paper, we study how some simple but very surprising
changes to existing ideas can yield a system with much faster signing and key
generation at the same security levels as other HFE-based signature schemes. In
particular, we set out to make an HFEv− system with similarly short signatures
and greater efficiency in the form of fast signing times.
This paper is organized as follows. In Sect. 2, we discuss relevant background
on HFE and Quartz systems. In Sect. 3, we introduce the new variant Square-
Vinegar, providing a theoretical overview along with explicit constructions and
experimental data. In Sect. 4, known attacks are addressed and more experi-
mental results presented. Additional data can be found in the appendix.

2 Hidden Field Equations and Quartz


2.1 The Basic HFE Scheme
Let k be a finite field of size q and K a degree n extension field of k. In the original
design, the characteristic of k is 2. K can be seen as an n-dimensional vector
space over k and therefore we can identify K and k n by the usual isomorphism
ϕ : K → k n and its inverse. HFE makes use of an internal secret map F : K → K
defined by  
i j i
F (X) = aij X q +q + bi X q + c , (1)
0≤i<j<n 0≤i<n
qi +qj ≤D qi ≤D

where the coefficients aij , bi , c are randomly chosen from K and D is a fixed
positive integer. A map of this form is often referred to as an HFE map.
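As a toy illustration (the 3-bit integer encoding and the sample coefficients are our own choices), an HFE-shaped map of the form (1) can be evaluated over GF(2^3) with q = 2 and D = 3:

```python
# Toy evaluation over GF(2^3) = GF(2)[x]/(x^3 + x + 1), field elements
# encoded as 3-bit integers.
MOD = 0b1011  # x^3 + x + 1, irreducible over GF(2)

def gf_mul(a, b):
    """Carry-less multiplication in GF(8), reducing modulo x^3 + x + 1."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0b1000:          # degree reached 3: reduce
            a ^= MOD
    return r

def gf_pow(a, e):
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r

# With q = 2 and D = 3, only the exponents 3 = q^0 + q^1, 1 = q^0 and
# 2 = q^1 survive the degree bound in (1):
a01, b0, b1, c = 0b101, 0b011, 0b110, 0b001   # arbitrary coefficients in GF(8)

def F(X):
    return gf_mul(a01, gf_pow(X, 3)) ^ gf_mul(b0, X) ^ gf_mul(b1, gf_pow(X, 2)) ^ c
```

Since X ↦ X^{q^i} is the (GF(2)-linear) Frobenius map, each term X^{q^i+q^j} contributes only quadratic multivariate polynomials over k, which is what makes the public key quadratic.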
By composing F with ϕ and its inverse we obtain the set of n quadratic
multivariate polynomials F̃ = ϕ ◦ F ◦ ϕ−1 : k n → k n . Then we hide the
structure of this map by means of two invertible affine linear transformations
S, T : k n → k n . The public key is the set of quadratic multivariate polynomials
(g1 , g2 , . . . , gn ) = T ◦ F̃ ◦ S. The private key consists of the map F and the affine
linear transformations S and T .
In such a scheme the most delicate matter is the choice of the total degree D
of F . D cannot be too large, since decryption (or signing) involves solving
the equation F (X) = Y ′ for a given Y ′ ∈ K using the Berlekamp algorithm, a
process whose complexity is determined by D. However, this total degree cannot
be too small either, in order to avoid algebraic attacks like the one developed
by Kipnis and Shamir [15] and the Gröbner bases (GB) attack [9].

2.2 HFE Variants


There are several variations of this construction intended to enhance the security
of HFE, among which we find the HFE− [23] and HFEv [20] signature schemes.
HFE− is the signature scheme obtained from HFE in which we omit r of
the polynomials g1 , g2 , . . . , gn from the public key. The intent of doing this is to
eliminate the possibility of certain attacks, in particular algebraic and Kipnis-
Shamir attacks, provided the number r is not too small.

HFEv is a combination of HFE and the Unbalanced Oil & Vinegar scheme
[14,21]. The main idea of HFEv is to add a small number v of new variables,
referred to as the vinegar variables, to HFE. This makes the system somewhat
more complicated and changes the structure of the private map. In this case we
replace the map F with a more complicated map G : K × k v → K.
We can combine HFE− and HFEv to obtain the so called HFEv− signature
scheme. In this scheme, r polynomials are kept secret and v additional variables
are introduced.
Quartz is an HFEv− signature scheme with a special choice of the parameters,
which are k = F2 , n = 103, D = 129, r = 3 and v = 4 [24,25]. These parameters
of Quartz have been chosen in order to produce very short signatures: only 128
bits. This makes Quartz especially suitable for very specific applications in
which short signatures are required, like RFID. Quartz was proposed to NESSIE
[18], but it was rejected, perhaps because its parameters were not chosen
conservatively enough. In 2003 Faugère and Joux stated in [9] that the published
version of Quartz could be broken using Gröbner bases with slightly fewer than
2^80 computations.
At present time two modified versions of Quartz are thought to be secure,
based on the estimations of [9] on Quartz. The first one, called Quartz-513d, has
parameters k = F2 , n = 103, D = 513, r = 3 and v = 4. The second version,
Quartz-7m, has parameters k = F2 , n = 103, D = 129, r = 7 and v = 0. In
these versions the high degree D makes the signing process very slow. In fact
Quartz-513d was considered impractical for this reason, even as it was proposed.

3 The Square-Vinegar Scheme


We now propose a way to build a fast and highly secure short signature cryp-
tosystem, using the ideas of the HFEv− signature scheme and the new idea of
using finite fields of odd characteristic. With a new choice of parameters we gain
computational efficiency without risking the security of the signature scheme.
From now on we call these Square-Vinegar schemes. Signatures are still short,
which is very convenient to implement in small devices.

3.1 Overview of the New Idea


The set up is basically the same as in the HFEv− signature scheme. As mentioned
above, we replace the map F with the more complicated map G : K × k v → K
defined by
G(X, X_v) = \sum_{\substack{0 \le i < j < n \\ q^i + q^j \le D}} a_{ij}\, X^{q^i + q^j}
\;+\; \sum_{\substack{0 \le i < n \\ q^i \le D}} \beta_i(X_v)\, X^{q^i}
\;+\; \gamma(X_v) ,
\qquad (2)

where the coefficients aij are randomly chosen from K, γ : k v → K is a randomly


chosen quadratic map, βi : k v → K are randomly chosen affine linear maps, and

Xv = (x1 , . . . , xv ) represents the new vinegar variables. More precisely the maps
βi and γ are of the form

\beta_i(X_v) = \sum_{1 \le j \le v} \xi_{i,j}\, x_j + \nu_i , \qquad
\gamma(X_v) = \sum_{1 \le j < l \le v} \eta_{j,l}\, x_j x_l
+ \sum_{1 \le j \le v} \sigma_j\, x_j + \tau ,

where ξi,j , νi , ηj,l , σj and τ are randomly chosen from K. As in HFE, by
composing G with ϕ and its inverse we obtain a set of n quadratic multivariate
polynomials. Then we compose with two invertible affine transformations
T : k^n → k^n and S : k^{n+v} → k^{n+v} , obtaining the polynomials
(g1 , g2 , . . . , gn ) = T ◦ ϕ ◦ G ◦ (ϕ^{−1} × id) ◦ S. Finally, we remove the
last r of these polynomials. The public key is the set of quadratic multivariate
polynomials (g1 , g2 , . . . , gn−r ) : k^{n+v} → k^{n−r} . The private key
consists of the map G and the affine transformations S and T .
While the setup is the same, we make some significant changes. First of all,
we will use a field k of odd characteristic. The benefits of working in an odd
characteristic are discussed in [5] and will be summarized below in Sect. 4. After
making this change, we studied the effect of changing of D and v in order to find
the most efficient values. The motivation was that by using the proper number
of vinegar variables, we could use a smaller degree D and hence considerably
speed up the signing process with the same security level.
With this in mind, we conducted experiments to determine new secure values
for D and v. Much to our surprise, in all of our experiments we found that
D = 2 is sufficiently secure when the field is of odd characteristic, as we will
see in Sect. 4. This makes the signature scheme much faster, as we will see in
Sect. 3.2.

3.2 The Signing Process


Although HFE is perfectly suitable for encryption and digital signatures, the
map F defined by (1) is usually not a surjection. However, in the case of Square-
Vinegar schemes, for every different set of vinegar variables we usually obtain a
totally different quadratic polynomial in X, which increases the probability of
finding a signature for a given document. Actually, in our experiments we were
always able to find a signature.
To sign a given document (ỹ1, . . . , ỹn−r) ∈ k^{n−r}, we start by randomly choosing
r elements ỹn−r+1, . . . , ỹn ∈ k to complete a vector in k^n. Next, we randomly
choose values (w1, . . . , wv) ∈ k^v for the vinegar variables Xv, and then solve for
X the equation

G(X, (w1, . . . , wv)) = ϕ^{−1}(T^{−1}(ỹ1, . . . , ỹn−r, ỹn−r+1, . . . , ỹn)) .   (3)

If this equation has no solutions, a new choice of vinegar variables is made


yielding a new equation to be solved. We continue in this manner until we find
Square-Vinegar Signature Scheme 21

Table 1. Number of tries to sign a document

q D n v r Average number of trials to sign


2 129 103 4 3 1.74
2 2 103 4 3 2.26
13 2 27 3 0 1.85
13 2 28 3 1 1.80
13 2 36 4 3 1.88
31 2 31 4 3 2.09

Table 2. Signing times for some HFEv− systems

q D n v r Number of documents tried Average time to sign


2 129 103 4 3 100 2.646 s
2 2 103 4 3 100 0.166 s
13 2 27 3 0 100 0.024 s
13 2 28 3 1 100 0.026 s
13 2 36 4 3 100 0.034 s
31 2 31 4 3 100 0.041 s

a choice of vinegar variables whose associated equation in X has a solution. The


probability of finding a suitable selection of vinegar variables in a few trials is
high, as our computer experiments confirm (see Table 1 below). The experiments
were run in MAGMA 2.14 on a Dell computer running Windows XP, with an
Intel(R) Pentium(R) D 3.00 GHz CPU and 2.00 GB of memory.
In each case 100 different random documents were signed. We observed that,
on average, two tries would be enough to find a solution for that equation.
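The roughly-two-trials behavior of Table 1 can be reproduced independently with a small simulation (our own sketch, not the authors' MAGMA code). For D = 2 and odd q, each choice of vinegar values turns (3) into a univariate quadratic, and a random quadratic over GF(q) has a root slightly more than half the time, so the number of trials behaves roughly geometrically with mean near 2.

```python
import random

def has_root_quadratic(a, b, c, p):
    """Does a*x^2 + b*x + c = 0 have a root in GF(p), p an odd prime?"""
    if a % p == 0:                        # degenerate: linear or constant
        return b % p != 0 or c % p == 0
    disc = (b * b - 4 * a * c) % p
    # Euler's criterion: nonzero disc is a square mod p iff disc^((p-1)/2) = 1.
    return disc == 0 or pow(disc, (p - 1) // 2, p) == 1

def average_trials(p, experiments=2000, seed=2):
    """Average number of fresh random quadratics tried until one has a root."""
    rng = random.Random(seed)
    total = 0
    for _ in range(experiments):
        trials = 0
        while True:
            trials += 1
            a, b, c = (rng.randrange(p) for _ in range(3))
            if has_root_quadratic(a, b, c, p):
                break
        total += trials
    return total / experiments

print(average_trials(13), average_trials(31))   # both around 2, as in Table 1
```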
Now suppose that X̃ is a solution of (3); then a signature for the document
(ỹ1 , . . . , ỹn−r ) – actually for the whole vector (ỹ1 , . . . , ỹn ) – is given by

S^{−1}(ϕ(X̃), w1, . . . , wv) ∈ k^{n+v} .

As mentioned above, with our experiments we found that D = 2 suffices as


the degree of the secret map G; we will see more about this in Sect. 4. This is
undoubtedly a novel and surprising discovery, since in previous versions of
HFE and its modifications – all of which work in characteristic two – D was always
chosen conservatively, usually D > 128. These high values of D made the process
of signing very slow, since solving a univariate equation of such a large degree
is costly even with the fastest algorithms. On the
other hand, when D = 2, once the vinegar values have been set, (3) becomes
simply a quadratic equation over the field K. Berlekamp’s Algorithm can solve
a univariate quadratic equation rather quickly, and MAGMA’s implementation
automatically uses the Berlekamp-Zassenhaus algorithm when appropriate [2].
See Table 2 for signing times for several choices of parameters. Note that signing
for the q = 31 case shown is 65 times faster than using Quartz parameters.
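To make the D = 2 advantage concrete, here is a minimal root-finding sketch (ours, not the MAGMA routine used in the experiments). Once the vinegar values are fixed, signing reduces to a single univariate quadratic, solvable by the usual quadratic formula since the characteristic is odd. The brute-force square root below is acceptable only for toy field sizes; a real implementation would use Tonelli-Shanks.

```python
def sqrt_mod(a, p):
    """Brute-force square root in GF(p); returns None if a is a non-square."""
    a %= p
    for y in range(p):
        if (y * y) % p == a:
            return y
    return None

def solve_quadratic(a, b, c, p):
    """All roots of a*x^2 + b*x + c = 0 in GF(p), p an odd prime, a != 0 mod p."""
    inv2a = pow(2 * a, p - 2, p)          # (2a)^(-1) by Fermat's little theorem
    disc = (b * b - 4 * a * c) % p
    s = sqrt_mod(disc, p)
    if s is None:
        return []                          # no root: pick new vinegar values
    return sorted({(-b + s) * inv2a % p, (-b - s) * inv2a % p})

# x^2 - 5x + 6 = (x - 2)(x - 3) over GF(31)
print(solve_quadratic(1, -5, 6, 31))       # [2, 3]
```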

Another important consequence of using D = 2 is that generation of the
public key for this signature scheme is more efficient. We attribute this to the
large number of multiplications over the field K that are needed when D > 128.
Some results are summarized in Table 3 below.

Table 3. Public key generation times for some HFEv− systems

q D n v r Number of trials Average time


2 129 103 4 3 100 58.066 s
13 2 27 3 0 100 0.780 s
13 2 28 3 1 100 0.830 s
13 2 36 4 3 100 2.019 s
31 2 31 4 3 100 1.271 s

4 Security Analysis
In this section we will consider known attacks against MPKCs (Gröbner Basis,
Kipnis-Shamir, and Vinegar attacks) and discuss their effectiveness against our
new scheme. This will lead us to suggest parameter values for a viable Square-
Vinegar system.
Before considering the aforementioned attacks in detail, let us mention some
minor attacks. First, there do not yet seem to be any attacks against MPKCs
that utilize knowledge of plaintext-ciphertext (or document-signature) pairs. Secondly,
the recent attack on SFlash [8] does not apply here: that attack used hidden
symmetry and invariants of the SFlash public key to overcome the omission of
certain polynomials from the public key, but our public key has no such hidden
invariants or symmetry due to the presence of the vinegar variables. Also, the
attacks used against perturbed systems such as IPHFE [6,7] do not seem directly
applicable, especially considering the differences between even and odd
characteristic and between internal and external perturbation.

4.1 Gröbner Basis Attack


First let us recall what we mean by a Gröbner Basis Attack. Suppose that
someone, who does not know the private key, wants to forge a signature for
a given document (ỹ1, . . . , ỹn−r) ∈ k^{n−r}. This attacker has access only to the
public key (g1, g2, . . . , gn−r) : k^{n+v} → k^{n−r}. In order to find a valid signature
for the given document, the attacker has to solve the system of equations

    g1(x1, . . . , xn, x′1, . . . , x′v) − ỹ1 = 0
    g2(x1, . . . , xn, x′1, . . . , x′v) − ỹ2 = 0
        ...                                        (4)
    gn−r(x1, . . . , xn, x′1, . . . , x′v) − ỹn−r = 0.

Solving these equations directly, without the use of the internal structure of
the system, is known as the algebraic attack. Currently the most efficient al-
gebraic attacks are the Gröbner basis algorithms F4 [10] and F5 [11]. Another
algorithm called XL has also been widely discussed but F4 is seen to be more ef-
ficient [1], so we focused our energy on studying algebraic attacks via F4 . Among
the best implementations of these algorithms is the F4 function of MAGMA [2],
which represents the state of the art in polynomial solving technology.
In [9], algebraic attacks were used to break HFE. The results in that paper
seem to indicate that for any q, an HFE system with small D can be broken in
such a way. However, this is not the case and their claims only hold up when
working over characteristic 2.
Since the system (4) is underdetermined, we expect to find many solutions for
it. In order to forge a signature for the given document, it suffices to find only one
such solution. So we can guess values for some of the variables yielding a system
with the same number of equations but fewer variables, as was done in [3]. This
speeds up the attack significantly. Therefore we randomly guessed v + r of the
variables and then used the Gröbner basis attack to solve the resulting system
of n − r equations with n − r variables, which is faster to solve than (4).
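The guess-then-solve step can be sketched with a toy brute-force search standing in for F4 (our illustration; the field, map, and parameters below are made up). Fix random values for some of the variables, then exhaustively solve the remaining square system; if it has no solution, guess again.

```python
import itertools
import random

def forge_by_guessing(polys, targets, n_vars, n_guess, p, seed=3):
    """Toy guess-then-solve: repeatedly fix n_guess variables at random,
    then brute-force the remaining square system over GF(p)."""
    rng = random.Random(seed)
    n_free = n_vars - n_guess
    while True:
        guess = tuple(rng.randrange(p) for _ in range(n_guess))
        for free in itertools.product(range(p), repeat=n_free):
            x = free + guess              # guessed variables go last
            if polys(x) == targets:
                return x

# Tiny example over GF(5): two quadratic "public" polynomials in three
# variables, so we guess one variable and solve 2 equations in 2 unknowns.
p = 5
def public_map(x):
    x1, x2, x3 = x
    return ((x1 * x1 + x2 * x3 + 2 * x1) % p, (x2 * x2 + 3 * x1 * x3 + 4) % p)

target = public_map((1, 2, 3))            # values of a known valid signature
sig = forge_by_guessing(public_map, target, n_vars=3, n_guess=1, p=p)
assert public_map(sig) == target          # a forged preimage has been found
print(sig)
```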
Based on recent observations about MPKCs over odd characteristic [5], we
believe that the choices q = 13 or q = 31 provide a strong defense against an
algebraic attack via Gröbner bases. The key point in the case of odd characteristic
is that the field equations x_i^q − x_i for i = 1, 2, . . . , n + v appear to be less
useful to an attacker due to their higher degree. In particular, the efficiency of
the Gröbner basis attack seems to rely on small characteristic. It is stated in [5]
that this stems from the fact that the characteristic 2 field equations x_i^2 − x_i = 0
help to keep the degrees of the polynomials used in the Gröbner basis algorithm
low whereas, for example, x_i^13 − x_i = 0 or x_i^31 − x_i = 0 are much less useful
equations in that regard.


Extensive experiments were run to test this idea on the same computer that
was used for the signing experiments. For different sets of the parameters (q, D,
n, v and r), we generated HFEv− systems and used F4 to solve the system of
equations in (4) for different random documents.
We sought the lowest value of D for which F4 took an acceptably long time.
By extrapolating the data we could then determine what values of n and r should
be used and see if such values were practical. It turns out that D = 2 suffices
and we did not have to test higher values of D. Notice also that the choice of
odd characteristic is important, since in even characteristic X ↦ X^2 over K is
just a linear map, which cannot be used as a secret internal map.
Further examination of the data showed that with respect to v the attack time
hit a plateau at some point, and further increasing v did not appear to increase
resistance to the Gröbner basis attack. This behavior can be seen in Fig. 4 in
the Appendix. Extrapolating the data, we expect that for our choices of D, q, r
and n the plateau occurs before v = 4, so the choice v = 4 is optimal in this
sense.

Fig. 1. Running time and required memory under GB attack for q = 31, v = 4, r = 3
and D = 2. No field equations are used in the attack.

Fig. 2. Running time and required memory under GB attack for q = 31, v = 4, r = 3
and D = 2. Including the field equations in the attack.

As mentioned above, for our choices of q – 13 or 31 – the field equations
are of little use during the Gröbner basis attack. To confirm this, we ran
extensive experiments covering both situations, i.e., including and excluding
the field equations from the attack. In Figs. 1 and 2 we can see that, in either
case, the running time and the required memory under the Gröbner basis attack
are exponential in n (similar graphs for q = 13 can be seen in Figs. 5 and 6 in
the Appendix).
We can observe that when we include the field equations, the memory used
grows much faster than when we exclude them. This agrees with the explanation
above, and it is why we say that the field equations are useless for the GB attack.
In fact, the field equations not only require more memory but also slow down
the attack for large values of q, for instance q = 31. The extrapolations made
to suggest parameters in Sect. 4.4 take both cases into account, including and
excluding the field equations.
Another important feature that we observed when we excluded the field equa-
tions is that, for fixed n, v and r, we did not get any significant change in the

Fig. 3. Running time under GB attack for n = 9, v = 4, r = 0 and D = 2, for several
values of q. No field equations are used in the attack.

Table 4. Time comparison of some Square-Vinegar systems and random equations
under GB attack. q = 31, D = 2, v = 4, and r = 3.

n Our scheme Random equations


7 0.002 s 0.002 s
8 0.005 s 0.005 s
9 0.022 s 0.022 s
10 0.114 s 0.113 s
11 0.741 s 0.738 s
12 4.921 s 4.755 s
13 37.002 s 37.996 s
14 268.410 s 272.201 s

time required by the GB attack to forge a signature for large values of q, as seen
in Fig. 3. This also justifies the choices of q = 13 and q = 31, since increasing q
will not augment the security of the system.
We also constructed random polynomial equations of the same dimensions
(same q, n, v and r) and found that the time needed to solve such random
equations using Gröbner bases is essentially the same as is needed to break
Square-Vinegar with our choices of parameters. Table 4 shows these times for
different n.
As observed in the graphs, we could only obtain data for n up to 14, due to
memory limitations (any request above 1.2 GB would be immediately rejected
by the computer that we used). However, even among the data that we were able
to collect, we observed that as n increases, the maximum degree of polynomial
used by F4 also increases. Larger scale experiments are being conducted to study
systematically how fast this degree increases as n increases; these results will be
presented in a future paper.

From the information gathered with our experiments it appears that under our
choices of parameters, F4 is no more efficient in solving the public key equations
(4) of a Square-Vinegar scheme than a system of random equations.

4.2 Kipnis-Shamir Attack


Kipnis and Shamir developed an attack against HFE [15]. Their original claims
were questioned in [13], where it was shown that the Kipnis-Shamir attack was
less effective than originally thought and some arguments were made as to why
this should be so.
The original attack on HFE was translated to an attack on HFEv in [6]. The
resulting attack had a high complexity estimate even though the original, more
generous complexity estimates for the HFE attack were used in the computation.
Considering [13] and the fact that we are omitting r polynomials from the public
key, it seems that a Kipnis-Shamir style attack should not work against Square-
Vinegar.

4.3 Vinegar Attack


Since Square-Vinegar utilizes vinegar variables, a priori there is a possibility that
it is vulnerable to an attack similar to the one that felled the original Oil-Vinegar
scheme.
In the original Oil-Vinegar scheme, the core map k^n → k^n had a specific shape:
each component was a polynomial in which the “oil” variables appeared only
linearly, and thus had a quadratic form with a large block of zeros [14,21]. Upon
inspection of the attack, we realize that it exploits this property of the quadratic
forms [16]. In the Square-Vinegar construction, there are no variables which
appear only linearly. The map G ensures that x1, . . . , xn appear quadratically,
and the choice of γ ensures that the vinegar variables x′1, . . . , x′v appear
quadratically.
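To see concretely what the attack on the original scheme exploits, one can write an Oil-Vinegar polynomial as x M x^T and inspect the symmetric matrix M (our own toy illustration): its oil×oil block is identically zero, a structure the Square-Vinegar public polynomials avoid.

```python
import random

o, v, p = 3, 2, 31                 # toy numbers of oil/vinegar variables
rng = random.Random(4)

# Build the symmetric matrix M of a random Oil-Vinegar polynomial:
# quadratic terms are only oil*vinegar and vinegar*vinegar.
n = o + v                          # variable order: oil first, vinegar last
M = [[0] * n for _ in range(n)]
for i in range(n):
    for j in range(i, n):
        if i < o and j < o:        # oil*oil terms are forbidden by construction
            continue
        M[i][j] = M[j][i] = rng.randrange(p)

oil_block = [row[:o] for row in M[:o]]
print(oil_block)                   # [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
```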
Once a specific K is fixed (in other words, once a specific irreducible poly-
nomial is chosen to define the extension over k), certain blocks of the quadratic
forms of ϕ ◦ G ◦ ϕ−1 are predetermined, but nonzero and not even likely to be
sparse. It appears that an attacker would have to find a matrix that simultane-
ously converts the quadratic forms of all public key polynomials to the prescribed
forms. At present there does not seem to be any method to solve such a
problem.

4.4 Parameter Suggestions


Based on the analysis and results obtained throughout Sects. 3 and 4 we are able
to suggest new sets of parameters for HFEv− , which we call Square-Vinegar-31
and Square-Vinegar-13. Descriptions are as follows:
Square-Vinegar-31
– q = 31, D = 2, n = 31, v = 4 and r = 3.
– Size of the public key: 12 Kbytes.
– Length of the signature: 175 bits.

– Time needed to sign a message¹: 0.041 seconds on average.
– Time to verify a signature¹: less than 1 ms.
– Best known attack: more than 2^80 computations.
Square-Vinegar-13
– q = 13, D = 2, n = 36, v = 4 and r = 3.
– Size of the public key: 14 Kbytes.
– Length of the signature: 160 bits.
– Time needed to sign a message¹: 0.034 seconds on average.
– Time to verify a signature¹: less than 1 ms.
– Best known attack: more than 2^80 computations.
We would also like to propose parameters as toy challenges. The first challenge
is q = 13, n = 27, v = 3 and r = 0. The second challenge is q = 13, n = 28,
v = 3 and r = 1. We expect that with these parameter choices, an attack may
be practically possible.

5 Conclusion
In this paper we analyzed a new HFEv− system that seems to have great po-
tential. We showed that with relatively short signatures, Square-Vinegar can be
used to sign documents very fast. This was accomplished by working in an odd
characteristic and using a low-degree polynomial where previously a very high
degree was required. We performed computer experiments to test the security of
Square-Vinegar. We used algebraic attacks against smaller-scale systems to de-
termine proper q, D, n, r, and v values for plausible schemes. We also examined
other MPKC attacks and gave reasons why Square-Vinegar should be resistant
to them.
In the future we would like to have a better understanding of the apparent
benefit of odd characteristic. We will also, as mentioned above, study the re-
lationship between n and the polynomials used in GB attacks. In addition, we
will further study the effectiveness of attacks similar to those against perturbed
systems.

References
1. Ars, G., Faugère, J.-C., Imai, H., Kawazoe, M., Sugita, M.: Comparison Between
XL and Gröbner Basis Algorithms. In: Lee, P.J. (ed.) ASIACRYPT 2004. LNCS,
vol. 3329, pp. 338–353. Springer, Heidelberg (2004)
2. Computational Algebra Group, University of Sydney. The MAGMA computational
algebra system for algebra, number theory and geometry (2005),
[Link]
3. Courtois, N., Daum, M., Felke, P.: On the Security of HFE, HFEv- and Quartz. In:
Desmedt, Y.G. (ed.) PKC 2003. LNCS, vol. 2567, pp. 337–350. Springer, Heidelberg
(2002)
¹ On an Intel(R) Pentium(R) D CPU 3.00 GHz.

4. Ding, J., Gower, J.E., Schmidt, D.: Multivariate Public Key Cryptosystems.
Springer, Heidelberg (2006)
5. Ding, J., Schmidt, D., Werner, F.: Algebraic Attack on HFE Revisited. In: The
11th Information Security Conference, Taipei, Taiwan (September 2008)
6. Ding, J., Schmidt, D.: Cryptanalysis of HFEv and the Internal Perturbation of HFE
cryptosystems. In: Vaudenay, S. (ed.) PKC 2005. LNCS, vol. 3386, pp. 288–301.
Springer, Heidelberg (2005)
7. Dubois, V., Granboulan, L., Stern, J.: Cryptanalysis of HFE with Internal Pertur-
bation. In: Okamoto, T., Wang, X. (eds.) PKC 2007. LNCS, vol. 4450, pp. 249–265.
Springer, Heidelberg (2007)
8. Dubois, V., Fouque, P.-A., Shamir, A., Stern, J.: Practical Cryptanalysis of
SFLASH. In: Menezes, A. (ed.) CRYPTO 2007. LNCS, vol. 4622, pp. 1–12.
Springer, Heidelberg (2007)
9. Faugère, J.-C., Joux, A.: Algebraic cryptanalysis of hidden field equation (HFE)
cryptosystems using Gröbner bases. In: Boneh, D. (ed.) CRYPTO 2003. LNCS,
vol. 2729, pp. 44–60. Springer, Heidelberg (2003)
10. Faugère, J.-C.: A new efficient algorithm for computing Gröbner bases (F4 ). Journal
of Pure and Applied Algebra 139, 61–88 (1999)
11. Faugère, J.-C.: A new efficient algorithm for computing Gröbner bases without
reduction to zero (F5 ). In: International Symposium on Symbolic and Algebraic
Computation — ISSAC 2002, pp. 75–83. ACM Press, New York (2002)
12. Garey, M.R., Johnson, D.S.: Computers and Intractability – A Guide to the Theory
of NP-Completeness. W.H. Freeman and Company, New York (1979)
13. Jiang, X., Ding, J., Hu, L.: Kipnis-Shamir’s Attack on HFE Revisited. Cryptology
ePrint Archive, Report 2007/203, [Link]
14. Kipnis, A., Patarin, J., Goubin, L.: Unbalanced oil and vinegar signature schemes.
In: Stern, J. (ed.) EUROCRYPT 1999. LNCS, vol. 1592, pp. 206–222. Springer,
Heidelberg (1999)
15. Kipnis, A., Shamir, A.: Cryptanalysis of the HFE public key cryptosystem by
relinearization. In: Wiener, M. (ed.) CRYPTO 1999. LNCS, vol. 1666, pp. 19–30.
Springer, Heidelberg (1999)
16. Kipnis, A., Shamir, A.: Cryptanalysis of the Oil and Vinegar Signature Scheme.
In: Krawczyk, H. (ed.) CRYPTO 1998. LNCS, vol. 1462, pp. 257–267. Springer,
Heidelberg (1998)
17. Matsumoto, T., Imai, H.: Public quadratic polynomial-tuples for efficient signature
verification and message encryption. In: Günther, C.G. (ed.) EUROCRYPT 1988.
LNCS, vol. 330, pp. 419–453. Springer, Heidelberg (1988)
18. NESSIE: New European Schemes for Signatures, Integrity, and Encryption. Infor-
mation Society Technologies Programme of the European Commission (IST-1999-
12324), [Link]
19. Patarin, J.: Cryptanalysis of the Matsumoto and Imai public key scheme of Euro-
crypt 1988. In: Coppersmith, D. (ed.) CRYPTO 1995. LNCS, vol. 963, pp. 248–261.
Springer, Heidelberg (1995)
20. Patarin, J.: Hidden Field Equations (HFE) and Isomorphism of Polynomials (IP):
Two new families of asymmetric algorithms. In: Maurer, U. (ed.) EUROCRYPT
1996. LNCS, vol. 1070, pp. 33–48. Springer, Heidelberg (1996); extended Version,
[Link]
21. Patarin, J.: The Oil and Vinegar Signature Scheme. In: Dagstuhl Workshop on
Cryptography (September 1997)

22. Patarin, J., Goubin, L.: Trapdoor one-way permutations and multivariate polynomials.
In: Han, Y., Qing, S. (eds.) ICICS 1997. LNCS, vol. 1334, pp. 356–368.
Springer, Heidelberg (1997); extended Version,
[Link]

23. Patarin, J., Goubin, L., Courtois, N.: C−+ and HM: variations around two schemes
of T. Matsumoto and H. Imai. In: Ohta, K., Pei, D. (eds.) ASIACRYPT 1998.
LNCS, vol. 1514, pp. 35–50. Springer, Heidelberg (1998)
24. Patarin, J., Goubin, L., Courtois, N.: Quartz, 128-bit long digital signatures. In:
Naccache, D. (ed.) CT-RSA 2001. LNCS, vol. 2020, pp. 352–357. Springer, Heidel-
berg (2001)
25. Patarin, J., Goubin, L., Courtois, N.: Quartz, 128-bit long digital signatures. An
updated version of Quartz specification, pp. 357-359,
[Link]

Appendix: Some Additional Graphs

Fig. 4. Running time under GB attack for n = 13, r = 3 and D = 2, for several values
of v. No field equations are used in the attack.

Fig. 5. Running time and required memory under GB attack for q = 13, v = 4, r = 3
and D = 2. No field equations are used in the attack.

Fig. 6. Running time and required memory under GB attack for q = 13, v = 4, r = 3
and D = 2. Including the field equations in the attack.
Attacking and Defending
the McEliece Cryptosystem

Daniel J. Bernstein¹, Tanja Lange², and Christiane Peters²

¹ Department of Mathematics, Statistics, and Computer Science (M/C 249),
University of Illinois at Chicago, Chicago, IL 60607–7045, USA
djb@[Link]
² Department of Mathematics and Computer Science,
Technische Universiteit Eindhoven, P.O. Box 513, 5600 MB Eindhoven, Netherlands
tanja@[Link], [Link]@[Link]

Abstract. This paper presents several improvements to Stern's attack
on the McEliece cryptosystem and achieves results considerably better
than those of Canteaut et al. This paper shows that the system with the
originally proposed parameters can be broken in just 1400 days by a single
2.4 GHz Core 2 Quad CPU, or 7 days by a cluster of 200 CPUs. This attack
has been implemented and is now in progress.
This paper proposes new parameters for the McEliece and Niederre-
iter cryptosystems achieving standard levels of security against all known
attacks. The new parameters take account of the improved attack; the
recent introduction of list decoding for binary Goppa codes; and the pos-
sibility of choosing code lengths that are not a power of 2. The resulting
public-key sizes are considerably smaller than previous parameter choices
for the same level of security.

Keywords: McEliece cryptosystem, Stern attack, minimal weight code


word, list decoding binary Goppa codes, security analysis.

1 Introduction
The McEliece cryptosystem was proposed by McEliece in 1978 [10] and the
original version, using Goppa codes, remains unbroken. Quantum computers
do not seem to give any significant improvements in attacking code-based sys-
tems, beyond the generic improvements possible with Grover’s algorithm, and
so the McEliece encryption scheme is one of the interesting candidates for post-
quantum cryptography.
A drawback of the system is its comparatively large key size — in order to
hide the well-structured and efficiently decodable Goppa code in the public key,
the full generator matrix of the scrambled code needs to be published. Various
attempts to reduce the key size have used other codes, most notably codes over

Permanent ID of this document: 7868533f20f51f8d769be2aa464647c9. Date of this
document: 2008.08.07. This work has been supported in part by the National Science
Foundation under grant ITR–0716498.

J. Buchmann and J. Ding (Eds.): PQCrypto 2008, LNCS 5299, pp. 31–46, 2008.
© Springer-Verlag Berlin Heidelberg 2008

larger fields instead of subfield codes; but breaks of variants of the McEliece
system have left essentially only the original system as the strongest candidate.
The fastest known attacks on the original system are based on information set
decoding as implemented by Canteaut and Chabaud [4] and analyzed in greater
detail by Canteaut and Sendrier [5].
In this paper we reconsider attacks on the McEliece cryptosystem and present
improvements to Stern’s attack [17] (which predates the Canteaut–Chabaud at-
tack) and demonstrate that our new attack outperforms any previous ones. The
result is that an attack on the originally proposed parameters of the McEliece
cryptosystem is feasible on a moderate computer cluster. Canteaut and Sendrier
had already pointed out that the system does not hold up to current security
standards, but no actual attack had been carried out before. We have
implemented our new method and expect results soon.
On the defense side our paper proposes new parameters for the McEliece
cryptosystem, selected from a much wider range of parameters than have been
analyzed before. The codes we suggest are also suitable for the Niederreiter
cryptosystem [11], a variant of the McEliece cryptosystem. The new parameters
are designed to minimize public-key size while achieving 80-bit, 128-bit, or 256-
bit security against known attacks — and in particular our attack. (Of course, by
a similar computation, we can find parameters that minimize costs other than key
size.) These new parameters exploit the ability to choose code lengths that are
not powers of 2. They also exploit a recently introduced list-decoding algorithm
for binary Goppa codes — see [2]; list decoding allows senders to introduce more
errors into ciphertexts, leading to higher security with the same key size, or
alternatively the same security with lower key size.

2 Review of the McEliece Cryptosystem


McEliece in [10] introduced a public-key cryptosystem based on error-correcting
codes. The public key is a hidden generator matrix of a binary linear code of
length n and dimension k with error-correcting capability t. McEliece suggested
using classical binary Goppa codes. We will briefly describe the main properties
of these codes before describing the set-up of the cryptosystem.
Linear codes. A binary [n, k] code is a binary linear code of length n and
dimension k, i.e., a k-dimensional subspace of F_2^n. All codes considered in this
paper are binary.
The Hamming weight of an element c ∈ F_2^n is the number of nonzero entries
of c. The minimum distance of an [n, k] code C with k > 0 is the smallest
Hamming weight of any nonzero element of C.
A generator matrix of an [n, k] code C is a k × n matrix G such that C =
{xG : x ∈ F_2^k}. A parity-check matrix of an [n, k] code C is an (n − k) × n matrix
H such that C = {c ∈ F_2^n : Hc^T = 0}. Here c^T means the transpose of c; we
view elements of F_2^n as 1 × n matrices, so c^T is an n × 1 matrix.
A systematic generator matrix of an [n, k] code C is a generator matrix of the
form (I_k | Q) where I_k is the k × k identity matrix and Q is a k × (n − k) matrix.
The matrix H = (Q^T | I_{n−k}) is then a parity-check matrix for C. There might
not exist a systematic generator matrix for C, but there exists a systematic
generator matrix for an equivalent code obtained by permuting columns of C.
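The relation between a systematic generator matrix and its parity-check matrix can be checked on a toy example (our sketch; the [4, 2] code and its block Q are made up):

```python
def transpose(m):
    return [list(col) for col in zip(*m)]

def mat_mul_gf2(a, b):
    """Matrix product over F_2."""
    bt = transpose(b)
    return [[sum(x * y for x, y in zip(row, col)) % 2 for col in bt] for row in a]

k, n = 2, 4
Q = [[1, 1],
     [0, 1]]                                    # arbitrary k x (n-k) block

G = [[1, 0] + Q[0], [0, 1] + Q[1]]              # G = (I_k | Q), systematic
H = [transpose(Q)[0] + [1, 0],
     transpose(Q)[1] + [0, 1]]                  # H = (Q^T | I_{n-k})

# Every codeword xG satisfies H c^T = 0, i.e. G H^T = 0 over F_2.
print(mat_mul_gf2(G, transpose(H)))             # [[0, 0], [0, 0]]
```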
The classical decoding problem is to find the closest codeword x ∈ C to a
given y ∈ F_2^n, assuming that there is a unique closest codeword. Here close
means that the difference has small Hamming weight. Uniqueness is guaranteed
if there exists a codeword x whose distance from y is less than half the minimum
distance of C.
Classical Goppa codes. Fix a finite field F_{2^d}, a basis of F_{2^d} over F_2, and
a set of n distinct elements α1, . . . , αn in F_{2^d}. Fix an irreducible polynomial
g ∈ F_{2^d}[x] of degree t, where 2 ≤ t ≤ (n − 1)/d. Note that, like [15, page 151]
and unlike [10], we do not require n to be as large as 2^d.
The Goppa code Γ = Γ(α1, . . . , αn, g) consists of all elements c = (c1, . . . , cn)
in F_2^n satisfying

    Σ_{i=1}^{n} ci/(x − αi) = 0   in F_{2^d}[x]/g.
The dimension of Γ is at least n − td and typically is exactly n − td. For
cryptographic applications one assumes that the dimension is exactly n − td. The
td × n matrix

    H = ( 1/g(α1)          · · ·   1/g(αn)
          α1/g(α1)         · · ·   αn/g(αn)
          ...                      ...
          α1^{t−1}/g(α1)   · · ·   αn^{t−1}/g(αn) ),

where each element of F_{2^d} is viewed as a column of d elements of F_2 in the
specified basis of F_{2^d}, is a parity-check matrix of Γ.
The minimum distance of Γ is at least 2t+1. Patterson in [13] gave an efficient
algorithm to correct t errors.
The McEliece cryptosystem. The McEliece secret key consists of an n × n
permutation matrix P ; a nonsingular k × k matrix S; and a generator matrix
G for a Goppa code Γ (α1 , . . . , αn , g) of dimension k = n − td. The sizes n, k, t
are public system parameters, but α1 , . . . , αn , g, P, S are randomly generated
secrets. McEliece suggests in his original paper to choose a [1024, 524] classical
binary Goppa code Γ with irreducible polynomial g of degree t = 50.
The McEliece public key is the k × n matrix SGP .
McEliece encryption of a message m of length k: Compute mSGP and add
a random error vector e of weight t and length n. Send y = mSGP + e.
McEliece decryption: Compute yP^{−1} = mSG + eP^{−1}. Note that mSG is
a codeword in Γ, and that the permuted error vector eP^{−1} has weight t. Use
Patterson's algorithm to find mS and thereby m.
The Niederreiter cryptosystem. We also consider a variant of the McEliece
cryptosystem published by Niederreiter in [11]. Niederreiter’s system, with the
same Goppa codes used by McEliece, has the same security as McEliece’s system,
as shown in [9].

Niederreiter’s system differs from McEliece’s system in public-key structure,


encryption mechanism, and decryption mechanism. Beware that the specific sys-
tem in [11] also used different codes — Goppa codes were replaced by general-
ized Reed-Solomon codes — but generalized Reed-Solomon codes were broken by
Sidelnikov and Shestakov in 1992; see [16].
The Niederreiter secret key consists of an n × n permutation matrix P ; a
nonsingular (n − k) × (n − k) matrix S; and a parity-check matrix H for a Goppa
code Γ (α1 , . . . , αn , g) of dimension k = n − td. As before, the sizes n, k, t are
public system parameters, but α1 , . . . , αn , g, P, S are randomly generated secrets.
The Niederreiter public key is the (n − k) × n matrix SHP .
Niederreiter encryption of a message m of length n and weight t: Compute
and send y = SHP m^T.
Niederreiter decryption: By linear algebra find z such that Hz^T = S^{−1}y.
Then z − mP^T is a codeword in Γ. Apply Patterson's algorithm to find the
error vector mP^T and thereby m.
CCA2-secure variants. McEliece’s system as described above does not resist
chosen-ciphertext attacks; i.e., it does not achieve “IND-CCA2 security.” For
instance, encryption of the same message twice produces two different ciphertexts
which can be compared to find out the original message since it is highly unlikely
that errors were added in the same positions.
There are several suggestions to make the system CCA2-secure. Overviews can
be found in [6, Chapters 5–6] and [12]. All techniques share the idea of scram-
bling the message inputs. The aim is to destroy any relations of two dependent
messages which an adversary might be able to exploit.
If we secure McEliece encryption against chosen-ciphertext attacks then we
can use a systematic generator matrix as a public key. This reduces the public-
key size from kn bits to k(n − k) bits: it is sufficient to store the k × (n − k)
matrix Q described above. Similarly for Niederreiter’s system it suffices to store
the non-trivial part of the parity check matrix, reducing the public-key size from
(n − k)n bits to k(n − k) bits.
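The saving from a systematic public key is easy to quantify; as a quick check with McEliece's original [1024, 524] parameters:

```python
n, k = 1024, 524                     # McEliece's originally proposed parameters

full_key_bits = k * n                # full generator matrix
systematic_key_bits = k * (n - k)    # only the k x (n-k) block Q is stored

print(full_key_bits, systematic_key_bits)                         # 536576 262000
print(f"saving: {1 - systematic_key_bits / full_key_bits:.1%}")   # saving: 51.2%
```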

3 Review of the Stern Attack Algorithm

The most effective attack known against the McEliece and Niederreiter cryp-
tosystems is “information-set decoding.” There are actually many variants of
this attack. A simple form of the attack was introduced by McEliece in [10, Sec-
tion III]. Subsequent variants were introduced by Leon in [8], by Lee and Brickell
in [7], by Stern in [17], by van Tilburg in [18], by Canteaut and Chabanne in [3],
by Canteaut and Chabaud in [4], and by Canteaut and Sendrier in [5].
The new attack presented in Section 4 of this paper is most easily understood
as a variant of Stern’s attack. This section reviews Stern’s attack.
How to break McEliece and Niederreiter. Stern actually states an attack
on a different problem, namely the problem of finding a low-weight codeword.
However, as mentioned by Canteaut and Chabaud in [4, page 368], one can
Attacking and Defending the McEliece Cryptosystem 35

decode a linear code — and thus break the McEliece system — by finding a low-
weight codeword in a slightly larger code.
Specifically, if C is a length-n code over F2, and y ∈ F2^n has distance w from
a codeword c ∈ C, then y − c is a weight-w element of the code C + {0, y}.
Conversely, if C is a length-n code over F2 with minimum distance larger than
w, then a weight-w element e ∈ C + {0, y} cannot be in C, so it must be in
C + {y}; in other words, y − e is an element of C with distance w from y.
Recall that a McEliece ciphertext y ∈ F2^n is known to have distance t from
a unique closest codeword c in a code C that has minimum distance at least
2t + 1. The attacker knows the McEliece public key, a generator matrix for C,
and can simply append y to the list of generators to form a generator matrix for
C + {0, y}. The only weight-t codeword in C + {0, y} is y − c; by finding this
codeword the attacker finds c and easily solves for the plaintext.
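The reduction in the previous two paragraphs can be demonstrated on a toy example. In the following Python sketch the [7, 4] Hamming code (minimum distance 3, so t = 1) stands in for a Goppa code, and exhaustive enumeration stands in for a real low-weight-codeword algorithm:

```python
# Toy demonstration: plant a weight-t error, then recover it as the
# unique weight-t word in the extended code C + {0, y}.
G = [
    (1,0,0,0, 1,1,0),
    (0,1,0,0, 1,0,1),
    (0,0,1,0, 0,1,1),
    (0,0,0,1, 1,1,1),
]  # generator of the [7,4] Hamming code, minimum distance 3
n, t = 7, 1  # minimum distance 3 = 2t + 1, so t = 1 error is decodable

def add(u, v):
    return tuple(a ^ b for a, b in zip(u, v))

def span(rows):  # all F_2-linear combinations of the given rows
    words = [tuple([0] * n)]
    for r in rows:
        words += [add(w, r) for w in words]
    return words

c = add(G[0], G[2])                              # some codeword
e = tuple(1 if i == 5 else 0 for i in range(n))  # weight-t error
y = add(c, e)                                    # the "ciphertext"

# Append y as an extra generator to get the code C + {0, y}:
low = [w for w in span(G + [y]) if sum(w) == t]
assert low == [e]           # the unique weight-t word is the error
assert add(y, low[0]) == c  # which reveals the closest codeword
```

Because the minimum distance exceeds 2t, the weight-t word is guaranteed to be unique, exactly as argued above.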
Similar comments apply if the attacker is given a Niederreiter public key,
i.e., a parity-check matrix for C. By linear algebra the attacker quickly finds a
generator matrix for C; the attacker then proceeds as above. Similar comments
also apply if the attacker is given a Niederreiter ciphertext. By linear algebra the
attacker finds a word that, when multiplied by the parity-check matrix, produces
the specified ciphertext. The bottleneck in all of these attacks is finding the
weight-t codeword in C + {0, y}.
Beware that there is a slight inefficiency in the reduction from the decoding
problem to the problem of finding low-weight codewords: if C has dimension k
and y ∉ C then C + {0, y} has slightly larger dimension, namely k + 1. The user of
the low-weight-codeword algorithm knows that the generator y will participate
in the solution, but does not pass this information to the algorithm. In this paper
we focus on the low-weight-codeword problem for simplicity.

How to find low-weight words. Stern’s attack has two inputs: first, an integer
w ≥ 0; second, an (n − k) × n parity-check matrix H for an [n, k] code over F2 .
Other standard forms of an [n, k] code, such as a k × n generator matrix, are
easily converted to the parity-check form by linear algebra.
Stern randomly selects n − k out of the n columns of H. He selects a random
size-ℓ subset Z of those n − k columns; here ℓ is an algorithm parameter optimized
later. He partitions the remaining k columns into two sets X and Y by having
each column decide independently and uniformly to join X or to join Y .
Stern then searches, in a way discussed below, for codewords that have exactly
p nonzero bits in X, exactly p nonzero bits in Y , 0 nonzero bits in Z, and exactly
w − 2p nonzero bits in the remaining columns. Here p is another algorithm
parameter optimized later. If there are no such codewords, Stern starts with a
new selection of columns.
The search has three steps. First, Stern applies elementary row operations to
H so that the selected n − k columns become the identity matrix. This fails,
forcing the algorithm to restart, if the original (n − k) × (n − k) submatrix of H
is not invertible. Stern guarantees an invertible submatrix, avoiding the cost of
a restart, by choosing each column adaptively as a result of pivots in previous
columns. (In theory this adaptive choice could bias the choice of (X, Y, Z), as
36 D.J. Bernstein, T. Lange, and C. Peters

Stern points out, but the bias does not seem to have a noticeable effect on
performance.)
Second, now that this (n − k) × (n − k) submatrix of H is the identity matrix,
each of the selected n − k columns corresponds to a unique row, namely the row
where that column has a 1 in the submatrix. In particular, the set Z of ℓ columns
corresponds to a set of ℓ rows. For every size-p subset A of X, Stern computes
the sum (mod 2) of the columns in A for each of those ℓ rows, obtaining an ℓ-bit
vector π(A). Similarly, Stern computes π(B) for every size-p subset B of Y .
Third, for each collision π(A) = π(B), Stern computes the sum of the 2p
columns in A ∪ B. This sum is an (n − k)-bit vector. If the sum has weight
w − 2p, Stern obtains 0 by adding the corresponding w − 2p columns in the
(n − k) × (n − k) submatrix. Those w − 2p columns, together with A and B,
form a codeword of weight w.
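The three steps can be illustrated on a toy instance. In this Python sketch every parameter (the [16, 8] code, the matrix Q, the sets X, Y, Z, and p, ℓ, w) is an artificial choice for illustration only; a codeword with exactly p errors in X, p in Y, and none in Z is planted, and the collision search of steps 2 and 3 finds it. A real attack repeats this over many random column selections:

```python
from itertools import combinations

# Parity-check matrix H = [Q^T | I_8] for a toy [16, 8] binary code;
# each H-column is stored as an 8-bit mask over the parity rows, so
# the column for information position j is simply row j of Q.
Q = [0b10010110, 0b01010011, 0b01101001, 0b10101010,
     0b11000101, 0b00100110, 0b01011100, 0b11100011]

X, Y = [0, 1, 2, 3], [4, 5, 6, 7]   # split of the k information columns
Z = [0, 3]                          # ell = 2 of the selected rows
zmask = sum(1 << z for z in Z)
p, w = 2, 7                         # errors split p + p + (w - 2p)

def xor_cols(cols):
    s = 0
    for j in cols:
        s ^= Q[j]
    return s

# Step 2: ell-bit fingerprints pi(A), pi(B) restricted to the Z rows.
buckets = {}
for A in combinations(X, p):
    buckets.setdefault(xor_cols(A) & zmask, []).append(A)

# Step 3: for each collision pi(A) = pi(B), add the 2p full columns;
# weight w - 2p means the matching parity columns complete a codeword.
found = []
for B in combinations(Y, p):
    for A in buckets.get(xor_cols(B) & zmask, []):
        s = xor_cols(A + B)
        if bin(s).count("1") == w - 2 * p:
            found.append((A, B))

# The planted codeword (information support {0, 1, 4, 5}) is recovered:
assert ((0, 1), (4, 5)) in found
```

Note how only ℓ-bit fingerprints are compared in step 2; full (n − k)-bit column sums are computed only for the few colliding pairs.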

4 The New Attack


This section presents our new attack as the culmination of a series of improve-
ments that we have made to Stern’s attack. The reader is assumed to be familiar
with Stern’s algorithm; see the previous section.
As a result of these improvements, our attack speeds are considerably better
than the attack speeds reported by Canteaut, Chabaud, and Sendrier in [4] and
[5]. See the next two sections for concrete results and comparisons.
Reusing existing pivots. Each iteration of Stern’s algorithm selects n − k
columns of the parity-check matrix H and applies row operations — Gaussian
elimination — to reduce those columns to the (n − k) × (n − k) identity matrix.
Any parity-check matrix for the same code will produce the same results here.
In particular, instead of starting from the originally supplied parity-check ma-
trix, we start from the parity-check matrix produced in the previous iteration —
which, by construction, already has an (n − k) × (n − k) identity submatrix.
About (n − k)^2/n of the newly selected columns will match previously selected
columns, and are simply permuted into identity form with minimal effort, leaving
real work for only about n − k − (n − k)^2/n = (k/n)(n − k) of the columns.
Stern says that reduction involves about (1/2)(n − k)^3 + k(n − k)^2 bit
operations; for example, (3/16)n^3 bit operations for k = n/2. To understand this
formula, observe that the first column requires ≤ n − k reductions, each involving
≤ n − 1 additions (mod 2); the second column requires ≤ n − k reductions,
each involving ≤ n − 2 additions; and so on through the (n − k)th column,
which requires ≤ n − k reductions, each involving ≤ k additions; for a total of
(1/2)(n − k)^3 + (k − 1/2)(n − k)^2.
We improve the bit-operation count to k^2(n − k)(n − k − 1)(3n − k)/(4n^2):
for example, (5/128)n^2(n − 2) for k = n/2. Part of the improvement is from
eliminating the work for the first (n − k)^2/n columns. The other part is the
standard observation that the number of reductions in a typical column is only
about (n − k − 1)/2.
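Both operation counts can be checked symbolically for the k = n/2 special cases quoted above; a small sketch using exact rational arithmetic:

```python
from fractions import Fraction as F

def stern_elim_cost(n, k):
    # Stern's estimate: (1/2)(n-k)^3 + k(n-k)^2 bit operations
    return F(1, 2) * (n - k)**3 + k * (n - k)**2

def improved_elim_cost(n, k):
    # with reused pivots: k^2 (n-k)(n-k-1)(3n-k) / (4 n^2)
    return F(k**2 * (n - k) * (n - k - 1) * (3 * n - k), 4 * n**2)

n = 2048
k = n // 2
assert stern_elim_cost(n, k) == F(3, 16) * n**3
assert improved_elim_cost(n, k) == F(5, 128) * n**2 * (n - 2)
```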
Forcing more existing pivots. More generally, one can artificially reuse ex-
actly n − k − c column selections, and select the remaining c new columns ran-
domly from among the other k columns, where c is a new algorithm parameter.
Then only c columns need to be newly pivoted. Reducing c below (k/n)(n − k)
saves time correspondingly.
Beware, however, that smaller values of c introduce a dependence between
iterations and require more iterations before the algorithm finds the desired
weight-w word. See Section 5 for a detailed discussion of this effect.
The extreme case c = 1 has appeared before: it was used by Canteaut et al. in
[3, Algorithm 2], [4, Section II.B], and [5, Section 3]. This extreme case minimizes
the time for Gaussian elimination but maximizes the number of iterations of the
entire algorithm.
Illustrative example from the literature: Canteaut and Sendrier report in [5,
Table 2] that they need 9.85 · 10^11 iterations to handle n = 1024, k = 525,
w = 50 with their best parameters (p, ℓ) = (2, 18). Stern’s algorithm, with the
same (p, ℓ) = (2, 18), needs only 5.78 · 10^11 iterations. Note that these are not
the best parameters for Stern’s algorithm; the parameters p = 3 and ℓ = 28 are
considerably better.
Another illustrative example: Canteaut and Chabaud recommend (p, ℓ) =
(2, 20) for n = 2048, k = 1025, w = 112 in [4, Table 2]. These parameters use
5.067 · 10^29 iterations, whereas Stern’s algorithm with the same parameters uses
3.754 · 10^29 iterations.
Canteaut and Chabaud say that Gaussian elimination is the “most expensive
step” in previous attacks, justifying the switch to c = 1. We point out, however,
that this switch often loses speed compared to Stern’s original attack. For example,
Stern’s original attack (without reuse of existing pivots) uses only 2^124.06
bit operations for n = 2048, k = 1025, w = 112 with (p, ℓ) = (3, 31), beating the
algorithm by Canteaut et al.; in this case Gaussian elimination is only 22% of
the cost of each iteration.
Both c = 1, as used by Canteaut et al., and c = (k/n)(n − k), as used
(essentially) by Stern, are beaten by intermediate values of c. See Section 5 for
some examples of optimized choices of c.

Faster pivoting. Adding the first selected row to various other rows cancels
all remaining 1’s in the first selected column. Adding the second selected row to
various other rows then cancels all remaining 1’s in the second selected column.
It has frequently been observed — see, e.g., [1] — that there is an overlap of
work in these additions: about 25% of the rows will have both the first row and
the second row added. One can save half of the work in these rows by simply
precomputing the sum of the first row and the second row. The precomputation
involves at most one vector addition (and is free if the first selected column
originally began 1, 1).
More generally, suppose that we defer additions of r rows; here r is another
algorithm parameter. After precomputing all 2^r − 1 sums of nonempty subsets of
these rows, we can handle each remaining row with, on average, 1 − 1/2^r vector
additions, rather than r/2 vector additions. For example, after precomputing
15 sums of nonempty subsets of 4 rows, we can handle each remaining row
with, on average, 0.9375 vector additions, rather than 2 vector additions; the
precomputation in this case uses at most 11 vector additions. The optimal choice
of r is roughly lg(n − k) − lg lg(n − k) but interacts with the optimal choice of c.
See [14] for a much more thorough optimization of subset-sum computations.
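The arithmetic behind this trade-off is easy to check; a small sketch (the function names are ours):

```python
# Average vector additions per remaining row when r pivot rows are
# deferred: each row needs the XOR of some subset of the r rows,
# which costs one addition unless the subset is empty, hence
# 1 - 1/2^r additions on average (vs. r/2 one pivot at a time).
def avg_additions(r):
    return 1 - 1 / 2**r

# Building the table of all 2^r - 1 nonempty subset sums: the r
# singletons are free, and every other entry is one addition to a
# previously computed entry, so 2^r - 1 - r additions suffice.
def precomputation_additions(r):
    return 2**r - 1 - r

assert avg_additions(4) == 0.9375
assert precomputation_additions(4) == 11
assert avg_additions(4) < 4 / 2  # clearly beats plain pivoting for r = 4
```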
Multiple choices of Z. Recall that Stern’s algorithm finds a particular weight-
w word if that word has exactly p, p, 0 errors in the column sets X, Y, Z respec-
tively. We generalize Stern’s algorithm to allow m disjoint sets Z1 , Z2 , . . . , Zm
with the same X, Y , each of Z1, Z2, . . . , Zm having cardinality ℓ; here m ≥ 1 is
another algorithm parameter.
The cost of this generalization is an m-fold increase in the time spent in the
second and third steps of the algorithm — but the first step, the initial Gaussian
elimination, depends only on X, Y and is done only once. The benefit of this
generalization is that the chance of finding any particular weight-w word grows
by a factor of nearly m.
For example, if (n, k, w) = (1024, 525, 50) and (p, ℓ) = (3, 29), then one set
Z1 works with probability approximately 6.336%, while two disjoint sets Z1 , Z2
work with probability approximately 12.338%. Switching from one set to two
produces a 1.947× increase in effectiveness at the expense of replacing steps
1, 2, 3 by steps 1, 2, 3, 2, 3. This is worthwhile if step 1, Gaussian elimination, is
more than about 5% of the original computation.
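These percentages can be reproduced with the inclusion-exclusion formula made explicit in Section 5; a sketch (the asserted ranges loosely bracket the values quoted above):

```python
from math import comb

def avoid_prob(n, k, w, p, ell, m):
    # Probability that the w - 2p remaining errors avoid at least one
    # of m disjoint size-ell sets Z_i, by inclusion-exclusion.
    r = n - k
    miss = w - 2 * p
    total = 0.0
    for j in range(1, m + 1):
        if j * ell > r - miss:
            break
        q = comb(r - miss, j * ell) / comb(r, j * ell)
        total += (-1)**(j + 1) * comb(m, j) * q
    return total

n, k, w, p, ell = 1024, 525, 50, 3, 29
p1 = avoid_prob(n, k, w, p, ell, 1)   # the text reports about 6.336%
p2 = avoid_prob(n, k, w, p, ell, 2)   # the text reports about 12.338%
assert 0.05 < p1 < 0.08
assert 1.85 < p2 / p1 < 2.0           # roughly the 1.947x quoted above
```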
Reusing additions of the ℓ-bit vectors. The second step of Stern’s algorithm
considers all p-element subsets A of X and all p-element subsets B of Y , and
computes ℓ-bit sums π(A), π(B). Stern says that this takes $2\ell p\binom{k/2}{p}$ bit operations
for average-size X, Y . Similarly, Canteaut et al. say that there are $\binom{k/2}{p}$
choices of A and $\binom{k/2}{p}$ choices of B, each using $p\ell$ bit operations.
We comment that, although computing π(A) means p − 1 additions of ℓ-bit
vectors, usually p − 2 of those additions were carried out before. Simple caching
thus reduces the average cost of computing π(A) to only marginally more than
ℓ bit operations for each A. This improvement becomes increasingly important
as p grows.
Faster additions after collisions. The third step of Stern’s algorithm, for the
pairs (A, B) with π(A) = π(B), adds all the columns in A ∪ B.
We point out that, as above, many of these additions overlap. We further
point out that it is rarely necessary to compute all of the rows of the result.
After computing 2(w − 2p + 1) rows one already has, on average, w − 2p + 1
errors; in general, as soon as the number of errors exceeds w − 2p, one can safely
abort this pair (A, B).

5 Attack Optimization and Comparison


Canteaut, Chabaud, and Sendrier announced ten years ago that the original parameters
for McEliece’s cryptosystem were not acceptably secure: specifically, an
attacker can decode 50 errors in a [1024, 524] code over F2 in 2^64.1 bit operations.
Choosing parameters p = 2, m = 2, ℓ = 20, c = 7, and r = 7 in our new attack
shows that the same computation can be done in only 2^60.55 bit operations,
almost a 12× improvement over Canteaut et al. The number of iterations drops
from 9.85 · 10^11 to 4.21 · 10^11, and the number of bit operations per iteration
drops from 20 · 10^6 to 4 · 10^6. As discussed in Section 6, we have achieved even
larger speedups in software.
The rest of this section explains how we computed the number of iterations
used by our attack, and then presents similar results for many more sizes [n, k].
Analysis of the number of iterations. Our parameter optimization relies on
being able to quickly and accurately compute the average number of iterations
required for our attack.
It is easy to understand the success chance of one iteration of the attack:

• The probability of a weight-w word having exactly w − 2p errors in a uniform
random set of n − k columns is $\binom{w}{2p}\binom{n-w}{k-2p}\big/\binom{n}{k}$. The actual selection
of columns is adaptive and thus not exactly uniform, but as mentioned in
Section 3 this bias appears to be negligible; we have tried many attacks with
small w and found no significant deviation from uniformity.
• The conditional probability of the 2p errors splitting as p, p between X, Y
is $\binom{2p}{p}\big/2^{2p}$. Instead of having each column decide independently whether or
not to join X, we actually make a uniform random selection of exactly k/2
columns for X, replacing $\binom{2p}{p}\big/2^{2p}$ with $\binom{k/2}{p}\binom{k/2}{p}\big/\binom{k}{2p}$, but this is only
a slight change.
• The conditional probability of the remaining w − 2p errors avoiding Z,
a uniform random selection of ℓ out of the remaining n − k columns, is
$\binom{n-k-(w-2p)}{\ell}\big/\binom{n-k}{\ell}$. As discussed in Section 4, we increase this chance by
allowing disjoint sets Z1, Z2, . . . , Zm; the conditional probability of w − 2p
errors avoiding at least one of Z1, Z2, . . . , Zm is
$$m\frac{\binom{n-k-(w-2p)}{\ell}}{\binom{n-k}{\ell}} - \binom{m}{2}\frac{\binom{n-k-(w-2p)}{2\ell}}{\binom{n-k}{2\ell}} + \binom{m}{3}\frac{\binom{n-k-(w-2p)}{3\ell}}{\binom{n-k}{3\ell}} - \cdots$$
by the inclusion-exclusion principle.

The product of these probabilities is the chance that the first iteration succeeds.
If iterations were independent, as in Stern’s original attack, then the average
number of iterations would be simply the reciprocal of the product of the prob-
abilities. But iterations are not, in fact, independent. The difficulty is that the
number of errors in the selected n − k columns is correlated with the number of
errors in the columns selected in the next iteration. This is most obvious in the
extreme case c = 1 considered by Canteaut et al.: swapping one selected column
for one deselected column is quite likely to preserve the number of errors in the
selected columns. The effect decreases in magnitude as c increases, but iterations
also become slower as c increases; optimal selection of c requires understanding
how c affects the number of iterations.
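As a baseline, if iterations were independent (as in Stern's original attack) the expected iteration count would simply be the reciprocal of the product of the three probabilities above. A sketch, using the exact k//2 and k − k//2 split for X and Y:

```python
from math import comb

def iteration_success_prob(n, k, w, p, ell, m):
    # Product of the three probabilities listed above.
    r = n - k
    errors_outside = comb(w, 2 * p) * comb(n - w, k - 2 * p) / comb(n, k)
    split = comb(k // 2, p) * comb(k - k // 2, p) / comb(k, 2 * p)
    miss = w - 2 * p
    avoid = sum((-1)**(j + 1) * comb(m, j)
                * comb(r - miss, j * ell) / comb(r, j * ell)
                for j in range(1, m + 1))
    return errors_outside * split * avoid

# Stern's parameters for n = 1024, k = 525, w = 50, as quoted in
# Section 4; the text reports 5.78 * 10^11 iterations for this case.
prob = iteration_success_prob(1024, 525, 50, p=2, ell=18, m=1)
expected_iterations = 1 / prob
assert 1e11 < expected_iterations < 1e13
```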
To analyze the impact of c we compute a Markov chain for the number of
errors, generalizing the analysis of Canteaut et al. from c = 1 to arbitrary c.
Here are the states of the chain:
• 0: There are 0 errors in the deselected k columns.
• 1: There is 1 error in the deselected k columns.
• ...
• w: There are w errors in the deselected k columns.
• Done: The attack has succeeded.
An iteration of the attack moves between states as follows. Starting from state
u, the attack replaces c selected columns, moving to states u − c, . . . , u − 2, u −
1, u, u + 1, u + 2, . . . , u + c with various probabilities discussed below. The attack
then checks for success, moving from state 2p to state Done with probability
$$\beta = \frac{\binom{k/2}{p}\binom{k/2}{p}}{\binom{k}{2p}}\left(m\frac{\binom{n-k-(w-2p)}{\ell}}{\binom{n-k}{\ell}} - \binom{m}{2}\frac{\binom{n-k-(w-2p)}{2\ell}}{\binom{n-k}{2\ell}} + \cdots\right)$$
and otherwise staying in the same state.


For c = 1, the column-replacement transition probabilities are mentioned by
Canteaut et al.:
• state u moves to state u − 1 with probability u(n − k − (w − u))/(k(n − k));
• state u moves to state u + 1 with probability (k − u)(w − u)/(k(n − k));
• state u stays in state u otherwise.
For c > 1, there are at least three different interpretations of “select c new
columns”:
• “Type 1”: Choose a selected column; choose a non-selected column; swap.
Continue in this way for a total of c swaps.
• “Type 2”: Choose c distinct selected columns. Swap the first of these with a
random non-selected column. Swap the second with a random non-selected
column. Etc.
• “Type 3”: Choose c distinct selected columns and c distinct non-selected
columns. Swap the first selected column with the first non-selected column.
Swap the second with the second. Etc.
Type 1 is the closest to Canteaut et al.: its transition matrix among states
0, 1, . . . , w is simply the cth power of the matrix for c = 1. On the other hand,
type 1 has the highest chance of re-selecting a column and thus ending up with
fewer than c new columns; this effectively decreases c. Type 2 reduces this chance,
and type 3 eliminates this chance.
The type-3 transition matrix has a simple description: state u moves to state
u + d with probability
$$\sum_i \binom{w-u}{i}\binom{n-k-w+u}{c-i}\binom{u}{d+i}\binom{k-u}{c-d-i}\bigg/\binom{n-k}{c}\binom{k}{c}.$$
For c = 1 this matrix matches the Canteaut-et-al. matrix.

We have implemented the type-1 Markov analysis and the type-3 Markov
analysis. To save time we use floating-point computations with a few hundred
bits of precision rather than exact rational computations. We use the MPFI
library (on top of the MPFR library on top of GMP) to compute intervals
around each floating-point number, guaranteeing that rounding errors do not
affect our final results.
As a check we have also performed millions of type-1, type-2, and type-3
simulations and millions of real experiments decoding small numbers of errors.
The simulation results are consistent with the experimental results. The
type-1 and type-3 simulation results are consistent with the predictions from our
Markov-chain software. Type 1 is slightly slower than type 3, and type 2 is
intermediate. Our graphs below use type 3. Our current attack software uses
type 2 but we intend to change it to type 3.
Results. For each (n, t) in a wide range, we have explored parameters for our
new attack and set new records for the number of bit operations needed to

decode t errors in an [n, n − t lg n] code. Figure 1 shows our new records. Note
that the optimal attack parameters (p, m, ℓ, c, r) depend on n, and depend on t
for fixed n.

Fig. 1. Attack cost for n = 1024, n = 2048, n = 4096, n = 8192. Horizontal axis is the
code rate (n − t lg n)/n. Vertical axis is lg(bit operations).

6 A Successful Attack on the Original McEliece Parameters

We have implemented, and are carrying out, an attack against the cryptosys-
tem parameters originally proposed by McEliece. Our attack software extracts a
plaintext from a ciphertext by decoding 50 errors in a [1024, 524] code over F2 .
If we were running our attack software on a single computer with a 2.4GHz Intel
Core 2 Quad Q6600 CPU then we would need, on average, approximately 1400
days (2^58 CPU cycles) to complete the attack. We are actually running our attack
software on more machines. Running the software on 200 such computers — a
moderate-size cluster costing under $200000 — would reduce the average time to
one week. Note that no communication is needed between the computers.
These attack speeds are much faster than the best speeds reported in the
previous literature. Specifically, Canteaut, Chabaud, and Sendrier in [4] and [5]
report implementation results for a 433MHz DEC Alpha CPU and conclude that
one such computer would need approximately 7400000 days (2^68 CPU cycles):
“decrypting one message out of 10,000 requires 2 months and 14 days with 10
such computers.”
Of course, the dramatic reduction from 7400000 days to 1400 days can be
partially explained by hardware improvements — the Intel Core 2 Quad runs at
5.54× the clock speed of the Alpha 21164, has four parallel cores (compared
to one), and can perform three arithmetic instructions per cycle in each core
(compared to two). But these hardware improvements alone would only reduce
7400000 days to 220000 days.
The remaining speedup factor of 150, allowing us to carry out the first success-
ful attack on the original McEliece parameters, comes from our improvements
of the attack itself. This section discusses the software performance of our at-
tack in detail. Beware that optimizing CPU cycles is different from, and more
difficult than, optimizing the simplified notion of “bit operations” considered in
Section 4.
We gratefully acknowledge contributions of CPU time from several sources.
At the time of this writing we are carrying out about 3.26 · 10^9 attack iterations
each day:

• about 1.25 · 10^9 iterations/day from 38 cores of the Coding and Cryptography
Computer Cluster (C4) at Technische Universiteit Eindhoven (TU/e);
• about 0.99 · 10^9 iterations/day from 32 cores in the Department of Electrical
Engineering at National Taiwan University;
• about 0.50 · 10^9 iterations/day from 22 cores in the Courbes, Algèbre, Calculs,
Arithmétique des Ordinateurs (CACAO) cluster at Laboratoire Lorrain de
Recherche en Informatique et ses Applications (LORIA);
• about 0.26 · 10^9 iterations/day from 16 cores of the System Architecture and
Networking Distributed and Parallel Integrated Terminal (sandpit) at TU/e;
• about 0.13 · 10^9 iterations/day from 8 cores of the Argo cluster at the Academic
Computing and Communications Center at the University of Illinois
at Chicago (UIC);
• about 0.13 · 10^9 iterations/day from 6 cores at the Center for Research and
Instruction in Technologies for Electronic Security (RITES) at UIC; and
• about 0.13 · 10^9 iterations/day from 4 cores owned by D. J. Bernstein and
Tanja Lange.
Tanja Lange.
We plan to publish our attack software to allow public verification of our speed
results and to allow easy reuse of the same techniques in other decoding prob-
lems.
Number of iterations. Recall that the Canteaut-et-al. attack uses 9.85 · 10^11
iterations on average, with (in our notation) p = 2, ℓ = 18, m = 1, and c = 1.
To avoid excessive time spent handling collisions in the main loop, we increased
ℓ from 18 to 20. This increased the number of iterations to 11.14 · 10^11.
We then increased m from 1 to 5: for each selection of column sets X, Y we try
five sets Z1, Z2, Z3, Z4, Z5. We further increased c from 1 to 32: each iteration
replaces 32 columns from the previous iteration. These choices increased various
parts of the per-iteration time by factors of 5 and (almost) 32 respectively; but
the choices also combined to reduce the number of iterations by a factor of more
than 6, down to 1.85 · 10^11.
Further adjustment of the parameters will clearly produce additional improve-
ments, but having reached feasibility we decided to proceed with our attack.
Time for each iteration. Our attack software carries out an attack iteration
in 6.38 million CPU cycles on one core of a busy Core 2 Quad. “Busy” means
that the other three cores of the Core 2 Quad are also working on the attack; the
cycle counts drop slightly, presumably reflecting reduced L2-cache contention, if
only one core of the Core 2 Quad is active.
About 6.20 of these 6.38 million CPU cycles are accounted for by the following
major components:
• 0.68 million CPU cycles to select new column sets X and Y and to perform
Gaussian elimination. We use 32 new columns in each iteration, as mentioned
above. Each new column is handled by an independent pivot, modifying a few
hundred thousand bits of the matrix; we use standard techniques to combine
64 bit modifications into a small number of CPU instructions, reducing the
cost of the pivot to about 20000 CPU cycles. Further improvements are
clearly possible with further tuning.
• 0.35 million CPU cycles to precompute π(L) for each single column L. There
are m = 5 choices of π, and k = 525 columns L for each π. We handle each
π(L) computation in a naive way, costing more than 100 CPU cycles; this
could be improved but is not a large part of the overall computation.
• 0.36 million CPU cycles to clear hash tables. There are two hash tables, each
with 2^ℓ = 2^20 bits, and clearing both tables costs about 0.07 million CPU
cycles; this is repeated m = 5 times, accounting for the 0.36 million CPU
cycles.
• 1.13 million CPU cycles to mark, for each size-p set A, the bit at position
π(A) in the first hash table. We use p = 2, so there are 262 · 261/2 = 34191
choices of A, and m = 5 choices of π, for a total of 0.17 million marks,
each costing about 6.6 CPU cycles. Probably the 6.6 could be reduced with
further CPU tuning.
• 1.30 million CPU cycles to check, for each set B, whether the bit at position
π(B) is set in the first hash table, and if so to mark the bit at position π(B)
in the second hash table while appending B to a list of colliding B’s.
• 1.35 million CPU cycles to check, for each set A, whether the bit at position
π(A) is set in the second hash table, and if so to append A to a list of
colliding A’s.
• 0.49 million CPU cycles to sort the list of colliding sets A by π(A) and to
sort the list of colliding sets B by π(B). We use a straightforward radix sort.
• 0.54 million CPU cycles to skim through each collision π(A) = π(B), checking
the weight of the sum of the columns in A ∪ B. There are on average
about 5 · 34453 · 34191/2^20 ≈ 5617 collisions. Without early aborts this step
would cost 1.10 million CPU cycles.
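The collision estimate and the cycle accounting can be reproduced with a few lines of Python; the 262/263 split of the k = 525 columns between the two sides is inferred from the counts quoted above:

```python
from math import comb

# Collision count per iteration: m = 5 fingerprint choices, with
# C(262, 2) sets A and C(263, 2) sets B hashed into 2^20 buckets.
sets_A = comb(262, 2)
sets_B = comb(263, 2)
assert sets_A == 34191 and sets_B == 34453
collisions = 5 * sets_B * sets_A / 2**20
assert int(collisions) == 5617

# The itemized costs above account for 6.20 of the 6.38 million cycles.
parts = [0.68, 0.35, 0.36, 1.13, 1.30, 1.35, 0.49, 0.54]
assert round(sum(parts), 2) == 6.20
```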
For comparison, Canteaut et al. use 260 million cycles on an Alpha 21164
for each of their iterations (“1000 iterations of the optimized algorithm are per-
formed in 10 minutes . . . at 433 MHz”).

7 Defending the McEliece Cryptosystem


This section proposes new parameters for the McEliece cryptosystem.
Increasing n. The most obvious way to defend McEliece’s cryptosystem is to
increase n, the length of the code used in the cryptosystem. We comment that
allowing values of n between powers of 2 allows considerably better optimization
of (e.g.) the McEliece/Niederreiter public-key size. See below for examples. Aside
from a mild growth in decoding time, there is no obstacle to the key generator
using a Goppa code defined via a field F_{2^d} of size much larger than n.
Using list decoding to increase w. The very recent paper [2] has introduced
a list-decoding algorithm for classical irreducible binary Goppa codes, exactly
the codes used in McEliece’s cryptosystem. This algorithm allows the receiver to
efficiently decode approximately $n - \sqrt{n(n - 2t - 2)} \geq t + 1$ errors instead of t
errors. The sender, knowing this, can introduce correspondingly more errors; the
attacker is then faced with a more difficult problem of decoding the additional
errors.
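A sketch computing this decoding radius, applied to the concrete parameters proposed later in this section:

```python
from math import isqrt

def list_decoding_errors(n, t):
    # floor(n - sqrt(n(n - 2t - 2))): errors correctable with the
    # list-decoding algorithm of [2].
    s = n * (n - 2 * t - 2)
    r = isqrt(s)
    ceil_sqrt = r if r * r == s else r + 1
    return n - ceil_sqrt

# Parameter sets proposed below (length n, Goppa degree t):
assert list_decoding_errors(1632, 33) == 34    # 80-bit proposal
assert list_decoding_errors(2960, 56) == 57    # 128-bit proposal
assert list_decoding_errors(6624, 115) == 117  # 256-bit proposal
for n, t in [(1632, 33), (2960, 56), (6624, 115)]:
    assert list_decoding_errors(n, t) >= t + 1  # always beats t errors
```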
List decoding can, and occasionally does, return more than one codeword
within the specified distance. In CCA2-secure variants of McEliece’s system there
is no difficulty in identifying which codeword is a valid message. Our attack can,
in exactly the same way, easily discard codewords that do not correspond to
valid messages.
Analysis and optimization of parameters. We now propose concrete parameters
[n, k] for various security levels in CCA2-secure variants of the McEliece
cryptosystem. Recall that public keys in these variants are systematic generator
matrices occupying k(n − k) bits.
For (just barely!) 80-bit security against our attack we propose [1632, 1269]
Goppa codes (degree t = 33), with 34 errors added by the sender. The public-key
size here is 1269(1632 − 1269) = 460647 bits.
Without list decoding, and with the traditional restriction n = 2^d, the best
possibility is [2048, 1751] Goppa codes (t = 27). The public key here is considerably
larger, namely 520047 bits.
For 128-bit security we propose [2960, 2288] Goppa codes (t = 56), with 57
errors added by the sender. The public-key size here is 1537536 bits.
For 256-bit security we propose [6624, 5129] Goppa codes (t = 115), with 117
errors added by the sender. The public-key size here is 7667855 bits.
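These key sizes follow directly from the k(n − k) formula for systematic keys; a sketch:

```python
# Systematic-form public-key sizes k(n - k), in bits, for the
# proposed parameter sets.
proposals = {
    "80-bit":  (1632, 1269),
    "128-bit": (2960, 2288),
    "256-bit": (6624, 5129),
}
sizes = {name: k * (n - k) for name, (n, k) in proposals.items()}
assert sizes["80-bit"] == 460647
assert sizes["128-bit"] == 1537536
assert sizes["256-bit"] == 7667855
```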
For keys limited to 2^16, 2^17, 2^18, 2^19, 2^20 bytes, we propose Goppa codes of
lengths 1744, 2480, 3408, 4624, 6960 and degrees 35, 45, 67, 95, 119 respectively,
with 36, 46, 68, 97, 121 errors added by the sender. These codes achieve security
levels 84.88, 107.41, 147.94, 191.18, 266.94 against our attack. In general, for any
particular limit on public-key size, codes of rate approximately 0.75 appear to
maximize the difficulty of our attack.

References
1. Bard, G.V.: Accelerating cryptanalysis with the Method of Four Russians. Cryp-
tology ePrint Archive: Report 2006/251 (2006),
[Link]
2. Bernstein, D.J.: List decoding for binary Goppa codes (2008),
[Link]
3. Canteaut, A., Chabanne, H.: A further improvement of the work factor in an
attempt at breaking McEliece’s cryptosystem. In: Charpin, P. (ed.) EUROCODE
1994 (1994), [Link]
4. Canteaut, A., Chabaud, F.: A new algorithm for finding minimum-weight words
in a linear code: application to McEliece’s cryptosystem and to narrow-sense BCH
codes of length 511. IEEE Transactions on Information Theory 44(1), 367–378
(1998)
5. Canteaut, A., Sendrier, N.: Cryptanalysis of the original McEliece cryptosystem.
In: Ohta, K., Pei, D. (eds.) ASIACRYPT 1998. LNCS, vol. 1514, pp. 187–199.
Springer, Heidelberg (1998)
6. Engelbert, D., Overbeck, R., Schmidt, A.: A summary of McEliece-type cryp-
tosystems and their security. Cryptology ePrint Archive: Report 2006/162 (2006),
[Link]
7. Lee, P.J., Brickell, E.F.: An observation on the security of McEliece’s public-key
cryptosystem. In: Günther, C.G. (ed.) EUROCRYPT 1988. LNCS, vol. 330, pp.
275–280. Springer, Heidelberg (1988)
8. Leon, J.S.: A probabilistic algorithm for computing minimum weights of large
error-correcting codes. IEEE Transactions on Information Theory 34(5), 1354–1359
(1988)
9. Li, Y.X., Deng, R.H., Wang, X.M.: On the equivalence of McEliece’s and Niederre-
iter’s public-key cryptosystems. IEEE Transactions on Information Theory 40(1),
271–273 (1994)
10. McEliece, R.J.: A public-key cryptosystem based on algebraic coding theory, Jet
Propulsion Laboratory DSN Progress Report, 42–44 (1978),
[Link] report2/42-44/[Link]
11. Niederreiter, H.: Knapsack-type cryptosystems and algebraic coding theory. Prob-
lems of Control and Information Theory. Problemy Upravlenija i Teorii Informa-
cii 15(2), 159–166 (1986)
12. Overbeck, R., Sendrier, N.: Code-based cryptography. In: Bernstein, D.J., Buch-
mann, J., Dahmen, E. (eds.) Introduction to post-quantum cryptography. Springer,
Berlin (to appear)
13. Patterson, N.J.: The algebraic decoding of Goppa codes. IEEE Transactions on
Information Theory IT-21, 203–207 (1975)
14. Pippenger, N.: The minimum number of edges in graphs with prescribed paths.
Mathematical Systems Theory 12, 325–346 (1979),
[Link]
15. Sendrier, N.: On the security of the McEliece public-key cryptosystem. In: Blaum,
M., Farrell, P.G., van Tilborg, H.C.A. (eds.) Information, coding and mathematics.
Kluwer International Series in Engineering and Computer Science, vol. 687, pp.
141–163. Kluwer, Dordrecht (2002)
16. Sidelnikov, V.M., Shestakov, S.O.: On insecurity of cryptosystems based on gen-
eralized Reed-Solomon codes. Discrete Mathematics and Applications 2, 439–444
(1992)
17. Stern, J.: A method for finding codewords of small weight. In: Cohen, G., Wolf-
mann, J. (eds.) Coding Theory and Applications 1988. LNCS, vol. 388, pp. 106–113.
Springer, Heidelberg (1989)
18. van Tilburg, J.: On the McEliece public-key cryptosystem. In: Goldwasser, S. (ed.)
CRYPTO 1988. LNCS, vol. 403, pp. 119–131. Springer, Heidelberg (1990)
McEliece Cryptosystem Implementation: Theory and Practice

Bhaskar Biswas and Nicolas Sendrier

Centre de recherche INRIA Paris - Rocquencourt,
Domaine de Voluceau, Rocquencourt - B.P. 105, 78153 Le Chesnay Cedex, France
{[Link],[Link]}@[Link]
[Link]
Abstract. Though it is old and considered fast, the implementation of the
McEliece public-key encryption scheme has never been thoroughly studied.
We consider that problem here and we provide an implementation with a
complete description of our algorithmic choices and parameter selection,
together with the state of the art in cryptanalysis. This provides a
reference for measuring the speed and scalability of this cryptosystem.
Compared with other, number-theory based, public-key schemes, we
demonstrate a gain of a factor of at least 5 to 10.

Keywords: public-key cryptosystem, McEliece encryption scheme,
code-based cryptography, cryptographic implementation.

1 Introduction

The McEliece encryption scheme was proposed in 1978 [13]. During the thirty years
that have elapsed since, its security as a one-way trapdoor encryption scheme
has never been seriously threatened.
Most previous work has been devoted to cryptanalysis and to semantic
security, but fewer attempts have been made to examine implementation issues.
Implementing a (public-key) cryptosystem is a tradeoff between security and
efficiency. For that reason, cryptanalysis and implementation have to be considered
in unison.
Though the public key size is rather large, the McEliece encryption scheme
possesses some strong features. It has a good security reduction and low com-
plexity algorithms for encryption and decryption. As a consequence, it is con-
ceivable, compared with number-theory based cryptosystems, to gain an order
of magnitude in performance.
In the first part, we will describe a slightly modified version of the scheme
(which we call hybrid). It has two modifications: the first increases the
information rate by putting some data in the error pattern; the second reduces
the public key size by making use of a generator matrix in row echelon form. We
will show that the same security reduction as for the original system holds. We
will then describe the key generation, encryption and decryption algorithms
and their implementation. Finally we will give some computation times

J. Buchmann and J. Ding (Eds.): PQCrypto 2008, LNCS 5299, pp. 47–62, 2008.
© Springer-Verlag Berlin Heidelberg 2008

for various parameters, compare them with the best known attacks, and discuss
the best tradeoffs.

2 System Description

2.1 McEliece Cryptosystem

Let F be the family of binary t-error-correcting (n, k) codes. We describe the
McEliece cryptosystem as follows¹:

– Public key: a k × n binary generator matrix G of C ∈ F.
– Secret key: a decoder D for C such that (wH(e) ≤ t) ⇒ (D(xG + e) = x).
– Encryption: x → xG + e, where wH(e) ≤ t.
– Decryption: y → D(y), i.e. decoding.

It was introduced by R. McEliece in 1978 with irreducible binary Goppa codes.
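To make the mechanics above concrete, here is a minimal sketch in Python using the binary Hamming (7, 4) code, which corrects t = 1 error. All names (encrypt, decrypt, A, H_cols) are ours; a real deployment uses a large (scrambled) Goppa code generator matrix and decoder, not this toy code.

```python
# Toy McEliece-style encryption with the Hamming (7,4) code (t = 1).
# The generator is G = (I | A); H_cols[j] is the j-th column of the
# parity check matrix H = (A^T | I), so the syndrome of a single error
# in position j equals H_cols[j].
A = [[1, 1, 0], [1, 0, 1], [0, 1, 1], [1, 1, 1]]
G = [[int(i == j) for j in range(4)] + A[i] for i in range(4)]
H_cols = A + [[int(i == j) for j in range(3)] for i in range(3)]

def encrypt(x, e):
    # y = xG + e over F_2, with Hamming weight of e at most t = 1
    c = [sum(x[i] * G[i][j] for i in range(4)) % 2 for j in range(7)]
    return [(cj + ej) % 2 for cj, ej in zip(c, e)]

def decrypt(y):
    # syndrome decoding: the syndrome points at the error position
    s = [sum(y[j] * H_cols[j][i] for j in range(7)) % 2 for i in range(3)]
    y = y[:]
    if s != [0, 0, 0]:
        y[H_cols.index(s)] ^= 1
    return y[:4]

x, e = [1, 0, 1, 1], [0, 0, 0, 0, 0, 1, 0]
assert decrypt(encrypt(x, e)) == x
```

Any single-bit error pattern e is corrected, so the plaintext x is always recovered; the (one-way) security would of course come from the size and secrecy of the code, which this toy omits.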

2.2 The Hybrid McEliece Scheme

We define an injective mapping ϕ : {0,1}^ℓ → Wn,t, where Wn,t denotes the set
of words of length n and Hamming weight t. Both ϕ and ϕ⁻¹ should be easy to
compute and the integer ℓ should be close to ⌊log₂ |Wn,t|⌋. As for the original scheme,
we use Goppa codes.
we use Goppa codes.

– System parameters: two integers m and t. Let n = 2^m and k = n − tm.
– Key generation: let Γ(L, g) ∈ Gm,t (see §A)
  • Public key: a k × (n − k) binary matrix R such that (Id | R) is a generator
    matrix of Γ(L, g)
  • Secret key: the pair (L, g), thus the decoder ΨL,g
– Encryption:
      {0,1}^k × {0,1}^ℓ → {0,1}^n
      (x, e) → (x ∥ xR) + ϕ(e)
– Decryption:
      {0,1}^n → {0,1}^k × {0,1}^ℓ
      y → (x, ϕ⁻¹(e))
  where e = ΨL,g(y) and y − e = x ∥ ∗.

There are two differences compared with the original system:

– We use the error pattern to encode additional information bits.
– We use a public key in row echelon form.

These changes improve the practicality of the system and, as we shall see
in §3, have no impact on its security.
¹ wH(·) denotes the Hamming weight.

2.3 Choice of Parameters


There are two main approaches for cryptanalysing the McEliece system: either
decode t errors in a random binary (n, k) code, or construct a fast decoder from a
generator matrix. The best known attacks are stated below.
– Decoding attack: a variant of information set decoding proposed by Canteaut
and Chabaud in [6].
– Structural attack: enumerate irreducible polynomials g and test the equiv-
alence of Γ (L0 , g) with the code defined by the public key. The support L0
is fixed and equivalence can be tested in polynomial time with the support
splitting algorithm [16].
Both attacks have an exponential cost. The structural attack is always less ef-
ficient. The parameters are thus chosen according to the Canteaut-Chabaud
algorithm whose performance is given in Figure 1.

Fig. 1. Work factor of the Canteaut-Chabaud algorithm with Goppa parameters
(binary work factor, log scale, plotted against the rate k/n for code lengths
n = 1024, 2048, 4096 and 8192)

3 Cryptographic Security

The first reduction proof of security for the McEliece encryption scheme was
given by Kobara and Imai in [12]. In the same paper, several semantically secure
conversions, generic and ad hoc, are proposed. The purpose of those conversions
is to transform a One Way Encryption (OWE) scheme, the weakest notion of
security, into a scheme resistant to adaptive chosen ciphertext attack (IND-
CCA2), the strongest notion of security.

In this section, we prove that under two algorithmic assumptions (the hardness
of decoding and the pseudo-randomness of Goppa codes), the hybrid version of
the McEliece encryption scheme is one way.

3.1 One Way Encryption Schemes

We consider a public key encryption scheme where the public key is chosen
uniformly in the space K. Let P and C denote respectively the plaintext and
ciphertext spaces. We consider the sample space Ω = P × K equipped with the
uniform distribution PΩ. An adversary A for this encryption scheme is a mapping
C × K → P. It is successful for (x, K) ∈ Ω if A(EK(x), K) = x, where EK(x)
denotes the encryption of x with the public key K. The success probability of A
for this cryptosystem is equal to

PΩ(A(EK(x), K) = x).

Definition 1 (OWE). A public key encryption scheme is a One Way Encryption
scheme if the success probability of any adversary running in polynomial time
is negligible.

In practice, one needs more than just an OWE scheme. For instance, the McEliece
encryption scheme, though it is OWE, is vulnerable to many attacks [5,7,11,19].
On the other hand, if we admit the existence of perfect hash functions, there
are generic conversions (see for instance [2,15]) which, starting from an OWE
scheme, provide a scheme resistant against adaptive chosen ciphertext attacks.
Those generic conversions as well as other specific ones exist for the original
McEliece encryption scheme (see [12]).

3.2 Security Assumptions

Let m and t be two positive integers, let n = 2^m and k = 2^m − tm. We denote by
{0,1}^(k×n) the set of binary k × n matrices and by Gm,t the subset consisting
of all generator matrices of a binary irreducible t-error correcting Goppa code
of length n and support F_{2^m} (up to a permutation). Finally, recall that Wn,t
denotes the set of binary words of weight t and length n.

Definition 2. Let m and t be two positive integers, let n = 2^m and k = 2^m − tm.
Let PΩ0 be the uniform distribution over the sample space

Ω0 = {0,1}^k × Wn,t × {0,1}^(k×n)

– An adversary is a procedure A : {0,1}^n × {0,1}^(k×n) → Wn,t. We denote by |A|
  its maximal running time.
– The success probability of an adversary A is defined as

  Succ(A) = PΩ0(A(xG + e, G) = e).

– The success probability over Ω′ ⊂ Ω0 of an adversary A is defined as

  Succ(A | Ω′) = PΩ0(A(xG + e, G) = e | (x, e, G) ∈ Ω′).

– We call (T, ε)-adversary over Ω′ an adversary A such that |A| ≤ T and
  Succ(A | Ω′) ≥ ε.
– A distinguisher D is a mapping {0,1}^(k×n) → {true, false}. We denote by |D| its
  maximal running time.
– The advantage of a distinguisher D for S ⊂ {0,1}^(k×n) is defined as

  Adv(D, S) = |PΩ0(D(G) | G ∈ S) − PΩ0(D(G))|.

– We call (T, ε)-distinguisher over S a distinguisher D such that |D| ≤ T and
  Adv(D, S) ≥ ε.

The first assumption states the difficulty of decoding, in the average case, in a
linear code whose parameters are those of a binary Goppa code.

Assumption 1. For every (T, ε)-adversary over Ω0, the ratio T/ε is not upper
bounded by a polynomial in n.

The worst case is known to be difficult (the associated decision problem is NP-
complete) both in the general case [4] (Syndrome Decoding) and in the bounded case
[9] (Goppa Parameterized Bounded Decoding). The status of the average case is
unknown, but it is believed to be difficult [1].

The second assumption states that there exists no efficient distinguisher for
Goppa codes. In other words, the generator matrix of a Goppa code looks random.

Assumption 2. For every (T, ε)-distinguisher over Gm,t, the ratio T/ε is not
upper bounded by a polynomial in n.

There is no formal result to assess this assumption. However, there is no known
invariant for linear codes, computable in polynomial time, which behaves
differently for random codes and for binary Goppa codes.

3.3 The Hybrid McEliece Encryption Scheme Is One Way

We use the notations and definitions of the previous section. The public key
is a binary k × (n − k) matrix R. We consider a public injective mapping ϕ :
{0,1}^ℓ → Wn,t. The hybrid McEliece encryption is defined as

{0,1}^k × {0,1}^ℓ → {0,1}^n
(x, e) → (x ∥ xR) + ϕ(e)

Theorem 1. Under Assumption 1 and Assumption 2, the hybrid McEliece system
is an OWE scheme.

Before proving the theorem, we will prove some intermediate results in the form
of three lemmas. We will use the following notations:

– Sk×n: the binary systematic k × n matrices (i.e. of the form (Id | R)),
– G′m,t = Gm,t ∩ Sk×n: the systematic generator matrices of Goppa codes,
– E = Im(ϕ) ⊂ Wn,t: the image of {0,1}^ℓ by ϕ. In practice E can be any subset
  of Wn,t.
– We consider the three following decreasing subsets of Ω0 = {0,1}^k × Wn,t ×
  {0,1}^(k×n):

  Ω1 = {0,1}^k × E × {0,1}^(k×n) = {(x, e, G) ∈ Ω0 | e ∈ E}
  Ω2 = {0,1}^k × E × Gm,t = {(x, e, G) ∈ Ω1 | G ∈ Gm,t}
  Ω3 = {0,1}^k × E × G′m,t = {(x, e, G) ∈ Ω2 | G ∈ Sk×n}

The success probability of an adversary A for the hybrid McEliece scheme is
equal to

Succ(A | Ω3) = PΩ0(A(xG + e, G) = e | G ∈ G′m,t, e ∈ Im(ϕ)).

Lemma 1. Any (T, ε)-adversary over Ω1 is a (T, ε|E|/|Wn,t|)-adversary over Ω0.

Lemma 2. If there exists a (T, ε)-adversary over Ω2 then
– either there exists a (T, ε/2)-adversary A over Ω1,
– or there exists a (T + O(n^2), ε/2)-distinguisher for Gm,t.

Lemma 3. If there exists a (T, ε)-adversary over Ω3 then
– either there exists a (T + O(n^3), λε/2)-adversary over Ω2,
– or there exists an (O(n^3), λ/2)-distinguisher for Gm,t,
where λ ≥ 0.288 is the probability for a random binary k × k matrix to be
non-singular.

Proofs of the three lemmas are given in appendix §B.


Proof (of Theorem 1). A (T, ε)-adversary against the hybrid McEliece scheme is
a (T, ε)-adversary over Ω3 with E = Im(ϕ).

If we put together the three lemmas and the two assumptions, it follows that
if the above (T, ε)-adversary exists then

– either there exists a (T + O(n^3), λε·2^(ℓ−2)/|Wn,t|)-adversary over Ω0,
– or there exists a (T + O(n^3), λε/4)-distinguisher for the Goppa codes,
– or there exists an (O(n^3), λ/2)-distinguisher for the Goppa codes.

Provided 2^ℓ is close to |Wn,t| (we can reasonably assume, for instance, that the
ratio |Wn,t|/2^ℓ is upper bounded by a small constant, say 4), the existence of an
efficient adversary against the hybrid McEliece scheme would contradict one of
the two assumptions of the statement.
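For the record, the composition of the three loss factors in this chain can be written out explicitly (pure bookkeeping, using |E| = 2^ℓ and |Wn,t| = C(n, t)):

```latex
\begin{align*}
\text{Lemma 3: } & \Omega_3 \to \Omega_2, & \varepsilon &\longmapsto \lambda\varepsilon/2,\\
\text{Lemma 2: } & \Omega_2 \to \Omega_1, & \lambda\varepsilon/2 &\longmapsto \lambda\varepsilon/4,\\
\text{Lemma 1: } & \Omega_1 \to \Omega_0, & \lambda\varepsilon/4 &\longmapsto
  \frac{\lambda\varepsilon}{4}\cdot\frac{2^{\ell}}{\binom{n}{t}}
  = \frac{\lambda\varepsilon\,2^{\ell-2}}{\binom{n}{t}}.
\end{align*}
```

The running time meanwhile grows by at most O(n^3) (systematization, Lemma 3) plus O(n^2) (one encryption, Lemma 2).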

4 Implementation
4.1 Description
We give in Figure 2 a pseudo-code description of the hybrid McEliece encryption
scheme compliant with the description in §2. Algorithms are detailed in the next
section.

keygen(m, t):
    L ← rand permut(F_{2^m})
    g ← rand irred poly(t)
    (R, L′) ← get public key(L, g)
    SK ← (L′, g)
    PK ← R
    return (PK, SK)

encrypt(x, e, R):
    return (x ∥ x·R) + ϕ(e)

decrypt(y, L, g):
    e ← decode(y, L, g)
    return (LSB_k(y − e), ϕ⁻¹(e))

rand permut(F_{2^m}) returns the elements of F_{2^m} in a random order.
ϕ and ϕ⁻¹ are briefly described in §C.

Fig. 2. The hybrid McEliece encryption scheme

rand irred poly(t):
    do
        f ← rand monic poly(t)
    while (is irred(f) = false)
    return f

is irred(f):
    h(z) ← z
    for i from 1 to ⌊deg(f)/2⌋ do
        repeat m times
            h(z) ← h(z)² mod f(z)
        if (gcd(f(z), h(z) − z) ≠ 1)
            return false
    return true

rand monic poly(t) returns a random monic polynomial of degree t.

Fig. 3. Generation of irreducible polynomial

4.2 Algorithms

We describe below the main algorithms required for the implementation of the
hybrid McEliece encryption scheme. We will not describe the finite field operations,
the usual polynomial operations (including the extended Euclidean algorithm
for computing modular inverses) or the linear algebra operations (including
Gaussian elimination).

In all the algorithms below, we consider an irreducible binary Goppa code
Γ(L, g) with L = (α1, ..., αn) and g(z) ∈ F_{2^m}[z] monic irreducible of degree t.

Irreducible polynomial. A given polynomial f(z) ∈ F_{2^m}[z] has a factor of
degree i if and only if it has a common factor with z^(2^(im)) − z. Conversely, if
for all i ≤ t/2 we have gcd(f(z), z^(2^(im)) − z) = 1, then f(z) is irreducible.
The polynomial z^(2^(im)) − z has much too high a degree to be handled directly;
instead we compute the polynomials h_j(z) = z^(2^j) mod f(z) by successive
squaring modulo f(z). We have gcd(f(z), z^(2^(im)) − z) = gcd(f(z), h_(im)(z) − z),
which greatly simplifies the computations. The algorithm given in Figure 3
produces a random monic irreducible polynomial of degree t.
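The same successive-squaring test is easy to exercise over the base field F_2 (the case m = 1), representing binary polynomials as Python integers. This is a sketch of the idea only, with helper names of our own choosing, not the paper's F_{2^m} implementation:

```python
def deg(a):
    # degree of a binary polynomial packed into an integer
    return a.bit_length() - 1

def pmod(a, f):
    # a mod f, polynomials over F_2
    while deg(a) >= deg(f):
        a ^= f << (deg(a) - deg(f))
    return a

def clmul(a, b):
    # carry-less (polynomial) multiplication over F_2
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        b >>= 1
    return r

def pgcd(a, b):
    while b:
        a, b = b, pmod(a, b)
    return a

def is_irred(f, m=1):
    # mirrors Figure 3: square h, m times per step; f has a factor of
    # degree i iff gcd(f, z^(2^(i*m)) - z) != 1
    h = 0b10                        # h(z) = z
    for _ in range(deg(f) // 2):
        for _ in range(m):
            h = pmod(clmul(h, h), f)
        if pgcd(f, h ^ 0b10) != 1:
            return False
    return True

assert is_irred(0b1011)             # z^3 + z + 1 is irreducible
assert not is_irred(0b10101)        # z^4 + z^2 + 1 = (z^2 + z + 1)^2
```

The real key generator runs the same loop with coefficients in F_{2^m}, so each step squares m times before taking the gcd.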

Building the generator matrix. Let fj(z) = (z − αj)⁻¹ mod g(z) for all
j = 1, ..., n. A word a = (a1, ..., an) ∈ F_2^n is in Γ(L, g) if and only if

Ra(z) = Σ_{j=1}^{n} aj·fj(z) = Σ_{j=1}^{n} aj/(z − αj) mod g(z) = 0    (1)

This defines a t × n parity check matrix over F_{2^m} whose j-th column is formed by
the t coefficients, in F_{2^m}, of the polynomial fj(z). If we write the field elements
of F_{2^m} in a basis over F_2, each of those columns becomes a binary word of length
tm, and the n binary columns corresponding to the expansions of the fj(z) form
a binary tm × n parity check matrix H of Γ(L, g). We then apply a Gaussian
elimination on H, starting with the last columns, to obtain a k × (n − k) binary
matrix R such that (R^T | Id) = UHP with U non-singular and P a permutation
matrix. The matrix P is the product of a small number (between zero and a few
units) of transpositions. A code with parity check matrix (R^T | Id) admits
G = (Id | R) as generator matrix, so R is the public key. Figure 4 describes the
whole procedure.
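A sketch of the elimination step (our own minimal version, without the bit-packing a real implementation would use, and assuming H has full rank): bring a binary parity check matrix H to the form (R^T | Id) by row operations, swapping columns, and recording the permutation, only when a pivot is missing.

```python
def systematize(H):
    # Row-reduce H so its last r columns become the identity; returns
    # R^T (the remaining columns) and the column permutation applied.
    H = [row[:] for row in H]
    r, n = len(H), len(H[0])
    perm = list(range(n))
    for i in range(r):
        c = n - r + i                          # target pivot column
        piv = next((j for j in range(i, r) if H[j][c]), None)
        if piv is None:                        # no pivot: swap in a column
            for c2 in range(n - r):
                if any(H[j][c2] for j in range(i, r)):
                    for row in H:
                        row[c], row[c2] = row[c2], row[c]
                    perm[c], perm[c2] = perm[c2], perm[c]
                    break
            piv = next(j for j in range(i, r) if H[j][c])
        H[i], H[piv] = H[piv], H[i]
        for j in range(r):                     # clear column c elsewhere
            if j != i and H[j][c]:
                H[j] = [a ^ b for a, b in zip(H[j], H[i])]
    return [row[:n - r] for row in H], perm

R_T, perm = systematize([[1, 1, 0, 1, 0],
                         [0, 1, 1, 0, 1]])
assert R_T == [[1, 1, 0], [0, 1, 1]]           # (R^T | Id) reached, P = Id
```

When a column swap does occur, the support L must be permuted accordingly, which is exactly the role of permute(L, P) in Figure 4.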

Goppa code decoding. Let b = (b1, ..., bn) be the word to be decoded; we
assume that b = a + e with a ∈ Γ(L, g) and e ∈ Wn,t. If j1, ..., jt are the non-zero
positions of e, its locator polynomial is defined as σe(z) = Π_{i=1}^{t} (z − α_{ji}).

get public key(L, g):
    H ← Goppa check matrix(L, g)
    (R, P) ← gauss elim(H)
    L′ ← L
    if (P ≠ Id)
        L′ ← permute(L, P)
    return (R, L′)

Goppa check matrix(L, g):
    for j from 1 to n do
        fj(z) ← (z − αj)⁻¹ mod g(z)
        cj ← expand(fj)
    return matrix(c1, ..., cn)

gauss elim(H) returns a permutation matrix P and a matrix R such
that (R^T | Id) = UHP for some non-singular matrix U.
permute(L, P) adjusts the support L according to the permutation P.
expand(fj) transforms an element of F_{2^m}^t into an element of F_2^(tm).
matrix(c1, ..., cn) concatenates columns to form a matrix.

Fig. 4. Generation of the parity check matrix
Fig. 4. Generation of the parity check matrix

decode(b, L, g):
    S(z) ← syndrome(b, L, g)
    σ(z) ← solve key eq(S(z), g(z))
    (γ1, ..., γt) ← Berlekamp trace algorithm(σ)
    e ← error((γ1, ..., γt), L)
    return e

syndrome(b, L, g) returns Rb(z) as in (1).
solve key eq(S(z), g(z)) applies Patterson's algorithm.
Berlekamp trace algorithm(σ) is described in §D.
error((γ1, ..., γt), L) returns the indexes in L of the γi.

Fig. 5. Goppa code decoding



1. Syndrome computation: we compute the syndrome Rb(z) of b as in (1).
2. Key equation solving: the locator polynomial satisfies the key equation

   Rb(z)σe(z) = σe′(z) mod g(z),

   where σe′(z) denotes the formal derivative (d/dz)σe(z). We solve it with
   Patterson's algorithm [14] (see §E).
3. Roots computation: we compute the roots of σe with the Berlekamp trace
   algorithm [3] (see §D).

5 Simulation Results

We implemented the hybrid version of the McEliece encryption scheme in the C
programming language. In Figures 6 and 7 we plot the running time per plaintext
byte versus the base-2 logarithm of the work factor of the best known attack [6].

Various values of t were tried for extension degrees 11 ≤ m ≤ 15. As
expected, for a fixed m, the performance gets better for smaller values of t.
However, for a fixed security level, the best performance is not obtained for the
smallest block size (i.e. extension degree); on the contrary, the system works
better for higher extension degrees. For m ≥ 13, however, the encryption speed at
fixed security becomes steady. See Figure 6 and Figure 7.

Fig. 6. Encryption cost (cpu-cycles per byte) versus binary work factor (power
of 2), for extension degrees m = 11, 12, 13

Fig. 7. Decryption cost (cpu-cycles per byte) versus binary work factor (power
of 2), for extension degrees m = 11, 12, 13, 14, 15

5.1 Comparison with Other Systems

Our simulations were performed on a machine featuring a dual-core Intel Core 2
processor; we used a 32-bit operating system and a single core. The C program
was compiled with the icc Intel compiler with the options -g -static -O
-ipo -xP. Timings are given in Table 1.

Table 1. McEliece: selected parameters at a glance

  (m, t)     encrypt   decrypt   key size   security
             (cycles/byte)
  (10, 50)     243      7938      32 kB       60
  (11, 32)     178      1848      73 kB       88
  (11, 40)     223      2577      86 kB       96
  (12, 21)     126       573     118 kB       88
  (12, 41)     164      1412     212 kB      130
  (13, 18)     119       312     227 kB       93
  (13, 29)     149       535     360 kB      129
  (14, 15)     132       229     415 kB       91
  (15, 13)     132       186     775 kB       90
  (16, 12)     132       166    1532 kB       91

Table 2. Performance of some other public key systems (EBATS source)

                  encrypt   decrypt
                  (cycles/byte)
  RSA 1024 (1)      800     23100
  RSA 2048 (1)      834     55922
  NTRU (2)         4753      8445

(1) RSA encryption (with malleability defense) using OpenSSL.
(2) ntru-enc 1 ees787ep1: NTRU encryption with N = 787 and q = 587. Software
written by Mark Etzel (NTRU Cryptosystems).

Compared with state-of-the-art implementations of other public key encryption
schemes (see Table 2), McEliece encryption gains an order of magnitude for both
encryption and decryption. The source used for Table 2 is an EBATS preliminary
report² of March 2007.

6 Conclusion

We presented here a new modified version of the McEliece cryptosystem and its
full implementation. We have shown that a code-based public key encryption
scheme compares favorably with optimized implementations of number-theory
based schemes.

The system we have implemented here is very fast and offers much flexibility
in the choice of parameters. One of the main observations we made from this
implementation work is that increasing the extension degree m seems to offer an
interesting trade-off. Presently, our program does not allow an extension degree
greater than 16.

The source code of the whole implementation is freely available on our website
[Link] The Niederreiter scheme is similar to
McEliece's in most aspects. We intend to make it available as well.

References
1. Barg, A.: Complexity issues in coding theory. In: Pless, V.S., Huffman, W.C. (eds.)
Handbook of Coding theory, ch. 7, vol. I, pp. 649–754. North-Holland, Amsterdam
(1998)
2. Bellare, M., Rogaway, P.: Optimal asymmetric encryption. In: De Santis, A. (ed.)
EUROCRYPT 1994. LNCS, vol. 950, pp. 92–111. Springer, Heidelberg (1995)
3. Berlekamp, E.R.: Factoring polynomials over large finite fields. Mathematics of
Computation 24(111), 713–715 (1970)
4. Berlekamp, E.R., McEliece, R.J., van Tilborg, H.C.: On the inherent intractability
of certain coding problems. IEEE Transactions on Information Theory 24(3) (May
1978)
5. Berson, T.: Failure of the McEliece public-key cryptosystem under message-resend
and related-message attack. In: Kaliski Jr., B.S. (ed.) CRYPTO 1997. LNCS, vol. 1294,
pp. 213–220. Springer, Heidelberg (1997)
² [Link]

6. Canteaut, A., Chabaud, F.: A new algorithm for finding minimum-weight words in
a linear code: Application to McEliece’s cryptosystem and to narrow-sense BCH
codes of length 511. IEEE Transactions on Information Theory 44(1), 367–378
(1998)
7. Canteaut, A., Sendrier, N.: Cryptanalysis of the original McEliece cryptosystem.
In: Ohta, K., Pei, D. (eds.) ASIACRYPT 1998. LNCS, vol. 1514, pp. 187–199.
Springer, Heidelberg (1998)
8. Cover, T.: Enumerative source encoding. IEEE Transactions on Information The-
ory 19(1), 73–77 (1973)
9. Finiasz, M.: Nouvelles constructions utilisant des codes correcteurs d’erreurs en
cryptographie à clef publique. Thèse de doctorat, École Polytechnique (October
2004)
10. Ganz, J.: Factoring polynomials using binary representations of finite fields. IEEE
Transactions on Information Theory 43(1), 147–153 (1997)
11. Hall, C., Goldberg, I., Schneier, B.: Reaction attacks against several public-key
cryptosystems. In: Varadharajan, V., Mu, Y. (eds.) ICICS 1999. LNCS, vol. 1726,
pp. 2–12. Springer, Heidelberg (1999)
12. Kobara, K., Imai, H.: Semantically secure McEliece public-key cryptosystems -
Conversions for McEliece PKC. In: Kim, K. (ed.) PKC 2001. LNCS, vol. 1992, pp.
19–35. Springer, Heidelberg (2001)
13. McEliece, R.J.: A public-key cryptosystem based on algebraic coding theory. In:
DSN Prog. Rep., Jet Prop. Lab., California Inst. Technol., Pasadena, CA, pp.
114–116 (January 1978)
14. Patterson, N.J.: The algebraic decoding of Goppa codes. IEEE Transactions on
Information Theory 21(2), 203–207 (1975)
15. Pointcheval, D.: Chosen-ciphertext security for any one-way cryptosystem. In: Imai,
H., Zheng, Y. (eds.) PKC 2000. LNCS, vol. 1751, pp. 129–146. Springer, Heidelberg
(2000)
16. Sendrier, N.: Finding the permutation between equivalent codes: the support split-
ting algorithm. IEEE Transactions on Information Theory 46(4), 1193–1203 (2000)
17. Sendrier, N.: Cryptosystèmes à clé publique basés sur les codes correcteurs
d’erreurs. Mémoire d’habilitation à diriger des recherches, Université Paris 6
(March 2002)
18. Sendrier, N.: Encoding information into constant weight words. In: IEEE Confer-
ence, ISIT 2005, pp. 435–438, Adelaide, Australia (September 2005)
19. Sun, H.M.: Further cryptanalysis of the McEliece public-key cryptosystem. IEEE
Communications Letters 4(1), 18–19 (2000)

A Goppa Code

Let m and t denote two positive integers. We will denote by Gm,t the set of all
binary irreducible t-error correcting Goppa codes, defined below.

Definition 3. Let L = (α1, ..., αn) be a sequence of n = 2^m distinct elements
in F_{2^m} and g(z) ∈ F_{2^m}[z] an irreducible monic polynomial of degree t. The
binary irreducible Goppa code with support L and generator polynomial g(z),
denoted by Γ(L, g), is defined as the set of words (a1, ..., an) ∈ F_2^n such that

Ra(z) = Σ_{j=1}^{n} aj/(z − αj) = 0 mod g(z).

The Goppa code Γ(L, g) has length n = 2^m and dimension³ k ≥ n − mt. We can
associate to it an efficient (polynomial time) decoding procedure, denoted ΨL,g,
which can correct up to t errors. For all x ∈ Γ(L, g) and all e ∈ {0,1}^n, we have
(wH(e) ≤ t) ⇒ (ΨL,g(x + e) = e).

B Additional Proofs

Proof (of Lemma 1). Let A denote the (T, ε)-adversary over Ω1 of the statement.
By definition, it is such that

Succ(A | e ∈ E) = Succ(A | Ω1) ≥ ε.

We have

Succ(A) = PΩ0(A(xG + e, G) = e) ≥ PΩ0(A(xG + e, G) = e, e ∈ E)
        ≥ PΩ0(A(xG + e, G) = e | e ∈ E)·PΩ0(e ∈ E)
        ≥ Succ(A | e ∈ E)·PΩ0(e ∈ E) ≥ ε·|E|/|Wn,t|,

which proves the lemma.

Proof (of Lemma 2). Let A denote the (T, ε)-adversary over Ω2 of the statement.
We consider the distinguisher D defined for all G ∈ {0,1}^(k×n) by D(G) =
(A(xG + e, G) = e), where (x, e) is randomly and uniformly chosen in {0,1}^k × E.
We have

PΩ0(D(G)) = Succ(A | e ∈ E)
PΩ0(D(G) | Ω2) = Succ(A | e ∈ E, G ∈ Gm,t)

from which we easily derive

Succ(A | Ω2) ≤ Succ(A | Ω1) + Adv(D, Gm,t).    (2)

To run D, one has to compute the ciphertext xG + e, which has a cost upper
bounded by O(n^2), and to make one call to A. So we have |D| ≤ T + O(n^2).
By definition of A, we have Succ(A | Ω2) ≥ ε. Thus at least one of the two
right-hand side terms of inequality (2) is greater than ε/2. This implies that
either A verifies

Succ(A | Ω1) ≥ ε/2

or D verifies

Adv(D, Gm,t) ≥ ε/2,

which proves the lemma.
³ Equality holds in all cases of practical interest.

Proof (of Lemma 3). We denote by Syst(G) a procedure which returns, on any input
G = (U | V) ∈ {0,1}^(k×n) such that U is non-singular, the matrix (Id | U⁻¹V) ∈
Sk×n. On other inputs, Syst() leaves G unchanged.

Let A denote the (T, ε)-adversary over Ω3 of the statement.

We define the adversary A′ as A′(y, G) = A(y, Syst(G)). We define the
distinguisher D which returns true on input G if and only if Syst(G) ∈ Sk×n. The
running time of Syst() is upper bounded by O(n^3), thus |A′| ≤ T + O(n^3) and
|D| = O(n^3).

If A′ succeeds with (x, e, G) ∈ Ω2 and Syst(G) ∈ Sk×n, then A succeeds with
(x′, e, Syst(G)) ∈ Ω3 for some x′. We have

Succ(A′ | Ω2, Syst(G) ∈ Sk×n) ≥ Succ(A | Ω3) ≥ ε

and (note that the events "e ∈ E" and "Syst(G) ∈ G′m,t" are independent)

Succ(A | Ω3) ≤ Succ(A′ | Ω2, Syst(G) ∈ Sk×n)
  ≤ Succ(A′ | e ∈ E, Syst(G) ∈ G′m,t)
  = PΩ0(A′(xG + e, G) = e, e ∈ E, Syst(G) ∈ G′m,t) / PΩ0(e ∈ E, Syst(G) ∈ G′m,t)
  ≤ PΩ0(A′(xG + e, G) = e, e ∈ E, G ∈ Gm,t) / (PΩ0(e ∈ E)·PΩ0(Syst(G) ∈ G′m,t))
  = Succ(A′ | e ∈ E, G ∈ Gm,t) · PΩ0(G ∈ Gm,t) / PΩ0(Syst(G) ∈ G′m,t)
  = Succ(A′ | Ω2) / PΩ0(Syst(G) ∈ Sk×n | G ∈ Gm,t).

We now consider the distinguisher D. By definition, we have

PΩ0(Syst(G) ∈ Sk×n | G ∈ Gm,t) ≥ PΩ0(Syst(G) ∈ Sk×n) − Adv(D, Gm,t).

We also have λ = PΩ0(Syst(G) ∈ Sk×n), the proportion of non-singular binary
k × k matrices. Putting everything together, we get

ε ≤ Succ(A | Ω3) ≤ Succ(A′ | Ω2) / (λ − Adv(D, Gm,t))

and

Succ(A′ | Ω2) ≥ λε − ε·Adv(D, Gm,t).

We easily conclude that, if D has an advantage smaller than λ/2 for Gm,t, then
A′ has a success probability over Ω2 greater than λε/2.

C Constant Weight Encoding

Producing the injective mapping ϕ : {0,1}^ℓ → Wn,t that we need for the
hybrid scheme is not an easy task. Existing solutions [8,17,18] are all based on a
(source) encoder Wn,t → {0,1}* whose decoder is used for processing binary
data. Unfortunately they all have either a high computation cost or a variable
length encoder.

Here, we use another encoder which uses a new recursive dichotomic model
for the constant weight words. Let x = (x_L ∥ x_R) ∈ Wn,t, with n = 2^m, where
x_L and x_R have length n/2 = 2^(m−1), and let i = wH(x_L). We define

F_{m,t}(x) = nil                                    if t ∈ {0, 2^m}
F_{m,t}(x) = ⟨i, F_{m−1,i}(x_L), F_{m−1,t−i}(x_R)⟩  otherwise,

where ⟨a, nil⟩ = ⟨nil, a⟩ = a. Any element of Wn,t is uniquely transformed into a
finite sequence of integers. If x ∈ Wn,t is chosen randomly and uniformly, then
the distribution of the head element i of F_{m,t}(x) is

Prob(i) = C(n/2, i)·C(n/2, t−i) / C(n, t),   i ∈ {0, 1, 2, ..., t},

where C(a, b) denotes the binomial coefficient. The sequences of integers produced
by F_{m,t} can be modeled by a stochastic process with the above probabilities.
We use an adaptive arithmetic source encoder to encode them. This allows us to
produce a nearly optimal encoder from which we build a fast and efficient mapping
ϕ : {0,1}^ℓ → Wn,t. For values of (m, t) of practical cryptographic interest we
always have ℓ ≥ ⌊log₂ C(n, t)⌋ − 1.
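The recursive decomposition itself is a few lines of Python (our own illustrative transcription; the adaptive arithmetic coder that turns the integer sequence into bits is not shown):

```python
def F(m, t, x):
    # Dichotomic decomposition of a weight-t word x of length 2^m into
    # the sequence of left-half weights, as in the definition above.
    n = len(x)
    assert n == 2 ** m and sum(x) == t
    if t in (0, n):              # all-zero or all-one word: fully determined
        return []
    xL, xR = x[:n // 2], x[n // 2:]
    i = sum(xL)                  # head element: weight of the left half
    return [i] + F(m - 1, i, xL) + F(m - 1, t - i, xR)

# a weight-3 word of length 8 is encoded by its sequence of split weights
assert F(3, 3, [0, 1, 0, 0, 1, 1, 0, 0]) == [1, 1, 0, 2]
```

The inverse mapping reads the integers back in the same recursion order, which is what makes the transformation bijective.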

D Berlekamp Trace Algorithm

The Berlekamp trace algorithm was originally published in [3]. The following
presentation is inspired from [10]. This algorithm is very efficient for finite
fields of small characteristic. The trace function Tr(·) of F_{2^m} over F_2 is
defined by

Tr(z) = z + z^2 + z^(2^2) + ... + z^(2^(m−1));

it maps the field F_{2^m} onto its ground field F_2. A key property of the trace
function is that if (β1, ..., βm) is any basis of F_{2^m} over F_2, then every
element α ∈ F_{2^m} is uniquely represented by the binary m-tuple

(Tr(β1·α), ..., Tr(βm·α)).

The basic idea of the Berlekamp trace algorithm is that any f(z) ∈ F_{2^m}[z]
with f(z) | z^(2^m) − z splits into the two polynomials

g(z) = gcd(f(z), Tr(β·z))  and  h(z) = gcd(f(z), 1 + Tr(β·z)).

The above property of the trace ensures that if β iterates through the basis
(β1, ..., βm), we can separate all the roots of f(z) (see Figure 8).

BTA(σ, i):
    if deg(σ) ≤ 1 then
        return rootof(σ)
    σ0 ← gcd(σ(z), Tr(βi·z))
    σ1 ← gcd(σ(z), 1 + Tr(βi·z))
    return BTA(σ0, i + 1), BTA(σ1, i + 1)

Berlekamp trace algorithm(σ):
    return BTA(σ, 1)

Fig. 8. Pseudo code for the Berlekamp trace algorithm
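As an illustration, here is a self-contained Python sketch of the recursion over the small field F_16 = F_2[x]/(x^4 + x + 1). The toy field and all helper names are ours; the decoder in the paper runs the same recursion over F_{2^m} with m between 11 and 16.

```python
M, MOD = 4, 0b10011                          # F_16 = F_2[x]/(x^4 + x + 1)

def gmul(a, b):                              # multiplication in F_16
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0b10000:
            a ^= MOD
        b >>= 1
    return r

def ginv(a):                                 # a^(-1) = a^14, since a^15 = 1
    r = 1
    for _ in range(14):
        r = gmul(r, a)
    return r

def pnorm(p):                                # drop leading zero coefficients
    while p and p[-1] == 0:
        p = p[:-1]
    return p

def pmod(a, b):                              # remainder of a by b in F_16[z]
    a = pnorm(a[:])
    while len(a) >= len(b):
        c, s = gmul(a[-1], ginv(b[-1])), len(a) - len(b)
        for i, bc in enumerate(b):
            a[s + i] ^= gmul(c, bc)
        a = pnorm(a)
    return a

def pgcd(a, b):                              # monic gcd in F_16[z]
    a, b = pnorm(a[:]), pnorm(b[:])
    while b:
        a, b = b, pmod(a, b)
    inv = ginv(a[-1])
    return [gmul(inv, c) for c in a]

def pmul(a, b):                              # product in F_16[z]
    r = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            r[i + j] ^= gmul(x, y)
    return r

def trace_poly(beta):                        # Tr(beta*z) = sum (beta*z)^(2^j)
    p = [0] * (2 ** (M - 1) + 1)
    for j in range(M):
        p[2 ** j] = beta
        beta = gmul(beta, beta)
    return p

def bta(sigma, i, basis=(1, 2, 4, 8)):       # BTA(sigma, i) of Figure 8
    sigma = pnorm(sigma[:])
    if len(sigma) <= 2:                      # deg(sigma) <= 1: rootof(sigma)
        return [gmul(sigma[0], ginv(sigma[1]))] if len(sigma) == 2 else []
    tp = trace_poly(basis[i])
    s0 = pgcd(sigma, tp)                     # gcd(sigma, Tr(beta_i * z))
    s1 = pgcd(sigma, [tp[0] ^ 1] + tp[1:])   # gcd(sigma, 1 + Tr(beta_i * z))
    return bta(s0, i + 1) + bta(s1, i + 1)

sigma = [1]
for root in (3, 7, 12):                      # sigma(z) = (z+3)(z+7)(z+12)
    sigma = pmul(sigma, [root, 1])
assert sorted(bta(sigma, 0)) == [3, 7, 12]
```

The basis (1, 2, 4, 8) is {1, x, x², x³}; since two distinct field elements differ in at least one trace coordinate, every root is isolated after at most m levels of the recursion.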

E Patterson Algorithm

The Patterson algorithm [14] solves the Goppa code key equation: given R(z)
and g(z) in F_{2^m}[z], with g(z) irreducible of degree t, find σ(z) of degree at
most t such that

R(z)σ(z) = σ′(z) mod g(z),

where σ′(z) denotes the formal derivative (d/dz)σ(z). We write σ(z) = σ0(z)^2 +
z·σ1(z)^2. Since σ′(z) = σ1(z)^2, we have

(1 + zR(z))σ1(z)^2 = R(z)σ0(z)^2 mod g(z).

Because g(z) is irreducible, R(z) can be inverted modulo g(z). We put h(z) =
z + R(z)⁻¹ mod g(z) and we have

h(z)σ1(z)^2 = σ0(z)^2 mod g(z).

The mapping f(z) → f(z)^2 mod g(z) is bijective and linear over F_2^(tm), so
there is a unique polynomial S(z) such that S(z)^2 = h(z) mod g(z). We have

S(z)σ1(z) = σ0(z) mod g(z).

The polynomials σ0(z), σ1(z) are the unique solution of the system

S(z)σ1(z) = σ0(z) mod g(z)
deg σ0 ≤ ⌊t/2⌋                                                  (3)
deg σ1 ≤ ⌊(t − 1)/2⌋

The three steps of the algorithm are the following:

1. Compute h(z) = z + R(z)⁻¹ mod g(z) using the extended Euclidean algorithm.
2. Compute S(z) = √h(z) mod g(z). If s(z) such that s(z)^2 = z mod g(z) has
   been precomputed and h(z) = h0 + h1·z + ... + h_(t−1)·z^(t−1), we have

   S(z) = Σ_{i=0}^{⌊(t−1)/2⌋} h_(2i)^(2^(m−1)) z^i + s(z)·Σ_{i=0}^{⌊t/2⌋−1} h_(2i+1)^(2^(m−1)) z^i

3. Compute (σ0(z), σ1(z)) as in (3) using the extended Euclidean algorithm.
   The polynomial σ(z) = σ0(z)^2 + z·σ1(z)^2 is returned.
Merkle Tree Traversal Revisited

Johannes Buchmann, Erik Dahmen, and Michael Schneider

Technische Universität Darmstadt
Department of Computer Science
Hochschulstraße 10, 64289 Darmstadt, Germany
{buchmann,dahmen,mischnei}@[Link]

Abstract. We propose a new algorithm for computing authentication
paths in the Merkle signature scheme. Compared to the best algorithm for
this task, our algorithm reduces the worst case running time considerably.

Keywords: Authentication path computation, digital signatures, Merkle
signatures, Merkle tree traversal, post-quantum cryptography.

1 Introduction
Digital signatures are extremely important for the security of computer networks
such as the Internet. For example, digital signatures are widely used to ensure
authenticity and integrity of updates for operating systems and other software
applications. Currently used signature schemes like RSA and ECDSA base their
security on the hardness of factoring and computing discrete logarithms. In the
past 20 years, there has been significant progress in solving these problems,
which is why the key sizes for RSA and ECDSA are constantly increased [9]. The
security of RSA and ECDSA is also threatened by large quantum computers
that, if built, are able to solve the underlying problems in polynomial time and thus
are able to completely break RSA and ECDSA [12]. The research on alternative
signature schemes, so-called post quantum signature schemes, is therefore of
extreme importance.
One of the most interesting post-quantum signature schemes is the Merkle
signature scheme (MSS)[10]. Its security can be reduced to the collision resis-
tance of the used hash function [4]. The best known quantum algorithm to find
collisions of hash functions achieves only a square root speed-up compared to the
birthday attack [6]. Therefore, the security of MSS is only marginally affected
if large quantum computers are built. If a specific hash function is found to be
insecure, MSS is easily saved by using a new, secure hash function. This makes
MSS an intriguing candidate for a post-quantum signature scheme. It is therefore
important to implement the Merkle signature scheme as efficiently as possible.
In recent years, many improvements for MSS were proposed [2, 3, 5, 11]. With
those improvements, the performance of MSS is now competitive. However, sign-
ing with MSS is in most cases still slower than signing with ECDSA. This paper
proposes an MSS improvement that reduces the signing time.
The time required for generating a Merkle signature is dominated by the time
for computing the authentication path, which later allows the verifier to deduce

J. Buchmann and J. Ding (Eds.): PQCrypto 2008, LNCS 5299, pp. 63–78, 2008.
© Springer-Verlag Berlin Heidelberg 2008

the validity of the one-time verification key from the validity of the MSS public
key. Current algorithms [1, 7, 10, 13, 14] for computing authentication paths
have fairly unbalanced running times. The best case runtime of those algorithms
is significantly shorter than the worst case runtime. So the computation of some
authentication paths is very slow while other authentication paths can be com-
puted very quickly.
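For context, the verifier's side of this process is cheap and balanced: it hashes its way from the one-time verification key (the leaf) up to the root along the authentication path. A minimal sketch (not from the paper; the hash f, the leaf computation, and the helper names are placeholders):

```python
import hashlib

def f(l, r):
    # inner-node rule: hash of the concatenated children
    return hashlib.sha256(l + r).digest()

def leafcalc(j):
    # stand-in for hashing a one-time verification key
    return hashlib.sha256(j.to_bytes(4, "big")).digest()

def build(H):
    """All levels of a height-H Merkle tree; levels[h][j] = y_h[j]."""
    level = [leafcalc(j) for j in range(1 << H)]
    levels = [level]
    while len(level) > 1:
        level = [f(level[i], level[i + 1]) for i in range(0, len(level), 2)]
        levels.append(level)
    return levels

def auth_path(levels, phi):
    # the sibling of the node on the leaf-to-root path, on each height
    return [levels[h][(phi >> h) ^ 1] for h in range(len(levels) - 1)]

def verify(leaf, phi, path, root):
    """Recompute the root from leaf phi and its authentication path."""
    node = leaf
    for h, sib in enumerate(path):
        node = f(sib, node) if (phi >> h) & 1 else f(node, sib)
    return node == root

levels = build(4)
root = levels[-1][0]
assert all(verify(leafcalc(phi), phi, auth_path(levels, phi), root)
           for phi in range(16))
```

The signer's problem, addressed in this paper, is producing `auth_path` for successive leaves without storing the whole tree.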
Here we propose an authentication path algorithm which is significantly faster
in the worst case than the best algorithm known so far. This is Szydlo’s algorithm
from [13] which provides the optimal time-memory trade-off. In fact, the worst
case runtime of our algorithm is very close to its average case runtime which,
in turn, equals the average case runtime of the best known algorithm proposed
in [13]. The idea of our algorithm is to balance the number of leaves that are
computed in each authentication path computation, since leaves are by far the
most expensive nodes in the Merkle tree. All other known approaches balance
the number of nodes. This does not balance the running time since computing
an inner node only requires one hash function evaluation, while computing a
leaf takes several hundred hash function evaluations; this is because leaves are
essentially one-time verification keys and thus the cost for computing a leaf is
determined by the key pair generation cost of the respective one-time signature
scheme. This problem is pointed out in [1, 11] but no solution has been provided
so far. Our algorithm balances the number of leaves that are computed in each
round. Inner nodes are computed as required and since their cost is negligible
compared to leaves, the worst case time required by our algorithm is extremely
close to the average case time. To be more precise, for each authentication path
our algorithm computes H/2 leaves and 3/2(H − 3) + 1 inner nodes in the worst
case and (H − 1)/2 leaves and (H − 3)/2 inner nodes on average, where H is the
height of the Merkle tree. Our algorithm needs memory to store 3.5H − 4 nodes.
Previous work. There are two different approaches to compute authentication
paths. In [10] Merkle proposes to compute each authentication node separately.
This idea is adopted by Szydlo [14], where he implements a better scheduling of
the node calculations and achieves the optimal trade-off, that is O(H) time and
O(H) space. In [13], Szydlo further improves the constants. For each authen-
tication path his algorithm computes H nodes of the Merkle tree and requires
storage for 3H − 2 nodes.
The second approach is called fractal Merkle tree traversal [7]. This approach
splits the Merkle tree into smaller subtrees and stores a stacked series of sub-
trees that contain authentication paths for several succeeding leaves. Varying the
height h of the subtrees allows a trade-off between time and space needed for the
tree traversal. Using the low space solution (h = log H) requires O(H/ log H)
time and O(H^2 / log H) space. In [1], the authors improve the constants of this
algorithm and prove the optimality of the fractal time-memory trade-off.
Organisation. Section 2 describes a simplified version of our algorithm for
Merkle trees of even height. The general algorithm is presented in Appendix A.
Section 3 compares the new algorithm with that of Szydlo [13]. Section 4 states
our conclusion. Appendix B considers the computation of leaves using a PRNG.

2 Authentication Path Computation


In this section we describe our new algorithm to compute authentication paths.
It is based on Szydlo’s algorithm from [13]. We describe the algorithm in detail,
prove its correctness, and estimate the worst case and average case runtime as
well as the required space.

Definitions and notations. In the following, H ≥ 2 denotes the height of the
Merkle tree. The index of the current leaf is denoted by ϕ ∈ {0, . . . , 2^H − 1}. The
nodes in the Merkle tree are denoted by y_h[j], where h = 0, . . . , H denotes the
height of the node in the tree (leaves have height 0 and the root has height H)
and j = 0, . . . , 2^{H−h} − 1 denotes the position of this node in the tree counting
from left to right. Further, let f : {0, 1}* → {0, 1}^n be a cryptographic hash
function. Using this notation, inner nodes of a Merkle tree are computed as

    y_h[j] = f( y_{h−1}[2j] || y_{h−1}[2j + 1] ),   (1)

for h = 1, . . . , H and j = 0, . . . , 2^{H−h} − 1.

Next, we define the value τ. In round ϕ ∈ {0, . . . , 2^H − 1}, we define τ as the
height of the first parent of leaf ϕ which is a left node. If leaf ϕ is a left node
itself, then τ = 0. Otherwise τ is given as τ = max{h : 2^h | (ϕ + 1)}. Figure 1
shows an example.

Fig. 1. The height of the first parent of leaf ϕ = 3 that is a left node is τ = 2. The
dashed nodes denote the authentication path for leaf ϕ. The arrows indicate the path
from leaf ϕ to the root.

The value τ tells us on which heights the authentication path for leaf ϕ + 1
requires new nodes. It requires new right nodes on heights h = 0, . . . , τ − 1 and
a single new left node on height τ .
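Concretely, τ is the number of trailing zero bits of ϕ + 1. A small illustrative helper (not from the paper):

```python
def tau(phi):
    """Height of the first left-node parent of leaf phi: the largest h
    with 2^h dividing phi + 1 (0 if leaf phi is itself a left node)."""
    h, x = 0, phi + 1
    while x % 2 == 0:
        x //= 2
        h += 1
    return h

assert tau(3) == 2   # the example from Figure 1
assert tau(0) == 0   # left leaves have tau = 0
assert tau(7) == 3
```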
Computing inner nodes. A basic tool to compute inner nodes of a Merkle
tree is the treehash algorithm shown in Algorithm 1. This algorithm uses a stack
Stack with the usual push and pop operations and the Leafcalc(ϕ) routine,
which computes the ϕth leaf (that is, it computes the ϕth one-time key pair and
obtains the leaf from the one-time verification key). To compute a node on height
h, Algorithm 1 must be executed 2^h times and requires the leaf indices to be
input successively from left to right, i.e. ϕ = 0, . . . , 2^h − 1. In total, the
computation of 2^h leaves and 2^h − 1 hashes (inner nodes) is required. After the
last call the stack contains one node, the desired inner node on height h. The
treehash algorithm stores at most h nodes, so-called tail nodes, on the stack.

Algorithm 1. Treehash
Input: Leaf index ϕ, stack Stack
Output: Updated stack Stack

1. Leaf ← Leafcalc(ϕ)
2. while Leaf has the same height in the tree as the top node on Stack do
   (a) Top ← Stack.pop()
   (b) Leaf ← f(Top || Leaf)
3. Stack.push(Leaf)
4. Return Stack
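A sketch of Algorithm 1 in Python (illustrative only; f and leafcalc are stand-ins for the scheme's hash function and one-time key generation), checked against a naive recursive tree computation:

```python
import hashlib

def f(left, right):
    # inner-node rule (1): hash of the concatenated children
    return hashlib.sha256(left + right).digest()

def leafcalc(phi):
    # stand-in for the one-time key pair generation (assumption)
    return hashlib.sha256(phi.to_bytes(4, "big")).digest()

def treehash_update(phi, stack):
    """One execution of Algorithm 1 on a stack of (node, height) pairs:
    push leaf phi, merging nodes of equal height on the way."""
    node, height = leafcalc(phi), 0
    while stack and stack[-1][1] == height:
        top, _ = stack.pop()
        node, height = f(top, node), height + 1
    stack.append((node, height))
    return stack

def root(h):
    """2^h updates with leaf indices 0, ..., 2^h - 1 leave one node of height h."""
    stack = []
    for phi in range(1 << h):
        treehash_update(phi, stack)
    (node, height), = stack  # exactly one node remains
    assert height == h
    return node

def node_at(h, j):
    # naive recursive computation of y_h[j] for comparison
    return leafcalc(j) if h == 0 else f(node_at(h - 1, 2 * j), node_at(h - 1, 2 * j + 1))

assert root(3) == node_at(3, 0)
```

Note how the stack never holds two nodes of the same height, which is why at most h tail nodes are stored.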

2.1 Our Authentication Path Algorithm

We now describe our Merkle tree traversal algorithm in detail. We begin with
a simplified version that requires the height of the Merkle tree to be even. The
general version, which comprises a time-memory trade-off suggested by Szydlo,
is discussed in Appendix A. Like Szydlo’s algorithm [13], we deploy two different
strategies to compute authentication nodes, depending on whether the node is
a left child (left authentication node, left node) or a right one. The difference to
Szydlo's algorithm is that we schedule only the computation of leaves and not
of tree nodes in general.
Data structures. Our algorithm uses the following data structures:

• Authh, h = 0, . . . , H − 1. An array of nodes that stores the current authentication path.
• Retain. The single right authentication node on height H − 2.
• Stack. A stack of nodes with the usual push and pop operations.
• Treehashh , h = 0, . . . , H −3. These are instances of the treehash algorithm.
All these treehash instances share the stack Stack. Further, each instance
has the following entries and methods.
– Treehashh .node. This entry stores a single tail node. This is the first
node Algorithm 1 pushes on the stack. The remaining tail nodes are
pushed on the stack Stack.
– Treehashh .initialize(ϕ). This method initializes this instance with the
index ϕ of the leaf to begin with.
– Treehashh .update(). This method executes Algorithm 1 once, meaning
that it computes the next leaf (Line 1) and performs the necessary hash
function evaluations to compute this leaf’s parents (Line 2b), if tail nodes
are stored on the stack.

– Treehashh .height. This entry stores the height of the lowest tail node
stored by this treehash instance, either on the stack Stack or in the
entry Treehashh .node. If Treehashh does not store any tail nodes
Treehashh .height = h holds. If Treehashh is finished or not initialized
Treehashh .height = ∞ holds.
• Keeph , h = 0, . . . , H − 2. An array of nodes that stores certain nodes for the
efficient computation of left authentication nodes.

Initialization. The initialization of our algorithm is done during the MSS key
pair generation. We store the authentication path for the first leaf (ϕ = 0): Authh
= yh [1], h = 0, . . . , H − 1. We also store the next right authentication node in
the treehash instances: Treehashh .node = yh [3], for h = 0, . . . , H − 3. Finally
we store the single next right authentication node on height H − 2: Retain =
yH−2 [3]. Figure 2 shows which nodes are stored during the initialization.

Fig. 2. Initialization of our algorithm. Dashed nodes denote the authentication path
for leaf ϕ = 0. Dash-dotted nodes denote the nodes stored in the treehash instances
and the single node Retain.

Update and output phase. In the following we describe the update and output
phase of our algorithm. Algorithm 2 shows a pseudo-code description. Input is
the index of the current leaf ϕ ∈ {0, . . . , 2^H − 2}, the height of the Merkle tree
H ≥ 2, where H must be even, and the algorithm state Auth, Keep, Retain,
and Treehash prepared in previous rounds or during the initialization. Our
algorithm first generates the authentication path for the next leaf ϕ + 1 and
then computes the algorithm state for the next step. Output is the authentication
path for leaf ϕ + 1.
Computing left authentication nodes. We review the computation of left nodes
due to [13]. The basic idea is to store certain right nodes in an array Keeph,
h = 0, . . . , H − 2 and use them later to compute left authentication nodes using
only one evaluation of the hash function.

If in round ϕ ∈ {0, . . . , 2^H − 2} the parent of leaf ϕ on height τ + 1 is a
left node (this can be verified by checking if ⌊ϕ/2^{τ+1}⌋ is even), then Authτ
is a right node and we store it in Keepτ (Line 2). In round ϕ′ = ϕ + 2^τ the
authentication path for leaf ϕ′ + 1 requires a new left authentication node on
height τ′ = τ + 1. The left child of this authentication node is the authentication
node on height τ′ − 1 of leaf ϕ′. The right child of this node was stored in
Keepτ′−1 in round ϕ. The new left authentication node on height τ′ is then
computed as Authτ′ = f(Authτ′−1 || Keepτ′−1) (Line 4a). For those rounds ϕ
where τ = 0 holds, the single new left node required for the authentication path
of leaf ϕ + 1 is the current leaf ϕ. We compute it using the algorithm Leafcalc,
i.e. we set Auth0 = Leafcalc(ϕ) (Line 3).
Computing right authentication nodes. Unlike left authentication nodes, right au-
thentication nodes must be computed from scratch, i.e. starting from the leaves.
This is because none of their child nodes were used in previous authentication
paths. We use one Treehash instance for each height where right authentication
nodes must be computed, i.e. for heights h = 0, . . . , H − 3.
In round ϕ ∈ {0, . . . , 2^H − 2}, the authentication path for leaf ϕ + 1 requires
new right authentication nodes on heights h = 0, . . . , τ − 1. Our algorithm is
constructed such that for h ≤ H − 3 these nodes are already computed and
stored in Treehashh .node. If a new authentication node is required on height
h = H − 2 we copy it from the node Retain. Note that there is only one new
right node required on this height during the whole runtime of Algorithm 2.
The authentication path for leaf ϕ + 1 is obtained by copying the nodes from
Treehashh .node and Retain to Authh for h = 0, . . . , τ − 1 (Line 4b).
After copying the right nodes, all treehash instances on height h = 0, . . . , τ −1
are initialized for the computation of the next right authentication node. The
index of the leaf to begin with is ϕ + 1 + 3 · 2^h. If ϕ + 1 + 3 · 2^h ≥ 2^H holds, then
no new right node will be required on this height and the treehash instance is
not initialized anymore (Line 4c).
The last step of the algorithm is to update the treehash instances using the
Treehashh .update() method (Line 5). We perform H/2 − 1 updates in each
round. One update corresponds to one execution of Algorithm 1, i.e. one update
requires the computation of one leaf and the necessary hash function evaluations
to compute this leaf’s parents. We use the strategy from [13] to decide which
of the H − 2 treehash instances receives an update. The treehash instance that
receives an update is the instance where Treehashh .height contains the smallest
value. If there is more than one such instance, we choose the one with the lowest
index (Line 5a).

2.2 Correctness
In this section we show the correctness of Algorithm 2. First we show that the
budget of H/2 − 1 updates per round is sufficient for the treehash instances to
compute the required authentication nodes on time. Then we will show that it
is possible for all treehash instances to share a single stack.
Nodes are computed on time. If Treehashh is initialized in round ϕ, the
authentication node on height h computed by this instance is required in round
ϕ + 2^{h+1}. During these 2^{h+1} rounds there are (H − 2)2^h updates available and
Treehashh requires 2^h updates to complete.

Algorithm 2. Authentication path computation, simplified version
Input: ϕ ∈ {0, . . . , 2^H − 2}, H ≥ 2 even, and the algorithm state.
Output: Authentication path for leaf ϕ + 1

1. Let τ = 0 if leaf ϕ is a left node or let τ be the height of the first parent of leaf ϕ
   which is a left node:
       τ ← max{h : 2^h | (ϕ + 1)}
2. If the parent of leaf ϕ on height τ + 1 is a left node, store the current authentication
   node on height τ in Keepτ:
       if ⌊ϕ/2^{τ+1}⌋ is even and τ < H − 1 then Keepτ ← Authτ
3. If leaf ϕ is a left node, it is required for the authentication path of leaf ϕ + 1:
       if τ = 0 then Auth0 ← Leafcalc(ϕ)
4. Otherwise, if leaf ϕ is a right node, the authentication path for leaf ϕ + 1 changes
   on heights 0, . . . , τ:
       if τ > 0 then
   (a) The authentication path for leaf ϕ + 1 requires a new left node on height τ. It is
       computed using the current authentication node on height τ − 1 and the node
       on height τ − 1 previously stored in Keepτ−1. The node stored in Keepτ−1
       can then be removed:
           Authτ ← f(Authτ−1 || Keepτ−1), remove Keepτ−1
   (b) The authentication path for leaf ϕ + 1 requires new right nodes on heights
       h = 0, . . . , τ − 1. For h ≤ H − 3 these nodes are stored in Treehashh and for
       h = H − 2 in Retain:
           for h = 0 to τ − 1 do
               if h ≤ H − 3 then Authh ← Treehashh.node
               if h = H − 2 then Authh ← Retain
   (c) For heights 0, . . . , τ − 1 the treehash instances must be initialized anew. The
       treehash instance on height h is initialized with the start index ϕ + 1 + 3 · 2^h
       if this index is smaller than 2^H:
           for h = 0 to τ − 1 do
               if ϕ + 1 + 3 · 2^h < 2^H then Treehashh.initialize(ϕ + 1 + 3 · 2^h)
5. Next we spend the budget of H/2 − 1 updates on the treehash instances to prepare
   upcoming authentication nodes:
       repeat H/2 − 1 times
   (a) We consider only treehash instances which are initialized and not finished. Let
       s be the index of the treehash instance whose lowest tail node has the lowest
       height. In case there is more than one such instance we choose the instance
       with the lowest index:
           s ← min{ h : Treehashh.height = min_{j=0,...,H−3} Treehashj.height }
   (b) The treehash instance with index s receives one update:
           Treehashs.update()
6. The last step is to output the authentication path for leaf ϕ + 1:
       return Auth0, . . . , AuthH−1.
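The selection rule in Line 5a can be sketched as follows (illustrative; `heights` holds Treehashh.height for h = 0, . . . , H − 3, with ∞ for finished or uninitialized instances):

```python
import math

def next_instance(heights):
    """Line 5a: pick the treehash instance whose lowest tail node has the
    lowest height; ties are broken by choosing the lowest index h."""
    lowest = min(heights)
    return None if lowest == math.inf else heights.index(lowest)

assert next_instance([math.inf, 1, 1, 3]) == 1  # tie broken by lowest index
assert next_instance([0, 2, 2]) == 0
```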
On height i = 0, . . . , h − 1, a new instance is initialized every 2^{i+1} rounds, each
requiring 2^i updates to complete. During the 2^{h+1} available rounds, 2^{h+1}/2^{i+1}
treehash instances are initialized on height i, for i = 0, . . . , h − 1. All these lower
instances are completed before Treehashh (Line 5a).

In addition, active treehash instances on heights i = h + 1, . . . , H − 3 might
receive updates until their lowest tail node has height h. Once they have a tail
node on height h they don't receive further updates while the instance on height
h is active (Line 5a). Computing a tail node on height h requires at most 2^h
updates.
The number of updates required to complete Treehashh on time is at most

    Σ_{i=0}^{h−1} (2^{h+1}/2^{i+1}) · 2^i + 2^h + Σ_{i=h+1}^{H−3} 2^h = (H − 2)2^h.   (2)

This shows that the budget of H/2 − 1 updates per round suffices. For h = H − 3
this bound is tight.
Sharing a single stack works. To show that it is possible for all treehash
instances to share a single stack, we have to show that if Treehashh receives
an update and has previously stored tail nodes on the stack, all these tail nodes
are on top of the stack.
When Treehashh receives its first update, the height of the lowest tail node
of Treehashi , i ∈ {h+1, . . . , H −3} is at least h. Otherwise, one of the instances
on height i would receive an update (Line 5a). This means that Treehashh is
completed before Treehashi receives another update and thus tail nodes of
higher treehash instances do not interfere with tail nodes of Treehashh .
While Treehashh is active and stores tail nodes on the stack, it is possible
that treehash instances on lower heights i ∈ {0, . . . , h − 1} receive updates and
store nodes on the stack. If Treehashi receives an update, the lowest tail node
of Treehashh has height ≥ i. This implies that Treehashi is completed before
Treehashh receives another update and therefore doesn't store tail nodes on the
stack anymore.

2.3 Time and Space Bounds


This section considers the time and space requirements of Algorithm 2. We will
show that
i) On average, our algorithm computes (H − 1)/2 leaves and (H − 3)/2 inner
nodes.
ii) The number of tail nodes stored on the stack is bounded by H − 4.
iii) The number of inner nodes computed by all treehash instances per round is
bounded by 3/2(H − 3).
iv) The number of nodes stored in Keep is bounded by H/2 + 1.
For the space, we have to add the H nodes stored in Auth, the H − 2 nodes
stored in the treehash instances, and the single node stored in Retain. For the
worst case time, we have to add the H/2 − 1 leaves to compute right nodes and
one leaf and one inner node to compute left nodes (Lines 3, 4a in Algorithm 2).
All together we get the following theorem:

Theorem 1. Let H ≥ 2 be even. Algorithm 2 needs to store at most 3.5H − 4
nodes and needs to compute at most H/2 leaves and 3/2(H − 3) + 1 inner nodes
per step to successively compute authentication paths. On average, Algorithm 2
computes (H − 1)/2 leaves and (H − 3)/2 inner nodes per step.

Average costs. We now estimate the average cost of our algorithm in terms of
leaves (L) and inner nodes (I) to compute. We begin with the right nodes. On
height h = 0 there are 2^{H−1} right leaves to compute. On heights h = 1, . . . , H − 3,
there are 2^{H−h−1} right nodes to compute. Each of these nodes requires the
computation of 2^h leaves and 2^h − 1 inner nodes. For the left nodes, we must
compute one leaf and one inner node every second step, alternating. This makes
a total of 2^{H−1} leaves and inner nodes. Summing up yields

    ( Σ_{h=0}^{H−3} 2^{H−h−1} · 2^h + 2^{H−1} ) L + ( Σ_{h=1}^{H−3} 2^{H−h−1} · (2^h − 1) + 2^{H−1} ) I   (3)

    = ( (H − 1)/2 · 2^H ) L + ( (H − 3)/2 · 2^H + 4 ) I   (4)

as total number of leaves and inner nodes that must be computed. To obtain
the average cost per step we divide by 2^H.
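A quick numeric sanity check (not from the paper) that the sums in (3) match the closed forms in (4):

```python
def totals(H):
    """Total leaves and inner nodes over all 2^H rounds, summed as in (3)."""
    leaves = sum(2 ** (H - h - 1) * 2 ** h for h in range(0, H - 2)) + 2 ** (H - 1)
    inner = sum(2 ** (H - h - 1) * (2 ** h - 1) for h in range(1, H - 2)) + 2 ** (H - 1)
    return leaves, inner

# matches the closed forms (H-1)/2 * 2^H and (H-3)/2 * 2^H + 4 of (4)
for H in (4, 10, 14, 20):
    L, I = totals(H)
    assert L == (H - 1) * 2 ** H // 2
    assert I == (H - 3) * 2 ** H // 2 + 4
```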
Space required by the stack. We will show that the stack stores at most one
tail node on each height h = 0, . . . , H − 5 at a time.
Treehashh , h ∈ {0, . . . , H − 3} stores up to h tail nodes on different heights
to compute the authentication node on height h. The tail node on height h − 1 is
stored in Treehashh .node and the remaining tail nodes on heights 0, . . . , h − 2
are stored on the stack. When Treehashh receives its first update, the following
two conditions hold:
1. All treehash instances on heights < h are either empty or completed and
store no tail nodes on the stack.
2. All treehash instances on heights > h are either empty or completed or have
tail nodes of height at least h.
Both conditions follow directly from Line 5a in Algorithm 2. These conditions
imply that while Treehashh is active, all tail nodes on the stack that have
height at most h − 2 are on different heights.
If a treehash instance on height i = h + 1, . . . , H − 3 stores a tail node on
the stack, then all treehash instances on heights i + 1, . . . , H − 3 have tail nodes
of height at least i, otherwise the treehash instance on height i wouldn’t have
received any updates in the first place (recall that Treehashi .height = i holds
if Treehashi was just initialized). This implies that all tail nodes on the stack
that have height at least h and at most H − 5 are on different heights.

If a treehash instance on height j < h is initialized while Treehashh is active,
the same arguments can be applied to the instance on height j.
In total, there is at most one tail node on each height h = 0, . . . , H − 5 which
bounds the number of nodes stored on the stack by H − 4. This bound is tight
for round ϕ = 2^{H−1} − 2, before the update that completes the treehash instance
on height H − 3.
Inner nodes computed by treehash. For now we assume that the maximum
number of inner nodes is computed in the following case: TreehashH−3 receives
all u = H/2 − 1 updates and is completed in this round. On input an index ϕ,
the number of inner nodes computed by treehash in the worst case equals the
height of the first parent of leaf ϕ which is a left node, if the corresponding tail
nodes are stored on the stack. On height h, a left node occurs every 2^h leaves,
which means that every 2^h updates at most h inner nodes are computed by
treehash. This implies that during the u available updates, at most ⌈u/2^h⌉ inner
nodes are computed on height h, for h = 1, . . . , ⌊log₂ u⌋. The last update
requires the computation of H − 3 = 2u − 1 inner nodes to obtain the desired
node on height H − 3, i.e. completing this treehash instance. So far only ⌊log₂ u⌋
inner nodes were counted, so additional 2u − 1 − ⌊log₂ u⌋ inner nodes must be
added. In total, we get the following upper bound for the number of inner nodes
computed per round:

    B = Σ_{h=1}^{⌊log₂ u⌋} ⌈u/2^h⌉ + 2u − 1 − ⌊log₂ u⌋.   (5)

In round ϕ = 2^{H−1} − 2 this bound is tight. This is the last round before the
treehash instance on height H − 3 must be completed and as we saw in Section
2.2, all available updates are required in this case. The desired upper bound is
estimated as follows:

    B ≤ Σ_{h=1}^{⌊log₂ u⌋} ( u/2^h + 1 ) + 2u − 1 − ⌊log₂ u⌋
      = u · Σ_{h=1}^{⌊log₂ u⌋} 1/2^h + 2u − 1 = u ( 1 − 1/2^{⌊log₂ u⌋} ) + 2u − 1
      ≤ u ( 1 − 1/(2u) ) + 2u − 1 = 3u − 3/2 = (3/2)(H − 3).
The next step is to show that the above mentioned case is indeed the worst case.
If a treehash instance on height < H − 3 receives all updates and is completed
in this round, less than B hashes are required. The same holds if the treehash
instance receives all updates but is not completed in this round. The last case
to consider is the one where the u available updates are spent on treehash
instances on different heights. If the active treehash instance Treehashh stores
a tail node ν on height j, it will receive updates until it has a tail node on height
j + 1. This requires 2^j updates and the computation of 2^j inner nodes. Additional
t ∈ {1, . . . , H − j − 4} inner nodes are computed to obtain ν's parent on height
j + t + 1, if Treehashh stores tail nodes on heights j + 1, . . . , j + t on the stack
and in Treehashh .node. The next treehash instance that receives the remaining
updates has a tail node on height ≥ j. This instance computes additional inner
nodes only, if there are enough updates left to compute an inner node on height
≥ j + t, the height of the next tail node possibly stored on the stack. But this is
the same scenario that appears in the above mentioned worst case, i.e. if a node
on height j + 1 is computed, the tail nodes on the stack are used to compute its
parent on height j + t + 1 and the same instance receives the next update.
Space required to compute left nodes. First we remark that because of
Steps 2 and 4a in Algorithm 2, the node stored in Keeph−1 is removed whenever
an authentication node is stored in Keeph in the same round, h = 1, . . . , H − 2.
Next we show that if a node gets stored in Keeph, h = 0, . . . , H − 3, then
Keeph+1 is empty. To see this we have to consider in which rounds a node
is stored in Keeph+1. This happens in rounds ϕ ∈ Sa = {2^{h+1} − 1 + a ·
2^{h+3}, . . . , 2^{h+2} − 1 + a · 2^{h+3}}, a ∈ N0. In rounds ϕ = 2^h − 1 + b · 2^{h+2}, b ∈ N0,
a node gets stored in Keeph. It is straightforward to compute that ϕ ∈ Sa
implies 2a + 1/4 ≤ b ≤ 2a + 3/4, which is a contradiction to b ∈ N0.
As a result, at most H/2 nodes are stored in Keep at a time and two consec-
utive nodes can share one entry. One additional entry is required to temporarily
store the authentication node on height h (Step 2) until the node on height h − 1 is
removed (Step 4a).

3 Comparison
We now compare our algorithm with Szydlo’s algorithm from [13]. We compare
the number of leaves, inner nodes, and total hash function evaluations computed
per step in the worst case and the average case.
The computation of an inner node costs one hash function evaluation. This
follows directly from the construction rule for Merkle trees of Equation (1). The
cost to compute one leaf, in terms of hash function evaluations, depends on
the one-time signature scheme used for the MSS. The Lamport–Diffie one-time
signature scheme [8] requires 2n evaluations of the hash function, where n is the
output length of the hash function. The Winternitz one-time signature scheme
[5] roughly requires 2^w · n/w evaluations of the hash function, where w is the
Winternitz parameter. For our comparison, we use a cost of 100 hash function
evaluations for each leaf calculation.
Table 1 shows the number of leaves, inner nodes, and total hash function
evaluations computed per step in the worst case and the average case. These
values were obtained experimentally. The number of leaves and inner nodes our
algorithm requires according to Theorem 1 are given in parentheses.
This table shows that the cost for the inner nodes is negligible compared to
the cost for the leaf calculations. Our algorithm reduces the total number of hash
function evaluations required in the worst case by more than 49%, 27%, 28%, 15%
for H = 4, 10, 14, 20, respectively, even when using the comparatively low ratio

Table 1. Comparison of the worst case and average case runtime of our algorithm
and Szydlo's algorithm from [13]. The values according to Theorem 1 are given in
parentheses.

                 Our Algorithm                     Szydlo's Algorithm
H     leaves      inner nodes   hashes      leaves   inner nodes   hashes
Worst case
4     2 (2)       1 (2.5)       201         4        0             400
10    5 (5)       8 (11.5)      508         7        4             704
14    7 (7)       14 (17.5)     714         10       4             1000
20    10 (10)     24 (26.5)     1024        12       8             1208
Average case
4     1.2 (1.5)   0.6 (0.5)     120.6       1.5      0.6           150.6
10    4.0 (4.5)   3.0 (3.5)     403.0       4.5      3.5           453.5
14    6.0 (6.5)   5.0 (5.5)     605.0       6.5      5.5           655.5
20    9.0 (9.5)   8.0 (8.5)     908.0       9.5      8.5           958.5

of 100 hash function evaluations per leaf. When using larger ratios, as they
occur in practice, the advantage of our algorithm is more distinct. We state the
comparison only for Merkle trees up to a height of H = 20, since for larger
heights the MSS key pair generation becomes too inefficient so that Merkle trees
of height H > 20 cannot be used in practice [2].
For H = 4, 10, 14, 20, our algorithm needs to store 10, 31, 45, 66 nodes and
Szydlo’s algorithm needs to store 10, 28, 40, 58 nodes, respectively. Although
Szydlo’s algorithm requires slightly less storage, additional implementing effort
and possibly overhead must be taken into account when using Szydlo’s algorithm
on platforms without dynamic memory allocation. This is because Szydlo’s algo-
rithm uses separate stacks for each of the H treehash instances, where, roughly
speaking, each stack can store up to O(H) nodes but all stacks together never
store more than O(H) nodes at a time. The simple approach of reserving the
maximal required memory for each stack yields memory usage quadratic in H.
Table 1 also shows that our algorithm on average performs slightly better than
Szydlo's algorithm. This is a result of the slightly increased memory usage of our
algorithm. More importantly, comparing the average case and worst case runtime
shows that the worst case runtime of our algorithm is extremely close to its
average case runtime. This certifies that our algorithm provides balanced timings
for the authentication path generation and thus the MSS signature generation.

4 Conclusion
We proposed a new algorithm for the computation of authentication paths in a
Merkle tree. In the worst case, our algorithm is significantly faster than the best
algorithm known so far, namely Szydlo’s algorithm from [13]. In fact, the worst
case runtime of our algorithm is very close to its average case runtime which,
in turn, equals the average case runtime of Szydlo’s algorithm. The main idea
of our algorithm is to distinguish between leaves and inner nodes of the Merkle
tree and balance the number of leaves computed in each step.
In detail, our algorithm computes H/2 leaves and 3(H − 3)/2 + 1 inner nodes
in the worst case and (H − 1)/2 leaves and (H − 3)/2 inner nodes on average. For
example, we reduce the worst case cost for computing authentication paths in a
Merkle tree of height H = 20 by more than 15% compared to Szydlo’s algorithm.
When implementing our algorithm, the space bound of 3.5H − 4 nodes can be
achieved without additional effort, even on platforms that do not offer dynamic
memory allocation.
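These closed-form bounds are easy to sanity-check. The following sketch (our Python, not part of the paper) evaluates the per-step cost formulas and the space bound; the integer division in the inner-node term is our reading, since 3(H − 3)/2 + 1 is only integral after rounding for even H:

```python
def per_step_costs(H):
    """Worst-case and average per-step costs of the proposed algorithm
    (simplified version, K = 2); the floor on the inner-node term is our reading."""
    worst = (H // 2, 3 * (H - 3) // 2 + 1)   # (leaves, inner nodes)
    avg = ((H - 1) / 2, (H - 3) / 2)
    return worst, avg

def space_bound(H):
    """Storage bound of 3.5 H - 4 nodes from the conclusion."""
    return int(3.5 * H - 4)

# Reproduces the node counts 10, 31, 45, 66 quoted above for H = 4, 10, 14, 20.
print([space_bound(H) for H in (4, 10, 14, 20)])
```

For H = 20 this gives a worst case of 10 leaves and 26 inner nodes per step, matching the figures discussed in the comparison with Szydlo’s algorithm.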

References
1. Berman, P., Karpinski, M., Nekrich, Y.: Optimal trade-off for Merkle tree traversal.
Theoretical Computer Science 372(1), 26–36 (2007)
2. Buchmann, J., Coronado, C., Dahmen, E., Döring, M., Klintsevich, E.: CMSS — an
improved Merkle signature scheme. In: Barua, R., Lange, T. (eds.) INDOCRYPT
2006. LNCS, vol. 4329, pp. 349–363. Springer, Heidelberg (2006)
3. Buchmann, J., Dahmen, E., Klintsevich, E., Okeya, K., Vuillaume, C.: Merkle
signatures with virtually unlimited signature capacity. In: Katz, J., Yung, M. (eds.)
ACNS 2007. LNCS, vol. 4521, pp. 31–45. Springer, Heidelberg (2007)
4. Coronado, C.: On the security and the efficiency of the Merkle signature scheme.
Cryptology ePrint Archive, Report 2005/192 (2005), [Link]
5. Dods, C., Smart, N., Stam, M.: Hash based digital signature schemes. In: Smart,
N. (ed.) Cryptography and Coding 2005. LNCS, vol. 3796, pp. 96–115. Springer,
Heidelberg (2005)
6. Grover, L.K.: A fast quantum mechanical algorithm for database search. In: Pro-
ceedings of the Twenty-Eighth Annual Symposium on the Theory of Computing,
pp. 212–219. ACM Press, New York (1996)
7. Jakobsson, M., Leighton, T., Micali, S., Szydlo, M.: Fractal Merkle tree representa-
tion and traversal. In: Joye, M. (ed.) CT-RSA 2003. LNCS, vol. 2612, pp. 314–326.
Springer, Heidelberg (2003)
8. Lamport, L.: Constructing digital signatures from a one way function. Technical
Report SRI-CSL-98, SRI International Computer Science Laboratory (1979)
9. Lenstra, A.K., Verheul, E.R.: Selecting cryptographic key sizes. Journal of Cryp-
tology 14(4), 255–293 (2001); updated version (2004),
[Link]
10. Merkle, R.C.: A certified digital signature. In: Brassard, G. (ed.) CRYPTO 1989.
LNCS, vol. 435, pp. 218–238. Springer, Heidelberg (1990)
11. Naor, D., Shenhav, A., Wool, A.: One-time signatures revisited: Practical fast sig-
natures using fractal merkle tree traversal. In: IEEE – 24th Convention of Electrical
and Electronics Engineers in Israel, pp. 255–259 (2006)
12. Shor, P.W.: Algorithms for quantum computation: Discrete logarithms and factor-
ing. In: Proc. 35th Annual Symposium on Foundations of Computer Science, pp.
124–134. IEEE Computer Society Press, Los Alamitos (1994)
76 J. Buchmann, E. Dahmen, and M. Schneider

13. Szydlo, M.: Merkle tree traversal in log space and time (preprint, 2003),
[Link]
14. Szydlo, M.: Merkle tree traversal in log space and time. In: Cachin, C., Camenisch,
J.L. (eds.) EUROCRYPT 2004. LNCS, vol. 3027, pp. 541–554. Springer, Heidelberg
(2004)

A General Version of the Authentication Path Algorithm


We now describe the general version of our algorithm that is able to handle
Merkle trees of odd height. The general version also provides an additional trade-
off between the computation time and the storage needed, as suggested by Szydlo
in [13]. The basic idea is to store more right authentication nodes (like the
single node Retain) to obtain an even number of treehash instances that need
updating. The parameter K denotes the number of upper levels where all right
nodes are stored permanently. We must choose K ≥ 2, such that H − K is
even. Instead of a single node Retain, we now use stacks Retain_h, h = H −
K, . . . , H − 2, to store all right authentication nodes during the initialization:
Retain_h.push(y_h[2j + 3]), for h = H − K, . . . , H − 2 and j = 2^{H−h−1} − 2, . . . , 0.
The pseudo-code of the general version is shown in Algorithm 3. Including the
parameter K in Theorem 1 yields Theorem 2.

Theorem 2. Let H ≥ 2 and K ≥ 2 such that H − K is even. Algorithm 3 needs
to store at most 3H + ⌊H/2⌋ − 3K − 2 + 2^K nodes and needs to compute at most
(H − K)/2 + 1 leaves and 3(H − K − 1)/2 + 1 inner nodes per step to successively
compute authentication paths. On average, Algorithm 3 computes (H − K + 1)/2
leaves and (H − K − 1)/2 inner nodes per step.

The simplified version described in Section 2 corresponds to the choice K = 2.


The proofs for the correctness and the time and space bounds of the general
version can be obtained by substituting H − K + 2 for H in the proofs for the
simplified version.
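The trade-off of Theorem 2 can be tabulated with a few lines of Python (ours). The 2^K term in the space bound is our reconstruction of a garbled exponent; it is consistent with the simplified bound 3.5H − 4 at K = 2:

```python
def bounds(H, K):
    """Space and per-step cost bounds of Algorithm 3 (Theorem 2).
    The 2**K term is our reading of the space bound."""
    assert H >= 2 and K >= 2 and (H - K) % 2 == 0
    space = 3 * H + H // 2 - 3 * K - 2 + 2 ** K          # nodes stored
    worst = ((H - K) // 2 + 1, 3 * (H - K - 1) // 2 + 1)  # (leaves, inner)
    avg = ((H - K + 1) / 2, (H - K - 1) / 2)
    return space, worst, avg

# Larger K: more stored right nodes, fewer computations per step.
for K in (2, 4, 6):
    print(K, bounds(20, K))
```

At K = 2 and H = 20 this yields 66 stored nodes, i.e. exactly 3.5 · 20 − 4, matching the simplified version.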

B Computing Leaves Using a PRNG


In [2], the authors propose to use a forward secure PRNG to successively compute
the one-time signature keys. The benefit is that the storage cost for the private
key is reduced drastically, since only one seed must be stored instead of 2^H
one-time signature keys. Let Seed_ϕ denote the seed required to compute the
one-time key pair corresponding to leaf ϕ.
During the authentication path computation, leaves which are up to 3 · 2^{H−K−1}
steps away from the current leaf must be computed. Calling the PRNG that
many times to obtain the seed required to compute this leaf is too inefficient.
Instead we propose the following scheduling strategy that requires H − K calls
to the PRNG in each round to compute the seeds. We have to store two seeds
for each height h = 0, . . . , H − K − 1. The first (SeedActive) is used to suc-
cessively compute the leaves for the authentication node currently constructed

Algorithm 3. Authentication path computation, general version


Input: ϕ ∈ {0, . . . , 2^H − 2}, H, K and the algorithm state.
Output: Authentication path for leaf ϕ + 1

1. Let τ = 0 if leaf ϕ is a left node or let τ be the height of the first parent of leaf ϕ
which is a left node:
τ ← max{h : 2^h | (ϕ + 1)}
2. If the parent of leaf ϕ on height τ +1 is a left node, store the current authentication
node on height τ in Keepτ :
if ⌊ϕ/2^{τ+1}⌋ is even and τ < H − 1 then Keep_τ ← Auth_τ
3. If leaf ϕ is a left node, it is required for the authentication path of leaf ϕ + 1:
if τ = 0 then Auth0 ← Leafcalc(ϕ)
4. Otherwise, if leaf ϕ is a right node, the authentication path for leaf ϕ + 1 changes
on heights 0, . . . , τ :
if τ > 0 then
(a) The authentication path for leaf ϕ+1 requires a new left node on height τ . It is
computed using the current authentication node on height τ − 1 and the node
on height τ − 1 previously stored in Keepτ −1 . The node stored in Keepτ −1
can then be removed:
Authτ ← f (Authτ −1 ||Keepτ −1 ), remove Keepτ −1
(b) The authentication path for leaf ϕ + 1 requires new right nodes on heights
h = 0, . . . , τ − 1. For h ≤ H − K − 1 these nodes are stored in Treehashh and
for h ≥ H − K in Retainh :
for h = 0 to τ − 1 do
if h ≤ H − K − 1 then Authh ← Treehashh .node
if h > H − K − 1 then Authh ← Retainh .pop()
(c) For heights 0, . . . , min{τ − 1, H − K − 1} the treehash instances must be ini-
tialized anew. The treehash instance on height h is initialized with the start
index ϕ + 1 + 3 · 2^h if this index is smaller than 2^H:
for h = 0 to min{τ − 1, H − K − 1} do
if ϕ + 1 + 3 · 2^h < 2^H then Treehash_h.initialize(ϕ + 1 + 3 · 2^h)
5. Next we spend the budget of (H − K)/2 updates on the treehash instances to
prepare upcoming authentication nodes:
repeat (H − K)/2 times
(a) We consider only stacks which are initialized and not finished. Let s be the
index of the treehash instance whose lowest tail node has the lowest height.
In case there is more than one such instance, we choose the instance with the
lowest index:
s ← min{ h : Treehash_h.height() = min_{j=0,...,H−K−1} Treehash_j.height() }
(b) The treehash instance with index s receives one update:
Treehashs .update()
6. The last step is to output the authentication path for leaf ϕ + 1:
return Auth0 , . . . , AuthH−1 .
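Two steps of Algorithm 3 are easy to get wrong in an implementation, so we sketch them in Python (ours, illustrative only): the height τ of step 1 equals the number of trailing zero bits of ϕ + 1, and the instance selection of step 5(a) is a minimum over (height, index) pairs:

```python
def tau(phi):
    """Step 1: tau = max{h : 2^h divides (phi + 1)}, i.e. the number of
    trailing zero bits of phi + 1 (0 when leaf phi is a left node)."""
    return ((phi + 1) & -(phi + 1)).bit_length() - 1

def next_instance(tail_heights):
    """Step 5(a): among the active treehash instances (entry h gives the
    lowest tail-node height of Treehash_h, or None if the instance is
    uninitialized or finished), pick the one with the lowest tail node,
    breaking ties by the lowest index h."""
    active = [(ht, h) for h, ht in enumerate(tail_heights) if ht is not None]
    return min(active)[1] if active else None
```

Tuple comparison in `min` implements the tie-breaking rule directly: heights are compared first, indices second.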

by Treehash_h and the second (SeedNext) is used for upcoming right nodes
on this height. SeedNext is updated using the PRNG in each round. During
the initialization, we set SeedNext_h = Seed_{3·2^h} for h = 0, . . . , H − K − 1. In
each round, at first all seeds SeedNext_h are updated using the PRNG. If in
round ϕ a new treehash instance is initialized on height h, we copy SeedNext_h
to SeedActive_h. In that case SeedNext_h = Seed_{ϕ+1+3·2^h} holds and thus is
the correct seed to begin computing the next authentication node on height h.
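The scheduling strategy can be sketched as follows (our Python; the SHA-256-based PRNG is a stand-in for the forward secure PRNG of [2], and the field names mirror SeedActive and SeedNext from the text):

```python
import hashlib

def prng(seed):
    """Toy forward secure PRNG: returns (output, next seed)."""
    return (hashlib.sha256(b"out" + seed).digest(),
            hashlib.sha256(b"upd" + seed).digest())

class SeedSchedule:
    """Holds SeedActive_h / SeedNext_h for h = 0, ..., H-K-1."""

    def __init__(self, seed0, H, K):
        self.n = H - K
        targets = {3 * 2 ** h: h for h in range(self.n)}
        self.next_seed = [None] * self.n
        s = seed0                          # s = Seed_phi while phi counts up
        for phi in range(3 * 2 ** (self.n - 1) + 1):
            if phi in targets:
                self.next_seed[targets[phi]] = s   # SeedNext_h = Seed_{3*2^h}
            _, s = prng(s)
        self.active_seed = [None] * self.n

    def advance(self, initialized_heights=()):
        """One round: H-K PRNG calls; on heights where a treehash instance
        was just initialized, copy the updated SeedNext_h to SeedActive_h."""
        for h in range(self.n):
            _, self.next_seed[h] = prng(self.next_seed[h])
            if h in initialized_heights:
                self.active_seed[h] = self.next_seed[h]
```

Each call to `advance` performs exactly H − K PRNG updates, as required by the scheduling strategy.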
Explicit Hard Instances of the
Shortest Vector Problem

Johannes Buchmann, Richard Lindner, and Markus Rückert

Technische Universität Darmstadt, Department of Computer Science


Hochschulstraße 10, 64289 Darmstadt, Germany
{buchmann,rlindner,rueckert}@[Link]

Abstract. Building upon a famous result due to Ajtai, we propose a


sequence of lattice bases with growing dimension, which can be expected
to be hard instances of the shortest vector problem (SVP) and which can
therefore be used to benchmark lattice reduction algorithms.
The SVP is the basis of security for potentially post-quantum cryp-
tosystems. We use our sequence of lattice bases to create a challenge,
which may be helpful in determining appropriate parameters for these
schemes.

Keywords: Lattice reduction, lattice-based cryptography, challenge.

1 Introduction

For the construction of post-quantum cryptosystems, it is necessary to identify


computational problems, whose difficulty can be used as a basis of the security
for such systems, and that remain difficult even in the presence of quantum com-
puters. One candidate is the problem of approximating short vectors in a lattice
(shortest vector problem — SVP). The quantum-hardness of this problem was
analyzed by Ludwig [25] and Regev [33]. They both find that the computational
advantage gained with quantum computers is marginal. There are several cryp-
tographic schemes whose security is based on the intractability of the SVP in
lattices of sufficiently large dimension (e.g. [3, 16, 17, 34]). To determine appro-
priate parameters for these cryptosystems, it is necessary to assess the practical
difficulty of this problem as precisely as possible.
In this paper, we present a sequence of lattice bases with increasing dimension,
which we propose as a worldwide challenge. The construction of these lattices
is based both on theoretical and on practical considerations. On the theoretical
side, we apply a result of Ajtai [2]. It states that being able to find a sufficiently
short vector in a random lattice from a certain set, which also contains our
challenge lattices, implies the ability to solve supposedly hard problems (cf. [35])
in all lattices with a slightly smaller dimension than that of the random lattice.
Furthermore, we invoke a theorem of Dirichlet on Diophantine approximation
(cf. [20]). It guarantees the existence of a short vector in each challenge lattice.
On the practical side, using an analysis by Gama and Nguyen [13], we argue

J. Buchmann and J. Ding (Eds.): PQCrypto 2008, LNCS 5299, pp. 79–94, 2008.
c Springer-Verlag Berlin Heidelberg 2008
80 J. Buchmann, R. Lindner, and M. Rückert

that finding this vector is hard for the lattices in our challenge. We also present
first experimental results that confirm the analysis.
Our challenge at [Link] can be considered as
an analogue of similar challenges for the integer factoring problem [36] and the
problems of computing discrete logarithms in the multiplicative group of a finite
field [27], or in the group of points on an elliptic curve over a finite field [10].
Our aim is to evaluate the current state-of-the-art in practical lattice basis
reduction by providing means for an immediate and well-founded comparison.
As a first application of the proposed challenge, we compare the performance of
LLL-type reduction methods — LLL [24], Stehlé’s fpLLL [30], Koy and Schnorr’s
segment LLL (sLLL) [22] — and block-type algorithms — Schnorr’s BKZ [38, 39],
Koy’s primal-dual (PD) [21], and Ludwig’s practical random sampling (PSR)¹ [26].
To our knowledge, this is the first comparison of these algorithms.

Related work. Lattice reduction has been subject to intense studies over the last
decades, where a couple of methods and reduction schemes, in particular the
LLL algorithm by Lenstra, Lenstra, and Lovász [24], have been developed and
successively improved. In particular, the block Korkine–Zolotarev algorithm (BKZ),
due to Schnorr [38, 39], has become the standard method when strong lattice
basis reduction is required.
There have been several approaches to measure the effectiveness of known
lattice reduction algorithms, especially in the context of the NTRU cryptosystem
[17]. Some of them, as in [18, 19], base their analysis on cryptosystems while
others, like [13, 31], make a more general approach using random lattices.
To our knowledge, there has never been a unified challenge for lattice reduction
algorithms that is independent of a specific cryptosystem. In all previous
challenges, the solution was always known to the creator.

Organization. In Section 2, we provide a brief introduction to lattices and state


some fundamental definitions. In Section 3, we define a family of lattices and
prove two properties, which are fundamental for our explicit construction pre-
sented in Section 4. Then, we give first experimental results comparing the per-
formance of various lattice reduction algorithms in Section 5. Finally, Section 6
introduces the actual lattice challenge.

2 Preliminaries

Let R^n denote the n-dimensional real vector space. We write the vectors of this
space in boldface to distinguish them from numbers. Any two vectors v, w ∈ R^n
have an inner product ⟨v, w⟩ = v^T w. Any v ∈ R^n has a length given by the
Euclidean norm ‖v‖_2 = √⟨v, v⟩ = √(v_1² + · · · + v_n²). In addition to the Euclidean
norm, we also use the maximum norm ‖v‖_∞ = max_{i=1,...,n} |v_i|.
A lattice in R^n is a set L = {∑_{i=1}^{m} x_i b_i | x_i ∈ Z}, where b_1, . . . , b_m are linearly
independent over R. The matrix B = [b_1, . . . , b_m] is called a basis of the lattice L
¹ A practical variant of Schnorr’s random sampling reduction [40].
Explicit Hard Instances of the Shortest Vector Problem 81

and we write L = L(B). The number of linearly independent vectors in the basis
is the dimension of the lattice. If dim(L(B)) = n, the lattice is full-dimensional.
An m-dimensional lattice L = L(B) has many different bases, namely all
the matrices in the orbit B GL_m(Z) = {BT | T ∈ GL_m(Z)}. If the lattice is
full-dimensional and integral, that is L ⊆ Z^n, then there exists a unique basis
B = (b_{i,j}) of L which is in Hermite normal form (HNF), i.e.
i. b_{i,j} = 0 for all 1 ≤ j < i ≤ m
ii. b_{i,i} > b_{i,j} ≥ 0 for all 1 ≤ i < j ≤ m
Furthermore, the volume vol(L) of a full-dimensional lattice is defined as
|det(B)|, for any basis B of L. For every m-dimensional lattice L there is a
dual (or polar, reciprocal) lattice L* = {x ∈ R^m | ∀y ∈ L : ⟨x, y⟩ ∈ Z}. For any
full-dimensional lattice L = L(B), it holds that L* = L((B^{−1})^T). The length
of the shortest lattice vector, denoted by λ_1 = λ_1(L), is called the first successive
minimum.
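The relation L* = L((B^{−1})^T) can be checked exactly over the rationals. A minimal sketch (our Python, not from the paper) computes the dual basis by exact Gauss–Jordan elimination and verifies the defining property ⟨b_i, d_j⟩ = δ_{ij}:

```python
from fractions import Fraction

def inverse(B):
    """Exact inverse of a square integer matrix via Gauss-Jordan over Q."""
    n = len(B)
    A = [[Fraction(B[i][j]) for j in range(n)]
         + [Fraction(int(i == k)) for k in range(n)] for i in range(n)]
    for c in range(n):
        p = next(r for r in range(c, n) if A[r][c] != 0)  # pivot search
        A[c], A[p] = A[p], A[c]
        A[c] = [x / A[c][c] for x in A[c]]                # normalize pivot row
        for r in range(n):
            if r != c and A[r][c] != 0:                   # eliminate column c
                A[r] = [x - A[r][c] * y for x, y in zip(A[r], A[c])]
    return [row[n:] for row in A]

def dual_basis(B):
    """Columns of (B^{-1})^T are a basis of the dual lattice L* of L(B)."""
    Binv = inverse(B)
    n = len(B)
    return [[Binv[j][i] for j in range(n)] for i in range(n)]

B = [[2, 1], [0, 3]]     # columns b1 = (2, 0), b2 = (1, 3)
D = dual_basis(B)        # columns d1, d2 satisfy <b_i, d_j> = delta_{ij}
```

Since every lattice vector is an integer combination of the b_i and every dual vector an integer combination of the d_j, bilinearity gives ⟨x, y⟩ ∈ Z for all x ∈ L, y ∈ L*.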

3 Foundations of the Challenge


In this section, we define a family of sets containing lattices, where each set will
have two important properties:
1. All lattices in the set contain non-obvious short vectors;
2. Being able to find a short vector in a lattice chosen uniformly at random
from the set, implies being able to solve difficult computational problems in
all lattices of a certain smaller dimension.

The family of lattice sets. Let n ∈ N and c1, c2 ∈ R_{>0} such that

1/(2 ln(2)) ≤ c2 ≤ (c1/4) ln( n / (c1 ln(n)) ) .   (1)

Furthermore, let

m = ⌈c1 n ln(n)⌉ ,   (2)
q = ⌈n^{c2}⌉ ,   (3)

and Z_q = {0, . . . , q − 1}. For a matrix X ∈ Z_q^{n×m}, with column vectors x_1, . . . ,
x_m, let

L(c1, c2, n, X) = { (v_1, . . . , v_m) ∈ Z^m | ∑_{i=1}^{m} v_i x_i ≡ 0 (mod q) } .

All lattices in the set L(c1, c2, n, ·) = { L(c1, c2, n, X) | X ∈ Z_q^{n×m} } are of dimension
m, and the family of lattices L is the set of all L(c1, c2, n, ·) such that c1, c2, n
are chosen according to (1).
In the following theorems, we prove that all lattices in the sets of the family
L have the desired properties.
Existence of short vectors. We prove that all lattices in L(c1 , c2 , n, ·) of the family
L contain a vector with Euclidean norm less than n.

Theorem 1. Let n ∈ N, c1 , c2 ∈ R>0 , and q, m ∈ N be as described above. Then,


any lattice in L(c1 , c2 , n, ·) ∈ L contains a vector with Euclidean norm less than n.

Proof. Let L(c1, c2, n, X) ∈ L(c1, c2, n, ·) ∈ L. We first show that any solution of
a certain Diophantine approximation problem corresponds to a vector in L(c1,
c2, n, X). Then, we use a theorem of Dirichlet to establish the existence of a
non-zero lattice vector of length less than n.
Let v ∈ L(c1, c2, n, X). Then there exists w ∈ Z^n such that

(1/q) X v − w = 0 .

This is equivalent to

‖(1/q) X v − w‖_∞ < 1/q .   (4)

Dirichlet’s theorem (cf. [20, 37]) states that for any t > 1, there are v ∈ Z^m
and w ∈ Z^n such that

‖(1/q) X v − w‖_∞ < e^{−t/n}   and   (5)

‖v‖_∞ < e^{t/m} .   (6)

We set t = n ln(q). Then, (5) implies that (4) is satisfied. It remains to prove
that ‖v‖_∞ < n/√m, because this implies ‖v‖_2 < n. Using (6), we have

‖v‖_∞ < e^{t/m} = e^{n ln(q)/m} = e^{n ln(⌈n^{c2}⌉)/m} ≤* e^{2 n c2 ln(n)/m} ≤ e^{2 n c2 ln(n)/(c1 n ln(n))} = e^{2 c2/c1} .

For a rigorous proof of inequality ∗, see Appendix A. Together with (1), this
evaluates to

e^{2 c2/c1} ≤ e^{(2/c1) · (c1/4) ln(n/(c1 ln(n)))} = √( n / (c1 ln(n)) ) ≤ n/√m ,

which completes the proof.
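The chain of inequalities can be sanity-checked numerically with the concrete parameters c1 = 1.9453, c2 = 1, n = q = 63, m = 500 of the challenge lattice L_500 constructed in Section 4 (our sketch):

```python
import math

c1, c2, n, m = 1.9453, 1.0, 63, 500
q = math.ceil(n ** c2)              # q = 63
t = n * math.log(q)                 # t = n ln(q)

dirichlet = math.exp(t / m)         # Dirichlet bound on ||v||_inf from (6)
chain_end = math.exp(2 * c2 / c1)   # e^{2 c2 / c1}
target = n / math.sqrt(m)           # needed: ||v||_inf < n / sqrt(m)

# dirichlet ~ 1.69, chain_end ~ 2.80, target ~ 2.82
assert dirichlet <= chain_end <= target   # hence ||v||_2 < n as claimed
```

The margin is small (2.80 vs. 2.82), which illustrates why the upper bound on c2 in (1) is essentially tight for these parameters.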

Hardness of finding short vectors. In the following, we show that being able to
find short vectors in an m-dimensional lattice chosen uniformly at random from
L(c1, c2, n, ·) ∈ L implies being able to solve (conjectured) hard lattice problems
for all lattices of dimension n.
In his seminal work [2], Ajtai proved the following theorem that connects
average-case instances of certain lattice problems to worst-case instances. The
problems are defined as follows.

Lattice problems. Let L ⊆ Z^n be an n-dimensional lattice and γ ≥ 1. We define
the
– Approximate shortest length problem (γ-SLP):
Find l ∈ R such that l ≤ λ_1(L) ≤ γ l.
– Approximate shortest vector problem (γ-SVP):
Find a vector v ∈ L \ {0} such that ‖v‖_2 ≤ γ ‖w‖_2 for all w ∈ L \ {0}.
– Approximate shortest basis problem (γ-SBP):
Find a basis B of L such that for all C ∈ B GL_m(Z):

max_{i=1,2,...,n} ‖b_i‖_2 ≤ γ max_{i=1,2,...,n} ‖c_i‖_2 .

Theorem 2 ([2, Theorem 1]). Let c > 1 be an absolute constant. If there


exists a probabilistic polynomial time (in n) algorithm A that finds a vector
of norm < n in a random m-dimensional lattice from L(c1 , c2 , n, ·) ∈ L with
probability ≥ 1/2 then there exists

1. an algorithm B1 that solves the γ-SLP;


2. an algorithm B2 that solves the SVP, provided that the shortest vector is
γ-unique² ;
3. an algorithm B3 that solves the γ-SBP.

Algorithms B1, B2, and B3 solve the respective problem (each with γ = n^c) with
probability exponentially close to 1 in all lattices of dimension n, i.e. especially in
the worst case. B1, B2, and B3 run in probabilistic polynomial time in n.

As for the constant c in Theorem 2, there have been several improvements to
Ajtai’s reduction, which originally required c ≥ 8 [9]. The first improvement (c = 3.5 + ε) is due
to Cai and Nerurkar [9], whereas the most recent works by Micciancio [28] and
Micciancio and Regev [29] improve c to almost 1.³
Asymptotic and practical hardness of the above problems depends on the
choice of γ. A recent survey [35] by Regev states the currently known “approximability”
and “inapproximability” results. As for the complexity of lattice problems,
it focuses on the works of Lagarias, Lenstra, and Schnorr [23], Banaszczyk [7],
Goldreich and Goldwasser [15], Ajtai, Kumar, and Sivakumar [4], Aharonov and
Regev [1], and Peikert [32]. Since it is very helpful and descriptive, we adopted
Figure 1 from the survey.
On the left, there are provably NP-hard problems, followed by a gap for which
the hardness is unknown. In the center, there are problems conjectured not to
be NP-hard because their NP-hardness would contradict the general perception
that coNP ≠ NP. Finally, on the right, there are problems that can be solved in
probabilistic polynomial time.

[Figure 1 depicts the hardness of γ-SVP along an axis of increasing γ, with thresholds
1, n^{1/log log n}, √(n/log n), n, 2^{n log log n/log n}, and 2^{n(log log n)²/log n}
separating the regimes NP-hard, NP ∩ coAM, NP ∩ coNP, BPP, and P.]

Fig. 1. The complexity of γ-SVP for increasing γ (some constants omitted)


² A shortest vector v ∈ L is γ-unique if every w ∈ L with ‖w‖_2 ≤ γ ‖v‖_2 satisfies w = ±v.
³ Omitting poly-logarithmic terms in the resulting approximation factor.

We emphasize that the problems in Theorem 2 are not believed to be NP-hard
because γ > √n. Nevertheless, there is no known algorithm that efficiently
solves worst-case instances of lattice problems for sufficiently large dimensions n,
with an approximation factor polynomial in n. So Theorem 2 strongly supports
our claim that computing short vectors in the lattice family is hard. This is also
supported by a heuristic argument of Gama and Nguyen [13], which we refer to
in Section 4.

4 Construction of Explicit Bases

Ajtai’s construction in [2] defines all lattices implicitly. In this section, we show
how to generate explicit integral bases for these lattices.
For any m ≥ 500, we now construct a lattice Lm of dimension m, which is our
hard instance of the SVP. The lattice Lm is of the form L(c1 , c2 , n, X), where the
parameters c1 , c2 , n, X are chosen as a function of the dimension m as follows.
We start with a desired lattice dimension m, set c2 = 1, and choose c1 , n =
n(m) such that (1) and (2) hold. This is done by setting

c1 = inf{ c ∈ R | ∃n ∈ N : m = ⌈c n ln(n)⌉ ∧ c2 ≤ c ln(n/(c ln(n)))/4 } ,   (7)

n(m) = max{ n ∈ N | m = ⌈c1 n ln(n)⌉ ∧ c2 ≤ c1 ln(n/(c1 ln(n)))/4 } .   (8)

With m = 500, for example, we get c1 = 1.9453, c2 = 1, and n = q = 63.


Having selected the set L(c1 , c2 , n, ·), we “randomly” pick a lattice from it. We
use the digits of π as a source of “randomness”.⁴ This approach is supported by
the conjectured normality of π [5, 6]. We write

3.π_1 π_2 π_3 π_4 . . . ,

so π_i, for i ≥ 1, is the i-th decimal digit of π after the decimal point. In order to
compensate for potential statistical bias, we define

π*_i = (π_{2i} + π_{2i−1}) mod 2   for i ≥ 1 .

Now, we use the sequence (π*_1, π*_2, π*_3, π*_4, . . .) as a substitute for a sequence of
uniformly distributed random bits.
The matrix X = (x_{i,j}) ∈ Z_q^{n×m} is chosen via

x_{i,j} = ( ∑_{l=k}^{k+⌈log_2(q)⌉} 2^{l−k} π*_l ) mod q   for 1 ≤ i ≤ n, 1 ≤ j ≤ m ,

with k = k(i, j) = ((i − 1) m + (j − 1)) (⌈log_2(q)⌉ + 1) + 1 .

With that, we have selected a “random” element L(c1 , c2 , n, X), for which we
will now generate an integral basis.
⁴ The digits of π can be obtained from [Link]
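The bit extraction can be sketched directly (our Python; the digit list holds only the first 40 decimals of π, and the placement of the rounding brackets in the definition of x_{i,j}, lost in extraction, is our assumption):

```python
import math

# First 40 decimal digits of pi after the decimal point: pi_1, pi_2, ...
PI = [1, 4, 1, 5, 9, 2, 6, 5, 3, 5, 8, 9, 7, 9, 3, 2, 3, 8, 4, 6,
      2, 6, 4, 3, 3, 8, 3, 2, 7, 9, 5, 0, 2, 8, 8, 4, 1, 9, 7, 1]

def pi_star(i):
    """Unbiased bit pi*_i = (pi_{2i} + pi_{2i-1}) mod 2, for i >= 1."""
    return (PI[2 * i - 1] + PI[2 * i - 2]) % 2

def entry(i, j, q, m):
    """Sketch of x_{i,j}: reads ceil(log2(q)) + 1 consecutive bits pi*_l as a
    binary number (least significant bit first) and reduces it mod q. The
    exact rounding-bracket placement is our assumption, not the paper's."""
    b = math.ceil(math.log2(q))
    k = ((i - 1) * m + (j - 1)) * (b + 1) + 1
    return sum(2 ** (l - k) * pi_star(l) for l in range(k, k + b + 1)) % q
```

For real challenge instances one would of course need far more digits of π than the 40 hard-coded here.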

Let I_m be the m-dimensional identity matrix. We start with the m × (n + m) matrix

Y_1 = (X^T | q I_m) ,

whose j-th row is (x_{1,j}, . . . , x_{n,j}, 0, . . . , 0, q, 0, . . . , 0) with the entry q in position n + j.

Let Y_2 be the Hermite normal form of Y_1. We compute the transformation matrix
T_1 which satisfies

Y_2 T_1 = Y_1 = (X^T | q I_m) .

We set T_2 to be equal to T_1, but without the n leading columns. This guarantees
that

Y_2 T_2 = q I_m .   (9)

Finally, we set the basis to B = T_2^T.
Now, we have to show that B is an integral basis of L(c1, c2, n, X). Clearly,
B is an integral matrix because the transformation T_1, given by the HNF computation,
is in Z^{m×(n+m)} and T_2 is the same matrix with the n leading columns
removed.
By the uniqueness of inverses, (9) shows that B = ((Y_2/q)^{−1})^T. This implies
that B is a basis for the dual lattice of L(Y_2/q) (cf. Section 2). Since Y_2 is
an integral transformation of Y_1, they span the same lattice. Thus, L(Y_2/q) =
L(Y_1/q).
By the defining property of the dual lattice, we have that for any v ∈ L(B)
and w ∈ L(Y_1/q), it holds that ⟨v, w⟩ ∈ Z. So especially for all columns x
of X^T, it holds that ⟨v, x/q⟩ ∈ Z, or equivalently ⟨v, x⟩ ∈ qZ. This implies
⟨v, x⟩ mod q = 0, which in turn gives us L(B) ⊆ L(c1, c2, n, X).
Now let v ∈ L(c1, c2, n, X), so for any column x of X^T we have ⟨v, x⟩ mod q = 0,
or equivalently ⟨v, x⟩/q ∈ Z. Since we know L(c1, c2, n, X) ⊆ Z^m, it also holds
that ⟨v, e⟩ ∈ Z for any column e of the identity matrix I_m. Since v has an
integral inner product with each column vector in Y_1/q, this means v is in the
dual lattice of L(Y_1/q), which we know to be L(B). Finally, we have
L(B) = L(c1, c2, n, X).
For a small example of such a basis, refer to Appendix C.
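For toy parameters, an explicit basis can also be written down without an HNF computation, using the standard basis of a q-ary kernel lattice. This is our sketch, not the authors' construction; it assumes q prime and the left n × n block of X invertible mod q:

```python
def kernel_basis(X, q):
    """Columns of the returned m x m matrix generate
    {v in Z^m : X v = 0 (mod q)}; the determinant is q^n."""
    n, m = len(X), len(X[0])
    M = [row[:] for row in X]
    # Gauss-Jordan over Z_q brings X to the form (I_n | A)
    for c in range(n):
        p = next(r for r in range(c, n) if M[r][c] % q)
        M[c], M[p] = M[p], M[c]
        inv = pow(M[c][c], -1, q)            # modular inverse (Python >= 3.8)
        M[c] = [x * inv % q for x in M[c]]
        for r in range(n):
            if r != c:
                M[r] = [(x - M[r][c] * y) % q for x, y in zip(M[r], M[c])]
    A = [row[n:] for row in M]
    # basis columns: (q e_i, 0) for i < n and (-A[:, j] mod q, e_j) for j < m - n
    B = [[0] * m for _ in range(m)]
    for i in range(n):
        B[i][i] = q
    for j in range(m - n):
        for i in range(n):
            B[i][n + j] = (-A[i][j]) % q
        B[n + j][n + j] = 1
    return B
```

Row operations over Z_q preserve the kernel mod q, so every column of B satisfies X v ≡ 0 (mod q), and the block-triangular shape gives det(B) = q^n.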

The choice of parameters. We now argue that our choice of the parameters leads
to m-dimensional lattices L_m = L(c1, c2, n, X), in which vectors of norm less
than n(m) are hard to find.
We have chosen c2 = 1. By Theorem 1, this guarantees the existence of lattice
vectors with norm less than n(m) = q in Lm .
A choice of c2 < 1, and thus q < n, would imply that all q-vectors, namely
vectors that are zero except for one entry q, in Zm have Euclidean norm less
than n(m). This renders the lattice challenge preposterous because q-vectors are
easy to find. Moreover, Theorem 1 only guarantees the existence of one short
vector, which in this case might be a q-vector.

Table 1. Lattice parameters with the necessary Hermite factor γ

m      n, q    γ
500    63      1.0072^m
825    127     1.0050^m
1000   160     1.0042^m
1250   208     1.0036^m
1500   256     1.0031^m
1750   304     1.0027^m
2000   348     1.0024^m

On the other hand, choosing c2 > 1 enlarges c1 , and because of (2) decreases
n(m). Then, the hardness of lattice problems in a large dimension m would be
based on the worst-case hardness of lattice problems in a very small dimension
n. As n decreases, our hardness argument becomes less meaningful because even
worst-case lattice problems in small dimensions are believed to be easy.
Table 1 shows how m and n are related for the selected lattices Lm . For a
graphical overview, up to m = 2000, refer to Appendix B. Thus, in order to
apply Theorem 2 as a strong indication for hardness, we keep n(m) close to m in
the above construction. We choose a pseudo-random X to get a random element
in L(c1 , c2 , n, ·), as required by Theorem 2. Using the recent improvement of
Ajtai’s result due to Gentry, Peikert, and Vaikuntanathan [14], it is possible to
choose c2 arbitrarily close to 1. Their results can also be used to improve our
construction, by providing an even stronger indication of hardness. For this, we
refer the reader to the extended version [8].
To give an even stronger argument for the hardness of the SVP in our lattices,
we use a result by Gama and Nguyen [13]. They argue that finding vectors v in
a lattice L is difficult if
‖v‖_2 < γ vol(L)^{1/m} ,   (10)

where γ ≤ 1.01^m and m is the dimension of L. In this inequality, γ is called the
Hermite factor. For γ ≤ 1.005^m, Gama and Nguyen state that computing vectors
v that satisfy (10) is “totally out of reach”.
Finding a vector v ∈ L_m of length less than n(m) means finding a vector v
that satisfies (10) with Hermite factor

γ < n(m) / vol(L_m)^{1/m} .

Such Hermite factors are tabulated in column 3 of Table 1.


In combination with the analysis of Gama and Nguyen, the table suggests
that while finding a vector shorter than n(m) in L500 is still possible, the respec-
tive problem in L825 will be very hard in practice. As the dimension increases, the

necessary Hermite factor falls below 1.004^m and 1.003^m. We think that finding
short vectors in the corresponding lattices will require entirely new algorithms.
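The necessary Hermite factors of Table 1 can be recomputed under the assumption, consistent with the q-ary kernel structure of the lattices, that vol(L_m) = q^n (our sketch; it matches column 3 of Table 1 up to rounding):

```python
import math

def root_hermite(m, n, q):
    """Per-dimension Hermite factor delta with delta^m = n / vol(L_m)^{1/m},
    assuming vol(L_m) = q^n (an assumption, not stated in this section)."""
    log_gamma = math.log(n) - n * math.log(q) / m
    return math.exp(log_gamma / m)

# ~1.0073 for m = 500 and ~1.0050 for m = 825, matching Table 1 up to rounding.
print(round(root_hermite(500, 63, 63), 4), round(root_hermite(825, 127, 127), 4))
```

In the Gama–Nguyen terminology, values below roughly 1.005 per dimension mark the regime they call "totally out of reach" for current reduction algorithms.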

5 Experiments with Lattice Reduction Algorithms


As a first application of our explicit construction of lattices Lm , we show how
various lattice reduction algorithms perform on them. Basically, there are two
types of algorithms: the LLL-type and the block-type. Building upon LLL, block-
type algorithms are typically stronger, in the sense that they are able to find
significantly shorter vectors. Block-type algorithms, however, are impractical for
large block sizes because their running time increases at least exponentially in
this parameter.

Toy challenges. In Section 4, we have seen that the problem of finding a vector
of length less than n(m) in lattices Lm starts to become difficult for m ≥ 500
and it should be infeasible for m ≥ 825.
Thus, we define a relaxed variant of the family L. It is the family of all lattice
sets L(2, 1, n, ·), i.e. we set c2 = 1 and c1 = 2, so (1) does not necessarily hold.
Although in such lattices there is no guarantee for the existence of lattice vectors
of norm less than n(m), such vectors do exist in practice. Moreover, our
explicit construction in Section 4 still works and produces bases for lattices Lm ,
m < 500. In the following, the lattices Lm , 200 ≤ m < 500, will be referred to
as toy challenges. Explicit parameters for this range can be found in Appendix
D. There, we also compute the necessary Hermite factor as in Section 4. The
factors suggest that current lattice reduction methods are supposed to find lat-
tice vectors of norm less than n(m). Our experiments with block-type methods
confirm this.
All experiments were run on a single core AMD Opteron at 2.6 GHz, using
Shoup’s NTL [41] in version 5.4.2 and GCC 4.1.2.

Implementations. For LLL and BKZ, we used the well-known implementations
integrated in NTL. We thank Filipović and Koy for making available their
implementations of sLLL and PD, which were part of the diploma thesis [12].
We also thank Ludwig for making available and updating his implementation of
PSR that was part of his PhD thesis [26]. Finally, we thank Cadé and Stehlé for
making available their implementation of fpLLL. It was obtained from [42].
Figure 2 and Figure 3 depict the performance, i.e. the length of the shortest
obtained vector and the logarithmic running time in seconds, for LLL-type and
block-type methods, respectively. The boxed line in the left figures shows the
norm bound n(m) that has to be undercut. While block-type methods reliably
find vectors of norm less than n(m) up to a dimension around 500, the best
LLL-type algorithms merely succeed in dimensions < 300.
While being arguably efficient with our choice of parameters, sLLL is unable
to find sufficiently short vectors even in dimension 200. For larger dimensions,

[Figure 2 plots, for the LLL-type methods LLL (δ = 0.99), fpLLL (δ = 0.99), and
sLLL (δ = 0.99, 10 segments), (a) the squared length ln(‖·‖_2²) of the shortest
vectors found and (b) the logarithmic run time ln(τ) in seconds, both against the
dimension m = 200, . . . , 800; the norm bound ln(n²) is marked in (a).]

Fig. 2. Performance of LLL-type lattice reduction with comparable parameters

[Figure 3 shows the same quantities for the block-type methods BKZ (β = 15),
PD (β = 15, 10 segments), and PSR (β = 15).]

Fig. 3. Performance of block-type lattice reduction with comparable parameters

however, the approximation results of all LLL-type algorithms seem to converge,


whereas the running time performance of fpLLL is significantly surpassed by that
of the other two. Note that we use the default wrapper method of fpLLL. Damien
Stehlé pointed out that one should rather use its faster heuristics (see [8]).
In Figure 3a, observe that BKZ and PSR perform slightly better than PD,
which is mostly due to the internal sLLL step in PD. Accordingly, the graphs
seem to converge at the right end, similarly to those in Figure 2a. While the
approximation performance of block-type algorithms can be further improved
using higher block sizes, this approach is limited by the resulting running time.
Extrapolating to higher dimensions, it becomes obvious that finding sufficiently
short vectors in Lm requires a significantly larger effort for dimensions that are
somewhat higher than 600. This coincides with our observation on the Hermite
factor in Section 4.
As for the running time performance of the block-type schemes, observe in
Figure 3b that all three behave similarly. In lower dimensions, up to about m =
450, BKZ performs strictly better. In higher dimensions, the differences even out
and the random character of PSR becomes obvious in its slightly erratic timing.
Explicit Hard Instances of the Shortest Vector Problem 89

[Plot data omitted: the length ln(‖·‖₂²) of the shortest basis vector against the block size β = 4, . . . , 16, together with the threshold ln(n²) = ln(63²).]

Fig. 4. Shortest vectors found by β-BKZ in dimension m = 500

To conclude, we have reviewed the current state-of-the-art performance of lattice reduction algorithms, using reasonable parameters. We did not, however, explore the limits of the block-type methods. This assessment we leave to the contestants of the actual lattice challenge that is defined in the next section.

6 The Challenge
In Section 4, we have constructed challenge lattices Lm of dimension m, for
m ≥ 500. The results in Section 3 together with the pseudo-random choice of
Lm guarantee the existence of vectors v ∈ Lm with ‖v‖₂ < n(m), which are
hard to find. For a toy example, refer to Appendix C.
As stated before, we want the lattice challenge to be open in the sense that
it does not terminate when the first short vector is found. Having proven the
existence of just one solution might suggest that there are no more, but during
practical experiments, we found that many successively shorter vectors exist. For
example, Figure 4 shows that in dimension m = 500, BKZ with increasing
block size successively finds smaller and smaller lattice vectors.
We propose the following challenge to all researchers and students.

Lattice Challenge
The contestants are given lattice bases of lattices Lm , together with
a norm bound ν. Initially, we set ν = n(m).

The goal is to find a vector v ∈ Lm with ‖v‖₂ < ν.

Each solution v to the challenge decreases ν to ‖v‖₂.
The challenge is hosted at [Link]
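A solution can be checked without any lattice reduction: the challenge bases have the modular block shape shown in Appendix C, so a candidate v = (x, y) lies in Lm exactly when y ≡ x·A (mod q), where A denotes the non-identity block of the basis. A minimal Python sketch of the check (toy numbers of our own, not an actual challenge instance):

```python
def in_modular_lattice(v, A, q):
    """Check v = (x, y) against a basis with rows (e_i | A_i) and (0 | q e_j):
    v lies in the lattice iff y ≡ x·A (mod q)."""
    k = len(A)                      # number of (e_i | A_i) rows
    x, y = v[:k], v[k:]
    xA = [sum(xi * a for xi, a in zip(x, col)) for col in zip(*A)]
    return all((yi - ti) % q == 0 for yi, ti in zip(y, xA))

def beats_bound(v, nu):
    """Accept v if its Euclidean norm is strictly below the current bound nu."""
    return sum(c * c for c in v) < nu * nu

# Tiny made-up instance: two identity rows with A-block [[1, 2], [3, 4]], q = 5.
A, q = [[1, 2], [3, 4]], 5
v = [1, 1, 4, 1]                    # (x, y) with y ≡ x·A ≡ (4, 6) ≡ (4, 1) (mod 5)
print(in_modular_lattice(v, A, q))  # True
print(beats_bound(v, nu=5))         # True: 1 + 1 + 16 + 1 = 19 < 25
```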

Acknowledgements
We would like to thank Oded Regev for his helpful remarks and suggestions.
Furthermore, we thank the program committee and the anonymous reviewers
for their valuable comments.

References
1. Aharonov, D., Regev, O.: Lattice problems in NP ∩ coNP. J. ACM 52(5), 749–765
(2005)
2. Ajtai, M.: Generating hard instances of lattice problems. In: Proceedings of the
Annual Symposium on the Theory of Computing (STOC), pp. 99–108. ACM Press,
New York (1996)
3. Ajtai, M., Dwork, C.: A public-key cryptosystem with worst-case/average-case
equivalence. In: Proceedings of the Annual Symposium on the Theory of Com-
puting (STOC), pp. 284–293. ACM Press, New York (1997)
4. Ajtai, M., Kumar, R., Sivakumar, D.: A sieve algorithm for the shortest lattice
vector problem. In: Proceedings of the Annual Symposium on the Theory of Com-
puting (STOC), pp. 601–610. ACM Press, New York (2001)
5. Bailey, D., Crandall, R.: On the random character of fundamental constant expan-
sions. Experimental Mathematics 10(2), 175–190 (2001)
6. Bailey, D., Crandall, R.: Random generators and normal numbers. Experimental
Mathematics 11(4), 527–546 (2002)
7. Banaszczyk, W.: New bounds in some transference theorems in the geometry of
numbers. Mathematische Annalen 296(4), 625–635 (1993)
8. Buchmann, J., Lindner, R., Rückert, M.: Explicit hard instances of the shortest
vector problem (extended version). Cryptology ePrint Archive, Report 2008/333
(2008), [Link]
9. Cai, J., Nerurkar, A.: An improved worst-case to average-case connection for lattice
problems. In: Proceedings of the Annual Symposium on Foundations of Computer
Science (FOCS), pp. 468–477 (1997)
10. Certicom Corp. The Certicom ECC Challenge,
[Link]
11. Coppersmith, D., Shamir, A.: Lattice Attacks on NTRU. In: Fumy, W. (ed.) EU-
ROCRYPT 1997. LNCS, vol. 1233, pp. 52–61. Springer, Heidelberg (1997)
12. Filipović, B.: Implementierung der gitterbasenreduktion in segmenten. Master’s
thesis, Johann Wolfgang Goethe-Universität Frankfurt am Main (2002)
13. Gama, N., Nguyen, P.Q.: Predicting lattice reduction. In: Smart, N.P. (ed.) EU-
ROCRYPT 2008. LNCS, vol. 4965, pp. 31–51. Springer, Heidelberg (2008)
14. Gentry, C., Peikert, C., Vaikuntanathan, V.: Trapdoors for hard lattices and new
cryptographic constructions. In: Ladner, R.E., Dwork, C. (eds.) STOC, pp. 197–
206. ACM Press, New York (2008)
15. Goldreich, O., Goldwasser, S.: On the limits of nonapproximability of lattice prob-
lems. J. Comput. Syst. Sci. 60(3), 540–563 (2000)
16. Goldreich, O., Goldwasser, S., Halevi, S.: Public-key cryptosystems from lattice
reduction problems. In: Kaliski Jr., B.S. (ed.) CRYPTO 1997. LNCS, vol. 1294,
pp. 112–131. Springer, Heidelberg (1997)
17. Hoffstein, J., Pipher, J., Silverman, J.H.: NTRU: A ring-based public key cryp-
tosystem. In: Buhler, J. (ed.) ANTS 1998. LNCS, vol. 1423, pp. 267–288. Springer,
Heidelberg (1998)

18. Hoffstein, J., Silverman, J.H., Whyte, W.: Estimated breaking times for NTRU
lattices. Technical Report 012, Version 2, NTRU Cryptosystems (2003),
[Link]
19. Howgrave-Graham, N., Pipher, H.J.J., Whyte, W.: On estimating the lattice secu-
rity of NTRU. Technical Report 104, Cryptology ePrint Archive (2005),
[Link]
20. Kleinbock, D., Weiss, B.: Dirichlet’s theorem on diophantine approximation and
homogeneous flows. [Link]. 4, 43 (2008)
21. Koy, H.: Primale-duale Segment-Reduktion (2004),
[Link]
22. Koy, H., Schnorr, C.-P.: Segment LLL-reduction of lattice bases. In: Silverman,
J.H. (ed.) CaLC 2001. LNCS, vol. 2146, pp. 67–80. Springer, Heidelberg (2001)
23. Lagarias, J.C., Lenstra Jr., H.W., Schnorr, C.-P.: Korkin-Zolotarev bases and suc-
cessive minima of a lattice and its reciprocal lattice. Combinatorica 10(4), 333–348
(1990)
24. Lenstra, A., Lenstra, H., Lovász, L.: Factoring polynomials with rational coeffi-
cients. Mathematische Annalen 261(4), 515–534 (1982)
25. Ludwig, C.: A faster lattice reduction method using quantum search. In: Ibaraki,
T., Katoh, N., Ono, H. (eds.) ISAAC 2003. LNCS, vol. 2906, pp. 199–208. Springer,
Heidelberg (2003)
26. Ludwig, C.: Practical Lattice Basis Sampling Reduction. PhD thesis, Technische
Universität Darmstadt (2005), [Link]
27. McCurley, K.S.: The discrete logarithm problem. In: Pomerance, C. (ed.) Cryptol-
ogy and computational number theory, Providence, pp. 49–74. American Mathe-
matical Society (1990)
28. Micciancio, D.: Almost perfect lattices, the covering radius problem, and applications
to Ajtai’s connection factor. SIAM Journal on Computing 34(1), 118–169 (2004)
29. Micciancio, D., Regev, O.: Worst-case to average-case reductions based on gaussian
measures. SIAM Journal on Computing 37(1), 267–302 (2007)
30. Nguyen, P.Q., Stehlé, D.: Floating-point LLL revisited. In: Cramer, R. (ed.) EU-
ROCRYPT 2005. LNCS, vol. 3494, pp. 215–233. Springer, Heidelberg (2005)
31. Nguyen, P.Q., Stehlé, D.: LLL on the average. In: Hess, F., Pauli, S., Pohst, M.E.
(eds.) ANTS 2006. LNCS, vol. 4076, pp. 238–256. Springer, Heidelberg (2006)
32. Peikert, C.: Limits on the hardness of lattice problems in ℓp norms. In: IEEE
Conference on Computational Complexity, pp. 333–346. IEEE Computer Society
Press, Los Alamitos (2007)
33. Regev, O.: Quantum computation and lattice problems. SIAM J. Comput. 33(3),
738–760 (2004)
34. Regev, O.: On lattices, learning with errors, random linear codes, and cryptography.
In: Proceedings of the 37th annual ACM symposium on Theory of computing, pp.
84–93. ACM Press, New York (2005)
35. Regev, O.: On the complexity of lattice problems with polynomial approximation
factors. In: A survey for the LLL+25 conference (2007)
36. RSA Security Inc. The RSA Challenge Numbers,
[Link]
37. Schmidt, W.: Diophantine Approximation. Lecture Notes in Mathematics, vol. 785.
Springer, Heidelberg (1980)
38. Schnorr, C.: A hierarchy of polynomial time lattice basis reduction algorithms.
Theoretical Computer Science 53, 201–224 (1987)
39. Schnorr, C.: Block reduced lattice bases and successive minima. Combinatorics,
Probability and Computing 4, 1–16 (1994)

40. Schnorr, C.: Lattice reduction by random sampling and birthday methods. In:
Alt, H., Habib, M. (eds.) STACS 2003. LNCS, vol. 2607, pp. 146–156. Springer,
Heidelberg (2003)
41. Shoup, V.: Number theory library (NTL) for C++, [Link]
42. Stehlé, D.: Damien Stehlé’s homepage at école normale supérieure de Lyon,
[Link]

A Completing the Proof of Theorem 1


With parameters c1, c2, n as in the theorem, we want to show that

  ⌊c1 n ln(n)⌋ ≥ c1 n ln(n)/2    (11)

holds. By (1), we have that c1 ≥ 1/(2 ln(2)). Evaluating both sides of (11) with
n = 1, 2, 3, we find that the inequality holds for these n. For all n ≥ 4, consider
the following.
We have that c1 ≥ 1/(2 ln(2)) ≥ 2/(4 ln(4)), which implies

  ⌊c1 n ln(n)⌋ ≥ c1 n ln(n) − 1
             ≥ c1 n ln(n)/2 + c1 n ln(n)/2 − 1
             ≥ c1 n ln(n)/2 + c1 · 4 ln(4)/2 − 1
             ≥ c1 n ln(n)/2.

This completes the proof.
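Under the floor reading ⌊c1 n ln(n)⌋ ≥ c1 n ln(n)/2 (which the "− 1" step in the chain implies), the n ≥ 4 case can also be spot-checked numerically:

```python
import math

c1 = 1 / (2 * math.log(2))  # the lower bound on c1 from (1)

def holds(n):
    # floor(c1 * n * ln(n)) >= c1 * n * ln(n) / 2
    return math.floor(c1 * n * math.log(n)) >= c1 * n * math.log(n) / 2

print(all(holds(n) for n in range(4, 10001)))  # True
```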

B Ratio between m and n


In order to get an idea of the ratio m/n in our challenge lattices, refer to Figure 5.
The bend at m = 500 reflects our choice of c1 and c2 in the toy challenges, where
we “cap” the value of c2 at 2.0.
[Plot data omitted: the reference dimension n (0 to 400) against the challenge dimension m = 200, . . . , 2000.]

Fig. 5. Ratio between challenge dimension m and reference dimension n



C Challenge Example

The following low-dimensional example gives an idea of what the challenge lat-
tices, and the short vectors in them, essentially look like. Its block structure is
similar to the one found by Coppersmith and Shamir for NTRU lattices [11].
This is not surprising because both belong to the class of modular lattices.
Example 1. The transposed challenge basis for m = 30, n = q = 8 looks like:

[
[1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -4 -7 -4 -7 -6 -2 -3 -7]
[0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -7 -4 -1 0 -6 -7 -1 -5]
[0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -2 -2 -6 -2 -6 -6 -4 -6]
[0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -6 -7 -1 -5 -5 -1 -4 -3]
[0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -7 -4 -2 -3 -1 0 -1 -3]
[0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -6 -3 -5 -7 -3 -7 0 -2]
[0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -5 -1 -6 -6 -6 -4 -3 -5]
[0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -1 0 -2 -2 -2 -7]
[0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 -4 -4 -3 0 -5 -7 -6 -4]
[0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 -3 -2 -4 -6 -4 -3 -2 -3]
[0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 -7 -6 -4 0 0 -2 -7 -4]
[0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 -4 -1 0 0 -7 -3 -7 0]
[0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 -1 -6 -3 0 -4 -1 -2 -3]
[0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 -3 -1 0 -4 -3 -3 -2 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 -6 -6 -2 -2 -1 -3 -6 -6]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 -7 -7 -4 -2 -1 -2 -5]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 -6 -2 -1 -4 -4 -3 -2 -6]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 -2 -6 -1 -1 -5 -4 -3 -3]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 -4 0 -5 -4 -6 -7 -5 -2]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 -4 -3 -3 0 -5 -3 -3 -7]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 -4 0 -3 -2 -2 -6 -4 -4]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 -5 -5 -3 0 -1 -3 0 -6]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 8 0 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 8 0 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 8 0 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 8 0 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 8 0 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 8 0 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 8 0]
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 8]
]

The shortest vector in the respective lattice is

[0 0 0 0 -1 1 0 0 -1 0 0 0 1 0 0 0 0 0 1 0 0 0 0 -1 0 0 1 0 0 0]

and its Euclidean norm is √7 < n = 8.

D Toy Challenges

Table 2 depicts the parameters of our toy challenges.

Table 2. Lattice parameters with the necessary Hermite factor γ

m    n, q  γ
200  30    1.0146^m
225  33    1.0133^m
250  36    1.0123^m
275  38    1.0115^m
300  41    1.0107^m
325  44    1.0101^m
350  46    1.0095^m
375  49    1.0091^m
400  51    1.0086^m
425  54    1.0082^m
450  56    1.0079^m
475  59    1.0075^m
Practical-Sized Instances of Multivariate PKCs:
Rainbow, TTS, and IC-Derivatives

Anna Inn-Tung Chen1 , Chia-Hsin Owen Chen2 , Ming-Shing Chen2 ,


Chen-Mou Cheng1 , and Bo-Yin Yang2,
1
Department of Electrical Engineering, National Taiwan University, Taipei, Taiwan
{anna1110,doug}@[Link]
2
Institute of Information Science, Academia Sinica, Taipei, Taiwan
{mschen,owenhsin,by}@[Link]

Abstract. We present instances of MPKCs (multivariate public key
cryptosystems) with design security of 2^80, given the best attacks we
know, and implement them on commodity PC hardware. We also show
that they can hold their own compared to traditional alternatives. In
fact, they can be up to an order of magnitude faster.

Keywords: Gröbner basis, multivariate public key cryptosystem.

1 Introduction
MPKCs (multivariate public key cryptosystems) [14, 31] are PKCs whose public
keys are multivariate polynomials in many small variables. They have two properties
that are often touted: Firstly, they are considered a significant possibility for Post-
Quantum Cryptography, with potential to resist future attacks with quantum
computers. Secondly, they are often considered to be faster than the competition.
Extant MPKCs almost always hide the private map Q via composition with
two affine maps S, T. So, P = (p1, . . . , pm) = T ◦ Q ◦ S : K^n → K^m, or

  P : w = (w1, . . . , wn) —S→ x = M_S w + c_S —Q→ y —T→ z = M_T y + c_T = (z1, . . . , zm).  (1)
The public key consists of the polynomials in P. P(0) is always taken to be zero.
In any given scheme, the central map Q belongs to a certain class of quadratic
maps whose inverse can be computed relatively easily. The maps S, T are affine
(sometimes linear) and full-rank. The xj are called the central variables. The
polynomials giving yi in x are called the central polynomials; when necessary
to distinguish between the variable and the value, we will write yi = qi (x).
The key of an MPKC is the design of the central map: because solving a generic
multivariate quadratic system is hard, the best solution for finding w given z
invariably turns to other means, which depend on the structure of Q.

Corresponding author.

J. Buchmann and J. Ding (Eds.): PQCrypto 2008, LNCS 5299, pp. 95–108, 2008.
© Springer-Verlag Berlin Heidelberg 2008
96 A.I.-T. Chen et al.

1.1 Questions

Four or five years ago, it was shown that instances of TTS and C∗−, specifically
TTS/4 and SFLASH, are faster signature schemes than the traditional competition
using RSA and ECC [1, 10, 33]. These two instances have both been broken in the
meantime [18, 20]. Now that the width of a typical ALU is 64 bits, commodity
PC hardware has never been more friendly to RSA and ECC. While multivariates
still represent a future-proofing effort, can we still say that MPKCs are efficient
on commodity hardware?

1.2 Our Answers

Currently the fastest multivariate PKCs seem to be from the Rainbow and ℓIC
families [16, 17]. We run comparisons using Pentium III (P3) machines (on which
NESSIE contestants are tested) and modern Core 2 and Opteron (hereafter C2
and K8) machines. On these test runs, we can say that compared to implementations
using standard PKCs (DSA, RSA, ECDSA), present instances of MPKCs
with design security levels of around 2^80 can hold their own in terms of efficiency.
In this paper, we describe how we select our Rainbow and ℓIC-derived instances
and sketch our implementation. We also suggest the new approach of using
bit-slicing when evaluating in GF(16) or other small fields during the construction
of the private map.
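The bit-slicing idea can be illustrated with a small sketch (our own toy code, not the implementation benchmarked here): a GF(16) element is four bits, so 64 elements can be stored as four 64-bit bit-planes, and since addition in a characteristic-2 field is XOR, 64 additions cost only four word-wide XORs:

```python
# Bit-sliced GF(16) addition: represent 64 elements as 4 bit-planes
# (one 64-bit word per bit of the GF(16) value).

def slice64(elems):
    """Pack a list of 64 GF(16) values (0..15) into 4 bit-plane words."""
    planes = [0, 0, 0, 0]
    for pos, e in enumerate(elems):
        for b in range(4):
            planes[b] |= ((e >> b) & 1) << pos
    return planes

def unslice64(planes):
    return [sum(((planes[b] >> pos) & 1) << b for b in range(4))
            for pos in range(64)]

def add64(p, q):
    """64 parallel GF(16) additions = 4 XORs."""
    return [a ^ b for a, b in zip(p, q)]

import random
xs = [random.randrange(16) for _ in range(64)]
ys = [random.randrange(16) for _ in range(64)]
zs = unslice64(add64(slice64(xs), slice64(ys)))
print(zs == [x ^ y for x, y in zip(xs, ys)])  # True: GF(16) addition is XOR
```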
In the comparison here, we use D. J. Bernstein’s eBATs system to do benchmarking.
We can conclude that

1. 3IC−p is comparable to SFLASH, but not as fast as Rainbow.
2. Rainbow is fast and TTS faster, although the security is not as well studied.
3. 2IC+i is a very fast way to build an encryption scheme.

Table 1. Current Multivariate PKCs Compared on a Pentium III 500

Scheme                       result  SecrKey  PublKey  KeyGen    SecrMap  PublMap
RSA-1024                     1024b   128 B    320 B    2.7 sec   84 ms    2.00 ms
ECC-GF(2^163)                320b    48 B     24 B     1.6 ms    1.9 ms   5.10 ms
PMI+(136, 6, 18, 8)          144b    5.5 kB   165 kB   1.1 sec   1.23 ms  0.18 ms
rainbow (2^8, 18, 12, 12)    336b    24.8 kB  22.5 kB  0.3 sec   0.43 ms  0.40 ms
rainbow (2^4, 24, 20, 20)    256b    91.5 kB  83 kB    1.6 sec   0.93 ms  0.73 ms
TTS (2^8, 18, 12, 12)        336b    3.5 kB   22.5 kB  0.04 sec  0.11 ms  0.40 ms
TTS (2^4, 24, 20, 20)        256b    5.6 kB   83 kB    0.43 sec  0.22 ms  0.74 ms
2IC+i (128, 6, 16)           144b    5 kB     165 kB   1 sec     0.03 ms  0.17 ms
2IC+i (256, 12, 32)          288b    18.5 kB  1184 kB  14.9 sec  0.24 ms  2.60 ms
QUARTZ                       128b    71.0 kB  3.9 kB   3.1 sec   11 sec   0.24 ms
3IC-p(2^4, 32, 1)            380b    9 kB     148 kB   0.6 sec   2.00 ms  1.90 ms
pFLASH                       292b    5.5 kB   72 kB    0.3 sec   5.7 ms   1.70 ms
Practical-Sized Instances of Multivariate PKCs 97

Table 2. Comparison on One core of an Intel Core 2 (C2)

Scheme                          result  SecrKey  PublKey  KeyGen       SecrMap     PublMap
PMI+(136, 6, 18, 8)             144b    5.5 kB   165 kB   350.8 Mclk   335.4 kclk  51.4 kclk
PMI+(136, 6, 18, 8), 64b        144b    5.5 kB   165 kB   350.4 Mclk   333.9 kclk  46.5 kclk
rainbow (2^8, 18, 12, 12)       336b    24.8 kB  22.5 kB  110.7 Mclk   143.9 kclk  121.4 kclk
rainbow (2^4, 24, 20, 20)       256b    91.5 kB  83 kB    454.0 Mclk   210.2 kclk  153.8 kclk
rainbow (2^4, 24, 20, 20), 64b  256b    91.5 kB  83 kB    343.8 Mclk   136.8 kclk  79.3 kclk
TTS (2^8, 18, 12, 12)           336b    3.5 kB   22.5 kB  11.5 Mclk    35.9 kclk   121.4 kclk
TTS (2^4, 24, 20, 20)           256b    5.6 kB   83 kB    175.7 Mclk   64.8 kclk   78.9 kclk
2IC+i (128, 6, 16)              144b    5 kB     165 kB   324.7 Mclk   8.3 kclk    52.0 kclk
2IC+i (128, 6, 16), 64b         144b    5 kB     165 kB   324.9 Mclk   6.7 kclk    46.9 kclk
2IC+i (256, 12, 32)             288b    18.5 kB  1184 kB  4119.7 Mclk  26.7 kclk   385.6 kclk
2IC+i (256, 12, 32), 64b        288b    18.5 kB  1184 kB  4418.2 Mclk  23.0 kclk   266.9 kclk
3IC-p(2^4, 32, 1)               380b    9 kB     148 kB   173.6 Mclk   503 kclk    699 kclk
pFLASH                          292b    5.5 kB   72 kB    86.6 Mclk    2410 kclk   879 kclk
DSA/ElGamal                     1024b   148 B    128 B    1.08 Mclk    1046 kclk   1244 kclk
RSA                             1024b   148 B    128 B    108 Mclk     2950 kclk   121 kclk
ECC                             256b    96 B     64 B     2.7 Mclk     2850 kclk   3464 kclk

Table 3. Comparison on One Core of an Opteron/Athlon64 (K8)

Scheme                          result  SecrKey  PublKey  KeyGen       SecrMap     PublMap
PMI+(136, 6, 18, 8)             144b    5.5 kB   165 kB   425.4 Mclk   388.8 kclk  63.9 kclk
PMI+(136, 6, 18, 8), 64b        144b    5.5 kB   165 kB   424.7 Mclk   393.3 kclk  60.4 kclk
rainbow (2^8, 18, 12, 12)       336b    24.8 kB  22.5 kB  234.6 Mclk   297.0 kclk  224.4 kclk
rainbow (2^4, 24, 20, 20)       256b    91.5 kB  83 kB    544.6 Mclk   224.4 kclk  164.0 kclk
rainbow (2^4, 24, 20, 20), 64b  256b    91.5 kB  83 kB    396.2 Mclk   138.7 kclk  83.9 kclk
TTS (2^8, 18, 12, 12)           336b    3.5 kB   22.5 kB  20.4 Mclk    69.1 kclk   224.4 kclk
TTS (2^4, 24, 20, 20)           256b    5.6 kB   83 kB    225.2 Mclk   103.8 kclk  84.8 kclk
2IC+i (128, 6, 16)              144b    5 kB     165 kB   382.6 Mclk   8.7 kclk    64.2 kclk
2IC+i (128, 6, 16), 64b         144b    5 kB     165 kB   382.1 Mclk   7.5 kclk    60.1 kclk
2IC+i (256, 12, 32)             288b    18.5 kB  1184 kB  5155.5 Mclk  31.1 kclk   537.0 kclk
2IC+i (256, 12, 32), 64b        288b    18.5 kB  1184 kB  5156.1 Mclk  26.6 kclk   573.9 kclk
3IC-p(2^4, 32, 1)               380b    9 kB     148 kB   200.7 Mclk   645 kclk    756 kclk
pFLASH                          292b    5.5 kB   72 kB    126.9 Mclk   5036 kclk   872 kclk
DSA/ElGamal                     1024b   148 B    128 B    0.864 Mclk   862 kclk    1018 kclk
RSA                             1024b   148 B    128 B    150 Mclk     2647 kclk   117 kclk
ECC                             256b    96 B     64 B     2.8 Mclk     3205 kclk   3837 kclk

1.3 Previous Work

In [4], Berbain, Billet and Gilbert describe several ways to compute the public
maps of MPKCs and compare their efficiency. However, they do not describe the
evaluation of the private maps.
[18] summarizes the state of the art against generalized Rainbow/TTS schemes.
The school of Stern et al. developed differential attacks that break minus variants
[24, 20] and internal perturbation [23]. Ways to circumvent these attacks are
proposed in [13, 19].
The above attacks treat the cryptosystem as an EIP or “structural” problem. To
solve the system of equations directly, one faces the following problem.
Problem MQ(q; n, m): Solve the system p1(x) = p2(x) = · · · = pm(x) = 0,
where each pi is a quadratic in x = (x1, . . . , xn). All coefficients and variables
are in K = GF(q), the field with q elements.
The best known methods for generic MQ are F4/F5 or XL, whose complexities [11,
21, 22, 32] are very hard to evaluate; asymptotic formulas can be found in [2, 3, 32].
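For intuition, a toy sketch of the MQ problem itself (our own illustration, not one of the cited algorithms): over GF(2) one can always fall back on exhaustive search over all 2^n assignments, the baseline that the algebraic methods above must beat:

```python
from itertools import product

def solve_mq_gf2(polys, n):
    """Exhaustive search for MQ over GF(2).
    Each polynomial is (quadratic_terms, linear_terms, constant), where
    quadratic_terms is a list of index pairs (i, j) and linear_terms a list
    of indices; evaluation is modulo 2."""
    for x in product((0, 1), repeat=n):
        if all((sum(x[i] * x[j] for i, j in quad)
                + sum(x[i] for i in lin) + c) % 2 == 0
               for quad, lin, c in polys):
            return x
    return None

# Toy system in 3 variables:  x0*x1 + x2 = 0  and  x0 + x1 + 1 = 0
polys = [([(0, 1)], [2], 0), ([], [0, 1], 1)]
print(solve_mq_gf2(polys, 3))  # → (0, 1, 0)
```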

1.4 Summary and Future Work


Our programs are not very polished; they merely serve to show that MPKCs can
still be fairly fast compared to state-of-the-art traditional PKCs, even on the
most modern and advanced microprocessors. There have been some recent advances
in algorithms as well, such as computations based on inverted twisted Edwards
curves [5, 6, 7], which show that, when tuned for the platform, the traditional
cryptosystems can get quite a bit faster. It remains for us to optimize more
for specific architectures, including embedded platforms. Further, it is an open
question whether the TTS schemes, with some randomness in the central
maps, can be made with security comparable to equally sized Rainbow schemes.
So far we do not have a conclusive answer.

2 Rainbow and TTS Families


We characterize a Rainbow [16] type PKC with u stages:
– The segment structure is given by a sequence 0 < v_1 < v_2 < · · · < v_{u+1} = n.
– For l = 1, . . . , u + 1, set S_l := {1, 2, . . . , v_l} so that |S_l| = v_l and S_1 ⊂
· · · ⊂ S_{u+1} = S. Denote o_l := v_{l+1} − v_l and O_l := S_{l+1} \ S_l for l = 1, . . . , u.
– The central map Q has component polynomials y_{v_1+1} = q_{v_1+1}(x), y_{v_1+2} =
q_{v_1+2}(x), . . . , y_n = q_n(x) — notice the unusual indexing — of the following form:

  y_k = q_k(x) = Σ_{i=1}^{v_l} Σ_{j=i}^{v_{l+1}} α_{ij}^{(k)} x_i x_j + Σ_{i<v_{l+1}} β_i^{(k)} x_i,  if k ∈ O_l := {v_l + 1, . . . , v_{l+1}}.

In every q_k, where k ∈ O_l, there is no cross-term x_i x_j where both i and j
are in O_l at all. So given all the y_i with v_l < i ≤ v_{l+1}, and all the x_j with
j ≤ v_l, we can compute x_{v_l+1}, . . . , x_{v_{l+1}}.
S_i is the i-th vinegar set and O_i the corresponding i-th oil set.
– To expedite computations, some coefficients α_{ij}^{(k)} may be fixed (e.g., set to
zero), chosen at random (and included in the private key), or be interrelated
in a predetermined manner.
Practical-Sized Instances of Multivariate PKCs 99

– To invert Q, determine (usually at random) x_1, . . . , x_{v_1}, i.e., all x_k with k ∈ S_1. From
the components of y that correspond to the polynomials p_{v_1+1}, . . . , p_{v_2}, we
obtain a set of o_1 equations in the variables x_k (k ∈ O_1). We may repeat
the process to find all remaining variables.
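The inversion procedure above can be sketched with a deliberately tiny single-layer toy over GF(7) (parameters of our own choosing, far below any real security level): fixing the vinegars turns the central polynomials into a linear system in the oils, which we solve by Cramer’s rule modulo 7:

```python
P = 7                                    # toy field GF(7); real schemes use GF(2^8) or GF(2^4)
V, O = 2, 2                              # 2 vinegar and 2 oil variables, a single layer
N = V + O

# alpha[k][i][j]: coefficient of x_i * x_j in central polynomial k. The first
# index i only ranges over the vinegars, so no oil*oil cross-term occurs.
alpha = [
    [[1, 2, 3, 4], [0, 1, 5, 6]],        # central polynomial k = 0
    [[2, 0, 1, 3], [4, 2, 0, 5]],        # central polynomial k = 1
]

def central_map(x):
    return [sum(alpha[k][i][j] * x[i] * x[j]
                for i in range(V) for j in range(N)) % P
            for k in range(O)]

def invert_layer(y, vin):
    """With the vinegars fixed, each y_k is affine in the oils: solve 2x2 mod P."""
    const = [sum(alpha[k][i][j] * vin[i] * vin[j]
                 for i in range(V) for j in range(V)) % P for k in range(O)]
    M = [[sum(alpha[k][i][V + o] * vin[i] for i in range(V)) % P
          for o in range(O)] for k in range(O)]
    b = [(y[k] - const[k]) % P for k in range(O)]
    det = (M[0][0] * M[1][1] - M[0][1] * M[1][0]) % P
    d = pow(det, -1, P)                  # singular system: re-pick the vinegars
    return vin + [(d * (b[0] * M[1][1] - b[1] * M[0][1])) % P,
                  (d * (M[0][0] * b[1] - M[1][0] * b[0])) % P]

x = [3, 5, 2, 6]                         # (vinegars | oils)
print(invert_layer(central_map(x), x[:V]) == x)  # True
```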
For historical reasons, a Rainbow type signature scheme is said to be a TTS
[33] scheme if the coefficients of Q are sparse.

2.1 Known Attacks and Security Criteria


1. Rank (or Low Rank, MinRank) attack, to find a central equation with the least
rank [33]:

  C_lowrank ≈ q^{v_1+1} · m(n²/2 − m²/6).

Here, as below, the unit is multiplications in K, and v_1 is the number
of vinegars in layer 1. This is the “MinRank” attack of [25], as improved
by [8, 33].
2. Dual Rank (or High Rank) attack [9, 25], which finds a variable appearing
the fewest number of times in a central equation cross-term [18, 33]:

  C_highrank ≈ q^{o_u − v′} · n³/6,

where v′ counts the vinegar variables that never appear until the final segment.
3. Trying for a direct solution. The complexity is roughly that of MQ(q; m, m).
4. Using the Reconciliation Attack [18], the complexity is that of MQ(q; v_u, m).
5. Using the Rainbow Band Separation from [18], the complexity is determined
by that of MQ(q; n, m + n).
6. Against TTS, there is Oil-and-Vinegar Separation [30, 26, 27], which finds an
Oil subspace that is sufficiently large (estimates as corrected in [33]):

  C_UOV ≈ q^{n−2o−1} o⁴ + (some residual term bounded by o³ q^{m−o}/3).

Here o is the maximal oil set size, i.e., there is a set of o central variables which are
never multiplied together in the central equations, and no more.

2.2 Choosing Rainbow Instances


First suppose that we wish to use SHA-1, which has 160 bits. It is established
by [18] that using GF(2^8) there is no way to get to 2^80 security using roughly
that length hash, unpadded.
Specifically, to get the complexity of MQ(2^8; m, m) above 2^80 (the direct
attack), we need about m = 24. Then we need MQ(2^8; n, n + m) above 2^80
(the Rainbow Band Separation), which requires at least n = 42. This requires
a 192-bit hash digest plus padding and a signature length of 336 bits with the
vinegar sequence (18, 12, 12).
If we look at smaller fields, that’s a different story. If we use GF(2^4), we need
20 oil variables in each of the later segments and at least 20 vinegar variables in the
first segment to get by the MinRank and High Rank attacks. To be comparable
to the sizes of 3IC-p, we choose the vinegar (structural) sequence (24, 20, 20).
The digest is 160 bits and the signature 192 bits. We use random parameters under
this framework and do not do TTS. The implementations are described below. In
each of the two instances, the central map is inverted by setting up and solving
two identically-sized linear systems.

2.3 Choosing TTS Instances


TTS instances of the same size over GF(2^8) or GF(2^4) run at twice or more the
speed of a Rainbow instance. They also tend to have much lower memory
requirements, but their security is not as well studied.
The following are TTS instances built with exactly the same Rainbow structural
parameters, called henceforth TTS/7. They have exactly the same size
input and output as the corresponding Rainbow instances:
TTS (2^8, 18, 12, 12): K = GF(2^8), n = 42, m = 24. Q is structured as follows:

  y_i = x_i + a_{i1} x_{σ_i} + a_{i2} x_{σ′_i} + Σ_{j=0}^{11} p_{ij} x_{j+18} x_{π_i(j)}
        + p_{i,12} x_{π_i(12)} x_{π_i(15)} + p_{i,13} x_{π_i(13)} x_{π_i(16)} + p_{i,14} x_{π_i(14)} x_{π_i(17)},  i = 18, . . . , 29

  [indices 0, . . . , 17 appear exactly once in each random permutation π_i,
  and exactly once among the σ, σ′ (where six σ_i slots are empty)];

  y_i = x_i + a_{i1} x_{σ_i} + a_{i2} x_{σ′_i} + a_{i3} x_{σ″_i} + Σ_{j=0}^{11} x_{j+29} (p_{ij} x_{π_i(j)} + p_{i,j+12} x_{π_i(j+12)})
        + p_{i,24} x_{π_i(24)} x_{π_i(27)} + p_{i,25} x_{π_i(25)} x_{π_i(28)} + p_{i,26} x_{π_i(26)} x_{π_i(29)},  i = 30, . . . , 41

  [indices 0, . . . , 29 appear exactly once in each random permutation π_i,
  and exactly once among the σ, σ′, σ″ (where six σ_i slots are empty)].

TTS (2^4, 24, 20, 20): K = GF(2^4), n = 64, m = 40.

  y_i = x_i + a_{i1} x_{σ_i} + a_{i2} x_{σ′_i} + Σ_{j=0}^{19} p_{ij} x_{j+23} x_{π_i(j)}
        + p_{i,20} x_{π_i(20)} x_{π_i(22)} + p_{i,21} x_{π_i(21)} x_{π_i(23)},  i = 24, . . . , 43

  [indices 0, . . . , 23 appear exactly once in each random permutation π_i,
  and exactly once among the σ, σ′ (there are only four σ_i)];

  y_i = x_i + a_{i1} x_{σ_i} + a_{i2} x_{σ′_i} + a_{i3} x_{σ″_i} + Σ_{j=0}^{19} x_{j+44} (p_{ij} x_{π_i(j)} + p_{i,j+20} x_{π_i(j+20)})
        + p_{i,40} x_{π_i(40)} x_{π_i(42)} + p_{i,41} x_{π_i(41)} x_{π_i(43)},  i = 44, . . . , 63

  [indices 0, . . . , 43 appear exactly once in each random permutation π_i,
  and exactly once among the σ, σ′, σ″ (there are only four σ_i)].

3 The ℓ-Invertible Cycle (ℓIC) and Derivatives

The ℓ-invertible cycle (ℓIC) [17] can best be considered an improved version or
extension of Matsumoto–Imai, otherwise known as C∗ [28]. Let us first review the
latter.
Triangular (and Oil-and-Vinegar, and variants thereof) systems are sometimes
called “single-field” or “small-field” approaches to MPKC design, in contrast to
the approach taken by Matsumoto and Imai in 1988. In such “big-field” variants,
the central map is really a map in a larger field L, a degree n extension of a finite
field K. To be quite precise, we have a map Q : L → L that we can invert, and
pick a K-linear bijection φ : L → K^n. Then we have the following multivariate
polynomial map, which is presumably quadratic (for efficiency):

  Q̄ = φ ◦ Q ◦ φ⁻¹.  (2)

Then one “hides” this map Q̄ by composing from both sides with two invertible
affine linear maps S and T in K^n, as in Eq. 1.
Matsumoto and Imai suggest that we pick a K of characteristic 2 and this
map Q:

  Q : x → y = x^{1+q^α},  (3)

where x is an element in L, and such that gcd(1 + q^α, q^n − 1) = 1. The last
condition ensures that the map Q has an inverse, which is given by

  Q⁻¹(x) = x^h,  (4)

where h(1 + q^α) = 1 mod (q^n − 1). This ensures that we can decrypt any secret
message easily via this inverse. Hereafter we will simply identify the vector space
K^k with the larger field L, and Q̄ with Q, totally omitting the isomorphism φ from
the formulas.
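The invert-by-exponentiation condition is easy to sanity-check in a toy big field (our own choice: GF(2^5) with modulus x^5 + x^2 + 1, and q = 2, α = 1, n = 5, so gcd(1 + q^α, q^n − 1) = gcd(3, 31) = 1 and h = 3^{−1} mod 31 = 21 — nothing here is a parameter from the paper):

```python
MOD = 0b100101           # x^5 + x^2 + 1, irreducible over GF(2)

def gf_mul(a, b):
    """Carry-less multiplication in GF(2^5) with reduction by MOD."""
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0b100000:  # degree-5 overflow: reduce
            a ^= MOD
    return r

def gf_pow(x, e):
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, x)
        x = gf_mul(x, x)
        e >>= 1
    return r

h = pow(3, -1, 31)       # 21, since 3 * 21 = 63 ≡ 1 (mod 31)
# Q(x) = x^(1+q^alpha) = x^3 is inverted by x -> x^h on all of L*:
print(all(gf_pow(gf_pow(x, 3), h) == x for x in range(1, 32)))  # True
```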
ℓIC also uses an intermediate field L = K^k and extends C∗ by using the
following central map from (L∗)^ℓ to itself:

  Q : (X_1, . . . , X_ℓ) → (Y_1, . . . , Y_ℓ) := (X_1 X_2, X_2 X_3, . . . , X_{ℓ−1} X_ℓ, X_ℓ X_1^{q^α}).  (5)

For “standard 3IC”, ℓ = 3 and α = 0. Inversion in (L∗)^3 is then easy:

  Q⁻¹ : (Y_1, Y_2, Y_3) ∈ (L∗)^3 → (√(Y_1 Y_3 / Y_2), √(Y_1 Y_2 / Y_3), √(Y_2 Y_3 / Y_1)).  (6)

Most of the analysis of the properties of the 3IC map can be found in [17] —
the 3IC and C∗ maps have a lot in common. Typically, we take out 1/3 of the
variables with a minus variation (3IC−).
For encryption schemes, “2IC”, i.e., ℓ = 2, q = 2, α = 1, is suggested:

  Q_2IC : (X_1, X_2) → (X_1 X_2, X_1 X_2²),   Q⁻¹_2IC : (Y_1, Y_2) → (Y_1²/Y_2, Y_2/Y_1).  (7)
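The 2IC forward/inverse pair can be checked in a few lines; for illustration we compute over the rationals (our own sketch — the scheme itself lives in a characteristic-2 extension field, but the identity holds over any field):

```python
from fractions import Fraction

def q_2ic(x1, x2):
    # forward map (X1*X2, X1*X2^2)
    return x1 * x2, x1 * x2**2

def q_2ic_inv(y1, y2):
    # inverse map (Y1^2/Y2, Y2/Y1)
    return y1**2 / y2, y2 / y1

x1, x2 = Fraction(3), Fraction(5)
print(q_2ic_inv(*q_2ic(x1, x2)) == (x1, x2))  # True
```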

We construct 2ICi like we do PMI [12]: take v = (v_1, . . . , v_r) to be an r-tuple of
random affine forms in the variables x. Let f = (f_1, . . . , f_n) be a random n-tuple
of quadratic functions in v. Let our new Q be defined by

  x → y = Q_2IC(x) + f(v(x)),

where the power operation assumes the vector space to represent a field. The
number of Patarin relations decreases quickly down to 0 as r increases. For every
y, we may find Q⁻¹(y) by guessing at v(x) = b, finding a candidate x =
Q⁻¹_2IC(y + b), and checking the initial assumption that v(x) = b. Since we repeat
the expensive raising-to-the-h-th-power procedure q^r times, we are almost forced to
take q = 2 and make r as low as possible.

3.1 Known Attacks to Internal Perturbation and Defenses


ℓIC has so much in common with C∗ that we need the same variations. In
other words, we need to do 3IC−p (with minus and projection) and 2IC+i (with
internal perturbation and plus), paralleling C∗−p and C∗+i (a.k.a. PMI+).
The cryptanalysis of PMI and hence 2ICi depends on the idea that for a
randomly chosen b, the probability is q^{−r} that it lies in the kernel K of the
linear part of v. When that happens, v(x + b) = v(x) for any x. Since q^{−r} is
not too small, if we can distinguish between a vector b ∈ T⁻¹K (back-mapped
into x-space) and b ∉ T⁻¹K, we can bypass the protection of the perturbation,
find our bilinear relations, and accomplish the cryptanalysis.
In [23], Fouque, Granboulan and Stern built a one-sided distinguisher using a
test on the kernel of the polar form or symmetric difference DP(w, b) = P(b +
w) − P(b) − P(w). We say that t(b) = 1 if dim ker_w DP(b, w) = 2^{gcd(n,α)} − 1,
and t(b) = 0 otherwise. If b ∈ K, then t(b) = 1 with probability one; otherwise
it is less than one. In fact, if gcd(n, α) > 1, it is an almost perfect distinguisher.
We omit the gory details and refer the reader to [23] for the complete differential
cryptanalysis.
Typically, to defeat this attack, we need to add a random equations to the
central map. For 2ICi as for PMI, both a and r are roughly proportional to
n, creating 2IC+i like we did PMI+ [13]. PMI+(n, r, a, α) refers to a map from
GF(2^n) with r perturbations, a extra equations, and a central map of x → x^{2^α+1}.
Similarly, 2IC+i(n, r, a) refers to 2IC with r perturbation dimensions and a
added equations.

3.2 Known Attacks on Minus Variants and Defenses


The attack found by Stern et al. can be explained by considering the case of the C∗
cryptosystem. We recall that the symmetric differential of any function G,
defined formally as

DG(a, x) := G(x + a) − G(x) − G(a) + G(0),


Practical-Sized Instances of Multivariate PKCs 103

is bilinear and symmetric in its variables a and x. Let ζ be an element in the
big field L. Then we have

DQ(ζ · a, x) + DQ(a, ζ · x) = (ζ^{q^α} + ζ) DQ(a, x).

Clearly the public key of C∗− inherits some of that symmetry. Now, not every
skew-symmetric action by a matrix Mζ corresponding to an L-multiplication
results in Mζ^T Hi + Hi Mζ being in the span of the public-key differential
matrices, because S := span{Hi : i = 1 · · · n − r}, as compared to span{Hi :
i = 1 · · · n}, is missing r of the basis matrices. However, as the authors of [20]
argued heuristically and backed up with empirical evidence, if we just pick the
first three Mζ^T Hi + Hi Mζ matrices, or any three random linear combinations of
the form Σ_{i=1}^{n−r} b_i (Mζ^T Hi + Hi Mζ), and demand that they fall in S, then
1. there is a good chance of finding a nontrivial Mζ satisfying that requirement;
2. this matrix really corresponds to a multiplication by ζ in L;
3. applying the skew-symmetric action of this Mζ to the public-key matrices
leads to other matrices in span{Hi : i = 1 · · · n} that are not in S.
Why three? There are n(n − 1)/2 degrees of freedom in the Hi, so to cut out a
span of n − r matrices takes n(n − 3)/2 + r linear relations among its components
(n − r and not n because, if we are attacking C∗−, we are missing r components
of the public key). There are n^2 degrees of freedom in an n × n matrix U. So, if
we take a random public key, it is always possible to find a U such that

U^T H1 + H1 U, U^T H2 + H2 U ∈ S = span{Hi : i = 1 · · · n − r},

provided that 3n > 2r. However, if we ask that

U^T H1 + H1 U, U^T H2 + H2 U, U^T H3 + H3 U ∈ S,

there are many more conditions than degrees of freedom, hence it is unlikely to
find a nontrivial solution for truly random Hi. Conversely, for a set of public keys
from C∗, the tests in [20] show that this procedure almost surely eventually recovers
the missing r equations and breaks the scheme.
Similarly, [24] and the related [29] show that a similar attack (with a more complex
back end) almost surely breaks 3IC− and any other ℓIC−. For the ℓIC case, the
point is that the differential exposes the symmetry of the linear map (X1, X2, X3) →
(ξ1 X1, ξ2 X2, ξ3 X3). Exactly the same symmetric property is found, enabling the
same kind of attacks.
It was pointed out in [15] that internal perturbation is almost exactly equivalent to
both vinegar variables and projection, i.e., fixing the input to an affine subspace.
Let s be one, two or more; we basically set s variables of the public key to
zero to create the new public key. However, in the case of signature schemes,
each projected dimension slows down the signing process by a factor of q. A
104 A.I.-T. Chen et al.

differential attack looks for an invariant or a symmetry. Restricting to a subspace


of the original w-space breaks a symmetry. Something like the Minus variant
destroys an invariant. Hence the use of projection by itself prevents some attacks.
In [19], it was checked experimentally, for various C ∗ parameters n and θ, the
effect of restricting the internal function to a randomly chosen subspace H of
various dimensions s. This is a projected C ∗− instance of parameters (q, n, r, s).
We repeated this check for 3IC− and discover that again the attacks from [24,29]
are prevented. We call this setup 3IC− p(q, k, s).

3.3 Choosing Instances


For signature schemes, we choose C∗−p(2^4, 74, 22, 1), which uses 208-bit hashes
and is related to the original FLASH in that it uses variables half as wide
and projects one dimension. We also choose 3IC−p(2^4, 32, 1), which acts on 256-bit hashes.
To invert the public map of projected minus signature schemes:
1. Put in random numbers to the “minus” coordinates.
2. Invert the linear transformation T to get y.
3. Invert the central map C ∗ or 3IC to get x.
4. Invert the final linear transformation S to get w.
5. If the last component (nybble) of w is zero, return the rest; else go to step
1 and repeat.
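The retry loop above can be sketched in miniature. Everything below is a toy stand-in: the linear maps are XORs with fixed constants, the central-map inverse is a vector reversal, and the names `sign`, `invert_T`, `invert_central`, `invert_S`, `T_MASK`, `S_MASK`, `MINUS`, and `DIGEST` are invented for this sketch. Only the control flow — randomize the minus coordinates, invert the three layers, reject until the projected nybble is zero — mirrors the five steps.

```python
import random

MINUS = 2                  # number of removed ("minus") coordinates
DIGEST = 6                 # hash length in GF(16) elements
N = DIGEST + MINUS         # width of the full maps

# Toy stand-ins for the real maps: T^-1 and S^-1 are XORs with fixed
# constants, and the central map is "inverted" by reversing the vector.
T_MASK = [3, 1, 4, 1, 5, 9, 2, 6]
S_MASK = [2, 7, 1, 8, 2, 8, 1, 8]

def invert_T(z):
    return [a ^ b for a, b in zip(z, T_MASK)]

def invert_central(y):     # placeholder for inverting C* or 3IC
    return list(reversed(y))

def invert_S(x):
    return [a ^ b for a, b in zip(x, S_MASK)]

def sign(digest):
    while True:
        # 1. fill the "minus" coordinates with random field elements
        z = digest + [random.randrange(16) for _ in range(MINUS)]
        y = invert_T(z)            # 2. undo T
        x = invert_central(y)      # 3. undo the central map
        w = invert_S(x)            # 4. undo S
        if w[-1] == 0:             # 5. projection: last nybble must be 0
            return w[:-1]          #    accept; otherwise retry

random.seed(1)
sig = sign([7, 0, 12, 3, 9, 1])
print(len(sig))                    # N - 1 = 7
```

Since the projected nybble is uniform for each guess, the loop runs q = 16 times on average, matching the per-dimension slowdown factor mentioned in Section 3.2.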
For the encryption schemes, we choose PMI+(136, 6, 18, 8), 2IC+i(128, 6, 16),
and 2IC+i(256, 12, 32).
To invert the public map of internally perturbed plus encryption schemes:
1. Invert the linear transformation T to get y.
2. Guess the vector b = v(x).
3. Invert the central map C ∗ or 3IC on y − b to get x.
4. Verify b = v(x) and the a extra central equations; if they do not hold, then
return to step 2 and repeat.
5. Invert the final linear S to get w.

4 Implementation Techniques
Most of the techniques here are not new; we merely implement them. However, we
do suggest that the bit-sliced Gaussian elimination idea is new.

4.1 Evaluation of Public Polynomials


We pretty much follow the suggestions of [4] for evaluation of the public polynomials:
over GF(2^8) we use traditional methods, i.e., logarithm and exponential
tables (a full 64 kB multiplication table is faster for long streaming work but
has a much higher up-front time cost for one-time use). Over GF(2^4) we use
bit-slicing and build lookup tables of all the cross-terms. Over GF(2) we evaluate
only the non-zero polynomials.
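For concreteness, here is a minimal sketch of the log/exp-table multiplication over GF(2^8); the AES polynomial 0x11B and the generator 0x03 are assumptions of this sketch — the schemes are free to use a different irreducible polynomial.

```python
# Build log/exp tables for GF(2^8) mod the AES polynomial 0x11B,
# using 0x03 as a generator of the multiplicative group.
EXP = [0] * 510            # doubled, so EXP[log a + log b] needs no reduction
LOG = [0] * 256

x = 1
for i in range(255):
    EXP[i] = EXP[i + 255] = x
    LOG[x] = i
    # multiply x by the generator 0x03 = 0x02 ^ 0x01
    x2 = x << 1
    if x2 & 0x100:
        x2 ^= 0x11B
    x = x2 ^ x

def gf256_mul(a, b):
    """One table-driven multiplication, as used when evaluating the
    public polynomials term by term."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]

print(hex(gf256_mul(0x57, 0x83)))   # FIPS-197's worked example: 0xc1
```

The two 256-entry tables occupy well under a kilobyte, which is why they beat the full 64 kB table when the multiplier is used only briefly.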

4.2 Operating on Tower Fields


When inverting the central map, we operate the big-field systems using tower
fields as much as we can. We note that, firstly, GF(2) = {(0)_2, (1)_2}, where
(·)_2 means the binary representation. Then t^2 + t + (1)_2 is irreducible over
GF(2). We can implement GF(2^{2^i}) recursively: with a proper choice of α_i,
we let GF(2^{2^i}) = GF(2^{2^{i−1}})[t_i]/(t_i^2 + t_i + α_i). One can also verify
that α_{i+1} := α_i t_i will lead to a good series of extensions.
For a, b, c, d ∈ GF(2^{2^{i−1}}), we can do a Karatsuba-style multiplication

(a t_i + b)(c t_i + d) = [(a + b)(c + d) + bd] t_i + [ac α_i + bd],

where the additions are bitwise XORs and the multiplications of the expressions
in a, b, c, d and α_i are done in GF(2^{2^{i−1}}). Division can be effected via

(a t_i + b)^{-1} = (a t_i + a + b)(ab + b^2 + a^2 α_i)^{-1}.


While most of the instances we work with only use tower fields going up by
powers of two, a degree-three extension is similar, with the extension quotiented
against t^3 + t + 1 or a similar polynomial, and a three-way Karatsuba
is relatively easy. We can do a similar thing for degree-five extensions.
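The recursion above can be sketched directly, packing an element of GF(2^{2^k}) into a 2^k-bit integer whose high half is a and low half is b; the closed form α_k = 1 << (2^{k−1} − 1), which follows from α_{i+1} := α_i t_i, is a convention of this sketch.

```python
def gf_mul(x, y, k):
    """Multiply x, y in GF(2^(2^k)), built as the tower
    GF(2^(2^k)) = GF(2^(2^(k-1)))[t]/(t^2 + t + alpha_k)."""
    if k == 0:                       # base case: GF(2)
        return x & y
    h = 1 << (k - 1)                 # bits per half
    a, b = x >> h, x & ((1 << h) - 1)
    c, d = y >> h, y & ((1 << h) - 1)
    bd = gf_mul(b, d, k - 1)
    hi = gf_mul(a ^ b, c ^ d, k - 1) ^ bd          # (a+b)(c+d) + bd
    alpha = 1 << (h - 1)             # alpha_1 = 1, alpha_2 = t_1, ...
    lo = gf_mul(gf_mul(a, c, k - 1), alpha, k - 1) ^ bd
    return (hi << h) | lo

def gf_inv(x, k):
    """Invert via (a t + b)^-1 = (a t + a + b)(ab + b^2 + a^2 alpha)^-1."""
    if k == 0:
        return x                     # 1^-1 = 1 in GF(2)
    h = 1 << (k - 1)
    a, b = x >> h, x & ((1 << h) - 1)
    alpha = 1 << (h - 1)
    delta = (gf_mul(a, b, k - 1) ^ gf_mul(b, b, k - 1)
             ^ gf_mul(gf_mul(a, a, k - 1), alpha, k - 1))
    dinv = gf_inv(delta, k - 1)
    return (gf_mul(a, dinv, k - 1) << h) | gf_mul(a ^ b, dinv, k - 1)

# In GF(4) = GF(2)[t]/(t^2 + t + 1): t * t = t + 1, i.e. 2 * 2 = 3.
print(gf_mul(2, 2, 1), gf_inv(2, 1))   # 3 3
```

The inversion recursion mirrors the division formula above; at the bottom of the tower everything reduces to single ANDs and XORs.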

4.3 Bit-Sliced GF(16) Rainbow Implementations


It is noted in [4] that GF(4) and GF(16) can be bit-sliced to good effect.
Actually, any GF(2^k) for small k can be bit-sliced this way. In particular, it is
possible to exploit the bit-slicing to evaluate the private map:
1. Invert the linear transformation T to get y from z. We can use bit-slicing
here to multiply each z_i by one column of the matrix M_T^{-1}.
2. Guess at the initial block of vinegar variables.
3. Compute the first system to be solved.
4. Solve the first system via Gauss-Jordan elimination with bitslice.
5. Compute the second system to be solved.
6. Solve the second system via Gauss-Jordan elimination with bitslice. We have
computed all of x.
7. Invert the linear transformation S to get w from x.
Note that during the bit-sliced solving, every equation can be stored as four bit-vectors
(here 32-bit words, i.e., double words, suffice), which store every coefficient along
with the constant term. In doing Gauss-Jordan elimination, we use a sequence
of bit-test choices to multiply the pivot equation so that the pivot coefficient
becomes 1, and then use bit-sliced SIMD multiplication to add the correct
multiple to every other equation. Bit-sliced GF(16) is not used for TTS since
the set-up takes too much time.
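As an illustration of the SIMD primitive used inside the bit-sliced solver, here is a sketch of lane-wise GF(16) multiplication, with Python integers standing in for 32-bit machine words; representing GF(16) as GF(2)[u]/(u^4 + u + 1) (rather than as a tower field) is an assumption of this sketch.

```python
def slice4(vals):
    """Pack a list of GF(16) elements into four bit-plane words."""
    planes = [0, 0, 0, 0]
    for lane, v in enumerate(vals):
        for bit in range(4):
            planes[bit] |= ((v >> bit) & 1) << lane
    return planes

def unslice4(planes, n):
    return [sum(((planes[bit] >> lane) & 1) << bit for bit in range(4))
            for lane in range(n)]

def bitsliced_mul(a, b):
    """Lane-wise product in GF(2)[u]/(u^4 + u + 1); a, b are bit-planes."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    p0 = a0 & b0
    p1 = (a0 & b1) ^ (a1 & b0)
    p2 = (a0 & b2) ^ (a1 & b1) ^ (a2 & b0)
    p3 = (a0 & b3) ^ (a1 & b2) ^ (a2 & b1) ^ (a3 & b0)
    p4 = (a1 & b3) ^ (a2 & b2) ^ (a3 & b1)
    p5 = (a2 & b3) ^ (a3 & b2)
    p6 = a3 & b3
    # reduce with u^4 = u + 1, u^5 = u^2 + u, u^6 = u^3 + u^2
    return [p0 ^ p4, p1 ^ p4 ^ p5, p2 ^ p5 ^ p6, p3 ^ p6]

def ref_mul(x, y):                  # plain reference multiplication
    r = 0
    while y:
        if y & 1:
            r ^= x
        x <<= 1
        if x & 0x10:
            x ^= 0x13               # u^4 + u + 1
        y >>= 1
    return r

xs = list(range(16))
ys = [7] * 16
out = unslice4(bitsliced_mul(slice4(xs), slice4(ys)), 16)
print(out == [ref_mul(x, 7) for x in xs])   # True
```

On a real 32-bit word, one call to `bitsliced_mul` multiplies 32 GF(16) lanes at once, which is exactly what the Gauss-Jordan row updates exploit.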

4.4 TTS Implementations


There are a few things to note:
1. Due to the sparsity of the central maps, setting up the Gaussian elimination
to run using bit-slicing takes too much time. Hence, for TTS in GF(16) we
complete the entire computation of the private map expressing each GF(16)
element as a nybble (4 bits, or half a byte), and start the evaluation of the
public map by converting the nybble vector, packed two to a byte, to the
bit-sliced form.
2. Again for GF(16), we maintain two 4 kB multiplication tables that allow
us to look up either abc, or ab and ac at the same time.
3. We use the special form of key generation mentioned in [33, 34]. That is,
following Imai and Matsumoto [28], we divide the coefficients involved in
each public key polynomial into linear, square, and cross-term portions thus:

z_k = Σ_i P_ik w_i + Σ_i Q_ik w_i^2 + Σ_{i<j} R_ijk w_i w_j
    = Σ_i w_i [ P_ik + Q_ik w_i + Σ_{j>i} R_ijk w_j ].

The R_ijk, which comprise most of the public key, may be computed as in [34]:

R_ijk = Σ_{ℓ=n−m}^{n−1} (M_T)_{k, ℓ−n+m} ( Σ_{p x_α x_β in y_ℓ} p ((M_S)_{αi} (M_S)_{βj} + (M_S)_{αj} (M_S)_{βi}) ).

The inner sum is over all cross-terms p x_α x_β in the central equation for
y_ℓ. For every pair i < j, we can compute at once the R_ijk for every k in O(n^2),
totalling O(n^4). Similar computations for the P_ik and Q_ik take even less time.
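The double-lookup tables from item 2 can be sketched as follows. The 12-bit index layout (a<<8)|(b<<4)|c and the packing of (ab, ac) into the two nybbles of one byte are conventions of this sketch, as is the `gf16_mul` reference routine; with one byte per entry, each table then occupies exactly 2^12 = 4096 bytes, i.e., 4 kB.

```python
def gf16_mul(x, y):        # reference GF(16) = GF(2)[u]/(u^4 + u + 1)
    r = 0
    while y:
        if y & 1:
            r ^= x
        x <<= 1
        if x & 0x10:
            x ^= 0x13
        y >>= 1
    return r

# Table 1: TRIPLE[(a<<8)|(b<<4)|c] = a*b*c              (4096 bytes)
# Table 2: PAIR[(a<<8)|(b<<4)|c]   = (a*b)<<4 | (a*c)   (4096 bytes)
TRIPLE = bytearray(4096)
PAIR = bytearray(4096)
for a in range(16):
    for b in range(16):
        ab = gf16_mul(a, b)
        for c in range(16):
            idx = (a << 8) | (b << 4) | c
            TRIPLE[idx] = gf16_mul(ab, c)
            PAIR[idx] = (ab << 4) | gf16_mul(a, c)

a, b, c = 5, 9, 11
packed = PAIR[(a << 8) | (b << 4) | c]
print(TRIPLE[(a << 8) | (b << 4) | c], packed >> 4, packed & 0xF)
```

One memory access thus replaces either two field multiplications (ab and ac share the operand a) or a three-way product abc, which is where the speed-up comes from.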
The instances that we chose are tested not to suffer from the same kinds of attacks
that felled previous TTS schemes, but we still do not have any conclusive evidence
one way or the other on how well this type of system can stand up in the long run.

Acknowledgements
The authors thank Prof. Jintai Ding and Pei-Yuan Wu for invaluable comments
and discussions, and the National Science Council for sponsorship under
Grant 96-2221-E-001-031-MY3.

References
1. Akkar, M.-L., Courtois, N.T., Duteuil, R., Goubin, L.: A fast and secure imple-
mentation of SFLASH. In: Desmedt, Y.G. (ed.) PKC 2003. LNCS, vol. 2567, pp.
267–278. Springer, Heidelberg (2002)
2. Bardet, M., Faugère, J.-C., Salvy, B.: On the complexity of Gröbner basis com-
putation of semi-regular overdetermined algebraic equations. In: Proceedings of
the International Conference on Polynomial System Solving, pp. 71–74, Previously
INRIA report RR-5049 (2004)

3. Bardet, M., Faugère, J.-C., Salvy, B., Yang, B.-Y.: Asymptotic expansion of the
degree of regularity for semi-regular systems of equations. In: Gianni, P. (ed.)
MEGA 2005 Sardinia (Italy) (2005)
4. Berbain, C., Billet, O., Gilbert, H.: Efficient implementations of multivariate
quadratic systems. In: Biham, E., Youssef, A.M. (eds.) SAC 2006. LNCS, vol. 4356,
pp. 174–187. Springer, Heidelberg (2007)
5. Bernstein, D.J., Birkner, P., Joye, M., Lange, T., Peters, C.: Twisted Edwards
curves. In: Vaudenay, S. (ed.) AFRICACRYPT 2008. LNCS, vol. 5023, pp. 389–
405. Springer, Heidelberg (2008)
6. Bernstein, D.J., Lange, T.: Faster addition and doubling on elliptic curves. In:
Kurosawa, K. (ed.) ASIACRYPT 2007. LNCS, vol. 4833, pp. 29–50. Springer,
Heidelberg (2007)
7. Bernstein, D.J., Lange, T.: Inverted Edwards coordinates. In: Boztaş, S., Lu, H.-F.
(eds.) AAECC 2007. LNCS, vol. 4851, pp. 20–27. Springer, Heidelberg (2007)
8. Billet, O., Gilbert, H.: Cryptanalysis of Rainbow. In: De Prisco, R., Yung, M. (eds.)
SCN 2006. LNCS, vol. 4116, pp. 336–347. Springer, Heidelberg (2006)
9. Coppersmith, D., Stern, J., Vaudenay, S.: The security of the birational permuta-
tion signature schemes. Journal of Cryptology 10, 207–221 (1997)
10. Courtois, N., Goubin, L., Patarin, J.: SFLASH: Primitive specification (second
revised version), Submissions, Sflash, 11 pages (2002),
[Link]
11. Courtois, N.T., Klimov, A., Patarin, J., Shamir, A.: Efficient algorithms for solving
overdefined systems of multivariate polynomial equations. In: Preneel, B. (ed.)
EUROCRYPT 2000. LNCS, vol. 1807, pp. 392–407. Springer, Heidelberg (2000),
[Link]
12. Ding, J.: A new variant of the Matsumoto-Imai cryptosystem through perturbation.
In: Bao, F., Deng, R., Zhou, J. (eds.) PKC 2004. LNCS, vol. 2947, pp. 305–318.
Springer, Heidelberg (2004)
13. Ding, J., Gower, J.: Inoculating multivariate schemes against differential attacks.
In: Yung, M., Dodis, Y., Kiayias, A., Malkin, T. (eds.) PKC 2006. LNCS, vol. 3958.
Springer, Heidelberg (2006), [Link]
14. Ding, J., Gower, J., Schmidt, D.: Multivariate Public-Key Cryptosystems. In: Ad-
vances in Information Security. Springer, Heidelberg (2006)
15. Ding, J., Schmidt, D.: Cryptanalysis of HFEv and internal perturbation of HFE. In:
Vaudenay, S. (ed.) PKC 2005. LNCS, vol. 3386, pp. 288–301. Springer, Heidelberg
(2005)
16. Ding, J., Schmidt, D.: Rainbow, a new multivariable polynomial signature scheme.
In: Ioannidis, J., Keromytis, A., Yung, M. (eds.) ACNS 2005. LNCS, vol. 3531, pp.
164–175. Springer, Heidelberg (2005)
17. Ding, J., Wolf, C., Yang, B.-Y.: ℓ-invertible cycles for multivariate quadratic public
key cryptography. In: Okamoto, T., Wang, X. (eds.) PKC 2007. LNCS, vol. 4450,
pp. 266–281. Springer, Heidelberg (2007)
18. Ding, J., Yang, B.-Y., Chen, C.-H.O., Chen, M.-S., Cheng, C.-M.: New differential-
algebraic attacks and reparametrization of rainbow. In: Bellovin, S.M., Gennaro,
R., Keromytis, A., Yung, M. (eds.) ACNS 2008. LNCS, vol. 5037, pp. 242–257.
Springer, Heidelberg (2008), [Link]
19. Ding, J., Yang, B.-Y., Dubois, V., Cheng, C.-M., Chen, O.C.-H.: Breaking the
symmetry: a way to resist the new differential attack. In: ICALP 2008. LNCS.
Springer, Heidelberg (2008), [Link]

20. Dubois, V., Fouque, P.-A., Shamir, A., Stern, J.: Practical cryptanalysis of
SFLASH. In: Menezes, A. (ed.) CRYPTO 2007. LNCS, vol. 4622, pp. 1–12.
Springer, Heidelberg (2007)
21. Faugère, J.-C.: A new efficient algorithm for computing Gröbner bases (F4 ). Journal
of Pure and Applied Algebra 139, 61–88 (1999)
22. Faugère, J.-C.: A new efficient algorithm for computing Gröbner bases without
reduction to zero (F5 ). In: International Symposium on Symbolic and Algebraic
Computation — ISSAC 2002, pp. 75–83. ACM Press, New York (2002)
23. Fouque, P.-A., Granboulan, L., Stern, J.: Differential cryptanalysis for multivariate
schemes. In: Cramer, R. (ed.) EUROCRYPT 2005. LNCS, vol. 3494, pp. 341–353.
Springer, Heidelberg (2005)
24. Fouque, P.-A., Macario-Rat, G., Perret, L., Stern, J.: Total break of the ℓIC−
signature scheme. In: Public Key Cryptography, pp. 1–17 (2008)
25. Goubin, L., Courtois, N.T.: Cryptanalysis of the TTM cryptosystem. In: Okamoto,
T. (ed.) ASIACRYPT 2000. LNCS, vol. 1976, pp. 44–57. Springer, Heidelberg
(2000)
26. Kipnis, A., Patarin, J., Goubin, L.: Unbalanced Oil and Vinegar signature schemes.
In: Stern, J. (ed.) EUROCRYPT 1999. LNCS, vol. 1592, pp. 206–222. Springer,
Heidelberg (1999)
27. Kipnis, A., Shamir, A.: Cryptanalysis of the oil and vinegar signature scheme.
In: Krawczyk, H. (ed.) CRYPTO 1998. LNCS, vol. 1462, pp. 257–266. Springer,
Heidelberg (1998)
28. Matsumoto, T., Imai, H.: Public quadratic polynomial-tuples for efficient signature
verification and message-encryption. In: Günther, C.G. (ed.) EUROCRYPT 1988.
LNCS, vol. 330, pp. 419–453. Springer, Heidelberg (1988)
29. Ogura, N., Uchiyama, S.: Remarks on the attack of Fouque et al. against the
ℓIC scheme. Cryptology ePrint Archive, Report 2008/208 (2008),
[Link]
30. Wolf, C., Braeken, A., Preneel, B.: Efficient cryptanalysis of RSE(2)PKC and
RSSE(2)PKC. In: Blundo, C., Cimato, S. (eds.) SCN 2004. LNCS, vol. 3352, pp.
294–309. Springer, Heidelberg (2005), [Link]
31. Wolf, C., Preneel, B.: Taxonomy of public key schemes based on the problem of
multivariate quadratic equations. Cryptology ePrint Archive, Report 2005/077, 64
pages, May 12 (2005), [Link]
32. Yang, B.-Y., Chen, J.-M.: All in the XL family: Theory and practice. In: Park, C.-
s., Chee, S. (eds.) ICISC 2004. LNCS, vol. 3506, pp. 67–86. Springer, Heidelberg
(2005)
33. Yang, B.-Y., Chen, J.-M.: Building secure tame-like multivariate public-key cryp-
tosystems: The new TTS. In: Boyd, C., González Nieto, J.M. (eds.) ACISP 2005.
LNCS, vol. 3574, pp. 518–531. Springer, Heidelberg (2005)
34. Yang, B.-Y., Chen, J.-M., Chen, Y.-H.: TTS: High-speed signatures on a low-cost
smart card. In: Joye, M., Quisquater, J.-J. (eds.) CHES 2004. LNCS, vol. 3156,
pp. 371–385. Springer, Heidelberg (2004)
Digital Signatures Out of Second-Preimage
Resistant Hash Functions

Erik Dahmen^1, Katsuyuki Okeya^2, Tsuyoshi Takagi^3, and Camille Vuillaume^2

^1 Technische Universität Darmstadt, dahmen@[Link]
^2 Hitachi, Ltd., Systems Development Laboratory, {[Link],[Link]}@[Link]
^3 Future University, Hakodate, takagi@[Link]

Abstract. We propose a new construction for Merkle authentication
trees which does not require collision resistant hash functions; in contrast
with previous constructions that attempted to avoid the dependency on
collision resistance, our technique enjoys provable security assuming the
well-understood notion of second-preimage resistance. The resulting
signature scheme is existentially unforgeable when the underlying hash
function is second-preimage resistant, yields shorter signatures, and is
affected neither by birthday attacks nor by the recent progress in
collision-finding algorithms.

Keywords: Merkle signatures, provable security, second-preimage resistance.

1 Introduction
In 1979, Ralph Merkle proposed a digital signature scheme constructed out of
cryptographic hash functions only [7]. The interest of this scheme is that, unlike
most public-key cryptosystems, its security does not rely on number-theoretic
problems. Even if a particular hash function appears insecure, the scheme can be
easily repaired by using a different hash function. Finally, the current research
suggests that the Merkle signature scheme (MSS) will be only marginally affected
if large quantum computers are built, something that is not true for popular
public-key cryptosystems such as RSA and ECC.
The security of the original construction of the MSS relies on a collision re-
sistant hash function for the hash tree and a preimage resistant function for the
one-time signature stage [3]. Regarding security, this construction has two
drawbacks. First, recent attacks on the collision resistance of popular hash functions
such as MD5 [15] and SHA1 [14] show that collision resistance is a goal which
is hard to achieve. Second, the security level of Merkle signatures is determined
by the collision resistance property of the hash function and therefore affected
by birthday attacks.
In [8], the authors argue, without proof, that the security level of the MSS
should be determined by the second-preimage resistance property of the hash

J. Buchmann and J. Ding (Eds.): PQCrypto 2008, LNCS 5299, pp. 109–123, 2008.
c Springer-Verlag Berlin Heidelberg 2008
110 E. Dahmen et al.

function. Although no attack based on a collision finder is known for the MSS,
its security proof does not exclude the existence of such attacks. In addition,
Rohatgi proposes using target-collision resistant hash functions for achieving
goals that are similar to ours [11]. Unfortunately, practical hash functions were
not designed with target-collision resistance in mind, and keyed hash functions
such as HMAC lose all of their security properties when their key is revealed,
and as such, cannot be regarded as target-collision resistant. Although we agree
with [8] that second-preimage resistance should be at the heart of the security
of the MSS, we emphasize that until now, no satisfactory solution is known, at
least from a provable security perspective.
In this paper, we propose a new construction for Merkle authentication trees
and show that the resulting signature scheme is secure against adaptive chosen
message attacks, assuming a second-preimage resistant hash function and a se-
cure one-time signature scheme. Our construction is inspired by the XOR tree
proposed by Bellare and Rogaway for building universal one-way hash functions
out of universal one-way compression functions [1]. However, we use the XOR
tree for a totally different purpose, namely establishing the unforgeability of the
Merkle signature scheme, and we relax the assumption on the compression func-
tion to second-preimage resistance. Even for hash functions with short output
size, our scheme provably yields a high security level; compared to the original
MSS, not only security is improved, but the size of signatures is reduced as well.
The paper is organized as follows: in Section 2 we review security notions
for hash functions and signature schemes. In Section 3 we introduce the new
construction and its security proof. In Section 4 we estimate the security level
of the new scheme. In Section 5 we consider the problem of signing arbitrarily
long messages. In Section 6 we present practical considerations. In Section 7 we
state our conclusion.

2 Hash Functions and Signature Schemes


Hash Functions. We call H_K = {H_k : {0, 1}^* → {0, 1}^n}_{k∈K} a family of hash
functions, parameterized by a key k ∈ K, that map bit strings of arbitrary length
to bit strings of length n. There exist various security notions for hash functions;
see [10] for an overview. In this paper we focus on the three most popular ones,
namely preimage resistance, second-preimage resistance and collision resistance.
In the following, x ∈_R X means that x is chosen uniformly at random from X.
Preimage resistance. For any key k ∈_R K and y ∈_R H_k({0, 1}^n) it is computationally
infeasible to compute x ∈ {0, 1}^* such that H_k(x) = y.
Second-preimage resistance. For any key k ∈_R K and x ∈_R {0, 1}^n it is
computationally infeasible to compute x′ ∈ {0, 1}^* such that x′ ≠ x and
H_k(x) = H_k(x′).
Collision resistance. For any key k ∈_R K it is computationally infeasible to
compute x, x′ ∈ {0, 1}^* such that x ≠ x′ and H_k(x) = H_k(x′).
We call a family H_K of hash functions (t_ow, ε_ow) preimage resistant (respectively
(t_spr, ε_spr) second-preimage resistant or (t_cr, ε_cr) collision resistant) if, for any
Digital Signatures Out of Second-Preimage Resistant Hash Functions 111

adversary A that runs in time at most t_ow (resp. t_spr or t_cr), the probability of
finding a preimage (resp. second preimage or collision) is smaller than ε_ow (resp.
ε_spr or ε_cr).
Using generic (brute-force) attacks to compute preimages or second preimages,
one requires t_ow = t_spr = 2^{n−k} evaluations of the hash function to find a
preimage or second preimage with probability ε_ow = ε_spr = 1/2^k. Due to the
birthday paradox, one requires t_cr = 2^{n/2} evaluations of the hash function to
find a collision with probability ε_cr = 1/2.

Signature Schemes. A signature scheme Sign is defined as the triple (Gen,
Sig, Ver). Gen is the key-pair generation algorithm that on input a security
parameter 1^n produces a pair (sk, pk), where sk is the private key or signature
key and pk is the public key or verification key: (sk, pk) ← Gen(1^n). Sig is the
signature generation algorithm that on input a message M and private key sk
produces a signature σ(M) ← Sig(M, sk). Ver is the verification algorithm that
on input (M, σ(M), pk) checks whether the signature is valid, i.e., it outputs true
if and only if the signature is valid and false otherwise [4]. In the following, let
t_Gen, t_Sig, t_Ver be the times the algorithms Gen, Sig, Ver require for key
generation, signing and verification, respectively.
Let Sign = (Gen, Sig, Ver) be a signature scheme. We call Sign a (t, ε, Q)
signature scheme, or (t, ε, Q) existentially unforgeable under an adaptive chosen
message attack (CMA-secure), if for any forger For^{Sig_sk(·)}(pk) that has access to
a signing oracle Sig_sk(·) and that runs in time at most t, the success probability
in forging a signature is at most ε, where For^{Sig_sk(·)}(pk) can query the signing
oracle at most Q times [4].

3 Merkle Signatures Using Second-Preimage Resistance


In this section we describe our construction for the Merkle authentication tree,
from now on called SPR-Merkle tree, and prove that the CMA-security of the
resulting signature scheme (SPR-MSS) can be reduced to the second-preimage
resistance of the used hash function and the CMA-security of the chosen one-
time signature scheme (OTS). In the following, let H_K = {H_k : {0, 1}^{2n} →
{0, 1}^n}_{k∈K} be a family of (t_spr, ε_spr) second-preimage resistant hash functions.
Our construction differs from the original construction proposed by Merkle in
the following way: before applying the hash function to the concatenation of
two child nodes to compute their parent, both child nodes are XORed with
a randomly chosen mask. Also, a leaf of the SPR-Merkle tree is not the hash
value of the concatenation of the bit strings in the OTS verification key, but
the bit strings themselves. The SPR-Merkle tree is constructed starting directly
from these bit strings. For that reason, it is sufficient that the hash functions
Hk ∈ HK accept only bit strings of length at most 2n as input. In this section
we restrict the length of the message to be signed to n bits. The problem of
signing arbitrarily long messages is considered in Section 5.
112 E. Dahmen et al.

Fig. 1. The XOR construction for the SPR-Merkle tree: both child nodes are XORed with the random masks v_i[0], v_i[1] (part of the public key, two masks per level) before hashing with H_k

Key Pair Generation. The key pair generation of our scheme works as follows.
First choose h ≥ 1 to determine the number of signatures that can be generated
with this key pair, i.e., 2^h many. Next compute 2^h OTS key pairs (X_j, Y_j), for
j = 0, ..., 2^h − 1. We assume that each signature key and verification key consists
of 2^l bit strings, each of length n. Then choose a key for the hash function k ∈_R K
and masks v_i[0], v_i[1] ∈_R {0, 1}^n uniformly at random for i = 0, ..., h + l − 1. The
2^h · 2^l n-bit strings from the verification keys form the leaves of the SPR-Merkle
tree, which in total yields a tree of height h + l. The nodes are denoted by y_i[j],
where i = 0, ..., h + l denotes the height of the node in the tree (the root has
height 0 and the leaves have height h + l) and j = 0, ..., 2^i − 1 denotes the
position of the node at that height, counting from left to right. The inner nodes
are computed as

y_i[j] = H_k( (y_{i+1}[2j] ⊕ v_i[0]) ‖ (y_{i+1}[2j + 1] ⊕ v_i[1]) )

for i = h + l − 1, ..., 0 and j = 0, ..., 2^i − 1; see Figure 1. The SPR-MSS private
key consists of the 2^h OTS signature keys X_j, and the SPR-MSS public key
consists of

1. the key for the hash function, k,
2. the XOR masks v_0[0], v_0[1], ..., v_{h+l−1}[0], v_{h+l−1}[1], and
3. the root of the Merkle tree, y_0[0].

Remark 1. In case the number of bit strings L in the verification key of the
chosen OTS is not a power of 2, the resulting SPR-Merkle tree has height h +
⌈log_2 L⌉. The SPR-Merkle tree is then constructed such that the subtrees below
the 2^h nodes y_h[j] are unbalanced trees of height ⌈log_2 L⌉.
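The key-pair generation above can be sketched as follows; SHA-256 with the key k prepended and the output truncated to n bytes is just one way to instantiate the keyed family H_k, and the small parameters h = l = 2 are chosen only to keep the example fast.

```python
import hashlib, os

n = 32                     # hash output length in bytes
h, l = 2, 2                # 2^h OTS key pairs, 2^l strings per verification key
H = h + l                  # total tree height

def xor(a, b):
    return bytes(u ^ v for u, v in zip(a, b))

def H_k(k, data):          # keyed hash: one way to instantiate the family
    return hashlib.sha256(k + data).digest()[:n]

k = os.urandom(n)                                           # hash-function key
masks = [(os.urandom(n), os.urandom(n)) for _ in range(H)]  # (v_i[0], v_i[1])
leaves = [os.urandom(n) for _ in range(2 ** H)]             # OTS key strings

# y[i][j] = node j at height i; the root is y[0][0], leaves sit at height H
y = [None] * H + [leaves]
for i in range(H - 1, -1, -1):
    v0, v1 = masks[i]
    y[i] = [H_k(k, xor(y[i + 1][2 * j], v0) + xor(y[i + 1][2 * j + 1], v1))
            for j in range(2 ** i)]

root = y[0][0]
print(len(root) == n)      # True
```

Note that the masks are applied before concatenation and hashing, exactly as in the node formula above; the private key would be the OTS signature keys, and (k, masks, root) would form the public key.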

Signature Generation. For s ∈ {0, ..., 2^h − 1}, the sth signature of the message
M = (m_0, ..., m_{n−1})_2 is σ_s(M) = (s, σ_OTS(M), Y_s, A_s), where
– s is the index of the signature,
– σ_OTS(M) is the one-time signature of M, generated with X_s,
– Y_s is the sth verification key, and
– A_s = (a_h, ..., a_1) is the authentication path for Y_s, where a_i is the sibling
of the node at height i on the path from y_h[s] to the root y_0[0], i.e.,

a_i = y_i[⌊s/2^{h−i}⌋ − 1] if ⌊s/2^{h−i}⌋ ≡ 1 mod 2,
a_i = y_i[⌊s/2^{h−i}⌋ + 1] if ⌊s/2^{h−i}⌋ ≡ 0 mod 2,

for i = 1, ..., h.

Verification. The verification consists of two steps. First the verifier verifies the
one-time signature of the message M using the supplied verification key Y_s. Then
he verifies the authenticity of Y_s as follows. First he uses the 2^l bit strings in Y_s
to compute the inner node y_h[s] via

y_i[j] = H_k( (y_{i+1}[2j] ⊕ v_i[0]) ‖ (y_{i+1}[2j + 1] ⊕ v_i[1]) )

for i = h + l − 1, ..., h and j = s·2^{i−h}, ..., (s + 1)·2^{i−h} − 1. Then he uses the
authentication path A_s and recomputes the path from y_h[s] to the root y_0[0] as

p_i = H_k( (a_{i+1} ⊕ v_i[0]) ‖ (p_{i+1} ⊕ v_i[1]) ) if ⌊s/2^{h−i−1}⌋ ≡ 1 mod 2,
p_i = H_k( (p_{i+1} ⊕ v_i[0]) ‖ (a_{i+1} ⊕ v_i[1]) ) if ⌊s/2^{h−i−1}⌋ ≡ 0 mod 2,

for i = h − 1, ..., 0, with p_h = y_h[s]. The signature is valid if p_0 equals the
signer's public root y_0[0] and the verification of σ_OTS(M) was successful. Figure 2

Fig. 2. Notations of the SPR-MSS: the path p_h = y_h[s], ..., p_0 = y_0[0] from the verification key Y_s to the root, and its authentication path a_h, ..., a_1



illustrates how the authentication path can be utilized in order to recompute the
root y0 [0].
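The authentication-path round trip can be sketched end to end. For brevity this sketch flattens the two verification phases into a single loop over the full height H = h + l, treating the 2^l verification-key strings as ordinary leaves; the helper names and the SHA-256 instantiation of H_k are this sketch's own.

```python
import hashlib, os

n, h, l = 32, 2, 2
H = h + l

def xor(a, b):
    return bytes(u ^ v for u, v in zip(a, b))

def node(k, v, left, right):       # y_i[j] with masks v = (v_i[0], v_i[1])
    return hashlib.sha256(k + xor(left, v[0]) + xor(right, v[1])).digest()[:n]

k = os.urandom(n)
masks = [(os.urandom(n), os.urandom(n)) for _ in range(H)]
leaves = [os.urandom(n) for _ in range(2 ** H)]

# build every level; levels[i] holds the nodes at height i
levels = [leaves]
for i in range(H - 1, -1, -1):
    prev = levels[0]
    levels.insert(0, [node(k, masks[i], prev[2 * j], prev[2 * j + 1])
                      for j in range(len(prev) // 2)])
root = levels[0][0]

def auth_path(s):
    """Siblings along the path from leaf s up to the root."""
    path, j = [], s
    for i in range(H, 0, -1):
        path.append(levels[i][j ^ 1])   # sibling at height i
        j //= 2
    return path

def verify(s, leaf, path):
    p, j = leaf, s
    for i in range(H, 0, -1):
        sib = path[H - i]
        if j % 2 == 1:                  # path node is a right child
            p = node(k, masks[i - 1], sib, p)
        else:
            p = node(k, masks[i - 1], p, sib)
        j //= 2
    return p == root

s = 5
print(verify(s, leaves[s], auth_path(s)))   # True
```

The parity of the running index j decides on which side the sibling is concatenated, which is exactly the role of the ⌊s/2^{h−i−1}⌋ mod 2 test in the formulas above.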

3.1 Security of the SPR-MSS

We now reduce the CMA-security of the SPR-MSS to the second-preimage
resistance of H_K and the CMA-security of the used OTS. We do so by showing
in Algorithm 1 how a forger For^{Sig_sk(·)}(pk) for the SPR-MSS can be used to
construct an adversary Adv^{SPR,OTS} that either finds a second preimage for a
random element of H_K or breaks the CMA-security of a certain instance of the
underlying OTS. In Algorithm 1, we use the convention that when the algorithms
called by Adv^{SPR,OTS} fail, so does Adv^{SPR,OTS}.

Algorithm 1. Adv^{SPR,OTS}

Input: Hash function key k ∈_R K, tree height h ≥ 1, first preimage x ∈_R {0, 1}^{2n},
OTS instance with verification key Y and signing oracle Sig_X(·)
Output: Second preimage x′ ∈ {0, 1}^{2n} with x′ ≠ x and H_k(x) = H_k(x′), or an
existential forgery for the supplied instance of the OTS, or failure

1. Choose c ∈_R {0, ..., 2^h − 1} uniformly at random.
2. Generate OTS key pairs (X_j, Y_j), j = 0, ..., 2^h − 1, j ≠ c, and set Y_c ← Y.
3. Choose (a, b) ∈_R {(i, j) : i ∈ {0, ..., h + l − 1}, j ∈ {0, ..., 2^i − 1}}.
4. Choose random masks v_i[0], v_i[1] ∈_R {0, 1}^n, i = 0, ..., h + l − 1, i ≠ a.
5. Construct the Merkle tree up to height a + 1.
6. Choose v_a[0], v_a[1] ∈ {0, 1}^n such that

   x = (y_{a+1}[2b] ⊕ v_a[0]) ‖ (y_{a+1}[2b + 1] ⊕ v_a[1]).

   Note that y_a[b] = H_k(x).
7. Use v_a[0], v_a[1] to complete the key pair generation.
8. Run For^{Sig_sk(·)}(pk).
9. When For^{Sig_sk(·)}(pk) asks its qth oracle query with message M_q:
   (a) if q = c then obtain the one-time signature of M_q using the signing oracle
       Sig_X(·) provided as input: σ_OTS(M_q) ← Sig_X(M_q).
   (b) else compute σ_OTS(M_q) using the qth OTS signature key X_q.
   (c) Respond to the forger with the signature σ_q(M_q) = (q, σ_OTS(M_q), Y_q, A_q).
10. When For^{Sig_sk(·)}(pk) outputs a signature σ_s(M′) for M′:
   (a) Verify the signature σ_s(M′) = (s, σ′_OTS(M′), Y′_s, A′_s).
   (b) if (Y′_s, A′_s) ≠ (Y_s, A_s):
       i. if y_a[b] is computed during the verification as y_a[b] = H_k(x′) and x′ ≠ x
          holds, then return x′ as a second preimage of x.
       ii. else return failure.
   (c) else (in that case (Y′_s, A′_s) = (Y_s, A_s)):
       i. if s = c then return (σ′_OTS(M′), M′) as a forgery for the supplied instance
          of the OTS.
       ii. else return failure.

Note that since the first preimage x is chosen uniformly at random, so are
the masks v_a[0], v_a[1]. As a consequence, the adversary Adv^{SPR,OTS} creates an
environment identical to the signature-forging game played by the forger. We
now compute the success probability of Adv^{SPR,OTS}.

Case 1: (Y′_s, A′_s) ≠ (Y_s, A_s). The fact that the verification key Y′_s can be
authenticated against the root y_0[0] implies a collision of H_k; see Appendix C.
This collision can occur either during the computation of the inner node y_h[s] or
during the computation of the path from y_h[s] to the root y_0[0]. The adversary
Adv^{SPR,OTS} is successful in finding a second preimage of x if the node y_a[b]
is computed as y_a[b] = H_k(x′) with x′ ≠ x. Since the position of node y_a[b]
was chosen at random, the probability that the collision occurs precisely at this
position is at least 1/(2^{h+l} − 1). In total, the success probability of Adv^{SPR,OTS}
is at least ε/(2^{h+l} − 1), where ε is the success probability of the forger.

Case 2: (Y_s, A_s) = (Y'_s, A'_s). In this case (σ_OTS(M'), M') ≠ (σ_OTS(M_s), M_s)
holds, which implies that ForSig_sk(·)(pk) generated an existential forgery for one
instance of the underlying OTS. The probability that ForSig_sk(·)(pk) breaks
CMA-security of the supplied instance (s = c) is at least 1/2^h. In total, the
success probability of Adv_SPR,OTS is at least ε/2^h, where ε is the success proba-
bility of the forger.
Note that since both cases are complementary, one occurs with probability at
least 1/2. This leads to the following theorem:
Theorem 1 (Security of SPR-MSS). If H_K = {H_k : {0,1}^{2n} → {0,1}^n}_{k∈K}
is a family of (t_spr, ε_spr) second-preimage resistant hash functions with ε_spr ≤
1/(2^{h+l+1} − 2) and the used OTS is a (t_ots, ε_ots, 1) signature scheme with ε_ots ≤
1/2^{h+1}, then the SPR-MSS is a (t, ε, 2^h) signature scheme with

ε ≤ 2 · max{(2^{h+l} − 1) · ε_spr, 2^h · ε_ots}
t = min{t_spr, t_ots} − 2^h · t_Sig − t_Ver − t_Gen.

4 Comparison
Security Level. We compute the security level of the SPR-MSS and compare it
with the original MSS that relies on collision resistance (CR-MSS). As OTS we
use the Lamport–Diffie one-time signature scheme (LD–OTS) [6]. The following
theorem establishes the security of the LD–OTS (details of the reduction can be
found in Appendix A).

Theorem 2 (Security of LD-OTS). If F_K = {F_k : {0,1}^n → {0,1}^n}_{k∈K}
is a family of (t_ow, ε_ow) one-way functions with ε_ow ≤ 1/4n, then the LD–OTS
is a (t, ε, 1) signature scheme with

ε ≤ 4n · ε_ow
t = t_ow − t_Sig − t_Gen
116 E. Dahmen et al.

By combining Theorems 1 and 2, we get

ε ≤ 2 · max{(2^{h+log2(2n)} − 1) · ε_spr, 2^{h+log2(4n)} · ε_ow}    (1)
t = min{t_spr, t_ow} − 2^h · t_Sig − t_Ver − t_Gen.

Note that we can replace t_ots by t_ow rather than t_ow − t_Sig − t_Gen, since the
time the OTS requires for signing and key generation is already included in
Theorem 1.
The security level is computed as the quotient t/ε. For the values t_ow, t_spr, ε_ow
and ε_spr we consider the generic attacks of Section 2 and set

ε_ow = 1/2^{h+log2(4n)+1},    t_ow = 2^{n−h−log2(4n)−1}
ε_spr = 1/(2^{h+log2(2n)+1} − 2),    t_spr = 2^{n−log2(2^{h+log2(2n)+1} − 2)}    (2)
which yields ε = 1 in Equation (1). The times for signing, verifying, and key
generation are stated in terms of evaluations of F_k and H_k. We set t_Sig = (h+1)·n
(n to compute the LD–OTS signature and h·n as the average cost for the
authentication path computation using Szydlo’s algorithm [13]), t_Ver = 3n+h−1
(n to verify the LD–OTS signature, 2n−1 to compute the inner node y_h[s], and
h to compute the path to the root), and t_Gen = 2^{h+log2(2n)+1} − 1 (2^h · 2n to
compute the LD–OTS verification keys and 2^{h+log2(2n)} − 1 to compute the root).
By substituting these values, we get

t/ε = 2^{n−h−log2(4n)−1} − 2^{h+log2((h+1)n)} − 2^{log2(3n+h−1)} − 2^{h+log2(2n)+1} + 1.

The values for t_Sig, t_Ver and t_Gen affect the security level only for large h. Oth-
erwise the security level can be estimated as 2^{n−h−log2 n−4}.
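As a concrete check of these estimates, the exact quotient t/ε can be evaluated for the parameters used later in Table 2 (a sketch following the cost formulas above, with n = 128 and h = 20 so that log2(4n) = 9 and log2(2n) = 8 are exact):

```python
import math

n, h = 128, 20                      # hash output length and tree height

t_ow = 2 ** (n - h - 9 - 1)         # generic bound 2^(n-h-log2(4n)-1), log2(4n) = 9

t_sig = (h + 1) * n                 # LD-OTS signature + average auth-path cost
t_ver = 3 * n + h - 1               # verify OTS sig, recompute y_h[s], walk to root
t_gen = 2 ** (h + 1) * 2 * n - 1    # 2^h * 2n verification keys + tree root

# With eps = 1, the security level is t/eps = t_ow - 2^h*t_sig - t_ver - t_gen.
level = t_ow - 2 ** h * t_sig - t_ver - t_gen
level_bits = math.log2(level)
```

For these parameters the correction terms are negligible against the leading term, so the level is essentially 2^98, matching the SPR-MSS entry in Table 2.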
A similar result can be obtained for the security of the CR-MSS with the
following theorem (details of the reduction can be found in Appendix B).
Theorem 3 (Security of CR-MSS). If G_K = {G_k : {0,1}^{2n} → {0,1}^n}_{k∈K}
is a family of (t_cr, ε_cr) collision resistant hash functions with ε_cr ≤ 1/2 and the
underlying OTS is a (t_ots, ε_ots, 1) signature scheme with ε_ots ≤ 1/2^{h+1}, then the
CR-MSS is a (t, ε, 2^h) signature scheme with

ε ≤ 2 · max{ε_cr, 2^h · ε_ots}
t = min{t_cr, t_ots} − 2^h · t_Sig − t_Ver − t_Gen
By combining Theorems 2 and 3, we get

ε ≤ 2 · max{ε_cr, 2^{h+log2(4n)} · ε_ow}    (3)
t = min{t_cr, t_ow} − 2^h · t_Sig − t_Ver − t_Gen.

We now set ε_cr = 1/2 and t_cr = 2^{n/2} (see Section 2) and use the values for
ε_ow, t_ow from Equation (2), which yields ε = 1 in Equation (3). We further set
t_Sig = (h + 1) · n, t_Ver = n + 1 + h and t_Gen = 2^h · 2n + 2^{h+1} − 1 and get

t/ε = 2^{n/2} − 2^{h+log2((h+1)n)} − 2^{log2(n+1+h)} − 2^{h+log2(2n)} − 2^{h+1} + 1.

Again, the values for t_Sig, t_Ver and t_Gen affect the security level only for large h.
Otherwise, the security level can be estimated as 2^{n/2−1}.

Table 1. Security level of SPR-MSS and CR-MSS using the LD–OTS

Output length n               128        160           224           256
Security level of SPR-MSS     2^{118−h}  2^{148.67−h}  2^{212.19−h}  2^{244−h}
Maximal height of tree h      h ≤ 52     h ≤ 67        h ≤ 98        h ≤ 114
Security level of CR-MSS      2^63       2^79          2^111         2^127
Maximal height of tree h      h ≤ 50     h ≤ 65        h ≤ 96        h ≤ 112

Remark 2. It is possible to choose different trade-offs for the values in Equation
(2). This however would not affect the resulting security level but only the upper
bound for h. We chose these values because they correspond to the extreme case
ε = 1 in Equation (1), where Theorems 1 and 3 still hold.

Table 1 shows the security level of SPR-MSS and CR-MSS for different values of
n. It also shows the upper bounds for h such that the security level of SPR-MSS
and CR-MSS can be estimated as 2^{n−h−log2 n−4} and 2^{n/2−1}, respectively.
Table 1 shows that the security level is increased drastically when using the
SPR-MSS. As a consequence, the SPR-MSS not only has weaker security as-
sumptions, but hash functions with much smaller output size suffice to obtain
the same security level as the CR-MSS. Nowadays, a security level of at least 2^80
is required. When using n = 128, the SPR-MSS achieves a security level greater
than 2^80 for h ≤ 38. To obtain a similar security level with CR-MSS, one must
use n = 224.

Sizes. The CR-MSS public key consists of the root of the Merkle tree and the
key for the hash function. Assuming this key has bit length n, the size of a
CR-MSS public key is 2 · n bits. The SPR-MSS public key must also contain
the 2(h + l) XOR masks, each of bit length n. Therefore, in total the size of an
SPR-MSS public key is 2(h + l + 1) · n bits. In case of the LD–OTS we have
l = log2 2n. Using the same hash function, the signature size is the same for the
CR-MSS and the SPR-MSS. When using the LD–OTS, the one-time signature
of the message consists of n bit strings of length n. The verification key also
consists of n bit strings of length n, since half of the verification key can be
computed from the signature. The authentication path consists of h bit strings
of length n. In total, the size of a signature is (2n + h) · n bits. Table 2 compares
the signature and public key size of the SPR-MSS and the CR-MSS when using
h = 20.
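The size formulas above can be tabulated directly; the following sketch (using l = log2(2n) for the LD–OTS, as stated) reproduces the entries of Table 2:

```python
import math

def spr_mss_sizes(n, h):
    """Sizes in bits for SPR-MSS with LD-OTS, where l = log2(2n)."""
    l = int(math.log2(2 * n))
    public_key = 2 * (h + l + 1) * n   # root, hash key, and 2(h+l) XOR masks
    signature = (2 * n + h) * n        # OTS sig (n strings) + half key (n) + auth path (h)
    return public_key, signature

def cr_mss_sizes(n, h):
    """Sizes in bits for CR-MSS; same signature size, tiny public key."""
    public_key = 2 * n                 # root and hash key only
    signature = (2 * n + h) * n
    return public_key, signature

print(spr_mss_sizes(128, 20))  # (7424, 35328), as in Table 2
print(cr_mss_sizes(160, 20))   # (320, 54400)
print(cr_mss_sizes(224, 20))   # (448, 104832)
```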
Table 2 shows that in addition to its superior security, the SPR-MSS also
provides smaller signatures than the CR-MSS, at the expense of larger public
keys. In fact, in many cases the signer’s public key, embedded in a certificate, is
part of the signature; for that reason the sum of the sizes of the public key and
the signature is often relevant. However, even in this case, the SPR-MSS is still
superior to the CR-MSS.

Table 2. Sizes of SPR-MSS and CR-MSS using the LD–OTS

                     Public key size   Signature size   Security level
SPR-MSS (n = 128)    7,424 bits        35,328 bits      2^98
CR-MSS (n = 160)     320 bits          54,400 bits      2^79
CR-MSS (n = 224)     448 bits          104,832 bits     2^111

5 Signing Arbitrarily Long Data


In Section 3 we restricted the length of the message to be signed to n bits. We now
give some suggestions for signing arbitrarily long messages. The most straightforward
way is to use a collision resistant hash function anyway. Although this
solution requires stronger security assumptions, the SPR-MSS would still provide
smaller signatures.
A better approach is to use target collision resistant (TCR) hash functions
[1,9]. Recall that in the TCR game, the adversary must first commit to a message
M, then receives the key K, and wins if he can output another message M' ≠ M
such that H_K(M) = H_K(M'). The security notion TCR is stronger than second-
preimage resistance, but weaker than collision resistance [10]. In the TCR-hash-
and-sign paradigm, the signature of a message M is the pair (σ(K||H_K(M)), K),
i.e. the key K must be signed as well. In [1], Bellare and Rogaway show how a
TCR hash function can be constructed from a TCR compression function using
the XOR tree we used in SPR-MSS. In this case, the length of the hash key
depends on the message length. If M has bit length n · b, the bit length of the
key K is 2n · log2 b. In [12] Shoup proposed a linear construction for TCR hash
functions that reduces the bit length of the key to n · log2 b. However, even with
Shoup’s hash function, the key size still depends (logarithmically) on the message
length, and can be relatively large. In order to solve the problem of long keys,
Bellare and Rogaway suggested iterating TCR hash functions [1]. For example,
the TCR hash function can be iterated with three different keys K1 , K2 and K3
as depicted in Figure 3; in this case, although the three keys must be transmitted
with the signature, only K3 must be signed. Since each round reduces the size
of the input to the next hash function, assuming a message with b blocks, after
three iterations the final key K3 will have about log2(log2(log2(b)))
blocks of n bits. With a 128-bit hash function, if K3 is allowed to have at most
3 blocks, then messages of up to 2^63 blocks (or 2^71 bits) can be signed.
Unfortunately, even when TCR hash functions are iterated, the signature size
is somewhat large: if K3 has three blocks, the input to the signature scheme has
4n bits, and the signature size is about 4n^2 bits. In [5] Halevi and Krawczyk
introduce yet another security notion, which they call enhanced target collision
resistance (eTCR). Unlike the TCR game, the adversary commits to a message
M, receives the key K, and wins if he finds another key and another message
(K', M') ≠ (K, M) such that H_K(M) = H_{K'}(M'). When using eTCR hash
functions, it is no longer necessary to sign the key. Furthermore, Halevi and
Krawczyk proved that an

[Figure: the message M is hashed through H1, H2, H3 with keys K1, K2, K3,
producing chaining values C1, C2, C3; the pair K3||C3 is signed.]
signature = (K1, K2, K3, σ) with σ = Sig(K3||C3)
Fig. 3. Iterating TCR hash functions
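The flow of Fig. 3 can be made concrete with a minimal sketch. The keyed hash H below is a toy stand-in built from SHA-256 (not the Bellare–Rogaway XOR-tree construction itself), and `sign_core`/`verify_core` are abstract placeholders for the underlying signature scheme:

```python
import hashlib
import os

def H(key: bytes, msg: bytes) -> bytes:
    """Toy keyed hash standing in for a TCR hash function (128-bit output)."""
    return hashlib.sha256(key + msg).digest()[:16]

def tcr_iterated_sign(msg: bytes, sign_core):
    """Three TCR rounds as in Fig. 3; only K3 enters the core signature."""
    k1, k2, k3 = (os.urandom(16) for _ in range(3))
    c1 = H(k1, msg)
    c2 = H(k2, c1)
    c3 = H(k3, c2)
    sigma = sign_core(k3 + c3)      # sign K3 || C3
    return (k1, k2, k3, sigma)      # all three keys travel with the signature

def tcr_iterated_verify(msg: bytes, sig, verify_core):
    k1, k2, k3, sigma = sig
    c3 = H(k3, H(k2, H(k1, msg)))
    return verify_core(k3 + c3, sigma)

# Placeholder "core signature" (a keyed hash), purely for illustration:
core_sign = lambda d: hashlib.sha256(b"sk" + d).digest()
core_verify = lambda d, s: s == hashlib.sha256(b"sk" + d).digest()
sig = tcr_iterated_sign(b"a long message", core_sign)
assert tcr_iterated_verify(b"a long message", sig, core_verify)
```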

eTCR hash function can be instantiated by a real-world hash function, where


the blocks of the input message are randomized with a single key, and the key
is appended to the message [5]. Their proof assumes that the underlying com-
pression function has second-preimage resistance-like properties, which they call
eSPR (evaluated second-preimage resistance). Using an eTCR hash function and
assuming eSPR for the underlying compression function, our scheme yields
signatures of only about n^2 bits.

6 Practical Considerations

Using a Real-World Hash Function. Most of our proofs are based on a


second-preimage resistant family of hash functions. Although there is no ex-
plicit family for SHA1 or MD5, one can regard the initial chaining value as a
key [10], or consider the hash functions themselves to be the key, through the
random choices made by their designers [1]. However, in that case, the key is
known by adversaries before starting the experiment, and not randomly chosen
in the experiment; the corresponding security notion is called always second-
preimage resistant by Rogaway and Shrimpton [10]. Our theorems apply to any
second-preimage-resistant hash function, including always second-preimage re-
sistant hash functions.

Using a Pseudo-Random Number Generator. In the description of our


signature scheme, we assumed that the 2^h one-time signature secret keys X_j are
completely stored by the signer. In practice, if the number of signatures 2^h is
large, this is of course completely out of the question. Instead of randomly generating
the OTS secret keys X_j, one can take them as the output of a pseudo-random number
generator with a unique seed, which totally eliminates the issue of storage. The
resulting scheme can be proven secure under the additional assumption that
the output of the PRNG is indistinguishable from a truly random number [3].
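A minimal sketch of this idea, using SHA-256 as a stand-in PRF (an assumption for illustration; the security argument in [3] requires a proper pseudo-random generator):

```python
import hashlib

def ots_secret_key(seed: bytes, j: int, n_bits: int = 128):
    """Derive the j-th LD-OTS secret key from a single seed (sketch).
    Returns n pairs (x_i[0], x_i[1]) of n-bit strings."""
    n = n_bits // 8
    key = []
    for i in range(n_bits):
        pair = []
        for b in (0, 1):
            # Domain-separate by leaf index j, bit position i, and bit value b.
            label = j.to_bytes(4, "big") + i.to_bytes(2, "big") + bytes([b])
            pair.append(hashlib.sha256(seed + label).digest()[:n])
        key.append(tuple(pair))
    return key

# Regeneration is deterministic, so no OTS secret key has to be stored:
assert ots_secret_key(b"seed", 5) == ots_secret_key(b"seed", 5)
assert ots_secret_key(b"seed", 5) != ots_secret_key(b"seed", 6)
```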

Shorter One-Time Signatures. The main drawback of Merkle signatures


is their long signature size. In fact, the one-time signature scheme is mostly
responsible for these lengthy signatures, because one-time signatures typically
have a number of blocks proportional to the number of bits to sign. Improving on
the original idea of Lamport, Merkle proposed using one block for each message
bit, where a checksum is appended to the message [7]. In addition, Winternitz

suggested processing message bits in blocks of w consecutive bits, at the expense
of some more hash computations [7]. Combining these ideas, the number of bit
strings for one instance of the OTS is l = ⌈n/w⌉ + ⌈log2(n/w)/w⌉, the size
of one-time signatures is l · n, and the number of hash evaluations for signing (and
verifying) is on average 2^{w−1} · l [2]. Although there are other techniques for
constructing one-time signature schemes out of hash functions, and especially
using graphs instead of trees, practical implementations of one-time signatures
using the improvements from Merkle and Winternitz often outperform graph-based
one-time signatures [2].
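The trade-off can be explored numerically (a sketch; the block count follows the expression above, and exact checksum accounting varies slightly between descriptions of the Winternitz scheme, so treat these as estimates):

```python
import math

def winternitz_params(n: int, w: int):
    """Block count, signature size (bits), and average signing cost
    (hash evaluations) for Winternitz parameter w."""
    l = math.ceil(n / w) + math.ceil(math.log2(n / w) / w)
    sig_bits = l * n              # one n-bit string per block
    avg_hashes = 2 ** (w - 1) * l # average cost of signing/verifying
    return l, sig_bits, avg_hashes

# Larger w shrinks signatures but raises the hashing cost:
for w in (1, 2, 4, 8):
    print(w, winternitz_params(128, w))
```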

7 Conclusion
We proposed SPR-MSS, a variant of the Merkle signature scheme with much
weaker security assumptions than the original construction. More precisely, our
scheme is existentially unforgeable under adaptive chosen message attacks, as-
suming second-preimage and preimage resistant hash functions. Compared to
the original Merkle signature which relies on a collision-resistant hash function,
SPR-MSS provides a higher security level even when the underlying hash function
has a smaller output size. For instance, when using a 128-bit hash function
such as MD5, which is still secure in view of second-preimage resistance,
SPR-MSS offers a security level better than 2^80 for trees of height up to 38.

References
1. Bellare, M., Rogaway, P.: Collision-resistant hashing: Towards making UOWHFs
practical. In: Kaliski Jr., B.S. (ed.) CRYPTO 1997. LNCS, vol. 1294, pp. 470–484.
Springer, Heidelberg (1997)
2. Dods, C., Smart, N., Stam, M.: Hash based digital signature schemes. In: Smart,
N. (ed.) Cryptography and Coding 2005. LNCS, vol. 3796, pp. 96–115. Springer,
Heidelberg (2005)
3. García, L.C.C.: On the security and the efficiency of the Merkle signature scheme.
Cryptology ePrint Archive, Report 2005/192 (2005), [Link]
4. Goldwasser, S., Micali, S., Rivest, R.L.: A digital signature scheme secure against
adaptive chosen-message attacks. SIAM Journal on Computing 17(2), 281–308
(1988)
5. Halevi, S., Krawczyk, H.: Strengthening digital signatures via randomized hash-
ing. In: Dwork, C. (ed.) CRYPTO 2006. LNCS, vol. 4117, pp. 41–59. Springer,
Heidelberg (2006)
6. Lamport, L.: Constructing digital signatures from a one way function. Technical
Report SRI-CSL-98, SRI International Computer Science Laboratory (1979)
7. Merkle, R.C.: A certified digital signature. In: Brassard, G. (ed.) CRYPTO 1989.
LNCS, vol. 435, pp. 218–238. Springer, Heidelberg (1990)
8. Naor, D., Shenhav, A., Wool, A.: One-time signatures revisited: Have they become
practical? Cryptology ePrint Archive, Report 2005/442 (2005),
[Link]

9. Naor, M., Yung, M.: Universal one-way hash functions and their cryptographic
applications. In: 21st Annual ACM Symposium on Theory of Computing - STOC
1989, pp. 33–43. ACM Press, New York (1989)
10. Rogaway, P., Shrimpton, T.: Cryptographic hash-function basics: Definitions, im-
plications, and separations for preimage resistance, second-preimage resistance, and
collision resistance. In: Roy, B., Meier, W. (eds.) FSE 2004. LNCS, vol. 3017, pp.
371–388. Springer, Heidelberg (2004)
11. Rohatgi, P.: A compact and fast hybrid signature scheme for multicast packet
authentication. In: ACM Conference on Computer and Communications Security
- CCS 1999, pp. 93–100. ACM Press, New York (1999)
12. Shoup, V.: A composition theorem for universal one-way hash functions. In: Pre-
neel, B. (ed.) EUROCRYPT 2000. LNCS, vol. 1807, pp. 445–452. Springer, Hei-
delberg (2000)
13. Szydlo, M.: Merkle tree traversal in log space and time. In: Cachin, C., Camenisch,
J.L. (eds.) EUROCRYPT 2004. LNCS, vol. 3027, pp. 541–554. Springer, Heidelberg
(2004)
14. Wang, X., Yin, Y.L., Yu, H.: Finding collisions in the full SHA-1. In: Shoup, V.
(ed.) CRYPTO 2005. LNCS, vol. 3621, pp. 17–36. Springer, Heidelberg (2005)
15. Wang, X., Yu, H.: How to break MD5 and other hash functions. In: Cramer, R.
(ed.) EUROCRYPT 2005. LNCS, vol. 3494, pp. 19–35. Springer, Heidelberg (2005)

A Security of the Lamport–Diffie One-Time Signature Scheme
This section describes the Lamport–Diffie one-time signature scheme (LD–OTS)
[6] and states a security reduction to the used one-way function. Let F_K =
{F_k : {0,1}^n → {0,1}^n}_{k∈K} be a family of one-way functions. The one-time
signature key of the LD–OTS consists of the 2n n-bit strings x_i[0], x_i[1] ∈_R
{0,1}^n, i = 0, . . . , n−1 and a key for the one-way function k ∈_R K. The verifi-
cation key Y consists of the 2n n-bit strings (y_i[0], y_i[1]) = (F_k(x_i[0]), F_k(x_i[1]))
for i = 0, . . . , n−1 and the key k. The signature of an n-bit message M =
(m_0, . . . , m_{n−1})_2 is given as σ_i = x_i[m_i], i = 0, . . . , n−1, i.e. the bit strings from
the signature key are chosen according to the bits of the message: x_i[0] if m_i = 0
and x_i[1] if m_i = 1. To verify a signature one has to check if F_k(σ_i) = y_i[m_i]
holds for all i = 0, . . . , n−1. The time required by the LD–OTS for key genera-
tion, signing and verifying, in terms of evaluations of F_k, is t_Gen = 2n, t_Sig = n
and t_Ver = n, respectively. We disregard the time required to randomly choose
the signature key and assume the signer does not store the verification key.
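The scheme just described can be sketched in a few lines of Python; SHA-256 keyed by prefixing k stands in for the one-way function family F_k (an assumption for illustration, with n = 256):

```python
import hashlib
import os

N = 32  # byte length of F_k output, i.e. n = 256 bits

def F(k: bytes, x: bytes) -> bytes:
    """One-way function family member F_k (SHA-256 stand-in)."""
    return hashlib.sha256(k + x).digest()

def keygen(k: bytes):
    """2n random n-bit strings; verification key is their images under F_k."""
    sk = [(os.urandom(N), os.urandom(N)) for _ in range(8 * N)]
    pk = [(F(k, x0), F(k, x1)) for x0, x1 in sk]
    return sk, pk

def bits(msg: bytes):
    return [(msg[i // 8] >> (7 - i % 8)) & 1 for i in range(8 * len(msg))]

def sign(sk, msg: bytes):
    """Reveal x_i[m_i] for each message bit m_i."""
    return [sk[i][b] for i, b in enumerate(bits(msg))]

def verify(k, pk, msg, sig):
    return all(F(k, s) == pk[i][b] for (i, b), s in zip(enumerate(bits(msg)), sig))

k = os.urandom(16)
sk, pk = keygen(k)
msg = hashlib.sha256(b"hello").digest()  # an n-bit message
assert verify(k, pk, msg, sign(sk, msg))
```

Each key pair must be used for a single message only; revealing two signatures leaks secret-key strings for both bit values.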
Algorithm 2 shows how a forger ForSig_X(·)(Y) for the LD–OTS can be used to
construct an inverter for a random element of F_K. In other words, the security
of the LD–OTS is reduced to the preimage resistance of F_K.
The adversary Adv_Pre is successful in finding a preimage of y if and only if
ForSig_X(·)(Y) queries an M with m_a = (1 − b) (Line 5a) and returns a valid
signature for M' with m'_a = b (Line 6a). The probability for m_a = (1 − b) is
1/2. Since M' must be different from the queried message M, there exists at

Algorithm 2. Adv_Pre

Input: k ∈_R K and y ∈ F_k({0,1}^n)
Output: x such that y = F_k(x), or failure

1. Generate LD–OTS key pair (X, Y).
2. Choose a ∈_R {1, . . . , n} and b ∈_R {0, 1}.
3. Replace y_a[b] with y in the LD–OTS verification key Y.
4. Run ForSig_X(·)(Y).
5. When ForSig_X(·)(Y) asks its only oracle query with M = (m_0, . . . , m_{n−1})_2:
   (a) if m_a = (1 − b) then sign M and respond to ForSig_X(·).
   (b) else return failure.
6. When ForSig_X(·) outputs a signature for message M' = (m'_0, . . . , m'_{n−1})_2:
   (a) if m'_a = b then return σ'_a as preimage of y.
   (b) else return failure.

least one index c such that m'_c = 1 − m_c. Adv_Pre is successful if a = c, which
happens with probability at least 1/2n. This result is summarized in Theorem
2 in Section 4.

B Security of the Original Merkle Signature Scheme


This section states a security reduction for the original Merkle signature scheme
to the collision resistance of the used compression function and the CMA-security
of the underlying OTS. The reduction is similar to what was shown in Section 3.1,
the main difference being that we are satisfied if we find a collision anywhere
in the tree. Let G_K = {G_k : {0,1}^{2n} → {0,1}^n}_{k∈K} be a family of collision
resistant hash functions. Algorithm 3 shows how a forger ForSig_sk(·)(pk) for the
MSS can be used to construct a collision finder for a random element of G_K.
To compute the success probability of Adv_CR,OTS we have to distinguish two
cases.
To compute the success probability of AdvCR,OTS we have to distinguish two
cases.
Case 1: (Y_s, A_s) ≠ (Y'_s, A'_s). The fact that the verification key Y'_s can be au-
   thenticated against the root y_0[0] implies a collision of G_k, see Appendix
   C. The success probability of finding a collision is at least ε, the success
   probability of the forger.
Case 2: (Y_s, A_s) = (Y'_s, A'_s). In this case (σ_OTS(M'), M') ≠ (σ_OTS(M_s), M_s)
   holds, which implies that ForSig_sk(·)(pk) generated an existential forgery for
   one instance of the underlying OTS. The probability that ForSig_sk(·)(pk)
   breaks CMA-security of the supplied instance (s = c) is at least 1/2^h. In
   total, the success probability of Adv_CR,OTS is at least ε/2^h, where ε is the
   success probability of the forger.
Note that since both cases are complementary, one occurs with probability at
least 1/2. This result is summarized in Theorem 3 in Section 4.

Algorithm 3. Adv_CR,OTS

Input: Key for the hash function k ∈_R K, height of the tree h ≥ 1, an instance of the
underlying OTS consisting of a verification key Y and the corresponding signing oracle
Sig_X(·)
Output: Collision of G_k, existential forgery for the supplied instance of the OTS, or
failure

1. Choose c ∈_R {0, . . . , 2^h − 1} uniformly at random.
2. Generate OTS key pairs (X_j, Y_j), j = 0, . . . , 2^h − 1, j ≠ c and set Y_c ← Y.
3. Complete the key pair generation.
4. Run ForSig_sk(·)(pk).
5. When ForSig_sk(·)(pk) asks its qth oracle query with message M_q:
   (a) if q = c then obtain the one-time signature of M_q using the signing oracle
       Sig_X(·) provided as input: σ_OTS(M_q) ← Sig_X(M_q).
   (b) else compute σ_OTS(M_q) using the qth OTS signature key X_q.
   (c) Generate the MSS signature σ_q(M_q) = (q, σ_OTS(M_q), Y_q, A_q) and respond to
       the forger.
6. When ForSig_sk(·)(pk) outputs signature σ_s(M') = (s, σ_OTS(M'), Y'_s, A'_s) for M':
   (a) verify the signature σ_s(M').
   (b) if (Y_s, A_s) ≠ (Y'_s, A'_s) then return a collision of G_k.
   (c) else (if (Y_s, A_s) = (Y'_s, A'_s)):
      i. if s = c then return (σ_OTS(M'), M') as forgery for the supplied instance
         of the OTS.
      ii. else return failure.

C (Y_s, A_s) ≠ (Y'_s, A'_s) Implies a Collision

Case 1: A_s ≠ A'_s. Let h ≥ δ > 0 be the index where the authentication paths
   are different, i.e. a_δ ≠ a'_δ. Further let (p_h, . . . , p_0), (p'_h, . . . , p'_0) be the paths
   from node y_h[s] to the root y_0[0] constructed using the authentication paths
   A_s, A'_s, respectively. We certainly know that p_0 = p'_0 holds. If p_{δ−1} = p'_{δ−1},
   then (a_δ ∥ p_δ), (a'_δ ∥ p'_δ) is a collision for H_k. Otherwise, there exists an index
   δ > γ > 0 such that p_γ ≠ p'_γ and p_{γ−1} = p'_{γ−1}. Then (a_γ ∥ p_γ), (a'_γ ∥ p'_γ) is
   a collision for H_k. Note that the order in which a_i and p_i are concatenated
   depends on the index s in the signature.
Case 2: Y_s ≠ Y'_s. Let s · 2^l ≤ δ < (s + 1) · 2^l be the index where the bit strings in
   the verification keys are different, i.e. y_δ ≠ y'_δ. If y_h[s] = y'_h[s], there exists an
   index h + l ≥ γ > h such that y_γ[β] ≠ y'_γ[β] and y_{γ−1}[⌊β/2⌋] = y'_{γ−1}[⌊β/2⌋],
   with β = ⌊δ/2^{h+l−γ}⌋. Then a collision for H_k is given as

   (y_γ[β] ∥ y_γ[β+1]), (y'_γ[β] ∥ y'_γ[β+1]),   if β ≡ 0 mod 2
   (y_γ[β−1] ∥ y_γ[β]), (y'_γ[β−1] ∥ y'_γ[β]),   if β ≡ 1 mod 2.

   Otherwise, that is if y_h[s] ≠ y'_h[s], similar arguments as in Case 1 can be
   used to find a collision.
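The argument of Case 1 is constructive: scanning the two paths from the root downwards, the first level at which they disagree is exactly where the colliding hash inputs sit, since both paths share the node one level above. A minimal sketch of that scan (the path entries here are opaque labels, an illustration only):

```python
def collision_index(p, p_prime):
    """Given two node paths indexed so that p[0] is the root, which agree
    at the root but differ somewhere below, return the smallest depth gamma
    with p[gamma] != p_prime[gamma]. Since all shallower nodes agree, the
    hash inputs at depth gamma collide at depth gamma - 1."""
    assert p[0] == p_prime[0] and p != p_prime
    for gamma in range(1, len(p)):
        if p[gamma] != p_prime[gamma]:
            # p[gamma-1] == p_prime[gamma-1] because gamma is the first mismatch
            return gamma
    raise AssertionError("paths are identical")

# Toy example: paths agreeing on root "r" and node "a", then diverging.
assert collision_index(["r", "a", "b", "c"], ["r", "a", "x", "y"]) == 2
```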
Cryptanalysis of Rational Multivariate Public
Key Cryptosystems

Jintai Ding and John Wagner

Department of Mathematical Sciences


University of Cincinnati,
Cincinnati, OH, 45220, USA
ding@[Link], wagnerjh@[Link]

Abstract. In 1989, Tsujii, Fujioka, and Hirayama proposed a family of


multivariate public key cryptosystems, where the public key is given as
a set of multivariate rational functions of degree 4. These cryptosystems
are constructed via composition of two quadratic rational maps. In this
paper, we present the cryptanalysis of this family of cryptosystems. The
key point of our attack is to transform a problem of decomposition of two
rational maps into a problem of decomposition of two polynomial maps.
We develop a new improved 2R decomposition method and other new
techniques, which allows us to find an equivalent decomposition of the
rational maps to break the system completely. For the example suggested
for practical applications, it is very fast to derive an equivalent private
key, and it requires only a few seconds on a standard PC.

1 Introduction

Multivariate public key cryptosystems have undergone very fast development in
the last 20 years. They are considered one of the promising families of alternatives
for post-quantum cryptography, that is, cryptosystems that could resist attacks
by the quantum computers of the future [1]. Though most people think that Diffie
and Fell wrote the first paper on multivariate public key cryptosystems [3],
Tsujii, Kurosawa and others actually did similar work at the same time [7]. Though
this family of cryptosystems is almost 20 years old, it is not so well known. It
actually included several methods rediscovered later, which is partially due to
the fact that the papers were written in Japanese and were published inside Japan.
Recently it was pointed out by Tsujii [6] that there is not yet any successful attack
on the degree 4 rational multivariate public key cryptosystem designed at that
time (1989) [5].
This family of multivariate public key cryptosystems is very different from
most of the known cryptosystems: the public key functions are rational
functions instead of polynomial functions, and the polynomial components are
of total degree 4 instead of degree 2. The public key is presented as
P(x_1, .., x_n) = (P_1(x_1, .., x_n)/P_{n+1}(x_1, .., x_n), · · · , P_n(x_1, .., x_n)/P_{n+1}(x_1, .., x_n)),
where the P_i(x_1, .., x_n) are degree 4 polynomials over a finite field k. We call this

J. Buchmann and J. Ding (Eds.): PQCrypto 2008, LNCS 5299, pp. 124–136, 2008.
c Springer-Verlag Berlin Heidelberg 2008

family of cryptosystems rational multivariate public key cryptosystems (RMP-


KCs).
The construction of this family of cryptosystems relies on three basic methods.
The first one is called the core transformation, which is essentially an invertible
rational map in two variables. The second one is called the sequential solution
method, which essentially consists of invertible rational triangular maps. This idea
was later used under the name of tractable rational maps in [8], but the authors
of [8] were not aware of the work of Tsujii's group. The last one is the method of
composition of nonlinear maps, which was also used later by Goubin and Patarin [4],
again without knowing the works of Tsujii's group. The public key therefore has
the following expression: P = L3 ◦ G ◦ L2 ◦ F ◦ L1, where ◦ stands for map compo-
sition and the Li are invertible affine maps. G and F are degree two rational maps:
F = (F_1/F_{n+1}, · · · , F_n/F_{n+1}), G = (G_1/G_{n+1}, · · · , G_n/G_{n+1}), where the F_i and
G_i are quadratic polynomials, and F and G utilize both the core transformation
and the triangular method.
The designers of this family of cryptosystems also employed two very inter-
esting ideas to reduce the public key size, which is a key constraint with the
potential to render a multivariate public key cryptosystem application less ef-
ficient. The first idea is to use functions of a small number of variables over a
relatively large field. Since the public key size is O(n^4), using fewer variables
greatly reduces the public key size.
The second idea is to build a public key using a field k, then use an extension
field of k, say K, as the field from which the plaintext is drawn. If |k|^e = |K|,
then the public key size required is only 1/e as large as if K were used to define
the public key. Mathematically, the public key lies in the function ring over k^n,
a subring of the function ring over K^n. Encryption and decryption occur using
the larger function ring. This idea was used later in Sflash Version-1 [10].
In 1989, the designers proposed a practical application using k of size 2^8, K
of size 2^32 and n = 5. This application encrypts blocks of 20 bytes using a 756
byte public key. This family of cryptosystems seems to be very interesting and
worthy of further exploration.
As we mentioned before, there is a related cryptosystem called 2R by Patarin,
which is very similar except that F and G are replaced by two quadratic polynomial
maps; this cryptosystem was broken by a decomposition method using partial
derivatives [9]. It is clear this method cannot be directly used on RMPKCs
because of the more complicated expressions for derivatives of rational functions.
Our new method begins by viewing separately the denominator and the nu-
merators of the public key as polynomial functions. We would like to decompose
these quartic polynomials into quadratic components. We will use these quadrat-
ics to reconstruct the given public key polynomials, but we first have to transform
them so that the reconstruction is done in a way that gives us a complete alter-
nate private key for the cryptosystem. This alternate private key gives us the
ability to invert ciphertext just as easily as the owner of the original private key.
To see how we accomplish this, let us refer to the polynomial expressions in the
denominator and the numerators of the public key as p_i = g_i ◦ (f_1, . . . , f_{n+1}). We

first find S = Span{f_j : 1 ≤ j ≤ n + 1}. From S, we carefully choose a basis
that will enable us to invert the resulting rational maps when we reconstruct the
public key. After choosing this basis, it is easy to find each g_i. We will have to
transform in a similar way the components of Span{g_j : 1 ≤ j ≤ n + 1}.
We would like to emphasize that our attack is not just an application of known
methods. In particular, the design of these RMPKCs creates two especially inter-
esting challenges for us. The first challenge is to find Span{f_j : 1 ≤ j ≤ n + 1},
and it turns out that the 2R decomposition method alone cannot find this
space by just applying the partial derivative attack directly to the quartic poly-
nomials p_i. Mathematically, our new idea is to use subplanes of our function
space, and the computational means to do this is very simple: we merely
set some of the variables equal to zero. By combining results from three or more
such subplanes, we successfully identify Span{f_j : 1 ≤ j ≤ n + 1}. This new
extension of 2R decomposition is very different from that in [2].
The second challenge comes from the use of a common denominator in both
F and G. We must identify each of these two denominators exactly (up to a
scaling factor). This step is necessary to complete the reconstruction of the
public key. To find the exact denominator of F, we capitalize on a weakness in
the design of the core transformation of G. This weakness results in a portion
(subspace) of Span{p_j : 1 ≤ j ≤ n + 1} in which the polynomial elements have
the denominator of F as a factor. We find it using linear algebra techniques.
Finding the exact denominator of G comes to us automatically as we solve for
the g_i's in the equations p_i = g_i ◦ (f_1, . . . , f_{n+1}).
The paper is arranged as follows. In Section 2, we present the specifics of
the cryptosystems we attack. In Section 3, we present the details of the
cryptanalysis of this family of cryptosystems, including our experimental
results and relevant information on computational complexity. In the last section,
we summarize our findings.

2 The RMPKC Cryptosystem


In this section, we will present the design of the rational multivariate public key
cryptosystem [5]. Let k be a finite field and k n the n-dimensional vector space
over k.
1. The public key. The public key is given as a set of rational degree 4
   functions:
   P(x_1, ..., x_n) = (P_1(x_1, ..., x_n)/P_{n+1}(x_1, ..., x_n), · · · , P_n(x_1, ..., x_n)/P_{n+1}(x_1, ..., x_n)),
   where each P_i is a degree 4 polynomial over k. P is constructed as the composition of five
   maps: P = L3 ◦ G ◦ L2 ◦ F ◦ L1 = (P_1/P_{n+1}, · · · , P_n/P_{n+1}). Here L1, L2, L3
   are invertible, linear transformations over k^n. Both F and G are quadratic
   rational maps, i.e. each consists of n quadratic rational functions k^n → k:
   F = (F_1/F_{n+1}, · · · , F_n/F_{n+1}) and G = (G_1/G_{n+1}, · · · , G_n/G_{n+1}), where for
   1 ≤ i ≤ n + 1, F_i and G_i are quadratic polynomials in (x_1, . . . , x_n). The
   details of the construction of F and G are provided below in the part
   explaining the private key. F and G are constructed identically, with different
   choices of random parameters.
Cryptanalysis of Rational Multivariate Public Key Cryptosystems 127

   Note that within each of the two nonlinear maps the same denominator is used for
   every component. Gn+1 is the common denominator for G; it
   enables the public key to consist of exactly n + 1 polynomials. Fn+1 is the
   common denominator for F; it enables the composition of degree 2 rational
   functions to result in a degree 4 rational function, rather than one of higher degree.
   To see how this works, we introduce a division function φ : k^{n+1} −→ k^n
   with φ(x1, . . . , xn+1) = ( x1/xn+1, · · · , xn/xn+1 ). Also let F̄, Ḡ : k^n −→ k^{n+1} each
   be quadratic polynomial maps that satisfy
   φ ◦ Ḡ = L3 ◦ G and φ ◦ F̄ = L2 ◦ F ◦ L1,
   resulting in P = φ ◦ Ḡ ◦ φ ◦ F̄ = φ ◦ (Ḡ ◦ φ) ◦ F̄.
   Now let G̃ be the homogenization of Ḡ, i.e. G̃ : k^{n+1} → k^{n+1} where
   ∀ 1 ≤ i ≤ n + 1, G̃i(v1, . . . , vn+1) = vn+1^2 Ḡi( v1/vn+1, · · · , vn/vn+1 ) = vn+1^2 Ḡi ◦ φ(v1, . . . , vn+1).
   Note that G̃ ≠ Ḡ ◦ φ, but φ ◦ G̃ = φ ◦ Ḡ ◦ φ. So P = φ ◦ G̃ ◦ F̄ where G̃ and F̄ are
   quadratic polynomial maps. The public key, then, contains the ordered list of n + 1
   quartic polynomials (P1, . . . , Pn+1) where ∀ 1 ≤ i ≤ n + 1, Pi(x1, . . . , xn) =
   G̃i ◦ F̄(x1, . . . , xn).
2. Encryption. Given a plaintext X′ = (X′1, · · · , X′n) ∈ k^n, one computes the
   ciphertext Y′ = (Y′1, · · · , Y′n) ∈ k^n as
   (Y′1, · · · , Y′n) = ( P1(X′1, . . . , X′n)/Pn+1(X′1, . . . , X′n), · · · , Pn(X′1, . . . , X′n)/Pn+1(X′1, . . . , X′n) ).

3. The private key. The private key is the set of the five maps F, G, L1, L2, L3
   and the key to invert the non-linear maps F and G. The map P can be illustrated as:
   k^n −L1→ k^n −F→ k^n −L2→ k^n −G→ k^n −L3→ k^n.
   The design principles of the quadratic rational components F and G are
   identical, except that they use different choices for the random parameters
   involved. A two-part construction is used. The first part is what the designers
   call a core transformation. The second part is called the sequential part, since
   inversion is accomplished sequentially; its structure can be seen as triangular.
   The core transformation is applied only to the last two components, namely
   C = ( Fn−1/Fn+1, Fn/Fn+1 ),
   which can be viewed as a map k^2 −→ k^2. To construct Fn−1, Fn, Fn+1,
   we first randomly choose 12 elements in k: α1, . . . , α6 and
   β1, . . . , β6. C has an inverse which is given by:
   C^−1(yn−1, yn) = ( (α1 yn−1 + α2 yn + α3)/(α4 yn−1 + α5 yn + α6), (β1 yn−1 + β2 yn + β3)/(β4 yn−1 + β5 yn + β6) ).

Then Fn−1 , Fn and Fn+1 are defined as follows:


∀ n − 1 ≤ i ≤ n + 1, Fi (xn−1 , xn ) = τi,1 xn−1 xn + τi,2 xn−1 + τi,3 xn + τi,4
where the τi,j are defined as follows:
τn−1,1 = α6 β5 − α5 β6 τn,1 = α6 β4 − α4 β6 τn+1,1 = α5 β4 − α4 β5
τn−1,2 = α3 β5 − α5 β3 τn,2 = α3 β4 − α4 β3 τn+1,2 = α1 β4 − α4 β1
τn−1,3 = α6 β2 − α2 β6 τn,3 = α6 β1 − α1 β6 τn+1,3 = α5 β2 − α2 β5
τn−1,4 = α3 β2 − α2 β3 τn,4 = α3 β1 − α1 β3 τn+1,4 = α1 β2 − α2 β1
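To make the role of C concrete, the following toy sketch (ours, not the designers') checks numerically that the two fractional linear maps defining C^−1 give an invertible transformation. As an assumption for readability it uses a prime field GF(p) rather than the characteristic-2 field k, and it recovers (yn−1, yn) by clearing denominators in C^−1 and solving the resulting 2 × 2 linear system directly, instead of going through the τ table.

```python
import random

p = 10007  # a prime standing in for the field k (the real scheme uses
           # characteristic 2; the linear algebra below is field-independent)

def finv(a):
    return pow(a, p - 2, p)  # field inverse via Fermat's little theorem

def c_inverse(al, be, y1, y2):
    # C^-1(y_{n-1}, y_n): the two fractional linear maps from the text
    x1 = (al[0]*y1 + al[1]*y2 + al[2]) * finv((al[3]*y1 + al[4]*y2 + al[5]) % p) % p
    x2 = (be[0]*y1 + be[1]*y2 + be[2]) * finv((be[3]*y1 + be[4]*y2 + be[5]) % p) % p
    return x1, x2

def c_forward(al, be, x1, x2):
    # Clearing denominators in C^-1 gives a 2x2 linear system in (y1, y2);
    # solving it (here by Cramer's rule) is what the map C computes.
    a11, a12, b1 = (al[3]*x1 - al[0]) % p, (al[4]*x1 - al[1]) % p, (al[2] - al[5]*x1) % p
    a21, a22, b2 = (be[3]*x2 - be[0]) % p, (be[4]*x2 - be[1]) % p, (be[2] - be[5]*x2) % p
    d = (a11*a22 - a12*a21) % p
    return (b1*a22 - a12*b2) * finv(d) % p, (a11*b2 - b1*a21) * finv(d) % p

random.seed(1)
alphas = [random.randrange(1, p) for _ in range(6)]
betas = [random.randrange(1, p) for _ in range(6)]
ok = 0
for _ in range(10):
    y = (random.randrange(p), random.randrange(p))
    # skip the (rare) inputs where a denominator of C^-1 vanishes
    if (alphas[3]*y[0] + alphas[4]*y[1] + alphas[5]) % p == 0:
        continue
    if (betas[3]*y[0] + betas[4]*y[1] + betas[5]) % p == 0:
        continue
    x = c_inverse(alphas, betas, *y)
    ok += (c_forward(alphas, betas, *x) == y)
assert ok >= 8  # the round trip C(C^-1(y)) = y succeeds whenever divisions are valid
```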

The rest of the components are given in a triangular form:


   ∀ 1 ≤ i ≤ n − 2, Fi(x1, . . . , xn) = ai(xi+1, . . . , xn) xi + bi(xi+1, . . . , xn),
where the ai ’s are randomly chosen linear polynomials and the bi ’s are ran-
domly chosen quadratic polynomials.
4. Decryption. To decrypt, we need to invert the map P, which is done as follows:
   P^−1(Y′1, . . . , Y′n) = L1^−1 ◦ F^−1 ◦ L2^−1 ◦ G^−1 ◦ L3^−1 (Y′1, . . . , Y′n) = (X′1, . . . , X′n).
   The holder of the private key has the means to find the inverse of each of
   L3, G, L2, F, L1. Performing the calculations in order yields
   (X′1, . . . , X′n). Inversion of the linear transformations is obvious.
   To invert the map F is to find the solution of the equation F(x1, . . . , xn) =
   (y′1, . . . , y′n) for a given vector (y′1, . . . , y′n). We first use the inverse of C to
   calculate (xn−1, xn) = C^−1(y′n−1, y′n). Then we plug the resulting values
   into the third-to-last component function of F. This gives us the following
   linear equation in xn−2:
   y′n−2 = Fn−2(xn−2, xn−1, xn) / Fn+1(xn−1, xn) = ( an−2(xn−1, xn) · xn−2 + bn−2(xn−1, xn) ) / ( τn−2,1 xn−1 xn + τn−2,2 xn−1 + τn−2,3 xn + τn−2,4 ),
   yielding
   xn−2 = ( y′n−2 · (τn−2,1 xn−1 xn + τn−2,2 xn−1 + τn−2,3 xn + τn−2,4) − bn−2(xn−1, xn) ) / an−2(xn−1, xn).
   After obtaining xn−2, we can plug the known values into the fourth-to-last
   component function of F and derive xn−3. This sequential solution method
   is continued to find the rest of (x1, . . . , xn), which gives us a solution for
   F(x1, . . . , xn) = (y′1, . . . , y′n). Inversion of G is performed in the exact same
   manner as for F.
   Note that in the inversion process, a division is required in the calculation
   of each of the components of (x1, . . . , xn). In each case, the expression for the
   divisor is linear in terms of the known values of the input variables (xi+1, . . . , xn)
   and the given values of the output variables (y′i, . . . , y′n). In both cases, the
   probability of a valid division is approximately (q − 1)/q. The probability of
   successfully inverting both F and G, and thus P, is therefore approximately
   ((q − 1)/q)^{2n}.
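The sequential inversion can be sketched as follows. This is a simplified illustration, not the designers' code: as an assumption it uses a prime field GF(p) and plain polynomial components Fi = ai(xi+1, . . . , xn) xi + bi(xi+1, . . . , xn), omitting the common denominator; the back-substitution logic is the same.

```python
import random

p, n = 10007, 6
random.seed(2)

# a_i: a constant plus a random linear form in x_{i+1..n};
# b_i: a toy quadratic (a random affine form in x_{i+1..n}, squared)
A = [[random.randrange(p) for _ in range(n - i)] for i in range(n)]
B = [[random.randrange(p) for _ in range(n - i)] for i in range(n)]

def a_val(i, xs):
    return (A[i][0] + sum(c * v for c, v in zip(A[i][1:], xs[i + 1:]))) % p

def b_val(i, xs):
    s = (B[i][0] + sum(c * v for c, v in zip(B[i][1:], xs[i + 1:]))) % p
    return s * s % p

def forward(xs):
    return [(a_val(i, xs) * xs[i] + b_val(i, xs)) % p for i in range(n)]

def invert(ys):
    # Back-substitution: solve from the last component down. Each step is a
    # single division by a_i, which depends only on already-recovered values.
    xs = [0] * n
    for i in range(n - 1, -1, -1):
        ai = a_val(i, xs)
        if ai == 0:            # an invalid division, as discussed in the text
            return None
        xs[i] = (ys[i] - b_val(i, xs)) * pow(ai, p - 2, p) % p
    return xs

ok = 0
for _ in range(10):
    x = [random.randrange(p) for _ in range(n)]
    ok += (invert(forward(x)) == x)
assert ok >= 8  # almost every trial yields only valid divisions over a large field
```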

3 Cryptanalysis of RMPKC
Our attack can be viewed as the decomposition of maps. The cryptanalysis of
RMPKC is performed as follows: given P, the composition L3 ◦ G ◦ L2 ◦ F ◦ L1,
generate a new set of maps L′3, G′, L′2, F′, and L′1 such that
L3 ◦ G ◦ L2 ◦ F ◦ L1 = L′3 ◦ G′ ◦ L′2 ◦ F′ ◦ L′1,
and G′ and F′ can be inverted in the same way as G and F, with the keys
to inversion obtained during the process. This new set of maps can be viewed
as a private key equivalent to the original one, and can thus be used to defeat the
RMPKC cryptosystem.
To decompose RMPKC, we will use the partial derivative method, which
takes the composition of two homogeneous quadratic polynomial maps forming
a homogeneous quartic map, and decomposes it into quadratic maps which, when
composed together, form the original quartic map [9]. Consider g ◦ f where
g = (g1(x1, . . . , xm), . . . , gm(x1, . . . , xm)), f = (f1(x1, . . . , xm), . . . , fm(x1, . . . , xm))
and each of the gi's and the fi's are homogeneous quadratic polynomials. The
first step is to find F = Span { fi : 1 ≤ i ≤ m }, a vector space over k.
Once found, one can select linearly independent quadratics from it, say
(f′1, . . . , f′m). Then by solving a set of linear equations, one can find (g′1, . . . , g′m)
such that ∀ 1 ≤ i ≤ m, gi ◦ f = g′i ◦ f′ where f′ = (f′1, . . . , f′m).
The critical step of this process is finding F. The following definitions are
needed:
D = Span { ∂/∂xj (gi ◦ f)(x1, . . . , xm) : 1 ≤ i, j ≤ m };
Λ = Span { xj f : 1 ≤ j ≤ m, f ∈ F };
R = { θ : ∀ 1 ≤ i ≤ m, xi θ ∈ D }.
When each of the fi's and gi's are homogeneous quadratic polynomials, D ⊆ Λ. This
is true basically because
∂/∂xj (gi ◦ f) = Σ_{r=1}^{m} ∂gi/∂wr (f) × ∂fr/∂xj (x1, . . . , xm),
where ∂gi/∂wr (f) is linear in the f's and ∂fr/∂xj (x1, . . . , xm) is linear in (x1, . . . , xm).
We calculate D and R from g ◦ f. If D = Λ, then R = F and this step is
complete. When D ⊂ Λ, R ⊂ F. That R ⊆ F, and that D = Λ ⇐⇒ R = F, should be
fairly easy to see.
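The containment D ⊆ Λ can be checked mechanically on a toy example. The sketch below is our own illustration, not code from the paper: it works over GF(2) in the plain polynomial ring (no field equations), represents a polynomial as a set of monomials (addition is symmetric difference), and tests span membership by Gaussian elimination.

```python
from itertools import product
import random

m = 2  # two variables keep the toy example small

def pmul(a, b):
    # product over GF(2): XOR-accumulate monomials (a coefficient of 2 vanishes)
    out = set()
    for ea, eb in product(a, b):
        out ^= {tuple(x + y for x, y in zip(ea, eb))}
    return out

def partial(a, j):
    # d/dx_j over GF(2): a monomial survives iff its exponent in x_j is odd
    return {e[:j] + (e[j] - 1,) + e[j + 1:] for e in a if e[j] % 2 == 1}

def compose(g, f):
    # substitute the quadratic map f into the polynomial g
    res = set()
    for e in g:
        term = {(0,) * m}
        for r, exp in enumerate(e):
            for _ in range(exp):
                term = pmul(term, f[r])
        res ^= term
    return res

def in_span(target, basis):
    # reduced row echelon form over GF(2); rows are sets of monomials
    ech = {}  # pivot monomial -> row
    for b in basis:
        r = set(b)
        for piv in list(ech):
            if piv in r:
                r ^= ech[piv]
        if r:
            piv = max(r)
            for q in list(ech):
                if piv in ech[q]:
                    ech[q] = ech[q] ^ r
            ech[piv] = r
    t = set(target)
    for piv in list(ech):
        if piv in t:
            t ^= ech[piv]
    return not t

random.seed(3)
quad = [(2, 0), (1, 1), (0, 2)]
f = [set(mn for mn in quad if random.random() < 0.6) or {(1, 1)} for _ in range(m)]
g = [set(mn for mn in quad if random.random() < 0.6) or {(2, 0)} for _ in range(m)]

x = [{(1, 0)}, {(0, 1)}]
Lambda = [pmul(x[j], fi) for j in range(m) for fi in f]
# every partial derivative of g_i ∘ f lies in Span{ x_j f_i }, as the chain
# rule argument above predicts
assert all(in_span(partial(compose(gi, f), j), Lambda) for gi in g for j in range(m))
```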
Application of the partial derivative attack to RMPKC requires some addi-
tional work. As we saw in the explanation of the public key, we have access
to n + 1 polynomials of the form Pi = G̃i ◦ F̄ (x1 , . . . , xn ) where G̃i is a ho-
mogeneous quadratic polynomial and F̄ consists of non-homogeneous quadratic
polynomials. Our first step is to homogenize each of the Pi ’s, which effectively
homogenizes each of the F̄i ’s, yielding the following:

P̃i (x1 , . . . , xn+1 ) = G̃i ◦ F̃ (x1 , . . . , xn+1 )

where each of the P̃i ’s are homogeneous quartic polynomials and each of the G̃i ’s
and F̃i ’s are homogeneous quadratic polynomials.
Then we begin the partial derivative attack, by calculating D from G̃i ◦
F̃ (x1 , . . . , xn+1 ). We never get D = Λ, due to the triangular structure of G and
the use of k which has characteristic 2. We are able to recover F by applying the
attack with a new method of projection of our functions to subplanes; the details
will be provided in the section that follows. After finding F , we de-homogenize
the space by setting xn+1 = 1.
The second challenge that the specifics of RMPKC present to the partial
derivative attack is to select polynomials F′1, . . . , F′n+1 from
F|xn+1=1 in such a way that they may be easily inverted. The procedure we
use to find such F′1, . . . , F′n+1 is described below. The process results in a linear
transformation L′1 and a quadratic rational map F′, which inverts in the same
manner as F for the holder of the private key.


Then, to continue the partial derivative attack, we can find the gi's that satisfy
Pi = gi ◦ F′; but these gi's would not invert easily. So we define G′ = Span { gi :
1 ≤ i ≤ n + 1 } and select polynomials from G′ which we can invert. This
process generates linear transformations L′2 and L′3, and a quadratic rational map
G′, which inverts in the same manner as G in the private key. Then we have
P = L′3 ◦ G′ ◦ L′2 ◦ F′ ◦ L′1, an alternative private key, thus breaking RMPKC.
We organize our attack into four phases. The sections that follow will present
each phase in further detail.
1. Find F = Span { F̃i : 1 ≤ i ≤ n + 1 }.
2. Determine F′ and L′1.
3. Find G′ = Span { gi | gi ◦ F′ ◦ L′1 = Pi : 1 ≤ i ≤ n + 1 }.
4. Determine G′, L′2, and L′3.

3.1 Phase I: Find F = Span { F̃i : 1 ≤ i ≤ n + 1 }


We start with the public key, P = G̃ ◦ F̄ = (P1, . . . , Pn+1), and homogenize
by creating P̃ = (P̃1, . . . , P̃n+1) using
∀ 1 ≤ i ≤ n + 1, P̃i(x1, . . . , xn+1) = xn+1^4 Pi( x1/xn+1, · · · , xn/xn+1 ).
This gives us P̃ = G̃ ◦ F̃ where F̃ = (F̃1, . . . , F̃n+1) and
∀ 1 ≤ i ≤ n + 1, F̃i(x1, . . . , xn+1) = xn+1^2 F̄i( x1/xn+1, · · · , xn/xn+1 ).
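The homogenization step can be sketched as follows. This illustration is ours, representing a polynomial as an {exponent-tuple: coefficient} dictionary and, as an assumption for readability, using a prime field in place of the characteristic-2 field k; the defining identity is field-independent.

```python
import random

p = 10007  # a prime standing in for the field k

def homogenize(poly, d, nvars):
    # P~(x1..xn+1) = x_{n+1}^d * P(x1/x_{n+1}, ..., xn/x_{n+1}): pad each
    # monomial with x_{n+1} raised to whatever power brings it up to degree d
    return {e + (d - sum(e),): c for e, c in poly.items()}

def evaluate(poly, xs):
    r = 0
    for e, c in poly.items():
        t = c
        for exp, v in zip(e, xs):
            t = t * pow(v, exp, p) % p
        r = (r + t) % p
    return r

random.seed(4)
n = 3
# a random polynomial of degree <= 4 in n variables
P = {}
for _ in range(8):
    e = [0] * n
    for _ in range(random.randrange(5)):
        e[random.randrange(n)] += 1
    P[tuple(e)] = random.randrange(p)
Pt = homogenize(P, 4, n)

# check the defining identity at a random point with x_{n+1} != 0
xs = [random.randrange(p) for _ in range(n)]
t = random.randrange(1, p)
tinv = pow(t, p - 2, p)
lhs = evaluate(Pt, xs + [t])
rhs = pow(t, 4, p) * evaluate(P, [x * tinv % p for x in xs]) % p
assert lhs == rhs
```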
To proceed we need to define Hi ∀ i ∈ { 1, 2, 3 } as the set of all homogeneous
polynomials in k[x1 , . . . , xn+1 ] of degree i. Each Hi is a vector space over k as
well as a subset of k[x1 , . . . , xn+1 ]. For notational simplification, we will use
context to distinguish between these uses of Hi .
We now define D, R, and Λ for G̃ ◦ F̃. Recall that we calculate D and R from P̃.
D = Span { ∂/∂xj (G̃i ◦ F̃)(x1, . . . , xn+1) : 1 ≤ i, j ≤ n + 1 } ⊂ H3 ;
Λ = Span { xj f : 1 ≤ j ≤ n + 1, f ∈ F } ⊂ H3 ;
R = { f ∈ H2 : ∀ 1 ≤ i ≤ n + 1, xi f ∈ D }.
Since the polynomials of G̃ and F̃ are homogeneous quadratics, we are guar-
anteed D ⊆ Λ and R ⊆ F. We also have D = Λ ⇐⇒ R = F . Because of the
structure of the original polynomials in G and the use of a field of characteristic
2, we will always find D ⊂ Λ and therefore R ⊂ F. So we use the following
definitions of Γ and γ to help explain what is happening with the individual
f's in F, why they fail to appear in R, and how we are going to
eventually find them with our alternative approach.
Γ (f ) = { θ ∈ H1 : θf ∈ D } and γ(f ) = dim( Γ (f ) ).
Clearly, f ∈ R ⇐⇒ γ(f) = n + 1. We always get γ(f) ≤ n + 1, and min { γ(f) :
f ∈ F } describes how far we are from obtaining R = F for any given instance
of RMPKC. For n = 5 and n = 6, we find min { γ(f) : f ∈ F } = n almost
every time. For n = 7 we usually get min { γ(f) : f ∈ F } = n − 1. And for
n ≥ 8 we most likely get min { γ(f) : f ∈ F } = n − 2. Our alternative approach
works most simply for min { γ(f) : f ∈ F } = n. We will describe this case in
detail now, and then briefly show how we handle min { γ(f) : f ∈ F } < n.
We again start with the key definitions, valid ∀ 1 ≤ s ≤ n + 1; and we have
access to each Ds and Rs .

Fs = Span { f(x1, . . . , xs−1, 0, xs+1, . . . , xn+1) : f ∈ F } .
Ds = Span { ∂/∂xj (G̃i ◦ F̃)(x1, . . . , xs−1, 0, xs+1, . . . , xn+1) : 1 ≤ i, j ≤ n + 1 } .
Λs = Span { xi f : 1 ≤ i ≤ n + 1 (i ≠ s), f ∈ Fs } .
Rs = { f ∈ H2 : ∀ 1 ≤ i ≤ n + 1 (i ≠ s), xi f ∈ Ds } .
Γs(f) = { θ ∈ H1 : θf ∈ Ds } , γs(f) = dim( Γs(f) ).
Now we always get Ds ⊆ Λs, Rs ⊆ Fs, and Ds = Λs ⇐⇒ Rs = Fs ⇐⇒
min { γs(f) : f ∈ Fs } = n. Fortunately for this attack, with high probability,
γs(f) = min { γ(f), n } . This is a crucial point. At this time, we do not have
a mathematical explanation for why it is so; our experiments confirm it with
consistent results. Once we get ∀ 1 ≤ s ≤ n + 1, Rs = Fs, finding F is easy.
Let Rs+ = Rs + Span { xs xi : 1 ≤ i ≤ n + 1 } . When Rs = Fs, F ⊂ Rs+.
Furthermore, if ∀ 1 ≤ s ≤ n + 1, Rs = Fs, then F = ∩_{s=1}^{n+1} Rs+, completing the
task of finding F.
For the cases of min { γ(f) : f ∈ F } < n, we extend our alternative
approach one or more levels further. Notice above that the spaces Rs+ are
created by setting xs = 0, finding Ds and Rs, then adding Span { xs xi : 1 ≤ i ≤
n + 1 } . For n = 7, when we have min { γ(f) : f ∈ F } = n − 1, we use xs1 =
0 = xs2 where s1 ≠ s2. Following the same manner, we form Ds1,s2 and Rs1,s2.
Then we let
R+s1,s2 = Rs1,s2 + Span { xs1 xi : 1 ≤ i ≤ n + 1 } + Span { xs2 xi : 1 ≤ i ≤ n + 1 } .
With consistency, we do get F = ∩_{1 ≤ s1, s2 ≤ n+1, s1 ≠ s2} R+s1,s2.
For n ≥ 8, when we have min { γ(f) : f ∈ F } = n − 2, we use xs1 = 0 =
xs2 = 0 = xs3 where s1 ≠ s2 ≠ s3 ≠ s1. Following the same manner, we form
Ds1,s2,s3 and Rs1,s2,s3. Then we let
R+s1,s2,s3 = Rs1,s2,s3 + Span { xs1 xi : 1 ≤ i ≤ n + 1 } + Span { xs2 xi : 1 ≤ i ≤ n + 1 } + Span { xs3 xi : 1 ≤ i ≤ n + 1 } .
Again we consistently get F = ∩_{1 ≤ s1, s2, s3 ≤ n+1, s1 ≠ s2 ≠ s3 ≠ s1} R+s1,s2,s3.

3.2 Phase II: Choose F′ and L′1


In this phase we will determine the quadratic polynomials of F′ = (F′1/F′n+1, · · · ,
F′n/F′n+1) and the linear transformation L′1 such that
Span { F′i ◦ L′1 : 1 ≤ i ≤ n + 1 } = Span { Fi ◦ L1 : 1 ≤ i ≤ n + 1 } ,
and F′ can be easily inverted just like F.
However, we do need one additional condition on our new map, namely we
must have F′n+1 ◦ L′1 = λ Fn+1 ◦ L1 for some λ ∈ k. This is necessary in order
for the proper G′, which will be determined later, to be chosen so that it too
can be inverted in the same manner as G.
Our first step is to determine a core transformation in F′. From the definition
in Section 2, we can see that there is a subspace spanned by two linearly
independent linear functions in F, which actually lies in the space spanned
by Fn−1, Fn, Fn+1. Therefore F′ also contains a subspace that is contained in
Span { θ′n−1, θ′n, 1 } for some θ′n−1, θ′n ∈ H1. This space can be found easily, and
it is clear that we have Span { θ′n−1, θ′n } = Span { L1,n−1, L1,n } , where L1,n−1
and L1,n are the last two components of the linear transformation L1. Next we
find the three-dimensional subspace of F which forms the core transformation,
i.e. let
R = F ∩ Span { θ′n−1, θ′n, θ′n−1 θ′n, (θ′n−1)^2, (θ′n)^2, 1 }.
By construction, we know not only that ∃ R1, R2, R3 ∈ R such that R =
Span { R1, R2, R3 } with R3 ∈ Span { (θ′n−1)^2, (θ′n)^2, θ′n−1 θ′n, 1 } and R1, R2 ∈
Span { θ′n−1, θ′n, 1 } , but also that ∃ θn−1, θn ∈ Span { θ′n−1, θ′n } where
R1, R2 ∈ Span { θn−1, θn, 1 } and R3 ∈ Span { θn−1 θn, 1 } . Furthermore, R3
can be chosen so that R3 = (θ′n−1)^2 + a θ′n−1 θ′n + b (θ′n)^2 + c. We can find appropriate
θn−1 = θ′n−1 + s θ′n and θn = θ′n−1 + t θ′n by finding the right values for s and t.
We solve for s and t by equating the quadratic terms of our chosen R3, i.e.
(θ′n−1)^2 + a θ′n−1 θ′n + b (θ′n)^2 = (θ′n−1 + s θ′n)(θ′n−1 + t θ′n).
So s + t = a and st = b. Thus s(a − s) = b, i.e. s^2 − as + b = 0. In characteristic 2,
squaring is a GF(2)-linear operation, so this last equation is
actually linear and can be solved for s.
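To see this concretely, recall that in GF(2^m) the Frobenius map s ↦ s^2 is GF(2)-linear. The sketch below is our own illustration; as an assumption it uses GF(2^8) with the AES reduction polynomial as a concrete small field, though the scheme uses larger fields of characteristic 2. It verifies the linearity of s ↦ s^2 + as and solves such an equation by Gaussian elimination over GF(2).

```python
import random

M = 0x11B  # reduction polynomial x^8 + x^4 + x^3 + x + 1 for GF(2^8)

def gmul(a, b):
    # carry-less ("Russian peasant") multiplication in GF(2^8)
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= M
        b >>= 1
    return r

def phi(a, s):
    return gmul(s, s) ^ gmul(a, s)  # the map s -> s^2 + a*s

random.seed(5)
a = 0x57
for _ in range(20):  # check GF(2)-linearity of phi(a, .)
    s, t = random.randrange(256), random.randrange(256)
    assert phi(a, s ^ t) == phi(a, s) ^ phi(a, t)

# Build the 8x8 GF(2) matrix of phi on the basis 1, x, ..., x^7 and solve
# phi(a, s) = b by Gauss-Jordan elimination over GF(2).
b = phi(a, 0xC3)  # pick b in the image so a solution certainly exists
cols = [phi(a, 1 << j) for j in range(8)]
rows = [[(cols[j] >> i) & 1 for j in range(8)] + [(b >> i) & 1] for i in range(8)]
where, piv = [-1] * 8, 0
for col in range(8):
    sel = next((r for r in range(piv, 8) if rows[r][col]), None)
    if sel is None:
        continue  # free column (phi has a small kernel, so the rank may be 7)
    rows[piv], rows[sel] = rows[sel], rows[piv]
    for r in range(8):
        if r != piv and rows[r][col]:
            rows[r] = [x ^ y for x, y in zip(rows[r], rows[piv])]
    where[col] = piv
    piv += 1
s = sum((rows[where[col]][8] << col) for col in range(8) if where[col] != -1)
assert phi(a, s) == b  # s solves s^2 + a*s = b
```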


This choice of the θi allows us to calculate an inversion function for the core
transformation (described below), just like the inversion function of F. Coincidentally,
either θn−1 = λ1 L1,n−1 and θn = λ2 L1,n for some λ1, λ2 ∈ k, or θn−1 = λ1 L1,n
and θn = λ2 L1,n−1 for some λ1, λ2 ∈ k; but we do not care which, nor do we use
this result directly.

To get F′n+1 ◦ L′1 = λ Fn+1 ◦ L1 for some λ ∈ k, we choose fn+1 ∈ R such
that fn+1 | ρ for some nonzero ρ ∈ P = Span { Pi : 1 ≤ i ≤ n + 1 }. This works to
identify fn+1 = λ Fn+1 ◦ L1 for some λ ∈ k because the quadratic polynomials
of G become homogeneous when composed with the rational functions in F,
making the linear subspace of the polynomials of G become a subspace divisible
by Fn+1 ◦ L1 (the denominator) when composed with L2 ◦ F ◦ L1.
We randomly choose fn−1, fn ∈ R such that R = Span { fi : n − 1 ≤ i ≤
n + 1 }. We then determine f1, . . . , fn−2 and θ1, . . . , θn−2 sequentially, by first
choosing fn−2 and θn−2, then working our way down to f1 and θ1. Our procedure is
as follows:
∀ i = n − 2, n − 3, · · · , 2: find θi ∉ Span { θi+1, . . . , θn } and fi ∈ F such
that fi ∈ Span { θj θk : i ≤ j ≤ k ≤ n + 1, k ≠ i } + Span { θj : i ≤ j ≤ n + 1 } + Span { 1 }.
The last components, f1 and θ1, can be chosen randomly as long as Span { fi :
1 ≤ i ≤ n + 1 } = F and Span { θi : 1 ≤ i ≤ n } = Span { xi : 1 ≤ i ≤ n }.
θ1, . . . , θn are the components of L′1. It is easy to calculate F′1, . . . , F′n+1 such
that ∀ 1 ≤ i ≤ n + 1, fi = F′i ◦ L′1.
Now that we have determined L′1 and F′, we can find the inversion function
parameters (α1, . . . , α6, β1, . . . , β6) for the core transformation of F′ by
considering
xn−1 = ( α1 F′n−1(xn−1, xn)/F′n+1(xn−1, xn) + α2 F′n(xn−1, xn)/F′n+1(xn−1, xn) + α3 ) / ( α4 F′n−1(xn−1, xn)/F′n+1(xn−1, xn) + α5 F′n(xn−1, xn)/F′n+1(xn−1, xn) + α6 )
     = ( α1 F′n−1(xn−1, xn) + α2 F′n(xn−1, xn) + α3 F′n+1(xn−1, xn) ) / ( α4 F′n−1(xn−1, xn) + α5 F′n(xn−1, xn) + α6 F′n+1(xn−1, xn) )

or equivalently

xn−1 · ( α4 F′n−1(xn−1, xn) + α5 F′n(xn−1, xn) + α6 F′n+1(xn−1, xn) ) =
α1 F′n−1(xn−1, xn) + α2 F′n(xn−1, xn) + α3 F′n+1(xn−1, xn).

We equate the coefficients of the terms 1, xn−1, xn, (xn−1)^2, xn−1 xn, and
(xn−1)^2 xn, and simultaneously solve for α1, . . . , α6. In the same manner
we find β1, . . . , β6 by starting with
xn = ( β1 F′n−1(xn−1, xn)/F′n+1(xn−1, xn) + β2 F′n(xn−1, xn)/F′n+1(xn−1, xn) + β3 ) / ( β4 F′n−1(xn−1, xn)/F′n+1(xn−1, xn) + β5 F′n(xn−1, xn)/F′n+1(xn−1, xn) + β6 )
   = ( β1 F′n−1(xn−1, xn) + β2 F′n(xn−1, xn) + β3 F′n+1(xn−1, xn) ) / ( β4 F′n−1(xn−1, xn) + β5 F′n(xn−1, xn) + β6 F′n+1(xn−1, xn) ).

Phase III: Find G′. For each 1 ≤ i ≤ n + 1, find the linear combination of { (F′j ◦
L′1)(F′r ◦ L′1) : 1 ≤ j ≤ r ≤ n + 1 } which is equal to Pi. The coefficients of
these combinations are the coefficients of the homogeneous polynomials Ḡ′i.
Let G′ = Span { Ḡ′i : 1 ≤ i ≤ n + 1 }.

3.3 Phase IV: Choose G′, L′2 and L′3


In this phase we will determine the quadratic polynomials of G′ = (G′1/G′n+1, · · · ,
G′n/G′n+1) and the linear transformations L′2 and L′3 such that ∀ 1 ≤ i ≤ n + 1, Pi =
(L′3)i ◦ G′ ◦ L′2 ◦ F′ ◦ L′1, and G′ can be easily inverted just like G.
Our first step is to determine a core transformation in G′. We easily find two
linearly independent linear functions in G′, φ′n−1 and φ′n. Let U = Span { φ′n−1, φ′n }.
That makes U = Span { L2,n−1, L2,n }. Next we find the three-dimensional
subspace of G′ which forms the core transformation, i.e. let
V = G′ ∩ Span { φ′n−1, φ′n, φ′n−1 φ′n, (φ′n−1)^2, (φ′n)^2, 1 }.
Now we find φn−1 and φn in U such that ∀ g ∈ V, g ∈ Span { φn−1 φn, φn−1,
φn, 1 }. This choice of φ's allows us to calculate an inversion function for the
core transformation, just like the inversion function of G. Coincidentally, either
φn−1 = λ1 L2,n−1 and φn = λ2 L2,n for some λ1, λ2 ∈ k, or φn−1 = λ1 L2,n and
φn = λ2 L2,n−1 for some λ1, λ2 ∈ k; but we do not care which, nor do we use this
result directly.
Up to this point, our work with G′ has been identical to the work with
F′. The method to determine G′n+1 is the first place where we differ. G′n+1
will be the quadratic polynomial in two variables such that G′n+1(φn−1, φn) =
Ḡ′n+1(x1, . . . , xn, 1).
Now we randomly choose gn−1, gn ∈ V such that V = Span { gi : n − 1 ≤ i ≤
n + 1 }. We then determine g1, . . . , gn−2 and φ1, . . . , φn−2 sequentially, by first
choosing gn−2 and φn−2, then working our way down to g1 and φ1. Our procedure is
as follows:
∀ i = n − 2, n − 3, · · · , 2: find φi ∉ Span { φi+1, . . . , φn } and gi ∈ G′ such
that gi ∈ Span { φj φk : i ≤ j ≤ k ≤ n + 1, k ≠ i } + Span { φj : i ≤ j ≤ n + 1 } + Span { 1 }.
The last components, g1 and φ1, can be chosen randomly as long as Span { gi :
1 ≤ i ≤ n + 1 } = G′ and Span { φi : 1 ≤ i ≤ n } = Span { xi : 1 ≤ i ≤ n }.
φ1, . . . , φn are the components of L′2. And again we must differ in our approach
to G′ from the approach to F′. At this point we have, for 1 ≤ i ≤ n, that Ḡ′i is
a linear combination of { gj : 1 ≤ j ≤ n + 1 }. We need ∀ 1 ≤ i ≤ n, Ḡ′i to be
a linear combination of only { gj : 1 ≤ j ≤ n } (excluding gn+1).
How we do this is best explained using (n + 1) × (n + 1) matrices. Let
χ be the matrix of the linear transformation (k^{n+1} −→ k^{n+1}) such that
χ (g1 ◦ L′2, . . . , gn+1 ◦ L′2)^T = (Ḡ′1, . . . , Ḡ′n+1)^T.
χ is in the form

  [ ∗ · · · ∗ ∗ ]
  [ .  . .  . . ]
  [ ∗ · · · ∗ ∗ ]
  [ 0 · · · 0 ∗ ]

but

  [ ∗ · · · ∗ 0 ]
  [ .  . .  . . ]
  [ ∗ · · · ∗ 0 ]
  [ 0 · · · 0 ∗ ]

is the form which we need.
So we find an invertible upper triangular matrix π and an invertible matrix
ν of the desired form such that νχ = π. The zero entries of π provide linear
equations to solve for the entries of ν, with coefficients from χ, which are known.
Now we have χ = ν^−1 π. So let G′ = (G′1/G′n+1, · · · , G′n/G′n+1) where
(G′1, . . . , G′n+1)^T = π (g1, . . . , gn+1)^T; and let L′3 = ν^−1. Thus
(Ḡ′1, . . . , Ḡ′n+1)^T = χ (g1 ◦ L′2, . . . , gn+1 ◦ L′2)^T = ν^−1 π (g1 ◦ L′2, . . . , gn+1 ◦ L′2)^T = L′3 (G′1 ◦ L′2, . . . , G′n+1 ◦ L′2)^T.
Furthermore, P = L′3 ◦ G′ ◦ L′2 ◦ F′ ◦ L′1 and our decomposition is complete.
We can find the inversion function parameters (δ1, . . . , δ6, γ1, . . . , γ6) for the
core transformation of G′ in the exact same manner that we found α1, . . . , α6
and β1, . . . , β6 for F′.
In summary, we have created an alternate RMPKC private key consisting of L′1, F′,
L′2, G′, and L′3 such that L′3 ◦ G′ ◦ L′2 ◦ F′ ◦ L′1 = L3 ◦ G ◦ L2 ◦ F ◦ L1 and both G′
and F′ are invertible, just like G and F; so the cryptanalysis of RMPKC is complete.

3.4 Experimental Results and Computational Complexity


The proposal for RMPKC in 1989 suggested an implementation with k of size 2^8
and n = 5. Our attack, programmed in Magma, consistently completes the cryptanalysis
in less than six seconds running on a personal computer with a Pentium
4 1.5 GHz processor and 256 MB of RAM. We ran several experiments at higher
values of n and for larger fields k.
Increasing the size of the field increases the run time of the program linearly.
Larger values of n cause a much greater run time and manifest the critical
elements of both the public key size of the cryptosystem and the computational
complexity of our cryptanalysis. Since the public key is a set of n + 1 quartic
polynomials, its size is of order O(n^4).
The following table indicates the public key size, median total run time, and
median percent of total run time spent in each of the four steps, for various values of
n. We used |k| = 2^16, which seems to be reasonable. A k of size 2^32
would be quite reasonable as well.

   n   Public Key   Total Run    Step 1    Step 2            Step 3    Step 4
       (kBytes)     Time (sec)   Find F    Define L′1 & F′   Find G′   Define L′2, G′ & L′3
   5      1.5           10.8      11 %        78 %             8 %        3 %
   6      2.9           40.0       9 %        80 %             8 %        2 %
  10     22.0           1949      15 %        76 %             8 %        1 %
  14     91.8          33654      10 %        80 %             9 %        1 %

Step 2 clearly comprises the bulk of the run time. Finding the exact denominator
of F takes almost all of this time, requiring (1/24)(16n^6 + 131n^5 + 440n^4 +
595n^3 + 419n^2 + 114n) operations. However, step 1 has computational complexity
O(n^7) and step 3 has computational complexity O(n^9), so eventually, at
higher values of n, step 3 will comprise the bulk of the run time.

Remark. The steps above show that our attack is not a simple application of any
one existing attack method, let alone just the MinRank attack. The key
is that we first need to accomplish a polynomial map decomposition and then
recover a subtle rational map decomposition equivalent to the original one, which
requires much more than the MinRank method. One more important point concerns
the direct algebraic attack: from the public key, we can derive a set of
polynomial equations once we are given the ciphertext, but these are degree
4 equations, not degree 2 equations, and their computational complexity, as is
well known, is much higher than in the degree 2 case. This is further
complicated by the fact that we are working over a field of size 2^32, where
the field equations cannot be used. This is confirmed by our experiments; for
example, the Magma F4 implementation failed to solve even the case n = 5, the
parameter set proposed more than 20 years ago, on an ordinary PC.

4 Conclusion
We develop a new, improved 2R decomposition method to break the family of
rational multivariate public key cryptosystems proposed by Tsujii, Fujioka, and
Hirayama in 1989. We show that this family of cryptosystems can be broken in
time polynomial in the number of variables, the critical parameter of
the system. We demonstrate in experiments that our method is very efficient
and that we can break the scheme originally suggested for practical applications in
a few seconds on a standard PC. The main contribution is that we develop new
techniques to improve the original 2R decomposition such that it can be used
successfully to attack a special family of rational maps. Although we defeat the
cryptosystems, we still believe that this family of cryptosystems contains some
very interesting ideas that may be utilized effectively.

References
1. International Workshop on Post-Quantum Cryptography. Katholieke Universiteit
Leuven, Belgium, May 24–26 (2006), [Link]
2. Faugère, J.-C., Perret, L.: Cryptanalysis of 2R− Schemes. In: Dwork, C. (ed.)
CRYPTO 2006. LNCS, vol. 4117, pp. 357–372. Springer, Heidelberg (2006)
3. Fell, H., Diffie, W.: Analysis of a public key approach based on polynomial sub-
stitution. In: Williams, H.C. (ed.) CRYPTO 1985. LNCS, vol. 218, pp. 340–349.
Springer, Heidelberg (1986)
4. Goubin, L., Patarin, J.: Asymmetric Cryptography with S-Boxes, Extended Ver-
sion, [Link]
5. Tsujii, S., Fujioka, A., Hirayama, Y.: Generalization of the public key cryptosystem
based on the difficulty of solving a system of non-linear equations. IEICE
Transactions (A) J72-A 2, 390–397 (1989), [Link]
6. Tsujii, S., Tadaki, K., Fujita, R.: Piece In Hand Concept for Enhancing the Security
of Multivariate Type Public Key Cryptosystems: Public Key Without Containing
All the Information of Secret Key, Cryptology ePrint Archive, Report 2004/366
(2004), [Link]
7. Tsujii, S., Kurosawa, K., Itoh, T., Fujioka, A., Matsumoto, T.: A public key cryptosystem
based on the difficulty of solving a system of nonlinear equations. IEICE
Transactions (D) J69-D 12, 1963–1970 (1986)
8. Lih-Chung, W., Yuh-Hua, H., Lai, F., Chun-Yen, C., Bo-Yin, Y.: Tractable rational
map signature. In: Vaudenay, S. (ed.) PKC 2005. LNCS, vol. 3386, pp. 244–257.
Springer, Heidelberg (2005)
9. Ye, D.F., Lam, K.Y., Dai, Z.D.: Cryptanalysis of 2R Schemes. In: Wiener, M. (ed.)
CRYPTO 1999. LNCS, vol. 1666, pp. 315–325. Springer, Heidelberg (1999)
10. Specifications of SFLASH, NESSIE documentation,
[Link]
Syndrome Based Collision Resistant Hashing

Matthieu Finiasz

ENSTA

Abstract. Hash functions are a hot topic at the moment in cryptogra-


phy. Many proposals are going to be made for SHA-3, and among them,
some provably collision resistant hash functions might also be proposed.
These do not really compete with “standard” designs as they are usually
much slower and not well suited for constrained environments. However,
they present an interesting alternative when speed is not the main ob-
jective. As always when dealing with provable security, hard problems
are involved, and the fast syndrome-based cryptographic hash function
proposed by Augot, Finiasz and Sendrier at Mycrypt 2005 relies on the
problem of Syndrome Decoding, a well known “Post Quantum” problem
from coding theory. In this article we review the different variants and
attacks against it so as to clearly point out which choices are secure and
which are not.

Keywords: hash functions, syndrome decoding, provable security.

1 Introduction
At Mycrypt 2005, Augot, Finiasz and Sendrier proposed a new "provably collision
resistant" family of hash functions [1]. This family, called Fast Syndrome Based
hash functions (or simply FSB), is provably collision resistant in the sense that
finding a collision for FSB requires solving a hard problem of coding theory,
namely, the Syndrome Decoding problem. However, even if finding collisions
requires solving an NP-complete problem, some algorithms still exist to solve
it, and choosing secure parameters for the function turned out to be harder
than expected. As a consequence, some attacks were found, making some of the
originally proposed parameters unsafe. The aim of this article is to review the
different FSB variants and the various attacks against them, and to clearly point
out which parameters are insecure and which are not.
The FSB construction is based on the Merkle-Damgård design. Therefore, we
only describe the compression function which is then iterated in order to obtain
the hash function. As a result, if finding collisions for this compression function
is hard, then finding collisions for the full hash function will also be hard. The
goal of this design is to be able to reduce the problem of finding collisions to the
syndrome decoding problem.
The compression function is composed of two sub-functions:
– First a constant weight encoding function which takes the s input bits of the
compression function and outputs a binary word of length n and Hamming
weight w,

J. Buchmann and J. Ding (Eds.): PQCrypto 2008, LNCS 5299, pp. 137–147, 2008.
c Springer-Verlag Berlin Heidelberg 2008
138 M. Finiasz

– Then a syndrome computation function which multiplies the previously
  obtained constant weight word by an r × n binary matrix H and outputs the
  resulting r bits (which in coding theory are usually called a syndrome). In
  practice, as w is usually small, this multiplication is simply the XOR of w
  columns of H.
Depending on the choice of the constant weight encoding algorithm and of the
matrix, the FSB hash function can be made either faster or safer, according to what
matters most. Up to now, two different choices have been proposed, and some
attacks on specific parameter sets have followed. In this article we first recall
these two constructions (see Section 2) and then present all the known attacks
against them (see Section 3). In Section 4 we discuss some other issues concerning
the FSB construction and eventually propose some up-to-date candidates in
Section 5.

2 Description
As explained, the FSB compression function is composed of two sub-functions:
the constant weight encoding function takes s input bits and outputs a word of
length n and weight w; the syndrome computation function uses a binary matrix
H of size r × n and multiplies it by the previous low weight word to output r
bits.

2.1 Original Version


The original version of FSB was first presented in [1]. It uses regular words for
constant weight encoding and matrix H is a random matrix.

Definition 1. A regular word of weight w and length n is a binary word of length
n containing exactly one non-zero bit in each of its w intervals of length n/w.

The reasons for these encoding/matrix choices are quite simple:


– Regular word encoding is the fastest possible constant weight encoding. If n/w is a power of two, then the constant weight encoding simply consists in reading log2(n/w) input bits at a time in order to get the index of a column of H.
– A random matrix was chosen for H for security reasons: it is widely believed
among coding theorists that random linear codes have good properties mak-
ing them hard to decode. As we will see in Section 3, finding collisions for
FSB is equivalent to decoding in a code of parity check matrix H, thus, using
a random matrix is probably a good choice.
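As an illustration, the two sub-functions can be sketched in a few lines. The parameters below are made-up miniature values chosen only so the sketch runs quickly, not proposed FSB parameters; the function reads s = w · log2(n/w) input bits per call:

```python
import secrets

# Toy sketch of the FSB compression function (miniature, illustrative parameters).
r, w, block = 64, 8, 16          # output bits, weight, interval length n/w
n = w * block                    # code length
bits = block.bit_length() - 1    # log2(n/w) = 4 input bits per interval
s = w * bits                     # input bits read per call

# A random r x n binary matrix H, stored as n column vectors (r-bit integers).
H = [secrets.randbits(r) for _ in range(n)]

def compress(data: int) -> int:
    """Map s input bits to the r-bit syndrome of a regular word."""
    syndrome = 0
    for i in range(w):
        # Regular word encoding: pick one column index in the i-th interval.
        idx = (data >> (bits * i)) & (block - 1)
        syndrome ^= H[i * block + idx]   # syndrome computation: XOR of w columns
    return syndrome

assert 0 <= compress(secrets.randbits(s)) < 2 ** r
```

The syndrome computation really is just w XORs of r-bit columns, which is what makes the function fast.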

2.2 Quasi-Cyclic Version


Syndrome Based Collision Resistant Hashing 139

The main drawback of the original version is the size of the matrix H. The parameters proposed in [1] all involve using a matrix of at least a few hundred kilobytes (if not a few megabytes), which is a lot for most constrained environments. For this reason, an improved version was presented in [6], still using regular words for constant weight encoding, but this time using a quasi-cyclic binary matrix H.

Definition 2. An r × n quasi-cyclic matrix is a matrix composed of n/r cyclic blocks. Each r × r cyclic block is such that the i-th line of the block is a cyclic shift by i − 1 positions of the first line of the block.
A quasi-cyclic matrix is thus entirely defined by its first line and the size of the cyclic blocks.

Quasi-cyclic codes are very interesting as they decrease the size of the description of the hash function a lot (a single line is enough). Moreover, it is proven in [8] that when the size of the cyclic blocks is a prime p (and 2 is a generator of GF(p)), these codes have properties similar to random codes.
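Definition 2 translates directly into code. The helper below is an illustrative sketch (columns stored as r-bit integers, bit i holding row i) that expands the first line of a quasi-cyclic matrix into its n columns:

```python
def quasi_cyclic_columns(first_line, r):
    """Expand the first line (a list of n bits, n a multiple of r) of an
    r x n quasi-cyclic matrix into its n columns, each an r-bit integer."""
    n = len(first_line)
    assert n % r == 0
    mask = (1 << r) - 1
    cols = []
    for b in range(n // r):
        a = first_line[b * r:(b + 1) * r]        # first line of block b
        # First column of the cyclic block: entry (i, 0) = a[(0 - i) mod r].
        col0 = sum(a[(-i) % r] << i for i in range(r))
        for j in range(r):
            # Column j is the first column cyclically shifted down by j rows.
            cols.append(((col0 << j) | (col0 >> (r - j))) & mask)
    return cols

# A block whose first line is (1, 0, ..., 0) expands to the identity block.
assert quasi_cyclic_columns([1, 0, 0, 0], 4) == [1, 2, 4, 8]
```

Only the first line (and r) needs to be stored, which is exactly the size advantage mentioned above.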
Unfortunately, some of the parameters proposed in [6] were awkwardly selected, making them subject to new attacks that we will present in the following section.

3 Known Attacks
The aim of our hash function construction is to be collision resistant. As it
uses the Merkle-Damgård design, it is sufficient that our compression function is
collision resistant. We thus want to evaluate the complexity of the best collision
search algorithms against it. In practice, there are two ways to find a collision
in the FSB compression function:
– Either find a collision in the constant weight encoding algorithm: choosing an injective encoding is enough to guarantee that no such collision exists,
– Or find two words of weight w having the same syndrome: that is, find two words c and c′ such that H × c = H × c′.

As our compression function needs to compress, this second type of collision always exists; our goal is thus only to make them hard to find!

Problem 1 (Collision search). Given a binary matrix H and a weight w, find a word c of non-zero weight ≤ 2w such that H × c = 0.

In other words, finding collisions for FSB requires finding a set of 2w or fewer columns of H which XOR to zero.

3.1 Decoding Attack


The most natural algorithms to solve the collision problem for FSB are decoding algorithms: if one considers H as the parity check matrix of a binary code, finding a collision consists in looking for a code word of weight ≤ 2w (which is roughly equivalent to a decoding problem). Depending on the choice for the matrix H, this requires to either find a structure in H making this search easier, or find a low weight code word in a random code (that is, assume that H contains no specific structure making decoding easier).
For the original version of FSB, a truly random matrix is used for H, therefore,
the probability that a structure exists in H is negligible. For the quasi-cyclic
version, an obvious structure exists: the matrix is quasi-cyclic. Nevertheless, if
the quasi-cyclic length is well chosen, no specific decoding algorithm is known
and decoding can only be done as in a random code. However, as we will see in
Section 3.4, a bad choice for the quasi-cyclic length can make the search for low
weight code words much easier.
Considering no structure can be found in H, the best decoding algorithm for
a random binary code is the Canteaut-Chabaud algorithm [4]. This algorithm
is the most advanced of the information set decoding algorithms family and
is specifically designed to solve the hardest decoding instances, that is, finding
words of weight w close to the Gilbert-Varshamov bound when a single solution
exists. Here, the weights we are looking for are much larger (otherwise no com-
pression is possible) which places us in a domain where decoding is somehow
easier and where a large number of solutions exist. Giving a closed formula for
the complexity of this algorithm is very difficult, especially when many solutions
exist, but it is nevertheless possible to program an algorithm computing the best
possible work factor for a given set of parameters. Additionally, for the domain
of parameters we are considering, the Canteaut-Chabaud algorithm is almost
always slower than the generalized birthday technique we describe in the next
section. When choosing parameters we thus simply checked that the work factor
for this attack was above the expected security level, and this was always the case.

3.2 Wagner’s Generalized Birthday Technique

Wagner's generalized birthday technique [10] is an extension of the standard birthday collision search technique that uses any power-of-two number of lists instead of only two. This technique takes advantage of the large number of solutions and looks for specific solutions which can be found more efficiently. It was first applied to FSB in [5].

Standard Birthday Technique. We are looking for 2w columns of H which XOR to 0. These columns are r bits long, so we know that if we can build two lists of 2^(r/2) elements containing XORs of w columns of H, there is a high probability that these two lists contain an identical element. Building these lists and finding this collision can be done in time/space complexity O(2^(r/2)).

Generalized Birthday Technique. With Wagner's generalized technique, one has to build 2^a lists of 2^(r/(a+1)) elements containing XORs of w/2^a columns of H. These lists are then merged pairwise to obtain 2^(a−1) lists of XORs of w/2^(a−1) columns of H. However, in the resulting lists, instead of keeping all the possible elements, only those starting with r/(a+1) zeros are kept: this way, the size of the lists will not increase. Then, these lists are once again merged pairwise, canceling r/(a+1) bits again, and so on, until only two lists are left and the standard birthday technique can be used. With this technique, collisions can be found in time/space complexity O(2^(r/(a+1))), for any value of a such that enough elements are found to populate the 2^a starting lists.
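To make the merging process concrete, here is a toy run of the technique with 2^a = 4 lists (a = 2) of random r-bit values. The list sizes and the number b of cancelled bits are illustrative choices, not FSB parameters:

```python
import random

def wagner_4_lists(lists, b):
    """Toy Wagner merge with 4 lists (a = 2): find one element from each
    list whose XOR is zero.  b low bits are cancelled at the first merge."""
    L1, L2, L3, L4 = lists
    mask = (1 << b) - 1

    def merge(A, B):
        # Keep XORs of pairs whose b low bits cancel (i.e. are equal).
        index = {}
        for y in B:
            index.setdefault(y & mask, []).append(y)
        return [(x ^ y, x, y) for x in A for y in index.get(x & mask, [])]

    AB = merge(L1, L2)                       # low b bits are now zero
    CD = merge(L3, L4)
    seen = {v: (z, t) for v, z, t in CD}
    for v, x, y in AB:
        if v in seen:                        # full collision: XOR of 4 is 0
            z, t = seen[v]
            return x, y, z, t
    return None

random.seed(1)
r, b, size = 24, 8, 1 << 10
lists = [[random.getrandbits(r) for _ in range(size)] for _ in range(4)]
sol = wagner_4_lists(lists, b)
assert sol is not None and sol[0] ^ sol[1] ^ sol[2] ^ sol[3] == 0
```

With these sizes the expected number of surviving elements stays roughly constant at each merge, which is the whole point of cancelling b bits at a time.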
Depending on s and r, the size of the input and output of the compression function, it is easy to evaluate the largest possible value for a, and thus the best possible complexity for this attack. There are s input bits to the function, meaning that 2^s different inputs exist. Thus, 2^s words of weight w can be built. The number L of different words of weight w/2^(a−1) that can be built must thus verify L ≤ 2^(s/2^(a−1)). Additionally, if we want the attack to be possible, the size L of the starting lists must be large enough, meaning that we need L ≥ 2^(r/(a+1)). Thus, any valid parameter a must verify:

    2^(r/(a+1)) ≤ 2^(s/2^(a−1))  ⇐⇒  r/(a+1) ≤ s/2^(a−1)  ⇐⇒  2^(a−1)/(a+1) ≤ s/r.    (1)

For s = r, it is interesting to note that a = 3 verifies the inequality. If we want the function to compress (that is, s > r), a = 3 will thus always be possible, and a security higher than 2^(r/4) is never possible. This is why a final compression function (see Section 4.2) will always be necessary.
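Inequality (1) can be checked mechanically. The helper below is an illustrative sketch returning the largest valid integer a and the corresponding attack cost exponent r/(a+1):

```python
def best_wagner_a(r, s):
    """Largest integer a with 2^(a-1)/(a+1) <= s/r (inequality (1)),
    and the resulting cost exponent r/(a+1) of Wagner's attack."""
    a = 1
    while 2 ** a / (a + 2) <= s / r:   # does a + 1 still satisfy (1)?
        a += 1
    return a, r / (a + 1)

# For s = r, a = 3 is the best value, giving a cost of 2^(r/4).
assert best_wagner_a(512, 512) == (3, 128.0)
# For r = 512 and s = 1587 (the 80-bit parameters proposed in Section 5.3),
# the best value is a = 5, giving a cost of about 2^85, above the 2^80 target.
assert best_wagner_a(512, 1587) == (5, 512 / 6)
```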

3.3 Linearization
The linearization attack against FSB was presented in [9]. The idea of this attack is that when w becomes large enough, the problem of finding a collision can be linearized in order to reduce it to a linear algebra problem. A collision can then be found in polynomial time!
There are two forms of this attack: first the straightforward linearization, then an extension making it possible to use the attack in some cases where it could not normally apply.

Simple Linearization. Suppose w = r/2. In this case, finding a collision consists in finding r columns of H XORing to 0. Now, instead of looking for any word of weight r and length n, we restrict ourselves to very specific words: each of the r columns of H will be chosen among one pair of columns, meaning that the i-th column will be either h_i^0 or h_i^1, where all the h_i^{0,1} are different columns of H. Finding a collision now requires to determine a binary vector B of length r where the i-th bit of B decides which column to choose between h_i^0 and h_i^1. Now comes the linearization: we build a matrix H′ such that the i-th column of H′ is h′_i = h_i^1 ⊕ h_i^0. Now, we simply need to find B such that:

    H′ × B = h_1^0 ⊕ h_2^0 ⊕ · · · ⊕ h_r^0.

This is a linear system, and solving it is done in polynomial time. Thus, as soon as 2w ≥ r, finding a collision for FSB can be done in polynomial time.
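The simple linearization can be sketched as follows (a toy GF(2) implementation with columns stored as integers; for a given random choice of column pairs the system may happen to be unsolvable, in which case the attacker simply picks new pairs):

```python
import random

def linearization_collision(pairs, r):
    """Given r pairs (h0, h1) of r-bit columns, solve H' B = XOR of all h0
    over GF(2), where column i of H' is h0_i ^ h1_i.  Returns B as an r-bit
    integer, or None if this particular system is unsolvable."""
    hp = [h0 ^ h1 for h0, h1 in pairs]
    target = 0
    for h0, _ in pairs:
        target ^= h0
    # Gaussian elimination; each equation is an (r+1)-bit int:
    # bit j = coefficient of B_j, bit r = right-hand side.
    piv = {}
    for i in range(r):
        row = sum(((c >> i) & 1) << j for j, c in enumerate(hp))
        row |= ((target >> i) & 1) << r
        for j in range(r):
            if (row >> j) & 1:
                if j in piv:
                    row ^= piv[j]
                else:
                    piv[j], row = row, 0
                    break
        if row:                          # reduced to "0 = 1": no solution
            return None
    B = 0
    for j in sorted(piv, reverse=True):  # back substitution, free vars = 0
        val = (piv[j] >> r) & 1
        for k in range(j + 1, r):
            val ^= ((piv[j] >> k) & 1) & ((B >> k) & 1)
        B |= val << j
    return B

random.seed(3)
r, B, pairs = 16, None, None
for _ in range(100):                     # retry until a solvable instance
    pairs = [(random.getrandbits(r), random.getrandbits(r)) for _ in range(r)]
    B = linearization_collision(pairs, r)
    if B is not None:
        break
assert B is not None
acc = 0
for i, (h0, h1) in enumerate(pairs):
    acc ^= h1 if (B >> i) & 1 else h0
assert acc == 0                          # the r selected columns XOR to zero
```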
Extension of the Attack. When w < r/2, the previous attack can still be applied, but the matrix H′ will no longer be square and the probability that a solution B exists will probably be negligible. To improve this, one can use a larger alphabet: instead of choosing two columns, one can choose three columns of H at a time and code two bits of B with them. However, three columns give three possibilities and two bits of B require four columns (with the fourth column being the XOR of the second and the third). Thus, each solution vector B using extended alphabets will have probability 1/4 per set of three columns of being invalid. This solution will thus increase the chance that a solution vector B can be found, but will decrease the probability that this solution is realizable in practice. According to [9], if 2w + 2w′ = r (with w′ ≤ w), the probability that a valid solution is found is:

    (3/4)^(2w′) × 0.28879 ≈ 2^(−0.830 w′ − 1.792).

This attack is thus usable as soon as w ≥ r/4, but it will mostly be of interest when w is close to r/2.

3.4 Quasi-Cyclic Divisibility


This attack was presented in [7] and it exploits the divisibility of the cycle length in the quasi-cyclic version of FSB. Suppose H is built of n/r cyclic blocks of size r × r, and there exists a divisor p of r such that r = p × r′. Then, Wagner's generalized birthday technique can be applied on the length r′ instead of r. Instead of looking for any word of weight w and length n, we focus on words having a "cyclic" syndrome: each time the i-th column of a block of H is chosen, the p − 1 columns at positions i + r′, i + 2r′, . . . , cyclically up to i − r′, are also chosen. This way, a block-wise cyclic shift by r′ positions of the input word keeps it unchanged. This means that the syndrome of a word selected this way also remains unchanged when cyclically shifted by r′ positions. Thus, if the r′ top positions of the syndrome are null, the whole syndrome is also null.
Focusing on the r′ top rows of H and selecting only the previously described words, we now need to apply Wagner's technique to the following problem: find w/p columns of an r′ × n/p binary matrix XORing to 0. The largest possible value for the parameter a of Wagner's algorithm is smaller than before, but the attack applies to r′ bits only and the final complexity drops significantly. When selecting parameters for FSB, it is important that such an attack cannot be applied.

4 Other Issues
4.1 IV Weakness
As pointed out in [7] another weakness of the original FSB compression function
lies in the way the input to the compression function is handled. In particular,
the chaining bits (or IV) and the message bits are simply concatenated, and
no mixing whatsoever is applied. When using regular words, this means that
the output of the compression function is simply the XOR of two independent
hashes: one resulting from the IV bits, the other one from the message bits. If
one can find a collision on the message part of the compression function (this will
be somewhat harder than a normal collision as fewer input bits are available), then
this collision is IV independent. This has no influence on the collision resistance
of the function, but it is a problem when using the hash function as a MAC or
as a PRF for example: the resistance to some attacks falls from the cost of an
inversion (or second preimage) to the cost of building a message only collision
(which will probably be just above the cost for building a standard collision).
In order to avoid such problems, the best thing would be to mix the input bits
through a diffusion function. However, such a mixing is quite costly and would
severely reduce the throughput of the hash function. The best solution is thus
probably to integrate this diffusion in the constant weight encoding function. As
stated in Section 5.1, a simple interleaving of the message bits with the IV bits
is enough to avoid this problem.

4.2 Final Compression Function


Another issue with the FSB compression function is the ratio between the security against collisions and the output size of the function. A hash function is expected to have a security of 2^(r/2) against collisions if it outputs r-bit hashes. With FSB, we have seen in Section 3.2 that if our compression function is to compress, an attack in 2^(r/4) will always be possible. The solution is thus to use a final compression function: use FSB to hash the message into a large hash, and then use a final compression function g to reduce the size of the hash from r to r′ bits, where r′ is twice the bit security of FSB against collisions. However, finding a suitable function g is not straightforward:
finding a suitable function g is not straight-forward:
– If g is collision resistant, then using the FSB hash and then g will lead to a collision resistant hash function. However, even if g is not collision resistant, building a collision on the complete hash from a collision on g will not be easy: it requires inverting FSB. So requiring collision resistance for g is clearly too much.
– If g is a linear function, then g can be applied to all the columns of H and finding a collision on the whole function will only require finding a collision on a matrix H′ of size r′ × n. Thus the security against collisions will be less than 2^(r′/4).
Apart from this, it is hard to state anything relevant about the final compression function, and we believe that most non-linear compression functions could do the trick. However, as far as provable security is concerned, choosing a provably collision resistant function g is probably the only choice at the moment.
5 Possible Candidates
5.1 Constant Weight Encoding
There are many ways to perform constant weight encoding, spanning from the
one to one encoding where all words of weight w are equiprobable, to the regular
word encoding. The first one is the most bit efficient (the compression function
will have the largest possible input for some given parameters n and w), the
second one is the fastest. When dealing with hash functions, speed is usually
a very important factor and fast constant weight encoding would be a natural
choice, however, concerning security, all results on the hardness of syndrome
decoding consider random words of weight w, not regular words (or words with
any other structure). Luckily, when looking for collisions, a collision for any
given constant weight encoding is also a collision for the one to one encoding:
any pair of words of weight w (even with a strong structure) can be coded with
the one to one equiprobable encoding. Thus, finding collisions for FSB using
regular words cannot be easier than finding collisions for FSB using a more bit
efficient encoding.
However, no proof can be given that finding collisions for regular words is
indeed harder than with the one to one equiprobable encoding. Thus, when
choosing parameters for FSB, we will consider the security of FSB with one to
one constant weight encoding, even if a faster encoding is used in practice.
The conclusion of this is that using regular word encoding is certainly the
best choice for efficiency. However, as seen in Section 4.1, using such an encoding
causes IV weakness issues. In order to avoid these issues it is necessary that every
index of a non-zero bit of the constant weight word depends on the value of
both the IV and the message. This way, no IV independent collision can be built.
Interleaving the bits coming from the IV (or chaining value) with those of the
message is thus a solution. Depending on the parameters chosen for the function,
different interleavings will be possible.

5.2 Matrix Choice


The choice of the matrix H is also very important for the efficiency of FSB.
Of course, for optimal security, nothing can be better than a truly random ma-
trix, but in this case the description of the hash function will be very large and
will not be suitable for memory constraint devices and for most hardware im-
plementations. Thus, using matrices that can be described with a single line is
important if FSB is ever to be used in practice.
The results of Gaborit and Zémor [8] tend to prove that well chosen quasi-
cyclic codes have good properties making them suitable candidates. However,
this requires that r is a prime number, which will certainly make implementation
less efficient than using powers of two. Our idea is thus to use a truncated quasi-
cyclic matrix instead of a standard quasi-cyclic matrix.
Definition 3. An r × n matrix H is a truncated quasi-cyclic matrix if it can be divided in n/r blocks of size r × r, and each block is the top left sub-block of a p × p cyclic block.
With this definition, any matrix H built of blocks which are Toeplitz matrices will be a truncated quasi-cyclic matrix with p > 2r, but in order to be as close as possible to standard quasi-cyclic matrices, we will always choose r very close to p. Then, the description of the r × n matrix H can be reduced to a "first line" of (n/r) × p bits and the values of p and r.
As explained in [8], in order for p to be a suitable choice it must be prime, and 2 must be a generator of GF(p). Hence, it is easy to find the best p for a given r: one simply needs to test the primes greater than r one by one until 2 is a generator. For example, for r = 512 we get p = 523, for r = 768 we get p = 773, and for r = 1024 we get p = 1061.
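This search is easy to script. The sketch below (illustrative only; trial division is fine at these sizes) reproduces the values quoted above:

```python
def suitable_p(r):
    """Smallest prime p > r such that 2 is a generator of GF(p), i.e.
    2^((p-1)/q) != 1 (mod p) for every prime q dividing p - 1."""
    def is_prime(m):
        d = 2
        while d * d <= m:
            if m % d == 0:
                return False
            d += 1
        return m > 1

    def prime_factors(m):
        fs, d = set(), 2
        while d * d <= m:
            while m % d == 0:
                fs.add(d)
                m //= d
            d += 1
        if m > 1:
            fs.add(m)
        return fs

    p = r + 1
    while True:
        if is_prime(p) and all(pow(2, (p - 1) // q, p) != 1
                               for q in prime_factors(p - 1)):
            return p
        p += 1

assert (suitable_p(512), suitable_p(768), suitable_p(1024)) == (523, 773, 1061)
```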

5.3 Choosing Parameters


When choosing parameters we want Wagner's attack to be the most efficient one, so that we precisely control the security of FSB. As seen in Section 3.2, this security only depends on the output size r and the input size s of the compression function. As stated in Section 5.1, we want to measure the security of the construction when using a one to one constant weight encoding, which means s is not the effective number of bits that the compression function will read, but s = log2 C(n, w) (with C(n, w) the binomial coefficient). Once r and s are chosen to obtain the desired security level, one simply needs to select a convenient value for w (and deduce n), such that linearization attacks are impossible.
Concerning previously proposed parameters, all those proposed in [6] use a quasi-cyclic matrix whose cyclicity is a power of 2 and are thus all subject to the attack of Section 3.4: none of these parameters should be used. Three sets of parameters were proposed in the original paper [1]:
– Short hash has its security reduced to 2^72.2, as the security gained from using regular words is no longer taken into account.
– Fast hash is subject to the extended linearization attack and has its security reduced to 2^59.9.
– The Intermediate proposal however still has a security above 2^80 and can still be considered safe.

Parameters for 80-bit Security. Choosing r = 512 and a security of 2^80 against collisions, we get from Equation (1) that s ≤ 1688. Now, to avoid linearization attacks we need w ≤ r/4 = 128. If we choose w = 128, we get for n = 2^18 a value s = 1587, which is suitable. Our first proposition is thus to use:

    r = 512,  n = 2^18,  w = 128,

with regular word encoding and a truncated quasi-cyclic matrix with p = 523. For the IV interleaving, each of the w positions is coded by 11 input bits, 4 of which are taken from the IV and the rest from the message. With these parameters FSB reads input blocks of 896 bits and outputs 512 bits. These bits can then be compressed to 160 bits using a suitable final compression function. The matrix H is described by 267 776 bits (∼32.7 kB).
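These figures can be double-checked with a few lines (an illustrative verification; `math.comb` gives the exact binomial coefficient):

```python
import math

# Check of the proposed 80-bit parameters: r = 512, n = 2^18, w = 128, p = 523.
r, n, w, p = 512, 2 ** 18, 128, 523

s = math.comb(n, w).bit_length() - 1   # floor(log2 C(n, w))
assert s == 1587 and s <= 1688         # matches the text and the Equation (1) bound
assert w <= r // 4                     # linearization attacks avoided

size_bits = (n // r) * p               # description of the truncated matrix
assert size_bits == 267776             # 33 472 bytes, about 32.7 kB
```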
Parameters for 128-bit Security. For 128-bit security we need r larger than 512. We can use r = 768 and obtain s ≤ 2048. If we pick w = 192 and n = 3 × 2^15, we get s = 1999 and linearization attacks are impossible. Our proposition is to use:

    r = 768,  n = 3 × 2^15,  w = 192,

with regular word encoding and a truncated quasi-cyclic matrix with p = 773. Each position is coded using 9 input bits, so the IV interleaving will take 4 bits from the IV and 5 bits from the message each time. FSB thus reads input blocks of 960 bits and outputs 768 bits which, at the end, need to be compressed to 256 bits. The matrix H is described by 98 944 bits (∼12 kB).
The same parameters, using a shorter n = 3 × 2^14, will probably be more efficient, as each position will be coded with 8 bits, 4 from the IV and 4 from the message, even if only 768-bit blocks are read instead of 960-bit blocks. Moreover, it will have a shorter description (∼6 kB) and the security against collisions will be a little higher (about 2^133).

6 Conclusion

Taking into account all the different attacks against FSB, it is still possible to select parameters that offer both a high level of security (relying on well identified problems) and a satisfying efficiency. Also, apart from the choice of the final compression function, the other choices that had to be made for FSB seem clear: use regular word encoding (with IV interleaving) and a truncated quasi-cyclic matrix. For the final compression function, using a provably secure pseudo-random generator could be a good choice: use the output of FSB as an IV and generate the desired number of bits of output. One could then use the Blum-Blum-Shub generator [3] or, preferably for post-quantum security, QUAD [2].

References

1. Augot, D., Finiasz, M., Sendrier, N.: A family of fast syndrome based cryptographic
hash functions. In: Dawson, E., Vaudenay, S. (eds.) Mycrypt 2005. LNCS, vol. 3715,
pp. 64–83. Springer, Heidelberg (2005)
2. Berbain, C., Gilbert, H., Patarin, J.: QUAD: a practical stream cipher with prov-
able security. In: Vaudenay, S. (ed.) EUROCRYPT 2006. LNCS, vol. 4004, pp.
109–128. Springer, Heidelberg (2006)
3. Blum, L., Blum, M., Shub, M.: Comparison of two pseudo-random number gen-
erators. In: Chaum, D., Rivest, R.L., Sherman, A. (eds.) Crypto 1982, pp. 61–78.
Plenum (1983)
4. Canteaut, A., Chabaud, F.: A new algorithm for finding minimum-weight words in
a linear code: Application to McEliece’s cryptosystem and to narrow-sense BCH
codes of length 511. IEEE Transactions on Information Theory 44(1), 367–378
(1998)
5. Coron, J.-S., Joux, A.: Cryptanalysis of a provably secure cryptographic hash func-
tion. IACR eprint archive (2004), [Link]
6. Finiasz, M., Gaborit, P., Sendrier, N.: Improved fast syndrome based cryptographic
hash functions. In: Rijmen, V. (ed.) ECRYPT Workshop on Hash Functions (2007)
7. Fouque, P.-A., Leurent, G.: Cryptanalysis of a hash function based on quasi-cyclic
codes. In: Malkin, T. (ed.) CT-RSA 2008. LNCS, vol. 4964, pp. 19–35. Springer,
Heidelberg (2008)
8. Gaborit, P., Zémor, G.: Asymptotic improvement of the Gilbert-Varshamov bound
for linear codes. In: IEEE Conference, ISIT 2006, pp. 287–291 (2006)
9. Saarinen, M.-J.O.: Linearization attacks against syndrome based hashes. In: Sri-
nathan, K., Rangan, C.P., Yung, M. (eds.) INDOCRYPT 2007. LNCS, vol. 4859,
pp. 1–9. Springer, Heidelberg (2007)
10. Wagner, D.: A generalized birthday problem. In: Yung, M. (ed.) CRYPTO 2002.
LNCS, vol. 2442, pp. 288–304. Springer, Heidelberg (2002)
Nonlinear Piece In Hand Perturbation Vector
Method for Enhancing Security of Multivariate
Public Key Cryptosystems

Ryou Fujita¹, Kohtaro Tadaki², and Shigeo Tsujii¹

¹ Institute of Information Security
2–14–1 Tsuruya-cho, Kanagawa-ku, Yokohama-shi, 221–0835 Japan
² Research and Development Initiative, Chuo University
1–13–27 Kasuga, Bunkyo-ku, Tokyo, 112–8551 Japan

Abstract. The piece in hand (PH) is a general scheme which is applicable to any reasonable type of multivariate public key cryptosystem for the purpose of enhancing its security. In this paper, we propose a new class of PH method called the NLPHPV (NonLinear Piece in Hand Perturbation Vector) method. Although our NLPHPV uses perturbation vectors similar to those used in the previously known internal perturbation method, this new method can avoid redundant repetitions in the decryption process. With properly chosen parameter sizes, NLPHPV achieves an observable gain in security over the original multivariate public key cryptosystem. We demonstrate this by both theoretical analyses and computer simulations against major known attacks, and provide concrete sizes of security parameters, with which we expect even greater security against potential quantum attacks.

Keywords: public key cryptosystem, multivariate polynomial, multivariate public key cryptosystem, piece in hand concept, perturbation vector.

1 Introduction
Multivariate Public Key Cryptosystems (MPKCs, for short) were originally proposed in the 80's as possible alternatives to the traditional, widely-used public key cryptosystems, such as the RSA and ElGamal cryptosystems. One of the motivations for researching MPKCs is that the public key cryptosystems based on the intractability of prime factorization or the discrete logarithm problem are presently assumed to be secure, but their security will not be guaranteed in the quantum computer age. On the other hand, no quantum algorithm is known so far to be able to solve efficiently the underlying problem of MPKCs, i.e., the problem of solving a set of multivariate quadratic or higher degree polynomial equations over a finite field.
Since the original research of MPKCs was started, many new schemes have
been proposed so far. At the same time, many new methods to cryptanalyze

J. Buchmann and J. Ding (Eds.): PQCrypto 2008, LNCS 5299, pp. 148–164, 2008.
© Springer-Verlag Berlin Heidelberg 2008
MPKCs have also been discovered. Recently, for the purpose of resisting these
attacks, the research on the method for enhancing security of MPKCs is becom-
ing one of the main themes of this area. The piece in hand (PH, for short) matrix
method aims to bring the computational complexity of cryptanalysis close to ex-
ponential time by adding random polynomial terms to original MPKC. The PH
methods were introduced and studied in a series of papers [27, 28, 29, 30, 31, 32,
33, 34]. Among them, there are primary two types of the PH matrix methods;
the linear PH matrix methods and the nonlinear PH matrix methods. In par-
ticular, the papers [31, 32, 33, 34] proposed the linear PH matrix method with
random variables and the nonlinear PH matrix method, and showed that these
PH matrix methods lead to the substantial gain in security against the Gröbner
basis attack under computer experiments.
Because of the nonlinearity of the PH matrix, the nonlinear PH matrix meth-
ods are expected to enhance the security of the original MPKC more than the lin-
ear PH matrix methods in general. Thus, in the present paper, we propose a new
PH method, called NonLinear Piece in Hand Perturbation Vector (NLPHPV, for
short) method, which can be applied to both encryption schemes and signature
schemes in general.1 The adopted application of perturbation vectors is similar to the internal perturbation method [3] and the construction of R-SE(2)PKC [13], where a random transformation is mixed with the "non-singular" transformation. In particular, for the internal perturbation method, the computational complexity of the Gröbner basis attack is reported in [5]; the paper showed that when r is not too small (i.e., r ≥ 6), the perturbed Matsumoto-Imai cryptosystem [3] is secure against the Gröbner basis attack, where r is the perturbation dimension. Note, however, that in exchange for enhancing the security, the decryption process of the internal perturbation method becomes q^r times slower than the unperturbed one, where q is the number of field elements. This fact contrasts with our NLPHPV method in the sense that it does not require a number of repetitions of the decryption process which grows exponentially, though the cipher text size becomes slightly larger. From this point of view of efficiency, the NLPHPV method can be a good alternative to the internal perturbation method. We also discuss the security benefit of the NLPHPV method against major known attacks, i.e., the Gröbner basis attack, the rank attack [37], and the differential attack [9]. Based also on our security considerations, we suggest concrete parameter sizes for the NLPHPV method.
This paper is organized as follows. We begin in Section 2 with some basic
notation and a brief introduction of the schemes of MPKCs in general. We in-
troduce the NLPHPV method in Section 3. We then show, based on computer
experiments, that the NLPHPV method properly provides substantial security
against the Gröbner basis attack in Section 4. We discuss the immunity of the
NLPHPV method against known attacks in Section 5. Based on the discussion,
we suggest parameters for the NLPHPV method in Section 6. We conclude this
paper with the future direction of our work in Section 7.

1 In signature schemes, the parameters of the NLPHPV method are restricted to some region. We will deal with this issue in Section 3 and Subsection 5.2.
2 Preliminaries
In this section we review the schemes of MPKCs in general after introducing
some notations about fields, polynomials, and matrices.

2.1 Notations
We represent a column vector in general by bold face symbols such as p, E, and X.
– Fq: finite field which has q elements, with q ≥ 2.
– Fq[x1, . . . , xk]: set of all polynomials in variables x1, x2, . . . , xk with coefficients in Fq.
– S^{n×l}: set of all n × l matrices whose entries are in a nonempty set S, with positive integers n and l. Let S^{n×1} = S^n.
– S^n: set of all column vectors consisting of n entries in S.
– A^T ∈ S^{l×n}: transpose of A for a matrix A ∈ S^{n×l}.
– f(g) = (h1, . . . , hn)^T ∈ Fq[x1, . . . , xm]^n: substitution of g for the variables in f, where f = (f1, . . . , fn)^T ∈ Fq[x1, . . . , xk]^n and g = (g1, . . . , gk)^T ∈ Fq[x1, . . . , xm]^k are polynomial column vectors. Each hi is the polynomial in Fq[x1, . . . , xm] obtained by substituting g1, . . . , gk for the variables x1, . . . , xk in fi, respectively.
– f(p) ∈ Fq^n: vector obtained by substituting p1, . . . , pk for the variables x1, . . . , xk in f, respectively, for f ∈ Fq[x1, . . . , xk]^n and p ∈ Fq^k, where p = (p1, . . . , pk)^T with p1, . . . , pk ∈ Fq.

2.2 MPKCs in General


A MPKC as in [3, 12, 13, 14, 17, 18, 19, 21, 25, 26, 36, 38] are often made by
the following building blocks:
Secret key: The secret key includes the following:
– the two invertible matrices A0 ∈ Fq k×k , B0 ∈ Fq n×n ;
– the polynomial transformation G ∈ Fq [x1 , . . . , xk ]n whose inverse is ef-
ficiently computable.
Public key: The public key includes the following:
– the finite field Fq including its additive and multiplicative structure;
– the polynomial vector E = B0 G(A0 x) ∈ Fq [x1 , . . . , xk ]n , where x =
(x1 , . . . , xk )T ∈ Fq [x1 , . . . , xk ]k .
Encryption: Given a plain text vector p = (p1 , . . . , pk )T ∈ Fq k , the corre-
sponding cipher text is the vector c = E(p) .
Decryption: Given the cipher text vector c = (c1 , . . . , cn )T ∈ Fq n , decryption
includes the following steps:
(i) Compute w = B0^{−1} c ∈ Fq^n,
(ii) Compute v ∈ Fq^k from w by using the inverse transformation of G,
(iii) Compute p = A0^{−1} v ∈ Fq^k.
Nonlinear Piece In Hand Perturbation Vector Method 151

[Figure 1 depicts the scheme: the plain text p ∈ Fq^k is mapped by the secret key A0 to v = A0 p, by the secret transformation G to w = G(v), and by the secret key B0 to the cipher text c = B0 w ∈ Fq^n; the composition E = B0 G(A0 x) is the public key.]

Fig. 1. Scheme of Multivariate Public Key Cryptosystem
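To make the generic scheme concrete, the following Python sketch instantiates it over F2 with small toy parameters. All concrete choices are illustrative assumptions only: the field, the dimensions, and the triangular quadratic map standing in for G, chosen so that the inverse of G is easy to compute.

```python
# Toy instance of the generic MPKC of Subsection 2.2 over F_2 (illustration
# only; real parameters are far larger, and G is an arbitrary invertible map).
import numpy as np

rng = np.random.default_rng(1)
k = n = 6  # plain text and cipher text lengths (k = n here for simplicity)

def gf2_inv(M):
    """Invert a matrix over F_2 by Gauss-Jordan elimination."""
    m = M.shape[0]
    A = np.concatenate([M % 2, np.eye(m, dtype=int)], axis=1)
    for col in range(m):
        piv = next(r for r in range(col, m) if A[r, col])
        A[[col, piv]] = A[[piv, col]]
        for r in range(m):
            if r != col and A[r, col]:
                A[r] = (A[r] + A[col]) % 2
    return A[:, m:]

def random_invertible(m):
    while True:
        M = rng.integers(0, 2, (m, m))
        try:
            gf2_inv(M)
            return M
        except StopIteration:  # singular matrix: no pivot found, retry
            continue

# Secret key: invertible A0, B0 and a triangular quadratic map G whose
# inverse is computed by forward substitution (y_i = x_i + x_{i-1} x_{i-2}).
A0, B0 = random_invertible(k), random_invertible(n)
A0_inv, B0_inv = gf2_inv(A0), gf2_inv(B0)

def G(x):
    y = x.copy()
    for i in range(2, k):
        y[i] = (x[i] + x[i - 1] * x[i - 2]) % 2
    return y

def G_inv(y):
    x = y.copy()
    for i in range(2, k):
        x[i] = (y[i] + x[i - 1] * x[i - 2]) % 2
    return x

def encrypt(p):            # public map E(p) = B0 G(A0 p), evaluated at p
    return B0 @ G(A0 @ p % 2) % 2

def decrypt(c):            # steps (i)-(iii) of Subsection 2.2
    w = B0_inv @ c % 2     # (i)
    v = G_inv(w)           # (ii)
    return A0_inv @ v % 2  # (iii)

p = rng.integers(0, 2, k)
assert np.array_equal(decrypt(encrypt(p)), p)
```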

3 Nonlinear Piece In Hand Perturbation Vector (NLPHPV) Method
Let K be an arbitrary MPKC whose public key polynomial vector is given by
E ∈ Fq[x1, . . . , xk]^n, as described in Subsection 2.2. Let f, l, and h be any
positive integers, and set g = n + l + h. Let p and z be any positive integers with
p ≤ k ≤ z, and let t be any nonnegative integer with t ≤ z − p. The relation
between these parameters and the correspondence to plain text and random number
are given in Figure 2.

[Figure 2 relates the parameters: in the original MPKC K, the k plain text variables split into p plain text variables and k − p entries Aµ; in the NLPHPV scheme, the z variables split into the p plain text variables x̃ and the z − p random variables λ, of which the first t form µ. Correspondingly, a plain text p ∈ Fq^p and a random number u = (u1, u2) with u1 ∈ Fq^t are mapped to (p, y) with y = A u1 ∈ Fq^{k−p}.]

Fig. 2. Plain text and random number

Let A ∈ Fq^{(k−p)×t} and C ∈ Fq^{f×z} be randomly chosen matrices. Let r ∈
Fq[x1, . . . , xz]^h be a randomly chosen polynomial vector. In the NLPHPV method,
a new MPKC K̃ is constructed from the given MPKC K for the purpose of
enhancing the security. A public key Ẽ ∈ Fq[x1, . . . , xz]^g of K̃ is constructed
from the original public key E of K.
Secret key: The secret key includes the following:
– the secret key of K;
– a randomly chosen invertible matrix B ∈ Fq^{g×g};
152 R. Fujita, K. Tadaki, and S. Tsujii

– a polynomial transformation H ∈ Fq[x1, . . . , xf]^l whose inverse is efficiently computable;
– the nonlinear piece in hand perturbation vector Q ∈ Fq[x1, . . . , xf]^n, which is randomly chosen.
Public key: The public key includes the following:
– the finite field Fq including its additive and multiplicative structure;
– the number p of plain text variables in the NLPHPV method;
– the polynomial vector Ẽ ∈ Fq[x1, . . . , xz]^g. Ẽ is constructed by the following equation:

$$\tilde{E} \;\overset{\mathrm{def}}{=}\; B \begin{pmatrix} E\binom{\tilde{x}}{A\mu} + Q(f) \\ H(f) \\ r \end{pmatrix}. \qquad (1)$$

Here x̃ = (x1, . . . , xp)^T ∈ Fq[x1, . . . , xp]^p, µ = (xp+1, . . . , xp+t)^T ∈ Fq[xp+1, . . . , xp+t]^t, λ = (xp+1, . . . , xz)^T ∈ Fq[xp+1, . . . , xz]^{z−p}, and f = (f1, . . . , ff)^T = $C\binom{\tilde{x}}{\lambda}$ ∈ Fq[x1, . . . , xz]^f. Note that, in the right-hand side of (1), the vector Aµ ∈ Fq[xp+1, . . . , xp+t]^{k−p} is substituted for the variables xp+1, . . . , xk in the original public key E, while the variables x1, . . . , xp in E are kept unchanged. Q(f) plays a role in masking the original public key E and randomizing it. r is appended to the polynomial set in order to cope with the differential attack [6, 9].
Note that the t random variables xp+1, . . . , xp+t in µ included in Ẽ are chosen from the z − p random variables xp+1, . . . , xz in λ. Then, increasing the value $\binom{z-p}{t}$ makes these random variables indistinguishable.
 
Remark 1. We may replace $E\binom{\tilde{x}}{A\mu}$ in (1) with $E\bigl(D\binom{\tilde{x}}{\mu}\bigr)$ in a more general form. Here D ∈ Fq^{k×(p+t)} is a randomly chosen matrix such that, for any p, p′ ∈ Fq^p and any u1, u1′ ∈ Fq^t, if $D\binom{p}{u_1} = D\binom{p'}{u_1'}$, then p = p′. This condition on D is needed to recover the plain text uniquely. However, D can be rewritten as $D = U \begin{pmatrix} I_p & 0 \\ 0 & A \end{pmatrix}$ for some invertible matrix U ∈ Fq^{k×k}. Thus, the transformation $E\binom{\tilde{x}}{A\mu}$ is equivalent to $E\bigl(D\binom{\tilde{x}}{\mu}\bigr)$, since A0 is randomly chosen in the original MPKC K.
In the signature scheme, the requirement of uniqueness in decryption is removed. Thus, the matrix D can be randomly chosen, and the distinction between plain text and random variables is also removed.

Encryption: Given a plain text vector p = (p1, . . . , pp)^T ∈ Fq^p and a random number u = (u1, . . . , u_{z−p})^T ∈ Fq^{z−p}, the corresponding cipher text is the vector $\tilde{c} = \tilde{E}\binom{p}{u}$.

Decryption: Given the cipher text vector c̃ = (c̃1, . . . , c̃g)^T ∈ Fq^g, decryption includes the following steps:
(i) Compute B^{−1} c̃. By (1), we see that

$$B^{-1}\tilde{c} = \begin{pmatrix} E\binom{p}{y} + Q(f(z)) \\ H(f(z)) \\ r(z) \end{pmatrix},$$

where $z = \binom{p}{u} \in F_q^z$, $u = \binom{u_1}{u_2} \in F_q^{z-p}$, u1 ∈ Fq^t, u2 ∈ Fq^{z−p−t}, and $y \overset{\mathrm{def}}{=} A u_1 \in F_q^{k-p}$.
(ii) Compute f(z) from the value H(f(z)) by using the inverse transformation of H.
(iii) Compute Q(f(z)) by substituting f(z) into Q.
(iv) Compute $E\binom{p}{y}$ from the value $E\binom{p}{y} + Q(f(z))$.
(v) Compute $\binom{p}{y}$ by using the secret key of K. Note that y is discarded after the decryption.
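As a concrete illustration, the following toy Python sketch evaluates the construction (1) at a point and carries out decryption steps (i)-(iv), which reduce decryption of K̃ to decryption of the original MPKC K; step (v) is omitted, since it needs the secret key of K. All concrete choices are illustrative assumptions: the field is F2, E, Q, and r are random quadratic maps, and H is taken to be linear invertible for simplicity, although the method allows any H with an efficiently computable inverse.

```python
# Toy numerical sketch of construction (1) and decryption steps (i)-(iv)
# over F_2. H is a *linear* invertible stand-in; E, Q, r are random
# quadratic maps standing in for the real components.
import numpy as np

rng = np.random.default_rng(7)
p_len, t, z, k, n, f_dim, l, h = 3, 2, 8, 5, 4, 3, 3, 2
g = n + l + h

def gf2_inv(M):
    m = M.shape[0]
    A = np.concatenate([M % 2, np.eye(m, dtype=int)], axis=1)
    for col in range(m):
        piv = next(r for r in range(col, m) if A[r, col])
        A[[col, piv]] = A[[piv, col]]
        for r in range(m):
            if r != col and A[r, col]:
                A[r] = (A[r] + A[col]) % 2
    return A[:, m:]

def random_invertible(m):
    while True:
        M = rng.integers(0, 2, (m, m))
        try:
            gf2_inv(M)
            return M
        except StopIteration:
            pass

def random_quadratic(n_in, n_out):
    """A random quadratic map F_2^n_in -> F_2^n_out, as a closure."""
    M = rng.integers(0, 2, (n_out, n_in, n_in))
    L = rng.integers(0, 2, (n_out, n_in))
    return lambda x: (np.einsum('oij,i,j->o', M, x, x) + L @ x) % 2

E = random_quadratic(k, n)      # stands in for the public key of K
Q = random_quadratic(f_dim, n)  # perturbation vector
r = random_quadratic(z, h)      # polynomials appended against differentials
H = random_invertible(f_dim)    # linear stand-in for H
H_inv = gf2_inv(H)
A = rng.integers(0, 2, (k - p_len, t))
C = rng.integers(0, 2, (f_dim, z))
B = random_invertible(g)
B_inv = gf2_inv(B)

def encrypt(p, u):              # evaluate (1) at the point (p, u)
    zv = np.concatenate([p, u])
    fv = C @ zv % 2
    y = A @ u[:t] % 2           # y = A u1
    top = (E(np.concatenate([p, y])) + Q(fv)) % 2
    return B @ np.concatenate([top, H @ fv % 2, r(zv)]) % 2

def recover_E_value(c):         # decryption steps (i)-(iv)
    w = B_inv @ c % 2           # (i)
    fv = H_inv @ w[n:n + l] % 2 # (ii)
    return (w[:n] + Q(fv)) % 2  # (iii)+(iv): the value E(p, y)

p = rng.integers(0, 2, p_len)
u = rng.integers(0, 2, z - p_len)
y = A @ u[:t] % 2
# Decryption of the enhanced scheme reduces to inverting E of K:
assert np.array_equal(recover_E_value(encrypt(p, u)),
                      E(np.concatenate([p, y])))
```

The final assertion checks that steps (i)-(iv) recover the value E(p, y), which is exactly the point at which step (v), the original MPKC's decryption, would take over.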
In the signature scheme, the matrices A and C are included in the secret key, and it is necessary to compute u by solving the linear equation

$$\begin{pmatrix} 0 \;\; A \;\; 0 \\ C \end{pmatrix} \binom{p}{\lambda} = \binom{y}{f(z)}$$

for the unknown λ, and to check whether $r\binom{p}{u} = r(z)$ holds for the solution $\binom{p}{u}$, where r(z) is the value given above.² Since the probability that $\binom{p}{u}$ satisfies this criterion is 1/q^h on average, h must be as small as possible in the signature scheme.

[Figure 3 shows the data flow: the plain text p ∈ Fq^p and the random number u ∈ Fq^{z−p} form $\binom{p}{u}$ ∈ Fq^z, which the public key Ẽ maps to the cipher text c̃ ∈ Fq^g; decryption goes back through c ∈ Fq^n and $\binom{p}{y}$ ∈ Fq^k by the secret key computation and the secret key of the original MPKC, with E the public key of the original MPKC.]

Fig. 3. NonLinear Piece in Hand Perturbation Vector method

The encryption and decryption processes in the NLPHPV method are schematically represented in Figure 3.

4 Experimental Results

In this section, based on computer experiments, we clarify the enhancement of
the security by the NLPHPV method proposed in the previous section.
Recently, Faugère and Joux [8] showed in an experimental manner that computing a Gröbner basis (GB, for short) of the public key is likely to be an efficient
attack against HFE [21], which is one of the major MPKCs. In fact, they broke the first
HFE challenge (80 bits) proposed by Patarin. Their attack computes a Gröbner basis for the ideal generated by the polynomial components of
E − c, where E is a public key and c is a cipher text vector.
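The attack model can be sketched as follows. A full Gröbner basis engine is beyond the scope of a sketch, so exhaustive search over F2^k stands in here for the GB computation on the ideal generated by E − c (together with the field equations x_i² − x_i); the toy public key below is a random quadratic map, not a real HFE or MI instance.

```python
# Attack model behind the GB attack: given the public quadratic map E and a
# cipher text c = E(p), recover p by solving E(x) = c over F_2. A real attack
# computes a Groebner basis of the ideal <E - c>; this runnable stand-in
# solves the same system by exhaustive search.
import itertools
import numpy as np

rng = np.random.default_rng(3)
k = 8

M = rng.integers(0, 2, (k, k, k))  # a toy quadratic public key over F_2
L = rng.integers(0, 2, (k, k))

def E(x):
    x = np.asarray(x)
    return (np.einsum('oij,i,j->o', M, x, x) + L @ x) % 2

def gb_attack_model(c):
    """Return all x with E(x) = c (what the GB of <E - c> encodes)."""
    return [x for x in itertools.product([0, 1], repeat=k)
            if np.array_equal(E(x), c)]

p = rng.integers(0, 2, k)
solutions = gb_attack_model(E(p))
assert tuple(p) in solutions  # the plain text is among the solutions
```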

Table 1. Computational times of the GB attack for PMI+

 k   r   a   Computational times in seconds
 28  6   0    845
 28  6   5    733
 28  6  10    563
 28  6  15    436
 29  6  15    747
 30  6  15   1305

k: number of plain text variables; r: perturbation dimension; a: number of Plus polynomials

Table 2. Computational times of the GB attack for the enhanced MI by the NLPHPV method

 k   l   h   Computational times in seconds
 28  17  3    290
 28  17  4    289
 28  17  5    263
 29  17  3    537
 29  17  8    402
 29  17 10    349
 30  17  3    936
 30  17  8    701
 30  17 13    513

We report in Table 1 and Table 2 the time required for the GB attack against
the perturbed Matsumoto-Imai-Plus cryptosystem (PMI+, for short) [6] and the
Matsumoto-Imai cryptosystem (MI, for short) [18] enhanced by the NLPHPV
method. Note that n = k and q = 2 for the public keys E ∈ Fq [x1 , . . . , xk ]n
of MI by its specification. We deal with the case of p = z = k, f = l in the NLPHPV method.² As a practical example of the polynomial transformation H in the NLPHPV method, we use the public key polynomials of HFE in which the degree of the secret univariate polynomial is more than 128,³ though we can choose any H. The computation times are evaluated on a PROSIDE edAEW416R2 workstation with AMD Opteron Model 854 processors at 2.80 GHz and 64 GB of RAM. We use the algorithm F4 implemented in the computational algebra system Magma V2.12-21. In Table 1 and Table 2, due to the constraint of computing ability, only the cases of k = 28, 29, 30 are computed.

² When the matrix D of Remark 1 is randomly chosen, the equation is replaced with $\begin{pmatrix} D \;\; 0 \\ C \end{pmatrix}\binom{\tilde{x}}{\lambda} = \begin{pmatrix} D\binom{p}{u_1} \\ f(z) \end{pmatrix}$ for the unknowns x̃ and λ.
Since cryptanalysis of MI may have polynomial time complexity of about O(k^7),
as shown in [5] and by our preliminary experimental results, it is quite difficult at
present to compare MI with the enhanced MI at a practical plain text length
such as 200 bits. If we could experimentally cryptanalyze the MI enhanced by
the NLPHPV method at a practical plain text length in order to compare
it with the original MI, then this would imply that the cryptosystem enhanced by
the NLPHPV method is useless in itself. This is a limitation and dilemma of
security evaluation by computer experiments. On the other hand, another of our
computer experiments with the same facilities shows that it takes about 0.07 seconds
to cryptanalyze the plain MI with k = 30 by the GB attack. Since the plain
MI with k = 30 was cryptanalyzed within about 0.07 seconds under our environment,
it is estimated that the perturbation by the internal perturbation method or by NLPHPV
increases the F4 time complexity by a factor of about 10^4. This shows that the
internal perturbation method and the NLPHPV method enhance the security of
MI against the GB attack.
We now consider the applicability of the internal perturbation method and the
NLPHPV method. The internal perturbation method requires q^r times the decryption complexity of the original MPKC. On the other hand, the NLPHPV method
requires at most a few times the decryption complexity of the original MPKC, regardless of the value of q. Although applying the NLPHPV method increases
the cipher text size, in terms of decryption time the NLPHPV
method seems to be a viable alternative to the internal perturbation method
for enhancing security against the GB attack.
Remark 2. In the above, we dealt only with the case where no random variables
are introduced. For the purpose of enhancing the security further, it is possible
to introduce random variables. As shown in Appendix A, increasing the
number z − p of random variables xp+1, . . . , xz increases the time required for
the GB attack against the enhanced cryptosystem K̃ and provides substantial
security against the GB attack.

5 Discussion on Security
In this section, we discuss the security of the NLPHPV method against major
known attacks. The main purpose of this section is to delineate the secure parameter region of the NLPHPV method by both theoretical and experimental
observations.
³ The optimal choice of H is still open. We will clarify this point in future work.

5.1 GB Attack

As stated in the previous section, based on computer experiments, the NLPHPV
method provides substantial security and enhances the security of the
Matsumoto-Imai cryptosystem against the GB attack. For the case where the
original MPKC is other than the Matsumoto-Imai cryptosystem, and for the case of
the signature scheme, we will clarify the security against the GB
attack in the full version of this paper. A purely theoretical treatment of the
security also remains an issue for future work.

5.2 Rank Attack

In 2004 Wolf, Braeken, and Preneel [37] introduced an attack against a class of
MPKCs, called step-wise triangular schemes (STS, for short), based on rank
calculations on the public key (see also [1, 10, 23]). More recently,
Ito, Fukushima, and Kaneko [11] proposed an attack against the MPKC
obtained by applying the linear PH matrix method to the sequential solution
method as the original MPKC. Their attack makes use of an STS-like structure
of the MPKC.
In fact, the structure of the public key of the NLPHPV method can be seen as
a gSTS (general step-wise triangular structure) [37]. The detailed description is
given below. Let

$$A' = \begin{pmatrix} C \\ I_p \;\; 0 \\ 0 \;\; A \;\; 0 \\ R \end{pmatrix} \in F_q^{z \times z}$$

be an invertible matrix, where A, C are as in Section 3, Ip is the identity matrix in Fq^{p×p}, and R is a specific matrix in Fq^{(z−k−f)×z}. For A′, we define $x' = (x'_1, \ldots, x'_f, \ldots, x'_{f+k}, \ldots, x'_z)^T \overset{\mathrm{def}}{=} A'\binom{\tilde{x}}{\lambda}$, where x̃, λ are as in Section 3. Let $\mathbf{x}'_1 = (x'_1, \ldots, x'_f)^T$, $\mathbf{x}'_2 = (x'_{f+1}, \ldots, x'_{f+k})^T$, and $\mathbf{x}'_3 = (x'_{f+k+1}, \ldots, x'_z)^T$ be the parts of x′. Then $\mathbf{x}'_1 = C\binom{\tilde{x}}{\lambda}$ and $\mathbf{x}'_2 = \binom{\tilde{x}}{A\mu}$, where µ is as in Section 3. We denote H′ = (h′1, . . . , h′l)^T ∈ Fq[x1, . . . , xf]^l, Q′ = (q′1, . . . , q′n)^T ∈ Fq[x1, . . . , xf]^n, and E′ = (e′1, . . . , e′n)^T ∈ Fq[x1, . . . , xk]^n, where H, Q, and E are as in Section 3. By substituting $\mathbf{x}'_1$ for the variables in H′, we obtain $H'(\mathbf{x}'_1)$, which is equal to H(f) in (1). Similarly, $Q'(\mathbf{x}'_1)$ and $E'(\mathbf{x}'_2)$ are equal to Q(f) and $E\binom{\tilde{x}}{A\mu}$ in (1), respectively. We define $r' = (r'_1, \ldots, r'_h)^T \overset{\mathrm{def}}{=} r\bigl((A')^{-1} X\bigr) \in F_q[x_1, \ldots, x_z]^h$, where X = (x1, . . . , xz)^T ∈ Fq[x1, . . . , xz]^z and r is as in Section 3. Then $r'(x') = r\bigl((A')^{-1} A' \binom{\tilde{x}}{\lambda}\bigr) = r\binom{\tilde{x}}{\lambda} = r$.
Using $H'(\mathbf{x}'_1)$, $Q'(\mathbf{x}'_1)$, $E'(\mathbf{x}'_2)$, and $r'(x')$ above, we construct the gSTS corresponding to (1) as follows:
$$\text{Step 1}\;\begin{cases} y'_1 = h'_1(x'_1, \ldots, x'_f), \\ \quad\vdots \\ y'_l = h'_l(x'_1, \ldots, x'_f), \end{cases}$$

$$\text{Step 2}\;\begin{cases} y'_{l+1} = q'_1(x'_1, \ldots, x'_f) + e'_1(x'_{f+1}, \ldots, x'_{f+k}), \\ \quad\vdots \\ y'_{l+n} = q'_n(x'_1, \ldots, x'_f) + e'_n(x'_{f+1}, \ldots, x'_{f+k}), \end{cases} \qquad (2)$$

$$\text{Step 3}\;\begin{cases} y'_{l+n+1} = r'_1(x'_1, \ldots, x'_f, \ldots, x'_{f+k}, \ldots, x'_z), \\ \quad\vdots \\ y'_g = r'_h(x'_1, \ldots, x'_f, \ldots, x'_{f+k}, \ldots, x'_z). \end{cases}$$

We denote y′ = (y′1, . . . , y′g)^T. Then Ẽ = B y′, where Ẽ and B are as in Section 3.
In this gSTS, the number of layers is 3; the numbers of new variables (step-width) are f, k, and z − k − f; and the numbers of equations (step-height) are l, n, and h, respectively. This structure may introduce an undesirable vulnerability to the rank attack. In the following, we discuss the security of the NLPHPV method against two rank attacks: the high rank attack and the low rank attack.

High Rank Attack. In the high rank attack against the gSTS, to separate
the Step 3 part of (2) from the public key, the attacker searches for vectors
v = (v1, . . . , vg)^T ∈ Fq^g. Together, these vectors form an invertible matrix each of whose
rows is a row of the secret key B^{−1} or a linearly equivalent copy of it, since multiplying
the public key Ẽ by B^{−1} separates the layers. The attacker can find each of
the vectors v with probability 1/q^h by checking whether

$$\operatorname{rank}\Bigl(\sum_{i=1}^{g} v_i P_i\Bigr) \le f + k$$

holds for randomly chosen v1, . . . , vg ∈ Fq, where the Pi are the matrices of the quadratic forms of
the public key polynomial vector $\tilde{E} = (\tilde{e}_1, \ldots, \tilde{e}_g)^T = (X^T P_1 X, \ldots, X^T P_g X)^T$,
with X = (x1, . . . , xz)^T ∈ Fq[x1, . . . , xz]^z.
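The rank test at the core of this attack can be sketched as follows; the quadratic-form matrices Pi below are random stand-ins for a real public key, and the threshold plays the role of f + k, so the sketch shows the subroutine rather than a working break.

```python
# One trial of the high rank attack's test: draw v at random and check
# whether rank(sum_i v_i P_i) <= f + k over F_2. The P_i here are random
# matrices, standing in for the quadratic forms of a real public key.
import numpy as np

rng = np.random.default_rng(5)
z, g, threshold = 10, 6, 8   # threshold plays the role of f + k

def gf2_rank(M):
    """Rank of a matrix over F_2 by Gaussian elimination."""
    A = M.copy() % 2
    rank = 0
    for col in range(A.shape[1]):
        rows = [r for r in range(rank, A.shape[0]) if A[r, col]]
        if not rows:
            continue
        A[[rank, rows[0]]] = A[[rows[0], rank]]
        for r in range(A.shape[0]):
            if r != rank and A[r, col]:
                A[r] = (A[r] + A[rank]) % 2
        rank += 1
    return rank

P = rng.integers(0, 2, (g, z, z))   # quadratic-form matrices of the key

def high_rank_test(v):
    """Does the combination sum_i v_i P_i have low rank?"""
    return gf2_rank(np.tensordot(v, P, axes=1) % 2) <= threshold

v = rng.integers(0, 2, g)
result = high_rank_test(v)  # succeeds with probability ~1/q^h for a real key
```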

One simple countermeasure is to make the step-height of Step 3 thick,
i.e., to make the number h of polynomials in the randomly chosen polynomial
vector r of the NLPHPV method large. If q^h is large enough, the probability
1/q^h becomes negligible. However, a larger h reduces the efficiency of the cryptosystem in
the signature scheme, as mentioned in Section 3.
In the case that h is not too large, another countermeasure is to combine Step 2 with Step 3, i.e., to set f + k = z. Then, on both
Step 2 and Step 3 in (2), the rank is z = f + k, and the difference in
rank between these steps disappears. Moreover, combining Step 2 and Step
3 replaces the probability 1/q^h by 1/q^{n+h}. In the case where n is large enough,
this probability becomes negligible, and therefore the high rank attack could be
intractable.

Low Rank Attack. In the low rank attack against the gSTS, the attacker can
find w = (w1, . . . , wz)^T ∈ Fq^z with probability 1/q^f by checking whether the
unknown v = (v1, . . . , vg)^T has nontrivial solutions to the equation

$$\Bigl(\sum_{i=1}^{g} v_i P_i\Bigr) w = 0$$

for randomly chosen w1, . . . , wz ∈ Fq.


One countermeasure against this weakness is to widen the step-width of
Step 1, i.e., to choose f large enough. Then the probability 1/q^f becomes
small, and therefore the low rank attack could be intractable.

5.3 Differential Attack


In 2005 Fouque, Granboulan, and Stern [9] adapted differential cryptanalysis to MPKCs in order to break MI and its variant PMI [3]. In the
differential attack, the attacker tries to find v = (v1, . . . , vz)^T ∈ Fq^z such that
dim(ker(Lv)) = δ, where Lv ∈ Fq^{z×z} is given by $L_v X = \tilde{E}(X + v) - \tilde{E}(X) - \tilde{E}(v) + \tilde{E}(0)$,
X = (x1, . . . , xz)^T ∈ Fq[x1, . . . , xz]^z, and δ is a specific value.
We confirmed by computer experiments that the dimensions of the kernel
in the NLPHPV method are the same in almost all cases. Moreover, note that
the differential cryptanalysis might apply only to Matsumoto-Imai type
cryptosystems, and that applying the Plus method might restore their security
against this cryptanalysis [6]. In the NLPHPV method proposed in this paper,
the original MPKC K can be chosen to be any MPKC, not limited to Matsumoto-Imai type cryptosystems, and the NLPHPV method has a structure like the Plus
method. Thus, the NLPHPV method might be immune to the differential
cryptanalysis. We will clarify this point in future work.
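Since any quadratic map has a linear differential, the object Lv can be computed column by column; in the sketch below Ẽ is a random quadratic map over F2 standing in for a real public key, and the assertion checks the linearity that the attack exploits before the kernel dimension is read off.

```python
# The differential of Section 5.3: for a quadratic map E~,
# DE_v(x) = E~(x+v) - E~(x) - E~(v) + E~(0) is linear in x, so it equals
# L_v x for a matrix L_v; the attack studies dim(ker(L_v)). Over F_2,
# subtraction and addition coincide. E~ below is a random toy map.
import numpy as np

rng = np.random.default_rng(11)
z, gdim = 8, 8   # square dimensions, as in the MI setting

M = rng.integers(0, 2, (gdim, z, z))
Lin = rng.integers(0, 2, (gdim, z))

def Et(x):   # a toy public key E~ over F_2
    return (np.einsum('oij,i,j->o', M, x, x) + Lin @ x) % 2

def DE(v, x):
    zero = np.zeros(z, dtype=int)
    return (Et((x + v) % 2) + Et(x) + Et(v) + Et(zero)) % 2

def L_matrix(v):
    """Matrix of the linear map x -> DE_v(x), built column by column."""
    cols = [DE(v, np.eye(z, dtype=int)[j]) for j in range(z)]
    return np.stack(cols, axis=1)

def gf2_rank(A):
    A = A.copy() % 2
    rank = 0
    for col in range(A.shape[1]):
        rows = [r for r in range(rank, A.shape[0]) if A[r, col]]
        if not rows:
            continue
        A[[rank, rows[0]]] = A[[rows[0], rank]]
        for r in range(A.shape[0]):
            if r != rank and A[r, col]:
                A[r] = (A[r] + A[rank]) % 2
        rank += 1
    return rank

v = rng.integers(0, 2, z)
Lv = L_matrix(v)
x = rng.integers(0, 2, z)
assert np.array_equal(DE(v, x), Lv @ x % 2)  # DE_v really is linear
dim_ker = z - gf2_rank(Lv)  # the quantity dim(ker(L_v)) the attack inspects
```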

6 Consideration on Secure Parameter Setting


Based on the discussion on the security in the previous section, we suggest a
secure parameter setting of the NLPHPV method in Table 3.

Table 3. Parameter Setting

                                        q    p    k    n    z    g    f    l    h    t    Public Key Size
Encryption scheme                      256       260  260                                 8.89 MB
The enhanced encryption scheme         256  256  260  260  420  300   20   20   20   82   26.65 MB
  by the NLPHPV method
Signature scheme                       256        30   20                                 9.92 KB
The enhanced signature scheme          256        30   20   50   30   20                  39.78 KB
  by the NLPHPV method
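The public key sizes in Table 3 can be reproduced by counting coefficients: a quadratic polynomial in m variables has (m+1)(m+2)/2 monomials of degree at most 2, each stored as one byte over F256, and the table uses decimal units (1 MB = 10^6 bytes). The sketch below is only this accounting check, applied to the four table entries.

```python
# Coefficient count behind the public key sizes of Table 3: g (or n)
# quadratic polynomials in z (or k) variables over F_256, one byte per
# coefficient, with (m+1)(m+2)/2 coefficients per polynomial.

def key_size_bytes(num_polys, num_vars):
    return num_polys * (num_vars + 1) * (num_vars + 2) // 2

sizes = {
    "encryption":          key_size_bytes(260, 260),  # 8.89 MB
    "enhanced encryption": key_size_bytes(300, 420),  # 26.65 MB
    "signature":           key_size_bytes(20, 30),    # 9.92 KB
    "enhanced signature":  key_size_bytes(30, 50),    # 39.78 KB
}
assert sizes["encryption"] == 8_889_660
assert sizes["enhanced encryption"] == 26_649_300
assert sizes["signature"] == 9_920
assert sizes["enhanced signature"] == 39_780
```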

In recently proposed major MPKCs, public key sizes for encryption schemes
are 175 KB in PMI+ [6] and 160.2 KB in ℓIC i+ [7], and for signature schemes 15
KB in Rainbow [4] and 9.92 KB in ℓIC− [7]. The main purpose of these schemes
is to implement them on small devices with limited computing resources. On
the other hand, we assume the situation in the future when quantum computers
appear, and place much more value on security than on efficiency, such
as the reduction of key size. Let us consider the security level of the quantum
computer age, when quantum computers are available. The simple
application of Grover's algorithm to exhaustive search over 2^N candidates
reduces the time complexity from O(2^N) to O(√(2^N)). On the other hand, nowadays,
exhaustive search over 2^80 candidates is thought to be infeasible, and the complexity 2^80 is selected as the standard security level in the present cryptographic
community. Therefore, we assume that the security level of the quantum computer age is greater than the complexity 2^160. Note that we omit the evaluation of the secret key size below. This is because the secret key
of an MPKC is much smaller than its public key and differs among the various
MPKCs.

6.1 Encryption Scheme


The plain text size is 2048 bits. The information transmission rate (i.e., the size of
the plain text divided by the size of the cipher text) is 256/300 ≈ 0.853. The public key
size increases by about 3 times over the original encryption scheme. In the original
encryption scheme, the numbers of plain text and cipher text variables are 260.
In the high rank attack against this scheme, the probability with which the
attacker finds each of the vectors v is 1/q^h. Therefore, the attack complexity
of the high rank attack is q^h = 2^160 on average. On the other hand, in the low rank
attack, the probability with which the attacker finds w is 1/q^f. Therefore, the
attack complexity of the low rank attack is q^f = 2^160 on average. For these
reasons, these rank attacks are intractable. Also, since $\binom{z-p}{t} = \binom{164}{82} \approx 2^{160}$, it
is also intractable to distinguish the random variables.
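These numerical claims can be checked directly; the sketch below recomputes the rank attack complexities q^h and q^f and the binomial coefficient C(164, 82) used for the indistinguishability of the random variables, using the parameters of Table 3.

```python
# Security estimates of Subsection 6.1 with q = 256 and the Table 3
# parameters: rank attack complexities q^h, q^f and the count C(z-p, t)
# of ways to place the t random variables among the z - p appended ones.
from math import comb, log2

q, f, h, z, p, t = 256, 20, 20, 420, 256, 82

assert log2(q ** h) == 160               # high rank attack complexity 2^160
assert log2(q ** f) == 160               # low rank attack complexity 2^160
assert 159 < log2(comb(z - p, t)) < 161  # C(164, 82) is about 2^160
```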

6.2 Signature Scheme


The signature size is 400 bits. In the original signature scheme, the number of
input variables is 20 and the number of output variables is 30. The public key size increases by about
4 times over the original signature scheme.
In the high rank attack against this scheme, the probability with which the
attacker finds each of the vectors v is 1/q^{n+h}, not 1/q^h, since z = f + k as noted
in Subsection 5.2. Therefore, the attack complexity of the high rank attack is
q^{n+h} > q^n = 2^160. On the other hand, in the low rank attack, the probability
with which the attacker finds w is 1/q^f. Therefore, the attack complexity of the
low rank attack is q^f = 2^160 on average. For these reasons, these rank attacks
are intractable.

7 Concluding Remarks
In this paper, we proposed a new class of PH methods, called the NonLinear Piece in
Hand Perturbation Vector (NLPHPV) method. NLPHPV is more efficient than
previously known internal perturbation methods, since its decryption process avoids redundant repetitive steps. Based on computer experiments, we
have shown that the method enhances the security of the Matsumoto-Imai cryptosystem against the Gröbner basis attack. Then, by considering the
security against other known attacks, we have suggested a secure parameter setting of the NLPHPV method for the quantum computer age. From the practical
viewpoint of current interest, it is also important to evaluate the efficiency of
both encryption and decryption in a cryptosystem enhanced by the method.
However, since the aim of the present paper is mainly to develop the framework
of nonlinear PH matrix methods as a potential countermeasure against the advent of quantum computers in the future, this practical issue is not considered
in this paper but is discussed in another paper. For the same reason, we
have not considered provable security, for example IND-CCA security of the class
of PH methods for encryption, but considered just the encryption primitive Ẽ
for an MPKC obtained by applying the NLPHPV method. We leave
the consideration of stronger security notions to a future study.

Acknowledgments
The authors are grateful to Dr. Tomohiro Harayama and Mr. Masahito Gotaishi
for helpful discussions and comments.
This work is supported by the “Strategic information and COmmunications
R&D Promotion programmE” (SCOPE) from the Ministry of Internal Affairs
and Communications of Japan.

References
1. Coppersmith, D., Stern, J., Vaudenay, S.: Attacks on the birational permutation
signature schemes. In: Stinson, D.R. (ed.) CRYPTO 1993. LNCS, vol. 773, pp.
435–443. Springer, Heidelberg (1994)
2. Courtois, N., Klimov, A., Patarin, J., Shamir, A.: Efficient algorithms for solving
overdefined systems of multivariate polynomial equations. In: Preneel, B. (ed.)
EUROCRYPT 2000. LNCS, vol. 1807, pp. 392–407. Springer, Heidelberg (2000)
3. Ding, J.: A new variant of the Matsumoto-Imai cryptosystem through perturba-
tion. In: Bao, F., Deng, R., Zhou, J. (eds.) PKC 2004. LNCS, vol. 2947, pp. 305–318.
Springer, Heidelberg (2004)
4. Ding, J., Schmidt, D.: Rainbow, a new multivariable polynomial signature scheme.
In: Ioannidis, J., Keromytis, A., Yung, M. (eds.) ACNS 2005. LNCS, vol. 3531, pp.
164–175. Springer, Heidelberg (2005)
5. Ding, J., Gower, J.E., Schmidt, D., Wolf, C., Yin, Z.: Complexity estimates for
the F4 attack on the perturbed Matsumoto-Imai cryptosystem. In: Smart, N. (ed.)
Cryptography and Coding 2005. LNCS, vol. 3796, pp. 262–277. Springer, Heidel-
berg (2005)

6. Ding, J., Gower, J.E.: Inoculating multivariate schemes against differential attacks.
In: Yung, M., Dodis, Y., Kiayias, A., Malkin, T. (eds.) PKC 2006. LNCS, vol. 3958,
pp. 290–301. Springer, Heidelberg (2006)
7. Ding, J., Wolf, C., Yang, B.Y.: ℓ-Invertible Cycles for Multivariate Quadratic
(MQ) public key cryptography. In: Okamoto, T., Wang, X. (eds.) PKC 2007.
LNCS, vol. 4450, pp. 266–281. Springer, Heidelberg (2007)
8. Faugère, J.C., Joux, A.: Algebraic cryptanalysis of hidden field equation (HFE)
cryptosystems using Gröbner bases. In: Boneh, D. (ed.) CRYPTO 2003. LNCS,
vol. 2729, pp. 44–60. Springer, Heidelberg (2003)
9. Fouque, P.A., Granboulan, L., Stern, J.: Differential cryptanalysis for multivariate
schemes. In: Cramer, R. (ed.) EUROCRYPT 2005. LNCS, vol. 3494, pp. 341–353.
Springer, Heidelberg (2005)
10. Goubin, L., Courtois, N.: Cryptanalysis of the TTM cryptosystem. In: Okamoto, T.
(ed.) ASIACRYPT 2000. LNCS, vol. 1976, pp. 44–57. Springer, Heidelberg (2000)
11. Ito, D., Fukushima, Y., Kaneko, T.: On the security of piece in hand concept
based on sequential solution method. Technical Report of IEICE, ISEC2006-30,
SITE2006-27 (2006-7) (July 2006) (in Japanese)
12. Kasahara, M., Sakai, R.: A new principle of public key cryptosystem and its real-
ization. Technical Report of IEICE, ISEC2000-92 (2000-11) (November 2000) (in
Japanese)
13. Kasahara, M., Sakai, R.: A construction of public key cryptosystem for realizing
ciphertext of size 100 bit and digital signature scheme. IEICE Transactions on
Fundamentals E87-A(1), 102–109 (2004)
14. Kasahara, M., Sakai, R.: A construction of public-key cryptosystem based on singu-
lar simultaneous equations. IEICE Transactions on Fundamentals E88-A(1), 74–80
(2005)
15. Kipnis, A., Patarin, J., Goubin, L.: Unbalanced Oil and Vinegar signature schemes.
In: Stern, J. (ed.) EUROCRYPT 1999. LNCS, vol. 1592, pp. 206–222. Springer,
Heidelberg (1999)
16. Kipnis, A., Shamir, A.: Cryptanalysis of the HFE public key cryptosystem by
relinearization. In: Wiener, M. (ed.) CRYPTO 1999. LNCS, vol. 1666, pp. 19–30.
Springer, Heidelberg (1999)
17. Matsumoto, T., Imai, H., Harashima, H., Miyakawa, H.: A class of asymmetric
cryptosystems using obscure representations of enciphering functions. In: 1983 Na-
tional Convention Record on Information Systems, IECE Japan, pp. S8–5 (1983)
(in Japanese)
18. Matsumoto, T., Imai, H.: Public quadratic polynomial-tuples for efficient signature-
verification and message-encryption. In: Günther, C.G. (ed.) EUROCRYPT 1988.
LNCS, vol. 330, pp. 419–453. Springer, Heidelberg (1988)
19. Moh, T.T.: A public key system with signature and master key functions. Com-
munications in Algebra 27, 2207–2222 (1999)
20. Patarin, J.: Cryptanalysis of the Matsumoto and Imai public key scheme of Euro-
crypt 1988. In: Coppersmith, D. (ed.) CRYPTO 1995. LNCS, vol. 963, pp. 248–261.
Springer, Heidelberg (1995)
21. Patarin, J.: Hidden fields equations (HFE) and isomorphisms of polynomials (IP):
two new families of asymmetric algorithms. In: Maurer, U.M. (ed.) EUROCRYPT
1996. LNCS, vol. 1070, pp. 33–48. Springer, Heidelberg (1996)

22. Patarin, J., Goubin, L., Courtois, N.: C∗−+ and HM: Variations around two schemes
of T. Matsumoto and H. Imai. In: Ohta, K., Pei, D. (eds.) ASIACRYPT 1998.
LNCS, vol. 1514, pp. 35–49. Springer, Heidelberg (1998)

23. Shamir, A.: Efficient signature schemes based on birational permutations. In: Stinson,
D.R. (ed.) CRYPTO 1993. LNCS, vol. 773, pp. 1–12. Springer, Heidelberg (1994)
24. Tadaki, K., Tsujii, S.: On the enhancement of security by piece in hand matrix
method for multivariate public key cryptosystems. In: Proc. SCIS 2007, vol. 2C1-3
(2007)
25. Tsujii, S., Kurosawa, K., Itoh, T., Fujioka, A., Matsumoto, T.: A public-key cryp-
tosystem based on the difficulty of solving a system of non-linear equations. IECE
Transactions (D) J69-D(12), 1963–1970 (1986) (in Japanese)
26. Tsujii, S., Fujioka, A., Hirayama, Y.: Generalization of the public-key cryptosys-
tem based on the difficulty of solving a system of non-linear equations. IEICE
Transactions (A) J72-A(2), 390–397 (1989) (in Japanese) (An English translation
of [26] is included in [29] as an appendix)
27. Tsujii, S.: A new structure of primitive public key cryptosystem based on soldiers
in hand matrix. Technical Report TRISE 02-03, Chuo University (July 2003)
28. Tsujii, S., Fujita, R., Tadaki, K.: Proposal of MOCHIGOMA (piece in hand) con-
cept for multivariate type public key cryptosystem. Technical Report of IEICE,
ISEC2004-74 (2004-09) (September 2004)
29. Tsujii, S., Tadaki, K., Fujita, R.: Piece in hand concept for enhancing the security
of multivariate type public key cryptosystems: public key without containing all the
information of secret key. Cryptology ePrint Archive, Report 2004/366 (December
2004), [Link]
30. Tsujii, S., Tadaki, K., Fujita, R.: Piece in hand concept for enhancing the security
of multivariate type public key cryptosystems: public key without containing all
the information of secret key. In: Proc. SCIS 2005, vol. 2E1-3, pp. 487–492 (2005),
[Link] tsujii/[Link]
31. Tsujii, S., Tadaki, K., Fujita, R.: Proposal for piece in hand (soldiers in hand)
matrix — general concept for enhancing security of multivariate public key cryp-
tosystems — Ver.2. In: Proc. SCIS 2006, vol. 2A4-1 (2006) (in Japanese),
[Link] tsujii/[Link]
32. Tsujii, S., Tadaki, K., Fujita, R.: Proposal for piece in hand matrix ver.2: gen-
eral concept for enhancing security of multivariate public key cryptosystems. In:
Workshop Record of the International Workshop on Post-Quantum Cryptography
(PQCrypto 2006), pp. 103–117 (2006),
[Link]
33. Tsujii, S., Tadaki, K., Fujita, R.: Proposal for piece in hand matrix: general concept
for enhancing security of multivariate public key cryptosystems. IEICE Transac-
tions on Fundamentals E90-A(5), 992–999 (2007),
[Link] tsujii/[Link]
34. Tsujii, S., Tadaki, K., Fujita, R.: Nonlinear piece in hand matrix method for en-
hancing security of multivariate public key cryptosystems. In: Proceedings of the
First International Conference on Symbolic Computation and Cryptography (SCC
2008), pp. 124–144 (2008)
35. Wang, L.C., Hu, Y.H., Lai, F., Chou, C.Y., Yang, B.Y.: Tractable rational map sig-
nature. In: Vaudenay, S. (ed.) PKC 2005. LNCS, vol. 3386, pp. 244–257. Springer,
Heidelberg (2005)
36. Wang, L.C., Yang, B.Y., Hu, Y.H., Lai, F.: A medium-field multivariate public-key
encryption scheme. In: Pointcheval, D. (ed.) CT-RSA 2006. LNCS, vol. 3860, pp.
132–149. Springer, Heidelberg (2006)
37. Wolf, C., Braeken, A., Preneel, B.: Efficient cryptanalysis of RSE(2)PKC and
RSSE(2)PKC. In: Blundo, C., Cimato, S. (eds.) SCN 2004. LNCS, vol. 3352, pp.
294–309. Springer, Heidelberg (2005)

38. Wolf, C., Preneel, B.: Taxonomy of Public Key Schemes based on the problem
of Multivariate Quadratic equations. Cryptology ePrint Archive, Report 2005/077
(December 2005), [Link]

A Experimental Results in NLPHPV Method with Random Variables
We report in Table 4 and Table 5 the time required for the GB attack against
an MPKC (MI or R-SE(2)PKC (RSE, for short)) and against the MPKC enhanced by
the NLPHPV method. Note that n = k and q = 2 for the public keys E ∈
Fq[x1, . . . , xk]^n of MI and RSE by their specifications. Table 4 and Table 5 give
a comparison for the particular case of a plain text of 15 bits (MI with k = 15
and the enhanced MI with z = 47, g = 35, or RSE with k = 15 and the enhanced
RSE with z = 44, g = 35). This shows that the time required for cryptanalysis
is increased by more than 10^5 times by the application of the NLPHPV method,
which shows that the NLPHPV method enhances the security of MI and RSE
against the GB attack. Table 4 and Table 5 also show that increasing the number
z − p of random variables xp+1, . . . , xz increases the time required for the GB
attack against the enhanced cryptosystem K̃ and provides substantial security
against the GB attack.

Table 4. Comparison between computational times of the GB attack for MI and the
enhanced MI by the NLPHPV method

Cryptosystems   p  k  z  g  f  l  h  t   Computational times in seconds
MI                 15                     < 10^{-2}
                   20                     0.01
25 0.03
30 0.07
35 0.2
40 0.4
45 0.7
50 1
55 2
60 4
15 20 40 35 10 10 5 10 75
The enhanced MI 15 20 43 35 10 10 5 10 129
by the NLPHPV method 15 20 45 35 10 10 5 10 260
15 20 46 35 10 10 5 10 320
15 20 47 35 10 10 5 10 1029
15 20 40 40 10 10 10 10 97
15 20 43 40 10 10 10 10 161
15 20 47 40 10 10 10 10 284
15 20 48 40 10 10 10 10 495
15 20 49 40 10 10 10 10 1077

Table 5. Comparison between computational times of the GB attack for RSE and the
enhanced RSE by the NLPHPV method

Cryptosystems   p  k  z  g  f  l  h  t   Computational times in seconds
15 0.01
RSE 20 0.03
25 0.1
30 0.2
35 0.5
40 1
45 2
50 5
55 9
60 16
15 20 40 35 10 10 5 10 40
The enhanced RSE 15 20 41 35 10 10 5 10 71
by the NLPHPV method 15 20 42 35 10 10 5 10 179
15 20 43 35 10 10 5 10 713
15 20 44 35 10 10 5 10 2791
15 20 40 40 10 10 10 10 51
15 20 42 40 10 10 10 10 82
15 20 44 40 10 10 10 10 231
15 20 45 40 10 10 10 10 877
15 20 46 40 10 10 10 10 2327
On the Power of Quantum Encryption Keys

Akinori Kawachi and Christopher Portmann

Department of Mathematical and Computing Sciences, Tokyo Institute of
Technology, 2-12-1 Ookayama, Meguro-ku, Tokyo 152-8552, Japan
kawachi@[Link], [Link]@[Link]

Abstract. The standard definition of quantum state randomization,
which is the quantum analog of the classical one-time pad, consists in
applying some transformation to the quantum message conditioned on a
classical secret key k. We investigate encryption schemes in which this
transformation is conditioned on a quantum encryption key state ρk in-
stead of a classical string, and extend this symmetric-key scheme to an
asymmetric-key model in which copies of the same encryption key ρk
may be held by several different people, while maintaining information-
theoretical security. We find bounds on the message size and the number
of copies of the encryption key which can be safely created in these two
models in terms of the entropy of the decryption key, and show that the
optimal bound can be asymptotically reached by a scheme using classical
encryption keys. This means that the use of quantum states as encryp-
tion keys does not allow more of these to be created and shared, nor
larger messages to be encrypted, than if these keys are purely classical.

1 Introduction

1.1 Quantum Encryption

To encrypt a quantum state σ, the standard procedure consists in applying some
(unitary) transformation Uk to the state, which depends on a classical string k.
This string serves as secret key, and anyone who knows this key can perform the
reverse operation and obtain the original state. If the transformations U1 , U2 , . . .
are chosen with probabilities p1 , p2 , . . . , such that when averaged over all possible
choices of key,

    R(σ) = ∑_k p_k Uk σ Uk† ,    (1)

the result looks random, i.e., close to the fully mixed state, R(σ) ≈ I/d, then this
cipher can safely be transmitted on an insecure channel. This procedure is called
approximate quantum state randomization or approximate quantum one-time
pad [1, 2, 3] or quantum one-time pad, quantum Vernam cipher or quantum
private channel in the case of perfect security [4, 5, 6], and is the quantum
equivalent of the classical one-time pad.
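As an illustration of Eq. (1) (our own sketch, not part of the paper): for a single qubit, choosing Uk uniformly from the four Pauli operators gives perfect randomization, R(σ) = I/2 for every message σ. A minimal NumPy sketch:

```python
import numpy as np

# The four single-qubit Pauli operators; applying one of them uniformly at
# random implements the quantum one-time pad of Eq. (1) with p_k = 1/4.
I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def randomize(sigma):
    """R(sigma) = sum_k p_k U_k sigma U_k^dagger over the Pauli operators."""
    return sum(U @ sigma @ U.conj().T for U in (I2, X, Y, Z)) / 4

# Any qubit density matrix is mapped exactly to the fully mixed state I/2.
psi = np.array([[np.cos(0.3)], [np.sin(0.3) * np.exp(0.7j)]])
sigma = psi @ psi.conj().T
print(np.allclose(randomize(sigma), I2 / 2))  # True
```

Anyone who knows which of the four Paulis was applied can undo it, since each Pauli is its own inverse up to phase; without the key, the cipher carries no information about σ.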
An encryption scheme which uses such a randomization procedure is called
symmetric, because the same key is used to encrypt and decrypt the message.

J. Buchmann and J. Ding (Eds.): PQCrypto 2008, LNCS 5299, pp. 165–180, 2008.
c Springer-Verlag Berlin Heidelberg 2008

An alternative paradigm is asymmetric-key cryptography, in which a different
key is used for encryption and decryption. In such a cryptosystem the encryp-
tion key may be shared amongst many different people, because possessing this
key is not sufficient to perform the reverse operation, decryption. This can be
seen as a natural extension of symmetric-key cryptography, because the latter
corresponds to the special case in which the encryption and decryption keys are
identical and can be shared with only one person.
Although the encryption model given in Eq. (1) is symmetric, by replacing
the classical encryption key with a quantum state we can make it asymmetric.
To see this, let us rewrite Eq. (1) as

    R(σ) = ∑_k p_k trK [ U ( |k⟩⟨k|^K ⊗ σ^S ) U† ] ,    (2)

where U := ∑_k |k⟩⟨k| ⊗ Uk . The encryption key in Eq. (2), |k⟩⟨k|, is diagonal
in the computational basis, i.e., classical, but an arbitrary quantum state, ρk ,
could be used instead, e.g.,

    R(σ) = ∑_k p_k trK [ U ( ρk^K ⊗ σ^S ) U† ] ,    (3)

for some set of quantum encryption keys {ρk }k .


If the sender only holds such a quantum encryption key state ρk without know-
ing the corresponding decryption key k, then the resulting model is asymmetric
in the sense that possessing this copy of the encryption key state is enough to
perform the encryption, but not to decrypt. So many different people can hold
copies of the encryption key without compromising the security of the scheme.
It is generally impossible to distinguish between non-orthogonal quantum states
with certainty (we refer to the textbook by Nielsen and Chuang [7] for an in-
troduction to quantum information), so measuring a quantum state cannot tell
us precisely what it is. Possessing a copy of the encryption key state therefore
does not allow us to know how the quantum message got transformed, making
it impossible to guess the message, except with exponentially small probability.
Up to roughly log N copies of a state can be needed to discriminate between
N possible states [8], so such a scheme could allow the same encryption key to
be used several times, if multiple copies of this quantum key state are shared
with any party wishing to encrypt a message. The scheme will stay secure as
long as the number of copies created stays below a certain threshold. What
is more, the security which can be achieved is information-theoretic like for
standard quantum state randomization schemes [9], not computational like most
asymmetric-key encryption schemes.
Such an asymmetric-key cryptosystem is just a possible application of a quan-
tum state randomization scheme which uses quantum keys. It is also interest-
ing to study quantum state randomization with quantum keys for itself (in the
symmetric-key model), without considering other parties holding extra copies
of the same encryption key. In this paper we study these schemes in both the
symmetric-key and asymmetric-key models, and compare their efficiency in terms
of message size and number of usages of the same encryption key to quantum
state randomization schemes which use only classical keys.

1.2 Related Work


Quantum one-time pads were first proposed in [4, 5] for perfect security, then
approximate security was considered in, e.g., [1, 2, 3]. All these schemes assume
the sender and receiver share some secret classical string which is used only once
to perform the encryption. We extend these models in the symmetric-key case by
conditioning the encryption operation on a quantum key and considering security
with multiple uses of the same key, and then in the asymmetric-key case by con-
sidering security with multiple users holding copies of the same encryption key.
The first scheme using quantum keys in an asymmetric-key model was pro-
posed by Kawachi et al. [10], although they considered the restricted scenario
of classical messages. Their scheme can encrypt a 1 bit classical message, and
their security proof is computational, as it reduces the task of breaking the
scheme to a graph automorphism problem. They extended their scheme to a
multi-bit version [11], but without security proof. Hayashi et al. [9] then gave
an information-theoretical security proof for [11]. The quantum asymmetric-key
model we consider is a generalization and extension of that of [10, 11].

1.3 Main Contributions


The main result of this paper is that using quantum encryption keys has no
advantage over classical keys with respect to the number of copies of the en-
cryption key which can be safely created and to the size of the messages which
can be encrypted, both in the symmetric-key and asymmetric-key models. Contrary
to what was believed and what motivated previous work with quantum keys, the
intrinsic indistinguishability of quantum states does not allow more of these to be
created and shared as encryption keys than if these keys are purely classical.
To show this, we first find an upper bound on the quantum message size and
on the number of copies of the encryption key which can be securely produced.
We show that if t copies of the key are created and if the quantum messages
encrypted are of dimension d, then they have to be such that t log d ≲ H(K) for
the scheme to be secure, where H (K) is the entropy of the decryption key.
We then construct a quantum state randomization scheme and show that it
meets this upper bound in both the symmetric and asymmetric-key models. The
encryption keys this scheme uses are however all diagonal in the same bases, i.e.,
classical. This means that the scheme with classical keys is optimal in terms of
message size and number of usages of the same key, and no scheme with quantum
keys can perform better.
We also show how to extend quantum asymmetric-key encryption schemes
for classical messages (such as [11]) to encrypt quantum messages as well. To do
this, we combine these schemes for classical messages with a standard quantum
one-time pad, and prove that the resulting scheme is still secure.

1.4 Organization of the Paper


In Section 2 we develop the encryption models with quantum keys sketched
in this introduction. We first redefine quantum state randomization schemes
using quantum keys instead of classical keys in Section 2.1 and generalize the
standard security definition for multiple usage of the same key in this symmetric-
key model. In Section 2.2 we then show how to construct an asymmetric-key
cryptosystem using such a quantum state randomization scheme with quantum
keys and define its security. Section 2.3 contains a few notes about the special
case of classical messages, which are relevant for the rest of the paper.
In Section 3 we find an upper bound on the message size and number of
copies of the encryption key which can be created, both for the symmetric and
asymmetric-key models.
In Section 4 we construct a quantum state randomization scheme which uses
classical encryption keys, but which meets the optimality bounds for quantum
keys from the previous section in both models. We give this construction in
three steps. First in Section 4.1 we construct a scheme which can randomize
classical messages only. Then in Section 4.2 we show how to combine this scheme
for classical messages with a standard approximate quantum one-time pad to
randomize any quantum state. And finally in Section 4.3 we calculate the key
size of the scheme proposed and show that it corresponds to the bound found in
Section 3.
We conclude in Section 5 with a brief summary and further comments about
the results.
Technical proofs appear in Appendix A.

2 Encryption Model
2.1 Quantum Encryption Keys
Let us consider a setting in which we have two parties, a sender and a receiver,
who wish to transmit a quantum state, σ, from one to the other in a secure way
over an insecure channel. If they share a secret classical string, k, they can ap-
ply some completely positive, trace-preserving (CPTP) map Ek to the quantum
message and send the cipher Ek (σ). If the key k was chosen with probability pk ,
to any person who does not know this key the transmitted state is

    R(σ) = ∑_k p_k Ek (σ),    (4)

which will look random for “well chosen” maps Ek . This is the most general form
of quantum state randomization [6].
If instead the sender has a quantum state ρk , he can apply some CPTP map
E to both the shared state and the quantum message, and send E(ρk ⊗ σ). So
for someone who does not know ρk the state sent is

    R(σ) = ∑_k p_k E(ρk ⊗ σ).    (5)

It is clear that Eqs. (4) and (5) produce equivalent ciphers, because for every
set of CPTP maps {Ek }k there exists a map E and set of states {ρk }k such
that for all messages σ, Ek (σ) = E(ρk ⊗ σ), and vice versa. The difference lies
in the knowledge needed to perform the encryption. In the first case (Eq. (4))
the sender needs to know the secret key k to know which CPTP map Ek to
apply. In the second case (Eq. (5)) the sender only needs to hold a copy of the
encryption key ρk , he does not need to know what it is or what secret key k it
corresponds to. This allows us to construct in Section 2.2 a quantum asymmetric-
key cryptosystem in which copies of the same encryption key ρk can be used by
many different users. In this section we focus on the symmetric-key model and
define quantum state randomization (QSR) schemes with quantum encryption
keys and their security in this model.

Definition 1. Let B(H) denote the set of linear operators on H.
A quantum state randomization (QSR) scheme with quantum encryption keys
consists of the following tuple,

    T = (PK , {ρk }k∈K , E) .

– ρk ∈ B(HK ) are density operators on a Hilbert space HK . They are called
  encryption keys and are indexed by elements k ∈ K called decryption keys.
– PK (·) is a probability distribution over the set of decryption keys K, corre-
  sponding to the probability with which each en/decryption key-pair should be
  chosen.
– E : B(HK ⊗ HS ) → B(HC ) is a completely positive, trace-preserving (CPTP)
  map from the set of linear operators on the joint system of encryption key and
  message Hilbert spaces, HK and HS respectively, to the set of linear operators
  on the cipher Hilbert space HC , and is called the encryption operator.
To encrypt a quantum message given by its density operator σ ∈ B(HS ) with
the encryption key ρk , the encryption operator is applied to the key and message,
resulting in the cipher

    ρk,σ := E(ρk ⊗ σ) .
Definition 1 describes how to encrypt a quantum message, but for such a scheme
to be useful, it must also be possible to decrypt the message for someone who
knows which key k was used, i.e., it must be possible to invert the encryption
operation.

Definition 2. A QSR scheme given by the tuple T = (PK , {ρk }k∈K , E) is said
to be invertible on the set S ⊆ B(HS ) if for every k ∈ K with PK (k) > 0 there
exists a CPTP map Dk : B(HC ) → B(HS ) such that for all density operators
σ ∈ S,

    Dk (E(ρk ⊗ σ)) = σ .
Furthermore, a QSR scheme must – as its name says – randomize a quantum
state. We define this in the same way as previous works on approximate quantum
state randomization [1, 2, 3], by bounding the distance between the ciphers
averaged over all possible choices of key and some state independent from the
message. We however generalize this to encrypt t messages with the same key,
because the asymmetric-key model we define in Section 2.2 will need this. It is
always possible to consider the case t = 1 in the symmetric-key model, if multiple
uses of the same key are not desired.
We will use the trace norm as distance measure between two states, because it
is directly related to the probability that an optimal measurement can distinguish
between these two states, and is therefore meaningful in the context of eaves-
dropping. The trace norm of a matrix A is defined by ‖A‖tr := tr |A| = tr √(A†A),
which is also equal to the sum of the singular values of A.
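To make the trace norm concrete, here is a small numerical sketch (our illustration, not from the paper), computing ‖A‖tr as the sum of singular values:

```python
import numpy as np

def trace_norm(A):
    """The trace norm ||A||_tr = tr|A| = tr sqrt(A^dagger A): equal to the
    sum of the singular values of A."""
    return np.linalg.svd(A, compute_uv=False).sum()

# The trace distance ||rho - tau||_tr of two density matrices lies in [0, 2]:
# 0 for identical states, 2 for perfectly distinguishable (orthogonal) states.
rho = np.diag([1.0, 0.0])   # |0><0|
tau = np.diag([0.0, 1.0])   # |1><1|
print(trace_norm(rho - tau))  # 2.0
print(trace_norm(rho - rho))  # 0.0
```

For diagonal (classical) states this reduces to the L1 distance between the two probability vectors, which is why the security definitions below translate directly to classical distributions.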
Definition 3. A QSR scheme given by the tuple T = (PK , {ρk }k∈K , E) is said to
be (t, ε)-randomizing on the set S ⊆ B(HS ) if there exists a density operator
τ ∈ B(HC^⊗t ) such that for all t-tuples of message density operators
ω = (σ1 , . . . , σt ) ∈ S ×t ,

    ‖R(ω) − τ‖tr ≤ ε,    (6)

where R(ω) = ∑_k PK (k) ρk,σ1 ⊗ · · · ⊗ ρk,σt and ρk,σi = E(ρk ⊗ σi ).

2.2 Quantum Asymmetric-Key Cryptosystem


As announced in the previous section, the idea behind the quantum asymmetric-
key cryptosystem model is that many different people hold a copy of some quan-
tum state ρk which serves as encryption key, and anyone who wishes to send a
message to the originator of the encryption keys uses a quantum state random-
ization scheme, as described in Definition 1. This is depicted in Fig. 1.
If the QSR scheme used to encrypt the messages is (t, ε)-randomizing and
no more than t copies of the encryption key were released, an eavesdropper
who intercepts the ciphers will not be able to distinguish them from some state
independent of the messages, and so gets no information about these messages.
This is however not the only attack he may perform.
As we consider a scenario in which copies of the encryption key are shared
between many different people, the adversary could hold one or many of them.
If a total of t copies of the encryption key were produced and t1 were used to
encrypt messages ω = (σ1 , . . . , σt1 ), in the worst case we have to assume that
the adversary has the t2 := t − t1 remaining unused copies of the key. So his
total state is

    ρ^E_ω := ∑_{k∈K} PK (k) ρk,σ1 ⊗ · · · ⊗ ρk,σt1 ⊗ ρk^⊗t2 ,    (7)
where ρk,σi is the cipher of the message σi encrypted with the key ρk . This leads
to the following security definition.
Definition 4. We call a quantum asymmetric-key cryptosystem (t, ε)-indistin-
guishable on the set S ⊆ B(HS ) if for all t1 ∈ {0, 1, . . . , t}, t2 := t − t1 , there
exists a density operator τ ∈ B(HC^⊗t1 ⊗ HK^⊗t2 ) such that for all t1 -tuples of
message density operators ω = (σ1 , . . . , σt1 ) ∈ S ×t1 ,

    ‖ρ^E_ω − τ‖tr ≤ ε,

where ρ^E_ω is the state the adversary obtains as defined in Eq. (7).

Fig. 1. Quantum asymmetric-key cryptosystem model. Bob and Charlie hold copies of
Alice’s encryption key ρk . To send her a message, they encrypt it with the key and a
given QSR scheme, and send the resulting cipher to her. An eavesdropper, Eve, may
intercept the ciphers as well as possess some copies of the encryption key herself.

Remark 1. Definition 4 is clearly more general than the security criterion of Defi-
nition 3 ((t, ε)-randomization), as the latter corresponds to the special case t1 = t.
However, for the scheme constructed in Section 4 the two are equivalent, and
proving one proves the other. This is the case in particular if the encryption key
is equal to the cipher of some specific message σ0 , i.e., ρk = ρk,σ0 = E(ρk ⊗ σ0 ),
in which case holding an extra copy of the encryption key does not give more
information about the decryption key than holding an extra cipher state.

2.3 Classical Messages


In the following sections we will also be interested in the special case of schemes
which encrypt classical messages only. Classical messages can be represented by
a set of mutually orthogonal quantum states, which we will take to be the basis
states of the message Hilbert space and denote by {|s⟩}s∈S . So these schemes
must be invertible and randomizing on the set of basis states of the message
Hilbert space.
When considering classical messages only, we will simplify the notation when
possible and represent a message by a string s instead of by its density matrix
|s⟩⟨s|, e.g., the cipher of the message s encrypted with the key ρk is

    ρk,s := E (ρk ⊗ |s⟩⟨s|) .

Remark 2. Definition 2 (invertibility) can be simplified when only classical mes-
sages are considered: a QSR scheme given by the tuple T = (PK , {ρk }k∈K , E) is
invertible for the set of classical messages S if for every k ∈ K with PK (k) > 0
the ciphers {ρk,s }s∈S are mutually orthogonal, where ρk,s := E (ρk ⊗ |s⟩⟨s|) for
some orthonormal basis {|s⟩}s∈S of the message Hilbert space HS .
We will also use a different but equivalent definition to measure how well a
scheme can randomize a message when dealing with classical messages. This new
security criterion allows us to simplify some proofs.
Definition 5. A QSR scheme given by the tuple T = (PK , {ρk }k∈K , E) is said to
be (t, ε)-secure for the set of classical messages S if for all probability distributions
PS^t (·) over the set of t-tuples of messages S ×t ,

    ‖ρ^{S^t C^t} − ρ^{S^t} ⊗ ρ^{C^t}‖tr ≤ ε,    (8)

where ρ^{S^t C^t} is the state of the joint system of the t-fold message and cipher
Hilbert spaces, and ρ^{S^t} and ρ^{C^t} are the result of tracing out the cipher
respectively the message systems, i.e.,

    ρ^{S^t C^t} = ∑_{s∈S^×t} PS^t (s) |s⟩⟨s| ⊗ ∑_{k∈K} PK (k) ρk,s1 ⊗ · · · ⊗ ρk,st ,

    ρ^{S^t} = ∑_{s∈S^×t} PS^t (s) |s⟩⟨s| ,

    ρ^{C^t} = ∑_{s∈S^×t} PS^t (s) ∑_{k∈K} PK (k) ρk,s1 ⊗ · · · ⊗ ρk,st ,

where s = (s1 , . . . , st ).

This security definition can be interpreted in the following way. No matter what
the probability distribution on the secret messages is – let the adversary choose
it – the message and cipher spaces are nearly in product form, i.e., the cipher
gives next to no information about the message.
The following lemma proves that this new security definition is equivalent to
the previous one (Definition 3) up to a constant factor.
Lemma 1. If a QSR scheme is (t, ε)-randomizing for a set of classical messages
S, then it is (t, 2ε)-secure for S. If a QSR scheme is (t, ε)-secure for a set of
classical messages S, then it is (t, 2ε)-randomizing for S.
Proof. Immediate after writing out the definitions explicitly, using the triangle
inequality for one direction and considering the distribution on the message
tuples PS t (s1 ) = PS t (s2 ) = 1/2 for any s1 , s2 ∈ S ×t for the converse.

3 Lower Bounds on the Key Size


It is intuitively clear that the more copies of the encryption key state ρk are
created, the more information the adversary gets about the decryption key k ∈ K
and the more insecure the scheme becomes. As it turns out, the number of copies
of the encryption key which can be safely used is directly linked to the size of
the decryption key, i.e., the cardinality of the decryption key set K.
Let us assume a QSR scheme with quantum encryption keys is used to en-
crypt classical messages of size m. Then if t copies of the encryption key state
are released and used, the size of the total message encrypted with the same
decryption key k is tm. We prove in this section that the decryption key has
to be of the same size as the total message to achieve information-theoretical
security, i.e., log |K| ≳ tm. In Section 4 we then give a scheme which reaches this
bound asymptotically.
Theorem 1. If a QSR scheme given by the tuple T = (PK , {ρk }k∈K , E) is in-
vertible for the set of classical messages S, then when t messages (s1 , . . . , st ) are
chosen from S with (joint) probability distribution PS^t (s1 , . . . , st ) and encrypted
with the same key,

    ‖ρ^{S^t C^t} − ρ^{S^t} ⊗ ρ^{C^t}‖tr ≥ ( H(S^t) − H(K) − 2 ) / ( 4t log |S| ) ,    (9)

where H(·) is the Shannon entropy and ρ^{S^t C^t} is the state of the t-fold message
and cipher systems:

    ρ^{S^t C^t} = ∑_{s∈S^×t} PS^t (s) |s⟩⟨s| ⊗ ∑_{k∈K} PK (k) ρk,s1 ⊗ · · · ⊗ ρk,st ,

    ρ^{S^t} = ∑_{s∈S^×t} PS^t (s) |s⟩⟨s| ,    (10)

    ρ^{C^t} = ∑_{s∈S^×t} PS^t (s) ∑_{k∈K} PK (k) ρk,s1 ⊗ · · · ⊗ ρk,st ,

where s = (s1 , . . . , st ).
Proof in Appendix A.1.
Corollary 1. For a QSR scheme to be (t, ε)-randomizing or (t, ε)-indistinguish-
able, it is necessary that

    H(K) ≥ (1 − 8ε) t log d − 2,    (11)

where d is the dimension of the message Hilbert space HS and H(K) is the
entropy of the decryption key.

Proof in Appendix A.2.


Remark 3. Approximate quantum one-time pad schemes usually only consider
the special case in which the cipher has the same dimension as the message
[1, 3]. A more general scenario in which an ancilla is appended to the message
is however also possible. It was proven in [6] that for perfect security such an
extended scheme needs a key of the same size as in the restricted scenario, namely
2 log d. Corollary 1 for t = 1 proves the same for approximate security, namely
roughly log d bits of key are necessary, just as when no ancilla is present.

4 Near-Optimal Scheme

To simplify the presentation of the QSR scheme, we first define it for classical
messages in Section 4.1, show that it is invertible and find a bound on t, the
number of copies of the encryption key which can be released, for it to be (t, ε)-
randomizing for an exponentially small ε. In Section 4.2 we extend the scheme
to encrypt any quantum message of a given size, and show again that it is
invertible and randomizing. And finally in Section 4.3 we calculate the size of
the key necessary to encrypt a message of a given length, and show that it is
nearly asymptotically equal to the lower bound found in Section 3.

4.1 Classical Messages


Without loss of generality, let the message space be of dimension dim HS =
2^m . The classical messages can then be represented by strings of length m,
S := {0, 1}^m . We now define a QSR scheme which uses encryption key states
of dimension dim HK = 2^(m+n) , where n is a security parameter, i.e., the scheme
will be (t, ε)-randomizing for ε = 2^(−Θ(n)) .
We define the set of decryption keys to be the set of all (m × n) binary matrices,

    K := {0, 1}^(m×n) .    (12)

This set has size |K| = 2^(mn) and each key is chosen with uniform probability.
For every decryption key A ∈ K the corresponding encryption key is defined as

    ρA := (1/2^n) ∑_{x∈{0,1}^n} |Ax, x⟩⟨Ax, x| ,    (13)

where Ax is the multiplication of the matrix A with the vector x.


The encryption operator E : B(HK ⊗ HS ) → B(HC ) consists in applying the
unitary

    U := ∑_{x∈{0,1}^n} ∑_{s,y∈{0,1}^m} |y ⊕ s, x⟩⟨y, x|^K ⊗ |s⟩⟨s|^S

and tracing out the message system S, i.e.,

    ρA,s := trS [ U ( ρA^K ⊗ |s⟩⟨s|^S ) U† ] .

This results in the cipher for the message s being

    ρA,s = (1/2^n) ∑_{x∈{0,1}^n} |Ax ⊕ s, x⟩⟨Ax ⊕ s, x| .    (14)

These states are mutually orthogonal for different messages s, so by Remark 2
this scheme is invertible.
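Since the encryption keys of Eq. (13) are diagonal in the computational basis (cf. Section 5), the whole scheme can be simulated classically: one copy of the encryption key is just a sample (Ax, x), and the cipher of Eq. (14) is (Ax ⊕ s, x). A sketch with illustrative parameters m, n of our own choosing:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 8, 16    # message length and security parameter (illustrative values)

# Decryption key: a uniformly random binary (m x n) matrix, Eq. (12).
A = rng.integers(0, 2, size=(m, n))

def sample_encryption_key():
    """One copy of the (classical) encryption key of Eq. (13): a pair (Ax, x)."""
    x = rng.integers(0, 2, size=n)
    return (A @ x) % 2, x

def encrypt(s):
    """Cipher (Ax XOR s, x) for a classical message s, Eq. (14).
    Note the encryptor only needs a key sample, never A itself."""
    ax, x = sample_encryption_key()
    return ax ^ s, x

def decrypt(c, x):
    """Knowing the decryption key A, recover s = c XOR Ax."""
    return c ^ ((A @ x) % 2)

s = rng.integers(0, 2, size=m)
c, x = encrypt(s)
print(np.array_equal(decrypt(c, x), s))  # True
```

Invertibility is immediate here: for a fixed x, the first register Ax ⊕ s determines s once A is known, while ciphers of distinct messages never collide.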

We now show that this scheme is (t, ε)-randomizing for ε = 2^(−δn+1) and
t = (1 − δ)n, 0 < δ < 1.

Theorem 2. For the QSR scheme defined above in Eqs. (12), (13) and (14)
there exists a density operator τ ∈ B(HC^⊗t ) such that for all t-tuples of messages
s = (s1 , . . . , st ) ∈ S ×t , if t = (1 − δ)n, 0 < δ < 1, then

    ‖γs − τ‖tr ≤ 2^(−δn+1) ,

where γs is the encryption of s with this scheme averaged over all possible keys,
i.e., γs = ∑_{A∈K} PK (A) ρA,s1 ⊗ · · · ⊗ ρA,st .
Proof in Appendix A.3.
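Theorem 2 can be sanity-checked by brute force for very small parameters (our own illustrative choice m = 2, n = 4, t = 2; since all the states involved are classical, everything reduces to probability distributions, and we take the simple, suboptimal choice of τ as the uniform distribution):

```python
import itertools
import numpy as np

m, n, t = 2, 4, 2                # tiny illustrative parameters; delta*n = n - t
N, M = 2 ** n, 2 ** m

def bits(i, w):
    return np.array([(i >> j) & 1 for j in range(w)])

xs = [bits(i, n) for i in range(N)]
keys = [np.array(a).reshape(m, n) for a in itertools.product([0, 1], repeat=m * n)]

def cipher_dist(s, A):
    """Distribution over (c, x) of the cipher (Ax XOR s, x), Eq. (14)."""
    p = np.zeros((M, N))
    for xi, x in enumerate(xs):
        c = ((A @ x) % 2) ^ s
        p[int(c @ [1, 2]), xi] += 1 / N
    return p.ravel()

s = np.array([1, 0])
# gamma_s: t = 2 ciphers of the same message, encrypted under the SAME key A,
# averaged over all keys A (hence the outer product for the two copies).
gamma = np.zeros((M * N) ** t)
for A in keys:
    p = cipher_dist(s, A)
    gamma += np.outer(p, p).ravel() / len(keys)
tau = np.full_like(gamma, 1 / gamma.size)     # simple choice of tau: uniform

dist = np.abs(gamma - tau).sum()              # trace norm of diagonal states
print(dist <= 2.0 ** (-(n - t) + 1))          # True: within the Theorem 2 bound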

Corollary 2. An asymmetric-key cryptosystem using this QSR scheme is (t, ε)-
indistinguishable (Definition 4) for ε = 2^(−δn+1) and t = (1 − δ)n, 0 < δ < 1.

Proof. Immediate from Theorem 2 by noticing that ρA,0 = ρA .

4.2 Quantum Messages


We will now extend the encryption scheme given above to encrypt any quantum
state, not only classical ones. To do this we will show how to combine a QSR
scheme with quantum keys which is (t, ε1)-randomizing for classical messages (like
the one from Section 4.1) with a QSR scheme with classical keys which is (1, ε2)-
randomizing for quantum states (which is the case of any standard QSR scheme,
e.g., [1, 2, 3, 4, 5, 6]) to produce a QSR scheme which is (t, ε1 + tε2)-randomizing.
The general idea is to choose a classical key for the second scheme at random, en-
crypt the quantum message with this scheme, then encrypt the classical key with
the quantum encryption key of the first scheme, and send both ciphers.

Theorem 3. Let a QSR scheme with quantum keys be given by the tuple T1 =
(PK , {ρk }k∈K , E), where E : B(HK ⊗ HS ) → B(HC ), and let a QSR scheme with
classical keys be given by the tuple T2 = (PS , {Fs }s∈S ), where Fs : B(HR ) →
B(HD ). We combine the two to produce the QSR scheme with quantum encryp-
tion keys given by T3 = (PK , {ρk }k∈K , G), where G : B(HK ⊗ HR ) → B(HC ⊗ HD )
is defined by

    G(ρk ⊗ σ) := ∑_{s∈S} PS (s) E(ρk ⊗ |s⟩⟨s|) ⊗ Fs (σ) .    (15)

If T1 forms a quantum asymmetric-key cryptosystem which is invertible and
(t, ε1)-indistinguishable (respectively randomizing) for the basis states of HS and
T2 is an invertible and (1, ε2)-randomizing QSR scheme for any state on HR ,
then T3 forms an invertible and (t, ε1 + tε2)-indistinguishable (respectively ran-
domizing) cryptosystem for all density operator messages on HR .
Proof in Appendix A.4.
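A classical-register sketch of the composition in Eq. (15), with a one-qubit Pauli one-time pad standing in for T2 and the Section 4.1 scheme as T1; the parameter choices are ours, for illustration only:

```python
import numpy as np

rng = np.random.default_rng(1)

# --- T2: one-qubit quantum one-time pad, classical key s in {0,1}^2 ---
X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)

def pauli(s):
    """F_s conjugates the message by X^s[0] Z^s[1]."""
    return np.linalg.matrix_power(X, int(s[0])) @ np.linalg.matrix_power(Z, int(s[1]))

# --- T1: the classical-key scheme of Section 4.1 with m = 2 (illustrative) ---
m, n = 2, 8
A = rng.integers(0, 2, size=(m, n))

def encrypt_key(s):
    """Cipher (Ax XOR s, x) of the classical OTP key s under T1, Eq. (14)."""
    x = rng.integers(0, 2, size=n)
    return ((A @ x) % 2) ^ s, x

def decrypt_key(c, x):
    return c ^ ((A @ x) % 2)

# --- T3, Eq. (15): encrypt sigma with F_s and the fresh key s with T1 ---
def encrypt(sigma):
    s = rng.integers(0, 2, size=m)      # fresh one-time-pad key
    U = pauli(s)
    return encrypt_key(s), U @ sigma @ U.conj().T

def decrypt(cipher):
    (c, x), enc_sigma = cipher
    U = pauli(decrypt_key(c, x))
    return U.conj().T @ enc_sigma @ U

psi = np.array([[1], [1j]]) / np.sqrt(2)
sigma = psi @ psi.conj().T
print(np.allclose(decrypt(encrypt(sigma)), sigma))  # True
```

The holder of the decryption key A first recovers the classical pad key s and then undoes the Pauli transformation, mirroring the two invertibility assumptions of Theorem 3.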

4.3 Key Size


To construct the QSR scheme for quantum messages as described in Section 4.2
we combine the scheme for classical messages from Section 4.1 and the approxi-
mate one-time pad scheme of Dickinson and Nayak [3].
The scheme from Section 4.1 is (t, ε1)-randomizing for t = (1 − δ)n and ε1 =
2^(−δn+1) , and uses a key with entropy H(K) = nm = (t + log(1/ε1) + 1)m. The
scheme of Dickinson and Nayak [3] is (1, ε2)-randomizing and uses a key with
entropy m = log d + log(1/ε2) + 4 to encrypt a quantum state of dimension d. So
by combining these our final scheme is (t, ε1 + tε2)-randomizing and uses a key
with entropy

    H(K) = (t + log(1/ε1) + 1)(log d + log(1/ε2) + 4)

to encrypt t states of dimension d. By choosing ε1 and ε2 to be polynomial
in 1/t and 1/log d respectively, the key has size H(K) = t log d + o(t log d), which
nearly reaches the asymptotic optimality found in Eq. (11), namely H(K) ≥
(1 − 8ε) t log d − 2. Exponential security can be achieved at the cost of a slightly
reduced asymptotic efficiency. For ε1 = 2^(−δ1 t) and ε2 = d^(−δ2) for some small
δ1 , δ2 > 0, the key has size H(K) = (1 + δ1)(1 + δ2) t log d + o(t log d).
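The key-size accounting above can be checked numerically (with illustrative parameters of our own choosing):

```python
import math

def key_entropy(t, log_d, eps1, eps2):
    """H(K) = (t + log(1/eps1) + 1) * (log d + log(1/eps2) + 4), Section 4.3."""
    return (t + math.log2(1 / eps1) + 1) * (log_d + math.log2(1 / eps2) + 4)

# Sanity check of H(K) = nm: with t = (1 - delta) n and eps1 = 2^(-delta n + 1),
# the first factor t + log(1/eps1) + 1 collapses to exactly n.
delta, n = 0.25, 100
t = (1 - delta) * n
eps1 = 2 ** (-delta * n + 1)
assert abs((t + math.log2(1 / eps1) + 1) - n) < 1e-9

# Exponential security: eps1 = 2^(-d1 t), eps2 = d^(-d2) gives a key of size
# roughly (1 + d1)(1 + d2) t log d.
d1, d2, t, log_d = 0.1, 0.1, 1000, 1000
H = key_entropy(t, log_d, 2 ** (-d1 * t), 2 ** (-d2 * log_d))
print(H / (t * log_d))  # close to (1 + d1)(1 + d2) = 1.21
```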

5 Consequence for Quantum Keys


The scheme presented in Section 4 uses the encryption keys

    ρA = (1/2^n) ∑_{x∈{0,1}^n} |Ax, x⟩⟨Ax, x| ,    (16)

for some (m × n)-matrix decryption key A. Although these keys are written
as quantum states using the bra-ket notation to fit in the framework for QSR
schemes with quantum keys developed in the previous sections, the states from
Eq. (16) are all diagonal in the computational basis. So they are classical and
could have been represented by a classical random variable XA which takes the
value (Ax, x) with probability 2^(−n) .
This scheme meets the optimality bound on the key size from Section 3. This
bound tells us that for a given set of decryption keys K, no matter how the
encryption keys {ρk }k∈K are constructed, the number of copies of the encryp-
tion keys which can be created, t, and the dimension of the messages which
can be encrypted, d, have to be such that t log d ≲ H(K) for the scheme to be
information-theoretically secure. But this bound is met by a scheme using clas-
sical keys, hence no scheme using quantum keys can perform better. So using
quantum keys in a QSR scheme has no advantage with respect to the message
size and number of usages of the same key over classical keys.
This result applies to both the symmetric-key and asymmetric-key models
as the optimality was shown with respect to both (t, ε)-randomization (Defini-
tion 3) and (t, ε)-indistinguishability (Definition 4), the security definitions for
the symmetric-key and asymmetric-key models respectively.

References
1. Hayden, P., Leung, D., Shor, P.W., Winter, A.: Randomizing quantum states:
Constructions and applications. Communications in Mathematical Physics 250,
371–391 (2004)
2. Ambainis, A., Smith, A.: Small pseudo-random families of matrices: Derandomizing
approximate quantum encryption. In: Jansen, K., Khanna, S., Rolim, J., Ron, D.
(eds.) RANDOM 2004 and APPROX 2004. LNCS, vol. 3122, pp. 249–260. Springer,
Heidelberg (2004)
3. Dickinson, P., Nayak, A.: Approximate randomization of quantum states with fewer
bits of key. In: AIP Conference Proceedings, vol. 864, pp. 18–36 (2006)
4. Boykin, P.O., Roychowdhury, V.: Optimal encryption of quantum bits. Physical
Review A 67, 42317 (2003)
5. Ambainis, A., Mosca, M., Tapp, A., de Wolf, R.: Private quantum channels. In:
FOCS 2000: Proceedings of the 41st Annual Symposium on Foundations of Com-
puter Science, Washington, DC, USA, p. 547. IEEE Computer Society, Los
Alamitos (2000)
6. Nayak, A., Sen, P.: Invertible quantum operations and perfect encryption of quan-
tum states. Quantum Information and Computation 7, 103–110 (2007)
7. Nielsen, M.A., Chuang, I.L.: Quantum Computation and Quantum Information.
Cambridge University Press, Cambridge (2000)
8. Harrow, A.W., Winter, A.: How many copies are needed for state discrimination?
quant-ph/0606131 (2006)
9. Hayashi, M., Kawachi, A., Kobayashi, H.: Quantum measurements for hidden sub-
group problems with optimal sample complexity. Quantum Information and Com-
putation 8, 345–358 (2008)
10. Kawachi, A., Koshiba, T., Nishimura, H., Yamakami, T.: Computational indistin-
guishability between quantum states and its cryptographic application. In: Cramer,
R. (ed.) EUROCRYPT 2005. LNCS, vol. 3494, pp. 268–284. Springer, Heidelberg
(2005)
11. Kawachi, A., Koshiba, T., Nishimura, H., Yamakami, T.: Computational indistin-
guishability between quantum states and its cryptographic application. Full version
of [10], quant-ph/0403069 (2006)
12. Alicki, R., Fannes, M.: Continuity of quantum conditional information. Journal of
Physics A: Mathematical and General 37, L55–L57 (2004)

A Proofs
A.1 Proof of Theorem 1 in Section 3 on Page 173
A theorem by Alicki and Fannes [12] tells us that for any two states $\rho^{AB}$ and $\sigma^{AB}$ on the joint system $\mathcal{H}_{AB} = \mathcal{H}_A \otimes \mathcal{H}_B$ with $\delta := \|\rho^{AB} - \sigma^{AB}\|_{tr} \leq 1$ and $d_A := \dim \mathcal{H}_A$,

$$\left| S(\rho^{AB} \mid \rho^B) - S(\sigma^{AB} \mid \sigma^B) \right| \leq 4\delta \log d_A + 2h(\delta), \qquad (17)$$

where $S(\rho^{AB} \mid \rho^B) := S(\rho^{AB}) - S(\rho^B)$ is the conditional von Neumann entropy and $h(p) := p \log \frac{1}{p} + (1-p) \log \frac{1}{1-p}$ is the binary entropy. Since $h(\delta) \leq 1$, from Eq. (17) we get

$$\left\| \rho^{AB} - \sigma^{AB} \right\|_{tr} \geq \frac{\left| S(\rho^{AB} \mid \rho^B) - S(\sigma^{AB} \mid \sigma^B) \right| - 2}{4 \log d_A}.$$
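As a quick numeric illustration of the rearranged bound (our own addition, not part of the original proof; the dimension and entropy-gap values below are made up), one can evaluate the right-hand side directly:

```python
import math

def binary_entropy(p):
    """h(p) = p*log2(1/p) + (1-p)*log2(1/(1-p)), with h(0) = h(1) = 0."""
    if p in (0.0, 1.0):
        return 0.0
    return p * math.log2(1.0 / p) + (1.0 - p) * math.log2(1.0 / (1.0 - p))

def trace_distance_lower_bound(entropy_gap, dim_a):
    """Rearranged Alicki-Fannes bound: ||rho - sigma||_tr >= (gap - 2)/(4 log dA),
    after using h(delta) <= 1 to absorb the 2h(delta) term."""
    return (entropy_gap - 2.0) / (4.0 * math.log2(dim_a))

print(binary_entropy(0.5))                    # 1.0, the maximum of h
print(trace_distance_lower_bound(10.0, 16))   # (10 - 2) / (4 * 4) = 0.5
```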
178 A. Kawachi and C. Portmann

By applying this to the left-hand side of Eq. (9) we obtain

$$\left\| \rho^{S^t C^t} - \rho^{S^t} \otimes \rho^{C^t} \right\|_{tr} \geq \frac{S(\rho^{S^t}) + S(\rho^{C^t}) - S(\rho^{S^t C^t}) - 2}{4t \log |S|}.$$

To prove this theorem it remains to show that

$$S(\rho^{S^t}) + S(\rho^{C^t}) - S(\rho^{S^t C^t}) \geq H(S^t) - H(K).$$

For this we will need the two following bounds on the von Neumann entropy (see e.g. [7]):

$$S\Big(\sum_{x \in X} p_x \rho_x\Big) \geq \sum_{x \in X} p_x S(\rho_x),$$

$$S\Big(\sum_{x \in X} p_x \rho_x\Big) \leq H(X) + \sum_{x \in X} p_x S(\rho_x).$$

Equality is obtained in the second equation if the states $\{\rho_x\}_{x \in X}$ are all mutually orthogonal. By using these bounds and Eq. (10) we see that

$$S(\rho^{S^t C^t}) = H(S^t) + \sum_{s \in S^{\times t}} P_{S^t}(s)\, S\Big(\sum_{k \in K} P_K(k)\, \rho_{k,s_1} \otimes \cdots \otimes \rho_{k,s_t}\Big)$$
$$\leq H(S^t) + H(K) + \sum_{\substack{s \in S^{\times t} \\ k \in K}} P_K(k) P_{S^t}(s)\, S(\rho_{k,s_1} \otimes \cdots \otimes \rho_{k,s_t}),$$

$$S(\rho^{S^t}) = H(S^t),$$

$$S(\rho^{C^t}) \geq \sum_{k \in K} P_K(k)\, S\Big(\sum_{s \in S^{\times t}} P_{S^t}(s)\, \rho_{k,s_1} \otimes \cdots \otimes \rho_{k,s_t}\Big)$$
$$= H(S^t) + \sum_{\substack{s \in S^{\times t} \\ k \in K}} P_K(k) P_{S^t}(s)\, S(\rho_{k,s_1} \otimes \cdots \otimes \rho_{k,s_t}).$$

We have equality in the last line because the scheme is invertible on $S$, i.e., by Definition 2 and Remark 2 the states $\{\rho_{k,s_1} \otimes \cdots \otimes \rho_{k,s_t}\}_{s_1,\ldots,s_t \in S}$ are mutually orthogonal. By putting this all together we conclude the proof.

A.2 Proof of Corollary 1 in Section 3 on Page 173

Definition 5 says that for a scheme to be $(t, \epsilon)$-secure we need

$$\left\| \rho^{S^t C^t} - \rho^{S^t} \otimes \rho^{C^t} \right\|_{tr} \leq \epsilon$$

for all probability distributions $P_{S^t}$. So for the uniform distribution we get from Theorem 1 that for a scheme to be $(t, \epsilon)$-secure we need

$$H(K) \geq (1 - 4\epsilon)\, t \log |S| - 2.$$

By Lemma 1 we then have the condition

$$H(K) \geq (1 - 8\epsilon)\, t \log |S| - 2$$

for the scheme to be $(t, \epsilon)$-randomizing for the classical messages $S$. And as classical messages are a subset of quantum messages (namely an orthonormal basis of the message Hilbert space), this bound extends to the case of quantum messages on a Hilbert space of dimension $d_S = |S|$.

As $(t, \epsilon)$-randomization is a special case of $(t, \epsilon)$-indistinguishability, namely for $t_1 = t$, it is immediate that this lower bound also applies to $(t, \epsilon)$-indistinguishability.

A.3 Proof of Theorem 2 in Section 4.1 on Page 175

The $\tau$ in question is the fully mixed state $\tau = \frac{1}{2^{t(m+n)}} I$. By writing $\gamma_s$ with the values of the ciphers from Eq. (14) we get

$$\gamma_s = \frac{1}{2^{mn}\, 2^{tn}} \sum_{\substack{A \in \{0,1\}^{m \times n} \\ x_1, \ldots, x_t \in \{0,1\}^n}} |Ax_1 \oplus s_1, x_1, \ldots, Ax_t \oplus s_t, x_t\rangle\langle Ax_1 \oplus s_1, x_1, \ldots, Ax_t \oplus s_t, x_t|.$$

A unitary performing bit flips can take $\gamma_s$ to $\gamma_r$ for any $s, r \in S^t$, so

$$\left\| \gamma_s - \frac{1}{2^{t(m+n)}} I \right\|_{tr} = \left\| \gamma_r - \frac{1}{2^{t(m+n)}} I \right\|_{tr},$$

and it is sufficient to evaluate

$$\left\| \gamma_0 - \frac{1}{2^{t(m+n)}} I \right\|_{tr} = \sum_{e \in \mathrm{EVec}(\gamma_0)} \left| w_e - \frac{1}{2^{t(m+n)}} \right|, \qquad (18)$$

where the $e$ are the eigenvectors of $\gamma_0$ and the $w_e$ the corresponding eigenvalues. So we need to calculate the eigenvalues of

$$\gamma_0 = \frac{1}{2^{mn}\, 2^{tn}} \sum_{\substack{A \in \{0,1\}^{m \times n} \\ x_1, \ldots, x_t \in \{0,1\}^n}} |Ax_1, x_1, \ldots, Ax_t, x_t\rangle\langle Ax_1, x_1, \ldots, Ax_t, x_t|. \qquad (19)$$

Let us fix $x_1, \ldots, x_t$. It is immediate from the linearity of $x \mapsto Ax$ that if exactly $d$ of the vectors $\{x_i\}_{i=1}^t$ are linearly independent, then

$$\sum_{A \in \{0,1\}^{m \times n}} |Ax_1, x_1, \ldots, Ax_t, x_t\rangle\langle Ax_1, x_1, \ldots, Ax_t, x_t|$$

uniformly spans a space of dimension $2^{dm}$, and for different values of $x_1, \ldots, x_t$ these subspaces are all mutually orthogonal. Let $D_t$ be the random variable representing the number of independent vectors amongst $t$ binary vectors of length $n$, when chosen uniformly at random, and let $P_{D_t}(d) = \Pr[D_t = d]$ be the probability that exactly $d$ of these vectors are linearly independent. The matrix given in Eq. (19) then has exactly $2^{tn} P_{D_t}(d)\, 2^{dm}$ eigenvectors with eigenvalue $\frac{1}{2^{dm}\, 2^{tn}}$, for $0 \leq d \leq t$. The remaining eigenvectors have eigenvalue 0.

So Eq. (18) becomes (the eigenvectors with eigenvalue 0 contribute the same total deviation as those with nonzero eigenvalue, whence the factor 2)

$$\sum_{e \in \mathrm{EVec}(\gamma_0)} \left| w_e - \frac{1}{2^{t(m+n)}} \right| = 2 \sum_{d=0}^{t} 2^{tn} P_{D_t}(d)\, 2^{dm} \left( \frac{1}{2^{dm}\, 2^{tn}} - \frac{1}{2^{t(m+n)}} \right)$$
$$= 2 \sum_{d=0}^{t} P_{D_t}(d) \left( 1 - 2^{-(t-d)m} \right)$$
$$\leq 2 \sum_{d=0}^{t-1} P_{D_t}(d) = 2\left(1 - P_{D_t}(t)\right)$$
$$\leq 2^{t-n+1}.$$

For $t = (1 - \delta)n$, $0 < \delta < 1$, we have for all $s \in S^t$, $\|\gamma_s - \tau\|_{tr} \leq 2^{-\delta n + 1}$.
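The last two steps can be checked numerically. The snippet below (our own illustration, not part of the paper) computes $P_{D_t}(t) = \prod_{i=0}^{t-1}(1 - 2^{i-n})$, the probability that all $t$ vectors are independent, and verifies the bound $2(1 - P_{D_t}(t)) \leq 2^{t-n+1}$ for a few parameter choices:

```python
def prob_all_independent(t, n):
    """P_{D_t}(t): probability that t uniform vectors in F_2^n are linearly
    independent. The (i+1)-th vector must avoid the 2^i-element span of the
    first i vectors, which happens with probability 1 - 2^(i-n)."""
    p = 1.0
    for i in range(t):
        p *= 1.0 - 2.0 ** (i - n)
    return p

# Final step of the proof: 2*(1 - P_{D_t}(t)) <= 2^(t-n+1).
for n in (16, 32, 64):
    for t in (n // 4, n // 2, 3 * n // 4):
        lhs = 2.0 * (1.0 - prob_all_independent(t, n))
        rhs = 2.0 ** (t - n + 1)
        assert lhs <= rhs, (n, t, lhs, rhs)
print("bound holds on all tested (n, t)")
```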

A.4 Proof of Theorem 3 in Section 4.2 on Page 175

The invertibility of the scheme formed with $T_3$ is immediate. To prove the indistinguishability we need to show that for all $t_1 \in \{0, 1, \ldots, t\}$, $t_2 := t - t_1$, there exists a density operator $\tau \in \mathcal{B}(\mathcal{H}_C^{\otimes t_1} \otimes \mathcal{H}_K^{\otimes t_2} \otimes \mathcal{H}_D^{\otimes t_1})$ such that for all $t_1$-tuples of message density operators $\omega = (\sigma_1, \ldots, \sigma_{t_1}) \in \mathcal{B}(\mathcal{H}_R)^{\times t_1}$, $\|\rho_\omega^E - \tau\|_{tr} \leq \epsilon$, where $\rho_\omega^E = \sum_{k \in K} P_K(k)\, G(\rho_k \otimes \sigma_1) \otimes \cdots \otimes G(\rho_k \otimes \sigma_{t_1}) \otimes \rho_k^{\otimes t_2}$.

Let us write $\gamma_s := \sum_{k \in K} P_K(k)\, \rho_{k,s_1} \otimes \cdots \otimes \rho_{k,s_{t_1}} \otimes \rho_k^{\otimes t_2}$, where $s = (s_1, \ldots, s_{t_1})$ and $\rho_{k,s_i} = E(\rho_k \otimes |s_i\rangle\langle s_i|)$, and $\mu_\sigma := \sum_{s \in S} P_S(s)\, F_s(\sigma)$. And let $\tau_1$ and $\tau_2$ be the two states such that $\|\gamma_s - \tau_1\|_{tr} \leq \epsilon_1$ and $\|\mu_\sigma - \tau_2\|_{tr} \leq \epsilon_2$ for all $s$ and $\sigma$ respectively. We define $\delta_s := \gamma_s - \tau_1$ and $\tau := \tau_1 \otimes \tau_2^{\otimes t_1}$. Then by the triangle inequality and changing the order of the registers

$$\left\| \rho_\omega^E - \tau \right\|_{tr} \leq \left\| \sum_{s \in S^{\times t_1}} P_{S^t}(s)\, \delta_s \otimes F_{s_1}(\sigma_1) \otimes \cdots \otimes F_{s_{t_1}}(\sigma_{t_1}) \right\|_{tr} + \left\| \tau_1 \otimes \mu_{\sigma_1} \otimes \cdots \otimes \mu_{\sigma_{t_1}} - \tau \right\|_{tr}$$
$$\leq \sum_{s \in S^{\times t_1}} P_{S^t}(s) \left\| \delta_s \right\|_{tr} + \sum_{i=1}^{t_1} \left\| \mu_{\sigma_i} - \tau_2 \right\|_{tr}$$
$$\leq \epsilon_1 + t_1 \epsilon_2.$$

As $(t, \epsilon)$-randomization is a special case of $(t, \epsilon)$-indistinguishability, namely for $t_1 = t$, it is immediate that $T_3$ is also $(t, \epsilon_1 + t\epsilon_2)$-randomizing.
Secure PRNGs from Specialized Polynomial
Maps over Any Fq

Feng-Hao Liu^1, Chi-Jen Lu^2, and Bo-Yin Yang^2

^1 Department of Computer Science, Brown University, Providence RI, USA
fenghao@[Link]
^2 Institute of Information Science, Academia Sinica, Taipei, Taiwan
{cjlu,byyang}@[Link]

Abstract. Berbain, Gilbert, and Patarin presented QUAD, a pseudorandom number generator (PRNG), at Eurocrypt 2006.
stream cipher) may be proved secure based on an interesting hardness
assumption about the one-wayness of multivariate quadratic polynomial
systems over F2 .
The original BGP proof only worked for F_2 and left a gap to general F_q. We show that the result can be generalized to an arbitrary finite field F_q, thus producing a stream cipher with alphabets in F_q.
Further, we generalize the underlying hardness assumption to special-
ized systems in Fq (including F2 ) that can be evaluated more efficiently.
Barring breakthroughs in the current state-of-the-art for system-solving,
a rough implementation of a provably secure instance of our new PRNG
is twice as fast and takes 1/10 the storage of an instance of QUAD with
the same level of provable security.
Recent results on the effect of specialization on security are also examined. We conclude that our ideas are consistent with these new developments and complement them, suggesting that we can build more efficient secure primitives based on specialized polynomial maps.

Keywords: sparse multivariate polynomial map, PRNG, hash function,


provable security.

1 Introduction

Cryptographers have used multivariate polynomial maps for primitives since


Matsumoto-Imai [26] but there is a dearth of results proving security based on
plausible hardness assumptions. Berbain, Gilbert, and Patarin presented a breakthrough at Eurocrypt 2006, when they proposed a PRNG/stream cipher that is provably secure provided that the class of multivariate quadratic polynomials is probabilistically one-way:

Class MQ(q, n, m): For given q, n, m, the class MQ(q, n, m) consists of all
systems of m quadratic polynomials in Fq with n variables. To choose a
random system S from MQ(q, n, m), we write each polynomial Pk (x) as

J. Buchmann and J. Ding (Eds.): PQCrypto 2008, LNCS 5299, pp. 181–202, 2008.
© Springer-Verlag Berlin Heidelberg 2008

 
$$P_k(x) = \sum_{1 \leq i \leq j \leq n} a_{ijk}\, x_i x_j + \sum_{1 \leq i \leq n} b_{ik}\, x_i + c_k,$$
where every $a_{ijk}, b_{ik}, c_k$ is chosen uniformly in $\mathbb{F}_q$.
Solving $S(x) = b$ for any MQ system $S$ is known as the "multivariate quadratic" problem.
It is often claimed that the NP-completeness of this problem [19] is the basis for multivariate public-key cryptosystems. We could instead take the $P_i$'s to be polynomials of degree $d$ rather than quadratic and get the class of "multivariate polynomial systems" MP(q, d, n, m). This contains MQ(q, n, m) as a subset, so solving arbitrary $S(x) = b$ for any MP system $S$ would be no easier. However, it is not easy to base a proof on worst-case hardness; the premise used in [7] is the following average-case hardness assumption:
Assumption MQ: Given any $k$ and prime power $q$, for parameters $n, m$ satisfying $m/n = k + o(1)$, no probabilistic algorithm can solve (in poly(n) time) any fixed $\varepsilon > 0$ proportion of instances consisting of a system $S$ drawn from MQ(q, n, m) and a vector $b = (b_1, b_2, \ldots, b_m)$ drawn from $S(U_n)$, where $U_n$ is the uniform distribution over $(\mathbb{F}_q)^n$, such that $S(x) = b$.

With this premise, [7, Theorem 4] proved the QUAD PRNG secure over F_2. However, a looseness factor in its security proof means that provably secure QUAD instances over F_2 are not yet of practical speed. The proof also does not work for fields larger than F_2. A similar result over any F_q is non-trivial to prove, which we do here with different and more involved techniques. However, instances of QUAD with the same-size state over larger fields are significantly less secure [33]. To increase the difficulty of solving a system of nonlinear polynomial equations, we can plausibly increase (a) the field size q, (b) the number of variables n, or (c) the degree d of the system (cf. [3,4,31]). Each costs time and space (for a reduction from the MQ problem over F_q to the F_2 case, see [30]). Even with a hardware implementation, an increase in resource consumption is inevitable.
A logical next step is to combine all these approaches but find polynomials that are easier to evaluate. A natural candidate is sparsity in the chosen polynomials. To the best of our knowledge, however, there are no prior positive results on the provable security of specialized polynomial systems, and specifically sparse ones.
So the questions we are trying to answer are:

– Can we prove a result similar to [7] that allows for more efficiently evaluated specialized systems?
– What do we know about how these specializations affect the complexity of system-solving?

1.1 Our New Ideas and Main Results

Instead of MQ, we investigate a class SMP(q, d, n, m, (η_2, ..., η_d)) of sparse polynomial systems with arbitrary affine parts and terms at other degrees with specified density. I.e., $S = (P_1(x), P_2(x), \ldots, P_m(x)) \in$ SMP(q, d, n, m, (η_2, ..., η_d)) consists of $m$ polynomials of degree $d$ in the variables $x = (x_1, x_2, \ldots, x_n)$; each $P_i$ is a degree-$d$ polynomial in which, for each $j \geq 2$, exactly $\eta_j = \eta_j(n)$ nonzero degree-$j$ terms are present. The affine coefficients are chosen totally at random. All operations and coefficients are in $\mathbb{F}_q$.

To rephrase, the $i$-th polynomial can be written as
$$P_i(x) = \sum_{j=2}^{d} Q_j^{(i)}(x) + \sum_{1 \leq j \leq n} a_{ij}\, x_j + c_i,$$
where each $Q_j^{(i)}(x)$ can be written in the form $\sum_{1 \leq \sigma(1) \leq \sigma(2) \leq \cdots \leq \sigma(j) \leq n} a_{(\sigma(1), \sigma(2), \ldots, \sigma(j))}\, x_{\sigma(1)} x_{\sigma(2)} \cdots x_{\sigma(j)}$, i.e., the sum of $\eta_j$ monomials of degree $j$. "A random system from SMP(q, d, n, m, (η_2, ..., η_d))" then has the following probability distribution: all $a_{ij}, c_i$ are uniformly chosen from $\mathbb{F}_q$. To determine each $Q_j^{(i)}(x)$, we first uniformly choose $\eta_j$ out of the $\binom{n+j-1}{j}$ coefficients to be nonzero, then uniformly choose each of these nonzero coefficients from $\mathbb{F}_q^* := \mathbb{F}_q \setminus \{0\}$. All the other coefficients will be zero.
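To make the sampling procedure concrete, here is a minimal sketch in Python (our own illustration, using a dictionary-of-monomials representation; it is not the paper's optimized implementation):

```python
import itertools
import random

def random_smp_system(q, d, n, m, eta):
    """Sample a system from SMP(q, d, n, m, (eta[2], ..., eta[d])).

    Each polynomial gets a fully random affine part, plus, for each degree
    j >= 2, exactly eta[j] monomials (chosen uniformly among the
    C(n+j-1, j) degree-j monomials) with coefficients from Fq* = Fq \ {0}.
    A polynomial is a dict {monomial: coeff}; a monomial is a sorted tuple
    of variable indices, with () denoting the constant term.
    """
    system = []
    for _ in range(m):
        poly = {(): random.randrange(q)}               # constant term c_i
        for v in range(n):
            poly[(v,)] = random.randrange(q)           # affine terms a_ij
        for j in range(2, d + 1):
            monos = list(itertools.combinations_with_replacement(range(n), j))
            for mono in random.sample(monos, eta[j]):  # eta_j nonzero terms
                poly[mono] = random.randrange(1, q)    # coefficient in Fq*
        system.append(poly)
    return system

def evaluate(system, x, q):
    """Evaluate S(x) coordinate-wise over Fq."""
    out = []
    for poly in system:
        acc = 0
        for mono, coeff in poly.items():
            term = coeff
            for v in mono:
                term = term * x[v] % q
            acc = (acc + term) % q
        out.append(acc)
    return out

S = random_smp_system(q=7, d=3, n=6, m=8, eta={2: 4, 3: 2})
b = [random.randrange(7) for _ in range(6)]
print(evaluate(S, b, 7))   # a vector in (F_7)^8
```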
We now propose a probabilistic one-wayness assumption on which to base a security theorem.
Assumption SMP: For given $q, d$, and for $n, m, \eta_2, \ldots, \eta_d$ such that $m/n = k + o(1)$ and $\eta_i/n = k_i + o(1)$ (where $k, k_2, k_3, \ldots$ are constants), there is no probabilistic algorithm which can solve (in poly(n) time) any fixed $\varepsilon > 0$ proportion of instances consisting of a system $S(x)$ drawn from SMP(q, d, n, m, (η_2, ..., η_d)) and a vector $b = (b_1, b_2, \ldots, b_m)$ drawn from $S(U_n)$, where $U_n$ is the uniform distribution over $(\mathbb{F}_q)^n$, such that $S(x) = b$.
In Secs. 2–3, Assumption SMP is shown to yield a secure PRNG (and hence a provably secure stream cipher) for any q. The key to this extension to general F_q involves a reconstruction of linear polynomials, which is a non-trivial generalization of the Goldreich-Levin hard-core bit by Goldreich-Rubinfeld-Sudan [21].
We then check that SMP instances are hard to solve on average (i.e., not just in the worst case) via the fastest known generic (cf. Sec. 4 and Appendix B) and special-purpose algorithms. Finally we discuss their practical use. Preliminary implementations of our SPELT (Sparse Polynomials, Every Linear Term) construction achieve 5541 and 11744 cycles per byte for an SMP-based secure stream cipher over F_16 (quartic, 108 variables) and F_2 (cubic, 208 variables) respectively. The former is at least twice as fast as any other stream cipher provably secure at the same parameters (cf. Sec. 5.2).
There is another possible candidate for the one-wayness assumption, SRQ, proposed by Prof. Jintai Ding, that is worth studying. We give a brief description in Appendix C, and address an interesting potential topic for future work.
The authors would like to thank Prof. Jintai Ding for his proofreading, suggestions, and discussions. The full version of this work can be found at [Link]

1.2 Previous Work


There have been "provably secure" PRNGs based on discrete log [20], on hardness of factorization (as in Blum, Blum, and Shub [10]) or a modification thereof [29], or on MQ [7]. But the security proofs always require impractically high parameters for "provable security", which limits their utility. For example:
– The BBS stream generator at commonly used parameters is not provably secure [23, Sec. 6.1].
– With [29], the specified security level was $2^{70}$, while today's cryptographers usually aim for $2^{80}$ (3DES units).
– Similarly, with QUAD there is a gap between the "recommended" instances and the provably secure instances (i.e., the tested instances were unprovable or unproven [33]).
– PRNGs based on the decisional Diffie-Hellman assumption have almost no gap between the hardness of breaking the PRNG and that of solving the underlying intractable problem, but known primitives based on DDH and exponentiation in $\mathbb{Z}_p$ [22,16] are generally slower than those based on other assumptions.
The generic methods for solving polynomial systems (Faugère's F4-F5 and XL derivatives) are not affected drastically by sparsity. In the former, sparsity is quickly lost, and tests show that there is no substantial difference in timing when solving SMP instances. Recent versions of XL [33] speed up proportionally to sparsity. We therefore surveyed the literature for recent results on solving or attacking specialized systems in crypto, listed below. These results do not contradict our hardness assumption.
– Aumasson-Meier (ICISC 2007) [1] shows that in some cases sparsity in primarily underdefined systems (more variables than equations) leads to improved attacks. The results are very interesting and merit more study, but do not apply to overdetermined systems in general.
– Bard-Courtois-Jefferson [2] tests SAT solvers on uniformly sparse F_2 equations, and gives numbers.
– Raddum-Semaev [27,28] attack "clumped" systems (even though the title says "sparse"). Similarly the Courtois-Pieprzyk XSL attack [13] requires a lot of structure (i.e., "clumping").

2 PRNG Based on Specialized Polynomial Map in F2


This section both provides a recap of past results and extends them to specialized
maps over F2 . We will start with definitions and models, then give the key results
on the provable security level.

Computational Distinguishability: Probability distributions $D_1$ and $D_2$ over a finite set $\Omega$ are computationally distinguishable with computing resources $R$ and advantage $\varepsilon$ if there exists a probabilistic algorithm $A$ which on any input $x \in \Omega$ outputs the answer 1 (accept) or 0 (reject) using computing resources at most $R$ and satisfies $|\Pr_{x \in D_1}(A(x) = 1) - \Pr_{x \in D_2}(A(x) = 1)| > \varepsilon$. The above probabilities are taken not only over $x$ values distributed according to $D_1$ or $D_2$, but also over the random choices used by algorithm $A$. Algorithm $A$ is called a distinguisher with advantage $\varepsilon$.
If no such algorithm exists, then we say that $D_1$ and $D_2$ are computationally indistinguishable with advantage $\varepsilon$. If $R$ is not specified, we implicitly mean feasible computing resources (e.g., $< 2^{80}$ simple operations, and reasonable limits [usually polynomially many] on sampling from $D_1$ and $D_2$).

PRNG: Let $n < L$ be two integers and $K = \mathbb{F}_q$ a finite field. The function $G : K^n \to K^L$ is said to be a pseudorandom number generator (PRNG) if the probability distribution of the random variable $G(x)$, where the vector $x$ is uniformly random in $K^n$, is computationally indistinguishable (with distinguisher resource $R$) from a uniformly random vector in $K^L$. Usually $q = 2$ but this is not required.
Linear polynomial maps: A linear polynomial map $R : (\mathbb{F}_q)^n \to \mathbb{F}_q$ means $R(x) = \sum_{i=1}^{n} a_i x_i$, where $x = (x_1, x_2, \ldots, x_n)$ and $x_1, x_2, \ldots, x_n$ are variables. If we give these variables values in $\mathbb{F}_q$, setting $(x_1, x_2, \ldots, x_n) = (b_1, b_2, \ldots, b_n)$ for $b_i \in \mathbb{F}_q$, denoted as $b$, then $R(b) = \sum_{i=1}^{n} a_i b_i$ is an element of $\mathbb{F}_q$.
In the following sections, a "random" linear polynomial map (or form) has the coefficients $a_i$ randomly chosen from $\mathbb{F}_q$. Also, when we mention $R$ or $R(x)$, we refer to the function, but when we write $R(b)$, we mean the value of the function $R$ on the input vector $b$.
Instance from SMP (or MQ): If $S$ is an instance drawn from SMP(q, d, n, m, (η_2, ..., η_d)), then $S(x) = (P_1(x), P_2(x), \ldots, P_m(x))$ (where $x = (x_1, x_2, \ldots, x_n)$ are variables) is a function that maps $(\mathbb{F}_q)^n \to (\mathbb{F}_q)^m$, and each $P_i(x)$ has the probability distribution described in Sec. 1.1. For example, if $b = (b_1, b_2, \ldots, b_n)$ is a vector in $(\mathbb{F}_q)^n$, then $S(b) = (P_1(b), P_2(b), \ldots, P_m(b))$, a value in $(\mathbb{F}_q)^m$.
Note: Henceforth we will also say SMP(n, m) for short, if no confusion is likely to ensue.
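For concreteness, the linear-form definition amounts to one line of code; the field size and coefficients below are toy values of our own choosing:

```python
# A linear form R(x) = sum_i a_i * x_i over F_q, evaluated at a point b.
q = 11
a = [3, 7, 0, 5]                                   # coefficients of R
R = lambda b: sum(ai * bi for ai, bi in zip(a, b)) % q

b = [1, 2, 3, 4]
print(R(b))   # (3*1 + 7*2 + 0*3 + 5*4) % 11 = 37 % 11 = 4
```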

Given any PRNG, there is a standard way to stretch it into an old-fashioned stream cipher (Prop. 1), i.e., a stream cipher without IV and key setup. There are ways to set up an initial state securely, such as in Sec. 3.1. Thus we concentrate our efforts on building a PRNG from any MQ family of maps from $\mathbb{F}_2^n \to \mathbb{F}_2^m$; to do so, we need to
1. Show that if an instance $S$ drawn from MQ is not a PRNG, then for a (secretly) given vector $b$ we can predict, with the help of information from $S$, $S(b)$, and any linear form $R$, the value of $R(b)$ with probability strictly larger than $1/2 + \varepsilon$; then
2. Use the Goldreich-Levin theorem, which states that the value $R(b)$ of any linear function $R$ is a hard-core bit of any $\mathbb{F}_2^n \to \mathbb{F}_2^m$ one-way function $S$, given $S(b)$ and $R$. I.e., being able to guess $R(b)$ from $S$, $S(b)$, and $R$ with probability strictly larger than $1/2 + \varepsilon$ means that we can invert $S(b)$ with non-negligible probability.

2.1 From Distinguisher to Predictor


In fact, the following two results are valid for any $K = \mathbb{F}_q$. In [7], the proofs covered only the $\mathbb{F}_2$ case; the generalization of these two results, however, is straightforward. Therefore, for simplicity, we state the generalized propositions here, even though this section treats the $\mathbb{F}_2$ case.

Proposition 1 ([7]). Take a stream cipher with $Q : K^n \to K^n$ and $P : K^n \to K^r$ as the update and output filter functions and random initial state $x_0$; that is, starting from the initial state $x_0$, at each step we update with $x_{i+1} = Q(x_i)$ and output $y_i = P(x_i)$:

x_0 → x_1 = Q(x_0) → x_2 = Q(x_1) → x_3 = Q(x_2) → · · ·
 ↓         ↓               ↓               ↓
y_0 = P(x_0)   y_1 = P(x_1)   y_2 = P(x_2)   y_3 = P(x_3)   · · ·

If we can distinguish between its first $\lambda$ blocks of output $(y_0, y_1, \ldots, y_{\lambda-1})$ and a truly random vector in $K^{\lambda r}$ with advantage $\varepsilon$ in time $T$, then we can distinguish between the output of a truly random vector in $K^{n+r}$ and the output of $S = (P, Q)$ in time $T + \lambda T_S$ with advantage $\varepsilon/\lambda$. [Standard proof in Appendix A.]
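The construction in Prop. 1 is easy to state in code. The sketch below is our own illustration; the quadratic maps are toy stand-ins over F_2, not a securely sampled MQ instance, and only show the update/output loop:

```python
import random

# Prop. 1 construction: x_{i+1} = Q(x_i), y_i = P(x_i).
q, n, r = 2, 8, 4   # toy sizes; a real instance uses n in the hundreds

random.seed(1)
A = [[[random.randrange(q) for _ in range(n)] for _ in range(n)]
     for _ in range(n + r)]

def quad(coeffs, x):
    """Evaluate one quadratic form sum c_ij x_i x_j over F_q."""
    return sum(coeffs[i][j] * x[i] * x[j]
               for i in range(n) for j in range(n)) % q

P = lambda x: [quad(A[k], x) for k in range(r)]        # output filter K^n -> K^r
Q = lambda x: [quad(A[r + k], x) for k in range(n)]    # state update  K^n -> K^n

def keystream(x0, blocks):
    x, out = x0, []
    for _ in range(blocks):
        out.append(P(x))
        x = Q(x)
    return out

x0 = [random.randrange(q) for _ in range(n)]
print(keystream(x0, 3))   # first three r-symbol output blocks
```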
Proposition 2 (an extension of [7]). Let $K = \mathbb{F}_q$. Suppose there is an algorithm $A$ that, given a system $S : K^n \to K^m$ chosen from SMP(q, d, n, m, (η_2, ..., η_d)), distinguishes $S(U_n)$ from the uniform random distribution $U_m$ (where $U_r$ means the uniform distribution over $K^r$) with advantage at least $\varepsilon$ in time $T$. Then there is an algorithm $B$ that, given (1) a system $S : K^n \to K^m$ from SMP(n, m), (2) any $K^n \to K$ linear form $R$, and (3) $y = S(b)$, where $b$ is a secret input value randomly chosen from $K^n$, predicts $R(b)$ with success probability at least $(1 + \varepsilon/2)/q$ using at most $T + 2T_S$ operations.
Proof. Without loss of generality, we may suppose that $A$ has probability at least $\varepsilon$ higher of returning 1 on the input distribution $(S, S(U_n))$ than on the distribution $(S, U_m)$. Define a recentered distinguisher

$$A'(S, w) := \begin{cases} A(S, w), & \text{with probability } 1/2, \\ 1 - A(S, u),\ u \in K^m \text{ uniformly random}, & \text{with probability } 1/2; \end{cases}$$

then $A'$ returns 1 with probability $\frac{1+\varepsilon}{2}$ on input $(S, S(U_n))$ and with probability $\frac{1}{2}$ on input $(S, U_m)$.
Now, given an input $S$ and $y \in K^m$, the algorithm $B$ first randomly chooses a value $v \in K$ (representing a guess for $R(b)$), then randomly chooses a vector $u \in K^m$, and forms $S' := S + Ru : K^n \to K^m$. This is equal to $S$ plus a random linear polynomial (see above for the meaning of a random linear form) and hence is distributed as SMP(n, m). Define algorithm $B$ as follows:

$$B(S, y, R) := \begin{cases} v, & \text{if } A'(S', y + vu) = 1; \\ \text{a uniformly random element of } K \setminus \{v\}, & \text{if } A'(S', y + vu) = 0. \end{cases}$$

If $v = R(b)$, then $y + vu = S'(b)$; otherwise $y + vu$ equals $S'(b)$ plus a nonzero multiple of the random vector $u$, and hence is equivalent to a uniformly random vector. The probability that $B := B(S, S(b), R)$ is the correct guess is hence

$$\Pr(B = R(b)) = \Pr(B = v \mid v = R(b)) \Pr(v = R(b)) + \Pr(B = R(b) \mid v \neq R(b)) \Pr(v \neq R(b))$$
$$= \frac{1}{q} \cdot \frac{1 + \varepsilon}{2} + \frac{q-1}{q} \cdot \frac{1}{q-1} \cdot \frac{1}{2} = \frac{1}{q}\left(1 + \frac{\varepsilon}{2}\right).$$
Note: We see that the reasoning can work this way if and only if $S' = S + Ru$ has the same distribution as $S$. Otherwise, we cannot guarantee that the distinguisher $A'$ will output the same distribution.
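A quick Monte Carlo model of algorithm $B$ (our own illustration, not from the paper) confirms the $(1 + \varepsilon/2)/q$ success probability; the distinguisher $A'$ is abstracted by its acceptance probabilities from the proof:

```python
import random

def predictor_success(q, eps, trials=200_000):
    """Model of algorithm B: A' accepts with probability (1+eps)/2 when the
    guess v equals R(b) (so y + vu = S'(b)), and with probability 1/2 when
    v is wrong (the input then looks uniformly random)."""
    correct = 0
    for _ in range(trials):
        target = random.randrange(q)          # the true value R(b)
        v = random.randrange(q)               # B's guess for R(b)
        p_accept = (1 + eps) / 2 if v == target else 0.5
        if random.random() < p_accept:        # A'(S', y + vu) = 1: keep v
            guess = v
        else:                                 # A' = 0: pick from K \ {v}
            guess = random.choice([w for w in range(q) if w != v])
        correct += (guess == target)
    return correct / trials

q, eps = 5, 0.4
print(predictor_success(q, eps))   # ~ (1 + eps/2)/q = 0.24
```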

2.2 Constructing a PRNG from MQ (F2 Case)


Proposition 3 ([25], [7]). Suppose there is an algorithm $B$ that, given a system $S : \mathbb{F}_2^n \to \mathbb{F}_2^m$ from MQ(2, n, m), a random $n$-bit to one-bit linear form $R$, and the image $S(b)$ of a randomly chosen unknown $b$, predicts $R(b)$ with probability at least $\frac{1}{2} + \varepsilon$ over all possible inputs $(S, S(b), R)$ using time $T$. Then there is an algorithm $C$ that, given $S$ and the $m$-bit image $S(b)$ of a randomly chosen $n$-bit vector $b$, produces a preimage of $S(b)$ with probability (over all $b$ and $S$) at least $\varepsilon/2$ in time

$$T' = \frac{8n^2}{\varepsilon^2}\left(T + \log\frac{8n}{\varepsilon^2} + 2\right) + \frac{8n}{\varepsilon^2}\, T_S.$$
Note: This is essentially the Goldreich-Levin theorem, whose proof we omit here. It states that linear forms are hard-core for any one-way function. In fact, the tighter form [7, Proof of Theorem 3] (using a fast Walsh transform) can simply be followed word-for-word.
This result (which only holds for $\mathbb{F}_2$), together with Prop. 2, shows that any MQ family of maps induces PRNGs over $\mathbb{F}_2$. To get a useful stream cipher, we can combine Props. 1–3:
Proposition 4 ([25], [7]). Suppose $S = (P, Q)$ is an instance drawn from MQ(2, n, n + r), where $P : \mathbb{F}_2^n \to \mathbb{F}_2^r$ and $Q : \mathbb{F}_2^n \to \mathbb{F}_2^n$ form the stream cipher as in Prop. 1. If we can distinguish $\lambda$ output blocks of the stream cipher from a truly random distribution in time $T$, then we can find $b$ from $S(b)$, where $b$ is a randomly chosen input, with probability at least $\frac{\varepsilon}{8\lambda}$ in time

$$T' = \frac{2^7 n^2 \lambda^2}{\varepsilon^2}\left(T + (\lambda + 2)\, T_S + \log\frac{2^7 n \lambda^2}{\varepsilon^2} + 2\right) + \frac{2^7 n \lambda^2}{\varepsilon^2}\, T_S. \qquad (1)$$

Note: Roughly this means that if we let $r = n$, want to establish a safety level of $2^{80}$ multiplications, want $L = \lambda r = 2^{40}$ bits between key refreshes, and can accept $\varepsilon = 10^{-2}$, then $T' \approx 2^{230}/n$. All we need now is to find a map from $\mathbb{F}_2^n \to \mathbb{F}_2^{2n}$ which takes this amount of time to invert.
As we see below, unless equation-solving improves greatly for sparse systems, this implies that a handful of cubic terms added to a QUAD system with $n = r = 208$, $q = 2$ can be deemed secure to $2^{80}$. There is no sense in going any lower than that, because solving a system with $n$ bit-variables can never take much more effort than $2^n$ times whatever time it takes to evaluate one equation.

3 PRNG Based on SMP in Fq


In Proposition 3, [7] transformed the problem into a variation of the Goldreich-Levin theorem (in $\mathbb{F}_2$). The transformation still works in $\mathbb{F}_q$; however, the Goldreich-Levin theorem itself gets stuck at this point. Here we show a way to extend the main results to $\mathbb{F}_q$ by using a generalization of the Goldreich-Levin hard-core bit theorem.

Proposition 5 ([25] and contribution of this paper). Let $K = \mathbb{F}_q$. Suppose there is an algorithm $B$ that, given a system $S : K^n \to K^m$ from SMP(n, m), a random $K^n \to K$ linear form $R$, and the image $S(b)$ of a randomly chosen unknown $b$, predicts $R(b)$ with probability at least $\frac{1}{q} + \varepsilon$ over all possible inputs $(S, S(b), R)$ using time $T$. Then there is an algorithm $C$ that, given $S$ and the image $S(b)$ of a randomly chosen vector $b \in K^n$, produces a preimage of $S(b)$ with probability (over all $b$ and $S$) at least $\varepsilon/2$ in time

$$T' \leq 2^{10}\, \frac{nq}{\varepsilon^5} \log^2\!\left(\frac{n}{\varepsilon}\right) T + \left(1 - \frac{1}{q}\right)^{2} \varepsilon^{-2}\, T_S.$$

Remark [intuition for why the $\mathbb{F}_2$ argument cannot be applied in $\mathbb{F}_q$]: If we know that one of two exclusive possibilities takes place with probability strictly larger than 50%, then the other one must happen strictly less often than 50%. If we know that one of $q$ possibilities takes place with probability strictly greater than $1/q$, we cannot be sure that another possibility does not occur with even higher probability. Therefore, we can only treat this as a case of learning a linear functional with queries to a highly noisy oracle. Due to this difference, the order of $\varepsilon$ in $T'/T$ is as high as $\varepsilon^{-5}$ in Prop. 5, but only $\varepsilon^{-2}$ in Prop. 3.

Proposition 6 ([25] and contribution of this paper). Suppose $S = (P, Q)$ is an instance drawn from SMP(n, n + r), where $P : \mathbb{F}_q^n \to \mathbb{F}_q^r$ and $Q : \mathbb{F}_q^n \to \mathbb{F}_q^n$ form the stream cipher as in Prop. 1. If we can distinguish between $\lambda$ output blocks of the stream cipher and the uniform distribution in time $T$, then we can invert $S(b)$ with probability at least $\frac{\varepsilon}{4q\lambda}$ in time

$$T' = 2^{15}\, \frac{n q^6 \lambda^5}{\varepsilon^5} \log^2\!\left(\frac{2qn\lambda}{\varepsilon}\right) \left(T + (\lambda + 2)\, T_S\right) + \left(1 - \frac{1}{q}\right)^{2} \frac{4q^2\lambda^2}{\varepsilon^2}\, T_S. \qquad (2)$$

This is a straightforward combination of Props. 1, 2, and 5. In the remainder of this section, we give a proof of Prop. 5 by a variation of the procedure used by Goldreich-Rubinfeld-Sudan [21, Secs. 2 and 4], in order to obtain the concrete values from which we can derive security proofs.

3.1 Conversion to a Modern (IV-Dependent Stream) Cipher

This paper mostly deals with the security of PRNGs, which are essentially old-fashioned stream ciphers. If we have a secure PRNG $S' = (s_0, s_1)$, where both $s_0$ and $s_1$ are maps from $K^n \to K^n$ (note that $S'$ can be identical to $S$), then the following is a standard way to derive the initial state $x_0 \in K^n$ from the key bitstream $c = (c_1, c_2, \ldots, c_{KL}) \in \{0,1\}^{KL}$ and an initial $u \in K^n$, where $KL$ is the length of the key:

$$x_0 := s_{c_{KL}}(s_{c_{KL-1}}(\cdots (s_{c_2}(s_{c_1}(u))) \cdots)).$$

This is known as the tree-based construction. From an old-fashioned provably secure stream cipher (i.e., one whose key is the initial state), the above construction achieves security in the resulting IV-dependent stream cipher, at the cost of some additional looseness in the security proof. A recent example of this is [6].
Thus, all our work really applies to the modern type of stream cipher requiring an IV-dependent setup, except that the security parameters may be slightly different.
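The tree-based derivation is a simple fold over the key bits. A toy sketch (our own illustration; the maps $s_0, s_1$ below are arbitrary affine stand-ins on $\mathbb{Z}_{256}$, not secure PRNG halves):

```python
def tree_setup(s0, s1, key_bits, u):
    """x0 = s_{c_KL}( ... s_{c_2}(s_{c_1}(u)) ... ): walk the key bits
    through the two PRNG halves, innermost (first) bit applied first."""
    x = u
    for c in key_bits:
        x = (s1 if c else s0)(x)
    return x

# Toy instantiation (arbitrary maps, NOT secure PRNG halves):
s0 = lambda x: (x * 131 + 7) % 256
s1 = lambda x: (x * 197 + 11) % 256
print(tree_setup(s0, s1, [1, 0, 1, 1], u=42))   # → 240
```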

3.2 Hardcore Predicate and Learning Polynomials

Let $x = (x_1, x_2, \ldots, x_n)$ and $b = (b_1, b_2, \ldots, b_n)$, where the $x_i, b_i$ are elements of a finite field $K = \mathbb{F}_q$. Given an arbitrary strong one-way function $h(x)$, the function $F(x, b) = (h(x), b)$ is also one-way. We claim that $x \cdot b$ is the hard-core bit of $F(x, b)$, where $x \cdot b$ denotes their inner product.
Suppose we have a predictor $P$ which predicts the hard-core bit $x \cdot b$ given $(h(x), b)$ with probability more than $\frac{1}{q} + \varepsilon$; formally:

$$\Pr_{b,x}[P(h(x), b) = x \cdot b] > \frac{1}{q} + \varepsilon.$$

By Markov's inequality, there must be more than an $\varepsilon/2$ fraction of $x$ such that $\Pr_b[P(h(x), b) = x \cdot b] > \frac{1}{q} + \frac{\varepsilon}{2}$. For this fraction of $x$, we try to find the inverse of $h(x)$ (and of $F(x)$ as well) through the predictor. As $x \cdot b$ can be written as $\sum_i b_i x_i$, we have

$$\Pr_b\Big[P(h(x), b) = \sum_i b_i x_i\Big] > \frac{1}{q} + \frac{\varepsilon}{2}.$$

This means that if we can find a polynomial which almost matches an arbitrary function $P$ (a predictor function), then we can eventually invert $x$ from $F(x)$ a non-negligible portion of the time. Now we try to reconstruct such linear polynomials through access to the predictor, largely following the footsteps of [21].
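The Markov step above can be sanity-checked via its contrapositive: if fewer than an $\varepsilon/2$ fraction of $x$ had per-$x$ success probability at least $1/q + \varepsilon/2$, the mean over $x$ would fall strictly below $1/q + \varepsilon$. A small check of this arithmetic (our own illustration):

```python
def mean_upper_bound(q, eps):
    """If fewer than an eps/2 fraction of x have per-x success probability
    >= 1/q + eps/2, then the mean over x is at most
    (eps/2)*1 + (1 - eps/2)*(1/q + eps/2), which is strictly below
    1/q + eps -- contradicting the premise. Returns that upper bound."""
    return (eps / 2) * 1.0 + (1 - eps / 2) * (1.0 / q + eps / 2)

for q in (2, 3, 5, 16):
    for eps in (0.01, 0.1, 0.3):
        assert mean_upper_bound(q, eps) < 1.0 / q + eps
print("contrapositive arithmetic confirms the eps/2 fraction claim")
```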

3.3 Intuition of Reconstructing Linear Polynomials

We are now given oracle access to a function $f : K^n \to K$, where $K$ is a finite field and $|K| = q$. We need to find all linear polynomials which match $f$ on at least a $\frac{1}{q} + \varepsilon$ fraction of inputs $x$. Let $p(x_1, x_2, \ldots, x_n) = \sum_{i=1}^{n} p_i x_i$, and let the $i$-th prefix of $p$ be $\sum_{j=1}^{i} p_j x_j$. The algorithm runs $n$ rounds; in the $i$-th round, it extends all candidates surviving the $(i-1)$-th round by all elements of $K$ and screens them, filtering out most bad prefixes. The pseudocode of the algorithm is presented in Algorithm 2. Since we want the algorithm to be efficient, we must efficiently screen possible prefixes among all extensions. We now introduce a screening algorithm called TestPrefix.

Algorithm 1. TestPrefix(f, ε, n, (c_1, c_2, ..., c_i)) [21]

Repeat poly_1(n/ε) times:
    Pick s = (s_{i+1}, ..., s_n) ∈_R GF(q)^{n-i}
    Let t = poly_2(n/ε)
    for k = 1 to t do
        Pick r = (r_1, r_2, ..., r_i) ∈_R GF(q)^i
        σ^{(k)} = f(r, s) − Σ_{j=1}^{i} c_j r_j
    end for
    If there is a value σ with σ^{(k)} = σ for at least a 1/q + ε/3 fraction of the k's, then output accept and halt
end Repeat
If all iterations were completed without accepting, then reject

Algorithm 2. Find All Polynomials(f, ε) [21]

Set up a candidate queue Q[i] which stores all candidates (c_1, c_2, c_3, ..., c_i) in the i-th round
for i = 1 to n do
    For each element (c_1, c_2, ..., c_i) of Q[i]:
        Run TestPrefix(f, ε, n, (c_1, c_2, ..., c_i, α)) for all α ∈ F
        If TestPrefix accepts, then push (c_1, c_2, ..., c_i, α) into Q[i+1], i.e., it is a candidate in the (i+1)-th round
end for

Suppose we are testing the $i$-th prefix $(c_1, c_2, \ldots, c_i)$. We are going to evaluate the quantity

$$P_s(\sigma) := \Pr_{r_1, r_2, \ldots, r_i \in K}\Big[f(r, s) = \sum_{j=1}^{i} c_j r_j + \sigma\Big],$$

where $r = (r_1, r_2, \ldots, r_i)$. The value of $\sigma$ can be thought of as a guess of $\sum_{j=i+1}^{n} p_j s_j$. For every $s$, we can estimate the probability from a sample of several $r$'s, and the error rate can be controlled by the number of samples. If some $s$ makes the probability significantly larger than $1/q$, then we accept. If no such $s$ exists, we reject. The detailed algorithm is stated in Algorithm 1: TestPrefix.
If a candidate (c1, c2, ..., ci) passes Algorithm 1 for at least one suffix s, there is a σ such that the estimate of Ps(σ) is greater than 1/q + ε/3. For a correct candidate (c1, c2, ..., ci), i.e. a prefix of some p = (p1, p2, ..., pn) which matches f on at least a 1/q + ε fraction of inputs, the choice σ = Σ_{j=i+1}^{n} pj sj satisfies Es[Ps(σ)] ≥ 1/q + ε. By Markov's inequality, for at least an ε/2 fraction of the s and the corresponding σ, it holds that Ps(σ) ≥ 1/q + ε/2. In Algorithm 1 we set 1/q + ε/3 as the passing criterion; thus a correct candidate will pass Algorithm 1 with high probability. However, [21, Sec. 4] shows that the total number of candidates passing in each round is limited: the maximum number of prefixes that pass the test (also given by [21, Sec. 4]) is ≤ (1 − 1/q) ε^{−2}.
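The Markov-type step above can be spelled out in one line (a routine argument, included here for completeness): writing σ(s) := Σ_{j>i} pj sj and A := {s : Ps(σ(s)) ≥ 1/q + ε/2}, and using Ps ≤ 1,

```latex
\frac1q + \varepsilon
  \;\le\; \mathbb{E}_s\bigl[P_s(\sigma(s))\bigr]
  \;\le\; \Pr[A]\cdot 1 + \bigl(1-\Pr[A]\bigr)\Bigl(\frac1q+\frac{\varepsilon}{2}\Bigr)
  \;\le\; \Pr[A] + \frac1q + \frac{\varepsilon}{2},
\qquad\text{hence}\quad \Pr[A] \;\ge\; \frac{\varepsilon}{2}.
```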
Secure PRNGs from Specialized Polynomial Maps over Any Fq 191

3.4 Giving Concrete Values to “Order of Polynomially Many”


Since there are /2 fraction of suffix s such that Ps (σ) ≥ 1q + 2 , we can randomly
choose the suffix polynomially many times (k1 times) to ensure that we would
select such s with high probability. Also, for such s, if we choose polynomially
many times (k2 times) of r, there would be high probability that we would find
some α for at least 1q + 3 fraction. We are estimating how the polynomially many
should be as the following:

Pr[ TestPrefix fails ] ≤ Pr[ no such s is chosen ] + Pr[ no single value occurs for at least a 1/q + ε/3 fraction ]

Pr[ no such s is chosen ] ≤ (1 − ε/2)^{k1} ≤ e^{−k1 ε/2} ≤ ε / ( 2 (1 − 1/q) ε^{−2} n q )

So we take k1 = O((1/ε) log(n/ε)) ≈ (3/ε) log(n/ε). On the other hand, we want to estimate the probability that no σ occurs with frequency at least 1/q + ε/3. For a correct suffix s we know that, for uniform r, we obtain the correct σ with probability at least 1/q + ε/2. Let Xk be the random variable with value 1 if the k-th trial of r yields the correct σ, and 0 otherwise; then Pr[Xk = 1] ≥ 1/q + ε/2. Suppose we do k2 trials:
Pr[ no single value occurs for at least a 1/q + ε/3 fraction ] ≤ Pr[ Σ_{k=1}^{k2} Xk < (1/q + ε/3) k2 ]
    ≤ Pr[ | (Σ_{k=1}^{k2} Xk)/k2 − (1/q + ε/2) | ≥ ε/6 ],

and since the Xk are independent, Chernoff's bound gives

Pr[ | (Σ_{k=1}^{k2} Xk)/k2 − (1/q + ε/2) | ≥ ε/6 ] ≤ 2 e^{−k2 ε²/72} ≤ ε / ( 2 (1 − 1/q) ε^{−2} n q ).
Taking k2 = O( log(n/ε) / ε² ) ≈ 216 log(n/ε) / ε² is sufficient to make the inequality hold. Thus we have

Pr[ TestPrefix fails ] ≤ ε / ( (1 − 1/q) ε^{−2} n q ).

By a union bound over all (at most (1 − 1/q) ε^{−2} n q) calls to TestPrefix,

Pr[ Algorithm 2 fails ] ≤ Σ_{calls} Pr[ TestPrefix fails ] ≤ (1 − 1/q) ε^{−2} n q · ε / ( (1 − 1/q) ε^{−2} n q ) = ε.

Therefore, the algorithm works with high probability. The worst-case running time of Algorithm 2 is k1 k2 (1 − 1/q) ε^{−2} n q = O( (nq/ε⁵) log²(n/ε) ) ≲ 2^{10} (nq/ε⁵) log²(n/ε).
Note: (1 − 1/q) ε^{−2} is the maximum number of candidates which pass in each round.

4 On SMP under Generic Solvers

To verify that SMP maps have the one-way property, we need to show that

1. Generic system-solvers do not run substantially faster on them; and


2. There are no specialized solvers that can take advantage of the sparsity.

Here “generic” means the ability to handle any multivariate polynomial system with n variables and m equations over Fq. There are two well-known types of generic methods for solving polynomial systems, both related to the original Buchberger's algorithm. One is Faugère's F4-F5 and the other is the XL derivatives. In the former, sparsity is quickly lost, and tests show that there is little difference in timing when solving SMP instances. With recent versions of XL [33], the sparsity results in a proportional decrease in complexity. The effect of sparsity on such generic methods should be predictable and not very drastic, as shown by some testing (cf. Sec. 4.1). We briefly describe what is known about XL and F4-F5 in Appendix B.

4.1 Testing the One-Wayness with Generic Solvers

We conducted numerous tests on SMP maps of various degrees and sparsities over the fields F2, F16, and F256. For example, Table 1 lists our tests in solving random MQ(256, n, m) instances where each polynomial has only n quadratic terms [we call these instances SMQ(256, n, m, n)] with F4 over GF(256). It takes almost the same time as solving an MQ instance of the same size.
For XL variants that use sparse solvers as the last step [33], test results (one of which is shown in Table 2) confirm the natural guess: for SMP instances where the number of non-linear terms is not overly small, the solution degree of XL is unchanged, and the running time decreases nearly in direct proportion to the number of terms (in Tab. 2, the ratio should be close to n/4).

Table 1. SMQ(256, n, m, n) timing (sec): MAGMA 2.12, 2GB RAM, Athlon64x2 2.2GHz

m − n   DXL           Dreg                  n = 9   n = 10   n = 11   n = 12    n = 13
0       2m            m                     6.03    46.69    350.38   3322.21   sigmem
1       m             ⌈(m+1)/2⌉             1.19    8.91     53.64    413.34    2535.32
2       ⌈(m+1)/2⌉     ⌈(m+2−√(m+2))/2⌉      0.31    2.20     12.40    88.09     436.10

Table 2. XL/Wiedemann timing (sec) on Core2Quad 2.4GHz, icc, 4-thread OpenMP, 8GB RAM

n                      7           8          9          10         11         12         13
D                      5           6          6          7          7          8          8
SMQ(256, n, n+2, n)    9.34·10⁻²   1.17·10⁰   4.04·10⁰   6.02·10¹   1.51·10²   2.34·10³   5.97·10³
MQ(256, n, n+2)        2.06·10⁻¹   2.92·10⁰   1.10·10¹   1.81·10²   4.94·10²   8.20·10³   2.22·10⁴
ratio                  2.20        2.49       2.73       3.00       3.27       3.50       3.72

For F2, there are many special optimizations for F4 in MAGMA, so we ran tests at various densities of quadratic terms in versions 2.12-20 and 2.13-8. Typical results are given in Fig. 1. Most of the time the data points are close to each other; in some tests they overlap so closely that no difference in timing is visible in the diagram.

4.2 A Brief Discussion on Specialization and Security


Since generic system-solvers show no unexpected improvement on our specializations, it remains for us to check that there are no other big improvements in solving specialized systems. We list below what we know of recent attempts at solving or attacking specialized systems in cryptography, and show that our results are consistent with these new results and somewhat complement them.
– Aumasson-Meier [1] presented several ideas to attack primitives built on sparse polynomial systems, which we sketch separately in Sec. 4.3 below.
– Raddum-Semaev [27,28] attack what they term “sparse” systems, where each equation depends on a small number of variables. Essentially, the authors state that for systems of equations in n bit variables such that each equation depends on only k variables, the system can be solved in time roughly proportional to 2^{(1−1/k)n} with a relatively small memory footprint. Since XL for cubics and higher degrees over F2 is more time-consuming than brute force, this is fairly impressive. However, the “sparsity” defined by the authors is closer to “input locality” and very different from what people usually denote by this term. The attack is hence not applicable to SMP-based stream ciphers.
In a similar vein is the purported XSL attack on AES [13]. While the
S was supposed to stand for Sparse, it really requires Structure – i.e., each
equation depending on very few variables. So, whether that attack actually
works or not, it does not apply to SMP-based systems.
– Bard-Courtois-Jefferson [2] use SAT solvers on uniformly sparse F2 equations and give experimental numbers. According to the authors, the method takes up much less memory than F4 or its derivatives, but is slower than these traditional methods when the latter have enough memory.
Some numbers for very overdefined and very sparse systems show that converting to conjunctive normal form and then running a SAT solver can give good results. This seems to be a very intriguing approach, but so far there is no theoretical analysis, especially for the case where the number of equations is a few times the number of variables, which is the case for SMP constructions.

4.3 Solutions and Collisions in Sparse Polynomial Systems

Aumasson-Meier recently published [1] some quite interesting ideas on finding solutions or collisions for primitives using sparse polynomial systems (e.g., the hashes proposed in [15]).
They showed that using sparse polynomial systems of uniform density (in every degree) for Merkle-Damgård compression will not be universally collision-free, and that some underdefined systems that are sparse in the higher degrees can be solved with lower complexity. Their results do not apply to overdetermined systems in general. We summarize the relevant results below.

1. Overdetermined higher-degree maps that are sparse of uniform density, or at least sparse in the linear terms, are shown to have a high probability of trivial collisions and near-collisions.
It seems that everyone agrees that the linear terms should be totally random when constructing sparse polynomial systems for symmetric primitives.
2. Suppose we have an underdetermined higher-degree map that is sparse in the non-affine part, i.e.,

P : F2^{n+r} → F2^n,   P(x) = b + M x + Q(x),

where Q has only quadratic or higher terms and is sparse. Aumasson-Meier suggest that we can find P^{−1}(y) as follows: find a basis for the kernel space of the augmented matrix [M ; b + y]. Collect these basis vectors into an (n + r + 1) × (r + 1) matrix M′, viewed as a linear code. For an arbitrary w ∈ F2^{r+1}, the codeword x̄ = M′ w represents a solution to y = M x + b if its last component is 1. Use known methods to find relatively low-weight codewords of the code M′ and substitute them into Q(x), expecting it to vanish with non-negligible probability.
Aumasson-Meier propose to apply this to collisions in Merkle-Damgård hashes with cubic compression functions. It does not work for fields other than F2 or for overdetermined systems. Its exact complexity is unknown and requires further work.
3. Conversely, it has been suggested that if we have an overdetermined higher-degree map

P : F2^n → F2^{n+r},   P(x) = b + M x + Q(x),

where Q has only quadratic or higher terms and is extremely sparse, we can consider P(x) = y as M x = (y + b) + perturbation, and use known methods for decoding attacks, i.e., solving overdetermined linear equations with perturbation. However, SMP maps with a moderate number of quadratic terms will be intractable.
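To make item 2 above concrete, here is a toy GF(2) sketch of the kernel-plus-low-weight-codeword idea (our own illustrative code on a tiny hand-made instance; it enumerates the whole code instead of running a real low-weight-codeword search, and all names are ours):

```python
def gf2_nullspace_basis(rows, ncols):
    """Basis (as column bitmasks) of {v : A v = 0} over GF(2); rows[i] is
    the bitmask of row i of A (bit c set <=> A[i][c] = 1)."""
    pivots = {}                                  # pivot column -> reduced row
    for row in rows:
        for c in range(ncols):
            if not (row >> c) & 1:
                continue
            if c in pivots:
                row ^= pivots[c]                 # eliminate known pivot
            else:
                pivots[c] = row
                break
    basis = []
    for f in (c for c in range(ncols) if c not in pivots):
        v = 1 << f                               # set one free variable
        for c in sorted(pivots, reverse=True):   # back-substitute pivots
            if bin(pivots[c] & v).count("1") & 1:
                v ^= 1 << c
        basis.append(v)
    return basis

def gf2_matvec(rows, x):
    return sum((bin(r & x).count("1") & 1) << i for i, r in enumerate(rows))

def sparse_preimage(M_rows, b, Q, y, n_in):
    """Find x with M x = y + b and Q(x) = 0, i.e. a preimage of y under
    P(x) = b + M x + Q(x): scan codewords of the kernel of [M | b + y]
    in order of increasing weight, hoping the sparse Q vanishes."""
    last = n_in                                  # index of the augmented column
    aug = [row | ((((b ^ y) >> i) & 1) << last) for i, row in enumerate(M_rows)]
    basis = gf2_nullspace_basis(aug, last + 1)
    candidates = []
    for mask in range(1, 1 << len(basis)):       # whole code; toy sizes only
        v = 0
        for j, bv in enumerate(basis):
            if (mask >> j) & 1:
                v ^= bv
        if (v >> last) & 1:                      # last component must be 1
            x = v & ((1 << last) - 1)
            candidates.append((bin(x).count("1"), x))
    for _, x in sorted(candidates):              # low weight first
        if Q(x) == 0:
            return x
    return None

# Tiny instance: 6 input bits, 4 output bits, one sparse quadratic term.
M = [0b000011, 0b000110, 0b011000, 0b110000]
b = 0
Q = lambda x: (x & 1) & ((x >> 2) & 1)           # Q_0 = x0*x2, all other Q_i = 0
P = lambda x: b ^ gf2_matvec(M, x) ^ Q(x)
y = P(0b000010)                                  # image of a point with Q = 0
x = sparse_preimage(M, b, Q, y, n_in=6)
```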

We note that other specialized polynomials that are also easier to evaluate can be constructed, such as the SRQ construction (cf. Appendix C), to which the same arguments as for SMP carry through; hence our approach is more general than it looks.

5 Summary of Uses for Specialized Polynomial Systems


All information seems to point to the conclusion that we should always use totally random linear terms, no matter what else we do. With that taken into account, specialized random systems (such as SMP) represent improvements over generic systems in terms of storage and (likely) speed.

5.1 The Secure Stream Ciphers SPELT


We build a stream cipher called SPELT(q, d, n, r, (η2, ..., ηd)), which resembles the construction in Section 2:
We specify a prime power q (usually a power of 2), positive integers n and r, and a degree d. For i ∈ {0, 1} we have “update functions” Qi = (Qi,1, Qi,2, ..., Qi,n) : Fq^n → Fq^n and “output filters” Pi = (Pi,1, Pi,2, ..., Pi,r) : Fq^n → Fq^r. We still iterate yt = P(xt) [output]; xt+1 = Q(xt) [transition], choosing between the two systems according to the initial vector. To repeat, every polynomial here is of degree d. The affine (constant and linear) terms and coefficients are still uniformly random, but the terms of each higher degree are selected with different densities, such that the degree-i terms are sparse to the point of having only ηi terms. The difference between Eq. 1 and Eq. 2, which governs the maximum provable security levels we can get, affects our parameter choices quite a bit, as seen below.
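The loop just described can be sketched as follows (a toy model of ours, not the real implementation: it uses a single update/filter pair, no IV schedule, and a prime field so that integer arithmetic mod q is field arithmetic, whereas the proposed instances use q = 16 or q = 2):

```python
import random

def random_sparse_poly(n, q, etas, rng):
    """One SPELT-style polynomial: fully random affine part, plus only
    eta_i randomly placed degree-i terms for each i >= 2 (etas[i] = eta_i)."""
    const = rng.randrange(q)
    linear = [rng.randrange(q) for _ in range(n)]        # dense and random
    sparse = [(rng.randrange(1, q), tuple(rng.randrange(n) for _ in range(deg)))
              for deg, eta in etas.items() for _ in range(eta)]
    return const, linear, sparse

def eval_poly(poly, x, q):
    const, linear, sparse = poly
    v = const + sum(c * xi for c, xi in zip(linear, x))
    for coeff, idx in sparse:                            # the few sparse terms
        t = coeff
        for j in idx:
            t *= x[j]
        v += t
    return v % q

def spelt_keystream(n, r, q, etas, key, rounds, seed=0):
    """y_t = P(x_t) (output), x_{t+1} = Q(x_t) (transition)."""
    rng = random.Random(seed)                 # the public system S = (P, Q)
    Qsys = [random_sparse_poly(n, q, etas, rng) for _ in range(n)]
    Psys = [random_sparse_poly(n, q, etas, rng) for _ in range(r)]
    x, out = list(key), []
    for _ in range(rounds):
        out.extend(eval_poly(p, x, q) for p in Psys)
        x = [eval_poly(p, x, q) for p in Qsys]
    return out

ks = spelt_keystream(n=8, r=8, q=251, etas={2: 4, 3: 3}, key=range(1, 9), rounds=5)
```

The point of the sparsity is visible in eval_poly: the cost per polynomial is one dense affine evaluation plus only η2 + η3 + ... short products, instead of the Θ(n^d) terms of a fully random degree-d polynomial.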
By Eq. 2, if L = λ n lg q is the desired keystream length, the looseness factor T′/T is roughly

( 2^{15} q^6 (L/ε)^5 / ( n^4 lg^5 q ) ) · lg²( 2qL / (ε lg q) ).
If we let q = 16, r = n, want a safety level of T = 2^80 multiplications and L = 2^40 bits between key refreshes, and can accept ε = 10^{−2}, then T′ ≈ 2^{354}/n^4. We propose the following instances:
– SPELT using q = 16, d = 3 (cubics), n = r = 160, with 20 quadratic and 15 cubic terms per equation. The projected XL degree is 54, and the storage requirement is 2184 bytes. T′ is about 2^{346} multiplications, which guarantees ≈ 2^{88} multiplications of security. This runs at 6875 cycles/byte.
– SPELT using d = 4 (quartics), n = r = 108, with 20 quadratic, 15 cubic, and 10 quartic terms per equation. The projected XL degree is 65, and the storage requirement is 2174 bytes. T′ is about 2^{339} multiplications, guaranteeing ≈ 2^{81} multiplications of security at a preliminary 5541 cycles/byte.
– SPELT using q = 2, n = r = 208, d = 3 (cubics), with 20 cubic terms per equation. Preliminary tests achieve 11744 cycles/byte. The expected complexity of solving 208 variables and 416 equations is ∼ 2^{224} (by brute-force trials, which is much faster than XL here), which translates to a 2^{82} proven security level.

5.2 Comparisons: A Case for SPELT


All modern-day microprocessors are capable of at least 64-bit arithmetic, and there is a natural way to implement QUAD that runs very fast over F2, limited only by the ability to stream data. However, as the number of variables goes up, the storage needed for QUAD grows cubically, and for parameter choices that are secure, the dataset overflows even the massive caches of an Intel Core 2. That is what slows down QUAD(2, 320, 320) — tests on a borrowed ia64 server show that it is almost exactly the same speed as SPELT(2, 3, 208, 208, [480, 20]).
Looking at the numbers, it seems that the idea of specialized polynomials is a good complement to the approach of using polynomial maps for symmetric primitives introduced by Berbain-Gilbert-Patarin.

Table 3. Point-by-Point, SPELT vs. QUAD on a K8 or C2

Stream Cipher Block Storage Cycles/Byte Security Level


SPELT (2,3,208,208,[480,20]) 208b 0.43 MB 11744 282 Proven
SPELT (16,4,108,108,[20,15,10]) 864b 48 kB 5541 280 Proven
QUAD (2,320,320) 320b 3.92 MB 13646 282 Proven
QUAD (2,160,160) 160b 0.98 MB 2081 2140 Best Attack
SPELT (16,4,32,32,[10,8,5]) 128b 8.6 kB 1244 2152 Best Attack

We hasten to add that our programming is quite primitive, and may not match
the more polished implementations (e.g., [5]). We are still working to improve
our programming and parameter choices. Also, in hardware implementations,
the power of sparsity should be even more pronounced.

5.3 For Possible Use in Hash Functions


In [8] Billet et al. propose to use a two-stage construction with a random 192-bit to 464-bit expanding quadratic map followed by a 464-bit to 384-bit quadratic contraction. They show that, in general, a PRNG followed by a one-way compression function is a one-way function.
In [15] the same construction is proposed, but with SRQ quadratics (see Appendix C) and no proof. We now have the abovementioned results from [8] and Prop. 6, which justify the design up to a point. This is an area that still requires some study, and perhaps extra ideas, such as a hybrid construction with a sparse-polynomial expansion stage and a different kind of contraction stage.

References
1. Aumasson, J.-P., Meier, W.: Analysis of multivariate hash functions. In: Nam, K.-
H., Rhee, G. (eds.) ICISC 2007. LNCS, vol. 4817, pp. 309–323. Springer, Heidelberg
(2007)

2. Bard, G.V., Courtois, N.T., Jefferson, C.: Efficient methods for conversion and
solution of sparse systems of low-degree multivariate polynomials over gf(2) via
sat-solvers. Cryptology ePrint Archive, Report 2007/024 (2007),
[Link]
3. Bardet, M., Faugère, J.-C., Salvy, B.: On the complexity of Gröbner basis compu-
tation of semi-regular overdetermined algebraic equations. In: Proceedings of the
International Conference on Polynomial System Solving, pp. 71–74 (2004) (Previ-
ously INRIA report RR-5049)
4. Bardet, M., Faugère, J.-C., Salvy, B., Yang, B.-Y.: Asymptotic expansion of the
degree of regularity for semi-regular systems of equations. In: Gianni, P. (ed.)
MEGA 2005 Sardinia (Italy) (2005)
5. Berbain, C., Billet, O., Gilbert, H.: Efficient implementations of multivariate
quadratic systems. In: Biham, E., Youssef, A.M. (eds.) SAC 2006. LNCS, vol. 4356,
pp. 174–187. Springer, Heidelberg (2007)
6. Berbain, C., Gilbert, H.: On the security of IV dependent stream ciphers. In:
Biryukov, A. (ed.) FSE 2007. LNCS, vol. 4593, pp. 254–273. Springer, Heidelberg
(2007)
7. Berbain, C., Gilbert, H., Patarin, J.: QUAD: A practical stream cipher with prov-
able security. In: Vaudenay, S. (ed.) EUROCRYPT 2006. LNCS, vol. 4004, pp.
109–128. Springer, Heidelberg (2006)
8. Billet, O., Robshaw, M.J.B., Peyrin, T.: On building hash functions from multivari-
ate quadratic equations. In: Pieprzyk, J., Ghodosi, H., Dawson, E. (eds.) ACISP
2007. LNCS, vol. 4586, pp. 82–95. Springer, Heidelberg (2007)
9. Biryukov, A. (ed.): FSE 2007. LNCS, vol. 4593. Springer, Heidelberg (2007)
10. Blum, L., Blum, M., Shub, M.: Comparison of two pseudo-random number gener-
ators. In: Rivest, R.L., Sherman, A., Chaum, D. (eds.) CRYPTO 1982, pp. 61–78.
Plenum Press, New York (1983)
11. Buchberger, B.: Ein Algorithmus zum Auffinden der Basiselemente des Restklassen-
ringes nach einem nulldimensionalen Polynomideal. PhD thesis, Innsbruck (1965)
12. Courtois, N.T., Klimov, A., Patarin, J., Shamir, A.: Efficient algorithms for solving
overdefined systems of multivariate polynomial equations. In: Preneel, B. (ed.)
EUROCRYPT 2000. LNCS, vol. 1807, pp. 392–407. Springer, Heidelberg (2000),
[Link]
13. Courtois, N.T., Pieprzyk, J.: Cryptanalysis of block ciphers with overdefined sys-
tems of equations. In: Zheng, Y. (ed.) ASIACRYPT 2002. LNCS, vol. 2501, pp.
267–287. Springer, Heidelberg (2002)
14. Diem, C.: The XL-algorithm and a conjecture from commutative algebra. In: Lee,
P.J. (ed.) ASIACRYPT 2004. LNCS, vol. 3329. Springer, Heidelberg (2004)
15. Ding, J., Yang, B.-Y.: Multivariate polynomials for hashing. In: Inscrypt. LNCS.
Springer, Heidelberg (2007), [Link]
16. Farashahi, R.R., Schoenmakers, B., Sidorenko, A.: Efficient pseudorandom genera-
tors based on the ddh assumption. In: Public Key Cryptography, pp. 426–441 (2007)
17. Faugère, J.-C.: A new efficient algorithm for computing Gröbner bases (F4 ). Journal
of Pure and Applied Algebra 139, 61–88 (1999)
18. Faugère, J.-C.: A new efficient algorithm for computing Gröbner bases without
reduction to zero (F5 ). In: International Symposium on Symbolic and Algebraic
Computation — ISSAC 2002, pp. 75–83. ACM Press, New York (2002)
19. Garey, M.R., Johnson, D.S.: Computers and Intractability — A Guide to the The-
ory of NP-Completeness. W.H. Freeman and Company, New York (1979)
20. Gennaro, R.: An improved pseudo-random generator based on the discrete loga-
rithm problem. Journal of Cryptology 18, 91–110 (2000)

21. Goldreich, O., Rubinfeld, R., Sudan, M.: Learning polynomials with queries: The
highly noisy case. SIAM Journal on Discrete Mathematics 13(4), 535–570 (2000)
22. Jiang, S.: Efficient primitives from exponentiation in zp . In: Batten, L.M., Safavi-
Naini, R. (eds.) ACISP 2006. LNCS, vol. 4058, pp. 259–270. Springer, Heidelberg
(2006)
23. Koblitz, N., Menezes, A.: Another look at provable security (part 2). In: Barua,
R., Lange, T. (eds.) INDOCRYPT 2006. LNCS, vol. 4329, pp. 148–175. Springer,
Heidelberg (2006)
24. Lazard, D.: Gröbner-bases, Gaussian elimination and resolution of systems of al-
gebraic equations. In: van Hulzen, J.A. (ed.) ISSAC 1983 and EUROCAL 1983.
LNCS, vol. 162, pp. 146–156. Springer, Heidelberg (1983)
25. Levin, L., Goldreich, O.: A hard-core predicate for all one-way functions. In: John-
son, D.S. (ed.) 21st ACM Symposium on the Theory of Computing — STOC 1989,
pp. 25–32. ACM Press, New York (1989)
26. Matsumoto, T., Imai, H.: Public quadratic polynomial-tuples for efficient signature
verification and message-encryption. In: Günther, C.G. (ed.) EUROCRYPT 1988.
LNCS, vol. 330, pp. 419–545. Springer, Heidelberg (1988)
27. Raddum, H., Semaev, I.: New technique for solving sparse equation systems. Cryp-
tology ePrint Archive, Report 2006/475 (2006), [Link]
28. Semaev, I.: On solving sparse algebraic equations over finite fields (part ii). Cryp-
tology ePrint Archive, Report 2007/280 (2007), [Link]
29. Steinfeld, R., Pieprzyk, J., Wang, H.: On the provable security of an efficient rsa-
based pseudorandom generator. In: Lai, X., Chen, K. (eds.) ASIACRYPT 2006.
LNCS, vol. 4284, pp. 194–209. Springer, Heidelberg (2006)
30. Wolf, C.: Multivariate Quadratic Polynomials in Public Key Cryptography. PhD
thesis, Katholieke Universiteit Leuven (2005), [Link]
31. Yang, B.-Y., Chen, J.-M.: All in the XL family: Theory and practice. In: Park, C.-
s., Chee, S. (eds.) ICISC 2004. LNCS, vol. 3506, pp. 67–86. Springer, Heidelberg
(2005)
32. Yang, B.-Y., Chen, J.-M.: Theoretical analysis of XL over small fields. In: Wang, H.,
Pieprzyk, J., Varadharajan, V. (eds.) ACISP 2004. LNCS, vol. 3108, pp. 277–288.
Springer, Heidelberg (2004)
33. Yang, B.-Y., Chen, O.C.-H., Bernstein, D.J., Chen, J.-M.: Analysis of QUAD. In:
Biryukov [9], pp. 290–307

A Proof of Prop. 1
Proof. We introduce hybrid probability distributions Di(S) over K^L (L := λr): for 0 ≤ i ≤ λ, associate with Di(S) the random variable

ti(S, x) := (w1, w2, ..., wi, P(x), P(Q(x)), ..., P(Q^{λ−i−1}(x))),

where the wj ∈ K^r and x ∈ K^n are random, independent, uniformly distributed vectors, and we use the notational conventions that (w1, w2, ..., wi) is the null string if i = 0, and that (P(x), P(Q(x)), ..., P(Q^{λ−i−1}(x))) is the null string if i = λ. Consequently D0(S) is the distribution of the L-unit keystream and Dλ(S) is the uniform distribution over K^L. We denote by pi(S)

the probability that A accepts a random L-long sequence distributed according to Di(S), and by pi the mean value of pi(S) over the space of sparse polynomial systems S. We have supposed that algorithm A distinguishes between D0(S) and Dλ(S) with advantage ε, in other words that |p0 − pλ| ≥ ε.
Algorithm B works thus: on input (x1, x2) ∈ K^{n+r} with x1 ∈ K^r, x2 ∈ K^n, it selects an i with 0 ≤ i ≤ λ − 1 uniformly at random and constructs the L-long vector

t(S, x1, x2) := (w1, w2, ..., wi, x1, P(x2), P(Q(x2)), ..., P(Q^{λ−i−2}(x2))).

If (x1, x2) is distributed according to the output distribution of S, i.e. (x1, x2) = S(x) = (P(x), Q(x)) for a uniformly distributed value of x, then

t(S, x1, x2) = (w1, w2, ..., wi, P(x), P(Q(x)), ..., P(Q^{λ−i−1}(x))),

which is distributed according to Di(S). If instead (x1, x2) is distributed according to the uniform distribution, then t(S, x1, x2) is distributed according to Di+1(S). To distinguish the output of S from uniform, algorithm B calls A with input (S, t(S, x1, x2)) and returns the same return value. Hence

| Pr_{S,x}[ B(S, S(x)) = 1 ] − Pr_{S,x1,x2}[ B(S, (x1, x2)) = 1 ]
    = | (1/λ) Σ_{i=0}^{λ−1} pi − (1/λ) Σ_{i=1}^{λ} pi | = (1/λ) |p0 − pλ| ≥ ε/λ.

B XL and F4 -F5 Families for System-Solving


The XL and F4 -F5 families of algorithms are spiritual descendants of Lazard’s
idea [24]: run an elimination on an extended Macaulay matrix (i.e., extending
the resultant concept to many variables) as an improvement to Buchberger’s
algorithm for computing Gröbner bases [11].
Since we cannot discuss these methods in detail, we try to describe them
briefly along with their projected complexities. Again, suppose we have the sys-
tem P1 (x) = P2 (x) = · · · = Pm (x) = 0, where Pi is a degree-di polynomial in
x = (x1 , . . . , xn ), coefficients and variables in K = Fq .

Method XL [12]: Fix a degree D (≥ max_i deg Pi). The set of monomials of degree ≤ D is denoted T^{(D)}, and its cardinality is denoted T = |T^{(D)}|. We now take each equation Pi = 0 and multiply it by every monomial of degree up to D − di, to get equations of degree at most D. Collect all such equations in the set R = R^{(D)} := ∪_{i=1}^{m} { (u Pi = 0) : u ∈ T^{(D−di)} }. We treat every monomial in T^{(D)} as an independent variable and try to solve R as a linear system of equations.
The critical parameter is the difference between T and I = dim(span R), the rank of the space of equations R. If T − I = 0, the original system cannot be satisfied; if T − I = 1, then we should find a unique solution (with very high probability). Also, if T − I < min(D, q − 1), we can reduce to a univariate equation [12]. We would like to predict D0, the smallest D enabling resolution.
Note: for any pair of indices i, j ≤ m, among the linear combinations of the multiples of Pj = 0 is Pi Pj = 0, and among the linear combinations of the multiples of Pi = 0 is Pi Pj = 0 — i.e., one dependency in span R. In Fq, (Pi)^q = Pi, which generates a similar type of dependency.
Proposition 7 ([32]). Denote by [u]s the coefficient of the monomial u in the expansion of s; then:

1. T = [t^D] ( (1 − t^q)^n / (1 − t)^{n+1} ), which reduces to C(n+D, D) when q > D, and to Σ_{j=0}^{D} C(n, j) when q = 2.

2. If the system is regular up to degree D, i.e., if the relations R^{(D)} have no other dependencies than the obvious ones generated by Pi Pj = Pj Pi and Pi^q = Pi, then

T − I = [t^D] G(t), where G(t) := G(t; n; d1, d2, ..., dm) = ( (1 − t^q)^n / (1 − t)^{n+1} ) Π_{j=1}^{m} ( (1 − t^{d_j}) / (1 − t^{q d_j}) ).    (3)

3. For overdefined systems, Eq. 3 cannot hold when D > DXL := min{ D : [t^D] G(t) ≤ 0 }. If Eq. 3 holds for every D < DXL and the system resolves at DXL, we say that the system is q-semiregular. It is generally believed [3,14] that for random systems it is overwhelmingly likely that D0 = DXL, and indeed a system fails to be q-semiregular only with very small probability.

4. When it resolves, XL takes CXL ≈ (c0 + c1 lg T) τ T² multiplications in Fq, using a sparse solver like Wiedemann [31]. Here τ is the average number of terms per equation.
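Item 3 of Prop. 7 can be evaluated directly by truncated power-series arithmetic (an illustrative sketch of ours; note that when q exceeds the truncation bound, the factors (1 − t^q)^n and 1/(1 − t^{q d_j}) degenerate to 1 within the truncation, recovering the familiar q > D formulas):

```python
from math import comb

def xl_operating_degree(n, q, degrees, max_D=60):
    """D_XL = min{ D : [t^D] G(t) <= 0 } for
    G(t) = (1-t^q)^n / (1-t)^(n+1) * prod_j (1-t^d_j)/(1-t^(q d_j))."""
    T = max_D + 1

    def mul(a, b):                          # truncated series product
        out = [0] * T
        for i, ai in enumerate(a):
            if ai:
                for j, bj in enumerate(b):
                    if i + j < T:
                        out[i + j] += ai * bj
        return out

    g = [comb(k + n, n) for k in range(T)]  # 1/(1-t)^(n+1)
    f = [0] * T                             # (1-t^q)^n
    for k in range(n + 1):
        if k * q < T:
            f[k * q] = (-1) ** k * comb(n, k)
    g = mul(g, f)
    for d in degrees:
        p = [0] * T
        p[0] = 1
        if d < T:
            p[d] = -1
        g = mul(g, p)                       # times (1 - t^d)
        geo = [1 if k % (q * d) == 0 else 0 for k in range(T)]
        g = mul(g, geo)                     # times 1/(1 - t^(q d))
    return next((D for D in range(T) if g[D] <= 0), None)

# Sanity check against Table 1: for m - n = 2 quadratics over a large field,
# D_XL = ceil((m+1)/2); e.g. n = 10, m = 12 gives 7.
assert xl_operating_degree(10, 101, [2] * 12) == 7
```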
We cannot describe methods F4 -F5 [17,18], which are just too sophisticated
and complex to present here. Instead, we simply sketch a result that yields their
complexities:
Proposition 8 ([3]). For q-semiregular systems, F4 or F5 operates at the degree

D = Dreg := min{ D : [t^D] ( (1 − t^q)^n / (1 − t)^n ) Π_{j=1}^{m} ( (1 − t^{d_j}) / (1 − t^{q d_j}) ) < 0 },

and takes ≈ (c0 + c1 lg T̄) T̄^ω multiplications, where T̄ = [t^{Dreg}] ( (1 − t^q)^n (1 − t)^{−n} ) counts the monomials of degree exactly Dreg, and 2 < ω ≤ 3 is the order of matrix multiplication used.
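The same series trick gives Dreg (again a sketch of ours; the only differences from the XL series of Prop. 7 are one fewer factor of 1/(1 − t) and the strict sign condition):

```python
from math import comb

def degree_of_regularity(n, q, degrees, max_D=60):
    """D_reg = min{ D : [t^D] (1-t^q)^n/(1-t)^n
                            * prod_j (1-t^d_j)/(1-t^(q d_j)) < 0 }."""
    T = max_D + 1

    def mul(a, b):                                   # truncated series product
        out = [0] * T
        for i, ai in enumerate(a):
            if ai:
                for j, bj in enumerate(b):
                    if i + j < T:
                        out[i + j] += ai * bj
        return out

    g = [comb(k + n - 1, n - 1) for k in range(T)]   # 1/(1-t)^n
    f = [0] * T                                      # (1-t^q)^n
    for k in range(n + 1):
        if k * q < T:
            f[k * q] = (-1) ** k * comb(n, k)
    g = mul(g, f)
    for d in degrees:
        p = [0] * T
        p[0] = 1
        if d < T:
            p[d] = -1
        g = mul(g, p)                                # times (1 - t^d)
        geo = [1 if k % (q * d) == 0 else 0 for k in range(T)]
        g = mul(g, geo)                              # times 1/(1 - t^(q d))
    return next((D for D in range(T) if g[D] < 0), None)

# e.g. 12 generic quadratics in 11 variables over a large field: D_reg = 7
assert degree_of_regularity(11, 101, [2] * 12) == 7
```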
We do not know what works best under various resource limitations. We take the position of [33], i.e., that XL with a sparse solver represents the best way to solve large and more or less random overdetermined systems when the size of main memory is the critical constraint.

C SRQ, a Potential Candidate for One Way Function


An SRQ (Sparse Rotated Quadratics) instance is an MQ system specialized so that it is non-sparse but can be computed with fewer operations than normal quadratics.
Problem SRQ(q, n, m, h): In Fq, solve P1(x) = P2(x) = · · · = Pm(x) = 0, where the Pi are quadratics formed from a “sequence of rotations”: start with P0 = x1 x2 + x3 x4 + · · · + xn−1 xn (where n is even), and obtain successive Pi by performing sparse affine maps on x, i.e., x^{(0)} := x, x^{(i)} := M^{(i)} x^{(i−1)} + b^{(i)}, and yi := Pi(x) := P0(x^{(i)}) + ci for all i. The matrices M^{(i)} are randomly chosen, invertible, and sparse with h entries per row.
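A toy generator for such instances (our own sketch over F2, with state vectors as integer bitmasks; the row-addition construction of the sparse invertible M^{(i)} is one of several possible choices and only approximately achieves h entries per row):

```python
import random

def sparse_invertible_gf2(n, h, rng):
    """Random invertible n x n GF(2) matrix (rows as bitmasks) with roughly h
    entries per row: start from a permutation matrix and apply elementary
    row additions, each of which preserves invertibility."""
    perm = list(range(n))
    rng.shuffle(perm)
    rows = [1 << perm[i] for i in range(n)]
    for i in range(n):
        for _ in range(h - 1):
            j = rng.randrange(n)
            if j != i:
                rows[i] ^= rows[j]
    return rows

def srq_instance(n, m, h, seed=0):
    """SRQ(2, n, m, h): y_i = P0(x^(i)) + c_i, x^(i) = M^(i) x^(i-1) + b^(i),
    P0(x) = x1 x2 + x3 x4 + ...; evaluation only ever touches the sparse
    affine maps, never the dense expanded quadratics."""
    rng = random.Random(seed)
    maps = [(sparse_invertible_gf2(n, h, rng), rng.getrandbits(n), rng.getrandbits(1))
            for _ in range(m)]

    def P0(x):                                # the "rank form"
        acc = 0
        for k in range(0, n - 1, 2):
            acc ^= (x >> k) & (x >> (k + 1)) & 1
        return acc

    def evaluate(x):
        out = []
        for M, b, c in maps:                  # chain the sparse rotations
            x = b ^ sum((bin(r & x).count("1") & 1) << i for i, r in enumerate(M))
            out.append(P0(x) ^ c)
        return out

    return evaluate

P = srq_instance(n=8, m=5, h=3, seed=1)
ys = P(0b10110101)
```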
The idea behind SRQ is that any quadratic map can be written as f ∘ L, where f is a standard form and L is an invertible linear map; here we choose L to be sparse. A standard form for characteristic-2 fields is the “rank form”, which for full-rank quadratics is

P0(x) = x1 x2 + x3 x4 + · · · + xn−1 xn.

Clearly, by taking random c and b, we can give P0(x + c) + b any random affine part. Since each x^{(i)} is related to x = x^{(0)} by an invertible affine map, this holds for every component Pi. This means that results pertaining to sparsity of the linear terms, such as [1] (cf. Sec. 4.3), never apply, and hence it is plausible for SRQ to form a one-way function class.
In Fig. 1, the samples labelled “sparse non-random” are SRQ tests. It seems as if their behavior under MAGMA's F4 is no different from that of normal quadratics. In this SRQ construction, even if h = 3 (the rotation matrices M^{(i)} have only three entries per row), the number of cross-terms in each equation quickly increases until it has as many terms as a totally random one, so point 2 in Sec. 4.3 does not apply here. Indeed, since a quadratic over F2 of rank 2k has bias 2^{−k−1}, the SRQ form acts exactly like a random quadratic under point 3 in Sec. 4.3.

[Figure 1: log-scale plot of solving time (seconds) vs. n, for n = 11 to 15; series: Dense Random, Sparse Random (1/10), Sparse Random (2/n), Sparse Non-random, and a best fit of Dense Random.]

Fig. 1. “Sparsely 2n → 3n F2 quadratics” in MAGMA



[Figure 2: log-scale plot of solving time (seconds) vs. n, for n = 21 to 31; series: Dense Random, Sparse Random (1/50), Sparse Random (1/n), and a best fit of Dense Random.]

Fig. 2. “Sparsely n → 2n F2 quadratics” in MAGMA


MXL2 : Solving Polynomial Equations over
GF(2) Using an Improved Mutant Strategy

Mohamed Saied Emam Mohamed1 , Wael Said Abd Elmageed Mohamed1 ,


Jintai Ding2 , and Johannes Buchmann1
1
TU Darmstadt, FB Informatik,
Hochschulstrasse 10, 64289 Darmstadt, Germany
{mohamed,wael,buchmann}@[Link]
2
Department of Mathematical Sciences, University of Cincinnati,
Cincinnati OH 45220, USA
[Link]@[Link]

Abstract. MutantXL is an algorithm for solving systems of polyno-


mial equations that was proposed at SCC 2008. This paper proposes
two substantial improvements to this algorithm over GF(2) that result
in significantly reduced memory usage. We present experimental results
comparing MXL2 to the XL algorithm, the MutantXL algorithm and
Magma’s implementation of F4 . For this comparison we have chosen
small, randomly generated instances of the MQ problem and quadratic
systems derived from HFE instances. In both cases, the largest matrices
produced by MXL2 are substantially smaller than the ones produced by
MutantXL and XL. Moreover, for a significant number of cases we even
see a reduction of the size of the largest matrix when we compare MXL2
against Magma’s F4 implementation.

1 Introduction
Solving systems of multivariate quadratic equations is an important problem in
cryptology. The problem of solving such systems over finite fields is called the
Multivariate Quadratic (MQ) problem. In the last two decades, several cryp-
tosystems based on the MQ problem have been proposed as in [1,2,3,4,5]. For
generic instances it is proven that the MQ problem is NP-complete [6]. However
for some cryptographic schemes the problem of solving the corresponding MQ
system has been demonstrated to be easier, allowing these schemes to be bro-
ken. Therefore it is very important to develop efficient algorithms to solve MQ
systems.
Recently, two algorithms based on Ding's mutant concept, MutantXL [7] and MutantF4 [8], were proposed at SCC 2008. Roughly speaking, in algorithms that operate on linearized representations of the polynomial system by increasing degree – such as F4 and XL – this concept proposes to maximize the effect of lower-degree polynomials occurring during the computation. In this paper, we present MutantXL2 (MXL2) – a new algorithm based on MutantXL that often allows solving systems with significantly smaller matrix sizes than

J. Buchmann and J. Ding (Eds.): PQCrypto 2008, LNCS 5299, pp. 203–215, 2008.
c Springer-Verlag Berlin Heidelberg 2008
204 M.S.E. Mohamed et al.

XL and MutantXL. Moreover, experimental results for both HFE systems and
random systems demonstrate that for a significant number of cases we even get
a reduction of the size of the largest matrix when comparing MXL2 against
Magma’s F4 implementation.
The paper is organized as follows. In Section 2 the key ideas of the MXL2
algorithm and the required definitions are presented. A formal description and
explanations of the algorithm are in Section 3. Section 4 contains the experi-
mental results. In Section 5 we conclude our paper.

2 Improvements to the Mutant Strategy


In this section we present the key ideas of the MXL2 algorithm and explain their
importance for solving systems of multivariate quadratic polynomial equations
more efficiently. Throughout the paper we will use the following notations: Let
X := {x1 , . . . , xn } be a set of variables, upon which we impose the following
order: x1 < x2 < . . . < xn . Let

R = F2[x1, . . . , xn]/(x1^2 − x1, . . . , xn^2 − xn)

be the ring of polynomial functions over F2 in X, with the monomials of R ordered
by the graded lexicographical order <glex. By an abuse of notation, we call the
elements of R polynomials throughout this paper. Let P = (p1 , . . . , pm ) ∈ Rm
be a sequence of m quadratic polynomials in R. Throughout the operation of
the algorithms described in this paper, a degree bound D will be used. This
degree bound denotes the maximum degree of the polynomials contained in P .
Note that the contents of P will be changed throughout the operation of the
algorithm.
Some algorithms for solving the system

pj (x1 , . . . , xn ) = 0, 1 ≤ j ≤ m (1)

such as XL and MutantXL are based on finding new elements in the ideal gener-
ated by the polynomials of P that correspond to equations that are easy to solve,
i.e. univariate or linear polynomials. The MutantXL algorithm is an application
of the mutant concept to the XL algorithm. The following definitions explain
the term mutant:
Definition 1. Let g ∈ R be a polynomial in the ideal generated by the elements
of P . Naturally, it can be written as

g = Σ_{p ∈ P} gp p                                              (2)

where gp ∈ R, p ∈ P . The level of this representation is defined to be

max{deg(gp p) : p ∈ P }.
MXL2 : Solving Polynomial Equations over GF(2) 205

Note that this level depends on P . The level of the polynomial g is defined to be
the minimum level of all of its representations.

Definition 2. Let g ∈ R be a polynomial in the ideal generated by the elements


of P . The polynomial g is called a mutant with respect to P if its degree is less
than its level.

Next, we explain the meaning of mutants. When a mutant is written as a linear
combination (2), then one of the polynomials gp p has a degree exceeding the
degree of the mutant. This means that a mutant of degree d cannot be found
as a linear combination of polynomials of the form mp where m is a monomial,
p ∈ P and the degree of mp is at most d. However, such mutants could help in
solving the system (1) if we can find them efficiently.
Given a degree bound D, the MutantXL algorithm extends the system of
polynomial equations (1) by multiplying the polynomials on the left-hand side
by all monomials up to degree D − deg(pi ). Then the system is linearized by
considering the monomials as new variables and applying Gaussian elimination
on the resulting linear system. MutantXL then searches for univariate equations;
if no such equations exist, it searches for mutants, i.e. new polynomials of
degree < D. If mutants are found, they are multiplied by all monomials such
that the produced polynomials have degree ≤ D. Using this strategy, MutantXL
manages to enlarge the system without incrementing D.
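The extend-then-eliminate core that XL and MutantXL share can be sketched as follows. This is a toy model of our own (a polynomial is a set of monomials, a monomial a frozenset of variable indices, so x_i^2 = x_i holds automatically), not the authors' implementation:

```python
from itertools import combinations

def monomials_up_to(n, d):
    """All square-free monomials in n variables of degree <= d (x_i^2 = x_i in R)."""
    return [frozenset(c) for k in range(d + 1) for c in combinations(range(n), k)]

def glex_key(m):
    """Sort key realizing the graded lexicographical order (degree first)."""
    return (len(m), tuple(sorted(m, reverse=True)))

def multiply(poly, mono):
    """Multiply a GF(2) polynomial (set of monomials) by a monomial."""
    out = set()
    for m in poly:
        out ^= {m | mono}        # coefficients live in GF(2): duplicates cancel
    return out

def echelonize(polys):
    """Row echelon form via leading-term reduction; XOR is addition over GF(2).
    New rows whose degree dropped below the extension degree are mutant candidates."""
    basis = {}                   # leading monomial -> polynomial
    for p in polys:
        p = set(p)
        while p:
            lead = max(p, key=glex_key)
            if lead not in basis:
                basis[lead] = p
                break
            p ^= basis[lead]     # reduce by the row with the same leading term
    return list(basis.values())
```

For example, reducing x0·x1 + x0 and x0·x1 + x1 against each other yields the degree-1 polynomial x0 + x1 — exactly the kind of lower-degree element the mutant strategy looks for.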
In many experiments with MutantXL on some HFE systems and some randomly
generated multivariate quadratic systems, we noticed two problems. The first
occurs when the number of lower-degree mutants is very large; we observed that
this produces many reductions to zero. The second problem occurs when an
iteration produces no mutants at all, or produces too few mutants to solve the
system at a lower degree D. In this case MutantXL behaves like XL.
Our proposed improvements handle both problems, while using the same lin-
earization strategy as the original MutantXL. This allows us to compute the
solution with fewer polynomials. To handle the first problem, we need the fol-
lowing notation.
Let Sk := {m ∈ R : deg(m) ≤ k} be the set of all monomials of R that have
degree less than or equal to k. Combinatorially, the number of elements of this
set can be computed as

|Sk| = Σ_{ℓ=1}^{k} (n choose ℓ),   1 ≤ k ≤ n                    (3)

where n is the number of variables.
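Equation (3) can be checked with a few lines of Python (our code; note the sum starts at degree 1, so the constant monomial is not counted):

```python
from math import comb

def count_monomials(n, k):
    """|S_k| per equation (3): square-free monomials of degree 1..k in n
    variables (x_i^2 = x_i, so every monomial is a subset of the variables)."""
    return sum(comb(n, l) for l in range(1, k + 1))
```

For n = 5 variables and k = 2 this gives 5 + 10 = 15 monomials.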


The MXL2 algorithm as well as MutantXL are based on the mutant concept;
however, MXL2 introduces a heuristic strategy of choosing only the minimum
number of mutants, which will be called necessary mutants. Let k be the degree
of the lowest-degree mutant occurring, and let Q(k + 1) be the number of linearly
independent elements of degree ≤ k + 1 in P. Then the smallest number of mutants
that are needed to generate |S_{k+1}| linearly independent equations of degree ≤
k + 1 is

(|S_{k+1}| − Q(k + 1))/n,                                       (4)

where S_{k+1} is as in (3) and n is the number of variables. Therefore, by multiplying
only the necessary number of mutants, the system can potentially be solved with
a smaller number of polynomials and a minimum number of multiplications. This
handles the first problem. In the following we explain how MXL2 solves the
second problem.
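The count in (4) can be sketched in code as follows; the ceiling and the clamp at zero are our reading of the formula, not spelled out in the text:

```python
from math import ceil, comb

def necessary_mutants(n, k, q_k1):
    """Smallest number of degree-k mutants to multiply by all n variables so
    that |S_{k+1}| linearly independent equations of degree <= k+1 become
    reachable. q_k1 is Q(k+1), the number of independent elements of degree
    <= k+1 already in P; each mutant yields at most n new rows."""
    s_k1 = sum(comb(n, l) for l in range(1, k + 2))   # |S_{k+1}| as in (3)
    return max(0, ceil((s_k1 - q_k1) / n))
```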
Suppose we have a system with not enough mutants. In this case we noticed
that, in the process of space enlargement, MutantXL multiplies all original
polynomials by all monomials of degree D − 2. In most cases only a small number
of the extended polynomials produced this way are needed to solve the system.
Moreover, the system will be solved only when some of these elements are reduced
to lower-degree elements. To be more precise, the degree of the extended polyno-
mials is decreased only if the higher-degree terms are eliminated. We have found
that by using a partitioned enlargement strategy and a successive multiplication
of polynomials by variables, while excluding redundant products, we
can solve the system with a smaller number of equations. To discuss this idea in
detail we first need the following definitions:
Definition 3. The leading variable of a polynomial p in R is x if x is the
smallest variable, according to the order defined on the variables, occurring in
the leading term of p. We write

LV(p) = x                                                       (5)

Definition 4. Let P_k = {p ∈ P : deg(p) = k} and x ∈ X. We define P_k^x as
follows:

P_k^x = {p ∈ P_k : LV(p) = x}                                   (6)
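Definitions 3 and 4 translate directly into code. A hypothetical sketch of ours, representing a polynomial as a set of monomials and a monomial as a frozenset of variable indices:

```python
def glex_key(m):
    """Graded lex sort key for square-free monomials (degree first)."""
    return (len(m), tuple(sorted(m, reverse=True)))

def leading_variable(poly):
    """LV(p) per Definition 3: the smallest variable in the leading term
    (variables ordered x_0 < x_1 < ...); p must be non-constant."""
    return min(max(poly, key=glex_key))

def degree_partition(polys, k):
    """P_k^x per Definition 4: the degree-k elements of P, grouped by LV."""
    parts = {}
    for p in polys:
        if max(len(m) for m in p) == k:
            parts.setdefault(leading_variable(p), []).append(p)
    return parts
```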

In the process of space enlargement, MXL2 deals with the polynomials of P_D
differently. Let P_D be divided into a set of subsets depending on the leading
variable of each polynomial in it. In other words, P_D = ∪_{x∈X} P_D^x, where X
is the set of variables as defined previously and P_D^x is as in (6). MXL2 enlarges
P by incrementing D and multiplying the elements of P_D as follows: Let x be the
largest variable, according to the order defined on the variables, such that
P_D^x ≠ ∅. MXL2 successively multiplies each polynomial of P_D^x by variables
such that each variable is multiplied only once. This process is repeated for the
next smaller variable x with P_D^x ≠ ∅ until the solution is obtained; otherwise
the system is enlarged to the next D. Therefore MXL2 may solve the system by
enlarging only subsets of P_D, while MutantXL solves the system by enlarging all
the elements of P_D. MXL2 handles the second problem by using this partitioned
enlargement strategy.
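The non-redundant multiplication schedule can be sketched as follows (our reading of the strategy: each polynomial carries the variable it was last multiplied by, and multiplying only by strictly smaller variables generates every square-free monomial multiple exactly once; a polynomial is a set of frozenset monomials):

```python
def multiply_var(poly, y):
    """Multiply a GF(2) polynomial (set of frozenset monomials) by the
    variable x_y, using x_y^2 = x_y; duplicates cancel over GF(2)."""
    out = set()
    for mon in poly:
        out ^= {mon | {y}}
    return out

def enlarge(polys_with_last):
    """One enlargement round: multiply each (poly, last-multiplier) pair only
    by variables y < last. Fresh polynomials enter with last = n (the number
    of variables), so they are multiplied by every variable once."""
    out = []
    for poly, last in polys_with_last:
        for y in range(last):
            out.append((multiply_var(poly, y), y))
    return out
```

Starting from the constant polynomial in two variables, two rounds produce x0, x1 and then x0·x1, with no duplicate products.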
In the next section we describe MXL2. In Section 4 we present examples which
show that MXL2 clearly outperforms the first version of MutantXL and, in
most cases, beats Magma's implementation of F4 in terms of memory efficiency.

3 MXL2 Algorithm

In this section we explain the MXL2 algorithm. We use the notation of the
previous section, so P is a finite set of polynomials in R. For simplicity, we
assume that the system (1) is quadratic and has a unique solution.
We use a graded lexicographical ordering in the process of linearization and
during the Gaussian elimination. MXL2 maintains a one-dimensional
multiplication-history array that stores, for each polynomial, the variable by
which it was previously multiplied; for the original polynomials the previous
multiplier is 1. The set of solutions of the system is defined as
{x = b : x is a variable and b ∈ {0, 1}}. The description of the algorithm is as follows.

– Initialization Use Gaussian elimination to make P linearly independent. Set
the set of root polynomials to ∅, the total degree bound D to 2, the elimina-
tion degree to D, system extended to false, mutants to ∅, and the multiplication
history to a one-dimensional array with as many elements as P, initialized
to ones (Algorithm 1 lines 16 – 21).
– Gauss Use linearization to transform the set of all polynomials in P of degree
≤ elimination degree into reduced row echelon form (Algorithm 1 lines 23
and 24).
– Extract Roots Copy all new polynomials of degree ≤ 2 to the root polynomials
set (Algorithm 1 line 25).
– If there are univariate polynomials in the roots, then determine the values
of the corresponding variables, and remove the solved variables from the
variable set. If this solves the system, return the solution and terminate.
Otherwise, substitute the values for the variables in the roots, set P to the
roots, set the elimination degree to the maximum degree of the roots, reset the
multiplication history to an array with as many elements as P initialized to
ones, and go back to Gauss (Algorithm 1 lines 26 – 32).
– Extract Mutants Copy all new polynomials of degree < D from P to mutants
(Algorithm 1 line 34).
– If mutants were found, then extend the multiplication history by an array
of the same length as the number of new polynomials, initialized to ones;
multiply the necessary number of mutants having the minimum degree, as
stated in Section 2, by all variables; set the multiplication history for each
new polynomial to its variable multiplier; include the resulting polynomials
in P; set the elimination degree to that minimum degree + 1; and remove
all multiplied mutants from mutants (Algorithm 2 lines 9 – 20).
– Otherwise, if system extended is false, then increment D by 1, set x to the
largest leading variable, under the variable order, satisfying P_{D−1}^x ≠ ∅, and
set system extended to true. Multiply each polynomial p in P_{D−1}^x by all
unsolved variables < the variable stored in the multiplication history of p,
include the resulting polynomials in P, and set x to the next smaller leading
variable satisfying P_{D−1}^x ≠ ∅; if there is no such variable, then set system
extended to false. Set the elimination degree to D, and go back to Gauss
(Algorithm 2 lines 22 – 39).

To give a more formal description of the MXL2 algorithm and its sub-algorithms,
we first need to define the following subroutines:

Solve(Roots, X): if there are univariate equations in the roots, then solve them
and return the solutions.
Substitute(Solution, roots): use all the solutions found to simplify the roots.
Reset(history, n): reset history to an array with number of elements equal to n
and initialized by ones.
Extend(history, n): append to history an array with number of elements equal
to n and initialized by ones.
SelectNecessary(M, D, k, n): compute the necessary number of mutants of
degree k as in equation (4); order the mutants by their leading terms and
return the necessary mutants in ascending order.
Xpartition(P, x): return {p ∈ P : LV (p) = x}.
LargestLeading(P ): return max{y : y = LV (p), p ∈ P, y ∈ X}.
NextSmallerLeading(P, x): return max{y: y = LV (p), p∈P , y∈X and y<x}.
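The two leading-variable subroutines are one-liners. A sketch where `lv` is any function returning LV(p) (for instance the one from Definition 3):

```python
def largest_leading(polys, lv):
    """LargestLeading(P): the largest leading variable among the polynomials."""
    return max(lv(p) for p in polys)

def next_smaller_leading(polys, x, lv):
    """NextSmallerLeading(P, x): largest leading variable strictly below x,
    or None when none exists (Algorithm 2 then resets the `extended` flag)."""
    below = [lv(p) for p in polys if lv(p) < x]
    return max(below) if below else None
```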

Algorithm 1: MXL2
1. Inputs
2. F : set of quadratic polynomials.
3. D: highest system degree starts by 2.
4. X: set of variables.
5. Output
6. Solution: solution of F=0.
7. Variables
8. RP : set of all regular polynomials produced during the process.
9. M : set of mutants.
10. roots: set of all polynomials of degree ≤ 2
11. x: variable
12. ed: elimination degree
13. history: array of length #RP to store previous variable multiplier
14. extended: a flag to enlarge the system
15. Begin
16. RP ← F
17. M ← ∅
18. Solution ← ∅
19. ed ← 2
20. history ← [1,. . .,1]
21. extended ← false
22. repeat
23. Linearize RP using graded lex order
24. Gauss(Extract(RP, ed, ≤), history)
25. roots ← roots ∪ Extract(RP, 2, ≤)
26. Solution ← Solution ∪ Solve(roots, X)
27. if there are solutions then

28. roots ← Substitute(Solution, roots)


29. RP ← roots
30. history ← Reset(history, #roots)
31. M ←∅
32. ed ← D ← max{deg(p) : p ∈ roots}
33. else
34. M ← M ∪ Extract(RP, D − 1, ≤)
35. RP ← RP ∪ Enlarge(RP, M, X, D, x, history, extended, ed)
36. end if
37. until roots = ∅
38. End

Algorithm 2: Enlarge(RP, M, X, D, x, history, extended, ed)


1. history, extended, ed: may be changed during the process.
2. Variable
3. N P : set of new polynomials.
4. N M : necessary mutants
5. Q: set of degree D−1 polynomials having leading variable x
6. k: minimum degree of the mutants
7. Begin
8. N P ← ∅
9. if M = ∅ then
10. k ← min{deg(p) : p ∈ M }
11. N M ← SelectNecessary(M, D, k, #X)
12. Extend(history, #X · #N M )
13. for all p ∈ N M do
14. for all y in X do
15. N P ← N P ∪ {y · p}
16. history[y · p] = y
17. end for
18. end for
19. M ← M \ N M
20. ed ← k + 1
21. else
22. if not extended then
23. D ←D+1
24. x ← LargestLeading(Extract(RP, D − 1, =))
25. extended ← true
26. end if
27. Q ← XPartition(Extract(RP, D − 1, =), x)
28. Extend(history, #X · #Q)
29. for all p ∈ Q do
30. for all y ∈ X: y < history[p] do
31. N P ← N P ∪ {y · p}
32. history[y · p] ← y

33. end for


34. end for
35. x ← NextSmallerLeading(Extract(RP, D − 1, =), x)
36. if x is undefined then
37. extended ← false
38. end if
39. ed ← D
40. end if
41. Return N P
42. End

Algorithm 3: Extract(P, degree, operation)


1. P : set of polynomials
2. SP : set of selected polynomials
3. operation: a comparison operator from {<, ≤, >, ≥, =}
4. Begin
5. SP ← ∅
6. for all p ∈ P do
7. if deg(p) operation degree then
8. SP ← SP ∪ {p}
9. end if
10. end for
11. Return SP
12. End
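Algorithm 3 is a plain degree filter; in a set-of-monomials toy representation (a polynomial as a set of frozenset monomials) it reads:

```python
import operator

OPS = {"<": operator.lt, "<=": operator.le,
       ">": operator.gt, ">=": operator.ge, "=": operator.eq}

def extract(polys, degree, op):
    """Extract(P, degree, operation): the polynomials whose degree satisfies
    deg(p) <op> degree; deg(p) is the largest monomial size in p."""
    cmp = OPS[op]
    return [p for p in polys if cmp(max(len(m) for m in p), degree)]
```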

Since the system is enlarged partition by partition, MXL2 coincides with the
original MutantXL if the system is only solved once the last partition has been
enlarged, whereas MXL2 outperforms the original MutantXL if it solves the
system after enlarging an earlier partition. This will be clarified experimentally
in the next section.

4 Experimental Results
In this section, we present the experimental results for our implementation of the
MXL2 algorithm. We compare MXL2 with the original MutantXL, Magma's
implementation of F4, and the XL algorithm for some random systems (5-24
equations in 5-24 variables). The results can be found in Table 1. Moreover,
we provide another comparison of MXL2, the original MutantXL, and Magma for
some HFE systems (25-55 equations in 25-55 variables) in order to show that
the mutant strategy can be helpful for different types of systems. The results
are given in Table 2. For XL and MutantXL, all monomials up to the degree
bound D are computed and accounted for as columns in the matrix, even if they
do not appear in any polynomial. For MXL2, on the other hand, we omitted
columns that only contained zeros.
Random systems were taken from [9], HFE systems (30-55 equations in 30-
55 variables) were generated with code contained in [10], and one HFE sys-
tem (25 equations in 25 variables) was taken from the Hotaru distribution [11].
The results for F4 were obtained using Magma version 2.13-10; the parameter

Table 1. Random Comparison

#Eq = #Var      XL              MutantXL        Magma           MXL2
5 30×26 30×26 30×26 20×25
6* 42×42 47×42 46×40 33×38
7* 203×99 154×64 154×64 63×64
8* 296×163 136×93 131×88 96×93
9 414×256 414×256 480×226 151×149
10 560×386 560×386 624×3396 228×281
11 737×562 737×562 804×503 408×423
12 948×794 948×794 1005×704 519×610
13 1196×1093 1196×1093 1251×980 1096×927
14* 6475×3473 1771×1471 1538×1336 1191×1185
15* 8520×4944 2786×2941 2639×1535 1946×1758
16 11016×6885 11016×6885 9993×4034 2840×2861
17 14025×9402 14025×9402 12382×5784 3740×4184
18 17613×12616 17613×12616 15187×8120 6508×7043
19 21850×16664 21850×16664 18441×11041 9185×11212
20 26810×21700 26810×21700 22441×14979 14302×12384
21* 153405×82160 31641×27896 26860×19756 14365×20945
22* 194579×110056 92831×35443 63621×21855 35463×25342
23* 244145×145499 76558×44552 41866×29010 39263×36343
24* no sol. obtained 298477×190051 207150×78637 75825×69708

HFE:=true was used to solve HFE systems. The MXL2 algorithm has been im-
plemented in C/C++ based on the latest version of M4RI package [12]. For each
example, we give the number of equations (#Eq), number of variables (#Var),
the degree of the hidden univariate high-degree polynomial for HFE (HUD) and
the size of the largest linear system to which Gauss is applied. The ’*’ in the first
column for random systems indicates that mutants occur in the corresponding system.
In all experiments, the highest degree of the polynomials generated by Mu-
tantXL and MXL2 is equal to the highest degree of the S-polynomials in Magma.
The MXL2 implementation uses only one matrix from the start to the end of
the process, enlarging and extending the initial matrix; the largest matrix thus
accumulates all polynomials that are held in memory. Unfortunately, we cannot
determine the total accumulated matrix size in Magma, since it is not open source.
In Table 1, we see that in practice MXL2 improves the memory efficiency
over the original MutantXL. For systems for which mutants are produced
during the computation, MutantXL is better than XL. If no mutants occur,
MutantXL behaves identically to XL. Comparing XL, MutantXL, and MXL2,
MXL2 is the most efficient even if there are no mutants. In almost all cases
MXL2 has the smallest number of columns as well as a smaller number of rows
compared to the F4 implementation contained in Magma: in 70% of the cases
MXL2 is better, in 5% equal, and in 25% worse.

Table 2. HFE Comparison

#Eq = #Var   HUD   Magma           MutantXL        MXL2
25 96 12495×15276 14219×15276 11926×15276
30 64 23832×31931 26922×31931 19174×31931
35 48 27644×59536 31255×59536 30030×59536
40 33 45210×102091 49620×102091 46693×102091
45 24 43575×164221 57734×164221 45480×164221
50 40 75012×251176 85025×251176 67826×251176
55 48 104068×368831 119515×368831 60116×368831

Table 3. Time Comparison

System MutantXL MXL2


RND5 0.004 0.001
RND6 0.001 0.004
RND7 0.004 0.008
RND8 0.004 0.001
RND9 0.016 0.012
RND10 0.024 0.016
RND11 0.044 0.024
RND12 0.072 0.040
RND13 0.112 0.084
RND14 0.252 0.184
RND15 0.372 0.256
RND16 13.629 1.636
RND17 28.342 2.420
RND18 92.078 9.561
RND19 178.971 20.057
RND20 346.062 70.001
RND21 699.108 126.576
RND22 1182.410 498.839
RND23 1636.000 854.753
RND24 23370.001 12384.700

In Table 2, we also present the HFE systems comparison. In all seven ex-
amples, all monomials up to the degree bound D appear for all three algorithms
(Magma's F4, MutantXL, and MXL2); therefore, the number of columns is equal
for all three algorithms. It is clear that MXL2 has a smaller number of rows in
four of the seven cases. In all cases MXL2 outperforms MutantXL.
A time comparison in seconds for random systems between MutantXL and
MXL2 can be found in Table 3. For this comparison we used a Sun Fire X2200
M2 server with two dual-core Opteron 2218 CPUs running at 2.6 GHz and 8 GB
of RAM. We did not make such a comparison between Magma and MXL2 for
HFE instances. This is due to the following reasons: we use a special Magma

Table 4. Strategy Comparison

#Eq = #Var   Method1         Method2         Method3         Method4
5 30×26 30×26 25×25 20×25
6 47×42 47×42 33×38 33×38
7 154×64 63×64 154×64 63×64
8 136×93 96×93 136×93 96×93
9 414×239 414×239 232×149 151×149
10 560×367 560×367 318×281 228×281
11 737×541 737×541 408×423 408×423
12 948×771 948×771 519×610 519×610
13 1196×1068 1196×1068 1616×967 1096×927
14 1771×1444 1484×1444 1485×1185 1191×1185
15 2786×1921 1946×1921 2681×1807 1946×1758
16 11016×5592 10681×5592 6552×2861 2840×2861
17 14025×7919 13601×7919 4862×4184 3740×4184
18 17613×10930 17086×10930 6508×7043 6508×7043
19 21850×14762 21205×14762 9185×11212 9185×11212
20 26810×19554 26031×19554 14302×12384 14302×12384
21 31641×25447 31641×25447 14428×20945 14365×20945
22 92831×34624 38116×32665 56385×28195 35463×25342
23 76558×43650 45541×43650 39263×36343 39263×36343
24 298477×190051 297810×190051 75825×69708 75825×69708

implementation for HFE systems (via the HFE:=true parameter); the MXL2
implementation is based on the M4RI package, which is not yet at its optimal
speed as stated by the M4RI contributors; and the MXL2 implementation itself
is not optimal at this point. From Table 3, it is clear that MXL2 performs well
in terms of speed compared to MutantXL.
In order to shed light on which strategy (necessary mutants or partitioned
enlargement) contributes more in which case, we make another comparison for
random systems. In this comparison, we use four methods that cover all
possibilities of applying the two strategies. Method1 multiplies all lower-degree
mutants that are extracted at a certain level; neither of the two strategies is
used. Method2 multiplies only our claimed necessary number of mutants (the
necessary mutant strategy). Method3 uses the partitioned enlargement strategy,
with multiplications for all lower-degree mutants. Method4, which is MXL2,
uses both strategies. See Table 4.
In Table 4, comparing Method1 and Method2, we see that in practice the
necessary mutant strategy has an effect in the cases which have a large enough
number of hidden mutants (cases 7, 8, 14, 15, 22 and 23). In the cases that
have fewer mutants (cases 6, 21 and 24) or no mutants at all (cases 5, 9, 10-13,
and 16-20), the total number of rows is the same as in Method1. Furthermore,
in case 22 the number of columns decreases because not all mutants were
multiplied. Comparing Method1 and Method3, most of the cases in the
partitioned enlargement strategy have a smaller number of rows, except for case

13, which is worse because Method3 extracts mutants earlier than Method1 and
multiplies all of them, while MutantXL solves the system and terminates before
multiplying them. In a case that is solved with the last partition, the two methods
are identical (cases 7 and 8).
Indeed, using both strategies as in Method4 is the best choice. In all cases the
number of rows in this method is less than or equal to the minimum number of
rows of Method2 and Method3:

#rows in Method4 ≤ min(#rows in Method2, #rows in Method3)

In some cases (13, 15 and 22) using both strategies also leads to a smaller
number of columns.

5 Conclusion
Experimentally, we can conclude that the MXL2 algorithm is an efficient im-
provement over the original MutantXL in the case of GF(2). Not only can MXL2
solve multivariate systems at a lower degree than the usual XL, it can also
solve these systems using a smaller number of polynomials than the original
MutantXL, since we produce all possible new equations without enlarging the
number of monomials. Therefore the size of the matrix constructed by MXL2
is much smaller than the matrix constructed by the original MutantXL. We do
not claim to be strictly better than F4, but we are moving in that direction.
We applied the mutant strategy to two different types of systems, namely
random and HFE. We believe that the mutant strategy is a general approach that
can improve most multivariate polynomial solving algorithms.
In the future we will study how to build MXL2 using a sparse matrix represen-
tation instead of the dense one to optimize our implementation. We also need
to enhance the mutant selection strategy to reduce the number of redundant
polynomials, study the theoretical aspects of the algorithm, apply the algorithm
to other systems of equations, generalize it to other finite fields and deal with
systems of equations that have multiple solutions.

Acknowledgment
We would like to thank Ralf-Philipp Weinmann for several helpful discussions
and comments on earlier drafts of this paper.

References
1. Matsumoto, T., Imai, H.: Public Quadratic Polynomial-Tuples for Efficient
Signature-Verification and Message-Encryption. In: Günther, C.G. (ed.) EURO-
CRYPT 1988. LNCS, vol. 330, pp. 419–453. Springer, Heidelberg (1988)
2. Patarin, J.: Hidden Fields Equations (HFE) and Isomorphisms of Polynomials (IP):
two new families of Asymmetric Algorithms. In: Maurer, U.M. (ed.) EUROCRYPT
1996. LNCS, vol. 1070, pp. 33–48. Springer, Heidelberg (1996)


3. Patarin, J., Goubin, L., Courtois, N.: C∗−+ and HM: Variations Around Two
Schemes of T. Matsumoto and H. Imai. In: Ohta, K., Pei, D. (eds.) ASIACRYPT
1998. LNCS, vol. 1514, pp. 35–50. Springer, Heidelberg (1998)
4. Moh, T.: A Public Key System With Signature And Master Key Functions. Com-
munications in Algebra 27, 2207–2222 (1999)
5. Ding, J.: A New Variant of the Matsumoto-Imai Cryptosystem through Pertur-
bation. In: Bao, F., Deng, R., Zhou, J. (eds.) PKC 2004. LNCS, vol. 2947, pp.
305–318. Springer, Heidelberg (2004)
6. Courtois, N.T., Klimov, A., Patarin, J., Shamir, A.: Efficient Algorithms for Solving
Overdefined Systems of Multivariate Polynomial Equations. In: Preneel, B. (ed.)
EUROCRYPT 2000. LNCS, vol. 1807, pp. 392–407. Springer, Heidelberg (2000)
7. Ding, J., Buchmann, J., Mohamed, M.S.E., Mohamed, W.S.A., Weinmann, R.P.:
MutantXL. In: Proceedings of the 1st international conference on Symbolic Com-
putation and Cryptography (SCC 2008), Beijing, China, LMIB, pp. 16–22 (2008),
[Link]
MutantXL [Link]
8. Ding, J., Cabarcas, D., Schmidt, D., Buchmann, J., Tohaneanu, S.: Mutant
Gröbner Basis Algorithm. In: Proceedings of the 1st international conference on
Symbolic Computation and Cryptography (SCC 2008), Beijing, China, LMIB, pp.
23–32 (2008)
9. Courtois, N.T.: Experimental Algebraic Cryptanalysis of Block Ciphers (2007),
[Link]
10. Segers, A.: Algebraic Attacks from a Gröbner Basis Perspective. Master's thesis,
Department of Mathematics and Computing Science, Technische Universiteit
Eindhoven, Eindhoven (2004)
11. Shigeo, M.: Hotaru (2005), [Link]
hotaru/hotaru/hfe25-96?view=markup
12. Albrecht, M., Bard, G.: M4RI – Linear Algebra over GF(2) (2008),
[Link]
Side Channels in the McEliece PKC

Falko Strenzke1, Erik Tews2, H. Gregor Molter3, Raphael Overbeck4,
and Abdulhadi Shoufan3
1 FlexSecure GmbH, Germany
strenzke@[Link]
2 Cryptography and Computeralgebra, Department of Computer Science,
Technische Universität Darmstadt, Germany
e tews@[Link]
3 Integrated Circuits and Systems Lab, Department of Computer Science,
Technische Universität Darmstadt, Germany
{molter,shoufan}@[Link]
4 Ecole Polytechnique Fédérale de Lausanne, Switzerland
[Link]@[Link]

Abstract. The McEliece public key cryptosystem (PKC) is regarded
as secure in the presence of quantum computers because no efficient
quantum algorithm is known for the underlying problems, which this
cryptosystem is built upon. As we show in this paper, a straightfor-
ward implementation of this system may feature several side channels.
Specifically, we present a Timing Attack which was executed successfully
against a software implementation of the McEliece PKC. Furthermore,
the critical system components for key generation and decryption are
inspected to identify channels enabling power and cache attacks. Imple-
mentation aspects are proposed as countermeasures against these attacks.

Keywords: side channel attack, timing attack, post-quantum cryptography.

1 Introduction

Current cryptographic systems depend on hard mathematical problems such
as the factorization of large integers and the computation of discrete loga-
rithms [1,2,3,4]. These systems are known to be vulnerable to certain algo-
rithms which could be implemented efficiently on quantum computers [5,6,7].
gorithms which could be implemented efficiently on quantum computers [5,6,7].
New classes of cryptographic schemes will be needed to guarantee system and
network security also in the presence of quantum computers. Examples of
these classes are hash-based cryptography, such as the Merkle signature
scheme [8,9], and code-based cryptography, such as the McEliece PKC [10,11].
The McEliece PKC is based on Goppa codes. The strongest known attack
is based on solving the NP-hard decoding problem, and no quantum algorithm
has been proposed which increases the efficiency of this attack [12]. So, although

A part of the work of F. Strenzke was done at affiliation 2.

J. Buchmann and J. Ding (Eds.): PQCrypto 2008, LNCS 5299, pp. 216–229, 2008.
c Springer-Verlag Berlin Heidelberg 2008

well studied regarding its security against algorithmic attacks, to the best of our
knowledge, the McEliece PKC has never been analyzed with respect to side
channel attacks. Side channel attacks target a cryptographic system taking ad-
vantage of its implementation [13,14,15,16]. Algorithm execution is associated
with measurable quantities such as power consumption and execution time. The
amounts of these quantities depend on the data processed by the algorithm. If
the processed data is secret such as a private key, then the measured quanti-
ties may disclose the secret totally or partially. To prevent side channel attacks,
countermeasures must be included during the implementation of the algorithm.

Our contribution
This paper addresses side channel attacks on the McEliece PKC and corre-
sponding countermeasures. It is structured as follows. Section 2 presents as
preliminaries the Goppa code and the McEliece PKC in brief. Section 3 details
a timing attack on the degree of error locator polynomial, which is used in the
error correction step in the decryption algorithm. A theoretical justification for
this attack is presented as well as experimental results of the execution of the
attack against a software implementation. Also, countermeasures are addressed.
Section 4 outlines two other side channel attacks and related countermeasures:
a power attack on the construction of the parity check matrix during key gener-
ation and a cache attack on the permutation of code words during decryption.
Section 5 concludes the paper.

2 Preliminaries
In this section we assume that the reader is familiar with the basics of
error-correcting codes. We use the notation given e.g. in [17].

2.1 Goppa Codes


Goppa codes [18] are a class of linear error-correcting codes. The McEliece PKC
makes use of irreducible binary Goppa codes, so we will restrict ourselves to this
subclass.

Definition 1. Let m, t be positive integers and let the polynomial

g(Y) = Σ_{i=0}^{t} g_i Y^i ∈ F_{2^m}[Y]                         (1)

be monic and irreducible over F_{2^m}[Y]. Then g(Y) is called a Goppa
polynomial (for an irreducible binary Goppa code).
An irreducible binary Goppa code is then defined as

G(F_{2^m}, g(Y)) = {c ∈ F_2^n | S_c(Y) := Σ_{i=0}^{n−1} c_i/(Y − γ_i) = 0 mod g(Y)}   (2)

where n = 2^m, S_c(Y) is the syndrome of c, the γ_i, i = 0, . . . , n − 1, are pairwise
distinct elements of F_{2^m}, and the c_i are the entries of the vector c.
The code defined in this way has length n, dimension k = n − mt and can
correct up to t errors. The canonical check matrix H for G(F2m , g(Y )) can be
computed from the syndrome equation and is given in Appendix A.

2.2 The McEliece PKC


The McEliece PKC is named after its inventor [10]. It is a public key encryption
scheme based on general coding theory. In the following, we will give a brief
description of the individual algorithms for key generation, encryption and de-
cryption, without presenting the mathematical foundations behind the scheme or
the consideration of its security. For these considerations, the reader is referred
to [19].
Here, we describe the PKC without any CCA2-conversion, as it was origi-
nally designed. Without such a conversion, the scheme is vulnerable to
adaptive chosen-ciphertext attacks [19]. However, a suitable conversion, like the
Kobara-Imai conversion [11], will solve this problem. In Section 3.2 we show
that the usage of a CCA2-conversion does not prevent the side channel attack
described in Section 3.1.
Parameters of the McEliece PKC. The security parameters m \in N and
t \in N with t \ll 2^m have to be chosen in order to set up a McEliece PKC. An
example of secure values would be m = 11, t = 50. These values can be derived
from the considerations given in [19] or [20]. In addition, F_{2^m} and the \gamma_i are
public parameters.

Key Generation
The private key. The secret key consists of two parts. The first part of the
secret key in the McEliece PKC is a Goppa polynomial g(Y) of degree t over
F_{2^m} according to Definition 1, with random coefficients. The second part of the
private key is a randomly created n \times n permutation matrix P.

The public key. The public key is generated from the secret key as follows. First,
compute H on the basis of g(Y ). Then take Gpub = [Ik | R] as the generator
in systematic form corresponding to the parity check matrix HP (refer to
Appendix A for the creation of the parity check matrix and the generator of a
Goppa code).
Encryption. Assume Alice wants to encrypt a message v \in F_2^k. First, she
creates a random binary vector e of length n and Hamming weight wt(e) = t.
Then she computes the ciphertext z = vG_pub \oplus e.
Decryption. In order to decrypt the ciphertext, Bob computes zP. Then he
applies error correction by executing an error correction algorithm, such as the
Patterson Algorithm described in Section 2.3, to determine eP. Afterwards, he
recovers the message v as the first k bits of z \oplus (eP)P^{-1}.
Side Channels in the McEliece PKC 219

2.3 Error Correction for Irreducible Binary Goppa Codes


In the following we briefly describe how error correction can be performed with
binary irreducible Goppa codes. The error correction of Goppa codes makes use
of the so-called error locator polynomial

    \sigma_e(X) = \prod_{j \in T_e} (X - \gamma_j) \in F_{2^m}[X],                  (3)

where Te = {i|ei = 1} and e is the error vector of the distorted code word to
be decoded. Once the error locator polynomial is known, the error vector e is
determined as

e = (σe (γ0 ), σe (γ1 ), · · · , σe (γn−1 )) ⊕ (1, 1, · · · , 1). (4)
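The logic behind Equation 4 — an error at position i exactly when the error locator vanishes at \gamma_i — can be illustrated with a small sketch. This is our own toy example over GF(2^3) (field modulus, \gamma ordering, and error positions chosen purely for illustration), not code from any McEliece implementation:

```python
# Toy example over GF(2^3) with field modulus x^3 + x + 1; all values illustrative.
M, MOD = 3, 0b1011

def gf_mul(a, b):
    # Carry-less multiplication followed by reduction modulo the field polynomial
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & (1 << M):
            a ^= MOD
    return r

def poly_eval(coeffs, x):
    # Horner evaluation over GF(2^m); coeffs[i] is the coefficient of X^i
    r = 0
    for c in reversed(coeffs):
        r = gf_mul(r, x) ^ c
    return r

def error_vector(sigma, gammas):
    # Position i is an error position exactly when sigma vanishes at gamma_i
    return [1 if poly_eval(sigma, g) == 0 else 0 for g in gammas]

gammas = list(range(1 << M))       # gamma_i = i in this toy ordering
# sigma_e(X) = (X - 2)(X - 5): coefficients [gf_mul(2, 5), 2 XOR 5, 1]
sigma = [gf_mul(2, 5), 2 ^ 5, 1]
print(error_vector(sigma, gammas))  # -> [0, 0, 1, 0, 0, 1, 0, 0]
```

The degree-2 locator has exactly the two roots \gamma_2 and \gamma_5, so the recovered error vector has weight 2.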

The Patterson Algorithm is an efficient algorithm for the determination of the


error locator polynomial. It can be found in detail in [19]. We will restrict our
description to those features that are necessary to understand the attack we are
going to present. Also, we do not provide derivations for most of the equations
we specify in the following.
The Patterson Algorithm actually does not determine \sigma_e(X) as defined in
Equation 3, but computes \bar{\sigma}_e(X) = \sigma_e(X) mod g(X), where \bar{\sigma}_e(X) = \sigma_e(X) if
wt(e) \le t.
The algorithm uses the fact that the error locator polynomial can be written
as
    \bar{\sigma}_e(X) = \alpha^2(X) + X \beta^2(X).                                 (5)

Defining \tau(X) = \sqrt{S_z^{-1}(X) + X} mod g(X), with S_z(X) being the syndrome of
the distorted code word z, the following equation holds:

β(X)τ (X) = α(X) mod g(X) (6)

Then, assuming that no more than t errors occurred, Equation 6 can be solved
by applying the Euclidean algorithm with a breaking condition concerning the
degree of the remainder [19]. Specifically, the remainder in the last step is taken
as \alpha(X), and the breaking condition is deg(\alpha(X)) \le \lfloor t/2 \rfloor. It can be shown that
then deg(\beta(X)) \le \lfloor (t-1)/2 \rfloor.
From this it follows that the polynomial \bar{\sigma}_e(X) defined via Equation 5 will
be of degree \le t. In the case that the number of errors is no larger than t, it
follows from Equation 3 that deg(\bar{\sigma}_e(X)) = wt(e), since then \bar{\sigma}_e(X) = \sigma_e(X).
For the case of more than t errors, we give the following remark.

Remark 1. If wt(e) > t, then deg(\bar{\sigma}_e(X)) = t with probability 1 - 2^{-m}.

This remark can be justified easily: since \sigma_e(X) computed via Equation 3
would yield deg(\sigma_e(X)) = wt(e), we find that the calculation mod g(X) in
Equation 6 leads to polynomials \bar{\sigma}_e(X) of degree t with coefficients that we
can assume to be almost randomly distributed, where the leading coefficient is
not necessarily nonzero. But clearly, for random coefficients from F_{2^m}, the
probability that the leading coefficient is not zero is 1 - 2^{-m}, which amounts
to 0.9995 for m = 11. Furthermore, experimental results confirm the claim of
the remark.

3 Attack on the Degree of the Error Locator Polynomial


The dependence of the degree of the error locator polynomial σ̄e (X) on the
number of errors in the decoding algorithm, which we examined in Section 2.3,
can be used as a basis of a chosen-ciphertext side channel attack. We will describe
it as a pure timing attack, though it clearly could be supported by incorporating
analysis of the respective power traces.

3.1 The Timing Attack


When computing the error vector according to Equation 4, the error locator
polynomial is evaluated 2^m times. Clearly, in a naive implementation, the time
taken by the evaluation will increase with the degree of \bar{\sigma}_e(X).
The scenario for our attack is as follows: Alice encrypts a plaintext v to a ci-
phertext z = vGpub ⊕ e according to the algorithm described in Section 2.2. Eve
receives a copy of z, and mounts the side channel attack by submitting manip-
ulated ciphertexts z i to Bob, who applies the decryption algorithm according
to Section 2.2 to every single one of them. It is assumed that the decryption
algorithm makes use of the Patterson Algorithm. Eve is able to measure the ex-
ecution time of each decryption. In order to achieve a simple model, let us further
assume that the only cause of timing differences is the evaluation of the error
locator polynomial σe (X) in the Patterson Algorithm according to Equation 4.
The attack is described in Algorithm 1. Here, sparse vec (i) denotes the vector
with zeros as entries except for the i-th position having value 1, and the first
position being indexed by 0. The key idea is to flip the bit at position i in z,
resulting in z i , and then to find out whether the i-th position of e was zero
or one. This in turn can be derived from the running time of the decryption
algorithm on input z i , since σ̄e (X) will be of degree t − 1 if ei = 1, and of degree
t otherwise.
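Under the idealized model above, the attack logic can be simulated end to end. The following sketch is our own noise-free simulation — the "decryption time" is simply min(wt(e), t), standing in for the degree of \bar{\sigma}_e — and not the Java implementation attacked in Section 3.3; the parameters are toy values:

```python
import random

T = 4        # number of errors (toy parameter)
N_BITS = 32  # code length (toy parameter)

def decrypt_time(z, codeword, t):
    # Idealized timing model: the evaluation cost grows with deg(sigma_bar),
    # which is wt(e) when wt(e) <= t and (almost surely) t otherwise.
    wt = sum(a ^ b for a, b in zip(z, codeword))
    return min(wt, t)

def timing_attack(z, codeword, t):
    n = len(z)
    timings = []
    for i in range(n):
        zi = list(z)
        zi[i] ^= 1                           # flip position i (z XOR sparse_vec(i))
        timings.append((decrypt_time(zi, codeword, t), i))
    guess = [0] * n
    for _, i in sorted(timings)[:t]:         # the t smallest timings mark error bits
        guess[i] = 1
    return guess

codeword = [0] * N_BITS                      # stand-in for v * G_pub
e = [0] * N_BITS
for i in random.sample(range(N_BITS), T):
    e[i] = 1
z = [a ^ b for a, b in zip(codeword, e)]
print(timing_attack(z, codeword, T) == e)    # True in this noise-free model
```

Flipping an error bit yields t − 1 remaining errors (time t − 1); flipping a non-error bit yields t + 1 errors, capped at time t — so the t fastest decryptions identify e exactly.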

3.2 The Timing Attack in the Presence of a CCA2-Conversion


A conversion like Pointcheval's [21] or Kobara and Imai's [11] makes sure that
ciphertexts manipulated in the way described in Algorithm 1 will not be decrypted,
i.e. no plaintext will be output by the decryption device. This is ensured by a
respective check performed after the error vector e has been determined via the
Patterson Algorithm.
However, the possibility of our side channel attack is not affected by this fact,
since in the presence of the conversion the attacker will still find a substring of
the ciphertext which actually is equivalent to z = vGpub ⊕ e and choose this as
the target of his manipulations. Furthermore, the Patterson Algorithm will run

Algorithm 1. Timing Attack against the evaluation of σe (X)


Require: ciphertext z and the parameter t of the McEliece PKC.
Ensure: a guess e' of the error vector e used by Alice to encrypt z.
1: for i = 0 to n − 1 do
2:   Compute z_i = z ⊕ sparse vec (i).
3:   Take the time u_i as the mean of N measured decryption times where z_i is used
     as the input to the decryption device.
4: end for
5: Put the t smallest timings u_i into the set M.
6: return the vector e' with entries e'_i = 1 if u_i ∈ M, and all other entries zero.

through all its steps regardless of whether the ciphertext has been manipulated
or not. Only afterwards will the algorithm detect the manipulation and refuse
decryption.

3.3 Implementation of the Attack


We realized the attack against a software implementation of McEliece. Specif-
ically, our target was the implementation of the scheme in the FlexiProvider1,
which is a Java Cryptographic Extension (JCE) Provider. The implementation
uses the Patterson Algorithm in the decoding step of the decryption phase. For
simplicity, we did not include any CCA2-Conversion.
We executed the attack on an AMD Opteron 2218 CPU running at 2.6 GHz
under Linux 2.6.20 and Java 6 from Sun. A single attack with N = 2 took
less than 2 minutes, which makes it very effective and usable in a real-world
scenario. Even a remote attack against a TLS server using McEliece seems to be
possible.
The security parameters we used for the attack are m = 11 and t = 50. The
attack algorithm was realized just as depicted in Algorithm 1. With the choice of
N = 2, we recovered all positions of e correctly in half of the executed attacks.
The exact results can be found in Appendix C.

3.4 Proposed Countermeasure


The reason for the comparatively high efficiency of the attack is that the error
locator polynomial is evaluated 2^m times in the Patterson Algorithm. For the
security parameter m = 11, as in our example, these are 2048 evaluations. This
means that even a small difference in a single evaluation will be inflated to
considerable size.
In order to avoid the differences in the decryption time arising from the
different degrees of \bar{\sigma}_e(X), it is a straightforward countermeasure to simply raise
its degree artificially in the case that it is found to be lower than t. Note that,
furthermore, all coefficients in the polynomial of degree t have to be nonzero in
order to avoid timing differences.
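One way to realize this countermeasure is to always run t + 1 Horner steps and use a mask, rather than a secret-dependent branch, to decide whether a step's result is kept. The sketch below is our own illustration over a toy GF(2^3); Python gives no constant-time guarantees, and a real implementation would also have to classify the slots branch-free:

```python
M, MOD = 3, 0b1011          # toy field GF(2^3); illustrative parameters

def gf_mul(a, b):
    # Carry-less multiplication modulo the field polynomial
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & (1 << M):
            a ^= MOD
    return r

def eval_plain(coeffs, x):
    # Ordinary Horner evaluation (leaks deg(sigma) through its step count)
    r = 0
    for c in reversed(coeffs):
        r = gf_mul(r, x) ^ c
    return r

def eval_fixed_slots(coeffs, x, t):
    # Horner evaluation that always runs t+1 multiply-add steps.  Slots above
    # deg(sigma) use a dummy nonzero coefficient, and a mask decides whether
    # the step result is kept.  (In a real implementation the slot
    # classification below would itself have to be branch-free.)
    mask_all = (1 << M) - 1
    r = 0
    for idx in range(t, -1, -1):
        real = coeffs[idx] if idx < len(coeffs) else 0
        keep = mask_all if idx < len(coeffs) else 0   # all-ones for real slots
        c = (keep & real) | (~keep & mask_all & 1)    # dummy coefficient is 1
        p = gf_mul(r, x) ^ c                          # same work in every slot
        r = (keep & p) | (~keep & mask_all & r)       # keep p only if real
    return r

sigma = [1, 7, 1]            # X^2 + 7X + 1 over GF(2^3), degree 2 < t
print(all(eval_fixed_slots(sigma, x, 5) == eval_plain(sigma, x) for x in range(8)))
```

The padded evaluation performs t + 1 = 6 multiply-add steps for every input while returning the same value as the plain degree-2 evaluation.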
1 [Link]

3.5 Improvements of the Attack


The simple version of the timing attack provided in algorithm 1 already en-
abled successful attacks under idealized conditions. For real life scenarios, two
improvements of the attack are feasible.
– Once the attacker has found one position j with ej = 1, he can apply an im-
proved version of the attack. Specifically, he can then create the manipulated
ciphertexts
z i = z i ⊕ sparse vec (j)
for all i ≠ j and use them as input for the decryption device. As a result,
each z i will contain either t or t−2 errors. Where in algorithm 1 the attacker
had to distinguish between timings resulting from degrees of σ̄e (X) differing
by 1, this difference in degrees is now 2, resulting in an even higher difference
in the timings.
– In the attack given in Algorithm 1, it is already provided that the attacker
takes the average time of multiple decryptions of the same ciphertext in order
to decrease noise. Still, certain deterministic timing differences could arise
in the algorithm, causing certain timings u_p and u_q to differ considerably,
even though e_p = e_q.
However, once the attacker knows a number of error and non-error
positions, he can modify z_i in ways that leave the number of errors
constant. Each of these ciphertexts will contain the same number of errors
with respect to the Goppa code as z_i, but will cause the Patterson Algorithm
to start with a different syndrome. Thus, if the attacker averages over the
corresponding execution times, he can eliminate the possible timing differences
arising from certain syndromes.

4 Other Side Channels


The McEliece system contains several other operations that enable side channel
attacks if they are implemented in a straightforward manner, i.e., one-to-one
according to the algorithm specification. In this section, two of these critical
operations are presented: the setup of the parity check matrix H during key
generation and the computation of the vector zP during decryption.
The first one presents a potential side channel for power attacks [14], the second
one for cache attacks [22].

4.1 Generation of Parity Check Matrix


The parity check matrix H is generated by applying complex matrix operations
over F2m based on the secret polynomial g(Y ). According to [19] an element of
the check matrix H can be written as

    h_{i,j} = g(\gamma_{j-1})^{-1} \sum_{s=t-i+1}^{t} g_s \gamma_{j-1}^{s-t+i-1},   (7)

where i = 1, . . . , t and j = 1, . . . , n (see Appendix A).



Fig. 1. Execution Order: Polynomial Multiplication (the products g_s \gamma_{j-1}^{s-t+i-1} for the entries h_{1,1}, h_{2,1}, h_{3,1} of the first column of H are computed in sequence, followed by the same pattern for the second column; t = 3)

Inspecting this relation, two operations may be critical for power attacks [16]:
the polynomial evaluation for the field elements g(\gamma_{j-1}) and the multiplication
of the polynomial coefficients with the powers of the field elements, g_s \gamma_{j-1}^{s-t+i-1}.
Polynomial multiplication. Figure 1 shows schematically the multiplication
steps executed to calculate the first and second column of H. Here, we use t = 3
for simplicity. Remember that H has t rows and n columns.
From this figure it is evident that the multiplication steps and, thus, their
power traces reveal high regularity. An exact application of the above relation
results in multiplying g_3 by 1 once for each column of H. Obviously, the power
trace of these products may be used to indicate the start of the processing of
a new column, which is essential for power attacks. Furthermore, it is highly
probable that the power traces of g_2 \gamma_0 and g_2 \gamma_1 can be used to estimate the
secret coefficient g_2, as the \gamma_i are public.
To complicate this attack, the multiplications g_s \gamma_{j-1}^{s-t+i-1} must be performed in a
manner that does not leak information on g_s. This can be achieved (at least
partially) by masking. Each g_s is multiplied by a random value r_i \in F_{2^m} before
multiplying it by the field element \gamma_{j-1}. The de-masking using r_i^{-1} is performed
after calculating the sum:
    h_{i,j} = g(\gamma_{j-1})^{-1} r_i^{-1} \sum_{s=t-i+1}^{t} (r_i g_s) \gamma_{j-1}^{s-t+i-1}.   (8)

In the above equation, the parentheses denote in which order the evaluation shall
be performed.
This masking will be even more profitable if it is combined with a
randomization of the order of the term evaluations. By this means, the association of
power traces with points in time is blurred considerably.
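The identity behind Equation 8 — blinding each g_s with r_i, summing, and de-masking with r_i^{-1} — can be checked in a toy field. All parameters below (GF(2^3), the coefficient and power values) are illustrative choices of ours:

```python
import random

M, MOD = 3, 0b1011          # toy field GF(2^3); illustrative parameters

def gf_mul(a, b):
    # Carry-less multiplication modulo the field polynomial
    r = 0
    while b:
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & (1 << M):
            a ^= MOD
    return r

def gf_pow(a, e):
    r = 1
    while e:
        if e & 1:
            r = gf_mul(r, a)
        a = gf_mul(a, a)
        e >>= 1
    return r

def gf_inv(a):
    # a^(2^m - 2) = a^(-1) for nonzero a in GF(2^m)
    return gf_pow(a, (1 << M) - 2)

def masked_sum(gs, powers, r):
    # Each secret coefficient g_s is blinded by r before the multiplication
    # with the public power of gamma; de-masking happens once, after the sum.
    acc = 0
    for g, p in zip(gs, powers):
        acc ^= gf_mul(gf_mul(r, g), p)
    return gf_mul(gf_inv(r), acc)

def plain_sum(gs, powers):
    acc = 0
    for g, p in zip(gs, powers):
        acc ^= gf_mul(g, p)
    return acc

gs = [3, 5, 6]                       # stand-ins for secret coefficients g_s
powers = [1, 2, 4]                   # stand-ins for powers of gamma_{j-1}
r = random.randrange(1, 1 << M)      # random nonzero mask
print(masked_sum(gs, powers, r) == plain_sum(gs, powers))   # True
```

Because multiplication distributes over the XOR-sum in a field of characteristic 2, the factor r cancels exactly, so the masked computation returns the unmasked h_{i,j} for every nonzero mask.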
Polynomial evaluation. This operation is highly time-consuming and is, as
a rule, performed in a precomputation phase. The description in this section relates
to this precomputation. Referring to the definition of the Goppa polynomial
g(Y), its evaluation at a field element \gamma_j can be written as \sum_{i=0}^{t} g_i \gamma_j^i. This means
that polynomial evaluation amounts to multiplications over F_{2^m} with highly
regular patterns, which again presents a possible side channel for power attacks.
Fig. 2 depicts the chronological sequence of evaluating a polynomial of degree
t = 3 for two field elements in a straightforward implementation. Similar to
the case presented previously, countermeasures of masking and randomization
should be employed.

Fig. 2. Execution Order: Polynomial Evaluation (the terms g_0 \cdot 1, g_1 \gamma_0, g_2 \gamma_0^2, g_3 \gamma_0^3 of g(\gamma_0) are processed in sequence, followed by the same pattern for g(\gamma_1); t = 3)

Using polynomial evaluation as a power side channel is also possible in the
decryption phase, when the error vector is determined according to Equation 4.

4.2 Computation of the Vector zP


Having presented a possible power analysis attack scenario in Section 4.1, we will
now focus on a possible cache attack [22] scenario. Cache attacks, a specific type
of so-called microarchitectural attacks, have already been successfully mounted
against software implementations [23].
The decryption of a ciphertext may also leak the other private key part, the
permutation matrix P. We assume that the permutation matrix itself is not
stored directly as a matrix in the memory; it is rather implemented as some
lookup-table for the rows and columns to save memory. This lookup-table is
used in the decryption phase to compute zP and eP.
In a straightforward implementation one may calculate these permutations by
the following algorithm:

Algorithm 2. Permutation z' = zP
Require: lookup-table t_P for the private permutation matrix P, and ciphertext vector z ∈ F_2^n.
Ensure: The permutation z' = zP.
1: for i = 1 to n do
2:   Lookup j = t_P,i.
3:   Set z'_i = z_j.
4: end for
5: return permuted vector z'.

The code in Algorithm 2 will create memory accesses at addresses depending on
the secret permutation P. An attacker can use this to gain information about P.
Let us assume a scenario where the attacker has access to the system running the
decryption process, and where the CPU of the computer supports simultaneous
multithreading. The attacker executes a spy process parallel to the process of
the decryption application. Let us further assume that the attacker knows the
position of z in the main memory. Ideally, between any two iterations of the
loop in Algorithm 2, the spy process erases the content of z from the CPU cache
and fills the respective cache blocks with data of its own. It also regularly
performs memory access to this data, measuring the execution time for this
access.

From these timings, gathered while the decryption process was running in
parallel, the attacker will be able to judge with certain precision which part
of z was accessed during which iteration. Specifically, assume that for a certain
iteration the time taken by the memory access of the spy process to a certain
datum indicates a cache miss. Then the attacker knows that the decryption
process accessed just that part of z which was stored in the same cache block. Note
that the rule relating main memory addresses to cache blocks is system
dependent and thus known to the attacker.
Due to the fact that in general the size of a cache block will be larger than
one entry zi , usually the attacker will not be able to get the exact index of the
entry of z which has been accessed. Instead he will find out that for example an
entry between z0 and z31 must have been accessed. If the memory location of
z differs in different executions and does not always have the same offset from
the beginning of a cache block, the attacker might be able to narrow the access
down to a single entry of z.
In a weaker scenario, where the system running the decryption process does
not support simultaneous multithreading, the attacker will not be able to peek
into the decryption routine at every iteration, but with some probability the
operating system will perform a context switch, interrupting algorithm 2 and
continuing the spy process. In such a scenario the attack would be much harder,
but still not impossible, assuming the attacker can repeat the measurement often
enough.

Countermeasures. A possible countermeasure is to modify Algorithm 2 into an
algorithm whose memory accesses do not depend on the content of t_P. We have
implemented Algorithm 3, which satisfies this requirement. It has constant
running time, performs no jumps depending on secret input, and accesses only
memory addresses depending on public input. Therefore, it should be secure
against timing, cache, and branch prediction attacks [24]. Unfortunately, this
increases the running time from O(2^m) to O((2^m)^2). Here, the operators ∼, &,
>>, &=, |=, and − are used as in the C programming language.
The idea behind Algorithm 3 is the following: as in Algorithm 2, t_P,i is read in
line 2. Algorithm 2 would now read z_j and write it to z'_i. The write to z'_i is not
critical, because i is public, but j depends on P, and a read of z_j would reveal
information about j and therefore about P.
Algorithm 3 uses the following countermeasure. In line 3, z'_i is initialized with
0. In line 4, a new loop is started, where k runs from 0 to n − 1. In every iteration,
z'_i is read into l and z_k is read into m. Now, we have to distinguish between two
cases:
1. j = k: In this case, we want to write m = z_k = z_j to z'_i, as in Algorithm 2.
2. j ≠ k: In this case, we do not want to modify z'_i. But to create the same
   memory access as in case 1, we assign l = z'_i to z'_i, and therefore leave z'_i
   unchanged.
In order to do this without an if-then-else statement, the following trick is
used by Algorithm 3: the XOR-difference s between j and k is computed in

Algorithm 3. Secure permutation z' = zP
Require: lookup-table t_P for the private permutation matrix P, and ciphertext vector z ∈ F_2^n
Ensure: The permutation z' = zP
1: for i = 0 to n − 1 do
2:   j = t_P,i
3:   z'_i = 0
4:   for k = 0 to n − 1 do
5:     l = z'_i
6:     m = z_k
7:     s = j ⊕ k
8:     s |= s >> 1
9:     s |= s >> 2
10:    s |= s >> 4
11:    s |= s >> 8
12:    s |= s >> 16
13:    s &= 1
14:    s = ∼(s − 1)
15:    z'_i = (s & l) | ((∼s) & m)
16:  end for
17: end for
18: return z'

line 7. If the difference is 0, then j = k and we are in case 1. If the difference is
not 0, then we are in case 2.
Lines 8 to 14 now make sure that if s is not 0, all bits in s will be set to 1.
In this case, the expression (s & l) | ((∼s) & m) evaluates to l, and l is written to
z'_i in line 15.
If s was 0 after line 7, s will still be 0 after line 14. Then the expression
(s & l) | ((∼s) & m) evaluates to m, and m is written to z'_i in line 15.
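The select trick of lines 7 to 15 can be exercised in isolation. The following sketch models a 32-bit word with Python integers (the function name and word width are our choices; Python itself gives no timing guarantees, so this only demonstrates the logic):

```python
W = 0xFFFFFFFF          # model a 32-bit unsigned word

def ct_select(j, k, l, m):
    # Returns l if j != k, and m if j == k, without branching on j or k.
    s = (j ^ k) & W
    s |= s >> 1          # lines 8-12: OR-fold every bit of s into bit 0
    s |= s >> 2
    s |= s >> 4
    s |= s >> 8
    s |= s >> 16
    s &= 1               # line 13: s = 1 iff j != k
    s = (~(s - 1)) & W   # line 14: all-ones iff j != k, zero iff j == k
    return (s & l) | ((~s & W) & m)   # line 15

print(ct_select(3, 3, 0xAAAA, 0x5555))   # j == k: takes m -> 0x5555
print(ct_select(3, 7, 0xAAAA, 0x5555))   # j != k: keeps l -> 0xAAAA
```

Either way the same instructions touch the same operands; only the mask decides which input survives.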

5 Conclusion

In this paper we have shown that the McEliece PKC, like most known public-key
cryptosystems, bears a high risk of leaking secret information through side
channels if the implementation does not feature appropriate countermeasures.
We have detailed a timing attack, which was also implemented and executed
against an existing software implementation of the cryptosystem. Our results
show the high vulnerability of an implementation without countermeasures.
Furthermore, we presented a feasible power attack against the key generation
phase, where certain operations involve the same secret value repeatedly. In
general, key generation is a more difficult target for a side channel attack than
decryption, because, in contrast to that operation, the attacker can only perform
one measurement. But our considerations show that, without countermeasures,
an implementation of the key generation might be vulnerable to a sophisticated
power attack.

The cache attack designed to reveal the permutation that is part of the secret
key again benefits from the fact that the number of measurements the attacker
may perform is in principle unrestricted. Thus the proposed secure algorithm
seems to be an important countermeasure for software implementations
intended for use in a multi-user operating system.
Clearly, other parts of the cryptosystem need to be inspected with the same
accuracy. This is especially true for the decryption phase, where the secret Goppa
polynomial is employed in different operations.
The McEliece PKC, though existing for 30 years, has not experienced wide
use so far. But since it is one of the candidates for post-quantum public-key
cryptosystems, it might become practically relevant in the near future. With
our work, besides the specific problems and solutions we present, we want to
demonstrate that, with the experience gathered in recent work exposing the
vulnerabilities of other cryptosystems, it is possible to identify the potential side
channels in a cryptosystem before it becomes commonly adopted.

References
1. Diffie, W., Hellman, M.: New directions in cryptography. IEEE Transactions on
Information Theory 22(6), 644–654 (1976)
2. Rivest, R.L., Shamir, A., Adleman, L.: A method for obtaining digital signatures
and public-key cryptosystems. Communications of the ACM 21(2), 120–126 (1978)
3. Miller, V.S.: Use of Elliptic Curves in Cryptography. In: Williams, H.C. (ed.)
CRYPTO 1985. LNCS, vol. 218, pp. 417–426. Springer, Heidelberg (1986)
4. ElGamal, T.: A Public Key Cryptosystem and a Signature Scheme Based on Discrete
Logarithms. IEEE Transactions on Information Theory (1985)
5. Shor, P.W.: Algorithms for quantum computation: discrete logarithms and factor-
ing. In: Proceedings, 35-th Annual Symposium on Foundation of Computer Science
(1994)
6. Shor, P.W.: Polynomial time algorithms for prime factorization and discrete log-
arithms on a quantum computer. SIAM Journal on Computing 26(5), 1484–1509
(1997)
7. Proos, J., Zalka, C.: Shor’s discrete logarithm quantum algorithm for elliptic curves,
Technical Report quant-ph/0301141, arXiv (2006)
8. Merkle, R.: A Certified Digital Signature. In: Proceedings of the 9th Annual Inter-
national Cryptology Conference on Advances in Cryptology, pp. 218–238 (1989)
9. Buchmann, J., Garcia, L., Dahmen, E., Doering, M., Klintsevich, E.: CMSS-An Im-
proved Merkle Signature Scheme. In: 7th International Conference on Cryptology
in India-Indocrypt, vol. 6, pp. 349–363 (2006)
10. McEliece, R.J.: A public key cryptosystem based on algebraic coding theory. DSN
progress report 42-44, 114–116 (1978)
11. Kobara, K., Imai, H.: Semantically secure McEliece public-key cryptosystems -
conversions for McEliece PKC. In: Kim, K.-c. (ed.) PKC 2001. LNCS, vol. 1992.
Springer, Heidelberg (2001)
12. Menezes, A., van Oorschot, P., Vanstone, S.: Handbook of Applied Cryptography.
CRC Press, Boca Raton (1996)
13. Kocher, P.: Timing Attacks on Implementations of Diffie-Hellman, RSA, DSS, and
Other Systems. In: Proceedings of the 16th Annual International Cryptology Con-
ference on Advances in Cryptology, pp. 104–113 (1996)

14. Kocher, P.: Differential Power Analysis. In: Wiener, M. (ed.) CRYPTO 1999.
LNCS, vol. 1666, pp. 388–397. Springer, Heidelberg (1999)
15. Tsunoo, Y., Tsujihara, E., Minematsu, K., Miyauchi, H.: Cryptanalysis of Block
Ciphers Implemented on Computers with Cache. In: International Symposium on
Information Theory and Applications, pp. 803–806 (2002)
16. Schindler, W., Lemke, K., Paar, C.: A Stochastic Model for Differential Side Chan-
nel Cryptanalysis. In: Rao, J.R., Sunar, B. (eds.) CHES 2005. LNCS, vol. 3659,
pp. 30–46. Springer, Heidelberg (2005)
17. MacWilliams, F.J., Sloane, N.J.A.: The theory of error correcting codes. North-
Holland, Amsterdam (1997)
18. Goppa, V.D.: A new class of linear correcting codes. Problems of Information
Transmission 6, 207–212 (1970)
19. Engelbert, D., Overbeck, R., Schmidt, A.: A Summary of McEliece-Type Cryp-
tosystems and their Security. Journal of Mathematical Cryptology (2006) (accepted
for publication)
20. Canteaut, A., Chabaud, F.: A new algorithm for finding minimum-weight words
in a linear code: application to primitive narrow-sense BCH-codes of length 511.
IEEE Transactions on Information Theory 44(1), 367–378 (1998)
21. Pointcheval, D.: Chosen-ciphertext security for any one-way cryptosystem. In: Imai,
H., Zheng, Y. (eds.) PKC 2000. LNCS, vol. 1751, pp. 129–146. Springer, Heidelberg
(2000)
22. Percival, C.: Cache missing for fun and profit,
[Link]
23. Schindler, W., Acıiçmez, O.: A Vulnerability in RSA Implementations due to In-
struction Cache Analysis and its Demonstration on OpenSSL. In: Malkin, T. (ed.)
CT-RSA 2008. LNCS, vol. 4964, Springer, Heidelberg (2008)
24. Acıiçmez, O., Seifert, J.P., Koç, Ç.: Predicting secret keys via branch prediction.
In: Abe, M. (ed.) CT-RSA 2007. LNCS, vol. 4377. Springer, Heidelberg (2007)

A Parity Check Matrix and Generator of an Irreducible Binary Goppa Code
The parity check matrix H of a Goppa code with Goppa polynomial g can be
computed as follows. H = XYZ, where
    X = \begin{pmatrix}
        g_t     & 0      & 0      & \cdots & 0   \\
        g_{t-1} & g_t    & 0      & \cdots & 0   \\
        \vdots  & \vdots & \vdots & \ddots & \vdots \\
        g_1     & g_2    & g_3    & \cdots & g_t
        \end{pmatrix}, \quad
    Y = \begin{pmatrix}
        1              & 1              & \cdots & 1 \\
        \gamma_0       & \gamma_1       & \cdots & \gamma_{n-1} \\
        \vdots         & \vdots         & \ddots & \vdots \\
        \gamma_0^{t-1} & \gamma_1^{t-1} & \cdots & \gamma_{n-1}^{t-1}
        \end{pmatrix},

    Z = diag( 1/g(\gamma_0), 1/g(\gamma_1), \ldots, 1/g(\gamma_{n-1}) ).
Here diag(...) denotes the diagonal matrix with the entries specified in the
argument. H is a t × n matrix with entries in the field F_{2^m}.
As for any error correcting code, the parity check matrix allows for the
computation of the syndrome of a distorted code word:

    S_z(Y) = z H^T (Y^{t-1}, \cdots, Y, 1)^T.

The multiplication with (Y^{t-1}, \cdots, Y, 1)^T is used to turn the coefficient vector
into a polynomial in F_{2^{mt}}.
The generator of the code is constructed from the parity check matrix in the
following way:
The generator of the code is constructed from the parity check matrix in the
following way:
Transform the t × n matrix H over F2m into an mt × n matrix H2 over F2 by
expanding the rows. Then, find an invertible matrix S such that
" #
S · H2 = Imt | R ,
i.e., bring H into a systematic form using the Gauss algorithm. Here, Ix is the
x × x identity matrix. Now take G = [Ik | R] as the public key. G is a k × n
matrix over F2 , where k = n − mt.
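The Gauss elimination step can be sketched for a toy binary matrix, with rows stored as integer bitmasks (an encoding of our choosing; the sketch assumes the leading columns are full rank, whereas a real key generation would permute columns or restart when no pivot is found):

```python
def systematize(rows, m, n):
    # Bring an m x n binary matrix (rows as bitmasks, bit i = column i) into
    # the form [I_m | R] by Gaussian elimination over F_2, as described above.
    # Assumes a pivot exists in each of the first m columns (full rank there).
    rows = list(rows)
    for col in range(m):
        piv = next(r for r in range(col, m) if (rows[r] >> col) & 1)
        rows[col], rows[piv] = rows[piv], rows[col]   # bring pivot row up
        for r in range(m):
            if r != col and (rows[r] >> col) & 1:
                rows[r] ^= rows[col]                  # clear column elsewhere
    return rows

# 2 x 4 toy parity check matrix over F_2
H2 = [0b1011, 0b0110]
S_H2 = systematize(H2, 2, 4)
for r in range(2):
    assert (S_H2[r] & 0b11) == (1 << r)   # identity block in the first columns
print([bin(r) for r in S_H2])
```

Row operations over F_2 are plain XORs, so the invertible matrix S of the text is realized implicitly by the sequence of swaps and XORs.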

B The Extended Euclidean Algorithm (XGCD)


The extended Euclidean algorithm can be used to compute the greatest common
divisor (gcd) of two polynomials [17].
In order to compute the gcd of two polynomials r_{-1}(Y) and r_0(Y) with
deg(r_0(Y)) ≤ deg(r_{-1}(Y)), we make repeated divisions to find the following
sequence of equations:

    r_{-1}(Y) = q_1(Y) r_0(Y) + r_1(Y),        deg(r_1) < deg(r_0),
    r_0(Y)  = q_2(Y) r_1(Y) + r_2(Y),          deg(r_2) < deg(r_1),
    ...
    r_{i-2}(Y) = q_i(Y) r_{i-1}(Y) + r_i(Y),   deg(r_i) < deg(r_{i-1}),
    r_{i-1}(Y) = q_{i+1}(Y) r_i(Y).

Then r_i(Y) is the gcd of r_{-1}(Y) and r_0(Y).
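For polynomials over F_2, the division steps above reduce to shift-and-XOR operations. The following minimal sketch stores polynomials as integer bitmasks (an encoding of our choosing, with bit i holding the coefficient of Y^i):

```python
def deg(p):
    # Degree of a GF(2)[Y] polynomial stored as a bitmask (deg(0) = -1)
    return p.bit_length() - 1

def poly_mod(a, b):
    # Remainder of a divided by b: repeatedly cancel the leading term of a
    while deg(a) >= deg(b):
        a ^= b << (deg(a) - deg(b))
    return a

def poly_gcd(a, b):
    # Euclid's algorithm, following the division sequence above
    while b:
        a, b = b, poly_mod(a, b)
    return a

# (Y+1)^2 (Y^2+Y+1) = 0b11011 and (Y+1)(Y^3+Y+1) = 0b11101 share the factor Y+1
print(bin(poly_gcd(0b11011, 0b11101)))   # -> 0b11, i.e. Y + 1
```

The Patterson Algorithm of Section 2.3 uses the same division sequence over F_{2^m}[X], stopped early at the degree condition on the remainder rather than run to the gcd.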

C Experimental Results for the Timing Attack


Here, we show the experimentally determined probabilities (see Section 3.3) for
the respective numbers of correctly guessed error positions, where e' denotes the
guessed error vector.

                              N = 1    N = 2
    Prob(wt(e' ⊕ e) ≤ 0)       0%      48%
    Prob(wt(e' ⊕ e) ≤ 2)       0%      77%
    Prob(wt(e' ⊕ e) ≤ 4)       0%      96%
    Prob(wt(e' ⊕ e) ≤ 6)       4%      99%
    Prob(wt(e' ⊕ e) ≤ 8)       9%      99%
    Prob(wt(e' ⊕ e) ≤ 10)     16%     100%
    Prob(wt(e' ⊕ e) ≤ 12)     22%     100%
    Prob(wt(e' ⊕ e) ≤ 14)     32%     100%
    Prob(wt(e' ⊕ e) ≤ 16)     46%     100%
    Prob(wt(e' ⊕ e) ≤ 18)     60%     100%
    Prob(wt(e' ⊕ e) ≤ 20)     74%     100%
    Prob(wt(e' ⊕ e) ≤ 22)     83%     100%
    Prob(wt(e' ⊕ e) ≤ 24)     89%     100%