Not Enough LESS: An Improved Algorithm for Solving Code Equivalence Problems over Fq

Ward Beullens

imec-COSIC, KU Leuven, Leuven, Belgium
[email protected]

Abstract. Recently, a new code-based signature scheme, called LESS, was proposed with three concrete instantiations, each aiming to provide 128 bits of classical security [3]. Two instantiations (LESS-I and LESS-II) are based on the conjectured hardness of the linear code equivalence problem, while a third instantiation, LESS-III, is based on the conjectured hardness of the permutation code equivalence problem for weakly self-dual codes. We give an improved algorithm for solving both these problems over sufficiently large finite fields. Our implementation breaks LESS-I and LESS-III in approximately 25 s and 2 s respectively on an Intel i5-8400H CPU. Since the field size for LESS-II is relatively small (F7), our algorithm does not improve on existing methods. Nonetheless, we estimate that LESS-II can be broken with approximately 2^44 row operations.

Keywords: Permutation code equivalence problem · Linear code equivalence problem · Code-based cryptography · Post-quantum cryptography

1 Introduction
Two q-ary linear codes C1 and C2 of length n and dimension k are called permutation equivalent if there exists a permutation π ∈ Sn such that π(C1) = C2. Similarly, if there exists a monomial permutation μ ∈ Mn = (Fq^×)^n ⋊ Sn such that μ(C1) = C2, the codes are said to be linearly equivalent (a monomial permutation acts on vectors in Fq^n by permuting the entries and also multiplying each entry with a unit of Fq). The problem of finding π ∈ Sn (or μ ∈ Mn) given equivalent C1 and C2 is called the permutation equivalence problem (or linear equivalence problem respectively)¹.

¹ There also exists a more general notion of equivalence called semi-linear equivalence. Our methods generalize to semi-linear equivalences, but since this is not relevant for the security of LESS, we do not elaborate on this.

This work was supported by CyberSecurity Research Flanders with reference number VR20192203 and the Research Council KU Leuven grants C14/18/067 and STG/17/019. Ward Beullens is funded by FWO SB fellowship 1S95620N.

© Springer Nature Switzerland AG 2021. O. Dunkelman et al. (Eds.): SAC 2020, LNCS 12804, pp. 387–403, 2021. https://doi.org/10.1007/978-3-030-81652-0_15

Definition 1 (Permutation Code Equivalence Problem). Given generator matrices of two permutation equivalent codes C1 and C2, find a permutation π ∈ Sn such that C2 = π(C1).

Definition 2 (Linear Code Equivalence Problem). Given generator matrices of two linearly equivalent codes C1 and C2, find a monomial permutation μ ∈ Mn such that C2 = μ(C1).

The hardness of the permutation equivalence problem is relevant for the security
of the McEliece and Girault post-quantum cryptosystems [7,10]. More recently,
Biasse, Micheli, Persichetti, and Santini proposed a new code-based signature
scheme whose security only relies on the hardness of the linear code equivalence
problem or permutation code equivalence problem. The public key consists of
generator matrices for two equivalent codes C1 and C2 , and a signature is a zero-
knowledge proof of knowledge of an equivalence μ ∈ Mn (or π ∈ Sn ) such that
μ(C1 ) = C2 (or π(C1 ) = C2 respectively). In the case of permutation equivalence,
the codes C1 and C2 are chosen to be weakly self-dual, because otherwise π can
be recovered in polynomial time [12]. Table 1 shows the proposed parameter sets
for LESS, aiming for 128 bits of security.

Table 1. Proposed parameter sets for the LESS signature scheme.

Parameter set   n    k   q   Equivalence
LESS-I          54   27  53  Linear
LESS-II         106  45  7   Linear
LESS-III        60   25  31  Permutation

1.1 Previous Work

We will briefly go over some of the algorithms that have been proposed for the
permutation and linear code equivalence problems below. The state of the art
for the permutation code equivalence problem is that random instances can be
solved in polynomial time with the Support Splitting Algorithm (SSA), but that
instances with codes that have large hulls require a runtime that is exponential
in the dimension of the hull. Weakly self-dual codes (these are codes C such that
C ⊂ C ⊥ ) have hulls of maximal dimension dim(H(C)) = dim(C) and are believed
to be the hardest instances of the permutation equivalence problem. The state
of the art for the linear code equivalence problem is that instances over Fq with
q ≤ 4 can be solved in polynomial time with the SSA algorithm via a reduction
to the permutation equivalence problem, but for q > 4 this reduction results in
codes with a large hull, which means the SSA algorithm is not efficient. Hence,
the linear code equivalence problem is conjectured to be hard on average for
q > 4 [13].

Leon’s Algorithm. Leon’s algorithm [9] for finding linear and permutation
equivalences relies on the observation that applying a permutation or a monomial
permutation does not change the Hamming weight of a codeword. Therefore, if
we compute the sets X1 and X2 of all the minimal-weight codewords of C1
and C2 respectively, then it must be that X2 = π(X1 ) or X2 = μ(X1 ) in the
case of permutation equivalence or linear equivalence respectively. Leon gives an
algorithm to compute a μ ∈ Mn that satisfies X2 = μ(X1 ) with a time complexity
that is polynomial in |X1 |. Usually, the sets X1 and X2 have “enough structure”,
such that if μ satisfies X2 = μ(X1 ), then also C2 = μ(C1 ) with non-negligible
probability. If this is not the case, then one can also consider larger sets X1 and
X2 that contain all the codewords in C1 and C2 respectively whose weight is one
more than the minimal weight. Since the sets X1 and X2 are usually small, the
complexity of the algorithm is dominated by the complexity of computing X1
and X2 .
Feulner gives an algorithm that computes a canonical representative of an
equivalence class of codes. The complexity of this algorithm is close to that of
Leon’s algorithm [6].

Support Splitting Algorithm. The Support Splitting Algorithm (SSA) of Sendrier [12] defines the concept of a signature. A signature is a property of a position in a code that is invariant under permutations. More precisely, it is a function S that takes a code C and a position i ∈ {1, · · · , n} as input and outputs an element of an output space P, such that for any permutation π ∈ Sn we have

S(C, i) = S(π(C), π(i)) .

We say that a signature is totally discriminant for C if i ≠ j implies that S(C, i) ≠ S(C, j). If a signature S is efficiently computable and totally discriminant for a code C1, then one can easily solve the permutation equivalence problem by computing S(C1, i) and S(C2, i) for all i ∈ {1, · · · , n} and comparing the outputs. Even if the signature is not totally discriminant, a sufficiently discriminant signature can still be used to solve the permutation equivalence problem by iteratively refining the signature.
The SSA uses the concept of the hull of a code to construct an efficiently computable signature. The hull of a code C is the intersection of the code with its dual: H(C) = C ∩ C⊥. This concept is very useful in the context of the permutation equivalence problem because taking the hull commutes with applying a permutation²:

H(π(C)) = π(C) ∩ π(C)⊥ = π(C ∩ C⊥) = π(H(C)) .

² This is not the case for monomial permutations: H(μ(C)) is not necessarily equal to μ(H(C)) for μ ∈ Mn. This is why the SSA cannot be directly applied to find linear equivalences.

The SSA defines a signature as S(C, i) := W(H(Ci)), where Ci is the code C punctured at position i, and W(C) denotes the weight enumerator polynomial of the code C. While this signature is typically not fully discriminant, it is still discriminant enough to efficiently solve the permutation equivalence problem for random instances. However, a limitation of the SSA algorithm is that computing the weight enumerator of the hull is not efficient when the hull of C is large. For random codes this is not a problem because typically the hull is small.

Algebraic Approach. The code equivalence problems can be solved alge-


braically, by expressing the condition π(C1 ) = C2 or μ(C1 ) = C2 as a system
of polynomial equations, and trying to solve this system with Gröbner basis
methods [11]. Similar to the SSA algorithm, this solves the permutation code
equivalence problem for random instances in polynomial time, but the complex-
ity is exponential in the dimension of the hull. The approach also works for the
linear code equivalence problem, but it is only efficient for q ≤ 4.

1.2 Our Contributions


In this paper, we propose an improvement on Leon’s algorithm for code equiv-
alence that works best over sufficiently large finite fields. If x ∈ C1 and
y = π(x) ∈ C2 , then the multiset of entries of x matches the multiset of entries
of y. Our algorithm is based on the observation that if the size of the finite
field is large enough then the implication also holds in the other direction with
large probability: If x ∈ C1 and y ∈ C2 are low-weight codewords with the same
multiset of entries, then with large probability π(x) = y. Our algorithm does a
collision search to find a small number of such pairs (x, y = π(x)), from which
one can easily recover π. We also give a generalization of this idea that works
for the linear equivalence problem.
We implemented our algorithm and used it to break the LESS signature scheme. In the LESS-I and LESS-III parameter sets the finite field is large enough for our algorithm to improve on Leon's algorithm. We show that we can recover a LESS-I or LESS-III secret key in only 25 s or 2 s respectively. We estimate that recovering the secret key is also possible in practice with Leon's algorithm, but it would be significantly more costly. LESS-II works over F7, which is too small for our algorithm to improve on Leon's algorithm: we estimate that our algorithm requires approximately 2^50.4 row operations, while Leon's algorithm would take only 2^43.9 row operations.

2 Preliminaries
2.1 Notation
For a q-ary linear code C of length n and dimension k we say a matrix G ∈ Fq^{k×n} is a generator matrix for C if C = ⟨G⟩, where ⟨G⟩ denotes the span of the rows of G. Similarly, we say that a matrix H ∈ Fq^{(n−k)×n} is a parity check matrix for C if C⊥ = ⟨H⟩, where C⊥ = {x ∈ Fq^n | x · y = 0 ∀y ∈ C} is the dual code of C. For a vector x ∈ Fq^n we denote by Supp(x) the set of indices of the nonzero entries of x, i.e. Supp(x) = {i | xi ≠ 0}. We define the support Supp(C) of a code C to be the union of the supports of the codewords in C. We let wt(x) = |Supp(x)| be the Hamming weight of x. We denote by Bn(w) the Hamming ball with radius w, i.e. the set of vectors x ∈ Fq^n with wt(x) ≤ w. For a permutation π ∈ Sn and a vector x of length n, we write π(x) for the vector obtained by permuting the entries of x with the permutation π, that is, we have (π(x))_i = x_{π(i)} for all i ∈ {1, · · · , n}. For a monomial permutation μ = (ν, π) ∈ Mn = (Fq^×)^n ⋊ Sn and a vector x ∈ Fq^n, we write μ(x) to denote the vector obtained by applying μ to the entries of x. Concretely, we have (μ(x))_i = ν_i · x_{π(i)} for all i ∈ {1, · · · , n}. For a code C and π ∈ Sn (or μ ∈ Mn), we denote by π(C) (or μ(C)) the code that consists of permutations (or monomial permutations respectively) of codewords in C.
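
To make the notation concrete, the following small Python sketch (ours, not part of the LESS implementation; names are purely illustrative) applies a permutation and a monomial permutation to a vector over Fq, following the conventions above.

# Illustrative sketch of the notation above (not from the LESS codebase).
# A permutation pi is encoded as a list with (pi(x))_i = x_{pi[i]}; a monomial
# permutation mu = (nu, pi) additionally scales each entry by a unit nu_i.

def apply_permutation(pi, x):
    """Return pi(x), i.e. the vector with (pi(x))_i = x_{pi(i)}."""
    return [x[pi[i]] for i in range(len(x))]

def apply_monomial(nu, pi, x, q):
    """Return mu(x) for mu = (nu, pi), i.e. (mu(x))_i = nu_i * x_{pi(i)} mod q."""
    return [(nu[i] * x[pi[i]]) % q for i in range(len(x))]

# Example over F_5: x = (1, 0, 3), pi swaps positions 0 and 2, nu = (2, 1, 4).
x, pi, nu = [1, 0, 3], [2, 1, 0], [2, 1, 4]
print(apply_permutation(pi, x))      # [3, 0, 1]
print(apply_monomial(nu, pi, x, 5))  # [1, 0, 4]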

2.2 Information Set Decoding

The algorithms in this paper will make use of information set decoding (ISD) to find sparse vectors in q-ary linear codes. In particular, we will use the Lee-Brickell algorithm with parameter p = 2. To find low-weight codewords in a code C = ⟨M⟩ the algorithm repeatedly computes the echelon form of M with respect to a random choice of k pivot columns. Then, the algorithm inspects all the linear combinations of p = 2 rows of the matrix. Given the echelon form of the matrix, we are guaranteed that all these linear combinations have weight at most n − k + 2, but if we are lucky we will find codewords that are even sparser. We repeat this until a sufficiently sparse codeword is found.

Complexity of the Algorithm. The complexity of the algorithm depends on the length n and the dimension k of the code, the target weight w, and whether we want to find a single codeword, all the codewords, or a large number N of codewords. First, suppose there is a distinguished codeword x ∈ C with weight w that we want to find. For a random choice of pivot columns, the Lee-Brickell algorithm will output x if the support of x intersects the set of pivot columns (also known as the information set) in exactly 2 positions. The probability that this happens is

P∞(n, k, w) := \binom{n−k}{w−2} \binom{k}{2} / \binom{n}{w} .

Therefore, since the cost of each iteration is k² row operations for the Gaussian elimination and \binom{k}{2}·q row operations to iterate over all the linear combinations of 2 rows (up to multiplication by a constant), the algorithm will find x after approximately

C∞(n, k, w) = (k² + \binom{k}{2}·q) · P∞(n, k, w)^{−1} = O( q·\binom{n}{w} / \binom{n−k}{w−2} )

row operations. Heuristically, for random codes we expect the supports of the different codewords to behave as if they are "independent", so if there exist (q − 1)N codewords of weight w (i.e. N different codewords up to multiplication by a scalar), then we expect the probability that one iteration of the Lee-Brickell algorithm succeeds to be

P1(n, k, w) = 1 − (1 − P∞(n, k, w))^N .

Thus, if N is small enough, we have P1(n, k, w) ≈ N·P∞(n, k, w), and the complexity of finding a single weight-w codeword is C1(n, k, w) ≈ C∞(n, k, w)/N.
Finally, if the goal is to find L out of the N distinct weight-w codewords (up to multiplication by a scalar), the cost of finding the first codeword is C1 = C∞(n, k, w)/N, the cost of finding the second codeword is C∞(n, k, w)/(N − 1), the cost of finding the third codeword is C∞(n, k, w)/(N − 2), and so on. Summing up these costs, we get that the cost of finding L distinct codewords is

C_L(n, k, w) ≈ C∞(n, k, w) · \sum_{i=0}^{L−1} 1/(N − i) .

Therefore, if L ≪ N, we can estimate C_L ≈ C∞·L/N, and if the goal is to find all the codewords, we get C_N ≈ C∞·ln(N), where ln denotes the natural logarithm, because \sum_{i=1}^{N} 1/i ≈ ln(N).
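
A small script (ours, analogous in spirit to the estimation script mentioned in Sect. 3, but not identical to it) that evaluates these cost formulas directly might look as follows.

# Evaluate the Lee-Brickell cost estimates from this section (illustrative sketch).
from math import comb, log

def P_inf(n, k, w):
    """Probability that one iteration finds a fixed weight-w codeword."""
    return comb(n - k, w - 2) * comb(k, 2) / comb(n, w)

def C_inf(n, k, w, q):
    """Expected row operations to find one distinguished weight-w codeword."""
    return (k**2 + comb(k, 2) * q) / P_inf(n, k, w)

def C_L(n, k, w, q, N, L):
    """Expected row operations to find L out of N weight-w codewords (up to scaling)."""
    return C_inf(n, k, w, q) * sum(1.0 / (N - i) for i in range(L))

# Finding all N codewords costs roughly C_inf * ln(N):
# C_L(n, k, w, q, N, N) is approximately C_inf(n, k, w, q) * log(N).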

3 New Algorithm for Permutation Equivalences over Fq

In this section, we introduce an algorithm for the permutation equivalence problem over sufficiently large fields Fq (which is the case for the LESS-III parameter set). The complexity of the algorithm is independent of the size of the hull of the equivalent codes. Therefore, our algorithm can be used to find equivalences when the hull is so large that using the SSA algorithm becomes infeasible. The complexity of the algorithm is better than that of Leon's algorithm when the size of the finite field is sufficiently large.

Main Idea. Leon's algorithm computes the sets X1 = C1 ∩ Bn(wmin) and X2 = C2 ∩ Bn(wmin), where wmin is the minimal weight of codewords in C1, and solves the code equivalence problem by looking for π ∈ Sn such that π(X1) = X2. The bottleneck of Leon's algorithm is computing X1 = C1 ∩ Bn(wmin) and X2 = C2 ∩ Bn(wmin), so if we want to improve the complexity of the attack we need to avoid computing all of X1 and X2. An easy observation is that permuting a codeword x does not only preserve its Hamming weight, but also the multiset of entries of x. Therefore, if there is an element x ∈ X1 with a unique multiset, then one can immediately see to which vector y = π(x) ∈ X2 it gets mapped. For example, if X1 contains the vector

x = (0 4 0 0 7 4 0 0 0 14),

and if X2 contains the vector

y = (0 4 7 0 0 14 0 4 0 0),

then, assuming there are no other vectors in X1 and X2 with the same multiset of entries, we know that π(x) = y. In particular, we know that π(5) = 3, π(10) = 6 and π(2), π(6) ∈ {2, 8}.
Instead of computing all of X1 and X2, our algorithm will search for a small number of such pairs (x, y = π(x)), which will give enough information to determine π. This approach will not work if Fq is too small, because then X1 will contain a lot of vectors with the same multiset of entries. (E.g. in the case q = 2, all the vectors with the same weight have the same multiset of entries.)
If we compute only Θ(√(|X1| log n)) elements of X1 and X2, then we expect to find Θ(log n) pairs (x, y = π(x)), which suffices to recover π. This speeds up the procedure by a factor Θ(√(|X1|/log n)), which is only a small factor. We can improve this further by considering larger sets X1 = C1 ∩ Bn(w) and X2 = C2 ∩ Bn(w) for a weight w that is not minimal. In the most favorable case, where the multisets of the vectors in Xi are still unique for w = n − k + 1, we can sample from X1 and X2 in polynomial time using Gaussian elimination, and we get an algorithm that runs in time Õ(√(\binom{n}{k−1})), where Õ is the usual big-O notation but ignoring polynomial factors.

Description of the Algorithm. The algorithm works as follows:

1. Let w be maximal subject to (n!/(n−w)!) · q^{−n+k} < 1/(4 log n) and w ≤ n − k + 1.
2. Repeatedly use information set decoding to generate a list L that contains √(|Bn(w)| · q^{−n+k−1} · 2 log n) pairs of the form (x, lex(x)), where x ∈ C1 ∩ Bn(w) and where lex(x) is the lexicographically first element of the set {π(αx) | π ∈ Sn, α ∈ Fq^×} (a small sketch of computing lex is given after this list).
3. Initialize an empty list P and repeatedly use information set decoding to generate y ∈ C2 ∩ Bn(w). If there is a pair (x, lex(x)) in L such that lex(x) = lex(y), then append (x, y) to P. Continue until P has 2 log(n) elements.
4. Use a backtracking algorithm to iterate over all permutations π that satisfy π(x) = y for all (x, y) ∈ P until a permutation is found that satisfies π(C1) = C2.
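
The canonical form lex used in step 2 admits a very short implementation: for each unit α the lexicographically first permutation of αx is simply its entries sorted in ascending order, so it suffices to take the minimum over α. The following Python sketch (ours, illustrative) makes this explicit.

# Illustrative sketch of lex(x): the lexicographically first element of
# {pi(alpha*x) | pi in S_n, alpha in F_q^x}, for a prime field F_q.
def lex(x, q):
    return min(tuple(sorted((alpha * e) % q for e in x)) for alpha in range(1, q))

# Two codewords x in C1 and y in C2 with lex(x, q) == lex(y, q) are, with high
# probability over a sufficiently large field, related by y = pi(x).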

Heuristic Analysis of the Algorithm. Heuristically, we expect that for x ∈ C1 ∩ Bn(w) the probability that there exists x′ ∈ C1 ∩ Bn(w) such that x′ ≠ x and lex(x) = lex(x′) is bounded by (n!/(n−w)!) · q^{−n+k}, because there are at most n!/(n−w)! values of x′ (up to multiplication by a unit) for which lex(x′) = lex(x) (namely all the permutations of x), and each of these vectors is expected to be in C1 with probability q^{−(n−k)}. In step 1 of the algorithm we choose w such that the probability estimate that x is part of such a collision in lex is at most 1/(4 log n).
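
To make the choice of w in step 1 concrete, the following small helper (ours) evaluates the condition above on a log scale. The base of the logarithm is not stated in the text; taking it to be 2 is an assumption on our part, but it is consistent with the list size of roughly 25000 reported for LESS-III below.

# Largest w <= n - k + 1 with (n!/(n-w)!) * q^(k-n) < 1/(4 log n) (illustrative sketch).
from math import lgamma, log, log2

def choose_w(n, k, q):
    best = None
    for w in range(2, n - k + 2):
        # natural log of n!/(n-w)! * q^(k-n), to avoid huge intermediate numbers
        lhs = (lgamma(n + 1) - lgamma(n - w + 1)) + (k - n) * log(q)
        if lhs < log(1.0 / (4 * log2(n))):   # assumption: 'log n' read as log base 2
            best = w
    return best

# For the LESS-III parameters this should reproduce the value used below:
print(choose_w(60, 25, 31))   # expected: 30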

We have lex(C1 ∩ Bn(w)) = lex(C2 ∩ Bn(w)), and heuristically the size of this set is close to |C1 ∩ Bn(w)|/(q − 1) ≈ |Bn(w)|/q^{n−k+1}, since lex is almost (q − 1)-to-one. Therefore, it takes roughly |Bn(w)| · 2 log n/(q^{n−k+1} · |L|) iterations of step 3 until 2 log n pairs (x, y) with lex(x) = lex(y) are found. We chose the list size |L| = √(|Bn(w)| · 2 log n/q^{n−k+1}) so that the work in step 2 and step 3 is balanced.
The last part of the algorithm assumes that for each pair (x, y) found in step 3 we have π(x) = y. This can only fail with probability bounded by 1/(4 log n), because a failure implies that π(x) and y ∈ C2 ∩ Bn(w) form a collision for lex. Summing over all the 2 log n pairs, we get that the probability that π(x) = y holds for all the pairs in P is at least 1/2. If this is the case, then there are typically very few permutations σ (most of the time only one) that satisfy σ(x) = y for all (x, y) ∈ P, and the true code equivalence π must be one of them.
The complexity of the attack is dominated by the cost of the ISD algorithm to find |L| weight-w codewords in C1 and C2 in steps 2 and 3, which is

2 · C_{|L|}(n, k, w) .

In our implementation we have used the Lee-Brickell algorithm [8] with p = 2 to instantiate the ISD oracle³. In this case, the number of row operations used by the ISD algorithm can be approximated (see Sect. 2.2) as

2 · C_{|L|}(n, k, w) ≈ C∞ · |L| / (|C1 ∩ Bn(w)|/(q − 1)) = O( \binom{n}{w}·√(log n) / ( \binom{n−k}{w−2}·√(|Bn(w)|·q^{−n+k}) ) ) .

The Algorithm in Practice. An implementation of our algorithm in C, as well as a Python script to estimate the complexity of our attack, is made publicly available at

www.github.com/WardBeullens/LESS_Attack.

We used this implementation to break the LESS-III parameter set. The public key of a LESS-III signature consists of two permutation equivalent codes C1 and C2 of length n = 60 and dimension k = 25 over F31. The codes are chosen to be weakly self-dual. From experiments, we see that the weakly self-dual property does not seem to affect the complexity or the success rate of our attack.
For these parameters, the maximal value of w that satisfies (n!/(n−w)!) · q^{−n+k} < 1/(4 log n) is w = 30, so we use the Lee-Brickell algorithm to find codewords in C1 and C2 with Hamming weight at most 30. The list size is √(|Bn(w)| · q^{−n+k−1} · 2 log n), which is close to 25000. With these parameter choices, the algorithm runs in about 2 s on a laptop with an Intel i5-8400H CPU at 2.50 GHz. The rate at which pairs are found closely matches the heuristic analysis of the previous section: the analysis suggests that we should have to do approximately 2^14.7 Gaussian eliminations, while the average number of Gaussian eliminations measured in our experiments is 2^14.6. However, we find that the heuristic lower bound of 1/2 for the success probability is not tight: the algorithm terminated successfully in all of the executions. This is because in our heuristic analysis we used n!/(n − w)! as an upper bound for the number of permutations of a vector x of weight w. This upper bound is only achieved if all the entries of x are distinct. For a random vector x the number of permutations is much smaller, which explains why the observed probability of a bad collision is much lower than our heuristic upper bound.

³ One can also use more advanced ISD algorithms such as Stern's algorithm [14], but since we are working with relatively large finite fields we found that this does not offer a big speedup. To simplify the analysis and the implementation we have chosen the Lee-Brickell algorithm.
Remark 1. If we use the algorithm for longer codes, the list L will quickly become so large that it would be very costly to store it entirely in memory. To avoid this we can define two functions F1 and F2 that take a random seed as input, run an ISD algorithm to find a weight-w codeword x in C1 or C2 respectively, and output lex(x). Then we can use a memoryless claw-finding algorithm such as the van Oorschot-Wiener algorithm [16] to find inputs a, b such that F1(a) = F2(b). This makes the memory complexity of the algorithm polynomial, at essentially no cost in time complexity. Since memory is not an issue for attacking the LESS parameters, we did not implement this approach.

Comparison with Leon's Algorithm and New Parameters for LESS. We expect that recovering a LESS-III secret key with Leon's algorithm would require 2^24.5 iterations of the Lee-Brickell algorithm, significantly more than the 2^14.6 iterations that our algorithm requires. Figure 1 shows how the complexity of our attack and Leon's attack scales with increasing code length n. The left graph shows the situation where the field size q and the dimension k increase linearly with the code length, while the graph on the right shows the case where q = 31 is fixed. In both cases, our algorithm outperforms Leon's algorithm, but since our algorithm can exploit the large field size, the gap is larger in the first case. The sawtooth-like behavior of the complexity of Leon's algorithm is related to the number of vectors of minimal weight, which oscillates up and down. We see that in order to achieve 128 bits of security (i.e. an attack needs 2^128 row operations) we can use a q-ary code of length n = 280, dimension k = 117 and q = 149. Alternatively, if we keep q = 31 fixed, we could use a code of length n = 305 and dimension k = 127. This would result in an average signature size of 18.8 KB or 21.1 KB respectively. This is almost a factor 5 larger than the current signature size of 3.8 KB⁴. The public key size would increase from 0.53 KB⁵ to 16.8 KB or 13.8 KB for the q = 149 or q = 31 parameter set respectively, an increase of more than a factor 25. The fact that our algorithm performs better in comparison to Leon's algorithm for larger finite fields is illustrated in Fig. 2, where we plot the complexity of both algorithms for n = 250, k = 104 and various field sizes.

Fig. 1. Complexity of Leon's algorithm and our algorithm for finding permutation equivalences as a function of the code length. In the left graph the field size scales linearly with the code length; in the right graph the field size q = 31 is fixed. In both cases the rate of the code is fixed at k/n = 5/12.

Fig. 2. Complexity of Leon's algorithm and our algorithm for finding permutation equivalences as a function of the finite field size, for random linear codes of length n = 250 and dimension k = 104.

⁴ The LESS paper claims 7.8 KB, but 4 KB of the signature consists of commitments that can be recomputed by the verifier, so this does not need to be included in the signature size.
⁵ The LESS paper claims 0.9 KB public keys, but the generator matrix can be put in normal form, which reduces the size from k × n field elements to k × (n − k) field elements.

4 New Algorithm for Linear Equivalences over Fq

In this section, we generalize the algorithm from the previous section to the linear equivalence problem. The main obstacle we need to overcome is that it does not seem possible, given sparse vectors x ∈ C1 and y ∈ C2, to verify whether μ(x) = y, where μ ∈ Mn is the monomial transformation such that μ(C1) = C2. In the permutation equivalence setting, we could guess that if the multiset of entries of x equals the multiset of entries of y, then π(x) = y. If the size of the finite field was large enough, then this was correct with large probability. This strategy does not work in the linear equivalence setting, because monomial permutations do not preserve the multiset of entries. In fact, monomial transformations do not preserve anything beyond the Hamming weight of a vector, because for any two codewords x and y with the same weight there exists μ ∈ Mn such that μ(x) = y.

Main Idea. To overcome this problem, we will replace sparse vectors by 2-dimensional subspaces with small support. Let

X1(w) = {V ⊂ C1 | dim(V) = 2 and |Supp(V)| ≤ w}

be the set of 2-dimensional linear subspaces of C1 with support of size at most w, and similarly we let X2(w) be the set of 2-spaces in C2 with support of size at most w. If μ ∈ Mn is a monomial permutation such that μ(C1) = C2, then for all V ∈ X1(w) we have μ(V) ∈ X2(w). Analogously to the algorithm from the previous section, we will sample 2-spaces from X1(w) and from X2(w) in the hope of finding spaces V ∈ X1(w) and W ∈ X2(w) such that μ(V) = W. Then, after finding Ω(log(n)) such pairs, we expect to be able to recover the equivalence μ. To detect whether μ(V) = W we define lex(V) to be the lexicographically first basis of a 2-space in the Mn-orbit of V. Clearly, if μ(V) = W, then the Mn-orbits of V and W are the same and hence lex(V) = lex(W). Moreover, since the dimension of V and W is only 2, it is feasible to compute lex(V) and lex(W) efficiently.

Computing lex(V). To compute lex(V) we can simply consider all the bases x, y that generate V (there are (q² − 1)(q² − q) of them), for each of them find the monomial transformation μ such that μ(x), μ(y) comes first lexicographically, and then take the permuted basis that comes first out of these (q² − 1)(q² − q) options. Given a basis x, y, finding the lexicographically first value of μ(x), μ(y) is relatively straightforward: first make sure that μ(x) is minimal, and then use the remaining degrees of freedom to minimize μ(y). The minimal μ(x) consists of n − wt(x) zeroes followed by wt(x) ones, which is achieved by multiplying the non-zero entries of x (and the corresponding entries of y) by their inverse and permuting x such that all the ones are in the back. The remaining degrees of freedom of μ can be used to make the first n − wt(x) entries of μ(y) consist of a number of zeros followed by a number of ones, and to sort the remaining entries of μ(y) in ascending order.
A basis x, y for V can only lead to the lexicographically first μ(x), μ(y) if the Hamming weight of x is minimal among all the vectors in V. Therefore, we only need to consider bases x, y where the Hamming weight of x is minimal. When the first basis vector is fixed, choosing the second basis vector and minimizing the basis can on average be done with a constant number of row operations, so the average cost of the algorithm is q + 1 + O(N) = O(q) row operations, where the q + 1 operations stem from finding the minimal weight vectors in V, and N is the number of such vectors.
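
The per-basis minimization just described can be sketched in a few lines of Python. The following is our own illustrative reading of that procedure (for a prime field Fq, with vectors reduced modulo q); it is not taken from the LESS attack implementation.

# Sketch: given a basis (x, y) of a 2-space over F_q in which x has minimal
# Hamming weight, compute the lexicographically smallest pair (mu(x), mu(y))
# over all monomial permutations mu, following the description above.
def minimal_basis(x, y, q):
    zero_part = []   # positions with x_i = 0: the mu(y)-entry there becomes 0 or 1
    supp_part = []   # positions with x_i != 0: x_i scaled to 1, mu(y)-entry = y_i / x_i
    for xi, yi in zip(x, y):
        if xi % q == 0:
            zero_part.append(0 if yi % q == 0 else 1)
        else:
            supp_part.append((yi * pow(xi, q - 2, q)) % q)
    zero_part.sort()   # zeros of mu(y) first, then ones
    supp_part.sort()   # remaining entries of mu(y) in ascending order
    min_x = [0] * len(zero_part) + [1] * len(supp_part)
    min_y = zero_part + supp_part
    return min_x, min_y

# lex(V) is then the minimum of minimal_basis(x, y, q) over all bases (x, y)
# of V in which x has minimal Hamming weight.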

Example 1. The following is an example of what lex(V) could look like:

V = ( 19  3 21 36 17 44  0 47 34 19 48  3  0 47  0 38 27  8 49 18  8  0  0 31 26 52  7 30 37 47 )
    ( 35 24 13  0 50 40  0 52  6 19 37 28  0 13  0 49 34 20 24 30 24 45  0 39 42  0 18 17 28 36 )

lex(V) = ( 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1  1  1  1  1  1  1  1  1  1  1  1  1  1  1  1 )
         ( 0 0 0 0 1 1 0 0 1 1 2 3 4 5 9  9 11 13 15 19 20 21 23 27 32 33 34 36 37 39 )

Description of the Algorithm. The algorithm works as follows:

1. Let w be maximal subject to (n!/(n−w)!) · q^{w−1+2k−2n} < 1/(4 log n) and w ≤ n − k + 2.
2. Repeatedly use information set decoding to generate a list L that contains √(\binom{n}{w} · q^{2(−n+k+w−2)} · 2 log n) pairs of the form (V, lex(V)), where V ∈ X1(w).
3. Initialize an empty list P and repeatedly use information set decoding to generate W ∈ X2(w). If there is a pair (V, lex(V)) in L such that lex(V) = lex(W), then append (V, W) to P. Continue until P has 2 log(n) elements.
4. Use a backtracking algorithm to iterate over all monomial permutations μ that satisfy μ(V) = W for all (V, W) ∈ P until a monomial permutation is found that satisfies μ(C1) = C2.

Heuristic Analysis of the Algorithm. The heuristic analysis of this algorithm is very similar to that of our permutation equivalence algorithm. This time the size of an Mn-orbit of a 2-space V with |Supp(V)| ≤ w is bounded by (n!/(n−w)!) · (q − 1)^{w−1}, and a random 2-space has probability (q^k − 1)(q^k − q)/((q^n − 1)(q^n − q)) ≈ q^{2(k−n)} of being a subspace of C1. So as long as we pick w such that (n!/(n−w)!) · q^{w−1+2k−2n} < 1/(4 log n), we expect the probability that one of the pairs (V, W) that we found satisfies lex(V) = lex(W) but μ(V) ≠ W to be bounded by 1/2. The sizes of X1(w) and X2(w) are expected to be at most

\binom{n}{w} · (q^w − 1)(q^w − q)/((q² − 1)(q² − q)) · q^{−2(n−k)} ≈ \binom{n}{w} · q^{2(w−2−n+k)} ,

because for each of the \binom{n}{w} supports S of size w, there are (q^w − 1)(q^w − q)/((q² − 1)(q² − q)) 2-spaces whose support is included in S, and we expect one out of q^{2(n−k)} of them to lie in C1. Therefore, if we set the list size to |L| = √(\binom{n}{w} · q^{2(−n+k+w−2)} · 2 log n), then we expect the third step of the algorithm to terminate after roughly |L| iterations. (We are counting the subspaces V with |Supp(V)| < w multiple times, so X1(w) is slightly smaller than our estimate. This is not a problem, because it means that the third step will terminate slightly sooner than our analysis suggests.)
The complexity of the algorithm consists of the ISD effort to sample |L| elements from X1(w) and X2(w) respectively, and the cost of computing lex. We have to compute lex an expected number of 2|L| times: once for each of the 2-spaces in the list L and once for each 2-space found in step 3. Since the number of row operations per lex is O(q), the total cost of computing lex is O(q|L|).
To sample the 2-spaces we use an adaptation of the Lee-Brickell algorithm: we repeatedly put a generator matrix of C1 in echelon form with respect to a random choice of pivot columns, and then we look at the span of any 2 out of the k rows of the new matrix (a small sketch of this step follows below). Given the echelon form of the matrix, the support of these 2-spaces has size at most n − k + 2, and if we are lucky the size of the support will be smaller than or equal to w. The complexity of this algorithm is very similar to that of the standard Lee-Brickell algorithm for finding codewords (see Sect. 2.2).
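
As a small illustration (ours), the following Python sketch collects the small-support 2-spaces from the rows of one randomized echelon form, e.g. rows produced by a routine like the rref_random_info_set helper sketched in Sect. 2.2.

# Sketch: given the rows of a generator matrix in echelon form (entries reduced
# mod q), return bases of 2-spaces spanned by pairs of rows whose support has
# size at most w. If the rows come from an echelon form with respect to an
# information set, this support has size at most n - k + 2.
from itertools import combinations

def small_support_2spaces(rows, w):
    found = []
    for r1, r2 in combinations(rows, 2):
        support = {i for i, (a, b) in enumerate(zip(r1, r2)) if a != 0 or b != 0}
        if len(support) <= w:
            found.append((r1, r2))   # a basis of the 2-space V = <r1, r2>
    return found
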
For a 2-space V ∈ X1(w), the Lee-Brickell algorithm will find V if the random choice of pivots intersects Supp(V) in 2 positions, which happens with probability P∞(n, k, w) = \binom{n−k}{w−2}\binom{k}{2} / \binom{n}{w}. The cost per iteration is O(k² + \binom{k}{2}) = O(k²) row operations for the Gaussian elimination and for enumerating the 2-spaces, so the expected number of row operations until we find |L| elements in X1(w) and X2(w) is

O( k²·\binom{n}{w}·|L| / ( \binom{n−k}{w−2}·\binom{k}{2}·|X1(w)| ) ) ≈ O( √(\binom{n}{w} · log n) · q^{−w+2+n−k} / \binom{n−k}{w−2} ) .

4.1 The Algorithm in Practice

We have implemented the algorithm and applied it to the LESS-I parameter set. The public key of a LESS-I signature consists of two linearly equivalent codes C1 and C2, chosen uniformly at random, of length n = 54 and dimension k = 27 over F53. The largest value of w satisfying (n!/(n−w)!) · q^{w−1+2k−2n} < 1/(4 log n) is w = 28, so we use the Lee-Brickell algorithm to generate √(\binom{n}{w} · q^{2(−n+k+w−2)} · 2 log n) subspaces of C1 and C2 with support of size at most 28, which comes down to approximately 2800000 subspaces of each code. From our implementation we see that this takes on average about 2^20.6 Gaussian eliminations, which matches the heuristic analysis of the previous section very well. The attack takes in total about 2^30.9 row operations, which amounts to about 25 s on a laptop with an Intel i5-8400H CPU at 2.50 GHz. Approximately 12 s are spent computing the spaces V; the remaining time is spent computing lex(V).

Remark 2. Similar to the permutation equivalence case, it is possible to use a memoryless collision search algorithm to remove the large memory cost of the attack at essentially no runtime cost.

Comparison with Leon's Algorithm and New Parameters for LESS. We expect Leon's algorithm (using the Lee-Brickell algorithm to instantiate the ISD oracle) to require 2^38.3 row operations, which is significantly more than the 2^30.9 operations that our algorithm requires. Figure 3 shows the complexity of our algorithm and Leon's algorithm for increasing code length. If the size of the finite field increases linearly with the code length, then the gap between our algorithm and Leon's algorithm increases exponentially. In contrast, if the field size is fixed, then Leon's algorithm will eventually outperform our algorithm. Figure 4 shows that our algorithm exploits the large field size so well that in some regimes increasing the field size hurts security. Therefore, when picking parameters for LESS, it is best not to pick a field size that is too big. To achieve 128 bits of security against our algorithm and Leon's algorithm one could use linearly equivalent codes of length 250 and dimension 125 over F53. This results in a signature size of 28.4 KB, more than 3 times the original LESS-I signature size of 8.4 KB. The public key size would be 11.4 KB, more than 22 times the original public key size of 0.5 KB. We found that for the LESS-II parameter set, the finite field F7 is too small for our algorithm to improve over Leon's algorithm, which we estimate would take about 2^44 row operations.

Fig. 3. Complexity of Leon's algorithm and our algorithm for finding linear equivalences as a function of the code length. In the left graph the field size scales linearly with the code length; in the right graph the field size q = 53 is fixed. In both cases the rate of the code is fixed at k/n = 1/2.

Fig. 4. Estimated complexity of Leon's algorithm and our algorithm for finding linear equivalences as a function of the finite field size, for random weakly self-dual codes of length n = 250 and dimension k = 125.

5 Conclusion

We have introduced a new algorithm for finding permutation equivalences and linear equivalences between codes that improves upon Leon's algorithm for sufficiently large field sizes. Leon's algorithm requires computing the set of all the codewords of minimal weight; in contrast, to find permutation equivalences our algorithm only requires computing a small (square-root) fraction of the codewords that have a certain (non-minimal) weight. To find linear equivalences we compute a small fraction of the 2-dimensional subspaces of the code with small (but not minimal) support. We implemented the algorithm and used it to break the recently proposed LESS system. We showed that the LESS-I and LESS-III parameter sets can be broken in only 25 s and 2 s respectively. We propose larger parameters that resist our attack and Leon's original attack, at the cost of at least a factor 3 increase in signature size and a factor 22 increase in key size.
We compare the new parameters of LESS to some other code-based signature schemes in Table 2. Despite the significant increase in signature size and key size, LESS still has smaller signatures than other zero-knowledge-based signatures in the Hamming metric, such as Stern's protocol [15], Véron's protocol [17] and the CVE scheme [4]. For example, we estimate that with some straightforward optimizations, the Fiat-Shamir transformed version of the CVE identification protocol has a signature size of 38 KB at 128 bits of security. However, the smaller signature size of LESS comes at the cost of larger public keys. Compared to cRVDC [2], a recent zero-knowledge-based proposal using the rank metric, the signature size of LESS is very similar, but the LESS public keys are much larger. Compared to the Durandal scheme [1], LESS has a similar public key size, but larger signatures. Finally, compared to WAVE [5], LESS has much smaller public keys, but also much larger signatures.

Table 2. Comparison of the new LESS parameters with some other code-based signature schemes.

            CVE [4]   cRVDC [2]  Durandal [1]  Wave [5]  LESS-I   LESS-III
Metric      Hamming   Rank       Rank          Hamming   Hamming  Hamming
Type        FS        FS         FS w/ abort   Trapdoor  FS       FS
Public key  104 B     152 B      15 KB         3.2 MB    11 KB    17 KB
Signature   38 KB     22 KB      4.0 KB        1.6 KB    28 KB    19 KB

References

1. Aragon, N., Blazy, O., Gaborit, P., Hauteville, A., Zémor, G.: Durandal: a rank metric based signature scheme. In: Ishai, Y., Rijmen, V. (eds.) EUROCRYPT 2019, Part III. LNCS, vol. 11478, pp. 728–758. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-17659-4_25
2. Bellini, E., Caullery, F., Gaborit, P., Manzano, M., Mateu, V.: Improved Veron identification and signature schemes in the rank metric. In: 2019 IEEE International Symposium on Information Theory (ISIT), pp. 1872–1876. IEEE (2019)
3. Biasse, J.-F., Micheli, G., Persichetti, E., Santini, P.: LESS is more: code-based signatures without syndromes. Cryptology ePrint Archive, Report 2020/594 (2020). https://eprint.iacr.org/2020/594
4. Cayrel, P.-L., Véron, P., El Yousfi Alaoui, S.M.: A zero-knowledge identification scheme based on the q-ary syndrome decoding problem. In: Biryukov, A., Gong, G., Stinson, D.R. (eds.) SAC 2010. LNCS, vol. 6544, pp. 171–186. Springer, Heidelberg (2011). https://doi.org/10.1007/978-3-642-19574-7_12
5. Debris-Alazard, T., Sendrier, N., Tillich, J.-P.: Wave: a new family of trapdoor one-way preimage sampleable functions based on codes. In: Galbraith, S.D., Moriai, S. (eds.) ASIACRYPT 2019, Part I. LNCS, vol. 11921, pp. 21–51. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-34578-5_2
6. Feulner, T.: The automorphism groups of linear codes and canonical representatives of their semilinear isometry classes. Adv. Math. Commun. 3(4), 363–383 (2009)
7. Girault, M.: A (non-practical) three-pass identification protocol using coding theory. In: Seberry, J., Pieprzyk, J. (eds.) AUSCRYPT 1990. LNCS, vol. 453, pp. 265–272. Springer, Heidelberg (1990). https://doi.org/10.1007/BFb0030367
8. Lee, P.J., Brickell, E.F.: An observation on the security of McEliece's public-key cryptosystem. In: Barstow, D., et al. (eds.) EUROCRYPT 1988. LNCS, vol. 330, pp. 275–280. Springer, Heidelberg (1988). https://doi.org/10.1007/3-540-45961-8_25
9. Leon, J.: Computing automorphism groups of error-correcting codes. IEEE Trans. Inf. Theory 28(3), 496–511 (1982)
10. McEliece, R.J.: A public-key cryptosystem based on algebraic coding theory. Jet Propulsion Laboratory DSN Progress Report, pp. 42–44 (1978)
11. Saeed, M.A.: Algebraic approach for code equivalence. PhD thesis, Normandie Université; University of Khartoum (2017)
12. Sendrier, N.: Finding the permutation between equivalent linear codes: the support splitting algorithm. IEEE Trans. Inf. Theory 46(4), 1193–1203 (2000)
13. Sendrier, N., Simos, D.E.: The hardness of code equivalence over Fq and its application to code-based cryptography. In: Gaborit, P. (ed.) PQCrypto 2013. LNCS, vol. 7932, pp. 203–216. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-38616-9_14
14. Stern, J.: A method for finding codewords of small weight. In: Cohen, G., Wolfmann, J. (eds.) Coding Theory 1988. LNCS, vol. 388, pp. 106–113. Springer, Heidelberg (1989). https://doi.org/10.1007/BFb0019850
15. Stern, J.: A new identification scheme based on syndrome decoding. In: Stinson, D.R. (ed.) CRYPTO 1993. LNCS, vol. 773, pp. 13–21. Springer, Heidelberg (1994). https://doi.org/10.1007/3-540-48329-2_2
16. van Oorschot, P.C., Wiener, M.J.: Parallel collision search with cryptanalytic applications. J. Cryptol. 12(1), 1–28 (1999)
17. Véron, P.: Improved identification schemes based on error-correcting codes. Appl. Algebra Eng. Commun. Comput. 8(1), 57–69 (1997)
