Not Enough LESS: An Improved Algorithm for Solving Code Equivalence Problems over Fq
Ward Beullens
1 Introduction
Two $q$-ary linear codes $C_1$ and $C_2$ of length $n$ and dimension $k$ are called permutation equivalent if there exists a permutation $\pi \in S_n$ such that $\pi(C_1) = C_2$. Similarly, if there exists a monomial permutation $\mu \in \mathcal{M}_n = (\mathbb{F}_q^{\times})^n \rtimes S_n$ such that $\mu(C_1) = C_2$, the codes are said to be linearly equivalent (a monomial permutation acts on vectors in $\mathbb{F}_q^n$ by permuting the entries and also multiplying each entry by a unit of $\mathbb{F}_q$). The problem of finding $\pi \in S_n$ (or $\mu \in \mathcal{M}_n$) given equivalent $C_1$ and $C_2$ is called the permutation equivalence problem (or the linear equivalence problem, respectively).¹
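Concretely, writing $\mu = (v, \sigma)$ with $v \in (\mathbb{F}_q^{\times})^n$ and $\sigma \in S_n$, one way to spell out this action (the indexing convention below is an illustrative choice) is
$$\big(\mu(x)\big)_{\sigma(i)} = v_i\, x_i \qquad \text{for all } 1 \le i \le n \text{ and } x \in \mathbb{F}_q^n.$$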
¹ There also exists a more general notion of equivalence called semi-linear equivalence. Our methods generalize to semi-linear equivalences, but since this is not relevant for the security of LESS, we do not elaborate on this.
This work was supported by CyberSecurity Research Flanders with reference num-
ber VR20192203 and the Research Council KU Leuven grants C14/18/067 and
STG/17/019. Ward Beullens is funded by FWO SB fellowship 1S95620N.
The hardness of the permutation equivalence problem is relevant for the security
of the McEliece and Girault post-quantum cryptosystems [7,10]. More recently,
Biasse, Micheli, Persichetti, and Santini proposed LESS, a new code-based signature scheme whose security relies only on the hardness of the linear code equivalence problem or the permutation code equivalence problem [3]. The public key consists of
generator matrices for two equivalent codes C1 and C2 , and a signature is a zero-
knowledge proof of knowledge of an equivalence μ ∈ Mn (or π ∈ Sn ) such that
μ(C1 ) = C2 (or π(C1 ) = C2 respectively). In the case of permutation equivalence,
the codes C1 and C2 are chosen to be weakly self-dual, because otherwise π can
be recovered in polynomial time [12]. Table 1 shows the proposed parameter sets
for LESS, aiming for 128 bits of security.
We will briefly go over some of the algorithms that have been proposed for the
permutation and linear code equivalence problems below. The state of the art
for the permutation code equivalence problem is that random instances can be
solved in polynomial time with the Support Splitting Algorithm (SSA), but that
instances with codes that have large hulls require a runtime that is exponential
in the dimension of the hull. Weakly self-dual codes (these are codes C such that
C ⊂ C ⊥ ) have hulls of maximal dimension dim(H(C)) = dim(C) and are believed
to be the hardest instances of the permutation equivalence problem. The state
of the art for the linear code equivalence problem is that instances over Fq with
q ≤ 4 can be solved in polynomial time with the SSA algorithm via a reduction
to the permutation equivalence problem, but for q > 4 this reduction results in
codes with a large hull, which means the SSA algorithm is not efficient. Hence,
the linear code equivalence problem is conjectured to be hard on average for
q > 4 [13].
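For reference, the hull used throughout this discussion is the standard one,
$$\mathcal{H}(C) := C \cap C^{\perp},$$
so a weakly self-dual code (with $C \subseteq C^{\perp}$) indeed has a hull of maximal dimension $\dim \mathcal{H}(C) = \dim C = k$.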
Leon’s Algorithm. Leon’s algorithm [9] for finding linear and permutation
equivalences relies on the observation that applying a permutation or a monomial permutation does not change the Hamming weight of a codeword. Therefore, if
we compute the sets X1 and X2 of all the minimal-weight codewords of C1
and C2 respectively, then it must be that X2 = π(X1 ) or X2 = μ(X1 ) in the
case of permutation equivalence or linear equivalence respectively. Leon gives an
algorithm to compute a μ ∈ Mn that satisfies X2 = μ(X1 ) with a time complexity
that is polynomial in |X1 |. Usually, the sets X1 and X2 have “enough structure”,
such that if μ satisfies X2 = μ(X1 ), then also C2 = μ(C1 ) with non-negligible
probability. If this is not the case, then one can also consider larger sets X1 and
X2 that contain all the codewords in C1 and C2 respectively whose weight is one
more than the minimal weight. Since the sets X1 and X2 are usually small, the
complexity of the algorithm is dominated by the complexity of computing X1
and X2 .
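As an illustration of the sets that Leon's algorithm starts from, the following brute-force Python sketch computes the minimal-weight codewords of a toy code over a prime field; it is only meant for very small parameters, and the function name is ours.

```python
from itertools import product

def minimal_weight_codewords(G, q):
    """Brute-force set of minimal-weight codewords of <G>; only feasible when q^k is tiny."""
    k, n = len(G), len(G[0])
    best, words = n + 1, set()
    for coeffs in product(range(q), repeat=k):
        cw = tuple(sum(c * G[i][j] for i, c in enumerate(coeffs)) % q for j in range(n))
        w = sum(1 for a in cw if a != 0)
        if w == 0:
            continue                      # skip the zero codeword
        if w < best:
            best, words = w, {cw}
        elif w == best:
            words.add(cw)
    return best, words
```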
Feulner gives an algorithm that computes a canonical representative of an
equivalence class of codes. The complexity of this algorithm is close to that of
Leon’s algorithm [6].
² This is not the case for monomial permutations: $\mathcal{H}(\mu(C))$ is not necessarily equal to $\mu(\mathcal{H}(C))$ for $\mu \in \mathcal{M}_n$. This is why the SSA cannot be directly applied to find linear equivalences.
The SSA defines a signature as S(C, i) := W (H(Ci )), where Ci is the code C
punctured at position i, and W (C) denotes the weight enumerator polynomial
of the code C. While this signature is typically not fully discriminant, it is still discriminant enough to efficiently solve the permutation equivalence problem for random instances. However, a limitation of the SSA algorithm is that computing the weight enumerator of the hull is not efficient when the hull of C is large. For random
codes this is not a problem because typically the hull is small.
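To make the signature concrete, here is a brute-force Python sketch of $S(C, i) = W(\mathcal{H}(C_i))$ for a toy code over a prime field; it enumerates all $q^k$ codewords, so it is only feasible for very small parameters, and the function names are ours.

```python
from itertools import product
from collections import Counter

def codewords(G, q):
    """All codewords in the row space of G (with repetitions if the rows are dependent)."""
    k, n = len(G), len(G[0])
    for coeffs in product(range(q), repeat=k):
        yield tuple(sum(c * G[i][j] for i, c in enumerate(coeffs)) % q for j in range(n))

def hull_codewords(G, q):
    """H(C) = C ∩ C^⊥: codewords of C that are orthogonal to every row of G."""
    return {c for c in codewords(G, q)
            if all(sum(a * b for a, b in zip(c, row)) % q == 0 for row in G)}

def ssa_signature(G, q, i):
    """S(C, i) = W(H(C_i)): weight distribution of the hull of the code punctured at i."""
    Gi = [row[:i] + row[i + 1:] for row in G]        # drop column i
    dist = Counter(sum(1 for a in c if a != 0) for c in hull_codewords(Gi, q))
    return tuple(sorted(dist.items()))               # e.g. ((0, 1), (2, 3), ...)
```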
2 Preliminaries
2.1 Notation
For a $q$-ary linear code $C$ of length $n$ and dimension $k$ we say a matrix $G \in \mathbb{F}_q^{k \times n}$ is a generator matrix for $C$ if $C = \langle G \rangle$, where $\langle G \rangle$ denotes the span of the rows of $G$. Similarly, we say that a matrix $H \in \mathbb{F}_q^{(n-k) \times n}$ is a parity check matrix for $C$ if $C = \{x \in \mathbb{F}_q^n \mid H x^{\top} = 0\}$.
2.2 Information Set Decoding
The algorithms in this paper will make use of information set decoding to find
sparse vectors in $q$-ary linear codes. In particular, we will use the Lee-Brickell algorithm [8] with parameter $p = 2$. To find low-weight codewords in a code $C = \langle M \rangle$, the algorithm repeatedly computes the echelon form of $M$ with respect
to a random choice of k pivot columns. Then, the algorithm inspects all the
linear combinations of p = 2 rows of the matrix. Given the echelon form of the
matrix, we are guaranteed that all these linear combinations have weight at most
n − k + 2, but if we are lucky enough we will find codewords that are even more
sparse. We repeat this until a sufficiently sparse codeword is found.
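A minimal Python sketch of this procedure over a prime field $\mathbb{F}_q$ could look as follows (assuming a full-rank generator matrix with entries reduced modulo $q$; names and structure are illustrative):

```python
import random
from itertools import combinations

def rref_random_pivots(M, q):
    """Try to bring M into reduced echelon form with respect to a random choice of
    k pivot columns; return the new rows, or None if the columns are not an
    information set (the caller simply retries)."""
    k, n = len(M), len(M[0])
    cols = random.sample(range(n), k)
    A = [row[:] for row in M]
    for r, c in enumerate(cols):
        piv = next((i for i in range(r, k) if A[i][c] != 0), None)
        if piv is None:
            return None
        A[r], A[piv] = A[piv], A[r]
        inv = pow(A[r][c], q - 2, q)              # inverse in the prime field F_q
        A[r] = [(a * inv) % q for a in A[r]]
        for i in range(k):
            if i != r and A[i][c] != 0:
                f = A[i][c]
                A[i] = [(a - f * b) % q for a, b in zip(A[i], A[r])]
    return A

def lee_brickell(M, q, target_weight):
    """Repeat random echelon forms and inspect all combinations of p = 2 rows
    until a codeword of weight <= target_weight is found."""
    k = len(M)
    while True:
        A = rref_random_pivots(M, q)
        if A is None:
            continue
        for i, j in combinations(range(k), 2):
            for a in range(1, q):                 # row_i + a*row_j covers all 2-row
                c = [(x + a * y) % q for x, y in zip(A[i], A[j])]  # combinations up to scalar
                if 0 < sum(1 for v in c if v != 0) <= target_weight:
                    return c
```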
One iteration (a Gaussian elimination followed by the enumeration of all $\binom{k}{2}$ pairs of rows) costs $O(k^2 + \binom{k}{2}) = O(k^2)$ row operations, and it finds one specific weight-$w$ codeword if exactly two of the $k$ pivot columns fall in its support, which happens with probability
$$P_{\infty}(n, k, w) = \frac{\binom{k}{2}\binom{n-k}{w-2}}{\binom{n}{w}}\,;$$
we write $C_{\infty}(n, k, w) \approx O(k^2)/P_{\infty}(n, k, w)$ for the corresponding expected cost of finding that codeword. Heuristically, for random codes we expect the supports of the different codewords to behave as if they are “independent”, so if there exist $(q-1)N$ codewords of weight $w$ (i.e. $N$ different codewords up to multiplication by a scalar), then we expect the probability that one iteration of the Lee-Brickell algorithm succeeds to be
$$P_1(n, k, w) \approx 1 - \big(1 - P_{\infty}(n, k, w)\big)^N.$$
Thus, if $N$ is small enough, we have $P_1(n, k, w) \approx N\, P_{\infty}(n, k, w)$, and the complexity of finding a single weight-$w$ codeword is $C_1(n, k, w) \approx C_{\infty}(n, k, w)/N$.
Finally, if the goal is to find L out of the N distinct weight-w codewords
(up to multiplication by a scalar), the cost of finding the first codeword is C1 =
C∞ (n, k, w)/N , the cost of finding the second codeword is C∞ (n, k, w)/(N − 1),
the cost of finding the third codeword is C∞ (n, k, w)/(N −2) and so on. Summing
up these costs, we get that the cost of finding L distinct codewords is
$$C_L(n, k, w) \approx C_{\infty}(n, k, w) \cdot \sum_{i=0}^{L-1} \frac{1}{N - i}\,.$$
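These estimates are straightforward to evaluate; a small sketch directly implementing the formulas above (costs measured in row operations, with the per-iteration cost approximated by $k^2$):

```python
from math import comb

def P_inf(n, k, w):
    """Probability that one Lee-Brickell iteration finds one specific weight-w codeword."""
    return comb(k, 2) * comb(n - k, w - 2) / comb(n, w)

def C_inf(n, k, w):
    """Expected cost of finding one specific weight-w codeword (row operations)."""
    return k * k / P_inf(n, k, w)

def C_L(n, k, w, N, L):
    """Expected cost of finding L out of the N weight-w codewords (up to scalars)."""
    return C_inf(n, k, w) * sum(1 / (N - i) for i in range(L))
```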
then assuming there are no other vectors in X1 and X2 with the same multiset of
entries, we know that π(x) = y. In particular we know that π(5) = 3, π(10) = 6
and π(2), π(6) ∈ {2, 8}.
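The constraints that such a matched pair imposes on $\pi$ are easy to extract; a small sketch (using 0-based positions, whereas the example above uses 1-based positions, and assuming the pair indeed satisfies $\pi(x) = y$):

```python
from collections import defaultdict

def position_constraints(x, y):
    """For a pair (x, y = pi(x)), map each position i to the set of possible values
    of pi(i): a position holding some value in x must map to a position holding
    the same value in y."""
    positions_in_y = defaultdict(set)
    for j, v in enumerate(y):
        positions_in_y[v].add(j)
    return {i: positions_in_y[v] for i, v in enumerate(x)}
```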
Instead of computing all of X1 and X2 , our algorithm will search for a small
number of such pairs (x, y = π(x)), which will give enough information to deter-
mine π. This approach will not work if Fq is too small, because then X1 will
contain a lot of vectors with the same multiset of entries. (E.g. in the case q = 2,
all the vectors with the same weight have the same multiset of entries.)
If we compute only $\Theta\big(\sqrt{|X_1| \log n}\big)$ elements of $X_1$ and $X_2$, then we expect to find $\Theta(\log n)$ pairs $(x, y = \pi(x))$, which suffices to recover $\pi$. This speeds up the procedure by a factor $\Theta\big(\sqrt{|X_1|/\log n}\big)$, which is only a small factor. We can improve this further by considering larger sets $X_1 = C_1 \cap B_n(w)$ and $X_2 = C_2 \cap B_n(w)$ for a weight $w$ that is not minimal. In the most favorable case, where the multisets of the vectors in $X_i$ are still unique for $w = n - k + 1$, we can sample from $X_1$ and $X_2$ in polynomial time using Gaussian elimination, and we get an algorithm that runs in time $\tilde{O}\big(\sqrt{\binom{n}{k-1}}\big)$, where $\tilde{O}$ is the usual big-O notation but ignoring polynomial factors.
We have $\operatorname{lex}(C_1 \cap B_n(w)) = \operatorname{lex}(C_2 \cap B_n(w))$, and heuristically the size of this set is close to $|C_1 \cap B_n(w)|/(q-1) \approx |B_n(w)|/q^{\,n-k+1}$, since lex is almost $(q-1)$-to-one. Therefore, it takes roughly $|B_n(w)| \cdot 2 \log n / (q^{\,n-k+1} |L|)$ iterations of step 3 until $2 \log n$ pairs $(x, y)$ with $\operatorname{lex}(x) = \operatorname{lex}(y)$ are found. We chose the list size $|L| = \sqrt{|B_n(w)| \cdot 2 \log n / q^{\,n-k+1}}$ so that the work in step 2 and step 3 is balanced.
The last part of the algorithm assumes that for each pair $(x, y)$ found in step 3 we have $\pi(x) = y$. For a single pair this can only fail with probability bounded by $1/(4 \log n)$, because a failure implies that $\pi(x)$ and $y \in C_2 \cap B_n(w)$ form a collision for lex. Summing over all the $2 \log n$ pairs, we get that the probability that $\pi(x) = y$ holds for all the pairs in $P$ is at least $1/2$. If this is the case, then there are typically very few permutations $\sigma$ (most of the time only one) that satisfy $\sigma(x) = y$ for all pairs $(x, y) \in P$, and the true code equivalence $\pi$ must be one of them.
The complexity of the attack is dominated by the cost of the ISD algorithm³ used to find $|L|$ weight-$w$ codewords in each of $C_1$ and $C_2$ in steps 2 and 3, which is
$$2 \cdot C_{|L|}(n, k, w).$$
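A skeleton of this list/collision phase is sketched below (the numbered steps of the algorithm are referenced above but not reproduced in this excerpt; as a stand-in for lex we use the sorted multiset of entries, which is permutation-invariant, and `sample_C1`/`sample_C2` are assumed to return a random weight-w codeword, e.g. via the Lee-Brickell sketch of Sect. 2.2):

```python
def find_pairs(sample_C1, sample_C2, list_size, num_pairs):
    """Step 2: build a list of canonical forms of weight-w codewords of C1.
    Step 3: sample weight-w codewords of C2 until num_pairs collisions are found."""
    table = {}
    while len(table) < list_size:
        x = sample_C1()
        table[tuple(sorted(x))] = x
    pairs = []
    while len(pairs) < num_pairs:
        y = sample_C2()
        x = table.get(tuple(sorted(y)))
        if x is not None:
            pairs.append((x, y))          # with high probability pi(x) = y
    return pairs
```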
$C_2$ with Hamming weight at most 30. The list size is $\sqrt{|B_n(w)| \cdot q^{\,-n+k-1} \cdot 2 \log n}$, which is close to 25000.

³ One can also use more advanced ISD algorithms, such as Stern's algorithm [14], but since we will be working with relatively large finite fields we found that this does not offer a big speedup. To simplify the analysis and the implementation we have chosen the Lee-Brickell algorithm.

With these parameter choices, the algorithm runs in
about 2 s on a laptop with an Intel i5-8400H CPU at 2.50 GHz. The rate at which
pairs are found closely matched the heuristic analysis of the previous section:
The analysis suggests that we should have to do approximately $2^{14.7}$ Gaussian eliminations, while the average number of Gaussian eliminations measured in our experiments is $2^{14.6}$. However, we find that the heuristic lower bound of 1/2 for
the success probability is not tight: The algorithm terminates successfully in all
of the executions. This is because in our heuristic analysis we used n!/(n − w)! as
an upper bound for the number of permutations of a vector x of weight w. This
upper bound is only achieved if all the entries of x are distinct. For a random
vector x the number of permutations is much smaller, which explains why the
observed probability of a bad collision is much lower than our heuristic upper
bound.
Remark 1. If we use the algorithm for longer codes the list L will quickly be so
large that it would be very costly to store the entire list in memory. To avoid
this, we can define two functions $F_1$ and $F_2$ that take a random seed as input, run an ISD algorithm to find a weight-$w$ codeword $x$ in $C_1$ or $C_2$ respectively, and output lex($x$). Then we can use a memoryless claw-finding algorithm such as the van Oorschot–Wiener algorithm [16] to find inputs $a, b$ such that $F_1(a) = F_2(b)$.
This makes the memory complexity of the algorithm polynomial, at essentially
no cost in time complexity. Since memory is not an issue for attacking the LESS
parameters we did not implement this approach.
size of 3.8 KB.⁴ The public key size would increase from 0.53 KB⁵ to 16.8 KB or 13.8 KB for the q = 149 or q = 31 parameter set respectively, an increase of more than a factor of 25. The fact that our algorithm performs better in comparison to
Leon’s algorithm for larger finite fields is illustrated in Fig. 2, where we plot the
complexity of both algorithms for n = 250, k = 104 and for various field sizes.
Fig. 1. Complexity of Leon's algorithm and our algorithm for finding permutation equivalences as a function of the code length. In the left graph the field size scales linearly with the code length; in the right graph the field size q = 31 is fixed. In both cases the rate of the code is fixed at k/n = 5/12.
Fig. 2. Complexity of Leon's algorithm and our algorithm for finding permutation equivalences as a function of the finite field size for random linear codes of length n = 250 and dimension k = 104.
⁴ The LESS paper claims 7.8 KB, but 4 KB of the signature consists of commitments that can be recomputed by the verifier, so this does not need to be included in the signature size.
⁵ The LESS paper claims 0.9 KB public keys, but the generator matrix can be put in normal form, which reduces the size from k × n field elements to k × (n − k) field elements.
Computing lex(V ). To compute lex(V ) we can simply consider all the bases
x, y that generate V (there are (q 2 − 1)(q 2 − q) of them) and for each of them
find the monomial transformation μ such that μ(x), μ(y) comes first lexico-
graphically, and then take the permuted basis that comes first out of these
(q 2 − 1)(q 2 − q) options. Given a basis x, y, finding the lexicographically first
value of μ(x), μ(y) is relatively straightforward: First make sure that μ(x) is
minimal, and then use the remaining degrees of freedom to minimize μ(y). The
minimal μ(x) consists of n − wt(x) zeroes followed by wt(x) ones, which is
achieved by multiplying the non-zero entries of x (and the corresponding entries
of y) by their inverse and permuting x such that all the ones are in the back.
The remaining degrees of freedom of μ can be used to make the first n − wt(x)
entries of μ(y) consist of a number of zeros followed by a number of ones and to
sort the remaining entries of μ(y) in ascending order.
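A brute-force Python sketch of this computation for a prime field is given below; it iterates over all $(q^2 - 1)(q^2 - q)$ bases and applies the per-basis normalization just described (the optimization of only considering bases whose first vector has minimal weight, discussed next, is omitted, and all names are illustrative):

```python
from itertools import product

def normalize_basis(x, y, q):
    """Lexicographically first (mu(x), mu(y)) over all monomial maps mu, for a
    fixed basis (x, y) with x != 0, following the normalization described above."""
    # Positions where x != 0: scale so x becomes 1 there (this also fixes y there).
    ys = [(yi * pow(xi, q - 2, q)) % q for xi, yi in zip(x, y) if xi != 0]
    # Positions where x == 0: non-zero entries of y can still be rescaled to 1,
    # and these positions are permuted so the zeros come first.
    free = sorted(0 if yi == 0 else 1 for xi, yi in zip(x, y) if xi == 0)
    # Positions where x == 1 can be permuted freely, so sort the y-entries there.
    new_x = [0] * len(free) + [1] * len(ys)
    new_y = free + sorted(ys)
    return new_x, new_y

def lex_V(g1, g2, q):
    """Brute force over all (q^2 - 1)(q^2 - q) ordered bases of V = <g1, g2>."""
    best = None
    for a, b in product(range(q), repeat=2):
        if a == b == 0:
            continue
        x = [(a * u + b * v) % q for u, v in zip(g1, g2)]
        for c, d in product(range(q), repeat=2):
            if (a * d - b * c) % q == 0:       # (c, d) must give an independent y
                continue
            y = [(c * u + d * v) % q for u, v in zip(g1, g2)]
            cand = normalize_basis(x, y, q)
            if best is None or cand < best:
                best = cand
    return best
```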
A basis $(x, y)$ for $V$ can only lead to the lexicographically first $\mu(x), \mu(y)$ if the Hamming weight of $x$ is minimal among all the vectors in $V$. Therefore, we only need to consider bases $(x, y)$ where the Hamming weight of $x$ is minimal. When the first basis vector is fixed, choosing the second basis vector and minimizing the basis can on average be done with a constant number of row operations, so the average cost of the algorithm is $q + 1 + O(N) = O(q)$ row operations, where the $q + 1$ operations stem from finding the minimal-weight vectors in $V$, and $N$ is the number of such vectors.
$$\operatorname{lex}(V) = \begin{pmatrix} 0&0&0&0&0&0&1&1&1&1&1&1&1&1&1&1&1&1&1&1&1&1&1&1&1&1&1&1&1&1\\ 0&0&0&0&1&1&0&0&1&1&2&3&4&5&9&9&11&13&15&19&20&21&23&27&32&33&34&36&37&39 \end{pmatrix}$$
are such that $\operatorname{lex}(V) = \operatorname{lex}(W)$ but $\mu(V) \neq W$ is bounded by $1/2$. The sizes $|X_1(w)|$ and $|X_2(w)|$ are expected to be at most
$$\binom{n}{w} \frac{(q^w - 1)(q^w - q)}{(q^2 - 1)(q^2 - q)}\, q^{-2(n-k)} \approx \binom{n}{w}\, q^{2(w - 2 - n + k)},$$
because for each of the $\binom{n}{w}$ supports $S$ of size $w$, there are $\frac{(q^w - 1)(q^w - q)}{(q^2 - 1)(q^2 - q)}$ 2-spaces whose support is included in $S$, and we expect one out of $q^{2(n-k)}$ of them to lie in $C_1$. Therefore, if we set the list size to $|L| = \sqrt{\binom{n}{w}\, q^{2(-n+k+w-2)} \cdot 2 \log n}$, then we expect the third step of the algorithm
to terminate after roughly |L| iterations. (We are counting the subspaces V with
Not Enough LESS: An Improved Algorithm 399
|Supp(V )| < w multiple times, so X1 (w) is slightly smaller than our estimate.
This is not a problem, because it means that the third step will terminate slightly
sooner than our analysis suggests.)
The complexity of the algorithm consists of the ISD effort to sample |L|
elements from X1 (w) and X2 (w) respectively, and the costs of computing lex.
We have to compute lex an expected number of 2|L| times; once for each of the 2-
spaces in the list L and once for each 2-space found in step 3. Since the number
of row operations per lex is $O(q)$, the total cost of computing lex is $O(q|L|)$.
To sample the 2-spaces we use an adaptation of the Lee-Brickell algorithm:
We repeatedly put a generator matrix of C1 in echelon form with respect to a
random choice of pivot columns, and then we look at the span of any 2 out of
k rows of the new matrix. Given the echelon form of the matrix, the support
of these 2-spaces has size at most n − k + 2, and if we are lucky the size of the
support will be smaller than or equal to w. The complexity of this algorithm is
very similar to that of the standard Lee-Brickell algorithm for finding codewords
(see Sect. 2.2).
For a 2-space $V \in X_1(w)$, the Lee-Brickell algorithm will find $V$ if the random choice of pivots intersects $\operatorname{Supp}(V)$ in 2 positions, which happens with probability
$$P_{\infty}(n, k, w) = \frac{\binom{n-k}{w-2}\binom{k}{2}}{\binom{n}{w}}.$$
The cost per iteration is $O(k^2 + \binom{k}{2}) = O(k^2)$ row operations for the Gaussian elimination and for enumerating the 2-spaces, so the expected number of row operations until we find $|L|$ elements in $X_1(w)$ and $X_2(w)$ is
$$O\!\left(\frac{k^2 \binom{n}{w}}{\binom{n-k}{w-2}\binom{k}{2}} \cdot \frac{|L|}{|X_1(w)|}\right) \approx O\!\left(\frac{\sqrt{\binom{n}{w}\,\log n}}{\binom{n-k}{w-2}}\; q^{\,-w+2+n-k}\right).$$
We have implemented the algorithm and applied it to the LESS-I parameter set. The public key of a LESS-I signature consists of two linearly equivalent codes $C_1$ and $C_2$, chosen uniformly at random, of length $n = 54$ and dimension $k = 27$ over $\mathbb{F}_{53}$. The largest value of $w$ satisfying $\frac{n!}{(n-w)!}\, q^{\,w-1+2k-2n} < \frac{1}{4 \log n}$ is $w = 28$, so we use the Lee-Brickell algorithm to generate $\sqrt{\binom{n}{w}\, q^{2(-n+k+w-2)} \cdot 2 \log n}$ subspaces of $C_1$ and $C_2$ with support of size at most 28, which comes down to approximately 2800000 subspaces of each code. From our implementation we see
that this takes on average about $2^{20.6}$ Gaussian eliminations, which matches the heuristic analysis of the previous section very well. The attack takes in total about $2^{30.9}$ row operations, which amounts to about 25 s on a laptop with an
Intel i5-8400H CPU at 2.50 GHz. Approximately 12 s are spent computing the
spaces V , the remaining time is spent computing lex(V ).
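For reference, the list size used here can be recomputed directly from the formula above (assuming log denotes the base-2 logarithm, which is consistent with the count reported):

```python
from math import comb, log2, sqrt

n, k, q, w = 54, 27, 53, 28
list_size = sqrt(comb(n, w) * q ** (2 * (-n + k + w - 2)) * 2 * log2(n))
print(f"{list_size:.3e}")   # on the order of 2.8 * 10^6, matching the count above
```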
Fig. 3. Complexity of Leon's algorithm and our algorithm for finding linear equivalences as a function of the code length. In the left graph the field size scales linearly with the code length; in the right graph the field size q = 53 is fixed. In both cases the rate of the code is fixed at k/n = 1/2.
Fig. 4. Estimated complexity of Leon's algorithm and our algorithm for finding linear equivalences as a function of the finite field size for random weakly self-dual codes of length n = 250 and dimension k = 125.
5 Conclusion
Table 2. Comparison of the new LESS parameters with some other code-based signa-
ture schemes.
             CVE [15]   cRVDC [2]   Durandal [1]   Wave [5]   LESS-I    LESS-III
Metric       Hamming    Rank        Rank           Hamming    Hamming   Hamming
Type         FS         FS          FS w/ abort    Trapdoor   FS        FS
Public key   104 B      152 B       15 KB          3.2 MB     11 KB     17 KB
Signature    38 KB      22 KB       4.0 KB         1.6 KB     28 KB     19 KB
References
1. Aragon, N., Blazy, O., Gaborit, P., Hauteville, A., Zémor, G.: Durandal: a rank
metric based signature scheme. In: Ishai, Y., Rijmen, V. (eds.) EUROCRYPT 2019.
Part III. LNCS, vol. 11478, pp. 728–758. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-17659-4_25
2. Bellini, E., Caullery, F., Gaborit, P., Manzano, M., Mateu, V.: Improved Veron identification and signature schemes in the rank metric. In: 2019 IEEE International
Symposium on Information Theory (ISIT), pp. 1872–1876. IEEE (2019)
3. Biasse, J.-F., Micheli, G., Persichetti, E., Santini, P.: LESS is more: code-based sig-
natures without syndromes. Cryptology ePrint Archive, Report 2020/594 (2020).
https://eprint.iacr.org/2020/594
4. Cayrel, P.-L., Véron, P., El Yousfi Alaoui, S.M.: A zero-knowledge identification
scheme based on the q-ary syndrome decoding problem. In: Biryukov, A., Gong, G.,
Stinson, D.R. (eds.) SAC 2010. LNCS, vol. 6544, pp. 171–186. Springer, Heidelberg
(2011). https://doi.org/10.1007/978-3-642-19574-7_12
5. Debris-Alazard, T., Sendrier, N., Tillich, J.-P.: Wave: a new family of trapdoor one-
way preimage sampleable functions based on codes. In: Galbraith, S.D., Moriai,
S. (eds.) ASIACRYPT 2019, Part I. LNCS, vol. 11921, pp. 21–51. Springer, Cham (2019). https://doi.org/10.1007/978-3-030-34578-5_2
6. Feulner, T.: The automorphism groups of linear codes and canonical representatives
of their semilinear isometry classes. Adv. Math. Comm. 3(4), 363–383 (2009)
7. Girault, M.: A (non-practical) three-pass identification protocol using coding the-
ory. In: Seberry, J., Pieprzyk, J. (eds.) AUSCRYPT 1990. LNCS, vol. 453, pp.
265–272. Springer, Heidelberg (1990). https://doi.org/10.1007/BFb0030367
8. Lee, P.J., Brickell, E.F.: An observation on the security of McEliece’s public-key
cryptosystem. In: Barstow, D., et al. (eds.) EUROCRYPT 1988. LNCS, vol. 330, pp. 275–280. Springer, Heidelberg (1988). https://doi.org/10.1007/3-540-45961-8_25
9. Leon, J.: Computing automorphism groups of error-correcting codes. IEEE Trans.
Inf. Theory 28(3), 496–511 (1982)
10. McEliece, R.J.: A public-key cryptosystem based on algebraic coding theory. Jet
Propulsion Laboratory DSN Progress Report, pp. 42–44 (1978)
11. Saeed, M.A.: Algebraic approach for code equivalence. PhD thesis, Normandie
Université; University of Khartoum (2017)
12. Sendrier, N.: Finding the permutation between equivalent linear codes: the support
splitting algorithm. IEEE Trans. Inf. Theory 46(4), 1193–1203 (2000)
13. Sendrier, N., Simos, D.E.: The hardness of code equivalence over Fq and its application to code-based cryptography. In: Gaborit, P. (ed.) PQCrypto 2013. LNCS, vol. 7932, pp. 203–216. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-38616-9_14
14. Stern, J.: A method for finding codewords of small weight. In: Cohen, G., Wolf-
mann, J. (eds.) Coding Theory 1988. LNCS, vol. 388, pp. 106–113. Springer, Hei-
delberg (1989). https://doi.org/10.1007/BFb0019850
15. Stern, J.: A new identification scheme based on syndrome decoding. In: Stinson,
D.R. (ed.) CRYPTO 1993. LNCS, vol. 773, pp. 13–21. Springer, Heidelberg (1994).
https://doi.org/10.1007/3-540-48329-2_2
16. van Oorschot, P.C., Wiener, M.J.: Parallel collision search with cryptanalytic appli-
cations. J. Cryptol. 12(1), 1–28 (1999)
17. Véron, P.: Improved identification schemes based on error-correcting codes. Appl.
Algebra Eng. Commun. Comput. 8(1), 57–69 (1997)