Denis Berthier
Institut Mines-Télécom
The Hidden Logic of Sudoku
Second Edition
Denis Berthier
Books by Denis Berthier:
Le Savoir et l’Ordinateur, Editions L’Harmattan, Paris, November 2002.
Méditations sur le Réel et le Virtuel, Editions L’Harmattan, Paris, May 2004.
The Hidden Logic of Sudoku (First Edition), Lulu.com, May 2007.
This work is subject to copyright. All rights are reserved. This work may not be translated or copied in
whole or in part without the prior written permission of the publisher, except for brief excerpts in
connection with reviews or scholarly analysis. Use in connection with any form of information storage or
retrieval, electronic adaptation, computer software, or by similar or dissimilar methods now known or
hereafter developed is forbidden.
The first edition of this book was published in May 2007 by Lulu.com Publishers (© 2007 Denis
Berthier, ISBN: 978-1-84753-472-9).
Second Edition
ISBN: 978-1-84799-214-7
Contents
Prologue
Introduction
1 The Sudoku problem and the resolution methods
2 The roles of Logic and AI in this book
3 Examples and classification results
Chapter XX. Hidden xyz- and xyzt- chains (hxyz- and hxyzt- chains)
XX.1 Introduction to hidden xyzt-chains (or hxyzt-chains)
XX.2 Examples and independence results
Miscellanea
1 The question of completeness
2 The question of confluence
3 The question of uniqueness
4 Are any other types of rules necessary?
Conclusion
References
Foreword to the Second Edition
The first edition of this book (May 2007) introduced a conceptual framework for
Sudoku solving, where "resolution rules" played a central role. All the concepts
were formalised in Predicate Logic (FOL), which (surprisingly) was a new idea: all
the books and Web forums had always considered that Propositional Logic was enough.
The concepts were also straightforwardly grounded in the notions every player uses
when solving a puzzle. This framework (unchanged in this second edition) was thus
totally player oriented from the start; it can be considered as a mere formalisation of
what has always been looked for when it is said a "pure logic solution" is wanted.
On the practical side, I also introduced new resolution rules, based on natural
generalisations of the famous xy-chains, such as xyt-, xyz- and xyzt- chains; contrary
to those proposed in the current literature, these were not based on subsets. The
systematic clarification and exploitation of all the generalised symmetries of the
game also led me to a new source of generalisation and provided the "hidden" counterparts
of the previous chains. After the first edition was published, I devised a
further generalisation, pushing the idea of super-symmetry to its maximal extent and
allowing almost any puzzle to be solved with short chain patterns. Giving a systematic
presentation of these new rules (which I had introduced less formally on Web
forums) was the main reason for this second edition; and this provided the occasion
for local improvements of the parts already present in the first.
Let us state the main modifications that have been made in Parts One to Three:
– the graphics have been improved, especially for the grids with candidates;
– the extended Sudoku board defined in chapter II and the way to build and use it in
practice, which were previously available only on my Web pages, have been fully
integrated into the book; moreover, they are now explicitly used in several examples,
making the whole book more obviously player oriented;
– a new notation, the "nrc notation", is now used for displaying the solution
paths of all the examples; being more compact, it allowed the introduction of Part
Four without significantly increasing the total number of pages.
Prologue
Do I need to introduce the game of Sudoku and its diabolical puzzles, when
everyone knows that they have invaded the whole planet? One sad winter evening in
December 2005, I came across one of those grids by chance – unless it was already
the first step in some plot of the Powers of Darkness, who had placed it as a challenge
to me on a full page of a magazine in a hairdressing salon. Unfortunately I
thought to myself: "well, let’s see if this stupid game can help me get lighter ideas".
Alas! I did not know yet that Sudoku grids are like the arabesques described by
Lovecraft: once you begin following their perverse thread, your mind becomes irrevocably
ensnared. Reader, unless you have already fallen into this abyss, do not take
my warning lightly: keep away! Gödel’s sentence of the mind, a riddle that is able to
paralyse it into unending loops, is not where Douglas Hofstadter has been looking
for it: you will find it in the Sudoku grids! Or should I say: it will find you?
To make it short, I was not only unable to unlock my mind from the puzzle
before I had completed it, but I was also coerced by the spell into trying immediately
to solve a second one, even though I still had the same opinion in my mind:
what a stupid game!
But this is only the beginning of the story: so mischievous is the Sudoku virus
that it has to replicate and disseminate itself by all available means. Judge for yourself.
On the morrow morning, I was hardly awake when I discovered that the same
Fiendish Powers who had plotted my first encounter with the grids had also implanted
a weird idea into the innermost meanders of my neurons, softened up by too
short a night: "what a nice elementary example that would be for the students of my
introductory courses in Logic and Artificial Intelligence (AI)!" This was the real
start of it all.
As I was somehow aware some spell was acting on me and there was a risk of
transferring it onto the poor fellows, I had to make sure there was a firm scientific or
technical basis on which they could obtain a foothold. So I first checked that the
problem was feasible (in particular that the computation times would not be
prohibitive) and I quickly developed a small knowledge base able to solve all the
example grids I could find on the Web. It worked by recursive Trial and Error, i.e.
by carrying out a systematic exploration of all the possibilities¹. I needed a little
more time to optimise the knowledge base so as to solve most of the puzzles in a
fraction of a second and the remaining ones in a few seconds on my old Powerbook,
but this was disconcertingly easy (compared to the time one needs to solve a puzzle
"by hand"). So I gave this as an exercise to a first group of students – and discovered
with some horror that many of them were already addicted to the wicked game.
Although I was not so naive as to expect anything else, my program was deeply
frustrating: the machine reached the solution infinitely faster than any human being
could, even in his dreams, but it did so in an utterly stupid way (by exploring
thousands of hypotheses, sometimes more than twenty levels deep).
Still goaded by the same Devilish Powers, I decided to write another (rule based)
program that would simulate the behaviour of an expert player, able to justify all its
steps as a human player would. In the meantime, I had discovered on the Web lots
of pages describing resolution rules based on "pure logic", i.e. not resorting to Trial
and Error. These rules consist of detecting patterns of varied complexity (and
propagating associated constraints) of a type much less obvious than those mentioned
above – the names of which might make you pensive (Naked Single, Hidden
Single, Naked Subset, Hidden Subset, Block-Row-Interaction, X-Wing, Swordfish,
Jellyfish and all other sorts of fishy things, XYZ-Wing, Death Blossom…). Building
a series of projects for other groups of students provided me with a pretext for
pursuing these new avenues.
The first thing to consider about the way a human being tackles the problem is
that a puzzle is never submitted in a purely logical form; on the contrary, it is always
centred on a spatial presentation², i.e.: "complete the following grid…". This might
seem harmless since it is not very difficult to translate the whole data into pure
logic. Nevertheless, this spatial presentation of the game insidiously leads to the
extended and biased representation universally used for resolution (a representation
used by all the Web sites I have visited and by all the books I have browsed): in
every cell, one writes either the number that must definitely occupy it or (with a
pencil and in smaller size) the list of all its "candidates", i.e. of all the numbers that
may still occupy it. Solving the grid then consists of progressively reducing this list
of candidates by constraints propagation, until only one possibility remains for each
cell. The resolution rules one can find on the Web are the expression of more or less
complicated constraints; they are nearly always formulated on the basis of this
representation.
¹ Of course, this (depth first) exploration of the tree of possibilities was duly pruned by
propagation of the constraints defining the game (along rows, columns and 3x3 blocks).
² Notice that the same remark applies to most of the so called logical games.
It is time now to say a few words about the first discovery my writhing neurons
made while trying to escape the spell: this universal spatial presentation of the
puzzle, together with the associated model of cells to be filled with one number
each, hides some logical symmetries of the problem³. And considering that eliciting
these symmetries leads to the quasi identification of complex rules (such as X-Wing,
Swordfish and Jellyfish) with apparently much simpler ones (such as Naked Pairs,
Naked Triplets and Naked Quadruplets respectively), there is a mathematical beauty
in it.
But they have very twisted minds and you can never know by which pernicious
paths they will have you reach their goals. At the time I caught Sudoku (as one
catches a cold), I was wondering, like many researchers in cognitive science, if there
is anything describable that makes the symbols or signs we use in language and in
many other forms of ordinary life different from the formal symbols of AI. With the
findings mentioned above, this very vague question mingled with, or focused on, the
relationship between the abstract logical formulation of the game and its spatial
presentation⁴. That is to say, I got a new excuse to continue working on Sudoku.
The fact is I never consciously decided to write this book before the post-its, writings,
drawings and programs had accumulated while I progressively shook the spell
off as days passed. One vicious thing leading to a virtuous one, the whole process
ended with this book (and the corresponding program SudoRules) being written.
The Gods are victorious and our spirits are high.
³ Since this discovery, I have not been able to find any systematic reference on the Web to
anything similar and I think granting it the central place it has in this book is original. But I
must confess that I have not read the sixty-odd million pages related to Sudoku.
⁴ This topic is still under investigation and will hardly be tackled in the present book.
The Gods are victorious and our spirits are high,
but do not forget that
lurking forever behind each and every Sudoku grid are
the Powers of Darkness…
Introduction
Given a 9x9 grid, partially filled with numbers from 1 to 9 (the "entries" of the
problem, also called the "clues" or the "givens"), complete it with numbers from 1 to
9 so that in each of the nine rows, in each of the nine columns and in each of the
nine disjoint blocks of 3x3 contiguous cells, the following property holds:
– there is at most one occurrence of each of these numbers.
Although this defining property can be replaced by either of the following two,
which are obviously equivalent to it, we shall stick to the first formulation, for reasons
that will appear later (in chapter IV, section 1.2):
– there is at least one occurrence of each of these numbers,
– there is exactly one occurrence of each of these numbers.
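On a completely filled grid, the equivalence of the three formulations follows from a counting argument: a unit has nine cells and nine possible numbers, so "no number twice" and "no number missing" imply each other. The Python sketch below checks this on the solution grid of Figure 1; the function names are ours, chosen for illustration only.

```python
DIGITS = set("123456789")

# Solution grid of Figure 1, read row by row.
SOLUTION = (
    "673894512" "912735486" "845612973"
    "798261354" "526473891" "134589267"
    "469128735" "287356149" "351947628"
)

def unit_values(grid):
    """Yield the nine values of each of the 27 units (rows, columns, blocks)."""
    rows = [grid[9 * r: 9 * r + 9] for r in range(9)]
    for r in range(9):
        yield list(rows[r])
    for c in range(9):
        yield [rows[r][c] for r in range(9)]
    for br in range(0, 9, 3):
        for bc in range(0, 9, 3):
            yield [rows[br + i][bc + j] for i in range(3) for j in range(3)]

def at_most_one(grid):
    """No number occurs twice in any unit."""
    return all(len(set(vals)) == len(vals) for vals in unit_values(grid))

def at_least_one(grid):
    """Every number occurs in every unit."""
    return all(set(vals) >= DIGITS for vals in unit_values(grid))

def exactly_one(grid):
    """Every number occurs exactly once in every unit."""
    return all(sorted(vals) == sorted(DIGITS) for vals in unit_values(grid))

print(at_most_one(SOLUTION), at_least_one(SOLUTION), exactly_one(SOLUTION))
```

The equivalence only holds for complete grids; on a partially filled grid, "at most one" is the only one of the three properties that can be satisfied.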
Since rows, columns and blocks play similar roles in the defining constraints,
they will naturally appear to do so in many other places and it is convenient to introduce
a word that makes no difference between them: a unit is either a row or a
column or a block. And we say that two cells share a unit if they are either in the
same row or in the same column or in the same block (where "or" is non exclusive).
We also say that these two cells are linked, or that they see each other. It should be
noticed that this (symmetric) relation between two cells, whichever of the three
equivalent names it is given, does not depend in any way on the content of these
cells but only on their place in the grid; it is therefore a straightforward and quasi
physical notion.
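Because the "linked" relation depends only on positions, it can be computed once and for all. A minimal Python sketch follows, assuming cells are written as (row, column) pairs numbered from 1 to 9 and blocks are numbered 1 to 9 row by row; these conventions and the function names are ours. We also assume that a cell is not considered linked to itself.

```python
from itertools import product

CELLS = list(product(range(1, 10), repeat=2))  # (row, column), both in 1..9

def block(cell):
    """Number (1..9) of the 3x3 block containing the cell."""
    r, c = cell
    return 3 * ((r - 1) // 3) + (c - 1) // 3 + 1

def share_unit(a, b):
    """True if two distinct cells are in the same row, column or block."""
    return a != b and (a[0] == b[0] or a[1] == b[1] or block(a) == block(b))

def linked_cells(cell):
    """All the cells that the given cell 'sees'."""
    return {other for other in CELLS if share_unit(cell, other)}

print(len(linked_cells((5, 5))))
```

Every cell sees 8 cells in its row, 8 in its column and 4 more in its block, i.e. 20 cells in total, whatever the content of the grid.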
As can be seen from the definition, a Sudoku grid is a special case of a Latin
Square. Latin Squares must satisfy the same constraints as Sudoku, except the condition
on blocks. The practical consequences of this relationship between Sudoku
and Latin Squares will appear throughout this book (and the logical relationship
between the two theories will be fully clarified in chapter IV).
Figure 1 below shows the standard presentations of a problem grid (also called a
puzzle) and of a solution grid (also called a complete Sudoku grid).
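Puzzle                   Solution
. . . . . . . 1 2        6 7 3 8 9 4 5 1 2
. . . . 3 5 . . .        9 1 2 7 3 5 4 8 6
. . . 6 . . . 7 .        8 4 5 6 1 2 9 7 3
7 . . . . . 3 . .        7 9 8 2 6 1 3 5 4
. . . 4 . . 8 . .        5 2 6 4 7 3 8 9 1
1 . . . . . . . .        1 3 4 5 8 9 2 6 7
. . . 1 2 . . . .        4 6 9 1 2 8 7 3 5
. 8 . . . . . 4 .        2 8 7 3 5 6 1 4 9
. 5 . . . . 6 . .        3 5 1 9 4 7 6 2 8
Figure 1. A puzzle (left, empty cells shown as dots) and its solution (right)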
The problem statement lists the constraints a solution grid must satisfy, i.e. it
says what we want. It does not say anything about how we can obtain it: this is the
job of the resolution methods and the resolution rules on which they are based (two
notions that will be progressively refined in this introduction, until the final definition
of a resolution rule can be given in chapter IV).
     c1       c2       c3       c4       c5       c6       c7       c8       c9
r1   345689   34679    3456789  789      4789     4789     459      [1]      [2]
r2   24689    124679   1246789  2789     [3]      [5]      49       689      4689
r3   234589   12349    1234589  [6]      1489     12489    459      [7]      34589
r4   [7]      2469     245689   2589     15689    12689    [3]      2569     14569
r5   23569    2369     23569    [4]      15679    123679   [8]      2569     15679
r6   [1]      23469    2345689  235789   56789    236789   24579    2569     45679
r7   3469     34679    34679    [1]      [2]      346789   579      3589     35789
r8   2369     [8]      123679   3579     5679     3679     12579    [4]      13579
r9   2349     [5]      123479   3789     4789     34789    [6]      2389     13789
Figure 2. Grid Royle17-3 of Figure 1, with the candidates remaining after the elementary
constraints have been propagated (each cell lists its candidates; given values are in brackets)
Given this choice, the process of solving a grid "by hand" is generally initialised
by defining the "candidates" for each cell. For later formalisation, one must give a
careful definition of this notion: at any stage of the resolution process, candidates
for a cell are the numbers that are not yet explicitly known to be impossible values
for this cell. At the start of the game, one possibility is to consider that any cell with
no input value admits all numbers from 1 to 9 as candidates (but more subtle initialisations
can be considered).
Usually candidates for a cell are displayed in the grid as smaller and/or lighter
digits in this cell (as shown in Figure 2); for better readability of such representations,
the nine blocks will be marked by thick borders and each of the possible
values will always be represented at the same relative place in each of the cells.
According to the type of their action part, such rules can be classified into three
categories:
– either assert the final value of a cell (when it is proven there is only one possibility
left for it); there are very few rules of this type;
– or delete some candidate(s) (which we call the target values of the pattern)
from some cell(s) (which we call the target cells of the pattern); as appears from a
quick browsing of the available literature and as will be confirmed by this book,
most resolution rules are of this type; they express specific forms of constraints
propagation; their general form is: if such a pattern is present, then it is impossible
for some value(s) to be in some cell(s) and the corresponding candidates must be
deleted from them;
– or, for some very difficult grids, recursively make a hypothesis on the value of
a cell, analyse its consequences and apply the eliminations induced by the
contradictions thus discovered; techniques of this kind (named "recursive Trial and
Error" or "recursive guess"), do not fit our condition-action form and are proscribed
by purists (for the reason that, most of the time, they make solving the puzzle totally
uninteresting); this book will show that they are very rarely needed if one admits
complex chain rules.
It should be noted that none of the above resolution rules, whatever its type,
asserts that there is a solution. Except for recursive Trial and Error, they may be
interpreted from an operational point of view as: "from what is known in the current
situation, do conclude that any solution, if there is any, will satisfy the following".
As one proceeds with resolution, candidates for each cell form a monotone
decreasing set. With a little care, this remains true even when making hypotheses
(i.e. resorting to recursive Trial and Error) cannot be avoided; in our "SudoRules"
solver, for instance, in this case all candidates are explicitly relativised to finite sets
of hypotheses (called "contexts") and monotonicity is thus maintained.
The four simplest constraints propagation rules (obviously valid) are the direct
translation of the initial problem formulation into operational rules for managing
candidates. We call them "the (four) elementary constraints propagation rules"
(ECP):
– ECP(cell): "if a value is asserted for a cell (as is the case for the initial values),
then remove all the other candidates for this cell";
– ECP(row): "if a value is asserted for a cell (as is the case for the initial values),
then remove this value from the candidates for any other cell in the same row";
– ECP(col): "if a value is asserted for a cell (as is the case for the initial values),
then remove this value from the candidates for any other cell in the same column";
– ECP(blk): "if a value is asserted for a cell (as is the case for the initial values),
then remove this value from the candidates for any other cell in the same block".
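The four ECP rules translate almost word for word into code. The following Python sketch applies them to the puzzle of Figure 1, starting from the simple initialisation in which every cell admits all numbers from 1 to 9. The data layout (cells numbered 0 to 80 row by row, a puzzle written as an 81-character string with "0" for empty cells) and the function names are our own assumptions, not those of SudoRules.

```python
# Puzzle of Figure 1 (Royle17-3), row by row, "0" = empty cell.
PUZZLE = (
    "000000012" "000035000" "000600070"
    "700000300" "000400800" "100000000"
    "000120000" "080000040" "050000600"
)

def peers(i):
    """Cells sharing a row, a column or a block with cell i (cells are 0..80)."""
    r, c = divmod(i, 9)
    return [j for j in range(81) if j != i and
            (j // 9 == r or j % 9 == c or
             (j // 27 == r // 3 and (j % 9) // 3 == c // 3))]

def propagate_ecp(puzzle):
    """Return the list of candidate sets obtained by applying the ECP rules."""
    cands = [set(range(1, 10)) for _ in range(81)]
    for i, ch in enumerate(puzzle):
        if ch != "0":
            value = int(ch)
            cands[i] = {value}            # ECP(cell)
            for j in peers(i):            # ECP(row), ECP(col), ECP(blk)
                cands[j].discard(value)
    return cands

cands = propagate_ecp(PUZZLE)
print(sorted(cands[0]))   # candidates of r1c1
```

For r1c1 this gives the candidates 3, 4, 5, 6, 8 and 9, as in the top left cell of Figure 2.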
Together with NS, the four elementary constraints propagation rules constitute
"the (five) elementary rules".
A novice player may think that these five elementary rules express the whole
problem and that applying them repeatedly is therefore enough to solve any puzzle.
If such were the case, you'd probably never have heard of Sudoku, because it would
amount to mere paper scratching. Anyway, as he gets stuck in situations where none
of these rules remains applicable, he soon discovers that, except for the simplest
grids, this is very far from being sufficient. The puzzle in Figure 1 is a simple illustration
of how you get stuck if you only know and use the five elementary rules: the
resulting situation is shown in Figure 2, in which none of these rules can be applied.
As we shall see later (in chapter V), for this particular puzzle, there is an easy way
to unblock it. But, as we shall also see, there are lots of puzzles that need rules of a
much higher complexity in order to be solved. And this is why Sudoku has become
so popular: all but the easiest puzzles require a particular combination of neuron-titillating
techniques and may even suggest the discovery of as yet unknown ones.
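The blocked situation can be reproduced with a few lines of Python. The sketch below applies the five elementary rules until none of them can fire. It assumes that NS is the usual Naked Single rule (a cell with a single remaining candidate receives that candidate as its value); the data layout and names are ours.

```python
# Puzzle of Figure 1 (Royle17-3), row by row, "0" = empty cell.
PUZZLE = (
    "000000012" "000035000" "000600070"
    "700000300" "000400800" "100000000"
    "000120000" "080000040" "050000600"
)

# For each cell (0..80), the cells sharing a row, a column or a block with it.
PEERS = {
    i: [j for j in range(81) if j != i and
        (j // 9 == i // 9 or j % 9 == i % 9 or
         (j // 27 == i // 27 and (j % 9) // 3 == (i % 9) // 3))]
    for i in range(81)
}

def elementary_rules(puzzle):
    """Apply ECP and NS (assumed: Naked Single) until no rule applies."""
    values = {i: int(ch) for i, ch in enumerate(puzzle) if ch != "0"}
    # ECP(cell) is implicit: asserted cells keep no candidate list.
    cands = {i: set(range(1, 10)) for i in range(81) if i not in values}
    pending = list(values)                 # asserted cells not yet propagated
    while pending:
        i = pending.pop()
        for j in PEERS[i]:                 # ECP(row), ECP(col), ECP(blk)
            if j in cands:
                cands[j].discard(values[i])
        for j in [k for k, cs in cands.items() if len(cs) == 1]:   # NS
            values[j] = cands.pop(j).pop()
            pending.append(j)
    return values, cands

values, cands = elementary_rules(PUZZLE)
print(len(values), "values known,", len(cands), "cells still open")
```

On this puzzle no value is added to the seventeen givens: every open cell keeps at least two candidates, which is the situation displayed in Figure 2.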
One general way out of the blocked situation described above is recursive Trial
and Error: when stuck, one can start a systematic (depth first) exploration of the tree
of possibilities, duly pruned by the propagation of elementary constraints (thus
avoiding the exploration of obviously contradictory possibilities). Simplistic as this
method may be, it has a major theoretical advantage, justifying that we keep it in our
arsenal of techniques to solve a grid: it is guaranteed either to find a solution if there
is (at least) one or to prove there is none.
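A minimal Python sketch of recursive Trial and Error is given below. It explores the tree of possibilities depth first and prunes a branch as soon as the elementary constraints leave some empty cell without any candidate. Trying first the cell with the fewest candidates is a common heuristic that we add here; it is our assumption, not something stated above. The givens are assumed to be mutually consistent.

```python
# For each cell (0..80), the cells sharing a row, a column or a block with it.
PEERS = {
    i: [j for j in range(81) if j != i and
        (j // 9 == i // 9 or j % 9 == i % 9 or
         (j // 27 == i // 27 and (j % 9) // 3 == (i % 9) // 3))]
    for i in range(81)
}

def solve(grid):
    """Depth-first Trial and Error on a list of 81 ints (0 = empty cell).

    Returns a completed copy of the grid, or None if there is no solution.
    The input list is restored to its initial content before returning.
    """
    best, best_cands = None, None
    for i in range(81):
        if grid[i] == 0:
            cands = set(range(1, 10)) - {grid[j] for j in PEERS[i]}
            if not cands:
                return None                      # contradiction: prune
            if best is None or len(cands) < len(best_cands):
                best, best_cands = i, cands
    if best is None:
        return grid[:]                           # no empty cell left
    result = None
    for value in sorted(best_cands):             # hypotheses on cell `best`
        grid[best] = value
        result = solve(grid)
        if result is not None:
            break
    grid[best] = 0                               # undo the hypothesis
    return result

# A grid with no solution: r1c9 sees 1..8 in its row and 9 in its column.
impossible = [1, 2, 3, 4, 5, 6, 7, 8, 0] + [0] * 8 + [9] + [0] * 63
print(solve(impossible))
```

The same function can be called on any puzzle written as a list of 81 integers, for instance the puzzle of Figure 1; the running time then depends on how many hypotheses have to be explored.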
Finally, we keep recursive Trial and Error in our arsenal, but we keep it as a last
resort weapon, to be used when nothing else can be done; we shall see that with all
the rules defined later in this book, such a strategy guarantees that, most of the time
(in 99.7% of Royle’s 36,628 reference cases defined below; in 97% of the randomly
generated puzzles), this technique is not needed; and, when it is needed, we have
found no case for which one level of hypothesis was not enough. With the rules for
3D chains introduced in this second edition, these percentages rise to over 99.99%.
Because the five elementary rules are not enough to solve every puzzle and recursive
Trial and Error is not realistic from a human solver point of view, other resolution
techniques must be devised.
Since Sudoku was invented, more or less complex resolution rules have been
defined. They are based on the eliciting of various types of additional constraints,
some of which may be non-obvious consequences of the problem statement.
Unfortunately, very often in the available literature, these rules, especially the
most complex ones, are only illustrated by examples and their definitions remain
rather vague – which causes both redundancy in the rules proposed by various authors
and much uncertainty regarding their scopes of application. To check this, just
look at some of the innumerable Web sites dedicated to Sudoku (more than sixty
million, only a few of which are listed in the bibliography at the end of this book).
It appears that this vagueness is due to the lack of a general guiding principle for
stating the rules, and this in turn is due to the lack of a clear notion of the complexity
of a rule.
Later in this book, we shall provide a precise and sometimes unusual rephrasing
of most of the familiar rules. Besides the constraint of non-ambiguity, the general
guiding principle we adopt can be considered a version of Occam’s Razor. One can
easily find some logical and some psychological support for it. It can be viewed
from two complementary, but essentially equivalent, points of view:
– from the point of view of the preconditions of a rule: a rule should apply only
in cases when simpler rules do not, i.e. its preconditions must be so specific as not to
subsume those of simpler rules; but they must also be so general as to cover as many
cases as possible; said otherwise, the scope of a rule must be extended as far as the
logic underlying it allows;
– from the point of view of the conclusions of a rule (its action part): a rule
should produce effective results, i.e. its conclusions should not be obtainable by
simpler rules.
Of course, with the same reference to "simpler rules" in the two points of view,
this principle relies on a definition of the complexity of a rule (or at least of the relative
complexities of two rules). In this book, we build a hierarchy of rules progressively,
based on:
– a distinction between three general classes of rules: subset rules, interaction
rules and chain rules;
– a generalised notion of logical symmetry and associated representations;
– a second guiding principle: a rule obtained from another by some (generalised
or not) logical symmetry must be granted the same logical complexity.
Given our objective of formalising the methods applied by a human solver, our
second principle is highly debatable. There may be a great gap between abstract
logical complexity and psychological complexity for the human solver. But the fact
is that, in most cases, we have no idea of how psychological complexity can be measured.
It is even doubtful that a given resolution rule could be given a psychological
complexity measure independent of the "geographical" situation on the grid of the
cells it applies to, i.e. independent of the most elementary symmetries inherent to
Sudoku (see chapter I); for instance, identical patterns of candidates on adjacent
cells may be easier to see than the same patterns in distant cells; and this may also
depend on individual psychological specificities. On the other hand, it is our hope
that a partial relative ordering of our rules, based on their logical formulation and
consistent with all the logical symmetries of the game, will serve as a reference for
future measures of the psychological deviations from it. Moreover, there is a strong
argument in favour of this principle, if one adopts the graphical representations and
the extended Sudoku board (defined in chapter II) that make obvious the equivalences
associated with generalised symmetries.
Notice that we are looking for a partial complexity order relation on the set of
resolution rules and that this is a very different task from trying to rank the puzzles
based on some definition of the complexity of their resolution path (unless one
defines the ranking of a puzzle as the complexity of the most complex rule necessary
to solve it – not a very realistic ranking). Of course, there must be some relationship
between a ranking of the puzzles and a partial complexity order on the set of
resolution rules. Nevertheless, given a fixed set of rules, we shall see through examples
that it can solve puzzles whose solution paths vary largely in complexity (whatever
intuitive notion of complexity one adopts for the paths). In this book, we shall
not tackle the problem of ranking the puzzles.
One last point can now be clarified. Everywhere in this book, a resolution
method must be understood strictly as:
– a set of resolution rules,
– a non-strict precedence ordering among them. Non-strict means that two rules
can have the same precedence (for instance, there is no reason to give a rule higher
precedence than that obtained from it by transposing rows and columns or by any
generalised symmetry).
⎢ ⎢ End do
⎢ End do
⎢ Apply rule on selected matching pattern
End loop
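Since only the end of the pseudo-code survives above, the following Python sketch is not a reconstruction of it. It merely illustrates, under our own assumptions, how a resolution method in the sense just defined (a set of rules plus a non-strict precedence ordering) can drive a solver. A rule is modelled as a function that tries to apply itself once to the current state and reports whether it changed anything; rules of equal precedence sit in the same group. The two rules shown are ECP and NS, the latter assumed to be the usual Naked Single rule.

```python
# For each cell (0..80), the cells sharing a row, a column or a block with it.
PEERS = {
    i: [j for j in range(81) if j != i and
        (j // 9 == i // 9 or j % 9 == i % 9 or
         (j // 27 == i // 27 and (j % 9) // 3 == (i % 9) // 3))]
    for i in range(81)
}

def initial_state(puzzle):
    """State = (asserted values, candidates of the cells still open)."""
    values = {i: int(ch) for i, ch in enumerate(puzzle) if ch != "0"}
    cands = {i: set(range(1, 10)) for i in range(81) if i not in values}
    return values, cands

def ecp(state):
    """Elementary constraints propagation: perform one pending elimination."""
    values, cands = state
    for i, value in values.items():
        for j in PEERS[i]:
            if j in cands and value in cands[j]:
                cands[j].discard(value)
                return True
    return False

def ns(state):
    """Naked Single (assumed definition): assert one cell with one candidate."""
    values, cands = state
    j = next((k for k, cs in cands.items() if len(cs) == 1), None)
    if j is None:
        return False
    values[j] = cands.pop(j).pop()
    return True

def run_method(state, levels):
    """Apply rules until none fires; `levels` lists groups of rules,
    highest precedence first. Returns the names of the rules fired."""
    trace = []
    while True:
        for level in levels:
            fired = next((rule for rule in level if rule(state)), None)
            if fired is not None:
                trace.append(fired.__name__)
                break              # restart from the highest precedence
        else:
            return trace           # no rule applies: solved or blocked

# Solution grid of Figure 1 with its main diagonal erased: an easy puzzle.
SOLUTION = ("673894512" "912735486" "845612973" "798261354" "526473891"
            "134589267" "469128735" "287356149" "351947628")
EASY = "".join("0" if i % 10 == 0 else ch for i, ch in enumerate(SOLUTION))

state = initial_state(EASY)
trace = run_method(state, [[ecp], [ns]])
print(trace.count("ecp"), "eliminations,", trace.count("ns"), "assertions")
```

More complex rules would be added as further groups with lower precedence, so that they are only tried when nothing simpler applies.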
In this context, a natural question arises: given a set of resolution rules, can
different orderings lead to different puzzles being solved or unsolved? The answer is
in the notion of confluence, to be explained in chapter XXII, where it will be shown
that all the sets of rules introduced in this book have the confluence property and
that the ordering of the rules is therefore irrelevant as long as we are only interested
in solving puzzles; but it is of course very relevant when we also consider the efficiency
of the associated method, e.g. the simplicity of the solution paths.
This abstract property has a very practical meaning for the player: it allows
him/her not to be as systematic in the application of the rules as a machine would
be, without running the risk of being blocked because of missing an elimination he/she
could have made earlier in the resolution process.
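Confluence can be observed experimentally on a small scale, which of course proves nothing. The Python sketch below restricts itself to the elementary rules (with NS assumed to be the usual Naked Single rule) and, at each step, picks at random one of the rule instances currently applicable. Whatever the random seed, the puzzle of Figure 1 ends in the same blocked state. All names and the data layout are our own.

```python
import random

# Puzzle of Figure 1 (Royle17-3), row by row, "0" = empty cell.
PUZZLE = (
    "000000012" "000035000" "000600070"
    "700000300" "000400800" "100000000"
    "000120000" "080000040" "050000600"
)

# For each cell (0..80), the cells sharing a row, a column or a block with it.
PEERS = {
    i: [j for j in range(81) if j != i and
        (j // 9 == i // 9 or j % 9 == i % 9 or
         (j // 27 == i // 27 and (j % 9) // 3 == (i % 9) // 3))]
    for i in range(81)
}

def applicable(values, cands):
    """All rule instances applicable now: ECP eliminations, NS assertions."""
    acts = [("ecp", j, v) for i, v in values.items() for j in PEERS[i]
            if j in cands and v in cands[j]]
    acts += [("ns", j, next(iter(cs))) for j, cs in cands.items()
             if len(cs) == 1]
    return acts

def run_random(puzzle, seed):
    """Apply randomly chosen rule instances until none is applicable."""
    rng = random.Random(seed)
    values = {i: int(ch) for i, ch in enumerate(puzzle) if ch != "0"}
    cands = {i: set(range(1, 10)) for i in range(81) if i not in values}
    while True:
        acts = applicable(values, cands)
        if not acts:
            return values, cands
        kind, cell, value = rng.choice(acts)
        if kind == "ecp":
            cands[cell].discard(value)     # delete one candidate
        else:
            values[cell] = value           # assert a naked single
            del cands[cell]

results = [run_random(PUZZLE, seed) for seed in range(5)]
print(all(result == results[0] for result in results))
```

The intuition is that an elimination that is available at some point remains available later, so skipping it now never prevents it from being made afterwards.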
As its organisation shows, this book is centred on the Sudoku problem itself.
Nevertheless, from the points of view of logic or AI, it can also be considered as a
long exercise in either of these disciplines. So let us clarify the roles we grant them.
Throughout this book, the primary function of logic will be that of a compact
notation tool for expressing the resolution rules in a non ambiguous way and making
explicit the symmetry relationships between them (the simplest and most striking
example of this is the set of rules for Singles in section V.2).
For better readability, the rules we introduce will always be formulated first in
plain English and their validity will only be established by elementary non-formal
means. The non mathematically oriented reader should therefore not be discouraged
by the logical formalism. He can even skip chapters III and IV and the formal
version of each rule that will usually follow its intuitive definition.
Moreover, in the very important case of the various types of chains we shall
consider, the associated rules will always be expressed in an intuitive graphical
formalism (partly inspired from existing informal representations one can find on
Web forums, but also resolutely diverging from them when necessary); it will be
The formalism we use relies effectively on the strictest formal logic and it would
not be very difficult to use it as a basis for formal proofs. From a logical point of
view and given the basic definitions of chapters III and IV, we consider that these
formal proofs are no more than easy exercises for students in logic and we should
not overload this book with them.
Finally, the other role assigned to logic is that of a mediator between the intuitive
formulation of the resolution rules and their implementation in our AI program
(SudoRules, or any other). This is a methodological point for AI (or software engineering
in general): no program development should ever be started before precise
definitions of its components are given (though not necessarily in strict logical
form) – a common sense principle that is very often violated, even by those who
consider it as obvious (this is the teacher speaking)!
Productivity of new rules was tested as soon as they were introduced. Sometimes,
it was very hard to find an example for a rule (such as rules for Naked-,
Hidden- or Super-Hidden- Quadruplets). And sometimes, when no example could
be found, it led to the conjecture, and then to the proof, that the supposedly new rule
was subsumed by (i.e. could be reduced to) simpler ones.
As I said above, this book can also be considered as a (long) exercise in AI.
Many computer science departments in universities have taken Sudoku as a basis for
various projects. My personal experience is that it is a most welcome topic for a project
in computer science or AI.
As can be seen from a fast browsing of this book, many examples are scattered
in every chapter, making nearly a third of the content. This is not only because a
book on Sudoku without a lot of examples would be like a French lunch without
cheese. All our examples satisfy precise functions and their choice is anything but
arbitrary. We decided that each example should:
– be as short as possible,
– illustrate a precise rule,
– prove that the rule it illustrates cannot be reduced to simpler ones (in this
sense, the detailed resolution paths given for all the examples must be considered as
proofs of independence theorems),
– originate from a real puzzle (this may seem an obvious constraint, but one can
find examples on the Web where a partial situation is displayed with no indication
as to its origin; for instance, one can find an example of an xy-chain of length 20;
but I have never seen any real puzzle whose resolution needed to consider such a
long xy-chain).
has not been proven that grids with fewer than seventeen entries cannot have a
unique solution); in order to avoid confusion with the broader notion of minimality
defined below, we call this case 17-minimal; grid number n in this collection is
always named Royle17-n;
– the second, hereafter named the Sudogen0 collection, consists of 10,000
puzzles randomly generated with the C generator suexg (see
http://magictour.free.fr/suexco.txt for a description of the generation principles),
with seed 0 for the random number generator; grid number n in this collection is
always named Sudogen0-n;
– the third, hereafter named the Sudogen17 collection, consists of 10,000
puzzles randomly generated with the same software as above, but using a different
seed (17); grid number n in this collection is always named Sudogen17-n.
All the puzzles in our three test databases are minimal in the following sense
(broader than the one used by Gordon Royle, in that they may have more than
seventeen entries): they have a unique solution and any puzzle obtained from them
by eliminating any one of their entries has more than one solution. For the Royle17
case, this property results from the way the collection was assembled; for the two
Sudogen cases, the property is included in the principles of the generating software.
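The definition of minimality above can be checked mechanically. The following is only a minimal sketch, assuming a naive backtracking counter of solutions (it is not the method used to build the three collections, and it will be slow on real 17-entry puzzles); the names `count_solutions` and `is_minimal` and the test grid `COMPLETE` are illustrative.

```python
from typing import List

Grid = List[List[int]]  # 9x9 grid, 0 marks an empty cell

# A complete valid grid, used here only as test data.
COMPLETE: Grid = [
    [6, 3, 9, 2, 4, 1, 7, 8, 5],
    [2, 8, 4, 7, 6, 5, 1, 9, 3],
    [5, 1, 7, 9, 8, 3, 6, 2, 4],
    [1, 2, 3, 8, 5, 7, 9, 4, 6],
    [7, 9, 6, 4, 3, 2, 8, 5, 1],
    [4, 5, 8, 6, 1, 9, 2, 3, 7],
    [3, 4, 2, 1, 7, 8, 5, 6, 9],
    [8, 6, 1, 5, 9, 4, 3, 7, 2],
    [9, 7, 5, 3, 2, 6, 4, 1, 8],
]


def allowed(grid: Grid, r: int, c: int, n: int) -> bool:
    """True if n does not yet appear in the row, column or block of (r, c)."""
    if any(grid[r][j] == n for j in range(9)):
        return False
    if any(grid[i][c] == n for i in range(9)):
        return False
    br, bc = 3 * (r // 3), 3 * (c // 3)
    return all(grid[i][j] != n
               for i in range(br, br + 3) for j in range(bc, bc + 3))


def count_solutions(grid: Grid, limit: int = 2) -> int:
    """Count solutions by backtracking, stopping once `limit` is reached."""
    for r in range(9):
        for c in range(9):
            if grid[r][c] == 0:
                total = 0
                for n in range(1, 10):
                    if allowed(grid, r, c, n):
                        grid[r][c] = n
                        total += count_solutions(grid, limit - total)
                        grid[r][c] = 0  # undo the trial value
                        if total >= limit:
                            break
                return total
    return 1  # no empty cell left: the grid itself is one solution


def is_minimal(puzzle: Grid) -> bool:
    """Unique solution, and deleting any one entry destroys uniqueness."""
    if count_solutions([row[:] for row in puzzle]) != 1:
        return False
    for r in range(9):
        for c in range(9):
            if puzzle[r][c] != 0:
                trial = [row[:] for row in puzzle]
                trial[r][c] = 0
                if count_solutions(trial) == 1:
                    return False  # this entry was redundant
    return True
```

A complete grid with a single cell erased has a unique solution but is obviously not minimal, since most of its entries can be deleted without losing uniqueness.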
As for the specific examples chosen in this book to illustrate our rules, most of
them draw upon the Royle17 collection. Occasionally, we also take examples from
the Sudogen0 and Sudogen17 collections. The main reason for preferring the Royle17
puzzle database is that showing that there is a 17-minimal puzzle for which a rule
applies is a stronger result than just showing that this rule applies to some grid with
no specific property (but this is not really important for the purposes of this book).
The main reason for also using randomly generated puzzles is to avoid relying on
biased databases when we study global classification results.
Convention 2: let us define the theory (i.e. the set of rules) L1_0 as the union of
all the above elementary (ECP) rules and the semi-elementary rules (Naked Singles
and Hidden Singles – NS and HS) defined in chapter V. It can easily be checked that
the final rules that apply to a puzzle always belong to L1_0, at least when these rules
are given higher priority than more complex ones. Except in chapter V, where they
are introduced, they will always be omitted from the end of the listing of the
resolution path.
Convention 3: it can easily be checked that for most puzzles (due to the fact that
they are minimal), the first rules that can be applied (not mentioning ECP) are semi-
elementary propagation rules (NS and HS) whose direct effect is to add values.
Except in chapter V, we shall replace the initial puzzle P by its "L1_0 elaboration",
i.e. by the puzzle obtained by adding to P all the values asserted by these first semi-
elementary rules. Of course, this puzzle is no longer minimal (but it originates in a
clearly defined minimal one).
Convention 4: since, for complex puzzles, this is sometimes still not enough for
concentrating the listing on the rule we want to illustrate, when necessary we shall
replace the original puzzle by the one obtained after applying rules of higher com-
plexity than the semi-elementary ones (but of lower complexity than the one the
example intends to illustrate). If P is a puzzle and T is a Sudoku Resolution Theory
(i.e. a set of resolution rules), the puzzle obtained from P by applying repeatedly all
the rules in T until none of them can be applied and keeping only the values thus
asserted (i.e. discarding any information on the candidates eliminated) is called the T
elaboration of P. Notice that the T elaboration of P is a real puzzle (although not
minimal). It includes all the consequences of T on P that can be expressed as values.
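For the simplest theory, T = L1_0, the T elaboration can be sketched in a few lines. This is only an illustration under our own assumptions (one assertion applied at a time, candidates recomputed by ECP after each one); the function names are hypothetical and this is not the SudoRules implementation.

```python
from typing import Dict, List, Optional, Set, Tuple

Grid = List[List[int]]  # 9x9 grid, 0 marks an empty cell
Cell = Tuple[int, int]  # (row, column), both 0-based in this sketch


def all_units() -> List[List[Cell]]:
    """The 27 units: 9 rows, 9 columns and 9 blocks."""
    rows = [[(r, c) for c in range(9)] for r in range(9)]
    cols = [[(r, c) for r in range(9)] for c in range(9)]
    blocks = [[(br + i, bc + j) for i in range(3) for j in range(3)]
              for br in (0, 3, 6) for bc in (0, 3, 6)]
    return rows + cols + blocks


def candidates(grid: Grid) -> Dict[Cell, Set[int]]:
    """ECP: an empty cell keeps the values absent from its row, column, block."""
    cands: Dict[Cell, Set[int]] = {}
    for r in range(9):
        for c in range(9):
            if grid[r][c] == 0:
                br, bc = 3 * (r // 3), 3 * (c // 3)
                seen = {grid[r][j] for j in range(9)}
                seen |= {grid[i][c] for i in range(9)}
                seen |= {grid[i][j] for i in range(br, br + 3)
                         for j in range(bc, bc + 3)}
                cands[(r, c)] = set(range(1, 10)) - seen
    return cands


def next_assertion(grid: Grid) -> Optional[Tuple[Cell, int]]:
    """Return one value asserted by a Naked Single or a Hidden Single, if any."""
    cands = candidates(grid)
    for cell, values in cands.items():
        if len(values) == 1:
            return cell, next(iter(values))  # Naked Single
    for unit in all_units():
        for n in range(1, 10):
            places = [cell for cell in unit if n in cands.get(cell, ())]
            if len(places) == 1:
                return places[0], n  # Hidden Single
    return None


def elaborate(puzzle: Grid) -> Grid:
    """L1_0 elaboration: keep only the values asserted, discard candidates."""
    grid = [row[:] for row in puzzle]
    while (step := next_assertion(grid)) is not None:
        (r, c), n = step
        grid[r][c] = n
    return grid
```

The result is again a plain grid of values, which is why the elaboration is itself a real puzzle. For a stronger theory T, only `next_assertion` would have to change, and eliminations would have to be remembered between steps.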
With the above conventions, only the interesting part of the resolution process of
the initial minimal puzzle will be displayed. But, even with the economy resulting
from the above conventions, some traces of resolution processes may be very long.
Most of the time, we shall select examples with short traces, but this will not always
be possible and, in order to help you keep in mind that these examples are somewhat
exceptional and the possibilities are much more varied, especially for rules relative
to long chains, longer examples will also be given from time to time (see for
instance chapters XV or XVIII).
All our examples will respect the following uniform format (Figure 3), except
that, due to page setting constraints, the figure displaying the puzzle, the introduc-
tory text and/or the listing may be inverted.
The first line of the resolution path indicates by which resolution theory T1
(L3_0 in the above example) the elaborated puzzle displayed in the second grid was
obtained and (in parentheses) to which simpler elaboration T0 (L1 in the above
example) it is equivalent. Both statements are useful: the second indicates the mini-
mal theory T0 necessary (which is also the part of T1 effectively used) to get the
elaborated version of P; the first indicates that P cannot be solved in (the stronger)
T1 alone.
This first line of the resolution path also indicates in which theory T (stronger
than T1) the resolution path that follows is obtained (L3_0+XY3 in the example).
Most of the time, T will be of type T1+R, i.e. will be obtained from T1 by adding a
single rule, R. Thus, by showing that P can be solved in T1+R but not in T1 alone,
the example proves that the R rule is not subsumed by (i.e. cannot be reduced to) the
set of rules in T1. This is a very important property, because the converse would
mean that, given T1, R is useless. To express it, we write that P belongs to [T1]+R.
Every example of this form can thus be considered as an independence theorem.
-----------------------------------------------------------------------------------------------------
<Introductory text>
[Three grids side by side: the initial puzzle, its elaboration and the complete grid. Complete grid:]
4 7 5 6 8 2 9 3 1
3 6 1 4 7 9 2 5 8
8 9 2 5 1 3 4 6 7
7 1 3 2 5 4 8 9 6
6 2 4 9 3 8 7 1 5
9 5 8 1 6 7 3 2 4
5 3 9 8 4 1 6 7 2
2 8 6 7 9 5 1 4 3
1 4 7 3 2 6 5 8 9
Resolution path in L3_0+XY3 for the L3_0 (or L1) elaboration of Royle17-186:
xy3-chain {n8 n5}r2c8 — {n5 n4}r3c7 — {n4 n8}r4c7 ==> r4c8 ≠ 8
… (Naked-Singles and Hidden-Singles)
-----------------------------------------------------------------------------------------------------
Then comes the resolution path proper. Each step in the resolution path is the
application of a well defined resolution rule in T to the precisely described and
purely factual situation resulting from the previous rule applications; the resolution
path is thus a proof of the solution within theory T (where "proof" is meant in the
strict mathematical logic sense).
Starting from the elaborated version of the puzzle, only the sequence of non-
obvious resolution steps will be displayed. Each line in the sequence consists of the
name of the rule applied, followed in order by: the description of how the condition
part is satisfied (how the rule is "instantiated"), the "==>" sign, the conclusion(s)
allowed by the "action" part. Details of the "nrc notation" used for the condition part
will be described progressively with each rule we study. The conclusion part is
always either that a candidate can be eliminated, symbolically written as here: r4c8
≠ 8, or that a value must be asserted, written symbolically as e.g. r4c8 = 8. When the
same rule instantiation justifies several conclusions, they will be written on the same
line, separated by commas: e.g. r4c8 ≠ 8, r5c8 ≠ 8.
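Because every line of a resolution path follows this fixed layout, it is easy to read mechanically. The sketch below is only an illustration: the function `parse_step`, the `Conclusion` record and the regular expression are our own assumptions about how such a line could be split, not part of SudoRules, and the condition part is left unparsed.

```python
import re
from typing import List, NamedTuple, Tuple


class Conclusion(NamedTuple):
    row: int
    col: int
    asserted: bool  # True for "rNcN = v", False for "rNcN ≠ v"
    value: int


# One conclusion: a cell name, "=" or "≠", then a value.
_CONCLUSION = re.compile(r"r([1-9])c([1-9])\s*(≠|=)\s*([1-9])")


def parse_step(line: str) -> Tuple[str, List[Conclusion]]:
    """Split a step into its rule name and the list of its conclusions."""
    condition, arrow, action = line.partition("==>")
    if not arrow or not condition.strip():
        raise ValueError("not a resolution step: " + line)
    rule = condition.split()[0]
    conclusions = [Conclusion(int(r), int(c), op == "=", int(v))
                   for r, c, op, v in _CONCLUSION.findall(action)]
    return rule, conclusions


# The single step of the example above; the link sign between cells is
# written with its Unicode escape.
STEP = " \u2014 ".join(
    ["xy3-chain {n8 n5}r2c8", "{n5 n4}r3c7", "{n4 n8}r4c7"]) + " ==> r4c8 ≠ 8"
```

Applied to `STEP`, the function returns the rule name `xy3-chain` and one elimination, that of candidate 8 from r4c8.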
The rule(s) of interest in the path will be displayed in bold characters. In the
above example, there is only one step, the application of the XY3 rule to some
clearly described pattern of cells and values.
The trace of a resolution path will always end with the line "… (Naked-Singles
and Hidden-Singles)" or something similar to remind you of Convention 2.
The above conventions present the following advantages for you, the reader, if you
want to try the examples. First, you may skip the uninteresting parts and start from
the central puzzle; it is not minimal, but it is a real puzzle. Then, most of the time,
the first rule you will have to apply (after the obvious ECP) will be the one studied
in the chapter of the example; and, when this is not the case, the steps you will have
to apply before you reach this rule will be clearly indicated so that you can easily
reproduce them until you reach the pattern of interest. Our examples are designed to
help you detect these patterns but they suppose an active participation on your part:
only the initial values are displayed; it is left to you to apply ECP and the other rules
of the resolution path to reach the desired situation. Occasionally, the detailed situa-
tion at some point in the resolution path (i.e. all the values and candidates present at
this point) will be displayed so that you can directly check the presence of the
pattern under discussion, but, due to space constraints, this cannot be systematic.
Finally, note that all the traces of resolution processes given in this book were
obtained with version 13 of our SudoRules solver (with some hand editing for a
shorter and cleaner appearance), run in the CLIPS 6.24 environment (more on this in
chapter XXI).
FOUNDATIONS
Chapter I
I.1. Symmetries
Throughout this book, the word "symmetry" is used in the general abstract
mathematical sense. A Sudoku symmetry, or symmetry for short, is thus just a trans-
formation that, when applied to any valid Sudoku grid, produces a valid Sudoku
grid. Any composition of symmetries is a symmetry, there is an identity symmetry (that
does not change anything) and every symmetry has an inverse; the symmetries therefore
form a group.
Two grids (completed or not) that are related by symmetry are said to be essen-
tially equivalent. The reason is that when the first is solved, its solution path can be
transposed to solve the other. The abstract notions above become very concrete and
intuitive as soon as a set of generators for the whole group of symmetries is given.
By definition, any symmetry is then composed of a finite sequence of these
generating ones. The simplest set of generators one can consider is composed of two
different types of obvious symmetries (see e.g. [RUS 05]):
– permutations of the numbers: the numerical values of the numbers used to fill
the grid are totally irrelevant; they could indeed be replaced by arbitrary symbols; a
Japaglish word ("Wordoku") has even been invented for the purpose of naming
puzzles to be filled with letters instead of numbers, which is hiding the fact that this
is essentially the same game; keeping numbers from 1 to 9 as symbols, any permuta-
tion of the numbers (which is just a relabelling of the entries) defines a symmetry of
the game; there are obviously 9! = 362,880 such symmetries.
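A permutation of the numbers is easy to try on a complete grid. The following sketch only illustrates the definition; the helper names `relabel` and `is_valid_complete` are ours, and the grid `COMPLETE` is just test data.

```python
import math
import random
from typing import Dict, List

Grid = List[List[int]]  # 9x9 grid, 0 marks an empty cell

COMPLETE: Grid = [
    [6, 3, 9, 2, 4, 1, 7, 8, 5],
    [2, 8, 4, 7, 6, 5, 1, 9, 3],
    [5, 1, 7, 9, 8, 3, 6, 2, 4],
    [1, 2, 3, 8, 5, 7, 9, 4, 6],
    [7, 9, 6, 4, 3, 2, 8, 5, 1],
    [4, 5, 8, 6, 1, 9, 2, 3, 7],
    [3, 4, 2, 1, 7, 8, 5, 6, 9],
    [8, 6, 1, 5, 9, 4, 3, 7, 2],
    [9, 7, 5, 3, 2, 6, 4, 1, 8],
]


def relabel(grid: Grid, perm: Dict[int, int]) -> Grid:
    """Apply a permutation of the numbers 1..9; empty cells (0) are kept."""
    return [[perm.get(v, v) for v in row] for row in grid]


def is_valid_complete(grid: Grid) -> bool:
    """Every row, column and block contains each of 1..9 exactly once."""
    full = set(range(1, 10))
    rows = all(set(row) == full for row in grid)
    cols = all({grid[r][c] for r in range(9)} == full for c in range(9))
    blocks = all({grid[br + i][bc + j] for i in range(3) for j in range(3)} == full
                 for br in (0, 3, 6) for bc in (0, 3, 6))
    return rows and cols and blocks


# One of the 9! relabellings, drawn with a fixed seed, and its inverse.
rng = random.Random(0)
images = list(range(1, 10))
rng.shuffle(images)
perm = dict(zip(range(1, 10), images))
inverse = {image: n for n, image in perm.items()}
```

Relabelling a valid grid always gives a valid grid, and applying `inverse` afterwards restores the original, which is the group property in miniature; `math.factorial(9)` gives the count 362,880 quoted above.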
As of the writing of this book, symmetries have been used mainly to count the
number of essentially non-equivalent grids. Expressed in terms of elementary sym-
metries, two grids (completed or not) are essentially equivalent if there is a sequence
of elementary symmetries such that the second is obtained from the first by appli-
cation of this sequence; this does not entail that they are of "humanly equivalent
difficulty" – whatever intuitive meaning one can associate with this last sentence.
Thus, E. Russell & F. Jarvis have shown in [RUS 05] that the number of non
essentially equivalent complete Sudoku grids is 5,472,730,538 – much less than the
a priori possibly different 6,670,903,752,021,072,936,960 complete grids, but still
enough to spend trying to solve them more of your next lives than you'd need to
reach nirvana. So much more so, considering that the number of essentially different
puzzles may be even greater, its exact value being still unknown. The point is that
each complete grid may be the solution for many different minimal puzzles. For
instance, Gordon Royle has published a grid (displayed in Figure 1) such that there
are 29 puzzles with seventeen entries whose unique solution is this grid.
6 3 9 2 4 1 7 8 5
2 8 4 7 6 5 1 9 3
5 1 7 9 8 3 6 2 4
1 2 3 8 5 7 9 4 6
7 9 6 4 3 2 8 5 1
4 5 8 6 1 9 2 3 7
3 4 2 1 7 8 5 6 9
8 6 1 5 9 4 3 7 2
9 7 5 3 2 6 4 1 8
Later we shall formulate axioms for Sudoku in a logical language and in a way
that exhibits all the previous symmetries. In turn, such symmetries in the axioms
will lead to symmetries in the logical formulation of our resolution rules. But not all
types of symmetries are expressed in the same way in these axioms or rules.
Primary symmetries other than row-column are totally transparent, in that they
make use of variable names (for numbers, rows, columns…) but they refer to no
specific values of these entities.
Meta-theorem 1 (informal): for any valid resolution rule, the rule deduced
from it by systematically permuting the words "row" and "column" is valid and it
obviously has the same logical complexity as the original. We shall express this
as: the set of valid resolution rules is closed under symmetry.
Let the nine rows be numbered 1, 2, …, 9 from top to bottom. Let the nine
columns be numbered 1, 2, …, 9 from left to right. Let the nine blocks and the nine
squares inside any fixed block be numbered according to the same following
scheme:
123
456
789
Coordinates should not be confused with the various names that can be given to
the rows, columns, blocks, squares and cells for displaying purposes. Various
displaying conventions can be used, but we shall systematically stick to the
following classical convention, which we have found most convenient:
– rows are named: r1, r2, r3, r4, r5, r6, r7, r8, r9;
– columns are named: c1, c2, c3, c4, c5, c6, c7, c8, c9;
– cells in natural rc-space are named accordingly, in the obvious way: r1c1,
r1c2, …, r9c9;
– blocks are named: b1, b2, b3, b4, b5, b6, b7, b8, b9;
– squares in a block are named: s1, s2, s3, s4, s5, s6, s7, s8, s9;
– as a result, cells in rc-space can also be named: b1s1, b1s2, …, b9s9;
– when needed, numbers are named n1, n2, n3, n4, n5, n6, n7, n8, n9; this will
be useful in the next chapter when we consider "abstract spaces": row-number,
column-number and block-number and we want to name cells in these spaces: r1n1,
r1n2… in rn-space; c1n1, c1n2,… in cn-space; b1n1, b1n2,… in bn-space; the
reason is that r11, r12… or c11, c12… would be rather obscure and confusing.
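The passage between the two coordinate systems is pure arithmetic. Here is a minimal sketch, assuming the numbering scheme given above (blocks, and squares inside a block, numbered 1 to 9 row by row); the function names are ours.

```python
from typing import Tuple


def rc_to_bs(r: int, c: int) -> Tuple[int, int]:
    """Row-column coordinates (1..9) to block-square coordinates (1..9)."""
    b = 3 * ((r - 1) // 3) + (c - 1) // 3 + 1
    s = 3 * ((r - 1) % 3) + (c - 1) % 3 + 1
    return b, s


def bs_to_rc(b: int, s: int) -> Tuple[int, int]:
    """Block-square coordinates (1..9) back to row-column coordinates."""
    r = 3 * ((b - 1) // 3) + (s - 1) // 3 + 1
    c = 3 * ((b - 1) % 3) + (s - 1) % 3 + 1
    return r, c


def cell_names(r: int, c: int) -> Tuple[str, str]:
    """The two names of the same cell, e.g. ('r4c8', 'b6s2')."""
    b, s = rc_to_bs(r, c)
    return f"r{r}c{c}", f"b{b}s{s}"
```

For instance, `cell_names(4, 8)` gives `('r4c8', 'b6s2')`: row 4, column 8 is the second square of the sixth block. The two conversions have the same form, so each is the inverse of the other.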
Notice that, as the same subscripted lower case letters will be used for variables,
these displaying conventions might lead to some confusion between variables and
constants. But this risk of confusion is very limited: no constant symbol will ever
appear in an axiom or a resolution rule and no variable symbol will ever appear in
the description of any real facts on a real grid.
I.4. Analogies
Analogies should not be confused with symmetries. There are some analogies
between rows and blocks (or between columns and blocks) but there is no real sym-
metry.
This is related to the fact that the two canonical coordinate systems do not share
the same properties with respect to the game of Sudoku. There is a symmetry be-
tween the coordinates in the first system (rows and columns) and, relying explicitly
on this symmetry, many axioms and rules exist by pairs; but there is no symmetry
between the coordinates in the second system (blocks and squares) so that trans-
posing rules from the first system to the second would be meaningless.
There is nevertheless a partial analogy between rows (or columns) and blocks,
captured by the following informal
What the phrases "systematic symmetry between rows and columns" and "pro-
ved without using the row-column symmetry property" mean will be defined preci-
sely in chapter IV.
I.5. Supersymmetries
These abstract spaces, their associated graphical representations and the three
meta-theorems stated in the present chapter will be abundantly illustrated in the
subset rules of chapters VI, VII and VIII where they will be used to show that
apparently complex familiar rules (such as X-wing, Swordfish or Jellyfish) are no
more than the supersymmetric versions of obvious ones (Naked-Pairs, Naked-
Triplets and Naked-Quadruplets, respectively). They will also be the basis for intro-
ducing the notion of hidden chains and associated new resolution rules in chapters
XV, XVIII and XX.
Chapter II
The reason for considering rn-cell with coordinates (r, n) in row-number space is
that it will contain all the possibilities (i.e. all the possible columns) for the unique
instance of number n that must occur in this row r; similarly, the reason for
considering cn-cell with coordinates (c, n) in column-number space is that it will
contain all the possibilities (i.e. all the possible rows) for the unique instance of
number n that must occur in column c; finally, the reason for considering bn-cell
with coordinates (b, n) in block-number space is that it will contain all the possibili-
ties (i.e. all the possible squares) for the unique instance of number n that must
occur in block b.
At any point in the resolution process, all the data on the grid (values and candi-
dates) can be displayed in any of these four representations. We insist that they all
display exactly the same abstract logical information content – or, to say it more
formally: they correspond to the same underlying set of ground atomic formulæ in
the logical language that will be introduced later. They should be considered only as
different visual supports for symmetry, analogy and supersymmetry, in the sense
that it is easier to detect some patterns in some representations than in others, as
many chapters in this book will show. The correspondences are straightforward and
given by the following equivalences:
– number n is in rc-cell (r, c),
– column c is in rn-cell (r, n),
– row r is in cn-cell (c, n),
– square s is in bn-cell (b, n) – where (r, c) = [b, s].
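These four equivalences translate directly into a conversion routine. The sketch below is only an illustration under our own conventions: candidates in rc-space are given as a dictionary from (r, c) to a set of numbers, with all coordinates from 1 to 9, and the name `to_spaces` is hypothetical.

```python
from typing import Dict, Set, Tuple

Space = Dict[Tuple[int, int], Set[int]]


def to_spaces(rc: Space) -> Tuple[Space, Space, Space]:
    """Build the rn-, cn- and bn- representations from the rc- one."""
    rn: Space = {}
    cn: Space = {}
    bn: Space = {}
    for (r, c), numbers in rc.items():
        # (r, c) = [b, s]: block and square of the same cell
        b = 3 * ((r - 1) // 3) + (c - 1) // 3 + 1
        s = 3 * ((r - 1) % 3) + (c - 1) % 3 + 1
        for n in numbers:
            rn.setdefault((r, n), set()).add(c)  # column c is in rn-cell (r, n)
            cn.setdefault((c, n), set()).add(r)  # row r is in cn-cell (c, n)
            bn.setdefault((b, n), set()).add(s)  # square s is in bn-cell (b, n)
    return rn, cn, bn


def signs(space: Space) -> int:
    """Total number of signs (values or candidates) displayed in a space."""
    return sum(len(content) for content in space.values())
```

Each sign of the rc-representation produces exactly one sign in each of the other three, so `signs` returns the same total for all four spaces; this is what "the same information content" means in practice.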
[Four grids: the entries of the puzzle displayed in row-column, row-number, column-number and block-number spaces.]
Figure 1. Same puzzle Royle17-3 as in the introduction, Figure 1, but viewed in the four
different representation spaces
Notice that pseudo blocks (i.e. groups of 3x3 rn-, cn- or bn- cells) have no
meaning in the new rn- or cn- representations (this is why we do not mark them with
thick borders): only constraints on Latin Squares can be directly propagated in row-
number and column-number spaces (as will be proved in chapter IV). And links in
bn-space cannot use the number coordinate.
Let us illustrate our new representations with an example. Starting from the
puzzle in the upper left corner of Figure 1 (puzzle Royle17-3), we can first display
its entries in the standard grid and in the three new grids of the same Figure 1.
[Grid in column-number space: columns c1 to c9 across, numbers n1 to n9 down; each cn-cell lists its candidate rows.]
Figure 2. Same puzzle Royle17-3 as in the introduction, Figure 2, but viewed in number-
column space
Let us apply all the elementary constraints propagation rules to the standard grid
for this puzzle. We get the natural representation of the result in row-column space
(Figure 2 of the introduction). Now, suppose we display the grid thus obtained, with
its candidates, in the full rn- and cn- representations with candidates. Generating
these new grid representations by hand is easy as long as we consider only values, as
in Figure 1, but it requires some care when it comes to the candidates.
Nevertheless, with some practice, it is relatively easy to apply the above stated
equivalences. Moreover, programming a spreadsheet computing the three new grids
and their candidates automatically from the first is an easy exercise.
It becomes obvious that there is a cn-cell (n1c7) with only one possibility left:
the unique instance of number 1 that must appear somewhere in column 7 is in fact
confined to row 8 (i.e. cn-cell n1c7 has only one row candidate: r8).
As an example of the groups of 3x3 cn-cells having no meaning, we can see that
the same candidate (row) appears twice in two of these pseudo-blocks (7, i.e. r7, in
the second upper pseudo-block and 1, i.e. r1, in the third upper pseudo-block).
This is our first case of a "Hidden-Single" in a column. Notice that the phrase
"hidden single in a column" suggests properly that, in column c7, cell r8c7 has a
single possible value but that this fact is not visible by looking only at the candidates
for this cell. Of course, one can also find Hidden-Singles in rows or in blocks.
Actually, our example puzzle Royle17-3 can be solved using only these three types
of Hidden-Singles (in addition to the elementary rules, of course). It shares this
property with a total of 8,051 (among 36,628) puzzles in the Royle17 database –
which also entails that the remaining 28,577 grids in this database cannot be solved
with only these rules.
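Seen from column-number space, detecting a Hidden-Single in a column amounts to finding a cn-cell with a single row left. The sketch below illustrates this on one column; the function name and the candidate data are hypothetical (they are not the actual candidates of Royle17-3).

```python
from typing import Dict, Set


def hidden_singles_in_column(column_cands: Dict[int, Set[int]]) -> Dict[int, int]:
    """Map each number confined to one row of the column to that row.

    `column_cands` gives, for each undecided cell of one column, its row
    (1..9) and its set of candidate numbers after ECP.
    """
    rows_for: Dict[int, Set[int]] = {}  # the cn-cells of this column
    for r, numbers in column_cands.items():
        for n in numbers:
            rows_for.setdefault(n, set()).add(r)
    return {n: next(iter(rows)) for n, rows in rows_for.items() if len(rows) == 1}


# Hypothetical column with four undecided cells: number 1 can only go in
# row 8, although the cell in row 8 still shows three candidates.
EXAMPLE_COLUMN = {1: {4, 9}, 3: {4, 5}, 6: {5, 9}, 8: {1, 4, 5}}
```

On `EXAMPLE_COLUMN` the function returns `{1: 8}`: the single is invisible in the rc-cell of row 8 but obvious in the corresponding cn-cell. The same function applies unchanged to a row or a block if the keys are read as columns or squares.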
Now, a few comments about these new graphical representations are in order.
Should one admit them as an acceptable basis for human solving? What can be
considered as accessible to a human solver? There will probably never be any
general agreement on this point. My personal opinion is that, given the additional
paperwork needed for building and maintaining the four representations in parallel,
they are not very useful for easy grids (and, in particular, for the detection of hidden
subsets); but, one can easily imagine a computerised interface that maintains the
coherency between the four grids (any time you eliminate a candidate from one of
them, this is transferred to the others).
Moreover, there are many difficult puzzles that become easier to solve if we use
such representations (and rules based on them); they therefore seem to be an
inescapable tool for the advanced player. As a result, a significant part of this book
is based on symmetries, supersymmetries and hidden structures (subsets, chains…).
For advanced examples, see chapters XV, XVII and XVIII, where hidden chains of
various types are introduced and shown to be irreducible to non-hidden chains.
As many chapters in this book will show, especially when we deal with chains,
the rn- and cn- spaces will allow us to describe simple patterns and rules that would
need much more complex descriptions if we tried to do it in the standard rc-space. In
order to facilitate their use, the rn- and cn- representations of these spaces can be
grouped with the standard one into the following Extended Sudoku Board. (We
make little use of the bn-representation, although we could, because it is of limited
practical interest; it won’t reappear in the sequel.)
Notice that the rn- and cn- representations do not replace the standard one; they
are added to it, so that the three representations, when placed in the proper relative
positions, form an extended Sudoku board, as given in section 4 below. In order to
avoid confusion between rows, columns and numbers, in this extended board we
shall systematically use their full names: r1, r2,…; c1, c2,…; n1, n2,…
[Extended Sudoku Board, in two parts. First part: the standard rc-grid (rows r1 to r9, columns c1 to c9, each cell listing n1 to n9) with the cn-grid below it (rows n1 to n9, same columns c1 to c9, each cell listing r1 to r9). Second part: the rn-grid (rows r1 to r9, columns n1 to n9, each cell listing c1 to c9).]
In the one page version of this Extended Sudoku Board (available on my Web
pages), the two parts that can only be displayed on two pages here are stuck together
along the central "r1…r9" column. Notice that we also use variants of this extended
board, in which the n, r and c letters inside the sub-boards are not written (there are
only numbers); which form you prefer is only a matter of personal taste. In our
examples, we shall alternate between the two, so that you can get an idea of the
possibilities.
In this section, we describe a simple step by step procedure for building the
graphical rn- and cn- representations of a puzzle in the rn- and cn- spaces, starting
from the standard representation in the rc-space. This procedure can be applied at
any stage in the resolution process. The next section will explain how to use these
representations. The example we shall use throughout the construction procedure is
that of Figure 2 in the introduction.
Consider first row 1 of the rc-representation, in which the numbers in each cell
simply represent the known values or the candidates for this cell:
[Figure: row r1 of the rc-representation; each of the nine rc-cells shows its known value or its remaining candidate numbers.]
and let us build row r1 of the rn-representation, in which the numbers in each cell
will represent the known column or the candidate columns for this cell. This is done
systematically for each rn-cell, from left to right.
[Figure: row r1 of the rn-representation; each of the nine rn-cells shows its known column or its remaining candidate columns.]
As a final check for row r1: there must be exactly the same total number of signs
in the rc- and rn- representations: 34 in this example. There must also be exactly the
same number of decided cells (cells with only one value): 2 in this example. This is
a very effective test in practice (although not a complete proof of correctness).
To complete the rn-representation, merely repeat the same procedure for each
successive row.
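The construction of one rn-row and its final check can be sketched in a few lines of Python. This is only an illustration: the function names and the sample row are assumptions of ours (the sample is not the row of Figure 2), and each cell is simply modelled as a set of signs.

```python
def rc_row_to_rn_row(rc_row):
    """Build row r of the rn-representation from row r of the rc-representation.

    rc_row[c - 1] is the set of numbers (value or candidates) in rc-cell (r, c).
    The result rn_row[n - 1] is the set of columns (value or candidates) in rn-cell (r, n).
    """
    rn_row = []
    for n in range(1, 10):  # rn-cells are processed from left to right: n = 1 .. 9
        columns = {c for c in range(1, 10) if n in rc_row[c - 1]}
        rn_row.append(columns)
    return rn_row


def count_signs(row):
    """Total number of signs written in a row."""
    return sum(len(cell) for cell in row)


def count_decided(row):
    """Number of decided cells (cells with only one sign)."""
    return sum(1 for cell in row if len(cell) == 1)


# Hypothetical row, invented for this sketch.
rc_row = [{1}, {2}, {3, 4}, {3, 4, 5}, {5, 6}, {6, 7}, {7, 8, 9}, {8, 9}, {7, 9}]
rn_row = rc_row_to_rn_row(rc_row)

# Final check: same number of signs and same number of decided cells.
same_signs = count_signs(rc_row) == count_signs(rn_row)
same_decided = count_decided(rc_row) == count_decided(rn_row)
```

The equality of the sign counts always holds, since both rows describe the same set of (number, column) pairs; the equality of the decided-cell counts presupposes that the elementary eliminations have already been made, as in the sample row.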
For the cn-representation, the procedure goes along the same lines as for rows in
the rn-representation. Instead of building rows one by one from top to bottom, we
build columns one by one, from left to right. We leave the details to the reader. The
full result for our example has already been given in Figure 2 above.
As was the case for each row, there is a final check for each column: there must
be exactly the same total number of signs in the rc- and cn- representations; there
must also be exactly the same number of decided cells.
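For readers who prefer to let a program do the bookkeeping, here is a minimal sketch in which the three representations are three groupings of one and the same set of triples (n, r, c). The triple encoding, the function names and the sample data are assumptions made for this illustration only.

```python
from collections import defaultdict


def build_views(triples):
    """Group (n, r, c) triples into the rc-, rn- and cn-representations."""
    rc, rn, cn = defaultdict(set), defaultdict(set), defaultdict(set)
    for n, r, c in triples:
        rc[(r, c)].add(n)  # rc-cell (r, c) contains numbers
        rn[(r, n)].add(c)  # rn-cell (r, n) contains columns
        cn[(c, n)].add(r)  # cn-cell (c, n) contains rows
    return rc, rn, cn


def signs_in_column(rc, cn, c):
    """Number of signs of column c in the rc- and in the cn-representation."""
    in_rc = sum(len(nums) for (_, col), nums in rc.items() if col == c)
    in_cn = sum(len(rows) for (col, _), rows in cn.items() if col == c)
    return in_rc, in_cn


# Hypothetical fragment of a puzzle: a few values or candidates in columns 1 and 2.
triples = {(4, 1, 1), (7, 1, 1), (4, 2, 1), (7, 3, 1), (9, 3, 1), (4, 1, 2)}
rc, rn, cn = build_views(triples)
column_check = signs_in_column(rc, cn, 1)
```

With this encoding the per-column check on the number of signs holds by construction, which is why it is so effective at catching copying errors in a hand-made representation.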
Deciding how to use the auxiliary representations in rn- and cn- spaces is a
different question from how to build them. One could say that it defines a strategic
level in relation to these representations.
What is most time consuming with the rn- and cn- representations of a puzzle is
building them for the first time. Moreover, the more candidates remain on the
standard grid, the longer it takes (for a human being) to generate these auxiliary
representations. Therefore, it may be wise to delay their generation until you are blocked.
This is all the wiser since simple hidden patterns such as subsets can be found directly
on the standard grid and the auxiliary representations are useful mainly for complex
chain patterns. The example given in section 3 is probably the most complex you
will ever see; most of the time, there will be far fewer candidates left when you
start building the auxiliary rn- and cn- representations.
Delaying the building and use of these representations will entail that, before
generating them, you may have searched for patterns in rc-space more complex than
those you would have searched for in the auxiliary spaces, had you generated them
sooner. But our confluence theorem in chapter XXII shows that this cannot prevent
you from reaching the solution. The only consequence of this strategy is that you
won't classify the puzzle at the same level as you would if you had used the
auxiliary representations from the start, but this does not matter as long as you are only
concerned with solving the puzzle.
Once the rn- and cn- representations are built, maintaining them is worthwhile and
easy, provided that this is done systematically, so that you don't have to build them
several times from scratch.
In this process, do not forget the elementary constraint propagation rules that
apply after any value is asserted. In particular, do not forget that block constraints
must be applied only in the rc-representation and that the eliminations they entail
must be transferred to the rn- and cn- representations (remember that there is no
constraint on the pseudo-blocks in these auxiliary representations). Row and column
constraints can indifferently be applied directly in the three representations or
transferred from one to the others.
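The propagation that follows the assertion of a value can be sketched as below. We assume, for this illustration only, that the candidates are stored as a single set of (n, r, c) triples (so that any elimination is automatically visible in the rc-, rn- and cn- views) and that blocks are numbered 1 to 9 in reading order; the function names are ours.

```python
def block_of(r, c):
    """Block containing cell (r, c), blocks being numbered 1..9 in reading order (assumed)."""
    return 3 * ((r - 1) // 3) + (c - 1) // 3 + 1


def assert_value(candidates, n, r, c):
    """Return the triples (n, r, c) that survive once cell (r, c) is given value n."""

    def survives(triple):
        n2, r2, c2 = triple
        if (n2, r2, c2) == (n, r, c):
            return True   # the asserted value itself is kept
        if (r2, c2) == (r, c):
            return False  # other numbers in the same rc-cell
        if n2 == n and r2 == r:
            return False  # same number elsewhere in the row (same rn-cell)
        if n2 == n and c2 == c:
            return False  # same number elsewhere in the column (same cn-cell)
        if n2 == n and block_of(r2, c2) == block_of(r, c):
            return False  # same number elsewhere in the block (an rc-space constraint)
        return True

    return {t for t in candidates if survives(t)}


# Start from the empty grid (all 729 triples) and assert value 5 in cell (1, 1).
everything = {(n, r, c) for n in range(1, 10) for r in range(1, 10) for c in range(1, 10)}
after = assert_value(everything, 5, 1, 1)
```

On paper the block eliminations have to be copied by hand into the rn- and cn- grids; with a single set of triples this transfer costs nothing, which is the only point of the design choice made here.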
Chapter III. Grid Theory and Sudoku Theory
Although this book may be used as a support for exercises in Logic or AI, and
must therefore adopt a clear and unambiguous formalism, it is not intended as an
introductory textbook on these topics and we want it to be readable by Sudoku
addicts. If you are not mathematically oriented, you should not be discouraged by
the formalism introduced here: apart from the proofs of the (very important)
meta-theorems 1, 2 and 3 (in chapter IV) and some local remarks, it will mainly be used
as a very compact notation tool for writing Sudoku resolution rules. In any case, all
these rules will first be formulated in plain English, so that you will always be able
to skip the logical version if you are definitely allergic to formalism. Moreover,
most of the resolution rules (and, in particular, the chain rules of all the types
considered in this book) will also be displayed in intuitive graphical representations
(that will be shown to be strictly equivalent to logical formulæ). As for GT and ST,
you can consider them as completely obvious from an intuitive point of view (and
skip this chapter and the next or keep them for further reading).
III.1.2. Sorts
In this book, "Number" will always mean "integer between 1 and 9". "Number"
is the type of the objects intended to fill up the cells of a grid; when we need to refer
to other kinds of numbers, we shall use their usual specific mathematical type: for
instance, integers from 0 to infinity are called integers. The subscripts appearing in
variables of any type are integers, not Numbers; this distinction will be important
later, because we shall have to consider "chains" of length greater than 9. We have
chosen to introduce the sort Number, because Sudoku is generally expressed in
terms of digits, but one could introduce instead a sort Symbol, with nine arbitrary
constant symbols.
Attached to each sort, there are two sets of symbols, one for naming constant ob-
jects of this sort, and one for naming variables of this sort. In the GT case, we have:
– Number:
- constant symbols: 1n, 2n, 3n, 4n, 5n, 6n, 7n, 8n, 9n
- variable symbols: n, n’, n’’, n0, n1, n2, …
– Row:
- constant symbols: 1r, 2r, 3r, 4r, 5r, 6r, 7r, 8r, 9r
- variable symbols: r, r’, r’’, r0, r1, r2, …
– Column:
- constant symbols: 1c, 2c, 3c, 4c, 5c, 6c, 7c, 8c, 9c
- variable symbols: c, c’, c’’, c1, c2, …
– Block:
- constant symbols: 1b, 2b, 3b, 4b, 5b, 6b, 7b, 8b, 9b
- variable symbols: b, b’, b’’, b0, b1, b2, …
– Square:
- constant symbols: 1s, 2s, 3s, 4s, 5s, 6s, 7s, 8s, 9s
- variable symbols: s, s’, s’’, s0, s1, s2, …
– Unit-Type (a unit-type is one of the three symbols: row, col, blk):
- constant symbols: row, col, blk
- variable symbols: ut, ut’, ut’’, ut0, ut1, ut2, …
In MS-FOL, the sets of constants of different sorts must not overlap (for
instance, 1 is not the same thing if it designates a row or if it designates a column);
we have therefore introduced different symbols for constants of different sorts (for
instance: 1n, 1r, 1c, …); however, most of the time, we shall be very lax on these
symbols: whenever the sort of a constant symbol is clear from the context, i.e. from
the predicate in whose scope it appears, we shall drop the subscripts.
Notice that we introduce a sort for unit types but no sort for units: a unit can only
be a row, a column or a block (there are 27 units, but we shall never need to refer to
the set of units as such), whereas a unit type can only be one of three formal
symbols: row, col or blk.
For constants of any fixed sort, we adopt a unique names assumption: two
different constant symbols do not designate the same entity. For each of the first five
sorts, this amounts to adding the following thirty-six axioms (with subscripts appro-
priate to each sort):
2≠1,
3≠1, 3≠2,
…,
8≠1, 8≠2, 8≠3, 8≠4, 8≠5, 8≠6, 8≠7,
9≠1, 9≠2, 9≠3, 9≠4, 9≠5, 9≠6, 9≠7, 9≠8.
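The count of thirty-six is simply the number of unordered pairs among nine constants. As a quick check, the following sketch generates the axioms for one sort; the textual encoding of the axioms and the function name are assumptions made for this illustration.

```python
from itertools import combinations


def unique_names_axioms(sort_suffix):
    """List the unique names axioms for one sort, e.g. '2r≠1r' for rows."""
    constants = [f"{k}{sort_suffix}" for k in range(1, 10)]
    # One inequality per unordered pair, written 'larger ≠ smaller' as in the list above.
    return [f"{b}≠{a}" for a, b in combinations(constants, 2)]


row_axioms = unique_names_axioms("r")
# Number, Row, Column, Block and Square together.
total_for_five_sorts = sum(len(unique_names_axioms(s)) for s in "nrcbs")
```

Here `row_axioms` contains 36 inequalities, and the five sorts together account for 180 of them.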
Alternatively, for the sorts Number, Row, Column, Block and Square, we could
have introduced binary predicates <n, <r, <c, <b and <s, together with the general
axioms for a total order, plus the specific axioms:
1 < 2 < 3 < 4 < 5 < 6 < 7 < 8 < 9 (with appropriate subscripts for each sort).
Let us call OGT (ordered GT) the theory thus obtained. Using OGT instead of
GT as the basic theory for grids does not make any difference for Sudoku Theories:
due to the symmetries of Sudoku explained in chapter I, none of the resolution rules
will depend on these orderings. In the sequel, we shall use GT, but, if you prefer,
you can replace GT everywhere with OGT.
As for the variable symbols, they explicitly carry their sort with the first letter of
their name, so that they can be used straightforwardly in quantifiers or equality with
no more specification. For instance:
– ∀r always means "for all rows r",
– ∀c always means "for all columns c",
– ∃n always means "there exists a number n",
– = can only be used with objects of the same sort, so that writing r = c is not
allowed; to be more formal, the = sign should also be subscripted according to the
type of objects it relates; for instance, to assert that two rows r1 and r2 are equal, we
should use a specific equality symbol =r and write r1 =r r2 (but we shall be lax on this
notation also, since no confusion can arise from it).
Any time such shorthands appear in the text, it is left as an exercise for the
reader to write the corresponding full formula that does not use them.
In GT, in addition to the six equality predicate symbols associated with the
above-defined sorts, we have only one primary predicate symbol (we call it primary, to
distinguish it from the auxiliary predicate symbols that will be defined later):
– correspondence, with arity 4 and signature (Row, Column, Block, Square).
Just to make things more intuitive about the atomic formulæ built on this predicate
(although this anticipates the axioms of GT), let us say that the intended meaning
of correspondence(r, c, b, s) is that (r, c) and [b, s] are the coordinates of the same
cell in the two canonical coordinate systems of natural rc-space: row-column and
block-square.
Of course, the intended meaning of this predicate is that suggested by its name: the
two cells share either a row or a column or a block; notice that a cell is not
considered as sharing a unit with itself.
Notice that, since two cells can share two units, it would not be possible to replace
this predicate by a function that would take as input the coordinates of the two cells
and produce as a result a unit shared by them.
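The last remark can be made concrete with a small sketch that returns the set of unit types shared by two cells. The function names and the block numbering (1 to 9 in reading order) are assumptions of this illustration; they are not the formal predicate itself.

```python
def block_of(r, c):
    """Block containing cell (r, c), blocks being numbered 1..9 in reading order (assumed)."""
    return 3 * ((r - 1) // 3) + (c - 1) // 3 + 1


def shared_unit_types(cell1, cell2):
    """Return the set of unit types ('row', 'col', 'blk') shared by two distinct cells."""
    if cell1 == cell2:
        return set()  # a cell is not considered as sharing a unit with itself
    (r1, c1), (r2, c2) = cell1, cell2
    shared = set()
    if r1 == r2:
        shared.add("row")
    if c1 == c2:
        shared.add("col")
    if block_of(r1, c1) == block_of(r2, c2):
        shared.add("blk")
    return shared


two_units = shared_unit_types((1, 1), (1, 2))  # same row and same block
one_unit = shared_unit_types((1, 1), (2, 2))   # same block only
no_unit = shared_unit_types((1, 1), (5, 5))    # nothing in common
```

Since the result can contain two elements, a function returning "the" shared unit would not be well defined, whereas a predicate (here, the non-emptiness of the returned set) is.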
III.1.4. Formulæ
– if F is a formula, and x is a variable of any sort, then ∀xF and ∃xF are
formulæ.
The auxiliary predicates in the previous section are defined from open formulæ
(without quantifiers, except "same-block").
Notice that, as a result of these axioms, all the sorts under consideration in Grid
Theory (and in the forthcoming Sudoku Theory and Sudoku Resolution Theories)
have finite domains. We could therefore have chosen to write all these theories in
the formalism of Propositional Calculus (i.e. zero order logic) instead of Predicate
Calculus (i.e. first order logic, be it multi-sorted or not). But the choice of MS-FOL,
which appears to be a mere notational convenience from a theoretical standpoint, makes a
major difference from a practical standpoint: Grid axioms, Sudoku axioms and
resolution rules will be much more compact in MS-FOL. Nevertheless, the major
argument for this choice is that it allows the writing of explicit generalisations.
That is, for every cell with coordinates (r°, c°) and [b°, s°] in the two canonical
coordinate systems – which supposes that [b°, s°] = F(r°, c°) –, we assert the axiom:
correspondence(r°, c°, b°, s°). (In this formulation, r°, c°, b°, and s° are considered
as meta-variables for arbitrary but constant values).
It can easily be checked that these 81 axioms, together with the above 189 sort
axioms, are enough to fix the "geometrical" structure of the grid, up to isomorphism,
i.e. up to the geometrical symmetries explained in chapter I, allowing one to display
it in the usual graphical representation.
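As an illustration, the 81 correspondence axioms can be enumerated mechanically once a coordinate change F is fixed. The formula used below (blocks, and squares inside a block, both numbered 1 to 9 in reading order) is one conventional choice that we assume for this sketch; the function name is ours.

```python
def to_block_square(r, c):
    """One possible F: (row, column) -> [block, square], both numbered in reading order."""
    b = 3 * ((r - 1) // 3) + (c - 1) // 3 + 1
    s = 3 * ((r - 1) % 3) + (c - 1) % 3 + 1
    return b, s


# One ground atom correspondence(r, c, b, s) per cell of the grid.
correspondence_axioms = [
    (r, c, *to_block_square(r, c)) for r in range(1, 10) for c in range(1, 10)
]

# F must be a bijection between the two coordinate systems.
distinct_block_squares = {(b, s) for (_, _, b, s) in correspondence_axioms}
```

The list has 81 entries and the 81 pairs [b, s] are all distinct, as required for two coordinate systems on the same set of cells.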
The Grid Theory defined above can be simplified according to the following
principles:
– forget the sorts Block and Square,
– forget the primary predicate "correspondence",
– forget all the axioms referring to the above sorts or predicates (including all
the correspondence axioms).
Then we obtain a theory of grids that does not mention blocks: LSGT.
Proof: the proof involves some easy but tedious technicalities concerning the
correspondence between theories in MS-FOL and in FOL (along the lines of [MEI
93]). Given a model of GT, just forget anything about blocks and squares to get a
model of LSGT. Conversely, given a model of LSGT, the key is that the correspon-
dence axioms can be used to define new predicates for blocks and squares and that
these predicates can, in turn, be used to introduce the new sorts Block and Square.
Details of the proof are left as an exercise to the motivated reader.
Why do we introduce this simplified grid theory LSGT? The main reasons will
appear in the next chapter:
– by design, LSGT is block-free (in the sense defined in section IV.5.1);
– as the constraints on Latin Squares do not refer to blocks, LSGT is necessary
and sufficient to deal with the theory of Latin Squares;
– the theory of Latin Squares is very precisely related not only to Sudoku
Theory (this is obvious) but also to Sudoku Resolution Theories and to meta-
theorem 3.
ST has the same sorts and the same absence of functions as GT.
In ST, in addition to those already in GT, we have only one more predicate sym-
bol (we call it primary, to distinguish it from the auxiliary predicate symbols defined
later):
– value, with arity 3 and signature (Number, Row, Column).
Just to make things more intuitive about the atomic formulæ built on this predicate
(but this is anticipating on the axioms of ST), let us say that the intended meaning of
value(n, r, c) is that number n is the value of cell (r, c) in natural row-column space;
this is equivalent to saying that column c is the value of rn-cell (r, n) in rn-space, or
that row r is the value of cn-cell (c, n) in cn-space.
We also introduce the following additional auxiliary predicate; in its loose nota-
tion, it will be very useful to exhibit analogies:
– value’, with signature (Number, Block, Square);
value’(n, b, s) is defined as a shorthand for:
∃r∃c [correspondence(r, c, b, s) & value(n, r, c)];
value’(n, b, s) will generally be loosely written as value[n, b, s];
the intended meaning of value[n, b, s] is that number n occupies cell [b, s] in rc-
space, in block-square coordinates.
The only point in stating the axioms of ST is that we must be careful if we want
to guarantee the best possible proximity with the resolution theories to be defined
later. For instance, if we write that there must be one value for each cell (in fine an
inescapable condition of the problem), this precludes all intermediate states from
satisfying this axiom; we therefore try to limit the number of such assertions: indeed
it will appear in only one axiom (ST5). All the other general conditions in the state-
ment of the problem can be expressed as "single occupancy" or "mutual exclusion"
axioms – this is why, anticipating on the present formalisation, we adopted the first
formulation of the game in the introduction.
ST is defined as the specialisation of GT (i.e. it has all the axioms of GT) with
the following additional axioms. These axioms are the quasi direct transliteration of
the English formulation of the problem, as given in the introduction:
– ST1: in natural row-column space, every rc-cell has at most one number as its
value (i.e. given any cell, it can have at most one value):
– ST2: in abstract row-number space, every rn-cell has at most one column as its
value (i.e. given a number, in any given row it can appear in at most one column):
– ST3: in abstract column-number space, every cn-cell has at most one row as
its value (i.e. given a number, in any given column it can appear in at most one
row):
– ST4: in abstract block-number space, every bn-cell has at most one square as
its value (i.e. given a number, in any given block it can appear in at most one
square):
At this point, it is important to notice that the first three of these axioms exhibit
the symmetries and supersymmetries reviewed in chapter I (and they are block-free
according to the definition in forthcoming section IV.5), while the fourth exhibits
analogy with the second and the third (and it is not block-free).
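For reference, the English wording of ST1 to ST4 can be transcribed into MS-FOL along the following lines. This is only one possible rendering, written by us from the verbal statements and from the predicates value and value[n, b, s] introduced above; equivalent formulations exist.

```latex
\begin{align*}
\text{ST1:}\quad & \forall r\,\forall c\,\forall n_1\,\forall n_2\;
  \bigl[\mathrm{value}(n_1, r, c) \wedge \mathrm{value}(n_2, r, c) \Rightarrow n_1 = n_2\bigr] \\
\text{ST2:}\quad & \forall r\,\forall n\,\forall c_1\,\forall c_2\;
  \bigl[\mathrm{value}(n, r, c_1) \wedge \mathrm{value}(n, r, c_2) \Rightarrow c_1 = c_2\bigr] \\
\text{ST3:}\quad & \forall c\,\forall n\,\forall r_1\,\forall r_2\;
  \bigl[\mathrm{value}(n, r_1, c) \wedge \mathrm{value}(n, r_2, c) \Rightarrow r_1 = r_2\bigr] \\
\text{ST4:}\quad & \forall b\,\forall n\,\forall s_1\,\forall s_2\;
  \bigl[\mathrm{value}[n, b, s_1] \wedge \mathrm{value}[n, b, s_2] \Rightarrow s_1 = s_2\bigr]
\end{align*}
```

Written this way, ST2 and ST3 are obtained from ST1 by permuting the roles of n, r and c, and ST4 has the same shape as ST2 and ST3 with [b, s] in place of (r, c).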
Notice that this notion of a proof puts strong restrictions on how the axioms in T
can be used to prove theorems.
From a logical point of view, the set ST of axioms is necessary and sufficient to
define the Sudoku problem: given any puzzle P (with axiom EP corresponding to its
entries) and any complete grid G, the following are equivalent:
– G is compatible with the entries for P and it satisfies ST, i.e. G is a solution for
P in the intuitive sense;
– G is a model (in the standard sense of mathematical logic) of ST ∪ {EP};
– G satisfies the axioms in ST ∪{EP}.
ST is therefore theoretically perfect: for any puzzle, its formal and intuitive
meanings coincide. The only problem with it is practical: it does not tell us how to
build a complete grid.
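Checking that a given complete grid is a solution is, by contrast, entirely mechanical. The sketch below assumes a grid stored as nine lists of nine numbers and the entries stored as a dictionary from (r, c) to the given number; these encodings, the function names and the sample grid are ours.

```python
def satisfies_st(grid):
    """Check that every row, column and block of a complete grid contains 1..9 exactly once."""
    full = set(range(1, 10))
    rows = [set(row) for row in grid]
    cols = [{grid[r][c] for r in range(9)} for c in range(9)]
    blocks = [
        {grid[3 * br + dr][3 * bc + dc] for dr in range(3) for dc in range(3)}
        for br in range(3)
        for bc in range(3)
    ]
    return all(unit == full for unit in rows + cols + blocks)


def is_solution(grid, entries):
    """entries maps (r, c), 1-based, to the given number (the role of the entries axiom)."""
    agrees = all(grid[r - 1][c - 1] == n for (r, c), n in entries.items())
    return agrees and satisfies_st(grid)


# A made-up complete grid obtained by cyclic shifts (not taken from this book).
grid = [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)] for r in range(9)]
entries = {(1, 1): grid[0][0], (5, 5): grid[4][4]}
ok = is_solution(grid, entries)
```

Such a checker decides whether a complete grid satisfies the constraints, but, exactly as remarked above for ST, it gives no indication of how to build one.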
From an operational point of view, the first four axioms (ST1 to ST4) could be
considered as contradiction detection rules. For instance, axiom ST1 could be re-
written in the following operational form: if, at some point in the resolution process
of a puzzle, we reach a situation in which two different values should be assigned to
the same cell, then we can conclude that the puzzle has no solution (the entries of
the puzzle are contradictory with the axioms). Similar reformulations could be
obtained for axioms ST2 to ST4. They are, somehow, operational forms of these
axioms. But do these forms express all the operational consequences of the original
formulæ? Actually, the developments in chapter IV will show that they don’t (and
they are indeed very far from doing so). The situation for axiom ST5 is still worse,
since it says nothing about how it can be used in practice.
Vague as this may remain, let us define the aim we shall pursue with Sudoku
Resolution Theories: replace the above axioms by another set of axioms that could
be easily interpreted as (or transformed into) a set of operational rules for building a
solution. And, since most known resolution rules are based on the notion of a candi-
date and on the progressive elimination of candidates, we want to write rules expli-
citly designed for this purpose. The problem is that, unless one admits recursive
Trial and Error (which is not a rule), no theory of this kind is known that would be
equivalent to ST.
Notice that, given any puzzle P, the axioms of ST together with EP a priori
imply neither the existence nor the uniqueness of a solution for P. Concerning the
existence, this may seem to contradict axiom ST5, but ST5 only puts a condition on
a solution, it does not assert that there is a solution (i.e. that EP is consistent with
ST). Indeed, any axiom that would assert the existence of a solution for any P would
be trivially inconsistent. Moreover, no set of a priori conditions on the entries of a
puzzle P is known that would ensure that P has a solution (at least one). Obviously,
some trivial necessary conditions for existence can be written (such as not having
the same entry twice in a row, a column or a block) but they are very far from being
sufficient.
As for uniqueness, for any puzzle P and corresponding axiom EP, it could be
expressed by one more axiom:
– ST6: ST ∪{EP} has at most one solution:
There are famous examples of puzzles that have been proposed and asserted as
having a unique solution and that in fact have several. Many of the resolution
rules that have been proposed to take uniqueness into account are used inconsistently
to conclude that some puzzle has a unique solution. Moreover, the uniqueness
of a solution for a given puzzle can be asserted only if it has already been
proven – which supposes that there exists some means for proving it. In our approach,
we shall never take the uniqueness of a solution for granted and we therefore
do not adopt axiom ST6. As a consequence, no resolution rule based on the assumption
of uniqueness will ever be written in this book (except in chapter XXII). The
classification results given in chapter XXI show that 97% of the randomly generated
puzzles (and 99.7% of the Royle17 database) can be solved without the assumption
of uniqueness (and without the infamous Trial and Error) using only 2D chains and
the further results given in Part Four show that this proportion rises to at least
99.99% if we use 3D chains.
Chapter IV. Sudoku Resolution Theories
As the first illustration of our logical formalism, section 3 introduces our mini-
mal Resolution Theory (Basic Sudoku Resolution Theory or BSRT for short) and
expresses its axioms in this formalism. Here, "minimal" means that all the other
resolution theories introduced in this book will be extensions of BSRT with addi-
tional axioms (logically speaking, they will therefore be specialisations of BSRT).
Finally, section 4 introduces several notions (block-free and block-positive formu-
læ…) and applies them to prove the formal versions of meta-theorems 1, 2 and 3
and an extension of the last. This extension will be very useful when we want to
apply it in practice.
The only place where the logical formalism of this chapter is explicitly necessary
is in proving the validity of our most powerful tools: meta-theorems 1, 2 and 3, as
explained in section 2. If you want to understand the proofs of the general meta-
theorems that will be frequently used in this book to assert new rules (especially
new chain rules), you must also understand the notion of a block-free formula
introduced in this chapter. Nevertheless, if you do not understand these proofs, you
can also consider the meta-theorems as simple heuristics suggesting new potential
rules and you can prove directly all the rules we shall deduce from them (this will
generally be very easy).
As our first approximation, we could say that Sudoku Theory is about what we
want (a complete grid satisfying the general Sudoku constraints and the specific data
entries), with no consideration at all for the way it can be obtained, whereas a Sudo-
ku Resolution Theory is about how we can reach the desired final state; but then we
must correct the resulting erroneous suggestion that a theory of this second kind
would be mainly concerned with resolution processes.
associated with the method, in which case one could use temporal or dynamic logic
for modelling them. This is not the point of view chosen in this book, where we con-
sider a resolution method from the point of view of the "knowledge states" under-
lying it and we adopt epistemic (modal) logic to model these. Whereas the main part
of this book deals with Resolution Theories, a problem (confluence) specific to reso-
lution methods based on them will be tackled in chapter XXII.
Then from a logical standpoint, the only purpose of a resolution theory is to res-
trict the number of knowledge states compatible with the axioms (i.e. the number of
partial solutions, expressed in terms of values and candidates) and the relationships
that exist between them. From an operational standpoint, it can be used as a refe-
rence for defining a resolution method that will dynamically modify the current
information content; but, before a resolution theory can be used this way, there must
be some operationalisation process. This distinction is essential (and very classical
in Artificial Intelligence) because a given logical axiom (taken from a resolution
theory) can be operationalised in different ways. (To be more specific: it can be
expressed in the form of different rules in an inference engine.)
Notice that all the theories of interest should be restricted to satisfy two obvious
general requirements:
– they are naturally compatible with the general symmetries analysed in chapter
I (of course, the meaning of "naturally" must be further specified),
– they can apply to any set of grid entries.
But this is far from being enough to constrain the possible theories of interest.
The analyses in this section constitute the central part of this chapter and they are
the key to understanding the logical foundations of this book: given that the naive
notion of a candidate is at the basis of the various popular resolution rules and of the
formulation of any resolution method, can one grant it a well defined logical status?
Let us first clarify the following point. One apparent problem in choosing the
notion of a candidate as the basis for a logical formulation is that the set of candi-
dates for any cell is monotonically decreasing throughout the resolution process,
whereas logic is usually associated with monotonically increasing sets: starting from
what is initially assumed to be true (the axioms), each step in a proof adds new
assertions to what has been proved to be true in the previous steps; there is no possi-
bility in standard logic for removing anything.
What is really important in logic is that the abstract information content is mono-
tone increasing throughout the resolution process. (In other terms, one should not
confuse this information content with possibly varied representations of it.) In the
sequel, when we write resolution rules, we shall sometimes conform to the Sudoku
literature usage and refer to candidates, but we must keep in mind that, when expres-
sed with not-candidates, the underlying logic is always monotone increasing. To
eliminate any ambiguity: as long as we are concerned with the logical foundations
of our theories, the notion (and the predicate) we shall consider as primary is that of
a not-candidate, but in practice our rules will also mention the usual auxiliary predi-
cate "candidate" (whose precise definition will be given below).
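The duality between decreasing candidates and increasing not-candidates is easy to visualise with sets. In the following sketch, the triple encoding (n, r, c), the names and the three stages are invented for the illustration.

```python
from itertools import product

ALL_TRIPLES = frozenset(product(range(1, 10), repeat=3))  # every (n, r, c)


def not_candidates(candidates):
    """The not-candidates are the complement of the candidates."""
    return ALL_TRIPLES - candidates


# Three successive (hypothetical) stages of a resolution: candidates only disappear.
stage0 = ALL_TRIPLES
stage1 = stage0 - {(5, 1, 2), (5, 1, 3)}
stage2 = stage1 - {(7, 4, 4)}
stages = [stage0, stage1, stage2]

candidates_decrease = all(b <= a for a, b in zip(stages, stages[1:]))
not_candidates_increase = all(
    not_candidates(a) <= not_candidates(b) for a, b in zip(stages, stages[1:])
)
```

Both flags are true: the two descriptions carry the same information, but only the second one grows monotonically, as a set of proven facts is expected to.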
spontaneously: a value is asserted as being true, while a candidate is known (or not
known) to be incompatible with all that is already known. One way to interpret this
is as an indication that the underlying logic of any Sudoku Resolution Theory based
on candidates should be epistemic: it should be a logic of knowledge as opposed to a
logic of truth (such as standard logic or MS-FOL).
Before entering into the formal details, let us define the notions of a knowledge
state and of an epistemic model. Defining the model theoretic aspects before the
syntactic aspects is not the usual way to proceed in logic, but it is more intuitive.
Since, in fine, "value" and "candidate" will be defined as having an epistemic
content and will appear as auxiliary predicates, let us adopt two new primary
predicates "value°" and "cand°" intended to express the simple truth of an atomic
fact.
Notice that any knowledge state is a finite set and the whole set KS is therefore
finite (and independent of any particular puzzle) although very large.
Thus, the intended meaning of KS1 ≤ KS2 is that when one passes from one
knowledge state to a "greater" one (according to this order relation), the information
content can only increase – the deletion of a candidate being considered as an
increase of this information content. In practical terms, it also means that KS2 is
closer to a solution (or to the detection of a contradiction) than KS1 is.
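A minimal sketch of this order relation follows. We assume, consistently with the intended meaning just stated, that a knowledge state is a pair (set of value° facts, set of cand° facts) and that KS1 ≤ KS2 means "no value lost, no candidate gained"; the class and function names are ours.

```python
from typing import FrozenSet, NamedTuple, Tuple

Triple = Tuple[int, int, int]  # (n, r, c)


class KnowledgeState(NamedTuple):
    values: FrozenSet[Triple]  # facts of type value°
    cands: FrozenSet[Triple]   # facts of type cand°


def leq(ks1, ks2):
    """ks1 <= ks2: ks2 has at least the values of ks1 and at most its candidates."""
    return ks1.values <= ks2.values and ks2.cands <= ks1.cands


ks1 = KnowledgeState(frozenset(), frozenset({(1, 1, 1), (2, 1, 1), (3, 1, 1)}))
ks2 = KnowledgeState(frozenset(), frozenset({(1, 1, 1), (2, 1, 1)}))
ks3 = KnowledgeState(frozenset({(1, 1, 1)}), frozenset({(1, 1, 1)}))
ks4 = KnowledgeState(frozenset(), frozenset({(1, 1, 1), (3, 1, 1)}))
```

With these sample states, ks1 ≤ ks2 ≤ ks3, whereas ks2 and ks4 are not comparable: the relation is only a partial order.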
For instance, we can notice that our definition of KSP corresponds to a trivial
initialisation of the problem and that smarter definitions could be considered, where
some cand° could be excluded from the start (e.g. by taking ECP into account). This
would amount to restricting the epistemic model KSP of P to a smaller relevant part.
Simplistic as they may seem, these notions allow us to state precisely what kind
of resolution rules we are looking for. Given a resolution theory T, the application of
any resolution rule R in T to a puzzle P should lead from one knowledge state in
KSP to a greater one, with the following interpretation: if, starting from a knowledge
state KS in KSP, we notice a pattern (or configuration) of cells, units, value°s and
cand°s, satisfying the condition part of R, then R can be applied to this pattern; and,
if we apply it, in the resulting knowledge state KS1 and in all the subsequent ones
(still in KSP), the value°(s) and cand°(s) specified in the action part of R will
respectively be added and deleted. Notice that the whole process of detecting a
pattern, applying a rule and passing from KS to KS1 is superimposed on KSP but is
not part of this abstract static model.
Now, still starting from the same knowledge state KS, if we notice that the
conditions of another resolution rule R’ in T are also satisfied in KS and if we apply
R’ instead of R, we usually reach a knowledge state KS2 (still in KSP) different from
KS1. For a real understanding of what a resolution theory is and is not, it is crucial to
remark that the (relatively informal) definition we have just given does not a priori
imply that the two states KS1 and KS2 are T-compatible, in the sense that there
would be a state KS3 such that KS1 ≤ KS3, KS2 ≤ KS3 and KS3 is accessible both
from KS1 and KS2 via rules in T. See chapter XXII for some elaborations on this
and the associated fundamental notion of confluence.
Which logical axioms for the epistemic operator K should one adopt? This is the
subject of much philosophical and scientific debate. It concerns the relationship be-
tween truth and belief and the axioms expressing this relationship. There are several
theories in competition, the most common of which are, in increasing order of
strength: S4 < S4.2 < S4.3 < S4.4 < S5 (on this point and the following, see for
instance the Stanford Encyclopedia of Philosophy: http://plato.stanford.edu/entries/logic-epistemic/).
One thing should nevertheless be noted: in epistemic logic, for any ground
atomic formula A, "A or ¬ A" is true and known to be true – i.e. "K(A or ¬A)" is
true –, but this is not the case for "KA or K¬A". For instance, given some definite
place in space-time, it is always true that either it is raining (A) or it is not (¬A) at
this place, and you know this is true (K(A or ¬A)). But it is not true that either you
know it is raining (KA) or you know it is not raining (K¬A) at this place: it may
happen that you do not know anything about the weather at this place. Said other-
wise, knowing that ¬A and not knowing that A are very different things (and the
first is much stronger than the second).
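The rain example can be replayed in a toy possible-worlds model, in which KA is taken to mean "A is true in every world the agent considers possible". This reading of K is the standard one, but the encoding below (worlds as dictionaries, formulæ as Python functions) is only an assumption made for the sketch.

```python
def knows(formula, accessible_worlds):
    """K(formula) holds iff formula is true in every world considered possible."""
    return all(formula(w) for w in accessible_worlds)


# Each world fixes the truth value of the atomic fact A ("it is raining").
# The agent has no information, so both worlds remain possible.
worlds = [{"A": True}, {"A": False}]


def A(w):
    return w["A"]


def not_A(w):
    return not w["A"]


def A_or_not_A(w):
    return A(w) or not_A(w)


k_excluded_middle = knows(A_or_not_A, worlds)              # K(A or ¬A)
k_a_or_k_not_a = knows(A, worlds) or knows(not_A, worlds)  # KA or K¬A
```

The first formula evaluates to true and the second to false, which is exactly the difference between knowing that ¬A and not knowing that A.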
Well, the logic of any Sudoku Resolution Theory must be based on modal (epistemic)
logic. But this does not tell us yet how values and candidates should be related.
The most natural possibility would certainly have been to define "candidate" as
an auxiliary predicate, built on the predicates necessary to define Sudoku Theory,
by: candidate(n, r, c) ≡ ¬K¬value(n, r, c). But, remembering that the notion of a
candidate is introduced with (ultimately) operational purposes in mind, what could
be concretely gained with this definition is unclear when "value" is assigned its
usual truth theoretic meaning.
Instead, our approach will be somewhat more convoluted. In our Sudoku Reso-
lution Theories, neither "candidate" nor "value" will be considered formally as
primary predicates; the primary predicates will be "value°" and "cand°", expressing
simple truth in a possible world (i.e. in a knowledge state).
Let us first reconsider the status of "value". As all the values asserted during the
resolution process will either be the entries of the problem or will be explicitly
asserted by resolution rules, all the values ever present on the grid will not only be
true in a knowledge state, they will be known to be true in this state; therefore,
predicate "value" itself should now be given an epistemic status, with value(n, r, c)
intended to mean: it is effectively known that the value of cell(r, c) is n. Let us thus
define "value" as an auxiliary predicate by axiom VAL:
It can be seen that these definitions are consistent with the persistence of what is
known from one knowledge state to those accessible from it and the non-persistence
of what is not known. Strong or "positive" facts of types "value" and "not-
candidate" cannot disappear once they have appeared, but weak or "negative" facts
of type "candidate" can.
To get our first specific answer to the main questions of this section: what is the
relationship between predicates "value" and "candidate" and what is the relationship
between Sudoku Theory and all the possible Sudoku Resolution Theories, we now
adopt the following axiom (call it VCR, "Value to Candidate Relationship"), which,
after being reformulated into condition-action rules, will be the true logical founda-
tion for all our Sudoku Resolution Theories:
intuitively, it means that n is effectively known to be the value of cell (r, c) if and
only if it is effectively known that no other value for this cell is possible.
So far so good; but we are not very enthusiastic about the prospect of having to
overload the formulation of our resolution rules with epistemic operators. Let us
take one more step. Anticipating our resolution rules (which may not refer explicitly
to knowledge states), it appears that, in their naive formulations, their (non-static)
conditions will bear on the presence of some candidates and on the absence of others
and their conclusions will always be the assertion of a value or the elimination of a
candidate. Let us see how this can be understood and written in the present formalism:
– a condition on the absence of a candidate means that it is effectively known to
be excluded (K ¬ cand°) and must therefore be expressed with the auxiliary predi-
cate "not-candidate" defined for this purpose;
– a condition on the presence of a candidate means that it is not effectively
known to be excluded (¬ K ¬ cand°) and must therefore be expressed with the auxi-
liary predicate "candidate" defined for this purpose;
– a conclusion on the assertion of a value means that the value is effectively
known to be true (K value°) and must therefore be expressed with the auxiliary
predicate "value";
– a conclusion on the elimination of a candidate means that this candidate is
effectively known to be excluded (K ¬ cand°) and must therefore be expressed with
the auxiliary predicate "not-candidate";
– the entries of a puzzle P must be understood in terms of effectively known
initial values (K value°) and must therefore be expressed with the auxiliary
predicate "value";
As a result of the above analysis, the two primary predicates "value°" and
"cand°" will never appear as such in the formulation of resolution rules, leaving
their place to the epistemic predicates "value" and "not-candidate". This result has a
very intuitive meaning: the formulation of a resolution rule will not be based on the
truth of anything in a possible world (an abstract notion depending on that of a
logical consequence of the axioms) but only on what is already effectively known or
not effectively known; and, using "value" and "not-candidate" as pseudo primary
predicates, no explicit epistemic operator will ever be needed in the logical
formulation of the resolution rules!
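The analysis above can be illustrated by a small Python sketch. It assumes a hypothetical class KnowledgeState (our own naming, not the author's implementation) that stores only the two epistemic predicates "value" and "not-candidate"; the weak predicate "candidate" is computed as the absence of a known exclusion, so that no explicit operator K appears anywhere.

```python
class KnowledgeState:
    """Stores only what is effectively known (hypothetical sketch)."""

    def __init__(self):
        self.value = set()          # triples (n, r, c) known to be true
        self.not_candidate = set()  # triples (n, r, c) known to be excluded

    def candidate(self, n, r, c):
        # Weak fact: n is not (yet) known to be excluded from cell (r, c).
        return (n, r, c) not in self.not_candidate

    def assert_value(self, n, r, c):
        # Strong fact: once added, it is never removed.
        self.value.add((n, r, c))

    def eliminate(self, n, r, c):
        # Strong fact: once added, it is never removed.
        self.not_candidate.add((n, r, c))


ks = KnowledgeState()
before = ks.candidate(5, 1, 1)   # 5 is still a candidate for cell (1, 1)
ks.eliminate(5, 1, 1)
after = ks.candidate(5, 1, 1)    # the weak fact has disappeared
```

In this sketch the sets only grow, which mirrors the persistence of "value" and "not-candidate" facts and the non-persistence of "candidate" facts.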
Notice that such an approach and a conclusion can be generalised to any game
based on the progressive elimination of candidates.
After all the above preliminary analyses, the time has come to turn to the axioms we
want all our Sudoku Resolution Theories to share.
All the theories considered in this book will be specialisations of the theory defi-
ned in this section, called the "Basic Sudoku Resolution Theory" or BSRT. "Specia-
lisation" means that:
– they are formulated with exactly the same language (in particular, they do not
introduce any new primary predicate – although they may introduce auxiliary ones,
considered, as previously, as shorthands),
– they contain all the axioms of BSRT.
BSRT includes the axioms of Grid Theory (GT) defined in chapter III. Contrary
to the general Sudoku Theory ST:
– BSRT is based on the primary predicates "value" and "not-candidate",
– as explained in section 2 above, these predicates now have an underlying epis-
temic meaning (that may remain implicit),
– BSRT cannot include the axioms of ST as such (they are not resolution rules),
– the underlying logic is not classical but intuitionistic (constructive).
IV.3.1. Sorts
Sorts are the same as in the general grids theory GT and in ST.
just to make things more intuitive about the atomic formulæ built on this predicate
(but this is anticipating on the axioms of BSRT), let us say that the intended
meaning of not-candidate(n, r, c) is at the same time that:
- it is effectively known that number n cannot occupy cell (r, c) in natural
row-column space;
- it is effectively known that column c cannot receive the unique instance of
number n that should be found in row r, i.e. it is effectively known that column c
cannot occupy rn-cell (r, n) in abstract row-number space;
- it is effectively known that row r cannot receive the unique instance of
number n that should be found in column c, i.e. it is effectively known that row r
cannot occupy cn-cell (c, n) in abstract column-number space.
The first group of four axioms on candidates expresses the mutual exclusion
conditions on cells, rows, columns and blocks. These four rules, also called the
elementary constraints propagation rules, can be considered as the direct operational
transpositions of axioms ST1 to ST4. (Of course, they can easily be proven from
these axioms plus VCR.) They can be used in practice to eliminate candidates as
soon as a value is asserted. In this respect, they will be much more useful than rules
such as ST1 to ST4 could be:
The sixth axiom expresses the right to left implication in the basic relationship
between values and candidates (i.e. of axiom VCR in section 2):
– NS or Naked-Single: assert a value whenever there is a unique candidate for a
cell:
∀r∀c∀n {[candidate(n, r, c) & ∀n1≠n not-candidate(n1, r, c)]
=> value(n, r, c)}.
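As an illustration, axiom NS can be sketched in Python as follows. The function name naked_singles and the representation of a knowledge state as a set of not-candidate triples (n, r, c), indexed from 1, are assumptions made for this example only.

```python
def naked_singles(not_candidate, size=9):
    """Return the triples (n, r, c) whose value axiom NS allows us to assert.

    A number n is a candidate for cell (r, c) when (n, r, c) is not in
    the set not_candidate (hypothetical representation).
    """
    found = []
    for r in range(1, size + 1):
        for c in range(1, size + 1):
            remaining = [n for n in range(1, size + 1)
                         if (n, r, c) not in not_candidate]
            if len(remaining) == 1:
                found.append((remaining[0], r, c))
    return found


# Toy state: numbers 1 to 8 are known to be excluded from cell (1, 1).
not_candidate = {(n, 1, 1) for n in range(1, 9)}
asserted = naked_singles(not_candidate)
```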
Finally, we define:
ECP = {ECP(cell), ECP(row), ECP(col), ECP(blk)}
L0 = ECP ∪ {CD}
BSRT = L0 ∪ {NS}.
Notice that axiom ST5 of ST has not been transposed for inclusion into BSRT,
because there is no obvious way to write it as a resolution rule. This incompleteness
of BSRT is the fundamental reason why we must search for compensatory additio-
nal resolution rules.
As was the case for Sudoku Theory ST, with any specific puzzle P we can
associate the axiom EP defined as the finite conjunction of all the formulæ of type
value(nk, ri, cj) corresponding to each entry of P. Then, when added to the axioms of
BSRT (or any extension of it), axiom EP defines a Sudoku Resolution Theory for the
specific puzzle P.
Notice that, although the form and the name are identical to those used in ST, the
axiom for the entries of a puzzle does not have the same meaning as for ST. In the
context of BSRT, it is given an epistemic meaning.
Let LS be the theory of Latin Squares, defined as the block-free part of ST. More
precisely, LS is defined as follows:
The language of LS is the block-free part of the language of BSRT; the axioms
of LS are:
– those of LSGT (the block-free version of GT, as defined in chapter III),
– those of ST except ST4 (the block-free part of ST).
Before giving the formal definitions, let me warn you that they are just a
pompous way of saying what was said informally in chapter I, so that you may skip
them if you are not interested in the technicalities:
– Src(F) is the formula obtained from F by permuting systematically the words
"row" and "column",
– Srn(F) is the formula obtained from F by permuting systematically the words
"row" and "number",
– Scn(F) is the formula obtained from F by permuting systematically the words
"column" and "number".
As usual in logic, the formal definitions of Src(F), Srn(F) and Scn(F) are given
recursively, following the general construction of a formula:
– block-free atomic formulæ (notice that, for "value" and "not-candidate" the
sorts cannot be permuted in the predicate itself, but the indices on the variables are
permuted; this is technically very important, especially when we deal with transfor-
mations of formulæ with different numbers of variables of different sorts):
– logical connectives:
– quantifiers:
Notice that the three transformations are involutive, i.e. for any block-free
formula F, one has Src•Src(F) = F, Srn•Srn(F) = F and Scn•Scn(F) = F.
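The involutive character of the three transformations can be checked on ground atoms with a short Python sketch. It relies on a simplification that is ours, not the book's: instead of transforming whole formulæ, it only permutes the arguments of a ground triple (n, r, c), as happens for atomic formulæ built on "value" and "not-candidate". The function names are hypothetical.

```python
def s_rc(t):
    """Src on a ground triple (n, r, c): permute row and column."""
    n, r, c = t
    return (n, c, r)


def s_rn(t):
    """Srn on a ground triple (n, r, c): permute row and number."""
    n, r, c = t
    return (r, n, c)


def s_cn(t):
    """Scn on a ground triple (n, r, c): permute column and number."""
    n, r, c = t
    return (c, r, n)


triple = (4, 2, 7)  # reads: number 4, row 2, column 7
# Applying any of the three transformations twice gives back the triple.
involutive = all(s(s(triple)) == triple for s in (s_rc, s_rn, s_cn))
```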
Notice also that Src can be extended trivially to any (not necessarily block-free)
formula F; of course Src(F) is not block-free if F is not. It suffices to define Src(F) for
non block-free atomic formulæ, i.e. for equality between two blocks, two squares or
two unit types and for the non block-free primary predicate "correspondence". This
is obvious and details are left to the reader.
As a result of the above rules, we have the following table for the almost primary
predicate "candidate".
Let us now define equality of cells in rn- and cn- spaces, with the following two
block-free auxiliary predicates:
– same-rn-cell, with arity 4 and signature (Row, Number, Row, Number);
same-rn-cell(r1, n1, r2, n2) is defined as a shorthand for: r1 = r2 & n1 = n2;
– same-cn-cell, with arity 4 and signature (Column, Number, Column,
Number);
As a result of this exercise, the above three predicates are the only ones we can
get when we start from any of them and we repeatedly apply any series of transfor-
mations from the set {Src(F), Srn(F), Scn(F)}.
Nevertheless, one should not conclude from this particular (but important) case
that relations such as Src•Srn(F) = Scn(F) are general. For the general formula F,
things are more complex; chapters VI to VIII will show practical consequences of
this remark on the sets of resolution rules for Pairs, Triplets and Quadruplets.
To any formula one can associate a well defined block-free formula, called its
block-free transform.
Remarks:
– the last two conditions are justified by the fact that non block-free variables
are eliminated together with the non block-free atomic formulæ containing them;
– for any formula F (and not only the atomic ones), if F is block-free, then BF(F)
is simply F;
– obvious examples: the block-free transforms of row-intersects(…) and of
column-intersects(…) are ⊥.
Exercise 1: show that all the relationships described in Figure 1 below between
the various notions of connectedness introduced above are true. This will be very
useful when we transpose subset or chain rules.
[Figure 1. Relationships between share-a-unit(r1, c1, r2, c2), rc-connected(r1, c1, r2, c2),
rn-connected(r1, n1, r2, n2) and cn-connected(c1, n1, c2, n2) under the transformations
BF, Src, Srn and Scn.]
Theorem IV.2: any block-free resolution rule is already valid in LS+VCR (the
theory of Latin Squares extended to candidates). Stated otherwise: a block-free
formula is valid for Sudoku if and only if it is valid for Latin Squares.
5 Technical remark: one may think that this theorem could be proved using general theorems
in logic such as the interpolation theorem and/or Gentzen's theorem in the sequents calculus:
"if a sequent Γ ⊢ ∆ is provable in the sequents calculus then it has a proof that uses only
sequents formed on the sub formulæ of Γ ⊢ ∆". But it does not seem to work.
of T or can be deduced from the previous ones by modus ponens (i.e. from the law
"from A and from A ⇒ B deduce B").
The only axiom in T which is not already block-free is ST4. But, for ST4, we
have:
ST4 ≡ ∀b∀n∀s1∀s2 {value[n, b, s1] & value[n, b, s2] ⇒ s1 = s2}
≡ ∀b∀n∀s1∀s2
{ ∃r1∃c1 [correspondence(r1, c1, b, s1) & value(n, r1, c1)] &
∃r2∃c2 [correspondence(r2, c2, b, s2) & value(n, r2, c2)]
⇒
s1 = s2 };
As for theorem IV.2, a block-free proof uses only the block-free transforms of
the axioms in ST+VCR. But, since BF(ST4) = TRUE, this is exactly LS+VCR.
We now have the technical tools necessary for stating and proving our three
fundamental meta-theorems.
Proof: the proof (for Srn) is similar to that of meta-theorem 1. Let now T be the
theory consisting of the axioms in LS+VCR. After theorem IV.2, there is a proof of
R in T. From such a proof, a proof of Srn(R) in T (it will automatically be also a
proof in ST+VCR) can be obtained by replacing successively each step in the first
proof (axioms included) by its transformation under Srn. This is legitimate since:
– each formula in the first proof is block-free and Srn can be applied to it;
– the set of axioms in T is invariant under Srn symmetry;
– any application of modus ponens can be transposed, because Srn(A⇒B) ≡
Srn(A)⇒ Srn(B).
Stating and proving meta-theorem 2 is done along the same lines as we did for
meta-theorems 1 and 3. As previously, we must begin by introducing a new formal
notion, the notion of the Srcbs transform of a block-free formula.
F                            Srcbs(F)
ni =n nj                     ni =n nj
ri =r rj                     bi =b bj
ci =c cj                     si =s sj
value(ni, rj, ck)            value[ni, bj, sk]
not-candidate(ni, rj, ck)    not-candidate[ni, bj, sk]
– logical (and modal) connectives: all of them simply commute with Srcbs
– quantifiers:
F            Srcbs(F)
∀ni F1       ∀ni Srcbs(F1)
∀ri F1       ∀bi Srcbs(F1)
∀ci F1       ∀si Srcbs(F1)
∃ni F1       ∃ni Srcbs(F1)
∃ri F1       ∃bi Srcbs(F1)
∃ci F1       ∃si Srcbs(F1)
As a result of the above rules, we have the following table for the almost primary
predicate:
F                        Srcbs(F)
candidate(ni, rj, ck)    candidate[ni, bj, sk]
After theorem IV.2, there is a proof of R in LS+VCR. This is not enough for our
purpose, but the proof of theorem IV.2 can be transposed to show that there is a
proof of R in LS+VCR that does not use axiom ST3; it is therefore a proof of R
using only the axioms in the set {ST1, ST2, ST5, VCR}. From this proof of R, a
proof of Srcbs(R) using only the axioms in the set {ST1, ST4, ST5, VCR} (a subset
of ST+VCR) is obtained by replacing each step in the first proof by its transfor-
mation under Srcbs. This is legitimate since:
– each formula in the first proof is block-free and Srcbs can be applied to it;
– under Srcbs, ST1, ST5 and VCR are invariant and ST2 becomes ST4;
– any application of modus ponens can be transposed, because Srcbs(A⇒B) ≡
Srcbs(A)⇒Srcbs(B).
For easier formulation, let us consider formulæ written without the logical sym-
bol for implication ("⇒"), i.e. written with only the following logical symbols: ∧, ∨,
¬, ∀, ∃. Notice that (using the trivial identity A ⇒ B ≡ ¬A ∨ B) every formula can
be rewritten in this form. Remember also that the condition part of any resolution
rule satisfies this constraint.
The proof of the first part is obvious. Notice that BF(R) is weaker than R, since
it has stronger conditions; it might therefore be considered as totally uninteresting.
But BF(R) is block-free and it can be submitted to meta-theorem 3. This is how,
in the chapters dealing with chains, counterparts of all the chain rules in
natural rc-space will be defined in rn- and cn-spaces, leading to entirely new types
of chains.
In this short chapter, the following familiar rules will be studied and their rela-
tionships through symmetry, analogy and supersymmetry will be established:
– Naked-Single, or NS for short: if there is a row and a column such that there is
one and only one candidate for the cell they define, then assert it as the value of this
cell;
– Hidden-Single-in-a-row, or HS(row) for short: if there is a row and a number
such that the number is a candidate for one and only one cell in this row, then assert
it as the value of this cell;
– Hidden-Single-in-a-column, or HS(column) for short: if there is a column and
a number such that the number is a candidate for one and only one cell in this
column, then assert it as the value of this cell;
– Hidden-Single-in-a-block, or HS(block) for short: if there is a block and a
number such that the number is a candidate for one and only one square in this
block, then assert it as the value of this square.
Validity of each of these rules is obvious. Notice the duality between rows and
columns, but the absence of duality between blocks and squares (e.g. there is no rule
Hidden-Single-in-a-square).
Let us use the abstract spaces introduced in chapter II and rephrase the above
rules so as to better display the symmetries linking them:
– NS: if, in natural row-column space, there is a rc-cell (r, c) with only one
candidate (number), then assert it as the value of this rc-cell;
– HS(row): if, in abstract row-number space, there is a rn-cell (r, n) with only
one candidate (column), then assert it as the value of this rn-cell;
– HS(col): if, in abstract column-number space, there is a cn-cell (c, n) with only
one candidate (row), then assert it as the value of this cn-cell;
– HS(blk): if, in abstract block-number space, there is a bn-cell (b, n) with only
one candidate (square), then assert it as the value of this bn-cell.
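The four rephrased rules can be sketched uniformly in Python: each rule projects a candidate (n, r, c) onto a cell of one of the four spaces and looks for cells with a single remaining possibility. This is only a sketch under stated assumptions: the names SPACES, singles and block_square are hypothetical, candidates are given as a set of triples indexed from 1, and blocks and squares are assumed to be numbered row by row.

```python
from collections import defaultdict


def block_square(r, c):
    """Assumed numbering: blocks, and squares in a block, counted row by row."""
    b = 3 * ((r - 1) // 3) + (c - 1) // 3 + 1
    s = 3 * ((r - 1) % 3) + (c - 1) % 3 + 1
    return b, s


# For each rule: the cell of the relevant space onto which (n, r, c) projects.
SPACES = {
    "NS": lambda n, r, c: (r, c),                          # rc-cell
    "HS(row)": lambda n, r, c: (r, n),                     # rn-cell
    "HS(col)": lambda n, r, c: (c, n),                     # cn-cell
    "HS(blk)": lambda n, r, c: (block_square(r, c)[0], n), # bn-cell
}


def singles(candidates):
    """Return (rule, (n, r, c)) for every cell with one candidate left."""
    found = []
    for rule, project in SPACES.items():
        cells = defaultdict(list)
        for (n, r, c) in candidates:
            cells[project(n, r, c)].append((n, r, c))
        for members in cells.values():
            if len(members) == 1:
                found.append((rule, members[0]))
    return found


# Toy candidate set: 5 in r1c1; 5 and 7 in r1c2.
candidates = {(5, 1, 1), (5, 1, 2), (7, 1, 2)}
found = singles(candidates)
```

The same loop serves the four rules; only the projection changes, which is the operational counterpart of the permutations of "row", "column", "number" and "block".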
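[Figure 1. Symmetry, supersymmetry and analogy relations between the four Singles rules:
HS(row) = Scn(NS), HS(col) = Srn(NS), HS(col) = Src(HS(row)),
HS(blk) = Srcbs(HS(row)), HS(blk) = Scrbs(HS(col)).]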
This is our first and simplest example of supersymmetry, together with analogy.
The four rules can be phrased similarly: if, in any of the four row-column, row-
number, column-number or block-number spaces, there is a cell with only one
possibility left for the remaining variable, then assert it as the final value. The first
three rules express supersymmetry; the fourth expresses analogy with the previous
ones. More specifically:
– HS(row) is obtained from NS by supersymmetry: permuting "number" and
"column"; formally: HS(row) = Scn(NS);
– HS(col) is obtained from NS by supersymmetry: permuting "number" and
"row"; formally: HS(col) = Srn(NS); it is also obtained from HS(row) by symmetry:
permuting "row" and "column"; formally: HS(col) = Src(HS(row));
– HS(blk) is obtained from HS(row) by analogy: replacing "row" by "block" and
"column" by "square"; formally: HS(blk) = Srcbs(HS(row)).
– HS(blk) is also obtained from HS(col) by analogy: replacing "column" by
"block" and "row" by "square"; formally: HS(blk) = Scrbs(HS(col)).
Subset rules, level one: Singles 95
Figure 1 summarises all the formal relations of symmetry, analogy and super-
symmetry between these four rules.
Whereas the English sentences for expressing our rules are deduced from each
other by permuting properly the words "row", "column", "number", "block" and
"square", the logical formulæ expressing them are deduced from each other simply
by permuting the quantifiers. As a result and an illustration of the expressive power
of multi-sorted logic, their logical formulation is still more striking in its
compactness:
Notice that, in conformance with our convention in section IV.3.2.3, the short-
hands on the quantifiers have been developed in reference to the primary predicates
"value" and "not-candidate".
In the sequel, such developments will be left to the reader motivated by the
purity of logical formulæ. But, as formal logic is used here merely as a compact
notation tool, and not for providing formal proofs of the rules, they will never be
needed.
V.3. Example
It is the case that grid Royle17-3 can be solved using only the rules defined
above (in addition, of course, to the elementary constraints propagation rules). As
will be shown by the classification results at the end of this book, this is a relatively
frequent property (shared by 46% of the puzzles in the Royle17 database and more
than 41% of the randomly generated grids in Sudogen0 and Sudogen17).
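Puzzle Royle17-3       Solution
. . . . . . . 1 2      6 7 3 8 9 4 5 1 2
. . . . 3 5 . . .      9 1 2 7 3 5 4 8 6
. . . 6 . . . 7 .      8 4 5 6 1 2 9 7 3
7 . . . . . 3 . .      7 9 8 2 6 1 3 5 4
. . . 4 . . 8 . .      5 2 6 4 7 3 8 9 1
1 . . . . . . . .      1 3 4 5 8 9 2 6 7
. . . 1 2 . . . .      4 6 9 1 2 8 7 3 5
. 8 . . . . . 4 .      2 8 7 3 5 6 1 4 9
. 5 . . . . 6 . .      3 5 1 9 4 7 6 2 8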
Here is the detailed listing for the resolution path of puzzle Royle17-3 (applica-
tion of elementary constraints propagation is not displayed). Remember from what
we said in the introduction that, in future examples, all the steps shown here will be
considered as obvious and will not be displayed.
Resolution path in L1_0 for Royle17-3:
hidden-single-in-block b3 ==> r3c9 = 3
hidden-single-in-column c7 ==> r8c7 = 1
hidden-single-in-row r9 ==> r9c3 = 1
hidden-single-in-row r2 ==> r2c2 = 1
hidden-single-in-column c7 ==> r6c7 = 2
hidden-single-in-column c8 ==> r9c8 = 2
hidden-single-in-block b9 ==> r7c8 = 3
hidden-single-in-column c8 ==> r2c8 = 8
hidden-single-in-block b3 ==> r2c9 = 6
hidden-single-in-column c7 ==> r7c7 = 7
For this particular puzzle, it can be seen that only rules of type Hidden-Single are
applied in the first steps and that only rules of type Naked-Single are applied in the
last steps (forgetting elementary propagation rules). In grids that can be solved with
only these two types of rules, invocations of rules of each type are generally more
intermingled than in this example. However, whichever set of rules the resolution of
a puzzle needs and still forgetting ECP, most of the time, the last rules applied need
not be of a type more complex than Naked-Single.
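The kind of resolution path discussed here can be reproduced with a minimal Python sketch combining elementary constraints propagation, Naked-Single and Hidden-Single. It is not the author's solver: the function names are hypothetical, rows and columns are indexed from 0 for convenience, rules are applied in an arbitrary order (so the path may differ from the listing above), and the puzzle string is our transcription of Royle17-3, with 0 standing for an empty cell.

```python
ROYLE17_3 = (
    "000000012" "000035000" "000600070"
    "700000300" "000400800" "100000000"
    "000120000" "080000040" "050000600"
)

# The 27 units: 9 rows, 9 columns, 9 blocks, as lists of (row, column) cells.
UNITS = (
    [[(r, c) for c in range(9)] for r in range(9)]
    + [[(r, c) for r in range(9)] for c in range(9)]
    + [[(3 * br + i, 3 * bc + j) for i in range(3) for j in range(3)]
       for br in range(3) for bc in range(3)]
)


def solve_with_singles(puzzle):
    """Apply ECP, Naked-Single and Hidden-Single until nothing changes."""
    cand = {(r, c): set(range(1, 10)) for r in range(9) for c in range(9)}
    value = {}

    def assert_value(cell, n):
        value[cell] = n
        cand.pop(cell)               # a decided cell has no candidates left
        for unit in UNITS:           # ECP along the row, column and block
            if cell in unit:
                for other in unit:
                    if other in cand:
                        cand[other].discard(n)

    for i, ch in enumerate(puzzle):
        if ch != "0":
            assert_value(divmod(i, 9), int(ch))

    progress = True
    while progress and cand:
        progress = False
        for cell in list(cand):      # Naked-Single
            if cell in cand and len(cand[cell]) == 1:
                assert_value(cell, next(iter(cand[cell])))
                progress = True
        for unit in UNITS:           # Hidden-Single in a row, column or block
            for n in range(1, 10):
                places = [x for x in unit if x in cand and n in cand[x]]
                if len(places) == 1:
                    assert_value(places[0], n)
                    progress = True
    # Undecided cells, if any, are printed as 0.
    return "".join(str(value.get((r, c), 0))
                   for r in range(9) for c in range(9))


result = solve_with_singles(ROYLE17_3)
print(result)
```

If a puzzle needs rules beyond Singles, the loop simply stops and the returned string still contains zeros.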
Let us define theory L1_0 as the union of the axioms of BSRT (which already
includes Naked-Single) with the three Hidden-Single rules:
L1_0 = BSRT ∪ {HS(row), HS(col), HS(blk)}.
Full level 1 and associated theory L1 will be obtained from L1_0 by adding
interaction rules (see chapter IX). But for the next three chapters, we continue with
subset rules.
Whenever there can be no confusion, we shall use the same name to designate a
resolution theory and the set of puzzles that can be solved by it.
Chapter VI
The set of rules relative to Pairs constitutes a still more striking illustration of
our approach based on symmetries than the rules relative to Singles. In this chapter,
the following familiar rules will be studied in full detail and their relationships
through symmetry, analogy and supersymmetry will be established:
– Naked-Pairs-in-a-row, or NP(row) for short: if there is a row and there are two
different cells in this row that have exactly the same two different candidates, then
remove these two candidates from all the other cells in this row;
– Naked-Pairs-in-a-column, or NP(col) for short: if there is a column and there
are two different cells in this column that have exactly the same two different candi-
dates, then remove these two candidates from all the other cells in this column;
– Naked-Pairs-in-a-block, or NP(blk) for short: if there is a block and there are
two different cells in this block that have exactly the same two different candidates,
then remove these two candidates from all the other cells in this block;
– Hidden-Pairs-in-a-row, or HP(row) for short: if there is a row and there are
two different cells in this row and two different numbers n1 and n2 that appear in the
candidates for no other cell in this row than these two, then remove any number
other than n1 or n2 from the two cells; notice that applying this rule has the effect of
producing a Naked-Pairs-in-a-row from a Hidden-Pairs-in-a-row;
– Hidden-Pairs-in-a-column, or HP(col) for short: if there is a column and there
are two different cells in this column and two different numbers n1 and n2 that
appear in the candidates for no other cell in this column than these two, then remove
any number other than n1 or n2 from the two cells; notice that applying this rule has
the effect of producing a Naked-Pairs-in-a-column from a Hidden-Pairs-in-a-column;
Moreover, it will be shown that all these rules are related by symmetry, analogy
or supersymmetry. More specifically:
– NP(col) is obtained from NP(row) by symmetry: permuting "row" and
"column"; formally: NP(col) = Src(NP(row));
– NP(blk) is obtained from NP(row) by analogy: replacing "row" by "block" and
"column" by "square"; formally: NP(blk) = Srcbs(NP(row));
– HP(row) is obtained from NP(row) by supersymmetry: permuting "number"
and "column"; formally: HP(row) = Scn(NP(row));
– HP(col) is obtained from NP(col) by supersymmetry: permuting "number" and
"row"; formally: HP(col) = Srn(NP(col));
– HP(blk) is obtained from HP(row) by analogy: replacing "row" by "block" and
"column" by "square"; formally: HP(blk) = Srcbs(HP(row));
– X-Wing(row) is obtained from HP(row) by supersymmetry: permuting
"number" and "row"; in symbols: X-Wing(row) = SHP(row), where SHP(row) is
defined by SHP(row) = Srn(HP(row)) = Srn•Scn(NP(row));
– X-Wing(col) is obtained from HP(col) by supersymmetry: permuting
"number" and "column"; in symbols: X-Wing(col) = SHP(col), where SHP(col) is
defined by SHP(col) = Scn(HP(col)) = Scn•Srn(NP(col)).
We shall also give detailed examples of puzzles where rules NP, HP and SHP
apply, together with their resolution paths. For one of these examples, we shall
display the situation both in natural row-column space and in row-number space.
Subset rules, level two: Pairs 101
This is intended to illustrate how the proper choice of a graphical representation (in
this case the choice of the proper space) reveals what was hidden (or super hidden).
VI.1. Naked-Pairs
VI.1.1. Naked-Pairs-in-a-row
Validity of the rule is very easy to prove: in row r, each of the two cells defined
by columns c1 and c2 must get a value and only two values (n1 and n2) are available
for them, which entails that, whatever distribution is made between them of these
two values, none of these two values remains available for the other cells in the
same row.
To formalise this rule, we have to consider the case of a cell in standard row-
column space (also called a rc-cell) that has exactly two candidates (with given va-
lues); we say that this cell is confined to two values or that it is bivalue; and we
introduce the following auxiliary block-free predicate:
– rc-bivalue, with arity 4 and signature (Row, Column, Number, Number);
rc-bivalue(r, c, n1, n2) is defined as:
candidate(n1, r, c) & candidate(n2, r, c) & n1≠n2 &
∀n∉{n1, n2} not-candidate(n, r, c).
∀r∀c1≠c2∀n1≠n2
{ rc-bivalue(r, c1, n1, n2) &
rc-bivalue(r, c2, n1, n2)
⇒
∀c∉{c1, c2}∀n∈{n1, n2} not-candidate(n, r, c) }.
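The logical formulation above translates almost literally into code. The following Python sketch assumes a hypothetical representation in which cand maps each cell (r, c), indexed from 1, to the set of its candidate numbers; the function names are ours.

```python
def rc_bivalue(cand, r, c, n1, n2):
    """rc-bivalue(r, c, n1, n2): cell (r, c) has exactly the candidates n1, n2."""
    return n1 != n2 and cand[(r, c)] == {n1, n2}


def naked_pairs_in_row(cand, r, size=9):
    """Return the triples (n, r, c) that NP(row) eliminates in row r."""
    eliminated = set()
    for c1 in range(1, size + 1):
        if len(cand[(r, c1)]) != 2:
            continue
        n1, n2 = sorted(cand[(r, c1)])
        for c2 in range(c1 + 1, size + 1):
            if rc_bivalue(cand, r, c1, n1, n2) and rc_bivalue(cand, r, c2, n1, n2):
                # Conclusion: n1 and n2 are excluded from all the other cells.
                for c in range(1, size + 1):
                    if c not in (c1, c2):
                        for n in (n1, n2):
                            if n in cand[(r, c)]:
                                eliminated.add((n, r, c))
    return eliminated


# Toy row: cells (1, 1) and (1, 2) are both confined to the values 1 and 2.
cand = {(1, c): set(range(1, 10)) for c in range(1, 10)}
cand[(1, 1)] = {1, 2}
cand[(1, 2)] = {1, 2}
eliminations = naked_pairs_in_row(cand, 1)
```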
VI.1.2. Naked-Pairs-in-a-column
Using the same predicate as before and applying the Src transformation to the
logical formulation of Naked-Pairs-in-a-row, we get the logical formulation of
Naked-Pairs-in-a-column (which is obviously block-free):
∀c∀r1≠r2∀n1≠n2
{ rc-bivalue(r1, c, n1, n2) &
rc-bivalue(r2, c, n1, n2)
⇒
∀r∉{r1, r2}∀n∈{n1, n2} not-candidate(n, r, c) }.
VI.1.3. Naked-Pairs-in-a-block
or, equivalently:
∀b∀s1≠s2∀n1≠n2
{ rc-bivalue[b, s1, n1, n2] &
rc-bivalue[b, s2, n1, n2]
⇒
∀s∉{s1, s2}∀n∈{n1, n2} not-candidate[n, b, s] }.
Let us give a very simple example of a puzzle that can be solved using only the
elementary constraints propagation rules, Naked-Single, Hidden-Single and Naked-
Pairs (puzzle Royle17-144, Figure 1).
The original (minimal) grid is displayed first, then its L1_0 elaboration (obtained
by application of the first Naked-Single and Hidden-Singles rules), then its solution.
As explained in the introduction, in the listing of the resolution process, only the
interesting parts are displayed (i.e. one starts with the elaborated grid and the final
NS and HS rules are omitted).
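Puzzle Royle17-144     L1_0 elaboration       Solution
. . . . . . . 2 4      3 8 5 . . . . 2 4      3 8 5 7 6 1 9 2 4
. 1 . 3 . . . . .      . 1 2 3 . . . . .      9 1 2 3 4 8 7 5 6
. 7 . . . . . . .      . 7 4 2 . . . . .      6 7 4 2 9 5 8 1 3
. 6 . . . . 3 . .      8 6 . 4 . . 3 . 2      8 6 1 4 5 7 3 9 2
. . . . 8 2 . . .      7 4 3 . 8 2 . . .      7 4 3 9 8 2 5 6 1
5 . . . . . . . .      5 2 . . . . 4 8 7      5 2 9 6 1 3 4 8 7
4 . . 1 . . 6 . .      4 . 7 1 2 . 6 . 8      4 5 7 1 2 9 6 3 8
2 . . . . . . 7 5      2 . 6 8 . . 1 7 5      2 9 6 8 3 4 1 7 5
. . 8 . . . . . .      1 . 8 . . . 2 4 .      1 3 8 5 7 6 2 4 9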
VI.2. Hidden-Pairs
VI.2.1. Hidden-Pairs-in-a-row
Let us now consider the Hidden-Pairs-in-a-row rule. To obtain it, we do the same
as we did with Naked-Single. We just apply the informal version of meta-theorem 3
to Naked-Pairs-in-a-row, permuting the words "number" and "column". That is,
once transposed in row-number space, a Hidden-Pairs-in-a-row looks graphically like
a Naked-Pairs-in-a-row.
It can easily be seen that "rn-bivalue(ri, nj, ck1, ck2)" is the Scn transform of
"rc-bivalue(ri, cj, nk1, nk2)".
∀r∀n1≠n2∀c1≠c2
{ rn-bivalue(r, n1, c1, c2) &
rn-bivalue(r, n2, c1, c2)
⇒
∀n∉{n1, n2}∀c∈{c1, c2} not-candidate(n, r, c) }.
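The supersymmetry between Naked-Pairs and Hidden-Pairs can be made tangible with a Python sketch: the row is first transposed into row-number space, a generic Naked-Pairs search is run there, and the eliminations are read back as (number, column) pairs. The names naked_pairs and hidden_pairs_in_row, and the dictionary representation, are assumptions made for this illustration.

```python
from collections import defaultdict


def naked_pairs(cells):
    """Generic Naked-Pairs on one line of a space.

    cells maps a position to its set of remaining possibilities; the
    result is the set of (position, possibility) pairs to eliminate.
    """
    out = set()
    positions = sorted(cells)
    for i, p1 in enumerate(positions):
        for p2 in positions[i + 1:]:
            if len(cells[p1]) == 2 and cells[p1] == cells[p2]:
                for p in positions:
                    if p not in (p1, p2):
                        out |= {(p, v) for v in cells[p] & cells[p1]}
    return out


def hidden_pairs_in_row(row_cands):
    """row_cands maps each column c of a fixed row to its candidate numbers."""
    # Scn: in rn-space, the position is the number and the possibility
    # is the column.
    rn = defaultdict(set)
    for c, numbers in row_cands.items():
        for n in numbers:
            rn[n].add(c)
    # Naked-Pairs in rn-space; each result reads "eliminate n from column c".
    return naked_pairs(dict(rn))


# Toy row with four undecided cells: 1 and 2 appear only in columns 1 and 2.
row_cands = {1: {1, 2, 3}, 2: {1, 2, 4}, 3: {3, 4}, 4: {3, 4, 5}}
hidden = hidden_pairs_in_row(row_cands)
```

In this toy row the pair {1, 2} is hidden in columns 1 and 2, so 3 is eliminated from column 1 and 4 from column 2.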
VI.2.2. Hidden-Pairs-in-a-column
It can easily be seen that "cn-bivalue(cj, ni, rk1, rk2)" is the Srn transform of "rc-
bivalue(ri, cj, nk1, nk2)".
∀c∀n1≠n2∀r1≠r2
{ cn-bivalue(c, n1, r1, r2) &
cn-bivalue(c, n2, r1, r2)
⇒
∀n∉{n1, n2}∀r∈{r1, r2} not-candidate(n, r, c) }.
VI.2.3. Hidden-Pairs-in-a-block
∀b∀n1≠n2∀s1≠s2
{ bn-bivalue(b, n1, s1, s2) &
bn-bivalue(b, n2, s1, s2)
⇒
∀n∉{n1, n2}∀s∈{s1, s2} not-candidate[n, b, s] }.
The situation described above for Hidden Pairs can be captured in a more uni-
form manner if we introduce a new definition, that will play a major conceptual role
later in some chain rules (c-chain rules): two cells are very classically called
conjugate along some unit u for a given number n if they are different, they share
the given unit u and, in this unit, no other cell has n among its candidates. This
situation is captured by the following new auxiliary (non block-free) predicate:
– conjugate has arity 6 and signature (Number, Row, Column, Row, Column,
Unit-Type);
conjugate(n, r1, c1, r2, c2, ut) is defined as:
candidate(n, r1, c1) & candidate(n, r2, c2) &
{ [ut=row & r1=r2 & rn-bivalue(r1, n, c1, c2)] or
[ut=col & c1=c2 & cn-bivalue(c1, n, r1, r2)] or
[ut=blk &
∃b∃s1∃s2 [ correspondence(r1, c1, b, s1) &
correspondence(r2, c2, b, s2) &
bn-bivalue(b, n, s1, s2)]] }.
Using this definition, the three rules for Hidden Pairs can be rephrased as a
single one: if there are two different cells that are conjugate along a given unit for
two different values, then eliminate any other candidate from these two cells. The
associated logical formulation follows:
∀n1≠n2∀r1∀r2∀c1∀c2∀ut∀n∉{n1, n2}
{ conjugate(n1, r1, c1, r2, c2, ut) &
conjugate(n2, r1, c1, r2, c2, ut)
⇒
not-candidate(n, r1, c1) & not-candidate(n, r2, c2) }.
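The unified rule can be sketched in Python as follows. The sketch assumes that cand maps every cell (r, c) of the grid, indexed from 1, to its set of candidate numbers, and that blocks are numbered row by row; the function names are hypothetical and the code is an illustration, not the author's implementation.

```python
def block_of(r, c):
    """Assumed numbering of blocks: row by row, from 1 to 9."""
    return 3 * ((r - 1) // 3) + (c - 1) // 3 + 1


def conjugate(cand, n, cell1, cell2, unit_type):
    """True if cell1 and cell2 are the only cells of the unit having n."""
    if cell1 == cell2:
        return False
    (r1, c1), (r2, c2) = cell1, cell2
    if unit_type == "row" and r1 == r2:
        unit = [(r1, c) for c in range(1, 10)]
    elif unit_type == "col" and c1 == c2:
        unit = [(r, c1) for r in range(1, 10)]
    elif unit_type == "blk" and block_of(r1, c1) == block_of(r2, c2):
        unit = [(r, c) for r in range(1, 10) for c in range(1, 10)
                if block_of(r, c) == block_of(r1, c1)]
    else:
        return False  # the two cells do not share a unit of this type
    return {cell for cell in unit if n in cand[cell]} == {cell1, cell2}


def hidden_pairs(cand, n1, n2, cell1, cell2, unit_type):
    """Eliminations (n, r, c) of the unified Hidden Pairs rule, if it applies."""
    if (n1 != n2 and conjugate(cand, n1, cell1, cell2, unit_type)
            and conjugate(cand, n2, cell1, cell2, unit_type)):
        return {(n, r, c) for (r, c) in (cell1, cell2)
                for n in cand[(r, c)] - {n1, n2}}
    return set()


# Toy grid: in row 1, numbers 1 and 2 remain possible only in columns 1 and 2.
cand = {(r, c): set(range(1, 10)) for r in range(1, 10) for c in range(1, 10)}
for c in range(3, 10):
    cand[(1, c)] -= {1, 2}
in_row = conjugate(cand, 1, (1, 1), (1, 2), "row")
in_blk = conjugate(cand, 1, (1, 1), (1, 2), "blk")
removed = hidden_pairs(cand, 1, 2, (1, 1), (1, 2), "row")
```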
The counterpart of this uniformity, and the reason why we have not based this
chapter on this global auxiliary predicate for conjugacy along any unit, is that it
would have hidden part of the symmetries and supersymmetries linking all the rules
relative to Pairs (see section 3.3 below).
In our first example (puzzle Royle17-262, Figure 2), the first rule one can apply
to the L1_0 elaborated puzzle is Hidden-Pairs-in-a-column. Notice the way it is
displayed in the nrc-notation.
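Puzzle Royle17-262     L1_0 elaboration       Solution
. . . . . . . 3 6      . . . . . . . 3 6      1 4 8 5 7 2 9 3 6
. 3 . . . . . 5 .      . 3 . . . . . 5 2      6 3 7 8 9 4 1 5 2
2 . . . . . . . .      2 . . . . . . . .      2 9 5 6 3 1 7 4 8
. . . . 6 . 8 . .      . . . . 6 7 8 . .      3 1 4 2 6 7 8 9 5
7 . . . . . 4 . .      7 . . . . . 4 . .      7 5 6 1 8 9 4 2 3
. . . . 5 3 . . .      . . . . 5 3 . . .      8 2 9 4 5 3 6 7 1
. . . 7 . . 2 1 .      . . . 7 . 6 2 1 .      5 8 3 7 4 6 2 1 9
. 6 . 9 . . . . .      . 6 . 9 . . . . .      4 6 2 9 1 5 3 8 7
. . 1 . . . . . .      . . 1 . . . . . .      9 7 1 3 2 8 5 6 4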
Resolution path in L1_0+NP+HP for the L1_0+NP (or L1_0) elaboration of Royle17-262:
hidden-pairs-in-a-column {n3 n5}{r8 r9}c7 ==> r9c7 ≠ 9, r9c7 ≠ 7, r9c7 ≠ 6
… (Naked-Singles and Hidden-Singles)
The puzzles in Figures 3 and 4 are interesting examples, showing that, even at
the level of these basic rules, solving some puzzles may require an elaborate
combination of Naked-Pairs and Hidden-Pairs in rows, columns and blocks. Notice
how (naked or hidden) pairs in blocks are displayed.
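Puzzle                 Elaboration            Solution
. . . . . . . 2 1      . . . . . . . 2 1      8 9 4 6 7 5 3 2 1
. 7 . 3 . . . . .      . 7 . 3 . 2 . . .      5 7 1 3 9 2 6 4 8
. . . 4 . 8 . . .      . 2 . 4 . 8 . . .      3 2 6 4 1 8 5 7 9
1 . . . 6 . . 5 .      1 . . . 6 . 2 5 .      1 4 8 9 6 3 2 5 7
. 3 . . . . 4 . .      . 3 2 . . . 4 . .      9 3 2 5 8 7 4 1 6
. . . . 2 . . . .      . . . . 2 . . . .      7 6 5 1 2 4 9 8 3
2 . . . 5 . . . .      2 . . . 5 . . . .      2 1 3 8 5 6 7 9 4
. . . 7 . . 8 . .      . . . 7 . . 8 . 2      4 5 9 7 3 1 8 6 2
6 . . . . . . . .      6 . . 2 . . . . .      6 8 7 2 4 9 1 3 5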
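Puzzle                 Elaboration            Solution
. . . . . . . 6 1      . . . . . . . 6 1      7 2 3 5 4 8 9 6 1
4 . . 2 . . . . .      4 . . 2 . . . . .      4 8 1 2 9 6 3 7 5
9 . . . . . . . .      9 . . . . . . . .      9 5 6 1 3 7 8 2 4
. 6 7 . 1 5 . 3 .      8 6 7 4 1 5 . 3 .      8 6 7 4 1 5 2 3 9
. 1 . . 7 . . . .      . 1 4 . 7 . . . .      3 1 4 9 7 2 6 5 8
. . . . . . 4 . .      . . . . . . 4 1 7      5 9 2 6 8 3 4 1 7
2 . . 8 . . 5 . .      2 . . 8 . . 5 . .      2 7 9 8 6 1 5 4 3
. 3 . . . . . . .      . 3 . . . . . . .      6 3 8 7 5 4 1 9 2
. . . . . . . . .      . . . . . . . . .      1 4 5 3 2 9 7 8 6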
Notice that:
NP(row) ⎯> NP(col) under Src, i.e. under the permutation: row <⎯> column;
NP(row) ⎯> HP(row) under Scn, i.e. under the permutation: column <⎯> number;
NP(col) ⎯> HP(col) under Srn, i.e. under the permutation: row <⎯> number.
One might therefore be tempted to think that these permutations can be combined
in the simplest manner, and that the three rules correspond to each other through
the three permutations one can do on the three symbols n, r, c, as was the case for
the NS and HS rules in chapter V. For instance, one might think that applying
symmetry Scn to NP(row) and then symmetry Srn to HP(row) is equivalent to applying
directly symmetry Src to NP(row) and that, consequently Srn•Scn(row) should
be NP(col), and similarly Srn•Scn(col) should be NP(row). But, as suggested in
section IV.5.1.3, this is not true!
As can be seen on the logical formulæ (see section 3.3 for details), the difference
with what happened in the case of NS and HS is related to the number of quantifiers
concerned by each of these symmetries: this number is not the same in all the cases.
Geometrically, this is also explained by the fact that the symmetries do not apply in
the same spaces.
NP(row) <⎯Scn⎯> HP(row) <⎯Srn⎯> SHP(row)
The full story is (temporarily) given by Figure 5, where double sided arrows
indicate symmetries and two new rules, that remain to be identified, have been introduced:
SHP(row) and SHP(col). This graph is more complex than the one we had
for Singles in chapter V. For simplicity, analogies are not displayed.
Subset rules, level two: Pairs 111
The SHP(row) rule is obtained from the HP(row) rule by permuting the words
row and number in row-number space, according to meta-theorem 3.
Let us first do this permutation formally, i.e. by applying the Srn transform to
HP(row) = Scn(NP(row)). Super-Hidden-Pairs-in-rows (logical formulation):
∀n∀r1≠r2∀c1≠c2
{ rn-bivalue(r1, n, c1, c2) &
rn-bivalue(r2, n, c1, c2)
⇒
∀r∉{r1, r2}∀c∈{c1, c2} not-candidate(n, r, c) }.
Let us now try to understand the result. First comes the literal English transcription
of the logical formula:
Admittedly, this is not absolutely clear. So let us try to make it a little bit more
explicit by a new equivalent formulation: if there is a number n, and there are two
different rows r1 and r2, such that, in these rows, n appears as a candidate only in columns
c1 and c2, then, in any of the two columns, eliminate n from the candidates for
any row other than r1 and r2.
Here comes the surprise: this is the usual formulation of X-Wing-in-rows – the
direct proof of which is obvious (in each of the two rows there are two cells that can
receive the instance of n in this row, and these two instances cannot both be in
the same column; therefore, whatever their exact position may be in each of the two
rows, there is one of them in each of the two columns; which implies that, in each of
the two columns there can be no instance of n but in the two rows).
Finally, we have shown that the familiar X-Wing-in-rows rule is the super-
hidden version of Naked-Pairs-in-a-row: SHP(row) ≡ Srn(HP(row)) = X-Wing(row).
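Read operationally, the rule is short. The sketch below is one possible Python transcription under stated assumptions: `cands` maps 1-based `(r, c)` cells to sets of candidate numbers, and the name `x_wing_in_rows` is hypothetical. The dictionary `cols_of` plays the role of the rn-cells: for each row, the columns still possible for n.

```python
from itertools import combinations

def x_wing_in_rows(cands, n):
    """Eliminations (n, r, c) for number n by the SHP(row) / X-Wing-in-rows rule."""
    # rn-cell (r, n): the columns where n is still a candidate in row r.
    cols_of = {r: {c for c in range(1, 10) if n in cands[(r, c)]}
               for r in range(1, 10)}
    eliminations = set()
    for r1, r2 in combinations(range(1, 10), 2):
        # Both rn-cells are bivalue and hold the same two columns.
        if len(cols_of[r1]) == 2 and cols_of[r1] == cols_of[r2]:
            for r in range(1, 10):
                if r not in (r1, r2):
                    for c in cols_of[r1] & cols_of[r]:
                        eliminations.add((n, r, c))
    return eliminations

# Toy grid (an assumption): in rows 2 and 6, number 5 fits only in columns 3 and 8.
cands = {(r, c): set(range(1, 10)) for r in range(1, 10) for c in range(1, 10)}
for r in (2, 6):
    for c in range(1, 10):
        if c not in (3, 8):
            cands[(r, c)].discard(5)
eliminations = x_wing_in_rows(cands, 5)
```

Seen this way, the function is Naked-Pairs applied to `cols_of`, which is the content of the identity SHP(row) = X-Wing(row).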
For completeness, let us just write the logical formulation of the SHP(col) rule.
We leave it as an exercise to the reader to check that there are two paths to obtain
the same result: either starting from HP(col) and applying number-column (i.e. Scn)
permutation (meta-theorem 3), or starting from SHP(row) and applying row-column
(i.e. Src) permutation (meta-theorem 2). Super-Hidden-Pairs-in-columns (logical
formulation):
∀n∀c1≠c2∀r1≠r2
{ cn-bivalue(c1, n, r1, r2) &
cn-bivalue(c2, n, r1, r2)
⇒
∀c∉{c1, c2}∀r∈{r1, r2} not-candidate(n, r, c) }.
Similarly to the previous case, we find that the familiar X-Wing-in-columns rule
is the supersymmetric version of Naked-Pairs-in-a-column: SHP(col) ≡ Scn(HP(col)) = X-Wing(col).
Now, several natural questions may arise in the mind of the awfully inquisitive
reader, such as:
– what if, instead of applying symmetry Scn to NP(row), we apply symmetry Srn?
– what if we formulate a rule analogous to X-Wing-in-rows but in row-number
space – i.e. a rule that should be called Hidden-X-Wing-in-rows or HXW(row) or
HSHP(row)?
Do we get new unknown rules? The answer is no, and the previous set of rules is
strongly closed under symmetry and supersymmetry. More specifically, the real full
story is not to be found in Figure 5 but in Figure 6. The practical consequence of this
for the sequel is that it exempts us from searching for unknown types of rules
dealing with Pairs of any kind.
After the first edition of this book was published, I was informed that the idea of
"another view of fish" (i.e. of X-Wings, Swordfish and Jellyfish – see chapters VII
and VIII) had already been expressed on the Sudoku Players Forum by "Arcilla".
But, as it lacked formal support and the general idea of supersymmetry, it did not
develop into a global framework and it led neither to the systematic relationships
displayed in Figure 6 below nor to the idea of hidden chains.
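NP(row) <⎯Scn⎯> HP(row) <⎯Srn⎯> SHP(row)
NP(col) <⎯Srn⎯> HP(col) <⎯Scn⎯> SHP(col)
NP(row) <⎯Src⎯> NP(col); HP(row) <⎯Src⎯> HP(col); SHP(row) <⎯Src⎯> SHP(col)
NP(row) <⎯Srn⎯> SHP(col); SHP(row) <⎯Scn⎯> NP(col)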
Figure 6. The full set of symmetries and supersymmetries for Pairs
It is worth checking some of the details and proving some of the above assertions.
This is an easy exercise about Src, Srn and Scn transforms, provided that we are
very careful with the indices.
Consider atomic formula rc-bivalue(ri, cj, nk1, nk2). By Scn, it becomes rn-
bivalue(ri, nj, ck1, ck2). By Srn, this last formula becomes rn-bivalue(rj, ni, ck1, ck2). In
turn, by Src, this becomes cn-bivalue(cj, ni, rk1, rk2); notice that this is the same thing
as the Srn transform of the original formula. Let us now apply the same series of
transformations to rule NP(row). But, let us first rewrite it with indices on the
variables. NP(row):
∀ri∀cj1≠cj2∀nk1≠nk2
{ rc-bivalue(ri, cj1, nk1, nk2) &
rc-bivalue(ri, cj2, nk1, nk2)
⇒
∀c∉{cj1, cj2}∀n∈{nk1, nk2} not-candidate(n, ri, c) }.
Its Scn transform is HP(row):
∀ri∀nj1≠nj2∀ck1≠ck2
{ rn-bivalue(ri, nj1, ck1, ck2) &
rn-bivalue(ri, nj2, ck1, ck2)
⇒
∀n∉{nj1, nj2}∀c∈{ck1, ck2} not-candidate(n, ri, c) }.
In turn, the Srn transform of HP(row) is SHP(row):
∀ni∀rj1≠rj2∀ck1≠ck2
{ rn-bivalue(rj1, ni, ck1, ck2) &
rn-bivalue(rj2, ni, ck1, ck2)
⇒
∀r∉{rj1, rj2}∀c∈{ck1, ck2} not-candidate(ni, r, c) }.
Now comes the new part of the relationships. When submitted to Scn, the above
SHP(row) becomes:
∀ci∀rj1≠rj2∀nk1≠nk2
{ rc-bivalue(rj1, ci, nk1, nk2) &
rc-bivalue(rj2, ci, nk1, nk2)
⇒
∀r∉{rj1, rj2}∀n∈{nk1, nk2} not-candidate(n, r, ci) }.
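The bookkeeping on indices can be checked mechanically at the level of single candidates. In the sketch below a candidate is modelled as a triple `(n, r, c)` and each symmetry as the swap of two coordinates; the function names `s_rc`, `s_rn` and `s_cn` are assumptions made for this illustration. It only covers the coordinate-level view, not the quantifier structure of the rules.

```python
from itertools import product

# A candidate is modelled as a triple (n, r, c); each symmetry swaps two coordinates.
def s_rc(t):
    n, r, c = t
    return (n, c, r)

def s_rn(t):
    n, r, c = t
    return (r, n, c)

def s_cn(t):
    n, r, c = t
    return (c, r, n)

triples = list(product(range(1, 10), repeat=3))

# Scn, then Srn, then Src acts like Srn alone (the remark made on the atomic formula).
assert all(s_rc(s_rn(s_cn(t))) == s_rn(t) for t in triples)
# Each symmetry is its own inverse, hence the double-sided arrows in the graph.
assert all(s(s(t)) == t for s in (s_rc, s_rn, s_cn) for t in triples)
# Scn followed by Srn is not Src: the naive shortcut fails.
assert any(s_rn(s_cn(t)) != s_rc(t) for t in triples)
```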
[Figure 7: grids for puzzle Royle17-6973]
[Figure 8: grid Royle17-6973* in rc-space, with the remaining candidates]
It is worth dwelling a little on the situation for the candidate sets after propagation
of all the elementary constraints has occurred (Figure 8), which corresponds to
starting from the second grid in Figure 7 (call it Royle17-6973*).
One can see lots of Naked-Pairs and Naked-Triplets (see next chapter for a
definition) in rows, columns and blocks; but none of them can produce any new
result. In columns c1 and c7, one can also see our X-Wing for number 5: in these
two columns, this number appears only in rows r1 and r7. Therefore one can delete
number 5 from any cell in rows r1 and r7 unless it is in column c1 or c7, i.e. in this
case from cells r7c5, r1c2, r1c3. (In the above listing, deletion of 5 from r1c2 is
interrupted by the application of a simpler rule, a mere artifact of SudoRules).
Figure 9. Grid Royle17-6073* in cn-space, with the remaining candidates. Remember that
numbers in the cn-cells represent column-candidates for these cells
Here, in simili-row 5, one can see a simili Naked-Pairs in columns c1 and c7 for
values 1 and 7 (i.e. rows r1 and r7). Application of simili NP(row), eliminates 1 (i.e.
r1) from n5c2 and n5c3 and 7 (i.e. r7) from n5c5 – which corresponds exactly to
eliminating number 5 from r1c2, r1c3 and r7c5, as we did in the previous row-
column representation.
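The change of representation is easy to sketch. In the Python fragment below (an illustration under stated assumptions, not SudoRules code), `to_cn_space` turns a table of rc-candidates into cn-cells whose values are rows, and an ordinary Naked-Pairs-in-a-row routine is then run on that table unchanged. The names `to_cn_space` and `naked_pairs_in_rows` and the toy grid are hypothetical; the toy grid only mimics the pattern of number 5 in columns c1 and c7.

```python
from itertools import combinations

def to_cn_space(cands):
    """Map rc-candidates {(r, c): numbers} to cn-cells {(n, c): rows}."""
    cn = {(n, c): set() for n in range(1, 10) for c in range(1, 10)}
    for (r, c), numbers in cands.items():
        for n in numbers:
            cn[(n, c)].add(r)
    return cn

def naked_pairs_in_rows(cells):
    """Plain Naked-Pairs-in-a-row on any 9x9 table {(row, col): values}.

    Returns the eliminations as (row, col, value) triples.
    """
    out = set()
    for row in range(1, 10):
        for c1, c2 in combinations(range(1, 10), 2):
            pair = cells[(row, c1)]
            if len(pair) == 2 and pair == cells[(row, c2)]:
                for c in range(1, 10):
                    if c not in (c1, c2):
                        out |= {(row, c, v) for v in pair & cells[(row, c)]}
    return out

# Toy grid (an assumption, not the book's puzzle): in columns c1 and c7,
# number 5 is a candidate only in rows r1 and r7.
cands = {(r, c): set(range(1, 10)) for r in range(1, 10) for c in range(1, 10)}
for c in (1, 7):
    for r in range(1, 10):
        if r not in (1, 7):
            cands[(r, c)].discard(5)

# In cn-space the key is (n, c) and the values are rows, so each triple reads
# (n, c, r): "row r is no longer possible for number n in column c".
eliminations = naked_pairs_in_rows(to_cn_space(cands))
```

Each resulting triple `(5, c, r)` is the elimination of number 5 from rc-cell `(r, c)`, i.e. what X-Wing-in-columns yields in the standard representation.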
Our second example (puzzle Royle17-5499, Figure 10) illustrates how it may be
necessary to combine X-Wing and Naked-Pairs.
Figure 10. Puzzle Royle17-5499, its L1_0 elaboration and its solution
Finally, our third example (puzzle Royle17-32408, Figure 11) illustrates how
things can be more complex, even when only rules for Singles and Pairs are needed
to solve a puzzle; its solution combines two kinds of Naked-Pairs (in rows and in
blocks), two kinds of Hidden-Pairs (in rows and in blocks) and two X-Wings (in
columns).
Figure 11. Puzzle Royle17-32408, its L1_0 elaboration and its solution
hidden-pairs-in-a-row {n3 n8}r3{c2 c3} ==> r3c3 ≠ 9, r3c3 ≠ 5, r3c3 ≠ 4, r3c2 ≠ 9, r3c2 ≠ 7, r3c2 ≠ 5
x-wing-in-columns n9{r3 r5}{c1 c8} ==> r5c9 ≠ 9, r5c3 ≠ 9, r5c2 ≠ 9, r3c9 ≠ 9
x-wing-in-columns n7{r3 r6}{c1 c8} ==> r6c9 ≠ 7, r6c2 ≠ 7, r3c9 ≠ 7
hidden-pairs-in-a-block {n7 n9}{r1c9 r3c8} ==> r3c8 ≠ 5, r3c8 ≠ 4, r1c9 ≠ 4, r1c9 ≠ 2
hidden-single-in-a-block ==> r1c7 = 2
naked-pairs-in-a-row {n4 n5}r4{c5 c7} ==> r4c9 ≠ 4, r4c3 ≠ 5, r4c3 ≠ 4
… (Naked-Singles)
VI.4. Theory L2
L2 = L1 ∪ NP ∪ HP ∪ SHP.
At level L2, there will be no additional rules. Notice that the symmetry relationships
described by Figure 6 show that L2 is supersymmetric.
Chapter VII. Subset rules, level three: Triplets
Chapter VI has illustrated in full detail our approach on all rules relative to Pairs.
In this chapter, we shall not write the rules relative to Triplets with all the details
given for Pairs. Instead, we shall leave part of this work to the reader and concentrate
on features that Pairs did not exhibit, particularly in what concerns the way
these rules must be formulated.
The following familiar rules will be studied and their relations through symmetry,
analogy and supersymmetry will be established:
– Naked-Triplets-in-a-row, or NT(row) for short;
– Naked-Triplets-in-a-column, or NT(col) for short;
– Naked-Triplets-in-a-block, or NT(blk) for short;
– Hidden-Triplets-in-a-row, or HT(row) for short;
– Hidden-Triplets-in-a-column, or HT(col) for short;
– Hidden-Triplets-in-a-block, or HT(blk) for short.
The super hidden version of each of these rules will also be introduced and
proven to be respectively equivalent to the more familiar Swordfish-in-rows and
Swordfish-in-columns:
– Super-Hidden-Triplets-in-rows, or SHT(row) for short;
– Super-Hidden-Triplets-in-columns, or SHT(col) for short.
This will give a graph of symmetries (Figure 1, where analogies are not displayed)
similar to the one we had for Pairs.
NT(row) <⎯Scn⎯> HT(row) <⎯Srn⎯> SHT(row)
But, again, this is not the full story. Since the number of cells (and quantifiers)
concerned by these rules is higher than in the case of Pairs, one may fear that the
real full story is awfully more complicated – but this is not true. As in the case of
Pairs, it is to be found in Figure 2. The additional symmetries can be proved easily
by checking the precise logical formulation of all the rules, exactly as was done for
Pairs.
VII.1. Naked-Triplets
VII.1.1. Naked-Triplets-in-a-row
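[Figure 2, as a list of edges]
NT(row) <⎯Scn⎯> HT(row) <⎯Srn⎯> SHT(row)
NT(col) <⎯Srn⎯> HT(col) <⎯Scn⎯> SHT(col)
NT(row) <⎯Src⎯> NT(col); HT(row) <⎯Src⎯> HT(col); SHT(row) <⎯Src⎯> SHT(col)
NT(row) <⎯Srn⎯> SHT(col); SHT(row) <⎯Scn⎯> NT(col)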
So, neither of the two usual formulations of the Naked-Triplets rule is correct
according to our guiding principles. How then can one formulate this rule so that it
is comprehensive but does not subsume Naked-Pairs-in-a-row or Naked-Single-in-a-
row? It is enough to make certain that the three cells have no candidate other than
the three given numbers (say n1, n2 and n3), that each of them has more than one
candidate and that no two of them have exactly the same two candidates. The only
way to do this is to impose candidates n1 and n2 for cell 1, candidates n2 and n3 for
cell 2 and candidates n3 and n1 for cell 3. We get the final formulation, more complex
than the usual ones but with its full natural scope:
∀r∀3≠c1c2c3∀3≠n1n2n3
{ candidate(n1, r, c1) & candidate(n2, r, c1) &
∀n∉{n1, n2, n3} not-candidate(n, r, c1) &
candidate(n2, r, c2) & candidate(n3, r, c2) &
∀n∉{n1, n2, n3} not-candidate(n, r, c2) &
candidate(n3, r, c3) & candidate(n1, r, c3) &
∀n∉{n1, n2, n3} not-candidate(n, r, c3)
⇒
∀c∉{c1, c2, c3}∀n∈{n1, n2, n3} not-candidate(n, r, c) }.
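The premises of this formula can be tested on three candidate sets. The sketch below is an illustration only; the function name `is_comprehensive_naked_triplet` is an assumption. It keeps the three cells in a fixed order and tries every labelling n1, n2, n3 of the three numbers, which covers all the ways of matching the cells with the pairs {n1, n2}, {n2, n3}, {n3, n1}.

```python
from itertools import permutations

def is_comprehensive_naked_triplet(cell_a, cell_b, cell_c):
    """cell_a, cell_b, cell_c: candidate sets of three distinct cells of one row."""
    numbers = cell_a | cell_b | cell_c
    if len(numbers) != 3:
        return False  # the three cells must hold exactly three numbers in all
    for n1, n2, n3 in permutations(numbers):
        # Cyclic premises of the rule: (n1, n2), (n2, n3), (n3, n1).
        if {n1, n2} <= cell_a and {n2, n3} <= cell_b and {n3, n1} <= cell_c:
            return True
    return False

print(is_comprehensive_naked_triplet({1, 2}, {2, 3}, {1, 3}))     # True
print(is_comprehensive_naked_triplet({1, 2}, {1, 2}, {1, 2, 3}))  # False
print(is_comprehensive_naked_triplet({1}, {2, 3}, {1, 2, 3}))     # False
```

The second call fails because two cells already form a Naked-Pair, and the third because one cell is a Naked-Single: these are the cases the formulation is designed not to subsume.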
VII.1.2. Naked-Triplets-in-a-column
∀c∀3≠r1r2r3∀3≠n1n2n3
{ candidate(n1, r1, c) & candidate(n2, r1, c) &
∀n∉{n1, n2, n3} not-candidate(n, r1, c) &
candidate(n2, r2, c) & candidate(n3, r2, c) &
∀n∉{n1, n2, n3} not-candidate(n, r2, c) &
candidate(n3, r3, c) & candidate(n1, r3, c) &
∀n∉{n1, n2, n3} not-candidate(n, r3, c)
⇒
∀r∉{r1, r2, r3}∀n∈{n1, n2, n3} not-candidate(n, r, c) }.
VII.1.3. Naked-Triplets-in-a-block
∀b∀3≠s1s2s3∀3≠n1n2n3
{ candidate[n1, b, s1] & candidate[n2, b, s1] &
∀n∉{n1, n2, n3} not-candidate[n, b, s1] &
candidate[n2, b, s2] & candidate[n3, b, s2] &
∀n∉{n1, n2, n3} not-candidate[n, b, s2] &
candidate[n3, b, s3] & candidate[n1, b, s3] &
∀n∉{n1, n2, n3} not-candidate[n, b, s3]
⇒
∀s∉{s1, s2, s3}∀n∈{n1, n2, n3} not-candidate[n, b, s] }.
[Figures: grids of two example puzzles]
VII.2. Hidden-Triplets
VII.2.1. Hidden-Triplets-in-a-row
In our databases, puzzle Royle17-13727 (Figure 5) is the one with the shortest
resolution path (apart from NS and HS) among those requiring HT.
[Figure 5: grids for puzzle Royle17-13727]
Now, in row-number space, there remains to consider a rule that one could call
Super-Hidden-Triplets-in-a-row (or Naked-Triplets-in-a-number in row-number
space, by analogy with Naked-Triplets-in-a-column in row-column space). This rule
is obtained from Hidden-Triplets-in-a-row by permuting the words "row" and
"number", according to meta-theorem 3. Let us first do this permutation formally,
i.e. by applying the Srn transform to HT(row) = Scn(NT(row)). Super-Hidden-
Triplets-in-rows (logical formulation):
∀n∀3≠r1r2r3∀3≠c1c2c3
{ candidate(n, r1, c1) & candidate(n, r1, c2) &
∀c∉{c1, c2, c3} not-candidate(n, r1, c) &
candidate(n, r2, c2) & candidate(n, r2, c3) &
∀c∉{c1, c2, c3} not-candidate(n, r2, c) &
candidate(n, r3, c3) & candidate(n, r3, c1) &
∀c∉{c1, c2, c3} not-candidate(n, r3, c)
⇒
∀r∉{r1, r2, r3}∀c∈{c1, c2, c3} not-candidate(n, r, c) }.
Let us now try to understand the result. First comes the direct English transliteration:
Admittedly, this is not very explicit. So let us try to clarify it a little by temporarily
forgetting part of the conditions: if there is a number n, and there are three different
rows r1, r2 and r3 and three different columns c1, c2 and c3, such that for each of
the three rows the instance of number n that must be somewhere in each of these
rows can actually only be in one of the three columns, then in any of the three
columns eliminate n from the candidates for any row different from the given three.
Subset rules, level three: Triplets 131
After chapter VI and our approach of X-Wings, this should not be a total surprise:
we find the advanced formulation of the Swordfish-in-rows rule – the direct
proof of which is obvious (in each of the three rows there are three cells that can
receive the unique instance of n in this row, and no two of these three instances
can be in the same column; therefore, whatever the exact positions of n may be in
each of the three rows, there is one of them in each of the three columns; which
implies that, in each of the three columns there can be no instance of n but in the
three given rows).
There remains one point to examine. In the last English formulation of this rule,
we have discarded part of the conditions. This part corresponds to what we have
added to Comprehensive-Naked-Triplets-in-a-row; it is just what prevents Swordfish-in-rows
from reducing to X-wing. Finally, we have not only shown that Swordfish-in-rows
is the supersymmetric version of Naked-Triplets-in-a-row, but we have
also found the proper way to write this rule according to our guiding principles, in as
comprehensive a way as possible.
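The simplified English formulation (the one that temporarily forgets the extra conditions) can be sketched for any size. The function below is an illustration, not the comprehensive rule: the name `fish_in_rows`, the `cands` layout (1-based `(r, c)` cells mapped to candidate sets) and the choice to ignore rows where n has fewer than two possible columns are all assumptions. With `size=3` it behaves like the simplified Swordfish-in-rows, so it can also fire on patterns that the comprehensive formulation leaves to X-Wing.

```python
from itertools import combinations

def fish_in_rows(cands, n, size):
    """Eliminations (n, r, c) by the simplified fish-in-rows rule of the given size."""
    cols_of = {r: {c for c in range(1, 10) if n in cands[(r, c)]}
               for r in range(1, 10)}
    # Candidate base rows: n is not yet placed and fits in at most `size` columns.
    rows = [r for r in range(1, 10) if 2 <= len(cols_of[r]) <= size]
    eliminations = set()
    for base in combinations(rows, size):
        cover = set().union(*(cols_of[r] for r in base))
        if len(cover) == size:  # `size` rows share exactly `size` columns
            for r in range(1, 10):
                if r not in base:
                    eliminations |= {(n, r, c) for c in cover & cols_of[r]}
    return eliminations

# Toy grid (an assumption): number 7 in rows 2, 5, 8 fits only in columns 1, 4, 9,
# following the cyclic pattern (c1, c4), (c4, c9), (c9, c1).
cands = {(r, c): set(range(1, 10)) for r in range(1, 10) for c in range(1, 10)}
for r, keep in ((2, {1, 4}), (5, {4, 9}), (8, {9, 1})):
    for c in range(1, 10):
        if c not in keep:
            cands[(r, c)].discard(7)
eliminations = fish_in_rows(cands, 7, 3)
```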
[Figure 6: grids for puzzle Royle17-18966]
[Figure 7: puzzle Royle17-18966 with its candidates, in rc- and cn-spaces]
The first (puzzle Royle17-18966, Figure 6) is interesting in that the simplest rule
applicable to its L2 elaboration (which is equal to its L1_0 elaboration) is Swordfish
and this is enough (apart from NS, HS and ECP) to solve the grid.
Resolution path in L3_0 for the L2+NT+HT (or L1_0) elaboration of Royle17-18966:
swordfish-in-columns n7{r2 r8 r3}{c2 c4 c8} ==> r8c6 ≠ 7, r8c5 ≠ 7, r3c6 ≠ 7
… (Naked-Singles and Hidden-Singles)
In this case, it is worth considering the full grid with candidates, both in rc- and
cn- spaces (Figure 7). Spotting this Swordfish in the standard representation may be
difficult because it is very degenerate (some of the cells on which it lies are even
decided). In the cn-representation, it looks like a very degenerate Naked-Triplets,
but still a Naked-Triplets. After we have spoken of hxy-chains, it will also appear as
an hxy-cn-chain of length 3. The four candidates eliminated by the Swordfish-in-columns
rule are shown in slightly larger bold italics, in both grids.
[Figure: grids for puzzle Royle17-34029]
Resolution path in L3_0 for the L2+NT+HT (or L1_0) elaboration of Royle17-34029:
hidden-pairs-in-a-column {n1 n7}{r2 r3}c6 ==> r3c6 ≠ 9, r3c6 ≠ 8, r3c6 ≠ 4, r3c6 ≠ 2, r2c6 ≠ 9, r2c6 ≠ 4, r2c6 ≠ 2
swordfish-in-columns n8{r6 r3 r8}{c5 c7 c9} ==> r8c8 ≠ 8
naked-single ==> r8c8 = 9
row r1 interaction-with-block b2 ==> r3c5 ≠ 9, r2c5 ≠ 9
… (Naked-Singles and Hidden-Singles)
For our final example (puzzle Royle17-35491, Figure 9), the L2 elaboration
(which has only one more value than the L1_0 elaboration) requires combining
HP(col) and SHP(col).
[Figure 9: grids for puzzle Royle17-35491]
Finally, let us define theory L3_0 as the union of L2 with the set of rules defined
in the present chapter:
L3_0 = L2 ∪ NT ∪ HT ∪ SHT.
Full level 3 and theory L3 will be obtained from L3_0 by adding XY-Wing and
XYZ-wing rules (see chapter X).
Chapter VIII. Subset rules, level four: Quadruplets
As the case of Quadruplets is very similar to that of Triplets, the first three
sections of this chapter strictly parallel those of chapter VII, except for the specific
examples.
The following familiar rules will be studied and their relationships through symmetry,
analogy and supersymmetry will be established:
– Naked-Quadruplets-in-a-row, or NQ(row) for short;
– Naked-Quadruplets-in-a-column, or NQ(col) for short;
– Naked-Quadruplets-in-a-block, or NQ(blk) for short;
– Hidden-Quadruplets-in-a-row, or HQ(row) for short;
– Hidden-Quadruplets-in-a-column, or HQ(col) for short;
– Hidden-Quadruplets-in-a-block, or HQ(blk) for short.
The super hidden version of these rules will also be introduced and proven to be
equivalent to the popular Jellyfish-in-rows and Jellyfish-in-columns respectively:
– Super-Hidden-Quadruplets-in-rows, or SHQ(row) for short;
– Super-Hidden-Quadruplets-in-columns, or SHQ(col) for short.
This will give a graph of symmetries (Figure 1, where analogies are not displayed)
identical in its structure to those we had for Pairs or Triplets.
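[Figure 2, as a list of edges]
NQ(row) <⎯Scn⎯> HQ(row) <⎯Srn⎯> SHQ(row)
NQ(col) <⎯Srn⎯> HQ(col) <⎯Scn⎯> SHQ(col)
NQ(row) <⎯Src⎯> NQ(col); HQ(row) <⎯Src⎯> HQ(col); SHQ(row) <⎯Src⎯> SHQ(col)
NQ(row) <⎯Srn⎯> SHQ(col); SHQ(row) <⎯Scn⎯> NQ(col)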
But, as for Pairs or Triplets, this is not the real full story, which is to be found in
Figure 2. And, here again, the additional symmetries can easily be proved by checking
the precise logical formulation of all the rules (and, in particular, that the
numbers of quantifiers of each appropriate sort are equal).
We shall also analyse the relationship between Naked and Hidden subsets of
complementary cardinalities and show (a well known result) that it is not necessary
to consider Subset rules for more than four cells.
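The complementarity between Naked and Hidden subsets can be illustrated on a single unit. In the sketch below, `unit` lists only the undecided cells of one unit with their candidates; the cell labels, the function names `naked_eliminations` and `hidden_eliminations` and the sample data are assumptions made for the example. A Naked subset on five of the seven cells and the Hidden subset on the two remaining numbers produce the same eliminations, so only the smaller of the two ever needs to be searched for, and with at most nine cells in a unit the smaller one never has more than four elements.

```python
def naked_eliminations(unit, cells):
    """unit: {cell: candidates}; cells: a Naked subset. Removes its numbers elsewhere."""
    numbers = set().union(*(unit[x] for x in cells))
    assert len(numbers) == len(cells), "not a Naked subset"
    return {(x, n) for x in unit if x not in cells for n in unit[x] & numbers}

def hidden_eliminations(unit, numbers):
    """numbers: a Hidden subset. Removes every other candidate from the cells holding them."""
    cells = {x for x in unit if unit[x] & numbers}
    assert len(cells) == len(numbers), "not a Hidden subset"
    return {(x, n) for x in cells for n in unit[x] - numbers}

# Seven undecided cells of one unit (hypothetical data).
unit = {"a": {1, 2}, "b": {2, 3}, "c": {3, 4}, "d": {4, 5}, "e": {5, 1},
        "f": {1, 6, 7}, "g": {3, 6, 7}}

naked = naked_eliminations(unit, {"a", "b", "c", "d", "e"})  # size 5
hidden = hidden_eliminations(unit, {6, 7})                   # size 7 - 5 = 2
assert naked == hidden

# For k undecided cells, the smaller of the two complementary sizes is at most 4.
assert max(min(m, k - m) for k in range(2, 10) for m in range(1, k)) == 4
```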
Finally, we shall introduce the first four levels of our complexity hierarchy.
VIII.1. Naked-Quadruplets
∀r∀4≠c1c2c3c4∀4≠n1n2n3n4
{ candidate(n1, r, c1) & candidate(n2, r, c1) &
∀n∉{n1, n2, n3, n4} not-candidate(n, r, c1) &
candidate(n2, r, c2) & candidate(n3, r, c2) &
∀n∉{n1, n2, n3, n4} not-candidate(n, r, c2) &
candidate(n3, r, c3) & candidate(n4, r, c3) &
∀n∉{n1, n2, n3, n4} not-candidate(n, r, c3) &
candidate(n4, r, c4) & candidate(n1, r, c4) &
∀n∉{n1, n2, n3, n4} not-candidate(n, r, c4)
⇒
∀c∉{c1, c2, c3, c4}∀n∈{n1, n2, n3, n4} not-candidate(n, r, c) }.
Later, this clarification of the Naked-Quadruplets rule will allow us to show how
close it is to xy4-chains, but also why it should not be reduced to them.
Subset rules, level four: Quadruplets 139
Our first example (Sudogen17-6947, Figure 3) has identical L3 and L1_0 elaborations.
It cannot be solved using only Subset rules (it requires the very classical
XY-Wing and XYZ-Wing, also named xy3 and xyz3).
[Figure 3: grids for puzzle Sudogen17-6947]
applied again to obtain the remaining conclusions. But, of course, all the conclusions
of naked Quads could be obtained before xyz3 applies.
[Figure: grids of an example puzzle]
VIII.2. Hidden-Quadruplets
As for Triplets, the proper formulation of rules for Hidden Quadruplets would
not be obvious if we could not rely on super-symmetries and meta-theorem 3.
VIII.2.1. Hidden-Quadruplets-in-a-row
Now, in row-number space, there remains to consider a rule that one could call
Super-Hidden-Quadruplets-in-rows (or Naked-Quadruplets-in-a-number in row-
number space, by analogy with the Naked-Quadruplets-in-a-column rule in row-
column space). This rule is obtained from Hidden-Quadruplets-in-a-row by permuting
the words "row" and "number", according to meta-theorem 3. Let us first do
this permutation formally, i.e. by applying the Srn transform to HQ(row) =
Scn(NQ(row)). Super-Hidden-Quadruplets-in-rows (logical formulation):
∀n∀4≠r1r2r3r4∀4≠c1c2c3c4
{ candidate(n, r1, c1) & candidate(n, r1, c2) &
∀c∉{c1, c2, c3, c4} not-candidate(n, r1, c) &
candidate(n, r2, c2) & candidate(n, r2, c3) &
∀c∉{c1, c2, c3, c4} not-candidate(n, r2, c) &
candidate(n, r3, c3) & candidate(n, r3, c4) &
∀c∉{c1, c2, c3, c4} not-candidate(n, r3, c) &
candidate(n, r4, c4) & candidate(n, r4, c1) &
∀c∉{c1, c2, c3, c4} not-candidate(n, r4, c)
⇒
∀r∉{r1, r2, r3, r4}∀c∈{c1, c2, c3, c4} not-candidate(n, r, c) }.
Let us now try to understand the result. First comes the direct English transliteration.
Super-Hidden-Quadruplets-in-rows (English formulation): if there is a number
n, and there are four different rows r1, r2, r3 and r4 and four different columns c1,
c2, c3 and c4, such that:
rn-cell (r1, n) (in row-number space) has c1 and c2 among its candidates (columns),
rn-cell (r2, n) (in row-number space) has c2 and c3 among its candidates (columns),
rn-cell (r3, n) (in row-number space) has c3 and c4 among its candidates (columns),
rn-cell (r4, n) (in row-number space) has c4 and c1 among its candidates (columns),
none of the rn-cells (r1, n), (r2, n), (r3, n) and (r4, n) (in row-number space) has any
candidate (column) other than c1, c2, c3 and c4,
then eliminate the four columns c1, c2, c3 and c4 from the candidates for any other rn-
cell (r, n) based on number n in row-number space.
Admittedly, this is not very explicit. So let us try to clarify it a little bit by temporarily
forgetting part of the conditions: if there is a number n, and there are four
different rows r1, r2, r3 and r4 and four different columns c1, c2, c3 and c4, such that
for each of the four rows the instance of number n that must be somewhere in each
of these rows can actually only be in one of the four columns, then in any of the
four columns eliminate n from the candidates for any row different from the given
four.
After chapter VII, this should not be a surprise, since this section is a strict
parallel to the section on Swordfish: this is the usual formulation of the Jellyfish-in-rows
rule – the direct proof of which is obvious (in each of the four rows there are
four cells that can receive the unique instance of n in this row, and no two of these
four instances can be in the same column; therefore, whatever the exact positions
of n may be in each of the four rows, there is one of them in each of the four
columns; which implies that, in each of the four columns there can be no instance of
n but in the four given rows).
As for Swordfish in chapter VII, there remains a point to examine. In the last
English formulation of this rule, we have discarded part of the conditions. This part
corresponds to what we have added to Comprehensive-Naked-Quadruplets-in-a-
row; it is just what prevents Jellyfish-in-rows from reducing to X-wing-in-rows or to
Swordfish-in-rows. Finally, we have not only shown that Jellyfish-in-rows is the supersymmetric
version of Naked-Quadruplets-in-a-row, but we have also found the
proper way to write this rule according to our guiding principles, in as comprehensive
a way as possible.
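One way to read the role of the discarded conditions is to check, by brute force, that any four rows satisfying the cyclic premises contain no smaller fish among themselves. The sketch below is only an illustration of that reading: the names `has_proper_subfish` and `cyclic_patterns` are assumptions, columns are numbered 0 to k-1 for convenience, and the check says nothing about the converse.

```python
from itertools import combinations, product

def has_proper_subfish(row_cols):
    """row_cols: one set of candidate columns per base row.

    True if some proper, non-empty subset of m rows has all its candidates
    inside only m columns (a smaller fish, or a single when m = 1).
    """
    k = len(row_cols)
    for m in range(1, k):
        for rows in combinations(range(k), m):
            if len(set().union(*(row_cols[i] for i in rows))) <= m:
                return True
    return False

def cyclic_patterns(k):
    """All k-row patterns over columns 0..k-1 where row i holds columns i and i+1 (mod k)."""
    options = []
    for i in range(k):
        required = {i, (i + 1) % k}
        optional = [c for c in range(k) if c not in required]
        options.append([required | set(extra)
                        for m in range(len(optional) + 1)
                        for extra in combinations(optional, m)])
    return product(*options)

# Every pattern allowed by the cyclic premises is free of smaller fish ...
assert all(not has_proper_subfish(list(p)) for p in cyclic_patterns(4))
assert all(not has_proper_subfish(list(p)) for p in cyclic_patterns(3))
# ... whereas two stacked X-Wings pass the simplified Jellyfish test but are reducible.
assert has_proper_subfish([{0, 1}, {0, 1}, {2, 3}, {2, 3}])
```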
So-called "fishy patterns" (Swordfish, Jellyfish) are very popular, even, and
especially, the non-existent ones (such as Squirmbag, a would-be Super-Hidden-Quintuplets
in our vocabulary – see section 4 below). But we have another
reason for giving several examples: real cases (i.e. stemming from real puzzles and
not invented for ad hoc illustration purposes) are as rare as they are celebrated
among the Sudoku addicts.
For our first example (puzzle Royle17-1007, Figure 5), the strong elaboration by
theory L3 (defined below – including some rules we have not yet defined, such as
Interaction rules and XY-Wing) has only 25 empty cells.
[Figure 5: grids for puzzle Royle17-1007]
Resolution path in L3+NQ+HQ+SHQ for the L3+NQ+HQ (or L3) elaboration of Royle17-1007:
For our second example (puzzle Sudogen17-6907, Figure 6), starting from its L3
elaboration, the solution requires using successively two rare patterns: Jellyfish and
Swordfish.
[Figure 6: grids for puzzle Sudogen17-6907]
Resolution path in L3+NQ+HQ for the L3+NQ (or L3) elaboration of Sudogen17-6907:
row r3 interaction-with-block b3 ==> r1c9 ≠ 4, r1c8 ≠ 4
row r8 interaction-with-block b8 ==> r7c5 ≠ 3, r7c4 ≠ 3
row r1 interaction-with-block b3 ==> r3c9 ≠ 2, r3c8 ≠ 2
naked-pairs-in-a-row {n1 n6}r4{c2 c4} ==> r4c9 ≠ 6
hidden-pairs-in-a-row {n3 n4}r3{c8 c9} ==> r3c9 ≠ 6, r3c8 ≠ 9
x-wing-in-rows n1{r4 r8}{c2 c4} ==> r7c4 ≠ 1
jellyfish-in-columns n6{r3 r7 r9 r5}{c1 c3 c6 c7} ==> r7c5 ≠ 6
column c5 interaction-with-block b2 ==> r2c4 ≠ 6, r1c4 ≠ 6
swordfish-in-rows n6{r1 r2 r6}{c9 c5 c2} ==> r5c9 ≠ 6, r4c2 ≠ 6
… (Naked-Singles)
In our third example (Figure 7), the same Jellyfish-in-rows eliminates five
candidates but it is interrupted twice by simpler rules (this is only an artifact of
SudoRules and of the different priorities assigned to the various rules).
[Figure 7: grids for puzzle Sudogen0-9657]
Resolution path in L3+NQ+HQ for the L3+NQ (or L3) elaboration of Sudogen0-9657:
row r5 interaction-with-block b5 ==> r4c6 ≠ 8
column c1 interaction-with-block b4 ==> r4c3 ≠ 2, r4c2 ≠ 2
naked-pairs-in-a-block {n2 n8}{r2c5 r3c4} ==> r2c4 ≠ 8, r2c4 ≠ 2, r1c6 ≠ 8, r1c6 ≠ 2
jellyfish-in-rows n2{r3 r8 r4 r6}{c8 c4 c6 c1} ==> r9c4 ≠ 2, r7c6 ≠ 2
naked-pairs-in-a-column {n3 n5}{r1 r7}c6 ==> r8c6 ≠ 5
jellyfish-in-rows n2{r3 r8 r4 r6}{c8 c4 c6 c1} ==> r7c4 ≠ 2
row r7 interaction-with-block b7 ==> r9c3 ≠ 2
row r9 interaction-with-block b9 ==> r8c8 ≠ 2
xy3-chain {n5 n2}r2c9 – {n2 n8}r3c8 – {n8 n5}r8c8 ==> r9c9 ≠ 5
naked and hidden singles ==> r9c9 = 2, r2c9 = 5, r2c4 = 3, r1c6 = 5, r7c6 = 3, r1c2 = 3
jellyfish-in-rows n2{r3 r8 r4 r6}{c8 c4 c6 c1} ==> r5c8 ≠ 2, r5c6 ≠ 2
xy3-chain {n8 n5}r8c8 – {n5 n1}r5c8 – {n1 n8}r5c6 ==> r8c6 ≠ 8
… (Naked-Singles)
In our fourth and final example, puzzle Royle17-33858 (Figure 9), although
XYZ-Wing (xyz3) is applied during the L3 elaboration process, it does not lead to
the addition of a value and it is therefore applied again when we solve the L3
elaboration (which is indeed equal to the L1 elaboration).
Resolution path in L3+NQ+HQ for the L3+NQ (or L3) elaboration of Royle17-33858:
column c3 interaction-with-block b1 ==> r2c2 ≠ 8
row r2 interaction-with-block b1 ==> r3c3 ≠ 2
block b7 interaction-with-column c3 ==> r2c3 ≠ 7
xyz3-chain {n8 n2}r3c7 – {n2 n5}r3c9 – {n5 n8}r4c9 ==> r1c9 ≠ 8
jellyfish-in-columns n5{r6 r2 r5 r1}{c1 c5 c4 c8} ==> r6c2 ≠ 5, r2c3 ≠ 5, r2c2 ≠ 5, r1c9 ≠ 5
… (Naked-Singles and Hidden-Singles)
We shall take advantage of this example to show (Figure 8) what a typical (i.e.
degenerate) Jellyfish-in-columns looks like in rc- and cn- spaces. In cn-space, it is
very easy to spot: it looks just like a (degenerate) Naked-Quadruplets-in-a-row would
in rc-space.
c1 c2 c3 c4 c5 c6 c7 c8 c9
r1 n6 n9 n1 n5 n3 n2 n4 n5 n5 r1
n8 n7 n8 n7
n2 n2
r2 n5 n5 n5 n6 n5 n4 n9 n3 n1 r2
n7 n7 n8 n8
n2 n2
r3 n4 n3 n5 n9 n7 n1 n6 n5 r3
n8 n8 n8
r4 n9 n5 n3 n1 n4 n6 n7 n2 n5 r4
n8 n8
r5 n2 n1 n4 n5 n5 n7 n6 n9 n3 r5
n8 n8
r6 n5 n5 n6 n3 n2 n9 n1 n5 n4 r6
n7 n7 n8 n8
n2 n2
r7 n3 n5 n9 n7 n1 n5 n4 n6 r7
n8 n8
n2 n2
r8 n8 n6 n4 n9 n3 n5 n1 r8
n7 n7
r9 n1 n4 n5 n2 n6 n5 n3 n9 r9
n7 n8 n8 n7 n8
c1 c2 c3 c4 c5 c6 c7 c8 c9
n1 r9 r5 r1 r4 r7 r3 r6 r8 r2 n1
r2 r2 r3 r3
n2 r5 r9 r6 r1 r4 n2
r7 r8 r7 r8
n3 r7 r3 r4 r6 r1 r8 r9 r2 r5 n3
n4 r3 r9 r5 r8 r4 r2 r1 r7 r6 n4
r2 r2 r2 r3 r1 r2 r1 r1 r3
n5 r6 r4 r5 r5 r8 r6 r4 n5
r6 r7 r9
r7 r9
n9 r4 r1 r7 r3 r8 r6 r2 r5 r9 n9
c1 c2 c3 c4 c5 c6 c7 c8 c9
Figure 8. Puzzle Royle17-33858, seen in rc- and cn-spaces, just before the Jellyfish
Notice that, as for most cases of Jellyfish (and Swordfish), this is a highly
degenerate example; it can also be considered as a hxy-cn4 chain (see chapter XV
for the definition).
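The cn-space reading of a Jellyfish-in-columns can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the SudoRules implementation: the function name `fish_in_columns`, the dictionary layout of the candidates and the toy data are all hypothetical (the toy data are not the real Royle17-33858 grid). For a number n, each column c is treated as a cn-cell holding the rows where n is still a candidate, and a fish of size k is simply k such cells whose contents fit in k rows.

```python
from itertools import combinations

def fish_in_columns(cands, n, k):
    """Size-k fish in columns for number n, found as naked subsets in cn-space.

    cands maps (row, col) -> set of candidates of each unsolved cell (1..9).
    Returns a list of (columns, rows, eliminated cells).
    """
    # cn-space row for n: column c -> set of rows where n is still a candidate.
    cn = {c: {r for r in range(1, 10) if n in cands.get((r, c), set())}
          for c in range(1, 10)}
    open_cols = [c for c in cn if cn[c]]
    found = []
    for cols in combinations(open_cols, k):
        rows = set().union(*(cn[c] for c in cols))
        if len(rows) != k:
            continue  # these k cn-cells do not fit in k rows
        # n can be removed from the k rows in every other column.
        elims = sorted((r, c) for c in open_cols if c not in cols
                       for r in rows & cn[c])
        if elims:
            found.append((cols, sorted(rows), elims))
    return found

# Hypothetical positions of candidate 5: column -> rows (toy data only).
rows_of_5 = {1: {2, 6}, 4: {1, 5}, 5: {2, 5}, 8: {1, 6},
             2: {2, 6, 7}, 3: {2, 9}, 9: {1, 3}}
toy = {(r, c): {5, 7} for c, rs in rows_of_5.items() for r in rs}
print(fish_in_columns(toy, 5, 4))
```

With k = 2 or k = 3 the same function reports X-Wings and Swordfish in columns; swapping the roles of rows and columns gives the versions in rows.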
6 9 2 6 9 1 3 2 4 6 9 1 5 3 2 4 8 7
3 1 6 4 9 3 1 5 7 2 6 8 4 9 3 1
4 3 9 7 1 6 4 3 8 9 7 1 2 6 5
3 1 4 9 3 1 4 6 7 2 9 5 3 1 4 6 7 2 8
2 6 2 1 4 7 6 9 3 2 1 4 8 5 7 6 9 3
3 6 3 2 9 1 4 7 8 6 3 2 9 1 5 4
7 1 4 3 9 7 1 4 6 3 2 9 7 1 5 8 4 6
8 6 5 8 6 4 9 3 5 1 8 6 7 4 9 3 5 1 2
1 4 2 6 3 9 1 4 5 2 6 8 3 7 9
For any subset S of numbers of length k (1 ≤ k < 9), there is obviously a
complementary subset Sc of length 9-k (with 1 ≤ 9-k < 9). And S forms a Naked-Subset of
cardinality k on k cells in a row (respectively a column, a block), if and only if Sc
forms a Hidden-Subset of length 9-k on the remaining 9-k cells in this row
(respectively: this column, this block).
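The complementarity just stated can be checked on a small example. The following Python sketch uses simplified, hypothetical predicates (`is_naked_subset`, `is_hidden_subset`) and a made-up row; cells are indexed 0 to 8 and each cell is represented by its set of candidates.

```python
def is_naked_subset(unit, cells, numbers):
    """True if the k cells have no candidate outside the k numbers."""
    return len(cells) == len(numbers) and all(unit[i] <= numbers for i in cells)

def is_hidden_subset(unit, cells, numbers):
    """True if the k numbers have no candidate position outside the k cells."""
    others = set(range(9)) - set(cells)
    return len(cells) == len(numbers) and all(not (unit[i] & numbers) for i in others)

ALL = set(range(1, 10))
# Candidate sets of the nine cells of a hypothetical row.
row = [{1, 2}, {2, 3}, {1, 3}, {1, 4, 5}, {4, 5, 6, 7},
       {3, 6, 7, 8}, {5, 8, 9}, {2, 7, 9}, {4, 6, 8, 9}]

naked_cells, naked_nums = {0, 1, 2}, {1, 2, 3}
hidden_cells, hidden_nums = set(range(9)) - naked_cells, ALL - naked_nums

# Naked-Triplets on three cells is the same fact as Hidden-Sextuplets on the other six.
print(is_naked_subset(row, naked_cells, naked_nums),
      is_hidden_subset(row, hidden_cells, hidden_nums))
```

Both predicates test the same condition (no number of Sc appears in the k cells), which is why the two patterns always hold or fail together.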
As a result, no Subset rule for subsets of length greater than four is needed. For
instance, as is well known, Naked-Quintuplets in a row is just Hidden-Quadruplets
in the same row and Hidden-Quintuplets in a row is just Naked-Quadruplets in the
same row. As is less known, because super-symmetries are not generally considered,
Super-Hidden-Quintuplets in a row (familiarly called Squirmbag) is just Naked-
Quadruplets in a column (as shown by the graph in Figure 2). This is a very
interesting example of a named and popular thing that has no independent existence.
Finally, let us define theory L4_0 as the union of L3 with the set of rules defined
in the present chapter:
L4_0 = L3 ∪ NQ ∪ HQ ∪ SHQ.
Chapter IX
Interaction rules
As simple as they are, these rules are used very frequently. Two facts will give
an idea of how important they are:
– apart from very easy puzzles that can be solved at level L1_0 (i.e. with only
BSRT, NS and HS), the resolution process of almost any grid resorts to at least one
of the rules described in this chapter; most of the time, many instances of these rules
will be used;
– whereas 46% of Royle17 (respectively 42% of Sudogen0, 41% of Sudogen17)
puzzles can be solved in L1_0, the figures rise to 77.7% (respectively 54%, 53%)
when one adds Interaction rules (see chapter XXI for details).
XXX
X X X X X X
XXX
In rn-space, consider the content of cell (r, n). There are three generic cases
corresponding to the preconditions of RiB; they are shown in Figure 2. Here, as usual
in this book, values (i.e. candidate columns) inside an rn-cell are supposed to be
always displayed with the same global pattern:
123
456
789
Parentheses indicate that values may be present or not. Empty places indicate, as
usual, that the corresponding values must be absent.
(1)(2)(3)
(4)(5)(6)
(7)(8)(9)
Figure 2. The three internal rn-cell patterns for detecting RiB in rn-space
Proof: (apply meta-theorem 1) permute the words row and column in the proof
of the previous rule. From a logical point of view, this permutation entails
permuting the predicates "row-intersects" and "column-intersects" (see their definition at
the end of section III.1.5.2).
To get our symbol, still in natural row-column space, for the interaction of a
column with a block, just rotate Figure 1 by 90° to make it vertical. You get Figure
3, with the same conventions as for Figure 1.
X
X
X
X
X
X
X X
X X
X X
(1)(2)(3)
(4)(5)(6)
(7)(8)(9)
Figure 4. The three internal nc-cell patterns for detecting CiB in nc-space
In nc-space, consider the content of cell (n, c). There are three generic cases
corresponding to the preconditions of CiB; they are shown in Figure 4. Here, values
(i.e. candidate rows) inside an nc-cell are supposed to be always displayed with the
same global pattern:
123
456
789
As above, parentheses indicate that values may be present or not. Empty places
indicate, as usual, that the corresponding values must be absent.
∀n∀r∀b
{ row-intersects(r, b) &
∃s candidate[n, b, s] &
∀r’≠r ∀c’ [row-intersects(r’, b) & column-intersects(c’, b)
⇒ not-candidate(n, r’, c’)]
⇒
∀c’ [¬column-intersects(c’, b)
⇒ not-candidate(n, r, c’) ]}.
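Read operationally, the formula above says: if, inside block b, all the candidates for n lie in row r, then n can be removed from the cells of row r that are outside b. A minimal Python sketch of this reading follows; the function names, the dictionary layout of the candidates and the toy position are assumptions made for the illustration, with rows, columns and blocks numbered 1 to 9.

```python
def block_cells(b):
    """Cells (row, col) of block b, blocks being numbered 1..9 row by row."""
    r0, c0 = 3 * ((b - 1) // 3), 3 * ((b - 1) % 3)
    return [(r0 + i, c0 + j) for i in (1, 2, 3) for j in (1, 2, 3)]

def block_row_interaction(cands, n, r, b):
    """Cells of row r outside block b from which n can be removed ([] if the rule does not fire)."""
    inside = [cell for cell in block_cells(b) if n in cands.get(cell, set())]
    # Preconditions: n is still a candidate somewhere in b, and only in row r.
    if not inside or any(row != r for row, _ in inside):
        return []
    block_cols = {c for _, c in block_cells(b)}
    # Conclusion: n is eliminated from row r in the columns that do not cross b.
    return [(r, c) for c in range(1, 10)
            if c not in block_cols and n in cands.get((r, c), set())]

# Hypothetical candidates of a few unsolved cells.
toy = {(8, 1): {2, 7}, (8, 3): {7, 9}, (9, 2): {1, 2},
       (8, 5): {4, 7}, (8, 6): {2, 3}, (8, 9): {1, 7}}
print(block_row_interaction(toy, 7, 8, 7))
print(block_row_interaction(toy, 2, 8, 7))
```

In the toy position, 7 is confined to row r8 inside block b7, so the rule fires; 2 appears in two rows of b7, so it does not.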
Just reversing the arrows in Figure 1, one gets Figure 5 and our symbol, still in
natural row-column space, for the interaction of a block with a row. Conventions are
the same as for Figure 1. The direction of inference is reversed.
X X X
X X X X X X
X X X
(1)(2)(3)
(4)(5)(6)
(7)(8)(9)
Figure 6. The three internal bn-cell patterns for detecting BiR in bn-space
Then, the three generic cases corresponding to the preconditions of the BiR rule
will appear as shown in Figure 6.
And the three generic cases corresponding to the preconditions of rule BiC will
appear as shown in Figure 7.
Figure 7. The three internal bn-cell patterns for detecting BiC in bn-space
Our first example (puzzle Royle17-7, Figure 8) shows how a puzzle that cannot
be solved in L1_0 alone can be solved simply if one adds RiB and CiB.
1 2 3 9 1 2 3 6 4 9 7 8 5 1 2
5 4 1 5 2 4 3 1 5 2 4 3 6 9 7 8
3 1 2 3 4 8 7 9 1 2 5 6 3 4
7 6 4 7 3 6 5 1 4 2 7 3 8 6 5 1 4 2 9
1 1 2 3 5 6 9 1 2 4 7 3 8 5
8 2 5 3 8 1 2 4 5 3 8 9 1 6 7
9 2 8 9 2 3 7 8 5 1 9 2 3 7 6 4 8 5 1
5 1 7 5 1 2 7 9 3 4 8 6 5 1 2 7 9 3
3 5 1 7 8 9 3 2 4 6 5 1 7 8 9 3 2 4 6
Our second example, puzzle Royle17-323 (Figure 9) shows how a puzzle that
cannot be solved in L1_0 alone or in L1_0+RiB+CiB can be solved simply if one
adds BiR.
4 1 8 2 5 4 1 9 7 3 8 2 6 5 4 1
5 3 5 1 3 7 8 2 5 6 1 3 7 4 8 2 9
2 2 8 1 5 3 7 2 4 8 1 5 9 6 3 7
2 6 3 8 9 4 2 6 7 3 1 5 8 9 4 2 6 7 3 1 5
1 6 3 1 5 4 9 8 7 6 2 3 1 5 4 9 8 7 6 2
7 5 7 2 6 5 1 3 4 7 2 6 5 1 3 4 9 8
8 4 1 8 2 7 4 1 5 3 6 8 2 7 4 1 9 5 3
8 2 1 8 5 2 4 1 3 9 6 8 5 2 7 4
4 5 3 2 1 4 5 7 9 3 2 1 8 6
IX.5.3. A puzzle with two solution paths, with and without Interaction rules
Our third example, puzzle Royle17-32227 (Figure 10), illustrates the fact that a
puzzle can have different solution paths, at the same level in our classification (here
level L3_0) but using different sets of rules.
6 9 6 4 5 9 8 2 6 4 5 3 1 7 9 8 2
4 5 2 9 8 4 5 6 3 7 1 2 9 8 4 5 6 3 7 1
2 8 3 9 2 8 7 1 3 9 2 8 4 6 5
3 1 7 3 2 1 7 3 2 6 1 8 5 7 9 4
8 2 8 2 4 8 1 7 6 9 5 2 3
2 1 5 7 9 2 4 3 8 1 6
6 1 7 3 6 2 1 5 7 8 3 4 6 9 2 1 5 7
5 2 4 5 2 4 1 5 2 8 7 4 6 3 9
2 9 6 7 5 3 1 2 4 8
Figure 10. Puzzle Royle17-32227, its L1_0 elaboration and its solution
If we allow only L1_0 and subset rules (with no interaction rules), we get a
solution in L3_0 needing Hidden-Pairs, Naked-Triplets and Hidden-Triplets:
Resolution path (in L3_0 minus the interaction rules) for the L1_0 elaboration of Royle17-
32227:
hidden-triplets-in-a-block {n4 n6 n8}{r6c5 r5c5 r4c5} ==> r6c5 ≠ 9, r6c5 ≠ 7, r6c5 ≠ 3,
r5c5 ≠ 9, r5c5 ≠ 7, r5c5 ≠ 3
hidden-triplets-in-a-column {n1 n3 n7}{r9 r8 r1}c5 ==> r9c5 ≠ 9, r9c5 ≠ 8, r8c5 ≠ 9, r8c5 ≠ 8
hidden-triplets-in-a-block {n4 n6 n8}{r6c5 r5c5 r4c5} ==> r4c5 ≠ 9
naked and hidden singles ==> r7c5 = 9, r7c3 = 4, r7c1 = 8
hidden-pairs-in-a-column {n4 n5}{r5 r6}c1 ==> r6c1 ≠ 9, r6c1 ≠ 7, r5c1 ≠ 9, r5c1 ≠ 7,
r5c1 ≠ 1
hidden-single-in-a-block ==> r5c3 = 1
naked-triplets-in-a-row {n5 n4 n6}r5{c1 c5 c7} ==> r5c9 ≠ 6, r5c9 ≠ 5, r5c9 ≠ 4, r5c6 ≠ 5,
r5c4 ≠ 5
… (Naked-Singles and Hidden-Singles)
If we allow L1_0, Subset rules and Interaction rules together, we get a very
different solution path, still in L3_0 but using only Hidden-Pairs and
Naked-Triplets:
Resolution path (in full L3_0, i.e. with interaction rules allowed) for the L1_0 elaboration of
Royle17-32227:
column c4 interaction-with-block b8 ==> r9c5 ≠ 8, r8c5 ≠ 8, r7c5 ≠ 8
naked singles ==> r7c5 = 9, r7c3 = 4, r7c1 = 8
row r8 interaction-with-block b9 ==> r9c9 ≠ 6, r9c8 ≠ 6
column c8 interaction-with-block b9 ==> r9c9 ≠ 3,9 ==> r8c9 ≠ 3
hidden-pairs-in-a-column {n4 n5}{r5 r6}c1 ==> r6c1 ≠ 9, r6c1 ≠ 7, r5c1 ≠ 9
column c1 interaction-with-block b7 ==> r9c3 ≠ 9
hidden-pairs-in-a-column {n4 n5}{r5 r6}c1 ==> r5c1 ≠ 7, r5c1 ≠ 1
hidden-single-in-a-block ==> r5c3 = 1
row r5 interaction-with-block b5 ==> r6c6 ≠ 7, r6c5 ≠ 7
Finally, our fourth example, puzzle Royle17-13840 (Figure 11), can be solved at
level L3_0 using Interaction rules but it could not be solved at this level without
them.
1 5 1 5 8 1 3 7 9 4 6 5 2
3 6 3 6 5 1 9 2 7 3 6 5 1 4 8
2 7 2 1 7 5 4 6 8 2 1 7 9 3
6 3 6 3 1 6 5 9 2 4 8 3 7 1
5 9 1 5 9 1 3 2 5 7 9 4 8 6
1 1 6 7 8 4 1 3 6 9 2 5
1 9 1 9 4 7 5 6 8 3 2 1 9
2 8 2 8 1 2 6 8 9 1 7 5 3 4
3 4 3 1 4 3 9 1 4 5 2 8 6 7
Figure 11. Puzzle Royle17-13840, its L1_0 elaboration and its solution
Like Naked-Single and Hidden-Single, Interaction rules appear in all but the
simplest puzzles. They are not very spectacular. We shall give no more examples
here because they will appear again and again in the examples of the next chapters.
IX.6 Theory L1
We have now introduced all the rules necessary to complete the definition of
theory L1 announced in chapter V: L1 is the union of L1_0 with the set of rules
defined in the present chapter:
L1 = L1_0 ∪ {RiB, CiB, BiR, BiC}.
Chapter X
The XY-Wing and XYZ-Wing patterns
This chapter is a bridge between popular rules such as XY-Wing and XYZ-Wing
(as they are widely named in the Sudoku literature) and the general approach of xy-
rules and xyz-rules (and rules of associated types) that will be developed in great
detail in forthcoming chapters.
It also proves that we need not consider hidden or super hidden versions of the
XYZ-Wing rules.
Consider a sequence (or "chain") of three different cells (r1, c1), (r2, c2) and
(r3, c3), respectively in blocks b1, b2 and b3, and satisfying the following conditions:
– two consecutive cells share a unit of some type (row, column or block);
– the three cells do not share a unit (if they did, the situation described below
would reduce to a special case of Naked-Triplets in this shared unit).
These conditions obviously imply that, apart from reversing the order of the
sequence, there are only three possibilities for the two shared units: row-column,
row-block and column-block. We say the chains are of type rc, rb and cb.
Now suppose that there is a sequence of three different numbers, n1, n2 and n3
such that:
– the set of candidates for (r1, c1) is exactly {n1, n2};
– the set of candidates for (r2, c2) is exactly {n2, n3};
– the set of candidates for (r3, c3) is exactly {n3, n1}.
Still in conformance with usage, we define the following sets of cells, to which
the conclusions of the XY-Wing rule will apply:
– in case the 3-chain is of type rc:
S’1 = cell (r3, c1), i.e. a cell that is both in column c1 and in row r3
plus all cells that are both in block b1 and in row r3,
plus all cells that are both in column c1 and in block b3,
S’2 = all cells that are both in row r1 and in block b3,
plus all cells that are both in block b1 and in column c3,
– in case the 3-chain is of type rb:
S’1 = all cells that are both in block b1 and in row r3,
S’2 = all cells that are both in row r1 and in block b3,
– in case the 3-chain is of type cb:
S’1 = all cells that are both in block b1 and in column c3,
S’2 = all cells that are both in column c1 and in block b3,
– in all three cases:
S1 = the set of cells in S’1 that do not belong to the chain,
S2 = the set of cells in S’2 that do not belong to the chain,
S = S1 ∪ S2.
It should be noted that these definitions are the ones usually found in books and
on Web sites for the sets of cells concerned by an XY-Wing or an XYZ-Wing –
although, most of the time, they are given a less formal appearance, with the
consequence that the (logically important) condition on the cells not belonging to the
chain is usually omitted.
Generally, also, for XY-Wings of type rc, sets S’1 and S’2 are restricted
respectively to cell (r3, c1) and to ∅; this is justified because:
– if there exists a cell in block b1 and in row r3, then cells (r2, c2) and (r3, c3) are
not only in the same column, they must also be in the same block, and case rb
applies;
– if there exists a cell in block b1 and in column c3, then cells (r1, c1) and (r2, c2)
are not only in the same row, they must also be in the same block, and case cb
applies to the reversed sequence;
– if there exists a cell in row r1 and in block b3, then cells (r2, c2) and (r3, c3) are
not only in the same column, they must also be in the same block, and case rb applies;
– if there exists a cell in column c1 and in block b3, then cells (r1, c1) and (r2, c2)
are not only in the same row, they must also be in the same block, and case cb
applies to the reversed sequence.
We have adopted the present extended definitions of sets S and S2 because they
are equivalent to the following (this proof is left to the reader as an obvious
exercise).
Definitions: S is the set of cells that do not belong to the chain and that share a
unit (possibly a different one) with each of the endpoints of the chain. S2 is the set of
cells that do not belong to the chain and that share a unit (possibly a different one)
with each of the three cells in the chain.
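These abstract definitions translate directly into code. The sketch below (hypothetical helper names, cells written as (row, column) pairs numbered 1 to 9) computes S and S2 for one chain of type rc and one chain of type rb, so that the results can be compared by hand with the sets S’1 and S’2 given earlier.

```python
def block(r, c):
    """Block number (1..9) of cell (r, c)."""
    return 3 * ((r - 1) // 3) + (c - 1) // 3 + 1

def share_a_unit(a, b):
    """Two different cells in the same row, column or block."""
    return a != b and (a[0] == b[0] or a[1] == b[1] or block(*a) == block(*b))

def targets(chain, linked_to):
    """Cells outside the chain that share a unit with every cell in linked_to."""
    cells = [(r, c) for r in range(1, 10) for c in range(1, 10)]
    return sorted(x for x in cells
                  if x not in chain and all(share_a_unit(x, y) for y in linked_to))

chain_rc = [(1, 1), (1, 5), (7, 5)]   # links: row, then column
chain_rb = [(1, 1), (1, 5), (2, 6)]   # links: row, then block

for chain in (chain_rc, chain_rb):
    S = targets(chain, [chain[0], chain[-1]])   # linked to both endpoints
    S2 = targets(chain, chain)                  # linked to all three cells
    print(S, S2)
```

For the rc chain, S reduces to the single cell (r3, c1) and S2 is empty; for the rb chain, S2 contains the cells of row r1 in block b3 that are not in the chain.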
Notice that, contrary to the usual definition of cells concerned by the XY-Wing
(what we shall call later the target cells of the xy-chain), the present one does not
refer in any way to the specific types of links between cells, either in the chain itself
or between the chain and cells in sets S and S2. This is characteristic of all the chain
rules we shall meet in the sequel. What is important here is that this abstract form of
the defining property is the one useful for the proof (as can be seen below) and it is
also the one that can be generalised to state the general xy-chain rules and their
extended forms (see chapters XII and XIV to XVIII).
XY-Wing rule: given an XY-Wing, eliminate n1 from the candidates for every
cell that does not belong to the chain and that shares a unit (possibly a different one)
with each endpoint of the chain.
Proof of the rule: let C be a cell in S. Notice that the following proof is based on
the general properties of the chain and of C and not on the specific type of the XY-
Wing (rc, rb or cb) or on the specific properties of the elements of S in each of these
different cases. Consider cell (r1, c1). There are only two possibilities: either (r1, c1)
= n1 or (r1, c1) = n2. Let us consider them in turn.
1) If (r1, c1) = n1, then C ≠ n1, since C shares a unit with (r1, c1).
2) If (r1, c1) = n2, then (r2, c2) ≠ n2, since (r2, c2) shares a unit with (r1, c1).
Therefore (r2, c2) = n3. But then (r3, c3) ≠ n3, since (r3, c3) shares a unit with (r2, c2).
Therefore (r3, c3) = n1 and C ≠ n1, since C shares a unit with (r3, c3).
Finally, whichever value (r1, c1) has among its two candidates, C cannot be equal
to n1.
The logical formulation parallels strictly the English one, using the general
definitions and the auxiliary predicate rc-bivalue introduced in section VI.1.1.
Notice that, after the condition on the three numbers n1, n2 and n3 being different is
split into three separate conditions, it can be included in the three predicates "rc-
bivalue".
∀r1∀r2∀r3∀c1∀c2∀c3∀n1∀n2∀n3∀r∀c
{ rc-bivalue (r1, c1, n1, n2) &
share-a-unit(r2, c2, r1, c1) &
rc-bivalue (r2, c2, n2, n3) &
share-a-unit(r3, c3, r2, c2) &
rc-bivalue (r3, c3, n3, n1) &
share-a-unit(r, c, r1, c1) &
share-a-unit(r, c, r3, c3) &
¬same-cell(r, c, r2, c2)
⇒
not-candidate(n1, r, c) }.
Resolution path in L3_0+XYW for the L3_0 (or L1) elaboration of Royle17-186:
xy3-chain {n8 n5}r2c8 – {n5 n4}r3c7 – {n4 n8}r4c7 ==> r4c8 ≠ 8
… (Naked-Singles and Hidden-Singles)
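The XY-Wing rule can be sketched as a direct transcription of its logical formulation. The code below is an illustration only (the function names and the data layout are assumptions, and it is not how SudoRules is implemented); the toy position contains the three chain cells of the xy3-chain quoted above, plus two invented cells.

```python
from itertools import permutations

def block(r, c):
    return 3 * ((r - 1) // 3) + (c - 1) // 3 + 1

def share_a_unit(a, b):
    return a != b and (a[0] == b[0] or a[1] == b[1] or block(*a) == block(*b))

def xy_wing_eliminations(cands):
    """Set of (n1, cell) pairs such that n1 can be removed from cell."""
    bivalue = [cell for cell, s in cands.items() if len(s) == 2]
    found = set()
    for a, b, c in permutations(bivalue, 3):
        if not (share_a_unit(a, b) and share_a_unit(b, c)):
            continue
        ab, bc, ca = cands[a] & cands[b], cands[b] & cands[c], cands[c] & cands[a]
        # Contents must read {n1, n2}, {n2, n3}, {n3, n1} with three different numbers.
        if not (len(ab) == len(bc) == len(ca) == 1 and len(ab | bc | ca) == 3):
            continue
        (n1,) = ca
        for t, s in cands.items():
            # Target: not the middle cell, linked to both endpoints.
            if t != b and n1 in s and share_a_unit(t, a) and share_a_unit(t, c):
                found.add((n1, t))
    return found

toy = {(2, 8): {8, 5}, (3, 7): {5, 4}, (4, 7): {4, 8},   # the chain
       (4, 8): {2, 8, 9}, (9, 9): {1, 3, 8}}             # invented extra cells
print(xy_wing_eliminations(toy))
```

Only r4c8 is linked to both endpoints of the chain, so it is the only cell that loses candidate 8.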
3 1 4 7 5 6 8 2 9 3 1 4 7 5 6 8 2 9 3 1
7 9 3 6 1 7 9 2 3 6 1 4 7 9 2 5 8
8 9 2 1 3 6 7 8 9 2 5 1 3 4 6 7
1 3 2 1 3 2 5 6 7 1 3 2 5 4 8 9 6
4 7 6 2 4 9 3 8 7 1 5 6 2 4 9 3 8 7 1 5
1 5 8 1 6 3 2 9 5 8 1 6 7 3 2 4
5 4 6 7 5 3 9 8 4 1 6 7 2 5 3 9 8 4 1 6 7 2
2 8 2 8 6 7 9 5 1 4 3 2 8 6 7 9 5 1 4 3
3 1 4 7 3 2 6 1 4 7 3 2 6 5 8 9
5 8 2 1 5 3 8 4 2 9 6 1 5 3 7 8 4
4 3 4 2 6 3 1 7 4 5 2 8 6 9 3 1
1 3 1 4 2 6 3 8 1 9 4 7 5 2 6
9 3 6 9 2 4 3 6 1 8 7 5 9 2 4 3 6 1 8 7 5
4 2 1 7 3 4 6 2 1 7 3 8 9 5 4 6 2
7 1 7 2 4 1 9 3 5 6 8 7 2 4 1 9 3
1 4 2 1 4 2 3 5 6 1 9 4 7 2 3 5 8
8 6 8 3 2 1 6 4 8 3 2 5 1 9 6 4 7
4 6 3 2 1 4 5 7 6 3 8 2 1 9
XYZ-Wing rule: given an XYZ-Wing, eliminate n1 from the candidates for any
cell that does not belong to the chain and that shares a unit (possibly a different one)
with each of its three cells.
Proof of the rule: the proof is an easy adaptation of the proof for XY-Wings. Let
C be a cell in S2.
1) If (r1, c1) = n1, then C ≠ n1, since C shares a unit with (r1, c1).
2) If (r1, c1) = n2, then (r2, c2) ≠ n2, since (r2, c2) shares a unit with (r1, c1). Therefore,
(r2, c2) = n3 or (r2, c2) = n1.
2a) If (r2, c2) = n1, then C ≠ n1, since C shares a unit with (r2, c2).
2b) If (r2, c2) = n3, then (r3, c3) ≠ n3, since (r3, c3) shares a unit with (r2, c2).
Therefore, (r3, c3) = n1 and C ≠ n1, since C shares a unit with (r3, c3).
Finally, whichever value (r1, c1) has among its two candidates, C cannot be equal
to n1.
∀r1∀r2∀r3∀c1∀c2∀c3∀n1∀n2∀n3∀r∀c
{ rc-bivalue (r1, c1, n1, n2) &
share-a-unit(r2, c2, r1, c1) &
candidate(n2, r2, c2) & candidate(n3, r2, c2) & candidate(n1, r2, c2) &
n3 ≠ n2 & ∀n∉{n2, n3, n1} not-candidate(n, r2, c2) &
share-a-unit(r3, c3, r2, c2) &
rc-bivalue (r3, c3, n3, n1) &
share-a-unit(r, c, r1, c1) &
share-a-unit(r, c, r2, c2) &
share-a-unit(r, c, r3, c3)
⇒
not-candidate(n1, r, c) }.
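As for the XY-Wing, the formula can be turned into a short search procedure. The following Python sketch is an assumption-laden illustration (hypothetical names and data layout); its toy position is built around an xyz3-chain of the kind quoted in this chapter, {n9 n2}r5c3 – {n2 n5}r6c1 – {n5 n9}r6c5, with two invented extra cells.

```python
def block(r, c):
    return 3 * ((r - 1) // 3) + (c - 1) // 3 + 1

def share_a_unit(a, b):
    return a != b and (a[0] == b[0] or a[1] == b[1] or block(*a) == block(*b))

def xyz_wing_eliminations(cands):
    """Set of (n1, cell) pairs such that n1 can be removed from cell."""
    found = set()
    for b, sb in cands.items():
        if len(sb) != 3:                      # middle cell: exactly {n1, n2, n3}
            continue
        # Bivalue cells linked to the middle cell whose two candidates are among its three.
        wings = [x for x, s in cands.items()
                 if len(s) == 2 and s < sb and share_a_unit(x, b)]
        for a in wings:
            for c in wings:
                common = cands[a] & cands[c]
                if a == c or len(common) != 1:
                    continue
                (n1,) = common                # the number shared by all three cells
                for t, s in cands.items():
                    if n1 in s and all(share_a_unit(t, x) for x in (a, b, c)):
                        found.add((n1, t))
    return found

toy = {(5, 3): {9, 2}, (6, 1): {2, 5, 9}, (6, 5): {5, 9},   # the chain
       (6, 2): {4, 7, 9}, (6, 9): {1, 6, 9}}                # invented extra cells
print(xyz_wing_eliminations(toy))
```

The only difference with the XY-Wing search is that the target must be linked to the middle cell as well, which is why r6c9 keeps its candidate 9 here.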
The examples in this section prove that XYZW is not subsumed by L3_0+XYW.
1 6 4 7 1 5 6 8 3 4 2 7 1 5 6 9 8
5 3 6 5 1 8 2 9 4 7 3 6 5 1 8 2 9 4 7 3
4 6 4 3 5 1 7 9 8 6 4 3 5 2 1
7 2 5 7 2 6 1 5 9 4 7 3 2 6 8 1 5 9
3 1 3 8 6 2 1 9 3 5 4 8 6 7
4 6 1 3 4 5 8 6 1 9 7 3 4 2
1 4 7 1 4 7 1 6 4 9 3 2 7 8 5
5 6 2 3 5 8 6 2 1 4 9 3 7 5 8 6 2 1 4
8 8 2 5 4 7 1 9 3 6 8 2 5 4 7 1 9 3 6
Resolution path in L3_0+XYW+XYZW for the L3_0+XYW (or L1_0) elaboration of
Royle17-2717:
xyz3-chain {n9 n2}r5c3 – {n2 n5}r6c1 – {n5 n9}r6c5 ==> r6c2 ≠ 9
… (Naked-Singles and Hidden-Singles)
Resolution path in L3_0+XYW+XYZW for the L3_0+XYW (or L2) elaboration of Royle17-
2171:
xyz3-chain {n5 n2}r2c3 – {n2 n8}r8c3 – {n8 n5}r8c1 ==> r9c3 ≠ 5
naked-single ==> r9c3 = 4
6 9 4 1 8 3 6 9 7 4 5 1 8 3 6 2 9 7
7 1 7 3 4 1 9 6 7 3 2 4 1 9 6 5 8
3 9 8 6 7 5 2 4 3 1 9 8 6 7 5 2 4 3 1
2 5 7 2 9 3 5 8 1 7 4 6 2 9 3 5 8 1 7 4 6
9 3 6 9 7 3 1 6 4 8 9 7 3 5 1 2
1 1 7 6 2 4 9 3 5 1 7 6 2 4 9 8 3
9 2 4 1 6 9 2 4 8 3 7 5 1 6 9 2 4 8 3 7 5
1 6 3 9 1 6 4 8 2 5 3 9 7 1 6 4
8 3 1 6 8 2 9 3 7 4 1 6 5 8 2 9
4 8 1 3 9 4 2 8 6 1 3 9 5 4 2 8 7 6
6 1 6 4 1 8 3 7 6 4 1 8 3 2 5 9
2 3 8 2 6 3 4 1 5 8 2 9 6 7 3 4 1
7 2 6 9 7 3 2 1 5 4 6 8 9 7 3 2 1 5 4 6 8
4 3 5 4 2 1 8 3 6 5 4 2 1 8 3 6 5 9 7
6 5 8 1 3 2 6 5 8 4 7 9 1 3 2
6 1 2 4 7 6 5 8 9 1 3 2 4 7 6 5 8 9 1 3
3 5 3 5 2 6 8 3 9 5 7 2 1 6 8 4
8 8 6 3 8 1 6 3 9 4 7 2 5
X.3. Theory L3
In conformance with our general guiding principles, let us consider the
possibility of hidden and/or super hidden versions of XY-Wing and XYZ-Wing.
Applying the procedure described in section IV.4, we must first define the block-free
versions of XY-Wing and XYZ-Wing. For these rules, this obviously amounts to
restricting them to the cases in which blocks do not appear as linking units.
Proof: if there is an XYZ-Wing in which the shared units are limited to the types
row or column, then it is of type rc (or cr) and no cell outside the chain can share a
unit of type row or column with the three cells in the chain. Said otherwise, the
conditions in the block-free version of XYZ-Wing can match no puzzle. By meta-
theorem 3, the same applies obviously to its hidden and super-hidden counterparts.
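The first step of this proof can also be confirmed by brute force. The sketch below (hypothetical helper `share_line`, cells numbered 1 to 9) enumerates every chain of three cells whose two links are a row or a column, discards the chains lying in a single row or column, and counts the cells that are linked by a row or a column to all three chain cells.

```python
def share_line(a, b):
    """Two different cells in the same row or in the same column (blocks ignored)."""
    return a != b and (a[0] == b[0] or a[1] == b[1])

cells = [(r, c) for r in range(1, 10) for c in range(1, 10)]
witnesses = 0
for a in cells:
    for b in cells:
        if not share_line(a, b):
            continue
        for c in cells:
            if c == a or not share_line(b, c):
                continue
            # Skip chains whose three cells lie in one row or in one column.
            if a[0] == b[0] == c[0] or a[1] == b[1] == c[1]:
                continue
            witnesses += sum(1 for t in cells
                             if all(share_line(t, x) for x in (a, b, c)))
print(witnesses)  # 0: a block-free XYZ-Wing never has a target cell
```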
The case of XY-Wings is more difficult. In our three puzzle databases, we have
found no case of a hidden XY-Wing. Conversely, we have not been able to prove
the following
X.3.3. Theory L3
Let us now define theory L3 as the union of L3_0 with the set of rules
introduced in the present chapter:
Later, when we have defined xy-chains, we shall show that this is equivalent to:
One needs to examine one cell for detecting Singles, two for Pairs… This is
obvious for Naked Subsets, in natural row-column space. For Hidden or Super-
Hidden Subset rules, one has to detect the same pattern as for a Naked Subset, but in
abstract rn-, cn- or bn- spaces. For instance, in rn-space, one can easily detect a
Hidden-Triplets-in-a-row because we have shown it looks like a Naked-Triplets-in-
a-row would in rc-space; and, in the same rn-space, one can also easily detect a
Super-Hidden-Triplets-in-a-row because it looks like a Naked-Triplets-in-a-column
would in rc-space. Similarly, in nc representation one can easily detect a Hidden-
Triplets-in-a-column, because it looks like a Naked-Triplets-in-a-column would in
rc-space; and, in the same nc representation, one can also easily detect a Super-
Hidden-Triplets-in-a-column because it looks like a Naked-Triplets-in-a-row would
in rc-space. Finally, in bn-space, one can easily detect a Hidden-Triplets-in-a-block,
because it looks like a Naked-Triplets-in-a-block would in rc-space.
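The point made above, that one detector suffices once the grid is viewed in the right space, can be illustrated as follows. This is a sketch under assumptions: `rn_view` and `naked_subsets` are hypothetical names, and the row is a made-up example restricted to its six unsolved cells (columns 1 to 6).

```python
from itertools import combinations

def rn_view(row_cands):
    """rn-space view of one row: number n -> set of columns where n is a candidate."""
    return {n: {c for c, s in row_cands.items() if n in s} for n in range(1, 10)}

def naked_subsets(cells, k):
    """Groups of k non-empty 'cells' whose contents hold exactly k values."""
    live = [x for x in cells if cells[x]]
    result = []
    for group in combinations(live, k):
        content = set().union(*(cells[x] for x in group))
        if len(content) == k:
            result.append((group, content))
    return result

# Hypothetical row: column -> candidates of the unsolved cell in that column.
row = {1: {1, 2, 4, 5}, 2: {2, 3, 5, 6}, 3: {1, 3, 4, 6},
       4: {4, 5}, 5: {5, 6}, 6: {4, 6}}

print(naked_subsets(row, 3))            # Naked-Triplets seen in rc-space
print(naked_subsets(rn_view(row), 3))   # same detector in rn-space: Hidden-Triplets
```

In rc-space the detector reports columns 4, 5, 6 with numbers 4, 5, 6; in rn-space it reports numbers 1, 2, 3 confined to columns 1, 2, 3, which is the hidden pattern.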
It is therefore obvious that the larger a Naked Set is, the higher in the logical
complexity hierarchy the associated rule should be. And we have already
mentioned that rules related by symmetry or super-symmetry should be given the same
logical complexity.
Having also defined the rules for Interaction, XY-Wing and XYZ-Wing, we are
now in a position to complete the specification of the first levels of our hierarchy:
L1_0 = BSRT ∪ NS ∪ HS
(where HS = {HS(row), HS(col), HS(blk)})
L1 = L1_0 ∪ {RiB, CiB, BiR, BiC}
L2_0 = L1 ∪ NP ∪ HP ∪ SHP
(where NP = {NP(row), NP(col), NP(blk)}
HP = {HP(row), HP(col), HP(blk)}
SHP = {SHP(row), SHP(col)})
L2 = L2_0
L3_0 = L2 ∪ NT ∪ HT ∪ SHT
(where NT = {NT(row), NT(col), NT(blk)}
HT = {HT(row), HT(col), HT(blk)}
SHT = {SHT(row), SHT(col)})
L4_0 = L3 ∪ NQ ∪ HQ ∪ SHQ
(where NQ = {NQ(row), NQ(col), NQ(blk)}
HQ = {HQ(row), HQ(col), HQ(blk)}
SHQ = {SHQ(row), SHQ(col)})
In the examples of this chapter and the previous ones, whenever we had to
anticipate rules that were not yet defined, we have been careful not to introduce
puzzles whose solution would require rules of a higher level of complexity than that
of the rule the example is intended to illustrate.
We have already raised the following practical question: is it worth doing all the
tedious extra work of maintaining the four representations for easier detection of
patterns that, after some training, are finally not so difficult to detect on the usual
row-column representation? Now, our answer can be more precise. If the abstract
representations were used only for detecting the rules introduced up to this point, the
answer should probably be: no, let them serve only to illustrate the symmetry
relationships between rules of types Naked, Hidden and Super-Hidden, but do not
use them in practice.
But we shall see soon that these representations are the intuitive motivation for
the definition of new types of chains that could hardly be detected in rc-space and
that they allow one to solve grids that could probably not be solved without using
them. Moreover, Sudoku machines have been appearing and it would be easy to
program them for displaying simultaneously the four representations (or at least
three, because, apart from easing detection of the pattern defining the conditions of a
Hidden-Subset rule relative to blocks or of an Interaction rule, block-number space
is not really useful – it does not benefit from the same symmetry relationships as rn-
or cn- spaces).
Sudoku is basically a three dimensional problem, as can be seen from the fact
that its basic predicates ("value" and "not-candidate") take three arguments. It
should ideally be seen in the three dimensional row-column-number space. But
defining rules based on 3D patterns (for instance 3D chain rules) is unrealistic from
the human solver point of view. As a best approximation, this 3D space should be
considered from several two dimensional points of view: rc- rn- and cn- spaces. If
its analysis is restricted to only one of these views, the usual rc-space, significant
aspects of the problem are missed.
If T is a resolution theory and A1, A2, …, An and B1, B2, …, Bp are resolution
rules, then we write [T+A1+A2+…+An]+B1+B2+…+Bp to name the set of puzzles
that cannot be solved in theory T ∪ {A1, A2, …, An} but can be solved in the
extended theory T ∪ {A1, A2, …, An} ∪ {B1, B2, …, Bp}. This notation will be very
useful for the easy statement of independence results. Indeed, having an example in
[T+A1+A2+…+An]+B1+B2+…+Bp proves that no rule in {B1, B2, …, Bp} can be
subsumed by the set of rules T ∪ {A1, A2, …, An}. Many of the examples in the
forthcoming chapters will be of this type. Usually, we shall not even state the
As a motivation for the introduction of chain rules and the associated higher
levels (and anticipating the classification results given in chapter XXI), let us
mention that, using only the rules defined up to this point (i.e. the rules in L4_0),
one can solve approximately 88% of the puzzles in the Royle17 collection and 65%
of the puzzles in the two randomly generated Sudogen collections.
Part Three
2D CHAIN RULES
Chapter XI
General theorems on shared units
Chains of various types are the main tools for dealing with hard puzzles and
shared units are the ingredients used to glue cells into chains. Section 1 proves a few
general theorems on shared units that will be very useful in the sequel (especially in
chapter XIV) when we deal with chains.
When the concept of a shared unit in natural rc-space is restricted to that of rc-
connected cells, it has supersymmetric versions, the concepts of rn-connected rn-
cells in abstract rn-space and of cn-connected cn-cells in abstract cn-space; in turn,
these concepts will provide the glue for building new types of chains, hidden chains
of various sorts in rn- and cn- spaces (see chapters XV, XVIII and XX).
Notice that this chapter may be considered as pertaining not only to Sudoku
Resolution Theory but also, more basically, to Grid Theory.
Remember from section III.1.3.2 that two cells share a unit if:
– they are different,
– they are either in the same row or in the same column or in the same block.
The theorems in this section are valid for Sudoku grids of any size.
Theorem XI.1: given two different cells in a row, if a cell shares a (possibly
different) unit with each of them, then either all three cells are in the same row or
they are in the same block (the "or" being non exclusive). The same is true if
"row" is replaced by "column".
Proof: if the third cell is also in this row, we are done; if not, then it shares no
row with any of the first two and it can share a column with at most one of them;
therefore, it must share a block with at least one of them; but then it cannot share a
column with any of the first two cells that would be outside this block, and the only
way it has to share a unit with each of these two cells is sharing this block with them
(which implies that the first two cells already shared this block), so that the first part
of the theorem is proved. For the second part, just apply the row-column symmetry
to this proof.
Theorem XI.2: given two different cells in a block, not in the same row and not
in the same column, if a cell shares a (possibly different) unit with each of them,
then it is in the same block.
Proof: in the conditions of the theorem, the first two cells span at least two rows
and two columns; any cell that is not in the same block as these two can share no
block with either of them, it can share a row with at most one of them but then it can
share no column with either of them, or it can share a column with at most one of
them but then it can share no row with either of them; finally, a cell which is not in
the same block as the first two cannot share a (possibly different) unit with each of
them.
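For the standard 9x9 grid, Theorems XI.1 and XI.2 can be confirmed by exhaustive enumeration. The following Python sketch does so with hypothetical helper names; it is a check on one grid size, not a substitute for the proofs, which hold for grids of any size.

```python
def block(r, c):
    return 3 * ((r - 1) // 3) + (c - 1) // 3 + 1

def share_a_unit(a, b):
    return a != b and (a[0] == b[0] or a[1] == b[1] or block(*a) == block(*b))

def common_unit_kinds(group):
    """Kinds of units ('row', 'col', 'blk') that contain all the given cells."""
    kinds = set()
    if len({r for r, _ in group}) == 1:
        kinds.add("row")
    if len({c for _, c in group}) == 1:
        kinds.add("col")
    if len({block(r, c) for r, c in group}) == 1:
        kinds.add("blk")
    return kinds

cells = [(r, c) for r in range(1, 10) for c in range(1, 10)]
xi1 = xi2 = True
for a in cells:
    for b in cells:
        if a == b:
            continue
        for x in cells:
            if not (share_a_unit(x, a) and share_a_unit(x, b)):
                continue
            kinds = common_unit_kinds([a, b, x])
            if a[0] == b[0]:                       # XI.1, two cells in a row
                xi1 = xi1 and bool(kinds & {"row", "blk"})
            if a[1] == b[1]:                       # XI.1, two cells in a column
                xi1 = xi1 and bool(kinds & {"col", "blk"})
            if block(*a) == block(*b) and a[0] != b[0] and a[1] != b[1]:
                xi2 = xi2 and "blk" in kinds       # XI.2
print(xi1, xi2)
```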
Theorem XI.3: given three different cells such that any two of them share a
unit (possibly a different one for each couple), then there exists a unit shared by
all three cells (and there may exist two such units).
Unless otherwise stated, the theorems in this section are valid for Sudoku grids
of any size.
Theorem XI.4: given three different cells in a row, if a cell shares a (possibly
different) unit with each of them, then either all four cells are in the same row or
they are in the same block (the "or" being exclusive in case of a 9x9 grid). The
same is true if "row" is replaced by "column".
Proof: if the fourth cell is also in this row, we are done; if not, then it shares no
row with any of the first three cells and it can share a column with at most one of
them; therefore, it must share a block with at least two of them; but then it cannot
share a column with any of the three cells that would be outside this block, and the
only way it has to share a unit with each of the three cells is sharing this block with
them all (which implies that the three cells already shared this block), so that the
first part of the theorem is proved. For the second part, just apply the row-column
symmetry to this proof.
Theorem XI.5: given three different cells in a block, not all three in the same
row and not all three in the same column, if a cell shares a (possibly different)
unit with each of them, then it is in the same block.
Proof: in the conditions of the theorem, the first three cells span at least two rows
and two columns; any cell that is not in the same block as them can share no block
with any of them, it can share a row with at most two of them but then it can share
no column with any of them, or it can share a column with at most two of them but
then it can share no row with any of them; finally, a cell which is not in the same
block as the first three cannot share a (possibly different) unit with each of them.
Theorem XI.6: given four different cells such that any two of them share a
unit (possibly a different one for each couple), then there exists a unit shared by
all four cells. This unit is unique in case of a 9x9 grid.
When we consider five or more cells, results are slightly simpler when we limit
them to 9x9 Sudoku grids (but notice that results similar to those in sections 1 and 2
could be formulated for grids of any size).
Theorem XI.7: given four different cells in a row, if a cell shares a (possibly
different) unit with each of them, then all five cells are in the same row. The same
is true if "row" is replaced by "column".
Proof: if the fifth cell is also in this row, we are done; if not, then it shares no
row with any of the first four cells and it can share a column with at most one of
them; therefore, it must share a block with at least three of them; but then it cannot
share a column with any cell among them that would be outside this block, and the
only way it has to share a unit with each of the first four cells is sharing this block
with them all; but four cells in a row cannot share a block; so that the first part of the
theorem is proved. For the second part, just apply the row-column symmetry to this
proof.
Theorem XI.8: given four different cells in a block, if a cell shares a (possibly
different) unit with each of them, then all five cells are in the same block.
Proof: in the conditions of the theorem, the four cells span at least two rows and
two columns; any cell that is not in the same block as them can share no block with
any of them, it can share a row with at most three of them but then it can share no
column with any of them, or it can share a column with at most three of them but
then it can share no row with any of them; finally, a cell which is not in the same
block as the first four cannot share a (possibly different) unit with each of them.
Theorem XI.9: given five different cells such that any two of them share a unit
(possibly a different one for each couple), then there exists a single unit shared by
all five cells. This unit is unique in case of a 9x9 grid.
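Theorems XI.6 and XI.9 can be checked in the same exhaustive way on a 9x9 grid. The following Python sketch is only an illustration: the helper names (`units`, `linked`, `cliques`, `shared_unit_counts`) and the 0-based coordinates are assumptions introduced here. It builds all the sets of k pairwise-linked cells and records how many units each such set shares.

```python
# Assumption of this sketch: a cell is a pair (row, column), 0-based, on a 9x9 grid.
CELLS = [(r, c) for r in range(9) for c in range(9)]

def units(cell):
    """Return the three units (row, column, block) that contain a cell."""
    r, c = cell
    return {("row", r), ("col", c), ("blk", 3 * (r // 3) + c // 3)}

def linked(a, b):
    """Two different cells are linked when they share at least one unit."""
    return a != b and bool(units(a) & units(b))

def cliques(k):
    """Return all sets of k pairwise-linked cells, as increasing tuples."""
    current = [(cell,) for cell in CELLS]
    for _ in range(k - 1):
        current = [clique + (cell,)
                   for clique in current
                   for cell in CELLS
                   if cell > clique[-1]
                   and all(linked(cell, other) for other in clique)]
    return current

def shared_unit_counts(k):
    """Return the set of values taken by the number of units shared by
    all the cells of a set of k pairwise-linked cells."""
    counts = set()
    for clique in cliques(k):
        common = set.intersection(*(units(cell) for cell in clique))
        counts.add(len(common))
    return counts
```

A result containing 0 would be a counter-example to the existence claim; a result reduced to the single value 1 confirms both existence and uniqueness for the chosen k.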
Meta-theorem 3 (chapter I or IV) does not apply directly to the above theorems,
since they refer to predicate "share-a-unit". But one can introduce block-free analo-
gues of these theorems.
Now, the following limited versions of the theorems in section 1 are totally ob-
vious (and, in themselves, they will be as useless to us as they are obvious). Notice
that, given the trivial situation in the conclusion of these theorems, there is no need
to consider sets of more than three cells.
Theorem XI.1-rc: given two different rc-cells in the same row, if a rc-cell is rc-
connected to each of them, then it is in the same row. The same is true if "row" is
replaced by "column".
Theorem XI.3-rc: given three different rc-cells such that any two of them are rc-
connected, then either all three rc-cells share the same row coordinate (i.e. they are
in the same row) or they share the same column coordinate (i.e. they are in the same
column).
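Obvious as it is, Theorem XI.3-rc can also be checked mechanically. The Python sketch below assumes that two rc-cells are rc-connected when they are different and have the same row coordinate or the same column coordinate (blocks being ignored); this reading of the predicate, the 0-based coordinates and the function names are assumptions of the sketch.

```python
from itertools import combinations

def rc_connected(a, b):
    """Assumed reading of rc-connected: different rc-cells with the same
    row coordinate or the same column coordinate (blocks are ignored)."""
    return a != b and (a[0] == b[0] or a[1] == b[1])

def check_theorem_xi_3_rc(size=9):
    """Check Theorem XI.3-rc on every triple of rc-cells; return the number
    of pairwise rc-connected triples found."""
    cells = [(r, c) for r in range(size) for c in range(size)]
    count = 0
    for a, b, c in combinations(cells, 3):
        if rc_connected(a, b) and rc_connected(a, c) and rc_connected(b, c):
            same_row = a[0] == b[0] == c[0]
            same_col = a[1] == b[1] == c[1]
            # exactly one of the two alternatives holds, since the cells are different
            assert same_row != same_col, f"counter-example: {a}, {b}, {c}"
            count += 1
    return count
```

Since the check only uses the two coordinates of each cell, reading the second coordinate as a number instead of a column gives the same verification in row-number space.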
The theorems of section 2.1 become more interesting, and they will be used in
chapter XV, when they are transferred by column-number symmetry into abstract
row-number space, according to meta-theorem 3 (chapter I or IV).
First, one has to introduce predicate rn-connected, with arity 4 and signature
(Row, Number, Row, Number);
Theorem XI.3-rn: given three different rn-cells such that any two of them are
rn-connected, then either all of them have the same row coordinate (i.e. they are
in the same row) or they have the same number coordinate.
As being bn-connected merely means being in the same block, any theorem in
bn-space is obvious.
Chapter XII
Chains are the main tools for dealing with hard Sudoku puzzles. Indeed, many of
the advanced resolution rules that have been appearing on the Web are explicitly or
implicitly concerned with various types of chains; moreover, some of the rules
reviewed in the previous chapters may be recast as chain rules (as will be explained
in chapters XIV through XVII).
But very often, proposed chain rules are only considered through particular
examples, so that there remains much ambiguity concerning their scopes of appli-
cation and much redundancy in proposals for new rules. It also appears that the
basic concepts relevant for chain rules are unclear, leaving unanswered questions
even for the most familiar types of rules (for instance, should one allow loops in xy-
chains?, in AICs?). As a consequence, the need for a systematic classification of
rules has been expressed many times but it remains as yet unsatisfied.
This chapter introduces our general conceptual framework for dealing with two
dimensional (2D) chains. As all the chains in this Part Three are 2D chains (i.e. can
be defined as chains of cells in either of the rc-, rn- or cn- 2D spaces), "2D" will
generally be omitted. This chapter defines the notions of a link, of a chain, of a
target value and of a target cell for a chain; based on these notions, it formulates a
general inference rule schema for chains – a prototypical theorem for all conceivable
types of chains.
Then it introduces some usual types of links and associated types of chains fre-
quently met in the sequel: xy-chains (in section 2) and c-chains (or conjugacy
chains, in section 3). Although such names may be familiar to the experienced
sudoku-ka, our definitions are non-standard, either refining the usual ones (in the
case of xy-chains) or extending them (in the case of c-chains). The reasons for this
will appear when the general inference rule schema of section 1 is specialised to
chains of these types (respectively in sections 2 and 3): our definitions are the most
general ones leading to a natural proof of these rules.
This section introduces the basic notions of our general framework for dealing
with 2D chains. It culminates at the end with the general inference rule schema for
2D chains. This rule schema must be considered as a regulatory principle for the
definitions of specific types of chains (relevant conditions must be put in the
definitions of the chain and of its targets so that the theorem is valid for them). This
is why no specific types of chains, no concrete examples, will be introduced before
the end of this section: we need to know how we intend to use chains of various
types for inference before we can give them precise definitions.
XII.1.1. Links
When two different cells share a unit (i.e. when they are in the same row, in the
same column or in the same block), we also say that they are linked (or rc-linked if
we want to emphasise that the link is in natural row-column space) or that there is a
link (or an rc-link) between them. This is a symmetric (but not reflexive) relation: if
C1 is linked to C2, then C2 is linked to C1 (but C1 is not linked to C1). Sometimes,
one says that the two cells see each other, but we prefer avoiding such vocabulary.
Since two different cells can share two units at the same time (a row and a block,
or a column and a block, but not a row and a column), we may have to be more pre-
cise when we say that they share a unit. So, if ut is a unit type (ut = row, col or blk),
we shall sometimes say that the two cells are ut-linked or linked along a unit of type
ut; similarly, if u is a unit, we shall sometimes say that the two cells are u-linked or
linked along unit u. Notice that, given a unit-type ut and two cells ut-linked, there is
Chains, target cells and chain rules 185
one and only one unit u of type ut that can link these cells. But, most of the time, in
chain rules, only the existence of a link will be relevant, not its type.
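The notions of this subsection can be made concrete with a small Python sketch. It is only an illustration: the representation of a cell as a 0-based pair (row, column) and the names `shared_unit_types`, `linked` and `possible_type_combinations` are assumptions introduced here.

```python
# Assumption of this sketch: a cell is a pair (row, column), 0-based, on a 9x9 grid.
CELLS = [(r, c) for r in range(9) for c in range(9)]

def shared_unit_types(cell1, cell2):
    """Return the unit types ('row', 'col', 'blk') along which two cells are linked."""
    if cell1 == cell2:
        return set()                      # the link relation is not reflexive
    (r1, c1), (r2, c2) = cell1, cell2
    types = set()
    if r1 == r2:
        types.add("row")
    if c1 == c2:
        types.add("col")
    if (r1 // 3, c1 // 3) == (r2 // 3, c2 // 3):
        types.add("blk")
    return types

def linked(cell1, cell2):
    """Two cells are linked when they are ut-linked for at least one unit type ut."""
    return bool(shared_unit_types(cell1, cell2))

def possible_type_combinations():
    """Collect every combination of unit types shared by two different cells."""
    return {frozenset(shared_unit_types(a, b))
            for a in CELLS for b in CELLS if a != b}
```

For instance, `shared_unit_types((0, 0), (0, 1))` contains both "row" and "blk", and the set returned by `possible_type_combinations()` contains no combination with both "row" and "col".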
In sections 2 and 3 respectively, two specific types of links between two cells
(xy-links and conjugacy-links or c-links) will be introduced; they will play a major
role in the next chapters.
Linked cells are most useful when they combine into chains. Later we shall defi-
ne various specific types of chains, but all of them will satisfy the conditions in the
following definition: a chain (or a general chain) is a finite sequence of cells (it is
thus linearly ordered) such that:
– any two consecutive cells in the sequence are different and they are linked by a
definite unit,
– the two endpoints of the sequence are different.
Remarks:
– "definite unit" in the first condition means that, in case two consecutive cells
share two units, only one of these (the one specified in the definition of the chain) is
considered as linking these two cells in this chain;
– we consider only chains with different endpoints, or "open" chains (it will be
shown later that there is no reason for introducing global loops, since rules that
might be associated with them can be handled with shorter chains through the notion
of a target of a chain – see chapter XIV);
– intermediate non consecutive cells are not a priori forbidden to be identical, so
that a general chain may contain internal loops; but in the specific definitions of the
usual types of chains, specific restrictions will allow one to discard such loops as
being "unproductive" – an idea that will be explained later;
– if we need to specify the length of a chain, we shall speak of a 3-chain, a 4-
chain, a 5-chain…, according to the number of cells it contains (beware: we count
the cells, not the links);
– a 2-chain is simply a couple of cells that share a unit;
– in our definition of a chain, we do not impose the condition that the first and
the last cells share some candidate; although such a condition will be needed in the
end (i.e. when we want to use chains for inference), putting them in the general defi-
nition of a chain would unnecessarily complicate matters if we want to combine dif-
ferent kinds of chains;
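The definition of a general chain translates directly into a test on a sequence of cells. In the Python sketch below (an illustration only), a chain is given as a list of cells together with the unit type chosen for each link, which plays the role of the "definite unit"; this representation, the 0-based coordinates and the function names are assumptions of the sketch.

```python
def unit_index(cell, unit_type):
    """Index of the unit of the given type ('row', 'col' or 'blk') containing a cell."""
    r, c = cell
    return {"row": r, "col": c, "blk": 3 * (r // 3) + c // 3}[unit_type]

def is_general_chain(cells, link_types):
    """cells: list of (row, column) pairs, 0-based.
    link_types[i]: unit type chosen to link cells[i] and cells[i + 1]."""
    if len(cells) < 2 or len(link_types) != len(cells) - 1:
        return False
    if cells[0] == cells[-1]:
        return False                      # the two endpoints must be different
    for a, b, ut in zip(cells, cells[1:], link_types):
        if a == b or unit_index(a, ut) != unit_index(b, ut):
            return False                  # consecutive cells: different and linked along ut
    return True
```

With this sketch, `is_general_chain([(0, 0), (0, 5), (4, 5)], ["row", "col"])` holds (a 3-chain), whereas a sequence returning to its first cell is rejected. Internal loops are accepted, in agreement with the third remark above.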
We shall also need to make a distinction between a chain of a specified type and
a full chain of the same type, i.e. a chain of this type satisfying additional conditions
specific to this type that make it ready for the application of the inference rule speci-
fic to this type of chain (see section 1.4). Such a distinction is necessary if we want
to be able to combine chains of different types (inference rules apply to full chains
whereas non full chains can be combined to form chains of mixed types).
The aim of introducing chains, and more specifically full chains of specified
types, is the formulation of constraints propagation rules that lead to the elimination
of a (generally unique) candidate from some (generally several) cells. The number
and cells that may be concerned by such elimination will be called the targets of the
chain (target number or target value and target cells). A specificity of our
framework is that the target cells of a chain do not belong to the chain. For all the
types of chains we consider in this book, this allows defining them with homoge-
neous chain patterns without incurring any loss of generality and this allows a chain
to have several targets, as will be seen from the xy- and c- chain examples below.
To be more precise in our definition of a target cell, consider first that, in any
actual chain on an actual grid, there are two aspects:
– the structural aspects: the number of cells, the types of links between them, the
relationships these links impose on the values in the cells of the chain;
– the instantiation (or occurrence) aspects: the actual places of the cells on the
grid and the actual values of the candidates for these cells.
It will always be very easy to define the target number of a full chain of a given
type and there will be one and only one such number. As for the target cells, al-
though similar ideas appear from time to time in the Web literature, this notion does
not play a major or systematic role, because they are generally considered as being
part of the chain. We define them according to our general guiding principles:
We therefore adopt the following definition: a general target cell of a full chain
is a cell that does not belong to the chain and that is linked to both of its endpoints.
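To illustrate this definition, the following Python sketch lists the general target cells of a chain given by the places of its cells. The 0-based coordinates and the names `linked` and `general_target_cells` are assumptions of the sketch; candidates play no role at this stage.

```python
# Assumption of this sketch: a cell is a pair (row, column), 0-based, on a 9x9 grid.
CELLS = [(r, c) for r in range(9) for c in range(9)]

def linked(a, b):
    """Two different cells are linked when they share a row, a column or a block."""
    if a == b:
        return False
    (r1, c1), (r2, c2) = a, b
    return r1 == r2 or c1 == c2 or (r1 // 3, c1 // 3) == (r2 // 3, c2 // 3)

def general_target_cells(chain):
    """Cells that do not belong to the chain and are linked to both of its endpoints."""
    first, last = chain[0], chain[-1]
    return [cell for cell in CELLS
            if cell not in chain and linked(cell, first) and linked(cell, last)]
```

For the 4-chain `[(0, 0), (0, 4), (4, 4), (4, 8)]`, the endpoints share no unit and the function returns the two cells `(0, 8)` and `(4, 0)`, each of which shares a row with one endpoint and a column with the other.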
Finally, we adopt the following definition: a specific target cell of a specific type
of chain is a cell that does not belong to the chain and that is linked to both of its
endpoints and to zero or more cells in this chain, as defined precisely by the specific
type of the chain.
When we speak of the target cell of a full chain or simply of the target of a chain
of some specified type, unless otherwise specified, we mean a specific target cell of
a full chain of this type. The next two sections will introduce the two basic cases of
xy-chains and c-chains and give the corresponding definitions of full chains and tar-
get cells. Other types of chains will be considered in forthcoming chapters.
With the above definitions, constraints propagation rules based on full chains of
any type (zz) must be instances of the following (informal) general rule schema:
A formal version of this theorem schema will be given in the next chapter, after
we have introduced a graphico-logical formalism for representing chains.
Notice that, as the conclusion of the rule modifies the content of the target cells,
if we allowed a target cell to be one of the cells in the chain, we would potentially
have a vicious circle, amounting to negating the conditions used to justify the
conclusion. This is why we excluded this possibility from the definition of a target
cell.
Notice also that all our chain rules can only eliminate candidates, they can never
assert values. In the literature, one can find chain rules that assert values, but this can
always be replaced by a succession of rules for shorter chains that eliminate candi-
dates and of elementary rules. Finally, among all the rules discussed in this book,
the only ones that assert values are Naked-Single and Hidden-Single.
XII.2.1. xy-links
Definition: two cells on a grid are said to be xy-linked along a given unit u
(respectively: xy-linked along a given unit type ut) by a given number n if:
– they are linked along unit u (respectively: linked along a unit of type ut),
which entails that they are different,
– each of the two cells has two distinguished non equal candidates, called the
left-linking candidate and the right-linking candidate (and it may have additional
candidates);
– the left-linking candidate for the second cell is equal to the right-linking candi-
date for the first cell; it is called the xy-linking candidate for the two cells.
In such a situation, u is called the xy-linking unit and ut the xy-linking unit type.
An xy-link is called strict if none of the two cells it links has any candidate other
than the two distinguished ones.
To make this definition more intuitive, and anticipating chapter XIII (where
such patterns will be given rigorous meaning and logical status), let us give a
graphical representation of an xy5-chain (where the horizontal bars represent links
along units of any type):
Definitions:
– a full xy-chain is an xy-chain in which the left-linking candidate for the first
cell equals the right-linking candidate for the last cell (but it is not required that
there is a direct link between these two cells, i.e. a full xy-chain does not necessarily
extend into what might be called an xy-loop);
– the target number of a full-xy-chain is the left-linking candidate for the first
cell, which is equal to the right-linking candidate for the last cell;
– a target cell of a full-xy-chain is any general target cell of this chain (notice in
particular that we do not require a target cell to be bivalue and we do not require
the links between the endpoints of an xy-chain and any of its target cells to be xy-
links).
In order to illustrate the difference between a general xy-chain and a full xy-
chain, let us give an intuitive graphical representation of a full xy5 chain:
Theorem XII.1 (constraints propagation rule for full xy-chains): given a full
xy-chain with xy-chain target value n, eliminate n from the candidates for any of
its target cells.
Proof of the rule for the full xy4-chain {1 2}—{2 3}—{3 4}—{4 1}: let the
cells in the chain be C1, C2, C3, C4; let their successive left-linking candidates be n1,
n2, n3, n4, so that the target value is n1 and the successive right-linking candidates are
n2, n3, n4, n1; let TC be any xy4-target-cell, i.e. TC shares a unit with both C1 and
C4 – and it is therefore different from each of these two cells.
Cell C1 can take two and only two values (hypothesis n2 ≠ n1 is essential for this
assertion). Let us consider each possibility in turn:
– if C1 = n1, then TC cannot be n1 since it shares a unit with C1 (notice that hypo-
thesis TC ≠ C1 is essential here);
– if C1 = n2, then C2 cannot be n2 since it shares a unit with C1; it must therefore
be n3 (hypothesis n3 ≠ n2 is essential for this conclusion). C3 cannot be n3 since it
shares a unit with C2; it must therefore be n4 (hypothesis n4 ≠ n3 is essential for this
conclusion). C4 cannot be n4 since it shares a unit with C3; it must therefore be n1
(hypothesis n1 ≠ n4 is essential for this conclusion); finally, TC cannot be n1 since it
shares a unit with C4 (notice that hypothesis TC ≠ C4 is essential here).
To finish the proof: in any of the two cases, whether C1 = n1 or C1 = n2, TC can-
not be n1.
The case of a full xy-chain of any length (at least 2) is completely similar. We
just have to do the appropriate number of inference steps in the second branch of the
above alternative concerning the possible values of cell C1. For a more formal proof,
the best way to proceed is proving theorem XII.2 below by recursion on the length of the chain.
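The case analysis of this proof can be replayed mechanically. The Python sketch below is only an illustration under stated assumptions: each cell of the chain is taken to be bivalue and is represented by its pair (left-linking candidate, right-linking candidate), and the only constraint used is that two consecutive cells, being linked, cannot have the same value. The function names are hypothetical.

```python
from itertools import product

def is_full_xy_chain(pairs):
    """pairs[i] = (left-linking, right-linking) candidates of the i-th cell,
    each cell being assumed bivalue."""
    return (len(pairs) >= 2
            and all(left != right for left, right in pairs)
            and all(pairs[i][1] == pairs[i + 1][0] for i in range(len(pairs) - 1))
            and pairs[0][0] == pairs[-1][1])

def consistent_assignments(pairs):
    """All ways of giving each cell one of its two candidates such that
    consecutive (linked) cells receive different values."""
    return [values for values in product(*pairs)
            if all(values[i] != values[i + 1] for i in range(len(values) - 1))]

def endpoint_always_holds_target(pairs):
    """True if, in every consistent assignment, the first or the last cell
    holds the left-linking candidate of the first cell."""
    target = pairs[0][0]
    return all(values[0] == target or values[-1] == target
               for values in consistent_assignments(pairs))
```

For the full xy4-chain of the proof, `endpoint_always_holds_target([(1, 2), (2, 3), (3, 4), (4, 1)])` is true: whatever the values in the chain, one endpoint is 1, so that a cell linked to both endpoints cannot be 1. Other constraints of the grid can only remove assignments, so they cannot invalidate this conclusion. For the non full chain `[(1, 2), (2, 3), (3, 4)]` the property fails.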
It should be noted that the two candidates for any cell in an xy-chain must be
different but it is allowed for candidates in different cells to be identical (as far as
this does not contradict the definition of xy-links). In particular, it is not forbidden
for the target value to appear in the chain at places other than the endpoints, as in
this particular xy5-chain: {1 2}—{2 3}—{3 1}—{1 5}—{5 1}.
It should also be noted that the no-loop condition we have put in the definition of
xy-chains has not been used in this proof. Nevertheless, it will be justified on other
grounds in the next chapter.
As a last remark on this theorem, we have defined target cells as not belonging
to the chain, contrary to usage. This allows keeping the homogeneity of the chain
(all links in the chains are pure xy-links) and having the greatest generality for the
rule (target cells are not linked by xy-links). This also allows using the same chain
for eliminations in different target cells. The same remark will apply to c-chains.
Finally, what is shown in the second branch of the alternative in the above proof
is indeed that, if C1 ≠ n1, then all the cells in the chain are equal to their right-linking
candidate. We therefore have:
Theorem XII.2 (general theorem for non necessarily full xy-chains): given an
xy-chain, either the value of the first cell is its left-linking candidate or the value
of each cell in the chain is its right-linking candidate.
This result provides the basic tool for combining chains of different types. (Al-
though such combinations will not be studied in this book, we think it is useful to
mention how they could be dealt with.)
XII.3.1. c-links
Definition: two cells on a grid are said to be conjugate or c-linked along a given
unit u (respectively: c-linked along a unit of a given type ut) by a given number n if:
– they are linked along unit u (respectively: linked along a unit of type ut),
which entails that they are different;
– they both have number n among their candidates;
– on unit u (respectively: on the necessarily unique unit u of type ut linking
them), no other cell has number n among its candidates;
– in this situation, number n is called the c-linking value, unit u the c-linking
unit and unit type ut the c-linking unit type.
Notice that the third condition bears on the specified unit u (respectively: on the
unique unit u of specified type ut) and does not imply anything about a possible se-
cond unit u’ of another type ut’ that might also be shared by the two cells. For in-
stance, two cells that are both in the same row r and in the same block b can be c-
linked by number 1 along row r, but not by number 1 along block b: this is the case
when there is no instance of number 1 among the candidates for other cells in row r
but there is an instance of this number among the candidates for other cells in block
b.
Similarly, two cells can be c-linked along a single unit u by two different num-
bers (and it is then a case of Hidden-Pairs in the corresponding unit), or along two
different units by two different numbers (for instance, if they share a row and a
block, they can be c-linked by number 1 along row r and by number 2 along block
b). Nevertheless, two cells cannot be c-linked along a single unit (or unit type) by
more than two different numbers (if two cells are c-linked along a unit u by two
different numbers, this makes a Hidden Pairs; therefore, if they were c-linked along
u by a third number, the puzzle would have no solution).
It can be expected that a pair of conjugate cells is a very useful tool for
inference. Indeed, if one of them equals the c-linking value n then the other must be
different (which is a simple consequence of their sharing a unit), but the converse is
also true: one of them must be n (since there is no other possibility for the instance
of n that must occur in unit u to find a place anywhere else in this unit).
Remember also from section VI.2.4, that conjugacy merely means bivalue in
either of the rn-, cn- or bn- spaces.
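As an illustration of the definition of a c-link, the following Python sketch tests conjugacy along a given unit. The representation of the grid as a dictionary mapping each cell to its set of candidates (cells absent from the dictionary being treated as having no candidate), the 0-based indices and the names `unit_cells` and `c_linked` are assumptions of this sketch.

```python
def unit_cells(unit_type, index):
    """Cells (row, column), 0-based, of the unit of the given type and index."""
    if unit_type == "row":
        return [(index, c) for c in range(9)]
    if unit_type == "col":
        return [(r, index) for r in range(9)]
    r0, c0 = 3 * (index // 3), 3 * (index % 3)          # unit_type == "blk"
    return [(r0 + dr, c0 + dc) for dr in range(3) for dc in range(3)]

def c_linked(candidates, cell1, cell2, unit_type, index, n):
    """True if cell1 and cell2 are c-linked by number n along the given unit.

    candidates: dict mapping a cell to the set of its candidates
    (an assumed representation; missing cells have no candidates)."""
    cells = unit_cells(unit_type, index)
    if cell1 == cell2 or cell1 not in cells or cell2 not in cells:
        return False
    holders = {cell for cell in cells if n in candidates.get(cell, set())}
    return holders == {cell1, cell2}      # both have n, and no other cell of the unit has it

# Hypothetical candidates reproducing the situation described in the text:
# number 1 appears twice in row 0 but three times in block 0.
example = {(0, 0): {1, 2}, (0, 1): {1, 3}, (1, 2): {1, 4}}
```

With these hypothetical data, `c_linked(example, (0, 0), (0, 1), "row", 0, 1)` holds, whereas `c_linked(example, (0, 0), (0, 1), "blk", 0, 1)` does not, because cell `(1, 2)` of the block also has 1 among its candidates.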
XII.3.2. c-chains
As a consequence of this definition, all the cells in the chain share the same defi-
ning candidate (and each cell can have any additional candidates).
Notice that our definition is much broader than the usual ones appearing under
the name of "simple colouring": we do not require all the links to be c-links, but
only the odd links. The reason is that this is enough for the purpose of inference, as
will be made clear in the next subsection.
To make this definition more intuitive, and anticipating chapter XIII, let us
give a graphical representation of a general c5-chain (where the =(1)= symbols repre-
sent c-links relative to variable n1):
1=(1)=1—1=(1)=1—1
Definitions:
– a full c-chain is a c-chain of even length, i.e. it has an even number of cells
(this additional condition will appear as being essential); as a consequence, the first
and the last links of a full c-chain are c-links;
– the target number of a full c-chain is the number explicitly listed in the c-links
as common to all the cells;
– a target cell of a full c-chain is simply a general target cell of this chain (notice
in particular that we do not require the links between the endpoints of a c-chain and
any of its target cells to be c-links).
In order to illustrate the difference between a general c-chain and a full c-chain,
let us give the intuitive graphical representations of a full c4- and a full c6- chain
(the last link is also a c-link):
Theorem XII.3 (constraints propagation rule for full c-chains): given a full c-
chain with c-chain target value n, eliminate n from the candidates for any of its
target cells.
Proof of the rule for a full c4-chain: let the cells in the chain be C1, C2, C3, C4, let
n be the target value (which is also the common c-linking value) and let TC be a
target cell. According to the definition of a c-chain, the link between C1 and C2 and
the link between C3 and C4 are c-links. Now, consider in turn the possible values for
cell C1:
– if C1 = n, then TC cannot be n since it shares a unit with C1 (notice that hypo-
thesis TC ≠ C1 is essential here);
– if C1 ≠ n, then C2 = n, since C1 and C2 are c-linked by number n; but then
C3 ≠ n, since C3 shares a unit with C2; finally C4 = n, since C3 and C4 are c-linked by
number n; therefore TC cannot be n, since it shares a unit with C4 (notice that hypo-
thesis TC ≠ C4 is essential here).
The case of a full c-chain of any even length is completely similar. We just have
to do the appropriate number of inference steps in the second branch of the above al-
ternative concerning values of cell C1. Here the condition that the length is even is
essential: when the first cell is not n, values of successive cells alternate between n
and not n; the last one must be n for the inference to be valid; it must therefore oc-
cupy an even position in the chain. Notice also that the argument alternates between
consecutive cells being conjugate and simply sharing a unit; this is the reason why
we adopted an unusual extended definition of c-chains.
Again, not considering target cells as belonging to the chain is contrary to usage.
But this allows keeping the homogeneity of the chain (all links alternate between c-
links and ordinary links) and having the greatest generality for the rule (target cells
are not necessarily linked to the endpoints by c-links). This avoids artificial and
pointless distinctions between different types of c-chains, depending on the types of
links with the target cells, such as the notions of "continuous" and "discontinuous"
loops.
Notice that, as was the case for xy-chains, what is shown (if we transform it into
an obvious recursion step) in the second branch of the alternative in the above proof
is indeed that, if C1 ≠ n, then all the cells in even position in the chain are equal to n
and all the cells in odd position are different from n. We therefore have:
Theorem XII.4 (general theorem for non necessarily full c-chains): given a c-
chain, either the value of the first cell is the c-chain linking value, or the value of
each even cell in the chain is the c-linking value and the value of each odd cell in
the chain is not the c-linking value.
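The alternation described in this theorem can be checked by enumeration. The Python sketch below is only an illustration: each cell of a c-chain is abstracted into a truth value ("the cell is equal to n"), the odd links (first, third, ...) are treated as c-links (exactly one of the two cells is n) and the even links as ordinary links (at most one of the two cells is n). This abstraction and the function names are assumptions of the sketch.

```python
from itertools import product

def consistent_states(length):
    """All tuples is_n (is_n[i] meaning: cell number i + 1 is equal to n)
    compatible with the links of a c-chain of the given length."""
    states = []
    for is_n in product([False, True], repeat=length):
        ok = True
        for i in range(length - 1):
            a, b = is_n[i], is_n[i + 1]
            if i % 2 == 0:
                ok = ok and (a != b)          # odd link: c-link, exactly one cell is n
            else:
                ok = ok and not (a and b)     # even link: ordinary link, at most one cell is n
        if ok:
            states.append(is_n)
    return states

def theorem_xii_4_holds(length):
    """If the first cell is not n, even cells are n and odd cells are not n."""
    for is_n in consistent_states(length):
        if is_n[0]:
            continue
        # cells are numbered from 1 in the text: index i is an even cell when i % 2 == 1
        if not all(is_n[i] == (i % 2 == 1) for i in range(length)):
            return False
    return True

def full_chain_rule_holds(length):
    """In every consistent state, the first or the last cell is n."""
    return all(state[0] or state[-1] for state in consistent_states(length))
```

In this abstraction, `theorem_xii_4_holds` is true for every length, whereas `full_chain_rule_holds` is true for lengths 4 and 6 but false for length 5, which shows why a full c-chain must have an even number of cells.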
The case of Hidden Pairs should be clarified in order to dismiss some inappro-
priate conclusions.
Nevertheless, rules for Hidden Pairs are not subsumed by the c2-chain rule (nor
by the Interaction rules). Indeed, the target cells for Hidden-Pairs are the cells for-
ming the Hidden Pairs and the target values are the Numbers not in the Pairs, where-
as the target cells for the c2-chain rule cannot be cells in the Pairs and the target
values can only be values in the Pairs.
We have stated the general chain rule schema and its specific instantiations for
xy-chains and c-chains in their practical imperative form: "given …, eliminate …"
(which is just a way of saying: "if …, then eliminate …"). Considering the formal
(epistemic) definitions of predicates "value", "not-candidate" and "candidate" given
in chapter IV, one might be sceptical about our informal way of stating and proving
the chain rules. Let us therefore clarify these points. These clarifications will not be
repeated, but they do apply to all the chain rules we shall state in each of the
forthcoming chapters. (Of course, they also apply to all the rules previously stated,
even though it is less obvious from their proofs that they need an interpretation).
First, a phrase such as "given a full zz-chain" should be interpreted as a strong
epistemic condition: "whenever, in the current knowledge state, you have effectively
detected a zz-chain on a grid". In practice, it means that the logical description of a
zz-chain must be written with the epistemic predicates "value" and "not-candidate",
as we said in chapter IV.
Second, the phrase "eliminate n from the candidates for cell (r, c)" in the conclu-
sion of a chain rule must also be interpreted formally in a strong epistemic sense, i.e.
as the assertion of "not-candidate(n, r, c)" in all the subsequent knowledge states.
Therefore, the global meaning of the rule is also completely epistemic: once you
have effectively detected a full chain pattern on a grid, together with a target value
and a target cell, then you do know that this value is excluded from this cell.
Now, nothing would be disturbing with this strong epistemic interpretation (the
only one consistent with the notion of a resolution rule) if our proofs of these rules
did not implicitly rely on the underlying non epistemic predicate "value°". Indeed,
for our proofs to be meaningful, phrases such as "if (r, c) = n" or "if (r, c) ≠ n" must
be interpreted formally as: "given any (partial) solution for the puzzle, if it is the
case that value°(n, r, c)" or "given any (partial) solution for the puzzle, if it is the
case that ¬value°(n, r, c)", i.e. in terms of the simple (but unknown) truth (in
epistemic states) of a fact about any solution. In chapter IV, we said that we never
needed to use the primary non epistemic predicates in the statement of the resolution
rules and we insist that this remains true, as shown by the above remarks on their
interpretation, even if we need to use these predicates in their proofs.
Our proofs are of the following type: I know that, in any epistemic state acces-
sible from the current one (if there is any), cell C1 can only be 1 or 2, I do not know
whether, in a supposedly given solution grid among these, cell C1 is in fact 1 or 2,
but, considering that I know the existence of the zz-chain and considering the possi-
ble values of the successive cells in this chain, whichever value cell C1 has (among 1
and 2), zz-target cell C cannot have value 1. Such a proof is of the following type: I
know the climate in Delhi is awful in monsoon time, I do not know whether it is
raining right now in Delhi, but if it is raining I shall need an umbrella to protect me
from the rain and if it is not raining I shall need an umbrella to protect me from the
sun; therefore, I know that I shall always need an umbrella. From general known
facts we may infer other general known facts, but in order to prove that the
conclusion is correct we need to consider hypothetical contingent facts that do not
have an epistemic status.
From the point of view of intuitionistic logic, although the law of the excluded
middle is not valid, the above chain of reasoning is perfectly valid, because the
following axiom, on which it relies, is valid:
(A ⇒ C) ⇒ ((B ⇒ C) ⇒ (A or B ⇒ C)).
In the proof of the chain rules, this axiom is repeatedly applied in formulæ such
as:
(C1 = n1 ⇒ C ≠ n1) ⇒ ((C1= n2 ⇒ C ≠ n1) ⇒ (C1= n1 or C1= n2 ⇒ C ≠ n1)).
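The intuitionistic validity of this axiom can be illustrated in a proof assistant. The following Lean 4 sketch (the theorem name is an assumption introduced here) proves it by the elimination rule for disjunction alone, without invoking the law of the excluded middle or any other classical axiom.

```lean
-- Lean 4, core library only; no classical axiom is used.
theorem reasoning_by_cases (A B C : Prop) :
    (A → C) → ((B → C) → (A ∨ B → C)) :=
  fun hAC hBC hAB => Or.elim hAB hAC hBC
```

The instance used in the chain rules is obtained by taking A to be "C1 = n1", B to be "C1 = n2" and C to be "C ≠ n1".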
This kind of proof is sometimes called "reasoning by cases" and this is certainly
justified if we refer to the proof itself and to the person carrying it out (except that, in
the present situation, reasoning by cases is intermingled with epistemic facets of the
underlying logic). But it would be very inappropriate to extend this qualification to
the player: you cannot expect him to prove the resolution rules every time he uses
them – or do you prove Pythagoras’ theorem any time you apply it? Like any other
resolution rule, a chain rule is written (and applied) in the condition-action form, i.e.
it has an imperative form and it requires no reasoning by cases on the part of the
player. As is the case with any other rule, a chain rule is proven once and for all;
how it was proven is not relevant for the player; he is just expected to detect the
appropriate pattern (which leaves a lot of room for fun) and to apply the conclusion.
Classifying the chain rules in the "reasoning by cases" category is often used to
argue that they are not much different from Trial and Error. But this amounts to
confusing the mathematician’s and the player’s roles. The two types of techniques
should not be assimilated, for the mere and obvious reason that Trial and Error is not
a resolution rule (it does not specify any pattern that could entail an elimination).
Chapter XIII
Before continuing with a detailed study of xy-chains and c-chains and before
defining more complex types of chains, we need an intuitive (but unambiguous)
representation of chains. The graphical formalism introduced in section 1 of the
present chapter (and some easy extensions of it that will be defined in the next chapters,
when we need them) aims first at facilitating the writing of chains and chain rules in
their full generality, e.g. discarding irrelevant data (such as the types of the units
shared by consecutive cells in the chain or shared with target cells). But it also aims
at establishing (in section 2) a strict correspondence between these intuitive
representations and well-defined logical formulæ. Finally, it is justified by showing (in
section 3), with the examples of xy-chain rules and c-chain rules, that it is adequate
for writing resolution rules based on chains.
As a practical result, in the sequel we shall be able to write all our chain rules in
this intuitive graphical formalism (with some minor extensions), never again writing
explicitly any logical formula, while remaining confident that these rules rest on the
strictest logic. This opens the way to the systematic logical formulation of resolution
rules based on chains, which can itself be considered as the first step towards their
implementation by rules of an expert system simulating human behaviour.
XIII.1. Simple patterns for cells and chains; their graphical representations
and their instantiations
In the previous chapters, we have dealt with actual cells, links and chains on
actual grids; the chain rules we have considered have been stated (and proven)
informally. Let us now introduce a formal language for specifying the previously
reviewed situations of these cells, links and chains from a structural point of view (i.e.
not depending on actual places or values). It will be based on various notions of
abstract patterns and it will have simple graphical representations.
Let us adopt the following definitions and conventions for a cell pattern, a notion
intended to describe the structural content of a cell:
– a cell pattern is a set of variables (as such, each variable can appear only once
in it); a cell pattern is either open or closed;
– an open cell pattern is represented by (and displayed as) a list of integers,
where each integer i in the list stands for the corresponding variable ni; since a cell
pattern is a set and not a sequence (i.e. repetition of a variable is not allowed), the
integers in its representation must be different;
– a closed cell pattern is represented by (and displayed as) an open cell pattern
except that it is enclosed in curly braces: for instance, {1 2 3} represents the cell
pattern {n1, n2, n3};
– by a natural abuse of language, we often identify a cell pattern and its display;
– an open cell pattern is said to be instantiated in an actual cell (of an actual
puzzle) when each of its variables is associated with an actual candidate for this cell,
different variables being associated with different candidates (this is the unique
names assumption at the level of cell patterns; it is essential in avoiding
redundancies between rules);
– a closed cell pattern is said to be instantiated in an actual cell (of an actual
puzzle) when each of its variables is associated with an actual candidate for this cell,
different variables being associated with different candidates as before, and when
there are no candidates for this cell other than those covered by such associations;
– when necessary, a cell pattern is named Ck (where k is any positive integer);
accordingly, coordinates of the instantiating cell in the two canonical coordinate
systems are named respectively (rk, ck) and [bk, sk].
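The difference between instantiating an open and a closed cell pattern can be made concrete with a small sketch. It is written in Python purely for illustration; the function name instantiations and the representation of a cell by its set of candidates are our own assumptions, not part of the formalism:

```python
from itertools import permutations

def instantiations(pattern, candidates, closed):
    """Yield every instantiation of a cell pattern in an actual cell.

    pattern: tuple of variable indices, e.g. (1, 2) for the pattern {1 2};
    candidates: set of digits still possible in the actual cell;
    closed: True for a closed pattern, False for an open one.
    Each instantiation is a dict {variable index: candidate}.
    """
    if closed and len(candidates) != len(pattern):
        return  # a closed pattern must cover all the candidates of the cell
    # permutations() never repeats an element, so different variables are
    # always associated with different candidates (unique names assumption).
    for values in permutations(sorted(candidates), len(pattern)):
        yield dict(zip(pattern, values))

# The closed pattern {1 2} fits a bivalue cell in two ways (n1, n2 swapped).
print(list(instantiations((1, 2), {4, 7}, closed=True)))
# It does not fit a cell with three candidates, but the open pattern 1 2 does.
print(list(instantiations((1, 2), {4, 7, 9}, closed=True)))
print(len(list(instantiations((1, 2), {4, 7, 9}, closed=False))))
```

An open pattern only requires its variables to be matched by some candidates of the cell, whereas a closed pattern also forbids any candidate left uncovered.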
In future chapters, we shall introduce more complex cell patterns (with
conditional optional variables), but the present definitions are enough for dealing with
xy-chains and c-chains (and their hidden counterparts that will be introduced in chapter
XV).
Graphico-logical patterns for chain rules 199
Notice that these conventions for cell and link symbols provide no means for
specifying actual values for the cells and candidates or for the type (row, column or
block) of the link symbol. This is in deliberate contrast to many graphical
representations of chains that have been appearing on the Web. The reason for our
choice is that introducing such possibilities and distinctions would unnecessarily
complicate matters in the general formulation of chain rules.
We adopt the following convention: in a chain pattern, cell patterns and link
symbols are always numbered in separate ascending order, starting from 1;
accordingly, the coordinates of the k-th cell, Ck, are always named (rk, ck) or [bk, sk]. This
convention remains true even when k indices are not explicitly displayed (as will be
the usual case). As for the link symbols, when we need to refer to the unit-type of Lk
(which will rarely be the case), it will always be named utk.
A chain pattern is represented by (and displayed as) the sequence of the
representations of each of its successive elements. By a natural abuse of language, a chain
pattern and its representation are often identified, but one should not forget that the
numbers appearing in the cell patterns are not literal constraints on values; they are
only placeholders for Numbers in an actual cell of an actual grid.
Notice that two variables that appear in the same cell pattern may not be
instantiated by the same value but two variables that never appear together in any of the
cell patterns in the chain may be instantiated by the same value.
Definition: a starred chain pattern is like a chain pattern, except that the first and
the last cell patterns, and possibly other ones, are followed by a star (*). The
intended meaning is that any target cell of any chain instantiating this chain pattern must
be linked to the instantiations of the starred cell patterns.
A starred chain pattern is represented by (and displayed as) a chain pattern, with
a star added after each appropriate cell pattern representation.
Our goal in this section is to establish a strict correspondence between the simple
patterns defined in section 1 and logical formulæ. This is in preparation for the next
section, where it will be shown that our chain patterns or their graphical displays are
the main components of shorthands for the logical formulation of resolution rules.
The formulæ defined below are chosen so as to comply with all our previous
definitions. We describe the correspondence in great detail, so that this section can
be considered as the core specification for a software parser taking chain patterns as
input and automatically producing logical formulæ (or rules for an inference engine)
as output.
In this section, we always consider the following chain pattern and we
progressively build a corresponding logical formula:
– the finite conjunction of the predicates "ni ≠ nj" for all the variables ni and nj
(with i ≠ j) in Ck,
– the formula "∀n {n ∉ Ck ⇒ not-candidate(n, rk, ck)}", where n ∉ Ck is a
shorthand for the finite conjunction of the inequalities n ≠ ni, for all the variables ni in Ck.
The logical formula associated with a link symbol Lk can only be defined in the
context of the surrounding cell patterns Ck and Ck+1:
– to the link symbol "—", we associate the open logical formula:
share-a-unit(rk, ck, rk+1, ck+1);
where the auxiliary predicate "share-a-unit" has been defined in section III.2.
– to the link symbol "=(i)=", we associate the open logical formula:
∃utk conjugate(ni, rk, ck, rk+1, ck+1, utk);
where the auxiliary predicate "conjugate" has been defined in section VI.2.4.
Remarks:
– at this stage, the no-loop condition we have imposed on some types of chains
(xy-chains) is not included in these definitions; it will be taken care of later;
– associating a formula with a chain pattern does not mean in itself that we
assert anything about chains instantiating this pattern; the best way to consider such
formulæ is as auxiliary predicates useful for describing situations in which there
appears to be a chain;
– as can be seen from the above definitions and examples, one can easily define
auxiliary predicates describing xy-chains or c-chains of any predefined length. But,
unless one admits formulæ of infinite length, one cannot define a predicate for an
xy-chain or a c-chain of unspecified length.
Example 1: for the typical xy3-chain pattern: {1 2}—{2 3}—{3 4}, the
associated formula is (after introducing our auxiliary predicates):
xy3-chain(r1, c1, r2, c2, r3, c3, n1, n2, n3, n4) ≡
rc-bivalue(r1, c1, n1, n2) &
share-a-unit(r1, c1, r2, c2) &
rc-bivalue(r2, c2, n2, n3) &
share-a-unit(r2, c2, r3, c3) &
rc-bivalue(r3, c3, n3, n4) &
¬same-cell(r1, c1, r3, c3).
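To see how such an auxiliary predicate can be evaluated on an actual grid, here is a minimal sketch in Python. It assumes that the candidates are stored in a dictionary mapping (row, column) to a set of digits; this data layout and the Python function names are ours, chosen to mirror the predicates of the formula, and share_a_unit is taken to hold only for two different cells:

```python
def block(r, c):
    """Block number (1..9) of the cell in row r, column c (both 1..9)."""
    return 3 * ((r - 1) // 3) + (c - 1) // 3 + 1

def same_cell(r1, c1, r2, c2):
    return (r1, c1) == (r2, c2)

def share_a_unit(r1, c1, r2, c2):
    """Two different cells lying in a common row, column or block."""
    if same_cell(r1, c1, r2, c2):
        return False
    return r1 == r2 or c1 == c2 or block(r1, c1) == block(r2, c2)

def rc_bivalue(cands, r, c, n1, n2):
    """Cell (r, c) has exactly the two distinct candidates n1 and n2."""
    return n1 != n2 and cands.get((r, c)) == {n1, n2}

def xy3_chain(cands, r1, c1, r2, c2, r3, c3, n1, n2, n3, n4):
    """Direct transcription of the conjunction displayed above."""
    return (rc_bivalue(cands, r1, c1, n1, n2)
            and share_a_unit(r1, c1, r2, c2)
            and rc_bivalue(cands, r2, c2, n2, n3)
            and share_a_unit(r2, c2, r3, c3)
            and rc_bivalue(cands, r3, c3, n3, n4)
            and not same_cell(r1, c1, r3, c3))

# Hypothetical candidates: three bivalue cells linked by a row, then a column.
cands = {(1, 1): {4, 7}, (1, 5): {7, 9}, (5, 5): {9, 2}}
print(xy3_chain(cands, 1, 1, 1, 5, 5, 5, 4, 7, 9, 2))
```

The variables n1, ..., n4 are instantiated here by 4, 7, 9 and 2: the numbers displayed in the pattern are placeholders, not values.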
Example 2: for the typical c3-chain pattern: 1=(1)=1—1, the associated formula is
(after introducing our auxiliary predicates):
From the above conventions, all the names of variables appearing in a chain
pattern have a subscript. We now add the following convention: variable names
without a subscript (r, c, b, s) are reserved to designate a target cell; variable n is
reserved to name a target value. The logical formula associated with a starred chain
pattern is now defined as the conjunction of:
– the formula associated with the underlying unstarred chain pattern,
– for every starred cell pattern Ck in the chain pattern (which always includes the
cases k = 1 and k = n), the formula share-a-unit(r, c, rk, ck) (expressing that any
target cell must share a unit with the instantiation of cell pattern Ck),
– for every cell pattern Ci in the chain pattern, except the starred ones, the
conjunction of all the formulæ ¬same-cell(r, c, ri, ci), expressing that a target cell
does not belong to the chain; starred cell patterns can be excluded from this
conjunction, since sharing a unit implies being different.
It may be useful to repeat that, even at this stage, writing such a formula does not
mean that we assert anything about chains instantiating starred chain patterns. Since
such patterns do not include conditions specific to particular types of chains, such
assertions would not be correct. Asserting rules for specific types of chains is the
topic of the next section.
xy3-chain*(r, c, r1, c1, r2, c2, r3, c3, n1, n2, n3, n4) ≡
rc-bivalue(r1, c1, n1, n2) &
share-a-unit(r1, c1, r2, c2) &
rc-bivalue(r2, c2, n2, n3) &
share-a-unit(r2, c2, r3, c3) &
rc-bivalue(r3, c3, n3, n4) &
¬same-cell(r1, c1, r3, c3) &
share-a-unit(r, c, r1, c1) &
share-a-unit(r, c, r3, c3) &
¬same-cell(r, c, r2, c2).
In this section, the above graphical formalism is used to write resolution rules
based on full chains.
Contrary to a chain pattern or a starred chain pattern, which are only descriptions
of possible situations, but in conformance with the standard meaning of the "|=" sign
in logic, a chain rule pattern is an assertion. We must therefore be careful not to
write chain rule patterns that would express invalid rules. In particular, only chain
rule patterns corresponding to full chains should be written.
To be precise, the chain rule pattern "Prefix |= SCP" asserts the validity of the
rule defined by the following procedure:
1) let F1 designate the formula associated as above to the starred chain pattern SCP;
2) if no symbol of type "n=nk" is present (as should normally be the case) or if the
symbol "n=n1" is present, then add to F1 the formula n=n1; otherwise, add to F1 the
formula n=nk; this provides for the possibility of having a target variable other than
n1, but this is probably not useful; in any case, the result is a formula F2;
3) if "loops" is absent from the Prefix (which will be the standard case), then add to
F2 the no-loop condition (which we take as the default for all types of chains); more
precisely, for every cell pattern Ck in SCP but the first two, and for all i < k-1, add to
F2 as a conjunct the following formula:
¬same-cell(rk, ck, ri, ci); this expresses the fact that each cell Ck is different from all
the previous ones (as Ck-1 is already specified as being linked to Ck, it is not
necessary to repeat it); the result is a formula F3;
4) define F4 as the following formula: F3 => not-candidate(n, r, c);
5) define F5 as the universal closure of F4; i.e. F5 is obtained by enclosing F4 in the
scope of a universal quantifier for every unquantified variable appearing in F4;
6) if rn, cn and H are absent from the prefix (i.e. if, forgetting loops, the prefix is rc
or empty), then do nothing; if one or more of these symbols is present, then wait
until chapter XV to know what to do; in any case, the result is a formula F6;
7) the rule asserted by the chain rule pattern is expressed by formula F6.
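The procedure lends itself to a direct implementation. The following Python sketch is only one possible reading of it, restricted to the simplest situation: a starred chain pattern made of closed bivalue cell patterns joined by plain links, with an empty (or rc) prefix, so that step 6 has nothing to do. The function name, the triple representation of a cell pattern and the textual layout of the output are our assumptions:

```python
def xy_chain_rule_formula(cells, loops=False):
    """Build the rule formula for a starred xy-chain pattern.

    cells: one (left, right, starred) triple per closed bivalue cell
    pattern {left right}; all links are assumed to be plain links.
    """
    p = len(cells)
    conjuncts = []
    # Step 1: formula F1 associated with the starred chain pattern.
    for k, (left, right, _) in enumerate(cells, start=1):
        conjuncts.append(f"rc-bivalue(r{k}, c{k}, n{left}, n{right})")
        if k < p:
            conjuncts.append(f"share-a-unit(r{k}, c{k}, r{k + 1}, c{k + 1})")
    for k, (_, _, starred) in enumerate(cells, start=1):
        if starred:
            conjuncts.append(f"share-a-unit(r, c, r{k}, c{k})")
        else:
            conjuncts.append(f"¬same-cell(r, c, r{k}, c{k})")
    # Step 2: the default target value is n1.
    conjuncts.append("n=n1")
    # Step 3: no-loop condition, for every k >= 3 and every i < k-1.
    if not loops:
        for k in range(3, p + 1):
            for i in range(1, k - 1):
                conjuncts.append(f"¬same-cell(r{k}, c{k}, r{i}, c{i})")
    # Step 4: implication towards the elimination.
    body = " &\n  ".join(conjuncts)
    f4 = "{ " + body + "\n  =>\n  not-candidate(n, r, c) }"
    # Step 5: universal closure over all the variables used.
    numbers = sorted({v for left, right, _ in cells for v in (left, right)})
    names = [f"{x}{k}" for k in range(1, p + 1) for x in ("r", "c")]
    names += [f"n{v}" for v in numbers] + ["r", "c", "n"]
    return "".join("∀" + name for name in names) + "\n" + f4

# Starred xy3 pattern with n4 identified with n1: {1 2}*, {2 3}, {3 1}*.
xy3 = [(1, 2, True), (2, 3, False), (3, 1, True)]
print(xy_chain_rule_formula(xy3))
```

The order of the conjuncts, and of the arguments of share-a-unit, may differ from the displays in the text; this does not change the meaning of the formula.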
As our first example, consider the xy3 chain rule pattern (notice the difference
with the general xy3 chain pattern: for the pattern to correspond to a full xy3-chain,
variable n4 has to be the same as variable n1):
Applying the above procedure, we get the assertion associated with this pattern,
the xy3-chain rule (in this particular case, the no-loop condition reduces to nought):
∀r1∀c1∀r2∀c2∀r3∀c3∀n1∀n2∀n3∀r∀c∀n
{ rc-bivalue(r1, c1, n1, n2) &
share-a-unit(r2, c2, r1, c1) &
rc-bivalue(r2, c2, n2, n3) &
share-a-unit(r3, c3, r2, c2) &
rc-bivalue(r3, c3, n3, n1) &
¬same-cell(r3, c3, r1, c1) &
share-a-unit(r, c, r1, c1) &
share-a-unit(r, c, r3, c3) &
¬same-cell(r, c, r2, c2) &
n=n1
=>
not-candidate(n, r, c) }.
One can recognize the XY-Wing rule of section X.1.2 (except that in X.1.2, n1
was designated from the start as the target variable and there was no need for the
additional variable n).
|= 1*=(1)=1—1=(1)=1*
This stands for the assertion of the logical formula (which is the c4-chain rule):
∀r1∀c1∀r2∀c2∀r3∀c3∀r4∀c4∀n1∀r∀c∀n
{ ∃ut1 conjugate(n1, r1, c1, r2, c2, ut1) &
share-a-unit(r3, c3, r2, c2) &
¬same-cell(r3, c3, r1, c1) &
∃ut3 conjugate(n1, r3, c3, r4, c4, ut3) &
¬same-cell(r4, c4, r1, c1) &
¬same-cell(r4, c4, r2, c2) &
share-a-unit(r, c, r1, c1) &
share-a-unit(r, c, r4, c4) &
¬same-cell(r, c, r2, c2) &
¬same-cell(r, c, r3, c3) &
n=n1
=>
not-candidate(n, r, c) }.
Obviously, the graphical patterns are much more appealing than the logical
formulæ, even though they are strictly equivalent. In the sequel, all the chain rules
will be written in the graphical formalism introduced above (and some minor
extensions of it). This means that we shall be able to define chain rules without having to
explicitly write complex logical formulæ.
Chapter XIV
xy-chains
We consider xy-chains as the chains of the simplest kind and as the prototype for
all other chains. They have been defined in section XII.2.2: an xy-chain is a chain in
which:
– each cell has two non equal distinguished candidates, called the "left-linking
candidate" and the "right-linking candidate", and it has no other candidate;
– the left-linking candidate for each cell but the first is equal to the right-linking
candidate for the previous cell (therefore, for any two consecutive cells, the link
between them is actually a strict xy-link);
– any two cells in the sequence are different (i.e. there are no loops).
The first section of this chapter justifies the last condition by showing that one
need not consider xy-chains with (local or global) loops. This is very important in
practice since it considerably simplifies the search for xy-chains. The second section
lists all the xy-chain rules of length nine or less and analyses special cases of
xy-chains. With detailed examples, the third section illustrates the diversity of
situations one can encounter with xy-chains. Through these examples, although we do
not state them explicitly, many independence results are also proven, as explained in
the introduction.
Define an xy-loop as an xy-chain, but with the third condition replaced by the
following: the last cell is the same as the first (they therefore have the same two
candidates, but their left and right distinguished candidates may be different); this is
the broadest definition of an xy-loop one can give. Define an xy-loop target cell as
any cell linked to the unique endpoint.
Let there be an xy-loop of length p and let C1, C2, …, Cp be the sequence of
cells in the chain, with Cp = C1; let n1, n2, …, np be the sequence of left-linking
candidates for the cells in the chain, so that the right-linking candidates are n2, …,
np, np+1. Since Cp = C1, the content of the two cells is the same and the following set
equality is true: {n1, n2} = {np, np+1}.
Two cases must therefore be considered: either the right-linking candidate for
the last cell equals the left-linking candidate for the first cell (np+1 = n1 and np = n2)
and (conforming to our general notion of a full chain) we call the chain a full (or a
true) xy-loop or it is not the case (np = n1 and np+1 = n2) and we call the chain a
pseudo xy-loop.
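The distinction between the two kinds of loops depends only on the sequence of linking candidates, so it can be decided mechanically. The sketch below is in Python and purely illustrative; the function name and the encoding of a loop as a list of (left, right) pairs for C1, …, Cp are our assumptions:

```python
def classify_xy_loop(links):
    """Classify an xy-loop as 'true' or 'pseudo'.

    links: list of (left, right) linking candidates for the cells C1..Cp,
    where Cp is assumed to be the same actual cell as C1.
    """
    n1, n2 = links[0]
    np_, np1 = links[-1]
    if n1 == n2 or {n1, n2} != {np_, np1}:
        raise ValueError("first and last cells must hold the same two candidates")
    for (_, right), (left, _) in zip(links, links[1:]):
        if right != left:
            raise ValueError("consecutive cells must be xy-linked")
    # True loop: the right-linking candidate of the last cell equals the
    # left-linking candidate of the first one (np+1 = n1, hence np = n2).
    return "true" if np1 == n1 else "pseudo"

# Loop with cells {1 2}, {2 3}, {3 1}, {1 2}: first cell repeated as such.
print(classify_xy_loop([(1, 2), (2, 3), (3, 1), (1, 2)]))
# Loop with cells {1 2}, {2 3}, {3 4}, {4 2}, {2 1}: first cell reversed.
print(classify_xy_loop([(1, 2), (2, 3), (3, 4), (4, 2), (2, 1)]))
```

In the first example the last cell repeats the first one with the same orientation, so that np = n1; in the second one the orientation is reversed, so that np+1 = n1.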
Proof: suppose we have a pseudo xy-loop (np = n1 and np+1 = n2). In this case, the
loop allows no conclusion of xy type (i.e. it is not possible to conclude that either
C1 = n1 or Cp = n1): such a pseudo xy-loop is unproductive. For instance, in the
following example of such a loop: {1 2}—{2 3}—{3 1}—{1 2}, C1 can be either 1
or 2. Actually, a pseudo xy-loop is not a full xy-chain and we should therefore not
expect to have an associated rule. This case was considered only for further
reference.
Theorem XIV.2: the resolution rule that might be associated with a true xy-
loop is subsumed by BSRT together with rules for shorter xy-chains with no loops.
Proof: suppose we have a true xy-loop (np+1 = n1 and np = n2). Then we can view
the chain as a full xy-chain (except for the no loop condition), e.g.
{1 2}*—{2 3}—{3 4}—{4 2}—{2 1}*, and we may expect proper conclusions to
be deduced from it. Consider the subchain C2, …, Cp-1 obtained from the original
loop by forgetting the two (identical) endpoints (and their links to the rest of the
chain). Since cell Cp-1 has np = n2 as its right-linking value, this is an xy-chain of
length p-2 admitting number n2 as its target value and cell C1 as a target cell. The
general xy-chain rule for chains of length p-2 allows one to eliminate n2 from the
candidates for C1. Therefore C1 = n1. Since the target cells of the initial loop are all the
cells linked to C1 and Cp (= C1), the elementary constraints propagation rules (ECP)
allow one to eliminate n1 from the candidates for any target cell of the initial loop.
Finally, any conclusion that can be drawn from the existence of an xy-loop can
already be drawn from rules applying to xy-chains with no loops. According to our
general guiding principles, no specific rule should be formulated for xy-loops.
Figure 1. True (left) and pseudo (right) loop on the first cell of an xy-chain
Proof: consider a loop on the first cell of an xy-chain (by reversing the chain, the
case of the last cell is similar). The two types of xy-loops must be considered.
If it is a true xy-loop, then, as shown in section 1.1, BSRT together with rules for
shorter xy-chains should allow one to conclude that C1 = n1; it remains to apply an
elementary propagation rule to get TC ≠ n1.
If it is a pseudo xy-loop, then the rest of the chain is an xy-chain, with the same
target value and target cells as the original chain; and the loop can simply be
excised.
Theorem XIV.4: the resolution rules that might be associated with xy-chains
with internal loops are subsumed by BSRT together with rules for shorter xy-
chains with no loops.
Proof: consider a full xy-chain of this extended type, of length p, with cells C1,
…, Ci, …, Ck, …, Cp, C1 ≠ Cp, and suppose there is a loop between cells Ci and Ck
(1 < i < k < p). Let n1, …, ni, …, nk, …, np be the sequence of left candidates for the
cells in the chain, so that the sequence of right candidates is n2, …, ni+1, …, nk+1, …,
n1. Let TC be any target cell. For our internal loop, we must consider two cases:
The situation in this second case can be described as follows: in an xy-chain with
a loop between cells Ci and Ck, if the same number in cells Ci and Ck is used twice
as the right candidate (Ci+1 and Ck+1), then, as far as inference is concerned, the loop
can be excised. Such a loop is called unproductive.
Of course, this does not mean that you should not use global xy-loops when you
find them: they can help you find all the associated open xy-chains.
For each length n, a rule for xy-chains of length n has been proved in section
XII.2.3. The goals of the present section are:
– to list these xy-chain rules systematically up to length nine, using the graphical
conventions introduced in chapter XIII,
– for the simplest of them, to show their equivalence with previously defined
rules.
|= {1 2}*—{2 1}*
We leave it to the reader as an obvious exercise to write similar rules for longer
xy-chains. How far should we go? We have no general answer; let us say that we
have found (very few) puzzles that can be solved using rules for (extended)
xy-chains of length thirteen but cannot be solved with the set of rules in our
classification using only patterns with fewer than thirteen cells. But our set of rules is not
complete with respect to such patterns. Similarly, we have not tried to find
xy-chains of length greater than thirteen.
It is easily seen from the definition in chapter X that, when it is restricted to cells
C1, C2 and C3 with two different linking units such that there is no link of type u1
between cells C2 and C3, this rule is XY-Wing.
In the remaining case, when the rule is instantiated with cells C1, C2 and C3 with
the same linking unit (u2 = u1), then we have the conditions of a special case of
Naked-Triplets, because the three cells share this unit. As for the conclusions of the
xy3-rule in this case, they obviously cover all cases covered by NT in this unit. But
the conditions on xy3-chains are far from covering all possible cases of Naked-
Triplets in this unit (in NT, any of the three cells may actually contain any subset of
the three values); for instance, the two examples of Naked-Triplets given in chapter
VII, puzzles Royle17-11200 and Royle17-23317, cannot be solved in L2+XY-Wing
or even in L4_0 ∪{all the rules for xy-chains of length thirteen or less}. Rules for
Naked-Triplets should therefore not be eliminated in favour of the present one.
On the other hand, still in the case u2 = u1, the conclusions of the xy3-chain rule
may also apply to xy-target cells that are not concerned by the NT rule: any cell that
shares a unit other than u1 with C1 and C3 but not with C2. For instance, if C1, C2 and
C3 are in the same row, the links u1 and u2 are instantiated by this row and cells C1
and C3 are also in the same block, then all other cells in this block are target cells for
XY3 but not for NT.
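The extra target cells mentioned in this last case can be enumerated with a short sketch. It is written in Python for illustration only and is purely structural: it ignores the candidates and only looks at which cells share units. The positions chosen for C1, C2 and C3 and all function names are hypothetical:

```python
def block(r, c):
    """Block number (1..9) of the cell in row r, column c (both 1..9)."""
    return 3 * ((r - 1) // 3) + (c - 1) // 3 + 1

def share_a_unit(a, b):
    """Two different cells lying in a common row, column or block."""
    (r1, c1), (r2, c2) = a, b
    return a != b and (r1 == r2 or c1 == c2 or block(r1, c1) == block(r2, c2))

CELLS = [(r, c) for r in range(1, 10) for c in range(1, 10)]

def xy3_targets(c1, c2, c3):
    """Cells linked to both endpoints C1 and C3 and different from C2."""
    return {t for t in CELLS
            if share_a_unit(t, c1) and share_a_unit(t, c3) and t != c2}

def naked_triplet_targets_in_row(c1, c2, c3):
    """Other cells of the row shared by the three cells."""
    row = c1[0]
    assert c2[0] == row and c3[0] == row
    return {(row, c) for c in range(1, 10)} - {c1, c2, c3}

# C1, C2, C3 in row 1; C1 and C3 are also in the same block (block 1).
c1, c2, c3 = (1, 1), (1, 5), (1, 3)
extra = xy3_targets(c1, c2, c3) - naked_triplet_targets_in_row(c1, c2, c3)
print(sorted(extra))
```

With these positions, the cells found only by the xy3-chain rule are the six cells of block 1 lying in rows 2 and 3.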
tions of a special case of Naked-Quads. But, again, there may be extra target cells.
And, again, this is far from covering all possible cases of Naked-Quads (where any
additional subset of the four values can be present in any of the four cells);
therefore, rules for Naked-Quads should not be eliminated in favour of XY4. This is
the occasion to notice that the gap between the two rules may be better understood if
we remember that their proofs develop along very different lines: a global analysis
of the four cells and the four values for Naked-Quads, a sequential analysis of the
cells and values for xy-chains.
[Figure 2: puzzle Royle17-3766 (grids omitted)]
Puzzle Royle17-3766 (Figure 2) cannot be solved in L4_0 but its L4_0 elabora-
tion (which coincides with its L1 elaboration) is in L4_0+XY4 (and indeed in
L1_0+ XY4).
Resolution path in L4_0+XY4 for the L4_0 (or L1) elaboration of Royle17-3766:
xy4-chain {n6 n9}r9c6 – {n9 n6}r3c6 – {n6 n9}r1c4 – {n9 n6}r1c3 ==> r9c3 ≠ 6
… (Naked-Singles and Hidden-Singles)
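A line of a resolution path such as the xy4-chain above can be checked mechanically against the definition of a full xy-chain. The following Python sketch is illustrative only: the function names are ours, the parsing relies on the textual conventions visible in the line, only the first elimination listed after "==>" is examined, and the check is structural (it cannot verify that the cells are really bivalue in the grid):

```python
import re

def block(r, c):
    """Block number (1..9) of the cell in row r, column c (both 1..9)."""
    return 3 * ((r - 1) // 3) + (c - 1) // 3 + 1

def share_a_unit(a, b):
    """Two different cells lying in a common row, column or block."""
    (r1, c1), (r2, c2) = a, b
    return a != b and (r1 == r2 or c1 == c2 or block(r1, c1) == block(r2, c2))

def check_xy_chain_line(line):
    """Return True if the line describes a well-formed full xy-chain."""
    # Each cell is written {nLEFT nRIGHT}rROWcCOL.
    cells = [(int(left), int(right), (int(r), int(c)))
             for left, right, r, c
             in re.findall(r"\{n(\d) n(\d)\}r(\d)c(\d)", line)]
    r, c, n = map(int, re.search(r"==> r(\d)c(\d) ≠ (\d)", line).groups())
    target = (r, c)
    positions = [pos for _, _, pos in cells]
    # Consecutive cells are linked and share their linking candidate.
    links_ok = all(a[1] == b[0] and share_a_unit(a[2], b[2])
                   for a, b in zip(cells, cells[1:]))
    no_loop = len(set(positions)) == len(positions)
    # Full chain: the target value opens the first cell and closes the last.
    full = cells[0][0] == cells[-1][1] == n
    target_ok = (target not in positions
                 and share_a_unit(target, positions[0])
                 and share_a_unit(target, positions[-1]))
    return links_ok and no_loop and full and target_ok

line = ("xy4-chain {n6 n9}r9c6 – {n9 n6}r3c6 – {n6 n9}r1c4 – "
        "{n9 n6}r1c3 ==> r9c3 ≠ 6")
print(check_xy_chain_line(line))
```

Here the successive links are a column (c6), a block (b2) and a row (r1), and the target cell r9c3 shares row r9 with the first cell and column c3 with the last one.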
[Figure: puzzle grids omitted]
It is interesting to have a look at the partial resolution path leading from the
original 17-minimal puzzle (left-hand grid) to its L4 elaboration (central grid): it
already requires two applications of another xy-chain rule, XY4, which leads to the
addition of five new values.
[Figure: puzzle grids omitted]
At this point, the puzzle has been fully elaborated in L4_0+XY4, and, in
particular, the Naked Singles and Hidden Singles following the application of the
xy4-chain rule have completely integrated the action of this chain rule into the
values asserted. But the puzzle is not yet solved, although only 19 values are
missing. This L4_0+XY4 elaborated version becomes the starting point for the
resolution in L4+XY5. The second rule that applies to it is now XY5 and, after a
single application of it, the final solution is obtained using only Naked-Singles.
Moreover this puzzle can nearly be solved in L1_0: XY6 is used only once to
eliminate only one candidate. This illustrates a characteristic of most of the complex
rules, which might be very frustrating: their action is very limited if one measures it
by the number of candidates eliminated – but they nevertheless have a crucial
unblocking role. This suggests that the worth of a rule cannot be measured by the
(mean) number of candidates it eliminates.
[Figure: puzzle grids omitted]
Puzzle Royle17-35802 (Figure 6) is the only one in the Royle17 database that
cannot be solved in L6 but whose L6 elaboration (identical to its L3 or to its L2+
XYZ-Wing elaboration) can be solved in L6+XY7. As the resolution of the L6
elaboration uses rules other than XY7 (among which are two instances of XY4), this
case is more representative of reality than those we have to select in most of our
examples in order to keep their resolution path short.
xy4-chain {n7 n6}r1c6 – {n6 n5}r3c4 – {n5 n4}r3c3 – {n4 n7}r3c8 ==> r3c5 ≠ 7, r1c9 ≠ 7
block b3 interaction-with-row r3 ==> r3c2 ≠ 7
naked-pairs-in-a-column {n4 n9}{r1 r4}c9 ==> r8c9 ≠ 4, r3c9 ≠ 9
xy4-chain {n9 n4}r1c9 – {n4 n7}r3c8 – {n7 n3}r3c9 – {n3 n9}r3c2 ==> r1c2 ≠ 9
… (Naked-Singles)
[Figure 6: puzzle Royle17-35802 (grids omitted)]
Let us introduce the following notation: if X designates any of the chain types
considered in this book (i.e. X = XY or HXY or XYT or HXYT …), define Xj_k as
the set of rules for chains of type X and of length between j and k inclusive.
Generally j will be 4. Thus, XY4_9 stands for the set {XY4, XY5, XY6, XY7, XY8,
XY9}.
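For readers who prefer an operational definition, the notation can be expanded mechanically. The one-line Python function below is only an illustration; its name is ours:

```python
def rule_set(chain_type, j, k):
    """Names of the rules for chains of the given type, lengths j to k inclusive."""
    return [f"{chain_type}{length}" for length in range(j, k + 1)]

print(rule_set("XY", 4, 9))
```

The call shown expands XY4_9; any other chain type and pair of lengths can be used in the same way.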
Although hidden xy-chains and associated rules of type HXY will be defined
only in the next chapter, the example of puzzle Royle17-14259 (Figure 7), with
identical L4_0 and L1_0 elaborations, provides a very good motivation for
introducing them: without them, rules for xy-chains of lengths 4, 5, 8 and 9 are required,
whereas if HXY4 is allowed, only the simplest of these rules (XY4) will be needed.
This illustrates the fact that the maximum length of the chains of some type required
to solve a puzzle can (sometimes) be traded against the acceptance of hidden chains of
the same type.
Let us display the two resolution paths. Notice first that they start with the same
rules (rules from L2 that produce no value and whose results therefore do not appear
in the L4_0 elaboration); this is normal since rules of lower complexity are always
applied before rules of higher complexity:
;;; common part in L2 for the two resolution paths, in L4_0+XY4_9 and in L4_0+XY4+
HXY4, for the L4_0 (or L1_0) elaboration of Royle17-14259:
column c4 interaction-with-block b8 ==> r9c6 ≠ 6, r8c6 ≠ 6
row r9 interaction-with-block b8 ==> r8c6 ≠ 4, r8c6 ≠ 2, r8c4 ≠ 2
hidden-pairs-in-a-row {n3 n7}r1{c3 c6} ==> r1c6 ≠ 6
row r1 interaction-with-block b3 ==> r3c9 ≠ 6
naked-pairs-in-a-row {n2 n4}r3{c5 c9} ==> r3c6 ≠ 4, r3c6 ≠ 2, r3c2 ≠ 2
hidden-pairs-in-a-row {n3 n7}r1{c3 c6} ==> r1c6 ≠ 4, r1c6 ≠ 2, r1c3 ≠ 4, r1c3 ≠ 2
;;; end of the common part
[Figure 7: puzzle Royle17-14259 (grids omitted)]
Continuation of the resolution path in L4_0+XY4_9 for the L4_0 (or L1_0) elaboration of
Royle17-14259:
xy5-chain {n2 n4}r3c5 – {n4 n2}r3c9 – {n2 n8}r4c9 – {n8 n6}r9c9 – {n6 n2}r9c4 ==>
r9c5 ≠ 2
xy8-chain {n2 n4}r8c1 – {n4 n2}r1c1 – {n2 n6}r1c8 – {n6 n8}r8c8 – {n8 n3}r8c6 –
{n3 n7}r1c6 – {n7 n3}r1c3 – {n3 n2}r6c3 ==> r8c3 ≠ 2
xy9-chain {n2 n4}r1c1 – {n4 n2}r8c1 – {n2 n7}r8c2 – {n7 n6}r3c2 – {n6 n7}r3c6 –
{n7 n3}r1c6 – {n3 n8}r8c6 – {n8 n6}r8c8 – {n6 n2}r1c8 ==> r1c9 ≠ 2
xy5-chain {n8 n4}r9c5 – {n4 n2}r3c5 – {n2 n4}r3c9 – {n4 n6}r1c9 – {n6 n8}r9c9 ==>
r9c6 ≠ 8
xy5-chain {n4 n8}r9c5 – {n8 n6}r9c9 – {n6 n4}r1c9 – {n4 n2}r3c9 – {n2 n4}r3c5 ==>
r2c5 ≠ 4
xyz3-chain {n2 n4}r1c1 – {n4 n3}r2c3 – {n3 n2}r2c5 ==> r2c2 ≠ 2
hidden-pairs-in-a-block {n2 n4}{r1c1 r2c3} ==> r2c3 ≠ 3
xy4-chain {n7 n3}r1c6 – {n3 n2}r2c5 – {n2 n4}r2c3 – {n4 n7}r8c3 ==> r1c3 ≠ 7
… (Naked-Singles and Hidden-Singles)
Continuation of the resolution path in L4_0+XY4+HXY4 for the L4_0 (or L1_0) elaboration
of Royle17-14259:
hxy-cn4-chain {r2 r3}c6n6 – {r3 r1}c6n7 – {r1 r8}c3n7 – {r8 r2}c3n4 ==> r2c6 ≠ 4
hidden-single-in-a-column ==> r9c6 = 4
xy4-chain {n8 n2}r4c9 – {n2 n4}r3c9 – {n4 n2}r3c5 – {n2 n8}r9c5 ==> r9c9 ≠ 8
… (Naked-Singles)
For another example of a trade of a similar kind, between the lengths and the
types of the chains required to reach the solution, consider puzzle Royle17-17265
(Figure 8). Its L4_0 elaboration (central grid) coincides with its L2 elaboration. It
can be solved either in L4_0+XY4_11, i.e. using only chains of type xy (and of
lengths between 4 and 11) in addition to rules in L4_0, or in L5, using chains of the
more complex types hxy and xyt (which will be introduced in chapters XV and
XVII respectively). This is also the longest xy-chain (length 11) we have found in the
Royle17 database (considering that L4_0 should be fully applied before any chain of
any type and of length greater than 4 is looked for).
[Figure 8: puzzle Royle17-17265 (grids omitted)]
As was the case in the previous example, the two resolution paths start with the
same rules:
;;; common part in L2 for the two resolution paths, in L4_0+XY4_11 and in L5, for the L4_0
(or L2) elaboration of Royle17-17265:
column c4 interaction-with-block b8 ==> r9c5 ≠ 1, r8c5 ≠ 1
naked-pairs-in-a-column {n3 n9}{r5 r7}c5 ==> r9c5 ≠ 3, r8c5 ≠ 9, r3c5 ≠ 3, r2c5 ≠ 3
block b2 interaction-with-column c6 ==> r9c6 ≠ 3
block b2 interaction-with-column c6 ==> r5c6 ≠ 3
hidden-pairs-in-a-column {n2 n4}{r3 r8}c3 ==> r8c3 ≠ 9, r3c3 ≠ 9, r3c3 ≠ 7
hidden-pairs-in-a-row {n2 n4}r3{c2 c3} ==> r3c2 ≠ 9, r3c2 ≠ 7
xy-chains 221
Continuation of the resolution path in L4_0+XY4_11 for the L4_0 (or L2) elaboration of
Royle17-17265:
xy11-chain {n3 n9}r4c1 – {n9 n1}r3c1 – {n1 n7}r3c5 – {n7 n3}r3c6 – {n3 n6}r2c6 –
{n6 n4}r6c6 – {n4 n5}r9c6 – {n5 n9}r5c6 – {n9 n3}r5c5 – {n3 n9}r7c5 – {n9 n3}r7c3 ==>
r9c1 ≠ 3
naked and hidden singles ==> r7c3 = 3, r7c5 = 9, r5c5 = 3, r5c4 = 5, r8c4 = 1, r9c4 = 3,
r5c6 = 9, r4c1 = 3
xy7-chain {n1 n7}r3c5 – {n7 n3}r3c6 – {n3 n6}r2c6 – {n6 n4}r6c6 – {n4 n5}r9c6 –
{n5 n8}r9c9 – {n8 n1}r9c1 ==> r3c1 ≠ 1
… (Naked-Singles)
2) Resolution path in L5 using chains of more complex types (hxy and xyt):
Continuation of the resolution path in L5 for the L4_0 (or L2) elaboration of Royle17-17265:
xyt4-chain {n1 n5}r8c4 – {n5 n4}r9c6 – {n4 n2}r9c5 – {n2 n1}r9c2 ==> r9c4 ≠ 1
hidden-single-in-a-block ==> r8c4 = 1
hxy-cn4-chain {r9 r8}c5n2 – {r8 r3}c5n7 – {r3 r2}c5n1 – {r2 r9}c2n1 ==> r9c2 ≠ 2
hidden-single-in-a-row ==> r9c5 = 2
hxy-rn5-chain {c2 c5}r2n1 – {c5 c6}r2n6 – {c6 c5}r6n6 – {c5 c6}r6n4 – {c6 c2}r9n4 ==>
r9c2 ≠ 1
… (Naked-Singles)
Now comes the longest xy-chain we have found in our three databases (still
considering that L4_0 should be fully applied before any chain of any type and of
length greater than four is looked for). Indeed, as this is the unique xy-chain of
length thirteen in these databases, we have not searched for longer ones (except in
Royle17, where we have gone up to length sixteen).
;;; common part, in L2 for the two resolution paths, in L4_0+XY4_11 and in L4_0+XY4+
HXY4+C4, for the L4_0+XY4+HXY4+C4 (or L1) elaboration of Sudogen17-3403:
row r6 interaction-with-block b5 ==> r4c4 ≠ 8
row r3 interaction-with-block b3 ==> r1c9 ≠ 7
column c6 interaction-with-block b5 ==> r6c5 ≠ 7
column c4 interaction-with-block b5 ==> r6c6 ≠ 6, r4c6 ≠ 6
row r6 interaction-with-block b5 ==> r5c4 ≠ 2
block b2 interaction-with-column c6 ==> r6c6 ≠ 2
block b7 interaction-with-column c1 ==> r6c1 ≠ 1, r5c1 ≠ 1
block b4 interaction-with-row r6 ==> r6c4 ≠ 1
block b7 interaction-with-column c1 ==> r1c1 ≠ 1
naked-pairs-in-a-column {n1 n6}{r4 r5}c4 ==> r6c4 ≠ 6
row r6 interaction-with-block b4 ==> r5c1 ≠ 6
;;; end of the common part
9 5 4 8 3 9 5 4 8 6 7 3 9 5 2 4 8 1
8 7 1 8 4 7 1 3 8 4 9 7 1 6 2 3 5
2 8 6 2 5 3 4 8 6 2 5 1 3 4 8 7 6 9
2 3 2 4 9 3 2 4 6 9 7 5 1 8
9 8 3 9 8 3 4 5 9 8 1 3 4 6 2 7
3 4 3 9 4 7 1 6 2 8 5 3 9 4
3 7 6 5 3 7 6 9 5 4 3 7 8 6 9 1 5 2
9 1 9 6 2 5 1 4 3 9 6 2 5 7 1 8 4 3
5 6 8 5 3 9 6 1 8 5 4 2 3 9 7 6
2) Resolution path in L4 using shorter chains of more complex types (hxy and
xyt):
;;; continuation of the resolution path in L4_0+XY4+HXY4+XYT4 for the L4_0+XY4+
HXY4+C4 (or L1) elaboration of Sudogen17-3403:
hxy-rn4-chain {c9 c7}r4n8 – {c7 c5}r8n8 – {c5 c7}r8n7 – {c7 c9}r3n7 ==> r4c9 ≠ 7
xyt4-chain {n2 n1}r1c9 – {n1 n7}r3c7 – {n7 n8}r8c7 – {n8 n2}r7c9 ==> r5c9 ≠ 2
xyz3-chain {n7 n1}r4c8 – {n1 n5}r5c9 – {n5 n7}r5c1 ==> r5c8 ≠ 7
xyz3-chain {n7 n1}r4c8 – {n1 n5}r5c9 – {n5 n7}r5c1 ==> r5c7 ≠ 7
xyt4-chain {n2 n1}r1c9 – {n1 n7}r3c7 – {n7 n8}r8c7 – {n8 n2}r7c9 ==> r2c9 ≠ 2
xy4-chain {n5 n7}r5c1 – {n7 n6}r1c1 – {n6 n9}r2c3 – {n9 n5}r2c9 ==> r5c9 ≠ 5
naked-pairs-in-a-block {n1 n7}{r4c8 r5c9} ==> r5c8 ≠ 1
naked-single ==> r5c8 = 2
row r9 interaction-with-block b8 ==> r7c4 ≠ 2
naked-pairs-in-a-block {n1 n7}{r4c8 r5c9} ==> r5c7 ≠ 1, r4c9 ≠ 1, r4c7 ≠ 7, r4c7 ≠ 1
xy4-chain {n1 n2}r1c9 – {n2 n5}r2c7 – {n5 n6}r5c7 – {n6 n1}r5c4 ==> r5c9 ≠ 1
… (Naked-Singles)
Chapter XV
This chapter is probably the strongest illustration of hidden structures and of the
strength of meta-theorem 3. The xy-chain rules defined and studied in the previous
chapters have their "hidden" counterparts in row-number and column-number spaces.
Roughly speaking, a hxy-chain is defined as and looks like an xy-chain, but in
rn- or cn- space – except that there are no links along 3x3 pseudo-blocks in these
spaces; and the eliminations it allows in rn- or cn- space are similar to those allowed
in rc-space by xy-chains. Moreover, the "super-hidden" counterparts of xy-chains
are identical to their "hidden" counterparts. This will also be the case for the xyt-
and xyzt- chain rules defined in the forthcoming chapters. As for the c-chains, it will
be easy to see that there are no hidden c-chains (see chapter XVI).
Let us consider xyk-chain*, the pattern for a full xy-chain of length k. The XYk
rule asserts that the universal closure of the following formula is valid:
"xyk-chain* => not-candidate(n, r, c)".
Proof: It is enough to check that the starred chain pattern xyk-chain* is block-
positive; and this can be done easily from its definition: the only non block-free
predicates it contains are "share-a-unit" and they never appear in the scope of a
negation.
As usual, the validity of these rules can also be proven directly. We leave this as
an easy exercise for the reader.
XV.1.2. Formula associated to a chain rule pattern with rn, cn or H in its prefix
This is equivalent to saying that, in the chain rule patterns with rn (respectively
cn) prefixes:
– cell patterns should be interpreted as describing the content of rn-cells
(respectively cn-cells) in rn-space (respectively cn-space), i.e. the set of candidate
columns (respectively rows) for rn-cells (respectively cn-cells); notice that the rn- or cn-
in the name of the rules points to the space in which the chains live, not to the (Scn or
Srn) transform by which they are obtained;
– link patterns should be interpreted as predicates "rn-connected" (respectively
"cn-connected") instead of "share-a-unit";
– target cells should be understood as rn-cells (respectively cn-cells) and their
links to the starred cells in the chain pattern should be considered as meaning "rn-
connected" (respectively "cn-connected").
Hidden xy-chains (hxy-chains) 227
This also means that we can easily describe how the instantiations of the patterns
corresponding to the above rules look, at least if we do this in the appropriate
spaces. In rn-space, the hxy-rnk pattern looks like an xyk pattern (with no links along
blocks) would in rc-space. And, in cn-space, the hxy-cnk pattern also looks like an
xyk pattern (with no links along blocks) would in rc-space.
This is why, in full compatibility with the previous formal definitions (using the
Scn and Srn transforms), such patterns are called hidden xy-chains.
With the previous definitions, and just to make things more concrete, let us write
the first hxy-chain rules. Remember that cells are now rn-cells (respectively cn-
cells), with content candidate columns (resp. candidate rows), while two rn-cells
(resp. cn-cells) are linked in rn-space (resp. cn-space) if and only if they share a row
or a number (resp. a column or a number).
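To make the rn- and cn- cells concrete, here is a small sketch of how a set of candidates can be regrouped into the two hidden spaces. The representation (triples (n, r, c) and Python dictionaries) and the function names are assumptions made for this illustration; we also assume that a cell is not considered as linked to itself.

```python
from collections import defaultdict

def to_rn_space(candidates):
    """Group candidates (n, r, c) by rn-cell: {(r, n): set of candidate columns}."""
    rn = defaultdict(set)
    for n, r, c in candidates:
        rn[(r, n)].add(c)
    return dict(rn)

def to_cn_space(candidates):
    """Group candidates (n, r, c) by cn-cell: {(c, n): set of candidate rows}."""
    cn = defaultdict(set)
    for n, r, c in candidates:
        cn[(c, n)].add(r)
    return dict(cn)

def rn_connected(a, b):
    """Two different rn-cells (r, n) are linked iff they share a row or a number."""
    return a != b and (a[0] == b[0] or a[1] == b[1])

def cn_connected(a, b):
    """Two different cn-cells (c, n) are linked iff they share a column or a number."""
    return a != b and (a[0] == b[0] or a[1] == b[1])

# Hypothetical candidates: number 9 in column 7 may go to rows 1, 3 or 6,
# and number 4 in row 3 may only go to column 7.
cands = {(9, 1, 7), (9, 3, 7), (9, 6, 7), (4, 3, 7)}
rn = to_rn_space(cands)
cn = to_cn_space(cands)
```

With this data, the cn-cell c7n9 contains the three candidate rows r1, r3 and r6, while each of the rn-cells r1n9, r3n9 and r6n9 contains the single candidate column c7.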
The general pattern should now be clear and we leave it to the reader as an easy
(and tedious) exercise to write rules for hxy-chains of longer lengths. How far
should we go? We could repeat the remarks we made for xy-chains in section
XIV.2.1. In our SudoRules solver, hxy-chains, like xy-chains, have been implemented
up to length thirteen.
∀r1∀c1∀r2∀c2∀r3∀c3∀n1∀n2∀n3∀r∀c∀n
{ rc-bivalue(r1, c1, n1, n2) &
share-a-unit(r2, c2, r1, c1) &
rc-bivalue(r2, c2, n2, n3) &
share-a-unit(r3, c3, r2, c2) &
rc-bivalue(r3, c3, n3, n1) &
¬same-cell(r3, c3, r1, c1) &
share-a-unit(r, c, r1, c1) &
share-a-unit(r, c, r3, c3) &
¬same-cell(r, c, r2, c2) &
n=n1
=>
not-candidate(n, r, c) },
∀r1∀c1∀r2∀c2∀r3∀c3∀n1∀n2∀n3∀r∀c∀n
{ rc-bivalue(r1, c1, n1, n2) &
rc-connected(r2, c2, r1, c1) &
rc-bivalue(r2, c2, n2, n3) &
rc-connected(r3, c3, r2, c2) &
rc-bivalue(r3, c3, n3, n1) &
¬same-cell(r3, c3, r1, c1) &
As was said for the general XYk case, this is a specialisation of XY3; it is
obviously valid in LS and, by meta-theorem 3, we can assert the validity of its Scn
and Srn transforms. Let us write the Scn transform, the HXY-rn3 rule:
∀r1∀n1∀r2∀n2∀r3∀n3∀c1∀c2∀c3∀r∀n∀c
{ rn-bivalue(r1, n1, c1, c2) &
rn-connected(r2, n2, r1, n1) &
rn-bivalue(r2, n2, c2, c3) &
rn-connected(r3, n3, r2, n2) &
rn-bivalue(r3, n3, c3, c1) &
¬same-rn-cell(r3, n3, r1, n1) &
rn-connected(r, n, r1, n1) &
rn-connected(r, n, r3, n3) &
¬same-rn-cell(r, n, r2, n2) &
c=c1
=>
not-candidate(n, r, c) }.
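As an illustration, the formula above can be turned into a brute-force search. This is only a sketch under stated assumptions: the rn-space is given as a dictionary {(r, n): set of candidate columns}, "rn-connected" is read as "two different rn-cells sharing a row or a number", and the function name is ours. It is not meant to reflect how SudoRules implements the rule.

```python
from itertools import permutations

def rn_connected(a, b):
    """Assumed reading: two different rn-cells (r, n) share a row or a number."""
    return a != b and (a[0] == b[0] or a[1] == b[1])

def hxy_rn3_eliminations(rn):
    """Return the candidates (n, r, c) deleted by the HXY-rn3 formula above."""
    bivalue = [cell for cell, cols in rn.items() if len(cols) == 2]
    eliminated = set()
    for x1, x2, x3 in permutations(bivalue, 3):
        if not (rn_connected(x2, x1) and rn_connected(x3, x2)):
            continue
        for c1 in rn[x1]:
            (c2,) = rn[x1] - {c1}          # rn-bivalue(r1, n1, c1, c2)
            if c2 not in rn[x2]:
                continue
            (c3,) = rn[x2] - {c2}          # rn-bivalue(r2, n2, c2, c3)
            if rn[x3] != {c3, c1}:         # rn-bivalue(r3, n3, c3, c1)
                continue
            for target, cols in rn.items():
                if (target not in (x1, x2, x3)
                        and rn_connected(target, x1)
                        and rn_connected(target, x3)
                        and c1 in cols):
                    r, n = target
                    eliminated.add((n, r, c1))
    return eliminated

# Hypothetical rn-space: three bivalue rn-cells and one target rn-cell r5n1.
rn = {
    (1, 1): {1, 2},     # number 1 in row 1: column 1 or 2
    (1, 2): {2, 3},     # number 2 in row 1: column 2 or 3
    (5, 2): {3, 1},     # number 2 in row 5: column 3 or 1
    (5, 1): {1, 4, 7},  # number 1 in row 5: column 1, 4 or 7
}
result = hxy_rn3_eliminations(rn)
```

On this data the only elimination is column c1 from rn-cell r5n1, i.e. r5c1 ≠ 1. This can be checked by hand: if r5c1 were 1, then 1 in row 1 would be in c2, 2 in row 1 would be in c3, and 2 in row 5 would have to be in c1.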
We already know that, for any k, in the XYk rule, the two rc-space coordinates
play symmetrical roles and that Src(XYk) = XYk. This identity obviously remains
true under the BF transform. Under the Scn (respectively Srn) transform, it gets
transposed into rn- (resp. cn-) space, where the symmetry exchanging the two
coordinates is Srn (resp. Scn); this means that:
Srn(HXY-rnk) = HXY-rnk and Scn(HXY-cnk) = HXY-cnk.
As a consequence, xy- and hxy- chain rules are related as described in Figure 1.
Of course, this can also be checked for any k, on the logical formulation of the
rules: in rule HXY-rnk, the two rn-space coordinates (i.e. all the variables of sorts
Row and Number) play symmetrical roles (you can check this on the above example
HXY-rn3); similarly, for any k, in the HXY-cnk rule, the two cn-space coordinates
(i.e. all the variables of sorts Column and Number) play symmetrical roles.
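The way a coordinate symmetry is transposed from one space to another can be checked on candidates themselves. In the sketch below, a candidate is represented as a triple (r, c, n) and each transform as the corresponding exchange of coordinates; these representations and the names s_rc, s_cn, s_rn are assumptions made for the illustration.

```python
from itertools import product

def s_rc(cand):
    """Exchange the row and column coordinates."""
    r, c, n = cand
    return (c, r, n)

def s_cn(cand):
    """Exchange the column and number coordinates."""
    r, c, n = cand
    return (r, n, c)

def s_rn(cand):
    """Exchange the row and number coordinates."""
    r, c, n = cand
    return (n, c, r)

def conjugate(outer, inner):
    """Return the transform 'inner' as seen through 'outer': outer . inner . outer."""
    return lambda cand: outer(inner(outer(cand)))

triples = list(product(range(1, 10), repeat=3))

# Seen through Scn, the rc-symmetry becomes the rn-symmetry ...
src_in_rn_space = all(conjugate(s_cn, s_rc)(t) == s_rn(t) for t in triples)
# ... and seen through Srn, it becomes the cn-symmetry.
src_in_cn_space = all(conjugate(s_rn, s_rc)(t) == s_cn(t) for t in triples)
```

Both checks succeed, which matches the remark that the variables of sorts Row and Number play symmetrical roles in HXY-rnk, and those of sorts Column and Number in HXY-cnk.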
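[Figure 1. Diagram relating the xy- and hxy- chain rules (XYk, HXY-rnk, HXY-cnk) through the Srn and Scn transforms; not reproduced.]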
If we admit that xy-chains constitute chains of the simplest kind, hxy-chains,
which are their supersymmetric version, should be granted the same logical (if not
psychological) simplicity, according to our general guiding principles.
The graphical representations of hidden chains we are giving in this section are
not representations of rule patterns in the sense of chapter XIII. Their purpose is
only to give an intuitive visual description of how these chains appear in rn- and rc-
spaces. Let us adopt the following convention: a simple bar represents a link (along
the indicated coordinate in rn- or rc- space) and a double bar (in rc-space) represents
a conjugacy link (for the value indicated between parentheses and along the indicated
coordinate in rc-space). Open and closed sets of candidates are displayed as for
chain patterns, whereas the X symbol inside a circle indicates a target cell. The star
on the right of a value indicates that it is the target value.
For the HXY-rn3 rule, apart from reversing the order of the cells in the chain,
there is a priori only one possibility that does not reduce to Hidden Triplets or to
Swordfish. (But remember that we have found no case of this possibility and we
conjecture that it is subsumed by simpler rules).
[Diagram not reproduced. In rn-space, the chain cells contain {12}, {23} and {31}; the target cell contains 1*.]
Figure 2. The typical hxy-rn3-chain seen in rn-space (left) and in rc-space (right)
What can we conclude from this review of hxy-rn4 chains? First, this notion
unifies what would otherwise be considered as very different combinations of c-
chains with several c-linking values (a family of patterns called Nice Loops).
Second, it also shows that such combinations are logically much simpler than one
would think if one looked at them only in rc-space:
– in case 1, three c-chains on a total of six rc-cells: C2(1), C2(2) and C4(3);
– in case 2, two c-chains on a total of seven rc-cells: C2(1) and C6(2);
– in case 3, three c-chains on a total of six rc-cells: C2(1), C4(2) and C2(3);
– in case 4, two c-chains on a total of seven rc-cells: C4(1) and C4(2).
[Diagram not reproduced.]
Figure 3. The typical hxy-rn4-chain, case 1, seen in rn-space (left) and in rc-space (right)
[Diagram not reproduced. In rn-space, the chain cells contain {12}, {23}, {34} and {41}; the target cell contains 1*.]
Figure 4. The typical hxy-rn4-chain, case 2, seen in rn-space (left) and in rc-space (right)
Notice nevertheless that combinations of c-chains thus obtained are very specific:
they do not mix c-links along different unit-types (as AICs or Nice Loops would
do); to a hxy-rn-chain (respectively to a hxy-cn-chain), there correspond only
combinations of c-chains where all the c-links are along rows (resp. along columns).
hxy-chains do not exempt us from looking for complex combinations of c-chains,
but, at least, they exempt us from looking for some combinations of them that
should be considered as less complex.
[Diagram not reproduced. In rn-space, the chain cells contain {12}, {23}, {34} and {41}; the target cell contains 1*.]
Figure 5. The typical hxy-rn4-chain, case 3, seen in rn-space (left) and in rc-space (right)
[Diagram not reproduced. In rn-space, the chain cells contain {12}, {23}, {34} and {41}; the target cell contains 1*.]
Figure 6. The typical hxy-rn4-chain, case 4, seen in rn-space (left) and in rc-space (right)
With our first example, puzzle Royle17-211 (Figure 7), a hxy-cn4-chain (of type
2 in the above classification) can be seen immediately after three simple Interaction
rules have eliminated four candidates from the L4_0+XY4 elaboration (which is
equal to the L1 elaboration). Notice how the nrc-notation for these chains is a simple
adaptation of the notation for xy-chains, by merely permuting the nrc symbols.
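The permutation of symbols can be made explicit on the notation itself. The following sketch rewrites an xy-chain written in rc-notation into the corresponding rn- or cn- notation; the function names are ours, and the chain used in the example is a made-up pattern, not one taken from a puzzle.

```python
import re

# One rc-cell of an xy-chain, e.g. "{n1 n2}r4c5".
RC_CELL = re.compile(r"\{n(\d) n(\d)\}r(\d)c(\d)")

def to_rn_notation(xy_chain):
    """Exchange the symbols c and n: the result is written in rn-cells."""
    return RC_CELL.sub(
        lambda m: "{{c{0} c{1}}}r{2}n{3}".format(*m.groups()), xy_chain
    )

def to_cn_notation(xy_chain):
    """Exchange the symbols r and n: the result is written in cn-cells."""
    return RC_CELL.sub(
        lambda m: "{{r{0} r{1}}}c{3}n{2}".format(*m.groups()), xy_chain
    )

pattern = "{n1 n2}r4c5 – {n2 n1}r4c9"
as_rn = to_rn_notation(pattern)
as_cn = to_cn_notation(pattern)
```

For this pattern, `as_rn` is "{c1 c2}r4n5 – {c2 c1}r4n9" and `as_cn` is "{r1 r2}c5n4 – {r2 r1}c9n4", which have the same shape as the hxy-rn and hxy-cn chains displayed in the resolution paths.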
3 1 4 6 5 8 3 1 4 6 5 8 7 2 9 3 1
8 4 9 8 1 4 3 6 9 8 1 5 4 3 6 2 7
7 2 7 3 1 6 8 4 2 7 3 9 1 6 5 8 4
1 6 3 7 1 6 3 5 4 7 8 1 2 6 3 9 5 4 7 8
3 3 8 4 6 7 1 3 9 8 4 6 7 1 5 2
8 7 4 8 1 3 6 7 5 4 2 8 1 3 6 9
5 4 8 5 4 3 8 1 6 5 4 2 7 3 9 8 1 6
6 2 8 1 6 4 2 3 8 1 7 6 5 4 2 9 3
1 6 3 1 8 4 6 3 9 1 2 8 7 4 5
Resolution path in L4_0+XY4+HXY4 for the L4_0+XY4 (or L1) elaboration of Royle17-
211:
row r1 interaction-with-block b2 ==> r2c4 ≠ 2
block b3 interaction-with-column c7 ==> r9c7 ≠ 9
block b9 interaction-with-row r9 ==> r9c5 ≠ 7, r9c3 ≠ 7
hxy-cn4-chain {r1 r3}c7n9 – {r3 r9}c7n5 – {r9 r8}c5n5 – {r8 r1}c5n7 ==> r1c7 ≠ 7
… (Naked-Singles)
This elementary example can be used to illustrate the benefits of the cn-space.
Let us consider various representations of the knowledge state just before the hxy-
cn4-rule is applied. For completeness, let us first notice that it has nothing particularly
appealing in the standard rc-representation (Figure 8), which contains no useful
xy-chain.
Things are very different with the cn-representation of the same knowledge state.
A hxy-cn4-chain {r1 r3}c7n9 – {r3 r9}c7n5 – {r9 r8}c5n5 – {r8 r1}c5n7 immediately
appears, and the hxy-cn4-chain rule can now be used to eliminate the (row)
candidate r1 from the hxy-cn4-target cell c7n7. Let us use the extended Sudoku
board with the explicit n, c, and r symbols in the rc-, rn- and cn- cells.
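The chain just quoted can be checked in cn-space in the same way as an xy-chain is checked in rc-space. The sketch below is only an illustration: the function names are ours, and "cn-connected" is read as "two different cn-cells sharing a column or a number". It returns the eliminations allowed by the geometry of the chain; whether each candidate is still present depends on the grid.

```python
import re

# One cn-cell of a hxy-cn-chain, e.g. "{r1 r3}c7n9".
CN_CELL = re.compile(r"\{r(\d) r(\d)\}c(\d)n(\d)")

def cn_connected(a, b):
    """Assumed reading: two different cn-cells (c, n) share a column or a number."""
    return a != b and (a[0] == b[0] or a[1] == b[1])

def hxy_cn_eliminations(text):
    """Check a hxy-cn-chain and return the candidates (n, r, c) it may eliminate."""
    chain = [tuple(int(g) for g in m.groups()) for m in CN_CELL.finditer(text)]
    cells = [(c, n) for _, _, c, n in chain]
    if len(chain) < 2 or len(set(cells)) != len(cells):
        raise ValueError("a chain needs at least two pairwise different cn-cells")
    for (_, right, c1, n1), (left, _, c2, n2) in zip(chain, chain[1:]):
        if right != left or not cn_connected((c1, n1), (c2, n2)):
            raise ValueError("consecutive cn-cells must be linked and share a row")
    row = chain[0][0]
    if chain[-1][1] != row:
        raise ValueError("the last right row must equal the first left row")
    return {
        (n, row, c)
        for c in range(1, 10)
        for n in range(1, 10)
        if (c, n) not in cells
        and cn_connected((c, n), cells[0])
        and cn_connected((c, n), cells[-1])
    }

eliminations = hxy_cn_eliminations(
    "{r1 r3}c7n9 – {r3 r9}c7n5 – {r9 r8}c5n5 – {r8 r1}c5n7"
)
```

The result contains `(7, 1, 7)`, i.e. the elimination of row r1 from cn-cell c7n7, which is r1c7 ≠ 7 in rc-notation.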
c1 c2 c3 c4 c5 c6 c7 c8 c9
n2 n2
r1 4 6 5 8 3 1
n7 n9 n9 n7 n9
n2 n2
r2 9 8 1 n5 4 3 6 n5 n5
n7 n7
r3 2 7 3 n5 1 6 n5 8 4 r3
n9 n9
n2 n2
r4 1 6 3 5 4 7 8 r4
n9 n9
n2 n2 n2
r5 3 n5 8 4 6 7 1 n5 n5 r5
n9 n9 n9
n2 n2 n2
r6 7 n5 4 8 1 3 6 n5
n9 n9 n9
n2 n2
r7 5 4 n5 3 8 1 6
n7 n9 n7 n9 n9
r8 8 1 6 n5 4 2 n5 3 r8
n7 n9 n7 n9 n9
n2 n2 n2
r9 6 3 1 n5 8 n5 4 n5 r9
n9 n9 n7 n7 n9
c1 c2 c3 c4 c5 c6 c7 c8 c9
n1 4 8 2 9 3 6 5 7 1 n1
r2 r1 r1 r2 r2
n2 3 r4 r5 r6 r6 r4 8 r5 r5 r6
r7 r9 r7 r9 r7 r9
n3 5 9 3 4 7 2 6 1 8 n3
n4 1 7 6 5 2 8 4 9 3 n4
r2 r3 r3 r2 r2
n5 7 r5 r6 1 4 r5 r5 r6
r7 r8 r9 r9 r8 r9
n6 9 1 4 8 5 3 2 6 7 n6
r2 r1 r1 r2
n7 6 3 5 4
r7 r8 r7 r8 r9 r9
n8 8 2 5 1 6 9 7 3 4 n8
r3 r1 r1 r1 r3 r2
n9 2 r4 r5 r6 r6 r4 r5 r5 r6 n9
r7 r8 r9 r7 r8 r9 r7 r8 r9
c1 c2 c3 c4 c5 c6 c7 c8 c9
Figure 8. Puzzle Royle17-211, in rc- and cn- spaces, just before HXY4 is applied
Our second example, puzzle Royle17-619 (Figure 9), requires a simple combination
of xy4 and hxy4 chains. The L4_0+XY4 elaboration process uses the XY4
rule but this does not lead to the addition of any new value and the L4_0+XY4
elaboration is equal to the L1_0 elaboration. Its resolution path starts with XY4,
which, after an Interaction, is followed by HXY-cn4 (of type 4 in the above
classification), the sequel being in L1_0.
8 1 3 5 4 7 8 1 3 5 4 6 7 2 9 8 1
2 3 2 1 8 3 7 9 2 1 8 3 4 7 5 6
7 8 1 3 2 6 7 8 5 1 9 4 3 2
6 3 2 4 8 5 9 6 1 3 2 7 4 8 5 9 6 1 3 2 7
7 4 5 7 4 5 1 8 7 9 2 4 5 3 1 6 8
1 1 8 7 1 3 6 2 8 7 5 9 4
5 7 8 5 1 9 7 2 8 3 5 1 9 7 2 8 6 4 3
6 2 8 6 7 4 2 1 9 8 6 7 3 4 5 2 1 9
1 2 4 3 1 9 6 8 7 5 2 4 3 1 9 6 8 7 5
Resolution path in L4_0+XY4+HXY4 for the L4_0+XY4 (or L1_0) elaboration of puzzle
Royle17-619:
xy4-chain {n9 n3}r6c2 – {n3 n2}r6c4 – {n2 n6}r1c4 – {n6 n9}r1c7 ==> r6c7 ≠ 9
column c7 interaction-with-block b3 ==> r2c8 ≠ 9
hxy-cn4-chain {r1 r3}c4n6 – {r3 r2}c1n6 – {r2 r3}c1n9 – {r3 r1}c7n9 ==> r1c7 ≠ 6
… (Naked-Singles)
Notice that the hxy-cn4-chain was not present at the start. The content of cn-cell
c7n9 was {r3, r1, r6}. r6 is eliminated by the xy4 rule, leading to the appearance of
the hxy-cn4 pattern. The Interaction rule is nevertheless applied before the hxy-cn4
rule because it is simpler. It does not change the content of the cn-cells in the
hxy-cn4-chain.
Our third example, puzzle Royle17-520 (Figure 10), also combines an xy4 chain
and a hxy-cn4 chain (of type 1 in the above classification) but the two chains now
live at the same time on the grid. After four simple Interaction rules have eliminated
six candidates from the L4_0+XY4 elaboration (which coincides with the L1 elaboration),
this can be seen from the fact that the xy4 rule, applied just before the hxy4
rule, does not change anything in the cn-cells on which the hxy4 chain lives.
6 5 7 2 3 9 6 5 7 2 8 3 4 9 1 6 5
9 2 9 5 2 6 7 3 8 9 4 5 2 6 1 7 3 8
3 6 5 7 3 1 6 5 7 8 4 2 9
9 2 4 6 9 5 2 4 3 6 7 1 9 8 5 2 4 3
5 3 5 3 7 6 8 5 3 7 2 4 6 9 1
9 6 3 5 4 9 2 6 1 3 5 8 7
6 5 7 6 9 5 7 3 4 2 6 9 8 5 7 3 1 4
1 6 8 1 3 4 9 6 8 5 1 3 7 4 9 6 8 5 2
3 9 5 3 2 9 6 5 8 4 1 3 2 9 7 6
Resolution path in L4_0+XY4+HXY4 for the L4_0+XY4 (or L1) elaboration of Royle17-
520:
row r4 interaction-with-block b4 ==> r6c3 ≠ 7
column c1 interaction-with-block b4 ==> r6c3 ≠ 4
column c9 interaction-with-block b6 ==> r5c8 ≠ 1
row r1 interaction-with-block b1 ==> r3c2 ≠ 2
hidden-single-in-a-column ==> r1c2 = 2
xy4-chain {n8 n2}r7c1 – {n2 n1}r7c8 – {n1 n7}r9c8 – {n7 n8}r6c8 ==> r6c1 ≠ 8
hxy-cn4-chain {r6 r5}c9n1 – {r5 r3}c9n9 – {r3 r8}c9n2 – {r8 r6}c3n2 ==> r6c3 ≠ 1
block b4 interaction-with-row r4 ==> r4c5 ≠ 1
… (Naked-Singles and Hidden-Singles)
8 5 2 9 6 8 3 7 5 2 9 6 8 1 3 7 4 5
6 2 7 8 4 5 6 2 3 7 8 4 5 6 9 2 1 3
1 1 3 5 7 2 6 1 3 5 7 4 2 8 9 6
4 2 1 4 2 3 1 7 6 5 4 2 3 1 7 6 5 8 9
6 7 5 1 3 4 6 7 2 8 5 1 3 9 4 6 7 2
3 6 7 2 5 4 3 1 6 7 9 2 5 8 4 3 1
6 7 3 5 6 7 9 3 2 4 5 6 7 9 8 1 3 2 4
4 5 3 1 4 2 5 6 7 3 1 8 4 2 5 9 6 7
2 4 2 6 3 7 1 5 9 4 2 6 3 7 1 5 8
Figure 11. Puzzle Royle17-11212, its L1_0 elaboration and its solution
As usual, let us start with a very simple example, puzzle Royle17-11212 (Figure
11). Its L4+XY5 elaboration coincides with its L1_0 elaboration. And this elaboration
can be solved by a single application of HXY5 (apart from the final NS). The
hxy-cn5 pattern is readily visible on the central puzzle.
Resolution path in L4+XY5+HXY5 for the L4+XY5 (or L1_0) elaboration of puzzle
Royle17-11212:
hxy-cn5-chain {r3 r5}c5n9 – {r5 r7}c5n8 – {r7 r6}c6n8 – {r6 r8}c3n8 – {r8 r3}c7n8 ==>
r3c7 ≠ 9
… (Naked-Singles)
Another simple example is puzzle Royle17-7295 (Figure 12). Again, the L4+
XY5 and L1_0 elaborations coincide. The HXY5-rn pattern appears immediately
after an Interaction.
2 1 8 2 1 9 8 3 2 7 1 6 4 5
4 9 1 4 3 8 5 9 2 1 4 6 3 8 5 7 9 2
2 9 4 1 8 7 2 5 9 6 4 3 1 8
7 5 6 3 7 5 8 2 6 1 4 3 9 7 5 8 2 6 1
2 1 2 6 1 4 9 3 8 2 6 1 4 9 3 5 8 7
8 6 8 6 1 2 9 3 4 8 5 7 6 1 2 9 3 4
9 3 9 8 1 3 7 2 5 9 8 1 3 7 4 2 6
1 3 8 2 9 1 3 6 7 4 8 2 9 1 5 3
4 8 3 1 2 5 4 6 8 7 9 3 1 2 5 4 6 8 7 9
Figure 12. Puzzle Royle17-7295, its L1_0 elaboration and its solution
Resolution path in L4+XY5+HXY5 for the L4+XY5 (or L1_0) elaboration of puzzle Royle17
-7295:
row r8 interaction-with-block b7 ==> r7c1 ≠ 6
hxy-rn5-chain {c7 c3}r1n3 – {c3 c1}r1n9 – {c1 c3}r4n9 – {c3 c1}r4n4 – {c1 c7}r7n4 ==>
r1c7 ≠ 4
… (Naked-Singles and Hidden-Singles)
Our third example, puzzle Royle17-4167 (Figure 13), is no more complex. Its
L4+XY5 and L1_0 elaborations coincide. After the initial elimination of a candidate
by an Interaction rule, the solution path for this elaboration starts with HXY-rn5 and
the sequel is entirely in L1_0.
Resolution path in L4+XY5+HXY5 for the L4+XY5 (or L1_0) elaboration of puzzle
Royle17-4167:
block b4 interaction-with-row r4 ==> r4c5 ≠ 6
hxy-rn5-chain {c5 c3}r4n3 – {c3 c2}r4n6 – {c2 c8}r3n6 – {c8 c9}r8n6 – {c9 c5}r5n6 ==>
r5c5 ≠ 3
… (Naked-Singles and Hidden-Singles)
4 8 1 5 4 7 8 2 1 9 5 3 6 4 7 8 2 1
7 2 7 4 2 1 8 5 3 7 4 2 1 8 9 6 5 3
1 8 3 5 2 7 4 1 6 8 3 5 2 7 9 4
7 5 4 8 7 5 1 4 2 8 9 6 7 3 5 1 4 2
2 7 4 2 1 5 8 3 7 4 2 9 1 5 8 6
1 2 1 5 8 4 3 7 2 1 5 8 6 4 9 3 7
6 4 3 6 2 9 4 7 8 3 1 5 6 2 9 4 7 8 3 1 5
8 1 4 8 7 5 1 3 2 4 8 7 5 1 3 2 6 9
5 7 5 3 1 2 4 7 8 5 3 1 9 2 6 4 7 8
Figure 13. Puzzle Royle17-4167, its L1_0 elaboration and its solution
The L5+XY6 elaboration of puzzle Royle17-5546 (Figure 14) coincides with its
L1 elaboration and, moreover, all the candidate eliminations done by the elaboration
process are subsumed by the values it produces. The resolution path for the elaborated
puzzle starts with the HXY-rn6 rule. One can therefore detect a HXY-rn6 pattern
directly on the central grid of Figure 14 (after displaying it in rn-space).
7 1 4 2 7 1 6 4 9 2 8 7 1 5 6 3
3 2 3 7 4 6 2 1 3 7 8 4 5 6 2 9 1
1 6 3 2 4 7 1 5 6 9 3 2 4 8 7
4 7 1 4 3 7 1 8 4 9 5 6 3 7 1 2
6 2 8 6 1 7 2 8 4 9 3 5 6 1 7 2 8 4 9 3 5
4 1 7 8 4 2 3 5 1 9 7 8 4 6
5 3 9 5 2 4 3 1 8 6 7 9 5 2 4 3 1 8 6 7 9
6 2 7 6 2 1 4 7 8 3 6 2 9 1 5 4
1 6 1 7 4 2 9 6 1 7 4 5 3 2 8
Resolution path in L5+XY6+HXY6 for the L5+XY6 (or L1) elaboration of puzzle
Royle17-5546:
hxy-rn6-chain {c8 c6}r8n5 – {c6 c7}r9n5 – {c7 c9}r9n3 – {c9 c1}r9n8 – {c1 c3}r4n8 –
{c3 c8}r2n8 ==> r8c8 ≠ 8
naked and hidden singles ==> r8c8 = 5, r9c7 = 3, r1c7 = 5, r9c9 = 8, r1c9 = 3, r9c1 = 9,
r6c1 = 2, r4c1 = 8, r6c9 = 6, r4c9 = 2, r9c6 = 5, r8c6 = 9, r4c5 = 6
xy4-chain {n9 n8}r1c2 – {n8 n9}r1c4 – {n9 n5}r4c4 – {n5 n9}r6c5 ==> r6c2 ≠ 9
column c2 interaction-with-block b1 ==> r2c3 ≠ 9
xy3-chain {n8 n9}r1c4 – {n9 n5}r2c5 – {n5 n8}r2c3 ==> r1c2 ≠ 8
naked singles ==> r1c2 = 9, r1c4 = 8
xy4-chain {n9 n5}r4c3 – {n5 n8}r2c3 – {n8 n5}r3c2 – {n5 n9}r3c4 ==> r4c4 ≠ 9
… (Naked-Singles)
For puzzle Sudogen0-9617 (Figure 15), the L6+XY7 and the L5 elaborations
coincide (they effectively use XY5, which leads to asserting the value r6c7 = 3).
3 9 3 6 9 3 4 1 6 2 9 8 7 5
1 4 6 7 8 1 4 6 5 9 7 8 3 1 4 2 6
8 2 4 5 8 2 6 4 5 9 8 2 6 4 7 5 3 1 9
5 9 2 5 6 9 2 3 4 1 5 6 7 9 8
7 8 3 4 7 6 8 3 9 2 4 7 6 8 3 9 2 1 5 4
2 6 9 4 8 2 6 1 5 9 7 4 8 2 6 3
7 6 9 7 5 2 6 3 9 4 7 5 2 6 3 9 8 1
3 8 6 3 8 6 1 3 9 8 7 5 4 2
2 6 8 2 1 6 9 8 2 5 1 4 6 3 7
Two resolution paths for this elaboration can now be considered: either in L6+
XY7+HXY7 or in L4_0+XY4_7+HXY4_7. The two paths have a common (not
very interesting) part in L3:
;;; common part in L3 for the two resolution paths, in L4_0+XY4_7+HXY4_7 and in L6+
XY7+HXY7, for the L6+XY7 (or L5) elaboration of Sudogen0-9617:
column c6 interaction-with-block b8 ==> r9c4 ≠ 7, r8c4 ≠ 7
row r5 interaction-with-block b6 ==> r6c9 ≠ 5
column c1 interaction-with-block b7 ==> r8c2 ≠ 4
row r5 interaction-with-block b6 ==> r6c9 ≠ 1, r4c9 ≠ 1, r4c7 ≠ 1
row r3 interaction-with-block b3 ==> r1c9 ≠ 1
column c9 interaction-with-block b9 ==> r8c8 ≠ 1, r8c7 ≠ 1, r7c8 ≠ 1
row r3 interaction-with-block b3 ==> r1c8 ≠ 1, r1c7 ≠ 1
naked-pairs-in-a-row {n5 n9}r2{c1 c2} ==> r2c8 ≠ 5
row r2 interaction-with-block b1 ==> r1c2 ≠ 5
xyz3-chain {n7 n3}r6c9 – {n3 n5}r9c9 – {n5 n7}r8c7 ==> r8c9 ≠ 7
;;; end of the common part
From this point on, either we allow only rules of types XY and HXY and of
length at most seven, or, in addition, we allow any rule of length at most six.
Continuation of the resolution path, in L4_0+XY4_7+HXY4_7, for the L6+XY7 (or L5)
elaboration of Sudogen0-9617:
hxy-rn6-chain {c2 c9}r8n1 – {c9 c8}r8n2 – {c8 c5}r2n2 – {c5 c8}r2n3 – {c8 c9}r9n3 –
{c9 c2}r6n3 ==> r6c2 ≠ 1
hxy-cn7-chain {r7 r1}c8n8 – {r1 r4}c7n8 – {r4 r3}c7n3 – {r3 r2}c5n3 – {r2 r1}c5n2 –
{r1 r8}c9n2 – {r8 r7}c9n1 ==> r7c9 ≠ 8
… (38 Naked-Singles)
Continuation of the resolution path, in L6+XY7+HXY7, for the L6+XY7 (or L5) elaboration
of Sudogen0-9617:
xyzt5-chain {n7 n3}r6c9 – {n3 n5}r9c9 – {n5 n7}r8c7 – {n7 n8}r4c7 – {n8 n7}r4c9 ==>
r1c9 ≠ 7
hxy-rn6-chain {c2 c9}r8n1 – {c9 c8}r8n2 – {c8 c5}r2n2 – {c5 c8}r2n3 – {c8 c9}r9n3 –
{c9 c2}r6n3 ==> r6c2 ≠ 1
hxy-cn7-chain {r7 r1}c8n8 – {r1 r4}c7n8 – {r4 r3}c7n3 – {r3 r2}c5n3 – {r2 r1}c5n2 –
{r1 r8}c9n2 – {r8 r7}c9n1 ==> r7c9 ≠ 8
… (38 Naked-Singles)
In the second case, one more rule for a shorter chain is applied (XYZT5) before
hxy-rn6 and hxy-cn7, but this is useless, since the next steps are unchanged.
One more thing to notice about this puzzle is the long sequence of 38 final
Naked-Singles.
Now comes a very complex but also very instructive example, puzzle Royle17-
1020 (Figure 16). It can be solved neither in L7+XY8 nor in L4_0+XY4_13+
HXY4_13, i.e. in L4_0 plus all the rules of type xy and of length between four and
thirteen (XY4_13) plus all the rules of type hxy and of length between four and
thirteen (HXY4_13). Indeed, the elaborations of this puzzle by either of these two
sets of rules coincide with its L1_0 elaboration; this means that all the other rules in
L7+XY8 or in L4_0+XY4_13+HXY4_13 are of no use for producing an interesting
elaboration of this puzzle. Moreover, the L1_0 elaboration has only three more
values than the original puzzle; as a result, it may be expected that the resolution
path starting from the elaborated version will be very long.
This puzzle can be solved if we add the HXY8 rule to L7+XY8. The solution
requires many kinds of rules for chains of length not greater than seven in addition
to the rule of interest (HXY-cn8); some of these chains are of types (hxyt and c) that
will be introduced or studied only in further chapters. This example is thus also a
justification for introducing these more complex rules. In contrast with our usual
examples chosen for simplicity, it shows how complex a resolution path can be.
2 6 2 6 4 9 1 3 5 8 2 6 7
8 7 8 7 8 3 6 7 2 9 5 1 4
3 3 2 5 7 4 6 1 3 8 9
1 6 5 1 7 6 5 1 4 2 9 7 6 8 5 3
7 3 7 3 5 6 7 9 8 3 5 1 4 2
4 2 4 2 3 8 5 1 4 2 7 9 6
5 1 5 1 7 6 8 5 9 3 4 2 1
2 4 2 4 5 2 4 6 1 7 9 3 8
3 1 3 9 1 3 2 8 4 6 7 5
Figure 16. Puzzle Royle17-1020, its L1_0 elaboration and its solution
hxyt-cn5-chain {r3 r5}c1n2 – {r5 r4}c9n2 – {r4 r6}c9n3 – {r6 r1}c1n3 – {r1 r3}c1n4 ==>
r3c1 ≠ 9, r3c1 ≠ 7, r3c1 ≠ 6
naked-pairs-in-a-row {n2 n4}r3{c1 c4} ==> r3c9 ≠ 4, r3c8 ≠ 4, r3c6 ≠ 4, r3c2 ≠ 4
hxyt-cn6-chain {r9 r7}c8n2 – {r7 r2}c5n2 – {r2 r3}c5n6 – {r3 r1}c5n5 – {r1 r8}c5n1 –
{r8 r9}c5n8 ==> r9c8 ≠ 8
hxy-rn7-chain {c2 c9}r4n3 – {c9 c3}r4n2 – {c3 c5}r2n2 – {c5 c8}r7n2 – {c8 c6}r7n3 –
{c6 c7}r7n4 – {c7 c2}r4n4 ==> r4c2 ≠ 9, r4c2 ≠ 8
hxy-rn7-chain {c6 c8}r7n3 – {c8 c5}r7n2 – {c5 c3}r2n2 – {c3 c9}r4n2 – {c9 c2}r4n3 –
{c2 c7}r4n4 – {c7 c6}r7n4 ==> r7c6 ≠ 9, r7c6 ≠ 7
hxy-rn7-chain {c7 c6}r7n4 – {c6 c8}r7n3 – {c8 c5}r7n2 – {c5 c3}r2n2 – {c3 c9}r4n2 –
{c9 c2}r4n3 – {c2 c7}r4n4 ==> r9c7 ≠ 4, r5c7 ≠ 4, r2c7 ≠ 4
hxy-cn7-chain {r9 r3}c4n2 – {r3 r5}c1n2 – {r5 r4}c9n2 – {r4 r6}c9n3 – {r6 r1}c1n3 –
{r1 r8}c4n3 – {r8 r9}c4n6 ==> r9c4 ≠ 4
column c4 interaction-with-block b2 ==> r2c6 ≠ 4, r1c6 ≠ 4
hxy-cn8-chain {r9 r7}c8n2 – {r7 r8}c8n3 – {r8 r1}c4n3 – {r1 r6}c1n3 – {r6 r4}c9n3 –
{r4 r2}c2n3 – {r2 r7}c6n3 – {r7 r9}c6n4 ==> r9c8 ≠ 4
c4-chain row-bl-col on cells n4{r4c2 r4c7} – n4{r5c8 r2c8} ==> r2c2 ≠ 4
row r2 interaction-with-block b3 ==> r1c9 ≠ 4
hxy-cn5-chain {r1 r8}c4n3 – {r8 r7}c8n3 – {r7 r2}c6n3 – {r2 r4}c2n3 – {r4 r1}c2n4 ==>
r1c4 ≠ 4
...(Naked-Singles and Hidden-Singles)
Let us now comment on some parts of this very long resolution path. First, you can
see six different instances of the HXY-rn6 rule:
hxy-rn6-chain {c8 c6}r7n3 – {c6 c2}r2n3 – {c2 c9}r4n3 – {c9 c3}r4n2 – {c3 c5}r2n2 –
{c5 c8}r7n2
hxy-rn6-chain {c6 c8}r7n3 – {c8 c5}r7n2 – {c5 c3}r2n2 – {c3 c9}r4n2 – {c9 c2}r4n3 –
{c2 c6}r2n3
hxy-rn6-chain {c5 c8}r7n2 – {c8 c6}r7n3 – {c6 c2}r2n3 – {c2 c9}r4n3 – {c9 c3}r4n2 –
{c3 c5}r2n2
hxy-rn6-chain {c9 c2}r4n3 – {c2 c6}r2n3 – {c6 c8}r7n3 – {c8 c5}r7n2 – {c5 c3}r2n2 –
{c3 c9}r4n2
hxy-rn6-chain {c3 c9}r4n2 – {c9 c2}r4n3 – {c2 c6}r2n3 – {c6 c8}r7n3 – {c8 c5}r7n2 –
{c5 c3}r2n2
hxy-rn6-chain {c2 c9}r4n3 – {c9 c3}r4n2 – {c3 c5}r2n2 – {c5 c8}r7n2 – {c8 c6}r7n3 –
{c6 c2}r2n3
You can notice that these six instances live on the same six rn-cells, each of
which contains exactly two (column) candidates. These rn-cells are just built into
different hxy-sequences, with different left and right candidates. Each of them allows
eliminating different (column) candidates from different target rn-cells. Notice
that these cells form a loop in rn-space and this example also illustrates how loops
are considered in our approach (whichever of the rc-, rn- or cn- spaces they lie in):
not as a specific pattern in itself but as the support for several open chains built on it.
Of course, in practice, this does not forbid the search for loops, if you like loops. It
just illustrates that we need not add specific rules for dealing with them.
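The idea that a loop is only the support for several open chains can be sketched as follows: cutting a loop of k bivalue cells between two consecutive cells gives one open chain per cut. The data structure (a list of (cell, left, right) triples given in loop order) and the function names are assumptions made for this illustration; the loop used is the one formed by the six rn-cells listed above.

```python
def open_chains_from_loop(loop):
    """Return the k open chains obtained by cutting a loop of k cells at each link."""
    k = len(loop)
    for i, (_, _, right) in enumerate(loop):
        # The right candidate of each cell must be the left candidate of the next.
        if right != loop[(i + 1) % k][1]:
            raise ValueError("these cells do not form a loop")
    return [loop[i:] + loop[:i] for i in range(k)]

def show(chain):
    """Write a chain of rn-cells in the notation of the resolution paths."""
    return " – ".join(
        "{c%d c%d}%s" % (left, right, cell) for cell, left, right in chain
    )

# The six rn-cells of the loop, with their (left, right) candidate columns.
loop = [
    ("r7n3", 8, 6), ("r2n3", 6, 2), ("r4n3", 2, 9),
    ("r4n2", 9, 3), ("r2n2", 3, 5), ("r7n2", 5, 8),
]
chains = open_chains_from_loop(loop)
# For each open chain, the column that may be eliminated from its target rn-cells.
target_columns = [chain[0][1] for chain in chains]
```

Each of the six chains is closed in the sense required by the rule (the left candidate of its first cell equals the right candidate of its last cell), and the six chains have six different target columns. Reading a chain in the reverse order gives the same cut, which is why only six instances appear.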
Figure 17. Six instances of a hxy-rn chain on the same six cells in rn-space
Something similar (including the remark on loops) occurs with the three different
instances of the HXY-rn7 rule:
hxy-rn7-chain {c2 c9}r4n3 – {c9 c3}r4n2 – {c3 c5}r2n2 – {c5 c8}r7n2 – {c8 c6}r7n3 –
{c6 c7}r7n4 – {c7 c2}r4n4
hxy-rn7-chain {c6 c8}r7n3 – {c8 c5}r7n2 – {c5 c3}r2n2 – {c3 c9}r4n2 – {c9 c2}r4n3 –
{c2 c7}r4n4 – {c7 c6}r7n4
hxy-rn7-chain {c7 c6}r7n4 – {c6 c8}r7n3 – {c8 c5}r7n2 – {c5 c3}r2n2 – {c3 c9}r4n2 –
{c9 c2}r4n3 – {c2 c7}r4n4
Chapter XVI
The first section of this chapter justifies the last condition by showing that one
need not consider c-chains with (local or global) loops. This is very important in
practice since it simplifies considerably the search for c-chains (for both humans and
computers).
Section 2 first shows that the super-hidden subset rules introduced in chapters VI
to VIII are subsumed by full c-chains rules (but should not be replaced by them, due
to their lower complexity). It also simplifies the search for chains by showing that
we need not consider hidden or super-hidden c-chains.
Finally, detailed examples of c-chains of lengths four, six and eight are given in
section 3; as explained in the introduction, these examples implicitly prove independence
results.
Proof: given two consecutive links in the chain, one of them is a c-link, say u.
Since any cell in the chain must contain the common c-linking value as a candidate,
no other cell of the chain can be in u; in particular, the previous and the next link in
the chain (if there is any) cannot be u.
Theorem XVI.2: in a c-chain, the RiB and CiB Interaction rules enforce that
three consecutive cells cannot be in the same block.
Proof: consider three consecutive cells. Apart from reversing their order, they
form a c3-chain 1=(1)=1—1. Suppose these three cells share a block b.
If the first link is of type blk, no other cell containing candidate n1 can be in this
block, by the mere definition of a c-link.
If the first link is of type row (respectively col), let this row (resp. column) be r
(resp. c). Then the RiB (resp. CiB) rule applies to row r (resp. column c) and block
b; it allows us to eliminate candidate n1 from every cell in block b and not in row r
(resp. column c). As a result, in block b, candidate n1 can only be in the first two
cells of the chain. Therefore, no other cell of the chain can be in block b.
Finally, in all three possible cases our hypothesis is contradicted, which proves
the theorem.
Define a c-loop as a c-chain, but with the last condition modified as follows: all
the cells are different, except the first and the last, which are equal. Define the
length of a c-loop as the number of cells in the chain, where the first and the last
count for only one.
Notice that, to a given c-loop of length k, one can associate a c-chain of length k
by just forgetting the last link between cells Ck and C1. A c-loop can therefore be
also defined as a c-chain in which there is a direct link between the last and the first
cells, such that this link is a c-link if k is odd. The length of the c-loop is thus the
length of the associated c-chain. Moreover, if k is even, then the associated c-chain
is a full c-chain.
Notice that, given a c-loop of odd length k, the first and the last links are c-links
and it is easy to prove that the value of the first cell C1 must be n1: if C1 is supposed
to be different from n1, theorem XII.4 (applied to the associated non-full c-chain)
proves that the value of cell Ck cannot be n1; since the last link (between Ck and C1)
is a c-link for value n1, C1 must be n1, which is a contradiction.
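The argument above can be checked mechanically on a purely propositional abstraction. The following sketch is only an illustration and is not taken from SudoRules: it represents each cell Ci by the Boolean "Ci = n1", reads a c-link as "exactly one of the two cells is n1" and an ordinary link as "not both cells are n1", and enumerates all the assignments compatible with a c-loop of odd length. The function name loop_models and this Boolean encoding are assumptions made for the example; the geometry of the grid is deliberately ignored.

```python
from itertools import product

def loop_models(k):
    """Return all truth assignments (x1, ..., xk), where xi means 'Ci = n1',
    that are compatible with a c-loop of odd length k: links 1, 3, 5, ... are
    c-links, links 2, 4, ... are ordinary links, and the closing link between
    Ck and C1 is a c-link."""
    if k < 3 or k % 2 == 0:
        raise ValueError("k must be an odd integer >= 3")
    models = []
    for x in product([False, True], repeat=k):
        ok = True
        for i in range(k - 1):          # link number i+1 joins x[i] and x[i+1]
            a, b = x[i], x[i + 1]
            if i % 2 == 0:
                ok = ok and (a != b)    # c-link: exactly one of the two is n1
            else:
                ok = ok and not (a and b)  # ordinary link: not both are n1
        ok = ok and (x[-1] != x[0])     # closing c-link between Ck and C1
        if ok:
            models.append(x)
    return models

# In every compatible assignment, the first cell carries the value n1.
for k in (5, 7, 9):
    models = loop_models(k)
    print(k, len(models), all(m[0] for m in models))
```

In this abstraction, every surviving assignment has its first component true, which is the conclusion reached in the text.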
One might therefore think that c-loops of odd length lead to a new type of chain
rules, one for each odd length, the C2k+1-loop rule: for any c-loop of odd length,
assert the common c-linking value as the value of its first cell. Nevertheless, this
subsection shows that this rule is unnecessary since it is subsumed by simpler rules.
Theorem XVI.3: The c3-loop rule is subsumed by the RiB and CiB Interaction
rules.
Proof: consider a c3-loop, where the first and the last cells are the same. Since
any two of the three cells share a unit, there must be a common unit shared by them,
according to theorem XI.3. Due to the existence of two c-links, these cells cannot
share a row or a column; therefore, they must share a block. But, given the RiB and
CiB rules, this contradicts theorem 2 above.
Theorem XVI.4: For any k > 1, the c2k+1-loop rule is subsumed by the c2k-chain
rule.
Proof: given a c2k+1-loop, consider the associated full c2k-chain composed of its
first 2k cells and the links between them inherited from the loop. Since cell C2k+1 is
linked to cells C2k and C1, it is a full c-chain target cell for the c2k-chain and the c2k-
chain rule entails that C2k+1 cannot be n1. Now, it suffices to remark that, since C2k+1
and C1 are c-linked for number n1, C1 must be n1.
Theorem XVI.5 Formal statement: the resolution rules that might be obtained
from c-chains with loops are subsumed by BSRT together with Hidden Single, the
RiB and CiB Interaction rules, and rules for shorter c-chains with no loops.
Practical statement: c-chains should have no loops.
Proof: We have seen above that it is not necessary to introduce chain rules for
global c-loops. Consider now a full c-chain in which there is a loop on cell Ck.
If the loop has odd length, then, according to theorems 3 and 4 above, interaction
rules (RiB and CiB) or rules for simpler chains imply that Ck = n1. Let us propagate
this result along the links in the given c-chain (in the forward direction if k is even,
in the backward direction if k is odd): we use ordinary ECP rules along an ordinary
link and Hidden-Single along a c-link. Finally, we obtain that one of the endpoints
of the initial c-chain is equal to n1. Therefore, none of its target cells can be n1.
If the loop has even length, we can always suppose that its first link is a c-link
(by reversing the given c-chain if necessary); then this loop can simply be excised:
this does not change the set of target cells of the given chain and, using theorem
XII.4, this does not change the inferences one can make along the chain.
Theorem XVI.6: the c2-chain rule is subsumed by the Interaction rules (RiB,
CiB, BiR and BiC).
Note that the converse of this theorem does not hold: there are cases of RiB,
CiB, BiR and BiC that are not of type c2-chain.
Proof of the theorem: Let cells C1 and C2 form a c2-chain with linking unit u, of
type ut, and c-linking value n1. Let TC be a full c-chain target cell, i.e. TC shares a
unit with each of C1 and C2. According to theorem XI.3, there is a unit u’ shared by
C1, C2 and TC.
If this common unit is u, number n1 is already absent from the candidates for TC
by the definition of a c-link and no rule is needed in this case.
If this unit is not u, then we have to consider three different cases, according to
the value of ut.
– if ut = row, then u’ cannot be of type col (since two cells C1 and C2 in the same
row u cannot share a column), so that u’ is of type blk; let r1 be the row c-linking C1
and C2 and let b1 be the block common to all three cells; then the situation is one of
an Interaction of row r1 with block b1; and the RiB rule leads to the conclusion that
n1 should be eliminated from the candidates for TC;
– if ut = col, the reasoning is similar (permute row and column);
– if ut = blk, then u’ is of type row or col; let b1 be the block c-linking C1 and C2
and let r1 be the row (resp. c1 the column) common to all three cells; then the situa-
tion is one of an Interaction of block b1 with row r1 (resp. column c1); and the BiR
(respectively BiC) rule leads to the conclusion that n1 should be eliminated from the
candidates for TC.
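The proof relies on the fact, quoted from theorem XI.3, that three cells which pairwise share a unit always have a unit in common. As a sanity check, this property can be verified by brute force over the whole grid. The sketch below is an illustration only; the helper names (units_of, share_a_unit, counterexamples) and the numbering of blocks from 1 to 9 are assumptions of the example.

```python
# All 81 cells, written (row, column) with indices from 1 to 9.
CELLS = [(r, c) for r in range(1, 10) for c in range(1, 10)]

def units_of(cell):
    """The three units (row, column, block) that contain a cell."""
    r, c = cell
    block = 3 * ((r - 1) // 3) + (c - 1) // 3 + 1
    return {("row", r), ("col", c), ("blk", block)}

def share_a_unit(a, b):
    """True when a and b are different cells lying in a common unit."""
    return a != b and bool(units_of(a) & units_of(b))

def counterexamples():
    """Triples (C1, C2, TC) that are pairwise linked but have no common unit."""
    bad = []
    for c1 in CELLS:
        for c2 in CELLS:
            if not share_a_unit(c1, c2):
                continue
            for tc in CELLS:
                if share_a_unit(tc, c1) and share_a_unit(tc, c2):
                    if not (units_of(c1) & units_of(c2) & units_of(tc)):
                        bad.append((c1, c2, tc))
    return bad

print(counterexamples())  # an empty list means the property holds everywhere
```

The exhaustive search finds no counterexample, so in the proof above the target cell TC always lies in a unit shared with both C1 and C2.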
Now, why should we prefer the set of Interaction rules to the c2-chain rule? The
first subsuming the second is not a sufficient reason for this. We must also check
that the first is not more complex than the second. This is true according to our
complexity scale: these rules should be placed at level L1. For the c2-chain rule,
this assertion may contradict our general positioning of chain rules according to
their length (which would put c2-chains a priori at level 2). But c2-chains with any
pre-specified type of link constitute a very particular case; for instance, any
c2-chain 1=(1)=1 with a link of type column can be represented by a single cell in
row-number space. The internal pattern of this cell is the presence of exactly two
values (columns). Conversely, this argument might lead us to classify c2-chain rules
as simpler than Interaction rules (which have more complex internal patterns). But
we have decided not to introduce such subtle differences in our classification. In
any case, this rule can be added anywhere in a sub-hierarchy inside level L1;
globally, it will not change the set of grids solved at this level.
Theorem XVI.7:
1) the X-Wing(row) and X-Wing(col) rules are subsumed by the c4-chain rule;
2) the Swordfish(row) and Swordfish(col) rules are subsumed by the c6-chain
rule;
3) the Jellyfish(row) and Jellyfish(col) rules are subsumed by the c8-chain rule.
Proof: details of the proof are left to the reader as an easy exercise. Given an
instance of one of the above Super-Hidden Subset patterns, the proof proceeds in
two steps: first show that the corresponding c-chain pattern can be instantiated in
respectively four (2x2), nine (3x3) or sixteen (4x4) different ways in the actual
Super-Hidden Subset; then show that the c-chain rule (applied successively to each
of these instantiations) globally leads to the elimination of the same candidates as
the initial Super-Hidden Subset rule.
Should we therefore eliminate X-Wing, Swordfish and Jellyfish from our sets of
rules? No, because they are less complex than the general rules subsuming them.
Theorem XVI.8:
1) the X-Wing(row) and X-Wing(col) rules are the block-free part of c4-chain;
2) the Swordfish(row) and Swordfish(col) rules are the block-free part of c6-
chain;
3) the Jellyfish(row) and Jellyfish(col) rules are the block-free part of c8-chain.
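The relation between X-Wing and c4-chains can be illustrated on a small example. The sketch below is a minimal illustration, not the SudoRules implementation: it works on the set of cells in which a single number n is still a candidate, takes a c-link to be a pair of cells that are the only two candidates for n in some unit, builds chains of the form c-link, link, c-link, and eliminates n from every other cell linked to both endpoints. The layout CANDS is a toy layout invented for the example (it is not meant to be a consistent grid state); all function names are assumptions.

```python
from itertools import product

def units_of(cell):
    r, c = cell
    return {("row", r), ("col", c), ("blk", 3 * ((r - 1) // 3) + (c - 1) // 3)}

def share_a_unit(a, b):
    return a != b and bool(units_of(a) & units_of(b))

def conjugate_pairs(cands):
    """Ordered pairs (a, b) that are the only two candidate cells in a unit."""
    by_unit = {}
    for cell in cands:
        for u in units_of(cell):
            by_unit.setdefault(u, []).append(cell)
    pairs = set()
    for cells in by_unit.values():
        if len(cells) == 2:
            a, b = cells
            pairs.add((a, b))
            pairs.add((b, a))
    return pairs

def c4_eliminations(cands):
    """Cells losing candidate n through some chain C1 =c= C2 -- C3 =c= C4."""
    pairs = conjugate_pairs(cands)
    eliminated = set()
    for (c1, c2), (c3, c4) in product(pairs, repeat=2):
        if len({c1, c2, c3, c4}) < 4 or not share_a_unit(c2, c3):
            continue
        # Either C1 or C4 holds n, so any cell linked to both cannot hold n.
        for t in cands - {c1, c2, c3, c4}:
            if share_a_unit(t, c1) and share_a_unit(t, c4):
                eliminated.add(t)
    return eliminated

# Toy X-Wing(row): in rows 1 and 5, n is a candidate only in columns 2 and 7.
CANDS = {(1, 2), (1, 7), (5, 2), (5, 7),   # the X-Wing cells
         (9, 2), (8, 7),                   # other candidates in columns 2 and 7
         (3, 4)}                           # an unrelated candidate
print(sorted(c4_eliminations(CANDS)))
```

On this layout, the two usable instantiations of the c4-chain pattern remove n from r9c2 and r8c7, which are exactly the cells an X-Wing(row) on rows 1 and 5 would clear, while the unrelated candidate in r3c4 is left untouched.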
As there are some similarities between xy-, hxy- and c- chains, it is worth making
their independence explicit in some detail. We do this for chains of length 4, given
L4_0 (and actually given much less than L4_0). For this purpose, we shall exhibit
three puzzles: one in [L4_0+XY4+HXY4]+C4, one both in [L4_0+C4]+XY4 and
in [L4_0+C4]+HXY4 and one in [L4_0+C4+XY4]+HXY4.
[Figure: puzzle Royle17-57, its L4_0+XY4+HXY4 elaboration and its solution]
Resolution path in L4_0+XY4+HXY4+C4 for the L4_0+XY4+HXY4 (or L4_0 or L1_0)
elaboration of Royle17-57:
c4-chain row-bl-col n4r8{c8 c6} – n4{r9 r4}c5 ==> r4c8 ≠ 4
column c8 interaction-with-block b9 ==> r9c9 ≠ 4
hxy-cn4-chain {r9 r5}c3n6 – {r5 r2}c3n4 – {r2 r4}c9n4 – {r4 r9}c5n4 ==> r9c5 ≠ 6
… (Naked-Singles and Hidden-Singles)
[Figure: puzzle Royle17-118, its L4_0+C4 elaboration and its solution]
After a single Interaction rule (which is applied first because it is simpler and has
therefore higher priority), one can choose indifferently between an xy-chain rule and
an hxy-chain rule (both of which were already applicable before the Interaction rule
was applied). The two (xy4 and hxy4) chains appear directly on the central grid. Let
us give the two resolution paths:
Resolution path in L4_0+C4+XY4 for the L4_0+C4 (or L1_0) elaboration of Royle17-118:
block b3 interaction-with-column c7 ==> r6c7 ≠ 5
xy4-chain {n5 n9}r9c9 – {n9 n6}r2c9 – {n6 n9}r2c5 – {n9 n5}r1c6 ==> r9c6 ≠ 5
… (17 Naked Singles)
Resolution path in L4_0+C4+HXY4 for the L4_0+C4 (or L1_0) elaboration of Royle17-118:
block b3 interaction-with-column c7 ==> r6c7 ≠ 5
hxy-rn4-chain {c6 c1}r9n4 – {c1 c9}r9n9 – {c9 c5}r2n9 – {c5 c6}r5n9 ==> r5c6 ≠ 4
… (17 Naked Singles)
[Figure: puzzle grid, its elaboration and its solution]
Resolution path in L4_0+C4+XY4+HXY4 for the L4_0+C4+XY4 (or L4_0 or L1)
elaboration of Royle17-4934:
hxy-rn4-chain {c7 c5}r7n3 – {c5 c2}r7n7 – {c2 c9}r6n7 – {c9 c7}r3n7 ==> r3c7 ≠ 3
… (Naked-Singles and Hidden-Singles)
Resolution path in L4_0+C4+XY4+HXY4 for the L4_0+C4+XY4 (or L4_0 or L1)
elaboration of Royle17-520:
row r4 interaction-with-block b4 ==> r6c3 ≠ 7
column c1 interaction-with-block b4 ==> r6c3 ≠ 4
column c7 interaction-with-block b3 ==> r3c9 ≠ 1, r3c8 ≠ 1
column c9 interaction-with-block b6 ==> r6c8 ≠ 1, r5c8 ≠ 1
xy4-chain {n8 n2}r7c1 – {n2 n1}r7c8 – {n1 n7}r9c8 – {n7 n8}r6c8 ==> r6c1 ≠ 8
hxy-cn4-chain {r6 r5}c9n1 – {r5 r3}c9n9 – {r3 r8}c9n2 – {r8 r6}c3n2 ==> r6c3 ≠ 1
block b4 interaction-with-row r4 ==> r4c5 ≠ 1
… (Naked-Singles and Hidden-Singles)
[Figure: puzzle Royle17-3766, its L4_0 elaboration and its solution]
But the same L4_0 elaboration can also be easily solved with a single instance of
C4, applied to another sequence of cells. This puzzle is thus also in [L4_0]+C4.
Let us display the two resolution paths and check that the two xy4 and c4 chains
use different sequences of cells on the same grid (leading to different eliminations):
Resolution path in L4_0+XY4 for the L4_0 (or L1) elaboration of Royle17-3766:
xy4-chain {n6 n9}r9c6 – {n9 n6}r3c6 – {n6 n9}r1c4 – {n9 n6}r1c3 ==> r9c3 ≠ 6
… (Naked-Singles)
Resolution path in L4_0+C4 for the L4_0 (or L1) elaboration of Royle17-3766:
c4-chain col-row-bl n6{r9 r1}c3 – n6{r1c4 r3c6} ==> r9c6 ≠ 6
… (Naked-Singles)
easily solved by a single step (apart from NS and ECP), either in L4_0+XY4 or in
L4_0+C4.
[Figure: puzzle Royle17-147, its L4_0 elaboration and its solution]
Resolution path in L4_0+XY4 for the L4_0 (or L1) elaboration of Royle17-147:
xy4-chain {n8 n9}r6c6 – {n9 n4}r2c6 – {n4 n9}r2c2 – {n9 n8}r1c1 ==> r6c1 ≠ 8
… (Naked-Singles)
Resolution path in L4_0+C4 for the L4_0 (or L1) elaboration of Royle17-147:
c4-chain row-col-row n9r6{c1 c6} – n9r2{c6 c2} ==> r1c1 ≠ 9
… (Naked-Singles)
Puzzle Royle17-16774 (Figure 6) is an example where only three values are added
at the start by NS and HS rules, and then many candidates must be deleted by
many different rules before any other value can be added. It has two interesting
resolution paths, one in L5+C6 and one in L5+XY4_8. Moreover, the only part of
L5 that is effectively used in both cases is very simple, being limited to L3_0+XY4.
Given L3_0+XY4, this is thus an example of a trade between C6 and XY8.
Part common to the two resolution paths (in L5+XY6+HXY6+C6 and in L5+XY4_8)
for the L5 (or L1_0) elaboration of Royle17-16774:
[Figure 6: puzzle Royle17-16774, its elaboration and its solution]
Let us now display the ends of the two resolution paths. The same number (6)
appears to be the target value of a c6-chain and an xy8-chain (living, of course, on
different cells). But the target cells of the two chains are different. In both cases, the
rule eliminates one single candidate before the puzzle can be finished with only NS
and HS.
Continuation of the resolution path in L5+C6 for the L5 (or L1_0) elaboration of
Royle17-16774:
c6-chain n6{r9c7 r9c3} – n6{r1c3 r1c6} – n6{r5c6 r6c4} ==> r6c7 ≠ 6
… (Naked-Singles and Hidden-Singles)
Continuation of the resolution path in L5+XY4_8 for the L5 (or L1_0) elaboration of
Royle17-16774:
xy8-chain {n6 n4}r8c2 – {n4 n2}r3c2 – {n2 n6}r2c2 – {n6 n1}r2c4 – {n1 n4}r3c4 –
{n4 n6}r6c4 – {n6 n4}r6c7 – {n4 n6}r9c7 ==> r9c3 ≠ 6
… (Naked-Singles and Hidden-Singles)
We have very few examples of c8-chains (given our principle that simpler rules
should be applied before we search for these chains). Puzzle Royle17-14207 (Figure
7) can be solved at level L8 using C8 but, in addition to C8, it needs an application
of the rule for another type of chain (xyzt-chains, which will be defined later, in
chapter XIX), of length 8. (Notice that the application of XYZT8 is necessary
before C8 becomes applicable.)
[Figure 7: puzzle Royle17-14207, its elaboration and its solution]
xyt-chains
XVII.1.1. xyt-chains
the right-linking candidate; notice that several cells may have such additional t-
candidates and there can be more than one such additional candidate in each cell.
Definitions:
– a full xyt-chain is an xyt-chain such that the right-linking candidate for the last
cell equals the left-linking candidate for the first cell;
– the target number of a full-xyt-chain is the left-linking candidate for the first
cell, which is equal to the right-linking candidate for the last cell (as is the case for
xy-chains);
– a target cell of a full-xyt-chain is any general target cell.
Remarks:
– since xyt-chains obviously include xy-chains as particular cases (no extra
candidate present in any cell) and they have the same targets, our general guiding
principles would recommend eliminating pure xy-chains from strategies including
xyt-chains; but we keep them as the basis for all the types of extended xy-chains;
moreover, there is a second reason for keeping pure xy-chains: they are easier to
find than xyt-chains of the same length (and computationally "cheaper");
– contrary to pure xy-chains, xyt-chains are fundamentally non symmetrical
relatively to their endpoints and each cell must "keep some memory" of the
previous cells;
– the more we advance to the right end of an xyt-chain, the more additional
candidates are allowed in a cell;
– as a consequence, as the length of an xyt-chain gets larger, it may become
more and more difficult to discover the potential next cells (both for a human solver
and for a machine); however see the nrc(z)(t) tagging algorithm in Part Four;
– xyt-chains defined as above are strong xyt-chains; one might also introduce
the notion of a weak xyt-chain, in which only one of the above additional values is
allowed in each cell; one might also introduce the notion of an extra-weak xyt-chain,
in which only one of the above additional values is allowed in only one cell; it is
obvious that strong xyt-chains subsume weak xyt-chains, which subsume extra-weak
xyt-chains, which subsume pure xy-chains; in the sequel we shall consider only
strong xyt-chains. Notice however that, in practice, xyt-chains with more than one
additional candidate in a cell are rare. The search for xyt-chains could therefore
be limited (at least in a first stage) to the search for weak xyt-chains, for which
only bi- and tri-value cells have to be considered. For all practical questions
on how to spot the chains, see Part Four.
Theorem XVII.1 (constraints propagation rule for full xyt-chains): given a full
xyt-chain of any length, with xyt-chain target value n, eliminate n from the
candidates for any of its target cells.
Let us prove the rule for a full xyt4-chain: let the cells in the chain be C1, C2, C3,
C4; let the successive left-linking candidates be n1, n2, n3, n4, so that the target
variable is n1 and the successive right-linking candidates are n2, n3, n4, n1.
A symbolic representation of the chain and the possible values in its cells could
be: {1 2}—{2 3}—{3 4 (2#1)}—{4 1 (2#1) (3#2)}. This schema should be read as
follows: cell C3 can optionally have additional value n2 provided that it is linked to
cell C1; cell C4 can optionally have additional value n2 provided that it is linked to
cell C1 or (non exclusive "or") additional value n3 provided that it is linked to cell
C2. Details of this extended chain pattern are given in section 2 below.
Proof of xyt4-chain rule: the proof of the theorem parallels the proof of the xy4-
chain rule in section XII.2.3 until, in the second branch of the alternative for C1 (i.e.
in the hypothesis C1 = n2), we reach cell C3.
Cell C1 can take two and only two values (hypothesis n2 ≠ n1 is essential for this
assertion). Let us consider each possibility in turn:
1) if C1 = n1, then TC cannot be n1 since TC shares a unit with C1;
2) if C1 = n2, then C2 cannot be n2 since it shares a unit with C1; it must therefore
be n3 (hypothesis n3 ≠ n2 is essential for this conclusion). Therefore, C3 cannot be n3
since it shares a unit with C2.
At this point of divergence with the proof for xy-chains, there remain not one but
two possibilities for cell C3: C3 = n4 or C3 = n2 (this makes sense only if we assume
n2 ≠ n4, i.e. n2 is effectively an additional value in C3); but the second possibility
(C3 = n2) is present only when C3 is linked to C1, which makes it inconsistent with
the current hypothesis C1 = n2. We can therefore conclude that C3 = n4.
As a consequence, C4 cannot be n4, since it is linked to C3, and it can a priori be:
– either n1, in which case TC cannot be n1 since TC shares a unit with C4;
– or n2 (this makes sense only if we assume n2 ≠ n4, i.e. n2 is effectively an
additional value in C4); but C4 can be n2 only if it is linked to C1, which makes this
possibility inconsistent with the current hypothesis C1 = n2;
– or n3; but C4 can be n3 only if it is linked to C2, which makes this possibility
inconsistent with the conclusion C2 = n3 already reached from the current hypothesis
C1 = n2.
The proof for longer xyt-chains is similar, the only difference being that every
step in the proof imposes considering an alternative with one more case than for the
previous cell; therefore, the longer the chain, the larger the number of successive
steps of the proof (as for xy-chains) and the larger the maximal number of
alternatives one has to consider in each of these steps.
Notice that, as was the case for pure xy-chains, what we actually proved in the
branch of the alternative with C1 = n2 is that all the cells in the chain are equal to
their right-linking candidate. We therefore have:
Theorem XVII.2 (general theorem for not necessarily full xyt-chains): given
an xyt-chain, either the value of the first cell is its left-linking candidate, or the
value of each cell in the chain is its right-linking candidate.
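The alternative stated in this theorem can be verified exhaustively on a small instance. The sketch below uses a hypothetical xyt4-chain (the cells r1c1, r1c5, r1c8, r3c9 and their candidates are invented for the illustration, with n1, ..., n4 taken as 1, ..., 4); cell C3 carries the additional candidate 2, which is allowed because C3 is linked to C1. The code enumerates all the assignments of candidates to the chain cells in which linked cells receive different values, and checks the theorem on each of them. All names are assumptions of the example.

```python
from itertools import product

def linked(a, b):
    """True when a and b are different cells sharing a row, column or block."""
    (ra, ca), (rb, cb) = a, b
    same_block = (ra - 1) // 3 == (rb - 1) // 3 and (ca - 1) // 3 == (cb - 1) // 3
    return a != b and (ra == rb or ca == cb or same_block)

# Each entry: (cell, left-linking candidate, right-linking candidate, candidates)
CHAIN = [
    ((1, 1), 1, 2, {1, 2}),
    ((1, 5), 2, 3, {2, 3}),
    ((1, 8), 3, 4, {3, 4, 2}),   # additional candidate 2: r1c8 is linked to C1
    ((3, 9), 4, 1, {4, 1}),
]

def consistent_assignments(chain):
    """Yield value tuples in which any two linked chain cells differ."""
    cells = [cell for cell, _, _, _ in chain]
    for values in product(*(sorted(cands) for _, _, _, cands in chain)):
        if all(values[i] != values[j]
               for i in range(len(cells)) for j in range(i)
               if linked(cells[i], cells[j])):
            yield values

def theorem_holds(chain):
    """Either C1 has its left-linking value, or every cell has its right one."""
    first_left = chain[0][1]
    rights = tuple(right for _, _, right, _ in chain)
    return all(v[0] == first_left or v == rights
               for v in consistent_assignments(chain))

print(theorem_holds(CHAIN))
print([v for v in consistent_assignments(CHAIN) if v[0] != 1])
```

On this instance, the only consistent assignment in which C1 is not 1 gives every cell its right-linking candidate, so that C4 = 1 and no cell linked to both C1 and C4 can be 1.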
Let us extend our definitions of cell and chain patterns and show that we can still
associate unambiguous logical formulæ to such extensions.
A closed cell pattern is represented by (and displayed as) a list of integers and
conditional optional integers, where each integer i in the list stands for the
corresponding variable ni; the conditional optional variable (ni#k) is represented
as the conditional optional integer (i#k).
Definitions of chain patterns and their representations and of starred cell and
chain patterns are modified accordingly, with the condition that, for a cell pattern Cp
appearing in a chain pattern, conditional optional variable (ni#k) is allowed only if
k < p.
As for the meaning and the instantiations of these extended [starred] cell
patterns, they can be defined only in the context of a [starred] chain pattern. The
[starred] chain pattern
C1L1C2L2 … CkLkCk+1…Cn-1Ln-1Cn, (1<k<n),
and its ordinary [starred] cell patterns are instantiated as previously, with the
addition that an instantiation of a closed cell pattern Cp with conditional optional
variables must satisfy the following conditions. Cp is instantiated in an actual cell
i(Cp) of an actual chain when:
– each of its ordinary (i.e. non conditional optional) variables is instantiated by
an actual candidate in actual cell i(Cp), i.e. is associated to an actual candidate in this
cell;
– for each of its conditional optional variables (ni#k), either there is a link
between i(Cp) and the instantiation i(Ck) of cell pattern Ck, in which case variable ni
may be (but is not necessarily) instantiated by an actual candidate in i(Cp), or there
is no such link and variable ni may not be instantiated in i(Cp);
– any two variables that are effectively instantiated, whatever their type
(ordinary or conditional optional), must have different instantiations;
– there are no candidates in actual cell i(Cp) other than those covered by all
previous associations.
Let Cp be an unstarred extended closed cell pattern; let Vp be the finite set of its
ordinary variables and COp the finite set of its conditional optional variables, of type
(ni#k). Let also i(Cp) designate any actual cell instantiating Cp in any actual grid.
– all the predicates "ni ≠ nj", for all the variables ni and nj (with i ≠ j) in Vp ∪
CO’p; this expresses that the instantiations of all the instantiated variables must be
different;
– the formula "share-a-unit(rp, cp, rk, ck)", for every k in CO’p; this expresses that
i(Cp) must share a unit with i(Ck);
– the formula "∀n {n ∉ Vp ∪ CO’p ⇒ not-candidate(n, rp, cp)}", where n ∉ Vp
∪ CO’p is a shorthand for the finite conjunction of inequalities n ≠ ni for all
variables ni in Vp ∪ CO’p; this expresses the condition that there may be no candidates
in i(Cp) other than those covered by the instantiations of variables in Vp ∪ CO’p.
Obviously, due to its disjunctive form, this may be a very complex formula, and
it will generally be possible to simplify it quite a lot. But the important point here is
that there is a systematic procedure for associating a clearly defined logical formula
to an extended chain pattern.
With the above simple extension of the formalism introduced in chapter XIII, all
the xyt-chain rules can be written very easily. Let us do it for chains of length seven
or less.
The general principle should now be clear, as should be the pattern of regularly
increasing complexity with length. How far should we go? Same question and same
non-answer as for xy- and hxy- chains. In SudoRules, xyt-chain rules, like xy-chain
rules, have been implemented up to length sixteen and systematically tested up to
length thirteen. As the examples below show, this is not superfluous.
As an example of the logical formula associated to such a rule, let us write it for
the simpler case of an xyt3-chain (this rule was not listed above, because this case is
useless in practice after theorem 1 below). After simplification (factorisation and
introduction of auxiliary predicates), we get:
∀r1∀c1∀r2∀c2∀r3∀c3∀n1∀n2∀n3∀r∀c∀n
{ rc-bivalue(r1, c1, n1, n2) &
share-a-unit(r2, c2, r1, c1) &
rc-bivalue(r2, c2, n2, n3) &
share-a-unit(r3, c3, r2, c2) &
{ rc-bivalue(r3, c3, n3, n1) or
[ candidate(n3, r3, c3) & candidate(n1, r3, c3) & n1 ≠ n3 &
candidate(n2, r3, c3) & share-a-unit(r3, c3, r1, c1) &
∀n∉{n1, n2, n3} not-candidate(n, r3, c3)] } &
¬same-cell(r3, c3, r1, c1) &
share-a-unit(r, c, r1, c1) &
share-a-unit(r, c, r3, c3) &
¬same-cell(r, c, r2, c2) &
n=n1
=>
not-candidate(n, r, c) }.
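To make the reading of this formula concrete, it can be transcribed almost literally into executable form. The sketch below is an illustration under stated assumptions, not the SudoRules code: the grid state is a hypothetical dictionary mapping each unsolved cell (r, c) to its set of candidates, rc-bivalue is read as "the cell has exactly these two distinct candidates", and the function name xyt3_eliminations is invented for the example.

```python
def share_a_unit(a, b):
    """True when a and b are different cells sharing a row, column or block."""
    (ra, ca), (rb, cb) = a, b
    same_block = (ra - 1) // 3 == (rb - 1) // 3 and (ca - 1) // 3 == (cb - 1) // 3
    return a != b and (ra == rb or ca == cb or same_block)

def xyt3_eliminations(cand):
    """Return the set of (target cell, n1) eliminations licensed by the formula.
    cand maps each unsolved cell (r, c) to its set of candidates."""
    found = set()
    for c1, s1 in cand.items():
        if len(s1) != 2:                      # rc-bivalue(r1, c1, n1, n2)
            continue
        for n1 in s1:
            (n2,) = s1 - {n1}
            for c2, s2 in cand.items():       # rc-bivalue(r2, c2, n2, n3)
                if len(s2) != 2 or n2 not in s2 or not share_a_unit(c2, c1):
                    continue
                (n3,) = s2 - {n2}
                if n3 == n1:                  # both branches need n1 != n3
                    continue
                for c3, s3 in cand.items():
                    if c3 == c1 or not share_a_unit(c3, c2):
                        continue
                    pure = s3 == {n3, n1}     # first branch: plain xy3 cell
                    extended = s3 == {n1, n2, n3} and share_a_unit(c3, c1)
                    if not (pure or extended):
                        continue
                    for t, st in cand.items():
                        if (t != c2 and n1 in st
                                and share_a_unit(t, c1) and share_a_unit(t, c3)):
                            found.add((t, n1))
    return found

# Hypothetical candidates: three chain cells in row 1, one target, one control.
CAND = {(1, 1): {1, 2}, (1, 2): {2, 3}, (1, 3): {1, 2, 3},
        (1, 7): {1, 5}, (5, 5): {1, 5}}
print(xyt3_eliminations(CAND))
```

On this toy state, the only elimination is candidate 1 in r1c7; the three chain cells share row 1, so this is the case with the additional candidate n2 in the third cell.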
It should be noted that the complexity of the raw logical formula obtained from
the procedure defined above grows rapidly (although polynomially) as the length of
the chain increases – much faster than the apparent complexity of the chain pattern.
Considering the XYT3 rule: |= {1 2}*—{2 3}—{3 1 (2#1)}*, it can be split into
two simpler rules (corresponding to the options in the third cell):
– one without the additional variable n2 in cell C3 and with no additional link
imposed between cells C3 and C1; this case reduces to XY3;
– one with the additional variable n2 in cell C3 and an additional link imposed
between cells C3 and C1; due to theorem XI.3, there is a unit u shared by the three
cells and the pattern is one of Naked Triplets in u; NT allows us to eliminate n1 from
all the other cells in u; but this is not enough, since the initial XYT3 rule applies to
additional target cells: any cell not in u that is linked to both C1 and C3; this is taken
care of as follows.
Suppose, for instance, that u is a row r and that C1 and C3 are also in the same
block b; if n1 is not a candidate for C2, i.e. if n1 ≠ n3, whether C2 is also in b or not,
the RiB rule applies to number n1, row r and block b; it allows us to eliminate n1
from the candidates for any cell in block b and not in row r; if n1 = n3, the same
eliminations are done by NP(blk); in any case, this completes the job that would
have been done by the XYT3 rule.
To complete the proof, one must consider the other three possible cases for
which there may be additional target cells: u is a column and C1 and C3 are also in
the same block b; u is a block and C1 and C3 are also in the same row r; u is a block
and C1 and C3 are also in the same column c; these cases are treated exactly as the
first, but with RiB replaced respectively by CiB, BiR, BiC, and with NP(blk)
replaced by appropriate NPs.
As a result, we have:
Theorem XVII.1: XYT3 is subsumed by {RiB, CiB, BiR, BiC, NP, NT, XY3}.
[Figure: puzzle grid, its elaboration and its solution]
Our second example, puzzle Royle17-5105 (Figure 2), is interesting in that its
L4_0+XY4+HXY4+C4 elaboration effectively uses the XY4 and C4 rules and it
produces values that subsume the elimination of candidates done by these rules. As
a result, the resolution process of this elaboration can start with XYT4, and the xyt4
pattern is thus directly visible in the central grid of Figure 2 (after ECP is applied).
[Figure 2: puzzle Royle17-5105, its elaboration and its solution]
Our third example, puzzle Royle17-499 (Figure 3), is remarkable for another
reason: its L4_0+XY4+HXY4+C4 and its L1 elaborations coincide, and, after a single
interaction rule is applied to them, a succession of 4-chain patterns of various types
(xy4, hxy4, xyt4, c4 and again xy4) appears on the grid.
[Figures: three puzzle grids, each with its elaboration and its solution]
[Figure 6: full rc-representation of the grid, with the remaining candidates in each cell, just before the xyt6-chain rule is applied]
Because of the place it takes, we cannot do this for all our examples, but it is
useful to display the full rc-representation of at least a few cases. Let us do this just
before the xyt6-chain rule is applied (Figure 6). This xyt6-chain is:
{n7 n5}r6c5 – {n5 n6}r3c5 – {n6 n7 n5#1}r9c5 – {n7 n5}r8c4 – {n5 n6 n7#3}r8c6
– {n6 n7 n5#4}r8c7, where the additional candidates are marked with a "#".
It is interesting to notice that among the six cells of this moderately long xyt-
chain, three have a (single) additional candidate: additional candidate 5 is allowed in
r9c5 because this cell shares a column with cell 1 (r6c5), in which 5 is the right-
linking candidate; additional candidate 7 is allowed in r8c6 because this cell shares a
block with cell 3 (r9c5), in which 7 is the right-linking candidate; finally, additional
candidate 5 is allowed in r8c7 because this cell shares a row with cell 4 (r8c4), in
which 5 is the right-linking candidate.
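The conditions discussed for this xyt6-chain can be checked programmatically. The sketch below encodes the chain exactly as written above (cells, left- and right-linking candidates, and candidate sets read off the chain notation) and verifies that consecutive cells are linked, that each left-linking candidate equals the previous right-linking candidate, that every additional candidate is the right-linking candidate of an earlier cell linked to the current one, and that the chain is full. It then lists the cells linked to both endpoints. The function names and the data layout are assumptions of this illustration; whether 7 is actually a candidate in those cells depends on the grid and is not checked here.

```python
def linked(a, b):
    """True when a and b are different cells sharing a row, column or block."""
    (ra, ca), (rb, cb) = a, b
    same_block = (ra - 1) // 3 == (rb - 1) // 3 and (ca - 1) // 3 == (cb - 1) // 3
    return a != b and (ra == rb or ca == cb or same_block)

# Each entry: (cell, left-linking candidate, right-linking candidate, candidates)
CHAIN = [
    ((6, 5), 7, 5, {5, 7}),
    ((3, 5), 5, 6, {5, 6}),
    ((9, 5), 6, 7, {5, 6, 7}),   # additional 5, justified by cell 1
    ((8, 4), 7, 5, {5, 7}),
    ((8, 6), 5, 6, {5, 6, 7}),   # additional 7, justified by cell 3
    ((8, 7), 6, 7, {5, 6, 7}),   # additional 5, justified by cell 4
]

def is_full_xyt_chain(chain):
    cells = [cell for cell, _, _, _ in chain]
    if len(set(cells)) != len(cells):
        return False
    for p, (cell, left, right, cands) in enumerate(chain):
        if left == right or not {left, right} <= cands:
            return False
        if p > 0:
            prev_cell, _, prev_right, _ = chain[p - 1]
            if left != prev_right or not linked(cell, prev_cell):
                return False
        # Additional candidates must be right-linking candidates of earlier
        # cells that share a unit with the current cell.
        allowed = {r for c, _, r, _ in chain[:p] if linked(cell, c)}
        if not (cands - {left, right}) <= allowed:
            return False
    return chain[0][1] == chain[-1][2]   # full: last right = first left

def target_cells(chain):
    """Cells outside the chain that are linked to both of its endpoints."""
    first, last = chain[0][0], chain[-1][0]
    in_chain = {cell for cell, _, _, _ in chain}
    return {(r, c) for r in range(1, 10) for c in range(1, 10)
            if (r, c) not in in_chain
            and linked((r, c), first) and linked((r, c), last)}

print(is_full_xyt_chain(CHAIN), sorted(target_cells(CHAIN)))
```

With the chain as given, the check succeeds and the cells linked to both r6c5 and r8c7 are r6c7 and r8c5; the target value 7 may be eliminated from whichever of them still has it as a candidate.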
[Figure: puzzle grid, its elaboration and its solution]
Let us skip a few lengths. For puzzle Royle17-33442 (Figure 8), the
L9+XY10+HXY10 and L1 elaborations coincide. After a few (relatively) simple rules
(in L2+C4), XYT10 applies to it and the sequel needs only rules in L1_0.
[Figures: four puzzle grids, each with its elaboration and its solution]
Figure 11. Puzzle Sudogen17-1542, its L12 elaboration and its solutio
Although we have not searched our three collections systematically for xyt-
chains of lengths greater than thirteen, we have found a unique xyt14-chain in
Royle17: puzzle Royle17-17692 (Figure 12); its L13+XY14 and L1_0 elaborations
coincide. Notice that, in Royle17, there is no xyt-chain of length fifteen or sixteen.
Figure 12. Puzzle Royle17-17692, its L1_0 elaboration and its solution
Resolution path in L14 for the L13+XY14 (or L1_0) elaboration of Royle17-17692
row r3 interaction-with-block b1 ==> r2c2 ≠ 6
block b9 interaction-with-row r7 ==> r7c1 ≠ 8
block b9 interaction-with-column c7 ==> r4c7 ≠ 7
c4-chain col-row-col n8{r5 r9}c1 – n8{r9 r4}c5 ==> r5c4 ≠ 8, r4c2 ≠ 8
row r4 interaction-with-block b5 ==> r6c4 ≠ 8
xyzt7-chain {n9 n8}r9c5 – {n8 n6}r9c4 – {n6 n9}r8c6 – {n9 n7}r3c6 – {n7 n8}r2c6 –
{n8 n6}r4c6 – {n6 n9}r4c7 ==> r9c7 ≠ 9
c4-chain col-row-col n9{r8 r4}c7 – n9{r4 r9}c5 ==> r8c6 ≠ 9
block b8 interaction-with-row r9 ==> r9c1 ≠ 9
xyt14-chain {n7 n6}r9c7 – {n6 n9}r4c7 – {n9 n8}r4c5 – {n8 n9}r9c5 – {n9 n8}r9c4 –
{n8 n6}r8c6 – {n6 n7}r4c6 – {n7 n9}r3c6 – {n9 n7}r1c4 – {n7 n9}r1c8 – {n9 n8}r7c8 –
{n8 n9}r7c9 – {n9 n6}r7c1 – {n6 n7}r3c1 ==> r9c1 ≠ 7
hidden-single-in-a-row ==> r9c7 = 7
x-wing-in-columns n6{r4 r8}{c6 c7} ==> r8c3 ≠ 6, r8c2 ≠ 6
column c2 interaction-with-block b4 ==> r5c3 ≠ 6
Figure 13. Puzzle Sudogen0-456, its L1_0 elaboration and its solution
Resolution path in L15 for the L14+XY15 (or L1_0) elaboration of Sudogen0-456
row r2 interaction-with-block b2 ==> r1c5 ≠ 7, r3c2 ≠ 6
block b8 interaction-with-row r7 ==> r7c9 ≠ 7, r7c8 ≠ 7, r7c7 ≠ 7, r7c2 ≠ 7, r7c1 ≠ 7
naked-pairs-in-a-column {n3 n7}{r1 r5}c7 ==> r8c7 ≠ 7, r7c7 ≠ 3, r3c7 ≠ 3
c4-chain row-col-bl n7r9{c9 c2} – n7{r6c2 r5c1} ==> r5c9 ≠ 7
c4-chain col-row-bl n7{r8 r5}c1 – n7{r5c7 r6c8} ==> r8c8 ≠ 7
block b9 interaction-with-column c9 ==> r1c9 ≠ 7
xyzt5-chain {n3 n2}r6c5 – {n2 n5}r6c6 – {n5 n3}r5c6 – {n3 n7}r5c7 – {n7 n3}r6c8 ==>
r6c3 ≠ 3
xyzt4-chain {n5 n9}r6c3 – {n9 n6}r8c3 – {n6 n7}r9c2 – {n7 n5}r6c2 ==> r5c3 ≠ 5
xyt7-chain {n8 n5}r4c2 – {n5 n9}r6c3 – {n9 n7}r6c2 – {n7 n6}r9c2 – {n6 n3}r9c4 –
{n3 n2}r3c4 – {n2 n8}r3c2 ==> r2c2 ≠ 8
xyzt8-chain {n5 n6}r8c8 – {n6 n9}r8c3 – {n9 n4}r8c7 – {n4 n2}r8c4 – {n2 n3}r3c4 –
{n3 n6}r9c4 – {n6 n7}r9c2 – {n7 n5}r8c1 ==> r8c9 ≠ 5
xyzt8-chain {n3 n2}r6c5 – {n2 n5}r6c6 – {n5 n9}r6c3 – {n9 n7}r6c2 – {n7 n6}r9c2 –
{n6 n5}r8c3 – {n5 n1}r7c3 – {n1 n3}r5c3 ==> r5c6 ≠ 3
block b5 interaction-with-row r6 ==> r6c8 ≠ 3
xyt10-chain {n5 n2}r5c6 – {n2 n3}r6c5 – {n3 n5}r6c6 – {n5 n9}r6c3 – {n9 n7}r6c2 –
{n7 n6}r9c2 – {n6 n5}r8c3 – {n5 n1}r7c3 – {n1 n3}r5c3 – {n3 n5}r5c9 ==> r5c1 ≠ 5
hidden-pairs-in-a-row {n2 n5}r5{c6 c9} ==> r5c9 ≠ 3
xyt9-chain {n2 n5}r5c9 – {n5 n2}r5c6 – {n2 n3}r6c5 – {n3 n5}r6c6 – {n5 n9}r6c3 –
{n9 n7}r6c2 – {n7 n6}r9c2 – {n6 n3}r9c4 – {n3 n2}r3c4 ==> r3c9 ≠ 2
xyt15-chain {n8 n5}r4c2 – {n5 n9}r6c3 – {n9 n7}r6c2 – {n7 n6}r9c2 – {n6 n2}r2c2 –
{n2 n9}r7c2 – {n9 n5}r8c3 – {n5 n6}r8c8 – {n6 n4}r7c7 – {n4 n9}r8c7 – {n9 n7}r8c9 –
{n7 n2}r8c1 – {n2 n1}r7c1 – {n1 n3}r5c1 – {n3 n8}r2c1 ==> r4c1 ≠ 8
hidden-single-in-a-block ==> r4c2 = 8
xyzt6-chain {n5 n3}r4c1 – {n3 n1}r5c3 – {n1 n7}r5c1 – {n7 n2}r8c1 – {n2 n8}r2c1 –
{n8 n5}r3c1 ==> r7c1 ≠ 5
xyt7-chain {n3 n5}r4c1 – {n5 n9}r6c3 – {n9 n7}r6c2 – {n7 n6}r9c2 – {n6 n5}r8c3 –
{n5 n6}r8c8 – {n6 n3}r4c8 ==> r4c9 ≠ 3
xyt7-chain {n3 n5}r4c1 – {n5 n9}r6c3 – {n9 n7}r6c2 – {n7 n6}r9c2 – {n6 n5}r8c3 –
{n5 n1}r7c3 – {n1 n3}r5c3 ==> r5c1 ≠ 3
xyt9-chain {n7 n6}r9c2 – {n6 n2}r2c2 – {n2 n5}r3c2 – {n5 n9}r7c2 – {n9 n5}r8c3 –
{n5 n6}r8c8 – {n6 n4}r7c7 – {n4 n9}r8c7 – {n9 n7}r8c9 ==> r9c9 ≠ 7
naked and hidden singles ==> r8c9 = 7, r9c2 = 7, r5c1 = 7, r5c7 = 3, r1c7 = 7, r5c3 = 1,
r7c1 = 1, r6c8 = 7, r5c9 = 2, r5c6 = 5, r4c1 = 3, r2c3 = 3, r2c2 = 6
naked-pairs-in-a-column {n5 n6}{r4 r8}c8 ==> r7c8 ≠ 6, r7c8 ≠ 5, r3c8 ≠ 6
hidden-pairs-in-a-block {n6 n9}{r3c7 r3c9} ==> r3c9 ≠ 8, r3c9 ≠ 3
c4-chain row-col-row n3r9{c9 c4} – n3r3{c4 c8} ==> r1c9 ≠ 3
…(Naked-Singles and Hidden-Singles)
[Figure: puzzle grid, its elaboration and its solution]
Resolution path in L16 for the L15+XY16 (or L3) elaboration of Sudogen0-7766
row r5 interaction-with-block b6 ==> r4c9 ≠ 6, r4c7 ≠ 6
row r2 interaction-with-block b3 ==> r1c9 ≠ 6, r1c8 ≠ 6, r1c7 ≠ 6
column c3 interaction-with-block b4 ==> r6c1 ≠ 5, r4c1 ≠ 5, r5c3 ≠ 2, r4c3 ≠ 2
block b9 interaction-with-column c8 ==> r6c8 ≠ 1, r1c8 ≠ 1
naked-triplets-in-a-column {n4 n2 n6}{r3 r5 r9}c7 ==> r6c7 ≠ 4, r6c7 ≠ 2, r4c7 ≠ 4, r4c7 ≠ 2,
r1c7 ≠ 4, r1c7 ≠ 2
hxy-cn8-chain {r3 r2}c9n8 – {r2 r5}c3n8 – {r5 r6}c6n8 – {r6 r3}c1n8 – {r3 r1}c1n5 –
{r1 r4}c1n6 – {r4 r1}c2n6 – {r1 r3}c2n9 ==> r3c9 ≠ 9
xyt9-chain {n2 n4}r6c2 – {n4 n9}r3c2 – {n9 n6}r1c2 – {n6 n5}r1c1 – {n5 n8}r3c1 –
{n8 n4}r2c3 – {n4 n9}r2c4 – {n9 n4}r1c6 – {n4 n2}r1c5 ==> r6c5 ≠ 2
xyzt9-chain {n2 n4}r6c2 – {n4 n9}r3c2 – {n9 n6}r1c2 – {n6 n5}r1c1 – {n5 n8}r3c1 –
{n8 n4}r2c3 – {n4 n5}r4c3 – {n5 n4}r4c9 – {n4 n2}r3c9 ==> r6c9 ≠ 2
xyt16-chain {n2 n4}r6c2 – {n4 n5}r4c3 – {n5 n8}r5c3 – {n8 n7}r6c1 – {n7 n6}r4c1 –
{n6 n5}r1c1 – {n5 n8}r3c1 – {n8 n4}r2c3 – {n4 n9}r2c4 – {n9 n4}r1c6 – {n4 n5}r5c6 –
{n5 n3}r6c5 – {n3 n4}r8c5 – {n4 n2}r1c5 – {n2 n4}r5c5 – {n4 n2}r4c4 ==> r4c2 ≠ 2
hidden-single-in-a-block ==> r6c2 = 2
c4-chain row-col-bl n2r4{c9 c4} – n2{r3c4 r1c5} ==> r1c9 ≠ 2
xyzt9-chain {n9 n4}r2c4 – {n4 n8}r2c3 – {n8 n5}r3c1 – {n5 n6}r1c1 – {n6 n7}r4c1 –
{n7 n3}r4c7 – {n3 n1}r1c7 – {n1 n4}r1c9 – {n4 n9}r1c2 ==> r1c6 ≠ 9
…(Naked-Singles, Hidden-Singles and Interactions)
[Figure: puzzle grid, its elaboration and its solution]
Resolution path in L16 for the L15+XY16 (or L3) elaboration of Sudogen0-4443
hxyt-chain rules are obtained from the block-free part of xyt-chain rules, by
applying the Scn and Srn transformations.
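To make the change of representation concrete, here is a minimal Python sketch. It assumes, as the rn-cells used later in this chapter suggest, that the rn-view lists for each (row, number) pair the columns that are still possible, and that the cn-view lists for each (column, number) pair the rows that are still possible. The function names and the toy candidates are hypothetical; this is not the SudoRules implementation, and it only changes the view of the candidates, it does not transform rules.

```python
from collections import defaultdict


def rc_to_rn(rc_candidates):
    """Map (row, col) -> {numbers} to (row, number) -> {columns}."""
    rn = defaultdict(set)
    for (r, c), numbers in rc_candidates.items():
        for n in numbers:
            rn[(r, n)].add(c)
    return dict(rn)


def rc_to_cn(rc_candidates):
    """Map (row, col) -> {numbers} to (col, number) -> {rows}."""
    cn = defaultdict(set)
    for (r, c), numbers in rc_candidates.items():
        for n in numbers:
            cn[(c, n)].add(r)
    return dict(cn)


# Hypothetical candidates for three unsolved cells of row r8.
toy_row = {(8, 4): {5, 7}, (8, 6): {5, 6, 7}, (8, 7): {5, 6, 7}}

rn_view = rc_to_rn(toy_row)
cn_view = rc_to_cn(toy_row)
print(rn_view[(8, 6)])   # columns still possible for number 6 in row r8
```

In this toy example the rn-cell r8n6 has only two column candidates although neither of the rc-cells containing 6 is bivalue; this is the kind of situation in which a chain can be hidden in rc-space and visible in rn-space.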
Let us consider xytk-chain*, the pattern for a full xyt-chain of length k. The
XYTk rule asserts that the universal closure of the following formula is valid:
"xytk-chain* => not-candidate(n, r, c)".
Proof: apply the same proof as for XYk. It is enough to check that the starred
chain pattern xytk-chain* is block-positive; and this can be done easily from its
definition: the only non block-free predicates it contains are "share-a-unit" and
they never appear in the scope of a negation.
Let us apply the previous theorem to list the first hxyt-chain rules:
and so on. The pattern is clear. How far should we go? The answer is the same as for
xy-chains. In our SudoRules solver, hxyt-chains have been implemented up to
length thirteen.
As was the case for xy- and hxy- chain rules, it can easily be seen that xyt- and
hxyt- chain rules are related as described in Figure 1.
[Figure 1: diagram relating the xyt-chain rules to the HXYT-rnk and HXYT-cnk rules through the Srn and Scn transformations]
The examples in this section will prove that hxyt-chain rules of any length at
least up to length thirteen are not subsumed by simpler rules in our hierarchy.
[Figure: puzzle grid, its elaboration and its solution]
Our second example, puzzle Royle17-20059 (Figure 3), has the advantage of
showing a c4-chain and a hxyt4-chain living at the same time on the same grid.
[Figure 3: puzzle Royle17-20059, its elaboration and its solution]
hxy-rn5-chain {c5 c2}r2n3 – {c2 c6}r5n3 – {c6 c3}r6n3 – {c3 c7}r6n4 – {c7 c5}r8n4 ==>
r2c5 ≠ 4
hxyt-rn5-chain {c5 c7}r8n4 – {c7 c5}r8n9 – {c5 c1}r1n9 – {c1 c6}r9n9 – {c6 c5}r9n6 ==>
r9c5 ≠ 4
c4-chain row-col-row n4r2{c2 c6} – n4r9{c6 c1} ==> r7c2 ≠ 4
block b7 interaction-with-column c1 ==> r1c1 ≠ 4
swordfish-in-rows n4{r1 r6 r8}{c5 c3 c7} ==> r5c7 ≠ 4
hxy-rn4-chain {c2 c1}r7n3 – {c1 c8}r7n4 – {c8 c9}r3n4 – {c9 c2}r5n4 ==> r5c2 ≠ 3
...(Naked-Singles and Hidden-Singles)
[Two figures: puzzle grids, their elaborations and their solutions]
With our third example, puzzle Royle17-8530 (Figure 6), after a few simple
rules have been applied to the L4+XY5+HXY5+XYT5 elaboration (which coincides
with the L4 elaboration), three chains of length five and of different types (hxy-rn5,
hxy-cn5 and hxyt-rn5) appear to live at the same time on the same grid:
[Figure 6: puzzle Royle17-8530, its elaboration and its solution]
hxyt-rn5-chain {c6 c7}r8n6 – {c7 c9}r8n8 – {c9 c5}r4n8 – {c5 c4}r9n8 – {c4 c6}r9n6 ==>
r6c6 ≠ 6
...(Naked-Singles and Hidden-Singles)
[Figure: puzzle grid, its elaboration and its solution]
Our second example, puzzle Royle17-3340 (Figure 8), is interesting for two
reasons:
– its L5+XY6+HXY6+C6+XYT6 and its L2 elaborations coincide and their
solution requires no complex rule apart from (c4 and) hxyt-rn6;
– it has only a small sequence of final Naked-Singles (17); indeed, before these,
there is a long series of 27 NS and 4 HS, as generally occurs when the solution is
close, but an xy4 has to be applied before the 17 final NS can effectively lead to it.
[Two figures: puzzle grids, their elaborations and their solutions]
As usual, the two paths have a common part (in L2), but it is very short in the
present case:
As the display of the two resolution paths shows, which one should be preferred
in this case is a matter of taste. In both cases, the same hxyt-rn7 chain produces the
same elimination.
The first steps of this example are another illustration, for hxy-cn4-chains, of
what we saw in section XV.3.5 for hxy-rn6-chains: on the same set of cells,
different orderings can lead to different hxy-chains (of course, this is also true for
xy-chains; it is less likely for non reversible chains, like xyt, hxyt, xyzt or hxyzt;
nevertheless, see the example in section 2.6 below).
Figure 10. Puzzle Royle17-1092, its L1_0 elaboration and its solution
Near the end of this common path (forgetting as usual the final NS and HS), one
can see three hxyt-rn-chains, one of length seven and two of length nine, living at
the same time on the grid (with no cell in common). As was the case in previous
examples of chains of types xy or hxy, the two hxyt9-chains consist of the same
cells, taken in a different order.
[Figure: puzzle grid, its elaboration and its solution]
hxyt-rn9-chain {c1 c6}r9n8 – {c6 c9}r2n8 – {c9 c8}r2n6 – {c8 c3}r8n6 – {c3 c2}r6n6 –
{c2 c7}r7n6 – {c7 c9}r4n6 – {c9 c8}r4n9 – {c8 c1}r8n9 ==> r8c1 ≠ 8
...(Naked-Singles and Hidden-Singles)
[Two figures: puzzle grids, their elaborations and their solutions]
xyt6-chain {n6 n4}r7c6 – {n4 n1}r7c7 – {n1 n2}r7c9 – {n2 n6}r7c1 – {n6 n2}r8c2 –
{n2 n6}r8c5 ==> r9c5 ≠ 6
xyt6-chain {n1 n4}r7c7 – {n4 n6}r7c6 – {n6 n2}r7c1 – {n2 n6}r8c2 – {n6 n5}r8c7 –
{n5 n1}r2c7 ==> r5c7 ≠ 1
hxyt-rn7-chain {c9 c7}r8n5 – {c7 c5}r8n4 – {c5 c2}r8n6 – {c2 c3}r3n6 – {c3 c2}r3n8 –
{c2 c3}r5n8 – {c3 c9}r5n3 ==> r5c9 ≠ 5
hxyt-cn10-chain {r5 r6}c9n3 – {r6 r5}c3n3 – {r5 r3}c3n8 – {r3 r5}c2n8 – {r5 r6}c2n9 –
{r6 r4}c2n1 – {r4 r1}c2n5 – {r1 r8}c2n2 – {r8 r7}c9n2 – {r7 r5}c9n1 ==> r5c9 ≠ 7
…(Naked-Singles and Hidden-Singles)
Puzzle Sudogen0-7875 (Figure 14) proves that HXYT12 is not subsumed by the
rules in L11+XY12+HXY12+XYT12, i.e. that HXYT12 is not superfluous. Its
L11+XY12+HXY12+XYT12 elaboration reduces to its L1_0 elaboration and it
adds only five new values to the original. In its resolution path, after a few simple
rules, three chains of lengths five, seven and eight (hxy-rn5, xyzt7 and xyt8) appear,
followed by a hxyt-cn12-chain. After that, only rules in L1_0 are necessary.
Figure 14. Puzzle Sudogen0-7875, its L1_0 elaboration and its solution
xyt8-chain {n7 n1}r5c2 – {n1 n5}r2c2 – {n5 n2}r7c2 – {n2 n3}r8c1 – {n3 n5}r9c1 –
{n5 n3}r9c7 – {n3 n8}r5c7 – {n8 n7}r5c8 ==> r5c4 ≠ 7
hxyt-cn12-chain {r9 r8}c1n3 – {r8 r4}c9n3 – {r4 r3}c5n3 – {r3 r5}c6n3 – {r5 r3}c6n5 –
{r3 r8}c6n4 – {r8 r7}c6n2 – {r7 r4}c2n2 – {r4 r5}c2n7 – {r5 r2}c2n1 – {r2 r7}c2n5 –
{r7 r9}c7n5 ==> r9c7 ≠ 3
…(57 Naked-Singles and Hidden-Singles)
Finally, puzzle Royle17-2995 (Figure 15) proves that HXYT13 is not subsumed
by the rules in L12+XY13+HXY13+XYT13, i.e. that HXYT13 is not superfluous.
The L12+XY13+HXY13+XYT13 and L7 elaborations coincide.
[Figure 15: puzzle Royle17-2995, its elaboration and its solution]
Resolution path in L13 for the L12+XY13+HXY13+XYT13 (or L7) elaboration of
Royle17-2995:
row r1 interaction-with-block b1 ==> r3c2 ≠ 7, r2c2 ≠ 7, r3c2 ≠ 6
column c3 interaction-with-block b7 ==> r9c2 ≠ 2, r8c2 ≠ 2
naked-pairs-in-a-row {n3 n7}r3{c5 c6} ==> r3c9 ≠ 7, r3c8 ≠ 7, r3c7 ≠ 7
row r3 interaction-with-block b2 ==> r2c6 ≠ 7, r2c5 ≠ 7
hidden-pairs-in-a-row {n3 n5}r7{c3 c9} ==> r7c9 ≠ 7, r7c9 ≠ 6, r7c9 ≠ 2, r7c3 ≠ 8
row r7 interaction-with-block b8 ==> r8c5 ≠ 8
hidden-pairs-in-a-row {n3 n5}r7{c3 c9} ==> r7c3 ≠ 6, r7c3 ≠ 2
hidden-pairs-in-a-block {n3 n6}{r4c6 r6c5} ==> r6c5 ≠ 8, r6c5 ≠ 7, r6c5 ≠ 5, r4c6 ≠ 7,
r4c6 ≠ 5
block b5 interaction-with-row r5 ==> r5c8 ≠ 5, r5c2 ≠ 5
hidden-pairs-in-a-block {n3 n6}{r4c6 r6c5} ==> r4c6 ≠ 2
hxy-rn5-chain {c8 c6}r5n2 – {c6 c5}r5n5 – {c5 c6}r2n5 – {c6 c5}r2n9 – {c5 c8}r8n9 ==>
r8c8 ≠ 2
xyt5-chain {n7 n6}r1c2 – {n6 n3}r9c2 – {n3 n5}r7c3 – {n5 n8}r8c2 – {n8 n7}r5c2 ==>
r6c2 ≠ 7
xyt5-chain {n8 n7}r5c2 – {n7 n6}r1c2 – {n6 n3}r9c2 – {n3 n5}r7c3 – {n5 n8}r8c2 ==>
r6c2 ≠ 8
xyt5-chain {n5 n3}r7c3 – {n3 n6}r9c2 – {n6 n7}r1c2 – {n7 n8}r5c2 – {n8 n5}r8c2 ==>
r8c3 ≠ 5
xyt5-chain {n3 n6}r9c2 – {n6 n7}r1c2 – {n7 n8}r5c2 – {n8 n5}r8c2 – {n5 n3}r7c3 ==>
r9c3 ≠ 3
xyzt8-chain {n7 n8}r6c4 – {n8 n5}r5c5 – {n5 n9}r2c5 – {n9 n6}r8c5 – {n6 n3}r6c5 –
{n3 n5}r6c2 – {n5 n8}r8c2 – {n8 n7}r5c2 ==> r5c6 ≠ 7
hxyt-rn13-chain {c2 c9}r9n3 – {c9 c3}r7n3 – {c3 c6}r4n3 – {c6 c5}r3n3 – {c5 c6}r3n7 –
{c6 c8}r9n7 – {c8 c6}r9n9 – {c6 c5}r2n9 – {c5 c6}r2n5 – {c6 c5}r5n5 – {c5 c2}r5n7 –
{c2 c3}r1n7 – {c3 c2}r1n6 ==> r9c2 ≠ 6
naked singles ==> r9c2 = 3, r7c3 = 5, r7c9 = 3, r6c2 = 5
xyt5-chain {n7 n8}r5c2 – {n8 n6}r8c2 – {n6 n9}r8c5 – {n9 n5}r2c5 – {n5 n7}r5c5 ==>
r5c8 ≠ 7
hxy-rn5-chain {c3 c5}r6n3 – {c5 c6}r3n3 – {c6 c5}r3n7 – {c5 c2}r5n7 – {c2 c3}r1n7 ==>
r6c3 ≠ 7
xyt6-chain {n7 n8}r6c4 – {n8 n3}r6c3 – {n3 n6}r6c5 – {n6 n9}r8c5 – {n9 n5}r2c5 –
{n5 n7}r5c5 ==> r4c4 ≠ 7
xyzt5-chain {n2 n8}r5c8 – {n8 n6}r3c8 – {n6 n7}r6c8 – {n7 n8}r6c4 – {n8 n2}r4c4 ==>
r4c8 ≠ 2
xyt8-chain {n2 n8}r5c8 – {n8 n7}r5c2 – {n7 n5}r5c5 – {n5 n2}r5c6 – {n2 n8}r4c4 –
{n8 n7}r6c4 – {n7 n6}r6c8 – {n6 n2}r3c8 ==> r9c8 ≠ 2
hxyt-rn7-chain {c8 c6}r9n9 – {c6 c5}r2n9 – {c5 c8}r8n9 – {c8 c9}r8n5 – {c9 c3}r8n2 –
{c3 c9}r9n2 – {c9 c8}r9n7 ==> r9c8 ≠ 6
hxyt-rn7-chain {c9 c7}r2n7 – {c7 c2}r2n4 – {c2 c9}r2n2 – {c9 c3}r8n2 – {c3 c6}r9n2 –
{c6 c8}r9n9 – {c8 c9}r9n7 ==> r4c9 ≠ 7
xyzt10-chain {n2 n8}r4c4 – {n8 n7}r6c4 – {n7 n5}r5c5 – {n5 n9}r2c5 – {n9 n6}r8c5 –
{n6 n8}r8c2 – {n8 n2}r8c3 – {n2 n6}r9c3 – {n6 n7}r9c9 – {n7 n2}r2c9 ==> r4c9 ≠ 2
xyt11-chain {n2 n8}r4c4 – {n8 n7}r6c4 – {n7 n5}r5c5 – {n5 n9}r2c5 – {n9 n6}r8c5 –
{n6 n8}r8c2 – {n8 n2}r8c3 – {n2 n5}r8c9 – {n5 n6}r4c9 – {n6 n8}r6c8 – {n8 n2}r5c8 ==>
r5c6 ≠ 2
...(Rules in L1)…
Whether any human being can find such long hxyt-chains (or xyt-chains) will be
left as an open question. What is certain is that it is much more likely to happen with
the rn-representation (Figure 16) than without it, because the corresponding
situation in ordinary rc-space is much more complex.
Figure 16. Puzzle Royle17-2995, seen in rn-space, just before the hxyt13-chain rule
It should be noted that, although this chain is very long (thirteen cells), only two
cells have a (single) additional (column) candidate: c9 is allowed in r9n7 because
this rn-cell shares a row with (has the same row-coordinate as) rn-cell 1 (r9n3), in
which c9 appears to be the right-linking candidate; c8 is allowed in r5n7 because
this rn-cell ("shares a number with" i.e.) has the same number-coordinate as rn-cell
6 (r9n7), in which c8 appears to be the right-linking candidate. Moreover, this long
chain can be considered as composed of three simpler autonomous partial chains: a
partial hxyt-rn6 chain on the first six cells, a partial hxyt-rn6 chain on cells 6 to 11
and a partial hxy-rn chain on cells 12 and 13.
Chapter XIX. xyz- and xyzt- chains
XIX.1.1. xyz-chains
– the left-linking candidate for each cell (but the first) is equal to the
right-linking candidate for the previous cell (as is the case for xy-chains);
– one and only one of the internal cells of the chain (i.e. not the endpoints) may
have a third candidate (called a z-candidate), equal to the left-linking candidate for
the first cell (the target value of a full xyz-chain), but no other extra candidate.
Notice that a cell may contain the target value as a left or a right-linking candidate,
but then it is not counted as a third candidate (if there is no other extra candidate in
the chain, it is then a pure xy-chain).
Definitions:
– a full xyz-chain is an xyz-chain such that the right-linking candidate for the last
cell equals the left-linking candidate for the first cell;
– the target number of a full-xyz-chain is the left-linking candidate for the first
cell, which is equal to the right-linking candidate for the last cell (as is the case for
xy-chains);
– a target cell of a full-xyz-chain is any cell that is linked to both endpoints of
the chain and to the unique cell in the chain having three candidates.
Theorem XIX.1 (constraints propagation rule for full xyz-chains): given a full
xyz-chain with xyz-chain target value n, eliminate n from the candidates for any
of its xyz-target cells.
Proof of the rule for a full xyz4-chain: let the cells in the chain be C1, C2, C3, C4;
let the successive left candidates be n1, n2, n3, n4, so that the target variable is n1 and
the successive right candidates are n2, n3, n4, n1.
There are two types of xyz4-chains, depending on which of the second or the
third cell has an extra candidate and a link to the target cells; to these two cases,
there correspond two symbolic representations:
– xyz4-chain_type-1 rule:
Consider for instance the second case and let TC be any xyz4-target-cell, i.e. TC
shares a unit with each of C1, C3 and C4 – and is therefore different from these three
cells.
The proof of the theorem parallels the proof of the xy4-chain rule in section
XII.2.3 until, in the second branch of the alternative for C1 (i.e. in the hypothesis
C1 = n2), we reach cell C3.
Cell C1 can take only two values (hypothesis n2 ≠ n1 is essential for this
assertion). Let us consider each possibility in turn:
– if C1 = n1, then TC cannot be n1 since it shares a unit with C1;
– if C1 = n2, then C2 cannot be n2 since it shares a unit with C1; it must therefore
be n3 (hypothesis n3 ≠ n2 is essential for this conclusion). Therefore C3 cannot be n3
since it shares a unit with C2.
Here comes the main difference with a simple xy4-chain: there remain two
possibilities for C3 instead of one: n1 or n4 (and it is essential for this conclusion
that we have the three inequalities: n4 ≠ n3, n1 ≠ n3 and n1 ≠ n4). Let us consider
each possibility in turn:
– if C3 = n1, then TC cannot be n1 since it shares a unit with C3; (this is where
the additional constraint on target cells is useful);
– if C3 = n4, then C4 cannot be n4 since it shares a unit with C3; it must therefore
be n1 (hypothesis n1 ≠ n4 is essential for this conclusion); and TC cannot be n1 since
it shares a unit with C4.
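The case analysis above can be cross-checked by exhaustive enumeration. The Python sketch below is an illustration under stated assumptions, not the author's method: it takes the second type of xyz4-chain with n1, ..., n4 = 1, ..., 4, keeps only the four chain cells and the three links between consecutive cells, and tests whether every assignment compatible with these links puts the target value in at least one cell linked to TC. The names consistent_assignments and rule_holds are ours.

```python
from itertools import product


def consistent_assignments(candidates, links):
    """Yield every choice of one candidate per cell in which linked cells differ."""
    cells = sorted(candidates)
    for values in product(*(sorted(candidates[c]) for c in cells)):
        assignment = dict(zip(cells, values))
        if all(assignment[a] != assignment[b] for a, b in links):
            yield assignment


def rule_holds(candidates, links, cells_linked_to_target, target_value):
    """True if each consistent assignment puts target_value in a cell linked to TC."""
    return all(
        any(assignment[c] == target_value for c in cells_linked_to_target)
        for assignment in consistent_assignments(candidates, links)
    )


# xyz4-chain, second type: the extra candidate n1 = 1 sits in C3.
candidates = {"C1": {1, 2}, "C2": {2, 3}, "C3": {3, 4, 1}, "C4": {4, 1}}
links = [("C1", "C2"), ("C2", "C3"), ("C3", "C4")]

with_c3 = rule_holds(candidates, links, ["C1", "C3", "C4"], 1)
without_c3 = rule_holds(candidates, links, ["C1", "C4"], 1)
print(with_c3, without_c3)
```

Under these assumptions the first check succeeds and the second fails: the assignment C1 = 2, C2 = 3, C3 = 1, C4 = 4 leaves neither endpoint equal to 1, which is why a target cell must also be linked to C3.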
The proof for longer xyz-chains is similar, wherever the additional candidate and
link to the target cell are situated. We just have to do the appropriate number of
inferences in the second branch of the above alternative concerning values of cell C1
(i.e. in the hypothesis C1 = n2).
Notice that, as was the case for pure xy-chains, what we actually proved in the
branch of the alternative with C1 = n2 is that, if there is a compatible target cell, all
the cells in the chain are equal to their right-linking candidate. We therefore have:
Theorem XIX.2 (general theorem for not necessarily full xyz-chains): given
an xyz-chain with target value n1, if there is at least one cell TC linked to all the
starred cells of this chain, then, for each such cell, either TC ≠ n1 or the value of
each cell in the chain is its right-linking candidate.
– xyz3-chain rule:
Let us show that, apart from special cases of Naked-Triplets, this general pattern
for xyz3-chains covers exactly the classical cases of XYZ-Wing described in detail
in chapter X, so that there is no reason actually to add this rule to our rule
base. Remember our conventions: u1 and u2 are the names for the first two links in a
chain.
If there are links u’1 and u’2 (possibly different from u1 and u2) between the same
three cells, such that u’1 = u’2, then we still have an xyz-chain on these cells with
these new links and, due to theorems XI.3 and XI.6, the conditions and conclusions
of the rule are those of a Naked-Triplets.
We can therefore assume in the sequel that u1 ≠ u2. And we have to consider
only two cases: u1 is a row and u2 is a column; u1 is a row and u2 is a block; all other
cases can be deduced from these two by symmetry.
– the first case (u1 is a row and u2 is a column) is impossible (more precisely, it
would entail that the three cells and the target cell share a block).
– the second case (u1 is a row and u2 is a block) corresponds to xyz-wing-rows-
blocks. The only thing we have to check is that the cells sharing a link with the three
cells of the chain are in row r1 and in block b2.
– xyz4-chain_type-1 rule:
– xyz5-chain_type-1 rule:
– xyz5-chain_type-2 rule:
– xyz6-chain_type-1 rule:
– xyz6-chain_type-2 rule:
We leave it to the reader as a very easy exercise to write similar rules for
xyz-chains of length seven or more. It is obvious that, for any chain length k, there
are formally k-2 rules for xyz-chains of length k but only IP((k-1)/2) logically
non-equivalent such rules (where IP stands for "integer part").
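The two counts can be reproduced with a few lines of Python. This sketch assumes that two xyz-chain rules of the same length are logically equivalent exactly when one is the mirror image of the other (the chain read from its last cell to its first); this is our reading of the remark above and is not stated explicitly in the text. The function name xyz_rule_counts is hypothetical.

```python
def xyz_rule_counts(k):
    """Return (formal rules, rules up to mirror image) for xyz-chains of length k."""
    positions = range(2, k)  # the z-candidate may sit in any internal cell 2 .. k-1
    # Reading the chain backwards sends position p to position k + 1 - p.
    classes = {min(p, k + 1 - p) for p in positions}
    return len(positions), len(classes)


for k in range(3, 9):
    formal, distinct = xyz_rule_counts(k)
    print(k, formal, distinct)
```

For k = 4, 5 and 6 this gives one, two and two distinct rules respectively, in agreement with the formula IP((k-1)/2).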
From the definition of xyz-chains, it appears that we have decided not to
consider target cells having explicitly four or more links with cells in the chain.
For such a target cell, two of these links would be of the same type and therefore
identical, which means that some additional link of this type would already be
present between the two cells in the chain. Therefore, instead of adding constraints
on the target cells, we (partially) take care of such situations with the types of
chains (xyzt) introduced below. Indeed, xyz-chains are subsumed by the more complex
xyzt-chains. They had been implemented in the first versions of SudoRules but they
are not considered independently in our classification results in chapter XXII.
XIX.2.1. xyzt-chains
One can combine the ideas of xyt-chains and xyz-chains and obtain xyzt-chains.
xyzt-chains are obtained from xyt-chains in exactly the same way as xyz-chains are
obtained from xy-chains.
more than one additional t-candidate in each cell; (up to this point, the conditions
are those of xyt-chains);
– moreover, one of the internal cells of the chain (i.e. not its endpoints), whether
or not it already has extra t-candidates from the previous origin, may have one more
candidate (called a z-candidate), equal to the left-linking candidate for the first cell
(the target value of a full xyt-chain); notice that a cell may contain the target value
as a left or a right-linking candidate, but then it is pointless to add it as an extra
z-candidate (it would be a pure xyt-chain).
Definitions:
– a full xyzt-chain is an xyzt-chain such that the right-linking candidate for the
last cell equals the left-linking candidate for the first cell;
– the target number of a full-xyzt-chain is the left-linking candidate for the first
cell, which is equal to the right-linking candidate for the last cell (as is the case for
xy-chains);
– a target cell of a full-xyzt-chain is any general target cell with one link added
to the cell where the fourth condition applies.
Theorem XIX.4 (constraints propagation rule for full xyzt-chains): given a full
xyzt-chain with xyzt-chain target value n, one can eliminate n from the candidates
for any of its xyzt-target cells.
Proof of the rule for a full xyzt4-chain: let the cells in the chain be C1, C2, C3, C4;
let the successive left-linking candidates be n1, n2, n3, n4, so that the target variable is
n1 and the successive right-linking candidates are n2, n3, n4, n1.
As for xyz4-chains, there are two possible generic cases of xyzt4-chains,
according to the cell receiving the extra target value. The corresponding symbolic
representations follow:
{1 2}*—{2 3 1}*—{3 4 (2#1)}—{4 1 (2#1) (3#2)}* and
{1 2}*—{2 3}—{3 4 (2#1) 1}*—{4 1 (2#1) (3#2)}*
Let us prove the rule in the second case (the first case is simpler). The proof is a
combination of the proofs for xyz- and xyt- chains. It parallels the proof of the xyt4-
chain rule in section XVII.1.2 until, in the second branch of the alternative for C1
(i.e. in the hypothesis C1 = n2), we reach cell C3. We just have to be careful about
the tree of hypotheses.
Cell C1 can take two and only two values (hypothesis n2 ≠ n1 is essential for this
assertion). Let us consider each possibility in turn:
1) if C1 = n1, then TC cannot be n1 since it shares a unit with C1 (notice that
hypothesis TC ≠ C1 is essential here);
2) if C1 = n2, then C2 cannot be n2 since it shares a unit with C1; it must therefore
be n3 (hypothesis n3 ≠ n2 is essential for this conclusion). Therefore, C3 cannot be n3
since it shares a unit with C2.
At this point of divergence with the proof for xyt4-chains, there remain not two
but three possibilities for cell C3: either C3 = n4 or C3 = n2 or C3 = n1 (this makes
sense only if we assume n2 ≠ n4, n1 ≠ n3 and n1 ≠ n4, i.e. n2 and n1 are effectively
additional values in C3); let us consider them in turn:
2a) the second possibility (C3 = n2) is present only when C3 is linked to C1,
which makes it inconsistent with the current hypothesis C1 = n2;
2b) the third possibility (C3 = n1) directly entails that TC cannot be n1, since it
shares a unit with C3; (this is where the additional constraint on target cells is
useful);
2c) as for the first possibility (C3 = n4), it entails that C4 cannot be n4, since it
shares a unit with C3, and C4 can a priori be either n1 or n2 or n3; let us consider the
three possibilities in turn:
2cα) in the first case (C4 = n1), TC cannot be n1 since it shares a unit with C4;
2cβ) the second case (C4 = n2), which makes sense only if we assume n2 ≠ n4, i.e.
n2 is effectively an additional value in C4, can be present only when C4 is linked to
C1, which makes this possibility inconsistent with the current hypothesis C1 = n2;
2cγ) the third case (C4 = n3) can be present only if C4 is linked to C2, which
makes this possibility inconsistent with the conclusion C2 = n3 already reached from
the current hypothesis C1 = n2.
Remarks:
– the proof for longer xyzt-chains, wherever we put the additional target value
and link, is completely similar;
Notice that, as was the case for pure xy-chains, what we actually proved in the
branch of the alternative with C1 = n2 is that, if there is a compatible target cell, all
the cells in the chain are equal to their right-linking candidate. We therefore have:
Theorem XIX.5 (general theorem for not necessarily full xyzt-chains): given
an xyzt-chain with target value n1, if there is at least one cell TC linked to all the
starred cells of this chain, then, for each such cell, either TC ≠ n1 or the value of
each cell in the chain is its right-linking candidate.
All the xyzt-chain rules of any given length are easily written starting with the
xyt-chain rule of the same length. To each rule for xyt-chains of length k, there
correspond k-2 rules for xyzt-chains of length k. (The logical equivalences that
existed among xyz-chains are not preserved here, due to the non-symmetrical nature
of the underlying xyt-chains).
The general principle should now be clear, as should be the pattern of increasing
complexity with length. How far should we go? Same question and same non-
answer as for all the types of chains we have already met. In SudoRules, xyzt-chain
rules have been implemented up to length ten. As the examples below show, this is
not superfluous.
As, for each length k, there are k-2 types of xyzt-chains, we shall give up any
idea of completeness in the independence results and we shall propose examples
neither for all lengths nor for all types of xyzt-chains.
[Figures: three example puzzles, each shown as three 9x9 grids (givens, partially solved state, solution); the digits are garbled in this copy and are omitted.]
[Figure: grid of remaining candidates for the starting situation, with rows r1 to r9 and columns c1 to c9; the candidate layout is garbled in this copy and is omitted.]
It is worth analysing the starting situation in some detail. Notice that the first two
xyzt5-chains reside on the same five cells, but ordered differently and with different
target values and target cells. This example is ideal for understanding how the "t, z
and zt extensions" work. Let us look successively at the three chains, with the full
content of their cells, their target values (TV) and their target cells (TC). Notice that,
for the first two chains, although they lie on the same set of cells, the same
additional value in the same cell (e.g. n4 in r9c8) requires different justifications.
[Figures: two example puzzles, each shown as three 9x9 grids (givens, partially solved state, solution); the digits are garbled in this copy and are omitted.]
Notice that, before the hxyt-rn5 rule was applied, the xyzt5-chain was not
present, but the xyzt7-chain was already present.
[Figure: example puzzle shown as three 9x9 grids (givens, partially solved state, solution); the digits are garbled in this copy and are omitted.]
Exercise: before the HXYZT-rn6-type-1 rule was applied, the xyzt5- and the xyzt6-
chains were not present (candidate 5 may not be present in cell r5c6), but the xyzt7-
chain was already present (no change of its candidates is produced by the three
preceding chain rules).
[Figure: example puzzle shown as three 9x9 grids (givens, partially solved state, solution); the digits are garbled in this copy and are omitted.]
Hidden xyz- (respectively hidden xyzt-) chains, or hxyz- (resp. hxyzt-) chains,
are the "hidden" counterpart of xyz- (resp. xyzt-) chains. They are to xyz- (resp. to
xyzt-) chains exactly what hxyt-chains are to xyt-chains or what hxy-chains are to
xy-chains. Roughly speaking, a hxyz- (respectively a hxyzt-) chain is defined as and
looks like an xyz- (resp. an xyzt-) chain, but in rn- or cn- instead of rc- space –
except that there are no links along 3x3 pseudo-blocks in these spaces; and the
eliminations it allows in rn- or cn- space are similar to those allowed in rc-space by
xyz- (resp. xyzt-) chains. Moreover, the "super-hidden" counterparts of xyz- and
xyzt- chains are identical to their "hidden" counterparts. As xyz-chains are
subsumed by xyzt-chains, we shall develop here only the hxyzt-chains.
hxyzt-chain rules are obtained from the block-free part of xyzt-chain rules, by
applying the Scn and Srn transformations.
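The change of space on which the Scn and Srn transformations rely can be illustrated on the candidates themselves. The following is only a minimal Python sketch, assuming a candidate is stored as a hypothetical triple (n, r, c); the function name and the small data set are illustrative and are not taken from SudoRules. It regroups the same candidates into rc-, rn- or cn-cells:

```python
from collections import defaultdict

def cells_in_space(candidates, space):
    """Group candidates (n, r, c) into the 2D cells of the given space.

    space is "rc", "rn" or "cn". The result maps each 2D cell to the set of
    values of the remaining coordinate that are still possible for it.
    """
    index = {"n": 0, "r": 1, "c": 2}
    a, b = index[space[0]], index[space[1]]
    other = ({0, 1, 2} - {a, b}).pop()
    cells = defaultdict(set)
    for cand in candidates:
        cells[(cand[a], cand[b])].add(cand[other])
    return dict(cells)

# Hypothetical fragment of a resolution state, restricted to row 1
candidates = {(8, 1, 3), (8, 1, 9), (5, 1, 3), (5, 1, 9), (2, 1, 3)}

rc = cells_in_space(candidates, "rc")  # rc-cell (r, c) -> possible numbers
rn = cells_in_space(candidates, "rn")  # rn-cell (r, n) -> possible columns
cn = cells_in_space(candidates, "cn")  # cn-cell (c, n) -> possible rows
```

In this fragment, rc-cell (1, 9) is bivalue in rc-space, while rn-cell (1, 8) is bivalue in rn-space: the same pattern definition can be read in either space once the candidates are regrouped.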
Let us apply the previous theorem to list the first hxyzt-chain rules:
Starting from the four rules for xyzt6-chains, we get eight rules for hxyzt6-
chains; and so on.
The pattern is clear. How far should we go? The answer is the same as for all the
previous types of chains. In our SudoRules solver, hxyzt-chains have been
implemented up to length seven. As the examples below show, this is not superfluous.
But, conversely, there may exist hxyzt-chains of greater length.
[Figure 1: diagram relating the xyzt-chain rules to the HXYZT-rnk and HXYZT-cnk rules through the Srn and Scn transformations.]
As was the case for xy- and hxy- chain rules, or for xyt- and hxyt- chain rules, it
can easily be seen that xyzt- and hxyzt- chain rules are related as described in Figure
1.
In particular, we have the useful practical consequence:
[Figures: two example puzzles, each shown as three 9x9 grids (givens, partially solved state, solution); the digits are garbled in this copy and are omitted.]
reason to introduce such a specific chain type, since it is subsumed by the general
xy-pattern.
[Figure: example puzzle shown as three 9x9 grids (givens, partially solved state, solution); the digits are garbled in this copy and are omitted.]
xy4-chain {n7 n9}r4c8 – {n9 n7}r4c4 – {n7 n9}r6c5 – {n9 n7}r1c5 ==> r1c8 ≠ 7
xyzt6-chain {n9 n3}r8c2 – {n3 n7}r2c2 – {n7 n9}r3c3 – {n9 n8}r1c1 – {n8 n3}r2c1 –
{n3 n9}r5c1 ==> r5c2 ≠ 9
hxyzt-cn6-chain {r9 r8}c3n8 – {r8 r5}c3n3 – {r5 r2}c1n3 – {r2 r1}c1n8 – {r1 r7}c9n8 –
{r7 r9}c4n8 ==> r9c8 ≠ 8
block b9 interaction-with-column c9 ==> r1c9 ≠ 8
hxyt-cn5-chain {r6 r1}c8n8 – {r1 r2}c1n8 – {r2 r5}c1n3 – {r5 r1}c1n9 – {r1 r6}c5n9 ==>
r6c8 ≠ 9
xyzt6-chain {n7 n9}r1c5 – {n9 n7}r6c5 – {n7 n8}r6c8 – {n8 n9}r6c7 – {n9 n8}r3c7 –
{n8 n7}r2c7 ==> r1c9 ≠ 7
...(Naked-Singles and Hidden-Singles)
[Figure: example puzzle shown as three 9x9 grids (givens, partially solved state, solution); the digits are garbled in this copy and are omitted.]
Two different hxy-cn6 chains appear, that share four of their six underlying cn-
cells. At the same time, there is a hxyzt-rn6-chain of type 1. After the HXYZT-rn6
rule has fired, a chain of type hxyt-rn6 appears. One can check that, before rule
HXYZT-rn6 eliminated candidate c5 from rn-cell r6n5, which is the second cell of
the hxyt6-chain, the presence of column-candidate c5 in this rn-cell prevented the
underlying sequence of rn-cells (r6n3, r6n5, r2n5, r4n5, r7n5 and r9n5) from being
the support of either a hxyt-rn6- or a hxyzt-rn6- chain.
[Figure: example puzzle shown as three 9x9 grids (givens, partially solved state, solution); the digits are garbled in this copy and are omitted.]
Classification results
As indicated in the introduction, we have chosen to test the resolution rules and
theories defined in this book (and the associated knowledge base implementing
them) against three collections of minimal puzzles with a single solution: the non
random Royle’s database of 36,628 non equivalent 17-minimal puzzles and two
randomly generated databases Sudogen0 and Sudogen17. Through all the examples
they contain, the previous chapters have already included some of the results thus
obtained. Nevertheless what has just been occasionally hinted at but is still missing
is a global view on the relative efficiencies of the various resolution rules. Providing
such a view is the purpose of this chapter. Detailed listings of the puzzles pertaining
to each cell in the following tables (thus fully justifying them) can be obtained on
the author’s Web pages. These listings can also be used to find additional examples
for all the rules introduced in this book.
All the rules defined in this book have been implemented in a knowledge base
that can be run with either the CLIPS or the JESS inference engines. Nevertheless
all the results in this book (including those below) have been obtained using CLIPS
(version 6.24). CLIPS is very slow when solving long series of puzzles (due to
referenced but unsolved memory management problems) but it is free software, whereas
JESS is much faster (on long series) but it is not free. Moreover, JESS has some
longstanding undocumented problems regarding management of saliences (i.e. of
the priorities between the rules); as a result, JESS misses some patterns and fails to
classify some puzzles in their proper place in our hierarchy, a crippling defect for
classification purposes.
Let us first consider the levels (L0 to L4_0) that have been defined based on the
most classical patterns, i.e. with no chain rules other than the very special cases of
XY-Wing and XYZ-Wing.
In the following tables, "new" (in "new grids solved") means relatively to the
grids solved using only the techniques preceding it in this classification:
New rules used            Number of    % of         Total number   % of
(relatively to all the    new grids    new grids    of grids       grids
previous lines)           solved       solved       solved         solved

NS                             0        0%               0          0%
+HS → L1_0                16,867       46.05%       16,867         46.05%
+RCiB                     10,363       28.29%       27,230         74.34%
+BiRC → L1                 1,233        3.37%       28,463         77.71%
+NP                        1,652        4.51%       30,115         82.22%
+HP                        1,155        3.15%       31,270         85.37%
+SHP → L2                     11        0.03%       31,281         85.40%
+NT                           24        0.07%       31,305         85.47%
+HT                           13        0.04%       31,318         85.50%
+SHT → L3_0                    3        0.01%       31,321         85.51%
+XY3                         850        2.32%       32,171         87.83%
+XYZ3 → L3                     9        0.02%       32,180         87.86%
+N4                            0        0%          32,176         87.85%
+H4                            0        0%          32,176         87.85%
+SH4 → L4_0                    3        0.01%       32,179         87.85%
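The percentages and running totals of the first rows of this table can be recomputed from the "new grids solved" column alone. The following Python sketch is only a cross-check under stated assumptions: the collection size of 36,628 is the one given in the text for the Royle database, the row list copies the counts of the first seven rows, and the helper name `cumulative` is hypothetical.

```python
TOTAL = 36628  # size of the Royle collection, as stated in the text

# (rule added, number of new grids solved), first rows of the table
rows = [("NS", 0), ("+HS", 16867), ("+RCiB", 10363), ("+BiRC", 1233),
        ("+NP", 1652), ("+HP", 1155), ("+SHP", 11)]

def cumulative(rows, total):
    """Return (rule, new, % new, running total, % running total) per row."""
    out, running = [], 0
    for rule, new in rows:
        running += new
        out.append((rule, new, round(100 * new / total, 2),
                    running, round(100 * running / total, 2)))
    return out

table = cumulative(rows, TOTAL)
for line in table:
    print(line)
```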
– there are very few puzzles solved thanks to the addition of rules for Triplets or
Quadruplets (whether they are Naked, Hidden or Super-Hidden).
Regarding levels L4_0 to L13, relative to the addition of the chain rules for all
the types of chains defined in this book, let us first give global results, based on the
length of the chains allowed at each level.
First, let us synthesise what we said in the various chapters relative to chain
rules. Depending on the levels (i.e. on the length of the chains), only certain types of
chains have been considered:
– levels L4 to L7: XY, HXY, C, XYT, HXYT, XYZT, HXYZT (all types)
– levels L8 to L9: XY, HXY, C, XYT, HXYT, XYZT (HXYZT discarded)
– level L10: XY, HXY, XYT, HXYT, XYZT (C10 discarded)
– levels L11 to L13: XY, HXY, XYT, HXYT (XYZT discarded).
There are several reasons for these limitations, all pertaining to the general
notion of "return on investment" (ROI) – in terms of programming time (often
related to computation time) versus number of new grids solved: for any type of chains,
the first increases with length (a global self-evident truth that has to be modulated by
the type of the chain for the details) while the following table shows that the last
(the number of new grids solved) decreases significantly with lengths greater than
seven or eight (this table also indicates that, with a little more work, it is likely we
would have solved a few additional puzzles, but the difference would not be very
significant). For more specific explanations on the ROI of each rule type, see our
comments after the tables in section 3.
The table below shows that there is still little difference between the two
randomly generated Sudogen collections.
It also shows that the Royle17 collection has a very strong bias against chain
rules, in particular of length greater than five or six. Whether this is due to the 17-
minimal property or to the (undisclosed) origins of Royle17 remains uncertain
(although we think the second hypothesis is the right one).
Let us now define a partial order on chain rules at levels above L4_0. A chain is
characterised by two factors: its type T and its length k. Lengths inherit from the
natural order on the integers, whereas types are ordered as follows:
XY < HXY < C < XYT < HXYT < XYZT < HXYZT.
Notice that we should write XY = HXY, XYT = HXYT and XYZT = HXYZT,
since these rules are related by supersymmetry. Nevertheless, as in section 1, for the
purpose of evaluating how much the hidden chains add to the solution of puzzles,
we have separated them when we established the following tables.
Then, the partial order on chains of length no less than four is defined by:
The following tables give the numbers of new grids solved, where "new" means
relatively to the grids solved using only the techniques preceding it in this
classification, being admitted once and for all that L4_0 is the root of the classification
and is included in every set of rules considered hereafter.
Notice that having a puzzle classified in (Ti, lk) does not mean that chains of type
Ti and length lk are necessary to solve it. It means precisely that:
– it cannot be solved using only chains of types Ti’ (i’ < i ) and length lk’ (k’ < k)
and either chains of type Ti’ (i’ ≤ i ) and length lk’ (k’ < k) or chains of types Ti’
(i’ < i ) and length lk’ (k’ ≤ k);
– it can be solved using a combination of the three types of chains above and,
possibly but not necessarily, chains of type Ti and length lk.
Table for Royle17 (each cell contains the number of puzzles solved among the
36,628 puzzles in the collection, followed, for easier comparison with the two
Sudogen cases, by the number of puzzles solved among 10,000):
Finally, we checked on the Sudogen0 database whether there was any correlation
between the number of entries of a (minimal) puzzle and the maximum level of the
rules necessary to solve it. Since the numbers of remaining puzzles rapidly become
small as the level increases and variations become meaningless, we did this
comparison for only our first three levels (L1 to L3). As the following table shows, there
does not seem to be any significant correlation. Similar statements have been made
frequently, but I have never seen any quantitative justification for them.
Nb of entries                  20    21    22    23    24    25    26    27    28    29
Total nb of puzzles             0    32   375  1758  3455  2872  1231   249    24     4
Puzzles solved in L1                 59%   48%   54%   55%   52%   56%   55%   54%   50%
Puzzles solved in [L1]+L2             0%    4%    6%    7%    6%    6%    6%    4%    0%
Puzzles solved in [L2]+L3             0%    3%    5%    5%    5%    5%    5%    0%    0%
Considering the results of this chapter, we shall adopt Sudogen0 as our reference
collection in the sequel.
Part Four
FULLY SUPERSYMMETRIC
(OR "3D") CHAIN RULES
Chapter XXII
3D chains: nrc-, nrct-, nrcz- and nrczt- chains
Until now, apart from c-chains, all the chains we have considered were defined
and could be spotted as (pure or extended) xy-chains in one of the two-dimensional
spaces rc, rn or cn (bn-space could have been used, but there are far fewer 2D
chains in this space than in the others). Nevertheless, we remarked that some way of
"knitting" conditions through the two-dimensional spaces was also needed. c-chains,
as the simplest example of such a knitting, had a special status in part Three (where
they were more or less misplaced). The present chapter introduces the
three-dimensional (3D) analogues of the xy-, xyt-, xyz- and xyzt- chains and shows that they
subsume all the previously defined chains. Of course, this does not mean that they
should replace them in the player’s arsenal and that we could forget everything that
was written in part Three: on the contrary, because instances of 3D chains are more
difficult to discover than those of 2D chains, it remains worth dealing with 2D chains
before 3D chains (or at least before 3D chains of the same length).
The general ideas underlying the various types of 3D chains remain the same as
for 2D chains, roughly speaking: any candidate that is contradictory with previous
right-linking candidates in the chain (or with the target cell) can be ignored as an
additional candidate whenever necessary (but it can still be used as a linking
candidate). In order to use this idea in the 3D view, we just have to slightly adapt its
formulation. This provides a second, pedagogical reason for beginning with the 2D
chains: 3D chains are a simple generalisation of them.
One can consider this chapter as the culmination of our general ideas on super-
symmetry. Contrary to the 2D chains, the various types of 3D chains defined below
are their own hidden and super-hidden counterparts.
In each of the 2D spaces, there are 2D cells: rc-cells in rc-space, rn-cells in rn-
space, cn-cells in cn-space and bn-cells in bn-space. And in each of these spaces,
there are associated notions of cells being linked. In order to avoid any ambiguities,
and because this is a major condition for a correct understanding of links in 3D-
space, let us recall all the definitions:
– two different rc-cells (r1, c1) and (r2, c2) are rc-linked in rc-space if and only if
they share a unit ("rc-linked" should not be confused with the narrower "rc-
connected" introduced in chapter IV for purely technical reasons), i.e. if and only if
they are in the same row or in the same column or in the same block;
– two different rn-cells (r1, n1) and (r2, n2) are rn-linked in rn-space if and only if
r1 = r2 or n1 = n2; (here, as the cells must be different, "or" is necessarily exclusive);
– two different cn-cells (c1, n1) and (c2, n2) are cn-linked in cn-space if and only
if c1 = c2 or n1 = n2; (here, as the cells must be different, "or" is necessarily
exclusive);
– two different bn-cells (b1, n1) and (b2, n2) are bn-linked in bn-space if and only
if b1 = b2 (and n1 ≠ n2); (notice the difference with the previous two cases: in bn-
space, there is no link along the b coordinate).
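The four definitions above translate directly into predicates on pairs of 2D cells. The following Python sketch is only an illustration: cells are represented as hypothetical pairs of integers from 1 to 9, and the row-by-row numbering of blocks in `block` is an assumption, not something fixed by the text.

```python
def block(r, c):
    """Block number (1..9) of rc-cell (r, c); row-by-row numbering is assumed."""
    return 3 * ((r - 1) // 3) + (c - 1) // 3 + 1

def rc_linked(cell1, cell2):
    """Different rc-cells sharing a row, a column or a block."""
    (r1, c1), (r2, c2) = cell1, cell2
    return cell1 != cell2 and (r1 == r2 or c1 == c2
                               or block(r1, c1) == block(r2, c2))

def rn_linked(cell1, cell2):
    """Different rn-cells with the same row or the same number."""
    (r1, n1), (r2, n2) = cell1, cell2
    return cell1 != cell2 and (r1 == r2 or n1 == n2)

def cn_linked(cell1, cell2):
    """Different cn-cells with the same column or the same number."""
    (c1, n1), (c2, n2) = cell1, cell2
    return cell1 != cell2 and (c1 == c2 or n1 == n2)

def bn_linked(cell1, cell2):
    """Same block and different numbers: no link along the b coordinate."""
    (b1, n1), (b2, n2) = cell1, cell2
    return b1 == b2 and n1 != n2
```

For instance, `rn_linked((1, 5), (2, 5))` holds (same number in two rows), whereas `bn_linked((1, 5), (2, 5))` does not, which is the asymmetry pointed out in the last item of the list.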
In this chapter, we shall consider the 3D nrc-space, in which there naturally are
nrc-cells, with coordinates (n, r, c). Notice that the 3D nrc-space can be considered
as mapped onto the usual 2D grid, with the n-coordinate being uniformly curled into
each rc-cell, i.e. with the same n-coordinate value always occupying the same place
in the rc-cell. The presence of a candidate on the grid is equivalent to an nrc-cell
being occupied. In order to distinguish a candidate from its underlying nrc-cell, we
introduce a special notation for candidates: nrc, n1r1c1, n2r2c2,… In this notation,
"nrc" merely stands as a shorthand for the atomic formula "candidate(n, r, c)".
We say that nrc-cell (n, r, c) is the underlying cell (i.e. nrc-cell) of candidate nrc.
Definition: two nrc-cells (n1, r1, c1) and (n2, r2, c2) are nrc-linked if they are
different and:
– either n1 = n2 and the two rc-cells (r1, c1) and (r2, c2) are rc-linked in rc-space,
– or n1 ≠ n2 and the rc-cells (r1, c1) and (r2, c2) are the same.
Definition: two candidates n1r1c1 and n2r2c2 are nrc-linked if they are different
and their underlying nrc-cells are nrc-linked, i.e. if:
– either n1 = n2 and the two rc-cells (r1, c1) and (r2, c2) are rc-linked in rc-space,
– or n1 ≠ n2 and the rc-cells (r1, c1) and (r2, c2) are the same.
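As a purely structural test, the definition of nrc-linked candidates can be sketched in a few lines of Python. This is a minimal illustration, assuming candidates are stored as hypothetical triples (n, r, c) and blocks are numbered row by row; the function names are not taken from SudoRules.

```python
def block(r, c):
    """Block number (1..9) of rc-cell (r, c); row-by-row numbering is assumed."""
    return 3 * ((r - 1) // 3) + (c - 1) // 3 + 1

def rc_linked(cell1, cell2):
    """Different rc-cells sharing a row, a column or a block."""
    (r1, c1), (r2, c2) = cell1, cell2
    return cell1 != cell2 and (r1 == r2 or c1 == c2
                               or block(r1, c1) == block(r2, c2))

def nrc_linked(cand1, cand2):
    """True if the two candidates (n, r, c) are different and nrc-linked."""
    (n1, r1, c1), (n2, r2, c2) = cand1, cand2
    if cand1 == cand2:
        return False
    if n1 == n2:
        # same number: the two rc-cells must share a unit
        return rc_linked((r1, c1), (r2, c2))
    # different numbers: they must be candidates for the same rc-cell
    return (r1, c1) == (r2, c2)
```

Note that the test uses only coordinates: it never looks at which of the two candidates will ultimately be true.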
In practice, only the definition for candidates is useful. But we started with a
definition of nrc-cells being nrc-linked in order to emphasise that two given
candidates being nrc-linked is a purely factual property of a knowledge state and that it is
quasi "physical" in the sense that it depends only on the grid structure (in addition,
of course, to the actual presence of the candidates). In particular, it does not depend
on the ultimate truth value of these candidates.
Obviously, if two candidates are nrc-linked and one of them is ultimately true,
then the other will be false: an nrc-link is indeed the most general, fully super-
symmetric support for the immediate detection of a contradiction between two
candidates. This is how an nrc-link can be used (for what is often called a "weak
inference" in the Sudoku literature), but this is not its purely factual definition.
Definition: two different candidates n1r1c1 and n2r2c2 are nrc-bivalue or nrc-
conjugate if they are nrc-linked and:
– either n1 ≠ n2, (r1, c1) and (r2, c2) are the same rc-cell, and the only candidates
for this cell are n1 and n2,
– or n1 = n2, (r1, c1) and (r2, c2) are different rc-cells, and there is a row, a column
or a block along which (r1, c1) and (r2, c2) are conjugate for n1 – i.e. in which n1 is a
candidate for only these two cells.
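Unlike nrc-linked, the nrc-bivalue predicate depends on which other candidates are present in the current knowledge state. The following Python sketch illustrates this under explicit assumptions: the state is a hypothetical set of triples (n, r, c), blocks are numbered row by row, and the function names are illustrative.

```python
def block(r, c):
    """Block number (1..9) of rc-cell (r, c); row-by-row numbering is assumed."""
    return 3 * ((r - 1) // 3) + (c - 1) // 3 + 1

def nrc_bivalue(cand1, cand2, candidates):
    """True if two present candidates are nrc-bivalue (nrc-conjugate).

    candidates is the set of all (n, r, c) candidates of the current state.
    """
    if cand1 == cand2 or not {cand1, cand2} <= candidates:
        return False
    (n1, r1, c1), (n2, r2, c2) = cand1, cand2
    if n1 != n2:
        # first case: same rc-cell, whose only candidates are n1 and n2
        if (r1, c1) != (r2, c2):
            return False
        return {n for (n, r, c) in candidates if (r, c) == (r1, c1)} == {n1, n2}
    # second case: same number, conjugate along a row, a column or a block
    pair = {(r1, c1), (r2, c2)}
    for key in (lambda r, c: r, lambda r, c: c, block):
        if key(r1, c1) == key(r2, c2):
            holders = {(r, c) for (n, r, c) in candidates
                       if n == n1 and key(r, c) == key(r1, c1)}
            if holders == pair:
                return True
    return False

# Hypothetical state: cell r1c1 = {1, 2}, cell r1c5 = {1, 3}
state = {(1, 1, 1), (2, 1, 1), (1, 1, 5), (3, 1, 5)}
```

Here candidates (1, 1, 1) and (2, 1, 1) are bivalue in rc-space, while (1, 1, 1) and (1, 1, 5) are conjugate along row r1; adding a third candidate for number 1 in row r1 destroys the second property but not the first.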
Here again, this defines a purely factual predicate, with a "physical" support (the
nrc-link), although its instances are (slightly) less directly visible on an actual grid
than those of "nrc-linked". When seen in rc-space, "nrc-bivalue" is the most general,
fully super-symmetric synthesis of the bivalue property of cells and the conjugacy
property of candidates in rc-space; informally, "nrc-bivalue" means "bivalue or
conjugate"; this is why it can also be named "nrc-conjugate". It also follows from
the interpretation of conjugacy in section IV.2.4 that "nrc-bivalue" or "nrc-
conjugate" means "bivalue in at least one of the rc-, rn-, cn- or bn- spaces".
As previously, this property can be used for inferences: one and only one of the
two candidates must be true (in particular, if one of them is false, then the other
must be true, which is often called a "strong inference" in the Sudoku literature).
Again, this is how it can be used for inference, but this is not its purely factual
definition.
The notion of a 3D chain that will be introduced in this section will lead to the
unification of two apparently conflicting views of chains: chains of cells and chains
of candidates.
Notice that the condition on all the candidates being different could be relaxed,
as in Part Three, to the only condition that the first and last be different. This would
amount to allowing inner loops. We shall see later that these are rarely useful.
Notice that the condition that the target does not belong to the chain could be
relaxed, but we shall see later that this would rarely be useful.
Of course, as for 2D chains, not all 3D chains are useful. The rest of this chapter
is devoted to useful types of 3D chains, i.e. chains that allow eliminations. All these
chains will appear as different generalisations of the basic xy-chains:
– nrc-chains as their 3D generalisation,
– nrct-chains as the t-relaxation of the nrc-chains and as the 3D generalisation of
the xyt- or hxyt- chains,
– nrcz-chains as the z-relaxation of the nrc-chains and as the 3D generalisation
of the xyz- or hxyz- chains,
– and nrczt-chains as the combined z- and t- relaxations of the nrc-chains and as
the 3D generalisation of the xyzt- or hxyzt- chains.
XXII.2. nrc-chains
Definitions:
– an nrc-chain of length n is a 3D chain of even length 2n such that, for any k
with 1 ≤ k ≤ n, the two candidates n_{2k-1}r_{2k-1}c_{2k-1} and n_{2k}r_{2k}c_{2k} are
nrc-bivalue (i.e. the odd links in the chain are nrc-bivalue links, while the even
links are mere nrc-links);
– because nrc-chains are the 3D-generalisation of xy-chains, odd candidates are
called left-linking candidates and even candidates are called right-linking
candidates; the n cells containing the successive groups of two conjugate candidates are
called the cells of the chain; they generally belong to different 2D spaces;
– a target of an nrc-chain is simply a general target of the underlying 3D chain;
notice that, as was the case for all our 2D chains, the links between elements of a 3D
chain and any of its targets are simple links (the only difference being that, in the
present case they are nrc-links).
Theorem XXII.1 (nrc-chain rule): given an nrc-chain, any of its targets can be
eliminated.
As was the case for the xy-chains, we have a more general theorem, independent
of any target:
Theorem XXII.2 (general nrc-chain rule): given a partial nrc-chain, either the
first left-linking candidate is true, or every right-linking candidate is true.
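The definition of an nrc-chain and the elimination allowed by Theorem XXII.1 can be sketched as follows in Python. This is only an illustration under stated assumptions: candidates are hypothetical triples (n, r, c), blocks are numbered row by row, and a target is taken to be any candidate outside the chain that is nrc-linked to both the first and the last candidates, which is how targets are used in the proof of section XXII.6. The example is an ordinary xy3-chain in rc-space, read as an nrc-chain.

```python
def block(r, c):
    """Block number (1..9) of rc-cell (r, c); row-by-row numbering is assumed."""
    return 3 * ((r - 1) // 3) + (c - 1) // 3 + 1

def nrc_linked(a, b):
    (n1, r1, c1), (n2, r2, c2) = a, b
    if a == b:
        return False
    if n1 != n2:
        return (r1, c1) == (r2, c2)
    return r1 == r2 or c1 == c2 or block(r1, c1) == block(r2, c2)

def nrc_bivalue(a, b, candidates):
    if a == b or not {a, b} <= candidates:
        return False
    (n1, r1, c1), (n2, r2, c2) = a, b
    if n1 != n2:
        if (r1, c1) != (r2, c2):
            return False
        return {n for (n, r, c) in candidates if (r, c) == (r1, c1)} == {n1, n2}
    pair = {(r1, c1), (r2, c2)}
    for key in (lambda r, c: r, lambda r, c: c, block):
        if key(r1, c1) == key(r2, c2):
            holders = {(r, c) for (n, r, c) in candidates
                       if n == n1 and key(r, c) == key(r1, c1)}
            if holders == pair:
                return True
    return False

def is_nrc_chain(chain, candidates):
    """Odd links (1st-2nd, 3rd-4th, ...) nrc-bivalue, even links nrc-links."""
    if len(chain) < 2 or len(chain) % 2 or len(set(chain)) != len(chain):
        return False
    for i in range(len(chain) - 1):
        a, b = chain[i], chain[i + 1]
        ok = nrc_bivalue(a, b, candidates) if i % 2 == 0 else nrc_linked(a, b)
        if not ok:
            return False
    return True

def targets(chain, candidates):
    """Assumed notion of target: linked to both endpoints, outside the chain."""
    return {x for x in candidates if x not in chain
            and nrc_linked(x, chain[0]) and nrc_linked(x, chain[-1])}

# Hypothetical state: r1c1 = {1, 2}, r1c5 = {2, 3}, r5c5 = {3, 1}, r5c1 = {1, 4}
state = {(1, 1, 1), (2, 1, 1), (2, 1, 5), (3, 1, 5),
         (3, 5, 5), (1, 5, 5), (1, 5, 1), (4, 5, 1)}
chain = [(1, 1, 1), (2, 1, 1), (2, 1, 5), (3, 1, 5), (3, 5, 5), (1, 5, 5)]
```

With this state, `is_nrc_chain(chain, state)` holds and `targets(chain, state)` contains only candidate (1, 5, 1), i.e. number 1 can be eliminated from r5c1.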
Notice that there is a close relationship between nrc-chains and what is known in
the Sudoku literature as Nice Loops (NLs) and Alternating Inference Chains (AICs)
– or at least the "basic" such chains (i.e. disregarding their extensions to sets of
candidates). nrc-chains may be merely a different view of such well known chains.
Nevertheless, there is such ambiguity and variation in the definitions of these chains
that I am still unable to know for certain whether they subsume nrc-chains. Once
NLs and AICs are translated into the present conceptual framework and targets are
extracted from these chains, the problem is:
– in the NL literature, the target is supposed to be a cell that "sees" both
endpoints, where "sees" means "shares-a-unit"; the targets are therefore less general
than those of nrc-chains;
– in the AIC literature, "sees" means sometimes "shares-a-unit" and sometimes
"nrc-linked";
– the vocabulary of "weak links" and "strong links" varies with author, time and
seemingly also with meteorological conditions.
Proof: the proof goes along the same lines as that for xy-chains, with links
replaced by nrc-links.
XXII.3. nrct-chains
nrct-chains are the 3D generalisation of xyt- and hxyt- chains, based on the same
general idea: any candidate that is already ruled out by a previous right-linking
candidate in the chain can be ignored as an additional candidate whenever necessary
(but it can still be used as a linking candidate).
Definition: given a set S of candidates, two candidates n1r1c1 and n2r2c2 are nrc-
bivalue modulo S if they are not in S, they are nrc-linked and:
– either n1 ≠ n2, (r1, c1) and (r2, c2) are the same rc-cell, and the only candidates
for this rc-cell are n1, n2 and possibly any other value n such that (n, r2, c2) is nrc-
linked to an element of S,
– or n1 = n2, (r1, c1) and (r2, c2) are different rc-cells, and there is a row, a column
or a block along which (r1, c1) and (r2, c2) are rc-conjugate for n2 modulo S – i.e. in
which n1 is a candidate only for these two cells and possibly for any other cell (r, c)
such that n1rc is nrc-linked to an element of S.
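The relaxation "modulo S" can be sketched in Python as follows. This is a minimal illustration, assuming candidates are hypothetical triples (n, r, c), that both candidates are present in the state, and that blocks are numbered row by row; the names are illustrative. An extra candidate is tolerated exactly when it is nrc-linked to some element of S.

```python
def block(r, c):
    """Block number (1..9) of rc-cell (r, c); row-by-row numbering is assumed."""
    return 3 * ((r - 1) // 3) + (c - 1) // 3 + 1

def nrc_linked(a, b):
    (n1, r1, c1), (n2, r2, c2) = a, b
    if a == b:
        return False
    if n1 != n2:
        return (r1, c1) == (r2, c2)
    return r1 == r2 or c1 == c2 or block(r1, c1) == block(r2, c2)

def nrc_bivalue_modulo(cand1, cand2, candidates, S):
    """True if cand1 and cand2 are nrc-bivalue modulo the set S of candidates."""
    if cand1 in S or cand2 in S or not nrc_linked(cand1, cand2):
        return False

    def excused(x):
        # an additional candidate is ignored if it is nrc-linked to S
        return any(nrc_linked(x, s) for s in S)

    (n1, r1, c1), (n2, r2, c2) = cand1, cand2
    if n1 != n2:
        # same rc-cell (guaranteed by nrc_linked): check its other candidates
        extras = [x for x in candidates
                  if x[1:] == (r1, c1) and x not in (cand1, cand2)]
        return all(excused(x) for x in extras)
    # same number: look for a shared unit whose other holders are all excused
    for key in (lambda r, c: r, lambda r, c: c, block):
        if key(r1, c1) == key(r2, c2):
            extras = [x for x in candidates
                      if x[0] == n1 and key(x[1], x[2]) == key(r1, c1)
                      and x not in (cand1, cand2)]
            if all(excused(x) for x in extras):
                return True
    return False

# Hypothetical states for the two cases of the definition
state1 = {(1, 1, 1), (2, 1, 1), (3, 1, 1), (3, 1, 5)}
state2 = {(7, 2, 1), (7, 2, 4), (7, 2, 8), (7, 5, 8)}
```

In `state1`, candidates (1, 1, 1) and (2, 1, 1) are not nrc-bivalue, because of the third candidate in r1c1, but they are nrc-bivalue modulo S = {(3, 1, 5)}. With S empty, the predicate reduces to plain nrc-bivalue.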
Definitions:
– an nrct-chain is a 3D chain of even length 2n such that, for any k with 1 ≤ k ≤
n, the two candidates n_{2k-1}r_{2k-1}c_{2k-1} and n_{2k}r_{2k}c_{2k} are nrc-bivalue
modulo the set of previous even nrc-candidates of the chain; (notice that the
candidates that may appear as additional candidates, which are called the t-candidates,
are not considered as belonging to the chain; notice also that this implies that the
first two candidates are nrc-bivalue);
– because nrct-chains are both the 3D-generalisation of xyt-chains and the
t-relaxation of nrc-chains, odd candidates are called left-linking candidates and even
candidates are called right-linking candidates; the n cells containing the successive
groups of two conjugate candidates (modulo the previous right-linking candidates)
are called the cells of the chain; they generally belong to different 2D spaces;
– a target of an nrct-chain is merely a general target of the underlying 3D chain.
Notice that, contrary to 2D chains, the second cell can contain additional
t-candidates; this was impossible in 2D chains because the second candidate was already in
the second cell; but in 3D chains, the nrc-links allow for more possibilities.
dates. This is non ambiguous if the type of the chain is prefixed to the pattern or to
the instance, as in:
nrct-chain {1 2} — {3 4} — {5 6}
Theorem XXII.4 (nrct-chain rule): given an nrct-chain, any of its targets can
be eliminated.
As was the case for the xy-chains or nrc-chains, we can prove (along similar
lines) a more general theorem, independent of any target:
Notice that, as was the case for xyt-chains and as will be the case for nrczt-
chains, there is no theorem justifying the no-loop condition. Inner loops cannot be
excised in these cases because they may contain the only right-linking candidates
justifying additional t-candidates in future cells.
XXII.4. nrcz-chains
nrcz-chains are the 3D generalisation of xyz- and hxyz- chains, based on the
same general idea: any candidate that is already ruled out by its being linked to the
target can be ignored as an additional candidate whenever necessary (but it can still
be used as a linking candidate).
{1 2} — {3 4 *} — {5 6 *}
where the curly braces indicate the nrc-bivalue relation modulo the additional z-
candidates and the *’s indicate optional candidates that are individually nrc-linked
to C. Notice that the convention on the "*" is different from that for 2D chains.
As was the case for the xy-chains or nrc-chains, we have a more general
theorem, for non full nrcz-chains:
Proof: the proof goes along the same lines as for nrc-chains.
XXII.5. nrczt-chains
nrczt-chains are the 3D generalisation of xyzt- and hxyzt- chains, based on the
same general idea: any candidate that is already ruled out by a previous right-linking
candidate in the chain or by the target can be ignored as an additional candidate
whenever necessary (but it can still be used as a linking candidate).
As was the case for the xy-chains or nrc-chains, we have a more general
theorem, for non full nrczt-chains:
XXII.6. Proof of the nrc-, nrct-, nrcz- and nrczt- chain rules
The proofs of the nrc- and nrct- chain rules follow the same general pattern,
which is the adaptation to 3D-space of the proofs for the xy- and xyt- chain rules: in
any of these chains, if the first candidate is false, then all the even candidates must
be true and all the odd candidates must be false. This can easily be proven by
induction on the length of the chain.
In the case of nrcz- and nrczt- chains, the target cell has to be included in the
proof itself: if the target is true, then all the additional z-candidates are false and the
proof becomes similar to the proof for nrc- and nrct- chains.
The application to the chain rules themselves is straightforward. For any target
C, if it was true, then the first candidate in the chain would be false, and the last
would be true, which would entail that the target is false – a contradiction.
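The argument can also be confirmed by brute force on an abstract chain. The sketch below is only an illustration under simplifying assumptions of our own: each cell is reduced to a pair of Boolean variables (left- and right-linking candidates, exactly one of which is true), links are modelled as "not both true", and the target is linked to the first left-linking and the last right-linking candidates. The function names are hypothetical.

```python
from itertools import product

def consistent_assignments(length):
    """Yield (target, lefts, rights) truth values compatible with an abstract
    chain of the given length and a target linked to both of its ends."""
    for bits in product([False, True], repeat=1 + 2 * length):
        target, lefts, rights = bits[0], bits[1::2], bits[2::2]
        # Each cell is bivalue: exactly one of its two candidates is true.
        if any(a == b for a, b in zip(lefts, rights)):
            continue
        # The target is linked to the first left- and last right-linking candidates.
        if target and (lefts[0] or rights[-1]):
            continue
        # Each right-linking candidate is linked to the next left-linking one.
        if any(rights[k] and lefts[k + 1] for k in range(length - 1)):
            continue
        yield target, lefts, rights

def target_is_always_false(length):
    """True if no consistent assignment makes the target true."""
    return all(not target for target, _, _ in consistent_assignments(length))

print([target_is_always_false(n) for n in range(1, 7)])
```

This enumeration is of course no substitute for the proof; it merely checks the chain rule for a few small lengths in this abstract model.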
As for any kind of chain, what is difficult is not proving the validity of the
associated chain rule, it is discovering the actual chains on an actual grid.
Some care must be taken when we write graphico-logical patterns for 3D chains.
For 2D chains, we could write only the list of (number, column or row) candidates,
leaving the underlying (rc-, rn- or cn-) cells implicit, e.g. {1 2} — {2 3} — {3 1}
for an xy3-chain, because we know they all lie in the same 2D space.
But 3D chains are chains of candidates. They can also be seen as chains of cells,
but in varying 2D spaces. The variables used must therefore denote full candidates
(i.e. of type nrc). As the examples introduced above show, we must write something
such as {1 2} — {3 4} — {5 6} for an nrc3-chain, where each integer stands for a
candidate instead of a Number: n1r1c1, n2r2c2… Apart from this change, the same
conventions for the optional candidates (be they z- or t- candidates) apply. As an
optional t-candidate is justified by a previous right-linking candidate (and not by a
cell), the reference will be to the justifying candidate, e.g. #2 to represent any
additional candidate nrc-linked to candidate n2r2c2 (and not to cell 2, which would
be ambiguous).
After such precautions, the logical formula associated with a 3D chain pattern
can be defined as in the 2D case, apart from the following:
– variables corresponding to symbols "1", "2",… must be interpreted as denoting
candidates: n1r1c1, n2r2c2,…;
– the "—" link symbol must be interpreted by predicate "nrc-linked";
– the "#" symbol in a closed cell pattern means the cell must be interpreted as
"nrc-bivalue modulo the previous right-linking candidates";
– the "*" symbol in a closed cell pattern means the cell must be interpreted as
"nrc-bivalue modulo the target";
– the "#*" symbol in a closed cell pattern means the cell must be interpreted as
"nrc-bivalue modulo the previous right-linking candidates and the target".
As, given these indications, the transposition is straightforward, details are left to
the reader.
XXII.8. Miscellanea
Theorem XXII.11:
– nrc-chains subsume xy-, hxy-rn-, hxy-cn- and c- chains;
– nrct-chains subsume nrc-chains, xyt-, hxyt-rn- and hxyt-cn- chains;
– nrcz-chains subsume nrc-chains, xyz-, hxyz-rn- and hxyz-cn- chains;
– nrczt-chains subsume nrc-, nrct-, nrcz- chains and xyzt-, hxyzt-rn- and
hxyzt-cn- chains.
Of course, this does not mean that we should only keep the nrczt-chains and
forget all the other types, in particular the simpler 2D chains. In practice, it is often
easier to find 2D chains first.
Once we have a partial nrct- or nrczt- chain, it is normally ended on the right
when its last right-linking candidate can be nrc-linked to a target. But there are two
other ways of getting a contradiction on the target. Notice that the following remarks
are not useful for nrc- or nrcz- chains, due to the no-loop theorems.
The first case is when there is already somewhere in the partial chain a left-
linking candidate C that might be taken as a right-linking candidate of a later part of
the chain if we had not excluded loops. In this case, the target of the partial chain
can be eliminated (for the same reason as usual: this situation leads to a contra-
diction). Notice that, when this happens, the target can be eliminated but nothing can
be said directly about C; this is because the part of the chain before C cannot be
excised, due to the t-candidates it may be used to justify in further cells. We call this
case an nrc(z)t-rl-lasso ("rl" because a right-linking candidate is equal to a previous
left-linking candidate). (Notice that there is no full chain in this case and that a
target of an rl-lasso does not have to be linked to the last candidate.)
The second case is when there is already somewhere in the chain a right-linking
candidate C that might be taken as a left-linking candidate of a later part of the chain
if we had not excluded loops. As in the previous case, the target can be eliminated
(for the same reason and with the same other remarks applying). We call this case an
nrc(z)t-lr-lasso ("lr" because a left-linking candidate is equal to a previous right-
linking candidate). (Again, there is no full chain in this case and a target of an lr-
lasso does not have to be linked to the last candidate.)
Generally, these lassos lead to slightly shorter partial nrc(z)t-chains, and they are
interesting for this reason, but they do not lead to eliminations that could not have
been obtained without them: see the classification results in section XXIII.4.
Chapter XXIII
Contrary to what we did in the previous parts of this book, we did not give
examples of each new type of 3D chain immediately after we defined it. The main
reason is that any puzzle may have lots of different resolution paths, which depend
on the priorities ascribed to the various rules. For 3D chains, the possibilities for
such ascriptions are much more diverse than for 2D chains, one major factor being
how we rank 3D chains, that cut through various 2D spaces, with respect to chains
of the same length that reside completely in a single 2D space. As the resolution
paths we obtain largely depend on such priorities, we must start with a discussion of
this topic before we can deal with any example. A second reason is that we must
also introduce a specific notation for 3D chains.
duced and can be modified at will. In the examples given in this chapter, this
parameter was set to 0. This has the advantage of illustrating the 3D rules (and the
drawback of making 2D chains appear less often). This choice is consistent with
the Alternating Inference Chain view that both types of links have the same
complexity. This must be true from a purely logical point of view, but it is highly
debatable from a player’s.
Another point that must be explained before we give examples is the notation we
shall use for displaying concrete 3D chains.
In order to describe all the patterns for 2D chains in Part Three, the same repre-
sentation was enough: e.g. {1 2}—{2 3}—{3 4}—{4 1} can represent an xy4, an
hxy-rn4 or an hxy-cn4 chain, depending on the space where it is interpreted. The
three associated chain rules could thus be written:
rc |= *{1 2}—{2 3}—{3 4}—{4 1}* (rc being generally omitted, as the default)
rn |= *{1 2}—{2 3}—{3 4}—{4 1}*
cn |= *{1 2}—{2 3}—{3 4}—{4 1}*
where the *’s indicate the cells to which any target cell must be linked, the (rc-, rn-
or cn-) cells are merely omitted and the numbers are shortcuts for variables in the
remaining dimension. Particular instantiations of 2D chains could thus be written as:
– for an xy-chain: {n1 n5}r7c5 – {n5 n3}r8c3 – {n3 n6}r9c3,
– for an hxy-rn-chain: {c2 c4}r2n1 – {c4 c6}r6n1 – {c6 c3}r6n6,
– for an hxy-cn-chain: {r1 r5}c5n7 – {r5 r3}c3n7 – {r3 r6}c3n9.
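To make the three readings concrete, here is a small sketch of how a 2D bivalue cell written in this notation can be expanded into its two underlying candidates, whatever the space (rc, rn or cn) in which it lives. The function name `cell_candidates` and the (n, r, c) tuple encoding are assumptions of ours, introduced only for illustration.

```python
import re

def cell_candidates(cell):
    """Expand a 2D bivalue cell such as '{n1 n5}r7c5', '{c2 c4}r2n1' or
    '{r1 r5}c5n7' into its two candidates, each as an (n, r, c) triple."""
    m = re.fullmatch(r"\{(\w\d) (\w\d)\}(\w\d)(\w\d)", cell)
    if m is None:
        raise ValueError(f"not a 2D bivalue cell: {cell!r}")
    first, second, x, y = m.groups()
    result = []
    for item in (first, second):
        # The item inside the braces supplies the coordinate missing from the cell.
        coords = {s[0]: int(s[1]) for s in (item, x, y)}
        result.append((coords["n"], coords["r"], coords["c"]))
    return tuple(result)

for cell in ("{n1 n5}r7c5", "{c2 c4}r2n1", "{r1 r5}c5n7"):
    print(cell, "->", cell_candidates(cell))
```

In each case the two items inside the curly braces vary along the remaining dimension, while the two coordinates written after the braces fix the 2D cell.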
For 3D chains, things are more complex, as they are both chains of candidates
and chains of cells but, in the cell view, cells can jump from one space to another.
The new "nrc notation" that we have defined and shall use in the sequel is based
on the idea that it must be able to represent uniformly the two views of 3D chains.
Like any notation, it is the result of a compromise between different constraints; as in
the case of our 2D chains and contrary to conventions one can find on the Web, we
have chosen to represent the mere nrc-links as simply as possible and to put the
complex relations (nrc-conjugacy, modulo something or not) within curly brackets.
But, contrary to the notation we used for 2D chains, where the base space coordi-
nates always appear after the content of the bivalue cells, we decided to always keep
the same ordering for the coordinates, because it is likely to make things globally
easier to read.
Examples and classification results for 3D chains 347
Alternatively, in the sloppy version of the nrc notation, ni{rjck rlcm} represents a
conjugacy for number ni along an unspecified unit in rc-space, i.e. any of the last
three cases above. A priori, this sloppy notation introduces some ambiguity. But
what ni{rjck rlcm} means should always be clear from the context. Unless otherwise
stated, in the resolution paths given below, we need not give details about which
unit type is used for conjugacy (or conjugacy modulo some candidates) and we use
the broader interpretation of ni{rjck rlcm} in the sloppy notation. We then leave it to
the reader to determine the proper conjugacy unit in each particular case. This is
consistent with the approach taken in most mathematics books, where small details
of the proofs are often left to the reader. Remember that the most difficult part in
solving is finding the patterns; when you know where they are, proving their validity
is easy. Notice also that, in the same resolution path, we never mix the strict and the
sloppy notation.
As a first example, consider the short chain: n8r6{c8 c4} – n8r3{c4 c9} ==> not
n8r2c8. In this chain, the first bivalue cell in rn-space corresponds to a conjugacy
link for number n8 along row r6; we then follow a simple link in cn-space from
n8r6c4 to n8r3c4; finally the second bivalue cell in rn-space corresponds to a
conjugacy link, still for number n8, along row r3. This allows us to eliminate e.g.
n8r2c8 (written as ==> r2c8 ≠ 8), because this candidate is nrc-linked to the first
candidate (n8r6c8) (they share the c and n coordinates) and to the last candidate
(n8r3c9) (they share the n and block coordinates).
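The links used in this first example can be checked with a few lines of code. This is a minimal sketch under our own encoding assumptions: a candidate is an (n, r, c) triple, and `nrc_linked` (a hypothetical name) holds for two distinct candidates that share an rc-cell, or that have the same number and share a row, a column or a block.

```python
def block(r, c):
    """Block number (1..9) of the rc-cell in row r, column c."""
    return 3 * ((r - 1) // 3) + (c - 1) // 3 + 1

def nrc_linked(a, b):
    """True if two distinct candidates (n, r, c) cannot both be true."""
    if a == b:
        return False
    (n1, r1, c1), (n2, r2, c2) = a, b
    if (r1, c1) == (r2, c2):
        return True                      # same rc-cell, different numbers
    if n1 != n2:
        return False
    return r1 == r2 or c1 == c2 or block(r1, c1) == block(r2, c2)

target = (8, 2, 8)                       # n8r2c8
first, last = (8, 6, 8), (8, 3, 9)       # n8r6c8 and n8r3c9

print(nrc_linked(target, first))             # shared column and number
print(nrc_linked(target, last))              # shared block and number
print(nrc_linked((8, 6, 4), (8, 3, 4)))      # inner link n8r6c4 to n8r3c4
```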
Take a second example: {n4 n8}r9c1 – {n8 n5}r4c1 – n5r6{c1 c8} – n5r9{c8
c7} ==> not n4r9c7. In this simple chain, we first have two bivalue rc-cells, joined
by an xy-link in rc-space, then two bivalue rn-cells joined by an xy-link in rn-space.
These two groups of cells are joined by an nrc-link between n5r4c1 and n5r6c1. The
target n4r9c7 is in the same rn-cell as the first candidate n4r9c1 and in the same rc-
cell as the last candidate n5r9c7.
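The same verification can be automated for a whole chain written in the nrc notation. The sketch below is only illustrative and rests on assumptions of ours: `parse_cell` and `is_valid_nrc_chain` are hypothetical names, only cells with exactly two candidates (no z- or t- candidates) are handled, and only the link structure is checked, not the fact that the cells are really bivalue on a given grid.

```python
import re

def block(r, c):
    """Block number (1..9) of the rc-cell in row r, column c."""
    return 3 * ((r - 1) // 3) + (c - 1) // 3 + 1

def nrc_linked(a, b):
    """True if two distinct candidates (n, r, c) cannot both be true."""
    if a == b:
        return False
    (n1, r1, c1), (n2, r2, c2) = a, b
    if (r1, c1) == (r2, c2):
        return True
    if n1 != n2:
        return False
    return r1 == r2 or c1 == c2 or block(r1, c1) == block(r2, c2)

def parse_cell(text):
    """Return the (left-linking, right-linking) candidates of one cell,
    e.g. '{n4 n8}r9c1', 'n5r6{c1 c8}', 'n9{r5 r3}c8' or 'n1{r9c8 r9c9}'."""
    m = re.fullmatch(r"(.*)\{(\S+) (\S+)\}(.*)", text.strip())
    if m is None:
        raise ValueError(f"not a two-candidate cell: {text!r}")
    prefix, a, b, suffix = m.groups()
    pair = []
    for item in (a, b):
        full = re.fullmatch(r"n(\d)r(\d)c(\d)", prefix + item + suffix)
        if full is None:
            raise ValueError(f"cannot read candidate in {text!r}")
        pair.append(tuple(int(d) for d in full.groups()))
    return tuple(pair)

def is_valid_nrc_chain(cells, target):
    """Check the inner links and the links between the target and both ends."""
    pairs = [parse_cell(c) for c in cells]
    inner = all(nrc_linked(pairs[k][1], pairs[k + 1][0])
                for k in range(len(pairs) - 1))
    ends = nrc_linked(target, pairs[0][0]) and nrc_linked(target, pairs[-1][1])
    return inner and ends

cells = ["{n4 n8}r9c1", "{n8 n5}r4c1", "n5r6{c1 c8}", "n5r9{c8 c7}"]
print(is_valid_nrc_chain(cells, (4, 9, 7)))      # target n4r9c7
```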
Figure 1. Ron Hagglund’s example, its L1_0 elaboration and its solution
Resolution path in M3 for the L3 (or L1_0) elaboration of Ron Hagglund’s example of a
hinge pattern.
column c4 interaction-with-block b8 ==> r9c6 ≠ 8, r8c6 ≠ 8, r7c6 ≠ 8
nrc3-chain n1{r9c8 r9c9} – {n1 n8}r1c9 – {n8 n5}r8c9 ==> r9c8 ≠ 5
nrc3-chain {n8 n9}r9c4 – n9{r8c4 r8c2} – n3{r8c2 r9c2} ==> r9c2 ≠ 8
nrc3-chain {n9 n5}r9c1 – {n5 n3}r9c6 – n3{r9c2 r8c2} ==> r8c2 ≠ 9
hidden-single-in-a-row ==> r8c4 = 9
naked-single ==> r9c4 = 8
nrc2-chain n8{r1c2 r1c9} – n8{r8c9 r7c7} ==> r7c2 ≠ 8
naked-single ==> r7c2 = 4
hidden-single-in-a-column ==> r3c7 = 4
For our second example, consider puzzle Sudogen0-212 (Figure 2). We now find
two simple examples of nrct3-chains. Exceptionally, let us display the full content of
the cells (with their additional candidates), so as to underline that these two chains
have an additional t-candidate in their second cell, which is t-justified by the right-
linking candidate of the first cell. Such an additional candidate was not possible for
2D chains, but this example illustrates how it can arise in 3D chains.
[Figure 2: puzzle Sudogen0-212]
For our third example, consider puzzle Sudogen0-59 (Figure 3). We have a
typical nrcz2-chain, soon followed by an nrcz3-chain. Here again, we display the
full content of the cells.
Resolution path in M4 for the L4+M3+NRC3 (or L1_0) elaboration of puzzle Sudogen0-59
block b9 interaction-with-column c8 ==> r3c8 ≠ 1, r2c8 ≠ 1
naked-pairs-in-a-row {n5 n8}r6{c2 c3} ==> r6c9 ≠ 5
row r6 interaction-with-block b4 ==> r4c3 ≠ 5, r4c2 ≠ 5
naked-pairs-in-a-column {n5 n8}{r6 r8}c2 ==> r7c2 ≠ 8, r3c2 ≠ 8, r3c2 ≠ 5, r1c2 ≠ 5
hidden-pairs-in-a-column {n8 n9}{r3 r7}c1 ==> r7c1 ≠ 2, r3c1 ≠ 6, r3c1 ≠ 5
nrc3-chain {n6 n9}r3c2 – {n9 n1}r3c7 – n1{r3c3 r2c3} ==> r2c3 ≠ 6
nrcz2-chain n6r5{c8 c1} – n6r2{c1 c9 c8*} ==> r3c8 ≠ 6
nrc3-chain n4{r2c8 r2c4} – {n4 n5}r9c4 – n5{r9c1 r2c1} ==> r2c8 ≠ 5
nrc3-chain {n6 n4}r2c8 – n4{r2c4 r3c5} – n6{r3c5 r1c5} ==> r1c9 ≠ 6
block b3 interaction-with-row r2 ==> r2c1 ≠ 6
nrcz3-chain n4{r2c4 r3c5} – {n4 n5}r3c8 – n5r1{c9 c5 c4*} ==> r2c4 ≠ 5
xy3-chain {n5 n2}r2c1 – {n2 n4}r2c4 – {n4 n5}r9c4 ==> r9c1 ≠ 5
…(Naked Singles)…
[Figure 3: puzzle Sudogen0-59]
[Figure: puzzle Sudogen0-3]
Resolution path in M4 for the M3+L4+NRC4 (or L1_0) elaboration of puzzle Sudogen0-3
column c1 interaction-with-block b1 ==> r3c3 ≠ 8
column c8 interaction-with-block b3 ==> r3c9 ≠ 1, r2c9 ≠ 1, r1c9 ≠ 1, r1c7 ≠ 1
naked-pairs-in-a-column {n1 n6}{r6 r8}c7 ==> r9c7 ≠ 6, r9c7 ≠ 1, r7c7 ≠ 6
nrc3-chain n1{r9c9 r8c7} – {n1 n4}r8c4 – n4{r8c2 r9c2} ==> r9c9 ≠ 4
block b9 interaction-with-row r7 ==> r7c6 ≠ 4, r7c4 ≠ 4
nrc4-chain {n5 n2}r1c2 – {n2 n9}r6c2 – n9{r8c2 r8c5} – n9{r1c5 r1c9} ==> r1c9 ≠ 5
nrc4-chain n9{r1c9 r1c5} – n9{r8c5 r8c2} – {n9 n2}r6c2 – n2{r2c2 r2c9} ==> r1c9 ≠ 2
nrc4-chain n9{r3c6 r7c6} – n9{r8c5 r8c2} – n4{r8c2 r8c4} – n4{r9c6 r3c6} ==> r3c6 ≠ 8,
r3c6 ≠ 5
nrc3-chain n5{r3c9 r3c4} – n2{r3c4 r1c4} – {n2 n5}r1c2 ==> r1c7 ≠ 5
column c7 interaction-with-block b9 ==> r9c9 ≠ 5, r7c9 ≠ 5
nrc3-chain n5{r3c9 r2c9} – n2{r2c9 r2c2} – {n2 n6}r3c3 ==> r3c9 ≠ 6
nrct3-chain n5{r3c9 r3c4} – n2{r3c4 r1c4} – n4{r1c4 r3c6} ==> r3c9 ≠ 4
nrc4-chain n4{r8c4 r8c2} – n9{r8c2 r7c3} – n9{r7c6 r3c6} – n4{r3c6 r9c6} ==> r9c4 ≠ 4
nrc4-chain n9{r8c2 r8c5} – n9{r7c6 r3c6} – n4{r3c6 r9c6} – n4{r9c2 r8c2} ==> r8c2 ≠ 6
nrc4-chain n6{r9c2 r2c2} – n2{r2c2 r2c9} – n2{r1c7 r9c7} – n7{r9c7 r9c5} ==> r9c5 ≠ 6
nrc4-chain n9{r8c5 r8c2} – n4{r8c2 r8c4} – n4{r9c6 r3c6} – n9{r3c6 r7c6} ==> r7c5 ≠ 9
nrc4-chain n9{r7c6 r8c5} – n6{r8c5 r7c5} – n3{r7c5 r2c5} – {n3 n5}r2c6 ==> r7c6 ≠ 5
nrc3-chain n5{r2c6 r9c6} – n4{r9c6 r9c2} – n6{r9c2 r2c2} ==> r2c2 ≠ 5
hidden-single-in-a-block ==> r1c2 = 5
nrc4-chain n5{r2c6 r2c9} – n2{r2c9 r2c2} – n6{r2c2 r9c2} – n4{r9c2 r9c6} ==> r9c6 ≠ 5
hidden singles ==> r2c6 = 5, r3c9 = 5, r1c9 = 9, r7c9 = 4, r2c5 = 3, r2c8 = 1
naked-pairs-in-a-row {n1 n8}r1{c1 c5} ==> r1c4 ≠ 8, r1c4 ≠ 1
naked-triplets-in-a-row {n4 n6 n8}r9{c2 c3 c6} ==> r9c9 ≠ 6
row r9 interaction-with-block b7 ==> r7c3 ≠ 6
naked-triplets-in-a-row {n4 n6 n8}r9{c2 c3 c6} ==> r9c5 ≠ 8, r9c4 ≠ 8
nrc3-chain {n8 n9}r7c3 – n9{r7c6 r8c5} – n6{r8c5 r7c5} ==> r7c5 ≠ 8
column c5 interaction-with-block b2 ==> r3c4 ≠ 8
naked-pairs-in-a-row {n6 n7}r7{c5 c8} ==> r7c7 ≠ 7
naked-single ==> r7c7 = 5
hidden-single-in-a-row ==> r9c4 = 5
xy4-chain {n2 n4}r1c4 – {n4 n1}r8c4 – {n1 n7}r9c5 – {n7 n2}r9c7 ==> r1c7 ≠ 2
…(Naked Singles)…
Resolution path in M4 for the L4+M3+NRC4 (or L3) elaboration of puzzle Sudogen0-1911
column c1 interaction-with-block b4 ==> r6c2 ≠ 9, r4c2 ≠ 9
row r1 interaction-with-block b2 ==> r2c6 ≠ 8
row r5 interaction-with-block b5 ==> r4c5 ≠ 7, r4c4 ≠ 7
column c8 interaction-with-block b9 ==> r7c7 ≠ 6
column c6 interaction-with-block b8 ==> r7c5 ≠ 5
block b7 interaction-with-column c2 ==> r6c2 ≠ 6, r4c2 ≠ 6, r2c2 ≠ 6, r1c2 ≠ 6
[Three figures: puzzle grids, each with its elaboration and its solution]
Resolution path in M5 for the M4+L5 (or L1_0) elaboration of puzzle Sudogen0-618
column c7 interaction-with-block b9 ==> r9c8 ≠ 4
xy3-chain {n3 n6}r2c1 – {n6 n4}r5c1 – {n4 n3}r4c2 ==> r6c1 ≠ 3
block b4 interaction-with-column c2 ==> r9c2 ≠ 3, r1c2 ≠ 3
hidden-pairs-in-a-block {n1 n3}{r9c1 r9c3} ==> r9c3 ≠ 8, r9c3 ≠ 6, r9c3 ≠ 5, r9c3 ≠ 4, r9c1
≠ 8, r9c1 ≠ 7
column c1 interaction-with-block b1 ==> r1c2 ≠ 7
The next puzzle, Sudogen0-8027 (Figure 8), is worth examining in full detail.
[Figure 8: puzzle Sudogen0-8027]
Resolution path in M5 for the M4+L5 (or L1_0) elaboration of puzzle Sudogen0-8027
column c4 interaction-with-block b8 ==> r9c6 ≠ 9, r8c6 ≠ 9
Figure 9. Puzzle Sudogen0-8027, just before the first nrct4 rule is applied
Figure 10. Puzzle Sudogen0-352, its L1_0 elaboration and its solution
Even with 3D chains, life may be hard. Here is an example, puzzle Sudogen0-707
(Figure 11), one of the two hardest in the Sudogen0 collection, which requires
three different nrczt-chains of length thirteen (this maximal length is not changed if
we allow lassos). In order to let you fully appreciate this puzzle, we have displayed
the justifications for all the additional candidates and interspersed comments
throughout the resolution path. Notice however that, given the classification results
in the next section, such a hard puzzle is quite exceptional.
naked-pairs-in-a-column {n2 n4}{r2 r6}c9 ==> r8c9 ≠ 4, r8c9 ≠ 2, r7c9 ≠ 4, r5c9 ≠ 4, r4c9 ≠
4, r4c9 ≠ 2
Figure 11. Puzzle Sudogen0-707, its L1_0 elaboration and its solution
Figure 12. Puzzle Sudogen0-707, situation just before the first nrczt11 chain
;;; Here are three chains built on the same cells, with candidates differently ordered
and additional candidates differently justified:
nrczt3-chain n7r5{c5 c4} – n7r2{c4 c1 c5*} – n7r7{c1 c6 c4#n7r5c4} ==> r8c5 ≠ 7
nrczt3-chain n7r5{c4 c5} – n7r2{c5 c1 c4*} – n7r7{c1 c6 c4*} ==> r9c4 ≠ 7
nrczt3-chain n7r5{c5 c4} – n7r2{c4 c1 c5*} – n7r7{c1 c6 c4#n7r5c4} ==> r9c5 ≠ 7
;;; Here is now an nrczt4-chain that can be considered as built around three different
targets – i.e. the only additional z-candidate r5 in the second (cn-) cell (c7n9) can be
justified by any of the three targets:
nrczt4-chain n9{r5 r3}c8 – n9{r2 r4 r3#n9r3c8 r5*}c7 – n2r4{c7 c1} – {n2 n9}r6c1 ==>
r5c3 ≠ 9, r5c2 ≠ 9, r5c1 ≠ 9
nrczt6-chain {n9 n2}r6c1 – {n2 n6 n9*}r4c1 – {n6 n8 n9*}r5c1 – n8r2{c1 c2} – n2{r2c2
r1c2} – n9{r1 r8 r2#n8r2c2}c2 ==> r8c1 ≠ 9
;;; The situation at this point is described in Figure 12.
nrczt11-chain {n6 n8}r5c1 – n8r2{c1 c2} – {n8 n3 n6*}r9c2 – {n3 n4 n6*}r7c2 – {n4 n5
n6*}r7c8 – {n5 n7 n6*}r7c4 – n7r5{c4 c5} – n1{r5c5 r4c5} – n3{r4 r8 r5#n7r5c5
r9#n3r9c2}c5 – n3r7{c6 c9 c2#n3r9c2} – n1r7{c9 c1} ==> r7c1 ≠ 6
nrczt13-chain n6r4{c9 c1} – n2r4{c1 c7} – n2{r6 r2 r5#n2r4c7}c9 – n2r3{c8 c4
c7#n2r4c7} – n8{r3 r9}c4 – n6{r9 r2 r7*}c4 – {n6 n7 n2#n2r2c9}r2c5 – n7{r2c1 r1c3} –
{n7 n3 n8#n8r9c4}r9c3 – n3r7{c2 c6 c9*} – n7{r7 r8 r1#n7r2c5}c6 – n7{r8 r9
r2#n7r2c5}c7 – n1{r9c7 r7c9} ==> r7c9 ≠ 6
nrczt13-chain n9r8{c2 c3} – {n9 n7}r1c3 – {n7 n8 n3*}r9c3 – n8r8{c1 c6 c3#n9r8c3 c2*}
– n7{r8 r7 r1#n7r1c3}c6 – n3r7{c6 c9 c2*} – n1{r7c9 r9c7} – n7{r9c7 r8c7} – n4r8{c7 c8
c3#n9r8c3 c2*} – n2r8{c8 c5 c7#n7r8c7} – {n2 n5 n7#n7r1c3}r1c5 – n5{r1 r4
r7#n7r7c6}c6 – n3{r4 r8 r7#n7r7c6}c6 ==> r8c2 ≠ 3
nrczt13-chain n3{r8 r7}c9 – n1{r7c9 r9c7} – n7{r9c7 r8c7} – n2r8{c7 c8 c5*} – n4{r8c8
r7c8 r8c7#n7r8c7} – {n4 n6 n3#n3r7c9}r7c2 – n6r8{c1 c9 c2#n6r7c2 c8#n2r8c8 c5*} –
n6{r9 r5 r7#n6r7c2 r8#n2r8c8}c8 – n9{r5 r3}c8 – {n9 n8}r3c6 – {n8 n5 n7#n7r8c7
n3*}r8c6 – {n5 n7}r7c4 – {n7 n3 n5#n5r8c6}r7c6 ==> r8c5 ≠ 3
;;; Notice that the last elimination (r8c5≠3) is necessary before the next rule can be
applied, because otherwise cell 6, which is now n3{r7c6 r9c5}, would be n3{r7c6
r9c5 r8c5}, and the additional candidate n3r8c5 could be justified neither by the
previous right-linking candidates nor by the target:
nrczt7-chain n8{r9c4 r8c6} – n8{r8 r5 r9*}c3 – n8{r5 r2 r8#n8r8c6 r9*}c1 – n7{r2c1 r1c3} –
n7{r1 r7 r8#n8r8c6}c6 – n3{r7c6 r9c5} – {n3 n8 n7#n7r1c3}r9c3 ==> r9c2 ≠ 8
;;; Notice that the elimination (r7c9≠6) done by the first nrczt13 is necessary before
the next rule can be applied, because otherwise cell 9, which is now {n3 n1 n5}r7c9
would be {n3 n1 n5 n6}r7c9, and the additional candidate n6r7c9 could be justified
neither by the previous right-linking candidates nor by the target:
nrczt11-chain n7r5{c4 c5} – n1{r5 r4}c5 – n3{r4 r9 r5#n7r5c5}c5 – {n3 n6}r9c2 – n6{r9c4
r8c5 r9c5#n3r9c5} – n2{r8c5 r9c4 r9c5#n3r9c5} – {n2 n5 n6#n6r9c2}r9c8 – {n5 n3
n6#n6r8c5}r8c9 – {n3 n1 n5#n5r9c8}r7c9 – {n1 n5}r7c1 – {n5 n7 n6#n6r8c5 n7*}r7c6 ==>
r7c4 ≠ 7
block b8 interaction-with-column c6 ==> r1c6 ≠ 7
nrct4-chain n8{r9c4 r8c6} – n7{r8 r7}c6 – n3{r7c6 r9c5 r8c6#n8r8c6} – {n3 n6}r9c2 ==>
r9c4 ≠ 6
;;; Notice that, in the following chain, there are two additional t-candidates in the
same cell (cell 4), which can be justified by the same right-linking candidate (in
cell 1); one is justified by a link along a column and one by a link along a block:
nrct7-chain n7r7{c6 c1} – n7{r2c1 r1c3} – n7r9{c3 c7 c1#n7r7c1} – n7r8{c7 c6 c1#n7r7c1
c3#n7r7c1} – n8{r8c6 r9c4} – {n8 n3}r9c3 – n3{r9c5 r7c6} ==> r7c6 ≠ 5
nrczt7-chain n7{r8c7 r9c7} – n1{r9c7 r7c9} – {n1 n5 n7*}r7c1 – {n5 n6}r7c4 – {n6 n4
n5#n5r7c1}r7c8 – n4{r7c2 r8c2 r8c3*} – n9r8{c2 c3} ==> r8c3 ≠ 7
nrct9-chain n7{r5 r2}c4 – n7{r2c1 r1c3} – n7{r1 r5 r2#n7r2c4}c5 – n1{r5 r4}c5 – n3{r4 r9
r5#n7r5c5}c5 – {n3 n8 n7#n7r1c3}r9c3 – n8{r9c4 r8c6} – {n8 n9}r3c6 – n9{r3 r5}c8 ==>
r5c4 ≠ 9
row r5 interaction-with-block b6 ==> r4c7 ≠ 9
nrczt10-chain {n6 n5}r7c4 – {n5 n2 n6*}r8c5 – {n2 n7 n6*}r2c5 – n7{r2c1 r1c3} – {n7 n5
n2#n2r8c5}r1c5 – {n5 n9}r1c6 – {n9 n8}r3c6 – n8{r8c6 r9c4} – {n8 n3 n7#n7r1c3}r9c3 –
{n3 n6}r9c2 ==> r9c5 ≠ 6
nrczt10-chain n1{r9 r7}c1 – n5{r7 r8 r9*}c1 – n7{r8 r2 r7#n1r7c1 r9*}c1 – n7r7{c1 c6} –
n7r8{c6 c7 c1#n5r8c1} – n7r9{c7 c3 c1#n7r2c1} – n8r9{c3 c4 c1*} – {n8 n3 n5#n5r8c1
n7#n7r7c6}r8c6 – {n3 n6 n5#n5r8c1}r8c9 – n6r4{c9 c1} ==> r9c1 ≠ 6
nrczt10-chain {n5 n6}r7c4 – {n6 n2 n5*}r8c5 – {n2 n7 n5*}r1c5 – n7{r1 r9}c3 – n3r9{c3 c2
c5*} – {n3 n4 n6#n6r7c4}r7c2 – {n4 n5 n6#n6r7c4}r7c8 – {n5 n1 n6#n6r7c4
n7#n7r9c3}r7c1 – {n1 n8 n7#n7r9c3 n5*}r9c1 – {n8 n5 n2#n2r8c5}r9c4 ==> r9c5 ≠ 5
nrczt6-chain {n6 n3}r9c2 – {n3 n2}r9c5 – {n2 n5 n6*}r8c5 – {n5 n7 n2#n2r9c5}r1c5 – {n7
n9}r1c3 – n9r8{c3 c2} ==> r8c2 ≠ 6
nrczt8-chain {n6 n3}r9c2 – {n3 n2}r9c5 – {n2 n5 n6*}r8c5 – {n5 n8 n2#n2r9c5}r9c4 – {n8
n7 n3#n3r9c2}r9c3 – n7{r9c7 r8c7} – {n7 n3 n5#n5r8c5 n8#n8r9c4}r8c6 – {n3 n6
n5#n5r8c5}r8c9 ==> r8c1 ≠ 6
column c1 interaction-with-block b4 ==> r5c2 ≠ 6
nrczt9-chain {n2 n3}r9c5 – {n3 n6}r9c2 – {n6 n5 n2*}r9c8 – {n5 n8 n2*}r9c4 – {n8 n7
n3#n3r9c5}r9c3 – {n7 n9}r1c3 – {n9 n5}r1c6 – {n5 n7 n3#n3r9c5 n8#n8r9c4}r8c6 –
n7{r8c7 r9c7} ==> r9c7 ≠ 2
nrczt9-chain n6r2{c5 c4} – n7{r2c4 r1c5 r2c5*} – n7{r1 r9}c3 – n7{r9c7 r8c7} – n2r8{c7 c8
c5#n2r8c8} – n4{r8c8 r7c8 r8c8#n2r8c8} – n6r7{c8 c2 c4#n6r2c4 c9#n7r8c7} – {n6 n3}r9c2
– {n3 n2}r9c5 ==> r2c5 ≠ 2
nrczt8-chain n6{r9c2 r9c8} – n6{r8c9 r8c5} – {n6 n7}r2c5 – n7{r2c1 r1c3} – {n7 n8}r9c3 –
n8{r9c4 r8c6} – n7{r8c6 r7c6} – n3{r7c6 r9c5} ==> r9c2 ≠ 3
naked-single ==> r9c2 = 6
nrc2-chain n3r9{c5 c3} – n3{r7 r5}c2 ==> r5c5 ≠ 3
row r5 interaction-with-block b4 ==> r4c3 ≠ 3
xy3-chain {n4 n2}r6c9 – {n2 n9}r6c1 – {n9 n4}r4c3 ==> r4c7 ≠ 4
xyzt6-chain {n5 n2}r9c8 – {n2 n3}r9c5 – {n3 n7}r7c6 – {n7 n8 n3#n3r9c5 n5*}r8c6 – {n8
n9}r3c6 – {n9 n5 n2#n2r9c8}r3c8 ==> r8c8 ≠ 5
nrczt6-chain {n5 n2}r9c8 – n2r8{c7 c5 c8#n2r9c8} – n6r8{c5 c8 c9*} – {n6 n4 n5*}r7c8 –
{n4 n3}r7c2 – n3{r7 r8}c9 ==> r8c9 ≠ 5
nrczt7-chain n9r6{c4 c1} – n9r4{c3 c6 c1#n9r6c1 c4*} – n3r4{c6 c5} – n3r9{c5 c3} – n7{r9
r1}c3 – n7{r1c5 r2c5 r2c4#n7r1c3} – n6r2{c5 c4} ==> r2c4 ≠ 9
nrczt8-chain {n4 n9}r4c3 – n9r8{c3 c2} – n9r1{c2 c6 c3#n9r4c3} – {n9 n8}r3c6 – n8{r3
r9}c4 – n8{r9 r5 r8*}c3 – n3{r5 r9 r8*}c3 – {n3 n4}r7c2 ==> r8c3 ≠ 4
column c3 interaction-with-block b4 ==> r5c2 ≠ 4
nrczt7-chain {n8 n6}r5c1 – n6r4{c1 c9} – {n6 n3}r8c9 – {n3 n9 n8*}r8c3 – {n9 n7}r1c3 –
n7r9{c3 c7 c1*} – n1r9{c7 c1} ==> r9c1 ≠ 8
nrczt7-chain n8{r9 r3}c4 – n2{r3c4 r2c4} – n6r2{c4 c5} – n7{r2c5 r1c5 r2c4#n2r2c4} –
n7{r1 r9}c3 – {n7 n1 n5*}r9c1 – {n1 n5 n7#n7r9c3}r9c7 ==> r9c4 ≠ 5
nrct3-chain n7{r8c7 r9c7} – n1r9{c7 c1} – n5r9{c1 c8 c7#n7r9c7} ==> r8c7 ≠ 5
nrczt6-chain n7{r1 r9}c3 – n3r9{c3 c5} – n2{r9 r8 r1*}c5 – n6{r8c5 r7c4} – n5{r7c4 r8c6
r8c5#n2r8c5} – n5r1{c6 c5} ==> r1c5 ≠ 7
hidden-single-in-a-row ==> r1c3 = 7
hidden-pairs-in-a-row {n6 n7}r2{c4 c5} ==> r2c4 ≠ 2
hidden-pairs-in-a-column {n2 n8}{r3 r9}c4 ==> r3c4 ≠ 9
column c4 interaction-with-block b5 ==> r4c6 ≠ 9
hidden-pairs-in-a-row {n1 n7}r9{c1 c7} ==> r9c7 ≠ 5, r9c1 ≠ 5
hidden singles ==> r9c8 = 5, r3c7 = 5
row r9 interaction-with-block b8 ==> r8c5 ≠ 2
naked-pairs-in-a-block {n5 n6}{r7c4 r8c5} ==> r8c6 ≠ 5
hidden-triplets-in-a-column {n1 n5 n7}{r9 r7 r8}c1 ==> r8c1 ≠ 8
xy3-chain {n3 n6}r8c9 – {n6 n4}r7c8 – {n4 n3}r7c2 ==> r8c3 ≠ 3, r7c9 ≠ 3
naked and hidden singles ==> r7c9 = 1, r9c7 = 7, r9c1 = 1, r8c9 = 3
column c9 interaction-with-block b6 ==> r5c8 ≠ 6
xy3-chain {n4 n2}r2c9 – {n2 n9}r3c8 – {n9 n4}r5c8 ==> r6c9 ≠ 4
…(Naked Singles)…
Although we have just seen that there are exceptionally hard puzzles whose
solutions require long 3D chains, the introduction of 3D chains may drastically
simplify the solution of some puzzles. Remember the very long xyt-chains we used
at the end of chapter XVII for the following two examples.
The nrc-, nrct- and nrczt- chains (and lassos) have been included up to length
twenty-eight in version 13 of SudoRules. As a result, it can now solve all of the
10,000 randomly generated puzzles in the Sudogen0 collection without using any
chain rule based on the consideration of subsets (no Hinges, no Almost Locked Sets)
or any rule based on the assumption of Uniqueness. In the following table, giving
the number of grids solved for each length n, Mn and Nn are defined as: M1 = L1 and:
n        2      3      4      5      6      7      8
L (2D)   6,026  6,577  8,271  8,959  9,326  9,472  9,584
M (3D)   6,754  8,400  9,638  9,893  9,964  9,984  9,997
N (3D)   6,773  8,415  9,658  9,913  9,975  9,991  9,997

n        9      10     11     12     13      14     15     16
L (2D)   9,636  9,685  9,695  9,708  9,717   9,727  9,735  9,739
M (3D)   9,998  9,998  9,999  9,999  10,000
As can be seen from this table, 99.99% of the random minimal puzzles can be
solved with 3D chains of length no more than twelve. More significantly, more than
99% (respectively 99.9%) of the random minimal puzzles can be solved with 3D
chains of length no more than five (resp. seven). As the psychologists keep
repeating that human short term memory has size seven plus or minus two, these
results mean that a human should be able to solve almost any random minimal
puzzle without any computer assistance. But he may need quite a lot of patience for
finding even these short chain patterns.
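These percentages can be recomputed directly from the table. The short sketch below simply transcribes the M and N rows as they appear above (the dictionary and function names are ours) and looks for the shortest chain length at which a given number of the 10,000 puzzles is solved; integer counts are used instead of fractions to avoid rounding issues.

```python
# Cumulative numbers of puzzles solved (out of 10,000), copied from the table.
N_SOLVED = {2: 6773, 3: 8415, 4: 9658, 5: 9913, 6: 9975, 7: 9991, 8: 9997}
M_SOLVED = {2: 6754, 3: 8400, 4: 9638, 5: 9893, 6: 9964, 7: 9984, 8: 9997,
            9: 9998, 10: 9998, 11: 9999, 12: 9999, 13: 10000}

def shortest_length(solved, min_count):
    """Smallest chain length n with at least min_count puzzles solved, or None."""
    for n in sorted(solved):
        if solved[n] >= min_count:
            return n
    return None

print(shortest_length(N_SOLVED, 9901))    # more than 99% of the puzzles
print(shortest_length(N_SOLVED, 9991))    # more than 99.9% of the puzzles
print(shortest_length(M_SOLVED, 9999))    # 99.99% of the puzzles
print(shortest_length(M_SOLVED, 10000))   # all the puzzles
```

With the M row as transcribed, the count of 9,999 is already reached at length eleven, which is consistent with the bound of twelve stated in the text.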
Finally, there are also absolute monsters. Easter Monster (Figure 13) is currently
the hardest known puzzle. After the Elementary Constraints Propagation rules have
been applied to it, none of the rules described in this book can be applied. Indeed, no
resolution rule is yet known to be applicable to this puzzle.
[Figure 13: Easter Monster]
There is widespread confusion, on the Web forums and in most books on
Sudoku, between the description of a purely factual situation (e.g. the existence of
what was defined in this book as an ordinary link, i.e. of a unit being shared between
two cells, or of a conjugacy link or of an nrc-link) and the way this can be used in
the proof of a chain rule. As a consequence, there is a further double
confusion: 1) between the logical validity of a chain rule and the internal structure of
the chain and 2) between how a resolution rule can be proven and how the associated
patterns can be found.
Let us define a basic fact as what can be seen, physically and immediately, on
the standard grid. On the standard grid, only values, candidates, equality between
them and "physical" links between cells or candidates can be seen physically and
immediately. Inferences or bits of inferences cannot be seen physically. As an
obvious consequence, the basic predicates adopted in this book are "equal" (indeed,
one "equal" predicate for each sort), "value", "candidate", to which almost primary
predicates, easily defined from the primary ones, such as "same-cell", "share-a-unit"
and "nrc-linked", have been added. Basic facts can thus be defined formally as what
can be described by either of our primary or almost primary predicates. (We could
also define "extended basic facts" as what can be seen, physically and immediately,
on a well defined universal representation extending the standard grid, such as our
extended Sudoku board – but this is not necessary, because such facts can be
defined by simple auxiliary predicates).
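As a minimal sketch of how such predicates can be read as plain facts about the grid, the Python fragment below encodes a candidate as a triple (n, r, c). The names Cand, block, same_cell, share_a_unit and nrc_linked are illustrative assumptions, not the book's formal notation; in particular, nrc_linked follows the usual reading of an nrc-link (two distinct candidates that either lie in the same cell or carry the same number in two cells sharing a unit).

```python
from typing import NamedTuple


class Cand(NamedTuple):
    n: int  # number, 1..9
    r: int  # row, 1..9
    c: int  # column, 1..9


def block(r: int, c: int) -> int:
    """Block index 1..9, numbered row by row (an assumed convention)."""
    return 3 * ((r - 1) // 3) + (c - 1) // 3 + 1


def same_cell(a: Cand, b: Cand) -> bool:
    return (a.r, a.c) == (b.r, b.c)


def share_a_unit(a: Cand, b: Cand) -> bool:
    """The two cells are different and lie in a common row, column or block."""
    if same_cell(a, b):
        return False
    return a.r == b.r or a.c == b.c or block(a.r, a.c) == block(b.r, b.c)


def nrc_linked(a: Cand, b: Cand) -> bool:
    """Distinct candidates in the same cell, or same number in cells sharing a unit."""
    if a == b:
        return False
    if same_cell(a, b):
        return True  # same cell and a != b, hence different numbers
    return a.n == b.n and share_a_unit(a, b)
```

Each function only inspects coordinates: nothing here refers to what could be inferred from the link, which is the point of a purely factual definition.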
Of course, one can always say that what we consider as a basic fact is mainly a
matter of modelling and that different models could be developed. It has been
suggested that "or statements" – i.e. disjunctions such as "candidate(n1, r1, c1) or
candidate(n2, r2, c2)" or much longer ones – could be considered as the basic facts.
What we consider as basic facts and what we consider as atomic steps in the
resolution process of a puzzle are two closely related questions. In this alternative
view, elementary steps in the resolution process should be the assertion of such "or
statements" instead of the assertion of a value and/or the deletion of a candidate.
Among the problems currently plaguing every discussion on Sudoku is the idea
that Sudoku must be approached "at the inference level" and the two associated
notions of a "weak" and a "strong" link. These notions are certainly the most confusing
ones that have ever been introduced – leading e.g. to hyper-realistic debates on
whether a strong link is also a weak link. In this view, there is a weak link between
two candidates C1 and C2 if C2 is false whenever C1 is true (which is called a
"weak inference") and there is a strong link between them if C2 is true whenever C1
is false (which is called a "strong inference"): a link is thus defined by how it can be
used in the proof of a rule instead of being a purely factual notion as in our
approach. The central notion of all this confusion, that of a strong link, is an aberration
for two reasons:
– its meaning has evolved with time: initially, it covered merely "conjugate",
then "bivalue" was added to it, then "ALS" (Almost Locked Sets); and if other ways
of doing "strong inferences" are found, one can be sure that the notion of a strong
link will be extended to cover them;
– as a result of using it, one never knows exactly what is under discussion –
which is a way of cheating with complexity: a bivalue is much less complex than a
link based on an ALS, but the difference is negated by giving them the same name.
(Notice that, for some people, this is also a way of cheating with credit: nothing new
can be invented, because it was already implicit in their definition.)
Chains of different kinds are the main tool for solving hard puzzles. Nevertheless,
chains have a bad reputation. The main problem is that many people still
think of a chain as a chain of inferences and, for this reason, cannot understand
that a chain is a pattern exactly like any other pattern (e.g. Naked-Pairs or
Swordfish). As a result of viewing chains as chains of inferences, many also still
think that they must re-prove each chain rule every time they use one – as if they
had to re-prove Pythagoras' theorem every time they use it.
On the contrary, in the approach developed in this book, a chain (of some specified
type) has a purely factual definition: it is a well defined physical pattern on the
grid, a conjunction of precise physical conditions if you prefer. When you discover
one, you have nothing to prove; as for any other pattern, you know in advance what
you can conclude.
Within our approach, there are two factual views of chains and these two views
(that had previously been considered, "at the inference level", as conflicting by the
Nice Loop and Alternating Inference Chains communities) have been shown to
blend perfectly: 2D chains are chains of cells in their fixed base space, whereas 3D
chains are chains of candidates that can also be considered as chains of cells in
varying 2D spaces.
The cell view is obviously best adapted to xy-chains, based on directly visible
bivalue cells, which we consider as the most basic chains. But xy-chains are only a
small part of the story. In order to consistently use the cell view but overcome its
limitations, and considering that "conjugate" is "bivalue" in another 2D space, the
rn-, cn- and bn- spaces and the corresponding hxy-chains were introduced. Using the
z- and t- extensions of these basic chains, one can solve 97% of the minimal
puzzles. This means that the most general combinations of bivalue and conjugacy
links are required for at most 3% of these puzzles (although they may lead
to simplifications for some of the 97%). It also means that the cell view is
enough for these 97%.
In order to solve the remaining 3% (and to simplify some of the 97%), general
combinations of bivalue and conjugacy links are necessary. Such combinations may
seem very unnatural in the cell view. Here the fundamental view of chains is chains
in the 3D nrc-space, where cells are nrc-cells (each nrc-cell having a corresponding
candidate that can be present or absent). Then the natural view of chains becomes
chains of candidates. Whence the notion of nrc-chains (and their t-, z- and zt- extensions).
What is interesting, at least from a theoretical point of view, is that viewing
3D chains as chains of candidates doesn't preclude viewing them as chains of cells
(but in varying 2D spaces). One of the advantages of the cell oriented view (implemented
in the nrc notation) is that it allows generalising to the "3D" chains the
simple and powerful z and t relaxations that were first defined only for xy or hxy
chains.
The idea that Sudoku must be approached "at the inference level" leads to
another confusion, between how a rule can be proven and how its instantiations on a
real grid can be found. Let it therefore be clear that:
– a resolution rule is proven once and for all, and the way it has been proven, by
valid logical methods, including reasoning by cases, has no impact of any kind on its
logical validity;
– being logically valid doesn't prevent a rule from being more or less easy to
apply (i.e. having instances more or less easy to find), but this has a priori nothing
to do with the way it has been proven. How the instances of a rule can be found will
be discussed in section 4 below.
The concept of a resolution rule has been defined formally in section IV.3. After
the first edition of this book, some people interpreted it in an "integrist" perspective:
every resolution rule should be explicitly written in FOL, using only the basic
predicates introduced in chapter III; as a consequence, the whole approach would be
very tedious. This book is the best proof of the contrary: just have a look at the very
simple graphico-logical representations of the most complex chain rules!
FOL (or, more precisely, MS-FOL) is the ultimate scientific form of these rules,
but most of the time, using the auxiliary predicates that have already been proven to
be writable as MS-FOL formulæ, an unambiguous factual English formulation is
enough to guarantee that they can be expressed as resolution rules.
We have seen that chains are the subject of heated debates and that, due to their
being viewed as "chains of inferences", they have a bad reputation. Among those
who do not like them, some do not want to say it explicitly and they have invented
pseudo-scientific words (such as "multiple inference", "bifurcative" or "assumptive")
to disguise their dislike. It is important to understand that such words are
devoid of any meaning in our purely factual approach (and it seems it is also the
case in the "inference" view).
A chain pattern is a pattern like any other pattern (e.g. Jellyfish,…). The idea
that chains are more complex than other patterns is completely false: this can be
seen from our classification results, or from subsumption results showing that nrczt
chains subsume most kinds of "finned" and/or "sashimi" fishy patterns (see our
posts in the "supersymmetric chains" thread in the Sudoku Players Forum).
Warning: chains should not be confused with chain rules. A chain rule is merely
valid or not valid, which depends neither on the way it has been proven nor on the
properties of the underlying chain that will be defined below. A non valid chain rule
is merely useless. But a valid chain rule can be more or less general (giving rise to
XXIV.2.1. Linearity
XXIV.2.2. Homogeneity
First, remember that, in our approach, a target of a chain never belongs to the
chain. This is very important for three reasons: 1) it allows a chain to have several
targets; 2) it also allows us to discard irrelevant distinctions depending on the type of
links a target cell has with the chain (such as Nice Loops being "continuous" or
"discontinuous"); 3) all the chain rules that have been introduced in this book can be
based on homogeneous patterns.
"Homogeneous" means that the pattern is a sequence of similar bricks. This property
of all the chains introduced in this book is obvious from their definitions. It
would become meaningless if we had to include the targets in the chains.
The word "reversibility" has been the pretext for such poisonous debates on Web
forums that nobody has yet tried to propose an objective definition of it. Let us do it:
– given a (2D or 3D) chain C, the reversed chain is the chain obtained by
reversing the order of the candidates; in the process, when used in the definition of
the chain type, left- (resp. right-) linking candidates become right- (resp. left-) linking
candidates;
– a chain type is called reversible if for any chain of this type, the reversed
chain is of this type.
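The definition of the reversed chain can be sketched in a few lines of Python. The encoding is an assumption made for illustration only: a chain is a list of (candidate, role) pairs, where the role is "L" for a left-linking and "R" for a right-linking candidate, and alternates is a toy stand-in for the conditions of a real chain type.

```python
from typing import List, Tuple

Cand = Tuple[int, int, int]   # (number, row, column)
Link = Tuple[Cand, str]       # (candidate, "L" or "R")


def reverse_chain(chain: List[Link]) -> List[Link]:
    """Reverse the order of the candidates and exchange the left/right roles."""
    swap = {"L": "R", "R": "L"}
    return [(cand, swap[role]) for cand, role in reversed(chain)]


def alternates(chain: List[Link]) -> bool:
    """Toy 'chain type': roles alternate L, R, L, R, ... starting with L."""
    return all(role == "LR"[i % 2] for i, (_, role) in enumerate(chain))
```

A chain type is reversible when its defining conditions, such as alternates here, still hold on reverse_chain(chain) for every chain of that type.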
Proof: obvious.
Notice that chains using the t-extension are not reversible. This is a weak point
for them. But the sequel will show that they satisfy properties (left-extendability and
composability) that partially palliate this weakness.
Definition: a given type of (2D or 3D) chain is called non-anticipative if, when a
chain of this type is built from left to right, all that needs to be checked when adding
the next candidate depends only on the previous candidates (and not on the potential
future ones) (and possibly on the target, for chains that have to be built around a
target, e.g. xyz, xyzt, nrcz or nrczt). Notice that this doesn't imply that adding a
candidate will always eventually lead to a full chain of this type, but it guarantees
that, up to the new candidate added, the chain satisfies the conditions on chains of
this type whatever is added to it later.
Theorem XXIV.3: all the chains defined in this book, (h)xy(z)(t)-chains and
nrc(z)(t)-chains, are non-anticipative.
Proofs: obvious. Indeed, when I first defined them, I had in mind this condition
of non-anticipativeness (although it was only implicit).
Definition: a given type of (2D or 3D) chain is called left-extendable if, when
given a partial chain of this type, cells or candidates can be added not only to its
right but also to its left (of course, respecting the linking conditions on left- and
right- linking candidates for chains of this type at the junction and having the same
target in case they are built around a target).
Theorem XXIV.6: all the chains defined in this book, (h)xy(z)(t)-chains and
nrc(z)(t)-chains, are left-extendable.
Proof: obvious. The idea is that, when the presence of a t-candidate can be justified
by previous right-linking candidates in a partial chain, it will remain justified by
them if we add candidates to the left of this partial chain (and justifications of
z-candidates will not be changed). This notion and the last theorem were first suggested
by Mike Barker.
Definition: a given type of (2D or 3D) chain is called composable if, when two
partial chains of this type are given, they can be combined into a single chain of this
type (of course, respecting the linking conditions on left- and right- linking candidates
for chains of this type at the junction and having the same target in case they
are built around a target).
Theorem XXIV.7: all the chains defined in this book, (h)xy(z)(t)-chains and
nrc(z)(t)-chains, are composable.
The practical impact of this theorem is mainly for chains with the t-extension:
when additional t-candidates are justified by previous right-linking candidates of a
partial chain, they will still be justified by the same candidates if another partial
chain of the same type is added to its left. Of course, not all chains with the
t-extension can be obtained by combining shorter chains of the same type, but looking
first for chains with limited distance t-interactions may be a valuable strategy.
XXIV.2.6. Complexity
The search for xy(z)(t)-chains on a real grid is undoubtedly more complex than
the search for the simplest xy-chains of the same length. As xy(z)(t)-chains are more
general than xy-chains and allow many more puzzles to be solved, this should not be
There is theory and there is practice and they should not be confused. Said
otherwise, there is logic with the whole fauna of resolution rules and there is the
question of how they can be applied in practice.
But the role of resolution rules is not to help a player find their instantiations on
a real grid. For instance, a chain rule will tell you that it is worth looking for chains
of well defined types and what to do when you discover one, but it won’t tell you
(or at least not directly) how to discover them. This is done by resolution techniques.
Whereas the phrase "resolution rule" has been given a precise, purely logical
definition, it is not yet the case for "resolution technique". First, two different kinds
of techniques should be defined.
There are resolution rules and there are resolution techniques implementing such
rules, whose purpose is to answer the practical question: how do we apply the rules
on a real grid, i.e. how do we find the patterns (defining the condition part of a rule)
on a real grid. Let us therefore introduce the:
A more formal definition may be helpful. First, as the two notions pertain to
different universes of discourse, we need to define a common framework in which
comparing them will be meaningful. The notion of a knowledge state introduced in
section IV.2.2 and the set KS of such states naturally provide this framework.
Remember that KS contains inconsistent states, corresponding to puzzles with no
solution, and that a knowledge state is something very concrete: the set of values
and candidates present on your grid in each cell.
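To make this concrete, here is a minimal Python sketch of a knowledge state and of a rule seen as a function from KS to KS. The class KnowledgeState, the helper names and the two toy rules (an elementary constraints propagation step and a naked-single step) are assumptions chosen for illustration; they are not the formal definitions of section IV.2.2, and no consistency check is performed.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Tuple

Cell = Tuple[int, int]  # (row, column), both in 1..9


@dataclass(frozen=True)
class KnowledgeState:
    values: Dict[Cell, int]            # decided cells
    cands: Dict[Cell, FrozenSet[int]]  # candidates of the undecided cells


Rule = Callable[[KnowledgeState], KnowledgeState]


def share_a_unit(a: Cell, b: Cell) -> bool:
    same_block = ((a[0] - 1) // 3, (a[1] - 1) // 3) == ((b[0] - 1) // 3, (b[1] - 1) // 3)
    return a != b and (a[0] == b[0] or a[1] == b[1] or same_block)


def initial_state(givens: Dict[Cell, int]) -> KnowledgeState:
    cells = [(r, c) for r in range(1, 10) for c in range(1, 10)]
    full = frozenset(range(1, 10))
    return KnowledgeState(dict(givens), {x: full for x in cells if x not in givens})


def ecp(ks: KnowledgeState) -> KnowledgeState:
    """Drop candidate n from any cell sharing a unit with a cell whose value is n."""
    def keep(cell: Cell, n: int) -> bool:
        return not any(v == n and share_a_unit(cell, other)
                       for other, v in ks.values.items())
    cands = {cell: frozenset(n for n in ns if keep(cell, n))
             for cell, ns in ks.cands.items()}
    return KnowledgeState(dict(ks.values), cands)


def naked_single(ks: KnowledgeState) -> KnowledgeState:
    """Assert the value of every cell that has exactly one candidate left."""
    values, cands = dict(ks.values), dict(ks.cands)
    for cell, ns in ks.cands.items():
        if len(ns) == 1:
            values[cell] = next(iter(ns))
            del cands[cell]
    return KnowledgeState(values, cands)


def apply_theory(rules: List[Rule], ks: KnowledgeState) -> KnowledgeState:
    """Function KS -> KS associated with a set of rules: iterate to a fixed point."""
    while True:
        new = ks
        for rule in rules:
            new = rule(new)
        if new == ks:
            return ks
        ks = new
```

Every rule returns a new state and never modifies its argument, so a state really is just "the set of values and candidates present on the grid".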
With both a resolution rule (or a resolution theory, i.e. a set of resolution rules)
and a resolution technique one can straightforwardly associate a unique function
defined on all of KS and with values in KS. In the case of a resolution theory, notice
that the confluence property is not strictly necessary: it only guarantees that we can
repeatedly apply the rules in any order and we are certain we always get the same
result. If the confluence property was not satisfied, determining "all the logical
For a technique that is not based on a known (set of) resolution rule(s), it may be
very difficult to guarantee that it doesn't amount to Depth First Search (recursive
Trial and Error) or to Breadth First Search (as many of the general tagging algorithms
that are regularly proposed do); and it may be very difficult to find a set of
acceptable (whatever one understands by this) resolution rules such that it would be
its implementation. Such a set may have to contain very complex rules (e.g., for
algorithms using general tagging, rules based on complex nets).
Before that, notice that, for any resolution technique, it is likely that it will have
many variants; e.g. one can start the search for an xyt-chain from the first cell in the
chain or from a previously obtained shorter chain; but, even though knowing the
target cell in advance is not needed in the search for an xyt-chain, one can nevertheless
also start this search from a target cell (e.g. because one would like to eliminate
some candidate in this cell).
Let us consider the most general rule defined in this book, the nrczt-chain rule
(or rule schema, since there is one rule for each length). We shall show that there are
at least three different resolution techniques that can be used to implement it. This
example can easily be specialised for all the simpler 2D or 3D chains. For simplicity,
let us suppose a target candidate TC is given. This corresponds to a
realistic situation in which a player focuses on the elimination of this candidate.
nrc(z)t chains built around TC can be found by using three different techniques.
All of them start with two candidates that are linked by an nrc-bivalue link, the first
being nrc-linked to TC; let's call the target and these two candidates the seed of the
chain. In each method, the chain is progressively extended to the right, in steps that
add two candidates at the same time. Suppose therefore we already have a partial
nrc(z)t chain on 2(n-1) candidates (n > 1), suppose also that the last candidate is not
nrc-linked to TC (otherwise we have found a full nrczt-chain and the algorithm has
succeeded), and let's try to extend it with two more candidates. Each of the three
techniques is defined by two sub-procedures: an "initialisation step" and a "next
step". They also all have the same exit condition, that will not be repeated: if the last
right-linking candidate is nrc-linked to the target, then stop (you have found an
nrczt-chain). We can also have additional exit conditions in case we want to find rl-
and lr- lassos together with nrczt-chains: for rl-lassos, accept a new right-linking
candidate that is already present in the chain as a left-linking candidate and stop; for
lr-lassos, if the new right-linking candidate is nrc-linked to a right-linking candidate
in the chain, stop; for simplicity reasons, we won’t mention this in the sequel.
This is the simplest (and my preferred) method. Initialisation step: initialise the
procedure by choosing two nrc-bivalue candidates such that the first is nrc-linked to
TC and drawing a red arrow from the first to the second. Next step:
– find two new candidates not already in the chain, such that the first is nrc-
linked to candidate 2(n-1) (it will be the new left-linking candidate) and the second
is nrc-conjugate with the first modulo any previous right-linking candidate in the
chain and modulo TC (it will be the new right-linking candidate); the fact that this
second candidate satisfies these conditions is checked on the fly, noting that the
previous right-linking candidates are the arrival points of the red arrows;
– draw a blue arrow from candidate 2(n-1) to the first of these new candidates;
– draw a red arrow from the first to the second of these new candidates.
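The initialisation and "next step" of this first technique can be sketched in Python as follows. This is only an illustration under stated assumptions: candidates are triples (n, r, c) held in a set, the arrows are simply the order of the list chain, and all names (cells_of, conjugate_modulo, seeds, next_pairs, find_chain) are hypothetical. "Conjugate modulo" is read here as: in some 2D cell containing both candidates, every other candidate present is nrc-linked to the target or to a previous right-linking candidate.

```python
from typing import Iterator, List, Optional, Sequence, Set, Tuple

Cand = Tuple[int, int, int]  # (number, row, column), all in 1..9


def cells_of(cand: Cand) -> Set[tuple]:
    """The four 2D cells (rc, rn, cn, bn) that contain a candidate."""
    n, r, c = cand
    b = 3 * ((r - 1) // 3) + (c - 1) // 3
    return {("rc", r, c), ("rn", r, n), ("cn", c, n), ("bn", b, n)}


def nrc_linked(a: Cand, b: Cand) -> bool:
    return a != b and bool(cells_of(a) & cells_of(b))


def conjugate_modulo(cands: Set[Cand], left: Cand, right: Cand,
                     excusers: Sequence[Cand]) -> bool:
    """Some common 2D cell holds only left, right and 'excused' candidates."""
    for cell in cells_of(left) & cells_of(right):
        others = [x for x in cands
                  if cell in cells_of(x) and x not in (left, right)]
        if all(any(nrc_linked(x, e) for e in excusers) for x in others):
            return True
    return False


def seeds(cands: Set[Cand], target: Cand) -> Iterator[Tuple[Cand, Cand]]:
    """Initialisation step: an nrc-bivalue pair whose first candidate sees the target."""
    for left in sorted(cands - {target}):
        if nrc_linked(left, target):
            for right in sorted(cands - {target, left}):
                if nrc_linked(left, right) and conjugate_modulo(cands, left, right, []):
                    yield left, right


def next_pairs(cands: Set[Cand], target: Cand,
               chain: List[Cand]) -> Iterator[Tuple[Cand, Cand]]:
    """Next step: all (left, right) pairs that may extend the partial chain."""
    last, prev_rights = chain[-1], chain[1::2]
    free = cands - set(chain) - {target}
    for left in sorted(free):
        if nrc_linked(left, last):
            for right in sorted(free - {left}):
                if nrc_linked(left, right) and conjugate_modulo(
                        cands, left, right, [target, *prev_rights]):
                    yield left, right


def find_chain(cands: Set[Cand], target: Cand,
               max_pairs: int) -> Optional[List[Cand]]:
    """Depth-first search in a fixed order; the grid itself is never modified."""
    def extend(chain: List[Cand]) -> Optional[List[Cand]]:
        if nrc_linked(chain[-1], target):   # common exit condition
            return chain
        if len(chain) >= 2 * max_pairs:
            return None
        for left, right in next_pairs(cands, target, chain):
            found = extend(chain + [left, right])
            if found:
                return found
        return None

    for left, right in seeds(cands, target):
        found = extend([left, right])
        if found:
            return found
    return None
```

Going one step backwards amounts to returning from a recursive call, i.e. to dropping the last two entries of the list, which mirrors erasing the last two arrows. Lassos are ignored in this sketch.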
Initialisation step: same as above; in addition, colour in blue any candidate that
is nrc-linked to the second candidate or to the target. Next step:
– find two new candidates not already in the chain, such that the first is nrc-
linked to candidate 2(n-1) (it will be the new left-linking candidate) and the second
is nrc-conjugate with the first when all the candidates coloured in blue are ignored
(it will be the new right-linking candidate); notice that there is no restriction on the
first candidate (it may be already coloured in blue); notice also that the search for
the new pair is facilitated by the colours (which was the goal of using them),
because it can be limited to cells that appear as mono- or bi- value when the blue
candidates are ignored (a pseudo mono-value, but undecided, cell corresponds to the
case when the first of the two new candidates is coloured in blue);
– draw a blue arrow from candidate 2(n-1) to the first of these new candidates;
– draw a red arrow from the first to the second of these new candidates; (notice
that arrows need not be coloured);
– colour in blue any candidate that is nrc-linked to the second of these new
candidates.
In the above two techniques, the same two questions can be asked.
Firstly, what happens if the predefined target is not nrc-linked to the last candidate
added? Answer: nothing happens. You may continue extending the chain. (Or
you may try starting another chain; the algorithm described here has a lot of possible
variants.)
Secondly, what happens if, at some point, the extension step cannot be done?
Concerning the values and the candidates present on the grid, nothing will happen.
As for any type of chain, you will just have to find a better extension. How can this
be done? The answer is very standard: go one step backwards in the chain and try
another extension. Here the two methods behave very differently. For the first, you
just have to erase the last two arrows. For the second, you must also erase all the
colouring and restart it from scratch; as this makes it rather inefficient, it is a debilitating
point for nrc(z)(t)-colouring. But it can be saved by nrc(z)(t) tagging.
This technique is for fans of limited tagging. This modification of the previous
procedure is only a slight adaptation from an abstract point of view, but a huge one
with respect to efficiency matters. Choose a fixed sequence of symbols (e.g. letters).
Follow the same procedure as for nrc(z)(t) colouring, but, instead of colouring
the candidates in blue, tag them with letters; a candidate needs to be tagged with only
one letter (if it is already tagged, don't tag it again). The letters used for tagging are
chosen as follows: in any new step of the technique described above for colouring,
the next letter in the sequence must be used. In the choice of the next candidates,
ignore tagged ones instead of coloured ones.
When you have to "backtrack" to the previous step, just erase all the instances of
the last letter used in tags. All the tagging done by the previous steps will be kept.
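The bookkeeping behind this tagging scheme is small enough to sketch in Python. The class TagStore and its method names are hypothetical; the sketch only shows why backtracking is cheap with tags, whereas plain colouring would have to be redone from the start.

```python
from typing import Dict, Hashable, Iterable


class TagStore:
    """One letter per step; a candidate keeps the first letter it received."""

    def __init__(self, letters: str = "abcdefghijklmnopqrstuvwxyz") -> None:
        self.letters = letters
        self.tags: Dict[Hashable, str] = {}
        self.depth = 0  # number of steps currently recorded

    def tag_step(self, candidates: Iterable[Hashable]) -> str:
        """Tag the not yet tagged candidates with the next letter in the sequence."""
        letter = self.letters[self.depth]
        for cand in candidates:
            self.tags.setdefault(cand, letter)  # already tagged: leave as is
        self.depth += 1
        return letter

    def backtrack(self) -> None:
        """Erase all instances of the last letter used; earlier tags are kept."""
        self.depth -= 1
        letter = self.letters[self.depth]
        self.tags = {c: t for c, t in self.tags.items() if t != letter}

    def is_ignored(self, cand: Hashable) -> bool:
        """Tagged candidates are ignored when choosing the next pair."""
        return cand in self.tags
```

For example, after two steps tagging {A, B} and then {B, C}, a single backtrack removes only the tag of C, because B already carried the first letter.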
In problem solving practice, the simplest and most commonly used method for
managing this kind of backtracking is ordering all the possible choices: order the
cells, candidates and types of links (e.g. for the red links: bivalue first, then row-
conjugacy, …) and always follow the same order in your choices. Thus, you need no
explicit markings for keeping track of the possibilities already tried.
If you don't want to follow a systematic procedure, you can also mark your
choices. But this requires some complex marking procedure. In any case, I would
not recommend using the tag system itself for keeping track of the search paths.
With the tag system as defined above, you need only as many tags as the longest
chain length you are looking for. But explicit marks for paths would require many
more different signs.
The above resolution techniques were presented here merely as an illustration of:
– the difference between a resolution rule and a resolution technique;
– what it means for a resolution technique to be an implementation of a resolution
rule;
– the fact that, in any of the techniques used here, no value is tentatively asserted
and no candidate is tentatively eliminated; i.e. none of these techniques introduces
anything that could reasonably be called T&E, in any sense compatible with the
definition of it given in the next section, although they also involve some kind of
search (but this search is at another level); they should therefore also help understand
the full scope of the T&E theorem.
Finally, notice that, in SudoRules, nrc(z)(t) tagging is not used. It is the job of
the inference engine (CLIPS or JESS or any one compatible with their common
syntax) to execute the set of rules as if it was a procedure (taking their priorities into
account). The inference engine can be considered as a super-compiler that automa-
tically transforms logical formulæ into procedures (this is largely metaphorical –
and should therefore not be taken too litterally). This is longstanding AI technology.
The question of Trial and Error (T&E) has always been the topic of much confusion
and heated (or even poisonous) debates – very often for the main reason that
the terms used in these debates are undefined and are given at least as many different
meanings as there are participants. Many of these meanings are so vague that
anything in Sudoku would be T&E if they were to be taken seriously. Some types of
resolution rules have even been claimed to be T&E, because their proof uses
reasoning by cases. Leaving all this aside, we shall define rT&E as a precise
resolution technique.
all of them is the idea that when no knowledge is applicable, you try anything that
has not yet been shown to be impossible.
Notice that, for each T, rT&E(T) thus defined is in fact a whole family of
resolution techniques in the above sense, with variants associated to each exit condition
(how many solutions one wants) and to each method for choosing the
successive hypotheses. The forthcoming rT&E theorem will be valid for any T and
for any member of the rT&E(T) family. Notice that the only effect of the resolution
rules in T is to prune the tree of possibilities, thus focusing the search (but not
necessarily making the algorithm more efficient when programmed into a computer,
since computing the logical consequences of a hypothesis may be time consuming
for complex rules); it has no impact on the solutions that can be found. We shall
write simply rT&E to mean any member of any of the rT&E(T) families.
The above general procedure can also be improved (i.e. made more efficient in
terms of computing time) in various ways, such as making hypotheses preferably in
bivalue cells (or also from conjugate candidates, which are bivalue in rn-, cn-, or bn-
space), but this doesn’t change the fundamentally tentative nature of the successive
hypotheses to be made.
What is important for the sequel is that, considering the function from KS to KS
associated to any fixed member of the rT&E family (see section 3.2), it always
outputs a solution state or an explicitly contradictory state.
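One member of this family can be sketched in Python as below. This is an illustration under explicit assumptions, not the formal definition: T is reduced to elementary constraints propagation, the exit condition is "stop at the first solution", hypotheses are made in a cell with the fewest candidates, and the names peers, propagate and rte are hypothetical. The block size b is a parameter (b = 3 for standard Sudoku, b = 2 for a 4×4 toy grid).

```python
from typing import Dict, Optional, Set, Tuple

Cell = Tuple[int, int]          # (row, column), 0-based
State = Dict[Cell, Set[int]]    # remaining candidates of each cell


def peers(cell: Cell, b: int) -> Set[Cell]:
    """Cells sharing a row, a column or a block with the given cell."""
    r, c = cell
    n = b * b
    return {(i, j) for i in range(n) for j in range(n)
            if (i, j) != cell
            and (i == r or j == c or (i // b, j // b) == (r // b, c // b))}


def propagate(state: State, b: int) -> bool:
    """Assumed T: elementary constraints propagation. False on contradiction."""
    changed = True
    while changed:
        changed = False
        for cell, cands in state.items():
            if not cands:
                return False              # explicitly contradictory state
            if len(cands) == 1:
                (v,) = cands
                for p in peers(cell, b):
                    if v in state[p]:
                        state[p].discard(v)
                        changed = True
    return True


def rte(state: State, b: int) -> Optional[State]:
    """Recursive trial and error: a solution state, or None if contradictory."""
    state = {cell: set(c) for cell, c in state.items()}  # hypotheses are revocable
    if not propagate(state, b):
        return None
    undecided = [cell for cell, c in state.items() if len(c) > 1]
    if not undecided:
        return state
    # Prefer cells with few candidates (bivalue cells first), then a fixed order.
    cell = min(undecided, key=lambda x: (len(state[x]), x))
    for v in sorted(state[cell]):
        result = rte({**state, cell: {v}}, b)   # tentative assertion of a value
        if result is not None:
            return result
    return None                                  # every hypothesis failed
```

The copy made at the start of each call is what allows a previous state to be restored when a hypothesis fails; called on a grid with several solutions (e.g. an empty one), this variant returns one of them.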
XXIV.5.2. T&E should not be confused with the search for patterns on a real grid
Here is the difference between rT&E and resolution rules: with rT&E, one
tentatively asserts values and/or deletes candidates, thus modifying the basic data;
one sometimes needs to restore a previous state of these data. With resolution rules,
although one has to search the grid for some pattern, this search does not imply any
modification of the "basic data" (i.e. values and candidates); all the modifications
one ever does on the basic data are irrevocable. And it can be seen that this
difference is both conceptual and practical. Moreover, as Max Beran puts it: when
we are using a resolution rule, "we can declare precisely what we are trying to
achieve and the properties that the pattern must have for this to be true. A bifurcator
[i.e. one that uses T&E] can make no such equivalent statement".
If this distinction is lost, then nearly everything in Sudoku is T&E, even Naked
Pairs: once you have a bivalue cell, you must search for a second; in this process,
you may encounter a second bivalue cell, but with different candidates; then you
have to search for another bivalue, until you find one with the same two candidates.
This is an obviously absurd view of Pairs.
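The point can be illustrated by a small Python sketch of a pattern search for Naked Pairs in one unit. The representation (a dict from cell names to frozensets of candidates) and the function names are assumptions for illustration. The search inspects several bivalue cells before it finds a match, yet it never asserts a value or deletes a candidate along the way.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Tuple

Unit = Dict[str, FrozenSet[int]]  # cell name -> candidates, for one row, column or block


def naked_pairs(unit: Unit) -> List[Tuple[str, str]]:
    """All pairs of bivalue cells of the unit that hold the same two candidates."""
    bivalue = sorted(cell for cell, ns in unit.items() if len(ns) == 2)
    return [(a, b) for a, b in combinations(bivalue, 2) if unit[a] == unit[b]]


def pair_eliminations(unit: Unit, pair: Tuple[str, str]) -> List[Tuple[str, int]]:
    """Conclusion of the rule: the pair's two numbers leave the other cells of the unit."""
    numbers = unit[pair[0]]
    return sorted((cell, n) for cell, ns in unit.items()
                  if cell not in pair for n in ns & numbers)
```

Only the eliminations returned by pair_eliminations would be applied to the grid, and they are irrevocable; skipping a bivalue cell with different candidates during the search leaves the data untouched.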
A precise understanding of the idea that a resolution technique can be the implementation
of a resolution rule is necessary to grasp the full scope of our T&E
theorem. This is why we first gave detailed examples of this association.
As this theorem has been the topic of much debate, let us give two versions of it
and two proofs, based on completely different approaches and with different scopes.
Notice that this proof cannot be extended to a set T of resolution rules: although
the condition "there is a cell with at least two candidates" would have to be satisfied
by all the rules in T, any one of these rules could have specific additional conditions.
This is somewhat reassuring, because if such an extension was true, it would entail
we have no chance of ever finding a complete set of resolution rules.
The second version is stronger, but, whereas the first did not depend on adopting
or excluding the assumption of uniqueness, it applies only if we accept puzzles with
not necessarily unique solutions:
Proof: for any set S of resolution rules, if a puzzle has more than one solution,
rT&E(S) is guaranteed to find one (or several or all, depending on the exit condition
we put on the rT&E algorithm – again, rT&E is a family of algorithms, and the
theorem applies to any variant). On the contrary, as S can only lead to conclusions
that are logical consequences of S and of the entries of the puzzle, if a puzzle has
several solutions, S cannot find any; e.g. if there are two solutions such that r1c1 is 1
in the first and 2 in the second, S cannot prove that r1c1 =1 (nor that r1c1 = 2). It can
therefore find none of these solutions. (This corresponds to the following general
meta-theorem in FOL: what a FOL theory can prove is exactly what is true in all its
models.) q.e.d.
Miscellanea
What does it mean for a Sudoku Resolution Theory to be "complete"? Since all
the results that can be produced (i.e. all the values that can be asserted and all the
candidates that can be eliminated) when a resolution theory T is applied to a given
puzzle P are logical consequences of theory T ∪ EP (where EP is the conjunction of
the entries for P, as defined in chapter IV), these results must be valid for any solution
for P (i.e. for any model of T ∪ EP). Therefore a resolution theory can only
solve puzzles that have a unique solution and one can give three sensible definitions
of the completeness of T:
– it solves all the puzzles that have a unique solution;
– for any puzzle, it finds all the values common to all its solutions;
– for any puzzle, it finds all the values common to all its solutions and it eliminates
all the candidates that are excluded by any solution.
Obviously, the third definition implies the second, which implies the first, but
whether the converse of any of these two implications is true remains an open question.
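The targets named in the second and third definitions can be computed from the set of all solutions of a puzzle, as in the Python sketch below. The function names and the encoding of a solution as a dict from cell names to values are assumptions; "excluded by any solution" is read here as "true in none of the solutions". The sketch only describes what a complete theory would have to reach, not how a resolution theory could reach it.

```python
from typing import Dict, List, Set, Tuple

Solution = Dict[str, int]  # cell name -> value


def common_values(solutions: List[Solution]) -> Dict[str, int]:
    """Values shared by all the solutions (second definition)."""
    first, *rest = solutions
    return {cell: v for cell, v in first.items()
            if all(s[cell] == v for s in rest)}


def surviving_candidates(solutions: List[Solution]) -> Set[Tuple[str, int]]:
    """Candidates true in at least one solution; all the others should be
    eliminated (third definition)."""
    return {(cell, v) for s in solutions for cell, v in s.items()}
```

With two solutions such that r1c1 is 1 in the first and 2 in the second, r1c1 is absent from common_values, while both ("r1c1", 1) and ("r1c1", 2) survive.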
Even regarding the weakest definition, the strongest resolution theory examined
in this book, L13 (or even its weak extension L16 obtained by adding the xyt-chains
of length fourteen to sixteen, or even its extension to M28, in this second edition, by
nrc(z)(t) chains of length up to 28), does not solve all the puzzles that have a single
solution. It can solve most of these puzzles, but not all these puzzles. Whether a
stronger theory using only chains of the same types as those defined in this book,
though longer, would allow a positive answer remains an open question.
Given a resolution theory T and a puzzle P, one must make a clear distinction
between the full abstract epistemic model KSP of P (as defined in chapter IV) and
that part of KSP to which T gives access. Firstly, by construction, in the full KSP
there are always lots of inconsistent states; but in the part accessible through T, there
are only consistent knowledge states, unless P has no solution. Secondly, two
different problems related to KSP must be distinguished.
Figure 1. The epistemic model KSP of a puzzle P with two solutions (all lines) and the part of
it accessible by a Resolution Theory T (full lines).
A second problem appears in case there may be several solutions. There are
several maximally consistent elements in KSP, some of which may be accessible
from some states reachable by T, and some others not accessible. Questions related
to uniqueness will be dealt with in section 3.
Figure 1 shows a puzzle with two solutions; knowledge state KS8 represents all
the "value°" and "cand°" ground atomic formulæ that must be true in either of them.
No greater knowledge state can be attained within any consistent Resolution Theory.
In this example, this knowledge state KS8 can effectively be reached by theory T,
following two paths, one through KS1 and one through KS3. Nevertheless, the figure
suggests that the path leading to KS3, although valid within T, is a dead end within
T. The next section will show that this is impossible for the Resolution Theories
introduced in this book.
There is logic, together with the logical Resolution Theories introduced in this
book… and there are the many ways we can use them in practice to solve real puzzles.
From a strict logical standpoint, all the rules in a logical theory are on an equal
footing, which leaves no possibility of defining a precedence order between them
(unless we use some of the circumscription theories introduced by AI or some of
their numerous sprouts – which would be a complex approach). Nevertheless,
throughout this book, we have referred to some ordering of the rules and we have
stated many results mentioning it. This is harmless when we state that a rule is
subsumed by others, because in such theorems we are completely free to choose
which rules we prefer to keep and which to reduce to the previous ones.
But, when it comes to the practical exploitation of our theories and in particular
to their implementation in our SudoRules solver, one question remains unanswered:
can superimposing some ordering on the set of rules prevent us from reaching a
solution that the choice of another ordering might have made accessible? This is
obviously a fundamental question when we assert that a puzzle P cannot be solved
in some theory T, because what we effectively prove is only that P cannot be solved
with our implementation of T in SudoRules. By proving that all the resolution theories
introduced in this book have the confluence property (to be defined soon) we
shall reject any objection of this kind.
First, given a resolution theory T, consider all the resolution methods that can be
built on it by defining various orderings of the rules in T. Given a puzzle P and starting
from KSP, the resolution process associated with a method M built on T
consists of repeatedly applying resolution rules from T according to the additional
constraints (i.e. the precedence order) introduced by M. Considering that, at any
point in the resolution process, different rules from T may be applicable (and
different rules will be applied) depending on the chosen resolution method M, we
must allow for several resolution paths starting from KSP. (Since our complexity
hierarchy only defines a partial order relation on the set of resolution rules, all this
discussion would remain unchanged even if all the resolution methods we consider
were restricted to those conforming to this hierarchy – but this is not the point we
want to make here.)
Do all our theories BSRT and L1_0 to L13 have the confluence property? The
question can easily be seen as equivalent to the following: in any resolution method
based on any of these theories, can the application of a rule prevent us from reaching
a conclusion (i.e. asserting a value or a non-candidate) that might have been obtained
if another rule had been applied first?
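Confluence can also be observed empirically on a very small theory. The sketch below is an illustration, not SudoRules: it assumes a toy theory made of ECP, Naked Singles and Hidden Singles only, a hypothetical run function that fires one applicable rule instance chosen at random at each step, and a test puzzle built from a standard pattern grid (an arbitrary choice). Whatever the random order, the final knowledge state should be the same.

```python
import random
from itertools import product

CELLS = list(product(range(9), repeat=2))
UNITS = ([[(r, c) for c in range(9)] for r in range(9)]
         + [[(r, c) for r in range(9)] for c in range(9)]
         + [[(3 * br + i, 3 * bc + j) for i in range(3) for j in range(3)]
            for br in range(3) for bc in range(3)])
PEERS = {cell: {p for u in UNITS if cell in u for p in u} - {cell} for cell in CELLS}

def applicable(value, cand):
    """List every rule instance of the toy theory {ECP, NS, HS} that can fire now."""
    acts = []
    for cell in CELLS:
        if cell in value:
            # ECP: a decided cell removes its value from the candidates of its peers.
            acts += [("elim", p, value[cell]) for p in PEERS[cell]
                     if p not in value and value[cell] in cand[p]]
        elif len(cand[cell]) == 1:
            # NS: a single remaining candidate is asserted as the value.
            acts.append(("assert", cell, next(iter(cand[cell]))))
    for unit, n in product(UNITS, range(1, 10)):
        if all(value.get(cell) != n for cell in unit):
            places = [cell for cell in unit if cell not in value and n in cand[cell]]
            if len(places) == 1:
                # HS: n has a single possible place left in this unit.
                acts.append(("assert", places[0], n))
    return acts

def run(givens, seed):
    """Apply randomly chosen applicable rule instances until none is left."""
    rng = random.Random(seed)
    value = dict(givens)
    cand = {cell: ({value[cell]} if cell in value else set(range(1, 10)))
            for cell in CELLS}
    while True:
        acts = applicable(value, cand)
        if not acts:
            return value, cand
        kind, cell, n = rng.choice(acts)
        if kind == "elim":
            cand[cell].discard(n)
        else:
            value[cell] = n
            cand[cell] = {n}

# Test puzzle (assumption): a known valid grid from which 36 entries are withdrawn.
SOLUTION = {(r, c): (3 * (r % 3) + r // 3 + c) % 9 + 1 for r, c in CELLS}
GIVENS = {cell: SOLUTION[cell] for cell in random.Random(0).sample(CELLS, 45)}

finals = [run(GIVENS, seed) for seed in range(4)]
print(all(final == finals[0] for final in finals))
```

Such an experiment proves nothing in general; it only illustrates, for three elementary rules and one puzzle, the property that the theorem below establishes for all the theories of this book.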
Theorem 1: All the resolution theories defined in this book (BSRT, L1_0 to
L13 and beyond, M2 to M28 and beyond) have the confluence property.
Proof: since the conclusion of a resolution rule in any of the theories listed above
is either the addition of a value or (non exclusive "or") the elimination of a candidate,
proving the confluence property a priori imposes the examination of all the rules
and all the interactions that may occur between them. Nevertheless, it is enough to
prove the following property: for any theory T as above and for any of the rules R in
T, if R is applicable at some point in the resolution process, then the assertion of a
value or the elimination of a candidate (e.g. as a result of the application of another
rule in T) may make R inapplicable but then its conclusions can still be obtained by
applying other rules whose complexity is no greater than that of R (rules which
belong therefore to the same theory T).
Let T be any of the theories listed in the theorem. Let us consider in turn all the
rules R it contains and see what happens if, in any knowledge state where it can be
applied, another rule R’ is applied before it. Given T, you should only read the paragraphs
of the following proof starting with the names of rules R belonging to T.
ECP: since rules in ECP always belong to T, if a candidate n for a cell C could
have been eliminated by ECP but instead another rule R’ in T is applied to assert
some value or to eliminate some candidate, this cannot prevent the later applicability
of ECP if R’ has not already asserted this value or eliminated this candidate.
CD: since no rule can add a candidate for a cell, no rule can make CD inapplicable
if it was applicable in some knowledge state. Stated otherwise, a contradiction
that would have been detected by a rule in one of our theories can never be hidden
by the application of another rule.
NS: the only condition in NS is the presence of a single candidate for a cell C; if
this candidate is eliminated by another rule R’ in T before being asserted by NS as
the value for cell C, then CD applies.
NP: consider a Naked-Pairs in some unit u, for two cells C1 and C2, with candidates
n1 and n2. If n1 is eliminated as a candidate for C1 (by the application of any
other rule R’ in T), then NS (a rule present in T whenever NP is) can still prove that
the remaining candidate (n2) is the value for C1; n2 can therefore still be eliminated
by ECP (a set of rules present in all our theories) from all the other cells in unit u,
including C2. As a result, C2 can still be proved to have a single candidate (n1) and
NS can still be used to conclude that C2=n1. ECP can still be applied to eliminate n1
from the candidates for any other cell in unit u. Finally n1 and n2 can still be eliminated
from the candidates for all the cells in unit u other than C1 and C2. The same argument
works (skipping the first step) if n2 is asserted as the value for C1.
NT and NQ: the proof works along the same lines; elimination of a candidate in
a cell of the Triplet (respectively the Quadruplet) leads to a NP (resp. a NT), which
can be dealt with by T, since NP (resp. NT) rules are present in all the theories that
contain NT (resp. NQ).
The proofs for HP, HT and HQ work by combining the arguments for the NP,
NT and NQ cases with those for HS.
Interaction rules: since the conditions of the Interaction rules bear only on the
absence of candidates for some cells, eliminating candidates for any cell or asserting
values for some cells does not change the validity of these conditions.
Chain rules (non hidden): this is certainly the most interesting part of the proof.
xyzt-chains: the proof is similar to that for xyt-chains, with propagation of constraints
following the proof of the xyzt-chain rule. The major difference is that a
new type of extra candidates must be taken care of: the target value n1. Suppose it
appears in cell C as an additional value of this type; if it is eliminated as a candidate
for C, then the situation is reduced to that of an xyt-chain; if it is asserted as the
value of C, then n1 can still be eliminated by ECP from any xyzt-target cell.
Hidden chain rules: since these rules are obtained from the non-hidden ones by
supersymmetry, the above arguments used for the original chains can be transposed
to their hidden counterparts.
3D chain rules: there is nothing new in the proofs, they are the direct transpositions
of the proofs for the corresponding 2D chain rules.
In Part V of his otherwise interesting book [STU 07], Andrew C. Stuart describes
several techniques based on the assumption of uniqueness. In particular, the
technique known as BUG (for Bivalue Universal Grave – sic) is stated as follows:
"If a puzzle has one cell with three candidates and all the other undecided cells have
two candidates, you can immediately fill that three-candidate cell. Just check which
candidate appears three times in the row, column or box [block, in our vocabulary]
in which this three-candidate cell resides. That candidate is the one that goes in the
three-candidate cell."
      c1   c2   c3   c4   c5   c6   c7   c8   c9
r1    3    59   2    8    47   1    49   6    57
r2    1    8    59   47   3    6    49   2    57
r3    7    6    4    2    5    9    1    3    8
r4    5    2    7    6    1    8    3    9    4
r5    9    3    8    47   247  24   5    1    6
r6    6    4    1    3    9    5    7    8    2
r7    8    1    3    5    6    7    2    4    9
r8    24   59   59   1    8    24   6    7    3
r9    24   7    6    9    24   3    8    5    1
Figure 2. Puzzle BUG-41-1 (an entry with several digits lists the remaining candidates of an undecided cell).
Stuart proposes the puzzle in Figure 2 (call it BUG-41-1) as an example that can
be solved with this technique. Notice that we can consider that this is a puzzle given
in the most ordinary sense used throughout this book (i.e. only with values as entries),
since the complete configuration with candidates can be computed from the
values alone with just a few applications of ECP.
The solution given by Stuart (left-hand grid in Figure 3), based on the direct
application of the above BUG rule, chooses value 7 for cell R5C5, all the remaining
values being then decided by mere NS.
3 5 2 8 4 1 9 6 7     3 9 2 8 7 1 4 6 5     3 9 2 8 7 1 4 6 5
1 8 9 7 3 6 4 2 5     1 8 5 4 3 6 9 2 7     1 8 5 4 3 6 9 2 7
7 6 4 2 5 9 1 3 8     7 6 4 2 5 9 1 3 8     7 6 4 2 5 9 1 3 8
5 2 7 6 1 8 3 9 4     5 2 7 6 1 8 3 9 4     5 2 7 6 1 8 3 9 4
9 3 8 4 7 2 5 1 6     9 3 8 7 2 4 5 1 6     9 3 8 7 4 2 5 1 6
6 4 1 3 9 5 7 8 2     6 4 1 3 9 5 7 8 2     6 4 1 3 9 5 7 8 2
8 1 3 5 6 7 2 4 9     8 1 3 5 6 7 2 4 9     8 1 3 5 6 7 2 4 9
2 9 5 1 8 4 6 7 3     4 5 9 1 8 2 6 7 3     2 5 9 1 8 4 6 7 3
4 7 6 9 2 3 8 5 1     2 7 6 9 4 3 8 5 1     4 7 6 9 2 3 8 5 1
Figure 3. The three solutions for puzzle BUG-41-1: Stuart’s solution based on BUG and two
other solutions
The assumption of uniqueness is false in the present case and applying the BUG
rule produces the illusion that there is a unique solution. Unfortunately, there are
two others, as shown by the central and right-hand grids in Figure 3. These two
solutions can be obtained by trying the value r2c4=4; a few NS later, only 6 cells
(r5c5, r5c6, r8c1, r8c6, r9c1 and r9c5) remain undecided, with the same two
candidates each (Numbers 2 and 4). This situation is a wonderful generalisation of
Unique Rectangle (see definition below) to six cells instead of four; it is somewhat
reminiscent of an unproductive Swordfish or of a c4-, c5- or c6- chain with no
productive target cell.
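The count of solutions can be checked independently of any resolution rule. The sketch below is an illustration: it transcribes the decided values of BUG-41-1 as read from Figures 2 and 3 (a dot for each undecided cell) and uses a plain backtracking search with hypothetical helper names (all_solutions, differing_cells). It enumerates the completions of the grid and lists the cells in which two of them differ.

```python
# Decided values of BUG-41-1, transcribed from Figures 2 and 3 ("." = undecided cell).
GIVENS = ["3.28.1.6.", "18..36.2.", "764259138", "527618394", "938...516",
          "641395782", "813567249", "...18.673", ".769.3851"]

def all_solutions(rows):
    """Return every completion of the grid, each as a list of nine 9-digit strings."""
    grid = [[int(ch) if ch != "." else 0 for ch in row] for row in rows]
    found = []

    def candidates(r, c):
        # Values not yet used in the row, column and block of cell (r, c).
        used = set(grid[r]) | {grid[i][c] for i in range(9)}
        br, bc = 3 * (r // 3), 3 * (c // 3)
        used |= {grid[i][j] for i in range(br, br + 3) for j in range(bc, bc + 3)}
        return [n for n in range(1, 10) if n not in used]

    def search():
        empties = [(r, c) for r in range(9) for c in range(9) if grid[r][c] == 0]
        if not empties:
            found.append(["".join(map(str, row)) for row in grid])
            return
        # Branch on the undecided cell with the fewest remaining candidates.
        r, c = min(empties, key=lambda rc: len(candidates(*rc)))
        for n in candidates(r, c):
            grid[r][c] = n
            search()
            grid[r][c] = 0

    search()
    return found

def differing_cells(a, b):
    """Names (such as 'r5c6') of the cells in which two complete grids differ."""
    return [f"r{r + 1}c{c + 1}" for r in range(9) for c in range(9) if a[r][c] != b[r][c]]

found = all_solutions(GIVENS)
print(len(found))
for i, a in enumerate(found):
    for b in found[i + 1:]:
        cells = differing_cells(a, b)
        if len(cells) == 6:
            print(cells)  # the six-cell pattern shared by two of the solutions
```

This is exactly the kind of exhaustive check (a recursive Trial and Error) that makes no use of the assumption of uniqueness.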
This is the place to notice that the BUG U-rule (or any U-rule based on the
assumption of uniqueness) does not guarantee that the candidate it eliminates would
have led to no solution. It only says it would have led to two (or more) solutions. It
is therefore fine to apply this rule if you are absolutely sure there is only one
solution; but how can you be absolutely sure unless you have proved it for yourself?
Otherwise, it lets you unduly believe that there is a unique solution when there may
be several.
One should never forget that the rules based on the assumption of uniqueness
cannot produce uniqueness in a puzzle when it is not there.
This remark has practical consequences for the puzzle creator. Puzzles proposed
to the public are supposed to have a unique solution, mainly because "puzzle
compilers would not want to annoy their audience with puzzles that have two or
more solutions" ([STU 07]). There are currently two general types of algorithms for
generating a puzzle: either you start from an empty grid and you progressively add a random
value in a random cell until you get a puzzle with a unique solution or you start
from a complete grid (randomly generated) and you repeatedly withdraw a randomly
chosen entry as long as the resulting puzzle still has a unique solution. In
both cases, each step requires a proof that the current puzzle has a unique solution.
Now, this proof can be done in either of two ways: using a solver based on some
resolution theory or carrying out a recursive Trial and Error (or combining both).
The problem with the first method is that if some of the rules in your solver are
based on an assumption of uniqueness, you do not really prove that which you
intended. This is how some puzzles are proposed as having a single solution but
appear to have several.
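The second generation scheme, with a uniqueness check that does not rely on any U-rule, can be sketched as follows. This is only an illustration under stated assumptions: the function names count_solutions and dig are hypothetical, the complete grid FULL is a fixed pattern grid standing in for a randomly generated one, and uniqueness is established by exhaustive search (recursive Trial and Error) rather than by a rule-based solver. An entry is withdrawn only if the puzzle keeps exactly one solution.

```python
import random

def count_solutions(grid, limit=2):
    """Count the completions of grid (0 = empty), giving up once limit is reached."""
    grid = [row[:] for row in grid]

    def options(r, c):
        used = set(grid[r]) | {grid[i][c] for i in range(9)}
        br, bc = 3 * (r // 3), 3 * (c // 3)
        used |= {grid[i][j] for i in range(br, br + 3) for j in range(bc, bc + 3)}
        return [n for n in range(1, 10) if n not in used]

    def search():
        empties = [(r, c) for r in range(9) for c in range(9) if grid[r][c] == 0]
        if not empties:
            return 1
        r, c = min(empties, key=lambda rc: len(options(*rc)))
        total = 0
        for n in options(r, c):
            grid[r][c] = n
            total += search()
            grid[r][c] = 0
            if total >= limit:
                break  # enough solutions found to answer the uniqueness question
        return total

    return min(search(), limit)

def dig(full, attempts, seed=0):
    """Try to withdraw `attempts` randomly chosen entries, keeping the solution unique."""
    rng = random.Random(seed)
    puzzle = [row[:] for row in full]
    cells = [(r, c) for r in range(9) for c in range(9)]
    rng.shuffle(cells)
    for r, c in cells[:attempts]:
        saved = puzzle[r][c]
        puzzle[r][c] = 0
        if count_solutions(puzzle) != 1:
            puzzle[r][c] = saved  # this withdrawal would allow a second solution
    return puzzle

# A valid complete grid built from a standard shifting pattern (stand-in for a random one).
FULL = [[(3 * (r % 3) + r // 3 + c) % 9 + 1 for c in range(9)] for r in range(9)]

puzzle = dig(FULL, attempts=40, seed=1)
print(count_solutions(puzzle), sum(v == 0 for row in puzzle for v in row))
```

Replacing count_solutions by a solver that contains U-rules would defeat the purpose: the check could then report a unique solution for a puzzle that has several.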
Another question is: "do we need rules based on the assumption of uniqueness",
i.e. can these U-rules solve puzzles (effectively known to have a unique solution)
that could not be solved without them (let us say within the theories defined in this
book or within other theories built on well defined guiding principles)?
I therefore decided to try these rules on the part of Royle17 that was not solved
within L13 (or within its weak extension L16 by the sole xyt-chain rules). And I
actually found very few puzzles that could be solved by U-rules belonging to the
Unique Rectangles family.
One of the basic rules (or family of rules) in relation to uniqueness is named
Unique Rectangles. It is based on the following remark. Define a horizontal rectangle
as given by two different rows r1 and r2 and two different columns c1 and c2
(whose intersections define the corners of the rectangle) such that the two cells on
the first column are in the same block b1 and the two cells on the second column are
in the same block b2≠b1. Given such a rectangle, if, in a puzzle P, each of the four
cells at the corners has the same two numbers n1 and n2 as its only candidates,
then to any solution for P such that (r1, c1) = n1, there corresponds a solution such
that (r1, c1) = n2, and conversely. The proof is obvious, since permuting the two
numbers in these four cells does not change the constraints (along rows, columns
and blocks) their values impose on any other cell. As a result, no puzzle with a unique
solution can display the above pattern.
From this general remark, several "unique rectangle" U-resolution rules can be
devised, whose purpose is to avoid reaching a knowledge state with the above
pattern. Let us write the U-rules in this family that are most commonly encountered
in the literature, in their "horizontal" versions; as usual, transposing rows and
columns will provide "vertical" versions. In each of the following U-rules, we
consider a horizontal rectangle where the four corners are supposed to have n1 and
n2 as their only two candidates, unless otherwise specified. Proofs of the U-rules are
left to the reader as an easy exercise.
UR1-H (Type 1 Unique Rectangle – Horizontal): if (r2, c2) has one or more
additional candidate(s), then eliminate n1 and n2 from the candidates for (r2, c2).
UR2-H (Type 2 Unique Rectangle – Horizontal): if (r1, c2) and (r2, c2) both have
a third candidate n3 (different from n1 and n2), then eliminate n3 from the candidates
for any cell that shares a unit with (r1, c2) and (r2, c2), i.e. from any other cell in column
c2 or in block b2.
UR2b-H (Type 2b Unique Rectangle – Horizontal): if (r2, c1) and (r2, c2) both
have a third candidate n3 (different from n1 and n2), then eliminate n3 from
the candidates for any cell that shares a unit with (r2, c1) and (r2, c2), i.e. from any
other cell in row r2.
UR3-H (Type 3 Unique Rectangle – Horizontal): if each of (r2, c1) and (r2, c2)
has a third candidate different from n1 and n2, as in type UR2b, but these additional
candidates, n3 for (r2, c1) and n4 for (r2, c2), are different, and if there is a third cell
on row r2 such that its only two candidates are n3 and n4, then eliminate n3 and n4
from the candidates for any cell in row r2 other than the three already mentioned.
UR4-H (Type 4 Unique Rectangle – Horizontal): if (r1, c2) and (r2, c2) both have
additional candidates (different from n1 and n2) and if n1 is not a candidate for any
other cell in block b2, then eliminate n2 from the candidates for cells (r1, c2) and (r2,
c2).
UR4b-H (Type 4b Unique Rectangle – Horizontal): if (r2, c1) and (r2, c2) both
have additional candidates (different from n1 and n2), and if n1 is not a candidate for
any other cell in row r2, then eliminate n2 from the candidates for cells (r2, c1) and
(r2, c2).
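As an illustration of how such a U-rule can be turned into a condition-action rule, here is a minimal sketch of the Type 1 rule. It rests on assumptions that are not in the text: candidates are stored in a dictionary mapping each undecided cell (row, column), numbered from 0, to its set of candidates; the function name ur1_eliminations is hypothetical; and the corner carrying the additional candidates may be any of the four, so that the horizontal and vertical versions are covered at once.

```python
from itertools import combinations

def block(r, c):
    """Index (0 to 8) of the block containing cell (r, c)."""
    return 3 * (r // 3) + c // 3

def ur1_eliminations(cand):
    """Return {cell: digits to eliminate} for every Type 1 Unique Rectangle found.

    cand maps each undecided cell (row, col) to its set of candidates.
    """
    out = {}
    for r1, r2 in combinations(range(9), 2):
        for c1, c2 in combinations(range(9), 2):
            corners = [(r1, c1), (r1, c2), (r2, c1), (r2, c2)]
            if any(cell not in cand for cell in corners):
                continue  # a decided corner: no rectangle of undecided cells
            # The four corners must lie in exactly two blocks
            # (horizontal or vertical rectangle in the sense defined above).
            if len({block(*cell) for cell in corners}) != 2:
                continue
            for extra in corners:
                others = [cell for cell in corners if cell != extra]
                pair = cand[others[0]]
                # Three corners hold exactly {n1, n2}; the fourth holds them plus more.
                if (len(pair) == 2 and all(cand[o] == pair for o in others)
                        and pair < cand[extra]):
                    out.setdefault(extra, set()).update(pair)
    return out

# Rows 1-2 and columns 1 and 4: two blocks, three bivalue corners, one corner with an extra 5.
example = {(0, 0): {1, 2}, (0, 3): {1, 2}, (1, 0): {1, 2}, (1, 3): {1, 2, 5}}
print(ur1_eliminations(example))
```

Like every U-rule, this function is only sound for puzzles already known to have a unique solution.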
In the Royle17 database, we have found only five puzzles that cannot be solved
within L16 but can be solved if we add U-rules for Unique Rectangles. (Among the
puzzles not solved by L16, ten of them actually use Unique Rectangles, but for five
of them, this is not enough to reach a solution.) This is not a lot of puzzles, although
it is enough to prove that these U-rules are not subsumed by those in L16. Notice
that if we had given U-rules for Unique Rectangles priorities between those in L4
and L5 (a natural choice, since these patterns rely on four cells) this might have
changed the sets of puzzles solved at all the levels above L4. Nevertheless, we stick
to our preference for rules not based on the assumption of uniqueness (see section
3.4 below). Notice also that, when we add rules for 3D chains, all the examples
below can be solved at worst within M5 without resorting to any U-rule.
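Puzzle               Elaboration          Solution
. . . 1 . . 6 . 8    . . . 1 4 7 6 . 8    5 3 2 1 4 7 6 9 8
. 7 4 . . . . . .    . 7 4 . 6 . 3 1 .    9 7 4 2 6 8 3 1 5
. . . . . . . . .    . 6 1 3 . . 7 . 4    8 6 1 3 9 5 7 2 4
. 1 . . . 3 . . .    7 1 5 4 . 3 . 6 .    7 1 5 4 8 3 2 6 9
. . . . 7 . . 4 .    . . . 5 7 6 . 4 1    3 2 9 5 7 6 8 4 1
6 . . . . . 5 . .    6 4 . . 1 . 5 3 7    6 4 8 9 1 2 5 3 7
2 . . 6 5 . . . .    2 . 7 6 5 1 4 . 3    2 9 7 6 5 1 4 8 3
. . 3 . . . . 7 .    1 5 3 . . 4 . 7 6    1 5 3 8 2 4 9 7 6
. . . . . . 1 . .    4 . 6 7 3 . 1 . .    4 8 6 7 3 9 1 5 2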
Resolution path in L16+UR1 for the L16 (or L1_0) elaboration of Royle17-6526:
column c8 interaction-with-block b9 ==> r8c7 ≠ 8
row r8 interaction-with-block b8 ==> r9c6 ≠ 8
column c3 interaction-with-block b4 ==> r5c2 ≠ 8, r5c1 ≠ 8
block b7 interaction-with-column c2 ==> r5c2 ≠ 9, r1c2 ≠ 9
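Puzzle               Elaboration          Solution
. . . 6 . 4 1 . .    5 . . 6 3 4 1 9 7    5 8 2 6 3 4 1 9 7
. 3 7 . . . . . .    . 3 7 9 . . . . .    6 3 7 9 1 8 2 4 5
. 9 . . . . . . .    . 9 4 . . . 3 . .    1 9 4 2 7 5 3 8 6
8 . . 7 . . . 2 .    8 5 3 7 . 9 4 2 .    8 5 3 7 6 9 4 2 1
. . . . . . 5 3 .    9 7 . . 4 . 5 3 .    9 7 6 1 4 2 5 3 8
. . . . . . . . 9    2 4 . 5 . 3 7 . 9    2 4 1 5 8 3 7 6 9
4 . . . . . 6 . .    4 . 9 . . . 6 5 3    4 1 9 8 2 7 6 5 3
. . . 3 9 . . . .    7 . 5 3 9 . . . .    7 2 5 3 9 6 8 1 4
. . . . 5 . . . .    3 . . 4 5 . 9 7 .    3 6 8 4 5 1 9 7 2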
Resolution path in L16+UR1 for the L16 (or L1) elaboration of Royle17-10233:
column c5 interaction-with-block b5 ==> r5c6 ≠ 6
column c2 interaction-with-block b7 ==> r9c3 ≠ 6, r9c3 ≠ 1
xyzt4-chain {n8 n2}r8c7 – {n2 n1}r9c9 – {n1 n6}r4c9 – {n6 n8}r5c9 ==> r8c9 ≠ 8
xyzt8-chain {n2 n8}r9c3 – {n8 n1}r9c9 – {n1 n6}r4c9 – {n6 n8}r5c9 – {n8 n1}r6c8 –
{n1 n6}r6c3 – {n6 n1}r5c3 – {n1 n2}r5c6 ==> r9c6 ≠ 2
xyzt9-chain {n8 n2}r9c3 – {n2 n1}r9c9 – {n1 n6}r4c9 – {n6 n8}r5c9 – {n8 n1}r6c8 –
{n1 n6}r6c3 – {n6 n1}r5c3 – {n1 n2}r5c4 – {n2 n8}r5c6 ==> r9c6 ≠ 8
vertical unique rectangle type 1 {r1 r9}{c3 c2} ==> r9c2 ≠ 8, r9c2 ≠ 2
naked-pairs-in-a-row {n1 n6}r9{c2 c6} ==> r9c9 ≠ 1
block b9 interaction-with-row r8 ==> r8c6 ≠ 1, r8c2 ≠ 1
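Puzzle               Elaboration          Solution
. . . . 4 1 . . .    . . . . 4 1 8 . 7    6 5 2 9 4 1 8 3 7
. 7 . . . . 5 . .    . 7 4 . . . 5 1 .    8 7 4 6 3 2 5 1 9
. . . . . . . 6 .    . 1 . . 7 . 4 6 .    3 1 9 8 7 5 4 6 2
5 . . 7 . . 6 8 .    5 4 3 7 . . 6 8 1    5 4 3 7 2 9 6 8 1
. 9 . 3 . . . . .    1 9 8 3 6 4 2 7 5    1 9 8 3 6 4 2 7 5
. . . . . . . 4 .    . . . 1 5 8 . 4 .    2 6 7 1 5 8 9 4 3
4 . 1 . 8 . . . .    4 . 1 . 8 7 . . 6    4 2 1 5 8 7 3 9 6
. . . 2 . . 7 . .    . 8 . 2 1 . 7 . 4    9 8 6 2 1 3 7 5 4
. . . . . . . . .    . . . 4 . . 1 . 8    7 3 5 4 9 6 1 2 8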
Resolution path in L16+UR1+UR2 for the L16+UR1 (or L1_0) elaboration of Royle17-4218:
column c4 interaction-with-block b2 ==> r2c6 ≠ 6
column c9 interaction-with-block b3 ==> r1c8 ≠ 2
row r1 interaction-with-block b1 ==> r3c3 ≠ 2, r3c1 ≠ 2, r2c1 ≠ 2
hidden-pairs-in-a-row {n6 n8}r2{c1 c4} ==> r2c4 ≠ 9, r2c1 ≠ 9, r2c1 ≠ 3
vertical unique rectangle type 2 in cells {r4 r2}{c6 c5} ==> r2c9 ≠ 3
row r2 interaction-with-block b2 ==> r3c6 ≠ 3
xyt5-chain {n9 n2}r4c6 – {n2 n9}r4c5 – {n9 n3}r9c5 – {n3 n2}r2c5 – {n2 n9}r2c9 ==>
r2c6 ≠ 9
xyzt4-chain {n9 n5}r3c3 – {n5 n2}r3c6 – {n2 n3}r2c6 – {n3 n9}r2c5 ==> r3c4 ≠ 9
xyt5-chain {n3 n9}r1c8 – {n9 n2}r2c9 – {n2 n3}r2c6 – {n3 n9}r2c5 – {n9 n3}r9c5 ==>
r9c8 ≠ 3
xyt6-chain {n3 n9}r7c7 – {n9 n5}r7c4 – {n5 n8}r3c4 – {n8 n6}r2c4 – {n6 n9}r1c4 –
{n9 n3}r1c8 ==> r8c8 ≠ 3
block b9 interaction-with-row r7 ==> r7c2 ≠ 3
hxy-cn4-chain {r1 r7}c4n9 – {r7 r6}c7n9 – {r6 r7}c7n3 – {r7 r1}c8n3 ==> r1c8 ≠ 9
…(Naked Singles and Hidden Singles)
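Puzzle               Elaboration          Solution
. . . . . . . 8 2    6 3 . . . . . 8 2    6 3 7 9 4 1 5 8 2
. 4 . 6 . . . . .    . 4 8 6 . . . 1 .    5 4 8 6 7 2 9 1 3
. 1 . . . . . . .    . 1 2 8 . . . . .    9 1 2 8 3 5 7 6 4
2 . . . 9 8 . . .    2 7 . . 9 8 . . .    2 7 6 1 9 8 3 4 5
. . . . . . 1 . .    4 8 . . . . 1 . .    4 8 5 3 2 6 1 9 7
3 . . . . . . . .    3 9 . . . . 8 . .    3 9 1 7 5 4 8 2 6
8 . 3 . . . . 7 .    8 5 3 . . . 4 7 1    8 5 3 2 6 9 4 7 1
. . . 4 1 . 6 . .    . 2 . 4 1 3 6 5 8    7 2 9 4 1 3 6 5 8
. . . 5 . . . . .    1 6 4 5 8 7 2 . .    1 6 4 5 8 7 2 3 9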
xyzt5-chain {n7 n9}r1c3 – {n9 n5}r1c7 – {n5 n3}r4c7 – {n3 n1}r4c4 – {n1 n7}r1c4 ==>
r1c5 ≠ 7
xyzt6-chain {n5 n9}r3c6 – {n9 n7}r3c1 – {n7 n9}r8c1 – {n9 n5}r2c1 – {n5 n3}r2c9 –
{n3 n5}r3c7 ==> r3c5 ≠ 5
vertical unique rectangle type 4 {r9 r5}{c9 c8} ==> r5c9 ≠ 3, r5c8 ≠ 3
…(Naked Singles and Hidden Singles)
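Puzzle               Elaboration          Solution
. . . . 1 . 4 . .    . . . . 1 7 4 . .    3 9 2 6 1 7 4 5 8
. . . . 9 . . 7 .    1 . . . 9 . . 7 .    1 8 4 3 9 5 2 7 6
7 . . . . . . 3 .    7 . . . 8 . . 3 1    7 5 6 4 8 2 9 3 1
. 4 1 . . . 5 . .    . 4 1 8 7 . 5 . 2    6 4 1 8 7 3 5 9 2
. . . 2 . . . 8 .    . 7 . 2 . 1 . 8 .    9 7 3 2 5 1 6 8 4
. . . . . . . . .    . . . 9 . . . 1 .    5 2 8 9 4 6 7 1 3
. 6 . . . . 1 . 9    8 6 7 . . . 1 . 9    8 6 7 5 3 4 1 2 9
2 . . 7 . 8 . . .    2 1 9 7 . 8 . . .    2 1 9 7 6 8 3 4 5
. . . . . . . . .    4 . . 1 . 9 . . .    4 3 5 1 2 9 8 6 7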
Unless you need to prove uniqueness of a solution, whether or not you should
adopt rules based on the assumption of uniqueness is mainly a matter of personal
taste. Nevertheless I would like to make two remarks:
– a resolution path provides a stronger result when it does not use rules based on
this assumption: it does prove uniqueness;
– rules based on the assumption of uniqueness should therefore not be applied
before rules (or at least before rules of similar complexity) that do not use it.
Given the partial order relation we have defined on the set of rules introduced in
this book, where should rules based on the assumption of uniqueness be situated in
it (if we accept them)? The answer depends on the goals we pursue. If we want not
only to find a solution but also to prove it is the only one, then we should not use
these rules at all. If we admit uniqueness of the solution but we nevertheless prefer
to prove it independently whenever possible, then we should accept these rules as
last resort weapons (just before recursive Trial and Error) and put them at the
farthest end of the hierarchy, as we have done in the previous examples.
between L4 and L5. BUGs depend on many cells and should in any case be placed
as far as possible in the hierarchy.
We know that theories L16 or M28 are not complete and that the Uniqueness
rules discussed above are not enough to make them complete. Could they be made
complete using only rules of the types defined in this book, merely by increasing
the maximal lengths of the chains we consider (the Easter Monster example shows
that this is not the case), or do we need other types of rules, such as some of those
discussed in [STU 07] and in many Web forums? The same Easter Monster example
shows that there remain a few extreme cases for which the known rules are not
enough.
As for the rules in [STU 07], many of which are based on subsets (Almost
Locked Sets, Hinges, Grouped chains), SudoRules solves within theory L8 all the
puzzles this book proposes to illustrate them (for a full listing of the resolution
paths, see my Web pages), but this does not prove that it would also solve any
complex puzzle that can be solved with these rules. Moreover, all this is not to
suggest in any way that the set of rules defined here is better than Stuart’s. The goals
we pursue are very different. Whereas Stuart, as many Sudoku experts, is interested
in the diversity of rules that may be helpful (and that may be sheer bliss when you
find the opportunity of applying them), I am more interested in finding a minimal
set of rules that does the same job (preferably without resorting to subsets in potentially
exponential number). I think the advantages of such an approach are not only
theoretical but they may have a practical impact on the way one will search for a
solution. Finding complex patterns is a very demanding task and the more varied
patterns you want to spot on a grid, the more complex this task will become.
Limiting the number of different types of patterns to be looked for (if that does not
sacrifice the possibility of finding the solution or the simplicity of the resolution
path – two important restrictions) is thus a means of easing the resolution task.
Conclusion
In this conclusion, I’d like first to highlight a few facets of what has been achieved
in this book, from four complementary overlapping points of view.
1) From the point of view of the Sudoku addict, the most striking results should
be the following.
We have fully clarified the symmetry relationships that exist between Naked,
Hidden and Super-Hidden Subset rules (the latter being classically known as X-Wing,
Swordfish, Jellyfish, Squirmbag and other existent or non existent "fishy
patterns"). Such relationships had already been partly mentioned on some Web
pages, but never in the systematic way we have dealt with them here. As a result, we
have naturally found the proper formulations for the Hidden and Super-Hidden
Subset rules and we have proven that no other subset rule of such type can exist.
We have proven certain theorems showing that particular resolution rules can be
formally reduced to simpler ones in the above hierarchy (e.g. "xy-chains of length 3
are subsumed by Naked-Triplets plus XY-Wing").
We have evaluated the strength of each rule by the proportion of new puzzles
("new" according to the above hierarchy) its introduction allows us to solve and, for the
first time, such an evaluation has been based on a large collection of puzzles (more
than 56,000), with detailed results available online.
We have given chain rules a major place in this book (they occupy more than
half of it), because they are the main tool for dealing with hard puzzles but they remain
the subject of much confusion. We have introduced a general conceptual framework
(including the notion of a target not belonging to the chain) for dealing with
all conceivable types of chains and we have applied it systematically to all the types
we have defined. In particular, we have introduced an intuitive graphical language
of patterns for specifying chains and their targets, abstracting them from any irrelevant
facet (such as a link being a row or a column or a block), and we have shown
that these patterns are equivalent to logical formulæ. In our framework, the confusing
notions of "chain of inferences", "weak link" and "strong link" are never used;
our chains are well defined patterns of candidates, cells and links.
We have chosen the simplest kind of homogeneous chains, the xy-chains, as our
main type of chains, with all the other chains being intuitive generalisations of them.
We have proven that xy-chains and c-chains should have no loops (with the
practical consequence that searching for these chains becomes simpler for both a
human and a computer).
With each type of chain in natural row-column space, we have associated two
new types of chains, their hidden counterparts in row-number and column-number
spaces, and, in particular, we have shown the unifying power of the hidden xy-
chains. All these chains can be spotted in either of the 2D representations, using our
Extended Sudoku Board.
We have also proven theorems allowing us to combine our various types of homogeneous
chains to build heterogeneous, more complex, ones.
In this second edition, we have generalised the above 2D chains, introduced their
fully super-symmetric 3D versions (the nrc-, nrct-, nrcz- and nrczt- chains) and
shown that all these types of chains can still be considered in a natural way as
various generalisations of the basic xy-chains.
In particular, we have proven that, using only the types of chain rules introduced
in the first edition, it is necessary to consider chains of length more than thirty if we
want to have a chance of solving all the randomly generated puzzles without
resorting to Trial and Error or assuming the uniqueness of a solution.
In the first edition, we had exhibited a set of resolution rules (L13) based on only
2D chains that could solve 97% of the randomly generated minimal puzzles and
99.67% of the 36,628 17-minimal puzzles in the famous Royle database (without
resorting to Trial and Error or to an assumption of uniqueness).
In this second edition, we have also exhibited a set of resolution rules (M5) that
can solve more than 99% of the randomly generated minimal puzzles using only 3D
chains of length no more than five and a set of resolution rules (M7) that can solve
99.9% of these puzzles using only 3D chains of length no more than seven. As
psychologists consider that human short-term memory has size seven plus or minus
two, this means that a human being using these rules should be able to solve almost
any puzzle without any computer assistance (but still with some patience). It should
be noticed that these chains do not include subsets (Hinges or Almost Locked Sets
or grouped chains), contrary to the currently popular chains (Nice Loops or Alternating
Inference Chains), thus avoiding a potential source of exponential behaviour.
2) From the point of view of mathematical logic, our most obvious result is the
introduction of a strict formalism allowing a clear distinction between the straight-
forward Sudoku Theory (that merely expresses the constraints defining the game)
and all the possible Sudoku Resolution Theories formulated in terms of condition-
action rules (that may be put to practical use for solving puzzles). We have given a
clear logical definition of what a "resolution rule" is, as opposed to any logically
valid formula. With the notions of a resolution theory T and a resolution path (which
is merely a proof of the solution within T, in the sense of mathematical logic), we
have given a precise meaning to the widespread but as yet informal idea that one
wants a "pure logic solution". This leads to both sound foundations and intuitive
justifications for our resolution theories, exhibiting the following facets and consequences.
We have established a clear logical (epistemic) status for the notion of a candidate
– a notion that is quasi universally introduced for stating the resolution rules but
that does not pertain a priori to Sudoku Theory and that is usually used only from an
intuitive standpoint. Moreover, we have shown that the epistemic operator that must
appear in any proper formal definition of this notion can be "forgotten" in practice
when we state the resolution rules and that this notion can be considered as primary,
provided that we work with intuitionistic (or constructive) logic instead of standard
logic (this is not a restriction in practice). Notice that this whole approach can be
extended to any game that is based on techniques of progressive elimination of
candidates.
The natural symmetries of the Sudoku problem have been expressed as three general
meta-theorems asserting the validity of resolution rules obtained by some simple
transformations of those already proven. These meta-theorems have been stated
and proven both intuitively and formally. As a first example of how these meta-
theorems can be used in practice, we have exhibited a precise relationship between
well known (Naked and Hidden) Subset rules with what we call their Super-Hidden
counterparts (the famous "fishy patterns") and we have proven some form of
completeness of the set of known Subset rules. As a second example of how these
meta-theorems can be used in practice, we have defined entirely new types of chain
rules, hidden chains of various types, and shown their unifying power.
We have also devised a direct proof of the existence of a simple and striking
relationship between Sudoku and Latin Squares: a block-free resolution rule (i.e. a
rule that does not mention blocks or squares) is valid for Sudoku if and only if it is
already valid for Latin Squares. Notice that it does not seem one can prove this
result by using only the general methods one would expect to see used in such cases:
either the interpolation theorem or the techniques of Gentzen’s sequent calculus.
3) From the point of view of Artificial Intelligence (AI), the following should
be stressed.
Sudoku is a wonderful example for AI teachers. It has simpler rules and is more
accessible for student projects than games such as chess or go, but it is much more
complex and exciting than the usual examples one can find in AI textbooks
(Tic-Tac-Toe, Towers of Hanoi, Monkey and Bananas, Blocks World, …). It easily suggests
lots of projects based on the introduction and the formalisation of new types of rules
since no complete set of resolution rules is known (see below).
Sudoku is also a wonderful testbed for the inference engine chosen to run the
knowledge base of resolution rules. The logic of the set of rules is so intricate that
many functionalities of the inference engine are stringently tested, which is how we
discovered a longstanding undocumented bug in the management of saliences in
JESS. With long series of puzzles to solve, memory management can also be a
problem (as it is in CLIPS).
The previous topic is related to a crucial problem of AI, both practical and
epistemological: how can one be sure that the system does what it is intended to do?
Although this is already a very difficult question for classical software (i.e. mainly
procedural, notwithstanding the object-oriented refinements), it is much worse for
AI, for two main reasons. Firstly, the logic underlying a knowledge base is generally
much more complex than a set of procedures is (otherwise it would probably be
much better to solve the problem with procedural techniques) and secondly an
inference engine is a very complex piece of software and debugging it is very difficult.
As a general result, using AI to prove theorems (although this has always been
and remains one of the key subtopics of the domain) may make mathematicians and
logicians very suspicious.
As a specific result, all the independence theorems that have been proven in this
book rely on the correctness of the inference engine we used for our computations.
They do not depend on this correctness when we assert that "theory T allows the
following resolution path for puzzle P", since the validity of any particular path can
easily be checked "by hand", whichever method was used to generate it. But these
independence results depend on this correctness any time we state that a particular
theory is not able to find a solution for some puzzle P. This is why all our
computations have finally been done with CLIPS instead of JESS, despite the fact that CLIPS
regularly gets lost in memory management problems and computation times on long
series of puzzles become astronomical. But the fact that, due to its problem with the
management of saliences (see footnote 6), JESS misses some inferences on simpler rules, even
though this may be infrequent, disqualifies it as a theorem prover in our case (all the
more so as the missed inferences vary from one version to the next). Obviously,
this does not prove that CLIPS is bug-free. The only certain conclusions are that,
using the same knowledge base, CLIPS solved (a few) more puzzles than JESS and
never returned any spurious message of type "this puzzle has no solution".
6 It seems that this bug has been corrected in the latest release of JESS (71b1) available as of
the publication of this second edition. But we haven’t carried out systematic tests.
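The asymmetry between the two kinds of assertions can be illustrated with a short Python sketch. Checking a positive claim needs no inference engine at all: the end point of a resolution path is a filled grid, and a few lines suffice to verify that it satisfies the four constraints and agrees with the entries of the puzzle. The function name is_valid_solution and the grid representation (9x9 lists of lists, with 0 for an empty cell in the givens) are our own assumptions for this illustration. The sketch checks only the final grid, not each rule application along the path, and nothing comparable exists for a negative claim of the form "theory T cannot solve puzzle P".

```python
def is_valid_solution(solution, givens):
    """Return True if `solution` is a complete, valid grid that extends `givens`.

    Both arguments are assumed to be 9x9 lists of lists of integers;
    `givens` uses 0 for an empty cell (a hypothetical representation).
    """
    full = set(range(1, 10))

    # Each row, column and block must contain every digit from 1 to 9.
    # With nine cells per unit, this also guarantees that no digit repeats.
    rows = [set(row) for row in solution]
    cols = [set(col) for col in zip(*solution)]
    blocks = [
        {solution[r][c] for r in range(br, br + 3) for c in range(bc, bc + 3)}
        for br in (0, 3, 6)
        for bc in (0, 3, 6)
    ]
    units_ok = all(unit == full for unit in rows + cols + blocks)

    # Every entry of the puzzle must be preserved in the solution.
    givens_ok = all(
        givens[r][c] in (0, solution[r][c])
        for r in range(9)
        for c in range(9)
    )
    return units_ok and givens_ok
```

Such a check is independent of whichever tool produced the grid, which is precisely why the positive assertions do not rely on the correctness of the inference engine.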
On the contrary, with the popular game of Sudoku, you can get a feeling of
another type of complexity, computational complexity (how this is related to the
previous ones remains an interesting but very difficult question). Sudoku is known to
be NP-complete, i.e., to state it very informally (and somewhat incorrectly), when
we consider grids of increasing sizes, resolution times are expected to grow faster
than any polynomial in the grid size, whatever deterministic algorithm is used. As
you will never try to solve a Sudoku puzzle on a
100x100 grid (unless you have unlimited free access to a psychoanalyst), this may
also remain an abstract definition. There is nevertheless a difference with the
previous examples: you can already get a foretaste of the underlying complexity with
the standard 9x9 puzzles (e.g. by comparing them to their homologues on 4x4 grids,
the so-called Sudokids).
The Sudoku problem is defined by four very simple constraints, immediately
understood by everybody in terms of "single occupancy" of a cell and of "mutual
exclusion" in a row, a column or a block. For a classically formatted mind, it is
therefore natural to think that any puzzle can easily be proven to have no solution or be
solved by a finite set of simple operational resolution rules of the condition-action
type: "in such a situation carry out such an action (assert a value or eliminate a
candidate)". And this idea can only be reinforced if you consider the majority of
puzzles published in newspapers. But the independence results proven in this book
through a multiplicity of examples have shown that very complex resolution rules
are indeed needed.
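To make the notion of a condition-action rule concrete, the following Python sketch encodes the mutual exclusion constraints through the "peers" of a cell and implements the simplest rule of all, the Naked Single. It is only an illustration: the function names (peers, initial_candidates, naked_single) and the grid representation (a 9x9 list of lists, with 0 for an empty cell) are assumptions made for this sketch and are not taken from SudoRules, which is written as rules for an inference engine.

```python
from itertools import product

DIGITS = set(range(1, 10))


def peers(r, c):
    """Cells that share a row, a column or a block with cell (r, c)."""
    br, bc = 3 * (r // 3), 3 * (c // 3)
    same_row = {(r, j) for j in range(9)}
    same_col = {(i, c) for i in range(9)}
    same_block = {(i, j) for i in range(br, br + 3) for j in range(bc, bc + 3)}
    return (same_row | same_col | same_block) - {(r, c)}


def initial_candidates(grid):
    """Map each empty cell to the digits not already present among its peers."""
    cands = {}
    for r, c in product(range(9), repeat=2):
        if grid[r][c] == 0:
            used = {grid[i][j] for i, j in peers(r, c)}
            cands[(r, c)] = DIGITS - used
    return cands


def naked_single(grid, cands):
    """Condition: some empty cell has exactly one candidate left.
    Action: assert that value and eliminate it from the peers of the cell.
    Returns True if the rule fired (once), False otherwise."""
    for cell, values in cands.items():
        if len(values) == 1:
            break  # condition satisfied for this cell
    else:
        return False  # no instantiation of the rule
    (value,) = values
    r, c = cell
    grid[r][c] = value
    del cands[cell]
    for p in peers(r, c):
        if p in cands:
            cands[p].discard(value)
    return True
```

Most newspaper puzzles yield to a handful of rules of this elementary kind; the independence results mentioned above show that no such small set is sufficient in general.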
What this book has shown then, in both intuitive and logically grounded ways, is
that writing a set of operational rules for solving an apparently very simple
constraint propagation problem may be a very complex task. (Indeed, notwithstanding
their overall complexity, the rules that have been defined in this book do not even
form a complete resolution theory.) Moreover, as all the NP-complete problems are
equivalent (through polynomial transformations) and some of them have lots of
practical applications, such as the famous travelling salesman problem, dealing with the
apparently futile example of Sudoku may provide intuitions on problems that seem
to be unrelated.
What has been partly achieved (from the point of view of AI)
In the introduction, we said we wanted a set of rules that would simulate a
human solver and that could explain each of the resolution steps. The explanations
produced by SudoRules are largely illustrated by the listings given in this book; they
are sufficiently explicit once you know the definitions of our rules and it would be
easy work to make them still more explicit for those who do not know them; but we
do not consider this as a very exciting topic. As for the solver facet, SudoRules does
simulate a human solver, a particular kind of player who would try all our rules
systematically (ordered according to their complexity) on all of their potential
instantiations.
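The control regime just described (try the rules in order of increasing complexity and come back to the simplest ones after each successful application) can be sketched as a small loop. This is a minimal Python illustration under our own assumptions: the Rule triple, the run function and the dictionary used as a state are hypothetical names introduced here, and the loop is a simplified stand-in for the way an inference engine actually schedules rules.

```python
from typing import Callable, List, Tuple

# A rule is (name, complexity, apply). apply(state) looks for an instantiation
# of the rule; if it finds one, it modifies the state and returns True.
Rule = Tuple[str, int, Callable[[dict], bool]]


def run(rules: List[Rule], state: dict,
        is_solved: Callable[[dict], bool]) -> List[str]:
    """Apply the rules systematically, simplest first.

    Returns the resolution path as the list of the names of the rules
    applied, in order. The loop stops when the state is solved or when
    no rule applies any more (the set of rules is then "stuck")."""
    ordered = sorted(rules, key=lambda rule: rule[1])
    path = []
    while not is_solved(state):
        for name, _complexity, apply in ordered:
            if apply(state):
                path.append(name)
                break  # restart the search from the simplest rule
        else:
            break  # no rule has any instantiation: give up
    return path
```

Changing the complexity values, and therefore the ordering, is enough to experiment with a different strategy without touching the rules themselves.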
Is it likely that any human solver would proceed in such a systematic way? He
may prefer to concentrate on a part of the puzzle and try to eliminate a candidate
from a chosen cell (or group of cells). What may be missing then in our system is a
"strategic" knowledge level: when should one look for such or such pattern? But I
have no idea of which criteria could constitute a basis for such strategic knowledge;
moreover, as far as I know, whereas there is a plethora of literature on resolution
techniques (often misleadingly called strategies), nothing has ever been written on
the ways they should be used, i.e. on what might legitimately be called strategies.
To say it otherwise, we do have a strategic level: searching for the different
patterns in their order of increasing complexity. Notice that there is already more
strategy in this than is proposed by most of the books or Web pages on the subject. The
question then is: can one define a better (or at least another) strategy? Well, the rules
in this book (and the corresponding software SudoRules) are there; you can use
them as a basis for further analysis of alternative strategies. One of the simplest
ways to do so is to modify the complexity measure and the ordering we have
defined on the set of rules. For instance, using psychological analyses, one could
break or relax the symmetry constraints we have adopted.
What was not our purpose and has not been achieved; open questions
The first thing that has not been done in this book is a review of all the advanced
rules that have been proposed, mainly on the Web (under varying names, in more or
less clear formulations, with more or less defined scopes). The list would be too
long and moreover it is regularly increasing. The best place to get an idea on this
topic is in the recent book by Andrew C. Stuart ([STU 07]) or on the Web, in
particular in the Sudoku Players Forum:
http://www.sudoku.com/forums/viewtopic.php?t=3315
(with the problem that chains are often considered as "chains of inferences" instead
of patterns and they are sometimes classified according to absurd criteria).
Instead, our two main purposes in this regard were to take advantage of the
symmetries of the game in a systematic way and to isolate a limited number of rule
types, with rule definitions extended as far as the arguments used in their proofs
allowed: this is how we introduced xyt-, xyz- and xyzt- chain rules (on the basis of
xy-chain rules) and their hidden counterparts (on the basis of our general
meta-theorems); this is also how this second edition introduced the 3D chain rules. Of
course, we do not claim that no other general type of rules
should be added to ours. For instance, if you admit uniqueness of the solution (i.e.
add the axiom of uniqueness), much work remains to be done in order to clarify all
the techniques that have been proposed to express it in the form of U-resolution
rules. But one of the main questions in this regard is: should we accept rules for nets
or for chains of subsets? In a sense, AICs based on subsets appear to be nets when
we try to re-formulate them as chains of candidates; but they are mild kinds of nets.
The nrczt-chains, which have approximately the same solving power as the most
complex AICs, prove that including subsets (Hinges, Almost Locked Sets) in chains
is not necessary. On the other hand, general tagging algorithms that can solve
anything correspond to unrestricted kinds of nets and they are not of much help for
answering the question: which (mild) kinds of nets should we accept?
Viewed from the methodological standpoint, more than proposing a final set of
resolution rules, our purpose was to set some minimal standard in the way one
should systematise the definition of rules in natural language, formalise them in
logic (or in equivalent graphical representations), implement them (as rules for an
inference engine or in any other formalism that can be run on a computer, e.g. as
strictly defined colouring or tagging resolution techniques) and test their validity
and efficiency through the treatment of large collections of examples. It is our
opinion that only this complete cycle may bring some clarity into the subject.
The second thing that has not been achieved in this book is the discovery of a
complete resolution theory that would make no uniqueness assumption and that
would not use Trial and Error. Our strongest resolution theory (L16 in the first
edition, M28 in this second edition) cannot solve all the minimal puzzles that have a
single solution. It can solve almost all these puzzles, but not all of them, and
increasing the maximal length of the chains would not help. Indeed, no set of
resolution rules is known that would solve such exceptionally complex
puzzles as Easter Monster. Some kind of nets may be necessary. Defining very
complex types of nrczt-nets is very easy; defining useful but mild ones is more
difficult.
Finally, another related question is: does our strongest resolution theory (L13, or
its weak extension L16 or the 3D theory M28) detect all the puzzles that have no
solution? We have found no example that could not be detected. But this
nevertheless leaves the question open. Underlying this question, there is a more general
informal one, still open: is formulating a priori necessary and sufficient criteria on
the existence of a solution (criteria that would only bear on the entries of a puzzle)
easier than finding a complete resolution theory?
References