
Our puzzle solving framework is based on a greedy solver, which works in several phases. First, a compatibility function is computed to measure the affinity between two neighboring parts (as is often done in other solvers). Then the solver addresses three problems: the placement problem, the segmentation problem, and the shifting problem. The placement module places all parts on the board in an informed fashion, the segmentation module identifies regions which are likely to be assembled correctly, and the shifting module relocates regions and parts to produce the final result.
Our main contribution is a fully automated solver that uses no clues, hints, or other prior knowledge whatsoever. Our contributions also include new and better compatibility metrics, as well as the introduction of an estimation metric that evaluates the quality of a given solution without any reference to the original, ground-truth image. As we show, these metrics become a critical tool that facilitates self-evaluation in cases where no clues or prior knowledge are available. In what follows we first discuss these, as well as the rest of the metrics involved in the solver.
2. Metrics
2.1. Compatibility Metrics
A compatibility metric is at the foundation of every jigsaw solver. Given two puzzle parts x_i, x_j and a possible spatial relation R between them, the compatibility function predicts the likelihood that these two parts are indeed placed as neighbors with relation R in the correct solution. With R ∈ {l, r, u, d}, we use C(x_i, x_j, R) to denote the compatibility that part x_j is placed on the left, right, upper, or lower side of part x_i, respectively.
Optimally, one would wish to find a metric C such that C(x_i, x_j, R) = 1 iff part x_j is located to the R side of x_i in the original image, and 0 otherwise. If such a function existed, the jigsaw puzzle problem could be solved in polynomial time by a greedy algorithm [7]. Motivated by the above, we first discuss how one can measure the accuracy of a compatibility function and then seek a compatibility function C which is as accurate and discriminative as possible.
Measuring Compatibility Accuracy: In order to compare compatibility metrics, we define the classification accuracy as the ratio between the number of correct placements and the total number of possible placements. Following Cho et al. [5], the placement of parts x_i and x_j is considered correct according to the compatibility metric C if part x_j is in relation R to part x_i in the original image and if

∀x_k ∈ Parts,  C(x_i, x_j, R) ≥ C(x_i, x_k, R) .    (1)
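The criterion of Eq. (1) can be sketched concretely for the right-neighbor relation. The snippet below is our own illustrative reading, not the paper's code: it uses the squared-difference dissimilarity (lower is better, i.e., higher compatibility) and counts a part as correctly classified when its true right-hand neighbor attains the best score among all candidates (ties, which Eq. (1) admits via ≥, are resolved arbitrarily here).

```python
import numpy as np

def classification_accuracy(parts, true_right):
    """Fraction of parts whose true right-hand neighbor achieves the
    best (lowest) dissimilarity among all candidates, in the spirit
    of Eq. (1).  `parts` is a list of KxKx3 arrays; `true_right[i]` is
    the index of the part to the right of part i in the original
    image, or None if part i lies on the right border."""
    def d_right(a, b):
        # Squared color differences along the abutting boundary.
        return float(np.sum((a[:, -1, :] - b[:, 0, :]) ** 2))

    correct = total = 0
    for i, xi in enumerate(parts):
        j = true_right[i]
        if j is None:
            continue
        total += 1
        best = min((k for k in range(len(parts)) if k != i),
                   key=lambda k: d_right(xi, parts[k]))
        if best == j:
            correct += 1
    return correct / total if total else 0.0
```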
Recently, Cho et al. [5] evaluated five compatibility metrics, among which the dissimilarity-based compatibility metric was shown to be the most discriminative. Inspired by both the dissimilarity-based metric and the characteristics of natural images, we propose two new types of compatibility metrics, which we describe shortly after reviewing the dissimilarity-based compatibility metric.
Dissimilarity-based Compatibility: The dissimilarity between two parts x_i, x_j can be measured by summing the squared color differences of the pixels along the parts' abutting boundaries [5]. For example, if we represent each color image part in normalized LAB space by a K × K × 3 matrix (where K is the part width/height in pixels), then the dissimilarity between parts x_i and x_j, where x_j is to the right of x_i, can be defined as

D(x_i, x_j, r) = Σ_{k=1}^{K} Σ_{d=1}^{3} (x_i(k, K, d) − x_j(k, 1, d))^2 .    (2)
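In numpy, Eq. (2) amounts to comparing one pixel column of each part. The function name and argument layout below are ours; each part is assumed to be a K × K × 3 array in normalized LAB space, as described above.

```python
import numpy as np

def dissimilarity_right(xi, xj):
    """Eq. (2): sum of squared differences between xi's rightmost
    pixel column and xj's leftmost pixel column, over all 3 color
    channels, with xj assumed to sit to the right of xi."""
    diff = xi[:, -1, :] - xj[:, 0, :]
    return float(np.sum(diff ** 2))
```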
We emphasize that dissimilarity is not a distance measure, i.e., D(x_j, x_i, R) is not necessarily the same as (and is almost always different from) D(x_i, x_j, R).
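The asymmetry is easy to verify numerically: swapping the arguments compares entirely different pixel columns. A toy check, with the right-relation dissimilarity redefined here so the snippet stands alone:

```python
import numpy as np

def d_right(a, b):
    # Squared-difference dissimilarity with b placed to the right of a.
    return float(np.sum((a[:, -1, :] - b[:, 0, :]) ** 2))

rng = np.random.default_rng(0)
xi = rng.random((4, 4, 3))
xj = rng.random((4, 4, 3))

# d_right(xi, xj) compares xi's right edge with xj's left edge, while
# d_right(xj, xi) compares xj's right edge with xi's left edge --
# different pixel columns, so the two values generally differ.
assert d_right(xi, xj) != d_right(xj, xi)
```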
Although the dissimilarity-based metric was shown to be the most discriminative among the metrics tested by Cho et al. [5], the observation that it is related to the L_2 norm of the boundaries' difference vector suggests that other norms of the same difference vector could behave even better. Inspired by the use of non-Euclidean norms in image operations such as noise removal [19] or diffusion [20], and by the observation that the L_2 norm penalizes large boundary differences very severely even though such large differences do exist in natural images, we evaluated other L_p norms as well. The average accuracy results for dissimilarity metrics based on various L_p norms are shown in Fig. 1 and indicate clearly that L_2 is suboptimal, and that the best results are obtained with L_p norms with p ≈ 0.3. While it is outside the scope of this paper, an interesting question that emerges from these data is why performance breaks down for small p values, and why p ≈ 0.3 appears to provide optimal performance (see the Discussion and Future Work section).
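One plausible reading of an L_p variant of Eq. (2) is sketched below; the exact (L_p)^q form is defined in the next subsection, so this is only an illustrative generalization with a function name and default of our choosing.

```python
import numpy as np

def dissimilarity_lp(xi, xj, p=0.3):
    """Lp-norm dissimilarity between xi's right edge and xj's left
    edge: (sum |diff|^p)^(1/p).  Smaller p penalizes large boundary
    differences less severely than L2; p ~= 0.3 is the empirically
    best value reported in the text."""
    diff = np.abs(xi[:, -1, :] - xj[:, 0, :])
    return float(np.sum(diff ** p) ** (1.0 / p))
```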
[Figure 1: plot of classification accuracy vs. p, peaking at p = 0.3 with accuracy 0.8648]
Figure 1. Comparing dissimilarity-based metrics using different L_p norms. For this test we used a database of 20 test images [5], and analyzed each for the portion of correct part pairs that received the highest compatibility score among other candidates. The plot depicts the average classification accuracy over the 20 images and shows how performance peaks at p ≈ 0.3, with a value of 86%.
(L_p)^q Compatibility: As mentioned before, the optimal