Multi-Phase Relaxation Labeling for Square Jigsaw Puzzle Solving
Ben Vardi
1 a
, Alessandro Torcinovich
2 b
, Marina Khoroshiltseva
2,3 c
, Marcello Pelillo
2,4 d
and Ohad Ben-Shahar
1 e
1
Ben-Gurion University of the Negev, Be’er-Sheva, Israel
2
Ca’ Foscari University of Venice, Venice, Italy
3
Istituto Italiano di Tecnologia, Genoa, Italy
4
European Centre of Living Technology, Venice, Italy
Keywords:
Puzzle Solving, Square Jigsaw Puzzles, Relaxation Labeling.
Abstract:
We present a novel method for solving square jigsaw puzzles based on global optimization. The method is fully
automatic, assumes no prior information, and can handle puzzles with known or unknown piece orientation.
At the core of the optimization process is nonlinear relaxation labeling, a well-founded approach for deducing
global solutions from local constraints, but unlike the classical scheme here we propose a multi-phase approach
that guarantees convergence to feasible puzzle solutions. Next to the algorithmic novelty, we also present a
new compatibility function for the quantification of the affinity between adjacent puzzle pieces. Competitive
results and the advantage of the multi-phase approach are demonstrated on standard datasets.
1 INTRODUCTION
The jigsaw puzzle game is a well-known and time-
honored pastime for children and adults. Given
n non-overlapping image fragments, commonly re-
ferred to as puzzle pieces, the goal is to assemble
them into a coherent visual image, preferably recon-
structing the original (possibly unknown) image it-
self. Although large jigsaw puzzles are successfully
solved by humans, the problem posed by this pop-
ular game is rather difficult, as it was shown to be
NP-complete (Demaine and Demaine, 2007). Algo-
rithmic capability to solve jigsaw puzzles has impor-
tant applications in several fields, such as information
security (Zhao, Su, Chou and Lee, 2007), assembly
of shredded documents and photos (Deever and Gal-
lagher, 2012; Liu, Cao and Yan, 2011) and archaeol-
ogy (Koller and Levoy, 2006).
A particular type of jigsaw puzzles that has raised
interest in the scientific community deals with square
jigsaw puzzles (e.g., Cho, Avidan and Freeman, 2010;
Pomeranz, Shemesh and Ben-Shahar, 2011), where
a
https://orcid.org/0000-0002-6950-8297
b
https://orcid.org/0000-0001-8110-1791
c
https://orcid.org/0000-0003-0424-0661
d
https://orcid.org/0000-0001-8992-9243
e
https://orcid.org/0000-0001-5346-152X
all pieces have identical square shape, geometric in-
formation is unavailable, and only pictorial informa-
tion may be used for driving the solution. This is also
why solving square jigsaw puzzles marks an upper
bound of jigsaw puzzle reconstruction performance
when geometric cues are diminishing. Such an under-
standing is a key step in developing solvers for real-
world puzzles, in which both pictorial and geometric
information are usually available in varying degrees.
Square jigsaw puzzle solvers may be distin-
guished by two criteria. The first is the evaluation
method for potential piece matchings, which defines
the compatibility of placing any two puzzle pieces
in one of the possible spatial 4-neighborhood adja-
cency relations. The measure used for this evaluation
is commonly referred to as piece compatibility mea-
sure. The second criterion is the algorithmic method
used to process these compatibilities in order to infer
a solution for the full puzzle. In other words, one may
argue that square jigsaw puzzle solvers use local con-
straints (piece compatibilities) in order to deduce a
global solution (piece placements that seek to restore
the original image).
While several attempts were made to solve square
jigsaw puzzles and impressive results were achieved
(e.g., Pomeranz et al., 2011; Andal
´
o, Taubin and
Goldenstein, 2016; Son, Hays and Cooper, 2018),
most of the suggested algorithms use greedy meth-
Vardi, B., Torcinovich, A., Khoroshiltseva, M., Pelillo, M. and Ben-Shahar, O.
Multi-Phase Relaxation Labeling for Square Jigsaw Puzzle Solving.
DOI: 10.5220/0011622800003417
In Proceedings of the 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2023) - Volume 4: VISAPP, pages
785-795
ISBN: 978-989-758-634-7; ISSN: 2184-4321
Copyright
c
2023 by SCITEPRESS Science and Technology Publications, Lda. Under CC license (CC BY-NC-ND 4.0)
785
ods rather than global optimization methods. Nev-
ertheless, global optimization algorithms are gener-
ally preferred over greedy algorithms for their ability
to produce solutions which simultaneously take into
consideration all the local potential piece matchings.
Moreover, global optimization approaches may often
facilitate additional mathematical analysis (Floudas,
2013) and thus a better understanding of the dynam-
ics of the problem and solution process.
A well-founded global optimization approach that
handles problems where local constraints are given
and global interpretation should be found is relax-
ation labeling (RL). The approach, first introduced
by Rosenfeld, Hummel and Zucker (1976), is a class
of iterative procedures for refining label assignments
based on contextual constraints. RL has been appli-
cable in solving many global optimization problems,
most commonly from the computer vision field (e.g.,
Zucker, Hummel and Rosenfeld, 1977). Although
iterative, and reminiscent of gradient ascent, it can
be formulated without any step size (Pelillo, 1997;
Rosenfeld et al., 1976) while guaranteeing the op-
timization of its objective function (Hummel and
Zucker, 1983; Pelillo, 1997).
Recently, preliminary results of using RL for
square jigsaw puzzle solving (Khoroshiltseva et al.,
2021) demonstrated the potential of this approach, but
nevertheless suffered from several limitations such as
occasional failures to converge to a feasible solution
or being limited to puzzles in which piece orienta-
tions are known (known as Type 1 puzzles; Gallagher,
2012). In this work we further explore the potential of
RL for square jigsaw puzzle solving and present a new
RL algorithm that does guarantee feasible solutions,
performs better than the preliminary approach, and
handles both Type 1 but also Type 2 puzzles (whose
piece orientations are unknown). In addition to the al-
gorithmic novelty, another contribution of this paper
is the proposal of a new piece compatibility measure.
2 RELATED WORK
2.1 Puzzle Solving
The topic of puzzle solving has long been of interest
in the scientific community. The first to investigate
puzzles were Freeman and Garder (1964), focusing
on the geometric information of irregularly shaped
puzzle pieces to suggest the first computational apic-
torial solver. Three decades later, Kosiba, Devaux,
Balasubramanian, Gandhi and Kasturi (1994) were
the first to also consider pictorial puzzles in their clas-
sical “toy” jigsaw puzzle solver.
While the effort to combine pictorial and geomet-
ric information was followed also by others, the line
of research that has been dominating the literature fo-
cused on square jigsaw puzzles in which all pieces
have an identical square shape, with puzzle dimen-
sions being often known. While the objective of all
different square jigsaw puzzle solvers is the same,
solution approaches vary greatly. A first notable at-
tempt was made by Cho et al. (2010) who employed
loopy belief propagation algorithm for Type 1 puz-
zles. Although being able to solve the biggest puzzle
at the time, their algorithm was only semi-automatic,
requiring at least some ground truth information. A
year later, Pomeranz et al. (2011) published the first
fully automatic solver, requiring no prior knowledge.
Their iterative greedy Type 1 solver is based on three
steps: pieces placement according to the piece com-
patibility measure, segmentation of the proposed so-
lution into coherently assembled segments, and shift-
ing segments to improve the current solution. Despite
being simpler, this approach provided not only a leap
in performance, but also the ability to solve puzzles
an order of magnitude larger than before.
Following Pomeranz et al. (2011), Andal
´
o
et al. (2012; 2016) were the first to formulate the
problem as a global optimization problem using
a constrained quadratic objective function. Gal-
lagher (2012) used a tree-based greedy algorithm in-
spired by the well-known Kruskal’s algorithm for
finding minimum spanning trees. Gallagher (2012)
was the first to introduce and solve Type 2 puzzles,
namely puzzles in which piece orientations are un-
known. Shortly after, Sholomon, David and Ne-
tanyahu (2013) devised a genetic algorithm-based
solver for Type 1 puzzles, and later extended it for
Type 2 puzzles (Sholomon, David and Netanyahu,
2014).
A different approach for the problem was pro-
posed by Son et al. (2014; 2018) who devised a Type
1 and 2 solver that utilizes loops of pieces that form
consistent cycles. More recently, Huroyan, Lerman
and Wu (2020) proposed a Type 2 method based on
recovering piece orientations by estimating the graph
connection Laplacian. Of special importance to our
work is the global optimization solver by Khoroshilt-
seva et al. (2021), which is based on RL. Due to its
relevance, we dedicate Section 2.3 to that solver.
Several neural network methods have been sug-
gested to solve square jigsaw puzzles, most of them
applicable only for small puzzles. Paumard, Pi-
card and Tabia (2020) offered a convolutional neu-
ral network-based solver that predicts neighboring re-
lations in 9 pieces puzzles. Li, Liu, Wang, Liu and
Zeng (2021) devised a generative adversarial network
VISAPP 2023 - 18th International Conference on Computer Vision Theory and Applications
786
(GAN) pipeline for training a classification network
to predict pieces permutation in puzzles with up to 16
pieces. Talon, Del Bue and James (2022) suggested
another GAN-based approach that is capable of solv-
ing bigger puzzles, but suffers from diminishing ac-
curacy as puzzle size gets bigger.
Along with the different algorithmic approaches
to the square puzzle problem, different piece compat-
ibility measures have been also developed over time.
Such measures define the compatibility of placing
any two puzzle pieces in any of the 4-neighborhood
adjacency relations. They are most commonly de-
fined as a function C(b
i
, b
j
, R) that specifies how
compatible it is to place piece b
j
in relation R
{right, down, le f t, up} to piece b
i
. In virtually all
solvers, piece compatibilities are based on a more fun-
damental measure of piece dissimilarity that usually
compares the pictorial information along the bound-
aries of pieces. This measure is later mapped to a
score of compatibility.
2.2 Relaxation Labeling
Relaxation labeling (RL) is a class of processes for
ambiguity reduction of labeling assignments, done by
iteratively refining assignments based on contextual
information (Rosenfeld et al., 1976). Among these
processes, in the current work we focus on nonlinear
relaxation labeling, where problems involve four ele-
ments:
(1) Set of n objects B = {b
1
, ...,b
n
}.
(2) Set of m possible labels Λ = {λ
1
, ...,λ
m
}.
(3) Four-dimensional matrix of compatibility coeffi-
cients {r
i j
(λ, µ)} R
n
2
m
2
, expressing how com-
patible are the two hypotheses of “labeling object
b
i
with label λ” and “labeling object b
j
with label
µ”.
(4) Initial labeling ¯p
(0)
R
nm
, containing prior prob-
abilities for labeling each object b
i
with each label
λ.
The purpose of nonlinear relaxation labeling
(henceforth referred to as just RL) is to provide a con-
sistent labeling assignment ¯p R
nm
that represents a
viable solution to the problem. An assignment is a set
of probabilities for assigning each label λ to each ob-
ject b
i
. While the initial labeling ¯p
(0)
is already qual-
ified as such, it may not be “good enough” in terms
of satisfying the local constraints represented by the
compatibility coefficients. An assignment that does
satisfy these constraints is called consistent, and RL
is the optimization process that provides such consis-
tent assignments given the four elements above.
A labeling assignment ¯p, usually referred to just
as labeling, is represented in the current work as la-
beling matrix of n rows and m columns. The proba-
bility of object b
i
to be labeled with label λ is notated
as p
i
(λ), and is indicated by the matrix entry in row i
and column λ. Since each row i represents the proba-
bilities of all label assignments to object b
i
, the space
of all possible labelings is
K =
¯p R
nm
| p
i
(λ) 0, i B, λ Λ
and
m
λ=1
p
i
(λ) = 1, i B
.
(1)
A particular important subset of K is the set of all
possible unambiguous labelings:
K
=
¯p K | p
i
(λ) {0, 1}, i B, λ Λ
, (2)
where each object is assigned one label with full con-
fidence.
In order to refine the initial (likely inconsistent) la-
beling, RL algorithm involves the computation of two
terms in each iteration k of the process, the support
and a new more “refined” labeling. While not nec-
essarily the only way to do so, we adapt the common
terms from the RL literature (Pelillo, 1997; Rosenfeld
et al., 1976):
(1) The support matrix ¯q
(k)
R
nm
represents the con-
textual “support” that each object b
i
gets for label-
ing it with each label λ. This matrix, referred to
as support, is computed by a support function:
q
(k)
i
(λ) =
n
j=1
m
µ=1
r
i j
(λ, µ)p
(k)
j
(µ). (3)
(2) The refined labeling ¯p
(k+1)
that represents the new
“better” labeling, is computed by a nonlinear up-
date rule:
p
(k+1)
i
(λ) =
p
(k)
i
(λ)q
(k)
i
(λ)
m
µ=1
p
(k)
i
(µ)q
(k)
i
(µ)
. (4)
The entire algorithm is best viewed as a contin-
uous mapping of K onto itself through Eq. 4, and
it is expected to converge to a fixed point if applied
successively. Theoretically, such a fixed point itera-
tion stops when p
(k+1)
i
(λ) = p
(k)
i
(λ), i, λ for some
iteration k. In practice, we may (heuristically) deter-
mine convergence if the increase of the global objec-
tive function (see below in Eq. 5) in two successive
iterations is smaller than a given threshold ε.
The theoretical foundations for RL were laid by
Hummel and Zucker (1983) who have proved sev-
eral useful theoretical properties for an alternative
RL process. Pelillo (1997) later proved similar dy-
namics for the original RL algorithm from Rosenfeld
Multi-Phase Relaxation Labeling for Square Jigsaw Puzzle Solving
787
et al. (1976). Of particular relevance to us is that un-
der symmetric non-negative compatibility coefficients
(i.e., r
i j
(λ, µ) = r
ji
(µ, λ) 0, i, j, λ, µ), each iteration
of the process strictly increases a global consistency
measure known as average local consistency (ALC;
Hummel and Zucker, 1983):
A( ¯p, ¯q) =
n
i=1
m
λ=1
p
i
(λ)q
i
(λ). (5)
Both contributions (Hummel and Zucker, 1983;
Pelillo, 1997) showed that their respective process
eventually approaches and converges to a consistent
solution near the initial labeling assignment. Impor-
tantly, however, unlike other optimization algorithms
such as gradient ascent, the original iteration due to
Rosenfeld et al. (1976) does so without the need to
determine a step size.
2.3 Puzzle Solving with “Balanced” RL
To our best knowledge, the only work that employs
RL for square jigsaw puzzle solving is by Khoroshilt-
seva et al. (2021). In their formulation for Type 1 puz-
zles, the set of objects is the set of puzzle pieces and
the set of labels is the set of possible positions on the
puzzle grid. In these settings n = m, so the space K is
the space of stochastic matrices, and the space K
is
the space of binary stochastic matrices. As no better
prior information can be assumed, the initial label-
ing is a uniform distribution of labels across each ob-
ject. The compatibility coefficients r
i j
(λ, µ) are set to
piece compatibility scores C(b
i
, b
j
, R) for all pairs of
different pieces b
i
, b
j
and positions λ, µ that represent
neighboring positions on the puzzle grid in relation R.
In all other cases, r
i j
(λ, µ) is 0.
Unfortunately, applying the RL update rule from
Eq. 4 using such definitions does not guarantee, and in
fact often does not result in a feasible puzzle solution,
defined as an assignment where each piece is placed at
a unique position on the puzzle grid and all positions
are occupied with some piece. Such feasible solu-
tions, represented by permutation labeling matrices,
are not always achieved because the theory just guar-
antees that the update rule will map a stochastic ma-
trix into another stochastic matrix, with the additional
hope that the process converges to an unambiguous
assignment, namely a binary stochastic matrix. But
even if the process converges to a binary stochastic
matrix (as opposed to a non-binary one), this does
not prohibit situations where two objects (pieces) ob-
tain the same label (position), or that a given label
(position) is not assigned to any object (piece). To
remedy this situation, Khoroshiltseva et al. (2021) ex-
tended the update rule from Rosenfeld et al. (1976)
with two versions of a matrix balancing component
that seeks to map a stochastic labeling matrix to a
doubly stochastic matrix. Since the latter are easier
to binarize into permutation matrices, this should help
the process converge to the space of all feasible puzzle
solutions.
Khoroshiltseva et al. (2021) showed empirically
that their approach can perform better than a plain
RL without balancing. At the same time, while the
method encourages feasible solutions, it cannot guar-
antee them. Testing on natural images indeed showed
vulnerability to certain common conditions, and in
particular the presence of constant pieces, namely
puzzle pieces of identical appearance of some con-
stant value throughout. Such failure cases can be ex-
plained by the fact that the rows of the labeling matrix
that refer to constant pieces are identical in all steps of
the process (both the update rule and balancing com-
ponent map identical rows into identical rows), thus
the process cannot converge to a permutation matrix.
In the current work we suggest an alternative and
improved puzzle solver based on a novel multi-phase
RL scheme. Our method is applicable for both Type
1 and 2 puzzles, addresses the aforementioned prob-
lems and facilitates further exploration of RL as a
computational powerhouse for puzzle solving.
3 MULTI-PHASE RL PUZZLE
SOLVERS
3.1 Puzzles with Known Piece
Orientation
Type 1 puzzles, possibly the most elementary varia-
tion of the puzzle problem, constitute unordered set of
square pieces whose orientation is given but their or-
ganization in an N ×M array of positions is not. Treat-
ing this problem as an RL problem can be done in the
following way, similar to Khoroshiltseva et al. (2021):
(1) The objects B = {b
1
, . . . , b
n
} are puzzle pieces.
(2) The labels Λ = {λ
1
, ..., λ
m
} are all possible posi-
tions (x
λ
, y
λ
) on the two-dimensional N × M puz-
zle grid.
(3) The compatibility coefficients are defined by
r
i j
(λ, µ) = r
i j
((x
λ
, y
λ
), (x
µ
, y
µ
)) =
C(b
i
, b
j
, R), if i ̸= j and (x
λ
, y
λ
), (x
µ
, y
µ
) are
neighboring positions in relation R,
0, otherwise ,
(6)
where R {right, down, le f t, up} represents a 4-
neighborhood spatial relation and C is a non-
VISAPP 2023 - 18th International Conference on Computer Vision Theory and Applications
788
(a) (b) (c)
Figure 1: Type 1 puzzle feasible labeling structure. (a) is a
2 × 2 puzzle and (b) is its perfect solution. This perfect so-
lution is represented by the permutation labeling matrix (c).
Piece IDs in red were added for demonstration purposes.
negative and symmetric piece compatibility func-
tion (see Section 3.3 for the function definition).
(4) The initial labeling ¯p
(0)
is the barycenter of the
search space, i.e., p
(0)
i
(λ) =
1
m
, i, λ. With no
prior information, this is the only rational choice
one can make.
As mentioned above, a feasible Type 1 reconstruc-
tion is a solution in which each piece is placed at
a unique position on the puzzle grid, and the space
of such feasible labelings is the space of permutation
matrices. Figure 1 exemplifies feasible labeling struc-
ture for the Type 1 problem.
Since C is non-negative and symmetric, the same
applies to r
i j
(λ, µ). Consequently, the process is guar-
anteed to strictly increase the ALC in each iteration,
and to converge to a consistent assignment (Pelillo,
1997). For puzzle solving this turns the ALC to an in-
tuitive objective function and the relaxation process to
one that seeks to maximize the sum of compatibilities
between all adjacent pieces, similar in spirit to other
global optimizers for puzzle solving (Andal
´
o et al.,
2016; Sholomon et al., 2013). For the justification of
this claim please refer to the supplementary material
in the project webpage (https://icvl.cs.bgu.ac.il/multi-
phase-rl-for-jigsaw-puzzle-solving).
3.1.1 Type 1 Solver
With the elements of the RL process now set for puz-
zle solving, and the observation that preliminary at-
tempts (Khoroshiltseva et al., 2021) could not guar-
antee feasible solutions, we suggest an alternative al-
gorithm that does guarantee this property. Central to
our approach is a novel multi-phase scheme where
each phase uses RL to determine the position of a
single puzzle piece at a non-occupied position. As
explained below, this enables to change the structure
of the labeling assignment towards permutation label-
ing, as required for feasible solutions.
More specifically, in each phase of the algorithm
we let the standard RL algorithm (Pelillo, 1997;
Rosenfeld et al., 1976) run and intervene only when
the process is confident enough about placing a cer-
Figure 2: The first phase of our Type 1 solver on a 3 × 3
puzzle, done with α = 0.7. The RL phase begins from
the initial labeling and terminates once some labeling value
crosses α (left matrix, obtained for p
5
((2, 2)) after 4 RL it-
erations). On the right is the labeling after anchoring piece
b
5
to position (2, 2).
tain piece b
i
at some position λ, namely that the pro-
cess converges to a consistent labeling and/or at least
one label at one object satisfies p
i
(λ) α, where α is
a pre-defined threshold. This intervention marks the
end of the current phase, and involves three steps:
(1) Anchoring piece b
i
to position λ and disallow-
ing placing it from now on at any other posi-
tion. This is done by setting p
i
(λ) = 1 and
p
i
(µ) = 0, µ ̸= λ.
(2) Preventing the future placement of other pieces
at position λ. This is done by setting
p
j
(λ) = 0, j ̸= i.
(3) Resetting all entries p
j
(µ) that involve pieces
b
j
and positions µ for which past phases have
not made a decision yet. This step is done
by setting these entries to the barycenter of the
remaining search subspace, i.e. to the recipro-
cal of the number of remaining unassigned ob-
jects/positions. This allows subsequent phases to
start afresh without bias from prior phases.
Figure 2 demonstrates these steps after the completion
of the first RL phase on a 3 × 3 puzzle.
The manipulated labeling serves as the initial la-
beling for the next phase of the algorithm, which
anchors another piece at another position. The al-
gorithm finally terminates after anchoring all pieces,
namely after n phases. Importantly, although no spe-
cial measures are taken, the process does not mod-
ify entries of pieces and positions for which previous
phases have already made a decision since Eq. 4 does
not (indeed, cannot) modify binary (i.e., 0 or 1) prob-
abilities. Thanks to this property, each phase takes a
decision about a single piece and position by binariz-
ing their respective row and column in the labeling
matrix, all while utilizing the information obtained in
previous phases. This way, each phase updates the la-
beling so it becomes progressively “closer” to a per-
mutation labeling. After n phases convergence to a
Multi-Phase Relaxation Labeling for Square Jigsaw Puzzle Solving
789
(a) (b) (c) (d)
Figure 3: The shifted reconstruction phenomenon and its
solution. (a) is a reconstruction where rows are shifted. To
prevent this, we translate the whole anchored block when
the anchoring of some piece reaches the edge of the puzzle
grid, as in (b) (gray areas represent non-occupied positions).
At this point we translate the block downwards to obtain (c)
and allow subsequent phases to grow in all directions. (d) is
a case a translation will not necessarily benefit the situation
as translating the block upwards will also limit the growth.
At this point we branch the computation to try both options,
translating the block upwards or leaving the block as is.
permutation labeling is guaranteed.
Operating as described above, the multi-phase
solver will normally attach new pieces to the block of
pieces already anchored in the past. However, there is
no guarantee for this behavior. To prevent the solver
from generating isolated “islands” of solutions that
would be difficult to “stitch” together properly across
the puzzle grid, we allow anchoring only at positions
adjacent to previously anchored ones, thus guaran-
teeing a single contiguous block of anchored pieces.
Moreover, if the α threshold is met simultaneously by
more than one combination of piece b
i
and position λ,
we select a single combination by an internal sorting
of relevant combinations. Please see the supplemen-
tary material for more details.
The algorithm just described guarantees feasible
solutions although, of course, it cannot guarantee cor-
rect solutions. In practice, many solutions are per-
fect (or near perfect) up to a circular shift of the rows
and/or columns, as shown in Figure 3(a). This phe-
nomenon is best explained as a lack of awareness of
the solver to the puzzle dimensions, as it emerges
from the likely event of assigning the very first piece
to a wrong position in the puzzle grid. This prob-
lem was also observed in past (non-RL) solvers (e.g.,
Pomeranz et al., 2011), but is easy to alleviate by in-
corporating such awareness in the process. In partic-
ular, each time a phase is completed, if the anchored
block abuts the edge of the puzzle grid, we translate
it one column or row inside (see Figures 3(b)-3(c)).
This translation is done by applying a proper trans-
formation on the labeling matrix, and it allows subse-
quent phases to again grow the block in all directions.
Of course, these translations of the anchored block
will cease once the solution is short of one row or
column to the vertical or horizontal dimension of the
puzzle (e.g., Figure 3(d)). At this point it is yet im-
possible to know what side the coming phases should
grow the last row or column and thus we branch the
(a) (b)
(c)
Figure 4: Type 2 puzzle feasible labeling structure. (a) is a
2 × 2 puzzle and (b) is its perfect solution. The correspond-
ing type 2 permutation labeling matrix is shown in (c).
algorithm to try both options. Doing this for both the
vertical and horizontal growth, we end up with 4 final
reconstructions, from which the final solution is the
one with the largest ALC.
3.2 Puzzles with Unknown Piece
Orientation
Type 2 puzzles extend the elementary case to situa-
tions where piece orientation is unknown. Since the
final orientation of the pieces needs to be determined
by the process, the setup of the problem as an RL pro-
cess requires some adjustments. As before, the ob-
jects b
1
, . . . , b
n
still represent the NM puzzle pieces.
However, the labels λ
1
, ..., λ
m
now need to represent
all combinations of piece positions and orientations.
Each label λ is a pair ((x
λ
, y
λ
), θ
λ
), where (x
λ
, y
λ
) is
a position on the N × M grid and θ
λ
{0°, 90°, 180°,
270°} is one of 4 possible piece orientations.
Clearly, the change to labels requires a corre-
sponding adjustment to the compatibility coefficients:
r
i j
(λ, µ) = r
i j
(((x
λ
, y
λ
), θ
λ
), ((x
µ
, y
µ
), θ
µ
)) =
C(θ
λ
(b
i
), θ
µ
(b
j
), R), if i ̸= j and (x
λ
, y
λ
), (x
µ
, y
µ
)
are neighboring positions
in relation R,
0, otherwise ,
(7)
where function θ
λ
(b
i
) represents the clockwise-
rotation of piece b
i
by θ
λ
degrees.
Under this formulation several complications im-
mediately arise. First, the labeling matrix is no longer
square as m = 4n, and thus the solution cannot be a
permutation matrix. Second, a feasible Type 2 recon-
struction requires to place each piece in some orien-
tation at a unique position on the puzzle grid. In fact,
in order to represent feasible solutions, the labeling
matrix should include a single 1 in each row (just as
VISAPP 2023 - 18th International Conference on Computer Vision Theory and Applications
790
(a) (b) (c) (d) (e)
(f) (g) (h)
Figure 5: Possible reconstructions for Type 2 puzzles. (a)
is a square-shaped puzzle, for which (b), (c), (d) and (e) are
four perfect reconstructions. (f) is a rectangular-shaped puz-
zle, for which (g) and (h) are two perfect reconstructions.
in Type 1 puzzles), but at the same time a single 1
in every group of 4 columns that represent the differ-
ent orientations of the same position. A labeling for
which the matrix representation meets this description
is referred to as type 2 permutation labeling (see Fig-
ure 4). We note that similar to the Type 1 case, in the
space of Type 2 feasible solutions the RL optimization
seeks to maximize the sum of compatibilities between
adjacent pieces.
To complete the description of Type 2 problems
as RL processes, we need to define the initial label-
ing. The situation here is again somewhat more com-
plicated since unlike for Type 1 puzzles, that have a
single perfect reconstruction, any Type 2 puzzle has
at least 2 perfect reconstructions due to global image
rotation (see Figure 5). In order to encourage conver-
gence to one of these solutions, the dynamics should
be biased towards a specific solution rather than “fa-
voring” all solutions equally. To address this, we
set the initial labeling as the barycenter of the search
space, while limiting some piece b
i
1
to some orienta-
tion θ
1
:
p
i
(λ) = p
i
(((x
λ
, y
λ
), θ
λ
)) =
1
m
, if i ̸= i
1
,
1
n
, if i = i
1
, θ
λ
= θ
1
,
0, if i = i
1
, θ
λ
̸= θ
1
.
(8)
When puzzles are square in shape (i.e., N = M), the
orientation θ
1
is chosen randomly among the 4 possi-
ble ones and serves as an “angular anchor” that pulls
the entire solution towards the proper global orienta-
tion. Doing the same for rectangular N ×M puzzles is
problematic because the selected orientation of piece
b
i
1
, and thus the preferred global orientation of the
solution, might conflict the rectangular dimensions of
the puzzles. To cope with this problem we run the al-
gorithm twice, first with an initial state derived from
a randomly selected orientation θ
1
, and once again
with an initial state derived from θ
1
+ 90°. This en-
sures that one of the computations could converge to
a correct reconstruction, which is identified and se-
Figure 6: The first phase of our Type 2 solver on a 2 × 2
puzzle, with α = 0.7. The RL phase begins from the ini-
tial labeling and terminates once labeling values cross α
(top matrix, obtained after 3 RL iterations). The best supra-
threshold entry at this point is p
3
(((2, 1), )). After anchor-
ing piece b
3
to position (2, 1) at orientation we obtain the
bottom matrix. Note that b
2
is allowed only at orientation
0° since the initial labeling was set this way.
lected by its larger ALC. Finally, in order to maxi-
mize the influence of the bias obtained from the initial
state, piece b
i
1
is heuristically selected as the one that
maximizes the sum of potential compatibilities in its
4-neighborhood.
3.2.1 Type 2 Solver
Not unlike the Type 1 solver that directs the process to
a permutation labeling, the Type 2 solver needs to lead
the process to a type 2 permutation labeling. How-
ever, to “anchor” an entry p
i
(((x
λ
, y
λ
), θ
λ
)) requires
to revise the intervention at the end of each RL phase
to manipulate the labeling in a slightly different way:
(1) Anchoring piece b
i
to position (x
λ
, y
λ
) in ori-
entation θ
λ
and preventing its placement at
any other position (regardless of orientation).
Formally we set p
i
(((x
λ
, y
λ
), θ
λ
)) = 1 and
p
i
(µ) = 0, µ ̸= ((x
λ
, y
λ
), θ
λ
).
(2) Preventing a future placement of other
pieces at position (x
λ
, y
λ
) regardless of
their orientation. This is done by setting
p
j
(((x
λ
, y
λ
), θ)) = 0, θ, j ̸= i.
(3) Resetting entries p
j
(((x
µ
, y
µ
), θ
µ
)) that involve
pieces b
j
and positions (x
µ
, y
µ
) for which anchor-
ing is yet made. This is done by setting these en-
tries to the reciprocal of the number of combina-
tions of unassigned positions and possible orien-
tations (i.e., the number of unassigned positions
multiplied by 4). If piece b
i
1
(used for setting the
initial labeling) is not anchored, we reset its val-
ues so only orientation θ
1
is allowed.
Figure 6 demonstrates the effect of these 3 steps after
the completion of the first RL phase on a 2 ×2 Type 2
Multi-Phase Relaxation Labeling for Square Jigsaw Puzzle Solving
791
puzzle. The shifting of the anchored block at the end
of each phase is handled as before.
3.3 Piece Compatibility Measure
At the foundations of any puzzle solver is some mea-
sure of affinity between puzzle pieces that allows to
quantify the quality of immediate neighborhood re-
lationships. In our algorithm it is integrated in RL
compatibility coefficients.
As mentioned earlier, piece compatibility mea-
sures are conventionally based on a more basic mea-
sure of piece dissimilarity. While various dissim-
ilarity measures have been proposed, we use the
Mahalanobis gradient dissimilarity proposed by Son
et al. (2018). This dissimilarity measure evaluates
the disagreement of placing two adjacent pieces by
computing the Mahalanobis distances between the ac-
tual gradients and expected gradients with respect to
the distribution of predicted gradients, where the gra-
dients considered are both across and along the ad-
joining boundaries of pieces. Please refer to Son
et al. (2018) for a full description.
Given the dissimilarity, it is possible to convert it
to a compatibility function in various ways, the most
immediate of which is simply negating the dissimilar-
ities. Here we take a more elaborate approach that is
based on the observation that our solver, and RL pro-
cesses in general, perform better when the compati-
bility values are well spaced and the difference be-
tween compatible and incompatible values is signifi-
cant. This was already observed in other solvers, and
here we follow Andal
´
o et al. (2016) who improved the
basic idea from Pomeranz et al. (2011) and defined
C(b
i
, b
j
, R) = exp
ϕ
i,R
( j)
D(b
i
, b
j
, R)
quartile(i, R)
, (9)
where quartile(i, R) is the quartile dissimilarity
among all dissimilarities in relation R to piece b
i
, and
ϕ
i,R
( j) is the rank of the dissimilarity D(b
i
, b
j
, R) in
the increasingly ordered dissimilarities in relation R
to piece b
i
. The quartile(i, R) term is included to
consider the scattering of dissimilarity values (and not
just the values themselves), and the ϕ
i,R
( j) rank term
results in “spacing” the computed compatibility val-
ues.
Inspired by such ideas, we propose a new compat-
ibility measure:
C(b
i
, b
j
, R) =
1, if D(b
i
, b
j
, R) = p
avg
(i, R;k) = 0,
1
D(b
i
,b
j
,R)
p
avg
(i,R;k)
ϕ
i,R
( j)
, if D(b
i
, b
j
, R) p
avg
(i, R;k)
and p
avg
(i, R;k) > 0,
0, otherwise ,
(10)
where p
avg
(i, R; k) is the average dissimilarity from
the first value till the k percentile among all dissimi-
larities in relation R to piece b
i
.
This measure offers two advantages over the mea-
sure from Eq. 9: (1) dissimilarities scattering informa-
tion is determined by multiple values rather than one
value (the quartile dissimilarity). Moreover, we intro-
duce k as a parameter to avoid limitation to k = 25
(the quartile). (2) It better balances the D(b
i
, b
j
, R)
and ϕ
i,R
( j) terms since high ϕ
i,R
( j) does not neces-
sarily guarantee low compatibility score.
An important challenge that most global optimiza-
tion solvers face is due to constant pieces, or puz-
zle pieces that have the same constant color value
throughout, which happens mostly in sky regions that
turned white due to pixel saturation. Constant pieces
maintain maximal compatibility with other constant
pieces (and possibly with more pieces), thus it be-
comes difficult for the RL process to converge to a la-
beling that unambiguously determines their positions.
To alleviate this, we slightly adjust the compatibility
for puzzles with more than two constant pieces:
C
(b
i
, b
j
, R) =
max{0, X}, if there exists constant
piece b
k
s.t. C(b
i
, b
k
, R) = 1
and C(b
k
, b
j
, R) = 1,
C(b
i
, b
j
, R), otherwise ,
(11)
where X U(4, 1) is a random variable. This
adjustment re-draws all maximal compatibilities for
constant pieces (and after rectification set it to 0 with
probability of 0.8), which sparsifies the compatibility
coefficients and assists convergence.
4 EXPERIMENTAL RESULTS
To test the suggested puzzle solvers, we used the 20
puzzles of 432 pieces from the MIT data set pro-
posed by Cho et al. (2010) and the 20 puzzles of 540
pieces from the McGill data set proposed by Pomer-
anz et al. (2011). All puzzles contain pieces of size
of 28 × 28 pixels. We evaluated performance with
the following three performance measures: (1) Di-
rect comparison (DC; Cho et al., 2010) is the fraction
of pieces that are in the same position (and orienta-
tion for Type 2 puzzles) in the assembly solution and
the ground truth solution. (2) Neighbor comparison
(NC; Cho et al., 2010) is the fraction of pairwise piece
matchings that exist in the ground truth solution. (3)
Perfect reconstruction (PR; Gallagher, 2012) is a bi-
nary indication of whether the assembly solution is
identical to the ground truth. For an entire data set,
PR becomes the number of perfectly solved puzzles.
VISAPP 2023 - 18th International Conference on Computer Vision Theory and Applications
792
Table 1: Comparison of Type 1 solvers.
Method
The MIT data set The McGill data set
DC NC PR DC NC PR
Andal
´
o et al. (2016) 96% 95.6% 13 90.8% 95.3% 13
Gallagher (2012) 95.3% 95.1% 12 - - -
Paikin and Tal (2015) 96.2% 95.8% 13 93.2% 96.1% 13
Pomeranz et al. (2011) 94% 95% 13 83.5% 90.9% 9
Sholomon et al. (2013) 86.2% 96.2% 9 92.8% 96% 8
Son et al. (2018) 95.8% 95.6% - 95.4% 97% -
Khoroshiltseva et al. (2021) 79.1% 87.9% 6 25.6% 64.5% 0
Ours 95.5% 95.4% 13 62.9% 87.8% 6
Table 2: Comparison of Type 2 solvers.
Method
The MIT data set The McGill data set
DC NC PR DC NC PR
Gallagher (2012) 82.2% 90.4% 9 - - -
Huroyan et al. (2020) 94.8% 95.2% 13 88.3% 92.2% 13
Paikin and Tal (2015) 95.4% 95.4% - - - -
Sholomon et al. (2014) 94.6% 94.9% 10 89.6% 92% 8
Son et al. (2018) 95.8% 95.6% 12 92.3% 94.5% 11
Ours 87.1% 93.9% 5 25.8% 78% 0
The only random part in our solvers is the constant
pieces compatibility adjustment (Eq. 11). Otherwise,
solvers are fully deterministic for a given input. For
this reason, the reported performance for puzzles with
constant pieces is the average of 10 runs while the rest
of the puzzles were solved once. In all experiments,
the RL convergence threshold ε was set to 10
4
, while
the α anchoring threshold was set to 0.7. Dissimi-
larity computations are based on the CIELAB color
space. Compatibility computations (Eq. 10) use k = 3
and k = 1.5 for Type 1 and 2 solvers, respectively.
Table 1 presents the Type 1 solver performance.
Our main goal in this paper is a progression in
performance of RL solvers and indeed the multi-
phase method performs better than Khoroshiltseva
et al. (2021). Equally important, results are highly
competitive with many prior methods on the MIT data
set and approach the MIT data set upper bounds (DC,
NC, and PR upper bounds are 96.7%, 96.4% and
15, respectively; Son et al., 2014), which are below
perfect since no solver is expected to internally or-
der identical constant pieces correctly. The results of
prior methods were taken from their papers, while the
results of Khoroshiltseva et al. (2021) were obtained
by our implementation for their solver with Sinkhorn-
Knopp balancing. Table 2 presents our Type 2 results
compared to other methods. Because Type 2 solutions
may be rotated compared to the ground truth (see Fig-
ure 5), we evaluate only the rotation that achieves the
highest DC. Figure 7 shows Type 1 and 2 results.
5 CONCLUSION
This paper suggests a step forward in RL puzzle solv-
ing based on a novel multi-phase RL optimization ap-
proach that is fully automatic, uses no prior informa-
tion, guarantees feasible solutions, and is applicable
(a)
(b)
(c)
(d)
(e)
Figure 7: Input puzzle, RL solver result, and ground truth
image are shown left to right. (a) shows a typical result on
a Type 1 puzzle with constant pieces (sky pieces are con-
stant white). While the solution is visually perfect, its DC
and NC scores are relatively low (83.8% and 81.8%, respec-
tively) due to wrong internal ordering of constant pieces. (b)
shows a Type 1 solution with a very low DC score (6.3%)
even though the solution is visually excellent, due to offset
of one column. (c) shows a perfectly solved Type 2 puzzle.
(d) and (e) exemplify an occasional failure of the Type 1
and 2 solvers (respectively) in which sub-blocks of pieces
are assembled correctly, but in wrong position (and rotation
for Type 2) relative to each other.
for both Type 1 and 2 puzzles. Future work should
handle failure cases (see Figures 7(d)-7(e)), which
could, we believe, be alleviated within the RL frame-
work. For example, by identifying correctly assem-
bled block (such as done by Pomeranz et al., 2011),
modifying the compatibility coefficients accordingly
and running the RL process again.
We believe that the proposed multi-stage approach
could be used as a general RL-based scheme for other
hard optimization problems (not necessarily puzzles)
for solving permutation challenges (cf. Type 1 cases)
or even more complicated permutation constraints (cf.
Type 2 cases). For example, it is likely that the multi-
phase approach could be applied to much larger scale
Multi-Phase Relaxation Labeling for Square Jigsaw Puzzle Solving
793
instances of the traveling salesman problem than pre-
viously attempted with RL (Pelillo, 1993).
ACKNOWLEDGEMENTS
This project has received funding from the Euro-
pean Union’s Horizon 2020 research and innovation
programme under grant agreement No 964854, the
Helmsley Charitable Trust through the ABC Robotics
Initiative, and the Frankel Fund of the Computer Sci-
ence Department at Ben-Gurion University.
REFERENCES
Andal
´
o, F. A., Taubin, G., and Goldenstein, S. (2012). Solv-
ing image puzzles with a simple quadratic program-
ming formulation. In 2012 25th SIBGRAPI Confer-
ence on Graphics, Patterns and Images, pages 63–70.
Andal
´
o, F. A., Taubin, G., and Goldenstein, S. (2016).
PSQP: Puzzle solving by quadratic programming.
IEEE Transactions on Pattern Analysis and Machine
Intelligence, 39(2):385–396.
Cho, T. S., Avidan, S., and Freeman, W. T. (2010). A
probabilistic image jigsaw puzzle solver. In 2010
IEEE Computer Society Conference on Computer Vi-
sion and Pattern Recognition, pages 183–190.
Deever, A. and Gallagher, A. (2012). Semi-automatic as-
sembly of real cross-cut shredded documents. In 2012
19th IEEE International Conference on Image Pro-
cessing, pages 233–236.
Demaine, E. D. and Demaine, M. L. (2007). Jigsaw puz-
zles, edge matching, and polyomino packing: Con-
nections and complexity. Graphs and Combinatorics,
23(1):195–208.
Floudas, C. A. (2013). Deterministic global optimization:
Theory, methods and applications. Springer Science
& Business Media.
Freeman, H. and Garder, L. (1964). Apictorial jigsaw puz-
zles: The computer solution of a problem in pattern
recognition. IEEE Transactions on Electronic Com-
puters, EC-13(2):118–127.
Gallagher, A. C. (2012). Jigsaw puzzles with pieces of un-
known orientation. In 2012 IEEE Conference on Com-
puter Vision and Pattern Recognition, pages 382–389.
Hummel, R. A. and Zucker, S. W. (1983). On the foun-
dations of relaxation labeling processes. IEEE Trans-
actions on Pattern Analysis and Machine Intelligence,
PAMI-5(3):267–287.
Huroyan, V., Lerman, G., and Wu, H.-T. (2020). Solv-
ing jigsaw puzzles by the graph connection lapla-
cian. SIAM Journal on Imaging Sciences, 13(4):1717–
1753.
Khoroshiltseva, M., Vardi, B., Torcinovich, A., Traviglia,
A., Ben-Shahar, O., and Pelillo, M. (2021). Jigsaw
puzzle solving as a consistent labeling problem. In In-
ternational Conference on Computer Analysis of Im-
ages and Patterns, pages 392–402. Springer.
Koller, D. and Levoy, M. (2006). Computer-aided recon-
struction and new matches in the forma urbis romae.
Bullettino Della Commissione Archeologica Comu-
nale di Roma, pages 103–125.
Kosiba, D. A., Devaux, P. M., Balasubramanian, S., Gandhi,
T. L., and Kasturi, K. (1994). An automatic jigsaw
puzzle solver. In Proceedings of 12th International
Conference on Pattern Recognition, volume 1, pages
616–618.
Li, R., Liu, S., Wang, G., Liu, G., and Zeng, B. (2021). Jig-
sawgan: Auxiliary learning for solving jigsaw puzzles
with generative adversarial networks. IEEE Transac-
tions on Image Processing, 31:513–524.
Liu, H., Cao, S., and Yan, S. (2011). Automated assem-
bly of shredded pieces from multiple photos. IEEE
Transactions on Multimedia, 13(5):1154–1162.
Paikin, G. and Tal, A. (2015). Solving multiple square jig-
saw puzzles with missing pieces. In Proceedings of
the IEEE Conference on Computer Vision and Pattern
Recognition (CVPR), pages 4832–4839.
Paumard, M.-M., Picard, D., and Tabia, H. (2020). Deepz-
zle: Solving visual jigsaw puzzles with deep learning
and shortest path optimization. IEEE Transactions on
Image Processing, 29:3569–3581.
Pelillo, M. (1993). Relaxation labeling processes for
the traveling salesman problem. In Proceedings of
1993 International Conference on Neural Networks
(IJCNN-93-Nagoya, Japan), volume 3, pages 2429–
2432.
Pelillo, M. (1997). The dynamics of nonlinear relaxation
labeling processes. Journal of Mathematical Imaging
and Vision, 7(4):309–323.
Pomeranz, D., Shemesh, M., and Ben-Shahar, O. (2011).
A fully automated greedy square jigsaw puzzle solver.
In CVPR 2011, pages 9–16.
Rosenfeld, A., Hummel, R. A., and Zucker, S. W. (1976).
Scene labeling by relaxation operations. IEEE Trans-
actions on Systems, Man, and Cybernetics, SMC-
6(6):420–433.
Sholomon, D., David, O., and Netanyahu, N. S. (2013). A
genetic algorithm-based solver for very large jigsaw
puzzles. In 2013 IEEE Conference on Computer Vi-
sion and Pattern Recognition, pages 1767–1774.
Sholomon, D., David, O. E., and Netanyahu, N. S. (2014).
A generalized genetic algorithm-based solver for very
large jigsaw puzzles of complex types. In Proceed-
ings of the AAAI Conference on Artificial Intelligence,
volume 28.
Son, K., Hays, J., and Cooper, D. B. (2014). Solving square
jigsaw puzzles with loop constraints. In European
Conference on Computer Vision, pages 32–46.
Son, K., Hays, J., and Cooper, D. B. (2018). Solving square
jigsaw puzzle by hierarchical loop constraints. IEEE
transactions on pattern analysis and machine intelli-
gence, 41(9):2222–2235.
Talon, D., Del Bue, A., and James, S. (2022). Ganz-
zle: Reframing jigsaw puzzle solving as a retrieval
VISAPP 2023 - 18th International Conference on Computer Vision Theory and Applications
794
task using a generative mental image. arXiv preprint
arXiv:2207.05634.
Zhao, Y.-X., Su, M.-C., Chou, Z.-L., and Lee, J. (2007). A
puzzle solver and its application in speech descram-
bling. In Proceedings of the 2007 Annual Conference
on International Conference on Computer Engineer-
ing and Applications, pages 171–176.
Zucker, S. W., Hummel, R. A., and Rosenfeld, A. (1977).
An application of relaxation labeling to line and curve
enhancement. IEEE Transactions on Computers,
26(04):394–403.
Multi-Phase Relaxation Labeling for Square Jigsaw Puzzle Solving
795