Multi-Phase Relaxation Labeling for Square Jigsaw Puzzle Solving

Ben Vardi

1 a

, Alessandro Torcinovich

2 b

, Marina Khoroshiltseva

2,3 c

, Marcello Pelillo

2,4 d

and Ohad Ben-Shahar

1 e

Ben-Gurion University of the Negev, Be’er-Sheva, Israel

Ca’ Foscari University of Venice, Venice, Italy

Istituto Italiano di Tecnologia, Genoa, Italy

European Centre of Living Technology, Venice, Italy

Keywords:

Puzzle Solving, Square Jigsaw Puzzles, Relaxation Labeling.

Abstract:

We present a novel method for solving square jigsaw puzzles based on global optimization. The method is fully

automatic, assumes no prior information, and can handle puzzles with known or unknown piece orientation.

At the core of the optimization process is nonlinear relaxation labeling, a well-founded approach for deducing

global solutions from local constraints, but unlike the classical scheme here we propose a multi-phase approach

that guarantees convergence to feasible puzzle solutions. Next to the algorithmic novelty, we also present a

new compatibility function for the quantiﬁcation of the afﬁnity between adjacent puzzle pieces. Competitive

results and the advantage of the multi-phase approach are demonstrated on standard datasets.

1 INTRODUCTION

The jigsaw puzzle game is a well-known and time-

honored pastime for children and adults. Given

n non-overlapping image fragments, commonly re-

ferred to as puzzle pieces, the goal is to assemble

them into a coherent visual image, preferably recon-

structing the original (possibly unknown) image it-

self. Although large jigsaw puzzles are successfully

solved by humans, the problem posed by this pop-

ular game is rather difﬁcult, as it was shown to be

NP-complete (Demaine and Demaine, 2007). Algo-

rithmic capability to solve jigsaw puzzles has impor-

tant applications in several ﬁelds, such as information

security (Zhao, Su, Chou and Lee, 2007), assembly

of shredded documents and photos (Deever and Gal-

lagher, 2012; Liu, Cao and Yan, 2011) and archaeol-

ogy (Koller and Levoy, 2006).

A particular type of jigsaw puzzles that has raised

interest in the scientiﬁc community deals with square

jigsaw puzzles (e.g., Cho, Avidan and Freeman, 2010;

Pomeranz, Shemesh and Ben-Shahar, 2011), where

https://orcid.org/0000-0002-6950-8297

https://orcid.org/0000-0001-8110-1791

https://orcid.org/0000-0003-0424-0661

https://orcid.org/0000-0001-8992-9243

https://orcid.org/0000-0001-5346-152X

all pieces have identical square shape, geometric in-

formation is unavailable, and only pictorial informa-

tion may be used for driving the solution. This is also

why solving square jigsaw puzzles marks an upper

bound of jigsaw puzzle reconstruction performance

when geometric cues are diminishing. Such an under-

standing is a key step in developing solvers for real-

world puzzles, in which both pictorial and geometric

information are usually available in varying degrees.

Square jigsaw puzzle solvers may be distin-

guished by two criteria. The ﬁrst is the evaluation

method for potential piece matchings, which deﬁnes

the compatibility of placing any two puzzle pieces

in one of the possible spatial 4-neighborhood adja-

cency relations. The measure used for this evaluation

is commonly referred to as piece compatibility mea-

sure. The second criterion is the algorithmic method

used to process these compatibilities in order to infer

a solution for the full puzzle. In other words, one may

argue that square jigsaw puzzle solvers use local con-

straints (piece compatibilities) in order to deduce a

global solution (piece placements that seek to restore

the original image).

While several attempts were made to solve square

jigsaw puzzles and impressive results were achieved

(e.g., Pomeranz et al., 2011; Andal

o, Taubin and

Goldenstein, 2016; Son, Hays and Cooper, 2018),

most of the suggested algorithms use greedy meth-

Vardi, B., Torcinovich, A., Khoroshiltseva, M., Pelillo, M. and Ben-Shahar, O.

Multi-Phase Relaxation Labeling for Square Jigsaw Puzzle Solving.

DOI: 10.5220/0011622800003417

In Proceedings of the 18th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISIGRAPP 2023) - Volume 4: VISAPP, pages

785-795

ISBN: 978-989-758-634-7; ISSN: 2184-4321

 2023 by SCITEPRESS – Science and Technology Publications, Lda. Under CC license (CC BY-NC-ND 4.0)

785

ods rather than global optimization methods. Nev-

ertheless, global optimization algorithms are gener-

ally preferred over greedy algorithms for their ability

to produce solutions which simultaneously take into

consideration all the local potential piece matchings.

Moreover, global optimization approaches may often

facilitate additional mathematical analysis (Floudas,

2013) and thus a better understanding of the dynam-

ics of the problem and solution process.

A well-founded global optimization approach that

handles problems where local constraints are given

and global interpretation should be found is relax-

ation labeling (RL). The approach, ﬁrst introduced

by Rosenfeld, Hummel and Zucker (1976), is a class

of iterative procedures for reﬁning label assignments

based on contextual constraints. RL has been appli-

cable in solving many global optimization problems,

most commonly from the computer vision ﬁeld (e.g.,

Zucker, Hummel and Rosenfeld, 1977). Although

iterative, and reminiscent of gradient ascent, it can

be formulated without any step size (Pelillo, 1997;

Rosenfeld et al., 1976) while guaranteeing the op-

timization of its objective function (Hummel and

Zucker, 1983; Pelillo, 1997).

Recently, preliminary results of using RL for

square jigsaw puzzle solving (Khoroshiltseva et al.,

2021) demonstrated the potential of this approach, but

nevertheless suffered from several limitations such as

occasional failures to converge to a feasible solution

or being limited to puzzles in which piece orienta-

tions are known (known as Type 1 puzzles; Gallagher,

2012). In this work we further explore the potential of

RL for square jigsaw puzzle solving and present a new

RL algorithm that does guarantee feasible solutions,

performs better than the preliminary approach, and

handles both Type 1 but also Type 2 puzzles (whose

piece orientations are unknown). In addition to the al-

gorithmic novelty, another contribution of this paper

is the proposal of a new piece compatibility measure.

2 RELATED WORK

2.1 Puzzle Solving

The topic of puzzle solving has long been of interest

in the scientiﬁc community. The ﬁrst to investigate

puzzles were Freeman and Garder (1964), focusing

on the geometric information of irregularly shaped

puzzle pieces to suggest the ﬁrst computational apic-

torial solver. Three decades later, Kosiba, Devaux,

Balasubramanian, Gandhi and Kasturi (1994) were

the ﬁrst to also consider pictorial puzzles in their clas-

sical “toy” jigsaw puzzle solver.

While the effort to combine pictorial and geomet-

ric information was followed also by others, the line

of research that has been dominating the literature fo-

cused on square jigsaw puzzles in which all pieces

have an identical square shape, with puzzle dimen-

sions being often known. While the objective of all

different square jigsaw puzzle solvers is the same,

solution approaches vary greatly. A ﬁrst notable at-

tempt was made by Cho et al. (2010) who employed

loopy belief propagation algorithm for Type 1 puz-

zles. Although being able to solve the biggest puzzle

at the time, their algorithm was only semi-automatic,

requiring at least some ground truth information. A

year later, Pomeranz et al. (2011) published the ﬁrst

fully automatic solver, requiring no prior knowledge.

Their iterative greedy Type 1 solver is based on three

steps: pieces placement according to the piece com-

patibility measure, segmentation of the proposed so-

lution into coherently assembled segments, and shift-

ing segments to improve the current solution. Despite

being simpler, this approach provided not only a leap

in performance, but also the ability to solve puzzles

an order of magnitude larger than before.

Following Pomeranz et al. (2011), Andal

et al. (2012; 2016) were the ﬁrst to formulate the

problem as a global optimization problem using

a constrained quadratic objective function. Gal-

lagher (2012) used a tree-based greedy algorithm in-

spired by the well-known Kruskal’s algorithm for

ﬁnding minimum spanning trees. Gallagher (2012)

was the ﬁrst to introduce and solve Type 2 puzzles,

namely puzzles in which piece orientations are un-

known. Shortly after, Sholomon, David and Ne-

tanyahu (2013) devised a genetic algorithm-based

solver for Type 1 puzzles, and later extended it for

Type 2 puzzles (Sholomon, David and Netanyahu,

2014).

A different approach for the problem was pro-

posed by Son et al. (2014; 2018) who devised a Type

1 and 2 solver that utilizes loops of pieces that form

consistent cycles. More recently, Huroyan, Lerman

and Wu (2020) proposed a Type 2 method based on

recovering piece orientations by estimating the graph

connection Laplacian. Of special importance to our

work is the global optimization solver by Khoroshilt-

seva et al. (2021), which is based on RL. Due to its

relevance, we dedicate Section 2.3 to that solver.

Several neural network methods have been sug-

gested to solve square jigsaw puzzles, most of them

applicable only for small puzzles. Paumard, Pi-

card and Tabia (2020) offered a convolutional neu-

ral network-based solver that predicts neighboring re-

lations in 9 pieces puzzles. Li, Liu, Wang, Liu and

Zeng (2021) devised a generative adversarial network

VISAPP 2023 - 18th International Conference on Computer Vision Theory and Applications

786

(GAN) pipeline for training a classiﬁcation network

to predict pieces permutation in puzzles with up to 16

pieces. Talon, Del Bue and James (2022) suggested

another GAN-based approach that is capable of solv-

ing bigger puzzles, but suffers from diminishing ac-

curacy as puzzle size gets bigger.

Along with the different algorithmic approaches

to the square puzzle problem, different piece compat-

ibility measures have been also developed over time.

Such measures deﬁne the compatibility of placing

any two puzzle pieces in any of the 4-neighborhood

adjacency relations. They are most commonly de-

ﬁned as a function C(b

, b

, R) that speciﬁes how

compatible it is to place piece b

in relation R ∈

{right, down, le f t, up} to piece b

. In virtually all

solvers, piece compatibilities are based on a more fun-

damental measure of piece dissimilarity that usually

compares the pictorial information along the bound-

aries of pieces. This measure is later mapped to a

score of compatibility.

2.2 Relaxation Labeling

Relaxation labeling (RL) is a class of processes for

ambiguity reduction of labeling assignments, done by

iteratively reﬁning assignments based on contextual

information (Rosenfeld et al., 1976). Among these

processes, in the current work we focus on nonlinear

relaxation labeling, where problems involve four ele-

ments:

(1) Set of n objects B = {b

, ...,b

(2) Set of m possible labels Λ = {λ

, ...,λ

(3) Four-dimensional matrix of compatibility coefﬁ-

cients {r

i j

(λ, µ)} ∈ R

, expressing how com-

patible are the two hypotheses of “labeling object

with label λ” and “labeling object b

with label

µ”.

(4) Initial labeling ¯p

(0)

∈ R

, containing prior prob-

abilities for labeling each object b

with each label

λ.

The purpose of nonlinear relaxation labeling

(henceforth referred to as just RL) is to provide a con-

sistent labeling assignment ¯p ∈ R

that represents a

viable solution to the problem. An assignment is a set

of probabilities for assigning each label λ to each ob-

ject b

. While the initial labeling ¯p

(0)

is already qual-

iﬁed as such, it may not be “good enough” in terms

of satisfying the local constraints represented by the

compatibility coefﬁcients. An assignment that does

satisfy these constraints is called consistent, and RL

is the optimization process that provides such consis-

tent assignments given the four elements above.

A labeling assignment ¯p, usually referred to just

as labeling, is represented in the current work as la-

beling matrix of n rows and m columns. The proba-

bility of object b

to be labeled with label λ is notated

as p

(λ), and is indicated by the matrix entry in row i

and column λ. Since each row i represents the proba-

bilities of all label assignments to object b

, the space

of all possible labelings is

K =



¯p ∈ R

| p

(λ) ≥ 0, ∀i ∈ B, λ ∈ Λ

and

∑

λ=1

(λ) = 1, ∀i ∈ B



(1)

A particular important subset of K is the set of all

possible unambiguous labelings:

∗



¯p ∈ K | p

(λ) ∈ {0, 1}, ∀i ∈ B, λ ∈ Λ



, (2)

where each object is assigned one label with full con-

ﬁdence.

In order to reﬁne the initial (likely inconsistent) la-

beling, RL algorithm involves the computation of two

terms in each iteration k of the process, the support

and a new more “reﬁned” labeling. While not nec-

essarily the only way to do so, we adapt the common

terms from the RL literature (Pelillo, 1997; Rosenfeld

et al., 1976):

(1) The support matrix ¯q

(k)

∈ R

represents the con-

textual “support” that each object b

gets for label-

ing it with each label λ. This matrix, referred to

as support, is computed by a support function:

(k)

(λ) =

∑

j=1

∑

µ=1

i j

(λ, µ)p

(k)

(µ). (3)

(2) The reﬁned labeling ¯p

(k+1)

that represents the new

“better” labeling, is computed by a nonlinear up-

date rule:

(k+1)

(λ) =

(k)

(λ)q

(k)

(λ)

∑

µ=1

(k)

(µ)q

(k)

(µ)

. (4)

The entire algorithm is best viewed as a contin-

uous mapping of K onto itself through Eq. 4, and

it is expected to converge to a ﬁxed point if applied

successively. Theoretically, such a ﬁxed point itera-

tion stops when p

(k+1)

(λ) = p

(k)

(λ), ∀i, λ for some

iteration k. In practice, we may (heuristically) deter-

mine convergence if the increase of the global objec-

tive function (see below in Eq. 5) in two successive

iterations is smaller than a given threshold ε.

The theoretical foundations for RL were laid by

Hummel and Zucker (1983) who have proved sev-

eral useful theoretical properties for an alternative

RL process. Pelillo (1997) later proved similar dy-

namics for the original RL algorithm from Rosenfeld

Multi-Phase Relaxation Labeling for Square Jigsaw Puzzle Solving

787

et al. (1976). Of particular relevance to us is that un-

der symmetric non-negative compatibility coefﬁcients

(i.e., r

i j

(λ, µ) = r

(µ, λ) ≥ 0, ∀i, j, λ, µ), each iteration

of the process strictly increases a global consistency

measure known as average local consistency (ALC;

Hummel and Zucker, 1983):

A( ¯p, ¯q) =

∑

i=1

∑

λ=1

(λ)q

(λ). (5)

Both contributions (Hummel and Zucker, 1983;

Pelillo, 1997) showed that their respective process

eventually approaches and converges to a consistent

solution near the initial labeling assignment. Impor-

tantly, however, unlike other optimization algorithms

such as gradient ascent, the original iteration due to

Rosenfeld et al. (1976) does so without the need to

determine a step size.

2.3 Puzzle Solving with “Balanced” RL

To our best knowledge, the only work that employs

RL for square jigsaw puzzle solving is by Khoroshilt-

seva et al. (2021). In their formulation for Type 1 puz-

zles, the set of objects is the set of puzzle pieces and

the set of labels is the set of possible positions on the

puzzle grid. In these settings n = m, so the space K is

the space of stochastic matrices, and the space K

∗

the space of binary stochastic matrices. As no better

prior information can be assumed, the initial label-

ing is a uniform distribution of labels across each ob-

ject. The compatibility coefﬁcients r

i j

(λ, µ) are set to

piece compatibility scores C(b

, b

, R) for all pairs of

different pieces b

, b

and positions λ, µ that represent

neighboring positions on the puzzle grid in relation R.

In all other cases, r

i j

(λ, µ) is 0.

Unfortunately, applying the RL update rule from

Eq. 4 using such deﬁnitions does not guarantee, and in

fact often does not result in a feasible puzzle solution,

deﬁned as an assignment where each piece is placed at

a unique position on the puzzle grid and all positions

are occupied with some piece. Such feasible solu-

tions, represented by permutation labeling matrices,

are not always achieved because the theory just guar-

antees that the update rule will map a stochastic ma-

trix into another stochastic matrix, with the additional

hope that the process converges to an unambiguous

assignment, namely a binary stochastic matrix. But

even if the process converges to a binary stochastic

matrix (as opposed to a non-binary one), this does

not prohibit situations where two objects (pieces) ob-

tain the same label (position), or that a given label

(position) is not assigned to any object (piece). To

remedy this situation, Khoroshiltseva et al. (2021) ex-

tended the update rule from Rosenfeld et al. (1976)

with two versions of a matrix balancing component

that seeks to map a stochastic labeling matrix to a

doubly stochastic matrix. Since the latter are easier

to binarize into permutation matrices, this should help

the process converge to the space of all feasible puzzle

solutions.

Khoroshiltseva et al. (2021) showed empirically

that their approach can perform better than a plain

RL without balancing. At the same time, while the

method encourages feasible solutions, it cannot guar-

antee them. Testing on natural images indeed showed

vulnerability to certain common conditions, and in

particular the presence of constant pieces, namely

puzzle pieces of identical appearance of some con-

stant value throughout. Such failure cases can be ex-

plained by the fact that the rows of the labeling matrix

that refer to constant pieces are identical in all steps of

the process (both the update rule and balancing com-

ponent map identical rows into identical rows), thus

the process cannot converge to a permutation matrix.

In the current work we suggest an alternative and

improved puzzle solver based on a novel multi-phase

RL scheme. Our method is applicable for both Type

1 and 2 puzzles, addresses the aforementioned prob-

lems and facilitates further exploration of RL as a

computational powerhouse for puzzle solving.

3 MULTI-PHASE RL PUZZLE

SOLVERS

3.1 Puzzles with Known Piece

Orientation

Type 1 puzzles, possibly the most elementary varia-

tion of the puzzle problem, constitute unordered set of

square pieces whose orientation is given but their or-

ganization in an N ×M array of positions is not. Treat-

ing this problem as an RL problem can be done in the

following way, similar to Khoroshiltseva et al. (2021):

(1) The objects B = {b

, . . . , b

} are puzzle pieces.

(2) The labels Λ = {λ

, ..., λ

} are all possible posi-

tions (x

, y

) on the two-dimensional N × M puz-

zle grid.

(3) The compatibility coefﬁcients are deﬁned by

i j

(λ, µ) = r

i j

((x

, y

), (x

, y

)) =











C(b

, b

, R), if i ̸= j and (x

, y

), (x

, y

) are

neighboring positions in relation R,

0, otherwise ,

(6)

where R ∈ {right, down, le f t, up} represents a 4-

neighborhood spatial relation and C is a non-

VISAPP 2023 - 18th International Conference on Computer Vision Theory and Applications

788

(a) (b) (c)

Figure 1: Type 1 puzzle feasible labeling structure. (a) is a

2 × 2 puzzle and (b) is its perfect solution. This perfect so-

lution is represented by the permutation labeling matrix (c).

Piece IDs in red were added for demonstration purposes.

negative and symmetric piece compatibility func-

tion (see Section 3.3 for the function deﬁnition).

(4) The initial labeling ¯p

(0)

is the barycenter of the

search space, i.e., p

(0)

(λ) =

, ∀i, λ. With no

prior information, this is the only rational choice

one can make.

As mentioned above, a feasible Type 1 reconstruc-

tion is a solution in which each piece is placed at

a unique position on the puzzle grid, and the space

of such feasible labelings is the space of permutation

matrices. Figure 1 exempliﬁes feasible labeling struc-

ture for the Type 1 problem.

Since C is non-negative and symmetric, the same

applies to r

i j

(λ, µ). Consequently, the process is guar-

anteed to strictly increase the ALC in each iteration,

and to converge to a consistent assignment (Pelillo,

1997). For puzzle solving this turns the ALC to an in-

tuitive objective function and the relaxation process to

one that seeks to maximize the sum of compatibilities

between all adjacent pieces, similar in spirit to other

global optimizers for puzzle solving (Andal

o et al.,

2016; Sholomon et al., 2013). For the justiﬁcation of

this claim please refer to the supplementary material

in the project webpage (https://icvl.cs.bgu.ac.il/multi-

phase-rl-for-jigsaw-puzzle-solving).

3.1.1 Type 1 Solver

With the elements of the RL process now set for puz-

zle solving, and the observation that preliminary at-

tempts (Khoroshiltseva et al., 2021) could not guar-

antee feasible solutions, we suggest an alternative al-

gorithm that does guarantee this property. Central to

our approach is a novel multi-phase scheme where

each phase uses RL to determine the position of a

single puzzle piece at a non-occupied position. As

explained below, this enables to change the structure

of the labeling assignment towards permutation label-

ing, as required for feasible solutions.

More speciﬁcally, in each phase of the algorithm

we let the standard RL algorithm (Pelillo, 1997;

Rosenfeld et al., 1976) run and intervene only when

the process is conﬁdent enough about placing a cer-

Figure 2: The ﬁrst phase of our Type 1 solver on a 3 × 3

puzzle, done with α = 0.7. The RL phase begins from

the initial labeling and terminates once some labeling value

crosses α (left matrix, obtained for p

((2, 2)) after 4 RL it-

erations). On the right is the labeling after anchoring piece

to position (2, 2).

tain piece b

at some position λ, namely that the pro-

cess converges to a consistent labeling and/or at least

one label at one object satisﬁes p

(λ) ≥ α, where α is

a pre-deﬁned threshold. This intervention marks the

end of the current phase, and involves three steps:

(1) Anchoring piece b

to position λ and disallow-

ing placing it from now on at any other posi-

tion. This is done by setting p

(λ) = 1 and

(µ) = 0, ∀ µ ̸= λ.

(2) Preventing the future placement of other pieces

at position λ. This is done by setting

(λ) = 0, ∀ j ̸= i.

(3) Resetting all entries p

(µ) that involve pieces

and positions µ for which past phases have

not made a decision yet. This step is done

by setting these entries to the barycenter of the

remaining search subspace, i.e. to the recipro-

cal of the number of remaining unassigned ob-

jects/positions. This allows subsequent phases to

start afresh without bias from prior phases.

Figure 2 demonstrates these steps after the completion

of the ﬁrst RL phase on a 3 × 3 puzzle.

The manipulated labeling serves as the initial la-

beling for the next phase of the algorithm, which

anchors another piece at another position. The al-

gorithm ﬁnally terminates after anchoring all pieces,

namely after n phases. Importantly, although no spe-

cial measures are taken, the process does not mod-

ify entries of pieces and positions for which previous

phases have already made a decision since Eq. 4 does

not (indeed, cannot) modify binary (i.e., 0 or 1) prob-

abilities. Thanks to this property, each phase takes a

decision about a single piece and position by binariz-

ing their respective row and column in the labeling

matrix, all while utilizing the information obtained in

previous phases. This way, each phase updates the la-

beling so it becomes progressively “closer” to a per-

mutation labeling. After n phases convergence to a

Multi-Phase Relaxation Labeling for Square Jigsaw Puzzle Solving

789

(a) (b) (c) (d)

Figure 3: The shifted reconstruction phenomenon and its

solution. (a) is a reconstruction where rows are shifted. To

prevent this, we translate the whole anchored block when

the anchoring of some piece reaches the edge of the puzzle

grid, as in (b) (gray areas represent non-occupied positions).

At this point we translate the block downwards to obtain (c)

and allow subsequent phases to grow in all directions. (d) is

a case a translation will not necessarily beneﬁt the situation

as translating the block upwards will also limit the growth.

At this point we branch the computation to try both options,

translating the block upwards or leaving the block as is.

permutation labeling is guaranteed.

Operating as described above, the multi-phase

solver will normally attach new pieces to the block of

pieces already anchored in the past. However, there is

no guarantee for this behavior. To prevent the solver

from generating isolated “islands” of solutions that

would be difﬁcult to “stitch” together properly across

the puzzle grid, we allow anchoring only at positions

adjacent to previously anchored ones, thus guaran-

teeing a single contiguous block of anchored pieces.

Moreover, if the α threshold is met simultaneously by

more than one combination of piece b

and position λ,

we select a single combination by an internal sorting

of relevant combinations. Please see the supplemen-

tary material for more details.

The algorithm just described guarantees feasible

solutions although, of course, it cannot guarantee cor-

rect solutions. In practice, many solutions are per-

fect (or near perfect) up to a circular shift of the rows

and/or columns, as shown in Figure 3(a). This phe-

nomenon is best explained as a lack of awareness of

the solver to the puzzle dimensions, as it emerges

from the likely event of assigning the very ﬁrst piece

to a wrong position in the puzzle grid. This prob-

lem was also observed in past (non-RL) solvers (e.g.,

Pomeranz et al., 2011), but is easy to alleviate by in-

corporating such awareness in the process. In partic-

ular, each time a phase is completed, if the anchored

block abuts the edge of the puzzle grid, we translate

it one column or row inside (see Figures 3(b)-3(c)).

This translation is done by applying a proper trans-

formation on the labeling matrix, and it allows subse-

quent phases to again grow the block in all directions.

Of course, these translations of the anchored block

will cease once the solution is short of one row or

column to the vertical or horizontal dimension of the

puzzle (e.g., Figure 3(d)). At this point it is yet im-

possible to know what side the coming phases should

grow the last row or column and thus we branch the

(a) (b)

(c)

Figure 4: Type 2 puzzle feasible labeling structure. (a) is a

2 × 2 puzzle and (b) is its perfect solution. The correspond-

ing type 2 permutation labeling matrix is shown in (c).

algorithm to try both options. Doing this for both the

vertical and horizontal growth, we end up with 4 ﬁnal

reconstructions, from which the ﬁnal solution is the

one with the largest ALC.

3.2 Puzzles with Unknown Piece

Orientation

Type 2 puzzles extend the elementary case to situa-

tions where piece orientation is unknown. Since the

ﬁnal orientation of the pieces needs to be determined

by the process, the setup of the problem as an RL pro-

cess requires some adjustments. As before, the ob-

jects b

, . . . , b

still represent the NM puzzle pieces.

However, the labels λ

, ..., λ

now need to represent

all combinations of piece positions and orientations.

Each label λ is a pair ((x

, y

), θ

), where (x

, y

) is

a position on the N × M grid and θ

∈ {0°, 90°, 180°,

270°} is one of 4 possible piece orientations.

Clearly, the change to labels requires a corre-

sponding adjustment to the compatibility coefﬁcients:

i j

(λ, µ) = r

i j

(((x

, y

), θ

), ((x

, y

), θ

)) =











C(θ

), θ

), R), if i ̸= j and (x

, y

), (x

, y

)

are neighboring positions

in relation R,

0, otherwise ,

(7)

where function θ

) represents the clockwise-

rotation of piece b

by θ

degrees.

Under this formulation several complications im-

mediately arise. First, the labeling matrix is no longer

square as m = 4n, and thus the solution cannot be a

permutation matrix. Second, a feasible Type 2 recon-

struction requires to place each piece in some orien-

tation at a unique position on the puzzle grid. In fact,

in order to represent feasible solutions, the labeling

matrix should include a single 1 in each row (just as

VISAPP 2023 - 18th International Conference on Computer Vision Theory and Applications

790

(a) (b) (c) (d) (e)

(f) (g) (h)

Figure 5: Possible reconstructions for Type 2 puzzles. (a)

is a square-shaped puzzle, for which (b), (c), (d) and (e) are

four perfect reconstructions. (f) is a rectangular-shaped puz-

zle, for which (g) and (h) are two perfect reconstructions.

in Type 1 puzzles), but at the same time a single 1

in every group of 4 columns that represent the differ-

ent orientations of the same position. A labeling for

which the matrix representation meets this description

is referred to as type 2 permutation labeling (see Fig-

ure 4). We note that similar to the Type 1 case, in the

space of Type 2 feasible solutions the RL optimization

seeks to maximize the sum of compatibilities between

adjacent pieces.

To complete the description of Type 2 problems

as RL processes, we need to deﬁne the initial label-

ing. The situation here is again somewhat more com-

plicated since unlike for Type 1 puzzles, that have a

single perfect reconstruction, any Type 2 puzzle has

at least 2 perfect reconstructions due to global image

rotation (see Figure 5). In order to encourage conver-

gence to one of these solutions, the dynamics should

be biased towards a speciﬁc solution rather than “fa-

voring” all solutions equally. To address this, we

set the initial labeling as the barycenter of the search

space, while limiting some piece b

to some orienta-

tion θ

(λ) = p

(((x

, y

), θ

)) =











, if i ̸= i

, if i = i

, θ

= θ

0, if i = i

, θ

̸= θ

(8)

When puzzles are square in shape (i.e., N = M), the

orientation θ

is chosen randomly among the 4 possi-

ble ones and serves as an “angular anchor” that pulls

the entire solution towards the proper global orienta-

tion. Doing the same for rectangular N ×M puzzles is

problematic because the selected orientation of piece

, and thus the preferred global orientation of the

solution, might conﬂict the rectangular dimensions of

the puzzles. To cope with this problem we run the al-

gorithm twice, ﬁrst with an initial state derived from

a randomly selected orientation θ

, and once again

with an initial state derived from θ

+ 90°. This en-

sures that one of the computations could converge to

a correct reconstruction, which is identiﬁed and se-

Figure 6: The ﬁrst phase of our Type 2 solver on a 2 × 2

puzzle, with α = 0.7. The RL phase begins from the ini-

tial labeling and terminates once labeling values cross α

(top matrix, obtained after 3 RL iterations). The best supra-

threshold entry at this point is p

(((2, 1), 0°)). After anchor-

ing piece b

to position (2, 1) at orientation 0° we obtain the

bottom matrix. Note that b

is allowed only at orientation

0° since the initial labeling was set this way.

lected by its larger ALC. Finally, in order to maxi-

mize the inﬂuence of the bias obtained from the initial

state, piece b

is heuristically selected as the one that

maximizes the sum of potential compatibilities in its

4-neighborhood.

3.2.1 Type 2 Solver

Not unlike the Type 1 solver that directs the process to

a permutation labeling, the Type 2 solver needs to lead

the process to a type 2 permutation labeling. How-

ever, to “anchor” an entry p

(((x

, y

), θ

)) requires

to revise the intervention at the end of each RL phase

to manipulate the labeling in a slightly different way:

(1) Anchoring piece b

to position (x

, y

) in ori-

entation θ

and preventing its placement at

any other position (regardless of orientation).

Formally we set p

(((x

, y

), θ

)) = 1 and

(µ) = 0, ∀ µ ̸= ((x

, y

), θ

(2) Preventing a future placement of other

pieces at position (x

, y

) regardless of

their orientation. This is done by setting

(((x

, y

), θ)) = 0, ∀ θ, j ̸= i.

(3) Resetting entries p

(((x

, y

), θ

)) that involve

pieces b

and positions (x

, y

) for which anchor-

ing is yet made. This is done by setting these en-

tries to the reciprocal of the number of combina-

tions of unassigned positions and possible orien-

tations (i.e., the number of unassigned positions

multiplied by 4). If piece b

(used for setting the

initial labeling) is not anchored, we reset its val-

ues so only orientation θ

is allowed.

Figure 6 demonstrates the effect of these 3 steps after

the completion of the ﬁrst RL phase on a 2 ×2 Type 2

Multi-Phase Relaxation Labeling for Square Jigsaw Puzzle Solving

791

puzzle. The shifting of the anchored block at the end

of each phase is handled as before.

3.3 Piece Compatibility Measure

At the foundations of any puzzle solver is some mea-

sure of afﬁnity between puzzle pieces that allows to

quantify the quality of immediate neighborhood re-

lationships. In our algorithm it is integrated in RL

compatibility coefﬁcients.

As mentioned earlier, piece compatibility mea-

sures are conventionally based on a more basic mea-

sure of piece dissimilarity. While various dissim-

ilarity measures have been proposed, we use the

Mahalanobis gradient dissimilarity proposed by Son

et al. (2018). This dissimilarity measure evaluates

the disagreement of placing two adjacent pieces by

computing the Mahalanobis distances between the ac-

tual gradients and expected gradients with respect to

the distribution of predicted gradients, where the gra-

dients considered are both across and along the ad-

joining boundaries of pieces. Please refer to Son

et al. (2018) for a full description.

Given the dissimilarity, it is possible to convert it

to a compatibility function in various ways, the most

immediate of which is simply negating the dissimilar-

ities. Here we take a more elaborate approach that is

based on the observation that our solver, and RL pro-

cesses in general, perform better when the compati-

bility values are well spaced and the difference be-

tween compatible and incompatible values is signiﬁ-

cant. This was already observed in other solvers, and

here we follow Andal

o et al. (2016) who improved the

basic idea from Pomeranz et al. (2011) and deﬁned

C(b

, b

, R) = exp



− ϕ

i,R

( j) −

D(b

, b

, R)

quartile(i, R)



, (9)

where quartile(i, R) is the quartile dissimilarity

among all dissimilarities in relation R to piece b

, and

i,R

( j) is the rank of the dissimilarity D(b

, b

, R) in

the increasingly ordered dissimilarities in relation R

to piece b

. The quartile(i, R) term is included to

consider the scattering of dissimilarity values (and not

just the values themselves), and the ϕ

i,R

( j) rank term

results in “spacing” the computed compatibility val-

ues.

Inspired by such ideas, we propose a new compat-

ibility measure:

C(b

, b

, R) =











1, if D(b

, b

, R) = p

avg

(i, R;k) = 0,



1 −

D(b

,R)

avg

(i,R;k)



i,R

( j)

, if D(b

, b

, R) ≤ p

avg

(i, R;k)

and p

avg

(i, R;k) > 0,

0, otherwise ,

(10)

where p

avg

(i, R; k) is the average dissimilarity from

the ﬁrst value till the k percentile among all dissimi-

larities in relation R to piece b

This measure offers two advantages over the mea-

sure from Eq. 9: (1) dissimilarities scattering informa-

tion is determined by multiple values rather than one

value (the quartile dissimilarity). Moreover, we intro-

duce k as a parameter to avoid limitation to k = 25

(the quartile). (2) It better balances the D(b

, b

, R)

and ϕ

i,R

( j) terms since high ϕ

i,R

( j) does not neces-

sarily guarantee low compatibility score.

An important challenge that most global optimiza-

tion solvers face is due to constant pieces, or puz-

zle pieces that have the same constant color value

throughout, which happens mostly in sky regions that

turned white due to pixel saturation. Constant pieces

maintain maximal compatibility with other constant

pieces (and possibly with more pieces), thus it be-

comes difﬁcult for the RL process to converge to a la-

beling that unambiguously determines their positions.

To alleviate this, we slightly adjust the compatibility

for puzzles with more than two constant pieces:

′

, b

, R) =











max{0, X}, if there exists constant

piece b

s.t. C(b

, b

, R) = 1

and C(b

, b

, R) = 1,

C(b

, b

, R), otherwise ,

(11)

where X ∼ U(−4, 1) is a random variable. This

adjustment re-draws all maximal compatibilities for

constant pieces (and after rectiﬁcation set it to 0 with

probability of 0.8), which sparsiﬁes the compatibility

coefﬁcients and assists convergence.

4 EXPERIMENTAL RESULTS

To test the suggested puzzle solvers, we used the 20

puzzles of 432 pieces from the MIT data set pro-

posed by Cho et al. (2010) and the 20 puzzles of 540

pieces from the McGill data set proposed by Pomer-

anz et al. (2011). All puzzles contain pieces of size

of 28 × 28 pixels. We evaluated performance with

the following three performance measures: (1) Di-

rect comparison (DC; Cho et al., 2010) is the fraction

of pieces that are in the same position (and orienta-

tion for Type 2 puzzles) in the assembly solution and

the ground truth solution. (2) Neighbor comparison

(NC; Cho et al., 2010) is the fraction of pairwise piece

matchings that exist in the ground truth solution. (3)

Perfect reconstruction (PR; Gallagher, 2012) is a bi-

nary indication of whether the assembly solution is

identical to the ground truth. For an entire data set,

PR becomes the number of perfectly solved puzzles.

VISAPP 2023 - 18th International Conference on Computer Vision Theory and Applications

792

Table 1: Comparison of Type 1 solvers.

Method

The MIT data set The McGill data set

DC NC PR DC NC PR

Andal

o et al. (2016) 96% 95.6% 13 90.8% 95.3% 13

Gallagher (2012) 95.3% 95.1% 12 - - -

Paikin and Tal (2015) 96.2% 95.8% 13 93.2% 96.1% 13

Pomeranz et al. (2011) 94% 95% 13 83.5% 90.9% 9

Sholomon et al. (2013) 86.2% 96.2% 9 92.8% 96% 8

Son et al. (2018) 95.8% 95.6% - 95.4% 97% -

Khoroshiltseva et al. (2021) 79.1% 87.9% 6 25.6% 64.5% 0

Ours 95.5% 95.4% 13 62.9% 87.8% 6

Table 2: Comparison of Type 2 solvers.

Method

The MIT data set The McGill data set

DC NC PR DC NC PR

Gallagher (2012) 82.2% 90.4% 9 - - -

Huroyan et al. (2020) 94.8% 95.2% 13 88.3% 92.2% 13

Paikin and Tal (2015) 95.4% 95.4% - - - -

Sholomon et al. (2014) 94.6% 94.9% 10 89.6% 92% 8

Son et al. (2018) 95.8% 95.6% 12 92.3% 94.5% 11

Ours 87.1% 93.9% 5 25.8% 78% 0

The only random part in our solvers is the constant

pieces compatibility adjustment (Eq. 11). Otherwise,

solvers are fully deterministic for a given input. For

this reason, the reported performance for puzzles with

constant pieces is the average of 10 runs while the rest

of the puzzles were solved once. In all experiments,

the RL convergence threshold ε was set to 10

−4

, while

the α anchoring threshold was set to 0.7. Dissimi-

larity computations are based on the CIELAB color

space. Compatibility computations (Eq. 10) use k = 3

and k = 1.5 for Type 1 and 2 solvers, respectively.

Table 1 presents the Type 1 solver performance.

Our main goal in this paper is a progression in

performance of RL solvers and indeed the multi-

phase method performs better than Khoroshiltseva

et al. (2021). Equally important, results are highly

competitive with many prior methods on the MIT data

set and approach the MIT data set upper bounds (DC,

NC, and PR upper bounds are 96.7%, 96.4% and

15, respectively; Son et al., 2014), which are below

perfect since no solver is expected to internally or-

der identical constant pieces correctly. The results of

prior methods were taken from their papers, while the

results of Khoroshiltseva et al. (2021) were obtained

by our implementation for their solver with Sinkhorn-

Knopp balancing. Table 2 presents our Type 2 results

compared to other methods. Because Type 2 solutions

may be rotated compared to the ground truth (see Fig-

ure 5), we evaluate only the rotation that achieves the

highest DC. Figure 7 shows Type 1 and 2 results.

5 CONCLUSION

This paper suggests a step forward in RL puzzle solv-

ing based on a novel multi-phase RL optimization ap-

proach that is fully automatic, uses no prior informa-

tion, guarantees feasible solutions, and is applicable

(a)

(b)

(c)

(d)

(e)

Figure 7: Input puzzle, RL solver result, and ground truth

image are shown left to right. (a) shows a typical result on

a Type 1 puzzle with constant pieces (sky pieces are con-

stant white). While the solution is visually perfect, its DC

and NC scores are relatively low (83.8% and 81.8%, respec-

tively) due to wrong internal ordering of constant pieces. (b)

shows a Type 1 solution with a very low DC score (6.3%)

even though the solution is visually excellent, due to offset

of one column. (c) shows a perfectly solved Type 2 puzzle.

(d) and (e) exemplify an occasional failure of the Type 1

and 2 solvers (respectively) in which sub-blocks of pieces

are assembled correctly, but in wrong position (and rotation

for Type 2) relative to each other.

for both Type 1 and 2 puzzles. Future work should

handle failure cases (see Figures 7(d)-7(e)), which

could, we believe, be alleviated within the RL frame-

work. For example, by identifying correctly assem-

bled block (such as done by Pomeranz et al., 2011),

modifying the compatibility coefﬁcients accordingly

and running the RL process again.

We believe that the proposed multi-stage approach

could be used as a general RL-based scheme for other

hard optimization problems (not necessarily puzzles)

for solving permutation challenges (cf. Type 1 cases)

or even more complicated permutation constraints (cf.

Type 2 cases). For example, it is likely that the multi-

phase approach could be applied to much larger scale

Multi-Phase Relaxation Labeling for Square Jigsaw Puzzle Solving

793

instances of the traveling salesman problem than pre-

viously attempted with RL (Pelillo, 1993).

ACKNOWLEDGEMENTS

This project has received funding from the Euro-

pean Union’s Horizon 2020 research and innovation

programme under grant agreement No 964854, the

Helmsley Charitable Trust through the ABC Robotics

Initiative, and the Frankel Fund of the Computer Sci-

ence Department at Ben-Gurion University.

REFERENCES

Andal

o, F. A., Taubin, G., and Goldenstein, S. (2012). Solv-

ing image puzzles with a simple quadratic program-

ming formulation. In 2012 25th SIBGRAPI Confer-

ence on Graphics, Patterns and Images, pages 63–70.

Andal

o, F. A., Taubin, G., and Goldenstein, S. (2016).

PSQP: Puzzle solving by quadratic programming.

IEEE Transactions on Pattern Analysis and Machine

Intelligence, 39(2):385–396.

Cho, T. S., Avidan, S., and Freeman, W. T. (2010). A

probabilistic image jigsaw puzzle solver. In 2010

IEEE Computer Society Conference on Computer Vi-

sion and Pattern Recognition, pages 183–190.

Deever, A. and Gallagher, A. (2012). Semi-automatic as-

sembly of real cross-cut shredded documents. In 2012

19th IEEE International Conference on Image Pro-

cessing, pages 233–236.

Demaine, E. D. and Demaine, M. L. (2007). Jigsaw puz-

zles, edge matching, and polyomino packing: Con-

nections and complexity. Graphs and Combinatorics,

23(1):195–208.

Floudas, C. A. (2013). Deterministic global optimization:

Theory, methods and applications. Springer Science

& Business Media.

Freeman, H. and Garder, L. (1964). Apictorial jigsaw puz-

zles: The computer solution of a problem in pattern

recognition. IEEE Transactions on Electronic Com-

puters, EC-13(2):118–127.

Gallagher, A. C. (2012). Jigsaw puzzles with pieces of un-

known orientation. In 2012 IEEE Conference on Com-

puter Vision and Pattern Recognition, pages 382–389.

Hummel, R. A. and Zucker, S. W. (1983). On the foun-

dations of relaxation labeling processes. IEEE Trans-

actions on Pattern Analysis and Machine Intelligence,

PAMI-5(3):267–287.

Huroyan, V., Lerman, G., and Wu, H.-T. (2020). Solv-

ing jigsaw puzzles by the graph connection lapla-

cian. SIAM Journal on Imaging Sciences, 13(4):1717–

1753.

Khoroshiltseva, M., Vardi, B., Torcinovich, A., Traviglia,

A., Ben-Shahar, O., and Pelillo, M. (2021). Jigsaw

puzzle solving as a consistent labeling problem. In In-

ternational Conference on Computer Analysis of Im-

ages and Patterns, pages 392–402. Springer.

Koller, D. and Levoy, M. (2006). Computer-aided recon-

struction and new matches in the forma urbis romae.

Bullettino Della Commissione Archeologica Comu-

nale di Roma, pages 103–125.

Kosiba, D. A., Devaux, P. M., Balasubramanian, S., Gandhi,

T. L., and Kasturi, K. (1994). An automatic jigsaw

puzzle solver. In Proceedings of 12th International

Conference on Pattern Recognition, volume 1, pages

616–618.

Li, R., Liu, S., Wang, G., Liu, G., and Zeng, B. (2021). Jig-

sawgan: Auxiliary learning for solving jigsaw puzzles

with generative adversarial networks. IEEE Transac-

tions on Image Processing, 31:513–524.

Liu, H., Cao, S., and Yan, S. (2011). Automated assem-

bly of shredded pieces from multiple photos. IEEE

Transactions on Multimedia, 13(5):1154–1162.

Paikin, G. and Tal, A. (2015). Solving multiple square jig-

saw puzzles with missing pieces. In Proceedings of

the IEEE Conference on Computer Vision and Pattern

Recognition (CVPR), pages 4832–4839.

Paumard, M.-M., Picard, D., and Tabia, H. (2020). Deepz-

zle: Solving visual jigsaw puzzles with deep learning

and shortest path optimization. IEEE Transactions on

Image Processing, 29:3569–3581.

Pelillo, M. (1993). Relaxation labeling processes for

the traveling salesman problem. In Proceedings of

1993 International Conference on Neural Networks

(IJCNN-93-Nagoya, Japan), volume 3, pages 2429–

2432.

Pelillo, M. (1997). The dynamics of nonlinear relaxation

labeling processes. Journal of Mathematical Imaging

and Vision, 7(4):309–323.

Pomeranz, D., Shemesh, M., and Ben-Shahar, O. (2011).

A fully automated greedy square jigsaw puzzle solver.

In CVPR 2011, pages 9–16.

Rosenfeld, A., Hummel, R. A., and Zucker, S. W. (1976).

Scene labeling by relaxation operations. IEEE Trans-

actions on Systems, Man, and Cybernetics, SMC-

6(6):420–433.

Sholomon, D., David, O., and Netanyahu, N. S. (2013). A

genetic algorithm-based solver for very large jigsaw

puzzles. In 2013 IEEE Conference on Computer Vi-

sion and Pattern Recognition, pages 1767–1774.

Sholomon, D., David, O. E., and Netanyahu, N. S. (2014).

A generalized genetic algorithm-based solver for very

large jigsaw puzzles of complex types. In Proceed-

ings of the AAAI Conference on Artiﬁcial Intelligence,

volume 28.

Son, K., Hays, J., and Cooper, D. B. (2014). Solving square

jigsaw puzzles with loop constraints. In European

Conference on Computer Vision, pages 32–46.

Son, K., Hays, J., and Cooper, D. B. (2018). Solving square

jigsaw puzzle by hierarchical loop constraints. IEEE

transactions on pattern analysis and machine intelli-

gence, 41(9):2222–2235.

Talon, D., Del Bue, A., and James, S. (2022). Ganz-

zle: Reframing jigsaw puzzle solving as a retrieval

VISAPP 2023 - 18th International Conference on Computer Vision Theory and Applications

794

task using a generative mental image. arXiv preprint

arXiv:2207.05634.

Zhao, Y.-X., Su, M.-C., Chou, Z.-L., and Lee, J. (2007). A

puzzle solver and its application in speech descram-

bling. In Proceedings of the 2007 Annual Conference

on International Conference on Computer Engineer-

ing and Applications, pages 171–176.

Zucker, S. W., Hummel, R. A., and Rosenfeld, A. (1977).

An application of relaxation labeling to line and curve

enhancement. IEEE Transactions on Computers,

26(04):394–403.

Multi-Phase Relaxation Labeling for Square Jigsaw Puzzle Solving

795