Subset Sum and the Distribution of Information
Daan van den Berg¹ (https://orcid.org/0000-0001-5060-3342) and Pieter Adriaans² (https://orcid.org/0000-0002-8473-7856)
¹Yamasan Science & Education, Amsterdam, The Netherlands
²ILLC, IVI-CCI, University of Amsterdam, The Netherlands
Keywords: Subset Sum, Branch and Bound, Information, Instance Hardness, Computational Complexity, NP-hard.
Abstract: The complexity of the subset sum problem does not only depend on the lack of an exact algorithm that runs in subexponential time in the number of input values. It also critically depends on the number of bits m of the typical integer in the input: a subset sum instance of size n with large m has fewer solutions than a subset sum instance with relatively small m. Empirical evidence from this study suggests that this image of complexity has a more fine-grained structure. A depth-first branch and bound algorithm deployed to the integer partition problem (a special case of subset sum) shows that, for this experiment, its hardest instances reside in a region where informational bits are equally dispersed among the integers. Its easiest instances reside there too; in regions of more eccentric informational dispersion, hardness is much less volatile among instances. The boundary between these hardness regions is given by instances in which the i-th element is an integer of exactly i bits. These findings show that, for this experiment, a very clear hardness classification can be made on the basis of information dispersal alone, even for subset sum instances with identical values of n and m. The role of the 'scale free' region is discussed from an information theoretical perspective.
1 INTRODUCTION
The subset sum problem exists in many shapes
and forms, but always involves a set of integers
S = {x_1, x_2, ..., x_i, ..., x_n}, a target value t, and a summing
operation. Consider the following instance with
n = 8:
S = {1, 4, 9, 12, 17, 31, 41, 59} with t = 74
In its decision form (“Is there a subset of S that sums
up to t?”), the problem is in NP, because any given
solution can be verified for correctness in polyno-
mial time (Garey and Johnson, 1979). The subset
{41, 31, 1} is not a solution, because it sums up to 73
and not 74. Neither is {59, 9, 4, 1}, and again, verifying that it isn't can be done quickly, by just summing up the elements, which takes polynomial time. Does
the above example have an exact solution at all?
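To make the polynomial-time nature of verification concrete, here is a minimal sketch in Python (the function name verify is ours, not the paper's):

def verify(subset, t):
    # Verifying a claimed solution only requires one pass of summing.
    return sum(subset) == t

print(verify([41, 31, 1], 74))    # False: sums to 73
print(verify([59, 9, 4, 1], 74))  # False: also sums to 73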
Avoiding the direct answer to that question leads
to the slightly harder optimization problem (“Which
summed subset of S approximates t most closely?”).
This version of subset sum is NP-hard, and does not have a polynomial time verification procedure. Subset {41, 31, 1} might be the closest approximation, but
there’s no way of knowing for sure apart from sum-
ming all subsets of S, which is a cumbersome (non-
polynomial) operation.
This study focuses on the optimization problem,
with the additional requirement that t has the value
of the summed integers divided by two (which would
mean t = 87 in the above example). This version
is known as the partition problem (sometimes called the number partition problem), and traditionally formulated as the task of splitting the set into two as-equal-as-possible valued subsets. In this paper, we'll
use an exact algorithm, guaranteed to deliver the best
possible answer, and measure runtimes to assess the
hardness of individual problem instances.
There is some previous work related to ours. Early
papers by Brickell (Brickell, 1984) and Lagarias and
Odlyzko (Lagarias and Odlyzko, 1985) define den-
sity on knapsack and subset sum instances, and a
more recent work by Tural on the closely related de-
cision variant of subset sum also incorporates branch
and bound (Kemal Tural, 2020). His generation of
instances appears to be somewhat similar to that of
Zhang and Korf's (Zhang and Korf, 1996), and although that study is on ATSP, we feel that our current work is most complementary to theirs. We'll elaborate
a little more on this in Section 6.
The major difference with previous studies lies in
the creation of instances. Historically, instances are often generated by drawing uniformly at random from a range of integers. In this study, we'll take a different approach. All instances in our experiment will have exactly 12 elements, exactly 78 bits of information, and are therefore supposedly equally hard. But by assigning an explicit distribution to the informational bits of the elements x_i ∈ S in an instance, we will show that this distribution can also play a key role in the hardness assessment of problem instances. The results of this new approach make us rethink the traditional hardness measures, maybe even classes, and open up a whole new dimension of research in algorithms and
instance hardness. There’s also an information the-
oretical perspective on the results, which we’ll high-
light in Section 4.
2 METHODS
2.1 Branch & Bound
The solving algorithm for this experiment is rela-
tively straightforward. The term ‘branch and bound’
was coined by Little et al. (Little et al., 1963), whose original publication is digitally available only as a murky scan or a work-in-progress report; a fully refurbished edition, clean and searchable, was made available in 2020 (Daan van den Berg, 2019). The basic principle, however, existed earlier (Rossman and Twery, 1958a; Rossman and Twery, 1958b; Eastman, 1958), in works that largely predate the internet and are not readily available in digital form, and in Land and Doig's paper, originally published in 1960 and republished in 2010 (Land and Doig, 1960; Land and Doig, 2010). The paradigm appears
in roughly two algorithmic forms: a (priority) queue
based implementation and a stack-based implementa-
tion, which thereby follows a depth-first search tree
traversal. We will use the latter of the two, but both forms are exact methods: given enough runtime to finish, they guarantee a solution iff one exists for any decision problem they are deployed to, and return a 100% certain "no solution" otherwise.
Along the same lines, depth-first branch and bound
is also solution-optimal when deployed to optimiza-
tion problems, guaranteeing the best solution when
it finishes. The drawback, of course, is that runtimes
go up exponentially, exceeding the universe’s lifes-
pan even for moderately sized problem instances. Our
implementation on the subset sum problem has complexity O(2^n), but it does have a strong strategy for
cutting down runtimes. Abiding by Steven Skiena’s
famous words “Clever pruning can make short work
of surprisingly hard problems” (Skiena, 1998), it cuts
branches from the search tree which are guaranteed to
not improve over the incumbent best solution (‘best’).
For subset sum, this means that when the algorithm is
approximating t and has a current value cur higher than t, no additional items are added to it, using t as a bound when branching through the search space. The algorithm thus effectively traverses all 2^{|S|} sums, but
omits branches that cannot possibly result in a better
value than the incumbent best, thereby saving signif-
icant computation time. Finally, the boolean variable
targetFound ensures the algorithm halts as soon as
the first exact solution is found.
1. Preprocessing: sort S in descending order, assign cur := 0, i := 1, i_max := |S|, best := ∞, t := (1/2) Σ_{i=1}^{|S|} x_i, targetFound := false.
2. Assign cur_1 := cur + x_i, cur_2 := cur. If cur_1 = t, then assign targetFound := true. If |t − cur_1| < |t − best|: best := cur_1.
3. If i < i_max and cur_1 < t and ¬targetFound: go to 2 with cur := cur_1, i := i + 1.
4. If i < i_max and cur_2 < t and ¬targetFound: go to 2 with cur := cur_2, i := i + 1.
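For concreteness, here is a minimal Python sketch of steps 1-4. It is our reconstruction rather than the authors' code: branch_and_bound and its recursion counter are hypothetical names, t defaults to half the summed input as in the partition setting, and the incumbent update compares distances to t, in line with the discussion below.

def branch_and_bound(S, t=None):
    # Depth-first branch and bound, sketching steps 1-4 above.
    S = sorted(S, reverse=True)              # step 1: preprocessing
    if t is None:
        t = sum(S) / 2                       # partition: t is half the total
    state = {"best": float("inf"), "found": False, "recursions": 0}

    def branch(i, cur):                      # step 2: one recursion
        state["recursions"] += 1
        cur1 = cur + S[i]                    # cur1 includes S[i]; cur excludes it
        if abs(t - cur1) < abs(t - state["best"]):
            state["best"] = cur1             # incumbent closest sum so far
        if cur1 == t:
            state["found"] = True            # exact solution: halt the search
        if i + 1 < len(S) and cur1 < t and not state["found"]:
            branch(i + 1, cur1)              # step 3: branch with S[i] included
        if i + 1 < len(S) and cur < t and not state["found"]:
            branch(i + 1, cur)               # step 4: branch with S[i] excluded

    branch(0, 0)
    return state["best"], state["recursions"]

# The introduction's example as a partition instance (t = 87):
print(branch_and_bound([1, 4, 9, 12, 17, 31, 41, 59]))
# -> (88, ...): no subset sums to 87, and 88 is one off the target.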
The reason that t is chosen for the bound, and not best,
is that in general, the closest possible approximation
can be either higher or lower than t. So if at some
point in the previous example cur = 68, the incum-
bent best solution best = 71 and t = 74, it is unwise
to cut off along best, because 75 is still achievable, albeit on the other side of t. Only cutting off after the last addition has exceeded t ensures that further additions cannot result in a better solution.
2.2 Templates & Instances
We fed the solving algorithm exactly 70 instances divided into 7 cohorts of 10 instances, each of which had m = 78 bits of information in the n = 12 positive integers of its set. Every cohort was formed through a 'strict template' that assigned a prespecified number of bits b to every integer in the instance. As such, a 3-bit template entry ('3b') results in a random integer valued 4, 5, 6 or 7 in the instance. The template entry 6b results in a randomly chosen integer between 32 and 63 (inclusive) in the problem instance (some explicit examples are in Table 1). It is important to observe the notion
Table 1: Seven 'strict templates' used for making 70 subset sum instances. A value such as 4b randomly generates a corresponding integer of exactly 4 bits, meaning it is randomly chosen between 8 and 15 (inclusive). The templates vary in eccentricity, ST_3 being the most eccentric and ST_{-3} being the flattest possible. For each template, one corresponding instance is given as an example.

Strict Template                                          Example Instance
ST_3    (1b,1b,1b,1b,1b,4b,4b,5b,9b,13b,17b,21b)    {0,1,1,1,0,10,12,17,478,7899,90607,1638220}
ST_2    (1b,1b,2b,2b,3b,4b,5b,6b,9b,12b,15b,18b)    {1,1,3,3,6,15,23,40,423,3422,24181,251636}
ST_1    (1b,1b,2b,4b,4b,5b,6b,7b,9b,11b,13b,15b)    {0,1,3,8,14,30,45,79,324,1145,4332,19120}
ST_0    (1b,2b,3b,4b,5b,6b,7b,8b,9b,10b,11b,12b)    {1,2,6,12,19,35,115,247,305,563,1534,3828}
ST_{-1} (3b,3b,4b,4b,5b,6b,7b,8b,8b,9b,10b,11b)     {7,6,9,13,16,55,109,175,230,330,909,1686}
ST_{-2} (4b,4b,5b,5b,6b,6b,7b,7b,8b,8b,9b,9b)       {11,11,30,26,49,49,84,80,166,156,484,317}
ST_{-3} (6b,6b,6b,6b,6b,6b,7b,7b,7b,7b,7b,7b)       {58,54,35,61,50,49,122,71,111,119,108,92}
of ‘strict’, which means that a template entry of 6b
results in an integer of exactly 6 bits, and not fewer.
This constraint ensures that all instances in all cohorts
have exactly 78 bits in total, making all experimen-
tal instances equal in both n and m. Still, despite this
stringent uniformity, the exact assignment of informa-
tional bits within the input has a formidable impact on
an instance’s hardness.
Centrally located in the list of strict templates is ST_0, which increases linearly: 1b, 2b, 3b, ..., 12b and which, from an information theory perspective, could be seen as 'scale free' (Section 4). Below it, the templates get increasingly flatter, reaching the flattest possible template of 78 bits in 12 entries in ST_{-3}, which consists of six 6b-entries and six 7b-entries. The three templates above the central template are increasingly eccentric, with a flat beginning but rising increasingly fast towards the end. Unlike the lower half of the strict templates, which ranges from linearly increasing to maximum flatness, the top half does not range to maximum eccentricity. A maximally eccentric strict template for 78 bits and 12 entries is given by (1b, 1b, 1b, ..., 1b, 67b), which would produce eleven zeroes and ones, and one extremely large integer. So even though the range of eccentricity extends a long way upward, we did not incorporate any of these extremes. In its current setting, we think that the upper half of the table is representative for all templates more eccentric than ST_3.
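To illustrate how such instances might be generated, here is a minimal sketch (our naming, not the authors' code). One caveat: an entry of exactly k bits corresponds to the range [2^{k-1}, 2^k - 1], but Table 1's example instances contain 0s for 1b entries, so the sketch draws 1b entries from {0, 1}.

import random

def instance_from_template(template):
    # A 'kb' entry yields an integer of exactly k bits, i.e. one drawn
    # uniformly from [2**(k-1), 2**k - 1]. Judging by Table 1, 1b
    # entries appear to be drawn from {0, 1} instead.
    return [random.randint(0, 1) if k == 1
            else random.randint(2 ** (k - 1), 2 ** k - 1)
            for k in template]

ST_0 = list(range(1, 13))   # the linear template (1b, 2b, ..., 12b): 78 bits
print(instance_from_template(ST_0))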
3 RESULTS
In this experiment, the eccentric subset sum instances
all have the exact same hardness, fixed at 2048 re-
cursions. This is well explainable: the preprocess-
ing step, which sorts the integers largest to smallest,
ensures the first recursion subtree immediately ex-
ceeds the target value t and thereby requires no fur-
ther recursions, discarding half of the total search tree.
There can be only one integer x_i ≥ (1/2) Σ_{j=1}^{|S|} x_j in an instance of the partition problem of this size, and such an integer is present in all instances generated from templates ST_3, ST_2, ST_1 and ST_0. Computationally speaking therefore, the presence of such an integer sets an upper bound of 2^{|S|−1} recursions on the instance's hardness. The rest of the search tree, containing at most half the summed value distributed over 11 integers, needs to be fully checked to ensure the best approximation, requiring exactly 2^{|S|−1} recursions.
Descending through Table 1, the instances in ST_0's cohort are the first to show a little variation in their computational cost (also see Fig. 1). In this region, the largest integer in the instance no longer automatically constitutes half of the set's total, and the algorithm sometimes performs one more step than many other branch and bound implementations, because its target t can be approximated from below and above.
The most interesting patterns, however, emerge from ST_{-1}, ST_{-2} and especially ST_{-3}. In this area, most subtrees are non-prunable because the integers reside in the same range of values, which causes all of the search tree to be exhaustively searched in the worst case (even though this never happened in this experiment, likely because t = (1/2) Σ_{i=1}^{|S|} x_i for all instances). But computation time doesn't only increase in these regions; there are also significant setbacks in computational cost. These occur when instances have exact solutions, and the search can be halted altogether once one is found. As instances flatten, mostly in the ST_{-3}-area, exact solutions become ever more rife, causing the search process to halt earlier. But the instances without an exact solution are indeed harder, so the somewhat paradoxical conclusion is that in this region, the hardest instances and the easiest instances are very close together, tightly intermingled in the same confined area of ST_{-3}. In this
area, instances’ integers can be said to have mutual
information, a concept well-known from information
theory, the broader ramifications of which will be dis-
cussed next.
Figure 1: Strict templates of different eccentricity have different hardness patterns. Templates more eccentric than the linear template ST_0 have no variation in computational hardness. Templates less eccentric than ST_0 have both harder and easier instances.
4 INFORMATION THEORY
The template approach we are adopting in this pa-
per is motivated by an information theoretical analysis
of the subset sum problem and its instance hardness.
Earlier and related work can be found in (Adriaans
and Van Benthem, 2008; Adriaans, 2020b; Adriaans,
2020a).
Generally speaking, let S ⊂ N^+ be a set of positive natural numbers, let target value t be a natural number, and let the pair (S, t) be a corresponding instance of the subset sum problem. Such an instance will, from an information theoretical perspective, be hard if t contains 'minimal' information about the subsets of S. In other words, suppose (S, t) is a yes-instance of subset sum and (S, t_no) is a no-instance; then it should be difficult
to distinguish these two: there should be no computa-
tional predicate (i.e. a computer program) that read-
ily identifies some structural difference between the
two. Using insights from information theory we can
argue that such instances exist, although we can never
construct them explicitly. The conditions we have to
create are prima facie simple:
1. The set S should have maximal entropy.
2. The numbers t and t_no should have maximal entropy given S.
In order to identify the numbers that satisfy the second
condition we can use the following lemma:
Lemma 4.1. Let S be a finite set of objects of cardinality n. The maximal amount of conditional information I(S_i | S) in a subset S_i ⊆ S is n bits, for large enough n, and it is reached when we select (1/2)n elements from S under uniform distribution.
Proof: When we select k objects from a set with cardinality n, we need at most log_2 C(n, k) bits for an index of this subset, since there are C(n, k) of such subsets. Newton's binomium is symmetrical and peaks at the value k = (1/2)n, with log_2 C(n, n/2) ≈ n. Asymptotically we have:

lim_{n→∞} log_2 C(n, n/2) / n = 1
If we use a Bernoulli process (i.e. coin flipping) to select the elements of S_i we may, without loss of generality, assume that it has maximal conditional entropy.
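The proof's limit is easy to verify numerically; a minimal sketch (math.comb requires Python 3.8 or newer):

from math import comb, log2

# log2(C(n, n/2)) / n approaches 1 as n grows:
for n in (10, 100, 1000, 10000):
    print(n, log2(comb(n, n // 2)) / n)
# ratios of roughly 0.798, 0.964, 0.995 and 0.999, respectively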
In order to understand how Lemma 4.1 can be used to
identify hard cases of subset sum, we first turn our at-
tention to exemplars that have very low entropy: sets
that contain only powers of two. Equation 1 illustrates
the special nature of these sets:
Σ_{i=0}^{n} 2^i = 2^{n+1} − 1    (1)
Furthermore, it will prove useful to define the notion
of the scale of a number:
Definition 4.2. The scale of a number n ∈ N^+ is given by sc(n) = ⌈log_2(n)⌉.
The scale of a binary number is related to the number
of bits we need to encode it. We observe that, when
S consists only of powers of two, any corresponding
subset sum problem instance (S, t) is trivial and can
be solved in polynomial time. We can simply use
Equation 1 to estimate the scale of the target number t
and estimate its constituents. This procedure is analogously effectuated by our branch and bound on strict template ST_0 as found in Table 1. There is nothing obscure about this: positional number systems are based on the fact that any number can easily be rewritten as a sum of powers of the base (in this case 2). We use this trivial concept every day from the moment we learn elementary arithmetic (in this case base 10).
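To make the triviality concrete, a minimal sketch (our code, assuming S = SF_n as defined next): the binary expansion of t directly names its constituents.

def solve_powers_of_two(n, t):
    # Subset sum over SF_n = {2^0, ..., 2^n}: read the subset straight
    # off t's binary representation, in O(n) instead of O(2^n).
    if not 0 <= t <= 2 ** (n + 1) - 1:
        return None                          # t cannot be reached at all
    return [2 ** i for i in range(n + 1) if (t >> i) & 1]

print(solve_powers_of_two(7, 74))            # [2, 8, 64]: 74 = 2 + 8 + 64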
Next, we define the set of powers of two, ranging from 2^0 to 2^n, as the scale-free reference set SF_n:
Definition 4.3. The scale-free reference set SF_n = {2^i | 0 ≤ i ≤ n}.
Observe that we have Σ_{x∈SF_n} x = 2^{n+1} − 1. Note that we can interpret the set SF_n as a set of numbers with typical scales, and we can interpret Equation 1 as marking a scale transition: the scale of a number n is associated with the largest set of powers of two that sums up to a value smaller than n. Things get interesting with the following theorem:
Theorem 4.4. The set of sums associated with subsets of SF_n with maximal entropy is scale-free.
Proof: Take set SF_n and apply Lemma 4.1. A subset SF'_n ⊆ SF_n with maximal entropy has (1/2)n elements, but there is no element of any scale from SF_n that is guaranteed to be in SF'_n. By Equation 1, the scale of Σ_{x∈SF'_n} x is not fixed over the set of subsets of SF_n with maximal conditional information.
The down-to-earth version of Theorem 4.4 tells us that when we use uniform distributions to select subsets of SF_n with maximal entropy, the sums resulting from these sets can have any value. To put it even more simply: random numbers selected with reference to a scale-free set can have any value, large or small. The reverse of this result motivates the following observation:
Lemma 4.5. Let SF be a scale-free set, let t be a target value and let (SF, t) be a corresponding subset sum problem instance; then sc(t), the scale of t, gives no indication about the conditional complexity I(t|SF) of t.
Proof: Immediate consequence of Theorem 4.4.
We can now use Lemma 4.5 to generate instances of subset sum that, from an information theoretical perspective, are hard. Instead of using the low complexity set SF_n = {2^i | 0 ≤ i ≤ n}, we use a set with bit templates of every scale:
Definition 4.6. A randomized linear scale free set SF^r_n of binary numbers of order n is a set of n random binary numbers with, for each i ∈ N, i ≤ n, exactly one number with a representation of i bits.
Such sets:
- are still scale-free, so Lemma 4.5 applies: let SF^r_n be a randomized linear scale-free set and t a number in a reasonable range [0, (1/2)n(n+1)]; then the size of t does not give any information about its conditional complexity.
- have maximal entropy. We could use any random string of length (1/2)n(n+1) bits to generate a set with n elements by simply cutting the string in n segments of increasing length (see the sketch after this list).
- when used as a template, produce subset sum problem instances that are constraint-free. Every random string of sufficient length defines an instance of the problem class.
- have high average case complexity, by the same reasoning.
- contain random numbers, which consequently lie between 2^i and 2^{i+1} for some value of i. Thereby, the conditions of Equation 1 are not satisfied: the sums that can be constructed on the basis of subsets of SF^r_n will have holes and doubles on any scale. Differently put: yes- and no-instances for target values t will be mixed at every reasonable scale.
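The second bullet's construction could be sketched as follows (our naming; note that a segment may carry leading zeros, so the i-th number has at most i bits, whereas the strict templates of Section 2.2 force exactly i bits):

import random

def randomized_linear_scale_free_set(n):
    # Cut one random bitstring of length n(n+1)/2 into segments of
    # lengths 1, 2, ..., n and read each segment as a binary number.
    bits = "".join(random.choice("01") for _ in range(n * (n + 1) // 2))
    numbers, pos = [], 0
    for i in range(1, n + 1):
        numbers.append(int(bits[pos:pos + i], 2))
        pos += i
    return numbers

print(randomized_linear_scale_free_set(12))  # 12 numbers from 78 random bits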
With the definition of the concept of randomized lin-
ear scale free sets we have reached our goal of defin-
ing a class of subset sum problems that are hard from
an information theoretical point of view:
1. The sets SF^r_n have maximal entropy.
2. There are yes-instances (SF^r_n, t) and no-instances (SF^r_n, t_no) for which the numbers t and t_no have maximal conditional entropy given SF^r_n.
This result does not in itself prove that subset sum
problems based on scale-free templates are algorith-
mically hard, but it does explain the results of the
small experiment we report in this paper, and the piv-
otal role of the ST_0 set. We expect the phase transition
marked by scale-free sets to be more expressive when
larger examples are analyzed.
5 DISCUSSION
It seems that the distribution of informational bits
among the integers in a subset sum instance can play
a critical role in its experimental hardness. Each of
the 70 instances solved in this study had n = 12 integers and m = 78 input bits, without exception. But the distribution of informational bits over the input elements was very different, and that seems to make a huge difference.
Part of the hardness differentiation might be due to the intrinsically binary nature of our branch and bound algorithm, which decides on including or excluding a single element at a time. For very skewed information distributions, as found in the random instances generated by eccentric strict templates, it can thereby discard half the search tree from the root, effectively halving the runtime. One can even conceive of a situation in which every non-discarded subtree can also be split into directly discardable subtrees. For these numbers of bits and integers, that situation arises in the set {1, 2, 4, 8, ..., 2048}, a special 'maximal' instance from template ST_0, which is likely the easiest possible instance in our experiment. From an information theoretical perspective, however, scale-free non-strict templates can be used to generate hard subset sum instances.
6 FUTURE WORK
There has been a lot of research on instance
hardness for various NP-complete problems, such
as SAT, K-colorability, VLSI routing and rect-
angle packing (Cheeseman et al., 1991)(Mitchell
et al., 1992)(Mu and Hoos, 2015)(Kirkpatrick and
Selman, 1994)(Turner, 1988)(Hogg and Williams,
1994)(Venkatesan and Levin, 1988)(Jansen et al.,
2020)(van den Berg et al., 2016)(Pejic and van den
Berg, 2020).
For the Hamiltonian cycle problem, also NP-
complete, there seems to be an interesting similar-
ity with this study. Although in completely random
ensembles, the hardest instances are located in an
area around the Komlós-Szemerédi bound (Cheeseman et al., 1991)(van Horn et al., 2018)(Komlós and Szemerédi, 1983)(Sleegers and Berg, 2021), an uninformed heuristic search found the hardest prob-
lem instances in a very edge-dense region (Sleegers
and van den Berg, 2020b)(Sleegers and van den Berg,
2020a). The most interesting similarity in results
however, is that in this edge-dense region, the hard-
est and easiest Hamiltonian cycle problem instances
were very close together, separated by just one bit
of information. Could it be that the Hamiltonian cy-
cle instances in the edge-dense region have some-
thing in common with NP-complete subset sum in-
stances generated through flat templates? With some
imagination, both areas could be seen as ‘low entropy
plains', dense in solutions, but punctured with numerous very small, very deep holes in which very hard instances live: instances that have no solutions, and voraciously devour algorithmic recursions while trying to find one.
There is also some relevant work on instance hard-
ness for NP-hard optimization problems. Most rel-
evant might be studies on the asymmetric traveling
salesman problem. The standard deviation in the dis-
tance matrix generation procedure turned out to be
an unreliable predictor of instance hardness (Cheese-
man et al., 1991)(Sleegers et al., 2020), but a later study by Weixiong Zhang and Richard Korf solved their ATSP instances with a branch and bound algorithm similar, but not identical, to ours (Zhang and Korf, 1996). They didn't use (strict) templates, but did something very similar, drawing their matrix entries from {0, 1, 2, ..., r}. Their finding, and this is really interesting, is a 'complexity transition' similar but not identical to ours, with their explanation of the controlling factor being the number of distinct entries in the matrix. Are we again looking at instance entropy as a
predictive data analytic for instance hardness? Less
diverse entries in a subset sum instance mean fewer
different outcomes too, but the difference between
‘less’ and ‘fewer’ is so mysterious in this context that
we dare not speculate any further. We need more evi-
dence, so as for the future, it looks like we’ve got our
work cut out for us.
ACKNOWLEDGEMENTS
We would like to thank the IvI Complex Cyber Infras-
tructure (CCI) group and its director Cees de Laat for
their support.
REFERENCES
Adriaans, P. (2020a). A computational theory of meaning.
Advances in Info-Metrics: Information and Informa-
tion Processing across Disciplines, page 32.
Adriaans, P. (2020b). Information. In Zalta, E., edi-
tor, The Stanford Encyclopedia of Philosophy. Meta-
physics Research Lab, Stanford University, fall 2020
edition.
Adriaans, P. and Van Benthem, J. (2008). Philosophy of
information. Handbook of the Philosophy of Science,
(8).
Brickell, E. F. (1984). Solving low density knapsacks. In
Advances in cryptology, pages 25–37. Springer.
Cheeseman, P., Kanefsky, B., and Taylor, W. M. (1991).
Where the Really hard problems are. In IJCAI, vol-
ume 91, pages 331–340.
Daan van den Berg (2019). Refurbished version of Little et al.'s paper. http://www.heuristieken.nl/resources/(1963)Littleetal-refurbishedAlgorithmforTSP.pdf. Last accessed June 18th, 2021.
Eastman, W. L. (1958). Linear programming with pattern
constraints: a thesis. PhD thesis, Harvard University.
Garey, M. R. and Johnson, D. S. (1979). Computers and
intractability, volume 174. freeman San Francisco.
Hogg, T. and Williams, C. P. (1994). The hardest constraint
problems: A double phase transition. Artificial Intel-
ligence, 69(1-2):359–377.
Jansen, R., Vinkesteijn, Y., and van den Berg, D. (2020).
On the solvability of routing multiple point-to-point
paths in manhattan meshes. In Proceedings of the
2020 Genetic and Evolutionary Computation Confer-
ence Companion, pages 1685–1689.
Kemal Tural, M. (2020). On the hardness of almost all
subset sum problems by ordinary branch-and-bound.
arXiv, pages arXiv–2001.
Kirkpatrick, S. and Selman, B. (1994). Critical behavior in
the satisfiability of random boolean expressions. Sci-
ence, 264(5163):1297–1301.
Komlós, J. and Szemerédi, E. (1983). Limit distribution for the existence of hamiltonian cycles in a random graph. Discrete Mathematics, 43(1):55–63.
Lagarias, J. C. and Odlyzko, A. M. (1985). Solving low-
density subset sum problems. Journal of the ACM
(JACM), 32(1):229–246.
Land, A. and Doig, A. (1960). An automatic method of
solving discrete programming problems. Economet-
rica, 28(3):497–520.
Land, A. H. and Doig, A. G. (2010). An automatic method
for solving discrete programming problems. In 50
Years of Integer Programming 1958-2008, pages 105–
132. Springer.
Little, J. D., Murty, K. G., Sweeney, D. W., and Karel, C.
(1963). An algorithm for the traveling salesman prob-
lem. Operations Research, 11(6):972–989.
Mitchell, D., Selman, B., and Levesque, H. (1992). Hard
and easy distributions of SAT problems. In AAAI, vol-
ume 92, pages 459–465.
Mu, Z. and Hoos, H. H. (2015). On the empirical time
complexity of random 3-sat at the phase transition. In
Twenty-Fourth International Joint Conference on Ar-
tificial Intelligence.
Pejic, I. and van den Berg, D. (2020). Monte carlo
tree search on perfect rectangle packing problem in-
stances. In Proceedings of the 2020 Genetic and Evo-
lutionary Computation Conference Companion, pages
1697–1703.
Rossman, M. and Twery, R. (1958a). Combinatorial-programming. In Operations Research, volume 6, pages 634–634.
Rossman, M. and Twery, R. (1958b). A solution to the traveling salesman problem by combinatorial-programming. In Operations Research, volume 6, pages 897–897.
Skiena, S. S. (1998). The Algorithm Design Manual, page 247.
Sleegers, J. and Berg, D. v. d. (2021). Backtracking (the)
algorithms on the hamiltonian cycle problem. arXiv
preprint arXiv:2107.00314.
Sleegers, J., Olij, R., van Horn, G., and van den Berg, D.
(2020). Where the really hard problems aren’t. Oper-
ations Research Perspectives, 7:100160.
Sleegers, J. and van den Berg, D. (2020a). Looking for
the hardest hamiltonian cycle problem instances. In
IJCCI, pages 40–48.
Sleegers, J. and van den Berg, D. (2020b). Plant propa-
gation & hard hamiltonian graphs. In Evostar 2020
“The Leading European Event on Bio-Inspired Com-
putation”.
Turner, J. S. (1988). Almost all k-colorable graphs are easy
to color. Journal of algorithms, 9(1):63–82.
van den Berg, D., Braam, F., Moes, M., Suilen, E., and Bhu-
lai, S. (2016). Almost squares in almost squares: Solv-
ing the final instance. Data Analytics 2016, page 81.
van Horn, G., Olij, R., Sleegers, J., and van den Berg, D.
(2018). A predictive data analytic for the hardness of
hamiltonian cycle problem instances. Data Analytics
2018, page 101.
Venkatesan, R. and Levin, L. (1988). Random instances of
a graph coloring problem are hard. In Proceedings
of the twentieth annual ACM symposium on Theory of
computing, pages 217–222.
Zhang, W. and Korf, R. E. (1996). A study of complex-
ity transitions on the asymmetric traveling salesman
problem. Artificial Intelligence, 81(1-2):223–239.