als. And for this problem, the constraint violations
increase quadratically instead of linearly, becoming
infeasible from n = 12 already (Verduin et al., 2023b;
Verduin et al., 2023a). But things could be better, too.
For a problem such as TSP, uniform random sampling
can be done in deterministic linear time (Θ(n)):
1. Start with a full list of unpicked cities, and an
empty tour.
2. Pick a random city from the list of unpicked
cities, and add it to the tour.
3. Delete that city from the list of unpicked cities.
4. If the list of unpicked cities is not empty, go back
to 2.
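The four steps above can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function name `sample_tour` and the representation of cities as integers 0..n-1 are assumptions. To keep each deletion constant-time, the sketch overwrites the picked entry with the last list element before shrinking the list, which is one possible way to realize step 3.

```python
import random

def sample_tour(n, rng=random):
    """Return a uniformly random tour over cities 0..n-1 (hypothetical helper)."""
    unpicked = list(range(n))  # step 1: full list of unpicked cities
    tour = []                  # step 1: empty tour
    while unpicked:            # step 4: repeat until no city is left
        i = rng.randrange(len(unpicked))  # step 2: pick a random position
        tour.append(unpicked[i])          # step 2: add that city to the tour
        # step 3: delete in O(1) by moving the last element into slot i
        unpicked[i] = unpicked[-1]
        unpicked.pop()
    return tour

if __name__ == "__main__":
    print(sample_tour(8))
```

Each iteration draws uniformly from the remaining cities and does a constant amount of work, so the loop runs exactly n times.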
While this algorithm might be considered too triv-
ial to explicitly write down, it is important to realize
that this method produces a uniformly random valid
TSP-solution in deterministic linear time. Point 3 is
important in this sense. Many programmers opt for
a boolean ‘picked marker’ for each city, and sim-
ply pick a new random city when an already chosen
city is accidentally picked. This will work for up to
very large instances without any noticeable delays,
and might even stochastically improve runtimes, as
deletion from a data structure such as an array is ex-
tremely expensive compared to flipping a bit in a list
of boolean picked markers.
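For comparison, the 'picked marker' variant described above might look like the sketch below. The name `sample_tour_with_markers` and the returned draw counter are illustrative assumptions. The result is still a uniformly random tour, but the number of random draws is no longer fixed: by the standard coupon-collector argument, it grows on the order of n ln n in expectation.

```python
import random

def sample_tour_with_markers(n, rng=random):
    """Resample on collision; returns (tour, number_of_draws)."""
    picked = [False] * n   # one boolean 'picked marker' per city
    tour = []
    draws = 0
    while len(tour) < n:
        city = rng.randrange(n)  # draw from ALL cities, picked or not
        draws += 1
        if picked[city]:
            continue             # already chosen: simply draw again
        picked[city] = True      # flip the marker instead of deleting
        tour.append(city)
    return tour, draws

if __name__ == "__main__":
    tour, draws = sample_tour_with_markers(20)
    print(tour, draws)
```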
So such an algorithm exists for TSP, but does a uniform-
probability, linear-time sampling algorithm also exist for
HP-protein folding? We do not think so. The best
we can do (for now) appears to be randomly sampling
a conformation by assigning all aminos a random di-
rection ∈ {left, right, straight}, relative to the
chain. Although this does guarantee that all confor-
mations have equal probability, it also includes a lot
of invalid conformations with colliding aminos. The
obvious solution is just to resample a few times until
we pick a valid solution, but how feasible is this ap-
proach as instances get bigger? Not very feasible, it
seems. For now, the race is on to find a determinis-
tic polynomial time uniform sampling method, which
might or might not exist. For the future of this prob-
lem, it is of utmost importance.
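The resampling approach described above can be sketched as follows. Several details are assumptions made for illustration only: a 2D square lattice, a fixed initial heading, one relative direction drawn per added amino, and the hypothetical names `random_conformation` and `sample_valid`. Because every direction sequence is equally likely and colliding ones are discarded, each valid conformation is returned with equal probability. The number of tries needed is not bounded in advance.

```python
import random

# Rotations of a heading (dx, dy) on the 2D square lattice.
TURN = {
    "straight": lambda dx, dy: (dx, dy),
    "left": lambda dx, dy: (-dy, dx),
    "right": lambda dx, dy: (dy, -dx),
}

def random_conformation(n, rng=random):
    """Place n aminos with random relative directions; None on collision."""
    pos, heading = (0, 0), (1, 0)
    occupied = {pos}
    coords = [pos]
    for _ in range(n - 1):
        direction = rng.choice(("left", "right", "straight"))
        heading = TURN[direction](*heading)
        pos = (pos[0] + heading[0], pos[1] + heading[1])
        if pos in occupied:
            return None  # colliding aminos: invalid conformation
        occupied.add(pos)
        coords.append(pos)
    return coords

def sample_valid(n, max_tries=100_000, rng=random):
    """Resample until a valid conformation appears; returns (coords, tries)."""
    for tries in range(1, max_tries + 1):
        coords = random_conformation(n, rng)
        if coords is not None:
            return coords, tries
    return None, max_tries

if __name__ == "__main__":
    for n in (10, 20, 40):
        _, tries = sample_valid(n)
        print(n, tries)
```

Running the loop at the bottom for growing n gives a rough feel for how quickly the number of rejected samples rises.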
REFERENCES
Alberts, B., Johnson, A., Lewis, J., Raff, M., Roberts, K.,
Walter, P., and Chaffey, N. (2003). Molecular biology
of the cell. 4th ed. Oxford University Press.
Anonymous (2023). Repository containing source
material: https://anonymous.4open.science/r/PF-
sampling/README.md.
Atari, M. and Majd, N. (2022). 2d hp protein folding us-
ing quantum genetic algorithm. In 2022 27th Inter-
national Computer Conference, Computer Society of
Iran (CSICC), pages 1–8.
Berger, B. and Leighton, T. (1998). Protein folding in the
hydrophobic-hydrophilic (HP) model is NP-complete. In
Proceedings of the second annual international confer-
ence on Computational molecular biology, pages 30–
39.
Boumedine, N. and Bouroubi, S. (2021). A new hybrid
genetic algorithm for protein structure prediction on
the 2d triangular lattice. Turkish Journal of Electrical
Engineering and Computer Sciences, 29(2):499–513.
Bui, T. N. and Sundarraj, G. (2005). An efficient genetic al-
gorithm for predicting protein tertiary structures in the
2d hp model. In Proceedings of the 7th Annual Con-
ference on Genetic and Evolutionary Computation,
GECCO ’05, page 385–392, New York, NY, USA. As-
sociation for Computing Machinery.
Creighton, T. E. (1988). The protein folding problem. Sci-
ence, 240(4850):267–267.
Custódio, F. L., Barbosa, H. J. C., and Dardenne, L. E.
(2004). Investigation of the three-dimensional lattice
hp protein folding model using a genetic algorithm.
Genetics and Molecular Biology, 27(4):611–615.
Dill, K. A. (1985). Theory for the folding and stability of
globular proteins. Biochemistry, 24(6):1501–1509.
Dill, K. A. and MacCallum, J. L. (2012). The
protein-folding problem, 50 years on. Science,
338(6110):1042–1046.
Dobson, C. M. (1999). Protein misfolding, evolution and
disease. Trends in biochemical sciences, 24(9):329–
332.
Dondapati, S. K., Stech, M., Zemella, A., and Kubick,
S. (2020). Cell-free protein synthesis: a promis-
ing option for future drug development. BioDrugs,
34(3):327–348.
Eiben, A. E. and Smith, J. E. (2015). Introduction to evolu-
tionary computing. Springer.
Garza-Fabre, M., Rodriguez-Tello, E., and Toscano-Pulido,
G. (2015). Constraint-handling through multi-
objective optimization: The hydrophobic-polar model
for protein structure prediction. Computers & Opera-
tions Research, 53:128–153.
Hart, W. E. and Istrail, S. (1997). Robust proofs of np-
hardness for protein folding: general lattices and en-
ergy potentials. Journal of Computational Biology,
4(1):1–22.
Kaytakoğlu, S. and Akyalçın, L. (2007). Optimization of
parametric performance of a pemfc. International
Journal of Hydrogen Energy, 32(17):4418–4423.
Kirkpatrick, S., Gelatt, C. D., and Vecchi, M. P. (1983).
Optimization by simulated annealing. Science,
220(4598):671–680.
Lau, K. F. and Dill, K. A. (1989). A lattice statistical me-
chanics model of the conformational and sequence
spaces of proteins. Macromolecules, 22(10):3986–
3997.
Can HP-protein Folding Be Solved with Genetic Algorithms? Maybe not