Table 5 shows that, under ρ = 0.01 and 0.80 ≤
τ ≤ 0.90, we find one correlated mutation, which
involves only S, with positions from 109 to 154.
Position 144 also appears among the spike protein
substitutions in Table 1.
3.4 Correlated Mutation for Omicron Variant Under ρ = 0.10
Next, fixing ρ = 0.10, Table 6 lists the correlated
mutations found for the Omicron variant as τ
decreases from 0.70 to 0.20 in steps of 0.05.
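The threshold sweep described above can be written as a small helper. This is a minimal sketch and not part of FINDCM: the function name tau_grid is hypothetical, and the use of exact rational arithmetic is a choice made here so that repeated 0.05 steps do not accumulate floating-point error.

```python
from fractions import Fraction


def tau_grid(start="0.70", stop="0.20", step="0.05"):
    """Return the thresholds from start down to stop (inclusive).

    The bounds are given as decimal strings so that Fraction can
    represent them exactly; each value is converted to float on output.
    """
    start, stop, step = Fraction(start), Fraction(stop), Fraction(step)
    grid = []
    tau = start
    while tau >= stop:
        grid.append(float(tau))
        tau -= step
    return grid


# The eleven values of tau used for the sweep under a fixed rho.
taus = tau_grid()
```

Each value in taus would then be passed, together with the fixed ρ, to whatever procedure finds the correlated mutations.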
Table 6: The correlated mutations for the Omicron variant
obtained by varying τ from 0.70 to 0.20 decreasing by 0.05
under ρ = 0.10.
τ          id   correlated mutation
0.70       CM1  ⟨S⟩289, ⟨N⟩18
0.65–0.55  CM1  ⟨S⟩289, ⟨N⟩18
           CM2  ⟨S⟩222, ⟨N⟩215
0.50       CM1  ⟨S⟩289, ⟨N⟩18
           CM2  ⟨S⟩112, ⟨S⟩222, ⟨N⟩215
0.45       CM1  ⟨S⟩5, ⟨S⟩289, ⟨S⟩809, ⟨S⟩1104,
                ⟨S⟩1264, ⟨N⟩9, ⟨N⟩18, ⟨N⟩63
           CM2  ⟨S⟩112, ⟨S⟩222, ⟨N⟩215
           CM3  ⟨S⟩95, ⟨S⟩142
0.40–0.35  CM4  ⟨S⟩5, ⟨S⟩112, ⟨S⟩222, ⟨S⟩289, ⟨S⟩809,
                ⟨S⟩1104, ⟨S⟩1264, ⟨N⟩9, ⟨N⟩18,
                ⟨N⟩63, ⟨N⟩215
           CM3  ⟨S⟩95, ⟨S⟩142
0.30–0.20  CM5  ⟨S⟩5, ⟨S⟩95, ⟨S⟩112, ⟨S⟩142, ⟨S⟩222,
                ⟨S⟩289, ⟨S⟩809, ⟨S⟩1104, ⟨S⟩1264,
                ⟨N⟩9, ⟨N⟩18, ⟨N⟩63, ⟨N⟩215
Table 6 shows that every correlated mutation other
than CM3 contains positions of both S and another
structural protein, N. The correlated mutation CM3
consists of positions of S only, and it contains the
positions of spike protein substitutions.
Table 6 also shows that the correlated mutation
CM4 is the union of CM1 and CM2 as found at
τ = 0.45, and that the correlated mutation CM5 is
the union of CM3 and CM4. Hence, we can regard
CM5 as the convergence of the other correlated
mutations found.
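The relationship among CM1 to CM5 can be checked directly from the entries of Table 6. The following is a minimal sketch that encodes each position as a (protein, index) pair; the helper pos and the variable names are introduced here for illustration only, and CM1 and CM2 are taken as listed at τ = 0.45.

```python
def pos(protein, *indices):
    """Build a set of (protein, index) pairs for one structural protein."""
    return {(protein, i) for i in indices}


# Correlated mutations copied from Table 6 (rho = 0.10).
CM1 = pos("S", 5, 289, 809, 1104, 1264) | pos("N", 9, 18, 63)   # tau = 0.45
CM2 = pos("S", 112, 222) | pos("N", 215)                        # tau = 0.45
CM3 = pos("S", 95, 142)
CM4 = (pos("S", 5, 112, 222, 289, 809, 1104, 1264)
       | pos("N", 9, 18, 63, 215))
CM5 = (pos("S", 5, 95, 112, 142, 222, 289, 809, 1104, 1264)
       | pos("N", 9, 18, 63, 215))

# CM4 merges CM1 and CM2; CM5 merges CM3 and CM4.
cm4_is_union = CM4 == (CM1 | CM2)
cm5_is_union = CM5 == (CM3 | CM4)
```

Both comparisons evaluate to True for the table as printed, which is the sense in which CM5 collects every position found at the higher thresholds.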
On the other hand, only two positions of spike
protein substitutions occur in the correlated
mutations, which is a small number. Finding
correlated mutations that contain more positions of
spike protein substitutions is therefore future work.
4 CONCLUSION
In this paper, we have found correlated mutations
of positions among structural proteins in the amino
acid sequences of the SARS-CoV-2 Delta and Omicron
variants by using the algorithm FINDCM designed by
Shimada et al. (2012). As a result, we have obtained
correlated mutations that contain positions from
several structural proteins and positions occurring
in the spike protein substitutions of the SARS-CoV-2
Delta and Omicron variants.
In particular, we have found the correlated mutation
CM5 in Table 6 as the convergence of several
correlated mutations containing the positions in the
spike protein substitutions. On the other hand, it is
future work to investigate, from a genomic viewpoint,
the positions outside the spike protein in CM2 at
τ = 0.80 in Table 4 and in CM5 in Table 6.
The algorithm FINDCM is based on the set
enumeration algorithm (Rymon, 1992). It is therefore
future work to design an algorithm for finding
correlated mutations that is based on another
enumeration algorithm and that introduces other
thresholds like τ and ρ.
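To illustrate the kind of enumeration referred to here, the following is a minimal sketch of a generic set-enumeration tree traversal. It is not FINDCM: this section does not specify how FINDCM applies τ and ρ, so the predicate keep is a hypothetical stand-in for such a test, and pruning on it is only sound when the predicate is anti-monotone (once a set fails, all of its supersets fail).

```python
def se_tree(items, keep, prefix=(), start=0):
    """Yield the subsets of items visited by a set-enumeration tree.

    Each node extends its parent only with items that come later in the
    fixed order of items, so no subset is generated twice. When keep()
    rejects a node, the node and the whole subtree below it are skipped.
    """
    for i in range(start, len(items)):
        node = prefix + (items[i],)
        if keep(node):
            yield node
            yield from se_tree(items, keep, node, i + 1)


# Hypothetical usage: enumerate position sets of size at most two.
positions = ["S95", "S142", "N18"]
found = list(se_tree(positions, keep=lambda s: len(s) <= 2))
```

In this toy run, found holds the three singletons and the three pairs, while the full three-element set is pruned. Replacing the traversal order or the pruning test is the kind of design change that an alternative enumeration algorithm would make.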
Although we have found correlated mutations that
involve the positions in the spike protein
substitutions, their number is small, in particular
for the Delta variant. Moreover, the algorithm
FINDCM finds all of the correlated mutations under
given τ and ρ, whereas what is needed is to find the
correlated mutations that involve the positions in
the spike protein substitutions directly and
efficiently. Hence, it is future work to design a new
algorithm for finding the correlated mutations that
contain several given positions, such as the
positions in the spike protein substitutions, which
may be more efficient than FINDCM.
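For comparison, the straightforward baseline is to enumerate all correlated mutations first and then filter them by the required positions. The following is a minimal sketch of that post-filtering step only, using some entries of Table 6 as input; the function name containing and the dictionary layout are assumptions made here, and this is not the more efficient algorithm proposed as future work.

```python
def containing(correlated_mutations, required):
    """Return ids of correlated mutations that include every required position."""
    required = set(required)
    return [cm_id for cm_id, positions in correlated_mutations.items()
            if required <= positions]


# A few correlated mutations from Table 6, as (protein, index) pairs.
table6 = {
    "CM1": {("S", 289), ("N", 18)},                       # tau = 0.70
    "CM2": {("S", 222), ("N", 215)},                      # tau = 0.65-0.55
    "CM3": {("S", 95), ("S", 142)},
    "CM5": {("S", 5), ("S", 95), ("S", 112), ("S", 142), ("S", 222),
            ("S", 289), ("S", 809), ("S", 1104), ("S", 1264),
            ("N", 9), ("N", 18), ("N", 63), ("N", 215)},
}

# Keep only the correlated mutations that contain both given S positions.
hits = containing(table6, {("S", 95), ("S", 142)})
```

Here hits contains CM3 and CM5. A direct algorithm would aim to reach these without first enumerating correlated mutations such as CM1 and CM2 that cannot contain the given positions.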
REFERENCES
Chen, K.-W. K., Huang, D. T.-N., and Huang, L.-M. (2022).
SARS-CoV-2 variants – evolution, spike protein, and
vaccines. Biomed. J., 45:573–579.
Cover, T. M. and Thomas, J. A. (2006). Elements of
information theory (Second edition). John Wiley & Sons.
Jeong, C. and Kim, D. (2010). Linear predictive coding
representation of correlated mutation for protein
sequence alignment. BMC Bioinform., 11:52.
Kim, D., Lee, J.-Y., Yang, J.-S., Kim, J. W., Kim, V. N., and
Chang, H. (2020). The architecture of SARS-CoV-2
transcriptome. Cell, 181.
Lee, B.-C. and Kim, D. (2009). A new method for
revealing correlated mutations under the structural and
functional constraints in proteins. Bioinform.,
25:2506–2513.
ICPRAM 2023 - 12th International Conference on Pattern Recognition Applications and Methods