
other composers. The rest were misclassified as another
composer. This produced a higher average than with the
balanced dataset, even though the balanced dataset gave
better results for the other composers, as seen in
Table 5. This shows that Szeto and Wong’s representation
was more prone to bias with imbalanced datasets.
In contrast, our approach was much less prone to
bias, classifying every composer most often as themselves
rather than as another composer. This is seen in
Table 2 and Table 4, where the diagonal entry should
be the highest value in its row.
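The diagonal criterion described above can be checked mechanically. The following is a minimal sketch only: the function name is hypothetical and the matrix holds placeholder values for illustration, not the values reported in Table 2 or Table 4.

```python
def rows_with_dominant_diagonal(matrix):
    """For each row, report whether the diagonal entry is the strict row maximum."""
    flags = []
    for i, row in enumerate(matrix):
        off_diagonal = [value for j, value in enumerate(row) if j != i]
        flags.append(all(row[i] > value for value in off_diagonal))
    return flags


# Placeholder values for illustration only (NOT taken from the paper's tables).
toy_matrix = [
    [0.70, 0.20, 0.10],
    [0.15, 0.60, 0.25],
    [0.45, 0.15, 0.40],  # here the diagonal is not the largest entry
]
flags = rows_with_dominant_diagonal(toy_matrix)
```

A row flagged False corresponds to a class that is classified as some other class more often than as itself, which is the kind of bias discussed above.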
8 DISCUSSION
A look at the periods in which the composers were
most active might offer some insight into the
misclassifications reported by the GNN. Chopin and
Liszt were most active in the Romantic era, Schubert
was active from the late Classical to the early Romantic
era, and Scarlatti and Bach were active in the Baroque
era. The era in which a composer was most active
tends to correlate with the misclassifications of the
GNN. That is to say, Chopin, Liszt and Schubert were
more likely to be confused with one another because
their work falls in the Romantic era, and similarly for
Scarlatti and Bach in the Baroque era. This might
suggest that the GNN is identifying compositional
choices common to those eras, but further research
would be required to confirm this.
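One way to quantify this observation is to measure what share of the misclassifications stay within the true composer's era. The sketch below is illustrative only: the function name is hypothetical, Schubert is grouped with the Romantic composers as a simplification that follows the grouping used in the paragraph above, and the confusion counts are placeholder values, not results from this paper.

```python
# Era grouping following the discussion above; placing Schubert under
# "Romantic" is a simplifying assumption.
ERA = {
    "Chopin": "Romantic",
    "Liszt": "Romantic",
    "Schubert": "Romantic",
    "Scarlatti": "Baroque",
    "Bach": "Baroque",
}


def within_era_error_share(matrix, labels, era=ERA):
    """Fraction of off-diagonal (misclassified) mass that stays within the true era."""
    same_era = 0.0
    total = 0.0
    for i, true_label in enumerate(labels):
        for j, predicted_label in enumerate(labels):
            if i == j:
                continue  # correct classifications are not errors
            total += matrix[i][j]
            if era[true_label] == era[predicted_label]:
                same_era += matrix[i][j]
    return same_era / total if total else 0.0


labels = ["Chopin", "Liszt", "Schubert", "Scarlatti", "Bach"]
# Placeholder counts for illustration only (rows: true, columns: predicted).
toy_counts = [
    [8, 1, 1, 0, 0],
    [2, 7, 1, 0, 0],
    [1, 1, 7, 1, 0],
    [0, 0, 0, 8, 2],
    [0, 0, 1, 1, 8],
]
share = within_era_error_share(toy_counts, labels)
```

A share close to 1 would be consistent with era-level confusion, although it would not by itself show that the GNN has learned era-specific compositional choices.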
9 CONCLUSION
We have demonstrated a graph representation for
symbolic music that, when used with a GNN, outperforms
other representations at the task of composer
classification. We believe that our representation's
focus on the relationships between notes provides the
GNN with a more intuitive understanding of what is
important within a composition.
An issue with our representation can be seen in the
way edges are formed for the initial graph representation.
With sequential edges connecting a note to the
next nearest note, there is a bias towards the x-axis.
This is because the x and y axes are weighted equally
in terms of distance. A note that is one beat away at
the same pitch is seen to be nearer than a note that is
a quarter beat away but an octave higher, as seen
in Figure 2. Further work is required to address this
edge connection issue.
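The bias can be illustrated numerically. The sketch below assumes that the x-axis is onset time in beats, the y-axis is pitch in semitones (MIDI note numbers), and distance is Euclidean with both axes weighted equally. These units and the function name are assumptions made for illustration and are not confirmed by the text above.

```python
import math


def note_distance(onset_a, pitch_a, onset_b, pitch_b):
    """Euclidean distance with time (beats) and pitch (semitones) weighted equally."""
    return math.hypot(onset_b - onset_a, pitch_b - pitch_a)


# Reference note: onset at beat 0, MIDI pitch 60 (an assumed example value).
same_pitch_next_beat = note_distance(0.0, 60, 1.0, 60)
octave_up_quarter_beat = note_distance(0.0, 60, 0.25, 72)

# Under these assumptions the later note at the same pitch is treated as nearer.
nearer_is_same_pitch = same_pitch_next_beat < octave_up_quarter_beat
```

Under these assumptions the octave leap contributes twelve units on the pitch axis, so it dominates the distance even though the note is closer in time.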
REFERENCES
Collins, T., Arzt, A., Flossmann, S., and Widmer, G. (2013).
SIARCT-CFP: Improving precision and the discovery of
inexact musical patterns in point-set representations.
In ISMIR, pages 549–554.
Collins, T. and Meredith, D. (2013). Maximal transla-
tional equivalence classes of musical patterns in point-
set representations. In Mathematics and Computation
in Music: 4th International Conference, MCM 2013,
Montreal, QC, Canada, June 12-14, 2013. Proceed-
ings 4, pages 88–99. Springer.
Collins, T., Thurlow, J., Laney, R., Willis, A., and
Garthwaite, P. (2010). A comparative evaluation of
algorithms for discovering translational patterns in
baroque keyboard works. Proceedings of the 11th
International Society for Music Information Retrieval
Conference, ISMIR 2010.
Conklin, D. and Witten, I. H. (1995). Multiple viewpoint
systems for music prediction. Journal of New Music
Research, 24(1):51–73.
Corrêa, D. C. and Rodrigues, F. A. (2016). A survey on
symbolic data-based music genre classification. Expert
Systems with Applications, 60:190–210.
Forth, J. and Wiggins, G. A. (2009). An approach for identi-
fying salient repetition in multidimensional represen-
tations of polyphonic music. London Algorithmics
2008.
Hamilton, W. L., Ying, R., and Leskovec, J. (2017). Induc-
tive representation learning on large graphs. CoRR,
abs/1706.02216.
Jeong, D., Kwon, T., Kim, Y., and Nam, J. (2019). Graph
neural network for music score data and modeling ex-
pressive piano performance. In International confer-
ence on machine learning, pages 3060–3070. PMLR.
Karystinaios, E. and Widmer, G. (2022). Cadence detec-
tion in symbolic classical music using graph neural
networks. arXiv preprint arXiv:2208.14819.
Karystinaios, E. and Widmer, G. (2023). Roman numeral
analysis with graph neural networks: Onset-wise pre-
dictions from note-wise features. arXiv preprint
arXiv:2307.03544.
Kong, Q., Li, B., Chen, J., and Wang, Y. (2022). GiantMIDI-Piano:
A large-scale MIDI dataset for classical piano
music.
Lemström, K. and Pienimäki, A. (2007). On comparing edit
distance and geometric frameworks in content-based
retrieval of symbolically encoded polyphonic music.
Musicae Scientiae, 11(1 suppl):135–152.
Li, X., Ji, G., and Bilmes, J. A. (2006). A factored language
model of quantized pitch and duration. In ICMC. Cite-
seer.
Meredith, D. (2013). COSIATEC and SIATECCompress: Pattern
discovery by geometric compression. In International
society for music information retrieval conference. In-
ternational Society for Music Information Retrieval.
Meredith, D. (2016). Using SIATECCompress to discover re-
peated themes and sections in polyphonic music. In
Music Information Retrieval Evaluation Exchange.
KDIR 2025 - 17th International Conference on Knowledge Discovery and Information Retrieval