Meanwhile, with the rise of machine learning and
artificial intelligence technologies, deep learning-
based models are widely used for music assessment.
These models can automatically learn and extract
high-dimensional features from music, providing
powerful tools for music similarity analysis,
sentiment classification, and style identification. For
example, Convolutional Neural Networks (CNNs)
have achieved remarkable results in music sentiment
classification, audio signal processing, and music
style classification.
However, existing assessment models still have limitations
in dealing with complex musical features, especially
in terms of emotional expression, structural
complexity, and cross-cultural diversity. The research
motivation of this paper is to explore the limitations
of existing music assessment methods and suggest
directions for improvement. The research framework
consists of the following parts: first, it introduces
the main considerations in music assessment, such as
emotion, tempo, and similarity; next, it discusses
the quantitative metrics used in recent years, which
analyse music by extracting features such as pitch
and harmony; it then introduces the typical models
used for music assessment and their applications in
emotion recognition, recommender systems, and
categorization; and finally, it analyses the
limitations of current methods and looks ahead to
possible improvements, such as the introduction of
machine learning and multimodal analysis.
2 DESCRIPTION OF MUSIC
EVALUATION
Music evaluation involves analysing various musical
elements to assess and categorize music, focusing on
aspects such as emotion, rhythm, and similarity.
These key considerations are vital in understanding
how music affects listeners and how it can be
quantitatively measured for various applications,
including recommendation systems, automated
composition, and emotional recognition.
One of the primary factors in music evaluation is
emotion. Music has the power to evoke a wide range
of emotions, from joy to sadness, and researchers
have long focused on developing methods to quantify
these emotional responses. Studies have shown that
specific musical features such as tempo, key, and
mode significantly influence emotional expression.
Major keys and fast tempos are often associated with
positive emotions, while minor keys and slower
tempos may evoke sadness or melancholy. However,
emotion recognition is not without its challenges.
Human emotions are complex and multifaceted, and
a single piece of music may evoke different emotions
in different listeners depending on their personal
experiences or cultural background. Moreover, the
same musical features may be interpreted differently
across genres. For instance, a minor key in classical
music is often associated with sadness, while in jazz
or blues, it may convey a sense of sophistication or
reflection. Emotion recognition models need to
account for such cross-cultural and genre-specific
differences to make more accurate predictions. In
particular, deep learning models such as CNNs and
Recurrent Neural Networks (RNNs) have been
employed to better capture the nuances of emotional
expression in music (Lin & Qi, 2018).
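The tempo/mode associations described above can be illustrated as a coarse rule-based classifier; a minimal sketch in Python, where the function name, the 100-BPM threshold, and the emotion labels are illustrative assumptions rather than values from the literature:

```python
def heuristic_emotion(tempo_bpm, mode):
    """Map two coarse musical features to an emotion label.

    tempo_bpm: estimated tempo in beats per minute.
    mode: "major" or "minor".
    The 100-BPM fast/slow threshold is an illustrative
    assumption, not an established constant.
    """
    fast = tempo_bpm >= 100
    if mode == "major":
        return "happy/excited" if fast else "calm/content"
    return "angry/tense" if fast else "sad/melancholic"
```

A fast major-key piece, e.g. `heuristic_emotion(140, "major")`, maps to "happy/excited", while a slow minor-key piece maps to "sad/melancholic". Such hand-written rules ignore the listener- and genre-dependent effects noted above, which is precisely the gap that learned models such as CNNs and RNNs aim to fill.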
Another crucial element in music evaluation is
rhythm. Rhythm refers to the timing and arrangement
of sounds and silences in a piece of music. It plays a
critical role in defining the structure and flow of
music, influencing how it is perceived by listeners.
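One simple way to quantify rhythmic steadiness is to measure the variability of the intervals between successive beats; a minimal sketch, assuming a list of beat timestamps in seconds (the coefficient-of-variation metric and the function name are illustrative choices, not a standard from the literature):

```python
import statistics

def tempo_consistency(beat_times):
    """Score tempo stability from beat timestamps (seconds).

    Returns the coefficient of variation (stdev / mean) of the
    inter-beat intervals: 0.0 means a perfectly steady tempo,
    and larger values indicate more tempo drift.
    """
    intervals = [b - a for a, b in zip(beat_times, beat_times[1:])]
    mean = statistics.mean(intervals)
    return statistics.stdev(intervals) / mean

# A perfectly steady 120-BPM click (0.5 s between beats) scores 0.0.
steady = [0.0, 0.5, 1.0, 1.5, 2.0]
```

In practice the beat timestamps themselves would come from a beat-tracking algorithm applied to the audio; the score above only summarizes how evenly those beats are spaced.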
Researchers have explored various metrics to
evaluate rhythm, such as beat alignment, tempo
consistency, and syncopation. In addition to its role in
music perception, rhythm is also a key indicator of
technical skill. In genres like jazz or classical music,
the ability to maintain complex polyrhythms or
perform intricate syncopations is often associated
with mastery. In contrast, genres like electronic dance
music (EDM) emphasize steady, consistent rhythms,
where tempo stability is paramount. Rhythm-based
evaluation tools help in understanding both the
aesthetic and technical aspects of rhythm across
genres.
In the realm of music evaluation, similarity
refers to the degree of resemblance between different
musical pieces. It plays a crucial role in various
applications, such as music recommendation systems,
automatic composition, and genre classification.
Music similarity is often analysed based on features
like melody, harmony, rhythm, timbre, and structure.
This section focuses on the different methods used to
quantify musical similarity and their applications.
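One of the simplest such measures, the Euclidean distance between pitch sequences, can be sketched as follows (MIDI note numbers and equal-length sequences are assumed here purely for illustration):

```python
import math

def melodic_distance(pitches_a, pitches_b):
    """Euclidean distance between two equal-length pitch sequences.

    Pitches are MIDI note numbers; a smaller distance means more
    similar melodies. Equal length is assumed for this sketch;
    real systems typically align the sequences first.
    """
    if len(pitches_a) != len(pitches_b):
        raise ValueError("sequences must have equal length")
    return math.sqrt(sum((a - b) ** 2
                         for a, b in zip(pitches_a, pitches_b)))

# Identical melodies are at distance 0; raising one note by a
# semitone increases the distance by exactly 1.
print(melodic_distance([60, 62, 64], [60, 62, 64]))  # 0.0
print(melodic_distance([60, 62, 64], [60, 62, 65]))  # 1.0
```

The equal-length assumption is the measure's main weakness: melodies of different lengths, or the same melody shifted in time or key, require alignment or transposition-invariant representations before such a distance is meaningful.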
Melodic similarity is one of the most
fundamental aspects of music comparison. It involves
analysing the sequence of pitches in a melody to
determine how closely two musical pieces align.
Traditional methods for measuring melodic similarity
rely on calculating the Euclidean distance between
pitch sequences. For instance, two melodies with
similar pitch contours would exhibit a shorter
Euclidean distance between their note sequences,
indicating higher similarity. However, this method