Sentimental Analysis on YouTube Video Platform
Meda Sai Harini Rekha¹, L. Rajeswari¹, V. Leena Parimala², B. Charitha¹ and A. Divya Hari Chandana¹
¹Department of CSE(AI), Ravindra College of Engineering for Women, Pasupula Village, Nandikotkur Road 518002, Andhra Pradesh, India
²Department of CSE, Ravindra College of Engineering for Women, Pasupula Village, Nandikotkur Road 518002, Andhra Pradesh, India
Keywords: YouTube Analytics, Machine Learning, Audience Engagement, Video Performance, NLP, Content Trends,
Data‑Driven Framework.
Abstract: YouTube is increasingly used as a platform for content creation and consumption, which makes it necessary to identify user behavior, content trends, and engagement patterns. This study presents a machine learning approach that uses YouTube data to provide insight into video performance, audience preferences, and channel growth strategies. Features such as views, likes, comments, watch time, and video metadata are extracted and used to predict key measures of video success, engagement rates, and audience retention. The system applies clustering techniques to define content categories and user behavior patterns, while regression and classification models rely on historical data to predict the success of a video. Text analysis of user comments reveals viewers' sentiments, which helps creators improve their content strategies. Deep learning methods from Natural Language Processing (NLP) are applied to keyword optimization based on video descriptions and titles. To validate the approach, we run experiments that show the models' usefulness for content creators, advertisers, and platform administrators. The resulting data-driven framework supports the optimization of content creation, with the aim of enhancing viewer engagement and improving the overall user experience on YouTube.
1 INTRODUCTION
YouTube has become one of the largest platforms for video content creation and consumption, attracting billions of users worldwide and growing rapidly. This rapid expansion makes YouTube an attractive avenue for business owners and media managers seeking effective advertising structures. Analyzing its data is also a means of working toward a universal rating system for videos, making videos easier to find through search queries, and automatically generating tags for videos.
This article investigates the application of artificial intelligence (AI) techniques to YouTube data in order to discover patterns and forecast statistics such as the number of times a video is watched, user retention, and engagement. Based on the results, we propose a model to assist content creators and platform administrators in making data-driven decisions. Our research also applies Natural Language Processing (NLP) to improve content through keyword analysis of video descriptions and titles. The video metadata includes details such as location and language. Regarding user interactions, likes, comments, watch time, and sentiment feedback provide information on users' perception and engagement. We can also compare the description of a target video with that of another video in the same category; the correlation between the textual features of the two videos helps us draw sound conclusions about their similarities and differences.
A growing population has placed young people in a competitive environment. YouTube is a space for e-learning that presents inexpensive, good-quality course materials from educational institutions. Young learners value the free content available on YouTube because it costs far less than joining coaching institutes. Some educational tutorial series or marathons are accessible only on YouTube, and a student may prefer this type of content to others based on prior knowledge of their own situation. Students who have already watched a series or marathon, whether beginners, intermediate learners, or professionals, share their thoughts on the quality and usefulness of the video. The tone of the comments posted by viewers, the number of likes, and the number of views are all factors in the assessment. By combining comment sentiment, number of comments, number of views, and number of likes, the project can identify the best videos on YouTube and enable a personalized, ranked ordering of results. We collected video data, including comments, comment counts, likes, and views, with the help of the YouTube API.
Rekha, M. S. H., Rajeswari, L., Parimala, V. L., Charitha, B. and Chandana, A. D. H.
Sentimental Analysis on YouTube Video Platform.
DOI: 10.5220/0013874400004919
Paper published under CC license (CC BY-NC-ND 4.0)
In Proceedings of the 1st International Conference on Research and Development in Information, Communication, and Computing Technologies (ICRDICCT‘25 2025) - Volume 1, pages 842-849
ISBN: 978-989-758-777-1
Proceedings Copyright © 2025 by SCITEPRESS Science and Technology Publications, Lda.
With its growing popularity, YouTube has become a crucial e-learning platform that provides free, high-quality educational materials. The platform offers students free tutorials and courses that serve as a suitable alternative to costly coaching institutes. Exploring educational media content depends on assessing user reactions alongside video performance analytics and comment-based feedback. Several quantitative measures, together with sentiment tracking, serve to identify educationally successful YouTube videos.
2 RELATED WORKS
Many studies have applied machine learning to YouTube data to examine different aspects of data analysis. One widely used technique is sentiment analysis, which helps researchers understand users' opinions about videos. For example, Wang et al. (2018) applied a sentiment analysis tool to YouTube comments to learn about the audience and its reactions to the content. Likewise, other researchers have applied various clustering algorithms to group YouTube content by features such as tags, titles, and user actions related to the video (e.g., likes, shares, and comments). Regression-based machine learning models have been used to predict video popularity and engagement from historical performance metrics (Sharma & Hoda, 2020). Additionally, NLP methods have been used to analyze video titles and descriptions for keyword optimization (Das et al., 2021). Although much has been accomplished in YouTube data analysis, the present study combines multiple machine learning techniques, including clustering, regression, classification, and sentiment analysis, into a holistic approach to content optimization and user engagement analysis.
Data Collection and Sentiment Analysis: The data is collected from YouTube through its API. It includes video metadata, comprising titles, descriptions, and tags, along with views, likes, comments, and watch time. We cleaned and prepared the gathered data by applying normalization techniques and handling all cases of missing information. Sentiment analysis then classifies user comments into positive, negative, and neutral categories. For this step we use learning models that include BERT as well as Support Vector Machines (SVM), together with predictive analytics.
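The cleaning step described above can be illustrated with a minimal Python sketch. The paper does not state which normalization or imputation method was used, so mean imputation and min-max scaling are assumptions here, and the function names and sample view counts are hypothetical.

```python
from statistics import mean


def impute_missing(values):
    """Replace None entries with the mean of the observed values (assumed strategy)."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("column has no observed values")
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def min_max_normalize(values):
    """Scale values linearly into [0, 1]; a constant column maps to 0.0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical view counts; None marks a value that was not returned.
views = [1200, None, 300, 4500]
clean_views = min_max_normalize(impute_missing(views))
```

After this step every metric lies on a common scale, which matters for distance-based methods such as clustering, where an unscaled view count would otherwise dominate smaller-valued features.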
Clustering, Content Organization, and Success Prediction: We apply the K-Means and DBSCAN clustering methods to group similar videos by their metadata features and user interaction data. This lets the system determine how comparable or functionally similar videos perform. We also develop two types of predictive models for video success: the first predicts watch time and views, while the second uses user engagement to identify successful videos.
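To make the clustering step concrete, the following is a minimal pure-Python sketch of K-Means (Lloyd's algorithm). It is an illustration under stated assumptions rather than the implementation used in this study: the two-dimensional feature vectors stand in for hypothetical normalized (views, likes) pairs, and the fixed iteration count and seed are arbitrary choices.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, k, iterations=20, seed=0):
    """Cluster points into k groups; returns (labels, centroids)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initialize from k distinct data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return labels, centroids


# Hypothetical normalized (views, likes) features for six videos.
videos = [(0.10, 0.20), (0.15, 0.25), (0.12, 0.22),
          (0.90, 0.80), (0.95, 0.85), (0.88, 0.82)]
labels, centroids = k_means(videos, k=2)
```

On this toy input the three low-engagement videos and the three high-engagement videos end up in separate clusters. K-Means requires the number of clusters in advance, whereas DBSCAN infers it from point density, which is one reason the two methods are often used side by side.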
Keyword Optimization and Model Evaluation: Using Natural Language Processing, the system finds important keywords that appear in video titles and descriptions in order to improve video discoverability. Word2Vec enables the system to suggest additional related keywords to boost search rankings. The models are evaluated using accuracy, precision, recall, and F1-score.
3 METHODOLOGY
3.1 Overview
Modern machine learning techniques are used to perform advanced sentiment analysis of YouTube data, detecting users' emotional responses together with their engagement metrics. Combining text-based comments with video frame visuals strengthens the system's sentiment recognition and helps YouTube content creators and administrators develop better content strategies.
3.2 Theoretical Structure
Figure 1: Sentiment Analysis Pipeline Using Machine
Learning.
3.3 Data Collection
As Figure 1 shows, the first step is data collection, which extracts the following data points from YouTube. User comments on videos are extracted as the basis for sentiment analysis of user emotions. Video metadata is collected in the form of metrics including views, likes, shares, and watch time, along with channel-specific data. Video titles and descriptions supply additional textual information for sentiment analysis. Video frames serve as the visual material for multimodal sentiment analysis.
3.4 Preprocessing
Several critical operations are performed during data preprocessing. Text processing begins by tokenizing comments and descriptions, then removes stop words and cleans the text. Video frames are sampled at regular intervals to capture meaningful visual elements that influence sentiment analysis results.
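The text preprocessing steps above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the study: the stop-word list is a tiny hypothetical sample, the cleaning rules (lowercasing, removing links and non-letter characters) are assumptions, and the example comment is invented.

```python
import re

# A tiny illustrative stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "is", "it", "this", "to", "of", "in", "was", "i"}


def preprocess(text):
    """Lowercase, strip URLs and non-letter characters, tokenize, drop stop words."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove links
    text = re.sub(r"[^a-z\s]", " ", text)      # keep letters and whitespace only
    tokens = text.split()                      # whitespace tokenization
    return [t for t in tokens if t not in STOP_WORDS]


tokens = preprocess("This tutorial is GREAT!!! Watch it: https://example.com")
# tokens == ["tutorial", "great", "watch"]
```

Note that stripping every non-letter character also discards emoji and digits, which can carry sentiment in YouTube comments; whether to keep them is a design choice that depends on the downstream model.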
3.4.1 Feature Extraction
Text features are extracted using TF-IDF together with Word2Vec, the latter providing deeper semantic representations. Visual features are extracted from video frames using pre-trained Convolutional Neural Networks (CNNs).
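As an illustration of the TF-IDF weighting mentioned above, the sketch below computes one common unsmoothed variant, in which term frequency is the within-document relative frequency and inverse document frequency is log(N / df). Several TF-IDF variants exist and the paper does not specify which was used, so this choice is an assumption; the tokenized documents are hypothetical video titles.

```python
import math
from collections import Counter


def tf_idf(documents):
    """Return one {term: weight} dict per tokenized document."""
    n_docs = len(documents)
    # Document frequency: the number of documents that contain each term.
    doc_freq = Counter(term for doc in documents for term in set(doc))
    weights = []
    for doc in documents:
        counts = Counter(doc)
        total = len(doc)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical tokenized video titles.
docs = [
    ["python", "tutorial", "beginners"],
    ["python", "course", "advanced"],
    ["cooking", "tutorial", "pasta"],
]
scores = tf_idf(docs)
```

In the first title, "beginners" receives a higher weight than "python" because it appears in only one document, which is the behavior that makes TF-IDF useful for picking out distinctive keywords.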
3.5 Sentiment Analysis Models
The system employs a combination of models for improved sentiment classification:
Transformer-Based Models: BERT, T5, and GPT-3 are trained on YouTube-specific data to analyze sentiment in user comments and video descriptions. These models perform well on complex sentences, subtle emotional expressions, and wording that borders on sarcasm. Through its attention mechanism, the system identifies the most important portions of a text, which improves the accuracy of sentiment detection.
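The idea behind the attention mechanism can be shown with a minimal sketch of scaled dot-product attention for a single query, the form used inside transformer models such as BERT. The paper does not describe its attention layer in detail, so this is a generic illustration; the query, key, and value vectors are hypothetical toy values.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention(query, keys, values):
    """Scaled dot-product attention for one query vector.

    Returns (weights, context): one weight per key, and the
    weight-averaged value vector.
    """
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    context = [
        sum(w * v[i] for w, v in zip(weights, values))
        for i in range(len(values[0]))
    ]
    return weights, context


# Toy example: three token representations, the first aligned with the query.
query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [[1.0], [0.0], [-1.0]]
weights, context = attention(query, keys, values)
```

The weights sum to one and are largest for the key most similar to the query, which is how the mechanism concentrates on the parts of a comment that are most relevant to its sentiment.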
3.5.1 Multimodal Sentiment Analysis
Through CLIP (Contrastive Language-Image Pre-training), text data and video data are combined so that video frames can be linked with the matching YouTube comments. Analyzing text and images together gives a fuller understanding of sentiment, since visual content influences how people interpret sentiment in specific videos.
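CLIP-style models place images and text in a shared embedding space and compare them by cosine similarity. The sketch below shows only that matching step, assuming the embeddings have already been computed; the three-dimensional vectors are invented placeholders rather than real CLIP outputs, and the helper names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def best_matching_comment(frame_embedding, comment_embeddings):
    """Return (index of the most similar comment, list of all similarities)."""
    sims = [cosine_similarity(frame_embedding, c) for c in comment_embeddings]
    best = max(range(len(sims)), key=sims.__getitem__)
    return best, sims


# Placeholder embeddings standing in for one video frame and three comments.
frame = [0.9, 0.1, 0.0]
comments = [[0.0, 1.0, 0.0], [0.8, 0.2, 0.1], [0.0, 0.0, 1.0]]
best_index, similarities = best_matching_comment(frame, comments)
```

Here the second comment is selected because its embedding points in nearly the same direction as the frame embedding. In a real system the vectors would come from the image and text encoders of a pre-trained model.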
3.5.2 Reinforcement Learning
User feedback drives a reinforcement learning loop that refines the analysis: immediate reactions such as comment likes and dislikes, shares, and direct message responses are used as signals to improve accuracy over repeated use. This allows the model to track how user needs change and which new local behavior patterns emerge.
3.6 Sentiment Classification
The text sentiment classifier employs an LSTM network with attention mechanisms to identify positive, negative, and neutral sentiment as well as emotions such as joy, anger, and surprise. With CLIP, the system also examines visual cues in video frames alongside the textual sentiment in comments, giving an integrated analysis of content feedback.
3.7 Output and Visualization
The system divides all video feedback into four sentiment categories and reports on individual emotions in detailed analytic reports. Users of the tool gain key insights into content reactions through reports that combine audience engagement metrics with sentiment data extracted from comments and show how sentiment changes over time. Content creators, advertisers, and administrators obtain actionable strategies from the sentiment and engagement analysis, helping them improve their content direction and user experience and thereby raise engagement metrics.
3.8 Real-Time Processing
The system analyzes YouTube data in real time and delivers the results directly to content creators. This supports monitoring of newly uploaded material, shifts in audience sentiment, and emerging trends.
3.9 Benefits of System
The proposed sentiment analysis system offers important advantages to platform administrators as well as content creators and advertisers. The main advantages for each stakeholder group are detailed below:
3.9.1 For Content Creators
Audience sentiment analysis gives content creators better content direction, because they gain a clearer understanding of what drives viewer engagement and can plan future content accordingly. This yields two advantages for content creators: useful feedback and audience encouragement, both of which improve their future video production.
The system provides immediate sentiment evaluation, so creators get real-time viewer feedback on newly posted content. This allows them to adjust upcoming content quickly and to respond promptly to viewer feedback in the comments.
Identifying audience emotions such as joy and surprise helps content creators build future content strategies. Understanding viewers' emotions helps creators find better ways of connecting with their audience. Creators should identify the elements of their videos that produce positive emotions in order to build lasting viewer loyalty.
3.9.2 For Advertisers
Sentiment analysis helps advertisers show targeted ads by identifying the positive emotional reactions that video content produces in viewers. Advertisers can also evaluate how their brand is perceived by analyzing the sentiment of user comments on their uploaded videos, and can use this information to adjust marketing plans in real time. By combining sentiment trends with viewers' emotional responses to videos, advertisers can identify ad placements that reach the most receptive audience at the best times.
3.9.3 For Platform Administrators
Administrators can discover concerning or inappropriate content through sentiment-based content moderation, using patterns of negative sentiment and the detection of harmful user contributions. Such methods help foster a positive, supportive environment across YouTube.
Sentiment monitoring lets platform administrators detect emerging trends by showing changes across content areas and indicating which areas require promotion or additional monitoring. This keeps the platform responsive to changing user preferences. Administrators can also track users' emotional reactions to produce data that informs improvements to the YouTube user experience, such as system modifications, improved recommendations, and content quality evaluations.
YouTube's recommendation function can use sentiment analysis to refine how users find content. By analyzing emotions, the platform can suggest videos that are better aligned with each user's emotional preferences.
3.9.4 For Viewers
With sentiment analysis, content creators and platform administrators can develop better-curated content that fulfills the personal and emotional interests of viewers.
Viewer feedback (including comments and reactions) becomes more valuable, because content creators and brands learn the sentiments underlying those reactions and comments.
Users who receive sentiment-based content recommendations find videos that better match their emotional tastes and preferences, which improves their viewing sessions.
3.9.5 For Data Scientists and Researchers
Researchers can analyze user activity more effectively when sentiment data captures how users interact with videos along with their emotional responses and behavioral tendencies. This opens new research directions in the study of online user behavior and the workings of social media.
Researchers can also advance sentiment analysis for YouTube and other social media networks by combining transformer-based models with multimodal analysis systems.
4 RESULTS AND EVALUATION
4.1 Model Evaluation
A set of experiments was performed to confirm the effectiveness of the implemented methodology. The regression and classification models are evaluated with K-fold cross-validation to ensure that they generalize and to guard against overfitting. Model performance is measured with metrics appropriate to each task. For the regression models, Mean Squared Error (MSE) and R-squared assess prediction accuracy on continuous metrics such as views and watch time. The video success prediction models are evaluated using Accuracy, Precision, Recall, and F1-Score. Content clustering quality is assessed using the Silhouette score and the Davies-Bouldin index. The system's practical effect on YouTube channels is measured through hands-on tests on selected YouTube pages. The usefulness of the system for video enhancement and user engagement is evaluated through surveys of content creators and advertisers.
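The K-fold procedure can be sketched as an index splitter: the data is divided into k folds, and each fold serves once as the validation set while the remaining folds form the training set. The sketch below is a minimal illustration with a hypothetical function name; it does not shuffle the data, and the paper does not state the value of k that was used.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, validation_indices) for each of k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread any remainder over the first folds so sizes differ by at most one.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


# Ten hypothetical videos split into three folds of sizes 4, 3, and 3.
folds = list(k_fold_indices(n_samples=10, k=3))
```

Every sample appears in exactly one validation fold, so the averaged validation score uses all of the data while never scoring a model on examples it was trained on.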
Different experiments were conducted to verify the performance of the implemented method for optimizing YouTube video success rates. All regression and classification models are evaluated with K-fold cross-validation. Because each model is trained and validated on different subsets of the data, this procedure protects against overfitting, yields reliable performance estimates, and confirms that the conclusions are not biased by a single training set. The regression models are evaluated on their prediction of continuous performance metrics, including views, watch time, and other quantitative video metrics. Prediction accuracy is assessed using Mean Squared Error (MSE) and R-squared (R²). MSE is the average squared deviation between actual observations and predicted values, so lower values indicate more accurate predictions. R-squared is the proportion of the variance in the dependent variable that is predictable from the independent variables.
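Both regression metrics follow directly from their definitions, as the minimal sketch below shows. The view counts are hypothetical values chosen for easy arithmetic and are not results from this study.

```python
def mean_squared_error(actual, predicted):
    """Average of the squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def r_squared(actual, predicted):
    """1 - SS_res / SS_tot: the share of variance explained by the predictions."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


# Hypothetical actual and predicted view counts for four videos.
actual_views = [100, 200, 300, 400]
predicted_views = [110, 190, 310, 390]
mse = mean_squared_error(actual_views, predicted_views)  # 100.0
r2 = r_squared(actual_views, predicted_views)            # approximately 0.992
```

Every prediction is off by 10 views, so the MSE is 10² = 100, while the total sum of squares around the mean of 250 is 50,000, giving R² = 1 - 400/50,000. MSE is expressed in squared units of the target, whereas R² is unitless, which is why the two are usually reported together.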
Classification models used for video success prediction are evaluated with several classification metrics: Accuracy, Precision, Recall, and F1-Score, each of which captures a distinct aspect of performance. Accuracy measures the share of all predictions that are correct. Precision is the share of predicted positives that are actually positive, while Recall is the share of actual positives that the model finds. The F1-Score is the harmonic mean of Precision and Recall and gives a balanced assessment when classes are imbalanced. The quality of content clustering is evaluated with unsupervised learning metrics, including the Silhouette score and the Davies-Bouldin index, which measure how well videos that attract similar audiences are grouped. The Silhouette score assesses cluster quality by considering how similar objects are within a cluster and how distinct the clusters are from one another. Figure 3 illustrates keyword optimization.
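The four classification metrics can be computed from counts of true positives, false positives, and false negatives, as in the minimal sketch below. The labels are hypothetical (1 marks a successful video, 0 an unsuccessful one) and do not come from the experiments in this study.

```python
def classification_metrics(actual, predicted, positive=1):
    """Return accuracy, precision, recall, and F1 for a binary task."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    correct = sum(1 for a, p in pairs if a == p)
    accuracy = correct / len(pairs)
    # Guard against division by zero when a class is never predicted or absent.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical labels for eight videos.
actual = [1, 1, 1, 0, 0, 0, 0, 0]
predicted = [1, 1, 0, 1, 0, 0, 0, 0]
metrics = classification_metrics(actual, predicted)
```

In this example accuracy is 0.75, while precision, recall, and F1 are each 2/3. The gap shows why accuracy alone can flatter a model when successful videos are the minority class.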
The Davies-Bouldin index evaluates cluster quality by comparing intra-cluster distances to inter-cluster distances; a lower score indicates better clustering, as shown in Figure 5. The YouTube channel analysis goes beyond theoretical evaluation, because the system is also tested directly through manual evaluation of selected YouTube video pages. These practical assessments examine whether the system operates effectively in real use and produces better channel engagement, higher view counts, and improved channel metrics. Applying system-generated predictions to video content makes it possible to assess changes in audience retention, channel growth, and click-through rate. The usefulness of the system for content enhancement and user engagement is assessed through surveys of system users and content creators. The surveys evaluate the perceived value of the system's predictions in helping content strategists improve their approach, boost video visibility, and engage their audience better. Survey responses provide qualitative findings about system performance and indicate how well it serves the objectives of YouTube creators and advertisers, so that its practical value can be judged through a complete evaluation, as shown in Figure 4. The combination of statistical methods, machine learning model assessment, and practical field tests yields a methodology that gives researchers and practitioners sound results for both video publishers and advertisers on YouTube. Figure 2 shows the top 10 trending videos on YouTube.
Figure 2: Top 10 Trending YouTube Videos.
Figure 3: Keyword Optimization.
Figure 4: Engagement Metrics.
Figure 5: Clustering of YouTube Videos.
5 DISCUSSION
Research reveals that people are increasingly watching online video, which leads them to use conventional TV less frequently. The shift toward digital media proceeds at different rates, because economic background and age-related preferences shape how quickly different population segments adopt it.
The results show that online video quality does not affect users' intention to drop traditional TV, but there is strong evidence that increased digital engagement leads them to choose interactive media. According to this research, typical viewers opt for entertainment content that adapts to their preferences, offers interactive elements, and matches their viewing needs. Users who spend more time with online content develop interests in video creation, content collaboration, and interactive media consumption.
The research demonstrates that conventional television maintains some presence in Yemen despite adapting poorly to modern technological developments. Television networks face additional problems because they have been unable to establish a presence on digital platforms. Traditional broadcasters must integrate with new media trends, since their survival depends on it. Current media trends indicate that TV viewing will likely decrease, because people are attracted to customizable digital media content.
Restrictive economic factors have a substantial impact on this transformation. The slower decline of conventional television in Yemen results from limited disposable income and a cultural preference for traditional broadcasting among older generations. The data indicates that although people watch more online video each year, the medium has not yet achieved dominance in the Yemeni market.
Traditional TV networks need to adapt their content approach for young viewers by innovating on the platforms those viewers prefer. The popularity of flexible online video platforms, which offer wide content diversity, pushes conventional broadcasters to transform their business models. The evolution of platforms through technological change shows that new technologies do not simply replace existing ones but instead compete with them for market standing.
Traditional TV in Yemen will survive only by implementing digital strategies and interactive features and by successfully targeting changing audience preferences. If traditional TV systems neglect to integrate digital strategies, their presence will eventually fade while digital media claim the entire market in the coming years.
6 CONCLUSIONS
This study develops a complete system based on machine learning methods for exploring YouTube data, covering sentiment analysis, content optimization, and engagement prediction. The results demonstrate how machine learning algorithms deliver analytical information that benefits creators, marketers, and YouTube platform administrators. Stakeholders who use these insights will be able to develop better content tactics, build more successful relationships with users, and improve their video performance metrics. The limitations of the present research point to opportunities for future advances in YouTube analytics and content improvement methods.
Combining machine learning analysis of user data with emotion detection enables marketers, YouTube staff, and video creators to work more effectively. Conclusions drawn from the system's analysis strengthen content strategies, marketing capabilities, and audience connection methods. Applying sentiment analysis enables video personalization, so the system performs better and provides users with more customized content selections. Deep learning approaches evaluated across multiple platforms could offer better recommendations for developing future systems. Through the system, content creators gain greater control, which lets them build stronger bonds with their audiences.