Intelligent Customer Feedback Analysis System Using NLP
Techniques
Omkar Jayendra Rane (a) and Deepali Naik (b)
Pimpri Chinchwad College of Engineering, Department of Computer Engineering, Pune, Maharashtra, India
Keywords: Sentiment Analysis, LDA, CKY, TF-IDF, Customer Feedback Analysis, VADER Sentiment Analysis.
Abstract: In the digital era, businesses receive vast amounts of unstructured customer feedback containing
valuable insights into customer satisfaction, product performance, and service quality. Extracting actionable
insights from this feedback requires advanced analytical techniques. This study presents an Intelligent
Customer Feedback Analysis System that uses Natural Language Processing (NLP) techniques, namely VADER
sentiment analysis, LDA topic modelling, and TF-IDF, for sentiment classification, trend analysis, topic
detection, and keyword extraction. The system is delivered as an interactive Streamlit application with
visualization tools such as word clouds, sentiment distribution charts, rating trends, and sentiment filters.
The implemented system works in real time and supports a data-driven understanding of customer sentiment
for brand decision-making and proactive customer engagement.
1 INTRODUCTION
Modern markets increasingly put the user first: the
service provider must not only deliver a service but
also accept user feedback. The rise of social media,
online ratings and comments, and other forms of
digital customer feedback has created a need for
robust methods for collecting, evaluating, and
synthesizing insights from large amounts of
unstructured consumer-related information. Classical
feedback mechanisms, however, are limited in their
application because they rely on rigid, rule-based
systems that cannot capture the subtleties and
complexities of human emotions. As a consequence,
it is hard for enterprises to fully utilize consumer
feedback in highly competitive, consumer-dominated
industries such as e-commerce, food delivery, and
auto service (Hemalatha and Velmurugan, 2020;
Ramaswamy and Declerck, 2018).
The past decade has witnessed considerable
advances in Machine Learning (ML), Deep Learning
(DL), and Natural Language Processing (NLP),
making it possible to conduct consumer opinion
mining with enhanced detail and accuracy. Early
studies began with traditional methods, such as
logistic regression, Naive Bayes, and SVM, which
laid a foundation for more sophisticated models
(Shaeeali et al., 2020).
(a) https://orcid.org/0009-0004-2942-2693
(b) https://orcid.org/0000-0001-6022-4565
However, as consumer opinion mining techniques
involve working with large amounts of unstructured
text data, it was only natural that interest shifted
toward NLP.
Conventional machine learning techniques are best
suited to structured, numerical data and struggle
with the intricacies of textual information, in
particular varied sentence patterns, ambiguous
language, and context dependencies. NLP, by
contrast, is devoted to processing and comprehending
human language, which allows emotion, intention,
and meaning to be understood much more deeply
(Schreiber et al., 2020). This capability makes NLP
well suited to processing large amounts of
unstructured content, including product reviews,
social network posts, and customer complaints,
which were problematic for traditional ML methods
(Anonymous, 2020).
Research focusing on specific areas, like opinions
about airlines (Li et al., 2023) or food delivery
services (Malik and Bilal, 2023), has shown that
sentiment analysis has improved with the
introduction of BERT and its transformer
architecture. NLP models such as BERT (Li et al.,
2023) and GPT (Olujimi and Ade-Ibijola, 2023) are
effective at capturing the contextual sentiment
needed to make persuasive arguments in highly
competitive environments.
Rane, O. J. and Naik, D.
Intelligent Customer Feedback Analysis System Using NLP Techniques.
DOI: 10.5220/0013584100004664
Paper published under CC license (CC BY-NC-ND 4.0)
In Proceedings of the 3rd International Conference on Futuristic Technology (INCOFT 2025) - Volume 1, pages 707-714
ISBN: 978-989-758-763-4
Proceedings Copyright © 2025 by SCITEPRESS Science and Technology Publications, Lda.
Hybrid models that combine SVM and CNN with
optimization algorithms such as Particle Swarm
Optimization (Khaled, 2014) can also improve the
quality and precision of the analysis.
Arias-Barahona et al. (2023) explore machine
learning (ML) and natural language processing
(NLP) to automate request classification in software
companies' customer service areas. By applying ML
algorithms such as Support Vector Machine (SVM),
Extra Trees, and Random Forests to processed and
balanced datasets, they achieved a classification
accuracy of 98.97% with SVM. This approach
enhances customer service efficiency by reducing
response times and providing accurate
categorizations. The findings underscore the
effectiveness of data balancing and hyper-parameter
optimization techniques, particularly with
unbalanced datasets containing multiple categories
(Arias-Barahona et al., 2023).
Despite these improvements, sentiment
classification alone is insufficient, and there is a
growing need for intelligent feedback mechanisms.
Usability and interpretability are key to effective
customer feedback systems. Existing research shows
that visual displays of feedback data help
stakeholders grasp and respond to complex findings
(Olujimi and Ade-Ibijola, 2023). Visuals built on
rigorous analysis can support better decision-making
by showing trends, revealing patterns, and pointing
out areas for improvement. According to Schreiber et
al. (2020), adding user-friendly visual analytics
makes feedback systems even more helpful and lets
companies set priorities based on real-time results.
This study puts forward a new approach
to building an Intelligent Customer Feedback System
that combines various NLP techniques and adapts to
specific fields. The proposed system handles large
amounts of unstructured feedback data and provides
dynamic, practical insights that meet the customer's
needs. By using recent advances in NLP and ML, the
system aims to address the shortcomings of
traditional feedback analysis systems and offer a
more customer-focused and responsive solution.
The rest of the paper is arranged as follows:
Section 2 reviews the existing literature, and
Section 3 discusses the methodology of the proposed
system in detail. Section 4 presents the performance
analysis of the implemented techniques in the form
of graphs and tables. The paper concludes with
significant observations.
2 RELATED WORK
In recent times, customer feedback analysis has
advanced through ML (Hemalatha and Velmurugan,
2020), NLP (Malik and Bilal, 2023), and DL
(Ramaswamy and Declerck, 2018) techniques. This
section discusses state-of-the-art customer feedback
analysis techniques, with a focus on the
methodologies and tools adopted by previous
researchers and on anticipated future developments.
Early research relied on simple machine learning
models to determine how customers feel from their
reviews. Hemalatha and Velmurugan (2020) showed
that logistic regression, Naïve Bayes, SVM, and
neural networks were commonly used. These tools
work well with structured information and helped
lay the groundwork for sentiment analysis, but they
often miss subtle emotional cues in people's words
(Alibasic and Popovic, 2021). For example, SVM and
logistic regression are suitable for basic
classification tasks but need considerable tuning to
work accurately.
Moreover, they struggle with the context-dependent
attitudes often seen in customer feedback (Olujimi
and Ade-Ibijola, 2023; Zheng et al., 2024). These
traditional methods set a baseline for performance,
but they do not adapt well to very large volumes of
data.
NLP and DL introduced newer and more
efficient methods of data analysis.
Ramaswamy and Declerck (2018) showed that
tokenization and segmentation significantly improve
real-time customer sentiment analysis. A study by
XPath also used word vectorization and neural
network models to enable contextual understanding
of sentiment, a direct step toward actionable
insights. In a parallel study, Shaeeali et al. (2020)
applied NLP methods such as text tokenization and
sentiment classification to food delivery services to
determine customers' emotions and the key factors
that make them satisfied.
Recent research has highlighted the emergence of
transformer-based models for capturing the
intricacies of context-dependent sentiment
expressions. For example, Li et al. (2023) applied
BERT, a strong transformer model, to the sentiment
analysis of airline customer feedback and attained
high accuracy owing to BERT's ability to interpret
word meaning and subtleties in context. Transformers
can thus be seen as a catalyst for the next stage of
development in this area, largely because they can
be applied directly to unstructured target data.
However, these models demand massive computation
and need domain-specific fine-tuning to function
stably (Schreiber et al., 2020). Even so,
transformers can uncover the customer's voice in a
way that earlier neural network models could not.
Malik and Bilal (2023) proposed a technique that
merges models like Support Vector Machines (SVM)
and Convolutional Neural Networks (CNN) with
optimization algorithms like Adaptive Particle
Swarm Optimization to improve feedback analysis on
e-commerce platforms. Through adaptive learning,
these hybrid approaches handle real-time updates
more efficiently, so the feedback systems remain
relevant and adapt to new data patterns.
Table 1: Summary of the literature studied based on various aspects, implementation techniques, and relevant insights.

| Aspect | Implementation | Insights from Literature Review |
|--------|----------------|---------------------------------|
| Objective | Analyze customer reviews for sentiment, key topics, and syntactic patterns. | Enhance customer feedback understanding using advanced NLP for sentiment analysis, topic modeling, and predictive analytics. |
| NLP Tools Used | VADER for sentiment analysis, TF-IDF for keyword extraction, LDA for topic modeling, and CFG for parsing. | Word2Vec, GloVe, BERT, RoBERTa for embeddings; SVM, Naive Bayes, Logistic Regression for sentiment analysis; advanced parsing tools like spaCy and CoreNLP. |
| Pre-processing | Tokenization, stopword removal, cleaning for LDA, CFG grammar definition for parsing. | Includes advanced cleaning, lemmatization, POS tagging, and contextual pre-processing specific to domains (e.g., healthcare, finance). |
| Sentiment Analysis | VADER scoring is used to classify reviews as positive, negative, or neutral. | Comparison of lexicon-based (e.g., VADER) with deep learning-based sentiment analysis (e.g., BERT) for higher accuracy and contextual understanding. |
| Topic Modeling | LDA model to identify 5 topics with dominant keywords from reviews. | Emphasis on coherence measures for validating topics and use of transformers (e.g., BERTopic) for dynamic topic modeling. |
| Visualization | Sentiment distribution (bar chart), trend analysis (line chart), and word clouds for sentiment groups. | Heatmaps, t-SNE, and clustering plots are utilized to interpret high-dimensional data and semantic relationships better. |
| Syntactic Analysis | Context-free grammar (CFG) parsing using the CKY algorithm. | More advanced dependency parsing (e.g., Stanford Parser) and syntactic tree generation for domain-specific grammar validation. |
3 PROPOSED METHODOLOGY
This research uses an automated framework to
analyse customer feedback. The methodology is
implemented in Python, using a combination of
pre-packaged libraries and original algorithms, to
extract results from customer reviews. The pipeline
begins with data pre-processing, which cleans the
reviews by handling missing values, converting the
text to lowercase, splitting the sentences into
tokens, and filtering stop words. Descriptive
statistics (mean ratings, review lengths, rating
change over time) are generated through statistical
and exploratory data analysis (EDA) techniques. The
detailed architecture of the proposed system is
shown in Figure 1.
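The pre-processing steps listed above can be sketched in a few lines of standard-library Python. This is a minimal illustration and not the authors' implementation: the stop-word list, the token pattern, and the function names `preprocess` and `preprocess_all` are assumptions introduced here.

```python
import re

# Hypothetical stop-word list for illustration only; a real pipeline
# would likely use a fuller list from an NLP library.
STOP_WORDS = {"a", "an", "and", "the", "is", "it", "this", "to", "of", "for", "i", "my"}

# Assumed token pattern: runs of lowercase letters, digits, and apostrophes.
TOKEN_PATTERN = re.compile(r"[a-z0-9']+")


def preprocess(review):
    """Return the lowercase tokens of one review with stop words removed.

    A missing value (None) is treated as an empty review.
    """
    if review is None:
        return []
    tokens = TOKEN_PATTERN.findall(review.lower())
    return [t for t in tokens if t not in STOP_WORDS]


def preprocess_all(reviews):
    """Pre-process every review and drop those left with no tokens."""
    cleaned = (preprocess(r) for r in reviews)
    return [tokens for tokens in cleaned if tokens]


if __name__ == "__main__":
    sample = ["I love this Kindle, it is great!", None, "The battery is a problem."]
    for tokens in preprocess_all(sample):
        print(tokens)
```

Dropping empty reviews is one possible way of handling missing values; imputing a placeholder string would be another.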
3.1 Dataset Description
The Consumer Reviews of Amazon Products dataset
on Kaggle (Datafiniti, 2018) contains over 34,000
reviews of various Amazon products, such as the
Kindle and Fire TV. It provides valuable information,
including product names, ratings, reviews, and
feedback dates. This dataset is ideal for text analysis,
sentiment analysis, and machine learning applications
focused on consumer behavior and product
evaluation. Its structured format allows easy
integration into analytical workflows to derive
insights into customer satisfaction and preferences.
3.2 Implemented Models
Classify the sentiment:
The VADER (Valence Aware Dictionary and
sEntiment Reasoner) model is used for sentiment
analysis. It assigns a compound score (from -1
to +1) to each review, which is used to classify the
review as positive, negative, or neutral. We augment
this by visualizing the sentiment distribution with
bar charts and by using word clouds to explore the
most common terms associated with each sentiment type.
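The mapping from a compound score to a sentiment label can be sketched as follows. The paper does not state its cut-off values, so the threshold of 0.05 used here is an assumption (it is the value conventionally suggested for VADER), and the function names are hypothetical. The `compound_scores` helper relies on the third-party `vaderSentiment` package and is only needed when real scores are computed.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a compound score in [-1, +1] to a sentiment label.

    The 0.05 threshold is an assumed, conventional cut-off; the paper
    does not report the values it used.
    """
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"


def compound_scores(texts):
    """Compute VADER compound scores for a list of review texts.

    Requires the third-party vaderSentiment package; the import is kept
    local so that label_from_compound works without it.
    """
    from vaderSentiment.vaderSentiment import SentimentIntensityAnalyzer

    analyzer = SentimentIntensityAnalyzer()
    return [analyzer.polarity_scores(text)["compound"] for text in texts]


def classify_reviews(texts):
    """Return one sentiment label per review text."""
    return [label_from_compound(score) for score in compound_scores(texts)]
```

Keeping the threshold as a parameter makes it easy to check how sensitive the positive, negative, and neutral counts are to the chosen cut-off.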
Topics identification
The Latent Dirichlet Allocation (LDA) algorithm is
one of the methods used to perform topic modeling,
which detects the abstract themes of reviews.
Reviews are tokenized and transformed into a corpus.
Topics and most representative keywords are
generated from the corpus. A TF-IDF (Term
Frequency-Inverse Document Frequency) vectorizer
is used to calculate words' scores in reviews and find
the most important terms.
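To make the TF-IDF scoring concrete, the sketch below computes the textbook form, tf(t, d) · log(N / df(t)), over tokenized reviews. It is an illustration under stated assumptions: library vectorizers typically apply smoothing and normalization, so their scores will differ, and the function names `tf_idf` and `top_terms` are introduced here rather than taken from the paper.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Compute textbook TF-IDF scores.

    docs is a list of token lists; the result is one {term: score}
    dictionary per document. No smoothing or normalization is applied.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))  # count each term once per document

    scores = []
    for tokens in docs:
        counts = Counter(tokens)
        total = len(tokens)
        scores.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores


def top_terms(docs, k=3):
    """Rank terms by their highest TF-IDF score in any document."""
    best = {}
    for doc_scores in tf_idf(docs):
        for term, score in doc_scores.items():
            best[term] = max(best.get(term, 0.0), score)
    return sorted(best, key=best.get, reverse=True)[:k]


if __name__ == "__main__":
    reviews = [["great", "tablet"], ["great", "kindle", "kindle"]]
    print(tf_idf(reviews))
    print(top_terms(reviews, k=2))
```

In this unsmoothed form, a term that occurs in every review receives a score of zero, which is the discounting of common words that TF-IDF is meant to provide.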
Syntactic examination
The syntactic analysis of selected reviews is
conducted using context-free grammar (CFG), with
the CKY algorithm employed to create a parse table
for checking grammatical structure.
Figure 1: Architecture of NLP-based Customer Feedback
System
4 RESULTS
The system is implemented to analyze Amazon
customer reviews through various NLP techniques.
The results fall into three main types: sentiment
analysis, topic analysis, and syntactic analysis.
The results and associated visualizations give
companies actionable insights into customer
feedback.
The developed models are run on the Amazon
dataset, and the extracted results are shown in
Figures 2 to 9.
The statistical summary of the review data
indicates a highly positive trend in customer
feedback. The average rating stands at 4.5968,
demonstrating an overall favourable perception. The
rating distribution reveals that most ratings are
concentrated at the higher end of the scale, with over
3,500 reviewers assigning a 5-star rating. Ratings of
4 also appear frequently, while lower ratings,
including 1, 2, and 3, are comparatively rare.
Additionally, the average length of reviews is
calculated at 161.35 characters, suggesting concise
user feedback. This distribution highlights a strong
positive sentiment among most reviewers.
Figure 2: Statistical Summary of Product Ratings
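The summary statistics described above (mean rating, rating distribution, and mean review length in characters) can be computed as in the sketch below. The input format, a list of `(rating, text)` pairs, and the function name `summarize` are assumptions made for illustration; the sample values are not taken from the dataset.

```python
from collections import Counter
from statistics import mean


def summarize(reviews):
    """Summarize a list of (rating, text) pairs.

    Returns the mean rating, the number of reviews per rating value,
    and the mean review length in characters.
    """
    ratings = [rating for rating, _ in reviews]
    lengths = [len(text) for _, text in reviews]
    return {
        "mean_rating": mean(ratings),
        "rating_counts": dict(Counter(ratings)),
        "mean_length_chars": mean(lengths),
    }


if __name__ == "__main__":
    # Illustrative values only, not drawn from the Amazon dataset.
    sample = [(5, "Great tablet"), (4, "Good value"), (5, "Love it")]
    print(summarize(sample))
```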
The rating trend over time demonstrates a
generally consistent and positive trajectory. From late
2014 through mid-2016, ratings broadly hovered
around the 5-star mark, reflecting strong satisfaction
among users. However, a noticeable drop occurred
around mid-2016, when ratings dipped significantly,
potentially indicating a specific issue or event during
this period. Following this decline, ratings quickly
rebounded and maintained a stable upward trend,
returning to near-perfect levels by late 2018. This
pattern suggests that while there was a brief
disruption in user satisfaction, it was effectively
addressed, resulting in sustained positive feedback
over the long term.
Figure 3: Review Rating Trend Analysis
The reviews were filtered based on sentiment and
specific keywords to understand user feedback better.
For example, using the keyword "Kindle" and
focusing only on positive sentiments, we identified
reviews that highlighted users' favourite features.
Figure 4: Reviews Filtering based on Keyword and
Sentiments.
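The keyword and sentiment filter described above can be sketched as a simple predicate over labelled reviews. The record layout (dictionaries with `text` and `sentiment` keys) and the function name `filter_reviews` are hypothetical; the interactive application may store and filter its data differently.

```python
def filter_reviews(reviews, keyword=None, sentiment=None):
    """Select reviews that match an optional keyword and sentiment label.

    reviews is a list of dicts with 'text' and 'sentiment' keys (an
    assumed schema). Keyword matching is a case-insensitive substring test.
    """
    keyword_lc = keyword.lower() if keyword else None
    return [
        review for review in reviews
        if (sentiment is None or review["sentiment"] == sentiment)
        and (keyword_lc is None or keyword_lc in review["text"].lower())
    ]


if __name__ == "__main__":
    # Illustrative records only.
    data = [
        {"text": "Love my Kindle", "sentiment": "positive"},
        {"text": "Kindle battery problem", "sentiment": "negative"},
        {"text": "Great tablet", "sentiment": "positive"},
    ]
    print(filter_reviews(data, keyword="Kindle", sentiment="positive"))
```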
The sentiment analysis results reveal a
predominantly positive tone in the dataset, with a
significant majority of 4,532 entries classified as
positive. In contrast, negative sentiments are much
fewer, with only 289 instances, while neutral
sentiments account for 179 entries. This distribution
is illustrated in the accompanying bar chart (Figure 5).
Figure 5: Sentiment Analysis Summary with LDA
The LDA analysis revealed five main topics in
customer reviews. Topic 0 focuses on tablets, with
keywords like "tablet," "fire," and "amazon." Topic 1
centers on voice-enabled devices, with keywords such
as "Alexa" and "music." Topic 2 is about e-readers,
with terms like "Kindle" and "books." Topic 3
reflects product use and entertainment, while Topic 4
discusses ease of use and pricing. The numbers, as in
0.050*tablet, represent the weight of each word in a
topic, showing how relevant it is to that topic.
Figure 6: LDA Output Represents the Topics Discovered in
the Reviews, Most Significant Words, and Corresponding
Weights (Probabilities)
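The weight notation used in the topic output can be unpacked programmatically. The sketch below assumes each topic is printed as terms of the form weight*word joined by plus signs, with optional double quotes around the word; only the pair 0.050*tablet comes from the text above, and the other weights in the example are made up for illustration.

```python
import re

# Matches one "weight*word" term; quotes around the word are optional.
TERM_PATTERN = re.compile(r'([0-9]*\.?[0-9]+)\*"?([^"+\s]+)"?')


def parse_topic(topic_string):
    """Turn a topic string into a list of (word, weight) pairs."""
    return [(word, float(weight)) for weight, word in TERM_PATTERN.findall(topic_string)]


if __name__ == "__main__":
    # Only 0.050*tablet appears in the paper; the other weights are illustrative.
    topic_0 = '0.050*"tablet" + 0.031*"fire" + 0.020*"amazon"'
    for word, weight in parse_topic(topic_0):
        print(f"{word}: {weight:.3f}")
```

Sorting the parsed pairs by weight gives the most representative words of a topic, which is how the topics are labelled in the discussion above.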
Figure 7 lists the top keywords extracted from customer
reviews using the TF-IDF (Term Frequency-Inverse
Document Frequency) method. These keywords
highlight the most relevant terms frequently
mentioned across the reviews while discounting
commonly used words that add less value. The
identified keywords include terms such as "value,"
"amazon," "bought," "easy," "echo," "good," "great,"
"kindle," "love," "tablet," and "use." These words
likely indicate recurring themes in customer
feedback, reflecting user sentiment and frequently
discussed aspects of the reviewed product or service.
Figure 7: Top Keywords Extracted by Applying TF-IDF
Technique
The word clouds below represent frequently
mentioned terms in positive and negative customer
reviews. In positive reviews, words like "love,"
"tablet," "use," and "great" stand out, reflecting
customer satisfaction with usability, quality, and
overall experience. In contrast, the negative reviews
highlight terms such as "problem," "work," "Kindle,"
and "battery," pointing to issues with performance,
reliability, or specific features. These visualizations
offer insight into customer sentiment, highlighting
both praised aspects and areas of concern.
Figure 8: Word Cloud of Positive and Negative Words from
the Analyzed Reviews
Context-free grammar (CFG) parsing with the CKY
algorithm is demonstrated on the sentence "read the book." The
provided CFG includes rules for sentence structure,
such as breaking a sentence into a noun phrase (NP)
and a verb phrase (VP). The CKY parse table shows
how the input sentence is parsed step-by-step
according to these rules, with components like "read"
identified as a verb (V), "the" as a determiner (Det),
and "book" as a noun (N). The parsing was successful,
illustrating the grammatical breakdown of the
sentence.
Figure 9: Result of CKY Algorithm Applied to Review
Text.
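The CKY parse described above can be reproduced with a small recognizer. The grammar below is a hypothetical one in Chomsky normal form, written only to cover "read the book"; it treats the imperative sentence as a verb phrase (VP → V NP, NP → Det N) and is not the grammar used in the paper, which is not listed in full.

```python
from itertools import product

# Hypothetical grammar in Chomsky normal form, for illustration only.
LEXICON = {"read": {"V"}, "the": {"Det"}, "book": {"N"}}
BINARY_RULES = {("Det", "N"): {"NP"}, ("V", "NP"): {"VP"}}


def cky_table(words):
    """Build the CKY parse table for a list of words.

    table[i][j] holds the non-terminals that derive words[i..j] inclusive.
    """
    n = len(words)
    table = [[set() for _ in range(n)] for _ in range(n)]

    # Length-1 spans come straight from the lexicon.
    for i, word in enumerate(words):
        table[i][i] = set(LEXICON.get(word, ()))

    # Longer spans combine two adjacent, already-filled sub-spans.
    for span in range(2, n + 1):
        for start in range(n - span + 1):
            end = start + span - 1
            for split in range(start, end):
                left = table[start][split]
                right = table[split + 1][end]
                for pair in product(left, right):
                    table[start][end] |= BINARY_RULES.get(pair, set())
    return table


def recognizes(words, goal="VP"):
    """Return True if the goal symbol derives the whole word sequence."""
    if not words:
        return False
    return goal in cky_table(words)[0][len(words) - 1]


if __name__ == "__main__":
    sentence = "read the book".split()
    print(cky_table(sentence))
    print(recognizes(sentence))
```

The table is filled from short spans to long ones, so every cell is computed from cells that are already complete; this bottom-up order is what makes CKY a dynamic-programming algorithm.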
5 CONCLUSIONS
This research provides an expansive framework for
evaluating customer feedback by utilizing various
natural language processing techniques, including
sentiment analysis, topic modelling, and syntactic
parsing. The execution uses VADER to classify the
sentiment, LDA to identify topics, and CFG parsing
using the CKY algorithm for syntactic examination.
The results show that the implemented system can
reveal critical insights regarding trends in the
sentiment expressed, frequent themes, and
grammatical patterns, thus providing practical
intelligence for organizations aiming to improve
customer satisfaction.
The interactive nature of the implementation,
enabled through the Streamlit platform, allows users
to visualize and interpret data effectively. However,
the system's reliance on traditional NLP methods
presents opportunities for improvement, particularly
by integrating advanced techniques like transformer-
based models (e.g., BERT, GPT) for enhanced
accuracy and scalability. Future work will expand the
system's capabilities to include multilingual analysis,
domain-specific customizations, and real-time
feedback processing, extending its applicability to
diverse contexts and datasets.
While customer feedback analysis has come a
long way, specific problems persist with existing
systems and remain open areas for future research.
Future research may gain traction by focusing on
domain-specific models that adapt and on integrating
the traditional NLP pipeline into a feedback loop.
State-of-the-art models still leave room for
improvement in real-time prompt adaptation
(especially in domain-specific areas), fine-tuning,
and computational efficiency. Our proposed
intelligent customer feedback system seeks to
alleviate these challenges by combining advanced
NLP techniques with adaptive learning models to
provide customized, domain-specific, real-time
solutions at scale.
REFERENCES
1. Hemalatha, B. and Velmurugan, T. (2020).
Impact of customer feedback system using
machine learning algorithms for sentiment
mining. International Journal of Innovative
Technology and Exploring Engineering.
2. Ramaswamy, S. and Declerck, N. (2018).
Customer perception analysis using deep learning
and NLP. Procedia Computer Science.
3. Shaeeali, N. S., Mohamed, A., and Mutalib, S.
(2020). Customer reviews analytics on food
delivery services in social media: a review. IAES
International Journal of Artificial Intelligence
(IJ-AI).
4. Li, Z., Huang, C., and Yang, C. (2023). A
comparative sentiment analysis of airline
customer reviews using bidirectional encoder
representations from transformers (BERT) and its
variants. Mathematics.
5. Malik, N. and Bilal, M. (2023). Natural language
processing for analyzing online customer
reviews: A survey; taxonomy; and open research
challenges. Unpublished.
6. Khaled, K. A. (2014). Natural language
processing and its use in education. International
Journal of Advanced Computer Science and
Applications.
7. Arias-Barahona, M. X., Valencia-Díaz, M. A.,
Tabares-Soto, R., Flórez-Ruíz, J. C., Arteaga-
Arteaga, H. B., and Orozco-Arias, S. (2023).
Requests classification in the customer service
area for software companies using machine
learning and natural language processing. PeerJ
Computer Science.
8. Shaik, T., McDonald, J., Tao, X., Redmond, P.,
Galligan, L., Li, Y., and Dann, C. (2022). A
review of the trends and challenges in adopting
natural language processing methods for
education feedback analysis. IEEE Access.
9. Alibasic, A. and Popovic, T. (2021). Applying
natural language processing to analyze customer
satisfaction. Unpublished.
10. Olujimi, P. A. and Ade-Ibijola, A. A. (2023).
NLP techniques for automating responses to
customer queries: A systematic review. Discover
Artificial Intelligence.
11. Schreiber, R., Ramsey, G., and Nawab, K.
(2020). Natural language processing to extract
meaningful information from patient experience
feedback. Applied Clinical Informatics.
12. Anonymous. (2020). AI-based natural language
processing (NLP) systems. Journal of Algebraic
Statistics.
13. Abro, A. A., Talpur, M. S. H., and Jumani, A. K.
(2023). Natural language processing challenges
and issues: A literature review. Gazi University
Journal of Science.
14. Zheng, H., Zhou, H., Su, G., Xu, K., and Wang,
Y. (2024). Medication recommendation system
based on natural language processing for patient
emotion analysis. Academic Journal of Science
and Technology.
15. Datafiniti. (2018). Consumer reviews of Amazon
products, version 1. Retrieved November 24,
2024, from
https://www.kaggle.com/datasets/datafiniti/consumer-reviews-of-amazon-products.