Intelligent Customer Feedback Analysis System Using NLP Techniques

Omkar Jayendra Rane and Deepali Naik

Pimpri Chinchwad College of Engineering, Department of Computer Engineering, Pune, Maharashtra, India

Keywords: Sentiment Analysis, LDA, CYK, TF-IDF, Customer Feedback Analysis, VADER Sentiment Analysis.

Abstract: In the digital era, businesses receive vast amounts of unstructured customer feedback containing valuable insights into customer satisfaction, product performance, and service quality. Extracting actionable insights from this feedback requires advanced analytical techniques. This study presents an Intelligent Customer Feedback Analysis System that uses Natural Language Processing (NLP) techniques such as VADER sentiment analysis, LDA topic modelling, and TF-IDF for sentiment classification, trend analysis, topic detection, and keyword extraction. The system is delivered as an interactive Streamlit application with visualization tools such as word clouds, sentiment distributions, rating trends, and sentiment filters. The implemented system operates in real time and supports a data-driven understanding of customer sentiment in brand decision-making, drawing on customer perspectives to enable proactive engagement.

1 INTRODUCTION

Some modern markets have witnessed an emerging trend in which the user comes first, and the service provider must not only deliver a service but also accept user feedback. The rise of social media, online ratings and comments, and other forms of digital customer feedback has created a need for robust methods for collecting, evaluating, and synthesizing insights from large amounts of unstructured consumer-related information. Classical feedback mechanisms, however, are almost always limited in their application because they rely on rigid, rule-based systems that cannot comprehend the subtleties and complexities of human emotions. As a consequence, it becomes hard for enterprises to fully utilize consumer feedback in highly competitive, consumer-dominated industries such as e-commerce, food delivery, and auto service (Hemalatha and Velmurugan, 2020), (Ramaswamy and Declerck, 2018).

The past decade has witnessed considerable advances in Machine Learning (ML), Deep Learning (DL), and Natural Language Processing (NLP), making it possible to conduct consumer opinion mining with enhanced detail and accuracy. Early studies began with traditional methods, such as logistic regression, Naive Bayes, and SVM, which laid a foundation for more sophisticated models (Shaeeali, Mohamed, et al., 2020).

https://orcid.org/0009-0004-2942-2693

https://orcid.org/0000-0001-6022-4565

However, as consumer opinion mining techniques

involve working with large amounts of unstructured

text data, it was only natural that interest shifted

toward NLP.

Ordinary machine learning techniques are most applicable to structured and numerical data, and they struggle with the intricacies of textual information, in particular varied sentence patterns, ambiguous language, and context dependencies. NLP, by contrast, is devoted to processing and comprehending human language, which allows emotion, intention, and meaning to be understood much more profoundly (Schreiber, Ramsey, et al., 2020). This capability makes NLP well suited to processing large amounts of unstructured content, including product reviews, social network posts, and customer complaints, which were problematic for traditional ML methods (Anonymous, 2020).

Research focusing on specific areas, such as opinions about airlines (Li, Huang, et al., 2023) or food delivery services (Malik and Bilal, 2023), has shown that sentiment analysis has improved with the introduction of BERT and its transformer architecture. NLP models such as BERT (Li, Huang, et al., 2023) and GPT (Olujimi and Ade-Ibijola, 2023) are effective at capturing the contextual sentiment needed to make persuasive arguments in highly competitive environments.

Rane, O. J. and Naik, D.
Intelligent Customer Feedback Analysis System Using NLP Techniques.
DOI: 10.5220/0013584100004664
Paper published under CC license (CC BY-NC-ND 4.0)
In Proceedings of the 3rd International Conference on Futuristic Technology (INCOFT 2025) - Volume 1, pages 707-714
ISBN: 978-989-758-763-4

Hybrid models that integrate SVM and CNN with optimization algorithms such as Particle Swarm Optimization (Khaled, 2014) can also improve the quality and precision of the analysis. A related study explores machine learning (ML) and natural language processing (NLP) to automate request classification in the customer service areas of software companies. By processing and balancing the datasets and applying ML algorithms such as Support Vector Machine (SVM), Extra Trees, and Random Forests, the research achieved a classification accuracy of 98.97% with SVM. This approach significantly enhances customer service efficiency by reducing response times and providing accurate categorizations. The findings underscore the effectiveness of data balancing and hyper-parameter optimization techniques, particularly with unbalanced datasets containing multiple categories (Arias-Barahona, Valencia-Díaz, et al., 2023).

Despite these improvements, sentiment classification alone is insufficient, and there is a growing need for intelligent feedback mechanisms. Usability and interpretability are key to effective customer feedback systems. Existing research shows that visual displays of feedback data help stakeholders grasp and respond to complex findings (Olujimi and Ade-Ibijola, 2023). Visuals generated from rigorous analysis can support better decision-making by showing trends, revealing patterns, and pointing out areas for improvement. According to Schreiber et al. (Schreiber, Ramsey, et al., 2020), the addition of user-friendly visual analytics makes feedback systems even more helpful and lets companies set priorities based on real-time results.

This study puts forward a new approach

to building an Intelligent Customer Feedback System

that combines various NLP techniques and adapts to

specific fields. The proposed system handles large

amounts of unstructured feedback data and provides

dynamic, practical insights that meet the customer's

needs. By using recent advances in NLP and ML, the

system aims to address the shortcomings of

traditional feedback analysis systems and offer a

more customer-focused and responsive solution.

The rest of the paper is arranged as follows: Section 2 reviews the existing literature, and Section 3 discusses the methodology of the proposed system in detail. Section 4 presents the performance analysis of the implemented techniques in the form of graphs and tables. Section 5 concludes the paper with significant observations.

2 RELATED WORK

In recent times, customer feedback analysis has advanced through ML (Hemalatha and Velmurugan, 2020), NLP (Malik and Bilal, 2023), and DL (Ramaswamy and Declerck, 2018) techniques. This section discusses state-of-the-art customer feedback analysis techniques, focusing on the methodologies and tools adopted by previous researchers and on anticipated future developments.

Researchers started with simple classification models to determine how customers feel from their reviews. Hemalatha and Velmurugan (2020) showed that logistic regression, Naïve Bayes, SVM, and neural networks were commonly used. These tools work well with structured information and helped lay the groundwork for sentiment analysis, but they often miss subtle emotional cues in people's words (Alibasic and Popovic, 2021). For example, SVM and logistic regression are suitable for basic classification tasks but need considerable tuning to work accurately.

Moreover, they struggle with the context-dependent attitudes often seen in customer feedback (Olujimi and Ade-Ibijola, 2023), (Zheng, Zhou, et al., 2024). These traditional methods set a baseline for performance, but they do not adapt well when dealing with large volumes of information.

NLP and DL introduced newer and more efficient data analysis methods to practitioners. Ramaswamy and Declerck (2018) showed that tokenization and segmentation significantly improve real-time customer sentiment analysis. Their study also uses word vectorization and neural network models to enable contextual understanding of sentiment, a direct step toward actionable insights. In a parallel study, Shaeeali et al. (2020) applied NLP methods such as text tokenization and sentiment classification to food delivery services to determine customers' emotions and the key factors behind their satisfaction (Shaeeali, Mohamed, et al., 2020).

Recent research has shed light on the emergence of transformer-based models for capturing the intricacies of context-dependent sentiment expressions. For example, Li et al. (2023) applied BERT, a strong transformer model, to the sentiment analysis of airline customer feedback; the model attained high accuracy owing to BERT's ability to interpret word meaning and subtleties in context (Li, Huang, et al., 2023). Transformers can thus be seen as a catalyst for the next stage of development, mainly because they can be applied directly to unstructured target data. However, these models demand substantial computation and need domain-specific fine-tuning to function reliably (Schreiber, Ramsey, et al., 2020). Even so, transformers can uncover the customer's voice in a way that earlier neural network models could not.

Malik and Bilal (2023) proposed a technique that merges models such as Support Vector Machines (SVM) and Convolutional Neural Networks (CNN) with optimization algorithms such as Adaptive Particle Swarm Optimization to improve feedback analysis on e-commerce platforms (Malik and Bilal, 2023). Through adaptive learning, these hybrid approaches handle real-time updates more efficiently, so the feedback systems remain relevant and adaptive to new data patterns.

Table 1: Summary of the literature studied based on various aspects, implementation techniques, and relevant insights.

Aspect | Implementation | Insights from Literature Review
Objective | Analyze customer reviews for sentiment, key topics, and syntactic patterns. | Enhance customer feedback understanding using advanced NLP for sentiment analysis, topic modeling, and predictive analytics.
NLP Tools Used | VADER for sentiment analysis, TF-IDF for keyword extraction, LDA for topic modeling, and CFG for parsing. | Word2Vec, GloVe, BERT, RoBERTa for embeddings; SVM, Naive Bayes, Logistic Regression for sentiment analysis; advanced parsing tools like spaCy and CoreNLP.
Pre-processing | Tokenization, stopword removal, cleaning for LDA, CFG grammar definition for parsing. | Includes advanced cleaning, lemmatization, POS tagging, and contextual pre-processing specific to domains (e.g., healthcare, finance).
Sentiment Analysis | VADER scoring is used to classify reviews as positive, negative, or neutral. | Comparison of lexicon-based (e.g., VADER) with deep learning-based sentiment analysis (e.g., BERT) for higher accuracy and contextual understanding.
Topic Modeling | LDA model to identify 5 topics with dominant keywords from reviews. | Emphasis on coherence measures for validating topics and use of transformers (e.g., BERTopic) for dynamic topic modeling.
Visualization | Sentiment distribution (bar chart), trend analysis (line chart), and word clouds for sentiment groups. | Heatmaps, t-SNE, and clustering plots are utilized to interpret high-dimensional data and semantic relationships better.
Syntactic Analysis | Context-free grammar (CFG) parsing using the CKY algorithm. | More advanced dependency parsing (e.g., Stanford Parser) and syntactic tree generation for domain-specific grammar validation.

3 PROPOSED METHODOLOGY

This research uses an automated framework to analyse customer feedback. The methodology is implemented in Python, using a combination of pre-packaged libraries and original algorithms to extract results from customer reviews. The pipeline begins with data pre-processing, which prepares the reviews by handling missing values, converting the text to lowercase, splitting the sentences into tokens, and filtering stop words. Descriptive statistics (mean ratings, review lengths, rating change over time) are generated through statistical and exploratory data analysis (EDA) techniques. The detailed architecture of the proposed system is shown in Figure 1.
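The pre-processing steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the stop-word list, the regular expression used for tokenization, and the function name `preprocess` are all hypothetical.

```python
import re

# Hypothetical, deliberately small stop-word list; a real pipeline would
# normally load a fuller list from an NLP library.
STOP_WORDS = {"a", "an", "the", "is", "it", "this", "and", "to", "of", "for", "i", "my"}

def preprocess(review):
    """Return lowercase tokens with stop words removed.

    Missing reviews (None or NaN) are treated as empty, which is one
    possible way of handling missing values.
    """
    if review is None or review != review:  # NaN is the only value not equal to itself
        return []
    text = str(review).lower()
    tokens = re.findall(r"[a-z0-9']+", text)  # simple regex tokenizer (assumption)
    return [t for t in tokens if t not in STOP_WORDS]

reviews = ["This tablet is GREAT for reading!", None, "I love my Kindle."]
cleaned = [preprocess(r) for r in reviews]
print(cleaned)
```

Treating a missing review as an empty token list keeps row alignment with the ratings; dropping such rows instead would be an equally reasonable choice.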

3.1 Dataset Description

The Consumer Reviews of Amazon Products dataset

on Kaggle (Datafiniti, 2018) contains over 34,000

reviews of various Amazon products, such as the

Kindle and Fire TV. It provides valuable information,

including product names, ratings, reviews, and

feedback dates. This dataset is ideal for text analysis,

sentiment analysis, and machine learning applications

focused on consumer behavior and product

evaluation. Its structured format allows easy

integration into analytical workflows to derive

insights into customer satisfaction and preferences.

3.2 Implemented Models

• Sentiment classification

The VADER (Valence Aware Dictionary and sEntiment Reasoner) model is used for sentiment analysis. It assigns each review a compound score (from −1 to +1), which is used to classify the review as positive, negative, or neutral. We augment this by visualizing how sentiment is distributed with bar charts and by using word clouds to explore the most common terms associated with each sentiment type.

• Topic identification

The Latent Dirichlet Allocation (LDA) algorithm is used to perform topic modeling, which detects the abstract themes of the reviews. Reviews are tokenized and transformed into a corpus, from which topics and their most representative keywords are generated. A TF-IDF (Term Frequency-Inverse Document Frequency) vectorizer is used to score the words in the reviews and find the most important terms.

• Syntactic examination

The syntactic analysis of selected reviews is

conducted using context-free grammar (CFG), with

the CKY algorithm employed to create a parse table

for checking grammatical structure.
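For the sentiment classification step, the mapping from a VADER compound score to a label can be sketched as below. The paper does not state which cut-offs were applied; the ±0.05 thresholds used here are the convention commonly paired with VADER and are an assumption, as are the example scores.

```python
from collections import Counter

# Assumed cut-offs: +/-0.05 is the convention commonly used with VADER;
# the paper does not state which thresholds it applied.
POSITIVE_THRESHOLD = 0.05
NEGATIVE_THRESHOLD = -0.05

def label_from_compound(compound):
    """Map a compound score in [-1, 1] to a sentiment label."""
    if compound >= POSITIVE_THRESHOLD:
        return "positive"
    if compound <= NEGATIVE_THRESHOLD:
        return "negative"
    return "neutral"

# Illustrative compound scores, not values from the dataset.
# With the vaderSentiment package, a score would come from
# SentimentIntensityAnalyzer().polarity_scores(text)["compound"].
scores = [0.84, -0.61, 0.0, 0.32, -0.02]
labels = [label_from_compound(s) for s in scores]
distribution = Counter(labels)
print(distribution)
```

The resulting counter is the kind of per-label tally that a sentiment distribution bar chart would plot.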

Figure 1: Architecture of NLP-based Customer Feedback

System

4 RESULTS

The system is implemented to analyze Amazon customer reviews through various NLP techniques. The results can be grouped into three main types: sentiment analysis, topic analysis, and syntactic analysis. The results and associated visualizations provide companies with actionable insights into customer feedback.

The developed models are executed on the Amazon dataset, and the extracted results are shown in Figures 2 to 9.

The statistical summary of the review data

indicates a highly positive trend in customer

feedback. The average rating stands at 4.5968,

demonstrating an overall favourable perception. The

rating distribution reveals that most ratings are

concentrated at the higher end of the scale, with over

3,500 reviewers assigning a 5-star rating. Ratings of

4 also appear frequently, while lower ratings,

including 1, 2, and 3, are comparatively rare.

Additionally, the average length of reviews is

calculated at 161.35 characters, suggesting concise

user feedback. This distribution highlights a strong

positive sentiment among most reviewers.

Figure 2: Statistical Summary of Product Ratings
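The summary statistics reported above (average rating, rating distribution, and average review length in characters) could be computed along the following lines. The records and the field names `rating` and `text` are hypothetical and serve only to illustrate the calculation.

```python
from collections import Counter
from statistics import mean

# Illustrative records only; the field names "rating" and "text" are hypothetical.
reviews = [
    {"rating": 5, "text": "Great tablet for the price."},
    {"rating": 4, "text": "Easy to use."},
    {"rating": 5, "text": "Love it."},
    {"rating": 2, "text": "Battery drains fast."},
]

average_rating = mean(r["rating"] for r in reviews)
average_length = mean(len(r["text"]) for r in reviews)  # length in characters
rating_counts = Counter(r["rating"] for r in reviews)

print(f"Average rating: {average_rating:.4f}")
print(f"Average review length: {average_length:.2f} characters")
for stars in sorted(rating_counts, reverse=True):
    print(f"{stars} stars: {rating_counts[stars]}")
```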

The rating trend over time demonstrates a

generally consistent and positive trajectory. From late

2014 through mid-2016, ratings broadly hovered

around the 5-star mark, reflecting intense satisfaction

among users. However, a noticeable drop occurred

around mid-2016, when ratings dipped significantly,

potentially indicating a specific issue or event during

this period. Following this decline, ratings quickly

rebounded and maintained a stable upward trend,

returning to near-perfect levels by late 2018. This

pattern suggests that while there was a brief

disruption in user satisfaction, it was effectively

addressed, resulting in sustained positive feedback

over the long term.

Figure 3: Review Rating Trend Analysis
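A rating trend of this kind can be obtained by grouping ratings by period and averaging them. The sketch below assumes monthly grouping, which the paper does not specify, and uses made-up `(date, rating)` pairs.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Illustrative (date, rating) pairs, not values from the dataset.
records = [
    (date(2016, 5, 3), 5),
    (date(2016, 5, 20), 4),
    (date(2016, 6, 11), 2),
    (date(2016, 6, 25), 3),
    (date(2016, 7, 2), 5),
]

def monthly_average(records):
    """Average rating per (year, month), in chronological order."""
    buckets = defaultdict(list)
    for day, rating in records:
        buckets[(day.year, day.month)].append(rating)
    return {month: mean(ratings) for month, ratings in sorted(buckets.items())}

trend = monthly_average(records)
for (year, month), avg in trend.items():
    print(f"{year}-{month:02d}: {avg:.2f}")
```

A dip such as the one in the middle month of this toy data is the kind of pattern a line chart of the trend would make visible.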

The reviews were filtered based on sentiment and

specific keywords to understand user feedback better.

For example, using the keyword "Kindle" and

focusing only on positive sentiments, we identified

reviews that highlighted users' favourite features.

Figure 4: Reviews Filtering based on Keyword and

Sentiments.
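The keyword and sentiment filter can be illustrated with a small function. This is a sketch under assumptions: the record layout (`text` and `sentiment` keys), the case-insensitive substring match, and the sample reviews are hypothetical.

```python
def filter_reviews(reviews, keyword, sentiment):
    """Keep reviews whose text contains `keyword` (case-insensitive)
    and whose sentiment label equals `sentiment`."""
    needle = keyword.lower()
    return [
        r for r in reviews
        if r["sentiment"] == sentiment and needle in r["text"].lower()
    ]

# Illustrative records with pre-computed sentiment labels.
reviews = [
    {"text": "My Kindle is perfect for reading at night.", "sentiment": "positive"},
    {"text": "The Kindle battery died after a month.", "sentiment": "negative"},
    {"text": "Great tablet for kids.", "sentiment": "positive"},
]

matches = filter_reviews(reviews, "Kindle", "positive")
for review in matches:
    print(review["text"])
```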

The sentiment analysis results reveal a

predominantly positive tone in the dataset, with a

significant majority of 4,532 entries classified as

positive. In contrast, negative sentiments are much

fewer, with only 289 instances, while neutral

sentiments account for 179 entries. This distribution is

illustrated in the accompanying bar chart (Figure 5).

Figure 5: Sentiment Analysis Summary with LDA
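As a quick arithmetic check, the reported counts can be converted into shares of the analyzed entries. Only the three counts come from the text; the variable names are illustrative.

```python
# Counts as reported in the sentiment analysis results.
counts = {"positive": 4532, "negative": 289, "neutral": 179}

total = sum(counts.values())
shares = {label: 100 * n / total for label, n in counts.items()}

print(f"Total entries: {total}")
for label, share in shares.items():
    print(f"{label}: {share:.2f}%")
```

The counts sum to 5,000 entries, of which roughly 90.6% are positive, 5.8% negative, and 3.6% neutral.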

The LDA analysis revealed five main topics in customer reviews. Topic 0 focuses on tablets, with keywords like "tablet," "fire," and "amazon." Topic 1 centers on voice-enabled devices, with keywords such as "Alexa" and "music." Topic 2 is about e-readers, with terms like "Kindle" and "books." Topic 3 reflects product use and entertainment, while Topic 4 discusses ease of use and pricing. The numbers, like 0.050*tablet, represent the weight of each word in a topic, showing how relevant it is to that topic.

Figure 6: LDA Output Represents the Topics Discovered in

the Reviews, Most Significant Words, and Corresponding

Weights (Probabilities)
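To make the meaning of the per-topic word weights concrete, the following is a small pure-Python collapsed Gibbs sampler for LDA. It is a teaching sketch, not the library implementation used to produce Figure 6; the hyperparameters `alpha` and `beta`, the iteration count, and the toy documents are assumptions.

```python
import random
from collections import defaultdict

def lda_gibbs(docs, num_topics, iterations=100, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling on tokenized documents.

    Returns, per topic, a list of (word, probability) pairs sorted by weight.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    vocab_size = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]               # words in doc d assigned to topic k
    topic_word = [defaultdict(int) for _ in range(num_topics)]  # times word w is assigned to topic k
    topic_total = [0] * num_topics                              # words assigned to topic k
    assignments = []
    for d, doc in enumerate(docs):                              # random initial assignment
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_doc)
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assignments[d][i]                           # remove the current assignment
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta) / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k                           # record the resampled topic
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    topics = []
    for t in range(num_topics):
        dist = {w: (topic_word[t][w] + beta) / (topic_total[t] + vocab_size * beta)
                for w in vocab}
        topics.append(sorted(dist.items(), key=lambda kv: -kv[1]))
    return topics

# Toy tokenized reviews, for illustration only.
docs = [
    ["tablet", "fire", "amazon", "tablet"],
    ["kindle", "books", "read", "kindle"],
    ["tablet", "amazon", "fire"],
    ["books", "kindle", "read"],
]
topics = lda_gibbs(docs, num_topics=2)
for t, words in enumerate(topics):
    print(f"Topic {t}: " + " + ".join(f"{p:.3f}*{w}" for w, p in words[:3]))
```

Each topic's weights form a probability distribution over the vocabulary, so a value such as 0.050 for a word is that word's share of the topic.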

Figure 7 lists the top keywords extracted from customer

reviews using the TF-IDF (Term Frequency-Inverse

Document Frequency) method. These keywords

highlight the most relevant terms frequently

mentioned across the reviews while discounting

commonly used words that add less value. The

identified keywords include terms such as "value,"

"amazon," "bought," "easy," "echo," "good," "great,"

"kindle," "love," "tablet," and "use." These words

likely indicate recurring themes in customer

feedback, reflecting user sentiment and frequently

discussed aspects of the reviewed product or service.

Figure 7: Top Keywords Extracted by Applying TF-IDF

Technique
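How TF-IDF discounts words that appear everywhere can be seen in a short worked example. The sketch uses the textbook definition (term frequency times the logarithm of the inverse document frequency); library vectorizers typically apply smoothed or normalized variants, so the exact scores behind Figure 7 may differ. The toy documents are illustrative.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: score} dict per tokenized document,
    using tf = count / doc length and idf = ln(N / document frequency)."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores

# Toy tokenized reviews, for illustration only.
docs = [
    ["great", "tablet", "easy", "use"],
    ["great", "kindle", "love", "kindle"],
    ["great", "echo", "music"],
]
scores = tf_idf(docs)
for doc_scores in scores:
    print(sorted(doc_scores.items(), key=lambda kv: -kv[1]))
```

Because "great" occurs in every toy document, its inverse document frequency is ln(1) = 0 and it drops out, while a term concentrated in one document, such as "kindle", scores highest there.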

The word clouds below represent frequently

mentioned terms in positive and negative customer

reviews. In positive reviews, words like "love,"

"tablet," "use," and "great" stand out, reflecting

customer satisfaction with usability, quality, and

overall experience. In contrast, the negative reviews

highlight terms such as "problem," "work," "Kindle,"

and "battery," pointing to issues with performance,

reliability, or specific features. These visualizations

offer insight into customer sentiment, highlighting

both praised aspects and areas of concern.

Figure 8: Word Cloud of Positive and Negative Words from

the Analyzed Reviews
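A word cloud is driven by per-group term frequencies, which can be tallied as in the sketch below. The token lists are invented for illustration, and the function name `top_terms` is hypothetical.

```python
from collections import Counter

def top_terms(token_lists, n=3):
    """Return the n most frequent tokens across a group of tokenized reviews."""
    counts = Counter(token for tokens in token_lists for token in tokens)
    return counts.most_common(n)

# Illustrative tokenized reviews, already split by sentiment label.
positive_reviews = [["love", "tablet"], ["great", "tablet", "use"], ["love", "great", "tablet"]]
negative_reviews = [["battery", "problem"], ["kindle", "battery", "work"]]

print("Positive:", top_terms(positive_reviews))
print("Negative:", top_terms(negative_reviews))
```

In a rendered word cloud, these counts would determine the relative size of each term.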

Context-free grammar (CFG) parsing with the CKY algorithm is

demonstrated on the sentence "read the book." The

provided CFG includes rules for sentence structure,

such as breaking a sentence into a noun phrase (NP)

and a verb phrase (VP). The CKY parse table shows

how the input sentence is parsed step-by-step

according to these rules, with components like "read"

identified as a verb (V), "the" as a determiner (Det),

and "book" as a noun (N). The parsing was successful,

illustrating the grammatical breakdown of the

sentence.


Figure 9: Result of CKY Algorithm Applied to Review

Text.
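The CKY table construction for this sentence can be sketched as follows. The paper's exact grammar is not reproduced in the text, so the rules below are a hypothetical grammar in Chomsky normal form; in particular, the rule S → V NP is an assumption added so that the subjectless sentence "read the book" is accepted.

```python
from itertools import product

# Hypothetical CNF grammar: maps a pair of right-hand-side symbols
# to the set of left-hand-side symbols that can produce it.
BINARY_RULES = {
    ("NP", "VP"): {"S"},
    ("V", "NP"): {"VP", "S"},   # S -> V NP covers imperative sentences (assumption)
    ("Det", "N"): {"NP"},
}
LEXICON = {"read": {"V"}, "the": {"Det"}, "book": {"N"}}

def cky_table(words):
    """Fill the CKY chart: table[i][j] holds the symbols deriving words[i:j]."""
    n = len(words)
    table = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        table[i][i + 1] = set(LEXICON.get(word, ()))
    for span in range(2, n + 1):
        for start in range(0, n - span + 1):
            end = start + span
            for mid in range(start + 1, end):
                for left, right in product(table[start][mid], table[mid][end]):
                    table[start][end] |= BINARY_RULES.get((left, right), set())
    return table

def recognizes(words):
    """True if the start symbol S derives the whole sentence."""
    return "S" in cky_table(words)[0][len(words)]

words = "read the book".split()
table = cky_table(words)
print("the book ->", table[1][3])
print("read the book ->", table[0][3])
print("Parsed:", recognizes(words))
```

The cell covering "the book" contains NP, and the cell covering the whole sentence contains S, which is what a successful parse means in the CKY chart.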

5 CONCLUSIONS

This research provides an expansive framework for

evaluating customer feedback by utilizing various

natural language processing techniques, including

sentiment analysis, topic modelling, and syntactic

parsing. The execution uses VADER to classify the

sentiment, LDA to identify topics, and CFG parsing

using the CKY algorithm for syntactic examination.

The results show that the implemented system can

reveal critical insights regarding trends in the

sentiment expressed, frequent themes, and

grammatical patterns, thus providing practical

intelligence for organizations aiming to improve

customer satisfaction.

The interactive nature of the implementation,

enabled through the Streamlit platform, allows users

to visualize and interpret data effectively. However,

the system's reliance on traditional NLP methods

presents opportunities for improvement, particularly

by integrating advanced techniques like transformer-

based models (e.g., BERT, GPT) for enhanced

accuracy and scalability. Future work will expand the

system's capabilities to include multilingual analysis,

domain-specific customizations, and real-time

feedback processing, extending its applicability to

diverse contexts and datasets.

While customer feedback analysis has come a long way, specific problems persist with existing systems and remain open areas of research. Future research may gain traction by focusing on domain-specific models that adapt to new data and on integrating the traditional NLP pipeline into a real-time feedback loop. State-of-the-art models still leave room for improvement in real-time prompt adaptation (especially in domain-specific areas), fine-tuning, and computational efficiency. The proposed intelligent customer feedback system seeks to alleviate these challenges by combining advanced NLP techniques with adaptive learning models to provide customized, domain-specific, real-time solutions at scale.

REFERENCES

1. Hemalatha, B. and Velmurugan, T. (2020).

Impact of customer feedback system using

machine learning algorithms for sentiment

mining. International Journal of Innovative

Technology and Exploring Engineering.

2. Ramaswamy, S. and Declerck, N. (2018).

Customer perception analysis using deep learning

and NLP. Procedia Computer Science.

3. Shaeeali, N. S., Mohamed, A., and Mutalib, S.

(2020). Customer reviews analytics on food

delivery services in social media: a review. IAES

International Journal of Artificial Intelligence

(IJ-AI).

4. Li, Z., Huang, C., and Yang, C. (2023). A

comparative sentiment analysis of airline

customer reviews using bidirectional encoder

representations from transformers (BERT) and its

variants. Mathematics.

5. Malik, N. and Bilal, M. (2023). Natural language

processing for analyzing online customer

reviews: A survey; taxonomy; and open research

challenges. Unpublished.

6. Khaled, K. A. (2014). Natural language

processing and its use in education. International

Journal of Advanced Computer Science and

Applications.

7. Arias-Barahona, M. X., Valencia-Díaz, M. A.,

Tabares-Soto, R., Flórez-Ruíz, J. C., Arteaga-

Arteaga, H. B., and Orozco-Arias, S. (2023).

Requests classification in the customer service

area for software companies using machine

learning and natural language processing. PeerJ

Computer Science.

8. Shaik, T., McDonald, J., Tao, X., Redmond, P.,

Galligan, L., Li, Y., and Dann, C. (2022). A

review of the trends and challenges in adopting

natural language processing methods for

education feedback analysis. IEEE Access.

9. Alibasic, A. and Popovic, T. (2021). Applying

natural language processing to analyze customer

satisfaction. Unpublished.


10. Olujimi, P. A. and Ade-Ibijola, A. A. (2023).

NLP techniques for automating responses to

customer queries: A systematic review. Discover

Artificial Intelligence.

11. Schreiber, R., Ramsey, G., and Nawab, K.

(2020). Natural language processing to extract

meaningful information from patient experience

feedback. Applied Clinical Informatics.

12. Anonymous. (2020). AI-based natural language

processing (NLP) systems. Journal of Algebraic

Statistics.

13. Abro, A. A., Talpur, M. S. H., and Jumani, A. K.

(2023). Natural language processing challenges

and issues: A literature review. Gazi University

Journal of Science.

14. Zheng, H., Zhou, H., Su, G., Xu, K., and Wang,

Y. (2024). Medication recommendation system

based on natural language processing for patient

emotion analysis. Academic Journal of Science

and Technology.

15. Datafiniti. (2018). Consumer reviews of Amazon

products, version 1. Retrieved November 24,

2024, from

https://www.kaggle.com/datasets/datafiniti/consumer-reviews-of-amazon-products.
