Aspect Based Sentiment Analysis on Online Review Data
to Predict Corporate Reputation
R. E. Loke (ORCID: https://orcid.org/0000-0002-7168-090X) and W. Reitter
Centre for Market Insights, Amsterdam University of Applied Sciences, Amsterdam, The Netherlands
Keywords: Aspect Based Sentiment Analysis (ABSA), Machine Learning, Natural Language Processing, Scraping.
Abstract: Corporate reputation is an intangible resource that is closely tied to an organization’s success, but
measuring it and deriving actions that improve it can be a long and expensive journey for an
organization. In the available literature, corporate reputation is primarily measured through surveys, which
can be time and cost intensive. This paper uses online reviews as the source for a machine-learning
driven aspect-based sentiment analysis that enables organizations to evaluate their corporate reputation on
a fine-grained level. The analysis is unsupervised, so organizations do not need to manually label
datasets. The insights generated through the analysis save organizations the cost and time of measuring
corporate reputation and, in addition, split the overall reputation into multiple aspects, with which
organizations can identify weaknesses and in turn improve their corporate reputation. This research is
therefore relevant for organizations that aim to understand and improve their corporate reputation to
achieve success, for example in the form of financial performance, and for organizations that consult
other organizations on their journeys to increased success. Our approach is validated, evaluated and
illustrated with Trustpilot review data.
1 INTRODUCTION
One of the major objectives of strategic business
management is understanding the driving factors of
organizational performance (Crook et al., 2008). One
way to evaluate and understand the heterogeneity of
firms is the resource-based view, which analyzes
organizational resources and capabilities
(Eloranta & Turunen, 2015). Those resources can be
tangible or intangible (Kamasak, 2017), and according
to researchers (Kor & Mesko, 2013; Molloy & Barney,
2015) intangible resources are the most likely sources
of an organization’s success. Resources
need to be valuable, rare, inimitable and not
substitutable (Y. Lin & Wu, 2014). According
to Brahim & Arab (2011), intangible resources are the most
difficult to imitate and substitute, and it can therefore
be argued that they are the most valuable and rarest.
A main intangible resource of an organization that
is increasingly receiving attention is corporate
reputation (Wepener & Boshoff, 2015). According to
Schwaiger et al. (2011) corporate reputation is “the
ultimate determinant of competitiveness”. However,
measuring corporate reputation is a challenge for
researchers and businesses alike. Corporate reputation
is seen as a factor that can explain the performance of
a business, with Firestein (2006) arguing that corporate
reputation is the strongest aspect of a company’s
sustainability. The relevance of corporate reputation
has been stated repeatedly over the years; for example,
Abratt & Kleyn (2012) concluded that corporate
reputation is a strategic resource to create competitive
advantage. Furthermore, Vig et al. (2017) argue that
corporate reputation can be a significant factor to
predict financial performance. A reason for the
possible correlation between corporate reputation and
financial performance can be the effect of corporate
reputation on customers and their purchase intention.
Keh & Xie (2009) observed a relation between
corporate reputation and customer trust, which leads to
increased purchase intention and willingness to pay.
The positive effect corporate reputation can have
on a customer is becoming increasingly relevant in an
environment where businesses need to become more
customer centric for long term business success and
profitability (Roy & Shekhar, 2010), making the
corporate reputation more important than ever. As
such, it can be said that understanding and improving
corporate reputation is an essential part of an
organization’s success, but measuring it is a challenge
for organizations.

Loke, R. and Reitter, W.
Aspect Based Sentiment Analysis on Online Review Data to Predict Corporate Reputation.
DOI: 10.5220/0010607203430352
In Proceedings of the 10th International Conference on Data Science, Technology and Applications (DATA 2021), pages 343-352
ISBN: 978-989-758-521-0
Copyright © 2021 by SCITEPRESS Science and Technology Publications, Lda. All rights reserved

A common method to measure and evaluate corporate
reputation is conducting surveys, which is used by several
researchers (Cintamür & Yüksel, 2018; Fombrun et al.,
2015; Puncheva‐Michelotti & Michelotti, 2010;
Sequeira et al., 2015; Wepener & Boshoff, 2015).
It can be argued that more data is needed to obtain
the most reliable results possible, and although
conducting surveys allows for targeted and in-depth
information, it can be time consuming when trying to
collect as much data as possible. Furthermore, the
relation between time and cost of a project (Babu &
Suresh, 1996) means that more time spent on
conducting the survey can lead to higher cost.
Therefore, it can be said that the main problem for
organizations in evaluating and measuring corporate
reputation, which is said to impact an organization’s
success, is the time and cost intensive methodology of
collecting primary data through surveys. As such, a
goal of an organization is to find a time and cost
efficient way of measuring corporate reputation that
provides added value.
To find an efficient way to tackle this problem,
digital channels should be considered: according to
Deloitte (2013), customers in the digital age
increasingly use digital touchpoints to resolve issues
or get in contact with businesses. Dang & Pham (2020)
further stress this relevance by concluding that
focusing on customers online is essential for
businesses. Given the goal of finding a time and cost
efficient way to evaluate corporate reputation and the
increasing relevance of online channels, another source
to evaluate and improve corporate reputation is online
reviews, where customers leave information such as
positive or negative feedback, recommendations or
other valuable insights. According to Mayzlin et al.
(2014), user generated online reviews are an important
resource for consumers in their purchase decision. The
main advantage of analyzing online reviews compared to
conducting surveys is that the reviews are already
available online, so no time and cost intensive surveys
need to be carried out to collect the data. Furthermore,
it can be argued that online reviews offer more data
than surveys, as there might be more customers who
leave a review than customers who are willing to fill
out a survey. Online reviews are thus a favorable
source for measuring corporate reputation, but the
question of how they are analyzed needs to be
answered as well.
To derive corporate reputation, Chung et al. (2019)
used sentiment polarity analysis on tweets about
several companies. It can be argued that a tweet
about a company is comparable to an online review
about a company. However, the main problem of
sentiment polarity analysis is that it is done on an
overall level and is not in-depth enough to derive
concrete recommendations. Even when an organization
is perceived positively overall, there can be aspects
of the organization that are seen negatively and need
to be improved. Chung et al. (2019) suggest that this
main limitation of their analysis can be addressed by
aspect-based sentiment analysis (ABSA), which can
predict the sentiment of various aspects. The need for
and relevance of ABSA is also backed up by Jebbara &
Cimiano (2016), who argue that sentiment analysis
needs to be done on a more fine-grained level and
stress the importance of analyzing online reviews. The
authors propose an ABSA that can extract sentiments
expressed towards aspects from the text and thus can
detect multiple opinions in a single review. However,
ABSA faces another time related problem, which is
manual labeling: research that has provided great
ABSA results (Araque et al., 2017; Chen et al., 2017)
needed a significant amount of time for manually
labeling data. This also limits the amount of data,
as a human can label only a limited amount of data in
a short time. As such, conducting ABSA on online
reviews is a cost and time efficient way for an
organization to evaluate corporate reputation, but it
needs to be done in a way that does not require
manual labeling.
In the following sections, we describe relevant
conceptual models, give details on our processing
methodology, and conclude our work.
2 THEORETICAL FRAMEWORK
This section is intended to give an overview of the
current state of research that is needed to understand
corporate reputation.
2.1 Current State of Corporate
Branding
To understand corporate reputation, the origin needs to
be examined first. Corporate reputation is part of
corporate branding, which can be described as a
multidisciplinary field that has proved its usefulness in
scientific and business environments (Biraghi &
Gambetti, 2015). Corporate branding is currently
experiencing three major shifts: (1) Brand strategy
shifting its focus from products to an organizational
perspective (Balmer, 2001); (2) Shifting from
marketing to corporate strategy (Abratt & Kleyn,
2012); (3) Focusing on a stakeholder-centric view
where the corporate brand is in an ongoing dialogue
with the stakeholders (Melewar et al., 2012).
According to Melewar et al. (2012) the third shift
is to move away from the traditional perspective and to
incorporate a dynamic view on corporate branding. In
this paper and for corporate reputation, the third shift is
most relevant. Here, corporate branding is seen as a
collaborative, relational and social process between the
company and its stakeholders (Cornelissen et al.,
2012). Melewar et al. (2012) further stress the fact that
this view of corporate branding sees the brand as a
vehicle that helps with the interaction between the
company and its environment. Koporcic & Halinen
(2018) assume that corporate brands are formed in the
minds of the individual people and that the image is
constantly being refined and changed. According to the
authors, the key concepts that are part of corporate
branding are corporate identity and corporate
reputation. According to Podnar (2015, p. 29)
corporate identity characteristics are real and constant
attributes, whereas corporate reputation can be
different, depending on the view of the observer.
2.2 Current Research and Definition on
Corporate Reputation
Research has underpinned the relevance of corporate
reputation and it is seen as a strategic resource to
create competitive advantage (Abratt & Kleyn, 2012).
Early on, corporate reputation was defined as a
concept that is a personal impression of the perceived
entity (Shee & Abratt, 1989). In more recent
research on corporate reputation, Agarwal et al.
(2015) affirm the view that corporate reputation is an
unobservable construct that only exists in the minds
of the stakeholders.
Definitions of corporate reputation can be found
in a growing body of literature. There is currently no
singular definition or adaptation of corporate
reputation and some interpretations overlap or
contradict. Table 1 displays several definitions of
corporate reputation before deriving the main
definition that will be used in this paper.
What is consistent throughout the definitions is
that corporate reputation is a perception of an
organization that a stakeholder has, that is based on
past impressions. As such, this paper’s definition of
corporate reputation is:
“A sociocognitive construct consisting of the
organization’s perception that a stakeholder has, that
is built through past impressions”
Table 1: Definitions of corporate reputation.
Literature | Definition
Gotsi & Wilson, 2001 | Stakeholder’s overall evaluation of a company based on experiences, communication and symbols that provide information about company’s action, over time. The evaluation is also compared with rivals.
Chun, 2005 | Umbrella referring to the cumulative impressions of the different stakeholders of what the organization stands for and what it is associated with.
Barnett et al., 2006 | Collective judgement of observers of a corporation, based on assessments of financial, social and environmental impacts, over time.
Walker, 2010 | Relatively stable, issue specific representation of a company’s past actions and future prospects compared to the standard. Takes time to build and can remain stable once built.
Rindova et al., 2010 | Sociocognitive construct that is characterized by quality and prominence to determine value as an intangible asset contributing to competitive advantage.
Lange et al., 2011 | Objective reality for the organization that is subjectively created by outside observers.

The goal of this paper is to measure corporate
reputation. Several researchers have tried to
measure corporate reputation and to find relevant
dimensions; Table 2 shows the variables they propose
for measuring corporate reputation.
Table 2: Dimensions of corporate reputation.
Literature | Dimensions | Methodology
Wepener & Boshoff, 2015 | Emotional appeal, social engagement, corporate performance, good employer, service points | Online survey
Cintamür & Yüksel, 2018 | Financial performance, customer orientation, social and environmental responsibility, trust | Face-to-face survey
Fombrun et al., 2015 | Products, innovation, workplace, governance, citizenship, leadership, performance | Survey
Puncheva‐Michelotti & Michelotti, 2010 | Management excellence, social responsibility, customer value, economic performance, patriotic appeal, consumer impact, emotional appeal, credibility | Survey
Sequeira et al., 2015 | Enterprise agreeableness, competence, commitment, ruthlessness | Survey

The problem with these dimensions is that they are
intended to be used in surveys and not in ABSA
on online reviews. As such, emotional appeal,
management excellence, social responsibility and
other dimensions are not applicable to this research,
where online reviews serve as the source of the
analysis. Those dimensions are better suited to
surveys, where detailed and targeted questions can be
formed, whereas online reviews are based on what
customers freely write. Therefore, it can be expected
that such detailed and specific information cannot be
found in online reviews, and thus most dimensions are
not applicable to this research.
This paper’s dimensions of corporate reputation
are based on Fombrun et al. (2015), due to the
detailed explanations of the dimensions and attributes
such as quality or value, which are more applicable
to online reviews than the other dimensions shown in
Table 2. Furthermore, Fombrun et al. (2015) propose
two approaches for measuring corporate reputation,
so the results can be validated by comparing the
results of dimensions 1 and 2; see Figure 1. The
framework can be compared with a recent framework
proposed by Wepener & Boshoff (2015). The
frameworks have similarities such as performance or
employee dimensions. However, the framework of
this paper has a two-way approach to measure
corporate reputation and has more
dimensions/attributes to measure corporate reputation
on. This is a bigger advantage when using online
reviews than when using surveys: in surveys, the
respondent can lose attention and motivation when the
survey is longer, whereas this hurdle does not exist
when analyzing online reviews, so a more extensive
approach can be chosen. Having a framework to measure
corporate reputation on makes it possible to apply
ABSA, where each dimension and attribute is an
aspect that is analyzed for its corresponding
sentiment. This approach can be seen as an extension
to current literature and research (Caviggioli et al.,
2020; Chung et al., 2019) where corporate reputation
is increasingly measured through social
media or online reviews. However, those researchers
have not applied their analysis to frameworks such as
the one proposed in this paper. As such, this paper
aims to conduct ABSA on a theoretical framework
that is usually applied in surveys. This can lead to a
more reliable result of corporate reputation that is
backed up by a theoretical framework.
Figure 1: Framework of corporate reputation used in this
paper.

However, the dimensions of the proposed framework
are diverse in nature: some have a customer
orientation, such as “products”, whereas others have
an employee orientation, such as “workplace”. With
online reviews being the source of the analysis, it
can be argued that a single online review source such
as Trustpilot will not provide reviews that cover all
of the dimensions. However, the preliminary results
have shown that all dimensions are covered and
mentioned in reviews from Trustpilot, albeit to a
varying degree (more in section 3.2). As such, all
dimensions of Fombrun et al. (2015) are kept in the
analysis, because having more aspects to analyze
will not negatively affect the results, and even
dimensions that are less likely to be mentioned are
kept to provide a full picture. Furthermore, the
attributes and dimensions of the framework were
adapted to the customer language of online reviews,
that is, to how a customer would write these terms in
a review. For example, “value” was translated to
“price”, as this word is more likely to be used in a
review, and “Equal opportunities” was changed to
“equality”. This was done based on the definitions
Fombrun et al. (2015) provided.
According to Fombrun et al. (2015), corporate
reputation can be measured through the extensive
approach of measuring all attributes related to the
dimensions of “dimensions 1”. However, corporate
reputation can also be measured along the
dimensions of “dimensions 2”. To evaluate the validity
of the models, the averaged attribute result of each
dimension of “dimensions 1” was compared to the
direct result of that dimension. Furthermore, the
overall result of “dimensions 2” was compared to the
overall result of “dimensions 1”.
It can be concluded that corporate reputation and
identity both stem from corporate branding, with the
identity being a real and constant construct whereas
reputation is subjective and depends on the view of
the observer. This view of corporate reputation is
measured along the 34 attributes and dimensions of
this paper’s framework (see Figure 1).
3 METHODOLOGY
This research follows a mixed-method design
as described by Saunders et al. (2015), which is
explained in the following subsections.
3.1 Primary Research
For this paper, online reviews needed to be collected
that can be analyzed in the later stages using a
machine learning model that conducts ABSA.
Karalevicius et al. (2018), who carried out sentiment
analysis, tackled the primary research by using a
scraper that can collect social media and online
review data. For this research, a Python scraper
provided by CMI HVA was used to collect the data.
As a platform to scrape reviews from for sentiment
analysis, Vankka et al. (2019) opted for Trustpilot
because it aggregates companies with their respective
reviews. Trustpilot was therefore chosen as the source
of the online reviews. Finally, the companies whose
reviews are extracted and analyzed with ABSA are Nike
and Adidas, which amounts to over 2,000 scraped
online reviews. These companies were chosen because
they are prominent direct competitors, which makes it
possible to compare the results of the analysis and
derive conclusions; this is also relevant for
organizations that want to use the proposed work to
compare themselves with their competition.
Furthermore, both companies have a balanced number of
positive and negative reviews, which makes it possible
to detect weaknesses where reviewers are unsatisfied
and to deduce clear improvement points.
Figure 2: Starting dataset.
3.2 Secondary Research
Following the scraping of the review data, the online
reviews were analyzed using secondary research. All
analyses were conducted using Python. The
process follows de Kok et al. (2018), who
conduct ABSA with the same steps, with the
exception that they used a pre-annotated dataset
in which the aspects were already defined; for
this paper, the additional step of aspect term
extraction has to be added. De Kok et al. (2018) used
readily available restaurant review data that they did
not have to scrape and conducted ABSA on this data.
With their result they could evaluate which aspect
(e.g. food, ambience, service) the reviewer is
mentioning and whether the reviewer has a positive or
negative sentiment towards the aspect.
Figure 3: Cleaned dataset.
1 Cleaning Data:
First the dataset was loaded; Figure 2 shows a
snapshot.
In the following step, the column “stars” was
stripped of all unnecessary characters until only the
number of stars awarded by the reviewer was left.
Then, the columns “title” and “text” were combined and
these reviews were cleaned and lemmatized using
NLTK, in line with de Kok et al.
(2018). NLTK is one of the most popular and widely
used libraries in the field of natural language
processing (NLP) because of its simplicity and
effectiveness (Hardeniya, 2015, p. 3). The end result
is a dataset with two columns: the review text,
stripped of all non-letters and of low-value words
such as “the”, and the number of stars awarded
(Figure 3).
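
The cleaning step can be sketched in plain Python as follows. This is a minimal illustration and not the code used in this paper: the raw format of the “stars” field, the tiny stop-word set and the function names are assumptions made for the example, and lemmatization (done with NLTK in this paper) is left out so that the sketch runs with the standard library only.

```python
import re

# Hypothetical, deliberately tiny stop-word set; a full list such as
# NLTK's English stop words would be used in practice.
STOP_WORDS = {"the", "a", "an", "is", "was", "and", "it", "of", "to"}


def parse_stars(raw: str) -> int:
    """Keep only the number of stars from a raw field (format assumed)."""
    match = re.search(r"\d", raw)
    if match is None:
        raise ValueError(f"no star rating found in {raw!r}")
    return int(match.group())


def clean_review(title: str, text: str) -> str:
    """Combine title and text, drop non-letters and stop words."""
    combined = f"{title} {text}".lower()
    letters_only = re.sub(r"[^a-z\s]", " ", combined)
    tokens = [tok for tok in letters_only.split() if tok not in STOP_WORDS]
    return " ".join(tokens)


stars = parse_stars("Rated 4 out of 5 stars")
cleaned = clean_review("Great shoes!", "The delivery was fast.")
```

Taking the first digit of the raw field is only one possible way to isolate the rating; the appropriate rule depends on how the scraper stores the value.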
2 Aspect Term Extraction:
Figure 4: Example dependency parsing.
In this part, only the aspects were extracted. This is
an extra step that de Kok et al. (2018) did not go
through, as their dataset was pre-labeled and the
aspects were thus already extracted beforehand. For
example, in a review such as “The delivery went
smooth” the aspect is “delivery”. These aspects need to
be extracted so that the corresponding sentiment can
be assigned to them at the later stages. This was done
using the dependency parser of spaCy (Figure 4), which
is regarded as an industrial-strength natural language
processing library (Srinivasa-Desikan, 2018, p. 33);
researchers such as Bandhakavi et al. (2018) rely on
spaCy for aspect extraction. Poria et al. (2016)
propose a deep convolutional neural network;
however, supervised models are not suitable for this
paper, because manually labeling the dataset is too
time consuming.
Using spaCy it is possible to extract the aspect and
the corresponding sentiment, which is mostly an
adjective. Figure 5 illustrates how the aspects are
extracted.
Even though spaCy is an often used and efficient
way of extracting aspects, the major downside is that,
as used here, it only detects nouns as aspects. As such,
in a sentence like “It was delivered quickly” the
reviewer is mainly talking about the delivery, but
because it is expressed as a verb, spaCy will not
detect it as an aspect. However, it can be said that
most aspects the reviewer is talking about are
articulated as nouns, and therefore this drawback is
not expected to affect the results substantially.
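
The noun-based extraction rule and its limitation can be illustrated with a small sketch. To keep it runnable without spaCy or a language model, the parser output is written by hand as a list of tokens; the `Token` type, the function name and the hand-assigned tags are assumptions for the example (the labels `nsubj`, `acomp` and `amod` follow the naming spaCy uses for English, but the parse a real model returns for these sentences may differ).

```python
from typing import List, NamedTuple, Tuple


class Token(NamedTuple):
    """Hand-written stand-in for one token of a dependency parse."""
    text: str
    pos: str   # coarse part of speech, e.g. "NOUN", "ADJ"
    dep: str   # dependency label, e.g. "nsubj", "acomp", "amod"
    head: int  # index of the syntactic head within the sentence


def extract_aspect_opinions(tokens: List[Token]) -> List[Tuple[str, str]]:
    """Pair each adjective with the noun it describes, if there is one."""
    pairs = []
    for tok in tokens:
        if tok.pos != "ADJ":
            continue
        if tok.dep == "amod" and tokens[tok.head].pos == "NOUN":
            # Attributive adjective: "fast delivery"
            pairs.append((tokens[tok.head].text.lower(), tok.text.lower()))
        elif tok.dep == "acomp":
            # Predicative adjective: look for a noun subject of the same verb
            for cand in tokens:
                if cand.pos == "NOUN" and cand.dep == "nsubj" and cand.head == tok.head:
                    pairs.append((cand.text.lower(), tok.text.lower()))
    return pairs


# "The delivery went smooth" (tags assigned by hand)
noun_sentence = [
    Token("The", "DET", "det", 1),
    Token("delivery", "NOUN", "nsubj", 2),
    Token("went", "VERB", "ROOT", 2),
    Token("smooth", "ADJ", "acomp", 2),
]
# "It was delivered quickly": the delivery is expressed as a verb
verb_sentence = [
    Token("It", "PRON", "nsubjpass", 2),
    Token("was", "AUX", "auxpass", 2),
    Token("delivered", "VERB", "ROOT", 2),
    Token("quickly", "ADV", "advmod", 2),
]

noun_pairs = extract_aspect_opinions(noun_sentence)
verb_pairs = extract_aspect_opinions(verb_sentence)
```

The second sentence yields no pair because it contains no noun, which is the drawback described above.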
Figure 5: Example aspect extraction.
3 Training the Model:
This part is needed for the following sentiment
analysis and mostly relies on Scikit-learn. This
library is regarded as a state-of-the-art machine
learning library for supervised and unsupervised
problems (Pedregosa et al., 2011). Using Scikit-learn,
a Support Vector Machine (SVM) can be trained.
De Kok et al. (2018) trained and used an SVM for
aspect detection. Such a model creates a feature
vector of values for every instance that is to be
classified, and the model is taught using training
data to interpret these values. In the case of this
paper, the accuracy of the SVM was compared to that of
a Naïve Bayes model, with the latter performing
better, because an SVM generally needs more data to
train. The trained Naïve Bayes model takes in each
review and assigns each sentence to one or more of the
aspects that were extracted with spaCy. In other
words, spaCy lists all the aspects that are available
and the Naïve Bayes model classifies each review
sentence into one or more of those extracted aspects.
For example, for the review “Delivery was fast.
Unfortunately, the product was broken”, the model
assigns the first sentence to “delivery” and the
second sentence to “product”. In the later stages, the
sentiment is looked up within each assigned sentence.
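
How a Naïve Bayes classifier maps review sentences to aspects can be sketched with a toy multinomial model written in pure Python. This is an illustration only and not the Scikit-learn model used in this paper: the training sentences and their labels are invented, the class and function names are hypothetical, and the sketch assigns exactly one aspect per sentence, whereas the model described above may assign several.

```python
import math
import re
from collections import Counter, defaultdict
from typing import List


def tokenize(sentence: str) -> List[str]:
    return re.findall(r"[a-z]+", sentence.lower())


class TinyNaiveBayes:
    """Toy multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, sentences: List[str], labels: List[str]) -> "TinyNaiveBayes":
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for sentence, label in zip(sentences, labels):
            self.word_counts[label].update(tokenize(sentence))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, sentence: str) -> str:
        # Words never seen in training are ignored.
        tokens = [t for t in tokenize(sentence) if t in self.vocab]
        total_docs = sum(self.class_counts.values())
        best_label, best_logp = "", -math.inf
        for label, doc_count in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            logp = math.log(doc_count / total_docs)
            for tok in tokens:
                logp += math.log((counts[tok] + 1) / denom)
            if logp > best_logp:
                best_label, best_logp = label, logp
        return best_label


# Invented training data (assumption, for illustration only)
train_sentences = [
    "delivery was fast",
    "shipping took long",
    "product was broken",
    "shoes quality great",
]
train_labels = ["delivery", "delivery", "product", "product"]
model = TinyNaiveBayes().fit(train_sentences, train_labels)

review = "Delivery was fast. Unfortunately, the product was broken"
sentences = [s.strip() for s in re.split(r"[.!?]+", review) if s.strip()]
assigned = [model.predict(s) for s in sentences]
```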
4 Sentiment Analysis:
For this part of the analysis, a few modules were
first loaded, in line with Marrese-Taylor et al.
(2014), Ding et al. (2008), Rantanen et al. (2019) and
Urologin (2018).
First, the opinion lexicon by Hu & Liu (2004). These
researchers defined a list of around 6,800 positive
and negative words designed especially for sentiment
analysis, and this lexicon is widely adopted in the NLP
community. The researchers defined the lexicon by
mining and summarizing customer reviews, then
extracting the opinion sentences and predicting
whether the opinion is positive or negative. Using this
lexicon, the code can detect whether a word that an
aspect is associated with is positive or negative. This
lexicon was used in studies such as Marrese-
Taylor et al. (2014) and Ding et al. (2008), where it
was needed for aspect-based opinion mining. Another
possible opinion lexicon is SentiWordNet, which
automatically assigns a score to a term; however,
according to Na et al. (2009), this lexicon does not
handle the opinion scoring problem, and therefore the
lexicon of Hu & Liu was chosen for its simplicity.
Figure 6: Example replacing pronoun.
Second, Google’s pretrained word2vec model.
This model was published by researchers at Google
(Mikolov et al., 2013), who trained the model on 1.6
billion words, and it can recognize similarities
between words. Thus, it is able to find words that are
similar to the aspects and group them. This model is
also heavily utilized in textual analysis, with
researchers such as Rantanen et al. (2019) using it to
compute corporate reputation from social media
comments. Another state-of-the-art word embedding
method is Global Vectors for Word Representation
(GloVe), which was published by researchers at
Stanford (Pennington et al., 2014). However, according
to the study of Lin et al. (2015), in which word2vec
and GloVe were compared, word2vec performed better.
Third, a neural coreference model, which is used to
identify pronouns and replace them, as pronouns
themselves do not add any sentiment or weight. This
pretrained neural network model can detect
dependencies within a sentence and, for example,
understand which terms refer to each other. According
to Lee et al. (2017), this model outperforms all
previous related work. Replacing pronouns is an
important step according to Urologin (2018), as
pronouns are just placeholders for nouns and leaving
them in could affect the scoring of the sentiments. An
example is “The product is great, however it is
smelling bad”. The pronoun “it” refers to “product”;
without replacement, the sentiment “bad” would be
assigned to the term “it” and not to “product”. Using
the neural coreference model, “it” is replaced with
“product”. Figure 6 is an example of how a pronoun is
replaced.
In the next steps, the functions that make up the
analysis are defined. The first function checks the
similarity of words, that is, whether a term is
similar to an aspect. The maximum similarity is 1,
meaning it is the same word. The threshold is set at
0.3, which yielded results in which words perceived as
similar are detected while the threshold is neither
too strict nor too lenient. For example, “product”
and “brand” have a similarity of 0.50; thus “brand” is
similar to the aspect “product” and will later be
assigned to that aspect. Next, all the predefined
aspects of Figure 1 receive sentiment scores. First,
it is checked whether a sentiment word like “great” is
in the opinion lexicon, and a sentiment score of +/- 1
is given accordingly. If there is an adjective
modifier such as “incredibly”, the sentiment receives
a greater weight of +/- 1.5. Negations such as
“not good” have the score flipped. Figure 7 gives
an insight into the scoring.
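
The scoring rules described above can be sketched as follows. The word sets are small hypothetical stand-ins for the opinion lexicon of Hu & Liu (2004), and the choice to look at the two words preceding a sentiment word for modifiers and negations is an assumption of this sketch, not a detail taken from this paper.

```python
from typing import Sequence

# Tiny stand-ins for the opinion lexicon (assumption, for illustration only)
POSITIVE = {"great", "good", "fast", "smooth"}
NEGATIVE = {"bad", "broken", "slow"}
INTENSIFIERS = {"incredibly", "very", "extremely"}
NEGATIONS = {"not", "never", "no"}


def score_opinion(words: Sequence[str]) -> float:
    """Score the opinion words attached to one aspect."""
    score = 0.0
    for i, word in enumerate(words):
        if word in POSITIVE:
            value = 1.0
        elif word in NEGATIVE:
            value = -1.0
        else:
            continue
        preceding = words[max(0, i - 2):i]  # assumed context window
        if any(w in INTENSIFIERS for w in preceding):
            value *= 1.5  # modifier such as "incredibly" gives +/- 1.5
        if any(w in NEGATIONS for w in preceding):
            value = -value  # negation flips the score
        score += value
    return score


plain = score_opinion(["great"])
intensified = score_opinion(["incredibly", "great"])
negated = score_opinion(["not", "good"])
```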
Now the scores for all the aspects of Figure 1 are
accumulated, and the scores of the synonyms are
added as well. For example, if the aspect
“product” has been mentioned positively 10 times it
will have 10 points; if the synonym “brand” has been
mentioned positively 5 times, these 5 points will also
go to the aspect “product”, because according to
word2vec the similarity is above the threshold of 0.3.
The aspect “product” will then have 15 points overall.
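
The similarity threshold and the accumulation of synonym scores can be sketched together. The three-dimensional vectors below are invented stand-ins for word2vec embeddings, so the similarity values they produce are illustrative only; assigning a term to its single most similar aspect is likewise an assumption of this sketch.

```python
import math
from typing import Dict, Iterable, Sequence, Tuple

# Invented toy vectors standing in for word2vec embeddings (assumption)
TOY_VECTORS: Dict[str, Tuple[float, float, float]] = {
    "product": (1.0, 0.0, 0.0),
    "delivery": (0.0, 1.0, 0.0),
    "brand": (0.6, 0.0, 0.8),
    "weather": (0.0, 0.0, 1.0),
}


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def accumulate_scores(
    mentions: Iterable[Tuple[str, float]],
    aspects: Sequence[str],
    threshold: float = 0.3,
) -> Dict[str, float]:
    """Add each term's score to its most similar aspect above the threshold."""
    totals = {aspect: 0.0 for aspect in aspects}
    for term, score in mentions:
        if term not in TOY_VECTORS:
            continue
        similarities = {
            aspect: cosine_similarity(TOY_VECTORS[term], TOY_VECTORS[aspect])
            for aspect in aspects
        }
        best = max(similarities, key=similarities.get)
        if similarities[best] > threshold:
            totals[best] += score
    return totals


# "product" mentioned positively 10 times, its synonym "brand" 5 times,
# plus one unrelated term that falls below the threshold
mentions = [("product", 1.0)] * 10 + [("brand", 1.0)] * 5 + [("weather", 1.0)]
totals = accumulate_scores(mentions, ["product", "delivery"])
```

With these toy vectors the 5 points of “brand” are added to the 10 points of “product”, while “weather” is dropped because it is not similar enough to any aspect.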
Figure 7: Example sentiment scoring.
Figure 8: ABSA output for corporate reputation.
5 Visualization and Evaluation:
The last step is visualizing the results obtained in
the previous step. Using tables, bar graphs
or pie charts, the result of the sentiment analysis can
be plotted to give an overview.
Matplotlib and Plotly were used for this step. Figures
8-10 show example output of the analysis applied to
Adidas and Nike. Note that all aspects and scores
directly relate to the relevant variables and
dimensions in the corporate reputation model of
Fombrun et al. (2015) that was shown in Figure 1.
Figure 9: Aspects mentioned for corporate reputation.
The pos/neg ratio of aspects was calculated as follows
in order to measure corporate reputation overall as
well as within its dimensions and variables:
\[
\text{Pos ratio} = \frac{\sum \text{pos points per aspect}}{\sum (\text{pos} + \text{neg}) \text{ points per aspect}} \tag{1}
\]
\[
\text{Neg ratio} = \frac{\sum \text{neg points per aspect}}{\sum (\text{pos} + \text{neg}) \text{ points per aspect}} \tag{2}
\]
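
Equations (1) and (2) can be computed with a few lines of Python. The function name is hypothetical, and treating the negative points as a magnitude (so that both ratios lie between 0 and 1 and sum to 1) is an assumption of this sketch.

```python
from typing import Tuple


def sentiment_ratios(pos_points: float, neg_points: float) -> Tuple[float, float]:
    """Return (pos ratio, neg ratio) for one aspect or for a sum over aspects."""
    neg_magnitude = abs(neg_points)  # accept negative points as -5 or 5
    total = pos_points + neg_magnitude
    if total == 0:
        return 0.0, 0.0  # aspect was never mentioned with a sentiment
    return pos_points / total, neg_magnitude / total


pos_ratio, neg_ratio = sentiment_ratios(15, 5)
```

For an aspect with 15 positive and 5 negative points, the positive ratio is 15/20 and the negative ratio is 5/20.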
The overall relative positive sentiment score for
Nike’s corporate reputation was higher than that for
Adidas on this dataset. For evaluation purposes, we
checked this result against the average star rating of
Trustpilot reviewers, which was also available in our
scraped dataset and is publicly visible on the
website, and found it to be in line.
Figure 10: Aspect comparison for corporate reputation.
3.3 Validity
In order to validate our approach, we compared the
overall scores obtained on both sides of the corporate
reputation model (see dimensions 1 and dimensions 2
in Figure 1). On the left side of the corporate
reputation model (dimensions 1 in Figure 1), we also
compared, for each dimension, the average score over
all attributes of that dimension with the direct score on
the dimension; e.g., the average score on “High
quality”, “Good value”, “Stands behind” and “Meets
customer needs” is compared with that on Products.
For Nike and Adidas together, the difference in score
between dimensions 1 and dimensions 2 is 4%. The
difference in score between averaged attribute scores
and direct dimension scores is 7%. Although the
scraped dataset is not extremely large, these small
differences seem to support the internal validity of our
approach. Of course, such validity was already
reported by Fombrun et al. (2015) based on their
research with surveys. However, the important
contribution that we aim to make to the corporate
reputation literature is to be able to claim it for online
review data on the public web.
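The comparison between averaged attribute scores and direct dimension scores can be sketched as follows. The text does not specify how the reported differences were aggregated, so this sketch assumes a mean absolute difference over dimensions; the function name `attribute_vs_dimension_gap`, the placeholder dimension and attribute names other than those of Products, and all score values are hypothetical.

```python
from statistics import mean


def attribute_vs_dimension_gap(attribute_scores, dimension_scores):
    """Mean absolute gap between averaged attribute scores and direct scores.

    attribute_scores: {dimension: {attribute: score}}
    dimension_scores: {dimension: direct score on the dimension itself}
    Aggregation by mean absolute difference is an assumption of this sketch.
    """
    gaps = []
    for dimension, direct_score in dimension_scores.items():
        averaged = mean(attribute_scores[dimension].values())
        gaps.append(abs(averaged - direct_score))
    return mean(gaps)


# Hypothetical positive ratios in [0, 1]; not results from this study.
attribute_scores = {
    "Products": {
        "High quality": 0.80,
        "Good value": 0.60,
        "Stands behind": 0.70,
        "Meets customer needs": 0.70,
    },
    "OtherDimension": {"Attribute A": 0.50, "Attribute B": 0.70},
}
dimension_scores = {"Products": 0.75, "OtherDimension": 0.50}
gap = attribute_vs_dimension_gap(attribute_scores, dimension_scores)
```

A small gap indicates that the sentiment measured directly on a dimension agrees with the sentiment measured on its underlying attributes.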
4 CONCLUSION
In summary, our approach enables companies to
assign to each aspect, which can be flexibly defined
for corporate reputation or for another relevant
construct, a sentiment score derived from online
reviews on the web.
ACKNOWLEDGEMENTS
This paper was inspired by the MSc project of
Wolfgang Reitter, who was involved via the
master Digital Driven Business at HvA. Thanks go to
Jesse Weltevreden for providing some useful
suggestions on an initial version of this manuscript.
Rob Loke is assistant professor of data science at
CMIHvA.
REFERENCES
Abratt, R., & Kleyn, N. (2012). Corporate identity,
corporate branding and corporate reputations:
Reconciliation and integration. European Journal of
Marketing, 46(7/8), 1048–1063.
Agarwal, J., Osiyevskyy, O., & Feldman, P. M. (2015).
Corporate Reputation Measurement: Alternative Factor
Structures, Nomological Validity, and Organizational
Outcomes. Journal of Business Ethics, 130(2), 507–
507.
Araque, O., Corcuera-Platas, I., Sánchez-Rada, J. F., &
Iglesias, C. A. (2017). Enhancing deep learning
sentiment analysis with ensemble techniques in social
applications. Expert Systems with Applications, 77,
236–246.
Babu, A. J. G., & Suresh, N. (1996). Project management
with time, cost, and quality considerations. European
Journal of Operational Research, 88(2), 320–327.
Balmer, J. M. T. (2001). Corporate identity, corporate
branding and corporate marketing ‐ Seeing through the
fog. European Journal of Marketing, 35(3/4), 248–291.
Bandhakavi, A., Wiratunga, N., Massie, S., & Luhar, R.
(2018). Context Extraction for Aspect-Based Sentiment
Analytics: Combining Syntactic, Lexical and Sentiment
Knowledge. In M. Bramer & M. Petridis (Eds.),
Artificial Intelligence XXXV (Vol. 11311, pp. 357–
371). Springer International Publishing.
Barnett, M. L., Jermier, J. M., & Lafferty, B. A. (2006).
Corporate Reputation: The Definitional Landscape.
Corporate Reputation Review, 9(1), 26–38.
Biraghi, S., & Gambetti, R. C. (2015). Corporate branding:
Where are we? A systematic communication-based
inquiry. Journal of Marketing Communications, 21(4),
260–283.
Brahim, H. B., & Arab, M. B. (2011). The effect of
intangible resources on the economic performance of
the firm. Journal of Business Studies Quarterly, 3(1),
36–59.
Caviggioli, F., Lamberti, L., Landoni, P., & Meola, P.
(2020). Technology adoption news and corporate
reputation: Sentiment analysis about the introduction of
Bitcoin. Journal of Product & Brand Management.
Chen, T., Xu, R., He, Y., & Wang, X. (2017). Improving
sentiment analysis via sentence type classification
using BiLSTM-CRF and CNN. Expert Systems with
Applications, 72, 221–230.
Chun, R. (2005). Corporate reputation: Meaning and
measurement. International Journal of Management
Reviews, 7(2), 91–109.
Chung, S., Chong, M., Chua, J. S., & Na, J. C. (2019).
Evolution of corporate reputation during an evolving
controversy. Journal of Communication Management,
23(1), 52–71.
Cintamür, İ. G., & Yüksel, C. A. (2018). Measuring
customer based corporate reputation in banking
industry: Developing and validating an alternative
scale. International Journal of Bank Marketing, 36(7),
1414–1436.
Cornelissen, J., Thøger Christensen, L., & Kinuthia, K.
(2012). Corporate brands and identity: Developing
stronger theory and a call for shifting the debate.
European Journal of Marketing, 46(7/8), 1093–1102.
Crook, T. R., Ketchen, D. J., Combs, J. G., & Todd, S. Y.
(2008). Strategic resources and performance: A meta-
analysis. Strategic Management Journal, 29(11), 1141–
1154.
Dang, T. T., & Pham, A. D. (2020). What make banks’
front-line staff more customer oriented? The role of
interactional justice. International Journal of Bank
Marketing, ahead-of-print(ahead-of-print).
de Kok, S., Punt, L., van den Puttelaar, R., Ranta, K.,
Schouten, K., & Frasincar, F. (2018). Review-
aggregated aspect-based sentiment analysis with
ontology features. Progress in Artificial Intelligence,
7(4), 295–306.
Deloitte. (2013). The Digital transformation of customer
services: Our point of view.
https://www2.deloitte.com/content/dam/Deloitte/nl/Documents/technology/deloitte-nl-paper-digital-transformation-of-customer-services.pdf
Ding, X., Liu, B., & Yu, P. S. (2008). A holistic lexicon-
based approach to opinion mining. Proceedings of the
International Conference on Web Search and Web Data
Mining - WSDM ’08, 231.
Eloranta, V., & Turunen, T. (2015). Seeking competitive
advantage with service infusion: A systematic literature
review. Journal of Service Management, 26(3), 394–
425.
Firestein, P. J. (2006). Building and protecting corporate
reputation. Strategy & Leadership, 34(4), 25–31.
Fombrun, C. J., Ponzi, L. J., & Newburry, W. (2015).
Stakeholder Tracking and Analysis: The RepTrak®
System for Measuring Corporate Reputation.
Corporate Reputation Review, 18(1), 3–24.
Gotsi, M., & Wilson, A. M. (2001). Corporate reputation:
Seeking a definition. Corporate Communications: An
International Journal, 6(1), 24–30.
Hardeniya, N. (2015). NLTK Essentials. Packt Publishing
Ltd.
Hu, M., & Liu, B. (2004). Mining and summarizing
customer reviews. Proceedings of the 2004 ACM
SIGKDD International Conference on Knowledge
Discovery and Data Mining - KDD ’04, 168.
Jaidka, K., Khoo, C. S. G., & Na, J. (2013). Literature
review writing: How information is selected and
transformed. Aslib Proceedings, 65(3), 303–325.
Jebbara, S., & Cimiano, P. (2016). Aspect-Based Sentiment
Analysis Using a Two-Step Neural Network
Architecture. In H. Sack, S. Dietze, A. Tordai, & C.
Lange (Eds.), Semantic Web Challenges (Vol. 641, pp.
153–167). Springer International Publishing.
Kamasak, R. (2017). The contribution of tangible and
intangible resources, and capabilities to a firm’s
profitability and market performance. European
Journal of Management and Business Economics,
26(2), 252–275.
Karalevicius, V., Degrande, N., & De Weerdt, J. (2018).
Using sentiment analysis to predict interday Bitcoin
price movements. The Journal of Risk Finance, 19(1),
56–75.
Keh, H. T., & Xie, Y. (2009). Corporate reputation and
customer behavioral intentions: The roles of trust,
identification and commitment. Industrial Marketing
Management, 38(7), 732–742.
Koporcic, N., & Halinen, A. (2018). Interactive Network
Branding: Creating corporate identity and reputation
through interpersonal interaction. IMP Journal, 12(2),
392–408.
Kor, Y., & Mesko, A. (2013). Dynamic managerial
capabilities: Configuration and orchestration of top
executives’ capabilities and the firm’s dominant logic.
Strategic Management Journal, 34(2), 233–244.
Lange, D., Lee, P. M., & Dai, Y. (2011). Organizational
Reputation: A Review. Journal of Management, 37(1),
153–184. https://doi.org/10.1177/0149206310390963
Lee, K., He, L., Lewis, M., & Zettlemoyer, L. (2017). End-
to-end Neural Coreference Resolution. ArXiv E-Prints,
arXiv:1707.07045.
Lin, W., Dai, H., Jonnagaddala, J., Chang, N., Jue, T. R.,
Iqbal, U., Shao, J. Y., Chiang, I., & Li, Y. (2015).
Utilizing different word representation methods for
twitter data in adverse drug reactions extraction. 2015
Conference on Technologies and Applications of
Artificial Intelligence (TAAI), 260–265.
Lin, Y., & Wu, L.-Y. (2014). Exploring the role of dynamic
capabilities in firm performance under the resource-
based view framework. Journal of Business Research,
67(3), 407–413.
Marrese-Taylor, E., Velásquez, J. D., & Bravo-Marquez, F.
(2014). A novel deterministic approach for aspect-
based opinion mining in tourism products reviews.
Expert Systems with Applications, 41(17), 7764–7775.
Mayzlin, D., Dover, Y., & Chevalier, J. (2014).
Promotional Reviews: An Empirical Investigation of
Online Review Manipulation. American Economic
Review, 104(8), 2421–2455.
Melewar, T. C., Gotsi, M., & Andriopoulos, C. (2012).
Shaping the research agenda for corporate branding:
Avenues for future research. European Journal of
Marketing, 46(5), 600–608.
Mikolov, T., Sutskever, I., Chen, K., Corrado, G. S., &
Dean, J. (2013). Distributed Representations of Words
and Phrases and their Compositionality. In C. J. C.
Burges, L. Bottou, M. Welling, Z. Ghahramani, & K.
Q. Weinberger (Eds.), Advances in Neural Information
Processing Systems 26 (pp. 3111–3119). Curran
Associates, Inc.
Molloy, J. C., & Barney, J. B. (2015). Who Captures the
Value Created with Human Capital? A Market-Based
View. Academy of Management Perspectives, 29(3),
309–325.
Na, S.-H., Lee, Y., Nam, S.-H., & Lee, J.-H. (2009).
Improving Opinion Retrieval Based on Query-Specific
Sentiment Lexicon. In M. Boughanem, C. Berrut, J.
Mothe, & C. Soule-Dupuy (Eds.), Advances in
Information Retrieval (Vol. 5478, pp. 734–738).
Springer Berlin Heidelberg.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V.,
Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P.,
Weiss, R., Dubourg, V., Vanderplas, J., Passos, A.,
Cournapeau, D., Brucher, M., Perrot, M., & Duchesnay,
É. (2011). Scikit-learn: Machine Learning in Python.
Journal of Machine Learning Research, 12(85), 2825–
2830.
Pennington, J., Socher, R., & Manning, C. D. (2014).
GloVe: Global Vectors for Word Representation.
Empirical Methods in Natural Language Processing
(EMNLP), 1532–1543.
Podnar, K. (2015). Corporate communication, a marketing
viewpoint. Routledge.
Poria, S., Cambria, E., & Gelbukh, A. (2016). Aspect
extraction for opinion mining with a deep convolutional
neural network. Knowledge-Based Systems, 108, 42–
49.
Puncheva‐Michelotti, P., & Michelotti, M. (2010). The role
of the stakeholder perspective in measuring corporate
reputation. Marketing Intelligence & Planning, 28(3),
249–274.
Rantanen, A., Salminen, J., Ginter, F., & Jansen, B. J.
(2019). Classifying online corporate reputation with
machine learning: A study in the banking domain.
Internet Research, 30(1), 45–66.
Rindova, V. P., Williamson, I. O., & Petkova, A. P. (2010).
Reputation as an Intangible Asset: Reflections on
Theory and Methods in Two Empirical Studies of
Business School Reputations. Journal of Management,
36(3), 610–619.
Roy, S. K., & Shekhar, V. (2010). Dimensional hierarchy
of trustworthiness of financial service providers.
International Journal of Bank Marketing, 28(1), 47–64.
Saunders, M., Lewis, P., & Thornhill, A. (2015). Research
methods for business students (7th ed.).
Schwaiger, M., Raithel, S., Rinkenburger, R., Schloderer,
M., Burke, R. J., Martin, G., & Cooper, C. L. (2011).
Measuring the Impact of Corporate Reputations on
Stakeholder Behavior. In Corporate Reputation.
Managing Opportunities and Threats (pp. 61–88).
Ashgate Publishing Limited.
Sequeira, N., da Silva, R. V., Ramos, M., & Alwi, S. F. S.
(2015). Measuring Corporate Reputation in B2B
Markets: The Corporate Personality Adapted Scale.
IUP Journal of Knowledge Management, 13(3), 31–63.
Business Premium Collection; Engineering Database.
Shee, P. S. B., & Abratt, R. (1989). A new approach to the
corporate image management process. Journal of
Marketing Management, 5(1), 63–76.
Srinivasa-Desikan, B. (2018). Natural Language
Processing and Computational Linguistics: A practical
guide to text analysis with Python, Gensim, spaCy, and
Keras. Packt Publishing Ltd.
Urologin, S. (2018). Sentiment Analysis, Visualization and
Classification of Summarized News Articles: A Novel
Approach. International Journal of Advanced
Computer Science and Applications, 9(8).
Vankka, J., Myllykoski, H., Peltonen, T., & Riippa, K.
(2019). Sentiment Analysis of Finnish Customer
Reviews. 2019 Sixth International Conference on
Social Networks Analysis, Management and Security
(SNAMS), 344–350.
Vig, S., Dumičić, K., & Klopotan, I. (2017). The Impact of
Reputation on Corporate Financial Performance:
Median Regression Approach. Business Systems
Research Journal, 8(2), 40–58.
Walker, K. (2010). A Systematic Review of the Corporate
Reputation Literature: Definition, Measurement, and
Theory. Corporate Reputation Review, 12(4), 357–387.
Wepener, M., & Boshoff, C. (2015). An instrument to
measure the customer-based corporate reputation of
large service organizations. Journal of Services
Marketing, 29(3), 163–172.
DATA 2021 - 10th International Conference on Data Science, Technology and Applications