Author:
Noriyuki Okumura
Affiliation:
National Institute of Technology, Akashi College, Japan
Keyword(s):
Kaomoji, Emoticon, Original Form, N-gram, Kaomoji’s Dictionary, Annotation.
Related Ontology Subjects/Areas/Topics:
Applications; Applications and Case-studies; Artificial Intelligence; Data Engineering; e-Business; Enterprise Engineering; Enterprise Information Systems; Enterprise Ontology; Knowledge Engineering and Ontology Development; Knowledge-Based Systems; Natural Language Processing; Ontologies and the Semantic Web; Pattern Recognition; Symbolic Systems
Abstract:
In this paper, we construct a large-scale knowledge base that records the base form of each kaomoji (emoticon)
together with its component elements (eyes, nose, mouth, and so on), so that the features of kaomoji can be analyzed in detail. Previous
methods for analyzing kaomoji mainly aim either to extract kaomoji from sentences, paragraphs, or documents, or to
classify kaomoji into emotion classes according to the emotion that a kaomoji expresses or potentially conveys.
We define the base form of kaomoji to support detailed kaomoji analysis. Application systems can estimate other
features of a derivative kaomoji from its base form and its other elements, for use in sentiment analysis, emotion
extraction, or kaomoji classification. We annotated about 40,000 distinct kaomoji to construct the large-scale
knowledge base, from which about 3,000 base forms were extracted. In experimental evaluations
based on cosine similarity using N-gram based features and simple Skip-gram based features, we show that
the model can estimate the base form of kaomoji with an accuracy of about 50%.
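
The similarity-based estimation described above can be illustrated with a minimal sketch. It assumes character N-grams of length 1 and 2 as features and picks the candidate base form with the highest cosine similarity. The function names, the `n_max` parameter, and the three base forms are hypothetical and chosen only for illustration. The abstract does not specify the exact feature design, and the Skip-gram based features are not reproduced here.

```python
from collections import Counter
from math import sqrt


def char_ngrams(text, n_max=2):
    """Count all character n-grams of length 1..n_max in text."""
    counts = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def estimate_base_form(kaomoji, base_forms, n_max=2):
    """Return the candidate base form most similar to the kaomoji."""
    target = char_ngrams(kaomoji, n_max)
    return max(base_forms, key=lambda b: cosine(target, char_ngrams(b, n_max)))


# Hypothetical base forms, not entries from the paper's knowledge base.
BASE_FORMS = ["(^_^)", "(T_T)", "(>_<)"]

print(estimate_base_form("(^_^)/", BASE_FORMS))
print(estimate_base_form("(T_T)~", BASE_FORMS))
```

In this toy setting a derivative kaomoji with an added arm or tear mark still shares most of its N-grams with its base form, which is why a nearest-neighbor lookup is a plausible baseline. With roughly 3,000 candidate base forms, many of which differ by a single character, the task is considerably harder.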