EGAN: Generatives Adversarial Networks for Text Generation with Sentiments

Andres Pautrat-Lertora, Renzo Perez-Lozano and Willy Ugarte

Universidad Peruana de Ciencias Aplicadas (UPC), Lima, Peru

Keywords: GAN, Text Generation, NAO.

Abstract:

In recent years, communication with computers has advanced enormously; the robot Sophia, for example, surprised many people with its human-like interactions. Behind this kind of robot there is a machine learning model that generates text to interact with others, but text generation with sentiments has received little research attention. A GAN is a promising option for this new problem because its discriminator and generator compete in search of the optimal solution. In this paper, we present a GAN model that generates text with different emotions, trained on a dataset compiled from tweets labeled with emotions, and deploy it on a NAO robot that speaks the text in short phrases in response to voice commands. The model is evaluated with methods that are popular in text generation, such as BLEU, and a human experiment is additionally carried out to assess quality and sentiment accuracy.

1 INTRODUCTION

Text generation is a demanding computational task that in recent years has found many uses, such as improvements in virtual assistants and human-robot interaction (HRI) with more elaborate dialogues. However, the generated text is not fully accurate and contains unrealistic phrases, as (Huszar, 2015) notes, and these problems become more frequent as the generated text gets longer. Even though the objective is human interaction, sentiments are usually not considered, although they are an important part of a real conversation. Nowadays there are many good models whose main function is text generation: some are transformers, which do not consider sentiment but can be modified to do so, and the best known is GPT3, which has a variety of uses.

Many people have incorporated voice assistants as a tool in their daily lives to control appliances, play multimedia content, create notes or reminders, etc. Their relevance can be seen in (Newman, 2019), which indicates that in 2018 14% of adults in the United States regularly used one of these devices, while in the United Kingdom the figure was 10%. The addition of sentiments will also bring a better experience to the user: interactions can be more personal and bring benefits, as explained by Gartner, such as recognizing the user's feeling and answering according to the user's emotion.

Generating coherent text is always a hard task. Every language has a structure, a grammar and a correct word order that must be respected to be understood; furthermore, a single sentence can mention many topics, and logic is needed to avoid jumping from one to another without context. For that reason, a good text generation model needs to capture many attributes of the existing sentence and then process them to generate more content. Adding an emotion to the generation makes this work even harder: there are more attributes to process, and each word has to take into account the sentiment of the previous text. The original GAN architecture has just two models inside, the generator and the discriminator, which evaluates how realistic the generated samples are. In our model there is another attribute to evaluate, the sentiment; for this we use a third model specialized in sentiment analysis and leave the discriminator to its original task.

There are different solutions for text generation that use a model based on LSTM (long short-term memory), like (Cai et al., 2021). LSTM is a variation of an RNN (recurrent neural network), and it is used because this kind of network works with a sequence of data instead of single data values, which makes it well suited to generating new content from a context such as a sentence; one model that uses this method is (Li et al., 2018). However, such models only work well with a large amount of data, which is not always available, so GANs (generative adversarial networks) are used because they can perform well with a smaller dataset, as can be seen in models like (Rizzo and Van, 2020). On the other hand, some researchers have gone further and added a sentiment label to the generative model (Li et al., 2018). Over time, many works have sought to improve text generation to make it indistinguishable from text written by a human.

Emotion AI will personalize interactions - Gartner - https://www.gartner.com/smarterwithgartner/emotion-ai-will-personalize-interactions

Pautrat-Lertora, A., Perez-Lozano, R. and Ugarte, W.

EGAN: Generatives Adversarial Networks for Text Generation with Sentiments.

DOI: 10.5220/0011548100003335

In Proceedings of the 14th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (IC3K 2022) - Volume 1: KDIR, pages 249-256

ISBN: 978-989-758-614-9; ISSN: 2184-3228

We develop a GAN-based model composed of two discriminators based on convolutional networks: one recognizes whether the evaluated text is real or fake, and the other determines the emotion of the text by extracting the characteristics of the sentence. Finally, a generator model based on LSTM is trained with the output of the other two models, trying to make them fail with the generated text.

Our contributions are as follows:

• We develop a multilabel text generation approach with sentiments.

• We build an implementation of a GAN model for text generation with sentiments.

• We present an analysis and comparison of our model with other similar models.

In Section 2, solutions on similar subjects are compared to ours. In Section 3, we explain the architectures and algorithms related to GAN models and text generation, and then present the structure of the developed algorithm and the contributions made. In Section 4, the experiments and results are explained in detail. Finally, the conclusions are presented in Section 5.

2 RELATED WORKS

Many projects have proposed different solutions using their own implementation of GAN, with details that make each model unique. We used some of these models as inspiration for our implementation; they are described next.

The model proposed in (Wu and Wang, 2020), named TG-SeqGAN, is based on SeqGAN (Yu et al., 2017) with the addition of a Truth-Guided method to bring the generated text closer to the real data. This model includes an initial state where a transfer model and a cost function, defined as the distance between the generated text and the real text, are used to find the next value more quickly, with a CNN model to extract the characteristics and context of words. In our model, we use a CNN model as the discriminator, and an additional classifier model is added to evaluate the sentiment and generate sentences for a specific input.

In (Li et al., 2018) a novel framework is proposed, where the authors train a GAN model for text generation with categories based on SeqGAN (Yu et al., 2017), with an LSTM as the generator and an RNN model acting as the discriminator for veracity and as the classifier for the category. Inspired by this work, we have implemented an LSTM model as our generator and separated the discriminator and the classifier, but for these last two models we have used a CNN to obtain more accurate predictions.

The CD-GAN model for text generation proposed in (Yan et al., 2021) uses an LSTM generator and a CNN as an augmented discriminator, meaning that the discriminator evaluates each word of the sentence individually and finds the incoherences in the sentences, with the objective of avoiding pre-training. This is a novel technique and the CNN used performs well on sequence classification, but we consider that the individual word analysis is too complicated and does not bring enough benefits, so we use only one classification for the whole sentence.

EmoSen (Firdaus et al., 2020) is an end-to-end framework for generating dialogues while controlling the sentiment (happy, sad, angry, etc.) and the feeling (positive, negative, or neutral); it does so by analyzing reference text, audio, and visual cues. This model can handle many sentiments, feelings, and contexts because of its training dataset, the Sentiment and Emotion aware Multimodal dataset (SEMD), which makes the dialogue system robust. Like EmoSen, our proposed model uses controlled training for the sentiments of the text, but it does not consider the feeling as positive, negative, or neutral; instead, we present six sentiments: enjoyment, sadness, love, anger, surprise, and fear.

Finally, in (Chen et al., 2020) the CTGAN model is proposed, based on SeqGAN; it adds the ability to generate text of variable length with a sentiment label (positive, negative, or neutral). A word replacement algorithm is also used to guarantee the quality of the generated text in a specific context. In that work, a single discriminator evaluates both whether the text is real or generated and whether the sentiment is accurate. In our project these functions are separated into two models, so that one specializes in evaluating whether the text is real or fake and the other in whether the sentiment is accurate, because a single model that handles both tasks would have many parameters that could be lost during training.

KDIR 2022 - 14th International Conference on Knowledge Discovery and Information Retrieval

3 METHOD

In this section we present the main concepts and architecture for our proposal.

3.1 Preliminary Concepts

Text generation with GAN models for human-robot interaction poses many challenges in the development process. The main problem is the adaptation and modification of GAN models to generate text classified with a sentiment label, but this brings other problems, such as how to generate the text, how to identify the sentiment, and how to carry out the interaction with the user. This section explains the necessary background.

3.1.1 Text Generation

Text generation is a subarea of natural language processing, so it draws on knowledge from computational linguistics and artificial intelligence; its objective is to generate coherent and readable text. To make this task easier for the computer, the words are usually tokenized: each word in a dictionary is given a corresponding number, and the input sentences are transformed into a new format such as one-hot encoding, where an array of the size of the dictionary is filled with ones and zeros depending on the words in the sentence. Many models can be used for this objective; one that performs well in many projects, like (Cai et al., 2021), is the long short-term memory (LSTM) network, which can process many data items together, meaning that it can handle multiple words at once. One of the main characteristics of an LSTM is the state it remembers over time and uses for the following generation steps; this is done by implementing loops and gates within the model, as can be seen in Fig. 1.
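The tokenization and one-hot encoding described above can be illustrated with a minimal sketch. The function names and the toy sentences are assumptions made for illustration only; they are not taken from the project code, and here each token gets its own one-hot row.

```python
def build_vocab(sentences):
    """Map each distinct word to an integer index, in order of first appearance."""
    vocab = {}
    for sentence in sentences:
        for word in sentence.lower().split():
            if word not in vocab:
                vocab[word] = len(vocab)
    return vocab


def encode(sentence, vocab):
    """Replace each word of the sentence with its integer index."""
    return [vocab[word] for word in sentence.lower().split()]


def one_hot(indices, vocab_size):
    """Return one row per token, with a single 1 at the token's index."""
    return [[1 if i == idx else 0 for i in range(vocab_size)] for idx in indices]


# Toy data (hypothetical, not from the dataset used in the paper).
sentences = ["i feel happy", "i feel sad"]
vocab = build_vocab(sentences)
ids = encode("i feel sad", vocab)
rows = one_hot(ids, len(vocab))
```

In practice an embedding layer usually consumes the integer indices directly, so the explicit one-hot rows are mostly useful for understanding the representation.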

3.1.2 Sentiment Analysis

According to EmoShape, sentiment analysis refers to the machine's ability to understand the feeling that the user wants to express, which can be done through image recognition, speech recognition, or text analysis. When it comes to a text format, the task becomes a natural language processing job, where machine learning or deep learning techniques are needed. As mentioned, tokenization is very important when text has to be processed, and LSTM models (Hochreiter and Schmidhuber, 1997) do a good job, but other options such as gated recurrent units (GRU) and convolutional neural networks (CNN) are good too, as in (Liu et al., 2020), where a CNN model is used to identify the sentiment of a text. Another model that follows this criterion is SentiGAN (Wang and Wan, 2018), which uses multiple generators and one multi-class discriminator; its generators are trained simultaneously, aiming at generating texts of different sentiment labels without supervision.

EmoShape: Emotion Synthesis for Metaverse - https://emoshape.com/

Figure 1: LSTM model (Zia and Zahid, 2019).

Figure 2: Sentiment analysis in training (own elaboration).

The generator model passes the words as integers and the classifier identifies which of the sentiments they belong to. The words, together with their labels, then re-enter the generator, and the process repeats so that they pass through the generator and the classifier consecutively (Fig. 2).

3.1.3 Generative Adversarial Networks(GAN)

GAN models were proposed in (Goodfellow et al., 2014) and are based on two models that compete with each other: a generator of content, and a discriminator that is responsible for checking whether the evaluated content was generated by the first model or is real data from the dataset used. The structure of this first architecture is presented in Fig. 3. Goodfellow developed GANs to generate images, but because of their good results these models gained a lot of popularity, and other researchers started using them for the same purpose or adapting them to other research fields. One of the first published works that applied the GAN model to text generation was (Firdaus et al., 2020), where an adaptation of the GAN model for images is used to generate sequences instead of images, making it possible to generate text or music compositions. Another work that uses GANs is DGSAN (Montahaei et al., 2021), an improvement on the previous model that fixes the gradient step problem and uses two models in the same network to generate text, obtaining results greater than 90 percent in BLEU-3.

Figure 3: GAN architecture (Goodfellow et al., 2014).
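For reference, the competition between generator and discriminator is usually written as the minimax objective of (Goodfellow et al., 2014), shown here in its standard form. The symbols are the usual ones and are not defined elsewhere in this paper: G is the generator, D the discriminator, p_data the distribution of real data and p_z the noise distribution fed to the generator.

```latex
\min_{G} \max_{D} V(D, G) =
  \mathbb{E}_{x \sim p_{\text{data}}(x)}\left[\log D(x)\right]
  + \mathbb{E}_{z \sim p_{z}(z)}\left[\log\left(1 - D(G(z))\right)\right]
```

The discriminator tries to maximize this value by scoring real samples high and generated samples low, while the generator tries to minimize it by producing samples that the discriminator scores as real.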

3.1.4 Human Robot Interaction (HRI)

This refers to the study of any interaction between humans and robots. In recent years, interactions with robots have improved, as in the example of Sophia, where the objective is to make a robot with a human-like appearance to interact with humans. One area of HRI is verbal communication that can be translated into text; here artificial intelligence is needed to create coherent and understandable interactions. In interactions between machines and humans, a better experience for the user can be reached by adding emotional analysis and a proper response to the user's feeling (see Fig. 4).

3.2 Main Contribution

In this work on text generation based on sentiments, we propose a GAN model with a discriminator model to evaluate the veracity of the text and a classifier model to evaluate the sentiment of the sentence, with the objective of training a text generator model that can receive a sentiment and generate a coherent sentence corresponding to it.

Figure 4: NAO robot interaction.

3.2.1 GAN for Text Generation

In this section, we explain the structure and function of the different models used inside the GAN: the generator, the discriminator, and the classifier. This GAN model is based on models such as CatGAN (Liu et al., 2020) and SeqGAN (Yu et al., 2017), adding a second discriminator for sentiment analysis in the first and tuning the model parameters and structures to obtain better results in both cases. In addition, this structure uses thresholds to train the discriminator with noisy sentences, which yields a better discriminator and classifier.

Before working with the sentences, each one is transformed from words to integers with a dictionary and encoded with the one-hot method so that it is easier to manage inside the models.

Discriminator. The objective of this model is to differentiate between generated sentences and original ones, and then use the results to perform a training step on the generator. Once this is done, the generator will be able to generate better sentences to fool the discriminator.

This model is based on a CNN; its structure is presented in Fig. 5. The first layer is an embedding layer, as is usual in natural language processing (NLP); it makes the large input vector produced by the one-hot encoding easier for the model to manage. Next, we use four separate 2D convolutional layers, each with one input channel and 300 output channels, to extract features from the sentence, and apply max pooling to each. The results are concatenated and passed to a linear layer with input and output size of 1200 to evaluate the features extracted by the convolutional layers. Then an activation function is applied, where x is the result of the previous linear layer and the function f(x) can be expressed as follows:

$$f(x) = \frac{1}{1+e^{-x}} \times \max(0, x) \ast \left(1 - \frac{1}{1+e^{-x}}\right)$$
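As a small numerical illustration, the activation can be evaluated in plain Python. This sketch assumes the formula reads as the sigmoid of x, times max(0, x), times one minus the sigmoid of x; that reading, and the function names, are assumptions made here rather than details confirmed by the project code.

```python
import math


def sigmoid(x):
    """Logistic function 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def relu(x):
    """max(0, x)."""
    return max(0.0, x)


def activation(x):
    """f(x) = sigmoid(x) * max(0, x) * (1 - sigmoid(x)), under the stated reading."""
    s = sigmoid(x)
    return s * relu(x) * (1.0 - s)


# Non-positive inputs are mapped to zero because of the max(0, x) factor.
values = [activation(x) for x in (-2.0, 0.0, 1.0, 4.0)]
```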


Figure 5: Discriminator structure (own elaboration).

Figure 6: Generator structure (own elaboration).

Classifier. The purpose of this model is to evaluate sentences and classify the sentiment present in them; this information is also used to train the generator. This model has a similar function to the discriminator but with more labels. Although the tasks of the discriminator and the classifier could both be done by one model, separating them into single specialized models ensures the efficiency of each one in its task and results in better training for the generator model.

Because the discriminator and the classifier have similar objectives, the structure used is the same; only the output varies, depending on the number of sentiments being evaluated.

Generator. The generator is the main model to train; the work of the discriminator and the classifier serves to help train the generator. This model receives a sentiment and, if it already exists, part of the sentence, and generates the next word of the sentence or a period to finish it. The objective of the training is for this model to generate coherent sentences that convey the given feeling.

The model is based on an LSTM because of its efficiency in generating text; the architecture developed is shown in Fig. 6. As in the previous models, the first layer is an embedding layer, with an input size equal to the dictionary size and an output size of 32. Then the sentiment label is added and the information is passed to the LSTM layer, with an input and output size of 32. Finally, a linear layer produces an array of the size of the dictionary, and a softmax function determines the next word of the sentence.
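The last stage of the generator, turning the dictionary-sized output into the next word and stopping at a period, can be sketched as follows. This is only an illustration: `step_fn` is a hypothetical stand-in for the embedding, LSTM and linear layers, and picking the most probable word (greedy decoding) is an assumption, since the text does not say whether the word is chosen greedily or sampled.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def next_word(logits, index_to_word):
    """Pick the most probable word (greedy choice, an assumption)."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return index_to_word[best], probs[best]


def generate(step_fn, index_to_word, max_len=20, end_token="."):
    """Generate words until the end token or max_len is reached.

    step_fn(words) must return one score per dictionary entry.
    """
    words = []
    for _ in range(max_len):
        word, _ = next_word(step_fn(words), index_to_word)
        words.append(word)
        if word == end_token:
            break
    return words


# Toy stand-in model: always favors the dictionary entry at position len(words).
toy_vocab = ["i", "feel", "happy", "."]
toy_step = lambda words: [1.0 if i == min(len(words), 3) else 0.0 for i in range(4)]
sentence = generate(toy_step, toy_vocab)
```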

Training. At each training step, the sentences are transformed from words to integers with a dictionary of words as $X = [x_1, x_2, \ldots]$, and a one-hot encoding transforms them into an array of the size of the dictionary with ones and zeros depending on the words in the sentence. Before starting the GAN training, each model is pre-trained with only the real data; this gives the models an initial state so that they do not start by generating and evaluating sentences randomly. The final GAN training has four steps: first, a sentence is generated; second, the discriminator and the classifier evaluate the sentence; third, the three models are fitted; and finally, the discriminator and the generator are trained with a real sentence. The model is built with PyTorch, with a learning rate of 0.01 for the pre-training and 0.0001 for the GAN training, and a batch size of 8. The training was performed with 150 pre-training epochs and 2000 GAN training epochs, carried out in 3 separate runs of 4 hours each on a Tesla P100.
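The order of the four GAN training steps can be summarized in a framework-free skeleton. All five callables are hypothetical placeholders for the real PyTorch models and update routines, and `CONFIG` simply collects the hyperparameters stated in the text; none of these names come from the project code.

```python
# Hyperparameters as stated in the text (the dictionary keys are illustrative names).
CONFIG = {
    "pretrain_lr": 0.01,
    "gan_lr": 0.0001,
    "batch_size": 8,
    "pretrain_epochs": 150,
    "gan_epochs": 2000,
}


def gan_training_step(generate, discriminate, classify, fit_models, fit_on_real,
                      sentiment, real_sentence):
    """Run one GAN training step in the order described in the text.

    generate, discriminate, classify, fit_models and fit_on_real are
    hypothetical callables standing in for the actual models.
    """
    fake = generate(sentiment)                        # 1. generate a sentence
    realism = discriminate(fake)                      # 2a. real-vs-generated score
    predicted = classify(fake)                        # 2b. predicted sentiment
    fit_models(fake, realism, predicted, sentiment)   # 3. fit the three models
    fit_on_real(real_sentence, sentiment)             # 4. train with a real sentence
    return fake, realism, predicted
```

Passing the models in as callables keeps the sketch independent of any deep learning library; in a real implementation each placeholder would wrap a forward pass, a loss and an optimizer step.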

3.2.2 Connection Nao Robot

For the connection with the NAO robot, we needed to use its specific IP address on the local network to send it commands with the NAOqi library. This library contains all the functions that can be used with the NAO robot and is only available in Python 2, but our model was developed in Python 3, so several connected scripts were made. In the first step, a subscription to ALTextToSpeech was necessary for the robot to say the instructions, and to ALSpeechRecognition to recognize from the user's voice which sentiment is indicated. In the second step, the sentiment is sent to Python 3 to generate the sentence in English with the correct label. Here it is necessary to clarify that we had to present the sentences in Spanish due to academic requirements of our institution, so we used a translation model; for security reasons, the network to which the NAO robot is connected has no internet access, so translation through online APIs was not possible. For this reason, we use the Argos model for offline translation from English to Spanish, which relies on the OpenNMT toolkit, SentencePiece for tokenization, and Stanza for sentence boundary detection. In the final step, the translated text is sent to a Python 2 script, which uses ALTextToSpeech to speak the generated phrase.
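One possible way to bridge the Python 3 model and the Python 2 NAOqi scripts is to launch the Python 2 script as a subprocess. The sketch below is an assumption about how such a bridge could look, not a description of the scripts actually used: the file name `say_py2.py`, the `python2` executable name and the helper functions are hypothetical, while `ALProxy` and `ALTextToSpeech.say` are part of the NAOqi Python SDK.

```python
import subprocess

# Contents of the Python 2 helper script (hypothetical file name: say_py2.py).
# It must run under Python 2 because the NAOqi SDK is only available there.
PY2_SAY_SCRIPT = '''
import sys
from naoqi import ALProxy

ip, text = sys.argv[1], sys.argv[2]
tts = ALProxy("ALTextToSpeech", ip, 9559)  # 9559 is the default NAOqi port
tts.say(text)
'''


def build_say_command(robot_ip, text, python2="python2", script="say_py2.py"):
    """Build the command line that asks the Python 2 script to speak the text."""
    return [python2, script, robot_ip, text]


def say_on_robot(robot_ip, text):
    """Python 3 side: run the Python 2 helper and raise if it fails."""
    subprocess.check_call(build_say_command(robot_ip, text))
```

Passing the arguments as a list avoids shell quoting problems with generated sentences; sockets or files would work equally well as the link between the two interpreters.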

4 EXPERIMENTS

4.1 Experimental Protocol

To make the experimentation process reproducible, we describe the hardware, dataset, parameters, and validation of the results of our project.

Argos Open Tech - https://www.argosopentech.com/


4.1.1 Development Environment

The main model training environment was Google Colab with the Pro subscription; this tier of the platform was needed mainly for its extended runtimes compared to the Free version. The service offers a Tesla T4 or a Tesla P100 GPU and 25 GB of RAM.

4.1.2 Dataset

The dataset used was the Emotions dataset for NLP found on Kaggle, a large compilation of sentences, each labeled with one of 6 emotions, with a total of 16000 sentences. The dataset was pre-processed: symbols were deleted and the sentences were separated by sentiment into groups of 700 so that each sentiment had the same number of sentences. The number of sentences was reduced because the resources provided by Google Colab Pro were not enough to train with all the data.
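The pre-processing described above (removing symbols and keeping the same number of sentences per sentiment) could look roughly like the sketch below. It assumes each line of the dataset file has the form `text;label`, which is an assumption about the file layout; the function names and the exact cleaning rule are also illustrative.

```python
import re
from collections import defaultdict


def clean(text):
    """Lowercase the text and drop everything except letters, digits and spaces."""
    text = re.sub(r"[^a-z0-9 ]", " ", text.lower())
    return re.sub(r"\s+", " ", text).strip()


def balance(lines, per_label=700, sep=";"):
    """Group cleaned sentences by label, keeping at most per_label of each."""
    groups = defaultdict(list)
    for line in lines:
        text, _, label = line.strip().rpartition(sep)
        if text and len(groups[label]) < per_label:
            groups[label].append(clean(text))
    return dict(groups)


# Toy example with a cap of 2 sentences per label instead of 700.
sample = ["i feel great!;joy", "so sad...;sadness", "what a day;joy", "yay;joy"]
balanced = balance(sample, per_label=2)
```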

4.1.3 Models Training

The model was developed using PyTorch and trained on a Google Colab Pro GPU, which has a maximum runtime of 24 hours and sometimes less. To work around this limit, the state of the model was saved every 20 epochs; if the runtime ended, a new one was started manually, the last state was loaded, and the training continued. In total, the training took around 48 hours. As for the parameters of the model, we trained for 500 epochs with a batch size of 8, a vocabulary size of 15213, and a learning rate of 0.0001 for the generator, discriminator, and classifier.
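The save-and-resume scheme described above can be sketched independently of the deep learning framework. In this illustration the saved state is only the epoch counter written to a JSON file; a real PyTorch project would also store the model and optimizer weights. The function name, the file format and the `train_epoch` callback are assumptions.

```python
import json
import os


def train_with_checkpoints(total_epochs, path, train_epoch, save_every=20):
    """Resume from `path` if it exists, then checkpoint every `save_every` epochs.

    Returns the last epoch that was written to the checkpoint file.
    """
    state = {"epoch": 0}
    if os.path.exists(path):
        with open(path) as fh:
            state = json.load(fh)
    for epoch in range(state["epoch"] + 1, total_epochs + 1):
        train_epoch(epoch)  # hypothetical callback: runs one epoch of training
        if epoch % save_every == 0:
            state["epoch"] = epoch
            with open(path, "w") as fh:  # model weights would be saved here too
                json.dump(state, fh)
    return state["epoch"]
```

If a runtime stops at epoch 45, the next call resumes from epoch 41, the first epoch after the last saved state, so at most 19 epochs of work are repeated.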

4.1.4 Testing Environment

We developed two environments for testing and validation: the first is a user interface (UI), and the second is an interaction with the NAO robot. The UI is a simple environment where the user can select a sentiment and generate a sentence based on it; a button can then translate the English text to Spanish or vice versa. The NAO robot interaction is the main testing environment: the robot introduces the interaction and asks the user to say a sentiment, then sends this sentiment to the model, which generates 10 sentences. These sentences are scored by the text discriminator of the GAN, which tells us which one is the most realistic, and finally the selected sentence is returned to the robot, which says the phrase.
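The candidate selection used in the robot interaction amounts to generating several sentences and keeping the one the discriminator scores highest. A minimal sketch, where `generate` and `realism_score` are hypothetical stand-ins for the trained generator and the text discriminator:

```python
def best_sentence(generate, realism_score, sentiment, n_candidates=10):
    """Generate n_candidates sentences and return the most realistic one.

    generate(sentiment) returns a sentence; realism_score(sentence) returns
    a number that is higher for more realistic text.
    """
    candidates = [generate(sentiment) for _ in range(n_candidates)]
    return max(candidates, key=realism_score)
```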

Emotions dataset for NLP - https://www.kaggle.com/datasets/praveengovi/emotions-dataset-for-nlp?select=train.txt

4.2 Results

We used BLEU (Papineni et al., 2002) to validate the text quality of the sentences generated by our model, comparing 48 example sentences with the 16000 sentences of the dataset. This metric compares sentences through sequences of consecutive words (n-grams), and it becomes more demanding as the number of words in each sequence grows. We evaluate our model with BLEU at several n-gram sizes, where the number in the metric name is the number of words used; the results of this metric can be found in Table 1.
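To make the role of the n-gram size concrete, the sketch below computes the clipped (modified) n-gram precision that BLEU is built on. It is a simplification: the full BLEU score combines the precisions for several n-gram sizes with a geometric mean and a brevity penalty, and the evaluation in this work may have used a library implementation instead. The function names and example sentences are illustrative.

```python
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def modified_precision(candidate, references, n):
    """Clipped n-gram precision of a candidate against several references."""
    cand_counts = Counter(ngrams(candidate, n))
    if not cand_counts:
        return 0.0
    max_ref = Counter()
    for ref in references:
        for gram, count in Counter(ngrams(ref, n)).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return clipped / sum(cand_counts.values())


candidate = "i feel so happy today".split()
references = ["i feel so sad today".split(), "she feels happy today".split()]
p1 = modified_precision(candidate, references, 1)
p2 = modified_precision(candidate, references, 2)
p3 = modified_precision(candidate, references, 3)
```

In this toy example every single word of the candidate appears in some reference, but fewer two-word and three-word sequences do, which is why the score drops as n grows.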

Another metric that we used is Jaccard, which compares the similarity of two groups of data. As with BLEU, we compare the 48 generated examples to the 16000 sentences of the dataset and obtain 0.0966, as shown in Table 3, which means that the similarity to the dataset is quite low. This metric is usually presented alongside BLEU because an overfitted model can reproduce the sentences of the dataset, which makes BLEU very high, so Jaccard helps to rule out that a good score is the result of overfitting.
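The Jaccard similarity of two sets is the size of their intersection divided by the size of their union. The sketch below applies it to the word sets of two sentences; how the scores were aggregated over the 48 generated sentences and the dataset is not specified in the text, so this only illustrates the basic quantity.

```python
def jaccard(tokens_a, tokens_b):
    """Jaccard similarity |A ∩ B| / |A ∪ B| between two token collections."""
    a, b = set(tokens_a), set(tokens_b)
    if not a and not b:
        return 1.0  # convention chosen here: two empty sets count as identical
    return len(a & b) / len(a | b)


score = jaccard("i feel so happy".split(), "i feel so sad".split())
```

A value close to 0 means the generated text shares few words with the reference text, while a value of 1 means both use exactly the same words.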

Other metrics were considered, but many were discarded because they measure properties that our model was not meant to achieve. One of these metrics was METEOR (Banerjee and Lavie, 2005), an improvement on BLEU focused on evaluating text translation; it is based on matching unigrams, considering the surface forms of the sentence. This makes the similarity between the generated text and the evaluation text important, but our model is focused on novelty, not similarity: we do not want a high similarity with the dataset, so our results on this metric are poor. Another metric we considered was ROUGE (Lin, 2004), which is used to evaluate text summarization and translation and is based on counting overlapping n-grams or word sequences; this again makes the similarity of the evaluated texts very important, which is not relevant for our model, as explained before.

4.3 Discussion

As shown in Table 2, our model performs better on the BLEU metric than some similar implementations but worse than others. For this reason it is also important to compare the Jaccard scores in Table 3, where we can see that, despite their higher BLEU scores, the text of those models is more similar to the dataset they used. Our model is a good alternative to other implementations: although it can produce lower-quality text, that text is more novel.

Table 1: BLEU Metrics.

Metrics  BLEU-3  BLEU-4  BLEU-5
EGAN     .8127   .6138   .4574

Among the results found, the DGSAN model (Montahaei et al., 2021) is one of the models with the best results in the benchmark shown in Table 2; it has better results than ours on BLEU-3 and BLEU-5, and this quality difference can also be noticed in Table 3.

Next, Table 4 presents text generated by different models. Comparing the texts shown, we can notice that some of them are not very coherent. For example, the WRGAN model produces text that reads coherently; our text is less coherent but longer than the WRGAN examples.

We also used the NAO robot for testing and validation. To improve the interaction for the user, the robot says an introduction and then receives a voice command naming a sentiment, so that the model can generate sentences with it, and finally answers with the selected sentence. We conducted surveys in which respondents watched a video and rated the quality of the interaction from 0 to 5; the average score was 3.67, so we can conclude that the interaction with the NAO robot using our model was good.

Continuing with the survey, Table 5 shows the results on how well respondents perceived the feeling of the sentences generated by the model; Sadness and Surprise are the two feelings that users perceive best. On the other hand, the users' assessment of the coherence of the sentences indicates that 36.11 percent consider it bad, 25 percent think it is normal, and 38.89 percent think it is excellent, as shown in Table 6.

Table 2: BLEU Metrics.

Metrics      BLEU-3  BLEU-5
EGAN(ours)   .813    .457
SeqGAN       .807    .419
DGSAN        .945    .728
DoubAN-Full  .095    .056
WRGAN        .634    .303

Table 3: Jaccard Metrics.

Metrics      Jaccard
EGAN(ours)   .097
SeqGAN       .140
DGSAN        .254

Table 4: Text comparison of different models.

Model / Text Generated

EGAN(ours)
• I go through the time i had too much more and feel that she asked why you feel is
• I dont feel apprehensive and apprehensive among my feelings that i m feeling reluctant to post
• I dont feel extremely worthless feeling so apprehensive among a bit

TILGAN
• I was driving my van to work one day
• She bought some new books
• He saw some band members

DoubAN-Full
• What did the appletalk system say?
• What is the immune system of?
• Where did the grand canal occur?

WRGAN
• Could use a little more humanity and delight
• So boring and meandering
• A pleasant, but it’s also extremely effective

Table 5: Sentiment accuracy by survey.

Anger  Sadness  Love   Surprise
22.22  66.67    38.89  66.67

In conclusion, the diversity and quality of the text are very important for validating the results of text generation, and metrics like BLEU and Jaccard can be very useful in these cases.


Table 6: Coherence accuracy by survey.

Bad    Normal  Excellent
36.11  25      38.89

5 CONCLUSIONS

Through the development of the project, the analysis of the metrics, and the results reported for the other models, we conclude that our model has good text generation results but needs high processing power to be trained. Because of this limitation, the model parameters were not the desired ones, which may be the cause of some incoherence in the generated text.

The CNN and LSTM models performed well within the GAN architecture for text generation with sentiments. A benefit of convolutional networks is their capacity for feature extraction, which helps the discriminator and the classifier to be more precise. In the case of the LSTM generator, thanks to the information saved at each step of the generation, the resulting text has good coherence and quality.

A good future upgrade to this work is to exchange the internal models, similarly to GPT3-based models (de Rivero et al., 2021). Despite the good performance presented, the model can be improved, for example, by replacing the LSTM generator with a transformer-based generator or by transfer learning from a CNN (Rodríguez et al., 2021).

REFERENCES

Banerjee, S. and Lavie, A. (2005). METEOR: an automatic metric for MT evaluation with improved correlation with human judgments. In IEEvaluation@ACL.

Cai, P., Chen, X., Jin, P., Wang, H., and Li, T. (2021). Distributional discrepancy: A metric for unconditional text generation. Knowl. Based Syst., 217.

Chen, J., Wu, Y., Jia, C., Zheng, H., and Huang, G. (2020). Customizable text generation via conditional text generative adversarial network. Neurocomputing, 416.

de Rivero, M., Tirado, C., and Ugarte, W. (2021). Formalstyler: GPT based model for formal style transfer based on formality and meaning preservation. In IC3K.

Firdaus, M., Chauhan, H., Ekbal, A., and Bhattacharyya, P. (2020). EmoSen: Generating sentiment and emotion controlled responses in a multimodal dialogue system. IEEE Transactions on Affective Computing.

Goodfellow, I. J., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A. C., and Bengio, Y. (2014). Generative adversarial networks. CoRR, abs/1406.2661.

Hochreiter, S. and Schmidhuber, J. (1997). Long short-term memory. Neural Comput., 9(8).

Huszar, F. (2015). How (not) to train your generative model: Scheduled sampling, likelihood, adversary? CoRR, abs/1511.05101.

Li, Y., Pan, Q., Wang, S., Yang, T., and Cambria, E. (2018). A generative model for category text generation. Inf. Sci., 450.

Lin, C.-Y. (2004). ROUGE: a package for automatic evaluation of summaries. In Workshop on Text Summarization Branches Out of ACL.

Liu, Z., Wang, J., and Liang, Z. (2020). CatGAN: Category-aware generative adversarial networks with hierarchical evolutionary learning for category text generation. In AAAI.

Montahaei, E., Alihosseini, D., and Baghshah, M. S. (2021). DGSAN: discrete generative self-adversarial network. Neurocomputing, 448.

Newman, N. (2019). Journalism, media and technology trends and predictions 2018.

Papineni, K., Roukos, S., Ward, T., and Zhu, W. (2002). BLEU: a method for automatic evaluation of machine translation. In ACL, pages 311–318.

Rizzo, G. and Van, T. H. M. (2020). Adversarial text generation with context adapted global knowledge and a self-attentive discriminator. Inf. Process. Manag., 57(6).

Rodríguez, M., Pastor, F., and Ugarte, W. (2021). Classification of fruit ripeness grades using a convolutional neural network and data augmentation. In IEEE FRUCT.

Wang, K. and Wan, X. (2018). SentiGAN: Generating sentimental texts via mixture adversarial networks. In IJCAI.

Wu, Y. and Wang, J. (2020). Text generation service model based on truth-guided SeqGAN. IEEE Access, 8:11880–11886.

Yan, Y., Shen, G., Zhang, S., Huang, T., Deng, Z., and Yun, U. (2021). Sequence generative adversarial nets with a conditional discriminator. Neurocomputing, 429.

Yu, L., Zhang, W., Wang, J., and Yu, Y. (2017). SeqGAN: Sequence generative adversarial nets with policy gradient. In AAAI.

Zia, T. and Zahid, U. (2019). Long short-term memory recurrent neural network architectures for Urdu acoustic modeling. Int. J. Speech Technol., 22(1).
