Knowledge based Automatic Summarization
Andrey Timofeyev
and Ben Choi
Computer Science, Louisiana Tech University, Ruston, U.S.A.
Keywords: Automatic Summarization, Semantic Knowledge Base, Text Analysis, Knowledge Discovery, Natural
Language Processing.
Abstract: This paper describes a knowledge based system for automatic summarization. The knowledge based system
creates abstractive summary of texts by generalizing new concepts, detecting main topics, and composing
new sentences. The knowledge based system is built on the Cyc development platform, which comprises the
world’s largest ontology of common sense knowledge and reasoning engine. The system is able to generate
coherent and topically related new sentences by using syntactic structures and semantic features of the given
documents, the knowledge base, and the reasoning engine. The system first performs knowledge acquisition
by extracting syntactic structure of each sentence in the given documents, and by mapping the words and the
relationships of words into Cyc knowledge base. Next, it performs knowledge discovery by using Cyc
ontology and inference engine. New concepts are abstracted by exploring the ontology of the mapped
concepts. Main topics are identified based on the clustering of the concepts. Then, the system performs
knowledge representation for human readers by creating new English sentences to summarize the key
concepts and the relationships of the concepts. The structures of the composed sentences extend beyond
subject-predicate-object triplets by allowing adjective and adverb modifiers. The system was tested on various
documents and webpages. The test results showed that the system is capable of creating new sentences that
include generalized concepts not mentioned in the original text and is capable of combining information from
different parts of the text to form a summary.
1 INTRODUCTION
In this paper, we propose a knowledge based system
for automatic summarization by utilizing knowledge
base and inference engine to provide semantic
summarization. The system creates abstractive
summary of the given documents. It is built on Cyc
development platform that includes world’s largest
ontology of common sense knowledge and inference
engine (Cycorp, 2017). The knowledge base and
inference engine enable the system to abstract new
concepts, not directly stated in the text. The system
utilizes semantic features and syntactic structure of
the text. In addition, the knowledge base provides
domain knowledge about the subject matter and
allows the system to exploit relations between
concepts in the documents.
The proposed system is unsupervised and domain
independent, only limited by the comprehensive
ontology of the common sense knowledge provided
by the knowledge base. It generalizes new abstract
concepts based on the knowledge derived from the
text. It automatically detects main topics described in
the text. Moreover, it composes new English
sentences for some of the most significant concepts.
The created sentences form an abstractive summary,
combining concepts from different parts of the input
text.
Although vast majority of the research in
automatic text summarization has been conducted by
extractive methods, abstractive summarization is
considered to be more desirable. Sophisticated
abstractive method would require the ability to fuse
information from different parts of the original text,
to synthesize new information and to incorporate
domain knowledge (Cheung & Penn, 2013). Our
proposed system provides these abilities as well.
Our knowledge based system starts with
knowledge acquisition by deriving syntactic structure
of each sentence of the input text and by mapping
words and their relations into Cyc knowledge base.
Next, it performs knowledge discovery by
generalizing concepts upward in the Cyc ontology
and detecting main topics covered in the text. Then, it
conducts knowledge representation by composing
Timofeyev A. and Choi B.
Knowledge based Automatic Summarization.
DOI: 10.5220/0006580303500356
In Proceedings of the 9th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management (KDIR 2017), pages 350-356
ISBN: 978-989-758-271-4
Copyright
c
2017 by SCITEPRESS Science and Technology Publications, Lda. All rights reserved
new sentences for some of the most significant
concepts defined in main topics. The structure of the
created sentences consists of subject, predicate and
object elements and their adjective and adverb
modifiers, thus allowing the system to create new
English sentences that have structure beyond simple
subject-predicate-object triplets when available.
The system was implemented and tested on
various documents and webpages. The results show
that the system is able to detect main topics comprised
in the text, identify key concepts defined in those
topics and create new sentences that contain novel
information not explicitly mentioned in the original
text.
As an example, the sentence Big felis usually
being natural predatory animal was automatically
generated by the system resulting from analysing
articles that describe different types of felines. Here,
concept felis acts as a subject of the sentence,
beingis a predicate and predatory animalis an
object. Subject feliswas not mentioned in the text
and was derived by knowledge discovery process.
Each element has its modifier adjective big for
subject, adverb usuallyfor predicate and adjective
natural” for object respectively. The modifiers were
chosen by the system based on the analysis of
occurrences of the concepts and relationships of the
concepts.
The rest of the paper is organized as follows.
Related research in automatic text summarization is
outlined in Section 2. System workflow overview is
provided in Section 3. Detailed description of
summarization process is given in Sections 4, 5 and
6. Technical details of the implementation and
description of the results are covered in Section 7.
Conclusion and future research are discussed in
Section 8.
2 RELATED RESEARCH
Automatic text summarization seeks to compose a
concise and coherent version of the original text
preserving the most important information.
Computational community has studied automatic text
summarization problem since late 1950s (Luhn,
1958). Studies in this area are generally divided into
two main approaches extractive and abstractive.
Extractive text summarization aims to select the most
important sentences from original text to form a
summary. Such methods vary by different
intermediate representations of the candidate
sentences and different sentence scoring schemes
(Nenkova & McKeown, 2012). Summaries created
by extractive approach are highly relevant to the
original text, but do not convey any new information.
Most prominent methods in extractive text
summarization use term frequency versus inverse
document frequency (TF-IDF) metric (Hovy & Lin,
1998), (Radev, et al., 2004) and lexical chains for
sentence representation (Barzilay & Elhadad, 1999),
(Ye, et al., 2007). Statistical methods based on Latent
Semantic Analysis (LSA), Bayesian topic modelling,
Hidden Markov Model (HMM) and Conditional
random field (CRF) derive underlying topics and use
them as features for sentence selection (Gong & Liu,
2001), (Shen, et al., 2007). Graph methods tend to
represent the text as a graph of connected concepts or
sentences. Effectively traversing such graph
representation helps to choose relevant sentences to
form a summary (Mihalcea & Tarau, 2004), (Günes
& Radev, 2004). Machine learning techniques are
widely used to score candidate sentences. Such
methods discover most informative sentences based
on wide variety of features (Wong, et al., 2008),
(Rodriguez & Laio, 2014). Despite significant
advancements in the extractive text summarization,
such approaches are not capable of semantic
understanding and limited to the shallow knowledge
contained in the text.
In contrast, abstractive text summarization aims to
incorporate the meaning of the words and phrases and
generalize knowledge not explicitly mentioned in the
original text to form a summary. Phrase selection and
merging methods in abstractive summarization aim to
solve the problem of combining information from
multiple sentences. Such methods construct clusters
of phrases and then merge only informative ones to
form summary sentences (Bing, et al., 2015). Graph
transformation approaches convert original text into a
form of sematic graph representation and then
combine or reduce such representation with an aim of
creating an abstractive summary. (Ganesan, et al.,
2010), (Moawad & Aref, 2012). Summaries
constructed by described methods consist of
sentences not used in the original text, combining
information from different parts, but such sentences
do not convey new knowledge.
Several approaches attempt to incorporate
semantic knowledge base into automatic text
summarization by using WordNet lexical database
(Barzilay & Elhadad, 1999), (Bellare, et al., 2004),
(Pal & Saha, 2014). Major drawback of WordNet
system is the lack of domain-specific and common
sense knowledge. Unlike Cyc, WordNet does not
have reasoning engine and natural language
generation capabilities.
Our system is similar to one proposed in (Choi &
Huang, 2010). In this work, the structure of created
sentences has simple subject-predicate-object pattern
and new sentences are only created for clusters of
compatible sentences found in the original text.
Recent rapid development of deep learning
contributes to automatic text summarization,
improving state-of-the-art performance. Deep
learning methods applied to both extractive
(Nallapati, et al., 2017) and abstractive (Rush, et al.,
2015) summarization show promising results, but
such approaches require vast amount of training data
and powerful computational resources.
Our abstractive text summarization system
derives syntactic structure to combine information
from different parts of the text, uses knowledge base
to have background semantic knowledge and
performs reasoning to abstract new concepts. To
derive syntactic features, such as part of speech tags
and dependency parser labels, system uses SpaCy
Python library of advanced natural language
processing (Honnibal & Johnson, 2015). The system
utilizes capabilities of world’s largest ontology of
common sense knowledge Cyc (Cycorp, 2017). The
knowledgebase provides semantic knowledge and
inference engine.
3 SUMMARIZATION PROCESS
OVERVIEW
Our proposed system consists of three main parts:
knowledge acquisition, knowledge discovery and
knowledge representation for human reader. The
workflow of the system is outlined in Figure 1.
Figure 1: System workflow diagram.
During the knowledge acquisition process, our
system takes documents as an input and transforms
them into syntactic representation. Then, it maps each
word in the text to the appropriate Cyc concept and
assigns word’s weight and associations to that
concept. During the knowledge discovery process,
the system finds ancestors for each mapped Cyc
concept, records ancestor-descendant relation and
adds scaled descendant weight and descendant
associations to the ancestor concept. This process
allows system to abstract new concepts not explicitly
mentioned in the original text. Then, the system
identifies main topics described in the text by
clustering mapped Cyc concepts. During the
knowledge representation process, the system creates
English sentences for the most informative subjects
identified in main topics. This process ensures that the
summary sentences are composed using information
from different parts of the text while preserving
coherence to the main topics.
4 KNOWLEDGE ACQUISITION
Knowledge acquisition process consists of two sub-
processes. The first sub-process pre-processing,
extracts syntactic structure of the given document. It
separates text into sentences, lemmatizes each word
and assigns part of speech tags and dependency parser
associations. Then it counts the weights of the words
and their associations. The second sub-process
mapping, finds matching Cyc concepts for each word
in the input text. Once the system finds appropriate
concept, it assigns word’s weight and associations to
that concept. Mapping sub-process is described in
Figure 2.
Figure 2: Process of mapping words to Cyc concepts.
Word’s weight is a frequency, the number of times it
is mentioned in the text. The association is a relation
between two words in a sentence, derived by the
syntactic parser. Each association has a weight
assigned to it that shows how many times two words
were used together in the text. Higher weights
Knowledge Based System
Input:
document(s)
Cyc KB
Output:
summary
KNOWLEDGE
DISCOVERY
Abstract new concepts.
Identify main topics.
KNOWLEDGE
ACQUISITION
Extract syntactic structure.
Map words to Cyc concepts.
KNOWLEDGE
REPRESENTATION
Abstract new concepts.
Create new sentences.
Mapping sub-process:
o For each word obtained on preprocessing step:
Map word to corresponding Cyc concept;
Assign word’s weight to mapped Cyc concept;
Map association head to corresponding Cyc
concept;
Assign word’s association to mapped Cyc concept.
represent stronger associations. Cyc ontology
contains semantic knowledge about the concepts and
our system enhances it with syntactic structure
features. Semantic knowledge and syntactic structure
are two crucial parts that make summary cohesive and
meaningful.
5 KNOWLEDGE DISCOVERY
Knowledge discovery process consists of two sub-
processes: (i) new concepts abstraction and (ii) main
topics detection. The first sub-process starts by
deriving ancestor for each concept mapped from the
text. Then it assigns ancestor-descendant relation to
derived ancestor and keeps track of descendants’
scaled weight. The scaling is defined by
generalization parameter α. Next, it adds
descendants’ weight and associations to ancestor
concept if descendant-ratio is higher than the
threshold. The threshold is defined by generalization
parameter β. The descendant ratio is the number of
mapped descendants divided by the number of all
descendants of a concept. The parameters α and β
regulate desired level of generalization. Higher α and
lower β yield greater level of generalization giving
more emphasis to ancestor terms. New concepts
abstraction is an important part of summarization as
it allows generalizing information derived from the
input text. For example, our system can generalize
“apple”, “orange” and “mango” to an ancestor
concept “fruit”, which might not be mentioned in the
text. Sub-process is described in Figure 3.
Figure 3: Process of abstracting new concepts.
The second sub-process detects main topics in the
text. The assumption is that the topics are represented
by the most frequent micro theories in Cyc
knowledge base. Micro theories are the clusters of
concepts and facts typically representing one specific
domain of knowledge. For example, #$MathMt is the
micro theory containing mathematical knowledge.
Micro theories are the basis of the knowledge
representation in Cyc. Each concept begins to have a
semantic meaning only in its defining micro theories
(Matuszek, et al., 2006). To find the most frequent
micro theories system derives defining micro theories
for each mapped Cyc concept, counts frequencies of
discovered micro theories and picks top-n micro
theories with the highest frequencies. Sub-process is
outlined in Figure 4.
Figure 4: Process of main topics detection.
6 KNOWLEDGE
REPRESENTATION
Knowledge representation process starts by (i)
choosing concepts with the highest subjectivity rank
in each main topic detected by the knowledge
discovery. These concepts become candidate
subjects. Subjectivity rank is defined as the product
of concept weight and subjectivity ratio. Subjectivity
ratio is defined as the number of concept associations
labelled as subject relation divided by the total
number of concept associations. This ratio helps to
identify concepts with the strongest subject roles in
the text. Sub-process is described in Figure 5.
Figure 5: Process of candidate subjects discovery.
(i) New concepts abstraction sub-process:
o For each mapped Cyc concept:
Find concept’s ancestor;
Record ancestor-descendant relation;
Update ancestor’s number of descendants;
Update ancestor’s descendants weight;
Scale descendant’s weight by α.
o For each mapped Cyc concept that has descendants:
Find the number of concept’s mapped descendants;
Find the number of all concept’s descendants;
Calculate descendant ratio:
𝑑𝑒𝑠𝑐_𝑟𝑎𝑡𝑖𝑜 =
# 𝑚𝑎𝑝𝑝𝑒𝑑 𝑑𝑒𝑠𝑐𝑒𝑛𝑑𝑎𝑛𝑡𝑠
# 𝑜𝑓 𝑎𝑙𝑙 𝑑𝑒𝑠𝑐𝑒𝑛𝑑𝑎𝑛𝑡𝑠
If descendant-ratio is larger than β:
Add descendants’ weight to ancestor’s weight;
Add descendants’ associations to ancestor
associations;
Scale descendant’s association weight by α.
(ii) Main topics detection sub-process:
o For each mapped Cyc concept:
Find defining micro theories;
o Count the frequencies of discovered micro theories;
o Pick top-n micro theories with highest frequencies.
(i) Candidate subjects discovery sub-process:
o For each micro theory in top-n micro theories:
For each concept mapped from the text:
Find number of subject associations;
Find number of all associations;
Calculate subjectivity ratio:
𝑠𝑢𝑏𝑗_𝑟𝑎𝑡𝑖𝑜 =
# 𝑜𝑓 𝑠𝑢𝑏𝑗𝑒𝑐𝑡 𝑎𝑠𝑠𝑜𝑐𝑖𝑎𝑡𝑖𝑜𝑛𝑠
# 𝑜𝑓 𝑎𝑙𝑙 𝑎𝑠𝑠𝑜𝑐𝑖𝑎𝑡𝑖𝑜𝑛𝑠
Calculate subjectivity rank:
𝑠𝑢𝑏𝑗_𝑟𝑎𝑛𝑘 = 𝑐𝑜𝑛𝑐𝑒𝑝𝑡_𝑤𝑒𝑖𝑔ℎ𝑡 𝑠𝑢𝑏𝑗_𝑟𝑎𝑡𝑖𝑜
Pick top-n subjects with highest subjectivity rank.
Next, the system (ii) creates new English sentences
for the candidate subjects. To generate new sentences
system uses subjectpredicateobject structure
enhanced with the adjective modifiers for subjects
and objects, and the adverb modifiers for predicates,
when available. Subject, predicate and object
elements are mandatory while adjective and adverb
modifiers are optional. The system chooses candidate
elements for the sentence using the weight of the
association between the concepts. Created sentences
form final summary of a given text. Sub-process is
outlined in Figure 6.
Figure 6: Process of new sentence generation.
7 IMPLEMENTATION AND
TESTING
We implemented the system in Python programming
language. Python was a natural choice because of the
advanced Natural Language Processing tools and
libraries supplied by the language. Cyc knowledge
base supports inference engine operations through
SubL language commands and Java APIs. Some of
the SubL commands we used were “min-genls” to
find concept’s ancestors, “generate-phrase” to
convert Cyc concepts to natural language
representation and “query-variable” to run queries
against the knowledge base. We used Java-Python
wrapper implemented by JPype library (JPype, 2017)
to communicate with Cyc server and perform
reasoning. The system was design pipelined and
modular to allow comprehensible data flow and
convenient maintenance.
We conducted several experiments to highlight
different capabilities of the system. First, we applied
the system on Wikipedia articles describing concepts
from different domains. The articles described
domestic dog, hamburger and personal computer.
Table 1 shows main topics and concepts extracted
from the analysed articles. Topics are represented by
the micro theories from Cyc knowledge base. For
example, #$BiologyMt micro theory contains general
information about the living things;
#$HumanFoodGMt micro theory describes human
food; #$HumanSocialLifeMt micro theory covers
social and cultural aspects of human relationships.
Concepts are represented as the Cyc terms. Each term
has a natural language representation, e.g. “canis” for
#$CanisGenus, “subspecies” for
#$BiologicalSubspecies and “developer” for
#$ComputerProgrammer.
Some of the new sentences created for the articles
are outlined in Figure 7. All sentences have minimal
subject-predicate-object structure and some of the
sentences go beyond with additional adjective and
adverb modifiers. This is possible when subject,
predicate or object has strong adjective or adverb
relations.
Figure 7: Test results of some of the new sentences created
for Wikipedia articles.
Next, we conducted experiment using multiple
articles about grapefruit. New sentences created by
the system are outlined in Figure 8. These results
show the progression from subject-predicate-object
(ii) New sentence generation sub-process:
o For each subject in top-n subjects:
Convert subject Cyc concept to natural language
representation (a);
Pick adjective with highest subject-adjective
association weight;
Convert adjective Cyc concept to natural language
representation (b);
Pick top-n predicates with highest subject-predicate
association weights;
For each predicate in top-n predicates:
Convert predicate Cyc concept to natural language
representation (c);
Pick adverb with highest predicate-adverb
association weight;
Convert adverb Cyc concept to natural language
representation (d);
Pick top-n objects with highest product of subject-
object and predicate-object associations weights;
For each object in top-n objects:
Convert object Cyc concept to natural language
representation (e);
Pick adjective with highest object-adjective
association;
Convert adjective Cyc concept to natural
language representation (f);
Create new sentence using subject (a), subject-
adjective (b), predicate (c), predicate-adverb (d),
object (e), object-adjective (f) natural language
phrases.
Dog being canis.
Dog having short external anatomic part.”
Burger utilizing traditional mammal meat.
Ground beef being bovine meet.
Computer having computer program.
Computer hardware needing power.
structure to more complex structure extended by the
adjective and adverb modifiers when more articles
were processed by the system.
Figure 8: Test results of new sentences created for multiple
articles about grapefruit; (a) single article, (b) two
articles, (c) three articles.
Finally, we have applied the system on five
Wikipedia articles describing different types of
felines (cat, tiger, cougar, jaguar and lion). Table 2
shows main topics and concepts extracted from the
text and new created sentences.
Test results show that the system is able to create
sentences that contain generalized concepts and
combine information from different parts of the text.
Concepts like “canis”, “mammal meat” and “felis”
were derived by the abstraction process and were not
mentioned in the original text. The system yields
better results compared to the reported in (Choi &
Huang, 2010). New sentences created by our system
have structure that is more complex and contain
information fused from various parts of the text.
8 CONCLUSION AND FUTURE
WORK
In this paper, we described a knowledge based
automatic summarization system that creates an
abstractive summary of the text. This task is still
challenging for machines, because in order to create
such summary, the information from the input text
has to be aggregated and synthesized, drawing
knowledge that is more general. This is not feasible
without using the semantics and having domain
knowledge. To have such capabilities, our described
system uses Cyc knowledge base and its reasoning
engine. Utilizing semantic features and syntactic
structure of the text shows great potential in creating
abstractive summaries.
We have implemented and tested our proposed
system. The results show that the system is able to
abstract new concepts not mentioned in the text,
identify main topics and create new sentences using
information from different parts of the text.
We outline several directions for the future
improvements of the system. The first direction is to
improve the domain knowledge representation, since
the semantic knowledge and reasoning are only
limited by Cyc knowledge base. Ideally, the system
Grapefruit being fruit. (a)
Grapefruit being colored edible fruit.” (b)
Colored grapefruit being sweet edible fruit.” (c)
Table 1: Test results of main topics and concepts derived from Wikipedia articles.
Article: Dog
Topics (micro theories):
#$BiologyMt
#$BiologyVocabularyMt
#$NaivePhysicsVocabularyMt
Concepts:
#$Dog
#$CanisGenus
#$Person
#$BiologicalSubspecies
#$Breeder
Article: Hamburger
Topics (micro theories):
#$HumanFoodGMt
#$HumanFoodGVocabularyMt
#$ProductGVocabularyMt
Concepts:
#$Food
#$Burger
#$HamburgerSandwich
#$GroundBeef
#$Cheese
Article: Computer
Topics (micro theories):
#$InformationTerminologyMt
#$HumanSocialLifeMt
#$NaivePhysicsVocabularyMt
Concepts:
#$Computer
#$ComputerProgrammer
#$outputs
#$ComputerHardwareItem
#$ControlDevice
Table 2: Test results of new sentences, concepts and main topics for Wikipedia articles about felines.
Topics (micro theories):
#$BiologyMt
#$BiologyVocabularyMt
#$HumanSocialLifeMt
Concepts:
#$Cat
#$DomesticCat
#$FelisGenus
#$FelidaeFamily
#$Animal
Sentences:
Cat usually being native animal.
Big felis usually being natural predatory animal.
Big felis usually being exotic animal.
Big felis often using killing method.
Big felis often using marking.
Male feline often killing prey.
Male feline living historical mountain range.
would be able to use the whole World Wide Web as
a domain knowledge, but this possesses challenges
like information inconsistency and sense
disambiguation. The second direction is to improve
the structure of the created sentences. We use subject-
predicate-object triplets extended by adjective and
adverb modifiers. Such structure can be improved by
using more advanced syntactic representation of the
sentence, e.g. graph representation. Finally, some of
the created sentences are not conceptually connected
to each other. Analysing the relations between
concepts on the document level will help in creating
sentences that will be linked to each other
conceptually.
REFERENCES
Barzilay, R. & Elhadad, M., 1999. Using lexical chains for
text summarization. Advances in automatic text
summarization, pp. 111-121.
Bellare, K. et al., 2004. Generic Text Summarization Using
WordNet. Lisbon, Portugal, LREC, pp. 691-694.
Bing, L. et al., 2015. Abstractive multi-document
summarization via phrase selection and merging.
Beijing, China, Association for Computational
Linguistics, pp. 1587-1597.
Cheung, J. C. K. & Penn, G., 2013. Towards Robust
Abstractive Multi-Document Summarization: A
Caseframe Analysis of Centrality and Domain.. Sofia,
Bulgaria, Association for Computational Linguistics,
pp. 1233-1242.
Choi, B. & Huang, X., 2010. Creating New Sentences to
Summarize Documents. Innsbruck, Austria, IASTED,
pp. 458-463.
Cycorp, 2017. Cycorp Cycorp Making Solutions Better.
[Online]
Available at: http://www.cyc.com/
[Accessed July 2017].
Ganesan, K., Zhai, C. & Han, J., 2010. Opinosis: a graph-
based approach to abstractive summarization of highly
redundant opinions. Beijing, China, Association for
Computational Linguistics, pp. 340-348.
Gong, Y. & Liu, X., 2001. Generic text summarization
using relevance measure and latent semantic analysis.
New Orleans, Louisiana, ACM, pp. 19-25.
Günes, E. & Radev, D. R., 2004. Lexrank: Graph-based
lexical centrality as salience in text summarization.
Journal of Artificial Intelligence Research, Issue 22,
pp. 457-479.
Honnibal, M. & Johnson, M., 2015. An Improved Non-
monotonic Transition System for Dependency Parsing.
Lisbon, Portugal, Association for Computational
Linguistics, pp. 1373-1378.
Hovy, E. & Lin, C.-Y., 1998. Automated text
summarization and the SUMMARIST system.
Baltimore, Maryland, Association for Computational
Linguistics, pp. 197-214.
JPype, 2017. JPype - Java to Python integration. [Online]
Available at: http://jpype.sourceforge.net/
[Accessed July 2017].
Luhn, H. P., 1958. The automatic creation of literature
abstracts. IBM Journal of Research and Development,
2(2), pp. 159-165.
Matuszek, C., Cabral, J., Witbrock, M. & DeOliveira, J.,
2006. An Introduction to the Syntax and Content of Cyc.
Palo Alto, California, AAAI, pp. 44-49.
Mihalcea, R. & Tarau, P., 2004. TextRank: Bringing Order
into Text. Barcelona, Spain, EMNLP, pp. 404-411.
Moawad, I. F. & Aref, M., 2012. Semantic graph reduction
approach for abstractive Text Summarization. Cairo,
Egypt, IEEE, pp. 132-138.
Nallapati, R., Zhai, F. & Zhou, B., 2017. SummaRuNNer:
A recurrent neural network based sequence model for
extractive summarization of documents. San Francisco,
California, AAAI.
Nenkova, A. & McKeown, K., 2012. A survey of text
summarization techniques. In: C. C. Aggarwal & C.
Zhai, eds. Mining Text data. s.l.:Springer, pp. 43-76.
Pal, A. R. & Saha, D., 2014. An approach to automatic text
summarization using WordNet. Gurgaon, India, IEEE,
pp. 1169-1173.
Radev, D. R., Jing, H., Styś, M. & Tam, D., 2004. Centroid-
based summarization of multiple documents.
Information Processing & Management, 40(6), pp.
919-938.
Rodriguez, A. & Laio, A., 2014. Clustering by fast search
and find of density peaks. Science, 344(6191), pp.
1492-1496.
Rush, A. M., Chopra, S. & Wetson, J., 2015. A neural
attention model for abstractive sentence
summarization. Lisbon, Portugal, EMNLP.
Shen, D. et al., 2007. Document Summarization Using
Conditional Random Fields. Hyderabad, India, IJCAI,
pp. 2862-2867.
Wong, K.-F., Wu, M. & Li, W., 2008. Extractive
summarization using supervised and semi-supervised
learning. Manchester, United Kingdom, Association
for Computational Linguistics, pp. 985-992.
Ye, S., Chua, T.-S., Kan, M.-Y. & Qiu, L., 2007. Document
concept lattice for text understanding and
summarization. Information Processing &
Management, 43(6), pp. 1643-1662.