Objective-Oriented Transformer for Abstractive Document Summarization

Parma Nand, CangeGe Zhang, Manju Vallayil

2025

Abstract

Transformer language models have proven highly effective across a variety of language understanding and generation tasks, yet their adaptation to the specific task of text summarization has received comparatively little attention. In this work, we present the adaptation of a pre-trained Transformer model for the task of text summarization. A common way to train a language model is to randomly mask tokens in the text and train the model to predict the masked words; the learner does this by attending to neighbouring words. Instead of training a single learner on randomly masked words, we trained three separate learners, each focused only on a specific type of word, and used these focused learners to generate separate summaries corresponding to the word types on which they were trained. We hypothesize that combining these different summaries should result in a richer, more accurate summary covering multiple perspectives. We took already trained masked language models, BERT and RoBERTa, and extended their pretraining with three separate objectives: predicting just the nouns, just the verbs, or the rest of the words. We then trained the resulting composite models on the downstream task of the corresponding composite summarization. Evaluation was carried out by merging the three composite summaries and scoring them on two benchmark datasets, Food Review and CNN/Daily Mail. The proposed composite pre-trained model and composite summary generation algorithm produced higher precision scores on ROUGE-1 and ROUGE-3 but a slightly lower score on ROUGE-2 compared to the state of the art. The results showed that generating multiple summaries from different perspectives and then merging them has the potential to produce a richer, better summary than a one-shot strategy.
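
As an illustration only (not taken from the paper), the following Python sketch shows how part-of-speech-targeted masking of the kind described above might be implemented, assuming spaCy's en_core_web_sm tagger and the Hugging Face BERT tokenizer; the function name mask_by_pos, the masking probability, and the example sentence are hypothetical.

    # A minimal sketch of objective-oriented masking: instead of masking
    # tokens uniformly at random, each learner masks only one word class.
    import random

    import spacy
    from transformers import AutoTokenizer

    nlp = spacy.load("en_core_web_sm")
    tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

    def mask_by_pos(text, target_pos, mask_prob=0.15):
        """Replace words whose coarse POS tag is in target_pos with the
        tokenizer's mask token, leaving all other words intact."""
        doc = nlp(text)
        out = []
        for tok in doc:
            if tok.pos_ in target_pos and random.random() < mask_prob:
                out.append(tokenizer.mask_token)
            else:
                out.append(tok.text)
        return " ".join(out)

    sentence = "The chef seasoned the soup before serving it to the guests."

    # One masking objective per learner: nouns, verbs, or the remaining words.
    print(mask_by_pos(sentence, {"NOUN", "PROPN"}, mask_prob=1.0))
    print(mask_by_pos(sentence, {"VERB"}, mask_prob=1.0))
    print(mask_by_pos(sentence, {"ADJ", "ADV", "ADP", "DET", "PRON"}, mask_prob=1.0))

Likewise, a merged composite summary could be scored against a reference using the rouge-score package, which supports arbitrary n-gram orders such as rouge3 alongside rouge1 and rouge2 (the metrics reported in the abstract); the two summary strings here are placeholders.

    from rouge_score import rouge_scorer

    scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rouge3"],
                                      use_stemmer=True)
    reference_summary = "the chef seasoned and served the soup"
    merged_summary = "the chef seasoned the soup and served it"
    scores = scorer.score(reference_summary, merged_summary)
    # Each score exposes precision, recall, and F-measure components.
    print(scores["rouge1"].precision, scores["rouge2"].precision)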



Paper Citation


in Harvard Style

Nand P., Zhang C. and Vallayil M. (2025). Objective-Oriented Transformer for Abstractive Document Summarization. In Proceedings of the 17th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR; ISBN , SciTePress, pages 240-247. DOI: 10.5220/0013682600004000


in Bibtex Style

@conference{kdir25,
author={Parma Nand and CangeGe Zhang and Manju Vallayil},
title={Objective-Oriented Transformer for Abstractive Document Summarization},
booktitle={Proceedings of the 17th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR},
year={2025},
pages={240-247},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013682600004000},
isbn={},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 17th International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management - Volume 1: KDIR
TI - Objective-Oriented Transformer for Abstractive Document Summarization
SN -
AU - Nand P.
AU - Zhang C.
AU - Vallayil M.
PY - 2025
SP - 240
EP - 247
DO - 10.5220/0013682600004000
PB - SciTePress