Calvo Castro Francisco Hiram
Carrillo Mendoza Pabel
Gelbukh Alexander
Title On redundancy in multi-document summarization
Type Journal
Subtype JCR
Description Journal of Intelligent & Fuzzy Systems
Abstract In this paper we study how the presence or absence of redundancy in multiple related texts can be used to compute sentence relevance for extractive multi-document summarization. Two types of redundancy can be found: intra-document and inter-document. Experimenting with them reveals different kinds of potentially important statements, for example: statements that are redundant between documents, which can be important because of their popularity; statements that are not redundant, which can be important because of their novelty; and statements that are redundant within a single document, which can be important because a single author addresses them repeatedly. We propose an unsupervised graph-based method that generates summaries based on different redundancy strategies. We present experiments on two DUC corpora with nine strategies for extracting information, which differ in how redundancy within a document and across documents is handled. According to the DUC gold standards, we found that a generic multi-document summary should contain the most redundant (popular) information across different sources while avoiding local intra-document redundancy. We implemented a mechanism that enriches sentence rankings with redundancy information, which improves the evaluation scores of the summaries.
Country Mexico
Pages 3245-3255
Vol. / Ch. 34(5)
Start date 2018-05-24
ISBN/ISSN 1064-1246