TY  - CONF
ID  - mm:DIP-SumEval
T1  - A Data Set for the Analysis of Text Quality Dimensions in Summarization Evaluation
A1  - Mieskes, Margot
A1  - Loza Mencía, Eneldo
A1  - Kronsbein, Tim
T2  - Proceedings of the Twelfth International Conference on Language Resources and Evaluation (LREC 2020)
Y1  - 2020
SP  - 6690
EP  - 6699
PB  - European Language Resources Association
CY  - Marseille, France
N1  - Data set available at https://github.com/keelm/DIP-SumEval
UR  - https://aclanthology.org/2020.lrec-1.826
N2  - Automatic evaluation of summarization focuses on developing a metric to represent the quality of the resulting text. However, text quality is represented in a variety of dimensions ranging from grammaticality to readability and coherence. In our work, we analyze the dependencies between a variety of quality dimensions on automatically created multi-document summaries and which dimensions automatic evaluation metrics such as ROUGE, PEAK or JSD are able to capture. Our results indicate that variants of ROUGE are correlated to various quality dimensions and that some automatic summarization methods achieve higher quality summaries than others with respect to individual summary quality dimensions. Our results also indicate that differentiating between quality dimensions facilitates inspection and fine-grained comparison of summarization methods and their characteristics. We make the data from our two summarization quality evaluation experiments publicly available in order to facilitate the future development of specialized automatic evaluation methods.
ER  -