Efficiency of Neural Network Algorithms in Automatic Abstracting and Text Summarization
https://doi.org/10.25205/1818-7900-2024-22-4-49-61
Abstract
The article analyzes the role and efficiency of neural network algorithms in automatic abstracting and text summarization, key tasks in natural language processing (NLP). The main goal of automatic abstracting is to extract and generate the essential information in a text, providing quick access to its main content without reading the whole document. The paper discusses the main challenges developers face when implementing abstracting algorithms, including understanding context and irony, maintaining text cohesion, and adapting to different languages and styles. Special attention is given to neural network models such as Transformer, BERT, and GPT, which have shown outstanding performance in automatic text abstracting owing to their ability to learn from large amounts of data. The article also highlights the contributions of leading researchers in deep learning and analyzes the methods underlying state-of-the-art NLP algorithms, emphasizing the importance of continued technological progress for improving abstracting quality and information accessibility. The article will interest a wide range of readers, including researchers in artificial intelligence and NLP, software developers working on text-processing automation, and specialists in fields that require fast processing and analysis of large volumes of text, such as legal practice, medical diagnostics, and scientific research. The material will also be useful to teachers and students of data processing and artificial intelligence technologies, offering real-world examples of applying theoretical knowledge in practical projects.
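To make the abstractive approach described above concrete, the following minimal sketch runs a pretrained Transformer summarizer over a short document. The article does not prescribe any particular implementation; the Hugging Face transformers library, the facebook/bart-large-cnn checkpoint, and the generation parameters below are illustrative assumptions, not the author's method.

    # Minimal sketch: abstractive summarization with a pretrained Transformer.
    # Assumptions (illustrative, not from the article): the Hugging Face
    # "transformers" library is installed, and the BART checkpoint below
    # stands in for the Transformer/BERT/GPT family of models discussed
    # in the text.
    from transformers import pipeline

    summarizer = pipeline("summarization", model="facebook/bart-large-cnn")

    document = (
        "Neural network models such as Transformer, BERT, and GPT have "
        "substantially improved automatic abstracting by learning from "
        "large amounts of data. They generate concise summaries that give "
        "readers quick access to the main content of a document without "
        "reading it in full."
    )

    # max_length/min_length bound the summary length in tokens;
    # do_sample=False keeps decoding deterministic.
    result = summarizer(document, max_length=60, min_length=15, do_sample=False)
    print(result[0]["summary_text"])

In practice the checkpoint, length bounds, and decoding strategy would be chosen per language and domain, in line with the article's point about adapting models to different languages and styles.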
About the Author
K. V. Rebenok, Russian Federation
Kirill V. Rebenok, Postgraduate Student
Moscow
For citations:
Rebenok K.V. Efficiency of Neural Network Algorithms in Automatic Abstracting and Text Summarization. Vestnik NSU. Series: Information Technologies. 2024;22(4):49-61. (In Russ.) https://doi.org/10.25205/1818-7900-2024-22-4-49-61