Model of the Text of a Scientific and Technical Article for Markup in the Corpus of Scientific and Technical Texts
https://doi.org/10.25205/1818-7900-2022-20-3-5-13
Abstract
The paper proposes a model of the text of a scientific and technical article for automating markup in a corpus of scientific and technical texts. It shows that building such a corpus requires taking into account the structural features of scientific and technical articles, and that structural markup therefore needs to be added to the corpus. The texts of scientific and technical articles share the same narrative structure across the whole class and contain a limited set of structural elements. The features of the compositional organization of these texts are analyzed, and the approximate content of each element of the article structure is described. The compositional structure of a scientific and technical article is presented in Backus-Naur notation. A model of the text of a scientific and technical article is proposed in the form of a graph whose vertices and edges are the full-fledged structural elements of the article. Representing the article as a finite set of its constituent parts in this way makes it possible to determine the type of each structural element and its degree of nesting during computer analysis of the text. The presence of structural markup in the corpus of scientific and technical texts significantly expands its research potential and serves as a basis for tasks of automatic processing of scientific and technical texts.
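As a rough illustration of how a graph representation can expose both the type of a structural element and its degree of nesting, the sketch below models an article as a tree and walks it depth-first. It is a minimal sketch, not the model from the paper: the class `Element`, the function `walk`, and the element names ("title", "abstract", "body", "section", "paragraph", "references") are assumptions chosen for the example, since the abstract does not list the actual elements.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Element:
    """A structural element of an article (hypothetical node type)."""
    kind: str
    children: List["Element"] = field(default_factory=list)


def walk(element: Element, depth: int = 0) -> Iterator[Tuple[str, int]]:
    """Yield (kind, nesting depth) for each element, parents before children."""
    yield element.kind, depth
    for child in element.children:
        yield from walk(child, depth + 1)


# Assumed example structure; the real inventory of elements is defined in the paper.
article = Element("article", [
    Element("title"),
    Element("abstract"),
    Element("body", [
        Element("section", [Element("paragraph"), Element("paragraph")]),
        Element("section", [Element("paragraph")]),
    ]),
    Element("references"),
])

markup = list(walk(article))
for kind, depth in markup:
    print("  " * depth + kind)
```

Each `(kind, depth)` pair is the kind of information a structural markup layer could attach to a span of corpus text: what the element is and how deeply it is nested.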
About the Author
Yu. I. Butenko
Russian Federation
Yulia I. Butenko, Candidate of Technical Sciences, Associate Professor, Department of Romano-Germanic Languages
Moscow
For citations:
Butenko Yu.I. Model of the Text of a Scientific and Technical Article for Markup in the Corpus of Scientific and Technical Texts. Vestnik NSU. Series: Information Technologies. 2022;20(3):5-13. (In Russ.) https://doi.org/10.25205/1818-7900-2022-20-3-5-13