<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD JATS (Z39.96) Journal Publishing DTD v1.3 20210610//EN" "JATS-journalpublishing1-3.dtd">
<article article-type="research-article" dtd-version="1.3" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xml:lang="ru"><front><journal-meta><journal-id journal-id-type="publisher-id">intechngu</journal-id><journal-title-group><journal-title xml:lang="ru">Вестник НГУ. Серия: Информационные технологии</journal-title><trans-title-group xml:lang="en"><trans-title>Vestnik NSU. Series: Information Technologies</trans-title></trans-title-group></journal-title-group><issn pub-type="ppub">1818-7900</issn><issn pub-type="epub">2410-0420</issn><publisher><publisher-name>НГУ</publisher-name></publisher></journal-meta><article-meta><article-id pub-id-type="doi">10.25205/1818-7900-2024-22-2-57-67</article-id><article-id custom-type="elpub" pub-id-type="custom">intechngu-271</article-id><article-categories><subj-group subj-group-type="heading"><subject>Research Article</subject></subj-group><subj-group subj-group-type="section-heading" xml:lang="ru"><subject>Статьи</subject></subj-group></article-categories><title-group><article-title>Использование цепей Маркова для автоматического завершения исходного кода программы</article-title><trans-title-group xml:lang="en"><trans-title>Code Completion Using Markov Chains</trans-title></trans-title-group></title-group><contrib-group><contrib contrib-type="author" corresp="yes"><contrib-id contrib-id-type="orcid">https://orcid.org/0009-0008-1060-4613</contrib-id><name-alternatives><name name-style="eastern" xml:lang="ru"><surname>Тимофеев</surname><given-names>В. С.</given-names></name><name name-style="western" xml:lang="en"><surname>Timofeev</surname><given-names>V. S.</given-names></name></name-alternatives><bio xml:lang="ru"><p>Тимофеев Владислав Сергеевич, магистрант </p><p>Новосибирск</p></bio><bio xml:lang="en"><p>Vladislav S. 
Timofeev, Master’s Student</p><p>Novosibirsk</p></bio><email xlink:type="simple">v.timofeev@g.nsu.ru</email><xref ref-type="aff" rid="aff-1"/></contrib></contrib-group><aff-alternatives id="aff-1"><aff xml:lang="ru">Новосибирский государственный университет<country>Россия</country></aff><aff xml:lang="en">Novosibirsk State University<country>Russian Federation</country></aff></aff-alternatives><pub-date pub-type="collection"><year>2024</year></pub-date><pub-date pub-type="epub"><day>10</day><month>10</month><year>2024</year></pub-date><volume>22</volume><issue>2</issue><fpage>57</fpage><lpage>67</lpage><permissions><copyright-statement>Copyright &#x00A9; Тимофеев В.С., 2024</copyright-statement><copyright-year>2024</copyright-year><copyright-holder xml:lang="ru">Тимофеев В.С.</copyright-holder><copyright-holder xml:lang="en">Timofeev V.S.</copyright-holder><license license-type="creative-commons-attribution" xlink:href="https://creativecommons.org/licenses/by/4.0/" xlink:type="simple"><license-p>This work is licensed under a Creative Commons Attribution 4.0 License.</license-p></license></permissions><self-uri xlink:href="https://intechngu.elpub.ru/jour/article/view/271">https://intechngu.elpub.ru/jour/article/view/271</self-uri><abstract><p>В сфере программирования используются разнообразные инструменты с целью оптимизации процесса разработки. Среди них особое место занимают интегрированные среды разработки (IDE), обеспечивающие широкий спектр сервисов, включая текстовый редактор, отладчик и интеллектуальное завершение кода. Настоящая работа посвящена разработке модели, направленной на предсказание вариантов завершения исходного кода программы. Для улучшения точности модели были использованы комбинации цепей Маркова, основанные на различных методах вычисления текущего контекста программы: линейном и с использованием абстрактного синтаксического дерева (AST). 
Линейный метод анализа контекста представляет собой анализ токенизированного представления исходного кода, в то время как второй метод использует структуру исходного кода в виде AST. Объединение различных моделей позволяет сохранить больше семантической информации о коде и учитывать при автодополнении индивидуальный стиль написания кода. Разработанная модель демонстрирует высокую точность предсказаний при минимальном объеме вычислительных ресурсов, что делает ее применимой в интегрированных средах разработки.</p></abstract><trans-abstract xml:lang="en"><p>Software engineers use a variety of tools to speed up the development process. Chief among these tools are integrated development environments (IDEs), which provide services such as a text editor, a debugger, and intelligent code completion. This paper develops a model for predicting source code completions. To improve the accuracy of the model, we combined Markov chains built on different ways of computing the current program context: linear and AST-based. The linear method analyzes the tokenized representation of the source code, whereas the second method uses the representation of the source code as an abstract syntax tree (AST). Combining the models preserves more semantic information about the code and lets completion take the individual coding style into account. To compare the models, a new dataset was created specifically for the Pascal language. A detailed comparison of how the models work and of their prediction accuracy on the collected data is given. 
The proposed model achieves high prediction accuracy at minimal computational cost, which makes it suitable for use in integrated development environments.</p></trans-abstract><kwd-group xml:lang="ru"><kwd>автоматическое завершение кода</kwd><kwd>цепи Маркова</kwd><kwd>нейронные сети</kwd><kwd>интегрированные среды разработки</kwd><kwd>язык программирования Pascal</kwd></kwd-group><kwd-group xml:lang="en"><kwd>code completion</kwd><kwd>Markov chains</kwd><kwd>neural networks</kwd><kwd>integrated development environment</kwd><kwd>Pascal programming language</kwd></kwd-group></article-meta></front><back><ref-list><title>References</title><ref id="cit1"><label>1</label><citation-alternatives><mixed-citation xml:lang="ru">Marasoiu M., Church L., Blackwell A. An empirical investigation of code completion usage by professional software developers. Annual Workshop of the Psychology of Programming Interest Group, 2015.</mixed-citation><mixed-citation xml:lang="en">Marasoiu M., Church L., Blackwell A. An empirical investigation of code completion usage by professional software developers. Annual Workshop of the Psychology of Programming Interest Group, 2015.</mixed-citation></citation-alternatives></ref><ref id="cit2"><label>2</label><citation-alternatives><mixed-citation xml:lang="ru">Han S., Wallace D. R., Miller R. C. Code Completion from Abbreviated Input. 2009 IEEE/ACM International Conference on Automated Software Engineering, Nov. 2009, pp. 332–343. DOI: 10.1109/ase.2009.64</mixed-citation><mixed-citation xml:lang="en">Han S., Wallace D. R., Miller R. C. Code Completion from Abbreviated Input. 2009 IEEE/ACM International Conference on Automated Software Engineering, Nov. 2009, pp. 332–343. DOI: 10.1109/ase.2009.64</mixed-citation></citation-alternatives></ref><ref id="cit3"><label>3</label><citation-alternatives><mixed-citation xml:lang="ru">Hou D., Pletcher D. M. An evaluation of the strategies of sorting, filtering, and grouping API methods for Code Completion. 
2011 27th IEEE International Conference on Software Maintenance (ICSM), Sep. 2011, pp. 233–242. DOI: 10.1109/icsm.2011.6080790</mixed-citation><mixed-citation xml:lang="en">Hou D., Pletcher D. M. An evaluation of the strategies of sorting, filtering, and grouping API methods for Code Completion. 2011 27th IEEE International Conference on Software Maintenance (ICSM), Sep. 2011, pp. 233–242. DOI: 10.1109/icsm.2011.6080790</mixed-citation></citation-alternatives></ref><ref id="cit4"><label>4</label><citation-alternatives><mixed-citation xml:lang="ru">Ginzberg A., Kostas L., Balakrishnan T. Automatic code completion. Stanford, Class Project, 2017.</mixed-citation><mixed-citation xml:lang="en">Ginzberg A., Kostas L., Balakrishnan T. Automatic code completion. Stanford, Class Project, 2017.</mixed-citation></citation-alternatives></ref><ref id="cit5"><label>5</label><citation-alternatives><mixed-citation xml:lang="ru">Svyatkovskiy A., Zhao Y., Fu S., Sundaresan N. Pythia: AI-assisted Code Completion System. Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery &amp; Data Mining, Jul. 2019, pp. 2727–2735. DOI: 10.1145/3292500.3330699</mixed-citation><mixed-citation xml:lang="en">Svyatkovskiy A., Zhao Y., Fu S., Sundaresan N. Pythia: AI-assisted Code Completion System. Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery &amp; Data Mining, Jul. 2019, pp. 2727–2735. DOI: 10.1145/3292500.3330699</mixed-citation></citation-alternatives></ref><ref id="cit6"><label>6</label><citation-alternatives><mixed-citation xml:lang="ru">Buksbaum D. Increasing Code Completion Accuracy in Pythia Models for Non-Standard Python Libraries. Doctoral dissertation. Nova Southeastern University, 2023, https://nsuworks.nova.edu/gscis_etd/1188</mixed-citation><mixed-citation xml:lang="en">Buksbaum D. Increasing Code Completion Accuracy in Pythia Models for Non-Standard Python Libraries. Doctoral dissertation. 
Nova Southeastern University, 2023, https://nsuworks.nova.edu/gscis_etd/1188</mixed-citation></citation-alternatives></ref><ref id="cit7"><label>7</label><citation-alternatives><mixed-citation xml:lang="ru">Eysholdt M., Behrens H. Xtext: implement your language faster than the quick and dirty way. Proceedings of the ACM international conference companion on Object oriented programming systems languages and applications companion, Oct. 2010, pp. 307–309. DOI: 10.1145/1869542.1869625</mixed-citation><mixed-citation xml:lang="en">Eysholdt M., Behrens H. Xtext: implement your language faster than the quick and dirty way. Proceedings of the ACM international conference companion on Object oriented programming systems languages and applications companion, Oct. 2010, pp. 307–309. DOI: 10.1145/1869542.1869625</mixed-citation></citation-alternatives></ref><ref id="cit8"><label>8</label><citation-alternatives><mixed-citation xml:lang="ru">Hellendoorn V. J., Devanbu P. Are deep neural networks the best choice for modeling source code? Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering, Aug. 2017, pp. 763–773. DOI: 10.1145/3106237.3106290</mixed-citation><mixed-citation xml:lang="en">Hellendoorn V. J., Devanbu P. Are deep neural networks the best choice for modeling source code? Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering, Aug. 2017, pp. 763–773. DOI: 10.1145/3106237.3106290</mixed-citation></citation-alternatives></ref><ref id="cit9"><label>9</label><citation-alternatives><mixed-citation xml:lang="ru">Chan K. C., Lenard C. T., Mills T. M. An introduction to Markov chains. 49th Annual Conference of Mathematical Association of Victoria, 2012, pp. 40–47. DOI: 10.13140/2.1.1833.8248</mixed-citation><mixed-citation xml:lang="en">Chan K. C., Lenard C. T., Mills T. M. An introduction to Markov chains. 49th Annual Conference of Mathematical Association of Victoria, 2012, pp. 40–47. 
DOI: 10.13140/2.1.1833.8248</mixed-citation></citation-alternatives></ref><ref id="cit10"><label>10</label><citation-alternatives><mixed-citation xml:lang="ru">Cormen T. H., Leiserson C. E., Rivest R. L., Stein C. Introduction to Algorithms (2nd ed.). MIT Press and McGraw-Hill, 2001, pp. 214–217.</mixed-citation><mixed-citation xml:lang="en">Cormen T. H., Leiserson C. E., Rivest R. L., Stein C. Introduction to Algorithms (2nd ed.). MIT Press and McGraw-Hill, 2001, pp. 214–217.</mixed-citation></citation-alternatives></ref><ref id="cit11"><label>11</label><citation-alternatives><mixed-citation xml:lang="ru">Wu B., Liang B., Zhang X. Turn tree into graph: Automatic code review via simplified AST driven graph convolutional network. Knowledge-Based Systems, 2022, art. 109450. DOI: 10.1016/j.knosys.2022.109450</mixed-citation><mixed-citation xml:lang="en">Wu B., Liang B., Zhang X. Turn tree into graph: Automatic code review via simplified AST driven graph convolutional network. Knowledge-Based Systems, 2022, art. 109450. DOI: 10.1016/j.knosys.2022.109450</mixed-citation></citation-alternatives></ref></ref-list><fn-group><fn fn-type="conflict"><p>The author declares no conflicts of interest.</p></fn></fn-group></back></article>
