Preview

Vestnik NSU. Series: Information Technologies

Advanced search

Code Completion Using Markov Chains

https://doi.org/10.25205/1818-7900-2024-22-2-57-67

Abstract

Modern software engineers use many tools to speed up the development process. Many of them use integrated development environments (IDEs), which provide services such as text editors, debuggers and even intelligent code completion. This paper is dedicated to the development of a model for predicting variants of program source code termination. To improve the accuracy of the model, we used combinations of Markov chains constructed using different ways of calculating the current context of the program: linear and with AST. The linear way of computing the context is an analysis of the tokenized representation of the source code. The second method, on the other hand, uses a representation of the source code in the form of an abstract syntax tree. Combining the different models preserves more semantic information about the code, also adding the ability to support custom code writing style features. In order to compare the different models, a new dataset has been created specifically for the Pascal language. A detailed comparison of the working mechanisms as well as the prediction accuracy on the collected data is given. The proposed model showed high enough accuracy of predictions with minimal computation costs, which allows using it in integrated development environments.

About the Author

V. S. Timofeev
Novosibirsk State University
Russian Federation

Vladislav S. Timofeev, Master’s Student

Novosibirsk



References

1. Marasoiu M., Church L., Blackwell A. An empirical investigation of code completion usage by professional software developers. Annual Workshop of the Psychology of Programming Interest Group, 2015.

2. Han S., Wallace D. R., Miller R. C. Code Completion from Abbreviated Input. 2009 IEEE/ ACM International Conference on Automated Software Engineering, Nov. 2009, pр. 332–343. DOI: 10.1109/ase.2009.64

3. Hou D., Pletcher D. M. An evaluation of the strategies of sorting, filtering, and grouping API methods for Code Completion. 2011 27th IEEE International Conference on Software Maintenance (ICSM), Sep. 2011, pр. 233–242. DOI: 10.1109/icsm.2011.6080790

4. Ginzberg A., Kostas L., Balakrishnan T. Automatic code completion. Stanford, Class Project, 2017.

5. Svyatkovskiy A., Zhao Y., Fu S., Sundaresan N. Pythia: AI-assisted Code Completion System. Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Jul. 2019, pр. 2727–2735. DOI: 10.1145/3292500.3330699

6. Buksbaum D. Increasing Code Completion Accuracy in Pythia Models for Non-Standard Python Libraries. Doctoral dissertation. Nova Southeastern University, 2023, https://nsuworks.nova.edu/gscis_etd/1188

7. Eysholdt M., Behrens H. Xtext: implement your language faster than the quick and dirty way. Proceedings of the ACM international conference companion on Object oriented programming systems languages and applications companion, Oct. 2010, pр. 307–309. DOI: 10.1145/1869542.1869625

8. Hellendoorn V. J., Devanbu P. Are deep neural networks the best choice for modeling source code? Proceedings of the 2017 11th Joint Meeting on Foundations of Software Engineering, Aug. 2017, pр. 763–773. DOI: 10.1145/3106237.3106290

9. Chan K. C., Lenard C. T., Mills T. M. An introduction to Markov chains. 49th Annual Conference of Mathematical Association of Victoria, 2012, pp. 40–47. DOI: 10.13140/2.1.1833.8248

10. Cormen T. H., Leiserson C. E., Rivest R. L., Stein C. Introduction to Algorithms (2nd ed.). MIT Press and McGraw-Hill, 2001, pp. 214–217.

11. Wu B., Liang B., Zhang X. Turn tree into graph: Automatic code review via simplified AST driven graph convolutional network. Knowledge-Based Systems, 2022, pp. 109450. DOI: 10.1016/j.knosys.2022.109450


Review

For citations:


Timofeev V.S. Code Completion Using Markov Chains. Vestnik NSU. Series: Information Technologies. 2024;22(2):57-67. (In Russ.) https://doi.org/10.25205/1818-7900-2024-22-2-57-67

Views: 186


Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 1818-7900 (Print)
ISSN 2410-0420 (Online)