Methodology for Evaluating the Results of Feature Selection Task Solving
https://doi.org/10.25205/1818-7900-2025-23-2-53-63
Abstract
This article examines methods for evaluating the effectiveness of feature selection algorithms and proposes a new methodology for their evaluation. Existing evaluation methods and approaches do not always adequately reflect the actual efficiency of algorithms, especially when applied to real-world problems. The article reviews researchers' views on the internal and external validity of existing evaluation methods and assesses the impact of various factors, including data volume and differences in algorithm implementations. The authors propose a new integrated approach to evaluating the effectiveness of algorithms, which includes a set of indicators such as resource costs, stability, and solution quality. A distinctive feature of the proposed approach is the use of artificially generated data, which makes it possible to take specific characteristics of real data into account and to evaluate algorithms under controlled conditions. This allows their efficiency and reliability to be determined more accurately. The article also presents the results of preliminary testing of the proposed methodology on artificial data. The analysis of these results demonstrates the advantages of the new approach over traditional evaluation methods. In particular, the new methodology was found to evaluate the stability and quality of algorithm performance more accurately, which is crucial when choosing an appropriate algorithm for a specific task. In conclusion, the authors emphasize the need to further develop and expand the proposed methodology and to adapt it to various types of tasks. They also point out the prospects for integrating the methodology into specialized software packages, which would make it accessible to a wide range of users and accelerate the adoption of innovative algorithms in practice.
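The general idea of scoring a feature selection algorithm on artificial data with several indicators at once can be illustrated with a minimal sketch. This is not the authors' methodology: the data generator, the toy correlation-based selector, and the specific indicators (recall of the truly informative features as quality, mean pairwise Jaccard similarity as stability, wall-clock time as resource cost) are all assumptions chosen for illustration, as are the names `make_data`, `select_top_k`, and `evaluate`.

```python
import random
import time
from itertools import combinations


def make_data(n_rows, n_features, n_informative, seed):
    """Synthetic data: the target depends only on the first n_informative features."""
    rng = random.Random(seed)
    X = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_rows)]
    y = [sum(row[:n_informative]) + rng.gauss(0, 0.5) for row in X]
    return X, y


def abs_correlation(xs, ys):
    """Absolute Pearson correlation between two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    return abs(cov) / (vx * vy) ** 0.5 if vx > 0 and vy > 0 else 0.0


def select_top_k(X, y, k):
    """Toy filter selector (hypothetical): keep the k features most correlated with y."""
    scores = [abs_correlation([row[j] for row in X], y) for j in range(len(X[0]))]
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return set(ranked[:k])


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def evaluate(selector, n_runs=5, n_rows=200, n_features=20, n_informative=4):
    """Score a selector on freshly generated datasets with a known ground truth."""
    truth = set(range(n_informative))
    subsets, recalls, seconds = [], [], 0.0
    for seed in range(n_runs):
        X, y = make_data(n_rows, n_features, n_informative, seed)
        start = time.perf_counter()
        chosen = selector(X, y, n_informative)
        seconds += time.perf_counter() - start  # time only the selector itself
        subsets.append(chosen)
        recalls.append(len(chosen & truth) / n_informative)
    pairs = list(combinations(subsets, 2))
    stability = sum(jaccard(a, b) for a, b in pairs) / len(pairs) if pairs else 1.0
    return {"quality": sum(recalls) / n_runs, "stability": stability, "seconds": seconds}


if __name__ == "__main__":
    print(evaluate(select_top_k))
```

Because the generator fixes which features are informative, quality can be measured against a known ground truth, which is not possible on most real datasets. Any other selector with the same `(X, y, k)` signature could be passed to `evaluate` for a side-by-side comparison.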
About the Authors
A. D. Cheremuhin
Russian Federation
Artem D. Cheremuhin, Ph.D. in Economics, Associate Professor
Knyaginino
A. D. Rein
Russian Federation
Andrey D. Rein, Ph.D. in Economics, Associate Professor
Knyaginino
For citations:
Cheremuhin A.D., Rein A.D. Methodology for Evaluating the Results of Feature Selection Task Solving. Vestnik NSU. Series: Information Technologies. 2025;23(2):53-63. (In Russ.) https://doi.org/10.25205/1818-7900-2025-23-2-53-63