Preview

Vestnik NSU. Series: Information Technologies

Advanced search

Analysis and testing of the RadixSpline Learned index

https://doi.org/10.25205/1818-7900-2025-23-3-67-78

Abstract

Recent advances in machine learning have introduced a novel approach to database indexing known as learned indexes. Among these, the RadixSpline index has emerged as a particularly promising solution for efficient search in sorted key-value data. This paper presents a comprehensive analysis and performance evaluation of the RadixSpline index, which combines spline approximation with bounded error (GreedySpline) and a sparse radix table structure. We investigate the index’s behavior under different parameter configurations, focusing on two key parameters: the spline approximation error (err) and the radix prefix length (r). Our experimental study, conducted on a 25 MB subset of the MovieLens32M dataset, demonstrates that RadixSpline can achieve up to 30% faster search times compared to traditional binary search while reducing memory usage by over 80 % compared to RocksDB’s sparse index implementation. The study makes several important contributions: (1) we provide a detailed analysis of parameter sensitivity, identifying optimal configurations for various use cases; (2) we address implementation challenges not covered in the original work, including handling of non-contiguous prefixes and missing keys; (3) we demonstrate the index’s suitability for real-time applications through its single-pass construction property. Our results confirm that RadixSpline offers significant advantages over conventional indexing methods, particularly in scenarios requiring fast searches with limited memory resources. The findings suggest strong potential for integrating RadixSpline into production database systems, with RocksDB being a particularly promising candidate for such integration.

About the Authors

I. M. Stetskii
Novosibirsk State University
Russian Federation

Ilya M. Stetsky, Master’s Student

Novosibirsk



B. N. Pischik
Novosibirsk State University
Russian Federation

Boris N. Pishik, Associate Professor

Novosibirsk



References

1. Kraska T., Beutel A., Chi E.H., Dean J., Polyzotis N. The Case for Learned Index Structures // arXiv:1712.01208 [cs.DB]. 2017. DOI: 10.48550/arXiv.1712.01208

2. Kipf A., Marcus R., van Renen A., Stoian M., Kemper A., Kraska T., Neumann T. RadixSpline: A Single-Pass Learned Index // Proceedings of the Third International Workshop on Exploiting Artifi cial Intelligence Techniques for Data Management (aiDM ‘20). 2020. Art. No. 5. P. 1–5. DOI: 10.1145/3401071.3401659.

3. Neumann T., Michel S. Smooth Interpolating Histograms with Error Guarantees // Proceedings of the 11th International Conference on Extending Database Technology (EDBT ‘08). Cardiff , UK, 2008. P. 42–53. DOI: 10.1007/978-3-540-70504-8-12.

4. MovieLens32M Dataset [Электронный ресурс] / URL: GroupLens Research (дата обращения: 10.12.2024).

5. Index Block Format [Электронный ресурс] // RocksDB Wiki. – URL: https://github.com/facebook/rocksdb/wiki/Index-Block-Format (дата обращения: 10.08.2024).

6. RocksDB: A Persistent Key-Value Store [Электронный ресурс] / Meta Open Source. URL: https://github.com/facebook/rocksdb (дата обращения: 10.08.2024).

7. O’Neil P. The Log-Structured Merge-Tree (LSM-Tree) // Acta Informatica. 1996. Vol. 33. No. 4. P. 351–385. DOI: 10.1007/s002360050048.

8. Marcus R., Kipf A., van Renen A., Stoian M., Misra S., Kemper A., Neumann T., Kraska T. Benchmarking Learned Indexes // Proceedings of the VLDB Endowment (PVLDB). 2021. Vol. 14. No. 1. P. 1–13. DOI: 10.14778/3421424.3421425.

9. Li Z., Chan T. N., Yiu M. L., Jensen C. S. A Survey of Learned Indexes for the Multi-dimensional Space // arXiv:2403.06456 [cs.DB]. 2024. DOI: 10.48550/arXiv.2403.06456

10. Ferragina P., Vinciguerra G. A Critical Analysis of Recursive Model Indexes // arXiv:2106.16166 [cs.DB]. 2021. DOI: 10.48550/arXiv.2106.16166

11. Ding J., Minhas U. F., Yu J., Wang C., Li Y., Zhang H., Chandramouli B., Gehrke J., Kossmann D., Lomet D., Kraska T. Towards Practical Learned Indexing (Extended Abstracts) // arXiv:2108.05117 [cs.DB]. 2021. DOI: 10.48550/arXiv.2108.05117

12. Wu J., Zhang Y., Chen S., Wang J., Chen Y., Xing C. LSI: A Learned Secondary Index Structure // arXiv:2205.05769 [cs.DB]. 2022. DOI: 10.48550/arXiv.2205.05769

13. Hadian A., Heinis T. Shift-Table: A Low-Latency Learned Index for Range Queries using Model Correction // arXiv:2103.13160 [cs.DB]. 2021. DOI: 10.48550/arXiv.2103.13160

14. Vaidya K., Kraska T. Partitioned Learned Bloom Filter // Proceedings of the VLDB Endowment. 2020. Vol. 13. No. 12. P. 3298–3311. DOI: 10.14778/3415478.3415545

15. Spector B., Marcus R., van Renen A., Stoian M., Kemper A., Kraska T., Neumann T. Bounding the Last Mile: Effi cient Learned String Indexing // arXiv:2111.14905 [cs.DB]. 2021. DOI: 10.48550/arXiv.2111.14905

16. Crotty A. Hist-Tree: Those Who Ignore It Are Doomed to Learn [Электронный ресурс] // CIDR 2021. URL: https://cs.brown.edu/people/acrotty/pubs/cidr2021-paper20.pdf (дата обращения: 02.02.2025).


Review

For citations:


Stetskii I.M., Pischik B.N. Analysis and testing of the RadixSpline Learned index. Vestnik NSU. Series: Information Technologies. 2025;23(3):67-78. (In Russ.) https://doi.org/10.25205/1818-7900-2025-23-3-67-78

Views: 10


Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 License.


ISSN 1818-7900 (Print)
ISSN 2410-0420 (Online)