Preconditions-Based Algorithm for Safe Start of Replication in Fault-Tolerant PostgreSQL Cluster

A. S. Rudometov; M. V. Rutman

doi:10.25205/1818-7900-2025-23-2-29-42

Abstract

Traditionally, fault-tolerant DBMS clusters based on PostgreSQL or its derivatives are built on a replication mechanism that operates by shipping the write-ahead log. The default checks are aimed only at preserving the integrity of the received records. Under certain conditions, starting replication can leave a standby node with data that differs from the other nodes, or unable to complete its startup procedures. Existing high-availability systems are forced to cope with this problem by recreating such nodes from backups, which is usually costly in terms of recovery time.
To address this issue, we propose an algorithm that prevents replication from starting when it is guaranteed to lead to data differences or node startup failure. To detect such cases, the node collects information about the write-ahead logs in the cluster and performs additional checks. If replication is blocked, the node can be synchronized automatically so that replication can be started afterwards.
We tested the algorithm on various real-world cluster configurations with simulated failures. The experimental results indicate that the algorithm substantially reduces the chance of a node becoming ineligible to restart.
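
The abstract does not list the specific checks the algorithm performs, so the following is only an illustrative sketch of what one precondition on write-ahead log positions could look like, not the authors' method. It relies on general PostgreSQL timeline semantics: a standby can follow an upstream node only if the standby's timeline is either the upstream's current timeline or one of its ancestors, and the standby has not written past the point where the upstream's history left that timeline. All names (`parse_lsn`, `TimelineSwitch`, `can_start_replication`) are hypothetical, and the rule for nodes on the same timeline is a conservative assumption.

```python
from dataclasses import dataclass
from typing import List


def parse_lsn(text: str) -> int:
    """Convert an LSN in PostgreSQL's 'X/Y' hex notation to an integer."""
    high, low = text.split("/")
    return (int(high, 16) << 32) | int(low, 16)


@dataclass(frozen=True)
class TimelineSwitch:
    """One ancestor in an upstream timeline history (hypothetical type):
    timeline `tli` was followed up to `switch_lsn`, then a new one began."""
    tli: int
    switch_lsn: int


def can_start_replication(
    standby_tli: int,
    standby_lsn: int,
    upstream_tli: int,
    upstream_lsn: int,
    upstream_history: List[TimelineSwitch],
) -> bool:
    """Return True if starting replication is not known to cause divergence.

    This is an assumed precondition, not the check from the paper.
    """
    if standby_tli == upstream_tli:
        # Assumption: on a shared timeline the standby must not be ahead
        # of the upstream node.
        return standby_lsn <= upstream_lsn

    for entry in upstream_history:
        if entry.tli == standby_tli:
            # The standby's timeline is an ancestor of the upstream's.
            # It is safe only if the standby stopped at or before the fork.
            return standby_lsn <= entry.switch_lsn

    # The standby's timeline is not part of the upstream's history at all.
    return False


# Upstream is on timeline 2, which forked from timeline 1 at 0/3000000.
history = [TimelineSwitch(tli=1, switch_lsn=parse_lsn("0/3000000"))]
upstream_pos = parse_lsn("0/4000000")

behind_fork = can_start_replication(1, parse_lsn("0/2FFFFFF"), 2, upstream_pos, history)
past_fork = can_start_replication(1, parse_lsn("0/3000060"), 2, upstream_pos, history)
print(behind_fork, past_fork)
```

In this sketch, a node for which the function returns `False` would have replication blocked and would need to be synchronized first (for example with a tool such as `pg_rewind`) rather than being recreated from a backup.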

Keywords

replication, fault-tolerant DBMS cluster, high availability, failure prediction, PostgreSQL

About the Authors

A. S. Rudometov

Novosibirsk State University
Russian Federation

Andrey S. Rudometov, Master’s Student

Novosibirsk

M. V. Rutman

Novosibirsk State University
Russian Federation

Mikhail V. Rutman, Associate Professor

Novosibirsk

For citations:

Rudometov A.S., Rutman M.V. Preconditions-Based Algorithm for Safe Start of Replication in Fault-Tolerant PostgreSQL Cluster. Vestnik NSU. Series: Information Technologies. 2025;23(2):29-42. (In Russ.) https://doi.org/10.25205/1818-7900-2025-23-2-29-42

This work is licensed under a Creative Commons Attribution 4.0 License.

ISSN 1818-7900 (Print)
ISSN 2410-0420 (Online)
