July 22nd, 2024

Difference between running Postgres for yourself and for others

The post compares self-managed PostgreSQL with managing it for others, focusing on provisioning, backup/restore, HA, and security. It addresses complexities in provisioning, backup strategies, HA setup, and security measures for external users.

Read original articleLink Icon
Difference between running Postgres for yourself and for others

The blog post discusses the disparities between running PostgreSQL for oneself versus managing it for others. It highlights key differences in provisioning, backup/restore, high availability (HA), and security aspects. When managing PostgreSQL for others, complexities arise in provisioning due to the need for extensions, certificates, and DNS records. The backup process involves full backups and incremental changes using write-ahead log files for point-in-time restores. HA involves setting up primary and standby databases with health checks and fencing the primary to prevent data loss during failovers. Lastly, security concerns include preventing unauthorized access like running OS commands from PostgreSQL. The post delves into technical details such as optimizing provisioning times, overcoming backup challenges related to low activity, and ensuring robust fencing mechanisms for HA setups. Overall, it provides insights into the additional considerations and challenges involved in managing PostgreSQL for external users compared to personal deployments.

Link Icon 8 comments
By @pwmtr - 6 months
Hey, author for the blog post is here. If you have any questions or comments, please let me know!

It's also worth calling out the first diagram shows dependencies between features for Ubicloud's managed Postgres. AWS, Azure, and GCP's managed Postgres service would have a different diagram. That's because we at Ubicloud treat write-ahead logs (WAL) as a first class citizen.

By @ramonverse - 6 months
> Security: Did you know that with one simple trick you can drop to the OS from PostgreSQL and managed service providers hate that? The trick is COPY table_name from COMMAND

I certainly did not know that.

By @zhengiszen - 6 months
In Red Hat ecosystem there is an Ansible role to that end : https://github.com/linux-system-roles/postgresql I don't know if it will help everyone but it could be a good way to standardize and maintain an instance configuration
By @wolfhumble - 6 months
From the article:

"The problem with the server certificate is that someone needs to sign it. Usually you would want that certificate to be signed by someone who is trusted globally like DigiCert. But that means you need to send an external request; and the provider at some point (usually in minutes but sometimes it can be hours) signs your certificate and returns it back to you. This time lag is unfortunately not acceptable for users. So most of the time you would sign the certificate yourself. This means you need to create a certificate authority (CA) and share it with the users so they can validate the certificate chain. Ideally, you would also create a different CA for each database."

Couldn't you automate this with Let's Encrypt instead?

Thanks!

By @tehlike - 6 months
I feel like cloudnative-pg takes away majority of pain using well known kubernetes operator pattern.
By @superice - 6 months
I'm a little confused about the point-in-time restore functionality, I'm pretty sure there must be a way to not have to force those one minute WAL boundaries. DigitalOceans managed PostgreSQL for instance just allows you to specify a timestamp and restore, and when looking into the PostgreSQL docs I remember seeing an option to specify a timestamp as well.
By @thyrsus - 6 months
Thanks for the article; it's an important checklist.

I would think you'd want at most one certificate authority per customer rather than per database. Why am I wrong?