June 20th, 2024

Schema changes and the Postgres lock queue

Schema changes in Postgres can cause downtime due to locking issues. Tools like pgroll help manage migrations by handling lock acquisition failures, preventing application unavailability. Setting lock_timeout on DDL statements is crucial for smooth schema changes.

Schema changes in Postgres can cause downtime by locking out reads and writes, and migration tools can help mitigate the problem. Database migrations break applications in two main ways: by making incompatible schema changes, and by locking a database object for an unacceptable length of time. The second kind arises when a DDL statement has to wait behind a long-running query for its lock: while it waits, other statements that need a lock on the same table queue behind it, so reads and writes stall and the application becomes unavailable.

Postgres provides the lock_timeout setting to control how long a statement waits to acquire a lock before giving up. Setting an appropriate lock_timeout on DDL statements limits how long other queries can be stuck behind them. Tools like pgroll add backoff and retry strategies that handle lock acquisition failures automatically, reducing the risk of blocking reads and writes during a schema change.
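As a rough illustration of combining lock_timeout with backoff and retry, here is a minimal Python sketch. It is not pgroll's implementation. The `execute` callable, the `LockTimeout` exception, and all parameter names and defaults are hypothetical stand-ins for whatever database driver is in use; only the `SET LOCAL lock_timeout` statement is actual Postgres syntax.

```python
import random
import time


class LockTimeout(Exception):
    """Hypothetical stand-in for the driver error raised when lock_timeout expires."""


def run_ddl_with_retry(execute, ddl, lock_timeout_ms=1000, max_attempts=5,
                       base_delay_s=0.5, sleep=time.sleep):
    """Run `ddl` under a lock_timeout, retrying with exponential backoff.

    `execute` is an assumed callable that runs a list of SQL statements in one
    transaction and raises LockTimeout if the lock is not acquired in time.
    Returns the number of the attempt that succeeded.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            # SET LOCAL limits the timeout to the current transaction.
            execute([f"SET LOCAL lock_timeout = '{lock_timeout_ms}ms'", ddl])
            return attempt
        except LockTimeout:
            if attempt == max_attempts:
                raise  # give up and let the caller decide what to do
            # Double the wait each time and add jitter so retries spread out.
            delay = base_delay_s * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 2))
```

Because the DDL statement gives up after `lock_timeout_ms` instead of waiting indefinitely, queries that queued behind it are released between attempts. On a very busy table every attempt can still time out, in which case the final `LockTimeout` propagates to the caller.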

Related

AI-powered conversion from Enzyme to React Testing Library

Slack engineers transitioned from Enzyme to React Testing Library due to React 18 compatibility issues. They used AST transformations and LLMs for automated conversion, achieving an 80% success rate.

Avoiding Emacs Bankruptcy

Avoid "Emacs bankruptcy" by choosing efficient packages, deleting unnecessary configurations, and focusing on Emacs's core benefits. Prioritize power-to-weight ratio to prevent slowdowns and maintenance issues. Regularly reassess for a streamlined setup.

FreeBSD Bhyve Companion Tools

The author details transitioning from VirtualBox to FreeBSD Bhyve, praising Bhyve's benefits in a FreeBSD setting. Tools like VNC connection and pause/resume scripts optimize Bhyve operations, simplifying VM management.

Software Engineering Practices (2022)

Gergely Orosz sparked a Twitter discussion on software engineering practices. Simon Willison elaborated on key practices in a blog post, emphasizing documentation, test data creation, database migrations, templates, code formatting, environment setup automation, and preview environments. Willison highlights the productivity and quality benefits of investing in these practices and recommends tools like Docker, Gitpod, and Codespaces for implementation.

I found an 8 years old bug in Xorg

An 8-year-old Xorg bug related to epoll misuse was found by a picom developer. The bug caused windows to disappear during server lock, traced to CloseDownClient events. Despite limited impact, the developer seeks alternative window tree updates, emphasizing testing and debugging tools.

4 comments
By @petergeoghegan - 4 months
"So far, so good. What's the problem here? The DDL statement can simply wait patiently until it's able to acquire its ACCESS EXCLUSIVE lock, right? The problem is that any other statements that require a lock on the users table are now queued behind this ALTER TABLE statement, including other SELECT statements that only require ACCESS SHARE locks."

This is the most important individual point that the blog post makes in my view. Lots of Postgres users are left with the wrong general idea about how things in this area work (e.g., the locking implications of autovacuum), just because they missed this one subtlety.

I'm not sure what can be done about that. It seems like Postgres could do a better job of highlighting when an interaction between two disparate transactions/sessions causes disruption.
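The queueing subtlety quoted above can be illustrated with a toy model in Python. This is a deliberately simplified sketch, not how Postgres implements its lock manager: it models only the two lock modes from the quote, and the `TableLock` class and session names are invented for illustration.

```python
# Pairs of lock modes that cannot be held on the same table at once.
CONFLICTS = {
    ("ACCESS SHARE", "ACCESS EXCLUSIVE"),
    ("ACCESS EXCLUSIVE", "ACCESS SHARE"),
    ("ACCESS EXCLUSIVE", "ACCESS EXCLUSIVE"),
}


def conflicts(a, b):
    return (a, b) in CONFLICTS


class TableLock:
    def __init__(self):
        self.holders = []  # (session, mode) pairs currently granted
        self.waiters = []  # (session, mode) pairs queued in arrival order

    def request(self, session, mode):
        """Return True if granted immediately, False if the request is queued."""
        blocked_by_holder = any(conflicts(mode, m) for _, m in self.holders)
        # The key subtlety: a conflicting request already waiting also blocks.
        blocked_by_waiter = any(conflicts(mode, m) for _, m in self.waiters)
        if blocked_by_holder or blocked_by_waiter:
            self.waiters.append((session, mode))
            return False
        self.holders.append((session, mode))
        return True

    def release(self, session):
        """Drop a session's lock, then grant waiters from the front of the queue."""
        self.holders = [(s, m) for s, m in self.holders if s != session]
        while self.waiters:
            _, mode = self.waiters[0]
            if any(conflicts(mode, held) for _, held in self.holders):
                break
            self.holders.append(self.waiters.pop(0))


lock = TableLock()
print(lock.request("long SELECT", "ACCESS SHARE"))      # True: granted
print(lock.request("ALTER TABLE", "ACCESS EXCLUSIVE"))  # False: waits for the SELECT
print(lock.request("quick SELECT", "ACCESS SHARE"))     # False: queued behind ALTER TABLE
```

The third request would be compatible with the lock that is actually held, yet it still waits, because a conflicting ACCESS EXCLUSIVE request is ahead of it in the queue.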

By @jauntywundrkind - 4 months
Our head of department basically put the kibosh on everyone's massive enthusiasm for migrating more and more MySQL systems to Postgres, because once upon a time someone tried to run a schema change and Postgres never seemed able to give the schema change the lock.

We didn't really get told about the problem at the time, so we don't know much about what happened. And the anger-driven (anti) development seems like it's forever going to sit there, unresolved.

Very frustrating situation for us.

By @alflervag - 4 months
If this is of interest I suggest taking a look at what Robin is doing at https://kaveland.no/eugene/

TL;DR: Eugene helps you write zero-downtime schema migration scripts for PostgreSQL databases by giving you a friendly report warning you about any anti-patterns or potential problems.

By @modestygrime - 4 months
So how do people handle this in practice if the users table in this example has a ton of traffic? It might not ever succeed even with exponential backoff. It also seems strange that Postgres would need to lock the entire table just to add a new column.