August 20th, 2024

Making Database Systems Usable

Database usability is essential for user satisfaction, often more important than technical performance. Despite advancements, challenges persist, including reliance on experts and complexities in querying, necessitating better interfaces and designs.

Read original articleLink Icon
FrustrationSympathyImprovement
Making Database Systems Usable

The discussion on database usability highlights its critical importance for users, emphasizing that usability often outweighs technical performance metrics. C. J. Date's 1983 keynote at Sigmod pointed out that users prioritize ease of operation over complex algorithms. A 2007 paper revisits this issue, illustrating that despite advancements, many usability challenges remain unresolved. The paper discusses the indirect costs associated with database management, where organizations often rely on technical experts to facilitate user interactions, leading to inefficiencies. It uses the example of flight booking systems to illustrate how web interfaces improved user experience by providing transparency and control. However, operationalizing databases remains complex, with users facing difficulties in querying and updating data due to the relational model's inherent challenges, such as the need for joins. The paper identifies various usability pain points, including overwhelming options, unexpected results, and the complexity of database creation. It suggests that while there is no simple solution, improvements could be made through better presentation models, intuitive interactions, and flexible schema designs. The discussion concludes by reflecting on the current state of database usability, noting that while some progress has been made, significant challenges persist, particularly with the integration of distributed systems and emerging technologies like LLMs.

- Database usability is crucial for user satisfaction and efficiency.

- Many organizations rely on technical experts, increasing indirect costs.

- The relational model complicates user interactions with databases.

- Usability challenges include overwhelming options and complex query formulation.

- Future improvements may come from better user interfaces and flexible database designs.

AI: What people are saying
The comments reflect a range of frustrations and suggestions regarding database usability and design.
  • Many users struggle with complex SQL queries, particularly JOIN operations, and express a desire for more intuitive solutions.
  • There is a call for databases to better accommodate human thinking, suggesting that current relational models can lead to unnecessary complexity.
  • Some commenters advocate for the use of nested queries over JOINs, citing ease of understanding.
  • Users highlight the importance of user-friendly interfaces and tools that simplify database interactions.
  • There is a recognition that while SQL is a standard, many users prefer alternatives that are perceived as easier to use, such as document databases.
Link Icon 13 comments
By @otoolep - about 2 months
>They care less about impressive benchmarks or clever algorithms, and more about whether they can operate and use a database efficiently to query, update, analyze, and persist their data with minimal headache.

Hugely important, and I would add "backup-and-restore" to that list. At risk of sounding conceited, ease of use is a primary goal of rqlite[1] -- because in the real world databases must be operated[2]. I never add a feature if it's going to measurably decrease how easy it is to operate the database.

[1] https://www.rqlite.io

[2] https://docs.google.com/presentation/d/1Q8lQgCaODlecHa2hS-Oe...

Disclaimer: I'm the creator of rqlite.

By @pjs_ - about 2 months
A huge fraction (not 100%, but maybe 80%) of my frustration in trying to get technical people to use a database is that they have such a hard time understanding JOINs.

People endlessly want hacks, workarounds, and un-normalized data structures, for a single reason - they don't want to have to think about JOIN. It's not actually for performance reasons, it's not actually for any reason other than it's easier to imagine a big table with lots of columns.

I'm actually sympathetic to that reticence but what I am not sympathetic about is this: why, in 2024, can't the computer figure out multi-table joins for me?

Unless your schema is really fucked up, there should only be one or two actually sensible ways to join across multiple tables. Like say I have apples in boxes in houses. I have an apple table, a box table, and a house table. apples have a box_id, boxes have a house_id. Now I want to find all the apples in a house. Do two joins, cool. But literally everyone seems to write this by hand, and it means that crud apps end up with thousands of nearly identical queries which are mostly just boringly spelling out the chain of joins that needs to be applied.

When I started using SQLAlchemy, I naively assumed that a trivial functionality of such a sophisticated ORM would be to implement `apple.join(house)`, automatically figuring out that `box` is the necessary intermediate step. SQLAlchemy even has all the additional relationship information to figure out the details of how those joins need to work. But after weeks of reading the documentation I realized that this is not a supported feature.

In the end I wrote a join tool myself, which can automatically find paths between distant tables in the schema, but it seems ludicrous to have to homebrew something like that.

I'm not a trained software engineer but it seems like this must be a very generic problem -- is there a name for the problem? Are there accepted solutions or no-go-theorems on what is possible? I have searched the internet a lot and mostly just find people saying "oh we just type out all combinatorially-many possible queries"... apologies in advance if I am very ignorant here

By @Spivak - about 2 months
I think there's a huge difference between how you design:

1. A database that's meant to be understood by programmers, make queries by the application efficient in space and time, and provide strong referential integrity.

2. A database that's meant to be played with by humans, where duplication is fine, referential integrity is a nice-to-have, every column is okay to be nullable, tables should contain complete objects as understood by the user that map 1-1ish to the real world, that eliminate as many opaque ids as feasible, and foreign keys might exist but aren't enforced in any direction.

The latter database is far more ergonomic and you won't run up against a user frustratingly bashing their keyboard because the database just refuses to do what they want. The stakes in #2 style databases are extremely low-- let the user destroy their copy of the database it's fine, we can reload from the last save.

The nice thing is that it seems very possible to go from #1 -> #2 mechanically, and hand that off to the users who want to play with it.

By @josephg - about 2 months
Something thats always bugged me about relational database modelling is how you have to use table relationships for everything. Humans have a special category for ownership (eg Order owns DeliveryAddress), which works differently from other kinds of relationships. Eg Order references Products.

This problem is heightened by the fact that a SQL table typically can't store composite types. Like, you can't make a column of lists-of-strings. Every modelling problem is solved by using more tables. And any nontrivial application ends up with a table explosion. You have to study your database very closely, then write complex, slow, join-heavy queries, just to get the information about simple stuff - like an order.

Solving every problem with tables might seem clever from a technical perspective, but its just not how humans think. There's a reason document databases like mongodb are popular. Even if they are worse technically, they're much easier to use and reason about.

There's no reason SQL databases couldn't support nested data. Postgres sort of does already via JSON fields. But it feels like you have to fight the database to do it.

By @jamesblonde - about 2 months
Prof Murat is calling for examples of how LLMs can help make DBs more usable.

I note that Julius.ai is making data analytics easier - just upload some data and ask for some charts and it does an ok job.

The problem of going from natural language to SQL is still a far from solved problem. The main benchmark for this is Bird-Bench and accuracy is only in the high 60s - https://bird-bench.github.io/

I have been working on making tables queryable using function calling, where you a fixed number of canned queries based on the primary key and an event-time column. PyData talk on this - https://www.youtube.com/watch?v=dRcjTe5qgwM

By @flowerlad - about 2 months
> Users cannot interact with the database directly

It is not super hard to find UI designed for end users, these days.

If you know the basics such as what a relational database is, then here’s is a good UI: https://visualdb.com

By @hi41 - about 2 months
How do you create the correct indexes? If we create index for all the conditions in a where clause wouldn’t the number of indexes become too many? I don’t fully grasp how to best create indexes. Does anyone have good reference and examples?
By @delifue - about 2 months
> “Find all flights from Detroit to Beijing” ... which must be joined twice with flight info to express our query

Actually this can be done by nested query, without joining. I prefer nested query than joins. Because join is "global" and nested query is "local" (global means interfering with the whole SQL statement). The local one is easier to understand.

By @boxed - about 2 months
For me the join problem doesn't exist anymore, because I use the excellent Django ORM for work and hobby projects.

This is a SELECT with 5 joins:

    Foo.objects.filter(a__b__c__d__e=3)
Types are respected, valid relations exist and invalid relations do not. Clean.
By @leandrod - about 2 months
Those who do not learn the relational model are bound to flounder. For most use cases, there is no reducible complexity in the relational model.
By @kak3a - about 2 months
Author talked about usability problem with database, it's oxymoron usability problem in this site where all the text are in gray which makes reading very painful. Had to use "Darken Text" Chrome ext so it is not killing my eye!
By @trollied - about 2 months
I think the core of this is the age old problem that people would rather invent crappy technologies that end up being a pain instead of taking some time to learn a standard, SQL.

SQL is not going to go away. It's relatively easy to learn.

By @SoftTalker - about 2 months
> You youngins may not remember, but pre-2005 we had to call a travel agent to book our flights and get paper tickets in return. This sucked, we don't have any transparency in to the process, we couldn't explore options (price, convenience, dates, airports) and customize our trip. Having access to flight booking via web was really a great improvement for user experience.

I'm going to disagree. I could call my travel agent and say "I am going to Chicago on <date> I need to be there by <time> call me back with a couple of nonstop options. And I'll need a room at the Hilton and Towers for two nights."

vs. today I can spend my own time navigating travel sites, avoiding their dark patterns, wondering if I'm really getting the best prices, making sure I understand the terms (is this refundable? are any checked bags included?) etc and then do the same for the hotel booking.

If you work in a business with a travel department or assistants who can manage your travel you have an idea, compared to the hassle of doing it all yourself.