July 2nd, 2024

Plaintext is not a great format for (system) logs

Plain text is a poor fit for system logs because log messages carry rich metadata. The options are to embed the metadata in the text, store it implicitly in separate files, or discard it. Once tools are needed to manage that metadata, the logs are effectively more structured than plain text. JSON, though text-based, arguably does not count as plain text.

The blog post discusses the limitations of using plain text for storing system logs, regardless of the tool being used. It highlights that log messages often come with rich metadata that can be challenging to handle in plain text format. The post outlines three approaches to dealing with metadata in logs: augmenting log messages with metadata in text format, storing metadata by implication in separate files, or discarding the metadata altogether. It points out that as systems attach more metadata to log messages, storing logs in plain text can lead to clutter, complexity, or loss of metadata. The post suggests that relying on tools to manipulate metadata for readability can result in logs becoming more structured than plain text. Ultimately, it mentions that while technically JSON is text, it may not qualify as plain text due to its structured nature.

10 comments
By @crest - 10 months
When your "compressed database" is both larger and slower to query than the plaintext, has an unstable on-disk format, and isn't even crash-safe, like journald's database, I have to assume it's a sick joke and not a serious upgrade from plaintext logs on a (compressing) file system.
By @7thaccount - 10 months
You can make it all some kind of JSON object and use plenty of tools to parse it.
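
As a rough illustration of the JSON suggestion above, here is a minimal sketch that assumes one JSON object per log line. The field names (`ts`, `unit`, `priority`, `msg`) and the `select` helper are hypothetical, chosen only for this example; they are not taken from the article or from any particular logging system.

```python
import json

# Two hypothetical log records, one JSON object per line.
raw_log = """\
{"ts": "2024-07-02T10:00:00Z", "unit": "sshd.service", "priority": "info", "msg": "session opened"}
{"ts": "2024-07-02T10:00:05Z", "unit": "backup.service", "priority": "err", "msg": "snapshot failed"}
"""


def select(lines, **wanted):
    """Yield parsed records whose fields equal every key=value in `wanted`."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        if all(record.get(key) == value for key, value in wanted.items()):
            yield record


# Filter on metadata without writing a regular expression.
errors = list(select(raw_log.splitlines(), priority="err"))
for record in errors:
    print(record["ts"], record["unit"], record["msg"])
```

Each line is still text that grep can search, but the metadata has to be read back through a JSON parser, which is the trade-off the article points at.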

I think text is really awesome though. You can read it with any editor, and there are a zillion easy-to-learn Linux tools like grep, cut, awk, etc. for interactively creating a one-liner in a short amount of time that you can use to get what you want out of the text. I used to have a bunch of those when I was in charge of some applications on a Linux server. Text is very universal, and I'm not just talking about log files.

By @likeabatterycar - 10 months
Absolutely nothing wrong with binary logs as long as the system provides a utility to get the data out of that format.

For that matter, journalctl is terrible, cryptic, and difficult to use.

By @jmclnx - 10 months
The title should be "Plaintext is not a great format for (system) logs for people who have no idea about regexp" :)

In my opinion, plain text wins for everything: it is very portable, and people can read it with very little knowledge.

By @exabrial - 10 months
plaintext and grep were fine for your grandpappy and they're fine for you.
By @PreInternet01 - 10 months
Au contraire, plain text is just about the only format for logs that works. Having to use some obscure tool to query logs just isn't worth it, especially not in 20 years' time, when you still need to analyze said logs but the tool doesn't even run on any contemporary systems anymore and the documentation about the binary format has been irretrievably lost.

And yeah, sure, you want to use structured logging. So, in addition to the greppable log message, include a Magic Separator Character (say, \t) that you treat as 'end-of-line' in human-oriented processing, and have your key-value-pair structured data following that for automated tooling to have its way with. Or, be really creative with [key=value] blocks that are both human- and machine-readable.
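
A minimal sketch of the separator scheme described above, assuming a single tab ends the human-readable message and space-separated key=value pairs follow it. The helper names `format_line` and `parse_line` and the example fields are hypothetical. The sketch also assumes that the message contains no tab and that values contain no whitespace; a real format would need an escaping rule for both.

```python
def format_line(message: str, **fields: str) -> str:
    """Return the message, then a tab, then space-separated key=value pairs."""
    pairs = " ".join(f"{key}={value}" for key, value in fields.items())
    return f"{message}\t{pairs}" if pairs else message


def parse_line(line: str) -> tuple[str, dict[str, str]]:
    """Split a line at the first tab into (message, metadata dict)."""
    message, _, tail = line.partition("\t")
    # Every whitespace-separated token after the tab is expected to be key=value.
    fields = dict(pair.split("=", 1) for pair in tail.split())
    return message, fields


line = format_line(
    "disk usage above threshold",
    unit="backup.service",
    pid="4121",
    priority="warning",
)
message, fields = parse_line(line)
print(message)
print(fields["unit"], fields["priority"])
```

Human-oriented tools can cut each line at the tab, for example with `cut -f1`, while automated tooling reads only the part after it.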

Some of my most painful logging experiences were having to extract application logs from binary blobs created by a proprietary Windows tool (no, not the main Windows event log: that's bad, but at least documented). Even the most recent versions of the official viewer just crashed, the vendor was not interested in fixing that (since we were not a customer, just a third party in need of log data, but even an offer of a modest payment was met with indifference...), and the actual format turned out to be byzantine beyond words.

So, yeah, give me plain text all day, every day...

By @lucianbr - 10 months
On the other hand, multiple competing binary log formats make it harder to process the metadata than plaintext.
By @fareesh - 10 months
text is great for awk, grep, etc

perhaps some llm-powered awk command generator would be useful?

By @JackSlateur - 10 months
tldr: the author confuses plain text (aka, storing in ascii files) with storing in some binary format (whatever it is). Said confusion makes the whole post worthless, in my opinion.