Show HN: WAL Implementation in Golang
The "rebuf" GitHub project is a Golang implementation of Write-Ahead Logging (WAL) for logging data bytes during downstream service issues so they can be replayed later. Features include easy installation, lightweight usage, and efficient storage. The author can be contacted on GitHub at [@stym06](https://github.com/stym06).
The GitHub project "rebuf" is a Golang implementation of Write-Ahead Logging (WAL) for logging data bytes during downstream service issues, allowing later replay. Key features include creating and replaying log data on any filesystem, lightweight usage, and efficient storage and retrieval. To install, clone the repository, navigate to the project directory, and install dependencies using `go mod download`. Usage involves initializing Rebuf with specified options, writing bytes, and replaying data. The project is under the MIT License, and for inquiries, the author can be contacted on GitHub at [@stym06](https://github.com/stym06).
Related
Binrw
The tool binrw simplifies binary parsing and serialization with a declarative approach, offering readability and maintainability. It supports common tasks, generics, custom parsers, predefined types, and is safe for various environments.
I kind of like rebasing
People debate Git workflows, favoring rebasing for a linear history. Redowan Delowar prefers rebasing to consolidate changes and maintain a clean commit history. They discuss interactive rebasing benefits, squashing commits, handling conflicts, and offer practical tips.
Show HN: Gosax – A high-performance SAX XML parser for Go
The `gosax` Go library enables efficient XML SAX parsing with read-only features, high performance, SWAR optimizations, and `encoding/xml` compatibility. Installation via `go get` and contributions on GitHub are encouraged.
Rust FSM-based Resumable Postgres tasks
The "pg_task" project on GitHub manages FSM-based Resumable Postgres tasks. It features granular state machines, error handling, single-table task scheduling, task definition, execution, stopping, and updating guidelines. Licensed under MIT.
A write-ahead log is not a universal part of durability
A write-ahead log (WAL) isn't always essential for database durability. Techniques like fsync, group commit, and checksumming enhance durability. WAL remains cost-effective for ensuring durability in most cases, crucial for database administrators.
1. The rebuf.Init function panics. I almost never want a library to call panic, and when it does, I want the library function to denote that. The convention I’ve seen most often is to start the function name with Must, so MustInit instead of Init. In this case though, I think it’d be safe to be a little more lenient in what you accept as input and trim the trailing slash.
2. I never (not almost, actually never) want library code to call any of the fmt.Print functions unless the library is explicitly for writing output, or that behavior is strictly opt-in. If the library really must print things, it should take a user-supplied io.Writer and write to that. Let the user control what gets printed or not.
1. syscall.Iovec allows you to build up multiple batches semi-independently, write them all in a single syscall, and sync the file with the next one. It is a good basis for letting multiple pending writes proceed in independent goroutines while a single other goroutine takes all the responsibility for flushing data.
2. It is better to use larger preallocated files than a bunch of smaller ones, along with batching, fixed-size headers, and padding write blocks to a known size. 16 megabytes per WAL file and 128-byte padding worked well for me.
3. Batching writes until they reach a max buffer size and/or a max buffer age can also massively increase throughput. A 1 megabyte max pending write or 50 ms elapsed worked pretty well for me as a starting point for batching and throughput; dynamically tuning the age bound to the rolling average of the time taken by the last 16 write+sync operations (with a hard upper bound to deal with 99th percentile latency badness) worked better. Bounded channels and a little clever math make parallelizing all of this pretty seamless.
4. Mmap'ing the WAL files makes consistency checking and byte-level fiddling much easier on replay. No need to seek or use a buffered reader, just use slice math and copy() or append() to pull out what you need.
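The size-or-age batching from point 3 can be sketched roughly as follows. This is only an illustration under assumptions: `Batcher`, `Add`, `Flush`, and `runDemo` are hypothetical names, the flush callback stands in for one write plus one sync, and the caller passes the current time explicitly so the behavior is deterministic. The dynamic tuning of the age bound and the goroutine/channel plumbing are left out.

```go
package main

import (
	"fmt"
	"time"
)

// Batcher accumulates records and hands them to flush in one call once
// either bound is reached. All names here are hypothetical.
type Batcher struct {
	maxBytes int
	maxAge   time.Duration
	flush    func([]byte) error // e.g. one write followed by one fsync; must not retain the slice
	buf      []byte
	oldest   time.Time // arrival time of the first record in buf
}

// Add appends rec and flushes if the batch is big enough or old enough.
func (b *Batcher) Add(rec []byte, now time.Time) error {
	if len(b.buf) == 0 {
		b.oldest = now
	}
	b.buf = append(b.buf, rec...)
	if len(b.buf) >= b.maxBytes || now.Sub(b.oldest) >= b.maxAge {
		return b.Flush()
	}
	return nil
}

// Flush writes out whatever is pending. On error the data is kept so
// the caller can retry.
func (b *Batcher) Flush() error {
	if len(b.buf) == 0 {
		return nil
	}
	if err := b.flush(b.buf); err != nil {
		return err
	}
	b.buf = b.buf[:0]
	return nil
}

// runDemo returns the size of every flushed batch for a fixed input.
func runDemo() []int {
	var sizes []int
	b := &Batcher{
		maxBytes: 1 << 20,               // 1 MiB pending at most
		maxAge:   50 * time.Millisecond, // or 50 ms since the first record
		flush: func(p []byte) error {
			sizes = append(sizes, len(p))
			return nil
		},
	}
	t0 := time.Unix(0, 0)
	steps := []struct {
		size int
		at   time.Duration
	}{
		{400 << 10, 0},
		{400 << 10, 10 * time.Millisecond},
		{400 << 10, 20 * time.Millisecond}, // crosses 1 MiB: size-triggered flush
		{10, 30 * time.Millisecond},
		{10, 90 * time.Millisecond}, // 60 ms after the previous record: age-triggered flush
	}
	for _, s := range steps {
		if err := b.Add(make([]byte, s.size), t0.Add(s.at)); err != nil {
			panic(err)
		}
	}
	return sizes
}

func main() {
	fmt.Println(runDemo()) // prints "[1228800 20]"
}
```

A real writer would also need a timer (or a dedicated flushing goroutine) so that an idle batch is flushed once it ages out even if no further `Add` call arrives.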
I also think your rotation will delete the wrong segment when you have more than ten segments - imagine you're writing rebuf-1 to rebuf-10 - what's the "oldest file" to delete now? Besides, should you really delete those files?
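One plausible way a rotation like this picks the wrong file is lexicographic ordering, where "rebuf-10" sorts before "rebuf-2". Whether that is what rebuf actually does is an assumption here, not something confirmed in the thread. A minimal sketch of choosing the oldest segment by its numeric suffix, assuming segments are named `rebuf-N` and a lower N means older (`segmentIndex` and `oldestSegment` are hypothetical helpers):

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// segmentPrefix is the assumed naming scheme, taken from the comment above.
const segmentPrefix = "rebuf-"

// segmentIndex extracts N from "rebuf-N"; ok is false for any other name.
func segmentIndex(name string) (n int, ok bool) {
	if !strings.HasPrefix(name, segmentPrefix) {
		return 0, false
	}
	n, err := strconv.Atoi(strings.TrimPrefix(name, segmentPrefix))
	if err != nil || n < 0 {
		return 0, false
	}
	return n, true
}

// oldestSegment returns the segment with the smallest numeric index,
// ignoring files that do not look like segments.
func oldestSegment(names []string) (string, bool) {
	best, bestIdx, found := "", 0, false
	for _, name := range names {
		idx, ok := segmentIndex(name)
		if !ok {
			continue
		}
		if !found || idx < bestIdx {
			best, bestIdx, found = name, idx, true
		}
	}
	return best, found
}

func main() {
	names := []string{"rebuf-10", "rebuf-2", "rebuf-9", "notes.txt"}
	oldest, ok := oldestSegment(names)
	fmt.Println(oldest, ok) // prints "rebuf-2 true"
}
```

A plain string sort over the same names would put "rebuf-10" first, which is the newest segment in this example.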
1. Use fsync for durable writes in case of system crashes
2. Fix log-rotation-purging logic
3. Fix `file already closed` bug on consecutive writes
4. Add CRC checksum
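Items 1 and 4 of that list could look roughly like this. It is a sketch under assumptions, not rebuf's on-disk format: the 8-byte header layout (little-endian length, then CRC-32 of the payload) and the helper names `encodeRecord`, `decodeRecord`, and `appendDurable` are invented for illustration. Only standard library calls are used.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"hash/crc32"
	"os"
)

// headerSize covers a 4-byte payload length and a 4-byte CRC-32 (assumed layout).
const headerSize = 8

var errCorrupt = errors.New("wal: corrupt record")

// encodeRecord frames payload as [length][crc32][payload].
func encodeRecord(payload []byte) []byte {
	rec := make([]byte, headerSize+len(payload))
	binary.LittleEndian.PutUint32(rec[0:4], uint32(len(payload)))
	binary.LittleEndian.PutUint32(rec[4:8], crc32.ChecksumIEEE(payload))
	copy(rec[headerSize:], payload)
	return rec
}

// decodeRecord verifies the length and checksum of exactly one record.
func decodeRecord(rec []byte) ([]byte, error) {
	if len(rec) < headerSize {
		return nil, errCorrupt
	}
	n := binary.LittleEndian.Uint32(rec[0:4])
	if uint64(n) != uint64(len(rec)-headerSize) {
		return nil, errCorrupt
	}
	payload := rec[headerSize:]
	if crc32.ChecksumIEEE(payload) != binary.LittleEndian.Uint32(rec[4:8]) {
		return nil, errCorrupt
	}
	return payload, nil
}

// appendDurable writes one framed record and calls Sync (fsync) before
// reporting success, so the record should survive a system crash.
func appendDurable(f *os.File, payload []byte) error {
	if _, err := f.Write(encodeRecord(payload)); err != nil {
		return err
	}
	return f.Sync()
}

func run() error {
	f, err := os.CreateTemp("", "wal-*.log")
	if err != nil {
		return err
	}
	defer os.Remove(f.Name())
	defer f.Close()

	if err := appendDurable(f, []byte("hello")); err != nil {
		return err
	}
	data, err := os.ReadFile(f.Name())
	if err != nil {
		return err
	}
	payload, err := decodeRecord(data)
	if err != nil {
		return err
	}
	fmt.Printf("replayed %q\n", payload)
	return nil
}

func main() {
	if err := run(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Syncing after every record is the simplest durable option and also the slowest; combining it with batching amortizes the cost. Newly created segment files generally also need the parent directory synced for the file itself to survive a crash.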
Perhaps it's just me, but I don't trust code that hasn't been tested.
Since you mention etcd/wal:
https://github.com/etcd-io/etcd/blob/v3.3.27/wal/wal.go#L671
https://github.com/etcd-io/etcd/blob/v3.3.27/pkg/fileutil/sy...
Similar to RocksDB.