Ask HN: Theory of Backups
The Tower of Hanoi and Incremental-Differential-Full methods enhance backup strategies, incorporating various backup types and emphasizing the importance of backup rotations, medium suitability, and hash verification for data integrity.
The discussion on backup strategies often centers around consumer-oriented solutions, but there is a deeper theoretical framework that can enhance backup practices. Two key concepts are introduced: the Tower of Hanoi (TOH) scheduling scheme and the Incremental-Differential-Full (IDF) backup method. The IDF-TOH scheme incorporates three types of backups: Incremental backups, which capture changes since the last backup; Differential backups, which record changes since the last full backup; and Full backups, which capture the entire system. The scheduling of these backups can be optimized, although the ideal frequency remains uncertain. Additionally, the IDF-TOH framework does not address the issue of backup rotations, which can lead to data corruption if not managed properly. Different storage mediums may be better suited for various backup types, with pressed CD-ROMs suggested for Full backups due to their longevity, while Solid State Drives may be more appropriate for Incremental backups. The IDF method itself may require further refinement, and incorporating hashing for backup verification is essential. While this theoretical approach may not be universally applicable, individuals using tools like rsync or borg may benefit from adopting a more robust backup strategy to minimize data loss with minimal effort.
- The Tower of Hanoi and Incremental-Differential-Full methods enhance backup strategies.
- IDF-TOH includes Incremental, Differential, and Full backups for comprehensive data protection.
- Backup rotations and medium suitability are critical considerations in backup planning.
- Hash verification is essential for ensuring backup integrity.
- The proposed strategies may appeal to users of existing backup tools seeking improved reliability.
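To make the scheduling idea concrete, here is a minimal sketch of a Tower of Hanoi rotation. The post does not say how backup types map to rotation levels, so the mapping below (lowest level is incremental, middle is differential, top is full) and the names `hanoi_level` and `backup_kind` are illustrative assumptions.

```python
# Tower of Hanoi rotation: the level used on a given day is the number of
# trailing zero bits in the 1-based day number, capped at the highest level.
# ASSUMPTION: three levels mapped to incremental / differential / full.
KINDS = ("incremental", "differential", "full")


def hanoi_level(day: int, levels: int) -> int:
    """Return the rotation level (0-based) for a 1-based day number."""
    if day < 1 or levels < 1:
        raise ValueError("day and levels must be positive")
    trailing_zeros = (day & -day).bit_length() - 1
    return min(trailing_zeros, levels - 1)


def backup_kind(day: int) -> str:
    """Map a day to a backup type under the assumed three-level scheme."""
    return KINDS[hanoi_level(day, len(KINDS))]


if __name__ == "__main__":
    for day in range(1, 9):
        print(day, backup_kind(day))
```

Under these assumptions, level 0 recurs every second day, level 1 every fourth day, and the top level every fourth day offset by two, so the cheapest backup type runs most often.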
Related
Resilient Sync for Local First
The "Local-First" concept emphasizes empowering users with data on their devices, using Resilient Sync for offline and online data exchange. It ensures consistency, security, and efficient synchronization, distinguishing content changes and optimizing processes. The method offers flexibility, conflict-free updates, and compliance documentation, with potential enhancements for data size, compression, and security.
Difference between running Postgres for yourself and for others
The post compares self-managed PostgreSQL with managing it for others, focusing on provisioning, backup/restore, HA, and security. It addresses complexities in provisioning, backup strategies, HA setup, and security measures for external users.
Timeshift: System Restore Tool for Linux
Timeshift is a Linux tool similar to Windows' System Restore and Mac OS' Time Machine. It creates system snapshots for users to revert to previous states. Find more on Timeshift's GitHub page.
Lessons from Ancient File Systems
The article examines the evolution of Atari 8-bit file systems, focusing on Atari DOS versions, their limitations, and the emergence of alternatives like MyDOS and SpartaDOS, emphasizing the need for future-oriented design.
Linux updates with an undo function? Some distros have that
Some Linux distributions are adding "undo" functions for updates, utilizing snapshot capabilities in file systems like Btrfs and ZFS, but challenges in implementation and licensing persist.
The other, routine backups are usually managed by someone else; most of the time he just handles the hardware.
His backups are tested by experience.
I personally use the following backup strategy:
- Set up encrypted ZFS storage on the network (e.g. TrueNAS - in my case it is Proxmox)
- Enable zfs-auto-snapshot for 15-minute snapshots with automatic rotation (keep 24 daily, etc.)
- NEVER (!) type the passwords of users permitted on the ZFS storage into any client that could be affected by ransomware
- Provide a user-authenticated Samba share to store all important data - try to prevent local storage of data
- Sync the ZFS snapshots to an external USB drive every night (I use a Tasmota Shelly plug and an external USB case to power off the devices when they are not needed)
# create current snapshot
zfs snapshot -r "$NEW_POOL_SNAP"
# first backup
zfs send --raw -R "$SRC_POOL@$NEW_SNAP_NAME" | pv | zfs recv -Fdu "$DST_POOL"
# incremental backup
zfs send --raw -RI "$BACKUP_FROM_SNAPSHOT" "$BACKUP_UNTIL_SNAPSHOT" | pv | zfs recv -Fdu "$DST_POOL"
- On Windows and macOS, back up the OS to an external drive
- Use restic to keep an additional copy of the local files and folders somewhere else
- Use a Blu-ray burner to back up the most important stuff as a restic repository or encrypted archive (very important documents, the best photo collections of your family, the KeePass database, etc.) and store it at another location
- If cloud storage is affordable for the amount of data you have, consider using restic to store your stuff in the cloud
- From time to time, try to restore a specific file from the backup and check that it worked, and try to restore a full system (onto an additional hard disk).
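The restore check in the last step can be automated by comparing hashes. This is a minimal sketch, assuming you still have the original file available alongside the restored copy; the function names `sha256_of` and `restore_matches` are illustrative and not part of restic or ZFS.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_matches(original_path, restored_path):
    """True if the restored file is byte-identical to the original."""
    return sha256_of(original_path) == sha256_of(restored_path)
```

Storing the digests next to the backup would also allow checking a restore after the original is gone, which is the case that matters most.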
This may sound like overkill, but ransomware is a pretty bad thing these days, even if you think you are not one of its targets.
Regarding the backup schedule - companies sometimes need frequent backups because of their RPOs and RTOs, for example if they operate in a highly regulated industry. If someone can tolerate losing two hours of data, they need a backup at least every 2 hours; if the tolerance is 8 hours (a working day), why not back up daily?
Regarding rotations - everything depends on the backup solution. If it provides immutable backups, the data as a whole won't be corrupted, and the sooner someone notices a mistake, the sooner they can restore a copy. IDF mostly helps with the storage problem, i.e. not overloading it (deduplication and compression are also worth mentioning here).
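The arithmetic behind that rule of thumb can be sketched as follows. The function name `backups_per_day` and the `active_hours` parameter are assumptions for illustration; `active_hours` models the comment's reading that data only changes during working hours.

```python
import math


def backups_per_day(rpo_hours: float, active_hours: float = 24.0) -> int:
    """Minimum backups per day so that no more than rpo_hours of work is lost.

    ASSUMPTION: data only changes during active_hours each day.
    """
    if rpo_hours <= 0 or active_hours <= 0:
        raise ValueError("rpo_hours and active_hours must be positive")
    return math.ceil(active_hours / rpo_hours)
```

For a system that changes around the clock, a 2-hour RPO gives 12 backups per day; for an 8-hour working day with an 8-hour RPO, it gives one, which matches the daily backup suggested above.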
1. How long should you keep backups for - is the content of your backup covered by privacy laws that require you not to keep copies of it after a certain period of time? Is there a point where the content of your backup is so old that it's the logical equivalent of not having made a backup in the first place?
2. How much does your backup process cost - if it costs more to back up a system than it would cost you if you lost it, then you've got the backup process wrong (interestingly this can be affected by economies of scale)
3. What do you need to restore a backup - does your system require bespoke hardware that might have been lost in whatever disaster you're trying to recover from?
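The retention question in point 1 translates into a pruning rule. A minimal sketch, assuming backups are identified by their creation date and that a single maximum age applies; `expired_backups` and its parameters are hypothetical names, not taken from any backup tool.

```python
from datetime import date, timedelta


def expired_backups(backup_dates, today, max_age_days):
    """Return the backup dates older than max_age_days, oldest first."""
    cutoff = today - timedelta(days=max_age_days)
    return sorted(d for d in backup_dates if d < cutoff)


if __name__ == "__main__":
    dates = [date(2024, 6, 15), date(2024, 5, 1), date(2024, 5, 31)]
    print(expired_backups(dates, today=date(2024, 6, 30), max_age_days=30))
```

A backup made exactly on the cutoff date is kept here; whether a legal retention limit is inclusive or exclusive is something to confirm for the specific regulation.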
…but I never delete, because the more copies of the same thing there are, the more likely it is to survive. If I do in fact need it, the time spent searching is far shorter than a tedious backup procedure.
In addition, if I have to recreate something, version 2 will be better because I keep getting better at the things I do.
But that is me, not you. Good luck.