ZFS is a really nice filesystem and I use it wherever I can. I could write essays about ZFS and how awesome it is, and that might just happen some time in the future, but for now I’ll just show you how snapshots are managed on my machines.
Why snapshots?
There are 10 types of people:
- Those who do backups
- And those who will
I’m not sure about you, but I’ve definitely deleted things I shouldn’t have, only to realize a few minutes (or 3 days, or a month) later that I need the data that’s now gone. If only I had a point in time to which I could return and salvage my data…
To be clear, I don’t think having snapshots is enough. You should definitely have backups on some other machine, external drive array or something similar, preferably at an offsite location. This can be accomplished in an automated fashion with the ZFS send/receive functionality.
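To make the send/receive idea a bit more concrete, here is a minimal sketch of shipping a snapshot to another machine over SSH. The function names, the host backuphost and the dataset backup/docs are hypothetical placeholders, and a real setup would also need SSH keys and ZFS permissions on the receiving side.

```shell
# Sketch only: the function names, host and dataset names are illustrative.

# First run: send one full snapshot to another machine.
# Assumes the destination dataset does not exist there yet.
replicate_full() {
    src_snapshot="$1"   # e.g. storage/home/ivan/Documents@monday
    remote_host="$2"    # e.g. backuphost (hypothetical)
    dest_dataset="$3"   # e.g. backup/docs (hypothetical)
    zfs send "$src_snapshot" | ssh "$remote_host" zfs receive "$dest_dataset"
}

# Later runs: send only the changes between two snapshots.
# Assumes the destination already holds the older snapshot.
replicate_incremental() {
    older="$1"
    newer="$2"
    remote_host="$3"
    dest_dataset="$4"
    zfs send -i "$older" "$newer" | ssh "$remote_host" zfs receive "$dest_dataset"
}
```

Something like `replicate_full storage/home/ivan/Documents@monday backuphost backup/docs` would do the initial copy, and the incremental variant keeps it up to date afterwards.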
How to do them?
To take a snapshot you just issue:
zfs snapshot storage/your/dataset@name_of_the_snapshot
This takes a snapshot of the filesystem as it was at the moment you executed the command.
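A snapshot is only useful if you can get things back out of it, so here is a small sketch of listing snapshots and pulling a single file out of one. The helper names are made up for illustration, and the restore helper assumes you pass the dataset's mountpoint and that the file lives directly under it.

```shell
# Sketch only: helper names are illustrative, not part of ZFS.

# List the snapshots of a dataset (and its children), oldest first.
list_snapshots() {
    zfs list -t snapshot -o name,creation -s creation -r "$1"
}

# Copy one file back out of a snapshot through the hidden .zfs directory.
# $1 = mountpoint of the dataset
# $2 = snapshot name (the part after the @)
# $3 = file path relative to the mountpoint
restore_file() {
    cp -a "$1/.zfs/snapshot/$2/$3" "$1/$3"
}
```

For reverting a whole dataset rather than one file there is also zfs rollback, but it discards everything written after the snapshot, so copying files back is usually the gentler option.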
Well, you can imagine that managing snapshots, and remembering to take them in the first place, can really be a burden and might get neglected. If backups aren’t automated, they likely won’t be done.
There are many utilities that help with managing backups, and I’ve used some of them in the past. I liked ZFSnap very much but decided to go with Sanoid since it’s nicer on Linux, and by defining policies and templates in a human-readable format (shown below), I don’t have to mess with cronjobs or think about them much. Just one config file to manage.
Installing Sanoid
Since I’m on Fedora I had to follow the CentOS installation instructions. Instead of yum use dnf and you’re good.
Once it’s installed, just define your policies and templates and you’re done. Easy as that!
There are a few notes I should mention regarding configuration though, and these are the ones I try to keep in mind while writing it:
- Policy name is the full dataset name used as a “source”
- Defining a setting at the policy level overrides just that particular setting; the other settings from the template still apply
- Template name is the part you write after the template_ prefix in its definition
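The override rule is easiest to see in a tiny example. The sketch below writes a sample policy to a scratch file (not the live config) just to show the shape. The dataset name is hypothetical, the daily value is picked for illustration, and the template is the m3d30h72f8 one from the config in this post.

```shell
# Sketch only: writes to a temporary file, never to the real Sanoid config.
example_conf="$(mktemp)"
cat > "$example_conf" <<'EOF'
# Hypothetical dataset name, used only for illustration.
[storage/home/ivan/Projects]
use_template = m3d30h72f8
# Overrides only the daily count. frequently, hourly, monthly, yearly,
# autosnap and autoprune still come from template_m3d30h72f8.
daily = 60
EOF
cat "$example_conf"
```

With that policy the dataset would keep 60 daily snapshots instead of the template’s 30, while everything else stays as the template defines it.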
Finally, here’s an example of my config:
################################################################################
# Policies
################################################################################
[storage/home/ivan/Downloads]
use_template = m1d3h4f3
recursive = yes
[storage/home/ivan/Nextcloud]
use_template = m3d30h72f8
[storage/home/ivan/Area52]
use_template = m3d30h72f8
[storage/home/ivan/Documents]
use_template = m3d30h72f8
################################################################################
# Templates
################################################################################
[template_m1d3h4f3]
frequently = 3
hourly = 4
daily = 3
monthly = 1
yearly = 0
autosnap = yes
autoprune = yes
[template_m3d30h72f8]
frequently = 8
hourly = 72
daily = 30
monthly = 3
yearly = 0
autosnap = yes
autoprune = yes
As you can see, my template naming scheme is pretty imaginative: the name just spells out the monthly, daily, hourly and frequent retention counts.
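To check that a config like this actually does something, a manual run is a reasonable first test. This is a rough sketch: the helper name is made up, the --take-snapshots and --verbose flags are Sanoid options as far as I know, and it assumes Sanoid reads its config from wherever the installation put it (commonly /etc/sanoid/sanoid.conf).

```shell
# Sketch only: helper name is illustrative; run as a user allowed to snapshot.
sanoid_first_run() {
    dataset="$1"   # e.g. storage/home/ivan/Documents
    # Take whatever snapshots the policies say are due right now.
    sanoid --take-snapshots --verbose
    # Then list what exists for the dataset, oldest first.
    zfs list -t snapshot -o name,creation -s creation -r "$dataset"
}
```

After `sanoid_first_run storage/home/ivan/Documents`, the list should include freshly created snapshots whose names start with autosnap_.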