Linux: incremental backup using rsync on btrfs with snapshots
My production data is backed up by Bacula. This works well for machines that are online all the time, but it is not so convenient for my notebook and media PC. Until now I was using a simple rsync script to "copy" everything important to a USB drive or an iSCSI LUN. It worked fine but had one catch: there was no way to retrieve older versions of files.
Because I also want to be able to retrieve older versions of files, I combined my rsync backup with btrfs and its snapshot feature. Here is the result:
Before first backup
I am using an iSCSI LUN as the backup target. Skip the iSCSI part if you are using a USB drive.
iSCSI configuration
It is important to password protect the target, otherwise any device on the local network can overwrite it. I also highly recommend enabling header and data digests; without them there is a possibility of data corruption.
The configuration of the iSCSI initiator is stored in /etc/iscsi/iscsid.conf. The following lines are mandatory:
node.session.auth.authmethod = CHAP
node.session.auth.username = <user>
node.session.auth.password = <pass>
node.conn[0].iscsi.HeaderDigest = CRC32C
node.conn[0].iscsi.DataDigest = CRC32C
All other options can be left at their default values.
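One optional tweak (my suggestion, not part of the original setup): since the backup is driven by a script that logs in and out of the target itself, you may want to make sure the initiator does not log in automatically during boot. This is controlled by the node.startup option; check the comments in the iscsid.conf shipped by your distribution, as defaults differ:

```
# /etc/iscsi/iscsid.conf -- let the backup script control the session
# instead of logging in automatically at boot
node.startup = manual
```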
iSCSI discovery and login
It is necessary to discover the iSCSI target and log in to it before the system can access the drive. This can be done with the following commands (replace "narnia-nas" and the target name with the names used by your storage):
# iscsiadm -m discovery -t st -p narnia-nas
10.x.x.x:3260,1 iqn.2000-01.com.synology:narnia-nas.lukas
# iscsiadm -m node -T iqn.2000-01.com.synology:narnia-nas.lukas --login
Logging in to [iface: default, target: iqn.2000-01.com.synology:narnia-nas.lukas, portal: 10.x.x.x,3260] (multiple)
Login to [iface: default, target: iqn.2000-01.com.synology:narnia-nas.lukas, portal: 10.x.x.x,3260] successful.
A new disk should appear once the kernel detects it (snippet from the dmesg command):
scsi host4: iSCSI Initiator over TCP/IP
scsi 4:0:0:0: Direct-Access SYNOLOGY iSCSI Storage 4.0 PQ: 0 ANSI: 5
sd 4:0:0:0: Attached scsi generic sg2 type 0
sd 4:0:0:0: [sdb] 209715200 512-byte logical blocks: (107 GB/100 GiB)
sd 4:0:0:0: [sdb] Write Protect is off
sd 4:0:0:0: [sdb] Mode Sense: 43 00 10 08
sd 4:0:0:0: [sdb] Write cache: enabled, read cache: enabled, supports DPO and FUA
sd 4:0:0:0: [sdb] Attached SCSI disk
# lsscsi | grep SYN
[4:0:0:0] disk SYNOLOGY iSCSI Storage 4.0 /dev/sdb
File system
I prefer disk labels because device files may change over time. Make sure you select the correct device:
# mkfs.btrfs -L lukas-backup /dev/sdb
btrfs-progs v4.0
See http://btrfs.wiki.kernel.org for more information.

Performing full device TRIM (100.00GiB) ...
Turning ON incompat feature 'extref': increased hardlink limit per file to 65536
Turning ON incompat feature 'skinny-metadata': reduced-size metadata extent refs
fs created label lukas-backup on /dev/sdb
	nodesize 16384 leafsize 16384 sectorsize 4096 size 100.00GiB
Note: TRIM on an iSCSI LUN is a feature of DSM 6.0 running on selected Synology NAS models. It may also be available on NAS devices from other vendors.
FS first time mounted
The file system will be mounted by the script, so it is not necessary for the system itself to mount it during boot. The entry in /etc/fstab should look like this:
# backup LABEL=lukas-backup /mnt/backup btrfs noauto 0 0
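A small optional addition of mine (not required for the setup to work): the noatime mount option avoids access-time metadata updates, which are pointless churn on a file system that only holds backups:

```
# backup
LABEL=lukas-backup /mnt/backup btrfs noauto,noatime 0 0
```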
Btrfs can be managed only while it is mounted, so it is necessary to mount it first.
mkdir -p /mnt/backup
chmod 0 /mnt/backup
mount /mnt/backup
Subvolume & quota
Subvolume and snapshot sizes are internally tracked as quotas. To see which snapshot or subvolume is taking up so much space, quota must be enabled at the file system level:
btrfs quota enable /mnt/backup
Note: enable quota before placing data on the file system, otherwise it will take some time to collect (scan) the usage data.
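If the file system already holds data, the numbers can be built by an explicit rescan. A minimal sketch of mine (the helper name is hypothetical; it assumes the backup fs is mounted at /mnt/backup and skips itself otherwise):

```shell
# Hypothetical helper: enable quota, and rescan if data is already present.
# Assumption: mount point is /mnt/backup, adjust to your setup.
enable_quota() {
    local mnt="$1"
    if ! grep -qs " $mnt " /proc/mounts; then
        echo "SKIP: $mnt is not mounted"
        return 0
    fi
    btrfs quota enable "$mnt" || return 1
    # -w waits until the rescan is finished, so qgroup numbers are accurate
    btrfs quota rescan -w "$mnt" || return 1
}

enable_quota /mnt/backup
```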
I prefer to use a separate subvolume for each backup. With this configuration it is possible to use one file system as the target for several different backups:
# btrfs sub create /mnt/backup/lukas
Create subvolume '/mnt/backup/lukas'
Umount & logoff
My script handles iSCSI and the mounting of the file system. The file system has to be unmounted and the session to the NAS disconnected so that the script gets a "clean" start.
# umount -v /mnt/backup/
umount: /mnt/backup/ unmounted
# iscsiadm -m node -T iqn.2000-01.com.synology:narnia-nas.lukas --logout
Logging out of session [sid: 2, target: iqn.2000-01.com.synology:narnia-nas.lukas, portal: 10.x.x.x,3260]
Logout of [sid: 2, target: iqn.2000-01.com.synology:narnia-nas.lukas, portal: 10.x.x.x,3260] successful.
Backup script
This script doesn't try to be smart. It simply executes commands and ends immediately if a failure occurs. If a failure occurs while executing rsync, the snapshot is marked as "incomplete".
Note: don't use this script directly. It was created to fulfill my needs and it may need to be modified before you can use it in your environment.
#!/bin/bash

echo
echo "To verify using \"--checksum\" execute:"
echo "${0} verify"
echo
sleep 5

if [ "${1}" == "verify" ]
then
    echo "... verify mode selected"
    opt="--checksum"
else
    echo "... normal mode selected"
    opt=""
fi
sleep 5

echo
echo
echo "Discovering narnia-nas ..."
iscsiadm -m discovery -t st -p narnia-nas || exit 1

echo
echo "Connecting to narnia-nas ..."
iscsiadm -m node -T iqn.2000-01.com.synology:narnia-nas.lukas --login || exit 2

echo
echo "Sleeping ... kernel need some time to detect new devices"
sleep 10

echo
echo "Mounting backup fs ..."
mount -v /mnt/backup || exit 3

echo
echo "Copying data ..."

# output of following commands is saved along with backup
(
  echo; echo "lsusb:"  ; lsusb;
  echo; echo "lsscsi:" ; lsscsi;
  echo; echo "lshw:"   ; lshw -short;
  echo; echo "date:"   ; date;
  echo; echo "dpkg -l" ; dpkg -l;
  echo; echo "# EOF"   ;
) > /_lastbackup_iscsi
echo

# copy data to backup location
rsync ${opt} --archive --delete --delete-excluded --human-readable --stats --progress \
  --exclude=/cdrom --exclude=/dev --exclude=/media --exclude=/mnt --exclude=/proc --exclude=/run \
  --exclude=/sys --exclude=/tmp \
  --exclude=/btrfs \
  --exclude=/root/hekate-certificates \
  --exclude=/home/lukas/.cache/google-chrome \
  --exclude=/home/lukas/.cache/mozilla/firefox \
  --exclude=/home/lukas/.cache/thumbnails \
  --exclude=/data/VMs/no-backup \
  --exclude=/data/swap \
  / /mnt/backup/lukas/
RC=$?
echo
echo "Done with rc: ${RC}"

echo
echo "Flushing file system buffers ..."
time sync
btrfs filesystem sync /mnt/backup
time sync

echo
echo "Creating snapshot of backup ..."
if [ "${RC}" -eq 0 ]
then
    btrfs sub snap -r /mnt/backup/lukas "/mnt/backup/lukas_$(LANG=C date +%Y-%m-%d_%s)" || exit 4
else
    btrfs sub snap -r /mnt/backup/lukas "/mnt/backup/lukas_$(LANG=C date +%Y-%m-%d_%s)_incomplete_${RC}" || exit 5
fi

echo
echo "Hit enter to continue ... "
read

echo "Umounting backup filesystem ..."
umount -v /mnt/backup || exit 6

echo
echo "Disconecting from narnia-nas ..."
iscsiadm -m node -T iqn.2000-01.com.synology:narnia-nas.lukas --logout || exit 7

echo
echo "Done :o)"
echo
# EOF
md5sum: e075be22b429a2be4b3dbf2fbb467ab9
Useful btrfs commands
"filesystem df"
BTRFS uses a different data layout than file systems like ext{2,3,4}. Data and metadata are organized into pools, and the pools are allocated from the raw device. There are several different pools and each one has its own utilization, so it is not easy to provide meaningful numbers for the df command. If you have run out of space or are unsure what is causing the disk utilization, review the output of the following command:
# btrfs filesystem df /mnt/backup/
Data, single: total=79.01GiB, used=78.66GiB

System, DUP: total=8.00MiB, used=16.00KiB
System, single: total=4.00MiB, used=0.00B

Metadata, DUP: total=1.00GiB, used=440.55MiB
Metadata, single: total=8.00MiB, used=0.00B

GlobalReserve, single: total=160.00MiB, used=0.00B
Note 1: I put empty lines into the output to visually separate the pools.
Note 2: Data -> user data, System -> structures related to the "super block", Metadata -> metadata :o) , GlobalReserve -> FS reserve to prevent deadlocks.
How to identify space consumers
Files and blocks of files can be shared between volumes (and snapshots). Standard Linux tools don't understand the underlying data layout, so it is not easy to find space consumers with them. To get accurate numbers it is necessary to use the native btrfs commands. Example of usage:
- The first step is to get the IDs of the subvolumes:
# btrfs sub list /mnt/backup/
ID 257 gen 44 top level 5 path lukas
ID 337 gen 30 top level 5 path lukas_2016-09-11_1473624103
ID 345 gen 35 top level 5 path lukas_2016-09-11_1473624272
ID 349 gen 39 top level 5 path lukas_2016-09-11_1473624385
ID 350 gen 42 top level 5 path lukas_2016-09-11_1473624546
- The second step is to list the quota groups:
# btrfs qgroup show /mnt/backup/
qgroupid   rfer       excl
--------   ----       ----
0/5        16.00KiB   16.00KiB
0/257      78.41GiB   42.92MiB
0/337      78.84GiB   7.37GiB
0/345      78.42GiB   48.53MiB
0/349      78.42GiB   46.27MiB
0/350      78.43GiB   37.96MiB
The first column is the subvolume ID (a snapshot is also a subvolume). The second one is its size (meaning all data accessible via the subvolume). The third one shows the data stored exclusively in this group; this is the space that can be recovered by removing the subvolume / snapshot.
As an example: by removing "0/337" I can get back 7.37GiB of space.
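When there are many snapshots, it is handy to sort them by exclusive usage so the biggest space consumer is on top. A small sketch of mine (the helper name is hypothetical; it assumes the KiB/MiB/GiB suffixes printed by btrfs-progs, as in the output above):

```shell
# Hypothetical helper: sort "btrfs qgroup show" output by the "excl" column,
# largest first. The awk converts the human-readable sizes to bytes so that
# mixed units (MiB vs GiB) sort correctly.
biggest_excl() {
    awk '
        $1 ~ /^0\// {
            size = $3 + 0                        # numeric part of "excl"
            if ($3 ~ /KiB/) size *= 1024
            if ($3 ~ /MiB/) size *= 1024 * 1024
            if ($3 ~ /GiB/) size *= 1024 * 1024 * 1024
            printf "%015.0f %s %s\n", size, $1, $3
        }
    ' | sort -r | cut -d' ' -f2-
}

# usage: btrfs qgroup show /mnt/backup/ | biggest_excl
```

With the numbers above, 0/337 (7.37GiB exclusive) would be printed first.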
Subvolume (snapshot) remove
Snapshots and subvolumes are removed in the same way:
# btrfs sub delete /mnt/backup/lukas_2016-09-11_1473624103/
Delete subvolume (no-commit): '/mnt/backup/lukas_2016-09-11_1473624103'
It may take some time until BTRFS releases all the data used by the snapshot or subvolume.
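Old snapshots have to be removed by hand in this setup. A hypothetical retention sketch of mine: it only prints the snapshot names that fall outside the newest KEEP copies, relying on the fact that the backup script embeds the date and a unix timestamp in the name, so a plain sort is chronological. Review the output before feeding it to "btrfs sub delete".

```shell
# Hypothetical retention helper: read snapshot names on stdin and print the
# ones to delete, keeping the $KEEP newest. Names like
# "lukas_2016-09-11_1473624103" sort chronologically.
KEEP=3
list_expired() {
    sort | head -n "-$KEEP"
}

# usage (destructive, check the output first):
#   ls -d /mnt/backup/lukas_* | list_expired | xargs -r -n1 btrfs sub delete
```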
File system extend
I am using an iSCSI LUN, so I can extend it. Here is how to also extend the BTRFS on top of it:
# btrfs filesystem resize max /mnt/backup/
Resize '/mnt/backup/' of 'max'
This command extends all devices attached to the btrfs file system to their maximum hardware size (for example after a LUN resize).
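One detail worth noting: after growing the LUN on the NAS side, the kernel still sees the old size until the iSCSI session is rescanned (or reconnected). A sketch of the full sequence, assuming the target name and mount point used above (the helper name is my own; it skips itself when the fs is not mounted):

```shell
# Hypothetical helper: rescan the iSCSI session so the kernel picks up the
# new LUN size, then grow btrfs to fill the device.
extend_backup_fs() {
    local mnt="$1"
    if ! grep -qs " $mnt " /proc/mounts; then
        echo "SKIP: $mnt is not mounted"
        return 0
    fi
    # -R rescans the session(s) of this node
    iscsiadm -m node -T iqn.2000-01.com.synology:narnia-nas.lukas -R || return 1
    btrfs filesystem resize max "$mnt" || return 1
}

extend_backup_fs /mnt/backup
```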