BTRFS / ZFS

BTRFS

Create RAID1 (with self healing features):
mkfs.btrfs -m raid1 -d raid1 /dev/sdXX /dev/sdYY --label <name>
  
Mount one disk and it mounts the whole RAID (do not mount more than one):
mount /dev/sdXX /mnt/<name>
  
Add a disk (a separate mkfs.btrfs run is not needed; -f overwrites any file system already on the new disk):
btrfs device add -f /dev/sdZZ /mnt/<name>
btrfs balance start -dconvert=raid1 -mconvert=raid1 /mnt/<name>
  
Check balancing status:
btrfs balance status -v /mnt/<name>
  
Check file system status:
btrfs fi show
  
To mount automatically at boot, run btrfs fi show and copy the UUID into /etc/fstab (the third field is the file system type):
UUID=<uuid> /mnt/<name> btrfs defaults 0 0
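
As a small sketch of the step above, the fstab line can be generated from a UUID and a mount point. The function name fstab_line, the sample UUID and /mnt/data are made up for illustration, and the defaults options are simply the ones used in this note.

```shell
#!/bin/sh
# Print an /etc/fstab line for a BTRFS file system.
# $1 = file system UUID, $2 = mount point (both supplied by the caller)
fstab_line() {
    printf 'UUID=%s %s btrfs defaults 0 0\n' "$1" "$2"
}

# Example with a made-up UUID. On a real system the UUID could be read with:
#   blkid -s UUID -o value /dev/sdXX
fstab_line "01234567-89ab-cdef-0123-456789abcdef" "/mnt/data"
# prints: UUID=01234567-89ab-cdef-0123-456789abcdef /mnt/data btrfs defaults 0 0
```

Review the printed line before appending it to /etc/fstab, since a wrong entry can stop the system from booting cleanly.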

Run scrub to check and repair potential errors:
btrfs scrub start /mnt/<name>

Check scrub status, cancel and resume:
btrfs scrub status /mnt/<name> (-d to show per disk)
btrfs scrub cancel /mnt/<name>
btrfs scrub resume /mnt/<name>
  
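If scrub does not fix the errors, it may be possible to fall back to an older, working B-tree root (on kernel 4.6 and later the option is named usebackuproot; recovery is the older, deprecated name):
mount -o recovery /dev/sdXX /mnt/<name>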

Mounting a degraded BTRFS RAID1 with only one disk present fails with a generic error: mount: wrong fs type, bad option, bad superblock on /dev/sdXX.

But this can be overridden and the RAID1 can run with only one disk:
mount -t btrfs -o degraded /dev/sdXX /mnt/<name>

df reports misleading free space for BTRFS.

Check the real allocation with btrfs fi df /mnt/<name> (or btrfs fi usage /mnt/<name>):
- High Metadata (DUP) total but not much free: balance the metadata by running:
btrfs balance start -m /mnt/<name>
- Out of metadata space (used close to total): rebalance data block groups that are at most 5% used, which returns their space to the unallocated pool:
btrfs balance start -dusage=5 /mnt/<name>
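
To judge whether metadata is close to full, the used/total ratio can be computed from the raw-byte output of btrfs filesystem df -b. The following is only a sketch: the function name metadata_used_percent and the sample numbers are made up, and it assumes the usual "Metadata, <profile>: total=N, used=M" line format.

```shell
#!/bin/sh
# Read `btrfs filesystem df -b <path>` output on stdin and print how many
# percent of the allocated metadata space is used (one line per Metadata row).
metadata_used_percent() {
    # Split on "=" and ",": field 3 is total bytes, field 5 is used bytes.
    awk -F'[=,]' '/^Metadata/ { printf "%d\n", ($5 * 100) / $3 }'
}

# Example on made-up sample output:
printf '%s\n' \
    'Data, RAID1: total=1073741824, used=524288' \
    'Metadata, RAID1: total=1000, used=950' |
    metadata_used_percent
# prints: 95
```

On a real system this would be used as btrfs filesystem df -b /mnt/<name> | metadata_used_percent. A value close to 100 corresponds to the "used close to total" case above.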

Beware of btrfs check --repair; it can make things worse, so ask the developers first.

http://www.beginninglinux.com/btrfs
https://www.thegeekdiary.com/how-to-use-btrfs-scrub-command-to-manage-scrubbing-on-btrfs-file-systems/
https://askubuntu.com/questions/464074/ubuntu-thinks-btrfs-disk-is-full-but-its-not
    

ZFS (OpenZFS)


Create a RAID1 (mirror) pool with self-healing features and 4096-byte sectors (ashift=12):
zpool create -f -o ashift=12 -m /mnt/<name> <pool-name> mirror /dev/disk/by-id/<disk or partition 1> /dev/disk/by-id/<disk or partition 2>

Note 1, <pool-name> can only be changed afterwards by exporting the pool and importing it under a new name (zpool export <pool-name>, then zpool import <pool-name> <new-name>); the mount point can be changed at any time.
Note 2, use by-id names rather than /dev/sdx(X), because the latter can change between boots.
Note 3, without ashift=12, ZFS uses the sector size the disk reports, which is often 512 bytes.
Note 4, without -m, the pool is mounted at /<pool-name>.
Note 5, creating the RAID1 seems to require at least 2 disks or partitions (-f does not help), but one of them can be offlined and removed afterwards, although the RAID1 then runs in a degraded state.

Check for errors:
zpool scrub <pool-name>

Remove pool:
zpool destroy <pool-name>

Mount and unmount:
zfs mount <pool-name>
zfs unmount <pool-name>

Check mount point:
zfs get mountpoint <pool-name>

Change mount point:
zfs set mountpoint=/mnt/<name> <pool-name>

Check pool status:
zpool status <pool-name>
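
For scripting, the pool state can be picked out of the status output. This is a minimal sketch that assumes the usual "state: ONLINE" / "state: DEGRADED" line in zpool status; the function name pool_state and the pool name tank are made up for illustration.

```shell
#!/bin/sh
# Read `zpool status <pool-name>` output on stdin and print the pool state,
# for example ONLINE or DEGRADED.
pool_state() {
    awk '$1 == "state:" { print $2 }'
}

# Example on made-up sample output:
printf '%s\n' \
    '  pool: tank' \
    ' state: DEGRADED' \
    'status: One or more devices has been taken offline by the administrator.' |
    pool_state
# prints: DEGRADED
```

On a real system this would be used as zpool status <pool-name> | pool_state. For a quick manual check, zpool status -x reports only pools that have problems.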

Offline a disk or partition (to disconnect disk or partition):
zpool offline <pool-name> <disk or partition name from zpool status <pool-name>>

Online a disk or partition (to re-add a disk or partition after it has been disconnected and reconnected):
zpool online <pool-name> <disk or partition name from zpool status <pool-name>>

Bringing a disk or partition online triggers a resilvering process, followed by a mail when it has completed. This goes quite fast if there are few differences.


ZFS - memory usage


There are many recommendations to have enormous quantities of RAM when using ZFS, but there is seldom any explanation of why, or of what in ZFS consumes it. I found a resource explaining it: https://www.zfsbuild.com/2010/04/15/explanation-of-arc-and-l2arc/.

It turns out that ZFS caches recently and frequently used data in RAM; this is the ARC (Adaptive Replacement Cache). According to that article, it can by default use all RAM except 1 GB, although the default limit varies between platforms and ZFS versions. It is also possible to have a second-level cache on an SSD; this is the L2ARC.

The ARC shrinks if other applications need the memory. This can however be troublesome if an application needs the memory immediately in order to start. There are ways to limit the caches.
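
One common way to limit the ARC on Linux is the zfs_arc_max module parameter, which takes a size in bytes. The sketch below only prints the configuration line; the function name arc_limit_line and the 2 GiB figure are made up for illustration.

```shell
#!/bin/sh
# Print a modprobe options line that caps the ARC at the given number of GiB.
# $1 = ARC limit in GiB (whole number)
arc_limit_line() {
    printf 'options zfs zfs_arc_max=%s\n' "$(( $1 * 1024 * 1024 * 1024 ))"
}

# Example: a 2 GiB cap
arc_limit_line 2
# prints: options zfs zfs_arc_max=2147483648
```

The printed line would typically go into a file such as /etc/modprobe.d/zfs.conf and takes effect the next time the zfs module is loaded. On a running system the same byte value can be written to /sys/module/zfs/parameters/zfs_arc_max as root.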

An idle but online ZFS partition does not seem to eat much at all, just a few hundred MB.
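
The current ARC size can be checked rather than guessed. On Linux, OpenZFS exposes it in /proc/spl/kstat/zfs/arcstats as a "size" row with the byte count in the third column. This is a rough sketch; the function name arc_size_mib and the sample numbers are made up.

```shell
#!/bin/sh
# Read arcstats on stdin and print the current ARC size in whole MiB.
arc_size_mib() {
    awk '$1 == "size" { printf "%d\n", $3 / 1048576 }'
}

# Example on made-up sample rows (name, type, data):
printf '%s\n' \
    'hits                            4    123456' \
    'size                            4    524288000' |
    arc_size_mib
# prints: 500
```

On a real system this would be used as arc_size_mib < /proc/spl/kstat/zfs/arcstats.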


BTRFS / ZFS - Test self healing features


I tested the self healing features in both BTRFS and ZFS by overwriting one of the disks with random data and then asking the file system to scrub itself. Both had 2 disks and ran in RAID1 configuration.

To test, create a RAID1 setup with 2 small disks (I used 512 MB).

Mount the storage, BTRFS: mount /dev/sdXX /mnt/somewhere, ZFS: zfs mount <pool-name>.

Go to the root folder - cd /mnt/somewhere.

Create test data, for example with dd if=/dev/urandom of=<file> bs=1M count=<n>.

Create an MD5 checksum file for the contents: md5deep -r -e -l -of * > /somewhere/files.md5

Corrupt one of the underlying disks: dd if=/dev/urandom of=/dev/sdXX bs=1024 seek=15000 count=15000. This overwrites about 15 MB of the device, starting about 15 MB in.

Make sure to point of= at the correct device, since this destroys data.

Run the scrub process, BTRFS: btrfs scrub start /mnt/<name>, ZFS: zpool scrub <pool-name>.

Check the statuses, BTRFS: btrfs scrub status /mnt/<name>, ZFS: zpool status <pool-name>.
This is where the magic appears - both file systems healed themselves.

Verify the MD5 sum: md5deep -m /somewhere/files.md5 *
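
The checksum part of the test can be tried on ordinary files before touching any disks. The sketch below swaps md5deep for md5sum from coreutils, which is more widely installed; the function names make_manifest and verify_manifest and the file names are made up for illustration.

```shell
#!/bin/sh
# Write MD5 sums for every file under directory $1 into manifest file $2.
# Keep the manifest outside the directory so it does not checksum itself.
make_manifest() {
    ( cd "$1" && find . -type f -exec md5sum {} + ) > "$2"
}

# Return success only if every file under $1 still matches manifest $2.
verify_manifest() {
    ( cd "$1" && md5sum -c ) < "$2" > /dev/null 2>&1
}

# Demo on temporary files
dir=$(mktemp -d)
manifest=$(mktemp)
dd if=/dev/urandom of="$dir/a.bin" bs=1024 count=64 2>/dev/null
dd if=/dev/urandom of="$dir/b.bin" bs=1024 count=64 2>/dev/null
make_manifest "$dir" "$manifest"
verify_manifest "$dir" "$manifest" && echo "all files match"

# Simulate damage that was not repaired
echo "corrupted" >> "$dir/a.bin"
verify_manifest "$dir" "$manifest" || echo "mismatch detected"
rm -rf "$dir" "$manifest"
```

In the real test, the manifest is created before the disk is overwritten and verified after the scrub, so a clean verification means the file system returned the original data.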


ZFS scrub schedule

One common pitfall for users running ZFS is forgetting to set up a scrub schedule, that is, a periodic check of the mirror that finds and corrects errors.

It turns out Debian already has one scheduled when installing the standard ZFS utilities, located at:
/etc/cron.d/zfsutils-linux

It reads:
# Scrub the second Sunday of every month
24 0 8-14 * * root [ $(date +\%w) -eq 0 ] && [ -x /usr/lib/zfs-linux/scrub ] && /usr/lib/zfs-linux/scrub

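So it runs at 00:24 on the second Sunday of every month: days 8-14 always contain exactly one Sunday, and the date +\%w test lets the job proceed only on that day.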
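
The weekday is tested inside the command because cron treats the day-of-month and day-of-week fields as alternatives when both are restricted, so "8-14" plus "Sunday" in the fields would match too many days. The same logic can be checked for any date with this sketch, which assumes GNU date (for the -d option); the function name is_second_sunday is made up.

```shell
#!/bin/sh
# Succeed if the given date (YYYY-MM-DD) is the second Sunday of its month,
# using the same rule as the cron entry: day of month 8-14 and weekday 0.
is_second_sunday() {
    day=$(date -d "$1" +%d)
    weekday=$(date -d "$1" +%w)
    day=${day#0}    # strip a leading zero, e.g. 08 -> 8
    [ "$day" -ge 8 ] && [ "$day" -le 14 ] && [ "$weekday" -eq 0 ]
}

# Example: the Sundays in August 2019 were the 4th, 11th, 18th and 25th
for d in 2019-08-04 2019-08-11 2019-08-18; do
    if is_second_sunday "$d"; then
        echo "$d: scrub would run"
    else
        echo "$d: no scrub"
    fi
done
```

The backslash in date +\%w is only needed inside a crontab, where an unescaped % has a special meaning.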


Size, read speed and write speed comparisons

On a 512 MB drive, BTRFS gave 447 MB of usable space and ZFS 464 MB. I checked the actually occupied space, not what df reported. That is about 4% more storage with ZFS (464 / 447 ≈ 1.04).
    
Read tests:
BTRFS: 373030912 bytes (373 MB, 356 MiB) copied, 6.8928 s, 54.1 MB/s
ZFS: 373030912 bytes (373 MB, 356 MiB) copied, 8.63009 s, 43.2 MB/s
    
Write tests:
ZFS: 40960000 bytes (41 MB, 39 MiB) copied, 3.26608 s, 12.5 MB/s*
BTRFS: 40960000 bytes (41 MB, 39 MiB) copied, 0.322734 s, 127 MB/s

* Note: the above read and write figures are not representative for ZFS. It turns out to be much slower on degraded RAID1 arrays, so these values are too low.

This is a personal note. Last updated: 2019-08-17 14:39:01.


