First Light: Balkonsternwarte

A balcony facing south is a great opportunity for astronomical observations. And if the view to the north and to Polaris is unobstructed as well, such a balcony obviously calls for a telescope for observing the sky.

I have now fulfilled my dream of a small balcony observatory. Since larger optics and long focal lengths are not really an option in the middle of the city, and since I have had good experiences with Skywatcher mounts, the choice fell on the Skywatcher HEQ-5. Together with a guide scope and an M-GEN autoguider, a Canon 40Da I already owned, a few EF lenses, and a pile of accessories, all the necessary parts were finally in place in December '17.


ZFS: auto-snapshots and Windows shadow copies with Samba

With ZFS snapshots and Samba’s shadow_copy2 module, you can expose snapshots to Windows clients as shadow copies.

First, install zfs-auto-snapshot on Debian:

apt-get install zfs-auto-snapshot

The default configuration of zfs-auto-snapshot uses text labels like daily, weekly, etc. This is incompatible with the fixed snapshot name format that Samba's shadow_copy2 expects. As a workaround, instead of using text labels, I've changed the labels to numbers:

31: for daily snapshots
05: for 5 minute interval snapshots
24: for hourly snapshots
12: for monthly snapshots
52: for weekly snapshots

This generates snapshot names like these (the dataset name and timestamps are only examples):

tank/data@zfs-auto-snap_05-2017-12-30-1925
tank/data@zfs-auto-snap_24-2017-12-30-1900
tank/data@zfs-auto-snap_31-2017-12-30-0325

Now it is possible to use the label as input for the shadow copy timestamp:

shadow: format = zfs-auto-snap_%S-%Y-%m-%d-%H%M

The label, which is now always an integer, is parsed as the seconds (%S) part of the timestamp, which is (mostly) negligible for shadow copies.

The required changes for smb.conf are:

    vfs objects = shadow_copy2
    shadow: snapdir = .zfs/snapshot
    shadow: sort = desc
    shadow: format = zfs-auto-snap_%S-%Y-%m-%d-%H%M
    shadow: localtime = no
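
For orientation, a minimal share definition containing these options could look like the following; the share name and path are only examples:

    [data]
        path = /tank/data
        read only = no
        vfs objects = shadow_copy2
        shadow: snapdir = .zfs/snapshot
        shadow: sort = desc
        shadow: format = zfs-auto-snap_%S-%Y-%m-%d-%H%M
        shadow: localtime = no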

I've modified the installed cron job files to use the numeric labels:

# tail -vn +1 /etc/cron.*/zfs-auto-snapshot
==> /etc/cron.daily/zfs-auto-snapshot <==
exec zfs-auto-snapshot --quiet --syslog --label=31 --keep=7 //

==> /etc/cron.d/zfs-auto-snapshot <==

*/5 7-21 * * * root zfs-auto-snapshot -q -g --label=05 --keep=12  //

==> /etc/cron.hourly/zfs-auto-snapshot <==
exec zfs-auto-snapshot --quiet --syslog --label=24 --keep=24 //

==> /etc/cron.monthly/zfs-auto-snapshot <==
exec zfs-auto-snapshot --quiet --syslog --label=12 --keep=3 //

==> /etc/cron.weekly/zfs-auto-snapshot <==
exec zfs-auto-snapshot --quiet --syslog --label=52 --keep=4 //
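
To verify that the generated snapshot names really match the shadow: format string, you can list the most recent auto-snapshots:

zfs list -t snapshot -o name -s creation | grep zfs-auto-snap | tail -n 5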


ZFS: txg_sync stuck at 100% while copying a large dataset with rsync

While transferring a large dataset (2 TiB) via rsync from ext4 to ZFS, rsync hung after some time and txg_sync was stuck at 100% CPU.

It was a fresh system and ZFS had been set up only a few minutes earlier. After some research, I found out that this problem is related to the ARC.

I changed the zfs_arc_min parameter while txg_sync was stuck, and the system immediately became responsive again.

echo 1073741824 >> /sys/module/zfs/parameters/zfs_arc_min

This sets the minimum size (lower limit) of the ARC to 1 GiB.
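
To make this persistent across reboots, the same value can be set as a module option; the file name below is only a common convention, and depending on how the zfs module is loaded you may also need to run update-initramfs -u afterwards:

# /etc/modprobe.d/zfs.conf
options zfs zfs_arc_min=1073741824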



ZFS: Create pool with missing devices

ZFS on Linux has no support for creating pools with missing devices. That's an advantage of mdadm, which accepts a missing keyword in the list of devices.

But there is a workaround:

First, check the size of an existing device (all devices must have the same size). In my case, each disk has 2000398934016 bytes. I used fdisk to check the size, but you can use whatever you want.
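
For example, blockdev prints the size in bytes directly (the device path is a placeholder):

blockdev --getsize64 /dev/sdX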

Now create sparse file(s) of the given size.

truncate -s 2000398934016 /sparse1.bin

Repeat this for each missing device and increment the number.

Now you can call zpool create with all existing devices and append the sparse files. Afterwards, you should take the sparse files offline to prevent ZFS from writing data to them.
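
A sketch of both steps with placeholder disk paths; the pool name and sparse files match the status output below:

zpool create vm1storage raidz3 \
    /dev/disk/by-id/disk1 /dev/disk/by-id/disk2 /dev/disk/by-id/disk3 \
    /dev/disk/by-id/disk4 /dev/disk/by-id/disk5 \
    /sparse1.bin /sparse2.bin /sparse3.bin

zpool offline vm1storage /sparse1.bin
zpool offline vm1storage /sparse2.bin
zpool offline vm1storage /sparse3.bin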

root@virt-master1:/var/lib/libvirt/vmimages# zpool status
  pool: vm1storage
 state: DEGRADED
status: One or more devices has been taken offline by the administrator.
        Sufficient replicas exist for the pool to continue functioning in a
        degraded state.
action: Online the device using 'zpool online' or replace the device with
        'zpool replace'.
  scan: none requested

        NAME                                          STATE     READ WRITE CKSUM
        vm1storage                                    DEGRADED     0     0     0
          raidz3-0                                    DEGRADED     0     0     0
            ata-WDC_WD20EFRX-68AX9N0_WD-WMC300377714  ONLINE       0     0     0
            ata-WDC_WD20EFRX-68AX9N0_WD-WMC300443307  ONLINE       0     0     0
            ata-WDC_WD20EFRX-68EUZN0_WD-WCC4M6UR8S28  ONLINE       0     0     0
            ata-WDC_WD20EFRX-68EUZN0_WD-WCC4M7SALHVP  ONLINE       0     0     0
            ata-WDC_WD20EFRX-68EUZN0_WD-WCC4M5PJS2AA  ONLINE       0     0     0
            /sparse1.bin                              OFFLINE      0     0     0
            /sparse2.bin                              OFFLINE      0     0     0
            /sparse3.bin                              OFFLINE      0     0     0

To prevent data loss, you should replace the sparse files with real physical devices as soon as possible.
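
A replacement could then look like this (the new device path is a placeholder):

zpool replace vm1storage /sparse1.bin /dev/disk/by-id/new-disk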


ZFS: Why is L2ARC hit ratio so low?

Using an additional SSD as a second-level cache for the ARC, called L2ARC, can speed up your ZFS pool. But if you analyze how often that cache is actually used, you will find a very low hit ratio. To understand why the hit ratio is low, you should know how the L2ARC works.

ZFS uses a primary cache, the ARC, which takes some of your available RAM. Until the ARC is nearly full, no noteworthy amount of data is written to the L2ARC. Thus, until the ARC is warm, the L2ARC is barely used. But even while it is not used, every read request that misses the ARC also triggers a lookup in the L2ARC and is counted there. Because both caches are cold after a reboot, you will see a lot of cache misses.

Calculate the hit ratio

To calculate the hit ratio, the formula hit_ratio = hits / (hits + misses) is used.

L2 ARC Breakdown:                               2.25m
        Hit Ratio:                      12.80%  288.37k
        Miss Ratio:                     87.20%  1.97m
        Feeds:                                  795.95k

Now it's easy to understand why the hit ratio is low. If you have a lot of RAM (say 32 GiB), it takes hours or days until the ARC is warm, and then it takes hours or days again until the L2ARC is warm. During this time, every lookup is counted as a cache miss: after two days you may have 29837872 cache misses on the L2ARC while it is still filled with just a few bytes. Once both caches are warm, the L2ARC hits will slowly increase.

A better approach to calculating the hit ratio is to wait until the L2ARC is warm, then write down the current counts of L2ARC hits and misses.

real_hit_ratio = (hits - hits_before_warm) / ((hits - hits_before_warm) + (misses - misses_before_warm))

With this formula, you ignore all the hits and misses that accumulated while the cache was still cold.
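
As an illustration, the l2_hits and l2_misses counters from /proc/spl/kstat/zfs/arcstats can be plugged into this formula; the baseline values below are only placeholders for the numbers you wrote down once the cache was warm:

# baseline counters noted down once the L2ARC was warm (example values)
HITS_BEFORE=288370
MISSES_BEFORE=1970000

awk -v hb="$HITS_BEFORE" -v mb="$MISSES_BEFORE" '
    $1 == "l2_hits"   { hits = $3 }
    $1 == "l2_misses" { misses = $3 }
    END {
        h = hits - hb; m = misses - mb
        printf "real L2ARC hit ratio: %.2f%%\n", 100 * h / (h + m)
    }' /proc/spl/kstat/zfs/arcstats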

Determine if L2ARC cache is cold or warm

As a rule of thumb, I assume the cache is warm if:

  1. The difference between arc_max_size and arc_size is lower than 10% of arc_max_size.
  2. The difference between l2arc_size and l2arc_usage is lower than 50% of l2arc_size.
  3. The hit count of the L2ARC is greater than 1,000.
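
As a rough sketch, these rules can be checked against the counters in /proc/spl/kstat/zfs/arcstats. I assume here that c_max, size and l2_size correspond to arc_max_size, arc_size and l2arc_usage; the capacity of the cache device (l2arc_size) is not exposed there, so it has to be supplied manually:

# capacity of the L2ARC device in bytes (supply your own value)
L2ARC_CAPACITY=240057409536

awk -v cap="$L2ARC_CAPACITY" '
    $1 == "c_max"   { arc_max  = $3 }
    $1 == "size"    { arc_size = $3 }
    $1 == "l2_size" { l2_used  = $3 }
    $1 == "l2_hits" { hits     = $3 }
    END {
        warm = (arc_max - arc_size) < 0.10 * arc_max &&
               (cap - l2_used) < 0.50 * cap &&
               hits > 1000
        print (warm ? "L2ARC looks warm" : "L2ARC is probably still cold")
    }' /proc/spl/kstat/zfs/arcstats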

Final thoughts

Keep in mind that if you have a lot of RAM available for the ARC, it may take days until the L2ARC is filled with data. The L2ARC is lost after a reboot (see issue #925 for persistent L2ARC), so if you shut down your system every night, your L2ARC is effectively never used.
