Tar level-1 dumps are the same size as level-0 dumps

From The Open Source Backup Wiki (Amanda, MySQL Backup, BackupPC)
Revision as of 03:51, 7 September 2011 by Nathanst (talk | contribs) (expand description of first thread in Links)

Problem

On certain occasions, GNU tar produces level-1 dumps which are the same size as the preceding level-0 dumps, as if tar sees every file as changed. For a highly leveraged Amanda installation, where the total data being backed up is much larger than available tape storage, this can cause dumps to fail badly, as Amanda cannot fit dumps for all DLEs on tape.

Explanation

Amanda uses index files not only to find files on tapes, but to allow GNU tar to calculate what has changed on a partition from day to day. When tar runs, it takes an index file describing the previous dump as input, and produces a new index file describing the new run.

Among other data about files that are backed up, GNU tar's index files contain the device number of the device on which the file is located. For example, if the /data partition is on /dev/sda2, then the device number might be 8,2; you can see the device numbers with ls, operating directly on the underlying device:

# df /data
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda2               988244    698452    239592  75% /data
# ls -al /dev/sda2
brw-rw---- 1 root disk 8, 2 May 23 00:27 /dev/sda2

This particular problem occurs when the device number for a particular piece of hardware changes. Some device numbers, notably those for device mapper (used by the Logical Volume Manager on Linux), are assigned dynamically at boot time, and changing a machine's hardware configuration (for example, by adding a new USB CD drive) can affect those numbers. The result is that, on the next run, tar sees a different device number for every file, and interprets that to mean that every file has changed.
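
To confirm that a changed device number is the cause, it can help to record the device number of the filesystem before and after a reboot or hardware change. The following is a minimal sketch, not part of Amanda or tar: the helper name dev_to_major_minor is hypothetical, and the bit layout assumes Linux with glibc and a device number that fits in 32 bits.

```shell
# Hypothetical helper: split a Linux st_dev value (decimal) into "major,minor".
# Assumes the glibc dev_t layout for device numbers that fit in 32 bits.
dev_to_major_minor() {
    dev=$1
    # Major number: bits 8-19 of the device number.
    major=$(( (dev >> 8) & 0xfff ))
    # Minor number: low 8 bits, plus the extended bits stored above bit 20.
    minor=$(( (dev & 0xff) | ((dev >> 12) & 0xfff00) ))
    printf '%s,%s\n' "$major" "$minor"
}

# Example usage (assumes GNU stat and a filesystem mounted at /data):
#   dev_to_major_minor "$(stat -c '%d' /data)"
```

If the value printed for the same mount point differs between two runs, tar will likely treat every file on that filesystem as changed.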

Solutions

A long-term solution to this problem is possible, but is not currently planned:

  • the kernel developers could assign permanent numbers for LVM and the other dynamic but common devices through LANANA (the Linux Assigned Names and Numbers Authority)

However, you can also stabilize the device numbers yourself on Linux, at least for the device mapper, assuming you're using modules. Just add

options dm-mod major=238

to /etc/modprobe.conf.
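
After reloading the module or rebooting, you may want to check which major number the device mapper actually received. The sketch below assumes the Linux /proc/devices format, where the driver is listed as "device-mapper"; the function name dm_major is hypothetical.

```shell
# Hypothetical helper: read /proc/devices-style text on stdin and print the
# major number registered for the device mapper, if any.
dm_major() {
    awk '$2 == "device-mapper" { print $1 }'
}

# Example usage on a Linux host:
#   dm_major < /proc/devices
```

With the modprobe option above in effect, the expectation is that this prints 238 on every boot.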

If you have a one-time situation where the device number changes (e.g. upgrading to a new version of the OS kernel causes the number to change), you can manually update the index files. The GNU tar distribution contains a Perl script, tar-snapshot-edit, which can "edit" index files, swapping one device number for another. If it is not included in your version of tar, you can also download it from http://www.gnu.org/software/tar/utils/tar-snapshot-edit.html or http://git.savannah.gnu.org/gitweb/?p=tar.git;a=blob;f=scripts/tar-snapshot-edit;hb=HEAD .
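
As a rough illustration, the sketch below only prints the commands that would remap one device number to another in a set of index files, so they can be reviewed before anything is modified. It assumes the -r 'OLD-NEW' invocation described in the script's own usage text; check the header of your copy before relying on it. The helper name print_remap_commands, the device numbers, and the file paths are hypothetical.

```shell
# Hypothetical helper: print (do not run) one tar-snapshot-edit command per
# index file, remapping device number OLD_DEV to NEW_DEV.
# Assumes the "-r 'OLD-NEW'" form from the script's usage text.
print_remap_commands() {
    old_dev=$1
    new_dev=$2
    shift 2
    for snapshot in "$@"; do
        printf "tar-snapshot-edit -r '%s-%s' '%s'\n" "$old_dev" "$new_dev" "$snapshot"
    done
}

# Example usage (device numbers and path are placeholders):
#   print_remap_commands 64768 65024 /path/to/gnutar-lists/*
```

Copy the index files somewhere safe before running the printed commands against them.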

On the other hand, if the device numbers change frequently in your environment, you may need to use the --no-check-device option to GNU tar, available in version 1.20 and later. For more information, see:

In Amanda 2.6.1, the "amgtar" Application API script already includes a property that controls whether this option is passed to GNU Tar: http://wiki.zmanda.com/man/amgtar.8.html

(If you are running an earlier version of Amanda but have a recent-enough version of GNU Tar, you may be able to force usage of this option by arranging for the TAR_OPTIONS environment variable to be set on the Amanda client when the dumps are run, or by following the "wrapper script" instructions found in http://wiki.zmanda.com/index.php/Backup_client#pre_and_post-_backup_scripts_.28wrappers.29 )
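
One way to set the variable from a wrapper or startup script is sketched below. It relies on GNU tar reading default options from TAR_OPTIONS; the helper name add_no_check_device is hypothetical, and where the snippet must live so that the Amanda client inherits the variable depends on your installation.

```shell
# Hypothetical helper: return the given TAR_OPTIONS value with
# --no-check-device appended, unless it is already present.
add_no_check_device() {
    case " $1 " in
        *" --no-check-device "*) printf '%s\n' "$1" ;;
        "  ") printf '%s\n' "--no-check-device" ;;
        *) printf '%s\n' "$1 --no-check-device" ;;
    esac
}

# Export the result so that any tar started from this environment sees it.
TAR_OPTIONS=$(add_no_check_device "${TAR_OPTIONS:-}")
export TAR_OPTIONS
```

Appending rather than overwriting keeps any options that were already set for tar on the client.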

Links