Hardware compression

From The Open Source Backup Wiki (Amanda, MySQL Backup, BackupPC)

Compression in hardware, i.e. in the tape drive itself, is not done by writing the bits at a higher density, by using more tracks, or by using both sides of the tape. It is achieved by an algorithm, built into the firmware of the tape drive, that removes redundancy from the byte stream so that fewer bits need to be written to the tape medium. Manufacturers often claim a 2:1 or even 2.6:1 reduction on "normal data", hence the advertising of e.g. 36/72 GB tape capacity. Some of these algorithms, when given incompressible data (for example, data already compressed in software), actually expand it by 5-15%. The net tape capacity for such data drops accordingly: the expected 36 GB becomes around 30-32 GB on such drives. Some tape drives (e.g. LTO) have much better compression firmware that does not have this negative impact.

It is usually recommended to disable hardware compression with Amanda, though it may have less of a negative impact with certain modern tape formats (e.g. LTO).

There are multiple methods to set (i.e. turn off) hardware compression. They vary somewhat between OSes, and not all tape drives support the setting. At the protocol level a drive may support several ways of controlling hardware compression, which your OS's mt(1) tool or similar may or may not handle: the "device configuration" SCSI-2 mode page 0x10 (common), the "data compression" SCSI-3 mode page 0x0F (sometimes), or a special "density" setting (rare nowadays, thankfully). If your tape drive still has hardware DIP switches to control settings, it can be more pleasant to use them.

Resurfacing HW compression

Note that some tape drives re-enable HW compression if they read HW-compressed data. Therefore it may be necessary to properly overwrite existing tapes to ensure HW compression stays off.

Please note that compression information is not stored on the tape's ident header block until the tape has been written to.

Procedure for turning off compression and labelling tapes

  • Label the tape
  • Rewind the tape
  • Read the label to a file using the dd command
  • Turn off tape compression using the mt(1) command. See the next section.
  • Re-write the label block, then write additional blocks from /dev/zero to flush the drive's buffers.
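
The steps above might be scripted roughly as follows. This is only a sketch under stated assumptions: a Linux non-rewinding device such as /dev/nst0, the mt-st syntax ("compression 0") described in the next section, a label that fits in one 32 KiB block (Amanda's default block size), and an arbitrary count of 32 zero blocks. The helper names run and relabel_without_compression are hypothetical. The tape is assumed to have been labelled already (e.g. with amlabel). Setting DRY_RUN=1 prints the commands instead of touching the drive.

```shell
# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Save the label, switch compression off, then rewrite the label.
# $1: non-rewinding tape device (assumed, e.g. /dev/nst0)
# $2: scratch file that receives the saved label block
relabel_without_compression() {
    tape=$1
    label=$2
    run mt -f "$tape" rewind
    run dd if="$tape" of="$label" bs=32k count=1       # read the label block
    run mt -f "$tape" rewind
    run mt -f "$tape" compression 0                     # mt-st syntax (assumed)
    run dd if="$label" of="$tape" bs=32k count=1       # rewrite the label
    run dd if=/dev/zero of="$tape" bs=32k count=32     # flush the drive's buffers
    run mt -f "$tape" rewind
}
```

A cautious first run would be DRY_RUN=1 followed by relabel_without_compression /dev/nst0 /tmp/label.img, to review the commands before letting them act on a real tape.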

OS specific notes

Linux

Under Linux (and assuming a drive with support for "standard" SCSI compression functionality), there are actually two separate tape-drive-compression settings that can be configured, and on top of that there are two separate versions of the mt utility, which each have a different way of altering those tape-drive-compression settings.

Tape drive vs. Linux kernel driver compression settings

Tape drive setting

Most SCSI tape drives support the Data Compression Characteristics mode page (page 15 or 0x0F). The host computer can read and set the hardware compression settings using the SCSI MODE SENSE and MODE SELECT commands, respectively. When this setting is written, the drive immediately changes to the specified setting, and reading this setting shows the physical drive's current status.

Linux kernel drivers

The Linux kernel st driver keeps track of a "default" setting for each tape-drive device that it is controlling. If that default is set, the driver will set the tape drive to match the setting (e.g. turn on compression if the default is set to "on") whenever the system starts writing data to the beginning of a tape in that drive.

This "default" setting is controlled using the MT_ST_DEF_COMPRESSION subcommand of the MTSETDRVBUFFER ioctl on the device file (e.g. /dev/st0).

An additional feature of the kernel driver is the support for multiple "modes" within a single physical tape drive. Each "mode" has a separate device file; for example /dev/st0, /dev/st0a, /dev/st0l, and /dev/st0m each refer to the same tape drive via different modes. The kernel keeps track of separate "default compression" settings for each mode that is defined.

The st kernel driver also supports an MTCOMPRESSION ioctl, which will immediately set the compression status of the drive (using the SCSI mode page 15 commands internally to do so).

On the other hand, the st driver does not provide any ioctls for reading the current setting of either the mode page 15 compression setting or the driver's "default compression" setting.

For more information, see the README.st or st.txt files in the Documentation directory within the kernel source tree (for example, the st.txt shipped with Linux 2.6.30).

Versions of the mt command

There are actually two different versions of the mt command in use on Linux, GNU mt and mt-st. For most operations they are interchangeable, but for a few operations (including controlling tape-drive hardware compression settings) the two programs behave quite differently.

You can find out which version you are running by giving the command

# mt --version

In some distributions (e.g. Debian), it's possible to install both versions at the same time; you then invoke the command as either mt-gnu or mt-st as desired.
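
A script that has to work with either variant could branch on the version banner. The sketch below assumes that mt-st mentions "mt-st" and that GNU mt mentions "GNU" somewhere in its --version output; check this against your own system. The function name classify_mt is hypothetical.

```shell
# Classify the text printed by "mt --version".
# Assumption: mt-st banners contain "mt-st", GNU mt banners contain "GNU".
classify_mt() {
    case "$1" in
        *mt-st*) echo mt-st ;;
        *GNU*)   echo gnu ;;
        *)       echo unknown ;;
    esac
}
```

Typical use would be classify_mt "$(mt --version 2>&1)", after which the script can pick the datcompression or compression operation accordingly.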

GNU mt

GNU mt is part of the GNU cpio package. However, the upstream version does not include support for various functions that are available from SCSI tape drives, so many Linux distributions include their own patches to support these functions.

In particular, the patches add support for a datcompression operation, which can be used to control the tape drive's compression setting. This command directly accesses the SCSI Data Compression Characteristics mode page as discussed above, so any changes made using this command take effect immediately (but the st driver's "default compression" setting is not changed, since it bypasses that driver completely).

mt-st

mt-st is a version of "mt" patched to support various functions provided by the Linux st device driver. (The project does not seem to have a home page, but source code for the package is made available in the mt-st-*.tar.gz file found in http://ftp.ibiblio.org/pub/linux/system/backup/ .)

Since it uses the st driver ioctl functions, mt-st is able to set both the current SCSI mode page compression settings and the kernel driver's "default compression" setting -- but it is not able to read back the current values of either setting for display to the user.

The mt-st package also includes the stinit command, which can be used to initialize various kernel-driver settings for the tape devices, including the configuration of the "mode" devices (e.g. to activate the /dev/st0a, /dev/st0l, and /dev/st0m devices). (This command also uses the st kernel driver ioctls, so it can initialize that driver's "default compression" setting, etc.)

Changing and checking the compression settings
  • To change the compression setting using the (patched) GNU mt, use the command:
# mt -f /dev/st0 datcompression [COUNT]

If COUNT is "1" (which is the default COUNT value, so you can leave it off of the command line), mt displays the current compression settings for the device.
If COUNT is "0", compression is turned off.
Otherwise, compression is turned on.

(After changing the compression setting, mt will re-read the current settings from the drive and then display those settings just as if a COUNT=1 request had been executed.)

mt datcompression will also print out an indication of whether or not the tape drive supports compression.

As mentioned earlier, GNU mt directly manipulates the SCSI Data Compression Characteristics mode page for the drive. This means that the setting it displays shows how the drive is actually operating at that moment, and that GNU mt cannot change the kernel st driver's "default compression" setting.


  • To change the current compression setting using mt-st, use the command:
# mt -f /dev/st0 compression [COUNT]

In this case, a COUNT of "1" will enable compression, and a COUNT of "0" will turn it off[*]. (The default value for COUNT is also "1" in this version of mt, so if you run this command without specifying a COUNT, compression will be enabled.)

(* Actually, as the st driver is currently implemented, any odd COUNT value will enable compression and any even value will disable it.)

The compression operation uses the kernel st driver's MTCOMPRESSION ioctl to set the compression. This ioctl, in turn, manipulates SCSI mode page 15 to set the drive to the desired setting. Because the st driver does not provide any way for a program to read back the current setting, mt-st does not print any such information. (If you also have the GNU mt command installed, you can use its datcompression operation to verify the drive's current status.)
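
The odd/even behaviour of COUNT noted above can be modelled in a few lines of shell. This is only an illustration of the rule as described here (the driver looking at the lowest bit of COUNT), not code taken from the st driver; the function name compression_effect is hypothetical.

```shell
# Predict the effect of "mt compression COUNT" under the odd/even rule:
# odd COUNT enables compression, even COUNT disables it.
compression_effect() {
    if [ $(( $1 % 2 )) -ne 0 ]; then
        echo enabled
    else
        echo disabled
    fi
}
```

To avoid surprises it is simplest to stick to the values 0 and 1.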


  • mt-st can also change the kernel st driver's "default compression" setting, using the command:
# mt -f /dev/nst0 defcompression [COUNT]

As with the "compression" operation, a COUNT of "0" sets the default to "compression disabled" and "1" sets the default to "compression enabled". The special value of "-1" disables the default compression setting. (In other words, it prevents the st driver from trying to automatically change the drive's compression setting.)

  • The stinit command initializes SCSI tape drives using st driver ioctls. The program is generally run at system startup and/or when the st kernel module is loaded or a new tape drive is plugged into the system. When executed, it reads configuration information from a configuration file (normally /etc/stinit.def) and puts the specified settings into effect for each drive found in the file.

stinit is also able to manipulate the st driver's "modes", or in other words it can activate (and configure) the /dev/st0a, /dev/st0l, and /dev/st0m devices.

The "compression" field in stinit's configuration file configures the st driver's "default compression" setting.

So, for example, if the configuration file contains

manufacturer=XXXXX model = "YYYYY" {
mode1 compression=1
mode2 compression=0
}

(with the correct manufacturer and model strings, as taken from the listing found in /proc/scsi/scsi) and stinit is executed beforehand, then any program that uses /dev/st0 will have hardware compression enabled, while programs that reference /dev/st0l will have it disabled. (Attempts to access /dev/st0m and /dev/st0a will continue to result in a "no such device"-type error message until the correct "mode3" and "mode4" lines are added to the stinit.def file.)

(So in this example, one would point the tapedev line in amanda.conf to either st0 or st0l to select whether or not hardware compression would be used for the data written by Amanda.)
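
The manufacturer and model strings needed for stinit.def can be pulled out of /proc/scsi/scsi with a small helper. The sketch below assumes the usual layout of that file (a "Vendor: ... Model: ... Rev: ..." line followed by a "Type:" line for each device); the function name list_tape_models and the "vendor|model" output format are hypothetical.

```shell
# Print "vendor|model" for every tape (Sequential-Access) device listed in a
# /proc/scsi/scsi-style file. $1 defaults to /proc/scsi/scsi.
list_tape_models() {
    awk '
        /Vendor:/ {
            line = $0
            sub(/^.*Vendor:[ \t]*/, "", line)      # drop everything up to the vendor
            vendor = line
            sub(/[ \t]*Model:.*$/, "", vendor)     # keep only the vendor string
            model = line
            sub(/^.*Model:[ \t]*/, "", model)      # drop everything up to the model
            sub(/[ \t]*Rev:.*$/, "", model)        # keep only the model string
        }
        /Type:/ && /Sequential-Access/ { print vendor "|" model }
    ' "${1:-/proc/scsi/scsi}"
}
```

Each printed pair corresponds to the manufacturer= and model= values of one stinit.def entry.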

  • Note that the kernel Documentation/devices.txt file refers to the four possible logical modes for each physical tape device as "mode 0" through "mode 3", while the stinit program refers to them as "mode1" through "mode4".
  • Because the st driver's "default compression" setting takes effect each time the driver starts writing to the beginning of the tape, and because amlabel and amdump always write a new label file at the beginning of the tapes they are using, it shouldn't be necessary to use the procedure described above under Resurfacing HW compression when switching the hardware-compression usage on an existing tape (as long as something like mt defcompression or stinit is used to configure the st driver beforehand).
  • Since the GNU mt datcompression command reads the compression information directly from the tape drive, it will report the physical drive's current setting regardless of which device file is specified. To continue the example using the stinit.def file above, if the last write to the tape were done using /dev/st0, then the command mt -f /dev/st0l datcompression would report that compression was enabled, even though the "default compression" setting for that particular device file is "disabled". As soon as data is written to the beginning of a tape using that device file, the kernel will send the command to the drive to disable compression. After that write is completed, the same mt datcompression command will report that compression is now disabled.
  • While the st driver does not provide any ioctl to allow a program to read back the current value of the "default compression" setting, it does report some information about changes to that value via kernel log messages. (So watching the kern.log file while using stinit or mt defcompression will give a good idea as to what the kernel driver is actually doing.)
  • Additionally, in later versions of the kernel (2.6.x), the st driver does provide information about its current status using the /sys filesystem, under /sys/class/scsi_tape/<dev> .
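
As a rough illustration of the /sys interface mentioned above, the following sketch prints the driver's per-device defaults. It assumes that each directory under /sys/class/scsi_tape contains attribute files named defined and default_compression (with -1 meaning "no default set"), as described in the kernel's st documentation; verify the names on your kernel. The function name show_st_defaults is hypothetical.

```shell
# Print the st driver's default compression setting for each tape device file.
# $1: sysfs directory to scan (defaults to /sys/class/scsi_tape).
show_st_defaults() {
    root=${1:-/sys/class/scsi_tape}
    for dir in "$root"/*; do
        # Skip entries that do not expose the attribute (assumed name).
        [ -r "$dir/default_compression" ] || continue
        printf '%s defined=%s default_compression=%s\n' \
            "${dir##*/}" \
            "$(cat "$dir/defined" 2>/dev/null)" \
            "$(cat "$dir/default_compression")"
    done
}
```

Running show_st_defaults after stinit or mt defcompression gives a quick way to confirm what the driver recorded, complementing the kernel log messages.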

FreeBSD

  • Use the mt(1) command to turn off hardware compression at boot time. For example:
# mt -f /dev/nsa0 comp off

NetBSD

  • Use the mt(1) command to turn off hardware compression at boot time. For example:
# mt -f /dev/nrst0 compress 0

OpenBSD

  • At time of writing (OpenBSD 3.9), OpenBSD's mt(1) doesn't allow manipulation of hardware compression settings. However, one can use the swiss-army-knife scsi(8) command to change the compression setting (and more besides - be careful...) in the mode pages. This should probably be considered undocumented and subject to change, as the st(4) man page describes as "reserved" the device bit pattern enrst0 used here to put the st kernel driver into "control mode".
# EDITOR=vi scsi -f /dev/enrst0 -m 16 -e -P0 

(-m 16 selects the "Device Configuration" mode page. In the text editor that is launched, change the value of "Select Data Compression Algorithm" to 0, then save and exit. Using -P3 might save the setting, equivalent to the Linux mt "defcompression" above, but some tape drives don't seem to support it.)

Solaris

Set ST_MODE_SEL_COMP in st.conf for your device and then use the correct tape device variant (do not use the "c" or "u" devices).

AIX

(untested, based on public man pages; verification by someone with AIX experience is welcome)

# chdev -l rmt0 -a compress=no