VFS Device: Difference between revisions

From wiki.zmanda.com

Revision as of 22:10, 17 December 2005

Based on text by: Stefan G. Weichinger, November - December, 2003 ; minor updates in April, 2005.

Introduction

Since release 2.4.3, Amanda supports an output driver called "file". See the manual page of amanda, section OUTPUT DRIVERS, for more information.

As the name suggests, this driver uses files on disk as virtual tapes. Amanda can write to and read from virtual tapes, just like real tapes. A bunch of virtual tapes can even be manipulated with a changer.

Possible Uses

  • Test installations: You can easily explore the rich features of Amanda on systems without tape drives. Virtual tapes are usually also much faster than many real tape drives. For a quick start, have a look at: Test environment with virtual tapes.
  • Inexpensive installations: Without buying a tape drive you can enjoy the benefits of Amanda and back up to a bunch of hard disks. You can create CD/DVD-sized backups which you can burn onto optical disks later. Or you can back up to external disks connected with Firewire or USB.
  • Disk-based installations: You can use the file driver to back up onto a set of virtual tapes hosted on a bunch of hard disks or a RAID system. Combined with another Amanda configuration that dumps the virtual tapes to real tapes, you can provide reliable backup with faster tapeless recovery. Some people call this "disk-to-disk-to-tape" backup.

Please be sure to understand the differences between holding disks and virtual tapes. The two serve different purposes: holding disks allow multiple disklist entries (DLEs) to be backed up in parallel, while virtual tapes are a replacement for physical tapes.

The virtual tapes are also called "vtapes" in this document.

Disk requirements

Before beginning, you will need to decide which parts of your hard disks to dedicate to virtual tape storage. While this space could be spread among several file systems and hard disks, I recommend dedicating at least a specific partition, or better a specific physical hard disk, to the task of keeping your vtapes. Using a dedicated disk will definitely speed things up.

The disk space you dedicate for your vtapes should NOT be backed up by Amanda. Also, for performance reasons there should be NO holding disks on the same partition as the vtapes, preferably not even on the same physical drive.

If you only have one hard disk, it will still work, but you will suffer low performance due to heavy head movement in the disk, resulting from copying data between the filesystems.
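
As a rough aid for the advice above, the following sketch checks whether two directories sit on the same filesystem by comparing the device and mount point reported by df -P. The function name same_filesystem is made up for this example, and the check cannot tell whether two different partitions share one physical drive.

```shell
# Hypothetical helper: succeeds if both paths live on the same filesystem.
# Compares the device and the mount point reported by "df -P".
# Assumption: device names and mount points contain no spaces.
same_filesystem() {
    fs_a=$(df -P "$1" | awk 'NR==2 {print $1, $NF}')
    fs_b=$(df -P "$2" | awk 'NR==2 {print $1, $NF}')
    [ -n "$fs_a" ] && [ "$fs_a" = "$fs_b" ]
}
```

For example, "same_filesystem /amandatapes /path/to/holdingdisk && echo WARNING" would print a warning when the vtapes and a holding disk share a partition (the holding disk path here is only a placeholder).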

Prepare the filesystem used for the vtapes

Decide where to put your files, create the appropriate partition(s) and filesystem(s), and mount them. In our example we have the dedicated partition hdc1, mounted on /amandatapes, for vtape storage. The filesystem must be capable of creating large files (> 4 Gbyte) and must be able to handle symlinks (so no vfat).

$ mount
[...]
/dev/hdc1 on /amandatapes type reiserfs (rw)
[...] 

Make sure there is space left. Determine the amount of space you will use.

$ df -h /amandatapes
Filesystem      Size  Used  Avail  Use%   Mounted on
/dev/hdc1        20G    0G    20G    0%   /amandatapes 

In our example we have 20GB diskspace left on /amandatapes.

Determine length and number of tapes

After deciding on the number of vtapes you want to create, evenly allocate the available space among them.

Consider the following rule of thumb: as many filesystems exhibit dramatically reduced performance when they are nearly full, I have chosen to allocate only 90% of the available space. So we have:

     (Available Space * 0.9) >= tapelength * tapecycle

This is a very conservative approach to make sure you don't suffer any performance drop due to a nearly full filesystem. As it is uncommon for Amanda to fill an entire tape, you may also wish to use more space than that. So you could determine possible combinations of tapelength/tapecycle with the more general formula:

     Available Space >= tapelength * tapecycle

In our example we take the conservative approach, and so we could create the following combinations:

20 GB * 0.9 = 18 GB to use
  • 18 GB = 18 GB * 1
  • 18 GB = 9 GB * 2
  • 18 GB = 6 GB * 3
  • 18 GB = 3 GB * 6
  • 18 GB = ...

Using only one tape is generally considered a bad idea when it comes to backup, so we should use at least 3 tapes (for testing purposes), better 6 or more tapes.

  • 18 GB = 3 GB * 6

So we get a tapelength of 3 GB if we want to use 6 tapes.
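
The conservative calculation above can be sketched in shell arithmetic. The function name vtape_length_mb is invented for this example; it takes the available space in MB and the number of vtapes, and rounds down to whole megabytes.

```shell
# Hypothetical helper: per-vtape length in MB under the conservative 90% rule.
# $1 = available space in MB, $2 = number of vtapes (tapecycle)
vtape_length_mb() {
    avail_mb=$1
    tapecycle=$2
    # integer arithmetic: (available * 0.9) / tapecycle, rounded down
    echo $(( avail_mb * 9 / 10 / tapecycle ))
}

vtape_length_mb 20480 6   # 20 GB over 6 vtapes, prints 3072
```

The result is in MB, which is convenient because the tapetype length can be given in mbytes.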

Create a tapetype definition

Add a new tapetype definition similar to the following to your amanda.conf. I named my definition "HARD-DISK". Choose whatever name you consider appropriate.

define tapetype HARD-DISK {
    comment "Dump onto hard disk"
    length 3072 mbytes 	# specified in mbytes to get the exact size of 3GB
}

You don't have to specify the parameter speed (as it is commonly listed in tapetype definitions and reported by the program amtapetype). Amanda does not use this parameter right now.

There is also an optional parameter filemark, which indicates the amount of space "wasted" after each tape-listitem. Leave it blank and Amanda uses the default of 1KB.

The tapetype defined above must then be selected with the tapetype parameter in amanda.conf:

 tapetype HARD-DISK

Simple virtual tapes

A virtual tape is implemented as a directory with a subdirectory named "data" in it. Let's create one for our "test" configuration:

# chown amanda:disk /amandatapes
# chmod 750 /amandatapes                       # backups contain secret files!
# su - amanda
$ mkdir -p /amandatapes/test/tape1/data

This tape can be manipulated by the ammt command, a replacement for the system command "mt". The ammt command understands the different output drivers from Amanda:

$ ammt -f file:/amandatapes/test/tape1 status
$ ammt -f file:/amandatapes/test/tape1 rewind

Vtapes are always non-rewinding, just as Amanda needs them. That's why you always need to rewind a vtape when you want to start reading it from the beginning.

Basic writing to a vtape can be done with amdd, a replacement for the system command "dd". Virtual tapes have no real built-in capacity; the upper limit is "diskspace, the final frontier". However, Amanda does obey the size you specify in the tapetype definition of a vtape in amanda.conf. The amdd command can also set an upper limit on the virtual tape size with the -l option:

$ amdd -l 200k if=/dev/urandom of=file:/amandatapes/test/tape1 bs=32k
amdd: write error: No space left on device
8+0 in
6+1 out

The above command writes 200 Kbytes of garbage (6 full blocks of 32k + 1 partial block) on the vtape before it bumps into the end of the virtual tape.
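
The block count reported above follows from simple integer division of the size limit by the block size. A small sketch (the function name split_blocks is invented here, and both sizes are given in KB):

```shell
# Hypothetical helper: how a size limit splits into full blocks plus a remainder.
# $1 = size limit in KB, $2 = block size in KB
# Prints "<full blocks> <size of partial block in KB>".
split_blocks() {
    limit_kb=$1
    bs_kb=$2
    echo "$(( limit_kb / bs_kb )) $(( limit_kb % bs_kb ))"
}

split_blocks 200 32   # prints "6 8": 6 full 32k blocks and one 8k partial block
```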

When there is no "data" subdirectory in a vtape, the vtape is "offline". You could burn the contents of the data directory to a CD-R and store that away. When you want to read it, just mount it as the "data" directory or, even simpler, create a symlink "data" pointing to the mounted CD-ROM.

$ rm -r /amandatapes/test/tape1/data
$ ammt -f file:/amandatapes/test/tape1 status
file:/amandatapes/test/tape1: status: OFFLINE
$ ln -s /media/cdrom /amandatapes/test/tape1/data
$ ammt -f file:/amandatapes/test/tape1 status
file:/amandatapes/test/tape1: status: ONLINE

Amanda cannot back up to a CD-R directly, but it can use one as a read-only vtape; making a backup to a vtape and later burning the contents of the data directory to a CD or DVD is the normal way.

We can use such a simple vtape as a tape device in amanda.conf with a line like:

tapedev "file:/amandatapes/test/tape1"

For each run we would then point the data symlink to a different directory manually. But read on, this can also be automated.
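
The manual step could look roughly like the sketch below. The function name point_vtape_data is invented for this example. It only ever replaces a symlink and refuses to touch a real "data" directory, so existing backup data is not removed by accident.

```shell
# Hypothetical helper: point a vtape's "data" symlink at another directory.
# $1 = vtape directory (the tapedev path without "file:")
# $2 = directory that should become the vtape's data
point_vtape_data() {
    vtape=$1
    target=$2
    [ -d "$target" ] || { echo "no such directory: $target" >&2; return 1; }
    # only replace a symlink, never a real data directory
    if [ -e "$vtape/data" ] && [ ! -L "$vtape/data" ]; then
        echo "$vtape/data is not a symlink" >&2
        return 1
    fi
    rm -f "$vtape/data"
    ln -s "$target" "$vtape/data"
}
```

A call such as "point_vtape_data /amandatapes/test/tape1 /media/cdrom" would correspond to the ln -s command shown earlier.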

Virtual tapes with chg-disk

  • To use chg-disk you need to have at least amanda-2.4.4p1-20031202.

The changer script "chg-disk" is specifically written to handle a bunch of virtual tapes on disk. Unlike most other changer scripts, it does not need a separate configuration file. Instead it uses these parameters in amanda.conf:

tpchanger "chg-disk"
changerfile "/home/amanda/test/chg-disk-status"    # status files prefix
tapedev "file:/amandatapes/test/slots"
tapecycle 5
# changerdev is ignored

"Chg-disk" operates the virtual changer by pointing the symlink data to another directory, named slotX, where X is the slot number. The directory tree should look like:

slot_root_dir -|
               |- info
               |- data -> slot1/
               |- slot1/
               |- slot2/
               |- ...
               |- slotN/

"slot_root_dir" is the value of the tapedev parameter, and N is the value of tapecycle in amanda.conf. The changer script uses the value of changerfile as the prefix of some files which store the status of the virtual changer.

We create the virtual slots tree for the chg-disk changer, and set it "online", by creating the "data" symlink:

$ mkdir -p /amandatapes/test/slots
$ cd /amandatapes/test/slots
$ for i in 1 2 3 4 5; do mkdir -p slot$i; done
$ ln -s slot1 data
$ ammt -f file:/amandatapes/test/slots status

The file "info" in the slot_root_dir is created automatically on first use. Do not add a leading zero to the slot number; chg-disk would not understand that. Create as many slots as you have specified as tapecycle in amanda.conf.
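
To double-check that the number of slots matches tapecycle, something like the following sketch may help. The function name check_slot_count is invented here, and the parsing assumes a plain line of the form "tapecycle 5" (optionally followed by the word "tapes") in amanda.conf.

```shell
# Hypothetical check: does the number of slotN directories match tapecycle?
# $1 = path to amanda.conf, $2 = slot root directory
check_slot_count() {
    conf=$1
    root=$2
    # assumption: tapecycle appears as "tapecycle <number>" on its own line
    want=$(awk '$1 == "tapecycle" {print $2; exit}' "$conf")
    have=0
    for d in "$root"/slot[0-9]*; do
        if [ -d "$d" ]; then
            have=$(( have + 1 ))
        fi
    done
    echo "tapecycle=$want slots=$have"
    [ "$want" = "$have" ]
}
```

The function prints both numbers and returns success only when they agree.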

And we label the virtual tapes:

$ for i in 1 2 3 4 5; do amlabel test TEST-$i slot $i; done
$ amcheck test

As always we end with "amcheck test" and solve the issues it complains about. We can verify all the virtual tapes, and load the first tape again, ready for the first amdump run:

$ amtape test show
$ amtape test reset

Virtual tapes with chg-multi

A much older changer script is "chg-multi", which emulates a changer consisting of multiple tape drives. If you have two tape drives of the same model, and hook them to the server, this script allows you to emulate a changer with two slots.

Vtapes are complete tape drives, as far as Amanda is concerned, and you can operate a bunch of these with the chg-multi changer script too:

Create 5 vtapes for our "test" configuration:

for i in 1 2 3 4 5; do mkdir -p /amandatapes/test/tape$i/data; done

Now we have a server with 5 tape drives. They are virtual tapes, but Amanda isn't picky about that.

Change amanda.conf for the "test" configuration to have these values:

tpchanger "chg-multi"
changerfile "chg-multi.conf"       # name of the special configuration file
# tapedev is ignored if present, to avoid confusion, just comment it out
# changerdev is ignored too

The chg-multi script needs more configuration and uses the parameter changerfile in amanda.conf as the name of that special config file. So add a file named chg-multi.conf, the name we specified above as the changerfile, in the same directory as amanda.conf. Because "chg-multi" is useful in a wide context, we need to specify a lot of parameters that are irrelevant for vtapes:

multieject 0
needeject 0
gravity 0
ejectdelay 0
statefile /home/amanda/test/changerstatus
firstslot 1
lastslot 5
slot 1 file:/amandatapes/test/tape1
slot 2 file:/amandatapes/test/tape2
slot 3 file:/amandatapes/test/tape3
slot 4 file:/amandatapes/test/tape4
slot 5 file:/amandatapes/test/tape5

The statefile parameter in the chg-multi.conf is now the prefix of some files that hold the status of the emulated changer.
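
For more than a handful of vtapes, writing the slot lines by hand gets tedious. The sketch below prints a chg-multi.conf with the same parameters as the example above; the function name gen_chg_multi_conf and its argument order are made up for illustration.

```shell
# Hypothetical generator: print a chg-multi.conf for N vtapes.
# $1 = vtape base directory, $2 = number of vtapes, $3 = statefile prefix
gen_chg_multi_conf() {
    base=$1
    n=$2
    state=$3
    # parameters that are irrelevant for vtapes, as in the example above
    printf 'multieject 0\nneedeject 0\ngravity 0\nejectdelay 0\n'
    printf 'statefile %s\n' "$state"
    printf 'firstslot 1\nlastslot %s\n' "$n"
    i=1
    while [ "$i" -le "$n" ]; do
        printf 'slot %s file:%s/tape%s\n' "$i" "$base" "$i"
        i=$(( i + 1 ))
    done
}
```

Running "gen_chg_multi_conf /amandatapes/test 5 /home/amanda/test/changerstatus > chg-multi.conf" should reproduce the configuration listed above.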

And we label the tapes in all the slots:

$ for i in 1 2 3 4 5; do amlabel test TEST-$i slot $i; done
$ amcheck test
$ amtape test show
$ amtape test reset

And we are ready to use our tape changer with 5 tape drives.

Comparison of chg-disk and chg-multi for virtual tapes

The two changers chg-disk and chg-multi have a different approach to the handling of tapes:

  • The script chg-multi handles many drives with a tape in each drive.
  • The script chg-disk handles a library with one drive and multiple tapes.

This implies that chg-disk can drive only one tape at a time, while in the setup with chg-multi, you can always specify one specific tape, and use that one for e.g. restoring, while amdump is using another tape.

While the chg-disk changer is very straightforward to set up, the chg-multi script has a wider range of uses but is also slightly more complicated to set up. The chg-multi script can, for example, be extended with RAIT, mirroring the backup to a real tape or to a vtape on an external disk at the same time.

For chg-multi, the underlying filesystem does not need to be able to handle symlinks. You can use plain old VFAT on an external USB disk that is also accessible from Windows. (But you'll need to limit the maximum tape size to 4 Gbyte, or use multi-tape split dumps, available in 2.5.0, to avoid running into the maximum file size limit of VFAT.)

If you don't need the complexity of chg-multi, stay with the easy chg-disk. Migrating to chg-multi is easy by just moving the slotX directories of the chg-disk vtape tree to the data directories of each vtape in the chg-multi tree.
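
The migration described above might be scripted as in the following sketch. The function name migrate_slots is invented for this example. It moves each slotX directory to tapeX/data and skips any vtape that already has a data entry. Afterwards the old chg-disk "data" symlink points to a directory that no longer exists, so that tree should not be used any more.

```shell
# Hypothetical migration sketch: move chg-disk slots into chg-multi vtapes.
# $1 = chg-disk slot root, $2 = chg-multi base directory
migrate_slots() {
    slots=$1
    base=$2
    for d in "$slots"/slot[0-9]*; do
        [ -d "$d" ] || continue
        n=${d##*/slot}              # slot number, e.g. "3" from ".../slot3"
        mkdir -p "$base/tape$n" || return 1
        # never overwrite an existing vtape
        if [ -e "$base/tape$n/data" ]; then
            echo "skipping $d: $base/tape$n/data exists" >&2
            continue
        fi
        mv "$d" "$base/tape$n/data" || return 1
    done
}
```

With the paths used in this document, the call would be "migrate_slots /amandatapes/test/slots /amandatapes/test".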

To use chg-disk you need to have at least amanda-2.4.4p1-20031202. Chg-multi is much older.

Recovery

Recovering files from vtapes is very similar to recovering files from a "real" tapechanger. Make sure you read the chapter Restore. I will simply paste an amrecover session here (provided by JC Simonetti, author of chg-disk):

# /usr/local/amanda/sbin/amrecover woo
AMRECOVER Version 2.4.4p3. Contacting server on backupserver.local ... 
220 backupserver AMANDA index server (2.4.4p3) ready.
200 Access OK
Setting restore date to today (2004-10-08)
200 Working date set to 2004-10-08.
Scanning /BACKUP2/holding...
Scanning /BACKUP/holding...
200 Config set to woo.
200 Dump host set to backupserver.local.
Trying disk /tmp ...
$CWD '/tmp/RECOVER' is on disk '/tmp' mounted at '/tmp'.
200 Disk set to /tmp.
Invalid directory - /tmp/RECOVER
amrecover> sethost backupserver.local
200 Dump host set to backupserver.local.
amrecover> setdisk /
200 Disk set to /.
amrecover> cd /etc
/etc
amrecover> add passwd
Added /etc/passwd
amrecover> list
TAPE B3_14 LEVEL 0 DATE 2004-09-26
       /etc/passwd
amrecover> extract

Extracting files using tape drive file:/BACKUP2/slots/ on host
backupserver.local. The following tapes are needed: B3_14

Restoring files into directory /tmp/RECOVER
Continue [?/Y/n]? Y

Extracting files using tape drive file:/BACKUP2/slots/ on host
backupserver.local. 
Load tape B3_14 now
Continue [?/Y/n/s/t]? Y
. /etc/passwd
amrecover> quit
200 Good bye.  

Nothing spectacular? The trick is this: When Amanda asks you

Load tape B3_14 now 
Continue [?/Y/n/s/t]?  

you have to run the following in a second terminal:

$ amtape woo slot 14
amtape: changed to slot 14 on file:/BACKUP2/slots/ 

This step is necessary to load the proper tape into your virtual changer. Let me express this in a more general way:

When amrecover prompts for the tape it needs to restore the files you requested, you have to "load" the tape it requests. The recommended way to do this is to use amtape. The options that make sense in this context are:

# amtape
Usage: amtape <conf> <command>
       Valid commands are:
               [...]
               slot <slot #>        load tape from slot <slot #>
               [...]
               label <label>        find and load labeled tape
               [...] 

If you know which slot contains the requested tape (for example, if you have tape daily01 in slot 1, tape daily02 in slot 2, and so on) you may use the first option. If you just know the label of the tape you need, use the second option.

To continue the example above:

amtape woo slot 14 	        # option 1 OR
amtape woo label B3_14 	# option 2 

amtape will return something like:

amtape: label B3_14 is now loaded.  

After this you can return to your amrecover-session and continue restoring your files.

Please be aware of the fact reported by JC Simonetti: " I have never never used the "settape" command of amrecover [with chg-disk] since there's some problems with it (tape not loaded correctly, or impossible to change from tape to tape when restoring data shared accross multiple tapes...) "

!!NEW!!

Since Amanda 2.4.3 you can also let amrecover use the complete changer instead of only the currently loaded tape. There is then no need to open a second window to load the correct tape.

For this add in amanda.conf:

amrecover_do_fsf  yes
amrecover_check_label  yes
amrecover_changer  "changer"

With the last line we give the changer a name, which we can use instead of the tape device in amrecover, when starting the command:

# /usr/local/amanda/sbin/amrecover woo -d changer

or from inside amrecover with the settape command, or even at the tape prompt:

Extracting files using tape drive file:/BACKUP2/slots/ on host
backupserver.local. Load tape B3_14 now
Continue [?/Y/n/s/t]? t
New tape device [?]: changer
Using tape "changer" from server backupserver.local.
Continue [?/Y/n/s/t]? y