How To:Backup to Amazon S3

From wiki.zmanda.com

Revision as of 22:20, 12 November 2008

This article is a part of the How Tos collection.

The S3 Device performs backups to Amazon's S3, a private, affordable, and reliable off-site data storage service. If you have a relatively small amount of very important data, then this service may be ideal. WAN-based backups by definition traverse a relatively low-bandwidth link, so large backup sets may not be practical.

Note that the S3 device is built on the Device API, so it is only available in Amanda 2.6.0 and later.

Before You Start

Familiarize yourself with S3 at http://amazon.com/s3, and sign up for the service. You will receive an access (public) key and a secret key. In this document, we will use the access key '1ATXQ3HHA59CYF1CVS02' and the secret key '09dfma0928m0sd9f8m-adf/asdf098asdf'.

Estimate how much data you'll back up per run (the tapetype length) and how many tapes you want (tapecycle). Calculate the cost of transferring and storing that much data, to avoid any surprises. The example below assumes a tapecycle of 10.
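
As a rough sketch of that cost calculation, the shell function below multiplies the per-run size by the tapecycle (storage) and by the number of runs per month (transfer). The function name and both prices are made up for illustration and are not Amazon's actual rates; substitute the current prices before relying on the result. It also assumes the worst case where every virtual tape is full.

```shell
# Rough monthly S3 cost estimate for an Amanda tapecycle.
# NOTE: estimate_s3_cost is a hypothetical helper, and the prices passed
# to it below are placeholders, not real Amazon rates.
estimate_s3_cost() {
    # $1 = GB per run (tapetype length)   $2 = tapecycle
    # $3 = runs per month                 $4 = storage price per GB-month
    # $5 = upload price per GB
    awk -v gb="$1" -v tapes="$2" -v runs="$3" -v store="$4" -v xfer="$5" 'BEGIN {
        storage  = gb * tapes * store   # worst case: every tape is full
        transfer = gb * runs * xfer     # data uploaded during the month
        printf "storage=%.2f transfer=%.2f total=%.2f\n", storage, transfer, storage + transfer
    }'
}

# Example: 5 GB per run, tapecycle of 10, 30 runs per month,
# placeholder prices of 0.15 per GB-month stored and 0.10 per GB uploaded.
estimate_s3_cost 5 10 30 0.15 0.10
# prints: storage=7.50 transfer=15.00 total=22.50
```

Restores also incur download charges, which this sketch leaves out.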

Configuration

I recommend starting with the template.d/amanda-S3.conf you can find shipped with Amanda 2.6.0. Then, the following will help you configure a changer wrapping S3 with multiple virtual tapes:

amanda.conf:

tapedev "null:" # (device should come from the changer)
device_property "S3_ACCESS_KEY" "1ATXQ3HHA59CYF1CVS02"
device_property "S3_SECRET_KEY" "09dfma0928m0sd9f8m-adf/asdf098asdf"
tpchanger "chg-multi"
changerfile "changer.conf"

changer.conf:

multieject 0
gravity 0
needeject 0
ejectdelay 0
statefile /var/amanda/changer-status
firstslot 1
lastslot 10

slot  1  s3:1ATXQ3HHA59CYF1CVS02-backups/slot-01
slot  2  s3:1ATXQ3HHA59CYF1CVS02-backups/slot-02
slot  3  s3:1ATXQ3HHA59CYF1CVS02-backups/slot-03
slot  4  s3:1ATXQ3HHA59CYF1CVS02-backups/slot-04
slot  5  s3:1ATXQ3HHA59CYF1CVS02-backups/slot-05
slot  6  s3:1ATXQ3HHA59CYF1CVS02-backups/slot-06
slot  7  s3:1ATXQ3HHA59CYF1CVS02-backups/slot-07
slot  8  s3:1ATXQ3HHA59CYF1CVS02-backups/slot-08
slot  9  s3:1ATXQ3HHA59CYF1CVS02-backups/slot-09
slot  10 s3:1ATXQ3HHA59CYF1CVS02-backups/slot-10

Note that we're using the bucket 1ATXQ3HHA59CYF1CVS02-backups, which has our access key as a prefix. Bucket names are shared across all S3 users, so this helps to avoid namespace collisions. Also, the above configuration will create files on S3 like: s3:1ATXQ3HHA59CYF1CVS02-backups/slot-01special-tapestart, s3:1ATXQ3HHA59CYF1CVS02-backups/slot-01f00000001-filestart, ..., s3:1ATXQ3HHA59CYF1CVS02-backups/slot-02special-tapestart, ...

I prefer keeping the files for each virtual tape sorted into one directory per "tape":

slot  1  s3:1ATXQ3HHA59CYF1CVS02-backups/DailySet1/0001/
slot  2  s3:1ATXQ3HHA59CYF1CVS02-backups/DailySet1/0002/
slot  3  s3:1ATXQ3HHA59CYF1CVS02-backups/DailySet1/0003/
...
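
Rather than typing the slot lines by hand, a small loop can generate them. This is only a sketch: gen_slots is a hypothetical helper, not part of Amanda, and it assumes the one-directory-per-tape layout shown above. Paste its output into changer.conf and adjust firstslot and lastslot to match.

```shell
# Print changer.conf slot lines for the directory-per-tape layout.
# NOTE: gen_slots is a hypothetical helper, not an Amanda command.
gen_slots() {
    # $1 = bucket name   $2 = path prefix (e.g. the config name)
    # $3 = number of slots
    i=1
    while [ "$i" -le "$3" ]; do
        printf 'slot  %-2d s3:%s/%s/%04d/\n' "$i" "$1" "$2" "$i"
        i=$((i + 1))
    done
}

# Ten slots, matching the tapecycle of 10 used in this document.
gen_slots 1ATXQ3HHA59CYF1CVS02-backups DailySet1 10
```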

If you enable label_new_tapes, then there's nothing more to do -- the S3 device will create the bucket and amdump will label the first tape during the first run. Otherwise, proceed to label the tapes as usual:

amlabel MYBACKUPS MYBACKUPS01 slot 1
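
To label every slot, the same amlabel command can be repeated with a loop. The sketch below only prints the commands so they can be reviewed first. label_cmds is a hypothetical helper, and it assumes that labels of the form MYBACKUPS01 through MYBACKUPS10 match the labelstr in your amanda.conf.

```shell
# Print one amlabel command per slot (dry run).
# NOTE: label_cmds is a hypothetical helper; the label format is an
# assumption and must match the labelstr in amanda.conf.
label_cmds() {
    # $1 = Amanda config name   $2 = label prefix   $3 = number of slots
    i=1
    while [ "$i" -le "$3" ]; do
        printf 'amlabel %s %s%02d slot %d\n' "$1" "$2" "$i" "$i"
        i=$((i + 1))
    done
}

# Review the generated commands first:
label_cmds MYBACKUPS MYBACKUPS 10
# Then, as the Amanda user, run them:
#   label_cmds MYBACKUPS MYBACKUPS 10 | sh
```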

You can check a tape's status with:

amdevcheck MYBACKUPS s3:1ATXQ3HHA59CYF1CVS02-backups/slot-10
SUCCESS

Notes

Time Sync

Proper S3 authentication depends on your system's clock being fairly accurate. If your clock tends to drift, you may need to install and configure an NTP client, or invoke rdate or the equivalent against a known-good machine.
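
A quick way to see whether the clock is close enough is to compare it against a reference time. The sketch below is an illustration only: check_skew is a hypothetical helper, the 900-second limit is an assumption based on the roughly 15-minute tolerance S3 is generally documented to allow, and the reference host in the usage comment is made up.

```shell
# Compare two epoch timestamps and warn if they differ by too much.
# NOTE: check_skew is a hypothetical helper; the limit you pass in is
# your own choice (900 seconds is used below as an assumption).
check_skew() {
    # $1 = local epoch seconds   $2 = reference epoch seconds
    # $3 = maximum allowed skew in seconds
    diff=$(( $1 - $2 ))
    if [ "$diff" -lt 0 ]; then
        diff=$(( 0 - diff ))
    fi
    if [ "$diff" -le "$3" ]; then
        echo "OK: clock skew ${diff}s"
    else
        echo "WARNING: clock skew ${diff}s exceeds ${3}s"
        return 1
    fi
}

# Fixed example values: the local clock is 30 seconds behind the reference.
check_skew 1226528400 1226528430 900
# prints: OK: clock skew 30s

# Real use, assuming date supports +%s and good-host is a known-good machine:
#   check_skew "$(date +%s)" "$(ssh good-host date +%s)" 900
```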