Amanda::DB::Catalog2 - access to the Amanda catalog: where is that dump?
use Amanda::DB::Catalog2;
$catalog = Amanda::DB::Catalog2->new($catalog_conf);
$catalog->validate();
$image = $catalog->add_image($host_name, $disk_name, $device, $dump_timestamp,
$level, $based_on_timestamp);
$image = $catalog->find_image($host_name, $disk_name, $device, $dump_timestamp,
$level);
$image = $catalog->get_image($image_id);
$volume = $catalog->add_volume($pool, $label, $write_timestamp, $storage, $meta, $barcode, $block_size);
$volume = $catalog->find_volume($pool, $label);
$volume = $catalog->get_volume($copy_id);
$copy = $image->add_copy($storage_name, $write_timestamp, $retention_days, $retention_full, $retention_recover);
$copy = $catalog->get_copy($copy_id);
$image->finish_image($orig_kb, $dump_status, $nb_files, $nb_directories, $native_crc, $client_crc, $server_crc);
$part = $copy->add_part($volume, $part_offset, $part_size, $filenum,
$part_num, $part_status);
$copy->finish_copy($nb_parts, $kb, $byte, $copy_status, $server_crc, $copy_message);
$catalog->remove_volume($pool, $label);
$catalog->quit();
The Amanda catalog is modeled as a set of dumps, each composed of parts. A dump is a complete bytestream received from an application, and is uniquely identified by an image_id. Each dump can be written to multiple destinations, and each destination has a different copy_id. Each attempt at a dump gets a different image_id, so it is possible to have two image_ids with the same combination of host_name, disk_name, dump_timestamp and level. A dump may be partial, or even a complete failure.
A part corresponds to a single file on a volume, containing a portion of the data for a dump. A part, then, is completely specified by a volume label and a file number (filenum). Each part has, among other things, a part number (partnum) which gives its relative position within the dump. The bytestream for a dump is recovered by concatenating all of the successful (status = OK) parts matching the dump.
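The recovery rule above can be sketched in plain Perl. This is only an illustration under stated assumptions: the records below are hypothetical hashrefs, the data field is invented for the example, and reassemble is not a function of this module.

```perl
use strict;
use warnings;

# Hypothetical part records. 'data' is an invented field standing in for
# the bytes stored in the part's file; it is not part of the catalog.
my @parts = (
    { partnum => 2, status => 'OK',      data => 'BBB' },
    { partnum => 1, status => 'PARTIAL', data => 'Ax'  },   # failed attempt
    { partnum => 1, status => 'OK',      data => 'AAA' },
    { partnum => 3, status => 'OK',      data => 'CC'  },
);

# Keep only successful parts, order them by part number, and concatenate.
sub reassemble {
    my @ok = sort { $a->{partnum} <=> $b->{partnum} }
             grep { $_->{status} eq 'OK' } @_;
    return join '', map { $_->{data} } @ok;
}

my $stream = reassemble(@parts);
print "$stream\n";
```

The PARTIAL copy of part 1 is skipped, so only the three OK parts contribute to the result.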
Files in the holding disk are considered part of the catalog, and are represented as single-part dumps (holding-disk chunking is ignored, as it is distinct from split parts).
The dump table contains one row per dump. It has the following columns:
(string) -- timestamp of the run in which the dump was created
(string) -- timestamp of the run in which the part was written to this volume, or "0" for dumps in the holding disk.
(string) -- dump host_name
(string) -- dump disk_name
(integer) -- dump level
(string) -- The status of the dump - "OK", "PARTIAL", or "FAIL". If a disk failed to dump at all, then it is not part of the catalog and thus will not have an associated dump row.
(string) -- reason for PARTIAL or FAIL status
(integer) -- number of successful parts in this dump
(integer) -- size (in bytes) of the dump on disk, 0 if the size is not known.
(integer) -- size (in kb) of the dump on disk
(integer) -- size (in kb) of the complete dump (before compression or encryption); undef if not available
(integer) -- time (in seconds) spent writing this part
(arrayref) -- array of parts, indexed by partnum (so $parts->[0] is always undef). When multiple partial parts are available, the choice of the partial that is included in this array is undefined.
(arrayref) -- list of parts, all parts are included.
A dump is represented as a hashref with these keys.
The write_timestamp gives the time of the amanda run in which the part was written to this volume. The write_timestamp may differ from the dump_timestamp if, for example, amflush wrote the part to tape after the initial dump.
The parts table contains one row per part, and has the following columns:
(string) -- volume label (not present for holding files)
(integer) -- file on that volume (not present for holding files)
(string) -- fully-qualified pathname of the holding file (not present for on-media dumps)
(object ref) -- a reference to the dump containing this part
(string) -- The status of the part - "OK", "PARTIAL", or "FAILED".
(integer) -- part number of a split part (1-based)
(integer) -- size (in kb) of this part
A part is represented as a hashref with these keys. The label and filenum serve as a primary key.
Note that parts' dump and dumps' parts create a reference loop. This is broken by making the parts array's contents weak references in get_dumps, and the dump reference weak in get_parts.
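As a minimal sketch of how a weak reference breaks such a loop, the following uses Scalar::Util from the Perl core. The records are hypothetical, and the sketch weakens only the part's back-reference to its dump; it does not reproduce what get_dumps or get_parts do internally.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak);

my $part = { partnum => 1 };
{
    # The dump holds a strong reference to the part (index 0 is unused).
    my $dump = { host_name => 'client1', parts => [ undef, $part ] };

    # The part points back at the dump, completing the loop ...
    $part->{dump} = $dump;
    # ... so weaken the back-reference: it no longer keeps the dump alive.
    weaken($part->{dump});

    print "back-reference is weak\n" if isweak($part->{dump});
    print "host: $part->{dump}{host_name}\n";   # usable while $dump lives
}

# $dump has gone out of scope, so Perl freed it and reset the weak
# reference to undef instead of leaking the pair.
print defined $part->{dump} ? "dump still alive\n" : "dump freed\n";
```

Without the weaken call, the dump and the part would keep each other alive after the block ends.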
All timestamps used in this module are full-length, in the format YYYYMMDDHHMMSS. If the underlying data contains only datestamps, they are zero-extended into timestamps: YYYYMMDD000000. A dump_timestamp always corresponds to the initiation of the original dump run, while write_timestamp gives the time the file was written to the volume. When parts are migrated from volume to volume (e.g., by amvault), the dump_timestamp does not change.
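The zero-extension rule can be illustrated with a short helper. The name to_timestamp is hypothetical and not part of this module; the sketch assumes the input is either an 8-digit datestamp or a 14-digit timestamp.

```perl
use strict;
use warnings;

# Hypothetical helper: return a full-length YYYYMMDDHHMMSS timestamp.
sub to_timestamp {
    my ($stamp) = @_;
    return $stamp            if $stamp =~ /^\d{14}$/;   # already full-length
    return $stamp . '000000' if $stamp =~ /^\d{8}$/;    # datestamp: zero-extend
    die "unrecognized stamp '$stamp'\n";
}

print to_timestamp('20190319'), "\n";
print to_timestamp('20190319070815'), "\n";
```

Because every value has the same width after normalization, plain string comparison orders the timestamps chronologically.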
In Amanda, the tuple (host_name, disk_name, level, dump_timestamp) serves as a unique identifier for a dump bytestream, but because the bytestream may appear several times in the catalog (due to vaulting), the additional write_timestamp is required to identify a particular on-storage instance of a dump. Note that the part sizes may differ between instances, so it is not valid to concatenate parts from different dump instances.
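To make the distinction concrete, here is a hedged sketch that groups hypothetical part records by on-storage instance. The instance_key helper and the '|' separator are assumptions for illustration only; a real disk name could contain that character.

```perl
use strict;
use warnings;

# Hypothetical part records: the same dump bytestream written twice,
# once by the original run and once by a later vaulting run.
my %dump = (host_name => 'client1', disk_name => '/home', level => 0,
            dump_timestamp => '20190318010000');
my @parts = (
    { %dump, write_timestamp => '20190318010000', partnum => 1 },
    { %dump, write_timestamp => '20190318010000', partnum => 2 },
    { %dump, write_timestamp => '20190319070815', partnum => 1 },
);

# An instance is the dump tuple plus the write_timestamp.
sub instance_key {
    my ($p) = @_;
    return join '|', @{$p}{qw(host_name disk_name level
                              dump_timestamp write_timestamp)};
}

my %instances;
push @{ $instances{ instance_key($_) } }, $_ for @parts;

for my $key (sort keys %instances) {
    printf "%s: %d part(s)\n", $key, scalar @{ $instances{$key} };
}
```

Parts are only ever concatenated within one of these groups, never across them.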
The following functions provide summary data based on the contents of the catalog.
Get a list of all write timestamps, sorted in chronological order.
Return the most recent write timestamp.
Return the timestamp of the most recent dump of the given type or types. The available types are given below for get_run_type.
Return a list of labels for volumes written at the given timestamp.
Return the type of run made at the given timestamp. The result is one of amvault, amdump, amflush, or the default, unknown.
This function returns a sequence of parts. $dumps is an array ref of the dumps to work with. Values in %parameters restrict the set of parts that are returned. The hash can have any of the following keys:
restrict to parts written at this timestamp
(arrayref) restrict to parts written at any of these timestamps (note that holding-disk files have no write_timestamp, so this option and the previous will omit them)
restrict to parts with exactly this timestamp
(arrayref) restrict to parts with any of these timestamps
restrict to parts with timestamps matching this expression
if true, only return dumps on holding disk. If false, omit dumps on holding disk.
restrict to parts with exactly this host_name
(arrayref) restrict to parts with any of these host_names
restrict to parts with host_names matching this expression
restrict to parts with exactly this disk_name
(arrayref) restrict to parts with any of these disk_names
restrict to parts with disk_names matching this expression
restrict to parts with exactly this label
(arrayref) restrict to parts with any of these labels
restrict to parts with exactly this level
(arrayref) restrict to parts with any of these levels
restrict to parts with this status
(arrayref of dumpspecs) restrict to parts matching one or more of these dumpspecs
Match expressions are described in the amanda(8) manual page.
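The way such restrictions combine can be sketched with a small stand-alone filter. Only the status, label and labels keys are used, since those names appear in this document; filter_parts itself is a hypothetical helper operating on hypothetical records, not this module's implementation.

```perl
use strict;
use warnings;

# Hypothetical filter: every restriction that is present must match.
sub filter_parts {
    my ($parts, %params) = @_;

    # Turn the list of acceptable labels into a lookup table.
    my %label_ok;
    %label_ok = map { $_ => 1 } @{ $params{labels} } if $params{labels};

    return grep {
        my $p = $_;
        (!defined $params{status} || $p->{status} eq $params{status})
        && (!defined $params{label}
            || (defined $p->{label} && $p->{label} eq $params{label}))
        && (!$params{labels}
            || (defined $p->{label} && $label_ok{ $p->{label} }))
    } @$parts;
}

my @parts = (
    { label => 'TAPE-001', filenum => 1, status => 'OK'      },
    { label => 'TAPE-001', filenum => 2, status => 'PARTIAL' },
    { label => 'TAPE-002', filenum => 1, status => 'OK'      },
);

my @hits = filter_parts(\@parts, status => 'OK',
                        labels => [ 'TAPE-002', 'TAPE-003' ]);
print scalar(@hits), " matching part(s)\n";
```

Omitting a key leaves that attribute unrestricted, so calling the filter with no parameters returns every part.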
Given a list of parts, this function sorts that list by the requested keys. The following keys are available:
Note that this sorts labels lexically, not necessarily in the order they were used!
Keys are processed from left to right: if two parts have the same value for $key1, then $key2 is examined, and so on. Key names may be prefixed by a dash (-) to reverse the order.
Note that some of these keys are dump keys; the function will automatically access those values via the dump attribute.
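The left-to-right key processing and the dash prefix can be sketched as follows. sort_by_keys is a hypothetical stand-in written for this example, and it assumes that filenum and partnum compare numerically while every other key compares as a string.

```perl
use strict;
use warnings;

# Assumption for this sketch: these keys hold numbers.
my %numeric = map { $_ => 1 } qw(filenum partnum);

# Hypothetical multi-key sort: later keys only break ties in earlier ones.
sub sort_by_keys {
    my ($keys, @items) = @_;
    return sort {
        my $r = 0;
        for my $key (@$keys) {
            my ($rev, $name) = $key =~ /^(-?)(.*)$/;   # split off a leading dash
            $r = $numeric{$name} ? $a->{$name} <=> $b->{$name}
                                 : $a->{$name} cmp $b->{$name};
            $r = -$r if $rev;                          # dash reverses this key
            last if $r;                                # decided: stop here
        }
        $r;
    } @items;
}

my @parts = (
    { label => 'TAPE-002', filenum => 1 },
    { label => 'TAPE-001', filenum => 2 },
    { label => 'TAPE-001', filenum => 5 },
);

my @sorted = sort_by_keys([ 'label', '-filenum' ], @parts);
print join(' ', map { "$_->{label}:$_->{filenum}" } @sorted), "\n";
```

Labels are compared lexically here, which mirrors the caveat above that label order need not match the order in which volumes were used.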
This function returns a sequence of dumps. Values in %parameters restrict the set of dumps that are returned. The same keys as are used for get_parts are available here, with the exception of label, labels and holding. In this case, the status parameter applies to the dump status, not the status of its constituent parts. If the part is set, get_parts is executed with the resulting dumps, setting the parts and allparts keys.
Like sort_parts, this sorts a sequence of dumps generated by get_dumps. The same keys are available, with the exception of label, filenum, and partnum.
version INTEGER NOT NULL
This table has one row and one column: an integer holding the version of the database.
host_id INTEGER NOT NULL PRIMARY KEY AUTO_INCREMENT,
host_name CHAR(256) NOT NULL UNIQUE

disk_id INTEGER NOT NULL PRIMARY KEY AUTO_INCREMENT,
host_id INTEGER NOT NULL,
disk_name CHAR(256) NOT NULL,
device CHAR(256) NOT NULL,
UNIQUE (host_id, disk_name),
FOREIGN KEY (host_id) REFERENCES host (host_id)

image_id INTEGER NOT NULL PRIMARY KEY AUTO_INCREMENT,
disk_id INTEGER NOT NULL,
dump_timestamp CHAR(14) NOT NULL,
level INTEGER NOT NULL,
orig_kb INTEGER,
dump_status VARCHAR(1024),
nb_files INTEGER,
nb_directories INTEGER,
based_on_timestamp CHAR(14),
FOREIGN KEY (disk_id) REFERENCES disks (disk_id)

copy_id INTEGER NOT NULL PRIMARY KEY AUTO_INCREMENT,
image_id INTEGER NOT NULL,
write_timestamp CHAR(14) NOT NULL,
nb_parts INTEGER NOT NULL,
kb INTEGER,
bytes INTEGER,
copy_status VARCHAR(1024) NOT NULL,
FOREIGN KEY (image_id) REFERENCES image (image_id)

volume_id INTEGER NOT NULL PRIMARY KEY AUTO_INCREMENT,
label CHAR(256) NOT NULL UNIQUE,
write_timestamp CHAR(14) NOT NULL
The label is a full path if it is a holding disk (first character must be '/')
part_id INTEGER NOT NULL PRIMARY KEY AUTO_INCREMENT, copy_id INTEGER NOT NULL, volume_id INTEGER NOT NULL, part_offset INTEGER NOT NULL, part_size INTEGER NOT NULL, filenum INTEGER NOT NULL, part_num INTEGER NOT NULL, part_status VARCHAR(1024) NOT NULL, FOREIGN KEY (copy_id) REFERENCES copy (copy_id), FOREIGN KEY (volume_id) REFERENCES volume (volume_id)
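The foreign keys above chain a part to its copy, image, disk and host. The following sketch walks that chain over in-memory hashes keyed by id. The hash names, the sample rows and describe_part are all hypothetical; a real lookup would query the database instead.

```perl
use strict;
use warnings;

# Hypothetical rows, keyed by primary key, mirroring the columns above.
my %host   = (1   => { host_name => 'client1' });
my %disk   = (7   => { host_id => 1, disk_name => '/home' });
my %image  = (42  => { disk_id => 7, dump_timestamp => '20190318010000',
                       level => 0 });
my %copy   = (100 => { image_id => 42, write_timestamp => '20190318010000' });
my %volume = (3   => { label => 'TAPE-001' });
my %part   = (500 => { copy_id => 100, volume_id => 3,
                       filenum => 4, part_num => 1 });

# Follow part -> copy -> image -> disk -> host, plus part -> volume.
sub describe_part {
    my ($part_id) = @_;
    my $p = $part{$part_id} or die "no such part $part_id\n";
    my $c = $copy{ $p->{copy_id} };
    my $i = $image{ $c->{image_id} };
    my $d = $disk{ $i->{disk_id} };
    my $h = $host{ $d->{host_id} };
    return sprintf '%s:%s level %d, part %d at %s file %d',
        $h->{host_name}, $d->{disk_name}, $i->{level},
        $p->{part_num}, $volume{ $p->{volume_id} }{label}, $p->{filenum};
}

print describe_part(500), "\n";
```

Each step uses exactly one foreign key column, which is why a part row needs to store only copy_id and volume_id.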
This page was automatically generated Tue Mar 19 07:08:15 2019 from the Amanda source tree, and documents the most recent development version of Amanda. For documentation specific to the version of Amanda on your system, use the 'perldoc' command.