Amanda::DB::Catalog - access to the Amanda catalog: where is that dump?
    use Amanda::DB::Catalog;

    # get all write timestamps on record
    my @timestamps = Amanda::DB::Catalog::get_write_timestamps();

    # loop over those timestamps, printing dump info for each successful part
    for my $timestamp (@timestamps) {
        my @parts = Amanda::DB::Catalog::get_parts(
            write_timestamp => $timestamp,
            status          => 'OK',
        );
        print "$timestamp:\n";
        for my $part (@parts) {
            # hostname, diskname and level are dump keys, reached via the part's dump
            my $dump = $part->{dump};
            print "  ", $dump->{hostname}, ":", $dump->{diskname},
                  " level ", $dump->{level}, "\n";
        }
    }
The Amanda catalog is modeled as a set of dumps comprised of parts. A dump is
a complete bytestream received from an application, and is uniquely identified
by the combination of hostname, diskname, dump_timestamp, level, and
write_timestamp. A dump may be partial, or even a complete failure.
A part corresponds to a single file on a volume, containing a portion of the
data for a dump. A part, then, is completely specified by a volume label and a
file number (filenum). Each part has, among other things, a part number
(partnum) which gives its relative position within the dump. The bytestream
for a dump is recovered by concatenating all of the successful (status = OK)
parts matching the dump.
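The reassembly rule above can be sketched on hand-made data. This is only an illustration under assumptions: the part hashrefs below are made up rather than read from a live catalog, and no Amanda function is called.

```perl
use strict;
use warnings;

# Made-up parts for a single dump, deliberately out of order and
# including one PARTIAL attempt at part 1.
my @parts = (
    { label => 'TAPE-001', filenum => 4, partnum => 2, status => 'OK' },
    { label => 'TAPE-001', filenum => 2, partnum => 1, status => 'PARTIAL' },
    { label => 'TAPE-001', filenum => 3, partnum => 1, status => 'OK' },
);

# Keep only successful parts, then order them by their position in the dump.
my @ordered = sort { $a->{partnum} <=> $b->{partnum} }
              grep { $_->{status} eq 'OK' } @parts;

# The files to read, in the order their contents would be concatenated.
my @files = map { "$_->{label}:$_->{filenum}" } @ordered;
print join(", ", @files), "\n";
```

The PARTIAL copy of part 1 is skipped, so only files 3 and 4 on the volume contribute to the recovered bytestream.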
Files in the holding disk are considered part of the catalog, and are represented as single-part dumps (holding-disk chunking is ignored, as it is distinct from split parts).
The dump table contains one row per dump. It has the following columns:
dump_timestamp (string) -- timestamp of the run in which the dump was created
write_timestamp (string) -- timestamp of the run in which the part was written
to this volume, or "00000000000000" for dumps in the holding disk.
hostname (string) -- dump hostname
diskname (string) -- dump diskname
level (integer) -- dump level
status (string) -- "OK", "PARTIAL", or "FAIL"
message (string) -- reason for PARTIAL or FAIL status
nparts (integer) -- number of successful parts in this dump
kb (integer) -- size (in kb) of this part
orig_kb (integer) -- size (in kb) of the complete dump (uncompressed and unencrypted).
sec (integer) -- time (in seconds) spent writing this part
parts (arrayref) -- array of parts, indexed by partnum (so $parts->[0] is
always undef). When multiple partial parts are available, the choice of the
partial that is included in this array is undefined.
A dump is represented as a hashref with these keys.
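For orientation, here is a sketch of what such a hashref might look like. All values are made up, and the key names are taken from the Amanda source as best understood here, so check them against your installed version before relying on them.

```perl
use strict;
use warnings;

# Made-up dump record; values are illustrative only.
my $dump = {
    dump_timestamp  => '20131119200535',
    write_timestamp => '20131119200535',
    hostname        => 'client1',
    diskname        => '/home',
    level           => 0,
    status          => 'OK',
    message         => '',
    nparts          => 2,
    kb              => 2048,
    orig_kb         => 4096,
    sec             => 12,
    parts           => [
        undef,    # slot 0 is unused because partnum is 1-based
        { partnum => 1, label => 'TAPE-001', filenum => 1, status => 'OK' },
        { partnum => 2, label => 'TAPE-001', filenum => 2, status => 'OK' },
    ],
};

# Walk the parts by partnum, starting at 1 and skipping empty slots.
my @seen;
for my $partnum (1 .. $#{ $dump->{parts} }) {
    my $part = $dump->{parts}[$partnum] or next;
    push @seen, "$partnum=$part->{label}:$part->{filenum}";
}
print join(" ", @seen), "\n";
```

Starting the loop at 1 avoids the always-undefined slot 0 of the parts array.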
The write_timestamp gives the time of the amanda run in which the part was
written to this volume. The write_timestamp may differ from the
dump_timestamp if, for example, amflush wrote the part to tape after the
initial dump.
The parts table contains one row per part, and has the following columns:
label (string) -- volume label (not present for holding files)
filenum (integer) -- file on that volume (not present for holding files)
holding_file (string) -- fully-qualified pathname of the holding file (not present for on-media dumps)
dump (object ref) -- a reference to the dump containing this part
status (string) -- "OK", "PARTIAL" or some other descriptor
partnum (integer) -- part number of a split part (1-based)
kb (integer) -- size (in kb) of this part
sec (integer) -- time (in seconds) spent writing this part
A part is represented as a hashref with these keys. The label and filenum
serve as a primary key.
Note that parts' dump and dumps' parts create a reference loop. This is
broken by making the parts array's contents weak references in get_dumps,
and the dump reference weak in get_parts.
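The practical effect of the weak references can be illustrated with Scalar::Util. The sketch below builds a made-up dump and part by hand and weakens the part-to-dump link, as get_parts is described as doing; it does not call the catalog itself.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak);

# Made-up dump and part that point at each other.
my $dump = { hostname => 'client1', diskname => '/home', parts => [undef] };
my $part = { partnum => 1, status => 'OK', dump => $dump };
$dump->{parts}[1] = $part;

# Weaken the part-to-dump link so the pair no longer forms a strong loop.
weaken($part->{dump});

my $is_weak     = isweak($part->{dump}) ? 1 : 0;
my $host_before = $part->{dump}{hostname};

# Dropping the last strong reference to the dump frees it, and the weak
# link inside the part becomes undef.
undef $dump;
my $gone = defined $part->{dump} ? 0 : 1;

print "weak=$is_weak host=$host_before gone=$gone\n";
```

A weak link does not keep its target alive, so code that follows one should check that the reference is still defined.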
All timestamps used in this module are full-length, in the format
YYYYMMDDHHMMSS. If the underlying data contains only datestamps, they are
zero-extended into timestamps: YYYYMMDD000000. A dump_timestamp always
corresponds to the initiation of the original dump run, while
write_timestamp gives the time the file was written to the volume. When
parts are migrated from volume to volume (e.g., by amvault), the
dump_timestamp does not change.
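The zero-extension rule is simple enough to sketch directly. The helper name to_timestamp is hypothetical and is not part of this module.

```perl
use strict;
use warnings;

# Hypothetical helper: turn an 8-digit datestamp into a 14-digit timestamp,
# and pass full-length timestamps through unchanged.
sub to_timestamp {
    my ($stamp) = @_;
    return $stamp            if $stamp =~ /\A\d{14}\z/;
    return $stamp . '000000' if $stamp =~ /\A\d{8}\z/;
    die "unrecognized stamp '$stamp'\n";
}

my $from_date = to_timestamp('20131119');
my $from_full = to_timestamp('20131119200535');
print "$from_date $from_full\n";
```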
In Amanda, the tuple (hostname, diskname, level, dump_timestamp)
serves as a unique identifier for a dump bytestream, but because the bytestream
may appear several times in the catalog (due to vaulting) the additional
write_timestamp is required to identify a particular on-storage instance of
a dump. Note that the part sizes may differ between instances, so it is not
valid to concatenate parts from different dump instances.
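One way to keep instances apart is to group parts by the full five-field key before reassembling anything. The sketch below uses made-up data and a hypothetical helper named instance_key; it is not how the module itself is implemented.

```perl
use strict;
use warnings;

# Hypothetical helper: build a key that identifies one on-storage instance.
sub instance_key {
    my ($dump) = @_;
    return join '|',
        @{$dump}{qw(hostname diskname level dump_timestamp write_timestamp)};
}

# Made-up data: the same bytestream written by the original run and again
# by a later vaulting run, split into a different number of parts.
my %common = (
    hostname => 'client1', diskname => '/home', level => 0,
    dump_timestamp => '20131101000000',
);
my $original = { %common, write_timestamp => '20131101000000' };
my $vaulted  = { %common, write_timestamp => '20131105000000' };

my @parts = (
    { dump => $original, partnum => 1, label => 'TAPE-001',  filenum => 1 },
    { dump => $original, partnum => 2, label => 'TAPE-001',  filenum => 2 },
    { dump => $vaulted,  partnum => 1, label => 'VAULT-001', filenum => 1 },
);

# Group parts by instance so that parts are never mixed across instances.
my %by_instance;
push @{ $by_instance{ instance_key($_->{dump}) } }, $_ for @parts;

for my $key (sort keys %by_instance) {
    printf "%s: %d part(s)\n", $key, scalar @{ $by_instance{$key} };
}
```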
The following functions provide summary data based on the contents of the catalog.
get_write_timestamps()
Get a list of all write timestamps, sorted in chronological order.
get_latest_write_timestamp()
Return the most recent write timestamp.
get_labels_written_at_timestamp($ts)
Return a list of labels for volumes written at the given timestamp.
get_parts(%parameters)
This function returns a sequence of parts. Values in %parameters restrict
the set of parts that are returned. The hash can have any of the following
keys:
write_timestamp -- restrict to parts written at this timestamp
write_timestamps -- (arrayref) restrict to parts written at any of these
timestamps (note that holding-disk files have no write_timestamp, so this
option and the previous will omit them)
dump_timestamp -- restrict to parts with exactly this timestamp
dump_timestamps -- (arrayref) restrict to parts with any of these timestamps
dump_timestamp_match -- restrict to parts with timestamps matching this expression
holding -- if true, only return dumps on holding disk. If false, omit dumps on holding disk.
hostname -- restrict to parts with exactly this hostname
hostnames -- (arrayref) restrict to parts with any of these hostnames
hostname_match -- restrict to parts with hostnames matching this expression
diskname -- restrict to parts with exactly this diskname
disknames -- (arrayref) restrict to parts with any of these disknames
diskname_match -- restrict to parts with disknames matching this expression
label -- restrict to parts with exactly this label
labels -- (arrayref) restrict to parts with any of these labels
level -- restrict to parts with exactly this level
levels -- (arrayref) restrict to parts with any of these levels
status -- restrict to parts with this status
dumpspecs -- (arrayref of dumpspecs) restrict to parts matching one or more of these dumpspecs
Match expressions are described in the amanda(8) manual page.
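sort_parts([ $key1, $key2, .. ], @parts)
Given a list of parts, this function sorts that list by the requested keys. The following keys are available:
hostname
diskname
write_timestamp
dump_timestamp
level
filenum
label -- note that this sorts labels lexically, not necessarily in the order they were used!
partnum
nparts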
Keys are processed from left to right: if two dumps have the same value for
$key1, then $key2 is examined, and so on. Key names may be prefixed by a
dash (-) to reverse the order.
Note that some of these keys are dump keys; the function will automatically
access those values via the dump attribute.
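The left-to-right key processing and the dash prefix can be sketched with a small stand-alone comparator. This is a simplified illustration, not the module's implementation: sort_by_keys is a hypothetical name, the records are flat made-up hashes, and unlike sort_parts it does not look dump keys up through the dump attribute.

```perl
use strict;
use warnings;

# Hypothetical multi-key sorter: keys are tried in order, and a leading
# dash reverses the comparison for that key.
sub sort_by_keys {
    my ($keys, @items) = @_;
    my %numeric = map { $_ => 1 } qw(level filenum partnum nparts);
    return sort {
        my $res = 0;
        for my $key (@$keys) {
            my ($rev, $k) = $key =~ /^(-?)(.*)$/;
            $res = $numeric{$k} ? $a->{$k} <=> $b->{$k}
                                : $a->{$k} cmp $b->{$k};
            $res = -$res if $rev;
            last if $res;    # later keys only break ties
        }
        $res;
    } @items;
}

# Made-up records.
my @items = (
    { hostname => 'b', level => 0 },
    { hostname => 'a', level => 0 },
    { hostname => 'a', level => 1 },
);

# Hostname ascending, then level descending within each hostname.
my @sorted = sort_by_keys([ 'hostname', '-level' ], @items);
my @shown  = map { "$_->{hostname}:$_->{level}" } @sorted;
print "@shown\n";
```

The second key is consulted only for the two records that share hostname 'a', which is where the reversed level order becomes visible.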
get_dumps(%parameters)
This function returns a sequence of dumps. Values in %parameters restrict
the set of dumps that are returned. The same keys as are used for get_parts
are available here, with the exception of label and labels. The
status key applies to the dump status, not the status of its constituent
parts.
sort_dumps([ $key1, $key2 ], @dumps)
Like sort_parts, this sorts a sequence of dumps generated by get_dumps.
The same keys are available, with the exception of label, filenum, and
partnum.
add_part($part)
Add the given part to the database. In terms of logfiles, this will either
create a new logfile (if the part's write_timestamp has not been seen
before) or append to an existing logfile. Note that a new logfile will require
a corresponding new entry in the tapelist.
Note that no locking is performed: multiple simultaneous calls to this function can result in a corrupted or incorrect logfile.
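Callers that might run concurrently can serialize their own updates with an advisory lock. The sketch below is one possible approach under assumptions: with_lock and the lock file path are hypothetical and not part of Amanda, and a stub closure stands in for the real add_part call. It only protects callers that all agree to take the same lock.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempdir);

# Hypothetical wrapper: run $code while holding an exclusive lock on $lockfile.
sub with_lock {
    my ($lockfile, $code) = @_;
    open(my $fh, '>>', $lockfile) or die "cannot open $lockfile: $!";
    flock($fh, LOCK_EX)           or die "cannot lock $lockfile: $!";
    my @result = eval { $code->() };
    my $err = $@;
    close($fh);          # closing the handle releases the lock
    die $err if $err;
    return @result;
}

# Temporary lock file for the demonstration; a real deployment would pick
# a fixed path shared by all cooperating processes.
my $lockfile = tempdir(CLEANUP => 1) . '/catalog.lock';

my @added;
with_lock($lockfile, sub {
    # Stand-in for Amanda::DB::Catalog::add_part($part).
    push @added, 'part-1';
});
print scalar(@added), " part(s) added under lock\n";
```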
TODO: add_dump
This page was automatically generated Tue Nov 19 20:05:35 2013 from the Amanda source tree, and documents the most recent development version of Amanda. For documentation specific to the version of Amanda on your system, use the 'perldoc' command.