3.1.3

Amanda::DB::Catalog


NAME

Amanda::DB::Catalog - access to the Amanda catalog: where is that dump?


SYNOPSIS

  use Amanda::DB::Catalog;
  # get all write timestamps on record
  my @timestamps = Amanda::DB::Catalog::get_write_timestamps();
  # loop over those timestamps, printing dump info for each successful part
  for my $timestamp (@timestamps) {
      my @parts = Amanda::DB::Catalog::get_parts(
          write_timestamp => $timestamp,
          status => 'OK'
      );
      print "$timestamp:\n";
      for my $part (@parts) {
          # hostname, diskname, and level are dump keys, reached via the dump reference
          print " ", $part->{dump}{hostname}, ":", $part->{dump}{diskname},
                " level ", $part->{dump}{level}, "\n";
      }
  }


MODEL

The Amanda catalog is modeled as a set of dumps composed of parts. A dump is a complete bytestream received from an application, and is uniquely identified by the combination of hostname, diskname, dump_timestamp, level, and write_timestamp. A dump may be partial, or even a complete failure.

A part corresponds to a single file on a volume, containing a portion of the data for a dump. A part, then, is completely specified by a volume label and a file number (filenum). Each part has, among other things, a part number (partnum) which gives its relative position within the dump. The bytestream for a dump is recovered by concatenating all of the successful (status = OK) parts matching the dump.
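The recovery rule above can be sketched on plain hashrefs. This is an illustration only: the part records and the restore_order helper are hypothetical, and nothing here calls into Amanda.

```perl
use strict;
use warnings;

# Hypothetical part records, shaped like the hashrefs described under PARTS.
my @parts = (
    { partnum => 2, status => 'OK',      label => 'TAPE-001', filenum => 4 },
    { partnum => 1, status => 'OK',      label => 'TAPE-001', filenum => 3 },
    { partnum => 2, status => 'PARTIAL', label => 'TAPE-001', filenum => 2 },
);

# Keep only successful parts, then order them by position within the dump.
sub restore_order {
    my @ok = grep { $_->{status} eq 'OK' } @_;
    return sort { $a->{partnum} <=> $b->{partnum} } @ok;
}

my @ordered = restore_order(@parts);
print join(", ", map { "$_->{label}:$_->{filenum}" } @ordered), "\n";
# prints "TAPE-001:3, TAPE-001:4"
```

Reading the listed files in that order and concatenating their contents would yield the dump's bytestream; the PARTIAL copy of part 2 is skipped.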

Files in the holding disk are considered part of the catalog, and are represented as single-part dumps (holding-disk chunking is ignored, as it is distinct from split parts).

DUMPS

The dump table contains one row per dump. It has the following columns:

dump_timestamp

(string) -- timestamp of the run in which the dump was created

write_timestamp

(string) -- timestamp of the run in which the part was written to this volume, or "00000000000000" for dumps in the holding disk.

hostname

(string) -- dump hostname

diskname

(string) -- dump diskname

level

(integer) -- dump level

status

(string) -- "OK", "PARTIAL", or "FAIL"

message

(string) -- reason for PARTIAL or FAIL status

nparts

(integer) -- number of successful parts in this dump

kb

(integer) -- size (in kb) of this dump

orig_kb

(integer) -- size (in kb) of the complete dump (uncompressed and unencrypted).

sec

(integer) -- time (in seconds) spent writing this dump

parts

(arrayref) -- array of parts, indexed by partnum (so $parts->[0] is always undef). When multiple partial parts are available, the choice of the partial that is included in this array is undefined.

A dump is represented as a hashref with these keys.

The write_timestamp gives the time of the Amanda run in which the part was written to this volume. The write_timestamp may differ from the dump_timestamp if, for example, amflush wrote the part to tape after the initial dump.
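As a rough illustration of the dump hashref, the sketch below builds one by hand and walks its parts array. All values are invented, and total_part_kb is a hypothetical helper rather than part of this module.

```perl
use strict;
use warnings;

# Hypothetical dump record using the keys listed above; values are made up.
my $dump = {
    hostname        => 'client1',
    diskname        => '/home',
    level           => 0,
    dump_timestamp  => '20131119200535',
    write_timestamp => '20131119200535',
    status          => 'OK',
    nparts          => 2,
    parts           => [
        undef,                                      # index 0 is never used
        { partnum => 1, status => 'OK', kb => 1024 },
        { partnum => 2, status => 'OK', kb => 512 },
    ],
};

# Sum the sizes of the parts, skipping slot 0 and any other undefined entry.
sub total_part_kb {
    my ($d) = @_;
    my $total = 0;
    for my $part (@{ $d->{parts} }) {
        next unless defined $part;
        $total += $part->{kb};
    }
    return $total;
}

printf "%s:%s has %d kb in parts\n",
    $dump->{hostname}, $dump->{diskname}, total_part_kb($dump);
```

Because the array is indexed by partnum, code that iterates over it should always check for undefined entries instead of assuming the first element is a part.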

PARTS

The parts table contains one row per part, and has the following columns:

label

(string) -- volume label (not present for holding files)

filenum

(integer) -- file on that volume (not present for holding files)

holding_file

(string) -- fully-qualified pathname of the holding file (not present for on-media dumps)

dump

(object ref) -- a reference to the dump containing this part

status

(string) -- "OK", "PARTIAL" or some other descriptor

partnum

(integer) -- part number of a split part (1-based)

kb

(integer) -- size (in kb) of this part

sec

(integer) -- time (in seconds) spent writing this part

A part is represented as a hashref with these keys. The label and filenum serve as a primary key.

Note that parts' dump and dumps' parts create a reference loop. This is broken by making the parts array's contents weak references in get_dumps, and the dump reference weak in get_parts.
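The general technique can be sketched with Scalar::Util::weaken on plain hashrefs. This is not Amanda's implementation: make_linked_dump is a hypothetical helper, and the choice of which side to weaken is made only for the demonstration.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak);

# Build a dump and one part that point at each other (a reference loop).
sub make_linked_dump {
    my $dump = { hostname => 'client1', diskname => '/home', parts => [undef] };
    my $part = { partnum => 1, status => 'OK', dump => $dump };
    $dump->{parts}[1] = $part;

    # Weaken the back-reference so the loop no longer keeps both alive.
    weaken($part->{dump});
    return $dump;
}

my $dump = make_linked_dump();
my $part = $dump->{parts}[1];

print "back-reference is weak: ", (isweak($part->{dump}) ? "yes" : "no"), "\n";
print "part still reaches its dump: $part->{dump}{hostname}\n";

# Dropping the last strong reference to the dump frees it; the weak
# reference held by the part then becomes undef.
undef $dump;
print "after release: ", (defined $part->{dump} ? "still set" : "undef"), "\n";
```

The practical consequence of a weak reference is visible in the last line: the weakly referenced side only stays reachable while something else holds a strong reference to it.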

NOTES

All timestamps used in this module are full-length, in the format YYYYMMDDHHMMSS. If the underlying data contains only datestamps, they are zero-extended into timestamps: YYYYMMDD000000. A dump_timestamp always corresponds to the initiation of the original dump run, while write_timestamp gives the time the file was written to the volume. When parts are migrated from volume to volume (e.g., by amvault), the dump_timestamp does not change.
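The zero-extension rule is simple enough to sketch directly. The helper name zero_extend_timestamp is hypothetical and is not a function of this module.

```perl
use strict;
use warnings;

# Hypothetical helper: extend an 8-digit datestamp to a 14-digit timestamp.
sub zero_extend_timestamp {
    my ($stamp) = @_;
    return $stamp            if $stamp =~ /^\d{14}$/;   # already full-length
    return $stamp . '000000' if $stamp =~ /^\d{8}$/;    # datestamp only
    die "unrecognized timestamp '$stamp'\n";
}

print zero_extend_timestamp('20131119'), "\n";          # 20131119000000
print zero_extend_timestamp('20131119200535'), "\n";    # 20131119200535
```

Since full-length timestamps are fixed-width digit strings, comparing them as strings with lt, gt, or cmp orders them chronologically.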

In Amanda, the tuple (hostname, diskname, level, dump_timestamp) serves as a unique identifier for a dump bytestream, but because the bytestream may appear several times in the catalog (due to vaulting) the additional write_timestamp is required to identify a particular on-storage instance of a dump. Note that the part sizes may differ between instances, so it is not valid to concatenate parts from different dump instances.
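One way to keep instances apart is to group parts by the full five-field identifier before assembling anything. The sketch below does this on hand-built records; instance_key is a hypothetical helper, and the data is invented.

```perl
use strict;
use warnings;

# Build a key naming one on-storage instance of a dump.
sub instance_key {
    my ($dump) = @_;
    return join "\0", map { $dump->{$_} }
        qw(hostname diskname level dump_timestamp write_timestamp);
}

# The same bytestream, once as originally written and once after vaulting.
my $orig = {
    hostname       => 'client1',
    diskname       => '/home',
    level          => 0,
    dump_timestamp => '20131119200535',
    write_timestamp => '20131119200535',
};
my $vault = { %$orig, write_timestamp => '20131120010000' };

my @parts = (
    { partnum => 1, dump => $orig },
    { partnum => 2, dump => $orig },
    { partnum => 1, dump => $vault },
);

# Group parts so that each bucket holds parts of exactly one instance.
my %by_instance;
push @{ $by_instance{ instance_key($_->{dump}) } }, $_ for @parts;

printf "%d instances\n", scalar keys %by_instance;      # 2 instances
```

Each bucket can then be ordered by partnum on its own, which avoids mixing differently sized parts from the original and the vaulted copy.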


INTERFACES

SUMMARY DATA

The following functions provide summary data based on the contents of the catalog.

get_write_timestamps()

Get a list of all write timestamps, sorted in chronological order.

get_latest_write_timestamp()

Return the most recent write timestamp.

get_labels_written_at_timestamp($ts)

Return a list of labels for volumes written at the given timestamp.

PARTS

get_parts(%parameters)

This function returns a sequence of parts. Values in %parameters restrict the set of parts that are returned. The hash can have any of the following keys:

write_timestamp

restrict to parts written at this timestamp

write_timestamps

(arrayref) restrict to parts written at any of these timestamps (note that holding-disk files have no write_timestamp, so this option and the previous will omit them)

dump_timestamp

restrict to parts with exactly this timestamp

dump_timestamps

(arrayref) restrict to parts with any of these timestamps

dump_timestamp_match

restrict to parts with timestamps matching this expression

holding

if true, only return dumps on holding disk. If false, omit dumps on holding disk.

hostname

restrict to parts with exactly this hostname

hostnames

(arrayref) restrict to parts with any of these hostnames

hostname_match

restrict to parts with hostnames matching this expression

diskname

restrict to parts with exactly this diskname

disknames

(arrayref) restrict to parts with any of these disknames

diskname_match

restrict to parts with disknames matching this expression

label

restrict to parts with exactly this label

labels

(arrayref) restrict to parts with any of these labels

level

restrict to parts with exactly this level

levels

(arrayref) restrict to parts with any of these levels

status

restrict to parts with this status

dumpspecs

(arrayref of dumpspecs) restrict to parts matching one or more of these dumpspecs

Match expressions are described in the amanda(8) manual page.
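A sketch of how several of these keys might be combined in one call follows. The lookup is passed in as a code reference so that the example runs without an Amanda installation; in real use it would presumably be \&Amanda::DB::Catalog::get_parts, called once Amanda's configuration has been loaded. The helper name and the stub are hypothetical.

```perl
use strict;
use warnings;

# Ask for successful level-0 parts on media for a set of hosts.
sub level0_parts_for_hosts {
    my ($get_parts, @hosts) = @_;
    return $get_parts->(
        hostnames => [@hosts],   # any of these hostnames
        level     => 0,          # exactly this level
        status    => 'OK',       # only successful parts
        holding   => 0,          # omit holding-disk files
    );
}

# Stand-in for the real lookup: records the parameters it was given and
# returns one made-up part.
my %seen;
my $stub = sub {
    %seen = @_;
    return ({ label => 'TAPE-001', filenum => 3 });
};

my @parts = level0_parts_for_hosts($stub, 'client1', 'client2');
print "asked for hosts: @{ $seen{hostnames} }\n";
print "got ", scalar(@parts), " part(s)\n";
```

Singular keys such as level take a single value, while the plural forms such as hostnames take an arrayref.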

sort_parts([ $key1, $key2, .. ], @parts)

Given a list of parts, this function sorts that list by the requested keys. The following keys are available:

hostname
diskname
write_timestamp
dump_timestamp
level
filenum
label (note that this sorts labels lexically, not necessarily in the order they were used!)
partnum
nparts

Keys are processed from left to right: if two parts have the same value for $key1, then $key2 is examined, and so on. Key names may be prefixed by a dash (-) to reverse the order.

Note that some of these keys are dump keys; the function will automatically access those values via the dump attribute.
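The key-processing rules can be illustrated with a small stand-alone sorter. This is a sketch, not the module's sort_parts: the function name is hypothetical, and the split between numeric and string comparison is an assumption.

```perl
use strict;
use warnings;

# Assumption: these keys compare numerically; everything else as strings.
my %numeric = map { $_ => 1 } qw(level filenum partnum nparts);
# Keys that live on the dump and not on the part itself.
my %dump_key = map { $_ => 1 }
    qw(hostname diskname write_timestamp dump_timestamp level nparts);

sub sort_parts_sketch {
    my ($keys, @parts) = @_;
    return sort {
        my $cmp = 0;
        for my $spec (@$keys) {
            (my $key = $spec) =~ s/^-//;          # strip the reverse marker
            my $reverse = $spec ne $key;
            my $x = $dump_key{$key} ? $a->{dump}{$key} : $a->{$key};
            my $y = $dump_key{$key} ? $b->{dump}{$key} : $b->{$key};
            $cmp = $numeric{$key} ? $x <=> $y : $x cmp $y;
            $cmp = -$cmp if $reverse;
            last if $cmp;                         # first differing key decides
        }
        $cmp;
    } @parts;
}

my $d0 = { hostname => 'client1', diskname => '/home', level => 0 };
my $d1 = { hostname => 'client1', diskname => '/home', level => 1 };
my @parts = (
    { dump => $d0, partnum => 2, label => 'TAPE-001' },
    { dump => $d1, partnum => 1, label => 'TAPE-002' },
    { dump => $d0, partnum => 1, label => 'TAPE-001' },
);

# Highest level first, then ascending part number within each level.
for my $p (sort_parts_sketch([ '-level', 'partnum' ], @parts)) {
    print "level $p->{dump}{level} part $p->{partnum}\n";
}
```

With the keys '-level' and 'partnum', the level-1 part comes first, followed by the two level-0 parts in partnum order.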

DUMPS

get_dumps(%parameters)

This function returns a sequence of dumps. Values in %parameters restrict the set of dumps that are returned. The same keys as are used for get_parts are available here, with the exception of label and labels. The status key applies to the dump status, not the status of its constituent parts.

sort_dumps([ $key1, $key2 ], @dumps)

Like sort_parts, this sorts a sequence of dumps generated by get_dumps. The same keys are available, with the exception of label, filenum, and partnum.

ADDING DATA

add_part($part)

Add the given part to the database. In terms of logfiles, this will either create a new logfile (if the part's write_timestamp has not been seen before) or append to an existing logfile. Note that a new logfile will require a corresponding new entry in the tapelist.

Note that no locking is performed: multiple simultaneous calls to this function can result in a corrupted or incorrect logfile.
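Callers that may run concurrently could serialize their own calls with an advisory lock. The sketch below uses flock from core Perl; the wrapper name and the lock file are hypothetical, the callback stands in for a call to add_part, and the lock only helps if every writer goes through the same wrapper.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempfile);

# Run $code while holding an exclusive advisory lock on $lockfile.
sub with_catalog_lock {
    my ($lockfile, $code) = @_;
    open my $fh, '>>', $lockfile or die "cannot open $lockfile: $!";
    flock($fh, LOCK_EX) or die "cannot lock $lockfile: $!";
    my @result = eval { $code->() };
    my $err = $@;
    close $fh;                      # closing the handle releases the lock
    die $err if $err;
    return @result;
}

# Demonstration with a temporary lock file and a stand-in for add_part.
my (undef, $lockfile) = tempfile(UNLINK => 1);
my @added;
with_catalog_lock($lockfile, sub {
    push @added, { label => 'TAPE-001', filenum => 5 };
});
print "added ", scalar(@added), " part(s) under lock\n";
```

The lock is released even when the callback dies, and the exception is re-thrown afterwards so that failures are not hidden.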

TODO: add_dump


ABOUT THIS PAGE

This page was automatically generated Tue Nov 19 20:05:35 2013 from the Amanda source tree, and documents the most recent development version of Amanda. For documentation specific to the version of Amanda on your system, use the 'perldoc' command.

