Application API

From wiki.zmanda.com
Revision as of 20:07, 12 December 2008 by Dustin (talk | contribs) (→‎Media Formats: moved to talk page)

This page documents the Application API from a developer's perspective -- in particular, someone interested in modifying an existing application or creating a new one. For the basics of using the Application API in an Amanda configuration, see How To:Use Amanda Applications on a Client. Note that the implementation of the Application API is still in progress.

Background

There are two compelling reasons to introduce the Application API:

  • To allow recovery of a single file without transmitting the entire backup archive to the client.
  • To make it easier to support new client backup mechanisms, both at the filesystem and application level.

The Application API addresses these needs by changing the way Amanda client operations work.

Historically, Amanda has focused on managing large chunks of data generated by one of only a few hard-coded applications (generally GNU tar, some version of dump, or smbclient). The Application API addresses both limitations, the coarse granularity and the fixed set of applications, in the following manner:

  • It provides modular support for adding client backup tools, both for filesystems and applications such as databases, mail servers, etc.
  • It extends Amanda to allow more granular backup and restore options.

Backward Compatibility

The Application API maintains backward compatibility by extending existing behavior rather than replacing it:

  • Legacy clients can be dumped as before. The server writes data to tape in the legacy format.
  • Legacy tapes can be read as before.
    • When restoring to a new client (one using the Application API), the server provides the legacy dump as one large collection.
    • When restoring to a legacy client, the restore works as before.
  • Legacy clients cannot restore data backed up by Application API clients; legacy clients can be restored only from legacy dumps.

Terminology

These terms are derived from the SCSI command-set standard INCITS T10/1731-D.

A User Object is the basic unit of backup and restore from the user's perspective. Currently, a user object is a file or directory; other types of data may be supported in the future. Each user object has a hierarchical identifier and a set of associated attributes. Each user object is entirely contained within some set of collections, but a single collection may contain data from multiple user objects.

A Collection is the basic unit of backup and restore as it resides on the backup media. A collection is the smallest unit that can be stored on or retrieved from media.

Each collection and user object may originate from only a single backup job, collection merge, or collection copy/migration.
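The relationship between user objects and collections can be sketched as a small data model. This is only an illustration of the definitions above, not Amanda's implementation: the class names, field names, and the `collections_for` helper are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class UserObject:
    """Basic unit of backup and restore from the user's perspective."""
    identifier: str          # hierarchical identifier, e.g. a file path
    attributes: tuple = ()   # (name, value) pairs; contents are illustrative


@dataclass
class Collection:
    """Smallest unit that can be stored on or retrieved from media."""
    collection_id: int
    origin: str                                # the single job, merge, or copy that produced it
    members: set = field(default_factory=set)  # identifiers of user objects with data here


def collections_for(identifier, collections):
    """Return the ids of every collection holding data for one user object."""
    return sorted(c.collection_id for c in collections if identifier in c.members)


# Hypothetical example: one object spans two collections, and one
# collection holds data from two objects.
passwd = UserObject("/etc/passwd", (("mode", "0644"),))
colls = [
    Collection(0, "job-1", {"/etc/hosts", "/etc/passwd"}),
    Collection(1, "job-1", {"/etc/passwd"}),
    Collection(2, "job-1", {"/var/log/messages"}),
]
print(collections_for(passwd.identifier, colls))  # [0, 1]
```

In this model, restoring a single user object requires reading only the collections returned by `collections_for`, which is what makes single-file recovery possible without transferring the whole archive.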

Here are some examples of how the new Application API nomenclature would apply in the context of different application drivers.

  • Dump

User object => Filesystem object (file, directory, socket, pipe, etc.)
Collection => Entire filesystem

  • GNU tar

User object => Archive object (file, directory, etc.)
Collection => One 512-byte tar block

Note that having such collections can be problematic; see below.

  • SQL database

User object => Database table
Collection => Entire database

  • Alternative SQL database

User object => Table row
Collection => Entire database

This conception is only useful if you have very large table rows; otherwise, the indices will be as big as the original database!
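The cost of very fine-grained collections and user objects, as in the GNU tar and table-row examples above, can be estimated with some rough arithmetic. The sketch below counts the 512-byte blocks a regular file occupies in a tar archive (one header block plus data padded to a block boundary, ignoring extra headers for long names and the end-of-archive blocks) and compares index size to data size. The 64-byte index entry is an assumed figure chosen purely for illustration, and both function names are hypothetical.

```python
import math

TAR_BLOCK = 512  # bytes; the GNU tar collection size given above


def tar_blocks(file_size):
    """Blocks used by one regular file: one header block plus padded data."""
    return 1 + math.ceil(file_size / TAR_BLOCK)


def index_ratio(object_size, entry_size):
    """Index size relative to data size when each user object gets one index entry."""
    return entry_size / object_size


print(tar_blocks(1_000_000))       # 1955
print(index_ratio(100, 64))        # 0.64
print(index_ratio(1_000_000, 64))  # 6.4e-05
```

Under these assumptions, a 1 MB file maps to nearly two thousand collections, and indexing 100-byte rows yields an index more than half the size of the data, while indexing 1 MB objects costs almost nothing.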
