XFA/Data Handling Model
Revision as of 06:23, 1 October 2007
Introduction
Amanda's data-handling model sits at the core of the application. Its purpose is fairly simple: move data between a backup client and archival storage. However, it is subject to some stringent constraints.
- Tapes are linear-access devices, with optimal block- and file-sizes; performance deteriorates rapidly as parameters diverge from these optima.
- Recovery of single files (or "user objects", as described below) should be relatively quick, and not require reading an entire dumpfile.
- Recovery should be possible with only native tools and the contents of tapes.
- Amanda must be able to recover indexes and all other metadata from the on-tape data alone (a "reindex" operation).
Terminology
Much of this terminology is part of the Application API; this section briefly summarizes those terms.
- Application
- An Amanda component that interfaces directly with the client data.
- User Object
- The smallest object that can be restored (e.g., a file for GNU Tar).
- Collection
- The smallest amount of data that the application can operate on (e.g., a 512-byte block for GNU Tar).
- Transfer
- A data-movement operation.
- Transfer Element
- A component of a transfer; elements are combined in a kind of pipeline, with each element sending data to the next.
- Filter
- A transfer element which performs some transformation, such as compression or encryption, on the data that passes through it. Filters are described as operating normally when performing a backup, and in reverse on restore. When operating in reverse, a filter takes data that it produced during normal operation and transforms it back into the original input. For example, an encryption filter encrypts in normal operation, and decrypts in reverse.
- Seekable Filter
- A filter which, when operating in reverse, can begin at arbitrary points in the datastream; contrast non-seekable filters.
- Non-seekable Filter
- A filter which, when operating in reverse, must always begin at byte zero of the datastream.
- Catenary Filter
- A filter for which concatenation is distributive over filtering. A filter is catenary if
cat file1 file2 | filter | filter -reverse
produces the same output as
(filter <file1 ; filter <file2) | filter -reverse
Gzip, for example, is catenary.
- Bytestream
- A linear sequence of bytes; the data exchanged between transfer elements.
- Debit
- The creation of a range of bytes in a bytestream by a transfer element (we try to avoid the terms "block" or "chunk" here, although the effect is similar).
- Credit
- The consumption of a range of bytes from a bytestream by a transfer element.
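The catenary property defined above can be checked directly for gzip. The following is a minimal sketch using Python's standard gzip module in place of the `filter` command; the sample data and variable names are made up for illustration and are not part of Amanda.

```python
import gzip

# Two hypothetical input files, held in memory for the sake of the example.
file1 = b"first collection of data\n"
file2 = b"second collection of data\n"

# Equivalent of: cat file1 file2 | filter | filter -reverse
whole = gzip.decompress(gzip.compress(file1 + file2))

# Equivalent of: (filter <file1 ; filter <file2) | filter -reverse
# The two gzip members are simply concatenated before decompression.
pieces = gzip.decompress(gzip.compress(file1) + gzip.compress(file2))

# The filter is catenary if both pipelines yield the original concatenation.
is_catenary = whole == pieces == file1 + file2
print(is_catenary)
```

This only demonstrates the property for one pair of inputs; it is not a proof for the filter in general.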
Diagrams
A bytestream is represented by a solid horizontal line; debits made by the transfer element producing the bytestream are indicated with tickmarks above the line, while credits for the consuming element are delimited with tickmarks below the line.
A filter element, then, consumes one bytestream (the upper bytestream) and produces another (the lower). Note that there is a one-to-one correspondence of an element's credits (against its source bytestream) and debits (to the bytestream it produces).
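To make the one-to-one correspondence between credits and debits concrete, here is a minimal sketch of a filter that records both byteranges as it runs. It assumes, purely for illustration, a compression filter that compresses each credit independently with zlib, which also makes it seekable when operating in reverse. The names `run_filter` and `restore_credit` and the index layout are hypothetical, not Amanda's actual API.

```python
import zlib

def run_filter(upper, credit_size):
    """Consume `upper` in credits and produce one debit per credit.

    Returns the lower bytestream and an index of
    ((credit_start, credit_end), (debit_start, debit_end)) pairs.
    """
    lower = bytearray()
    index = []
    for start in range(0, len(upper), credit_size):
        end = min(start + credit_size, len(upper))
        debit = zlib.compress(upper[start:end])  # each credit compressed on its own
        index.append(((start, end), (len(lower), len(lower) + len(debit))))
        lower += debit
    return bytes(lower), index

def restore_credit(lower, index, upper_offset):
    """Operate in reverse on only the debit whose credit covers `upper_offset`."""
    for (c_start, c_end), (d_start, d_end) in index:
        if c_start <= upper_offset < c_end:
            return zlib.decompress(lower[d_start:d_end])
    raise ValueError("offset is outside the upper bytestream")

upper = bytes(range(256)) * 40             # 10240 bytes of sample data
lower, index = run_filter(upper, 4096)     # three credits, hence three debits
piece = restore_credit(lower, index, 5000) # reads only the second debit
```

Because every index entry pairs exactly one credit with one debit, a restore can map an offset in the upper bytestream to the byterange it needs in the lower one.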
A Transfer
Let's look at a full transfer, operating normally:
Reading the diagram from top to bottom:
- The application produces a single bytestream, recording (in the index) the byteranges representing each collection.
- The first (topmost) filter divides that bytestream up into credits, and records the corresponding byteranges in the index.
- The filter produces a different bytestream, containing one debit for each credit applied to the first bytestream; the byteranges of the debits are also recorded in the index.
- The second filter independently divides that bytestream up into credits, and records the corresponding byteranges in the index.
- The filter produces a final bytestream, containing one debit for each credit applied to its source bytestream; it records the corresponding byteranges in the index.
- The destination element divides its bytestream into a number of equal-sized device blocks, and writes them to permanent storage. Note that the final block is shorter than the rest, representing an EOF. See the Device API for details.
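The steps above can be sketched end to end as follows. This is an illustrative model only: the index layout, the function names, the credit and block sizes, and the XOR "cipher" standing in for an encryption filter are all assumptions made for the example, not part of Amanda or the Device API.

```python
import zlib

def apply_filter(name, upper, credit_size, transform, index):
    """Consume `upper` in credits, emit one debit per credit, record both."""
    lower = bytearray()
    for start in range(0, len(upper), credit_size):
        end = min(start + credit_size, len(upper))
        debit = transform(upper[start:end])
        index.append((name, (start, end), (len(lower), len(lower) + len(debit))))
        lower += debit
    return bytes(lower)

def toy_cipher(chunk):
    # Stand-in for an encryption filter; this is NOT real encryption.
    return bytes(b ^ 0x5A for b in chunk)

def split_into_blocks(bytestream, block_size):
    # Equal-sized device blocks; the last one may be shorter.
    return [bytestream[i:i + block_size]
            for i in range(0, len(bytestream), block_size)]

index = []

# Application: three 512-byte collections, each byterange recorded in the index.
collections = [bytes([n]) * 512 for n in range(3)]
app_stream = b"".join(collections)
offset = 0
for c in collections:
    index.append(("application", None, (offset, offset + len(c))))
    offset += len(c)

# First filter: its credits are chosen independently of collection boundaries.
compressed = apply_filter("compress", app_stream, 1024, zlib.compress, index)

# Second filter: divides the compressed bytestream into its own credits.
final = apply_filter("encrypt", compressed, 16, toy_cipher, index)

# Destination element: divide the final bytestream into device blocks.
blocks = split_into_blocks(final, 32)
```

Note that each element chooses its credit boundaries without regard to the boundaries used by its neighbours; only the index ties the byteranges of the different bytestreams together.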