XFA/Data Handling Model: Difference between revisions
Revision as of 06:08, 1 October 2007
Introduction
Amanda's data-handling model sits at the core of the application. Its purpose is fairly simple: move data between a backup client and archival storage. However, it is subject to some stringent constraints.
- Tapes are linear-access devices, with optimal block- and file-sizes; performance deteriorates rapidly as parameters diverge from these optima.
- Recovery of single files (or "user objects", as described below) should be relatively quick, and not require reading an entire dumpfile.
- Recovery should be possible with only native tools and the contents of tapes.
- Amanda must be able to recover indexes and all other metadata from the on-tape data alone (a "reindex" operation).
Terminology
Much of this terminology is part of the Application API; this section briefly summarizes those terms.
- Application
- An Amanda component that interfaces directly with the client data.
- User Object
- The smallest object that can be restored (e.g., a file for GNU Tar).
- Collection
- The smallest amount of data that the application can operate on (e.g., a 512-byte block for GNU Tar).
- Transfer
- A data-movement operation.
- Transfer Element
- A component of a transfer; elements are combined in a kind of pipeline, with each element sending data to the next.
- Filter
- A transfer element which performs some transformation, such as compression or encryption, on the data that passes through it. Filters are described as operating normally when performing a backup, and in reverse on restore. When operating in reverse, a filter takes data that it produced during normal operation and transforms it back into the original input. For example, an encryption filter encrypts in normal operation, and decrypts in reverse.
- Seekable Filter
- A filter which, when operating in reverse, can begin at arbitrary points in the datastream; contrast non-seekable filters.
- Non-seekable Filter
- A filter which, when operating in reverse, must always begin at byte zero of the datastream.
- Catenary Filter
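- A filter for which filtering distributes over concatenation: filtering two inputs separately and concatenating the results yields a stream that reverses to the same output as filtering the concatenated inputs. A filter is catenary if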
cat file1 file2 | filter | filter -reverse
produces the same output as
(filter <file1 ; filter <file2) | filter -reverse
Gzip, for example, is catenary.
- Bytestream
- A linear sequence of bytes; the data exchanged between transfer elements.
- Debit
- The creation of a range of bytes in a bytestream by a transfer element (we try to avoid the terms "block" or "chunk" here, although the effect is similar).
- Credit
- The consumption of a range of bytes from a bytestream by a transfer element.
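The catenary property defined above can be checked directly. The following is a minimal sketch in Python, not part of Amanda: the helper name `is_catenary` and the sample inputs are hypothetical, and the standard-library `gzip.compress` and `gzip.decompress` stand in for `filter` and `filter -reverse`.

```python
import gzip


def is_catenary(forward, reverse, part1, part2):
    """Return True if the filter behaves catenarily on these two inputs.

    forward and reverse are callables mapping bytes to bytes; they play
    the roles of `filter` and `filter -reverse` in the definition.
    """
    # cat file1 file2 | filter | filter -reverse
    whole = reverse(forward(part1 + part2))
    # (filter <file1 ; filter <file2) | filter -reverse
    pieces = reverse(forward(part1) + forward(part2))
    return whole == pieces


# Hypothetical sample data standing in for file1 and file2.
file1 = b"first dumpfile contents\n" * 4
file2 = b"second dumpfile contents\n" * 4

# gzip.decompress accepts multi-member gzip data, so two compressed
# members concatenated together decompress to the concatenated inputs.
gzip_is_catenary = is_catenary(gzip.compress, gzip.decompress, file1, file2)
print(gzip_is_catenary)
```

A check like this only exercises one pair of inputs; it illustrates the definition rather than proving the property for a filter in general.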
Diagrams
A bytestream is represented by a solid horizontal line; debits made by the transfer element producing the bytestream are indicated with tickmarks above the line, while credits for the consuming element are delimited with tickmarks below the line.
Figure (XFA-bytestream.png): A single bytestream.
A filter element, then, consumes one bytestream (the upper bytestream) and produces another (the lower). Note that there is a one-to-one correspondence between an element's credits (against its source bytestream) and its debits (to the bytestream it produces).
Figure (XFA-filter-intro.png): A filter, transforming the upper bytestream to the lower.
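The bookkeeping described above can be made concrete with a small model. This is only an illustrative sketch: the `Bytestream` class, the `run_filter` function, and the fixed credit size are assumptions made for the example and are not taken from the Amanda source. Each stream records the byte ranges its producer debits and its consumer credits, and the filter makes exactly one debit per credit.

```python
from dataclasses import dataclass, field


@dataclass
class Bytestream:
    """Hypothetical model of a bytestream with its tickmarks."""
    data: bytearray = field(default_factory=bytearray)
    debits: list = field(default_factory=list)   # (start, end) ranges produced
    credits: list = field(default_factory=list)  # (start, end) ranges consumed

    def consumed(self) -> int:
        """Offset of the first byte not yet credited."""
        return self.credits[-1][1] if self.credits else 0

    def debit(self, chunk: bytes) -> None:
        """Producer side: append a range of bytes to the stream."""
        start = len(self.data)
        self.data += chunk
        self.debits.append((start, len(self.data)))

    def credit(self, size: int) -> bytes:
        """Consumer side: take up to `size` bytes from the stream."""
        start = self.consumed()
        end = min(start + size, len(self.data))
        self.credits.append((start, end))
        return bytes(self.data[start:end])


def run_filter(source, sink, transform, credit_size):
    """Consume `source` in credits and make one debit to `sink` per credit."""
    while source.consumed() < len(source.data):
        chunk = source.credit(credit_size)
        sink.debit(transform(chunk))


# The upstream element produces the upper bytestream in two debits.
upper = Bytestream()
upper.debit(b"hello ")
upper.debit(b"world")

# The filter (here, upper-casing) consumes it in 4-byte credits.
lower = Bytestream()
run_filter(upper, lower, bytes.upper, credit_size=4)

print(upper.debits)   # tickmarks above the upper line
print(upper.credits)  # tickmarks below the upper line
print(lower.debits)   # tickmarks above the lower line
```

In this model the filter's credits against the upper stream need not line up with the debits made by whatever produced that stream, while its own credits and debits pair up one-to-one.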