Application API/Misconceptions

From The Open Source Backup Wiki (Amanda, MySQL Backup, BackupPC)

Because the Application API represents a major departure from historical Amanda thinking, misconceptions are common. This section addresses some of the most common ones.

The exact location of user objects is known

Amanda can restore a user object only by retrieving the associated collections. Aside from tracking the collection that contains it, Amanda doesn't store the exact location of any user object. Amanda still has enough information to efficiently restore a user object without reading the whole dump -- assuming that collections are smaller than a dump.

To put it another way, the object may not be found at any particular byte offset in the backup. Even if it were, Amanda wouldn't know that offset. Nonetheless, Amanda has sufficient information to perform restores efficiently.
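The lookup described above can be sketched as follows. This is only an illustration, not Amanda's actual index format: the `IndexEntry` type, the `collections_for` helper, and the sample paths are all hypothetical. The point is that the index records which collection holds a user object, and nothing finer-grained such as a byte offset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IndexEntry:
    """Hypothetical index record: a user object and the collection holding it."""
    user_object: str    # e.g. a file path
    collection_id: int  # the only location information that is kept


def collections_for(index, wanted):
    """Return the sorted collection IDs needed to restore the wanted user objects."""
    wanted = set(wanted)
    return sorted({e.collection_id for e in index if e.user_object in wanted})


# Hypothetical index for a dump made of three collections.
index = [
    IndexEntry("/etc/passwd", 0),
    IndexEntry("/etc/hosts", 0),
    IndexEntry("/home/a/big.iso", 1),
    IndexEntry("/home/b/notes.txt", 2),
]

needed = collections_for(index, ["/etc/hosts", "/home/b/notes.txt"])
print(needed)
```

In this sketch only collections 0 and 2 need to be read; collection 1, which holds the large object, is skipped entirely.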

Collections must not be very small (or very big)

Although Amanda will not enforce any particular size restriction on a collection, the optimal size for a collection roughly corresponds to the size of a user object. In general, there is not much advantage to having collections smaller than about 64k. Very small collections will bloat the index; very large collections may cause slower restores, especially partial restores of small objects from the collection.
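One way an application might act on this guidance is sketched below. It is a hypothetical grouping policy, not something the API prescribes: small user objects are batched until a collection reaches a minimum size, while any object already at or above that size gets a collection of its own. Reading "64k" as 65536 bytes is an assumption, as are the function name and the sample objects.

```python
MIN_COLLECTION_BYTES = 64 * 1024  # assumption: "64k" read as 65536 bytes


def group_into_collections(objects, min_bytes=MIN_COLLECTION_BYTES):
    """Group (name, size) pairs into collections, in input order.

    Small objects are batched until the batch reaches min_bytes.
    An object of min_bytes or more closes the pending batch and
    becomes a collection by itself, so small objects are never
    bundled with a very large one.
    """
    collections, current, current_size = [], [], 0
    for name, size in objects:
        if size >= min_bytes:
            if current:
                collections.append(current)
                current, current_size = [], 0
            collections.append([name])
            continue
        current.append(name)
        current_size += size
        if current_size >= min_bytes:
            collections.append(current)
            current, current_size = [], 0
    if current:  # leftover small objects form a final, possibly small, collection
        collections.append(current)
    return collections


objects = [
    ("a.txt", 1_000),
    ("b.txt", 2_000),
    ("video.mp4", 5_000_000),
    ("c.txt", 500),
    ("d.txt", 70_000),
]
groups = group_into_collections(objects)
print(groups)
```

Keeping the large object alone means that restoring `a.txt` does not require reading `video.mp4`, which is the partial-restore cost the paragraph above warns about.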

The server can understand a collection

As is the case today, the server doesn't know anything about the collections on media -- it can only store and retrieve them. An entire collection (not an entire job) must be sent to an Amanda client running the same Application API for interpretation.

Note, however, that this Amanda client may be on the same physical machine as the Amanda server.

The inputs and outputs above are associated with particular sockets

Because no line protocol has yet been defined for this API, it would be premature to talk about particular sockets. But it is very possible that all output data (octet stream, collection byte offsets, and user object information) will be multiplexed over a single network socket.
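To make the idea of multiplexing concrete, here is a minimal sketch of one possible framing. Since no line protocol exists, everything in it is invented for illustration: the channel numbers, the 5-byte header layout, and the payload encodings are assumptions, and an in-memory buffer stands in for the socket.

```python
import io
import struct

# Hypothetical channel tags for the three kinds of output data.
DATA, OFFSETS, OBJECTS = 0, 1, 2

# Hypothetical frame header: channel (1 byte) and payload length
# (4 bytes), both in network byte order with no padding.
HEADER = struct.Struct("!BI")


def write_frame(stream, channel, payload):
    """Write one tagged, length-prefixed frame to the stream."""
    stream.write(HEADER.pack(channel, len(payload)))
    stream.write(payload)


def read_frames(stream):
    """Yield (channel, payload) pairs until the stream is exhausted."""
    while True:
        header = stream.read(HEADER.size)
        if not header:
            return
        if len(header) < HEADER.size:
            raise ValueError("truncated frame header")
        channel, length = HEADER.unpack(header)
        payload = stream.read(length)
        if len(payload) != length:
            raise ValueError("truncated frame payload")
        yield channel, payload


# An in-memory buffer stands in for the single network socket.
buf = io.BytesIO()
write_frame(buf, DATA, b"\x00\x01\x02")    # opaque collection bytes
write_frame(buf, OFFSETS, b"0,3")          # collection byte offsets
write_frame(buf, OBJECTS, b"/etc/hosts")   # user object information
buf.seek(0)

frames = list(read_frames(buf))
print(frames)
```

The receiver can route each frame by its channel tag, so the three logical outputs do not need three separate sockets.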

On restore, the client may seek to a particular place in the backup data

Although the client could do this, the server doesn't know anything about it. Rather, the server provides the set of collections that includes all the user objects of interest. Then (and only then) the client goes about restoring user objects from this set of collections.
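The division of labor on restore might look like the following sketch. It is a toy model rather than Amanda code: the `Server` class, the `client_pack` and `client_restore` helpers, and the use of JSON as the collection format are all hypothetical. The server handles collections only as opaque bytes and selects them through its index; only the client looks inside.

```python
import json


class Server:
    """Toy server: stores opaque collections and an object-to-collection index."""

    def __init__(self):
        self.collections = {}  # collection ID -> opaque bytes
        self.index = {}        # user object name -> collection ID

    def store(self, collection_id, blob, object_names):
        self.collections[collection_id] = blob
        for name in object_names:
            self.index[name] = collection_id

    def collections_for(self, wanted):
        """Return every collection containing a wanted object, unparsed."""
        ids = sorted({self.index[name] for name in wanted})
        return [self.collections[i] for i in ids]


def client_pack(objects):
    """Client side: encode a dict of name -> content (JSON is an assumption)."""
    return json.dumps(objects).encode()


def client_restore(blobs, wanted):
    """Client side: decode each collection and pull out the wanted objects."""
    restored = {}
    for blob in blobs:
        contents = json.loads(blob.decode())
        for name in wanted:
            if name in contents:
                restored[name] = contents[name]
    return restored


server = Server()
first = {"/etc/hosts": "127.0.0.1 localhost", "/etc/motd": "hello"}
second = {"/home/b/notes.txt": "remember the milk"}
server.store(0, client_pack(first), list(first))
server.store(1, client_pack(second), list(second))

wanted = ["/etc/hosts"]
blobs = server.collections_for(wanted)
restored = client_restore(blobs, wanted)
print(restored)
```

Here the server returns one whole collection, including `/etc/motd`, and the client discards what it was not asked for; any seeking happens inside `client_restore`, out of the server's sight.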

The data stream sent to the server is opaque to the server

Although the collection data itself is opaque, the other data (collection sizes, user object identifiers and attributes) is very much interpreted by the server. There should, for example, be a standard way of representing file permissions and timestamps as user object attributes.