XFA/Xfer Mechanisms

Individual transfer elements communicate with their upstream and downstream elements via xfer mechanisms. The available mechanisms are tailored to the different kinds of transfer elements Amanda uses. When assembling elements into a transfer, the XFA examines the mechanisms supported by all of the constituent elements, and selects the most efficient combination, adding "glue" code if necessary.

Available Mechanisms

As of this writing, five mechanisms are available:

XFER_MECH_READFD

In this mechanism, the downstream element is given a file descriptor from which it should read. This might be used, for example, for an element that executes a compression application in a pipeline.

XFER_MECH_WRITEFD

In the mirror image, the upstream element is given a file descriptor to which it writes data.

XFER_MECH_PULL_BUFFER

Here, the downstream element calls a method of the upstream element to fetch a buffer of arbitrary size.

XFER_MECH_PUSH_BUFFER

In this mechanism, the upstream element calls a method of the downstream element to "push" a buffer of arbitrary size.
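The difference between the two buffer mechanisms is which side makes the call. The following is a minimal Python sketch of that difference, not Amanda's C API: the class and function names (`ListSource`, `CollectingDest`, `run_pull`, `run_push`) are hypothetical, and using `None` to mark the end of the data is an assumption of this sketch.

```python
class ListSource:
    """Upstream element that can be pulled from (hypothetical)."""

    def __init__(self, chunks):
        self._chunks = list(chunks)

    def pull_buffer(self):
        # Return the next buffer, of whatever size upstream chooses,
        # or None once the data is exhausted (an assumption of this sketch).
        return self._chunks.pop(0) if self._chunks else None


class CollectingDest:
    """Downstream element that can be pushed to (hypothetical)."""

    def __init__(self):
        self.received = []
        self.eof = False

    def push_buffer(self, buf):
        # None marks the end of the data in this sketch.
        if buf is None:
            self.eof = True
        else:
            self.received.append(buf)


def run_pull(source):
    """PULL_BUFFER style: the downstream side drives by calling upstream."""
    out = []
    while (buf := source.pull_buffer()) is not None:
        out.append(buf)
    return out


def run_push(chunks, dest):
    """PUSH_BUFFER style: the upstream side drives by calling downstream."""
    for buf in chunks:
        dest.push_buffer(buf)
    dest.push_buffer(None)
```

In both cases the buffers arrive in the same order; what changes is which element owns the loop.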

XFER_MECH_DIRECTTCP_LISTEN

In this mechanism, the downstream end of the transfer listens for an incoming DirectTCP connection. The addresses on which it is listening are available to the upstream element.
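The listen-then-connect pattern can be illustrated with ordinary TCP sockets. This sketch does not implement DirectTCP itself. It only shows the division of roles, assuming a loopback address and hypothetical function names: the downstream side listens and publishes its address, and the upstream side connects to it and sends data.

```python
import socket
import threading


def listen_downstream():
    """Downstream side: open a listening socket and report its address."""
    srv = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    srv.bind(("127.0.0.1", 0))   # port 0: let the OS pick a free port
    srv.listen(1)
    return srv, srv.getsockname()


def accept_and_read(srv, result):
    """Downstream side: accept one connection and read until EOF."""
    conn, _ = srv.accept()
    with conn:
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:         # empty read: the peer closed the connection
                break
            chunks.append(data)
    result.append(b"".join(chunks))


def send_upstream(addr, payload):
    """Upstream side: connect to the advertised address and send data."""
    with socket.create_connection(addr) as sock:
        sock.sendall(payload)


def transfer(payload):
    """Run both sides in one process and return what downstream received."""
    srv, addr = listen_downstream()
    result = []
    reader = threading.Thread(target=accept_and_read, args=(srv, result))
    reader.start()
    send_upstream(addr, payload)
    reader.join()
    srv.close()
    return result[0]
```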

Note that, as of this writing, there is no XFER_MECH_DIRECTTCP_CONNECT.

Combinations and Glue

Each element lists the *pairs* of mechanisms that it supports. For example, a simple buffer-based encryption filter might be able to pull data from upstream (PULL_BUFFER) in response to a pull request from its downstream (PULL_BUFFER), or to push data downstream (PUSH_BUFFER) every time its upstream pushes data (PUSH_BUFFER). This element would have two supported pairs:

  • (PUSH_BUFFER, PUSH_BUFFER)
  • (PULL_BUFFER, PULL_BUFFER)

In neither case does this element "own" the thread in which it is operating -- it merely reacts to calls from its neighbors. Imagine it were combined with an FdSource:

  • (NONE, READ_FD)

and an FdDest:

  • (WRITE_FD, NONE)

There is no direct combination of these mechanisms which will correctly encrypt the data -- in fact, none of the elements contains a loop that would drive the transfer of data.
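The mismatch can be checked mechanically. The sketch below is an illustration only, not the XFA's actual selection algorithm: it assumes that two adjacent elements can be linked without glue exactly when the upstream element's output mechanism equals the downstream element's input mechanism, and the function name `direct_linkages` is hypothetical.

```python
from itertools import product

# Supported (input mechanism, output mechanism) pairs, as listed above.
FD_SOURCE = [("NONE", "READ_FD")]
CRYPTO_FILTER = [("PUSH_BUFFER", "PUSH_BUFFER"), ("PULL_BUFFER", "PULL_BUFFER")]
FD_DEST = [("WRITE_FD", "NONE")]


def direct_linkages(elements):
    """Return every choice of one pair per element that needs no glue.

    A choice needs no glue when each element's output mechanism equals
    the input mechanism of the element that follows it (an assumption
    of this sketch).
    """
    matches = []
    for choice in product(*elements):
        if all(up[1] == down[0] for up, down in zip(choice, choice[1:])):
            matches.append(choice)
    return matches
```

For `[FD_SOURCE, CRYPTO_FILTER, FD_DEST]` this returns an empty list, since neither of the filter's input mechanisms matches READ_FD and neither of its output mechanisms matches WRITE_FD.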

The solution is to add glue elements between the given elements to "translate" between these mechanisms. At least one such glue element should implement a loop in a new thread. One possible combination is:

FdSource --READ_FD--> Glue1
         --PUSH_BUFFER--> CryptoFilter
         --PUSH_BUFFER--> Glue2
         --WRITE_FD--> FdDest

where Glue1 runs a loop:

 while not eof:
   buf = read(upstream.fd)
   downstream.push_buffer(buf)

and Glue2 simply calls write from its push_buffer mechanism. The result is a single thread which reads bytes from the source file, places them in a buffer, encrypts the buffer, and writes it out to the destination fd. This is about as efficient as this operation can be!
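The whole chain can be sketched in Python, under several stated assumptions: pipes stand in for the source and destination file descriptors, a byte-wise XOR stands in for real encryption, `None` marks EOF, and all names (`XorFilter`, `Glue2`, `glue1`, `run_transfer`) are hypothetical. For brevity the driving loop runs in the calling thread rather than a new one, so the payload must be small enough to fit in a pipe buffer.

```python
import os


class XorFilter:
    """Stand-in for CryptoFilter: XORs each byte with a key (not real encryption)."""

    def __init__(self, downstream, key=0x5A):
        self.downstream = downstream
        self.key = key

    def push_buffer(self, buf):
        if buf is None:                      # None marks EOF in this sketch
            self.downstream.push_buffer(None)
        else:
            self.downstream.push_buffer(bytes(b ^ self.key for b in buf))


class Glue2:
    """PUSH_BUFFER -> WRITE_FD: writes each pushed buffer to a file descriptor."""

    def __init__(self, fd):
        self.fd = fd

    def push_buffer(self, buf):
        if buf is None:
            os.close(self.fd)                # closing signals EOF to the fd's reader
            return
        view = memoryview(buf)
        while view:                          # os.write may write only part of the buffer
            view = view[os.write(self.fd, view):]


def glue1(fd, downstream, bufsize=4096):
    """READ_FD -> PUSH_BUFFER: the loop that drives the whole transfer."""
    while True:
        buf = os.read(fd, bufsize)
        if not buf:                          # an empty read means EOF
            break
        downstream.push_buffer(buf)
    downstream.push_buffer(None)
    os.close(fd)


def run_transfer(data):
    src_r, src_w = os.pipe()                 # plays the role of FdSource's fd
    dst_r, dst_w = os.pipe()                 # plays the role of FdDest's fd
    os.write(src_w, data)                    # small payload: fits in the pipe buffer
    os.close(src_w)
    glue1(src_r, XorFilter(Glue2(dst_w)))
    with os.fdopen(dst_r, "rb") as f:
        return f.read()
```

Each buffer read by `glue1` passes through the filter and out to the destination fd within a single call chain, which is the single-thread behavior described above.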

This is a simple example, but the XFA has a complete set of glue elements and can link any collection of transfer elements with maximal efficiency.

Future

One common concept is missing from this set of mechanisms. The buffers used by PUSH_BUFFER and PULL_BUFFER are of arbitrary size, at the discretion of the upstream element in each case. However, Amanda often needs buffers of very specific sizes -- the device block size, for example -- and the process of re-blocking arbitrary buffers into fixed-size buffers can consume a lot of CPU and memory. Furthermore, the PUSH_BUFFER and PULL_BUFFER mechanisms specify that the callee is responsible for freeing the buffer with g_free, which means that each buffer has the overhead of a g_malloc/g_free pair.
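To make the re-blocking cost concrete, here is a minimal Python sketch of a re-blocker. It is an illustration rather than Amanda's implementation: the name `reblock` is hypothetical, and emitting a short final block at EOF (rather than padding it) is an assumption of this sketch.

```python
def reblock(buffers, block_size):
    """Yield fixed-size blocks from an iterable of arbitrary-size buffers.

    Every block has exactly block_size bytes except possibly the last,
    which carries whatever is left over at EOF.
    """
    pending = bytearray()
    for buf in buffers:
        pending += buf                       # copy in: part of the cost described above
        while len(pending) >= block_size:
            yield bytes(pending[:block_size])    # copy out: one more allocation per block
            del pending[:block_size]
    if pending:
        yield bytes(pending)
```

Every byte is copied at least twice on its way through, which is the CPU and memory overhead that a downstream-specified buffer size would avoid.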

We will probably add a new mechanism which uses buffers and buffer sizes specified by the downstream element, leaving it to the upstream element to re-block or to pass that buffer size along to its upstream.

The addition of such a mechanism will be transparent to all of the applications it will affect -- the transfers will just magically run faster.