Difference between revisions of "Device API"

From The Open Source Backup Wiki (Amanda, MySQL Backup, BackupPC)
m (Reverted edits by C010ss (Talk) to last revision by Dustin)
 
(33 intermediate revisions by 4 users not shown)
Line 1: Line 1:
The Device API is designed to replace the ancient [[Virtual tape API | tapeio subsystem]] (sometimes called the vtape API or Virtual Tape API), originally introduced to support virtual tapes. The Device API clears up a large number of limitations and device assumptions that the tapeio system made, allowing native support for new devices (such as CDs and DVDs) as well as new device-related functionality (such as parallel access, partial recycling, and appending).
+
The Device API is a clean interface between Amanda and data-storage systems.  It provides a tape-like model -- a sequence of bytestreams on each volume, identified only by their on-volume file number -- even for non-tape devices.
  
The Device API does not change (or even address) media formats. It is not itself a user-visible change, though it enables many user-visible features.
+
= Background =
 +
 
 +
== Device API Features ==
 +
The Device API also adds a number of new features not previously available in Amanda. These include:
 +
* Device properties allow drivers to describe themselves (and the devices and media they control), as well as allow arbitrary user settings to propagate down to the driver.
 +
* Smart locking means unrelated accesses can be performed without issue, while conflicting accesses wait for one another.
 +
* Appending to volumes is supported for capable devices.
 +
* A device can describe its supported blocksize range to the Amanda core, instead of the other way around.
 +
* Deleting parts of volumes (without erasing the whole thing) is supported for capable devices.
 +
* The amount of free space on a device can be reported to the Amanda core, where the device supports it (e.g., VFS Device)
  
 
== History ==
 
== History ==
The original design of the tapeio system (located in <tt>tape-src/</tt>) was to abstract tape-related functionality into a separate library. Thus, different devices (tape, vtape and RAIT) were all phrased in terms of tape operations: Rewind, fast-forward, read a block, etc. Operations were assumed to have the same semantics as a UNIX tape device. Furthermore, the API revealed a file descriptor, with the assumption that other parts of Amanda could perform operations such as stat() or dup2() on it.
+
The Device API is designed to replace the ancient [[Virtual tape API | tapeio subsystem]] (sometimes called the vtape API or Virtual Tape API), originally introduced to support virtual tapes. The original design of the tapeio system (located in <tt>tape-src/</tt>) was to abstract tape-related functionality into a separate library. Thus, different devices (tape, vtape and RAIT) were all phrased in terms of tape operations: Rewind, fast-forward, read a block, etc. Operations were assumed to have the same semantics as a UNIX tape device. Furthermore, the API revealed a file descriptor, with the assumption that other parts of Amanda could perform operations such as stat() or dup2() on it.
 +
 
 +
The Device API clears up a large number of limitations and device assumptions that the tapeio system made, allowing native support for new devices (such as CDs and DVDs) as well as new device-related functionality (such as parallel access, partial recycling, and appending).
  
=== Limitations of tapeio ===
+
The Device API does not change (or even address) media formats.
Important limitations inherent in the design of tapeio include:
 
* Exposes a UNIX file descriptor
 
* Operates in terms of UNIX operations
 
* Does not store information in a portable (yet device-specific) way.
 
* Not reentrant: Access to the device is not permitted from multiple threads or processes.
 
* Assumes devices can be opened and closed at will.
 
* RAIT driver performs operations in series (rather than parallel).
 
  
=== Operations ===
+
== What the Device API is not ==
Tapeio included the following operations:
+
The Device API does not distinguish between random-access and linear-access media: The seek operation may take a long time for some devices, or it may be instantaneous for others. It does, however, distinguish between concurrent devices and exclusive devices: Concurrent devices may be accessed by multiple readers (and sometimes multiple writers) simultaneously, while exclusive devices cannot.
* <tt>tape_access()</tt>, <tt>tape_open()</tt>, <tt>tape_stat()</tt>, <tt>tapefd_close()</tt>, <tt>tapefd_read()</tt>, <tt>tapefd_write()</tt>: Act like their UNIX equivalents
 
* <tt>tapefd_rewind()</tt>, <tt>tapefd_unload()</tt>: Do what you expect.
 
* <tt>tapefd_fsf()</tt>: Seeks forward a certain number of filemarks.
 
* <tt>tapefd_weof()</tt>: Writes a filemark.
 
* <tt>tapefd_resetofs()</tt>: Workaround for some buggy kernels.
 
* <tt>tapefd_status()</tt>: Prints status to stdout (!).
 
  
In addition, all tapefd_ commands have matching tape_ commands, which work on an unopened device.
+
Although the Device API deals in on-medium headers and blocks, it is otherwise agnostic to media format issues. Changes to the Amanda media format, including split vs traditional formats, can be made without change to the Device API.
  
== Device API Features ==
+
= Using the Device API =
In addition to relieving the limitations of the tapeio system discussed above, the Device API also adds a number of new features. These include:
 
* Device properties allow drivers to describe themselves (and the devices and media they control), as well as allow arbitrary user settings to propagate down to the driver.
 
* Smart locking means unrelated accesses can be performed without issue, while conflicting accesses wait for one another.
 
* Appending to volumes is supported for able devices.
 
* A device can describe its supported blocksize range to the Amanda core, instead of the other way around.
 
* Deleting parts of volumes (without erasing the whole thing) is supported for able devices.
 
* The amount of free disk space can be reported to the Amanda core.
 
  
== Properties ==
+
The Device API is primarily intended for use from Perl code. See {{pod|Amanda::Device}} for a detailed description of the interface.
The Device API includes a number of standard properties, and allows devices to declare and register other custom properties as may be appropriate. Standard properties include:
 
* Concurrency: Does the device support concurrent readers and/or writers?
 
* Streaming: Does the device require (or desire) streaming data? Physical tapes usually require that data arrive in the order it should be placed on the tape, while other devices may allow access to the data in any order.
 
* Compression: Does the device support compression, and what compression ratio has it achieved?
 
* Blocksize: What block size(s) are supported by the device? This property is settable for devices that support a variety of block sizes, or can be left unset for variable block size.
 
* Device UUID: Returns a unique identifier for this piece of hardware.
 
* Media access mode: What is the access paradigm for this volume? Can be Read-Only, Write-Once-Read-Many (WORM), Read-Write, or Write-Once-Read-Never.
 
* Feature support: Does the device support partial deletion or appending to volumes?
 
  
== Programmatic Interface ==
+
= Implementing a Device Driver =
The Device API is implemented in the <tt>device-src/</tt> directory. It relies heavily on GLib's type system, so developers unfamiliar with the GObject system are encouraged to consult the relevant [http://developer.gnome.org/doc/API/2.0/gobject/index.html documentation].
 
  
Broadly speaking, the C interface revolves around a single glib virtual class -- Device -- which is actually implemented as one of many possible subclasses. The current class hierarchy of implemented devices looks like this:
+
To add a new device to the Device API, do the following:
<pre>
+
* Subscribe to the <tt>amanda-hackers</tt> mailing list and/or connect to <tt>#amanda</tt> and solicit the help of existing developers.
GObject
+
* Figure out how the data model and operational model described above will map onto your device.
+ Device
+
* Implement a subclass of one of the existing <tt>Device</tt> classes named above. In particular, implement the various virtual functions provided for in <tt>device.h</tt>. 
  + RaitDevice
+
** Consult the existing device implementations for hints -- in particular, the VFS Device and S3 Device make good examples.
  + NullDevice
+
** Note that in addition to the (mostly virtual) functions discussed above, there are some additional protected functions that you may find useful, described in <tt>device.h</tt>.
  + FdDevice
+
* Make sure your device calls <tt>device_add_property()</tt> to register its properties.
    + TapeDevice
+
* Arrange to call the <tt>register_device()</tt> function to register the device, so it will receive relevant calls to <tt>device_open()</tt>.  The easiest way to do this is to write an initialization function, and add a call to your function from <tt>device_api_init()</tt> in <tt>device.c</tt>.  Dynamically loadable device drivers are an open project.
    + VfsDevice
+
* Add documentation of your device to the {{man|7|amanda-devices}} manpage.
</pre>
 
  
One can obtain a Device by calling the factory function for the particular device manually, but the best way is just to call <tt>device_open()</tt> with a device name. It returns a device of the right type for the given name.
+
The Device API relies heavily on GLib's type system, so developers unfamiliar with the GObject system are encouraged to consult the relevant [http://developer.gnome.org/doc/API/2.0/gobject/index.html documentation].
  
 
=== Properties Interface ===
 
=== Properties Interface ===
Line 72: Line 53:
 
* <tt>BETWEEN_FILE_READ</tt>: When in read mode, but not actually reading a file. This means after <tt>device_start()</tt> has been called with <tt>ACCESS_READ</tt>, but either before a call to <tt>device_seek_file</tt>, or after a call to <tt>device_read_block</tt> returns EOF (provided no intervening call to <tt>device_seek_file()</tt> or <tt>device_seek_block()</tt> since the EOF).
 
* <tt>BETWEEN_FILE_READ</tt>: When in read mode, but not actually reading a file. This means after <tt>device_start()</tt> has been called with <tt>ACCESS_READ</tt>, but either before a call to <tt>device_seek_file</tt>, or after a call to <tt>device_read_block</tt> returns EOF (provided no intervening call to <tt>device_seek_file()</tt> or <tt>device_seek_block()</tt> since the EOF).
 
* <tt>INSIDE_FILE_READ</tt>: When in read mode, and while actually reading a file. This means after a call to <tt>device_seek_block()</tt>, but before a call to <tt>device_read_block()</tt> returns EOF.
 
* <tt>INSIDE_FILE_READ</tt>: When in read mode, and while actually reading a file. This means after a call to <tt>device_seek_block()</tt>, but before a call to <tt>device_read_block()</tt> returns EOF.
 +
Note that these access flags are not automatically enforced, and are currently never consulted.  They exist for future expansion of the properties interface.
  
Adding a new standard property requires modifying both <tt>property.h</tt> and <tt>property.c</tt>, as C code in <tt>property.c</tt> is required to register the properties and generate their IDs.  See those files for the symbolic names for the standard properties described above.
+
Device-specific properties should be defined in your own device, not in <tt>property.h</tt>. See the S3 device's properties for an example.
 
 
=== Using the Device class ===
 
To make use of the Device API, call one of the <tt>device_...</tt> functions defined in <tt>device-src/device.h</tt>. Functions of interest:
 
 
 
<pre>
 
/* This is how you get a new Device. Pass in a device name like
 
* file:/path/to/storage, and (assuming everything goes OK) you will get
 
* back a nice happy Device* that you can do operations on. Note that you
 
* must device_start() it before you can do anything besides talk about
 
* properties or read the label. */
 
Device* device_open (char * device_name);
 
 
 
/* This instructs the device to read the label on the current
 
  volume. It is called automatically after device_open() and before
 
  device_start(). You can call it yourself anytime between the
 
  two. It may return FALSE if no label could be read (as in the case
 
  of an unlabeled volume). */
 
gboolean        device_read_label (Device * self);
 
 
 
/* This tells the Device that it's OK to start reading and writing
 
* data. Before you call this, you can only call
 
* device_property_{get, set} and device_read_label. You can only call
 
* this a second time if you call device_finish() first.
 
*
 
* You should pass a label and timestamp if and only if you are
 
* opening in WRITE mode (not READ or APPEND). The label should be
 
* allocated with malloc(), and the Device will free() it on
 
* cleanup. The passed timestamp may be TIME_REPLACE, in which case
 
* it will be filled in with the current time. */
 
gboolean device_start (Device * self,
 
                                DeviceAccessMode mode, char * label,
 
                                time_t timestamp);
 
 
 
/* This undoes device_start, returning you to the NULL state. Do this
 
* if you want to (for example) change access mode.
 
*
 
* Note to subclass implementors: Call this function first from your
 
* finalization function. */
 
gboolean device_finish (Device * self);
 
 
 
/* But you can't write any data until you call this function, too.
 
* This function does not take ownership of the passed dumpfile_t; you must
 
* free it yourself. */
 
gboolean        device_start_file      (Device * self,
 
                                        const dumpfile_t * jobInfo);
 
 
 
guint          device_write_min_size  (Device * self);
 
guint          device_write_max_size  (Device * self);
 
 
 
/* Does what you expect. size had better be inside the block size
 
* range, or this function will write nothing.
 
*
 
* The short_block parameter needs some additional explanation: If
 
* short_block is set to TRUE, then this function will accept a write
 
* smaller than the minimum block size, subject to the following
 
* caveats:
 
* % The block may be padded with NULL bytes, which will be present on
 
*  restore.
 
* % device_write_block will automatically call device_finish_file()
 
*  after writing this short block.
 
* It is permitted to use short_block with a block that is not short;
 
* in this case, it is equivalent to calling device_write_block() and then
 
* calling device_finish_file(). */
 
gboolean device_write_block (Device * self,
 
                                        guint size,
 
                                        gpointer data,
 
                                        gboolean short_block);
 
 
 
/* This will drain the given fd (reading until EOF), and write the
 
* resulting data out to the device using maximally-sized blocks. */
 
gboolean device_write_from_fd (Device * self,
 
int fd);
 
 
 
/* Call this when you are finished writing a file. This function will
 
* write a filemark or the local equivalent, flush the buffers, and do
 
* whatever dirty work needs to be done at such a point in time. */
 
gboolean device_finish_file (Device * self);
 
 
 
/* For reading only: Seeks to the beginning of a particular
 
* filemark. You don't have to do this when writing; opening in
 
* ACCESS_WRITE will start you out at the first file, and opening in
 
* ACCESS_APPEND will automatically seek to the end of the medium.
 
*
 
* The returned dumpfile_t is yours to keep, at no extra charge. */
 
dumpfile_t* device_seek_file (Device * self,
 
guint file);
 
 
 
/* After you have called device_seek_file (and /only/ after having
 
* called device_seek_file), you can call this to seek to a particular
 
* block inside the file. It works like SEEK_SET, only in blocks. */
 
gboolean device_seek_block (Device * self,
 
guint64 block);
 
 
 
/* After you have called device_seek_file and/or device_seek_block,
 
* you can start calling this function. It always reads exactly one whole
 
* block at a time, however big that might be. You must pass in a buffer and
 
* specify its size. If the buffer is big enough, the read is
 
* performed, and both *size and the return value are equal to the
 
* number of bytes actually read. If the buffer is not big enough, then
 
* no read is performed, the function returns 0, and *size is set
 
* to the minimum buffer size required to read the next block. If an
 
* error occurs, the function returns -1  and *size is left unchanged.
 
*
 
* It is not an error if buffer == NULL and *size == 0. This should be
 
* treated as a query as to the possible size of the next block. */
 
int device_read_block (Device * self,
 
                                gpointer buffer,
 
                                int * size);
 
 
 
/* This is the reading equivalent of device_write_from_fd(). It will
 
* read from the device from the current location until end of file,
 
* and drains the results out into the specified fd. Returns FALSE if
 
* there is a problem writing to the fd. */
 
gboolean device_read_to_fd (Device * self,
 
int fd);
 
 
 
/* Returns TRUE if the last device_read_block returned -1 because
 
* there is no data left to read. */
 
gboolean        device_is_eof          (Device * self);
 
 
 
/* This function tells you what properties are supported by this
 
* device, and when you are allowed to get and set them. The return
 
* value is an array of DeviceProperty structs. The last struct in
 
* the array is zeroed, so you know when the end is (check the
 
* pointer element "base"). The return value from this function on any
 
* given object (or physical device) should be invariant. */
 
const DeviceProperty * device_property_get_list (Device * self);
 
 
 
/* These functions get or set a particular property. The val should be
 
* compatible with the DevicePropertyBase associated with the given
 
* DevicePropertyId, and this function should only be called when
 
* DeviceProperty.access says it is OK. Otherwise you will get an
 
* error and not the tasty property action you wanted. */
 
gboolean device_property_get (Device * self,
 
DevicePropertyId id,
 
GValue * val);
 
gboolean device_property_set (Device * self,
 
DevicePropertyId id,
 
GValue * val);
 
 
 
/* On devices that support it (check PROPERTY_PARTIAL_DELETION),
 
* this will free only the space associated with a particular file.
 
* This way, you can apply a different retention policy to every file
 
* on the volume, appending new data at the end and recycling anywhere
 
* in the middle -- even simultaneously (via different Device
 
* handles)! Note that you generally can't recycle a file that is presently in
 
* use (being read or written).
 
*
 
* To use this, open the device in ACCESS_APPEND mode. But you don't
 
* have to call device_start_file(), unless you want to write some
 
* data, too. */
 
gboolean device_recycle_file (Device * self,
 
guint filenum);
 
</pre>
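The buffer-size negotiation described for <tt>device_read_block()</tt> above can be sketched with a small stand-alone mock. Everything named <tt>MockDevice</tt>, <tt>mock_read_block()</tt> and <tt>read_one_block()</tt> is hypothetical and exists only for illustration; the mock imitates the documented contract (return 0 and set <tt>*size</tt> when the buffer is too small, return the byte count on success, return -1 on error or EOF) and is not Amanda code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical mock: a "device" holding exactly one block. */
typedef struct {
    const char *data;   /* contents of the next block */
    int block_size;     /* size of that block in bytes */
    int consumed;       /* nonzero once the block has been read */
} MockDevice;

/* Imitates the documented contract of device_read_block(). */
static int mock_read_block(MockDevice *dev, void *buffer, int *size) {
    if (dev->consumed)
        return -1;                      /* nothing left to read */
    if (buffer == NULL || *size < dev->block_size) {
        *size = dev->block_size;        /* report the required size */
        return 0;                       /* no read was performed */
    }
    memcpy(buffer, dev->data, (size_t)dev->block_size);
    *size = dev->block_size;
    dev->consumed = 1;
    return dev->block_size;
}

/* Caller side: query the size, allocate, then read for real.
 * Returns a malloc()ed buffer the caller must free(), or NULL. */
static void *read_one_block(MockDevice *dev, int *out_size) {
    int size = 0;
    if (mock_read_block(dev, NULL, &size) != 0)
        return NULL;                    /* error or EOF on the query */
    void *buf = malloc((size_t)size);
    if (buf == NULL)
        return NULL;
    int rc = mock_read_block(dev, buf, &size);
    if (rc <= 0) {
        free(buf);
        return NULL;
    }
    *out_size = rc;
    return buf;
}
```

The first call passes <tt>buffer == NULL</tt> and <tt>*size == 0</tt>, which the comment above describes as a size query rather than an error; a caller that already holds a large enough buffer can skip the query and read directly.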
 
 
 
=== Implementing a device ===
 
To add a new device to the Device API, do the following:
 
* Implement a subclass of one of the existing <tt>Device</tt> classes named above. In particular, implement the various virtual functions provided for in <tt>device.h</tt>.
 
** Using the <tt>FdDevice</tt> class for devices that have a UNIX file descriptor will spare you from implementing <tt>device_read_block()</tt> and <tt>device_write_block()</tt>, but there are other virtual functions in FdDevice that you (may) need to reimplement.
 
** Note that in addition to the (mostly virtual) functions discussed above, there are some additional protected functions that require implementation.
 
* Make sure your device calls <tt>device_add_property()</tt> to register both its dynamic and static properties.
 
* Arrange to call the <tt>register_device()</tt> function to register the device, so it will receive relevant calls to <tt>device_open()</tt>. The easiest way to do this is to write an initialization function, and call it from <tt>device_api_init()</tt>.
 
  
Users new to GLib will need to read up on the GObject system first, to see how GLib provides virtual abstract classes in C and how to implement a new subclass. Additionally, the existing device drivers may serve as a useful template.
+
= Future Directions =
 +
Future directions for the Device API include:
 +
* new devices
 +
* runtime loading of device modules as shared objects
 +
* more flexible device configuration

Latest revision as of 14:57, 28 September 2010

The Device API is a clean interface between Amanda and data-storage systems. It provides a tape-like model -- a sequence of bytestreams on each volume, identified only by their on-volume file number -- even for non-tape devices.

Background

Device API Features

The Device API also adds a number of new features not previously available in Amanda. These include:

  • Device properties allow drivers to describe themselves (and the devices and media they control), as well as allow arbitrary user settings to propagate down to the driver.
  • Smart locking means unrelated accesses can be performed without issue, while conflicting accesses wait for one another.
  • Appending to volumes is supported for capable devices.
  • A device can describe its supported blocksize range to the Amanda core, instead of the other way around.
  • Deleting parts of volumes (without erasing the whole thing) is supported for capable devices.
  • The amount of free space on a device can be reported to the Amanda core, where the device supports it (e.g., VFS Device)

History

The Device API is designed to replace the ancient tapeio subsystem (sometimes called the vtape API or Virtual Tape API), originally introduced to support virtual tapes. The original design of the tapeio system (located in tape-src/) was to abstract tape-related functionality into a separate library. Thus, different devices (tape, vtape and RAIT) were all phrased in terms of tape operations: Rewind, fast-forward, read a block, etc. Operations were assumed to have the same semantics as a UNIX tape device. Furthermore, the API revealed a file descriptor, with the assumption that other parts of Amanda could perform operations such as stat() or dup2() on it.

The Device API clears up a large number of limitations and device assumptions that the tapeio system made, allowing native support for new devices (such as CDs and DVDs) as well as new device-related functionality (such as parallel access, partial recycling, and appending).

The Device API does not change (or even address) media formats.

What the Device API is not

The Device API does not distinguish between random-access and linear-access media: The seek operation may take a long time for some devices, or it may be instantaneous for others. It does, however, distinguish between concurrent devices and exclusive devices: Concurrent devices may be accessed by multiple readers (and sometimes multiple writers) simultaneously, while exclusive devices cannot.

Although the Device API deals in on-medium headers and blocks, it is otherwise agnostic to media format issues. Changes to the Amanda media format, including split vs traditional formats, can be made without change to the Device API.

Using the Device API

The Device API is primarily intended for use from Perl code. See Amanda::Device for a detailed description of the interface.

Implementing a Device Driver

To add a new device to the Device API, do the following:

  • Subscribe to the amanda-hackers mailing list and/or connect to #amanda and solicit the help of existing developers.
  • Figure out how the data model and operational model described above will map onto your device.
  • Implement a subclass of one of the existing Device classes named above. In particular, implement the various virtual functions provided for in device.h.
    • Consult the existing device implementations for hints -- in particular, the VFS Device and S3 Device make good examples.
    • Note that in addition to the (mostly virtual) functions discussed above, there are some additional protected functions that you may find useful, described in device.h.
  • Make sure your device calls device_add_property() to register its properties.
  • Arrange to call the register_device() function to register the device, so it will receive relevant calls to device_open(). The easiest way to do this is to write an initialization function, and add a call to your function from device_api_init() in device.c. Dynamically loadable device drivers are an open project.
  • Add documentation of your device to the amanda-devices(7) manpage.
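The registration step in the list above can be pictured with a stand-alone sketch: drivers register a factory under a name prefix, and an open call picks the factory whose prefix matches the device name. All identifiers starting with <tt>sketch_</tt> or <tt>Sketch</tt> are hypothetical, as is the pairing of the prefix "file" with a driver called "vfs" in the usage; the real register_device() and device_open() signatures live in device.h and may differ.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a Device object (the real one is a GObject). */
typedef struct {
    const char *driver;  /* which driver produced this device */
    const char *target;  /* the part of the device name after the colon */
} SketchDevice;

typedef SketchDevice (*SketchFactory)(const char *target);

typedef struct {
    const char *prefix;      /* e.g. "file" in "file:/path/to/storage" */
    SketchFactory factory;
} SketchDriver;

#define SKETCH_MAX_DRIVERS 8
static SketchDriver sketch_drivers[SKETCH_MAX_DRIVERS];
static size_t sketch_driver_count = 0;

/* Register a factory for a prefix. Returns 0, or -1 if the table is full. */
static int sketch_register_device(const char *prefix, SketchFactory factory) {
    if (sketch_driver_count == SKETCH_MAX_DRIVERS)
        return -1;
    sketch_drivers[sketch_driver_count].prefix = prefix;
    sketch_drivers[sketch_driver_count].factory = factory;
    sketch_driver_count++;
    return 0;
}

/* Split "prefix:target" and dispatch to the matching factory.
 * Returns 0 and fills *out on success, -1 if no driver matches. */
static int sketch_device_open(const char *device_name, SketchDevice *out) {
    const char *colon = strchr(device_name, ':');
    if (colon == NULL)
        return -1;
    size_t prefix_len = (size_t)(colon - device_name);
    for (size_t i = 0; i < sketch_driver_count; i++) {
        const char *p = sketch_drivers[i].prefix;
        if (strlen(p) == prefix_len && strncmp(p, device_name, prefix_len) == 0) {
            *out = sketch_drivers[i].factory(colon + 1);
            return 0;
        }
    }
    return -1;
}

/* Example factory a driver's initialization function might register. */
static SketchDevice sketch_vfs_factory(const char *target) {
    SketchDevice d = { "vfs", target };
    return d;
}
```

In this sketch, a driver's initialization function would call sketch_register_device("file", sketch_vfs_factory) once at startup, after which sketch_device_open("file:/path/to/storage", &dev) yields a device whose target is "/path/to/storage".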

The Device API relies heavily on GLib's type system, so developers unfamiliar with the GObject system are encouraged to consult the relevant documentation.

Properties Interface

Most code related to device-independent properties is in property.h and property.c. Since property values are passed around as a GValue, all value types must be registered in the GLib type system.

Every abstract property has a DevicePropertyBase, which refers to a DevicePropertyId. The Base structure is the same for all properties of all devices; it holds information about the property outside other context. Each property has a unique DevicePropertyId, though the ID of a property is only guaranteed within a single run of a program: Between runs, IDs might change, so it's best to refer to properties by name outside of a particular program.
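The point about IDs being valid only within one run can be illustrated with a toy registry that hands out IDs in registration order, which is why a name is the safer long-lived handle. This is a sketch under assumptions: the <tt>sketch_</tt> functions, the <tt>SketchPropertyBase</tt> layout and the property names used with it are hypothetical and do not reproduce property.c.

```c
#include <assert.h>
#include <string.h>

#define SKETCH_MAX_PROPS 16

/* Hypothetical analogue of a property base record. */
typedef struct {
    int id;            /* assigned at registration; not stable across runs */
    const char *name;  /* stable identifier, safe to store outside the program */
} SketchPropertyBase;

static SketchPropertyBase sketch_props[SKETCH_MAX_PROPS];
static int sketch_prop_count = 0;

/* Register a property by name and return its ID. Registering the same
 * name twice returns the existing ID. Returns -1 if the table is full. */
static int sketch_register_property(const char *name) {
    for (int i = 0; i < sketch_prop_count; i++)
        if (strcmp(sketch_props[i].name, name) == 0)
            return sketch_props[i].id;
    if (sketch_prop_count == SKETCH_MAX_PROPS)
        return -1;
    sketch_props[sketch_prop_count].id = sketch_prop_count + 1; /* 0 is reserved as "invalid" */
    sketch_props[sketch_prop_count].name = name;
    return sketch_props[sketch_prop_count++].id;
}

/* Look a property up by its stable name; returns NULL if unknown. */
static const SketchPropertyBase *sketch_lookup_by_name(const char *name) {
    for (int i = 0; i < sketch_prop_count; i++)
        if (strcmp(sketch_props[i].name, name) == 0)
            return &sketch_props[i];
    return NULL;
}
```

Because the ID depends on registration order, a configuration file or log written by one run should record the name and resolve it to an ID again at the next startup.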

When a device starts up, it creates a DeviceProperty structure for each supported property. This structure refers to the DevicePropertyBase, but also holds information about when the property may be accessed: Not all properties may be set or gotten at any time. The PropertyAccessFlags field access holds this information: it is the bitwise OR of any of the various PROPERTY_ACCESS_GET_ and PROPERTY_ACCESS_SET_ values defined. Note that there are five distinct "time periods" as far as access is concerned:

  • BEFORE_START: Before device_start() has been called; i.e., before any permanent action may have been taken.
  • BETWEEN_FILE_WRITE: When in write mode, but not actually writing a file. Specifically, after device_start() has been called with ACCESS_WRITE or ACCESS_APPEND, but outside of a pair of calls to device_start_file() and device_finish_file().
  • INSIDE_FILE_WRITE: In the middle of writing a file. Specifically, after device_start_file() has been called, but before the corresponding call to device_finish_file().
  • BETWEEN_FILE_READ: When in read mode, but not actually reading a file. This means after device_start() has been called with ACCESS_READ, but either before a call to device_seek_file, or after a call to device_read_block returns EOF (provided no intervening call to device_seek_file() or device_seek_block() since the EOF).
  • INSIDE_FILE_READ: When in read mode, and while actually reading a file. This means after a call to device_seek_block(), but before a call to device_read_block() returns EOF.

Note that these access flags are not automatically enforced, and are currently never consulted. They exist for future expansion of the properties interface.
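Since the flags are not enforced by the API itself, a caller that wanted to honour them would have to test the bitmask on its own. The following is a minimal sketch of such a check, assuming one GET bit and one SET bit per time period; the <tt>Sketch</tt> and <tt>sketch_</tt> names and the bit layout are hypothetical and are not the values defined in property.h.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical phases mirroring the five time periods listed above. */
typedef enum {
    PHASE_BEFORE_START,
    PHASE_BETWEEN_FILE_WRITE,
    PHASE_INSIDE_FILE_WRITE,
    PHASE_BETWEEN_FILE_READ,
    PHASE_INSIDE_FILE_READ,
    PHASE_COUNT
} SketchPhase;

/* Bitmask with one GET bit and one SET bit per phase (assumed layout). */
typedef unsigned int SketchAccessFlags;

static SketchAccessFlags sketch_get_flag(SketchPhase p) {
    return 1u << p;
}

static SketchAccessFlags sketch_set_flag(SketchPhase p) {
    return 1u << (PHASE_COUNT + p);
}

/* May the property be read in the current phase? */
static bool sketch_may_get(SketchAccessFlags access, SketchPhase now) {
    return (access & sketch_get_flag(now)) != 0;
}

/* May the property be changed in the current phase? */
static bool sketch_may_set(SketchAccessFlags access, SketchPhase now) {
    return (access & sketch_set_flag(now)) != 0;
}
```

A property that is readable at any time but settable only before the device is started would carry the OR of all five GET bits plus the single SET bit for PHASE_BEFORE_START.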

Device-specific properties should be defined in your own device, not in property.h. See the S3 device's properties for an example.

Future Directions

Future directions for the Device API include:

  • new devices
  • runtime loading of device modules as shared objects
  • more flexible device configuration