Driver-Taper protocol: Difference between revisions

From wiki.zmanda.com
(→‎taper reply: missing taper reply)
 
(25 intermediate revisions by 2 users not shown)
Line 5: Line 5:


* driver initialises taper with: START-TAPER <timestamp> to which taper replies with: TAPE-OK or, for fatal errors, with: TAPE-ERROR [<message>]. For speed up,taper is authorized to scan the library and find a valid tape after it received the START-TAPER command.
* driver initialises taper with: START-TAPER <timestamp> to which taper replies with: TAPE-OK or, for fatal errors, with: TAPE-ERROR [<message>]. For speed up,taper is authorized to scan the library and find a valid tape after it received the START-TAPER command.


* driver can ask taper to copy a file from the holding disk to tape (FILE-WRITE) or directly from a dumper (PORT-WRITE) or exit at the end of the run (QUIT).
* driver can ask taper to copy a file from the holding disk to tape (FILE-WRITE) or directly from a dumper (PORT-WRITE) or exit at the end of the run (QUIT).


* taper responds to the PORT-WRITE command with: PORT <port> which driver should then hand on to dumper in a PORT-DUMP command.
* taper responds to the PORT-WRITE command with:
**PORT <hdr-port> <ip-port-pairs> which the driver should then hand on to dumper in a PORT-DUMP command


* if taper has no tape in use, it reply with REQUEST-NEW-TAPE.
* if taper has no tape in use, it reply with REQUEST-NEW-TAPE.
Line 77: Line 77:
!Driver request!! !!Taper reply!!Description
!Driver request!! !!Taper reply!!Description
|-
|-
|initail setup
|initiall setup
|-
|-
|START-TAPER||--->|| ||  
|START-TAPER||--->|| ||  
|-
|-
|||<---||TAPE-OK||normal taper setup (a tape is available, nothing is written on it)
|||<---||TAPER-OK||normal taper setup (a tape is available, nothing is written on it)
|-
|-
|||<---||TAPE-ERROR||failed taper setup (no tape are available or something else is broken)
|||<---||TAPE-ERROR||failed taper setup (no tape are available or something else is broken)
Line 97: Line 97:
|PORT-WRITE||--->|| ||
|PORT-WRITE||--->|| ||
|-
|-
|||<---||PORT||
|||<---||PORT||(taper accepts a connection on the hdr-port, reads the full header, closes the connection, and then accepts a connection on one of the directtcp ip:ports)
|-
|-
|||<---||REQUEST-NEW-TAPE||(Optional) ask driver if we should start this dump on a new tape. (driver will send a NEW-TAPE or NO-NEW-TAPE command)
|||<---||REQUEST-NEW-TAPE||(Optional) ask driver if we should start this dump on a new tape. (driver will send a NEW-TAPE or NO-NEW-TAPE command)
Line 135: Line 135:
|NEW-TAPE||--->||||driver want the taper to use a new tape
|NEW-TAPE||--->||||driver want the taper to use a new tape
|-
|-
|||<---||GOT-NEW-TAPE||taper found one
|||<---||NEW-TAPE||taper found one
|-
|-
|||<---||NO-NEW-TAPE||no tape are available
|||<---||NO-NEW-TAPE||no tape are available
Line 141: Line 141:
|quit
|quit
|-
|-
|QUIT||--->||||
|QUIT||--->||||taper quits
|-
|||<---||QUITING||
|}
|}


Line 153: Line 151:
|-
|-
|<timestamp>||Time as "yymmdd" of "yymmddhhmmss"
|<timestamp>||Time as "yymmdd" of "yymmddhhmmss"
|-
|<worker_name>||The name of a worker
|-
|-
|<handle>||Request ID
|<handle>||Request ID
Line 166: Line 166:
|<level>|| Dump level being used for backup
|<level>|| Dump level being used for backup
|-
|-
|splitsize|| size of each part on tape
|splitsize|| size of each part on tape '''in bytes'''
|-
|-
|split_diskbuffer|| file use to buffer a complete part
|split_diskbuffer|| directory in which to create disk buffer files, or the literal string "NULL" if none is specified
|-
|-
|<message>|| Error or Status message
|<message>|| Error or Status message
Line 174: Line 174:


== driver command ==
== driver command ==
* START-TAPER timestamp
* START-TAPER worker_name timestamp
* PORT-WRITE handle hostname diskname level datestamp splitsize split_diskbuffer fallback_splitsize
* PORT-WRITE worker_name handle hostname diskname level datestamp dle_tape_splitsize dle_split_diskbuffer dle_fallback_splitsize dle_allow_split part_size part_cache_type part_cache_dir part_cache_max_size datapath
* FILE-WRITE handle filename hostname diskname level datestamp splitsize
** if there is no split_diskbuffer, then it is specified as the string "NULL"
* DONE handle (result from dumper send to taper)
** splitsize and fallback_splitsize, both in kb, come directly from the dumptype
* FAILED handle (result from dumper send to taper)
* FILE-WRITE worker_name handle filename hostname diskname level datestamp dle_tape_splitsize dle_split_diskbuffer dle_fallback_splitsize dle_allow_split part_size part_cache_type part_cache_dir part_cache_max_size orig_kb
* NEW-TAPE handle
* START-SCAN worker_name handle
* NO-NEW-TAPE handle
* TAKE-SCRIBE-FROM worker_name handle from_worker_name
* DONE worker_name handle (result from dumper send to taper)
* FAILED worker_name handle (result from dumper send to taper)
* NEW-TAPE worker_name
* NO-NEW-TAPE worker_name "reason"
* QUIT
* QUIT
handles are unique to each DLE during a run, but since the taper only operates on a single file at a time, they are generally useless to the taper except in constructing syntactically correct responses to the driver.
the dle_* and part_* parameters are taken directly from the corresponding dumptype and tapetype configuration parameters


== taper reply ==
== taper reply ==
* TAPER-OK
* TAPER-OK worker_name
* TAPE-ERROR [handle] "error-message"
* TAPE-ERROR worker_name "error-message"
** handle is specified only in some circumstances
* PARTIAL handle INPUT-*    TAPE-*    "[sec %f kb %d kps %f]" "input-error-message" "tape-error-message"
* PARTIAL handle INPUT-*    TAPE-*    "[sec %f kb %d kps %f]" "input-error-message" "tape-error-message"
** The statistic are the sum of all successful part (PARTDONE), it should not include the failed part.
** The statistic are the sum of all successful part (PARTDONE), it should not include the failed part.
Line 194: Line 201:
* NEW-TAPE handle label
* NEW-TAPE handle label
* NO-NEW-TAPE handle
* NO-NEW-TAPE handle
* PARTDONE handle label fileno "[sec %f kb %d kps %f]"
* PARTDONE handle label fileno kb "[sec %f kb %d kps %f]"
** yep, the kb are specified twice
* REQUEST-NEW-TAPE handle
* REQUEST-NEW-TAPE handle
* PORT worker_name handle hdr-port ip-port-pairs
* BAD-COMMAND "error message"
* DUMPER-STATUS handle
* DUMPER-STATUS handle
* PORT port
* QUITTING
* BAD-COMMAND "error message"


== LOG ==
== LOG ==
Line 210: Line 217:
** DONE taper hostname diskname timestamp totpart level [sec %f kb %d kps %f]
** DONE taper hostname diskname timestamp totpart level [sec %f kb %d kps %f]
*** The statistic is the sum of all successful part (PART), it doesn't include PARTPARTIAL
*** The statistic is the sum of all successful part (PART), it doesn't include PARTPARTIAL
*** This indicates that, from the taper's perspective, the dump is completely done and should not be retried; this is the case even if the dumped file was partial
** PARTIAL taper hostname diskname timestamp totpart level [sec %f kb %d kps %f] "error message"
** PARTIAL taper hostname diskname timestamp totpart level [sec %f kb %d kps %f] "error message"
*** The statistic is the sum of all successful part (PART), it doesn't include PARTPARTIAL
*** The statistic is the sum of all successful part (PART), it doesn't include PARTPARTIAL
*** Some of the data was sent to the device, but not all of it; the driver will likely retry the dump
** FAILED taper hostname diskname timestamp level "error message"
** FAILED taper hostname diskname timestamp level "error message"
 
*** None of the data was sent to the device; the driver will likely retry the dump
= algorithm after a part written to tape =
  if (successful part) {
    L_PART -> logfile
    if (split dump) {
      PARTDONE -> driver
    }
    if (not split dump || last part) {
      if (PORT-WRITE) {
        DUMPER-STATUS -> driver
        get dumper result from driver (FAILED|DONE)
      }
      if (partial) {  # from dumper status or partial flag in header
        PARTIAL INPUT-GOOD TAPE-GOOD -> driver
        L_PARTIAL -> logfile
      } else {  # whole
        DONE INPUT-GOOD TAPE-GOOD -> driver
        L_SUCCESS -> logfile
      }
    } else {  # split dump, more parts to write
      GOTO next part
    }
  } else {  # failed part
    if (something written to tape for this part) {
      L_PARTPARTIAL -> logfile
    }
    if (split dump && tape error && no input error) {
      SPLIT_NEEDNEXT -> driver
      L_INFO -> logfile
      # wait for driver NEW-TAPE or NO-NEW-TAPE
      # it could take a long time to get the result
      if (driver sent NEW-TAPE) {
        if (new tape available) {
          NEW-TAPE -> driver
          GOTO retry that part on the new tape
        } else {  # no valid writable tape found
          NO-NEW-TAPE -> driver
        }
      } else {  # driver sent NO-NEW-TAPE
        noop
      }
    }
    # it's possible to have both INPUT-* and TAPE-* errors, or none
    if (nothing written to tape for this dump) {
      FAILED INPUT-* TAPE-* -> driver
      L_FAIL -> logfile
    } else {  # something written to tape for this dump
      PARTIAL INPUT-* TAPE-* -> driver
      L_PARTIAL -> logfile
    }
  }

Latest revision as of 16:00, 16 September 2010

Communication method

driver talks via two pipes connected to taper's stdin and stdout. The commands and responses are plain text.
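
The pipe arrangement can be sketched as follows. This is only an illustration, not the driver's actual code: the child process is a stand-in script (FAKE_TAPER, a made-up name) that answers every line with a fixed reply, and the helper names are hypothetical. The real exchange is also not strictly one reply per command.

```python
import subprocess
import sys

# Stand-in for taper (an assumption for illustration only): it answers every
# command line with a fixed TAPER-OK reply so the sketch is runnable anywhere.
FAKE_TAPER = (
    "import sys\n"
    "for line in sys.stdin:\n"
    "    sys.stdout.write('TAPER-OK worker0\\n')\n"
    "    sys.stdout.flush()\n"
)


def start_fake_taper():
    """Spawn the stand-in with pipes on its stdin and stdout, in text mode."""
    return subprocess.Popen(
        [sys.executable, "-c", FAKE_TAPER],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
        bufsize=1,  # line buffered, since the protocol is line oriented
    )


def exchange(proc, command):
    """Write one plain-text command line and read back one reply line."""
    proc.stdin.write(command + "\n")
    proc.stdin.flush()
    return proc.stdout.readline().rstrip("\n")


def stop(proc):
    """Close the command pipe, wait for the child to exit, return its status."""
    proc.stdin.close()
    status = proc.wait()
    proc.stdout.close()
    return status
```

Flushing after every write matters here: without it a command can sit in the pipe buffer while both sides wait for each other.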

Command sequence during backup operation

  • driver initialises taper with: START-TAPER <worker_name> <timestamp>, to which taper replies with TAPER-OK or, for fatal errors, with TAPE-ERROR [<message>]. To speed things up, taper is allowed to scan the library and find a valid tape as soon as it has received the START-TAPER command.
  • driver can ask taper to copy a file from the holding disk to tape (FILE-WRITE), to write directly from a dumper (PORT-WRITE), or to exit at the end of the run (QUIT).
  • taper responds to the PORT-WRITE command with:
    • PORT <hdr-port> <ip-port-pairs>, which the driver should then hand on to dumper in a PORT-DUMP command
  • If taper has no tape in use, it replies with REQUEST-NEW-TAPE.
  • If the copy to tape finishes correctly, taper replies with DONE.
  • If something goes wrong with the tape, taper can ask to continue on a new tape with REQUEST-NEW-TAPE (for split dumps only), or it can abort that dump with a PARTIAL or FAILED reply.
  • After any dump that finished in PARTIAL or FAILED with a TAPE-ERROR, the driver must send a NEW-TAPE before issuing another *-WRITE command.
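
The last rule in the list above (a NEW-TAPE is required after a PARTIAL or FAILED reply carrying TAPE-ERROR) can be sketched as a small guard on the driver side. The class and method names are hypothetical, and the token positions assume the reply layout "PARTIAL handle INPUT-* TAPE-* ..." given in the taper reply section.

```python
class TapeGate:
    """Tracks whether the driver may issue another *-WRITE command."""

    def __init__(self):
        self.needs_new_tape = False

    def on_taper_reply(self, line):
        # Assumed layout: PARTIAL|FAILED handle INPUT-* TAPE-* ...
        tokens = line.split()
        if (
            len(tokens) > 3
            and tokens[0] in ("PARTIAL", "FAILED")
            and tokens[3] == "TAPE-ERROR"
        ):
            self.needs_new_tape = True

    def on_driver_command(self, line):
        # Called before the driver sends a command; raises if the rule is broken.
        name = line.split()[0]
        if name == "NEW-TAPE":
            self.needs_new_tape = False
        elif name.endswith("-WRITE") and self.needs_new_tape:
            raise RuntimeError("send NEW-TAPE before another *-WRITE command")
```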

Other commands

  • If driver says something that taper doesn't recognise it responds with: BAD-COMMAND <message>
  • taper responds to the QUIT command with: QUITTING

Protocol command reference

Taper command reply

Reply             Description
TAPER-OK          taper is set up correctly
TAPE-ERROR        Error in the setup of the taper
PORT              Reply sent in response to the PORT-WRITE command
DONE              Full dump is on the media
PARTIAL           Dump was partially written to the media
FAILED            Nothing was written to the media
DUMPER-STATUS     Ask driver to send the dumper status
PARTDONE          A complete part is written to tape
REQUEST-NEW-TAPE  Ask driver if it should use a new tape
NEW-TAPE          Will continue on a newly found tape
NO-NEW-TAPE       Will not continue on a new tape


DONE, PARTIAL and FAILED replies also contain an INPUT-* and a TAPE-* message to denote any error:

Reply             Description
INPUT-GOOD        There was no error with the input.
INPUT-ERROR       There was an error with the input.

Reply             Description
TAPE-GOOD         This tape can be used to write another dump.
TAPE-ERROR        Nothing more can fit on that tape.

A result for a successful dump should be: DONE INPUT-GOOD TAPE-GOOD
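
A minimal sketch of how a driver-side reader might classify these final results. The function and type names are hypothetical, and shlex is used on the assumption that taper's quoting is close enough to POSIX shell quoting, which is not confirmed by this page.

```python
import shlex
from typing import NamedTuple


class FinalResult(NamedTuple):
    kind: str        # DONE, PARTIAL or FAILED
    handle: str
    input_ok: bool   # True for INPUT-GOOD
    tape_ok: bool    # True for TAPE-GOOD


def parse_final_result(line):
    # Assumption: quoted arguments can be split with shell-like rules.
    tokens = shlex.split(line)
    if len(tokens) < 4:
        raise ValueError("too few fields: %r" % line)
    kind, handle, input_status, tape_status = tokens[:4]
    if kind not in ("DONE", "PARTIAL", "FAILED"):
        raise ValueError("not a final result: %r" % line)
    if input_status not in ("INPUT-GOOD", "INPUT-ERROR"):
        raise ValueError("bad input status: %r" % input_status)
    if tape_status not in ("TAPE-GOOD", "TAPE-ERROR"):
        raise ValueError("bad tape status: %r" % tape_status)
    return FinalResult(
        kind, handle, input_status == "INPUT-GOOD", tape_status == "TAPE-GOOD"
    )


def is_clean_success(result):
    """True only for DONE INPUT-GOOD TAPE-GOOD."""
    return result.kind == "DONE" and result.input_ok and result.tape_ok
```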

Driver/Taper Requests/Replies

Driver request        Taper reply                Description

initial setup
START-TAPER     --->
                <---  TAPER-OK                   normal taper setup (a tape is available, nothing is written on it)
                <---  TAPE-ERROR                 failed taper setup (no tapes are available or something else is broken)

file-write setup
FILE-WRITE      --->
                <---  REQUEST-NEW-TAPE           (optional) ask driver if we should start this dump on a new tape (driver will send a NEW-TAPE or NO-NEW-TAPE command)
Will get one part-result for each part written and one global result

port-write setup
PORT-WRITE      --->
                <---  PORT                       (taper accepts a connection on the hdr-port, reads the full header, closes the connection, and then accepts a connection on one of the directtcp ip:ports)
                <---  REQUEST-NEW-TAPE           (optional) ask driver if we should start this dump on a new tape (driver will send a NEW-TAPE or NO-NEW-TAPE command)
Will continue with one part-result for each part written and one global result

global-result
Can ask-dumper-status (should be done if taper detects no error, i.e. INPUT-GOOD and TAPE-GOOD)
                <---  DONE INPUT-GOOD TAPE-GOOD  normal protocol for a success
                <---  PARTIAL INPUT-* TAPE-*     protocol for an error in the data phase (something written to tape)
                <---  FAILED INPUT-* TAPE-*      protocol for an error in setup (before something is written to tape)

ask-dumper-status
                <---  DUMPER-STATUS              ask driver for the dumper status
DONE            --->                             dumper succeeded
FAILED          --->                             dumper failed

part-result
                <---  PARTDONE                   part successfully written to tape
                <---  REQUEST-NEW-TAPE           we got a tape error; ask the driver if we can use a new tape (driver will send a NEW-TAPE or NO-NEW-TAPE command)

driver doesn't want the taper to use a new tape
NO-NEW-TAPE     --->                             driver doesn't want the taper to use a new tape (continue with global-result)

driver wants the taper to use a new tape
NEW-TAPE        --->                             driver wants the taper to use a new tape
                <---  NEW-TAPE                   taper found one
                <---  NO-NEW-TAPE                no tapes are available

quit
QUIT            --->                             taper quits
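
The two points in this table where taper asks the driver a question (DUMPER-STATUS and REQUEST-NEW-TAPE) can be sketched as one dispatch function. This is a sketch under stated assumptions: the function name, its parameters and the default reason text are made up, and the command layouts are taken from the "driver command" list on this page.

```python
def driver_response(reply_line, worker_name, dumper_ok, allow_new_tape,
                    reason="no new tape allowed"):
    """Return the command the driver sends back, or None if no answer is due."""
    tokens = reply_line.split()
    if not tokens:
        return None
    name = tokens[0]
    if name == "DUMPER-STATUS" and len(tokens) > 1:
        # Taper wants the dumper's result for this handle.
        handle = tokens[1]
        verdict = "DONE" if dumper_ok else "FAILED"
        return f"{verdict} {worker_name} {handle}"
    if name == "REQUEST-NEW-TAPE":
        # Taper asks whether it may use a new tape.
        if allow_new_tape:
            return f"NEW-TAPE {worker_name}"
        return f'NO-NEW-TAPE {worker_name} "{reason}"'
    # PARTDONE, DONE, PARTIAL, FAILED and the rest need no reply here.
    return None
```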

Command/reply arguments

Protocol data

data              description
<timestamp>       Time as "yymmdd" or "yymmddhhmmss"
<worker_name>     The name of a worker
<handle>          Request ID
<filename>        Name of file on the holding disk where backup will be written to
<port>            Taper port to send the backup data to
<host>            Hostname of the client
<disk>            Disk on the client being backed up
<level>           Dump level being used for backup
splitsize         size of each part on tape in bytes
split_diskbuffer  directory in which to create disk buffer files, or the literal string "NULL" if none is specified
<message>         Error or Status message

driver command

  • START-TAPER worker_name timestamp
  • PORT-WRITE worker_name handle hostname diskname level datestamp dle_tape_splitsize dle_split_diskbuffer dle_fallback_splitsize dle_allow_split part_size part_cache_type part_cache_dir part_cache_max_size datapath
    • if there is no split_diskbuffer, then it is specified as the string "NULL"
    • splitsize and fallback_splitsize, both in kb, come directly from the dumptype
  • FILE-WRITE worker_name handle filename hostname diskname level datestamp dle_tape_splitsize dle_split_diskbuffer dle_fallback_splitsize dle_allow_split part_size part_cache_type part_cache_dir part_cache_max_size orig_kb
  • START-SCAN worker_name handle
  • TAKE-SCRIBE-FROM worker_name handle from_worker_name
  • DONE worker_name handle (result from dumper send to taper)
  • FAILED worker_name handle (result from dumper send to taper)
  • NEW-TAPE worker_name
  • NO-NEW-TAPE worker_name "reason"
  • QUIT

Handles are unique to each DLE during a run, but since the taper only operates on a single file at a time, they are generally useless to the taper except in constructing syntactically correct responses to the driver.

The dle_* and part_* parameters are taken directly from the corresponding dumptype and tapetype configuration parameters.
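
Because PORT-WRITE carries fifteen positional arguments, keeping them in a named record helps avoid ordering mistakes. The sketch below follows the field order listed above; the class name is hypothetical, the field types are guesses, and it assumes no argument contains whitespace, since this page does not describe argument quoting.

```python
from dataclasses import astuple, dataclass, replace
from typing import Optional


@dataclass
class PortWrite:
    # Field order mirrors the PORT-WRITE argument list on this page.
    worker_name: str
    handle: str
    hostname: str
    diskname: str
    level: int
    datestamp: str
    dle_tape_splitsize: int
    dle_split_diskbuffer: Optional[str]  # None means "not configured"
    dle_fallback_splitsize: int
    dle_allow_split: int
    part_size: int
    part_cache_type: str
    part_cache_dir: str
    part_cache_max_size: int
    datapath: str

    def to_line(self):
        # A missing split_diskbuffer is sent as the literal string "NULL".
        buf = self.dle_split_diskbuffer or "NULL"
        values = astuple(replace(self, dle_split_diskbuffer=buf))
        return "PORT-WRITE " + " ".join(str(v) for v in values)
```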

taper reply

  • TAPER-OK worker_name
  • TAPE-ERROR worker_name "error-message"
  • PARTIAL handle INPUT-* TAPE-* "[sec %f kb %d kps %f]" "input-error-message" "tape-error-message"
    • The statistics are the sum over all successful parts (PARTDONE); they should not include the failed part.
  • DONE handle INPUT-GOOD TAPE-GOOD "[sec %f kb %d kps %f]" "" ""
    • The statistics are the sum over all successful parts (PARTDONE); they should not include the failed part.
  • FAILED handle INPUT-* TAPE-* "input-error-message" "tape-error-message"
  • NEW-TAPE handle label
  • NO-NEW-TAPE handle
  • PARTDONE handle label fileno kb "[sec %f kb %d kps %f]"
    • yep, the kb are specified twice
  • REQUEST-NEW-TAPE handle
  • PORT worker_name handle hdr-port ip-port-pairs
  • BAD-COMMAND "error message"
  • DUMPER-STATUS handle
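
As an illustration of the reply layouts above, the following sketch pulls the fields out of a PARTDONE line, including the kb value that appears twice. The function name and the returned keys are hypothetical, and shlex is used on the assumption that the quoted statistics string can be split with shell-like rules.

```python
import re
import shlex

# Matches the statistics string "[sec %f kb %d kps %f]".
STATS_RE = re.compile(
    r"\[sec (?P<sec>[0-9.]+) kb (?P<kb>[0-9]+) kps (?P<kps>[0-9.]+)\]"
)


def parse_partdone(line):
    """Parse: PARTDONE handle label fileno kb "[sec %f kb %d kps %f]"."""
    tokens = shlex.split(line)
    if len(tokens) != 6 or tokens[0] != "PARTDONE":
        raise ValueError("not a PARTDONE reply: %r" % line)
    _, handle, label, fileno, kb, stats = tokens
    match = STATS_RE.fullmatch(stats)
    if match is None:
        raise ValueError("bad statistics field: %r" % stats)
    return {
        "handle": handle,
        "label": label,
        "fileno": int(fileno),
        "kb": int(kb),                   # standalone kb argument
        "sec": float(match["sec"]),
        "stats_kb": int(match["kb"]),    # kb repeated inside the brackets
        "kps": float(match["kps"]),
    }
```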

LOG

  • One PART or PARTPARTIAL log line for each part written to tape. It tells the location and status of that part.
    • PART taper label fileno hostname diskname timestamp part-number/totpart level [sec %f kb %d kps %f]
    • PARTPARTIAL taper label fileno hostname diskname timestamp part-number/totpart level [sec %f kb %d kps %f] "error message"

totpart should be -1 if the number is not known (PORT-WRITE)

  • One DONE/PARTIAL/FAILED for each dump. It tells the status of the complete dump.
    • DONE taper hostname diskname timestamp totpart level [sec %f kb %d kps %f]
      • The statistics are the sum over all successful parts (PART); they do not include PARTPARTIAL
      • This indicates that, from the taper's perspective, the dump is completely done and should not be retried; this is the case even if the dumped file was partial
    • PARTIAL taper hostname diskname timestamp totpart level [sec %f kb %d kps %f] "error message"
      • The statistics are the sum over all successful parts (PART); they do not include PARTPARTIAL
      • Some of the data was sent to the device, but not all of it; the driver will likely retry the dump
    • FAILED taper hostname diskname timestamp level "error message"
      • None of the data was sent to the device; the driver will likely retry the dump
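
The rule that dump-level statistics sum only the successful parts can be sketched as below. The function name is hypothetical, the log lines are assumed to start with the PART or PARTPARTIAL keyword as shown above, and computing kps as total kb divided by total seconds is an assumption rather than something this page states.

```python
import re

# Matches the trailing statistics "[sec %f kb %d kps %f]" of a log line.
STATS_RE = re.compile(r"\[sec ([0-9.]+) kb ([0-9]+) kps ([0-9.]+)\]")


def dump_totals(log_lines):
    """Sum sec and kb over PART lines only; PARTPARTIAL lines are skipped."""
    total_sec = 0.0
    total_kb = 0
    for line in log_lines:
        fields = line.split(None, 1)
        if not fields or fields[0] != "PART":
            continue  # PARTPARTIAL and everything else is excluded
        match = STATS_RE.search(line)
        if match is None:
            continue
        total_sec += float(match.group(1))
        total_kb += int(match.group(2))
    # Assumption: the overall rate is total kb over total seconds.
    kps = total_kb / total_sec if total_sec > 0 else 0.0
    return total_sec, total_kb, kps
```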