GNU Tar Include and Exclude Behavior

From wiki.zmanda.com

Revision as of 20:28, 25 May 2010

This table represents the results of installcheck/gnutar.pl across multiple GNU Tar versions. Note that this page only deals with include and exclude behavior; see the GNU Tar FAQ entry for other undesirable behaviors.

Each row pairs a pattern (pat) with a file name (file). The result columns cover include and exclude, each with no args, --wildcards (-wc), and --no-wildcards (-no-wc), and each of those for tar versions <1.16, 1.16-1.22, and >1.22. The result cells are empty in this copy.

pat         file
./A*A       A*A
./A*A       AxA
./A\*A      A*A
./A\*A      AxA
./B?B       B?B
./B?B       BxB
./B\?B      B?B
./B\?B      BxB
./C[C       C[C
./C\[C      C[C
./D\]D      D]D
./D]D       D]D
./E\E       E\E
./E\\E      E\E
./F'F       F'F
./F\'F      F'F
./G"G       G"G
./G\"G      G"G
./H H       H H
./H\ H      H H

This was tested against tar versions 1.15, 1.15.1, 1.16, 1.17, 1.18, 1.19, 1.20, 1.21, 1.22, and 1.23.

There are some interesting patterns to note here:

  • Single quotes ('), double quotes ("), and spaces always match themselves exactly, regardless of wildcards.
  • Includes
    • Behavior with --no-wildcards is identical to the default behavior.
    • In versions before 1.16, the wildcard option is ignored for includes, and type α wildcard matching is always applied.
    • All versions 1.16 and higher have identical behavior: TODO
  • Excludes
    • Behavior with --wildcards is identical to the default behavior.
    • In versions up to 1.23, when wildcards are enabled, type β matching is applied. When wildcards are disabled, they are truly disabled: only literal matches are accepted.
    • All versions before 1.23 have identical behavior: TODO

Matching types mentioned above:

α
Only *?[\ are special, and only special characters can be escaped by \; before any other character, the backslash is treated literally (e.g., E\E matches against itself, but not against EE). There is a bug with \?, which is treated as \0177 (octal 177, the DEL character) internally.
β
Like α, only *?[\ are special, but \ can escape any character, so \X and X will both match X.
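
The difference between the two matching types, and the literal matching used when wildcards are disabled for excludes, can be illustrated with a small Python model. This is only a sketch of the rules as described above, not GNU tar's implementation: the names to_regex and matches are hypothetical, * is assumed to match any run of characters (including /), the \? bug is not reproduced, and character classes are left out because their behavior is still listed under To Do.

```python
import re

# Characters the page describes as special in both matching types.
SPECIAL = "*?[\\"


def to_regex(pattern, matching_type):
    """Translate a pattern into a regex under 'alpha' or 'beta' escaping rules."""
    out = []
    i = 0
    while i < len(pattern):
        c = pattern[i]
        if c == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            if matching_type == "beta" or nxt in SPECIAL:
                # The backslash escapes the next character, which matches itself.
                out.append(re.escape(nxt))
            else:
                # Type alpha: a backslash before a non-special character is literal.
                out.append(re.escape("\\" + nxt))
            i += 2
            continue
        if c == "*":
            out.append(".*")  # assumption: * also matches "/"
        elif c == "?":
            out.append(".")
        elif c == "[":
            # Character classes are not modeled in this sketch.
            raise NotImplementedError("character classes are not modeled")
        else:
            # Ordinary characters (and a trailing backslash) match themselves.
            out.append(re.escape(c))
        i += 1
    return "".join(out)


def matches(pattern, name, matching_type):
    """Return True if name matches pattern under 'alpha', 'beta', or 'literal'."""
    if matching_type == "literal":
        # Wildcards truly disabled: only an exact match is accepted.
        return pattern == name
    regex = to_regex(pattern, matching_type)
    return re.fullmatch(regex, name, re.DOTALL) is not None
```

Under this model, matches(r"E\E", "EE", "alpha") is false while matches(r"E\E", "EE", "beta") is true, which mirrors the E\E example given for type α.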

To Do

  • Explore the inside of character classes: how do you specify ] or \ in a character class? Negation?