GNU Tar Include and Exclude Behavior

From wiki.zmanda.com

Revision as of 21:04, 25 May 2010

This table represents the results of installcheck/gnutar.pl across multiple GNU Tar versions. Note that this page only deals with include and exclude behavior; see the GNU Tar FAQ entry for other undesirable behaviors.

Columns: pat, file, then one result cell for each combination of
  mode:          include, exclude
  option:        no args, --wildcards (-wc), --no-wildcards (-no-wc)
  version range: <1.16, 1.16-1.22, >1.22

pat       file
./A*A     A*A
./A*A     AxA
./A\*A    A*A
./A\*A    AxA
./B?B     B?B
./B?B     BxB
./B\?B    B?B
./B\?B    BxB
./C[C     C[C
./C\[C    C[C
./D\]D    D]D
./D]D     D]D
./E\E     E\E
./E\\E    E\E
./F'F     F'F
./F\'F    F'F
./G"G     G"G
./G\"G    G"G
./H H     H H
./H\ H    H H

This was tested against tar versions 1.15, 1.15.1, 1.16, 1.17, 1.18, 1.19, 1.20, 1.21, 1.22, and 1.23.
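
To check a single cell by hand, the fixture files and a tar command line can be set up along these lines. This is a minimal sketch, not what installcheck/gnutar.pl does: make_fixtures and exclude_command are hypothetical helpers, the file names are taken from the file column above, and the command only covers the exclude case. It assumes a filesystem that allows *, ?, ", and \ in file names.

```python
import os
import tempfile

# File names from the "file" column of the table above.
FIXTURE_FILES = ["A*A", "AxA", "B?B", "BxB", "C[C", "D]D",
                 "E\\E", "F'F", 'G"G', "H H"]


def make_fixtures(directory):
    """Create one empty file per fixture name; return the sorted directory listing."""
    for name in FIXTURE_FILES:
        with open(os.path.join(directory, name), "w"):
            pass
    return sorted(os.listdir(directory))


def exclude_command(archive, directory, pattern, wildcards=None):
    """Build (but do not run) an argv list for one exclude test.

    wildcards=None adds no option, True adds --wildcards, and False adds
    --no-wildcards. The option is placed before --exclude so that it is
    already in effect when tar reads the pattern.
    """
    argv = ["tar", "-cf", archive, "-C", directory]
    if wildcards is True:
        argv.append("--wildcards")
    elif wildcards is False:
        argv.append("--no-wildcards")
    argv.append("--exclude=" + pattern)
    argv.append(".")
    return argv


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as work:
        print(make_fixtures(work))
        print(exclude_command("/tmp/test.tar", work, "./A\\*A", wildcards=True))
```

Passing the resulting list to subprocess.run avoids a shell, so the quotes, spaces, and backslashes in the patterns reach tar unchanged.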

Summary

This is the most concise summary I can invent. Yes, there are *five* different matching schemes implemented in GNU tar.

  • Single quotes ('), double quotes ("), and spaces always match themselves exactly, regardless of wildcards.
  • Includes
    • The default behavior is identical to --no-wildcards
    • Behavior changed with version 1.16:
      • In versions before 1.16, the wildcard option is ignored for includes, and type α wildcard matching is always applied.
      • In versions 1.16 and higher, when wildcard matching is enabled, type β wildcard matching is applied. When wildcard matching is disabled, type ε matching is applied (!).
  • Excludes
    • The default behavior is identical to --wildcards
    • When wildcards are disabled, they are truly disabled: only literal matches are accepted.
    • Behavior changed with version 1.23:
      • In versions before 1.23, when wildcards are enabled, type γ matching is applied.
      • In versions 1.23 and higher, when wildcards are enabled, type δ matching is applied.
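
The rules above can be written down as a small lookup. This is only a sketch of the summary, not anything taken from the tar source: matching_type and parse_version are hypothetical names, 1.23 itself is assumed to fall on the type δ side (as the ">1.22" table column suggests), and "literal" stands for the exclude case where wildcards are disabled and only literal matches are accepted.

```python
def parse_version(text):
    """Turn a version string such as "1.15.1" into a comparable tuple (1, 15, 1)."""
    return tuple(int(part) for part in text.split("."))


def matching_type(kind, version, wildcards=None):
    """Return the matching type the summary predicts.

    kind is "include" or "exclude". wildcards is None for no option,
    True for --wildcards, and False for --no-wildcards.
    """
    v = parse_version(version)
    if kind == "include":
        if v < (1, 16):
            # The wildcard option is ignored for includes before 1.16.
            return "alpha"
        # The include default is identical to --no-wildcards.
        enabled = False if wildcards is None else wildcards
        return "beta" if enabled else "epsilon"
    if kind == "exclude":
        # The exclude default is identical to --wildcards.
        enabled = True if wildcards is None else wildcards
        if not enabled:
            return "literal"
        return "gamma" if v < (1, 23) else "delta"
    raise ValueError("kind must be 'include' or 'exclude'")


if __name__ == "__main__":
    for version in ("1.15.1", "1.16", "1.22", "1.23"):
        print(version,
              matching_type("include", version, wildcards=True),
              matching_type("exclude", version))
```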

Matching types mentioned above:

type α
Only *?[\ are special, and only special characters can be escaped by \ - otherwise, the escaping backslash is treated literally (e.g., E\E matches against itself, but not against EE). There is a bug with \?, which is treated as \0177 (octal 177, the DEL character) internally.
type β
Only *?[\ are special, and \ can escape any character, so \X and X will both match X. However, both \? and \\ are buggy and will not match ? and \, respectively.
type γ
Only *?[\ are special, and \ can escape any character. There is no bug with \? or \\.
type δ
Only *?[ are special, and no escaping is possible.
type ε
Only literal matches are accepted, except that both \ and \\ will match \.
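
The backslash rules of these types can be modeled in a few lines. The sketch below is a reading of the descriptions above, not a port of tar's code, and it makes several assumptions: tokenize and matches are hypothetical names, * is taken to match any run of characters and ? any single character, character classes are left out (an unescaped [ raises an error), and type β is omitted because the descriptions do not say what its buggy \? and \\ actually match.

```python
import re

SPECIAL = "*?[\\"  # characters described as special for types alpha and gamma


def tokenize(pattern, scheme):
    """Split a pattern into ("lit", ch) and ("wild", ch) tokens under one scheme."""
    if scheme not in ("alpha", "gamma", "delta", "epsilon"):
        raise ValueError("unmodeled scheme: " + scheme)
    tokens = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        nxt = pattern[i + 1] if i + 1 < len(pattern) else None
        # Type delta has no escaping at all, so it never enters this branch.
        if ch == "\\" and nxt is not None and scheme != "delta":
            if scheme == "alpha":
                escapes = nxt in SPECIAL   # only special characters
            elif scheme == "gamma":
                escapes = True             # any character
            else:
                escapes = nxt == "\\"      # epsilon: only \\ collapses to \
            if escapes:
                # Type alpha's bug: \? is treated as octal 0177 (DEL).
                buggy = scheme == "alpha" and nxt == "?"
                tokens.append(("lit", "\x7f" if buggy else nxt))
                i += 2
                continue
        if scheme != "epsilon" and ch in "*?":
            tokens.append(("wild", ch))
        elif scheme != "epsilon" and ch == "[":
            raise NotImplementedError("character classes are not modeled")
        else:
            tokens.append(("lit", ch))
        i += 1
    return tokens


def matches(pattern, name, scheme):
    """Return True if name matches pattern under the modeled scheme."""
    parts = []
    for kind, ch in tokenize(pattern, scheme):
        if kind == "lit":
            parts.append(re.escape(ch))
        else:
            parts.append(".*" if ch == "*" else ".")
    return re.fullmatch("".join(parts), name, re.DOTALL) is not None


if __name__ == "__main__":
    for scheme in ("alpha", "gamma", "delta", "epsilon"):
        print(scheme, matches(r"E\E", r"E\E", scheme), matches(r"E\E", "EE", scheme))
```

Under this model the E\E example behaves as described for type α: the pattern matches the file E\E but not EE, while type γ treats the same pattern as EE.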

To Do

  • Explore the inside of character classes: how do you specify ] or \ in a character class? Negation?
  • Look at the source to figure out what's going on with backslashes
  • Is type δ really that limited?