GNU Tar Include and Exclude Behavior
This table represents the results of installcheck/gnutar.pl across multiple GNU Tar versions. Note that this page only deals with include and exclude behavior; see the GNU Tar FAQ entry for other undesirable behaviors.
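pat | file | include, no args, <1.16 | include, no args, 1.16-1.22 | include, no args, >1.22 | include, --wildcards, <1.16 | include, --wildcards, 1.16-1.22 | include, --wildcards, >1.22 | include, --no-wildcards, <1.16 | include, --no-wildcards, 1.16-1.22 | include, --no-wildcards, >1.22 | exclude, no args, <1.23 | exclude, no args, 1.23 | exclude, no args, >1.23 | exclude, --wildcards, <1.23 | exclude, --wildcards, 1.23 | exclude, --wildcards, >1.23 | exclude, --no-wildcards, <1.23 | exclude, --no-wildcards, 1.23 | exclude, --no-wildcards, >1.23 |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
(matching type) | | α | ε | ε | α | β | β | α | ε | ε | γ | δ | γ | γ | δ | γ | ∅ | ∅ | ∅ |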
./A*A | A*A | ||||||||||||||||||
./A*A | AxA | ||||||||||||||||||
./A\*A | A*A | ||||||||||||||||||
./A\*A | AxA | ||||||||||||||||||
./B?B | B?B | ||||||||||||||||||
./B?B | BxB | ||||||||||||||||||
./B\?B | B?B | ||||||||||||||||||
./B\?B | BxB | ||||||||||||||||||
./C[C | C[C | ||||||||||||||||||
./C\[C | C[C | ||||||||||||||||||
./D\]D | D]D | ||||||||||||||||||
./D]D | D]D | ||||||||||||||||||
./E\E | E\E | ||||||||||||||||||
./E\\E | E\E | ||||||||||||||||||
./F'F | F'F | ||||||||||||||||||
./F\'F | F'F | ||||||||||||||||||
./G"G | G"G | ||||||||||||||||||
./G\"G | G"G | ||||||||||||||||||
./H H | H H | ||||||||||||||||||
./H\ H | H H |
This was tested against tar versions 1.15, 1.15.1, 1.16, 1.17, 1.18, 1.19, 1.20, 1.21, 1.22, 1.23, and the current git HEAD (e21d54e8c).
Summary
This is the most concise summary I can invent. Yes, GNU tar implements *five* different matching schemes, not counting plain literal matching (type ∅).
- Single quotes ('), double quotes ("), and spaces always match themselves exactly, regardless of wildcards.
- Includes
  - The default behavior is identical to --no-wildcards.
  - Behavior changed with version 1.16:
    - In versions before 1.16, the wildcard option is ignored for includes, and type α wildcard matching is always applied.
    - In versions 1.16 and higher, when wildcard matching is enabled, type β wildcard matching is applied. When wildcard matching is disabled, type ε matching is applied (!).
- Excludes
  - The default behavior is identical to --wildcards.
  - Behavior is identical whether excluding on extract (-x) or create (-c).
  - When wildcards are disabled, they are truly disabled: only literal matches are accepted (type ∅).
  - When wildcards are enabled, version 1.23 has a bug that causes incorrect behavior:
    - In versions other than 1.23, when wildcards are enabled, type γ matching is applied.
    - In version 1.23, when wildcards are enabled, type δ matching is applied.
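The rules in this summary can be written down as a small lookup. The following is only a sketch of the summary itself, not anything taken from tar: the function name `matching_type` and its parameters are hypothetical, and treating every 1.23.x release like 1.23 is an assumption.

```python
def matching_type(kind, version, wildcards=None):
    """Return the matching type predicted by the summary above.

    kind      -- "include" or "exclude"
    version   -- tuple of ints, e.g. (1, 15, 1) or (1, 23)
    wildcards -- True for --wildcards, False for --no-wildcards,
                 None when neither option is given
    """
    if kind == "include":
        if version < (1, 16):
            return "α"  # the wildcard option is ignored before 1.16
        if wildcards is None:
            wildcards = False  # include default acts like --no-wildcards
        return "β" if wildcards else "ε"
    if kind == "exclude":
        if wildcards is None:
            wildcards = True  # exclude default acts like --wildcards
        if not wildcards:
            return "∅"
        # Assumption: any 1.23.x release behaves like 1.23.
        return "δ" if version[:2] == (1, 23) else "γ"
    raise ValueError("kind must be 'include' or 'exclude'")
```

For example, `matching_type("exclude", (1, 23))` gives δ, while the same call with `wildcards=False` gives ∅.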
Matching types mentioned above:
- type ∅
  - Literal matching - no wildcards or escaping.
- type α
  - Only *?[\ are special, and only special characters can be escaped by \ - otherwise, the escaping backslash is treated literally (e.g., E\E matches against itself, but not against EE). There is a bug with \?, which is treated as \0177 internally.
- type β
  - Only *?[\ are special, and \ can escape any character, so \X and X will both match X. However, both \? and \\ are buggy and will not match ? and \, respectively.
- type γ
  - Only *?[\ are special, and \ can escape any character. There is no bug with \? or \\.
- type δ
  - Only *?[ are special, and no escaping is possible (note that this is a bug in version 1.23).
- type ε
  - Only literal matches are accepted, except that both \ and \\ will match \.
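To make the differences concrete, here is a toy model of the four types that have no described bugs (∅, ε, γ, δ). It is a sketch of the descriptions above, not tar's implementation. The function names are hypothetical, names are assumed to be single path components, and the handling of bracket expressions is simply borrowed from Python's `fnmatch`, which is an assumption since character classes are still unexplored here.

```python
import fnmatch


def match_empty(pattern, name):
    """Type ∅: literal comparison only."""
    return pattern == name


def match_epsilon(pattern, name):
    """Type ε: literal, except that \\\\ in the pattern also matches one backslash."""
    return pattern.replace("\\\\", "\\") == name


def match_delta(pattern, name):
    """Type δ: *?[ are special and backslash is an ordinary character."""
    # Python's fnmatch has no backslash escaping, which fits this description.
    return fnmatch.fnmatchcase(name, pattern)


def match_gamma(pattern, name):
    """Type γ: *?[ are special and backslash escapes any character."""
    out = []
    i = 0
    while i < len(pattern):
        c = pattern[i]
        if c == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            # Wrap an escaped wildcard in brackets so fnmatch sees a literal.
            out.append("[" + nxt + "]" if nxt in "*?[" else nxt)
            i += 2
        else:
            out.append(c)  # a trailing lone backslash stays literal (assumption)
            i += 1
    return fnmatch.fnmatchcase(name, "".join(out))
```

Under this model, the pattern `A\*A` matches the file `A*A` with γ but not with δ, where the backslash must appear in the file name itself.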
Note regarding -t
Testing these options with tar's -t option leads to confusing results: the -t output escapes backslashes with backslashes but does not escape any other characters. That makes it a decent, though not ideal, input for type ε matching.
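When comparing -t output against real file names, the doubled backslashes have to be undone first. A minimal sketch, assuming the listing behaves exactly as described here (only backslashes are doubled); the helper name `unescape_listing_name` is hypothetical:

```python
def unescape_listing_name(line):
    """Collapse each doubled backslash in a -t output line to a single one."""
    # Assumption: backslash is the only character the listing escapes.
    return line.replace("\\\\", "\\")
```

A listing line of `./E\\E` then maps back to the file name `./E\E`, while names without backslashes pass through unchanged.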
To Do
- Explore the inside of character classes: how do you specify ] or \ in a character class? Negation?
- Look at the source to figure out what's going on with backslashes