GNU Tar Include and Exclude Behavior
Revision as of 21:37, 26 May 2010
This table represents the results of installcheck/gnutar.pl across multiple GNU Tar versions. Note that this page only deals with include and exclude behavior; see the GNU Tar FAQ entry for other undesirable behaviors.
| pat | file | include | | | | | | | | | exclude | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | | no args | | | -wc | | | -no-wc | | | no args | | | -wc | | | -no-wc | | |
| | | <1.16 | 1.16-22 | >1.22 | <1.16 | 1.16-22 | >1.22 | <1.16 | 1.16-22 | >1.22 | <1.23 | 1.23 | >1.23 | <1.23 | 1.23 | >1.23 | <1.23 | 1.23 | >1.23 |
| | | α | ε | ε | α | β | β | α | ε | ε | γ | δ | γ | γ | δ | γ | ∅ | ∅ | ∅ |
| ./A*A | A*A | | | | | | | | | | | | | | | | | | |
| ./A*A | AxA | | | | | | | | | | | | | | | | | | |
| ./A\*A | A*A | | | | | | | | | | | | | | | | | | |
| ./A\*A | AxA | | | | | | | | | | | | | | | | | | |
| ./B?B | B?B | | | | | | | | | | | | | | | | | | |
| ./B?B | BxB | | | | | | | | | | | | | | | | | | |
| ./B\?B | B?B | | | | | | | | | | | | | | | | | | |
| ./B\?B | BxB | | | | | | | | | | | | | | | | | | |
| ./C[C | C[C | | | | | | | | | | | | | | | | | | |
| ./C\[C | C[C | | | | | | | | | | | | | | | | | | |
| ./D\]D | D]D | | | | | | | | | | | | | | | | | | |
| ./D]D | D]D | | | | | | | | | | | | | | | | | | |
| ./E\E | E\E | | | | | | | | | | | | | | | | | | |
| ./E\\E | E\E | | | | | | | | | | | | | | | | | | |
| ./F'F | F'F | | | | | | | | | | | | | | | | | | |
| ./F\'F | F'F | | | | | | | | | | | | | | | | | | |
| ./G"G | G"G | | | | | | | | | | | | | | | | | | |
| ./G\"G | G"G | | | | | | | | | | | | | | | | | | |
| ./H H | H H | | | | | | | | | | | | | | | | | | |
| ./H\ H | H H | | | | | | | | | | | | | | | | | | |
This was tested against tar versions 1.15, 1.15.1, 1.16, 1.17, 1.18, 1.19, 1.20, 1.21, 1.22, 1.23, and the current git HEAD (e21d54e8c).
Summary
This is the most concise summary I can invent. Yes, there are *five* different matching schemes implemented in GNU tar.
- Single quotes ('), double quotes ("), and spaces always match themselves exactly, regardless of wildcards.
- Includes
  - The default behavior is identical to --no-wildcards.
  - Behavior changed with version 1.16:
    - In versions before 1.16, the wildcard option is ignored for includes, and type α wildcard matching is always applied.
    - In versions 1.16 and higher, when wildcard matching is enabled, type β wildcard matching is applied. When wildcard matching is disabled, type ε matching is applied (!).
- Excludes
  - The default behavior is identical to --wildcards.
  - When wildcards are disabled, they are truly disabled: only literal matches are accepted (type ∅).
  - When wildcards are enabled, version 1.23 has a bug that causes incorrect behavior:
    - In versions other than 1.23, when wildcards are enabled, type γ matching is applied.
    - In version 1.23, when wildcards are enabled, type δ matching is applied.
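The rules in this summary can be written down as a small lookup. The following is only a sketch of the summary as stated here, not anything taken from GNU tar itself; the names `version_key` and `matching_type` are made up for illustration, and it assumes a plain dotted version string (a git hash such as the HEAD tested above would need separate handling).

```python
def version_key(version):
    """Turn a dotted version such as '1.15.1' into (1, 15, 1) for comparison."""
    return tuple(int(part) for part in version.split("."))


def matching_type(kind, version, wildcards=None):
    """Return the matching type the summary assigns to a configuration.

    kind      -- 'include' or 'exclude'
    version   -- GNU tar version string, e.g. '1.22'
    wildcards -- True for --wildcards, False for --no-wildcards,
                 None when neither option is given
    """
    v = version_key(version)
    if kind == "include":
        if v < (1, 16):
            return "α"  # the wildcard option is ignored before 1.16
        if wildcards is None:
            wildcards = False  # default is identical to --no-wildcards
        return "β" if wildcards else "ε"
    if kind == "exclude":
        if wildcards is None:
            wildcards = True  # default is identical to --wildcards
        if not wildcards:
            return "∅"
        return "δ" if v[:2] == (1, 23) else "γ"  # 1.23 is the buggy release
    raise ValueError("kind must be 'include' or 'exclude'")
```

For example, `matching_type("exclude", "1.23")` gives `"δ"`, while `matching_type("include", "1.22")` gives `"ε"`.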
Matching types mentioned above:
- type ∅: Literal matching - no wildcards or escaping.
- type α: Only *?[\ are special, and only special characters can be escaped by \ - otherwise, the escaping backslash is treated literally (e.g., E\E matches against itself, but not against EE). There is a bug with \?, which is treated as \0177 internally.
- type β: Only *?[\ are special, and \ can escape any character, so \X and X will both match X. However, both \? and \\ are buggy and will not match ? and \, respectively.
- type γ: Only *?[\ are special, and \ can escape any character. There is no bug with \? or \\.
- type δ: Only *?[ are special, and no escaping is possible (note that this is a bug in version 1.23).
- type ε: Only literal matches are accepted, except that both \ and \\ will match \.
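To make the differences concrete, here is a rough Python model of types ∅, γ, δ, and ε as they are described above. It is an illustration under stated assumptions, not GNU tar's code: the function names are invented, `*` is assumed to match any run of characters (including `/`), bracket expressions are left out entirely (a `[` is matched literally, since character classes are still unexplored), and types α and β are omitted because their bugs are not pinned down well enough to model.

```python
import re


def glob_to_regex(pattern, backslash_escapes):
    """Compile a pattern where only * and ? are treated as wildcards.

    backslash_escapes=True  -- a backslash escapes any following character (type γ)
    backslash_escapes=False -- a backslash is an ordinary character (type δ)
    Assumption: '[' is matched literally; bracket expressions are not modelled.
    """
    parts = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and backslash_escapes and i + 1 < len(pattern):
            parts.append(re.escape(pattern[i + 1]))  # escaped char is literal
            i += 2
            continue
        if ch == "*":
            parts.append(".*")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
        i += 1
    return re.compile("".join(parts), re.DOTALL)


def match_literal(pattern, name):
    """Type ∅: plain string comparison."""
    return pattern == name


def match_gamma(pattern, name):
    """Type γ: wildcards on, backslash escapes any character."""
    return glob_to_regex(pattern, True).fullmatch(name) is not None


def match_delta(pattern, name):
    """Type δ: wildcards on, no escaping possible."""
    return glob_to_regex(pattern, False).fullmatch(name) is not None


def match_epsilon(pattern, name):
    """Type ε: literal, except that a doubled backslash also matches one backslash."""
    return pattern.replace("\\\\", "\\") == name
```

With these definitions, the pattern `A\*A` matches the file `A*A` but not `AxA` under type γ, while under type δ the pattern `B\?B` no longer matches `B?B` because the backslash is taken literally. The leading `./` of the patterns in the table is dropped here for simplicity.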
To Do
- Explore the inside of character classes: how do you specify ] or \ in a character class? Negation?
- Look at the source to figure out what's going on with backslashes