@c Copyright (C) 1994, 1996, 1998, 2000-2001, 2003-2007, 2009-2011 Free
@c Software Foundation, Inc.
@c
@c Permission is granted to copy, distribute and/or modify this document
@c under the terms of the GNU Free Documentation License, Version 1.3 or
@c any later version published by the Free Software Foundation; with no
@c Invariant Sections, with no Front-Cover Texts, and with no Back-Cover
@c Texts. A copy of the license is included in the ``GNU Free
@c Documentation License'' file as part of this distribution.
@c
@c this regular expression description is for: generic
@menu
* awk regular expression syntax::
* egrep regular expression syntax::
* ed regular expression syntax::
* emacs regular expression syntax::
* gnu-awk regular expression syntax::
* grep regular expression syntax::
* posix-awk regular expression syntax::
* posix-basic regular expression syntax::
* posix-egrep regular expression syntax::
* posix-extended regular expression syntax::
* posix-minimal-basic regular expression syntax::
* sed regular expression syntax::
@end menu
@node awk regular expression syntax
@subsection @samp{awk} regular expression syntax

The character @samp{.} matches any single character except the null character.

@table @samp

@item +
indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
@item ?
indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
@end table

Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} can be used to quote the following character. Character classes are not supported, so for example you would need to use @samp{[0-9]} instead of @samp{[[:digit:]]}.
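
For syntaxes without character-class support, a class has to be spelled out as an explicit range. The following Python sketch is purely illustrative and is not part of any tool described here: the names @code{CLASS_RANGES} and @code{expand_classes} are hypothetical, only a few classes are covered, and the ranges assume the C locale.

```python
import re

# Hypothetical table of ASCII (C locale) ranges for a few POSIX classes.
# A locale-aware tool would not hard-code these.
CLASS_RANGES = dict(
    digit="0-9",
    lower="a-z",
    upper="A-Z",
    alpha="A-Za-z",
    alnum="0-9A-Za-z",
)


def expand_classes(pattern):
    """Rewrite [:name:] inside a pattern as the equivalent explicit range."""
    def replace(match):
        return CLASS_RANGES[match.group(1)]
    return re.sub(r"\[:(digit|lower|upper|alpha|alnum):\]", replace, pattern)


digits = expand_classes("[[:digit:]]")
identifier = expand_classes("[[:alpha:]_][[:alnum:]_]*")
```

The sketch rewrites every occurrence of a class name, so it assumes that the class names appear only inside bracket expressions.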
GNU extensions are not supported and so @samp{\w}, @samp{\W}, @samp{\<}, @samp{\>}, @samp{\b}, @samp{\B}, @samp{\`}, and @samp{\'} match @samp{w}, @samp{W}, @samp{<}, @samp{>}, @samp{b}, @samp{B}, @samp{`}, and @samp{'} respectively.

Grouping is performed with parentheses @samp{()}. An unmatched @samp{)} matches just itself. A backslash followed by a digit matches that digit.

The alternation operator is @samp{|}.

The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets. Within brackets, @samp{^} can be used to invert the membership of the character class being specified.

@samp{*}, @samp{+} and @samp{?} are special at any point in a regular expression except:

@enumerate

@item At the beginning of a regular expression

@item After an open-group, signified by
@samp{(}

@item After the alternation operator @samp{|}

@end enumerate

The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
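
The longest-match rule differs from the first-alternative-wins rule used by backtracking engines. As a rough illustration only, the Python sketch below contrasts the two. Python's @code{re} module is not one of the syntaxes described in this section, and the helper @code{longest_alternative} is a hypothetical name that applies the rule to top-level alternatives only.

```python
import re


def longest_alternative(alternatives, text):
    """Return the longest prefix of text matched by any alternative, or None."""
    best = None
    for alt in alternatives:
        m = re.match(alt, text)
        if m is not None and (best is None or len(m.group(0)) > len(best)):
            best = m.group(0)
    return best


# Python tries alternatives left to right and stops at the first success.
first_wins = re.match(r"a|ab", "abc").group(0)

# Under the longest-match rule the second alternative is preferred.
longest_wins = longest_alternative(["a", "ab"], "abc")
```

Each alternative is still matched with Python's own rules, so the sketch does not reproduce the rule for nested subexpressions.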
@node egrep regular expression syntax
@subsection @samp{egrep} regular expression syntax

The character @samp{.} matches any single character except newline.

@table @samp

@item +
indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
@item ?
indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
@end table

Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are ignored. Within square brackets, @samp{\} is taken literally. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit. Non-matching lists @samp{[^@dots{}]} never match newline.

GNU extensions are supported:

@enumerate

@item @samp{\w} matches a character within a word

@item @samp{\W} matches a character which is not within a word

@item @samp{\<} matches the beginning of a word

@item @samp{\>} matches the end of a word

@item @samp{\b} matches a word boundary

@item @samp{\B} matches a position that is not a word boundary

@item @samp{\`} matches the beginning of the whole input

@item @samp{\'} matches the end of the whole input

@end enumerate
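
The word-oriented extensions listed above can be tried out by analogy in Python. This is only a sketch under stated assumptions: Python's @code{re} module understands @samp{\w}, @samp{\W}, @samp{\b} and @samp{\B} but not @samp{\<} or @samp{\>}, so the hypothetical constants @code{WORD_START} and @code{WORD_END} stand in for them using lookaround assertions.

```python
import re

# Stand-ins for the two operators that Python lacks (assumed equivalents):
WORD_START = r"\b(?=\w)"   # a boundary followed by a word character
WORD_END = r"\b(?<=\w)"    # a boundary preceded by a word character

text = "cat concat cats"

# Only the free-standing word qualifies when both ends are anchored.
whole_words = re.findall(WORD_START + "cat" + WORD_END, text)

# Offsets where a word starts with the three letters.
starts = [m.start() for m in re.finditer(WORD_START + "cat", text)]

# Offsets where the letters begin somewhere other than a word boundary.
inside = [m.start() for m in re.finditer(r"\Bcat", text)]
```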
Grouping is performed with parentheses @samp{()}. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{(}.
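
The numbering rule for back-references can be illustrated with a short Python sketch. Python's @code{re} module is used here only because it shares the unescaped-parenthesis grouping style and the @samp{\2} notation; it is an assumption of this example, not one of the syntaxes documented in this section.

```python
import re

# Groups are numbered by the position of their opening parenthesis:
# group 1 is the outer pair, group 2 is the inner pair.
pattern = re.compile(r"((a)b)\2")

m = pattern.fullmatch("aba")
outer = m.group(1)   # text matched by the outer group
inner = m.group(2)   # text matched by the inner group, repeated by \2

# The back-reference must repeat what group 2 matched, so this fails.
mismatch = pattern.fullmatch("abb")
```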
The alternation operator is @samp{|}.

The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets. Within brackets, @samp{^} can be used to invert the membership of the character class being specified.

The characters @samp{*}, @samp{+} and @samp{?} are special anywhere in a regular expression.

The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
@node ed regular expression syntax
@subsection @samp{ed} regular expression syntax

The character @samp{.} matches any single character except the null character.

@table @samp

@item \+
indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
@item \?
indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
@end table

Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} is taken literally. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit.

GNU extensions are supported:

@enumerate

@item @samp{\w} matches a character within a word

@item @samp{\W} matches a character which is not within a word

@item @samp{\<} matches the beginning of a word

@item @samp{\>} matches the end of a word

@item @samp{\b} matches a word boundary

@item @samp{\B} matches a position that is not a word boundary

@item @samp{\`} matches the beginning of the whole input

@item @samp{\'} matches the end of the whole input

@end enumerate

Grouping is performed with backslashes followed by parentheses @samp{\(}, @samp{\)}. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{\(}.

The alternation operator is @samp{\|}.
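
The backslashed operators of this syntax correspond to the bare operators of the @samp{egrep} style, and the reverse. The Python sketch below shows one way to swap them. It is a minimal illustration under stated assumptions: @code{basic_to_extended} is a hypothetical helper, it handles only grouping, alternation and the one-or-more and zero-or-one operators, and it leaves intervals and position-dependent cases untouched.

```python
SWAPPED = "()|+?"   # operators whose backslashed and bare forms trade places


def basic_to_extended(pattern):
    """Swap backslashed and bare operators outside bracket expressions."""
    out = []
    i = 0
    n = len(pattern)
    while i < n:
        c = pattern[i]
        if c == "\\" and i + 1 < n:
            nxt = pattern[i + 1]
            # A backslashed operator becomes bare; other escapes are kept.
            out.append(nxt if nxt in SWAPPED else c + nxt)
            i += 2
        elif c == "[":
            # Copy a bracket expression verbatim; its contents are literal.
            j = i + 1
            if j < n and pattern[j] == "^":
                j += 1
            if j < n and pattern[j] == "]":
                j += 1                      # a leading ] is literal
            while j < n and pattern[j] != "]":
                if pattern.startswith("[:", j):
                    end = pattern.find(":]", j + 2)
                    j = n if end == -1 else end + 2
                else:
                    j += 1
            out.append(pattern[i:j + 1])
            i = j + 1
        elif c in SWAPPED:
            out.append("\\" + c)            # a bare operator becomes literal
            i += 1
        else:
            out.append(c)
            i += 1
    return "".join(out)


translated = basic_to_extended(r"\(ab\|cd\)\+x")
```

Bracket expressions are copied unchanged because the operators are ordinary characters inside them.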
The character @samp{^} only represents the beginning of a string when it appears:

@enumerate

@item
At the beginning of a regular expression

@item After an open-group, signified by
@samp{\(}

@item After the alternation operator @samp{\|}

@end enumerate
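
The three positions listed above can be expressed as a small check. The Python sketch below is illustrative only: @code{caret_is_anchor} and @code{ends_with_escaped} are hypothetical names, and the caret being tested is assumed to lie outside any bracket expression.

```python
def ends_with_escaped(prefix, op_char):
    # True when prefix ends with op_char preceded by an odd number of
    # backslashes, that is, when the final two characters form an operator.
    if not prefix.endswith(op_char):
        return False
    body = prefix[:-1]
    backslashes = len(body) - len(body.rstrip("\\"))
    return backslashes % 2 == 1


def caret_is_anchor(pattern, index):
    # Assumes the caret at pattern[index] is outside any bracket expression.
    if pattern[index] != "^":
        raise ValueError("no caret at index %d" % index)
    prefix = pattern[:index]
    return (
        prefix == ""
        or ends_with_escaped(prefix, "(")
        or ends_with_escaped(prefix, "|")
    )


at_start = caret_is_anchor("^a", 0)
after_alternation = caret_is_anchor(r"a\|^b", 3)
in_the_middle = caret_is_anchor("a^b", 1)
```

Counting the backslashes keeps an escaped backslash followed by a plain parenthesis from being mistaken for an open-group.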
The character @samp{$} only represents the end of a string when it appears:

@enumerate

@item At the end of a regular expression

@item Before a close-group, signified by
@samp{\)}

@item Before the alternation operator @samp{\|}

@end enumerate
@samp{*}, @samp{\+} and @samp{\?} are special at any point in a regular expression except:

@enumerate

@item At the beginning of a regular expression

@item After an open-group, signified by
@samp{\(}

@item After the alternation operator @samp{\|}

@end enumerate
Intervals are specified by @samp{\@{} and @samp{\@}}. Invalid intervals such as @samp{a\@{1z} are not accepted.

The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
@node emacs regular expression syntax
@subsection @samp{emacs} regular expression syntax

The character @samp{.} matches any single character except newline.

@table @samp

@item +
indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
@item ?
indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
@end table

Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are ignored. Within square brackets, @samp{\} is taken literally. Character classes are not supported, so for example you would need to use @samp{[0-9]} instead of @samp{[[:digit:]]}.

GNU extensions are supported:

@enumerate

@item @samp{\w} matches a character within a word

@item @samp{\W} matches a character which is not within a word

@item @samp{\<} matches the beginning of a word

@item @samp{\>} matches the end of a word

@item @samp{\b} matches a word boundary

@item @samp{\B} matches a position that is not a word boundary

@item @samp{\`} matches the beginning of the whole input

@item @samp{\'} matches the end of the whole input

@end enumerate

Grouping is performed with backslashes followed by parentheses @samp{\(}, @samp{\)}. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{\(}.

The alternation operator is @samp{\|}.

The character @samp{^} only represents the beginning of a string when it appears:

@enumerate

@item
At the beginning of a regular expression

@item After an open-group, signified by
@samp{\(}

@item After the alternation operator @samp{\|}

@end enumerate

The character @samp{$} only represents the end of a string when it appears:

@enumerate

@item At the end of a regular expression

@item Before a close-group, signified by
@samp{\)}

@item Before the alternation operator @samp{\|}

@end enumerate

@samp{*}, @samp{+} and @samp{?} are special at any point in a regular expression except:

@enumerate

@item At the beginning of a regular expression

@item After an open-group, signified by
@samp{\(}

@item After the alternation operator @samp{\|}

@end enumerate

The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
@node gnu-awk regular expression syntax
@subsection @samp{gnu-awk} regular expression syntax

The character @samp{.} matches any single character.

@table @samp

@item +
indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
@item ?
indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
@end table

Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} can be used to quote the following character. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit.

GNU extensions are supported:

@enumerate

@item @samp{\w} matches a character within a word

@item @samp{\W} matches a character which is not within a word

@item @samp{\<} matches the beginning of a word

@item @samp{\>} matches the end of a word

@item @samp{\b} matches a word boundary

@item @samp{\B} matches a position that is not a word boundary

@item @samp{\`} matches the beginning of the whole input

@item @samp{\'} matches the end of the whole input

@end enumerate

Grouping is performed with parentheses @samp{()}. An unmatched @samp{)} matches just itself. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{(}.

The alternation operator is @samp{|}.

The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets. Within brackets, @samp{^} can be used to invert the membership of the character class being specified.

@samp{*}, @samp{+} and @samp{?} are special at any point in a regular expression except:

@enumerate

@item At the beginning of a regular expression

@item After an open-group, signified by
@samp{(}

@item After the alternation operator @samp{|}

@end enumerate

The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
@node grep regular expression syntax
@subsection @samp{grep} regular expression syntax

The character @samp{.} matches any single character except newline.

@table @samp

@item \+
indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
@item \?
indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
@end table

Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are ignored. Within square brackets, @samp{\} is taken literally. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit. Non-matching lists @samp{[^@dots{}]} never match newline.

GNU extensions are supported:

@enumerate

@item @samp{\w} matches a character within a word

@item @samp{\W} matches a character which is not within a word

@item @samp{\<} matches the beginning of a word

@item @samp{\>} matches the end of a word

@item @samp{\b} matches a word boundary

@item @samp{\B} matches a position that is not a word boundary

@item @samp{\`} matches the beginning of the whole input

@item @samp{\'} matches the end of the whole input

@end enumerate

Grouping is performed with backslashes followed by parentheses @samp{\(}, @samp{\)}. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{\(}.

The alternation operator is @samp{\|}.

The character @samp{^} only represents the beginning of a string when it appears:

@enumerate

@item
At the beginning of a regular expression

@item After an open-group, signified by
@samp{\(}

@item After a newline

@item After the alternation operator @samp{\|}

@end enumerate

The character @samp{$} only represents the end of a string when it appears:

@enumerate

@item At the end of a regular expression

@item Before a close-group, signified by
@samp{\)}

@item Before a newline

@item Before the alternation operator @samp{\|}

@end enumerate
@samp{*}, @samp{\+} and @samp{\?} are special at any point in a regular expression except:

@enumerate

@item At the beginning of a regular expression

@item After an open-group, signified by
@samp{\(}

@item After a newline

@item After the alternation operator @samp{\|}

@end enumerate
Intervals are specified by @samp{\@{} and @samp{\@}}. Invalid intervals such as @samp{a\@{1z} are not accepted.

The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
@node posix-awk regular expression syntax
@subsection @samp{posix-awk} regular expression syntax

The character @samp{.} matches any single character except the null character.

@table @samp

@item +
indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
@item ?
indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
@end table

Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} can be used to quote the following character. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit.

GNU extensions are not supported and so @samp{\w}, @samp{\W}, @samp{\<}, @samp{\>}, @samp{\b}, @samp{\B}, @samp{\`}, and @samp{\'} match @samp{w}, @samp{W}, @samp{<}, @samp{>}, @samp{b}, @samp{B}, @samp{`}, and @samp{'} respectively.

Grouping is performed with parentheses @samp{()}. An unmatched @samp{)} matches just itself. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{(}.

The alternation operator is @samp{|}.

The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets. Within brackets, @samp{^} can be used to invert the membership of the character class being specified.

@samp{*}, @samp{+} and @samp{?} are special at any point in a regular expression except the following places, where they are not allowed:

@enumerate

@item At the beginning of a regular expression

@item After an open-group, signified by
@samp{(}

@item After the alternation operator @samp{|}

@end enumerate

Intervals are specified by @samp{@{} and @samp{@}}. Invalid intervals such as @samp{a@{1z} are not accepted.

The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
@node posix-basic regular expression syntax
@subsection @samp{posix-basic} regular expression syntax
This is a synonym for ed.
@node posix-egrep regular expression syntax
@subsection @samp{posix-egrep} regular expression syntax

The character @samp{.} matches any single character except newline.

@table @samp

@item +
indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
@item ?
indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
@end table

Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are ignored. Within square brackets, @samp{\} is taken literally. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit. Non-matching lists @samp{[^@dots{}]} never match newline.

GNU extensions are supported:

@enumerate

@item @samp{\w} matches a character within a word

@item @samp{\W} matches a character which is not within a word

@item @samp{\<} matches the beginning of a word

@item @samp{\>} matches the end of a word

@item @samp{\b} matches a word boundary

@item @samp{\B} matches a position that is not a word boundary

@item @samp{\`} matches the beginning of the whole input

@item @samp{\'} matches the end of the whole input

@end enumerate

Grouping is performed with parentheses @samp{()}. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{(}.

The alternation operator is @samp{|}.

The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets. Within brackets, @samp{^} can be used to invert the membership of the character class being specified.

The characters @samp{*}, @samp{+} and @samp{?} are special anywhere in a regular expression.

Intervals are specified by @samp{@{} and @samp{@}}. Invalid intervals are treated as literals; for example, @samp{a@{1} is treated as @samp{a\@{1}.

The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
@node posix-extended regular expression syntax
@subsection @samp{posix-extended} regular expression syntax

The character @samp{.} matches any single character except the null character.

@table @samp

@item +
indicates that the regular expression should match one or more occurrences of the previous atom or regexp.
@item ?
indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
@end table

Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} is taken literally. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit.

GNU extensions are supported:

@enumerate

@item @samp{\w} matches a character within a word

@item @samp{\W} matches a character which is not within a word

@item @samp{\<} matches the beginning of a word

@item @samp{\>} matches the end of a word

@item @samp{\b} matches a word boundary

@item @samp{\B} matches a position that is not a word boundary

@item @samp{\`} matches the beginning of the whole input

@item @samp{\'} matches the end of the whole input

@end enumerate

Grouping is performed with parentheses @samp{()}. An unmatched @samp{)} matches just itself. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{(}.

The alternation operator is @samp{|}.

The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets. Within brackets, @samp{^} can be used to invert the membership of the character class being specified.

@samp{*}, @samp{+} and @samp{?} are special at any point in a regular expression except the following places, where they are not allowed:

@enumerate

@item At the beginning of a regular expression

@item After an open-group, signified by
@samp{(}

@item After the alternation operator @samp{|}

@end enumerate

Intervals are specified by @samp{@{} and @samp{@}}. Invalid intervals such as @samp{a@{1z} are not accepted.

The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
@node posix-minimal-basic regular expression syntax
@subsection @samp{posix-minimal-basic} regular expression syntax

The character @samp{.} matches any single character except the null character.

Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} is taken literally. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit.

GNU extensions are supported:

@enumerate

@item @samp{\w} matches a character within a word

@item @samp{\W} matches a character which is not within a word

@item @samp{\<} matches the beginning of a word

@item @samp{\>} matches the end of a word

@item @samp{\b} matches a word boundary

@item @samp{\B} matches a position that is not a word boundary

@item @samp{\`} matches the beginning of the whole input

@item @samp{\'} matches the end of the whole input

@end enumerate

Grouping is performed with backslashes followed by parentheses @samp{\(}, @samp{\)}. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{\(}.

The character @samp{^} only represents the beginning of a string when it appears:

@enumerate

@item
At the beginning of a regular expression

@item After an open-group, signified by
@samp{\(}

@end enumerate

The character @samp{$} only represents the end of a string when it appears:

@enumerate

@item At the end of a regular expression

@item Before a close-group, signified by
@samp{\)}

@end enumerate

Intervals are specified by @samp{\@{} and @samp{\@}}. Invalid intervals such as @samp{a\@{1z} are not accepted.

The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
@node sed regular expression syntax
@subsection @samp{sed} regular expression syntax
This is a synonym for ed.