1 @c Copyright (C) 1994, 1996, 1998, 2000, 2001, 2003, 2004, 2005, 2006
2 @c Free Software Foundation, Inc.
4 @c Permission is granted to copy, distribute and/or modify this document
5 @c under the terms of the GNU Free Documentation License, Version 1.2 or
6 @c any later version published by the Free Software Foundation; with no
7 @c Invariant Sections, with no Front-Cover Texts, and with no Back-Cover
8 @c Texts. A copy of the license is included in the ``GNU Free
9 @c Documentation License'' file as part of this distribution.
@menu
* awk regular expression syntax::
* egrep regular expression syntax::
* ed regular expression syntax::
* emacs regular expression syntax::
* gnu-awk regular expression syntax::
* grep regular expression syntax::
* posix-awk regular expression syntax::
* posix-basic regular expression syntax::
* posix-egrep regular expression syntax::
* posix-extended regular expression syntax::
* posix-minimal-basic regular expression syntax::
* sed regular expression syntax::
@end menu
26 @node awk regular expression syntax
27 @subsection @samp{awk} regular expression syntax
30 The character @samp{.} matches any single character except the null character.
@samp{+} indicates that the regular expression should match one or more occurrences of the previous atom or regexp.

@samp{?} indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
46 Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} can be used to quote the following character. Character classes are not supported, so for example you would need to use @samp{[0-9]} instead of @samp{[[:digit:]]}.
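
As a rough illustration of rewriting a character class as an explicit range, the following Python sketch expands a few POSIX class names by hand. The helper @code{expand_classes} and its table are hypothetical, cover only a handful of classes, and assume an ASCII-only reading of each class; Python's @code{re} module is used only to check the result and is not the matcher described in this section.

```python
import re

# Hypothetical table: only a few class names, ASCII interpretation assumed.
POSIX_CLASSES = {
    "[:digit:]": "0-9",
    "[:lower:]": "a-z",
    "[:upper:]": "A-Z",
    "[:alpha:]": "a-zA-Z",
    "[:alnum:]": "a-zA-Z0-9",
}

def expand_classes(pattern):
    """Replace each known POSIX class name with an explicit range.

    This is a plain text substitution, so it also rewrites a class name
    that appears outside a bracket expression.
    """
    for name, ranges in POSIX_CLASSES.items():
        pattern = pattern.replace(name, ranges)
    return pattern

expanded = expand_classes("[[:digit:]]+")
print(expanded)
print(bool(re.fullmatch(expanded, "2006")))
```

A pattern rewritten this way no longer depends on character-class support in the matcher.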
48 GNU extensions are not supported and so @samp{\w}, @samp{\W}, @samp{\<}, @samp{\>}, @samp{\b}, @samp{\B}, @samp{\`}, and @samp{\'} match @samp{w}, @samp{W}, @samp{<}, @samp{>}, @samp{b}, @samp{B}, @samp{`}, and @samp{'} respectively.
50 Grouping is performed with parentheses @samp{()}. An unmatched @samp{)} matches just itself. A backslash followed by a digit matches that digit.
52 The alternation operator is @samp{|}.
54 The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets. Within brackets, @samp{^} can be used to invert the membership of the character class being specified.
56 @samp{*}, @samp{+} and @samp{?} are special at any point in a regular expression except:
@enumerate
@item At the beginning of a regular expression
@item After an open-group, signified by @samp{(}
@item After the alternation operator @samp{|}
@end enumerate
70 The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
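
The longest-match rule can be contrasted with an engine that prefers the first alternative that succeeds. The Python sketch below is only an illustration: Python's @code{re} module is a backtracking matcher, not the engine described here, and the helper @code{longest_match} is a hypothetical stand-in that tries each alternative separately and keeps the longest result.

```python
import re

def longest_match(alternatives, text):
    """Return the longest prefix of text matched by any alternative.

    Each alternative is still matched by Python's own engine, so this
    only emulates the longest-match rule across the top-level choices.
    """
    best = None
    for alt in alternatives:
        m = re.match(alt, text)
        if m and (best is None or len(m.group()) > len(best)):
            best = m.group()
    return best

# Python takes the first alternative that succeeds.
first = re.match("a|ab", "ab").group()
# Under the longest-match rule the second alternative wins.
longest = longest_match(["a", "ab"], "ab")
print(first, longest)
```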
73 @node egrep regular expression syntax
74 @subsection @samp{egrep} regular expression syntax
77 The character @samp{.} matches any single character except newline.
@samp{+} indicates that the regular expression should match one or more occurrences of the previous atom or regexp.

@samp{?} indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
93 Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are ignored. Within square brackets, @samp{\} is taken literally. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit. Non-matching lists @samp{[^@dots{}]} do not ever match newline.
95 GNU extensions are supported:
@enumerate
@item @samp{\w} matches a character within a word
@item @samp{\W} matches a character which is not within a word
@item @samp{\<} matches the beginning of a word
@item @samp{\>} matches the end of a word
@item @samp{\b} matches a word boundary
@item @samp{\B} matches a position which is not a word boundary
@item @samp{\`} matches the beginning of the whole input
@item @samp{\'} matches the end of the whole input
@end enumerate
117 Grouping is performed with parentheses @samp{()}. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{(}.
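
Group numbering by opening parenthesis can be checked with a short Python sketch. Python's @code{re} module is not the matcher described here, but it follows the same numbering convention for unescaped parentheses and the same back-reference notation, which is the only assumption this example relies on.

```python
import re

# Groups are numbered by the position of their opening parenthesis:
# group 1 is (a)(b) taken together, group 2 is (a), group 3 is (b).
pattern = re.compile(r"((a)(b))\3\2")

m = pattern.fullmatch("abba")
groups = m.groups()
print(groups)

# "abab" fails: after "ab", \3 requires "b" but the next character is "a".
rejected = pattern.fullmatch("abab")
print(rejected)
```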
119 The alternation operator is @samp{|}.
121 The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets. Within brackets, @samp{^} can be used to invert the membership of the character class being specified.
123 The characters @samp{*}, @samp{+} and @samp{?} are special anywhere in a regular expression.
127 The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
130 @node ed regular expression syntax
131 @subsection @samp{ed} regular expression syntax
134 The character @samp{.} matches any single character except the null character.
@samp{\+} indicates that the regular expression should match one or more occurrences of the previous atom or regexp.

@samp{\?} indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
148 Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} is taken literally. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit.
150 GNU extensions are supported:
@enumerate
@item @samp{\w} matches a character within a word
@item @samp{\W} matches a character which is not within a word
@item @samp{\<} matches the beginning of a word
@item @samp{\>} matches the end of a word
@item @samp{\b} matches a word boundary
@item @samp{\B} matches a position which is not a word boundary
@item @samp{\`} matches the beginning of the whole input
@item @samp{\'} matches the end of the whole input
@end enumerate
172 Grouping is performed with backslashes followed by parentheses @samp{\(}, @samp{\)}. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{\(}.
174 The alternation operator is @samp{\|}.
176 The character @samp{^} only represents the beginning of a string when it appears:
@enumerate
@item At the beginning of a regular expression
@item After an open-group, signified by @samp{\(}
@item After the alternation operator @samp{\|}
@end enumerate
190 The character @samp{$} only represents the end of a string when it appears:
@enumerate
@item At the end of a regular expression
@item Before a close-group, signified by @samp{\)}
@item Before the alternation operator @samp{\|}
@end enumerate
@samp{*}, @samp{\+} and @samp{\?} are special at any point in a regular expression except:

@enumerate
@item At the beginning of a regular expression
@item After an open-group, signified by @samp{\(}
@item After the alternation operator @samp{\|}
@end enumerate
214 Intervals are specified by @samp{\@{} and @samp{\@}}. Invalid intervals such as @samp{a\@{1z} are not accepted.
216 The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
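
To make the role of the backslash in this syntax concrete, the following Python sketch rewrites the backslashed grouping, alternation, interval and repetition operators into the unbackslashed form that Python's @code{re} module expects. The function @code{ed_to_python} is hypothetical and deliberately incomplete: it does not handle bracket expressions or the positional rules for @samp{^}, @samp{$} and @samp{*}, and Python's matcher differs from the one described here in other respects.

```python
import re

# Characters that this syntax treats as operators only when backslashed.
BACKSLASHED = set("(){}|+?")

def ed_to_python(pattern):
    """Translate backslashed ed-style operators into Python re syntax.

    Sketch only: bracket expressions and the positional rules for
    ^, $ and * are not handled. Other escapes, including back-references
    such as \\2, are copied through unchanged.
    """
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            # \( becomes (, while any other escape is kept as written.
            out.append(nxt if nxt in BACKSLASHED else ch + nxt)
            i += 2
        elif ch in BACKSLASHED:
            # A bare ( or + is an ordinary character in this syntax.
            out.append("\\" + ch)
            i += 1
        else:
            out.append(ch)
            i += 1
    return "".join(out)

translated = ed_to_python(r"\(ab\)\{2\}")
print(translated)
print(bool(re.fullmatch(translated, "abab")))
```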
219 @node emacs regular expression syntax
220 @subsection @samp{emacs} regular expression syntax
223 The character @samp{.} matches any single character except newline.
@samp{+} indicates that the regular expression should match one or more occurrences of the previous atom or regexp.

@samp{?} indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
239 Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are ignored. Within square brackets, @samp{\} is taken literally. Character classes are not supported, so for example you would need to use @samp{[0-9]} instead of @samp{[[:digit:]]}.
241 GNU extensions are supported:
@enumerate
@item @samp{\w} matches a character within a word
@item @samp{\W} matches a character which is not within a word
@item @samp{\<} matches the beginning of a word
@item @samp{\>} matches the end of a word
@item @samp{\b} matches a word boundary
@item @samp{\B} matches a position which is not a word boundary
@item @samp{\`} matches the beginning of the whole input
@item @samp{\'} matches the end of the whole input
@end enumerate
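
The word and input anchors listed above can be approximated in other engines. The Python sketch below is an assumption-laden illustration, not a description of this syntax's implementation: Python's @code{re} module supports @samp{\w}, @samp{\W}, @samp{\b} and @samp{\B} directly but has no @samp{\<}, @samp{\>}, @samp{\`} or @samp{\'}, so the table @code{EQUIVALENTS} proposes rough substitutes built from look-around assertions.

```python
import re

# Assumed translations into Python's re syntax (not part of this syntax).
EQUIVALENTS = {
    r"\<": r"\b(?=\w)",    # boundary followed by a word character
    r"\>": r"\b(?<=\w)",   # boundary preceded by a word character
    r"\`": r"\A",          # beginning of the whole input
    r"\'": r"\Z",          # end of the whole input
}

text = "find the files"

# Offsets at which a word begins, and at which a word ends.
starts = [m.start() for m in re.finditer(EQUIVALENTS[r"\<"], text)]
ends = [m.start() for m in re.finditer(EQUIVALENTS[r"\>"], text)]
print(starts, ends)
```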
263 Grouping is performed with backslashes followed by parentheses @samp{\(}, @samp{\)}. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{\(}.
265 The alternation operator is @samp{\|}.
267 The character @samp{^} only represents the beginning of a string when it appears:
@enumerate
@item At the beginning of a regular expression
@item After an open-group, signified by @samp{\(}
@item After the alternation operator @samp{\|}
@end enumerate
281 The character @samp{$} only represents the end of a string when it appears:
@enumerate
@item At the end of a regular expression
@item Before a close-group, signified by @samp{\)}
@item Before the alternation operator @samp{\|}
@end enumerate
293 @samp{*}, @samp{+} and @samp{?} are special at any point in a regular expression except:
@enumerate
@item At the beginning of a regular expression
@item After an open-group, signified by @samp{\(}
@item After the alternation operator @samp{\|}
@end enumerate
307 The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
310 @node gnu-awk regular expression syntax
311 @subsection @samp{gnu-awk} regular expression syntax
314 The character @samp{.} matches any single character.
@samp{+} indicates that the regular expression should match one or more occurrences of the previous atom or regexp.

@samp{?} indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
330 Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} can be used to quote the following character. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit.
332 GNU extensions are supported:
@enumerate
@item @samp{\w} matches a character within a word
@item @samp{\W} matches a character which is not within a word
@item @samp{\<} matches the beginning of a word
@item @samp{\>} matches the end of a word
@item @samp{\b} matches a word boundary
@item @samp{\B} matches a position which is not a word boundary
@item @samp{\`} matches the beginning of the whole input
@item @samp{\'} matches the end of the whole input
@end enumerate
354 Grouping is performed with parentheses @samp{()}. An unmatched @samp{)} matches just itself. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{(}.
356 The alternation operator is @samp{|}.
358 The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets. Within brackets, @samp{^} can be used to invert the membership of the character class being specified.
360 @samp{*}, @samp{+} and @samp{?} are special at any point in a regular expression except:
@enumerate
@item At the beginning of a regular expression
@item After an open-group, signified by @samp{(}
@item After the alternation operator @samp{|}
@end enumerate
374 The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
377 @node grep regular expression syntax
378 @subsection @samp{grep} regular expression syntax
381 The character @samp{.} matches any single character except newline.
@samp{\+} indicates that the regular expression should match one or more occurrences of the previous atom or regexp.

@samp{\?} indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
395 Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are ignored. Within square brackets, @samp{\} is taken literally. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit. Non-matching lists @samp{[^@dots{}]} do not ever match newline.
397 GNU extensions are supported:
@enumerate
@item @samp{\w} matches a character within a word
@item @samp{\W} matches a character which is not within a word
@item @samp{\<} matches the beginning of a word
@item @samp{\>} matches the end of a word
@item @samp{\b} matches a word boundary
@item @samp{\B} matches a position which is not a word boundary
@item @samp{\`} matches the beginning of the whole input
@item @samp{\'} matches the end of the whole input
@end enumerate
419 Grouping is performed with backslashes followed by parentheses @samp{\(}, @samp{\)}. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{\(}.
421 The alternation operator is @samp{\|}.
423 The character @samp{^} only represents the beginning of a string when it appears:
@enumerate
@item At the beginning of a regular expression
@item After an open-group, signified by @samp{\(}
@item After a newline
@item After the alternation operator @samp{\|}
@end enumerate
439 The character @samp{$} only represents the end of a string when it appears:
@enumerate
@item At the end of a regular expression
@item Before a close-group, signified by @samp{\)}
@item Before a newline
@item Before the alternation operator @samp{\|}
@end enumerate
@samp{*}, @samp{\+} and @samp{\?} are special at any point in a regular expression except:

@enumerate
@item At the beginning of a regular expression
@item After an open-group, signified by @samp{\(}
@item After a newline
@item After the alternation operator @samp{\|}
@end enumerate
467 Intervals are specified by @samp{\@{} and @samp{\@}}. Invalid intervals such as @samp{a\@{1z} are not accepted.
469 The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
472 @node posix-awk regular expression syntax
473 @subsection @samp{posix-awk} regular expression syntax
476 The character @samp{.} matches any single character except the null character.
@samp{+} indicates that the regular expression should match one or more occurrences of the previous atom or regexp.

@samp{?} indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
492 Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} can be used to quote the following character. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit.
494 GNU extensions are not supported and so @samp{\w}, @samp{\W}, @samp{\<}, @samp{\>}, @samp{\b}, @samp{\B}, @samp{\`}, and @samp{\'} match @samp{w}, @samp{W}, @samp{<}, @samp{>}, @samp{b}, @samp{B}, @samp{`}, and @samp{'} respectively.
496 Grouping is performed with parentheses @samp{()}. An unmatched @samp{)} matches just itself. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{(}.
498 The alternation operator is @samp{|}.
500 The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets. Within brackets, @samp{^} can be used to invert the membership of the character class being specified.
502 @samp{*}, @samp{+} and @samp{?} are special at any point in a regular expression except the following places, where they are illegal:
@enumerate
@item At the beginning of a regular expression
@item After an open-group, signified by @samp{(}
@item After the alternation operator @samp{|}
@end enumerate
514 Intervals are specified by @samp{@{} and @samp{@}}. Invalid intervals such as @samp{a@{1z} are not accepted.
516 The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
519 @node posix-basic regular expression syntax
520 @subsection @samp{posix-basic} regular expression syntax
521 This is a synonym for ed.
522 @node posix-egrep regular expression syntax
523 @subsection @samp{posix-egrep} regular expression syntax
526 The character @samp{.} matches any single character except newline.
@samp{+} indicates that the regular expression should match one or more occurrences of the previous atom or regexp.

@samp{?} indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
542 Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are ignored. Within square brackets, @samp{\} is taken literally. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit. Non-matching lists @samp{[^@dots{}]} do not ever match newline.
544 GNU extensions are supported:
@enumerate
@item @samp{\w} matches a character within a word
@item @samp{\W} matches a character which is not within a word
@item @samp{\<} matches the beginning of a word
@item @samp{\>} matches the end of a word
@item @samp{\b} matches a word boundary
@item @samp{\B} matches a position which is not a word boundary
@item @samp{\`} matches the beginning of the whole input
@item @samp{\'} matches the end of the whole input
@end enumerate
566 Grouping is performed with parentheses @samp{()}. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{(}.
568 The alternation operator is @samp{|}.
570 The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets. Within brackets, @samp{^} can be used to invert the membership of the character class being specified.
572 The characters @samp{*}, @samp{+} and @samp{?} are special anywhere in a regular expression.
Intervals are specified by @samp{@{} and @samp{@}}. Invalid intervals are treated as literals; for example, @samp{a@{1} is treated as @samp{a\@{1}.
576 The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
579 @node posix-extended regular expression syntax
580 @subsection @samp{posix-extended} regular expression syntax
583 The character @samp{.} matches any single character except the null character.
@samp{+} indicates that the regular expression should match one or more occurrences of the previous atom or regexp.

@samp{?} indicates that the regular expression should match zero or one occurrence of the previous atom or regexp.
599 Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} is taken literally. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit.
601 GNU extensions are supported:
@enumerate
@item @samp{\w} matches a character within a word
@item @samp{\W} matches a character which is not within a word
@item @samp{\<} matches the beginning of a word
@item @samp{\>} matches the end of a word
@item @samp{\b} matches a word boundary
@item @samp{\B} matches a position which is not a word boundary
@item @samp{\`} matches the beginning of the whole input
@item @samp{\'} matches the end of the whole input
@end enumerate
623 Grouping is performed with parentheses @samp{()}. An unmatched @samp{)} matches just itself. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{(}.
625 The alternation operator is @samp{|}.
627 The characters @samp{^} and @samp{$} always represent the beginning and end of a string respectively, except within square brackets. Within brackets, @samp{^} can be used to invert the membership of the character class being specified.
629 @samp{*}, @samp{+} and @samp{?} are special at any point in a regular expression except the following places, where they are illegal:
@enumerate
@item At the beginning of a regular expression
@item After an open-group, signified by @samp{(}
@item After the alternation operator @samp{|}
@end enumerate
641 Intervals are specified by @samp{@{} and @samp{@}}. Invalid intervals such as @samp{a@{1z} are not accepted.
643 The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
646 @node posix-minimal-basic regular expression syntax
647 @subsection @samp{posix-minimal-basic} regular expression syntax
650 The character @samp{.} matches any single character except the null character.
654 Bracket expressions are used to match ranges of characters. Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid. Within square brackets, @samp{\} is taken literally. Character classes are supported; for example @samp{[[:digit:]]} will match a single decimal digit.
656 GNU extensions are supported:
@enumerate
@item @samp{\w} matches a character within a word
@item @samp{\W} matches a character which is not within a word
@item @samp{\<} matches the beginning of a word
@item @samp{\>} matches the end of a word
@item @samp{\b} matches a word boundary
@item @samp{\B} matches a position which is not a word boundary
@item @samp{\`} matches the beginning of the whole input
@item @samp{\'} matches the end of the whole input
@end enumerate
678 Grouping is performed with backslashes followed by parentheses @samp{\(}, @samp{\)}. A backslash followed by a digit acts as a back-reference and matches the same thing as the previous grouped expression indicated by that number. For example @samp{\2} matches the second group expression. The order of group expressions is determined by the position of their opening parenthesis @samp{\(}.
682 The character @samp{^} only represents the beginning of a string when it appears:
@enumerate
@item At the beginning of a regular expression
@item After an open-group, signified by @samp{\(}
@end enumerate
694 The character @samp{$} only represents the end of a string when it appears:
@enumerate
@item At the end of a regular expression
@item Before a close-group, signified by @samp{\)}
@end enumerate
706 Intervals are specified by @samp{\@{} and @samp{\@}}. Invalid intervals such as @samp{a\@{1z} are not accepted.
708 The longest possible match is returned; this applies to the regular expression as a whole and (subject to this constraint) to subexpressions within groups.
711 @node sed regular expression syntax
712 @subsection @samp{sed} regular expression syntax
713 This is a synonym for ed.