Detailed explanation of grep usage: grep and regular expressions
The first thing to remember: regular expressions are not the same as shell wildcards, and their symbols have different meanings!
A regular expression is just a notation. Any tool that supports this notation can process strings with regular expressions. vi, grep, awk, sed and others all support regular expressions.
1 Basic regular expressions
The grep tool, introduced earlier:
grep -[acinv] 'search string' filename
-a Treat a binary file as a text file when searching
-c Count the number of matching lines
-i Ignore case
-n Also output the line numbers
-v Invert the selection, that is, output the lines that do not contain the search string
The search string can be a regular expression!
1 Search for the string 'the' and output the line numbers
$ grep -n 'the' regular_express.txt
Search for lines that do not contain 'the' and output the line numbers
$ grep -nv 'the' regular_express.txt
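As a minimal sketch of the options above, the following commands run against a small throwaway file instead of regular_express.txt. The file contents and the variable names are invented for illustration.

```shell
# Hypothetical sample file; its three lines are made up for this sketch.
sample=$(mktemp)
printf '%s\n' 'The cat' 'the dog' 'a bird' > "$sample"

# -n prefixes each matching line with its line number.
with_numbers=$(grep -n 'the' "$sample")
# -v selects the lines that do NOT match.
inverted=$(grep -nv 'the' "$sample")
# -c prints only the number of matching lines.
count=$(grep -c 'the' "$sample")
# -i ignores case, so "The" is counted as well.
count_nocase=$(grep -ci 'the' "$sample")

printf '%s\n' "$with_numbers" "$inverted" "$count" "$count_nocase"
rm -f "$sample"
```

Only the second line contains a lowercase the, so -c reports 1 and -ci reports 2.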
2 Use [] to search for a set of characters
[] stands for exactly one of the characters inside it; for example, [ade] matches a, d or e.
woody@xiaoc:~/tmp$ grep -n 't[ae]st' regular_express.txt
8:I can't finish the test.
9:Oh! the soup taste good!
A ^ placed first inside [] negates the set: it matches any character other than those listed.
For example, to find lines where oo is preceded by a character other than g, use '[^g]oo' as the search string.
woody@xiaoc:~/tmp$ grep -n '[^g]oo' regular_express.txt
2:apple is my favorite food.
3:Football game is not use feet only.
18:google is the best tools for search keyword.
19:goooooogle yes!
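It may not be obvious why lines 18 and 19 appear, since google and goooooogle both start with g. The sketch below prints only the matched text, assuming a grep that supports the -o option (GNU grep does; it is not among the options covered in this article). The first input line, good, is made up; the other two are copied from the output above.

```shell
# -o prints each matched part on its own line instead of the whole line.
# "good" does not match, because its oo is preceded by g.
matched=$(printf '%s\n' 'good' \
  'google is the best tools for search keyword.' \
  'goooooogle yes!' | grep -o '[^g]oo')
printf '%s\n' "$matched"
```

Line 18 matches through too in tools, and line 19 matches through ooo, where the character in front of oo is another o rather than g.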
[] can also hold ranges: [a-z] matches the lowercase letters, [0-9] matches the digits 0 to 9, and [A-Z] matches the capital letters. [a-zA-Z0-9] matches any English letter or digit. You can of course combine a range with ^ to exclude those characters.
Search for lines containing digits
woody@xiaoc:~/tmp$ grep -n '[0-9]' regular_express.txt
5:However ,this dress is about $ 3183 dollars.
15:You are the best is menu you are the no.1.
Line anchors: ^ and $. ^ marks the beginning of a line and $ marks the end of a line (each is a position, not a character). So '^$' matches an empty line, because such a line consists of nothing but its beginning and its end.
Here ^ means something different from the ^ used inside []: it requires the string after it to be at the beginning of the line.
For example, search for lines that begin with the
woody@xiaoc:~/tmp$ grep -n '^the' regular_express.txt
12:the symbol '*' is represented as star.
Search for lines starting with lowercase letters
woody@xiaoc:~/tmp$ grep -n '^[a-z]' regular_express.txt
2:apple is my favorite food.
4:this dress doesn't fit me.
10:motorcycle is cheap than car.
12:the symbol '*' is represented as star.
18:google is the best tools for search keyword.
19:goooooogle yes!
20:go! go! Let's go.
woody@xiaoc:~/tmp$
Search for lines that do not start with English letters
woody@xiaoc:~/tmp$ grep -n '^[^a-zA-Z]' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
21:#I am VBird
woody@xiaoc:~/tmp$
$ requires the string before it to be at the end of the line; for example, '\.$' matches a . at the end of a line.
Search for lines that end with .
woody@xiaoc:~/tmp$ grep -n '\.$' regular_express.txt   # . is a special character in regular expressions, so escape it with \
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
4:this dress doesn't fit me.
5:However ,this dress is about $ 3183 dollars.
6:GNU is free air not free beer.
.....
Note that text files created on MS Windows end each line with an extra ^M character before the newline. The last character of every line is therefore a hidden ^M, so
take special care when processing text that comes from Windows!
You can delete the ^M characters with cat dos_file | tr -d '\r' > unix_file. ^M is the same character as \r.
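The effect of the hidden ^M can be sketched as follows. The two file names are temporary files created only for this illustration, and the sample text is invented.

```shell
dos_file=$(mktemp)
unix_file=$(mktemp)
# Write two lines with DOS line endings (\r\n).
printf 'first line.\r\nsecond line.\r\n' > "$dos_file"

# The last character before each newline is \r, so '\.$' finds nothing.
# (grep exits with status 1 when nothing matches, hence the "|| true".)
before=$(grep -c '\.$' "$dos_file" || true)

# Strip the carriage returns, then search again.
tr -d '\r' < "$dos_file" > "$unix_file"
after=$(grep -c '\.$' "$unix_file" || true)

printf 'before: %s, after: %s\n' "$before" "$after"
rm -f "$dos_file" "$unix_file"
```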
'^$' matches a line that has nothing between its beginning and its end, that is, an empty line.
Search for empty lines
woody@xiaoc:~/tmp$ grep -n '^$' regular_express.txt
22:
23:
woody@xiaoc:~/tmp$
Search for non-empty lines
woody@xiaoc:~/tmp$ grep -vn '^$' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
4:this dress doesn't fit me.
..........
Any single character . and the repetition character *
In bash, * is a wildcard that stands for any string of characters. In regular expressions its meaning is different: * means that the preceding character is repeated zero or more times.
For example, in oo* the first o must be present and the second o may appear zero, one or many times, so the expression matches one or more o.
The dot . stands for exactly one arbitrary character, which must be present. The wildcard pattern g??d can be written as 'g..d': good, gxxd, gabd and so on all fit.
woody@xiaoc:~/tmp$ grep -n 'g..d' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
9:Oh! the soup taste good!
16:The world is the same with 'glad'.
woody@xiaoc:~/tmp$
Search for strings of two or more o
woody@xiaoc:~/tmp$ grep -n 'ooo*' regular_express.txt   # the first two o must be present; the third o may be absent or repeated many times
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
9:Oh! the soup taste good!
18:google is the best tools for search keyword.
19:goooooogle yes!
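The zero-or-more meaning of * can be checked with a small sketch. The four sample words are invented for illustration and are not part of regular_express.txt.

```shell
sample=$(mktemp)
printf '%s\n' 'gd' 'god' 'good' 'goood' > "$sample"

# 'o*' also matches zero o, so every line matches, even gd.
zero_or_more=$(grep -c 'o*' "$sample")
# 'oo*' needs at least one o.
one_or_more=$(grep -c 'oo*' "$sample")
# 'ooo*' needs at least two consecutive o.
two_or_more=$(grep -c 'ooo*' "$sample")

printf '%s %s %s\n' "$zero_or_more" "$one_or_more" "$two_or_more"
rm -f "$sample"
```

The counts are 4, 3 and 2, which is why a bare 'o*' is rarely useful as a search string.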
Search for strings that begin and end with g and have at least one o in between, i.e. gog, goog, gooog and so on.
woody@xiaoc:~/tmp$ grep -n 'goo*g' regular_express.txt
18:google is the best tools for search keyword.
19:goooooogle yes!
Search for lines containing a string that begins and ends with g
woody@xiaoc:~/tmp$ grep -n 'g.*g' regular_express.txt   # .* means zero or more arbitrary characters
1:"Open Source" is a good mechanism to develop programs.
14:The gd software is a library for drafting programs.
18:google is the best tools for search keyword.
19:goooooogle yes!
20:go! go! Let's go.
Limiting the number of repetitions with { }
. and * cannot bound how many times a character repeats. To limit the number of repetitions, use {range}. The range is written with numbers: {2,5} means 2 to 5 times,
{2} means exactly 2 times, and {2,} means 2 or more times.
Note that in basic regular expressions the braces must be escaped with \ and written as \{ \}; an unescaped { is an ordinary character there. { } also has a special meaning in the shell, so keep the expression inside single quotes.
Search for lines containing two consecutive o.
woody@xiaoc:~/tmp$ grep -n 'o\{2\}' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
9:Oh! the soup taste good!
18:google is the best tools for search keyword.
19:goooooogle yes!
Search for lines containing g followed by 2 to 5 o and then g.
woody@xiaoc:~/tmp$ grep -n 'go\{2,5\}g' regular_express.txt
18:google is the best tools for search keyword.
Search for lines containing g followed by 2 or more o and then g.
woody@xiaoc:~/tmp$ grep -n 'go\{2,\}g' regular_express.txt
18:google is the best tools for search keyword.
19:goooooogle yes!
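The same interval can be written in both syntaxes. The sketch below, on invented sample words, compares the basic form with escaped braces against grep -E, where the braces are written without a backslash.

```shell
sample=$(mktemp)
# The last word has more than five o between the two g.
printf '%s\n' 'gog' 'goog' 'goooooog' > "$sample"

# Basic regular expression: braces escaped as \{ \}.
basic=$(grep 'go\{2,5\}g' "$sample")
# Extended regular expression (grep -E): plain braces.
extended=$(grep -E 'go{2,5}g' "$sample")
# Open-ended interval: two or more o.
open_ended=$(grep -c 'go\{2,\}g' "$sample")

printf '%s\n' "$basic" "$extended" "$open_ended"
rm -f "$sample"
```

Both bounded forms select only goog, while the open-ended form also accepts the long word.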
Note that inside [], a ^ that is not the first character and a - that is the last character have no special meaning.
'[^a-z\.!^ -]' matches a character that is not a lowercase letter, not \, not ., not !, not ^, not a space and not -. Note that there is a space inside the []. Inside [] the backslash is an ordinary character, so the . needs no escaping there.
Also, the shell negates a set with [!range], whereas regular expressions use [^range].
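The two negation spellings can be put side by side in a short sketch. The input string is hypothetical, and LC_ALL=C is set as an assumption so that [a-z] covers only the ASCII lowercase letters.

```shell
LC_ALL=C; export LC_ALL     # keep [a-z] limited to the ASCII letters
word='9lives'               # hypothetical input string

# Shell pattern: [!a-z] negates the set, and * stands for any string.
case $word in
  [!a-z]*) by_shell='not lowercase' ;;
  *)       by_shell='lowercase' ;;
esac

# Regular expression: [^a-z] negates the set, and ^ anchors at line start.
if printf '%s\n' "$word" | grep '^[^a-z]' > /dev/null; then
  by_regex='not lowercase'
else
  by_regex='lowercase'
fi

printf 'shell: %s, regex: %s\n' "$by_shell" "$by_regex"
```

Both tests agree that the first character is not a lowercase letter, each using its own notation.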
2 Extended regular expressions
Extended regular expressions add a few special symbols to the basic regular expressions,
which makes certain operations more convenient.
For example, to remove empty lines and lines that begin with #, we could write:
woody@xiaoc:~/tmp$ grep -v '^$' regular_express.txt | grep -v '^#'
"Open Source" is a good mechanism to develop programs.
apple is my favorite food.
Football game is not use feet only.
this dress doesn't fit me.
............
However, egrep, which supports extended regular expressions, together with the special symbol | makes this much more convenient.
Note that plain grep interprets patterns as basic regular expressions, while egrep supports the extended ones. In fact egrep is equivalent to grep -E, so grep -E also supports the extended syntax.
So:
woody@xiaoc:~/tmp$ egrep -v '^$|^#' regular_express.txt
"Open Source" is a good mechanism to develop programs.
apple is my favorite food.
Football game is not use feet only.
this dress doesn't fit me.
....................
Here | expresses an or relationship: the pattern matches a line that satisfies ^$ or ^#.
Here are the special symbols added by the extended syntax:
+, similar to *, means one or more repetitions of the preceding character.
?, similar to *, means zero or one occurrence of the preceding character.
|, means or; for example, 'gd|good|dog' matches a string containing gd, good or dog.
(), groups part of the pattern into a unit. For example, to search for glad or good you can write 'g(la|oo)d'.
The advantage of () is that +, ?, * and so on can be applied to the whole group.
For example, to search for a string that begins with A, ends with C and has at least one xyz in between, write 'A(xyz)+C'.
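The grouping symbols can be tried out with grep -E on a few invented words. This is only a sketch; the sample file is not part of the article.

```shell
sample=$(mktemp)
printf '%s\n' 'glad' 'good' 'gd' 'AC' 'AxyzC' 'AxyzxyzC' > "$sample"

# Alternation inside a group: g, then la or oo, then d.
alternation=$(grep -E 'g(la|oo)d' "$sample")
# + applied to a group: one or more xyz between A and C.
repeated=$(grep -E 'A(xyz)+C' "$sample")
# ? applied to a group: zero or one xyz between A and C.
optional=$(grep -E -c 'A(xyz)?C' "$sample")

printf '%s\n' "$alternation" "$repeated" "$optional"
rm -f "$sample"
```

AC is selected only by the ? form, and AxyzxyzC only by the + form.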
◎ grep -- print lines matching a pattern
◎ Syntax: grep [options] PATTERN [FILE...]
grep searches the named files for lines that match the pattern or, when no file is given,
reads from standard input. By default, grep prints the matching lines.
There are also two variant programs of grep: egrep and fgrep.
egrep is equivalent to grep -E, and fgrep is equivalent to grep -F.
◎ Options
1. -A NUM, --after-context=NUM
In addition to the matching lines, also print the NUM lines after each match.
ex: $ grep -A 1 panda file
(Search file for lines matching panda and also show the 1 line after each match)
2. -a or --text
grep is meant for searching text files. If a binary file is given as the search target,
the message Binary file <name> matches is printed and the search ends there.
With the -a option, a binary file is treated as a text file and searched normally.
Equivalent to the option --binary-files=text.
ex: (Search the binary file mv for the pattern panda)
(wrong!!!)
$ grep panda mv
Binary file mv matches
(This only says that the file contains a match. For details see --binary-files=TYPE)
$
(correct!!!)
$ grep -a panda mv
3. -B NUM, --before-context=NUM
The counterpart of -A NUM: in addition to the matching lines,
also print the NUM lines before each match.
ex: (Search file for lines matching panda and also show the 1 line before each match)
$ grep -B 1 panda file
4. -C [NUM], -NUM, --context[=NUM]
Print the matching lines together with NUM lines of context before and after each match. The default is 2 (in older versions of GNU grep; newer versions expect NUM to be given explicitly).
ex: (List the lines in file that contain panda, together with the 2 lines above and below each)
(To change the default, just give NUM directly)
$ grep -C[NUM] panda file
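The three context options can be compared on one small file. The five-line sample below is invented for this sketch.

```shell
sample=$(mktemp)
printf '%s\n' 'one' 'two' 'panda' 'four' 'five' > "$sample"

# -A 1: the matching line plus 1 line after it.
after=$(grep -A 1 panda "$sample")
# -B 1: the matching line plus 1 line before it.
before=$(grep -B 1 panda "$sample")
# -C 1: 1 line of context on both sides.
around=$(grep -C 1 panda "$sample")

printf '%s\n---\n%s\n---\n%s\n' "$after" "$before" "$around"
rm -f "$sample"
```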
5. -b, --byte-offset
Print, before each matching line, its byte offset within the input.
ex: $ grep -b panda file
The output looks like:
0:panda
66:pandahuang
123:panda03
6. --binary-files=TYPE
TYPE defaults to binary. When a binary file is searched in the normal way, there are only 2 possible results:
1. If there is a match: Binary file <name> matches is printed.
2. If there is no match: nothing is printed.
If TYPE is without-match,
grep assumes that a binary file contains no match for the pattern; this is the same as the -I option.
If TYPE is text, grep treats a binary file as a text file; this is the same as the -a option.
Warning: with --binary-files=text, binary garbage may be printed if the output goes to a terminal.
7. -c, --count
Do not print the matching lines; print only the number of lines that match.
Combined with -v, --invert-match, print the number of lines that do not match.
8. -d ACTION, --directories=ACTION
If an input file is a directory, use ACTION to process it.
The default ACTION is read, which means the directory is read as if it were an ordinary file;
if ACTION is skip, grep skips the directory;
if ACTION is recurse, grep reads all the files under the directory, recursively.
This is equivalent to the -r option.
9. -E, --extended-regexp
Interpret the pattern as an extended regular expression.
10. -e PATTERN, --regexp=PATTERN
Use PATTERN as the pattern; mostly used to protect a pattern that begins with -.
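A short sketch of why -e is useful; the sample lines are invented. Without -e, grep would read '-v' as its own -v option instead of as a search string.

```shell
sample=$(mktemp)
printf '%s\n' '-v inverts the match' 'plain text' 'more text' > "$sample"

# -e marks the next argument as the pattern, even though it starts with -.
dash=$(grep -e '-v' "$sample")

# -e may be repeated; a line matching any of the patterns is selected.
either=$(grep -c -e 'plain' -e 'inverts' "$sample")

printf '%s\n%s\n' "$dash" "$either"
rm -f "$sample"
```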
11. -f FILE, --file=FILE
Write the patterns to search for into a file beforehand, one pattern per line,
then let grep read its patterns from that file.
An empty file contains no patterns and therefore matches nothing.
ex: (Use newfile as the pattern file to search file)
$ grep -f newfile file
12. -G, --basic-regexp
Interpret the pattern as a basic regular expression. (This is the default)
13. -H, --with-filename
Print the file name before each matching line; if the file was given with a path, the path is shown too.
ex: (Search file and testfile for the pattern panda)
$ grep -H panda file ./testfile
file:panda
./testfile:panda
$
14. -h, --no-filename
The opposite of -H: do not print the file name before the matching lines, even when several files are searched.
15. --help
Print a brief help message.
16. -I
Make grep assume that a binary file contains no match for the pattern.
Same as the option --binary-files=without-match.
ex: $ grep -I panda mv
17. -i, --ignore-case
Ignore case distinctions in both the pattern and the files being searched.
ex: $ grep -i panda mv
18. -L, --files-without-match
Instead of the normal output, print the name of each file that contains no match.
19. -l, --files-with-matches
Instead of the normal output, print only the name of each file that contains a match.
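The difference between -l and -L can be sketched with two throwaway files; the names a.txt and b.txt and their contents are made up.

```shell
dir=$(mktemp -d)
printf 'panda\n' > "$dir/a.txt"
printf 'tiger\n' > "$dir/b.txt"

# -l lists the files that contain at least one match.
with_match=$(cd "$dir" && grep -l panda a.txt b.txt)
# -L lists the files that contain no match at all.
without_match=$(cd "$dir" && grep -L panda a.txt b.txt)

printf 'with: %s, without: %s\n' "$with_match" "$without_match"
rm -rf "$dir"
```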
20. --mmap
If possible, use the mmap system call to read input instead of the default read system call.
In some situations --mmap yields better performance. However,
if a file shrinks while grep is operating on it, or an I/O error occurs,
--mmap may cause undefined behavior (including a core dump).
21. -n, --line-number
Prefix each output line with its line number.
ex: $ grep -n panda file
The output looks like this:
line number:content of the matching line
22. -q, --quiet, --silent
Do not print the normal output. See also -s or --no-messages.
23. -r, --recursive
Recurse into directories and read all files under each one; equivalent to the -d recurse option.
24. -s, --no-messages
Do not print error messages about nonexistent or unreadable files.
Note: unlike GNU grep, traditional grep did not conform to POSIX.2,
because it lacked the -q option and its -s option behaved like GNU grep's -q option.
Shell scripts intended to be portable to traditional grep should avoid the -q and -s options
and redirect the output to /dev/null instead.
POSIX: defines the functions that UNIX and UNIX-like systems need to provide.
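The portable style described in the note can be sketched as follows. It relies on the usual grep exit status convention (0 when a line matched, 1 when none matched, greater than 1 on an error), which this article does not spell out; the file contents are invented.

```shell
sample=$(mktemp)
printf '%s\n' 'panda' > "$sample"

# Discard the matching lines and any error messages; only the exit
# status is used to decide whether the pattern was found.
if grep panda "$sample" > /dev/null 2>&1; then
  found='yes'
else
  found='no'
fi

# A pattern that is absent gives exit status 1.
absent=0
grep tiger "$sample" > /dev/null 2>&1 || absent=$?

# A missing file is an error; the status is then greater than 1.
missing=0
grep panda "$sample.missing" > /dev/null 2>&1 || missing=$?

printf 'found=%s absent=%s missing=%s\n' "$found" "$absent" "$missing"
rm -f "$sample"
```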
25. -V, --version
Print the version number of grep to standard error.
When you report bugs in grep, the version number must be included.
26. -v, --invert-match
Print all lines except those that match the pattern.
27. -w, --word-regexp
Treat the pattern as a whole word: only lines in which the match forms a complete word are listed.
28. -x, --line-regexp
Treat the pattern as a whole line: only lines that match the pattern exactly, from beginning to end, are listed.
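A small sketch comparing a plain search with -w and -x; the three sample lines are invented.

```shell
sample=$(mktemp)
printf '%s\n' 'panda' 'pandas' 'red panda' > "$sample"

# Plain search: panda may appear anywhere, even inside a longer word.
plain=$(grep -c panda "$sample")
# -w: panda must stand as a complete word, so pandas is excluded.
word=$(grep -w panda "$sample")
# -x: the whole line must be exactly panda.
line=$(grep -x panda "$sample")

printf '%s\n---\n%s\n---\n%s\n' "$plain" "$word" "$line"
rm -f "$sample"
```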
grep parameters
1. -c Show the number of matching lines (that is, how many lines match);
2. -n Show the line number of each matching line;
3. -i Ignore case when matching;
4. -s Do not output error messages;
5. -v Output the lines that do not match;
6. -x Output only lines that match exactly as a whole;
7. \ Remove the special meaning of the following character in an expression;
8. ^ Match the expression at the beginning of a line;
9. $ Match the expression at the end of a line;
10. \< Match the expression at the beginning of a word;
11. \> Match the expression at the end of a word;
12. [ ] A single character from a set (e.g. with [A], only A meets the requirement);
13. [ - ] A range; e.g. with [A-Z], A, B, C and so on up to Z all meet the requirement;
14. . Any single character;
15. * Zero or more repetitions of the preceding character, so the length can be 0;
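The word anchors \< and \> from the list above can be sketched like this, assuming a grep that supports them (GNU grep does). The four sample words are invented.

```shell
words=$(printf '%s\n' 'the' 'then' 'bathe' 'other')

# \< anchors at the beginning of a word.
starts=$(printf '%s\n' "$words" | grep '\<the')
# \> anchors at the end of a word.
ends=$(printf '%s\n' "$words" | grep 'the\>')
# Both together select the as a complete word.
whole=$(printf '%s\n' "$words" | grep '\<the\>')

printf '%s\n---\n%s\n---\n%s\n' "$starts" "$ends" "$whole"
```

other is never selected, because its the sits in the middle of the word.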
What escaping is: treating a special symbol as an ordinary symbol
Notes:
1. The meaning of ^ inside and outside []
2. When escaping is needed
3. The difference between * in bash and in regular expressions
4. -acinv
June 24, 2014 11:24:50