
Detailed explanation of grep usage: grep and regular expressions

The first thing to remember: regular expressions are not the same as shell wildcards, and the same symbol can mean different things in each.
A regular expression is just a notation. Any tool that supports the notation can process strings with it; vi, grep, awk, sed and others all support regular expressions.

1 Basic regular expressions

The grep tool was introduced earlier.
grep -[acinv] 'search string' filename
-a Treat a binary file as text when searching
-c Print the number of matching lines
-i Ignore case
-n Also print line numbers
-v Invert the selection, that is, print the lines that do not contain the search string
The search string can be a regular expression.

1 Search for lines containing the, and print the line numbers
$ grep -n 'the' regular_express.txt
Search for lines that do not contain the, and print the line numbers
$ grep -nv 'the' regular_express.txt
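
The options above can be tried without the article's regular_express.txt. The following is a minimal sketch that assumes a hypothetical three-line sample file created on the fly; the variable names are illustrative only.

```shell
# Hypothetical three-line sample; not the article's regular_express.txt.
tmp=$(mktemp)
printf '%s\n' 'the first line' 'A second line' 'The third line' > "$tmp"

with_the=$(grep -n 'the' "$tmp")         # lines containing lowercase "the"
without_the=$(grep -nv 'the' "$tmp")     # lines that do not contain it
count_any_case=$(grep -ci 'the' "$tmp")  # number of matching lines, ignoring case

printf '%s\n' "$with_the" "$without_the" "$count_any_case"
rm -f "$tmp"
```

Note that -c counts matching lines, not individual occurrences, so a line containing the twice would still count once.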

2 Use [] to match a set of characters
[] stands for exactly one character from the set inside it; for example, [ade] means a or d or e.
woody@xiaoc:~/tmp$ grep -n 't[ae]st' regular_express.txt
8:I can't finish the test.
9:Oh! the soup taste good!

A ^ placed first inside [] negates the set: it matches any one character that is not listed in [].
For example, to find lines where oo is preceded by a character other than g, use '[^g]oo' as the search string.
woody@xiaoc:~/tmp$ grep -n '[^g]oo' regular_express.txt
2:apple is my favorite food.
3:Football game is not use feet only.
18:google is the best tools for search keyword.
19:goooooogle yes!
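
Lines 18 and 19 may look surprising, since both contain goo. A line is printed as soon as any oo in it is preceded by a character other than g: tools contains too, and in goooooogle an o is followed by oo. The sketch below uses -o to print only the matched text; -o is available in GNU and BSD grep but is not required by POSIX, so treat its availability as an assumption. The sample strings are taken from the listing above.

```shell
# Print only the part of each line that '[^g]oo' actually matched.
in_tools=$(printf 'google is the best tools\n' | grep -o '[^g]oo')
in_long=$(printf 'goooooogle yes!\n' | grep -o '[^g]oo' | head -n 1)
# A line whose only oo follows a g does not match at all.
only_goo=$(printf 'google\n' | grep -c '[^g]oo')

printf '%s\n' "$in_tools" "$in_long" "$only_goo"
```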

[] can also hold a range: [a-z] means the lowercase letters, [0-9] the digits 0 to 9, and [A-Z] the uppercase letters. [a-zA-Z0-9] means any English letter or digit. As before, ^ can be used to exclude characters.
Search for lines containing digits
woody@xiaoc:~/tmp$ grep -n '[0-9]' regular_express.txt

5:However ,this dress is about $ 3183 dollars.
15:You are the best is menu you are the no.1.

Line anchors ^ and $. ^ marks the beginning of the line and $ marks the end of the line (they are positions, not characters). So '^$' matches an empty line, because the beginning is immediately followed by the end.

Here ^ means something different from the ^ used inside []: it says that the string after ^ must be at the beginning of the line.
For example, search for lines that begin with the
woody@xiaoc:~/tmp$ grep -n '^the' regular_express.txt
12:the symbol '*' is represented as star.

Search for lines starting with lowercase letters
woody@xiaoc:~/tmp$ grep -n '^[a-z]' regular_express.txt
2:apple is my favorite food.
4:this dress doesn't fit me.
10:motorcycle is cheap than car.
12:the symbol '*' is represented as star.
18:google is the best tools for search keyword.
19:goooooogle yes!
20:go! go! Let's go.
woody@xiaoc:~/tmp$

Search for lines that do not start with English letters
woody@xiaoc:~/tmp$ grep -n '^[^a-zA-Z]' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
21:#I am VBird
woody@xiaoc:~/tmp$

$ says that the string before it is at the end of the line; for example, '\.$' matches a . at the end of a line.
Search for lines that end with .
woody@xiaoc:~/tmp$ grep -n '\.$' regular_express.txt    # . is a special character in regular expressions, so escape it with \
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
4:this dress doesn't fit me.
5:However ,this dress is about $ 3183 dollars.
6:GNU is free air not free beer.
.....

Note that text files created on MS Windows end each line with an extra ^M character (carriage return). The last character of the line is then a hidden ^M, so take special care when processing text that comes from Windows.

You can delete the ^M characters with cat dos_file | tr -d '\r' > unix_file. ^M is \r.
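
The effect of the hidden ^M on a pattern anchored with $ can be seen in a small sketch. It assumes a one-line sample generated with printf rather than a real Windows file.

```shell
# A line ending in CR LF: the character before the newline is \r, not the dot.
before=$(printf 'ends with a dot.\r\n' | grep -c '\.$')
# After stripping \r the dot really is the last character on the line.
after=$(printf 'ends with a dot.\r\n' | tr -d '\r' | grep -c '\.$')

printf 'matches before: %s, after: %s\n' "$before" "$after"
```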

'^$' matches a line with nothing between its beginning and its end, that is, an empty line.
Search for empty lines
woody@xiaoc:~/tmp$ grep -n '^$' regular_express.txt
22:
23:
woody@xiaoc:~/tmp$

Search for non-empty lines
woody@xiaoc:~/tmp$ grep -vn '^$' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
4:this dress doesn't fit me.
..........

Any character . and the repetition operator *

In bash, * is a wildcard that stands for any string of characters, but in a regular expression it means something different: * means zero or more repetitions of the preceding character.

For example, in oo* the first o must be present and the second o may appear zero or more times, so the pattern matches one or more o.

A dot . stands for exactly one arbitrary character, which must be present. The shell pattern g??d can be written as 'g..d'; good, gxxd, gabd and so on all match.


woody@xiaoc:~/tmp$ grep -n 'g..d' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
9:Oh! the soup taste good!
16:The world is the same with 'glad'.
woody@xiaoc:~/tmp$

Search for strings of two or more o
woody@xiaoc:~/tmp$ grep -n 'ooo*' regular_express.txt    # the first two o must be present; the third may be absent or repeated
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
9:Oh! the soup taste good!
18:google is the best tools for search keyword.
19:goooooogle yes!

Search for strings that begin and end with g and have at least one o in between, i.e. gog, goog, gooog and so on.
woody@xiaoc:~/tmp$ grep -n 'goo*g' regular_express.txt
18:google is the best tools for search keyword.
19:goooooogle yes!

Search for lines containing a string that begins and ends with g
woody@xiaoc:~/tmp$ grep -n 'g.*g' regular_express.txt    # .* means zero or more arbitrary characters
1:"Open Source" is a good mechanism to develop programs.
14:The gd software is a library for drafting programs.
18:google is the best tools for search keyword.
19:goooooogle yes!
20:go! go! Let's go.
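
To compare . and * side by side, here is a small sketch that assumes a hypothetical word list instead of regular_express.txt. Each variable holds the words selected by one pattern.

```shell
# Hypothetical word list, one word per line.
words=$(printf '%s\n' gd god good goood glad)

two_any=$(printf '%s\n' "$words" | grep 'g..d' | xargs)    # exactly two characters between g and d
zero_plus=$(printf '%s\n' "$words" | grep 'go*d' | xargs)  # zero or more o
one_plus=$(printf '%s\n' "$words" | grep 'goo*d' | xargs)  # one or more o
anything=$(printf '%s\n' "$words" | grep -c 'g.*d')        # any characters, including none

printf '%s\n' "$two_any" "$zero_plus" "$one_plus" "$anything"
```

The word gd is selected by 'go*d' but not by 'goo*d', which is the practical difference between "zero or more" and "one or more".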


Limiting the number of repetitions with { }

. and * put no upper limit on repetition. To limit how many times a character repeats, use {range}. The range is given as numbers: {2,5} means 2 to 5 times, {2} means exactly 2 times, and {2,} means 2 or more times.

Note that in a basic regular expression the braces must be written as \{ and \}; an unescaped { is an ordinary character there. (With grep -E the braces are written without the backslash.)

Search for lines containing two consecutive o.
woody@xiaoc:~/tmp$ grep -n 'o\{2\}' regular_express.txt
1:"Open Source" is a good mechanism to develop programs.
2:apple is my favorite food.
3:Football game is not use feet only.
9:Oh! the soup taste good!
18:google is the best tools for search keyword.
19:goooooogle yes!

Search for lines containing g, then 2 to 5 o, then g.
woody@xiaoc:~/tmp$ grep -n 'go\{2,5\}g' regular_express.txt
18:google is the best tools for search keyword.


Search for lines containing g, then 2 or more o, then g.
woody@xiaoc:~/tmp$ grep -n 'go\{2,\}g' regular_express.txt
18:google is the best tools for search keyword.
19:goooooogle yes!
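
The three interval forms can be compared on a short word list. This is a sketch under the assumption of a hypothetical list of words; the last command shows the same interval written for grep -E, where the braces are not escaped.

```shell
# Hypothetical word list: 1, 2, 3 and 6 o between the two g.
words=$(printf '%s\n' gog goog gooog goooooog)

exactly_two=$(printf '%s\n' "$words" | grep 'go\{2\}g' | xargs)
two_to_five=$(printf '%s\n' "$words" | grep 'go\{2,5\}g' | xargs)
two_or_more=$(printf '%s\n' "$words" | grep 'go\{2,\}g' | xargs)
ere_form=$(printf '%s\n' "$words" | grep -E 'go{2,5}g' | xargs)

printf '%s\n' "$exactly_two" "$two_to_five" "$two_or_more" "$ere_form"
```

Because each pattern has a g on both sides, the count is exact. A bare 'o\{2\}' has no such boundary, so it also selects lines with longer runs of o, as the earlier listing shows.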


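Note that inside [], the characters ^ and - lose their special meaning when they are not in their special positions (^ first, - between two characters), so they can be placed after the other content in [].
'[^a-z\.!^ -]' matches one character that is not a lowercase letter, not ., not !, not ^, not a space and not -. Note the space inside the []. (Inside [] the backslash is itself literal in POSIX regular expressions, so this set excludes \ as well, and . needs no escaping.)

Also, the shell writes negation as [!range], while regular expressions write it as [^range].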
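
The two negation spellings can be put next to each other. The sketch below assumes a hypothetical helper function named classify; it uses a shell pattern in a case statement and then the equivalent regular expression with grep.

```shell
# Shell pattern inside case: [!0-9] means "one character that is not a digit".
# classify is a hypothetical helper used only for this illustration.
classify() {
  case "$1" in
    [!0-9]*) echo "starts with a non-digit" ;;
    *)       echo "starts with a digit or is empty" ;;
  esac
}
glob_result=$(classify "abc")

# Regular expression: the same idea is spelled [^0-9].
regex_result=$(printf '%s\n' abc 1bc | grep '^[^0-9]')

printf '%s\n' "$glob_result" "$regex_result"
```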


2 Extended regular expressions

Extended regular expressions add a few special constructs to the basic ones, which makes certain operations more convenient.
For example, to remove empty lines and lines that begin with #, we would write:
woody@xiaoc:~/tmp$ grep -v '^$' regular_express.txt | grep -v '^#'
"Open Source" is a good mechanism to develop programs.
apple is my favorite food.
Football game is not use feet only.
this dress doesn't fit me.
............

However, egrep, which supports extended regular expressions, and the special symbol | make this much more convenient.
Note that grep uses basic regular expressions by default, while egrep uses extended ones. egrep is simply equivalent to grep -E, so grep -E supports the extended syntax too.
So:
woody@xiaoc:~/tmp$ egrep -v '^$|^#' regular_express.txt
"Open Source" is a good mechanism to develop programs.
apple is my favorite food.
Football game is not use feet only.
this dress doesn't fit me.
....................
Here | means or: a line matches if it matches ^$ or ^#.


Here are the special symbols added by the extended syntax:
+ works like *, but means one or more repetitions of the preceding character.
? works like *, but means zero or one occurrence of the preceding character.

| means or; for example, 'gd|good|dog' matches a string containing gd, good or dog.
() groups part of the pattern into a unit. For example, to search for glad or good, use 'g(la|oo)d'.
The advantage of () is that +, ?, * and so on can be applied to the group.

For example, to search for a string that begins with A, ends with C, and has one or more xyz in between, use 'A(xyz)+C'.
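
The grouping examples can be checked with grep -E on a few sample strings. This is a minimal sketch; the word lists are assumptions made up for the illustration.

```shell
# + applied to a group: one or more repetitions of xyz between A and C.
repeated=$(printf '%s\n' AC AxyzC AxyzxyzC AxyC | grep -E 'A(xyz)+C' | xargs)
# ? applied to a group: xyz is optional, but may not repeat.
optional=$(printf '%s\n' AC AxyzC AxyzxyzC AxyC | grep -E 'A(xyz)?C' | xargs)
# | inside a group: glad or good, but not gold.
either=$(printf '%s\n' glad good gold | grep -E 'g(la|oo)d' | xargs)

printf '%s\n' "$repeated" "$optional" "$either"
```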


grep -- print lines matching a pattern

 ◎ Syntax: grep [options] PATTERN [FILE...]

 grep searches the named FILEs for lines matching PATTERN, or reads standard input when no file is given. By default, grep prints the matching lines.

 There are also two variant programs, egrep and fgrep: egrep is equivalent to grep -E, and fgrep is equivalent to grep -F.

 

 ◎ Options

    1. -A NUM, --after-context=NUM
               Besides the matching line, also print the NUM lines that follow it.

         ex:   $ grep -A 1 panda file
               (search file for lines matching panda and print each match plus the 1 line after it)

                                 

    2. -a or --text
               grep is meant for searching text files. If a binary file is searched,
               it prints the message: Binary file NAME matches, and stops there.

               With -a, a binary file is searched as if it were a text file.
               This is equivalent to --binary-files=text.

         ex:   (search the binary file mv for the pattern panda)

               (wrong)
               $ grep panda mv
               Binary file mv matches
               (this only says that the file contains a match; see --binary-files=TYPE for details)
               $
               (right)
               $ grep -a panda mv
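
A binary input can be simulated without a real executable. The sketch below assumes that a NUL byte is enough for grep to classify the input as binary, which is how GNU grep behaves; the exact wording of the default message differs between versions, so it is printed but not relied on.

```shell
# One line of text with a NUL byte in it, so grep sees binary data.
default_output=$(printf 'panda\000bear\n' | grep 'panda')
# With -a the same input is searched as ordinary text; -c counts matching lines.
as_text=$(printf 'panda\000bear\n' | grep -a -c 'panda')

printf 'default: %s\n' "$default_output"
printf 'with -a: %s matching line(s)\n' "$as_text"
```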

       

    3. -B NUM, --before-context=NUM
               The counterpart of -A NUM: besides the matching line,
               also print the NUM lines before it.

         ex:   (search file for lines matching panda and print each match plus the 1 line before it)
               $ grep -B 1 panda file

    4. -C [NUM], -NUM, --context[=NUM]
               Print the matching line together with NUM lines of context before and after it; the default is 2.
               (Current GNU grep releases expect NUM to be given explicitly.)

         ex:   (print the lines of file matching panda, plus 2 lines before and after each)
               (to change the amount of context, just change NUM)
               $ grep -C 2 panda file
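
The three context options differ only in which side of the match they print. A minimal sketch, assuming a hypothetical five-line input with a single match in the middle:

```shell
# Hypothetical five-line input; panda is on line 3.
sample=$(printf '%s\n' one two panda four five)

after=$(printf '%s\n' "$sample" | grep -A 1 'panda' | xargs)   # match plus 1 line after
before=$(printf '%s\n' "$sample" | grep -B 1 'panda' | xargs)  # match plus 1 line before
around=$(printf '%s\n' "$sample" | grep -C 1 'panda' | xargs)  # 1 line on each side

printf '%s\n' "$after" "$before" "$around"
```

When there are several matches that are far apart, grep separates the groups of context lines with a line containing --.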

              

    5. -b, --byte-offset
               Print, before each output line, the byte offset of that line within the input.

          ex:  $ grep -b  panda file
       The output looks like:
         0:panda
        66:pandahuang
       123:panda03
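
The offsets are counted in bytes from the start of the input, newlines included. A small sketch with an assumed three-line input makes the arithmetic visible: panda plus its newline is 6 bytes and bear plus its newline is 5, so the third line starts at byte 11.

```shell
# Three short lines: "panda\n" is 6 bytes, "bear\n" is 5 bytes.
offsets=$(printf 'panda\nbear\npanda03\n' | grep -b 'panda' | xargs)

printf '%s\n' "$offsets"
```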

           

    6. --binary-files=TYPE
               TYPE defaults to binary. When a binary file is searched in the normal way, there are only 2 possible results:
                 1. If there is a match: the message Binary file NAME matches is printed.
                 2. If there is no match: nothing is printed.

               If TYPE is without-match,
               grep assumes that a binary file contains no match; this is the same as the -I option.

               If TYPE is text, grep treats a binary file as a text file; this is the same as the -a option.

     Warning: with --binary-files=text, if the output is a terminal, binary garbage may be printed to it.

              

    7. -c, --count
       Do not print the matching lines; print only the number of lines that match.
       Combined with -v, --invert-match, print the number of lines that do not match.

    8. -d ACTION, --directories=ACTION
               If an input file is a directory, use ACTION to process it.
       The default ACTION is read, that is, the directory is read as if it were an ordinary file;
       if ACTION is skip, the directory is skipped by grep;
       if ACTION is recurse, grep reads all the files under the directory.
       This is equivalent to the -r option.

 

    9.  -E, --extended-regexp
       Interpret PATTERN as an extended regular expression.

   10.  -e PATTERN, --regexp=PATTERN
       Use PATTERN as the pattern; mostly used to protect a pattern that begins with -.
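
Two common uses of -e can be sketched with made-up input: searching for text that starts with a dash, and giving more than one pattern in a single command.

```shell
# Without -e, '-v' would be read as the invert-match option rather than a pattern.
dash=$(printf '%s\n' '-v inverts' 'plain text' | grep -e '-v')
# Several -e options: a line matches if it matches any of the patterns.
either=$(printf '%s\n' panda bear cat | grep -c -e 'panda' -e 'bear')

printf '%s\n' "$dash" "$either"
```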

 

   11.  -f FILE, --file=FILE
       Write the patterns to search for into a file beforehand, one pattern per line,
       then search using that file.
       An empty file contains no patterns, so nothing will match.

   ex: (search file using newfile as the pattern file)
       $ grep -f newfile file
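
Each line of the pattern file is a separate regular expression, and a line of the searched file is printed if it matches any of them. A minimal sketch, assuming hypothetical file names patterns and animals in a temporary directory:

```shell
dir=$(mktemp -d)
# Two patterns, one per line: the word panda anywhere, or bear at the start of a line.
printf '%s\n' 'panda' '^bear' > "$dir/patterns"
printf '%s\n' 'red panda' 'bear cub' 'polar bear' 'cat' > "$dir/animals"

matched=$(grep -f "$dir/patterns" "$dir/animals" | xargs)

printf '%s\n' "$matched"
rm -rf "$dir"
```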

 

   12.  -G, --basic-regexp
       Interpret PATTERN as a basic regular expression. (This is the default.)

   13.  -H, --with-filename
       Print the file name before each matching line; if the file was given with a path, the path is shown too.

   ex: (search file and testfile for the pattern panda)
       $ grep -H panda file ./testfile
                file:panda
                ./testfile:panda
                $

     

   14.  -h, --no-filename
               The opposite of -H: do not prefix output lines with file names, even when several files are searched.

   15.  --help
               Print a brief help message.

   16.  -I
               Treat a binary file as if it contained no match.
               Equivalent to --binary-files=without-match.

           ex:  $ grep -I  panda mv

   17.  -i, --ignore-case
               Ignore case distinctions in both the pattern and the files being searched.

           ex:  $ grep -i panda mv

   18.  -L, --files-without-match
               Suppress the normal output; instead print the name of each file that contains no match.

   19.  -l, --files-with-matches
               Suppress the normal output; instead print the name of each file that contains a match.
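
The two file-listing options are complements of each other. The sketch below assumes two hypothetical files, a.txt and b.txt, created in a temporary directory.

```shell
dir=$(mktemp -d)
printf 'panda\n' > "$dir/a.txt"
printf 'bear\n'  > "$dir/b.txt"

# Run inside the directory so that only the bare file names are printed.
with_match=$(cd "$dir" && grep -l 'panda' a.txt b.txt)
without_match=$(cd "$dir" && grep -L 'panda' a.txt b.txt)

printf '%s\n' "$with_match" "$without_match"
rm -rf "$dir"
```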

 

   20.  --mmap
               If possible, read input with the mmap system call instead of the default read system call.
               In some situations --mmap gives better performance. However, --mmap can cause
               undefined behavior (including a core dump) if the file shrinks while grep is
               running or if an I/O error occurs.

   21.  -n, --line-number
               Prefix each output line with its line number.

            ex:  $ grep -n  panda file
                The output looks like:
                line number:content of the matching line

 

   22.  -q, --quiet, --silent
               Do not write the normal output. See also -s or --no-messages.

   23.  -r, --recursive
       Read all files under each directory recursively; equivalent to -d recurse.

   24.  -s, --no-messages
       Do not print error messages about nonexistent or unreadable files.

 Note: unlike GNU grep, traditional grep does not conform to POSIX.2,
       because it lacks the -q option and its -s option behaves like GNU grep's -q.
       Shell scripts meant to be portable to traditional grep should avoid -q and -s
       and redirect the output to /dev/null instead.

 POSIX: defines the functions that UNIX and UNIX-like systems need to provide.
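
In scripts, grep is often used only for its exit status: 0 when a match was found, non-zero otherwise. The sketch below shows the -q form and the redirect form recommended by the note above; the input text and the variable names are assumptions for the illustration.

```shell
# -q: print nothing, just report through the exit status.
if printf 'red panda\n' | grep -q 'panda'; then
  quiet=found
else
  quiet=missing
fi

# Portable alternative from the note above: discard the output instead of using -q or -s.
if printf 'red panda\n' | grep 'bear' >/dev/null 2>&1; then
  redirected=found
else
  redirected=missing
fi

printf '%s\n' "$quiet" "$redirected"
```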

    

   25.  -V, --version
  Print the version number of grep to standard error.
  When you report a bug in grep, the version number must be included.

   26.  -v, --invert-match
  Print every line except those that match the pattern.

   27.  -w, --word-regexp
          Treat the pattern as a whole word: only lines in which the match forms a complete word are printed.

   28.  -x, --line-regexp
          Print only lines that the pattern matches in their entirety.

grep parameters

1.   -c Print the number of matching lines;
2.   -n Print the line number of each matching line within its file;
3.   -i Ignore case when matching;
4.   -s Do not print error messages;
5.   -v Print the lines that do not match;
6.   -x Print only lines that match in their entirety;
7.   \ Take away the special meaning of the following character;
8.   ^ Match at the beginning of a line;
9.   $ Match at the end of a line;
10. \< Match at the beginning of a word;
11. \> Match at the end of a word;
12. [ ] A single character from a set (e.g. [A] matches A);
13. [ - ] A range; e.g. [A-Z] matches any of A, B, C through Z;
14. . Any single character;
15. * Zero or more repetitions of the preceding character


[Essentials] grep usage

grep: g (globally) search for a re (regular expression) and p (print) the results.

1. Parameters:
-i : Ignore case
-c : Print the number of matching lines
-l : When searching several files, print the names of the files that contain matches
-v : Print the lines that do not contain a match
-n : Print the matching lines with their line numbers


2. RE (regular expression)

\ Take away the special meaning of a special character
^ Match at the beginning of a line
$ Match at the end of a line
\< Match at the beginning of a word
\> Match at the end of a word
[ ] A single character from a set; e.g. [A] matches A
[ - ] A range; e.g. [A-Z] matches any of A, B, C through Z
. Any single character
* Zero or more repetitions of the preceding character

3. Examples

# ps -ef | grep 
root 19955 181 0 13:43:53 ? 0:00 

# more     (contents of the size file)

b124230
b034325
a081016
m7187998
m7282064
a022021
a061048
m9324822
b103303
a013386
b044525
m8987131
B081016
M45678
B103303
BADc2345

# more | grep '[a-b]'    (range: matches lines containing a or b)

b124230
b034325
a081016
a022021
a061048
b103303
a013386
b044525
# more  | grep '[a-b]*'    (zero or more of a or b, so every line matches)
b124230
b034325
a081016
m7187998
m7282064
a022021
a061048
m9324822
b103303
a013386
b044525
m8987131
B081016
M45678
B103303
BADc2345

# more  | grep '[b]'    (single character: only lines containing b match)
b124230
b034325
b103303
b044525
# more  | grep '[bB]'
b124230
b034325
b103303
b044525
B081016
B103303
BADc2345

# grep 'root' /etc/group
root::0:root
bin::2:root,bin,daemon
sys::3:root,bin,sys,adm
adm::4:root,adm,daemon
uucp::5:root,uucp
mail::6:root
tty::7:root,tty,adm
lp::8:root,lp,adm
nuucp::9:root,nuucp
daemon::12:root,daemon

# grep '^root' /etc/group    (match at the beginning of the line)

root::0:root


# grep 'uucp' /etc/group
uucp::5:root,uucp
nuucp::9:root,nuucp

# grep '\<uucp' /etc/group
uucp::5:root,uucp


# grep 'root$' /etc/group    (match at the end of the line)

root::0:root
mail::6:root


# more  | grep -i 'b1..*3'    (-i: ignore case)


b124230
b103303
B103303

# more | grep -iv 'b1..*3'    (-v: print the lines that do not match)


b034325
a081016
m7187998
m7282064
a022021
a061048
m9324822
a013386
b044525
m8987131
B081016
M45678
BADc2345

# more  | grep -in 'b1..*3'
1:b124230
9:b103303
15:B103303

# grep '$' /etc// | wc -l
128
# grep '\$' /etc// | wc -l    (\ takes away the special meaning of $)


15
# grep '\$' /etc//
case "$1" in
>/tmp/sharetab.$$
[ "x$fstype" != xnfs ] && \
echo "$path\t$res\t$fstype\t$opts\t$desc" \
>>/tmp/sharetab.$$
/usr/bin/touch -r /etc/dfs/sharetab /tmp/sharetab.$$
/usr/bin/mv -f /tmp/sharetab.$$ /etc/dfs/sharetab
if [ -f /etc/dfs/dfstab ] && /usr/bin/egrep -v '^[ ]*(#|$)' \
if [ $startnfsd -eq 0 -a -f /etc/ ] && \
if [ $startnfsd -ne 0 ]; then
elif [ ! -n "$_INIT_RUN_LEVEL" ]; then
while [ $wtime -gt 0 ]; do
wtime=`expr $wtime - 1`
if [ $wtime -eq 0 ]; then
echo "Usage: $0 { start | stop }"


# more 

the test file
their are files
The end

# grep 'the' 
the test file
their are files

# grep '\<the' 
the test file
their are files

# grep 'the\>' 
the test file

# grep '\<the\>' 
the test file

# grep '\<[Tt]he\>' 
the test file
The end
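
The \< and \> anchors are a GNU extension to the regular expression syntax, so their availability in other grep implementations is an assumption. When both ends of the pattern are anchored this way, the -w option usually gives the same result. A small sketch using the three sample lines shown above:

```shell
text=$(printf '%s\n' 'the test file' 'their are files' 'The end')

# Word anchors written inside the pattern...
with_anchors=$(printf '%s\n' "$text" | grep '\<[Tt]he\>')
# ...and the same selection using the -w option instead.
with_w=$(printf '%s\n' "$text" | grep -w '[Tt]he')

printf '%s\n' "$with_anchors"
```

In both cases the line their are files is left out, because the there is followed by another letter.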

 

What escaping means: treating a special symbol as an ordinary character

Notes:

1. The meaning of ^ inside and outside []

2. When is escaping needed?

3. The difference between * in bash and in regular expressions

4. The options -a, -c, -i, -n, -v

June 24, 2014 11:24:50