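1. Regular Expressions
Shell tools use two flavors of regular expression:
Basic Regular Expressions: BRE
Extended Regular Expressions: ERE, which add the operators +, ?, | and unescaped () to BRE
1.1 Basic Regular Expressions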

1.2 Extended Regular Expressions

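1.3 Tools That Support Regular Expressions
grep: uses BRE by default; the -E option enables ERE. Without -E, interval braces must be escaped as \{ \}.
egrep: equivalent to grep -E, so it accepts both basic and extended syntax (recent GNU grep releases mark egrep as obsolescent).
awk: uses ERE, so it accepts the same expressions as egrep.
sed: uses BRE by default; the -r option (or -E) enables ERE. Without it, interval braces must be escaped as \{ \}.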
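As a small illustration of the escaping difference described above, the following sketch runs the same interval expression in BRE and ERE form. The sample input and the variable names input, bre_out, and ere_out are made up for this example.

```shell
# Sample input: strings with one, two, and three leading a's.
input=$(printf '%s\n' ab aab aaab)

# BRE (default): interval braces need backslashes.
bre_out=$(printf '%s\n' "$input" | grep 'a\{2\}b')

# ERE (-E): the same interval is written without backslashes.
ere_out=$(printf '%s\n' "$input" | grep -E 'a{2}b')

printf 'BRE:\n%s\nERE:\n%s\n' "$bre_out" "$ere_out"
```

Both commands should select aab and aaab, because aaab also contains the substring aab.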
2. grep
Prints the lines that match a pattern, reading from one or more files or from standard input.

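Other commonly used options:
grep -f FILE: read the patterns from FILE, one per line
grep -m NUM: stop after NUM matching lines
grep -H: print the file name with each match
grep -h: suppress the file name prefix
grep -q: quiet mode; print nothing and report the result only through the exit status
grep -s: suppress error messages about missing or unreadable files
grep -r: search directories recursively
grep -B NUM: print NUM lines of context before each match
grep -A NUM: print NUM lines of context after each match
grep -C NUM: print NUM lines of context before and after each match
grep --color: highlight the matched text
(1) Print the lines of file b that also appear in file a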
> grep -f a b
(2) Print the lines of file b that do not appear in file a
> grep -v -f a b
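One caveat with examples (1) and (2): grep -f treats every line of a as a regular expression and matches it anywhere within a line of b. When the goal is to compare whole lines literally, the -F (fixed strings) and -x (whole line) options can be added. A minimal sketch, using throwaway files created only for this demonstration:

```shell
# Create two small sample files in a temporary directory.
tmp=$(mktemp -d)
printf '%s\n' 1 2 > "$tmp/a"
printf '%s\n' 1 12 3 > "$tmp/b"

# Patterns are regular expressions matched as substrings: both 1 and 12 match.
loose=$(grep -f "$tmp/a" "$tmp/b")

# -F: literal strings, -x: the whole line must match. Only 1 matches.
exact=$(grep -Fx -f "$tmp/a" "$tmp/b")

printf 'loose:\n%s\nexact:\n%s\n' "$loose" "$exact"
rm -rf "$tmp"
```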
(3) Match multiple patterns
> echo "a bc de" |xargs -n1 |grep -e 'a' -e 'bc'
a
bc
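Note: xargs -n1 passes one argument to each command invocation, so every word is printed on its own line.
(4) Remove blank lines and lines that begin with # (handy for reading configuration files)
> grep -E -v '^$|^#' /etc/httpd/conf/httpd.conf
(5) Match case-insensitively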
> echo "A a b c" |xargs -n1 |grep -i a
A
a
(6) Print only the matched text
> echo "this is a test" |grep -o 'is'
is
is
(7) Print only the first five matching lines
> seq 1 20 |grep -m 5 -E '[0-9]{2}'
10
11
12
13
14
(8) Count the matching lines
> seq 1 20 |grep -c -E '[0-9]{2}'
11
(9) Match lines that begin with b
> echo "a bc de" |xargs -n1 |grep '^b'
bc
(10) Match lines that end with de and print their line numbers
> echo "a ab abc abcd abcde" |xargs -n1 |grep -n 'de$'
5:abcde
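(11) Recursively search /etc for an IP address, but only in files with the .conf suffix
grep --include: search only files whose names match the glob (quote the glob so the shell does not expand it)
> grep -r '192.167.1.1' /etc --include='*.conf'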
(12) Exclude files with the .bak suffix from the search
grep --exclude: skip files whose names match the glob
> grep -r '192.167.1.1' /opt --exclude='*.bak'
(13) Match all IP addresses
> ifconfig |grep -E -o "[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}"
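The pattern in (13) accepts any group of one to three digits, so it would also match a string such as 999.1.1.1. One possible refinement, sketched here with a hypothetical helper function named extract_ipv4 and an invented sample line, passes the candidates through awk and keeps only those whose four octets are at most 255:

```shell
# extract_ipv4 (hypothetical helper): read text on stdin and print
# dotted-quad candidates whose octets are all in the range 0-255.
extract_ipv4() {
    grep -E -o '[0-9]{1,3}(\.[0-9]{1,3}){3}' |
    awk -F. '$1 <= 255 && $2 <= 255 && $3 <= 255 && $4 <= 255'
}

# Invented sample line; 999.1.1.1 is filtered out by the awk stage.
sample='inet 192.168.1.10 netmask 255.255.255.0 bogus 999.1.1.1'
ips=$(printf '%s\n' "$sample" | extract_ipv4)
echo "$ips"
```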
(14) Print each match and the 3 lines after it
> seq 1 10 |grep 5 -A 3
5
6
7
8
(15) Print each match and the 3 lines before it
> seq 1 10 |grep 5 -B 3
2
3
4
5
(16) Print each match and the 3 lines before and after it
> seq 1 10 |grep 5 -C 3
2
3
4
5
6
7
8
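3. sed
A stream editor for filtering and transforming text.
How it works: sed reads the current line into the pattern space, runs the script on it, prints the result, and clears the pattern space. It then reads the next line and repeats until the last line has been processed. A second buffer, the hold space, can store data temporarily; its contents are never printed directly and must first be copied back into the pattern space.
Both spaces are simply memory buffers: one holds the line being processed, the other holds data set aside for later.
Syntax: sed [options] 'address command' file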
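To make the interplay of the two buffers concrete, here is a minimal sketch (assuming GNU sed, as used throughout these notes) that collects every input line in the hold space and prints them as one comma-separated line at the end. The variable name joined is only for illustration.

```shell
# H : append each line to the hold space (preceded by a newline)
# $ : on the last line only, run the commands inside { }
# x : swap the two buffers, so the pattern space holds everything collected
# s : turn the newlines into commas, then drop the leading comma
# p : print the result (-n suppresses the automatic printing)
joined=$(seq 3 | sed -n 'H;${x;s/\n/,/g;s/^,//;p}')
echo "$joined"
```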



The examples below use the following sample file:
[root@openEuler-1 script]# cat sed.txt
nimgtw 48003/udp # Nimbus Gateway
3gpp-cbsp 48049/tcp # 3GPP Cell Broadcast Service Protocol
isnetserv 48128/tcp # Image Systems Network Services
isnetserv 48128/udp # Image Systems Network Services
blp5 48129/tcp # Bloomberg locator
blp5 48129/udp # Bloomberg locator
com-bardac-dw 48556/tcp # com-bardac-dw
com-bardac-dw 48556/udp # com-bardac-dw
iqobject 48619/tcp # iqobject
iqobject 48619/udp # iqobject
3.1 Printing matches (p)
(1) Print lines that begin with blp5
[root@openEuler-1 script]# sed -n '/^blp5/p' sed.txt
blp5 48129/tcp # Bloomberg locator
blp5 48129/udp # Bloomberg locator
(2) Print the first line
[root@openEuler-1 script]# sed -n '1p' sed.txt
nimgtw 48003/udp # Nimbus Gateway
(3) Print lines 1 through 3
[root@openEuler-1 script]# sed -n '1,3p' sed.txt
nimgtw 48003/udp # Nimbus Gateway
3gpp-cbsp 48049/tcp # 3GPP Cell Broadcast Service Protocol
isnetserv 48128/tcp # Image Systems Network Services
(4) Print odd-numbered lines (the first~step address form is a GNU extension)
[root@openEuler-1 script]# seq 10 | sed -n '1~2p'
1
3
5
7
9
(5) Print the matching line and the line after it
[root@openEuler-1 script]# sed -n '/blp5/,+1p' sed.txt
blp5 48129/tcp # Bloomberg locator
blp5 48129/udp # Bloomberg locator
(6) Print the last line
[root@openEuler-1 script]# sed -n '$p' sed.txt
iqobject 48619/udp # iqobject
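(7) Print every line except the last
An exclamation mark after an address negates it, so the command runs on every line the address does not select.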
[root@openEuler-1 script]# sed -n '$!p' sed.txt
nimgtw 48003/udp # Nimbus Gateway
3gpp-cbsp 48049/tcp # 3GPP Cell Broadcast Service Protocol
isnetserv 48128/tcp # Image Systems Network Services
isnetserv 48128/udp # Image Systems Network Services
blp5 48129/tcp # Bloomberg locator
blp5 48129/udp # Bloomberg locator
com-bardac-dw 48556/tcp # com-bardac-dw
com-bardac-dw 48556/udp # com-bardac-dw
iqobject 48619/tcp # iqobject
(8) Print the range from a line beginning with blp5 to the next line beginning with com
Two addresses separated by a comma select a range.
[root@openEuler-1 script]# sed -n '/^blp5/,/^com/p' sed.txt
blp5 48129/tcp # Bloomberg locator
blp5 48129/udp # Bloomberg locator
com-bardac-dw 48556/tcp # com-bardac-dw
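3.2 Deleting matches (d)
Deletion works like printing: p prints the lines that the address selects, and d deletes them. The only difference is that d is used without the -n option.
(1) Delete lines that begin with blp5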
[root@openEuler-1 script]# sed '/^blp5/d' sed.txt
nimgtw 48003/udp # Nimbus Gateway
3gpp-cbsp 48049/tcp # 3GPP Cell Broadcast Service Protocol
isnetserv 48128/tcp # Image Systems Network Services
isnetserv 48128/udp # Image Systems Network Services
com-bardac-dw 48556/tcp # com-bardac-dw
com-bardac-dw 48556/udp # com-bardac-dw
iqobject 48619/tcp # iqobject
iqobject 48619/udp # iqobject
(2) Delete lines 1 through 3
[root@openEuler-1 script]# sed '1,3d' sed.txt
isnetserv 48128/udp # Image Systems Network Services
blp5 48129/tcp # Bloomberg locator
blp5 48129/udp # Bloomberg locator
com-bardac-dw 48556/tcp # com-bardac-dw
com-bardac-dw 48556/udp # com-bardac-dw
iqobject 48619/tcp # iqobject
iqobject 48619/udp # iqobject
(3) Remove blank lines and lines that begin with # from httpd.conf
[root@openEuler-1 script]# sed '/^$/d;/^#/d' /etc/httpd/conf/httpd.conf
3.3 Substitution (s///)
(1) Replace the string blp5 with test
[root@openEuler-1 script]# sed 's/blp5/test/' sed.txt
nimgtw 48003/udp # Nimbus Gateway
3gpp-cbsp 48049/tcp # 3GPP Cell Broadcast Service Protocol
isnetserv 48128/tcp # Image Systems Network Services
isnetserv 48128/udp # Image Systems Network Services
test 48129/tcp # Bloomberg locator
test 48129/udp # Bloomberg locator
com-bardac-dw 48556/tcp # com-bardac-dw
com-bardac-dw 48556/udp # com-bardac-dw
iqobject 48619/tcp # iqobject
iqobject 48619/udp # iqobject
(2) Replace blp5 with test and print only the lines that changed
[root@openEuler-1 script]# sed -n 's/blp5/test/p' sed.txt
test 48129/tcp # Bloomberg locator
test 48129/udp # Bloomberg locator
(3) Use & to reference the matched text in the replacement
[root@openEuler-1 script]# sed 's/48049/&.0/' sed.txt
nimgtw 48003/udp # Nimbus Gateway
3gpp-cbsp 48049.0/tcp # 3GPP Cell Broadcast Service Protocol
isnetserv 48128/tcp # Image Systems Network Services
isnetserv 48128/udp # Image Systems Network Services
blp5 48129/tcp # Bloomberg locator
blp5 48129/udp # Bloomberg locator
com-bardac-dw 48556/tcp # com-bardac-dw
com-bardac-dw 48556/udp # com-bardac-dw
iqobject 48619/tcp # iqobject
iqobject 48619/udp # iqobject
(4) Wrap each IP address in quotes
g: replace every match on the line, not only the first
[root@openEuler-1 script]# echo "192.168.121.11 172.1.1.2 223.5.5.5" | sed -r 's/[^ ]+/"&"/g'
"192.168.121.11" "172.1.1.2" "223.5.5.5"
(5) Replace blp5 only within lines 1 to 5
[root@openEuler-1 script]# sed '1,5s/blp5/test/' sed.txt
nimgtw 48003/udp # Nimbus Gateway
3gpp-cbsp 48049/tcp # 3GPP Cell Broadcast Service Protocol
isnetserv 48128/tcp # Image Systems Network Services
isnetserv 48128/udp # Image Systems Network Services
test 48129/tcp # Bloomberg locator
blp5 48129/udp # Bloomberg locator
com-bardac-dw 48556/tcp # com-bardac-dw
com-bardac-dw 48556/udp # com-bardac-dw
iqobject 48619/tcp # iqobject
iqobject 48619/udp # iqobject
(6) Replace only on lines that match an address
[root@openEuler-1 script]# sed '/48129\/tcp/s/blp5/test/' sed.txt
nimgtw 48003/udp # Nimbus Gateway
3gpp-cbsp 48049/tcp # 3GPP Cell Broadcast Service Protocol
isnetserv 48128/tcp # Image Systems Network Services
isnetserv 48128/udp # Image Systems Network Services
test 48129/tcp # Bloomberg locator
blp5 48129/udp # Bloomberg locator
com-bardac-dw 48556/tcp # com-bardac-dw
com-bardac-dw 48556/udp # com-bardac-dw
iqobject 48619/tcp # iqobject
iqobject 48619/udp # iqobject
(7) Apply two substitutions in one script
[root@openEuler-1 script]# sed 's/blp5/test/;s/3g/4g/' sed.txt
nimgtw 48003/udp # Nimbus Gateway
4gpp-cbsp 48049/tcp # 3GPP Cell Broadcast Service Protocol
isnetserv 48128/tcp # Image Systems Network Services
isnetserv 48128/udp # Image Systems Network Services
test 48129/tcp # Bloomberg locator
test 48129/udp # Bloomberg locator
com-bardac-dw 48556/tcp # com-bardac-dw
com-bardac-dw 48556/udp # com-bardac-dw
iqobject 48619/tcp # iqobject
iqobject 48619/udp # iqobject
(8) Comment out the matching line and the 3 lines after it
[root@openEuler-1 script]# seq 10 | sed '/5/,+3s/^/#/'
1
2
3
4
#5
#6
#7
#8
9
10
3.4 Multiple edits (-e)
[root@openEuler-1 script]# sed -e '1,2d' -e 's/blp5/test/' sed.txt
isnetserv 48128/tcp # Image Systems Network Services
isnetserv 48128/udp # Image Systems Network Services
test 48129/tcp # Bloomberg locator
test 48129/udp # Bloomberg locator
com-bardac-dw 48556/tcp # com-bardac-dw
com-bardac-dw 48556/udp # com-bardac-dw
iqobject 48619/tcp # iqobject
iqobject 48619/udp # iqobject
# Commands can also be separated with ;
[root@openEuler-1 script]# sed '1,2d;s/blp5/test/' sed.txt
isnetserv 48128/tcp # Image Systems Network Services
isnetserv 48128/udp # Image Systems Network Services
test 48129/tcp # Bloomberg locator
test 48129/udp # Bloomberg locator
com-bardac-dw 48556/tcp # com-bardac-dw
com-bardac-dw 48556/udp # com-bardac-dw
iqobject 48619/tcp # iqobject
iqobject 48619/udp # iqobject
3.5 Adding lines (a, i, c)
(1) Insert test on the line before each line containing blp5
[root@openEuler-1 script]# sed '/blp5/i \test' sed.txt
nimgtw 48003/udp # Nimbus Gateway
3gpp-cbsp 48049/tcp # 3GPP Cell Broadcast Service Protocol
isnetserv 48128/tcp # Image Systems Network Services
isnetserv 48128/udp # Image Systems Network Services
test
blp5 48129/tcp # Bloomberg locator
test
blp5 48129/udp # Bloomberg locator
com-bardac-dw 48556/tcp # com-bardac-dw
com-bardac-dw 48556/udp # com-bardac-dw
iqobject 48619/tcp # iqobject
iqobject 48619/udp # iqobject
(2) Append test on the line after each line containing blp5
[root@openEuler-1 script]# sed '/blp5/a \test' sed.txt
nimgtw 48003/udp # Nimbus Gateway
3gpp-cbsp 48049/tcp # 3GPP Cell Broadcast Service Protocol
isnetserv 48128/tcp # Image Systems Network Services
isnetserv 48128/udp # Image Systems Network Services
blp5 48129/tcp # Bloomberg locator
test
blp5 48129/udp # Bloomberg locator
test
com-bardac-dw 48556/tcp # com-bardac-dw
com-bardac-dw 48556/udp # com-bardac-dw
iqobject 48619/tcp # iqobject
iqobject 48619/udp # iqobject
(3) Replace each line containing blp5 with a new line
[root@openEuler-1 script]# sed '/blp5/c \test' sed.txt
nimgtw 48003/udp # Nimbus Gateway
3gpp-cbsp 48049/tcp # 3GPP Cell Broadcast Service Protocol
isnetserv 48128/tcp # Image Systems Network Services
isnetserv 48128/udp # Image Systems Network Services
test
test
com-bardac-dw 48556/tcp # com-bardac-dw
com-bardac-dw 48556/udp # com-bardac-dw
iqobject 48619/tcp # iqobject
iqobject 48619/udp # iqobject
3.6 Reading a file and appending it after matching lines (r)
[root@openEuler-1 script]# cat a.txt
123
456
[root@openEuler-1 script]# sed '/blp5/r a.txt' sed.txt
nimgtw 48003/udp # Nimbus Gateway
3gpp-cbsp 48049/tcp # 3GPP Cell Broadcast Service Protocol
isnetserv 48128/tcp # Image Systems Network Services
isnetserv 48128/udp # Image Systems Network Services
blp5 48129/tcp # Bloomberg locator
123
456
blp5 48129/udp # Bloomberg locator
123
456
com-bardac-dw 48556/tcp # com-bardac-dw
com-bardac-dw 48556/udp # com-bardac-dw
iqobject 48619/tcp # iqobject
iqobject 48619/udp # iqobject
3.7 Writing matching lines to a file (w)
[root@openEuler-1 script]# sed '/blp5/w b.txt' sed.txt
nimgtw 48003/udp # Nimbus Gateway
3gpp-cbsp 48049/tcp # 3GPP Cell Broadcast Service Protocol
isnetserv 48128/tcp # Image Systems Network Services
isnetserv 48128/udp # Image Systems Network Services
blp5 48129/tcp # Bloomberg locator
blp5 48129/udp # Bloomberg locator
com-bardac-dw 48556/tcp # com-bardac-dw
com-bardac-dw 48556/udp # com-bardac-dw
iqobject 48619/tcp # iqobject
iqobject 48619/udp # iqobject
[root@openEuler-1 script]# cat b.txt
blp5 48129/tcp # Bloomberg locator
blp5 48129/udp # Bloomberg locator
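3.8 Reading the next line (n and N)
n: print the pattern space (unless -n is given), then replace it with the next line of input.
N: append the next line of input to the pattern space, separated by a newline \n.
(1) Print the line after the match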
[root@openEuler-1 script]# seq 5 |sed -n '/3/{n;p}'
4
(2) Print even numbers
[root@openEuler-1 script]# seq 10 | sed -n 'n;p'
2
4
6
8
10
# sed reads line 1; n fetches the next line, so the pattern space now holds 2,
# and p prints it. The next cycle reads 3,
# n fetches 4, p prints 4, and so on.
(3) Print odd numbers
[root@openEuler-1 script]# seq 10 |sed 'n;d'
1
3
5
7
9
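# sed reads line 1 into the pattern space. n prints the pattern space (1)
# and fetches line 2, which d deletes. The next cycle reads 3;
# n prints 3 and fetches line 4,
# which d deletes, and so on.
3.9 Printing and deleting the first line of the pattern space (P and D)
P: print the pattern space up to the first newline.
D: delete the pattern space up to the first newline; if any text remains, restart the cycle without reading a new line.
(1) Print odd numbers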
[root@openEuler-1 script]# seq 6 | sed -n 'N;P'
1
3
5
(2) Keep only the last line
[root@openEuler-1 script]# seq 6 | sed 'N;D'
6
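# sed reads line 1; N appends the next line, so the pattern space is 1\n2.
# D deletes the first line (1), leaving 2, and restarts the cycle without reading new input.
# N then appends line 3, giving 2\n3; D deletes 2, leaving 3, and so on.
# When only 6 is left there is no next line, so N fails and GNU sed prints the pattern space (6) and exits.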
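A common practical use of N is to join lines in pairs. The sketch below (the variable name pairs is illustrative) appends the next line to the pattern space and then replaces the embedded newline with a space:

```shell
# N pulls the following line into the pattern space ("1\n2"),
# and the substitution turns the embedded newline into a space.
pairs=$(seq 6 | sed 'N;s/\n/ /')
echo "$pairs"
```

With an odd number of input lines, GNU sed prints the final unpaired line unchanged; other sed implementations may behave differently when N finds no next line.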
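3.10 Hold space operations (h and H, g and G, x)
h: copy the pattern space to the hold space (overwriting it).
H: append the pattern space to the hold space.
g: copy the hold space to the pattern space (overwriting it).
G: append the hold space to the pattern space.
x: exchange the contents of the pattern space and the hold space.
(1) Overwrite one matching line with another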
[root@openEuler-1 script]# seq 6 | sed -e '/3/{h;d}' -e '/5/g'
1
2
4
3
6
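# h copies the matching line 3 to the hold space, and d deletes it from the pattern space.
# The second command matches 5, and g overwrites it with the 3 stored in the hold space.
(2) Move the matching line to the end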
[root@openEuler-1 script]# seq 6 | sed -e '/3/{h;d}' -e '$G'
1
2
4
5
6
3
# Here $ addresses the last line, and G appends the hold space after it.
(3) Exchange the pattern space and the hold space
[root@openEuler-1 script]# seq 6 | sed -e '/3/{h;d}' -e '/5/x' -e '$G'
1
2
4
3
6
5
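# When 5 is matched, x swaps it with the 3 in the hold space, so 3 is printed in its place.
# On the last line, G appends the 5 now stored in the hold space.
(4) Print the lines in reverse order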
[root@openEuler-1 script]# seq 5 |sed '1!G;h;$!d'
5
4
3
2
1
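# 1!G: on every line except the first, append the hold space to the pattern space (on line 1 the hold space is still empty)
# h: copy the pattern space to the hold space
# $!d: on every line except the last, delete the pattern space so that nothing is printed
(5) Add a blank line after every line
[root@openEuler-1 script]# seq 5 |sed G
1

2

3

4

5

3.11 Case-insensitive matching (I)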
[root@openEuler-1 script]# echo -e "a\nA\nb\nc" |sed 's/a/1/Ig'
1
1
b
c
3.12 Counting lines
[root@openEuler-1 script]# seq 10 |sed -n '$='
10
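4. awk
awk is a programming language for text processing. A short program can process standard input or files, sort data, perform calculations, and generate reports.
Basic syntax: awk [options] 'pattern {action}' file
pattern describes what awk looks for in the data, and action is the series of commands to run when a match is found. Curly braces group the commands that belong to a pattern.
awk handles data much like a database does: it works on records and fields, which grep and sed do not offer. By default each line of a text file is one record, read and processed one at a time, and each whitespace-separated part of the line is a field. Fields are numbered 1, 2, 3, ... and are referenced with $ followed by the number, such as $1, $2, $3; $0 refers to the entire record.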
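The record and field model can be seen with a one-line input. This is a minimal sketch; the sample text and the variable name fields are invented for the demonstration.

```shell
# One record with three whitespace-separated fields.
# $1 and $3 are individual fields, NF is the field count, $0 is the whole record.
fields=$(echo "alice 30 paris" | awk '{print $1, $3; print NF; print $0}')
echo "$fields"
```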

4.1 Options and patterns


(1) Read the awk program from a file
[root@openEuler-1 script]# cat test.awk
{print $2}
[root@openEuler-1 script]# tail -n3 /etc/services | awk -f test.awk
45514/udp
45514/tcp
46998/tcp
(2) Specify a field separator and print a field (the default separator is whitespace)
[root@openEuler-1 script]# tail -n3 /etc/passwd | awk -F':' '{print $1}'
apache
mysql
nginx
(3) Treat several characters as field separators
[root@openEuler-1 script]# tail -n3 /etc/services
cloudcheck-ping 45514/udp # ASSIA CloudCheck WiFi Management keepalive
cloudcheck 45514/tcp # ASSIA CloudCheck WiFi Management System
spremotetablet 46998/tcp # Capture handwritten signatures
[root@openEuler-1 script]# tail -n3 /etc/services | awk -F'[/#]' '{print $1}'
cloudcheck-ping 45514
cloudcheck 45514
spremotetablet 46998
[root@openEuler-1 script]# tail -n3 /etc/services | awk -F'[/#]' '{print $2}'
udp
tcp
tcp
[root@openEuler-1 script]# tail -n3 /etc/services | awk -F'[/#]' '{print $3}'
ASSIA CloudCheck WiFi Management keepalive
ASSIA CloudCheck WiFi Management System
Capture handwritten signatures
(4) Assign a variable
[root@openEuler-1 script]# awk -v a=123 'BEGIN{print a}'
123
(5) Dump awk's global variables to a file (a gawk option; the default file name is awkvars.out)
[root@openEuler-1 script]# seq 5 | awk --dump-variables '{print $0}'
1
2
3
4
5
[root@openEuler-1 script]# cat awkvars.out
ARGC: 1
ARGIND: 0
ARGV: array, 1 elements
BINMODE: 0
CONVFMT: "%.6g"
ENVIRON: array, 25 elements
ERRNO: ""
FIELDWIDTHS: ""
FILENAME: "-"
FNR: 5
FPAT: "[^[:space:]]+"
FS: " "
FUNCTAB: array, 41 elements
IGNORECASE: 0
LINT: 0
NF: 1
NR: 5
OFMT: "%.6g"
OFS: " "
ORS: "\n"
PREC: 53
PROCINFO: array, 21 elements
RLENGTH: 0
ROUNDMODE: "N"
RS: "\n"
RSTART: 0
RT: "\n"
SUBSEP: "\034"
SYMTAB: array, 28 elements
TEXTDOMAIN: "messages"
(6) BEGIN and END
A BEGIN rule runs before any input is read and is commonly used to set built-in variables, initialize variables, and print a header or title. An END rule runs after all input has been processed.
Example:
[root@openEuler-1 script]# cat frequent_ip.sh
#!/bin/bash
#########################
#File name:frequent_ip.sh
#Email:obboda@163.com
#Created time:2025-01-14 11:46:11
#Description: read the nginx access log and list the 10 IPs with the most requests
#########################
log_path="/var/log/nginx/access.log"
# Count the requests per IP, sort by count in descending order, and keep the top 10
sort_data=$(awk '{print $1}' "$log_path" | sort | uniq -c | sort -rn | head -n 10)
# Format the counted data for display
echo "$sort_data" | awk '
BEGIN{printf "%-10s %-20s %s\n","Rank","IP","Requests"}
{printf "%-10s %-20s %s\n",NR,$2,$1}
END{print "end............"}
'
[root@openEuler-1 script]# bash frequent_ip.sh
Rank       IP                   Requests
1          192.168.121.1        8
2          192.168.121.51       4
3          192.168.121.12       3
4          192.168.121.13       2
5          192.168.121.131      1
end............
4.2 Built-in variables


(1) FS and OFS
Assigning FS in a BEGIN rule changes the field separator to a colon before any input is read, just like -F.
[root@openEuler-1 script]# head -n5 /etc/passwd | awk 'BEGIN{FS=":"}{print $1,$2}'
root x
bin x
daemon x
adm x
lp x
# The variable can also be assigned with -v
[root@openEuler-1 script]# head -n5 /etc/passwd | awk -vFS=':' '{print $1,$2}'
root x
bin x
daemon x
adm x
lp x
# The output field separator can be set as well
[root@openEuler-1 script]# head -n5 /etc/passwd | awk 'BEGIN{FS=":";OFS="#"}{print $1,$2}'
root#x
bin#x
daemon#x
adm#x
lp#x
(2) RS and ORS
# Use a given character as the record separator:
[root@openEuler-1 script]# echo "www.baidu.com/user/test.html" |awk 'BEGIN{RS="/"}{print $0}'
www.baidu.com
user
test.html
# Replace the output newline with a + sign:
[root@openEuler-1 script]# seq 10 | awk 'BEGIN{ORS="+"}{print $0}'
1+2+3+4+5+6+7+8+9+10+[root@openEuler-1 script]#
(3) NF
# NF is the number of fields in the current record
[root@openEuler-1 script]# echo "a b c d e f" | awk '{print NF}'
6
# Print the last field
[root@openEuler-1 script]# echo "a b c d e f" | awk '{print $NF}'
f
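(4) NR and FNR
NR is the record number and increases by 1 for every record read. FNR works the same way, except that it restarts from 1 at the beginning of each input file.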
[root@openEuler-1 script]# tail -n5 /etc/services | awk '{print NR,$0}'
1 axio-disc 35100/udp # Axiomatic discovery protocol
2 pmwebapi 44323/tcp # Performance Co-Pilot client HTTP API
3 cloudcheck-ping 45514/udp # ASSIA CloudCheck WiFi Management keepalive
4 cloudcheck 45514/tcp # ASSIA CloudCheck WiFi Management System
5 spremotetablet 46998/tcp # Capture handwritten signatures
The difference between NR and FNR:
[root@openEuler-1 script]# awk '{print NR,FNR,$0}' a b
1 1 a
2 2 b
3 3 c
4 1 d
5 2 e
6 3 f
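Because NR equals FNR only while the first file is being read, the comparison NR==FNR is often used to load one file into an array and then test a second file against it. A minimal sketch, using two throwaway files whose names and contents are invented for the demonstration:

```shell
# Create two small sample files in a temporary directory.
tmp=$(mktemp -d)
printf '%s\n' apple banana > "$tmp/first"
printf '%s\n' banana cherry apple > "$tmp/second"

# First file (NR==FNR): remember each line, then skip to the next record.
# Second file: print the line only if it was seen in the first file.
common=$(awk 'NR==FNR {seen[$0]=1; next} $0 in seen' "$tmp/first" "$tmp/second")
echo "$common"
rm -rf "$tmp"
```

Note that the idiom assumes the first file is not empty; with an empty first file, NR==FNR would also hold while the second file is read.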
(5) ARGC and ARGV
ARGC is the number of command-line arguments.
ARGV is the array of command-line arguments, indexed from 0 to ARGC-1.
[root@openEuler-1 script]# awk 'BEGIN{print ARGC}' 1 2 3
4
[root@openEuler-1 script]# awk 'BEGIN{print ARGV[0],ARGV[1]}' 1 2 3
awk 1
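(6) ARGIND
ARGIND (a gawk extension) is the index in ARGV of the file currently being processed: 1 for the first file, 2 for the second, and so on. It can be used to tell which file is being read.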
[root@openEuler-1 script]# awk '{print ARGIND,$0}' a b
1 a
1 b
1 c
2 d
2 e
2 f
(7) ENVIRON
ENVIRON is an array that gives access to the environment variables.
[root@openEuler-1 script]# awk 'BEGIN{print ENVIRON["HOME"]}'
/root
(8) FILENAME
FILENAME is the name of the file currently being processed.
[root@openEuler-1 script]# awk '{print FILENAME,$0}' a b
a a
a b
a c
b d
b e
b f
(9) IGNORECASE
IGNORECASE=1 makes regular expression matching case-insensitive (a gawk extension).
[root@openEuler-1 script]# echo "A a b c" |xargs -n1 |awk 'BEGIN{IGNORECASE=1}/a/'
A
a
4.3 Operators

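Note: awk treats three values as false: the number 0, the empty string, and an uninitialized value. An uninitialized variable evaluates to 0 in a numeric context and to the empty string in a string context.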
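The truth rules in the note above can be checked directly. In this minimal sketch the variable u is deliberately never assigned, and the shell variable name truth is only for illustration. The fourth test adds one related point: a non-empty string constant such as "0" counts as true.

```shell
truth=$(awk 'BEGIN {
    if (!0)  print "number 0 is false"
    if (!"") print "empty string is false"
    if (!u)  print "uninitialized variable is false"
    if ("0") print "string constant 0 is true"
    print u + 1      # u acts as 0 in arithmetic
    print "[" u "]"  # u acts as the empty string in concatenation
}')
echo "$truth"
```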
(1) Extract the leading integer
[root@openEuler-1 script]# echo "123abc abc123 123abc123" |xargs -n1 | awk '{print +$0}'
123
0
123
[root@openEuler-1 script]# echo "123abc abc123 123abc123" |xargs -n1 | awk '{print -$0}'
-123
0
-123
(2) Print odd or even lines
# Print odd-numbered lines
[root@openEuler-1 script]# seq 6 | awk 'i=!i'
1
3
5
# Print even-numbered lines
[root@openEuler-1 script]# seq 6 | awk '!(i=!i)'
2
4
6
(3) Using a pipe
[root@openEuler-1 script]# seq 5 | shuf | awk '{print $0|"sort"}'
1
2
3
4
5
# shuf shuffles the original order
(4) Ternary operator
[root@openEuler-1 script]# awk 'BEGIN{print 1==1?"yes":"no"}'
yes
4.4 Flow control
(1) if statement
Syntax: if (condition) statement [ else statement ]
# Single branch
[root@openEuler-1 script]# seq 5 | awk '{if($0==3)print $0}'
3
# Two branches
[root@openEuler-1 script]# seq 5 |awk '{if($0==3)print $0;else print "no"}'
no
no
3
no
no
(2) while statement
Syntax: while (condition) statement
[root@openEuler-1 script]# echo "1 2 3 4 5" | awk '{i=1;while(i<=NF){print $i;i++}}'
1
2
3
4
5
(3) C-style for statement
Syntax: for (expr1; expr2; expr3) statement
[root@openEuler-1 script]# cat file
1 2 3
4 5 6
7 8 9
[root@openEuler-1 script]# awk '{for(i=1;i<=NF;i++)print $i}' file
1
2
3
4
5
6
7
8
9
(4) for statement over an array
Syntax: for (var in array) statement
[root@openEuler-1 script]# seq -f "str%.g" 5 |awk '{a[NR]=$0}END{for(v in a)print v,a[v]}'
1 str1
2 str2
3 str3
4 str4
5 str5
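The order in which for (var in array) visits the indices is not specified by awk, so the sorted order shown above should not be relied on. When the indices are consecutive integers, as they are here, a counting loop gives a predictable order. A minimal sketch (the variable name ordered is illustrative):

```shell
# Store each line under its record number, then walk the indices 1..NR in order.
ordered=$(seq -f "str%g" 5 | awk '{a[NR]=$0} END{for(i=1;i<=NR;i++) print i, a[i]}')
echo "$ordered"
```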
(5) break and continue
break exits the enclosing loop; continue skips the rest of the current iteration.
[root@openEuler-1 script]# awk 'BEGIN{for(i=1;i<=5;i++){if(i==3){break};print i}}'
1
2
[root@openEuler-1 script]# awk 'BEGIN{for(i=1;i<=5;i++){if(i==3){continue};print i}}'
1
2
4
5
(6) exit statement
Syntax: exit [ expression ]
exit ends the program, like exit in the shell. The optional expression is the exit status, a number from 0 to 255.
[root@openEuler-1 script]# seq 5 |awk '{if($0~/3/)exit (123)}'
[root@openEuler-1 script]# echo $?
123
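One detail worth knowing: when exit is called in a main rule, awk still runs the END rule before terminating, and the status passed to exit is kept unless END calls exit with another value. A small sketch (the variable names out and status are illustrative):

```shell
# exit 7 stops reading input at record 3, but the END rule still runs.
status=0
out=$(seq 5 | awk '$0==3 {exit 7} END {print "stopped at record", NR}') || status=$?
echo "$out (exit status $status)"
```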
4.5 Built-in functions


[root@openEuler-1 script]# cat c
123abc
abc123
123abc123
[root@openEuler-1 script]# awk '{print int($0)}' c
123
0
123
4.6 I/O statements


# Print the line after the matching line
[root@openEuler-1 script]# seq 5 |awk '/3/{getline;print}'
4
4.7 printf statement
Formats output; unlike print, it does not append a newline.
Syntax: printf format, arguments

# Left-align each column in a width of 10
[root@openEuler-1 script]# awk 'BEGIN{printf "%-10s %-10s %-10s\n","ID","Name","Passwd"}'
ID         Name       Passwd
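Beyond left-aligned strings, the same format syntax covers numeric precision, zero padding, and right alignment. A minimal sketch with made-up values (the variable name formatted is illustrative):

```shell
# %5.2f : width 5, two decimal places, right-aligned
# %-5d  : integer, left-aligned in width 5
# %05d  : integer, zero-padded to width 5
# %5s   : string, right-aligned in width 5
formatted=$(awk 'BEGIN{printf "%5.2f|%-5d|%05d|%5s\n", 3.14159, 42, 42, "ab"}')
echo "$formatted"
```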