In one of our projects we use Logstash to parse logs. Since several patterns need to be matched, I searched the web and found plenty of examples, but they all use the old syntax and fail with errors. Only after consulting the official documentation did I discover that recent versions only support the new syntax.
Incorrect (old) syntax:
if "batch-trans" in [tags] {grok {match => ["message","\[(?<logDate>[\d{4}(\-|\/|.)\d{1,2}\1\d{1,2}\s+d{1,2}:d{1,2}:d{1,2}]*)\]\s+\[(?<mainJobId>(?:[+-]?(?:[0-9]+)))\-(?<subJobId>(?:[+-]?(?:[0-9]+)))\-(?<shardingId>(?:[+-]?(?:[0-9]+)))\]\s+\[(?<traceId>[^\]]*)\]\s+\[(?<jobName>[^\]]*)\]\s+\[(?<threadId>[^\]]*)\]\s+\[(?<zoneId>[^\]]*)\]\s+\[(?<traceType>[^\]]*)\]\s+\[(?<cost>[^\]]*)\]\s+\[(?<splitZoneId>[^\]]*)\]\s+\[(?<url>[^\]]*)\]\s+\[(?<subJobId>[^\]]*)\](?<msg>.*)","message","\[(?<logDate>[\d{4}(\-|\/|.)\d{1,2}\1\d{1,2}\s+d{1,2}:d{1,2}:d{1,2}]*)\]\s+\[(?<mainJobId>(?:[+-]?(?:[0-9]+)))\-(?<subJobId>(?:[+-]?(?:[0-9]+)))\]\s+\[(?<traceId>[^\]]*)\]\s+\[(?<jobName>[^\]]*)\]\s+\[(?<threadId>[^\]]*)\]\s+\[(?<zoneId>[^\]]*)\]\s+\[(?<traceType>[^\]]*)\]\s+\[(?<cost>[^\]]*)\]\s+\[(?<splitZoneId>[^\]]*)\]\s+\[(?<url>[^\]]*)\]\s+\[(?<subJobId>[^\]]*)\](?<msg>.*)",]}}
Correct (new) syntax:
filter {if "accounting-log" in [tags] {grok {match => {"message" => ["^\[(?<log-time>[\s\S]*)\]\s+%{LOGLEVEL:log-level}\s\[%{DATA:trace-id}\]\s\[%{DATA:thread-name}\s*\]\s\[%{DATA:logger}\s*: %{NUMBER:line-no}\] \[%{DATA:zone-id}\]\sJob-Sharding-Params: jobId=%{NUMBER:job-id}, transCode=*%{NUMBER:trans-code}, shardingId=*%{NUMBER:sharing-id}, shardingTable=*%{DATA:sharding-table}, JobParameters=\{%{GREEDYDATA:job-parameters}\}","^\[(?<log-time>[\s\S]*)\]\s+%{LOGLEVEL:log-level}\s\[%{DATA:trace-id}\]\s\[%{DATA:thread-name}\s*\]\s\[%{DATA:logger}\s*:\s*%{NUMBER:line-no}\]\s\[%{DATA:zone-id}\]\s%{GREEDYDATA:msg}"]}}}}
}
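Distilled to its minimal form, the change is this (PATTERN_A and PATTERN_B are placeholders, not real patterns): the old syntax passed match a flat array of alternating field/pattern entries, while the new syntax maps the field name to an array of patterns.

# Old style, which current versions reject:
# match => ["message", "PATTERN_A", "message", "PATTERN_B"]

# New style; the patterns are tried in order:
match => { "message" => ["PATTERN_A", "PATTERN_B"] }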
Note that the order of the patterns is important: grok tries them in the listed order and stops at the first match. In the example above, if the order were reversed, the later rule would never fire, because every line would already have been grabbed by the catch-all rule in front of it (see the sketch below).
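A minimal sketch of that ordering rule, with the two pattern bodies simplified to stand-ins for the real ones above:

grok {
  match => {
    "message" => [
      # Specific rule first: fires only on Job-Sharding-Params lines.
      "Job-Sharding-Params: jobId=%{NUMBER:job-id}",
      # Catch-all second: handles everything the rule above rejected.
      # If this entry were listed first, it would match every event
      # and the specific rule would never run.
      "%{GREEDYDATA:msg}"
    ]
  }
}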
To make this easy to copy and adapt, here is the logback configuration that the grok patterns above were written for:
logback:

<property name="NORMAL_FILE_LOG_PATTERN" value="[%d{yyyy-MM-dd HH:mm:ss.SSS}] %5p [%0.16X{traceId}] [%-12.12t] [%-40.40logger{39}:%3L] [%0.2X{zoneId}] %m%n${LOG_EXCEPTION_CONVERSION_WORD:-%wEx}" />

grok:

"^\[(?<log-time>[\s\S]*)\]\s+%{LOGLEVEL:log-level}\s\[%{DATA:trace-id}\]\s\[%{DATA:thread-name}\s*\]\s\[%{DATA:logger}\s*:\s*%{NUMBER:line-no}\]\s\[%{DATA:zone-id}\]\s%{GREEDYDATA:msg}"
In addition, for one log event to span multiple lines (e.g., exception stack traces), the input should be configured as follows:
input {
  file {
    path => "/logs/accounting-service.log"
    type => "system"
    tags => ["accounting-log"]
    codec => multiline {
      # A new event starts with a line of the form "[......] ", i.e. brackets followed by a space
      pattern => "^(\[.+\] )"
      negate => true
      what => "previous"
      # Very important: if no new content arrives within 2 seconds, the current event is
      # considered complete. Without this, the last log entry would only be collected
      # once the next entry arrived.
      auto_flush_interval => 2
    }
    start_position => "beginning"
  }
}
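To illustrate with made-up log content: in the snippet below, the stack-trace lines do not match ^(\[.+\] ), and because of negate => true and what => "previous" they are appended to the preceding event, so the whole exception arrives as a single document:

[2023-05-10 12:30:45.123] ERROR [abc123] [worker-1    ] [c.e.a.Job   : 42] [01] job failed
java.lang.RuntimeException: boom
    at com.example.Job.run(Job.java:42)
    at java.lang.Thread.run(Thread.java:748)

Without auto_flush_interval, this last event would sit in the codec's buffer until the next bracketed line arrived.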
Reference: the official documentation (search for "multiple patterns"):
https://www.elastic.co/guide/en/logstash/current/plugins-filters-grok.html