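Requirements
Categraf is the data collection agent for the Nightingale (夜鶯) monitoring platform. To keep Linux hosts secure, we need to monitor the password validity period of system users and raise an alert shortly before a password expires, reminding operations staff to change it. This chapter describes in detail how to implement this with Categraf's exec plugin, and how to make sure the alerts reach the responsible operations staff through channels such as WeCom (企業微信) and Feishu (飛書).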
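exec plugin configuration file exec.toml
This configuration file makes the exec plugin periodically run the script /opt/categraf/scripts/check_password_expiry.sh and declares that the script's output is in influx format.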
# # collect interval
# interval = 15

[[instances]]
# # commands, support glob
commands = [
    "/opt/categraf/scripts/check_password_expiry.sh"
]

# # timeout for each command to complete
# timeout = 5

# # interval = global.interval * interval_times
# interval_times = 1

# # choices: influx prometheus falcon
# # influx stdout example: mesurement,labelkey1=labelval1,labelkey2=labelval2 field1=1.2,field2=2.3
data_format = "influx"
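The influx format and its rules:
mesurement,labelkey1=labelval1,labelkey2=labelval2 field1=1.2,field2=2.3
mesurement defines the metric name (or prefix), for example connections.
mesurement is followed by a comma and then the labels; if there are no labels, omit the comma after mesurement.
Labels use the k=v form, with multiple labels separated by commas, for example region=beijing,env=test.
A space follows the labels.
After the space come the fields, with multiple fields separated by commas.
Each field uses the form name=value; in categraf the value must be a number.
Finally, mesurement and each field name are joined to form the metric name. For example, the measurement password_expiry with the field days_until_expiry becomes the metric password_expiry_days_until_expiry.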
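To make the naming rule concrete, here is a minimal sketch that splits one influx-format line and prints the metric name each field would map to. The function name influx_metric_names is hypothetical, and the parsing is deliberately simplified: it assumes no escaped spaces or commas and no trailing timestamp, and it is not how categraf itself parses the line.

```shell
#!/bin/bash

# Print "metric_name value" for every field in one influx-format line.
# Simplified sketch: assumes no escaped spaces/commas and no trailing timestamp.
influx_metric_names() {
    local line="$1"
    local head="${line%% *}"          # measurement plus optional labels
    local fields="${line#* }"         # everything after the first space
    local measurement="${head%%,*}"   # drop the labels, if any
    local pair
    local IFS=','                     # split the field list on commas
    for pair in $fields; do
        # field name is the part before '=', value is the part after it
        echo "${measurement}_${pair%%=*} ${pair#*=}"
    done
}

influx_metric_names "connections,region=beijing,env=test active=12,idle=3"
```

With the sample line, the two fields active and idle are printed as connections_active and connections_idle, while the labels region and env stay attached to both series as labels rather than becoming part of the name.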
Monitoring shell script check_password_expiry.sh
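#!/bin/bash

# Accounts whose password expiry should be checked
users=("app" "root" "weihu" "mysql" "nginx")

# Process each account in turn
for user in "${users[@]}"
do
    # Force English output from chage -l for this one command only,
    # so the caller's locale settings are left untouched
    EXPIRY_DATE_RAW=$(LC_ALL=C chage -l "$user" | grep "Password expires")
    # Take the text after the colon and strip leading/trailing whitespace
    EXPIRY_DATE=$(echo "$EXPIRY_DATE_RAW" | awk -F: '{print $2}' | awk '{$1=$1};1')

    # Check whether the password never expires
    if [[ "$EXPIRY_DATE" =~ ^(never|從不)$ ]]; then
        EXPIRY_DATE_TS=99999            # large sentinel: timestamp that is never reached
        EXPIRY_DATE_FORMATTED="99999"   # large sentinel: date that is never reached
        DAYS_LEFT=99999                 # large sentinel: never expires
    else
        # Convert the expiry date to a Unix timestamp
        EXPIRY_DATE_TS=$(date --date="$EXPIRY_DATE" +%s 2>/dev/null)
        # Skip the account if the date could not be parsed (for example, unknown user)
        if [[ -z "$EXPIRY_DATE_TS" ]]; then
            continue
        fi
        # Current time as a Unix timestamp
        TODAY_TS=$(date +%s)
        # Whole days remaining until expiry
        DAYS_LEFT=$(( (EXPIRY_DATE_TS - TODAY_TS) / 86400 ))
        # Expiry date in yyyymmdd form
        EXPIRY_DATE_FORMATTED=$(date --date="$EXPIRY_DATE" "+%Y%m%d" 2>/dev/null)
    fi

    # Emit one line in influx format
    echo "password_expiry,account=$user,password_expires_time=$EXPIRY_DATE_FORMATTED days_until_expiry=$DAYS_LEFT"
done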
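The remaining-days arithmetic is integer division, so any partial day is dropped. The following sketch isolates that calculation with a fixed reference time so the effect is easy to see. The function name days_left and the sample dates are hypothetical; it assumes GNU date (as the script above does) and uses -u so the result does not depend on the local time zone.

```shell
#!/bin/bash

# Whole days between a reference time ($2) and an expiry date ($1).
# Bash integer division truncates toward zero, so partial days are dropped.
# Assumes GNU date; -u pins both values to UTC.
days_left() {
    local expiry_ts now_ts
    expiry_ts=$(date -u --date="$1" +%s) || return 1
    now_ts=$(date -u --date="$2" +%s) || return 1
    echo $(( (expiry_ts - now_ts) / 86400 ))
}

days_left "Jan 31, 2025" "2025-01-24 00:00:00"   # exactly 7 days before expiry
days_left "Jan 31, 2025" "2025-01-24 09:30:00"   # 6 days and 14.5 hours before expiry
days_left "Jan 31, 2025" "2025-01-31 12:00:00"   # 12 hours after expiry
```

The three calls print 7, 6, and 0. The last case shows that a password that expired less than a day ago still reports 0 rather than a negative number, which is worth keeping in mind when choosing alert thresholds.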
Note:
The script's output must match the data_format = "influx" setting defined in exec.toml above. Only then can categraf parse the stdout it captures and forward the data to the server. Running the plugin in categraf's test mode produces the following output:
[root@localhost categraf]# ./categraf --test --inputs exec
......
18:44:10 password_expiry_days_until_expiry account=app agent_hostname=localhost password_expires_time=20241026 6
18:44:10 password_expiry_days_until_expiry account=root agent_hostname=localhost password_expires_time=99999 99999
18:44:10 password_expiry_days_until_expiry account=weihu agent_hostname=localhost password_expires_time=99999 99999
18:44:10 password_expiry_days_until_expiry account=mysql agent_hostname=localhost password_expires_time=99999 99999
18:44:10 password_expiry_days_until_expiry account=nginx agent_hostname=localhost password_expires_time=99999 99999
......
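Before wiring a script into categraf, it can help to check the shape of each output line locally. The sketch below is one possible check, not categraf's own parser: the function name is_influx_line is hypothetical, it only accepts numeric field values (as required above), and it treats an empty label value as malformed, which is an assumption about what the parser accepts.

```shell
#!/bin/bash

# Return 0 when a line looks like:
#   measurement[,key=value...] field=number[,field=number...]
# Simplified check: no escaping, no string fields, no trailing timestamp.
is_influx_line() {
    local name='[A-Za-z_][A-Za-z0-9_]*'
    local tag="${name}=[^,= ]+"
    local field="${name}=-?[0-9]+(\.[0-9]+)?"
    local re="^${name}(,${tag})* ${field}(,${field})*$"
    [[ "$1" =~ $re ]]
}

for line in \
    "password_expiry,account=app,password_expires_time=20241026 days_until_expiry=6" \
    "password_expiry,account=app,password_expires_time= days_until_expiry=6" \
    "password_expiry,account=app days_until_expiry=never"
do
    if is_influx_line "$line"; then
        echo "ok:  $line"
    else
        echo "bad: $line"
    fi
done
```

The first sample line passes; the second is rejected for its empty label value and the third for its non-numeric field value. The same function could be applied to each line of the real script's output in a while read loop.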
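Monitoring dashboard usermanager.json
Once the test above confirms that the data and its format are correct, configure a dashboard for Linux system user password expiry on the Nightingale monitoring platform. Import the following JSON directly to complete the configuration.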
{
  "name": "Linux system account password expiry check",
  "tags": "usermanager",
  "ident": "",
  "configs": {
    "var": [
      {
        "name": "prom",
        "label": "Data source",
        "type": "datasource",
        "definition": "prometheus",
        "defaultValue": ""
      },
      {
        "name": "user",
        "label": "User",
        "type": "query",
        "datasource": {"cate": "prometheus", "value": 1},
        "definition": "label_values(account)"
      }
    ],
    "panels": [
      {
        "type": "table",
        "id": "2d96fa01-57a2-4ba1-b1a2-8369c3bf34f2",
        "layout": {"h": 12, "w": 24, "x": 0, "y": 0, "i": "2d96fa01-57a2-4ba1-b1a2-8369c3bf34f2", "isResizable": true},
        "version": "3.0.0",
        "datasourceCate": "prometheus",
        "datasourceValue": 1,
        "targets": [
          {
            "refId": "A",
            "expr": "password_expiry_days_until_expiry",
            "legend": "",
            "time": {"start": "now-1m", "end": "now"},
            "instant": false
          }
        ],
        "transformations": [
          {
            "id": "organize",
            "options": {
              "excludeByName": {"__name__": true, "value": false, "password_expires_on": true, "password_expires_time": false, "account": false, "ident": false},
              "renameByName": {"account": "System user", "ident": "Host", "password_expires_on": "", "value": "Days until password expiry", "password_expires_time": "Password expiry date"},
              "indexByName": {"ident": 0, "account": 1, "password_expires_time": 2, "value": 3}
            }
          }
        ],
        "name": "System user password expiry check",
        "maxPerRow": 4,
        "custom": {
          "showHeader": true,
          "colorMode": "value",
          "calc": "last",
          "displayMode": "labelsOfSeriesToRows",
          "columns": ["ident", "account", "password_expires_time", "value"],
          "sortColumn": "value",
          "sortOrder": "ascend",
          "linkMode": "appendLinkColumn"
        },
        "options": {
          "valueMappings": [
            {"type": "special", "result": {"color": "#000000", "text": "never"}, "match": {"special": 99999}},
            {"type": "range", "result": {"color": "rgba(253, 0, 0, 1)"}, "match": {"from": -1000, "to": 15}}
          ],
          "standardOptions": {"util": "none"}
        },
        "overrides": [
          {
            "matcher": {"id": "byName", "value": "password_expires_time"},
            "properties": {
              "valueMappings": [
                {"type": "special", "result": {"color": "#000000", "text": "never"}, "match": {"special": 99999}}
              ],
              "standardOptions": {"util": "none"}
            }
          }
        ]
      }
    ],
    "version": "3.0.0",
    "graphTooltip": "default",
    "graphZoom": "default"
  }
}
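Alert rule alertrule.json
Configure an alert rule for Linux system user password expiry on the Nightingale monitoring platform: once fewer than 7 days remain before a password expires, a reminder is pushed every 24 hours through the WeCom and Feishu channels. Import the following JSON directly to complete the alert rule configuration.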
[
  {
    "cate": "prometheus",
    "datasource_ids": [0],
    "name": "Linux system account expiry alert reminder",
    "note": "Your host system account {{$labels.account}} is about to expire. Please change the password promptly!!!",
    "prod": "metric",
    "algorithm": "",
    "algo_params": null,
    "delay": 0,
    "severity": 0,
    "severities": [3],
    "disabled": 0,
    "prom_for_duration": 60,
    "prom_ql": "",
    "rule_config": {
      "queries": [
        {
          "keys": {"labelKey": "", "valueKey": ""},
          "prom_ql": "password_expiry_days_until_expiry<7",
          "severity": 3
        }
      ]
    },
    "prom_eval_interval": 30,
    "enable_stime": "00:00",
    "enable_stimes": ["00:00"],
    "enable_etime": "00:00",
    "enable_etimes": ["00:00"],
    "enable_days_of_week": ["0", "1", "2", "3", "4", "5", "6"],
    "enable_days_of_weeks": [["0", "1", "2", "3", "4", "5", "6"]],
    "enable_in_bg": 0,
    "notify_recovered": 1,
    "notify_channels": ["wecom", "feishu"],
    "notify_repeat_step": 1440,
    "notify_max_number": 0,
    "recover_duration": 0,
    "callbacks": [],
    "runbook_url": "",
    "append_tags": [],
    "annotations": {},
    "extra_config": null
  }
]
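The rule's condition is a plain threshold comparison, password_expiry_days_until_expiry<7. As a rough local illustration of which accounts that comparison would select, the sketch below applies the same test to a few sample values. The function name accounts_to_alert and the input layout ("account days" per line) are hypothetical, and the sketch does not reproduce how Nightingale evaluates or schedules the rule.

```shell
#!/bin/bash

# Print the accounts whose remaining days are strictly below the threshold
# (default 7), mirroring the comparison password_expiry_days_until_expiry<7.
# Input on stdin: one "account days" pair per line.
accounts_to_alert() {
    local threshold="${1:-7}"
    local account days
    while read -r account days; do
        if (( days < threshold )); then
            echo "$account"
        fi
    done
}

accounts_to_alert 7 <<'EOF'
app 6
root 99999
weihu 7
EOF
```

Only app is printed. An account with exactly 7 days left is not selected because the comparison is strict, and the 99999 sentinel used for never-expiring passwords always stays above the threshold.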
Results
Monitoring dashboard result
Alert notification result
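[Test Platform - Alert]
Severity and status: S3
Rule name: Linux system account expires in fewer than 7 days alert reminder
Rule note: Your host system account app is about to expire. Please change the password promptly!!!
Alert host: localhost
Trigger time: 2024-10-19 14:17:34
Trigger value: 7
Sent time: 2024-10-19 14:17:35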