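Prometheus UI and node_exporter, with Grafana for visualization

We start with a basic installation.

Synchronizing the system time

Set up NTP time synchronization before installing Prometheus. The Prometheus server depends on an accurate system clock, so the local time must be kept in sync.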

# Use crontab to synchronize the time on a schedule
yum install ntpdate -y
# Add a cron job
crontab -e
* * * * * ntpdate ntp.aliyun.com

Installing and starting the Prometheus server

Installing Prometheus is simple: unpack the release tarball and run the binary.

tar -xvzf prometheus-2.44.0.linux-amd64.tar.gz
mv prometheus-2.44.0.linux-amd64  /usr/local/prometheus
cd /usr/local/prometheus

Here we write a systemd unit file so the service can be managed with systemctl.

[root@prometheus-server data]# cat /usr/lib/systemd/system/prometheus.service
[Unit]
Description=Prometheus: the alerting system
Documentation=http://prometheus.io/docs/
After=network-online.target

[Service]
ExecStart=/usr/local/prometheus/prometheus --web.enable-lifecycle --storage.tsdb.retention=90d --storage.tsdb.path=/usr/local/prometheus/data --config.file=/usr/local/prometheus/prometheus.yml
Restart=always
RestartSec=15s

[Install]
WantedBy=multi-user.target

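A few of the flags:
- --config.file: the Prometheus configuration file to load
- --web.listen-address: the address and port the Prometheus web interface listens on; the default port is 9090
- --storage.tsdb.path: the directory where data is persisted
- --storage.tsdb.retention: how long data is kept (newer 2.x releases deprecate this in favor of --storage.tsdb.retention.time)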
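
As a rough sketch of how these flags fit together, the script below assembles the same kind of command line the unit file uses and prints it. The variable names `PROM_HOME`, `LISTEN_ADDR`, and `RETENTION` and the `prometheus_cmd` helper are made up for this example, and the listen address is an assumed value, since the unit file in this article relies on the default.

```shell
#!/usr/bin/env bash
# Illustrative helper: assemble the Prometheus command line from a few variables.
# PROM_HOME, LISTEN_ADDR and RETENTION are names invented for this sketch.
PROM_HOME="${PROM_HOME:-/usr/local/prometheus}"
LISTEN_ADDR="${LISTEN_ADDR:-0.0.0.0:9090}"   # assumed value; 9090 is the default port
RETENTION="${RETENTION:-90d}"

prometheus_cmd() {
  printf '%s' "$PROM_HOME/prometheus"
  # printf repeats the ' %s' format once per argument, giving one flag per word.
  printf ' %s' \
    "--config.file=$PROM_HOME/prometheus.yml" \
    "--web.listen-address=$LISTEN_ADDR" \
    "--storage.tsdb.path=$PROM_HOME/data" \
    "--storage.tsdb.retention=$RETENTION"
  printf '\n'
}

# Print the command rather than run it, so it can be reviewed or pasted into ExecStart=.
prometheus_cmd
```

Printing the command first makes it easy to compare against the ExecStart= line before changing the unit file.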

The Prometheus service can now be managed with systemctl:

systemctl daemon-reload
systemctl restart prometheus
systemctl enable prometheus # start at boot

Then check the service status:

[root@prometheus-66 ~]# systemctl status prometheus

Opening and using the Prometheus UI

In a browser, open port 9090 on the host where the service is running, for example localhost:9090.

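Prometheus UI is the visual management interface built into Prometheus. It lets you inspect the current configuration, the state of the scrape jobs, and so on. In the Graph panel you can also run PromQL queries against the monitoring data in real time.
As a time series database, Prometheus stores the data it collects as files on local disk. The default location is a data directory under the directory the command is run from, created automatically; the --storage.tsdb.path="data/" flag changes it.
Querying up returns 1 for a target that is alive and 0 for a target whose scrape failed.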
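
The same `up` query can also be sent to the Prometheus HTTP API from the command line. The following is a minimal sketch, assuming the server is reachable at http://localhost:9090 and `curl` is installed; the `query_prometheus` helper and the `PROM_URL` variable are names made up for this example.

```shell
#!/usr/bin/env bash
# Sketch: run an instant PromQL query through the Prometheus HTTP API.
PROM_URL="${PROM_URL:-http://localhost:9090}"   # assumed address of the server

query_prometheus() {
  # --get turns the url-encoded data into a query string: /api/v1/query?query=...
  curl -sS --max-time 5 --get "$PROM_URL/api/v1/query" --data-urlencode "query=$1"
}

# Only contact the server when an expression is passed on the command line,
# e.g.:  bash query.sh up
if [ "$#" -gt 0 ]; then
  query_prometheus "$1"
fi
```

The response is JSON; each entry in the result carries the target's labels and a value of "1" or "0".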

To see which hosts are being monitored, open Status > Targets in the UI.

The Prometheus server configuration file

The main configuration file is prometheus.yml. The unpacked release already ships with a basic one; after a few simple changes it looks like this:

[root@prometheus-66 prometheus]# vim prometheus.yml
# my global config
global:
  scrape_interval: 15s # Set the scrape interval to every 15 seconds. Default is every 1 minute.
  evaluation_interval: 15s # Evaluate rules every 15 seconds. The default is every 1 minute.
  # scrape_timeout is set to the global default (10s).

# Alertmanager configuration
alerting:
  alertmanagers:
    - static_configs:
        - targets:
          # - alertmanager:9093

# Load rules once and periodically evaluate them according to the global 'evaluation_interval'.
rule_files:
  # - "first_rules.yml"
  # - "second_rules.yml"

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: "prometheus"

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ["localhost:9090","10.0.0.110:9100","10.0.0.66:9100"]
        # The label name is added as a label `label_name=<label_value>` to any timeseries scraped from this config.
        labels:
          app: "prometheus"

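The two global settings:

scrape_interval: 15s       # scrape targets every 15 seconds; the default is one minute
evaluation_interval: 15s   # evaluate alerting rules every 15 seconds; the default is one minute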

scrape_configs lists the targets Prometheus monitors and is the core of the configuration. Each entry in scrape_configs is a job, and jobs come in several kinds. The simplest is static_configs, where every target is listed statically in the file:

- job_name: prometheus
  static_configs:
    - targets: ['localhost:9090']

This job points at the Prometheus server itself on port 9090. You can define further jobs for the hosts you want to monitor, for example:

- job_name: 'aliyun'
  static_configs:
    - targets: ['server1:9100','IP:9100','nginxserver:9100','web01:9100','redis:9100','logserver:9100','redis2:9100']
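
When the target list grows, writing the job block by hand gets error-prone. One possible approach, sketched here with a made-up `render_job` helper, is to generate the block from a job name and a list of host:port pairs and paste the output under scrape_configs. The host names are the placeholders used in this article.

```shell
#!/usr/bin/env bash
# Sketch: render one static_configs job from a job name and a list of targets.
# render_job is a helper invented for this example, not a Prometheus tool.
render_job() {
  local job="$1"
  shift
  printf '  - job_name: "%s"\n' "$job"
  printf '    static_configs:\n'
  printf '      - targets:\n'
  local t
  for t in "$@"; do
    printf '          - "%s"\n' "$t"
  done
}

# Output is indented so it can sit directly under the scrape_configs: key.
render_job aliyun server1:9100 web01:9100 redis:9100
```

The block-style list it prints is equivalent to the inline ['a','b'] form used in the article.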

Checking the syntax of prometheus.yml

promtool is the tool bundled with Prometheus for validating prometheus.yml.

[root@prometheus-66 prometheus]# ll
total 373876
-rw-r--r-- 1 root root  72649805 Dec 14  2021 grafana-8.3.3-1.x86_64.rpm
-rw-r--r-- 1 1001  118     11357 Jul 15 00:36 LICENSE
-rw-r--r-- 1 1001  118      3773 Jul 15 00:36 NOTICE
-rwxr-xr-x 1 1001  118 159425376 Jul 15 00:17 prometheus
-rw-r--r-- 1 1001  118      1128 Jul 18 14:09 prometheus.yml
-rwxr-xr-x 1 1001  118 150746286 Jul 15 00:17 promtool

[root@prometheus-66 prometheus]# ./promtool check config prometheus.yml
Checking prometheus.yml
 SUCCESS: prometheus.yml is valid prometheus config file syntax

Always validate prometheus.yml after editing it. If the configuration contains a mistake, promtool reports the error.
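
Because the unit file in this article starts Prometheus with --web.enable-lifecycle, the check can be combined with a reload request so that a broken file never reaches the running server. This is a sketch under assumptions: the server is taken to listen on http://localhost:9090, and the `check_and_reload` helper and the `PROMTOOL` and `PROM_URL` variables are names made up for this example.

```shell
#!/usr/bin/env bash
# Sketch: validate prometheus.yml and only then ask the running server to reload it.
PROM_HOME="${PROM_HOME:-/usr/local/prometheus}"
PROMTOOL="${PROMTOOL:-$PROM_HOME/promtool}"
PROM_URL="${PROM_URL:-http://localhost:9090}"   # assumed address of the server

check_and_reload() {
  local cfg="${1:-$PROM_HOME/prometheus.yml}"
  if "$PROMTOOL" check config "$cfg"; then
    # /-/reload is only served when Prometheus runs with --web.enable-lifecycle.
    curl -sS --max-time 5 -X POST "$PROM_URL/-/reload" && echo "reload requested"
  else
    echo "config check failed; not reloading" >&2
    return 1
  fi
}

# Run only when a file is given, e.g.:  bash reload.sh /usr/local/prometheus/prometheus.yml
if [ "$#" -gt 0 ]; then
  check_and_reload "$1"
fi
```

Restarting through systemctl works as well; the reload endpoint simply avoids stopping the process.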

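Monitoring Linux hosts with Prometheus

About exporters

Any program that exposes monitoring samples to Prometheus can be called an exporter, and a running instance of an exporter is called a target. Prometheus periodically polls these targets to collect samples.

(2) User-defined exporters

Besides the exporters provided by the community, you can build your own on top of the Prometheus client libraries. The project officially provides libraries for Python, Go, Java/Scala, Ruby, and others, and there are third-party implementations for Bash, C++, Lua, Node.js, PHP, Rust, and more.

The officially recommended exporters are listed at https://prometheus.io/docs/instrumenting/exporters/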

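About node_exporter

node_exporter collects CPU load, memory usage, network statistics, and similar data from the monitored host and exposes it for the Prometheus server to scrape. It runs in the background as an HTTP server and continuously gathers operating-system-level metrics from Linux. Collection is fast and broad: the default set of collectors already covers far more than we actually need.
Download node_exporter from the Prometheus website: https://prometheus.io/download/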

[root@localhost ~]# tar zxvf node_exporter-1.5.0.linux-amd64.tar.gz
[root@localhost ~]# mv node_exporter-1.5.0.linux-amd64  /usr/local/node_exporter
[root@localhost ~]# cd  /usr/local/node_exporter

We also write a systemd unit file to start this service:

[root@prometheus-server data]# cat /usr/lib/systemd/system/node_exporter.service
[Unit]
Description=Prometheus node_exporter
Documentation=http://prometheus.io/docs/
After=network-online.target

[Service]
ExecStart=/usr/local/node_exporter/node_exporter
Restart=always
RestartSec=15s

[Install]
WantedBy=multi-user.target

Start the service and check that it is listening on its port (9100 by default).
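
A minimal sketch of that check, assuming the unit is installed under the name node_exporter and that the `ss` utility is available on the host. The `port_is_listening` helper is a name made up for this example.

```shell
#!/usr/bin/env bash
# Sketch: report whether something is listening on a TCP port (9100 for node_exporter).
port_is_listening() {
  # `ss -ltn` lists listening TCP sockets; column 4 is the local address:port.
  ss -ltn | awk -v port="$1" 'NR > 1 && $4 ~ (":" port "$") { found = 1 } END { exit !found }'
}

# Typical sequence on the monitored host (needs root; unit name is an assumption):
#   systemctl daemon-reload
#   systemctl enable --now node_exporter
# Then, e.g.:  bash port_check.sh 9100
if [ "$#" -gt 0 ]; then
  if port_is_listening "$1"; then
    echo "port $1 is listening"
  else
    echo "port $1 is not listening"
  fi
fi
```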

Fetch the monitoring data from node_exporter (the samples are served under /metrics):

http://localhost:9100
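
The metrics page is plain text with one sample per line, so it can be inspected from the shell as well as the browser. The sketch below filters one metric family out of that output; it assumes node_exporter is reachable at http://localhost:9100, and `metric_lines` and `NODE_URL` are names made up for this example.

```shell
#!/usr/bin/env bash
# Sketch: pull one metric family out of node_exporter's text output.
NODE_URL="${NODE_URL:-http://localhost:9100}"   # assumed address of node_exporter

metric_lines() {
  # Keep sample lines whose name is exactly $1, with or without a {label} set;
  # "# HELP" and "# TYPE" comment lines do not match because of the ^ anchor.
  grep -E "^$1[{ ]"
}

# e.g.:  bash metrics.sh node_load1
if [ "$#" -gt 0 ]; then
  curl -sS --max-time 5 "$NODE_URL/metrics" | metric_lines "$1"
fi
```

Matching on the character after the name keeps node_load1 from also pulling in node_load15.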

The Prometheus UI shows the data it has scraped, including the samples from this node.

Each series carries the labels defined in prometheus.yml, and these can be changed; here we rename the group to aliyun.com. Every target under a job is tagged with that job's name.

The modified part of prometheus.yml:

scrape_configs:
  - job_name: "prometheus"

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
      - targets: ["localhost:9090"]

  - job_name: "aliyun.com"
    static_configs:
      - targets: ["localhost:9100","10.0.0.110:9100"]

Learning the query syntax

Here we only want the series that belong to the aliyun.com group.
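
In PromQL this kind of filter is written as a label matcher in braces after the metric name, and the job name from prometheus.yml is available as the job label. A small sketch, assuming the server is at http://localhost:9090; `job_selector` and `PROM_URL` are names made up for this example.

```shell
#!/usr/bin/env bash
# Sketch: build a PromQL selector restricted to one job and send it to the HTTP API.
PROM_URL="${PROM_URL:-http://localhost:9090}"   # assumed address of the server

job_selector() {
  # $1 = metric name, $2 = value of the job label (the job_name from prometheus.yml)
  printf '%s{job="%s"}\n' "$1" "$2"
}

# e.g.:  bash by_job.sh up aliyun.com
if [ "$#" -ge 2 ]; then
  curl -sS --max-time 5 --get "$PROM_URL/api/v1/query" \
    --data-urlencode "query=$(job_selector "$1" "$2")"
fi
```

The expression it builds for the example arguments, up{job="aliyun.com"}, can be pasted into the Graph panel unchanged.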

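By now we should have a reasonable grasp of how to deploy Prometheus and modify its configuration file.

The built-in UI is not very convenient for reading the data, however, so we use a graphical tool to present it.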

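Visualizing Prometheus monitoring with Grafana

Grafana is an open-source tool for metric monitoring and visualization, commonly used to display time series data about infrastructure and application behavior. The official website is https://grafana.com/. Its dashboards are visually polished, which makes it a valuable tool for operations teams that want better monitoring visualization.
Data source (datasource)

A data source is where the data is stored, and it defines how the data shown in Grafana is queried. Different data sources have different query syntaxes. Grafana supports many of them; the officially supported ones include Graphite, InfluxDB, OpenTSDB, Prometheus, Elasticsearch, and CloudWatch.

Each data source has its own query language and capabilities. Data from several data sources can be combined in one dashboard, but each panel is bound to a specific data source that belongs to a specific organization.
Dashboard

Once a data source defines where the data comes from, what matters most to the user is visualizing it. In Grafana, dashboards are used to organize and manage visualizations.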

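(2) Installing and configuring Grafana

Installing Grafana is simple. There is an official package repository, it can be started locally from a Docker image, and rpm packages and binary tarballs are also available for download.

Download Grafana from https://grafana.com/get and pick the build that matches your platform. The official instructions are detailed. For example, to install from the rpm package on Red Hat and CentOS, run: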

[root@localhost ~]# wget https://dl.grafana.com/oss/release/grafana-8.3.3-1.x86_64.rpm
[root@localhost ~]#  yum localinstall grafana-8.3.3-1.x86_64.rpm

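After installation, start Grafana and open http://your-host:3000 to reach the login page. The default username and password are both admin.

This article installs it on CentOS 7.x using the method above. Once installed, enable the service at boot and start it: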

[root@localhost ~]# systemctl enable grafana-server # start at boot
[root@localhost ~]# systemctl start grafana-server # start the service
[root@localhost ~]# systemctl status grafana-server # check that the service started correctly

The Grafana configuration file is /etc/grafana/grafana.ini, and it usually does not need to be changed. That completes the installation.
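
With Grafana running, Prometheus still has to be registered as a data source. This can be done in the web interface, or through a provisioning file that Grafana reads at startup. The sketch below prints such a file; it assumes Prometheus is at http://localhost:9090, and the `datasource_yaml` helper, the `PROM_URL` variable, and the data source name are choices made for this example.

```shell
#!/usr/bin/env bash
# Sketch: emit a Grafana provisioning file that registers Prometheus as a data source.
PROM_URL="${PROM_URL:-http://localhost:9090}"   # assumed address of the Prometheus server

datasource_yaml() {
  cat <<EOF
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy
    url: $PROM_URL
    isDefault: true
EOF
}

# Print to stdout so the result can be reviewed before it is installed.
datasource_yaml
```

Redirect the output to a file under /etc/grafana/provisioning/datasources/ (the file name is up to you), then restart grafana-server so the data source is picked up.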