Developing a Prometheus exporter in Python for Hadoop YARN monitoring

Before writing an exporter in Python, you need to know that Prometheus provides four metric types:

Counter, Gauge, Summary and Histogram.

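* A Counter can only increase, and it is reset to 0 when the program restarts. It is commonly used for metrics that only go up, such as the number of tasks, total processing time, or the number of errors.

* A Gauge is similar to a Counter; the only difference is that its value can also decrease. It is commonly used for metrics such as temperature or utilization.

* Summary and Histogram are more complex concepts. I have no use case for them at the moment, so I have not looked into them yet.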
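The difference between the first two types can be seen in a few lines. This is a minimal sketch, not part of the exporter: the metric names `demo_errors_total` and `demo_running_apps` are made up for illustration, and a private `CollectorRegistry` is used so the demo does not touch the default registry.

```python
from prometheus_client import CollectorRegistry, Counter, Gauge

# A private registry keeps this demo separate from the default global one.
registry = CollectorRegistry()

# Hypothetical metric names, used only for illustration.
errors = Counter('demo_errors_total', 'Errors seen so far', registry=registry)
running = Gauge('demo_running_apps', 'Applications currently running', registry=registry)

errors.inc()     # a counter only goes up
errors.inc(2)
running.set(5)   # a gauge can be set to any value...
running.dec(2)   # ...and can also go down

errors_value = registry.get_sample_value('demo_errors_total')
running_value = registry.get_sample_value('demo_running_apps')
print(errors_value, running_value)
```

Because values such as a znode count can shrink between polls, a Gauge is the natural fit for that kind of reading.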

The pip module we need

Install it with pip:

    pip install prometheus_client

Then import what the exporter uses:

    from prometheus_client import CollectorRegistry, Gauge, push_to_gateway, start_http_server

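Code outline

    import time


    def push_yarn():
        # Monitor the ZooKeeper RMAppRoot znode count
        Yarn_zkRMAppRoot()
        # Monitor YARN application info
        Yarn_AppsInfo()


    def run():
        start_http_server(8006)  # serve metrics on port 8006
        while True:
            push_yarn()
            time.sleep(10)


    if __name__ == '__main__':
        run()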

push_yarn() collects the data to be monitored.

The loop calls it every 10 seconds, so the exported values keep being refreshed.
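One thing to watch with a loop like this is that an exception inside a collection function would end the whole process. The sketch below shows one way to guard against that. It is an assumption-laden illustration rather than the original code: `run_collection_loop`, `max_iterations` and the injectable `sleep` are hypothetical names added so the loop can be exercised without waiting.

```python
import time


def run_collection_loop(collect, interval_seconds=10, max_iterations=None, sleep=time.sleep):
    """Call collect() repeatedly and keep going when a call fails.

    max_iterations and sleep are hypothetical parameters added for testing;
    max_iterations=None loops forever, like the original while True.
    Returns the number of failed collections.
    """
    failures = 0
    iteration = 0
    while max_iterations is None or iteration < max_iterations:
        try:
            collect()
        except Exception as exc:  # keep the exporter alive on any collection error
            failures += 1
            print(f"collection failed: {exc}")
        sleep(interval_seconds)
        iteration += 1
    return failures


# Demo with a collector that fails on its second call.
calls = []


def flaky_collect():
    calls.append(1)
    if len(calls) == 2:
        raise RuntimeError("ResourceManager unreachable")


failure_count = run_collection_loop(
    flaky_collect, interval_seconds=0, max_iterations=3, sleep=lambda seconds: None
)
```

In the exporter itself the call would look like `run_collection_loop(push_yarn)` after `start_http_server(8006)`.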

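We use Gauge instances

Note: a Gauge is similar to a Counter, the only difference being that its value can decrease. It is commonly used for metrics such as temperature or utilization.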

Create the Gauge instances

    yarn_zkRMAppRoot_code = Gauge('yarn_zkRMAppRoot', 'yarn_zkRMAppRoot_num', ['instance'])
    started_time_gauge = Gauge('yarn_started_time', 'started_time', ['application'])
    launch_time_gauge = Gauge('yarn_launch_time', 'launch_time', ['application'])
    finished_time_gauge = Gauge('yarn_finished_time', 'finished_time', ['application'])
    memory_seconds_gauge = Gauge('yarn_memory_seconds', 'memory_seconds', ['application'])
    vcore_seconds_gauge = Gauge('yarn_vcore_seconds', 'vcore_seconds', ['application'])

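yarn_zkRMAppRoot_code: a Gauge that records the number of znodes under the YARN ResourceManager application root in ZooKeeper. It has an instance label.

yarn_started_time: a Gauge that records the application's start time (the startedTime field). It has an application label, used to tell applications apart.

yarn_launch_time: a Gauge that records the application's launch time (the launchTime field). It also has an application label.

yarn_finished_time: a Gauge that records the application's finish time (the finishedTime field). It also has an application label.

yarn_memory_seconds: a Gauge that records the memory used by the application multiplied by its running time (memory-seconds). It also has an application label.

yarn_vcore_seconds: a Gauge that records the number of virtual CPU cores used by the application multiplied by its running time (vcore-seconds). It also has an application label.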
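To make the role of the application label concrete, here is a small sketch of how one labeled Gauge turns into one time series per application. The application IDs and values are made up, and a private registry is used so the snippet runs on its own.

```python
from prometheus_client import CollectorRegistry, Gauge, generate_latest

# Private registry so the demo is independent of the exporter's metrics.
registry = CollectorRegistry()
memory_seconds_gauge = Gauge(
    'yarn_memory_seconds', 'memory_seconds', ['application'], registry=registry
)

# Made-up application IDs and values, for illustration only.
memory_seconds_gauge.labels(application='application_1700000000000_0001').set(204800)
memory_seconds_gauge.labels(application='application_1700000000000_0002').set(512000)

# Render the registry in the text format that Prometheus scrapes.
exposition = generate_latest(registry).decode('utf-8')
print(exposition)
```

Each distinct label value shows up as its own line in the output, so every application ID that is ever set becomes a separate series.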

Implement the metrics we want to monitor

    # -------- yarn --------
    import json
    import subprocess

    import requests

    # hostname, kerberos_switch, yarn_rm1 and yarn_rm2 are used below but are not
    # defined in this article; they are expected to be set elsewhere in the script.


    def Yarn_zkRMAppRoot():
        # Build the shell command that counts the application znodes.
        # As written, the Kerberos JVM flags are set in the else branch; check
        # this against how kerberos_switch is defined in your deployment.
        if kerberos_switch:
            command = f"""echo 'ls /rmstore/ZKRMStateRoot/RMAppRoot' | /opt/dtstack/DTBase/zookeeper/bin/zkCli.sh | grep application_ | awk -F , '{{print NF}}'"""
        else:
            command = f"""export CLIENT_JVMFLAGS="$CLIENT_JVMFLAGS -Djava.security.auth.login.config=/opt/dtstack/DTBase/zookeeper/conf/jaas.conf -Djava.security.krb5.conf=/opt/dtstack/Kerberos/kerberos_pkg/conf/krb5.conf -Dzookeeper.server.principal=zookeeper/{hostname}@DTSTACK.COM"
    echo 'ls /rmstore/ZKRMStateRoot/RMAppRoot' | /opt/dtstack/DTBase/zookeeper/bin/zkCli.sh | grep application_ | awk -F , '{{print NF}}'"""
        # Run the command with the subprocess module.
        result = subprocess.getstatusoutput(command)  # e.g. (0, '455')
        if result[0] == 0:
            yarn_zkRMAppRoot_code.labels('yarn_' + hostname).set(result[1])
        else:
            print(f"Failed to execute command: {command}")


    def Yarn_AppsInfo():
        list_apps = []
        command = "yarn rmadmin -getServiceState rm1"
        apps_url = "http://{}/ws/v1/cluster/apps"
        rm_info = subprocess.getstatusoutput(command)
        if rm_info[0] != 0:
            # Without the HA state there is no ResourceManager to query.
            print(f"Failed to execute command: {command}")
            return
        # Query whichever ResourceManager is currently active.
        if rm_info[1] == 'active':
            rm_host = yarn_rm1
        else:
            rm_host = yarn_rm2
        response = requests.get(url=apps_url.format(rm_host))
        data = json.loads(response.text)
        for app in data['apps']['app']:
            # Large applications only (the "over 10G" filter; memorySeconds is in MB-seconds).
            if app['memorySeconds'] > 102400:
                list_apps.append([app['id'], app['startedTime'], app['launchTime'],
                                  app['finishedTime'], app['memorySeconds'], app['vcoreSeconds']])
        sorted_lst = sorted(list_apps, key=lambda x: (x[4], x[5]))
        for application, started_time, launch_time, finished_time, memory_seconds, vcore_seconds in sorted_lst:
            started_time_gauge.labels(application=application).set(started_time)
            launch_time_gauge.labels(application=application).set(launch_time)
            finished_time_gauge.labels(application=application).set(finished_time)
            memory_seconds_gauge.labels(application=application).set(memory_seconds)
            vcore_seconds_gauge.labels(application=application).set(vcore_seconds)

Here, Yarn_zkRMAppRoot checks the number of znodes.
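The counting that the shell pipeline does with grep and awk could also be done in Python on the raw zkCli.sh output. The sketch below is only an alternative illustration: `count_app_znodes` is a hypothetical helper, the sample output is made up, and it assumes the children are printed as application IDs of the form `application_<digits>_<digits>`.

```python
import re

# Assumed shape of a YARN application ID as it appears in the znode listing.
APP_ID_PATTERN = re.compile(r"application_\d+_\d+")


def count_app_znodes(zkcli_output):
    """Count application znodes named in the text printed by an `ls` in zkCli.sh."""
    return len(APP_ID_PATTERN.findall(zkcli_output))


# Made-up sample of what the command output might look like.
sample_output = """Connecting to localhost:2181
WATCHER::
[application_1720000000000_0001, application_1720000000000_0002, application_1720000000000_0003]
"""

znode_count = count_app_znodes(sample_output)
empty_count = count_app_znodes("[]")
print(znode_count, empty_count)
```

Counting matches rather than comma-separated fields means extra log lines in the output do not change the result.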

Yarn_AppsInfo checks for large applications, the ones over 10G (memorySeconds above 102400).
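The filtering and sorting step can be pulled out into a pure function, which makes it possible to try it without a cluster. This is a sketch under assumptions: `select_large_apps` is a hypothetical name, the JSON sample is made up and contains only the fields used here, and the function tolerates a missing or null `apps` entry in case a response carries no applications.

```python
import json

MEMORY_SECONDS_THRESHOLD = 102400  # same threshold as the exporter code


def select_large_apps(payload, threshold=MEMORY_SECONDS_THRESHOLD):
    """Return apps above the threshold, sorted by (memorySeconds, vcoreSeconds)."""
    apps = (payload.get('apps') or {}).get('app') or []
    large = [app for app in apps if app['memorySeconds'] > threshold]
    return sorted(large, key=lambda app: (app['memorySeconds'], app['vcoreSeconds']))


# Made-up response body with only the fields this sketch needs.
sample_json = '''
{"apps": {"app": [
  {"id": "application_1720000000000_0001", "memorySeconds": 500000, "vcoreSeconds": 300},
  {"id": "application_1720000000000_0002", "memorySeconds": 2048, "vcoreSeconds": 4},
  {"id": "application_1720000000000_0003", "memorySeconds": 150000, "vcoreSeconds": 90}
]}}
'''

selected = select_large_apps(json.loads(sample_json))
selected_ids = [app['id'] for app in selected]
no_apps = select_large_apps({'apps': None})
print(selected_ids)
```

Working with the dictionaries directly also avoids the positional indexes (x[4], x[5]) used in the list-based version.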

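Upload the script to the server and start the exporter:

    python3 mg_exporter.py

Visit http://172.16.121.89:8006/metrics

Then add it as a scrape target in the Prometheus configuration, and the metrics will be collected.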
