Table of contents
- Introduction
- Environment preparation
- Install custom resources
- Deploy Elasticsearch
- Differences between master nodes and data nodes
- Production optimization recommendations
- Verify Elasticsearch after installation
- Deploy Filebeat
- Verify that Filebeat is shipping logs
- Deploy Kibana
- Retrieve the password (the username is elastic)
- Cluster tests
Introduction
- OS: CentOS 7.9
- Kernel: 6.3.5-1.el7
- Kubernetes: v1.26.14
- Elasticsearch official website
- This deployment has been tuned to avoid the usual pitfalls; following the official instructions as-is runs into a few problems.
Environment preparation
- Prepare Ceph or NFS storage (a quick StorageClass check follows this list)
- NFS storage installation guide
- This installation uses the official ECK operator to deploy EFK (an older release, 7.17.3, which most existing production environments are still running).
- RBAC permissions and log template settings are added so that Kubernetes-related fields can be attached to the log output.
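The manifests in the following sections reference a StorageClass named nfs-dynamic, created by the NFS provisioner mentioned above; a quick sanity check before deploying (the actual name depends on your provisioner):

## The StorageClass name must match the storageClassName used in the manifests below
kubectl get storageclass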
Install custom resources
kubectl create -f https://download.elastic.co/downloads/eck/1.7.1/crds.yaml
kubectl apply -f https://download.elastic.co/downloads/eck/1.7.1/operator.yaml
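Before deploying anything else it is worth confirming that the operator came up; a minimal check, assuming the default elastic-system namespace created by operator.yaml (the 1.7.x operator should run as a StatefulSet named elastic-operator):

kubectl get pods -n elastic-system
## Tail the operator log to spot startup errors early
kubectl logs -n elastic-system statefulset/elastic-operator --tail=20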
Deploy Elasticsearch
apiVersion: elasticsearch.k8s.elastic.co/v1
kind: Elasticsearch
metadata:
  name: quickstart
  namespace: elastic-system
spec:
  version: 7.17.3
  nodeSets:
  - name: masters
    count: 1
    config:
      node.roles: ["master"]
      xpack.ml.enabled: true
    podTemplate:
      spec:
        initContainers:
        - name: sysctl
          securityContext:
            privileged: true
            runAsUser: 0
          command: ['sh', '-c', 'sysctl -w vm.max_map_count=262144']
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
      spec:
        storageClassName: nfs-dynamic
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 10Gi
  - name: data
    count: 1
    config:
      node.roles: ["data", "ingest", "ml", "transform"]
    podTemplate:
      spec:
        initContainers:
        - name: sysctl
          securityContext:
            privileged: true
            runAsUser: 0
          command: ['sh', '-c', 'sysctl -w vm.max_map_count=262144']
    volumeClaimTemplates:
    - metadata:
        name: elasticsearch-data
      spec:
        storageClassName: nfs-dynamic
        accessModes:
        - ReadWriteOnce
        resources:
          requests:
            storage: 50Gi
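A minimal way to apply the manifest above, assuming it was saved as elasticsearch.yaml (an arbitrary file name chosen here):

kubectl apply -f elasticsearch.yaml
## Wait for the master and data pods to become Ready
kubectl get pods -n elastic-system -w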
For production, configure the cluster as described in the sections below; this is a test environment, so I went with whatever was easiest.
Differences between master nodes and data nodes
| Characteristic | Master node | Data node |
| --- | --- | --- |
| Core responsibility | Manages cluster metadata (indices, shard allocation, node state) | Stores data (primary and replica shards) and handles reads and writes (search, aggregations) |
| Role definition in the config | node.roles: ["master"] | node.roles: ["data", "ingest", "ml", "transform"] |
| Resource requirements | Low CPU/memory (lightweight metadata management) | High CPU/memory/disk (data processing and computation) |
| High-availability requirements | Must be redundant (at least 3 in production to avoid split brain) | Horizontally scalable (add or remove nodes according to data volume and load) |
| Example scenarios | Cluster coordination, shard allocation, state maintenance | Document indexing, search handling, machine-learning jobs |
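The roles in the table above can be checked against a running cluster; once the port-forward and $PASSWORD from the verification section further down are in place, the _cat API shows which node currently holds the elected master role:

## Shows which node is currently the elected master
curl -u "elastic:$PASSWORD" -k "https://localhost:9200/_cat/master?v"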
Production optimization recommendations
Master node configuration tuning
nodeSets:
- name: masters
  count: 3                  # Deploy at least 3 in production
  config:
    node.roles: ["master"]
    # Disable non-essential features to save resources
    xpack.ml.enabled: false
Data node role separation
- name: data-only
  count: 2
  config:
    node.roles: ["data"]              # Dedicated to data storage
- name: ingest
  count: 2
  config:
    node.roles: ["ingest"]            # Dedicated ingest (write) nodes
- name: ml
  count: 1
  config:
    node.roles: ["ml", "transform"]   # Standalone compute nodes
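A sketch of how such a change could be rolled out, assuming the nodeSets above are merged back into the elasticsearch.yaml file used earlier (a file name chosen for this walkthrough); ECK replaces pods one nodeSet at a time:

kubectl apply -f elasticsearch.yaml
## Watch the rolling update; this is the label ECK puts on Elasticsearch pods
kubectl get pods -n elastic-system -l elasticsearch.k8s.elastic.co/cluster-name=quickstart -w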
Verify Elasticsearch after installation
## Open two terminals, or run the port-forward in the background.
kubectl port-forward -n elastic-system services/quickstart-es-http 9200

## Get the password
PASSWORD=$(kubectl get secret -n elastic-system quickstart-es-elastic-user -o go-template='{{.data.elastic | base64decode}}')

## Test access
curl -u "elastic:$PASSWORD" -k "https://localhost:9200"
Deploy Filebeat
- The Beat resource below runs Filebeat as a DaemonSet (followed by the RBAC objects it needs); verification commands come after the manifests.
apiVersion: beat.k8s.elastic.co/v1beta1
kind: Beat
metadata:
  name: filebeat
  namespace: elastic-system
spec:
  type: filebeat
  version: 7.17.3
  elasticsearchRef:
    name: quickstart              # Name of the Elasticsearch resource to connect to
    namespace: elastic-system     # Namespace that Elasticsearch lives in
  config:
    filebeat.inputs:
    - type: container
      paths:
      - /var/log/containers/*.log
      processors:
      - add_kubernetes_metadata:  # Add Kubernetes labels and related metadata
          host: ${NODE_NAME}
          matchers:
          - logs_path:
              logs_path: "/var/log/containers/"
      - drop_fields:              # Add or remove fields here as needed
          fields: ["agent", "ecs", "container", "host", "host.name", "input", "log", "offset", "stream", "kubernetes.namespace", "kubernetes.labels.app", "kubernetes.node", "kubernetes.pod", "kubernetes.replicaset", "kubernetes.namespace_uid", "kubernetes.labels.pod-template-hash"]
          ignore_missing: true    # Do not fail if a field is missing
      - decode_json_fields:
          fields: ["message"]     # Source field to parse
          target: ""              # Parse into the root level (flattened fields)
          overwrite_keys: false   # Do not overwrite existing values
          process_array: false    # Do not parse arrays
          max_depth: 1            # Only parse one level of JSON
    output.elasticsearch:
      username: "elastic"                    # Built-in superuser (not recommended for production)
      password: "5ypyQpuC6BB191Si9w1209MM"   # Replace with the real password (inject via a Secret in production)
      index: "filebeat-other-log-%{+yyyy.MM.dd}"     # Fallback index (rolls daily)
      indices:                               # Index routing rules (route events by condition)
      - index: "filebeat-containers-log-%{+yyyy.MM.dd}"
        when.or:
        - contains:
            kubernetes.labels.app: "etcd"
      - index: "filebeat-services-log-%{+yyyy.MM.dd}"
        when.contains:
          kubernetes.labels.type: "service"
      pipelines:                             # Route events through ingest pipelines
      - pipeline: "filebeat-containers-log-pipeline"
        when.or:
        - contains:
            kubernetes.labels.app: "etcd"
      - pipeline: "filebeat-services-log-pipeline"
        when.contains:
          kubernetes.labels.type: "service"
    setup.template.settings:
      index:
        number_of_shards: 1       # One primary shard
        number_of_replicas: 0     # No replicas (use at least 1 in production)
    setup.template.enabled: true            # Template support must be enabled
    setup.template.overwrite: true          # Overwrite any existing template
    setup.template.name: "filebeat-log-template"   # Custom template name
    setup.template.pattern: "filebeat-*-log-*"     # Matches all of the log indices above
    setup.ilm.enabled: false                # Disable ILM (compatible with the manual template configuration)
  daemonSet:
    podTemplate:
      spec:
        serviceAccount: elastic-beat-filebeat-quickstart
        automountServiceAccountToken: true
        dnsPolicy: ClusterFirstWithHostNet
        hostNetwork: true
        securityContext:
          runAsUser: 0
        containers:
        - name: filebeat
          env:
          - name: NODE_NAME
            valueFrom:
              fieldRef:
                fieldPath: spec.nodeName
          volumeMounts:
          - name: varlogcontainers
            mountPath: /var/log/containers
          - name: varlogpods
            mountPath: /var/log/pods
          - name: varlibdockercontainers
            mountPath: /var/lib/containers
        volumes:
        - name: varlogcontainers
          hostPath:
            path: /var/log/containers
        - name: varlogpods
          hostPath:
            path: /var/log/pods
        - name: varlibdockercontainers
          hostPath:
            path: /var/lib/containers
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: elastic-beat-filebeat-quickstart
  namespace: elastic-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: elastic-beat-autodiscover-binding
  namespace: elastic-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: elastic-beat-autodiscover
subjects:
- kind: ServiceAccount
  name: elastic-beat-filebeat-quickstart
  namespace: elastic-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: elastic-beat-autodiscover
  namespace: elastic-system
rules:
- apiGroups:
  - ""
  resources:
  - nodes
  - namespaces
  - events
  - pods
  verbs:
  - get
  - list
  - watch
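Assuming all of the documents above (the Beat resource plus the RBAC objects) are saved together as filebeat.yaml (an arbitrary file name), they can be applied in one go and the resulting DaemonSet checked:

kubectl apply -f filebeat.yaml
## The Beat resource reports health once Filebeat connects to Elasticsearch
kubectl get beat -n elastic-system
## One filebeat pod should be running on every node
kubectl get pods -n elastic-system | grep filebeat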
Verify that Filebeat is shipping logs
PASSWORD=$(kubectl get secret -n elastic-system quickstart-es-elastic-user -o go-template='{{.data.elastic | base64decode}}')
curl -u "elastic:$PASSWORD" -k "https://localhost:9200/filebeat-*/_search"
Deploy Kibana
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: kibana-data-pvc
  namespace: elastic-system
spec:
  accessModes:
  - ReadWriteOnce
  resources:
    requests:
      storage: 5Gi
  storageClassName: nfs-dynamic
---
apiVersion: kibana.k8s.elastic.co/v1
kind: Kibana
metadata:
  name: quickstart
  namespace: elastic-system
spec:
  version: 7.17.3
  count: 1
  elasticsearchRef:
    name: quickstart
    namespace: elastic-system
  http:
    tls:
      selfSignedCertificate:
        disabled: true
  config:
    i18n.locale: "zh-CN"       # Switch the UI to Chinese
  podTemplate:
    spec:
      containers:
      - name: kibana
        env:
        - name: NODE_OPTIONS
          value: "--max-old-space-size=2048"
        volumeMounts:
        - mountPath: /usr/share/kibana/data
          name: kibana-data
      volumes:
      - name: kibana-data
        persistentVolumeClaim:
          claimName: kibana-data-pvc
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: kibana-ingress
  namespace: elastic-system
spec:
  ingressClassName: nginx
  rules:
  - host: kibana.deployers.cn
    http:
      paths:
      - backend:
          service:
            name: quickstart-kb-http
            port:
              name: http
        path: /
        pathType: Prefix
  tls:
  - hosts:
    - kibana.deployers.cn
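Assuming the PVC, Kibana, and Ingress manifests above are saved as kibana.yaml (an arbitrary file name):

kubectl apply -f kibana.yaml
## HEALTH turns green once Kibana has connected to Elasticsearch
kubectl get kibana -n elastic-system
kubectl get ingress -n elastic-system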
Retrieve the password (the username is elastic)
## Get the password
kubectl get secret -n elastic-system quickstart-es-elastic-user -o=jsonpath='{.data.elastic}' | base64 --decode; echo
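If no Ingress controller is available, Kibana can also be reached through a port-forward; since TLS was disabled on the Kibana HTTP service above, plain HTTP works (quickstart-kb-http is the service ECK creates for this Kibana):

## Then open http://localhost:5601 and log in as elastic with the password above
kubectl port-forward -n elastic-system service/quickstart-kb-http 5601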
Cluster tests
Check cluster health
## Open a new terminal and run this command.
kubectl port-forward -n elastic-system services/quickstart-es-http 9200

## Get the password
PASSWORD=$(kubectl get secret -n elastic-system quickstart-es-elastic-user -o go-template='{{.data.elastic | base64decode}}')
## Check the status
curl -u "elastic:$PASSWORD" -k "https://localhost:9200/_cluster/health?pretty"
Expected output
{"cluster_name" : "quickstart","status" : "green","timed_out" : false,"number_of_nodes" : 2,"number_of_data_nodes" : 1,"active_primary_shards" : 14,"active_shards" : 14,"relocating_shards" : 0,"initializing_shards" : 0,"unassigned_shards" : 0,"delayed_unassigned_shards" : 0,"number_of_pending_tasks" : 0,"number_of_in_flight_fetch" : 0,"task_max_waiting_in_queue_millis" : 0,"active_shards_percent_as_number" : 100.0
}
View details of unassigned shards
- prirep: p indicates a primary shard, r a replica shard
- unassigned.reason: why the shard is unassigned (e.g. NODE_LEFT, INDEX_CREATED)
curl -u "elastic:$PASSWORD" -k "https://localhost:9200/_cat/shards?v&h=index,shard,prirep,state,unassigned.reason"
Output
index shard prirep state unassigned.reason
.async-search 0 p STARTED
.apm-agent-configuration 0 p STARTED
.apm-custom-link 0 p STARTED
.kibana-event-log-7.17.3-000001 0 p STARTED
.geoip_databases 0 p STARTED
.kibana_security_session_1 0 p STARTED
.ds-ilm-history-5-2025.05.09-000001 0 p STARTED
.kibana_task_manager_7.17.3_001 0 p STARTED
.security-7 0 p STARTED
.ds-.logs-deprecation.elasticsearch-default-2025.05.09-000001 0 p STARTED
product-other-log-2025.05.12 0 p STARTED
.tasks 0 p STARTED
.kibana_7.17.3_001 0 p STARTED
product-other-log-2025.05.09 0 p STARTED
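If any shard is listed as UNASSIGNED, the allocation explain API returns a detailed reason for one unassigned shard (it responds with an error when nothing is unassigned):

curl -u "elastic:$PASSWORD" -k "https://localhost:9200/_cluster/allocation/explain?pretty"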
Check node resource usage
curl -u "elastic:$PASSWORD" -k "https://localhost:9200/_cat/nodes?v&h=name,disk.used_percent,ram.percent,cpu"
Output
name disk.used_percent ram.percent cpu
quickstart-es-masters-0 1.73 54 16
quickstart-es-data-0 1.73 55 16
- Reference thresholds:
  - Disk usage ≤ 85%
  - Memory usage ≤ 80%
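Per-node disk usage can also be read from the _cat allocation API; the 85% figure above matches Elasticsearch's default low disk watermark, beyond which new shards are no longer allocated to the node:

## Shows shard count, disk used, disk available, and disk percent for each node
curl -u "elastic:$PASSWORD" -k "https://localhost:9200/_cat/allocation?v"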
Check Elasticsearch node status
curl -u "elastic:$PASSWORD" -k "https://localhost:9200/_cat/nodes?v"ip heap.percent ram.percent cpu load_1m load_5m load_15m node.role master name
172.20.129.129 8 55 16 0.96 0.76 0.57 m * quickstart-es-masters-0
172.20.129.130 70 56 16 0.96 0.76 0.57 dilt - quickstart-es-data-0