by Ritvik Khanna
How to use Elasticsearch, Logstash and Kibana to visualise logs in Python in realtime
What is logging?
Let’s say you are developing a software product. It works remotely, interacts with different devices, collects data from sensors and provides a service to the user. One day, something goes wrong and the system is not working as expected. It might not be identifying the devices or receiving any data from the sensors, or it might have just hit a runtime error due to a bug in the code. How can you know for sure?
Now, imagine if there are checkpoints in the system code where, if the system returns an unexpected result, it simply flags it and notifies the developer. This is the concept of logging.
Logging enables developers to understand what the code is actually doing and how the workflow behaves. A large part of software developers’ lives is monitoring, troubleshooting and debugging. Logging makes this a much easier and smoother process.
Visualisation of logs
Now, if you are an expert developer who has been developing and creating software for quite a while, then you might think that logging is not a big deal, as most of our code ends up sprinkled with Debug.Log('____') statements. Well, that is great, but there are some other aspects of logging we can make use of.
Visualisation of specific logged data has the following benefits:
- Monitor the operations of the system remotely.
- Communicate information clearly and efficiently via statistical graphics, plots and information graphics.
- Extract knowledge from the data visualised in the form of different graphs.
- Take necessary actions to better the system.
There are a number of ways we can visualise raw data, and there are libraries in the Python and R programming languages that can help in plotting graphs. You can learn more about them here. But in this post, I am not going to discuss the methods mentioned above. Have you ever heard of the ELK stack?
ELK stack
E — Elasticsearch, L — Logstash, K — Kibana
Let me give a brief introduction to it. The ELK stack is a collection of three open source tools that help provide realtime insights into data that can be either structured or unstructured. You can search and analyse data using these tools with great ease and efficiency.
Elasticsearch is a distributed, RESTful search and analytics engine capable of solving a growing number of use cases. As the heart of the Elastic Stack, it centrally stores your data so you can discover the expected and uncover the unexpected. Elasticsearch lets you perform and combine many types of searches — structured, unstructured, geo, metric etc. It is built on the Java programming language, which enables Elasticsearch to run on different platforms. It enables users to explore very large amounts of data at very high speed.
Logstash is an open source, server-side data processing pipeline that ingests data from a multitude of sources simultaneously, transforms it, and then sends it to your favourite “stash” (like Elasticsearch). Data is often scattered or siloed across many systems in many formats. Logstash supports a variety of inputs that pull in events from a multitude of common sources, all at the same time. Easily ingest from your logs, metrics, web applications, data stores, and various AWS services, all in continuous, streaming fashion. Logstash has a pluggable framework featuring over 200 plugins. Mix, match, and orchestrate different inputs, filters, and outputs to work in pipeline harmony.
Kibana is an open source analytics and visualisation platform designed to work with Elasticsearch. You use Kibana to search, view, and interact with data stored in Elasticsearch indices. You can easily perform advanced data analysis and visualise your data in a variety of charts, tables, and maps. Kibana makes it easy to understand large volumes of data. Its simple, browser-based interface enables you to quickly create and share dynamic dashboards that display changes to Elasticsearch queries in real time.
To get a better picture of the workflow of how the three softwares interact with each other, refer to the following diagram:
Implementation
Logging in Python
Here, I chose to explain the implementation of logging in Python because it is one of the most widely used languages for projects involving communication between multiple machines and Internet of Things devices. It’ll help give you an overall idea of how it works.
Python provides a logging system as a part of its standard library, so you can quickly add logging to your application.
import logging
In Python, logging can be done at 5 different levels that each indicate the type of event. They are as follows, in increasing order of severity:
Debug — Designates fine-grained informational events that are most useful to debug an application.
Info — Designates informational messages that highlight the progress of the application at a coarse-grained level.
Warning — Designates potentially harmful situations.
Error — Designates error events that might still allow the application to continue running.
Critical — Designates very severe error events that will presumably lead the application to abort.
Therefore, depending on the problem that needs to be logged, we use the appropriate level accordingly.
Note: Info and Debug do not get logged by default, since only logs of level Warning and above are logged.
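For instance, here is a minimal sketch of the five levels in action (the messages are made up for illustration):

import logging

# By default the root logger's threshold is WARNING, so the first two
# calls produce no output while the last three do.
logging.debug("fine-grained detail, useful while debugging")      # suppressed
logging.info("coarse-grained progress information")               # suppressed
logging.warning("a potentially harmful situation")                # shown
logging.error("an error the application can survive")             # shown
logging.critical("a failure that will presumably abort the run")  # shown

Lowering the threshold, for example by calling logging.basicConfig(level=logging.DEBUG) before any logging call, makes the finer-grained levels visible as well.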
Now, to give an example and create a set of log statements to visualise, I have created a Python script that logs statements of a specific format and a message.
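The exact script is not reproduced here, but a minimal sketch along these lines, assuming a “timestamp level message” line format, produces the same kind of file (the message text, loop count and sleep intervals are illustrative):

import logging
import random
import time

# Append records to logFile.txt as "timestamp level message" lines.
logging.basicConfig(
    filename="logFile.txt",
    filemode="a",
    format="%(asctime)s %(levelname)s %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S",
    level=logging.DEBUG,  # record all five levels, not just WARNING and above
)

# Choose a level at random on every iteration to simulate a live system.
log_calls = [logging.debug, logging.info, logging.warning,
             logging.error, logging.critical]

for _ in range(100):
    random.choice(log_calls)("A sample log entry for visualisation")
    time.sleep(random.uniform(0.1, 1.0))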
Here, the log statements will append to a file named logFile.txt in the specified format. I ran the script for three days at different time intervals, creating a file containing logs at random, like below:
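With the format sketched above, the resulting entries look roughly like this (timestamps and levels are illustrative):

2019-02-08 14:38:02 INFO A sample log entry for visualisation
2019-02-08 14:38:03 ERROR A sample log entry for visualisation
2019-02-08 14:38:05 WARNING A sample log entry for visualisation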
Setting up Elasticsearch, Logstash and Kibana
First, let’s download the three open source tools from their respective links: [elasticsearch], [logstash] and [kibana]. Unzip the files and put all three in the project folder.
Let’s get started.
Step 1 — Set up Kibana and Elasticsearch on the local system. We run Kibana with the following command in the bin folder of Kibana.
bin\kibana
Similarly, Elasticsearch is set up like this:
bin\elasticsearch
Now, in the two separate terminals we can see both of the modules running. In order to check that the services are running, open localhost:5601 (Kibana) and localhost:9200 (Elasticsearch).
After both services are running successfully, we use Logstash and Python programs to parse the raw log data and pipeline it to Elasticsearch, from which Kibana queries data.
Step 2 — Now let’s get on with Logstash. Before starting Logstash, a Logstash configuration file is created, in which the details of the input file, output location, and filter methods are specified.
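The configuration file from the original post is not reproduced here; a minimal sketch consistent with the log format above could look like the following, saved as logstash-simple.conf (the file path and index name are placeholders to adapt):

input {
  file {
    path => "/full/path/to/logFile.txt"    # placeholder: absolute path to the log file
    start_position => "beginning"
  }
}

filter {
  grok {
    # Split each "timestamp level message" line into separate fields.
    match => { "message" => "%{TIMESTAMP_ISO8601:timestamp} %{LOGLEVEL:log-level} %{GREEDYDATA:log-message}" }
  }
  date {
    # Use the parsed timestamp as the event's @timestamp.
    match => ["timestamp", "yyyy-MM-dd HH:mm:ss"]
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "index_name"                  # placeholder: the index Kibana will read from
  }
}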
This configuration file plays a major role in the ELK stack. Take a look at the filter{grok{…}} section. This is a Grok filter plugin. Grok is a great way to parse unstructured log data into something structured and queryable. This tool is perfect for syslog logs, Apache and other webserver logs, MySQL logs, and in general any log format that is written for humans rather than for computer consumption. The grok pattern in the configuration tells Logstash how to parse each line entry in our log file.
Now save the file in the Logstash folder and start the Logstash service.
bin\logstash -f logstash-simple.conf
In order to learn more about configuring Logstash, click [here].
Step 3 — After this, the parsed data from the log files will be available in Kibana management at localhost:5601 for creating different visuals and dashboards. To check whether Elasticsearch is receiving any data, open the following URL in a browser, or run the equivalent GET _cat/indices?v request in Kibana’s Dev Tools console:
localhost:9200/_cat/indices?v
This will display all the indexes, as sketched below. For every visualisation, a new index pattern has to be created, after which various visualisation techniques are used to create a dashboard.
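If the pipeline is working, the listing should include a row whose index column matches the name configured in the Logstash output, roughly of the form (values are illustrative):

health status index      uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   index_name 1a2b3c4d5e6f7g8h9i0jkl   1   1        150            0    100.5kb        100.5kb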
Dashboard Using Kibana
After setting up everything, now it’s time to create graphs in order to visualise the log data.
After opening the Kibana management homepage, we will be asked to create a new index pattern. Enter index_name* in the Index pattern field and select @timestamp in the Time Filter field name dropdown menu.
Now, to create graphs, we go to the Visualize tab.
Select a new visualisation, choose a type of graph and an index name, and create a graph according to your axis requirements. For example, we can create a histogram with the count on the y-axis and the log-level keyword or the timestamp on the x-axis.
After creating a few graphs, we can add all the required visualisations and create a Dashboard, like below:
Note — As long as the three services are running, whenever the logs in the log file get updated or new entries are appended, the data in Elasticsearch and the graphs in Kibana will automatically update according to the new data.
Wrapping up
Logging can be an aid in fighting errors and debugging programs, and is a better alternative to print statements. The logging module divides messages according to different levels. This results in a better understanding of the code and of how the call flow goes, without interrupting the program.
The visualisation of data is a necessary step in situations where a huge amount of data is generated every single moment. Data visualisation tools and techniques offer executives and other knowledge workers new approaches to dramatically improve their ability to grasp the information hiding in their data. Rapid identification of error logs, easy comprehension of data and highly customisable data visuals are some of the advantages. It is one of the most constructive ways of organising raw data.
For further reference, you can consult the official ELK documentation here — https://www.elastic.co/learn — and the documentation on logging in Python — https://docs.python.org/2/library/logging.html
Translated from: https://www.freecodecamp.org/news/how-to-use-elasticsearch-logstash-and-kibana-to-visualise-logs-in-python-in-realtime-acaab281c9de/