數據庫語言 數據查詢
Working with data is becoming an increasingly important skill in the modern workplace.
在現代工作場所中,處理數據已成為越來越重要的技能。
Data is no longer the domain of analysts and software engineers. With today's technology, anyone can work with data to analyse trends and inform their decision making.
數據不再是分析師和軟件工程師的領域。 借助當今的技術,任何人都可以使用數據來分析趨勢并為決策提供依據。
A fundamental concept when working with data is 'querying' a data set. This is to literally ask questions about a set of data. A query language is a software language that provides a syntax for asking such questions.
處理數據時的基本概念是“查詢”數據集。 這是從字面上詢問有關一組數據的問題。 查詢語言是一種軟件語言,提供用于詢問此類問題的語法。
If you don't have any experience writing queries, they can appear a little intimidating. However, with a little practice, you can master the basics.
如果您沒有編寫查詢的經驗,它們可能會顯得有些嚇人。 但是,只需進行一些練習,即可掌握基礎知識。
Here's how you can get started in Google Sheets.
這是您開始使用Google表格的方法 。
Google可視化API查詢語言 (Google Visualization API Query Language)
You may already be using Google Sheets for much of your day-to-day work. Perhaps you are familiar with using it to generate charts and graphs.
您可能已經在大部分日常工作中使用Google表格。 也許您熟悉使用它來生成圖表。
The Google Visualization API Query Language is the magic that works behind the scenes to make this possible.
Google Visualization API查詢語言是在幕后起作用的魔力,使之成為可能。
But did you know you can access this language through the QUERY()
function? It can be a powerful tool for working with large sheets of data.
但是您知道可以通過QUERY()
函數訪問此語言嗎? 它是處理大量數據的強大工具。
There are a lot of similarities between the query language and SQL.
查詢語言和SQL之間有很多相似之處。
In both cases, you define a data set of columns and rows, and choose different columns and rows by specifying various criteria and conditions.
在這兩種情況下,您都將定義列和行的數據集,并通過指定各種條件和條件來選擇不同的列和行。
In this article, the example data will come from a large CSV file containing international football results between 1872 and 2019. You can download the data from Kaggle.
在本文中,示例數據將來自一個大型CSV文件,其中包含1872年至2019年之間的國際足球比賽結果。您可以從Kaggle下載數據 。
In a new Google Sheet, upload the CSV file. You can select all the data with Ctrl+A (or Cmd+A on Mac).
在新的Google表格中,上傳CSV文件。 您可以使用Ctrl + A(在Mac上為Cmd + A)選擇所有數據。
From the menu ribbon, choose Data > Named ranges... and call the range selected something like 'data'. This will make it easier to work with.
從功能區菜單中,選擇“數據”>“命名范圍...”,然后將所選范圍稱為“數據”。 這將使其更易于使用。
Now, you are ready to start querying the data. Create a new tab in the spreadsheet, and in cell A1, create a new QUERY()
formula.
現在,您可以開始查詢數據了。 在電子表格中創建一個新選項卡,并在單元格A1中創建一個新的QUERY()
公式。
獲取所有英格蘭比賽 (Get all England matches)
This first query finds all the rows in the data set where England are either the home team or the away team.
第一個查詢查找數據集中所有英格蘭為主隊或客隊的行。
The QUERY()
formula takes at least two arguments. The first is the named range, which will be the data set queried. The second is a string that contains the actual query.
QUERY()
公式至少接受兩個參數。 第一個是命名范圍,它將是查詢的數據集。 第二個是包含實際查詢的字符串。
=QUERY(data, "SELECT * WHERE B = 'England' OR C = 'England'")
Let's break this down.
讓我們分解一下。
SELECT *
asks to return all columns in the data set. If you only wanted columns A, B and C, you would write SELECT A, B, C
.
SELECT *
要求返回數據集中的所有列。 如果只需要A,B和C列,則可以編寫SELECT A, B, C
。
Next, you include a filter to find only rows where column B or column C contain the team 'England'
. Make sure to use single-quotes for strings inside the query. Double-quotes are used to open and close the query itself.
接下來,包括一個過濾器,以僅查找B列或C列包含團隊'England'
。 確保查詢中的字符串使用單引號。 雙引號用于打開和關閉查詢本身。
This formula returns all the rows where England have played. If you want to search for another team, simply change the condition in the filter.
此公式將返回英格蘭已打過的所有行。 如果要搜索另一個團隊,只需在過濾器中更改條件。
計算所有友誼賽 (Count all friendly matches)
Next, let's count how many friendly matches are in the data set.
接下來,讓我們計算一下數據集中有多少個友好匹配項。
=QUERY(data, "SELECT COUNT(A) WHERE F = 'Friendly'")
This makes use of the Query Language's COUNT()
function. This is an example of an aggregate function. Aggregate functions summarise many rows into one.
這利用了查詢語言的COUNT()
函數。 這是聚合函數的示例。 聚合函數將許多行匯總為一。
For example, in this data set there are 16,716 rows where column F equals 'Friendly'
. Instead of returning all these rows, the query returns a single row - which counts them instead.
例如,在此數據集中,有16,716行,其中列F等于'Friendly'
。 查詢不返回所有這些行,而是返回單個行-對其進行計數。
Other examples of aggregate functions include MAX()
, MIN()
and AVG()
. Instead of returning all the rows matching the query, it finds their maximum, minimum and average values instead.
聚合函數的其他示例包括MAX()
, MIN()
和AVG()
。 而不是返回與查詢匹配的所有行,而是查找它們的最大值,最小值和平均值。
按比賽分組 (Group by tournament)
Aggregate functions can do more if you use a GROUP BY
statement alongside them. This query finds out how many matches have been played by each tournament type.
如果在匯總函數旁邊使用GROUP BY
語句,則匯總函數可以做更多事情。 該查詢找出每種錦標賽類型進行了多少場比賽。
=QUERY(data, "SELECT F, COUNT(A) GROUP BY F")
This query groups the data set by each of the values in column F. It then counts how many rows there are in each group.
該查詢按F列中的每個值對數據集進行分組。然后它計算每個組中有多少行。
You can use GROUP BY
on more than one column. For example, to find how many matches have been played in each country by tournament, use the query below:
您可以在多個列上使用GROUP BY
。 例如,要查找每個國家在錦標賽中進行了多少場比賽,請使用以下查詢:
=QUERY(data, "SELECT H, F, COUNT(A) GROUP BY H, F")
Let's try some more advanced filtering.
讓我們嘗試一些更高級的過濾。
獲取所有英格蘭vs德國的比賽 (Get all England vs Germany matches)
You can specify more complex filter logic using the AND
and OR
keywords. For readability, it can help to use brackets around each part of the filter.
您可以使用AND
和OR
關鍵字指定更復雜的過濾器邏輯。 為了便于閱讀,可以在過濾器的每個部分周圍使用方括號。
For example, to find all the matches between England and Germany:
例如,要查找英格蘭和德國之間的所有比賽:
=QUERY(data, "SELECT * WHERE (B = 'England' AND C = 'Germany') OR (C = 'England' AND B ='Germany')")
This filter has two criteria - one where England are the home team and Germany are away, and the other vice versa.
此篩選條件有兩個條件-一個條件是英格蘭是主隊,而德國則不在,另一個則相反。
Using data validation makes it easy to pick any two teams in the data set.
使用數據驗證可輕松選擇數據集中的任何兩個團隊。
Then, you can write a query that uses the values of different cells in its filter. Remember to use single-quotes for identifying strings within the query, and double-quotes to open and close different pieces of the query.
然后,您可以編寫一個查詢,該查詢使用其過濾器中不同單元格的值。 請記住使用單引號標識查詢中的字符串,并使用雙引號打開和關閉查詢的不同部分。
=QUERY(data, "SELECT * WHERE (B = '"&B1&"' AND C = '"&B2&"') OR (C = '"&B1&"' AND B ='"&B2&"')")
尋找趨勢 (Looking for trends)
Aggregate functions and filters make powerful tools when used in combination. Once you are comfortable with how they work, you can start searching for all kinds of interesting trends in your data set.
聚合函數和過濾器結合使用時將成為強大的工具。 一旦熟悉了它們的工作方式,就可以開始在數據集中搜索各種有趣的趨勢。
For example, the query below finds the average goals per game, by each year since 1900.
例如,下面的查詢查找自1900年以來每年的每場比賽平均目標。
=QUERY(data, "SELECT YEAR(A), AVG(D) + AVG(E) WHERE YEAR(A) >= 1900 GROUP BY YEAR(A)")
If you plot the query result as a line graph, you can immediately start seeing trends over time.
如果將查詢結果繪制為折線圖,則可以立即開始查看一段時間內的趨勢。
排序結果 (Ordering the results)
Sometimes, you are not interested in finding all the matching rows in a data set. Often, you will want to sort them according to some criteria. Perhaps you only wish to find the top ten records.
有時,您對查找數據集中的所有匹配行都不感興趣。 通常,您將需要根據一些條件對它們進行排序。 也許您只希望找到前十個記錄。
This query finds the top ten highest scoring matches in the data set.
此查詢查找數據集中得分最高的十個匹配項。
=QUERY(data, "SELECT * ORDER BY (D+E) DESC LIMIT 10")
Notice the ORDER BY
statement. This sorts the rows according to the columns specified. ?Here, the query sorts the output by the number of goals scored in the game.
注意ORDER BY
語句。 這將根據指定的列對行進行排序。 在此,查詢按游戲中得分的目標數對輸出進行排序。
The DESC
keyword indicates to sort in descending order (the ASC
keyword would have sorted them in ascending order).
DESC
關鍵字指示按降序排序( ASC
關鍵字將按升序對它們排序)。
Finally, the LIMIT
keyword restricts the output to a given number of rows (in this case, ten).
最后, LIMIT
關鍵字將輸出限制為給定的行數(在本例中為10行)。
Looks like there have been some pretty one-sided games in Oceania!
好像在大洋洲有一些漂亮的單面游戲!
哪些城市舉辦了最多的世界杯比賽? (Which cities have hosted the most World Cup matches?)
And now for one final example to bring everything together and get your imagination going.
現在,作為最后一個例子,將所有內容整合在一起,激發您的想象力。
This query finds the top ten cities that have hosted the most FIFA World Cup matches.
此查詢查找舉辦最多FIFA世界杯比賽的前十名城市。
=QUERY(data, "SELECT G, COUNT(A) WHERE F = 'FIFA World Cup' GROUP BY G ORDER BY COUNT(A) DESC LIMIT 10")
現在輪到你了 (Now it's your turn)
Hopefully you have found this article useful. If you are comfortable with the logic in each example, then you are ready to try out real SQL.
希望您發現本文有用。 如果您熟悉每個示例中的邏輯,那么您就可以嘗試使用實際SQL。
This will introduce concepts such as JOINS, nested queries and WINDOW functions. When you master these, your power to manipulate data will go through the roof.
這將介紹諸如JOINS,嵌套查詢和WINDOW函數之類的概念。 當您掌握了這些內容后,您處理數據的能力就會大打折扣。
There are a number of places to start with learning SQL. Try out the interactive examples at w3schools!
有很多地方可以開始學習SQL。 在w3schools上嘗試互動示例 !
翻譯自: https://www.freecodecamp.org/news/start-querying-data-with-google-query-language/
數據庫語言 數據查詢