Window Function 窗口函數
- Perform calculations on an already generated result set ( a window).(在已生成的結果集上執行計算)
- Aggregate calculation(without having to group your data)(允許使用聚合函數時不用進行GROUP BY分組)
- Similar to subqueries in SELECT.
- Running totals, rankings, and moving averages, etc.(可計算累加值,排序,移動平均值等)
- Processed after every part of query except ORDER BY.(執行順序在其他各部分之后,但在ORDER BY 之前)
- Uses information in result set rather than database.
- Available in PostgreSQL, Oracle, MySQL, SQLServer, and SQLite.
窗口函數是 SQL 中一類特別的函數。和聚合函數相似,窗口函數的輸入也是多行記錄。不同的是,聚合函數的作用于由 GROUP BY 子句聚合的組,而窗口函數則作用于一個窗口, 這里,窗口是由一個 OVER 子句 定義的多行記錄。
聚合函數對其所作用的每一組記錄輸出一條結果,而窗口函數對其所作用的窗口中的每一行記錄輸出一條結果。
語法
FUNCTION(value) OVER ([PARTITION BY field] [ORDER BY field])注:[]中的內容可省略,根據實際情況選擇使用。
PARTITION BY = range of calculation根據指定(1個或多個)字段進行分區,類似GROUP BY
ORDER BY = order of rows when running calculation 根據指定字段進行排序
常用函數
- 專用窗口函數
ROW_NUMBER() :從1開始,返回每組內部排序后的順序編號(組內連續的唯一的)
RANK():計算排序,如果存在相同位次的記錄,為相同的值分配相同的數字,但會跳過之后的位次。
DENSE_RANK():同樣是計算排序,即使存在相同位次的記錄,也不會跳過之后的位次。
如:
SELEECT goals,RANK() OVER(ORDER BY goals DESC) AS goals_rank,DENSE_RANK() OVER(ORDER BY goals DESC) AS goals_dense_rank,ROW_NUMBER() OVER(ORDER BY goals DESC) AS row_number
FROM grade
ORDER BY goals DESC;結果如下:goals goals_rank goals_dense_rank row_number
10 1 1 1
10 1 1 2
9 3 2 3
9 3 2 4
7 5 3 5
LAG(column, n):returns column's value at the row n rows before the current row. 返回當前行之前第n行的值(n省略時默認為1,表示返回當前行前1行的值)。
LEAD(column, n) : returns column's value at the row n row after the current row. 返回當前行之后第n行的值。
FIRST_VALUE(column):return the first value in the table or partition. 返回表中或分區中第一個值。
LAST_VALUE(column):return the last value in the table or partition. 返回表中或分區中最后一個值。
NTILE(n):splits data into n approximately equal pages. 將數據分為近乎相等的n等份。(暫時用的場景不多,以后再補充)
- 聚合函數: SUM, AVG, COUNT, MAX, MIN 也可以用于窗口函數。
分區示例


圖1中 AVG(home_goal + away_goal) OVER() AS overall_avg,未使用PARTITION BY進行分區,所以計算的是總體的平均值。
圖2中 AVG(homegoal + awaygoal) OVER(PARTITION BY season) AS season_avg,對season (表中的一個字段)進行分區,再計算分區內的平均值。

PARTITION BY 允許針對1列或多列進行分區,圖3 中同時根據m.season和c.name 進行分組后在計算分組內的平均值。所以,第一行和第三行的 season_ctry_avg值相同。
Sliding Window 滑動窗口
In addition to calculating aggregate and rank information, window functions can also be used to calculate information that changes with each subsequent row in a data set. These types of window functions are called sliding windows.
除了計算匯總、聚合和排序等,窗口函數還可以用于計算隨數據集中的每個后續行而變化的信息。這類窗口功能稱為滑動窗口。
Sliding windows are functions that perform calculations relative to the current row of a data set. 滑動窗口是執行相對于數據集當前行的計算的功能。
You can use sliding windows to calculate a wide variety of information that aggregates one row at a time down your data set -- running totals, sums, counts, and averages in any order you need.
A sliding window calculation can also be partitioned by one or more column just like a non-sliding window.
滑動窗口 關鍵字(加在OVER從句中)
ROWS BETWEEN <start> AND <finish>
可用于start 和finish 的關鍵字有:
- PRECEDING : n PRECEDING means n rows before the current row 當前行的之前第n行
- FOLLOWING : n FOLLOWING means n row after the current row 當前行之后的第n行
- UNBOUNDED PRECEDING : every row since the beginning of the data set 數據集的開始
- UNBOUNDED FOLLOWING : every row to the end of the data set 數據集的末尾
- CURRENT ROW : tells SQL that you want to stop your calculation at the current row 當前行
示例


靈活運用窗口函數,可以對原始數據進行更為復雜的運算和分組,可以從不同角度看待數據,并從中發現更深層次的規律和結論。
你的點贊是我持續更新的動力~ 謝謝 Thanks?(・ω・)ノ
其他SQL學習筆記 友情鏈接:
JessieY:SQL學習筆記 - CTE通用表表達式和WITH用法?zhuanlan.zhihu.com