SQL的窗口函數
文章目錄 SQL的窗口函數 1. 介紹 2. 聚合函數 0.數據準備 1. AVG 2. COUNT 3. MAX 4. MIN 5. 標準差 6. SUM 3. 排序函數 1. CUME_DIST 2. RANK, DENSE_RANK, ROW_NUMBER 3. PERCENT_RANK 4. NTILE 4. 值函數(偏移函數) 1. FIRST_VALUE 2. LAST_VALUE 3. LAG 4. LEAD 5. NTH_VALUE END
1. 介紹
窗口函數分類:聚合函數、排序函數和值函數(偏移函數)。 SQL中的窗口函數是一種特殊的函數,它能夠在查詢結果中創建一個窗口或者窗口集合,然后對這個窗口進行計算。窗口函數可以用于分析和匯總數據,同時保留原始查詢結果的行順序。 窗口函數基于查詢結果的行數據進行計算,窗口函數運行在HAVING子句之后、 ORDER BY子句之前。窗口函數需要特殊的關鍵字OVER子句來指定窗口即觸發一個窗口函數。
2. 聚合函數
AVG:該函數用于計算平均值。
COUNT:該函數用于計算記錄數。
MAX:該函數用于計算最大值。
MIN:該函數用于計算最小值。
STDDEV:返回數值的總體標準差。
STDDEV_SAMP:返回一組數值(整數、小數或浮點)的樣本標準差。
SUM:該函數用于計算匯總值。
0.數據準備
WITH t_sales_info AS ( SELECT * FROM ( VALUES ( 'A' , '2023-06-01' , '2023' , '1' , 36.0 ) , ( 'A' , '2023-09-01' , '2023' , '2' , 48.0 ) , ( 'B' , '2023-09-07' , '2023' , '2' , 48.0 ) , ( 'C' , '2023-10-10' , '2023' , '3' , 23.0 ) , ( 'A' , '2023-11-11' , '2023' , '3' , 23.0 ) , ( 'C' , '2023-12-11' , '2023' , '3' , 23.0 ) , ( 'A' , '2024-01-01' , '2024' , '2' , 48.0 ) , ( 'B' , '2024-02-02' , '2024' , '2' , 48.0 ) , ( 'C' , '2024-02-04' , '2024' , '1' , 36.0 ) , ( 'C' , '2024-02-11' , '2024' , '1' , 36.0 ) , ( 'B' , '2024-03-16' , '2024' , '3' , 23.0 ) , ( 'B' , '2024-04-01' , '2024' , '3' , 23.0 ) , ( 'A' , '2024-04-14' , '2024' , '3' , 23.0 ) , ( 'C' , '2024-04-21' , '2024' , '3' , 23.0 ) , ( 'B' , '2024-05-07' , '2024' , '3' , 23.0 ) ) AS tbl_name( customer_id, order_date, year , product_id, product_price)
)
1. AVG
該函數用于計算平均值。
輸入值類型:BIGINT、DOUBLE或FLOAT。
返回值類型:DOUBLE。
SELECT customer_id, order_date, ` year` , product_id, product_price, AVG ( product_price) OVER ( PARTITION BY customer_id ORDER BY customer_id) AS price_avg
FROM t_sales_info ;
customer_id order_date year product_id product_price price_avg A 2024-04-14 2024 3 23.0 35.6 A 2024-01-01 2024 2 48.0 35.6 A 2023-09-01 2023 2 48.0 35.6 A 2023-06-01 2023 1 36.0 35.6 A 2023-11-11 2023 3 23.0 35.6 B 2024-05-07 2024 3 23.0 33.0 B 2023-09-07 2023 2 48.0 33.0 B 2024-02-02 2024 2 48.0 33.0 B 2024-03-16 2024 3 23.0 33.0 B 2024-04-01 2024 3 23.0 33.0 C 2024-04-21 2024 3 23.0 28.2 C 2023-10-10 2023 3 23.0 28.2 C 2023-12-11 2023 3 23.0 28.2 C 2024-02-04 2024 1 36.0 28.2 C 2024-02-11 2024 1 36.0 28.2
SELECT customer_id, order_date, ` year` , product_id, product_price, AVG ( product_price) OVER ( PARTITION BY customer_id ORDER BY ` year` ) AS price_avg
FROM t_sales_info ;
customer_id order_date year product_id product_price price_avg A 2023-09-01 2023 2 48.0 35.666666666666664 A 2023-06-01 2023 1 36.0 35.666666666666664 A 2023-11-11 2023 3 23.0 35.666666666666664 A 2024-01-01 2024 2 48.0 35.6 A 2024-04-14 2024 3 23.0 35.6 B 2023-09-07 2023 2 48.0 48.0 B 2024-05-07 2024 3 23.0 33.0 B 2024-02-02 2024 2 48.0 33.0 B 2024-03-16 2024 3 23.0 33.0 B 2024-04-01 2024 3 23.0 33.0 C 2023-10-10 2023 3 23.0 23.0 C 2023-12-11 2023 3 23.0 23.0 C 2024-04-21 2024 3 23.0 28.2 C 2024-02-04 2024 1 36.0 28.2 C 2024-02-11 2024 1 36.0 28.2
2. COUNT
該函數用于計算記錄數。
輸入值類型:數值、字符串類型或BOOLEAN類型。
返回值類型:BIGINT。
SELECT customer_id, order_date, ` year` , product_id, product_price, COUNT ( product_id) OVER ( PARTITION BY customer_id, ` year` ORDER BY ` year` ) AS product_ct
FROM t_sales_info ;
customer_id order_date year product_id product_price product_ct A 2023-09-01 2023 2 48.0 3 A 2023-06-01 2023 1 36.0 3 A 2023-11-11 2023 3 23.0 3 A 2024-01-01 2024 2 48.0 2 A 2024-04-14 2024 3 23.0 2 B 2023-09-07 2023 2 48.0 1 B 2024-05-07 2024 3 23.0 4 B 2024-02-02 2024 2 48.0 4 B 2024-03-16 2024 3 23.0 4 B 2024-04-01 2024 3 23.0 4 C 2023-10-10 2023 3 23.0 2 C 2023-12-11 2023 3 23.0 2 C 2024-04-21 2024 3 23.0 3 C 2024-02-04 2024 1 36.0 3 C 2024-02-11 2024 1 36.0 3
3. MAX
該函數用于計算最大值。
輸入值類型:該函數支持輸入任意類型的參數,但是BOOLEAN類型的數據不允許參與運算。
返回值類型:與該函數的輸入值類型保持一致。注意:當列中的值為NULL時,該行不參與計算。
SELECT customer_id, order_date, ` year` , product_id, product_price, MAX ( product_price) OVER ( PARTITION BY customer_id ORDER BY customer_id) AS product_ct
FROM t_sales_info ;
customer_id order_date year product_id product_price product_ct A 2024-04-14 2024 3 23.0 48.0 A 2024-01-01 2024 2 48.0 48.0 A 2023-09-01 2023 2 48.0 48.0 A 2023-06-01 2023 1 36.0 48.0 A 2023-11-11 2023 3 23.0 48.0 B 2024-05-07 2024 3 23.0 48.0 B 2023-09-07 2023 2 48.0 48.0 B 2024-02-02 2024 2 48.0 48.0 B 2024-03-16 2024 3 23.0 48.0 B 2024-04-01 2024 3 23.0 48.0 C 2024-04-21 2024 3 23.0 36.0 C 2023-10-10 2023 3 23.0 36.0 C 2023-12-11 2023 3 23.0 36.0 C 2024-02-04 2024 1 36.0 36.0 C 2024-02-11 2024 1 36.0 36.0
4. MIN
該函數用于計算最小值。
輸入值類型:該函數支持輸入任意類型的參數,但是BOOLEAN類型的數據不允許參與運算。
返回值類型:與該函數的輸入值類型保持一致。注意:當列中的值為NULL時,該行不參與計算。
SELECT customer_id, order_date, ` year` , product_id, product_price, MIN ( product_price) OVER ( PARTITION BY customer_id ORDER BY customer_id) AS product_ct
FROM t_sales_info ;
customer_id order_date year product_id product_price product_ct A 2024-04-14 2024 3 23.0 23.0 A 2024-01-01 2024 2 48.0 23.0 A 2023-09-01 2023 2 48.0 23.0 A 2023-06-01 2023 1 36.0 23.0 A 2023-11-11 2023 3 23.0 23.0 B 2024-05-07 2024 3 23.0 23.0 B 2023-09-07 2023 2 48.0 23.0 B 2024-02-02 2024 2 48.0 23.0 B 2024-03-16 2024 3 23.0 23.0 B 2024-04-01 2024 3 23.0 23.0 C 2024-04-21 2024 3 23.0 23.0 C 2023-10-10 2023 3 23.0 23.0 C 2023-12-11 2023 3 23.0 23.0 C 2024-02-04 2024 1 36.0 23.0 C 2024-02-11 2024 1 36.0 23.0
5. 標準差
STDDEV:返回數值的總體標準差。
STDDEV_SAMP:返回一組數值(整數、小數或浮點)的樣本標準差。
樣本標準差= 方差的算術平方根= s= sqrt(( ( x1- x) ^ 2 + ( x2- x) ^ 2 + ......( xn- x) ^ 2 ) / ( n- 1 ))
總體標準差= σ= sqrt(( ( x1-x) ^2 +( x2-x) ^2 +.. .. .. ( xn-x) ^2) /n )
SELECT customer_id, order_date, ` year` , product_id, product_price, STDDEV( product_price) OVER ( PARTITION BY customer_id ORDER BY customer_id) AS price_std, STDDEV_SAMP( product_price) OVER ( PARTITION BY customer_id ORDER BY customer_id) AS price_std_samp
FROM t_sales_info ;
customer_id order_date year product_id product_price price_std price_std_samp A 2024-04-14 2024 3 23.0 11.18212859879549 12.501999840025595 A 2024-01-01 2024 2 48.0 11.18212859879549 12.501999840025595 A 2023-09-01 2023 2 48.0 11.18212859879549 12.501999840025595 A 2023-06-01 2023 1 36.0 11.18212859879549 12.501999840025595 A 2023-11-11 2023 3 23.0 11.18212859879549 12.501999840025595 B 2024-05-07 2024 3 23.0 12.24744871391589 13.693063937629153 B 2023-09-07 2023 2 48.0 12.24744871391589 13.693063937629153 B 2024-02-02 2024 2 48.0 12.24744871391589 13.693063937629153 B 2024-03-16 2024 3 23.0 12.24744871391589 13.693063937629153 B 2024-04-01 2024 3 23.0 12.24744871391589 13.693063937629153 C 2024-04-21 2024 3 23.0 6.368673331236264 7.12039324756716 C 2023-10-10 2023 3 23.0 6.368673331236264 7.12039324756716 C 2023-12-11 2023 3 23.0 6.368673331236264 7.12039324756716 C 2024-02-04 2024 1 36.0 6.368673331236264 7.12039324756716 C 2024-02-11 2024 1 36.0 6.368673331236264 7.12039324756716
6. SUM
該函數用于計算匯總值。
輸入值類型:BIGINT、DOUBLE或FLOAT。
返回值類型:BIGINT。
SELECT customer_id, order_date, ` year` , product_id, product_price, SUM ( product_price) OVER ( PARTITION BY customer_id, ` year` ORDER BY ` year` ) AS price_sum
FROM t_sales_info ;
customer_id order_date year product_id product_price price_sum A 2023-09-01 2023 2 48.0 107.0 A 2023-06-01 2023 1 36.0 107.0 A 2023-11-11 2023 3 23.0 107.0 A 2024-01-01 2024 2 48.0 71.0 A 2024-04-14 2024 3 23.0 71.0 B 2023-09-07 2023 2 48.0 48.0 B 2024-05-07 2024 3 23.0 117.0 B 2024-02-02 2024 2 48.0 117.0 B 2024-03-16 2024 3 23.0 117.0 B 2024-04-01 2024 3 23.0 117.0 C 2023-10-10 2023 3 23.0 46.0 C 2023-12-11 2023 3 23.0 46.0 C 2024-04-21 2024 3 23.0 95.0 C 2024-02-04 2024 1 36.0 95.0 C 2024-02-11 2024 1 36.0 95.0
3. 排序函數
CUME_DIST:返回一組數值中每個值的累計分布。
RANK:返回數據集中每個值的排名。
DENSE_RANK:返回一組數值中每個數值的排名。
ROW_NUMBER:根據行在窗口分區內的順序,為每行數據返回一個唯一的有序行號,行號從1開始。
PERCENT_RANK:返回數據集中每個數據的排名百分比,其結果由( r - 1 ) / ( n - 1 ) 計算得出。其中r為RANK( ) 計算的當前行排名, n為當前窗口分區內總的行數。
NTILE:將每個窗口分區的數據分散到桶號從1到n的n個桶中。
語法
function over ( partition by a order by b RANGE| ROWS BETWEEN start AND end )
分區規范:用于將輸入行分散到不同的分區中,過程和GROUP BY子句的分散過程相似。
排序規范:決定輸入數據行在窗口函數中執行的順序。
窗口區間:指定計算數據的窗口邊界。
窗口區間支持RANGE、ROWS兩種模式:RANGE按照計算列值的范圍進行定義。ROWS按照計算列的行數進行范圍定義。RANGE、ROWS中可以使用BETWEEN start AND end指定邊界可取值。BETWEEN start AND end取值為:CURRENT ROW,當前行。N PRECEDING,前n行。UNBOUNDED PRECEDING,直到第1行。N FOLLOWING,后n行。UNBOUNDED FOLLOWING,直到最后1行。
1. CUME_DIST
返回一組數值中每個值的累計分布。
返回結果:在窗口分區中對窗口進行排序后的數據集,包括當前行和當前行之前的數據行數。排序中任何關聯值均會計算成相同的分布值。
返回值類型:DOUBLE。
SELECT customer_id, order_date, ` year` , product_id, product_price, CUME_DIST( ) OVER ( PARTITION BY customer_id ORDER BY order_date) AS cume_dist
FROM t_sales_info ;
customer_id order_date year product_id product_price cume_dist A 2023-06-01 2023 1 36.0 0.2 A 2023-09-01 2023 2 48.0 0.4 A 2023-11-11 2023 3 23.0 0.6 A 2024-01-01 2024 2 48.0 0.8 A 2024-04-14 2024 3 23.0 1.0 B 2023-09-07 2023 2 48.0 0.2 B 2024-02-02 2024 2 48.0 0.4 B 2024-03-16 2024 3 23.0 0.6 B 2024-04-01 2024 3 23.0 0.8 B 2024-05-07 2024 3 23.0 1.0 C 2023-10-10 2023 3 23.0 0.2 C 2023-12-11 2023 3 23.0 0.4 C 2024-02-04 2024 1 36.0 0.6 C 2024-02-11 2024 1 36.0 0.8 C 2024-04-21 2024 3 23.0 1.0
2. RANK, DENSE_RANK, ROW_NUMBER
返回數據集中每個值的排名: RANK( ) 會跳過重復值; DENSE_RANK( ) 不會跳過重復值序號; ROW_NUMBER( ) 不包含重復值, 相同名次按順序排列;
排名值是將當前行之前的行數加1,不包含當前行。因此,排序的關聯值可能產生順序上的空隙,而且這個排名會對每個窗口分區進行計算。
DENSE_RANK( ) 與RANK( ) 功能相似,但是DENSE_RANK( ) 關聯值不會產生順序上的空隙。
返回值類型:BIGINT。
SELECT customer_id, product_price, RANK( ) OVER ( PARTITION BY customer_id ORDER BY product_price) AS rk , DENSE_RANK( ) OVER ( PARTITION BY customer_id ORDER BY product_price) AS d_rk , ROW_NUMBER( ) OVER ( PARTITION BY customer_id ORDER BY product_price) AS rn
FROM t_sales_info ;
customer_id product_price rk d_rk rn A 23.0 1 1 1 A 23.0 1 1 2 A 36.0 3 2 3 A 48.0 4 3 4 A 48.0 4 3 5 B 23.0 1 1 1 B 23.0 1 1 2 B 23.0 1 1 3 B 48.0 4 2 4 B 48.0 4 2 5 C 23.0 1 1 1 C 23.0 1 1 2 C 23.0 1 1 3 C 36.0 4 2 4 C 36.0 4 2 5
3. PERCENT_RANK
命令說明:返回數據集中每個數據的排名百分比,其結果由( r - 1 ) / ( n - 1 ) 計算得出。其中,r為RANK( ) 計算的當前行排名, n為當前窗口分區內總的行數。
返回值類型:DOUBLE。
SELECT customer_id, order_date, ` year` , product_id, product_price, RANK( ) OVER ( PARTITION BY customer_id ORDER BY product_price) AS rk , PERCENT_RANK( ) OVER ( PARTITION BY customer_id ORDER BY product_price) AS pr
FROM t_sales_info ;
customer_id order_date year product_id product_price rk pr A 2024-04-14 2024 3 23.0 1 0.0 A 2023-11-11 2023 3 23.0 1 0.0 A 2023-06-01 2023 1 36.0 3 0.5 A 2024-01-01 2024 2 48.0 4 0.75 A 2023-09-01 2023 2 48.0 4 0.75 B 2024-05-07 2024 3 23.0 1 0.0 B 2024-03-16 2024 3 23.0 1 0.0 B 2024-04-01 2024 3 23.0 1 0.0 B 2023-09-07 2023 2 48.0 4 0.75 B 2024-02-02 2024 2 48.0 4 0.75 C 2024-04-21 2024 3 23.0 1 0.0 C 2023-10-10 2023 3 23.0 1 0.0 C 2023-12-11 2023 3 23.0 1 0.0 C 2024-02-04 2024 1 36.0 4 0.75 C 2024-02-11 2024 1 36.0 4 0.75
4. NTILE
命令說明:將每個窗口分區的數據分散到桶號從1到n的n個桶中。桶號值最多間隔1,如果窗口分區中的數據行數不能均勻地分散到每一個桶中,則剩余值將從第1個桶開始,每1個桶分1行數據。例如,有6行數據和4個桶, 最終桶號值為1 1 2 2 3 4 。返回值類型:BIGINT。
SELECT customer_id, order_date, ` year` , product_id, product_price, NTILE( 4 ) OVER ( PARTITION BY ` year` ORDER BY order_date) AS rk
FROM t_sales_info ;
customer_id order_date year product_id product_price rk A 2023-06-01 2023 1 36.0 1 A 2023-09-01 2023 2 48.0 1 B 2023-09-07 2023 2 48.0 2 C 2023-10-10 2023 3 23.0 2 A 2023-11-11 2023 3 23.0 3 C 2023-12-11 2023 3 23.0 4 A 2024-01-01 2024 2 48.0 1 B 2024-02-02 2024 2 48.0 1 C 2024-02-04 2024 1 36.0 1 C 2024-02-11 2024 1 36.0 2 B 2024-03-16 2024 3 23.0 2 B 2024-04-01 2024 3 23.0 3 A 2024-04-14 2024 3 23.0 3 C 2024-04-21 2024 3 23.0 4 B 2024-05-07 2024 3 23.0 4
4. 值函數(偏移函數)
FIRST_VALUE:返回窗口分區第1行的值。
LAST_VALUE返回窗口分區最后1行的值。
LAG:返回窗口內距離當前行之前偏移offset后的值。
LEAD:返回窗口內距離當前行偏移offset后的值。
NTH_VALUE:返回窗口內偏移指定offset后的值,偏移量從1開始。
1. FIRST_VALUE
命令說明:返回窗口分區第一行的值。返回值類型:與輸入參數類型相同。
SELECT customer_id, order_date, ` year` , product_id, product_price, FIRST_VALUE( product_id) OVER ( PARTITION BY customer_id ORDER BY order_date) AS fv
FROM t_sales_info ;
customer_id order_date year product_id product_price fv A 2023-06-01 2023 1 36.0 1 A 2023-09-01 2023 2 48.0 1 A 2023-11-11 2023 3 23.0 1 A 2024-01-01 2024 2 48.0 1 A 2024-04-14 2024 3 23.0 1 B 2023-09-07 2023 2 48.0 2 B 2024-02-02 2024 2 48.0 2 B 2024-03-16 2024 3 23.0 2 B 2024-04-01 2024 3 23.0 2 B 2024-05-07 2024 3 23.0 2 C 2023-10-10 2023 3 23.0 3 C 2023-12-11 2023 3 23.0 3 C 2024-02-04 2024 1 36.0 3 C 2024-02-11 2024 1 36.0 3 C 2024-04-21 2024 3 23.0 3
2. LAST_VALUE
命令說明:返回窗口分區最后一行的值。LAST_VALUE默認統計范圍是 rows between unbounded preceding and current row,即取當前行數據與當前行之前的數據進行比較。如果像FIRST_VALUE那樣直接在每行數據中顯示最后一行數據,需要在 order by 條件的后面加上語句:rows between unbounded preceding and unbounded following。返回值類型:與輸入參數類型相同。
SELECT customer_id, order_date, ` year` , product_id, product_price, FIRST_VALUE( product_id) OVER ( PARTITION BY customer_id ORDER BY order_date) AS lv
FROM t_sales_info ;
customer_id order_date year product_id product_price lv A 2023-06-01 2023 1 36.0 1 A 2023-09-01 2023 2 48.0 2 A 2023-11-11 2023 3 23.0 3 A 2024-01-01 2024 2 48.0 2 A 2024-04-14 2024 3 23.0 3 B 2023-09-07 2023 2 48.0 2 B 2024-02-02 2024 2 48.0 2 B 2024-03-16 2024 3 23.0 3 B 2024-04-01 2024 3 23.0 3 B 2024-05-07 2024 3 23.0 3 C 2023-10-10 2023 3 23.0 3 C 2023-12-11 2023 3 23.0 3 C 2024-02-04 2024 1 36.0 1 C 2024-02-11 2024 1 36.0 1 C 2024-04-21 2024 3 23.0 3
SELECT customer_id, order_date, ` year` , product_id, product_price, LAST_VALUE( product_id) OVER ( PARTITION BY customer_id ORDER BY order_date ROWS BETWEEN unbounded preceding AND unbounded following ) AS lv
FROM t_sales_info ;
customer_id order_date year product_id product_price lv A 2023-06-01 2023 1 36.0 3 A 2023-09-01 2023 2 48.0 3 A 2023-11-11 2023 3 23.0 3 A 2024-01-01 2024 2 48.0 3 A 2024-04-14 2024 3 23.0 3 B 2023-09-07 2023 2 48.0 3 B 2024-02-02 2024 2 48.0 3 B 2024-03-16 2024 3 23.0 3 B 2024-04-01 2024 3 23.0 3 B 2024-05-07 2024 3 23.0 3 C 2023-10-10 2023 3 23.0 3 C 2023-12-11 2023 3 23.0 3 C 2024-02-04 2024 1 36.0 3 C 2024-02-11 2024 1 36.0 3 C 2024-04-21 2024 3 23.0 3
3. LAG
LAG( x[ , offset[ , default_value] ] ) 命令說明:返回窗口內距離當前行之前偏移offset后的值。
偏移量起始值是0,也就是當前數據行。偏移量可以是標量表達式,默認offset是1 。
如果偏移量的值是null或者大于窗口長度,則返回default_value;如果沒有指定default_value,則返回null。
返回值類型:與輸入參數類型相同。
SELECT customer_id, order_date, ` year` , product_id, product_price, LAG( product_id, 1 , '無' ) OVER ( PARTITION BY customer_id ORDER BY order_date) AS lag1
FROM t_sales_info ;
customer_id order_date year product_id product_price lag1 A 2023-06-01 2023 1 36.0 無 A 2023-09-01 2023 2 48.0 1 A 2023-11-11 2023 3 23.0 2 A 2024-01-01 2024 2 48.0 3 A 2024-04-14 2024 3 23.0 2 B 2023-09-07 2023 2 48.0 無 B 2024-02-02 2024 2 48.0 2 B 2024-03-16 2024 3 23.0 2 B 2024-04-01 2024 3 23.0 3 B 2024-05-07 2024 3 23.0 3 C 2023-10-10 2023 3 23.0 無 C 2023-12-11 2023 3 23.0 3 C 2024-02-04 2024 1 36.0 3 C 2024-02-11 2024 1 36.0 1 C 2024-04-21 2024 3 23.0 1
4. LEAD
LEAD( x[ ,offset[ , default_value] ] ) 命令說明:返回窗口內距離當前行偏移offset后的值。偏移量offset起始值是0,也就是當前數據行。偏移量可以是標量表達式,默認offset是1 。如果偏移量的值是null或者大于窗口長度,則返回default_value;如果沒有指定default_value,則返回null。返回值類型:與輸入參數類型相同。
SELECT customer_id, order_date, ` year` , product_id, product_price, LEAD( product_id, 1 ) OVER ( PARTITION BY customer_id ORDER BY order_date) AS lead1
FROM t_sales_info ;
customer_id order_date year product_id product_price lead1 A 2023-06-01 2023 1 36.0 2 A 2023-09-01 2023 2 48.0 3 A 2023-11-11 2023 3 23.0 2 A 2024-01-01 2024 2 48.0 3 A 2024-04-14 2024 3 23.0 \N B 2023-09-07 2023 2 48.0 2 B 2024-02-02 2024 2 48.0 3 B 2024-03-16 2024 3 23.0 3 B 2024-04-01 2024 3 23.0 3 B 2024-05-07 2024 3 23.0 \N C 2023-10-10 2023 3 23.0 3 C 2023-12-11 2023 3 23.0 1 C 2024-02-04 2024 1 36.0 1 C 2024-02-11 2024 1 36.0 3 C 2024-04-21 2024 3 23.0 \N
5. NTH_VALUE
NTH_VALUE( x, offset) 命令說明:返回窗口內偏移指定offset后的值,偏移量從1開始。如果偏移量offset是null或者大于窗口內值的個數,則返回null;如果偏移量offset為0或者負數,則系統提示報錯。返回值類型:與輸入參數類型相同。
SELECT customer_id, order_date, ` year` , product_id, product_price, NTH_VALUE( product_id, 2 ) OVER ( PARTITION BY customer_id ORDER BY order_date) AS nv1
FROM t_sales_info ;
customer_id order_date year product_id product_price nv1 A 2023-06-01 2023 1 36.0 \N A 2023-09-01 2023 2 48.0 2 A 2023-11-11 2023 3 23.0 2 A 2024-01-01 2024 2 48.0 2 A 2024-04-14 2024 3 23.0 2 B 2023-09-07 2023 2 48.0 \N B 2024-02-02 2024 2 48.0 2 B 2024-03-16 2024 3 23.0 2 B 2024-04-01 2024 3 23.0 2 B 2024-05-07 2024 3 23.0 2 C 2023-10-10 2023 3 23.0 \N C 2023-12-11 2023 3 23.0 3 C 2024-02-04 2024 1 36.0 3 C 2024-02-11 2024 1 36.0 3 C 2024-04-21 2024 3 23.0 3
END