目錄
- 1 題目
- 2 建表語句
- 3 題解
1 題目
有用戶賬戶表,包含年份,用戶id和值,請按照年份分組,取出值前兩小和前兩大對應的用戶id,需要保持值最小和最大的用戶id排首位。
樣例數據
+-------+----------+--------+
| year | user_id | value |
+-------+----------+--------+
| 2022 | A | 30 |
| 2022 | B | 10 |
| 2022 | C | 20 |
| 2023 | A | 40 |
| 2023 | B | 50 |
| 2023 | C | 20 |
| 2023 | D | 30 |
+-------+----------+--------+
期望結果
+-------+-----------------+-----------------+
| year | max2_user_list | min2_user_list |
+-------+-----------------+-----------------+
| 2022 | A,C | B,C |
| 2023 | B,A | C,D |
+-------+-----------------+-----------------+
2 建表語句
--建表語句
create table if not exists t_amount
(year string,user_id string,value bigint
)ROW FORMAT DELIMITEDFIELDS TERMINATED BY ','STORED AS orc;--插入數據insert into t_amount(year, user_id, value)
values ('2022', 'A', 30),('2022', 'B', 10),('2022', 'C', 20),('2023', 'A', 40),('2023', 'B', 50),('2023', 'C', 20),('2023', 'D', 30)
3 題解
(1)row_number函數根據年份分組,value正排和倒排得到兩個序列
select user_id, year, value, row_number() over (partition by year order by value desc) as desc_rn, row_number() over (partition by year order by value) as rn
from t_amount
執行結果
+----------+-------+--------+----------+-----+
| user_id | year | value | desc_rn | rn |
+----------+-------+--------+----------+-----+
| B | 2022 | 10 | 3 | 1 |
| C | 2022 | 20 | 2 | 2 |
| A | 2022 | 30 | 1 | 3 |
| C | 2023 | 20 | 4 | 1 |
| D | 2023 | 30 | 3 | 2 |
| A | 2023 | 40 | 2 | 3 |
| B | 2023 | 50 | 1 | 4 |
+----------+-------+--------+----------+-----+
(2)根據年份分組,取出value最大user_id,第二大user_id,最小user_id,第二小user_id
根據年份分組,取出每年最大、第二大,最小、第二小用戶ID。使用 if 對desc_rn,rn進行判斷,對符合條件的數據取出 user_id,其他去null,然后使用聚合函數取出結果。
select year,max(if(desc_rn = 1, user_id, null)) as max1_user_id,max(if(desc_rn = 2, user_id, null)) as max2_user_id,max(if(rn = 1, user_id, null)) as min1_user_id,max(if(rn = 2, user_id, null)) as min2_user_id
from (select user_id, year, value, row_number() over (partition by year order by value desc) as desc_rn, row_number() over (partition by year order by value) as rnfrom t_amount) t1
group by year
執行結果
+-------+---------------+---------------+---------------+---------------+
| year | max1_user_id | max2_user_id | min1_user_id | min2_user_id |
+-------+---------------+---------------+---------------+---------------+
| 2022 | A | C | B | C |
| 2023 | B | A | C | D |
+-------+---------------+---------------+---------------+---------------+
(3)按照順序拼接,得到最終結果
按照題目要求,進行字符拼接
- 拼接max1_user_id、max2_user_id為max2_list;
- 拼接min1_user_id、min2_user_id為min2_list;
select year,concat(max(if(desc_rn = 1, user_id, null)), ',',max(if(desc_rn = 2, user_id, null))) as max2_user_list,concat(max(if(rn = 1, user_id, null)), ',',max(if(rn = 2, user_id, null))) as min2_user_list
from (select user_id, year, value, row_number() over (partition by year order by value desc) as desc_rn, row_number() over (partition by year order by value) as rnfrom t_amount) t1
group by year
執行結果
+-------+-----------------+-----------------+
| year | max2_user_list | min2_user_list |
+-------+-----------------+-----------------+
| 2022 | A,C | B,C |
| 2023 | B,A | C,D |
+-------+-----------------+-----------------+