1179. 重新格式化部門表
表 Department:
±--------------±--------+
| Column Name | Type |
±--------------±--------+
| id | int |
| revenue | int |
| month | varchar |
±--------------±--------+
在 SQL 中,(id, month) 是表的聯合主鍵。
這個表格有關于每個部門每月收入的信息。
月份(month)可以取下列值 [“Jan”,“Feb”,“Mar”,“Apr”,“May”,“Jun”,“Jul”,“Aug”,“Sep”,“Oct”,“Nov”,“Dec”]。
重新格式化表格,使得 每個月 都有一個部門 id 列和一個收入列。
以 任意順序 返回結果表。
結果格式如以下示例所示。
示例 1:
輸入:
Department table:
±-----±--------±------+
| id | revenue | month |
±-----±--------±------+
| 1 | 8000 | Jan |
| 2 | 9000 | Jan |
| 3 | 10000 | Feb |
| 1 | 7000 | Feb |
| 1 | 6000 | Mar |
±-----±--------±------+
輸出:
±-----±------------±------------±------------±----±------------+
| id | Jan_Revenue | Feb_Revenue | Mar_Revenue | … | Dec_Revenue |
±-----±------------±------------±------------±----±------------+
| 1 | 8000 | 7000 | 6000 | … | null |
| 2 | 9000 | null | null | … | null |
| 3 | null | 10000 | null | … | null |
±-----±------------±------------±------------±----±------------+
解釋:四月到十二月的收入為空。
請注意,結果表共有 13 列(1 列用于部門 ID,其余 12 列用于各個月份)。
題解
格式化表格,使得 每個月 都有一個部門 id 列和一個收入列
- 經典的行轉列,可以使用聚合函數+group by + case when來實現
方法一 SUM + group by
selectid,SUM(case when month='Jan' then revenue else null end) as Jan_Revenue,SUM(case when month='Feb' then revenue else null end) as Feb_Revenue,SUM(case when month='Mar' then revenue else null end) as Mar_Revenue,SUM(case when month='Apr' then revenue else null end) as Apr_Revenue,SUM(case when month='May' then revenue else null end) as May_Revenue,SUM(case when month='Jun' then revenue else null end) as Jun_Revenue,SUM(case when month='Jul' then revenue else null end) as Jul_Revenue,SUM(case when month='Aug' then revenue else null end) as Aug_Revenue,SUM(case when month='Sep' then revenue else null end) as Sep_Revenue,SUM(case when month='Oct' then revenue else null end) as Oct_Revenue,SUM(case when month='Nov' then revenue else null end) as Nov_Revenue,SUM(case when month='Dec' then revenue else null end) as Dec_Revenue
from Department
group by id
方法二 MAX + group by
selectid,MAX(case when month='Jan' then revenue else null end) as Jan_Revenue,MAX(case when month='Feb' then revenue else null end) as Feb_Revenue,MAX(case when month='Mar' then revenue else null end) as Mar_Revenue,MAX(case when month='Apr' then revenue else null end) as Apr_Revenue,MAX(case when month='May' then revenue else null end) as May_Revenue,MAX(case when month='Jun' then revenue else null end) as Jun_Revenue,MAX(case when month='Jul' then revenue else null end) as Jul_Revenue,MAX(case when month='Aug' then revenue else null end) as Aug_Revenue,MAX(case when month='Sep' then revenue else null end) as Sep_Revenue,MAX(case when month='Oct' then revenue else null end) as Oct_Revenue,MAX(case when month='Nov' then revenue else null end) as Nov_Revenue,MAX(case when month='Dec' then revenue else null end) as Dec_Revenue
from Department
group by id
方法三 MIN + group by
selectid,MIN(case when month='Jan' then revenue else null end) as Jan_Revenue,MIN(case when month='Feb' then revenue else null end) as Feb_Revenue,MIN(case when month='Mar' then revenue else null end) as Mar_Revenue,MIN(case when month='Apr' then revenue else null end) as Apr_Revenue,MIN(case when month='May' then revenue else null end) as May_Revenue,MIN(case when month='Jun' then revenue else null end) as Jun_Revenue,MIN(case when month='Jul' then revenue else null end) as Jul_Revenue,MIN(case when month='Aug' then revenue else null end) as Aug_Revenue,MIN(case when month='Sep' then revenue else null end) as Sep_Revenue,MIN(case when month='Oct' then revenue else null end) as Oct_Revenue,MIN(case when month='Nov' then revenue else null end) as Nov_Revenue,MIN(case when month='Dec' then revenue else null end) as Dec_Revenue
from Department
group by id
可能一開始看到SUM、MAX、MIN會不理解為啥?
可以看下這2個圖例呢?
中間分組的過程其實是內部存儲的,無法查詢出來的一個虛擬的結果,一個框是一個集合的內容,這樣的話就比較好理解為啥用聚合函數了。
如果不使用聚合函數會怎么樣呢?
如果不使用的話,行數不會減少,會和輸入數據一樣的行數,就需要考慮一個合并的問題了。
大致效果是:
1, 100,null,null,null,…
2,null,100,null,null,…
1,null,100,null,null,…
顯然id=1的數據沒有合并,違背了行轉列的預期效果。
分析案例
解題思路
由于篩選結果中每個ID是一個記錄 因此GROUP BY ID.
每個月份是一列,因此篩選每個月份時使用CASE [when…then…] END只取當前月份.
需要使用SUM()聚合函數 因為如果沒有聚合函數 篩選出來的是
GROUP BY、CASE…END之后的第一行.
比如 Department 表:
+------+---------+-------+
| id | revenue | month |
+------+---------+-------+
| 1 | 8000 | Jan |
| 2 | 9000 | Jan |
| 3 | 10000 | Feb |
| 1 | 7000 | Feb |
| 1 | 6000 | Mar |
+------+---------+-------+GROUP BY ID
+------+---------+-------+
| id | revenue | month |
+------+---------+-------+
| 1 | 8000 | Jan |
| 1 | 7000 | Feb |
| 1 | 6000 | Mar |
-------------------------
| 2 | 9000 | Jan |
-------------------------
| 3 | 10000 | Feb |
+------+---------+-------+如果沒有聚合函數 只輸出第一行 比如
SELECT ID, (CASE WHEN MONTH='JAN' THEN REVENUE END) AS JAN_REVENUE,
(CASE WHEN MONTH='FEB' THEN REVENUE END) AS FEB_REVENUE
FROM DEPARTMENT GROUP BY ID
會輸出
+------+-------------+-------------+
| ID | JAN_REVENUE | FEB_REVENUE |
+------+-------------+-------------+
| 1 | 8000 | NULL |
| 2 | 7000 | NULL |
| 3 | NULL | 10000 |
+------+-------------+-------------+
其中 ID=1 的 FEB_REVENUE 結果不對,這是因為 ID=1 時, (CASE WHEN MONTH='FEB' THEN REVENUE END)= [NULL, 7000, NULL], 沒有聚合函數會只取第一個,即NULL