- 🍨 This article is part of the learning-log blog for the 🔗 365-Day Deep Learning Training Camp
- 🍖 Original author: K同學啊
Preface

- LSTM is a classic model, and a fairly complex one; it is usually studied after the RNN and GRU models. Explanations of GRU and LSTM will be published over the next couple of days, including:
  - Deep Learning Basics: Understanding RNN in One Article
  - Deep Learning Basics: GRU Study Notes (Li Mu's *Dive into Deep Learning*)
- This article: a fire-prediction study based on an LSTM model, covering how to construct time-series data, how to build the model, the LSTM API in PyTorch, dynamic learning-rate adjustment, and more, with RMSE and R2 used for the final evaluation.
- Bookmarks and follows are welcome; this blog will be updated continuously.
Table of Contents

- 1. Importing and Exploring the Data
  - 1. Importing Libraries
  - 2. Importing the Data
  - 3. Data Visualization
  - 4. Correlation Analysis (Heatmap)
  - 5. Feature Extraction
- 2. Building the Time-Series Data
  - 1. Data Normalization
  - 2. Building the Time-Series Dataset
- 3. Splitting and Loading the Dataset
  - 1. Data Splitting
- 4. Model Construction
- 5. Model Training
  - 1. Training Function
  - 2. Test Function
  - 3. Model Training
- 6. Results
  - 1. Loss Curves
  - 2. Prediction Comparison
  - 3. R2 Evaluation
1. Importing and Exploring the Data
1. Importing Libraries
```python
import torch
import torch.nn as nn
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pylab as plt

# Figure resolution settings
plt.rcParams['savefig.dpi'] = 500  # saved-figure resolution
plt.rcParams['figure.dpi'] = 500   # display resolution

device = "cpu"
device
```
```
'cpu'
```
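This article runs everything on the CPU. On a machine with a GPU, the device is typically selected with the common pattern below (a sketch only; the rest of this article keeps `device = "cpu"`):

```python
# Pick CUDA when available, otherwise fall back to CPU
device = "cuda" if torch.cuda.is_available() else "cpu"
print(device)
```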
2. Importing the Data
```python
data_df = pd.read_csv('./woodpine2.csv')
data_df.head()
```
|   | Time  | Tem1 | CO 1 | Soot 1 |
|---|-------|------|------|--------|
| 0 | 0.000 | 25.0 | 0.0  | 0.0    |
| 1 | 0.228 | 25.0 | 0.0  | 0.0    |
| 2 | 0.456 | 25.0 | 0.0  | 0.0    |
| 3 | 0.685 | 25.0 | 0.0  | 0.0    |
| 4 | 0.913 | 25.0 | 0.0  | 0.0    |
The data are experimental measurements, collected at regular time intervals:

- Time: starts at 0.000 and increases in steps of roughly 0.228.
- Tem1: short for temperature, presumably in degrees Celsius (°C).
- CO 1: carbon monoxide (CO) concentration.
- Soot 1: soot (carbon black) concentration.
```python
# Inspect the DataFrame
data_df.info()
```
```
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5948 entries, 0 to 5947
Data columns (total 4 columns):
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   Time    5948 non-null   float64
 1   Tem1    5948 non-null   float64
 2   CO 1    5948 non-null   float64
 3   Soot 1  5948 non-null   float64
dtypes: float64(4)
memory usage: 186.0 KB
```
```python
# Check for missing values
data_df.isnull().sum()
```
```
Time      0
Tem1      0
CO 1      0
Soot 1    0
dtype: int64
```
3. Data Visualization

Since the samples are collected at fixed time intervals, the useful features are temperature, CO, and Soot.
```python
_, ax = plt.subplots(1, 3, constrained_layout=True, figsize=(14, 3))  # constrained_layout=True auto-adjusts subplot spacing

sns.lineplot(data=data_df['Tem1'], ax=ax[0])
sns.lineplot(data=data_df['CO 1'], ax=ax[1])
sns.lineplot(data=data_df['Soot 1'], ax=ax[2])
plt.show()
```
*(figure: line plots of Tem1, CO 1, and Soot 1 over the sample index)*
4. Correlation Analysis (Heatmap)
```python
columns = ['Tem1', 'CO 1', 'Soot 1']

plt.figure(figsize=(8, 6))
sns.heatmap(data=data_df[columns].corr(), annot=True, fmt=".2f")
plt.show()
```
*(figure: correlation heatmap of Tem1, CO 1, and Soot 1)*
```python
# Summary statistics
data_df.describe()
```
|       | Time        | Tem1        | CO 1        | Soot 1      |
|-------|-------------|-------------|-------------|-------------|
| count | 5948.000000 | 5948.000000 | 5948.000000 | 5948.000000 |
| mean  | 226.133238  | 152.534919  | 0.000035    | 0.000222    |
| std   | 96.601445   | 77.026019   | 0.000022    | 0.000144    |
| min   | 0.000000    | 25.000000   | 0.000000    | 0.000000    |
| 25%   | 151.000000  | 89.000000   | 0.000015    | 0.000093    |
| 50%   | 241.000000  | 145.000000  | 0.000034    | 0.000220    |
| 75%   | 310.000000  | 220.000000  | 0.000054    | 0.000348    |
| max   | 367.000000  | 307.000000  | 0.000080    | 0.000512    |
I was also surprised to see correlations of 1. The summary statistics did not reveal anything odd, but the visualizations above are convincing: as the temperature rises, the CO and Soot concentrations rise with it, which matches a fire scenario, so the data look fine.
5. Feature Extraction

```python
# Since the sampling interval is constant, the Time column is dropped here
data = data_df.iloc[:, 1:]
data.head(3)
```
|   | Tem1 | CO 1 | Soot 1 |
|---|------|------|--------|
| 0 | 25.0 | 0.0  | 0.0    |
| 1 | 25.0 | 0.0  | 0.0    |
| 2 | 25.0 | 0.0  | 0.0    |
```python
data.tail(3)
```
|      | Tem1  | CO 1     | Soot 1   |
|------|-------|----------|----------|
| 5945 | 292.0 | 0.000077 | 0.000491 |
| 5946 | 291.0 | 0.000076 | 0.000489 |
| 5947 | 290.0 | 0.000076 | 0.000487 |
The features differ greatly in scale, so normalization is needed.
2. Building the Time-Series Data
1. Data Normalization
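MinMaxScaler rescales each column to the [0, 1] range using the standard min-max formula:

$$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$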
```python
from sklearn.preprocessing import MinMaxScaler

sc = MinMaxScaler()

for col in ['Tem1', 'CO 1', 'Soot 1']:
    data[col] = sc.fit_transform(data[col].values.reshape(-1, 1))

# Check dimensions
data.shape
```
```
(5948, 3)
```
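One caveat: because `fit_transform` is refit inside the loop, `sc` ends up holding the statistics of the last column (`Soot 1`), so the `sc.inverse_transform` calls later in this article map values back to that column's scale rather than to degrees. If the predictions needed to be restored to real temperature units, a dedicated scaler for `Tem1` would be required; a minimal sketch (with `sc_tem` as a hypothetical name, not used elsewhere in this article):

```python
# Hypothetical alternative: keep a separate scaler per column so each
# column can be correctly inverse-transformed later
sc_tem = MinMaxScaler()
data['Tem1'] = sc_tem.fit_transform(data[['Tem1']])  # invert later via sc_tem.inverse_transform(...)
```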
2. Building the Time-Series Dataset

The LSTM model expects input of shape (samples, time steps, features). For the data in this article:

- Samples: 5948
- Time steps: set to 8 here
  - That is, every 8 consecutive rows of features (Tem1, CO 1, Soot 1) form one window, and the Tem1 value of the 9th step is the target y (temperature); fire prediction here is essentially temperature prediction
- Features: 3
```python
width_x = 8
width_y = 1

# Build the time-series data X, y (as explained above)
X, y = [], []

# Starting position for window construction
start_position = 0

for _, _ in data.iterrows():
    in_end = start_position + width_x
    out_end = in_end + width_y

    if out_end < len(data):
        # Collect one window of inputs and its target
        X_ = np.array(data.iloc[start_position:in_end, :])
        y_ = np.array(data.iloc[in_end:out_end, 0])

        X.append(X_)
        y.append(y_)

    start_position += 1

# Convert to arrays
X = np.array(X)
# y also needs to be reshaped to a suitable dimensionality
y = np.array(y).reshape(-1, 1, 1)

X.shape, y.shape
```
```
((5939, 8, 3), (5939, 1, 1))
```
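As a sanity check, the same windows can be built in vectorized form with numpy's `sliding_window_view` (a sketch, assuming numpy >= 1.20; `X_alt` and `y_alt` are hypothetical names):

```python
values = data.to_numpy()                     # shape (5948, 3)
n_windows = len(values) - width_x - width_y  # same count as the out_end < len(data) condition above

# Windows along axis 0; the window dimension is appended last, hence the transpose
wins = np.lib.stride_tricks.sliding_window_view(values, width_x, axis=0)
X_alt = wins.transpose(0, 2, 1)[:n_windows]                          # (5939, 8, 3)
y_alt = values[width_x : width_x + n_windows, 0].reshape(-1, 1, 1)   # (5939, 1, 1)

assert np.allclose(X_alt, X) and np.allclose(y_alt, y)
```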
3. Splitting and Loading the Dataset
1. Data Splitting
```python
# Use the first 5000 samples as the training set and the rest as the test set
X_train = torch.tensor(np.array(X[:5000, ]), dtype=torch.float32)
X_test = torch.tensor(np.array(X[5000:, ]), dtype=torch.float32)

y_train = torch.tensor(np.array(y[:5000, ]), dtype=torch.float32)
y_test = torch.tensor(np.array(y[5000:, ]), dtype=torch.float32)

X_train.shape, y_train.shape
```
```
(torch.Size([5000, 8, 3]), torch.Size([5000, 1, 1]))
```
Dataset construction:

- TensorDataset is a PyTorch class that combines two or more tensors into a single dataset. Each sample consists of one input tensor and one target tensor (in the dataset built here, each input corresponds to one output).
```python
from torch.utils.data import TensorDataset, DataLoader

batch_size = 64

train_dl = DataLoader(TensorDataset(X_train, y_train),
                      batch_size=batch_size,
                      shuffle=True)
test_dl = DataLoader(TensorDataset(X_test, y_test),
                     batch_size=batch_size,
                     shuffle=False)
```
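A quick way to confirm the loaders yield the expected shapes (a sketch):

```python
# Pull one batch and check its shape
xb, yb = next(iter(train_dl))
print(xb.shape, yb.shape)  # torch.Size([64, 8, 3]) torch.Size([64, 1, 1])
```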
4. Model Construction
The `nn.LSTM` API

Constructor:

```python
torch.nn.LSTM(input_size, hidden_size, num_layers=1, bias=True, batch_first=False,
              dropout=0, bidirectional=False, proj_size=0)
```

- `input_size` (`int`): number of input features at each time step.
- `hidden_size` (`int`): number of features in the hidden state (h). This is also the number of output features, unless `proj_size` is specified.
- `num_layers` (`int`, optional): number of stacked LSTM layers. Default: 1.
- `bias` (`bool`, optional): if `True`, bias terms are used; otherwise not. Default: `True`.
- `batch_first` (`bool`, optional): if `True`, input and output tensors have shape `(batch, seq, feature)`; otherwise `(seq, batch, feature)`. Default: `False`.
- `dropout` (`float`, optional): dropout probability applied after every LSTM layer except the last. No dropout is applied when `num_layers = 1`. Default: 0.
- `bidirectional` (`bool`, optional): if `True`, the LSTM becomes bidirectional. Default: `False`.
- `proj_size` (`int`, optional): if greater than 0, the LSTM projects the hidden state into a space of a different dimension, which reduces the number of parameters and can speed up training. Default: 0 (no projection).

Inputs:

- `input` (`tensor`): shape `(seq_len, batch, input_size)`, or `(batch, seq_len, input_size)` if `batch_first=True`.
- `(h_0, c_0)` (`tuple`, optional): the initial hidden state and cell state, each of shape `(num_layers * num_directions, batch, hidden_size)`. If not provided, both are initialized to zeros.

Where:

- Unidirectional LSTM (`bidirectional=False`): `num_directions=1`. The LSTM processes the sequence in temporal order, from the first time step to the last.
- Bidirectional LSTM (`bidirectional=True`): `num_directions=2`. A bidirectional LSTM contains two independent LSTM layers, one processing the data forward in time and the other backward. This lets the model capture both past and future context, which is especially useful for tasks such as semantic understanding in NLP.

Outputs (two):

- `output` (`tensor`): the output features (`h_t`) for every time step. Shape `(batch, seq_len, num_directions * hidden_size)` if `batch_first=True`; otherwise `(seq_len, batch, num_directions * hidden_size)`. Note: if `proj_size > 0`, the last dimension is `num_directions * proj_size`.
- `(h_n, c_n)` (`tuple`): the final hidden state and cell state after all time steps, each of shape `(num_layers * num_directions, batch, hidden_size)`. Likewise, if `proj_size > 0`, the last dimension of `h_n` is `proj_size`.
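A minimal shape check ties these descriptions together (a sketch with small, arbitrary sizes):

```python
# 2 samples, 8 time steps, 3 features; hidden size 16, batch_first=True
demo_lstm = nn.LSTM(input_size=3, hidden_size=16, num_layers=1, batch_first=True)
x_demo = torch.randn(2, 8, 3)

out, (h_n, c_n) = demo_lstm(x_demo)
print(out.shape)  # torch.Size([2, 8, 16]) -> (batch, seq_len, num_directions * hidden_size)
print(h_n.shape)  # torch.Size([1, 2, 16]) -> (num_layers * num_directions, batch, hidden_size)
print(c_n.shape)  # torch.Size([1, 2, 16])
```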
```python
'''
The model uses two LSTM layers: 3 -> lstm:320 -> lstm:320 (further extracting temporal features) -> linear:1
'''

class model_lstm(nn.Module):
    def __init__(self):
        super().__init__()
        self.lstm1 = nn.LSTM(input_size=3, hidden_size=320, num_layers=1, batch_first=True)
        self.lstm2 = nn.LSTM(input_size=320, hidden_size=320, num_layers=1, batch_first=True)
        self.fc = nn.Linear(320, 1)

    def forward(self, x):
        out, hidden = self.lstm1(x)
        out, _ = self.lstm2(out)
        out = self.fc(out)  # output shape here is (batch_size, sequence_length, output_size), i.e. (64, 8, 1)
        # Keep only the last time step -> (64, 1, 1); PyTorch may squeeze size-1
        # dimensions, so the shape is restored explicitly
        return out[:, -1, :].view(-1, 1, 1)

model = model_lstm().to(device)
model
```
```
model_lstm(
  (lstm1): LSTM(3, 320, batch_first=True)
  (lstm2): LSTM(320, 320, batch_first=True)
  (fc): Linear(in_features=320, out_features=1, bias=True)
)
```
```python
# Quick test of the forward pass
model(torch.rand(30, 8, 3)).shape
```

```
torch.Size([30, 1, 1])
```
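With `hidden_size=320` for only three input features, the model is fairly large. A quick way to count the trainable parameters (a sketch; expect roughly 1.24 million here):

```python
# Sum the element counts of all trainable parameter tensors
total_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f"trainable parameters: {total_params:,}")
```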
5. Model Training
1. Training Function
```python
def train(train_dl, model, loss_fn, optimizer, lr_scheduler=None):
    size = len(train_dl.dataset)
    num_batchs = len(train_dl)
    train_loss = 0

    for X, y in train_dl:
        X, y = X.to(device), y.to(device)

        pred = model(X)
        loss = loss_fn(pred, y)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        train_loss += loss.item()

    # Step the scheduler once per epoch and report the current learning rate
    if lr_scheduler is not None:
        lr_scheduler.step()
        print("learning rate = {:.5f}".format(optimizer.param_groups[0]['lr']), end=" ")

    train_loss /= num_batchs
    return train_loss
```
2. Test Function
```python
def test(test_dl, model, loss_fn):
    size = len(test_dl.dataset)
    num_batchs = len(test_dl)
    test_loss = 0

    with torch.no_grad():
        for X, y in test_dl:
            X, y = X.to(device), y.to(device)

            pred = model(X)
            loss = loss_fn(pred, y)
            test_loss += loss.item()

    test_loss /= num_batchs
    return test_loss
```
3. Model Training
```python
# Hyperparameters
loss_fn = nn.MSELoss()
lr = 1e-1
opt = torch.optim.SGD(model.parameters(), lr=lr, weight_decay=1e-4)  # weight_decay applies L2 regularization (weight decay)
epochs = 50

# Dynamic learning-rate adjustment
lr_scheduler = torch.optim.lr_scheduler.CosineAnnealingLR(opt, epochs, last_epoch=-1)

train_loss = []
test_loss = []

for epoch in range(epochs):
    model.train()
    epoch_train_loss = train(train_dl, model, loss_fn, opt, lr_scheduler)

    model.eval()
    epoch_test_loss = test(test_dl, model, loss_fn)

    train_loss.append(epoch_train_loss)
    test_loss.append(epoch_test_loss)

    template = 'Epoch:{:2d}, Train_loss:{:.5f}, Test_loss:{:.5f}'
    print(template.format(epoch + 1, epoch_train_loss, epoch_test_loss))
```
```
learning rate = 0.09990 Epoch: 1, Train_loss:0.00320, Test_loss:0.00285
learning rate = 0.09961 Epoch: 2, Train_loss:0.00022, Test_loss:0.00084
learning rate = 0.09911 Epoch: 3, Train_loss:0.00015, Test_loss:0.00058
learning rate = 0.09843 Epoch: 4, Train_loss:0.00015, Test_loss:0.00057
learning rate = 0.09755 Epoch: 5, Train_loss:0.00015, Test_loss:0.00072
learning rate = 0.09649 Epoch: 6, Train_loss:0.00015, Test_loss:0.00059
learning rate = 0.09524 Epoch: 7, Train_loss:0.00015, Test_loss:0.00058
learning rate = 0.09382 Epoch: 8, Train_loss:0.00015, Test_loss:0.00058
learning rate = 0.09222 Epoch: 9, Train_loss:0.00015, Test_loss:0.00057
learning rate = 0.09045 Epoch:10, Train_loss:0.00015, Test_loss:0.00066
learning rate = 0.08853 Epoch:11, Train_loss:0.00015, Test_loss:0.00077
learning rate = 0.08645 Epoch:12, Train_loss:0.00015, Test_loss:0.00071
learning rate = 0.08423 Epoch:13, Train_loss:0.00015, Test_loss:0.00071
learning rate = 0.08187 Epoch:14, Train_loss:0.00015, Test_loss:0.00061
learning rate = 0.07939 Epoch:15, Train_loss:0.00015, Test_loss:0.00056
learning rate = 0.07679 Epoch:16, Train_loss:0.00015, Test_loss:0.00065
learning rate = 0.07409 Epoch:17, Train_loss:0.00015, Test_loss:0.00056
learning rate = 0.07129 Epoch:18, Train_loss:0.00015, Test_loss:0.00058
learning rate = 0.06841 Epoch:19, Train_loss:0.00015, Test_loss:0.00062
learning rate = 0.06545 Epoch:20, Train_loss:0.00015, Test_loss:0.00062
learning rate = 0.06243 Epoch:21, Train_loss:0.00015, Test_loss:0.00069
learning rate = 0.05937 Epoch:22, Train_loss:0.00015, Test_loss:0.00057
learning rate = 0.05627 Epoch:23, Train_loss:0.00015, Test_loss:0.00064
learning rate = 0.05314 Epoch:24, Train_loss:0.00015, Test_loss:0.00072
learning rate = 0.05000 Epoch:25, Train_loss:0.00015, Test_loss:0.00061
learning rate = 0.04686 Epoch:26, Train_loss:0.00015, Test_loss:0.00058
learning rate = 0.04373 Epoch:27, Train_loss:0.00015, Test_loss:0.00063
learning rate = 0.04063 Epoch:28, Train_loss:0.00015, Test_loss:0.00059
learning rate = 0.03757 Epoch:29, Train_loss:0.00015, Test_loss:0.00063
learning rate = 0.03455 Epoch:30, Train_loss:0.00015, Test_loss:0.00060
learning rate = 0.03159 Epoch:31, Train_loss:0.00015, Test_loss:0.00067
learning rate = 0.02871 Epoch:32, Train_loss:0.00015, Test_loss:0.00065
learning rate = 0.02591 Epoch:33, Train_loss:0.00015, Test_loss:0.00063
learning rate = 0.02321 Epoch:34, Train_loss:0.00015, Test_loss:0.00063
learning rate = 0.02061 Epoch:35, Train_loss:0.00015, Test_loss:0.00067
learning rate = 0.01813 Epoch:36, Train_loss:0.00015, Test_loss:0.00062
learning rate = 0.01577 Epoch:37, Train_loss:0.00015, Test_loss:0.00065
learning rate = 0.01355 Epoch:38, Train_loss:0.00015, Test_loss:0.00064
learning rate = 0.01147 Epoch:39, Train_loss:0.00014, Test_loss:0.00063
learning rate = 0.00955 Epoch:40, Train_loss:0.00015, Test_loss:0.00063
learning rate = 0.00778 Epoch:41, Train_loss:0.00015, Test_loss:0.00060
learning rate = 0.00618 Epoch:42, Train_loss:0.00014, Test_loss:0.00063
learning rate = 0.00476 Epoch:43, Train_loss:0.00015, Test_loss:0.00063
learning rate = 0.00351 Epoch:44, Train_loss:0.00015, Test_loss:0.00063
learning rate = 0.00245 Epoch:45, Train_loss:0.00015, Test_loss:0.00062
learning rate = 0.00157 Epoch:46, Train_loss:0.00015, Test_loss:0.00062
learning rate = 0.00089 Epoch:47, Train_loss:0.00015, Test_loss:0.00063
learning rate = 0.00039 Epoch:48, Train_loss:0.00015, Test_loss:0.00063
learning rate = 0.00010 Epoch:49, Train_loss:0.00015, Test_loss:0.00063
learning rate = 0.00000 Epoch:50, Train_loss:0.00015, Test_loss:0.00063
```
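The printed learning rates follow the cosine annealing schedule, whose standard form (with $\eta_{\min} = 0$ here) is:

$$\eta_t = \eta_{\min} + \frac{1}{2}\left(\eta_{\max} - \eta_{\min}\right)\left(1 + \cos\frac{t\pi}{T_{\max}}\right)$$

With $\eta_{\max} = 0.1$ and $T_{\max} = 50$, epoch 1 gives $0.1 \times \left(1 + \cos(\pi/50)\right)/2 \approx 0.09990$, matching the log above.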
6. Results
1. Loss Curves
```python
import matplotlib.pyplot as plt
from datetime import datetime

current_time = datetime.now()  # get the current time

plt.figure(figsize=(5, 3), dpi=120)
plt.plot(train_loss, label='LSTM Training Loss')
plt.plot(test_loss, label='LSTM Validation Loss')
plt.title('Training and Validation Loss')
plt.xlabel(current_time)  # include a timestamp when checking in, otherwise the code screenshot is invalid
plt.legend()
plt.show()
```
*(figure: training and validation loss curves)*
The result looks good; the model has converged.
2. Prediction Comparison
```python
# Feed the test set into the model and invert the scaling
# (note: sc was last fitted on 'Soot 1' in the normalization loop,
#  so inverse_transform here uses that column's scale)
predicted_y_lstm = sc.inverse_transform(model(X_test).detach().numpy().reshape(-1, 1))
y_test_1 = sc.inverse_transform(y_test.reshape(-1, 1))

y_test_one = [i[0] for i in y_test_1]
predicted_y_lstm_one = [i[0] for i in predicted_y_lstm]

# Plot the real values against the predictions
plt.figure(figsize=(5, 3), dpi=120)
plt.plot(y_test_one[:2000], color='red', label='real_temp')
plt.plot(predicted_y_lstm_one[:2000], color='blue', label='prediction')

plt.title('Title')
plt.xlabel('X')
plt.ylabel('Y')
plt.legend()
plt.show()
```
*(figure: real temperature vs. LSTM prediction on the test set)*
3. R2 Evaluation
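For reference, the two metrics have the standard definitions:

$$\mathrm{RMSE}=\sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i-\hat{y}_i\right)^2},\qquad R^2=1-\frac{\sum_{i=1}^{n}\left(y_i-\hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i-\bar{y}\right)^2}$$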
```python
from sklearn import metrics

"""
RMSE: root mean squared error -----> the square root of the mean squared error
R2:   coefficient of determination; roughly, a statistic reflecting how well the model fits the data
"""
RMSE_lstm = metrics.mean_squared_error(predicted_y_lstm_one, y_test_1) ** 0.5
# note: sklearn's r2_score expects (y_true, y_pred); the argument order below follows the original run
R2_lstm = metrics.r2_score(predicted_y_lstm_one, y_test_1)

print('RMSE: %.5f' % RMSE_lstm)
print('R2: %.5f' % R2_lstm)
```
```
RMSE: 0.00001
R2: 0.82422
```
Both RMSE and R2 look decent, but the fit could still be improved.