Table of Contents
- 📚 Convolutional Neural Networks in Practice: MNIST Handwritten Digit Recognition
- 🧠 4.1 Prerequisites
- 🌀 4.1.1 `torch.nn.Conv2d()`: 2D Convolution
- 📏 4.1.2 `nn.MaxPool2d()`: The Role of Pooling Layers
- 📥 4.2 Data Input and Processing
- 🗃️ Loading the MNIST Dataset
- 🔍 Verifying the Data Format
- 🚀 4.3 Building and Training the Convolutional Model
- 🧩 4.3.1 Network Architecture Design
- ⚡ 4.3.2 GPU Acceleration and Model Initialization
- 📉 4.3.3 Training and Evaluation Functions
- 🔁 4.3.4 The Training Loop
- 🧪 4.4 The Functional API
- 🔌 4.4.1 Importing the Functional Module
- 🔥 4.4.2 Applying Activation Functions
- 🧮 4.4.3 Implementing Pooling Operations
📚 Convolutional Neural Networks in Practice: MNIST Handwritten Digit Recognition
🧠 4.1 Prerequisites
🌀 4.1.1 `torch.nn.Conv2d()`: 2D Convolution
`torch.nn.Conv2d()` is PyTorch's core API for 2D convolution. Its key parameters are:
- `in_channels`: number of input channels (3 for color images, 1 for grayscale)
- `out_channels`: number of output channels (i.e., the number of convolution kernels)
- `kernel_size`: size of the convolution kernel (e.g., 3×3)
- `stride`: stride (default 1)
- `padding`: padding (default 0)
```python
import torch
from torch import nn

# Create random input data (batch_size=20, channels=3, height=256, width=256)
input = torch.randn(20, 3, 256, 256)

# Define a conv layer: 3 input channels → 16 output channels, 3×3 kernel, stride 1, padding 1
conv_layer = nn.Conv2d(3, 16, (3, 3), stride=1, padding=1)

# Apply the convolution
output = conv_layer(input)
output.shape  # torch.Size([20, 16, 256, 256])
```
💡 Output analysis: after the convolution, the feature map stays at 256×256 (because padding=1) while the channel count grows from 3 to 16.
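The preserved spatial size follows from the standard convolution output-size formula, floor((size + 2*padding - kernel) / stride) + 1; a minimal sketch (the helper `conv_out_size` is our own name, not a PyTorch function):

```python
def conv_out_size(size, kernel, stride=1, padding=0):
    # floor((size + 2*padding - kernel) / stride) + 1
    return (size + 2 * padding - kernel) // stride + 1

conv_out_size(256, kernel=3, stride=1, padding=1)  # 256, so the spatial size is preserved
```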
📏 4.1.2 `nn.MaxPool2d()`: The Role of Pooling Layers
Why pooling layers matter:
- 🎯 Larger receptive field: small convolution kernels have a limited view; pooling indirectly widens the region each later unit covers
- 🛡️ Less overfitting: fewer parameters downstream, stronger generalization
- ⚡ Faster computation: shrinking the feature maps cuts the cost of every later layer
Core parameter: `kernel_size` (the pooling window size)
```python
# Create a random image batch (64 RGB images of 256×256)
img_batch = torch.randn(64, 3, 256, 256)

# 2×2 max pooling
pool_out = torch.max_pool2d(img_batch, kernel_size=(2, 2))
pool_out.shape  # torch.Size([64, 3, 128, 128])
```
💡 Output analysis: pooling halves the image size (256 → 128) while the channel count stays the same, achieving feature down-sampling.
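The section title names `nn.MaxPool2d()`, the module counterpart of the functional call above; a minimal sketch reusing `img_batch` to show the two forms are interchangeable:

```python
# nn.MaxPool2d as a module; produces the same result as torch.max_pool2d above
pool = nn.MaxPool2d(kernel_size=2)
pool(img_batch).shape  # torch.Size([64, 3, 128, 128])
```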
📥 4.2 Data Input and Processing
🗃️ Loading the MNIST Dataset
Load the handwritten-digit dataset with PyTorch's built-in tools:
```python
import torchvision
from torchvision.transforms import ToTensor

# Download and load the training and test sets
train_ds = torchvision.datasets.MNIST("data/", train=True, transform=ToTensor(), download=True)
test_ds = torchvision.datasets.MNIST("data/", train=False, transform=ToTensor(), download=True)

# Create the data loaders (batch_size=64)
train_dl = torch.utils.data.DataLoader(train_ds, batch_size=64, shuffle=True)
test_dl = torch.utils.data.DataLoader(test_ds, batch_size=64)
```
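MNIST ships with 60,000 training images and 10,000 test images, which you can confirm directly:

```python
len(train_ds), len(test_ds)  # (60000, 10000)
```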
🔍 Verifying the Data Format
```python
imgs, labels = next(iter(train_dl))
print(imgs.shape, labels.shape)  # torch.Size([64, 1, 28, 28]) torch.Size([64])
```
✅ Data format: matches the input a convolutional network expects (batch_size, channels, height, width)
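Since `ToTensor()` also rescales pixel values from [0, 255] to [0.0, 1.0], a quick range check is worthwhile:

```python
imgs.min(), imgs.max()  # values fall inside [0.0, 1.0] because of ToTensor()
```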
🚀 4.3 Building and Training the Convolutional Model
🧩 4.3.1 Network Architecture Design
A LeNet-style CNN model:
```python
class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        # Conv layer 1: 1 → 6 channels, 5×5 kernel
        self.conv1 = nn.Conv2d(1, 6, 5)
        # Conv layer 2: 6 → 16 channels, 5×5 kernel
        self.conv2 = nn.Conv2d(6, 16, 5)
        # Fully connected layer 1: 256 → 256 units
        self.linear1 = nn.Linear(16 * 4 * 4, 256)
        # Output layer: 256 → 10 units (10 digit classes)
        self.linear2 = nn.Linear(256, 10)

    def forward(self, x):
        # Conv → ReLU → pool (28×28 → 12×12)
        x = torch.max_pool2d(torch.relu(self.conv1(x)), (2, 2))
        # Conv → ReLU → pool (12×12 → 4×4)
        x = torch.max_pool2d(torch.relu(self.conv2(x)), (2, 2))
        # Flatten the feature maps
        x = x.view(-1, 16 * 4 * 4)
        # Fully connected layer → ReLU
        x = torch.relu(self.linear1(x))
        # Output layer
        return self.linear2(x)
```
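A quick sanity check of the shape flow annotated in the comments, using a dummy batch:

```python
m = Model()
dummy = torch.randn(1, 1, 28, 28)  # one fake 28×28 grayscale image
m(dummy).shape                     # torch.Size([1, 10]), one logit per digit class
```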
⚡ 4.3.2 GPU Acceleration and Model Initialization
```python
# Automatically use the GPU when one is available
device = "cuda" if torch.cuda.is_available() else "cpu"
model = Model().to(device)
model
```

```
Model(
  (conv1): Conv2d(1, 6, kernel_size=(5, 5), stride=(1, 1))
  (conv2): Conv2d(6, 16, kernel_size=(5, 5), stride=(1, 1))
  (linear1): Linear(in_features=256, out_features=256, bias=True)
  (linear2): Linear(in_features=256, out_features=10, bias=True)
)
```
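As a quick size check, you can count the trainable parameters; for the architecture above the total works out to 70,934:

```python
sum(p.numel() for p in model.parameters())  # 70934
```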
📉 4.3.3 Training and Evaluation Functions
```python
# Training function
def train(dataloader, model, loss_fn, optimizer):
    model.train()
    total_samples = len(dataloader.dataset)
    total_batches = len(dataloader)
    train_loss, correct = 0, 0
    for X, y in dataloader:
        X, y = X.to(device), y.to(device)
        # Forward pass
        pred = model(X)
        loss = loss_fn(pred, y)
        # Backward pass
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        # Accumulate metrics
        with torch.no_grad():
            correct += (pred.argmax(1) == y).sum().item()
            train_loss += loss.item()
    return train_loss / total_batches, correct / total_samples


# Test function
def test(dataloader, model):
    model.eval()
    total_samples = len(dataloader.dataset)
    total_batches = len(dataloader)
    test_loss, correct = 0, 0
    with torch.no_grad():
        for X, y in dataloader:
            X, y = X.to(device), y.to(device)
            pred = model(X)
            test_loss += loss_fn(pred, y).item()
            correct += (pred.argmax(1) == y).sum().item()
    return test_loss / total_batches, correct / total_samples
```
🔁 4.3.4 The Training Loop
```python
# Hyperparameters
optimizer = torch.optim.Adam(model.parameters(), lr=0.005)
loss_fn = nn.CrossEntropyLoss()
epochs = 20

# Training loop with logging
for epoch in range(epochs):
    train_loss, train_acc = train(train_dl, model, loss_fn, optimizer)
    test_loss, test_acc = test(test_dl, model)
    # Print training progress
    print(f"epoch:{epoch:2d}, train_loss:{train_loss:.5f}, "
          f"train_acc:{train_acc*100:.1f}%, test_loss:{test_loss:.5f}, "
          f"test_acc:{test_acc*100:.1f}%")
print("Done")
```
Training output:

```
epoch: 0, train_loss:0.24543, train_acc:92.8%, test_loss:0.07341, test_acc:97.7%
epoch: 1, train_loss:0.06720, train_acc:97.9%, test_loss:0.04788, test_acc:98.4%
...
epoch:19, train_loss:0.00509, train_acc:99.8%, test_loss:0.04585, test_acc:99.2%
Done
```
🎯 Performance summary: the model reaches a test accuracy of **99.2%** within 20 epochs, well ahead of a fully connected network.
🧪 4.4 The Functional API
🔌 4.4.1 Importing the Functional Module
```python
import torch.nn.functional as F  # the conventional, community-standard import
```
🔥 4.4.2 Applying Activation Functions
```python
# Traditional form
output = torch.relu(input)

# Functional-API form
output = F.relu(input)
```
🧮 4.4.3 Implementing Pooling Operations
```python
# Traditional form
pooled = torch.max_pool2d(input, kernel_size=2)

# Functional-API form
pooled = F.max_pool2d(input, kernel_size=2)
```
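Putting this together, the `forward` method from section 4.3.1 could be rewritten entirely with the functional API; a sketch of the same computation, not a change the chapter itself makes:

```python
def forward(self, x):
    # Identical to section 4.3.1, expressed via torch.nn.functional
    x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
    x = F.max_pool2d(F.relu(self.conv2(x)), (2, 2))
    x = x.view(-1, 16 * 4 * 4)
    x = F.relu(self.linear1(x))
    return self.linear2(x)
```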