Linear Transformations in PyTorch: nn.Parameter vs nn.Linear

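self.weight = nn.Parameter(torch.randn(in_channels, out_channels)) and self.linear = nn.Linear(in_channels, out_channels) are not equivalent. Both can be used to implement a linear transformation (a fully connected layer), but they differ in how they are used and in how they are implemented internally.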

`nn.Parameter`

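When you create an nn.Parameter by hand, you define the weight matrix explicitly, and you are responsible for managing that parameter and for deciding how it takes part in the computation. For example: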

self.weight = nn.Parameter(torch.randn(in_channels, out_channels))

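Here self.weight is a learnable parameter that belongs to the model, and you multiply it with the input yourself during the forward pass. If the input is x, the output can be computed as: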

output = torch.matmul(x, self.weight)

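The formula here is $Y = X W$, where $X$ is the input matrix and $W$ is the weight matrix. If a bias term $b$ is also needed, it becomes $Y = X W + b$. In that case the bias self.bias has to be defined and initialized separately.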

Example 1: A custom linear layer

import torch
import torch.nn as nn


class CustomLinear(nn.Module):
    def __init__(self, in_channels, out_channels):
        super(CustomLinear, self).__init__()
        # Initialize the weight, shape (in_channels, out_channels)
        self.weight = nn.Parameter(torch.randn(in_channels, out_channels))
        # Initialize the bias, shape (out_channels,)
        self.bias = nn.Parameter(torch.randn(out_channels))

    def forward(self, x):
        # Linear transformation: Y = XW + b
        return torch.matmul(x, self.weight) + self.bias


# Create the custom linear layer
custom_linear = CustomLinear(in_channels=3, out_channels=2)

# Print the weight and bias
print("Weights:", custom_linear.weight)
print("Bias:", custom_linear.bias)

# Input data: 4 samples with 3 features each
input_data = torch.randn(4, 3)

# Forward pass
output = custom_linear(input_data)
print("Output:", output)

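In this example we build a custom linear layer, CustomLinear, that uses nn.Parameter to define both the weight and the bias. The forward method computes the linear transformation Y = XW + b by hand. The result is functionally similar to nn.Linear, although the torch.randn initialization used here differs from nn.Linear's default initialization, and it makes the manual management of the weight and bias explicit.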
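Wrapping a tensor in nn.Parameter also registers it with the module, so it appears in parameters() and an optimizer can update it. The following is a minimal sketch of that behavior; the class name Demo and the attribute plain are hypothetical names used only for illustration.

```python
import torch
import torch.nn as nn


class Demo(nn.Module):  # hypothetical module, for illustration only
    def __init__(self, in_channels, out_channels):
        super().__init__()
        # Wrapped in nn.Parameter: registered as a learnable parameter.
        self.weight = nn.Parameter(torch.randn(in_channels, out_channels))
        # Plain tensor attribute: not registered, so optimizers never see it.
        self.plain = torch.randn(in_channels, out_channels)


demo = Demo(3, 2)

# Only the nn.Parameter attribute is listed.
param_names = [name for name, _ in demo.named_parameters()]
print(param_names)                # ['weight']
print(demo.weight.requires_grad)  # True
print(demo.plain.requires_grad)   # False
```

This is why the custom layer above can be trained like any built-in layer: passing custom_linear.parameters() to an optimizer would include both weight and bias.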

`nn.Linear`

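nn.Linear, on the other hand, is a ready-made module that PyTorch provides for linear transformations. It holds the weight matrix and also handles the bias term automatically (unless bias=False is set explicitly). For example: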

self.linear = nn.Linear(in_channels, out_channels)

Calling self.linear(x) is equivalent to the following computation:

output = torch.matmul(x, self.linear.weight.t()) + self.linear.bias

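Here self.linear.weight has shape (out_channels, in_channels), not (in_channels, out_channels), so it must be transposed (with the t() method) before the matrix multiplication. The formula is therefore $Y = XW^T + b$, where $W^T$ is the transpose of the weight matrix.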
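The stored shape and the transposed formula can be checked directly. The sketch below compares the module's output with the manual computation; the seed value and the sizes 3, 2 and 4 are arbitrary choices for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed, only for reproducibility

linear = nn.Linear(3, 2)  # in_features=3, out_features=2
x = torch.randn(4, 3)     # 4 samples, 3 features each

# The weight is stored as (out_features, in_features).
print(linear.weight.shape)  # torch.Size([2, 3])
print(linear.bias.shape)    # torch.Size([2])

# Manual computation: Y = X W^T + b
manual = torch.matmul(x, linear.weight.t()) + linear.bias

# Compare with the module's own forward pass (up to floating-point tolerance).
matches = torch.allclose(linear(x), manual, atol=1e-6)
print(matches)
```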

Example 2: Using `nn.Linear`

import torch
import torch.nn as nn

# Define a linear layer
linear_layer = nn.Linear(in_features=3, out_features=2)

# Print the weight and bias
print("Weights:", linear_layer.weight)
print("Bias:", linear_layer.bias)

# Input data: 4 samples with 3 features each
input_data = torch.randn(4, 3)

# Forward pass
output = linear_layer(input_data)
print("Output:", output)

In this example we create a linear layer that takes input of shape [4, 3] and maps it to output of shape [4, 2]. linear_layer.weight and linear_layer.bias are initialized automatically.
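The bias=False option mentioned earlier can be sketched in the same way. With it, the layer has no bias parameter and the computation reduces to $Y = XW^T$; the variable names below are illustrative.

```python
import torch
import torch.nn as nn

# Linear layer without a bias term
no_bias = nn.Linear(in_features=3, out_features=2, bias=False)

x = torch.randn(4, 3)  # 4 samples, 3 features each
y = no_bias(x)

print(no_bias.bias)  # None
print(y.shape)       # torch.Size([4, 2])

# Without a bias, the output is just X W^T.
same = torch.allclose(y, torch.matmul(x, no_bias.weight.t()), atol=1e-6)
print(same)
```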

Comparing the formulas

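For a manually defined nn.Parameter, if the input is $X$ (shape $[N, in\_channels]$) and the weight is $W$ (shape $[in\_channels, out\_channels]$), the output $Y$ is computed as $Y = X W$ (or $Y = X W + b$ once a bias is added).
For nn.Linear, the input $X$ is the same (shape $[N, in\_channels]$), but the weight $W'$ has shape $[out\_channels, in\_channels]$, and the output $Y$ is computed as $Y = X(W')^T + b$.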

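As the comparison shows, both approaches implement a linear transformation, but nn.Linear stores its weight matrix transposed, with shape (out_channels, in_channels), as a detail of its internal implementation. nn.Linear also creates, initializes, and adds the bias term automatically, which makes it more convenient and concise than defining the parameters by hand.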
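The two formulas give the same result when $W' = W^T$. The following sketch, under the assumption that the weights are copied by hand, loads a manually defined $W$ and $b$ into an nn.Linear and checks that both computations agree; the seed and sizes are arbitrary.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # arbitrary seed, only for reproducibility

in_channels, out_channels = 3, 2

# Manual layout: W has shape (in_channels, out_channels)
W = torch.randn(in_channels, out_channels)
b = torch.randn(out_channels)

# nn.Linear layout: weight has shape (out_channels, in_channels)
linear = nn.Linear(in_channels, out_channels)
with torch.no_grad():
    linear.weight.copy_(W.t())  # W' = W^T
    linear.bias.copy_(b)

x = torch.randn(4, in_channels)

y_manual = torch.matmul(x, W) + b  # Y = X W + b
y_linear = linear(x)               # Y = X (W')^T + b

equal = torch.allclose(y_manual, y_linear, atol=1e-6)
print(equal)
```

The only difference between the two is the storage layout of the weight, so converting between them is a single transpose.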
