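Introducing Parameters into the RMSNorm Class

Learnable parameters can be added to the RMSNorm class to increase the model's expressiveness and adaptability. Some common approaches follow: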

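Learnable scale parameter (Scale):
In addition to the normalization itself, a learnable scale can be applied to each feature dimension. Storing self.weight as a vector of shape (dim,), rather than a (dim, 1) matrix, lets it broadcast over the last axis of the input, so each feature dimension gets an independent scaling factor.
```
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        # One scale per feature; shape (dim,) broadcasts over the last axis
        self.weight = nn.Parameter(torch.ones(dim))

    def _norm(self, x):
        # Not defined in the original; assumed to be RMS normalization over the last axis
        return x * torch.rsqrt(x.pow(2).mean(-1, keepdim=True) + self.eps)

    def forward(self, x):
        normed = self._norm(x)
        return normed * self.weight
```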
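
As a quick check of the shapes involved, the sketch below assumes RMS normalization over the last axis and an input laid out as (batch, seq, dim). `rms_normalize` is a hypothetical helper written for this illustration, not part of the class above. It shows that a scale of shape `(dim,)` broadcasts cleanly, while a `(dim, 1)` scale fails against this input.

```python
import torch

def rms_normalize(x: torch.Tensor, eps: float = 1e-6) -> torch.Tensor:
    """Hypothetical helper: divide by the root mean square over the last axis."""
    return x * torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + eps)

dim = 8
x = torch.randn(2, 4, dim)          # assumed layout: (batch, seq, dim)
normed = rms_normalize(x)

scale_vec = torch.ones(dim)         # shape (dim,): one factor per feature
out = normed * scale_vec            # broadcasts over batch and seq
print(out.shape)                    # torch.Size([2, 4, 8])

scale_col = torch.ones(dim, 1)      # shape (dim, 1)
try:
    normed * scale_col
    col_ok = True
except RuntimeError:
    # (2, 4, 8) against (8, 1): the seq axis (4) cannot match 8
    col_ok = False
print(col_ok)                       # False
```

A `(dim, 1)` parameter only happens to work when the second-to-last axis of the input equals `dim` or 1, which is why the vector form is the safer choice.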

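Learnable shift parameter (Shift):
Besides scaling, a learnable shift can be introduced for each dimension. This is done by adding a second parameter, shaped like self.weight, that is added to the normalized output.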

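```
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        # Per-feature scale, initialized to 1
        self.scale = nn.Parameter(torch.ones(dim))
        # Per-feature shift, initialized to 0
        self.shift = nn.Parameter(torch.zeros(dim))

    def _norm(self, x):
        # Not defined in the original; assumed to be RMS normalization over the last axis
        return x * torch.rsqrt(x.pow(2).mean(-1, keepdim=True) + self.eps)

    def forward(self, x):
        normed = self._norm(x)
        return normed * self.scale + self.shift
```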
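
To see that both parameters are actually trained, the following sketch writes the scale-and-shift step as a standalone function and inspects the gradients. `rms_norm_affine` is a hypothetical name, and the normalization is assumed to be RMS over the last axis. With the scale at 1 and the shift at 0, the output equals the plain normalized input.

```python
import torch
import torch.nn as nn

def rms_norm_affine(x, scale, shift, eps=1e-6):
    """Hypothetical functional form: RMS-normalize the last axis, then scale and shift."""
    normed = x * torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + eps)
    return normed * scale + shift

dim = 4
scale = nn.Parameter(torch.ones(dim))    # identity scale at initialization
shift = nn.Parameter(torch.zeros(dim))   # zero shift at initialization

x = torch.randn(5, dim)
out = rms_norm_affine(x, scale, shift)

# Summing the output makes the gradients easy to read off:
# d(sum)/d(shift_j) is the number of rows,
# d(sum)/d(scale_j) is the column sum of the normalized input.
out.sum().backward()
print(shift.grad)          # tensor([5., 5., 5., 5.])
print(scale.grad.shape)    # torch.Size([4])
```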

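Learnable normalization parameters (Custom Normalization):
A custom normalization function can also carry learnable parameters, for example a parameter that controls the dynamic range of the normalized output.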

```
import torch
import torch.nn as nn

class CustomNorm(nn.Module):
    def __init__(self, num_features, eps=1e-5):
        super().__init__()
        # Learnable scale gamma, initialized to 1
        self.gamma = nn.Parameter(torch.ones(num_features))
        # Optional learnable shift beta, initialized to 0
        self.beta = nn.Parameter(torch.zeros(num_features))
        self.eps = eps

    def forward(self, x):
        # Mean and variance over the feature axis (dim 1)
        mean = x.mean(1, keepdim=True)
        var = x.var(1, keepdim=True)
        # Normalize
        x_norm = (x - mean) / torch.sqrt(var + self.eps)
        # Apply the learnable scale and shift
        x_out = self.gamma * x_norm + self.beta
        return x_out

# Example usage
num_features = 10  # assume the input has 10 features
custom_norm_layer = CustomNorm(num_features)

# A randomly generated input tensor
input_tensor = torch.randn(5, num_features)  # 5 samples, 10 features each

# Forward pass
output_tensor = custom_norm_layer(input_tensor)
print(output_tensor)
```
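
Because this layer subtracts the mean before dividing, it behaves more like layer normalization than RMSNorm. The sketch below compares its normalization step with `torch.nn.functional.layer_norm`. `standardize` is a hypothetical helper that mirrors the step as written, including `torch.var`'s default unbiased estimator. Since `layer_norm` uses the biased variance, the two results should differ by a factor of sqrt((n-1)/n), up to the small eps term.

```python
import math
import torch
import torch.nn.functional as F

def standardize(x: torch.Tensor, eps: float = 1e-5) -> torch.Tensor:
    """Hypothetical helper mirroring the normalization step of CustomNorm.forward."""
    mean = x.mean(1, keepdim=True)
    var = x.var(1, keepdim=True)          # unbiased by default (divides by n - 1)
    return (x - mean) / torch.sqrt(var + eps)

torch.manual_seed(0)
n = 10
x = torch.randn(5, n)

custom = standardize(x)
reference = F.layer_norm(x, (n,), eps=1e-5)   # biased variance (divides by n)

# Expected ratio between the two outputs, ignoring the tiny eps term
ratio = math.sqrt((n - 1) / n)
matches = torch.allclose(custom, reference * ratio, atol=1e-4)
print(matches)
```

If exact agreement with layer normalization is wanted, passing `unbiased=False` to `x.var` is one way to get it.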

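Learnable activation function parameter:
After normalization, an activation function can be applied whose parameters are themselves trainable. This can be done by taking an activation such as torch.tanh (or one from torch.nn.functional) and feeding it the normalized output multiplied by a learnable parameter.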

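```
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    def __init__(self, dim: int, eps: float = 1e-6):
        super().__init__()
        self.eps = eps
        # Learnable activation parameter
        self.activation_param = nn.Parameter(torch.ones(1))

    def _norm(self, x):
        # Not defined in the original; assumed to be RMS normalization over the last axis
        return x * torch.rsqrt(x.pow(2).mean(-1, keepdim=True) + self.eps)

    def forward(self, x):
        normed = self._norm(x)
        # tanh activation with a learnable input scale
        return torch.tanh(self.activation_param * normed)
```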
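
One property worth noting: after RMS normalization over `dim` features, every entry has magnitude at most sqrt(dim), so the tanh output is bounded by tanh(|alpha| * sqrt(dim)). The learnable parameter therefore sets how close to saturation the layer can get. A minimal sketch, assuming RMS normalization over the last axis and using the hypothetical name `alpha` for the parameter:

```python
import math
import torch
import torch.nn as nn

torch.manual_seed(0)
dim = 6
alpha = nn.Parameter(torch.ones(1))       # hypothetical name; initialized to 1

x = torch.randn(3, dim) * 10              # large inputs on purpose
normed = x * torch.rsqrt(x.pow(2).mean(dim=-1, keepdim=True) + 1e-6)
out = torch.tanh(alpha * normed)

# |normed| <= sqrt(dim), so |out| <= tanh(|alpha| * sqrt(dim))
bound = math.tanh(abs(alpha.item()) * math.sqrt(dim))
within_bound = out.abs().max().item() <= bound + 1e-6
print(within_bound)

# The parameter is trainable: backpropagation fills alpha.grad
out.sum().backward()
print(alpha.grad.shape)                   # torch.Size([1])
```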
