[Natural Language Processing] PyTorch Overview: What a Tensor Is and Basic Tensor Operations

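PyTorch overview

PyTorch is an open-source deep learning framework developed and maintained by Facebook's AI research team. It was open-sourced on GitHub in 2017 and is now widely used in both academia and industry.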

What PyTorch can do

GPU acceleration
Automatic differentiation
Common neural network layers

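PyTorch basics

Kinds of quantities

Scalar: a single number, such as 1, 2, or 3
Vector: a one-dimensional table, [1,2,3]
Matrix: a two-dimensional table, [(1,2),(3,4)]

Vectors and matrices can describe objects with at most two dimensions (H*W). Many real-world things have more dimensions, and tensors are used to represent them.
The three kinds of quantities above can also be treated as special cases of tensors.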

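Basic concept of a tensor

The tensor is the basic unit in PyTorch and a core building block of deep learning frameworks.
A tensor can first be thought of as a container that holds the data to be computed on.
In form, a tensor is essentially the same as a NumPy array, vector, or matrix, but it is designed with GPUs in mind and can run on a GPU to speed up computation.
In PyTorch, the Tensor is the most basic unit of computation. Like NumPy's ndarray, it represents a multidimensional array. The difference is that a PyTorch Tensor can live on a GPU, while a NumPy ndarray can only live on the CPU, so tensor computations can be much faster when a GPU is available.
In one sentence: a tensor is simply multidimensional data that can run on a GPU.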

Samples and model --> Y=WX+B
X: the samples
W, B: the variables (the model's learnable parameters)
Y: the labels
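
To make Y=WX+B concrete, here is a minimal sketch with tensors. The sizes and values (3 input features, 2 outputs, 4 samples stored as columns) are made up for illustration and do not come from the original post.

```python
import torch

# Hypothetical numbers for illustration: 3 input features, 2 outputs, 4 samples.
W = torch.tensor([[1.0, 0.0, 2.0],
                  [0.0, 1.0, 1.0]])   # weights, shape (2, 3)
B = torch.tensor([[0.5], [-0.5]])     # bias, shape (2, 1)
X = torch.ones(3, 4)                  # 4 samples stored as columns, shape (3, 4)

Y = W @ X + B                         # the bias is broadcast across the 4 columns
print(Y.shape)                        # torch.Size([2, 4])
print(Y)                              # every column is [3.5, 1.5]
```

In training, W and B would be the quantities that get adjusted so that Y matches the labels.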

Tensor data types

Data type	dtype	Legacy constructors (type)
32-bit floating point	torch.float32 or torch.float	torch.*.FloatTensor
64-bit floating point	torch.float64 or torch.double	torch.*.DoubleTensor
64-bit complex	torch.complex64 or torch.cfloat
128-bit complex	torch.complex128 or torch.cdouble
16-bit floating point	torch.float16 or torch.half	torch.*.HalfTensor
16-bit floating point	torch.bfloat16	torch.*.BFloat16Tensor
8-bit integer (unsigned)	torch.uint8	torch.*.ByteTensor
8-bit integer (signed)	torch.int8	torch.*.CharTensor
16-bit integer (signed)	torch.int16 or torch.short	torch.*.ShortTensor
32-bit integer (signed)	torch.int32 or torch.int	torch.*.IntTensor
64-bit integer (signed)	torch.int64 or torch.long	torch.*.LongTensor
Boolean	torch.bool	torch.*.BoolTensor

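Creating tensors

Function	Purpose
Tensor(*size)	Basic constructor
Tensor(data)	Similar to np.array
ones(*size)	Tensor filled with ones
zeros(*size)	Tensor filled with zeros
eye(n, m)	Ones on the diagonal, zeros elsewhere (m defaults to n)
arange(s,e,step)	Arithmetic sequence from s to e with step size step (e itself is excluded)
linspace(s,e,steps)	steps evenly spaced values from s to e (both endpoints included)
rand/randn(*size)	Uniform distribution on [0, 1) / standard normal distribution
normal(mean,std)/uniform_(from,to)	Normal distribution / uniform distribution
randperm(m)	Random permutation of the integers 0 to m-1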
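
A few of the functions in the table, shown as a short sketch. The sizes and ranges are arbitrary choices for illustration.

```python
import torch

print(torch.zeros(2, 3))          # 2x3 tensor of zeros
print(torch.eye(3))               # 3x3 identity matrix
print(torch.arange(1, 10, 3))     # tensor([1, 4, 7]); the end value 10 is excluded
print(torch.linspace(0, 1, 5))    # 5 values from 0 to 1, both ends included
print(torch.randperm(5))          # a random ordering of 0..4

# uniform_ is an in-place method, so it needs an existing tensor to fill.
u = torch.empty(2, 3).uniform_(-1, 1)
print(u)                          # values drawn uniformly from [-1, 1)
```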

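Tensor initialization methods

1. Directly from data. Tensors can be created directly from data, and the data type is inferred automatically.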


import torch

data = [[1, 2],[3, 4]]
x_data = torch.tensor(data)
x_data

tensor([[1, 2],[3, 4]])

x_data.dtype

torch.int64

2. From a NumPy array (and vice versa)

import numpy as np
np_array = np.array(data)
x_np = torch.from_numpy(np_array)

3. From another tensor [the new tensor keeps the properties (shape, dtype) of the argument tensor unless they are explicitly overridden]

x_ones = torch.ones_like(x_data) # keeps the properties of x_data
print(f"Ones Tensor: \n {x_ones} \n")
# x_data has dtype int64, and rand_like would keep that dtype by default.
# Uniform random values in [0, 1) need a floating-point dtype, so the dtype is overridden explicitly.
x_rand = torch.rand_like(x_data, dtype=torch.float) # overrides the dtype of x_data
print(f"Random Tensor: \n {x_rand} \n")

Ones Tensor:
 tensor([[1, 1],[1, 1]])

Random Tensor:
 tensor([[0.3156, 0.5076],[0.8555, 0.4440]])

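4. With random or constant values

shape is a tuple of tensor dimensions. In the functions below, it determines the dimensions of the output tensor.

shape = (2,3,) # a tuple: 2 rows and 3 columns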
rand_tensor = torch.rand(shape)
ones_tensor = torch.ones(shape)
zeros_tensor = torch.zeros(shape)
print(f"Random Tensor: \n {rand_tensor} \n")
print(f"Ones Tensor: \n {ones_tensor} \n")
print(f"Zeros Tensor: \n {zeros_tensor}")

Random Tensor:
 tensor([[0.5733, 0.8237, 0.1398],[0.9530, 0.9231, 0.2764]])

Ones Tensor:
 tensor([[1., 1., 1.],[1., 1., 1.]])

Zeros Tensor:
 tensor([[0., 0., 0.],[0., 0., 0.]])

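5. Other creation methods

Build from an existing tensor, but fill it with new values

m = torch.ones(5,3, dtype=torch.double)
n = torch.rand_like(m, dtype=torch.float)
# Get the size of the tensor
print(m.size()) # torch.Size([5,3])
# Uniform distribution on [0, 1)
torch.rand(5,3)
# Standard normal distribution
torch.randn(5,3)
# Normal distribution with the given mean and standard deviation
torch.normal(mean=.0,std=1.0,size=(5,3))
# Linearly spaced vector: a 1-D tensor of steps points evenly spaced over [start, end] (an arithmetic sequence)
torch.linspace(start=1,end=10,steps=20)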

torch.Size([5, 3])
tensor([ 1.0000,  1.4737,  1.9474,  2.4211,  2.8947,  3.3684,  3.8421,  4.3158,4.7895,  5.2632,  5.7368,  6.2105,  6.6842,  7.1579,  7.6316,  8.1053,8.5789,  9.0526,  9.5263, 10.0000])

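Tensor attributes

Every Tensor has three attributes: torch.dtype, torch.device, and torch.layout
torch.device identifies the device (CPU or GPU) on which a torch.Tensor is stored after creation
torch.layout is an object that represents the memory layout of a torch.Tensor
Tensor attributes describe a tensor's shape, data type, and the device it is stored on.
From an object-oriented point of view, a tensor can be seen as an object with attributes and methods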

tensor = torch.rand(3,4)
print(f"Shape of tensor: {tensor.shape}")
print(f"Datatype of tensor: {tensor.dtype}")
print(f"Device tensor is stored on: {tensor.device}")

Shape of tensor: torch.Size([3, 4])
Datatype of tensor: torch.float32
Device tensor is stored on: cpu
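
The example above does not print the layout attribute. As a small supplementary sketch (the variable names t and t64 are introduced here for illustration), the layout of an ordinary dense tensor can be inspected the same way, and .to() can also be used to obtain a copy with a different dtype:

```python
import torch

t = torch.rand(3, 4)
print(t.layout)              # torch.strided, the layout of ordinary dense tensors

t64 = t.to(torch.float64)    # returns a new tensor with the requested dtype
print(t64.dtype)             # torch.float64
print(t.dtype)               # the original tensor is unchanged
```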

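Tensor operations

The official site documents more than 100 tensor operations, including arithmetic, linear algebra, matrix manipulation (transposing, indexing, slicing), sampling, and more.
Each of these operations can run on the GPU (usually faster than on the CPU).

By default, tensors are created on the CPU.

Tensors can be moved to the GPU explicitly with the .to() method (when a GPU is available).
Note that copying large tensors across devices can be expensive in both time and memory!

# Run tensor operations on the GPU
# We move our tensor to the GPU if available
if torch.cuda.is_available():
    tensor = tensor.to('cuda')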
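
A common way to keep the same script runnable with or without a GPU is to pick the device once and pass it around. The sketch below shows that pattern; the variable names are illustrative, not from the original post.

```python
import torch

# Pick the GPU when one is available, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

a = torch.rand(3, 4, device=device)   # created directly on the chosen device
b = torch.ones(3, 4).to(device)       # created on the CPU, then moved if the device differs

c = a + b                             # both operands must live on the same device
print(c.device)
print(c.cpu().device)                 # .cpu() brings the result back to host memory
```

Creating a tensor directly on the target device avoids the extra host-to-device copy that the note above warns about.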

Tensor indexing and slicing

tensor = torch.ones(4, 4) # create a 4x4 tensor
print('First row: ', tensor[0]) # print the first row
print('First column: ', tensor[:, 0]) # print the first column
print('Last column:', tensor[..., -1]) # print the last column
tensor[:,1] = 0 # set the second column to 0
print(tensor)

First row:  tensor([1., 1., 1., 1.])
First column:  tensor([1., 1., 1., 1.])
Last column: tensor([1., 1., 1., 1.])
tensor([[1., 0., 1., 1.],[1., 0., 1., 1.],[1., 0., 1., 1.],[1., 0., 1., 1.]])
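
Because every element above is 1, it is hard to see which elements each index picks. The following sketch repeats the idea on a tensor with distinct values and adds a range slice, a strided slice, and a boolean mask; it is a supplementary illustration, not part of the original example.

```python
import torch

t = torch.arange(12).reshape(3, 4)
# t is [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]

print(t[1:, :2])    # rows 1 and 2, columns 0 and 1 -> [[4, 5], [8, 9]]
print(t[:, ::2])    # every second column -> [[0, 2], [4, 6], [8, 10]]
print(t[t > 6])     # boolean mask, returns a 1-D tensor -> [7, 8, 9, 10, 11]
```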

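Tensor concatenation

torch.cat concatenates a sequence of tensors along a given dimension. A similar function is torch.stack

Function	Meaning	Signature
torch.cat	Concatenates the given sequence of tensors along an existing dimension	torch.cat(tensors, dim=0, *, out=None)
torch.stack	Concatenates a sequence of tensors along a new dimension	torch.stack(tensors, dim=0, *, out=None)

print(tensor) # print the original tensor
t1 = torch.cat([tensor, tensor, tensor], dim=1) # concatenate along the columns
# The dim argument decides which dimension the concatenation runs along:
#   - dim=-1 concatenates along the last dimension
#   - dim=0 concatenates along the first dimension (adds rows)
#   - dim=1 concatenates along the second dimension (adds columns)
#   - dim=2 concatenates along the third dimension (depth, for tensors with at least three dimensions), and so on
print(t1)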

tensor([[1., 0., 1., 1.],[1., 0., 1., 1.],[1., 0., 1., 1.],[1., 0., 1., 1.]])
tensor([[1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],[1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],[1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],[1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.]])
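
The difference between torch.cat and torch.stack is easiest to see in the result shapes. A minimal sketch with two illustrative 2x3 tensors:

```python
import torch

a = torch.zeros(2, 3)
b = torch.ones(2, 3)

# cat joins along an existing dimension, so the number of dimensions stays 2.
print(torch.cat([a, b], dim=0).shape)    # torch.Size([4, 3])
print(torch.cat([a, b], dim=1).shape)    # torch.Size([2, 6])

# stack inserts a new dimension, so the result has 3 dimensions.
print(torch.stack([a, b], dim=0).shape)  # torch.Size([2, 2, 3])
print(torch.stack([a, b], dim=2).shape)  # torch.Size([2, 3, 2])
```

For stack all inputs must have exactly the same shape, while for cat they only need to agree on the dimensions that are not being joined.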

Arithmetic operations

Addition

# Addition
t1 = torch.tensor([[1,2],[3,4]])
print(t1)
t2 = torch.tensor([[5,6],[7,6]])
print(t2)
t3 = t1 + t2
print(t3)
t4 = torch.add(t1, t2)
print(t4)
print(t1.add(t2))
print(t1)
#t1.add_(t2) # would change the value of t1

tensor([[1, 2],[3, 4]])
tensor([[5, 6],[7, 6]])
tensor([[ 6,  8],[10, 10]])
tensor([[ 6,  8],[10, 10]])
tensor([[ 6,  8],[10, 10]])
tensor([[1, 2],[3, 4]])

Subtraction

# Subtraction
print(t1 - t2)
print(torch.sub(t1, t2))
print(t1.sub(t2))
print(t1)

tensor([[-4, -4],[-4, -2]])
tensor([[-4, -4],[-4, -2]])
tensor([[-4, -4],[-4, -2]])
tensor([[1, 2],[3, 4]])

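Multiplication

Three ways to compute the matrix product of two tensors. y1, y2, and y3 end up with the same value
Matrix multiplication of 2-D tensors can be written with torch.mm(), torch.matmul(), or the @ operator; torch.mm() only accepts 2-D tensors, while torch.matmul() and @ also handle higher dimensions

For higher-dimensional tensors (dim>2), matrix multiplication is defined over the last two dimensions only. The leading dimensions act like batch indices and must match (or be broadcastable), and the operation is done with torch.matmul() or the equivalent @ operator

print(tensor) # print the original tensor
y1 = tensor @ tensor.T
print(y1) # equivalent to tensor.matmul(tensor.T)
y2 = tensor.matmul(tensor.T)
print(y2) # equivalent to tensor @ tensor.T
y3 = torch.rand_like(tensor) # random tensor with the same shape as tensor (pre-allocates y3)
torch.matmul(tensor, tensor.T, out=y3) # write the result into y3
print(y3)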

tensor([[1., 0., 1., 1.],[1., 0., 1., 1.],[1., 0., 1., 1.],[1., 0., 1., 1.]])
tensor([[3., 3., 3., 3.],[3., 3., 3., 3.],[3., 3., 3., 3.],[3., 3., 3., 3.]])
tensor([[3., 3., 3., 3.],[3., 3., 3., 3.],[3., 3., 3., 3.],[3., 3., 3., 3.]])
tensor([[3., 3., 3., 3.],[3., 3., 3., 3.],[3., 3., 3., 3.],[3., 3., 3., 3.]])

# Matrix multiplication of higher-dimensional tensors
t5 = torch.ones(1,2,3,4)
print(t5)
t6 = torch.ones(1,2,4,3)
print(t6)
print(t5.matmul(t6)) # result shape: torch.Size([1, 2, 3, 3])
print(torch.matmul(t5, t6)) # result shape: torch.Size([1, 2, 3, 3])

tensor([[[[1., 1., 1., 1.],[1., 1., 1., 1.],[1., 1., 1., 1.]],[[1., 1., 1., 1.],[1., 1., 1., 1.],[1., 1., 1., 1.]]]])
tensor([[[[1., 1., 1.],[1., 1., 1.],[1., 1., 1.],[1., 1., 1.]],[[1., 1., 1.],[1., 1., 1.],[1., 1., 1.],[1., 1., 1.]]]])
tensor([[[[4., 4., 4.],[4., 4., 4.],[4., 4., 4.]],[[4., 4., 4.],[4., 4., 4.],[4., 4., 4.]]]])
tensor([[[[4., 4., 4.],[4., 4., 4.],[4., 4., 4.]],[[4., 4., 4.],[4., 4., 4.],[4., 4., 4.]]]])
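
The batch dimensions in a higher-dimensional matrix product do not have to be identical as long as they can be broadcast. The sketch below uses made-up shapes to show this, and also shows that torch.mm refuses input that is not 2-D.

```python
import torch

a = torch.ones(2, 5, 3, 4)
b = torch.ones(4, 6)          # a plain matrix
c = torch.ones(5, 4, 6)       # a batch of 5 matrices

print((a @ b).shape)              # b is reused for every batch: torch.Size([2, 5, 3, 6])
print(torch.matmul(a, c).shape)   # batch dims (2, 5) and (5,) broadcast: torch.Size([2, 5, 3, 6])

mm_failed = False
try:
    torch.mm(a, c)                # torch.mm only accepts 2-D tensors
except RuntimeError:
    mm_failed = True
print("torch.mm rejected the 4-D input:", mm_failed)
```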

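Three ways to compute the element-wise product. z1, z2, and z3 end up with the same value
Hadamard product (element-wise: corresponding elements are multiplied)

print(tensor) # print the original tensor
z1 = tensor * tensor # element-wise product
print(z1) # equivalent to tensor.mul(tensor)
z2 = tensor.mul(tensor) # element-wise product
print(z2) # equivalent to tensor * tensor
z3 = torch.rand_like(tensor) # random tensor with the same shape as tensor (pre-allocates z3)
torch.mul(tensor, tensor, out=z3) # write the result into z3
print(z3)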

tensor([[1., 0., 1., 1.],[1., 0., 1., 1.],[1., 0., 1., 1.],[1., 0., 1., 1.]])
tensor([[1., 0., 1., 1.],[1., 0., 1., 1.],[1., 0., 1., 1.],[1., 0., 1., 1.]])
tensor([[1., 0., 1., 1.],[1., 0., 1., 1.],[1., 0., 1., 1.],[1., 0., 1., 1.]])
tensor([[1., 0., 1., 1.],[1., 0., 1., 1.],[1., 0., 1., 1.],[1., 0., 1., 1.]])

Division

# Division
print(t1 / t2)
print(torch.div(t1, t2))
print(t1.div(t2))
print(t1)

tensor([[0.2000, 0.3333],[0.4286, 0.6667]])
tensor([[0.2000, 0.3333],[0.4286, 0.6667]])
tensor([[0.2000, 0.3333],[0.4286, 0.6667]])
tensor([[1, 2],[3, 4]])
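
Note in the output above that dividing two integer tensors gives a floating-point result. When integer division is wanted instead, torch.div takes a rounding_mode argument. A small sketch with the same t1 and t2 values (redefined here so the snippet runs on its own):

```python
import torch

t1 = torch.tensor([[1, 2], [3, 4]])
t2 = torch.tensor([[5, 6], [7, 6]])

q = t1 / t2
print(q.dtype)                                  # torch.float32: true division promotes to float

f = torch.div(t2, t1, rounding_mode='floor')    # floor division keeps the integer dtype
print(f)                                        # tensor([[5, 3], [2, 1]])
print(f.dtype)                                  # torch.int64
```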

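Power

Use torch.pow(tensor, 2) or the ** operator
Exponential function (e raised to each element): torch.exp(tensor)

print(t1)
print(torch.pow(t1, 2)) # square each element
print(t1.pow(2)) # square each element
print(t1**2) # square each element
#print(t1.pow_(2)) # square each element in place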

tensor([[1, 2],[3, 4]])
tensor([[ 1,  4],[ 9, 16]])
tensor([[ 1,  4],[ 9, 16]])
tensor([[ 1,  4],[ 9, 16]])

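Square root

tensor.sqrt()  # returns a new tensor
tensor.sqrt_() # in place, modifies tensor

Logarithm

torch.log2(tensor)  # base 2
torch.log10(tensor) # base 10
torch.log(tensor)   # natural logarithm
tensor.log_()       # natural logarithm, in place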
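
The square-root and logarithm calls are listed without output, so here is a short runnable sketch with values chosen to give round results. It also shows one caveat: the in-place variants cannot be used on integer tensors, because the floating-point result does not fit the integer dtype.

```python
import torch

t = torch.tensor([1.0, 4.0, 16.0])
print(t.sqrt())                    # tensor([1., 2., 4.])
print(torch.log2(t))               # tensor([0., 2., 4.])
print(torch.log(torch.exp(t)))     # recovers t up to rounding

t.sqrt_()                          # in place: t itself is now [1., 2., 4.]
print(t)

ints = torch.tensor([1, 4, 16])
print(ints.sqrt())                 # works: the result is promoted to float
int_inplace_failed = False
try:
    ints.sqrt_()                   # fails: a float result cannot be stored in an int64 tensor
except RuntimeError:
    int_inplace_failed = True
print("in-place sqrt on int64 failed:", int_inplace_failed)
```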

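Single-element tensors

If you have a one-element tensor, for example after aggregating all values of a tensor into a single value, you can convert it to a Python number with item()

print(tensor) # print the original tensor
agg = tensor.sum() # sum all elements
print(agg)
agg_item = agg.item() # convert the tensor's value to a Python number
print(agg_item, type(agg_item)) # print the value and type of agg_item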

tensor([[1., 0., 1., 1.],[1., 0., 1., 1.],[1., 0., 1., 1.],[1., 0., 1., 1.]])
tensor(12.)
12.0 <class 'float'>

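In-place operations

Operations that store the result in the operand itself are called in-place operations. The meaning is the same as that of the inplace parameter in pandas. In PyTorch they are denoted by a trailing underscore _ in the function name.
For example, x.copy_(y) and x.t_() change the value of x

In-place operations save some memory, but they can cause problems when computing derivatives because history is lost immediately. Their use is therefore discouraged.

x = torch.tensor([(1, 2, 3), (4, 5, 6), (7, 8, 9)])
print(x)
x.add_(2) # add 2 to every element
print(x) # print the value of x
# Note: any operation ending in `_` replaces the original tensor with the result. For example, x.copy_(y) and x.t_() change `x`.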

tensor([[1, 2, 3],[4, 5, 6],[7, 8, 9]])
tensor([[ 3,  4,  5],[ 6,  7,  8],[ 9, 10, 11]])
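
One concrete way the derivative problem shows up: autograd refuses an in-place update on a leaf tensor that requires gradients. The sketch below is an illustration of that behavior (the tensor w is hypothetical), together with the usual workaround of doing the update under torch.no_grad().

```python
import torch

w = torch.ones(3, requires_grad=True)   # a leaf tensor tracked by autograd

inplace_failed = False
try:
    w.add_(1)                           # in-place update of a leaf that requires grad
except RuntimeError as err:
    inplace_failed = True
    print("autograd refused the in-place op:", err)

# Inside torch.no_grad() the same update is allowed, because no history is recorded.
with torch.no_grad():
    w.add_(1)
print(w)                                # tensor([2., 2., 2.], requires_grad=True)
```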

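Converting to and from NumPy

A tensor on the CPU and a NumPy array share the same underlying memory, so changing one also changes the other

Tensor to NumPy array

t1 = torch.ones(6) # create a tensor
print(f"t1:{t1}") # the f prefix formats the tensor's contents into the string
n1 = t1.numpy() # convert the tensor to a NumPy array
print(f"n1:{n1}") # print the NumPy array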

t1:tensor([1., 1., 1., 1., 1., 1.])
n1:[1. 1. 1. 1. 1. 1.]

t1.add_(1) # add 1 to every element
print(f"t1:{t1}") # print the tensor
print(f"n1:{n1}") # print the NumPy array

t1:tensor([2., 2., 2., 2., 2., 2.])
n1:[2. 2. 2. 2. 2. 2.]

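NumPy array to tensor

n2 = np.ones(5) # create a NumPy array
print(f"n2:{n2}") # print the NumPy array
t2 = torch.from_numpy(n2) # convert the NumPy array to a tensor
print(f"t2:{t2}") # print the tensor
# The NumPy array and the PyTorch tensor share their underlying memory, so a change to one also changes the other.
np.add(n2,1,out=n2) # add 1 to every element
print(f"t2:{t2}") # print the tensor
print(f"n2:{n2}") # print the NumPy array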

n2:[1. 1. 1. 1. 1.]
t2:tensor([1., 1., 1., 1., 1.], dtype=torch.float64)
t2:tensor([2., 2., 2., 2., 2.], dtype=torch.float64)
n2:[2. 2. 2. 2. 2.]
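
When the shared memory is not wanted, an explicit copy breaks the link. A minimal sketch, with variable names chosen for illustration: torch.tensor() copies the array's data, and clone() copies a tensor before it is converted.

```python
import numpy as np
import torch

n = np.ones(3)
shared = torch.from_numpy(n)     # shares memory with n
copied = torch.tensor(n)         # torch.tensor copies the data

n += 1                           # in-place change to the NumPy array
print(shared)                    # follows the change: all 2s
print(copied)                    # unaffected: still all 1s

t = torch.ones(3)
independent = t.clone().numpy()  # clone first, so the array has its own memory
t.add_(1)
print(independent)               # still all 1s
```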

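Computation graphs

Before going further with PyTorch, one concept needs to be understood: the computation graph. All deep learning frameworks rely on computation graphs to compute the gradients used by gradient descent and other optimizers.
Creating and using a computation graph usually involves two parts:

The user builds the forward-pass graph
The framework handles the backward pass (computing the gradients used for updates)

From simple models to complex ones, both PyTorch and TensorFlow use computation graphs to do their work.
However, the graphs used by the two frameworks differ:
TensorFlow 1.x uses static computation graphs, while TensorFlow 2.x (by default) and PyTorch use dynamic computation graphs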
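
The two parts described above (the user writes the forward pass, the framework handles the backward pass) look roughly like this in PyTorch. The values are made up for illustration.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0])                        # input data, no gradient needed
w = torch.tensor([0.5, -1.0, 2.0], requires_grad=True)   # parameters tracked by autograd
b = torch.tensor(0.1, requires_grad=True)

# Part 1: the user writes the forward computation; the graph is recorded along the way.
y = (w * x).sum() + b

# Part 2: the framework walks the graph backwards and fills in the gradients.
y.backward()
print(w.grad)   # dy/dw equals x: tensor([1., 2., 3.])
print(b.grad)   # dy/db equals 1: tensor(1.)
```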

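Static computation graphs

Build the graph first, then run it; this allows the compiler to optimize
There are usually two phases.

Define an architecture (basic control-flow constructs such as loops and conditionals can be used)
Run a batch of data through it to train the model and to perform inference

Pros: the graph can be heavily optimized and scheduled offline, so execution is relatively fast.
Cons: hard to debug, and handling structured or variable-size data in code is relatively complicated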

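Dynamic computation graphs

The program runs as soon as it is written
The graph is defined implicitly (built dynamically) while the forward computation executes.

Pros: flexible, minimally intrusive, allows dynamic construction and evaluation
Cons: hard to optimize

Comparing the two, dynamic graphs are clearly friendlier to debugging (and to programmers). They allow code to be executed line by line with every tensor accessible, which makes it easier to find problems in the computation or logic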
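
As a small illustration of "the graph is built while the code runs", the sketch below uses an ordinary Python while loop whose number of iterations depends on the data. The starting value 3.0 and the threshold 100 are arbitrary choices for this example.

```python
import torch

x = torch.tensor(3.0, requires_grad=True)

y = x
steps = 0
while y < 100:        # plain Python control flow, evaluated on the actual value of y
    y = y * x         # each pass adds one multiplication node to the graph
    steps += 1

print(steps)          # 4 multiplications were recorded, so y = x**5
print(y)              # 243

y.backward()          # gradients follow exactly the path that was executed
print(x.grad)         # d(x**5)/dx = 5 * x**4 = 405
```

With a different starting value the loop would run a different number of times, and the recorded graph would change with it, which is why every intermediate tensor can simply be printed or inspected in a debugger.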

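Visualizing PyTorch computation graphs

This can be done with torchviz

import torch
from torchviz import make_dot  # requires the torchviz package and a Graphviz installation
# Define matrix A, vector b, and constant c
A = torch.randn(10, 10,requires_grad=True)   # requires_grad=True means gradients with respect to A will be computed
b = torch.randn(10,requires_grad=True)
c = torch.randn(1,requires_grad=True)
x = torch.randn(10, requires_grad=True)
# Compute A x + b . x + c (x is 1-D, so no transpose is needed)
result = torch.matmul(A, x) + torch.matmul(b, x) + c
# Build the computation graph nodes
dot = make_dot(result, params={'A': A, 'b': b, 'c': c, 'x': x})
# Render the computation graph to expression.png
dot.render('expression', format='png', cleanup=True, view=False)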