Common PyTorch Functions
Links
Project repository: pytorch | github
PyTorch source: pytorch/pytorch | source code | github
torch source: pytorch/pytorch/torch source code | github
torchvision source: pytorch/vision | source code | github
torch.Tensor
Ways to create a tensor
- torch.rand: uniform distribution on the interval [0, 1)
- torch.randn: normal distribution with mean 0 and variance 1 (standard normal distribution)
- torch.randint: random integers generated uniformly between low (inclusive) and high (exclusive)
- torch.ones: returns a tensor filled with the scalar value 1
Demo

```python
import torch

tensor = torch.randint(low=0, high=3, size=(3, 4))
print(tensor)
# tensor([[1, 2, 1, 1],
#         [0, 2, 1, 2],
#         [1, 2, 0, 0]])
```
torch.set_printoptions
Demo
```python
import torch

tensor = torch.tensor([torch.pi, 1/3])
print(tensor)
# tensor([3.1416, 0.3333])

# Limit the printed precision to 2 decimal places
torch.set_printoptions(precision=2)
print(tensor)
# tensor([3.14, 0.33])
```
torch.Tensor.unfold
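A minimal sketch of Tensor.unfold(dimension, size, step), which returns a view containing all sliding windows of the given size along a dimension:

```python
import torch

x = torch.arange(1., 8.)   # tensor([1., 2., 3., 4., 5., 6., 7.])
# Windows of size 2, moving 1 step at a time along dimension 0
print(x.unfold(0, 2, 1))
# tensor([[1., 2.],
#         [2., 3.],
#         [3., 4.],
#         [4., 5.],
#         [5., 6.],
#         [6., 7.]])
```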
Tensor.view
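A minimal sketch of Tensor.view, which reinterprets a tensor's data with a new shape (the shapes here are arbitrary choices):

```python
import torch

x = torch.arange(12)
print(x.view(3, 4).shape)   # torch.Size([3, 4])
print(x.view(-1, 6).shape)  # torch.Size([2, 6]) -- -1 infers the remaining size
```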
Tensor.permute
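A minimal sketch of Tensor.permute, which reorders a tensor's dimensions:

```python
import torch

x = torch.randn(2, 3, 5)
y = x.permute(2, 0, 1)  # dimensions (2, 3, 5) -> (5, 2, 3)
print(y.shape)          # torch.Size([5, 2, 3])
```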
torch.unique
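A minimal sketch of torch.unique (the values are illustrative); by default the unique values are returned sorted:

```python
import torch

x = torch.tensor([1, 3, 2, 3, 1])
print(torch.unique(x))  # tensor([1, 2, 3])
values, counts = torch.unique(x, return_counts=True)
print(counts)           # tensor([2, 1, 2]) -- occurrences of each value
```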
torch.max
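A minimal sketch of torch.max (the values are illustrative); with a dim argument it returns both the values and their indices:

```python
import torch

x = torch.tensor([[1., 5., 2.],
                  [4., 0., 3.]])
print(torch.max(x))  # tensor(5.) -- global maximum
values, indices = torch.max(x, dim=1)
print(values)        # tensor([5., 4.])
print(indices)       # tensor([1, 0])
```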
torch.argmax
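A minimal sketch of torch.argmax (the values are illustrative):

```python
import torch

x = torch.tensor([[1., 5., 2.],
                  [4., 0., 3.]])
print(torch.argmax(x))         # tensor(1) -- index into the flattened tensor
print(torch.argmax(x, dim=1))  # tensor([1, 0]) -- per-row index of the maximum
```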
torch.unsqueeze
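A minimal sketch of torch.unsqueeze, which inserts a dimension of size 1 at the given position:

```python
import torch

x = torch.tensor([1, 2, 3])         # shape: [3]
print(torch.unsqueeze(x, 0).shape)  # torch.Size([1, 3])
print(torch.unsqueeze(x, 1).shape)  # torch.Size([3, 1])
```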
torch.squeeze
torch.squeeze{target="_blank"}
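A minimal sketch; torch.squeeze removes dimensions of size 1:

```python
import torch

x = torch.zeros(2, 1, 2, 1, 2)  # shape: [2, 1, 2, 1, 2]
print(torch.squeeze(x).shape)         # torch.Size([2, 2, 2]) -- all size-1 dims removed
print(torch.squeeze(x, dim=1).shape)  # torch.Size([2, 2, 1, 2]) -- only dim 1 removed
```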
torch.roll
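A minimal sketch; torch.roll shifts elements along a dimension, wrapping around at the end:

```python
import torch

x = torch.tensor([1, 2, 3, 4, 5, 6, 7, 8]).view(4, 2)
# tensor([[1, 2], [3, 4], [5, 6], [7, 8]])
print(torch.roll(x, shifts=1, dims=0))
# tensor([[7, 8],
#         [1, 2],
#         [3, 4],
#         [5, 6]])
```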
Broadcasting semantics
Broadcasting semantics | pytorch docs
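A minimal sketch of the trailing-dimension rule (the shapes are illustrative):

```python
import torch

x = torch.ones(4, 3)             # shape: [4, 3]
y = torch.tensor([1., 2., 3.])   # shape: [3]
# Trailing dimensions are aligned; y is expanded across the rows of x
print(x - y)
# tensor([[ 0., -1., -2.],
#         [ 0., -1., -2.],
#         [ 0., -1., -2.],
#         [ 0., -1., -2.]])
```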
Linear regression with PyTorch
Introduction to torch.nn.Linear
torch.nn.Linear(in_features, out_features, bias=True)
Linear is a class; it computes the following linear transformation of its input:

$$
y = xA^{\top} + b
$$

where x is a row vector; when a batch of samples is processed, x is a matrix of N row vectors, each row being one sample.
Linear takes three parameters:
- The first parameter is in_features, an int: the number of input features
- The second parameter is out_features, an int: the number of output features
- The third parameter is bias, a bool: it defaults to True; when set to False, the bias term is 0
The first two parameters determine the size of the transformation matrix A, which is an out_features × in_features matrix.
A is initialized randomly from the uniform distribution U(-√k, √k), where k = 1/in_features.
This scheme is called Kaiming initialization, proposed by Kaiming He in his February 2015 paper:
Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification
```python
from torch import nn

model = nn.Linear(20, 30)
```
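A quick check of the resulting shapes (the batch size of 128 is an arbitrary choice):

```python
import torch

input = torch.randn(128, 20)   # 128 samples with 20 features each
output = model(input)
print(model.weight.shape)      # torch.Size([30, 20]) -- A is out_features × in_features
print(output.shape)            # torch.Size([128, 30])
```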
Introduction to torch.nn.MSELoss
MSE stands for mean squared error; MSELoss returns an object that computes the MSE between input and target.
MSELoss's reduction argument has three options: none | mean (default) | sum
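A minimal sketch combining the two steps (the input and target values are illustrative):

```python
import torch
from torch import nn

# Set up the input and target values
input = torch.tensor([1., 2., 3.])
target = torch.tensor([1., 3., 5.])

# Create the loss object; try each reduction mode
for reduction in ("none", "mean", "sum"):
    loss = nn.MSELoss(reduction=reduction)
    print(reduction, loss(input, target))
# none tensor([0., 1., 4.])
# mean tensor(1.6667)
# sum tensor(5.)
```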
Implementing linear regression with Linear
- Data, as sketched below
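A minimal sketch; the toy data below (y = 2x) is an assumed example:

```python
import torch

# Toy data assumed for this walkthrough: y = 2x
x_data = torch.tensor([[1.0], [2.0], [3.0]])
y_data = torch.tensor([[2.0], [4.0], [6.0]])
```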
- Model
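A minimal sketch of the model, assuming one input feature and one output feature:

```python
class LinearModel(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = torch.nn.Linear(1, 1)  # 1 input feature, 1 output feature

    def forward(self, x):
        return self.linear(x)

model = LinearModel()
```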
- Loss function
```python
criterion = torch.nn.MSELoss(reduction="sum")
```
- Optimizer
```python
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
```
- Training
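A minimal sketch of the standard loop:

```python
for epoch in range(100):
    y_pred = model(x_data)            # forward pass
    loss = criterion(y_pred, y_data)  # compute the loss
    optimizer.zero_grad()             # clear accumulated gradients
    loss.backward()                   # backpropagate
    optimizer.step()                  # update the parameters
```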
- Printing the results
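A minimal sketch printing the learned parameters:

```python
# Print A^T and bias
print("weight:", model.linear.weight.item())
print("bias:  ", model.linear.bias.item())
```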
torch.nn.modules.linear
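An abridged sketch of the class, with initialization details omitted (see the linked source for the full definition):

```python
import torch
from torch import Tensor
from torch.nn import Module, Parameter, functional as F

class Linear(Module):
    def __init__(self, in_features: int, out_features: int, bias: bool = True) -> None:
        super().__init__()
        # The real implementation also calls reset_parameters()
        self.weight = Parameter(torch.empty(out_features, in_features))
        self.bias = Parameter(torch.empty(out_features)) if bias else None

    def forward(self, input: Tensor) -> Tensor:
        return F.linear(input, self.weight, self.bias)
```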
Note that modules.Linear uses F.linear to compute its forward result:
torch.nn.functional.linear(input, weight, bias=None)
For an incoming input x, weight matrix A, and bias b, F.linear applies the linear transformation y = xA^T + b.
If the shape of the input is (*, in_features), where * means any number of additional dimensions (including none), and the shape of the weight matrix is (out_features, in_features), then the shape of the output will be (*, out_features).
Example 1:
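A minimal sketch (the shapes are illustrative):

```python
import torch
import torch.nn.functional as F

x = torch.randn(128, 20)   # batch of 128 samples, 20 features each
A = torch.randn(30, 20)    # weight: (out_features, in_features)
b = torch.randn(30)
y = F.linear(x, A, b)
print(y.shape)  # torch.Size([128, 30])
```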
Example 2:
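A minimal sketch with extra leading dimensions:

```python
import torch
import torch.nn.functional as F

x = torch.randn(2, 4, 20)  # * = (2, 4): extra leading dimensions
A = torch.randn(30, 20)
y = F.linear(x, A)         # bias is optional
print(y.shape)  # torch.Size([2, 4, 30])
```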
Note: The basic elements that PyTorch, and deep learning in general, operates on are vectors, more specifically row vectors. Even a matrix can be thought of as a set of row vectors, and a higher-dimensional tensor can be viewed as a larger set of row vectors. Therefore, when you apply a transformation to a matrix of 2 or more dimensions, what you actually transform is a large set of row vectors: you transform the vectors in the set and keep the shape of the set.
torch.nn
pytorch/torch/nn | source code | github
Softmax
torch.nn.Softmax(dim=None)
dim (int) – A dimension along which Softmax will be computed (so every slice along dim will sum to 1).
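A minimal sketch contrasting dim=1 and dim=0 (the input values are illustrative):

```python
import torch
from torch import nn

x = torch.tensor([[1.0, 2.0, 3.0],
                  [1.0, 2.0, 3.0]])
print(nn.Softmax(dim=1)(x))  # every row sums to 1
# tensor([[0.0900, 0.2447, 0.6652],
#         [0.0900, 0.2447, 0.6652]])
print(nn.Softmax(dim=0)(x))  # every column sums to 1
# tensor([[0.5000, 0.5000, 0.5000],
#         [0.5000, 0.5000, 0.5000]])
```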
LogSoftmax
torch.nn.LogSoftmax() applies the log(Softmax(x)) operation to the input tensor. For example, an input tensor x of shape (n+1) × (c+1) (n+1 samples, c+1 classes)

$$
x=\begin{bmatrix}
x_{00} & \cdots & x_{0c}\\
\vdots & \ddots & \vdots\\
x_{n0} & \cdots & x_{nc}
\end{bmatrix}
$$

becomes, after the LogSoftmax operation (with dim=1),

$$
\operatorname{LogSoftmax}(x)=\begin{bmatrix}
\log\sigma_{00} & \cdots & \log\sigma_{0c}\\
\vdots & \ddots & \vdots\\
\log\sigma_{n0} & \cdots & \log\sigma_{nc}
\end{bmatrix}
$$

where each entry is computed as

$$
\sigma_{ij}=\frac{\exp(x_{ij})}{\sum_{k=0}^{c}\exp(x_{ik})}
$$
Properties of LogSoftmax
PyTorch demo:
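A minimal sketch with input torch.ones(2, 3), showing dim=1 and dim=0:

```python
import torch
from torch import nn

# Input vector: torch.ones(2, 3)
x = torch.ones(2, 3)

# dim=1: normalize along each row
print(nn.LogSoftmax(dim=1)(x))
# tensor([[-1.0986, -1.0986, -1.0986],
#         [-1.0986, -1.0986, -1.0986]])

# dim=0: normalize along each column
print(nn.LogSoftmax(dim=0)(x))
# tensor([[-0.6931, -0.6931, -0.6931],
#         [-0.6931, -0.6931, -0.6931]])
```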
Here log(1/3) ≈ -1.0986 and log(1/2) ≈ -0.6931.
LogSoftmax is invariant under translation: adding the same constant to every entry along the softmax dimension leaves the result unchanged, because the common factor exp(c) cancels between numerator and denominator.
For example, nn.LogSoftmax(dim=1)(x + 100) returns the same tensor as nn.LogSoftmax(dim=1)(x).
NLLLoss
- Pytorch Docs
torch.nn.NLLLoss
- Descriptions
NLLLoss stands for Negative Log Likelihood Loss. For an input x of shape (N, C) and a target y of shape (N,) holding class indices, it performs the operation l_i = -x[i, y[i]]:
- reduction="none": outputs (l_0, l_1, ..., l_{N-1}), where l_0 = -x[0, y[0]] is the y[0]-th value of sample 0, l_1 = -x[1, y[1]] is the y[1]-th value of sample 1, and so on
- reduction="sum": outputs l_0 + l_1 + ... + l_{N-1}
- reduction="mean" (default): outputs (l_0 + l_1 + ... + l_{N-1}) / N
NLLLoss on Segmentation
- Descriptions
For the inputs in segmentation, a Prediction with shape (Batch_size, Class_num, Height, Width) and a Mask or Ground Truth with shape (Batch_size, Height, Width), NLLLoss gives the following output with shape (Batch_size, Height, Width): output[b, h, w] = -Prediction[b, Mask[b, h, w], h, w]
- Demo
```python
import torch
from torch import nn

# Batch_Size: 2, Channels(Classes): 3, Height: 4, Width: 5
B, C, H, W = 2, 3, 4, 5
mask = torch.randint(low=0, high=C, size=(B, H, W))  # ∈ {0, 1, 2}
# mask.shape = torch.Size([2, 4, 5])
prediction = torch.arange(B*C*H*W, dtype=torch.float).reshape(B, C, H, W)
# prediction.shape = torch.Size([2, 3, 4, 5])
nllloss = nn.NLLLoss(reduction='none')
loss = nllloss(prediction, mask)
# loss.shape = torch.Size([2, 4, 5])
for b in range(B):
    for h in range(H):
        for w in range(W):
            assert loss[b, h, w] == -prediction[b, mask[b, h, w], h, w], "Unequal"
```
NLLLoss + LogSoftmax
Usually one first applies LogSoftmax to x and then NLLLoss; the result is exactly the cross-entropy loss.
- Applying NLLLoss directly
```python
# Set up NLLLoss; reduction="none" outputs the per-sample results without sum/mean
loss = nn.NLLLoss(reduction="none")
# Set up the input: 3 samples
input = torch.arange(15, dtype=float).reshape(3, 5)
# Set up the target: 3 labels, each in [0, 5)
target = torch.tensor([1, 0, 4])
# Compute the loss between input and target directly
output = loss(input, target)
```

```python
# input
tensor([[ 0.,  1.,  2.,  3.,  4.],
        [ 5.,  6.,  7.,  8.,  9.],
        [10., 11., 12., 13., 14.]], dtype=torch.float64)
# target
tensor([1, 0, 4])
# output
tensor([ -1.,  -5., -14.], dtype=torch.float64)
```
- Applying LogSoftmax first, then NLLLoss
```python
# Set up LogSoftmax
m = nn.LogSoftmax(dim=1)
# Apply LogSoftmax to the input, then compute the loss against the target
output = loss(m(input), target)
```

```python
# m(input)
tensor([[-4.4519, -3.4519, -2.4519, -1.4519, -0.4519],
        [-4.4519, -3.4519, -2.4519, -1.4519, -0.4519],
        [-4.4519, -3.4519, -2.4519, -1.4519, -0.4519]], dtype=torch.float64)
# target
tensor([1, 0, 4])
# output
tensor([3.4519, 4.4519, 0.4519], dtype=torch.float64)
```
CrossEntropyLoss
- Pytorch Docs
torch.nn.CrossEntropyLoss
- Formulas

$$
l_i = -\log\frac{\exp(x[i, y[i]])}{\sum_{k}\exp(x[i, k])} = \mathrm{NLLLoss}\big(\mathrm{LogSoftmax}(x),\, y\big)_i
$$

- Description
For an input x with (n+1) samples and (c+1) classes and a target y, CrossEntropyLoss performs the operation above and then reduces:
- reduction="none": outputs (l_0, l_1, ..., l_n)
- reduction="sum": outputs l_0 + l_1 + ... + l_n
- reduction="mean" (default): outputs (l_0 + l_1 + ... + l_n) / (n+1)
- PyTorch demo
```python
# input and target
input = torch.arange(15, dtype=float).reshape(3, 5)
target = torch.tensor([1, 0, 4])
m = nn.LogSoftmax(dim=1)
NLloss = nn.NLLLoss(reduction="none")
CRloss = nn.CrossEntropyLoss(reduction="none")
output_NL = NLloss(m(input), target)
output_CR = CRloss(input, target)
```

```python
# m(input)
tensor([[-4.4519, -3.4519, -2.4519, -1.4519, -0.4519],
        [-4.4519, -3.4519, -2.4519, -1.4519, -0.4519],
        [-4.4519, -3.4519, -2.4519, -1.4519, -0.4519]], dtype=torch.float64)
# target
tensor([1, 0, 4])
# output_NL
tensor([3.4519, 4.4519, 0.4519], dtype=torch.float64)
# output_CR
tensor([3.4519, 4.4519, 0.4519], dtype=torch.float64)
```

Note the two outputs above: output_NL and output_CR give identical results.
CrossEntropyLoss on Classification
For a classifier like the one below, these two ways of computing the loss are equivalent:
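A minimal sketch of the equivalence (the layer sizes and batch are illustrative):

```python
import torch
from torch import nn

model = nn.Linear(10, 5)            # a toy classifier with 5 classes
x = torch.randn(4, 10)
labels = torch.randint(0, 5, (4,))
logits = model(x)

# Method 1: CrossEntropyLoss on the raw logits
ce = nn.CrossEntropyLoss()(logits, labels)
# Method 2: LogSoftmax followed by NLLLoss
nll = nn.NLLLoss()(nn.LogSoftmax(dim=1)(logits), labels)

print(torch.allclose(ce, nll))  # True
```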
CrossEntropyLoss already wraps -Log(Softmax(x)), so if the model itself applies Log(Softmax(x)) to its output, consider using NLLLoss as the loss function. For a model whose output is not passed through LogSoftmax, using CrossEntropyLoss is more concise; just note that during evaluation you need to apply an extra Softmax(x) to the outputs to obtain probabilities, then take the index of the maximum of Softmax(x) as the predicted label (since Softmax is monotonic, taking the argmax of the raw outputs gives the same label).
The CrossEntropy formula should be read as follows: x is a Batch_size × Class_num tensor, and label holds int class indices (not one-hot). For a sample x[i] ∈ [1 × C], applying softmax gives softmax(x[i]) ∈ [1 × C], the predicted probability of each class. Take the sample's true label label[i] (an int value, not a one-hot encoding). The cross entropy between softmax(x[i]) and label[i], namely loss_i = -log(softmax(x[i])[label[i]]), is that sample's cross-entropy loss.
In that formula, the first term is the predicted probability of the correct class passed through -logsoftmax, while the second term is the predicted probabilities of all incorrectly classified classes passed through logsoftmax. The signs are chosen because we use gradient descent to reduce the loss: we want the positive class's predicted probability to rise, i.e. its negated value to fall, and the negative classes' predicted probabilities to fall, i.e. their (un-negated) values to fall.
Why take the log?
CrossEntropyLoss on Segmentation
- Descriptions
For the inputs in segmentation, a Prediction with shape (Batch_size, Class_num, Height, Width) and a Mask or Ground Truth with shape (Batch_size, Height, Width), CrossEntropyLoss applies -LogSoftmax over the Class_num entries at every pixel of every batch of the Prediction, chooses the i-th entry (with i given by the mask), and finally gives an output with shape (Batch_size, Height, Width). The Prediction is typically the raw output of a model/network.
- Demo
```python
import torch
from torch import nn

# Batch_Size: 2, Channels(Classes): 3, Height: 4, Width: 5
B, C, H, W = 2, 3, 4, 5
mask = torch.randint(low=0, high=C, size=(B, H, W))  # ∈ {0, 1, 2}
# mask.shape = torch.Size([2, 4, 5])
prediction = torch.arange(B*C*H*W, dtype=torch.float).reshape(B, C, H, W)
# prediction.shape = torch.Size([2, 3, 4, 5])
crossentropyloss = nn.CrossEntropyLoss(reduction='none')
loss = crossentropyloss(prediction, mask)
# loss.shape = torch.Size([2, 4, 5])
logsoftmax_prediction = nn.LogSoftmax(dim=1)(prediction)
for b in range(B):
    for h in range(H):
        for w in range(W):
            assert loss[b, h, w] == -logsoftmax_prediction[b, mask[b, h, w], h, w], "Unequal"
```
RuntimeError: CUDA error: device-side assert triggered
This error happens when your mask (target) values are out of bounds. For example:
- Reproducing the error
```python
import torch
from torch import nn

device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
# Batch_Size: 2, Height: 4, Width: 5
mask = torch.randint(low=1, high=4, size=(2, 4, 5)).cuda()
# Batch_Size: 2, Channels(Classes): 3, Height: 4, Width: 5
output = torch.randn(2, 3, 4, 5).cuda()
CE_Loss = nn.CrossEntropyLoss()
loss = CE_Loss(output, mask)
'''RuntimeError: CUDA error: device-side assert triggered'''
```
The mask contains the values 1, 2, and 3, but output has only 3 classes, namely 0, 1, and 2, so the value 3 is out of bounds. When you run this code on the CPU, you get this error instead:
- Running the above code on the CPU
```python
...
# loss = CE_Loss(output, mask)
loss = CE_Loss(output.cpu(), mask.cpu())
'''IndexError: Target 3 is out of bounds.'''
```
Conv2d
Implementing a convolution with Conv2d is very convenient: you only need to set the three parameters in_channels, out_channels, and kernel_size, without worrying about the height and width of the input and output images.
Other parameters, including stride, padding, and bias, can also be specified; see the official documentation for the full list.
Here is an example:
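A minimal sketch (the channel counts and input size are illustrative):

```python
import torch
from torch import nn

conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3)
x = torch.randn(1, 3, 32, 32)  # batch=1, channels=3, 32×32 image
y = conv(x)
print(y.shape)  # torch.Size([1, 16, 30, 30]) -- 32 - 3 + 1 = 30 per side
```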
Transposed convolution
A sliding-window convolution can be rewritten as a matrix multiplication. Arrange the pixels of the input image into a column vector in left-to-right, top-to-bottom order; each window position of the kernel likewise becomes a row vector, so the whole convolution corresponds to a matrix whose row count is the number of convolution windows and whose column count is the number of input pixels.
For example, convolving a 4 × 4 input image with a 3 × 3 kernel, with padding=0 and stride=1, produces a 2 × 2 output. Writing this process as a matrix multiplication, with the input flattened into x ∈ R^16 and the output into y ∈ R^4:

$$
y = Cx, \qquad
C=\begin{bmatrix}
w_{00}&w_{01}&w_{02}&0&w_{10}&w_{11}&w_{12}&0&w_{20}&w_{21}&w_{22}&0&0&0&0&0\\
0&w_{00}&w_{01}&w_{02}&0&w_{10}&w_{11}&w_{12}&0&w_{20}&w_{21}&w_{22}&0&0&0&0\\
0&0&0&0&w_{00}&w_{01}&w_{02}&0&w_{10}&w_{11}&w_{12}&0&w_{20}&w_{21}&w_{22}&0\\
0&0&0&0&0&w_{00}&w_{01}&w_{02}&0&w_{10}&w_{11}&w_{12}&0&w_{20}&w_{21}&w_{22}
\end{bmatrix}
$$

Each row of the matrix corresponds to one convolution window: its element-wise product with the flattened input image, summed, gives the convolution output at that window position.
Transposed convolution starts from this matrix form and transposes the transformation matrix C. The original linear map from 16 dimensions to 4 dimensions becomes a map from 4 dimensions to 16 dimensions, so the image is enlarged, i.e. upsampled.

Each column of the transposed matrix corresponds to one convolution window. The process amounts to multiplying each window by one entry of the input and summing:

$$
x' = C^{\top}y = y_0\,c_0 + y_1\,c_1 + y_2\,c_2 + y_3\,c_3
$$

where c_i is the i-th column of C^T (equivalently, the i-th row of C, i.e. the i-th window). In other words, y acts as a set of weights applied to the kernel windows.
This result is equivalent to convolving the input image, zero-padded with two layers, using a new kernel obtained by mirroring the original kernel left-right and top-bottom.
ConvTranspose2d
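A minimal sketch showing the upsampling effect (the sizes are illustrative):

```python
import torch
from torch import nn

upconv = nn.ConvTranspose2d(in_channels=16, out_channels=3, kernel_size=3)
x = torch.randn(1, 16, 2, 2)
y = upconv(x)
print(y.shape)  # torch.Size([1, 3, 4, 4]) -- (2 - 1) + 3 = 4 per side
```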
ModuleList
- Define a simple ModuleList containing 4 Modules
```python
import torch
import torch.nn as nn

modulelist = nn.ModuleList([
    nn.Conv2d(1, 20, 5),
    nn.ReLU(),
    nn.Conv2d(20, 64, 5),
    nn.ReLU()
])

# Print the ModuleList
print(modulelist)
# ModuleList(
#   (0): Conv2d(1, 20, kernel_size=(5, 5), stride=(1, 1))
#   (1): ReLU()
#   (2): Conv2d(20, 64, kernel_size=(5, 5), stride=(1, 1))
#   (3): ReLU()
# )

# Print the type of the ModuleList
print(modulelist.type)
# <bound method Module.type of ModuleList(
#   (0): Conv2d(1, 20, kernel_size=(5, 5), stride=(1, 1))
#   (1): ReLU()
#   (2): Conv2d(20, 64, kernel_size=(5, 5), stride=(1, 1))
#   (3): ReLU()
# )>
```
- A ModuleList can be iterated
```python
# Iterate over the Modules in the ModuleList with a for loop
for i in modulelist:
    print(i)
# Conv2d(1, 20, kernel_size=(5, 5), stride=(1, 1))
# ReLU()
# Conv2d(20, 64, kernel_size=(5, 5), stride=(1, 1))
# ReLU()

# Print the elements of the modulelist with enumerate
for i, module in enumerate(modulelist):
    print(i, module)
# 0 Conv2d(1, 20, kernel_size=(5, 5), stride=(1, 1))
# 1 ReLU()
# 2 Conv2d(20, 64, kernel_size=(5, 5), stride=(1, 1))
# 3 ReLU()
```
- Elements of a ModuleList can be accessed with square brackets []; indices start at 0, so mind the bounds
```python
# Access elements of the ModuleList with []
for i in range(4):
    print(modulelist[i])
# Conv2d(1, 20, kernel_size=(5, 5), stride=(1, 1))
# ReLU()
# Conv2d(20, 64, kernel_size=(5, 5), stride=(1, 1))
# ReLU()
```
- Use a ModuleList to define a simple network model
```python
class Net(nn.Module):
    def __init__(self) -> None:
        super().__init__()

    def forward(self, x):
        for module in modulelist:
            x = module(x)
        return x
```

Test the model:
```python
# Input: batch_size=1, channels=1, width=100, height=100
input = torch.randn(1, 1, 100, 100)
# Network instance
net = Net()
# Output
output = net.forward(input)
print(input.shape)   # torch.Size([1, 1, 100, 100])
print(output.shape)  # torch.Size([1, 64, 92, 92])
```
Sequential
Sequential is used very much like ModuleList. The difference is that a ModuleList is simply a list of Modules, whereas Sequential chains its Modules in order into a single model. In PyTorch, a Sequential can be invoked directly as a model, while a ModuleList cannot.
- Define a simple Sequential container
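A minimal sketch mirroring the ModuleList demo above:

```python
import torch
import torch.nn as nn

sequential = nn.Sequential(
    nn.Conv2d(1, 20, 5),
    nn.ReLU(),
    nn.Conv2d(20, 64, 5),
    nn.ReLU()
)
print(sequential)
# Sequential(
#   (0): Conv2d(1, 20, kernel_size=(5, 5), stride=(1, 1))
#   (1): ReLU()
#   (2): Conv2d(20, 64, kernel_size=(5, 5), stride=(1, 1))
#   (3): ReLU()
# )
```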
- A Sequential can be iterated
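Continuing the sketch above:

```python
# Print each element of the Sequential with a for loop
for module in sequential:
    print(module)
# Conv2d(1, 20, kernel_size=(5, 5), stride=(1, 1))
# ReLU()
# Conv2d(20, 64, kernel_size=(5, 5), stride=(1, 1))
# ReLU()
```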
- Elements of a Sequential can be accessed with square brackets []; indices start at 0, so mind the bounds
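Continuing the sketch above:

```python
for i in range(4):
    print(sequential[i])
# Conv2d(1, 20, kernel_size=(5, 5), stride=(1, 1))
# ReLU()
# Conv2d(20, 64, kernel_size=(5, 5), stride=(1, 1))
# ReLU()
```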
- A Sequential is already a model in itself and can be called directly
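Continuing the sketch above:

```python
# Input: batch_size=1, channels=1, width=100, height=100
input = torch.randn(1, 1, 100, 100)
output = sequential(input)
print(output.shape)  # torch.Size([1, 64, 92, 92])
```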
functional
one_hot
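A minimal sketch of torch.nn.functional.one_hot (the labels are illustrative):

```python
import torch
import torch.nn.functional as F

labels = torch.tensor([0, 2, 1])
print(F.one_hot(labels, num_classes=3))
# tensor([[1, 0, 0],
#         [0, 0, 1],
#         [0, 1, 0]])
```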
nn.Dropout
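A minimal sketch of nn.Dropout in train vs eval mode (p=0.5 is an arbitrary choice):

```python
import torch
from torch import nn

dropout = nn.Dropout(p=0.5)  # zero out each element with probability 0.5
x = torch.ones(2, 4)
print(dropout(x))            # surviving entries are scaled by 1/(1-p) = 2
dropout.eval()               # in eval mode, Dropout is the identity
print(dropout(x))            # the input is returned unchanged
```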
torch.autocast
AUTOMATIC MIXED PRECISION PACKAGE - TORCH.AMP{target="_blank"}
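A minimal sketch of an autocast region (assumes either a CUDA device or CPU bfloat16 support):

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
a = torch.randn(8, 8, device=device)
b = torch.randn(8, 8, device=device)
with torch.autocast(device_type=device):
    c = a @ b    # the matmul runs in lower precision inside the region
print(c.dtype)   # torch.float16 on CUDA, torch.bfloat16 on CPU
```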
Torchvision
- Documents
pytorch.org/vision
- Source code on github for torchvision
pytorch/vision | source code | github
torchvision.transforms
- Documents
Transforming and augmenting images
Getting started with transforms v2
- Demonstration
```python
import torch
from torchvision.transforms import v2
from PIL import Image
import matplotlib.pyplot as plt

# You can download the astronaut.jpg image from
# https://github.com/pytorch/vision/blob/main/gallery/
# and convert the annotation.json file to a mask by yourself
path2image = r"./coco/images/astronaut.jpg"
path2mask = r"./mask.png"

TransformsList = [
    v2.RandomResizedCrop(size=(224, 224), antialias=True),
    v2.RandomHorizontalFlip(p=0.5),  # probability = 0.5
    v2.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
    # 3 means and stds for 3 channels
]
ToFloatImage = v2.Compose([
    v2.ToImage(),                          # Convert PIL Image to Torchvision Image
    v2.ToDtype(torch.float32, scale=True)  # Convert uint8 to float32
])

# Read the image and the mask
image = Image.open(path2image)  # PIL Image
mask = Image.open(path2mask)    # PIL Image
# Convert PIL Image to Torchvision Image
image = ToFloatImage(image)     # Float Torchvision Image
mask = ToFloatImage(mask)       # Float Torchvision Image

plt.close()
_, axes = plt.subplots(
    nrows=2, ncols=len(TransformsList)+1,
    figsize=(3*(len(TransformsList)+1), 3*2)  # (size*cols, size*rows)
)
axes[0][0].imshow(image.permute(1, 2, 0))  # 3×H×W -permute-> H×W×3
axes[0][0].set_title("Original")
axes[1][0].imshow(mask.permute(1, 2, 0))   # 3×H×W -permute-> H×W×3

torch.manual_seed(1)
for i, transform in enumerate(TransformsList):
    '''Don't transform image and mask separately
    when transforms contain random operators'''
    # trans_image = transform(image)  # <-- Don't
    # trans_mask = transform(mask)    # <-- Don't
    trans_image, trans_mask = transform(image, mask)  # <-- Instead
    print(f"{type(transform).__name__}: {trans_image.shape = }")
    print(f"{type(transform).__name__}: {trans_mask.shape = }")
    # RandomResizedCrop: trans_image.shape = torch.Size([3, 224, 224])
    # RandomResizedCrop: trans_mask.shape = torch.Size([1, 224, 224])
    # RandomHorizontalFlip: trans_image.shape = torch.Size([3, 512, 512])
    # RandomHorizontalFlip: trans_mask.shape = torch.Size([1, 512, 512])
    # Normalize: trans_image.shape = torch.Size([3, 512, 512])
    # Normalize: trans_mask.shape = torch.Size([3, 512, 512]) <-- Note the channels of the mask
    axes[0][i+1].imshow(trans_image.permute(1, 2, 0))  # 3×H×W -permute-> H×W×3
    axes[0][i+1].set_title(type(transform).__name__)
    axes[1][i+1].imshow(trans_mask.permute(1, 2, 0))   # 3×H×W -permute-> H×W×3
plt.show()
```
transforms.v2.Compose
- Documents
torchvision.transforms.v2.Compose
- Demonstration
```python
import torch
from torchvision.transforms import v2

transforms = v2.Compose([
    # Convert to tensor, only needed if you had a PIL image
    v2.ToImage(),
    # Optional, most inputs are already uint8 at this point
    v2.ToDtype(torch.uint8, scale=True),
    v2.RandomResizedCrop(size=(224, 224), antialias=True),
    # Or v2.Resize(antialias=True),
    # Normalize expects float input
    v2.ToDtype(torch.float32, scale=True),
    v2.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])
```
- Trick
```python
from torch import nn
from torchvision.transforms import v2

translist = nn.ModuleList([
    ...
])
transform = v2.Compose([transform for transform in translist])
for trans in translist:
    pass
```
v2-api-reference
- Documents
v2-api-reference
transforms.v2.ToTensor
- Documents
torchvision.transforms.v2.ToTensor
Use the combination below instead:
```python
v2.Compose([
    v2.ToImage(),                          # Convert PIL Image to Torchvision Image
    v2.ToDtype(torch.float32, scale=True)  # Convert uint8 to float32
])
```
transforms.v2.ToImage
- Documents
transforms.v2.ToImage
transforms.v2.Normalize
- Documents
torchvision.transforms.v2.Normalize
- Some understanding
If the input image has n channels and v2.Normalize is given n means and stds,

```python
v2.Normalize(mean=[m1, ..., mn], std=[s1, ..., sn])
```

then v2.Normalize normalizes every channel of the input image (or array) with the corresponding mean and std.
If the input image has only 1 channel and v2.Normalize is given n (more than 1) means and stds, v2.Normalize will return n channels.
- Demonstration
```python
import torch
from torchvision.transforms import v2

channel1 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
channel2 = [[0, 0, 1], [0, 1, 0], [1, 0, 0]]
t = torch.tensor([channel1, channel2], dtype=torch.float32)
print(t)
# tensor([[[1., 0., 0.],
#          [0., 1., 0.],
#          [0., 0., 1.]],
#         [[0., 0., 1.],
#          [0., 1., 0.],
#          [1., 0., 0.]]])

'''
channel 1:
    1: (1 - mean(0))/std(1) = 1
    0: (0 - mean(0))/std(1) = 0
channel 2:
    1: (1 - mean(1))/std(1) = 0
    0: (0 - mean(1))/std(1) = -1
'''
transform = v2.Normalize(mean=[0, 1], std=[1, 1])
trans_t = transform(t)
print(trans_t)
# tensor([[[ 1.,  0.,  0.],
#          [ 0.,  1.,  0.],
#          [ 0.,  0.,  1.]],
#         [[-1., -1.,  0.],
#          [-1.,  0., -1.],
#          [ 0., -1., -1.]]])

'''
channel 1:
    1: (1 - mean(0))/std(0.5) = 2
    0: (0 - mean(0))/std(0.5) = 0
channel 2:
    1: (1 - mean(0))/std(2) = 0.5
    0: (0 - mean(0))/std(2) = 0
'''
transform = v2.Normalize(mean=[0, 0], std=[0.5, 2])
trans_t = transform(t)
# Limit the printed precision
torch.set_printoptions(precision=1)
print(trans_t)
# tensor([[[2.0, 0.0, 0.0],
#          [0.0, 2.0, 0.0],
#          [0.0, 0.0, 2.0]],
#         [[0.0, 0.0, 0.5],
#          [0.0, 0.5, 0.0],
#          [0.5, 0.0, 0.0]]])
```
- Paper that proposed Batch Normalization
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
torchvision.datasets
- Documents
torchvision.datasets
MNIST
- Documents
torchvision.datasets.MNIST
- Extended reading
How were MNIST's mean and std, (0.1307,) and (0.3081,), computed?
- Note
VisionDataset, which MNIST inherits, is the base class for building datasets compatible with torchvision; VisionDataset is itself a subclass of Dataset
- Demonstration
```python
from torchvision import datasets
import random

# train=True loads the training set
train_dataset = datasets.MNIST(root='../dataset/mnist/',
                               train=True,
                               download=False,  # set to True on the first run to download
                               transform=None)
# train=False loads the test set
test_dataset = datasets.MNIST(root='../dataset/mnist/',
                              train=False,
                              download=False,  # set to True on the first run to download
                              transform=None)
print(type(train_dataset))  # <class 'torchvision.datasets.mnist.MNIST'>
print(len(train_dataset))   # 60000
print(len(test_dataset))    # 10000
item = random.choice(train_dataset)
print(type(item))  # <class 'tuple'>
print(item)  # (<PIL.Image.Image image mode=L size=28x28 at 0x1CF74469760>, 5)
```
- Note
train_dataset and test_dataset are two iterables whose elements are tuples;
train_dataset contains 60000 tuples and test_dataset contains 10000 tuples;
each tuple contains 2 elements: the first is the image, the second is the image's label.
CIFAR10
The dataset contains 60000 color images of size 32×32, divided into 10 classes with 6000 images per class.
50000 images are used for training and form 5 training batches of 10000 images each; the remaining 10000 images are used for testing and form a single test batch.
The test batch contains exactly 1000 randomly selected images from each of the 10 classes; the remaining 50000 images are shuffled into the training batches.
Note that within a single training batch the classes are not necessarily balanced, but over the full 50000 training images each class has exactly 5000 images.
torch.optim
torch.optim | Pytorch Docs{target="_blank"}
torch.optim is a PyTorch package that implements common optimization algorithms for neural networks.
Below, torch.optim.SGD is used as an example to show how to use the package.
torch.optim.SGD
SGD stands for stochastic gradient descent.
torch.optim.SGD | Pytorch Docs{target="_blank"}
- Define a custom dataset, as sketched below
The dataset has 108 samples; each sample is a 64-dimensional vector generated uniformly, and each sample has a corresponding 16-dimensional label
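A minimal sketch under the stated assumptions (the class name RandomDataset and the batch size are illustrative):

```python
import torch
from torch.utils.data import Dataset, DataLoader

class RandomDataset(Dataset):
    """108 samples: 64-dimensional uniform vectors with 16-dimensional labels."""
    def __init__(self, n=108):
        self.data = torch.rand(n, 64)    # uniformly generated samples
        self.labels = torch.rand(n, 16)  # 16-dimensional labels

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        return self.data[idx], self.labels[idx]

dataset = RandomDataset()
loader = DataLoader(dataset, batch_size=12, shuffle=True)
```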
- Define a custom model, as sketched below
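A minimal sketch matching the 64-dimensional inputs and 16-dimensional labels above (the class name is illustrative):

```python
import torch.nn as nn

class SimpleModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(64, 16)  # 64 input features -> 16 outputs

    def forward(self, x):
        return self.linear(x)

model = SimpleModel()
```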
- Use torch.optim.SGD to train the custom data and model defined above, as sketched below
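A minimal sketch reusing the loader and model above (the loss choice, learning rate, and epoch count are illustrative):

```python
import torch.optim as optim
from torch import nn

criterion = nn.MSELoss()
optimizer = optim.SGD(model.parameters(), lr=0.01)

for epoch in range(10):
    for inputs, labels in loader:
        optimizer.zero_grad()                    # clear old gradients
        loss = criterion(model(inputs), labels)  # forward pass + loss
        loss.backward()                          # compute gradients
        optimizer.step()                         # apply the SGD update
```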
zero_grad
torch.optim.Optimizer.zero_grad
step
torch.optim.Optimizer.step
TensorBoard
- Docs
How to use TensorBoard with PyTorch
torch.utils.tensorboard
- Video tutorial
Pytorch TensorBoard Tutorial
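A minimal sketch of scalar logging with torch.utils.tensorboard (the log directory and values are illustrative):

```python
from torch.utils.tensorboard import SummaryWriter

writer = SummaryWriter("runs/demo")  # logs go to ./runs/demo
for step in range(100):
    writer.add_scalar("loss", 1.0 / (step + 1), step)  # tag, value, global step
writer.close()
# Then view with: tensorboard --logdir=runs
```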
PyTorch Profiler With TensorBoard
- Docs
PyTorch Profiler With TensorBoard
torch-tb-profiler