
Related papers:
LeNet: Handwritten Digit Recognition with a Back-Propagation Network;
Gradient-Based Learning Applied to Document Recognition (the starting point of CNNs);
AlexNet: ImageNet Classification with Deep Convolutional Neural Networks (laid the groundwork for modern CNNs);
OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks;
ZFNet: Visualizing and Understanding Convolutional Networks (visualization and interpretability work building on AlexNet);
VGG: VERY DEEP CONVOLUTIONAL NETWORKS FOR LARGE-SCALE IMAGE RECOGNITION (pushes the stacking of uniform modules to the extreme);
Inception V1/GoogLeNet: Going deeper with convolutions (starts down an unconventional path, introducing non-standard factorized and parallel modules; the foundation of the Inception architecture);
BN-Inception: Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift (Inception + Batch Normalization);
Inception V2/Inception V3: Rethinking the Inception Architecture for Computer Vision (follows Inception-V1 and leads into Inception-V4 and Xception; continues factorizing the modules);
Inception-V4, Inception-ResNet: Inception-V4, Inception-ResNet and the Impact of Residual Connections on Learning (pure Inception blocks, plus variants combining ResNet and Inception);
Xception: Deep Learning with Depthwise Separable Convolutions (Xception = extreme Inception, i.e. Inception factorized to the extreme; a sketch of a depthwise separable convolution appears after this list);
ResNet V1: Deep Residual Learning for Image Recognition (Kaiming He; introduces the residual connection and starts the ResNet family; a sketch of a residual block appears after this list);
ResNet V2: Identity Mappings in Deep Residual Networks (Kaiming He; refines V1, by the same authors);
DenseNet: Densely Connected Convolutional Networks;
ResNeXt: Aggregated Residual Transformations for Deep Neural Networks (Kaiming He's team);
DualPathNet: Dual Path Networks;
SENet: Squeeze-and-Excitation Networks (proposes the SE module, which can be dropped into other networks with little effort, spawning a family of SE-X networks; a sketch of an SE block appears after this list);
Res2Net: A New Multi-scale Backbone Architecture;
ResNeSt: Split-Attention Networks (a synthesis of the preceding ideas);
NAS: NEURAL ARCHITECTURE SEARCH WITH REINFORCEMENT LEARNING (a foundational work on neural architecture search, letting AI design the network);
NASNet: Learning Transferable Architectures for Scalable Image Recognition (changes the search from predicting per-layer parameters to predicting block parameters);
MnasNet: Platform-Aware Neural Architecture Search for Mobile (targets compute-constrained devices such as mobile phones);
The MobileNet series:
MobileNet V1: Efficient Convolutional Neural Networks for Mobile Vision Applications;
MobileNetV2: Inverted Residuals and Linear Bottlenecks;
MobileNetV3: Searching for MobileNetV3 (an architecture discovered by automated search);
SqueezeNet: ALEXNET-LEVEL ACCURACY WITH 50X FEWER PARAMETERS AND <0.5MB MODEL SIZE (matches AlexNet's accuracy with 50x fewer parameters and a model size under 0.5 MB);
ShuffleNet V1: An Extremely Efficient Convolutional Neural Network for Mobile Devices;
ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design;
EfficientNet V1: Rethinking Model Scaling for Convolutional Neural Networks;
EfficientNetV2: Smaller Models and Faster Training;
Transformer: Attention Is All You Need (the foundational work);
ViT: AN IMAGE IS WORTH 16X16 WORDS: TRANSFORMERS FOR IMAGE RECOGNITION AT SCALE (a milestone in applying Transformers to computer vision);
Swin: Swin Transformer: Hierarchical Vision Transformer using Shifted Windows (a hierarchical vision Transformer);
VAN: Visual Attention Network (not a Transformer; it only borrows Transformer ideas into a CNN);
PVT: Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions (pyramid structure + Transformer);
TNT: Transformer in Transformer;
MLP-Mixer: An all-MLP Architecture for Vision;
ConvMixer: Patches Are All You Need? (examines the hypothesis that ViT's performance is mainly due to using patches as the input representation).
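
For the Xception and MobileNet entries above, a minimal PyTorch-style sketch of a depthwise separable convolution: a per-channel 3x3 depthwise conv followed by a 1x1 pointwise conv that mixes channels. The class name and the BatchNorm/ReLU placement are illustrative assumptions, not the papers' reference implementation.

import torch
import torch.nn as nn

class DepthwiseSeparableConv(nn.Module):  # illustrative name, not from the papers
    def __init__(self, in_ch, out_ch, stride=1):
        super().__init__()
        # depthwise: groups=in_ch, so each 3x3 filter sees only its own input channel
        self.depthwise = nn.Conv2d(in_ch, in_ch, 3, stride=stride, padding=1,
                                   groups=in_ch, bias=False)
        # pointwise: 1x1 conv mixes information across channels
        self.pointwise = nn.Conv2d(in_ch, out_ch, 1, bias=False)
        self.bn = nn.BatchNorm2d(out_ch)
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        return self.act(self.bn(self.pointwise(self.depthwise(x))))

# usage: same output shape as a regular 3x3 conv, far fewer parameters
x = torch.randn(1, 64, 32, 32)
y = DepthwiseSeparableConv(64, 128)(x)    # -> torch.Size([1, 128, 32, 32])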
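
For the ResNet entries, a minimal sketch of a basic residual block: the output is ReLU(F(x) + x), where the identity shortcut lets gradients flow around the two convolutions. Keeping the channel count fixed is an assumption chosen here so the shortcut needs no projection.

import torch
import torch.nn as nn

class ResidualBlock(nn.Module):  # basic block with an identity shortcut
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return torch.relu(out + x)        # residual connection: add the input back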
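
For the SENet entry, a minimal sketch of a Squeeze-and-Excitation block: global average pooling "squeezes" each channel to a scalar, two fully connected layers produce per-channel weights, and the input feature map is rescaled channel-wise. The reduction ratio of 16 follows the paper's default; the class name is an illustrative assumption.

import torch
import torch.nn as nn

class SEBlock(nn.Module):  # illustrative name
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.fc1 = nn.Linear(channels, channels // reduction)
        self.fc2 = nn.Linear(channels // reduction, channels)

    def forward(self, x):
        b, c, _, _ = x.shape
        s = x.mean(dim=(2, 3))             # squeeze: (B, C) channel descriptors
        s = torch.relu(self.fc1(s))
        s = torch.sigmoid(self.fc2(s))     # excitation: per-channel weights in (0, 1)
        return x * s.view(b, c, 1, 1)      # rescale the input channel-wise

# because the block preserves the input shape, it can be dropped into ResNet,
# Inception, and similar backbones, which is where the SE-X family comes from
x = torch.randn(1, 64, 32, 32)
y = SEBlock(64)(x)                         # y.shape == x.shape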