1. The SE Attention Module
ResNet (Residual Network) is a classic deep convolutional neural network architecture. Its residual (skip) connections alleviate the vanishing-gradient problem in deep networks, allowing much deeper models to be trained. The SE (Squeeze-and-Excitation) module is an attention mechanism that strengthens feature representations by learning dependencies between channels. Fusing SE modules into ResNet can further improve model performance.
The basic idea of fusing SE modules into ResNet
The core idea of the SE module is to recalibrate each channel's weight dynamically via global pooling (Squeeze) and fully connected layers (Excitation), amplifying informative features while suppressing less useful ones. Inserting an SE module into each of ResNet's residual blocks strengthens the block's feature extraction.
Implementation steps
Structure of the SE module:
Squeeze: global average pooling compresses each channel's spatial dimensions into a single scalar.
Excitation: two fully connected (FC) layers learn the dependencies between channels and produce a weight for each channel.
Scale: the learned weights are multiplied with the original feature map, yielding a re-weighted feature map.
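In equations (following the SENet paper's standard formulation), for an input feature map x with C channels of spatial size H x W:

z_c = \frac{1}{HW} \sum_{i=1}^{H} \sum_{j=1}^{W} x_c(i, j)    (Squeeze)
s = \sigma\left( W_2 \, \delta(W_1 z) \right)    (Excitation, with \delta = ReLU and \sigma = sigmoid)
\tilde{x}_c = s_c \cdot x_c    (Scale)

where W_1 \in \mathbb{R}^{(C/r) \times C}, W_2 \in \mathbb{R}^{C \times (C/r)}, and r is the reduction ratio (16 in the code below).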
Inserting the SE module into ResNet's residual blocks:
Within each residual block, the SE module is usually inserted after the convolutional layers and before the residual addition.
The SE module re-weights the convolution output, which is then added to the shortcut branch.
Code (PyTorch):
Below is a simple implementation of a ResNet residual block fused with an SE module:
import torch
import torch.nn as nn
import torch.nn.functional as F

class SEBlock(nn.Module):
    def __init__(self, channel, reduction=16):
        super(SEBlock, self).__init__()
        self.avg_pool = nn.AdaptiveAvgPool2d(1)  # squeeze: global average pooling
        self.fc = nn.Sequential(
            nn.Linear(channel, channel // reduction, bias=False),  # reduce to C/r
            nn.ReLU(inplace=True),
            nn.Linear(channel // reduction, channel, bias=False),  # restore to C
            nn.Sigmoid()  # per-channel weights in (0, 1)
        )

    def forward(self, x):
        b, c, _, _ = x.size()
        y = self.avg_pool(x).view(b, c)   # Squeeze
        y = self.fc(y).view(b, c, 1, 1)   # Excitation
        return x * y.expand_as(x)         # Scale

class ResNetBlockWithSE(nn.Module):
    def __init__(self, in_channels, out_channels, stride=1, reduction=16):
        super(ResNetBlockWithSE, self).__init__()
        self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_channels)
        self.conv2 = nn.Conv2d(out_channels, out_channels, kernel_size=3, stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_channels)
        self.se = SEBlock(out_channels, reduction)
        # project the shortcut with a 1x1 convolution when the shape changes
        self.shortcut = nn.Sequential()
        if stride != 1 or in_channels != out_channels:
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_channels, out_channels, kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(out_channels)
            )

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        out = self.se(out)  # apply the SE module before the residual addition
        out += self.shortcut(x)
        out = F.relu(out)
        return out
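A quick shape check of the block (a minimal sketch; the 56x56 input size matches ResNet's first stage):

# Downsampling block: 64 -> 128 channels, spatial size halved by stride=2.
block = ResNetBlockWithSE(64, 128, stride=2)
x = torch.randn(1, 64, 56, 56)
print(block(x).shape)  # torch.Size([1, 128, 28, 28])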
Network structure:
After inserting SE modules into the residual blocks, the overall network topology stays the same; only each block's feature extraction is strengthened.
You can choose which residual blocks receive an SE module; deeper layers are a common choice, since their features are more abstract and high-level (see the sketch below).
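One simple way to make the insertion configurable is to let the SE branch be switched off per block. The sketch below uses a hypothetical use_se flag (not part of the original design) with nn.Identity as a no-op stand-in:

# Sketch: optional SE branch, e.g. to enable SE only in deeper stages.
class ResNetBlockOptionalSE(ResNetBlockWithSE):
    def __init__(self, in_channels, out_channels, stride=1, reduction=16, use_se=True):
        super().__init__(in_channels, out_channels, stride, reduction)
        if not use_se:
            self.se = nn.Identity()  # no-op: falls back to a plain residual block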
Advantages
Stronger feature representation: by dynamically re-weighting channels, the SE module emphasizes informative features.
Lightweight: the SE module's computational overhead is small, adding little to the model's parameter count or complexity (a rough count follows this list).
Generality: the SE module can be dropped into other architectures, such as ResNet or Inception, with little effort.
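To put a number on "lightweight": one SEBlock contains only the two bias-free FC layers, i.e. 2·C²/r parameters. A small sketch (the helper se_params is made up for illustration):

# Parameters of one SEBlock: C*(C/r) + (C/r)*C = 2*C^2/r (both Linear layers are bias-free).
def se_params(channels, reduction=16):
    return 2 * channels * channels // reduction

for c in (64, 128, 256, 512):
    print(c, se_params(c))  # 512, 2048, 8192, 32768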
2. ResNet-34 + SE
Adding SE (Squeeze-and-Excitation) modules to ResNet-34 means modifying each of its residual blocks to include an SE module. Below is a complete implementation of ResNet-34 fused with SE modules (in PyTorch).
The code:
import torch
import torch.nn as nn
import torch.nn.functional as F
# Define the SE module
class SEBlock(nn.Module):
    def __init__(self, channel, reduction=16):
        super(SEBlock, self).__init__()
        self.avg_pool = nn.AdaptiveAvgPool2d(1)  # global average pooling
        self.fc = nn.Sequential(
            nn.Linear(channel, channel // reduction, bias=False),  # reduce dimensionality
            nn.ReLU(inplace=True),
            nn.Linear(channel // reduction, channel, bias=False),  # restore dimensionality
            nn.Sigmoid()  # activation that produces the channel weights
        )

    def forward(self, x):
        b, c, _, _ = x.size()
        y = self.avg_pool(x).view(b, c)   # Squeeze
        y = self.fc(y).view(b, c, 1, 1)   # Excitation
        return x * y.expand_as(x)         # Scale
# Define the residual block (with SE module)
class ResNetBlockWithSE(nn.Module):
    def __init__(self, in_channels, out_channels, stride=1, downsample=None, reduction=16):
        super(ResNetBlockWithSE, self).__init__()
        self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_channels)
        self.conv2 = nn.Conv2d(out_channels, out_channels, kernel_size=3, stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_channels)
        self.se = SEBlock(out_channels, reduction)  # add the SE module
        self.downsample = downsample
        self.stride = stride

    def forward(self, x):
        identity = x
        out = self.conv1(x)
        out = self.bn1(out)
        out = F.relu(out)
        out = self.conv2(out)
        out = self.bn2(out)
        out = self.se(out)  # apply the SE module
        if self.downsample is not None:
            identity = self.downsample(x)
        out += identity
        out = F.relu(out)
        return out
# Define ResNet-34
class ResNet34WithSE(nn.Module):
    def __init__(self, num_classes=1000, reduction=16):
        super(ResNet34WithSE, self).__init__()
        self.in_channels = 64
        # initial convolution (stem)
        self.conv1 = nn.Conv2d(3, self.in_channels, kernel_size=7, stride=2, padding=3, bias=False)
        self.bn1 = nn.BatchNorm2d(self.in_channels)
        self.relu = nn.ReLU(inplace=True)
        self.maxpool = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        # the four ResNet stages (3, 4, 6, 3 blocks for ResNet-34)
        self.layer1 = self._make_layer(64, 3, stride=1, reduction=reduction)
        self.layer2 = self._make_layer(128, 4, stride=2, reduction=reduction)
        self.layer3 = self._make_layer(256, 6, stride=2, reduction=reduction)
        self.layer4 = self._make_layer(512, 3, stride=2, reduction=reduction)
        # global average pooling and classifier
        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        self.fc = nn.Linear(512, num_classes)
    def _make_layer(self, out_channels, num_blocks, stride, reduction):
        downsample = None
        # when the shape changes, project the identity branch with a 1x1 convolution
        if stride != 1 or self.in_channels != out_channels:
            downsample = nn.Sequential(
                nn.Conv2d(self.in_channels, out_channels, kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(out_channels)
            )
        layers = []
        # the first block of a stage may downsample; the remaining blocks keep the shape
        layers.append(ResNetBlockWithSE(self.in_channels, out_channels, stride, downsample, reduction))
        self.in_channels = out_channels
        for _ in range(1, num_blocks):
            layers.append(ResNetBlockWithSE(self.in_channels, out_channels, reduction=reduction))
        return nn.Sequential(*layers)
    def forward(self, x):
        x = self.conv1(x)
        x = self.bn1(x)
        x = self.relu(x)
        x = self.maxpool(x)
        x = self.layer1(x)
        x = self.layer2(x)
        x = self.layer3(x)
        x = self.layer4(x)
        x = self.avgpool(x)
        x = torch.flatten(x, 1)
        x = self.fc(x)
        return x
# Test the model
if __name__ == "__main__":
    model = ResNet34WithSE(num_classes=1000)
    input_tensor = torch.randn(1, 3, 224, 224)  # input tensor (batch_size, channels, height, width)
    output = model(input_tensor)
    print(model)          # print the network structure
    print(output.shape)   # expected output shape: (1, 1000)
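As an optional sanity check (assuming torchvision is installed), you can compare the parameter count with the plain torchvision ResNet-34; since the two architectures otherwise match, the difference should be just the SE overhead, roughly 157k parameters (under 1% of the ~21.8M baseline):

from torchvision.models import resnet34

def count(m):
    return sum(p.numel() for p in m.parameters())

se_model = ResNet34WithSE(num_classes=1000)
plain_model = resnet34(num_classes=1000)
print(count(se_model) - count(plain_model))  # ~157,184 extra parameters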
Code notes
SE module: SEBlock implements the Squeeze-and-Excitation operation, generating per-channel weights via global average pooling and two fully connected layers. Inside each residual block, the SE module re-weights the convolutional feature map.
Residual block: ResNetBlockWithSE is ResNet's basic residual block, containing two convolutional layers and one SE module. When the input and output differ in channel count or spatial size, the identity branch is adjusted through downsample.
ResNet-34 structure: ResNet34WithSE defines the complete ResNet-34, with four stages (layer1 to layer4), each composed of several residual blocks. The _make_layer method builds each stage's blocks, inserting an SE module into every one.
Test: given a random input tensor of shape (1, 3, 224, 224), the model outputs a tensor of shape (1, 1000), i.e. scores for 1000 classes.
Running the script prints the network structure followed by the output shape:
ResNet34WithSE(
(conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(relu): ReLU(inplace=True)
(maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
(layer1): Sequential(
(0): ResNetBlockWithSE(
(conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(se): SEBlock(
(avg_pool): AdaptiveAvgPool2d(output_size=1)
(fc): Sequential(
(0): Linear(in_features=64, out_features=4, bias=False)
(1): ReLU(inplace=True)
(2): Linear(in_features=4, out_features=64, bias=False)
(3): Sigmoid()
)
)
)
(1): ResNetBlockWithSE(
(conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(se): SEBlock(
(avg_pool): AdaptiveAvgPool2d(output_size=1)
(fc): Sequential(
(0): Linear(in_features=64, out_features=4, bias=False)
(1): ReLU(inplace=True)
(2): Linear(in_features=4, out_features=64, bias=False)
(3): Sigmoid()
)
)
)
(2): ResNetBlockWithSE(
(conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(se): SEBlock(
(avg_pool): AdaptiveAvgPool2d(output_size=1)
(fc): Sequential(
(0): Linear(in_features=64, out_features=4, bias=False)
(1): ReLU(inplace=True)
(2): Linear(in_features=4, out_features=64, bias=False)
(3): Sigmoid()
)
)
)
)
(layer2): Sequential(
(0): ResNetBlockWithSE(
(conv1): Conv2d(64, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(se): SEBlock(
(avg_pool): AdaptiveAvgPool2d(output_size=1)
(fc): Sequential(
(0): Linear(in_features=128, out_features=8, bias=False)
(1): ReLU(inplace=True)
(2): Linear(in_features=8, out_features=128, bias=False)
(3): Sigmoid()
)
)
(downsample): Sequential(
(0): Conv2d(64, 128, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(1): ResNetBlockWithSE(
(conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(se): SEBlock(
(avg_pool): AdaptiveAvgPool2d(output_size=1)
(fc): Sequential(
(0): Linear(in_features=128, out_features=8, bias=False)
(1): ReLU(inplace=True)
(2): Linear(in_features=8, out_features=128, bias=False)
(3): Sigmoid()
)
)
)
(2): ResNetBlockWithSE(
(conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(se): SEBlock(
(avg_pool): AdaptiveAvgPool2d(output_size=1)
(fc): Sequential(
(0): Linear(in_features=128, out_features=8, bias=False)
(1): ReLU(inplace=True)
(2): Linear(in_features=8, out_features=128, bias=False)
(3): Sigmoid()
)
)
)
(3): ResNetBlockWithSE(
(conv1): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(128, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(se): SEBlock(
(avg_pool): AdaptiveAvgPool2d(output_size=1)
(fc): Sequential(
(0): Linear(in_features=128, out_features=8, bias=False)
(1): ReLU(inplace=True)
(2): Linear(in_features=8, out_features=128, bias=False)
(3): Sigmoid()
)
)
)
)
(layer3): Sequential(
(0): ResNetBlockWithSE(
(conv1): Conv2d(128, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(se): SEBlock(
(avg_pool): AdaptiveAvgPool2d(output_size=1)
(fc): Sequential(
(0): Linear(in_features=256, out_features=16, bias=False)
(1): ReLU(inplace=True)
(2): Linear(in_features=16, out_features=256, bias=False)
(3): Sigmoid()
)
)
(downsample): Sequential(
(0): Conv2d(128, 256, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(1): ResNetBlockWithSE(
(conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(se): SEBlock(
(avg_pool): AdaptiveAvgPool2d(output_size=1)
(fc): Sequential(
(0): Linear(in_features=256, out_features=16, bias=False)
(1): ReLU(inplace=True)
(2): Linear(in_features=16, out_features=256, bias=False)
(3): Sigmoid()
)
)
)
(2): ResNetBlockWithSE(
(conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(se): SEBlock(
(avg_pool): AdaptiveAvgPool2d(output_size=1)
(fc): Sequential(
(0): Linear(in_features=256, out_features=16, bias=False)
(1): ReLU(inplace=True)
(2): Linear(in_features=16, out_features=256, bias=False)
(3): Sigmoid()
)
)
)
(3): ResNetBlockWithSE(
(conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(se): SEBlock(
(avg_pool): AdaptiveAvgPool2d(output_size=1)
(fc): Sequential(
(0): Linear(in_features=256, out_features=16, bias=False)
(1): ReLU(inplace=True)
(2): Linear(in_features=16, out_features=256, bias=False)
(3): Sigmoid()
)
)
)
(4): ResNetBlockWithSE(
(conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(se): SEBlock(
(avg_pool): AdaptiveAvgPool2d(output_size=1)
(fc): Sequential(
(0): Linear(in_features=256, out_features=16, bias=False)
(1): ReLU(inplace=True)
(2): Linear(in_features=16, out_features=256, bias=False)
(3): Sigmoid()
)
)
)
(5): ResNetBlockWithSE(
(conv1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(se): SEBlock(
(avg_pool): AdaptiveAvgPool2d(output_size=1)
(fc): Sequential(
(0): Linear(in_features=256, out_features=16, bias=False)
(1): ReLU(inplace=True)
(2): Linear(in_features=16, out_features=256, bias=False)
(3): Sigmoid()
)
)
)
)
(layer4): Sequential(
(0): ResNetBlockWithSE(
(conv1): Conv2d(256, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(se): SEBlock(
(avg_pool): AdaptiveAvgPool2d(output_size=1)
(fc): Sequential(
(0): Linear(in_features=512, out_features=32, bias=False)
(1): ReLU(inplace=True)
(2): Linear(in_features=32, out_features=512, bias=False)
(3): Sigmoid()
)
)
(downsample): Sequential(
(0): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)
(1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
)
)
(1): ResNetBlockWithSE(
(conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(se): SEBlock(
(avg_pool): AdaptiveAvgPool2d(output_size=1)
(fc): Sequential(
(0): Linear(in_features=512, out_features=32, bias=False)
(1): ReLU(inplace=True)
(2): Linear(in_features=32, out_features=512, bias=False)
(3): Sigmoid()
)
)
)
(2): ResNetBlockWithSE(
(conv1): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
(bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
(se): SEBlock(
(avg_pool): AdaptiveAvgPool2d(output_size=1)
(fc): Sequential(
(0): Linear(in_features=512, out_features=32, bias=False)
(1): ReLU(inplace=True)
(2): Linear(in_features=32, out_features=512, bias=False)
(3): Sigmoid()
)
)
)
)
(avgpool): AdaptiveAvgPool2d(output_size=(1, 1))
  (fc): Linear(in_features=512, out_features=1000, bias=True)
)
torch.Size([1, 1000])