
[Paper Notes] Swin UNETR: Semantic Segmentation of Brain Tumors in MRI Images


Author: Sijin Yu

[1] Ali Hatamizadeh, Vishwesh Nath, Yucheng Tang, Dong Yang, Holger R. Roth, and Daguang Xu. Swin UNETR: Swin Transformers for Semantic Segmentation of Brain Tumors in MRI Images. MICCAI, 2022.

Open-source code link

1. Abstract

  • Semantic segmentation of brain tumors is a fundamental medical image analysis task involving multiple MRI imaging modalities; it can assist clinicians in diagnosing patients and in subsequently studying the progression of malignant entities.
  • In recent years, Fully Convolutional Neural Networks (FCNNs) have become the de facto standard for 3D medical image segmentation.
  • The popular "U-shaped" network architecture has achieved state-of-the-art benchmarks on various 2D and 3D semantic segmentation tasks and across different imaging modalities.
  • However, because the convolution kernels in FCNNs have limited size, their ability to model long-range information is sub-optimal, which can cause deficiencies when segmenting tumors of varying sizes.
  • Transformer models, on the other hand, have demonstrated an exceptional ability to capture long-range information in multiple domains, including natural language processing and computer vision.
  • Inspired by the success of ViT and its variants, we propose a new segmentation model named Swin UNEt TRansformers (Swin UNETR).
  • Specifically, the 3D brain tumor semantic segmentation task is reformulated as a sequence-to-sequence prediction problem, in which multi-modal input data is projected into a 1D sequence of embeddings and used as input to a hierarchical Swin Transformer encoder.
  • The Swin Transformer encoder computes self-attention with shifted windows, extracts features at five different resolutions, and is connected to an FCNN-based decoder at each resolution via skip connections.
  • We participated in the BraTS 2021 segmentation challenge, where the proposed model ranked among the top-performing approaches in the validation phase.

2. Motivation & Contribution

2.1 Motivation

  • In healthcare AI, and brain tumor analysis in particular, more advanced segmentation techniques are needed to accurately delineate tumors for diagnosis and pre-operative planning.
  • Current CNN-based brain tumor segmentation methods struggle to capture long-range dependencies because of their limited receptive fields.
  • ViTs have shown promise in capturing long-range information across various domains, suggesting their applicability to improving medical image segmentation.

2.2 Contribution

  • A novel architecture, Swin UNEt TRansformers (Swin UNETR), is proposed, combining a Swin Transformer encoder with a U-shaped CNN decoder for multi-modal 3D brain tumor segmentation.
  • The effectiveness of Swin UNETR is demonstrated in the 2021 Multimodal Brain Tumor Segmentation Challenge (BraTS), where it achieved a top ranking in the validation phase and competitive performance on the test set.

3. Model

[Fig. 1: Overview of the Swin UNETR architecture (figure not reproduced here)]

  1. Partition the input image into patches.

    The input image is $X\in\mathbb{R}^{H\times W\times D\times S}$, where $S$ is the number of input channels (modalities). Each patch has resolution $(H', W', D')$ and therefore shape $\mathbb{R}^{H'\times W'\times D'\times S}$.

    The image thus becomes a sequence of patches of length $\lceil\frac{H}{H'}\rceil\times\lceil\frac{W}{W'}\rceil\times\lceil\frac{D}{D'}\rceil$.

    In this paper, the patch size is $(H', W', D') = (2, 2, 2)$.

    Each patch is mapped to a token with embedding dimension $C$, yielding a 3D grid of tokens with resolution $\left(\lceil\frac{H}{H'}\rceil, \lceil\frac{W}{W'}\rceil, \lceil\frac{D}{D'}\rceil\right)$.

  2. Apply a Swin Transformer to the 3D tokens.

    Each Swin Transformer block consists of two sub-layers: W-MSA and SW-MSA.

    After each Swin Transformer stage, the 3D token resolution along every axis is halved and the number of channels is doubled; see the lower-left corner of Fig. 1. (A small worked sketch of these shapes is given after the figure below.)

    W-MSA and SW-MSA are window-based multi-head self-attention with regular and cyclically shifted window partitioning, respectively, as illustrated in the figure below.

[Figure: regular (W-MSA) vs. cyclically shifted (SW-MSA) window partitioning (not reproduced here)]
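
To make the shapes concrete, here is a minimal sketch (my own, not from the paper or its code release) that traces the token-grid resolution and channel count through patch embedding and the four Swin stages. The input size 128³, S = 4 modalities, and embedding dimension C = 48 are assumptions chosen to match a typical BraTS configuration.

import math

H = W = D = 128        # assumed input volume size (for illustration only)
S = 4                  # number of MRI modalities (input channels)
Hp, Wp, Dp = 2, 2, 2   # patch size (H', W', D')
C = 48                 # assumed embedding dimension (feature_size)

# Patch embedding: every (2 x 2 x 2 x S) patch becomes one C-dimensional token.
res = (math.ceil(H / Hp), math.ceil(W / Wp), math.ceil(D / Dp))
ch = C
print("after patch embedding:", res, "channels:", ch)   # (64, 64, 64), 48

# Each Swin stage halves the token resolution per axis and doubles the channels.
for stage in range(1, 5):
    res = tuple(r // 2 for r in res)
    ch *= 2
    print(f"after stage {stage}:", res, "channels:", ch)
# after stage 4: (4, 4, 4), channels 768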

4. Experiment

4.1 Dataset

  • BraTS 2021

4.2 Comparison Experiments

[Figure: comparison with other methods on BraTS 2021 (not reproduced here)]

5. Code

The following link provides a tutorial on BraTS21 brain tumor segmentation with the Swin UNETR model:

Below are annotated excerpts of the core code.

5.1 Data Preprocessing and Augmentation

from monai import transforms

train_transform = transforms.Compose(
  [
    # Load the image and label volumes
    transforms.LoadImaged(keys=["image", "label"]),
    # Convert the single-channel label map into a multi-channel format, one channel per
    # tumor class (before conversion, all class labels share a single-channel image)
    transforms.ConvertToMultiChannelBasedOnBratsClassesd(keys="label"),
    # Crop away the background region surrounding the image
    transforms.CropForegroundd(
        keys=["image", "label"],
        source_key="image",
        k_divisible=[roi[0], roi[1], roi[2]],
    ),
    # Randomly crop the image to the specified size
    transforms.RandSpatialCropd(
        keys=["image", "label"],
        roi_size=[roi[0], roi[1], roi[2]],
        random_size=False,
    ),
    # Random flip along axis 0
    transforms.RandFlipd(keys=["image", "label"], prob=0.5, spatial_axis=0),
    # Random flip along axis 1
    transforms.RandFlipd(keys=["image", "label"], prob=0.5, spatial_axis=1),
    # Random flip along axis 2
    transforms.RandFlipd(keys=["image", "label"], prob=0.5, spatial_axis=2),
    # Per-channel intensity normalization, ignoring zero-valued voxels
    transforms.NormalizeIntensityd(keys="image", nonzero=True, channel_wise=True),
    # Random intensity scaling: img = img * (1 + eps)
    transforms.RandScaleIntensityd(keys="image", factors=0.1, prob=1.0),
    # Random intensity shift: img = img + eps
    transforms.RandShiftIntensityd(keys="image", offsets=0.1, prob=1.0),
  ]
)

val_transform = transforms.Compose(
  [
    transforms.LoadImaged(keys=["image", "label"]),
    transforms.ConvertToMultiChannelBasedOnBratsClassesd(keys="label"),
    transforms.NormalizeIntensityd(keys="image", nonzero=True, channel_wise=True),
  ]
)
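
For reference, a minimal usage sketch of the pipeline above. The file names are placeholders and `roi` is an assumption (the tutorial defines it, typically a cubic crop such as 128³); it must be defined before `train_transform` is built, since the transform references it.

# Illustrative only: roi must already be defined when train_transform is constructed above.
roi = (128, 128, 128)   # assumed training crop size

sample = {
    # Placeholder paths: one file per MRI modality plus the segmentation label.
    "image": [
        "BraTS2021_00000_flair.nii.gz",
        "BraTS2021_00000_t1ce.nii.gz",
        "BraTS2021_00000_t1.nii.gz",
        "BraTS2021_00000_t2.nii.gz",
    ],
    "label": "BraTS2021_00000_seg.nii.gz",
}

out = train_transform(sample)
# Expected (approximately): image of shape (4, 128, 128, 128), one channel per modality,
# and label of shape (3, 128, 128, 128), one channel per tumor sub-region.
print(out["image"].shape, out["label"].shape)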

5.2 Swin UNETR Model Architecture

def forward(self, x_in):
  if not torch.jit.is_scripting():
    self._check_input_size(x_in.shape[2:])
  # Hierarchical features from the Swin Transformer encoder (five resolutions)
  hidden_states_out = self.swinViT(x_in, self.normalize)
  # CNN encoders applied to the input and to each hierarchical feature map
  enc0 = self.encoder1(x_in)
  enc1 = self.encoder2(hidden_states_out[0])
  enc2 = self.encoder3(hidden_states_out[1])
  enc3 = self.encoder4(hidden_states_out[2])
  # Bottleneck on the deepest feature map
  dec4 = self.encoder10(hidden_states_out[4])
  # Decoder: upsample step by step, concatenating the matching skip connection at each resolution
  dec3 = self.decoder5(dec4, hidden_states_out[3])
  dec2 = self.decoder4(dec3, enc3)
  dec1 = self.decoder3(dec2, enc2)
  dec0 = self.decoder2(dec1, enc1)
  out = self.decoder1(dec0, enc0)
  # Final 1x1x1 convolution producing per-voxel class logits
  logits = self.out(out)
  return logits
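
This `forward` is from MONAI's SwinUNETR implementation. A minimal end-to-end sketch follows; the constructor arguments are assumptions matching a typical BraTS setup (4 modalities in, 3 tumor sub-region channels out, feature_size=48), and the exact signature may vary across MONAI versions (img_size is deprecated in newer releases).

import torch
from monai.networks.nets import SwinUNETR

# Assumed BraTS-style configuration; adjust to your MONAI version if needed.
model = SwinUNETR(
    img_size=(128, 128, 128),   # required by older MONAI versions, deprecated in newer ones
    in_channels=4,              # 4 MRI modalities
    out_channels=3,             # 3 tumor sub-region channels
    feature_size=48,
    use_checkpoint=True,
)

x = torch.randn(1, 4, 128, 128, 128)   # (batch, modalities, H, W, D)
with torch.no_grad():
    logits = model(x)
print(logits.shape)   # (1, 3, 128, 128, 128): per-voxel logits for each sub-region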

The components are defined as follows:

self.normalize = normalize

self.swinViT = SwinTransformer(
  in_chans=in_channels,
  embed_dim=feature_size,
  window_size=window_size,
  patch_size=patch_sizes,
  depths=depths,
  num_heads=num_heads,
  mlp_ratio=4.0,
  qkv_bias=True,
  drop_rate=drop_rate,
  attn_drop_rate=attn_drop_rate,
  drop_path_rate=dropout_path_rate,
  norm_layer=nn.LayerNorm,
  use_checkpoint=use_checkpoint,
  spatial_dims=spatial_dims,
  downsample=look_up_option(downsample, MERGING_MODE) if isinstance(downsample, str) else downsample,
  use_v2=use_v2,
)

self.encoder1 = UnetrBasicBlock(
  spatial_dims=spatial_dims,
  in_channels=in_channels,
  out_channels=feature_size,
  kernel_size=3,
  stride=1,
  norm_name=norm_name,
  res_block=True,
)

self.encoder2 = UnetrBasicBlock(
  spatial_dims=spatial_dims,
  in_channels=feature_size,
  out_channels=feature_size,
  kernel_size=3,
  stride=1,
  norm_name=norm_name,
  res_block=True,
)

self.encoder3 = UnetrBasicBlock(
  spatial_dims=spatial_dims,
  in_channels=2 * feature_size,
  out_channels=2 * feature_size,
  kernel_size=3,
  stride=1,
  norm_name=norm_name,
  res_block=True,
)

self.encoder4 = UnetrBasicBlock(
  spatial_dims=spatial_dims,
  in_channels=4 * feature_size,
  out_channels=4 * feature_size,
  kernel_size=3,
  stride=1,
  norm_name=norm_name,
  res_block=True,
)

self.encoder10 = UnetrBasicBlock(
  spatial_dims=spatial_dims,
  in_channels=16 * feature_size,
  out_channels=16 * feature_size,
  kernel_size=3,
  stride=1,
  norm_name=norm_name,
  res_block=True,
)

self.decoder5 = UnetrUpBlock(
  spatial_dims=spatial_dims,
  in_channels=16 * feature_size,
  out_channels=8 * feature_size,
  kernel_size=3,
  upsample_kernel_size=2,
  norm_name=norm_name,
  res_block=True,
)

self.decoder4 = UnetrUpBlock(
  spatial_dims=spatial_dims,
  in_channels=feature_size * 8,
  out_channels=feature_size * 4,
  kernel_size=3,
  upsample_kernel_size=2,
  norm_name=norm_name,
  res_block=True,
)

self.decoder3 = UnetrUpBlock(
  spatial_dims=spatial_dims,
  in_channels=feature_size * 4,
  out_channels=feature_size * 2,
  kernel_size=3,
  upsample_kernel_size=2,
  norm_name=norm_name,
  res_block=True,
)
self.decoder2 = UnetrUpBlock(
  spatial_dims=spatial_dims,
  in_channels=feature_size * 2,
  out_channels=feature_size,
  kernel_size=3,
  upsample_kernel_size=2,
  norm_name=norm_name,
  res_block=True,
)

self.decoder1 = UnetrUpBlock(
  spatial_dims=spatial_dims,
  in_channels=feature_size,
  out_channels=feature_size,
  kernel_size=3,
  upsample_kernel_size=2,
  norm_name=norm_name,
  res_block=True,
)

self.out = UnetOutBlock(spatial_dims=spatial_dims, in_channels=feature_size, out_channels=out_channels)
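
Tracing the channel counts through these components (hand-computed under the assumptions of a 128³ input and feature_size = 48; not output produced by the code):

  • hidden_states_out[0..4] from swinViT have 48, 96, 192, 384, and 768 channels at resolutions 64³, 32³, 16³, 8³, and 4³.
  • enc0 = encoder1(x_in) has 48 channels at 128³; enc1, enc2, enc3 keep 48, 96, and 192 channels at 64³, 32³, and 16³.
  • dec4 = encoder10(hidden_states_out[4]) is the bottleneck: 768 channels at 4³.
  • decoder5 through decoder1 halve the channels and double the resolution at each step (768 → 384 → 192 → 96 → 48 → 48), concatenating the matching skip feature before each convolution.
  • self.out maps the final 48 channels to out_channels with a 1×1×1 convolution.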

5.2.1 SwinTransformer
class SwinTransformer(nn.Module):
  """
  Swin Transformer based on: "Liu et al.,
  Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
  <https://arxiv.org/abs/2103.14030>"
  https://github.com/microsoft/Swin-Transformer
  """

  def __init__(
    self,
    in_chans: int,
    embed_dim: int,
    window_size: Sequence[int],
    patch_size: Sequence[int],
    depths: Sequence[int],
    num_heads: Sequence[int],
    mlp_ratio: float = 4.0,
    qkv_bias: bool = True,
    drop_rate: float = 0.0,
    attn_drop_rate: float = 0.0,
    drop_path_rate: float = 0.0,
    norm_layer: type[LayerNorm] = nn.LayerNorm,
    patch_norm: bool = False,
    use_checkpoint: bool = False,
    spatial_dims: int = 3,
    downsample="merging",
    use_v2=False,
  ) -> None:
  """
  Args:
    in_chans: dimension of input channels.
    embed_dim: number of linear projection output channels.
    window_size: local window size.
    patch_size: patch size.
    depths: number of layers in each stage.
    num_heads: number of attention heads.
    mlp_ratio: ratio of mlp hidden dim to embedding dim.
    qkv_bias: add a learnable bias to query, key, value.
    drop_rate: dropout rate.
    attn_drop_rate: attention dropout rate.
    drop_path_rate: stochastic depth rate.
    norm_layer: normalization layer.
    patch_norm: add normalization after patch embedding.
    use_checkpoint: use gradient checkpointing for reduced memory usage.
    spatial_dims: spatial dimension.
    downsample: module used for downsampling, available options are `"mergingv2"`, `"merging"` and a
        user-specified `nn.Module` following the API defined in :py:class:`monai.networks.nets.PatchMerging`.
        The default is currently `"merging"` (the original version defined in v0.9.0).
    use_v2: using swinunetr_v2, which adds a residual convolution block at the beginning of each swin stage.
  """
    super().__init__()
    self.num_layers = len(depths)
    self.embed_dim = embed_dim
    self.patch_norm = patch_norm
    self.window_size = window_size
    self.patch_size = patch_size
    self.patch_embed = PatchEmbed(
        patch_size=self.patch_size,
        in_chans=in_chans,
        embed_dim=embed_dim,
        norm_layer=norm_layer if self.patch_norm else None,  # type: ignore
        spatial_dims=spatial_dims,
    )
    self.pos_drop = nn.Dropout(p=drop_rate)
    dpr = [x.item() for x in torch.linspace(0, drop_path_rate, sum(depths))]
    self.use_v2 = use_v2
    self.layers1 = nn.ModuleList()
    self.layers2 = nn.ModuleList()
    self.layers3 = nn.ModuleList()
    self.layers4 = nn.ModuleList()
    if self.use_v2:
      self.layers1c = nn.ModuleList()
      self.layers2c = nn.ModuleList()
      self.layers3c = nn.ModuleList()
      self.layers4c = nn.ModuleList()
    down_sample_mod = look_up_option(downsample, MERGING_MODE) if isinstance(downsample, str) else downsample
    for i_layer in range(self.num_layers):
      layer = BasicLayer(
        dim=int(embed_dim * 2**i_layer),
        depth=depths[i_layer],
        num_heads=num_heads[i_layer],
        window_size=self.window_size,
        drop_path=dpr[sum(depths[:i_layer]) : sum(depths[: i_layer + 1])],
        mlp_ratio=mlp_ratio,
        qkv_bias=qkv_bias,
        drop=drop_rate,
        attn_drop=attn_drop_rate,
        norm_layer=norm_layer,
        downsample=down_sample_mod,
        use_checkpoint=use_checkpoint,
        )
      if i_layer == 0:
        self.layers1.append(layer)
      elif i_layer == 1:
        self.layers2.append(layer)
      elif i_layer == 2:
        self.layers3.append(layer)
      elif i_layer == 3:
        self.layers4.append(layer)
      if self.use_v2:
        layerc = UnetrBasicBlock(
          spatial_dims=3,
          in_channels=embed_dim * 2**i_layer,
          out_channels=embed_dim * 2**i_layer,
          kernel_size=3,
          stride=1,
          norm_name="instance",
          res_block=True,
        )
        # register the extra residual conv block only when use_v2 is enabled (layerc is defined above)
        if i_layer == 0:
          self.layers1c.append(layerc)
        elif i_layer == 1:
          self.layers2c.append(layerc)
        elif i_layer == 2:
          self.layers3c.append(layerc)
        elif i_layer == 3:
          self.layers4c.append(layerc)
    self.num_features = int(embed_dim * 2 ** (self.num_layers - 1))

  def proj_out(self, x, normalize=False):
    if normalize:
      x_shape = x.size()
      if len(x_shape) == 5:
        n, ch, d, h, w = x_shape
        x = rearrange(x, "n c d h w -> n d h w c")
        x = F.layer_norm(x, [ch])
        x = rearrange(x, "n d h w c -> n c d h w")
      elif len(x_shape) == 4:
        n, ch, h, w = x_shape
        x = rearrange(x, "n c h w -> n h w c")
        x = F.layer_norm(x, [ch])
        x = rearrange(x, "n h w c -> n c h w")
    return x

  def forward(self, x, normalize=True):
    x0 = self.patch_embed(x)
    x0 = self.pos_drop(x0)
    x0_out = self.proj_out(x0, normalize)
    if self.use_v2:
      x0 = self.layers1c[0](x0.contiguous())
    x1 = self.layers1[0](x0.contiguous())
    x1_out = self.proj_out(x1, normalize)
    if self.use_v2:
      x1 = self.layers2c[0](x1.contiguous())
    x2 = self.layers2[0](x1.contiguous())
    x2_out = self.proj_out(x2, normalize)
    if self.use_v2:
      x2 = self.layers3c[0](x2.contiguous())
    x3 = self.layers3[0](x2.contiguous())
    x3_out = self.proj_out(x3, normalize)
    if self.use_v2:
      x3 = self.layers4c[0](x3.contiguous())
    x4 = self.layers4[0](x3.contiguous())
    x4_out = self.proj_out(x4, normalize)
    # hierarchical features at 1/2, 1/4, 1/8, 1/16, and 1/32 of the input resolution
    return [x0_out, x1_out, x2_out, x3_out, x4_out]

5.2.2 UnetrBasicBlock
class UnetrBasicBlock(nn.Module):
  """
  A CNN module that can be used for UNETR, based on: "Hatamizadeh et al.,
  UNETR: Transformers for 3D Medical Image Segmentation <https://arxiv.org/abs/2103.10504>"
  """

  def __init__(
    self,
    spatial_dims: int,
    in_channels: int,
    out_channels: int,
    kernel_size: Sequence[int] | int,
    stride: Sequence[int] | int,
    norm_name: tuple | str,
    res_block: bool = False,
  ) -> None:
    """
    Args:
      spatial_dims: number of spatial dimensions.
      in_channels: number of input channels.
      out_channels: number of output channels.
      kernel_size: convolution kernel size.
      stride: convolution stride.
      norm_name: feature normalization type and arguments.
      res_block: bool argument to determine if residual block is used.
    """

    super().__init__()

    if res_block:
      self.layer = UnetResBlock(
        spatial_dims=spatial_dims,
        in_channels=in_channels,
        out_channels=out_channels,
        kernel_size=kernel_size,
        stride=stride,
        norm_name=norm_name,
      )
    else:
      self.layer = UnetBasicBlock(  # type: ignore
        spatial_dims=spatial_dims,
        in_channels=in_channels,
        out_channels=out_channels,
        kernel_size=kernel_size,
        stride=stride,
        norm_name=norm_name,
      )

  def forward(self, inp):
    return self.layer(inp)

5.2.3 UnetrUpBlock
class UnetrUpBlock(nn.Module):
  """
  An upsampling module that can be used for UNETR: "Hatamizadeh et al.,
  UNETR: Transformers for 3D Medical Image Segmentation <https://arxiv.org/abs/2103.10504>"
  """

  def __init__(
    self,
    spatial_dims: int,
    in_channels: int,
    out_channels: int,
    kernel_size: Sequence[int] | int,
    upsample_kernel_size: Sequence[int] | int,
    norm_name: tuple | str,
    res_block: bool = False,
  ) -> None:
    """
    Args:
      spatial_dims: number of spatial dimensions.
      in_channels: number of input channels.
      out_channels: number of output channels.
      kernel_size: convolution kernel size.
      upsample_kernel_size: convolution kernel size for transposed convolution layers.
      norm_name: feature normalization type and arguments.
      res_block: bool argument to determine if residual block is used.
    """
    super().__init__()
    upsample_stride = upsample_kernel_size
    self.transp_conv = get_conv_layer(
      spatial_dims,
      in_channels,
      out_channels,
      kernel_size=upsample_kernel_size,
      stride=upsample_stride,
      conv_only=True,
      is_transposed=True,
    )

    if res_block:
      self.conv_block = UnetResBlock(
        spatial_dims,
        out_channels + out_channels,
        out_channels,
        kernel_size=kernel_size,
        stride=1,
        norm_name=norm_name,
      )
    else:
      self.conv_block = UnetBasicBlock(  # type: ignore
        spatial_dims,
        out_channels + out_channels,
        out_channels,
        kernel_size=kernel_size,
        stride=1,
        norm_name=norm_name,
      )

  def forward(self, inp, skip):
    # the number of channels of `skip` should equal out_channels
    out = self.transp_conv(inp)
    out = torch.cat((out, skip), dim=1)
    out = self.conv_block(out)
    return out
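
A quick shape check for this block, configured like decoder5 above (in_channels = 16 × feature_size = 768, out_channels = 8 × feature_size = 384, assuming feature_size = 48); the tensors are random and only illustrate the upsample-concatenate-convolve pattern.

import torch
from monai.networks.blocks import UnetrUpBlock

up = UnetrUpBlock(
    spatial_dims=3,
    in_channels=768,     # 16 * feature_size, assuming feature_size = 48
    out_channels=384,    # 8 * feature_size
    kernel_size=3,
    upsample_kernel_size=2,
    norm_name="instance",
    res_block=True,
)

dec4 = torch.randn(1, 768, 4, 4, 4)    # bottleneck-shaped feature
skip = torch.randn(1, 384, 8, 8, 8)    # hidden_states_out[3]-shaped skip feature
out = up(dec4, skip)
print(out.shape)   # (1, 384, 8, 8, 8)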

5.2.4 UnetOutBlock
class UnetOutBlock(nn.Module):
  def __init__(
    self, spatial_dims: int, in_channels: int, out_channels: int, dropout: tuple | str | float | None = None
  ):
    super().__init__()
    self.conv = get_conv_layer(
      spatial_dims,
      in_channels,
      out_channels,
      kernel_size=1,
      stride=1,
      dropout=dropout,
      bias=True,
      act=None,
      norm=None,
      conv_only=False,
    )

  def forward(self, inp):
    return self.conv(inp)
