Run Swin Transformer for Semantic Segmentation on Colab

Swin-Transformer-Semantic-Segmentation | GitHub

mmsegmentation | GitHub

mmsegmentation/…/get_started.md#install-on-google-colab | GitHub

get_started/installation | mmcv

Swin Transformer environment setup (semantic segmentation) | ZhiHu

MMSegmentation_Tutorial.ipynb | Colab

install mmcv and mmsegmentation

!pip install openmim
!mim install mmcv

Install mmcv rather than mmcv-full to get the full version; building mmcv-full took so long that I canceled the installation.

MMCV contains C++ and CUDA extensions, thus depending on PyTorch in a complex way. MIM solves such dependencies automatically and makes the installation easier. However, it is not a must.
To install MMCV with pip instead of MIM, please follow the MMCV installation guide. This requires manually specifying a find-url based on your PyTorch version and its CUDA version.
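
For example, installing the prebuilt MMCV wheel with pip might look like the line below. This is only a sketch: the cu118/torch2.0 part must match the CUDA and PyTorch versions of your Colab runtime, so check the MMCV installation guide for the index that fits your environment.

!pip install "mmcv>=2.0.0" -f https://download.openmmlab.com/mmcv/dist/cu118/torch2.0/index.html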

There are two ways to install mmsegmentation:

  • Option (a): If you use mmsegmentation as a dependency or third-party package, install it with pip:

    !pip install mmsegmentation
  • Option (b): If you develop and run mmseg directly, install it from source:

    !git clone https://github.com/open-mmlab/mmsegmentation.git
    %cd mmsegmentation
    !pip install -e .

Either way works; choose whichever you prefer.

In fact, Swin-Transformer-Semantic-Segmentation | GitHub is based on mmsegmentation | GitHub; the two repositories share the same code structure and runtime environment, so no additional environment adaptation is needed.
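
After installation, a quick sanity check (a minimal sketch; run it in a Colab cell) is to import the packages and print their versions:

import torch, mmcv, mmseg
print(torch.__version__, torch.cuda.is_available())
print(mmcv.__version__)
print(mmseg.__version__)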

verify the installation

mmsegmentation/…/get_started.md#verify-the-installation | GitHub

get_started.html#verify-the-installation | latest

Step 1. You need to download config and checkpoint files.

!mim download mmsegmentation --config pspnet_r50-d8_4xb2-40k_cityscapes-512x1024 --dest .

The download will take several seconds or more, depending on your network environment. When it is done, you will find two files in your current folder.

  • pspnet_r50-d8_4xb2-40k_cityscapes-512x1024.py
  • pspnet_r50-d8_512x1024_40k_cityscapes_20200605_003338-2966598c.pth

Step 2. Verify the inference demo.

Option (a). If you install mmsegmentation from source, just run the following command.

!python demo/image_demo.py demo/demo.png configs/pspnet/pspnet_r50-d8_4xb2-40k_cityscapes-512x1024.py pspnet_r50-d8_512x1024_40k_cityscapes_20200605_003338-2966598c.pth --device cuda:0 --out-file result.jpg

You will see a new image result.jpg in your current folder, where segmentation masks are overlaid on all objects.

Option (b). If you install mmsegmentation with pip, open your Python interpreter and copy & paste the following code.

from mmseg.apis import inference_model, init_model, show_result_pyplot
import mmcv

config_file = 'pspnet_r50-d8_4xb2-40k_cityscapes-512x1024.py'
checkpoint_file = 'pspnet_r50-d8_512x1024_40k_cityscapes_20200605_003338-2966598c.pth'

# build the model from a config file and a checkpoint file
model = init_model(config_file, checkpoint_file, device='cuda:0')

# test a single image and show the results
img = 'demo/005.jpg' # or img = mmcv.imread(img), which will only load it once
result = inference_model(model, img)
# visualize the results in a new window
show_result_pyplot(model, img, result, show=True)
# or save the visualization results to image files
# you can change the opacity of the painted segmentation map in (0, 1].
show_result_pyplot(model, img, result, show=True, out_file='result.jpg', opacity=0.5)

You can modify the code above to test a single image or a video; either option verifies that the installation was successful.
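
For a video, one simple approach is to run the model frame by frame. This is only a sketch: demo.mp4 and the frames/ output folder are placeholder names, and model is the one built by init_model above.

import os
import mmcv
from mmseg.apis import inference_model, show_result_pyplot

os.makedirs('frames', exist_ok=True)
video = mmcv.VideoReader('demo.mp4')  # placeholder input video
for i, frame in enumerate(video):
    result = inference_model(model, frame)
    # save each visualized frame instead of opening a window
    show_result_pyplot(model, frame, result, show=False,
                       out_file=f'frames/{i:06d}.jpg', opacity=0.5)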

Run Swin Transformer for Semantic Segmentation on local environment

Prerequisites

conda create --name openmmlab python=3.8 -y
conda activate openmmlab

On CPU platforms

conda install pytorch torchvision cpuonly -c pytorch

Installation

Step 1. Install MMCV using MIM

I had to close my Clash proxy to execute the following commands successfully, and the download takes a very long time without a proxy.
Maybe you could try appending -i https://mirrors.aliyun.com/pypi/simple/ to the install command next time.

# (openmmlab)...>
pip install openmim
mim install mmengine
mim install "mmcv>=2.0.0"

If you see ERROR: Could not build wheels for mmcv, which is required to install pyproject.toml-based projects, the solution (from onexiaophai) is to install mmcv with pip instead of mim.

Step 2. Install MMSegmentation.

Case a: If you develop and run mmseg directly, install it from source:

# using proxy of Clash to download this from github
git clone -b main https://github.com/open-mmlab/mmsegmentation.git
cd mmsegmentation

I closed the Clash proxy for the installation, and I made the download noticeably faster by appending the Aliyun mirror source to the install command.

# (openmmlab)...\mmsegmentation>
pip install -v -e . -i https://mirrors.aliyun.com/pypi/simple/
# '-v' means verbose, or more output
# '-e' means installing a project in editable mode,
# thus any local modifications made to the code will take effect without reinstallation.

I chose to install from source (case a), so I can easily check the source code of the models and methods.
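
A quick way to confirm the editable install (a minimal sketch, run inside the activated openmmlab environment) is to check where the package is imported from; for an editable install, mmseg.__file__ should point into the cloned mmsegmentation folder.

import mmseg
print(mmseg.__version__)
print(mmseg.__file__)  # should be ...\mmsegmentation\mmseg\__init__.py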

Case b: If you use mmsegmentation as a dependency or third-party package, install it with pip:

# (openmmlab)...>
pip install "mmsegmentation>=1.0.0"

Verify the installation

Step 1. We need to download config and checkpoint files.

# (openmmlab)...\mmsegmentation>
mim download mmsegmentation --config pspnet_r50-d8_4xb2-40k_cityscapes-512x1024 --dest .

Execute this command under the mmsegmentation folder; don't change your directory.

Step 2. Verify the inference demo.

Execute this command under the mmsegmentation folder.

# (openmmlab)...\mmsegmentation>
python demo/image_demo.py demo/demo.png configs/pspnet/pspnet_r50-d8_4xb2-40k_cityscapes-512x1024.py pspnet_r50-d8_512x1024_40k_cityscapes_20200605_003338-2966598c.pth --device cpu --out-file result.jpg

Note: I changed the device parameter from --device cuda:0 to --device cpu because this is a CPU-only platform.

You will see a new image result.jpg in your current folder, where segmentation masks are overlaid on all objects.

Training with mmsegmentation

Train & Test | mmsegmentation docs

run train.py in terminal

prepare dataset

Take CHASE DB1 as an example.

Download the CHASEDB1.zip file from the link given in Tutorial 2: Prepare datasets#chase-db1.

To convert the CHASE DB1 dataset to MMSegmentation format, run the following command:

# (openmmlab)...\mmsegmentation>
python tools/dataset_converters/chase_db1.py C:/Users/xiaophai/Downloads/Compressed/CHASEDB1.zip

Then, run the following command to start your training.

# (openmmlab)...\mmsegmentation>
python tools/train.py configs/unet/unet_s5-d16_deeplabv3_4xb4-40k_chase-db1-128x128.py
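
Before launching a long run, it can be useful to load the merged config and check the dataset settings. This is just a sketch using mmengine's Config with the same config path as the command above; it assumes you run it from the mmsegmentation folder.

from mmengine.config import Config
cfg = Config.fromfile('configs/unet/unet_s5-d16_deeplabv3_4xb4-40k_chase-db1-128x128.py')
print(cfg.train_dataloader)  # the dataset settings should point at data/CHASE_DB1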

debug train.py in vscode

// Folder Structure:
├── .vscode
│   └── launch.json
└── mmsegmentation
    ├── tools
    │   └── train.py
    └── data
        └── CHASE_DB1
            ├── images
            │   ├── training
            │   └── validation
            └── annotations
                ├── training
                └── validation
// launch.json
{
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Python: config_train",
            "type": "python",
            "request": "launch",
            "program": "${workspaceFolder}/mmsegmentation/tools/train.py",
            "args": "configs/unet/unet_s5-d16_deeplabv3_4xb4-40k_chase-db1-128x128.py",
            "console": "integratedTerminal",
            "cwd": "${workspaceFolder}/mmsegmentation",
            "justMyCode": true
        }
    ]
}

rewrite the mmsegmentation

mmengine | GitHub

Swin-Transformer-Semantic-Segmentation | GitHub

# (openmmlab)...\Swin-Transformer-Semantic-Segmentation-main>
python tools/train.py configs/unet/deeplabv3_unet_s5-d16_128x128_40k_chase_db1.py

Prepare Dataset

ISPRS-Potsdam

Prepare Dataset: ISPRS-Potsdam

The Potsdam dataset is for urban semantic segmentation used in the 2D Semantic Labeling Contest - Potsdam.

The dataset can be requested at the challenge homepage. You will get a file named utf-8' 'Potsdam.zip (size: 13.3 GB); unzip this file to get a folder named Potsdam which contains 10 files:

Potsdam
├── 1_DSM.rar
├── 1_DSM_normalisation.zip
├── 2_Ortho_RGB.zip <--
├── 3_Ortho_IRRG.zip
├── 4_Ortho_RGBIR.zip
├── 5_Labels_all.zip
├── 5_Labels_all_noBoundary.zip <--
├── 5_Labels_for_participants.zip
├── 5_Labels_for_participants_no_Boundary.zip
├── assess_classification_reference_implementation.tgz

Only 2_Ortho_RGB.zip and 5_Labels_all_noBoundary.zip are required:

Potsdam
├── 2_Ortho_RGB.zip <--
├── 5_Labels_all_noBoundary.zip <--

For the Potsdam dataset, run the following command to re-organize the data.

(openmmlab) ...\mmsegmentation>python tools/dataset_converters/potsdam.py "D:/Dataset/Potsdam"

And you will get a folder structure as below:

mmsegmentation
├── mmseg
├── tools
├── configs
├── data
│   ├── potsdam
│   │   ├── img_dir
│   │   │   ├── train: 3456
│   │   │   ├── val: 2016
│   │   ├── ann_dir
│   │   │   ├── train: 3456
│   │   │   ├── val: 2016

In the default setting of mmsegmentation, it will generate 3456 images for training and 2016 images for validation.

The 2_Ortho_RGB.zip file contains 38 images of size 6000x6000:

(figure: the 38 Potsdam tiles)

Each image is split into 12x12 = 144 patches of size 512x512. There are 38x144 = 5472 = 3456+2016 patches in total, of which 3456 are used for training and 2016 for validation.
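
A quick way to confirm these counts after conversion (a minimal sketch; the paths assume the default out_dir data/potsdam):

import glob
print(len(glob.glob('data/potsdam/img_dir/train/*.png')))  # expect 3456
print(len(glob.glob('data/potsdam/img_dir/val/*.png')))    # expect 2016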

The converted masks are single-channel images; the mapping between their values and the categories is as follows:

  • The original color map of the mmseg split code for potsdam

    0: [  0   0   0]
    1: [255 255 255] : impervious surfaces
    2: [255 0 0] : background
    3: [255 255 0] : car
    4: [ 0 255 0] : tree
    5: [ 0 255 255] : low vegetation
    6: [ 0 0 255] : building

    You should change the color map to:

    1: [255 255 255] : impervious surfaces
    2: [ 0 0 255] : building
    3: [ 0 255 255] : low vegetation
    4: [ 0 255 0] : tree
    5: [255 255 0] : car
    6: [255 0 0] : clutter/background
    # color_map = np.array([[0, 0, 0], [255, 255, 255], [255, 0, 0],
    #                       [255, 255, 0], [0, 255, 0], [0, 255, 255],
    #                       [0, 0, 255]])
    color_map = np.array([[0, 0, 0], [255, 255, 255], [0, 0, 255],
                          [0, 255, 255], [0, 255, 0], [255, 255, 0],
                          [255, 0, 0]])

    In practice, the labels are reduced by 1, so 0, 1, 2, 3, 4, 5 are used (instead of 1, 2, 3, 4, 5, 6).

    '''
    0: [255 255 255] : white : impervious surfaces
    1: [ 0 0 255] : blue : building
    2: [ 0 255 255] : cyan : low vegetation
    3: [ 0 255 0] : green : tree
    4: [255 255 0] : yellow : car
    5: [255 0 0] : red : clutter/background
    '''
    color_map = np.array([[255, 255, 255], [0, 0, 255], [0, 255, 255],
                          [0, 255, 0], [255, 255, 0], [255, 0, 0]])
  • About the color settings

label  rgb              name    category
0      [255, 255, 255]  white   impervious surfaces
1      [0, 0, 255]      blue    building
2      [0, 255, 255]    cyan    low vegetation
3      [0, 255, 0]      green   tree
4      [255, 255, 0]    yellow  car
5      [255, 0, 0]      red     clutter/background
0: impervious surfaces
1: building
2: low vegetation
3: tree
4: car
5: clutter/background
6: boundary

If you use 5_Labels_all.zip as your ground truth

Potsdam
├── 2_Ortho_RGB.zip <--
├── 5_Labels_all.zip <--

the mapping between mask values and categories is as follows:

1: impervious surfaces
2: building
3: low vegetation
4: tree
5: car
6: clutter/background

The file top_potsdam_6_7_label.tif contains erroneous pixel values

from PIL import Image
import numpy as np

mask_path = 'path/to/top_potsdam_6_7_label.tif'

mask = Image.open(mask_path)
mask = np.array(mask).reshape(-1,3)
values, counts = np.unique(mask, return_counts=True, axis=0)
print(values, counts)

# [[ 0 0 255]
# [ 0 255 0]
# [ 0 255 255]
# [252 255 0] <-- Error
# [255 0 0]
# [255 255 0]
# [255 255 255]]

# [ 4857912 5669942 19962121 246304 797467 6749 4459505]
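
One possible workaround (a hypothetical sketch, not what I did; in the split script below I simply exclude '6_7' and '4_12' from the train split) would be to snap the stray value [252 255 0] back to the nearest palette color [255 255 0] (car) before conversion:

import numpy as np
from PIL import Image

mask_path = 'path/to/top_potsdam_6_7_label.tif'
mask = np.array(Image.open(mask_path))
bad = np.all(mask == [252, 255, 0], axis=-1)  # boolean map of the error pixels
mask[bad] = [255, 255, 0]                     # yellow: car (assumed intent)
Image.fromarray(mask).save('path/to/top_potsdam_6_7_label_fixed.tif')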

The values of top_potsdam_4_12_label.tif are abnormal

from PIL import Image
import numpy as np

mask_path = '../datasets/original_potsdam/label/top_potsdam_4_12_label.tif'

mask = Image.open(mask_path)
mask = np.array(mask).reshape(-1,3)
values, counts = np.unique(mask, return_counts=True, axis=0)
print(len(values)) # 24850 <-- Not Six Classes

Code to show the original image and label

import os
from PIL import Image
import matplotlib.pyplot as plt

image_names = ['top_potsdam_2_10', 'top_potsdam_4_12', 'top_potsdam_6_7']
path2image = '../datasets/original_potsdam/image'
path2mask = '../datasets/original_potsdam/label'

plt.close()
nrows = 2; ncols = len(image_names)
fig, axes = plt.subplots(nrows, ncols, figsize=(5*ncols, 5*nrows))

for i in range(ncols):
    axes[0, i].set_title(image_names[i])

for i, name in enumerate(image_names):
    image_path = os.path.join(path2image, name + "_RGB.tif")
    mask_path = os.path.join(path2mask, name + "_label.tif")

    image = Image.open(image_path)
    mask = Image.open(mask_path)

    axes[0, i].imshow(image)
    axes[1, i].imshow(mask)
plt.show()

Code used to split the images

import argparse
import glob
import math
import os
import os.path as osp
import tempfile
import zipfile
from tqdm import tqdm

from PIL import Image
import numpy as np


def get_parser():
    parser = argparse.ArgumentParser(
        description='Convert potsdam dataset to mmsegmentation format')
    parser.add_argument('dataset_path', help='potsdam folder path')
    parser.add_argument('--tmp_dir', help='path of the temporary directory')
    parser.add_argument('-o', '--out_dir', help='output path')
    parser.add_argument(
        '--clip_size',
        type=int,
        help='clipped size of image after preparation',
        default=512)
    parser.add_argument(
        '--stride_size',
        type=int,
        help='stride of clipping original images',
        default=256)
    # args = parser.parse_args(arg_list)
    return parser


def clip_big_image(image_path, clip_save_dir, args, to_label=False):
    # Original image of Potsdam dataset is very large, thus pre-processing
    # of them is adopted. Given fixed clip size and stride size to generate
    # clipped image, the intersection of width and height is determined.
    # For example, given one 5120 x 5120 original image, the clip size is
    # 512 and stride size is 256, thus it would generate 20x20 = 400 images
    # whose size are all 512x512.
    # image = PIL.Image.open(image_path)
    image = Image.open(image_path)
    image = np.array(image)

    h, w, c = image.shape
    clip_size = args.clip_size
    stride_size = args.stride_size

    num_rows = math.ceil((h - clip_size) / stride_size) if math.ceil(
        (h - clip_size) /
        stride_size) * stride_size + clip_size >= h else math.ceil(
            (h - clip_size) / stride_size) + 1
    num_cols = math.ceil((w - clip_size) / stride_size) if math.ceil(
        (w - clip_size) /
        stride_size) * stride_size + clip_size >= w else math.ceil(
            (w - clip_size) / stride_size) + 1

    x, y = np.meshgrid(np.arange(num_cols + 1), np.arange(num_rows + 1))
    xmin = x * clip_size
    ymin = y * clip_size

    xmin = xmin.ravel()
    ymin = ymin.ravel()
    xmin_offset = np.where(xmin + clip_size > w, w - xmin - clip_size,
                           np.zeros_like(xmin))
    ymin_offset = np.where(ymin + clip_size > h, h - ymin - clip_size,
                           np.zeros_like(ymin))
    boxes = np.stack([
        xmin + xmin_offset, ymin + ymin_offset,
        np.minimum(xmin + clip_size, w),
        np.minimum(ymin + clip_size, h)
    ], axis=1)

    if to_label:
        # color_map = np.array([[0, 0, 0], [255, 255, 255], [255, 0, 0],
        #                       [255, 255, 0], [0, 255, 0], [0, 255, 255],
        #                       [0, 0, 255]])
        color_map = np.array([[0, 0, 0], [255, 255, 255], [0, 0, 255],
                              [0, 255, 255], [0, 255, 0], [255, 255, 0],
                              [255, 0, 0]])
        flatten_v = np.matmul(
            image.reshape(-1, c),
            np.array([2, 3, 4]).reshape(3, 1))
        out = np.zeros_like(flatten_v)
        for idx, class_color in enumerate(color_map):
            value_idx = np.matmul(class_color,
                                  np.array([2, 3, 4]).reshape(3, 1))
            out[flatten_v == value_idx] = idx
        image = out.reshape(h, w)

    for box in boxes:
        start_x, start_y, end_x, end_y = box
        clipped_image = image[start_y:end_y,
                              start_x:end_x] if to_label else image[
                                  start_y:end_y, start_x:end_x, :]
        idx_i, idx_j = osp.basename(image_path).split('_')[2:4]

        # it takes too much time to save clipped images with mmcv.imwrite.
        # mmcv.imwrite(
        #     clipped_image.astype(np.uint8),
        #     osp.join(
        #         clip_save_dir,
        #         f'{idx_i}_{idx_j}_{start_x}_{start_y}_{end_x}_{end_y}.png'))

        clipped_image = Image.fromarray(clipped_image.astype(np.uint8))
        clipped_image.save(
            fp=osp.join(
                clip_save_dir,
                f'{idx_i}_{idx_j}_{start_x}_{start_y}_{end_x}_{end_y}.png'),
            format='PNG',
            compress_level=1)
        # e.g. clip_save_dir == 'data\\potsdam\\img_dir\\train'


def main():
    parser = get_parser()
    args = parser.parse_args(["D:/Dataset/Potsdam"])
    splits = {
        'train': [
            '2_11', '2_12', '3_10', '3_11', '3_12', '4_10', '4_11',
            '5_10', '5_11', '5_12', '6_8', '6_9', '6_10',  # '4_12', '6_7',
            '6_11', '6_12', '7_7', '7_8', '7_9', '7_10', '7_11', '7_12'
        ],
        'val': [
            '2_10'
        ],
        'test': [
            '2_13', '2_14', '3_13', '3_14', '4_13', '4_14', '4_15', '5_13',
            '5_14', '6_13', '6_14', '6_15', '7_13'
        ]
    }

    dataset_path = args.dataset_path
    if args.out_dir is None:
        out_dir = osp.join('data', 'potsdam')  # 'data\\potsdam'
    else:
        out_dir = args.out_dir

    print('Making directories...')
    if not osp.exists(osp.join(out_dir, 'img_dir', 'train')):
        os.makedirs(osp.join(out_dir, 'img_dir', 'train'))
    if not osp.exists(osp.join(out_dir, 'img_dir', 'val')):
        os.makedirs(osp.join(out_dir, 'img_dir', 'val'))
    if not osp.exists(osp.join(out_dir, 'img_dir', 'test')):
        os.makedirs(osp.join(out_dir, 'img_dir', 'test'))

    if not osp.exists(osp.join(out_dir, 'ann_dir', 'train')):
        os.makedirs(osp.join(out_dir, 'ann_dir', 'train'))
    if not osp.exists(osp.join(out_dir, 'ann_dir', 'val')):
        os.makedirs(osp.join(out_dir, 'ann_dir', 'val'))
    if not osp.exists(osp.join(out_dir, 'ann_dir', 'test')):
        os.makedirs(osp.join(out_dir, 'ann_dir', 'test'))

    zipp_list = glob.glob(os.path.join(dataset_path, '*.zip'))
    print('Find the data', zipp_list)
    # ['D:/Dataset/Potsdam\\2_Ortho_RGB.zip',
    #  'D:/Dataset/Potsdam\\5_Labels_all_noBoundary.zip']

    for zipp in zipp_list:
        # tmp_dir changes in every loop
        with tempfile.TemporaryDirectory(dir=args.tmp_dir) as tmp_dir:
            zip_file = zipfile.ZipFile(zipp)
            zip_file.extractall(tmp_dir)
            # Check whether the *.tif files are unzipped to the current
            # directory or to a sub-directory
            src_path_list = glob.glob(os.path.join(tmp_dir, '*.tif'))
            # if len(src_path_list) == 0, the *.tif files were extracted to
            # a sub-directory rather than the current directory
            if not len(src_path_list):
                sub_tmp_dir = os.path.join(tmp_dir, os.listdir(tmp_dir)[0])
                src_path_list = glob.glob(os.path.join(sub_tmp_dir, '*.tif'))

            prog_bar = tqdm(src_path_list)
            for src_path in prog_bar:
                # e.g. 'top_potsdam_2_10_RGB.tif'.split('_')[2:4]
                idx_i, idx_j = osp.basename(src_path).split('_')[2:4]
                # data_type = 'train' if f'{idx_i}_{idx_j}' in splits[
                #     'train'] else 'val'
                if f'{idx_i}_{idx_j}' in splits['train']:
                    data_type = 'train'
                elif f'{idx_i}_{idx_j}' in splits['val']:
                    data_type = 'val'
                else:
                    data_type = 'test'

                if 'label' in src_path:
                    dst_dir = osp.join(out_dir, 'ann_dir', data_type)
                    clip_big_image(src_path, dst_dir, args, to_label=True)
                else:
                    # e.g. 'data\\potsdam\\img_dir\\train'
                    dst_dir = osp.join(out_dir, 'img_dir', data_type)
                    clip_big_image(src_path, dst_dir, args, to_label=False)

    print('Removing the temporary files...')
    print('Done!')


if __name__ == '__main__':
    main()

Synapse

Synapse dataset

Multi-Atlas Labeling Beyond the Cranial Vault - Workshop and Challenge

Note: You need to join the challenge first; only then will you see Abdomen and Cervix in the Files directory, which are private and invisible to non-participants.
Downloading RawData.zip (1.531 GB) is enough.