ISPRS-Vaihingen

The Vaihingen dataset is an urban semantic segmentation dataset used in the 2D Semantic Labeling Contest - Vaihingen.

The dataset can be requested at the challenge homepage. Download the package Vaihingen.zip (14.9 GB) and unzip it to get a folder named Vaihingen, which contains the following 14 entries:

Vaihingen
├── 3DLabeling
├── ALS
├── DSM
├── Images
├── Ortho
├── Reference_3d_reconstruction
├── Reference_object_detection
├── dsm_09cm_matching_area11.tif
├── ISPRS_semantic_labeling_Vaihingen_ground_truth_eroded_for_participants.zip
├── ISPRS_semantic_labeling_Vaihingen.zip # <-- original images
├── ISPRS_semantic_labeling_Vaihingen_ground_truth_COMPLETE.zip # <-- ground truth without black boundary lines
├── ISPRS_semantic_labeling_Vaihingen_ground_truth_eroded_COMPLETE.zip # <-- ground truth with black boundary lines
├── Overview_Vaihingen_DMC.pdf
└── Vaihingen_dsm_tiles_geoinfo.zip

dataset_prepare.html#isprs-vaihingen | mmsegmentation docs

Of these, only two files are needed: ISPRS_semantic_labeling_Vaihingen.zip and ISPRS_semantic_labeling_Vaihingen_ground_truth_COMPLETE.zip.

vaihingen
├── ISPRS_semantic_labeling_Vaihingen.zip # <-- image
└── ISPRS_semantic_labeling_Vaihingen_ground_truth_COMPLETE.zip # <-- label
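
Before running any conversion it can help to confirm that both archives are in place. The following is a minimal sketch that assumes the folder layout shown above; the helper name `missing_archives` is hypothetical and not part of mmsegmentation.

```python
from pathlib import Path

# The two archives named above: the images and the boundary-free labels.
REQUIRED_ARCHIVES = (
    "ISPRS_semantic_labeling_Vaihingen.zip",
    "ISPRS_semantic_labeling_Vaihingen_ground_truth_COMPLETE.zip",
)


def missing_archives(folder):
    """Return the required archive names that are not present in `folder`."""
    folder = Path(folder)
    return [name for name in REQUIRED_ARCHIVES if not (folder / name).is_file()]
```

An empty list means the folder is ready; otherwise the list names what still has to be copied in.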

ISPRS_semantic_labeling_Vaihingen.zip contains the original IRRG images:

ISPRS_semantic_labeling_Vaihingen.zip
├── dsm
├── gts_for_participants
└── top # <-- IRRG image
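
The image names can be inspected without unpacking the whole archive. This is a minimal sketch using the standard `zipfile` module; it assumes the members are stored under a `top/` prefix at the archive root, as the tree above suggests, and `list_top_images` is an illustrative name.

```python
import zipfile


def list_top_images(zip_path):
    """Return the names of the .tif members under top/ in the image archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return sorted(
            name for name in archive.namelist()
            if name.startswith("top/") and name.lower().endswith(".tif")
        )
```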

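There are two kinds of labels: ground truth without the eroded black boundary, and ground truth with it. Choose whichever suits your needs; this post uses the ground truth without the black boundary, i.e. ISPRS_semantic_labeling_Vaihingen_ground_truth_COMPLETE.zip.

The images in the dataset are in tif format, which the image viewers bundled with Windows may fail to open. A tif extension for vscode, such as TIFF Preview, can display the original tif images.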

Correspondence between colors and categories

$$
\begin{array}{cclcl}
0 & \colorbox{white}{$\quad$} & [255, 255, 255] & \text{white} & \text{impervious surfaces}\\
1 & \colorbox{blue}{$\quad$} & [0, 0, 255] & \text{blue} & \text{building}\\
2 & \colorbox{cyan}{$\quad$} & [0, 255, 255] & \text{cyan} & \text{low vegetation}\\
3 & \colorbox{green}{$\quad$} & [0, 255, 0] & \text{green} & \text{tree}\\
4 & \colorbox{yellow}{$\quad$} & [255, 255, 0] & \text{yellow} & \text{car}\\
5 & \colorbox{red}{$\quad$} & [255, 0, 0] & \text{red} & \text{clutter/background}\\
\end{array}
$$

'''
0: [255 255 255] : white : impervious surface
1: [ 0 0 255] : blue : building
2: [ 0 255 255] : cyan : low vegetation
3: [ 0 255 0] : green : tree
4: [255 255 0] : yellow : car
5: [255 0 0] : red : clutter/background
'''
color_map = np.array([[255, 255, 255], [ 0, 0, 255], [ 0, 255, 255],
[ 0, 255, 0], [255, 255, 0], [255, 0, 0]])

Note that if ISPRS_semantic_labeling_Vaihingen_ground_truth_eroded_COMPLETE.zip is used for the labels, it includes boundary annotations, so the corresponding color_map is different:

'''
0: [ 0 0 0] : black : boundary
1: [255 255 255] : white : impervious surface
2: [ 0 0 255] : blue : building
3: [ 0 255 255] : cyan : low vegetation
4: [ 0 255 0] : green : tree
5: [255 255 0] : yellow : car
6: [255 0 0] : red : clutter/background
'''
color_map = np.array([[0, 0, 0], [255, 255, 255], [0, 0, 255],
[0, 255, 255], [0, 255, 0], [255, 255, 0],
[255, 0, 0]])
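
The practical difference between the two label variants is that the boundary color takes index 0 and shifts every class index up by one. A minimal sketch in plain Python that builds the color-to-index lookup for either variant; `build_color_to_index` is an illustrative helper, not part of any library.

```python
# Class colors in RGB order, copied from the two listings above.
RGB_COLORS = [
    (255, 255, 255),  # impervious surface
    (0, 0, 255),      # building
    (0, 255, 255),    # low vegetation
    (0, 255, 0),      # tree
    (255, 255, 0),    # car
    (255, 0, 0),      # clutter/background
]
BOUNDARY = (0, 0, 0)  # black, present only in the eroded labels


def build_color_to_index(with_boundary):
    """Map each RGB color to its class index for the chosen label variant."""
    colors = ([BOUNDARY] if with_boundary else []) + RGB_COLORS
    return {color: idx for idx, color in enumerate(colors)}


plain = build_color_to_index(with_boundary=False)
eroded = build_color_to_index(with_boundary=True)
print(plain[(0, 0, 255)], eroded[(0, 0, 255)])  # building: 1 2
```

Whichever variant is chosen, the number of classes configured for training has to match the lookup that produced the masks.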

Configuration

Use the code below to crop the original images into 512×512-pixel patches.

The top folder of ISPRS_semantic_labeling_Vaihingen.zip contains 33 images of different sizes:

[Figure: overview_vaihingen]

In the default configuration, the 33 images are split into 15 for training, 1 for validation and 17 for testing:

splits = {
'train': [
'area1', 'area3', 'area5', 'area7', 'area11', 'area13', 'area15', 'area17',
'area21', 'area23', 'area26', 'area28', 'area32', 'area34', 'area37'
], # 15 for training
'val': [
'area30'
], # 1 for validation
'test': [
'area2', 'area4', 'area6', 'area8', 'area10', 'area12', 'area14', 'area16', 'area20',
'area22', 'area24', 'area27', 'area29', 'area31', 'area33', 'area35', 'area38'
], # 17 for testing
}
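
For routing files it is handy to invert this table into an area-to-split lookup. The sketch below does that in plain Python and checks the 15/1/17 counts. The file name pattern `top_mosaic_09cm_area30.tif` is an assumption used for illustration (the area id is taken from the fourth underscore-separated field), and `area_of` is a hypothetical helper.

```python
import os.path as osp

splits = {
    'train': ['area1', 'area3', 'area5', 'area7', 'area11', 'area13', 'area15',
              'area17', 'area21', 'area23', 'area26', 'area28', 'area32',
              'area34', 'area37'],
    'val': ['area30'],
    'test': ['area2', 'area4', 'area6', 'area8', 'area10', 'area12', 'area14',
             'area16', 'area20', 'area22', 'area24', 'area27', 'area29',
             'area31', 'area33', 'area35', 'area38'],
}

# Invert the table: one entry per area, pointing at its split.
area_to_split = {area: split for split, areas in splits.items() for area in areas}


def area_of(filename):
    """Extract the area id from a tile name (assumed pattern, see above)."""
    stem = osp.splitext(osp.basename(filename))[0]
    return stem.split('_')[3]


# Every area appears in exactly one split, 33 areas in total.
assert len(area_to_split) == sum(len(areas) for areas in splits.values()) == 33
print(area_to_split[area_of('top_mosaic_09cm_area30.tif')])  # val
```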

File structure

$ tree datasets -L 3
datasets
├── potsdam_512x512
│   ├── ann_dir
│   │   ├── test
│   │   ├── train
│   │   └── val
│   └── img_dir
│       ├── test
│       ├── train
│       └── val
└── vaihingen_512x512
    ├── ann_dir
    │   ├── test
    │   ├── train
    │   └── val
    └── img_dir
        ├── test
        ├── train
        └── val
$ ls -l path/to/vaihingen_512x512/img_dir/train/ | grep "^-" | wc -l
320
$ ls -l path/to/vaihingen_512x512/img_dir/test/ | grep "^-" | wc -l
398
$ ls -l path/to/vaihingen_512x512/img_dir/val/ | grep "^-" | wc -l
24
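
The same count can be taken from Python, which is convenient inside a training script. This is a minimal sketch with `pathlib`; `count_files` is an illustrative name. It counts entries directly inside the folder and, unlike `grep "^-"`, also counts symbolic links that point to files.

```python
from pathlib import Path


def count_files(directory):
    """Count the files directly inside `directory` (subfolders are not entered)."""
    return sum(1 for entry in Path(directory).iterdir() if entry.is_file())
```

Calling it on img_dir/train and ann_dir/train should give the same number, since every image patch has one label patch.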

Split Code

Prepare Dataset: ISPRS-Vaihingen | mmsegmentation doc

tools/dataset_converters/vaihingen.py | mmsegmentation github

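Note that the code below uses cv2 to read and save the cropped label images, so its color map is in BGR order. cv2.imread returns the channels of an RGB image file in BGR order, so the image held in memory is BGR. cv2.imwrite assumes the in-memory image is BGR and reverses the channels again when writing, so the file on disk ends up in RGB order.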
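
Since BGR is simply RGB with the channel order reversed, a BGR color map can be derived from the RGB one instead of being typed by hand. A minimal sketch in plain Python; `to_bgr` and the constant names are illustrative.

```python
# RGB class colors, in class-index order.
RGB_COLOR_MAP = [
    [255, 255, 255],  # 0: impervious surfaces
    [0, 0, 255],      # 1: building
    [0, 255, 255],    # 2: low vegetation
    [0, 255, 0],      # 3: tree
    [255, 255, 0],    # 4: car
    [255, 0, 0],      # 5: clutter/background
]


def to_bgr(colors):
    """Reverse the channel order of every color; class order is unchanged."""
    return [color[::-1] for color in colors]


BGR_COLOR_MAP = to_bgr(RGB_COLOR_MAP)
print(BGR_COLOR_MAP[1])  # building in BGR: [255, 0, 0]
```

Reversing twice gives back the original map, so the same helper converts in both directions.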

'''
R G B | B G R
0: [255 255 255] | [255 255 255] : impervious surfaces
1: [ 0 0 255] | [255 0 0] : building
2: [ 0 255 255] | [255 255 0] : low vegetation
3: [ 0 255 0] | [ 0 255 0] : tree
4: [255 255 0] | [ 0 255 255] : car
5: [255 0 0] | [ 0 0 255] : clutter/background
'''
color_map = np.array([[255, 255, 255], [255, 0, 0], [255, 255, 0],
[ 0, 255, 0], [ 0, 255, 255], [ 0, 0, 255]])
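
The conversion script in this section matches label pixels against this color map by collapsing each BGR triple into one scalar, 2·B + 3·G + 4·R, and comparing scalars. The sketch below, written in plain Python rather than NumPy, checks that this encoding gives a distinct value to each of the six class colors. `encode` and `color_to_class` are illustrative names. Like the script, it falls back to class 0 for a color that matches nothing, so stray pixels would end up labeled as impervious surface.

```python
BGR_COLOR_MAP = [
    (255, 255, 255),  # 0: impervious surfaces
    (255, 0, 0),      # 1: building
    (255, 255, 0),    # 2: low vegetation
    (0, 255, 0),      # 3: tree
    (0, 255, 255),    # 4: car
    (0, 0, 255),      # 5: clutter/background
]
WEIGHTS = (2, 3, 4)  # the weights used by the conversion script


def encode(bgr):
    """Collapse a BGR triple into a single integer."""
    return sum(weight * value for weight, value in zip(WEIGHTS, bgr))


# One scalar code per class; the six codes must not collide.
CODE_TO_CLASS = {encode(color): idx for idx, color in enumerate(BGR_COLOR_MAP)}
if len(CODE_TO_CLASS) != len(BGR_COLOR_MAP):
    raise ValueError("two class colors share the same code")


def color_to_class(bgr):
    """Class index of a BGR color; unmatched colors fall back to class 0."""
    return CODE_TO_CLASS.get(encode(bgr), 0)
```

The encoding is only guaranteed to separate the listed class colors; arbitrary colors can share a code, which is harmless as long as the label images contain nothing but class colors.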
# ref: https://github.com/open-mmlab/mmsegmentation/blob/main/tools/dataset_converters/vaihingen.py
import argparse
import glob
import math
import os
import os.path as osp
import tempfile
import zipfile

import cv2
import numpy as np
from tqdm import tqdm


def get_parser():
    parser = argparse.ArgumentParser(
        description='Convert vaihingen dataset to mmsegmentation format')
    parser.add_argument('dataset_path', help='vaihingen folder path')
    parser.add_argument('--tmp_dir', help='path of the temporary directory')
    parser.add_argument('-o', '--out_dir', help='output path')
    parser.add_argument(
        '--clip_size',
        type=int,
        help='clipped size of image after preparation',
        default=512)
    parser.add_argument(
        '--stride_size',
        type=int,
        help='stride of clipping original images',
        default=256)
    return parser

def clip_big_image(image_path, clip_save_dir, to_label=False):
    # The original Vaihingen images are very large, so they are cut into
    # fixed-size patches. Windows of clip_size are placed every stride_size
    # pixels; a window that would cross the right or bottom edge is shifted
    # back so that it ends exactly on the edge.
    # For example, a 5120 x 5120 image with clip size 512 and stride 256
    # yields 19 x 19 = 361 patches of 512 x 512.
    # Note: clip_size and stride_size are read from the module-level `args`
    # created in the __main__ block.

    image = cv2.imread(image_path)  # cv2 returns the channels in BGR order

    h, w, c = image.shape
    cs = args.clip_size
    ss = args.stride_size

    rows = math.ceil((h - cs) / ss)
    num_rows = rows if rows * ss + cs >= h else rows + 1
    cols = math.ceil((w - cs) / ss)
    num_cols = cols if cols * ss + cs >= w else cols + 1

    x, y = np.meshgrid(np.arange(num_cols + 1), np.arange(num_rows + 1))
    # Step by the stride (the reference script multiplies by the clip size,
    # which gives the same result only when stride_size == clip_size).
    xmin = x * ss
    ymin = y * ss

    xmin = xmin.ravel()
    ymin = ymin.ravel()
    xmin_offset = np.where(xmin + cs > w, w - xmin - cs, np.zeros_like(xmin))
    ymin_offset = np.where(ymin + cs > h, h - ymin - cs, np.zeros_like(ymin))
    boxes = np.stack([
        xmin + xmin_offset, ymin + ymin_offset,
        np.minimum(xmin + cs, w),
        np.minimum(ymin + cs, h)
    ], axis=1)

    if to_label:
        '''Class colors in RGB order and in BGR order
                R   G   B  |   B   G   R
        0: [255 255 255] | [255 255 255] : impervious surfaces
        1: [  0   0 255] | [255   0   0] : building
        2: [  0 255 255] | [255 255   0] : low vegetation
        3: [  0 255   0] | [  0 255   0] : tree
        4: [255 255   0] | [  0 255 255] : car
        5: [255   0   0] | [  0   0 255] : clutter/background
        '''
        # The color map here is in BGR order rather than RGB, because
        # cv2.imread() returns pixels in BGR order.
        color_map = np.array([[255, 255, 255], [255, 0, 0], [255, 255, 0],
                              [0, 255, 0], [0, 255, 255], [0, 0, 255]])
        # Collapse each BGR triple into one scalar (2*B + 3*G + 4*R) so a
        # pixel can be matched against a class color with one comparison.
        weights = np.array([2, 3, 4]).reshape(3, 1)
        flatten_v = np.matmul(image.reshape(-1, c), weights)
        out = np.zeros_like(flatten_v)  # unmatched pixels stay at class 0
        for idx, class_color in enumerate(color_map):
            value_idx = np.matmul(class_color, weights)
            out[flatten_v == value_idx] = idx
        image = out.reshape(h, w)

    # The area id is the fourth underscore-separated field of the file name.
    area_idx = osp.splitext(osp.basename(image_path))[0].split('_')[3]
    for box in boxes:
        start_x, start_y, end_x, end_y = box
        # The same slice works for 2-D label masks and 3-D images.
        clipped_image = image[start_y:end_y, start_x:end_x]
        cv2.imwrite(
            osp.join(clip_save_dir,
                     f'{area_idx}_{start_x}_{start_y}_{end_x}_{end_y}.png'),
            clipped_image.astype(np.uint8))

def main(args):
    splits = {
        'train': [
            'area1', 'area3', 'area5', 'area7', 'area11', 'area13', 'area15', 'area17',
            'area21', 'area23', 'area26', 'area28', 'area32', 'area34', 'area37'
        ],  # 15
        'val': [
            'area30'
        ],  # 1
        'test': [
            'area2', 'area4', 'area6', 'area8', 'area10', 'area12', 'area14', 'area16', 'area20',
            'area22', 'area24', 'area27', 'area29', 'area31', 'area33', 'area35', 'area38'
        ],  # 17
    }

    dataset_path = args.dataset_path
    if args.out_dir is None:
        out_dir = osp.join('data', 'vaihingen')
    else:
        out_dir = args.out_dir

    print('Making directories...')
    for sub_dir in ('img_dir', 'ann_dir'):
        for data_type in ('train', 'val', 'test'):
            os.makedirs(osp.join(out_dir, sub_dir, data_type), exist_ok=True)

    zipp_list = glob.glob(osp.join(dataset_path, '*.zip'))
    print('Find the data', zipp_list)

    for zipp in zipp_list:
        # Only the image archive and the boundary-free label archive are
        # used; any other archive is skipped before it is extracted.
        if 'ISPRS_semantic_labeling_Vaihingen.zip' in zipp:
            mode, to_label = 'img_dir', False
        elif 'ISPRS_semantic_labeling_Vaihingen_ground_truth_COMPLETE.zip' in zipp:
            mode, to_label = 'ann_dir', True
        else:
            continue

        with tempfile.TemporaryDirectory(dir=args.tmp_dir) as tmp_dir:
            print('Unzipping to the temporary folder...')
            with zipfile.ZipFile(zipp) as zip_file:
                zip_file.extractall(tmp_dir)

            # Images sit in the 'top' folder; labels sit at the archive root.
            src_dir = tmp_dir if to_label else osp.join(tmp_dir, 'top')
            src_path_list = glob.glob(osp.join(src_dir, '*.tif'))

            for src_path in tqdm(src_path_list, desc=mode):
                area_idx = osp.splitext(osp.basename(src_path))[0].split('_')[3]
                if area_idx in splits['train']:
                    data_type = 'train'
                elif area_idx in splits['val']:
                    data_type = 'val'
                elif area_idx in splits['test']:
                    data_type = 'test'
                else:
                    continue

                dst_dir = osp.join(out_dir, mode, data_type)
                clip_big_image(src_path, dst_dir, to_label=to_label)

    print('Removing the temporary files...')
    print('Done!')

if __name__ == '__main__':
    # path/to/vaihingen
    # ├── ISPRS_semantic_labeling_Vaihingen.zip # <-- image
    # └── ISPRS_semantic_labeling_Vaihingen_ground_truth_COMPLETE.zip # <-- label
    parser = get_parser()
    args = parser.parse_args(["/15T-2/zwx/datasets_package/vaihingen/",  # path to *.zip
                              "--out_dir", "/15T-2/zwx/datasets/vaihingen_512x512",
                              "--tmp_dir", "/15T-2/zwx/temp",
                              "--clip_size", "512", "--stride_size", "512"])
    main(args)
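
The window placement inside clip_big_image can be isolated in a few lines, which makes it easy to check how many patches a given image size produces. This is a minimal sketch in plain Python, assuming windows advance by the stride and the last window on each axis is shifted back to end on the image edge; `window_starts` and `window_boxes` are illustrative names, not functions from mmsegmentation.

```python
import math


def window_starts(length, clip_size, stride_size):
    """Start offsets of the windows along one axis of `length` pixels."""
    if length <= clip_size:
        return [0]
    steps = math.ceil((length - clip_size) / stride_size)
    # Clamp the last start so the final window ends exactly on the edge.
    return [min(i * stride_size, length - clip_size) for i in range(steps + 1)]


def window_boxes(height, width, clip_size=512, stride_size=512):
    """(start_x, start_y, end_x, end_y) boxes, listed row by row."""
    return [
        (x, y, min(x + clip_size, width), min(y + clip_size, height))
        for y in window_starts(height, clip_size, stride_size)
        for x in window_starts(width, clip_size, stride_size)
    ]


print(len(window_boxes(5120, 5120, 512, 256)))  # 361
print(window_starts(1000, 512, 512))            # [0, 488]
```

With clip size and stride both 512, an image yields ceil(h / 512) × ceil(w / 512) patches, and the last row and column overlap their neighbors whenever the image size is not a multiple of 512.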

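The color_map in mmsegmentation's potsdam.py

See the link potsdam.py | github for the source; the code block below is an excerpt of the color_map part.

The color_map in the mmsegmentation source is in BGR order, the reverse of the usual RGB order, so take care.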

# mmsegmentation's color_map, in BGR order
color_map = np.array([[0, 0, 0], [255, 255, 255], [255, 0, 0],
                      [255, 255, 0], [0, 255, 0], [0, 255, 255],
                      [0, 0, 255]])

# the usual color order is RGB
color_map = np.array([[0, 0, 0], [255, 255, 255], [0, 0, 255],
                      [0, 255, 255], [0, 255, 0], [255, 255, 0],
                      [255, 0, 0]])

mmsegmentation presumably uses BGR order because its imread and imwrite call cv2's imread and imwrite under the hood, or imitate their design. For details, see the blog post cv2如何处理RGB和BGR (how cv2 handles RGB and BGR) | 文羊羽

  • mmsegmentation's color map (BGR)

    0: [  0   0   0] : boundary
    1: [255 255 255] : impervious surfaces
    2: [255   0   0] : building
    3: [255 255   0] : low vegetation
    4: [  0 255   0] : tree
    5: [  0 255 255] : car
    6: [  0   0 255] : clutter/background
  • The usual color map (RGB)

    0: [  0   0   0] : boundary
    1: [255 255 255] : impervious surfaces
    2: [ 0 0 255] : building
    3: [ 0 255 255] : low vegetation
    4: [ 0 255 0] : tree
    5: [255 255 0] : car
    6: [255 0 0] : clutter/background

ann2rgb

from typing import Union

import numpy
import torch

color_map = {
    0: [255, 255, 255],  # white
    1: [0, 0, 255],      # blue
    2: [0, 255, 255],    # cyan
    3: [0, 255, 0],      # green
    4: [255, 255, 0],    # yellow
    5: [255, 0, 0],      # red
}


def ann2rgb(annimg: Union[numpy.ndarray, torch.Tensor]) -> numpy.ndarray:
    """Convert an [H, W] annotation mask to an [H, W, 3] RGB image."""
    # Accept torch tensors as well by moving them to a NumPy array first.
    if isinstance(annimg, torch.Tensor):
        annimg = annimg.detach().cpu().numpy()
    h, w = annimg.shape
    rgbimg = numpy.zeros(shape=(h, w, 3), dtype=numpy.uint8)
    for idx, rgb in color_map.items():
        rgbimg[annimg == idx] = rgb
    return rgbimg


if __name__ == "__main__":
    # annimg = torch.randint(low=0, high=6, size=(224, 224))
    annimg = numpy.random.randint(low=0, high=6, size=(224, 224))
    rgbimg = ann2rgb(annimg)
    print(rgbimg.shape)
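
The opposite direction, from an RGB label image back to class indices, can be sketched with NumPy broadcasting. This is a minimal sketch and not part of the original post: `rgb2ann` is a hypothetical counterpart to ann2rgb, and marking unmatched pixels with 255 is an assumption chosen so that they are easy to spot.

```python
import numpy as np

# RGB class colors in class-index order (same values as color_map above).
COLOR_MAP = np.array([
    [255, 255, 255],  # 0: white
    [0, 0, 255],      # 1: blue
    [0, 255, 255],    # 2: cyan
    [0, 255, 0],      # 3: green
    [255, 255, 0],    # 4: yellow
    [255, 0, 0],      # 5: red
], dtype=np.uint8)


def rgb2ann(rgbimg):
    """Convert an [H, W, 3] RGB label image to an [H, W] class-index mask."""
    # Compare every pixel with every class color: result shape [H, W, 6].
    matches = (rgbimg[:, :, None, :] == COLOR_MAP[None, None, :, :]).all(axis=-1)
    ann = matches.argmax(axis=-1).astype(np.uint8)
    # Pixels that match no class color are marked with 255 (assumed value).
    ann[~matches.any(axis=-1)] = 255
    return ann
```

Converting a mask to RGB and back should return the original mask, which makes a quick consistency check for both functions.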