Street-scene semantic segmentation with TensorFlow 2.3
The Cityscapes benchmark, an urban scene dataset released in 2015 with backing from Mercedes-Benz, is widely regarded as one of the most authoritative and professional image segmentation datasets in computer vision. It provides semantic-level, instance-level, and dense pixel annotations for 30 classes grouped into 8 categories (flat surfaces, humans, vehicles, constructions, objects, nature, sky, and void). Cityscapes contains 5000 finely annotated images of driving scenes in urban environments (2975 train, 500 val, 1525 test). It offers dense pixel annotations for 19 classes (97% coverage), 8 of which also have instance-level segmentation. The data was collected over several months across 50 cities, covering different times of day, in good weather. It was originally recorded as video, so frames were manually selected according to these criteria: a large number of dynamic objects, varying scene layouts, and varying backgrounds.
Code
Import the packages
import os  # needed below to pin the visible GPU
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
import glob
Configure GPU memory allocation and check the TensorFlow version
os.environ["CUDA_VISIBLE_DEVICES"] = "1"
tf.__version__
'2.3.0'
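The snippet above only pins the visible GPU; the adaptive memory allocation mentioned in the heading is usually done separately. A minimal sketch of the common TF 2.x approach (assumes at least one GPU is visible):

```python
import tensorflow as tf

# Let TensorFlow grow GPU memory on demand instead of reserving it all upfront
gpus = tf.config.experimental.list_physical_devices('GPU')
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)
```

This must be called before any GPU tensors are created, otherwise TensorFlow raises an error.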
Load the data: image paths
img = glob.glob('./dataset/cityscapes/leftImg8bit/train/*/*.png')
print(len(img))
img[:5]
2975
['./dataset/cityscapes/leftImg8bit/train/dusseldorf/dusseldorf_000128_000019_leftImg8bit.png',
'./dataset/cityscapes/leftImg8bit/train/dusseldorf/dusseldorf_000113_000019_leftImg8bit.png',
'./dataset/cityscapes/leftImg8bit/train/dusseldorf/dusseldorf_000014_000019_leftImg8bit.png',
'./dataset/cityscapes/leftImg8bit/train/dusseldorf/dusseldorf_000207_000019_leftImg8bit.png',
'./dataset/cityscapes/leftImg8bit/train/dusseldorf/dusseldorf_000216_000019_leftImg8bit.png']
Label paths
label = glob.glob('./dataset/cityscapes/gtFine/train/*/*_gtFine_labelIds.png')
print(len(label))
label[:5]
2975
['./dataset/cityscapes/gtFine/train/dusseldorf/dusseldorf_000015_000019_gtFine_labelIds.png',
'./dataset/cityscapes/gtFine/train/dusseldorf/dusseldorf_000213_000019_gtFine_labelIds.png',
'./dataset/cityscapes/gtFine/train/dusseldorf/dusseldorf_000164_000019_gtFine_labelIds.png',
'./dataset/cityscapes/gtFine/train/dusseldorf/dusseldorf_000050_000019_gtFine_labelIds.png',
'./dataset/cityscapes/gtFine/train/dusseldorf/dusseldorf_000072_000019_gtFine_labelIds.png']
To keep images and labels in one-to-one correspondence, sort both lists by filename.
img.sort(key=lambda x: x.split('/')[-1].split('.png')[0])
label.sort(key=lambda x: x.split('/')[-1].split('.png')[0])
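Sorting works because each image and its label share the same `city_sequence_frame` prefix; only the suffix differs. A quick sanity check on hypothetical paths mimicking the Cityscapes naming scheme:

```python
# Hypothetical paths in the Cityscapes naming scheme (not real files)
img = ['./train/a/a_000014_000019_leftImg8bit.png',
       './train/a/a_000001_000019_leftImg8bit.png']
label = ['./train/a/a_000001_000019_gtFine_labelIds.png',
         './train/a/a_000014_000019_gtFine_labelIds.png']

# Same sort key as in the post: filename without the .png extension
img.sort(key=lambda x: x.split('/')[-1].split('.png')[0])
label.sort(key=lambda x: x.split('/')[-1].split('.png')[0])

# After sorting, the frame identifiers line up pairwise
for i, l in zip(img, label):
    stem_i = i.split('/')[-1].replace('_leftImg8bit.png', '')
    stem_l = l.split('/')[-1].replace('_gtFine_labelIds.png', '')
    assert stem_i == stem_l
```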
Create a shuffled index
index = np.random.permutation(len(img))
Apply the same index to both arrays, then inspect them
img = np.array(img)[index]
label = np.array(label)[index]
After shuffling, the images and labels still correspond one-to-one.
img[:5]
array(['./dataset/cityscapes/leftImg8bit/train/stuttgart/stuttgart_000195_000019_leftImg8bit.png',
'./dataset/cityscapes/leftImg8bit/train/tubingen/tubingen_000047_000019_leftImg8bit.png',
'./dataset/cityscapes/leftImg8bit/train/monchengladbach/monchengladbach_000000_019682_leftImg8bit.png',
'./dataset/cityscapes/leftImg8bit/train/dusseldorf/dusseldorf_000075_000019_leftImg8bit.png',
'./dataset/cityscapes/leftImg8bit/train/monchengladbach/monchengladbach_000000_010733_leftImg8bit.png'], dtype='<U158')
label[:5]
array(['./dataset/cityscapes/gtFine/train/stuttgart/stuttgart_000195_000019_gtFine_labelIds.png',
'./dataset/cityscapes/gtFine/train/tubingen/tubingen_000047_000019_gtFine_labelIds.png',
'./dataset/cityscapes/gtFine/train/monchengladbach/monchengladbach_000000_019682_gtFine_labelIds.png',
'./dataset/cityscapes/gtFine/train/dusseldorf/dusseldorf_000075_000019_gtFine_labelIds.png',
'./dataset/cityscapes/gtFine/train/monchengladbach/monchengladbach_000000_010733_gtFine_labelIds.png'], dtype='<U157')
Build the validation set
img_val = glob.glob('./dataset/cityscapes/leftImg8bit/val/*/*.png')
label_val = glob.glob('./dataset/cityscapes/gtFine/val/*/*_gtFine_labelIds.png')
len(img_val), len(label_val)
(500, 500)
Sort the validation images and labels by filename as well
img_val.sort(key=lambda x: x.split('/')[-1].split('.png')[0])
label_val.sort(key=lambda x: x.split('/')[-1].split('.png')[0])
Number of validation samples
val_count = len(img_val)
val_count
Number of training samples
train_count = len(img)
train_count
Build the training dataset
dataset_train = tf.data.Dataset.from_tensor_slices((img, label))
dataset_train
<TensorSliceDataset shapes: ((), ()), types: (tf.string, tf.string)>
Build the validation dataset
dataset_val = tf.data.Dataset.from_tensor_slices((img_val, label_val))
dataset_val
<TensorSliceDataset shapes: ((), ()), types: (tf.string, tf.string)>
Wrap image loading and decoding in a function
def read_png(path):
    img = tf.io.read_file(path)
    img = tf.image.decode_png(img, channels=3)  # 3-channel RGB image
    return img
And a decoding function for the label masks
def read_png_label(path):
    img = tf.io.read_file(path)
    img = tf.image.decode_png(img, channels=1)  # single-channel class-ID mask
    return img
Test the loading functions on one sample
img_1 = read_png(img[0])
label_1 = read_png_label(label[0])
img_1.shape
label_1.shape
TensorShape([1024, 2048, 3])
TensorShape([1024, 2048, 1])
plt.imshow(tf.squeeze(label_1))  # drop the channel dimension for imshow
Data augmentation
concat = tf.concat([img_1, label_1], axis=-1)
concat.shape
TensorShape([1024, 2048, 4])
After tf.concat stacks the image and the label, the result has 4 channels (3 + 1 = 4).
Custom augmentation function
def crop_img(img, mask):
    # stack image and mask so both get the identical random crop
    concat_img = tf.concat([img, mask], axis=-1)
    concat_img = tf.image.resize(concat_img, (280, 280),
                                 method=tf.image.ResizeMethod.NEAREST_NEIGHBOR)
    # random 256x256 crop of the 4-channel stack
    crop_img = tf.image.random_crop(concat_img, [256, 256, 4])
    return crop_img[:, :, :3], crop_img[:, :, 3:]
The function returns the image and the label by slicing the channel axis.
Test it
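The channel-axis slicing can be illustrated in plain NumPy, independent of TensorFlow:

```python
import numpy as np

# Hypothetical 4x4 "image" with 3 channels and a 1-channel "mask"
img = np.ones((4, 4, 3), dtype=np.uint8)
mask = np.zeros((4, 4, 1), dtype=np.uint8)

# Stack along the channel axis, as tf.concat(..., axis=-1) does
combined = np.concatenate([img, mask], axis=-1)
assert combined.shape == (4, 4, 4)

# Split back: first three channels are the image, the last is the mask
img_back, mask_back = combined[:, :, :3], combined[:, :, 3:]
```

Because both halves live in one tensor during the random crop, the image and its mask are cropped at exactly the same position.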
img_1, label_1 = crop_img(img_1, label_1)
img_1.shape, label_1.shape
(TensorShape([256, 256, 3]), TensorShape([256, 256, 1]))
plt.subplot(1,2,1)
plt.imshow(img_1.numpy())
plt.subplot(1,2,2)
plt.imshow(np.squeeze(label_1.numpy()))  # squeeze out the channel dim for imshow
Type conversion and normalization
def normal(img, mask):
    img = tf.cast(img, tf.float32) / 127.5 - 1  # scale pixels to [-1, 1]
    mask = tf.cast(mask, tf.int32)
    return img, mask
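The scaling `x / 127.5 - 1` maps uint8 pixel values from [0, 255] onto [-1, 1], which is easy to verify with NumPy:

```python
import numpy as np

# Pixel values covering the uint8 range, cast to float as tf.cast would
pixels = np.array([0, 127.5, 255], dtype=np.float32)
scaled = pixels / 127.5 - 1

assert scaled[0] == -1.0  # black maps to -1
assert scaled[1] == 0.0   # mid-gray maps to 0
assert scaled[2] == 1.0   # white maps to 1
```

Centering inputs around zero in this way is a common convention for networks whose final activation is tanh, or simply to keep activations well-scaled.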