This tutorial walks you through image segmentation using a modified U-Net, trained on the Oxford-IIIT Pet Dataset (created by Parkhi et al.).
Image segmentation involves training a neural network to output a pixel-wise mask of an image. Each pixel is assigned a label that indicates whether or not it belongs to the object in that image.
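To make the idea of a pixel-wise mask concrete, here is a minimal sketch in plain Python. The 4x4 `image`, the hand-written `mask`, and the `label_counts` helper are hypothetical illustrations, not part of the dataset or the tutorial's model. The mask here is written by hand rather than predicted by a network, and it uses only two labels for simplicity.

```python
from collections import Counter

# Hypothetical 4x4 grayscale "image": each value is a pixel intensity (0-255).
image = [
    [12, 15, 200, 210],
    [10, 220, 230, 14],
    [11, 225, 215, 13],
    [9, 12, 16, 10],
]

# A matching pixel-wise mask with the same height and width as the image.
# Assumed labels for this sketch: 1 = pixel belongs to the object, 0 = it does not.
mask = [
    [0, 0, 1, 1],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]


def label_counts(mask):
    """Count how many pixels carry each label in a 2D mask."""
    return Counter(label for row in mask for label in row)


counts = label_counts(mask)
print(dict(counts))
```

The key property is that the mask has one label per image pixel, so a segmentation model's output has the same spatial dimensions as its input.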
Image segmentation has many applications, for example in medical imaging, self-driving cars, and satellite imaging.
The Oxford-IIIT Pet Dataset consists of images, their corresponding labels, and pixel-wise masks. These masks are essentially labels for each pixel, which fall into three categories: