Machine vision is concerned with the interpretation of camera images, i.e., with the semantic interpretation of image content. Owing to their superior performance, deep neural networks trained with machine-learning methods are currently used almost exclusively for this task.
One approach to image analysis is object detection and classification. For automated driving, for example, we distinguish between the classes cars, trucks/buses, pedestrians, cyclists, traffic signs, traffic lights, and many more. The detected objects are marked in the video with differently colored boxes.
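The output of such a detector is typically a list of class labels with bounding boxes, which are then drawn onto the image in class-specific colors. A minimal sketch of that visualization step, using a hypothetical detection list and an illustrative color palette (not the actual classes or colors used in the video):

```python
# Illustrative class-to-color palette (assumed, not from the video).
CLASS_COLORS = {
    "car": (255, 0, 0),
    "pedestrian": (0, 255, 0),
    "traffic_sign": (0, 0, 255),
}

def draw_box(image, box, color):
    """Draw a 1-pixel-wide box outline (x0, y0, x1, y1) onto image[y][x]."""
    x0, y0, x1, y1 = box
    for x in range(x0, x1 + 1):
        image[y0][x] = color
        image[y1][x] = color
    for y in range(y0, y1 + 1):
        image[y][x0] = color
        image[y][x1] = color

def visualize(image, detections):
    """Mark each detection with a box in its class color."""
    for det in detections:
        draw_box(image, det["box"], CLASS_COLORS[det["class"]])
    return image

# Usage: an 8x8 black RGB image with one hypothetical "car" detection.
img = [[(0, 0, 0) for _ in range(8)] for _ in range(8)]
visualize(img, [{"class": "car", "box": (1, 1, 5, 4)}])
```

In a real pipeline the detection list would come from a trained network; here it is hard-coded only to show the box-marking convention described above.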
Another approach is the classification of each individual pixel, i.e., deciding which object class a pixel belongs to. In the video this is shown by different coloring: contiguous areas of the same color mark an object class and form an object. If semantic segmentation and object (instance) formation are performed simultaneously, this is called panoptic segmentation.
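The per-pixel result can be represented as a class-id map that is colorized for display, and panoptic segmentation additionally assigns each pixel an instance id. A sketch under assumed conventions: the palette is illustrative, and the segment encoding `class_id * label_divisor + instance_id` is one common way to fuse the two maps, not necessarily the one used here:

```python
# Illustrative class palette (assumed): 0 = road, 1 = pedestrian, 2 = car.
PALETTE = {0: (128, 64, 128),
           1: (220, 20, 60),
           2: (0, 0, 142)}

def colorize(class_map):
    """Semantic segmentation display: map each pixel's class id to a color."""
    return [[PALETTE[c] for c in row] for row in class_map]

def panoptic_ids(class_map, instance_map, label_divisor=1000):
    """Fuse class and instance ids into one panoptic segment id per pixel."""
    return [[c * label_divisor + i
             for c, i in zip(crow, irow)]
            for crow, irow in zip(class_map, instance_map)]

# Usage: a tiny 3x3 example with road (0), one car (2), one pedestrian (1).
classes   = [[0, 2, 2],
             [0, 2, 2],
             [1, 1, 0]]
instances = [[0, 1, 1],   # "stuff" pixels get instance 0, the car gets 1, etc.
             [0, 1, 1],
             [1, 1, 0]]
colored  = colorize(classes)
panoptic = panoptic_ids(classes, instances)
```

Areas sharing one panoptic id then correspond exactly to the "areas of the same color that form an object" described above, while the plain color map alone cannot separate two touching objects of the same class.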
Finally, depth or distance values can be estimated for each pixel from mono (single) camera images. This is shown in the third part of the video.
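Once a network has estimated a depth value for a pixel, the standard pinhole camera model turns that pixel into a 3-D point in camera coordinates. A small sketch of this back-projection; the intrinsic parameters (`fx`, `fy`, `cx`, `cy`) below are assumed example values, not calibration data from the video:

```python
def backproject(u, v, depth, fx, fy, cx, cy):
    """Pinhole model: pixel (u, v) with estimated depth Z -> (X, Y, Z).

    X = (u - cx) * Z / fx,  Y = (v - cy) * Z / fy,  in camera coordinates.
    """
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x, y, depth)

# Usage: hypothetical intrinsics for a 640x480 image, principal point at
# the image center, and an estimated depth of 10 m at pixel (400, 240).
point = backproject(u=400, v=240, depth=10.0,
                    fx=500.0, fy=500.0, cx=320.0, cy=240.0)
# x = (400 - 320) * 10 / 500 = 1.6 m, y = 0.0 m, z = 10.0 m
```

The depth estimate itself comes from the learned network; the geometry above only converts it into metric 3-D coordinates, e.g. for distance-keeping in automated driving.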