Video content constitutes a dominant fraction of broadcast and internet traffic, and viewer expectations for high-quality video keep rising across every kind of screen. This growing demand challenges content creators and broadcasters, who must balance the economics of data bits against viewer engagement on a given platform. Although codecs have improved, quality measurement is gaining importance because it helps broadcasters direct their network and codec resources toward the quality metrics that really matter.
Broadcasters are therefore under pressure to deliver video at reduced bitrates without sacrificing quality. Companies have worked to fine-tune additional codec parameters, and improvements in core video codec tools have produced dramatic bitrate savings. An HEVC (H.265) encoder can achieve the same quality as MPEG-2 while using only a fraction (about 25-30%) of the bits. Other efficient codecs in common use include MPEG's AVC (H.264), Google's VP9, and AOMedia Video 1 (AV1).
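The 25-30% figure translates directly into bandwidth planning. A minimal sketch of the arithmetic, where the function name and the mid-range fraction of 0.275 are illustrative assumptions rather than a standard formula:

```python
def hevc_bitrate_estimate(mpeg2_bitrate_mbps, fraction=0.275):
    """Rough bitrate an HEVC encoder might need for similar quality,
    assuming it uses about 25-30% of the MPEG-2 bits (0.275 is a
    mid-range assumption, not a measured value)."""
    return mpeg2_bitrate_mbps * fraction

# A 10 Mbps MPEG-2 broadcast stream would need roughly 2.5-3 Mbps in HEVC.
print(hevc_bitrate_estimate(10.0))
```

Actual savings depend heavily on content, resolution, and encoder settings, so such an estimate is only a planning starting point.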
Video artifacts. The purpose of video encoding is to compress visual information so that the data rate is reduced while the result remains pleasing to the human eye. Video compression artifacts are categorized by whether they are time-based (temporal) or location-based (spatial). Spatial artifacts include blurring, blocking, ringing, the basic pattern effect, and color bleeding. Temporal artifacts include flickering, jerkiness, and floating.
Blurring reduces the sharpness of edges. Blocking introduces false discontinuities across block boundaries and is caused by coarse quantization. Ringing is a halo surrounding objects and edges; unlike mosquito noise, it does not move from frame to frame. The basic pattern effect usually occurs in textured regions such as waves, trees, and fields of grass.
Color bleeding occurs when the edge of one color extends into, or overlaps, a neighboring color. Its effect is worst in images with high color detail, and it is caused by chroma subsampling.
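The mechanism can be seen in a toy 4:2:0 round trip: the chroma plane is stored at a quarter of the luma resolution, so a sharp color edge that falls inside a 2x2 block gets averaged and then smeared back over the block on reconstruction. A minimal sketch (function names and the averaging/nearest-neighbour choices are illustrative; real codecs use better filters):

```python
import numpy as np

def subsample_420(chroma):
    """4:2:0-style subsampling: keep one chroma value per 2x2 block,
    here by averaging the block (a common, simple choice)."""
    h, w = chroma.shape
    return chroma.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample_nearest(small):
    """Nearest-neighbour reconstruction back to full resolution."""
    return small.repeat(2, axis=0).repeat(2, axis=1)

# A sharp chroma edge between columns 2 and 3 (0 = one color, 255 = another).
chroma = np.array([[0.0, 0.0, 0.0, 255.0]] * 4)
rec = upsample_nearest(subsample_420(chroma))
# Column 2 was originally 0 but comes back as 127.5: the right-hand
# color has "bled" one pixel to the left.
print(rec[0])
```

Because the edge position is lost inside the 2x2 block, no upsampling filter can fully restore it; this is why bleeding worsens as color detail increases.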
Video assessments. Various methods are used to judge video quality, but ultimately the eyes are the best meters. The engineering gamut of video quality spans a large number of parameters. Quality is generally ranked from bad, poor, and fair through good to excellent. Video quality can be assessed subjectively (by human viewers) or objectively (by machine). Mean opinion score (MOS) and differential mean opinion score (DMOS) are forms of subjective assessment.
Subjective measurements may be carried out using Golden Eye or Multiple Viewer methods; accurate eye measurement requires some training. The peak signal-to-noise ratio (PSNR) metric is widely used in signal processing; it is the ratio of the maximum possible signal power to the noise power. Besides PSNR, the structural similarity index (SSIM) and video multi-method assessment fusion (VMAF) fall under objective assessment.
Objective assessment may use a full-reference, reduced-reference, or no-reference comparison between the raw picture and the test picture. An absolute quality score can be estimated by comparing the reference image with the distorted image, and the average PSNR over all frames can serve as a video quality metric.
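The full-reference, frame-averaged PSNR described above can be sketched in a few lines. PSNR is 10·log10 of the squared peak value over the mean squared error; the helper names here are illustrative:

```python
import numpy as np

def psnr(reference, distorted, max_val=255.0):
    """Peak signal-to-noise ratio of one frame, in dB (higher is better)."""
    mse = np.mean((reference.astype(np.float64)
                   - distorted.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical frames: no noise at all
    return 10.0 * np.log10(max_val ** 2 / mse)

def mean_psnr(ref_frames, dist_frames, max_val=255.0):
    """Average per-frame PSNR over a clip, a simple video-level score."""
    return float(np.mean([psnr(r, d, max_val)
                          for r, d in zip(ref_frames, dist_frames)]))
```

For 8-bit video, values around 30-40 dB are typical of acceptable compression; note that averaging per-frame PSNR weights every frame equally, regardless of how visible its errors are.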
PSNR does not account for the human eye's differing sensitivity to the same amount of noise in different image structures. SSIM, by contrast, is a perceptual metric: it takes luminance, contrast, and structural information into account and therefore better simulates human observation.
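The three SSIM components map onto simple statistics: luminance to the means, contrast to the variances, and structure to the covariance. A minimal single-window sketch (real SSIM slides a weighted window over the image and averages; the function name is illustrative, and the constants follow the commonly used K1 = 0.01, K2 = 0.03):

```python
import numpy as np

def global_ssim(x, y, max_val=255.0):
    """SSIM computed over the whole image as one window: combines a
    luminance term (means), a contrast term (variances), and a
    structure term (covariance) into a single score in (-1, 1]."""
    c1 = (0.01 * max_val) ** 2  # stabilizer for the luminance term
    c2 = (0.03 * max_val) ** 2  # stabilizer for the contrast/structure term
    x = x.astype(np.float64)
    y = y.astype(np.float64)
    mx, my = x.mean(), y.mean()
    vx, vy = x.var(), y.var()
    cov = ((x - mx) * (y - my)).mean()
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))
```

Identical images score 1.0, and structural distortions (which PSNR may treat the same as invisible ones) pull the score down, which is what makes SSIM correlate better with perceived quality.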