Video Quality Metrics

Digital images are subject to a wide variety of distortions during acquisition, compression, storage, transmission and reproduction.  Any of these may degrade image quality.  To maintain an acceptable level of quality control, video quality metrics are applied to judge the quality of the image.  Ideally, the assessment process would supply metrics that predict image quality automatically.  In practice, subjective tests are carried out in which human experts view the video and rate its quality on the Mean Opinion Score (MOS) scale.  This is time-consuming and expensive.  Objective tests can be carried out automatically: objective quality metrics predict visual quality by comparing signals, using information known about the human visual system.

This essay compares the performance of PSNR and SSIM, two objective video quality metrics.  Two video clips are used for this purpose.  Each clip is compressed with the MPEG-4 codec to an output rate of 128Kbps and to an output rate of 1Mbps, which are suitable for low-speed and high-speed Internet connections respectively.

FFmpeg, a computer programme that can record, convert and stream audio-visual content in many formats, was used to convert the files to MP4.  The command line is as follows:

ffmpeg -i coastguard_cif.avi -vcodec mpeg4 -b 1Mb -bf 1 -an -psnr -vstats coastbig.mp4

The options are: -i, the input filename; -vcodec, the codec to be used; -b, the output bit rate; -bf, the number of B frames to be included between the P frames; -an, no audio; -vstats, produce statistics; -psnr, include the PSNR details.
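The two encodes used in this essay differ only in the bit rate and the output file name. The sketch below builds the argument list for each, reusing only the options listed above. The 128Kbps value `128k` and the output name `coastsmall.mp4` are assumptions made for illustration, since the essay shows the 1Mbps command only, and newer FFmpeg builds may spell some of these options differently.

```python
def build_encode_args(src, dst, bitrate):
    """Return the ffmpeg argument list for one MPEG-4 encode."""
    return [
        "ffmpeg",
        "-i", src,            # input file
        "-vcodec", "mpeg4",   # MPEG-4 video encoder
        "-b", bitrate,        # target output bit rate
        "-bf", "1",           # one B frame between P frames
        "-an",                # no audio
        "-psnr",              # include PSNR details
        "-vstats",            # produce per-frame statistics
        dst,                  # output file
    ]

# The 1Mbps encode shown in the essay.
fast = build_encode_args("coastguard_cif.avi", "coastbig.mp4", "1Mb")
# Hypothetical 128Kbps counterpart (bit rate string and file name assumed).
slow = build_encode_args("coastguard_cif.avi", "coastsmall.mp4", "128k")

print(" ".join(fast))
print(" ".join(slow))
```

To run an encode rather than print it, the list can be passed to `subprocess.run(args, check=True)`.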

MSU Video Quality Measurement Tool version 2.0 was the programme used to produce the SSIM information.
The following tables detail the two video clips, giving information for the original clip and for the output of the two MPEG-4 compressions performed on each.

Coastguard.avi

| | Original file (Coastguard_cif.avi) | Compressed with MPEG-4 for slow Internet connection | Compressed with MPEG-4 for fast Internet connection |
| --- | --- | --- | --- |
| File size | 45.6 MB | 351 KB | 1.506 MB |
| Frame size (pixels) | 352 x 288 | 352 x 288 | 352 x 288 |
| Bit rate | 30421 Kbps | 145 Kbps | 914 Kbps |
| Frame rate | 25 fps | 25 fps | 25 fps |
| Length | 12 secs | 12 secs | 12 secs |
| Compression factor | - | 125 | 29 |
| Frame size (bytes) | 152105 | 72 to 27018 | 113 to 27018 |
| Recognition | - | Significant blocking, particularly when there is activity in the scene | Very little variance from original |
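The original-file figures in the table are mutually consistent, which can be checked with a little arithmetic. The sketch below assumes decimal units (1 Kbps = 1000 bits per second, 1 MB = 1,000,000 bytes) and a frame count of 300, that is, 12 seconds at 25 fps. The function names are illustrative only.

```python
FRAME_BYTES = 152105   # uncompressed bytes per frame, from the table
FRAME_RATE = 25        # frames per second
FRAME_COUNT = 300      # 12 seconds at 25 fps (assumed exact)

def raw_bit_rate_kbps(frame_bytes, fps):
    """Bit rate of the uncompressed stream in Kbps (1 Kbps = 1000 bit/s)."""
    return frame_bytes * 8 * fps / 1000

def raw_file_size_mb(frame_bytes, frames):
    """Size of the uncompressed clip in MB (1 MB = 1,000,000 bytes)."""
    return frame_bytes * frames / 1_000_000

print(raw_bit_rate_kbps(FRAME_BYTES, FRAME_RATE))             # 30421.0
print(round(raw_file_size_mb(FRAME_BYTES, FRAME_COUNT), 1))   # 45.6
```

Both results match the bit rate and file size listed for the original file.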

News.avi file

| | Original file (news_cif.avi) | Compressed with MPEG-4 for slow Internet connection | Compressed with MPEG-4 for fast Internet connection |
| --- | --- | --- | --- |
| File size | 45.6 MB | 339 KB | 1.532 MB |
| Frame size (pixels) | 352 x 288 | 352 x 288 | 352 x 288 |
| Bit rate | 30421 Kbps | 197 Kbps | 1007 Kbps |
| Frame rate | 25 fps | 25 fps | 25 fps |
| Length | 12 secs | 12 secs | 12 secs |
| Compression factor | 1 | 130 | 29 |
| Frame size (bytes) | 152105 | 43 to 12970 | 166 to 24865 |
| Recognition | - | Significant blocking, particularly when there is activity from the dancers | Very little variance from original |

 

 

Peak signal-to-noise ratio (PSNR) is frequently used as a measure of the distortion introduced by MPEG compression.  For colour video, it is typically acceptable to compute PSNR on the luminance component only.  However, PSNR is not based on signal differences as processed by the human visual system.  Typical PSNR values for lossy compression are between 30 and 50 dB; the higher the value, the better the quality.
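As a minimal sketch of how PSNR is computed for one frame, the function below takes the luminance samples of the reference and decoded frames as flat sequences and assumes 8-bit video (peak value 255). It illustrates the standard definition and is not the code FFmpeg runs internally. The sample values are made up.

```python
import math

def psnr(reference, distorted, peak=255.0):
    """PSNR in dB between two equal-length sequences of luminance samples."""
    if len(reference) != len(distorted):
        raise ValueError("frames must have the same number of samples")
    # Mean squared error between the two frames.
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf   # identical frames: no distortion
    return 10.0 * math.log10(peak ** 2 / mse)

# Made-up luminance samples standing in for a reference and a decoded frame.
ref = [52, 55, 61, 66, 70, 61, 64, 73]
dec = [54, 55, 60, 66, 69, 62, 64, 70]
print(round(psnr(ref, dec), 2))   # 45.12
```

Because the score depends only on the mean squared error, two frames with the same total error receive the same PSNR regardless of where in the picture the error falls.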

The metric used in this part of the assignment is the peak signal-to-reconstructed-image measure, which is called PSNR.  MOS refers to the Mean Opinion Score, used here to describe the subjective viewing comparison.  The following is a brief comment on each video clip regarding its subjective and objective quality measures.

Coastguard output at quality of 128Kbps

The maximum PSNR in this sequence is 41.41 and the minimum is 25.19, with an average of 27.35.  On subjective analysis of the clip, significant blocking (MOS = 1) occurs between frames 25 and 80.  In this section of the clip, two boats are in motion, which in turn causes movement in the water.  However, this is not reflected in the PSNR (Figure 1), whose value in these frames does not fall below 32, well above the average for the clip.

 

The SSIM values for this clip (Figure 2) range from a minimum of 0.21164 to a maximum of 0.96969, with an average of 0.649144 over the 300 frames.  SSIM has its lowest values in frames 50 to 80, which coincides with the part of the clip where the two boats are moving on the water.  The SSIM graph clearly indicates a significant quality problem at this point in the video, with some of the frames in this range labelled as bad frames.
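To show what SSIM measures, the sketch below evaluates the SSIM formula once over a whole frame, combining the means (luminance), variances (contrast) and covariance (structure) of the two signals. This is a deliberate simplification: standard SSIM evaluates the formula over local windows and averages the results, and the exact procedure of the MSU tool is not described here. The constants use the commonly quoted values K1 = 0.01 and K2 = 0.03, and the sample values are made up.

```python
def global_ssim(x, y, peak=255.0):
    """Single-window SSIM over two equal-length sequences of luminance samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two frames of equal length with at least 2 samples")
    c1 = (0.01 * peak) ** 2   # stabilises the luminance term
    c2 = (0.03 * peak) ** 2   # stabilises the contrast/structure term
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / (n - 1)
    var_y = sum((b - mean_y) ** 2 for b in y) / (n - 1)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / (n - 1)
    numerator = (2 * mean_x * mean_y + c1) * (2 * cov + c2)
    denominator = (mean_x ** 2 + mean_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator

# Made-up luminance samples standing in for a reference and a decoded frame.
ref = [52, 55, 61, 66, 70, 61, 64, 73]
dec = [54, 55, 60, 66, 69, 62, 64, 70]
print(global_ssim(ref, ref))   # identical frames score 1.0
print(global_ssim(ref, dec))   # slightly distorted frame scores just below 1
```

Unlike PSNR, the score falls when the structure of the picture changes, even if the average pixel error is small.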

 

 

Figure 4: SSIM for coastguard.avi compressed using MPEG-4 to output rate of 1Mbps

Coastguard output at 1Mbps

The maximum PSNR in this sequence is 41.43 and the minimum is 31.5, with an average of 34.43.  This higher average demonstrates the superior quality of the output at these compression settings.  Whereas the highly compressed video above has a wide range between its maximum and minimum PSNR, the range here is narrower, and the minimum is well above the average of the previous clip.  In this case, the PSNR results correlate with the subjective measure, with MOS no lower than 4 at any point.

Figure 6: SSIM for news.avi compressed using MPEG-4 to output rate of 128Kbps

Figure 8: SSIM for news.avi compressed using MPEG-4 to output rate of 1Mbps

News output at 1Mbps

The maximum PSNR in this sequence is 45.25 and the minimum is 39.87, with an average of 43.89.  This small range of values reflects the high quality of this compression, with little or no interruption to the viewer and a MOS of 5.  The SSIM values for this clip range from a minimum of 0.83389 to a maximum of 0.98662, with an average of 0.956755 over the 299 frames.  The SSIM values dip in three areas, as mentioned previously (frames 89, 148 and 239).  Because SSIM incorporates structural, luminance and contrast information, a change of a large part of the background from dark to light registers as a significant difference to the SSIM model.  Subjectively it does not register, as a change in the background is perceived as a normal event.

Group of Pictures

Each sequence is subdivided into groups of pictures (GOPs), and each GOP is subdivided into frames of type I, P, B or D.  I frames are intra frames; as they are compressed solely in a spatial manner, they will have higher PSNR and SSIM values.  P frames are predicted and will have lower PSNR and SSIM values, and B frames, which are bi-directionally predicted, will have the lowest.  Each GOP starts with an I frame, so the periodic peaks on the graphs represent the I frames; counting the frames between the peaks gives a GOP of between 10 and 15 frames.  The GOP length is constant within a file and, in this instance, approximately the same for every clip.  This is contrary to good practice: for low bit rate streaming, the visual quality of the output can be considerably improved by reducing the frame rate and the GOP size.
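Counting the frames between peaks can be automated. The sketch below treats a frame as a peak when its PSNR exceeds both neighbours by a threshold, then takes the median spacing between peaks as the GOP length. The threshold `min_rise` and the synthetic trace (a peak every 12 frames) are assumptions for illustration and are not data from the clips.

```python
from statistics import median

def find_peaks(values, min_rise=1.0):
    """Indices of frames whose value exceeds both neighbours by at least min_rise."""
    return [
        i for i in range(1, len(values) - 1)
        if values[i] - values[i - 1] >= min_rise
        and values[i] - values[i + 1] >= min_rise
    ]

def estimate_gop_length(values, min_rise=1.0):
    """Median spacing between peaks, or None if fewer than two peaks are found."""
    peaks = find_peaks(values, min_rise)
    if len(peaks) < 2:
        return None
    gaps = [later - earlier for earlier, later in zip(peaks, peaks[1:])]
    return median(gaps)

# Synthetic per-frame PSNR trace with a higher value every 12th frame.
trace = [36.0 if i % 12 == 0 else 32.0 for i in range(60)]
print(estimate_gop_length(trace))   # 12
```

On real data the threshold would need tuning, since scene changes can also produce isolated peaks.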

The relationship between increased compression and reduced PSNR is demonstrated in Figure 10.  The output compressed to 128Kbps has, in the main, frames compressed above 95:1, with all frames compressed above 90:1, and the majority of the PSNR values lie in the range 25 to 35, with equivalent MOS values of poor to good.  The 1Mbps output has a more diverse range of compression ratios, from 80:1 to 98:1, and PSNR values in the range 32 to 43, with MOS values of good to excellent.
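A per-frame compression figure can be derived from the frame sizes in the tables. The sketch below computes both the plain ratio (raw bytes to compressed bytes) and the percentage of space saved. Which of the two the compression axis in the figures uses is not confirmed here, so both are shown. The function names are illustrative, and 1521 bytes is a made-up frame size.

```python
RAW_FRAME_BYTES = 152105   # uncompressed bytes per frame, from the tables

def compression_ratio(compressed_bytes, raw_bytes=RAW_FRAME_BYTES):
    """Raw size divided by compressed size, i.e. the N in N:1."""
    return raw_bytes / compressed_bytes

def space_saving_percent(compressed_bytes, raw_bytes=RAW_FRAME_BYTES):
    """Percentage of the raw frame size removed by compression."""
    return 100.0 * (1.0 - compressed_bytes / raw_bytes)

# Largest and smallest frame of news.avi at 1Mbps (from the table),
# plus a made-up 1521-byte frame for comparison.
for size in (24865, 166, 1521):
    ratio = compression_ratio(size)
    saving = space_saving_percent(size)
    print(f"{size:>6} bytes  ratio {ratio:8.1f}:1  saving {saving:5.2f}%")
```

For the news clip at 1Mbps, the largest and smallest frames give savings of about 83.7% and 99.9%. This is close to the 82 to 99.9 spread quoted for that clip, so the values written as N:1 may be percentage savings rather than true ratios. That reading is an inference from the tables, not something the essay states.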

Figure 12: PSNR vs compression ratio for news.avi compressed using MPEG-4

The PSNR values for news.avi compressed to 1Mbps lie within a small range of 42 to 45, whereas the compression ratio is spread over 82:1 to 99.9:1.  This type of video lends itself well to compression, as the majority of the frame content stays constant from frame to frame.  The compression ratio for the majority of frames in the clip compressed at 128Kbps is in the region of 99:1, with only the I frames compressed at a lower ratio (90:1 to 96:1), as evidenced by Figure 9.

 
