Edge-enhanced ultra high definition video quality assessment

Jianxin Teng; Jiefeng He; Jinchun Yuan; Fengchuang Xing; Hanpin Wang

doi:10.11834/jig.220648

Image Processing

Vol. 28, Issue 3, Pages: 691-701(2023)
Published: 16 March 2023
Accepted: 04 September 2022

Jianxin Teng, Jiefeng He, Jinchun Yuan, Fengchuang Xing, Hanpin Wang. Edge-enhanced ultra high definition video quality assessment [J]. Journal of Image and Graphics, 28(3): 691-701 (2023). DOI: 10.11834/jig.220648

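Abstract (translated from the Chinese)

Objective

With the rapid development of network and television technology, watching 4K (3840×2160 pixels) ultra high definition (UHD) video has become a trend. However, because UHD video has high resolution, rich edge and detail information, and a huge data volume, distortion is more easily introduced during acquisition, compression, transmission, and storage. UHD video quality assessment has therefore become an important research topic in broadcast television technology. This paper proposes an edge-enhanced UHD video quality assessment method.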

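Method

Each frame of the input video is split into its R, G, and B channels, an edge detection operator is applied to each channel separately, and the edge information of the three channels is merged to obtain the edge image of the frame. Three network modules (edge masking, content dependency, and temporal memory) are designed to extract the corresponding features. The features are fed into a fully connected layer for dimensionality reduction to obtain quality features, from which the quality score of the input video is computed. Because UHD video is rich in edges and its edge details are extremely sharp, distortion introduced at edges is usually conspicuous, so the proposed edge-enhanced method is particularly suited to UHD video quality assessment. Because the method also incorporates content dependency and temporal hysteresis, it is applicable to the quality assessment of other in-the-wild videos as well.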
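
The per-channel edge step described above can be sketched as follows. This is a minimal, dependency-free illustration and not the authors' implementation: it substitutes a thresholded Sobel gradient for the edge detection operator (the English abstract names Canny), and it assumes the three channel maps are merged with a pixel-wise OR, which the abstract does not specify. The helper names `sobel_edges` and `frame_edge_map` and the threshold value are hypothetical.

```python
import math


def sobel_edges(channel, threshold):
    """Binary edge map of one channel (a list of rows of intensities).

    A Sobel gradient magnitude is thresholded; this is a stand-in for the
    Canny operator, chosen only to keep the sketch dependency-free.
    """
    h, w = len(channel), len(channel[0])
    edges = [[0] * w for _ in range(h)]
    # Border pixels are left at 0 because the 3x3 kernel does not fit there.
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (channel[y - 1][x + 1] + 2 * channel[y][x + 1] + channel[y + 1][x + 1]
                  - channel[y - 1][x - 1] - 2 * channel[y][x - 1] - channel[y + 1][x - 1])
            gy = (channel[y + 1][x - 1] + 2 * channel[y + 1][x] + channel[y + 1][x + 1]
                  - channel[y - 1][x - 1] - 2 * channel[y - 1][x] - channel[y - 1][x + 1])
            if math.hypot(gx, gy) >= threshold:
                edges[y][x] = 1
    return edges


def frame_edge_map(r, g, b, threshold=100):
    """Detect edges per channel, then merge them with a pixel-wise OR.

    The OR merge and the default threshold are assumptions for illustration.
    """
    maps = [sobel_edges(c, threshold) for c in (r, g, b)]
    return [[max(pixels) for pixels in zip(*rows)] for rows in zip(*maps)]


# Usage: a vertical step edge that is visible only in the red channel.
red = [[0, 0, 255, 255, 255] for _ in range(5)]
flat = [[0] * 5 for _ in range(5)]
edge_map = frame_edge_map(red, flat, flat)
```

Detecting edges per channel before merging keeps an edge that appears in only one colour channel, which a single grayscale conversion could weaken.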

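Result

Experiments were conducted on 4 video quality assessment datasets, including a UHD dataset, and the method was compared with 5 mainstream methods; the results show that the proposed method performs best. On the KoNViD-1K, DVL2021, LIVE-Qualcomm, and LSVQ datasets, compared with the best-performing existing method, SROCC (Spearman rank-order correlation coefficient) improved by 3.9%, 4.2%, 10.0%, and 0.6%, respectively, and PLCC (Pearson's linear correlation coefficient) improved by 3.9%, 2.2%, 10.1%, and 0.1%, respectively.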
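
For readers unfamiliar with the two reported metrics, the sketch below shows how SROCC and PLCC can be computed from predicted scores and subjective scores. It is a plain-Python illustration with hypothetical function names and made-up numbers, not the evaluation code behind the results above, and it omits any fitting step that a benchmark protocol might apply before computing PLCC.

```python
import math


def pearson(x, y):
    """Pearson's linear correlation coefficient (PLCC) of two sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank-order correlation (SROCC): Pearson on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Illustrative values only: predictions that are monotonic in the
# subjective scores but not linear in them.
predicted = [1.0, 2.0, 3.0, 4.0]
subjective = [1.0, 4.0, 9.0, 16.0]
srocc = spearman(predicted, subjective)
plcc = pearson(predicted, subjective)
```

SROCC measures only whether the predicted ranking agrees with the subjective ranking, while PLCC also penalizes departures from a linear relationship, which is why the two values can differ for the same predictions.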

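Conclusion

By exploiting the characteristics of UHD video, the proposed method fits the properties of human vision more closely and achieves state-of-the-art performance. Because it does not use optical flow, it greatly reduces the computational cost, and it generalizes well.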

Abstract

Objective

With the rapid development of network and television technology, 4K (3840×2160 pixels) ultra high definition (UHD) video is becoming widespread. However, because of its high resolution, rich edge and texture information, and huge data volume, UHD video is more prone to distortion during acquisition, compression, transmission, and storage. UHD video quality assessment (VQA) has therefore become a crucial research topic in television broadcasting, and our research focuses on an edge-enhanced UHD-VQA method.

Method

First, each input video frame is split into its R, G, and B channels. An edge detection operator is then applied to each channel, and the edge information of the R, G, and B channels is merged to obtain the edge map of the frame. To extract the spatial information of the video, the content dependency of the human visual system (HVS) is taken into account: each frame is fed into a ResNet-50 pretrained on ImageNet-1K, and the resulting feature maps are globally pooled and concatenated to reduce the feature dimension. The features are then processed in three steps: 1) they are passed through a gated recurrent unit (GRU); 2) the output features are processed with min pooling and softmin pooling; and 3) the prediction score is calculated as a weighted sum of the pooled values. To extract these multiple kinds of features, edge-masking, content-dependency, and temporal-memory network modules are designed. Finally, the features are fed into a fully connected layer for dimensionality reduction to obtain the quality features, from which the video quality score is calculated. Because UHD video is sharp and rich in edge detail, distortion at edges tends to be conspicuous, so the edge-enhanced method is particularly suited to the quality assessment of UHD video. At the same time, because content-dependency and temporal hysteresis features are introduced, the method is also applicable to the quality assessment of other in-the-wild videos.
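
The min pooling and softmin pooling step can be illustrated with a small sketch. The abstract does not give the exact formulation, so the version below is an assumption: for each frame, a memory term takes the minimum score over a few preceding frames, a current term takes a softmin-weighted average over the current and following frames, and the two are blended with a fixed weight before averaging over time. The function name `hysteresis_pool`, the window length `tau`, and the blend weight `gamma` are hypothetical.

```python
import math


def hysteresis_pool(frame_scores, tau=2, gamma=0.5):
    """Pool per-frame quality scores into one video score.

    Assumed formulation (not confirmed by the source):
      memory  = min of up to `tau` previous frame scores (min pooling)
      current = softmin-weighted mean of the current and next `tau` scores
      pooled  = gamma * memory + (1 - gamma) * current
    The video score is the mean of the pooled values.
    """
    n = len(frame_scores)
    if n == 0:
        raise ValueError("frame_scores must not be empty")
    pooled = []
    for t in range(n):
        # Memory term: the worst recent frame dominates what viewers remember.
        previous = frame_scores[max(0, t - tau):t]
        memory = min(previous) if previous else frame_scores[t]
        # Current term: softmin weights give low scores more influence.
        upcoming = frame_scores[t:min(n, t + tau + 1)]
        weights = [math.exp(-q) for q in upcoming]
        current = sum(w * q for w, q in zip(weights, upcoming)) / sum(weights)
        pooled.append(gamma * memory + (1 - gamma) * current)
    return sum(pooled) / n


# Usage: a single bad frame pulls the pooled score below the plain mean.
scores = [0.9, 0.9, 0.2, 0.9, 0.9]
video_score = hysteresis_pool(scores)
plain_mean = sum(scores) / len(scores)
```

Under these assumptions a short quality drop lowers the video score more than simple averaging would, which is the hysteresis behaviour the method aims to model.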

Result

The proposed method is compared with 5 popular methods on 4 datasets. Relative to the best-performing existing method, it improves: 1) SROCC (Spearman rank-order correlation coefficient) by 3.9% and PLCC (Pearson's linear correlation coefficient) by 3.9% on KoNViD-1K; 2) SROCC by 4.2% and PLCC by 2.2% on DVL2021; 3) SROCC by 10.0% and PLCC by 10.1% on LIVE-Qualcomm; and 4) SROCC by 0.6% and PLCC by 0.1% on LSVQ. A cross-dataset experiment is carried out to demonstrate the generalization ability of the method, and an ablation study is conducted to verify the effectiveness of edge information. The network can also be trained well to match the edge-masking feature without edge masking.

Conclusion

To address edge distortion, an edge-enhanced method is proposed for assessing the quality of UHD video. Content-dependency and temporal hysteresis features are also introduced to address the UHD-VQA problem. The Canny operator is used to detect the edge information of video frames, and its configuration is determined. The training parameters are used to deal with the heterogeneity problem across multiple video datasets. To verify the effectiveness of the proposed method, extensive experiments are carried out on 4 popular video quality assessment datasets (including a UHD dataset). The performance gains reach up to 10.0%, and the smallest gain is 0.1%. These experimental results show that edge information can greatly improve the performance of VQA methods. The computational cost is greatly reduced because optical flow is not used. Future research may explore more HVS features for the no-reference VQA (NR-VQA) problem.

Keywords

ultra high definition video; video quality assessment (VQA); convolutional neural network (CNN); human visual system (HVS); edge enhancement

