
Published: 2017-05-16
DOI: 10.11834/jig.160599
2017 | Volume 22 | Number 5




Special column: The 18th National Conference on Image and Graphics









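Error protection against key-frame loss in distributed video coding
Rong Song, Yang Hong, Qing Linbo, Wang Zhengyong
College of Electronic Information Engineering, Sichuan University, Chengdu 610065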

Abstract

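Objective Compared with conventional video coding, distributed video coding (DVC) offers simple encoding and high robustness to bit errors, which suits emerging video services such as unmanned aerial photography and wireless surveillance. In DVC, the frames of a video are alternately classified as key frames and Wyner-Ziv frames. Under channel fading and interference, key frames, which use conventional intra-frame coding, are far less robust to errors than Wyner-Ziv frames, which rely on channel coding. Whether the key frames are transmitted and decoded correctly largely determines whether the Wyner-Ziv frames can be decoded correctly, and therefore affects the compression efficiency and rate-distortion performance of the whole system. To address the robust transmission of key frames over heterogeneous networks, this paper proposes a wavelet-domain, quality-scalable protection and transmission scheme for key frames. Method At the encoder, each key frame is encoded with both conventional intra-frame video coding and wavelet-domain Wyner-Ziv coding. At the decoder, the corrupted key frame after error concealment serves as the base layer, and the parity bitstream produced by Wyner-Ziv coding serves as the enhancement layers. To strengthen the layered structure so that the bit rate can adapt to different network conditions, the low-frequency and high-frequency bands at the different levels of the wavelet decomposition are grouped into different enhancement layers, and the Wyner-Ziv parity data of different layers are transmitted according to the channel environment. In addition, the virtual noise model of the key frame under errors is improved: the decoded and reconstructed bands of the first enhancement layer, together with the corresponding side information, are used to obtain a more realistic estimate of the virtual channel model for the bands of the second and third enhancement layers. Result For different video sequences at key-frame error rates of 1% to 20%, the proposed scheme improves the subjective quality of the reconstructed images and the overall rate-distortion performance compared with conventional intra-frame error concealment. For example, at a key-frame error rate of 5%, transmitting the first enhancement layer raises the peak signal-to-noise ratio (PSNR) by about 2 to 5 dB, depending on the sequence; if the parity information of the second enhancement layer is also transmitted, the PSNR rises by about 0.5 to 1.6 dB as well; and if the parity information of all three enhancement layers is transmitted, the PSNR essentially reaches that of error-free key frames. Conclusion The proposed scheme effectively handles the errors that key frames in a DVC system may suffer during transmission over real channels, and its layered transmission adapts to the channel conditions of different networks.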

Keywords

distributed video coding; key frame; wavelet domain; scalability; error-prone channel

Error protection for key frames in distributed video coding
Rong Song, Yang Hong, Qing Linbo, Wang Zhengyong
College of Electronic Information Engineering, Sichuan University, Chengdu 610065, China
Supported by: National Natural Science Foundation of China (61471248)

Abstract

Objective Distributed video coding (DVC) has attracted significant attention from international standardization committees and experts since the emergence of distributed source coding (DSC). DSC is a class of source coding approaches based on the Slepian-Wolf theorem and the Wyner-Ziv (WZ) theorem. Owing to its lightweight encoding and high error robustness, DVC suits new video services that require low power consumption and low complexity, such as video chat and unmanned aerial wireless monitoring. However, the bit error ratio of a wireless channel is higher than that of a wired channel because of channel attenuation, multipath interference, mutual interference between frequency bands, and so on. In a DVC system, the video source is split into interleaved key frames and WZ frames, and the side information, regarded as a noisy version of the current WZ frame, is generated from the adjacent key frames by motion estimation and compensation. Therefore, whether the key frames are transmitted and decoded correctly affects the compression efficiency and rate-distortion performance of the whole system. However, the robustness of the key frames, which use traditional intra-frame coding, is far lower than that of the WZ frames, which are based on channel coding. To address the robust transmission of key frames in heterogeneous networks, this paper presents a quality-scalable protection solution for key frames in wavelet-domain DVC. Method At the encoder, the key frames are encoded simultaneously by traditional HEVC/H.265 (High Efficiency Video Coding) intra-frame coding and by wavelet-domain Wyner-Ziv coding. The HEVC bitstreams are transmitted over the wireless channel. For the WZ bitstreams, the information bits are discarded directly, and the generated parity bits are stored in a buffer.
To let the bit rate of the system adapt to different network conditions, the low-frequency and high-frequency bands at different levels of the wavelet-decomposed image are combined into different enhancement layers. The decoder first determines whether the HEVC bitstream of a key frame has suffered losses. If there is no error, the HEVC bitstream is decoded and the frame is reconstructed directly, and the WZ parity bits in the buffer are deleted. Otherwise, error concealment is used to reconstruct a frame from the received HEVC bitstream. The reconstructed frame then serves as the side information of the current key frame, and the decoder requests the WZ data of different enhancement layers according to the channel environment. In a DVC system, the residual between the original frame and its side information approximately follows a Laplacian distribution. Because the decoder cannot access the original frame, the usual practice is to use the forward reference frame and the side information to obtain the virtual noise model of the current frame. However, if the channel is constrained and the key frames contain errors, the parity data of all enhancement layers cannot be sent. As a result, the quality of the reconstructed forward reference frame may be relatively poor, and the estimated virtual noise model may deviate considerably from the actual one. Therefore, this paper improves the virtual noise model of corrupted key frames by exploiting the similarity of the virtual noise models of bands at the same level of the wavelet decomposition. From the decoded bands of the first enhancement layer and the corresponding side information, a more realistic virtual noise model of the second and third enhancement layers can be obtained.
Result To validate the proposed scheme, the luminance components of three video sequences with different motion characteristics, foreman, bus, and coastguard, are simulated. The rate-distortion performance is evaluated over packet loss channels with random packet loss rates (PLR) of 1%, 5%, 10%, and 20%. Experimental results show that, compared with the traditional error concealment method, the proposed scheme effectively improves the rate-distortion performance of the reconstructed video under different channel conditions. Specifically, when only the parity data of the first enhancement layer are transmitted and the loss rate of key frames is 5%, the peak signal-to-noise ratio (PSNR) of the reconstructed video improves by about 2 to 5 dB. If the parity data of the second enhancement layer are also transmitted, the PSNR also increases by 0.5 to 1.6 dB. If the parity data of all three enhancement layers are transmitted, the decoded video essentially reaches the quality of error-free key frames. When the loss ratio is relatively high, such as 20%, the video reconstructed by a typical error concealment method can hardly meet basic requirements. With the parity data of the first enhancement layer transmitted, however, the proposed scheme improves the PSNR by about 4.5 to 8.3 dB. If the parity data of the second enhancement layer are also transmitted, the PSNR also increases by 2.7 to 4.1 dB, and if the parity data of all three enhancement layers are transmitted, the PSNR also increases by 3.7 to 4.6 dB. In general, different reconstructed video qualities are obtained by transmitting different enhancement layers.
Conclusion Experimental results indicate that the proposed error protection scheme for key frames in wavelet-domain DVC improves the robustness of key frames. The proposed framework also improves the rate-distortion performance for different channel environments and requirements. However, the scheme relies on a feedback channel, which introduces some delay during decoding. Rate estimation at the encoder is therefore the next direction of research.

Key words

distributed video coding (DVC); key frame; wavelet domain; scalable; bit error channel

0 Introduction

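In recent years, large-scale wireless multimedia sensor networks, mobile phones, and embedded mobile devices have become increasingly common in daily life [1-3]. In these emerging applications, the encoding terminal is often limited in computing power and storage, which poses a major challenge for conventional video compression techniques [4]. Distributed video coding (DVC) [5] offers an effective solution for such applications: by encoding independently and decoding jointly, it shifts complexity from the encoder to the decoder, which gives it great potential in some emerging video services. In DVC, the original video sequence is alternately divided into Wyner-Ziv frames and key frames, and the key frames are encoded and decoded independently in the conventional intra-frame mode.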

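Because wireless channels are unstable, video transmission is easily disturbed by external environmental factors, which causes video data to be lost or corrupted and severely degrades the reconstruction quality. Researchers have studied these problems extensively. Among the error concealment algorithms that operate in the temporal and spatial domains, representative examples include bilinear interpolation [6] and the outer boundary matching algorithm [7]. Zhou et al. [8] proposed a temporal error concealment algorithm based on plane fitting to improve the reconstruction quality of corrupted images in H.264 coding. Gao et al. [9] proposed a block-merging algorithm to conceal lost data blocks in HEVC/H.265 (high efficiency video coding). Dissanayake et al. [10] applied Wyner-Ziv theory to achieve error-resilient transmission of H.264-coded video. Liu et al. [11] proposed a DCT-domain error-resilience scheme for key frames in Wyner-Ziv video coding. On the other hand, to cope with heterogeneous networks and diverse terminal devices, scalable video coding selectively transmits the bitstream according to changes in network bandwidth and therefore adapts well to the transmission characteristics of networked video. Fan et al. [12] proposed a scalable DVC architecture based on compressed sensing theory, and Ouaret et al. [13] analyzed and compared the robustness and rate-distortion performance of a DVC-based scalable architecture and conventional scalable video coding under different channel conditions.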

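However, in the literature surveyed, few works propose a scalable protection scheme for key frames that suffer errors in distributed video coding. In fact, whether a key frame is transmitted and decoded correctly affects not only its own reconstruction quality but also the performance of the whole system. Exploiting the properties of wavelet decomposition, this paper encodes each key frame at the encoder with both conventional intra-frame coding and wavelet-domain Wyner-Ziv coding. The corrupted key frame after error concealment is used as the base layer, and the low-frequency and high-frequency bands at the different levels of the wavelet decomposition are grouped into different enhancement layers. Data of different layers are transmitted according to the channel environment, which provides quality-scalable protection for corrupted key frames.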

1 Wavelet-domain quality-scalable error-resilience scheme for DVC key frames

1.1 Wavelet-domain quality-scalable scheme

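The wavelet transform starts from the original image and repeatedly decomposes the low-frequency subband of the previous level into four subbands, which contain the low-frequency information and the horizontal, vertical, and diagonal edge information of the previous-level image in the frequency domain [14]. The low-frequency subband concentrates the overall appearance of the image, the high-frequency subbands carry its details, and information in different directions matters differently to the human eye. Considering the bandwidth of different networks and the unequal importance of the low-frequency and high-frequency subbands, the proposed layering scheme is illustrated with a three-level wavelet decomposition. For a key frame that has lost data because of errors, the frame obtained after error concealment (EC) serves as both the base layer (BL) and the side information, and it undergoes the same wavelet transform as at the encoder. As shown in Fig. 1, the first enhancement layer (EL1) applies error-correcting decoding to the four bands $LL_3$, $HH_3$, $HH_2$, and $HH_1$ of the concealed key frame, and the remaining bands are filled directly with the corresponding parts of the side information. The second enhancement layer (EL2) builds on EL1 by also decoding the three bands $LH_3$, $LH_2$, and $LH_1$, and the remaining bands are again filled from the side information. The third enhancement layer (EL3) applies error-correcting decoding to all bands.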

Fig. 1 Scalable layering scheme of the wavelet frequency bands
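The band grouping of Section 1.1 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the Haar filter, the LH/HL naming convention, and the helper names `haar_step`, `decompose`, and `merge_layer` are all assumptions introduced here, since the paper does not state which wavelet it uses.

```python
import numpy as np

# Bands that receive Wyner-Ziv error-correcting decoding in each layer (Section 1.1).
EL1 = ["LL3", "HH3", "HH2", "HH1"]
EL2 = EL1 + ["LH3", "LH2", "LH1"]
EL3 = EL2 + ["HL3", "HL2", "HL1"]  # EL3 covers every band
LAYERS = {"EL1": EL1, "EL2": EL2, "EL3": EL3}


def haar_step(img):
    """One level of a 2-D Haar transform (assumed filter); returns LL, LH, HL, HH."""
    a, b = img[0::2, 0::2], img[0::2, 1::2]
    c, d = img[1::2, 0::2], img[1::2, 1::2]
    ll = (a + b + c + d) / 2.0
    lh = (a + b - c - d) / 2.0  # difference between rows (naming is an assumption)
    hl = (a - b + c - d) / 2.0  # difference between columns
    hh = (a - b - c + d) / 2.0  # diagonal detail
    return ll, lh, hl, hh


def decompose(frame, levels=3):
    """Return a dict of subbands, e.g. {'LL3': ..., 'LH1': ..., 'HH2': ...}."""
    bands = {}
    ll = frame.astype(np.float64)
    for k in range(1, levels + 1):
        ll, lh, hl, hh = haar_step(ll)
        bands[f"LH{k}"], bands[f"HL{k}"], bands[f"HH{k}"] = lh, hl, hh
    bands[f"LL{levels}"] = ll
    return bands


def merge_layer(decoded, side_info, layer):
    """Take decoded bands for the chosen layer; fill the rest from side information."""
    chosen = set(LAYERS[layer])
    return {name: (decoded[name] if name in chosen else side_info[name])
            for name in side_info}
```

For a CIF luminance frame (288 rows by 352 columns), `decompose` yields ten subbands, and `merge_layer(decoded, side_info, "EL1")` keeps only $LL_3$, $HH_3$, $HH_2$, and $HH_1$ from the decoded set.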

1.2 Scalable error-resilient protection scheme for key frames

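Building on the layering scheme in Fig. 1, the proposed scalable error-resilience scheme for key frames is shown in Fig. 2. At the encoder, each key frame is encoded with both conventional HEVC intra-frame coding and Wyner-Ziv coding. After Wyner-Ziv quantization and encoding, the information bits are discarded directly and the parity information is stored in a buffer.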

Fig. 2 Scalable error-resilient protection scheme for key frames

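At the decoder, the first step is to check whether the key frame contains errors. If it does not, it is reconstructed directly by conventional intra-frame decoding. Otherwise, the key frame with lost packets is error-concealed and used as side information. The Laplacian distribution parameter $\alpha $ is computed with the virtual noise model estimation proposed in this paper, the reliability probability of the side information is computed from $\alpha $, channel decoding is performed using these probabilities together with $\alpha $, and finally the frame is reconstructed from the side information and the decoded bitstream. Because key frames in DVC use the conventional independent intra-frame coding mode, bilinear interpolation, a typical spatial error concealment algorithm, is used to conceal errors in the corrupted key frame. The concealed image is then used directly as side information and undergoes the same wavelet transform as at the encoder. With the base layer transmitted first, some enhancement layers can be transmitted selectively, according to the characteristics of the wireless channel, to protect the corrupted key frame. When channel conditions are poor, only the parity information of EL1 is transmitted; if conditions improve, the parity information of EL2 is added; and if conditions permit, the parity information of all enhancement layers is transmitted.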
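As a rough illustration of the spatial concealment step, the sketch below fills one lost block by distance-weighted interpolation from the four nearest intact boundary pixels, which is the usual form of bilinear-interpolation concealment. The function name `conceal_block`, the square block shape, and the requirement that all four neighbouring boundaries are intact are assumptions made here; the paper does not give these details.

```python
import numpy as np


def conceal_block(frame, r0, c0, size):
    """Fill a lost size x size block at (r0, c0) from its boundary pixels.

    Assumes the pixels just above, below, left, and right of the block
    were received correctly and lie inside the frame.
    """
    out = frame.astype(np.float64).copy()
    top, bottom = r0 - 1, r0 + size   # rows of the nearest intact pixels
    left, right = c0 - 1, c0 + size   # columns of the nearest intact pixels
    for i in range(r0, r0 + size):
        for j in range(c0, c0 + size):
            d_t, d_b = i - top, bottom - i
            d_l, d_r = j - left, right - j
            # Each boundary pixel is weighted by the distance to the opposite
            # boundary, so nearer pixels contribute more.
            num = (d_b * out[top, j] + d_t * out[bottom, j]
                   + d_r * out[i, left] + d_l * out[i, right])
            out[i, j] = num / (d_t + d_b + d_l + d_r)
    return out
```

With this weighting, a block cut out of a planar intensity ramp is restored exactly, while textured regions are only approximated. That residual error is what the enhancement layers are meant to correct.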

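1.3 Estimation of the virtual channel model

In the Wyner-Ziv video coding system proposed by Girod et al., the correlation noise between the side information $Y$ and the original frame $X$ is assumed to approximately follow a Laplacian distribution [15]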

$ f\left( d \right) = \frac{\alpha }{2}{{\rm{e}}^{ - \alpha |d|}} $ (1)

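where $\alpha $ is the Laplacian parameter and $d$ is the residual between the original frame and the side information, that is, $d = X - Y$. The parameter $\alpha $ can be computed from the variance ${\sigma ^2}$ of the residual between $X$ and $Y$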

$ \alpha = \sqrt {\frac{2}{{{\sigma ^2}}}} $ (2)
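Equations (1) and (2) translate directly into code. The sketch below is illustrative only: the function names and the small floor `eps`, which guards against a zero variance, are assumptions added here and do not come from the paper.

```python
import numpy as np


def laplacian_alpha(x, y, eps=1e-12):
    """Estimate alpha from the residual d = x - y using Eq. (2)."""
    d = np.asarray(x, dtype=np.float64) - np.asarray(y, dtype=np.float64)
    var = np.mean(d ** 2) - np.mean(d) ** 2   # sigma^2 = E[d^2] - (E[d])^2
    return float(np.sqrt(2.0 / max(var, eps)))  # eps is an assumed guard


def laplacian_pdf(d, alpha):
    """Evaluate the Laplacian density of Eq. (1) at residual d."""
    return 0.5 * alpha * np.exp(-alpha * np.abs(d))
```

A larger $\alpha $ corresponds to a smaller residual variance, which means the side information is closer to the original and fewer parity bits should be needed.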

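Because the decoder cannot access the original information exactly, the usual practice is to build the virtual noise model from the forward reference frame and the side information. When a key frame contains errors and the channel bandwidth is limited, transmitting the parity information of all enhancement layers for error-correcting decoding is impractical. However, a frame reconstructed after only some enhancement layers are transmitted still differs from the original frame; if it is used as the forward reference frame to estimate the virtual noise model of the next frame, the estimate deviates considerably from the actual noise model and seriously degrades system performance. Relying solely on the conventional approach is therefore inadequate, and this paper improves the virtual noise estimation as follows.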

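First, for the four bands $LL_3$, $HH_3$, $HH_2$, and $HH_1$ of the first enhancement layer EL1 of the wavelet-decomposed frame, the residual and variance between each band of the forward reference frame $X_{\rm{ref}}$ and the corresponding band of the side information $Y$ are computed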

$ {d_{{\rm{E}}{{\rm{L}}_{\rm{1}}}}} = {X_{{\rm{ref}}}}-Y $ (3)

$ \sigma _{{\rm{E}}{{\rm{L}}_{\rm{1}}}}^2 = E\left[{d_{{\rm{E}}{{\rm{L}}_{\rm{1}}}}^2} \right] - {\left( {E\left[{{d_{{\rm{E}}{{\rm{L}}_1}}}} \right]} \right)^2} $ (4)

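The virtual noise estimates of the four bands in EL1 then follow from Eqs. (1) and (2). Let ${{X}^{'}}$ denote a band reconstructed by error-correcting decoding with the EL1 parity data. Under errors, the virtual noise distributions of the bands at the same decomposition level are similar, so the virtual noise models of the bands in the second and third enhancement layers, EL2 and EL3, can be obtained from the EL1 bands ${{X}^{'}}$ and the corresponding $Y$. The new residual distribution is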

$ {d'_{{\rm{E}}{{\rm{L}}_{\rm{1}}}}} = X'-Y $ (5)

$ \sigma _{{\rm{E}}{{\rm{L}}_{\rm{3}}}}^2 \approx \sigma _{{\rm{E}}{{\rm{L}}_{\rm{2}}}}^2 \approx E\left[ {{d'}_{{\rm{E}}{{\rm{L}}_{\rm{1}}}}^{2}} \right] - {\left( {E\left[ {d'_{{\rm{E}}{{\rm{L}}_{\rm{1}}}}} \right]} \right)^2} $ (6)

Combining this with Eqs. (1) and (2) gives a noise estimate closer to the actual one.
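The two-stage estimate of Eqs. (3) to (6) can be sketched as follows. The text states only that bands at the same decomposition level have similar noise, so the choice made here, reusing the variance of the decoded $HH_k$ band for the other level-$k$ bands, is an assumption, as are the function names and the dict-of-arrays layout.

```python
import numpy as np

EL1_BANDS = ["LL3", "HH3", "HH2", "HH1"]


def band_variance(a, b):
    """Variance of the residual a - b, as in Eqs. (4) and (6)."""
    d = np.asarray(a, dtype=np.float64) - np.asarray(b, dtype=np.float64)
    return float(np.mean(d ** 2) - np.mean(d) ** 2)


def alpha_from_variance(var, eps=1e-12):
    """Eq. (2); eps is an assumed guard against a zero variance."""
    return float(np.sqrt(2.0 / max(var, eps)))


def estimate_alphas(x_ref, side_info, decoded_el1):
    """Return one Laplacian parameter per subband name.

    x_ref, side_info: dicts of subbands of the forward reference frame and
    the side information. decoded_el1: dict holding the EL1 bands X' after
    error-correcting decoding.
    """
    alphas = {}
    # EL1 bands: residual between forward reference frame and side information.
    for name in EL1_BANDS:
        var = band_variance(x_ref[name], side_info[name])
        alphas[name] = alpha_from_variance(var)
    # EL2/EL3 bands: reuse the residual of a decoded EL1 band at the same level.
    for name in side_info:
        if name in alphas:
            continue
        source = "HH" + name[-1]  # assumed pairing: level-k band -> HH_k
        var = band_variance(decoded_el1[source], side_info[source])
        alphas[name] = alpha_from_variance(var)
    return alphas
```

The point of the second loop is that $X'$ has already been corrected with parity data, so its residual against $Y$ reflects the actual concealment error more closely than a possibly degraded forward reference frame does.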

2 Experiments and analysis

2.1 Experimental settings

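The simulations test the first 100 frames of three representative video sequences with different motion characteristics, foreman, bus, and coastguard, under different error rates. The frames are in CIF format (352×288), the Y:U:V format is 4:0:0, only the luminance component is tested, and the frame rate is 15 frames/s. Key frames are encoded and decoded with conventional H.265/HEVC intra-frame coding. In the proposed scheme, each key frame is quantized and encoded twice at the encoder: the quantization step used in conventional HEVC coding is denoted $QP$, and the quantization step used in Wyner-Ziv coding is denoted $q$. Extensive simulations show that the pairing of $QP$ and $q$ values in Table 1 gives good rate-distortion performance.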

Table 1 Corresponding values of $QP$ and $q$

Sequence      $QP$=22    $QP$=24    $QP$=26    $QP$=28
foreman       $q$=7      $q$=9      $q$=12     $q$=15
bus           $q$=7      $q$=9      $q$=12     $q$=15
coastguard    $q$=7      $q$=9      $q$=12     $q$=15

2.2 Analysis of experimental results

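To demonstrate the effectiveness of the proposed scheme under different packet loss rates, Fig. 3 compares, for $QP$ = 26 and $q$ = 12, the average peak signal-to-noise ratio (PSNR) of the reconstructed video for the three test sequences when different enhancement layers are transmitted for error-correcting decoding at different packet loss rates (PLR). As the packet loss rate increases, the PSNR of video reconstructed by error concealment alone drops more and more sharply. Yet even at a packet loss rate as high as 20%, the proposed scheme raises the PSNR of the reconstructed images to varying degrees, depending on the layers transmitted.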

Fig. 3 Reconstructed video quality at different packet loss rates ((a) foreman; (b) bus; (c) coastguard)
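For reference, the average PSNR plotted in such comparisons is normally computed per frame from the mean squared error and then averaged over the sequence. The sketch below uses the standard definition for 8-bit luminance (peak value 255); the function names are hypothetical, and the paper does not describe its own measurement code.

```python
import numpy as np


def psnr(reference, reconstructed, peak=255.0):
    """PSNR in dB between two frames; infinite when the frames are identical."""
    ref = np.asarray(reference, dtype=np.float64)
    rec = np.asarray(reconstructed, dtype=np.float64)
    mse = np.mean((ref - rec) ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(peak ** 2 / mse))


def average_psnr(reference_frames, reconstructed_frames, peak=255.0):
    """Mean of the per-frame PSNR values over a sequence."""
    values = [psnr(r, x, peak) for r, x in zip(reference_frames, reconstructed_frames)]
    return float(np.mean(values))
```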

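Fig. 4 shows the rate-distortion curves of the reconstructed video for the three sequences at a random packet loss rate of 5%. Compared with conventional error concealment, error-correcting reconstruction with only the parity information of EL1 clearly improves the image quality, especially when the quantization step is large, with PSNR gains of about 2 to 5 dB depending on the sequence. If the parity information of EL2 is transmitted in addition to EL1, the PSNR also rises by about 0.5 to 1.6 dB, and if the parity information of all three enhancement layers is transmitted, the PSNR essentially reaches that of error-free key frames. Naturally, the bit rate also increases as more enhancement layers are transmitted.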

Fig. 4 R-D performance of the three video sequences ((a) foreman; (b) bus; (c) coastguard)

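To further demonstrate the feasibility of the proposed scheme, the subjective quality of key-frame images reconstructed under errors with different enhancement layers transmitted is compared. Frame 16 of the foreman sequence is taken as an example. As shown in Fig. 5, the four images are, respectively, the image obtained by error concealment under errors, the image reconstructed with error-correction protection of EL1 only, the image reconstructed with EL2 also protected, and the image reconstructed with all three enhancement layers EL1, EL2, and EL3 protected. The subjective reconstruction quality improves as the number of enhancement layers increases, so scalable protection can be achieved under different channel conditions.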

Fig. 5 Subjective quality comparison for the foreman sequence
((a) error concealment; (b) enhancement layer 1; (c) enhancement layer 2; (d) enhancement layer 3)

3 Conclusion

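To address the errors that key frames in distributed video coding may suffer over wireless heterogeneous network channels, this paper proposes a wavelet-domain quality-scalable protection scheme and improves the virtual noise model of corrupted key frames. Experimental results show that the method can meet the requirements of different channel environments. Because the parity information is transmitted over a feedback channel, some delay is introduced, so the next step is to incorporate rate estimation into the proposed framework.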

References

  • [1] Rawat P, Singh K D, Chaouchi H, et al. Wireless sensor networks:a survey on recent developments and potential synergies[J]. The Journal of Supercomputing, 2014, 68(1): 1–48. [DOI:10.1007/s11227-013-1021-9]
  • [2] Mumtaz S, Saidul Huq K M, Rodriguez J. Direct mobile-to-mobile communication:paradigm for 5G[J]. IEEE Wireless Communications, 2014, 21(5): 14–23. [DOI:10.1109/MWC.2014.6940429]
  • [3] Imran N, Seet B C, Fong A C M. Distributed video coding for wireless video sensor networks:a review of the state-of-the-art architectures[J]. SpringerPlus, 2015, 4: #513. [DOI:10.1186/s40064-015-1300-4]
  • [4] Sullivan G J, Ohm J R, Han W J, et al. Overview of the High Efficiency Video Coding (HEVC) Standard[J]. IEEE Transactions on Circuits & Systems for Video Technology, 2012, 22(12): 1649–1668.
  • [5] Vijayanagar K R, Kim J, Lee Y, et al. Low complexity distributed video coding[J]. Journal of Visual Communication and Image Representation, 2014, 25(2): 361–372. [DOI:10.1016/j.jvcir.2013.12.006]
  • [6] Cao J H, Li F T. Error concealment techniques in MPEG-2 video decoders[J]. Journal of Tsinghua University:Science and Technology, 2004, 44(7): 921–924. [曹继华, 李凤亭. MPEG-2视频解码器中的错误隐藏技术[J]. 清华大学学报自然科学版, 2004, 44(7): 921–924. ] [DOI:10.3321/j.issn:1000-0054.2004.07.016]
  • [7] Thaipanich T, Wu P H, Kuo C J. Video error concealment with outer and inner boundary matching algorithms[C]//Proceedings of the SPIE 6696, Applications of Digital Image Processing XXX. San Diego, CA:SPIE, 2007:66-96.[DOI:10.1117/12.735998]
  • [8] Zhou Q Y, Yang G B, Liu Z C, et al. Temporal error concealment algorithm for H.264/AVC video stream[J]. Journal of Image and Graphics, 2010, 15(9): 1338–1344. [周启亚, 杨高波, 刘志成, 等. 针对H.264/AVC的时域错误隐藏算法[J]. 中国图象图形学报, 2010, 15(9): 1338–1344. ] [DOI:10.11834/jig.20100908]
  • [9] Gao W H, Zhang Y Y, Wang H D. Error concealment for high efficiency video coding based on block-merging[J]. Journal of Computer Applications, 2015, 35(6): 1744–1748. [高文华, 张义云, 王海东. 高效率视频编码中基于块整合的错误隐藏算法[J]. 计算机应用, 2015, 35(6): 1744–1748. ] [DOI:10.11772/j.issn.1001-9081.2015.06.1744]
  • [10] Dissanayake M B, Worrall S, Fernando W A C. Wyner-Ziv based error correction of non-key frames for low complexity streaming applications[C]//Proceedings of the 6th IEEE International Conference on Industrial and Information Systems. Kandy:IEEE, 2011:467-471.[DOI:10.1109/ICIINFS.2011.6038115]
  • [11] Liu X J, Qing L B, Xiong S H, et al. A novel error resilience scheme for key frames in Wyner-Ziv video coding[J]. Journal of Sichuan University:Natural Science Edition, 2016, 53(1): 98–104. [刘晓娟, 卿粼波, 熊淑华, 等. 一种新的Wyner-Ziv视频编码关键帧抗误码方案研究[J]. 四川大学学报:自然科学版, 2016, 53(1): 98–104. ]
  • [12] Fan N F, Zhu X Q, Liu Y, et al. Scalable distributed video coding using compressed sensing in wavelet domain[C]//Proceedings of the 78th IEEE Vehicular Technology Conference. Las Vegas, NV:IEEE, 2013:1-5.[DOI:10.1109/VTCFall.2013.6692400]
  • [13] Ouaret M, Dufaux F, Ebrahimi T. Error-resilient scalable compression based on distributed video coding[J]. Signal Processing:Image Communication, 2009, 24(6): 437–451. [DOI:10.1016/j.image.2009.02.011]
  • [14] Bernardini R, Rinaldo R, Zontone P, et al. Wavelet domain distributed coding for video[C]//Proceedings of 2006 IEEE International Conference on Image Processing. Atlanta, GA:IEEE, 2006:245-248.[DOI:10.1109/ICIP.2006.313171]
  • [15] Aaron A, Rane S D, Setton E, et al. Transform-domain Wyner-Ziv codec for video[C]//Proceedings of SPIE 5308, Visual Communications and Image Processing. San Jose, CA:SPIE, 2004:520-528.[DOI:10.1117/12.527204]