Multi-mode shape coding for 3D video (多模式3维视频形状编码)
- 2018, Vol. 23, No. 7, pp. 953-960
Received: 2017-10-17
Revised: 2018-01-26
Published in print: 2018-07-16
DOI: 10.11834/jig.170533
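Objective
3D video, with its stereoscopic depth and high degree of realism, is attracting growing attention from both academia and industry, and it has broad application prospects in 3D film and television, machine vision, telemedicine, military, and aerospace fields. Object-based 3D video is an important development trend for future 3D video technology, and efficient shape coding is a key problem in object-based 3D video applications. However, existing shape coding methods mainly target image and video objects, and few shape coding algorithms are designed for 3D video. To address this, and based on the application requirements of object-based 3D video, an efficient multi-mode 3D video shape coding method based on contour and chain code representation is proposed.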
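Method
For a given 3D video shape sequence, object contours are extracted frame by frame and preprocessed, and an activity analysis of the object contours then divides the shape images into intra-mode coded images and inter-prediction-mode coded images. Intra-coded images are encoded efficiently on the basis of the direction constraints and linear features of the chain code within the contour. Inter-coded images are encoded using multiple chain-code-based modes, including contour-based motion-compensated prediction, disparity-compensated prediction, and joint motion- and disparity-compensated prediction, to fully exploit the inter-frame temporal correlation of object contours within a view and the spatial correlation of object contours across views, thereby achieving efficient coding.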
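The chain code representation that the method builds on can be illustrated with a generic Freeman 8-direction chain code. The sketch below is not the authors' coder: the direction numbering, the function names (`chain_code`, `differential_code`, `decode`), and the sample contour are assumptions made for illustration. It shows how an ordered, 8-connected contour becomes a sequence of direction symbols, and how the turn between successive links can be expressed as a small relative value, which is the kind of regularity that direction-constrained coding can take advantage of.

```python
# Freeman 8-direction chain code (illustrative numbering, y axis pointing up).
DIRECTIONS = [(1, 0), (1, 1), (0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1)]
STEP_TO_CODE = {step: code for code, step in enumerate(DIRECTIONS)}


def chain_code(points):
    """Convert an ordered list of 8-connected contour points to direction codes."""
    codes = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        step = (x1 - x0, y1 - y0)
        if step not in STEP_TO_CODE:
            raise ValueError(f"{(x0, y0)} and {(x1, y1)} are not 8-connected neighbors")
        codes.append(STEP_TO_CODE[step])
    return codes


def differential_code(codes):
    """Turn between successive links, mapped to the range -3..4."""
    diffs = []
    for prev, cur in zip(codes, codes[1:]):
        d = (cur - prev) % 8
        diffs.append(d - 8 if d > 4 else d)
    return diffs


def decode(start, codes):
    """Rebuild the contour points from a start point and its chain code."""
    points = [start]
    for c in codes:
        dx, dy = DIRECTIONS[c]
        x, y = points[-1]
        points.append((x + dx, y + dy))
    return points


# Hypothetical contour segment used only for this example.
contour = [(0, 0), (1, 0), (2, 0), (3, 1), (4, 2), (4, 3), (3, 4)]
codes = chain_code(contour)
print(codes)                     # direction symbol for each link
print(differential_code(codes))  # relative turns between links
```

Because the start point plus the code sequence reconstructs the contour exactly, the representation is lossless; any compression gain comes from how the symbol sequence is subsequently modeled and entropy coded.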
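Result
Experimental simulation results show that the proposed algorithm outperforms classic and state-of-the-art methods of the same kind, improving compression efficiency by 9.3% to 64.8% on average.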
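Conclusion
The proposed multi-mode 3D video shape coding method effectively removes the inter-frame and inter-view redundancy of object contours, achieves efficient compression, outperforms existing methods of the same kind, and can be widely applied to object-based coding, object-based retrieval, and object-based content analysis and understanding.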
Objective
Three-dimensional (3D) video has attracted considerable attention from the image processing community because of its satisfactory performance in various applications, including 3D television, free-view television, free-view video, and immersive teleconferencing. Compared with traditional block-based techniques, object-based methods offer flexible interactivity and efficient resource usage and are thus favored in many practical applications. Hence, object-based 3D video, for which efficient shape coding is a key technique, is an important development trend. Shape coding has been studied extensively, and many methods have been proposed. However, most existing methods are designed for image or video shape coding, and few target 3D video shape coding. A straightforward approach is to encode the shapes of 3D video objects with the same techniques used for image or video objects. However, this approach does not fully exploit the redundancy among 3D shape videos and therefore yields poor coding efficiency. Strong inter-frame redundancy exists across both the time and view directions in a 3D video sequence. Therefore, most existing 3D video coding schemes jointly adopt motion-compensated prediction (MCP) and disparity-compensated prediction (DCP) to achieve high coding efficiency. Because 3D video and 3D shape video share certain similarities, correlations may also exist among object contours across the time and view directions, and these may be exploited in shape coding to improve coding efficiency. On the basis of this hypothesis and the requirements of practical object-based 3D video applications, an efficient multi-mode 3D video shape coding scheme is proposed in this study. The scheme is based on contour and chain code representation and exploits the correlation among object contours across the time and view directions to achieve high coding efficiency.
Method
For a given 3D shape video sequence, the contours of visual objects are first extracted and preprocessed frame by frame so that they have perfect single-pixel width; that is, the object contours are 8-connected, and only one path exists between any two neighboring contour points. A new metric called shape activity is then applied to assess the shape variation of objects within each frame. On the basis of this assessment, all frames are classified into two categories: intra-coding frames and predictive inter-coding frames. If the shape activity within a frame is large, then intra-coding is used; otherwise, predictive inter-coding is used. An intra-coding frame is encoded on the basis of the linearity and direction constraints within chain links to achieve high coding efficiency. An inter-coding frame is encoded using one of three coding modes, namely contour-based MCP, DCP, or joint MCP and DCP, to exploit the intra-view temporal correlation and the inter-view spatial correlation among object contours and thereby improve coding efficiency. The principles of MCP and DCP for 3D shape video are similar to those for 3D video. However, the correlation among object contours differs from that among video textures. In 3D video, textures are two dimensional, whereas an object contour is one dimensional. Video textures can generally be viewed as rigid, whereas the shape of an object contour often changes irregularly. A small variation of an object contour in a frame may considerably decrease the correlation between consecutive frames. In addition, contour correlation may decrease more quickly than texture correlation as the time interval increases. Hence, conventional prediction techniques for 3D video are unsuitable for 3D shape video. In our coding scheme, a new prediction structure is developed to effectively exploit the intra-view temporal correlation and the inter-view spatial correlation among object contours and to encode 3D shape video efficiently.
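The frame classification step can be sketched as follows. The abstract does not define the shape activity metric, so the measure used here (the share of contour pixels that differ between a frame and its predecessor), the threshold value, and the names `shape_activity` and `classify_frames` are all assumptions for illustration only, not the authors' definitions. The sketch covers a single view: frames whose contour changes strongly are marked for intra coding, and the rest for predictive inter coding.

```python
def shape_activity(prev_contour, cur_contour):
    """Hypothetical metric: fraction of contour pixels that changed (0.0 to 1.0)."""
    prev, cur = set(prev_contour), set(cur_contour)
    union = prev | cur
    if not union:
        return 0.0
    # Symmetric difference = pixels present in only one of the two contours.
    return len(prev ^ cur) / len(union)


def classify_frames(contours, threshold=0.5):
    """Label each frame 'intra' or 'inter'.

    The first frame has no temporal reference, so it is always 'intra'.
    The threshold is an assumed value, not taken from the paper.
    """
    if not contours:
        return []
    modes = ["intra"]
    for prev, cur in zip(contours, contours[1:]):
        activity = shape_activity(prev, cur)
        modes.append("intra" if activity > threshold else "inter")
    return modes


# Hypothetical contours for three consecutive frames of one view.
frame0 = [(0, 0), (1, 0), (2, 0), (3, 0)]
frame1 = [(0, 0), (1, 0), (2, 0), (3, 1)]  # small change from frame0
frame2 = [(5, 5), (6, 5), (7, 5), (8, 5)]  # large change from frame1

print(classify_frames([frame0, frame1, frame2]))
```

In a multi-view setting, an inter-coded frame would additionally need a choice among a temporal reference (MCP), an inter-view reference (DCP), or both; that decision depends on the prediction structure described in the paper and is not modeled here.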
Result
Experiments are conducted to evaluate the performance of the proposed scheme, and partial comparison results with several well-known methods are presented. The experimental results show that our scheme outperforms classic and state-of-the-art methods, improving the average compression efficiency by 9.3% to 64.8%.
Conclusion
The proposed scheme can effectively remove the intra-view temporal redundancy and the inter-view spatial redundancy among object contours. It achieves higher coding efficiency than existing methods and has potential in many object-based image and video applications, such as object-based coding, editing, and retrieval.