
Liu Yuyang, Zhu Ce, Guo Hongwei (University of Electronic Science and Technology of China, Chengdu 611731)

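Abstract
Light field data are large in volume, which makes storage and compression very difficult. Because the light field data format differs from that of conventional images and videos, existing image and video coding tools cannot compress light field data efficiently. Research on efficient light field compression is therefore important for reducing storage consumption and transmission bandwidth. Research on light field compression is deepening, and the proposed methods are becoming increasingly diverse. This paper systematically reviews the current state of light field compression to provide a foundation for later researchers. It briefly introduces the basic theory of the light field and four types of light field acquisition devices, analyzes the advantages and disadvantages of the four types, and explains how the acquisition method affects the light field data format. It then presents the latest progress of the international standards body Joint Photographic Experts Group (JPEG) in standardizing light field compression and describes each module of the JPEG Pleno light field encoder in detail. On the basis of an extensive literature survey, light field compression algorithms are divided into three categories: transform-based, pseudo-video-sequence-based, and prediction-based compression methods. Each category is reviewed and summarized in detail, and the categories are compared. Through this systematic review, the paper distills the recent progress and remaining problems in light field compression and gives an outlook on future research trends. Efficient light field compression is highly challenging; although research has advanced rapidly in recent years, compression performance still needs further improvement.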
Survey of light field data compression

Liu Yuyang,Zhu Ce,Guo Hongwei(University of Electronic Science and Technology of China, Chengdu 611731, China)

Light field imaging is an attractive technique for 3D visualization, especially in virtual and augmented reality applications. It has also been applied to computer vision tasks such as depth estimation, 3D reconstruction, and object detection. However, the large volume of light field data puts great pressure on cost-effective storage and transmission. The light field data format also differs from that of conventional images and videos, so current coding tools designed for traditional images or videos compress light field data inefficiently. Dedicated light field compression methods are therefore needed to reduce storage cost and transmission bandwidth. As the field has advanced, various light field compression methods have been proposed. This study surveys related work on light field compression to provide a foundation for researchers who will focus on this topic. First, it briefly introduces the fundamentals of the light field and four types of light field capturing devices, presents the advantages and drawbacks of each type, and describes how the capturing device influences the light field data format. Second, it discusses recent advances in JPEG Pleno, a standard framework for representing and signaling plenoptic modalities, which was started in 2015 by the Joint Photographic Experts Group Committee. The term “pleno” is an abbreviation of “plenoptic,” which refers to a mathematical formulation that represents the information of a beam of light passing through an arbitrary point within a scene. JPEG Pleno proposes a light field coding framework for light field data acquired by a plenoptic camera or a high-density array of cameras.
The JPEG Pleno light field encoder consists of three parts, each of which is described in detail. Lastly, on the basis of an extensive literature review, existing light field compression methods are divided into three categories according to the characteristics of the coding algorithms: transform-based, pseudo-sequence-based, and predictive coding approaches. We analyze and discuss the coding methods in each category. Transform coding approaches do not outperform the other two categories because they contain no prediction process. Although several transform methods achieve good energy compaction, their decorrelation efficiency is lower than that of the hybrid coding framework, which combines prediction and transformation. Pseudo-sequence-based coding approaches convert the correlation in the spatial or view domain into the temporal domain, where it can be removed by inter-prediction techniques with a well-developed video encoder such as an HEVC (high efficiency video coding) codec. Their coding performance still has room for improvement because the video encoder does not use disparity information. Predictive coding approaches can be further divided into two groups: self-similarity-based coding methods, which were proposed in the last two years, and disparity-prediction-based coding approaches. Self-similarity-based coding methods encode light field images directly by applying template-matching-based coding. However, their coding performance falls short of that of disparity-prediction-based coding approaches, which achieve the best coding performance among the methods surveyed. JPEG Pleno applies such a method to encode light field data.
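As a minimal illustration of the pseudo-sequence idea, the sketch below flattens a grid of light field views into an ordered list of frames, so that neighboring views become consecutive "temporal" frames for a video encoder. The serpentine scan order and the function names `serpentine_order` and `to_pseudo_sequence` are assumptions made for this example; the surveyed methods may use other scan orders, and the video encoding step itself is not shown.

```python
from typing import List, Sequence, Tuple, TypeVar

View = TypeVar("View")


def serpentine_order(rows: int, cols: int) -> List[Tuple[int, int]]:
    """Return (row, col) view coordinates in serpentine scan order.

    Even rows are scanned left to right and odd rows right to left,
    so every pair of consecutive frames comes from adjacent views.
    """
    order: List[Tuple[int, int]] = []
    for r in range(rows):
        col_range = range(cols) if r % 2 == 0 else range(cols - 1, -1, -1)
        order.extend((r, c) for c in col_range)
    return order


def to_pseudo_sequence(views: Sequence[Sequence[View]]) -> List[View]:
    """Flatten a 2D grid of views (views[r][c]) into a pseudo video sequence."""
    rows, cols = len(views), len(views[0])
    return [views[r][c] for r, c in serpentine_order(rows, cols)]


if __name__ == "__main__":
    # Hypothetical 2 x 3 grid; strings stand in for view images.
    grid = [["v00", "v01", "v02"],
            ["v10", "v11", "v12"]]
    print(to_pseudo_sequence(grid))
```

Because consecutive frames in this order always come from adjacent views, the inter-view correlation appears to the encoder as frame-to-frame correlation, which is the property that inter-prediction exploits.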
On the basis of the preceding analysis, the advantages and shortcomings of existing light field coding methods are elucidated, and promising directions for future research are suggested. First, light field video data sets for exploring light field video coding are lacking. Second, the JPEG Pleno light field coding framework should be studied, and coding methods should be developed on the basis of this framework. Lastly, a few coding tools, such as depth estimation and view synthesis, should be improved. Light field compression is a popular research topic, and related research achievements, including standardization advances in JPEG Pleno, will attract increasing attention. Efficient compression of light field data remains a great challenge. Although many compression approaches are available for light field data, coding performance still needs to be improved.