刘宇洋,朱策,郭红伟(电子科技大学, 成都 611731)
Survey of light field data compression
Liu Yuyang, Zhu Ce, Guo Hongwei (University of Electronic Science and Technology of China, Chengdu 611731, China)
Light field imaging is an attractive technique for 3D visualization, especially in virtual and augmented reality applications. It has also been applied to computer vision tasks such as depth estimation, 3D reconstruction, and object detection. However, the large volume of light field data puts great pressure on cost-effective storage and transmission. The data format of light fields also differs considerably from that of conventional images or videos, so coding tools designed for traditional images or videos compress light field data inefficiently. Dedicated light field compression methods must therefore be developed to reduce storage cost and transmission bandwidth, and various such methods have been proposed in recent years. This study surveys related work on light field compression to provide a foundation for researchers entering the field. First, this study briefly introduces the fundamentals of light fields and the four types of light field-capturing devices. The advantages and drawbacks of each type of capturing device are presented, and the influence of the capturing device on the light field data format is described. Second, this work discusses the recent advances in JPEG Pleno, a standard framework for representing and signaling plenoptic modalities. JPEG Pleno was started in 2015 by the Joint Photographic Experts Group Committee. The term “pleno” is an abbreviation of “plenoptic,” which refers to a mathematical formulation that represents the information of a beam of light passing through an arbitrary point within a scene. JPEG Pleno proposes a light field-coding framework for the light field data acquired by a plenoptic camera or a high-density array of cameras.
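For reference, the plenoptic formulation mentioned above is commonly written as a seven-dimensional function, and a light field is usually its four-dimensional restriction under a two-plane parameterization. The symbols below follow common usage in the literature and are an assumption here; the notation is not taken from this survey.

```latex
% Plenoptic function: radiance of the ray passing through the point (x, y, z)
% in direction (\theta, \phi), at wavelength \lambda and time t.
\[
  P = P(x, y, z, \theta, \phi, \lambda, t)
\]
% Light field: for a static scene in free space, each ray is indexed by its
% intersections (u, v) and (s, t) with two parallel planes.
\[
  L = L(u, v, s, t)
\]
```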
The JPEG Pleno light field encoder consists of three parts, each of which is described in detail. Lastly, on the basis of an extensive literature review, existing light field compression methods are divided into three categories according to the characteristics of the coding algorithms, namely, transform, pseudo-sequence-based, and predictive coding approaches. We analyze and discuss the coding methods in each category. Transform coding approaches perform worse than the other two categories because they do not contain a prediction process. Although several transform methods achieve good energy compaction, their decorrelation efficiency is lower than that of a hybrid coding framework that combines prediction and transformation. Pseudo-sequence-based coding approaches convert the correlation in the spatial or view domain into the temporal domain. The temporal correlation can then be removed by inter-prediction with a well-developed video encoder, such as an HEVC (high efficiency video coding) codec. However, their coding performance leaves room for improvement because the video encoder does not use the disparity information. Predictive coding approaches can be further divided into two groups: self-similarity-based coding methods, which were proposed in the last two years, and disparity prediction-based coding approaches. Self-similarity-based coding methods directly encode light field images by applying template-matching-based coding methods, but their coding performance is inferior to that of disparity prediction-based coding approaches. The latter achieve the best coding performance among the surveyed coding methods, and JPEG Pleno applies this type of method to encode light field data.
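The pseudo-sequence idea described above can be illustrated with a minimal sketch that orders a grid of views into a frame list for a video encoder. The serpentine scan and the names `serpentine_order` and `to_pseudo_sequence` are assumptions chosen for illustration; the surveyed methods use a variety of scan orders, and this sketch does not reproduce any particular one.

```python
from typing import List, Sequence, Tuple, TypeVar

View = TypeVar("View")


def serpentine_order(rows: int, cols: int) -> List[Tuple[int, int]]:
    """Return (row, col) view indices in a serpentine (boustrophedon) scan.

    Even rows run left to right and odd rows right to left, so consecutive
    entries are always adjacent views in the grid.
    """
    order: List[Tuple[int, int]] = []
    for r in range(rows):
        cs = range(cols) if r % 2 == 0 else range(cols - 1, -1, -1)
        order.extend((r, c) for c in cs)
    return order


def to_pseudo_sequence(views: Sequence[Sequence[View]]) -> List[View]:
    """Flatten a 2D grid of views into a 1D list of pseudo video frames."""
    rows = len(views)
    cols = len(views[0]) if rows else 0
    return [views[r][c] for r, c in serpentine_order(rows, cols)]
```

The resulting list could then be passed, frame by frame, to a standard video encoder, whose inter-prediction treats the differences between neighboring views as if they were motion between consecutive frames.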
The advantages and shortcomings of existing light field-coding methods are summarized on the basis of the preceding analysis, and promising directions for future research are suggested. First, light field video data sets for exploring light field video coding are lacking. Second, the JPEG Pleno light field coding framework should be studied, and coding methods should be developed on the basis of this framework. Lastly, several coding tools, such as depth estimation and view synthesis, should be improved. Light field compression is a popular research topic, and related research achievements, including standardization advances on JPEG Pleno, will attract increasing attention. Efficient compression of light field data remains a great challenge: although many compression approaches are available, their coding performance still needs to be improved.