Light field image refocusing based on conditional generative adversarial networks
2022, Vol. 27, No. 4, Pages: 1056-1065
Received: 2020-08-26; Revised: 2021-01-28; Accepted: 2021-02-04; Published in print: 2022-04-16
DOI: 10.11834/jig.200471
Objective
Traditional refocusing algorithms based on sub-aperture view superposition suffer from severe aliasing, while refocusing methods based on light field reconstruction are computationally too expensive and hard to improve further. To address this, this paper designs and implements a novel and efficient end-to-end light field image refocusing algorithm based on a conditional generative adversarial network.
Method
First, a disparity map is computed from the input light field image, and the required circle of confusion (COC) map is derived from the disparity map. The central sub-aperture view of the light field is then defocus-rendered according to the COC map, finally producing a refocused image whose focal plane and depth of field correspond to the COC map.
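The abstract does not spell out the disparity-to-COC mapping; the sketch below assumes a simple linear, thin-lens-style model in which a hypothetical `aperture` parameter scales the blur radius with the disparity offset from the focal plane and a hypothetical `dof` parameter widens the in-focus band:

```python
import numpy as np

def coc_map(disparity, focus_disparity, aperture=8.0, dof=0.02, max_coc=20.0):
    """Per-pixel COC radius (in pixels) from a disparity map.

    Hypothetical linear model: blur grows with the disparity offset from
    the focal plane; offsets within `dof` stay perfectly in focus.
    """
    offset = np.abs(disparity - focus_disparity)
    coc = aperture * np.maximum(offset - dof, 0.0)
    return np.clip(coc, 0.0, max_coc)
```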
Result
The proposed algorithm is evaluated against related algorithms on our synthetic dataset and a real-world dataset, demonstrating that it can generate high-quality refocused images. Quantitative analysis using peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) shows that the proposed algorithm improves the average PSNR by 1.82 dB and the average SSIM by 0.02 over traditional refocusing algorithms, and improves the average PSNR by 7.92 dB and the average SSIM by 0.08 over the algorithm that also uses the COC map but relies on anisotropic filtering.
Conclusion
The proposed algorithm can generate a disparity map from the input light field image according to the required focal plane and depth of field, and from it compute the corresponding COC map. The proposed conditional generative adversarial network can then defocus-render the input central sub-aperture view according to different COC maps, yielding the corresponding refocused images. Compared with previous algorithms, it eliminates aliasing, improves the defocus effect, and significantly reduces the computational cost.
Objective
Light field images contain rich spatial and angular information and are widely used in computer vision applications. Refocusing, which adjusts the focal plane and depth of field of an image, can significantly improve its visual effect. Current methods can be divided into two categories. The first category increases the angular resolution of a light field image via light field reconstruction, since the aliasing in superposition-based refocusing is caused by the disparity among the sub-aperture views. These methods require high computational costs and may introduce color errors or other artifacts; in addition, they can only improve the refocusing quality under the original focal plane and depth of field. The second category applies various filters guided by the circle of confusion (COC) map to defocus-render the central sub-aperture view and produce a bokeh effect. Only a rough defocus effect can be obtained this way, but the computational cost is low and both the focal plane and the depth of field can be adjusted. Deep convolutional neural networks (DCNNs) have shown clear advantages in bokeh rendering. To this end, we propose a novel conditional generative adversarial network (conditional GAN) for refocusing via bokeh rendering.
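For context, a filtering-based baseline of this second category can be sketched as follows. This is a minimal isotropic approximation (the baseline compared later uses anisotropic filtering); the blur-stack size `n_levels` and the Gaussian kernels are our assumptions:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def filter_based_refocus(image, coc, n_levels=8):
    """Blend a stack of blurred copies of the central view, picking
    per pixel the blur level closest to its COC radius."""
    radii = np.linspace(0.0, coc.max(), n_levels)
    stack = [gaussian_filter(image.astype(np.float64), sigma=(r, r, 0))
             for r in radii]
    level = np.argmin(np.abs(coc[..., None] - radii), axis=-1)  # (H, W)
    out = np.zeros_like(image, dtype=np.float64)
    for i in range(n_levels):
        out = np.where((level == i)[..., None], stack[i], out)
    return out
```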
Method
Our method takes a light field image as input and consists of three stages. First, it estimates a disparity map for the input light field image and computes from it COC maps with different focal planes and depths of field. The obtained COC map and the central sub-aperture view of the light field image are fed into the generator of the conditional GAN. Next, the generator processes the two inputs with two four-layer encoders, and the features extracted by the two encoders are fused and passed through four consecutive residual modules. Finally, the generated refocused image is fed into the discriminator, which judges whether the refocused image corresponds to the COC map.
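A minimal Keras sketch of a generator/discriminator pair matching this description (two four-layer encoders, four consecutive residual modules) is shown below. The channel widths, the fusion operator, and the symmetric decoder are our assumptions, not the paper's exact configuration:

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

def encoder(x, name):
    # Four-layer strided-convolution encoder (channel widths assumed).
    for i, ch in enumerate([32, 64, 128, 256]):
        x = layers.Conv2D(ch, 3, strides=2, padding="same",
                          name=f"{name}_conv{i}")(x)
        x = layers.LeakyReLU(0.2)(x)
    return x

def residual_block(x, ch=512):
    y = layers.Conv2D(ch, 3, padding="same", activation="relu")(x)
    y = layers.Conv2D(ch, 3, padding="same")(y)
    return layers.Add()([x, y])

def build_generator():
    view = layers.Input((512, 512, 3), name="center_view")  # central sub-aperture view
    coc = layers.Input((512, 512, 1), name="coc_map")       # conditioning COC map
    # Fuse the two encoder feature streams, then apply four residual modules.
    f = layers.Concatenate()([encoder(view, "rgb"), encoder(coc, "coc")])
    for _ in range(4):
        f = residual_block(f)
    # Assumed symmetric decoder back to full resolution.
    for ch in [256, 128, 64, 32]:
        f = layers.Conv2DTranspose(ch, 3, strides=2, padding="same",
                                   activation="relu")(f)
    out = layers.Conv2D(3, 3, padding="same", activation="tanh")(f)
    return Model([view, coc], out, name="generator")

def build_discriminator():
    # Judges whether a refocused image matches the conditioning COC map.
    img = layers.Input((512, 512, 3))
    coc = layers.Input((512, 512, 1))
    x = layers.Concatenate()([img, coc])
    for ch in [64, 128, 256, 512]:
        x = layers.Conv2D(ch, 4, strides=2, padding="same")(x)
        x = layers.LeakyReLU(0.2)(x)
    x = layers.GlobalAveragePooling2D()(x)
    return Model([img, coc], layers.Dense(1, activation="sigmoid")(x),
                 name="discriminator")
```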
To enhance the high-frequency details of the refocused image, we adopt a pre-trained Visual Geometry Group 16-layer (VGG-16) network to compute the style loss and the perceptual loss. The L1 loss is used as the reconstruction loss of the generator, and the discriminator adopts the cross-entropy loss. Blender is used to adjust the positions of the focal planes and depths of field and to render the corresponding light field images, and a digital single-lens reflex (DSLR) camera plug-in for Blender renders the corresponding refocused images as ground truth. Our network is implemented in the Keras framework. The input and output sizes of the network are both 512 × 512 × 3. The network is trained on a Titan XP GPU for 3 500 epochs with an initial learning rate of 0.000 2; training took about 28 hours.
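A sketch of the loss terms under the same assumptions; the chosen VGG-16 layers and the loss weights are illustrative, not the paper's values (the standard ImageNet preprocessing of the VGG inputs is also omitted for brevity):

```python
import tensorflow as tf
from tensorflow.keras import Model
from tensorflow.keras.applications import VGG16

# Frozen VGG-16 feature extractor (layer choice assumed).
_vgg = VGG16(include_top=False, weights="imagenet", input_shape=(512, 512, 3))
_feat = Model(_vgg.input, [_vgg.get_layer(n).output for n in
                           ("block2_conv2", "block3_conv3", "block4_conv3")])
_feat.trainable = False

def _gram(x):
    # Channel-correlation (Gram) matrix of a feature-map batch.
    s = tf.shape(x)
    flat = tf.reshape(x, [s[0], -1, s[3]])
    n = tf.cast(s[1] * s[2] * s[3], tf.float32)
    return tf.matmul(flat, flat, transpose_a=True) / n

def generator_loss(y_true, y_pred, d_fake, w_l1=1.0, w_p=0.05, w_s=100.0):
    l1 = tf.reduce_mean(tf.abs(y_true - y_pred))             # L1 reconstruction
    lp, ls = 0.0, 0.0
    for ft, fp in zip(_feat(y_true), _feat(y_pred)):
        lp += tf.reduce_mean(tf.abs(ft - fp))                # perceptual loss
        ls += tf.reduce_mean(tf.abs(_gram(ft) - _gram(fp)))  # style loss
    adv = tf.keras.losses.binary_crossentropy(               # fool the discriminator
        tf.ones_like(d_fake), d_fake)
    return w_l1 * l1 + w_p * lp + w_s * ls + tf.reduce_mean(adv)
```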
Result
We compare our algorithm with related algorithms on our synthetic dataset and a real-world dataset, including traditional refocusing algorithms, three different light field reconstruction algorithms, and a defocusing algorithm that uses anisotropic filtering with the COC map. Peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) are used for quantitative evaluation. Qualitatively, the proposed network can produce refocused images with different focal planes and depths of field according to the input COC map. Quantitatively, our algorithm improves the average PSNR by 1.82 dB and the average SSIM by 0.02 over traditional refocusing algorithms; compared with the method that uses the COC map and anisotropic filtering, the average PSNR is improved by 7.92 dB and the average SSIM by 0.08. The reconstruction/super-resolution-based methods achieved poor PSNR values due to the chromatic aberration of the generated sub-views.
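As a reproducibility aid, the two metrics can be computed with scikit-image as below (a minimal sketch; `channel_axis` requires scikit-image >= 0.19, older versions use `multichannel=True`):

```python
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(pred, gt):
    """pred, gt: uint8 RGB images of identical shape (H, W, 3)."""
    psnr = peak_signal_noise_ratio(gt, pred, data_range=255)
    ssim = structural_similarity(gt, pred, data_range=255, channel_axis=-1)
    return psnr, ssim
```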
Conclusion
Our algorithm can generate a disparity map from the input light field image and, given the required refocusing plane and depth of field, compute the corresponding COC map. Our conditional generative adversarial network can then perform bokeh rendering on the central sub-aperture view conditioned on different COC maps to produce the corresponding refocused images.