LLFlowGAN: low-light image enhancement by constraining invertible flow in a generative adversarial manner

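Huang Ying, Peng Hui, Li Changsheng, Gao Shengmei, Chen Feng (School of Software Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065)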

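Abstract
Objective Most existing low-light image enhancement methods rely on pixel-level reconstruction and aim to learn a deterministic mapping between low-light inputs and normally exposed images. They do not model the complex illumination distribution, which leads to inappropriate brightness and noise. Most image generation methods use only one kind of generative model (explicit or implicit), which limits flexibility and efficiency. To address this, a hybrid explicit-implicit generative model is improved so that adversarial training and maximum likelihood training can be carried out at the same time. Method First, a residual attention conditional encoder is designed to process the low-light input and extract rich features that reduce color deviation in the generated images. Then, the features extracted by the encoder serve as the conditional prior of an invertible flow generative model, which learns a bidirectional mapping between the distribution of normally exposed images and a Gaussian distribution. This models the conditional distribution of normally exposed images, so the model can sample multiple normally exposed results and generate diverse samples. Finally, an implicit generative adversarial network (GAN) provides constraints for the model and improves image detail. In particular, both mapping directions are constrained by loss functions, so the model is strongly resistant to mode collapse. Result Experiments were trained and tested on two datasets. On the low-light (LOL) dataset, compared with other algorithms, the proposed algorithm achieves the best peak signal-to-noise ratio (PSNR) and learned perceptual image patch similarity (LPIPS), the second-best structural similarity index measure (SSIM), and a relatively good natural image quality evaluator (NIQE) score. Specifically, compared with the best value among 18 existing models, the proposed algorithm improves PSNR by 0.84 dB, reduces LPIPS by 0.02, is 0.01 lower in SSIM, and reduces NIQE by 1.05. On the MIT-Adobe FiveK (Massachusetts Institute of Technology Adobe FiveK) dataset, compared with five existing models, the proposed algorithm improves PSNR by 0.58 dB over the best of them and ties for first place in SSIM. Conclusion The flow generative adversarial model proposed in this paper combines the advantages of explicit and implicit generative models. It adjusts the illumination of low-light images better, suppresses noise and artifacts, and improves the visual perceptual quality of the generated images.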
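The combination of maximum likelihood and adversarial training described above can be written down as follows. This is only a sketch under stated assumptions, not the paper's exact formulation: x denotes the low-light input, y the normally exposed image, g the conditional encoder, f_θ the invertible flow, and D the discriminator. The symbols, the standard normal prior, the non-saturating form of the adversarial term, and the weight λ are all introduced here for illustration; the abstract does not specify them.

```latex
\begin{aligned}
z &= f_\theta\bigl(y;\, g(x)\bigr), \qquad z \sim \mathcal{N}(0, I), \\
\mathcal{L}_{\mathrm{nll}}(\theta) &= -\log p_z\bigl(f_\theta(y; g(x))\bigr)
  - \log\left|\det \frac{\partial f_\theta(y; g(x))}{\partial y}\right|, \\
\hat{y} &= f_\theta^{-1}\bigl(z;\, g(x)\bigr), \\
\mathcal{L}_{\mathrm{adv}}(\theta) &= -\log D(\hat{y}), \\
\mathcal{L}(\theta) &= \mathcal{L}_{\mathrm{nll}}(\theta) + \lambda\, \mathcal{L}_{\mathrm{adv}}(\theta).
\end{aligned}
```

In this reading, the likelihood term constrains the forward direction (image to latent) and the adversarial term constrains the reverse direction (latent to image), which is one way to interpret the statement that both mapping directions are constrained by loss functions.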
Keywords
LLFlowGAN: a low-light image enhancement method that constrains invertible flow in a generative adversarial manner

Huang Ying, Peng Hui, Li Changsheng, Gao Shengmei, Chen Feng (School of Software Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China)

Abstract
Objective Low-light images are produced when imaging devices cannot capture sufficient light because of unavoidable environmental or technical limitations (such as nighttime, backlight, and underexposure). Such images usually have low brightness, low contrast, a narrow grayscale range, color distortion, and strong noise, so they convey little usable information. Low-light images with these problems do not meet human visual requirements and directly limit the performance of downstream high-level vision systems. Low-light image enhancement is an ill-posed problem because a low-light image has lost illumination information; that is, one low-light image may correspond to countless normal-light images. Enhancement should therefore be regarded as selecting the most suitable solution from all possible outputs. Most existing methods rely on pixel-level reconstruction algorithms that aim to learn a deterministic mapping between low-light inputs and normal-light images. They provide a single normal-light result for a low-light image rather than modeling complex lighting distributions, which usually results in inappropriate brightness and noise. Furthermore, most existing image generation methods use only one (explicit or implicit) generative model, which limits flexibility and efficiency. Flow models have recently demonstrated promising results on low-level vision tasks. This paper improves a hybrid explicit-implicit generative model that can flexibly and efficiently reconstruct normal-light images with satisfactory lighting, cleanliness, and realism from degraded inputs. The model alleviates the blurred details and lack of diversity produced by purely explicit or purely implicit generative modeling. Method This paper proposes a low-light image enhancement network that combines an explicit flow model with an implicit generative adversarial network (GAN). The network, named LLFlowGAN, contains three parts: a conditional encoder, a flow generation network, and a discriminator.
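Flow models of the kind named above are typically built from invertible layers whose Jacobian determinant is cheap to compute. The following is a minimal NumPy sketch of one conditional affine coupling layer, a common building block of such models; it is not taken from LLFlowGAN. The function names, the linear scale and shift maps, and the array shapes are hypothetical and chosen only for illustration.

```python
import numpy as np


def conditional_affine_coupling(x, cond, w_scale, w_shift, reverse=False):
    """Hypothetical affine coupling layer on (batch, dim) arrays.

    The first half of x passes through unchanged; together with the
    conditioning features it predicts a scale and shift for the second half.
    """
    half = x.shape[1] // 2
    x_a, x_b = x[:, :half], x[:, half:]
    h = np.concatenate([x_a, cond], axis=1)
    log_s = np.tanh(h @ w_scale)  # bounded log-scale keeps exp() stable
    t = h @ w_shift
    if not reverse:
        y_b = x_b * np.exp(log_s) + t      # forward: image side -> latent side
        logdet = log_s.sum(axis=1)
    else:
        y_b = (x_b - t) * np.exp(-log_s)   # exact inverse of the forward pass
        logdet = -log_s.sum(axis=1)
    return np.concatenate([x_a, y_b], axis=1), logdet


def gaussian_log_likelihood(z, logdet):
    """Change of variables: log p(x | cond) = log N(z; 0, I) + log|det J|."""
    log_pz = -0.5 * (z ** 2 + np.log(2.0 * np.pi)).sum(axis=1)
    return log_pz + logdet


# Toy data: 4 samples, 6 features, 3 conditioning features (all assumed).
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 6))
cond = rng.normal(size=(4, 3))
w_scale = 0.1 * rng.normal(size=(3 + 3, 3))
w_shift = 0.1 * rng.normal(size=(3 + 3, 3))

z, logdet = conditional_affine_coupling(x, cond, w_scale, w_shift)
x_rec, _ = conditional_affine_coupling(z, cond, w_scale, w_shift, reverse=True)
print("max reconstruction error:", np.abs(x_rec - x).max())
print("per-sample log-likelihood:", gaussian_log_likelihood(z, logdet))
```

Because the untouched half and the conditioning features are available in both directions, the same scale and shift can be recomputed during inversion, so the layer is invertible by construction. Sampling different latent vectors and running the reverse pass is what allows a flow to produce several outputs for one input.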
The flow generation network operates at multiple scales, conditioned on encoded information from the low-light input. First, a residual attention conditional encoder is designed to process the low-light input, calculate low-light color maps, and extract rich features to reduce the color deviation of generated images. Owing to the flexibility of the flow model, the conditional encoder mainly consists of several residual blocks efficiently stacked with channel attention modules. Then, the features extracted by the encoder are used as the conditional prior of the generative flow model. The flow model learns a bidirectional mapping between the high-dimensional random variables that follow the normal-exposure image distribution and simple, tractable latent variables (a Gaussian distribution). By modeling the conditional distribution of normal-exposure images, the model allows multiple normal-exposure results to be sampled, generating diverse samples. Finally, the GAN-based discriminator provides constraints for the model and improves image detail in the reverse mapping. Because the model learns a bidirectional mapping, both mapping directions are constrained by the loss function, which gives the network stability and resistance to mode collapse. Result The proposed algorithm is validated by experiments on two datasets, the Low-Light (LOL) dataset and the MIT-Adobe FiveK dataset. Quantitative evaluation metrics include peak signal-to-noise ratio (PSNR), structural similarity index measure (SSIM), learned perceptual image patch similarity (LPIPS), and natural image quality evaluator (NIQE). Our model is compared with 18 existing models on the LOL dataset, including traditional supervised and unsupervised deep learning methods and state-of-the-art methods in this field. Compared with the second-best model, our method improves PSNR by 0.84 dB and reduces LPIPS (the smaller, the better) by 0.02. Its SSIM is the second best, 0.01 below the best value, and its NIQE is lower by 1.05. Visual results of each method are also provided for comparison. Our method better preserves rich detail and color information while enhancing image brightness, rarely produces artifacts, and achieves better perceptual quality. On the MIT-Adobe FiveK dataset, our method is compared with five state-of-the-art methods. Compared with the second-best model, the PSNR increases by 0.58 dB, and the SSIM ties for first place. In addition, a series of ablation experiments and cross-dataset tests on the LOL dataset are conducted to verify the effectiveness of each module of the algorithm. Experimental results show that the proposed algorithm improves low-light image enhancement. Conclusion In this paper, a hybrid explicit-implicit generative model is proposed. The model inherits the properties of flow-based explicit generative models: it can convert exactly between the natural image space and a simple Gaussian distribution and can flexibly generate diverse samples. An adversarial training strategy is further used to improve the detail of the generated images, enrich saturation, and reduce color distortion. The proposed approach achieves competitive performance compared with representative state-of-the-art low-light image enhancement methods.
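To put the reported PSNR gains in context, the sketch below computes PSNR with the standard definition and converts a gain in dB into the equivalent ratio of mean squared errors for a single image pair. The function names and the assumption that images are scaled to [0, 1] are illustrative; the abstract does not describe the evaluation code or data range that were actually used.

```python
import numpy as np


def psnr(reference, estimate, max_value=1.0):
    """Peak signal-to-noise ratio in dB: 10 * log10(max_value^2 / MSE)."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))


def mse_ratio_for_gain(gain_db):
    """MSE ratio (new / old) implied by a PSNR gain of gain_db for one image."""
    return 10.0 ** (-gain_db / 10.0)


# A uniform error of 0.1 on a [0, 1] scale gives an MSE of 0.01,
# which corresponds to approximately 20 dB.
reference = np.zeros((8, 8))
estimate = np.full((8, 8), 0.1)
print("PSNR (dB):", psnr(reference, estimate))

# Relative MSE implied by the two reported gains, per image.
for gain in (0.84, 0.58):
    print(gain, "dB gain -> MSE ratio", mse_ratio_for_gain(gain))
```

Under this definition, a 0.84 dB gain on one image corresponds to an MSE about 0.82 times the original. Dataset-level PSNR is usually an average of per-image values, so the ratio is only a rough guide to the aggregate numbers.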
Keywords
