Huang Ying, Peng Hui, Li Changsheng, Gao Shengmei, Chen Feng (Chongqing University of Posts and Telecommunications)
Objective Low-light image enhancement is an ill-posed problem: a single low-light image corresponds to countless normal-light images. Most existing methods rely on pixel-level reconstruction and aim to learn a deterministic mapping between low-light inputs and normally exposed images without modeling the complex illumination distribution, which leads to inappropriate brightness and noise. Moreover, most existing image generation methods use only one type of generative model (explicit or implicit), which limits flexibility and efficiency. To this end, this paper improves a hybrid explicit-implicit generative model that allows adversarial training and maximum-likelihood training simultaneously. Method First, a residual-attention conditional encoder is designed to process the low-light input and extract rich features, reducing the color deviation of the generated images. Then, the features extracted by the encoder serve as the conditional prior of an invertible flow generative model, which learns a bidirectional mapping between the distribution of normally exposed images and a Gaussian distribution, thereby modeling the conditional distribution of normally exposed images; the model can thus sample multiple normal-exposure results and generate diverse samples. Finally, an implicit generative adversarial strategy (GAN) provides constraints for the model and improves image details. In particular, both mapping directions are constrained by the loss function, so the proposed model is highly resistant to mode collapse. Results Experiments are trained and tested on two datasets. On the LOL dataset, the method is compared with 18 representative models: relative to the second-best model, it improves PSNR (Peak Signal-to-Noise Ratio) by 0.84 dB, lowers LPIPS (Learned Perceptual Image Patch Similarity) by 0.02, achieves the second-best SSIM (Structural Similarity Index, lower by only 0.01), and lowers NIQE (Natural Image Quality Evaluator) by 1.05. On the MIT-Adobe FiveK dataset, compared with 5 representative models, PSNR improves by 0.58 dB over the second-best model and SSIM ties for first place. Ablation experiments and cross-dataset tests on the LOL dataset verify the effectiveness of each module; the results show that the proposed algorithm improves low-light image enhancement. Conclusion The proposed flow-based generative adversarial model combines the advantages of explicit and implicit generative models, better adjusts the illumination of low-light images, suppresses noise and artifacts, and improves the perceptual quality of the generated images.
LLFlowGAN: a low-light image enhancement method that constrains an invertible flow in a generative adversarial manner
Huang Ying, Peng Hui, Li Changsheng, Gao Shengmei, Chen Feng (Chongqing University of Posts and Telecommunications)
Objective Low-light images are produced when imaging devices cannot capture sufficient light due to unavoidable environmental or technical limitations (such as nighttime, backlight, or underexposure). Such images usually exhibit low brightness, low contrast, a narrow grayscale range, color distortion, and strong noise, and therefore carry little usable information. Images with these problems fail to meet human visual requirements and directly limit the performance of subsequent high-level vision systems. Low-light image enhancement is an ill-posed problem because a low-light image has lost illumination information; that is, one low-light image may correspond to countless normal-light images. Low-light image enhancement should therefore be regarded as selecting the most suitable solution from all possible outputs. Most existing methods rely on pixel-level reconstruction algorithms that aim to learn a deterministic mapping between low-light inputs and normal-light images. They produce a single normal-light result for a given low-light image rather than modeling the complex lighting distribution, which usually results in inappropriate brightness and noise. Furthermore, most existing image generation methods use only one type of generative model (explicit or implicit), which limits flexibility and efficiency. Flow models have recently demonstrated promising results for low-level vision tasks. This paper improves a hybrid explicit-implicit generative model that can flexibly and efficiently reconstruct normal-light images with satisfactory lighting, cleanliness, and realism from degraded inputs. It alleviates the blurred details and singularity problems produced by purely explicit or purely implicit generative modeling. Method This paper proposes a low-light image enhancement network with a hybrid explicit (flow)-implicit (GAN) generative model, named LLFlowGAN. It mainly contains three parts: a conditional encoder, a flow generation network, and a discriminator.
The flow generation network operates at multiple scales, conditioned on information encoded from the low-light input. Specifically, we first design a residual-attention conditional encoder that processes the low-light input, computes low-light color maps, and extracts rich features to reduce the color deviation of generated images. Thanks to the flexibility of the flow model, the conditional encoder mainly consists of several residual blocks with efficiently stacked channel-attention modules. The features extracted by the encoder are then used as the conditional prior of the flow generation model. The flow model learns a bidirectional mapping between the high-dimensional random variables obeying the normal-exposure image distribution and simple, tractable latent variables (a Gaussian distribution). By modeling the conditional distribution of normally exposed images, the model can sample multiple normal-exposure results and generate diverse samples. Finally, the GAN-based discriminator provides constraints for the model and improves the detailed information of the image in the reverse mapping process. Since the model learns a bidirectional mapping, both mapping directions are constrained by the loss function, giving the network stability and strong resistance to mode collapse. Result The proposed algorithm is evaluated on two datasets, LOL (Low-Light) and MIT-Adobe FiveK, to verify its effectiveness. Quantitative evaluation metrics include Peak Signal-to-Noise Ratio (PSNR), Structural Similarity Index Measure (SSIM), Learned Perceptual Image Patch Similarity (LPIPS), and Natural Image Quality Evaluator (NIQE). On the LOL dataset, we compare our model with 18 representative models, including traditional methods as well as supervised and unsupervised deep-learning methods, some of which are state of the art in this field.
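The bidirectional mapping described above matches the standard conditional normalizing-flow formulation. As a sketch in our own notation (the paper's exact symbols may differ), the maximum-likelihood objective is the change-of-variables log-likelihood:

```latex
% y: normal-exposure image; x: condition features from the encoder;
% f_theta: invertible flow with z = f_theta(y; x) ~ N(0, I)
\log p(\mathbf{y} \mid \mathbf{x})
  = \log p_{\mathcal{N}}\bigl(f_{\theta}(\mathbf{y}; \mathbf{x})\bigr)
  + \log \left| \det \frac{\partial f_{\theta}(\mathbf{y}; \mathbf{x})}{\partial \mathbf{y}} \right|
```

Training minimizes the negative of this log-likelihood in the forward direction, while sampling inverts f_theta on Gaussian draws to produce diverse normal-exposure candidates; the adversarial loss constrains this reverse direction.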
Compared to the model with the second-best performance, our method improves the PSNR value by 0.84 dB and reduces the LPIPS value (smaller is better) by 0.02; SSIM attains the second-best value (lower by only 0.01), and NIQE is reduced by 1.05. We also provide visual comparisons of each method. Our method better preserves rich detail and color information while enhancing image brightness; artifacts are rarely observed, yielding better perceptual quality. On the MIT-Adobe FiveK dataset, we compare with five state-of-the-art methods: relative to the second-best model, the PSNR value increases by 0.58 dB, and the SSIM value ties for first place. In addition, we conduct a series of ablation experiments and cross-dataset tests on the LOL dataset to verify the effectiveness of each module. The experimental results show that the proposed algorithm improves low-light image enhancement. Conclusion In this study, we propose a hybrid explicit-implicit generative model. It inherits the flow-based explicit generative model, which can accurately convert between the natural image space and a simple Gaussian distribution in both directions and flexibly generate diverse samples. At the same time, an adversarial training strategy further improves the details of the generated images, enriches saturation, and reduces color distortion. The method achieves competitive performance compared with representative state-of-the-art low-light image enhancement methods.
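As an illustration only, the residual blocks with channel attention that make up the conditional encoder can be sketched as follows. This is a minimal NumPy sketch, not the authors' implementation; all shapes and names are assumptions, and a 1x1 convolution stands in for the real convolutional layers.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, b1, w2, b2):
    """SE-style channel attention: squeeze by global average pooling,
    excite with two small FC layers, then rescale each channel."""
    squeeze = feat.mean(axis=(1, 2))             # (C,) per-channel statistics
    hidden = np.maximum(0.0, w1 @ squeeze + b1)  # ReLU bottleneck, (C // r,)
    scale = sigmoid(w2 @ hidden + b2)            # (C,) attention weights in (0, 1)
    return feat * scale[:, None, None]           # reweight channels

def residual_ca_block(feat, conv, w1, b1, w2, b2):
    """Residual block: a 1x1 conv stand-in, ReLU, channel attention,
    and a skip connection, roughly as in the conditional encoder."""
    out = np.einsum('oc,chw->ohw', conv, feat)   # 1x1 convolution over channels
    out = np.maximum(0.0, out)                   # ReLU
    out = channel_attention(out, w1, b1, w2, b2)
    return feat + out                            # residual (skip) connection

# Hypothetical usage with random weights: 8 channels, 4x4 feature map, reduction 2.
rng = np.random.default_rng(0)
C, H, W, r = 8, 4, 4, 2
feat = rng.standard_normal((C, H, W))
conv = rng.standard_normal((C, C)) * 0.1
w1 = rng.standard_normal((C // r, C)) * 0.1
b1 = np.zeros(C // r)
w2 = rng.standard_normal((C, C // r)) * 0.1
b2 = np.zeros(C)
out = residual_ca_block(feat, conv, w1, b1, w2, b2)
```

The skip connection preserves the input features, so the attention branch only needs to learn a correction, which is what makes stacking several such blocks stable to train.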