A hidden-feature-supervised Siamese network method for low-light optical flow estimation
Xiao Zhaolin, Su Zhan, Zuo Fengyuan, Jin Haiyan (Xi'an University of Technology)
Abstract
Objective Imaging under low-light conditions suffers from a low signal-to-noise ratio, motion blur, and other problems, which pose a great challenge to optical flow estimation. Unlike existing "enhance first, estimate next" optical flow methods, and to avoid losing scene motion information during the low-light image enhancement stage, this paper proposes a Siamese network learning method for low-light optical flow estimation with hidden feature supervision. Method First, the method uses a weight-sharing Siamese network to extract mappable low-light and normal-light optical flow features. Then, a K-nearest-neighbor correlation volume is computed over adjacent low-light frames, avoiding the high spatial-temporal complexity of computing the four-dimensional all-pairs correlation volume. An attention mechanism over two-dimensional motion features is introduced into the global motion aggregation module to reduce the adverse effects of strong noise, motion blur, and low contrast on optical flow estimation under low-light conditions. Finally, a hidden-feature-supervised optical flow estimation module is proposed, in which normal-light flow features supervise the learning of low-light flow features, achieving high-accuracy optical flow estimation. Results Comparative experiments with three state-of-the-art optical flow estimation methods show that, under normal illumination, the proposed method achieves performance comparable to the best existing method. On the low-light FCDN dataset, the proposed method performs best, improving end-point error by 0.29 over the second-best method; on the VBOF dataset, it improves end-point error by 0.08. Conclusion This paper uses a weight-sharing dual-branch Siamese network to accurately encode normal-light and low-light optical flow features, and achieves high-accuracy low-light optical flow estimation through supervised learning. Experimental results show that the proposed method has significant advantages in both low-light optical flow estimation accuracy and generalizability. The code is available on GitHub.
Keywords
Low-light optical flow estimation with hidden feature supervision using a Siamese network
Zhaolin Xiao, Zhan Su, Fengyuan Zuo, Haiyan Jin (Xi'an University of Technology)
Abstract
Objective Optical flow estimation has been widely used in target tracking, video temporal super-resolution, behavior recognition, scene depth estimation, and other vision applications. Unfortunately, imaging under low-light conditions can hardly avoid a low signal-to-noise ratio and motion blur, making low-light optical flow estimation very challenging. Applying a pre-step low-light image enhancement can effectively improve the image's visual perception, but this is not necessarily helpful for subsequent optical flow estimation. Unlike the "enhance first, estimate next" strategy, and in order to prevent the loss of scene motion information, we suggest that low-light image enhancement should be considered simultaneously with optical flow estimation. The optical flow features are encoded into a latent space, which enables supervised feature learning on paired low-light and normal-light datasets. This paper also reveals that task-oriented feature enhancement outperforms general visual enhancement of low-light images. The main contributions of this paper are summarized as follows: 1) We propose a dual-branch Siamese network framework for low-light optical flow estimation, in which a weight-sharing block establishes the correlation of motion features between low-light and normal-light images. 2) We propose an iterative low-light flow estimation module that can be supervised using normal-light hidden features; our solution requires no explicit enhancement of low-light images. Method This paper proposes a dual-branch Siamese network to encode both the low-light and the normal-light optical flow features. The encoded features are then used to estimate the optical flow in a supervised manner. Our dual-branch feature extractor is constructed with a weight-sharing block, which encodes the motion features.
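The weight-sharing idea behind the dual-branch extractor can be sketched in plain Python. The linear encoder, its dimensions, and the toy inputs below are hypothetical stand-ins for the paper's convolutional encoder; the essential point is that both branches apply the same parameters, so low-light and normal-light frames are mapped into one common latent space.

```python
# Minimal sketch of a weight-sharing dual-branch (Siamese) feature extractor.
# Both branches call the SAME encoder with the SAME weights, so the two
# feature vectors are directly comparable in latent space.
import random

random.seed(0)

FEAT_DIM, IN_DIM = 4, 8
# One shared weight matrix, used by both branches (this is the weight sharing).
W = [[random.uniform(-1, 1) for _ in range(IN_DIM)] for _ in range(FEAT_DIM)]

def shared_encoder(x):
    """Encode an input vector with the shared weights (ReLU activation)."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in W]

low_light_frame = [random.uniform(0.0, 0.2) for _ in range(IN_DIM)]  # dark pixels
normal_frame    = [random.uniform(0.0, 1.0) for _ in range(IN_DIM)]

f_low    = shared_encoder(low_light_frame)   # low-light branch
f_normal = shared_encoder(normal_frame)      # normal-light branch
# Because both branches share W, f_low can be supervised against f_normal.
```

Mapping both illumination conditions through one set of weights is what makes the latent-space supervision described later well-posed: the two feature vectors live in the same coordinate system.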
Importantly, our algorithm does not need the pre-step low-light enhancement that most existing optical flow estimators employ. To overcome the high spatial-temporal computational complexity, we compute a K-nearest-neighbor correlation volume instead of the four-dimensional all-pairs correlation volume. To better fuse local and global motion features, we introduce an attention mechanism for 2D motion feature aggregation. After feature extraction, a discriminator distinguishes the low-light features from the normal-light features; training of the feature extractor is complete when the discriminator can no longer tell the two apart. To avoid explicit enhancement of low-light images, the final optical flow estimation module consists of a feature enhancement block and a gated recurrent unit (GRU), which iteratively decodes the optical flow from the enhanced features. We use a latent-feature-supervised loss and an iterative similarity loss to ensure convergence during training. In the experiments, we train the network on an NVIDIA GeForce RTX 3080Ti GPU, with input images uniformly cropped to 496 × 368. Since paired low-light and normal-light datasets are limited, we jointly use the FCDN (Flying Chairs Dark Noise) and VBOF (Various Brightness Optical Flow) datasets in the training stage. Results We compare the proposed algorithm with three state-of-the-art optical flow estimation models on several low-light and normal-light datasets, including FCDN, VBOF, Sintel, and KITTI. Besides visual comparison, we conduct quantitative evaluation with the end-point error (EPE) metric. The experimental results show that the proposed method achieves performance comparable to the best available optical flow estimation under normal illumination.
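The K-nearest-neighbor correlation volume mentioned in the Method can be sketched as follows. The frame sizes, feature values, and K are toy assumptions, not the paper's actual tensors; the sketch only illustrates why keeping the top-K correlations per pixel shrinks storage from N×N to N×K entries.

```python
# Minimal sketch of a K-nearest-neighbor correlation volume: instead of the
# full all-pairs correlation (every frame-1 pixel against every frame-2
# pixel, N*N entries), each frame-1 pixel keeps only its K strongest
# correlations with frame-2 features.
import heapq
import random

random.seed(1)

H, W, C, K = 4, 4, 8, 3          # toy frame size, channel count, neighbors kept
N = H * W

def rand_feat():
    return [random.uniform(-1, 1) for _ in range(C)]

feats1 = [rand_feat() for _ in range(N)]   # frame-1 pixel features
feats2 = [rand_feat() for _ in range(N)]   # frame-2 pixel features

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def knn_correlation(f1, f2, k):
    """For each pixel in frame 1, keep the k largest correlations with frame 2."""
    volume = []
    for f in f1:
        corrs = ((dot(f, g), j) for j, g in enumerate(f2))
        volume.append(heapq.nlargest(k, corrs))   # [(corr, target_index), ...]
    return volume

knn_vol = knn_correlation(feats1, feats2, K)
# Storage per frame pair drops from N*N to N*K correlation entries.
```

With K much smaller than N, both the memory footprint and the lookup cost of the correlation step scale linearly in the number of pixels rather than quadratically.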
However, under low-light conditions the proposed solution improves the EPE by up to 0.29 over the second-best solution on the FCDN dataset, and by 0.08 on the VBOF dataset. We also provide visual comparisons with all compared methods; these show that the proposed model preserves more accurate details than other optical flow estimators, especially under low-light conditions. Conclusion In this paper, we propose a dual-branch Siamese network that accurately encodes optical flow features under normal-light and low-light conditions. The feature extractor is constructed with a weight-sharing block, which enables better supervised learning for low-light optical flow estimation. The proposed model has significant advantages in both flow estimation accuracy and generalizability. The experimental results indicate that the proposed supervised hidden feature learning outperforms state-of-the-art optical flow estimators in precision on low-light datasets.
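The hidden (latent) feature supervision at the core of the method can be sketched with a simple latent-space loss. The mean-squared-error form and the toy feature vectors below are illustrative assumptions; they only show how normal-light features can act as the supervision target for low-light features of the same scene.

```python
# Minimal sketch of latent feature supervision: the low-light branch is
# trained so that its features match the normal-light features of the same
# scene, here via a mean-squared error in latent space.
def latent_supervision_loss(low_light_feat, normal_feat):
    """MSE between low-light and normal-light latent feature vectors."""
    assert len(low_light_feat) == len(normal_feat)
    n = len(low_light_feat)
    return sum((a - b) ** 2 for a, b in zip(low_light_feat, normal_feat)) / n

# Identical features incur zero loss; mismatched features are penalized,
# pushing the low-light encoding toward its normal-light counterpart.
loss_same = latent_supervision_loss([0.5, -0.2, 1.0], [0.5, -0.2, 1.0])
loss_diff = latent_supervision_loss([0.5, -0.2, 1.0], [0.0, 0.3, 0.4])
```

Minimizing such a loss drives the low-light features toward the normal-light ones without ever reconstructing an enhanced image, which is why the pipeline needs no explicit low-light enhancement step.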
Keywords
optical flow estimation; Siamese network; correlation volume; global motion aggregation; low-light image enhancement