A survey of image matching methods

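Jia Di, Zhu Ningdan, Yang Ninghua, Wu Si, Li Yuxiu, Zhao Mingyuan (School of Electronic and Information Engineering, Liaoning Technical University, Huludao 125105)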

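Abstract
Objective Image matching, a core task in computer vision, is key to subsequent high-level image processing such as object recognition, image stitching, 3D reconstruction, visual localization, and scene depth computation. This paper reviews image matching methods from three aspects: local invariant feature points, straight lines, and regions. Method Local invariant feature point matching emerged earliest in the development of image matching. Classical algorithms of this type are described only briefly, while methods proposed in recent years are introduced in detail, especially deep learning-based matching methods, including the temporally invariant learned detector (TILDE), Quad-networks, the deep convolutional feature point descriptor (DeepDesc), and the learned invariant feature transform (LIFT). Because outlier removal methods are often used to improve the accuracy of local invariant feature point matching, they are also introduced, including bilateral functions (BF) for global motion modeling, grid-based motion statistics (GMS), and vector field consensus (VFC). Compared with local invariant feature points, lines contain more structural information about scenes and objects and are better suited to matching image pairs with repetitive texture. Line matching research must overcome problems such as inaccurate endpoint positions, indistinct line segment appearance, and line segment fragmentation. Methods addressing these problems include the line band descriptor (LBD), line matching based on context and appearance (CA), line matching based on point correspondences (LP), and the coplanar line-point projective invariant method. This paper introduces these methods from the perspective of the problem-solving process. Region matching is introduced from two angles: region feature extraction and matching, and template matching. Typical region feature extraction and matching methods include maximally stable extremal regions (MSER) and tree-based Morse regions (TBMR). Template matching methods include fast affine template matching (FAsT-Match), fast affine template matching for color images (CFAST-Match), deformable diversity similarity (DDIS), and occlusion-aware template matching (OATM), as well as deep learning methods such as MatchNet, L2-Net, PN-Net, and DeepCD. Result This paper summarizes and compares image matching methods from three aspects, namely local invariant feature points, straight lines, and regions, including a comparison of the influencing factors in feature matching methods and a comparison of deep learning-based matching methods. The papers and code download addresses for these methods are provided, and future research directions are discussed. Conclusion Image matching is the foundation for subsequent high-level processing in computer vision. At present, further in-depth research is still needed on wide-baseline matching and real-time matching.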
Image matching methods

Jia Di, Zhu Ningdan, Yang Ninghua, Wu Si, Li Yuxiu, Zhao Mingyuan (School of Electronic and Information Engineering, Liaoning Technical University, Huludao 125105, China)

Abstract
Objective Image matching, a core task in computer vision, underpins subsequent high-level image processing, such as object recognition, image mosaicking, 3D reconstruction, visual localization, and scene depth calculation. Although many excellent methods have been proposed by researchers in China and abroad in recent years, no comprehensive summary of image matching methods has been reported. This study therefore reviews these methods from three aspects, namely, locally invariant feature points, straight lines, and regions. Method Locally invariant feature point matching was the first to appear in the development of image matching, with examples such as the Harris corner detector, features from accelerated segment test (FAST), and the scale-invariant feature transform (SIFT). The classical algorithms of this type are described only briefly. The focus is on methods proposed in recent years, especially deep learning-based matching methods, including the temporally invariant learned detector (TILDE), Quad-networks, discriminative learning of deep convolutional feature point descriptors (DeepDesc), and the learned invariant feature transform (LIFT). Because outlier removal methods are often used to improve the accuracy of local invariant feature matching, they are also introduced, including bilateral functions (BF) for global motion modeling, grid-based motion statistics (GMS), and vector field consensus (VFC). Compared with local invariant feature points, lines contain more structural information about scenes and objects and are more suitable for matching image pairs with repeated texture. Research on line matching must overcome problems such as inaccurate endpoint positions, inconspicuous line segment appearance, and line segment fragmentation. Methods for solving such problems include the line band descriptor (LBD), a two-view line matching algorithm based on context and appearance (CA), line matching leveraged by point correspondences (LP), and a new coplanar line-point projective invariant. This paper introduces these methods from the perspective of the problem-solving process. Region matching is introduced from two aspects: region feature extraction and matching, and template matching. Typical region feature extraction and matching methods include maximally stable extremal regions (MSER) and tree-based Morse regions (TBMR). Template matching methods include fast affine template matching (FAsT-Match), fast affine template matching for color images (CFAST-Match), deformable diversity similarity (DDIS), and occlusion-aware template matching (OATM), together with deep learning methods such as MatchNet, L2-Net, PN-Net, and DeepCD. Medical image matching is an important application of image matching and is significant for clinically precise diagnosis and treatment. This work introduces such methods from the perspective of practical applications, for example fractional total variation-L1 and feature matching with learned nonlinear descriptors. Result For the analysis and comparison of multiple image matching algorithms, the experimental environment is a computer with a dual-core 3.4 GHz CPU and an NVIDIA GTX TITAN X GPU. The test datasets are the Technical University of Denmark dataset and the Graf sequence of the Oxford University dataset. This paper summarizes and compares the methods from three aspects, namely, local invariant feature points, straight lines, and regions. Comparisons are also presented of the influencing factors in feature matching methods, of mismatch removal methods, of hand-crafted and learned descriptors, and of the matching objects and implementation forms of semantic matching methods. The corresponding papers and code download addresses of these methods are provided, and future research directions for image matching algorithms are discussed. Conclusion Image matching is the basis for subsequent high-level processing in computer vision and is widely used in medical image analysis, satellite image processing, and remote sensing image processing. At present, further research is required on wide-baseline and real-time matching.
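
As a minimal illustration of descriptor-based feature point matching followed by a simple mismatch filter, the Python sketch below matches two small sets of descriptor vectors by nearest-neighbour search, applies the nearest/second-nearest ratio test known from SIFT matching, and keeps only mutually consistent pairs. It is not an implementation of any surveyed outlier removal method (such as GMS or VFC). The function names, the ratio threshold of 0.8, and the toy descriptors are assumptions chosen for illustration only.

```python
import math


def l2(a, b):
    """Euclidean distance between two equal-length descriptor vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def ratio_test_matches(desc_a, desc_b, ratio=0.8):
    """Match each descriptor in desc_a to its nearest neighbour in desc_b.

    A match (i, j) is kept only if the nearest distance is clearly smaller
    than the second-nearest distance (ratio test); ambiguous matches are
    dropped. `ratio` is an illustrative threshold, not a tuned value.
    """
    matches = []
    for i, d in enumerate(desc_a):
        # Rank all candidates in desc_b by distance to descriptor d.
        ranked = sorted(range(len(desc_b)), key=lambda j: l2(d, desc_b[j]))
        if len(ranked) < 2:
            continue  # the ratio test needs at least two candidates
        best, second = ranked[0], ranked[1]
        if l2(d, desc_b[best]) < ratio * l2(d, desc_b[second]):
            matches.append((i, best))
    return matches


def mutual_matches(desc_a, desc_b, ratio=0.8):
    """Keep only matches found in both directions (A to B and B to A)."""
    forward = set(ratio_test_matches(desc_a, desc_b, ratio))
    backward = {(i, j) for j, i in ratio_test_matches(desc_b, desc_a, ratio)}
    return sorted(forward & backward)


# Hypothetical 3-dimensional descriptors standing in for real ones.
desc_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.5, 0.5, 0.0]]
desc_b = [[0.0, 0.9, 0.1], [0.9, 0.1, 0.0], [0.0, 0.0, 1.0]]

matches = mutual_matches(desc_a, desc_b)
print(matches)
```

In this toy example the third descriptor of `desc_a` lies almost equally close to two candidates and is rejected as ambiguous, which is the kind of mismatch that the ratio test is meant to suppress before any geometric or motion-based outlier removal is applied.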
