Efficient dense matching method for image pairs with repeated texture and non-rigid deformation

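Jia Di1, Zhao Mingyuan1, Yang Ninghua1, Zhu Ningdan1, Meng Lu2 (1. School of Electronic and Information Engineering, Liaoning Technical University, Huludao 125105; 2. College of Information Science and Engineering, Northeast University, Shenyang 110819)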

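Abstract
Objective Dense matching of image pairs underlies advanced image processing tasks such as 3D reconstruction and SLAM (simultaneous localization and mapping). Excessively wide baselines, repeated texture, non-rigid deformation, and poor time and space efficiency are the main factors that limit the practicality of such methods. To address these problems, this paper proposes an efficient dense matching method for repeated texture and non-rigid deformation. Method First, the DeepMatching algorithm is used to obtain a set of matching points for the downsampled image pair, and the random sample consensus algorithm removes the outliers. Next, the matching result from the previous step is used to estimate the camera pose and the scaling ratio, which determine the neighborhood of each point pair during densification; HOG descriptors are then extracted from the neighborhoods of the corresponding point pairs and convolved to obtain a score matrix. Finally, new matching point pairs are determined from the values of the normalized score matrix and the variance of the index distances, which achieves densification. Result Experiments were conducted on several public datasets using image pairs of the same size with a 4:3 aspect ratio. The results show that the method has a degree of robustness to rotation, scale change, and deformation, and that it matches image pairs with repeated texture and non-rigid deformation well under wide-baseline conditions. Compared with the DeepMatching algorithm, the method improves precision, space efficiency, and time efficiency by nearly 10%, 25%, and 30%, respectively. Conclusion The proposed dense matching method achieves high precision and high time and space efficiency, and its results can be used in advanced image processing techniques such as 3D reconstruction and super-resolution reconstruction.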
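The densification step described in the abstract can be illustrated with a minimal Python sketch. It assumes that every point in a neighborhood already has a descriptor vector (plain lists stand in for HOG descriptors), uses cosine similarity as the score, and leaves out the variance test on index distances. The names `score_matrix` and `densify` and the `min_score` threshold are hypothetical and do not come from the paper.

```python
import math

def unit(v):
    """Scale a descriptor to unit L2 norm (zero vectors are returned unchanged)."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v] if n > 0 else list(v)

def score_matrix(desc_a, desc_b):
    """sim[i][j] = cosine similarity between descriptor i of A and descriptor j of B."""
    ua = [unit(d) for d in desc_a]
    ub = [unit(d) for d in desc_b]
    return [[sum(p * q for p, q in zip(a, b)) for b in ub] for a in ua]

def densify(sim, rel_a, rel_b, origin_a, origin_b, min_score=0.9):
    """For each point of A, keep its best-scoring point of B as a new match.

    rel_a and rel_b hold coordinates relative to the neighborhood origins;
    the returned matches are in absolute image coordinates.
    min_score is an assumed acceptance threshold, not a value from the paper.
    """
    matches = []
    for i, row in enumerate(sim):
        j = max(range(len(row)), key=row.__getitem__)  # index of the row maximum
        if row[j] >= min_score:
            pa = (origin_a[0] + rel_a[i][0], origin_a[1] + rel_a[i][1])
            pb = (origin_b[0] + rel_b[j][0], origin_b[1] + rel_b[j][1])
            matches.append((pa, pb))
    return matches

# Toy usage: two points per neighborhood with 2-D descriptors.
sim = score_matrix([[1.0, 0.0], [0.0, 2.0]], [[0.0, 1.0], [3.0, 0.0]])
new_matches = densify(sim, [(0, 0), (1, 0)], [(0, 0), (1, 0)], (10, 20), (40, 50))
```

Normalizing the descriptors first keeps every score in a fixed range, so a single threshold can be applied to all neighborhoods.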
Keywords
Efficient dense matching method for repeated texture and non-rigid deformation

Jia Di1, Zhao Mingyuan1, Yang Ninghua1, Zhu Ningdan1, Meng Lu2(1.School of Electronic and Information Engineering, Liaoning Technical University, Huludao 125105, China;2.College of Information Science and Engineering, Northeast University, Shenyang 110819, China)

Abstract
Objective Dense matching between images is the basis of 3D reconstruction, SLAM (simultaneous localization and mapping), and other advanced image processing methods. However, excessively wide baselines, repeated texture, non-rigid deformation, and poor time and space efficiency limit the practicability of such methods. To address these problems, this study proposes an efficient dense matching method for repeated texture and non-rigid deformation. Method First, the source and target images are scaled by a factor α via bilinear interpolation. A set S of matching points is obtained via DeepMatching (DM), and the outliers are eliminated by random sample consensus (RANSAC). Second, the matching set S is used to estimate the camera pose x and the scaling factor α, which determine the neighborhood of each point during densification. Third, the score matrix Sim is obtained by convolving the HOG (histogram of oriented gradients) descriptors extracted from the corresponding neighborhoods. The score matrix Sim, which is composed of the similarity scores between all points in the neighborhood, is the most important concept in our method because it connects two major steps: selecting the appropriate convolution region and determining the new matching points. The size and position of the convolution region, which are determined by the scaling factor α and the camera pose x, respectively, define the appropriate neighborhood. This choice of convolution neighborhood remains stable under rotation and scaling. Finally, new matching points are determined according to the values of the normalized score matrix Sim and the variance of its index distances, which achieves densification. In this step, the relative coordinates of the maximum value in each group of Sim are mapped back to absolute coordinates in the input image. Result The code is implemented in Visual Studio 2013 with Intel MKL 2015 and OpenCV 3.
Image pairs of the same size with an aspect ratio of 4:3 from the Mikolajczyk, MPI-Sintel, and KITTI datasets are used for the experiments on a machine with a 3.8 GHz CPU and 8 GB of RAM. To evaluate our method comprehensively and objectively, we select multiple sets of images of different sizes and compare the time usage, memory usage, and precision of the proposed method with those of DeepMatching. To illustrate the problems that the proposed method addresses, it is applied to image pairs with repeated texture and with non-rigid deformation. With repeated texture, the method handles matching under rotation and scaling and also matches repeated texture under a wide baseline; it also performs well under non-rigid deformation. In the time and space efficiency tests on the three datasets, the proposed algorithm outperforms DM, especially on large images. To make processing times easy to compare, the experiment was run on the KITTI dataset and the median of the results was reported. When α was set to 0.5, both the execution time and the memory usage of the algorithm were low, and the match density per unit pixel was similar to that obtained on the original image (α=1). To evaluate accuracy, a pixel was considered correct if its match in the second image was closer than 8 pixels to the ground truth, with some tolerance allowed in blurred areas that were difficult to match exactly.
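The 8-pixel correctness criterion can be written down as a short Python sketch. It assumes the predicted and ground-truth positions in the second image are given as parallel lists of (x, y) tuples; the function name `precision` is hypothetical, and the extra tolerance for blurred areas mentioned above is not modeled.

```python
import math

def precision(pred, truth, tol=8.0):
    """Fraction of predicted matches closer than `tol` pixels to the ground truth."""
    if not pred:
        return 0.0
    hits = sum(
        1
        for (px, py), (gx, gy) in zip(pred, truth)
        if math.hypot(px - gx, py - gy) < tol  # strict: "closer than" tol
    )
    return hits / len(pred)

# Toy usage: distances to the ground truth are 0, about 14.1, and 5 pixels.
score = precision([(0, 0), (10, 10), (3, 4)], [(0, 0), (0, 0), (0, 0)])
```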
Because our method uses the camera pose to eliminate some outliers when determining the center of the neighborhood, its accuracy is better than that of DM when the image size is between 16 and 512. As the image size increases to between 512 and 1 024, the number of DM inliers grows and the proportion of DM outliers shrinks, so the accuracy of DM and that of our method become essentially the same. In summary, across the above datasets, the precision of this method is better than that of using the DeepMatching algorithm directly (an average increase of about 10%). Moreover, as the image size increases, memory and time efficiency improve by nearly 25% and 30%, respectively. Conclusion To verify the effectiveness of the proposed method, its time usage, memory usage, and precision are compared with those of DeepMatching on multiple public datasets. Precision, memory efficiency, and time efficiency improve by nearly 10%, 25%, and 30%, respectively. The method mitigates the effects of wide baseline, repeated texture, and non-rigid deformation on the robustness and efficiency of matching. We account for rotation and scaling to make the algorithm versatile. To further improve versatility and practicality, we will integrate this method into advanced image processing applications such as 3D reconstruction and SLAM.
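The outlier rejection and scale estimation in the Method part can be illustrated with a small RANSAC sketch in Python. This is not the paper's camera pose estimation: as a simplifying assumption, a 2D similarity transform q = a·p + b over complex numbers stands in for the pose model, so that |a| plays the role of the scaling factor and the argument of a the rotation. The function names, the tolerance `tol`, and the iteration count are hypothetical.

```python
import random

def fit_similarity(p1, q1, p2, q2):
    """Solve q = a*p + b for complex a and b from two correspondences."""
    if p1 == p2:
        return None  # degenerate sample: the two source points coincide
    a = (q1 - q2) / (p1 - p2)
    return a, q1 - a * p1

def ransac_similarity(src, dst, tol=3.0, iters=200, seed=0):
    """Return (a, b, inlier_indices) for the model supported by the most matches."""
    rng = random.Random(seed)
    best = (None, None, [])
    for _ in range(iters):
        i, j = rng.sample(range(len(src)), 2)  # minimal sample of two matches
        model = fit_similarity(src[i], dst[i], src[j], dst[j])
        if model is None:
            continue
        a, b = model
        inliers = [k for k in range(len(src)) if abs(a * src[k] + b - dst[k]) < tol]
        if len(inliers) > len(best[2]):
            best = (a, b, inliers)
    return best

# Toy usage: five matches follow scale 0.5 plus a shift, the last one is an outlier.
src = [0 + 0j, 100 + 0j, 0 + 100j, 100 + 100j, 50 + 20j, 30 + 70j]
dst = [0.5 * p + (10 + 5j) for p in src[:5]] + [200 + 200j]
a, b, inliers = ransac_similarity(src, dst)
scale = abs(a)  # estimated scaling factor
```

Points are encoded as complex numbers so that rotation and scaling reduce to one complex multiplication, which keeps the minimal sample at two matches.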
Keywords
