
Published: 2019-01-16
DOI: 10.11834/jig.180343
2019 | Volume 24 | Number 1




    Image Processing and Coding    














Parallax image stitching using line-constraint moving least squares
Fan Yiqing1, Li Haisheng1, Chu Dongdong2
1. Department of Computer Science and Technology, East China Normal University, Shanghai 200062, China;
2. Software Center Shanghai Branch, Bank of China, Shanghai 201201, China

Abstract

Objective Image alignment is a key factor in stitching performance. Image deformation is a critical step of the alignment model for parallax image stitching and directly determines the alignment quality. Accurately aligning all points in an overlapping region of parallax images is difficult. Thus, an alignment strategy that can produce visually satisfying stitching results must be developed. Recent state-of-the-art stitching methods typically combine homography with content-preserving warping. Either homography is first used to pre-align two images and is followed by content-preserving warping to refine alignment, or the mesh deformation is globally optimized by solving an energy function, which is a weighted linear combination of homography and content-preserving warping. Both approaches commonly use homography in the aligning phase and therefore easily produce perspective distortion. Moreover, these approaches may misalign the object edges of images with several dominant structural objects. To address these problems, this paper presented a novel stitching method that combines homography, deformation using moving least squares (MLS), and line constraint. The deformation method based on MLS has an interpolation property and can therefore accurately align matching feature points. However, this deformation method may distort structural regions; thus, a line-constraint term was added to the deformation model to preserve the structure. Method To attain a clear depiction, we considered a two-image stitching as an example. The two input images are called the target and reference images, respectively, and are denoted by $\mathit{\boldsymbol{T}}$ and $\mathit{\boldsymbol{R}}$. First, feature detection and matching estimation were conducted using SIFT and RANSAC, followed by distance similarity to check the matching accuracy of the feature points. The homography (denoted by $\mathit{\boldsymbol{H}}$) with the best geometric fit was selected. 
Then, $\mathit{\boldsymbol{H}}$ was applied to the target image $\mathit{\boldsymbol{T}}$, and the transformed image was denoted as ${\mathit{\boldsymbol{T}}_H}$. Afterward, the two image pairs ($\mathit{\boldsymbol{T}}$, $\mathit{\boldsymbol{R}}$) and (${\mathit{\boldsymbol{T}}_H}$, $\mathit{\boldsymbol{R}}$) were aligned using a line-constraint MLS. To eliminate perspective distortion in the deformation image, affine transformation was used in MLS. However, a simple affine transformation was insufficient to handle the parallax. Thus, an additional pair of images (${\mathit{\boldsymbol{T}}_H}$, $\mathit{\boldsymbol{R}}$) was processed as a candidate stitching result for the pair of images ($\mathit{\boldsymbol{T}}$, $\mathit{\boldsymbol{R}}$). Experiments revealed that many examples obtained a more natural stitching result when only the affine transformation, rather than the composite of homography and affine transformation, was applied, implying that the alignment between $\mathit{\boldsymbol{T}}$ and $\mathit{\boldsymbol{R}}$ was better than that between ${\mathit{\boldsymbol{T}}_H}$ and $\mathit{\boldsymbol{R}}$. Taking the deformation from the target image $\mathit{\boldsymbol{T}}$ to the reference image $\mathit{\boldsymbol{R}}$ as an example, the line-constraint MLS was outlined as follows. First, the four corner points of $\mathit{\boldsymbol{T}}$ were deformed to the coordinate system of $\mathit{\boldsymbol{R}}$ by using matching feature points as control points based on MLS. Then, we deformed the remaining points on the four border lines (top, bottom, left, and right boundaries) of $\mathit{\boldsymbol{T}}$ by using line-constraint MLS. Here, the line constraint was constructed by preserving the relative position of each point of a border line, based on which a deformation objective function was developed. 
Similarly, we handled the internal points of $\mathit{\boldsymbol{T}}$ by using vertical and horizontal grid lines as constraint conditions, with each internal point constrained by the horizontal and vertical grid lines passing through it. Finally, the quality of each alignment was evaluated, and the better one was chosen for blending. In the overlapping regions, the max-flow min-cut algorithm was used to find the best stitching seam-cut of the two alignments and assess the alignment quality along the seam-cut. The assessment of the alignment quality mainly considered the color and structural differences between overlapping regions of two images, with the structure reflected by gradients. Then, a feathering approach was used to blend the two images of the best alignment. Result To test our stitching algorithm, 23 pairs of pictures, which cover commonly seen natural and man-made scenes, were captured. In addition, we conducted several experiments on publicly published data provided by recent related works. The experimental results demonstrated that the alignment accuracy of our method exceeded 95%, and the ratio of perspective distortion was lower than 17%. Compared with recent state-of-the-art methods, our method's alignment accuracy was higher by 3%, and the ratio of perspective distortion was lower by 73%. Therefore, our method exhibits a better performance in handling image stitching with a large parallax, and the stitching result is authentic and natural. Conclusion This paper presented a hybrid transformation for aligning two images that combines line constraint with MLS. In addition, an alignment quality evaluation rule was introduced by computing the weighted differences of the points along the stitching seam-cut and the remaining points in the overlapping region. 
As the proposed method can balance alignment accuracy and structure preservation, it can address the misalignment issues easily caused by current stitching approaches for parallax images and effectively reduce stitching artifacts, such as ghosting and distortion.

Key words

image alignment; parallax image; line constraint; moving least squares; image stitching; max-flow min-cut

0 Introduction

Image stitching combines multiple overlapping images into a single large, high-resolution image and is a long-standing research topic in computer vision and graphics. In general, stitching consists of two basic steps, image alignment and composition; whether the alignment is correct directly affects the subsequent blending and determines the quality of the final result. Pixel-based and feature-based matching are the two main classes of alignment methods [1]. The former aligns images by directly computing pixel-wise differences between them, whereas the latter uses image features to establish matching regions and warps the images accordingly. With the advent of feature descriptors invariant to image scale and orientation, together with efficient and reliable feature detection and matching methods [2-3], feature-based alignment is stable and accurate and has become the mainstream approach in image stitching [4-5].

It is well known that stitching is relatively easy when the scene is approximately planar or the images are taken by a camera rotating about a fixed center; the technology is mature, and many commercial packages handle such cases almost perfectly. However, these methods and tools tend to produce obvious artifacts on parallax images, because their alignment models do not suit such images. Although blending techniques [6-7] can attenuate stitching seams, they cannot fundamentally resolve the misalignment and structure breakage caused by incorrect registration. Many researchers have therefore studied this problem and proposed methods with better performance. These methods share essentially the same pipeline of three steps: feature detection and matching, feature-based image warping, and seam computation plus blending. Most of the work focuses on the second step, the alignment method, which can be divided into three categories according to the image transformation used:

1) Global 2D transformation. References [8-10] are stitching methods based on SIFT features, whereas Cao et al. [11] use edge feature points for alignment to reduce computational complexity. Brown et al. [8] align images with a single optimal homography. Gao et al. [9] propose a dual-homography alignment model that can handle images dominated by a distant plane and a background plane. Seam-driven stitching [10] first computes several candidate homographies, then evaluates the alignment quality of each and selects the best one as the alignment model. These methods work well for images taken from a fixed camera center or of approximately planar scenes, but they cannot cope with general parallax images.

2) Spatially varying warping. Lin et al. [12] propose an alignment model based on a smoothly varying, pointwise affine transformation. Because it uses only affine transformations, it is not ideal for images with strong perspective. Zaragoza et al. [13] propose moving DLT (direct linear transformation) for alignment; this method is similar to deformation using moving least squares [14], except that the affine transformation is replaced by a homography. Compared with global 2D transformations, these methods are more flexible and produce better alignment, but they are prone to perspective and geometric distortion.

3) Combinations of the above two transformations. Typically, a homography first pre-aligns the images, and then content- or feature-preserving warping fine-tunes the alignment between matching feature points. In 2014, Zhang et al. [15] combined homography with content-preserving warping into a hybrid alignment model. In 2016, Lin et al. [16] proposed another combination: the feature points are first divided into groups, and several local homographies generated from each group (or combinations of groups) roughly align the images; feature-weighted, structure-preserving warping then iteratively refines each local alignment, the seam quality of each alignment is evaluated, and the best one is selected. Both methods locally correct the pre-alignment under the constraint that the relative position of each feature point within its mesh cell remains unchanged before and after warping, so as to align the regions near matching feature points as well as possible. Such combined methods outperform earlier ones overall, but they remove only a limited amount of the feature-point alignment error introduced by the homography pre-alignment and may still produce local misalignment.

Guaranteeing the alignment accuracy of corresponding feature points while preserving image structure thus remains a challenging problem in parallax image stitching. Reference [14] applies moving least squares to obtain a pointwise image deformation that maps control points exactly to their target positions. Using this method for alignment registers corresponding feature points accurately, but the deformation does not consider global consistency and easily breaks structural objects. Building on that method, this paper adds a line-constraint term to preserve structure as much as possible, and proposes a new parallax image stitching method based on this alignment model.

1 Line-constraint moving least squares

Stitching quality depends on the alignment result, and feature-based alignment relies on the feature point sets of the images. An ideal alignment model would map every feature point of one image exactly onto its match in the other image. For parallax images, a single global transformation can hardly meet this requirement; only multiple, or even pointwise, transformations can reach such accuracy. We call the former global alignment and the latter local alignment. Global alignment may be locally inaccurate but does not damage image structure, whereas local alignment builds a transformation for each region (or each point); the weak consistency among these transformations easily breaks lines and distorts structures. A feasible way to improve parallax stitching quality is therefore local alignment plus structure-preserving constraints. Following this idea, this paper presents a structure-preserving local alignment method, line-constraint moving least squares, which modifies the energy function used in [14] to solve for the transformation so that it balances accurate feature correspondence with structural integrity. The method is described below using the alignment of two images as an example.

Like most image warping algorithms, our method does not operate on individual pixels; instead, the image is uniformly partitioned into a grid, the grid vertices are deformed, and each cell is then filled by texture mapping. For convenience, we first introduce some notation:

1) The two source images are called the target image and the reference image, denoted by $\mathit{\boldsymbol{T}}$ and $\mathit{\boldsymbol{R}}$.

2) Each image is uniformly partitioned into $m$ (rows) × $n$ (columns) cells with grid vertices ${\mathit{\boldsymbol{v}}_{kj}}$ ($k=0, 1, …, m;\ j=0, 1, …, n$); ${\mathit{\boldsymbol{\tilde v}}_{kj}}$ denotes the deformed position of ${\mathit{\boldsymbol{v}}_{kj}}$. The feature point set of $\mathit{\boldsymbol{T}}$ is written {${\mathit{\boldsymbol{p}}_i}$} and the matching set in $\mathit{\boldsymbol{R}}$ is {${\mathit{\boldsymbol{q}}_i}$}, i.e., the point corresponding to ${\mathit{\boldsymbol{p}}_i}$ is ${\mathit{\boldsymbol{q}}_i}$, where $i$ indexes the feature points.

The alignment proceeds as follows:

1) For the four corner points ${\mathit{\boldsymbol{v}}_{kj}}$ ($k=0, m;\ j=0, n$) of image $\mathit{\boldsymbol{T}}$, construct an affine transformation ${F_v}(\mathit{\boldsymbol{p}}){\rm{ }} = \mathit{\boldsymbol{p}}{\mathit{\boldsymbol{M}}_v} + {\mathit{\boldsymbol{T}}_v}$, where $\mathit{\boldsymbol{p}}$ and ${\mathit{\boldsymbol{T}}_v}$ are row vectors and ${\mathit{\boldsymbol{M}}_v}$ is a 2×2 matrix, to map them into the coordinate system of $\mathit{\boldsymbol{R}}$. ${F_v}$ is obtained by solving the optimization problem

$ {\rm{arg}}\;\mathop {{\rm{min}}}\limits_{{F_v}} \sum\limits_i {{w_i}} {\left| {{F_v}({\mathit{\boldsymbol{p}}_i})-{\mathit{\boldsymbol{q}}_i}} \right|^2} $ (1)

where ${w_i} = 1/{\left| {{\mathit{\boldsymbol{p}}_i}-\mathit{\boldsymbol{v}}} \right|^2}$. This is a classical quadratic optimization problem, whose solution is

$ \begin{array}{l} {\mathit{\boldsymbol{M}}_v} = {({\sum\limits_i {({\mathit{\boldsymbol{p}}_i}-{\mathit{\boldsymbol{p}}_*})} ^{\rm{T}}}{w_i}({\mathit{\boldsymbol{p}}_i}-{\mathit{\boldsymbol{p}}_*}))^{-1}}\cdot\\ \;\;\;\;\;\;\;\;\;\;\sum\limits_i {{{({\mathit{\boldsymbol{p}}_i} - {\mathit{\boldsymbol{p}}_*})}^{\rm{T}}}} {w_i}({\mathit{\boldsymbol{q}}_i} - {\mathit{\boldsymbol{q}}_*})\\ \;\;\;\;\;\;\;\;\;\;\;\;\;\;\;{\mathit{\boldsymbol{T}}_v} = {\mathit{\boldsymbol{q}}_*} - {\mathit{\boldsymbol{p}}_*}{\mathit{\boldsymbol{M}}_v} \end{array} $

where ${\mathit{\boldsymbol{p}}_*}$ and ${\mathit{\boldsymbol{q}}_*}$ are the weighted centroids

$ {\mathit{\boldsymbol{p}}_*} = \frac{{\sum\limits_i {{w_i}{\mathit{\boldsymbol{p}}_i}} }}{{\sum\limits_i {{w_i}} }}, \;{\mathit{\boldsymbol{q}}_*} = \frac{{\sum\limits_i {{w_i}{\mathit{\boldsymbol{q}}_i}} }}{{\sum\limits_i {{w_i}} }} $
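The closed-form solution of problem (1) can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' code; the function names (`affine_mls`, `deform`) and the `eps` guard against a vertex coinciding with a control point are my additions.

```python
import numpy as np

def affine_mls(v, P, Q, eps=1e-8):
    """Minimize sum_i w_i |p_i M + T - q_i|^2 with w_i = 1/|p_i - v|^2 (Eq. (1))."""
    P, Q = np.asarray(P, float), np.asarray(Q, float)
    w = 1.0 / (np.sum((P - v) ** 2, axis=1) + eps)   # eps avoids division by zero
    p_star = (w[:, None] * P).sum(0) / w.sum()       # weighted centroids p*, q*
    q_star = (w[:, None] * Q).sum(0) / w.sum()
    Pc, Qc = P - p_star, Q - q_star
    A = (Pc * w[:, None]).T @ Pc                     # 2x2 weighted normal matrix
    B = (Pc * w[:, None]).T @ Qc
    M = np.linalg.solve(A, B)                        # M_v from the closed form
    T = q_star - p_star @ M                          # T_v = q* - p* M_v
    return M, T

def deform(v, P, Q):
    """Deformed position F_v(v) of a grid vertex v."""
    M, T = affine_mls(v, P, Q)
    return np.asarray(v, float) @ M + T
```

With control points that already satisfy an exact affine map, the weighted fit recovers that map exactly, which is a quick sanity check of the solver.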

2) The vertices on the four border lines (top, bottom, left, and right) of $\mathit{\boldsymbol{T}}$ are likewise deformed into the reference coordinate system by an affine transformation ${F_v}$, but the energy function used to solve for ${F_v}$ differs: a structure-preserving constraint term is added

$ \begin{array}{l} {\rm{arg}}\;\mathop {{\rm{min}}}\limits_{{F_v}} \sum\limits_i {{w_i}} {\left| {{F_v}({\mathit{\boldsymbol{p}}_i})-{\mathit{\boldsymbol{q}}_i}} \right|^2} + \\ {w_{\rm{l}}}{\left| {{F_v}\left( v \right)-u{{\mathit{\boldsymbol{\tilde v}}}_{\rm{s}}}-\left( {1 - u} \right){{\mathit{\boldsymbol{\tilde v}}}_{\rm{e}}}} \right|^2} \end{array} $ (2)

where the second term is the line constraint and ${w_{\rm{l}}}$ is the line-constraint weight; $u$ satisfies $\mathit{\boldsymbol{v}} = u{\mathit{\boldsymbol{v}}_{\rm{s}}} + (1-u){\mathit{\boldsymbol{v}}_{\rm{e}}}$, with ${\mathit{\boldsymbol{v}}_{\rm{s}}}$ and ${\mathit{\boldsymbol{v}}_{\rm{e}}}$ the two endpoints of the top/bottom (or left/right) border containing $\mathit{\boldsymbol{v}}$. For interior points ${\mathit{\boldsymbol{v}}_{kj}}$ ($k=0, m;\ j=1, 2, …, n-1$) of the top and bottom borders, ${\mathit{\boldsymbol{v}}_{\rm{s}}} = {\mathit{\boldsymbol{v}}_{k0}}, {\mathit{\boldsymbol{v}}_{\rm{e}}} = {\mathit{\boldsymbol{v}}_{kn}}$; for interior points ${\mathit{\boldsymbol{v}}_{kj}}$ ($k=1, 2, …, m-1;\ j=0, n$) of the left and right borders, ${\mathit{\boldsymbol{v}}_{\rm{s}}} = {\mathit{\boldsymbol{v}}_{0j}}, {\mathit{\boldsymbol{v}}_{\rm{e}}} = {\mathit{\boldsymbol{v}}_{mj}}$.

3) To keep the transformations consistent, the remaining grid vertices ${\mathit{\boldsymbol{v}}_{kj}}$ ($k \ne 0, m$; $j \ne 0, n$) of image $\mathit{\boldsymbol{T}}$ are also transformed by an affine ${F_v}$, but with stronger structure preservation: constraints are imposed along both the horizontal and the vertical grid lines. The model is

$ \begin{array}{l} {\rm{arg}}\;\mathop {{\rm{min}}}\limits_{{F_v}} \sum\limits_i {{w_i}} {\left| {{F_v}({\mathit{\boldsymbol{p}}_i})-{\mathit{\boldsymbol{q}}_i}} \right|^2} + \\ {w_{{\rm{hl}}}}{\left| {{F_v}\left( \mathit{\boldsymbol{v}} \right)-u{{\mathit{\boldsymbol{\tilde v}}}_{{\rm{hs}}}}-\left( {1 - u} \right){{\mathit{\boldsymbol{\tilde v}}}_{{\rm{he}}}}} \right|^2} + \\ {w_{{\rm{vl}}}}{\left| {{F_v}\left( \mathit{\boldsymbol{v}} \right) - \lambda {{\mathit{\boldsymbol{\tilde v}}}_{{\rm{vs}}}} - \left( {1 - \lambda } \right){{\mathit{\boldsymbol{\tilde v}}}_{{\rm{ve}}}}} \right|^2} \end{array} $ (3)

where $u$ satisfies $\mathit{\boldsymbol{v}} = {\rm{ }}u{\mathit{\boldsymbol{v}}_{{\rm{hs}}}} + (1-u){\mathit{\boldsymbol{v}}_{{\rm{he}}}}$, with ${\mathit{\boldsymbol{v}}_{{\rm{hs}}}}$ and ${\mathit{\boldsymbol{v}}_{{\rm{he}}}}$ the two endpoints of the horizontal grid line through $\mathit{\boldsymbol{v}}$; $\lambda$ satisfies $\mathit{\boldsymbol{v}} = \lambda {\mathit{\boldsymbol{v}}_{{\rm{vs}}}} + (1-\lambda){\mathit{\boldsymbol{v}}_{{\rm{ve}}}}$, with ${\mathit{\boldsymbol{v}}_{{\rm{vs}}}}$ and ${\mathit{\boldsymbol{v}}_{{\rm{ve}}}}$ the two endpoints of the vertical grid line through $\mathit{\boldsymbol{v}}$. ${w_{{\rm{hl}}}}$ and ${w_{{\rm{vl}}}}$ are the horizontal and vertical line-constraint weights, respectively. Treating the vertices in the line-constraint terms of problems (2) and (3) as virtual pairs of matching feature points, both problems can be solved with the same method as problem (1).
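The reduction of problems (2) and (3) to problem (1) via virtual control points can be sketched as follows. This is a NumPy illustration under my own naming (`weighted_affine`, `line_constraint_deform`); the line-constraint term of Eq. (2) is appended as one extra weighted pair (the target being the interpolated point on the already-deformed border line), and the same closed-form solver is reused.

```python
import numpy as np

def weighted_affine(P, Q, w):
    """Closed-form minimizer of sum_i w_i |p_i M + T - q_i|^2."""
    P, Q, w = np.asarray(P, float), np.asarray(Q, float), np.asarray(w, float)
    p_star = (w[:, None] * P).sum(0) / w.sum()
    q_star = (w[:, None] * Q).sum(0) / w.sum()
    Pc, Qc = P - p_star, Q - q_star
    M = np.linalg.solve((Pc * w[:, None]).T @ Pc, (Pc * w[:, None]).T @ Qc)
    return M, q_star - p_star @ M

def line_constraint_deform(v, P, Q, vs_t, ve_t, u, w_line, eps=1e-8):
    """Eq. (2): append the virtual pair (v, u*v~_s + (1-u)*v~_e) with
    weight w_l and reuse the Eq. (1) solver on the augmented set."""
    v = np.asarray(v, float)
    w = 1.0 / (np.sum((np.asarray(P, float) - v) ** 2, axis=1) + eps)
    target = u * np.asarray(vs_t, float) + (1 - u) * np.asarray(ve_t, float)
    P2 = np.vstack([P, v])            # virtual control point: v itself
    Q2 = np.vstack([Q, target])       # its target on the deformed line
    w2 = np.append(w, w_line)         # line-constraint weight w_l
    M, T = weighted_affine(P2, Q2, w2)
    return v @ M + T
```

Setting `w_line = 0` recovers the plain MLS deformation, while a very large `w_line` pins the vertex onto the deformed line, which matches the intended trade-off between feature accuracy and structure preservation.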

Fig. 1 shows a comparison. Figures 1(a)(b) are the input target and reference images together with the matches between feature points filtered by step 1) of Section 3; the red box in Fig. 1(c) marks a distorted structure.

Fig. 1 Image deformation using matching feature point pairs as control points ((a) target image; (b) reference image; (c) deformed image using MLS; (d) deformed image using our line-constraint MLS)

2 Alignment quality evaluation

In general, alignment accuracy between images can be measured by how well matching feature point pairs coincide. For parallax images, however, the alignment obtained under this criterion is not necessarily ideal: exact correspondence between feature points may break some structural objects, which are often visually important content, so the resulting stitch is visually unsatisfactory. We therefore evaluate alignment quality through the final stitching result, focusing on the region near the stitching seam and considering both color and structural differences.

The target image is warped into the coordinate system of the reference image, and the pixel differences within the overlap region are computed to assess the alignment. The difference has two parts: color intensity and structural content. Since gradients reflect the structural components of an image, the difference is defined as

$ \begin{array}{l} D\left( \mathit{\boldsymbol{v}} \right) = \left| {{I_1}\left( \mathit{\boldsymbol{v}} \right)-{I_2}\left( \mathit{\boldsymbol{v}} \right)} \right| + \\ \;\;\;\;\;\left| {\nabla {I_1}\left( \mathit{\boldsymbol{v}} \right)-\nabla {I_2}\left( \mathit{\boldsymbol{v}} \right)} \right| \end{array} $ (4)

where $\mathit{\boldsymbol{v}}$ is a pixel, ${I_1}(\mathit{\boldsymbol{v}})$ and ${I_2}(\mathit{\boldsymbol{v}})$ are the color intensities of the warped and reference images, and $\nabla {I_1}(\mathit{\boldsymbol{v}})$ and $\nabla {I_2}(\mathit{\boldsymbol{v}})$ are their gradients. Because blending is performed along the optimal seam found in the overlap region, this difference should come mainly from the vicinity of the seam. The overlap region is split into two parts, $\mathit{\boldsymbol{ \boldsymbol{ \varOmega} }} $ and $\mathit{\boldsymbol{ \boldsymbol{\overline \varOmega} }} $, where $\mathit{\boldsymbol{ \boldsymbol{\varOmega} }} = \mathop \cup \limits_i {\mathit{\boldsymbol{N}}_5}({\mathit{\boldsymbol{p}}_i})$, the ${{\mathit{\boldsymbol{p}}_i}}$ are the points on the seam, ${\mathit{\boldsymbol{N}}_5}({\mathit{\boldsymbol{p}}_i})$ is the 5×5 neighborhood centered at ${\mathit{\boldsymbol{p}}_i}$, and $\mathit{\boldsymbol{ \boldsymbol{\overline \varOmega} }} $ contains the remaining points. The evaluation function is then defined as

$ E = \alpha \frac{{\sum\limits_{v \in \mathit{\boldsymbol{ \boldsymbol{\varOmega} }}} {D\left( \mathit{\boldsymbol{v}} \right)} }}{{\left| \mathit{\boldsymbol{ \boldsymbol{\varOmega} }} \right|}} + \left( {1-\alpha } \right)\frac{{\sum\limits_{v \in \mathit{\boldsymbol{ \boldsymbol{\overline \varOmega} }} } {D\left( \mathit{\boldsymbol{v}} \right)} }}{{\left| \mathit{\boldsymbol{ \boldsymbol{\overline \varOmega} }} \right|}} $ (5)

where $α$ is a weight, set to 2/3 in our experiments. The criterion is: the smaller $E$, the better the alignment.
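Equations (4) and (5) can be sketched in NumPy as follows. This is an illustrative version for grayscale images, with my own function names; `np.gradient` and an L1 sum over gradient components stand in for $\nabla I$ and $|\nabla I_1 - \nabla I_2|$.

```python
import numpy as np

def difference_map(I1, I2):
    """Eq. (4): per-pixel color difference plus gradient difference."""
    I1, I2 = I1.astype(float), I2.astype(float)
    gy1, gx1 = np.gradient(I1)
    gy2, gx2 = np.gradient(I2)
    # L1 over gradient components as a stand-in for |grad I1 - grad I2|
    return np.abs(I1 - I2) + np.abs(gx1 - gx2) + np.abs(gy1 - gy2)

def alignment_score(I1, I2, seam, alpha=2/3):
    """Eq. (5): weighted mean difference inside and outside the seam band,
    where Omega is the union of 5x5 neighborhoods of the seam pixels."""
    D = difference_map(I1, I2)
    omega = np.zeros(D.shape, bool)
    for (r, c) in seam:                      # mark N5(p_i) for each seam pixel
        omega[max(r - 2, 0):r + 3, max(c - 2, 0):c + 3] = True
    d_in = D[omega].mean() if omega.any() else 0.0
    d_out = D[~omega].mean() if (~omega).any() else 0.0
    return alpha * d_in + (1 - alpha) * d_out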

3 Structure-preserving stitching algorithm with local alignment

Line-constraint moving least squares aligns accurately at the local level while preserving image structure as much as possible, making it a good tool for parallax image alignment. Based on it, we propose a new parallax image stitching method, called structure-preserving stitching with local alignment, whose steps are:

1) Detect the SIFT feature points of the target and reference images $\mathit{\boldsymbol{T}}$ and $\mathit{\boldsymbol{R}}$, filter accurate matching pairs with RANSAC, and then further remove wrong matches using distance similarity [17];

2) Compute an optimal homography $\mathit{\boldsymbol{H}}$ from all the filtered matching feature points and apply it to $\mathit{\boldsymbol{T}}$; the transformed image is denoted ${\mathit{\boldsymbol{T}}_H}$;

3) Using the matching feature point pairs of ($\mathit{\boldsymbol{T}}$, $\mathit{\boldsymbol{R}}$) and (${\mathit{\boldsymbol{T}}_H}$, $\mathit{\boldsymbol{R}}$) as control points and their transformed positions, deform $\mathit{\boldsymbol{T}}$ and ${\mathit{\boldsymbol{T}}_H}$ with line-constraint moving least squares, yielding ${\mathit{\boldsymbol{T}}^W}$ and ${\mathit{\boldsymbol{T}}_H}^W$, respectively;

4) Build network-flow models on the overlap regions of (${\mathit{\boldsymbol{T}}^W}$, $\mathit{\boldsymbol{R}}$) and (${\mathit{\boldsymbol{T}}_H}^W$, $\mathit{\boldsymbol{R}}$), find the optimal seam in each, evaluate the alignments with the method of Section 2, and select the better-aligned pair for blending.

Step 1) computes the feature points of the images to be stitched and establishes matches between the two feature point sets as correctly as possible. As is well known, even pointwise affine transformations cannot guarantee alignment of all parallax images, because an image is a perspective projection of 3D objects and contains perspective components. Steps 2) and 3) address this: the target image is first transformed by the homography to obtain ${\mathit{\boldsymbol{T}}_H}$. Since a homography is in fact a perspective transformation, ${\mathit{\boldsymbol{T}}_H}$ is a perspective-transformed version of $\mathit{\boldsymbol{T}}$. Then both ($\mathit{\boldsymbol{T}}$, $\mathit{\boldsymbol{R}}$) and (${\mathit{\boldsymbol{T}}_H}$, $\mathit{\boldsymbol{R}}$) are aligned, so the cases that require a perspective transformation are not missed. Step 4) selects the better-aligned pair and blends it along the seam with feathering. The block diagram in Fig. 2 summarizes the whole algorithm and presents its logic flow.

Fig. 2 Flow chart of the stitching algorithm
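The feathering used in step 4) can be illustrated with a toy NumPy sketch. This is not the paper's seam-based implementation: the side-by-side layout, the fixed rectangular overlap, and the function name `feather_blend` are simplifying assumptions, with weights ramping linearly across the overlap.

```python
import numpy as np

def feather_blend(left, right, overlap):
    """Blend two horizontally registered images of equal height whose
    last/first `overlap` columns coincide; weights ramp linearly from
    the left image to the right image across the overlap."""
    h, wl = left.shape[:2]
    wr = right.shape[1]
    out_w = wl + wr - overlap
    out = np.zeros((h, out_w) + left.shape[2:], float)
    out[:, :wl - overlap] = left[:, :wl - overlap]   # left-only region
    out[:, wl:] = right[:, overlap:]                 # right-only region
    t = np.linspace(0.0, 1.0, overlap)               # 0 -> left, 1 -> right
    ramp = t.reshape(1, -1) if left.ndim == 2 else t.reshape(1, -1, 1)
    out[:, wl - overlap:wl] = ((1 - ramp) * left[:, wl - overlap:]
                               + ramp * right[:, :overlap])
    return out
```

In the overlap, each output pixel is a convex combination of the two inputs, so seams between well-aligned images fade smoothly instead of appearing as a hard edge.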

4 Experimental results and discussion

The proposed stitching algorithm was implemented on Windows with the Visual Studio development environment and the OpenCV API. The test set consists of 23 image pairs captured by ourselves and the public datasets provided by recent related works [13, 15], 39 pairs in total. Three recent state-of-the-art methods were selected for comparison: the APAP (as-projective-as-possible) warping method [13], PTIS (parallax-tolerant image stitching) [15], and SEAGULL (seam-guided local alignment) [16], together with the latest version of Photoshop. APAP is the method most similar to ours; although its authors provide MATLAB code, the method requires careful tuning of two parameters whose settings differ from image to image, so the comparison uses images they tested, to show APAP at its best. Neither PTIS [15] nor SEAGULL [16] provides code, so the comparison experiments use the publicly released dataset of [15], on which both [15] and [16] were tested.

Table 1 reports the overall alignment accuracy. On our self-captured dataset, our method aligns 22 pairs and misaligns 1, while Photoshop aligns 19 and misaligns 4. On the dataset of [15], our method aligns 34 pairs and misaligns 1, whereas PTIS and SEAGULL each align 33 and misalign 2; in addition, PTIS produces distortion on 2 pairs.

Table 1 Alignment accuracy on the test image libraries

Method      Self-captured library    Library of [15]
Photoshop   0.826                    —
PTIS        —                        0.943
SEAGULL     —                        0.943
Ours        0.956                    0.971
Note: "—" indicates no test result.

Fig. 3 shows a natural-scene example containing stone steps, woods, and other structures, with large parallax. Fig. 3(b) is the Photoshop result: the stone steps break (marked by the red box), indicating misalignment. Fig. 3(c) is the result of our method: the alignment is clearly better and the stitch looks natural. Judging from the result, the alignment here used only the affine transformation, not homography plus affine transformation.

Fig. 3 Comparison between our method and Photoshop ((a) input images; (b) Photoshop; (c) ours)

Experiments on the dataset published with APAP were run with the MATLAB code provided by APAP and with our method. Fig. 4 involves strong linear structures such as railway tracks, as well as buildings and street scenes. Both APAP and our method align accurately, but APAP shows noticeable perspective distortion.

Fig. 4 Comparison between APAP and our method ((a) input images; (b) APAP; (c) ours)

Fig. 5 compares our method with PTIS and SEAGULL. Because both combine homography with content-preserving warping, their results show obvious perspective distortion; moreover, their adjustment of the feature-point alignment error is limited and the local alignment is not ideal, e.g., the person in the white shirt is misaligned. Fig. 5(d) is the result of our method: the alignment is accurate, and the warped target image has no strong perspective distortion, looking closer to a real photograph.

Fig. 5 Comparison of different methods ((a) input images; (b) PTIS; (c) SEAGULL; (d) ours)

5 Conclusion

This paper proposed a combined method for stitching parallax images: the target and reference images are aligned with a combination of line-constraint moving least squares and homography, the alignment quality is evaluated along the stitching seam, and the best alignment is selected for blending. In the alignment stage, grid-line constraints are added to the MLS deformation energy to balance the accuracy of feature correspondences against structural integrity. In the seam computation stage, a network-flow model is built on the vertices of the overlap region, and the optimal seam is computed with the max-flow min-cut algorithm. Experiments on self-captured photographs and on the parallax image libraries of recent related work show that our method aligns well, produces little perspective or structural distortion, and yields natural-looking stitches. However, when feature points are too few or too concentrated, the method may misalign or fail to align the images, giving unsatisfactory results. This is a common drawback of feature-based stitching algorithms and a topic for further research; at present it is handled simply by adding feature points manually. Extending the method to the registration of 3D point clouds is another direction for future work.

References

  • [1] Szeliski R. Image alignment and stitching:a tutorial[J]. Foundations and Trends in Computer Graphics and Vision, 2006, 2(1): 1–104. [DOI:10.1561/0600000009]
  • [2] Lowe D G. Distinctive image features from scale-invariant keypoints[J]. International Journal of Computer Vision, 2004, 60(2): 91–110. [DOI:10.1023/b:visi.0000029664.99615.94]
  • [3] Fischler M A, Bolles R C. Random sample consensus:a paradigm for model fitting with applications to image analysis and automated cartography[J]. Communications of the ACM, 1981, 24(6): 381–395. [DOI:10.1145/358669.358692]
  • [4] Brown M, Szeliski R, Winder S. Multi-image matching using multi-scale oriented patches[C]//Proceedings of 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. San Diego, CA: IEEE, 2005: 510-517.[DOI: 10.1109/cvpr.2005.235]
  • [5] McLauchlan P F, Jaenicke A. Image mosaicing using sequential bundle adjustment[J]. Image and Vision Computing, 2002, 20(9-10): 751–759. [DOI:10.1016/s0262-8856(02)00064-1]
  • [6] Pérez P, Gangnet M, Blake A. Poisson image editing[J]. ACM Transactions on Graphics, 2003, 22(3): 313–318. [DOI:10.1145/882262.882269]
  • [7] Gu Y, Zhou Y, Ren G, et al. Image stitching by combining optimal seam and multi-resolution fusion[J]. Journal of Image and Graphics, 2017, 22(6): 842–851. [谷雨, 周阳, 任刚, 等. 结合最佳缝合线和多分辨率融合的图像拼接[J]. 中国图象图形学报, 2017, 22(6): 842–851. ] [DOI:10.11834/jig.160638]
  • [8] Brown M, Lowe D G. Automatic panoramic image stitching using invariant features[J]. International Journal of Computer Vision, 2007, 74(1): 59–73. [DOI:10.1007/s11263-006-0002-3]
  • [9] Gao J H, Kim S J, Brown M S. Constructing image panoramas using dual-homography warping[C]//Proceedings of CVPR 2011. Colorado Springs, CO, USA: IEEE, 2011: 49-56.[DOI: 10.1109/cvpr.2011.5995433]
  • [10] Gao J H, Li Y, Chin T J, et al. Seam-driven image stitching[C]//Proceedings of EuroGraphics. The Eurographics Association, 2013: 45-48.[DOI: 10.2312/conf/EG2013/short/045-048]
  • [11] Cao S X, Jiang J, Zhang G J, et al. Multi-scale image mosaic using features from edge[J]. Journal of Computer Research and Development, 2011, 48(9): 1788–1793. [曹世翔, 江洁, 张广军, 等. 边缘特征点的多分辨率图像拼接[J]. 计算机研究与发展, 2011, 48(9): 1788–1793. ]
  • [12] Lin W Y, Liu S Y, Matsushita Y, et al. Smoothly varying affine stitching[C]//Proceedings of CVPR 2011. Colorado Springs, CO, USA: IEEE, 2011: 345-352.[DOI: 10.1109/cvpr.2011.5995314]
  • [13] Zaragoza J, Chin T J, Brown M S, et al. As-projective-as-possible image stitching with moving DLT[C]//Proceedings of 2013 IEEE Conference on Computer Vision and Pattern Recognition. Portland, OR, USA: IEEE, 2013: 2339-2346.[DOI: 10.1109/cvpr.2013.303]
  • [14] Schaefer S, McPhail T, Warren J. Image deformation using moving least squares[J]. ACM Transactions on Graphics, 2006, 25(3): 533–540. [DOI:10.1145/1141911.1141920]
  • [15] Zhang F, Liu F. Parallax-tolerant image stitching[C]//Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus, OH, USA: IEEE, 2014: 3262-3269.[DOI: 10.1109/cvpr.2014.423]
  • [16] Lin K M, Jiang N J, Cheong L F, et al. SEAGULL: Seam-guided local alignment for parallax-tolerant image stitching[C]//Leibe B, Matas J, Sebe N, et al. Computer Vision-ECCV 2016. ECCV 2016. Cham: Springer, 2016: 370-385.[DOI: 10.1007/978-3-319-46487-9_23]
  • [17] Chu D D, Li H S. Parallax image stitching based on moving least square method[J]. Computer Applications and Software, 2017, 34(8): 231–235. [楚东东, 李海晟. 基于移动最小二乘法的视差图像拼接[J]. 计算机应用与软件, 2017, 34(8): 231–235. ] [DOI:10.3969/j.issn.1000-386x.2017.08.041]