发布时间: 2018-07-16
摘要点击次数:
全文下载次数:
DOI: 10.11834/jig.170653
2018 | Volume 23 | Number 7

图像处理和编码

具有直线结构保护的网格化图像拼接

何川, 周军

1. 上海交通大学图像通信与网络工程研究所, 上海 200240;

2. 上海市数字媒体处理与传输重点实验室, 上海 200240

收稿日期: 2017-12-22; 修回日期: 2018-02-01

基金项目: 国家自然科学基金项目(61471234, 61527804);国家科技支撑计划项目(2015BAK05B03)

第一作者简介: 何川, 1993年生, 男, 上海交通大学信息与通信工程专业硕士研究生, 主要研究方向为多视点图像配准。E-mail:tribody@sjtu.edu.cn.

中图法分类号: TP301.6

文献标识码: A

文章编号: 1006-8961(2018)07-0973-11

摘要

目的基于网格变形的图像配准方式, 针对待拼接图片重叠区域的视差具有一定的容忍性, 并且能够适应更复杂的图像拼接场景。在NISwGSP (natural image stitching with the global similarity prior)算法基础上提出了一种具有直线结构保护的图像拼接算法(MISwLP), 该算法通过提取图片中的直线结构并施加约束, 可以得到视觉效果自然、畸变较小的图像拼接结果。方法首先对图片进行网格划分, 建立网格优化模型, 针对网格顶点坐标集定义能量函数, 在保证图片重叠区域高度对齐的同时, 对网格进行相似性连续约束, 并辅以直线结构约束, 最后使用共轭梯度最小二乘法求解得到最优网格顶点集, 指导网格变形。结果针对不同场景下的图片进行拼接实验, 同时和几种比较流行的图像拼接软件和算法进行比较。结果表明, 同经典拼接算法, 比如Autostitch相比, 基于网格优化的图像拼接算法能够适应更加复杂的多平面场景, 在减小投影失真和对齐误差方面表现更好; 同现在比较好的几种网格拼接算法, 比如SPHP (shape-preserving half-projective warps for image stitching)、APAP (as-projective-as-possible image stitching with moving DLT)、NISwGSP等的比较, MISwLP算法不仅能够很好地对齐图像和避免投影失真, 并且能够保持图像重叠区域到非重叠区域的一致性, 即保护原图中的直线结构。结论提出了一种基于网格优化的直线约束方法, 对于具有显著几何结构的图像拼接场景, 能够较好地保护拼接后图像中原有的直线结构, 具有较好的应用价值。

关键词

图像拼接; 网格变形; 直线保护; 能量函数; 最优化; 共轭梯度最小二乘法; 投影失真

Mesh-based image stitching algorithm with linear structure protection

He Chuan, Zhou Jun

1. Institute of Image Communication and Network Engineering, Shanghai Jiao Tong University, Shanghai 200240, China;

2. Key Laboratory of Digital Media Processing and Transform, Shanghai 200240, China

Supported by: National Science Foundation of China(61471234, 61527804);National Key Technology Research and Development Program of the Ministry of Science and Technology of China(2015BAK05B03)

Abstract

Objective The image registration method based on mesh deformation can handle some parallax in the overlapping area of input images and can adapt to more complex scenarios where the scenery is not in the same plane.A new mesh-based image stitching with linear structure protection (MISwLP) is proposed.This algorithm applies constraints to the lines extracted from images to protect them from being distorted by the mesh deformation process, thus obtaining natural panoramas with reduced distortion. Method MISwLP is based on mesh deformation.Images are meshed with a set of vertices, and image deformation is guided by the indexed vertices.The algorithm can be implemented with four steps.The first step is called APAP (As-Projective-As-Possible Image Stitching with Moving DLT) pre-registration.The APAP algorithm is applied to align the images, and the feature matching pairs obtained by the APAP algorithm can be used to obtain all the vertex matching pairs in the overlapped area of the image matching pairs, which are called matching points.These matching points are distributed evenly and can be used to obtain good alignment capability for the mesh optimization model.The second step is called global similarity estimation.The relative 2D rotation angle and the relative scale between two images are estimated in this step.Then, a similarity transform between the two images can be constructed.In the third step, a mesh optimization model is established for the input vertices of the images.The mesh optimization process is implemented in two stages.In the first stage, the energy function includes three terms, namely, the alignment, local similarity, and global similarity terms, and the original vertices are used as the input for this function.It is solved by the least-squares conjugate gradient method.The first stage aims to align the images.Then, the outputs of the first stage are used as the input vertices of the second stage.In the second stage, a new term called line protection is added for further optimization.The lines are extracted by the LSD algorithm with a threshold or user-guided interface and then sampled across the grid.The line protection term constrains the sample points in a straight line.The optimization solution is computed with a sparse matrix effectively.At this time, the distorted lines in the first stage of this step are straightened.In the fourth step, a texture mapping method is applied by affine transforming the input grids into the output grids.All images are blended with a linear blending method. Result The performance of MISwLP is verified using images captured from different sceneries by handheld devices, such as mobile phones and digital cameras, and several open datasets.The scenes include urban and nature sceneries.MISwLP can handle more complicated image stitching tasks, in which the scenery consists of two planes, than image-stitching algorithms, which use only one global holography, such as AutoStitch.Furthermore, MISwLP produces a natural stitching result and reduced projective distortion.In addition, MISwLP outperforms several state-of-the-art methods, such as SPHP (Shape-Preserving Half-Projective Warps for Image Stitching), APAP, and NISwGSP (Natural Image Stitching with the Global Similarity Prior).These algorithms use similarity transform to protect the non-overlapping area from projective distortion.Consequently, inconsistency is introduced between the overlapped and non-overlapped areas.The human eyes can perceive the destruction of certain geometry structures of the transitional area.MISwLP handles this problem with a line protection term and provides a good result with only a few geometry distortions.The proposed method works especially well for urban sceneries that contain many linear structures.For sceneries with no evident geometry, a user-guided auxiliary method is provided for selecting lines to protect.MISwLP is based on the NISwGSP algorithm, but the experiments show that their time complexities are nearly the same. Conclusion The performance of the proposed method is superior to those of state-of-the-art image-stitching methods.MISwLP protects the linear structure in the image stitching process, thereby providing a good stitching result with no geometry and projective distortions.Therefore MISwLP has good application value.

Key words

image stitching; mesh deformation; lines protection; energy function; optimization; least-squared conjugate gradient method; projective distortion

0 引言

随着智能手机、数码相机等手持式图像采集设备的普及, 人们越来越多地使用手持式设备获取图片。但是由于手持镜头的局限性, 很难获得视野较大的图片。拍摄者一般通过旋转手持式设备拍摄多张照片后使用拼接方法得到一张近似360°的全景照片。理想条件下, 完美的全景拼接要求多角度拍摄的照片具有共点的固定光心。实际操作时该条件无法保证, 这就导致重叠区域会存在一定的视差, 尤其是当景物距离镜头较近时, 视差会非常大, 无法满足图像拼接中的单视点透视假设, 所以大多数情况下我们并不能得到一张比较完美的全景照片。

基于单视点透视假设的图像拼接技术, 以Brown等人^[1]的研究成果为里程碑, 形成了一套非常成熟的方法。此类方法基本流程如图 1所示, 算法已经非常成熟, 应用也十分广泛。比较有代表性的如微软的ICE(image composite editor), Photoshop中的全景拼接工具, AutoStitch等, 能够提供很好的图像拼接效果, 一段时间内图像拼接问题被认为是一个已经完美解决的问题。但正如实际拍摄时候我们所遇到的问题那样, Brown等人的方法必须满足两个假设条件^{[2, 4]}:

图 1 经典图像拼接流程

Fig. 1 Classical image stitching process

1) 图像重叠区域必须尽量处于同一个平面, 画面整体最好处于同一个平面;

2) 每次拍摄时, 相机光心近乎重合。

如果无法满足第1个条件, 即景物场景存在多个平面, 那么一个全局单应就无法同时对齐多个平面的图像, 拼接后的图像会存在非常明显的错位和重影现象。同样, 如果无法满足第2个条件, 即两次拍摄时的相机光心发生了运动, 当相机光心距离(即基线距离)达到一定程度, 就会造成重叠区域产生明显的视差, 进而使图像拼接结果产生重影。

对于第1个问题, 一个全局单应已经无法准确描述图像之间的几何变换关系, 因此可以考虑使用多单应拟合的方法来解决。

而针对第2个问题, 即多视点透视导致的视差问题, 目前的解决办法主要分为3类:第1类方法是从图像融合的角度出发, 可以通过多通带融合^[1]的方式去除一定程度的重影, 但是计算开销过大, 并且视差较大时该方法失效; 第2类方法使用局部单应性对齐图像, 强调图像的局部配准, 其中较为突出的方法是基于网格变形^[3]的图像配准算法, 可以处理存在一定视差的拼接场景; 第3类方法以缝合线为主导, 在图像配准、图像融合阶段之间操作, 选择最佳缝合线拼接图片, 能够处理存在大视差和遮挡的场景。由于这3类方法分别处于图像拼接的不同阶段, 在实际应用时可以综合使用。

多通带融合在Brown的算法^[1]中已经取得了很好的效果, 因此, 当前的研究主要围绕着配准和缝合线两大主题展开。

Gao等人^[4]率先提出了基于双平面假设的图像拼接算法, 即假设所获图像近似存在背景和地面两个基本平面, 通过分别计算两个平面的单应变换, 而更好的配准图像。Lin等人^[5]使用平滑变化的仿射变换对齐图像, 能够接受一定的视差。Zaragoza等人^[6]率先使用网格优化模型, 将图像划分为密集网格, 以网格中心到特征点的距离为权重, 为每个网格计算一个单应矩阵, 使用局部单应性对齐图像, 并给出了一套高效的计算方法Moving DLT。网格优化模型是一套非常灵活的图像优化框架, 基于网格变形的图像优化算法, 能够对图像做更精细化的调整。考虑到点特征匹配的不可靠性(比如周期重复纹理等造成的误匹配), Li和Joo等人^[7-8]在提出了一种结合点特征和线特征的图像配准方法。Zhang等人^[9]借鉴视频去抖方法的优化项, 并结合了缝合线搜索算法^[10], 提高了大视差场景下的拼接性能^[11]。Lin等人^[12]提出使用缝合线引导的局部单应对齐方法(SEAGULL), 通过super pixel^[13]聚类产生初始化的一组局部单应假设, 然后通过计算缝合线生成一系列的拼接候选结果, 通过评估缝合效果选择最终结果。在考虑使用更好地几何变换模型对齐图像的同时, 越来越多的算法开始关注保护原图片的拍摄视角, 减少投影失真, 从而获得观感更加自然的拼接效果。Chang等人^[14]借鉴图像缩放中的形状保护方法, 使得非重叠区域到重叠区域, 由相似变换逐渐过渡到透视变换, 既保证了重叠区域的对齐性, 又保证了非重叠区域产生较少的投影失真。Lin等人^[15]使用单应插值的方式控制单应渐进变换, 并提出了一种自动选择最小旋转角度的全局相似变换的方法, 减小图片之间的旋转角度, 能够进一步提高图片自然观感。Chen等人^[16]使用APAP初始化网格, 并同时利用局部相似项和全局相似项约束网格变形, 同样减小了投影失真, 多图拼接效果提升较大。Li和Xiang等人^[17-18]的工作同样强调了对非重叠区域的保护。

然而, 多数消除非重叠区域投影失真的算法会造成过渡区域的不一致性, 使图片中的几何结构, 尤其是直线结构遭到破坏。基于网格模型的拼接算法具备很好的灵活性, 能够进行更精细化的调整, 但是少有算法考虑保护图像中的直线结构。由于图像是按照网格限定变换, 在网格内部的图片信息, 能够随着网格的整体变换而变换。但是网格大小是由人为设定的, 当图像中的几何结构跨越单个网格时, 就不能保证给几何结构在变换后保持不变。基于这一问题, 本文提出了网格优化中的直线结构保护方法, 该方法可以与任意一种网格优化算法相结合提升算法的直线结构保护能力, 本文在NISwGSP的方法上提出了具有直线保护的网格图像拼接(MISwLP)方法, 该方法通过添加直线约束项, 有效保护图片拼接中的直线结构。

1 NISwGSP算法简介

NISwGSP算法采用网格优化模型。算法总体流程可分解为四大步骤:

1) APAP预配准;

2) 全局相似参数估计;

3) 网格优化;

4) 图像合成。

第1步寻找匹配点对。由于APAP算法具有非常好的局部对齐效果, 该算法使用APAP的配准结果可以得到重叠区域网格顶点的匹配点对。如图 2所示, 图 2(a)为RANSAC算法^[19]滤除误匹配点后的特征点对, 图 2(b)为左图重叠区域网格顶点到右图的匹配点对。由于网格顶点匹配对分布更加均匀, 用其作为配准项效果更好。

图 2 特征匹配点对和重叠区域的网格点匹配对(素材图来自文献[14])

Fig. 2 Feature points matches and overlapped-area vertices matches((a) feature points matches of two images; (b) vertices of left image in the overlapped area with matching points of right image)

第2步估计每张图片相对于参考图片的旋转参数和尺度参数, 作为图片的全局相似变换。

第3步针对网格顶点坐标集定义能量代价函数, 通过网格优化求解得到新的顶点集。

第4步根据求得的网格顶点坐标, 完成图像映射与像素融合。

2 MISwLP算法

人眼对图像中的几何结构, 尤其是直线结构, 非常敏感, 因此在图像拼接过程中, 保护图像中的直线结构显得非常有必要。APAP、SPHP和NISwGSP等一系列方法都未涉及到对直线结构的保护, 这使得这些算法在某些具有显著直线结构, 例如建筑物场景时, 拼接结果会有明显的弯曲或变形, 造成不好的视觉效果。如图 3所示, NISwGSP算法室内图片拼接结果, 输入图片共有6张, 重叠区域主要集中在中间部分。可以看到, 拼接结果中的直线发生了明显的弯曲。NISwGSP算法基于网格连续性相似约束变形, 只能确保网格内部不会发生形变, 当直线结构跨网格时, 这种约束就不存在了。本文算法在NIS-wGSP算法基础上, 提出了具有直线结构保护约束的网格化图像拼接方法MISwLP(mesh-based image stitching with linear structure protection)。

图 3 NISwGSP算法拼接结果(素材图来自文献[16])

Fig. 3 NISwGSP stitch results((a) input images; (b) output images)

MISwLP算法流程如图 4所示, 第1步APAP预配准和第2步全局相似估计同NISwGSP算法一致, 得到网格顶点匹配对和全局相似变换。网格优化分为两个阶段, 第1阶段同原算法相同, 不考虑直线约束项, 第2阶段优化加入直线约束项, 得到最终的顶点集。

图 4 MISwLP算法流程

Fig. 4 MISwLP algorithm flow

2.1 网格优化模型

假设输入图片集合为

$ {\mathit{\boldsymbol{S}}_\mathit{\boldsymbol{I}}} = \{ {\mathit{\boldsymbol{I}}_1}, {\mathit{\boldsymbol{I}}_2}, \ldots, {\mathit{\boldsymbol{I}}_N}\} $

图片之间的邻接关系集合为

$ \mathit{\boldsymbol{J}} = \{ ({\mathit{\boldsymbol{I}}_i}, {\mathit{\boldsymbol{I}}_j})|{\mathit{\boldsymbol{I}}_i}{\rm{match}}{\mathit{\boldsymbol{I}}_j}\} ;i, j \in {\rm{ }}\left\{ {1, \ldots, N} \right\} $

式中, match表示${\mathit{\boldsymbol{I}}_i}$与${\mathit{\boldsymbol{I}}_j}$匹配, 即相邻。将图片网格化, 使用向量

$ {\mathit{\boldsymbol{V}}_i} = {[{x_1}^i{y_1}^i{x_2}^i{y_2}^i \ldots {x_m}^i\;{y_m}^i]^{\rm{T}}} $

表示${\mathit{\boldsymbol{I}}_i}$的网格顶点坐标, 其中$m$为顶点个数。所有图片顶点坐标可表述为

$ \mathit{\boldsymbol{V}} = {[{\mathit{\boldsymbol{V}}_0}^{\rm{T}}{\mathit{\boldsymbol{V}}_1}^{\rm{T}} \ldots {\rm{ }}{\mathit{\boldsymbol{V}}_\rm{N}}^{\rm{T}}]^{\rm{T}}} $

优化求解第1阶段的能量代价函数为

$ E(\mathit{\boldsymbol{\hat V}}) = {\lambda _a}{E_a}(\mathit{\boldsymbol{\hat V}}) + {\lambda _{ls}}{E_{ls}}(\mathit{\boldsymbol{\hat V}}) + {\lambda _{gs}}{E_{gs}}(\mathit{\boldsymbol{\hat V}}) $

(1)

式中, ${E_a}(\mathit{\boldsymbol{\hat V}})$为对齐误差项, ${E_{ls}}(\mathit{\boldsymbol{\hat V}})$为局部相似项, ${E_{gs}}(\mathit{\boldsymbol{\hat V}})$为全局相似项。${\lambda _a}$, ${\lambda _{ls}}$, ${\lambda _{gs}}$分别为对齐误差项, 局部相似项和全局相似项的调节权重。

第2阶段的能量代价函数为

$ E'(\mathit{\boldsymbol{\hat V}}) = {\lambda _a}{E_a}(\mathit{\boldsymbol{\hat V}}) + {\lambda _{ls}}{E_{ls}}(\mathit{\boldsymbol{\hat V}}) + {\rm{ }}{\lambda _{gs}}{E_{gs}}(\mathit{\boldsymbol{\hat V}}) + {\lambda _l}{E_l}(\mathit{\boldsymbol{\hat V}}) $

(2)

式中, ${\lambda _l}$为直线约束项的权重, 这里的权重对所有图片都适用, 各权重之间的相对大小会决定各约束项在约束求解过程中的重要性, 权重大小与其重要性成正相关性。

2.2 对齐误差项${E_a}(\mathit{\boldsymbol{\hat V}})$

为了使图片在重叠区域能够更好地对齐, 使用APAP算法获得的网格匹配点对集${\mathit{\boldsymbol{M}}^{\mathit{ij}}}$ (如图 2所示)替代传统方法中的特征点匹配对, 定义

$ \begin{array}{l} {E_a}(\mathit{\boldsymbol{\hat V}}) = {\rm{ }}\\ \sum\limits_{i = 1}^N {\sum\limits_{({\mathit{\boldsymbol{I}}_i}, {\mathit{\boldsymbol{I}}_j}) \in \mathit{\boldsymbol{J}}} {\sum\limits_{p_k^{ij} \in {\mathit{\boldsymbol{M}}^{ij}}} {\left\| {\mathit{\tilde v}(\mathit{\boldsymbol{p}}_k^{ij})-\mathit{\tilde v}(\mathit{\boldsymbol{{p}}_k^{ij}}^\prime )} \right\|} } } {\rm{ }}\\ \mathit{\tilde v}(\mathit{\boldsymbol{p}}_k^{ij}) = \mathit{\boldsymbol{W}}_k^{ij}\mathit{\boldsymbol{\hat V}}\\ \mathit{\tilde v}(\mathit{\boldsymbol{{p}}_k^{ij}}^\prime ) = \mathit{\boldsymbol{{W}}_k^{ij}}^\prime \mathit{\boldsymbol{\hat V}} \end{array} $

(3)

式中, $\mathit{\boldsymbol{W}}_k^{ij}$, $\mathit{\boldsymbol{W}}_k^{ij}\prime \in {{\mathbb{R}}^{{\rm{2}} \times {\rm{m}}}}$分别表示$\mathit{\boldsymbol{p}}_k^{ij}$和其匹配点$\mathit{\boldsymbol{p}}_k^{ij}\prime $在${{\mathit{\boldsymbol{\hat V}}}_i}$下的权值矩阵, 由其所在的网格4个顶点差值得到, 形如

$ \mathit{\boldsymbol{W}}_k^{ij} = \left[{\begin{array}{*{20}{l}} \ldots &{w_1^{ij}}&0& \ldots &{w_2^{ij}}&0& \ldots &{w_3^{ij}}&0& \ldots &{w_4^{ij}}&0& \ldots \\ \ldots &0&{w_1^{ij}}& \ldots &0&{w_2^{ij}}& \ldots &0&{w_3^{ij}}& \ldots &0&{w_4^{ij}}& \ldots \end{array}} \right] $

(4)

2.3 局部相似项约束${E_{ls}}(\mathit{\boldsymbol{\hat V}})$

局部相似项约束项的作用, 在于确保每个网格尽量经历相似变换的情况下, 将重叠区域网格的几何变换传播到整张网格之上, 保证网格变换的连续性。具体计算公式为

$ \begin{array}{l} {E_{ls}}(\mathit{\boldsymbol{\hat V}}) = {\sum\limits_{I = 1}^N {\sum\limits_{{\mathit{\boldsymbol{e}}_j}^i \in {\mathit{\boldsymbol{E}}_i}} {\left\| {{\mathit{\boldsymbol{e}}_j}^i\prime- \mathit{\boldsymbol{S}}_j^i{\mathit{\boldsymbol{e}}_j}^i} \right\|} } ^2}\\ {\mathit{\boldsymbol{e}}_j}^i\prime = \mathit{\tilde v}_k^i- {{\mathit{\tilde v}}^i}_j, \mathit{\boldsymbol{e}}_j^i = v_k^i- v_j^i\\ \mathit{\boldsymbol{S}}_j^i = \left[{\begin{array}{*{20}{c}} {c({\mathit{\boldsymbol{e}}_j}^i)}&{s({\mathit{\boldsymbol{e}}_j}^i)}\\ {-s({\mathit{\boldsymbol{e}}_j}^i)}&{c({\mathit{\boldsymbol{e}}_j}^i)} \end{array}} \right] \end{array} $

(5)

式中, ${\mathit{\boldsymbol{E}}_i}$表示${\mathit{\boldsymbol{I}}_i}$的边集合, $\mathit{\boldsymbol{S}}_j^i$表示边${{\mathit{\boldsymbol{e}}_j}^i}$经历的相似变换, 根据文献[16]中方法计算得到。

2.4 全局相似项约束${E_{gs}}(\mathit{\boldsymbol{\hat V}})$

全局相似项的作用, 在于控制非重叠区域的网格以全局相似变换为主, 而重叠区域的网格以对齐为主, 保证重叠区域能够充分对齐, 而非重叠区域能够尽量保证经历一个相似变换, 从而减少投影失真, 全局相似项表达式为

$ {E_{gs}}\left( {\mathit{\boldsymbol{\hat V}}} \right) = \\ \sum\limits_{i = 1}^N {\sum\limits_{{\mathit{\boldsymbol{e}}_j}^i \in {\mathit{\boldsymbol{E}}_i}} {w({\mathit{\boldsymbol{e}}_j}^i)} \left[ \begin{array}{l} {(c({\mathit{\boldsymbol{e}}_j}^i) - {s_i}{\rm{cos}}{\theta _i})^2} + \\ {\rm{ }}{(s({\mathit{\boldsymbol{e}}_j}^i) - {s_i}{\rm{sin}}{\theta _i})^2} \end{array} \right]} $

(6)

式中, $\mathit{\boldsymbol{w}}({\mathit{\boldsymbol{e}}_j}^i)$为权重函数, 与网格边距离重叠区域的距离正相关, 计算方法参照文献[16], ${{s_i}}$和${{\theta _i}}$为第2步得到的全局相似参数。

2.5 直线保持项${E_l}(\mathit{\boldsymbol{\hat V}})$

2.5.1 直线提取

首先, 需要获取图像中较好的直线结构。由于需要的是局部直线线段, 所以选用目前提取直线片段较为优良的LSD算法。考虑到网格内部的直线结构并不会变形, 因此只选取长度超过网格对角线长度的直线(具体筛选策略可根据场景指定, 比如只选择水平直线和垂直直线), 记为${\mathit{\boldsymbol{L}}_i}$。图 5为通过LSD算法检测并经过筛选后得到的直线。

图 5 直线检测结果(蓝色部分, 素材图来自文献[14, 16])

Fig. 5 Line detection results((a)lines extracted from eitan_office; (b)lines extracted from building)

图 5(a)中, 直线结构较为清晰, 提取的直线质量较好。而图 5(b)中, 直线结构并不完整, 提取的直线比较碎片化, 不利于用作直线保护, 因此提供了另外一种用户交互方式, 可以手动选定场景中需要保护的直线, 作为辅助方式。

2.5.2 直线约束

得到了图像中的直线之后, 对直线进行采样, 由于图像是基于网格为基本单位变形的, 本文采样的准则是保证采样点之间的距离能够跨越一个网格。这样可以得到每条直线$\mathit{l}_\mathit{k}^\mathit{i} \in {\mathit{\boldsymbol{L}}_i}$的采样点集合${{\mathit{\boldsymbol{P}}_k}^i}$。由于图像之间经历的变换有可能是射影变换, 直线的交比性是无法保证的, 只有当图片之间由仿射变换或更简单的变换关系联系起来时, 可以采取控制交比不变性达到约束效果。因此只需要约束直线上的采样点在变换之后仍然共线即可。

约束三点共线的方式很多, 为了归一化, 本文使用点到直线距离作为惩罚项约束三点共线。基于NISwGSP, 增加新的直线约束项能量代价函数${E_l}(\mathit{\boldsymbol{\hat V}})$为

$ \begin{array}{l} {E_l}(\mathit{\boldsymbol{\hat V}}) = \sum\limits_{i = 1}^N {\sum\limits_{{l_k}^i \in {\mathit{\boldsymbol{L}}_i}} {\sum\limits_{{\mathit{\boldsymbol{p}}_j} \in {\mathit{\boldsymbol{P}}_k}^i} {\frac{{{S_j}}}{{\left\| {\mathit{\tilde v}({\mathit{\boldsymbol{p}}_1})-\mathit{\tilde v}({\mathit{\boldsymbol{p}}_0})} \right\|}}} } } \\ {S_j} = \left\| {(\mathit{\tilde v}({\mathit{\boldsymbol{p}}_j})-\mathit{\tilde v}({\mathit{\boldsymbol{p}}_0})) \times (\mathit{\tilde v}({\mathit{\boldsymbol{p}}_1})-\mathit{\tilde v}({\mathit{\boldsymbol{p}}_0}))} \right\| \end{array} $

(7)

式中, ${{\mathit{\boldsymbol{p}}_0}}$与${{\mathit{\boldsymbol{p}}_1}}$表示被采样直线的两个端点坐标。$表示向量${\mathit{\boldsymbol{p}}_j}-{\mathit{\boldsymbol{p}}_0}$与向量${\mathit{\boldsymbol{p}}_1}-{\mathit{\boldsymbol{p}}_0}$形成的四边形面积。

2.6 迭代优化

用稀疏矩阵存储所有待优化项, 并通过最小二乘法最小化能量误差, 考虑到矩阵过大, 直接使用最小二乘法计算复杂度过高, 因此采用共轭梯度最小二乘法(CGLS), 计算得到最优顶点集坐标${\mathit{\boldsymbol{\hat V}}}$。

然而, 新加入的${E_l}(\mathit{\boldsymbol{\hat V}})$并不是线性函数, 式(7)中, ${{\mathit{\boldsymbol{p}}_0}}$、${{\mathit{\boldsymbol{p}}_1}}$和${{\mathit{\boldsymbol{p}}_j}}$都是未知的, 利用叉乘计算面积必然会产生交叉项, 所以整体能量函数为非线性的, 无法使用最小二乘法。因此在实际求解中, 为了使用线性最优化方法求解, 我们可以考虑使用迭代的方式, 分两个阶段求解。

第1阶段, 为了获得对齐后图像中直线段点的位置, 不添加直线约束项, 我们要求解的线性方程组为

$ \left[{\begin{array}{*{20}{c}} {{\mathit{\boldsymbol{A}}_a}}\\ {{\mathit{\boldsymbol{A}}_{ls}}}\\ {{\mathit{\boldsymbol{A}}_{gs}}} \end{array}} \right]\mathit{\boldsymbol{\hat V}} = \left[{\begin{array}{*{20}{c}} 0\\ 0\\ {{\mathit{\boldsymbol{b}}_{gs}}} \end{array}} \right] $

(8)

式中, ${{\mathit{\boldsymbol{A}}_a}}$, ${{\mathit{\boldsymbol{A}}_{ls}}}$, ${{\mathit{\boldsymbol{A}}_{gs}}}$和0, 0, ${{\mathit{\boldsymbol{b}}_{gs}}}$分别表示对齐项, 局部相似项和全局相似项在顶点集${\mathit{\boldsymbol{\hat V}}}$下对应的Jacob矩阵和残差向量。通过最优化计算, 可得到顶点集${{\mathit{\boldsymbol{\hat V}}}^{\left( 1 \right)}}$。

第2阶段, 根据第1阶段计算得到的顶点集${{\mathit{\boldsymbol{\hat V}}}^{\left( 1 \right)}}$, 可以计算得到直线集$\mathit{\boldsymbol{L}} = \{ {\mathit{\boldsymbol{L}}_1}, {\mathit{\boldsymbol{L}}_2}, \ldots, {\mathit{\boldsymbol{L}}_{\rm{N}}}\} $中的直线端点位置。对于式(7), 此时${{\mathit{\boldsymbol{p}}_0}}$与${{\mathit{\boldsymbol{p}}_1}}$变为已知, 该方程式变为线性方程, 现在以此为基础, 加入直线保持项, 合并代入稀疏矩阵中, 求解线性方程组

$ \left[{\begin{array}{*{20}{c}} {{\mathit{\boldsymbol{A}}_a}}\\ {{\mathit{\boldsymbol{A}}_{ls}}}\\ {{\mathit{\boldsymbol{A}}_{gs}}}\\ {{\mathit{\boldsymbol{A}}_l}} \end{array}} \right]\mathit{\boldsymbol{\hat V}} = \left[{\begin{array}{*{20}{c}} 0\\ 0\\ {{\mathit{\boldsymbol{b}}_{gs}}}\\ {{\mathit{\boldsymbol{b}}_l}} \end{array}} \right] $

(9)

式中, ${{\mathit{\boldsymbol{A}}_l}}$, ${{\mathit{\boldsymbol{b}}_l}}$分别为直线约束项的Jacob矩阵和残差向量。得到新的顶点集合${{\mathit{\boldsymbol{\hat V}}}^{\left( 2 \right)}}$, 即为最终所求顶点集坐标${\mathit{\boldsymbol{\hat V}}}$。

2.7 纹理映射

如图 6, 网格${\mathit{\boldsymbol{I}}_1}$和${\mathit{\boldsymbol{I}}_2}$局部顶点坐标经过最优化计算后, 得到其对应的全局顶点坐标。将单个网格分割为两个三角形, ${\mathit{\boldsymbol{v}}_a}$、${\mathit{\boldsymbol{v}}_b}$, ${\mathit{\boldsymbol{v}}_c}$为原图中三角形的顶点坐标, ${{\mathit{\boldsymbol{v'}}}_a}$、${{\mathit{\boldsymbol{v'}}}_b}$、${{\mathit{\boldsymbol{v'}}}_c}$为优化计算后对应三角形的顶点坐标。由3对点的映射关系可以计算得到三角形${{\mathit{\boldsymbol{v'}}}_a}{{\mathit{\boldsymbol{v'}}}_b}{{\mathit{\boldsymbol{v'}}}_c}$到三角形${\mathit{\boldsymbol{v}}_a}{\mathit{\boldsymbol{v}}_b}{\mathit{\boldsymbol{v}}_c}$的仿射变换, 完成纹理映射。

图 6 纹理映射

Fig. 6 Texture mapping

在建立了输出图像像素点与输入图像像素点之间的映射关系后, 使用线性融合方式融合图像。

3 实验结果及分析

与NISwGSP原算法相比, 本文改进了其能量优化函数, 能够在一定程度上保护图像中的直线结构。本节对改进算法的性能进行测试和分析, 并对比几种代表性算法的拼接效果。

为了测试算法的可靠性, 本文测试并对比了包括文献和自己拍摄的多组场景照片的拼接效果。

3.1 算法参数选定

本文算法可以调节的参数有两大类。第1类参数为网格优化参数, 包括网格大小, 降采样图片的尺寸, 直线约束相关参数, 特征提取、匹配、噪声点滤除过程中的参数, APAP算法相关参数, 以及捆绑调整参数等等一系列中间算法参数。除了直线约束相关的参数, 其他参数一致采用原作者参数设定。第2类参数为能量函数权重调整参数, 一共有6个参数, 分别为对齐项权重, 局部相似项权重, 全局相似项权重, 以及直线约束项权重。权重参数选定为

$ \begin{array}{l} {\lambda _a} = 1, {\lambda _{ls}} = 0.75, {\lambda _{gs}} = 1, {\rm{ }}\\ \beta = 6, \gamma = 20, {\lambda _l} = 1.10 \end{array} $

权重参数作用在于控制对应约束项在整个最优求解过程中所起的作用, 与权值大小正相关。该参数设定对所有图片均有效, 输出结果稳定。在本次实验中, 设定的网格以像素为单位, 大小为40×40像素, 降采样图片大小为800×600像素(或者600×800像素, 自适应), 直线采样间距为$40\sqrt 2 $, 选择直线长度不小于60, 另根据不同场景选择直线约束的角度范围, 所有图片均采用线性融合合成。

3.2 算法时间复杂度分析

如图 4所示, MISwLP算法可以分为4个主要部分, 分别是APAP预配准, 全局相似估计, 两阶段网格优化, 和图像合成。其中APAP预配准阶段, 需要提取SIFT特征、特征匹配、RANSAC滤除噪声点和基于网格的单应求解等一系列过程, 耗时占比最大。全局相似估计阶段耗时第2。而真正的优化求解, 由于使用了稀疏矩阵和共轭梯度最小二乘法求解, 计算开销并不大。图像合成耗时与第3步相当。所以, 决定整个算法时间复杂度的主要在于第1步和第2步, 因此本算法与NISwGSP原算法耗时相当, 表 1展示了在不同数量图片数据集上的算法运行时间对比, 运行平台为Mac Pro(2009), 6 GB内存和4核英特尔速龙处理器。

表 1 算法运行时间对比
Table 1 Algorithm runtime comparison

下载CSV

数据集(图片数量)	NISwGSP/s	MISwLP/s
Park(2)	11.492 8	13.493 1
Building(3)	22.459 8	25.869 2
Lake(4)	22.695 3	31.869 2
Garden(5)	77.756 9	89.018 8
Eitan office(6)	19.904 8	24.378 2
SantaMaria-all(8)	255.704 7	269.782 6
Playroom-all(10)	49.532 2	62.013 3
PalazzoPubblico2(11)	243.734 4	267.987 0
SienaCathedralLibrary(14)	804.434 4	838.648 7
Grail(17)	55.350 2	73.436 5
Times Square(20)	202.776 0	231.821 5
Church(32)	287.297 3	338.513 0
Atrium(34)	415.666 9	466.522 2

另外, 本文也测试了Auto Stitch、APAP、SPHP等算法的拼接速度。整体上来说, 基于全局单应假设, 以Auto Stitch为代表的算法在运行时间上是占有较明显优势的。目前基于多单应映射和网格优化的算法, 在算法层面并没有较好提升性能的方式, 目前解决办法大多使用CPU多线程或GPU并行算法以及硬件提升方案提升运行速度。

3.3 算法效果对比

首先, 将MISwLP算法与原算法NISwGSP做了对比测试, 在某些有着明显直线结构的场景下, MISwLP对图像中直线结构起到很好的保护作用。如图 7所示, 为两组场景下, MISwLP与NISwGSP的对比, 相关细节使用红色线标出。两组场景分别对应着建筑物和室内两种情况。

图 7 建筑物和室内拼接效果(素材图来自文献[14, 16])

Fig. 7 Stitching Results of building and indoor((a) NISwGSP; (b) MISwLP)

第1组照片难点在于输入的3张照片分别对应着不同的视角, 因此有着不同的消失点, NISwGSP算法是基于网格的连续性相似约束指导网格变形, 并通过能量函数惩罚对齐误差, 完成图像拼接的, 所以并没有考虑到图片之间的投影关系, 造成拼接处图片直线向外膨胀。而MISwLP通过约束图像中的直线结构, 能够一定程度上缓解这种形变, 图 8展示了NISwGSP优化后的网格图和勾选的直线与MISwLP优化后的网格图和勾选的直线, 可以很明显地看出, 施加了直线约束后, 网格相对于原算法向内压缩, 楼房的轮廓被重新拉直。

图 8 网格拼接图(上)和直线变形情况(下)拼接结果

(素材图来自文献[14])

Fig. 8 Mesh and lines deformation results

((a) NISwGSP; (b) MISwLP)

第2组照片在室内拍摄, 照片中实质上出现了空间中的4个平面, 同样的, 原算法只是较好地完成了图像对齐和相似性约束, 而没能够保持住原有的基本直线结构。

同时, 本文同Auto Stitch、SPHP、APAP算法和NISwGSP算法做了比较, 如图 9所示。

图 9 各算法在park数据集上的拼接效果对比

(素材图来自文献[14])

Fig. 9 Qualitative comparisons on the park image pair

从图 9中可以看出, 这些算法中Autostitch对齐效果最差, APAP射影失真较严重, NISwGSP对齐效果最好。但是观察输入图片, 可以发现, 除了APAP, 这些算法在图片结合处都发生了一定程度的形变, 原本的弧线或直线发生了弯折, 而MISwLP则没有。

总地来说, 在直线特征明显的情况下, MISwLP能够比较好地保护了原图的直线结构。

NISwGSP在对齐和多图拼接的场景下优于其他算法, 通过结合直线结构保护, 能得到较好的拼接结果。和APAP算法以及SPHP等算法相比, MISwLP更是具有明显的优势, 图 10和图 11展示了在笔者自己采集的照片上, NISwGSP算法和本文MISwLP算法的拼接效果比较。

图 10 NISwGSP与MISwLP图片拼接结果对比(6幅图)

Fig. 10 Image stitching results between NISwGSP and MISwLP (six images)

图 11 NISwGSP与MISwLP图片拼接结果对比(两幅图)

Fig. 11 Image stitching results between NISwGSP and MISwLP (two images)

此外, 本文提出的直线约束方法可以应用到任何基于网格优化的图像拼接算法上, 为算法提供更好地直线保持能力。

4 结论

针对基于网格变形的图像拼接算法, 提出了一种具有直线约束的网格变形拼接方法(MISwLP), 对于拼接后图像具备一定的直线保持能力。本文方法使用LSD算法提取图像中的直线, 并采样直线得到采样点集, 通过在网格优化函数中添加直线约束项, 限制直线的变形, 从而保护图像中的直线结构。由于直线约束性为非线性能量函数, 为了使用基于稀疏矩阵的线性求解, 本文将优化过程串行化, 分为两阶段求解。实验结果表明, 本文提出的MISwLP算法能够较好的保护拼接图像中的直线结构, 针对图像中直线结构不明显的情况, 提供了一种用户交互界面可作为辅助方式帮助用户勾选需要保护的直线结构, 本算法具有一定的应用价值。为更好保护拼接图像中的全局直线结构, 后续研究中将进一步解决全局直线的提取方法。

参考文献

[1] Brown M, Lowe D G. Automatic panoramic image stitching using invariant features[J]. International Journal of Computer Vision, 2007, 74(1): 59–73. [DOI:10.1007/s11263-006-0002-3]

[2] Joshi H, Sinha K L. A survey on image mosaicing techniques[J]. International Journal of Advanced Research in Computer Engineering & Technology, 2013, 2(2): 1–11.

[3] Igarashi T, Igarashi Y. Implementing as-rigid-as-possible shape manipulation and surface flattening[J]. Journal of Graphics, Gpu, and Game Tools, 2009, 14(1): 17–30. [DOI:10.1080/2151237x.2009.10129273]

[4] Gao J H, Kim S J, Brown M S. Constructing image panoramas using dual-homography warping[C]//Proceedings of 2011 IEEE Conference on Computer Vision and Pattern Recognition. Springs, Colorado, USA: IEEE, 2011: 49-56. [DOI:10.1109/cvpr.2011.5995433]

[5] Lin W Y, Liu S Y, Matsushita Y, et al. Smoothly varying affine stitching[C]//Proceedings of 2011 IEEE Conference on Computer Vision and Pattern Recognition. Colorado Springs, USA: IEEE, 2011: 345-352. [DOI:10.1109/cvpr.2011.5995314]

[6] Zaragoza J, Chin T J, Brown M S, et al. As-projective-as-possible image stitching with moving DLT[C]//Proceedings of 2013 IEEE Conference on Computer Vision and Pattern Recognition. Portland, OR, USA: IEEE, 2013: 2339-2346. [DOI:10.1109/CVPR.2013.303]

[7] Li S W, Yuan L, Sun J, et al. Dual-feature warping-based motion model estimation[C]//Proceedings of 2015 IEEE International Conference on Computer Vision. Santiago, Chile: IEEE, 2015: 4283-4291. [DOI:10.1109/iccv.2015.487]

[8] Joo K, Kim N, Oh T H, et al. Line meets as-projective-as-possible image stitching with moving DLT[C]//Proceedings of 2015 IEEE International Conference on Image Processing. Québec City, Canada: IEEE, 2015: 1175-1179. [DOI:10.1109/icip.2015.7350985]

[9] Liu F, Gleicher M, Jin H L, et al. Content-preserving warps for 3D video stabilization[C]//Proceedings of 2009 ACM SIGGRAPH. New Orleans, Louisiana, USA: ACM, 2009: #44. [DOI:10.1145/1576246.1531350]

[10] Gao J H, Li Y, Chin T J, et al. Seam-driven image stitching[C]//Proceedings of 2013 EUROGRAPHICS. Girona, Spain: EUROGRAPHICS, 2013. [DOI:10.2312/conf/eg2013/short/045-048]

[11] Zhang F, Liu F. Parallax-tolerant image stitching[C]//Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus, OH, USA: IEEE, 2014: 3262-3269. [DOI:10.1109/cvpr.2014.423]

[12] Lin K M, Jiang N J, Cheong L F, et al. SEAGULL: seam-guided local alignment for parallax-tolerant image stitching[C]//Proceedings of the 14th European Conference on Computer Vision-ECCV 2016. Amsterdam, The Netherlands: Springer, 2016: 370-385. [DOI:10.1007/978-3-319-46487-9_23]

[13] Achanta R, Shaji A, Smith K, et al. SLIC superpixels compared to state-of-the-art superpixel methods[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2012, 34(11): 2274–2282. [DOI:10.1109/tpami.2012.120]

[14] Chang C H, Sato Y, Chuang Y Y. Shape-preserving half-projective warps for image stitching[C]//Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus, Ohio, USA: IEEE, 2014: 3254-3261. [DOI:10.1109/cvpr.2014.422]

[15] Lin C C, Pankanti S U, Ramamurthy K N, et al. Adaptive as-natural-as-possible image stitching[C]//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition. Boston, Massachusetts, USA: IEEE, 2015: 1155-1163. [DOI:10.1109/CVPR.2015.7298719]

[16] Chen Y S, Chuang Y Y. Natural image stitching with the global similarity prior[C]//Proceedings of the 14th European Conference on European Conference on Computer Vision. Amsterdam, Netherlands: Springer, 2016: 186-201. [DOI:10.1007/978-3-319-46454-1_12]

[17] Li N, Xu Y F, Wang C. Quasi-homography warps in image stitching[J]. IEEE Transactions on Multimedia, 2017. [DOI:10.1109/tmm.2017.2771566]

[18] Xiang T Z, Xia G S, Bai X, et al. Image stitching by line-guided local warping with global similarity constraint[J]. arXiv: 1702. 07935, 2017.

[19] Yao D, Zhou J, Xue Z. Homography matrix genetic consensus estimation algorithm[C]//Proceedings of 2010 International Conference on Audio Language and Image Processing. Shanghai, China: IEEE, 2010: 1139-1143. [DOI:10.1109/icalip.2010.5685084]