Image co-segmentation with progressive foreground updating and hierarchical region correlation

Tuozhong Yao; Wenhui Zuo; Peng An; Jiatao Song

doi:10.11834/jig.180476

Image Analysis and Recognition


2019, Vol. 24, No. 3, pp. 366-375
Received: 2018-08-09; revised: 2018-09-10; published in print: 2019-03-16

Tuozhong Yao, Wenhui Zuo, Peng An, Jiatao Song. Image co-segmentation with progressive foreground updating and hierarchical region correlation[J]. Journal of Image and Graphics, 2019, 24(3): 366-375. DOI: 10.11834/jig.180476.

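Summary

Objective

Image co-segmentation uses multiple reference images to separate foreground objects from background regions and has been widely applied in fields such as image classification and object recognition. However, most existing image co-segmentation algorithms work only when the background varies considerably and the foreground remains almost unchanged. To address this, a new unsupervised co-segmentation algorithm is proposed.

Method

The proposed method is unsupervised. Building on hierarchical image segmentation, it uses a progressive optimization framework to update the estimates of the foreground and background models, and it incorporates hierarchical region similarity correlations both within an image and across images to make these estimates more robust. The method requires no prior sample learning, can process two or more images simultaneously, applies when multiple foreground objects are present, and adapts well to variation within the foreground object class.

Result

Experiments on the iCoseg and MSRC image sets show that the algorithm does not require a pronounced foreground/background difference between images and that, compared with existing classic methods, it is better suited to more general scenes with dramatic foreground changes or multiple simultaneous foreground objects.

Conclusion

The method distinguishes foreground from background by recursively estimating the appearance distributions of the superpixels obtained from hierarchical image segmentation, and it fuses region correlations within an image and across images to make the foreground and background distribution estimates more consistent. Experiments show that the method is more robust than existing methods when the foreground changes significantly.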

Abstract

Objective

As a hotspot in computer vision, image co-segmentation is a branch of the classic image segmentation problem that uses multiple images to separate foreground objects from background regions. It has been widely used in many fields, such as image classification, object recognition, and 3D object reconstruction. Image co-segmentation is an ill-conditioned and challenging problem because of factors such as viewpoint change and the intraclass diversity of the foreground objects. Most current image co-segmentation algorithms are limited in that they work well only on images with dramatic background changes and minimal foreground changes.

Method

This study proposes a new unsupervised algorithm that optimizes foreground/background estimation progressively. The proposed algorithm has three advantages: 1) it is unsupervised and does not need sample learning; 2) it can co-segment multiple images simultaneously or an image with multiple foreground objects; 3) it is more adaptable to dramatic intraclass variations than previous algorithms.

The main steps are as follows. A classic hierarchical segmentation is first used to generate a multiscale superpixel set. Gaussian mixture models are then used to estimate the foreground and background distributions from classic color and texture descriptors at the superpixel level. A Markov random field (MRF) model is used to estimate the annotation of each superpixel by solving a traditional energy minimization problem. In this MRF model, each node represents a superpixel or pixel. The two unary potentials denote the likelihoods of a superpixel or pixel belonging to the foreground or the background, and the pairwise potential penalizes annotation inconsistency among superpixels in different images. This energy minimization can be solved by a classic graph cut. Unlike in most image co-segmentation algorithms, the foreground and background models are estimated progressively, starting from the initial superpixel annotation given by a pre-learned object detector. The models fitted to the annotation at the current step are used to update the superpixel annotation at the next step, which in turn updates the foreground and background distributions, until the distributions no longer improve significantly.

Intra- and inter-image similarity correlations at different superpixel levels are integrated into this iterative framework to increase the robustness of foreground and background model estimation. Each image is divided into a series of segmentation levels by hierarchical segmentation, and three matrices are used to model the semantic correlations among different regions. An affinity matrix $\mathit{\boldsymbol{A}}$ defines the relationship among neighboring superpixels inside one image. A constraint matrix $\mathit{\boldsymbol{C}}$ describes the hierarchical relation among different segmentation levels. Another affinity matrix $\mathit{\boldsymbol{M}}$ defines the relationship among superpixels in different images. A normalized affinity matrix $\mathit{\boldsymbol{P}}$ is then defined, and a new matrix $\mathit{\boldsymbol{Q}}$ is created based on $\mathit{\boldsymbol{C}}$ to project $\mathit{\boldsymbol{P}}$ into the solution space. The optimal annotation of superpixel pairs inside one image and in different images can be obtained by classic normalized cuts. Thus, a new pairwise potential is added to the MRF model to penalize corresponding superpixel pairs that receive different annotations in different images.
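The progressive estimation loop described above can be illustrated with a minimal sketch. It rests on several assumptions that are not from the paper: each superpixel is reduced to one scalar feature, a single Gaussian per class stands in for the Gaussian mixture models, relabeling is a plain likelihood comparison rather than an MRF graph cut, and the loop stops when the labels no longer change. The function names are hypothetical.

```python
import math


def fit_gaussian(values):
    """Return (mean, variance) of scalar features; variance is floored."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, max(var, 1e-6)


def log_likelihood(x, model):
    """Log density of x under a 1-D Gaussian given as (mean, variance)."""
    mean, var = model
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def progressive_update(features, init_labels, max_iters=20):
    """Alternate between fitting fg/bg models and relabeling superpixels.

    features: one scalar appearance feature per superpixel (assumption).
    init_labels: initial 0/1 labels (1 = foreground), e.g. from a detector.
    """
    labels = list(init_labels)
    for _ in range(max_iters):
        fg = [f for f, l in zip(features, labels) if l == 1]
        bg = [f for f, l in zip(features, labels) if l == 0]
        if not fg or not bg:
            break  # one class vanished; keep the last labeling
        fg_model, bg_model = fit_gaussian(fg), fit_gaussian(bg)
        new_labels = [
            1 if log_likelihood(f, fg_model) > log_likelihood(f, bg_model) else 0
            for f in features
        ]
        if new_labels == labels:
            break  # labeling is stable, so the models would not change
        labels = new_labels
    return labels


# The fourth superpixel is wrongly initialized as foreground.
features = [0.9, 1.0, 1.1, 0.95, 5.0, 5.2, 4.9, 5.1]
init_labels = [0, 0, 0, 1, 1, 1, 1, 1]
final_labels = progressive_update(features, init_labels)
```

In this toy run the mislabeled superpixel is pulled back to the background after the first refit, because the background model explains its feature far better than the foreground model does.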

Result

In our experiment, the iCoseg and MSRC datasets are used to compare the performance of our algorithm with that of several state-of-the-art algorithms. The experimental results show that the proposed algorithm achieves the highest segmentation accuracy and the highest mean segmentation accuracy in most object classes, which implies that it does not need large foreground and background differences and can be applied to general images with dramatic foreground changes and different foreground objects. In some object classes, such as "Skating" and "Panda", however, our algorithm performs poorly because the outdated object detector gives an inaccurate initial distribution estimate, and the iterative framework cannot help the distribution estimation escape the local minimum. Nonetheless, the algorithm could be improved significantly by using state-of-the-art deep learning-based object detectors, such as Mask R-CNN.
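The abstract does not define its accuracy measure. A minimal sketch, assuming segmentation accuracy means the fraction of pixels whose foreground/background label matches the ground truth, and assuming the mean is taken first over the images of a class and then over classes, could look as follows. The function names and the flat-list mask representation are hypothetical.

```python
def segmentation_accuracy(pred, truth):
    """Fraction of pixels whose predicted label equals the ground truth.

    pred, truth: flat sequences of 0/1 labels of equal, nonzero length.
    """
    if len(pred) != len(truth) or len(pred) == 0:
        raise ValueError("masks must be non-empty and of equal length")
    correct = sum(1 for p, t in zip(pred, truth) if p == t)
    return correct / len(pred)


def mean_class_accuracy(per_class):
    """Average per-class accuracy, where each class averages its images.

    per_class: dict mapping class name -> list of per-image accuracies.
    """
    class_means = [sum(accs) / len(accs) for accs in per_class.values()]
    return sum(class_means) / len(class_means)


acc = segmentation_accuracy([1, 1, 0, 0], [1, 0, 0, 0])
overall = mean_class_accuracy({"a": [0.75, 0.25], "b": [1.0]})
```

Averaging per class before averaging across classes keeps a class with many images from dominating the overall figure.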

Conclusion

This study proposes a novel unsupervised image co-segmentation algorithm, which iteratively estimates the appearance distributions of the superpixels produced by hierarchical image segmentation to distinguish the foreground from the background. Regional semantic correlations inside one image and across different images are incorporated as a new pairwise potential in the MRF model to increase the consistency of the foreground and background distribution estimates. Our experiments show that the proposed algorithm is more robust than state-of-the-art algorithms and can co-segment multiple images with dramatic foreground changes and multiple foreground objects.
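One way to write down an MRF energy of the kind referred to here is sketched below. The notation is an assumption for illustration and is not taken from the paper: $x_i \in \{0, 1\}$ is the label of superpixel $i$ (1 for foreground), $f_i$ is its appearance feature, $\theta_{\mathrm{fg}}$ and $\theta_{\mathrm{bg}}$ are the foreground and background models, $\mathcal{E}$ is the set of corresponding superpixel pairs, $w_{ij} \ge 0$ is their similarity, and $\lambda \ge 0$ is a weight.

$$
E(\mathbf{x}) = \sum_i \Big[ -x_i \log p(f_i \mid \theta_{\mathrm{fg}}) - (1 - x_i) \log p(f_i \mid \theta_{\mathrm{bg}}) \Big] + \lambda \sum_{(i,j) \in \mathcal{E}} w_{ij} \, [x_i \neq x_j]
$$

In this form the first sum holds the two unary costs, and the second sum adds a cost $w_{ij}$ whenever two corresponding superpixels receive different labels. With binary labels and non-negative weights the pairwise term is submodular, so a graph cut finds the exact minimizer.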


