Published: 2019-08-16
DOI: 10.11834/jig.180669
2019 | Volume 24 | Number 8

Image Analysis and Recognition

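Robust and diverse multi-view clustering based on self-paced learning

Tang Yongqiang^1,2, Zhang Wensheng^1,2

1. Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China;

2. University of Chinese Academy of Sciences, Beijing 100190, China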

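Received: 2018-12-18; revised: 2019-03-07

Funding: National Natural Science Foundation of China (61432008, 61472423, U1636220)

First author: Tang Yongqiang, born in 1992, male, Ph.D. candidate. His research interests include computer vision, data mining, and machine learning. E-mail: tangyongqiang2014@ia.ac.cn.

CLC number: TP391

Document code: A

Article ID: 1006-8961(2019)08-1338-11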

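Abstract

Objective Multi-view clustering in the big-data setting is a valuable and highly challenging problem. Existing methods suited to clustering large-scale multi-view data can, to some extent, overcome the local minima caused by non-convex objective functions, but they do not consider robustness to outliers and they ignore view diversity during sample selection. To address these problems, a robust and diverse multi-view clustering model based on self-paced learning (RD-MSPL) is proposed. Method 1) Outliers are modeled by introducing the structured sparsity norm $\mathrm{L}_{2, 1}$ into the objective function; 2) the diversity of the samples selected across views is increased by imposing an anti-structured-sparsity constraint on the sample weight matrix in the self-paced regularizer. Result Experiments on the public datasets Extended Yale B, Notting-Hill, COIL-20, and Scene15 show the following. 1) On all four datasets, RD-MSPL outperforms the two most relevant existing multi-view clustering methods. Compared with robust multi-view $K$-means clustering (RMKMC), clustering accuracy improves by 4.9%, 4.8%, 3.3%, and 1.3%, respectively; compared with MSPL, it improves by 7.9%, 4.2%, 7.1%, and 6.5%. 2) Ablation experiments confirm the benefit of modeling robustness and sample diversity. 3) Comparisons with single views and with simple concatenation of the views show that RD-MSPL explores the relationships among views more effectively. Conclusion This paper proposes a robust and diverse multi-view clustering model based on self-paced learning and designs an efficient algorithm for solving it. The method effectively reduces the influence of outliers on clustering performance and gradually adds diverse samples from different views during clustering, which helps avoid bad local minima while better capturing the complementary information of the views. The experimental results show that it outperforms existing related methods.

Key words

multi-view learning; clustering; self-paced learning; robust; diversity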

Robust and diverse multi-view clustering based on self-paced learning

Tang Yongqiang^1,2, Zhang Wensheng^1,2

1. Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China;

2. University of Chinese Academy of Sciences, Beijing 100190, China

Supported by: National Natural Science Foundation of China (61432008, 61472423, U1636220)

Abstract

Objective In real-world applications, datasets naturally comprise multiple views. For instance, in computer vision, images can be described by different features, such as color, edge, and texture; a web page can be described by the words appearing on the page itself and by the hyperlinks pointing to it; and a person can be recognized by their face, fingerprint, iris, and signature. Clustering aims to explore meaningful patterns in an unsupervised manner. In the era of big data, with the rapid increase of multi-view data, obtaining better clustering performance than any single view by using complementary information from different views is a valuable and challenging task. Popular multi-view clustering methods can be roughly divided into two categories: spectral clustering based and nonnegative matrix factorization (NMF) based. Multi-view spectral clustering methods perform well in partitioning data that are not linearly separable. However, the high computational complexity of the eigendecomposition of the Laplacian matrix limits their application to large-scale data clustering. In contrast, the classical $K$-means clustering method, which has been proven to be equivalent to NMF, is often used in the big data environment because of its low computational complexity and convenient parallelization. Several studies have extended single-view $K$-means to a multi-view setting. To a certain extent, multi-view self-paced learning (MSPL) can overcome bad local minima due to non-convex objective functions. However, two drawbacks need to be solved. First, MSPL lacks robustness to outliers. Second, MSPL considers only the criterion that samples should be added to the clustering process in order from easy to complex, while ignoring diversity in the sample selection process. To solve these two problems, we propose a robust and diverse multi-view clustering model based on self-paced learning (RD-MSPL). 
Method A robust $K$-means clustering method is needed to achieve more stable clustering performance with respect to a fixed initialization. To this end, we introduce a structural sparsity norm ($\mathrm{L}_{2, 1}$-norm) into the objective function to replace the $\mathrm{L}_{2}$-norm. The $\mathrm{L}_{2, 1}$-norm-based clustering objective enforces the $\mathrm{L}_{1}$-norm along the data point direction of the data matrix and the $\mathrm{L}_{2}$-norm along the feature direction. Thus, the effect of outlier data points in clustering is reduced by the $\mathrm{L}_{1}$-norm. In addition, ideal self-paced learning should utilize not only easy but also diverse examples that are sufficiently dissimilar from what has already been learned. To achieve this goal, we apply a negative $\mathrm{L}_{2, 1}$-norm constraint to the sample weight matrix in the self-paced regularization. As discussed above, the $\mathrm{L}_{2, 1}$-norm leads to a group-wise sparse representation (i.e., nonzero entries tend to be concentrated in a small number of groups). By contrast, the negative $\mathrm{L}_{2, 1}$-norm should have the opposite effect (i.e., nonzero entries tend to be scattered across a large number of groups). This anti-structured-sparsity constraint is expected to realize the diversity of samples selected from multiple views. The difficulty of solving the proposed objective comes from the non-smoothness of the $\mathrm{L}_{2, 1}$-norm. In this study, we propose an effective algorithm to handle this problem. Result We perform experiments on four public datasets, namely, Extended Yale B, Notting-Hill, COIL-20, and Scene15. The clustering performance is measured using six popular metrics: normalized mutual information (NMI), accuracy (ACC), adjusted Rand index (AR), F-score, precision, and recall. Higher values indicate better performance. These metrics favor different properties of a clustering, so together they give a comprehensive evaluation. 
On all datasets, the final results reported for these metrics are the mean and standard deviation over 20 runs. We highlight the best values in bold in each table. First, we compare our proposal with robust multi-view $K$-means clustering (RMKMC) and MSPL, which are the most relevant multi-view clustering methods. The experimental results indicate that the proposed RD-MSPL is superior to these two methods in almost all metrics, the exceptions being the recall of MSPL on the COIL-20 and Scene15 datasets. Then, we experimentally verify the importance of two key components in the proposed model (i.e., model robustness and sample diversity). Finally, we compare the proposed RD-MSPL with single views and with concatenated multiple views. Its superior performance confirms that RD-MSPL can better capture complementary information and explore the relationship among multiple views. In the proposed model, two self-paced learning parameters influence the clustering performance. These two parameters separately control the pace at which the model learns new and diverse examples, and they usually increase iteratively during optimization. In this study, we conduct a further parameter sensitivity analysis to better understand the characteristics of our RD-MSPL model. The experimental results show that although these two parameters play an important role in performance, most results are still better than the single-view baseline. Conclusion In this paper, a new model called RD-MSPL is proposed to perform large-scale multi-view data clustering. The proposed model can effectively overcome the effect of outliers. In the clustering process, with the gradual addition of diverse samples from different views, our proposed method can better obtain complementary information from different views while avoiding bad local minima. We conduct a series of comparative analyses with several existing methods on multiple datasets. 
The experimental results show that the proposed model is superior to the existing related multi-view clustering methods. Future research will focus on 1) extending the method to a wider range of data via the kernel trick, because the proposed method assumes that all features lie on linear manifolds, and 2) an adaptive approach to learning the self-paced learning parameters, which are hard to set in advance in this unsupervised setting.

Key words

multi-view learning; clustering; self-paced learning; robust; diversity

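0 Introduction

Multi-view data are common in real-world applications: each data object is described by data instances from several views. For example, in computer vision an image can be represented by color, edge, and texture features; a web page can be represented by its own text or by the hyperlinks pointing to it; and a person can be identified by face, fingerprint, iris, and signature^[1-3]. In the era of big data, as the volume of multi-view data grows, efficiently discovering the meaningful patterns shared across views through unsupervised clustering is a valuable and highly challenging task.

Multi-view clustering analyzes the data instances of multiple views jointly, under the basic assumption that instances describing the same data object in different views belong to the same cluster^[4]. The multi-view clustering methods that currently receive the most attention fall into two categories: those based on spectral clustering and those based on nonnegative matrix factorization. Spectral clustering evolved from graph theory: samples are treated as graph nodes and pairwise similarities as edges, and the clustering is obtained by solving the normalized minimum-cut problem on the constructed graph^[5]. Multi-view spectral clustering has been studied extensively^[6-9]; its core idea is to minimize the disagreement between views or to find a consensus similarity matrix^[4]. These methods perform well on data that are not linearly separable, but the eigendecomposition of the Laplacian matrix is computationally expensive, which limits their use on large-scale data.

Nonnegative matrix factorization (NMF) factorizes a nonnegative matrix into two low-dimensional nonnegative matrices. NMF and its extensions have been shown to be equivalent to several clustering algorithms, including classical $K$-means^[10]. Compared with spectral clustering, $K$-means has low computational complexity and is easy to parallelize, so it is often used for large-scale clustering. $K$-means itself is a single-view method. The most direct way to apply it to multi-view data is to concatenate the features of all views and cluster the concatenated features; however, features from different views are heterogeneous, and the concatenated feature lacks a clear physical meaning. To better explore the relationships among views, Cai et al.^[11] proposed robust multi-view $K$-means clustering (RMKMC), which assumes that the indicator matrices of different views are consistent and, on that basis, accounts for the influence of noise in the objective function. Later, Xu et al.^[12] proposed re-weighted discriminatively embedded $K$-means (RDEKM), a multi-view clustering framework that, besides noise robustness, models the mapping from the high-dimensional feature space to a low-dimensional subspace and solves for the weight of each view with iteratively re-weighted least squares. RMKMC and RDEKM offer an effective route to large-scale multi-view clustering, but they share a drawback: both are non-convex optimization problems and easily get trapped in local minima. To overcome this, Xu et al.^[13] introduced the self-paced learning (SPL) mechanism into multi-view $K$-means and proposed multi-view self-paced learning (MSPL). MSPL first clusters the samples that are easy to partition and then gradually increases sample difficulty until all samples are clustered.

Although MSPL can avoid bad local minima to some extent, it does not consider robustness to outliers and ignores view diversity during sample selection. To address these two problems, this paper proposes a robust and diverse multi-view clustering model based on self-paced learning (RD-MSPL). The model introduces the structured sparsity norm $\mathrm{L}_{2, 1}$ into the objective function to model outliers, adds an anti-structured-sparsity constraint to the self-paced regularizer to increase view diversity during sample selection, and is solved with an efficient algorithm designed for it. Experiments on four public datasets verify the effectiveness of the proposed model.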

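1 Related multi-view clustering methods

Given a dataset $\boldsymbol{X}=\left[\boldsymbol{x}_{1}, \boldsymbol{x}_{2}, \cdots, \boldsymbol{x}_{n}\right] \in {\bf R}^{d \times n}$ of $n$ samples of dimension $d$ and a partition $\boldsymbol{C}=\left\{\boldsymbol{C}_{1}, \boldsymbol{C}_{2}, \cdots, \boldsymbol{C}_{k}\right\}$ of the dataset into clusters, $K$-means minimizes the squared error between each sample and its cluster center, that is,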

$ \sum\limits_{i = 1}^k {\sum\limits_{x \in {\mathit{\boldsymbol{C}}_i}} {\left\| {\mathit{\boldsymbol{x}} - {\mathit{\boldsymbol{u}}_i}} \right\|_2^2} } $

(1)

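where $\boldsymbol{u}_{i}=\sum\limits_{\boldsymbol{x} \in \boldsymbol{C}_{i}} \boldsymbol{x} / n_{i}$ is the prototype vector of cluster $\boldsymbol{C}_{i}$ and $n_{i}$ is the number of samples in $\boldsymbol{C}_{i}$. Minimizing Eq. (1) is NP-hard, because finding the optimum requires considering all possible partitions of the sample set.

To solve this problem, Ding et al.^[14] proposed an approach based on nonnegative matrix factorization that rewrites Eq. (1) as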

$ \begin{array}{c}{\min\limits_{U, F}\|\boldsymbol{X}-\boldsymbol{U} \boldsymbol{F}\|_{2}^{2}} \\ {\text { s.t. } F_{i j} \in\{0, 1\}, \sum\limits_{i=1}^{k} F_{i j}=1} \\ {\forall j=1, 2, \cdots, n}\end{array} $

(2)

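where $\boldsymbol{U}=\left[\boldsymbol{u}_{1}, \cdots, \boldsymbol{u}_{k}\right] \in {\bf R}^{d \times k}$ is the cluster center matrix formed by the $k$ prototype vectors and $\boldsymbol{F} \in {\bf R}^{k \times n}$ is the cluster indicator matrix. Each column of $\boldsymbol{F}$ is a one-hot encoding (OHE): if sample $\boldsymbol{x}_{j}$ is assigned to cluster $i$, then $F_{i j}=1$; otherwise $F_{i j}=0$.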
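The equivalence between Eq. (1) and Eq. (2) can be checked numerically. The following is a minimal sketch on made-up toy data, not the authors' implementation; the function names are hypothetical, samples are stored as rows rather than columns, and every cluster is assumed to be non-empty.

```python
# Sketch: the K-means error of Eq. (1) equals the factorization error of Eq. (2)
# when F is the one-hot indicator matrix and U holds the cluster means.

def centroids(X, labels, k):
    """Mean of the samples assigned to each cluster (the prototype vectors u_i)."""
    d = len(X[0])
    sums = [[0.0] * d for _ in range(k)]
    counts = [0] * k
    for x, c in zip(X, labels):
        counts[c] += 1
        for t in range(d):
            sums[c][t] += x[t]
    # Assumes counts[c] > 0 for every cluster.
    return [[s / counts[c] for s in sums[c]] for c in range(k)]

def one_hot(labels, k):
    """Indicator matrix F (k x n): F[i][j] = 1 iff sample j is in cluster i."""
    return [[1 if c == i else 0 for c in labels] for i in range(k)]

def kmeans_error(X, labels, U):
    """Eq. (1): sum of squared distances to the assigned center."""
    return sum(sum((a - b) ** 2 for a, b in zip(x, U[c])) for x, c in zip(X, labels))

def factorization_error(X, U, F):
    """Eq. (2): squared entries of X - U F; column j of U F is sum_i F[i][j] * u_i."""
    total = 0.0
    for j, x in enumerate(X):
        recon = [sum(F[i][j] * U[i][t] for i in range(len(U))) for t in range(len(x))]
        total += sum((a - b) ** 2 for a, b in zip(x, recon))
    return total

X = [[0.0, 0.0], [0.0, 2.0], [10.0, 0.0], [10.0, 2.0]]  # four toy samples
labels = [0, 0, 1, 1]
U = centroids(X, labels, 2)
F = one_hot(labels, 2)
print(kmeans_error(X, labels, U), factorization_error(X, U, F))
```

Both functions return the same value because multiplying $\boldsymbol{U}$ by a one-hot column simply selects the center of the assigned cluster.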

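Eq. (2) gives an intuitive and concise formulation of $K$-means, but it applies only to single-view data. A reasonable assumption for extending Eq. (2) to multi-view data is that, although the cluster center matrices $\boldsymbol{U}^{(v)} \in {\bf R}^{d_{v} \times k}$ of the views $\boldsymbol{X}^{(v)} \in {\bf R}^{d_{v} \times n}$ differ, the clustering results are consistent. Under this assumption, the multi-view clustering model obtained from Eq. (2) can be formalized as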

$ \begin{array}{c}{\min\limits_{\boldsymbol{U}^{(v)}, \boldsymbol{F}} \sum\limits_{v=1}^{V}\left\|\boldsymbol{X}^{(v)}-\boldsymbol{U}^{(v)} \boldsymbol{F}\right\|_{2}^{2}} \\ {\text { s. t. } F_{i j} \in\{0, 1\}, \sum\limits_{i=1}^{k} F_{i j}=1, \forall j=1, 2, \cdots, n}\end{array} $

(3)

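where $\boldsymbol{F} \in {\bf R}^{k \times n}$ is the cluster indicator matrix shared by the $V$ views.

Note that both the single-view Eq. (2) and the multi-view Eq. (3) are non-convex optimization problems that easily get trapped in local minima. To alleviate this, Xu et al.^[13] proposed the multi-view self-paced learning clustering model (MSPL), which introduces the self-paced learning paradigm into Eq. (3): by gradually increasing the penalty coefficient of the self-paced regularizer during optimization, samples join the clustering process in order from easy to complex. It is formalized as

$ \begin{array}{c}{\min\limits_{\boldsymbol{U}^{(v)}, \boldsymbol{F}, \boldsymbol{W}} \sum\limits_{v=1}^{V}\left\|\left(\boldsymbol{X}^{(v)}-\boldsymbol{U}^{(v)} \boldsymbol{F}\right) \operatorname{diag}\left(\sqrt{\boldsymbol{w}^{(v)}}\right)\right\|_{2}^{2}+f(\boldsymbol{W} ; \lambda)} \\ {\text { s.t. } F_{i j} \in\{0, 1\}, \sum\limits_{i=1}^{k} F_{i j}=1, \forall j=1, 2, \cdots, n} \\ {\boldsymbol{w}^{(v)} \in[0, 1]^{n}, \forall v \in[1, V]}\end{array} $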

(4)

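where $\boldsymbol{w}^{(v)}=\left[w_{1}^{(v)}, \cdots, w_{n}^{(v)}\right]^{\mathrm{T}}$ is the weight vector formed by the weights of the $n$ samples in view $v$, $\boldsymbol{W}=\left[\boldsymbol{w}^{(1)}, \cdots, \boldsymbol{w}^{(V)}\right]$, and $f(\boldsymbol{W} ; \lambda)$ is the self-paced regularizer, which determines the selection of samples and views during clustering. Its simplest form is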

$f(\boldsymbol{W} ; \lambda)=-\lambda \sum\limits_{v=1}^{V} \sum\limits_{i=1}^{n} w_{i}^{(v)}$

(5)

Under this regularizer, the optimal weight of sample $i$ in view $v$ is

$ w_{i}^{(v) *}=\left\{\begin{array}{ll}{1} & {l_{i}^{(v)} \leqslant \lambda} \\ {0} & {l_{i}^{(v)}>\lambda}\end{array}\right. $

(6)

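where $l_{i}^{(v)}$ is the reconstruction error of sample $i$ in view $v$. The self-paced penalty coefficient $\lambda$ acts as a threshold: easy samples, whose reconstruction loss does not exceed the threshold, receive weight 1, and complex samples, whose loss exceeds it, receive weight 0. The parameter $\lambda$ controls the pace at which the model learns new samples and is usually increased iteratively during optimization.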
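The thresholding rule of Eq. (6) follows from minimizing $w\,l-\lambda w$ over $w \in[0,1]$ for each sample separately. A minimal sketch for one view is given below; the loss values and the function name `spl_weights` are made up for illustration.

```python
def spl_weights(losses, lam):
    """Eq. (6): weight 1 if the reconstruction loss is at most lam, else 0."""
    return [1 if loss <= lam else 0 for loss in losses]

losses = [0.2, 1.5, 0.7, 3.0]       # hypothetical losses of four samples in one view
for lam in (0.5, 1.0, 2.0, 4.0):    # lam grows as optimization proceeds
    print(lam, spl_weights(losses, lam))
```

As $\lambda$ increases, samples with larger losses are admitted one after another, which is the easy-to-complex ordering described above.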

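2 Proposed method

To address the problems of MSPL, this paper proposes a robust and diverse multi-view clustering model based on self-paced learning and gives an efficient algorithm for solving it.

2.1 Robust and diverse multi-view model based on self-paced learning

Although MSPL accounts for differences among both samples and views during clustering and achieves good performance, it has two shortcomings: 1) $K$-means is solved iteratively, and outliers in the data can severely degrade its performance during the iterations, yet MSPL does not consider robustness to outliers; 2) different views describe the data from different angles, so the information of multiple views should be fully explored during the clustering iterations, yet MSPL adds samples gradually according to the single easy-to-complex criterion and ignores view diversity during sample selection.

To overcome the sensitivity to outliers, this paper replaces the $\mathrm{L}_{2}$ norm in the objective function of Eq. (4) with the $\mathrm{L}_{2, 1}$ norm. The $\mathrm{L}_{2, 1}$ norm of a matrix $\boldsymbol{X}$ is defined as $\|\boldsymbol{X}\|_{2, 1}=\sum\limits_{i=1}^{n}\left\|\boldsymbol{x}_{i}\right\|_{2}$, where $\boldsymbol{x}_{i}$ is the $i$-th column of $\boldsymbol{X}$. Compared with the $\mathrm{L}_{2}$ norm, the $\mathrm{L}_{2, 1}$-norm-based objective applies an $\mathrm{L}_{1}$-norm sparsity constraint across the per-sample reconstruction errors, which reduces the influence of outliers and improves robustness. For diverse sample selection across views, inspired by Jiang et al.^[15] and Zhang et al.^[16], this paper builds on Eq. (5) and further imposes a negative $\mathrm{L}_{2, 1}$-norm constraint on the sample weight matrix $\boldsymbol{W}$ in the self-paced regularizer $f(\boldsymbol{W})$, which takes the form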

$f\left(\boldsymbol{W} ; \lambda_{1}, \lambda_{2}\right)=-\lambda_{1} \sum\limits_{v=1}^{V} \sum\limits_{i=1}^{n} w_{i}^{(v)}-\lambda_{2}\|\boldsymbol{W}\|_{2, 1}$

(7)

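Because the $\mathrm{L}_{2, 1}$ norm of $\boldsymbol{W}$ induces group sparsity, that is, the nonzero entries of $\boldsymbol{W}$ tend to concentrate in a few views, the negative $\mathrm{L}_{2, 1}$ norm has the opposite effect: the nonzero entries of $\boldsymbol{W}$ tend to spread over many views. In other words, this anti-group-sparsity constraint is expected to yield a diverse selection of samples across views.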
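The effect of the negative $\mathrm{L}_{2, 1}$ term can be seen on a toy weight matrix. The sketch below uses made-up 0/1 weights with one column per view, following $\boldsymbol{W}=\left[\boldsymbol{w}^{(1)}, \cdots, \boldsymbol{w}^{(V)}\right]$; the helper name `l21_norm` is hypothetical.

```python
import math

def l21_norm(M):
    """Sum of the Euclidean norms of the columns of M (M is given as a list of rows)."""
    return sum(math.sqrt(sum(x * x for x in col)) for col in zip(*M))

# n = 4 samples (rows), V = 2 views (columns); both matrices select four entries in total.
concentrated = [[1, 0], [1, 0], [1, 0], [1, 0]]  # all selections fall in view 1
scattered = [[1, 0], [1, 0], [0, 1], [0, 1]]     # two selections in each view

print(l21_norm(concentrated))  # sqrt(4) + 0
print(l21_norm(scattered))     # sqrt(2) + sqrt(2)
```

With the same number of selected entries, the scattered pattern has the larger $\mathrm{L}_{2, 1}$ norm, so the term $-\lambda_{2}\|\boldsymbol{W}\|_{2, 1}$ in Eq. (7) is smaller for it and the minimization favors selections spread over the views.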

In summary, the proposed robust and diverse multi-view clustering model based on self-paced learning can be formalized as

$ \begin{array}{c}{\min\limits_{\boldsymbol{U}^{(v)}, \boldsymbol{F}, \boldsymbol{W}} \sum\limits_{v=1}^{V}\left\|\left(\boldsymbol{X}^{(v)}-\boldsymbol{U}^{(v)} \boldsymbol{F}\right) \operatorname{diag}\left(\boldsymbol{w}^{(v)}\right)\right\|_{2, 1}+} \\ {f\left(\boldsymbol{W} ; \lambda_{1}, \lambda_{2}\right)} \\ {\text { s.t. } F_{i j} \in\{0, 1\}, \sum\limits_{i=1}^{k} F_{i j}=1, \forall j=1, 2, \cdots, n}\\ \boldsymbol{w}^{(v)} \in[0, 1]^{n}, \quad \forall v \in[1, V]\end{array} $

(8)

where the self-paced regularizer $f\left(\boldsymbol{W} ; \lambda_{1}, \lambda_{2}\right)$ takes the form of Eq. (7).
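For fixed centers, labels, and weights, the value of Eq. (8) can be evaluated directly. The following is a minimal sketch under the assumption that samples are stored as rows and cluster assignments as integer labels instead of a one-hot matrix; the function name and toy numbers are hypothetical.

```python
import math

def objective(views, centers, labels, W, lam1, lam2):
    """Value of Eq. (8) with the regularizer of Eq. (7).

    views[v][i]   : feature vector of sample i in view v
    centers[v][k] : center of cluster k in view v
    labels[i]     : cluster index of sample i (shared by all views)
    W[v][i]       : weight of sample i in view v, in [0, 1]
    """
    data_term = 0.0
    for X, U, w in zip(views, centers, W):
        for x, c, wi in zip(X, labels, w):
            err = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, U[c])))
            data_term += wi * err  # norm of column i of (X - UF) diag(w), since wi >= 0
    sum_w = sum(sum(w) for w in W)
    l21_w = sum(math.sqrt(sum(wi * wi for wi in w)) for w in W)  # one group per view
    return data_term - lam1 * sum_w - lam2 * l21_w

views = [[[0.0, 0.0], [3.0, 4.0]], [[1.0], [2.0]]]  # two samples, two views
centers = [[[0.0, 0.0]], [[1.0]]]                   # a single cluster per view
J = objective(views, centers, [0, 0], [[1, 1], [1, 0]], lam1=1.0, lam2=1.0)
print(J)
```

In this toy case the data term is 5, the weights sum to 3, and $\|\boldsymbol{W}\|_{2, 1}=\sqrt{2}+1$, so the objective equals $1-\sqrt{2}$.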

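2.2 Model optimization

The difficulty in solving the proposed model lies in the non-smoothness of the $\mathrm{L}_{2, 1}$ norm. To handle it, this paper proposes an efficient algorithm that alternately updates the variables. First, the objective function $J$ in Eq. (8) is written in trace form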

$ \begin{array}{c}{\min\limits_{\boldsymbol{U}^{(v)}, \boldsymbol{D}^{(v)}} \sum\limits_{v=1}^{V} \operatorname{tr}\left\{\left(\boldsymbol{X}^{(v)}-\boldsymbol{U}^{(v)} \boldsymbol{F}\right) \times\right.} \\ {\operatorname{diag}\left(\boldsymbol{w}^{(v)}\right) \boldsymbol{D}^{(v)}\left(\boldsymbol{X}^{(v)}-\boldsymbol{U}^{(v)} \boldsymbol{F}\right)^{\mathrm{T}} \}+f\left(\boldsymbol{W} ; \lambda_{1}, \lambda_{2}\right)}\end{array} $

(9)

where $\boldsymbol{D}^{(v)} \in {\bf R}^{n \times n}$ is the diagonal matrix associated with view $v$, whose $i$-th diagonal element is defined as

$D_{i i}^{(v)}=\frac{1}{2\left\|\boldsymbol{e}_{i}^{(v)}\right\|_{2}}, \quad \forall i=1, 2, \cdots, n$

(10)

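where $\boldsymbol{e}_{i}^{(v)}$ is the $i$-th column of the reconstruction error matrix $\boldsymbol{E}^{(v)}=\boldsymbol{X}^{(v)}-\boldsymbol{U}^{(v)} \boldsymbol{F}$. The subproblems for the four groups of variables $\boldsymbol{U}^{(v)}$, $\boldsymbol{F}$, $\boldsymbol{W}$, and $\boldsymbol{D}^{(v)}$ are solved in turn below.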
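The link between the trace form of Eq. (9) and the $\mathrm{L}_{2, 1}$ term of Eq. (8) can be checked column by column. The following worked step is a sketch for a single view (view superscripts dropped), assuming that every $\boldsymbol{e}_{i}$ is nonzero, that $w_{i} \geqslant 0$, and that $\boldsymbol{D}$ is evaluated at the current $\boldsymbol{E}$ via Eq. (10):

$ \operatorname{tr}\left\{\boldsymbol{E} \operatorname{diag}(\boldsymbol{w}) \boldsymbol{D} \boldsymbol{E}^{\mathrm{T}}\right\}=\sum\limits_{i=1}^{n} w_{i} D_{i i}\left\|\boldsymbol{e}_{i}\right\|_{2}^{2}=\sum\limits_{i=1}^{n} \frac{w_{i}\left\|\boldsymbol{e}_{i}\right\|_{2}^{2}}{2\left\|\boldsymbol{e}_{i}\right\|_{2}}=\frac{1}{2} \sum\limits_{i=1}^{n} w_{i}\left\|\boldsymbol{e}_{i}\right\|_{2}=\frac{1}{2}\|\boldsymbol{E} \operatorname{diag}(\boldsymbol{w})\|_{2, 1} $

With $\boldsymbol{D}$ held fixed, the trace term is therefore a smooth quadratic function of $\boldsymbol{U}$ and agrees with the data term of Eq. (8) up to the constant factor $1/2$ at the current iterate. A zero residual would leave $D_{i i}$ undefined; a common safeguard, which is an assumption here and is not stated in the text, is to add a small positive constant to the denominator of Eq. (10).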

1) Fix $\boldsymbol{F}$, $\boldsymbol{W}$, and $\boldsymbol{D}^{(v)}$, and take the derivative of the objective function $J$ with respect to the cluster center matrix $\boldsymbol{U}^{(v)}$, that is,

$\frac{\partial \boldsymbol{J}}{\partial \boldsymbol{U}^{(v)}}=-2 \boldsymbol{X}^{(v)} \widetilde{\boldsymbol{D}}^{(v)} \boldsymbol{F}^{\mathrm{T}}+2 \boldsymbol{U}^{(v)} \boldsymbol{F} \widetilde{\boldsymbol{D}}^{(v)} \boldsymbol{F}^{\mathrm{T}}$

(11)

where $\widetilde{\boldsymbol{D}}^{(v)}=\operatorname{diag}\left(\boldsymbol{w}^{(v)}\right) \boldsymbol{D}^{(v)}$. Setting Eq. (11) to zero gives the closed-form solution for $\boldsymbol{U}^{(v)}$, that is,

$\boldsymbol{U}^{(v)}=\boldsymbol{X}^{(v)} \widetilde{\boldsymbol{D}}^{(v)} \boldsymbol{F}^{\mathrm{T}}\left(\boldsymbol{F} \widetilde{\boldsymbol{D}}^{(v)} \boldsymbol{F}^{\mathrm{T}}\right)^{-1}$

(12)

2) Fix $\boldsymbol{U}^{(v)}$, $\boldsymbol{W}$, and $\boldsymbol{D}^{(v)}$, and solve the subproblem with respect to the cluster indicator matrix $\boldsymbol{F}$, that is,

$ \begin{array}{c}{\min\limits_{\boldsymbol{F}} \sum\limits_{v=1}^{V} \operatorname{tr}\left\{\left(\boldsymbol{X}^{(v)}-\boldsymbol{U}^{(v)} \boldsymbol{F}\right) \widetilde{\boldsymbol{D}}^{(v)}\left(\boldsymbol{X}^{(v)}-\boldsymbol{U}^{(v)} \boldsymbol{F}\right)^{\mathrm{T}}\right\}=} \\ {\min\limits_{\boldsymbol{F}} \sum\limits_{v=1}^{V} \sum\limits_{i=1}^{n} \widetilde{\boldsymbol{D}}_{i i}^{(v)}\left\|\boldsymbol{x}_{i}^{(v)}-\boldsymbol{U}^{(v)} \boldsymbol{f}_{i}\right\|_{2}^{2}=} \\ {\min\limits_{\boldsymbol{F}} \sum\limits_{i=1}^{n}\left(\sum\limits_{v=1}^{V} \widetilde{\boldsymbol{D}}_{i i}^{(v)}\left\|\boldsymbol{x}_{i}^{(v)}-\boldsymbol{U}^{(v)} \boldsymbol{f}_{i}\right\|_{2}^{2}\right)}\end{array} $

(13)

To solve Eq. (13), it is decoupled into $n$ independent subproblems, and for each sample $i$ the following subproblem in the vector $\boldsymbol{f}=[f_{1}, f_{2}, \cdots, f_{K} ]^{\mathrm{T}} \in {\bf R}^{K \times 1}$ is solved:

$ \begin{array}{c}{\min\limits_{\boldsymbol{f}} \sum\limits_{v=1}^{V} \widetilde{d}^{(v)}\left\|\boldsymbol{x}^{(v)}-\boldsymbol{U}^{(v)} \boldsymbol{f}\right\|_{2}^{2}} \\ {\text { s.t. } f_{k} \in\{0, 1\}, \sum\limits_{k=1}^{K} f_{k}=1}\end{array} $

(14)

where $\widetilde{d}^{(v)}=\widetilde{\boldsymbol{D}}_{i i}^{(v)}$ is the $i$-th diagonal element of $\widetilde{\boldsymbol{D}}^{(v)}$. Because $\boldsymbol{f}$ satisfies the 1-of-$K$ encoding, Eq. (14) has $K$ feasible solutions, each of which is a column of the identity matrix $\boldsymbol{I}_{K}=\left[\boldsymbol{e}_{1}, \boldsymbol{e}_{2}, \cdots, \boldsymbol{e}_{K}\right]$. The optimal solution $\boldsymbol{f}^{*}$ is found by exhaustive search^[11] as

$\boldsymbol{f}^{*}=\underset{\boldsymbol{e}_{j}}{\arg \min } \sum\limits_{v=1}^{V} \widetilde{d}^{(v)}\left\|\boldsymbol{x}^{(v)}-\boldsymbol{U}^{(v)} \boldsymbol{e}_{j}\right\|_{2}^{2}$

(15)

3) To solve for $\boldsymbol{W}$, fix $\boldsymbol{U}^{(v)}$, $\boldsymbol{F}$, and $\boldsymbol{D}^{(v)}$, and rewrite the objective function in Eq. (8) as

$ \begin{array}{c}{\min\limits_{\boldsymbol{W}} \sum\limits_{v=1}^{V}\left\|\left(\boldsymbol{X}^{(v)}-\boldsymbol{U}^{(v)} \boldsymbol{F}\right) \operatorname{diag}\left(\boldsymbol{w}^{(v)}\right)\right\|_{2, 1}+} \\ {f\left(\boldsymbol{W} ; \lambda_{1}, \lambda_{2}\right)=\min\limits_{\boldsymbol{W}} \sum\limits_{v=1}^{V} \sum\limits_{i=1}^{n} w_{i}^{(v)}\left\|\boldsymbol{x}_{i}^{(v)}-\boldsymbol{U}^{(v)} \boldsymbol{f}_{i}\right\|_{2}-} \\ {\lambda_{1}\|\boldsymbol{W}\|_{1}-\lambda_{2}\|\boldsymbol{W}\|_{2, 1}=\min\limits_{\boldsymbol{W}} \sum\limits_{v=1}^{V} \sum\limits_{i=1}^{n} w_{i}^{(v)} l_{i}^{(v)}-} \\ {\lambda_{1}\|\boldsymbol{W}\|_{1}-\lambda_{2}\|\boldsymbol{W}\|_{2, 1}}\end{array} $

(16)

Inspired by the solution method of Ref. [15], the $n$ samples of each view $v$ are sorted in ascending order of their reconstruction loss $l_{i}^{(v)}$, and the sample weights are then computed as

$ w_{i}^{(v) *}=\left\{\begin{array}{ll}{1} & {l_{i}^{(v)}<\lambda_{1}+\lambda_{2} \frac{1}{\sqrt{{rank}\left(l_{i}^{(v)}\right)}+\sqrt{{rank}\left(l_{i}^{(v)}\right)-1}}} \\ {0} & {\text{otherwise}}\end{array}\right. $

(17)

where ${rank}\left(l_{i}^{(v)}\right)$ is the rank of the loss of sample $i$ among the $n$ samples of view $v$.

4) Fix $\boldsymbol{U}^{(v)}$, $\boldsymbol{F}$, and $\boldsymbol{W}$, and update $\boldsymbol{D}^{(v)}$ via Eq. (10).
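The four updates above can be sketched as small functions. This is an illustrative sketch rather than the authors' MATLAB code: samples are stored as rows, cluster assignments as integer labels instead of a one-hot matrix, `d_tilde[i]` stands for the product $w_{i} D_{i i}$, all function names are hypothetical, and the `eps` guard and the assumption that every cluster has positive total weight are additions not stated in the text.

```python
import math

def dist(a, b):
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))

def update_D(X, U, labels, eps=1e-12):
    """Eq. (10): D_ii = 1 / (2 * ||x_i - u_{c_i}||_2); eps (an assumption) avoids division by zero."""
    return [1.0 / (2.0 * max(dist(x, U[c]), eps)) for x, c in zip(X, labels)]

def update_U(X, labels, d_tilde, K):
    """Eq. (12): with a one-hot F, each center is the d_tilde-weighted mean of its cluster."""
    dim = len(X[0])
    U = []
    for k in range(K):
        members = [(x, dt) for x, c, dt in zip(X, labels, d_tilde) if c == k]
        total = sum(dt for _, dt in members)  # diagonal entry k of F D~ F^T, assumed > 0
        U.append([sum(dt * x[t] for x, dt in members) / total for t in range(dim)])
    return U

def update_F(views, centers, d_tildes, K):
    """Eq. (15): for each sample, pick the cluster with the smallest weighted squared error."""
    labels = []
    for i in range(len(views[0])):
        costs = [sum(dt[i] * dist(X[i], U[k]) ** 2
                     for X, U, dt in zip(views, centers, d_tildes)) for k in range(K)]
        labels.append(costs.index(min(costs)))
    return labels

def update_W(losses, lam1, lam2):
    """Eq. (17) for one view: rank samples by loss, then apply the rank-dependent threshold."""
    order = sorted(range(len(losses)), key=lambda i: losses[i])
    w = [0] * len(losses)
    for rank, i in enumerate(order, start=1):
        bound = lam1 + lam2 / (math.sqrt(rank) + math.sqrt(rank - 1))
        w[i] = 1 if losses[i] < bound else 0
    return w

print(update_U([[0.0], [2.0], [10.0]], [0, 0, 1], [1.0, 3.0, 1.0], 2))
print(update_W([0.1, 0.9, 0.5, 2.0], 0.4, 0.6), update_W([0.1, 0.9, 0.5, 2.0], 0.4, 0.0))
```

In the last line, setting $\lambda_{2}=0$ admits only the sample with loss 0.1, whereas $\lambda_{2}=0.6$ also admits the sample with loss 0.5, because the diversity term raises the threshold for the lowest-ranked losses of a view.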

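These four steps update $\boldsymbol{U}^{(v)}$, $\boldsymbol{F}$, $\boldsymbol{W}$, and $\boldsymbol{D}^{(v)}$ iteratively until the objective function converges. The steps of the proposed algorithm are as follows:

Input: multi-view features $\boldsymbol{X}^{(v)}, v=1, 2, \cdots, V$; number of clusters $K$; self-paced learning parameters $\mu_{1}$ and $\mu_{2}$.

Output: clustering result $\boldsymbol{F}$.

Initialization: randomly initialize the indicator matrix $\boldsymbol{F}$ so that each column satisfies the 1-of-$K$ encoding; initialize the sample weight matrix $\boldsymbol{W}$ by randomly setting each sample weight to 0 or 1; initialize the diagonal matrix $\widetilde{\boldsymbol{D}}^{(v)}=\operatorname{diag}\left(\boldsymbol{w}^{(v)}\right)$; initialize $\lambda_{1}$ and $\lambda_{2}$; set $\varepsilon=10^{-6}$.

Iteration:

1) Update the cluster center matrix $\boldsymbol{U}^{(v)}$ according to Eq. (12);

2) Update the cluster indicator matrix $\boldsymbol{F}$ by solving Eq. (13) via Eq. (15);

3) Update the sample weight matrix $\boldsymbol{W}$ according to Eqs. (16) and (17);

4) Update the diagonal matrix $\boldsymbol{D}^{(v)}$ according to Eq. (10);

5) Update $\lambda_{1}$ and $\lambda_{2}$: $\lambda_{1} \leftarrow \mu_{1} \lambda_{1}, \lambda_{2} \leftarrow \mu_{2} \lambda_{2}$;

6) Check the stopping condition $J_{t-1}-J_{t} \leqslant \varepsilon$. If it is satisfied, stop; otherwise, return to step 1).

In the implementation, $\lambda_{1}$ and $\lambda_{2}$ are initialized in a way similar to Ref. [15]: the initial values are determined from the ranking of the sample losses.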
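The effect of step 5) can be illustrated in isolation. The sketch below holds a made-up loss vector fixed (in the actual algorithm the losses change at every iteration) and counts how many samples of one view pass the threshold of Eq. (17) as $\lambda_{1}$ and $\lambda_{2}$ grow geometrically; the function names and all numeric values are hypothetical.

```python
import math

def admitted(losses, lam1, lam2):
    """Number of samples of one view that pass the rank-dependent threshold of Eq. (17)."""
    count = 0
    for rank, loss in enumerate(sorted(losses), start=1):
        if loss < lam1 + lam2 / (math.sqrt(rank) + math.sqrt(rank - 1)):
            count += 1
    return count

def pace_schedule(losses, lam1, lam2, mu1, mu2, steps):
    """Apply lam1 <- mu1 * lam1 and lam2 <- mu2 * lam2 repeatedly and record the admitted counts."""
    history = []
    for _ in range(steps):
        history.append(admitted(losses, lam1, lam2))
        lam1, lam2 = mu1 * lam1, mu2 * lam2
    return history

losses = [0.1, 0.3, 0.6, 1.0, 1.5, 2.5]  # made-up losses, held fixed for illustration
history = pace_schedule(losses, lam1=0.2, lam2=0.1, mu1=1.6, mu2=1.6, steps=6)
print(history)
```

Larger values of $\mu_{1}$ and $\mu_{2}$ admit hard samples sooner, which is why these two parameters are the ones examined in the sensitivity analysis.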

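3 Experiments and analysis

All methods were run on a machine with a 4.0 GHz CPU, 160 GB of RAM, and a TITAN X GPU with 12 GB of memory, under Linux, and were implemented in MATLAB R2015b.

3.1 Datasets

Four public datasets are used to verify the effectiveness of the proposed algorithm; their basic information is listed in Table 1.

Table 1 Experimental datasets

Dataset	Samples	Views	Classes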
Extended Yale B	640	3	10
Notting-Hill	4 660	3	5
COIL-20	1 440	3	20
Scene15	4 485	3	15

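The four datasets are composed as follows:

1) The Extended Yale B dataset^[9] contains frontal face images of 38 individuals, 64 per individual, under different illumination conditions. This paper uses the 640 samples of the first 10 individuals and extracts three types of features: intensity, LBP^[17], and Gabor^[18]. The LBP features use a sampling scale of 8 pixels and 7×8 blocks; the Gabor features are extracted at a single scale $\lambda$=4 in four orientations $\theta$={0°, 45°, 90°, 135°}. The resulting LBP and Gabor features have 3 304 and 6 750 dimensions, respectively.

2) The Notting-Hill dataset^[19] comes from the film Notting Hill and collects 4 660 face images of 5 characters in 76 clips. The images are 120×150 pixels and are downsampled to 40×50 pixels in this paper; features are extracted in the same way as for Extended Yale B.

3) The COIL-20 dataset^[20] consists of 1 440 images of 20 generic objects observed from different angles, with 72 samples per class. The images are downsampled to 32×32 pixels, and features are extracted in the same way as for Extended Yale B.

4) The Scene15 dataset contains 15 scene categories such as office, kitchen, living room, and bedroom, with 210 to 410 samples per category. Three types of handcrafted features are extracted: pyramid histogram of visual words (PHOW)^[21], pairwise rotation invariant co-occurrence local binary pattern (PRI-CoLBP)^[22], and CENTRIST^[23]; see Refs. [21-23] for the extraction details.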

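3.2 Compared methods

To verify the effectiveness of the proposed method, it is compared with the six most relevant methods.

1) RMKMC^[11]. Robust multi-view $K$-means clustering, which combines the features of multiple views through learned per-view weights.

2) MSPL^[13]. The first method to introduce self-paced learning into multi-view clustering; it considers neither model robustness nor view diversity and is an important baseline for this paper.

3) $\mathrm{RD}-\mathrm{SSPL}_{\mathrm{best}}$. The proposed method applied to each single view in turn; the best result over all views is reported.

4) Con-MC. The features of all views are simply concatenated, and the proposed method is run on the concatenated features.

5) D-MSPL. The proposed method with view diversity but without robustness, that is, the $\mathrm{L}_{2, 1}$ norm in Eq. (8) is replaced by the $\mathrm{L}_{2}$ norm.

6) R-MSPL. The proposed method with robustness but without view diversity, that is, $\lambda_{2}$ in the self-paced regularizer of Eq. (7) is set to zero.

The self-paced learning parameters $\mu_{1}$ and $\mu_{2}$ of the proposed method take values in {1.05, 1.2, 1.4, 1.6, 1.8, 2.0}. On each of the four datasets, every method is run 20 times and the results are averaged; the final reported result is the highest average over the parameter settings.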

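3.3 Evaluation metrics

Six evaluation metrics, ACC, NMI, AR, F-score, Precision, and Recall, are used to evaluate model performance comprehensively and objectively.

1) ACC. Let $N$ be the number of clustered samples. For sample $x_{i}$, let $r_{i}$ be the label predicted by the clustering algorithm and $t_{i}$ the ground-truth label. ACC is defined as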

$f_{\mathrm{ACC}}=\frac{\sum\limits_{i=1}^{N} \delta\left(t_{i}, {map}\left(r_{i}\right)\right)}{N}$

(18)

where

$ \delta(a, b)=\left\{\begin{array}{ll}{1} & {a=b} \\ {0} & {\text{otherwise}}\end{array}\right. $

(19)

and the mapping function $map(x)$ is the best permutation mapping obtained with the Hungarian algorithm^[24].

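2) NMI. Normalized mutual information (NMI) is an information-theoretic measure of how much information two clusterings share, and it evaluates clustering on imbalanced datasets fairly reliably. Let $C$ and $C′$ denote the ground-truth and predicted partitions, with $K$ and $S$ clusters, respectively. NMI is computed as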

$ \begin{array}{c}{f_{\mathrm{NMI}}\left(\boldsymbol{C}, \boldsymbol{C}^{\prime}\right)=} \\ {\frac{\sum\limits_{i=1}^{K} \sum\limits_{j=1}^{S}\left|\boldsymbol{C}_{i} \cap \boldsymbol{C}_{j}^{\prime}\right| \lg \frac{N\left|\boldsymbol{C}_{i} \cap \boldsymbol{C}_{j}^{\prime}\right|}{|\boldsymbol{C}_{i}||\boldsymbol{C}_{j}^{\prime}|}}{\sqrt{\left(\sum\limits_{i=1}^{K}\left|\boldsymbol{C}_{i}\right| \lg \frac{\left|\boldsymbol{C}_{i}\right|}{N}\right)\left(\sum\limits_{j=1}^{S}\left|\boldsymbol{C}_{j}^{\prime}\right| \lg \frac{\left|\boldsymbol{C}_{j}^{\prime}\right|}{N}\right)}}}\end{array} $

(20)

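3) AR (adjusted Rand index), F-score, Precision, and Recall. Clustering is viewed as a series of decisions, one for each of the $N(N-1)/2$ sample pairs in a dataset, whose goal is to assign two samples to the same cluster if and only if they are similar. See Ref. [25] for detailed definitions of these four metrics.

For all six metrics, a larger value indicates better performance.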
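Three of these metrics can be sketched with the standard library alone. The sketch below is illustrative, not the evaluation code used in the paper: the label mapping in ACC is found by brute force over permutations instead of the Hungarian algorithm (practical only for small $K$), the pair-counting definitions follow the usual precision/recall convention, AR is omitted, and the toy label vectors are made up.

```python
import math
from collections import Counter
from itertools import combinations, permutations

def accuracy(truth, pred):
    """Eq. (18), with the best label mapping found by brute force.
    Assumes pred uses no more distinct labels than truth."""
    t_labels, p_labels = sorted(set(truth)), sorted(set(pred))
    best = 0
    for perm in permutations(t_labels, len(p_labels)):
        mapping = dict(zip(p_labels, perm))
        best = max(best, sum(t == mapping[p] for t, p in zip(truth, pred)))
    return best / len(truth)

def nmi(truth, pred):
    """Eq. (20); the logarithm base cancels. Assumes each partition has at least two clusters."""
    n = len(truth)
    ct, cp, joint = Counter(truth), Counter(pred), Counter(zip(truth, pred))
    num = sum(c * math.log(n * c / (ct[a] * cp[b])) for (a, b), c in joint.items())
    den = math.sqrt(sum(c * math.log(c / n) for c in ct.values())
                    * sum(c * math.log(c / n) for c in cp.values()))
    return num / den

def pair_scores(truth, pred):
    """Pair-counting precision, recall, and F-score over all N(N-1)/2 sample pairs."""
    tp = fp = fn = 0
    for i, j in combinations(range(len(truth)), 2):
        same_t, same_p = truth[i] == truth[j], pred[i] == pred[j]
        tp += same_t and same_p
        fp += (not same_t) and same_p
        fn += same_t and (not same_p)
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return precision, recall, 2 * precision * recall / (precision + recall)

truth = [0, 0, 0, 1, 1, 1]
pred = [1, 1, 0, 0, 0, 0]   # cluster names differ from the ground truth on purpose
print(accuracy(truth, pred), nmi(truth, pred), pair_scores(truth, pred))
```

Because ACC searches over label mappings and the other two metrics depend only on co-membership, none of them is affected by how the predicted clusters happen to be numbered.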

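3.4 Experimental results

3.4.1 Performance comparison between RD-MSPL and the compared methods

Tables 2 to 5 give the detailed results of each method on the four datasets, from which the following conclusions can be drawn:

Table 2 Clustering results (mean±standard deviation) on the Extended Yale B dataset

Method	NMI	ACC	AR	F-score	Precision	Recall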
RMKMC	0.071±0.036	0.165±0.022	0.019±0.016	0.123±0.012	0.115±0.014	0.132±0.012
MSPL	0.024±0.028	0.135±0.012	0.007±0.014	0.118±0.013	0.098±0.010	0.151±0.025
RD-SSPL_best	0.128±0.043	0.192±0.030	0.037±0.017	0.155±0.017	0.125±0.012	0.210±0.046
Con-MC	0.126±0.046	0.191±0.033	0.017±0.008	0.163±0.007	0.109±0.005	0.336±0.088
D-MSPL	0.130±0.024	0.210±0.021	0.035±0.008	0.142±0.008	0.128±0.007	0.158±0.015
R-MSPL	0.078±0.030	0.170±0.016	0.021±0.013	0.123±0.011	0.117±0.011	0.130±0.014
RD-MSPL	0.135±0.028	0.214±0.026	0.038±0.010	0.146±0.010	0.129±0.008	0.170±0.024
Note: bold indicates the best value.

Table 3 Clustering results (mean±standard deviation) on the Notting-Hill dataset

Method	NMI	ACC	AR	F-score	Precision	Recall
RMKMC	0.729±0.067	0.760±0.103	0.665±0.117	0.743±0.088	0.713±0.104	0.778±0.074
MSPL	0.760±0.081	0.766±0.118	0.711±0.133	0.766±0.101	0.728±0.117	0.812±0.098
RD-SSPL_best	0.692±0.055	0.752±0.085	0.648±0.100	0.727±0.077	0.708±0.077	0.749±0.081
Con-MC	0.712±0.069	0.740±0.091	0.635±0.119	0.716±0.092	0.708±0.094	0.726±0.095
D-MSPL	0.767±0.063	0.792±0.083	0.711±0.103	0.776±0.080	0.763±0.075	0.790±0.087
R-MSPL	0.374±0.277	0.526±0.192	0.324±0.285	0.476±0.220	0.470±0.223	0.482±0.217
RD-MSPL	0.787±0.066	0.808±0.093	0.738±0.109	0.796±0.085	0.779±0.082	0.813±0.090
Note: bold indicates the best value.

Table 4 Clustering results (mean±standard deviation) on the COIL-20 dataset

Method	NMI	ACC	AR	F-score	Precision	Recall
RMKMC	0.754±0.022	0.587±0.057	0.525±0.051	0.544±0.047	0.489±0.064	0.644±0.028
MSPL	0.741±0.031	0.549±0.061	0.497±0.058	0.528±0.053	0.436±0.069	0.679±0.025
RD-SSPL_best	0.723±0.017	0.531±0.036	0.475±0.033	0.507±0.030	0.409±0.039	0.672±0.037
Con-MC	0.744±0.030	0.556±0.052	0.479±0.065	0.512±0.058	0.403±0.069	0.720±0.043
D-MSPL	0.759±0.017	0.594±0.048	0.520±0.045	0.547±0.041	0.477±0.055	0.645±0.027
R-MSPL	0.668±0.210	0.545±0.178	0.470±0.219	0.500±0.204	0.457±0.201	0.558±0.200
RD-MSPL	0.772±0.016	0.620±0.041	0.547±0.040	0.573±0.037	0.510±0.049	0.656±0.022
Note: bold indicates the best value.

Table 5 Clustering results (mean±standard deviation) on the Scene15 dataset

Method	NMI	ACC	AR	F-score	Precision	Recall
RMKMC	0.505±0.008	0.468±0.022	0.309±0.009	0.359±0.008	0.347±0.012	0.361±0.014
MSPL	0.478±0.016	0.416±0.040	0.270±0.021	0.329±0.018	0.281±0.026	0.402±0.022
RD-SSPL_best	0.471±0.010	0.449±0.027	0.305±0.017	0.331±0.015	0.329±0.018	0.364±0.014
Con-MC	0.503±0.011	0.474±0.034	0.313±0.014	0.362±0.011	0.334±0.023	0.405±0.024
D-MSPL	0.510±0.011	0.471±0.031	0.310±0.014	0.352±0.012	0.356±0.017	0.354±0.010
R-MSPL	0.511±0.010	0.479±0.033	0.312±0.013	0.360±0.011	0.354±0.017	0.365±0.007
RD-MSPL	0.514±0.007	0.481±0.018	0.316±0.008	0.364±0.007	0.360±0.009	0.368±0.009
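Note: bold indicates the best value.

1) The proposed RD-MSPL is overall superior to the two most relevant multi-view $K$-means clustering algorithms, RMKMC and MSPL. On Extended Yale B, Notting-Hill, COIL-20, and Scene15, the ACC of RD-MSPL is higher than that of RMKMC by 4.9, 4.8, 3.3, and 1.3 percentage points, respectively, and higher than that of MSPL by 7.9, 4.2, 7.1, and 6.5 percentage points. The reason is that RMKMC fuses information only at the view level and ignores the differences among samples within a view, whereas MSPL considers both the view and sample levels but neglects model robustness and view diversity. RD-MSPL addresses the shortcomings of both methods in a unified model, so it captures information at both the view and sample levels in a fine-grained way while also accounting for model robustness and sample diversity.

2) RD-MSPL outperforms its two variants D-MSPL and R-MSPL on all four datasets, which confirms the benefit of modeling both robustness and view diversity. The results also show that the gain of RD-MSPL over R-MSPL (ACC higher by 3.6%, 28.2%, 7.5%, and 0.2%) is overall larger than its gain over D-MSPL (ACC higher by 0.4%, 1.7%, 2.6%, and 1%), indicating that view diversity affects the proposed model more strongly than robustness does.

3) On all four datasets, the ACC of RD-MSPL, which fuses information from multiple views, exceeds that of the best single view, RD-SSPL_best, by 2.2, 5.6, 8.9, and 3.2 percentage points, respectively, demonstrating its effectiveness in fusing multi-view information. In addition, RD-MSPL markedly outperforms Con-MC on all but a few metrics on individual datasets, with ACC higher by 2.3, 6.8, 6.4, and 0.7 percentage points. The reason is that the proposed method preserves the multi-view structure of the data and explores the relationships among views more effectively.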

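3.4.2 Contribution of multi-view features

As shown in Fig. 1, although Extended Yale B, Notting-Hill, and COIL-20 all use the three features intensity, LBP, and Gabor, the contributions of the features differ markedly across datasets. For example, on Extended Yale B, where illumination varies widely, the intensity feature is more discriminative than LBP and Gabor, whereas on Notting-Hill and COIL-20 the LBP feature clusters better. This analysis shows that datasets differ greatly in their characteristics and that it is difficult to design a single feature that suits all of them. A natural remedy is to fuse several features, and the simplest fusion is concatenation. Fig. 1 shows that simple concatenation achieves results similar to those of the best single-view feature of each dataset, but the concatenated feature is high-dimensional and lacks a clear physical meaning, which limits further improvement.

Fig. 1 Comparison of the proposed method with single-view features and simple feature concatenation ((a) NMI; (b) ACC)

Unlike simple concatenation, the proposed RD-MSPL exploits the complementary information among views more effectively. To illustrate this, Fig. 2 shows the confusion matrices of the proposed method on Notting-Hill under single-view and multi-view settings. Although LBP performs best overall among the three features, it tends to confuse class 3 with class 5; in contrast, Gabor performs worse overall but separates class 3 well. Each feature has its own strengths, and RD-MSPL captures their complementary information effectively.

Fig. 2 Confusion matrices of the proposed method under single-view and multi-view settings on the Notting-Hill dataset ((a) intensity; (b) LBP; (c) Gabor; (d) the combination of three features)

3.4.3 Parameter sensitivity analysis

To better understand the proposed model, the influence of the two key parameters $\mu_{1}$ and $\mu_{2}$ of the self-paced regularizer is analyzed further; these parameters control the pace at which the model learns diverse samples. Fig. 3 shows the parameter sensitivity on the four datasets, where the dark blue plane marks the best single-view result. Although $\mu_{1}$ and $\mu_{2}$ affect performance to some degree, the model is relatively stable overall, and over the fairly large parameter range tested, the proposed multi-view model outperforms the best single view.

Fig. 3 Parameter sensitivity analysis on the four datasets ((a) Extended Yale B; (b) Notting-Hill; (c) COIL-20; (d) Scene15)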

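4 Conclusion

For clustering large-scale multi-view data, this paper proposes a robust and diverse multi-view clustering model based on self-paced learning and gives an efficient algorithm for solving it. The model effectively reduces the influence of outliers on clustering performance and gradually adds diverse samples from different views during clustering, which helps avoid bad local minima while better capturing the complementary information of the views. Experiments on four widely used public datasets show that the model explores the relationships among views more effectively and clusters better than the two most relevant existing multi-view clustering methods, which verifies the benefit of modeling robustness and sample diversity. Future work has two parts. First, because the method is based on a linear manifold assumption and is not suited to data with nonlinear relationships, it will be combined with kernel methods to broaden its applicability. Second, because the two self-paced learning parameters are hard to set in advance in the unsupervised setting, an adaptive strategy will be designed to find suitable parameter values automatically.

References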

[1] Xie Y, Tao D C, Zhang W S, et al. On unifying multi-view self-representation for clustering by tensor multi-rank minimization[J]. International Journal of Computer Vision, 2018, 126(11): 1157–1179. [DOI:10.1007/s11263-018-1086-2]

[2] Chao G Q, Sun S L, Bi J B. A survey on multi-view clustering[J]. arXiv: 1712.06246, 2017.

[3] Zhao Y W, Zhang E H, Lu J W, et al. Gait recognition via multiple features and views information fusion[J]. Journal of Image and Graphics, 2009, 14(3): 388–393. [赵永伟, 张二虎, 鲁继文, 等. 多特征和多视角信息融合的步态识别[J]. 中国图象图形学报, 2009, 14(3): 388–393. ] [DOI:10.11834/jig.20090302]

[4] Zong L L. Research on multi-view clustering[D]. Dalian: Dalian University of Technology, 2017. [宗林林.多视角聚类研究[D].大连: 大连理工大学, 2017.] http://cdmd.cnki.com.cn/Article/CDMD-10141-1017188571.htm

[5] Liu X P, Lu J T, Xie W J. Foot plant detection based on spectral clustering algorithm for motion capture data[J]. Journal of Image and Graphics, 2014, 19(9): 1306–1315. [刘晓平, 陆劲挺, 谢文军. 运动捕捉数据中足迹的谱聚类检测方法[J]. 中国图象图形学报, 2014, 19(9): 1306–1315. ] [DOI:10.11834/jig.20140907]

[6] Wang X B, Guo X J, Lei Z, et al. Exclusivity-consistency regularized multi-view subspace clustering[C]//Proceedings of the 30th IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, HI, USA: IEEE, 2017: 1-9.[DOI: 10.1109/CVPR.2017.8]

[7] Wang Y, Wu L, Lin X M, et al. Multiview spectral clustering via structured low-rank matrix factorization[J]. IEEE Transactions on Neural Networks and Learning Systems, 2018, 29(10): 4833–4843. [DOI:10.1109/TNNLS.2017.2777489]

[8] Yin M, Gao J B, Xie S L, et al. Multiview subspace clustering via tensorial t-product representation[J]. IEEE Transactions on Neural Networks and Learning Systems, 2019, 30(3): 851–864. [DOI:10.1109/TNNLS.2018.2851444]

[9] Zhang C Q, Fu H Z, Liu S, et al. Low-rank tensor constrained multiview subspace clustering[C]//Proceedings of the 15th IEEE International Conference on Computer Vision. Santiago: IEEE, 2015: 1582-1590.[DOI: 10.1109/ICCV.2015.185]

[10] Li T, Ding C. The relationships among various nonnegative matrix factorization methods for clustering[C]//Proceedings of the 6th IEEE International Conference on Data Mining. Hong Kong, China: IEEE, 2006: 362-371.[DOI: 10.1109/ICDM.2006.160]

[11] Cai X, Nie F P, Huang H. Multi-view K-means clustering on big data[C]//Proceedings of the 23rd International Joint Conference on Artificial Intelligence. Beijing: AAAI Press, 2013: 2598-2604.

[12] Xu J L, Han J W, Nie F P, et al. Re-weighted discriminatively embedded $K$-means for multi-view clustering[J]. IEEE Transactions on Image Processing, 2017, 26(6): 3016–3027. [DOI:10.1109/TIP.2017.2665976]

[13] Xu C, Tao D C, Xu C. Multi-view self-paced learning for clustering[C]//Proceedings of the 24th International Joint Conference on Artificial Intelligence. Buenos Aires, Argentina: AAAI Press, 2015: 3974-3980.

[14] Ding C, He X F, Simon H D. Nonnegative Lagrangian relaxation of K-means and spectral clustering[C]//Proceedings of the 16th European Conference on Machine Learning. Porto, Portugal: Springer, 2005: 530-538.[DOI: 10.1007/11564096_51]

[15] Jiang L, Meng D Y, Yu S I, et al. Self-paced learning with diversity[C]//Proceedings of the 27th International Conference on Neural Information Processing Systems. Montreal: MIT Press, 2014: 2078-2086.

[16] Zhang D W, Meng D Y, Han J W. Co-saliency detection via a self-paced multiple-instance learning framework[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(5): 865–878. [DOI:10.1109/TPAMI.2016.2567393]

[17] Ojala T, Pietikainen M, Maenpaa T. Multiresolution gray-scale and rotation invariant texture classification with local binary patterns[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2002, 24(7): 971–987. [DOI:10.1109/TPAMI.2002.1017623]

[18] Lades M, Vorbruggen J C, Buhmann J, et al. Distortion invariant object recognition in the dynamic link architecture[J]. IEEE Transactions on Computers, 1993, 42(3): 300–311. [DOI:10.1109/12.210173]

[19] Zhang Y F, Xu C S, Lu H Q, et al. Character identification in feature-length films using global face-name matching[J]. IEEE Transactions on Multimedia, 2009, 11(7): 1276–1288. [DOI:10.1109/TMM.2009.2030629]

[20] Cao X C, Zhang C Q, Fu H Z, et al. Diversity-induced multi-view subspace clustering[C]//Proceedings of the 28th IEEE Conference on Computer Vision and Pattern Recognition. Boston: IEEE, 2015: 586-594.[DOI: 10.1109/CVPR.2015.7298657]

[21] Bosch A, Zisserman A, Munoz X. Image classification using random forests and ferns[C]//Proceedings of the 11th IEEE International Conference on Computer Vision. Rio de Janeiro, Brazil: IEEE, 2007: 1-8.[DOI: 10.1109/ICCV.2007.4409066]

[22] Qi X B, Xiao R, Li C G, et al. Pairwise rotation invariant co-occurrence local binary pattern[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2014, 36(11): 2199–2213. [DOI:10.1109/TPAMI.2014.2316826]

[23] Wu J X, Rehg J M. CENTRIST: a visual descriptor for scene categorization[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2011, 33(8): 1489–1501. [DOI:10.1109/TPAMI.2010.224]

[24] Cai D, He X F, Han J W. Document clustering using locality preserving indexing[J]. IEEE Transactions on Knowledge and Data Engineering, 2005, 17(12): 1624–1637. [DOI:10.1109/TKDE.2005.198]

[25] Manning C D, Raghavan P, Schütze H. Introduction to Information Retrieval[M]. Cambridge: Cambridge University Press, 2008.