Consensus graph learning-based self-supervised ensemble clustering
Vol. 28, Issue 4, Pages: 1069-1078 (2023)
Published: 16 April 2023
DOI: 10.11834/jig.210947
Geng Weifeng, Wang Xiang, Jing Liping, Yu Jian. 2023. Consensus graph learning-based self-supervised ensemble clustering. Journal of Image and Graphics, 28(04):1069-1078
Objective
With the development of massive data acquisition techniques in real-world applications and the rising cost of data annotation, self-supervised learning has become an important strategy for massive data analysis. However, how to extract useful supervision information from massive data, and how to learn effectively under that supervision, remain research difficulties that constrain progress in this direction. To this end, a self-supervised ensemble clustering framework based on consensus graph learning is proposed.
Method
The framework consists of three functional modules. First, a consensus graph is constructed from multiple base learners in ensemble learning. Second, a graph neural network analyzes the consensus graph to capture optimized node representations and the clustering structure of the nodes, and a subset of high-confidence nodes with their cluster labels is selected from the clustering to generate supervision information. Third, under this label supervision, the ensemble member base learners are updated jointly with the remaining unlabeled samples. These modules are iterated alternately, ultimately improving unsupervised clustering performance.
Result
To verify the effectiveness of the framework, a series of experiments was designed on standard datasets, including image and text data. The results show that the proposed method consistently outperforms existing clustering methods. In particular, on MNIST-Test (Modified National Institute of Standards and Technology database), our method achieves 97.78% accuracy, 3.85% higher than the best existing method.
Conclusion
The proposed method exploits graph representation learning to improve the capture of supervision information in self-supervised learning; the effective acquisition of supervision information in turn strengthens the construction of ensemble members, ultimately improving the mining of the intrinsic structure of unlabeled massive data.
Objective
Clustering is a fundamental machine learning task that partitions data into meaningful groups. It is applied in domains such as image segmentation and anomaly detection. In addition, clustering serves data preprocessing tasks, such as partitioning data into sub-blocks, generating pseudo-labels, and removing outliers, to simplify complex tasks and improve their performance. Self-supervised learning has become an essential technique for massive data analysis. However, extracting effective supervision information from unlabeled data and learning effectively under it remain challenging.
Method
A consensus graph learning-based self-supervised ensemble clustering (CGL-SEC) framework is developed. It consists of three main modules: 1) constructing the consensus graph from several ensemble components (i.e., the base clustering methods); 2) extracting supervision information by learning a representation of the consensus graph; and 3) clustering the graph nodes, where a subset of high-confidence nodes is selected as labeled samples. To optimize the ensemble components and the corresponding consensus graph, the base clustering methods are retrained on the selected labeled samples together with the remaining unlabeled samples. The final clustering result is optimized iteratively until the learning process converges.
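To make module 1) concrete, the following is a minimal sketch of consensus-graph construction via evidence accumulation (Fred and Jain, 2005, cited below): the edge weight between two samples is the fraction of base learners that cluster them together. It assumes k-means base learners on raw features; the paper's actual construction and member-generation strategy may differ.

# Sketch: consensus graph (co-association matrix) from an ensemble of
# k-means base learners. Assumption: evidence accumulation; not
# necessarily the paper's exact construction.
import numpy as np
from sklearn.cluster import KMeans

def consensus_graph(X, n_clusters, n_members=10, seed=0):
    n = X.shape[0]
    A = np.zeros((n, n))
    rng = np.random.RandomState(seed)
    for _ in range(n_members):
        # Each ensemble member is a differently seeded base clustering.
        labels = KMeans(n_clusters=n_clusters, n_init=10,
                        random_state=int(rng.randint(1 << 31))).fit_predict(X)
        # Connect samples that this member places in the same cluster.
        A += (labels[:, None] == labels[None, :]).astype(float)
    return A / n_members  # edge weight = fraction of members that agree

The resulting weighted adjacency matrix can then be fed, together with the node features, to the graph neural network of module 2).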
Result
A series of experiments is carried out on benchmark datasets, including both image and text data. In particular, CGL-SEC exceeds the best baseline by 3.85% in clustering accuracy on the Modified National Institute of Standards and Technology database (MNIST-Test). For context, deep embedded clustering (DEC) optimizes the data representation and cluster assignment simultaneously: it uses the data itself as supervision information, pre-trains an auto-encoder with a reconstruction loss, computes soft cluster assignments of the embedded features, and minimizes the Kullback-Leibler (KL) divergence between the soft assignments and an auxiliary target distribution (sketched after this paragraph). To further improve performance, the deep clustering network (DCN) replaces soft assignment with hard clustering, and improved deep embedded clustering (IDEC) imposes local structure constraints. In contrast to using the data itself as supervision, the pseudo-label strategy is a self-supervised approach that uses the neural network's own predictions as labels to simulate supervision information. DeepCluster uses K-means to generate pseudo-labels that guide the training of a convolutional network; however, the generated pseudo-labels have low confidence and are prone to trivial solutions in the early stage of training. Deep embedded clustering with data augmentation (DEC-DA) and MixMatch use predictions on augmented samples as supervision information for the original data, which improves the accuracy of the supervision to some extent, but such methods are difficult to extend to text and other domains. Deep adaptive clustering iteratively trains the network on high-confidence pseudo-label subsets selected from its own predictions, but the data distribution information carried by low-confidence samples is ignored. Pseudo-semi-supervised clustering selects a subset of high-confidence pseudo-labels by voting and trains a semi-supervised neural network on all samples. Although the ensemble strategy can raise pseudo-label confidence, the voting strategy considers only category assignments and ignores the feature representation of the samples themselves, which can reduce clustering performance in some cases. Ensemble learning is a representative machine learning paradigm that embodies "group intelligence": it improves overall prediction performance by training multiple base learners and coordinating their predictions. In pseudo-label-based clustering tasks, it can coordinate multiple base learners to obtain high-confidence pseudo-labels. However, acquiring effective supervision information remains an open problem: current pseudo-label-based ensemble clustering methods consider only the category information of samples when capturing labels, ignoring useful information such as the samples' own feature representations and the clustering structure among samples.
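The DEC objective summarized above can be stated compactly. The following sketch reproduces the standard formulas of Xie et al. (2016), referenced below; z and centers are illustrative names for the embedded features and cluster centers.

import numpy as np

def soft_assignment(z, centers, alpha=1.0):
    # Student's t-kernel similarity between embeddings and centers (q_ij).
    d2 = ((z[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
    q = (1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(axis=1, keepdims=True)

def target_distribution(q):
    # Auxiliary target p_ij: square q and normalize by cluster frequency,
    # which sharpens high-confidence assignments.
    w = q ** 2 / q.sum(axis=0)
    return w / w.sum(axis=1, keepdims=True)

Training alternates between recomputing the target p from q and minimizing KL(p || q) with respect to the network parameters and centers.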
Conclusion
Graph neural networks can exploit the content information of nodes and the structural information between nodes at the same time. Designing a self-supervised ensemble clustering method based on consensus graph representation learning requires making full use of both sample features and the relationships between samples in ensemble learning; obtaining higher-confidence pseudo-labels as supervision information and improving self-supervised clustering performance requires mining global and local information simultaneously. We therefore learn an ensemble representation of the data through a graph neural network, improve the confidence of the pseudo-labels, and train the entire model iteratively under self-supervision. In summary: 1) a consensus graph learning-based ensemble clustering framework is developed, which exploits multi-level information from the clustering ensemble, such as sample features and category structure; 2) a self-supervision method is proposed that uses a graph neural network to mine the global and local information of the consensus graph and obtains high-confidence pseudo-labels as supervision information (an illustrative selection rule is sketched below); 3) experiments demonstrate the potential of the consensus graph learning ensemble clustering method on image and text datasets.
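As an illustration of contribution 2), a high-confidence pseudo-label subset can be obtained by thresholding the soft assignments produced for the graph nodes. The threshold tau and the function name are hypothetical; the paper's actual selection criterion may differ.

import numpy as np

def select_confident(q, tau=0.9):
    # Keep only nodes whose strongest cluster assignment reaches tau;
    # their argmax clusters serve as pseudo-labels for retraining the
    # ensemble members, while the rest remain unlabeled.
    conf = q.max(axis=1)
    idx = np.where(conf >= tau)[0]
    return idx, q.argmax(axis=1)[idx]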
Keywords: ensemble clustering; self-supervised clustering; graph representation learning; consensus graph; pseudo-label confidence
Abdala D D, Wattuya P and Jiang X Y. 2010. Ensemble clustering via random walker consensus strategy//Proceedings of the 20th International Conference on Pattern Recognition. Istanbul, Turkey: IEEE: 1433-1436 [DOI: 10.1109/icpr.2010.354]
Berthelot D, Carlini N, Goodfellow I, Oliver A, Papernot N and Raffel C. 2019. MixMatch: a holistic approach to semi-supervised learning//Proceedings of the 33rd International Conference on Neural Information Processing Systems. Vancouver, Canada: [s.n.]: #454
Bianchi F M, Grattarola D and Alippi C. 2020. Spectral clustering with graph neural networks for graph pooling//Proceedings of the 37th International Conference on Machine Learning. [s.l.]: PMLR: 874-883
Caron M, Bojanowski P, Joulin A and Douze M. 2018. Deep clustering for unsupervised learning of visual features//Proceedings of the 15th European Conference on Computer Vision. Munich, Germany: Springer: 139-156 [DOI: 10.1007/978-3-030-01264-9_9]
Chang J L, Wang L F, Meng G F, Xiang S M and Pan C H. 2017. Deep adaptive image clustering//Proceedings of 2017 IEEE International Conference on Computer Vision. Venice, Italy: IEEE: 5880-5888 [DOI: 10.1109/iccv.2017.626]
Fred A L N and Jain A K. 2005. Combining multiple clusterings using evidence accumulation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 27(6): 835-850 [DOI: 10.1109/tpami.2005.113]
Gomes R, Krause A and Perona P. 2010. Discriminative clustering by regularized information maximization//Proceedings of the 23rd International Conference on Neural Information Processing Systems. Vancouver, Canada: Curran Associates Inc.: 775-783
Guo X F, Gao L, Liu X W and Yin J P. 2017. Improved deep embedded clustering with local structure preservation//Proceedings of the 26th International Joint Conference on Artificial Intelligence. Melbourne, Australia: IJCAI.org: 1753-1759 [DOI: 10.24963/ijcai.2017/243]
Guo X F, Zhu E, Liu X W and Yin J P. 2018. Deep embedded clustering with data augmentation//Proceedings of the 10th Asian Conference on Machine Learning. Beijing, China: PMLR: 550-565
Gupta D, Ramjee R, Kwatra N and Sivathanu M. 2020. Unsupervised clustering using pseudo-semi-supervised learning//Proceedings of the 8th International Conference on Learning Representations. Addis Ababa, Ethiopia: OpenReview.net: #1520
Hu W H, Miyato T, Tokui S, Matsumoto E and Sugiyama M. 2017. Learning discrete representations via information maximizing self-augmented training//Proceedings of the 34th International Conference on Machine Learning. Sydney, Australia: PMLR: 1558-1567 [DOI: 10.48550/arXiv.1702.08720]
Jain A K. 2010. Data clustering: 50 years beyond K-means. Pattern Recognition Letters, 31(8): 651-666 [DOI: 10.1016/j.patrec.2009.09.011]
Kipf T N and Welling M. 2016. Variational graph auto-encoders [EB/OL]. [2021-10-13]. https://arxiv.org/pdf/1611.07308.pdf
Kipf T N and Welling M. 2017. Semi-supervised classification with graph convolutional networks//Proceedings of the 5th International Conference on Learning Representations. Toulon, France: OpenReview.net
Li T, Ding C and Jordan M I. 2007. Solving consensus and semi-supervised clustering problems using nonnegative matrix factorization//Proceedings of the 7th IEEE International Conference on Data Mining (ICDM). Omaha, USA: IEEE: 577-582 [DOI: 10.1109/icdm.2007.98]
Liu H F, Liu T L, Wu J J, Tao D C and Fu Y. 2015. Spectral ensemble clustering//Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. Sydney, Australia: ACM: 715-724 [DOI: 10.1145/2783258.2783287]
Lu Z W, Peng Y X and Xiao J G. 2008. From comparing clusterings to combining clusterings//Proceedings of the 23rd National Conference on Artificial Intelligence. Chicago, USA: AAAI Press: 665-670 [DOI: 10.5555/1620163.1620175]
MacQueen J B. 1967. Some methods for classification and analysis of multivariate observations//The 5th Berkeley Symposium on Mathematical Statistics and Probability. Berkeley, USA: University of California Press: 281-297
Pan S R, Hu R Q, Long G D, Jiang J, Yao L N and Zhang C Q. 2018. Adversarially regularized graph autoencoder for graph embedding//Proceedings of the 27th International Joint Conference on Artificial Intelligence. Stockholm, Sweden: IJCAI.org: 2609-2615 [DOI: 10.24963/ijcai.2018/362]
Rasmus A, Valpola H, Honkala M, Berglund M and Raiko T. 2015. Semi-supervised learning with ladder networks//Proceedings of the 28th International Conference on Neural Information Processing Systems. Montreal, Canada: MIT Press: 3546-3554
Sagi O and Rokach L. 2018. Ensemble learning: a survey. WIREs Data Mining and Knowledge Discovery, 8(4): #1249 [DOI: 10.1002/widm.1249]
Shi J B and Malik J. 2000. Normalized cuts and image segmentation. IEEE Transactions on Pattern Analysis and Machine Intelligence, 22(8): 888-905 [DOI: 10.1109/34.868688]
Strehl A and Ghosh J. 2003. Cluster ensembles: a knowledge reuse framework for combining multiple partitions. The Journal of Machine Learning Research, 3: 583-617 [DOI: 10.1162/153244303321897735]
Tao Z Q, Liu H F and Fu Y. 2017. Simultaneous clustering and ensemble//Proceedings of the 31st AAAI Conference on Artificial Intelligence. San Francisco, USA: AAAI Press: 1546-1552 [DOI: 10.1609/aaai.v31i1.10720]
Tao Z Q, Liu H F, Li J, Wang Z W and Fu Y. 2019. Adversarial graph embedding for ensemble clustering//Proceedings of the 28th International Joint Conference on Artificial Intelligence. Macao, China: IJCAI.org: 3562-3568 [DOI: 10.24963/ijcai.2019/494]
Topchy A, Jain A K and Punch W. 2003. Combining multiple weak clusterings//Proceedings of the 3rd IEEE International Conference on Data Mining. Melbourne, USA: IEEE: 331-338 [DOI: 10.1109/icdm.2003.1250937]
Wu J J, Liu H F, Xiong H, Cao J and Chen J. 2015. K-means-based consensus clustering: a unified view. IEEE Transactions on Knowledge and Data Engineering, 27(1): 155-169 [DOI: 10.1109/tkde.2014.2316512]
Xie J Y, Girshick R B and Farhadi A. 2016. Unsupervised deep embedding for clustering analysis//Proceedings of the 33rd International Conference on Machine Learning. New York, USA: JMLR.org: 478-487
Yang B, Fu X, Sidiropoulos N D and Hong M Y. 2017. Towards K-means-friendly spaces: simultaneous deep learning and clustering//Proceedings of the 34th International Conference on Machine Learning. Sydney, Australia: JMLR.org: 3861-3870 [DOI: 10.1145/3149166.3149171]
Zhao D L and Tang X O. 2008. Cyclizing clusters via Zeta function of a graph//Proceedings of the 21st International Conference on Neural Information Processing Systems. Vancouver, Canada: Curran Associates Inc.: 1953-1960