A Gaussian mixture variational autoencoder based clustering network
Vol. 27, Issue 7, Pages 2148-2156 (2022)
Published: 16 July 2022
Accepted: 09 November 2020
DOI: 10.11834/jig.200467
Huahua Chen, Zhe Chen, Chunsheng Guo, Na Ying, Xueyi Ye. A Gaussian mixture variational autoencoder based clustering network[J]. Journal of Image and Graphics, 27(7): 2148-2156 (2022)
Objective
Classical clustering algorithms suffer from the curse of dimensionality when processing high-dimensional data, which greatly increases computational cost and degrades performance. Clustering networks built on autoencoders or variational autoencoders improve clustering results, but the features extracted by an autoencoder are often poor, and the variational autoencoder suffers from problems such as posterior collapse, both of which affect the clustering results. We therefore propose a clustering network based on a Gaussian mixture variational autoencoder.
Method
A variational autoencoder is built with a Gaussian mixture distribution as the prior of the latent variable, and the autoencoder network is trained with an objective function composed of the reconstruction error and the Kullback-Leibler (KL) divergence between the prior and posterior distributions of the latent variable. The trained encoder extracts features from the input data and is combined with a clustering layer to form a clustering network, which is trained with an objective function built from the KL divergence between the soft assignment distribution of the encoder's latent features and the auxiliary target distribution of the soft assignment probabilities. The variational autoencoder is implemented with convolutional neural networks.
Result
To verify the effectiveness of the proposed algorithm, the network is evaluated on the benchmark datasets MNIST (Modified National Institute of Standards and Technology Database) and Fashion-MNIST. Clustering accuracy (ACC) and normalized mutual information (NMI) reach 95.86% and 91% on MNIST and 61.34% and 62.5% on Fashion-MNIST, improvements of varying degrees over existing methods.
Conclusion
The experimental results show that the proposed network achieves good clustering results and outperforms several currently popular clustering methods.
Objective
Effective automatic grouping of data into clusters, especially for high-dimensional datasets, is a key issue in machine learning and data analysis. It underlies many signal processing applications, including computer vision, pattern recognition, speech and audio recognition, wireless communication, and text classification. Classical clustering algorithms suffer from high computational cost and poor performance on high-dimensional data because of the curse of dimensionality. Clustering methods based on deep neural networks are promising for real-data clustering owing to their high representational ability, and clustering networks built on an autoencoder (AE) or a variational autoencoder (VAE) improve clustering effectiveness. However, their clustering performance is easily degraded: the features extracted by an AE are often too poor to distinguish the data, and the VAE suffers from posterior collapse when determining the posterior parameters of its latent variable. Both are insufficient for separating multiple classes, especially when two classes in a multiclass dataset share very similar means and variances. To address these deficiencies of the AE and VAE, we present a clustering network based on a VAE with a Gaussian mixture (GM) prior.
Method
The VAE, a maximum-likelihood generative model, maximizes the evidence lower bound (ELBO) by minimizing the model reconstruction error together with the Kullback-Leibler (KL) divergence between the posterior distribution of the latent variable and a hypothesized prior, thereby maximizing a lower bound on the marginal log-likelihood (LL) of the observed data. Because the standard VAE uses a single Gaussian as the approximate posterior, the KL term in the ELBO struggles to match the ground-truth posterior, whereas the true latent-variable space may be arbitrarily complicated or even multimodal.
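For reference, the ELBO referred to above has the standard form from the VAE literature (a textbook identity, with notation chosen here for illustration): for an encoder q_phi(z|x), a decoder p_theta(x|z), and a prior p(z),

```latex
\log p_\theta(x) \;\ge\; \mathcal{L}_{\mathrm{ELBO}}(\theta,\phi;x)
  = \mathbb{E}_{q_\phi(z\mid x)}\!\left[\log p_\theta(x\mid z)\right]
  \;-\; D_{\mathrm{KL}}\!\left(q_\phi(z\mid x)\,\middle\|\,p(z)\right),
```

so minimizing the reconstruction error and the KL term jointly maximizes this lower bound on the marginal log-likelihood.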
To further improve the description of the latent variables, we build a VAE whose latent-variable prior is a GM distribution. The data representation linked to this GM prior is approximated by a posterior distribution of the latent variable that is itself a GM model, and the GM-based VAE is trained with a cost function composed of the reconstruction error and the KL divergence between the posterior and prior distributions. Because the KL divergence between two GM distributions has no closed-form solution, we approximate the cost function with a variational lower bound, using the fact that the KL divergence between two single Gaussians does have a closed-form solution, and thereby optimize the VAE with its GM priors.
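Two standard facts underlie this approximation; both come from the cited literature, and the notation below is ours. The KL divergence between two single d-dimensional Gaussians is closed form, and for two GM distributions the variational approximation of Hershey and Olsen (2007), cited below, can be used:

```latex
D_{\mathrm{KL}}\!\left(\mathcal{N}(\mu_a,\Sigma_a)\,\middle\|\,\mathcal{N}(\mu_b,\Sigma_b)\right)
  = \tfrac{1}{2}\Big[\log\tfrac{|\Sigma_b|}{|\Sigma_a|}
  + \operatorname{tr}\!\big(\Sigma_b^{-1}\Sigma_a\big)
  + (\mu_b-\mu_a)^{\!\top}\Sigma_b^{-1}(\mu_b-\mu_a) - d\Big],

D_{\mathrm{KL}}(f\,\|\,g) \;\approx\; \sum_a \pi_a
  \log\frac{\sum_{a'} \pi_{a'}\, e^{-D_{\mathrm{KL}}(f_a\|f_{a'})}}
           {\sum_b \omega_b\, e^{-D_{\mathrm{KL}}(f_a\|g_b)}},
\qquad f=\textstyle\sum_a \pi_a f_a,\;\; g=\textstyle\sum_b \omega_b g_b.
```

The exact bound used in the paper may differ in detail; this is the commonly used variational form.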
A clustering network is then constructed by appending a clustering layer behind the VAE. To improve clustering performance, Student's t-distribution is used as the kernel to compute the soft assignment between each embedded point (a latent feature of the VAE) and the cluster centroids, and a KL divergence cost is constructed between the soft assignment distribution and an auxiliary target distribution derived from it.
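A minimal sketch of the soft assignment and its auxiliary target distribution, following the formulation popularized by Xie et al. (2016), which is cited below; the function names are ours, and the degrees-of-freedom parameter of Student's t-distribution is assumed to be 1:

```python
import numpy as np

def soft_assignment(z, centroids, alpha=1.0):
    """Student's t-kernel soft assignment q_ij between embedded points
    z of shape (n, d) and cluster centroids of shape (k, d)."""
    # squared Euclidean distance of every point to every centroid, shape (n, k)
    dist2 = ((z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    q = (1.0 + dist2 / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(axis=1, keepdims=True)

def target_distribution(q):
    """Auxiliary target p_ij: sharpens q while normalizing by the soft
    cluster frequencies so large clusters do not dominate the loss."""
    weight = q ** 2 / q.sum(axis=0)
    return weight / weight.sum(axis=1, keepdims=True)
```

The clustering loss is then the KL divergence sum_i sum_j p_ij log(p_ij / q_ij), minimized with respect to both the encoder parameters and the centroids.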
The commonly used VAE computes the latent variable with fully-connected neural networks, which are prone to overfitting. The clustering network is therefore implemented with convolutional neural networks (CNNs) consisting of three convolutional layers and two fully-connected layers, and no pooling layers are used, because pooling would discard useful information in the data. The network is trained by optimizing the KL divergence cost with the stochastic gradient descent (SGD) method, with the network parameters initialized from the trained VAE. In short, the clustering network is obtained by the two-step training described above: the VAE is trained first, and its parameters serve as the initial values for training the subsequent clustering layer.
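A hedged PyTorch sketch of an encoder matching this description; the abstract fixes only the structure (three convolutional layers, two fully-connected layers, no pooling), so the channel counts, kernel sizes, latent dimension, and the 28×28 single-channel input assumed below are illustrative, not the authors' exact configuration:

```python
import torch
import torch.nn as nn

class ConvEncoder(nn.Module):
    """Three strided conv layers (downsampling without pooling) followed
    by two fully-connected stages; all layer sizes are assumptions."""
    def __init__(self, latent_dim=10):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, stride=2, padding=1),   # 28x28 -> 14x14
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, stride=2, padding=1),  # 14x14 -> 7x7
            nn.ReLU(),
            nn.Conv2d(64, 128, kernel_size=3, stride=2, padding=1), # 7x7 -> 4x4
            nn.ReLU(),
        )
        self.fc = nn.Linear(128 * 4 * 4, 256)  # first fully-connected layer
        # the two parallel heads below form the second fully-connected stage,
        # emitting the Gaussian posterior parameters of the latent variable
        self.fc_mu = nn.Linear(256, latent_dim)
        self.fc_logvar = nn.Linear(256, latent_dim)

    def forward(self, x):
        h = torch.relu(self.fc(self.conv(x).flatten(1)))
        return self.fc_mu(h), self.fc_logvar(h)
```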
Result
To test the effectiveness of the proposed algorithm, the network is evaluated on two multiclass benchmark datasets: MNIST (Modified National Institute of Standards and Technology Database), which contains images of 10 categories of handwritten digits, and Fashion-MNIST, which consists of grayscale images labeled with 10 categories. Our algorithm achieves 95.86% accuracy (ACC) and 91% normalized mutual information (NMI) on MNIST, and 61.34% ACC and 62.5% NMI on Fashion-MNIST. The network matches the performance of ClusterGAN while using fewer parameters and less memory. The experimental results illustrate that our network achieves competitive clustering performance.
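For concreteness, the two reported metrics are commonly computed as follows; clustering ACC matches predicted clusters to ground-truth classes with the Hungarian algorithm (Kuhn, 2005, cited below), and NMI here is taken from scikit-learn, a library choice of ours rather than one stated in the paper:

```python
import numpy as np
from scipy.optimize import linear_sum_assignment
from sklearn.metrics import normalized_mutual_info_score

def clustering_accuracy(y_true, y_pred):
    """Unsupervised ACC: best one-to-one cluster-to-class matching."""
    k = max(y_true.max(), y_pred.max()) + 1
    count = np.zeros((k, k), dtype=np.int64)
    for t, p in zip(y_true, y_pred):
        count[p, t] += 1
    # the Hungarian algorithm on negated counts maximizes the matches
    row, col = linear_sum_assignment(-count)
    return count[row, col].sum() / len(y_true)

# NMI needs no label matching, since it is permutation invariant:
# nmi = normalized_mutual_info_score(y_true, y_pred)
```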
Conclusion
We construct a VAE-based clustering network with a GM prior. A novel VAE framework improves the representational ability of the latent variable through a GM latent-variable prior, and the KL divergence between the posterior and prior GM distributions is optimized so that the latent features of the VAE represent the data and reconstruct the input well. To improve clustering performance, the clustering network is trained by optimizing the KL divergence between the soft assignment distribution of the VAE's latent features and the auxiliary target distribution of the soft assignment. Future work will address the case where the numbers of Gaussian components in the prior and posterior differ, and will further strengthen the model's ability to represent complex texture features.
Keywords: clustering; Gaussian mixture distribution; variational autoencoder (VAE); soft assignment; Kullback-Leibler (KL) divergence
Chazan S E, Gannot S and Goldberger J. 2019. Deep clustering based on a mixture of autoencoders//Proceedings of the 29th IEEE International Workshop on Machine Learning for Signal Processing. Pittsburgh, USA: IEEE: 1-6 [DOI: 10.1109/MLSP.2019.8918720]
Cheng B Z, Zhao C H, Zhang L L and Zhang J P. 2017. Joint spatial preprocessing and spectral clustering based collaborative sparsity anomaly detection for hyperspectral images. Acta Optica Sinica, 37(4): 296-306 [DOI: 10.3788/AOS201737.0428001]
Dempster A P, Laird N M and Rubin D B. 1977. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society: Series B (Methodological), 39(1): 1-22 [DOI: 10.2307/2984875]
Dilokthanakul N, Mediano P A M, Garnelo M, Lee M C H, Salimbeni H, Arulkumaran K and Shanahan M. 2017. Deep unsupervised clustering with Gaussian mixture variational autoencoders [EB/OL]. [2020-05-06]. https://arxiv.org/pdf/1611.02648.pdf
Duan L, Aggarwal C, Ma S and Sathe S. 2019. Improving spectral clustering with deep embedding and cluster estimation//Proceedings of 2019 IEEE International Conference on Data Mining. Beijing, China: IEEE: 170-179 [DOI: 10.1109/ICDM.2019.00027]
Ester M, Kriegel H P, Sander J and Xu X W. 1996. A density-based algorithm for discovering clusters in large spatial databases with noise [EB/OL]. [2020-05-06]. https://max.book118.com/html/2017/0725/124226331.shtm
Fraley C and Raftery A E. 1998. How many clusters? Which clustering method? Answers via model-based cluster analysis. The Computer Journal, 41(8): 578-588 [DOI: 10.1093/comjnl/41.8.578]
Guo C S, Zhou J L, Chen H H, Ying N, Zhang J W and Zhou D. 2020. Variational autoencoder with optimizing Gaussian mixture model priors. IEEE Access, 8: 43992-44005 [DOI: 10.1109/ACCESS.2020.2977671]
Guo X F, Gao L, Liu X W and Yin J P. 2017. Improved deep embedded clustering with local structure preservation//Proceedings of the 26th International Joint Conference on Artificial Intelligence. Melbourne, Australia: AAAI: 1753-1759 [DOI: 10.24963/ijcai.2017/243]
Hershey J R and Olsen P A. 2007. Approximating the Kullback Leibler divergence between Gaussian mixture models//Proceedings of 2007 IEEE International Conference on Acoustics, Speech and Signal Processing. Honolulu, USA: IEEE: IV-317-IV-320 [DOI: 10.1109/ICASSP.2007.366913]
Jiang Z X, Zheng Y, Tan H C, Tang B S and Zhou H N. 2017. Variational deep embedding: an unsupervised and generative approach to clustering//Proceedings of the 26th International Joint Conference on Artificial Intelligence. Melbourne, Australia: AAAI: 1965-1972 [DOI: 10.24963/ijcai.2017/273]
Kingma D P and Welling M. 2014. Auto-encoding variational Bayes [EB/OL]. [2020-05-06]. https://arxiv.org/pdf/1312.6114.pdf
Kuhn H W. 2005. The Hungarian method for the assignment problem. Naval Research Logistics, 52(1): 7-21 [DOI: 10.1002/nav.20053]
LeCun Y, Bottou L, Bengio Y and Haffner P. 1998. Gradient-based learning applied to document recognition. Proceedings of the IEEE, 86(11): 2278-2324 [DOI: 10.1109/5.726791]
Lim K L, Jiang X D and Yi C Y. 2020. Deep clustering with variational autoencoder. IEEE Signal Processing Letters, 27: 231-235 [DOI: 10.1109/LSP.2020.2965328]
Lu R, Xiang L, Liu M R and Yang Q. 2012. Discovering news topics from microblogs based on hidden topics analysis and text clustering. Pattern Recognition and Artificial Intelligence, 25(3): 382-387 [DOI: 10.3969/j.issn.1003-6059.2012.03.004]
MacQueen J. 1967. Some methods for classification and analysis of multivariate observations [EB/OL]. [2020-05-06]. http://citeseerx.ist.psu.edu/viewdoc/download;jsessionid=4172CEB4912D3E21EF68579C8888BA56?doi=10.1.1.308.8619&rep=rep1&type=pdf
Mukherjee S, Asnani H, Lin E and Kannan S. 2019. ClusterGAN: latent space clustering in generative adversarial networks [EB/OL]. [2020-05-06]. https://arxiv.org/pdf/1809.03627v1.pdf
Opochinsky Y, Chazan S E, Gannot S and Goldberger J. 2020. K-autoencoders deep clustering//Proceedings of 2020 IEEE International Conference on Acoustics, Speech and Signal Processing. Barcelona, Spain: IEEE: 4037-4041 [DOI: 10.1109/ICASSP40776.2020.9053109]
Sabour S, Frosst N and Hinton G E. 2017. Dynamic routing between capsules [EB/OL]. [2020-05-06]. https://arxiv.org/pdf/1710.09829.pdf
van der Maaten L and Hinton G. 2008. Visualizing data using t-SNE. Journal of Machine Learning Research, 9: 2579-2605
Wu H, Wang Y J, Wang Z, Wang X L and Du S Z. 2010. Two-phase collaborative filtering algorithm based on co-clustering. Journal of Software, 21(5): 1042-1054 [DOI: 10.3724/SP.J.1001.2010.03758]
Xiao H, Rasul K and Vollgraf R. 2017. Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms [EB/OL]. [2020-05-06]. https://arxiv.org/pdf/1708.07747.pdf
Xie J Y, Girshick R and Farhadi A. 2016. Unsupervised deep embedding for clustering analysis [EB/OL]. [2020-05-06]. https://arxiv.org/pdf/1511.06335.pdf
Yang B, Fu X, Sidiropoulos N D and Hong M Y. 2017. Towards K-means-friendly spaces: simultaneous deep learning and clustering [EB/OL]. [2020-05-06]. https://arxiv.org/pdf/1610.04794.pdf
Yue F, Sun L, Wang K Q, Wang Y J and Zuo W M. 2008. State-of-the-art of cluster analysis of gene expression data. Acta Automatica Sinica, 34(2): 113-120 [DOI: 10.3724/SP.J.1004.2008.00113]
Zhang H T, Cui Y, Wang D and Song T. 2018. Study of online healthy community user profile based on concept lattice. Journal of the China Society for Scientific and Technical Information, 37(9): 912-922 [DOI: 10.3772/j.issn.1000-0135.2018.09.006]