引入概率分布的深度神经网络贪婪剪枝
Greedy pruning of deep neural networks fused with probability distribution
2021, Vol. 26, No. 1, pp. 198-207
Print publication date: 2021-01-16
Accepted: 2020-10-27
DOI: 10.11834/jig.200438
胡骏, 黄启鹏, 刘嘉昕, 刘威, 袁淮, 赵宏. 引入概率分布的深度神经网络贪婪剪枝[J]. 中国图象图形学报, 2021,26(1):198-207.
Jun Hu, Qipeng Huang, Jiaxin Liu, Wei Liu, Huai Yuan, Hong Zhao. Greedy pruning of deep neural networks fused with probability distribution[J]. Journal of Image and Graphics, 2021,26(1):198-207.
Objective
The application of deep learning to environment perception in autonomous driving will greatly improve the accuracy and reliability of the perception system. However, existing deep neural network models are difficult to deploy on autonomous-driving embedded platforms with limited computing resources because of their heavy demands on computation and storage. To resolve the conflict between the massive computation required by deep neural networks and the limited computing power of embedded platforms, this paper proposes a greedy network pruning method based on the probability distribution of weights, which aims to remove redundant connections from the network model and improve its computational efficiency.
Method
The probability distribution of the weights is introduced: during training, the probability that each weight parameter takes a small value is recorded. In the pruning stage, incremental pruning and network recovery are performed according to the weight probability distribution collected during training, which improves on current pruning strategies that rely on weight magnitude alone.
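A minimal sketch of how such a statistic could be maintained during training is given below (in PyTorch); the threshold `tau`, the running-mean update, and the class name `SmallWeightTracker` are illustrative assumptions rather than the paper's exact definition.

```python
import torch

class SmallWeightTracker:
    """Running estimate, per connection, of how often the weight has been
    'small' during training; a stand-in for the weight probability
    distribution described above (threshold and update rule are assumed)."""

    def __init__(self, weight: torch.Tensor, tau: float = 1e-2):
        self.tau = tau
        self.steps = 0
        self.small_prob = torch.zeros_like(weight)  # estimate of P(|w| < tau)

    @torch.no_grad()
    def update(self, weight: torch.Tensor) -> None:
        # 1 where the weight magnitude is currently below the threshold.
        small_now = (weight.abs() < self.tau).float()
        # Incremental mean over all training steps observed so far.
        self.steps += 1
        self.small_prob += (small_now - self.small_prob) / self.steps
```

Calling `update(layer.weight)` after each optimizer step leaves `small_prob` as the per-connection probability that the pruning stage can consult together with the current weight magnitude.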
Result
Experiments show that, on the CIFAR-10 dataset, the proposed method achieves higher accuracy than the dynamic network pruning strategy at every pruning rate. On the ImageNet dataset, the method compresses the number of parameters of AlexNet and VGG (visual geometry group) 16 by 5.9 and 11.4 times, respectively, with only a small loss of accuracy, and it requires fewer training iterations than the dynamic network pruning strategy. Residual networks such as ResNet34 and ResNet50 can also be compressed effectively; for ResNet50, a larger compression rate (2.1 times) than the current best method HRank is achieved with only a small additional loss of accuracy.
Conclusion
The greedy pruning strategy based on the probability distribution resolves the uncertainty problem in deep neural network pruning, further improves the stability of the compressed network, and reduces the number of network model parameters while preserving model accuracy.
Objective
In recent years, deep learning neural networks have continued to develop, and excellent results have been achieved in computer vision, natural language processing, and speech recognition. In autonomous driving, environment perception is an important application: it mainly processes the image information collected about the surrounding environment, so deep learning plays an important role in this stage. However, as the problems being addressed become more complex, the number of layers in neural network models keeps increasing, and the overall parameter count and required computing power grow accordingly. These models run well on platforms with sufficient computing power, such as servers, but many deep neural network models are difficult to deploy on embedded platforms with limited computing and storage resources, such as autonomous driving platforms. Compressing existing deep neural network models is therefore necessary to resolve the contradiction between the huge amount of computation required by deep neural networks and the limited computing power of embedded platforms; compression reduces both the number of model parameters and the computing power needed. Building on existing model compression methods, this paper proposes a greedy network pruning method that incorporates the probability distribution of weights to reduce redundant connections in the network model, cutting its parameter count and improving its computational efficiency.
Method
Current pruning methods mainly use properties of the weight parameters as the criterion for evaluating parameter importance; for example, the L1 norm of the convolution kernel weights is used as the basis for judging importance. However, such criteria ignore how the weights vary during training. Moreover, many methods prune a trained model in a single pass, so the accuracy of the pruned model is difficult to maintain. Inspired by the study of uncertain graphs, the proposed method addresses these problems by introducing the probability distribution of weights: the importance of a connection is judged jointly from the probability distribution of its weight values observed during training and the magnitude of its current weight. This importance and the effect that cutting the connection has on the loss function together represent the contribution rate of the connection to the result, which then serves as the basis for pruning network connections.
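A rough sketch of how such a contribution score might be formed is shown below; the multiplicative combination and the first-order term |w · ∂L/∂w| used to approximate the effect of cutting a connection are assumptions for illustration, not the paper's exact formula.

```python
def connection_score(weight, grad, small_prob, eps=1e-12):
    """Illustrative per-connection contribution score on PyTorch tensors:
    a connection matters more when its current magnitude is large, when it
    has rarely been small during training, and when removing it would
    change the loss noticeably (first-order estimate |w * dL/dw|)."""
    magnitude = weight.abs()              # current weight size
    stability = 1.0 - small_prob          # tracked P(|w| < tau), inverted
    loss_effect = (weight * grad).abs()   # approximate loss change if cut
    return magnitude * stability * (loss_effect + eps)
```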
In the greedy pruning stage, the proposed method uses incremental pruning to control the scale and speed of pruning. Pruning and restoration are performed iteratively on a small proportion of connections until the set of currently sparse connections no longer changes, and the pruning scale is then gradually expanded until the expected compression is achieved. Compared with one-pass pruning based on weight magnitudes alone, this incremental pruning and recovery strategy avoids the gradient explosion caused by pruning too many weights at once, improves pruning efficiency and model stability, and realizes dynamic pruning. The proposed pruning method thus maximizes the compression of the model while maintaining its accuracy.
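The incremental prune-and-recover step could look roughly like the sketch below; the mask-based formulation, the score cut-off chosen as a quantile, and the function name are assumptions, and the schedule for enlarging the pruning proportion is left to an outer loop.

```python
import torch

@torch.no_grad()
def prune_and_recover(weight, mask, score, prune_frac):
    """One incremental round (illustrative): disable the lowest-scoring
    `prune_frac` of connections and re-enable any previously pruned
    connection whose score now exceeds the cut-off, so decisions made
    during fine-tuning remain reversible."""
    k = max(1, int(prune_frac * weight.numel()))
    cutoff = torch.kthvalue(score.flatten(), k).values
    new_mask = (score > cutoff).float()   # 1 = keep, 0 = pruned
    changed = not torch.equal(new_mask, mask)
    weight.mul_(new_mask)                 # zero out pruned connections in place
    return new_mask, changed
```

An outer greedy loop would fine-tune between calls, keep `prune_frac` fixed while `changed` is still true, and enlarge it only after the sparse connection pattern has stabilized, until the target compression is reached.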
Result
To verify the applicability of the proposed method to networks of different depths, the experiments use networks of several depths, including CifarSmall, AlexNet, and VGG (visual geometry group) 16, as well as networks with residual connections, namely ResNet34 and ResNet50. The experiments use the commonly adopted classification datasets CIFAR-10 and ImageNet ILSVRC (ImageNet Large Scale Visual Recognition Challenge) 2012, which makes comparison with other methods convenient. The main comparisons are between the proposed method and the dynamic network pruning strategy on CIFAR-10, and between the proposed method and the current state-of-the-art (SOTA) pruning algorithm HRank on ResNet50 with the ImageNet dataset. Experimental results show that the accuracy of the proposed method is higher than that of the dynamic network pruning strategy at every pruning rate on CIFAR-10. On ImageNet, the proposed method effectively compresses the number of parameters of AlexNet and VGG16 by 5.9 and 11.4 times, respectively, with a small loss of accuracy, and the number of training iterations required is smaller than that of the dynamic network pruning strategy. Effective compression is also achieved for the residual networks ResNet34 and ResNet50; for ResNet50, a larger compression rate (2.1 times) than that of the current SOTA method HRank is achieved with only a small additional loss of accuracy.
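For reference, the compression rates quoted above are read as the ratio of the original parameter count to the non-zero parameters remaining after pruning; a minimal way to measure this (the function name and the use of `count_nonzero` on the pruned model are illustrative) is:

```python
import torch

def compression_ratio(model_before: torch.nn.Module,
                      model_after: torch.nn.Module) -> float:
    """Compression rate = original parameters / remaining non-zero
    parameters; e.g. roughly 61M parameters reduced to about 10M
    non-zeros corresponds to the 5.9x figure reported for AlexNet."""
    total = sum(p.numel() for p in model_before.parameters())
    remaining = sum(int(p.count_nonzero()) for p in model_after.parameters())
    return total / remaining
```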
Conclusion
The greedy pruning strategy fused with the probability distribution solves the uncertainty problem of deep neural network pruning, improves the stability of the network after model compression, and compresses the number of network model parameters while maintaining the accuracy of the model. Experimental results show that the proposed method achieves a good compression effect for many types of networks. The probability distribution of the weight parameters introduced in this work can serve as an important basis for parameter-importance criteria in subsequent pruning research, and the incremental pruning and connection recovery used here are important for maintaining accuracy. However, optimizing and accelerating inference with the sparse model obtained after pruning still requires further research.
deep learning; neural network; model compression; probability distribution; network pruning
Cheng Y, Wang D, Zhou P and Zhang T. 2017. A survey of model compression and acceleration for deep neural networks[EB/OL].[2020-06-30]. https://arxiv.org/pdf/1710.09282.pdf
Courbariaux M, Hubara I, Soudry D, El-Yaniv R and Bengio Y. 2016. Binarized neural networks: training deep neural networks with weights and activations constrained to +1 or -1[EB/OL].[2020-06-30]. https://arxiv.org/pdf/1602.02830v3.pdf
Denil M, Shakibi B, Dinh L, Ranzato M and de Freitas N. 2013. Predicting parameters in deep learning//Proceedings of the 26th International Conference on Neural Information Processing Systems. Red Hook, USA: Curran Associates Inc.: 2148-2156
Ding X H, Ding G G, Guo Y C and Han J G. 2019. Centripetal SGD for pruning very deep convolutional networks with complicated structure//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, USA: IEEE: 4938-4948[DOI: 10.1109/CVPR.2019.00508]
Gao Y. 2013. Uncertain Graph and Uncertain Network. Beijing: Tsinghua University
Guo Y W, Yao A B and Chen Y R. 2016. Dynamic network surgery for efficient DNNs[EB/OL].[2020-06-30]. https://arxiv.org/pdf/1608.04493.pdf
Han S, Pool J, Tran J and Dally W J. 2015. Learning both weights and connections for efficient neural networks[EB/OL].[2020-06-30]. https://arxiv.org/pdf/1506.02626.pdf
He Y, Liu P, Wang Z W, Hu Z L and Yang Y. 2019. Filter pruning via geometric median for deep convolutional neural networks acceleration//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, USA: IEEE: 4335-4344[DOI: 10.1109/CVPR.2019.00447]
Krizhevsky A, Sutskever I and Hinton G E. 2012. ImageNet classification with deep convolutional neural networks//Proceedings of the 25th International Conference on Neural Information Processing Systems. Red Hook, USA: Curran Associates Inc.: 1097-1105
Lin M B, Ji R R, Wang Y, Zhang Y C, Zhang B C, Tian Y H and Shao L. 2020. HRank: filter pruning using high-rank feature map[EB/OL].[2020-06-30]. https://arxiv.org/pdf/2002.10179.pdf
Lin S H, Ji R R, Yan C Q, Zhang B C, Cao L J, Ye Q X, Huang F Y and Doermann D. 2019. Towards optimal structured CNN pruning via generative adversarial learning//Proceedings of 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Long Beach, USA: IEEE: 2785-2794[DOI: 10.1109/CVPR.2019.00290]
Mathieu M, Henaff M and LeCun Y. 2013. Fast training of convolutional networks through FFTs[EB/OL].[2020-06-30]. https://arxiv.org/pdf/1312.5851.pdf
Meng F X, Cheng H, Li K, Xu Z X, Ji R R, Sun X and Lu G M. 2020. Filter grafting for deep neural networks[EB/OL].[2020-06-30]. https://arxiv.org/pdf/2001.05868.pdf
Rastegari M, Ordonez V, Redmon J and Farhadi A. 2016. XNOR-Net: ImageNet classification using binary convolutional neural networks//European Conference on Computer Vision. Amsterdam, The Netherlands: Springer: 525-542[DOI: 10.1007/978-3-319-46493-0_32]
Simonyan K and Zisserman A. 2015. Very deep convolutional networks for large-scale image recognition[EB/OL].[2020-06-30]. https://arxiv.org/pdf/1409.1556.pdf
Szegedy C, Liu W, Jia Y Q, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V and Rabinovich A. 2015. Going deeper with convolutions//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Boston, USA: IEEE: 1-9[DOI: 10.1109/CVPR.2015.7298594]
Yang Z C, Moczulski M, Denil M, de Freitas N, Smola A, Song L and Wang Z Y. 2014. Deep fried convnets[EB/OL].[2020-06-30]. https://arxiv.org/pdf/1412.7149.pdf
Yu R C, Li A, Chen C F, Lai J H, Morariu V I, Han X T, Gao M F, Lin C Y and Davis L S. 2018. NISP: pruning networks using neuron importance score propagation//Proceedings of 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, USA: IEEE: 9194-9203[DOI: 10.1109/CVPR.2018.00958]
Zhang W. 2017. K-nearest neighbor search algorithm for uncertain graphs based on sampling. Computer Applications and Software, 34(6): 180-186[DOI: 10.3969/j.issn.1000-386x.2017.06.033]
Zhou A J, Yao A B, Guo Y W, Xu L and Chen Y R. 2017. Incremental network quantization: towards lossless CNNs with low-precision weights[EB/OL].[2020-06-30]. https://arxiv.org/pdf/1702.03044.pdf