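1. School of Electrical and Electronic Engineering, North China Electric Power University, Baoding 071003, China;
2. Hebei Key Laboratory of Power Internet of Things Technology, North China Electric Power University, Baoding 071003, China;
3. State Grid Zhejiang Hangzhou Xiaoshan Power Supply Co., Ltd., Hangzhou 310000, China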
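Received: 2020-12-29; Revised: 2021-04-13; Preprint: 2021-04-20

Supported by: National Natural Science Foundation of China (61871182, 61773160); Beijing Municipal Natural Science Foundation (4192055); Natural Science Foundation of Hebei Province, China (F2020502009); Fundamental Research Funds for the Central Universities (2018MS095, 2020YJ006); Open Project Program of the National Laboratory of Pattern Recognition (NLPR) (201900051)

About the authors:

Qi Yincheng, born in 1968, male, professor, master's supervisor. His main research interest is electric power information analysis and intelligent processing. E-mail: qiych@ncepu.edu.cn

Jin Chaoxiong, male, master's student. His main research interests are knowledge distillation and intelligent power grid inspection technology. E-mail: King11031@163.com

Zhao Zhenbing, corresponding author, male, professor, doctoral supervisor. His main research interest is electric power visual inspection. E-mail: zhaozhenbing@ncepu.edu.cn

Ding Jietao, male, master's student. His main research interests are weak supervision and intelligent power grid inspection technology. E-mail: jietaoding@163.com

Lyu Bin, male, senior engineer. He works on the operation, maintenance, and management of electric power information and communication systems. E-mail: lvbin_csdl@163.com

*Corresponding author: Zhao Zhenbing, zhaozhenbing@ncepu.edu.cn

CLC number: TN911.73; Document code: A; Article ID: 1006-8961(2021)11-2571-11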


Image classification method of transmission line bolt defects using the optimal knowledge transfer wide residual network
Qi Yincheng1,2, Jin Chaoxiong1, Zhao Zhenbing1,2, Ding Jietao1, Lyu Bin3
1. School of Electrical and Electronic Engineering, North China Electric Power University, Baoding 071003, China;
2. Hebei Key Laboratory of Power Internet of Things Technology, North China Electric Power University, Baoding 071003, China;
3. State Grid Zhejiang Hangzhou Xiaoshan Power Supply Co., Ltd., Hangzhou 310000, China
Supported by: National Natural Science Foundation of China (61871182, 61773160); Beijing Municipal Natural Science Foundation (4192055); Natural Science Foundation of Hebei Province, China (F2020502009); Fundamental Research Funds for the Central Universities (2018MS095, 2020YJ006); Open Project Program of the National Laboratory of Pattern Recognition (NLPR) (201900051)

# Abstract

Objective Bolts fix and connect the metal parts of transmission lines, and bolt defects seriously affect power transmission. Inspection images have complicated backgrounds and variable imaging distances and angles, and bolts occupy only a small proportion of each image. Bolt defect images therefore have low resolution and carry little visual information, so accurate classification usually requires a large, highly complex model. Such a model has a complex structure and numerous parameters, and the computing resources it needs for data analysis make large-scale deployment difficult. A small model has a simple structure and few parameters, but it cannot guarantee the accuracy of bolt defect classification. To overcome the limitations of both, this study proposes an image classification method for transmission line bolt defects based on the optimal knowledge transfer network.

Method The large model is widened, that is, the dimension of its feature representation is increased, so that it fully mines the target information in the bolt image and has more bolt defect knowledge to transfer to the small model. To reduce the parameters of the small model considerably and make it easier to operate and maintain, its structure is simplified to a 10-layer residual network with three residual blocks, whose convolution kernel numbers are 16, 32, and 64, respectively. The small model thus still focuses on the low-gradient regions of the bolt image in the low layer, the high-difference regions in the middle layer, and the overall characteristics of the bolt image in the high layer. Large models of different widths then guide the training of the small model with the attention transfer algorithm and the knowledge distillation algorithm, and the accuracy of the small model trained under each width is calculated. Next, the concept of knowledge deviation is proposed to measure how much bolt defect knowledge a large model transfers and to select the large model that transfers it best. The performance of the large and small models is mapped onto a number line in the form of accuracy, and the knowledge deviation is calculated in three steps. First, the difference in bolt defect classification accuracy between the large model of a given width and the small model it guides is calculated. Second, the difference in accuracy between the same large model and the unguided small model is calculated. Lastly, the ratio of the first difference to the second is taken as the knowledge deviation. The smaller the knowledge deviation, the more bolt defect knowledge is transferred from the large model to the small model. The optimal knowledge transfer model is determined from the knowledge deviation at each width and the bolt defect classification accuracy of the small model under the different guidance methods. Finally, the optimal knowledge transfer model guides the training of the small model with the attention transfer and knowledge distillation algorithms combined, to maximize the bolt defect classification performance of the small model.

Result The method is verified on a self-built bolt defect image classification data set, which is constructed by cropping and optimizing transmission line inspection images. The data set contains 6 420 images in three categories: 3 136 normal bolts, 2 820 bolts with missing pins, and 464 bolts with missing nuts. Experimental results show that the large model with a width of 5 transfers bolt defect knowledge to the small model best, increasing the bolt defect classification accuracy of the small model by 5.56%. The accuracy of the guided small model is only 2.17% lower than that of the optimal knowledge transfer model. The knowledge deviation is 0.28, and the small model has only 0.56% as many parameters as the large model.

Conclusion The proposed bolt defect classification method based on the optimal knowledge transfer network greatly alleviates two problems caused by the characteristics of bolt images: the large number of parameters of large models and the low classification accuracy of small models. It balances the classification accuracy of bolt defect images against resource consumption. The method meets the requirements of field operation and maintenance, such as deployment on embedded equipment (e.g., online monitoring of transmission lines), and reduces the resource consumption of transmission line inspection data analysis.

# Key words

bolt defect classification; optimal knowledge transfer; knowledge deviation; knowledge distillation; attention transfer

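# 1 Research method

1) The teacher network is widened, that is, its convolutional channel dimension is enlarged, to strengthen its feature representation of bolt images and thereby increase the bolt defect knowledge it transfers to the student network. At the same time, the student network is simplified to a 10-layer residual network with three residual blocks, which greatly reduces its number of parameters.

2) To determine the optimal knowledge transfer network, the concept of knowledge deviation is proposed to measure the difference between the teacher network and the student network. The performance of teacher networks of different widths and of the student network is mapped onto a number line in the form of accuracy. The degree of bolt defect knowledge transfer from teacher to student is then visualized as the ratio of two accuracy gaps: the gap between a teacher network of known width and the student network it guides, and the gap between that teacher network and the unguided student network. The smaller the knowledge deviation, the greater the degree of knowledge transfer, and vice versa. Finally, the knowledge deviation and the accuracy gains of the student network under the different algorithms are analyzed together to determine the optimal knowledge transfer network.

3) To maximize the bolt defect classification performance of the student network, the optimal knowledge transfer teacher network guides student training with the hidden-layer attention transfer (AT) algorithm and the output-layer knowledge distillation (KD) algorithm combined, which yields the student network with the best bolt defect classification accuracy.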

# 1.1.1 Output-layer knowledge distillation

 $S_{i}=\frac{\exp \left(z_{i} / T\right)}{\sum\limits_{j} \exp \left(z_{j} / T\right)}$ (1)
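Eq. (1) is a softmax with a temperature $T$ applied to the logits $z_i$. The following is a minimal sketch in plain Python, not the authors' implementation; the function name `softmax_with_temperature` and the example logits are hypothetical, chosen only to illustrate how a larger $T$ softens the output distribution.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Compute S_i = exp(z_i / T) / sum_j exp(z_j / T), as in Eq. (1)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    # Subtracting the maximum does not change the result but avoids overflow.
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for the three bolt classes (illustration only).
logits = [2.0, 1.0, 0.1]
sharp = softmax_with_temperature(logits, temperature=1.0)
soft = softmax_with_temperature(logits, temperature=4.0)
print("T=1:", [round(p, 3) for p in sharp])
print("T=4:", [round(p, 3) for p in soft])
```

With a higher temperature the class ranking is unchanged, but the probabilities move closer together, which is what lets the student see the teacher's relative confidence in the non-target classes.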

 $L_{3}=L_{2}\left(Q_{\mathrm{s}}, y_{\text {true }}\right)+L_{\text {AT }}$ (8)

 $L=L_{\mathrm{AT}}+L_{\mathrm{KD}}$ (9)
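The definition of the attention term $L_{\mathrm{AT}}$ is not part of this excerpt. As an illustration only, the sketch below assumes the commonly used activation-based formulation of attention transfer: a spatial attention map is the channel-wise sum of squared activations, flattened and L2-normalized, and the loss is the L2 distance between the student and teacher maps. The function names and the toy feature maps are hypothetical.

```python
import math


def attention_map(feature):
    """Flattened, L2-normalized spatial attention map.

    `feature` is a list of channels, each an H x W nested list.
    Assumed formulation: sum of squared activations over channels.
    """
    height, width = len(feature[0]), len(feature[0][0])
    flat = [
        sum(channel[i][j] ** 2 for channel in feature)
        for i in range(height)
        for j in range(width)
    ]
    norm = math.sqrt(sum(v * v for v in flat)) or 1.0
    return [v / norm for v in flat]


def attention_transfer_loss(student_feature, teacher_feature):
    """L2 distance between the normalized student and teacher attention maps."""
    q_s = attention_map(student_feature)
    q_t = attention_map(teacher_feature)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(q_s, q_t)))


# Toy 2 x 2 feature maps: the student has 2 channels, the wide teacher has 4.
student = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 1.0], [1.0, 0.0]]]
teacher_same_focus = student + student  # same spatial pattern, more channels
teacher_other_focus = [[[4.0, 3.0], [2.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]]

loss_same = attention_transfer_loss(student, teacher_same_focus)
loss_other = attention_transfer_loss(student, teacher_other_focus)
print(loss_same, loss_other)
```

Because the map sums over channels, a widened teacher and a narrow student can be compared as long as their spatial resolutions match, which is why this kind of loss suits teachers of different widths.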

# 1.2.1 Widening the teacher network

 $M=N \times K, N \in\{16,32,64\}$ (10)
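As a worked example of Eq. (10), the sketch below lists the channel counts $M$ of the three residual stages for each width factor $K$. It assumes that $K$ runs from 1 to 8, matching the ResNet-40-1 to ResNet-40-8 teachers in Table 1; the helper name `widened_channels` is hypothetical.

```python
BASE_CHANNELS = (16, 32, 64)  # N in Eq. (10)


def widened_channels(width_factor):
    """Return M = N * K for each of the three residual stages."""
    if width_factor < 1:
        raise ValueError("width factor K must be at least 1")
    return [n * width_factor for n in BASE_CHANNELS]


# Assumption: K = 1..8, as in the teacher networks of Table 1.
for k in range(1, 9):
    print(f"ResNet-40-{k}: {widened_channels(k)}")
```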

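# 1.2.2 Simplifying the student network structure

1) Input the bolt image;

2) Increase the width of the teacher network;

3) Increase the number of convolutional channels, that is, widen the feature representation dimension of the bolt image;

4) The teacher network fully extracts the feature information in the bolt defect image;

5) Increase the transferable bolt defect knowledge of the teacher network;

6) The optimal teacher network guides the training of the simplified student network;

7) Improve the bolt defect image classification accuracy of the simplified student network.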

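# 2.1 Experimental setup

1) Train teacher networks of different widths on the bolt defect image classification data set and record their accuracy;

2) Use the trained teacher networks of different widths to guide the training of the student network on the bolt defect image classification data set and record its accuracy.

# 2.2 Results and analysis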

Table 1 Classification accuracy of network with different parameters

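| Residual network | Classification accuracy/% | Parameters | Size/MB |
|---|---|---|---|
| ResNet-10-1 | 83.26 | 78 330 | 0.30 |
| ResNet-40-1 | 89.14 | 566 650 | 2.16 |
| ResNet-40-2 | 89.71 | 2 248 954 | 8.58 |
| ResNet-40-3 | 90.11 | 5 046 650 | 19.25 |
| ResNet-40-4 | 90.47 | 8 959 994 | 34.17 |
| ResNet-40-5 | **90.99** | 13 988 986 | 53.36 |
| ResNet-40-6 | 90.63 | 20 133 626 | 76.80 |
| ResNet-40-7 | 90.71 | 27 393 914 | 104.50 |
| ResNet-40-8 | 89.89 | 35 769 850 | 136.45 |

Note: bold indicates the best result.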
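The size column of Table 1 can be related to the parameter counts with a short calculation. The sketch below assumes each parameter is stored as a 4-byte float32 value and that 1 MB means $2^{20}$ bytes; the source does not state this, but under that assumption the estimates agree with the reported sizes to within about 0.01 MB. It also computes the student-to-teacher parameter ratio quoted in the Abstract.

```python
# (parameter count, reported size in MB), copied from Table 1.
TABLE_1 = {
    "ResNet-10-1": (78_330, 0.30),
    "ResNet-40-1": (566_650, 2.16),
    "ResNet-40-2": (2_248_954, 8.58),
    "ResNet-40-3": (5_046_650, 19.25),
    "ResNet-40-4": (8_959_994, 34.17),
    "ResNet-40-5": (13_988_986, 53.36),
    "ResNet-40-6": (20_133_626, 76.80),
    "ResNet-40-7": (27_393_914, 104.50),
    "ResNet-40-8": (35_769_850, 136.45),
}

BYTES_PER_PARAM = 4  # assumption: float32 weights


def float32_size_mb(num_params):
    """Estimated model size in MB (2**20 bytes) for float32 parameters."""
    return num_params * BYTES_PER_PARAM / 2**20


for name, (params, reported_mb) in TABLE_1.items():
    estimate = float32_size_mb(params)
    print(f"{name}: estimated {estimate:.2f} MB, reported {reported_mb:.2f} MB")

# Student (ResNet-10-1) parameters as a percentage of the width-5 teacher.
student_params = TABLE_1["ResNet-10-1"][0]
teacher_params = TABLE_1["ResNet-40-5"][0]
param_ratio_percent = 100 * student_params / teacher_params
print(f"student/teacher parameters: {param_ratio_percent:.2f}%")
```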

Table 2 Classification accuracy of the student network after being instructed to train

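| Teacher network | Student accuracy with attention transfer/% | Student accuracy with knowledge distillation/% |
|---|---|---|
| ResNet-40-1 | 85.06 | 84.15 |
| ResNet-40-2 | 85.35 | 84.75 |
| ResNet-40-3 | 85.36 | 85.31 |
| ResNet-40-4 | 85.55 | 85.64 |
| ResNet-40-5 | **87.51** | **86.23** |
| ResNet-40-6 | 86.48 | **85.85** |
| ResNet-40-7 | **87.53** | 85.06 |
| ResNet-40-8 | 85.96 | 85.33 |

Note: the student accuracy is the bolt defect image classification accuracy of the student network; bold indicates the top two results in each column.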

 $T S_{W}=\frac{T_{W}-A S_{W}}{T_{W}-B S}=\frac{\Delta T A S}{\Delta T B S}, A S_{W} \in\left(B S, T_{W}\right)$ (11)

 $\lim \limits_{A S_{W} \rightarrow B S} T S_{W}=1$ (12)

 $\lim \limits_{A S_{W} \rightarrow T_{W}} T S_{W}=0$ (13)
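Eqs. (11) to (13) can be checked numerically. The sketch below reads $T_W$ as the accuracy of the teacher of width $W$, $AS_W$ as the accuracy of the student after guidance by that teacher, and $BS$ as the accuracy of the unguided student; this reading is an interpretation based on the Abstract, and the function name is hypothetical. The example values are the accuracies reported for ResNet-40-5 and ResNet-10-1.

```python
def knowledge_deviation(teacher_acc, guided_student_acc, baseline_student_acc):
    """TS_W = (T_W - AS_W) / (T_W - BS), as in Eq. (11)."""
    if teacher_acc <= baseline_student_acc:
        raise ValueError("teacher accuracy must exceed the unguided student accuracy")
    # The endpoints are allowed so the limiting cases of Eqs. (12)-(13) can be shown.
    if not baseline_student_acc <= guided_student_acc <= teacher_acc:
        raise ValueError("guided student accuracy must lie between BS and T_W")
    return (teacher_acc - guided_student_acc) / (teacher_acc - baseline_student_acc)


# Reported accuracies: teacher ResNet-40-5, unguided and AT+KD-guided ResNet-10-1.
teacher, baseline, guided = 90.99, 83.26, 88.82
ts = knowledge_deviation(teacher, guided, baseline)
print(f"knowledge deviation: {ts:.2f}")

# Limiting cases: no knowledge transferred (Eq. 12), all of it transferred (Eq. 13).
no_transfer = knowledge_deviation(teacher, baseline, baseline)
full_transfer = knowledge_deviation(teacher, teacher, baseline)
print(no_transfer, full_transfer)
```

A value near 1 means guidance barely moved the student away from its unguided accuracy, and a value near 0 means the student has almost closed the gap to the teacher.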

Table 3 The classification accuracy of best teacher network and student network

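| Residual network | Classification accuracy/% | Parameters | Size/MB |
|---|---|---|---|
| ResNet-10-1 | 83.26 | 78 330 | 0.30 |
| ResNet-10-1 (AT+KD) | **88.82** | 78 330 | 0.30 |
| ResNet-40-5 | 90.99 | 13 988 986 | 53.36 |

Note: bold indicates the best classification result of the student network in this study.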

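1) The bolt defect image classification data set contains few samples overall and is imbalanced, which is the main reason why the classification accuracy is generally low.

2) The data set has large intra-class differences. For example, the images of bolts with missing nuts include two cases, nut missing with the pin present and nut missing with the pin also missing, as shown in Fig. 11.

3) Besides hard samples, the data set also contains samples that are easily confused between categories, as shown in Fig. 12.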

Table 4 The classification accuracy and knowledge deviation of teacher-student networks

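| Residual network | Classification accuracy/% | AT+KD/% | Knowledge deviation |
|---|---|---|---|
| ResNet-10-1 | 87.68 | - | - |
| ResNet-40-1 | 93.58 | 87.90 | 0.96 |
| ResNet-40-2 | 94.78 | 88.01 | 0.95 |
| ResNet-40-3 | 95.20 | 88.32 | 0.91 |
| ResNet-40-4 | 95.51 | 88.54 | 0.89 |
| ResNet-40-5 | 95.78 | 88.65 | 0.88 |
| ResNet-40-6 | 95.75 | **89.09** | **0.83** |
| ResNet-40-7 | **95.90** | 88.26 | 0.93 |
| ResNet-40-8 | 95.89 | 88.78 | 0.87 |

Note: bold indicates the best result in each column; "-" indicates that the result is not given in the original paper.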
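The knowledge deviation column of Table 4 can be recomputed from the two accuracy columns, and the teacher with the smallest deviation can then be selected. The sketch below assumes the ResNet-10-1 row (87.68%) is the unguided student baseline for every teacher in the table; the variable and function names are hypothetical.

```python
BASELINE_STUDENT_ACC = 87.68  # ResNet-10-1 row of Table 4 (assumed unguided baseline)

# teacher name: (teacher accuracy/%, AT+KD student accuracy/%, reported deviation)
TABLE_4 = {
    "ResNet-40-1": (93.58, 87.90, 0.96),
    "ResNet-40-2": (94.78, 88.01, 0.95),
    "ResNet-40-3": (95.20, 88.32, 0.91),
    "ResNet-40-4": (95.51, 88.54, 0.89),
    "ResNet-40-5": (95.78, 88.65, 0.88),
    "ResNet-40-6": (95.75, 89.09, 0.83),
    "ResNet-40-7": (95.90, 88.26, 0.93),
    "ResNet-40-8": (95.89, 88.78, 0.87),
}


def knowledge_deviation(teacher_acc, guided_acc, baseline_acc):
    """Ratio of the guided gap to the unguided gap, as in Eq. (11)."""
    return (teacher_acc - guided_acc) / (teacher_acc - baseline_acc)


deviations = {
    name: knowledge_deviation(teacher_acc, guided_acc, BASELINE_STUDENT_ACC)
    for name, (teacher_acc, guided_acc, _) in TABLE_4.items()
}

# The smallest deviation marks the teacher that transfers the most knowledge.
best_teacher = min(deviations, key=deviations.get)
for name, value in deviations.items():
    print(f"{name}: {value:.2f}")
print("smallest knowledge deviation:", best_teacher)
```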

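# 3 Conclusion

1) Widening the teacher network increases the number of feature channels for bolt images and raises the teacher's classification accuracy to a maximum of 90.99%, an improvement of 1.85%. Meanwhile, the student network is simplified to three residual blocks and has only 78 330 parameters.

2) The concept of knowledge deviation is proposed to visualize the degree to which teacher networks of different widths transfer bolt defect knowledge to the student network. A combined analysis of the knowledge deviation and of the student network's bolt defect image classification accuracy, after training under teachers of different widths with different guidance methods, identifies the width-5 network, ResNet-40-5, as the optimal knowledge transfer network.

3) Guiding student training with the optimal knowledge transfer network through the attention transfer and knowledge distillation algorithms combined improves the student network's bolt defect classification accuracy by 5.56%, with a knowledge deviation of 0.28, while the student network has only 0.56% as many parameters as the optimal knowledge transfer network.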

