Combining principal component analysis network with linear discriminant analysis for the classification of retinal optical coherence tomography images
Ding Sijing1,2, Sun Zhongyang1,3, Sun Yankui1,3, Wang Yongge2
1. Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China;
2. School of Mathematics and Systems Science, Beihang University, Beijing 100191, China;
3. Guangdong Key Laboratory of Big Data Analysis and Processing, Guangzhou 510006, China
Supported by: National Natural Science Foundation of China (61671272)

# Abstract

Objective Optical coherence tomography (OCT) is a 3D scanning imaging technology that is widely used in ophthalmology as a clinical aid for identifying various eye lesions. Classification of retinal OCT images is therefore important for the detection and treatment of retinopathy. Many effective OCT classification algorithms have been developed recently, and almost all of them rely on hand-crafted features; however, retinal OCT images acquired in the clinic usually contain complex pathological structures, so it is desirable to learn features directly from the OCT images. The principal component analysis network (PCANet) is a simplified convolutional neural network that can directly extract texture features from images, whereas features extracted by linear discriminant analysis (LDA) are more discriminative for image classification. Combining the advantages of the two methods, this paper presents a PCANet with LDA (PCANet-LDA) for the automatic classification of three types of retinal OCT images: age-related macular degeneration (AMD), diabetic macular edema (DME), and normal (NOR).
Method The proposed PCANet-LDA algorithm adds an LDA supervisory layer to PCANet so that the extracted image features are supervised by class labels. The algorithm is implemented in three steps. The first step is OCT image preprocessing, in which perceiving, fitting, and normalizing stages are applied to the retinal OCT images to obtain the retinal region of interest for classification. The second step is PCANet feature extraction, in which the preprocessed OCT images are passed through a two-stage PCA convolution layer and a nonlinear output layer. In the PCA convolution layer, PCA filter banks are learned and the PCA features of the retinal OCT images are extracted. In the nonlinear output layer, the extracted PCA features are converted into the PCANet features of the input images by basic data-processing components, namely binary hashing and blockwise histograms. The third step is the LDA supervisory layer, which learns an LDA matrix from the PCANet features and their class labels (AMD, DME, and NOR). The LDA matrix then projects the PCANet features into a low-dimensional space in which the projected features are more discriminative for classification. Finally, the projected features are used to train a linear support vector machine, which classifies the retinal OCT images.
Result Experiments are conducted on two retinal OCT datasets: a clinical dataset obtained from a hospital and the Duke dataset. First, comparative examples of AMD, DME, and NOR retinal OCT images before and after preprocessing show that the preprocessing removes the non-retinal regions of the OCT image and keeps the meaningful retinal areas. Moreover, the remaining retina is rotated to a uniform horizontal orientation to reduce the effect of inconsistent retinal orientation on classification. Second, sample PCANet feature maps extracted from AMD and DME retinal OCT images show that the PCA filters trained by PCANet tend to capture meaningful pathological structure information, which contributes to the classification of retinal OCT images. Finally, the correct classification rates of the PCANet algorithm, the ScSPM algorithm, and the proposed PCANet-LDA algorithm are compared. On the clinical dataset, the overall correct classification rate of the PCANet-LDA algorithm is 97.20%, which is 3.77% higher than that of the PCANet algorithm and slightly higher than that of the ScSPM algorithm. On the Duke dataset, the overall correct classification rate of the PCANet-LDA algorithm is 99.52%, which is 1.64% higher than that of the PCANet algorithm and slightly higher than that of the ScSPM algorithm.
Conclusion The PCANet algorithm can extract effective features. Building on it, the PCANet-LDA algorithm obtains more discriminative features through LDA and yields a higher correct classification rate than the PCANet and ScSPM algorithms, the latter being a state-of-the-art classification algorithm for two-dimensional retinal OCT images. The proposed PCANet-LDA algorithm is therefore effective and competitive for classifying retinal OCT images, and it can serve as a baseline algorithm for retinal OCT image classification.

# Key words

optical coherence tomography; age-related macular degeneration; diabetic macular edema; principal component analysis network; linear discriminant analysis; image classification; semi-supervised learning

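# 1 Proposed algorithm

The PCANet-LDA classification algorithm is a multilayer convolutional network supervised by class labels. It consists of three main parts: 1) OCT image preprocessing; 2) PCANet feature extraction; 3) the LDA supervisory layer. A linear SVM classifier is attached at the end of the network; the framework is shown in Fig. 1. Given a training set of $N$ retinal OCT images with class labels (AMD, DME, NOR), the images are first preprocessed to obtain the retinal region of interest. The PCANet-LDA network then learns the PCA filters in the PCA convolution layer and, in the LDA supervisory layer, learns the LDA matrix from the extracted PCANet features and the class labels. Finally, a linear SVM classifier is trained on the LDA-projected features.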
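To make the filter-learning step of the PCA convolution layer more concrete, the following is a minimal sketch of how one stage of PCA filters could be learned with NumPy. It assumes the usual PCANet recipe (collect patches, remove each patch mean, keep the leading eigenvectors of the patch scatter matrix). The function name `learn_pca_filters`, the patch size `k`, the filter count `num_filters`, and the random stand-in images are all illustrative assumptions, not values taken from this paper.

```python
import numpy as np

def learn_pca_filters(images, k=3, num_filters=4):
    """Learn `num_filters` PCA filters of size k x k from grayscale images.

    Hypothetical helper for illustration; only "valid" patches are used,
    so no border padding is applied.
    """
    patches = []
    for img in images:
        h, w = img.shape
        for r in range(h - k + 1):
            for c in range(w - k + 1):
                p = img[r:r + k, c:c + k].reshape(-1)
                patches.append(p - p.mean())  # remove the patch mean
    X = np.stack(patches, axis=1)  # shape: (k*k, number of patches)
    # Eigen-decomposition of the symmetric patch scatter matrix X X^T.
    eigvals, eigvecs = np.linalg.eigh(X @ X.T)
    order = np.argsort(eigvals)[::-1][:num_filters]  # largest eigenvalues first
    return eigvecs[:, order].T.reshape(num_filters, k, k)

# Random stand-in data (assumption): five 16 x 16 "images".
rng = np.random.default_rng(0)
images = rng.random((5, 16, 16))
filters = learn_pca_filters(images, k=3, num_filters=4)
```

Each returned filter is one principal direction of the mean-removed patches, reshaped to `k x k`; convolving an image with these filters would give the PCA feature maps of that stage.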

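# 1.1 OCT image preprocessing

OCT images are usually full of speckle, and the position of the retina varies greatly between scans, which makes it difficult to align all retinal regions to a relatively uniform position. OCT images therefore need to be preprocessed to extract the retinal region of interest and improve classification accuracy. This paper preprocesses retinal OCT images with the automatic retinal-region alignment and cropping technique proposed by Sun et al.[7], which consists of three stages (perceiving, fitting, and normalizing), as detailed in Fig. 2.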
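The three stages can be pictured with a rough NumPy sketch. The details of the method of Sun et al.[7] are given in Fig. 2 and are not reproduced here, so everything below is an assumption made for illustration: a plain intensity threshold stands in for the perceiving stage, a second-order polynomial for the fitting stage, and a column-wise shift followed by a fixed-height crop for the normalizing stage. The name `flatten_and_crop` and its parameters are hypothetical.

```python
import numpy as np

def flatten_and_crop(img, thresh=0.5, out_height=8):
    """Toy align-and-crop routine (illustrative, not the method of [7])."""
    # Perceiving (assumed): bright pixels are treated as retina candidates.
    rows, cols = np.nonzero(img > thresh)
    # Fitting (assumed): second-order polynomial through the candidate pixels.
    coeffs = np.polyfit(cols, rows, deg=2)
    curve = np.polyval(coeffs, np.arange(img.shape[1]))
    # Normalizing (assumed): shift every column so the fitted curve lands on
    # the center row, then crop a band around that row.
    center = img.shape[0] // 2
    flat = np.zeros_like(img)
    for c in range(img.shape[1]):
        shift = int(round(center - curve[c]))
        flat[:, c] = np.roll(img[:, c], shift)  # note: np.roll wraps around
    top = center - out_height // 2
    return flat[top:top + out_height, :]

# Synthetic image (assumption): a one-pixel-thick curved "retina".
H, W = 40, 30
x = np.arange(W)
band = np.round(10 + 0.02 * (x - 15) ** 2).astype(int)
img = np.zeros((H, W))
img[band, x] = 1.0
roi = flatten_and_crop(img)
```

In this toy case the curved band ends up roughly horizontal in the middle of `roi`. A real OCT scan would additionally need denoising before thresholding, which the sketch omits.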

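# 1.3 LDA supervisory layer

Following the idea of linear discriminant analysis, the LDA supervisory layer projects the PCANet features $\left\{ {{\mathit{\boldsymbol{f}}_i}} \right\}_{i = 1}^N$, labeled AMD, DME, or NOR, into a low-dimensional space so that, after projection, features of images from the same class lie closer together and features of images from different classes lie farther apart. In other words, the ratio of the trace of the total between-class scatter matrix to the trace of the total within-class scatter matrix of the projected features is maximized.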
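The projection can be sketched in NumPy as follows. The sketch builds the between-class and within-class scatter matrices and takes the leading eigenvectors of $S_w^{-1}S_b$, which is the common closed-form LDA solution; strictly speaking it optimizes a closely related ratio-trace criterion, so it should be read as an approximation of the trace-ratio objective described above and not as this paper's exact solver. The names `scatter_matrices`, `lda_matrix`, `trace_ratio`, the regularizer `reg`, and the synthetic five-dimensional features are assumptions.

```python
import numpy as np

def scatter_matrices(F, y):
    """Between-class (Sb) and within-class (Sw) scatter of features F (N x d)."""
    mean_all = F.mean(axis=0)
    d = F.shape[1]
    Sb = np.zeros((d, d))
    Sw = np.zeros((d, d))
    for c in np.unique(y):
        Fc = F[y == c]
        mc = Fc.mean(axis=0)
        diff = (mc - mean_all)[:, None]
        Sb += len(Fc) * (diff @ diff.T)
        Sw += (Fc - mc).T @ (Fc - mc)
    return Sb, Sw

def lda_matrix(Sb, Sw, dim, reg=1e-6):
    """Leading eigenvectors of (Sw + reg*I)^{-1} Sb as a d x dim matrix."""
    d = Sw.shape[0]
    M = np.linalg.solve(Sw + reg * np.eye(d), Sb)
    eigvals, eigvecs = np.linalg.eig(M)
    order = np.argsort(eigvals.real)[::-1][:dim]
    return eigvecs[:, order].real

def trace_ratio(W, Sb, Sw):
    """tr(W^T Sb W) / tr(W^T Sw W) for a projection matrix W."""
    return np.trace(W.T @ Sb @ W) / np.trace(W.T @ Sw @ W)

# Synthetic stand-in for PCANet features (assumption): 3 classes, 5 dimensions.
rng = np.random.default_rng(0)
means = np.array([[0, 0, 0, 0, 0], [5, 0, 0, 0, 0], [0, 5, 0, 0, 0]], dtype=float)
y = np.repeat(np.arange(3), 60)
F = means[y] + rng.normal(size=(180, 5))

Sb, Sw = scatter_matrices(F, y)
W = lda_matrix(Sb, Sw, dim=2)  # at most (number of classes - 1) useful directions
projected = F @ W              # low-dimensional features for the linear SVM
ratio_raw = trace_ratio(np.eye(5), Sb, Sw)
ratio_lda = trace_ratio(W, Sb, Sw)
```

With three classes, $S_b$ has rank at most two, so two projected dimensions already carry all the between-class scatter. Comparing `ratio_lda` with `ratio_raw` shows how the projection changes the trace ratio on this toy data.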

Table 2 Correct classification rate comparison of three different algorithms on DUKE dataset

| Class (rate in %) | PCANet | ScSPM | PCANet-LDA |
| --- | --- | --- | --- |
| AMD | 98.29±0.53 | 99.38±0.57 | 99.82±0.21 |
| DME | 98.14±0.74 | 99.59±0.60 | 99.77±0.32 |
| NOR | 97.21±1.56 | 99.30±0.55 | 98.98±0.88 |
| Overall | 97.88±0.55 | 99.42±0.25 | 99.52±0.30 |
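As a quick arithmetic check on Table 2, the short snippet below recomputes the differences between the mean rates (standard deviations are ignored). The dictionary layout and variable names are illustrative only; the numbers are copied from the table.

```python
# Mean correct classification rates (%) from Table 2, Duke dataset.
table2 = {
    "AMD": {"PCANet": 98.29, "ScSPM": 99.38, "PCANet-LDA": 99.82},
    "DME": {"PCANet": 98.14, "ScSPM": 99.59, "PCANet-LDA": 99.77},
    "NOR": {"PCANet": 97.21, "ScSPM": 99.30, "PCANet-LDA": 98.98},
    "Overall": {"PCANet": 97.88, "ScSPM": 99.42, "PCANet-LDA": 99.52},
}

# Difference of PCANet-LDA over each baseline, in percentage points.
gain_over_pcanet = {
    cls: round(r["PCANet-LDA"] - r["PCANet"], 2) for cls, r in table2.items()
}
gain_over_scspm = {
    cls: round(r["PCANet-LDA"] - r["ScSPM"], 2) for cls, r in table2.items()
}

for cls in table2:
    print(cls, gain_over_pcanet[cls], gain_over_scspm[cls])
```

The overall gain over PCANet comes out at 1.64 percentage points, matching the figure quoted in the abstract. NOR is the only class in which the mean rate of ScSPM is above that of PCANet-LDA.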

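# 2.2 Methods and analysis of results

The ScSPM algorithm extracts hand-crafted features (SIFT features) from patches of the OCT image and then further extracts local image features through sparse coding, whereas PCANet learns the PCA features of OCT images through two convolution layers. The experimental results show that the image features extracted by ScSPM are better than those extracted by PCANet. The classification results of the PCANet-LDA algorithm are slightly better than those of the ScSPM algorithm, which further indicates that the dimensionality reduction of PCANet features by the LDA supervisory layer benefits OCT image classification.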

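# 3 Conclusion

PCANet is very effective as a baseline algorithm for testing image classification performance. Building on the PCANet algorithm, this paper proposes a new PCANet-LDA algorithm and applies it to retinal OCT image classification. On top of the image features extracted by PCANet, the algorithm uses LDA to obtain features with stronger class discriminability. Test results on two retinal OCT image datasets show that, compared with the PCANet algorithm, the correct classification rate of the PCANet-LDA algorithm is significantly higher and matches or exceeds that of the representative ScSPM OCT image classification algorithm. PCANet-LDA can therefore serve as a baseline for future retinal OCT image classification, against which the effectiveness of different algorithms can be compared. Future work will validate the algorithm on larger OCT datasets and on other typical image datasets.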

