Nuclear cataract classification based on multi-region fusion attention network model

Zhang Xiaoqing; Xiao Zunjie; Risa Higashita; Chen Wan; Hu Yan; Yuan Jin; Liu Jiang

Published: 2022-03-19
DOI: 10.11834/jig.210735
2022 | Volume 27 | Number 3

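Zhang Xiaoqing¹, Xiao Zunjie¹, Risa Higashita¹,², Chen Wan³, Hu Yan¹, Yuan Jin³, Liu Jiang¹,⁴,⁵ (1. Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen 518055; 2. TOMEY Corporation, Nagoya 451-0051, Japan; 3. Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou 510060; 4. Cixi Institute of Biomedical Engineering, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo 315201; 5. Guangdong Provincial Key Laboratory of Brain-inspired Intelligent Computation, Shenzhen 518055)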

Abstract

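Objective Nuclear cataract is a leading cause of blindness and visual impairment; early intervention and cataract surgery can effectively improve patients' vision and quality of life. Anterior segment optical coherence tomography (AS-OCT) images capture cataract opacity information in a non-contact, objective, and rapid manner. Clinical studies have found that, in AS-OCT images, nuclear cataract severity is strongly correlated with pixel features of the nucleus region, such as the mean value, with high repeatability. However, little work has addressed automatic nuclear cataract classification on AS-OCT images, and the classification results leave considerable room for improvement. This paper therefore proposes a novel multi-region fusion attention network (MRA-Net) to accurately classify nuclear cataract severity in AS-OCT images. Method Within the proposed model, a multi-region fusion attention (MRA) block fuses the feature representations of different nucleus regions to improve classification results; in addition, the paper examines how splitting the AS-OCT image dataset by participant or by eye affects nuclear cataract classification results. Result On a self-built AS-OCT image dataset, the proposed model achieves an overall classification accuracy of 87.78%, at least 1% higher than the compared methods. Results for 10 classification algorithms show that classification results on the eye-based AS-OCT dataset are better than those on the participant-based AS-OCT dataset, with maximum improvements of 4.03% in F1 and 8% in Kappa. Conclusion The proposed model accounts for differences in feature distribution across regions of a feature map, which makes nuclear cataract classification more accurate. Because the two eyes of the same person have similar nuclear cataract severity, the results for the two splitting methods suggest that AS-OCT image datasets for cataract should be split by participant.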
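The results above are reported as accuracy, F1, and Kappa. As a rough illustration of how these three numbers relate to a set of predictions, the following pure-Python sketch computes them from label lists. It assumes a macro-averaged F1 and unweighted Cohen's kappa; the abstract does not state which variants were used, and the toy labels are invented for illustration only.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the reference labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores (macro averaging is an assumption)."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in labels:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


def cohen_kappa(y_true, y_pred):
    """Unweighted Cohen's kappa: agreement corrected for chance agreement."""
    n = len(y_true)
    p_o = accuracy(y_true, y_pred)
    true_counts, pred_counts = Counter(y_true), Counter(y_pred)
    # Chance agreement from the two marginal label distributions.
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / n ** 2
    if p_e == 1.0:
        return 1.0  # degenerate case: only one class present in both lists
    return (p_o - p_e) / (1 - p_e)


# Invented toy labels (three severity levels), not data from the paper.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]
print(accuracy(y_true, y_pred), macro_f1(y_true, y_pred), cohen_kappa(y_true, y_pred))
```

Kappa discounts the agreement expected by chance, which is one reason it can move by a larger margin than accuracy or F1 when the evaluation protocol changes.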

Keywords

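nuclear cataract classification; anterior segment optical coherence tomography (AS-OCT) image; multi-region fusion attention block; deep learning; nucleus region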

Nuclear cataract classification based on multi-region fusion attention network model

Zhang Xiaoqing¹, Xiao Zunjie¹, Risa Higashita¹,², Chen Wan³, Hu Yan¹, Yuan Jin³, Liu Jiang¹,⁴,⁵ (1. Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen 518055, China; 2. TOMEY Corporation, Nagoya 451-0051, Japan; 3. Zhongshan Ophthalmic Center, Sun Yat-sen University, Guangzhou 510060, China; 4. Cixi Institute of Biomedical Engineering, Ningbo Institute of Materials Technology and Engineering, Chinese Academy of Sciences, Ningbo 315201, China; 5. Guangdong Provincial Key Laboratory of Brain-inspired Intelligent Computation, Shenzhen 518055, China)

Abstract

Objective Cataracts are the primary cause of blindness and vision impairment. Early intervention and cataract surgery can effectively improve the vision and quality of life of cataract patients. Anterior segment optical coherence tomography (AS-OCT) images capture cataract opacity information in a non-contact, objective, and fast manner. Compared with other ophthalmic images such as fundus images, AS-OCT images capture the nucleus region clearly, which is important for nuclear cataract (NC) diagnosis. Clinical studies have identified a strong correlation, with high repeatability, between the average density value of the nucleus region and NC severity levels in AS-OCT images. Clinical work has also suggested correlations between different nucleus regions and NC severity levels. These studies provide a clinical reference for automatic AS-OCT image-based NC classification. However, automatic NC classification based on AS-OCT images has rarely been studied, and there is considerable room to improve NC classification performance on AS-OCT images. Method Motivated by clinical research on NC, this paper proposes an efficient multi-region fusion attention network (MRA-Net) that incorporates clinical prior knowledge to classify NC severity levels on AS-OCT images accurately. In MRA-Net, we construct a multi-region fusion attention (MRA) block that fuses feature representations from different nucleus regions to enhance overall classification performance: a summation operation fuses the information from different regions, and a softmax function emphasizes salient channels and suppresses redundant ones. Because residual connections can alleviate the vanishing-gradient problem, the MRA block is embedded in a series of Residual-MRA modules that form MRA-Net.
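The MRA block is described here only at a high level: region features are fused by summation, and a softmax produces channel weights. The NumPy sketch below is one possible reading of that description, not the authors' implementation. The choice of equal-height horizontal strips as regions, the use of average pooling, the absence of learned layers, and the names `mra_block` and `residual_mra` are all assumptions made for illustration.

```python
import numpy as np


def softmax(x):
    """Numerically stable softmax over a 1-D vector."""
    e = np.exp(x - np.max(x))
    return e / e.sum()


def mra_block(features, num_regions=3):
    """Reweight the channels of a (C, H, W) feature map.

    Assumptions (not from the paper): regions are equal-height horizontal
    strips, each region is summarized by per-channel average pooling, and
    no learned parameters are involved.
    """
    strips = np.array_split(features, num_regions, axis=1)
    # One C-dimensional descriptor per region.
    descriptors = [s.mean(axis=(1, 2)) for s in strips]
    # Summation fusion of the region descriptors.
    fused = np.sum(descriptors, axis=0)
    # Softmax across channels: larger fused responses get larger weights.
    weights = softmax(fused)
    return features * weights[:, None, None], weights


def residual_mra(features, num_regions=3):
    """Residual connection around the block (hypothetical Residual-MRA unit)."""
    reweighted, _ = mra_block(features, num_regions)
    return features + reweighted


# Random stand-in for a feature map with 8 channels.
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 12, 10))
out, w = mra_block(x)
print(out.shape, w.round(3))
```

Because the softmax weights sum to one across channels, the reweighted map is small in magnitude; the residual addition keeps the original signal intact alongside it.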
Moreover, we test the impact of two dataset splitting methods on NC classification results, a participant-based method and an eye-based method, a factor that previous work has often overlooked. For training, the original AS-OCT images are resized to 224×224 pixels as network inputs, and the batch size is set to 16. The stochastic gradient descent (SGD) optimizer is used with default settings, and the network is trained for 100 epochs. Result On a clinical AS-OCT image dataset, the proposed MRA-Net achieves 87.78% accuracy, a 1% improvement over the squeeze-and-excitation network (SENet). Comparative experiments with ResNet as the backbone network also verify that the summation operation works better than concatenation in the MRA block. The results for the two dataset splitting methods also show that ten classification methods, including MRA-Net and SENet, obtain better classification results on the eye-based dataset than on the participant-based dataset; the largest improvements in F1 and Kappa are 4.03% and 8%, respectively. Conclusion MRA-Net considers the differences in feature distribution across regions of a feature map and incorporates clinical priors into the network architecture design. It achieves better classification performance than advanced methods. Given that NC severity is similar in the two eyes of the same participant, the results for the two splitting methods suggest that the AS-OCT image dataset should be split at the participant level rather than the eye level, which ensures that all images from a participant fall into either the training set or the test set. Overall, MRA-Net has potential as a computer-aided diagnosis tool to assist clinicians in diagnosing cataracts.
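The difference between the two splitting methods can be sketched with a small grouping helper. The example below is illustrative only: the record fields, the helper names `split_by_group` and `shared_participants`, the test fraction, and the toy records are hypothetical and do not come from the paper. It shows that grouping by participant keeps both eyes of a person on the same side of the split, whereas grouping by eye gives no such guarantee.

```python
import random


def split_by_group(records, key, test_fraction=0.3, seed=0):
    """Split records so that all records sharing records[i][key] stay together."""
    groups = sorted({r[key] for r in records})
    rng = random.Random(seed)
    rng.shuffle(groups)
    n_test = max(1, round(len(groups) * test_fraction))
    test_groups = set(groups[:n_test])
    train = [r for r in records if r[key] not in test_groups]
    test = [r for r in records if r[key] in test_groups]
    return train, test


def shared_participants(train, test):
    """Participants that appear on both sides of a split."""
    return {r["participant"] for r in train} & {r["participant"] for r in test}


# Hypothetical toy records: 6 participants, 2 eyes each, 2 images per eye.
records = [
    {"participant": p, "eye": f"{p}-{side}", "image": f"{p}-{side}-{i}.png"}
    for p in ("P1", "P2", "P3", "P4", "P5", "P6")
    for side in ("L", "R")
    for i in range(2)
]

# Participant-based split: no participant can appear in both sets.
train_p, test_p = split_by_group(records, key="participant")
print("participant split overlap:", shared_participants(train_p, test_p))

# Eye-based split: the two eyes of one participant may land in different sets.
train_e, test_e = split_by_group(records, key="eye")
print("eye split overlap:", shared_participants(train_e, test_e))
```

If the two eyes of a participant have similar severity, an eye-based split can put near-duplicate cases in both training and test sets, which is one plausible explanation for the higher eye-based scores reported above.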

Keywords

nuclear cataract classification; anterior segment optical coherence tomography (AS-OCT) image; multi-region fusion attention block; deep learning; nucleus region
