Print

发布时间: 2020-08-16
摘要点击次数:
全文下载次数:
DOI: 10.11834/jig.200005
2020 | Volume 25 | Number 8




    医学图像处理    




  <<上一篇 




  下一篇>> 





压缩感知下溃疡性结肠炎辅助诊断
expand article info 杨海清, 孙道洋
浙江工业大学信息工程学院, 杭州 310023

摘要

目的 长期感染溃疡性结肠炎(ulcerative colitis,UC)的患者罹患结肠癌的风险显著提升,因此早期进行结肠镜检测十分必要,但内窥镜图像数量巨大且伴有噪声干扰,需要找到精确的图像特征,为医师提供计算机辅助诊断。为解决UC图像与正常肠道图像的分类问题,提出了一种基于压缩感知和空间金字塔池化结合的图像特征提取方法。方法 使用块递归最小二乘(block recursive least squares,BRLS)进行初始字典训练。提出基于先验知识进行观测矩阵与稀疏字典的交替优化算法,并利用压缩感知框架获得图像的稀疏表示,该框架改善了原来基于稀疏编码的图像分类方法无法精确表示图像的问题,然后结合最大空间金字塔池化方法提取压缩感知空间金字塔池化(compressed sensing spatial pyramid pooling,CSSPP)图像特征,由于压缩感知的引入,获得的图像特征比稀疏编码更加丰富和精确。最后使用线性核支持向量机(support vector machine,SVM)进行图像分类。结果 对Kvasir数据集中的2 000幅真实肠道图像的分类结果表明,该特征的准确率比特征袋(bag of features,BoF)、稀疏编码空间金字塔匹配(sparse coding spatial pyramid matching,SCSPM)和局部约束线性编码(locality-constrained linear coding,LLC)分别提升了12.35%、3.99%和2.27%。结论 本文提出的溃疡性结肠炎辅助诊断模型,综合了压缩感知和空间金字塔池化的优点,获得了较对比方法更加精确的识别感染图像检测结果。

关键词

溃疡性结肠炎; 计算机辅助诊断; 压缩感知; 交替优化; 空间金字塔池化

Adjunctive diagnosis of ulcerative colitis under compressed sensing
expand article info Yang Haiqing, Sun Daoyang
College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China
Supported by: Natural Science Foundation of Zhejiang Province, China (LY13F010008); Project of Science and Technology of Zhejiang Province, China (2015F50009)

Abstract

Objective Patients with chronic ulcerative colitis (UC) have a significantly increased risk of developing colon cancer, and an early detection via colonoscopy is necessary. Assisted diagnosis of UC is among the computer-assisted diagnostic topics in medical gastrointestinal endoscopy research that aims to find reliable features for identifying lesions in intestinal images. The typical endoscopic findings from UC images include loss of vascular patterns, granularity, bleeding mucous membranes, and ulcers. UC often repeats recurrence and remission cycles during its course and may be accompanied by parenteral complications. The risk of colon cancer increases when UC extensively affects the large intestine for long periods. However, endoscopes take a large amount of intestinal images, and noise interference, such as shadows, is often observed in these images. Therefore, accurate image features must be identified, and computer-aided diagnosis must be provided to physicians. The extant convolutional neural networks have demonstrated extraordinary capabilities in image classification but requires the support of large datasets and a very long training process. In sparsely coded image recognition, sparse coding pyramid matching (SCSPM) applies selective sparse coding to extract the significant characteristics of scale-invariant feature transform (SIFT) descriptors for local image blocks. Sparse coding also maximizes pooling on multiple spatial scales to combine translation and scale invariance instead of averaging the pools in the histogram. Locally constrained linear coding (LLC) refers to the rapid implementation of local coordinate coding, where local constraints are used to project each descriptor onto a local coordinate system. However, sparse coding cannot obtain an accurate image representation. The theory of compressed sensing is based on the sparseness of the signal and uses an underdetermined random observation matrix for sampling such signal. Each measurement value is not the signal itself but a linear combination function of multiple signals, that is, each measurement contains a small amount of information for all signals. The sparse reconstruction algorithm is used to achieve an accurate reconstruction of signals or an approximate reconstruction with certain errors. To solve the problems in UC and normal intestinal image classification, an image feature based on a combination of compressed sensing and spatial pyramid pooling is proposed in this paper. Method This paper proposes compressed perceptual spatial pyramid pooling (CSSPP) image features. First, block recursive least squares (BRLS) and K-singular value decomposition(K-SVD) dictionary learning are used to train some positive and negative datasets to obtain two initial dictionaries for comparing the two algorithms. Second, an alternating optimization algorithm based on the UC image prior perception matrix and sparse dictionary is developed. When finding the optimal perception matrix, the Gram matrix of the equivalent dictionary is considered, the sparse representation error is designed as prior knowledge, and the performance of the system is tested at different sampling rates. In dictionary optimization, the influence of the perception matrix on dictionary errors is considered, and the impact of different sparsity on classification accuracy is tested. Using the obtained sensing matrix and sparse dictionary to obtain sparse representations of all datasets in a compressed sensing framework can address the problem where the original image classification method based on sparse coding cannot accurately represent images. The maximum and average spatial pyramid pooling methods are then used to divide the image into 21 regions on multiple scales to extract CSSPP features. By introducing compressed sensing, the information of sparse representation is richer and more accurate than that of sparse coding. A linear kernel support vector machine (SVM) is used for image classification. Result The proposed algorithm was trained on 100 UC and 100 normal images taken from the Kvasir dataset. During the training phase, the mean square error of the dictionary obtained by BRLS training was slightly lower than that of K-SVD. The highest image restoration accuracy was obtained at a sampling rate of 31.25%. The performance of the two pooling features was examined under different sparsity. The maximum pooling in training accuracy was significantly higher than the average. The quantitative evaluation indicators include sensitivity, specificity, accuracy, receiver operating characferistic(ROC) curve, area under curve(AUC) value, and time efficiency. Tests were then performed on all 2 000 datasets, and the classification results show that the features can improve the auxiliary diagnosis of UC images. Compared with bag of features (BOF), SCSPM, and LLC, using the SVM classification results increases the AUC values by 0.092, 0.035, and 0.015, the sensitivity by 9.19%, 2.84%, and -0.24%, the specificity by 15.52%, 5.15%, and 4.77%, and the accuracy by 12.35%, 3.99%, and 2.27%, respectively. Compared with BOF and SCSPM, the running time was reduced by 2 164.5 and 1 455.6 seconds, respectively. Conclusion The auxiliary diagnosis model of UC proposed in this paper combines the advantages of compressed sensing and spatial pyramid pooling to obtain highly accurate detection results for identifying infected images. Compressed sensing was utilized instead of sparse coding to obtain the ideal classification results. Similar to neural networks, sufficient information is needed to train the system to obtain highly accurate results. Medical-image-assisted diagnosis requires highly comprehensive and rich image information as criterion. Using highly comprehensive image features to replace human subjective features presents a promising development trend. Therefore, the next goal is to provide doctors with auxiliary diagnosis of UC in real time. Accordingly, how to increase the complexity of the algorithm and the real-time nature of the system needs to be studied in the future.

Key words

ulcerative colitis (UC); computer aided diagnosis; compressed sensing(CS); alternate optimization; spatial pyramid pooling(SPP)

0 引言

溃疡性结肠炎(ulcerative colitis,UC)是一种弥漫性非特异性炎症疾病。不健康饮食和恶劣环境是成年人感染UC的重要原因之一。长期患有广泛性病变的UC患者罹患结直肠癌的风险显著升高(Matsuoka等,2018)。标准的医学检查方法是根据临床发现进行结肠镜检查以确诊UC(D’Haens等,2007)。典型的内窥镜检查结果包括血管模式丧失,颗粒状和出血的黏膜,以及溃疡(Travis等,2013)。

当前内窥镜图像主要依赖医师人工阅片,由于内窥镜采集图像数量巨大,需要耗费很大精力寻找病灶图像(Li等,2015a),为减少工作量并为疾病提供准确诊断,设计先进的计算机辅助诊断已成为热门课题(Deeba等,2018Zhang等,2019)。但是一般内窥镜图像存在颜色失真问题且伴随亮斑、气泡等干扰噪声(如图 1所示),即使同一病症的溃疡性肠道图像也具有不同的颜色图案(Li等,2009),需要寻找足以表示图像的特征,使得正常图像与疾病图像具有足够的区分度。分类的常规方法是先进行特征提取,然后将特征传入分类器。特征提取包括提取纹理等物理特征和稀疏表示等信号处理特征。Maeda等人(2019)基于纹理特征对UC进行诊断,首先对图像进行纹理分析,然后将特征传入支持向量机(support vector machine,SVM)分类器,判断UC处于“活跃”还是“愈合”状态,诊断准确率达91%。稀疏特征分类也越来越多地用于肠道图像分类(Yuan等,2016, 2017)。Li和Perona(2005)提出特征袋(bags of features,BoF)模型,主要思想是使用K-means统计特征点在全图的分布特征,生成全局直方图作为特征向量,但丢失了图像的局部信息。Lazebnik等人(2006)提出空间金字塔匹配(spatial pyramid matching,SPM)模型,在图像的不同分辨率中统计分布特性,获得图像的全局和局部信息。Yang等人(2009)提出稀疏编码空间金字塔匹配(sparse coding spatial pyramid patching,SCSPM)算法,打破了聚类算法稀疏度的限制,对图像提取SIFT(scale-invariant feature transform)特征,并在训练好的码本上进行稀疏编码获得稀疏矩阵,然后在空间金字塔上最大池化获取特征向量。Wang等人(2010)基于非零稀疏系数通常分配给编码数据附近的原子,提出了局部约束线性编码(locality-constrained linear coding,LLC),对图像提取方向梯度直方图(histogram of oriented gradient,HOG)特征,然后在稀疏编码阶段引入局部约束,不仅要满足稀疏性的要求,非零稀疏还要赋值给相近字典的列。

图 1 含噪声的肠镜图像
Fig. 1 Noisy colonoscopy images ((a) facula; (b) bubble; (c) facula and bubble)

神经网络将特征提取和分类合并为一个框架,不同层代表不同的意义(迟剑宁等,2018)。Ozawa等人(2019)使用GoogLeNet架构构建了基于卷积神经网络(convolutional neural network,CNN)的UC检测系统。但是为了满足实时性的要求,使用纹理特征相对简单,而神经网络算法需要大量的数据集支撑(Yang和Bang,2019)。

本文提出一种用于UC图像分类的新特征提取方法,采用基于先验的压缩感知(compressed sensing,CS)框架,并结合空间金字塔池化(spatial pyramid pooling,SPP)提取图像在不同分辨率的特征。首先,UC与正常肠道在全局和局部存在差异性,利用空间金字塔将图像分为全局图像和多层的局部图像,更好地利用图像空间分布信息。其次,在CS框架下进行图像表示具有如下优势:1)针对图像信息的冗余性,对图像进行压缩采样,降低内存需求; 2)设计的等效字典保证了图像表示的稀疏性和医学肠镜图像的精确性; 3)基于稀疏表示误差优化稀疏字典和观测矩阵,对肠道的噪声更具有鲁棒性。

1 压缩感知基础

CS作为新兴的研究领域一直备受关注(Candes和Wakin,2008Duarte和Eldar,2011),CS技术已经应用在图像压缩(Xie等,2016王金铭等,2017)、信号检测(Guo等,2019)和分类(Liu等,2019)等领域中。

使用数学框架从$M≪N$的观测矢量$\boldsymbol{y}∈{\bf{R}}^{M×1}$中精确恢复信号矢量$\boldsymbol{x}∈{\bf{R}}^{N×1}$(Donoho,2006),测量过程可以通过传感矩阵投影原始信号来实现,即

$ \mathit{\boldsymbol{y}} = \mathit{\boldsymbol{ \boldsymbol{\varPhi} x}} $ (1)

式中,$\boldsymbol{\varPhi }∈{\bf{R}}^{M×N}$称为传感矩阵或观测矩阵。

因为$M≪N$,所以式(1)是一个欠定问题,并且存在无限多个解。稀疏性是CS理论中的一种约束,可以确保$\boldsymbol{y}$$\boldsymbol{x}$之间的映射一一对应(Candes和Wakin,2008)。假设原始信号表示为

$ \mathit{\boldsymbol{x}} = \sum\limits_{k = 1}^L {{s_k}{\mathit{\boldsymbol{\psi }}_k}} = \mathit{\boldsymbol{ \boldsymbol{\varPsi} s}} $ (2)

式中,$\boldsymbol{\varPsi }∈{\bf{R}}^{N×L}$是稀疏字典,$\boldsymbol{\psi }_{k}$$\boldsymbol{\varPsi }$的第$k$列原子,$\boldsymbol{s}∈{\bf{R}}^{L×1}$是稀疏表示系数,$s_{k}$$\boldsymbol{s}$的第$k$个值。如果$‖\boldsymbol{s}‖_{0}=K$,则由式(2)给出的向量$\boldsymbol{x}$表示在$\boldsymbol{\varPsi }$上是$K$稀疏的。那么,结合式(1),可得

$ \mathit{\boldsymbol{y}} = \mathit{\boldsymbol{ \boldsymbol{\varPhi} \boldsymbol{\varPsi} s}} = \mathit{\boldsymbol{Ds}} $ (3)

式中,$\boldsymbol{D}∈{\bf{R}}^{M×L}$是等效字典。若要精确恢复出$\boldsymbol{x}$,需求解

$ {\rm{arg}}{\kern 1pt} {\kern 1pt} \mathop {{\rm{min}}}\limits_\mathit{\boldsymbol{s}} {\left\| \mathit{\boldsymbol{s}} \right\|_0}{\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\rm{ s}}{\rm{.}}{\kern 1pt} {\kern 1pt} {\kern 1pt} {\rm{t}}{\rm{. }}{\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} \mathit{\boldsymbol{y}} = \mathit{\boldsymbol{Ds}} $ (4)

求解可以使用贪婪算法,如正交匹配追踪(orthogonal matching pursuit,OMP)(Tropp,2004)。

CS系统的性能由$\boldsymbol{D}$的特性决定,互相干性是$\boldsymbol{D}$的特性之一,定义为

$ \mu (\mathit{\boldsymbol{D}}) = \mathop {{\rm{max}}}\limits_{1 \le k \ne j \le L} \frac{{|\mathit{\boldsymbol{d}}_k^{\rm{T}}{\mathit{\boldsymbol{d}}_j}|}}{{{{\left\| {{\mathit{\boldsymbol{d}}_k}} \right\|}_2}{{\left\| {{\mathit{\boldsymbol{d}}_j}} \right\|}_2}}} $ (5)

式中,$\boldsymbol{d}_{k}$$\boldsymbol{D}$的第$k$列原子,$‖·‖_{2}$表示L2范数。若要从测量中精确地恢复$K$稀疏信号,只要满足$K < \frac{{1}}{{2}}[1+ \frac{{1}}{{μ(\boldsymbol{D})}}$

在给定$\boldsymbol{\varPsi }$的情况下,Elad(2007)首先提出对$\boldsymbol{\varPhi }$进行优化,即使用平均互相干$μ_{\rm av}(\boldsymbol{D})$代替$μ(\boldsymbol{D})$。此后,$\boldsymbol{\varPhi }$优化基本都是建立在优化$\boldsymbol{D}$的Gram矩阵上,其定义为$\boldsymbol{D}^{\rm T}\boldsymbol{D}$, 可以表示为

$ {\mathit{\boldsymbol{ \boldsymbol{\varPhi} }}_{{\rm{opt}}}} = {\rm{arg}}\mathop {{\rm{min}}}\limits_{\mathit{\boldsymbol{ \boldsymbol{\varPhi} }},{\mathit{\boldsymbol{G}}_t} \in {\mathit{\boldsymbol{S}}_t}} \left\| {{\mathit{\boldsymbol{G}}_t} - \mathit{\boldsymbol{G}}} \right\|_{\rm{F}}^2 $ (6)

式中,$‖·‖_{\rm F}$是Frobenius范数,$\boldsymbol{G}_{t}$是目标Gram矩阵,属于具有一些特定属性的非空集$\boldsymbol{S}_{t}$矩阵(Li等,2015bBai等,2015)。

在压缩感知中,信号稀疏表示的字典学习即求解

$ \begin{array}{*{20}{c}} {\mathit{\boldsymbol{ \boldsymbol{\varPsi} }} = {\rm{arg}}\mathop {{\rm{min}}}\limits_\mathit{\boldsymbol{\psi }} \left\| {\mathit{\boldsymbol{X}} - \mathit{\boldsymbol{ \boldsymbol{\varPsi} S}}} \right\|_{\rm{F}}^2}\\ {{\rm{ s}}{\rm{. t}}{\rm{. }}{\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {{\left\| {{\mathit{\boldsymbol{\psi }}_j}} \right\|}_2} = 1,\forall j} \end{array} $ (7)

式中,$\boldsymbol{X}$是图像,$\boldsymbol{S}$是其稀疏系数矩阵。Engan等人(2002)提出最优方向法(method of optimal directions, MOD),直接使用简单乘积$\boldsymbol{\varPsi }=\boldsymbol{XS}^{\rm T}(\boldsymbol{SS}^{\rm T})^{-1}$进行字典学习。Aharon等人(2006)提出K奇异值分解(K-singular value decomposition, K-SVD)字典学习算法,逐个更新字典原子,取得了比MOD更好的效果。为了降低初始字典对字典设计的影响,并且保证收敛性,Jiang等人(2016)提出块递归最小二乘(block recursive least squares,BRLS)字典学习算法,将以块为单位的训练数据用于自适应字典学习,可以获得更好的性能。

2 本文框架

本文提出的诊断UC图像方法的框架流程如图 2所示。数据集采用Kvasir数据集,由挪威维斯特维肯健康信托基金(Vestre Viken health trust,VVHT)的内窥镜设备采集的图像组成,并由经验丰富的内镜医师进行注释和验证。本文方法的主要思想如下:1)从原始数据集中均匀挑选部分数据集,使用BRLS字典学习算法,训练获得初始字典$\boldsymbol{\varPsi }_{0}$; 2)采用基于先验知识的联合优化方法,交替优化测量矩阵$\boldsymbol{\varPhi }$和稀疏字典$\boldsymbol{\varPsi }$,利用$\boldsymbol{\varPhi }$$\boldsymbol{\varPsi }$相乘获得等效字典$\boldsymbol{D}$; 3)使用OMP算法求得$\boldsymbol{Y}$$\boldsymbol{D}$上的稀疏系数矩阵$\boldsymbol{S}$,在空间多尺度金字塔上对$\boldsymbol{S}$进行最大池化来获得合并特征,然后将合并特征连接并归一化,获得长特征向量$\boldsymbol{f}$; 4)所有数据集进行特征化之后,将特征矩阵$\boldsymbol{F}$和数据集标签一起传入SVM分类器,获得分类的精度。

图 2 本文方法的框架流程图
Fig. 2 Framework flow chart of ours

3 观测矩阵和稀疏字典的交替优化

$512×512$像素尺寸的肠道图像直接使用$\boldsymbol{\varPhi }$进行压缩采样,效率显然很低。通常做法是将其分割为一系列等大小图像补丁,并将补丁拉成一列形成新的数据集图像。

$\boldsymbol{x}_{k}$为训练图像序列,其中$k=1, 2, …, P$。稀疏字典设计意味着求解

$ \begin{array}{*{20}{c}} {\mathit{\boldsymbol{ \boldsymbol{\varPsi} }} = {\rm{arg}}\mathop {{\rm{min}}}\limits_{\mathit{\boldsymbol{\psi }},{\mathit{\boldsymbol{s}}_k}} \sum\limits_{k = 1}^p {\left\| {{\mathit{\boldsymbol{x}}_k} - \mathit{\boldsymbol{ \boldsymbol{\varPsi} }}{\mathit{\boldsymbol{s}}_k}} \right\|_{\rm{F}}^2} }\\ {{\rm{ s}}{\rm{. t}}{\rm{. }}{\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} \left\| {{\mathit{\boldsymbol{s}}_k}} \right\| \le K} \end{array} $ (8)

式中,字典$\boldsymbol{\varPsi }$通常满足$\forall l, ‖{\boldsymbol{\varPsi }_{l}‖}_{2}=1$。解决该问题的有效算法,如基于BRLS的算法(Jiang等,2016)。

稀疏表示误差$\boldsymbol{e}_{k}=\boldsymbol{x}_{k}-\boldsymbol{\varPsi }\boldsymbol{s}_{k}$通常不为零。将$\boldsymbol{X}$$\boldsymbol{S}$分别表示为训练图像矩阵和稀疏系数矩阵,即

$ \mathit{\boldsymbol{X}}(:,\mathit{\boldsymbol{k}}) = {\mathit{\boldsymbol{x}}_k},\mathit{\boldsymbol{S}}(:,k) = {\mathit{\boldsymbol{s}}_k},\forall k $

然后给出稀疏表示误差矩阵

$ \mathit{\boldsymbol{E}} = \mathit{\boldsymbol{X}} - \mathit{\boldsymbol{ \boldsymbol{\varPsi} S}} $

$\boldsymbol{y}_{k}=\boldsymbol{\varPhi }\boldsymbol{x}_{k}$$\boldsymbol{x}_{k}$在观测矩阵$\boldsymbol{\varPhi }$作用下的投影,则相应的测量值$\boldsymbol{y}_{k}$

$ {\mathit{\boldsymbol{y}}_k} = \mathit{\boldsymbol{ \boldsymbol{\varPhi} \boldsymbol{\varPsi} }}{\mathit{\boldsymbol{s}}_k} + \mathit{\boldsymbol{ \boldsymbol{\varPhi} }}{\mathit{\boldsymbol{e}}_k} $

或者矩阵形式为$\boldsymbol{Y}=\boldsymbol{\varPhi }\boldsymbol{X}$,即

$ \mathit{\boldsymbol{Y}} = \mathit{\boldsymbol{DS}} + \mathit{\boldsymbol{ \boldsymbol{\varPhi} E}} $

式中,$ε=\boldsymbol{\varPhi }\boldsymbol{E}$$\boldsymbol{Y}$$\boldsymbol{D}$中的稀疏表示误差。

3.1 基于先验$E$的观测矩阵求解

为获得肠道图像精确的稀疏特征,设计的$\boldsymbol{\varPhi }$尽可能地减少$\boldsymbol{E}$,同时满足

$ {\mathit{\boldsymbol{ \boldsymbol{\varPhi} }}_{{\rm{opt}}}} = {\rm{arg}}\mathop {{\rm{min}}}\limits_{\mathit{\boldsymbol{ \boldsymbol{\varPhi} }},{\mathit{\boldsymbol{G}}_t} \in {\mathit{\boldsymbol{S}}_t}} \left\| {{\mathit{\boldsymbol{G}}_t} - \mathit{\boldsymbol{G}}} \right\|_{\rm{F}}^2 $ (9)

式中,$\boldsymbol{G}$是等效字典的Gram矩阵,$\boldsymbol{G}_{t}$是一个目标Gram矩阵,在一组松弛的等角紧框架(equiangular tight frame, ETF) Gram矩阵$\boldsymbol{S}_{t}$中搜索

$ \begin{array}{*{20}{l}} {{\mathit{\boldsymbol{S}}_t} = \{ {\mathit{\boldsymbol{G}}_t} \in {{{\bf{R}}}^{L \times L}}:{\mathit{\boldsymbol{G}}_t} = \mathit{\boldsymbol{G}}_t^{\rm{T}},{G_t}(k,k) = 1,}\\ {{\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} \forall k,\mathop {{\rm{max}}}\limits_{i \ne j} |{G_t}(i,j)| \le \eta \} } \end{array} $ (10)

式中,参数$0≤η < 1$是一个常数,用于控制搜索空间以获得最佳$\boldsymbol{G}_{t}$,其定义为

$ |\langle {\mathit{\boldsymbol{d}}_k},{\mathit{\boldsymbol{d}}_j}\rangle | = \eta ,\forall k \ne j $ (11)

Fickus等人(2012)已经证明,参数$η$具有最小值,仅与参数$M、L$相关,且

$ \mu = \sqrt {\frac{{L - M}}{{M(L - 1)}}} \le \eta $

$η=μ$时,$L$维的理想ETF Gram被限制在$\boldsymbol{S}_{t}$

为了解决在ETF下尽可能减小$\boldsymbol{E}$的问题,提出目标函数

$ \rho (\mathit{\boldsymbol{ \boldsymbol{\varPhi} }},{\mathit{\boldsymbol{G}}_t}) = \left\| {{\mathit{\boldsymbol{G}}_t} - \mathit{\boldsymbol{G}}} \right\|_{\rm{F}}^2 + \alpha \left\| {\mathit{\boldsymbol{ \boldsymbol{\varPhi} E}}} \right\|_{\rm{F}}^2 $ (12)

式中,$\boldsymbol{G}=\boldsymbol{D}^{\rm T}\boldsymbol{D}, \boldsymbol{D}=\boldsymbol{\varPhi }\boldsymbol{\varPsi }$是等效字典,$α>0$是控制目标函数中稀疏表示误差$\boldsymbol{E}$的系数。因此,该框架下的观测矩阵优化问题表述为

$ \begin{array}{l} {\mathit{\boldsymbol{ \boldsymbol{\varPhi} }}_{{\rm{opt}}}} = {\rm{arg}}\mathop {{\rm{min}}}\limits_{\mathit{\boldsymbol{ \boldsymbol{\varPhi} }},{\mathit{\boldsymbol{G}}_t} \in {\mathit{\boldsymbol{S}}_t}} \rho (\mathit{\boldsymbol{ \boldsymbol{\varPhi} }},{\mathit{\boldsymbol{G}}_t})\\ {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\rm{s}}{\rm{.}}{\kern 1pt} {\kern 1pt} {\rm{t}}{\rm{.}}{\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} \mathit{\boldsymbol{G}} = {\mathit{\boldsymbol{ \boldsymbol{\varPsi} }}^{\rm{T}}}{\mathit{\boldsymbol{ \boldsymbol{\varPhi} }}^{\rm{T}}}\mathit{\boldsymbol{ \boldsymbol{\varPhi} \boldsymbol{\varPsi} }} \end{array} $ (13)

计算$\boldsymbol{G}$并投影到$\boldsymbol{S}_{t}$上,求解

$ {\mathit{\boldsymbol{G}}_t} = {\rm{arg}}\mathop {{\rm{min}}}\limits_{{\mathit{\boldsymbol{G}}_t} \in {\mathit{\boldsymbol{S}}_t}} \left\| {{\mathit{\boldsymbol{G}}_t} - \mathit{\boldsymbol{G}}} \right\|_{\rm{F}}^2 $ (14)

来更新$\boldsymbol{G}_{t}$,具体收缩操作为

$ {G_t}(i,j) = \left\{ {\begin{array}{*{20}{l}} {G(i,j)}&{i \ne j,|v| < \eta }\\ {{\rm{sgn}} [G(i,j)]\eta }&{i \ne j,|v| > \eta }\\ 1&{i = j} \end{array}} \right. $ (15)

式中,sgn[·]是符号函数。接下来,求解

$ {\mathit{\boldsymbol{ \boldsymbol{\varPhi} }}_{{\rm{opt}}}} = {\rm{arg}}\mathop {{\rm{min}}}\limits_\mathit{\boldsymbol{ \boldsymbol{\varPhi} }} \rho (\mathit{\boldsymbol{ \boldsymbol{\varPhi} }},{\mathit{\boldsymbol{G}}_t}) $ (16)

对式(16)进行计算,得

$ \begin{array}{*{20}{c}} {\rho (\mathit{\boldsymbol{ \boldsymbol{\varPhi} }},{\mathit{\boldsymbol{G}}_t}) = \left\| {{\mathit{\boldsymbol{G}}_t}} \right\|_{\rm{F}}^2 - 2{\rm{tr}} [\mathit{\boldsymbol{G}}{\mathit{\boldsymbol{G}}_t}] + \left\| \mathit{\boldsymbol{G}} \right\|_{\rm{F}}^2 + \alpha \left\| {\mathit{\boldsymbol{ \boldsymbol{\varPhi} E}}} \right\|_{\rm{F}}^2 = }\\ {\left\| {{\mathit{\boldsymbol{G}}_t}} \right\|_{\rm{F}}^2 + {\rm{tr}} [{\mathit{\boldsymbol{ \boldsymbol{\varPhi} }}^{\rm{T}}}\mathit{\boldsymbol{ \boldsymbol{\varPhi} }}{\mathit{\boldsymbol{A}}_d}{\mathit{\boldsymbol{ \boldsymbol{\varPhi} }}^{\rm{T}}}\mathit{\boldsymbol{ \boldsymbol{\varPhi} }}{\mathit{\boldsymbol{A}}_d}] - 2{\rm{tr}} [{\mathit{\boldsymbol{ \boldsymbol{\varPhi} }}^{\rm{T}}}\mathit{\boldsymbol{ \boldsymbol{\varPhi} B}}]} \end{array} $ (17)

式中,$\boldsymbol{A}_{d}=\boldsymbol{\varPsi }\boldsymbol{\varPsi }^{\rm T},\boldsymbol{B}=\boldsymbol{\varPsi }\boldsymbol{G}_{t}\boldsymbol{\varPsi }^{\rm T}- \frac{{α}}{{2}}\boldsymbol{EE}^{\rm T}$。令$\boldsymbol{A}_{d}=\boldsymbol{U}_{d}\boldsymbol{\varLambda }^{2}_{d}\boldsymbol{U}^{\rm T}_{d}$$\boldsymbol{A}_{d}$的奇异值分解(singular value decomposition, SVD),并且

$ {\mathit{\boldsymbol{A}}_{{\rm{sqr}}}} = {\mathit{\boldsymbol{U}}_d}{\mathit{\boldsymbol{ \boldsymbol{\varLambda} }}_d}\mathit{\boldsymbol{U}}_d^{\rm{T}},\quad C = \mathit{\boldsymbol{A}}_{{\rm{sqr}}}^{ - 1}\mathit{\boldsymbol{BA}}_{{\rm{sqr}}}^{ - 1} $ (18)

假设$\boldsymbol{\varPsi }$是满秩的,则式(17)可以表示为

$ \begin{array}{*{20}{l}} {\rho (\mathit{\boldsymbol{ \boldsymbol{\varPhi} ,}}{\mathit{\boldsymbol{G}}_t}) = \left\| {{\mathit{\boldsymbol{G}}_t}} \right\|_{\rm{F}}^2 + {\rm{tr}} [{\mathit{\boldsymbol{G}}_e}{\mathit{\boldsymbol{G}}_e}] - 2{\rm{tr}} [{\mathit{\boldsymbol{G}}_e}\mathit{\boldsymbol{C}}] = }\\ {{\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} \left\| {{\mathit{\boldsymbol{G}}_t}} \right\|_{\rm{F}}^2 + \ell ({\mathit{\boldsymbol{G}}_e},\mathit{\boldsymbol{C}})} \end{array} $ (19)

式中,$\boldsymbol{G}_{e}=\boldsymbol{A}_{\rm sqr}\boldsymbol{\varPhi }^{\rm T}\boldsymbol{\varPhi }\boldsymbol{A}_{\rm sqr}$, 为了减低表达,用函数ℓ表示两个迹相减。

$\boldsymbol{U}_{0}\boldsymbol{\varLambda }_{0} \boldsymbol{U}^{\rm T}_{0}$$\boldsymbol{C}$的特征分解,其中$\boldsymbol{\varLambda }_{0}={\rm diag}(λ_{1}, …, λ_{k}, …, λ_{N}), λ_{k}≥λ_{k+1}, \forall k$

定义$\boldsymbol{\varDelta }^{2}_{M}={\rm diag}(σ^{2}_{1}, …, σ^{2}_{k}, …, σ^{2}_{M})$,其中

$ {\sigma _k} = \left\{ {\begin{array}{*{20}{c}} {\sqrt {{\lambda _k}} }&{{\lambda _k} \ge 0}\\ 0&{{\lambda _k} \le 0} \end{array}} \right. $

然后,给出式(16)的一类解

$ {\mathit{\boldsymbol{ \boldsymbol{\varPhi} }}_{{\rm{opt}}}} = \mathit{\boldsymbol{U}}\left[ {\begin{array}{*{20}{c}} {{\mathit{\boldsymbol{ \boldsymbol{\varDelta} }}_M}}&0 \end{array}} \right]\left[ {\begin{array}{*{20}{c}} {\mathit{\boldsymbol{v}}_{11}^{\rm{T}}}&0\\ 0&{\mathit{\boldsymbol{v}}_{22}^{\rm{T}}} \end{array}} \right]\mathit{\boldsymbol{U}}_0^{\rm{T}}\mathit{\boldsymbol{A}}_{{\rm{sqr}}}^{\rm{T}} $ (20)

式中,$\boldsymbol{U}∈{\bf{R}}^{M×M}$$\boldsymbol{v}_{22}∈{\bf{R}}^{(N-M)×(N-M)}$都是任意正交矩阵,$\boldsymbol{v}_{11}∈{\bf{R}}^{M×M}$是由等于$\{σ_{k}\}$$\boldsymbol{v}^{\rm T}_{11}\boldsymbol{\varDelta }_{M}\boldsymbol{v}_{11}$的对角元素组成。由此,通过式(12)求解出基于先验知识优化的观测矩阵。

3.2 观测矩阵和稀疏字典的交替优化

在CS中,获得观测值$\boldsymbol{y}$在等效字典$\boldsymbol{D}$上的稀疏表示$\boldsymbol{\hat s}$,具体为

$ \mathit{\boldsymbol{\hat s}} = {\rm{arg}}\mathop {{\rm{min}}}\limits_{\mathit{\boldsymbol{\hat s}}} \left\| {\mathit{\boldsymbol{y}} - \mathit{\boldsymbol{D\hat s}}} \right\|_{\rm{F}}^2 \;\;\;\;\;{\rm{ s}}{\rm{.}}{\kern 1pt} {\kern 1pt} {\kern 1pt} {\rm{t}}{\rm{. }}{\left\| {\mathit{\boldsymbol{\hat s}}} \right\|_0} \le K $ (21)

影响$‖ \boldsymbol{s} - \boldsymbol{\hat s}‖_{2}$精度的重要因素之一就是$\boldsymbol{y}$$\boldsymbol{D}$上的稀疏性。所以在设计$\boldsymbol{\varPsi }$时,期望建模误差$\boldsymbol{E}$和测量稀疏表示误差$\boldsymbol{\varepsilon }$都尽可能小。因此,优化稀疏字典学习的目标函数为

$ \omega (\mathit{\boldsymbol{ \boldsymbol{\varPsi} }},\mathit{\boldsymbol{S}}) = (1 - \beta )\left\| {\mathit{\boldsymbol{ \boldsymbol{\varPhi} }}(\mathit{\boldsymbol{X}} - \mathit{\boldsymbol{ \boldsymbol{\varPsi} S}})} \right\|_{\rm{F}}^2 + \beta \left\| {\mathit{\boldsymbol{X}} - \mathit{\boldsymbol{ \boldsymbol{\varPsi} S}}} \right\|_{\rm{F}}^2 $ (22)

式中,$β$用于平衡$\boldsymbol{E}$$\varepsilon $。则若要获取优化的$\boldsymbol{\varPsi }$,即求解

$ \begin{array}{*{20}{c}} {{\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\mathit{\boldsymbol{ \boldsymbol{\varPsi} }}_{\rm opt}} = {\rm{arg}}\mathop {{\rm{min}}}\limits_{\mathit{\boldsymbol{ \boldsymbol{\varPsi} }},\mathit{\boldsymbol{S}}} \omega (\mathit{\boldsymbol{ \boldsymbol{\varPsi} }},\mathit{\boldsymbol{S}})}\\ {{\rm{s}}{\rm{.}}{\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\rm{t}}{\rm{.}}{\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {{\left\| {\mathit{\boldsymbol{S}}(:,k)} \right\|}_0} \le K,\quad \forall k} \end{array} $ (23)

并且通过一些代数运算,式(22)可以重写为

$ \omega (\mathit{\boldsymbol{ \boldsymbol{\varPsi} }},\mathit{\boldsymbol{S}}) = \left\| {\mathit{\boldsymbol{Z}} - \mathit{\boldsymbol{W \boldsymbol{\varPsi} S}}} \right\|_{\rm{F}}^2 $ (24)

式中,$ \boldsymbol{Z} = \begin{array}{l} \left[ \begin{matrix} \begin{array}{*{35}{l}} {\sqrt{1-β}\boldsymbol{\varPhi }\boldsymbol{X}} \\ {\sqrt{β} \boldsymbol{X}, } \end{array} \end{matrix} \right] \end{array}, \boldsymbol{W} = \begin{array}{l} \left[ \begin{matrix} \begin{array}{*{35}{l}} {\sqrt{1-β}\boldsymbol{\varPhi }} \\ {\sqrt{β} \boldsymbol{I}_{N}} \end{array} \end{matrix} \right] \end{array} $都独立于$\boldsymbol{S}$

由于式(24)中目标函数具有非凸性,因此采用交替优化的思想找到它的近似解。首先,利用BRLS稀疏字典$\boldsymbol{\varPsi }_{0}$和观测矩阵$\boldsymbol{\varPhi }$通过OMP算法得到$\boldsymbol{S}$的近似解。然后获取$\boldsymbol{\varPsi }$的近似解,具体为

$ \begin{array}{l} {\mathit{\boldsymbol{ \boldsymbol{\varPsi} }}_{{\rm{opt}}}} = {\rm{arg}}\mathop {{\rm{min}}}\limits_\mathit{\boldsymbol{ \boldsymbol{\varPsi} }} \left\| {\mathit{\boldsymbol{Z}} - \mathit{\boldsymbol{W \boldsymbol{\varPsi} S}}} \right\|_{\rm{F}}^2\\ {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\rm{s}}{\rm{.}}{\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\rm{t}}{\rm{.}}{\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\left\| {{\mathit{\boldsymbol{\psi }}_l}} \right\|_2} = 1 \end{array} $ (25)

式中,$\boldsymbol{Z}$$\boldsymbol{W}$$\boldsymbol{S}$都是已知的。给出$\boldsymbol{W}$$\boldsymbol{S}$的SVD如下

$ \mathit{\boldsymbol{W}} = {\mathit{\boldsymbol{U}}_W}\left[ {\begin{array}{*{20}{c}} {{\mathit{\boldsymbol{ \boldsymbol{\varDelta} }}_W}}&0\\ 0&0 \end{array}} \right]\mathit{\boldsymbol{V}}_B^{\rm{T}},\mathit{\boldsymbol{S}} = {\mathit{\boldsymbol{U}}_S}\left[ {\begin{array}{*{20}{c}} {{\mathit{\boldsymbol{ \boldsymbol{\varDelta} }}_S}}&0\\ 0&0 \end{array}} \right]\mathit{\boldsymbol{V}}_S^{\rm{T}} $

$ {\mathit{\boldsymbol{ \boldsymbol{\varPsi} }}_{{\rm{opt}}}} = {\mathit{\boldsymbol{V}}_W}\left[ {\begin{array}{*{20}{c}} {\mathit{\boldsymbol{ \boldsymbol{\varDelta} }}_W^{ - 1}{{\mathit{\boldsymbol{\tilde Z}}}_{11}}\varDelta _S^{ - 1}}&{{{\mathit{\boldsymbol{ \boldsymbol{\hat \varPsi} }}}_{12}}}\\ {{{\mathit{\boldsymbol{ \boldsymbol{\hat \varPsi} }}}_{21}}}&{{{\mathit{\boldsymbol{ \boldsymbol{\hat \varPsi} }}}_{22}}} \end{array}} \right]\mathit{\boldsymbol{U}}_S^{\rm{T}} $ (26)

式中,$\boldsymbol{\tilde Z}_{11}$维度为${\rm rank}(\boldsymbol{W})×{\rm rank}(\boldsymbol{S})$$ \boldsymbol{\tilde Z}= \boldsymbol{U}^{\rm T}_{W}\boldsymbol{ZV}_{S}= \left[ \begin{matrix} {\boldsymbol{\tilde Z}_{11}} & {\boldsymbol{\tilde Z}_{12}} \\ {\boldsymbol{\tilde Z}_{21}} & {\boldsymbol{\tilde Z}_{22}} \\ \end{matrix} \right] $的第1个块,并且$\boldsymbol{\hat \varPsi }_{12}$$\boldsymbol{\hat \varPsi }_{21}$$\boldsymbol{\hat \varPsi }_{22}$都是适当维度的任意矩阵。

综上所述,本文改进了Bai等人(2015)$\boldsymbol{\varPhi }$作为$\boldsymbol{\varPsi }$的函数的思想,提出基于先验的观测矩阵和稀疏字典的交替优化算法,具体步骤如下:

输入:BRLS稀疏字典$\boldsymbol{\varPsi }_{0}$,观测矩阵$\boldsymbol{\varPhi }_{0}$,交替优化算法的迭代次数$Iter$,更新字典的迭代次数$Iter_{1}$

输出:$\boldsymbol{\varPhi }=\boldsymbol{\varPhi }_{i}, \boldsymbol{\varPsi }=\boldsymbol{\varPsi }_{i}$

${\rm for} \;i=1 \;{\rm to} \;Iter$,重复步骤1)—3):

1) 使用式(20)获得$\boldsymbol{\varPhi }_{i}$

设置$j=1, \boldsymbol{\varPsi }^{(j-1)}_{i-1} = {\boldsymbol{\varPsi }_{i}}_{-1}$

2) 当$1≤j≤Iter_{1}$,执行

(1) 固定$\boldsymbol{\varPsi }^{(j-1)}_{i-1}$,通过使用OMP算法求$\boldsymbol{S}$的近似解

$ \begin{array}{*{20}{l}} {{\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} \mathit{\boldsymbol{S}}_{i - 1}^{(j)} = {\rm{arg}}\mathop {{\rm{min}}}\limits_\mathit{\boldsymbol{S}} \omega (\mathit{\boldsymbol{ \boldsymbol{\varPsi} }}_{i - 1}^{(j - 1)},\mathit{\boldsymbol{S}})}\\ {{\rm{ s}}{\rm{.}}{\kern 1pt} {\kern 1pt} {\kern 1pt} {\rm{t}}{\rm{. }}{\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {\kern 1pt} {{\left\| {\mathit{\boldsymbol{S}}(:,k)} \right\|}_0} \le K,\quad \forall k} \end{array} $ (27)

(2) 固定$\boldsymbol{S}^{(j)}_{i-1}$,更新$\boldsymbol{\varPsi }$

$ \mathit{\boldsymbol{ \boldsymbol{\varPsi} }}_{i - 1}^{(j)} = {\rm{arg}}\mathop {{\rm{min}}}\limits_\mathit{\boldsymbol{ \boldsymbol{\varPsi} }} \omega (\mathit{\boldsymbol{ \boldsymbol{\varPsi} }},\mathit{\boldsymbol{S}}_{i - 1}^{(j)}) $ (28)

(3)$j→j+1$,回到步骤2)。

3)$\boldsymbol{\varPsi }_{i}=\boldsymbol{\varPsi }^{(j)}_{i-1},i→i+1$

End for

4 CSSPP特征提取

4.1 稀疏表示

利用肠道图像优化设计后的$\boldsymbol{\varPhi }$$\boldsymbol{\varPsi }$,计算等效字典$\boldsymbol{D}=\boldsymbol{\varPhi }\boldsymbol{\varPsi }$,此时$\boldsymbol{y}$在字典$\boldsymbol{D}$上的稀疏系数

$ \begin{array}{*{20}{c}} {\mathit{\boldsymbol{s}} = {\rm{arg}}\mathop {{\rm{min}}}\limits_\mathit{\boldsymbol{s}} \left\| {\mathit{\boldsymbol{y}} - \mathit{\boldsymbol{Ds}}} \right\|_{\rm{F}}^2}\\ {{\rm{ s}}{\rm{.}}{\kern 1pt} {\kern 1pt} {\kern 1pt} {\rm{t}}{\rm{. }}\quad {{\left\| \mathit{\boldsymbol{s}} \right\|}_0} \le K} \end{array} $ (29)

可以通过OMP算法求解该非凸函数。

4.2 空间金字塔

UC图像相比于正常图像表现为血管模式丧失,颗粒状和出血的粘膜,甚至溃疡,在图像全局以及局部都存在差异。为更好地利用UC图像信息,采用多尺度金字塔模型,如图 3所示。该方法在层数$t=1, 2, 3$上分别将每幅UC图像在空间上划分为$2^{t-1}×2^{t-1}$个子区域,在这21个空间子区域中分别计算肠道图像的特征向量。

图 3 空间金字塔示意图
Fig. 3 Schematic diagram of the space pyramid ((a) image; (b) CS sparse representation; (c) space pyramid)

4.3 特征提取

观察$\boldsymbol{y}=\boldsymbol{Ds}, \boldsymbol{x}=\boldsymbol{\varPsi }\boldsymbol{s}$可以发现,稀疏向量每个元素的值反映了图像对字典原子的响应,由于稀疏度得到保证,非零元素个数小于等于$K$, 因此,可以在不同分辨率中合并稀疏矩阵的行,再连接合并特征并进行归一化,得到肠道图像的压缩感知空间金字塔池化(compressed sensing spatial pyramid pooling,CSSPP)特征向量$\boldsymbol{f}∈{\bf{R}}^{(L×21)×1}$Wang等人(2010)指出可以使用平均池化和最大池化两种合并方法获得合并特征。

平均池化

$ {\mathit{\boldsymbol{f}}_{{\rm{ averge }}}} = (1/H) \times ({\mathit{\boldsymbol{S}}_{j,1}} + {\mathit{\boldsymbol{S}}_{j,2}} + \cdots + {\mathit{\boldsymbol{S}}_{j,H}}) $

最大池化

$ {\mathit{\boldsymbol{f}}_{{\rm{max}}}} = {\rm{max}}({\mathit{\boldsymbol{S}}_{j,1}},{\mathit{\boldsymbol{S}}_{j,2}}, \cdots ,{\mathit{\boldsymbol{S}}_{j,H}}) $

式中,$\boldsymbol{S}_{j, k}$表示稀疏系数矩阵第$j$行第$k$列的元素,$H$是该子区域中稀疏向量的列数。本文选用的归一化方法是基于L2范数的,具体为

$ {\mathit{\boldsymbol{\hat f}}_i} = {\mathit{\boldsymbol{f}}_i}/{\left\| {{\mathit{\boldsymbol{f}}_i}} \right\|_2} $

式中,$\boldsymbol{f}_{i}$表示第$i$个子区域的合并特征。

5 实验

实验采用医师标注的Kvasir数据集,包含1 000幅UC图像和1 000幅正常的盲肠图像,图像大小为512 × 512像素,将每幅图像以8 × 8尺寸大小重新排列为64 × 4 096的矩阵,避免字典训练与传感矩阵设计尺寸过于庞大带来的内存负担。在BRLS字典训练阶段,选择正负各20幅样本组成训练集,得到稀疏字典$\boldsymbol{\varPsi }_{0}∈{\bf{R}}^{64×256}$,根据Bai等人(2015)发现,$L/M$值为4时,可以达到很好的重建效果。

5.1 评价指标

评价训练阶段系统性能优劣的重点是由该CS系统得到的稀疏系数能否精确表示图像,只有足够精确地恢复图像,才能说明该稀疏特征是有效的。本文使用均方误差指标表示图像恢复的精确度。

利用合适的特征合并方法得到最终的肠道图像分类特征后,随机挑选一幅肠道图像,对比不同条件下的恢复图像与原图的均方误差(mean square error, MSE),其定义为

$ {f_{{\rm{MSE}}}} = \frac{1}{n}\sum\limits_{i = 1}^n {{{({\mathit{\boldsymbol{x}}_i} - {{\mathit{\boldsymbol{\hat x}}}_i})}^2}} $

式中,$x_{i}$是原图像的像素点的值,$ \boldsymbol{\hat x}_{i}$是重建图像的像素点的值。MSE数值越小,说明图像恢复得越精确。

本文采用准确性$γ_{1}$、敏感性$γ_{2}$和特异性$γ_{3}$的评价指标比较不同算法的优劣,具体定义为

$ {{\gamma _1} = \frac{{TP}}{{TP + FN}} \times 100\% } $ (30)

$ {{\gamma _2} = \frac{{TN}}{{TN + FP}} \times 100\% } $ (31)

$ {{\gamma _3} = \frac{{TP + TN}}{{TP + FN + FP + TN}} \times 100\% } $ (32)

式中,$TP$$FN$$FP$$TN$分别表示真阳性、假阴性、假阳性和真阴性。

为了进一步测试本文实验的效果,还讨论了受试者工作特性曲线(receiver operating characteristic curve, ROC)下的面积(area under curve, AUC)。

5.2 统计学方法

采用SPSS17(statistical product and service solutions)作为数据统计与处理的方法,以$P < 0.05$作为具有统计学差异的判定标准,其中$P$值为结果可信程度的一个递减指标。

5.3 训练阶段

训练阶段的实验包括两个目标函数中控制误差系数$α$的选择、测量值$M$的选择、稀疏度$K$的设置。特征提取阶段的实验主要包括不同池化函数的对比。

首先,对$α$系数进行选择,分别令$α$ = [0.2, 0.5, 1, 2, 5],如图 4(a)所示,可以发现$α=1$时,MSE指标最低,表示值为1时图像的恢复精度最高。然后,使用新获得的$α$值,代入本文提出的基于先验的观测矩阵和稀疏字典的交替优化算法,使用不同$β$值字典训练,观察对图像表示的影响,在[0.1, 0.9]中取一位小数的值,如图 4(b)所示,MSE指标在两端波动较为明显,而中间相对平稳,所以选择$β$ = 0.5作为实验参数。

图 4 误差控制系数和采样率对CS系统的影响
Fig. 4 Influence of error control coefficient and sampling rate on CS system ((a) values of $α$; (b) values of $β$; (c) values of $M$)

测量值$M$在压缩感知框架中表示为采样率,将$N$维信号压缩为$M$维信号,并且$M≪N$,那么适当的$M$取值既可以提取到图像补丁块的全局信息特征又可以不损失图像的信息量。采样率选择在25%~50%之间进行实验,结果符合预期,如图 4(c)所示。因为当$M$小于一定值时,采样率太低,图像信息不能完全表示,当达到一定阈值后,就会趋于稳定,图像可以很好地表示,所以选择$M=20$作为本文实验的参数。

图 5是已训练好的${\bf{R}}^{64×256}$的BRLS字典与K-SVD训练字典的对比,使用UC图像在不同迭代次数下进行MSE性能比较,可以发现BRLS算法性能略优于K-SVD。

图 5 不同初始字典的性能对比
Fig. 5 Comparison of performance for different initial dictionaries

将稀疏度$K$和两种不同的池化方法一起考虑,选取正负各200幅肠道图像进行测试。因为稀疏度与维度的比值小于10%才能取得较好的稀疏效果,所以$K$在[1, 6]内取整数值,然后使用相同的稀疏系数矩阵比较两种池化方法,并采用十折交叉验证训练肠道图像数据集的分类精度,寻找较为理想的$K$值与池化方法。图 6是在不同$K$值情况下,最大池化和平均池化两种方法经SVM训练后的分类精度,可以看出$K=4$时训练精度比较理想,而且空间金字塔最大池化的精度明显高于平均池化。

图 6 $K$值与池化方法对精度的影响
Fig. 6 The effect of value of $K$ and pooling method on accuracy

5.4 实验结果与分析

将本文的特征与其他特征在全部数据上进行对比,图 7是不同方法的ROC曲线图,其中真阳性率(true positive rate,TPR)表示正确分类的正样本占正样本的比例,假阳性率(false positive rate,FPR)表示错误分类的负样本占负样本的比例,AUC值的范围为[0.5, 1],值越大表示系统的分类效果越好。

图 7 不同方法的AUC值对比
Fig. 7 Values of AUC of different methods

表 1展示了不同方法对UC图像分类的性能指标,包括敏感性、特异性、准确率和运行时间。由于BoF模型需要进行K-means聚类,在特征点数据较多时,训练时间较长。SCSPM将矢量量化推广到稀疏编码,并对图像提取多尺度描述符,所以比BOF获得了更好的效果。LLC使用HOG特征并在稀疏编码阶段强调局部约束,在敏感性和时间上取得了很好效果。本文方法基于先验知识训练稀疏字典和传感矩阵,使CS系统在保证Gram的基础上更好地表示图像信息,相比于其他方法,提供了更高的特异性和准确率。本文方法与对比算法的AUC值和准确率均存在统计学差异($P$ < 0.05)。

表 1 不同方法对UC图像分类的性能指标
Table 1 Comparison of different methods for UC classification

下载CSV
方法 敏感性/% 特异性/% 准确率/% 运行时间/s
BoF 86.98 83.23 85.11 5 478.7
SCSPM 93.33 93.60 93.47 4 769.8
LLC 96.41 93.98 95.19 2 761.2
本文 96.17 98.75 97.46 3 314.2
注:加粗字体表示各列最优结果。

6 结论

首次提出了CSSPP图像特征用于UC图像辅助诊断。首先基于肠道图像的先验知识求解观测矩阵,然后交替优化传感矩阵与稀疏字典,在此CS框架下,得到更精确的肠道图像的稀疏特征。最后结合SPP的特性,在不同分辨率中提取了数据集的局部和全局信息,更好地使用特征区分UC图像与正常肠道图像。本文在实际的UC数据集上验证,性能表现良好,其中,特异性比LLC算法提升了4.7%,准确率比LLC算法提升了2.27%。

医疗图像辅助诊断不仅要求算法的准确率,还需要考虑算法的实时性。尤其是对于入侵式的胃肠镜检查,需要计算机提供实时诊断信息,在减少病人痛苦的同时获得足够精确的结果。接下来将针对时效性做进一步研究。

参考文献

  • Aharon M, Elad M, Bruckstein A. 2006. K-SVD:an algorithm for designing overcomplete dictionaries for sparse representation. IEEE Transactions on Signal Processing, 54(11): 4311-4322 [DOI:10.1109/TSP.2006.881199]
  • Bai H, Li G, Li S, Li Q W, Jiang Q R, Chang L P. 2015. Alternating optimization of sensing matrix and sparsifying dictionary for compressed sensing. IEEE Transactions on Signal Processing, 63(6): 1581-1594 [DOI:10.1109/TSP.2015.2399864]
  • Candes E J, Wakin M B. 2008. An introduction to compressive sampling. IEEE Signal Processing Magazine, 25(2): 21-30 [DOI:10.1109/MSP.2007.914731]
  • Chi J N, Yu X S, Zhang Y F. 2018. Thyroid nodule malignantrisk detection in ultrasound image by fusing deep and texture features. Journal of Image and Graphics, 23(10): 1582-1593 (迟剑宁, 于晓升, 张艺菲. 2018. 融合深度网络和浅层纹理特征的甲状腺结节癌变超声图像诊断. 中国图象图形学报, 23(10): 1582-1593) [DOI:10.11834/jig.180232]
  • Deeba F, Islam M, Bui F M, Wahid K A. 2018. Performance assessment of a bleeding detection algorithm for endoscopic video based on classifier fusion method and exhaustive feature selection. Biomedical Signal Processing and Control, 40: 415-424 [DOI:10.1016/j.bspc.2017.10.011]
  • D'Haens G, Sandborn W J, Feagan B G, Geboes K, Hanauer S B, Irvine E J, Lémann M, Marteau P, Rutgeerts P, Schölmerich J, Sutherland L R. 2007. A review of activity indices and efficacy end points for clinical trials of medical therapy in adults with ulcerative colitis. Gastroenterology, 132(2): 763-786 [DOI:10.1053/j.gastro.2006.12.038]
  • Donoho D L. 2006. Compressed sensing. IEEE Transactions on Information Theory, 52(4): 1289-1306 [DOI:10.1109/TIT.2006.871582]
  • Duarte M F, Eldar Y C. 2011. Structured compressed sensing:from theory to applications. IEEE Transactions on Signal Processing, 59(9): 4053-4085 [DOI:10.1109/TSP.2011.2161982]
  • Elad M. 2007. Optimized projections for compressed sensing. IEEE Transactions on Signal Processing, 55(12): 5695-5702 [DOI:10.1109/TSP.2007.900760]
  • Engan K, Aase S O and Husoy J H. 2002. Method of optimal directions for frame design//Proceedings of 1999 IEEE International Conference on Acoustics, Speech, and Signal Processing. Phoenix, AZ: IEEE: 2443-2446[DOI: 10.1109/ICASSP.1999.760624]
  • Fickus M, Mixon D G, Tremain J C. 2012. Steiner equiangular tight frames. Linear Algebra and its Applications, 436(5): 1014-1027 [DOI:10.1016/j.laa.2011.06.027]
  • Guo K R, Chai R F, Candra H, Guo Y, Song R, Nguyen H, Su S. 2019. A hybrid fuzzy cognitive map/support vector machine approach for EEG-based emotion classification using compressed sensing. International Journal of Fuzzy Systems, 21(1): 264-273 [DOI:10.1007/s40815-018-0567-3]
  • Jiang Q R, Li S, Lu Z R and Sun B B. 2016. Block recursive least squares dictionary learning algorithm//Proceedings of 2016 Chinese Control and Decision Conference. Yinchuan, China: IEEE: 1961-1964
  • Lazebnik S, Schmid C and Ponce J. 2006. Beyond bags of features: spatial pyramid matching for recognizing natural scene categories//Proceedings of 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. New York, USA: IEEE: 2169-2178[DOI: 10.1109/CVPR.2006.68]
  • Li B P, Qi L, Meng M Q H and Fan Y C. 2009. Using ensemble classifier for small bowel ulcer detection in wireless capsule endoscopy images//Proceedings of 2009 IEEE International Conference on Robotics and Biomimetics. Guilin, China: IEEE: 2326-2331[DOI: 10.1109/ROBIO.2009.5420455]
  • Li B P, Xu G Q, Zhou R, Wang T F. 2015a. Computer aided wireless capsule endoscopy video segmentation. Medical Physics, 42(2): 645-652 [DOI:10.1118/1.4905164]
  • Li F F and Perona P. 2005. A Bayesian hierarchical model for learning natural scene categories//Proceedings of 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. San Diego, CA, USA: IEEE: 524-531[DOI: 10.1109/CVPR.2005.16]
  • Li G, Li X, Li S, Bai H, Jiang Q R, He X X. 2015b. Designing robust sensing matrix for image compression. IEEE Transactions on Image Processing, 24(12): 5389-5400 [DOI:10.1109/TIP.2015.2479474]
  • Liu J X, Han G, Sun N, Li X F, Gong Z G, Sun Q S. 2019. Generalized compressed sensing with QR-based vision matrix learning for face recognition under natural scenes. Signal Processing:Image Communication, 77: 11-19 [DOI:10.1016/j.image.2019.05.009]
  • Maeda Y, Kudo S E, Mori Y, Misawa M, Ogata N, Sasanuma S, Wakamura K, Oda M, Mori K, Ohtsuka K. 2019. Fully automated diagnostic system with artificial intelligence using endocytoscopy to identify the presence of histologic inflammation associated with ulcerative colitis (with video). Gastrointestinal Endoscopy, 89(2): 408-415 [DOI:10.1016/j.gie.2018.09.024]
  • Matsuoka K, Kobayashi T, Ueno F, Matsui T, Hirai F, Inoue N, Kato J, Kobayashi K, Kobayashi K, Koganei K, Kunisaki R, Motoya S, Nagahori M, Nakase H, Omata F, Saruta M, Watanabe T, Tanaka T, Kanai T, Noguchi Y, Takahashi K I, Watanabe K, Hibi T, Suzuki Y, Watanabe M, Sugano K, Shimosegawa T. 2018. Evidence-based clinical practice guidelines for inflammatory bowel disease. Journal of Gastroenterology, 53(3): 305-353 [DOI:10.1007/s00535-018-1439-1]
  • Ozawa T, Ishihara S, Fujishiro M, Saito H, Kumagai Y, Shichijo S, Aoyama K, Tada T. 2019. Novel computer-assisted diagnosis system for endoscopic disease activity in patients with ulcerative colitis. Gastrointestinal Endoscopy, 89(2): 416-421 [DOI:10.1016/j.gie.2018.10.020]
  • Travis P L S, Schnell D, Krzeski P, Abreu M T, Altman D G, Colombel J F, Feagan B G, Hanauer S B, Lichtenstein G R, Marteau P R, Reinisch W, Sands B E, Yacyshyn B R, Schnell P, Bernhardt C A, Mary J, Sandborn W J. 2013. Reliability and initial validation of the ulcerative colitis endoscopic index of severity. Gastroenterology, 145(5): 987-995 [DOI:10.1053/j.gastro.2013.07.024]
  • Tropp J A. 2004. Greed is good:algorithmic results for sparse approximation. IEEE Transactions on Information Theory, 50(10): 2231-2242 [DOI:10.1109/TIT.2004.834793]
  • Wang J J, Yang J C, Yu K, Lv F J, Huang T and Gong Y H. 2010. Locality-constrained linear coding for image classification//Proceedings of 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition. San Francisco, CA, USA: IEEE: 13-18[DOI: 10.1109/CVPR.2010.5540018]
  • Wang J M, Ye S P, Xu Z Y, Chen C X, Jiang Y J. 2017. Smooth l0-norm minimization algorithm for compressed sensing with semi-tensor product. Journal of Image and Graphics, 22(1): 9-19 (王金铭, 叶时平, 徐振宇, 陈超祥, 蒋燕君. 2017. 半张量积压缩感知模型的l0-范数解. 中国图象图形学报, 22(1): 9-19) [DOI:10.11834/jig.20170102]
  • Xie D, Peng H P, Li L X, Yang Y X. 2016. Semi-tensor compressed sensing. Digital Signal Processing, 58: 85-92 [DOI:10.1016/j.dsp.2016.07.003]
  • Yang J C, Yu K, Gong Y H and Huang T, 2009. Linear spatial pyramid matching using sparse coding for image classification//Proceedings of 2009 IEEE Conference on Computer Vision and Pattern Recognition. Miami, FL, USA: IEEE: 20-25[DOI: 10.1109/CVPR.2009.5206757]
  • Yang Y J, Bang C S. 2019. Application of artificial intelligence in gastroenterology. World Journal of Gastroenterology, 25(14): 1666-1683 [DOI:10.3748/wjg.v25.i14.1666]
  • Yuan Y X, Li B P, Meng M Q H. 2016. Improved bag of feature for automatic polyp detection in wireless capsule endoscopy images. IEEE Transactions on Automation Science and Engineering, 13(2): 529-535 [DOI:10.1109/TASE.2015.2395429]
  • Yuan Y X, Li B P, Meng M Q H. 2017. WCE abnormality detection based on saliency and adaptive locality-constrained linear coding. IEEE Transactions on Automation Science and Engineering, 14(1): 149-159 [DOI:10.1109/TASE.2016.2610579]
  • Zhang S K, Sun F R, Wang N S, Zhang C C, Yu Q L, Zhang M Q, Babyn P, Zhong H. 2019. Computer-aided diagnosis (CAD) of pulmonary nodule of thoracic CT image using transfer learning. Journal of Digital Imaging, 32(6): 995-1007 [DOI:10.1007/s10278-019-00204-4]