Mitosis detection by appearance and motion pattern perception

Lin Fanchao; Xie Hongtao; Liu Chuanbin; Zhang Yongdong

Published: 2023-09-20
DOI: 10.11834/jig.220901
2023 | Volume 28 | Number 9

Intelligent Object Detection in Complex Scene Images


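Lin Fanchao, Xie Hongtao, Liu Chuanbin, Zhang Yongdong (School of Information Science and Technology, University of Science and Technology of China, Hefei 230026)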

Abstract (translated from the Chinese)

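Objective: In medical research and clinical practice such as cancer screening and drug development, detecting mitotic cells in microscopy images provides important biological evidence. However, image distributions differ markedly across culture conditions, and the steadily increasing cell density makes the scene complex, so conventional preprocessing methods struggle to filter regions effectively. In addition, cells at different stages look similar and their motion is blurred, while existing methods lack explicit supervision of the region feature encoding, so insufficient semantic discrimination easily leads to wrong predictions. This paper therefore proposes a detection framework based on appearance and motion pattern perception, which achieves accurate prediction in complex scenes through two-stage preprocessing and discriminative learning of cell state patterns.

Method: The method uses a three-stage detection framework. In the preprocessing stage, a region segmentation network is combined with a prior-based refinement algorithm to prune the candidate regions thoroughly. In the pretraining stage, two auxiliary tasks, one based on image classification and one on image reconstruction, directly supervise the appearance and motion encoding of the candidate regions, giving the encoding networks semantic awareness of different cell states. In the full-model training and prediction stage, the candidate region sequences obtained from preprocessing are taken as input, the pretrained encoding networks extract the candidate region features, and a temporal network fuses the sequence context to produce the cell detection result.

Result: Experiments on the C2C12-16 dataset show that the method reaches the following average performance. On the validation set, precision is 85.3%, recall is 89.3%, and the F-score is 87.2%. On the test set, precision is 86.4%, recall is 86.1%, and the F-score is 86.2%, with a temporal detection error of 0.221 ± 0.536 frames and a spatial detection error of 3.321 ± 2.461 pixels. The method exceeds existing methods in both detection accuracy and stability.

Conclusion: This paper proposes a new framework for mitotic cell detection in complex scenes. The preprocessing strategy effectively prunes the candidate regions and significantly improves detection efficiency. Pretraining the encoding networks on auxiliary tasks substantially improves the model's ability to learn the appearance and motion features of the candidate regions. As a result, the framework overcomes the interference of complex scenes and cell patterns in microscopy images and detects mitotic cells accurately and stably in both space and time.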
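As a quick arithmetic check on the reported figures, the F-score is conventionally the harmonic mean of precision and recall. The sketch below assumes that definition (the abstract does not spell it out), and the function name `f_score` is illustrative. Because the reported values are averages over several groups, the F-score computed from the averaged precision and recall need not match the reported mean F-score to the last digit.

```python
def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the usual F1 definition)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both are zero
    return 2 * precision * recall / (precision + recall)


# Averaged precision and recall reported in the abstract.
validation_f = f_score(0.853, 0.893)
test_f = f_score(0.864, 0.861)

print(f"validation F-score from averaged P/R: {validation_f:.4f}")
print(f"test F-score from averaged P/R:       {test_f:.4f}")
```

Both values land within about 0.1 percentage points of the reported 87.2% and 86.2%, which is consistent with the F-scores being averaged per group rather than computed from the averaged precision and recall.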

Keywords

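phase-contrast microscopy images; mitotic cell detection; multistage detection; spatiotemporal feature encoding; auxiliary training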

Mitosis detection by appearance and motion pattern perception

Lin Fanchao, Xie Hongtao, Liu Chuanbin, Zhang Yongdong (Department of Information Science and Technology, University of Science and Technology of China, Hefei 230026, China)

Abstract

Objective: In medical research and diagnosis, such as cancer screening and drug development, mitosis detection in phase-contrast microscopy images provides an important biological criterion. Manual counting of mitotic cells is time-consuming and labor-intensive, so automatic mitosis detection is more efficient and economical. Two difficulties stand in the way. First, the distributions of mitosis images differ significantly across culture conditions, and increasing cell density makes it difficult for conventional preprocessing methods to screen out the cell regions. Second, cells at different stages have similar appearances and blurred motion, which requires the model to discriminate strongly between cell types and states. Recent deep-learning-based works use three-dimensional convolutions or temporal networks to obtain context information from image sequences. However, they lack explicit supervision for learning cell states, which makes effective pattern information about the target regions difficult to obtain. As a result, these methods cannot fully distinguish different cells and background areas in their feature encoding, and their performance and generalization ability are limited. This study therefore proposes a detection framework based on cell appearance and motion pattern perception, which achieves accurate prediction in complex scenes through effective preprocessing and discriminative learning of cell patterns.

Method: The proposed method consists of three stages. The first stage is preprocessing: it extracts regions of interest as candidates, which narrows the search to the notable areas and facilitates the later detection. The original microscopy image is divided into local slices, and an instance segmentation network is trained to roughly segment all candidate regions that may contain mitosis. A candidate region refinement algorithm, based on a concise spatiotemporal hypothesis, then refines the candidates and removes redundant results. In the second stage, two encoding networks are pretrained on proxy tasks to learn feature encodings of appearance and motion information. The appearance encoding network is trained on an image classification task, in which it learns to predict cell categories from the spatial context of a single patch. The motion encoding network is trained on an image reconstruction task, in which it takes patches from adjacent frames and learns interframe changes by recovering the raw patches. These two tasks complement each other and model the cell states from different aspects. In the third stage, the whole spatiotemporal model is trained end-to-end to classify candidate patch sequences. The spatial modules are initialized with the parameters of the encoding networks pretrained in the second stage, so they are aware of cell patterns from the beginning of training. Given the appropriate spatial context, the temporal modules are optimized to combine interframe information and make the final prediction. The overall model outputs a confidence score for each patch, and the position with the highest score is regarded as a mitosis point.

Result: We conduct experiments on the public C2C12-16 benchmark. The results demonstrate the superior detection ability of the proposed method. On the C2C12-16 validation set, the mean precision reaches 85.3%, the mean recall reaches 89.3%, and the mean F-score is 87.2%. On the C2C12-16 test set, the mean precision reaches 86.4%, the mean recall reaches 86.1%, and the F-score is 86.2%. The proposed method also generates stable predictions under various conditions: its mean temporal bias over all groups is only 0.221 ± 0.536 frames and its mean spatial bias is 3.321 ± 2.461 pixels, both much lower than those of the counterpart method.

Conclusion: This study proposes a new framework to tackle hard cases in complex scenes in mitosis detection. The preprocessing strategy effectively extracts candidate regions and substantially improves detection efficiency. Pretraining the feature encoding networks on proxy tasks enhances the model's ability to learn the appearance and motion patterns of the candidate regions. With these preprocessing and pretraining designs, the framework can distinguish the visual patterns of mitotic cells, ordinary cells, and background noise, overcome the interference of complex scenes and cell patterns in microscopy images, and achieve accurate and stable mitosis detection in both the spatial and temporal dimensions.
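The final decision rule and the two bias measures can be illustrated with a minimal sketch. It is not the authors' implementation: the `Patch` record, the function names, the sample scores, and the ground-truth position are all hypothetical, and the sketch assumes that temporal bias is the absolute frame difference and spatial bias is the Euclidean distance in pixels, which the abstract does not state explicitly.

```python
import math
from typing import List, NamedTuple, Tuple


class Patch(NamedTuple):
    """One candidate patch in a sequence (hypothetical record layout)."""
    frame: int    # frame index of the patch
    x: float      # patch centre, in pixels
    y: float
    score: float  # model confidence that the patch shows mitosis


def pick_mitosis_point(sequence: List[Patch]) -> Patch:
    """Return the patch with the highest confidence score."""
    return max(sequence, key=lambda p: p.score)


def detection_errors(pred: Patch, gt_frame: int,
                     gt_x: float, gt_y: float) -> Tuple[int, float]:
    """Temporal error in frames and spatial error in pixels (assumed definitions)."""
    temporal = abs(pred.frame - gt_frame)
    spatial = math.hypot(pred.x - gt_x, pred.y - gt_y)
    return temporal, spatial


# Made-up confidence scores for one candidate patch sequence.
sequence = [
    Patch(frame=10, x=52.0, y=40.0, score=0.12),
    Patch(frame=11, x=53.0, y=41.0, score=0.35),
    Patch(frame=12, x=54.0, y=41.0, score=0.91),
    Patch(frame=13, x=55.0, y=43.0, score=0.40),
]

pred = pick_mitosis_point(sequence)
temporal_err, spatial_err = detection_errors(pred, gt_frame=12, gt_x=51.0, gt_y=45.0)
print(pred.frame, temporal_err, spatial_err)  # prints: 12 0 5.0
```

Averaging such per-detection errors over all detections, and taking their standard deviation, would give figures of the same form as the reported 0.221 ± 0.536 frames and 3.321 ± 2.461 pixels.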

Keywords

phase-contrast microscopy image; mitosis detection; multistage detection; spatiotemporal feature encoding; proxy training