Review of deep learning methods for karyotype analysis
Luo Chunlong1,2, Zhao Yi1 (1. Research Center for Ubiquitous Computing Systems, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, China; 2. University of Chinese Academy of Sciences, Beijing 100049, China)
Chromosomal abnormalities can lead to serious diseases, such as chronic myeloid leukemia and Down syndrome. Karyotyping involves counting the chromosomes in a metaphase image, segmenting them from the background, arranging them according to defined rules, and examining the result to issue a diagnosis. Karyotype analysis is therefore widely used in modern clinical practice and scientific research. However, even an experienced cytogeneticist needs considerable time to complete a karyotype. Although machine learning and traditional geometric methods have been used to automate karyotype analysis, most perform poorly and do not satisfy clinical requirements, so cytogeneticists still spend much time on manual intervention. Many deep-learning-based methods have since been proposed, but systematic reviews are lacking. This paper reviews the recent literature and organizes it into six tasks: chromosome counting, chromosome segmentation, chromosome cluster classification, chromosome preprocessing, chromosome classification, and chromosome anomaly analysis. First, chromosome counting methods are summarized; they rely on bounding box detection to identify each chromosome in a metaphase image accurately. Specifically, these methods must find candidate object proposals, classify them into different classes, and refine their locations. In doing so, they must solve the self-similarity, over-deletion, and inaccurate localization problems that result from overlapping chromosomes. Researchers have also tried to speed up model inference with lightweight backbones. Methods for the chromosome segmentation task can be divided into semantic and instance segmentation methods. On the one hand, semantic segmentation methods can only segment chromosome clusters formed by two or more overlapping chromosomes, and additional postprocessing is needed to splice the chromosomes together.
On the other hand, instance segmentation methods can automate chromosome segmentation, and additional supervision, such as key points or orientation information, can further improve their performance. Because some chromosome segmentation methods can handle only a specific type of chromosome cluster, the cluster type must be identified first. Existing methods roughly classify chromosome clusters by one of two criteria: the number of overlapping chromosomes, or the relationship between touching and overlapping chromosomes. Methodologically, however, previous studies mostly rely on simple convolutional neural networks (CNNs), so more innovative work on chromosome cluster classification is needed. For the chromosome preprocessing task, existing methods mainly address two subtasks: metaphase image denoising and chromosome straightening. Metaphase image denoising is treated as a segmentation problem in which the chromosomes are regarded as a single region to be separated from the background and from impurities in the image. Existing chromosome straightening methods rely on generative adversarial networks to straighten curved chromosomes and generally follow an image translation or motion transformation framework. Benefiting from the rapid development of deep-learning-based image classification networks, chromosome classification has also received considerable attention among karyotype-analysis tasks.
According to their properties, the available methods can be divided into five groups: 1) simple CNN-based methods, which redesign the network for chromosome instances instead of directly using well-known CNN models proposed for the ImageNet dataset; 2) feature-contrastive methods, which extract representative features in a contrastive manner and then classify them with a simple classifier; 3) image-preprocessing-based methods, in which super-resolution methods are applied before classification to unify the size of chromosome images or enhance the banding pattern features using different filters; 4) global- and local-feature-fusion-based methods, which explicitly crop the local but important image parts, extract their features, and fuse them for the final classification; and 5) complex-strategy-based methods, which solve the chromosome classification task by detecting chromosomes in metaphase images and improve performance using an ensemble learning framework. The final task reviewed is chromosome anomaly analysis, which includes detection and generation subtasks. Although anomalies are a major concern for clinical experts, previous studies can only detect a specific type of chromosome anomaly with a basic CNN or roughly discriminate between normal and abnormal chromosomes within a generative adversarial network framework. The available approaches for the generation subtask are likewise based on generative adversarial networks. At the end of the paper, the tasks and main methodologies are summarized, and feasible future directions are proposed. First, multiple advanced solution paradigms, such as multi-modality and image question answering, should be introduced to address these tasks. Second, chromosomal abnormality diagnosis remains unaddressed because it involves band-level feature extraction and relational reasoning. Third, pretraining models through self-supervised learning is worth further research.
Although high-quality labeled chromosome data are not readily available, the large amount of unlabeled clinical data can still be exploited through the self-supervised learning paradigm to reduce labeling costs and improve performance on downstream tasks. In sum, deep-learning-based automatic karyotyping methods deserve further review to attract additional research interest.
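
The over-deletion problem mentioned for bounding-box-based chromosome counting can be illustrated with a small sketch. It assumes that over-deletion refers to standard greedy non-maximum suppression (NMS) discarding a valid box because it overlaps heavily with a higher-scoring box of a neighboring chromosome; the abstract does not state this explicitly. The boxes, scores, and thresholds below are hypothetical values chosen for illustration, not data from any reviewed method.

```python
from typing import List, Tuple

# A box is (x1, y1, x2, y2) in pixel coordinates (hypothetical convention).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float], iou_threshold: float) -> List[int]:
    """Greedy NMS: keep a box only if its IoU with every kept box is <= threshold."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


# Two crossing chromosomes whose boxes overlap heavily, plus one isolated chromosome.
boxes = [(10, 10, 60, 60), (15, 12, 62, 58), (100, 100, 130, 160)]
scores = [0.95, 0.90, 0.85]

# Assumed thresholds: a common default (0.5) and a looser one (0.85).
count_default = len(nms(boxes, scores, iou_threshold=0.5))
count_loose = len(nms(boxes, scores, iou_threshold=0.85))
print(count_default, count_loose)
```

With the 0.5 threshold, the second chromosome is suppressed and the count falls short by one, whereas the looser threshold keeps all three boxes. Simply raising the threshold also lets duplicate detections through, which is one reason the counting task calls for dedicated handling of overlapping chromosomes.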