Review of Deep Learning Methods for Karyotype Analysis
Luo Chunlong, Zhao Yi (Institute of Computing Technology, Chinese Academy of Sciences)
Chromosomal abnormalities can lead to serious diseases such as chronic myeloid leukemia and Down syndrome. Karyotyping involves counting the chromosomes in a metaphase image, segmenting them from the background, arranging them according to standard rules, and finally inspecting them to issue a diagnosis. Karyotype analysis is therefore widely used in modern clinical practice and scientific research. However, even an experienced cytogeneticist needs a lot of time to perform karyotyping. Although machine learning and traditional geometric methods have been used to automate karyotype analysis, most of them perform poorly and do not satisfy clinical requirements, so cytogeneticists still spend a lot of time on manual intervention. Many deep learning-based methods have recently been proposed, but systematic reviews are lacking. This paper reviews the recent literature and groups it into six tasks: chromosome counting, chromosome segmentation, chromosome cluster classification, chromosome preprocessing, chromosome classification, and chromosome anomaly. First, it summarizes chromosome counting methods based on bounding box detection. The key is to locate and identify each chromosome in the metaphase image accurately. Specifically, these methods generate candidate object proposals, then classify them and refine their locations. However, they must solve the self-similarity, over-deletion, and inaccurate localization problems caused by overlapping chromosomes. Researchers have also worked on accelerating model inference through lightweight backbones. For the chromosome segmentation task, methods fall into two categories: semantic segmentation and instance segmentation. Semantic segmentation methods can only segment chromosome clusters formed by two or more overlapping chromosomes, and they require post-processing to splice the chromosomes back together. 
Instance segmentation methods can automate chromosome segmentation, and additional supervision such as key points or orientation information can further improve performance. Because some chromosome segmentation methods can handle only a specific type of chromosome cluster, the cluster type must be identified first. Existing methods classify chromosome clusters according to one of two criteria: the number of overlapping chromosomes, or the relationship between touching and overlapping chromosomes. From a methodological perspective, however, current work mostly relies on simple convolutional neural networks, so the chromosome cluster classification task needs more innovative studies. For the chromosome preprocessing task, existing methods mainly address two subtasks: metaphase image denoising and chromosome straightening. Metaphase image denoising is solved as a segmentation problem, in which the chromosomes are treated as a single region to be segmented from the background and the impurities in the image. Existing chromosome straightening methods rely on generative adversarial networks to straighten curved chromosomes and generally follow an image translation or motion transformation framework. Next, benefiting from the rapid development of deep learning-based image classification networks, the chromosome classification task has received the most attention and development among karyotype analysis tasks. 
According to their properties, the available approaches can be divided into: 1) simple CNN-based methods, which redesign the network for chromosome instances instead of directly using well-known CNN models proposed for the ImageNet dataset; 2) feature contrast-based methods, which extract representative features in a contrastive manner and then classify them with a simple classifier; 3) image preprocessing-based methods, which, before classification, apply super-resolution methods to unify the size of chromosome images or enhance banding pattern features using different filters; 4) global and local feature fusion-based methods, which explicitly crop and extract features from local but important image parts and then fuse them for the final classification; 5) complex strategy-based methods, some of which solve the chromosome classification task by detecting chromosomes in metaphase images, while others improve performance with an ensemble learning framework. The final reviewed task is chromosome anomaly, which includes a detection subtask and a generation subtask. Although the task is of great concern to clinical experts, existing studies can only detect a specific type of chromosome anomaly with a basic CNN or roughly discriminate between normal and abnormal chromosomes with a generative adversarial network framework. For the generation subtask, the available approaches are also based on generative adversarial networks. At the end of the paper, the tasks and main methodologies are summarized and discussed, and feasible future directions are proposed. First, multiple advanced solution paradigms, such as multimodality and image question answering, can be introduced to solve these tasks. Second, chromosomal abnormality diagnosis has not yet been addressed because it involves the extraction of band-level features and relational reasoning. 
Finally, pretraining models in a self-supervised manner deserves researchers' attention. Although high-quality labeled chromosome data are scarce, the large amount of unlabeled clinical data can, through the self-supervised learning paradigm, reduce the cost of data labeling and improve the performance of downstream tasks. In summary, a review of deep learning-based automatic karyotyping methods is needed and can draw more researchers' attention to this field.
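
As an illustration of the over-deletion problem noted above for bounding-box-based chromosome counting, the following minimal sketch may help. The abstract does not name the mechanism, so the sketch assumes that over-deletion arises from greedy non-maximum suppression (NMS), the usual post-processing step in box detectors. The box layout, the toy coordinates, the scores, and the function names `iou` and `greedy_nms` are all hypothetical and are not taken from any reviewed method.

```python
from typing import List, Tuple

# Hypothetical box layout: (x1, y1, x2, y2) in pixels.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes: List[Box], scores: List[float], iou_thresh: float) -> List[int]:
    """Return indices of the boxes kept by greedy NMS, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[k]) <= iou_thresh for k in keep):
            keep.append(i)
    return keep


# Made-up toy data: two crossing chromosomes whose boxes nearly coincide,
# plus one isolated chromosome. The true count is 3.
boxes = [(10, 10, 50, 60), (12, 8, 52, 58), (100, 100, 120, 160)]
scores = [0.95, 0.90, 0.92]

count_strict = len(greedy_nms(boxes, scores, iou_thresh=0.5))
count_loose = len(greedy_nms(boxes, scores, iou_thresh=0.9))
print(count_strict, count_loose)  # prints "2 3"
```

Under these assumptions, the commonly used threshold of 0.5 removes one of the two crossing chromosomes and the count drops to 2, while a looser threshold recovers all 3 at the risk of keeping duplicate detections elsewhere. This trade-off is one plausible reading of why overlapping chromosomes make counting by detection difficult.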