Wang Xinnian, Shu Yingying. K-steps stabilization based on automatic clustering of shoeprint images[J]. Journal of Image and Graphics, 2016, 21(5): 574-587. DOI: 10.11834/jig.20160505.
Shoeprints serve as vital evidence in forensic investigations, and determining how the massive number of shoeprint images accumulated over the long term can be clustered automatically has become one of the urgent tasks of criminal investigation technology. Unlike in other image sets, the number of shoeprint categories is not only large but also unknown. Shoeprint images are distributed inhomogeneously and sparsely in the feature space, and each class contains few images. For these reasons, most existing clustering algorithms cannot cluster shoeprints satisfactorily. In this study,
an automatic clustering method is proposed that divides shoeprint sets effectively, based on an analysis of the distribution of shoeprint images in the feature space. Through statistics on labeled shoeprint-image databases, we found that shoeprint sets of different patterns do not intersect, and a blank region, where no shoeprints exist, is present between every two classes. These blank regions are called margins in this paper. The core objective of the proposed algorithm is to determine the margins between classes and use them to divide a shoeprint set. The process involves the following steps: 1) dividing the shoeprint set with monotonically increasing or decreasing thresholds, which determine whether two shoeprint images are assigned to the same cluster; 2) searching for a cluster that remains unchanged over K consecutive partitions; 3) outputting the stable cluster and removing its shoeprints from the dataset; 4) choosing the next threshold and dividing the remaining dataset; 5) returning to step 2) until the remaining set is empty. Experimental results on two kinds of publicly available databases and one real shoeprint database, which comprises 5792 images,
have shown that the proposed algorithm outperforms state-of-the-art clustering algorithms on common clustering evaluation measures. The precision and F-measure of the proposed algorithm on the real shoeprint database are approximately 99.68% and 95.99%, respectively. In this study, based on the distribution of shoeprint images in the feature space, an automatic clustering algorithm is proposed that searches for margins between clusters to divide a dataset. The proposed algorithm achieves performance comparable to or better than that of its competitors when clustering a shoeprint dataset. Experiments have also shown that the performance of the algorithm is relatively insensitive to its parameter and to the shape of the clusters. The algorithm can also be applied to clustering other image datasets with characteristics similar to those of shoeprint images.
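
The five steps listed in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. Several details are assumptions made only for illustration: the names `partition`, `k_step_stable_clustering`, `dist`, `thresholds`, and `k` are hypothetical; each partition is taken to be the connected components of a graph that links two items whenever their feature-space distance is at most the current threshold; and the threshold schedule is supplied by the caller and is assumed to be non-empty and increasing.

```python
from itertools import combinations


def partition(items, dist, threshold):
    """Assumed partition rule: connected components of the graph that
    links two items whenever dist(a, b) <= threshold."""
    parent = {i: i for i in items}

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in combinations(items, 2):
        if dist(a, b) <= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for i in items:
        groups.setdefault(find(i), set()).add(i)
    return {frozenset(g) for g in groups.values()}


def k_step_stable_clustering(items, dist, thresholds, k):
    """Emit every cluster that stays unchanged over k consecutive partitions."""
    remaining = set(items)
    streak = {}  # cluster -> number of consecutive partitions it has survived
    output = []
    for t in thresholds:
        if not remaining:
            break
        # Steps 1 and 4: partition what is left with the current threshold.
        clusters = partition(list(remaining), dist, t)
        # Step 2: a cluster keeps its streak only if it reappears unchanged.
        streak = {c: streak.get(c, 0) + 1 for c in clusters}
        # Step 3: output stable clusters and remove their members.
        for c in [c for c, n in streak.items() if n >= k]:
            output.append(set(c))
            remaining -= c
            del streak[c]
    # Thresholds exhausted: emit the clusters of the last partition as they are.
    output.extend(set(c) for c in streak)
    return output


# Toy 1-D "features": two groups separated by a wide margin.
points = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
clusters = k_step_stable_clustering(
    range(len(points)),
    lambda a, b: abs(points[a] - points[b]),
    thresholds=[0.15, 0.65, 1.15, 1.65],
    k=3,
)
print(sorted(sorted(c) for c in clusters))  # [[0, 1, 2], [3, 4, 5]]
```

In this sketch a cluster is emitted once it has survived `k` consecutive thresholds, which corresponds to an empty margin around it that spans those thresholds. If the schedule starts below the typical within-class distance, isolated items would be emitted as singleton clusters, so the starting threshold and step size matter here; how the paper selects them is not stated in the abstract.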