Latest Issue

    Vol. 21, Issue 9 (2016)
    • Improved guided image filtering integrated with gradient information

      Xie Wei, Zhou Yuqin, You Min
      Vol. 21, Issue 9, Pages: 1119-1126(2016) DOI: 10.11834/jig.20160901
      Abstract: To eliminate the halo artifacts produced by guided image filtering, this paper proposes an improved guided image filter that integrates gradient information. The method uses the gradient of the guidance image to locate edges and designs a weight model based on an exponential function to control the degree of smoothing in different image regions. The resulting filter adaptively finds and emphasizes edges, thereby avoiding the halos introduced by excessive smoothing near edges. Experiments show that the method avoids halo artifacts during edge-preserving smoothing and improves SSIM and PSNR by approximately 30% and 15%, respectively, compared with the guided image filter. The method is more robust and performs better in many computer vision and computer graphics applications, such as image smoothing, detail enhancement, and multi-exposure fusion.
      Keywords: edge-preserving smoothing; guided image filter; gradient; halo; parameter self-adaption
      Updated: 2024-05-07
    • Fast prediction algorithm of index maps for screen image coding

      Chen Guisheng, Song Chuanming, Wang Xianghai, Liu Dan
      Vol. 21, Issue 9, Pages: 1127-1137(2016) DOI: 10.11834/jig.20160902
      Abstract: Screen image coding requires high coding efficiency, real-time performance, and moderate computational complexity. Palette coding is a state-of-the-art screen content coding method, but the efficiency of its predictive coding of index maps needs improvement. This study proposes a fast prediction algorithm for index maps based on local directional correlation. Experiments show that two neighboring indexes have the same directionality with a probability of 0.93; we call this property the "local directional correlation" of an index map. We first use a 2×3 template to perform an initial direction prediction. If the initial prediction fails, we use a 3×4 template to perform a second-round direction prediction. We conducted extensive experiments on 19 standard test video sequences and 3 test images. The prediction accuracy of our algorithm reached 95.43%, an average increase of 2.48% over typical multi-stage prediction (MSP) algorithms, and the algorithm was particularly suitable for videos with text characters, complex scenes, and multiple geometric elements. Moreover, its computational complexity was significantly lower than that of MSP. The algorithm thus satisfies the requirements of screen image coding. By exploiting the local directional correlation of index maps, it accelerates prediction and is applicable to the palette-based coding of text/graphics blocks in screen images.
      Keywords: image coding; screen image; compound image; index map; direction prediction
      Updated: 2024-05-07
    • Image classification algorithm based on hash codes and space pyramid

      Peng Tianqiang, Li Fang
      Vol. 21, Issue 9, Pages: 1138-1146(2016) DOI: 10.11834/jig.20160903
      Abstract: Sparse coding is widely used to represent images. However, this method and its improved variants require complex computation and long running times, among other drawbacks. An image classification algorithm based on hash codes and spatial pyramids is proposed to address these issues. The algorithm consists of four steps. First, local feature points are extracted from the images. Second, binary auto-encoder hashing functions are learned, which map the local feature points to hash codes. Third, binary k-means clustering is performed on the binary hash codes to generate binary visual vocabularies. Finally, the vocabularies are combined with a spatial pyramid matching model, and each image is represented by the histogram vector of the spatial pyramid, which is used for classification. To verify the efficiency of the proposed algorithm, we used two common datasets, Caltech-101 and Scene-15. Compared with state-of-the-art sparse coding algorithms, our method halved the vocabulary learning time, increased online encoding speed by a factor of 1.3 to 12.4, and increased classification accuracy by 1% to 5%. We also compared the classification performance of different hash encoding methods, such as PCA-ITQ and KSH. The proposed algorithm encodes local feature points with hash codes rather than sparse coding and performs image classification with a spatial pyramid matching model. Experimental results showed that it learns vocabularies and encodes faster, and that it can be used for online vocabulary learning and other online applications.
      Keywords: hash codes; spatial pyramid matching model; sparse coding; binary K-means cluster; image classification
      Updated: 2024-05-07
    • Feng Yushan, Wang Zilei
      Vol. 21, Issue 9, Pages: 1147-1154(2016) DOI: 10.11834/jig.20160904
      Abstract: This paper addresses fine-grained visual categorization, a task that requires domain-specific expert knowledge. The task is challenging because all objects in a database belong to the same basic-level category, with very fine differences between classes. These subtle differences are easily overwhelmed by image background information, which is seldom discriminative and often acts as a distractor, reducing recognition performance. Segmenting out the background regions and localizing discriminative image parts are therefore important precursors to fine-grained image categorization. To this end, a new segmentation algorithm based on top-down attention maps is proposed to detect discriminative objects and discount the influence of the background. Once the objects are localized, they are described by convolutional neural network (CNN) features to predict the corresponding class. The proposed method first applies a CNN model to the dataset to obtain a ConvNet basis with the GoogLeNet structure. The ConvNet basis is visualized to build reliable intuitions about the visual information contained in the trained CNN representations; the visualization shows that the salient parts of the images correspond to the object regions, while the activations of background regions are very small. Then, using the learned ConvNet, we predict the top-1 class label of a given image and determine the spatial support of the predicted class among the image pixels. The spatial support is rearranged to produce a top-down attention map, which effectively locates the informative object regions. Next, given an image and its image-specific class attention map, we compute the object segmentation mask with the GraphCut segmentation algorithm. The high-quality foreground segmentations are then used to encode the image appearance into a highly discriminative visual representation by finetuning the ConvNet basis to learn a new segmentation ConvNet. Finally, the ConvNet basis and the segmentation ConvNet are combined to perform fine-grained image recognition. We also use the original images to finetune the segmentation ConvNet for improved accuracy. The proposed model was tested on two new benchmark datasets for fine-grained image categorization, Cars196 and Aircrafts100, which were designed for fine-grained image recognition and come with public annotations, including class labels, object bounding boxes, and part locations. Only the class label annotation was used in our evaluation, and the final average accuracy rates on Cars196 and Aircrafts100 were 86.74% and 84.70%, respectively. These results show that adding visual attention information yields higher accuracy than the GoogLeNet model alone. In summary, a semantic segmentation strategy based on top-down attention maps was proposed to improve the accuracy of fine-grained image categorization. The method needs no bounding box or part annotations, which makes it robust and applicable to a variety of datasets. The experimental results show that attention information is very useful for fine-grained image recognition. The proposed model proved applicable to salient object detection, foreground segmentation, and fine-grained image categorization.
      Keywords: fine-grained visual categorization; convolutional neural network (CNN); top-down attention map; GraphCut; GoogLeNet
      Updated: 2024-05-07
    • Liu Lu, Wu Chengmao
      Vol. 21, Issue 9, Pages: 1155-1165(2016) DOI: 10.11834/jig.20160905
      Abstract: As image segmentation technology has continued to develop, scholars have proposed numerous segmentation algorithms. Image complexity and structural instability have led to a growing number of methods, each suited to different types of images. To further improve noise immunity and segmentation accuracy, an improved possibilistic clustering algorithm that combines intra-class and inter-class distances is proposed and applied to image segmentation. The algorithm uses a possibility measure to describe the degree of membership. The fuzzy c-means constraint that the memberships of a sample point across clusters must sum to 1 is removed by using the possibility measure, so that the membership degree is suitable for characterizing "typicality" and "compatibility". The algorithm also avoids the limitation of traditional possibilistic clustering segmentation, which considers only the distance from the sample to the cluster center. Intra-class and inter-class distances are combined into a new measure that accounts for both intra-class compactness and inter-class scatter, improving stability and noise resistance across different clustering structures. The histogram is integrated into the possibilistic fuzzy clustering segmentation algorithm so that it can segment all types of complex images. Segmentation tests on synthetic and remote sensing images show that the improved possibilistic clustering algorithm is effective: segmentation contours are clear, classification is accurate, and sensitivity to noise is low. Compared with other algorithms, the error rate is reduced by 2 percentage points, and the result is more satisfactory. This study addresses a defect of the fuzzy c-means and possibilistic clustering segmentation algorithms, namely inaccurate classification of images whose background and target colors are similar, by combining intra-class and inter-class distances as the measure of the algorithm. Combined with the histogram, the method yields a fast possibilistic fuzzy clustering segmentation algorithm that is also applicable to large, complex images.
      Keywords: fuzzy clustering; possibilistic clustering; image segmentation; partition coefficient; intra-class distance; inter-class distance
      Updated: 2024-05-07
    • Face recognition method based on frequency cluster

      Yuan Heng, Wang Zhihong, Jiang Wentao
      Vol. 21, Issue 9, Pages: 1166-1177(2016) DOI: 10.11834/jig.20160906
      Abstract: A face recognition approach based on frequency clusters is proposed to make recognition robust under complex conditions. First, continuous information sampling units are scattered across a detected face sub-region image, and the information entropy in the foreground and background regions of each sampling unit is calculated. The entropy energy and energy frequency of each sampling unit are then calculated; the weaker energy frequencies are filtered out, and the edge-of-face frequency is calculated from the second-order partial derivatives with a normalized frequency coefficient. This establishes the main feature information of the face. Finally, the geometrical layout of each sampling unit is obtained from its coordinate position, entropy energy, and energy frequency. The frequency cluster model, constructed from entropy energy, energy frequency, and geometrical layout, is used as the facial feature for identification and matching. The average recognition accuracy was 99.11% on the FERET and ORL-Yale databases and 97.36% on the CMU-PIE database. The average processing time for a single face image was 0.077 seconds. Experiments showed that the method overcomes the effects of illumination, varied poses, and varied expressions by taking advantage of the strong robustness of frequency clusters. The proposed approach adapts well to face recognition and significantly improves its robustness under complex conditions, such as illumination variations, feature ambiguity, and pose and expression changes.
      Keywords: face recognition; entropy energy; energy frequency; geometrical layout; frequency cluster
      Updated: 2024-05-07
    • Liu Wanjun, Liang Xuejian, Qu Haicheng
      Vol. 21, Issue 9, Pages: 1178-1190(2016) DOI: 10.11834/jig.20160907
      Abstract: Deep learning algorithms based on convolutional neural networks are attracting attention in the field of image processing. To improve the accuracy of feature extraction and the convergence rate of the parameters, and to optimize the learning performance of the network, this work compares the effect of different pooling models on learning performance and proposes an improved dynamic adaptive pooling algorithm. A convolutional neural network model is constructed and trained with different pooling models, and the trained models are verified at different numbers of iterations. To compensate for low accuracy and slow convergence, a dynamic adaptive pooling model is proposed, and its effect on accuracy and convergence rate at different numbers of iterations is studied. Comparative experiments show that the dynamic pooling model has the best learning performance. The maximum improvement of the convergence rate on a handwriting database is 18.55%, and the maximum decrement of the accuracy rate is 20%. The dynamic adaptive pooling algorithm can improve the accuracy of feature extraction, the convergence rate, and the accuracy of the convolutional neural network, thereby optimizing network learning performance. The dynamic adaptive pooling model can be further extended to other deep learning algorithms related to convolutional neural networks.
      Keywords: deep learning; convolutional neural network; image recognition; feature extraction; algorithm convergence; dynamic adaptive pooling
      Updated: 2024-05-07
    • Saliency detection based on lazy random walk

      Li Bo, Lu Chunyuan, Jin Lianbao, Leng Chengcai
      Vol. 21, Issue 9, Pages: 1191-1201(2016) DOI: 10.11834/jig.20160908
      Abstract: Research on biological vision indicates that when a human observes an object, visual attention moves from one region to another according to their saliency and ends with the observer focusing on the most interesting regions. Mathematically, this transition of visual attention resembles a random walk, a special case of a Markov process, which describes state transitions governed by probabilities and eventually settles into a steady state. Based on the ability of the random walk to describe human visual attention, this paper presents a visual saliency detection method based on the lazy random walk. Compared with traditional methods, it makes two contributions. First, unlike an ordinary random walk, the lazy random walk effectively guarantees convergence to a steady state. Second, the method is more reasonable and robust because it uses the commute time of the lazy random walk for saliency detection. A lazy random walk is first performed on the background, with a large lazy factor assigned to the seeds on an undirected graph built from image superpixels. Prior information, including a spatial center cue from convex hull detection and a color contrast cue, is then used to correct the initial saliency result. Finally, a robust saliency result is obtained by applying a similar random walk from the salient seeds produced in the previous step. Both qualitative and quantitative evaluations on the MSRA-1000 database demonstrated the robustness and efficiency of the proposed method compared with other state-of-the-art methods. The experimental results show that it outperforms related algorithms with respect to both the ROC curve and the F-measure. The lazy random walk-based saliency detection method simulates human visual attention and achieves better and more robust detection results than other methods.
      Keywords: saliency detection; random walk; lazy random walk; commute time
      Updated: 2024-05-07
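
The convergence claim in the abstract above can be illustrated with a small sketch. It is not the authors' implementation: it assumes the textbook lazy random walk, in which the walker stays put with probability `laziness` and otherwise moves to a neighbor in proportion to edge weight. The function names, the scalar `laziness` parameter, and the two-node graph are hypothetical choices for illustration; the paper's per-seed lazy factors and commute-time computation are not reproduced here.

```python
def lazy_walk_step(dist, weights, laziness):
    """Advance a probability distribution by one lazy random walk step.

    weights is a symmetric adjacency matrix of a connected undirected graph
    (every node is assumed to have at least one neighbor).
    """
    n = len(dist)
    # With probability `laziness` the walker stays on its current node.
    nxt = [laziness * dist[i] for i in range(n)]
    for i in range(n):
        degree = sum(weights[i])
        for j in range(n):
            # Otherwise it moves to neighbor j in proportion to edge weight.
            nxt[j] += (1.0 - laziness) * dist[i] * weights[i][j] / degree
    return nxt


def run_walk(weights, start, laziness, steps):
    """Start with all probability mass on `start` and iterate the walk."""
    dist = [0.0] * len(weights)
    dist[start] = 1.0
    for _ in range(steps):
        dist = lazy_walk_step(dist, weights, laziness)
    return dist


# Two nodes joined by one edge: the smallest bipartite graph.
two_nodes = [[0.0, 1.0], [1.0, 0.0]]

# laziness = 0 is the ordinary random walk, which keeps oscillating.
ordinary_even = run_walk(two_nodes, start=0, laziness=0.0, steps=100)
ordinary_odd = run_walk(two_nodes, start=0, laziness=0.0, steps=101)

# laziness = 0.5 settles into the stationary distribution.
lazy = run_walk(two_nodes, start=0, laziness=0.5, steps=100)
```

On this graph the ordinary walk alternates between the two nodes forever, so its distribution depends on whether the step count is even or odd, whereas the lazy walk reaches an equal split between the nodes and stays there.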
    • Background modeling based on adaptive neighborhood correlation

      Wan Jian, Hong Mingjian, Zhao Chenqiu
      Vol. 21, Issue 9, Pages: 1202-1212(2016) DOI: 10.11834/jig.20160909
      Abstract: Background modeling is widely used to detect moving objects and is the basis for object tracking, behavior learning, and recognition in computer vision. Mixture of Gaussians (MOG) and Codebook are currently popular methods based on pixel values. However, these methods usually assume that pixels are independent and retain only time-domain information while ignoring spatial information, which limits the model to temporal continuity. This paper proposes an adaptive neighborhood correlation (ANC) background modeling approach. ANC adds a neighborhood model while retaining the time-domain information, and uses the detection results to adjust the neighborhood area. ANC begins by using the original pixel-based background modeling method to detect the candidate foreground. It then compares the candidate foreground pixels with the models of their neighboring pixels; matched pixels are treated as background pixels and the others as foreground pixels. Finally, pixel confidence is introduced to adjust the neighborhood size adaptively. On the Change Detection benchmark database, ANC outperforms MOG and Codebook by more than 7% on average in accuracy and F-measure, and also compares favorably on the ROC curve and other measures. ANC overcomes the limitations of pixel-based background modeling methods and is suitable for complex multimodal backgrounds. It not only describes pixel changes accurately but is also robust and adaptive to complex backgrounds.
      Keywords: mixture of Gaussians (MOG); Codebook; background modeling; adaptive neighborhood; pixel
      Updated: 2024-05-07
    • Real-time robust feature-point matching algorithm

      Chen Tianhua, Wang Fulong
      Vol. 21, Issue 9, Pages: 1213-1220(2016) DOI: 10.11834/jig.20160910
      Abstract: Traditional feature-point matching algorithms process large amounts of data and impose heavy computational burdens, particularly on real-time systems, such as visual odometry, and on low-power devices, such as cellphones. This has led to an intensive search for replacements with lower computational cost. This paper proposes a robust real-time feature-point matching algorithm, designated RRM. It first determines the edge areas of the image by a deviation operation and then finds the anchor points, or local gradient maxima, in the edge areas, which are likely to be feature points. Next, it determines the orientation of each feature point by calculating the intensity centroid and describes the feature points with an improved BRIEF descriptor. Finally, it matches the feature points by combining Hamming distance with symmetrical matching. Compared with a variety of feature-point matching algorithms, the proposed algorithm attains a higher success rate of 83% on images with complex variations, such as illumination changes, viewpoint changes, and scaling and rotation changes, and it is more stable than the others. Experimental results indicate that the proposed algorithm effectively overcomes the limitations of traditional feature-point matching without losing accuracy. It can be used for image stitching, object tracking, and object recognition.
      Keywords: feature point matching; real-time; robust; anchor point; symmetrical match; image stitching
      Updated: 2024-05-07
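
The final matching step named in the abstract above, Hamming distance combined with symmetrical matching, can be sketched as follows. This is a generic illustration, not the RRM implementation: it assumes binary descriptors stored as Python integers, interprets "symmetrical match" as the usual mutual nearest-neighbor check, and adds a hypothetical `max_distance` threshold. The 8-bit descriptors are made-up toy values.

```python
def hamming(a, b):
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")


def nearest(query, candidates):
    """Index of the candidate descriptor closest to `query`."""
    return min(range(len(candidates)),
               key=lambda k: hamming(query, candidates[k]))


def symmetric_match(desc_a, desc_b, max_distance):
    """Keep (i, j) only if i and j are each other's nearest neighbor."""
    matches = []
    for i, descriptor in enumerate(desc_a):
        j = nearest(descriptor, desc_b)
        # Symmetry check: the best match of j back in image A must be i.
        is_mutual = nearest(desc_b[j], desc_a) == i
        if is_mutual and hamming(descriptor, desc_b[j]) <= max_distance:
            matches.append((i, j))
    return matches


# Toy 8-bit descriptors for two images (real BRIEF descriptors are longer).
left = [0b10110010, 0b01001101, 0b11110000]
right = [0b01001111, 0b10110011, 0b00001111]

pairs = symmetric_match(left, right, max_distance=2)
```

Here the third descriptor in `left` has a nearest neighbor in `right`, but that neighbor prefers the first descriptor in `left`, so the one-sided match is discarded. This is the kind of false match the symmetry check is meant to filter out.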
    • Zhao Jinwei, Shen Yiyun, Liu Chunxiao, Ouyang Yi
      Vol. 21, Issue 9, Pages: 1221-1228(2016) DOI: 10.11834/jig.20160911
      Abstract: To address false candidate atmospheric light and halo effects in dark channel prior-based image dehazing methods, an improved dehazing algorithm with atmospheric light validation and halo elimination strategies is proposed, which yields haze-free images with high visibility. Because dark channel prior-based methods do not verify the validity of the atmospheric light and may fail to select a proper one, a support vector machine-based classifier is trained and used to reject false candidates. In dark channel prior-based approaches, coarse transmission maps introduce halo effects into the haze-free results. A patch shift-based fine transmission estimation method is therefore adopted, which preserves the edges of the input image and significantly suppresses halo effects in the haze-free image. A few pixels near sharp edges may still show halo effects; these are detected and corrected by the proposed halo elimination strategy using the guided filter. Finally, the haze-free image is obtained by solving the haze image formation model. Experimental results demonstrate that our algorithm rejects false candidate atmospheric light and significantly diminishes the halo effect. The resulting haze-free images possess superior visibility, rich image detail, and depth. Our algorithm outperforms state-of-the-art image dehazing methods and significantly improves visibility in haze-free images, meeting the requirements of applications such as video surveillance, traffic navigation, and object detection.
      Keywords: image dehazing; dark channel prior; patch shift; atmospheric light; halo elimination
      Updated: 2024-05-07
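
The last step in the abstract above, solving the haze image formation model, can be sketched under the assumption that the model is the one commonly used with the dark channel prior: I = J·t + A·(1 − t), where I is the hazy pixel, J the scene radiance, t the transmission, and A the atmospheric light. The function names, the single-channel nested-list images, and the lower bound `t_min` on the transmission are illustrative assumptions, not details taken from the paper.

```python
def add_haze(clear, transmission, airlight):
    """Forward model: I = J * t + A * (1 - t), applied per pixel."""
    return [[j * t + airlight * (1.0 - t) for j, t in zip(row_j, row_t)]
            for row_j, row_t in zip(clear, transmission)]


def recover_scene(hazy, transmission, airlight, t_min=0.1):
    """Inverse model: J = (I - A) / max(t, t_min) + A.

    Clamping t from below avoids amplifying noise where haze is dense.
    """
    return [[(i - airlight) / max(t, t_min) + airlight
             for i, t in zip(row_i, row_t)]
            for row_i, row_t in zip(hazy, transmission)]


# A 2x2 single-channel image with intensities in [0, 1].
clear = [[0.2, 0.8], [0.5, 0.1]]
transmission = [[0.9, 0.5], [0.3, 0.6]]
airlight = 0.95

hazy = add_haze(clear, transmission, airlight)
restored = recover_scene(hazy, transmission, airlight)
```

With an accurate transmission map and atmospheric light, inverting the model returns the original scene. Errors in either quantity propagate directly into the restored image, which is why the abstract puts its effort into validating the atmospheric light and refining the transmission.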
    • Zheng Liping, Gao Wencan, Li Shanglin, Cao Li
      Vol. 21, Issue 9, Pages: 1229-1237(2016) DOI: 10.11834/jig.20160912
      Abstract: As an extension of the Voronoi diagram, the power diagram can hold capacities precisely. A capacity-constrained power diagram (CCPD) is obtained by imposing a capacity constraint on an ordinary power diagram. When the sites are fixed, no efficient method is available to generate a centroidal CCPD (CCCPD). A novel fixed-site method for generating a CCCPD under constant density is proposed to solve this problem. The centroid of a power cell is optimized by revising the weights of its neighboring sites, and the capacity of a site is optimized by scaling its power cell in equal proportions, which eventually generates the required power diagram. Taking both centroid and capacity constraints into account, this work compares the converged power diagrams under uniform and non-uniform capacity constraints and analyzes the differences between the experimental results. The analysis indicates that the proposed method is effective in solving the capacity constraint problem and obtains an optimal solution under the constraints. Experimental results show that the method stably generates a CCCPD under constant density, with high precision and good adaptability.
      Keywords: power diagram; fixed site; constant density; centroid restriction; capacity constraint
      Updated: 2024-05-07
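
To make the role of site weights in the abstract above concrete, the sketch below assigns sample points to fixed sites by the standard power distance, |x − s|² − w, and counts how many samples each cell holds as a stand-in for capacity under constant density. It illustrates only the definition of a power diagram; the paper's centroid and capacity optimization is not reproduced, and the function names, sites, weights, and samples are hypothetical.

```python
def power_distance(point, site, weight):
    """Squared Euclidean distance to the site minus the site's weight."""
    dx = point[0] - site[0]
    dy = point[1] - site[1]
    return dx * dx + dy * dy - weight


def owner(point, sites, weights):
    """Index of the site whose power cell contains `point`."""
    return min(range(len(sites)),
               key=lambda k: power_distance(point, sites[k], weights[k]))


def capacities(samples, sites, weights):
    """Number of samples per cell, a proxy for capacity at constant density."""
    counts = [0] * len(sites)
    for point in samples:
        counts[owner(point, sites, weights)] += 1
    return counts


# Two fixed sites on the x-axis and eight evenly spaced sample points.
sites = [(0.0, 0.0), (4.0, 0.0)]
samples = [(x + 0.5, 0.0) for x in range(-2, 6)]

# Equal weights reproduce the Voronoi diagram: the boundary is at x = 2.
equal = capacities(samples, sites, [0.0, 0.0])

# Raising the second site's weight moves the boundary to x = 1.
shifted = capacities(samples, sites, [0.0, 8.0])
```

The sites never move, yet changing a weight transfers samples from one cell to the other. That is the degree of freedom a fixed-site method has available for meeting a capacity target.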
    • Yuan Rong, Shi Shuyue, Xie Qingguo
      Vol. 21, Issue 9, Pages: 1238-1246(2016) DOI: 10.11834/jig.20160913
      Abstract: In liver surgical planning, accurate and effective vessel segmentation is the basis of the topological analysis needed for preoperative volumetric assessment and surgery simulation. An algorithm is proposed for automatic hepatic vessel segmentation in multi-phase contrast-enhanced CT. First, the segmented CT images are processed with an anisotropic filter for noise reduction. The segmentation mask removes the influence of neighboring structures, such as ribs, and reduces the computational expense. Next, an improved Hessian-based enhancing filter that uses intensity information is designed to overcome discontinuities in junction regions; intensity information, rather than the Frobenius matrix norm, is used for noise reduction. Finally, an adaptive multi-scale region growing method is applied to the enhanced result for vessel segmentation. The mean values of the currently segmented target and background are used for adaptive threshold selection, and a multi-scale iteration is performed in the growing region to cope with intensity inhomogeneity. Five sets of clinical multi-phase contrast-enhanced CT images were used for the evaluation. In Cases 1-4, images from the portal phase were chosen as input, and only the portal vessel systems were segmented. Topological analysis shows that fifth-stage or sixth-stage bifurcations could be detected with good accuracy. In Case 5, images from the delayed phase were chosen as input, and both the portal and hepatic vessel systems were extracted. Topological analysis of a single hepatic vein demonstrated that fifth-stage bifurcations could still be detected. Experimental results indicated that the trunks and branches of the hepatic vessels were completely segmented. This paper presented a novel automatic segmentation method for hepatic vessels, in which different parameters are assigned for region growing on the different multi-scale enhanced images. The results showed that the algorithm was effective and accurate. In addition, it provided the correct topological structures of the hepatic vessels.
      Keywords: multi-phase contrast-enhanced CT; hepatic vessel segmentation; enhancing filter; region growing
      Updated: 2024-05-07
    • Chen Sheng, Zhang Mingwu
      Vol. 21, Issue 9, Pages: 1247-1255(2016) DOI: 10.11834/jig.20160914
      Abstract: We explore methods of segmenting bone structure from normal chest X-ray radiographs to obtain virtual dual-energy X-ray radiographs of soft tissue. Our goal is to obtain high-quality clinical radiographs without increasing the radiation dose. Our algorithm first divides the lung field into 8 sections: upper, middle, and lower part of the left and right lateral lobe and left lung. The lung has specific anatomical structures in each section. For every section, we use standard chest radiographs and the corresponding bone images to conduct large-scale training of artificial neural networks (ANN). After training, we use the ANN to produce the virtual bone image of each section and obtain a complete image by fusing the 8 images. We use total variation minimization to suppress noise in the image and enhance the bone edges. Finally, we subtract the virtual bone image from the original image to obtain the virtual soft tissue image. To test our algorithm, we use 100 radiographs with nodules. The method can be applied to normal radiographs. After processing, we obtain a virtual soft tissue image in which the bone structure is effectively removed. The virtual soft tissue image preserves lung nodules and blood vessels and is helpful for diagnosing pulmonary nodules. Our method improves the lung nodule recognition rate to 88%, compared with a traditional rate of 70%. Based on anatomical structures, artificial neural networks and regression models can effectively isolate bones. The method can be widely applied in clinical diagnosis to help radiologists detect lung nodules.
      Keywords: regression model; chest X-ray radiography; virtual dual-energy; lung nodules; anatomical structure
      Updated: 2024-05-07
    • Li Xiuxia, Jing Linhai, Li Hui, Tang Yunwei, Ge Wenyan
      Vol. 21, Issue 9, Pages: 1256-1264(2016) DOI: 10.11834/jig.20160915
      Abstract: As the spatial resolution of remote sensing imagery has continued to improve, object-based image analysis (OBIA) methods have become widely used. The basic units of OBIA are segments produced by image segmentation rather than single pixels. Image segmentation, the process of partitioning an image into homogeneous regions, is therefore a prerequisite for OBIA and determines the quality of the analysis outcome. Segmentation of high-resolution imagery is typically conducted via initial segmentation followed by primitive merging, and various segmentation methods have been developed. However, the quality of segmentation, particularly for high-resolution remote sensing imagery, is less than satisfactory. Among the numerous image segmentation methods, the seed-based region growing (SRG) method is relatively easy, efficient, and robust. This method depends largely on initial seed extraction; the seed pixels should be representative and as similar to their neighbors as possible. Different approaches have been proposed to select seeds, yet issues remain. For instance, redundant seeds are frequently chosen, which reduces the efficiency of SRG-based segmentation. In addition, details and thin linear objects cannot be accurately separated from large objects when seeds are inadequate. With the goal of choosing representative seeds, an approach based on one-dimensional spectral difference (ODSD) is proposed to extract seeds for the SRG segmentation method, using an evaluation criterion composed of spectral angle and spectral distance. The ODSD method is implemented as follows: 1) the horizontal and vertical spectral difference maps of the image are calculated; 2) the shallow basins in the difference maps are removed with the "imHmin" function in MATLAB, and the local minima in each directional spectral difference image are taken as candidate seeds; 3) candidate seeds that overlap in the two orthogonal directions are merged into one seed, which has minimum differences from its adjacent neighbors; 4) the remaining candidate seeds are optimized by choosing a unique seed in each area connected in a single dimension; and 5) all optimized common seeds and optimized directional seeds are used for the region-growing process. After the image is initially segmented using the SRG method with the selected seeds, the resulting primitives are merged using the fractal net evolution method in the eCognition package. Experiments on an IKONOS image demonstrated that the proposed ODSD method is efficient at delineating details and thin objects, which current segmentation approaches handle with more difficulty. Moreover, the quantitative evaluation showed that the outcome of the ODSD method reached nearly the same accuracy as the eCognition package and was superior to the kernel graph cuts method at the same scale. The ODSD method proved efficient in providing optimal seeds in flat areas, details, and thin objects, thereby guaranteeing the representativeness of the chosen seeds. It also yielded highly detailed segmentation maps.
      Keywords: remote sensing; object-based image analysis; image segmentation; seeded region growing; seeds extraction; one-dimensional spectral difference
      Updated: 2024-05-07