Abstract: Mesoscale ocean fronts and eddies are important mesoscale marine environmental features. An ocean front is the interface between water masses with different properties. In areas where an ocean front exists, the corresponding hydrological factors (e.g., temperature, chlorophyll concentration, and salinity) present a high horizontal gradient. Seawater convergence and vertical motion are enhanced where fronts appear; this enhancement leads to the enrichment of nutrients and provides a rich food supply for plankton, fish, and other organisms. Therefore, sea areas where fronts appear can serve as good fishing grounds (e.g., the Zhoushan and Minnan fishing grounds in China). Mesoscale eddies play an important role in ocean circulation and are important carriers of energy transport and material transfer in the ocean. Furthermore, eddies can influence the distribution of hydrological factors, such as temperature and salinity, and are thus one of the important drivers of marine hydrological variation. Eddies are associated with local vertical flows; for example, the upwelling associated with cold eddies carries nutrients from the deep ocean to the euphotic zone, greatly improves the primary productivity of the ocean, and influences the distribution of fishing grounds. Thus, eddies affect the development of the marine economy. Remote sensing data offer excellent continuity and synchronization and can effectively reflect the spatial distribution characteristics of marine hydrological elements and the sea surface height. Therefore, remote sensing data, such as sea surface temperature, sea surface height, and sea level anomaly, have been widely used in the extraction of mesoscale ocean fronts and eddies. Research on the extraction of mesoscale ocean fronts and eddies based on remote sensing data is thus significant to marine ecosystem research, fishery stock assessment, and fishing condition forecasting.
We aim to provide a reference and ideas for the extraction of mesoscale ocean fronts and eddies by summarizing and analyzing existing extraction methods. This study summarizes and analyzes front extraction methods, such as the gradient method, the Canny algorithm, wavelet analysis, and an algorithm based on the law of universal gravitation, as well as eddy extraction methods, such as the OW, WA, SSH-based, and HD methods. Insights and new ideas are then provided. To show the differences among the front extraction methods intuitively, fronts are extracted from the same area by using the gradient, Sobel, and Canny algorithms, and the extraction results are presented in a figure. When the gradient and Sobel algorithms are used to extract fronts, the thresholds used to distinguish background and front pixels are obtained by the iterative method. Afterward, the Zhang-Suen method is applied to thin the binary image, and the center line of the front is obtained. When the Canny algorithm is utilized to extract fronts, 0.25 is selected as the low threshold and 0.9 as the high threshold. Using sea surface temperature (SST) data of the northern South China Sea (SCS) for February 2014, the gradient, Sobel, and Canny algorithms are used to extract the temperature front in the northern SCS, and a figure is drawn for the temperature front distribution of this area. Results show that among the various front extraction methods, the gradient method is simple but greatly influenced by noise. The Canny algorithm presents a great advantage in front positioning accuracy, continuity, and computational efficiency. Wavelet analysis can be used in multi-scale analysis, but its computation is complex. The algorithm based on the law of universal gravitation considers the influence of both the value and the position of the center and neighborhood pixels, so it has better anti-noise ability and accuracy than the gradient method.
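As an illustration of the gradient-magnitude pipeline described above, the following is a minimal sketch (not the implementation used in the study): the Sobel gradient magnitude of an SST field is computed with SciPy, and the background/front threshold is found with a simple iterative (isodata-style) scheme. The function names and the convergence tolerance are our own assumptions.

```python
import numpy as np
from scipy import ndimage

def iterative_threshold(values, tol=1e-4):
    """Isodata-style iterative threshold: repeatedly split the data at the
    midpoint of the two class means until the threshold stabilizes."""
    t = values.mean()
    while True:
        lo, hi = values[values <= t], values[values > t]
        if len(lo) == 0 or len(hi) == 0:   # degenerate split: stop
            return t
        t_new = 0.5 * (lo.mean() + hi.mean())
        if abs(t_new - t) < tol:
            return t_new
        t = t_new

def sobel_front_mask(sst):
    """Binary mask of candidate front pixels in an SST field: Sobel
    gradient magnitude thresholded by the iterative method."""
    gx = ndimage.sobel(sst, axis=1)
    gy = ndimage.sobel(sst, axis=0)
    grad = np.hypot(gx, gy)
    return grad > iterative_threshold(grad.ravel())
```

A thinning step such as Zhang-Suen would then reduce the resulting mask to the front center line.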
Among the various eddy extraction methods, the OW method can identify the eddy core region well, but the extraction result is greatly influenced by the selection of the W-value threshold. The WA method presents excellent accuracy but requires extensive calculation to obtain the streamlines. The SSH-based method is simple but can only extract the eddy boundary, not the core area of the eddy. Early eddy extraction methods ignore the possibility that multi-core eddy structures may appear in the ocean and are thus unable to identify such structures. The HD method combines the advantages of the OW and SSH-based methods: it can extract the boundary and core area of an eddy simultaneously and can identify multi-core eddy structures. According to this summary and analysis, threshold selection for front and eddy extraction is difficult but crucial to the quality of the extraction result, and many researchers have examined threshold selection methods. In addition, the edge detection methods widely used in front extraction, such as the gradient method and the Canny algorithm, are designed for extracting sharp edges (e.g., solid edges). However, seawater is a fluid and is characterized by weak edges; in other words, the boundary between water masses with different properties is not obvious and is more difficult to identify than a solid edge. Therefore, traditional edge detection methods are unsuitable for mesoscale front extraction because of the weak-edge characteristic of marine environmental element images. Given that an ocean front is the interface between different water masses, the region growing algorithm, which is widely used in image segmentation, can be utilized to segment the marine environmental element image of the study area into several independent parts that represent different water masses. The boundaries of interest (i.e., the fronts) can then be searched along the borders of these segments.
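For the OW method mentioned above, the Okubo-Weiss parameter W combines strain and vorticity computed from the velocity field. A minimal NumPy sketch follows (the grid spacing and function name are assumptions; a common eddy-core criterion in the literature is W < -0.2 sigma_W, where sigma_W is the spatial standard deviation of W):

```python
import numpy as np

def okubo_weiss(u, v, dx=1.0, dy=1.0):
    """Okubo-Weiss parameter W = s_n^2 + s_s^2 - omega^2 from a 2-D
    velocity field (u, v); strongly negative W marks eddy-core candidates."""
    du_dx = np.gradient(u, dx, axis=1)
    du_dy = np.gradient(u, dy, axis=0)
    dv_dx = np.gradient(v, dx, axis=1)
    dv_dy = np.gradient(v, dy, axis=0)
    s_n = du_dx - dv_dy          # normal strain
    s_s = dv_dx + du_dy          # shear strain
    omega = dv_dx - du_dy        # relative vorticity
    return s_n**2 + s_s**2 - omega**2
```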
Abstract: Embedding watermark information into a host image leads to a contradiction between invisibility and robustness. High watermark embedding strength means strong robustness but poor invisibility; low embedding strength means good invisibility but weak robustness. As an effective means of copyright protection, a watermarking algorithm must ensure good invisibility and effectively resist various attacks. Geometric attacks destroy the synchronization between the watermark and the host image and thus lead to the failure of the watermarking algorithm. To address the contradiction between invisibility and robustness and to improve the capability to resist geometric attacks, this study proposes an invisible and robust watermarking algorithm based on image blocks. The host image is divided into non-overlapping image blocks, and the texture and edge features of each block are analyzed by using the masking property of the human visual system to calculate a masking value for each block. The masking values are arranged in descending order, and an appropriate number of well-masked image blocks are selected as embedding sub-blocks according to the size of the watermark information. A two-level discrete wavelet transform is performed on each sub-block, and its low-frequency sub-band is decomposed by singular value decomposition to obtain two orthogonal matrices and a diagonal matrix. The difference among the three sets of elements in the first column of the orthogonal matrix is calculated according to the watermark bit information. If the difference is less than the threshold value, the Arnold-scrambled watermark information is embedded into the orthogonal matrix. Then, inverse singular value decomposition is applied on the selected image block, and the low-frequency sub-band and the other middle- and high-frequency sub-bands of the image block are subjected to the inverse wavelet transform.
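The embedding chain described above (block, two-level DWT, SVD of the low-frequency sub-band, modification of first-column entries of the orthogonal matrix, inversion) can be sketched for a single block as follows. This is a simplified illustration, not the paper's exact rule: an orthonormal Haar transform is written out in NumPy, one bit is carried by enforcing an ordering between two first-column entries of U, and the 0.04 threshold is treated as the embedding strength.

```python
import numpy as np

S2 = np.sqrt(2.0)

def haar_dwt2(x):
    """One level of an orthonormal 2-D Haar transform."""
    L = (x[:, 0::2] + x[:, 1::2]) / S2
    H = (x[:, 0::2] - x[:, 1::2]) / S2
    LL = (L[0::2, :] + L[1::2, :]) / S2
    LH = (L[0::2, :] - L[1::2, :]) / S2
    HL = (H[0::2, :] + H[1::2, :]) / S2
    HH = (H[0::2, :] - H[1::2, :]) / S2
    return LL, (LH, HL, HH)

def haar_idwt2(LL, details):
    """Exact inverse of haar_dwt2."""
    LH, HL, HH = details
    L = np.empty((2 * LL.shape[0], LL.shape[1]))
    H = np.empty_like(L)
    L[0::2], L[1::2] = (LL + LH) / S2, (LL - LH) / S2
    H[0::2], H[1::2] = (HL + HH) / S2, (HL - HH) / S2
    x = np.empty((L.shape[0], 2 * L.shape[1]))
    x[:, 0::2], x[:, 1::2] = (L + H) / S2, (L - H) / S2
    return x

def embed_bit(block, bit, strength=0.04):
    """Embed one bit: enforce |U[1,0]| > |U[2,0]| for bit 1, < for bit 0."""
    LL1, d1 = haar_dwt2(block.astype(float))
    LL2, d2 = haar_dwt2(LL1)
    U, S, Vt = np.linalg.svd(LL2)
    a, b = U[1, 0], U[2, 0]
    avg = (abs(a) + abs(b)) / 2.0
    hi, lo = avg + strength / 2.0, avg - strength / 2.0
    U[1, 0] = np.sign(a) * (hi if bit else lo)
    U[2, 0] = np.sign(b) * (lo if bit else hi)
    LL2_mod = U @ np.diag(S) @ Vt
    return haar_idwt2(haar_idwt2(LL2_mod, d2), d1)

def extract_bit(block):
    """Blind extraction from the ordering of the two U entries."""
    LL1, _ = haar_dwt2(block.astype(float))
    LL2, _ = haar_dwt2(LL1)
    U, _, _ = np.linalg.svd(LL2)
    return 1 if abs(U[1, 0]) > abs(U[2, 0]) else 0
```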
Afterward, all the image blocks are combined to obtain the watermarked image. The scale-invariant feature transform (SIFT) feature points of the watermarked image are extracted, and their coordinate, scale, direction, and descriptor information is stored. In watermark extraction, the SIFT feature points of the possibly attacked watermarked image are extracted and matched with the feature points saved during watermark embedding to determine whether the watermarked image has been subjected to a geometric attack. If so, geometric correction of the watermarked image is realized through the coordinate relations and scale features of the SIFT feature points; this correction restores the synchronization of the watermark. If no geometric attack has occurred, a two-level discrete wavelet transform is performed on the selected image block, and its low-frequency sub-band is decomposed by singular value decomposition to obtain the orthogonal matrix. The watermark bit information is extracted according to the difference between two elements in the first column of the orthogonal matrix and then transformed into a binary image, which is subjected to the inverse Arnold transformation to obtain the watermark image. In experiments on standard gray-scale images, the watermark information is embedded into three images: Lena, Elaine, and Baboon. As the threshold value increases, image quality is reduced correspondingly, but the normalized correlation coefficient of the extracted watermark improves. Hence, considering both invisibility and robustness, the threshold value is set to 0.04 in the experiments. The peak signal-to-noise ratios (PSNRs) of the three watermarked images, Lena, Elaine, and Baboon, are 49.8645, 46.3046, and 44.6832 dB, respectively. These values show that the algorithm possesses good invisibility.
When no attack occurs, the normalized correlation coefficients between the original and extracted watermark images reach 1, which shows the effectiveness of the algorithm. Various types of attacks, including JPEG compression, noise, and filtering, are applied to the watermarked images. As the attack intensity increases, the normalized correlation coefficients of the extracted watermarks are affected but mostly exceed 0.99. In particular, the normalized correlation coefficients of the watermarks extracted after compression attacks reach 1. Rotation, scaling, cyclic shifting, and shearing attacks are then performed on the watermarked images, after which geometric correction is realized with the coordinate relations and scale features of the SIFT feature points. Given that the watermarked images are rotated without changing the image size, some pixel information is lost during rotation, so the normalized correlation coefficients of the extracted watermarks cannot reach 1. The normalized correlation coefficient of the watermark extracted from an enlarged watermarked image is larger than that extracted from a reduced one. All normalized correlation coefficients of the watermarks extracted under the cyclic shifting attack reach 1. Shearing of a well-masked region weakens the anti-shearing capability, but the normalized correlation coefficients of the watermarks extracted under the shearing attack still exceed 0.95. Experimental results on conventional and geometric attacks show that the algorithm exhibits strong robustness against both types of attacks. The texture and edge information of the image can be calculated to obtain the masking value of each image block, and the invisibility of the watermarking algorithm can be ensured by selecting well-masked image blocks as the embedding sub-blocks.
Selecting the pair of elements with the largest difference in the first column of the orthogonal matrix as the embedding position minimizes the influence on the overall visual quality of the original image and improves the robustness of the watermarking algorithm. Given that SIFT feature points are space-based local image feature descriptors that are invariant to image rotation, scaling, translation, and so on, geometric correction of the watermarked image is realized with the coordinate relations and scale features of the SIFT feature points to improve the ability to resist geometric attacks. The abovementioned methods enable the watermarking algorithm to effectively address the contradiction between invisibility and robustness.
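The geometric-correction step relies on matched SIFT keypoints between the stored and the attacked image. Given such matched coordinate pairs (assumed here to be already obtained, e.g., with OpenCV's SIFT and a descriptor matcher), the rotation, scale, and translation can be recovered with a least-squares similarity fit (Umeyama's method); the function name below is our own.

```python
import numpy as np

def estimate_similarity(src, dst):
    """Least-squares similarity transform dst ~ s * R @ src + t from
    matched 2-D point sets src, dst of shape (n, 2)."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    A, B = src - mu_s, dst - mu_d
    cov = B.T @ A / len(src)                       # cross-covariance
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(2)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:   # exclude reflections
        S[1, 1] = -1.0
    R = U @ S @ Vt                                 # rotation
    var_src = (A ** 2).sum() / len(src)
    scale = (D * np.diag(S)).sum() / var_src       # scale = tr(DS)/var
    t = mu_d - scale * R @ mu_s                    # translation
    return scale, R, t
```

Applying the inverse of the recovered transform to the attacked image restores watermark synchronization before extraction.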
Keywords: human visual system (HVS); singular value decomposition (SVD); scale-invariant feature transform (SIFT); invisibility; robustness; geometric correction
Abstract: Image classification is an important issue in computer vision and a hot research topic. The traditional sparse coding (SC) method is effective for image representation and has achieved good results in image classification. However, the SC method has two drawbacks. First, it ignores the local relationship between image features, thus losing local information. Second, because the combinatorial optimization problems of SC involve addition and subtraction, the subtraction operation might cause features to cancel each other out. These two drawbacks result in coding instability, which means that similar features are encoded into different codes. Meanwhile, representation and classification are usually independent of each other during image classification, so the semantic relations between image features are not well preserved. In other words, image representation is not task-driven and may be unable to serve the final classification task well. Furthermore, the local feature quantization method disregards the underlying semantic information of the local region, which degrades classification performance. To deal with these problems, a two-stage method of image classification with non-negative and local Laplacian SC and context information (NLLSC-CI) is proposed in this study. NLLSC-CI aims to improve the efficiency of image representation and the accuracy of image classification. The representation of an image involves two stages. In the first stage, non-negative and locality-constrained Laplacian SC (NLLSC) is introduced to encode the local features of the image to overcome coding instability. First, non-negativity is introduced into Laplacian SC (LSC) by non-negative matrix factorization (NMF) to avoid offsetting between features; the non-negativity constraint is applied to the codebook and the code coefficients.
Second, bases that are near the local features are selected to constrain the codes because locality is more important than sparseness; thus, the local information between features is preserved. Then, the original image representation is obtained by using spatial pyramid division (SPD) and max pooling (MP) in the pooling step. In the second stage, several original image representations are selected and connected to generate joint context spaces. All images are then mapped into these spaces by the SVM classifier, and the mapped features in these joint context spaces are regarded as the final representations of the images. In this manner, the image representation and classification tasks are considered jointly to achieve improved performance. This two-stage representation method preserves the context relationship between image features to a certain extent. To validate the performance of the proposed method, experiments on four public image datasets, namely, Corel-10, Scene-15, Caltech-101, and Caltech-256, are conducted. Results suggest that the classification accuracy of NLLSC-CI increases by about 3% to 18% compared with that of state-of-the-art SC algorithms. The accuracy of NLLSC-CI increases by 3% to 12% on the Corel-10 dataset. For the Scene-15 dataset, classification accuracy increases by 4% to 15%. The classification performance on the Caltech-101 and Caltech-256 datasets increases by 3% to 14% and 4% to 18%, respectively. These findings show that the classification accuracy of the proposed method is better than that of state-of-the-art SC algorithms on the four benchmark image datasets. In addition, Tables 2 to 5 show that classification accuracy is the lowest on the Caltech-256 dataset. The reason could be the size of this dataset: it contains too many categories and images, and the differences between and within classes are too large. As a result, the corresponding category of an image cannot always be identified correctly during classification.
Thus, the accuracy of the proposed method is relatively low for datasets with large numbers and multiple classes of images. In general, however, NLLSC-CI demonstrates improved classification accuracy. This study proposes an algorithm called NLLSC-CI to solve coding instability and the independence between image representation and classification. The proposed method overcomes coding instability and preserves the mutual context dependency between the local features of images. Specifically, owing to the incorporation of non-negativity, locality, and graph Laplacian regularization, the new method improves the consistency of sparse codes and their mutual dependency, thus preserving more features and the local information between them and making the local features more discriminative. The new optimization problem in NLLSC-CI is solved by defining a diagonal matrix to obtain an analytical solution, and the consistency of sparse codes is maintained by introducing a Laplacian matrix. This two-stage method of image representation jointly considers two previously independent tasks: image representation and classification. The construction of a joint space based on context information preserves the context between image features, and the image representation obtained through context information and the image classification become mutually dependent. Therefore, NLLSC-CI can model images adequately and represent the original images through the mutual dependency and context information among features, thus improving classification accuracy. Several benchmark image datasets are studied, and the final experimental results show that the proposed algorithm performs better than previous algorithms. In addition, this method can be applied to other computer vision issues, such as image segmentation, image annotation, and image retrieval.
Meanwhile, larger-scale image data should be exploited because the experimental image data used in this study come from several standard image datasets. Moreover, although the context information of this method can effectively convey the information expressed by images, it cannot fully reflect human ways of thinking. Therefore, other methods and models of image semantic content that are closer to human perception and thinking need to be investigated.
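The SPD + MP pooling stage of the first-stage representation can be sketched as follows. This is a generic spatial-pyramid max-pooling routine; the pyramid levels and function name are assumptions, not the paper's exact configuration.

```python
import numpy as np

def spatial_pyramid_max_pool(codes, xy, width, height, levels=(1, 2, 4)):
    """Max-pool local feature codes over a spatial pyramid.
    codes: (n, k) sparse codes; xy: (n, 2) integer feature positions.
    Returns the concatenation of per-cell max-pooled vectors."""
    codes, xy = np.asarray(codes), np.asarray(xy)
    pooled = []
    for L in levels:                      # pyramid level: L x L grid
        for i in range(L):                # cell index along x
            for j in range(L):            # cell index along y
                in_cell = ((xy[:, 0] * L // width == i) &
                           (xy[:, 1] * L // height == j))
                cell = codes[in_cell]
                pooled.append(cell.max(axis=0) if len(cell)
                              else np.zeros(codes.shape[1]))
    return np.concatenate(pooled)
```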
Abstract: As information tags, 2D bar codes can easily be accessed by taking a picture or scanning. They possess the advantages of large data capacity, high decoding reliability, and a wide encoding range. Therefore, they have been widely used worldwide with the growth of the Internet in recent years. The quick response (QR) code is a 2D bar code. Compared with other 2D bar codes, the QR code has additional features, such as high reading speed, large data density, and small occupied space. The QR code can encode arbitrary text strings, emails, hyperlinks, phone numbers, and so on. To obtain the embedded messages, mobile devices are used to capture the QR code, and the information is acquired through a QR code reader. Given the popularity of smartphones, the QR code has become the most widespread 2D bar code. A common use of the QR code is to encode URLs so that people can scan a QR code to load a website on a cell phone instead of typing in a URL. QR codes are encoded using Reed-Solomon error-correcting codes; therefore, a QR scanner does not have to read every pixel correctly to decode the content. This property allows part of the code (less than the maximum amount that the error-correction algorithm can fix) to be deliberately corrupted to render an image. This high fault tolerance makes QR codes popular and widely used in many commercial applications. In a conventional QR code, the modules are rendered as black and white square patterns, which can easily be identified by machines. However, these noisy black and white patterns reveal no intuitive information about the QR code when viewers look at them without the aid of scanning software; the patterns also seriously degrade the aesthetic appeal of the carrier. Although the aesthetics and visual significance of the code are unimportant for scanning purposes, they do matter in advertising layout and can provide valuable brand distinction.
Considering that QR codes often occupy a non-negligible display area in print media, the demand for visually appealing QR codes is increasing. Visually appealing codes incorporate high-level visual features, such as colors, letters, illustrations, or logos. Researchers have attempted to endow the QR code with aesthetic elements, and QR code beautification has been formulated as an optimization problem that minimizes visual perception distortion subject to an acceptable decoding rate. However, the visual quality of the QR codes generated by existing methods still requires improvement. The key challenge is the lack of a proper understanding or analytical formulation capturing the stability of QR codes under variations in lighting, camera specifications, and even perturbations to the QR codes themselves. Patented and ill-documented algorithms employed to read QR codes cause further difficulties. Consequently, existing approaches are mostly ad hoc and often favor readability at the cost of reduced visual quality. This work presents an algorithm that visually encodes a QR code by synthesizing the conventional QR code with a theme image. This task is fulfilled by dividing the theme image into equal-sized non-overlapping blocks and modifying the average luminance of each block toward its corresponding module type in the QR code by applying a well-designed Gaussian modulating function. In the Gaussian modulating function, the standard deviation is dynamically determined according to the smoothness of the corresponding module block. The brightness of the central region of a modified module changes gradually and smoothly outward from the center, which makes the result consistent with the human visual system. In addition, the size of a module's brightness-sensing region can be adjusted according to the application scenario and the sensitivity of the human eye to different noises.
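The Gaussian modulation of a module block can be sketched as follows: the block center is pulled toward the target module luminance with a 2-D Gaussian weight, so the change fades smoothly toward the block border. The sigma value and the full-pull-at-center rule are illustrative assumptions; in the paper, the standard deviation is set adaptively from the block's smoothness.

```python
import numpy as np

def modulate_block(block, module_is_dark, sigma=1.5):
    """Blend each pixel toward the QR module's target luminance with a
    Gaussian weight centered on the block (weight 1 at the center)."""
    h, w = block.shape
    y, x = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    g = np.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
    target = 0.0 if module_is_dark else 255.0
    return np.clip(block + g * (target - block), 0.0, 255.0)
```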
In the experimental stage, visually meaningful QR codes are synthesized by setting different parameters, and their correct decoding rate is tested. The optimal parameters are determined to ensure decoding reliability and make the QR code easily recognizable for humans. Furthermore, the correct decoding rate is tested using different mobile devices and decoding applications. Experimental results show that even when the performance of a mobile device is poor, the correct decoding rate remains satisfactory. Compared with a similar method of synthesizing visual QR codes in the literature, the proposed method exhibits good performance in terms of the visual quality of the generated visual QR code and the response time in accessing the QR code content. The proposed method generates a visual QR code without referring to the encoding process of the QR code. The algorithm automatically generates a colorful QR code as long as an image, the data, and the parameters are provided. The generated code is superior to the traditional black-and-white QR code and more attractive to users because of its satisfactory identification rate. The algorithm meets the QR code standard, is compatible with any decoding system, can be integrated into corporate identification systems, and can be used in marketing, advertising, and other related areas. Applying our experimental results to advertising and industrial design would highlight the recognition of individual and corporate brands and overcome the drawbacks of black-and-white QR codes. The proposed method exerts a positive advertising effect and enhances the willingness of users to scan.
Abstract: The inherent uncertainty and randomness of random-valued impulse noise result in the indeterminacy and sensitivity of the noise threshold. Obtaining a robust noise threshold during noise reduction is therefore difficult, and the noise extraction efficiency of algorithms used for random-valued impulse reduction is seriously affected. To improve the accuracy of noise detection, this study proposes an efficient double-side impulse noise detection algorithm based on directional characteristics and neutrosophic indeterminacy information. Given that an accurate and effective weight function is important to the performance of a filter, a new bilateral filter based on neutrosophic indeterminacy and the rank-ordered absolute difference (ROAD) statistic is constructed to strengthen the weights of pixels that are similar to the current pixel and restrain the influence of pixels with high indeterminacy. For noise detection, an adaptive noise threshold function is designed according to the relation between the peak signal-to-noise ratio (PSNR) performance and the threshold; the threshold value is automatically adjusted in this function according to the noise density. After the rank-ordered logarithmic difference (ROLD) value and the neutrosophic indeterminacy of each pixel are calculated, the pixels in the polluted image are divided into three types according to the relationship between the ROLD value and the threshold. The first type comprises exceeding-threshold pixels, whose ROLD values are greater than or equal to the threshold. The second type comprises adjacent-threshold pixels, whose ROLD values are at least 0.8 times the threshold but less than the threshold. The last type comprises safe pixels, whose ROLD values are less than 0.8 times the threshold. All pixels belonging to these three types are roughly divided into noise and noise-free sets by using a switching strategy.
Pixels of the first type are temporarily classified as corrupted because their ROLD values are greater than or equal to the threshold, and all pixels of the second and third types are temporarily regarded as uncorrupted because their ROLD values are less than the threshold. To improve the accuracy of noise evaluation in the exceeding- and adjacent-threshold regions, different strategies are adopted for a second noise examination. Pixels in the exceeding-threshold region are further examined by exploiting directional ROLD statistics. Before the directional statistics are calculated, 45°, 90°, 135°, and 180° directional templates are designed, and different templates are applied under different noise densities: templates with 9 pixels are used in low-noise environments, and templates with 25 pixels are used in high-noise conditions to limit the harmful impact of noise. The smallest of the four directional ROLD values is selected as a pixel's final directional ROLD. Whether a pixel is real noise is decided by the relationship between the directional ROLD and the threshold: when the selected directional statistic is less than the preset threshold, the current pixel is regarded as a false detection and deleted from the noise set. Pixels in the adjacent-threshold region are re-examined by applying the statistical information of rank-ordered neutrosophic indeterminacy: when the ranking order of a pixel's neutrosophic indeterminacy is in the first three places of the filtering window, the pixel is reclassified as noise and removed from the noise-free set. During filtering, to enhance the weights of good pixels that possess low indeterminacy and high similarity, a new bilateral function is built based on the ROAD statistic and neutrosophic indeterminacy. The weight of a specific pixel is jointly determined by its ROAD value and its degree of indeterminacy.
All pixels detected as noise are restored by applying the new bilateral function, and an iterative filtering mechanism is used for denoising to achieve a satisfactory result. Experiments show that in the exceeding-threshold region, a number of edge pixels that were incorrectly judged as noise in the first check can be recovered, and the recovery ratio of these pixels is as high as 67%. In the adjacent-threshold region, about 91% of the pixels whose statistical values are less than but close to the noise threshold can be re-extracted from the noise-free set. The proposed double-side impulse noise detection algorithm not only reduces the sensitivity of the impulse noise threshold but also improves the validity of noise judgment. Under noise ratios from 10% to 80%, the proposed algorithm is compared with seven other algorithms and is the best in terms of PSNR, mean PSNR difference, and subjective quality. The double-side impulse noise detection algorithm makes full use of the image's directional characteristics and neutrosophic indeterminacy information during noise checking. It efficiently reduces the false-detection rate in the exceeding-threshold region and the miss rate in the adjacent-threshold region; therefore, many edge pixels are not damaged. The double-side detection algorithm significantly decreases the sensitivity of the noise threshold and enhances the correctness of noise detection. Furthermore, the new bilateral filter improves the accuracy of a pixel's weight by using two features to measure pixel similarity. In short, the proposed filter based on double-side impulse noise detection and a novel bilateral function fully considers uncertainty and directional information, so the edges and details of the image are well protected and efficiently recovered.
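The ROAD statistic that drives the bilateral weights above can be sketched as follows. This is the standard formulation (the sum of the m smallest absolute differences between a pixel and its 8 neighbours); the ROLD variant used in the paper applies a logarithmic transform to the differences first.

```python
import numpy as np

def road(img, m=4):
    """Rank-ordered absolute difference: sum of the m smallest absolute
    differences between each pixel and its 8 neighbours. A large ROAD
    value suggests an impulse; small values suggest clean or edge pixels."""
    img = img.astype(float)
    pad = np.pad(img, 1, mode='edge')
    h, w = img.shape
    diffs = []
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            diffs.append(np.abs(pad[1 + dy:1 + dy + h, 1 + dx:1 + dx + w] - img))
    diffs = np.sort(np.stack(diffs), axis=0)   # rank the 8 differences
    return diffs[:m].sum(axis=0)
```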
Abstract: In the traditional total variation method, the removal of Poisson noise causes the “staircase effect” in flat areas of the image and blurs image edges. In practice, laser radar, satellite remote sensing, and medical CT imaging are based on light quantum counting systems, and the interference in image acquisition is essentially quantum noise subject to the Poisson distribution. Effectively suppressing the “staircase effect” while protecting image edges has thus become a key problem in variational Poisson denoising. To solve this problem, a new adaptive fractional-order total variation model for Poisson noise removal based on the traditional variational method is proposed. The new model performs adaptive non-convex regularization based on an analysis of the Poisson noise distribution characteristics. Compared with the original model, the new non-convex regularization model can adjust the regularization coefficient adaptively according to the characteristics of different regions of the image; the image edges are therefore maintained. Regularization in the traditional total variation model involves the first-order discrete differential vector. Owing to the bounded-variation characteristic of the first-order differential, the “staircase effect” easily arises in flat areas during denoising. To suppress the “staircase effect”, the new model uses the fractional-order discrete differential vector, which better matches the characteristics of image information, and replaces the first-order discrete differential vector with the fractional-order one in the regularization term. Given that the regularization of the new model is non-convex and the discrete differential is a fractional-order differential vector, the traditional partial differential equation approach and the Chambolle projection algorithm cannot quickly and effectively obtain a numerical solution for the new model.
For this reason, a more efficient numerical solution is proposed by combining the iterative method and the weighted primal-dual method. Numerical results show that the new model is superior to the traditional total variation Poisson denoising model: the edges of the image are well protected, and the “staircase effect” is effectively suppressed. Taking the two classic images Peppers and Lena as examples, the peak signal-to-noise ratio (PSNR) of the new model on the Peppers image is increased from 28.98 to 30.24 compared with the traditional model, the signal-to-noise ratio (SNR) is increased from 15.01 to 16.31, the structural similarity (SSIM) is increased from 0.77 to 0.87, and the mean square error (MSE) is decreased from 82.24 to 61.52. On the Lena image, the PSNR of the new method is increased from 29.08 to 29.62, the SNR is increased from 14.55 to 15.08, the SSIM is increased from 0.78 to 0.83, and the MSE is reduced from 80.37 to 70.97 compared with the traditional model. In addition, the numerical solution proposed in this study exhibits rapid convergence and low complexity compared with traditional numerical solutions. Again taking the classic Lena image as an example, the convergence time of the algorithm is reduced to 0.056 seconds, compared with that of the partial differential equation, Chambolle projection, and other traditional numerical solutions (e.g., 0.5 and 0.1 seconds). Experimental results demonstrate the feasibility of the numerical method and model proposed in this study. The model and numerical solution are superior to traditional variational methods in terms of PSNR, SNR, MSE, and visual effect. Numerical experiments on several representative images show that the new model and numerical solution possess good universality. The new model effectively suppresses the “staircase effect” and enhances the edge information of the image; it can effectively remove noise in radar and medical images.
The fractional order in the new model is a fixed value. Different regions of an image would clearly benefit from different fractional orders, so the effect of using a single fixed order for the entire image is subject to further improvement.
Keywords: Poisson noise; adaptive; fractional-order; total variation; primal-dual; edge information
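The fractional-order discrete differential vector used in the regularization above is typically built from Grünwald-Letnikov binomial coefficients. A minimal sketch, assuming a simple truncated backward-difference form (the function names and truncation length are illustrative, not the paper's implementation):

```python
def gl_coefficients(alpha, n):
    """First n Grünwald-Letnikov coefficients c_k = (-1)^k * C(alpha, k),
    computed with the recurrence c_k = c_{k-1} * (k - 1 - alpha) / k."""
    coeffs = [1.0]
    for k in range(1, n):
        coeffs.append(coeffs[-1] * (k - 1 - alpha) / k)
    return coeffs

def fractional_diff(signal, alpha, n_terms):
    """alpha-order backward difference of a 1D signal, truncated to
    n_terms; for alpha = 1 this reduces to the first-order difference."""
    c = gl_coefficients(alpha, n_terms)
    out = []
    for i in range(len(signal)):
        out.append(sum(ck * signal[i - k] for k, ck in enumerate(c) if i - k >= 0))
    return out
```

For alpha = 1 the coefficients collapse to (1, -1, 0, ...), recovering the first-order discrete difference of the traditional model; a fractional alpha spreads the stencil over several neighbors, which is what softens the staircase effect in flat regions.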
Abstract: Medical ultrasound imaging, CT, MR, and X-ray imaging are four modern medical imaging techniques. Medical ultrasound imaging techniques are ultrasonic-based diagnostic imaging approaches used to visualize subcutaneous body structures, such as muscles, vessels, tendons, joints, and internal organs. Compared with other imaging techniques, medical ultrasound imaging is widely used in clinical diagnosis, especially for pregnant women and fetuses, because it is non-invasive, inexpensive, convenient, and applicable in real time. However, owing to the ultrasonic imaging principle, ultrasonic images are inevitably corrupted by speckle noise during generation, which not only reduces image quality but also makes the identification and analysis of image details highly difficult. In this study, an improved non-local means (NLM) image denoising algorithm based on the noise model of the ultrasonic image is proposed. A statistical model of speckle noise is obtained from the probability distribution of the ultrasonic image. Then, the Bayesian formula and the speckle noise model are utilized to improve the weight function of the NLM filtering algorithm. The weight function of the traditional NLM algorithm is based on the Gaussian distribution, so it suppresses Gaussian noise well but is unsuitable for speckle noise. In this study, the weight function is improved on the basis of the speckle noise model to make the algorithm applicable to ultrasonic images. The algorithm preprocesses the image according to the characteristics of the proposed weight function by using pre-defined thresholds. If the average gray value of the image is greater than 155, then the image is processed directly. If the average gray value of the image is less than 100, then the anti-colored (inverted) image is used for denoising. 
If the image has an average gray value of 100 to 155, both the original and anti-colored images are processed, and the average of the two results is used as the final result. This step gives the algorithm a good denoising effect. Afterward, different sampling intervals are utilized to subsample the image; each sampling interval must be smaller than the similarity window size in the NLM algorithm. For each pixel in a sampled block, the filtered value is calculated with the improved NLM algorithm. If a pixel lies in the intersection of two sampled blocks, the final estimated value of the pixel is calculated as the weighted average of the filtered values in the two sampled blocks. After all the pixels are calculated, de-speckling performance in terms of filtering time, peak signal-to-noise ratio (PSNR), mean squared error (MSE), and mean structural similarity (MSSIM) at different sampling intervals is analyzed to optimize the sampling interval so that the algorithm can reduce noise while reducing the processing time. Finally, the optimized sampling interval and the improved NLM algorithm are applied to ultrasonic image denoising. The search window and similarity window sizes are fixed to 11×11 and 5×5, respectively, in the optimized Bayesian NLM (OBNLM) algorithm and the proposed algorithm. The optimal sampling interval is fixed to 3 according to the experimental results. Experiments on phantom images and real 2D ultrasound datasets show that the proposed algorithm outperforms other well-accepted methods, including the traditional NLM algorithm, OBNLM, the non-local total variation (NLTV) algorithm, and the speckle-reducing anisotropic diffusion filter (SRAD), in terms of objective and subjective evaluations (e.g., MSE, PSNR, MSSIM, and computational time). 
The images filtered with the proposed algorithm have a higher PSNR value than those filtered with the other de-speckling algorithms, which means that the proposed algorithm preserves image detail better and that the filtered image has edges similar to those of the noise-free image. A comparison of MSE and MSSIM values indicates that the proposed algorithm has lower MSE values and higher MSSIM values than the others; that is, it better preserves the structural information of the original image. With regard to computation time, the proposed algorithm does not demonstrate superiority over all methods, but it is almost nine times faster than the traditional NLM algorithm. An experiment is also conducted on real 2D ultrasound images, and the results show that the proposed algorithm provides a better visual effect than other well-accepted methods. Speckle noise reduces the quality of ultrasonic images and limits the development of automatic diagnostic technology. According to the speckle noise model of ultrasonic images, the weight function of the NLM algorithm is optimized to make the algorithm suitable for ultrasonic image denoising. Experimental results show that the proposed algorithm outperforms the other algorithms and is suitable for ultrasonic image denoising.
Keywords: despeckle; non-local means; speckle noise; ultrasonic image; subsample; weight function
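The gray-level preprocessing rule described above (direct filtering above 155, filtering the anti-colored image below 100, and averaging both results otherwise) can be sketched as follows; the function names are mine, and the image is a plain nested list for illustration:

```python
def preprocess_choice(image):
    """Decide how to feed an 8-bit grayscale image (nested lists)
    to the improved NLM filter, following the 155/100 rule."""
    pixels = [p for row in image for p in row]
    mean = sum(pixels) / len(pixels)
    if mean > 155:
        return "original"        # bright image: filter directly
    if mean < 100:
        return "inverted"        # dark image: filter 255 - I
    return "average_of_both"     # mid-range: filter both, average results

def invert(image):
    """Anti-colored (negative) image used for dark inputs."""
    return [[255 - p for p in row] for row in image]
```

The inversion step matters because the weight function derived from the speckle model behaves differently at low gray levels; averaging both passes in the mid-range hedges between the two regimes.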
Abstract: Haze is a common natural phenomenon formed by particles suspended in the atmosphere (e.g., water droplets and dust). In foggy weather, images obtained outdoors lose contrast and exhibit color distortion. Such images are therefore difficult to use in computer vision applications, such as object detection and target tracking. One reason images captured outdoors exhibit these problems is that the reflected light received by the camera is attenuated; another is that the irradiance from the imaged objects blends with the atmospheric light scattered by particles. Effective methods must therefore be used to remove haze. Single-image haze removal has exhibited remarkable progress recently. Dark channel prior is one of the valid methods of haze removal, and most images processed with it produce good results. However, the dark channel prior method uses soft matting to refine transmission maps, which increases the complexity of the algorithm. A guided filter can instead be used to optimize the transmission maps in dark channel prior. Even so, dark channel prior may still lead to the problem of cross color because the estimated transmission of sky or bright-object regions in a hazy image is undervalued. Similarly, most existing image dehazing methods cannot handle this problem well. To resolve this deficiency, a method is proposed in this study to remove haze from an image through dark channel prior and a guided filter. First, with the model of dark channel prior, a coarse estimate of the atmospheric veil is obtained. Second, the atmospheric veil in sky or bright-object regions of the hazy image is corrected by introducing a correction function because dark channel prior is inapplicable to bright regions. Third, to smooth edges and retain the detail information of the image, a guided filter is utilized to optimize the coarse atmospheric veil. 
The initial transmission map is obtained from the optimized atmospheric veil and is itself optimized by the guided filter. Finally, the optimized transmission map and the estimated atmospheric light are used to obtain the restored image. To demonstrate the effectiveness of the proposed method, several classic images are used in experiments. Peak signal-to-noise ratio (PSNR) and mean squared error (MSE) are adopted to measure the degree of distortion of the experimental results. The results show that the haze-free image recovered with the proposed method is better in non-bright regions and retains the original color of bright-object regions compared with Tarel's, He's, Meng's, and Jiang's methods. In general, the proposed method presents the minimum distortion. Compared with He's method, the PSNR of the proposed method is increased by 0.6005 dB, the MSE is reduced by 0.0026, and the operation time is reduced by 29.6220 s for an image with a size of 460×300. This study proposed a method to remove haze from an image that includes sky or bright-object regions. The method uses dark channel prior and a guided filter. Subjective and objective evaluations show that the proposed method produces good results for sky and bright-object regions and addresses the problem of cross color in bright regions caused by dark channel prior. Compared with that of He's method, the operation time of the proposed method is shorter.
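The coarse atmospheric-veil estimate in the first step rests on the dark channel: the per-pixel minimum over the RGB channels, followed by a local minimum filter. A minimal sketch on nested lists (the patch size and function name are illustrative, not the paper's code):

```python
def dark_channel(image, patch=3):
    """Dark channel of an RGB image (H x W x 3 nested lists):
    per-pixel min over the three channels, then a local min filter
    over a patch x patch neighborhood (clipped at the borders)."""
    h, w = len(image), len(image[0])
    mins = [[min(image[y][x]) for x in range(w)] for y in range(h)]
    r = patch // 2
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = min(mins[j][i]
                            for j in range(max(0, y - r), min(h, y + r + 1))
                            for i in range(max(0, x - r), min(w, x + r + 1)))
    return out
```

In haze-free regions this value is close to zero, which is exactly the assumption that breaks down in sky and bright-object regions and motivates the correction function above.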
Abstract: Many local features, such as corners and inflection points, exist in images. The corner is a basic image feature; it is usually defined as a point where at least two edges intersect, a point of maximal curvature, or a point around which a significant change in intensity occurs in all directions. Many corner detectors are available, and existing approaches can be broadly classified into edge-based, model-based, and gray-based methods, each with its respective advantages and disadvantages. Model-based corner detectors detect corner points by matching image patches to predefined templates; however, predefined templates cannot easily cover all corner types in natural images. Gray-based corner detection algorithms measure the local intensity variation of an image to find corners; these methods are sensitive to local variation and not robust to noise. Edge-based methods extract the edges from the input image and analyze the edge shape to find corners; these approaches cannot fully exploit image information. Corner detection is a challenging task in image processing and computer vision systems, such as object recognition, object tracking, simultaneous localization and mapping (SLAM), and pattern matching. Therefore, the performance of the corner detector is important. To improve detection performance, this paper presents a new corner detector that combines the edge contour and gray information of the image and utilizes the consistency of the edge pixels with the log-Gabor gradient direction. According to the definition of “corner,” intensities around a corner change sharply in every direction, and the gradient direction at a corner differs significantly from those of adjacent pixels. In contrast, the gradient directions of adjacent edge pixels are the same and perpendicular to the ridge of the edge. This study uses this characteristic to construct a new corner measure. 
The proposed algorithm employs the Canny edge detector to detect and extract the edge map of an input image. Then, the imaginary parts of log-Gabor filters are used to smooth the edge pixels along multiple directions, and the corresponding gradient directions of the pixels are determined. The obtained gradient directions are used to construct the new corner measure. Afterward, both the corner measure and an angle threshold are used to remove false and weak corners. The proposed detector is compared with three corner detectors on planar curves and under affine transforms. We also evaluate the performance under Gaussian noise degradation. To evaluate the performance of the four detectors on planar curves, two published test image shapes of different sizes are selected. Ten different common images from standard databases are also used to evaluate the performance under affine transforms and Gaussian noise degradation. Average repeatability and localization error are the two evaluation criteria. Average repeatability measures the average number of repeated corner points between affine-transformed and original images. Localization error measures the localization deviation of the repeated corners. In the simulation experiments, the average rankings of the four approaches are as follows: CPDA, 2.00; Harris, 3.33; He and Yung, 2.83; and the proposed method, 1.67. Experimental results show that the proposed method presents excellent performance in terms of average repeatability and localization error under affine transforms and Gaussian noise degradation. The number of false and missed corners in the published test images is less than that of the three other corner detectors in the experiments. High computational complexity is the shortcoming of the proposed method. Edge-based corner detection algorithms mostly depend only on the edge shape of the image without considering changes in image gray levels. 
Gray-based corner detection algorithms consider only the gray information of the image. The proposed method considers both the image edge shape and the gray-level changes. The imaginary parts of log-Gabor filters are used to smooth the edge pixels along multiple directions. Meanwhile, the consistency of the gradient directions of the edge pixels is used to construct the corner measure. Experimental results show that the proposed algorithm has good stability and corner detection performance. To address the high computational complexity of the proposed method, hardware acceleration, such as an embedded processor or FPGA controller, can be adopted to achieve real-time processing, and further algorithmic optimization will be considered in future work. Meanwhile, the proposed method can be applied to image matching.
Keywords: feature detection; corner detection; contour-based; log-Gabor filters; gradient direction consistency
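The corner measure above exploits the fact that adjacent edge pixels share a gradient direction while a corner breaks that consistency. A toy version of such a measure, not the paper's measure (the angle-wrapping convention and the name are my assumptions):

```python
import math

def direction_inconsistency(angles):
    """Corner measure for an edge pixel: maximum absolute angular
    deviation (radians, wrapped to [0, pi]) of its neighbors'
    gradient directions from the center pixel's direction.
    Near zero along a straight edge, large at a corner."""
    center = angles[len(angles) // 2]
    worst = 0.0
    for a in angles:
        d = abs(a - center) % (2 * math.pi)
        d = min(d, 2 * math.pi - d)   # wrap to [0, pi]
        worst = max(worst, d)
    return worst
```

Thresholding such a measure, combined with an angle threshold, mirrors the false/weak-corner removal step described above.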
Abstract: Color image segmentation is an important image analysis technology that has significant applications in image recognition systems. The quality of image segmentation directly affects subsequent image processing. However, color images in real life are difficult to segment precisely because of noise, uneven color, and weak boundaries. This study proposed a watershed segmentation algorithm based on homomorphic filtering and morphological hierarchical reconstruction. By combining the advantages of homomorphic filtering, morphological operations, and the watershed transform, the quality of color image segmentation is improved. The watershed transform is widely used in image segmentation because of its low computational burden, high accuracy, and continuous contour extraction. However, owing to irregular regions and noise in an image, segmentation relying solely on the watershed transform easily produces a large number of false contours. To improve the quality of watershed segmentation, this study adopted homomorphic filtering and morphological reconstruction. First, the proposed algorithm used the Sobel edge operator to compute the gradient of each color component according to the image's R, G, and B values. The maximum value was selected as the gradient of the color image. After the gradient map of the color image was extracted, it was modified through homomorphic filtering using the Fourier transform. Filtering highlights the foreground contour information and removes fine texture noise. Irregular details and a small amount of noise still existed in the gradient image after filtering, especially at the boundaries and in the background, but the morphological reconstruction operators addressed this shortcoming. The modified gradient map was then reconstructed hierarchically by using open and close morphological reconstruction operators. 
According to the cumulative distribution function of the gradient map and the distribution information of the gradient histogram after filtering, a formula to calculate the number of gradient layers was provided, and the sizes of the morphological structuring elements, which decreased as the gradient value increased in each layer, were calculated adaptively. Finally, the standard watershed transform was applied to the reconstructed gradient map, and image segmentation was realized. To verify the effectiveness of the algorithm, an experiment was conducted in which four color images with different features were segmented. The proposed algorithm effectively restrained over-segmentation and maintained weak boundaries, so its segmentations were more accurate than those of other watershed algorithms. To objectively evaluate the performance of the different segmentation methods, the experimental results were quantified through an unsupervised evaluation of the segmentations. A composite index combining regional consistency and diversity measures was applied. The evaluation index values of our algorithm on the four test images were 0.6333, 0.6656, 0.6293, and 0.6484, which were higher than the results of the other watershed algorithms; the segmentation performance of our algorithm was also better. With regard to timeliness, our algorithm requires more time than the other two algorithms, but the difference is small. The watershed transform is a widely used algorithm for image segmentation, but it often leads to over-segmentation. Many methods focus on solving this problem while ignoring the weak boundaries of images, which are also important in segmentation. This study proposed a new improved watershed algorithm for color images in which homomorphic filtering is used to preserve the weak boundaries of an image and adaptive morphological reconstruction is applied to suppress the over-segmentation of the watershed transform. 
A balance exists between under- and over-segmentation. The segmentation results of the proposed algorithm are closer to human perception of images. Whether in terms of the evaluation index or segmentation performance, the proposed algorithm performs well. The algorithm is insensitive to noise, possesses good robustness, and can be widely applied in computer vision, traffic control, biomedicine, and other targeted segmentation tasks.
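The color gradient computed in the first step (per-channel Sobel, then the per-pixel maximum across R, G, and B) can be sketched as follows; the zero-border handling and function names are simplifying assumptions:

```python
def sobel_magnitude(channel):
    """Sobel gradient magnitude of one channel (H x W nested lists),
    left at zero on the one-pixel border for brevity."""
    h, w = len(channel), len(channel[0])
    g = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (channel[y-1][x+1] + 2*channel[y][x+1] + channel[y+1][x+1]
                  - channel[y-1][x-1] - 2*channel[y][x-1] - channel[y+1][x-1])
            gy = (channel[y+1][x-1] + 2*channel[y+1][x] + channel[y+1][x+1]
                  - channel[y-1][x-1] - 2*channel[y-1][x] - channel[y-1][x+1])
            g[y][x] = (gx * gx + gy * gy) ** 0.5
    return g

def color_gradient(r, g, b):
    """Color gradient map: per-pixel maximum over the three channels."""
    gr, gg, gb = sobel_magnitude(r), sobel_magnitude(g), sobel_magnitude(b)
    h, w = len(r), len(r[0])
    return [[max(gr[y][x], gg[y][x], gb[y][x]) for x in range(w)]
            for y in range(h)]
```

Taking the channel-wise maximum keeps a boundary visible even when it appears in only one color component, which is why weak color edges survive into the gradient map that the watershed transform later floods.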
Abstract: Visual tracking is an important field in computer vision and is applied in various domains. Although numerous visual tracking methods have been developed in the past several decades, many challenging issues (e.g., occlusions, illumination changes, and background clutter) still affect the tracking performance of these methods. Inspired by sparse representation applied in face recognition, the L1 tracker based on sparse representation was proposed by Mei et al. The L1 tracker has good robustness toward partial occlusion but is prone to model drift and is time-consuming. To address these two problems, this study proposes a tracking method based on discriminative sparse representation. Considering the interference of background and occlusion information, a discriminative sparse representation model is proposed. The proposed model exploits the sparseness of the coefficients associated with the target and background templates so that the candidate targets can be represented accurately. The sparseness of the coefficients associated with trivial templates makes the proposed tracker robust to partial occlusion. By using the coefficients associated with trivial and target templates, the observation likelihood model adopted in this study eliminates the interference of background information and leads to improved tracking results. A fast sparse representation algorithm is designed to increase the tracking speed and is used to calculate the coefficients of the discriminative sparse representation model. In the first stage, the proposed algorithm uses the learned iterative shrinkage and thresholding algorithm (LISTA) to calculate the coefficients associated with the target templates. In the second stage, the proposed algorithm uses the soft shrinkage operator to calculate the coefficients associated with the trivial templates. Based on block coordinate optimization theory, the above optimization procedure is iterated to obtain good sparse representation coefficients. 
Under the particle filter framework, the tracking task is accomplished with the proposed model and the fast solution algorithm. The proposed tracker is tested on eight sequences, namely, FaceOcc1, FaceOcc2, David3, Dudek, Singer1, Car4, Jumping, and CarDark. The strength of the proposed tracker is analyzed by comparing it with the L1, L1APG (L1 tracker based on the accelerated proximal gradient), SP, and L1L2 trackers. The challenges in these sequences include occlusion, in-plane rotation, out-of-plane rotation, target appearance variations, illumination changes, camera motion, scale changes, motion blur, and background clutter. The selected state-of-the-art trackers, which are used to demonstrate the effectiveness of the proposed tracker, are all based on sparse representation; the L1APG, SP, and L1L2 trackers are improvements of the L1 tracker. For robustness evaluation, qualitative and quantitative experiments are conducted. The qualitative comparison shows that the proposed tracker overcomes various challenging issues during tracking. For the quantitative comparison, a precision plot is used to analyze the performance of the proposed tracker. With the location threshold varying from 0 to 50 pixels, the precision plot of the proposed tracker is better than those of the others on the eight sequences. In terms of computing speed, the proposed algorithm significantly reduces the computational cost of sparse representation. The time for solving an image patch is 0.152 s and 0.257 s for patches with resolutions of 16×16 and 32×32, respectively. The proposed tracker consumes less time than the others in the experiment. Compared with other trackers that do not adopt background templates to construct the sparse representation model, the proposed tracker produces better tracking results. 
The proposed tracker is more robust to occlusion and other challenges, such as background clutter and appearance changes, and has a better tracking speed than the state-of-the-art trackers. Thus, trackers based on the proposed method can be used in many engineering applications, such as video surveillance, medical diagnosis, and athletics. The adopted method for updating the target templates has low time consumption but may sometimes introduce interference information into the tracker. Thus, a more effective method of updating target templates needs to be developed in the future.
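The second-stage update above applies the soft shrinkage (soft-thresholding) operator to the trivial-template coefficients. Its standard closed form is sign(x) * max(|x| - lambda, 0); a minimal sketch (function names are mine):

```python
def soft_shrink(x, lam):
    """Soft-thresholding of a scalar: sign(x) * max(|x| - lam, 0)."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0

def shrink_vector(v, lam):
    """Elementwise soft shrinkage, as used in one step of the
    trivial-coefficient update in a two-stage solver like the one above."""
    return [soft_shrink(x, lam) for x in v]
```

Zeroing small coefficients while shrinking large ones is what keeps the trivial-template part sparse, so only genuinely occluded pixels receive nonzero trivial coefficients.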
Abstract: Image processing has been widely used in various fields owing to the rapid development of computer vision. Accurate and robust image feature extraction is the premise of image analysis and recognition in image processing. Many state-of-the-art feature description algorithms are available, and the local intensity order pattern (LIOP) is one of them. However, in existing LIOP-based feature description algorithms, the information between sampling points is not fully considered during weight calculation, and redundant intensity order patterns exist during the construction of the feature descriptor. These conditions result in an inaccurate feature descriptor. To solve this problem, an improved algorithm is proposed in this study. The algorithm maximizes the use of the information between sampling points and eliminates redundant patterns. The LIOP algorithm shows that a feature descriptor is obtained by accumulating the weight at the index value of the corresponding order pattern. Therefore, the weight of each pattern affects the precision and accuracy of the feature descriptor, and the information on the sampling points is crucial. First, investigation of the local information of the sampling points indicates that the information on the local patch is not fully utilized and that redundant patterns exist. Therefore, an algorithm that calculates the weight of the feature descriptor is constructed by combining the order structure information of the sampling points. Second, as the number of sampling points increases, the patterns increase rapidly, and the dimension of the feature descriptor expands rapidly; these occurrences result in the curse of dimensionality. The relationship between the intensity order pattern and its corresponding weight is then analyzed. The contribution of the low-weight patterns to the feature descriptor is small because these patterns are determined by the values of the sampled pixel points. 
High-weight patterns contribute significantly to the feature descriptor, whereas noise causes the feature descriptor to become inaccurate. Analysis reveals the presence of redundant patterns, which need to be removed to make the reduced feature descriptor robust. Therefore, the redundant patterns are eliminated when constructing the feature descriptor. The obtained feature descriptor makes full use of local information and eliminates redundant patterns. The proposed algorithm is evaluated on a standard dataset (the Oxford dataset) and four other complex-illumination images. A 132-dimensional feature descriptor is obtained. Compared with the original LIOP algorithm, the proposed algorithm can effectively improve the precision-recall curve and description ability without increasing the feature dimensions. In addition, the robustness of the feature descriptor is enhanced with respect to monotonic intensity and rotation changes. Therefore, the improved feature descriptor demonstrates better performance than the original feature descriptor. The proposed algorithm for constructing a feature descriptor makes full use of the difference and structural information between sampling points and eliminates redundant patterns. The performance and robustness of the feature descriptor are improved, and its dimension is reduced. The algorithm obtains high identification accuracy even under complex illumination.
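In LIOP-style descriptors, each set of N sampled neighbor intensities is mapped to the index of its rank permutation, and the descriptor accumulates a weight at that index; with N samples there are N! possible patterns, which is the dimension growth noted above. A minimal sketch of the pattern-to-index mapping (tie-breaking by sample position is my simplifying assumption):

```python
from itertools import permutations

def order_pattern_index(samples):
    """Map the intensity order pattern of N sampled values to the
    index of its rank permutation in lexicographic order."""
    n = len(samples)
    # permutation: positions of the samples sorted by increasing intensity
    ranks = tuple(sorted(range(n), key=lambda i: (samples[i], i)))
    table = {p: idx for idx, p in enumerate(permutations(range(n)))}
    return table[ranks]
```

Because the mapping depends only on the relative order of intensities, the resulting index (and hence the descriptor bin) is invariant to monotonic intensity changes, which matches the robustness property reported above.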
Abstract: Studies on handwritten Chinese characters, such as those on signature verification and text recognition, have been conducted for many years. The skeleton is a key element in these studies: it removes redundant information while retaining the complete topological structure. Using a thinning algorithm to extract a skeleton from a handwritten Chinese character image is a traditional approach. However, distortions exist in the extracted skeleton, primarily because complex areas are neither well detected nor well processed. Complex areas are the intersections and junctions of strokes. Because characters are saved as static images, a computer cannot recognize that these areas contain more than one stroke; it still regards them as a whole, so the thinning algorithm does not perform well there. To solve the distortion problem, this study proposes a skeleton extraction algorithm for handwritten Chinese characters based on the optimum local correlation degree. A simple and effective method to extract complex areas is designed. This method uses a thinning algorithm to obtain the original skeleton. The points on the skeleton are classified as end, common, and complex points. Complex areas are extracted by detecting connected complex points with an eight-neighbor window. Afterward, the information on the complex areas is used to modify the original skeleton. The modification algorithm is based on a split-and-reconstruction strategy. The skeleton is split into several stroke segments when all complex areas are removed, and the distortions are eliminated in the removal. The reconstruction step focuses on the reconnection of stroke segments; it analyzes the relationships among stroke segments to restore the skeleton. The directional relationship is considered first. The slope between the two end points of a segment may not accurately represent the correct direction because stroke segments are not always straight. 
Sub-segments adjacent to a complex area can provide the required directional information. In most cases, two stroke segments that are originally connected possess similar directions. However, in several situations, the direction alone is insufficient for determining whether two stroke segments belong to one natural stroke. Consequently, the curvature relationship should also be considered. A concept of local correlation degree is proposed based on the relationship of direction and curvature between sub-segments. The correlation degree is designed to be sensitive to changes in direction. The correlation degrees of any two stroke segments in one complex area are calculated. When two stroke segments share the optimal local correlation degree, they are regarded as a pair of continuous segments. The connection step uses interpolation to restore the removed part between continuous segments. Discontinuous segments are given a proper extension to prevent an incorrect connection. By connecting the stroke segments, the split skeleton is reconstructed, and the distortions are modified. Twenty people are asked to write 600 Chinese character samples for the experiment using different pens. All images are denoised and binarized. The use of the eight-neighbor window to detect complex areas in the skeleton provides a good effect. The number of detected complex areas in the 600 samples is close to the theoretical value, whereas that obtained with the contour method is 2.5 times the theoretical value. Most distortions are modified with the local correlation degree, and the reconstructed skeleton approximates the real topology. With the standard skeleton as a criterion, the accuracy of skeleton extraction is 98.41%. The proposed skeleton extraction algorithm for handwritten Chinese characters uses a split-and-reconstruction strategy, with reconstruction based on the optimum local correlation degree. The proposed method has two main advantages over other methods. 
First, complex area detection is considerably improved. Other methods detect complex areas mainly through the analysis of turning points on the contour. Unlike these methods, the proposed method implements detection directly from the skeleton; it is simple and avoids excessive detection. Second, the skeleton extraction algorithm provides a good result in distortion modification. Removing complex areas with distortions and reconnecting stroke segments through interpolation provide an efficient solution. The extracted skeletons retain good shapes, and the position relationships among strokes are correct. In conclusion, the proposed skeleton extraction method demonstrates high accuracy and processing speed. It is an effective and useful method for applications dealing with handwritten Chinese characters.
Keywords: handwritten Chinese character; thinning; skeleton distortion; complex area; local correlation degree
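The end/common/complex classification of skeleton points above is commonly done by counting eight-neighbors on the one-pixel-wide skeleton. The exact rule below (1 neighbor = end, 2 = common, 3 or more = complex) is an assumed convention consistent with the description, not necessarily the paper's:

```python
def classify_skeleton_point(skel, y, x):
    """Classify a skeleton pixel of a binary grid (nested lists of 0/1)
    by its number of eight-neighbors: <=1 -> 'end', 2 -> 'common',
    >=3 -> 'complex' (candidate pixel of a complex area)."""
    h, w = len(skel), len(skel[0])
    n = 0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy == 0 and dx == 0:
                continue
            j, i = y + dy, x + dx
            if 0 <= j < h and 0 <= i < w and skel[j][i]:
                n += 1
    if n <= 1:
        return "end"
    if n == 2:
        return "common"
    return "complex"
```

Connected runs of "complex" pixels found this way form the complex areas that are removed before the split-and-reconstruction step.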
Abstract: Image stitching synthesizes a panoramic image from multiple successive images and is applied in many military and civil applications. A ghost problem arises in the overlap areas between two images being stitched when moving objects exist or a registration error occurs during stitching. Problems such as visible stitching lines and color inconsistency occur when the camera exposure time and illumination change during imaging. These factors may degrade the panoramic image if the images are simply fused. Improved image fusion, one of the key technologies in image stitching, can be used to solve these problems. Considering that the optimal seam algorithm is an effective method, an improved optimal seam algorithm based on differential image weighting is proposed to solve the ghost problem (seams passing through moving objects or inaccurately registered areas) in the classical optimal seam algorithm. A partition fusion algorithm based on multi-resolution fusion and weighted average fusion is presented to solve the stitching line problem caused by changes in exposure time and illumination. First, the images being stitched are mapped to a cylindrical surface after registration; the Harris corner is used to find correspondences between images. Second, the overlap areas between the image being stitched and the fused image are calculated and partitioned into three areas, namely, an optimal seam search area and two transition regions. The optimal seam search area is set to occupy three-fifths of the space, and the two transition regions occupy one-fifth each. Third, a differential image-weighted optimal seam algorithm is proposed to search for the optimal seam in the seam search region. Aside from the difference in color and structure, the metric for computing the optimal seam at each pixel is also weighted by the image difference, and the weighting coefficient is set to infinity if this difference is above a certain threshold. 
Therefore, the moving-object region can be bypassed when searching for the seam line if the image difference is large. After the metric image is computed, a dynamic programming algorithm is used to search for the optimal seam in this metric image. Finally, a mask image is generated based on the obtained optimal seam. To address the many invalid areas caused by the mapping to a cylindrical surface, image extension is adopted to fill these areas with nearby pixels before combining two images. The multi-resolution image fusion algorithm is applied to the entire overlapping region after image extension. Then, the weighted average fusion algorithm is adopted to eliminate the stitching line in the transition regions. Several image sequences at a crossing captured by a mobile phone are used to test the proposed algorithm. The proposed algorithm is then compared with the algorithm of dynamic image mosaic via SIFT and dynamic programming and with the stitching algorithm implemented in OpenCV. Many moving objects (e.g., cars and pedestrians) exist in the images. Experimental results demonstrate that the probability of the optimal seam passing through moving objects and inaccurately registered areas is reduced by the improved optimal seam search algorithm. Even when the optimal seam cannot fully circumvent moving objects, the probability of passing through them is kept as low as possible. The stitching line is effectively eliminated by combining multi-resolution fusion with weighted average fusion, and the quality of the panoramic image produced by our algorithm is better than that obtained with multi-resolution fusion alone, especially under varying illumination. The stitching result is the same as or better than that of OpenCV, although some distortion exists in the panoramic image due to misregistration. A comparison of the computational costs of the proposed algorithm and the dynamic image mosaic algorithm is also presented. 
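The two steps above (a difference-weighted cost metric with infinite cost over large-difference pixels, followed by a dynamic-programming seam search) can be sketched as follows. This is a minimal grayscale sketch: the function names, the structure term built from image gradients, and the threshold value are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np

def seam_cost_map(img_a, img_b, diff_thresh=30.0):
    """Per-pixel seam cost in the overlap region: a color term plus a
    gradient (structure) term, weighted by the absolute image difference.
    Pixels whose difference exceeds diff_thresh (an assumed value) receive
    infinite cost so the seam bypasses moving objects and misregistered
    areas."""
    a, b = img_a.astype(np.float64), img_b.astype(np.float64)
    diff = np.abs(a - b)                              # color difference
    gy_a, gx_a = np.gradient(a)
    gy_b, gx_b = np.gradient(b)
    structure = np.abs(gx_a - gx_b) + np.abs(gy_a - gy_b)  # structure difference
    cost = (1.0 + diff / 255.0) * (diff + structure)  # difference weighting
    cost[diff > diff_thresh] = np.inf                 # bypass large differences
    return cost

def find_optimal_seam(cost):
    """Dynamic programming: accumulate the minimal cost from top to bottom
    (each pixel connects to its three upper neighbors), then backtrack to
    recover one seam column index per row."""
    h, w = cost.shape
    acc = cost.copy()
    for r in range(1, h):
        for c in range(w):
            lo, hi = max(0, c - 1), min(w, c + 2)
            acc[r, c] += acc[r - 1, lo:hi].min()
    seam = np.empty(h, dtype=int)
    seam[-1] = int(np.argmin(acc[-1]))
    for r in range(h - 2, -1, -1):
        c = seam[r + 1]
        lo, hi = max(0, c - 1), min(w, c + 2)
        seam[r] = lo + int(np.argmin(acc[r, lo:hi]))
    return seam
```

In practice the seam is searched only inside the central seam search area of the overlap, and the resulting seam defines the mask image used for blending.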
The proposed optimal seam algorithm avoids the problem of the seam line passing through moving-object regions or misregistered areas. The multi-resolution image fusion algorithm is applied to non-strictly overlapping image fusion through image extension after region partitioning, and a high-quality panoramic image is synthesized. However, the proposed algorithm has the following shortcomings: the code must be optimized to meet real-time requirements, and global registration optimization should be used to reduce distortion in the panoramic image during image stitching.
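The weighted average fusion used in the transition regions amounts to linear blending with spatially varying weights. A minimal sketch, assuming a horizontal transition strip and a left-to-right linear weight ramp (both assumptions for illustration):

```python
import numpy as np

def weighted_average_fuse(img_a, img_b):
    """Blend two aligned transition strips: the weight of img_a falls
    linearly from 1 to 0 across the columns while the weight of img_b
    rises, which smooths out the visible stitching line."""
    h, w = img_a.shape[:2]
    alpha = np.linspace(1.0, 0.0, w).reshape(1, w)  # per-column weights
    if img_a.ndim == 3:                             # broadcast over channels
        alpha = alpha[..., np.newaxis]
    return alpha * img_a + (1.0 - alpha) * img_b
```

The same idea generalizes to weights derived from the distance to the seam rather than to the strip border.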
Keywords: image stitching; optimal seam; multi-resolution fusion; weighted average fusion
Abstract: Owing to their superior characteristics, local image descriptors have been widely used in many computer vision and image processing fields, such as image matching, image classification, image search, and structure from motion. This study proposes a new local feature called the max edge orientation pattern (MEOP). First, the maximum intensity difference between the center pixel and its surrounding pixels is calculated. Second, the position and sign of the maximum intensity difference are encoded. The pixel with the maximum intensity difference marks the strongest edge of the local neighborhood. The position of MEOP describes the radial direction, and the sign indicates the sense of that direction. Compared with the local binary pattern, MEOP encodes only the maximum intensity difference. Therefore, MEOP does not change as long as the position and sign of the maximum intensity difference do not change; its robustness is high, and its anti-noise ability is strong. MEOP differs from the local binary pattern in describing the local structure of the image, yet the two patterns are complementary in expressing that structure. Local rotation-invariant coordinates are used to calculate MEOP. The rotation-invariant intensity-order space division method and multiple support regions are employed to pool MEOP and obtain a new local image descriptor, namely, the max edge orientation pattern histogram (MEOPH). Compared with the MRRID descriptor, which uses the local binary pattern, the MEOPH descriptor based on MEOP has different statistical properties and superior performance. 
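The MEOP encoding described above (position and sign of the maximum center-neighbor intensity difference) can be sketched for a single 3x3 patch as follows. The code layout `2 * position + sign` and the clockwise neighbor ordering are illustrative assumptions, not necessarily the paper's exact encoding, which also uses local rotation-invariant coordinates rather than fixed image axes.

```python
import numpy as np

def meop_code(patch):
    """Max edge orientation pattern for the center pixel of a 3x3 patch:
    find the neighbor with the largest absolute intensity difference from
    the center, then encode its position (0-7, the radial direction) and
    the sign of that difference (the sense of the direction) into a single
    code in [0, 15]."""
    center = patch[1, 1]
    # 8 neighbors in clockwise order starting from the top-left corner
    idx = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    diffs = np.array([patch[r, c] - center for r, c in idx], dtype=np.float64)
    pos = int(np.argmax(np.abs(diffs)))    # position of the strongest edge
    sign = 1 if diffs[pos] >= 0 else 0     # sense of the edge direction
    return 2 * pos + sign
```

Because only the location and sign of the single strongest difference enter the code, small perturbations of the other neighbors leave the pattern unchanged, which is the source of the claimed noise robustness.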
With the standard test image sets of the affine-invariant features research group at the University of Oxford, image matching experiments are conducted on currently popular descriptors, including SIFT, DAISY, CS-LBP, HRI-CSLTP, and MRRID. Experimental results on the standard test image sets show that MEOPH and MRRID demonstrate the best performance. The matching performance of MEOPH is better than that of SIFT, DAISY, CS-LBP, and HRI-CSLTP on all test data sets and is slightly better than that of MRRID in most cases. The matching performance of MEOPH is much better than that of MRRID in experiments wherein Gaussian noise is added to the standard test image sets. In addition, MEOPH and MRRID complement each other in image matching, and matching performance is significantly enhanced by combining the two descriptors. The superior stability of MEOPH makes it suitable for local descriptor matching in complex environments. Where high discrimination is required in local descriptor matching, MEOPH can be used in conjunction with MRRID.