Abstract: Color transfer between object images has been a hot topic in recent years. Current color transfer technology is widely applied to the color adjustment of photos taken during social activities, whereas traditional color transfer technology is used for general image processing. This paper presents an adaptive color transfer algorithm that targets skin colors. First, users specify the area that needs color conversion through human-machine interaction. The YCbCr color space is first used to obtain the skin area and achieve a rough extraction of skin color data. Second, the distribution characteristics of the color components in lαβ space are used to achieve an accurate extraction of pure skin color information. Finally, color conversion is achieved by the corresponding conversion rules. Experimental data show that, compared with other algorithms, the proposed algorithm produces converted skin color features that look natural in the face region, avoids changing the color information of non-skin regions, and maintains the clarity of the original image. Results show that the algorithm has strong applicability. The algorithm processes only the selected skin information, avoids non-skin data, retains the original colors of other objects, achieves adaptive color selection and automatic color transfer, and obtains improved conversion results between significantly different skin colors.
Keywords: image processing;skin color transfer;color adjustment;skin color area;lαβ color space
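A minimal sketch of the two-stage pipeline described in this abstract, assuming typical Cr/Cb threshold values for the rough skin mask and using OpenCV's CIELAB space as a stand-in for the paper's lαβ space; the mean/std matching follows a Reinhard-style transfer rather than the authors' exact conversion rules.

```python
# Rough YCbCr skin extraction followed by per-channel statistics transfer
# restricted to the skin mask. Thresholds and the Lab approximation are
# assumptions, not the paper's exact rules.
import cv2
import numpy as np

def skin_mask_ycbcr(bgr, cr_range=(133, 173), cb_range=(77, 127)):
    """Rough skin extraction in YCbCr; threshold values are assumptions."""
    ycrcb = cv2.cvtColor(bgr, cv2.COLOR_BGR2YCrCb)
    _, cr, cb = cv2.split(ycrcb)
    return ((cr >= cr_range[0]) & (cr <= cr_range[1]) &
            (cb >= cb_range[0]) & (cb <= cb_range[1]))

def transfer_skin_color(src_bgr, ref_bgr):
    """Match per-channel mean/std of the skin pixels only (Reinhard-style)."""
    src_lab = cv2.cvtColor(src_bgr, cv2.COLOR_BGR2LAB).astype(np.float32)
    ref_lab = cv2.cvtColor(ref_bgr, cv2.COLOR_BGR2LAB).astype(np.float32)
    m_src, m_ref = skin_mask_ycbcr(src_bgr), skin_mask_ycbcr(ref_bgr)
    out = src_lab.copy()
    for c in range(3):
        s, r = src_lab[..., c][m_src], ref_lab[..., c][m_ref]
        out[..., c][m_src] = (s - s.mean()) / (s.std() + 1e-6) * r.std() + r.mean()
    out = np.clip(out, 0, 255).astype(np.uint8)
    return cv2.cvtColor(out, cv2.COLOR_LAB2BGR)   # non-skin pixels remain unchanged
```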
Abstract: To improve the robustness of super-resolution (SR) algorithms to singular data points, we propose an SR algorithm based on K-means clustering and support vector data description (Kmeans-SVDD). The proposed algorithm is composed of two stages: training and testing. In the training stage, the approximation sub-band of the training images is clustered by the K-means method, and SVDD is employed to drop singular points in each cluster. Finally, principal component analysis (PCA) is used to train the approximation and detail sub-band dictionaries in the wavelet domain. In the testing stage, the approximation sub-band of the input low-resolution (LR) image is used as the approximation sub-band of the high-resolution (HR) image. This step is based on the observation that the approximation sub-band of the HR image is similar to that of the corresponding LR image. Then, the prepared dictionaries are utilized to recover the detail sub-bands of the HR image. Finally, the inverse wavelet transform is applied to obtain the recovered HR image. The proposed Kmeans-SVDD algorithm is shown to be superior to existing algorithms, with average PSNR improvements of 1.82 dB, 0.37 dB, 0.30 dB, and 0.15 dB over bicubic interpolation, Zeyde's algorithm, the ANR algorithm, and the K-means-PCA algorithm, respectively. Extensive tests indicate that once the SVDD process is added before dictionary training, outliers are removed and the quality of the dictionary is improved. By reconstructing each sub-band separately in the wavelet domain, the influence of uncertain high-frequency information in the low-frequency image on super-resolution can be prevented; thus, reliable detail information can be restored.
Keywords: dictionary learning;sparse representation;k-means clustering;support vector data description (SVDD);super-resolution reconstruction
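A sketch of the training stage only, under assumptions about patch size, cluster count, and the SVDD parameters (a one-class SVM with an RBF kernel stands in for SVDD); the testing stage and the detail sub-band dictionaries are omitted.

```python
# Cluster approximation-sub-band patches with K-means, reject outliers per
# cluster with a one-class SVM, then learn a PCA dictionary per cluster.
import numpy as np
import pywt
from sklearn.cluster import KMeans
from sklearn.svm import OneClassSVM
from sklearn.decomposition import PCA

def train_dictionaries(hr_images, patch=6, n_clusters=16, nu=0.05, n_atoms=32):
    feats = []
    for img in hr_images:                      # img: 2-D grayscale array
        cA, _ = pywt.dwt2(img, 'haar')         # approximation sub-band
        for i in range(0, cA.shape[0] - patch, patch):
            for j in range(0, cA.shape[1] - patch, patch):
                feats.append(cA[i:i + patch, j:j + patch].ravel())
    feats = np.asarray(feats)
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(feats)
    dictionaries = {}
    for k in range(n_clusters):
        cluster = feats[labels == k]
        keep = OneClassSVM(kernel='rbf', nu=nu).fit_predict(cluster) == 1
        if keep.sum() < n_atoms:               # skip sparse clusters in this sketch
            continue
        dictionaries[k] = PCA(n_components=n_atoms).fit(cluster[keep])
    return dictionaries
```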
Abstract: Image segmentation is an important basic operation in image processing and has a direct effect on subsequent operations such as image analysis and scene understanding. The watershed algorithm is widely used to segment images because of its ability to obtain accurate boundaries. However, this algorithm tends to cause over-segmentation. Moreover, segmentation algorithms that rely only on RGB images cannot fundamentally handle regions with similar colors and textures. To solve the over-segmentation problem of the watershed algorithm and overcome the limitation of using only RGB images, a marked watershed segmentation algorithm based on RGBD images is proposed. An RGBD image is composed of an RGB image and a depth image. By using depth information, we extract the geometry of objects to assist the segmentation process. For the RGB image, the probability-of-boundary map is computed by the PB algorithm and used as the color gradient image in our algorithm. For the depth image, a depth gradient operator and a normal gradient operator are defined to measure differences in geometry. We use a circular gradient operator with eight orientations. For each orientation, we divide the disk into two half disks along the diameter of that orientation and compute the gradient from the information in the two half disks. For the depth gradient, we compute a Gaussian-weighted depth in each half disk and take the difference between the two weighted depths as the gradient. For the normal gradient, we fit a normal to the point set in each half disk and take the angle between the two normals as the gradient. The maximum gradient over all orientations is selected as the final gradient. By fusing the depth gradient image, the normal gradient image, and the color gradient image, the marker image is extracted. Thereafter, the color gradient image is modified by the marker image using the minima imposition (minimum calibration) technique, and segmentation is performed with the watershed algorithm. Our experiments are based on the NYU2 dataset provided by New York University, which includes different indoor scenes and provides ground-truth segmentation labels. Results show that by extracting markers and modifying the color gradient image, our algorithm can reduce the number of regions and effectively solve the over-segmentation problem. Furthermore, by using the geometry of objects, our algorithm obtains results closer to the ground-truth labels. Segmentation accuracy is improved, particularly for regions with similar colors and textures. In summary, we propose a new watershed segmentation algorithm for RGBD images. In addition to the color gradient image obtained from the RGB image, we extract depth and normal gradients to assist segmentation, and a marked watershed algorithm is used to obtain the final segmentation results. Results show that our algorithm reduces the number of regions and improves segmentation accuracy, particularly in regions with similar colors and textures.
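A condensed sketch of the marker-controlled fusion step, assuming a Sobel magnitude as a stand-in for the PB boundary probability and omitting the half-disk normal gradient; the fusion weights and the marker threshold are also assumptions.

```python
# Fuse a color gradient and a depth gradient, take low-gradient regions of
# the fused map as markers, and run a marker-controlled watershed.
import numpy as np
from scipy import ndimage as ndi
from skimage import filters, segmentation

def rgbd_watershed(gray, depth, w_color=0.5, w_depth=0.5, marker_quantile=0.3):
    g_color = filters.sobel(gray.astype(float))    # stand-in for the PB color gradient
    g_depth = filters.sobel(depth.astype(float))
    fused = (w_color * g_color / (g_color.max() + 1e-9)
             + w_depth * g_depth / (g_depth.max() + 1e-9))
    # Markers: connected low-gradient regions of the fused map.
    markers, _ = ndi.label(fused < np.quantile(fused, marker_quantile))
    # Watershed on the color gradient, constrained by the fused markers.
    return segmentation.watershed(g_color, markers)
```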
Abstract: A light field camera can obtain 4D light field data from stereoscopic space and generate a focal stack with one shot. Depth information can be extracted by using a focus detection function. However, generalizing to varying scenes is difficult because of the distinct response characteristics of focus detection functions. Furthermore, most existing methods lead to large defocusing errors and are therefore not robust in practical use. In this paper, we present a new depth extraction method based on light field images (i.e., focal slices and all-focus images) to obtain high-accuracy depth information. We develop a windowed focus detection function based on the gradient mean square error to extract depth information. Thereafter, we correct the defocusing errors by using a local search in the areas marked by the defocusing function. Finally, we fuse the depth images to improve accuracy. Experiments on the Lytro dataset and our own data show that our approach achieves higher accuracy with less noise than other state-of-the-art methods: precision increases by approximately 9.29%, and the MSE decreases by approximately 0.056 compared with other advanced methods. Using the windowed gradient mean square error function to extract depth information produces less speckle noise. By using the color information of the all-focus images, our approach can correct the defocusing error. Finally, the depth edges fused in an MRF framework are clear and maintain good consistency with the color image. The depth estimation of our approach is better than that of other methods for low-texture images.
Keywords: depth extraction;light field camera;stack of refocusing images;focusness detection;defocusing error
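A sketch of the windowed focus measure described above: for each focal slice, a local mean of squared gradients is computed, and the per-pixel depth label is the slice index that maximizes it. The window size is an assumption, and the defocus correction and MRF fusion stages are omitted.

```python
# Windowed gradient-energy focus measure over a focal stack.
import numpy as np
from scipy import ndimage as ndi

def depth_from_focal_stack(stack, window=9):
    """stack: list of 2-D grayscale focal slices, ordered near to far."""
    focus = []
    for sl in stack:
        gy, gx = np.gradient(sl.astype(float))
        focus.append(ndi.uniform_filter(gx ** 2 + gy ** 2, size=window))
    return np.argmax(np.stack(focus, axis=0), axis=0)   # per-pixel slice index
```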
Abstract: Focusing on the problem of visual saliency detection in images, an algorithm for saliency detection based on the background prior and multi-scale analysis is proposed. The method consists of four steps. First, the original image is decomposed into superpixels. Because the sizes of salient regions vary, the superpixel scale has a significant effect on the detection results; therefore, the image is analyzed at different superpixel scales. Second, the background region is extracted. Three rules are used for this extraction, namely, boundary, connectivity, and feature differences among superpixels, and the superpixels of the image are classified into background and foreground. Third, according to the feature differences between each superpixel and the background prior, the background-based saliency of the superpixels is calculated; similarly, the foreground-based saliency is computed. A saliency map is generated by integrating the background-based and foreground-based saliency. Finally, the saliency maps at different scales are fused to obtain the final saliency map. To verify the efficiency of the proposed algorithm, we used four datasets, namely, the MSRA-1000, ECSSD, SED, and SOD datasets, and compared the results with state-of-the-art algorithms using four indicators: precision, recall, F-measure, and mean absolute error (MAE). Experimental results show that the proposed algorithm outperforms other current popular algorithms on the MSRA-1000, SED, and SOD datasets. On the ECSSD dataset, our algorithm is comparable to the manifold ranking algorithm. The average values of precision, recall, F-measure, and MAE are 0.7189, 0.6999, 0.7086, and 0.0423, respectively. In this paper, a novel saliency detection algorithm is proposed in which the original image is analyzed at multiple scales and visual saliency is computed by using the background prior. Experimental results show that the proposed algorithm can be successfully used for salient object detection and object segmentation in natural images.
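A minimal single-scale sketch of the background-prior step, assuming SLIC superpixels, boundary superpixels as the background set, and Lab color distance as the feature difference; the connectivity rule, the foreground-based saliency, and the multi-scale fusion are omitted.

```python
# SLIC superpixels; saliency of each superpixel = distance to the nearest
# boundary (background) superpixel in Lab color space.
import numpy as np
from skimage import segmentation, color

def background_prior_saliency(rgb, n_segments=200):
    labels = segmentation.slic(rgb, n_segments=n_segments, start_label=0)
    lab = color.rgb2lab(rgb)
    n = labels.max() + 1
    means = np.array([lab[labels == i].mean(axis=0) for i in range(n)])
    border = np.unique(np.concatenate([labels[0], labels[-1],
                                       labels[:, 0], labels[:, -1]]))
    dist = np.linalg.norm(means[:, None, :] - means[None, border, :], axis=2)
    sal = dist.min(axis=1)                        # distance to nearest background superpixel
    sal = (sal - sal.min()) / (sal.max() - sal.min() + 1e-9)
    return sal[labels]                            # map superpixel saliency back to pixels
```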
Abstract: Many studies have been conducted on extensions of the B-spline curve with parameters. These extended curves have local shape controllability similar to that of the B-spline curve, and their shape can be adjusted independently of the control points. However, previous studies on this topic used global parameters, which prevents the shape of the curves from being adjusted locally. Moreover, the blending functions in several studies are not totally positive, so the corresponding curves lose the variation-diminishing and convexity-preserving properties. The purpose of this paper is to construct a curve with the convexity-preserving property, local shape adjustment, and local shape control. Using the theoretical framework of the quasi-extended function space, we first prove that the existing extended basis of the cubic Bézier curve, called the -Bernstein basis, is exactly the normalized B-basis of the corresponding space. Thereafter, we use a linear combination of the -Bernstein basis to express the extended basis of the cubic uniform B-spline curve. According to the preset properties of the curve, we deduce the properties of the extended basis and then determine the coefficients of the linear combination and the expression of the basis. The extended basis can be represented as the product of the -Bernstein basis and a conversion matrix, and we prove the total positivity of the matrix. Using this basis, we define a piecewise curve with one local shape parameter that has the same structure as the cubic B-spline curve. The total positivity of the conversion matrix implies the total positivity of the extended basis, which in turn ensures that the extended curve has the variation-diminishing and convexity-preserving properties. The locality of the shape parameter allows the shape of the curve to be adjusted locally, and the piecewise structure gives the curve local shape control. The method for constructing the extended B-spline basis with total positivity is general. Compared with most extended curves in the literature, the curve given in this paper has the variation-diminishing and convexity-preserving properties, thus providing an efficient method for conformal design.
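A schematic, under assumed notation (the basis symbol does not survive in this abstract), of the relation the construction relies on: the extended basis is obtained from the normalized B-basis through a conversion matrix, and total positivity of that matrix is what carries the variation-diminishing and convexity-preserving properties over to the curve.

```latex
% Notation here is assumed for illustration, not taken from the paper.
\[
  \bigl(B_0(t),\,B_1(t),\,B_2(t),\,B_3(t)\bigr)
    = \bigl(\beta_0(t),\,\beta_1(t),\,\beta_2(t),\,\beta_3(t)\bigr)\,M ,
  \qquad M \ \text{totally positive},
\]
\[
  C(t) = \sum_{i=0}^{3} B_i(t)\,P_i
  \quad\Longrightarrow\quad
  C \ \text{is variation diminishing and convexity preserving.}
\]
```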
Abstract: With the development of the mobile Internet, mobile augmented reality (MAR) applications are advancing rapidly. However, because storage and computation capacity are limited on mobile devices and large-scale outdoor scenes contain many images of similar building structures, these applications have mostly been restricted to indoor environments. To solve this problem, a mobile augmented reality system based on cloud image recognition is established. To reduce the probability of mismatches, we add gravity information, used as the main orientation, to the SURF and BRISK feature description algorithms; in our system, these two description algorithms are named Gravity-SURF and Gravity-BRISK. In the cloud, large-scale images and augmented information are managed by an augmented reality management system, and the VLAD algorithm based on Gravity-SURF is applied to large-scale image recognition. On intelligent terminals, different user interfaces are designed for augmented reality applications to display the augmented information; rendered models are one presentation format of this information. We propose a real-time camera tracking algorithm that combines Gravity-BRISK and optical flow. When using the Unity3D rendering engine for 3D rendering, we propose a solution for transforming the camera pose between different coordinate systems; the camera pose data are then transmitted from the Android platform to Unity3D through the Android Native Development Kit. In the experiments, we first build an image database with 4 000 images covering 800 outdoor scenes, each containing five images taken at different angles and distances; each image is associated with the local gravity information. We then compare the traditional description algorithms with the gravity-based description algorithms. The experiments show that the gravity-based description algorithms enhance the discrimination of similar features and improve matching accuracy. Finally, we test the performance and accuracy of our image recognition and tracking algorithms. The recognition experiments show that the VLAD algorithm based on Gravity-SURF improves the recognition rate. In 4G and Wi-Fi network environments, the upload time of the query image is less than 40 ms, the online recognition time is approximately 420 ms, and the augmented data download time is less than 4 s. Furthermore, the tracking experiments show that the RMS errors under rotation and scale variations are all less than 1.2 pixels, and the frame rate of the optical flow stage reaches 23 frames/s. According to these experiments, the algorithm satisfies the accuracy and real-time demands of the system. The mobile augmented reality system is suitable for complex outdoor environments. It has been applied to Google Glass and in the journalism field but is not limited to these two areas, and the augmented information of different applications can be managed efficiently. The system we established aims to promote the development of MAR applications.
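A sketch of the gravity-as-main-orientation idea. The paper applies it to SURF and BRISK (Gravity-SURF and Gravity-BRISK); here it is illustrated with OpenCV's SIFT, whose descriptor computation honors a caller-supplied keypoint angle. The gravity angle projected into the image plane (from the IMU reading and camera pose) is assumed to be available as a single value in degrees.

```python
# Detect keypoints, overwrite each keypoint's dominant orientation with the
# projected gravity angle, then compute descriptors aligned to gravity.
import cv2

def gravity_oriented_descriptors(gray, gravity_angle_deg):
    sift = cv2.SIFT_create()
    kps = sift.detect(gray, None)
    for kp in kps:
        kp.angle = float(gravity_angle_deg)    # gravity replaces the estimated orientation
    kps, desc = sift.compute(gray, kps)        # descriptors computed relative to kp.angle
    return kps, desc
```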
Abstract: A watershed segmentation method that combines the edge gradient with the distance transform is proposed for the segmentation of microscopic images. The method aims to solve the problems of cell extraction and the segmentation of overlapping cells. First, a topographic map is generated from the binary cell image by the distance transform, the local extreme points or point sets are used as foreground markers, and the first, distance-based watershed transform is performed. Then, the watershed ridge lines produced in the previous step are used as the background markers, while the foreground markers remain unchanged; the gradient magnitude is used as the topographic map, and the gradient-based watershed transform is performed to obtain the segmentation result. Before each of the two watershed transforms, a minima imposition (forced minimum) method is used to modify the topographic map so that its local minima are restricted to the selected marker regions. Using the resulting ridge lines as the background markers in the gradient watershed transform effectively separates touching objects. This method not only retains the accurate edge localization of gradient-based watershed transforms but also solves the problems of touching objects that cannot be separated or that are over-segmented. As verified by many clinical image segmentation experiments, the method can extract cells from images and segment overlapping cells automatically, thereby meeting the requirements of medical image segmentation.
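A compact sketch of the two-pass scheme with scikit-image, assuming Otsu binarization (cells darker than the background) and an arbitrary marker footprint: the distance-transform watershed supplies foreground markers and ridge lines, and the ridge lines then act as the background marker for a marker-controlled watershed on the gradient image, which plays the role of the forced-minimum modification.

```python
import numpy as np
from scipy import ndimage as ndi
from skimage import feature, filters, segmentation

def segment_cells(gray):
    binary = gray < filters.threshold_otsu(gray)            # cells assumed darker than background
    dist = ndi.distance_transform_edt(binary)
    regions, _ = ndi.label(binary)
    peaks = feature.peak_local_max(dist, footprint=np.ones((15, 15)), labels=regions)
    fg = np.zeros(gray.shape, dtype=int)
    fg[tuple(peaks.T)] = np.arange(1, len(peaks) + 1)        # foreground markers
    first = segmentation.watershed(-dist, fg, mask=binary, watershed_line=True)
    markers = fg.copy()
    markers[first == 0] = len(peaks) + 1                     # ridge lines + exterior as background marker
    grad = filters.sobel(gray.astype(float))                 # gradient topographic map
    return segmentation.watershed(grad, markers)             # markers impose the regional minima
```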
Abstract: This study explores a method of aggregation and connectivity planning for high-quality arable land and discusses the significance of arable land connectivity for realizing connectivity planning based on arable land quality grade when the Ministry of Land and Resources and the Ministry of Agriculture designate permanent farmland. From the perspective of landscape ecology, this paper discusses the concept of farmland connectivity and analyzes the threshold distance between farmland landscape plots. A morphological closing operation is used to compute the spatial connectivity of high-quality arable land. Finally, this paper analyzes the connectivity of arable land in Jincheng Town on the basis of this method. The results show that the method is highly efficient for realizing connectivity planning based on arable land quality grade, and the outcome matches local conditions. Compared with other methods, the proposed method can take both the land quality grade and the threshold value into account, thus effectively enhancing concentration and integrity while retaining the edge shapes of the plots. This method is helpful for the centralized management of arable land and provides a new approach for the spatial analysis of basic farmland.
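A sketch of the connectivity computation, assuming the quality-graded arable land has already been rasterized: plots of the chosen grade form a binary mask, a morphological closing with a disk whose radius corresponds to the connectivity threshold bridges gaps below that threshold, and connected components are then labelled. The cell size and radius are assumptions.

```python
# Morphological closing of the high-quality arable-land mask, followed by
# connected-component labelling of the resulting aggregated groups.
from scipy import ndimage as ndi
from skimage import morphology

def connectivity_groups(quality_raster, grade_threshold, gap_threshold_cells):
    mask = quality_raster >= grade_threshold                 # high-quality arable land
    closed = morphology.binary_closing(mask, morphology.disk(gap_threshold_cells // 2))
    labels, n_groups = ndi.label(closed)
    return labels, n_groups                                  # connected groups and their count
```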
Abstract: Maintaining the consistency of multi-scale spatial relationships is an important step in the conflict detection and matching of multi-scale spatial data. Existing methods mainly focus on measuring the similarity of spatial relationships between data with identical or similar resolutions and are less concerned with the consistency of spatial relationships under the collapse operation. In addition, the common concept-distance-based evaluation method cannot effectively handle multi-scale representation. Hence, a generalized consistency assessment method for multi-scale spatial relationships that considers the collapse operation is proposed. The concept of the homonymous entity and its characteristics are introduced and discussed within the scope of multi-scale spatial representation. By considering the influence of the collapse operation in generalization, three generalized consistency assessment methods are proposed based on the topological relation, the cardinal direction relation, and the distance relation. Then, a graph representing the adjacency relations of homonymous entities is adopted to implement and visualize the similarity calculation. To reduce the computational cost, the graph is simplified and decomposed into subgraphs of spatial relationships. Finally, the consistency of the multi-scale spatial representation is judged according to the spatial relationship similarities computed on the obtained relation graphs. The 1:10 000 geographical foundation data and the 1:50 000 data derived from them for a common area are used in the case study. Experimental results show that, compared with the concept-distance-based method, the proposed generalized consistency assessment method is more effective and has a wider application scope. This consistency assessment method can be used in map generalization, multi-scale spatial database building, and spatial scene matching. Our subsequent work will focus on establishing a systematic consistency assessment method from the perspectives of individual form, the overall gestalt principle, and spatial semantics.
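A toy sketch of one ingredient of the assessment only: checking whether the topological relation between a pair of homonymous entities is preserved across two scales, with shapely predicates standing in for the paper's relation model. The collapse-aware rules, the direction and distance relations, and the relation graph are all omitted.

```python
from shapely.geometry import Polygon

def topo_relation(a, b):
    if not a.intersects(b):
        return 'disjoint'
    if a.contains(b) or b.contains(a):
        return 'contains'
    return 'overlaps' if a.overlaps(b) else 'touches'

def topological_consistency(pairs_large, pairs_small):
    """pairs_*: lists of (geometry, geometry) for the same homonymous entity pairs."""
    same = sum(topo_relation(*p) == topo_relation(*q)
               for p, q in zip(pairs_large, pairs_small))
    return same / max(len(pairs_large), 1)

# Example: two adjacent parcels whose 'touches' relation survives generalization.
a1, b1 = Polygon([(0, 0), (2, 0), (2, 2), (0, 2)]), Polygon([(2, 0), (4, 0), (4, 2), (2, 2)])
a2, b2 = Polygon([(0, 0), (2, 0), (2, 2), (0, 2)]), Polygon([(2, 0), (5, 0), (5, 2), (2, 2)])
print(topological_consistency([(a1, b1)], [(a2, b2)]))        # 1.0: relation preserved
```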
Abstract: Learning-based super-resolution reconstruction can better describe image details and significantly enhance image resolution because of the introduction of prior knowledge, thus improving the visual quality of the image. Applying super-resolution reconstruction to sketch face recognition not only improves image quality but also effectively increases the recognition rate. First, the eigenface algorithm is used to synthesize a photo from the input sketch. Then, super-resolution reconstruction via sparse representation is applied to the synthesized photo. Finally, principal component analysis is employed to recognize the synthesized photos before and after reconstruction. The experiments are performed on the CUHK Face Sketch Database (CUFS). Experimental results show that, after super-resolution reconstruction, the synthesized photo describes facial details, such as the eyes, better. Moreover, because of the introduction of prior knowledge, the sketch face recognition rate is improved after reconstruction. Experimental results also indicate that the recognition rate of the support vector machine algorithm improves from 65% to 66%, and that of the principal component analysis algorithm improves from 87% to 89%. Sketch face recognition based on super-resolution reconstruction improves the visual quality of the image and effectively increases the sketch face recognition rate.
Keywords: sketch face recognition;super-resolution reconstruction;sparse representation;eigenface;dictionary learning
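A sketch of the recognition stage only: PCA (eigenfaces) is fitted on gallery photos and probes are matched by nearest neighbour in the PCA subspace. The sketch-to-photo synthesis and the sparse-representation super-resolution are assumed to have already produced the probe photos; the component count is an assumption.

```python
# Eigenface-style recognition: project gallery and probe faces into a PCA
# subspace and assign each probe the identity of its nearest gallery face.
import numpy as np
from sklearn.decomposition import PCA

def eigenface_recognize(gallery, gallery_ids, probes, n_components=100):
    """gallery, probes: arrays of shape (n_images, h*w), grayscale, flattened."""
    pca = PCA(n_components=min(n_components, len(gallery) - 1)).fit(gallery)
    g = pca.transform(gallery)
    p = pca.transform(probes)
    d = np.linalg.norm(p[:, None, :] - g[None, :, :], axis=2)   # pairwise distances
    return np.asarray(gallery_ids)[d.argmin(axis=1)]            # nearest-neighbour identities
```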
Abstract: Kinect can capture motion data in real time. Given that its cost is lower than that of traditional motion-capture devices, Kinect is widely used to capture motion data. However, the noise in Kinect-captured motion data makes its quality relatively unsatisfactory, and previous data-processing methods fail to handle such data well. Foot plant detection, which determines whether a character's foot is on the ground, is a key procedure in motion editing. A robust foot plant detection algorithm for Kinect-captured motion data is proposed in this study. First, an adaptive bilateral filtering method is proposed to reduce the noise in the Kinect-captured motion data. Second, multiple features of the motion data are defined and used to improve foot plant detection. Finally, a decision function is trained with the support vector machine algorithm and applied to foot plant detection. When applied to a dataset consisting of various types of motion, the proposed adaptive bilateral filtering method effectively reduced the noise in the Kinect-captured motion data, and the accuracy of foot plant detection increased by 6%. Good time performance and high detection accuracy were also achieved: the foot plant detection accuracy of the proposed algorithm increased by 11% and 8% compared with that of the baseline method and the K-nearest-neighbor method, respectively, and the time required to process one frame of motion data is one seventh of that of the K-nearest-neighbor method. Experimental results demonstrate the effectiveness and robustness of the proposed foot plant detection algorithm; thus, it can be widely used in motion data processing.
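A sketch of the detection stage, with ankle height and ankle speed standing in for the paper's feature set and assumed SVM parameters; the adaptive bilateral filtering is assumed to have been applied to the trajectory beforehand.

```python
# Per-frame features feed an SVM that labels each frame as foot-plant or not.
import numpy as np
from sklearn.svm import SVC

def footplant_features(ankle_pos, fps=30.0):
    """ankle_pos: (n_frames, 3) filtered ankle trajectory."""
    vel = np.gradient(ankle_pos, axis=0) * fps
    return np.column_stack([ankle_pos[:, 1],                # height (y-up assumed)
                            np.linalg.norm(vel, axis=1)])   # speed

def train_footplant_detector(ankle_pos, labels):
    """labels: per-frame 0/1 foot-plant annotations."""
    return SVC(kernel='rbf', C=10.0, gamma='scale').fit(footplant_features(ankle_pos), labels)
```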
Abstract: Given their advantages of low computational cost and no training requirement, local descriptor-based palmprint recognition methods are attracting increasing attention. The Weber local descriptor (WLD) is a recently presented local descriptor inspired by Weber's law in psychology. This study applies WLD to palmprint recognition. To improve recognition performance, a line-feature WLD is presented that exploits the abundant line features of a palmprint. First, the modified finite Radon transform or the Gabor filter is applied to a palmprint image to generate a directional image and an energy image E. Second, energy image E is convolved with the Weber operator to generate a differential excitation image. Finally, based on the directional image and the differential excitation image, the histogram of the line Weber local feature is constructed and used for palmprint recognition. The Polytechnic University (PolyU) Palmprint Database II and the Cross-Sensor Palmprint Database are used in the experiments. On the PolyU II database, the proposed method achieves a 100% identification rate with both the Manhattan and chi-square distances. Results demonstrate that the presented method has a high identification rate and is robust. This study introduces WLD into palmprint recognition to develop a new local descriptor-based palmprint recognition method. Compared with other local descriptor-based palmprint recognition methods, the presented method has a higher identification rate and is more robust.
Keywords: machine learning;biological characteristics;palmprint recognition;Weber local descriptor;line feature
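A sketch of a basic WLD-style histogram rather than the paper's line-feature variant: the differential excitation is the arctangent of the summed neighbour differences over the centre intensity, the orientation comes from Sobel responses instead of the MFRAT/Gabor directional image, and the two are pooled into a joint 2-D histogram. Bin counts are assumptions.

```python
import numpy as np
from scipy import ndimage as ndi

def wld_histogram(gray, exc_bins=8, ori_bins=8):
    img = gray.astype(float) + 1e-6
    kernel = np.array([[1, 1, 1], [1, -8, 1], [1, 1, 1]], dtype=float)
    excitation = np.arctan(ndi.convolve(img, kernel) / img)       # differential excitation
    gy = ndi.sobel(img, axis=0)
    gx = ndi.sobel(img, axis=1)
    orientation = np.arctan2(gy, gx)                              # gradient orientation
    hist, _, _ = np.histogram2d(excitation.ravel(), orientation.ravel(),
                                bins=[exc_bins, ori_bins])
    return hist.ravel() / hist.sum()                              # normalized joint histogram
```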
Abstract: Video surveillance is now widely used, but real-time ultra-wide-area surveillance at low cost remains a challenge. We propose an online real-time image stitching method for a single camera based on a support vector machine and phase correlation. For image sequences captured by a single rotating camera, the mosaic (stitching) matrix has a regular pattern. In the offline stage, the image position is first calculated by phase correlation, the exact stitching matrix is then computed with the SIFT operator, and a support vector machine is finally used to model the relationship between the image position and the stitching matrix. In the online stage, the stitching matrix of the current frame is predicted from its image position using the offline model, and phase correlation is applied again to achieve an exact match, enabling real-time image stitching. The reliability of the algorithm was tested on the DM6446 embedded platform, where it completes image stitching stably within the 0.1 s sampling interval. Compared with traditional methods based on feature matching, it shows higher stability and operating speed, and the sharpness of the result still satisfies monitoring requirements. In summary, we propose a new image stitching method that computes the stitching matrix from the image position to quickly build a panoramic image. The experimental results show that it balances stability and image sharpness well in dynamic, complex environments.
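A sketch of the coarse-alignment step only: phase correlation gives the translational offset of the current frame, which in the full method would be fed to the offline SVM model to predict the stitching matrix. The Hanning window follows common OpenCV practice and is an assumption; the SIFT training stage and the SVM model are omitted.

```python
import cv2
import numpy as np

def phase_offset(prev_gray, cur_gray):
    """Sub-pixel translational offset between two frames via phase correlation."""
    a = np.float32(prev_gray)
    b = np.float32(cur_gray)
    win = cv2.createHanningWindow(a.shape[::-1], cv2.CV_32F)   # reduces edge effects
    (dx, dy), response = cv2.phaseCorrelate(a, b, win)
    return dx, dy, response                                    # shift and correlation confidence
```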
Abstract: Hyperspectral remote sensors can collect data simultaneously in dozens or hundreds of narrow, adjacent spectral bands for each pixel. However, because of the fine spectral resolution, the bands are usually highly correlated, which leads to great redundancy in hyperspectral datasets. Owing to this redundancy, using all of the bands in an algorithm does not necessarily improve the results, and these problems cause serious difficulties in data processing. To implement data dimensionality reduction efficiently so that the reduced data exhibit minimal redundancy and retain considerable information, a band selection method based on optimal combination factors is proposed. First, coarse selection is implemented to remove bands with low information content, with Shannon entropy used as the measure of information content. Because the number of bands is large, the entire dataset is partitioned into subspaces using an adaptive subspace partition method that completes the partitioning automatically. In each subspace, the reconstruction errors of two candidate bands (the one with the minimum error and another with a small error) are obtained with a linear prediction model. Their combination factors are calculated as the product of their errors and their standard deviations; the combination factors are then compared, and the band with the smaller combination factor is removed. Autocorrelation-matrix-based band selection uses the minimum linear prediction error as the selection criterion and searches for a suboptimal subset by sequential backward selection (SBS); the proposed method adopts the same SBS strategy, removing bands one by one until the desired number of bands is obtained. Finally, the bands selected in all subspaces are merged into a new set. The same dataset is used in experiments on time consumption and classification accuracy. The dataset was acquired by the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) in 1992 and has 145×145 pixels and 220 bands covering 400 nm to 2 500 nm. Experiments show that the method has high computational efficiency: a comparison of the computation times of the four methods shows that the proposed method is slightly faster. The support vector machine (SVM) has attracted much attention because of its capability to handle high-dimensional data compared with conventional classification techniques; therefore, SVM is used in this study to classify the selected band subsets. The classification accuracy of the proposed method is approximately 1.5% higher than that of the other methods. The combination factor considers both the minimal redundancy and the maximal information of the band subset. The proposed method obtains the best band subset of the data, has minimal computational complexity compared with other methods, and is applicable to AVIRIS and other hyperspectral image data.
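A condensed sketch of the sequential-backward-selection loop inside one subspace, compressing the paper's two-band error formulation into a single score per band: (least-squares linear prediction error from the remaining bands) × (band standard deviation), with the smallest-factor band removed until the target count is reached. The exact definition of the combination factor is an assumption.

```python
import numpy as np

def select_bands(cube, n_keep):
    """cube: (rows, cols, n_bands) subspace of the hyperspectral image."""
    X = cube.reshape(-1, cube.shape[-1]).astype(float)
    keep = list(range(X.shape[1]))
    while len(keep) > n_keep:
        factors = []
        for b in keep:
            others = [k for k in keep if k != b]
            coef, *_ = np.linalg.lstsq(X[:, others], X[:, b], rcond=None)
            err = np.linalg.norm(X[:, others] @ coef - X[:, b])   # linear prediction error
            factors.append(err * X[:, b].std())                   # combination factor
        keep.pop(int(np.argmin(factors)))                         # drop the least informative band
    return keep
```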