Abstract: Depth map extraction is a research hotspot in the field of computer vision. With the development and promotion of 3D display equipment, depth map extraction methods have attracted considerable attention. This study reviews the development trend of depth map extraction and summarizes the existing methods. Classifications are adopted on the basis of depth cues and the degree of human-computer interaction. The existing methods are grouped into three categories, namely, monocular, binocular, and multiple depth cue-based types. These methods are then classified into three schemes according to the degree of human-computer interaction, namely, manual, semi-automatic, and automatic. This study focuses on the basic principles of these methods and emphasizes their advantages and limitations. A detailed analysis is also performed on the application and development of machine learning methods for depth extraction. Future developments in depth extraction, including the adoption of new methods and the introduction of new depth cues, are discussed.
Abstract: With the current level of technological advances, steganography can be easily steganalyzed and hidden information can be easily extracted illegally. Existing steganographic methods based on the modulo function have additional disadvantages. To overcome these disadvantages, this study presents an optimal parameterized binary modulo mapping steganographic algorithm. The parametric design of information hiding is defined, and the security of the parametric information hiding algorithm is evaluated. With the proposed algorithm, the modulo operation result of the optimal combination of two pixel values is mapped to an -ary notational system, so that each secret digit in the -ary notational system is carried by two cover pixels. Compared with other similar algorithms, the proposed method has stronger resistance to steganalysis and illegal information extraction, better visual quality of the stego image, tighter security, and higher practicability.
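To make the pair-wise modulo mapping idea concrete, the following Python sketch embeds one base-n secret digit into a pixel pair by minimally perturbing the pair until a simple extraction function matches the digit. The extraction function, the search neighbourhood, and the parameter c are illustrative assumptions, not the paper's optimal parameterized scheme.

```python
import numpy as np

def embed_digit(p1, p2, d, n, c=3):
    """Embed one base-n secret digit d into a pixel pair (p1, p2).

    Toy modulo-mapping scheme: the extraction function f(x, y) = (x + c*y) mod n
    is satisfied by the smallest perturbation of the pair. This is an
    illustrative stand-in, not the paper's optimal binary modulo mapping."""
    best = None
    for dx in range(-n, n + 1):              # search a small neighbourhood
        for dy in range(-n, n + 1):
            x, y = p1 + dx, p2 + dy
            if 0 <= x <= 255 and 0 <= y <= 255 and (x + c * y) % n == d:
                cost = dx * dx + dy * dy     # distortion of the pair
                if best is None or cost < best[0]:
                    best = (cost, x, y)
    return best[1], best[2]

def extract_digit(p1, p2, n, c=3):
    """Recover the base-n digit from a stego pixel pair."""
    return (p1 + c * p2) % n

# usage: hide the digit 4 (base 7) in the pair (120, 133)
x, y = embed_digit(120, 133, 4, n=7)
assert extract_digit(x, y, n=7) == 4
```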
Abstract: To preserve image edge details and avoid introducing false information during noise removal, this study proposes to detect noise points and improve image de-noising performance using the fractional differential gradient based on fractional calculus. Fractional differential gradient templates for different directions are convolved with the noisy image to calculate the fractional differential gradient in each direction. Gradient images in the different directions are obtained according to a preset threshold value. A pixel is determined to be a noise point when its gradient response occurs along all selected directions. Only the detected noise points are processed by the variable-order fractional integration operator in eight directions. De-noising experiments that involve adding Gaussian noise or impulse noise to artificial and natural images arrive at the same conclusions. Visual effects serve as the subjective criterion, and peak signal-to-noise ratios serve as the objective evaluation criterion. As the integral order v increases, the image de-noising effect increases, the image texture details become smoother, the image appears blurrier, and the peak signal-to-noise ratio decreases. In addition, de-noising based on the detected noise points and the efficiency of noise removal are enhanced with increasing integral order v. Noise detection based on the fractional differential gradient can help resolve the contradiction between image de-noising and the preservation of detailed texture. The proposed technique improves the accuracy of image noise detection. The results of this study may serve as a basis for improving the performance of current de-noising algorithms.
Keywords: fractional calculus; image de-noising; fractional differential gradient; peak signal-to-noise ratio
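The following Python sketch illustrates the noise-detection step described above: directional templates built from Grünwald-Letnikov fractional-difference coefficients are convolved with the image, and a pixel is flagged only when the gradient response exceeds a threshold in every tested direction. Four directions and the fixed threshold are simplifying assumptions; the paper uses eight directions and a variable-order fractional integral for the subsequent de-noising.

```python
import numpy as np
from scipy.ndimage import convolve

def gl_coeffs(v, k=3):
    """First k Grünwald-Letnikov fractional-difference coefficients of order v."""
    c = [1.0]
    for i in range(1, k):
        c.append(c[-1] * (i - 1 - v) / i)
    return np.array(c)

def detect_noise(img, v=0.5, thresh=30.0):
    """Flag pixels whose fractional differential gradient exceeds `thresh`
    along every tested direction (0°, 45°, 90°, 135°). A simplified sketch only."""
    c = gl_coeffs(v)
    k = len(c)
    horiz = np.zeros((1, k)); horiz[0, :] = c     # 0° template
    vert = horiz.T                                # 90° template
    diag = np.diag(c)                             # 45° template
    anti = np.fliplr(diag)                        # 135° template
    mask = np.ones(img.shape, dtype=bool)
    for t in (horiz, vert, diag, anti):
        g = np.abs(convolve(img.astype(float), t, mode='reflect'))
        mask &= (g > thresh)                      # must respond in all directions
    return mask
```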
Abstract: A watershed segmentation algorithm based on adaptive gradient reconstruction is proposed to address the over-segmentation of the gray-scale watershed algorithm and to simplify its application to color images. Principal component analysis is used to reduce the dimensionality of the color image, and the gradient of the low-dimensional image is calculated. The gradient image is modified by applying an adaptive gradient reconstruction algorithm. Watershed transformation is then applied to the optimized gradient image to achieve color image segmentation. The segmentation results are evaluated using performance indicators, namely, color distance, mean square error, and region information, together with the number of segmented regions. The algorithm correctly segments different types of color images. Compared with existing watershed segmentation algorithms, the proposed method effectively eliminates the pseudo minima caused by irregular details and noise. It also overcomes over-segmentation, thereby improving segmentation accuracy. The proposed algorithm has good applicability and robustness.
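A minimal Python sketch of the pipeline described above is given below: PCA reduces the color image to one channel, a gradient is computed, shallow minima are suppressed by grayscale reconstruction, and the watershed transform is applied. The fixed reconstruction height h and the marker extraction are simplifying assumptions in place of the paper's adaptive reconstruction rule.

```python
import numpy as np
from sklearn.decomposition import PCA
from skimage.filters import sobel
from skimage.morphology import reconstruction
from skimage.segmentation import watershed
from scipy import ndimage as ndi

def segment_color_image(rgb, h=0.05):
    """PCA -> gradient -> gradient reconstruction -> watershed (sketch)."""
    # 1) reduce the color image to its first principal component
    flat = rgb.reshape(-1, 3).astype(float)
    pc1 = PCA(n_components=1).fit_transform(flat).reshape(rgb.shape[:2])
    pc1 = (pc1 - pc1.min()) / (np.ptp(pc1) + 1e-9)

    # 2) gradient of the low-dimensional image
    grad = sobel(pc1)

    # 3) suppress shallow minima by grayscale reconstruction
    #    (fixed height h here; adaptive in the paper)
    grad_rec = reconstruction(grad - h, grad, method='dilation')

    # 4) markers from the remaining regional minima, then watershed
    markers, _ = ndi.label(ndi.minimum_filter(grad_rec, size=5) == grad_rec)
    return watershed(grad_rec, markers)
```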
Abstract: Fuzzy clustering-based methods and statistical models have been widely used for image segmentation. To improve segmentation accuracy, this study develops a novel robust spatial factor-based fuzzy clustering algorithm by introducing statistical information into the fuzzy objective function. A novel spatial factor is proposed to overcome the impact of noise on images. The proposed spatial factor is constructed from the posterior and prior probabilities by incorporating the spatial information between neighboring pixels. It acts as a linear filter that smoothens and restores noise-corrupted images. The proposed spatial factor is fast, easy to implement, and capable of preserving details. The negative logarithm of the joint probability, which serves as the dissimilarity function, considers the prior probabilities and thus improves the capability to identify the class of each pixel. By integrating the dissimilarity function and the novel spatial factor into the fuzzy objective function, the final segmentation is obtained by iteratively minimizing the objective function. Comparison results on synthetic images demonstrate that the proposed algorithm achieves accurate segmentation and strong de-noising. Comparison results on color images demonstrate that the proposed algorithm produces satisfactory segmentation results and accuracies when a suitable feature descriptor is used. The proposed algorithm addresses the drawbacks of current segmentation algorithms and further improves image segmentation accuracy, outperforming state-of-the-art segmentation approaches. It applies to noisy images and, with the aid of texture features, to color images.
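To illustrate the general idea of fuzzy clustering with a spatial factor, the Python sketch below runs fuzzy C-means on a gray image and blends each pixel's memberships with the mean memberships of its 3x3 neighbourhood. The mean-filter blending and the parameter alpha are stand-ins; the paper instead builds the spatial factor from posterior and prior probabilities and uses a negative log joint probability as the dissimilarity.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def spatial_fcm(img, n_clusters=3, m=2.0, n_iter=50, alpha=0.5):
    """Fuzzy C-means with a simple neighbourhood-averaged spatial factor (sketch)."""
    h, w = img.shape
    x = img.astype(float).ravel()
    rng = np.random.default_rng(0)
    u = rng.dirichlet(np.ones(n_clusters), size=x.size)     # fuzzy memberships
    for _ in range(n_iter):
        um = u ** m
        centers = um.T @ x / um.sum(axis=0)                  # update cluster centers
        d = np.abs(x[:, None] - centers[None, :]) + 1e-9     # pixel-to-center distances
        u = 1.0 / (d ** (2 / (m - 1)))
        u /= u.sum(axis=1, keepdims=True)
        # spatial factor: blend with neighbourhood-averaged memberships
        u_img = u.reshape(h, w, n_clusters)
        u_sp = uniform_filter(u_img, size=(3, 3, 1))
        u = ((1 - alpha) * u_img + alpha * u_sp).reshape(-1, n_clusters)
        u /= u.sum(axis=1, keepdims=True)
    return u.argmax(axis=1).reshape(h, w), centers
```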
Abstract: The perimeter of a target boundary in a 2D image is an essential object feature in image analysis. However, this feature is usually estimated inaccurately because of discontinuous or blurred target boundaries. Accordingly, this study proposes an improved perimeter estimation method that is based on gray-level information and combined with boundary tracking. Unlike traditional methods that generally use binary information to calculate the perimeter, the proposed method utilizes the gray-level information in digital images to obtain substantial information on the target boundary. The concepts of pixel coverage, internal boundary, and external boundary are introduced. The slope at each configuration on the internal and external boundaries is computed, and the perimeters of the internal and external boundaries are estimated based on the arc length integration formula. The perimeter of the target object boundary is obtained by combining the perimeter information of the internal and external boundaries. The perimeters of target object boundaries are estimated using the proposed method and three classical methods, and the results are compared with the ground-truth perimeters of the target objects in synthetic images. In the first two experiments, synthetic images with continuous and blurred boundaries are tested. The performance of the proposed method is similar to that of the classical methods but better than that of the original gray-level method. In the third experiment, synthetic images with discontinuous and blurred boundaries are tested. In this case, the two classical geometrical methods are not applicable, whereas the proposed method still performs satisfactorily. The advantage of the proposed method is more obvious when the boundary of the target is complex. Compared with the classical methods, the proposed method has better adaptability and stability for target objects with blurred or discontinuous boundaries.
Keywords: perimeter estimation; boundary tracking; internal and external boundaries; gray level; fuzzy boundary; discontinuous boundary
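The following Python sketch shows the arc-length-integration idea in discrete form and the combination of internal and external boundary estimates. The boundary samples are assumed to be already traced and ordered, and the equal weighting of the two boundaries is an illustrative assumption; the paper derives the boundary samples and their slopes from gray-level pixel coverage.

```python
import numpy as np

def boundary_arc_length(points):
    """Discrete arc-length integral of a closed boundary given as ordered (x, y) samples."""
    pts = np.asarray(points, dtype=float)
    # central-difference derivatives along the closed contour
    dx = (np.roll(pts[:, 0], -1) - np.roll(pts[:, 0], 1)) / 2.0
    dy = (np.roll(pts[:, 1], -1) - np.roll(pts[:, 1], 1)) / 2.0
    return np.sum(np.sqrt(dx ** 2 + dy ** 2))

def combined_perimeter(inner_pts, outer_pts, w_inner=0.5):
    """Blend the internal- and external-boundary estimates (equal weights here)."""
    return (w_inner * boundary_arc_length(inner_pts)
            + (1 - w_inner) * boundary_arc_length(outer_pts))
```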
Abstract: Compared with other biometric recognition methods, face recognition is friendlier, more natural, and less intrusive to users. Hence, face recognition has numerous applications in various fields. However, the 3D shape of the human face is a nonrigid free-form surface, which causes face region distortion under expression variations, particularly in the mouth region. Therefore, 3D face recognition is easily affected by facial expression variations. Laughing or acting surprised opens the mouth, producing a hole in the mouth region and consequently changing the topology of the facial model. This study proposes a 3D face recognition method based on facial fiducial points to avoid the effect of different facial expressions. The active shape model algorithm, which is usually used for 2D images, is applied to roughly detect facial fiducial points in depth images. The shape index is then used to accurately locate the fiducial points in 3D point clouds. A series of iso-geodesic contours centered at a landmark located midway between the nose tip and nose root is extracted to represent the facial shape while avoiding the mouth region. Procrustean features (distances and angles) defined by pose-invariant curves are selected as the final recognition features. The classification results of each geodesic contour are compared and combined at the decision level to obtain the final results. Experiments on the detection and recognition of facial fiducial points are conducted on the FRGC V2.0 dataset, a large public face dataset. In the detection experiment, face models acquired in fall 2003 are selected as the training set, and 150 scans acquired in spring 2004 are selected as the testing set. Seven fiducial points, including the eye corners, nose tip, and mouth corners, are manually located in the testing set. The accuracy of the seven detected fiducial points is measured by the Euclidean distance between the manually located fiducial points and the corresponding automatically detected points. Each of the seven points is accurately detected, and the mean positional error is less than 2.36 mm. In the recognition experiment, 424 scans from 60 subjects in the FRGC V2.0 database are randomly selected. Each iso-geodesic contour with Procrustean features (distances and angles) is tested, and the classification results of the eight iso-geodesic contours are weighted in a decision-level fusion. The final rank-1 recognition rate is 98.35%. Expression variation is an important research direction for 3D face recognition. This study proposes a novel method for locating fiducial points in 3D point clouds. By combining depth images and 3D point clouds, seven fiducial points are detected fully automatically, rapidly, and accurately. A series of facial contours based on these points is extracted to represent the facial surface. The effect of expression variations on recognition is reduced because the iso-geodesic contours are located in approximately rigid regions. Overall, the proposed method is valid and robust in the presence of pose and expression variations.
Keywords: three-dimensional face recognition; fiducial points; iso-geodesic contours; Procrustes analysis; feature fusion
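The Python sketch below illustrates the two ingredients of the matching stage: a Procrustes comparison of two iso-geodesic contours and a weighted decision-level fusion over several contours. The resampling length, the weights, and the simple disparity-sum fusion rule are illustrative assumptions; the paper additionally uses distance and angle features defined on pose-invariant curves.

```python
import numpy as np
from scipy.spatial import procrustes

def contour_disparity(c1, c2, n_samples=64):
    """Procrustes disparity between two (N, 3) iso-geodesic contours,
    resampled to a common number of points before alignment."""
    def resample(c):
        c = np.asarray(c, dtype=float)
        idx = np.linspace(0, len(c) - 1, n_samples)
        return np.vstack([np.interp(idx, np.arange(len(c)), c[:, k])
                          for k in range(c.shape[1])]).T
    _, _, disparity = procrustes(resample(c1), resample(c2))
    return disparity

def fused_identity(probe_contours, gallery, weights):
    """Weighted decision-level fusion over several contours.
    `gallery` maps subject id -> list of contours (one per iso-geodesic level)."""
    scores = {}
    for sid, g_contours in gallery.items():
        scores[sid] = sum(w * contour_disparity(p, g)
                          for w, p, g in zip(weights, probe_contours, g_contours))
    return min(scores, key=scores.get)      # smallest fused disparity wins
```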
Abstract: 3D curve reconstruction is indispensable in computer vision. Traditional 3D reconstruction relies considerably on point-based correspondences. However, point-based reconstruction ignores the structural information between sample points and may reduce reconstruction accuracy. To avoid this problem, high-level geometric primitives, typically 3D curves, are needed to reconstruct 3D scenes. This study proposes a vision-based method under the L-infinity norm of curvatures for the 3D reconstruction of nonparametric space curves from a pair of images. An initial cost function for minimization was defined as a weighted sum of the reprojection errors and the curvatures of the reconstructed curve. Because of nonconvexity, reaching the global minimum is difficult, particularly when the initial estimate is not sufficiently close to the actual result. In addition, the weight coefficients that control the relative significance of the reprojection errors and the curvature were not readily determined. To address these issues, the L-infinity norm of the curvatures was used as an inequality constraint in place of the weighted curvature term: the curvatures of all discrete points along the curve should be less than a maximal curvature, which can be estimated from the projection curves of the two images. Given a small maximal curvature, the reconstructed curve was smooth under the inequality constraint. A generalized Lagrange multiplier method was used to solve this nonlinear optimization problem with the inequality constraint. An experiment using synthetic data showed that the reconstructed curve was highly similar to the real result and that its reconstruction error was less than 1/7 of that of the point-based method. These findings illustrate that the proposed method is evidently superior to the point-based method. The proposed method also has higher accuracy than a previously reported method [12]. The 3D reconstruction errors varied little with respect to the maximal curvature within a certain range, thereby validating the robustness of the proposed method. In the noise tests, the reconstruction errors increased linearly with the noise level. Strong noise probably caused a large stereo ambiguity along the curve and increased the uncertainty of the space curve. The intersection line of two crossed cylinders was reconstructed using the proposed method. The reconstructed curve was projected onto the XY plane, in which the projection of the actual intersection line is a circle. The radius of one cylinder can be estimated after fitting the projection of the reconstructed curve to a circle. The relative deviation between the estimated and real radii was 2.98%, indicating that the proposed method has sufficient accuracy for general industrial applications. The proposed method was also used to reconstruct the 3D skeletons of a fruit tree trunk; the resultant 3D skeletons were visually acceptable. The proposed method can be used to reconstruct 3D space curves for various applications. However, stereo ambiguity and the uncertainty of camera parameters may reduce its accuracy because only two images are used. The number of projection curves may be increased to solve this problem.
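The following Python sketch shows the constrained formulation in miniature: the reprojection error of a discrete curve seen in two views is minimized subject to an upper bound on the discrete curvature at every interior point. A generic SLSQP solver is used here for simplicity; the paper solves the constrained problem with a generalized Lagrange multiplier method, and the projection matrices, initial curve, and curvature bound are assumed inputs.

```python
import numpy as np
from scipy.optimize import minimize

def reconstruct_curve(P1, P2, x1, x2, X0, kappa_max):
    """Reconstruct a discrete space curve from two views under a curvature bound.

    P1, P2: 3x4 projection matrices; x1, x2: (N, 2) observed image points;
    X0: (N, 3) initial 3-D points; kappa_max: maximal allowed discrete curvature."""
    N = len(X0)

    def project(P, X):
        Xh = np.hstack([X, np.ones((len(X), 1))]) @ P.T
        return Xh[:, :2] / Xh[:, 2:3]

    def reproj_error(flat):
        X = flat.reshape(N, 3)
        return (np.sum((project(P1, X) - x1) ** 2)
                + np.sum((project(P2, X) - x2) ** 2))

    def curvatures(flat):
        X = flat.reshape(N, 3)
        a, b, c = X[:-2], X[1:-1], X[2:]
        # discrete curvature from the circumscribed circle of each point triple
        area2 = np.linalg.norm(np.cross(b - a, c - a), axis=1)
        denom = (np.linalg.norm(b - a, axis=1) * np.linalg.norm(c - b, axis=1)
                 * np.linalg.norm(c - a, axis=1) + 1e-12)
        return 2.0 * area2 / denom

    # inequality constraint: kappa_max - kappa_i >= 0 for every interior point
    cons = {'type': 'ineq', 'fun': lambda f: kappa_max - curvatures(f)}
    res = minimize(reproj_error, X0.ravel(), constraints=[cons], method='SLSQP')
    return res.x.reshape(N, 3)
```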
Abstract: To improve the robustness of the optical flow method to illumination changes and large displacements, an optical flow estimation method combining structure-texture decomposition preprocessing and weighted median filtering is proposed. The data term combines the gray-value and gradient constancy assumptions as well as local and global constraints. Meanwhile, the use of texture decomposition, the weighted median filter, and the pyramid structure further enhances the accuracy and practicability of the optical flow algorithm. The proposed method is evaluated using both the Middlebury optical flow database images and real scene images. The experimental results show that the improved optical flow estimation method performs well under illumination changes; it not only obtains a dense optical flow field but also improves the estimation precision of the flow field at target edges. Compared with the traditional optical flow method, the proposed method obtains better results under illumination changes and reduces the interference of actual lighting variations in a scene, so the optical flow method can be better applied to actual scenarios.
Keywords: optical flow; large displacement; illumination change; structure-texture decomposition; weighted median filter; robustness
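The Python sketch below shows the two auxiliary components named above: an ROF-style structure-texture decomposition used as illumination-robust preprocessing, and a weighted median, which can be applied to flow values inside a window during post-filtering. The TV weight and the blending factor alpha are illustrative values, not the paper's tuned parameters.

```python
import numpy as np
from skimage.restoration import denoise_tv_chambolle

def structure_texture(img, weight=0.1, alpha=0.95):
    """ROF-style structure-texture decomposition of a grey image in [0, 1].

    The texture part (image minus TV-smoothed structure) is less sensitive to
    illumination changes; alpha blends a small amount of the original back in."""
    img = img.astype(float)
    structure = denoise_tv_chambolle(img, weight=weight)
    texture = img - structure
    return alpha * texture + (1 - alpha) * img

def weighted_median(values, weights):
    """Weighted median of flow values inside a window (used to regularize flow)."""
    order = np.argsort(values)
    v, w = np.asarray(values, float)[order], np.asarray(weights, float)[order]
    cdf = np.cumsum(w) / np.sum(w)
    return v[np.searchsorted(cdf, 0.5)]
```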
Abstract: Stereo vision depends on feasible approaches for real-time/hardware implementation. Cost aggregation, the most complex part of a stereo matching algorithm, substantially affects the overall running time. Therefore, this study proposes a novel parallelization strategy to map stereo cost aggregation onto graphics processing units (GPUs) using the compute unified device architecture (CUDA). The linear stereo matching algorithm is selected as the cost aggregation strategy in the proposed approach. Linear stereo matching with constant complexity can achieve more accurate disparity maps than global disparity optimization methods. Although its computational complexity is considerably lower than that of most global approaches, linear stereo matching, even when optimized by effective strategies, still cannot meet real-time or near real-time requirements for practical applications. The parallelization strategy introduced in this study is based on a separable filter with linear complexity in the filter window size and with proven efficiency on GPU platforms. The computation of each step of the cost aggregation (cost computation, mean filtering, and coefficient computation) is reformulated, and the rational use of different types of GPU memory is ensured. This study proposes several parallelization optimizations to increase the degree of parallelism and the data throughput. After these optimizations, our approach ensures that the computation of each CUDA thread is independent of the other threads and maximizes the degree of parallelism. The optimizations also reduce the complexity of each thread from an exponential to a linear relationship with the window radius, further improving efficiency. The efficiency of memory access and the data throughput are also dramatically improved in our final implementation, which caches data in texture or shared memory in certain circumstances. The experimental results show that the proposed strategy is effective and efficient. We dramatically accelerate stereo cost aggregation with the assistance of the outstanding parallel computation performance of GPUs. Compared with the original CPU implementation accelerated by the integral image technique, our CUDA implementation on an NVIDIA GTX780 GPU produces, on the same stereo image pairs, an accurate cost matrix within a significantly shorter running time (less than 80 ms) and improves the average efficiency by tenfold. Our approach also outperforms other real-time or near real-time stereo cost aggregation implementations on GPUs. The proposed approach outperforms previous constant-time stereo solutions and produces accurate results comparable with those of adaptive weight aggregation on GPUs with CUDA. It also provides an efficient and feasible way to obtain an accurate disparity map on general PC platforms in real time.
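For consistency with the other examples, the idea behind the mean-filtering step of the cost aggregation is sketched below in NumPy rather than CUDA: a box mean computed as two separable 1-D prefix-sum passes, so the per-pixel cost is independent of the window radius. The function name and the prefix-sum formulation are illustrative; the paper performs this work in CUDA kernels, mapping rows and columns to threads and caching windows in shared or texture memory.

```python
import numpy as np

def box_mean_separable(cost, radius):
    """Box mean of a 2-D cost slice via two separable 1-D prefix-sum passes."""
    def mean_1d(a, r, axis):
        pad = [(0, 0)] * a.ndim
        pad[axis] = (r + 1, r)                     # replicate edges
        c = np.cumsum(np.pad(a, pad, mode='edge'), axis=axis)
        upper = np.take(c, np.arange(2 * r + 1, c.shape[axis]), axis=axis)
        lower = np.take(c, np.arange(0, c.shape[axis] - 2 * r - 1), axis=axis)
        return (upper - lower) / (2 * r + 1)       # window sum / window size
    # filter along rows, then along columns
    return mean_1d(mean_1d(cost, radius, axis=0), radius, axis=1)
```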
Abstract: Traditional shadow play is a cultural treasure of Chinese folk art that combines sculpture, painting, drama, music, performance, and other arts, and it was once regarded as spiritual nourishment. However, studies show that traditional shadow play is on the verge of extinction. Therefore, to protect and pass on the culture of traditional shadow play, numerous researchers have combined shadow play with current computer technologies to inherit its artistic charm and create a living form of shadow play preservation: digital shadow play. In this study, we drive the digital shadow play with a traditional-style interaction using a Kinect and colored shadow play control rods. We also analyze the essential technologies for the input of the interaction and the output of the digital shadow play actions. According to the characteristics of the Kinect depth map and the control rod movements, the regions of the control rods are segmented to serve as masks for detecting the rod head position in each frame. The color marks on the control rods are trained in advance using the Naive Bayes method to obtain a corresponding color category table. These marks are identified in each detected frame to calculate the rod head positions. Each rod is tracked with a Kalman filter to provide stable control information for the shadow animation. After analyzing the action characteristics of traditional shadow play and the joints of the human body, we organize the digital shadow figure into a hierarchical joint model controlled by three rods. We analyze the action principles of this model, combined with the input control information, for moving, waving arms, and changing legs of the shadow play figure. The shadow animation is then rendered by OpenGL and presented to the user. Facing the Kinect, a user holds three rods to control the shadow play figure rendered by the computer. Along with the rods' movements, the shadow play figure correspondingly performs actions such as moving arbitrarily, waving its arms, and changing its legs in the shadow scene of Wukong picking peaches. User surveys show that our control method is similar to that of traditional shadow play but is simpler to operate, and that the resulting animation is interesting and beautiful. This study proposes a novel way to interact with digital shadow play as a simulation of the traditional interaction. This strategy allows nonprofessionals to experience controlling traditional shadow figures. Compared with other digital shadow play interactions, the proposed method retains more of the traditional shadow play culture.
Keywords: digital shadow play; human-computer interaction; Kinect; shadow play control rod
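The Python sketch below illustrates the rod-tracking step: a constant-velocity Kalman filter that smooths the per-frame rod-head detections so the shadow figure receives stable control input. The state model, frame rate, and noise parameters are illustrative assumptions rather than the paper's tuned values.

```python
import numpy as np

class RodTracker:
    """Constant-velocity Kalman filter for one control-rod head position.

    State is (x, y, vx, vy); measurements are the per-frame (x, y) detections
    from the colour-mark classifier."""
    def __init__(self, x0, y0, dt=1 / 30, q=1e-2, r=4.0):
        self.x = np.array([x0, y0, 0.0, 0.0])
        self.P = np.eye(4) * 10.0
        self.F = np.array([[1, 0, dt, 0],
                           [0, 1, 0, dt],
                           [0, 0, 1,  0],
                           [0, 0, 0,  1]], float)     # constant-velocity model
        self.H = np.array([[1, 0, 0, 0],
                           [0, 1, 0, 0]], float)      # we observe position only
        self.Q = np.eye(4) * q
        self.R = np.eye(2) * r

    def update(self, z):
        # predict
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        # correct with the new measurement z = (x, y)
        y = np.asarray(z, float) - self.H @ self.x
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
        return self.x[:2]          # smoothed rod-head position
```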
Abstract: Considering stereo model registration in large-scale 3D terrain reconstruction from aerial imagery, we propose an approach for registering all stereo models to realize automatic large-scale 3D terrain reconstruction. Individual stereo models were constructed from adjacent images. The tie points of adjacent models were extracted according to the pixel coordinates of the corresponding points based on feature matching in the shared image. All models were linked and registered through a loop-traversal search. Subsequently, the large-area terrain of the overall aerial photography area was produced through triangulation, TIN generation, differential correction, elevation interpolation, and color blending. These processes were performed sequentially and automatically. Two groups of data were used in the experiment. Results showed that the root-mean-square (RMS) registration errors of all 3D models of the two data sets were 5.20 and 2.63 pixels, and the relative accuracy of the produced large-scale terrain was high. The absolute orientation was determined based on the results generated from the second data set with ground control points, and the absolute positioning accuracy was evaluated using checkpoints. The RMS errors of the checkpoints in planimetry and altitude were 0.326 m and 0.502 m, respectively, illustrating high absolute accuracy. In conclusion, large-scale 3D terrain organized in a special format was automatically produced by incorporating a series of overlapping aerial images. The visualized and realistic 3D terrain validates the feasibility and effectiveness of the proposed solution.
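As an illustration of how one pair of adjacent stereo models can be registered through shared tie points, the Python sketch below estimates a 3D similarity transform with the standard Umeyama closed form and reports the RMS registration residual. This is a generic stand-in under the assumption of point-to-point tie correspondences; the paper chains such registrations over all models via a loop-traversal search.

```python
import numpy as np

def similarity_transform(src, dst):
    """Estimate scale s, rotation R, translation t so that s*R@src_i + t ≈ dst_i.

    src, dst: (N, 3) arrays of tie-point coordinates shared by two adjacent models."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    S, D = src - mu_s, dst - mu_d
    U, sig, Vt = np.linalg.svd(D.T @ S / len(src))     # cross-covariance SVD
    C = np.eye(3)
    if np.linalg.det(U @ Vt) < 0:                      # guard against reflections
        C[2, 2] = -1.0
    R = U @ C @ Vt
    var_src = (S ** 2).sum() / len(src)
    s = np.trace(np.diag(sig) @ C) / var_src
    t = mu_d - s * R @ mu_s
    return s, R, t

def rms_registration_error(src, dst, s, R, t):
    """RMS residual of the registered tie points."""
    res = dst - (s * (R @ src.T).T + t)
    return np.sqrt((res ** 2).sum(axis=1).mean())
```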
Abstract: Synthetic aperture radar (SAR) is suitable for dynamic monitoring because it is unaffected by weather conditions. SAR image change detection is the key technology for the dynamic monitoring of targets. Multi-scale SAR images contain more details than single-scale images; thus, we integrate multi-scale analysis into the change detection algorithm to obtain accurate results. Gaussian multi-scale theory is easy to understand and, despite its capacity to preserve image details, is seldom used in change detection. This study proposes a new adaptive multi-scale change detection method based on the structural similarity (SSIM) of SAR images. The proposed method comprises five steps, namely, obtaining difference images (DIs), multi-scale decomposition based on a Gaussian kernel, optimal scale estimation by SSIM, building feature vectors for each pixel, and fuzzy C-means (FCM) clustering to obtain the final change map. DIs are obtained using a log-ratio operator, and a median filter is used to suppress the speckle noise. The optimal Gaussian scale is estimated by finding the maximum SSIM of the DIs through iterations. The Gaussian kernel at the optimal scale and its differential forms are convolved with the DIs to generate change detection feature vectors at the pixel level. FCM is introduced to classify the changed and unchanged pixels using the feature vectors, and the change detection map is obtained. Experiments on two pairs of real SAR images of the Bern and Ottawa areas show that the proposed method outperforms state-of-the-art algorithms. The correct detection rates for the two pairs of SAR images reach 0.9952 and 0.9623, and their kappa coefficients reach 0.8200 and 0.8540. This study proposes an effective multi-scale change detection method based on the SSIM of SAR images. The method fully utilizes the scale information of SAR images and is robust to speckle noise, and it is effective in finding the optimal scale of the DI and in detecting changes.
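The Python sketch below walks through the five steps in simplified form: a log-ratio difference image, speckle suppression, an SSIM-based choice among a few Gaussian scales, Gaussian-derivative features, and a two-class clustering. K-means stands in for fuzzy C-means, and the scale-selection criterion (matching the smoothed DI against a speckle-filtered reference) is a simplified assumption, since the abstract does not spell out the exact SSIM iteration.

```python
import numpy as np
from scipy.ndimage import median_filter, gaussian_filter
from skimage.metrics import structural_similarity as ssim
from sklearn.cluster import KMeans

def change_map(img1, img2, scales=(0.5, 1, 2, 4, 8)):
    """Simplified SSIM-guided multi-scale change detection for two SAR images."""
    # 1) log-ratio difference image, plus a median-filtered copy as speckle-free reference
    di = np.abs(np.log((img1 + 1.0) / (img2 + 1.0)))
    ref = median_filter(di, size=3)

    # 2) scale whose Gaussian smoothing best matches the reference in the SSIM sense
    rng = ref.max() - ref.min()
    best_s = max(scales, key=lambda s: ssim(gaussian_filter(di, s), ref, data_range=rng))

    # 3) per-pixel features: smoothed DI and its first derivatives at that scale
    feats = np.stack([gaussian_filter(di, best_s),
                      gaussian_filter(di, best_s, order=(0, 1)),
                      gaussian_filter(di, best_s, order=(1, 0))], axis=-1)

    # 4) two-class clustering into changed / unchanged pixels (K-means as FCM stand-in)
    labels = KMeans(n_clusters=2, n_init=10).fit_predict(feats.reshape(-1, 3))
    return labels.reshape(di.shape)
```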
Abstract: Fusion restoration is one of the most concise and practical approaches to super-resolution reconstruction. To solve the existing problems related to fusion and restoration, this study proposes a new, improved framework. Normalized convolution is improved and used to implement the fusion step, and maximum a posteriori estimation is improved and used to implement the restoration step. These improvements lead to the construction of a super-resolution reconstruction algorithm. In the fusion step of the proposed algorithm, the improved normalized convolution introduces a double applicability function, which adds a neighbor-intensity correlation, and a new certainty function that mixes the Gaussian and Laplace certainty functions. In the restoration step, the improved maximum a posteriori estimation introduces a feature-driven prior function obtained by mixing two constant prior models. The formulation of the feature-driven prior is completely determined by the statistics of the image features. Several test images are synthetically degraded into low-resolution sequences with different disturbance levels. These sequences are then reconstructed using the proposed algorithm and several other algorithms for comparison. Results show that the proposed algorithm is superior to the other algorithms in terms of visual effects and performance indexes. The fusion step of the proposed algorithm considers the spatial distance and intensity difference between neighboring pixels to efficiently restrain noise and outliers. The restoration step adopts a feature-driven prior that is determined by the image itself rather than by experience; therefore, the image is accurately characterized. The experimental results verify the effectiveness of the proposed algorithm.
Keywords: super-resolution reconstruction; feature-driven prior; normalized convolution; maximum a posteriori estimation; image restoration
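The Python sketch below shows the textbook form of normalized convolution that the fusion step builds on: the estimate at each pixel is the applicability-weighted sum of certainty-weighted samples, divided by the corresponding sum of certainties. The Gaussian applicability kernel is an illustrative choice; the paper's improved version additionally uses a second, intensity-dependent applicability and a mixed Gaussian-Laplace certainty, which are not reproduced here.

```python
import numpy as np
from scipy.ndimage import convolve

def normalized_convolution(samples, certainty, applicability):
    """Classic normalized convolution: (c*f (*) a) / (c (*) a).

    samples       - 2-D array of observed values (zeros where no sample exists)
    certainty     - 2-D array in [0, 1], zero where no sample exists
    applicability - small 2-D kernel weighting the neighbourhood"""
    num = convolve(samples * certainty, applicability, mode='reflect')
    den = convolve(certainty, applicability, mode='reflect')
    return num / np.maximum(den, 1e-9)

def gaussian_kernel(radius=3, sigma=1.5):
    """Isotropic Gaussian applicability kernel (illustrative choice)."""
    ax = np.arange(-radius, radius + 1)
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx ** 2 + yy ** 2) / (2 * sigma ** 2))
    return k / k.sum()
```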
Abstract: To simulate the three-dimensional appearance and predict the performance of woven fabrics based on the weave structure, this paper presents a method to create three-dimensional geometrical models of yarn buckling in woven fabrics directly from weave structural parameters. The method combines two steps: weave diagram generation from structural parameters and three-dimensional geometrical modeling from the weave diagram. In the first step, the weave diagram is treated as a Boolean matrix. According to the weave category, the yarn interlacing rules in the weave diagram are described by a series of parameters, such as the float array, the crossing number, the step number, and the numbers of ends and fillings in a weave repeat. Several functions are created to generate the matrices of common single-layer weaves according to these parameters. The weaves are divided into regular, quasi-regular, and irregular weaves. In the second step, the yarn paths are separated into two segment types (straight lines and cosine curves) according to the buckling models of the warp or filling yarns. The yarn cross-section shape is treated as a circle or an ellipse. The three-dimensional geometrical coordinates of the warp and filling yarn buckling paths are expressed by piecewise functions according to the weave matrix. The parameters of these functions include the major and minor radii of the ellipse, the lengths of the straight-line and cosine-curve segments in the warp and weft yarns, and the amplitudes of the warp and weft yarn buckling waves. Additionally, modified functions are created and two correcting factors are introduced to simulate the three-dimensional yarn buckling path by considering the distortion of the warp and filling yarns in the woven fabric. The values of the correcting factors can be generated by a sine function or a random function. Finally, three-dimensional modeling experiments on single-layer woven fabrics were performed. In these experiments, the input information includes the weave type, structural parameters, yarn cross-section shape, and yarn distortion parameters. The created models include plain weave, twill weave, satin weave, compound twill, waved twill, curved twill, and interlaced twill. Models of plain weaves with different structural parameter values are compared, including the yarn cross-section shape, yarn count, and yarn buckling amplitude. Models of twisted plain weaves based on a uniform random function are also compared with the untwisted plain weave model and a scanned fabric image. The results show that the method is feasible and convenient, and all fabric models created with different input parameters have good simulation effects. This paper proposes a convenient method to create three-dimensional geometrical models of single-layer fabrics directly from weave structural parameters by combining weave matrix generation and yarn buckling modeling. Different fabric geometrical models can be created quickly simply by changing the weave information and fabric structure parameters. The method can simulate not only two-dimensional but also three-dimensional yarn buckling models.
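To illustrate the piecewise straight-line/cosine yarn path model, the Python sketch below generates the centre-line of a single warp yarn from one column of the weave matrix: repeated over/under states give straight segments, and state changes give half-cosine transitions between the two buckling levels. The spacing, amplitude, and sampling density are illustrative values, and the distortion correcting factors are omitted.

```python
import numpy as np

def warp_yarn_path(weave_column, spacing=1.0, amplitude=0.3, samples=20):
    """Centre-line of one warp yarn from one column of the weave matrix.

    weave_column: 1 where the warp passes over the filling, 0 where it passes under.
    Returns an (M, 2) array of (y, z) coordinates along the yarn."""
    states = np.asarray(weave_column, dtype=float)
    z_levels = np.where(states > 0, amplitude, -amplitude)
    ys, zs = [], []
    for i in range(len(z_levels) - 1):
        t = np.linspace(0.0, 1.0, samples, endpoint=False)
        y = (i + t) * spacing
        if z_levels[i] == z_levels[i + 1]:
            z = np.full_like(t, z_levels[i])                    # straight segment
        else:
            # half-cosine transition between the two buckling levels
            z = z_levels[i] + (z_levels[i + 1] - z_levels[i]) * (1 - np.cos(np.pi * t)) / 2
        ys.append(y)
        zs.append(z)
    return np.column_stack([np.concatenate(ys), np.concatenate(zs)])

# a plain-weave warp yarn alternates over/under the fillings
path = warp_yarn_path([1, 0, 1, 0, 1, 0])
```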
Abstract: Bleeding simulation is an important visual effect. Real-time bleeding simulation is challenging because of the extensive computations required for blood-solid interactions. This study proposes a surgical bleeding simulation method based on graphics processing units (GPUs). The proposed method is derived from the smoothed particle hydrodynamics (SPH) approach of Müller et al. and uses the temperature term to generate particles with different speeds and simulate the bloodstream. The multi-thread parallel technology, compute unified device architecture (CUDA), implemented on GPUs, is used to rapidly solve the control equations of the particles and the blood-solid interactions. Thus, the method achieves real-time bleeding simulation based on a uniformly spaced grid. The proposed method can simulate bleeding after a cut as well as blood flowing over complex obstacles in a surgical simulator. The method takes only 20 ms when the particle number is 9 000 and computes 20.15 times faster than the central processing unit implementation. Thus, real-time bleeding simulation can be achieved with a large particle number. The proposed method is flexible and capable of simulating real-time surgical bleeding.
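The Python sketch below shows the core SPH quantities used in Müller-style fluid simulation: per-particle density from the poly6 kernel and pressure from a simple equation of state. A brute-force neighbour search is used for clarity, and the constants are typical demo values; the paper's CUDA version performs neighbour search on a uniform spatial grid with one thread per particle.

```python
import numpy as np

def sph_density_pressure(pos, mass=0.02, h=0.045, rho0=1000.0, k=3.0):
    """Per-particle density and pressure with Müller-style SPH kernels.

    pos: (N, 3) particle positions; h: smoothing radius; rho0: rest density;
    k: gas stiffness constant."""
    n = len(pos)
    poly6 = 315.0 / (64.0 * np.pi * h ** 9)        # poly6 kernel normalization
    rho = np.zeros(n)
    for i in range(n):
        r2 = np.sum((pos - pos[i]) ** 2, axis=1)
        w = np.where(r2 < h * h, (h * h - r2) ** 3, 0.0)   # poly6 kernel values
        rho[i] = mass * poly6 * np.sum(w)
    pressure = k * (rho - rho0)                    # equation of state
    return rho, pressure
```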