Abstract: As human-centered computing grows more popular and novel applications emerge, action recognition and activity understanding are attracting increasing attention from computer vision researchers. In this paper, we review state-of-the-art work on action and activity analysis, focusing on three parts: the definition of activity, low-level motion feature extraction and action representation, and reasoning methods for activity understanding. Open problems and potential directions for future research are also discussed.
Abstract: A new method is proposed that combines the face profile, ear features, and the relative position of the ear and the face profile to detect and recognize human ears. The method consists of off-line training and on-line detection and recognition. At the off-line stage, a composite feature vector comprising the face profile contour, the ear coordinates, and the statistical features of ears is obtained. At the on-line stage, the ear is coarsely detected using the face profile contour, and then accurately localized and recognized using the statistical features of ears and the relative position. Experimental results on real ear images show that the proposed method performs well.
Abstract: Face recognition is an active research area in artificial intelligence. A face recognition algorithm using an RBF network is proposed, based on wavelet analysis and the multi-level histogram sequence local binary pattern (M-HSLBP). Since wavelet analysis is insensitive to changes in expression, it can express the principal features of the face image while compressing the data. LBP is an efficient local texture description operator. The wavelet-transformed images are scanned with variable sub-windows at multiple levels. The sub-images are transformed by an enhanced LBP, and the LBP features are then concatenated into an enhanced feature vector that expresses both local and holistic features of the face image. An RBF network, with its high generalization ability, is a good classifier, especially for large numbers of samples. Experimental results on the ORL and YALE face databases show that the proposed algorithm, which achieves recognition accuracy above 98%, is more effective and faster than the traditional method.
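The LBP step described above can be illustrated with a minimal sketch. It implements only the basic 8-neighbour LBP operator and a concatenation of per-sub-image histograms; the abstract's enhanced LBP, the wavelet stage, and the RBF classifier are not reproduced. The function names, the neighbour ordering, and the `>=` comparison are assumptions made for illustration.

```python
def lbp_code(img, r, c):
    """Basic 8-neighbour LBP code for the interior pixel at (r, c)."""
    center = img[r][c]
    # Neighbours visited clockwise, starting at the top-left pixel
    # (the ordering is an assumption; any fixed ordering works).
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        if img[r + dr][c + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(img):
    """256-bin histogram of LBP codes over all interior pixels."""
    hist = [0] * 256
    for r in range(1, len(img) - 1):
        for c in range(1, len(img[0]) - 1):
            hist[lbp_code(img, r, c)] += 1
    return hist


def concatenated_histograms(sub_images):
    """Concatenate the LBP histograms of several sub-images into one vector."""
    features = []
    for sub in sub_images:
        features.extend(lbp_histogram(sub))
    return features
```

Concatenating histograms from sub-windows keeps some spatial layout in the feature vector, which a single global histogram would discard.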
Abstract: An illumination-invariant face recognition algorithm based on the L1 total variation model is proposed. It estimates the illumination of an image by using the L1 total variation model as a low-pass filter. Then, in the logarithmic domain, the log quotient image is defined as the quotient of the original image and its illumination and is used as an illumination-invariant normalized representation for face recognition. The TV-L1 based smoothing filter preserves edges better and removes halo artifacts efficiently, and its parameter selection is also simpler. Experimental results on the YaleB and CMU PIE face databases show that the algorithm effectively improves the face recognition rate under varying lighting conditions.
Abstract: Scale-space feature detection is one of the most frequently used methods in hand gesture recognition based on geometric models. However, the traditional method of scale-space feature detection involves heavy computation of Gaussian convolutions, which makes detection and recognition time-consuming. In this paper, a fast scale-space feature detection method is proposed. First, a series of simple rectangular feature templates is used to approximate the complicated Gaussian derivative convolution templates, yielding fast detectors of scale-space geometric features. After blob and ridge structures are detected in the gesture image, palm and finger structures are described, and gesture recognition is performed according to the configuration of palm and fingers. The integral image is used to rapidly calculate the convolution with the rectangular feature templates, so the detection of scale-space geometric features is greatly accelerated. Experiments on a standard dataset and a natural scene dataset show that the proposed method significantly reduces the time cost of gesture recognition while maintaining accuracy comparable to the traditional method.
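The integral-image speed-up mentioned above can be sketched as follows. This is a generic illustration, not the paper's detector: `second_derivative_rows` is a hypothetical three-band template (weights +1, -2, +1) standing in for one rectangular approximation of a Gaussian second derivative, and all names are assumptions.

```python
def integral_image(img):
    """Summed-area table of size (h+1) x (w+1) with a zero top row and left column."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for r in range(h):
        row_sum = 0
        for c in range(w):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii


def box_sum(ii, top, left, bottom, right):
    """Sum of img[top:bottom][left:right] (half-open ranges) from four lookups."""
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]


def second_derivative_rows(ii, top, left, band_h, width):
    """Hypothetical box template: three stacked bands weighted +1, -2, +1."""
    upper = box_sum(ii, top, left, top + band_h, left + width)
    middle = box_sum(ii, top + band_h, left, top + 2 * band_h, left + width)
    lower = box_sum(ii, top + 2 * band_h, left, top + 3 * band_h, left + width)
    return upper - 2 * middle + lower
```

Each template response costs a fixed number of table lookups, so the cost per pixel does not grow with the template size, whereas a direct Gaussian convolution grows with the kernel area.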
Abstract: Existing face illumination modeling methods generally assume that faces satisfy the convex Lambertian assumption and that the surface normals and albedos are known. However, these restrictions do not hold in practice, so the resulting illumination models deviate considerably from reality. To solve this problem, a novel method of face illumination modeling is proposed. First, the multi-mode decomposition of the Riemannian tensor is adopted to model the face illumination. Then, the illumination model is optimized with an improved generalized Lagrange algorithm. Theoretical analysis and experimental results both show that the method has higher precision and better practicability than the photometric stereo harmonic image method.
Abstract: A two-stage method of image feature extraction, called enhanced two-dimensional principal component analysis (Enhanced 2DPCA), is proposed in this paper. It applies 2DPCA in the row direction and alternative 2DPCA in the column direction, so it can compress an image in both directions and needs fewer coefficients for image representation than 2DPCA does. Experimental results on the ORL and FERET databases show that Enhanced 2DPCA works well and surpasses two-directional two-dimensional principal component analysis ((2D)²PCA).
Keywords: principal component analysis (PCA); two-dimensional principal component analysis (2DPCA); feature extraction; face recognition
Abstract: We present a comprehensive survey of the state of the art in 3D human motion editing and synthesis techniques. First, four types of methods for generating 3D human motion are introduced and compared. Second, techniques for motion capture data representation, motion editing, and motion synthesis are reviewed. Finally, some open problems and future directions are discussed.
Keywords: virtual human animation; motion capture; motion editing; motion synthesis
Abstract: Color image filtering is one of the most common research tasks in color image processing. Color image filtering techniques can be divided into component-wise methods and vector processing based methods. The component-wise methods were used mainly in early work, and a large body of research indicates that vector filtering methods are more effective because they better preserve the spectral characteristics of color images. This paper systematically summarizes and analyzes the fundamental theories and methods of color image vector filtering, discusses recent important developments in the field, and reports some typical applications. First, the classification of color image vector filtering techniques is analyzed, and for each class the most commonly used representative filtering algorithms are introduced and explained in detail. Then, drawing on the authors' research in this field, some new methods are proposed. Finally, for several representative and frequently used filtering algorithms, both perceptual visual results and objective evaluation data are presented, taking impulsive noise as an example, to illustrate their filtering performance.
Abstract: The shortcomings of the morphological edge enhancement algorithm are discussed. Based on this analysis, an adaptive fuzzy enhancement algorithm that reduces edge width is presented. It uses a local gray-level mean filter to smooth noise in non-edge areas. In edge areas, it adaptively replaces the gray level of the current pixel with the gray-level mean along the edge direction, or with the mean of the high or low gray levels along the gradient direction, thereby reducing the width of ramp edges. Membership functions describe the degree to which a pixel belongs to the edge area or to the homogeneous area, and a fuzzy strategy controls the enhancement. Experimental results show that the method enhances ramp edges while reducing noise.
Abstract: The Gaussian curvature based method proposed by Suk Ho Lee and Jin Keun Seo is applicable to low-gradient image areas and preserves their features effectively. However, black and white points appear in the restored image if the iteration step is slightly too large, and the number of iterations increases sharply when a small step is selected. This paper proposes a modified model that avoids such noisy points even with a larger step by using Tukey's biweight function to control the diffusion of Gaussian curvature. Furthermore, since higher-order denoising methods are effective and fast in high-gradient image areas, a fusion denoising model based on both Gaussian curvature and higher-order differentials is introduced. The model assigns appropriate weights to each part according to the actual image. It not only removes salt-and-pepper noise, which the surface fitting method cannot do, but also retains the merits of each technique, preserving edges and features at the same time.
Abstract: To attack steganographic schemes in graphics interchange format (GIF) images, a steganalytic algorithm based on the sum of distances (SOD) within a sub-block is proposed. The distance is computed as the absolute value of the difference between one index and another in a sub-block of an image. Because the difference between two indices may be positive or negative, the SOD is further classified into two kinds. According to the sign of the algebraic sum of the positive and negative SODs in a sub-block, the two SODs are designated the primary and secondary SODs, respectively. From these quantities, 32 statistics are computed for an image and a 56-dimensional feature vector is obtained. A steganalytic algorithm is designed based on this feature vector and a support vector machine. The number of different indices in a sub-block is also used to improve the feature extraction technique. Experimental results show that the presented algorithm outperforms traditional methods.
Keywords: steganalysis; sum of distances; feature vector; GIF (graphics interchange format)
Abstract: H.264 uses two kinds of integer transform, the 8×8 integer transform and the 4×4 integer transform, which makes hardware design more complex. At the same time, high-definition video applications require a more powerful decoder. A high-performance hardware architecture is proposed for the 2D inverse integer transform in H.264. For the 4×4 inverse integer transform, the four 4×4 blocks in an 8×8 sub-macroblock are reconstructed together, so the 8×8 and 4×4 2D inverse integer transforms can share the same architecture. With a new data storage and pipeline strategy, the inverse transform of column data and the inverse transform of row data can be performed at the same time. On average, 32 clock cycles are needed to process an 8×8 sub-macroblock. The transpose memory is composed of a two-port 32×32-bit SRAM and 8 groups of registers. Compared with the former design, the new architecture could reduce 537% area of transpose memory. When clocked at 108 MHz, the proposed design can perform the inverse transform in real time for an H.264 high-definition video decoder.
Abstract: Context-based adaptive binary arithmetic coding (CABAC) is a highly efficient entropy coding method, but its coding speed is restricted by high computational complexity, which has become a major bottleneck in its application. To solve this problem, based on an analysis of the CABAC algorithm and its computational complexity, this article proposes an effective way to improve the algorithm's coding speed by modifying the probability estimation update. The algorithm first codes a packet of N symbols and only then updates the probability estimate, which markedly decreases the update frequency of probability estimation. Experimental results show that, compared with the former algorithms, the coding speed of CABAC increases substantially, from 133% to 307%, while coding efficiency declines slightly, from 187% to 298%.
Abstract: Using the F-P (Filter-Pipeline) model, this paper proposes a way to stitch multi-projector displays in real-time video processing. The model uses multithreading so that several frames can be processed at the same time, which ensures that the video plays smoothly and with high quality. With this model, the system can easily be built on common PCs, which reduces cost. Furthermore, a system built with the F-P model can adapt to a new environment, because filters in the model can be added or changed. Two example designs, a "geometrical transform filter" and an "edge smoothing filter", are also given and shown to work well in real-time video processing.
Abstract: The purpose of affective computing is to enable computers to perceive, understand, and express emotions. Affect data acquisition and affect measurement are essential to research on affective computing. To explore the problem of affect coding, carefully designed experiments were performed in the scenario of the FIFA football video game. First, natural affect was elicited from game players in the game environment. Then, a scheme of affect coding based on emotion prototypes was proposed, and both categorical emotions and levels within the affective space were assessed by the game players as well as by third-party observers. Finally, inter-observer agreement was analyzed. The experimental data show that valid and consistent affect coding results can be attained with the proposed method.
Abstract: License plate location is the key technology of license plate recognition. Most current license plate location methods rely on the color and texture features of the plate, but they adapt poorly to different environments. To solve this problem, a color-matching template matrix is first constructed, exploiting the fixed color combinations of vehicle license plates, and is used to restrict the initial edge-detected image so that a restricted binary edge image is obtained. Then, morphological operators capable of eliminating noise are applied to form the initial candidate regions. Finally, the real license plate area is chosen from the candidate regions according to the texture features of the license plate. A BP neural network is adopted to recognize different colors in HSI space, which makes the method strongly adaptive. To shorten the location time, only the pixels in the neighborhood of edge pixels are converted to the HSI color space. Experiments show that the method locates license plates accurately in complicated environments and under different illumination.
Abstract: Fabric defect classification plays an important role in computer vision-based fabric quality inspection. In this paper, a novel defect classification method based on wavelet frames is proposed. The texture properties of defects are characterized using wavelet frames. The minimum classification error training method is used to jointly design a linear transform matrix-based feature extractor and a classifier, which yields classification-oriented wavelet features and minimizes the error rate associated with the classifier. The proposed method has been evaluated on 329 defect samples covering nine classes of fabric defects and 328 non-defect samples. A classification accuracy of 93.1% has been achieved, which is 27.1% better than the traditional wavelet-based classification method.
Keywords: fabric automatic detection; fabric defect classification; wavelet frame; minimum classification error training
Abstract: Military target classification is the most challenging task in SAR ATR. To improve recognition performance, and based on an analysis of the characteristics of MSTAR SAR images, a discrete wavelet analysis method is proposed for feature extraction. Because the wavelet low-pass approximation coefficients contain the energy of the SAR target echo, while the high-pass detail coefficients contain target details and speckle, the approximation coefficients are taken as features for classification, although they in fact form a low-resolution SAR image. The decision directed acyclic graph is chosen to extend the support vector machine to more than two classes of targets. Experimental results show that a high classification probability can be obtained by the SVM when the approximation coefficients of the third-level wavelet analysis are used as features. Moreover, the feature size is reduced and the recognition method is much more efficient.
Abstract: Based on the maximum entropy principle, a new segmentation approach using the fuzzy partition Renyi entropy of the two-dimensional histogram is proposed. First, the concept of fuzzy partition is introduced. Since the Renyi entropy is a generalized form of the Shannon entropy, fuzzy probability and conditional probability are used to define the fuzzy Renyi entropy. Then, the optimal pair of parameters is searched for in the sample space. Finally, image segmentation is realized using membership functions. Experiments are conducted in MATLAB on three pictures of real objects. Results show that the proposed approach is robust to noise and interference and outperforms the other methods compared.
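As a simplified illustration of entropy-based thresholding, the sketch below computes the Renyi entropy of order alpha, H = log(sum of p_i^alpha) / (1 - alpha), which reduces to the Shannon entropy as alpha approaches 1, and picks the threshold that maximizes the summed entropy of the two classes. It works on a crisp one-dimensional histogram; the fuzzy partition and the two-dimensional histogram of the abstract are not reproduced, and the function names and default `alpha` are assumptions.

```python
import math


def renyi_entropy(p, alpha):
    """Renyi entropy (natural log) of a discrete distribution p, for alpha > 0."""
    p = [x for x in p if x > 0]
    if abs(alpha - 1.0) < 1e-12:
        # Limit alpha -> 1 is the Shannon entropy.
        return -sum(x * math.log(x) for x in p)
    return math.log(sum(x ** alpha for x in p)) / (1.0 - alpha)


def renyi_threshold(hist, alpha=0.5):
    """Threshold t that maximizes H(hist[:t]) + H(hist[t:]) after normalization."""
    total = sum(hist)
    best_t, best_score = None, -math.inf
    for t in range(1, len(hist)):
        w0 = sum(hist[:t])
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so the split is not meaningful
        p0 = [h / w0 for h in hist[:t]]
        p1 = [h / w1 for h in hist[t:]]
        score = renyi_entropy(p0, alpha) + renyi_entropy(p1, alpha)
        if score > best_score:
            best_t, best_score = t, score
    return best_t
```

The exhaustive search recomputes class sums at every candidate threshold, which is fine for a 256-bin histogram; a two-parameter search over a 2D histogram would need a more careful implementation.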
Abstract: To address the heavy computation of facet-model-based surface detection methods, an acceleration scheme is proposed that combines a separable recursive filter algorithm for the 3D facet model with a region-of-interest strategy. The separable recursive filter algorithm implements the 3D convolution as three 1D convolutions and allows each 1D convolution to be computed recursively. This significantly reduces the computation time by making it independent of the kernel size. To solve the resulting high memory consumption of the separable recursive filter algorithm, an incremental method is employed. For the region-of-interest strategy, the piecewise bounding boxes of objects extracted after image segmentation are adopted as the valid region, which greatly decreases the amount of data to be processed. Experimental results show that the presented scheme achieves excellent acceleration with the same accuracy.
Keywords: surface detection; subvoxel; facet model; separable filter; region of interest
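The idea of replacing a 3D convolution with three recursive 1D passes can be sketched with a box kernel. This is an illustration only: the facet model's actual kernels are not box kernels, and the zero-padded border handling and function names are assumptions.

```python
def box_filter_1d(signal, radius):
    """Sliding-window sums with zero padding; cost is O(n) for any radius."""
    n = len(signal)
    out = [0] * n
    # The window centred on index 0 covers indices 0..radius.
    acc = sum(signal[0:radius + 1])
    for i in range(n):
        out[i] = acc
        enter = i + radius + 1   # sample that joins the next window
        leave = i - radius       # sample that drops out of the next window
        if enter < n:
            acc += signal[enter]
        if leave >= 0:
            acc -= signal[leave]
    return out


def box_filter_3d(volume, radius):
    """Separable 3D box sum on volume[z][y][x]: one 1D pass along each axis."""
    nz, ny, nx = len(volume), len(volume[0]), len(volume[0][0])
    # Pass 1: along x (also copies the data, so the input is left unchanged).
    v = [[box_filter_1d(row, radius) for row in plane] for plane in volume]
    # Pass 2: along y.
    for z in range(nz):
        for x in range(nx):
            line = box_filter_1d([v[z][y][x] for y in range(ny)], radius)
            for y in range(ny):
                v[z][y][x] = line[y]
    # Pass 3: along z.
    for y in range(ny):
        for x in range(nx):
            line = box_filter_1d([v[z][y][x] for z in range(nz)], radius)
            for z in range(nz):
                v[z][y][x] = line[z]
    return v
```

Each output sample costs one addition and one subtraction per axis, so the work per voxel stays constant as the kernel grows, which is the property the abstract attributes to the recursive formulation.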
Abstract: The thin plate spline (TPS) is a common tool for landmark-based elastic registration of medical images. In real applications, however, this scheme deforms the image globally even when the desired deformation is local. Although radial basis functions can confine the effect of the deformation locally, how to choose their parameters remains a problem. This paper uses a Gaussian function to compute the local elastic registration and proposes a method for choosing the Gaussian parameter so that the deformation is localized between two landmarks. Using this method in landmark-based local elastic image registration reduces the deformation of the image. Finally, extensive experiments are provided to support this conclusion.
Abstract: To fuse a high-resolution panchromatic image and a low-resolution image effectively, a novel image fusion algorithm using the lifting wavelet transform and the IHS transform is proposed in this paper. The high-resolution panchromatic image is first decomposed into lifting wavelet planes by lifting wavelet decomposition without down-sampling. Regions are then divided using edge information from the lifting wavelet planes, and the proposed merging algorithm adds an edge influence factor in the different regions. The region division is based on an evolutionary algorithm. The proposed method is compared with the IHS method, the wavelet transform, and other improved methods. The comparison demonstrates that, for the test images, the proposed algorithm produces high-quality fused images more effectively.
Abstract: Mobile devices such as Pocket PCs, PDAs, and smartphones are now used in distributed virtual environments because of their portability and mobility. However, mobile devices have their own limitations; for instance, real-time rendering of 3D models on mobile devices has not yet been satisfactorily resolved. In this paper, a new contour-based remote rendering algorithm for mobile device applications is proposed. It integrates the advantages of multi-resolution meshes and contour-based remote rendering and can preserve the feature details of the models. At the preprocessing stage, the algorithm extracts the feature lines from the original mesh and then prepares for multi-resolution mesh construction. At the real-time stage, the algorithm selects the final feature lines using a selection strategy, constructs the appropriate multi-resolution mesh according to interaction information such as the viewpoint and frame rate, and then extracts the contour lines from this mesh. Finally, it sends the selected feature lines and the contour lines to the mobile device for rendering. Experiments show that the algorithm preserves details well in contour-based remote rendering and ensures real-time rendering on mobile devices.
Abstract: In this paper, the NRLCTI (normalized run length code of corner, tangent, and inflection points) of a planar curve is first defined. Then a new algorithm is designed to match sub-curves. Finally, a novel approach is presented to recognize curves in a line drawing or an image. The proposed method has two merits. First, it performs a preliminary matching of feature points on the object and on the models based on NRLCTI, which addresses the low efficiency and high cost of establishing feature point correspondences. Second, it partitions the curve into many sub-curves based on the landmarks and then matches and recognizes them, which overcomes the low accuracy of approximating curves by polygons or conic curves. Computer simulations provide a preliminary demonstration of the effectiveness of the algorithm.
Abstract: Generalizing contour lines has long been an important but difficult task in cartographic generalization, and their graphic simplification is a necessary and difficult part of it. At present, contour line simplification methods are mainly geometrical approaches, whose results can hardly keep the primary shape characteristics after generalization. Based on an analysis of area division and bend nesting of curves under the furthest visibility condition, a new generalization method for contour lines is proposed. Experimental results show that the method is better than the Douglas-Peucker algorithm at eliminating self-intersections and keeping characteristic points, and that it better preserves the shape after simplification.
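For reference, the Douglas-Peucker baseline that the abstract compares against can be sketched as below. This is the standard recursive formulation, not the proposed visibility-based method; it measures the distance to the chord segment (a common variant of the perpendicular distance to the line), and the function names are assumptions.

```python
import math


def _point_segment_distance(p, a, b):
    """Euclidean distance from point p to the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Projection parameter of p onto the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, tolerance):
    """Simplify a polyline, keeping points farther than tolerance from the chord."""
    if len(points) < 3:
        return list(points)
    a, b = points[0], points[-1]
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _point_segment_distance(points[i], a, b)
        if d > dmax:
            index, dmax = i, d
    if dmax <= tolerance:
        return [a, b]
    left = douglas_peucker(points[:index + 1], tolerance)
    right = douglas_peucker(points[index:], tolerance)
    # The split point ends `left` and starts `right`; keep it once.
    return left[:-1] + right
```

Because each retained point is chosen purely by distance to a chord, nothing in this procedure prevents the simplified line from crossing itself or a neighbouring contour, which is the weakness the abstract targets.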
Abstract: Existing laser graphics display systems are monotonous, their graphics are inflexible, and they do not fully exploit the capabilities of laser display. A computer-controlled multi-channel laser graphics system is proposed. The system splits the light and coordinates an oscillating mirror, a rotating mirror, and a stepping mirror, so that the three different motor modes are displayed to better effect. The distortion caused by motor response is resolved in the light-splitting system, so the laser graphics display is enriched and improved. Experiments on the improved system demonstrate that the multi-channel laser graphics display system is easier to use and that its image quality and content are improved.
Abstract: The characteristics of current texture synthesis techniques on the PC platform are summarized. Texture synthesis based on Wang tiles is suitable for implementation on resource-constrained mobile devices. An improved bi-directional texture synthesis algorithm based on Wang tiles is presented, derived from an analysis of the stochastic tiling algorithm suggested by Cohen. The proposed algorithm consists mainly of the following steps: selecting sample sub-images, filling the tiles with texture, and tiling by bi-directional scanning. As a result, the search region for the best cut path is enlarged, the randomness of the selected tiles is increased, and the computation and storage requirements are reduced. Experiments validate the usability of the algorithm on mobile devices. Compared with stochastic tiling, the proposed algorithm gives better visual results in terms of aperiodicity.