Abstract: As one of a series of reports, this paper surveys multimedia research and applications in China in 2007. Because multimedia is an interdisciplinary research area, papers on multimedia technology are scattered across various journals. We examined 3,034 papers published in 9 Chinese journals in 2007 and selected 516 related to multimedia technology and applications. We then analyzed them and compared the statistics with those from 2003 to 2006. The categories have been revised so that the selected papers focus more closely on multimedia research. The data show that digital watermarking, virtual reality, multimedia data retrieval, multicast, streaming media, interactive modes and interfaces, and biometric identification attract considerable attention from researchers in China. Meanwhile, multimedia application systems are diversifying and gradually becoming integrated with daily life. We present an overview of the progress of multimedia technology in China in 2007. This article can serve researchers as a thorough reference and will also be helpful to those working in technical planning and management.
Keywords: multimedia; digital watermarking; virtual reality; multimedia data retrieval; multicast; streaming media; interactive mode and interface; biometrics
Abstract: Video fire detection (VFD) is one of the most active research topics in computer vision, valuable for both theoretical and practical research, and it has a wide spectrum of promising applications in video surveillance for early fire alarms in public security. With improvements in visual feature models of fire, many VFD systems have been developed. This paper reviews the main issues in VFD, including its advantages over traditional detectors, the classification and description of visual fire features, representative algorithms and systems, and future trends. Key problems concerning compatibility, real-time efficiency, intelligence, performance evaluation, and multi-sensor fusion for VFD are then discussed. In addition, a novel VFD model based on hierarchical attention and a multi-sensor saliency fusion framework are proposed to boost the efficiency and activity of fire surveillance through salient feature representation and low computational redundancy.
Keywords: computer vision; fire/flame; real-time alarm; video fire detection (VFD); visual attention (VA)
Abstract: Image interpolation aims to reconstruct a high-resolution image from a noisy low-resolution image. Although many magnification algorithms have been proposed in the literature, it is difficult to balance the visual quality of the interpolated image against the computational complexity of the algorithm. In this paper, a novel PDE-based interpolation approach driven by local geometric structures is proposed. Coupled with different diffusion mechanisms for edges, textures, and corners, the algorithm is not only robust to noise but also capable of enhancing edges and textures while preserving corner structures. The PDE is subsequently applied to super-resolution reconstruction, on the grounds that image interpolation and super-resolution are mathematically consistent. In addition, coupled with total variation modeling, a slightly improved version of the PDE is proposed to remove false textures from the super-resolved image under high-level noise. Numerous experimental results demonstrate the effectiveness of our approach in terms of both visual quality and PSNR.
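The abstract above reports results in terms of PSNR. As a reminder of how that figure is usually computed, here is a minimal sketch using the standard definition, PSNR = 10·log10(peak² / MSE). The function name `psnr`, the list-of-rows image layout, and the default peak of 255 are assumptions made for illustration and are not taken from the paper.

```python
import math

def psnr(reference, test, peak=255.0):
    """PSNR in dB between two equally sized grayscale images (lists of rows)."""
    if len(reference) != len(test) or any(
        len(r) != len(t) for r, t in zip(reference, test)
    ):
        raise ValueError("images must have the same shape")
    n = sum(len(r) for r in reference)
    squared_error = sum(
        (a - b) ** 2 for r, t in zip(reference, test) for a, b in zip(r, t)
    )
    mse = squared_error / n
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)
```

A higher value means the reconstructed image is closer to the reference; identical images give infinity under this definition.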
Abstract: The latest video coding standard, H.264, supports variable-size block motion prediction and compensation. To adapt to this feature, this paper proposes a new temporal error concealment algorithm based on variable-size block motion vector recovery. Exploiting the correlation between adjacent macroblocks, the encoding mode and partition pattern of a lost macroblock, and the reference frame used in matching, are decided from the encoding modes and reference frame information of neighboring macroblocks; a motion vector is then recovered for each partitioned subblock using the side distortion matching method. Simulation results show that the proposed algorithm is superior to conventional temporal concealment methods for the H.264 standard.
Abstract: One of the key issues in robust watermarking is resistance to geometrical attacks. A coarse-to-fine matching strategy is proposed that achieves geometrical synchronization by combining Zernike moments with a template in the wavelet domain. At the coarse stage, the rotation and scaling parameters are estimated from the Zernike moments of the translation-normalized image, while the translation parameters are estimated from the centroid shift between the original image and the image corrected for rotation and scaling. At the fine stage, accurate values of rotation, scaling, and translation (RST) are obtained by matching the template around the roughly estimated RST values, which reduces the search space. Finally, the attacked image is corrected with the accurate RST values. A watermarking scheme based on the vector hidden Markov model in the wavelet domain (DWT-HMM) is also adopted. Good robustness is observed against StirMark attacks and their combinations.
Abstract: A recoverable semi-fragile watermarking algorithm for image content authentication is proposed. The algorithm uses a compressed halftone binary image as the watermark and embeds it in the wavelet domain through quantization index modulation. It can not only detect a tampered area but also recover it by decompressing and inverse-halftoning the watermark image. In addition, a secret key is used to select the watermark bits to embed and their embedding positions, which increases the security of the watermarking. Experimental results show that this semi-fragile watermarking algorithm is effective and practical for content authentication.
Keywords: image authentication; quantization index modulation; halftone; tamper recovery
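Quantization index modulation, named in the abstract above, can be illustrated on a single coefficient. The sketch below is the generic scalar form of the technique: bit 0 snaps the coefficient to multiples of a step `delta`, and bit 1 snaps it to the same lattice shifted by `delta / 2`. The function names, the step size, and the choice of coefficients are hypothetical; the paper's wavelet-domain embedding, key-based position selection, and halftone watermark are not reproduced here.

```python
def qim_embed(coeff, bit, delta):
    """Quantize coeff onto the lattice that encodes bit (0 or 1)."""
    offset = bit * delta / 2.0
    return round((coeff - offset) / delta) * delta + offset

def qim_extract(coeff, delta):
    """Return the bit whose lattice lies closest to the received coefficient."""
    d0 = abs(coeff - qim_embed(coeff, 0, delta))
    d1 = abs(coeff - qim_embed(coeff, 1, delta))
    return 0 if d0 <= d1 else 1
```

With this scheme a marked coefficient survives any perturbation smaller than `delta / 4`, which is what makes a larger step more robust and a smaller step less visible.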
Abstract: A feature coding scheme based on the content of JPEG images is proposed. The basic idea of the scheme is to use the partial energy relationship between groups of 8×8 DCT blocks to produce a feature code. A semi-fragile watermarking scheme then combines this content-based feature coding with watermark embedding. The feature code is semi-fragile, meaning that it is robust to acceptable ‘content-preserving’ modifications but sensitive to ‘content-changing’ tampering. The lower-frequency DCT region is used to form the feature codes, while the higher-frequency region is used to embed the watermarks. Experimental results demonstrate that the proposed algorithm has advantages such as low computational complexity, good robustness to JPEG compression, and precise localization of tampered areas.
Abstract: Seamless tiled display based on multiple projectors is an effective way to achieve wide-field, high-resolution display of graphics, images, and video. The key problem for seamless tiled display is correcting color inconsistency. Existing analyses attribute color inconsistency to two causes: variation in the color output characteristics of the projectors, and the effects of the projection screen and the environment. Many photometric calibration techniques have been proposed in recent years. According to their principles and implementation methods, they fall into three types: edge-blending-based calibration, single-projector light-based calibration, and gamut-output-matching-based calibration. We compare the advantages and disadvantages of these three techniques in terms of seamless display quality, maintainability, extensibility, and other aspects. The future direction of this field is real-time photometric calibration that accounts for different projector types, different shapes and reflective characteristics of display screens, and moving observers.
Abstract: Intrascan intensity inhomogeneities are a common source of difficulty in MRI segmentation. We model the bias field with Legendre polynomials and take the best estimate to be the one that yields minimum entropy. Estimating the bias field requires finding the parameters of the basis functions, but conventional methods such as gradient descent often converge to a local optimum. To find the global optimum, we first applied a genetic algorithm to search for the parameters; however, the results were not satisfactory. We then modified the genetic algorithm to make the global optimum easier to find. Experiments on the segmentation of brain magnetic resonance images show that the modified algorithm achieves an optimal bias field and accurate segmentation results.
Keywords: magnetic resonance image (MRI); bias field; entropy; gradient descent; genetic algorithm; local optimum; global optimum
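The minimum-entropy criterion mentioned in the abstract above can be made concrete with the Shannon entropy of the intensity histogram. The sketch below only shows why entropy is a plausible objective: a smooth multiplicative bias spreads each tissue class over more intensity values and so raises the entropy. The function name and the toy data are assumptions for illustration; the Legendre polynomial model and the genetic search from the paper are not shown.

```python
import math
from collections import Counter

def histogram_entropy(pixels):
    """Shannon entropy (in bits) of the intensity histogram of a flat pixel list."""
    counts = Counter(pixels)
    n = len(pixels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Toy example: two tissue classes, then the same pixels under a varying gain.
clean = [100] * 4 + [200] * 4
gains = [0.90, 0.95, 1.00, 1.05, 0.90, 0.95, 1.00, 1.05]
biased = [round(p * g) for p, g in zip(clean, gains)]
```

In this toy case `histogram_entropy(clean)` is lower than `histogram_entropy(biased)`, so a candidate bias field that restores the clean intensities would score better under the criterion.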
Abstract: Linear filters tend to blur image edges or degrade image quality in image processing, which makes research on nonlinear filters worthwhile. The myriad filter is a nonlinear filter based on the theory of stable distributions. It exploits the different modes of the stable distribution to process nonlinear signals with weights. Compared with the median filter, the most widely used image filter, the weighted myriad filter not only removes salt-and-pepper noise efficiently but also preserves image details better. After a brief introduction to the standard myriad filter, this article presents the center-weighted myriad filter and the adaptive weighted myriad filter, which achieve better performance by adjusting the parameter K that balances noise removal against detail preservation within the window. The weights applied in the filter can also vary adaptively according to the image content.
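To show how the parameter K trades noise removal against detail preservation, the sketch below computes the weighted sample myriad of one filter window using its usual definition: the value β that minimizes Σ log(K² + wᵢ(xᵢ − β)²). The brute-force grid search, the function name, and the `steps` parameter are simplifications chosen for illustration; this is not the center-weighted or adaptive scheme described in the article.

```python
import math

def weighted_myriad(samples, k, weights=None, steps=2000):
    """Approximate the weighted sample myriad by a grid search (requires k > 0)."""
    if weights is None:
        weights = [1.0] * len(samples)
    lo, hi = min(samples), max(samples)
    if lo == hi:
        return float(lo)

    def cost(beta):
        # Myriad objective: sum of log(K^2 + w_i * (x_i - beta)^2).
        return sum(
            math.log(k * k + w * (x - beta) ** 2)
            for x, w in zip(samples, weights)
        )

    # The minimizer always lies between the smallest and largest sample.
    candidates = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return min(candidates, key=cost)
```

For the window `[10, 11, 9, 10, 255]`, a small K (for example 1) keeps the output near 10 and rejects the impulse, while a very large K makes the output approach the sample mean of 59.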
Abstract: Median filtering and its improved variants are effective for removing random-valued impulse noise from images. However, most of these methods share the same shortcomings: difficulty in finding the optimal threshold, and over-smoothing of edges and texture structures. In this paper, we propose a novel method based on geometric structure detection to remove random-valued impulse noise from images. First, the histogram of the noisy image is used to estimate the noise ratio. Next, two thresholds are adaptively determined from the noise ratio and the histogram of the detail image. Using these two thresholds, all pixels in the detail image are divided into three sets: ‘uncorrupted pixels’, ‘undetermined pixels’, and ‘noise pixels’. The set of ‘undetermined pixels’ is often composed of edge and texture pixels as well as noise pixels. Finally, geometric structure detection is applied to resolve the ‘undetermined pixels’. Based on the type of each pixel, the result of median filtering is modified using the detail image. Simulation results show that the proposed method removes impulse noise while preserving the edge structure of the image, and that it outperforms existing methods.
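The two-threshold split into three pixel sets can be sketched as follows. Two assumptions are made here that the abstract does not confirm: the detail image is taken to be the absolute difference between the noisy image and its 3×3 median-filtered version, and the thresholds `t_low` and `t_high` are passed in as fixed numbers rather than derived adaptively from the noise ratio. The geometric structure detection step is not shown.

```python
from statistics import median

def median3x3(img):
    """3x3 median filter with replicated borders; img is a list of rows."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            window = [
                img[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy in (-1, 0, 1)
                for dx in (-1, 0, 1)
            ]
            row.append(median(window))
        out.append(row)
    return out

def classify_pixels(img, t_low, t_high):
    """Label each pixel by how far it deviates from the local median."""
    med = median3x3(img)
    labels = []
    for row, med_row in zip(img, med):
        label_row = []
        for p, m in zip(row, med_row):
            detail = abs(p - m)  # assumed definition of the detail image
            if detail <= t_low:
                label_row.append("uncorrupted")
            elif detail >= t_high:
                label_row.append("noise")
            else:
                label_row.append("undetermined")
        labels.append(label_row)
    return labels
```

Only the pixels labeled "undetermined" would need the more expensive structure test, which is where the saving over testing every pixel comes from.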
Abstract: This paper introduces a camera imaging model with radial distortion, used in a structured-light vision system for measuring geometric tolerance, together with an improved calibration method based on the Direct Linear Transformation (DLT). First, with the traditional calibration technique applied to the linear model, all parameters are obtained by solving an overdetermined system of linear equations. Second, with the results of the first step as initial values, the intrinsic parameters, such as the image center, equivalent focal lengths, skew factor, and distortion coefficient, are obtained through nonlinear iteration. No pre-calibration is required with this method. The method achieves suitable accuracy and proves to be simple and practical.
Abstract: Foreground detection is an important research problem in visual surveillance. In this paper, we present a novel multi-layer background model that detects foreground and classifies it into three classes: moving object, stationary object, and ghost. The background is divided into two layers, a reference background and a dynamic background, modeled by a single Gaussian model and a Gaussian mixture model, respectively. Compared with many existing background models, a unique characteristic of the proposed algorithm is that stationary objects and ghosts are correctly labeled by analyzing the Gaussian distributions of the two layers. A real-time object detection and tracking system was developed and tested in indoor and outdoor scenes under various scenarios. Extensive experimental results demonstrate that the proposed algorithm is effective and efficient, and that the system reaches a processing speed of 15 fps at an image size of 320×240.
Abstract: Ultra-wideband (UWB) synthetic aperture radar (SAR) is widely used to detect foliage-concealed targets because of its superior penetration. However, no systematic and integrated detection algorithm has been presented for UWB SAR. Most research institutes follow the three-stage detection flow developed by Lincoln Laboratory to detect foliage-concealed targets in UWB SAR images. This existing algorithm performs excellently when detecting and recognizing targets in high-frequency, high-resolution, fully polarimetric SAR data, but it encounters many problems when applied to foliage-concealed targets in UWB SAR images. This paper analyzes those problems and presents a new detection algorithm. The new algorithm relaxes the requirements on detection conditions through small sliding-window averaging, low-threshold constant false alarm rate (CFAR) detection, and connectivity analysis, and it offers good applicability and stability. Detection results of the existing algorithm and the proposed algorithm for foliage-concealed targets in three UWB SAR images are given at the end of the paper; they confirm the validity of the new algorithm for foliage-concealed target detection in UWB SAR.
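For readers unfamiliar with CFAR detection, the sketch below shows a generic one-dimensional cell-averaging CFAR: each cell is compared against a multiple of the mean of its surrounding training cells, with a few guard cells excluded. It is a textbook variant used here only to illustrate the idea of a locally adaptive threshold. The function name and the `guard`, `train`, and `scale` parameters are hypothetical, and the sliding-window averaging and connectivity analysis of the proposed algorithm are not reproduced.

```python
def ca_cfar(signal, guard, train, scale):
    """Return indices whose value exceeds scale times the local noise estimate."""
    n = len(signal)
    detections = []
    for i in range(n):
        # Training cells lie on both sides of cell i, outside the guard band.
        cells = [
            signal[j]
            for j in range(i - guard - train, i + guard + train + 1)
            if 0 <= j < n and abs(j - i) > guard
        ]
        if not cells:
            continue
        noise = sum(cells) / len(cells)
        if signal[i] > scale * noise:
            detections.append(i)
    return detections
```

Lowering `scale` corresponds to a lower threshold: more weak targets pass, at the cost of more false alarms that a later stage has to remove.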
Abstract: Short-term synaptic plasticity caused by repeated stimulation can distort the receptive fields of neurons in the human visual system. Based on this mechanism, this paper proposes a hypothesis on the manner in which receptive fields are distorted. The influence of the angle between the edge orientation and the long axis of the distorted receptive field on filtering performance is studied. Based on an analysis of the resulting curve, a new high-pass filtering algorithm is put forward. Experiments confirm the superiority of the algorithm in filtering performance and real-time processing, and they show the hypothesis to be reasonable. The experimental results also show that the algorithm is more appropriate for describing the gazing mechanism of the human visual system.
Keywords: short-term synaptic plasticity; gazing mechanism; high-pass filtering
Abstract: Extracting linear features such as roads from remotely sensed imagery is an important task in remote sensing information extraction. A semi-automatic method for extracting roads from high-spatial-resolution remotely sensed imagery is proposed. The main steps are as follows: 1) basic profile features, such as the starting road direction, width, and radiometric distribution, are obtained from a user-specified pair of starting road seeds; 2) a search fan is created, within which several ‘scan snakes’ are dispatched in several directions, each consisting of several snake joints, i.e., scan profiles. Within each joint of each snake, a pair of edge points (gradient extrema of the pixel values on each side of the road) that satisfies the road profile model is sought. Each pair found within a joint adds a vote to its snake. The best snake is the one with the most votes, and it determines the next search direction. The search proceeds from the starting position until a stopping condition is met, e.g., the image boundary is reached; 3) the edge points are then connected to form a double-sided road. The main road network can be extracted under many complex conditions, such as marked changes in road direction and radiometric distribution, broken roads, and intersections. Several experiments on Beijing-1 panchromatic imagery (with a spatial resolution of 4 m) validate the adaptability and practicality of our method.
Abstract: Automatic facial expression recognition is a core part of emotional information processing. This study develops an automatic facial expression recognition approach based on a confusion-crossed support vector machine tree (CSVMT) to improve recognition accuracy and robustness. Pseudo-Zernike moment features are extracted to train a CSVMT for automatic recognition. The structure of the CSVMT enables the model to divide the facial expression recognition problem into subproblems according to the teacher signals, so that the subproblems can be solved with reduced complexity at different tree levels. In the training phase, the subsamples assigned to two internal sibling nodes undergo a decreasing confusion cross, which enhances the generalization ability of the CSVMT for facial expression recognition. Experiments are conducted on the Cohn-Kanade facial expression database, and a competitive recognition accuracy of 96.31% is achieved. Comparative results on the same database also show that the proposed approach achieves higher recognition accuracy and robustness than other approaches.
Keywords: automatic facial recognition; confusion cross; support vector machine tree; pseudo-Zernike moment
Abstract: In optical deformation measurement, obtaining fringe phase information from a single speckle interferogram accurately and automatically is a difficult problem. In this paper, a novel method is proposed to extract fringe skeletons precisely. We first estimate the fringe orientation and use a special filter to suppress orientation noise. The fringe slope angles are then computed from the fringe orientations. In the slope angle map, large jumps in the slope values occur at fringe peaks and fringe valleys, so the fringe skeletons can be easily extracted from the map. The method has been tested on real speckle fringe pattern images, and the results show that it is efficient and robust for ESPI even under high speckle noise.
Abstract: The fusion of multispectral and synthetic aperture radar (SAR) images is of great significance. However, the small-scale texture information of SAR images has received little attention in previous fusion work. To obtain better fusion results, a new data fusion technique based on the biorthogonal wavelet transform has been developed that uses three types of images: a TM multispectral image, a single-temporal JERS-1 SAR image, and a small-scale texture image extracted from multi-temporal JERS-1 SAR images based on the principles of multi-channel filtering and smoothing filter-based intensity modulation (SFIM) fusion. Compared with the fusion method without the texture image, the new method not only preserves spectral information well but also inherits rich small-scale textural information from the SAR images, which gives the result higher definition. The wavelet fusion algorithm with the texture image is highly flexible and practical.
Keywords: smoothing filter-based intensity modulation (SFIM); wavelet transform; texture image; remote sensing data fusion; TM image; JERS-1 SAR image
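The SFIM principle named above modulates a low-resolution band by the ratio of a high-resolution image to its own smoothed version, so that only local texture is transferred. The sketch below shows that ratio on co-registered images of equal size, using a simple box mean as the smoothing filter. The function names, the box filter, and the `radius` parameter are assumptions for illustration; how the paper applies the principle to multi-temporal SAR data is not reproduced.

```python
def box_mean(img, radius=1):
    """Mean filter over a (2*radius+1) square window, truncated at the borders."""
    h, w = len(img), len(img[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            vals = [
                img[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
            ]
            row.append(sum(vals) / len(vals))
        out.append(row)
    return out

def sfim(low, high, radius=1):
    """Fused = low * high / smooth(high); low is left unchanged where smooth is 0."""
    smooth = box_mean(high, radius)
    return [
        [l * hv / s if s else l for l, hv, s in zip(l_row, h_row, s_row)]
        for l_row, h_row, s_row in zip(low, high, smooth)
    ]
```

Where the high-resolution image is locally flat the ratio is 1 and the low-resolution value passes through unchanged, which is why the spectral content is largely preserved.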
Abstract: A solution for content-based video retrieval is urgently needed to support efficient management and quick browsing of massive video data. Shot classification is an important research topic in soccer video processing and retrieval. Considering the shortcomings of existing methods, we propose a novel approach based on sub-window regions that classifies soccer video shots into main shots, middle shots, close-up shots, and others. The key step is to calculate the proportion of field-colored pixels in the sub-window regions of a soccer video frame in HSV color space. Experimental results show that the proposed method achieves very high precision and recall.
Keywords: shot classification; sub-window region; soccer video; HSV color space
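The key step described above, measuring the proportion of field-colored pixels in HSV space, can be sketched with Python's standard `colorsys` module. Everything numeric here is a placeholder: the hue range taken to mean "field green", the saturation and value floors, and the ratio cut-offs that map to shot types are illustrative assumptions, and the sketch scores a single window rather than the paper's sub-window layout.

```python
import colorsys

def field_ratio(pixels, hue_range=(0.20, 0.45), min_sat=0.3, min_val=0.2):
    """Fraction of (r, g, b) pixels (0-255) whose HSV value looks like grass."""
    if not pixels:
        return 0.0
    count = 0
    for r, g, b in pixels:
        h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
        if hue_range[0] <= h <= hue_range[1] and s >= min_sat and v >= min_val:
            count += 1
    return count / len(pixels)

def classify_shot(ratio, main_min=0.6, middle_min=0.3, closeup_min=0.05):
    """Map a field-color ratio to a shot label using hypothetical cut-offs."""
    if ratio >= main_min:
        return "main"
    if ratio >= middle_min:
        return "middle"
    if ratio >= closeup_min:
        return "close-up"
    return "other"
```

In practice the hue range would be estimated from the footage itself, since grass color shifts with lighting and broadcast grading.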
Abstract: An improved active shape model method is used to extract facial feature points accurately. Then, in view of the characteristics of face shape, some of the facial feature points are taken as the face model, and the 3D face pose is estimated through least-squares optimization. Experimental results show that the new method not only yields a stable pose solution but also achieves better estimation accuracy than comparable methods.
Keywords: face pose estimation; active shape model (ASM); optimization
Abstract: Based on the theory of the Marching Cubes algorithm, an improved algorithm for extracting isosurfaces from a 3D data field is introduced. First, the 3D data field is decomposed into a topological structure of points, lines, faces, and cubes. The intersection points of the lines with the isosurface are calculated first. The intersection lines of the faces with the isosurface are then obtained by joining the intersection points within each face, and these intersection lines are joined to form space polygons within the cubes. The triangle mesh of the isosurface is obtained by triangulating the space polygons in each cube, and the mesh is recorded in a vertex table and a triangle table. Based on the connectivity of triangles at the vertices, a seed algorithm is used to mark the vertices and triangles that belong to the same child isosurface. The vertices and triangles belonging to each child isosurface are then recorded in their own vertex table and triangle table. A worked example shows that the algorithm can extract isosurfaces and group child isosurfaces with high efficiency.
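The first step above, finding where a grid line crosses the isosurface, is commonly done by linear interpolation between the scalar values at the two endpoints, as in standard Marching Cubes. The sketch below shows that calculation for one edge. The function name and the handling of the degenerate case where both endpoints equal the isovalue are choices made here for illustration; the face, polygon, and child-isosurface grouping steps are not shown.

```python
def edge_intersection(p0, p1, v0, v1, iso):
    """Point where the isosurface crosses edge p0-p1, or None if it does not.

    p0, p1 are coordinate tuples; v0, v1 are the scalar field values there.
    """
    if (v0 - iso) * (v1 - iso) > 0:
        return None  # both endpoints lie on the same side of the isosurface
    if v0 == v1:
        return tuple(float(c) for c in p0)  # whole edge lies on the isosurface
    t = (iso - v0) / (v1 - v0)  # fraction of the way from p0 to p1
    return tuple(a + t * (b - a) for a, b in zip(p0, p1))
```

Computing each crossing once per edge, rather than once per cube that shares the edge, is what lets neighboring cubes reuse the same vertex in the vertex table.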
Abstract: In this paper, a contour-based image retrieval algorithm is proposed to improve on the algorithm of Choi Wai pak et al., in which the shape of an object is represented by normalized maximal disks. To generate a simpler and smoother contour, the proposed algorithm applies one-dimensional Gaussian functions of two different scales to the concave and convex parts of the contour, respectively. In addition, the skeleton of the contour is extracted by a skeletonization algorithm. Finally, the histogram of the distances between the evolved contour and the skeleton is used to describe the shape for retrieval. Whereas the original algorithm uses only the skeleton of an object, the proposed algorithm uses both the contour, which represents the shape of the object from the outside, and the skeleton, which preserves the topology of the original object from the inside. Experimental results show that the proposed algorithm outperforms that of Choi Wai pak et al. in robustness to scaling, rotation, and noise corruption.
Abstract: To produce a variety of realistic skin images, this paper proposes a robust, physical-model-based algorithm that automatically separates a single skin image into melanin and hemoglobin components based on an analysis of skin structure and composition. A skin image can then be synthesized locally or globally from the separated pigment components. First, the input image is divided into several sub-regions, and the ICA algorithm is performed in each sub-region in turn to extract separation vectors. Each separation vector is tested for validity; invalid vectors are discarded, and valid ones are kept as candidate vectors. These operations are repeated, and the final separation vector is obtained from the collection of candidate vectors and used to compute the melanin and hemoglobin component maps. Based on the separated component maps, a new skin image can be synthesized globally, or a region of interest (ROI) can be selected for local synthesis. Our experiments show that the proposed algorithm is very effective and can serve as an e-cosmetic function that produces realistic results.
Abstract: Lacking a graphics module, general shared-memory multiprocessors cannot use hardware-accelerated volume rendering techniques to visualize the three-dimensional scientific datasets they generate. A hybrid parallel rendering algorithm based on multiprocessors and distributed graphics workstations is presented. The datasets are processed on both the local multiprocessors and the distributed graphics workstations. The subsequent image composition is performed by the multiprocessors to make efficient use of their communication capabilities. Through load-balancing optimization, the parallel rendering pipeline can overlap the rendering, compositing, and display processes. Experimental results show that interactive rendering can be achieved for large datasets residing on the multiprocessors at an image resolution of 1024×1024.