Abstract: This is the ninth in the series of annual bibliography surveys on image engineering in China. The survey aims to capture the latest developments of image engineering in China, to give readers working in related areas a convenient means of literature searching, and to supply a useful reference for journal editors and prospective authors. Considering the wide distribution of related publications in China, 577 image engineering research and technique references were carefully selected from 2341 research papers published in a set of 15 Chinese journals. These 15 journals are considered important because their papers on image engineering are of higher quality and relatively concentrated. The selected references are classified first into 5 categories (image processing, image analysis, image understanding, technique application, and survey) and then into 21 specialized classes according to their main contents. Analysis and discussion of the statistics of the classification results are also presented. This work gives a general and readily available picture of the progress of image engineering in China. In 2003, the number of research papers in image engineering increased considerably. Besides the "traditional" fields of image segmentation and image coding, newly emerging research topics such as digital image watermarking, human face and organ detection, image matching and information fusion, and image and video retrieval remain "hot". It should be pointed out in particular that the ratio of image engineering papers to all research papers published in the above 15 journals reached a new level in 2003. This shows the trend of rapid progress of image engineering in China.
Abstract: Many images contain abundant text, such as the text in banners used for page design on web pages and the text in video. If these text occurrences could be detected, segmented, extracted, and recognized automatically, they would be a very valuable source of high-level semantics for image indexing and retrieval. International researchers have therefore paid increasing attention to acquiring text from images and videos, whereas domestic researchers have only recently entered this field. To help readers understand the area more systematically and to make references easier to find, this paper gives an overview of the state of the art of text acquisition research in images and videos. First, the paper discusses the current development of the area based on an analysis of recent related papers. Then, from the two aspects of text detection and extraction and of text recognition, it discusses typical techniques and approaches together with their merits and shortcomings, including those based on edges, texture, color regions, machine learning, multiple video frames, and OCR. Finally, in view of the present problems in this area, the paper points out some further work and open issues for future research.
Abstract: Spatial relations are extensively used in spatial database query languages, content-based spatial data retrieval, and spatial data analysis. Uncertainty is an inherent characteristic of spatial relations, but it has received little attention. In this article, first, the definition and implications of the uncertainty of spatial relations and its influence on their application are presented. Second, the sources of uncertainty in spatial relations, namely the uncertainty of spatial data, of spatial recognition, and of the processing of spatial relations, are analyzed, and a framework for handling uncertainty is proposed. Third, qualitative methods of describing spatial relations are evaluated and their shortcomings are pointed out. Finally, fuzzy set theory is introduced to describe the fuzziness of spatial relations. Fuzzy methods can describe the partial membership between a geometric spatial relation and a concept. The fuzziness of spatial objects, people's cognition of spatial relations, and the similarity between two spatial relations can be described in a unified way, and spatial relations between two crisp objects, between two fuzzy objects, and between crisp and fuzzy objects can also be expressed in a unified way. The fuzzy method lays a foundation for research on uncertain spatial relations.
Abstract: The energy function of the classical active contour model is composed of the internal energy and the external energy: the internal energy constrains the shape of the Snake, and the external energy drives the convergence of the active contour. In practice, the accuracy with which the contour converges to the object contour is mainly determined by the external energy, while the internal energy only ensures that the contour keeps a reasonable shape. Because the internal energy includes some unrelated factors, it has a side effect on the convergence accuracy. This paper therefore presents a new active contour structure to solve the problem. The internal energy is separated from the energy function, and only the external energy is applied to the convergence process of the active contour. At the same time, the image energy and the control energy are redefined, a revised contour function is introduced, and continuity and smoothness are embodied in the Snake shaping process. Experimental results show that the active contour algorithm under the new structure depends little on the initial contour and converges quickly; adaptively changing the number of control points also increases the tracking accuracy of the object contour.
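For reference, the classical energy functional that the abstract takes as its starting point is usually written in the following standard form. The symbols are the conventional ones and are not taken from this paper. The constraint term presumably corresponds to what the abstract calls the control energy, and the paper's redefined energies are not given in the abstract.

```latex
% Classical snake: contour v(s) = (x(s), y(s)), s in [0, 1]
E_{\text{snake}} = \int_0^1 \Big[ E_{\text{int}}\big(v(s)\big)
                 + E_{\text{image}}\big(v(s)\big)
                 + E_{\text{con}}\big(v(s)\big) \Big]\, ds,
\qquad
E_{\text{int}} = \tfrac{1}{2}\Big( \alpha(s)\,\lvert v_s(s)\rvert^2
               + \beta(s)\,\lvert v_{ss}(s)\rvert^2 \Big)
```

Here the first-derivative term penalizes stretching (continuity) and the second-derivative term penalizes bending (smoothness). These are the two shape constraints that the proposed structure removes from the energy function and handles in the Snake shaping process instead.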
Abstract: SIR-C is the first spaceborne imaging radar system with multiple wavelengths and quad polarization, developed jointly by the U.S., Italy, and Germany. Polarimetric SAR can measure the scattering matrix of each ground pixel and synthesize the image at any given orientation and ellipticity angle, including linear and elliptical polarizations. It has many advantages over single- or multi-polarization SAR in detecting objects, identifying targets, and extracting the geometric structure of ground targets. In recent years, theoretical modeling and field experiments have established active microwave remote sensing as an important tool for determining the physical properties of ground objects. However, because of the complexity of target distribution, different ground targets often have the same polarization signature, which leads to wrong interpretation of the images and misidentification of the targets. In addition, the relatively high correlation among the synthesized polarized images often leads to poor classification accuracy. Based on SIR-C data of He Tian prefecture in Xinjiang, China, we use target decomposition theory to decompose the data into three uncorrelated scattering components: an odd number of reflections, an even number of reflections, and cross-polarized scattering power, which represent the different scattering mechanisms of different objects. This decomposition technique allows us to estimate the single- and double-reflection components of the backscattering coefficients for VV and HH polarization, which greatly improves the correctness of ground object identification. Moreover, the three components are uncorrelated, which provides a richer data resource. This paper employs a neural network classifier to classify the SAR images by combining the decomposed components with the polarimetric synthesized SAR power image. The decomposition results show that the three decomposed scattering components reflect the correct scattering features. The classification results show that the method can effectively extract land cover information, achieve better classification accuracy of ground objects, and improve the ability of SAR to monitor land use and land cover.
Abstract: Minutiae have been considered the most distinctive features of a fingerprint and are thus widely used in fingerprint recognition/verification. Instead of using conventional extraction methods based on binarization and ridgeline thinning, D. Maio and D. Maltoni proposed in 1997 a totally different approach that detects minutiae directly in gray-scale images. In this approach, minutiae are extracted by ridge tracing directly in the gray image, so it is more robust because it avoids the errors introduced by binarization and thinning. However, the approach also has drawbacks, such as the difficulty of choosing its 7 thresholds and inaccurate determination of ridge endings. This paper presents an approach with several important improvements on the method of D. Maio and D. Maltoni. A novel method based on the Canny operator and morphology is used for fingerprint foreground segmentation, and both the ridge tracing angle and the change of gray level are introduced into the stop criteria for ending points. Other improvements include the selection of a better filter and a post-processing step that eliminates spurious ridgelines and false minutiae by considering the length of the ridgeline and the local structure of the ridgelines where minutiae are detected. Experiments demonstrate the efficiency and robustness of the proposed algorithm.
Abstract: On the basis of grain diameter, the touching relationship between two objects in an image can be divided into three types: not touching, touching with different grain sizes, and touching with the same grain size. According to the size of the touching part, it can be further divided into strong, medium, and thin touching. A shape-preserving segmentation of touching objects should restore their original shapes, but watershed segmentation and geodesic reconstruction damage the shapes and are easily affected by many factors. Nor can they calculate the size of the touching part. An alternative method is conditional granulometry segmentation. After eroding the image, the method does not dilate the marked area, and it can segment touching objects of different grain sizes. For touching objects of the same grain size, a special conditional granulometry segmentation is applied after erosion, in which the small-grain-size part (the narrow piece) found during reconstruction is the touching part of the objects. Finally, the paper introduces the application of this algorithm to rock grain-size distribution and cement types.
Abstract: This paper proposes a system for the segmentation and classification of skewed document images that contain irregular graphic regions and form regions. The skew angle of the document image is detected with a novel algorithm based on the morphological Hit-or-Miss operation and the hierarchical Hough transform. The Hit-or-Miss operation detects the baseline points, while the Hough transform detects the skew angle of the baseline, which is also the skew angle of the page image. To make the system valid for document images with irregular graphic regions, we introduce a middle-point cut process into the traditional projection profile cut algorithm so that irregular graphic regions can be approximated by many small rectangles. The segmented regions are classified using two features: the black-to-white ratio and the cross-correlation between adjacent pixels of the sub-blocks. Experimental results demonstrate the speed and reliability of the proposed system.
Abstract: In research on new multi-seeder dynamic testing technology based on image techniques, one of the most important problems is the matching and joining of the serial dynamic images of seed kernels. This paper introduces a practical method to solve this problem. Based on the theory of pattern recognition, the method meets the requirement of real-time testing. The principle of building the testing system is explained, the factors that influence the precision of image joining are analyzed, and methods for joining the serial images are proposed. A universal test-bed system is described that can be used to test the performance of precision seeders, grain drills, and single seeding units. This automatically controlled test bed can measure the seed kernel spacing correctly and reliably. Experiments show that the theory and method are practical, fast, and reliable, and that the real-time requirement is satisfied. The test error of the system is within 2 mm.
Abstract: Realistic applications require near-real-time matching of 3D polygonal arcs. This paper presents a method for representing and matching 3D polygonal arcs. The junction of polygonal arcs is defined. 3D polygonal arcs are represented by sets of spherical coordinates obtained in a local Cartesian coordinate system defined at each junction. This representation is invariant to translation and rotation. The sets are viewed as feature sets, whose benefit is that they capture both the geometric attributes and the topological structure of the polygon. The 3D polygonal arc matching task is reduced to a 1D numerical string-matching problem, so that matching is easy and the processing time is greatly reduced. The objective function is defined as the mean square error between the feature sets. Experiments with different classes of polygonal arcs and with real images show that the matching algorithm produces sufficiently reliable results and is robust to digitization errors and noise.
Abstract: Matching different images of the same scene is a key problem in computer vision and is frequently used in three-dimensional object reconstruction, object recognition, image alignment, camera self-calibration, and so on. Feature point matching is the most common of all kinds of image matching. To solve the problem of 3-dimensional scene reconstruction and to improve the performance of existing feature point matching, a matching scheme is required that is invariant to the perspective deformation induced by changes in viewpoint. This paper proposes a novel algorithm, Feature Match Based on Corner Affine Invariant. It selects corners as the features for image matching and characterizes them by their orientation and angular width. By computing affine invariants, the influence of image stretch, skew, rotation, translation, and lighting conditions is removed, and by using epipolar geometry as a matching constraint, outliers are eliminated as well. Feature matching is thus achieved for image pairs that differ greatly. Experiments show that the algorithm has high matching accuracy and good matching performance.
Abstract: In computer visualization, three-dimensional objects are usually rendered with triangle meshes, but too many triangles slow down real-time operations such as translation, rotation, and zooming to a degree that most users cannot tolerate. How to simplify triangle meshes is therefore regarded as increasingly important. To simplify triangles effectively without affecting the visual result, a new method named Two Times of Local Mapping is presented, based on the traditional edge contraction method. First, it maps the triangle mesh from three-dimensional space to a two-dimensional plane, using the Gaussian sphere to determine the direction of projection. Then, on the two-dimensional plane, using the traditional edge contraction method, a new contracted point is found in the kernel of a polygon. With the new point, the number of triangles is greatly reduced. After simplification, when the simplified triangles are mapped back to three-dimensional space, the error is checked. Two kinds of error are defined to ensure that the error is minimal. The method is shown to have small error, to be efficient and free of self-intersection, and to be stable and applicable.
Abstract: The Bézier surface is one of the most commonly used modeling tools in CAD/CAM systems. In developing such a system, research on approximate merging algorithms for a pair of adjacent Bézier surfaces is of great importance. Approximate merging of two adjacent Bézier surfaces means that two adjacent Bézier surfaces of degree m×n are approximated by a single surface of degree k×l (k≥m, l≥n) within an admissible error bound. As information exchange becomes more and more important in product design, the Internet develops further, and more and more enterprises are established, product model data is exchanged far more frequently than ever. If approximate merging algorithms are applied before product model data is exchanged, the amount of geometric data can be reduced. In this paper, an approximate merging method for tensor product Bézier surfaces is presented. It uses the matrix representation of subdivided Bézier surfaces and minimizes the distance function defined between the original Bézier surfaces and the merged Bézier surface. An explicit representation of the control points of the merged surface is obtained. Higher-order continuity along the surface boundaries is considered in the merging process. With this method, the approximate merging of a pair of adjacent Bézier surfaces can be carried out directly.
Abstract: Traditional volumetric methods for generating visual hulls are not applicable to large-volume objects because of redundant calculations. Other methods based on ray intersections are sensitive to input perturbations and hence lack robustness. This paper presents a fast new algorithm for reconstructing an object's visual hull from its silhouettes. The topological structure of the object's surface is taken into consideration, and the surface mesh is reconstructed directly without computing the redundant information of all the voxels inside. An improved SurfaceNet algorithm is then adopted to smooth the 3D surface. The robustness of the classical volume carving method is preserved, while the time complexity is reduced to depend only linearly on the number of vertices on the final surface. The dependence of the time complexity on the number of photos is reduced as well. Experimental results show that this topological generation algorithm is superior to some classical visual hull methods in terms of practical reconstruction.
Abstract: With the popularization of the Internet and the development of multimedia techniques, the Internet has become a main channel for information communication. People can obtain a great deal of information through the Internet, such as text, images, graphics, audio, and video. Unfortunately, the Internet and multimedia also afford virtually unprecedented opportunities to pirate copyrighted material. The same risk applies to the distribution of GIS data through the Internet. As a result, digital watermarking has been proposed as an efficient solution for copyright protection. Dozens of schemes and algorithms have been proposed, but most of them are for multimedia data. In this paper, a new watermarking method is proposed for the protection of GIS data. The approach applies Chinese remaindering to construct the watermark embedded in GIS data. The influence of the embedding process on the GIS data can be ignored at lower measurement precision. The method can be used for tamper proofing and authentication of GIS data. Because the algorithm conforms to the Asmuth-Bloom scheme, the watermark can be extracted from only a part of the GIS data. The extraction process does not need the original watermark, because the embedding procedure is based on the fact that a positive integer smaller than the product of pairwise relatively prime moduli is uniquely specified by its remainders modulo those integers. Because the algorithm operates on the coordinates of points, it can be widely used to protect GIS data such as relational databases as well as triangular meshes. The results show that the algorithm works well. Finally, some open problems are discussed.
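The number-theoretic fact that the extraction step relies on can be illustrated with a minimal sketch. The moduli, the watermark value, and the function name `crt_reconstruct` are assumptions chosen for illustration. The abstract does not say how remainders are embedded into coordinates, and the sketch omits the randomization used in a full Asmuth-Bloom scheme.

```python
from math import prod

def crt_reconstruct(remainders, moduli):
    """Recover x modulo prod(moduli) from its remainders.

    Assumes the moduli are pairwise coprime (Chinese remainder theorem).
    """
    total = prod(moduli)
    x = 0
    for r, m in zip(remainders, moduli):
        partial = total // m
        # pow(partial, -1, m) is the modular inverse of partial mod m (Python 3.8+)
        x += r * partial * pow(partial, -1, m)
    return x % total

# Hypothetical setup: four pairwise coprime moduli and a small watermark integer.
moduli = [11, 13, 17, 19]
watermark = 100
shares = [watermark % m for m in moduli]  # one remainder per data fragment

# Any subset of shares whose moduli multiply to more than the watermark
# is enough to recover it; here 11 * 13 = 143 > 100.
recovered = crt_reconstruct(shares[:2], moduli[:2])
print(recovered)
```

Recovery from a subset of the remainders is what allows extraction from only part of the data, and no copy of the original watermark is needed to perform it.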
Abstract: To increase the compression ratio of color video sequences, this paper presents a new coding method based on content statistics, built on the theories of the four-dimensional matrix and the four-dimensional matrix DCT (4D-MDCT). First, the four-dimensional matrix constructed from the color video sequence is divided into submatrices, and the 4D-MDCT is applied to each submatrix. Four-dimensional matrix quantization is then applied to the transformed coefficients. To exploit the correlation between and within color frames, statistics are gathered on the quantized four-dimensional submatrix coefficients to find submatrices that have identical quantized coefficients. Only the first of these identical quantized four-dimensional submatrices is coded, using conventional run-length coding and entropy coding. Experimental results show that the compression ratio is greatly increased while a high PSNR is preserved. Because conventional run-length coding and entropy coding are simple and easy to realize in hardware, the method has a promising future.
Abstract: To remove the correlation between the color components of color images more efficiently and to obtain better color image compression results, this paper presents a new approach to decorrelating the color components based on the color information matrix. A quantization and encoding scheme is provided in the new color space according to the characteristics of each new color component. Like the KLT color space, the new color space is data dependent and can take full advantage of the inherent correlation between the color components of the original images. It differs from the KLT color space, however, in its optimality criterion. The KLT color space is optimal when the original color components of the images have been centered, while the new color space is optimal when they have not been centered. Many experiments are carried out to analyze the encoding performance in the new color space. The results show that the new approach removes the correlation between color components efficiently and is easy to implement. It also achieves a better signal-to-noise ratio and higher compression rates. The reconstructed image is visually pleasing, and the coder is easily combined with other compression and encoding methods.
Abstract: The computation in a video codec is concentrated mainly in transformation and motion search, which are the bottlenecks in realizing a real-time video codec. To reduce the computation in the transformation and motion search of an H.263 encoder, the floating-point DCT is replaced by an integer transform, which frees the video encoder of floating-point operations, and an all-zero decision technique based on the integer transform is proposed. The all-zero decision is made in inter-frame coding by comparing the sum of absolute differences (SAD) with a value related to the quantization parameter during motion search. If an all-zero block is found, the subsequent computation of motion search and integer transform is skipped, which avoids a large amount of computation when the motion search is very accurate. The technique largely maintains image quality and greatly improves the efficiency of a real-time H.263 encoder by skipping all-zero blocks before the integer transform, while also shortening the code stream.
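The early-termination test described above can be sketched as follows. The abstract only says that the SAD is compared with "a value related to the quantization parameter", so the linear threshold `scale * qp` and the default `scale=8` are hypothetical placeholders, not the paper's actual bound for its integer transform.

```python
def sad(cur_block, ref_block):
    """Sum of absolute differences between two equally sized 2D blocks."""
    return sum(
        abs(c - r)
        for cur_row, ref_row in zip(cur_block, ref_block)
        for c, r in zip(cur_row, ref_row)
    )

def is_all_zero_block(cur_block, ref_block, qp, scale=8):
    """Predict that every quantized coefficient of the residual will be zero.

    The threshold scale * qp is an assumed form; the real bound depends on
    the transform and quantizer in use.
    """
    return sad(cur_block, ref_block) < scale * qp

# Usage: an identical prediction is skipped, a poor prediction is not.
reference = [[10] * 8 for _ in range(8)]
poor_match = [[30] * 8 for _ in range(8)]
print(is_all_zero_block(reference, reference, qp=4))   # residual is zero
print(is_all_zero_block(poor_match, reference, qp=4))  # SAD far above threshold
```

When the test succeeds, an encoder built this way would stop the motion search for that block and bypass the transform and quantization stages.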
Abstract: This paper analyzes the problem of gray-level merging during image histogram modification and presents two histogram modification methods that prevent the loss of image details. The first expands the gray-level range of the image, so that fewer gray levels are merged and more image details are retained. The second modifies gray levels locally. In this method, the image histogram is first modified with a general histogram modification method. The gradients of the modified image and of the original image are then calculated and compared, and the pixels whose gradient has decreased markedly are found. Finally, the gray levels of these pixels are modified according to the gradient of the original image, so that the details lost during histogram modification are recovered. Both methods enhance image contrast while retaining the details. Furthermore, the proposed approach is simple and easy to implement. Experimental results on four images are given at the end of the paper to demonstrate the performance of the methods.
Abstract: Compressed video is vulnerable to transmission errors when transmitted over unreliable channels. In this paper, a two-step multi-weighted boundary-matching algorithm (TMBMA) for video error concealment is proposed to combat transmission errors. With this algorithm, the motion vector (MV) of a damaged block is estimated by making full use of the information in the blocks around it. A pre-concealment step first improves the video frame quality to some degree, and the multi-weighted boundary-matching algorithm is then used to evaluate every MV candidate of the damaged block. The MV candidate giving the minimum boundary-matching difference is selected to conceal the damaged block, which improves the video frame quality further. The algorithm overcomes the disadvantages of the side matching algorithm (SMA), such as incorrect MB displacement at object borders and poor performance when an entire frame or many consecutive GOBs are lost. H.263 simulation results show that satisfactory concealment performance can be achieved by effectively controlling the propagation of video errors.
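The candidate-selection step can be sketched as a generic weighted boundary-matching search. This is not the paper's TMBMA: the pre-concealment step is omitted, and the four per-side weights, the function names, and the frame layout (lists of pixel rows, block not touching the frame edge) are assumptions made for illustration.

```python
def boundary_cost(cur, ref, top, left, size, mv, weights=(1.0, 1.0, 1.0, 1.0)):
    """Weighted boundary-matching difference for one MV candidate.

    Compares the pixels just outside the damaged block in the current frame
    with the edge pixels of the motion-compensated block in the reference
    frame. The damaged pixels themselves are never read.
    """
    dy, dx = mv
    w_top, w_bottom, w_left, w_right = weights
    ry, rx = top + dy, left + dx  # top-left corner of the candidate block in ref
    cost = 0.0
    for i in range(size):
        cost += w_top * abs(cur[top - 1][left + i] - ref[ry][rx + i])
        cost += w_bottom * abs(cur[top + size][left + i] - ref[ry + size - 1][rx + i])
        cost += w_left * abs(cur[top + i][left - 1] - ref[ry + i][rx])
        cost += w_right * abs(cur[top + i][left + size] - ref[ry + i][rx + size - 1])
    return cost

def select_mv(cur, ref, top, left, size, candidates, weights=(1.0, 1.0, 1.0, 1.0)):
    """Return the MV candidate with the minimum boundary-matching difference."""
    return min(
        candidates,
        key=lambda mv: boundary_cost(cur, ref, top, left, size, mv, weights),
    )

# Usage: a static frame with a vertical edge; the lost 2x2 block straddles the edge.
ref = [[0 if x < 5 else 100 for x in range(10)] for _ in range(10)]
cur = [row[:] for row in ref]
best = select_mv(cur, ref, top=4, left=4, size=2, candidates=[(0, 1), (0, 0), (0, -1)])
print(best)
```

Lowering the weight of a side whose neighbouring block is itself lost or unreliable is one plausible use of the per-side weights; how the paper assigns its weights is not stated in the abstract.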