Abstract: This study proposes a fast mode decision algorithm for high-efficiency video coding (HEVC) that combines image texture direction with spatial correlation to reduce the number of candidate modes in the rough mode decision (RMD) and to speed up the mode selection process. The Sobel operator is used to obtain the texture direction of the current prediction unit (PU), and the corresponding angular prediction modes are selected to form the RMD candidate list. The spatial correlation of images is then exploited by adding the optimal modes of the neighboring PUs to the candidate list. The proposed algorithm reduces the number of RMD candidate modes from 35 to fewer than 10. As a result, the encoding time is shortened by 34.9%, while the bit rate increases by 0.41% and the PSNR decreases by 0.0314 dB. The results show that this method significantly reduces the complexity of the RMD algorithm and improves the efficiency of HEVC coding while keeping the encoding quality nearly unchanged.
Keywords: high-efficiency video coding (HEVC); video compression; intra-prediction; intra-mode decision; Sobel operator
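The texture-direction step described in the HEVC abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `texture_direction`, the 5-degree histogram bins, and the use of `|gx| + |gy|` as the gradient magnitude are assumptions made for illustration. Mapping the resulting angle to HEVC angular modes and adding the neighboring PUs' modes are deliberately left out.

```python
import math

# 3x3 Sobel kernels for the horizontal (GX) and vertical (GY) gradient.
GX = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
GY = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]

BIN_DEG = 5                  # assumed histogram resolution in degrees
NUM_BINS = 180 // BIN_DEG


def texture_direction(block):
    """Return the dominant edge direction of a block in degrees [0, 180),
    or None when the block is flat (no gradient anywhere)."""
    height, width = len(block), len(block[0])
    hist = [0.0] * NUM_BINS
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = sum(GX[j][i] * block[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            gy = sum(GY[j][i] * block[y + j - 1][x + i - 1]
                     for j in range(3) for i in range(3))
            magnitude = abs(gx) + abs(gy)
            if magnitude == 0:
                continue
            # The edge (texture) direction is perpendicular to the gradient.
            angle = (math.degrees(math.atan2(gy, gx)) + 90.0) % 180.0
            # Bins are centred on multiples of BIN_DEG; 180 wraps to 0.
            hist[int(round(angle / BIN_DEG)) % NUM_BINS] += magnitude
    if not any(hist):
        return None
    best = max(range(NUM_BINS), key=hist.__getitem__)
    return float(best * BIN_DEG)
```

For a block whose left half is dark and right half is bright, the function returns 90.0 (a vertical edge); an encoder would then test only the angular modes near that direction instead of all 35 modes.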
Abstract: To reduce redundancy and generate alias-free subbands, we design a low-redundancy multiscale image transform with a new anti-aliasing directional filter. The new transform combines the wavelet transform with frequency-domain directional filter banks. The directional filter bank has compact support in the frequency domain, and the filtered directional subbands are free of spectral aliasing. A novel decimation method is adopted so that the decimated directional subbands remain shift invariant and avoid aliasing. The new transform reduces redundancy efficiently and improves the directional selectivity of directional filter banks. Simulation results show that the proposed multiscale image transform achieves good results in sparse representation, compression, and other applications.
Keywords: image multiscale transform; anti-aliasing directional filter bank; frequency direction filtering; low redundancy
Abstract: Traditional image compression algorithms are mostly based on the L2 norm. However, those methods cannot precisely control the error at each pixel. This paper proposes a maximum error image shift compression algorithm based on the L∞ norm. The algorithm guarantees that the error at each pixel of the reconstructed image stays within a given range. First, the algorithm exploits the similarity of image pixels to decompose the image into many non-overlapping sub-blocks. Then, each sub-block is completely shifted, and the retained shift coefficients are stored. Finally, the original image is reconstructed from the retained shift coefficients. The experimental results show that the suitable block size differs across image resolutions and grows as the resolution increases. Compared with the existing L∞-norm maximum error shift compression algorithm, the proposed algorithm improves the compression ratio, the reconstructed image quality, and the compression speed.
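The difference between an average-error criterion and a per-pixel (maximum) error bound can be shown with a small sketch. The code below is not the shift-based scheme of the abstract above; it is a generic uniform quantizer, used here only because it provably keeps every reconstruction error within a chosen bound `delta`. All function names are hypothetical.

```python
def quantize(pixels, delta):
    """Map integer pixels to indices so that each reconstructed value
    differs from the original by at most delta."""
    step = 2 * delta + 1
    return [(p + delta) // step for p in pixels]


def reconstruct(indices, delta):
    """Invert quantize(): each index maps back to a multiple of the step."""
    step = 2 * delta + 1
    return [q * step for q in indices]


def max_error(a, b):
    """Maximum absolute error: the largest per-pixel deviation."""
    return max(abs(x - y) for x, y in zip(a, b))


def mean_squared_error(a, b):
    """Mean squared error: an average that can hide large single-pixel errors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
```

With `delta = 2`, every integer pixel is reconstructed to within 2 gray levels. A codec that minimizes only the mean squared error gives no such per-pixel guarantee.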
Abstract: Nonlocal means (NLM) is a successful method for denoising digital images. However, its decay control parameter is usually fixed and cannot be changed adaptively according to the image content. Considerable research has examined the influence of the decay control parameter, which describes the width of the smoothing kernel, and its data dependence in order to make NLM more effective. This paper therefore proposes applying a no-reference image content metric (denoted Q), which does not require the noise to be Gaussian, to NLM so that the fixed decay control parameter can be optimized without a reference image. Some methods use Stein's unbiased risk estimate (SURE) as an estimator of the mean squared error (MSE) of NLM to select the optimal parameters for restoring an image. However, these methods are suitable only for Gaussian noise. In real situations, the noise in an image is usually not Gaussian, and its variance also varies spatially, so selecting the optimal parameters via SURE is ineffective. First, this paper incorporates the no-reference image content metric Q into NLM to optimize the decay control parameter iteratively and obtain the optimal denoised result. Second, building on the linear relationship between the Rician noise level in magnetic resonance imaging (MRI) images and the optimal decay control parameter, we tune the parameter of unbiased NLM for MRI denoising according to the structural information (such as texture and edges) of the image. This method measures textures and edges in an image via the metric Q. By considering the structural information of the image, the decay control parameter can be reduced when the value of Q is high so that textures and edges are retained in the denoised result.
Experimental results show that the proposed method improves the peak signal-to-noise ratio (PSNR), and the denoised results are visually superior to those obtained by traditional methods. In summary, the proposed method optimizes the NLM parameter via the no-reference image content metric Q, which enables NLM to choose the decay control parameter adaptively according to the image content. Incorporating the image content metric into NLM is therefore valid for denoising.
Keywords: nonlocal means; no-reference image content metric; parameter optimization; decay control parameter; image denoising
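To make the role of the decay control parameter concrete, here is a minimal one-dimensional NLM sketch. It assumes the common weight form `exp(-d2 / h**2)`, where `d2` is the mean squared difference between two patches and `h` is the decay control parameter. The function name, the patch and search sizes, and the border handling are illustrative assumptions. The Q-based tuning of `h` described in the abstract above is not implemented here.

```python
import math


def nlm_1d(signal, h, patch=1, search=5):
    """Denoise a 1-D signal with nonlocal means.

    h      -- decay control parameter (width of the smoothing kernel)
    patch  -- patch half-width used to compare neighbourhoods
    search -- half-width of the search window around each sample
    """
    n = len(signal)

    def sample(i):
        # Replicate the border samples outside the signal.
        return signal[min(max(i, 0), n - 1)]

    output = []
    for i in range(n):
        weight_sum = 0.0
        accumulator = 0.0
        for j in range(max(0, i - search), min(n, i + search + 1)):
            d2 = sum((sample(i + k) - sample(j + k)) ** 2
                     for k in range(-patch, patch + 1)) / (2 * patch + 1)
            weight = math.exp(-d2 / (h * h))
            weight_sum += weight
            accumulator += weight * signal[j]
        # weight_sum >= 1 because the sample always matches itself.
        output.append(accumulator / weight_sum)
    return output
```

On a step signal, a small `h` leaves the edge intact, whereas a large `h` averages across it. That trade-off is why lowering the parameter in textured or edge-rich regions, as the abstract proposes, helps preserve structure.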
Abstract: We propose a tone optimization-based detail-enhancement algorithm for images and video to address the tone discrepancy between the input image and the detail-enhanced image produced by state-of-the-art methods. We first extract intensity information through color space conversion to improve efficiency and avoid the color distortion that results from the correlation among color channels. Subsequently, we perform edge-preserving image filtering based on local extreme values to rapidly decompose the intensity image into a base image containing coarse-scale information and several detail layers containing fine-scale information. Under the constraints of user-specified detail-magnification coefficients and the color fields of the input image, a gradient-domain energy optimization-based detail-enhancement algorithm is proposed to obtain a detail-exaggerated intensity image with tone consistency. Finally, we employ inverse color space conversion to obtain the detail-enhanced result. Results show that the proposed algorithm achieves significant detail enhancement with tone-preserving characteristics for arbitrary input images. The proposed algorithm satisfies the technical requirements of image and video detail enhancement and shows great potential in various applications, such as scientific investigation, visual surveillance, and special visual effects.
Abstract: A scene restoration algorithm based on image fusion and segmentation is proposed to enhance the contrast and detail of hazy images captured by a machine vision system. Haze density is first roughly estimated from the physical properties of optical reflectance imaging and a morphological operation. The atmospheric veil is then estimated accurately by weighted image fusion and by computing the local variance. The global atmospheric light is obtained by segmenting the haziest region or the sky part of the image. Finally, the restored result is obtained through a physical model, and the brightness and chroma of the image are adjusted via tone mapping. This method avoids halo artifacts and color distortion while achieving good restoration of contrast and color fidelity. Results show that the proposed method adapts robustly to different scenes and improves both the restoration effect and the computation speed to varying degrees.
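The abstract above does not spell out its physical model. The sketch below assumes the widely used atmospheric scattering model I = J·t + A·(1 − t), where I is the observed intensity, J the scene radiance, t the transmission, and A the global atmospheric light. The atmospheric veil is then V = A·(1 − t). The function name and the lower bound `t_min` are illustrative assumptions.

```python
def restore_pixel(observed, veil, airlight, t_min=0.1):
    """Invert the assumed haze model for one pixel (intensities in [0, 1]).

    observed -- hazy intensity I
    veil     -- atmospheric veil V = A * (1 - t)
    airlight -- global atmospheric light A
    t_min    -- lower bound on transmission to avoid amplifying noise
    """
    transmission = max(1.0 - veil / airlight, t_min)
    # I = J * t + V  =>  J = (I - V) / t
    return (observed - veil) / transmission
```

For example, a scene radiance of 0.4 seen through transmission 0.5 with airlight 1.0 yields an observed intensity of 0.7 and a veil of 0.5, and `restore_pixel(0.7, 0.5, 1.0)` recovers approximately 0.4.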
Abstract: Depth images typically have very low resolution. In this paper, given a registered, potentially high-resolution color image of the same scene, we propose a second-order total generalized variation (TGV) super-resolution method based on color image regularization terms. First, the low-resolution depth image is mapped onto the coordinate system of the high-resolution color image. Then, the regularization term is constructed from the second-order TGV model and a high-resolution image constraint term with an edge indicator function. The depth map super-resolution problem is formulated within an energy optimization framework. Finally, a reweighted method and a primal-dual method are used to minimize the energy function. The experimental results demonstrate that the proposed approach preserves edge information well and obtains a high-resolution depth image in terms of both spatial resolution and depth precision. The proposed method can effectively solve the problem of low depth image resolution.
Keywords: depth image super-resolution reconstruction; second-order total generalized variation; edge indicator function; TOF camera
Abstract: Image retargeting quality assessment describes the quality of the results produced by different retargeting methods. However, the available methods cannot provide efficient assessments with objective and quantitative values. Thus, we present a new image retargeting quality assessment method based on the preservation of important regions during retargeting. First, a new important-region identification algorithm is defined by combining salient region detection, image edges, and superpixel segmentation. Second, area change and component change functions are constructed to describe how well the area and components of the important regions are preserved. Finally, the image retargeting quality assessment function is defined by combining the first two steps. We conduct our experiments on the image database of RetargetMe, a benchmark for image retargeting, and compute the quality values of the retargeted results to assess the different retargeting methods. Kendall coefficients are used to measure the consistency between our assessment and the subjective assessment. The average Kendall coefficient of the proposed method exceeds those of existing objective assessment methods by 0.5% to 11%. This finding demonstrates that the proposed method can quantitatively assess the retargeting results of different kinds of images accurately, efficiently, and rapidly.
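The consistency measure mentioned in the retargeting abstract above can be sketched as follows. This is the plain Kendall tau-a rank correlation between two score lists. It is assumed here for illustration, since the abstract does not say which Kendall variant or tie handling was used.

```python
def kendall_tau(scores_a, scores_b):
    """Kendall tau-a between two equally long score lists.

    Returns a value in [-1, 1]; 1 means the two rankings agree on every
    pair, -1 means they disagree on every pair. Tied pairs count as
    neither concordant nor discordant.
    """
    n = len(scores_a)
    concordant = discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (scores_a[i] - scores_a[j]) * (scores_b[i] - scores_b[j])
            if product > 0:
                concordant += 1
            elif product < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)
```

Here `scores_a` could hold an objective metric's scores for several retargeted results and `scores_b` the corresponding subjective scores. A larger coefficient means the objective metric orders the results more like human observers do.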
Abstract: To overcome the disadvantage of projective non-negative matrix factorization (PNMF), which fails to discover the intrinsic geometrical and discriminating structure of the data, this paper proposes a novel graph embedding regularized projective non-negative matrix factorization (GEPNMF) method for extracting face image features. First, two adjacency graphs are constructed to characterize the intrinsic geometrical structure of the data subspace and the interclass separability. Then, using the Laplacian matrices of the two graphs, a graph embedding regularization term is designed and combined with the objective function of PNMF to form the objective function of GEPNMF. Because of this regularization, the subspace learned by GEPNMF preserves the geometrical structure of the data subspace while maximizing the margins between different classes. In addition, an orthogonal regularization term is introduced into the objective function to ensure that the learned bases are parts-based. Finally, a multiplicative update rule is derived to optimize the objective function, and its convergence is proved theoretically. Face recognition experiments are conducted on the ORL, Yale, and CMU PIE face image datasets, where the recognition rates reach 94.00%, 66.43%, and 98.58%, respectively. Experimental results show that the face image features extracted by GEPNMF achieve better face recognition performance.
Abstract: A linear dynamical system (LDS), as a descriptor for dynamic texture, can capture the transition of appearance and motion effectively. However, LDS models do not lie in a Euclidean space, which makes it impossible to apply traditional sparse coding techniques directly for classification and recognition. A novel approach based on sparse coding and LDS is proposed for dynamic texture recognition. The proposed algorithm employs a principled convex optimization formulation that allows a sparse representation code and a linear transformation matrix to be inferred jointly. The model parameters are optimized and learned to achieve good texture recognition. Experiments are conducted on the publicly available UCLA dynamic texture database, and the results are compared with those of other methods. Experimental results show that the proposed method outperforms earlier approaches, achieving a recognition rate of 97% and better robustness to occlusion.
Abstract: We propose a bottom-up method for the perceptual grouping of parallel lines, based on the distinctive structural characteristics of insulator strings, for detecting transmission line defects in unmanned aerial vehicle (UAV) inspection. This method is applied to improve the correct recognition rate of insulators and to overcome the deficiencies of the color-based insulator recognition method. First, the line segments extracted in all directions from the inspection image are divided into six groups. The line segments with similar lengths, directions, and center point orientations are then grouped into parallel segment clusters. Insulator regions are detected by combining the parallel segment clusters and organizing the circumscribed shapes of these clusters based on knowledge about transmission line models. Glass insulator defects can be diagnosed according to the similarities among the feature blocks of the mean and variance of the inertia moment, the adaptive partition of insulator regions, and the calculation of the direction and distance between the insulator strings. Compared with the HSI color-based insulator recognition method, the insulator recognition method based on multiple internal parallel line structures exhibits more stable performance and is thus more suitable for transmission line inspection. Tests on transmission line images from UAV inspection show that the proposed method can identify various types of insulators and detect insulator off-chip defects effectively against cluttered backgrounds.
Abstract: Autonomous aerial refueling is an effective way for unmanned aircraft to overcome limited payload and endurance and to extend flight duration, which greatly enhances combat efficiency; it can also reduce the risk of refueling operations for manned aircraft. Sensing the relative position between the receiver aircraft and the refueling drogue is the first and necessary step in the autonomous aerial refueling procedure, and vision-based methods are the most widely used for relative positioning. We propose a circular feature-based drogue detection and tracking method that uses the internal refueling port of the drogue as the matching feature. Drogue tracking is the final step and outputs the final image positioning result, even during the drogue detection process. In drogue tracking, the rectangular region of interest (ROI) of the current frame is determined and extracted from the positioning result of the previous frame, and the Canny algorithm is then applied to obtain the ROI edge image. From the edge pixels, a row and column scanning method extracts the inner edge points of circular objects, which excludes irrelevant pixels, and the precise ellipse parameters are then solved through an iterative least-squares fitting method. In drogue detection, a multi-direction closest point searching method is proposed to extract all possible drogue image areas by checking for the circular feature while scanning in multiple directions from each pixel of interest, and the final, precisely matched result is determined by the drogue tracking method described above. Experiments are carried out with a drogue model of almost the same size as a real refueling drogue. The Hough-based circle detection method is used for comparison, and a laser rangefinder provides the reference positions during the experiments.
Experimental results show that the proposed method has a success rate of 94.71% in precise relative distance positioning of the drogue, which is better than the Hough-based circle detection method. In terms of real-time performance, the Hough-based circle detection method takes 59.52 ms on average. With the proposed method, drogue detection takes less than 500 ms, and drogue tracking takes at most 21.59 ms and 4.18 ms on average, which is compatible with a camera running at 25 frames/s. Vision-based navigation is a promising and practical solution for probe-drogue aerial refueling docking, but the complex background of the tanker aircraft makes image detection and tracking of the drogue difficult and easily disturbed. We use the internal circular refueling port as the image detection feature and propose a new drogue image detection and tracking method. No additional optical markers need to be installed on the refueling drogue, and the real-time position and size of the drogue in the image plane can be determined accurately. The proposed method meets the accuracy, real-time performance, and reliability requirements of image processing in the autonomous aerial refueling procedure.
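As a simplified illustration of fitting a circular feature to extracted edge points, the sketch below uses an algebraic least-squares circle fit (the Kåsa method). This is a deliberately simpler technique than the iterative least-squares ellipse fit described in the abstract above. The function names are hypothetical, and the edge points are assumed to have been extracted already.

```python
import math


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_circle(points):
    """Fit x^2 + y^2 + D*x + E*y + F = 0 to (x, y) points by least squares.

    Returns (center_x, center_y, radius).
    """
    n = float(len(points))
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    syy = sum(y * y for _, y in points)
    sxy = sum(x * y for x, y in points)
    z = [-(x * x + y * y) for x, y in points]
    rhs = [sum(x * zi for (x, _), zi in zip(points, z)),
           sum(y * zi for (_, y), zi in zip(points, z)),
           sum(z)]
    normal = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, n]]
    d = det3(normal)
    if abs(d) < 1e-12:
        raise ValueError("points are degenerate (collinear or too few)")
    solution = []
    for col in range(3):
        # Cramer's rule: replace one column with the right-hand side.
        replaced = [row[:] for row in normal]
        for row in range(3):
            replaced[row][col] = rhs[row]
        solution.append(det3(replaced) / d)
    big_d, big_e, big_f = solution
    cx, cy = -big_d / 2.0, -big_e / 2.0
    return cx, cy, math.sqrt(cx * cx + cy * cy - big_f)
```

The fitted center and radius play the role of the drogue position and size in the image plane. A perspective-distorted port appears as an ellipse, which is why the abstract's method fits an ellipse instead.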
Abstract: The ability to capture depth information of static real-world objects has become increasingly important in many fields of application, such as manufacturing and prototyping as well as the design of virtual worlds for movies and games. Using a time-of-flight camera to obtain the scene depth map is very convenient, but because of hardware limitations, the resolution of the depth map is very low and cannot meet practical needs. Improving the resolution of the depth image is therefore an interesting topic. To address this problem, we propose a novel method for depth map super-resolution. Given a low-resolution depth map as input, we recover a high-resolution depth map by using a registered high-resolution color image. Drawing on the benefits of non-local and local priors, we propose a novel adaptive weighting filter framework. Specifically, given that discontinuities in range and color tend to co-align, we formulate the non-local and local adaptive weighting filters based on the raw depth map and the features of the high-resolution color image. With the non-local adaptive weighting filter, our algorithm prevents jagged artifacts in the depth super-resolution result and is more robust to different initial depth inputs. The local adaptive weighting filter then further improves the quality of the reconstructed depth results. Experiments demonstrate that our approach obtains excellent high-resolution range images in terms of both spatial resolution and depth precision. Peak signal-to-noise ratio (PSNR) comparisons show that our method reconstructs much better high-resolution range images than other state-of-the-art methods, and the advantage of our algorithm becomes more pronounced as the down-sampling factor increases. In this paper, we present an adaptive weighting filter framework for the depth map super-resolution problem.
Based on the mutual benefits between the raw depth map and the visual features of the color image, we formulate the super-resolution process as an adaptive weighting filter integrating non-local and local priors. Experiments show that the proposed method produces sharper edges and more faithful details than other state-of-the-art approaches.
Abstract: Computed tomography angiography (CTA) is one of the most widely used imaging techniques for diagnosing coronary artery disease. However, the similar gray-scale distributions of coronary arteries and surrounding tissues make vascular structures difficult to recognize. This study proposes a coronary artery tracking method based on direction clustering. Vessel structures are enhanced by integrating the closing operation and adaptive grayscale stretching. A spherical operator is proposed for detecting vascular boundaries. The operator is constructed by casting direction rays from the extraction point at its center. The intensity difference ratios of adjacent sampling points are extracted from the intensity distribution along the rays. Vascular boundaries are then extracted and refined through convex hull optimization. Finally, hierarchical clustering is employed to extract the direction vector of the vascular structure. Experimental results demonstrate that the proposed method can accurately extract the coronary artery from CTA images, with an extraction accuracy of 0.39 mm. The designed spherical operator adapts well to different kinds of 3D vessels, and the hierarchical clustering method used for direction vector classification does not require training. The proposed method is highly automatic and accurate.
Abstract: The key to hyperspectral remote sensing image compression is removing spatial and spectral correlation. Given the data structure of hyperspectral remote sensing images, how to remove these two kinds of correlation effectively is a crucial problem. In image coding, the wavelet transform is a very effective method for removing redundancy, and the 3D wavelet transform can effectively remove both the spatial and the spectral correlation of hyperspectral images. Therefore, this paper proposes a coding algorithm based on band ordering and a 3D hybrid tree. First, the hyperspectral remote sensing image is divided into groups in natural spectral order, each containing 16 consecutive spectral bands. Second, in each band group, the sum of the spectral correlations of all adjacent bands is calculated. If this sum is less than a given threshold, a complete graph is constructed from the bands in the group. Each vertex of the complete graph represents a band, so the graph has 16 vertices, and each edge carries the spectral correlation value of the two bands it connects. The maximum Hamiltonian cycle of the complete graph is then searched for, and the bands in the group are reordered according to it to increase the correlation value of the band group. The reordered band group then undergoes a 3D wavelet transform, and the 3D hybrid tree coding algorithm is used to encode the wavelet coefficients. Band groups whose correlation sums are greater than the given threshold are not reordered; the 3D wavelet transform is applied to them directly, and the wavelet coefficients are then encoded by the 3D hybrid tree coding algorithm.
Reordering the band groups with relatively small correlation values has two benefits. First, it improves the efficiency of the 3D wavelet transform in removing the spectral and spatial correlation of the hyperspectral image, so redundancy is removed more effectively. Second, because the correlation of weakly correlated band groups is increased, the 3D hybrid tree coding algorithm produces more effective zero trees. As a result, the coding efficiency improves to a certain extent. AVIRIS hyperspectral images are used in simulation experiments to verify the validity of the algorithm. The experimental data show that, compared with using the wavelet transform alone, combining band ordering with the hybrid tree coding algorithm improves the peak signal-to-noise ratio of the decoded image to some extent.
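The band reordering step in the hyperspectral abstract above can be sketched on a toy band group. The code searches for the maximum-weight Hamiltonian cycle by brute force, which is feasible only for tiny groups: for the 16-band groups of the abstract, fixing one vertex still leaves 15! (about 1.3 × 10^12) orderings, so a practical implementation would need a heuristic search that the abstract does not specify. Cutting the cycle at its weakest edge to obtain a linear band order is an assumption made here, and all function names are hypothetical.

```python
from itertools import permutations


def adjacent_sum(order, corr):
    """Sum of correlations between consecutive bands in a linear order."""
    return sum(corr[a][b] for a, b in zip(order, order[1:]))


def max_hamiltonian_cycle(corr):
    """Brute-force search for the heaviest cycle visiting every band once."""
    n = len(corr)
    best_cycle, best_weight = None, float("-inf")
    for rest in permutations(range(1, n)):
        cycle = (0,) + rest          # fix band 0 to skip rotated duplicates
        weight = sum(corr[cycle[k]][cycle[(k + 1) % n]] for k in range(n))
        if weight > best_weight:
            best_cycle, best_weight = cycle, weight
    return list(best_cycle)


def reorder_group(corr, threshold):
    """Keep the natural order when the group is already well correlated;
    otherwise reorder the bands along the maximum Hamiltonian cycle."""
    natural = list(range(len(corr)))
    if adjacent_sum(natural, corr) >= threshold:
        return natural
    cycle = max_hamiltonian_cycle(corr)
    n = len(cycle)
    # Assumption: open the cycle at its weakest edge to get a linear order.
    cut = min(range(n), key=lambda k: corr[cycle[k]][cycle[(k + 1) % n]])
    return cycle[cut + 1:] + cycle[:cut + 1]
```

With a four-band correlation matrix whose natural order scores 7, reordering raises the adjacent-band correlation sum to 21. This is the effect the abstract relies on to make the subsequent 3D wavelet transform more efficient.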
Abstract: Different users have different requirements for the quality of remote sensing images, and heterogeneous network environments suffer from problems such as large image data volumes and transmission and display delays. To address these problems, an online progressive transmission model for remote sensing images is constructed in which image compression and decompression are synchronized with transmission. In addition, a pipeline-based multi-thread acceleration scheme using the SPIHT algorithm, a quality-progressive method, is proposed to solve the asynchrony among compression, decompression, and transmission and thus improve the efficiency of progressive transmission. The proposed model was implemented on the VC++ platform using a socket communication channel and SPIHT, which provides a quality-progressive code stream. First, a given image was compressed into a code stream by SPIHT. Then, the socket channel was used to send the code stream in real time. The client decompressed the code stream and displayed the reconstructed image as soon as the code stream was received. The real-time system was realized with multi-threading, which guaranteed that compression, transmission, and decompression were processed synchronously, so the overall processing time was reduced. Experimental results show that the proposed real-time compression and progressive transmission model nearly doubles the overall processing speed without reducing image quality. Compared with the multi-resolution progressive method, the similarity index between each progressive layer and the original image increases by 20% with our quality-progressive method. The new model solves the asynchrony problem in the compression-transmission-decompression process for remote sensing images and greatly improves transmission efficiency.
Moreover, the proposed progressive transmission model achieves better visual quality than multi-resolution progressive transmission.
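The pipelined compression-transmission-decompression scheme described in the abstract above can be sketched with three threads connected by bounded queues. This is only an illustration of the pipelining idea. `zlib` stands in for the SPIHT codec, an in-memory queue stands in for the socket channel, and the function name and queue sizes are assumptions.

```python
import queue
import threading
import zlib

STOP = None  # sentinel that tells the next stage to shut down


def run_pipeline(blocks):
    """Compress, 'transmit', and decompress blocks concurrently.

    Each stage starts working on a block as soon as the previous stage
    has produced it, instead of waiting for the whole image.
    """
    to_send = queue.Queue(maxsize=4)     # compressor -> transmitter
    to_decode = queue.Queue(maxsize=4)   # transmitter -> decompressor
    decoded = []

    def compress_stage():
        for block in blocks:
            to_send.put(zlib.compress(block))
        to_send.put(STOP)

    def transmit_stage():
        # Stand-in for sending packets over a socket.
        while True:
            packet = to_send.get()
            to_decode.put(packet)
            if packet is STOP:
                break

    def decompress_stage():
        while True:
            packet = to_decode.get()
            if packet is STOP:
                break
            decoded.append(zlib.decompress(packet))

    stages = (compress_stage, transmit_stage, decompress_stage)
    threads = [threading.Thread(target=stage) for stage in stages]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return decoded
```

Because the queues are bounded, a slow stage applies back-pressure to the stage before it. This keeps the three stages in step without buffering the entire code stream.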