Image compression method based on multi-scale saliency region detection
2020, Vol. 25, No. 1: 31-42
Received: 2019-05-07; Revised: 2019-06-26; Accepted: 2019-07-03; Published in print: 2020-01-16
DOI: 10.11834/jig.190168
Objective
Existing saliency-region-based image compression methods cannot effectively perceive image content containing multiple objects, which degrades the quality of the reconstructed image. To address this problem, an image compression method based on multi-scale deep-feature saliency region detection is proposed.
Method
An improved convolutional neural network (CNN) is used to detect multi-scale deep image features and obtain saliency regions at different scales. The size of the saliency region map is then adaptively adjusted according to the size of the input image, and a Gaussian function is introduced to filter the saliency regions, yielding a multi-scale fused saliency region map. Finally, combined with coding techniques, the salient regions are compressed near-losslessly while the non-salient regions are compressed with lossy coding, completing image compression and reconstruction.
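The near-lossless/lossy split described above can be sketched as follows. This is a minimal illustration, not the paper's exact scheme: the binarizing threshold and the per-pixel merge of the two decoded images are simplifying assumptions.

```python
import numpy as np

def region_based_reconstruction(near_lossless, lossy, saliency, threshold=0.5):
    """Merge two decoded versions of the same image: pixels inside salient
    regions come from the near-lossless codec, the rest from the lossy
    codec. The threshold that binarizes the saliency map is an assumed
    detail, not taken from the paper."""
    mask = saliency >= threshold
    if near_lossless.ndim == 3:          # broadcast the 2-D mask over channels
        mask = mask[..., None]
    return np.where(mask, near_lossless, lossy)
```

In practice the two inputs would be the decoded outputs of, for example, a near-lossless coder and a JPEG/SPIHT coder run on the same image.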
Result
At a bit rate of about 0.39 bit/pixel, the proposed method improves peak signal-to-noise ratio (PSNR) by 2.23 dB and structural similarity (SSIM) by 0.024 over JPEG on the Kodak PhotoCD dataset; on the Pascal VOC dataset, PSNR and SSIM improve by 1.63 dB and 0.039, respectively. When the proposed multi-scale-feature saliency region method is combined with set partitioning in hierarchical trees (SPIHT) and run-length encoding (RLE), PSNR on the Kodak dataset improves by 1.85 dB and 1.98 dB, and SSIM by 0.006 and 0.023, respectively.
Conclusion
The proposed image compression method based on multi-scale deep features achieves better results than traditional coding techniques. By effectively perceiving image content, it reduces content loss during compression and thus improves the quality of the reconstructed image.
Objective
Image compression, which aims to remove redundant information in an image, is a popular issue in image processing and computer vision. In recent years, image compression based on deep learning has attracted much attention from scholars in the field of image processing. Image compression using convolutional neural networks (CNNs) can be roughly divided into two categories. One is image compression based on an end-to-end convolutional network. The other combines CNNs with traditional image compression methods: CNNs are used to deeply perceive the image content and obtain salient regions, high-quality coding is then applied to the salient regions, and lower-quality coding is used for non-salient regions to improve the visual quality of the compressed reconstructed images. However, in the latter category, the quality of the reconstructed image is often considerably degraded because the image content is not effectively perceived. Regarding the effectiveness of image content perception, several conventional salient region detection methods disregard the influence of scale on image content detection. Furthermore, the difference in size between the input image and the output saliency map is not considered, which limits the model's perception of the image. Consequently, several salient objects in the original image cannot be effectively perceived, which degrades the reconstructed image's quality during subsequent compression. A novel image compression method based on multi-scale depth feature salient region (MS-DFSR) detection is proposed in this study to address this problem.
Method
Improved CNNs are used to detect the depth features of multi-scale images. With the help of the scale-space concept, multiple saliency maps are generated by feeding an image into the MS-DFSR model through a pyramid structure, completing the detection of multi-scale saliency regions. In scale selection, an excessively large scale causes the resulting salient area to become too divergent and lose its salient meaning. Therefore, two scales are used in this work: the first is the standard output scale of the network, and the second is a larger scale adopted to effectively detect multiple salient objects in an image and thus perceive the image content effectively. For salient region detection on depth features, we replace the fully connected layer and the fourth max pooling layer with a global average pooling layer and an average pooling layer, respectively, to retain as much spatial location information on multiple salient objects as possible. The salient areas of different scales detected by MS-DFSR are then obtained. To enlarge the perceived domain of an image and perceive the image content effectively, the size of the salient region map is adaptively adjusted according to the size of the input image, accounting for the difference between the input and output salient image sizes. Meanwhile, a Gaussian function is introduced to filter the salient region, retain the original image content information, and obtain a multi-scale fused saliency region map. Finally, we complete image compression and reconstruction by combining the obtained multi-scale saliency region map with image coding methods. To protect the image's salient content and improve the reconstructed image's quality, the salient regions are compressed with a near-lossless method, while lossy compression methods, such as the Joint Photographic Experts Group (JPEG) codec and set partitioning in hierarchical trees (SPIHT), are applied to the non-salient regions.
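The multi-scale detection and fusion steps above can be sketched as follows. This is a minimal illustration under assumed details: `detect_saliency` is a hypothetical stand-in for the modified CNN, and the scale factors and Gaussian width are placeholders, not the paper's settings.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, zoom

def fuse_multiscale_saliency(image, detect_saliency, scales=(1.0, 2.0), sigma=3.0):
    """Run a saliency detector at several pyramid scales, adaptively resize
    each saliency map back to the input image size, smooth it with a
    Gaussian filter, and average the results into one fused map.
    `detect_saliency` is a hypothetical stand-in for the modified CNN."""
    h, w = image.shape[:2]
    fused = np.zeros((h, w), dtype=np.float64)
    for s in scales:
        # One level of the image pyramid at scale s (channels untouched).
        scaled = zoom(image, (s, s) + (1,) * (image.ndim - 2), order=1)
        sal = detect_saliency(scaled)       # saliency map of arbitrary size
        # Adaptively resize the saliency map to the input image size.
        sal = zoom(sal, (h / sal.shape[0], w / sal.shape[1]), order=1)
        # Gaussian filtering smooths the salient region boundaries.
        fused += gaussian_filter(sal, sigma=sigma)
    fused /= len(scales)
    return fused / (fused.max() + 1e-12)    # normalize to [0, 1]
```

The fused map can then drive the region-based coding: near-lossless coding where the map is high, lossy coding elsewhere.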
Result
We compare our model with three traditional compression methods, namely, JPEG, SPIHT, and run-length encoding (RLE). The experiments use two public datasets, Kodak PhotoCD and Pascal VOC. The quantitative evaluation metrics (higher is better) are the peak signal-to-noise ratio (PSNR), the structural similarity index measure (SSIM), and a modified PSNR metric based on the human visual system (PSNR-HVS). Experimental results show that our model outperforms all the traditional methods on the Kodak PhotoCD and Pascal VOC datasets. The saliency maps show that our model produces results covering multiple salient objects, improving the effective perception of image content. We also compare the image compression method based on MS-DFSR detection with one based on single-scale depth feature salient region (SS-DFSR) detection, verifying the validity of the MS-DFSR detection model. Comparative experiments demonstrate that the proposed method improves image compression quality: the image reconstructed by the proposed method has higher quality than that produced by JPEG compression. When the code rate is approximately 0.39 bpp on the Kodak PhotoCD dataset, PSNR is improved by 2.23 dB, SSIM by 0.024, and PSNR-HVS by 2.07. On the Pascal VOC dataset, PSNR, SSIM, and PSNR-HVS increase by 1.63 dB, 0.039, and 1.57, respectively. When MS-DFSR is combined with SPIHT and RLE compression on the Kodak PhotoCD dataset, PSNR increases by 1.85 dB and 1.98 dB, SSIM by 0.006 and 0.023, and PSNR-HVS by 1.90 and 1.88, respectively.
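For reference, the PSNR figures quoted above follow the standard definition below; this is the textbook formula, not code from the paper.

```python
import numpy as np

def psnr(reference, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between a reference image and its
    compressed reconstruction; higher values mean less distortion."""
    mse = np.mean((reference.astype(np.float64) -
                   reconstructed.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")                 # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```

A gain of 2.23 dB, as reported on Kodak PhotoCD, thus corresponds to a reduction of roughly 40% in mean squared error at the same bit rate.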
Conclusion
The proposed image compression method using multi-scale depth features outperforms traditional image compression methods because it effectively reduces image content loss by improving the effectiveness of image content perception during compression. Consequently, the quality of the reconstructed image is improved significantly.