Frontiers of compressive sensing video stream technology for uplink streaming media
Survey on compressive sensing video stream for uplink streaming media
2021, Vol. 26, No. 7, pp. 1545-1557
Received: 2020-08-18
Revised: 2021-02-04
Accepted: 2021-02-11
Published in print: 2021-07-16
DOI: 10.11834/jig.200487
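Uplink streaming media shows increasingly important emerging strategic value in the field of civil-military integration. In uplink streaming media applications, the compressive sensing video stream technology system offers distinctive advantages, including low front-end power consumption, good error resilience, and wide signal applicability, and it has become one of the frontiers and hot topics of current visual communication research. Starting from the application characteristics of uplink streaming media, this paper analyzes the basic theories and key technologies of compressive sensing video streams in terms of performance metrics, parallel block computational imaging, low-complexity video encoding, video reconstruction, and semantic quality evaluation, and it examines and compares related research progress at home and abroad. Compressive sensing video streams for uplink streaming media face technical challenges such as observation efficiency that is hard to control, difficult bitstream adaptation, and low reconstruction quality. Looking ahead to the development trend of compressive sensing video stream technology, future work will rely on the division of labor and cooperation between the front end and the intelligent cloud end to achieve breakthroughs in key technologies such as high-efficiency video observation and semantic-quality-guided video reconstruction, further developing the quantitative advantages and evolution path of compressive sensing video streams in uplink streaming media applications.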
Uplink streaming media has emerging strategic value in the field of civil-military integration. For uplink streaming media applications, a compressive sensing video stream offers technological advantages in terms of a low-complexity terminal, good error resilience, and wide signal applicability, and this technology is becoming one of the main topics in visual communication research. The compressive sensing video stream is a new type of visual communication whose functional modules mainly consist of front-end video observation and cloud-end video reconstruction. Its core technology has not yet matured to the point where it can be standardized. When uplink streaming media provides a large number of video sensing signals not for human viewing but for universal machine vision, a compressive sensing video stream uses a new signal-processing mechanism that avoids a shortcoming of existing uplink streaming technologies, which first acquire extra information and then discard it. Based on the application characteristics of uplink streaming media, this study analyzes the basic theories and key technologies of compressive sensing video streams, namely performance metrics, parallel block computational imaging, low-complexity video encoding, video reconstruction, and semantic quality evaluation. The latest research progress is also investigated and compared.
The video sensing signal is usually divided into groups of frames (GOFs), and each GOF is further divided into a key frame and several non-key frames. Because block compressive sensing (BCS) requires fewer sensing and storage resources at the front end, it not only allows a lightweight observation matrix but also lets the observations be transmitted block by block or in parallel. In a compressive sensing video stream, the GOF-BCS block array denotes the set of all BCS blocks in a GOF. Existing compressive sensing video streams adopt a technical framework of single-frame observation, open-loop encoding, and fidelity-guided reconstruction. The survey shows that, for uplink streaming media, existing compressive sensing video streams face bottlenecks such as uncontrollable observation efficiency, lack of bitstream adaptation, and low reconstruction quality. Therefore, the development trend of compressive sensing video stream technology has to be examined. Future research on compressive sensing video streams is expected to focus on the following aspects.
1) Efficiency-optimized GOF-BCS block-array layout. Existing compressive sensing video streams use only a simple combination of GOF frame number, BCS block size, and sampling rate, which is one special layout of the GOF-BCS block array. This special layout has not been shown to be a rational choice. Therefore, various block-array layouts and spatial-temporal partitions need to be compared and analyzed, and a universally optimized GOF-BCS block array then needs to be designed to quickly generate observation vectors that carry more spatiotemporal semantics. This approach is also conducive to the hierarchical sparse modeling used in video reconstruction.
2) Observation control and bitstream adaptation of the video sensing signal. During video encoding, there is a trade-off between the sampling rate and the quantization depth. An important task for subsequent studies is to construct a distribution model of the observation vectors and to adaptively adjust the sampling rate and quantization depth. Based on an efficiency-optimized GOF-BCS block array, a novel compressive sensing video stream may improve the observation efficiency at the front end and adapt to both low-complexity encoding and wireless transmission. Through dynamic interaction between source and channel at the front end, feedback coordination is formed between video observation and wireless transmission, and the front-end complexity may be quantitatively controlled.
3) During video reconstruction, an important methodology is to obtain a sparse solution of the underdetermined system through prior modeling. When the hierarchical sparse model cannot stably represent the observation vectors, a data-driven reconstruction mechanism can compensate for the deficiency of prior modeling. Future research will construct generation and recovery mechanisms for partially reversible signals and explore a hybrid reconstruction mechanism that combines the hierarchical sparse model with a deep neural network (DNN).
4) Semantic quality assessment model for any reconstructed block array. At present, the quality evaluation of reconstructed videos is limited to pixel-level fidelity. For universal machine vision, video reconstruction relies more on semantic quality evaluation. On the basis of sparse residual prediction reconstruction, the cloud end gradually adds data-driven reconstruction by DNN. By integrating the semantic quality assessment model, a video reconstruction mechanism with memory learning may be provided at the cloud end.
5) A new technical framework will combine high-efficiency observation with semantic-guided hybrid reconstruction. One important research direction is to establish an effective division of labor and cooperation between the front end and the cloud end. Besides a complexity-controllable front end, the new technical framework should deliver higher semantic quality in video reconstruction and enhance the interpretability of compressive sensing deep learning. For video sensing signals with dynamic scene changes, the new technical framework can balance observation distortion, bitrate, and power consumption at any resource-constrained front end.
These research directions are expected to break through the limitations of existing compressive sensing video streams. Key technologies such as high-efficiency observation and semantic-guided hybrid reconstruction have to be developed, which can further highlight the unique advantages and quantitative evolution of compressive sensing video stream technology for uplink streaming media applications.
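The front-end observation described in the abstract (a GOF split into one key frame and several non-key frames, each frame measured block by block) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, the random Gaussian observation matrix, the uniform scalar quantizer, and the specific sampling rates are all assumptions chosen only for illustration.

```python
import numpy as np


def split_into_blocks(frame, block_size):
    """Return non-overlapping blocks of an (H, W) frame as columns."""
    h, w = frame.shape
    assert h % block_size == 0 and w % block_size == 0
    blocks = (
        frame.reshape(h // block_size, block_size, w // block_size, block_size)
        .swapaxes(1, 2)
        .reshape(-1, block_size * block_size)
    )
    return blocks.T  # shape: (block_size**2, number_of_blocks)


def observation_matrix(block_size, sampling_rate, rng):
    """Assumed Gaussian observation matrix with m = rate * n rows."""
    n = block_size * block_size
    m = max(1, int(round(sampling_rate * n)))
    return rng.standard_normal((m, n)) / np.sqrt(m)


def observe_gof(gof, block_size, key_rate, non_key_rate, seed=0):
    """Measure every BCS block of a GOF; frame 0 is treated as the key frame."""
    rng = np.random.default_rng(seed)
    phi_key = observation_matrix(block_size, key_rate, rng)
    phi_non_key = observation_matrix(block_size, non_key_rate, rng)
    measurements = []
    for index, frame in enumerate(gof):
        phi = phi_key if index == 0 else phi_non_key
        # One small matrix serves all blocks of a frame: y = phi @ x per block.
        measurements.append(phi @ split_into_blocks(frame, block_size))
    return measurements


def quantize(y, depth_bits):
    """Assumed uniform scalar quantizer with 2**depth_bits levels."""
    levels = 2 ** depth_bits
    lo, hi = float(y.min()), float(y.max())
    step = (hi - lo) / (levels - 1) if hi > lo else 1.0
    return np.round((y - lo) / step).astype(np.int64), lo, step


def gof_bits(measurements, depth_bits):
    """Raw bit count of a GOF before any entropy coding."""
    return sum(y.size for y in measurements) * depth_bits


if __name__ == "__main__":
    gof = np.random.default_rng(1).random((4, 32, 32))  # 4 frames of 32x32
    ys = observe_gof(gof, block_size=16, key_rate=0.5, non_key_rate=0.125)
    print([y.shape for y in ys], gof_bits(ys, depth_bits=8))
```

Under these assumptions the raw bit count is the number of measurements times the quantization depth, so for a fixed bit budget a higher sampling rate has to be paid for with a coarser quantizer, which is the trade-off named in direction 2).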