Fast intra coding unit partitioning for FVC using random forest

Ren Yan; Peng Zongju; Cui Xin; Chen Fen; Chen Hua

Published: 2019-05-07
DOI: 10.11834/jig.180490
2019 | Volume 24 | Number 5

Ren Yan, Peng Zongju, Cui Xin, Chen Fen, Chen Hua (Faculty of Information Science and Engineering, Ningbo University, Ningbo 315211, China)

Abstract

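Objective: Future video coding (FVC) is a new generation of coding technology built on the High Efficiency Video Coding (HEVC) standard, and its complexity is extremely high. Existing HEVC-based fast coding methods either do not apply to the quadtree plus binary tree (QTBT) coding structure in FVC or save only limited time. To address this problem, a fast intra coding unit (CU) partition algorithm for FVC that uses random forests is proposed. Method: The algorithm optimizes the QTBT structure in FVC. First, the image texture features and partition results of each CU are extracted during encoding. Then, the texture features and partition results at each partition depth are used for online training to build multiple random forest models, with CUs of different depths served by different models. Finally, the models predict the partition results of the CUs in the remaining frames of the video, which reduces the number of partition modes traversed and rate-distortion costs calculated and therefore saves encoding time. Results: Experiments show that, compared with the original platform algorithm, the proposed algorithm saves 44.1% of encoding time while the bitrate rises by only 2.6% at the same peak signal-to-noise ratio; compared with current state-of-the-art methods, it saves a further 20% or more of encoding time. Conclusion: By extracting image texture features, building random forest models, and predicting CU partition results, the algorithm effectively reduces the intra CU partition complexity of FVC while preserving rate-distortion performance.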
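The per-depth training and prediction steps described in the abstract can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation: the function names, the label strings `"no_split"` and `"quad_split"`, the tree count, the tree depth, and the Gini split criterion are all assumptions made for illustration, and a real encoder would collect the samples from the training frame inside the codec.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def build_tree(rows, labels, rng, max_depth, n_try):
    """Grow one tree; every split considers a random subset of n_try features."""
    if max_depth == 0 or len(set(labels)) == 1:
        return Counter(labels).most_common(1)[0][0]      # leaf: majority label
    best = None
    for f in rng.sample(range(len(rows[0])), n_try):
        values = sorted(set(r[f] for r in rows))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2                            # midpoint threshold
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if best is None or score < best[0]:
                best = (score, f, t)
    if best is None:                                     # sampled features are constant
        return Counter(labels).most_common(1)[0][0]
    _, f, t = best
    l_idx = [i for i, r in enumerate(rows) if r[f] <= t]
    r_idx = [i for i, r in enumerate(rows) if r[f] > t]
    return (f, t,
            build_tree([rows[i] for i in l_idx], [labels[i] for i in l_idx],
                       rng, max_depth - 1, n_try),
            build_tree([rows[i] for i in r_idx], [labels[i] for i in r_idx],
                       rng, max_depth - 1, n_try))


def predict_tree(node, row):
    """Walk from the root to a leaf; internal nodes are tuples, leaves are labels."""
    while isinstance(node, tuple):
        f, t, left, right = node
        node = left if row[f] <= t else right
    return node


def train_forest(rows, labels, n_trees=15, max_depth=4, seed=0):
    """Bagging: each tree is trained on a bootstrap sample of the data."""
    rng = random.Random(seed)
    n_try = max(1, int(len(rows[0]) ** 0.5))
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(build_tree([rows[i] for i in idx], [labels[i] for i in idx],
                                 rng, max_depth, n_try))
    return forest


def predict_forest(forest, row):
    """Majority vote over all trees."""
    return Counter(predict_tree(tree, row) for tree in forest).most_common(1)[0][0]


def train_per_depth(samples):
    """samples: iterable of (cu_depth, feature_row, split_label); one forest per depth."""
    by_depth = {}
    for depth, row, label in samples:
        rows, labels = by_depth.setdefault(depth, ([], []))
        rows.append(row)
        labels.append(label)
    return {d: train_forest(rows, labels) for d, (rows, labels) in by_depth.items()}
```

In this sketch an encoder would call `predict_forest(models[depth], features)` for each CU in the remaining frames and evaluate only the predicted partition mode instead of traversing all of them.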

Keywords

video coding; future video coding (FVC); fast intra coding; machine learning; random forest

Random forest-based fast intra coding unit partition algorithm for FVC

Ren Yan, Peng Zongju, Cui Xin, Chen Fen, Chen Hua (Faculty of Information Science and Engineering, Ningbo University, Ningbo 315211, China)

Abstract

Objective: With the development of digital video technology, especially the emergence of ultra-high-definition (UHD) video, video compression faces enormous challenges. To handle the large data volume and high-speed transmission requirements of UHD video, the Joint Video Experts Team (JVET) is exploring future video coding (FVC) based on the High Efficiency Video Coding (HEVC) standard. FVC retains the hybrid coding framework of HEVC and adds new techniques. Its compression efficiency is higher than that of HEVC, but its coding complexity is extremely high, so reducing that complexity is of great significance. Among the new techniques in FVC, the most effective but also the most time-consuming is the quadtree plus binary tree (QTBT) coding structure, which has four partition modes: quadtree split, vertical split, horizontal split, and no split. The final split of a coding unit (CU) is decided only after all partition modes have been tried and their rate-distortion costs calculated, which makes QTBT extremely complex. Existing HEVC-based fast coding methods are not suitable for FVC because of the QTBT coding structure, and recent low-complexity encoding methods for FVC offer only limited time savings. Reducing the complexity of FVC therefore requires addressing the QTBT structure. The traversal of CU partition modes is redundant, and unnecessary partition attempts should be avoided. To optimize the CU split process, we propose a random forest-based fast intra CU partition algorithm for FVC. Method: The proposed algorithm is designed to optimize the QTBT structure in FVC. Because the split modes of the QTBT structure are elaborate, a machine learning-based approach is more applicable than traditional statistics-based methods. Among machine learning methods, random forest offers particular advantages. 
Random forest can handle the classification of multidimensional data and is strongly resistant to overfitting. It performs well on classification problems and is therefore suitable for deciding CU splits. Accordingly, a fast algorithm based on random forest is proposed. Distinguishing the split results of CUs is treated as a classification problem, with random forest as the classifier. First, the image texture features and split results of the CUs in the first frame of each video sequence are extracted. Image texture features correlate strongly with split results and are therefore selected as training data for the models. Several image texture features are used to achieve good performance, and they are selected by calculating feature importance. Specifically, the features finally used are the width and height of the CU, Haar wavelet coefficients, angular second moment, entropy, contrast, inverse difference moment, and standard deviation. After data collection, four random forest models are established for different CU depths. CU depth is represented as the joint depth of the quadtree and the binary tree, and this representation is used when collecting data. Then, the texture features and split results are treated as multidimensional data, and each model is trained online on its own data. The training time is included in the total encoding time and is short relative to it. Finally, the trained models predict the split results of the CUs in the remaining frames of the video sequences, which reduces the number of partition modes traversed and rate-distortion costs calculated. To verify the algorithm's effectiveness, we test the accuracy of the models online on different video sequences. The algorithm is implemented on the recently released JEM5.0 platform. 
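As an illustration of the texture descriptors named above, the following pure-Python sketch computes the standard deviation, four gray-level co-occurrence statistics (angular second moment, entropy, contrast, inverse difference moment), and one-level Haar detail magnitudes for a single CU. The abstract does not give the exact definitions used, so the 8-level quantization, the horizontal one-pixel co-occurrence offset, the 8-bit sample range, and the function names are assumptions of this sketch.

```python
import math
from collections import Counter


def texture_features(block, levels=8, max_value=255):
    """Texture descriptors for one CU given as a 2-D list of luma samples.

    Assumes the block is at least two samples wide so that horizontal
    neighbour pairs exist.
    """
    height, width = len(block), len(block[0])
    pixels = [p for row in block for p in row]

    # Standard deviation of the luma samples.
    mean = sum(pixels) / len(pixels)
    std = math.sqrt(sum((p - mean) ** 2 for p in pixels) / len(pixels))

    # Quantize to a few gray levels so the co-occurrence matrix stays small.
    q = [[min(p * levels // (max_value + 1), levels - 1) for p in row]
         for row in block]

    # Co-occurrence counts for horizontally adjacent pixel pairs.
    pairs = Counter()
    for row in q:
        for a, b in zip(row, row[1:]):
            pairs[(a, b)] += 1
    total = sum(pairs.values())

    asm = entropy = contrast = idm = 0.0
    for (i, j), count in pairs.items():
        p = count / total
        asm += p * p                       # angular second moment
        entropy -= p * math.log2(p)        # entropy in bits
        contrast += (i - j) ** 2 * p       # contrast
        idm += p / (1 + (i - j) ** 2)      # inverse difference moment

    return {"width": width, "height": height, "std": std, "asm": asm,
            "entropy": entropy, "contrast": contrast, "idm": idm}


def haar_detail_means(block):
    """Mean absolute one-level Haar detail coefficients over 2x2 cells.

    Returns (difference along x, difference along y, diagonal difference).
    """
    dx = dy = dd = 0.0
    n = 0
    for y in range(0, len(block) - 1, 2):
        for x in range(0, len(block[0]) - 1, 2):
            a, b = block[y][x], block[y][x + 1]
            c, e = block[y + 1][x], block[y + 1][x + 1]
            dx += abs(a - b + c - e) / 2
            dy += abs(a + b - c - e) / 2
            dd += abs(a - b - c + e) / 2
            n += 1
    return dx / n, dy / n, dd / n
```

A flat block yields zero contrast and zero Haar detail, while a block of vertical stripes yields high contrast and a large difference along x, which is the kind of signal a classifier can use to separate CUs that should be split from those that should not.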
A total of 22 test sequences with different content and resolutions, from class A1 to class E, are tested under the common test conditions in the all-intra configuration with quantization parameters 22, 27, 32, and 37. Encoding performance is evaluated using the Bjøntegaard delta bitrate (BDBR) and the average encoding time saved relative to the original platform. Results: Experiments show that, compared with the original platform's algorithm, the proposed algorithm reduces average encoding time by 44.1% while BDBR increases by only 2.6%. It also saves more than 20% additional encoding time relative to state-of-the-art methods, with a slight increase in BDBR. The algorithm is suitable for video sequences of various classes, resolutions, and textures. High-resolution sequences save more encoding time than the others because the online training time weighs less against their longer encoding time. The coding performance of the proposed algorithm is also stable across sequences, which supports the effectiveness of the models. Conclusion: A random forest-based fast intra CU partition algorithm is proposed to reduce the complexity of the QTBT structure in FVC. By extracting image texture features, the algorithm builds random forest models that predict the CU partition result and so avoids unnecessary traversal of split modes, which saves encoding time. The proposed intra coding algorithm effectively reduces the complexity of FVC while maintaining encoding performance, and it is best suited to high-resolution video sequences. Future work will optimize the algorithm to save more time with less coding performance loss and will explore machine learning for FVC inter prediction.
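The two evaluation metrics mentioned in the abstract can be sketched as follows. The BDBR function uses the classic Bjøntegaard formulation (a cubic through four rate/PSNR points per curve, integrated over the shared PSNR range); the abstract does not say which variant the authors used, so this choice, the function names, and the sample numbers in the usage note are assumptions and not results from the paper.

```python
import math


def _cubic_through(xs, ys):
    """Return the cubic polynomial that passes through four (x, y) points."""
    def p(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)   # Lagrange basis factor
            total += term
        return total
    return p


def _integral(p, lo, hi):
    """Simpson's rule, which is exact for polynomials up to degree three."""
    return (hi - lo) / 6.0 * (p(lo) + 4.0 * p((lo + hi) / 2.0) + p(hi))


def bd_rate(anchor_rates, anchor_psnr, test_rates, test_psnr):
    """Average bitrate change in percent at equal PSNR (four points per curve)."""
    p_anchor = _cubic_through(anchor_psnr, [math.log10(r) for r in anchor_rates])
    p_test = _cubic_through(test_psnr, [math.log10(r) for r in test_rates])
    lo = max(min(anchor_psnr), min(test_psnr))     # shared PSNR range
    hi = min(max(anchor_psnr), max(test_psnr))
    avg_diff = (_integral(p_test, lo, hi) - _integral(p_anchor, lo, hi)) / (hi - lo)
    return (10.0 ** avg_diff - 1.0) * 100.0


def time_saving(anchor_seconds, test_seconds):
    """Encoding time saved relative to the anchor encoder, in percent."""
    return (anchor_seconds - test_seconds) / anchor_seconds * 100.0
```

For example, with made-up anchor points of 1000, 2000, 4000, and 8000 kbps at 32, 35, 38, and 41 dB, a test encoder that needs 10% more bitrate at every PSNR point gives a `bd_rate` of about 10, and `time_saving(200.0, 150.0)` gives 25.0.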

Keywords

video encoding; future video coding (FVC); fast intra prediction coding; machine learning; random forest
