Journal of Guangdong University of Technology ›› 2024, Vol. 41 ›› Issue (03): 131-140. doi: 10.12052/gdutxb.230067

• Information and Communication Technology •

A Fast Combination Algorithm for HEVC Intra-Prediction Based on Neural Network

Fan Jun-yu1, Song Li-feng1,2

  1. School of Information Engineering, Guangdong University of Technology, Guangzhou 510006, China;
  2. Huizhou Guangdong University of Technology IoT Cooperative Innovation Institute Co., Ltd., Huizhou 516025, China
  • Received: 2023-05-11 Online: 2024-05-25 Published: 2024-06-14
  • Corresponding author: Song Li-feng (b. 1967), male, associate professor, Ph.D.; main research interests: video coding and transmission. E-mail: songlf@gdut.edu.cn
  • About the author: Fan Jun-yu (b. 1997), male, master's student; main research interest: video coding. E-mail: telecom162@163.com
  • Supported by:
    Guangdong Provincial Science and Technology Innovation Strategy Special Fund (Provincial Key Laboratory Accreditation) Project (2021B1212050003)

Abstract: To improve the real-time performance of High Efficiency Video Coding (HEVC) intra coding, this paper proposes a combined fast algorithm with three parts. First, a lightweight convolutional network that uses convolution kernels with even side lengths and strides, together with a self-attention mechanism, predicts the intra partition structure of each Coding Tree Unit (CTU), which saves the time the encoder would otherwise spend on recursive quadtree traversal of the CTU. Second, in the original encoding strategy, Rough Mode Decision (RMD) speeds up mode selection by approximating the rate-distortion cost of Rate Distortion Optimization (RDO) with a cost based on the Sum of Absolute Transformed Difference (SATD), that is, the sum of absolute values of the Hadamard-transformed prediction residual, yet RMD itself still consumes encoding time. A sampling search is therefore used to reduce the number of modes evaluated in RMD from 35 to 18, which shortens the time spent computing estimated costs. Third, RMD passes several of its best candidate intra modes to RDO. To reduce the number of candidates that RDO must evaluate, an early termination decision removes less likely candidates from the candidate list according to the differences between the estimated costs of successive angular intra prediction modes in the list, which in turn shortens the RDO computation. On the test sequences, the proposed algorithm reduces encoding time by 78.15% on average, with a BD-PSNR of -0.168 dB and a BD-RATE of 3.49%.
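
The two mode-decision steps described in the abstract can be illustrated with a minimal sketch. The abstract does not say which 18 modes are sampled or how the cost gap is thresholded, so everything concrete below is an assumption made for illustration: the sample set (Planar, DC, and the even angular modes 2 to 32), the names `sampled_modes`, `rough_mode_decision`, and `early_stop`, the `num_candidates` and `gap_ratio` parameters, and the caller-supplied `cost_of` function standing in for the SATD-based estimate.

```python
from typing import Callable, List, Tuple

PLANAR, DC = 0, 1          # HEVC non-angular intra modes
FIRST_ANGULAR = 2          # HEVC angular modes are numbered 2..34

Candidate = Tuple[int, float]  # (mode index, estimated cost)


def sampled_modes() -> List[int]:
    """Return an 18-mode sample of the 35 HEVC intra modes.

    ASSUMPTION: Planar, DC, and every second angular mode from 2 to 32.
    The paper's actual sampling pattern is not given in the abstract.
    """
    return [PLANAR, DC] + list(range(FIRST_ANGULAR, 34, 2))


def rough_mode_decision(cost_of: Callable[[int], float],
                        num_candidates: int = 3) -> List[Candidate]:
    """Evaluate only the sampled modes and keep the cheapest ones."""
    costs = [(mode, cost_of(mode)) for mode in sampled_modes()]
    costs.sort(key=lambda mc: mc[1])  # ascending estimated cost
    return costs[:num_candidates]


def early_stop(candidates: List[Candidate],
               gap_ratio: float = 0.1) -> List[Candidate]:
    """Truncate a cost-sorted candidate list at the first large cost jump.

    ASSUMPTION: a jump is "large" when the cost increase from one
    candidate to the next exceeds gap_ratio times the earlier cost.
    """
    kept = candidates[:1]
    for prev, cur in zip(candidates, candidates[1:]):
        if cur[1] - prev[1] > gap_ratio * prev[1]:
            break  # remaining candidates are treated as unlikely
        kept.append(cur)
    return kept


if __name__ == "__main__":
    # Toy cost with a minimum at the vertical mode (26); hypothetical data.
    toy_cost = lambda mode: 100 + 10 * abs(mode - 26)
    shortlist = rough_mode_decision(toy_cost)
    print(shortlist, early_stop(shortlist))
```

Only the candidates returned by `early_stop` would then go through full rate-distortion optimization, which is where the time saving described in the abstract would come from.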

Key words: video coding, neural network, intra-frame prediction, fast algorithm
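
For readers unfamiliar with the SATD cost mentioned in the abstract, the following is a minimal sketch of how it can be computed for a 4×4 block with a Hadamard transform. It is an illustration only, not the reference encoder's code: the function names `matmul` and `satd_4x4` are hypothetical, and the normalization and rate terms that practical encoders typically add to this cost are omitted.

```python
from typing import List

Block = List[List[int]]

# 4x4 Hadamard matrix (symmetric, so it equals its own transpose).
H4: Block = [
    [1,  1,  1,  1],
    [1, -1,  1, -1],
    [1,  1, -1, -1],
    [1, -1, -1,  1],
]


def matmul(a: Block, b: Block) -> Block:
    """Multiply two square matrices of equal size."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def satd_4x4(original: Block, prediction: Block) -> int:
    """Sum of absolute Hadamard-transformed differences for a 4x4 block.

    Unnormalized: practical encoders usually scale this value, which is
    left out here to keep the sketch short.
    """
    residual = [[o - p for o, p in zip(row_o, row_p)]
                for row_o, row_p in zip(original, prediction)]
    transformed = matmul(matmul(H4, residual), H4)  # H * R * H^T
    return sum(abs(v) for row in transformed for v in row)


if __name__ == "__main__":
    orig = [[10] * 4 for _ in range(4)]
    pred = [[9] * 4 for _ in range(4)]
    print(satd_4x4(orig, pred))
```

Because the transform concentrates a smooth residual into a few coefficients, SATD tends to track the cost after transform coding more closely than a plain sum of absolute differences, which is why it serves as a cheap stand-in for the full rate-distortion cost.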

CLC Number:

  • TN919.81