Lightweight parameter optimization for real-time rendering
2024, pp. 1-12
Online publication date: 2024-12-23
DOI: 10.11834/jig.240483
Objective
With the growing adoption of technologies such as digital twins and virtual reality, user expectations for image quality and smoothness keep rising. Constrained by key performance hardware, however, personal computers and mobile devices often have to raise frame rates by adjusting various parameters in games or rendering engines, which inevitably sacrifices rendering quality. How to set reasonable rendering parameters that reduce time cost while achieving higher rendering quality has therefore become a problem of wide concern in graphics applications.
Method
This paper proposes a general, lightweight automatic parameter optimization method for real-time rendering. Extreme gradient boosting (XGBoost) is used to model the rendering time and image quality of a virtual scene under different parameter settings, and after pre-computation the model is reduced to a look-up table (LUT). During actual rendering, the LUT automatically adjusts the rendering parameters according to conditions such as hardware state and scene information, reducing rendering time while preserving rendering quality.
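As a concrete illustration of the two-part model described above, the following Python sketch trains separate XGBoost regressors for rendering time and image quality on a tabular dataset. The feature layout (two tunable parameters followed by two device/scene descriptors) and the synthetic data are illustrative assumptions, not the paper's actual feature set.

```python
# A minimal sketch of the two-model setup, assuming each training sample is a
# row of (rendering parameters, device/scene descriptors). Feature names and
# the synthetic targets below are hypothetical.
import numpy as np
import xgboost as xgb

rng = np.random.default_rng(0)
n = 10_000
# Hypothetical columns: [sample_count, shadow_precision, gpu_score, scene_density]
X = rng.uniform(0.0, 1.0, size=(n, 4))
t = 5.0 + X @ np.array([3.0, 1.5, -2.0, 2.5]) + rng.normal(0, 0.1, n)  # render time (ms)
q = 1.0 - np.exp(-4.0 * X[:, 0]) + rng.normal(0, 0.02, n)              # quality score

time_model = xgb.XGBRegressor(n_estimators=200, max_depth=6, learning_rate=0.1)
quality_model = xgb.XGBRegressor(n_estimators=200, max_depth=6, learning_rate=0.1)
time_model.fit(X, t)       # model 1: parameters -> rendering time
quality_model.fit(X, q)    # model 2: parameters -> image quality
```

Splitting time and quality into two separate regressors keeps each target's relationship to the parameters simple, which is what makes the later reduction to a LUT cheap.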
Result
The method can be applied to the various rendering techniques used in games and rendering engines. In this paper it is applied to and tested on subsurface scattering and ambient occlusion effects. The results show that, compared with the best rendering parameters, the proposed method shortens subsurface scattering rendering time by about 40% and ambient occlusion rendering time by about 70%, while image error increases by only about 2% in both cases.
Conclusion
The proposed method maintains high rendering quality while reducing rendering time, offers good practicality, and is applicable to the various rendering techniques in games and rendering engines. Our code repository is available at:
https://github.com/LightweightRenderParamOptimization/LightweightRenderParamOptimization
Objective
The rapid development of virtual reality, augmented reality, and digital twin technologies has not only transformed the way people perceive virtual worlds but also greatly advanced graphics rendering techniques. In these emerging fields, the quality of the user experience depends directly on the realism and interactivity of the virtual world, making high-quality graphics and smooth performance indispensable. While high-quality rendering has made significant progress in PC and console games, applying these techniques to mobile devices such as laptops, tablets, and smartphones remains a major challenge. Mobile devices are considerably limited in processing power, graphics capability, and memory compared with high-end PCs and dedicated gaming consoles. They must also sustain long battery life and cannot afford the high power consumption typical of desktop systems. As a result, achieving high-quality, low-latency rendering under constrained hardware conditions is a major challenge.

In modern game and rendering engines, a variety of rendering techniques (such as subsurface scattering, ambient occlusion, screen space reflections, and normal mapping) are integrated into a complex rendering pipeline. These techniques come with numerous adjustable parameters, such as the number of scattering samples, shadow precision, reflection intensity, ambient occlusion level, texture resolution, and level of detail. These parameters significantly affect image quality but also directly determine rendering computation and time costs. Finding an optimal balance between image quality and rendering time is therefore critical when optimizing rendering parameters.

Typically, these parameters are configured manually by developers for different hardware environments and scene requirements, who rely on trial and error, iterative adjustment, and visual feedback to reach acceptable rendering performance and quality. This manual approach is inefficient, error-prone, and becomes nearly impossible for complex 3D scenes and dynamic game environments. Moreover, as games and virtual reality technologies evolve, real-time rendering must complete large amounts of complex computation for every frame, and any misconfiguration can lead to performance bottlenecks or distorted visual effects. For example, shadow rendering precision may be crucial in some scenes but can be reduced in others to save computational resources. If parameters cannot be optimized dynamically in real time, the rendering engine may overuse resources in certain frames, causing frame rate drops or increased latency that severely affect the user experience.

To address this issue, researchers have explored various methods for optimizing rendering parameters in recent years, including sampling scene space with octrees, using Pareto frontiers to find locally optimal parameters, fitting low-power parameters quickly with regression analysis and linear functions, and employing neural networks to estimate performance bottlenecks in real time from draw call counts. While these methods have achieved some success in rendering optimization, they still have significant limitations. First, function-fitting methods are prone to errors across different scenes, making generalization difficult. Second, the complexity of neural network inference introduces substantial computational overhead.
Each time the neural network is used for parameter prediction, it adds extra computational burden. In real-time rendering, any delay can negatively affect performance. Consequently, existing neural network-based optimization methods often perform parameter prediction every few dozen frames instead of calculating the optimal parameters for every single frame. This non-real-time parameter updating is particularly problematic in dynamic scenes where the complexity of the scene and camera view may change drastically at any moment. Neural networks may fail to respond to these changes promptly, compromising rendering stability and image quality. For instance, when the camera moves quickly, the objects and lighting in the scene may undergo significant changes, rendering the previous parameter predictions obsolete, leading to visual artifacts or frame rate fluctuations, which in turn degrades the user experience.
Method
To address these issues, this paper proposes a lightweight, real-time automatic rendering parameter optimization method. The proposed method is computationally efficient, allows adaptive per-frame parameter updates, and preserves rendering consistency after parameters change. It is divided into three stages: model training, pre-computation, and adaptive real-time rendering.

In the model training stage, data on rendering time and image quality are collected in a virtual environment under various rendering parameters, hardware configurations, and scene conditions. These data are used to train a model with two parts: one evaluating rendering time and the other evaluating image quality. This separation lets the model fully explore the intrinsic relationships between parameters, rendering time, and image quality, while specially designed virtual scenes provide enough sample diversity for the model to generalize to new scenes.

In the pre-computation stage, the key step is first to assess the device's hardware information, including the processor, graphics card, and other performance parameters. This step is completed during scene loading so that parameter optimization can be customized to the specific performance of the device. The system then simplifies the joint optimization of rendering time and image quality from a two-dimensional multi-objective problem into two independent one-dimensional linear searches, which greatly accelerates pre-computation because linear search is far simpler than optimization in a two-dimensional space. Specifically, rendering time and image quality typically trade off against each other, and optimizing both requires balancing many parameter combinations. To simplify this, the system first searches, within a given rendering time threshold (set to the fastest 20% in this paper), for the settings achievable under the current hardware conditions; it then searches along the image quality dimension, ensuring that rendering time does not increase significantly, for the parameters that maximize image quality. This two-step search strategy balances rendering time and image quality while keeping the optimization both efficient and accurate. Once optimization is complete, the model is reduced to a look-up table (LUT) that records the optimal parameter combinations for different hardware configurations, tailored to the device's hardware parameters and ready for the subsequent real-time rendering phase.

In the adaptive real-time rendering stage, before each frame is rendered, the system retrieves the optimal parameter settings from the pre-generated LUT based on the current hardware status and scene information. LUT lookups are extremely fast, dramatically reducing overhead compared with computing parameters in real time, so the system completes parameter selection within milliseconds and immediately applies the parameters to rendering, ensuring both efficiency and flexibility.
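To make the two-step search concrete, here is a minimal sketch of how the pre-computation stage could bake the trained models into a LUT. It reuses the time_model/quality_model and the four-column feature layout from the training sketch above; the parameter grid, the context binning, and the exact form of the two passes are illustrative assumptions rather than the paper's implementation.

```python
import itertools
import numpy as np

def best_params(time_model, quality_model, context, param_grid):
    """Collapse the 2D time/quality trade-off into two 1D passes."""
    combos = np.array(list(itertools.product(*param_grid)))           # all parameter settings
    feats = np.hstack([combos, np.tile(context, (len(combos), 1))])   # append device/scene columns
    times = time_model.predict(feats)
    quals = quality_model.predict(feats)
    fast = times <= np.quantile(times, 0.20)                          # pass 1 (time axis): fastest 20%
    return combos[int(np.argmax(np.where(fast, quals, -np.inf)))]     # pass 2 (quality axis)

# Bake one entry per quantized device/scene context into the LUT.
param_grid = [np.linspace(0, 1, 16), np.linspace(0, 1, 8)]   # e.g. sample count, shadow precision
contexts = {(g, s): np.array([g, s]) for g in (0.25, 0.5, 0.75) for s in (0.2, 0.8)}
lut = {key: best_params(time_model, quality_model, ctx, param_grid)
       for key, ctx in contexts.items()}
```

Each pass scans a single axis of a precomputed prediction table, which is why this stage stays cheap even when the parameter grid is enumerated exhaustively.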
Because the extensive pre-computation is completed ahead of time, the system only needs to perform simple lookup operations during actual rendering, achieving a balance between high-quality rendering and fast responsiveness. The selected parameters are applied directly to the current frame, ensuring that every frame achieves the best result for the hardware performance and scene requirements.
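At run time, per-frame selection then reduces to a plain dictionary lookup. The snapping of live hardware/scene state to the nearest baked context key below is a hypothetical keying scheme; the paper only specifies that a LUT lookup replaces per-frame model inference.

```python
def select_params(lut, gpu_score, scene_density, default):
    # Snap the live state to the nearest pre-baked context key.
    key = (min((0.25, 0.5, 0.75), key=lambda g: abs(g - gpu_score)),
           min((0.2, 0.8), key=lambda s: abs(s - scene_density)))
    return lut.get(key, default)  # O(1) per-frame lookup, far cheaper than model inference
```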
Result
The experimental results show that when compared with neural networks and LightGBM models applied to subsurface scattering and ambient occlusion rendering techniques, the proposed method demonstrates advantages across multiple dimensions, including image quality, scene dependency, rendering time, and model performance. Specifically, in various scenes, the proposed method reduces subsurface scattering rendering time by approximately 40% and ambient occlusion rendering time by about 70%, with only around a 2% increase in image quality error. Additionally, the real-time inference time per frame is less than 0.1 milliseconds.
Conclusion
The proposed method effectively reduces rendering time while maintaining high rendering quality, making it highly practical for the actual demands of modern games and rendering engines. The implementation can be accessed at the following link:
https://github.com/LightweightRenderParamOptimization/LightweightRenderParamOptimization
real-time rendering; rendering optimization; XGBoost; look-up table; Unreal Engine