Enhanced Video Super-Resolution via a Dual-Stage Framework with VDSR and HMRF-DCNN

Mahnaz Mahdizadeh; Ali Akbar Khazaei; Seyyed Javad Seyyed Mahdavi Chabok; Farzan Khatib

doi:10.57647/j.spre.2025.0902.12


Mahnaz Mahdizadeh¹
Ali Akbar Khazaei*¹
Seyyed Javad Seyyed Mahdavi Chabok¹
Farzan Khatib¹

Department of Electrical Engineering, Ma.C., Islamic Azad University, Mashhad, Iran

Received: 2025-04-16

Revised: 2025-05-19

Accepted: 2025-05-27

Published in Issue 2025-06-01

This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.

How to Cite

Mahdizadeh, M., Khazaei, A. A., Seyyed Mahdavi Chabok, S. J., & Khatib, F. (2025). Enhanced Video Super-Resolution via a Dual-Stage Framework with VDSR and HMRF-DCNN. Signal Processing and Renewable Energy (SPRE), 9(2). https://doi.org/10.57647/j.spre.2025.0902.12

Abstract

Video super-resolution (VSR) is a critical task in video processing that aims to increase the resolution of consecutive frames while maintaining visual quality. This paper presents a two-stage approach to VSR that integrates deep learning networks with Hidden Markov Random Field (HMRF) techniques. In the first stage, the Very Deep Super-Resolution (VDSR) technique sharpens frame edges and enhances resolution, prioritizing perceptually significant details by operating on the luminance component. Random patching further improves VDSR performance by enhancing relevant image details while reducing the computational burden. In the second stage, a parallel network combines the output of the first stage, HMRF-based inputs, and chronological (temporal) inputs to capture spatial and temporal dependencies for the final resolution enhancement. Together, the two stages are designed to deliver high resolution and visual quality in the final output frames. Experimental evaluation shows a significant improvement over existing methods, with a peak signal-to-noise ratio (PSNR) of 37.0295 dB and a structural similarity index (SSIM) of 0.94683. The proposed method is a promising solution for high-quality video super-resolution, balancing resolution enhancement against visual fidelity.
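
As a rough illustration of three ideas mentioned in the abstract (working on the luminance component, sampling random patches, and scoring with PSNR), the following is a minimal NumPy sketch. It is not the authors' implementation. The BT.601 luma weights, the function names, the patch size, and the assumption that pixel values lie in [0, 1] are all illustrative choices that the abstract does not specify.

```python
import numpy as np


def rgb_to_luminance(frame: np.ndarray) -> np.ndarray:
    """Return a luma (Y) channel for an RGB frame with values in [0, 1].

    Assumption: BT.601 weights; the abstract only says "luminance components".
    """
    weights = np.array([0.299, 0.587, 0.114])
    return frame[..., :3] @ weights


def random_patches(luma: np.ndarray, patch_size: int, count: int,
                   rng: np.random.Generator) -> np.ndarray:
    """Sample `count` square patches at uniformly random positions.

    Hypothetical helper: the paper's actual patching scheme is not described
    in the abstract.
    """
    height, width = luma.shape
    tops = rng.integers(0, height - patch_size + 1, size=count)
    lefts = rng.integers(0, width - patch_size + 1, size=count)
    return np.stack([
        luma[top:top + patch_size, left:left + patch_size]
        for top, left in zip(tops, lefts)
    ])


def psnr(reference: np.ndarray, estimate: np.ndarray, peak: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB for images with maximum value `peak`."""
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(peak ** 2 / mse))
```

A typical use would be to convert each frame with `rgb_to_luminance`, draw training patches with `random_patches(luma, 41, 64, np.random.default_rng(0))`, and compare a reconstructed luma channel against the ground truth with `psnr`. The patch size of 41 and count of 64 here are placeholders, not values taken from the paper.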

Keywords

Video super-resolution
Deep learning
Hidden Markov random field
Very deep super-resolution
Canny edge detection
