10.57647/j.mjee.2025.17063

Towards Omni-Font Optical Character Recognition (OCR) for Persian script using the YOLO object detection model

  1. Department of Electrical and Computer Engineering, Jundi-Shapur University of Technology, Dezful, Iran

Received: 2025-06-03

Revised: 2025-06-24

Accepted: 2025-07-12

How to Cite

Gandomkar, M., & Khoramipour, S. Towards Omni-Font Optical Character Recognition (OCR) for Persian script using the YOLO object detection model. Majlesi Journal of Electrical Engineering. https://doi.org/10.57647/j.mjee.2025.17063

Abstract

Optical Character Recognition (OCR) faces significant challenges for structurally complex scripts such as Persian, where characters differ in subtle ways and vary with context. This study presents a straightforward and scalable approach to developing omni-font OCR systems. A synthetic dataset incorporating words, numbers, punctuation marks, mathematical symbols, and whitespace characters is developed to evaluate YOLO's capability in detecting 70 characters across 15 formal, informal, and handwritten-style fonts. The proposed method detects regular space and non-breaking space characters with high precision, which may eliminate the need for a separate word detection stage in an OCR system. In a second investigation, we attempt to detect characters from an unseen font by training the model on a batch of other fonts. A formal font such as “B Nazanin” is almost completely detectable when the model is trained on a batch of fourteen other fonts, without the font itself being included in training. For a handwritten-style font such as “MRT_Sayeh-1”, the mean detectability increases from 54%, when the model is trained on single other fonts, to 80%, when it is trained on a batch of fourteen other fonts. Overall, this study demonstrates that object detection-based OCR models have the potential for omni-font text recognition through expanded datasets and advancements in deep learning.
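
To illustrate why detecting whitespace as ordinary classes could make a separate word detection stage unnecessary, the following is a minimal Python sketch, not the authors' implementation. It assumes the detector returns one labelled box per character on a single text line. The `Detection` type, the `"space"` and `"nbsp"` label names, the mapping of `"nbsp"` to U+00A0, and the 0.5 confidence threshold are all hypothetical choices made for this example.

```python
from dataclasses import dataclass

# Hypothetical label names for the two whitespace classes; the label set
# actually used in the study is not given in the abstract.
WHITESPACE = {"space": " ", "nbsp": "\u00a0"}


@dataclass
class Detection:
    label: str         # a character, or "space"/"nbsp" for whitespace
    x_center: float    # horizontal centre of the bounding box, in pixels
    confidence: float  # detector score in [0, 1]


def assemble_line(detections, min_confidence=0.5):
    """Join per-character detections from one text line into a string."""
    kept = [d for d in detections if d.confidence >= min_confidence]
    # Persian is written right to left, so the rightmost box comes first.
    kept.sort(key=lambda d: d.x_center, reverse=True)
    return "".join(WHITESPACE.get(d.label, d.label) for d in kept)


def split_words(text):
    """Split on regular spaces only; a non-breaking space stays in its word."""
    return [w for w in text.split(" ") if w]


# Example input, deliberately out of order, with one low-confidence box.
example = [
    Detection("ت", 70.0, 0.93),
    Detection("م", 100.0, 0.97),
    Detection("space", 80.0, 0.88),
    Detection("و", 60.0, 0.95),
    Detection("ن", 90.0, 0.96),
    Detection("ب", 85.0, 0.20),  # dropped by the confidence threshold
]
line_text = assemble_line(example)
words = split_words(line_text)
```

Because the space boxes are sorted together with the letter boxes, word boundaries fall out of the same pass that orders the characters, so no second model or layout heuristic is needed to group characters into words in this simplified single-line setting.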

Keywords

  • Optical Character Recognition
  • Object Detection
  • YOLO
  • Persian Script
  • Handwritten-style fonts
