People Counting in Sample Video Footage Using CNN Integrated with YOLOv5

Authors

  • Ahmad Hasan Faqih Aulia, IPB University
  • Carissa Fathinah Balti, IPB University
  • Keisyah Zahra Anatasya, IPB University
  • Gema Parasti Mindara, IPB University
  • Endang Purnama Giri, IPB University

DOI:

https://doi.org/10.59934/jaiea.v5i2.1933

Keywords:

Convolutional Neural Network, Image Processing, Object Detection, People Counting, YOLOv5

Abstract

Accurate people counting in dynamic environments remains challenging because of variations in lighting, complex backgrounds, and occlusion. This study proposes a video-based people counting system that integrates a Convolutional Neural Network (CNN) with the YOLOv5 object detection model. Before detection, the system applies a structured preprocessing pipeline (frame extraction, normalization, and noise reduction) to make the input data more consistent. The model was evaluated on ten real-world campus video sequences to assess detection reliability and counting accuracy. Experimental results show that the proposed method achieves high precision and recall for real-time detection across diverse scenarios. Performance degraded in frames containing dense crowds or low illumination, which indicates limitations under extreme conditions. These findings support the feasibility of lightweight CNN-based detectors for surveillance and monitoring applications, and they highlight the need for larger datasets and optimized training strategies to improve robustness in more complex environments.
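
The pipeline stages named in the abstract (normalization, noise reduction, per-frame counting, and precision/recall evaluation) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not state which filter, confidence threshold, or data structures were used, so the 3x3 mean filter, the 0.5 threshold, and every name below (`Detection`, `normalize`, `mean_filter`, `count_people`, `precision_recall`) are assumptions chosen for illustration. The sketch also does not run YOLOv5; it assumes a detector has already produced labelled detections for each frame.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

Frame = List[List[float]]  # hypothetical grayscale frame: rows of pixel values


@dataclass
class Detection:
    """One detector output for a frame (hypothetical structure)."""
    label: str
    confidence: float


def normalize(frame: Sequence[Sequence[float]]) -> Frame:
    """Scale 8-bit pixel intensities from [0, 255] to [0, 1]."""
    return [[px / 255.0 for px in row] for row in frame]


def mean_filter(frame: Sequence[Sequence[float]]) -> Frame:
    """Simple noise reduction: 3x3 mean filter (an assumed choice).

    Border pixels are averaged over the neighbours that lie inside the frame.
    """
    height, width = len(frame), len(frame[0])
    out: Frame = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [
                frame[j][i]
                for j in range(max(0, y - 1), min(height, y + 2))
                for i in range(max(0, x - 1), min(width, x + 2))
            ]
            row.append(sum(window) / len(window))
        out.append(row)
    return out


def count_people(detections: Sequence[Detection], min_conf: float = 0.5) -> int:
    """Count detections labelled 'person' at or above the assumed threshold."""
    return sum(
        1 for d in detections if d.label == "person" and d.confidence >= min_conf
    )


def precision_recall(tp: int, fp: int, fn: int) -> Tuple[float, float]:
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall
```

In a real system the detections would come from a trained YOLOv5 model applied to each extracted frame, and the counts would be compared against manually annotated ground truth to obtain the true-positive, false-positive, and false-negative totals passed to `precision_recall`.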

Published

2026-02-15

How to Cite

Aulia, A. H. F., Balti, C. F., Anatasya, K. Z., Mindara, G. P., & Giri, E. P. (2026). People Counting in Sample Video Footage Using CNN Integrated with YOLOv5. Journal of Artificial Intelligence and Engineering Applications (JAIEA), 5(2), 2572–2580. https://doi.org/10.59934/jaiea.v5i2.1933