Abstract

Closed-circuit television (CCTV) monitoring is a central application of video processing and provides an efficient means of comprehensive surveillance. A key challenge, however, is its substantial demand for storage space. Surveillance footage is typically stored on hard disk drives and, because storage capacity is limited, is deleted after some time. To address this issue, we introduce a method for compressing CCTV video, named object detection-based surveillance compression (ODSC). ODSC works in two steps: (i) classify each frame of the surveillance video as significant or non-significant according to the objects it contains, using the neural network detectors YOLOv5s, YOLOv7-tiny, and YOLOv8s; and (ii) construct the output video from the significant frames only. In our experiments, YOLOv8s stands out with a detection accuracy of 99.7% on the COCO dataset. ODSC greatly reduces the required storage space, achieving an average compression ratio of up to 96.31% with YOLOv8s, which surpasses existing state-of-the-art methods.
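The two steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the detector is passed in as a plain callable (in practice this could wrap a YOLO model), and the names `select_significant_frames`, `space_saving_percent`, `target_labels`, and `min_confidence` are hypothetical. The sketch also assumes that "compression ratio" means the percentage of the original size that is saved, which the abstract does not state explicitly.

```python
from typing import Callable, Iterable, List, Sequence, Set, Tuple, TypeVar

Frame = TypeVar("Frame")
Detection = Tuple[str, float]  # (class label, confidence score)


def select_significant_frames(
    frames: Iterable[Frame],
    detect: Callable[[Frame], Sequence[Detection]],
    target_labels: Set[str],
    min_confidence: float = 0.5,  # assumed threshold, not taken from the paper
) -> List[Frame]:
    """Step (i): keep frames where the detector reports a target object."""
    significant: List[Frame] = []
    for frame in frames:
        detections = detect(frame)
        if any(
            label in target_labels and score >= min_confidence
            for label, score in detections
        ):
            significant.append(frame)
    # Step (ii) would write these frames, in order, to the output video.
    return significant


def space_saving_percent(original_size: float, compressed_size: float) -> float:
    """Percentage of the original size saved (assumed meaning of the ratio)."""
    if original_size <= 0:
        raise ValueError("original_size must be positive")
    return 100.0 * (original_size - compressed_size) / original_size


if __name__ == "__main__":
    # Toy stand-in for a detector: frame index -> list of detections.
    fake_detections = {
        0: [],
        1: [("person", 0.91)],
        2: [("person", 0.30)],  # below the confidence threshold
        3: [("car", 0.88)],
        4: [],
    }
    kept = select_significant_frames(
        range(5), lambda i: fake_detections[i], {"person", "car"}
    )
    print(kept)                                # [1, 3]
    print(space_saving_percent(5, len(kept)))  # 60.0
```

Passing the detector as a callable keeps the frame-selection logic independent of any particular YOLO version, so the same loop could be run with each of the three detectors compared in the paper.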

Keywords

Significant Frames, Non-Significant Frames, Object Detection, YOLOv5, YOLOv7, YOLOv8, Surveillance Compression

