Beyond Sight: VQA for Car Parking Detection Using YOLOv8

Sphoorti Kulkarni, Anushree Bashetti, Samarth Nasabi, Kaushik Mallibhat

2025

Abstract

Visual Question Answering (VQA) is an application area of Artificial Intelligence that enables machines to understand image content and answer questions about an image. VQA integrates vision-based techniques with natural language processing. A VQA model uses the visual elements of the image together with information from the question to generate the best possible answer. This paper demonstrates the use of the state-of-the-art object detection algorithm You Only Look Once (YOLO) to identify free and occupied slots in a car parking system. Existing VQA systems for parking often struggle with real-time operation under dynamic and varying lighting conditions, such as night or bad weather, and cannot handle user queries about parking availability, which limits their usability and effectiveness in practice. In this paper, a car parking VQA model is designed in which both an image and a user question are fed as input to the proposed system. The image is captured in real time by a camera installed in the parking lot, and users select a question from a provided menu. The system offers a user-friendly, menu-based question-answering interface that lets users select questions of interest and receive responses based on the detected parking slot information. The proposed approach uses a YOLOv8 model, trained on the annotated PKLot dataset, to detect and count both parked and vacant slots in real time. This detection is integrated with the menu-based question-answer system, allowing users to interact with the model and receive accurate slot information for their selected queries. The model performs well both day and night, even in low-light conditions, because the diverse PKLot dataset used for training covers varied days and weather conditions.
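The menu-driven question-answering layer described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the class names `vacant` and `occupied`, the menu wording, and the `answer_query` helper are all hypothetical, and in the full system the counts would come from YOLOv8 detections on a live camera frame rather than being passed in directly.

```python
# Minimal sketch of the menu-based QA layer (hypothetical helper and
# class names). In the paper's pipeline the per-class counts come from
# YOLOv8 detections on a parking-lot image; here they are passed in
# directly for illustration.

MENU = {
    1: "How many vacant slots are there?",
    2: "How many occupied slots are there?",
    3: "Is any parking slot available?",
}

def answer_query(counts, choice):
    """Answer a menu question from per-class detection counts.

    counts: dict with keys 'vacant' and 'occupied' (assumed class names).
    choice: menu option number selected by the user.
    """
    vacant = counts.get("vacant", 0)
    occupied = counts.get("occupied", 0)
    if choice == 1:
        return f"There are {vacant} vacant slots."
    if choice == 2:
        return f"There are {occupied} occupied slots."
    if choice == 3:
        return "Yes, parking is available." if vacant > 0 else "No, the lot is full."
    raise ValueError(f"Unknown menu option: {choice}")

# Example: counts as a YOLOv8 detector might produce them for one frame.
counts = {"vacant": 12, "occupied": 38}
print(answer_query(counts, 3))  # → "Yes, parking is available."
```

The detector and the answering logic are deliberately decoupled here, so the same menu layer could sit on top of any slot detector that reports per-class counts.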
To improve efficiency and accuracy, some images are converted to grayscale and a custom dataset is created. This preprocessing optimizes performance on monochromatic nighttime images, improving results under varying lighting. Trained on the PKLot dataset, the model achieved a mean Average Precision (mAP) of 0.994 for vacant slots and 0.993 for parked slots. The robust performance stems from the diversity of the PKLot images and the preprocessing strategy. In simpler terms, to make parking more efficient, a novel approach combines computer vision with a questionnaire presented in a user-friendly menu style, inspired by natural language processing (NLP). This technique enables the model to accurately detect whether parking spaces are available or occupied, making the parking experience more convenient and hassle-free.
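The grayscale preprocessing step mentioned above can be sketched as a standard luminance conversion. The paper does not specify the formula used, so the ITU-R BT.601 luma weights applied here are an assumption; they are the common default in image libraries such as OpenCV and Pillow.

```python
# Sketch of the grayscale preprocessing step. The exact conversion is an
# assumption (the paper does not give one); ITU-R BT.601 luma weights
# (0.299 R + 0.587 G + 0.114 B) are the usual default.

def rgb_to_grayscale(pixels):
    """Convert a list of (R, G, B) pixels to single-channel intensities."""
    return [round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in pixels]

# Pure white keeps full intensity; pure red maps to its luma weight.
print(rgb_to_grayscale([(255, 255, 255), (255, 0, 0)]))  # [255, 76]
```

In practice this conversion would be applied image-wide with a vectorized call (e.g. OpenCV's `cv2.cvtColor` with `cv2.COLOR_BGR2GRAY`) before the images enter the training set.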



Paper Citation


in Harvard Style

Kulkarni S., Bashetti A., Nasabi S. and Mallibhat K. (2025). Beyond Sight: VQA for Car Parking Detection Using YOLOv8. In Proceedings of the 3rd International Conference on Futuristic Technology - Volume 2: INCOFT; ISBN 978-989-758-763-4, SciTePress, pages 834-843. DOI: 10.5220/0013604100004664


in Bibtex Style

@conference{incoft25,
author={Sphoorti Kulkarni and Anushree Bashetti and Samarth Nasabi and Kaushik Mallibhat},
title={Beyond Sight: VQA for Car Parking Detection Using YOLOv8},
booktitle={Proceedings of the 3rd International Conference on Futuristic Technology - Volume 2: INCOFT},
year={2025},
pages={834-843},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0013604100004664},
isbn={978-989-758-763-4},
}


in EndNote Style

TY - CONF

JO - Proceedings of the 3rd International Conference on Futuristic Technology - Volume 2: INCOFT
TI - Beyond Sight: VQA for Car Parking Detection Using YOLOv8
SN - 978-989-758-763-4
AU - Kulkarni S.
AU - Bashetti A.
AU - Nasabi S.
AU - Mallibhat K.
PY - 2025
SP - 834
EP - 843
DO - 10.5220/0013604100004664
PB - SciTePress