A VISION TRANSFORMER-BASED APPROACH TO TRAFFIC SIGN RECOGNITION ON A HYBRID BANGLADESHI DATASET

Abstract
Traffic sign detection and classification play a crucial role in advancing Intelligent Transportation Systems (ITS) by ensuring road safety and supporting autonomous vehicle technologies. In this study, we present a novel approach to Bangladeshi traffic sign recognition by constructing a hybrid dataset that combines two distinct datasets: one contributes 13 predefined classes, while the other contributes images that were selectively reorganized into 11 additional traffic sign classes. The merged hybrid dataset thus covers 24 diverse traffic sign classes, offering a comprehensive and unique benchmark for Bangladeshi traffic sign recognition. We propose a Vision Transformer (ViT)-based model, fine-tuned for optimal performance on the hybrid dataset, and evaluate it against several state-of-the-art convolutional architectures, including VGG19, DenseNet121, Inception-ResNetV2 (IRv2), and Xception. Experimental results demonstrate that the ViT model achieves the highest validation accuracy of 99.7%, outperforming all baseline models. This study underscores the benefits of Vision Transformers for Bangladeshi traffic sign recognition, particularly in handling the variability and complexity of the signs; it establishes a strong benchmark and provides insights into applying such models to real-world challenges.

Keywords - Bangladeshi Dataset, Recognition and Classification, Traffic Sign, Vision Transformer
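
As a rough illustration of the fine-tuning setup summarized above, the following sketch shows how an ImageNet-pretrained ViT could be adapted to a 24-class traffic sign dataset. It is a minimal sketch, not the authors' code: the PyTorch/torchvision stack, the ViT-B/16 backbone, the dataset path, and all hyperparameters are assumptions made for illustration only.

```python
# Minimal sketch (assumed PyTorch/torchvision setup, not the paper's actual code)
# of fine-tuning a pretrained ViT for 24-class Bangladeshi traffic sign recognition.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, transforms, models

NUM_CLASSES = 24  # 13 classes from one source dataset + 11 reorganized classes

# Standard ImageNet preprocessing expected by the pretrained ViT backbone
preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

# Hypothetical ImageFolder layout; replace with the actual hybrid dataset location
train_set = datasets.ImageFolder("hybrid_bd_signs/train", transform=preprocess)
train_loader = DataLoader(train_set, batch_size=32, shuffle=True)

# Load an ImageNet-pretrained ViT-B/16 and replace its classification head
model = models.vit_b_16(weights=models.ViT_B_16_Weights.IMAGENET1K_V1)
model.heads.head = nn.Linear(model.heads.head.in_features, NUM_CLASSES)

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model.to(device)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
criterion = nn.CrossEntropyLoss()

# One illustrative fine-tuning epoch
model.train()
for images, labels in train_loader:
    images, labels = images.to(device), labels.to(device)
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
```

The same loop structure would apply to the convolutional baselines (VGG19, DenseNet121, IRv2, Xception) by swapping the backbone and its classification head, which is how such comparisons are typically made consistent.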