Paper Title
FORESIGHT: ACCESSIBILITY ACCELERATOR USING MULTIMODAL GENERATIVE AI
Abstract
This project introduces an accessibility accelerator that uses multimodal generative AI to assist people with partial visual impairment. It is delivered as a user-friendly mobile app that addresses visual, cognitive, and physical challenges and offers personalized, context-aware interaction. The system combines LLaVA, which builds on the CLIP vision encoder and the Vicuna language model, with Detectron for object grounding. Together, these models produce clear, intuitive scene descriptions in which the objects mentioned are color-coded to match the corresponding objects in the image. The app also supports dynamic conversation flows, so users can ask follow-up questions about a scene. By bridging the gap between partially visually impaired users and technology, the project aims to foster independence, inclusivity, and a sense of belonging.
Keywords - Multimodal Generative AI, Partially Visually Impaired, LLaVA, Detectron, Vision Augmentation.