Paper Title
Transforming Dataset Creation: Leveraging Generative AI for Machine Vision Applications in Data-Limited Domains
Abstract
In machine vision applications, particularly in data-limited domains such as pharmaceutical manufacturing, the scarcity of anomaly data is a significant obstacle to effective model training. This study proposes an approach that mimics how human inspectors learn quality inspection: they identify defects from a few exemplary samples and verbal descriptions of potential anomalies. By translating this human framework into a generative AI pipeline, we demonstrate how machines can learn from limited examples and textual guidance. We compare three fine-tuning techniques for diffusion models (DreamBooth, LoRA, and Textual Inversion) and show that they can generate realistic synthetic images, including defect variations that are difficult or impossible to create physically. Our results confirm that diffusion models outperform traditional generative methods such as GANs and VAEs in stability, diversity, and control over image details. This approach bridges the gap between human and machine perception, making AI-based inspection more intuitive and scalable for industries such as healthcare, manufacturing, and agriculture.
Keywords - Synthetic Data Generation, Diffusion Models, Machine Vision, Pharmaceutical Inspection, Data Scarcity.