Paper Title
SENTIMENT ANALYSIS OF BENGALI-ENGLISH-HINDI CODE-MIXED SOCIAL MEDIA TEXT USING MACHINE LEARNING TECHNIQUES
Abstract
In multilingual societies such as India, code-mixed text blending Bengali, Hindi, and English is increasingly prevalent on social media and poses distinct challenges for natural language processing tasks such as sentiment analysis. This work introduces a machine learning framework for sentiment classification on a balanced dataset of approximately 850 Bengali-English-Hindi code-mixed sentences. Using TF-IDF vectorization, we train two classical classifiers, a Support Vector Machine (SVM) and Multinomial Naive Bayes (NB), and achieve a maximum accuracy of 61.87%. Our evaluation shows stronger performance on positive sentiment, while neutral sentiment remains difficult to classify accurately. The study offers a foundational benchmark to guide future research in trilingual code-mixed sentiment analysis.
Keywords - Code-mixing, Sentiment Analysis, Bengali, Hindi, English, Machine Learning, TF-IDF, SVM, Naive Bayes
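
For illustration, the TF-IDF plus Multinomial Naive Bayes part of the pipeline summarized in the abstract can be sketched in plain Python as below. This is a minimal sketch under stated assumptions, not the implementation used in this work: the toy sentences, the whitespace tokenizer, the smoothed IDF formula, and all function names (`tokenize`, `fit_idf`, `tfidf`, `train_nb`, `predict`) are hypothetical, and the SVM classifier is omitted for brevity.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy examples in romanized code-mixed text.
# These are NOT taken from the paper's dataset.
TRAIN = [
    ("movie ta khub bhalo laglo, totally loved it", "positive"),
    ("yeh gaana bahut accha hai, darun feel", "positive"),
    ("service ekdum bekar thi, khub kharap experience", "negative"),
    ("ei phone ta bekar, paisa waste ho gaya", "negative"),
    ("kal office jabo, meeting hai at ten", "neutral"),
    ("ami ekhon bari te achi, phone pe baat karenge", "neutral"),
]


def tokenize(text):
    """Lowercase, split on whitespace, strip basic punctuation (an assumption)."""
    tokens = (t.strip(".,!?") for t in text.lower().split())
    return [t for t in tokens if t]


def fit_idf(docs):
    """Smoothed inverse document frequency: ln((1 + n) / (1 + df)) + 1."""
    n = len(docs)
    df = Counter(tok for doc in docs for tok in set(doc))
    return {tok: math.log((1 + n) / (1 + c)) + 1.0 for tok, c in df.items()}


def tfidf(doc, idf):
    """L2-normalized TF-IDF vector as a sparse dict; unseen tokens are dropped."""
    counts = Counter(tok for tok in doc if tok in idf)
    vec = {tok: c * idf[tok] for tok, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {tok: w / norm for tok, w in vec.items()} if norm else {}


def train_nb(vectors, labels, vocab, alpha=1.0):
    """Multinomial NB over TF-IDF weights with additive (Laplace) smoothing."""
    class_counts = Counter(labels)
    feature_sums = defaultdict(lambda: defaultdict(float))
    for vec, label in zip(vectors, labels):
        for tok, w in vec.items():
            feature_sums[label][tok] += w
    model = {}
    for label, n_docs in class_counts.items():
        total = sum(feature_sums[label].values()) + alpha * len(vocab)
        log_prior = math.log(n_docs / len(labels))
        log_lik = {
            tok: math.log((feature_sums[label].get(tok, 0.0) + alpha) / total)
            for tok in vocab
        }
        model[label] = (log_prior, log_lik)
    return model


def predict(model, vec):
    """Return the label with the highest log-posterior score."""
    def score(label):
        log_prior, log_lik = model[label]
        return log_prior + sum(w * log_lik[tok] for tok, w in vec.items())
    return max(model, key=score)


docs = [tokenize(text) for text, _ in TRAIN]
labels = [label for _, label in TRAIN]
idf = fit_idf(docs)
model = train_nb([tfidf(d, idf) for d in docs], labels, vocab=set(idf))

# Classify a new (also hypothetical) code-mixed sentence.
print(predict(model, tfidf(tokenize("khub bhalo movie, loved it"), idf)))
```

With only a handful of toy sentences the prediction carries no evaluative weight; the sketch is meant to show how TF-IDF weights feed a Multinomial NB classifier, and it makes no claim about reproducing the reported accuracy.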