<strong>Paper Title</strong><br>

COMPARATIVE EVALUATION OF MACHINE LEARNING MODELS FOR EARLY MENTAL HEALTH RISK PREDICTION USING WORKPLACE SURVEY DATA<br>

<br>


<strong>Abstract</strong><br>

Mental health problems have become a major global concern because they affect people across age groups, professions, and workplaces [1], [5]. In many cases they remain undiagnosed until they seriously affect a person's work and home life. It is therefore important to identify mental health problems at an early stage and provide support before severe psychological consequences develop. This paper presents a machine learning-based model that predicts an individual's likelihood of seeking mental health care from structured survey data collected in the workplace [9], [11]. The dataset includes demographic information, work environment factors, and mental health history, all of which may negatively affect a person's mental well-being.
Several preprocessing steps were applied before model development to improve data quality and support reliable predictions. These steps comprised imputing missing values, encoding categorical variables in numeric form, and selecting the most relevant features for model building. Four supervised machine learning algorithms were developed and analysed: Logistic Regression, Support Vector Machine, Random Forest, and Gradient Boosting [2], [3]. The dataset was divided into an 80 percent training set and a 20 percent test set to assess the generalization ability of the models.
Standard classification metrics, including accuracy and F1-score, were used to evaluate model performance. The experimental results show that ensemble learning models outperform conventional classification methods in predicting the need for mental health treatment. The Gradient Boosting algorithm achieved the best result of all the models, with an accuracy of 75.39% and an F1-score of 0.75. Furthermore, feature importance analysis identified work interference caused by mental health problems, family history of mood/psychotic disorders, and the availability of workplace care as the top predictors. The results indicate that machine learning methods may aid in the early warning of mental health risks, but they should not be viewed as a substitute for professional clinical diagnosis.
The primary data source for this work is the Mental Health in Tech Survey dataset, available on Kaggle [8]. It contains about 1,433 responses, each with about 63 attributes covering mental well-being, job conditions, personal details, and past treatment. This gives a broad view of the factors that shape emotional health in the workplace. Because excessive detail slows modelling, only 20 to 25 important features were retained, each assessed for predictive strength and clarity. The remaining attributes were dropped at the start of data cleaning.
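
The 80/20 split and the two reported metrics can be illustrated with a minimal sketch. It is not the code used in this study: the function names, the fixed random seed, the toy labels, and the choice of a binary F1-score for the positive class are all assumptions made for illustration, since the averaging scheme is not stated here.

```python
import random


def split_80_20(rows, seed=0):
    """Shuffle a copy of the rows and return (train, test) in an 80/20 ratio.

    The seed is an illustrative assumption, used only for reproducibility.
    """
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(0.8 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def f1_score(y_true, y_pred, positive=1):
    """Binary F1-score: harmonic mean of precision and recall (assumed variant)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: 1 = sought treatment, 0 = did not.
train, test = split_80_20(range(10))
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
print(len(train), len(test))
print(accuracy(y_true, y_pred), f1_score(y_true, y_pred))
```

In practice a library such as scikit-learn would typically handle both the split and the metrics; the plain-Python version is shown only to make the definitions explicit.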

Keywords - Mental Health Prediction, Machine Learning, Random Forest, Gradient Boosting.