<strong>Paper Title</strong><br>

Benchmarking Study of Classical Forecasting and Kernel-Based Methods Against Gradient Boosting for Cloud Resource Load Prediction<br>

<br>


<strong>Abstract</strong><br>

Accurate prediction of cloud resource load is crucial for efficient scheduling, elastic provisioning, and meeting service-level agreements (SLAs) across large-scale distributed systems. Classical linear and regularized regression techniques yield simple models with a low computational footprint, but they do not effectively model highly dynamic, heterogeneous workloads. This paper presents a systematic benchmark of traditional regression methods against modern gradient boosting regression techniques for predicting CPU load on real-world Google Borg trace data. We develop a complete prediction pipeline comprising data pre-processing, multi-modal feature extraction, statistical feature selection, temporal feature engineering, and dimensionality reduction. The models are evaluated on predictive accuracy, computational complexity, memory consumption, and inference latency. On every accuracy metric evaluated in our experiments, the three gradient boosting models (XGBoost, LightGBM, and CatBoost) outperformed the traditional approaches, yielding R<sup>2</sup> values greater than 0.999 and substantially lower prediction errors. These models were more expensive to train, but their inference latencies remained practical for real-time cloud orchestration; the traditional models trained faster but captured non-linear behavior less accurately. The results show a clear trade-off between accuracy and efficiency across modelling approaches, with gradient boosted models offering the best balance of the two for managing resources under production workloads.
We also provide a reproducible benchmarking framework to guide model selection for intelligent cloud computing environments and to inform the future adoption and deployment of models for cloud resource management.
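
As a rough illustration of the evaluation criteria named above, the following minimal sketch computes R<sup>2</sup>, a prediction error, and a per-row inference latency for an arbitrary predictor. It is not the paper's benchmarking framework: the function names (`r2_score`, `rmse`, `mean_latency_ms`), the choice of RMSE as the error measure, and the toy CPU-load values are all assumptions made for illustration.

```python
import math
import time


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_bar = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - y_bar) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error (assumed here as the error measure)."""
    sq = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(sq) / len(sq))


def mean_latency_ms(predict, rows, repeats=5):
    """Average wall-clock time per predicted row, in milliseconds."""
    start = time.perf_counter()
    for _ in range(repeats):
        for row in rows:
            predict(row)
    elapsed = time.perf_counter() - start
    return 1000.0 * elapsed / (repeats * len(rows))


# Toy CPU-load values (hypothetical, not taken from the Borg trace).
y_true = [0.2, 0.4, 0.6, 0.8]
baseline = [sum(y_true) / len(y_true)] * len(y_true)  # mean predictor

print("R2 of mean predictor:", r2_score(y_true, baseline))
print("RMSE of mean predictor:", rmse(y_true, baseline))
print("latency (ms/row):", mean_latency_ms(lambda row: 0.5, y_true))
```

A predictor that always returns the mean of the targets scores an R<sup>2</sup> of zero, which gives a reference point for reading values such as the 0.999 reported above.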

Keywords - Cloud Computing, Resource Load Prediction, Gradient Boosting, Performance Benchmarking.