Gradient boosting as a tool for solving classification problems in data-constrained environments

Kyrychek, M.

doi:10.30857/2786-5371.2025.2.3

Please use this identifier to cite or link to this item: https://er.knutd.edu.ua/handle/123456789/33633

Title:	Gradient boosting as a tool for solving classification problems in data-constrained environments
Other Titles:	Градієнтний бустінг як інструмент для вирішення задач класифікації в умовах обмежених даних
Authors:	Kyrychek, M.
Keywords:	machine learning; adaptive algorithms; model optimisation; XGBoost; hyperparameterisation
Issue Date:	2025
Citation:	Kyrychek M. Gradient boosting as a tool for solving classification problems in data-constrained environments = Градієнтний бустінг як інструмент для вирішення задач класифікації в умовах обмежених даних [Текст] / M. Kyrychek // Технології та інжиніринг. - 2025. - № 2 (26). - С. 37-47.
Source:	Технології та інжиніринг
Abstract:	Modern machine learning faces the problem of building effective classification models when training data are scarce. The purpose of the study was to analyse the possibilities of using gradient boosting to solve classification problems in data-constrained environments. The research methodology was based on a comprehensive analysis of the leading gradient boosting implementations: XGBoost, LightGBM, and HistGradientBoosting. The main focus was on regularisation mechanisms, hyperparameter optimisation strategies, and adaptive learning techniques under small-sample conditions. The research aimed to identify the architectural features of the algorithms that can provide high classification accuracy with a minimal amount of data. The algorithms under consideration were found to have significant potential for solving classification problems effectively. The shrinkage and subsampling mechanisms were found to substantially improve the generalisation ability of the models. The results of the study expanded the theoretical understanding of ensemble machine learning methods and outlined promising directions for adapting algorithms to the specific conditions of limited information resources. XGBoost, LightGBM, and HistGradientBoosting were shown to have unique architectural features that allow them to work efficiently with different types of data. The internal regularisation mechanisms of these algorithms were found to provide resistance to overfitting and high prediction accuracy. The study showed the potential of gradient boosting for solving complex classification problems in medicine, finance, and other fields with limited information resources. The practical significance of the study lay in developing methodological recommendations for selecting and configuring gradient boosting algorithms for various types of classification problems. The results will be useful for the further development of machine learning methods.
DOI:	10.30857/2786-5371.2025.2.3
URI:	https://er.knutd.edu.ua/handle/123456789/33633
ISSN:	2786-538X
Appears in Collections:	Scientific publications (articles); Технології та інжиніринг
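
The abstract credits shrinkage and subsampling with improving generalisation on small samples. The sketch below is a rough, standard-library-only illustration of where those two settings enter a boosting loop. It is not taken from the article and is not how XGBoost, LightGBM, or HistGradientBoosting are implemented: the function names (`fit_stump`, `fit_boosting`, `predict_proba`), the one-split trees, the parameter defaults, and the twelve-point toy dataset are all assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_stump(X, residuals, rows):
    """Fit a one-split regression tree (stump) by least squares on `rows`."""
    best = None
    for j in range(len(X[0])):
        values = sorted({X[i][j] for i in rows})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [residuals[i] for i in rows if X[i][j] <= thr]
            right = [residuals[i] for i in rows if X[i][j] > thr]
            if not left or not right:
                continue
            lmean = sum(left) / len(left)
            rmean = sum(right) / len(right)
            sse = (sum((r - lmean) ** 2 for r in left)
                   + sum((r - rmean) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, j, thr, lmean, rmean)
    if best is None:
        # No usable split (all sampled rows identical): constant prediction.
        mean = sum(residuals[i] for i in rows) / len(rows)
        return (0, float("inf"), mean, mean)
    return best[1:]


def stump_predict(stump, x):
    j, thr, left_value, right_value = stump
    return left_value if x[j] <= thr else right_value


def fit_boosting(X, y, n_rounds=60, learning_rate=0.1, subsample=0.75, seed=0):
    """Toy gradient boosting for binary labels (0/1) with log-loss."""
    rng = random.Random(seed)
    n = len(X)
    scores = [0.0] * n  # raw log-odds, one per training row
    stumps = []
    for _ in range(n_rounds):
        # Negative gradient of the log-loss with respect to the raw score.
        residuals = [y[i] - sigmoid(scores[i]) for i in range(n)]
        # Subsampling: each round sees a random fraction of the rows.
        rows = rng.sample(range(n), max(2, int(subsample * n)))
        stump = fit_stump(X, residuals, rows)
        stumps.append(stump)
        # Shrinkage: only a fraction of each stump's output is added.
        for i in range(n):
            scores[i] += learning_rate * stump_predict(stump, X[i])
    return {"learning_rate": learning_rate, "stumps": stumps}


def predict_proba(model, x):
    lr = model["learning_rate"]
    score = sum(lr * stump_predict(s, x) for s in model["stumps"])
    return sigmoid(score)


# Hypothetical twelve-point dataset: one feature, classes split near 0.5.
X = [[0.10], [0.20], [0.30], [0.35], [0.40], [0.45],
     [0.60], [0.65], [0.70], [0.80], [0.90], [0.95]]
y = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]

model = fit_boosting(X, y, n_rounds=60, learning_rate=0.1, subsample=0.75)
```

Calling `predict_proba(model, [0.5])` returns the estimated probability of class 1 for a new point. Lowering `learning_rate` makes each round contribute less, so more rounds are needed, while `subsample` below 1.0 makes successive stumps see different rows; both are common ways to limit overfitting when few observations are available.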

Files in This Item:

File	Description	Size	Format
TI_2025_N2(26)_P037-047.pdf		1,3 MB	Adobe PDF
