Use the gradient boosting classes in Scikit-Learn to solve different classification and regression problems
In the first part of this article, we presented the gradient boosting algorithm and showed its implementation in pseudocode.
In this part of the article, we will explore the classes in Scikit-Learn that implement this algorithm, discuss their various parameters, and demonstrate how to use them to solve several classification and regression problems.
Although the XGBoost library (which will be covered in a future article) provides a more optimized and highly scalable implementation of gradient boosting, for small to medium-sized data sets it is often easier to use the gradient boosting classes in Scikit-Learn, which have a simpler interface and significantly fewer hyperparameters to tune.
Scikit-Learn provides the following classes that implement the gradient-boosted decision trees (GBDT) model:
- GradientBoostingClassifier is used for classification problems.
- GradientBoostingRegressor is used for regression problems.
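Before looking at the parameters, here is a minimal sketch of how the two classes are used. The synthetic data sets generated with make_classification and make_regression are assumptions made only for illustration:

```python
from sklearn.datasets import make_classification, make_regression
from sklearn.ensemble import GradientBoostingClassifier, GradientBoostingRegressor
from sklearn.model_selection import train_test_split

# A toy classification problem
X, y = make_classification(n_samples=500, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

clf = GradientBoostingClassifier(random_state=42)
clf.fit(X_train, y_train)
print('classification accuracy:', clf.score(X_test, y_test))

# A toy regression problem
X, y = make_regression(n_samples=500, n_features=10, noise=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

reg = GradientBoostingRegressor(random_state=42)
reg.fit(X_train, y_train)
print('regression R² score:', reg.score(X_test, y_test))
```

With the default settings, both estimators fit 100 trees of depth 3, which is often a reasonable starting point before any tuning.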
In addition to the standard parameters of decision trees, such as criterion, max_depth (set by default to 3) and min_samples_split, these classes provide the following parameters:
- loss — the loss function to be optimized. In GradientBoostingClassifier, this function can be ‘log_loss’ (the default) or ‘exponential’ (which makes gradient boosting behave like the AdaBoost algorithm). In GradientBoostingRegressor, this function can be ‘squared_error’ (the default), ‘absolute_error’, ‘huber’, or ‘quantile’.
- n_estimators — the number of boosting iterations (defaults to 100).
- learning_rate — a factor that shrinks the contribution of each tree (defaults to 0.1).
- subsample — the fraction of samples to use for training each tree (defaults to 1.0).
- max_features — the number of features to consider when searching for the best split in each node. The options are to specify an integer for the number of features, a float for the fraction of features, ‘sqrt’, ‘log2’, or None (the default, which means all the features are considered).
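To see how these parameters fit together, the sketch below passes them to a GradientBoostingRegressor. The specific values are arbitrary assumptions chosen for illustration, not tuned recommendations:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=1000, n_features=20, noise=15, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

reg = GradientBoostingRegressor(
    loss='huber',         # robust loss, less sensitive to outliers
    n_estimators=200,     # number of boosting iterations
    learning_rate=0.05,   # shrink each tree's contribution
    subsample=0.8,        # train each tree on 80% of the samples (stochastic gradient boosting)
    max_features='sqrt',  # consider sqrt(n_features) features at each split
    max_depth=3,          # depth of the individual trees
    random_state=0,
)
reg.fit(X_train, y_train)
print('test R² score:', reg.score(X_test, y_test))
```

Note that lowering learning_rate usually needs to be compensated by increasing n_estimators, since each tree then contributes less to the final prediction.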