XGBoost

#aws #machine-learning #xgboost #boosting

eXtreme Gradient Boosting (sagemaker.xgboost)

Hyperparameters

  • Subsample
    • Subsample ratio of the training instances. Setting it to 0.5 means that XGBoost randomly samples half of the training data before growing trees, which helps prevent overfitting. Subsampling occurs once in every boosting iteration.
  • Eta \(\eta\)
    • alias: learning_rate
    • Step size shrinkage used in updates to prevent overfitting. After each boosting step, the weights of new features are available directly, and eta shrinks the feature weights to make the boosting process more conservative.
  • Gamma \(\gamma\)
    • Minimum loss reduction required to make a further partition on a leaf node of the tree. The larger gamma is, the more conservative the algorithm will be.
  • Alpha \(\alpha\)
    • L1 regularization term on weights. Increasing this value makes the model more conservative.
  • Lambda \(\lambda\)
    • L2 regularization term on weights. Increasing this value makes the model more conservative.
  • eval_metric
    • Evaluation metric for validation data; the default depends on the objective (e.g. rmse for regression, error for classification).
  • scale_pos_weight
    • Controls the balance of positive and negative weights; useful for unbalanced classes. A typical starting value is sum(negative instances) / sum(positive instances).
  • max_depth
    • Maximum depth of a tree. Limiting depth keeps trees simpler and helps prevent overfitting (see the sketch after this list).
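
A minimal sketch of these hyperparameters with the open-source xgboost library. The toy data, the chosen values, and the neg/pos ratio used for scale_pos_weight are illustrative assumptions, not recommendations:

```python
import numpy as np
import xgboost as xgb

# Toy unbalanced binary-classification data (illustrative only)
rng = np.random.default_rng(0)
X = rng.random((1000, 10))
y = (X[:, 0] > 0.8).astype(int)  # roughly 20% positives

dtrain = xgb.DMatrix(X, label=y)

params = {
    "objective": "binary:logistic",
    "subsample": 0.5,   # sample half the rows before growing each tree
    "eta": 0.1,         # learning rate: shrink new feature weights each round
    "gamma": 1.0,       # min loss reduction required to split a leaf further
    "alpha": 0.0,       # L1 regularization on weights
    "lambda": 1.0,      # L2 regularization on weights
    "eval_metric": "auc",
    "scale_pos_weight": float((y == 0).sum()) / (y == 1).sum(),  # neg/pos ratio
    "max_depth": 4,     # shallow trees to limit overfitting
}

booster = xgb.train(params, dtrain, num_boost_round=100)
```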

Instance Choice

  • CPU: M5
  • GPU: P3 (SageMaker XGBoost 1.2 and later; see the sketch below)
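
A sketch of wiring the instance choice into a SageMaker training job via the SageMaker Python SDK. The role ARN, S3 paths, and container version are placeholder assumptions:

```python
import sagemaker
from sagemaker import image_uris
from sagemaker.estimator import Estimator

session = sagemaker.Session()

# Built-in XGBoost container; version 1.2 or later is needed for GPU training
image = image_uris.retrieve("xgboost", region=session.boto_region_name, version="1.5-1")

estimator = Estimator(
    image_uri=image,
    role="arn:aws:iam::111122223333:role/SageMakerRole",  # placeholder role
    instance_count=1,
    instance_type="ml.m5.xlarge",  # CPU; use "ml.p3.2xlarge" for GPU (XGBoost >= 1.2)
    output_path="s3://my-bucket/xgboost-output",          # placeholder bucket
    sagemaker_session=session,
)

# "lambda" is a Python keyword, so pass hyperparameters as an unpacked dict
estimator.set_hyperparameters(**{
    "objective": "binary:logistic",
    "num_round": 100,
    "subsample": 0.5,
    "eta": 0.1,
    "max_depth": 4,
    "lambda": 1.0,
    # "tree_method": "gpu_hist",  # required for GPU training on P3 instances
})

# estimator.fit({"train": "s3://my-bucket/train/"})  # placeholder input channel
```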