eXtreme Gradient Boosting (sagemaker.xgboost)
Hyperparameters
Subsample
- Subsample ratio of the training instances. Setting it to 0.5 means that XGBoost randomly samples half of the training data prior to growing trees, which helps prevent overfitting. Subsampling occurs once per boosting iteration.
Eta \(\eta\) (alias: learning_rate)
- Step size shrinkage used in updates to prevent overfitting. After each boosting step, the weights of new features can be obtained directly, and eta shrinks the feature weights to make the boosting process more conservative.
Gamma \(\gamma\)
- Minimum loss reduction required to make a further partition on a leaf node of the tree. The larger gamma is, the more conservative the algorithm will be.
Alpha \(\alpha\)
- L1 regularization term on weights. Increasing this value makes the model more conservative.
Lambda \(\lambda\)
- L2 regularization term on weights. Increasing this value makes the model more conservative.
eval_metric
- Evaluation metric for validation data. A default metric is assigned according to the objective (e.g., rmse for regression, error for classification).
scale_pos_weight
- Controls the balance of positive and negative weights; useful for unbalanced classes. A typical value to consider is sum(negative instances) / sum(positive instances).
max_depth
- Maximum depth of a tree. Increasing it makes the model more complex and more likely to overfit, so limiting depth prevents overfitting (see the configuration sketch after this list).
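
A minimal sketch of how these hyperparameters might be set with the SageMaker Python SDK's built-in XGBoost algorithm. The role ARN, bucket paths, and all hyperparameter values below are placeholders for illustration, not recommendations.

```python
import sagemaker
from sagemaker import image_uris
from sagemaker.estimator import Estimator
from sagemaker.inputs import TrainingInput

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/SageMakerRole"  # placeholder execution role

# Retrieve the built-in XGBoost container image for the current region.
container = image_uris.retrieve(
    "xgboost", region=session.boto_region_name, version="1.2-2"
)

estimator = Estimator(
    image_uri=container,
    role=role,
    instance_count=1,
    instance_type="ml.m5.xlarge",  # CPU training; see Instance Choice below
    output_path="s3://my-bucket/xgboost/output",  # placeholder bucket
    sagemaker_session=session,
)

# Hyperparameters from the list above; values are illustrative starting points.
# "lambda" is a Python keyword, so the dict is unpacked instead of passing kwargs.
hyperparameters = {
    "objective": "binary:logistic",
    "num_round": 100,
    "subsample": 0.8,         # row sampling ratio per boosting iteration
    "eta": 0.1,               # learning rate: shrinks feature weights each step
    "gamma": 1.0,             # min loss reduction required to split a leaf
    "alpha": 0.0,             # L1 regularization on weights
    "lambda": 1.0,            # L2 regularization on weights
    "eval_metric": "auc",
    "scale_pos_weight": 10,   # e.g. sum(negative cases) / sum(positive cases)
    "max_depth": 6,
}
estimator.set_hyperparameters(**hyperparameters)

# Train from CSV data staged in S3 (label in the first column, no header).
estimator.fit({
    "train": TrainingInput("s3://my-bucket/xgboost/train", content_type="csv"),
    "validation": TrainingInput("s3://my-bucket/xgboost/validation", content_type="csv"),
})
```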
Instance Choice
- CPU: M5 instances
- GPU: P3 instances (GPU training requires built-in XGBoost version 1.2 or later; see the sketch below)
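
Continuing the sketch above (it reuses the container, role, and session names defined there), GPU training with the built-in algorithm typically swaps the instance type for a P3 and sets tree_method to gpu_hist; values remain placeholders.

```python
# GPU training: built-in XGBoost >= 1.2 on a P3 instance.
gpu_estimator = Estimator(
    image_uri=container,            # version "1.2-2" or later supports GPU
    role=role,
    instance_count=1,
    instance_type="ml.p3.2xlarge",  # single-GPU P3 instance
    sagemaker_session=session,
)
gpu_estimator.set_hyperparameters(
    objective="binary:logistic",
    num_round=100,
    tree_method="gpu_hist",         # enables GPU-accelerated histogram training
)
```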