My Understanding of Xgboost

Jodie Heqi Qiu
Mar 30, 2020

Revised on 4/21

I would like to share what I’ve come to understand.

・First, I would like to correct my misconception that Xgboost does not add up trees. In fact, it does add trees, one per boosting round.

・To add a tree, we need to set a learning rate (called “eta” in the xgboost parameters). Each new tree is multiplied by the learning rate and then added to the existing ensemble, as in the update rule below. Notice that the learning rate is fixed across rounds.
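In symbols, writing $f_t$ for the tree added at round $t$ and $\eta$ for eta, each round updates the prediction as:

$$\hat{y}_i^{(t)} = \hat{y}_i^{(t-1)} + \eta\, f_t(x_i)$$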

・Xgboost is based on CART trees. A classic CART uses the Gini gain to select which feature to split on; Xgboost itself selects splits using a gain derived from the objective function described below.
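For reference, the Gini gain of splitting a node $S$ into children $S_L$ and $S_R$ is:

$$\mathrm{Gain}_{\mathrm{Gini}} = \mathrm{Gini}(S) - \frac{|S_L|}{|S|}\,\mathrm{Gini}(S_L) - \frac{|S_R|}{|S|}\,\mathrm{Gini}(S_R), \qquad \mathrm{Gini}(S) = 1 - \sum_k p_k^{2}$$

where $p_k$ is the fraction of samples in $S$ belonging to class $k$.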

・Xgboost evaluates trees by a self-defined objective function, sketched below.
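As a minimal sketch of what a self-defined objective looks like in practice (the data here is synthetic and for illustration only), squared error can be supplied to the classic xgboost API as a function returning the per-instance first- and second-order gradients:

```python
import numpy as np
import xgboost as xgb

# Squared error written out by hand as a custom objective.
# xgboost only needs the first- and second-order gradients
# (g_i and h_i) of the loss at each instance.
def squared_error(preds, dtrain):
    labels = dtrain.get_label()
    grad = preds - labels          # g_i: derivative of 0.5*(pred - y)^2
    hess = np.ones_like(preds)     # h_i: second derivative = 1
    return grad, hess

# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
X = rng.random((200, 5))
y = 2.0 * X[:, 0] + 0.1 * rng.random(200)
dtrain = xgb.DMatrix(X, label=y)

# 'eta' is the fixed learning rate from the bullet above.
booster = xgb.train({"eta": 0.1}, dtrain, num_boost_round=20,
                    obj=squared_error)
```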

We know the objective function at round $t$ can be finalized as:

$$\mathrm{Obj}^{(t)} = \sum_{j=1}^{T}\Big[G_j w_j + \frac{1}{2}(H_j + \lambda)\, w_j^2\Big] + \gamma T$$

where $\gamma T$ is the regularization term penalizing the number of leaves $T$, $w_j$ are the leaf weights, and $G_j = \sum_{i \in I_j} g_i$ and $H_j = \sum_{i \in I_j} h_i$ sum the first- and second-order gradients of the instances $I_j$ assigned to leaf $j$.

Then what values of $w_j$ give the minimum of the objective function?

Take the partial derivative with respect to each $w_j$ and set it to zero, which gives the optimal $w$ as a function of $G$ and $H$:

$$\frac{\partial\,\mathrm{Obj}^{(t)}}{\partial w_j} = G_j + (H_j + \lambda)\, w_j = 0 \;\;\Longrightarrow\;\; w_j^{*} = -\frac{G_j}{H_j + \lambda}$$

Plug this $w^{*}$ back into the objective function, and we have the new objective function:

$$\mathrm{Obj}^{*} = -\frac{1}{2}\sum_{j=1}^{T}\frac{G_j^{2}}{H_j + \lambda} + \gamma T$$

How can we make use of the new objective function?

With a fixed tree structure, this new objective is always calculable: it depends only on the per-leaf sums $G_j$ and $H_j$.

The lower its value, the better the tree, so it serves as a score for comparing candidate structures (a sketch follows).
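As a minimal sketch (structure_score and the sample numbers below are made up for illustration, not part of the xgboost API), scoring a fixed tree from its per-leaf $(G_j, H_j)$ sums looks like this:

```python
def structure_score(leaves, lam=1.0, gamma=0.1):
    """Score a fixed tree structure; lower is better.

    leaves: list of (G_j, H_j) pairs -- the summed first- and
            second-order gradients of the instances in each leaf.
    """
    T = len(leaves)
    score = -0.5 * sum(G * G / (H + lam) for G, H in leaves)
    return score + gamma * T

# Splitting one leaf into two lowers the score here, so the
# two-leaf structure is the better tree.
print(structure_score([(2.0, 8.0)]))               # single leaf
print(structure_score([(-4.0, 3.0), (6.0, 5.0)]))  # same data, split
```

Greedy split finding uses exactly this comparison: a split is kept only when it lowers the score by more than the $\gamma$ cost of the extra leaf.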


Jodie Heqi Qiu

My memos of machine learning algorithms, data pre-processing and statistics. Git: https://github.com/qhqqiu