Main components
Boosting
void GBDT::Init(const Config* gbdt_config, const Dataset* train_data, const ObjectiveFunction* objective_function, const std::vector<const Metric*>& training_metrics) override
Initialization. This mainly creates the sample sampling strategy data_sample_strategy_
, sets the objective function objective_function_
, creates tree_learner_
, creates train_score_updater_
, and configures training_metrics_
void GBDT::Train(int snapshot_freq, const std::string& model_output_path) override
Runs the overall training loop
bool GBDT::TrainOneIter(const score_t* gradients, const score_t* hessians) override
Runs a single boosting iteration
void GBDT::Boosting()
Computes the gradients and Hessians (per-sample second derivatives)
void UpdateScore(const Tree* tree, const int cur_tree_id)
Updates the scores after a tree has been trained
TreeLearner
Objective function: ObjectiveFunction
Binary log loss
It is commonly defined as
L(y, f(x)) = \log(1 + \exp(-y \cdot f(x)))
where y is the label, taking values in \{-1, 1\}, and f(x) is the score output by the model. Letting z = y \cdot f(x), the loss becomes L = \log(1 + \exp(-z)).
Differentiating with respect to z gives \frac{\partial L}{\partial z} = \frac{-\exp(-z)}{1+\exp(-z)} = -\frac{1}{1+\exp(z)}, so by the chain rule the derivative with respect to f(x) is
\frac{\partial L}{\partial f(x)} = \frac{\partial L}{\partial z} \cdot \frac{\partial z}{\partial f(x)} = -\frac{y}{1+\exp(y \cdot f(x))}
In BinaryLogloss
a scaling factor sigmoid_ (written \sigma below) is added to the loss
, i.e.
L(y, f(x)) = \log(1 + \exp(-y \cdot \sigma \cdot f(x)))
Differentiating with respect to f(x) gives
\frac{\partial L}{\partial f(x)} = -\frac{y \cdot \sigma}{1+\exp(y \cdot \sigma \cdot f(x))}
When computing gradients, BinaryLogloss
additionally multiplies in the per-sample weight weights_[i]
and the label weight label_weight
const double response = -label * sigmoid_ / (1.0f + std::exp(label * sigmoid_ * score[i]));
gradients[i] = static_cast<score_t>(response * label_weight * weights_[i]);