XGBoost: How Deep Learning Can Replace Gradient Boosting and Decision Trees — Part 2: Training | by Saupin Guillaume | Sep, 2023

Saupin Guillaume
Towards Data Science

In a previous article, you learned how to rewrite decision trees using a Differentiable Programming approach, as suggested by the NODE paper. The idea of that paper is to replace XGBoost with a neural network.

More specifically, after explaining why the process of building decision trees is not differentiable, that article introduced the mathematical tools needed to regularize (that is, smooth) the two main elements associated with a decision node:

  • Feature Selection
  • Branch detection

The NODE paper shows that both can be handled using the entmax function.
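To make the role of entmax more concrete, here is a minimal sketch of the 1.5-entmax mapping in plain Python. It is not the code from the NODE paper or from this series: the function name `entmax15`, the bisection search and the iteration count are illustrative choices. It relies on the closed form p_i = max(z_i / 2 - tau, 0)^2, where tau is the threshold that makes the outputs sum to 1.

```python
def entmax15(scores, n_iter=60):
    """Illustrative 1.5-entmax: sparse alternative to softmax.

    Finds tau by bisection so that sum(max(z_i / 2 - tau, 0) ** 2) == 1.
    """
    half = [s / 2.0 for s in scores]
    # At tau = max(half) - 1 the sum is >= 1; at tau = max(half) it is 0.
    lo, hi = max(half) - 1.0, max(half)
    for _ in range(n_iter):
        mid = (lo + hi) / 2.0
        total = sum(max(h - mid, 0.0) ** 2 for h in half)
        if total >= 1.0:
            lo = mid   # tau is still too small (or exactly right)
        else:
            hi = mid   # tau is too large
    probs = [max(h - lo, 0.0) ** 2 for h in half]
    norm = sum(probs)  # absorbs the remaining bisection error
    return [p / norm for p in probs]


weights = entmax15([1.0, 0.5, -2.0])
print(weights)  # the last entry is exactly 0.0, the first two share the mass
```

Unlike softmax, low scores receive a weight of exactly zero, which is what lets the same function act as a soft but sparse selector, both for picking a feature and for picking a branch.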

To summarize, we have shown how to create a binary tree without using comparison operators.

The previous article ended with open questions regarding training a regularized decision tree. It’s time to answer these questions.


First, based on what we presented in the previous article, let’s create a new Python class: SmoothBinaryNode.

This class encodes the behavior of a smooth binary node. There are two key parts in its code:

  • The selection of the features, handled by the function _choices
  • The evaluation of these features, with respect to a given threshold, and the identification of the path to follow: left or right. All this is managed by the methods left and right.
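The two parts listed above can be sketched as follows. This is a hedged, forward-pass-only illustration in plain Python, not the article's actual class: the constructor arguments `weights` and `threshold`, the helper `_selected_value`, and the convention that left means "below the threshold" are all assumptions made for the example. The `entmax15` helper is a hypothetical bisection-based implementation, included so the block runs on its own.

```python
def entmax15(scores, n_iter=60):
    """Illustrative 1.5-entmax computed by bisection on the threshold tau."""
    half = [s / 2.0 for s in scores]
    lo, hi = max(half) - 1.0, max(half)
    for _ in range(n_iter):
        mid = (lo + hi) / 2.0
        if sum(max(h - mid, 0.0) ** 2 for h in half) >= 1.0:
            lo = mid
        else:
            hi = mid
    probs = [max(h - lo, 0.0) ** 2 for h in half]
    norm = sum(probs)
    return [p / norm for p in probs]


class SmoothBinaryNode:
    """Sketch of a decision node that uses no comparison on the input."""

    def __init__(self, weights, threshold):
        self.weights = list(weights)  # one score per feature (to be learned)
        self.threshold = threshold    # split threshold (to be learned)

    def _choices(self):
        # Sparse, soft feature selection: entries are >= 0 and sum to 1.
        return entmax15(self.weights)

    def _selected_value(self, x):
        # Weighted mix of the features, dominated by the selected one(s).
        return sum(c * xi for c, xi in zip(self._choices(), x))

    def left(self, x):
        # Weight of the left branch; assumed here to mean "below threshold".
        return entmax15([self.threshold - self._selected_value(x), 0.0])[0]

    def right(self, x):
        # The two branch weights always sum to 1.
        return 1.0 - self.left(x)


node = SmoothBinaryNode(weights=[4.0, 0.0, 0.0], threshold=0.5)
print(node.left([5.0, 9.0, 9.0]))  # 0.0: far above the threshold, go right
print(node.left([0.0, 9.0, 9.0]))  # strictly between 0 and 1: soft decision
```

Far from the threshold the node behaves like a hard split, while near the threshold both branches receive a nonzero weight. In a trainable version, `weights` and `threshold` would be parameters updated by gradient descent, which needs an automatic differentiation library rather than plain Python floats.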

This post originally appeared on TechToday.