
but do you have code for estimated probability of prediction? #1

Open
Sandy4321 opened this issue May 7, 2020 · 3 comments

@Sandy4321

Really good code, thanks.
But do you have code for the estimated probability of a prediction?
As mentioned in
https://stats.stackexchange.com/questions/350134/how-does-gradient-boosting-calculate-probability-estimates
and per this discussion
dmlc/xgboost#5640
it is important to understand in detail how this probability is calculated.

@Ekeany
Owner

Ekeany commented May 7, 2020

Hey Sandy,

If I understand your question correctly, you are asking how the "XGBoost" model calculates the probability for a single sample.
For the binary classification case using log loss, it sums up the log-odds values from the terminal leaf nodes and then applies an inverse logit (sigmoid) function to squeeze that value into the range 0 to 1.

I have a worked example of this process here
https://medium.com/analytics-vidhya/what-makes-xgboost-so-extreme-e1544a4433bb
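
As a rough numerical sketch of that idea (the leaf values, learning rate, and base log-odds below are made up purely for illustration):

import numpy as np

def sigmoid(x):
    # inverse logit: maps a log-odds value into the (0, 1) range
    return 1.0 / (1.0 + np.exp(-x))

# made-up leaf values (log-odds contributions) that one sample falls into,
# one per boosting round
leaf_log_odds = np.array([0.3, -0.1, 0.25])
learning_rate = 0.3
base_log_odds = 0.0  # assumed starting prediction; implementations differ here

log_odds = base_log_odds + learning_rate * leaf_log_odds.sum()
probability = sigmoid(log_odds)
print(probability)  # roughly 0.534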

@Sandy4321
Author

Great, so you did implement this.
Your code is the best on the internet.
Can you share some clue as to where to find this code in your post and in your repo?
Thank you very much, I will try to learn how your code works.
Did you compare performance with the real XGBoost?

@Ekeany
Owner

Ekeany commented May 8, 2020

Hi Sandy,

In the post I would recommend looking at the section “XGBoost” By Hand, as I go through a step-by-step example there. This is what the "XGBoost" predict function looks like:

def predict(self, X):
    # accumulate the weighted leaf predictions (log-odds) from every weak learner
    pred = np.zeros(X.shape[0])
    for estimator in self.estimators:
        pred += self.learning_rate * estimator.predict(X)

    # add the initial base prediction (a log-odds of 1 here) and squash the
    # summed log-odds into a probability with the sigmoid (inverse logit)
    predicted_probas = self.sigmoid(np.ones(X.shape[0], dtype='float64') + pred)

    # threshold at the mean predicted probability to get hard 0/1 labels
    preds = np.where(predicted_probas > np.mean(predicted_probas), 1, 0)
    return preds

You can see that for each sample we wish to predict, we loop through the weak learners and add up their leaf values (predictions). This summed value is not a probability yet but a log-odds value, since we are using log loss for the binary case. To turn it into a probability we apply the sigmoid function, which squeezes the log-odds into the range 0 to 1. Then any sample whose probability is greater than the mean probability over the dataset is given a prediction of 1, otherwise 0; however, this thresholding step is not strictly necessary.

I honestly haven't compared the results with the real "XGBoost", just a simple k-fold cross-validation on a test dataset to check the accuracy, which I was happy with. I made this more as a learning exercise than a real implementation, so I wouldn't advise using it, but the core concepts behind it are the same as in the "XGBoost" paper.
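
For anyone who does want to check this against the real library, a rough sketch (assuming the xgboost and scikit-learn Python packages) would be to compare the sigmoid of the raw margin, i.e. the base log-odds plus the summed leaf values, against predict_proba:

import numpy as np
import xgboost as xgb
from sklearn.datasets import make_classification

# small synthetic binary classification problem
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

model = xgb.XGBClassifier(n_estimators=10, max_depth=3)
model.fit(X, y)

# raw margin = base log-odds + sum of the leaf values across all trees
margin = model.get_booster().predict(xgb.DMatrix(X), output_margin=True)

# applying the inverse logit to the margin should reproduce predict_proba
manual_proba = 1.0 / (1.0 + np.exp(-margin))
library_proba = model.predict_proba(X)[:, 1]

print(np.allclose(manual_proba, library_proba, atol=1e-6))  # expected: True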
