
but do you have code for estimated probability of prediction? #1

Open
Sandy4321 opened this issue May 7, 2020 · 3 comments

@Sandy4321

Really good code, thanks.
But do you have code for the estimated probability of a prediction?
As mentioned in
https://stats.stackexchange.com/questions/350134/how-does-gradient-boosting-calculate-probability-estimates
and per this discussion
dmlc/xgboost#5640
it is important to understand in detail how this probability is calculated.

@Ekeany
Owner

Ekeany commented May 7, 2020

Hey Sandy,

If I understand your question correctly, you are asking how the "XGBoost" model calculates the probability for a single sample.
For the binary classification case using log loss, it sums up the log-odds values from the terminal leaf nodes and then applies an inverse logit (sigmoid) function to squeeze that value into the range 0 to 1.

I have a worked example of this process here
https://medium.com/analytics-vidhya/what-makes-xgboost-so-extreme-e1544a4433bb
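
As a rough numerical sketch of that idea (the leaf values, learning rate, and base log-odds below are made up purely for illustration):

import numpy as np

def sigmoid(x):
    # inverse logit: maps a log-odds value into the (0, 1) range
    return 1.0 / (1.0 + np.exp(-x))

# made-up leaf values (log-odds contributions) that one sample falls into,
# one per boosting round
leaf_log_odds = np.array([0.3, -0.1, 0.25])
learning_rate = 0.3
base_log_odds = 0.0  # assumed starting prediction; implementations differ here

log_odds = base_log_odds + learning_rate * leaf_log_odds.sum()
probability = sigmoid(log_odds)
print(probability)  # roughly 0.534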

@Sandy4321
Author

Great, so you did implement this.
Your code is the best on the internet.
Can you share some clue as to where to find this code in your post and in your repo?
Thank you very much, I will try to learn how your code works.
Did you compare performance with the real XGBoost?

@Ekeany
Owner

Ekeany commented May 8, 2020

Hi Sandy,

In the post I would recommend looking at the section “XGBoost” By Hand, as I go through a step-by-step example there. This is what the "XGBoost" predict function looks like:

def predict(self, X):
    # accumulate the weighted leaf predictions (log-odds) from every weak learner
    pred = np.zeros(X.shape[0])
    for estimator in self.estimators:
        pred += self.learning_rate * estimator.predict(X)

    # add the initial base prediction (a log-odds of 1 here) and squash the
    # summed log-odds into a probability with the sigmoid (inverse logit)
    predicted_probas = self.sigmoid(np.ones(X.shape[0], dtype='float64') + pred)

    # threshold at the mean predicted probability to get hard 0/1 labels
    preds = np.where(predicted_probas > np.mean(predicted_probas), 1, 0)
    return preds

You can see that for each sample we wish to predict, we loop through the weak learners and add up their leaf values (predictions). This summed value is not a probability yet but a log-odds value, since we are using log loss for the binary case. To turn it into a probability we apply the sigmoid function, which squeezes the log-odds into the range 0 to 1. Then any sample whose probability is greater than the mean probability over the dataset is given a prediction of 1, otherwise 0; however, this thresholding step is not strictly necessary.

I honestly haven't compared the results with the real "XGBoost", just a simple k-fold cross-validation on a test dataset to check the accuracy, which I was happy with. I made this more as a learning exercise than a real implementation, so I wouldn't advise using it, but the core concepts behind it are the same as in the "XGBoost" paper.
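
For anyone who does want to check this against the real library, a rough sketch (assuming the xgboost and scikit-learn Python packages) would be to compare the sigmoid of the raw margin, i.e. the base log-odds plus the summed leaf values, against predict_proba:

import numpy as np
import xgboost as xgb
from sklearn.datasets import make_classification

# small synthetic binary classification problem
X, y = make_classification(n_samples=200, n_features=10, random_state=0)

model = xgb.XGBClassifier(n_estimators=10, max_depth=3)
model.fit(X, y)

# raw margin = base log-odds + sum of the leaf values across all trees
margin = model.get_booster().predict(xgb.DMatrix(X), output_margin=True)

# applying the inverse logit to the margin should reproduce predict_proba
manual_proba = 1.0 / (1.0 + np.exp(-margin))
library_proba = model.predict_proba(X)[:, 1]

print(np.allclose(manual_proba, library_proba, atol=1e-6))  # expected: True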
