Platt scaling
- transforms the output of a classification model into a probability distribution over classes
- fits a logistic regression model to a classifier's score
- estimates the probability P(y=1|x), even though the classifier does not provide this probability:
- P(y=1|x) = 1 / (1 + exp(A f(x) + B))
- A and B are estimated using maximum likelihood
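A minimal sketch of this fit: a logistic regression on the one-dimensional classifier score is exactly a maximum-likelihood estimate of the sigmoid's two parameters. The synthetic dataset and the choice of a linear SVM are illustrative assumptions, not part of the original notes.

```python
# Platt scaling sketch: fit the sigmoid P(y=1|x) = 1 / (1 + exp(A f(x) + B))
# by maximum likelihood, via logistic regression on the classifier score f(x).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, random_state=0)
X_train, X_cal, y_train, y_cal = train_test_split(X, y, random_state=0)

# an uncalibrated classifier: SVC scores are not probabilities
svm = SVC(kernel="linear").fit(X_train, y_train)

# f(x): signed distance to the decision boundary, on held-out data
scores = svm.decision_function(X_cal).reshape(-1, 1)

# logistic regression on the score learns the A and B of the sigmoid
platt = LogisticRegression().fit(scores, y_cal)
probs = platt.predict_proba(scores)[:, 1]
```

The fitted `coef_` and `intercept_` of `platt` correspond to -A and -B in the formula above, since scikit-learn parameterizes the sigmoid as 1 / (1 + exp(-(w f(x) + b))).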
- useful for SVMs and naive Bayes
- less effective for models that are already well calibrated, such as logistic regression
https://en.wikipedia.org/wiki/Platt_scaling
Platt scaling uses the distance to the decision boundary and rescales it into a probability that the point x belongs to class y. Points with a greater distance to the decision boundary are assigned a higher probability of belonging to their predicted class than points that lie very near the boundary.
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.92.623&rep=rep1&type=pdf
1. Split the training data into a training set and a cross-validation set.
2. Train the model on the training set.
3. Score both the test set and the cross-validation set.
4. Fit a logistic regression model on the cross-validation set, using the actual labels as the dependent variable and the predicted scores as the feature.
5. Score the test set with the logistic model from step 4, using the test-set scores from step 3 as its input feature.
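The steps above can be sketched end to end; the dataset, split ratios, and the linear SVM are assumptions made for illustration.

```python
# Steps: split -> train -> score CV and test -> calibrate on CV -> apply to test.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_classification(n_samples=600, random_state=1)

# hold out a test set, then split the rest into training and cross-validation sets
X_rest, X_test, y_rest, y_test = train_test_split(X, y, test_size=0.25, random_state=1)
X_train, X_cv, y_train, y_cv = train_test_split(X_rest, y_rest, test_size=0.25, random_state=1)

# step 2: train the model on the training set
clf = SVC(kernel="linear").fit(X_train, y_train)

# step 3: score the cross-validation set and the test set
cv_scores = clf.decision_function(X_cv).reshape(-1, 1)
test_scores = clf.decision_function(X_test).reshape(-1, 1)

# step 4: logistic model on CV scores against the actual labels
calibrator = LogisticRegression().fit(cv_scores, y_cv)

# step 5: calibrated probabilities for the test set from its raw scores
test_probs = calibrator.predict_proba(test_scores)[:, 1]
```

Fitting the calibrator on a held-out set rather than the training set matters: scores on training data are optimistically biased, which would skew A and B.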
Platt scaling, used as a calibration step, can improve the quality of a classifier's probability estimates.
https://jmetzen.github.io/2015-04-14/calibration.html
https://jmetzen.github.io/2014-08-16/reliability-diagram.html