Skip to content

JimmysDataLab/credit-score-modelling

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

18 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Bootstrapping Approach for Estimating Prediction Uncertainty

In this approach, we use bootstrapping to estimate prediction uncertainty (such as confidence intervals) by resampling the training data, training multiple models, and then aggregating their predictions.

1. Resample the Training Data

  • Description:
    Create many bootstrapped samples (with replacement) from your training data.

  • Purpose:
    Each bootstrapped sample provides a slightly different version of the training data, allowing you to capture the variability and uncertainty inherent in the data.

2. Train a Model for Each Sample

  • Description:
    For each bootstrapped sample, train your classification (or regression) model.

  • Purpose:
    This process results in a collection of models, each reflecting variations in the training data. The ensemble of models helps to capture the uncertainty in the model's predictions.

3. Make Predictions

  • Description:
    For a given input, collect the predictions from all the models trained on the different bootstrapped samples.

  • Purpose:
    The variability among these predictions provides an empirical estimate of the model's uncertainty on that input.

4. Compute Quantiles

  • Description:
    Analyze the distribution of predictions to compute quantiles. For instance, calculate the 2.5th and 97.5th percentiles of the predictions.

  • Example:
    The 2.5th and 97.5th percentiles can serve as the bounds of a 95% prediction interval.

  • Purpose:
    These quantiles form a confidence interval around the predicted value, giving an estimate of prediction uncertainty.

Summary

Using bootstrapping to assess prediction uncertainty involves:

  1. Resampling your training data to create multiple bootstrapped datasets.
  2. Training a separate model on each bootstrapped sample.
  3. Aggregating the predictions for a specific input across all models.
  4. Computing quantiles (e.g., the 2.5th and 97.5th percentiles)

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published