
Commit 9edfd90

Update README.md
1 parent 60c69d3 commit 9edfd90

1 file changed: README.md (+6 lines, -0 lines)
@@ -1,11 +1,14 @@

# The conjugate beta estimator statistical algorithm

## Introduction

This repo contains a reference implementation of a statistical algorithm called the
Conjugate Beta Estimator (CBE) for computing
confidence intervals (CIs) for population means using (weighted) sample means and
potentially noisy labels.

## Basic algorithm

The basic formula for CBE is

    alpha = mu*n + alpha_prior
@@ -14,6 +17,7 @@ The basic formula for CBE is

where mu is the mean of the (weighted) sample labels and n is the sample size in bits.

## Accounting for label noise

Both mu and n can be adjusted to account for label noise.

mu should be adjusted using the [Rogan-Gladen](https://en.wikipedia.org/wiki/Beth_Gladen) (RG) estimator for the sample mean:
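
The README's own adjustment formula falls outside the lines shown in this diff; as a reference point, here is a sketch of the standard RG correction, assuming estimates of labeler sensitivity and specificity are available.

```python
def rogan_gladen(mu_observed, sensitivity, specificity):
    """Standard Rogan-Gladen correction of an observed (weighted) sample mean.

    sensitivity : P(label = 1 | truth = 1)
    specificity : P(label = 0 | truth = 0)
    """
    mu_adjusted = (mu_observed + specificity - 1.0) / (sensitivity + specificity - 1.0)
    return min(1.0, max(0.0, mu_adjusted))  # clip to the valid [0, 1] range

# Example: observed positive rate 0.55 with 90% sensitivity and 85% specificity -> ~0.533
print(rogan_gladen(0.55, 0.90, 0.85))
```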
@@ -42,6 +46,8 @@ The relationship between average accuracy and # of bits per label is visualized

<img width="695" alt="Screenshot 2024-11-05 at 12 44 57 PM" src="https://github.com/user-attachments/assets/975f7141-6ed6-4327-9035-052b419fbc51">

## Mathematical derivation

The following formula shows that 1-H(X), where H() is the entropy and X is the probability
that a label contains an accurate label, can be expressed as the Bayesian information gain, i.e. the KL divergence
between the posterior P and the uniform prior Q:
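
The formula itself is not among the lines shown in this diff; as a standalone numeric check of the stated identity (with P = Bernoulli(a) for label accuracy a and Q = Bernoulli(0.5) the uniform prior), the following sketch confirms that 1 - H(a) equals KL(P || Q) in bits.

```python
import math

def binary_entropy(a):
    """H(a) in bits for a Bernoulli(a) random variable."""
    return -(a * math.log2(a) + (1.0 - a) * math.log2(1.0 - a))

def kl_to_uniform(a):
    """KL(P || Q) in bits with P = Bernoulli(a) and Q = Bernoulli(0.5), the uniform prior."""
    return a * math.log2(a / 0.5) + (1.0 - a) * math.log2((1.0 - a) / 0.5)

# The two quantities agree: 1 - H(a) is exactly the information gained over the uniform prior.
for a in (0.6, 0.8, 0.95):
    assert abs((1.0 - binary_entropy(a)) - kl_to_uniform(a)) < 1e-12
    print(a, 1.0 - binary_entropy(a), kl_to_uniform(a))
```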
