Commit 9edfd90 (1 parent: 60c69d3)
1 file changed: +6 -0 lines changed
# The conjugate beta estimator statistical algorithm
+
+ ## Introduction
This repo contains a reference implementation of a statistical algorithm called the
Conjugate Beta Estimator (CBE), which computes
CIs for population means using (weighted) sample means and
potentially noisy labels.
+ ## Basic algorithm
The basic formula for CBE is
alpha = mu*n + alpha_prior
@@ -14,6 +17,7 @@ The basic formula for CBE is
where mu is the mean of the (weighted) sample labels and n is the sample size in bits
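
The update above can be sketched in a few lines of Python. This is a minimal illustration, not the repo's reference implementation: the diff shows only the `alpha` line, so the `beta` update below is an assumption based on the standard conjugate Beta update, and the function name, the `BetaPosterior` class, and the uniform `Beta(1, 1)` default prior are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BetaPosterior:
    """Parameters of a Beta(alpha, beta) posterior (hypothetical helper)."""
    alpha: float
    beta: float

    @property
    def mean(self) -> float:
        # Mean of a Beta(alpha, beta) distribution.
        return self.alpha / (self.alpha + self.beta)


def cbe_posterior(mu: float, n: float,
                  alpha_prior: float = 1.0,
                  beta_prior: float = 1.0) -> BetaPosterior:
    """Sketch of the basic CBE update.

    mu: mean of the (weighted) sample labels, in [0, 1].
    n:  sample size as described in the text.
    The priors default to a uniform Beta(1, 1); that default is an assumption.
    """
    if not 0.0 <= mu <= 1.0:
        raise ValueError("mu must lie in [0, 1]")
    if n < 0:
        raise ValueError("n must be non-negative")
    # Shown in the source: alpha = mu*n + alpha_prior
    alpha = mu * n + alpha_prior
    # ASSUMPTION: not shown in this diff; mirrors the alpha update,
    # as in the standard conjugate Beta update.
    beta = (1.0 - mu) * n + beta_prior
    return BetaPosterior(alpha=alpha, beta=beta)
```

For example, `cbe_posterior(0.3, 100)` gives a posterior whose mean sits close to the sample mean, pulled slightly toward the prior mean of 0.5.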
+ ## Accounting for label noise
Both mu and n can be adjusted to account for label noise.
mu should be adjusted using the [Rogan-Gladen](https://en.wikipedia.org/wiki/Beth_Gladen) (RG) estimator for the sample mean:
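
For orientation, the RG estimator in its standard textbook form can be sketched as follows. The README's own formula is not visible in this diff, so the code below is an assumption about the general technique rather than a copy of the repo's version; the function name, the parameter names, and the clipping to [0, 1] are hypothetical choices.

```python
def rogan_gladen(observed_mean: float,
                 sensitivity: float,
                 specificity: float) -> float:
    """Standard Rogan-Gladen correction of an observed positive rate.

    observed_mean: mean of the noisy labels (apparent prevalence).
    sensitivity:   P(label = 1 | truth = 1).
    specificity:   P(label = 0 | truth = 0).
    """
    denom = sensitivity + specificity - 1.0
    if denom <= 0.0:
        # Labels no better than chance carry no usable signal.
        raise ValueError("sensitivity + specificity must exceed 1")
    adjusted = (observed_mean + specificity - 1.0) / denom
    # ASSUMPTION: clip to [0, 1], since sampling noise can push the
    # raw estimate outside the valid range for a mean of binary labels.
    return min(1.0, max(0.0, adjusted))
```

With perfect labels (sensitivity and specificity both 1) the function returns the observed mean unchanged, which is a quick sanity check on the formula.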
@@ -42,6 +46,8 @@ The relationship between average accuracy and # of bits per label is visualized
<img width="695" alt="Screenshot 2024-11-05 at 12 44 57 PM" src="https://github.com/user-attachments/assets/975f7141-6ed6-4327-9035-052b419fbc51">
+ ## Mathematical derivation
+
The following formula shows that 1-H(X), where H() is entropy and X is the probability
that a label is accurate, can be expressed as the Bayesian information gain, or KL divergence,
between the posterior P and a uniform prior Q:
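
The identity can also be checked numerically. The sketch below assumes a binary label, so P = (p, 1-p) and Q = (0.5, 0.5), with entropy measured in bits (log base 2); the function names are hypothetical and this is an illustration rather than part of the reference implementation.

```python
import math


def binary_entropy(p: float) -> float:
    """Entropy H(p) in bits of a Bernoulli(p) variable."""
    if p in (0.0, 1.0):
        return 0.0  # By convention, 0 * log(0) = 0.
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def kl_to_uniform(p: float) -> float:
    """KL divergence in bits from P = (p, 1-p) to the uniform Q = (0.5, 0.5)."""
    total = 0.0
    for prob in (p, 1.0 - p):
        if prob > 0.0:  # Zero-probability terms contribute nothing.
            total += prob * math.log2(prob / 0.5)
    return total


def information_gain_gap(p: float) -> float:
    """Absolute difference between 1 - H(p) and KL(P || Q); should be ~0."""
    return abs((1.0 - binary_entropy(p)) - kl_to_uniform(p))
```

Evaluating `information_gain_gap` at a few accuracies such as 0.5, 0.75, and 0.9 should give values at floating-point noise level. The reason is that each KL term expands as `prob * log2(prob) + prob`, so the sum is `-H(p) + 1`.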