Correct German TEK counts by the padding multiplier #9

Open
micb25 opened this issue Jul 18, 2020 · 1 comment

micb25 commented Jul 18, 2020

Hey there,

I am the owner of github.com/micb25/dka which analyzes and visualizes the TEK data of the German COVID-19 tracing app (Corona-Warn-App). There is also a dashboard version at github.io.

Concerning the German TEK data discussed on page 3 of your PDF:

We also see that the first two days have exactly 10 TEKs
each, which presumably indicates test data. We also see that
for the latest date in each file, the number of TEKs is 90
on day one, 90 on day two and 180 on day three. It would
also seem almost too good to be true, were there really 180
TEKs uploaded when less than 600 cases were notified. So
once more, better transparency from the German public health
authorities is needed.

Short answer:
The German COVID-19 tracing app is producing a lot of fake TEKs to improve anonymity.

This 'padding multiplier' was initially 10 (9:1), which is why you see numbers like 90, 180, .... At the moment the multiplier is 5 (4:1), which means that for every real TEK uploaded, another four fake TEKs are generated. For these fake TEKs the identifier is filled with random bytes, but all other fields are identical. You can find the currently estimated padding multiplier in my diagram here.

Consequently, if you correlate the German COVID-19 case numbers with the distributed TEKs, you should first divide the TEK counts by the padding multiplier that was in effect at the time.
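As a rough illustration of that correction, here is a minimal Python sketch. The function names are hypothetical, and the switch date from 10 to 5 (July 2nd, 2020) is taken from the reply further down this thread; whether that date should be compared against the key date or the publication date of the export file is an assumption made here for simplicity.

```python
from datetime import date

# Assumption: the multiplier changed from 10 to 5 on July 2nd, 2020
# (date as mentioned later in this thread; not verified here).
MULTIPLIER_CHANGE = date(2020, 7, 2)


def padding_multiplier(day: date) -> int:
    """Return the assumed padding multiplier in effect on `day`."""
    return 10 if day < MULTIPLIER_CHANGE else 5


def estimate_real_teks(raw_count: int, day: date) -> float:
    """Estimate the number of real TEKs from the raw published count."""
    return raw_count / padding_multiplier(day)


if __name__ == "__main__":
    # 180 published TEKs under the 10x scheme correspond to about 18 real ones.
    print(estimate_real_teks(180, date(2020, 6, 25)))
```

With a multiplier of 10, the 90 and 180 TEKs quoted above would correspond to roughly 9 and 18 real TEKs.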

Best regards,
Michael

Owner

sftcd commented Jul 19, 2020

Hiya - at present we're reporting just the raw numbers seen, which, as you note, are currently n*5 for .de, but n+10 for .ch (mostly) and n+large-random for .at. Given that variability, and that the padding schemes change, I think it best to continue reporting the raw numbers in our survey for now. There is a note about the 10:1 and 5:1 ratios for the .de numbers on the web page too. (BTW, nice dashboard!) I hadn't known for sure that the change was on July 2nd though, so I will add that. Thanks.
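To make the difference between the three schemes concrete, the following Python sketch tries to invert each one. It is only an illustration under the assumptions stated in the comment above (n*5 for .de, n+10 for .ch, n plus a large random amount for .at); the function name and country codes are hypothetical, and the real schemes may differ or change over time.

```python
from typing import Optional


def estimate_real_count(raw: int, country: str) -> Optional[float]:
    """Try to undo the padding for one country's raw TEK count.

    Returns None when the scheme cannot be inverted from the count alone.
    """
    if country == "de":
        # Multiplicative padding: raw = n * 5 (assumed current multiplier).
        return raw / 5
    if country == "ch":
        # Additive padding: raw = n + 10 (holds "mostly", per the thread).
        return float(max(raw - 10, 0))
    if country == "at":
        # Additive padding with a large random amount: not recoverable.
        return None
    raise ValueError(f"no padding scheme known for {country!r}")


if __name__ == "__main__":
    for code, raw in [("de", 50), ("ch", 25), ("at", 400)]:
        print(code, raw, estimate_real_count(raw, code))
```

Because the .at case has no inverse and the other two can change, a per-country correction of this kind needs ongoing maintenance, which is one argument for publishing raw counts alongside any corrected figures.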
