Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cookie Monster #2

Open
wants to merge 28 commits into
base: main
Choose a base branch
from
Open

Cookie Monster #2

wants to merge 28 commits into from

Conversation

DelphianCalamity
Copy link
Collaborator

No description provided.

@DelphianCalamity DelphianCalamity changed the title passing epsilon through http response header chromium tweaks Feb 13, 2024
Copy link
Member

@tholop tholop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @DelphianCalamity, I just left a few comments or clarification questions. It's amazing that you made that work :) The diff is huge so I couldn't review everything yet, I'm rather trying to probe if I understand ARA correctly, and hopefully that will help me have a data model that's close enough to the code. I'll take another pass once I have the theory ready to check that the optimizations really spend the right thing!

Comment on lines +195 to +211
// Populate partition.report_value_pairs[*].report for all source_keys
for (auto& outer : source_counts_per_sourcekey) {
auto source_key = outer.first;
auto& report_value_pair = partition.report_value_pairs[source_key];
auto& trigger_keypieces = trigger_keypieces_per_source[source_key];

for (auto& inner : outer.second) {
auto source_keypiece = inner.first;
auto source_count = inner.second;

// Extend the source key_pieces for source_key
for (auto& trigger_keypiece : trigger_keypieces) {
source_keypiece |= trigger_keypiece;
}

double contribution_value = (source_count / total_sources_count) * report_value_pair.value;
report_value_pair.report.emplace_back(source_keypiece, contribution_value);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok so first we build source_counts_per_sourcekey, that is a list (actually dict) of sources, for each source we have a histogram that maps each keypiece to the number of times it appears in the whole attribution window.

And next we browse all the sources. If there is a trigger associated to that source we concatenate the trigger_keypiece and map it to the right contribution value. But what if there is no trigger associated to that source? E.g. source_key does not appear in trigger_keypieces_per_source. Should we still give some value to the untouched source_keypiece? Maybe yes, it's a sort of unattributed impression?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there should always be a value for a given source_key. There should be a trigger for every existing source. I do some sanity check to catch some errors here

// Sanity check: every source_key from trigger_data must exist in source aggregation_keys too

but I should do some more to ensure this doesn't happen.

I was actually hoping ARA does that already tbh earlier but will check

Comment on lines +38 to +54
double Partition::compute_sensitivity(const char* sensitivity_metric) {
double individual_sensitivity = 0;
if (std::strcmp(sensitivity_metric, "L1") == 0) {
for (auto& pair : report_value_pairs) {
for (auto& bucket : pair.second.report) {
individual_sensitivity += bucket.value();
}
}
return individual_sensitivity;
} else if (std::strcmp(sensitivity_metric, "L2") == 0) {
// TODO(kelly)
// return AggregatableResult::kInternalError;
} else {
// return AggregatableResult::kInternalError;
}
return -1;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That one looks good for now, but I'll keep an eye on it until I have the full proof written. It's fine for impressions but I'm a bit worried about the sensitivity for conversions

Comment on lines +3548 to +3550
} else if (std::strcmp(kSensitivityMetric, "L2") == 0) {
// TODO(kelly)
// sum up histograms, raise each bucket value to power of 2 sum them up and sqrt
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes but let's wait for that one to be sure that we don't mess up with a non-linearity due to the square/sqrt

@DelphianCalamity DelphianCalamity changed the title chromium tweaks Cookie Monster Oct 14, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants