From 57d0ad5cc484ddddfe5e00b10f2a7f6291ce8a6c Mon Sep 17 00:00:00 2001
From: Yao Xiao
Date: Tue, 27 Jun 2023 16:34:55 -0400
Subject: [PATCH] Provide reasonable limits on the data that can be included in topics classification input

Address issue https://github.com/patcg-individual-drafts/topics/issues/211
---
 spec.bs | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/spec.bs b/spec.bs
index 75a0224..58cd684 100644
--- a/spec.bs
+++ b/spec.bs
@@ -195,7 +195,9 @@ spec: html; urlPrefix: https://www.rfc-editor.org/rfc/
 Each {{Document}} has a document id, which is an [=implementation-defined=] unique identifier shared with no other {{Document}} objects within or across browser sessions for a user agent.

Determine topics calculation input data

- Given a {{Document}}, the browser must have a way to determine the topics calculation input data. [=determine-topics-calculation-input-data-header/topics calculation input data=] is a string that encodes the attributes to be used for topics classification. The attributes could be the document's [=Document/URL=], the URL's [=domain=], the document node's [=descendant text content=], etc, as determined by the browser vendor.
+ Given a {{Document}}, the browser must have a way to determine the topics calculation input data. [=determine-topics-calculation-input-data-header/topics calculation input data=] is a string that encodes the attributes to be used for topics classification, as determined by the browser vendor. By default, the attributes should be scoped to the document's [=Document/URL=] and metadata.
+
+ Note: unless specifically allowed, data beyond the document shouldn't be included, such as data from localStorage or cookies.
 Note: In Chrome's experimentation phase, the [=host=] of a {{Document}}'s [=Document/URL=] is used as the [=determine-topics-calculation-input-data-header/topics calculation input data=], and the model is trained with human curated hostnames and topics.