Authors: Rashmi Rao, John Cox, Salman Malik, Google Privacy Sandbox
Protected App Signals provides ad-techs a way to egress data from inside the privacy boundary to their own servers for model training. See the Protected App Signals Explainer, particularly the Reporting section, for background. Refer to the design for Protected App Signals with B&A here.
This document describes the following:
- The set of feature types available to egress information to adtech servers for model training for Protected App Signals.
- The wire format of each feature.
- The wire format of the egressPayload itself.
The wire representation of the egressPayload will be noised per feature type. More details will be provided in a future explainer update.
Based on the information in this document, adtechs can write parsers to prepare the egressPayload and transform the values it contains into features for use in their model training systems.
We support two kinds of feature types: primitives, which can contain a single feature value; and collections, which can contain multiple primitives.
Primitive types represent single values: booleans, unsigned integers, and signed integers.
boolean-feature-type represents a single boolean value.
Expected value: true or false
Wire format: 0 (false) or 1 (true)
unsigned-integer-feature-type represents a single non-negative integer value.
Parameters:
size: unsigned integer indicating the number of bits based on the range of values.
Expected value: non-negative integer value in the range [0, 2^size - 1]
For example, if size = 3, unsigned-integer-feature-type can have a value in the range [0, 7] and will occupy 3 bits on the wire.
Wire format
Binary representation of the unsigned integer. The integer will be converted to wire format following little endian byte order.
Example: If value = 5 and size = 3, wire format will be 101
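The encoding above can be sketched in a few lines of Python. This is an illustrative sketch, not the platform's implementation: the function names encode_unsigned and decode_unsigned are hypothetical, and the sketch works on bit strings only, so it does not model byte order or noising.

```python
def encode_unsigned(value: int, size: int) -> str:
    """Return `value` as a bit string exactly `size` bits wide (hypothetical helper)."""
    if not 0 <= value <= (1 << size) - 1:
        raise ValueError(f"{value} does not fit in {size} unsigned bits")
    # Zero-pad on the left so the field always occupies `size` bits.
    return format(value, f"0{size}b")


def decode_unsigned(bits: str) -> int:
    """Parse a bit string produced by encode_unsigned back into an integer."""
    return int(bits, 2)


print(encode_unsigned(5, 3))   # 101
print(decode_unsigned("101"))  # 5
```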
signed-integer-feature-type can be used to represent a single positive or negative integer value.
Parameters:
size: unsigned integer indicating the number of bits based on the range of values.
Expected value: integer value in the range [-2^(size-1), 2^(size-1) - 1]
For example, if size = 4, signed-integer-feature-type can have a value in the range [-8, 7], and will occupy 4 bits on the wire.
Wire format
2's complement representation of the signed integer. The integer will be converted to wire format following little endian byte order.
Example: If value = -3 and size = 4, wire format will be 1101
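A parser has to reverse the 2's complement step when it reads a signed field. The following is a minimal sketch under the assumption that a field is handled as a bit string of known width; encode_signed and decode_signed are hypothetical names, not part of any published API.

```python
def encode_signed(value: int, size: int) -> str:
    """Return the `size`-bit 2's complement bit string for `value`."""
    low, high = -(1 << (size - 1)), (1 << (size - 1)) - 1
    if not low <= value <= high:
        raise ValueError(f"{value} is outside [{low}, {high}]")
    # Masking with 2^size - 1 maps negative values onto their 2's complement form.
    return format(value & ((1 << size) - 1), f"0{size}b")


def decode_signed(bits: str) -> int:
    """Interpret a bit string as a 2's complement integer of width len(bits)."""
    raw = int(bits, 2)
    # A leading 1 is the sign bit, so subtract 2^size to recover the negative value.
    return raw - (1 << len(bits)) if bits[0] == "1" else raw


print(encode_signed(-3, 4))   # 1101
print(decode_signed("1101"))  # -3
```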
Collection feature types represent a collection of homogeneous or heterogeneous values.
The wire representation of the values will be in right-to-left order: the first value occupies the rightmost bits.
bucket-feature-type can be used to represent an ordered list of boolean-feature-type values.
Expected values: list of boolean (true or false) values
Parameters
allow-multiple: indicates whether the bucket can contain multiple true values.
size: number of values in the bucket.
Wire format
Sequential bit representation of boolean-feature-type values.
For example, if size = 4, bucket-feature-type can hold 4 values of the type boolean-feature-type. If the boolean-feature-type values are [true, false, true, false], this will occupy 4 bits on the wire. Wire format will be 0101 (the first value is the rightmost bit).
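The right-to-left ordering is easy to get backwards, so here is a small sketch of bucket encoding and decoding. The names encode_bucket and decode_bucket are hypothetical, and treating allow-multiple = false as "at most one true value" is an assumption drawn from the parameter description above.

```python
from typing import List


def encode_bucket(values: List[bool], size: int, allow_multiple: bool) -> str:
    """Encode booleans right-to-left: values[0] becomes the rightmost bit."""
    if len(values) != size:
        raise ValueError(f"expected {size} values, got {len(values)}")
    # Assumption: allow-multiple = false means at most one value may be true.
    if not allow_multiple and sum(values) > 1:
        raise ValueError("bucket does not allow multiple true values")
    return "".join("1" if v else "0" for v in reversed(values))


def decode_bucket(bits: str) -> List[bool]:
    """Read the bits from right to left to recover the original order."""
    return [b == "1" for b in reversed(bits)]


print(encode_bucket([True, False, True, False], size=4, allow_multiple=True))  # 0101
print(decode_bucket("0101"))  # [True, False, True, False]
```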
histogram-feature-type can be used to represent an ordered, heterogeneous list of unsigned-integer-feature-type and signed-integer-feature-type values.
Parameters
size: unsigned integer indicating the number of fixed size values in the histogram.
Expected values: list of unsigned-integer-feature-type and signed-integer-feature-type values
Wire format
Wire format of each contained value, right-to-left. For example, if the histogram contains 2 elements, the first of which is an unsigned 3-bit integer with the value 5, and the second is a signed 4-bit integer with the value -3, then the wire format would be 1101101 (1101 for -3, followed by 101 for 5).
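The histogram example can be reproduced with a short sketch. The (value, size, signed) tuple layout and the helper names int_to_bits and encode_histogram are assumptions made for illustration; a real parser would take the element types and sizes from the schema.

```python
from typing import List, Tuple


def int_to_bits(value: int, size: int, signed: bool) -> str:
    """Encode one integer as a fixed-width bit string (2's complement if signed)."""
    low = -(1 << (size - 1)) if signed else 0
    high = (1 << (size - 1)) - 1 if signed else (1 << size) - 1
    if not low <= value <= high:
        raise ValueError(f"{value} is outside [{low}, {high}]")
    return format(value & ((1 << size) - 1), f"0{size}b")


def encode_histogram(entries: List[Tuple[int, int, bool]]) -> str:
    """Concatenate elements right-to-left: the first entry ends up rightmost."""
    return "".join(int_to_bits(v, s, sg) for v, s, sg in reversed(entries))


# First element: unsigned 3-bit 5. Second element: signed 4-bit -3.
print(encode_histogram([(5, 3, False), (-3, 4, True)]))  # 1101101
```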
The definition of the wire format of a payload is called its protocol. Below we describe the first wire format, or protocol version 1. The protocol version included in the payload will be set by the platform.
A payload is made up of two parts: a header containing metadata information used for serialization and deserialization; and a body containing serialized feature values.
The header itself has two parts:
- Protocol version: unsigned 5-bit int indicating the version of the wire format specification used to encode the payload.
- Schema version: unsigned 3-bit int. Version identifier for the schema that defines the payload.
The wire format of the header is the protocol version, then the schema version, right-to-left. For example, if the protocol version is 1 (00001 on the wire) and the schema version is 2 (010 on the wire), the header will be 01000001.
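Since the header is exactly 8 bits, packing and unpacking it can be sketched as follows. The function names encode_header and decode_header are hypothetical; the field widths (5 and 3 bits) and the right-to-left order come from the description above.

```python
from typing import Tuple


def encode_header(protocol_version: int, schema_version: int) -> str:
    """Build the 8-bit header: schema version (3 bits) then protocol version (5 bits)."""
    if not 0 <= protocol_version < (1 << 5):
        raise ValueError("protocol version must fit in 5 bits")
    if not 0 <= schema_version < (1 << 3):
        raise ValueError("schema version must fit in 3 bits")
    # Right-to-left order puts the protocol version in the rightmost 5 bits.
    return format(schema_version, "03b") + format(protocol_version, "05b")


def decode_header(header_bits: str) -> Tuple[int, int]:
    """Return (protocol_version, schema_version) from an 8-bit header string."""
    if len(header_bits) != 8:
        raise ValueError("header must be exactly 8 bits")
    return int(header_bits[3:], 2), int(header_bits[:3], 2)


print(encode_header(protocol_version=1, schema_version=2))  # 01000001
print(decode_header("01000001"))  # (1, 2)
```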
The body contains serialized feature values, with values as defined in each feature type above. The order of the features is the same as their order in the provided schema for the payload, right-to-left.
The body is 0-padded. Details of the padding are slightly different for egressPayload and temporaryUnlimitedEgressPayload:
- egressPayload is first 0-padded to its maximum size in bits, then 0-padded to the nearest byte.
- temporaryUnlimitedEgressPayload is 0-padded to the nearest byte.
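The two padding rules differ only in whether a maximum size applies first. The sketch below computes the padded body length in bits; padded_body_length is a hypothetical helper, and it assumes the maximum size refers to the body alone and that "nearest byte" means rounding up to the next multiple of 8. It says nothing about which side the zero bits are added on.

```python
from typing import Optional


def padded_body_length(body_bits: int, max_bits: Optional[int] = None) -> int:
    """Return the body length in bits after 0-padding.

    Pass max_bits for egressPayload; leave it as None for
    temporaryUnlimitedEgressPayload, which has no maximum-size step.
    """
    length = body_bits
    if max_bits is not None:
        if body_bits > max_bits:
            raise ValueError("body exceeds the maximum wire size")
        length = max_bits  # first pad up to the maximum size
    # Then round up to a whole number of bytes.
    return -(-length // 8) * 8


print(padded_body_length(12, max_bits=20))  # 24
print(padded_body_length(12))               # 16
```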
- Protocol version: 1
- Example schema version: 2
- Example max wire size for egressPayload: 20 bits
Consider this example schema, its feature values, and the corresponding wire format for each feature type specified in the schema:
| Feature type (defined in the schema) | Feature type in collection (defined in the schema) | Parameters (defined in the schema) | Corresponding value in JSON | Wire representation |
|---|---|---|---|---|
| histogram-feature-type with size = 2 | unsigned-int-feature-type | size = 3 | 5 | 101 |
| | signed-int-feature-type | size = 4 | -3 | 1101 |
| boolean-feature-type | | | false | 0 |
| bucket-feature-type | boolean-feature-type | size = 4, allow-multiple = true | [true, false, true, false] | 0101 |
Wire format of the feature values would be:
Wire format of the feature values + padding would be:
Wire format of the feature values + padding + header would be:
Wire format of the feature values + padding would be:
Wire format of the feature values + padding + header would be:
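As a sketch of the first step only, the feature values from the example table can be concatenated in schema order, right-to-left, using the rules given for each feature type. The helper names int_bits and join_right_to_left are hypothetical. The sketch deliberately stops before padding and the header, because it makes no assumption about which side padding is added on or where the header sits relative to the body.

```python
from typing import List


def int_bits(value: int, size: int) -> str:
    """Fixed-width bit string; negative values become 2's complement."""
    return format(value & ((1 << size) - 1), f"0{size}b")


def join_right_to_left(parts: List[str]) -> str:
    """Concatenate so that the first part occupies the rightmost bits."""
    return "".join(reversed(parts))


# Features in schema order, with values taken from the example table.
histogram = join_right_to_left([int_bits(5, 3), int_bits(-3, 4)])
boolean = "0"  # false
bucket = join_right_to_left(
    ["1" if v else "0" for v in [True, False, True, False]]
)

body = join_right_to_left([histogram, boolean, bucket])
print(histogram)  # 1101101
print(bucket)     # 0101
print(body)       # 010101101101
print(len(body))  # 12
```

Under these rules the unpadded body is 12 bits, which fits within the example maximum wire size of 20 bits.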