trackhar
- Adapter
- AnnotatedResult
- ArrayOrSingle
- Context
- DataPath
- DecodingStep
- Identifier
- IndicatorValues
- JsonPath
- Path
- Property
- Request
- Result
- Tracker
- TrackerDescriptionTranslationKey
- TrackingDataValue
- Variable
Ƭ Adapter: Object
An adapter that contains instructions on how to extract the tracking data included in a request to certain endpoints.
Handling for one endpoint might be split across multiple adapters if the endpoint accepts different request formats.
The first adapter that matches a request will be used to decode it.
Name | Type | Description |
---|---|---|
containedDataPaths | Partial<Record<Property, ArrayOrSingle<DataPath>>> | A description of how to extract the transmitted tracking data from the decoded object. |
decodingSteps | DecodingStep[] | An array of the steps (in order) used to decode the request into an object format. |
description? | TrackerDescriptionTranslationKey | The translation key for a description that gives context on the endpoint, if that makes sense. |
endpointUrls | (string \| RegExp)[] | The endpoints that this adapter can handle. Each entry can either be a string (which will have to be equal to the full endpoint URL in the request) or a regular expression that is matched against the endpoint URL. The endpoint URL in this context is the full URL, including protocol, host, and path, but excluding the query string. It should not have a trailing slash. |
match? | (r: Request) => boolean \| undefined | An optional function to further filter which requests can be handled by this adapter. This is useful if there are multiple adapters for one endpoint that handle different request formats. |
name | string | A human-readable name for the adapter. This should be as close as possible to the official name for the endpoint. |
slug | string | A slug to identify the adapter. These only need to be unique per tracker, not globally. |
tracker | Tracker | The tracking company behind these endpoints. |
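To make the shape of an adapter more concrete, here is a minimal sketch for a made-up tracker. The `Simple*` types are simplified local stand-ins for the types documented here (they are not imported from the library), and the tracker, endpoint URL, paths, and chosen properties are all hypothetical.

```typescript
// Simplified local stand-ins for the documented types (assumption: not the library's own declarations).
type SimpleDataPath = { context: 'header' | 'cookie' | 'path' | 'query' | 'body'; path: string; reasoning: string };
type SimpleDecodingStep = { function: string; input: string; output: string };
type SimpleTracker = { slug: string; name: string };
type SimpleAdapter = {
    slug: string;
    name: string;
    tracker: SimpleTracker;
    endpointUrls: (string | RegExp)[];
    decodingSteps: SimpleDecodingStep[];
    containedDataPaths: Record<string, SimpleDataPath | SimpleDataPath[]>;
};

// Hypothetical tracking company, for illustration only.
const exampleTracker: SimpleTracker = { slug: 'example-tracker', name: 'Example Tracking Inc.' };

const exampleAdapter: SimpleAdapter = {
    slug: 'events',
    name: 'Example Events API',
    tracker: exampleTracker,
    // Full URL including protocol, host, and path, without query string or trailing slash.
    endpointUrls: ['https://events.example.org/v1/track'],
    // Parse the JSON request body and store the decoded object under `res.body`.
    decodingSteps: [{ function: 'parseJson', input: 'body', output: 'res.body' }],
    // Where the tracking data sits in the decoded object (hypothetical paths).
    containedDataPaths: {
        advertisingId: { context: 'body', path: '$.device.adId', reasoning: 'obvious property name' },
        osName: { context: 'body', path: '$.device.os', reasoning: 'obvious property name' },
    },
};
```

The slug only has to be unique within `example-tracker`, so a second adapter of another tracker could also be called `events`.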
Ƭ AnnotatedResult: ({ adapter: string; property: LiteralUnion<Property, string>; reasoning: DataPath["reasoning"] | "indicator matching (plain text)" | "indicator matching (base64)" | "indicator matching (URL-encoded)"; value: TrackingDataValue } & Omit<DataPath, "reasoning">)[]
Extended version of the Result type that includes additional metadata about the detected tracking. Each entry in the array is one instance of a tracking data value that was found in a request, with the following properties:
- `adapter`: The adapter that detected the tracking data (`<tracker slug>/<adapter slug>`) or `indicators` if the entry was detected through indicator matching.
- `property`: The type of tracking data that was detected.
- `value`: The actual value of the tracking data that was transmitted.
- `context`: The part of the request in which the tracking data was found (e.g. `body`, `path`).
- `path`: A JSONPath expression indicating where this match was found. Note that while we try to keep this path as close as possible to the format used by the tracker, it refers to the decoded request, after our processing steps. This is unavoidable as the trackers don't transmit in a standardized format. If indicator matching was used to detect this entry, the path will point to the first character of the match in the respective part of the request.
- `reasoning`: An explanation of how we concluded that this information is actually the type of data we labelled it as. This can either be a standardized description, or a URL to a more in-depth research report. If indicator matching was used to detect this entry, the reasoning will be `indicator matching` followed by the encoding that was used to match the indicator value in parentheses.
Ƭ ArrayOrSingle<T>: T | T[]
Either a single instance or an array of `T`.
Name |
---|
T |
Ƭ Context: "header" | "cookie" | "path" | "query" | "body"
A part of a request, to explain where some information was found.
Ƭ DataPath: Object
A description of where a certain piece of tracking data can be found in the decoded request.
Name | Type | Description |
---|---|---|
context | Context | The part of the original request that the data can be found in. |
notIf? | string \| RegExp | An optional filter that stops a discovered value from being considered an instance of the respective property. |
onlyIf? | string \| RegExp | An optional filter that causes only matching values to be considered instances of the respective property. |
path | JsonPath | A JSONPath expression describing where in the decoded request object the data can be found. |
reasoning | "obvious property name" \| "obvious observed values" \| "observed values match known device parameters" \| `https://${string}` \| `http://${string}` \| `${string}.md` | An explanation of how we concluded that this information is actually the type of data we labelled it as. This can either be a standardized description, or a URL to a more in-depth research report. |
Ƭ DecodingStep: ({ function: "parseQueryString" | "parseJson" | "decodeBase64" | "decodeUrl" | "decodeProtobuf" | "decodeJwt" | "ensureArray" | "gunzip" } | { function: "split"; options: { separator: string } } | { function: "getProperty"; options: { path: JsonPath } }) & ({ input: Path } | { mapInput: Path }) & { output: Identifier }
A step in the process of decoding a tracking request. This is essentially a function call with some input and output, and potentially additional options.
The `input` is a JSONPath expression which is evaluated against the global decoding state (initialized with the data from each Context of the request, and a `res` object, where the result of the decoding is to be stored, separated by Context; new variables can be created by decoding steps). Alternatively, if a `mapInput` is specified instead, the function will be mapped over the array at the given path, returning a result array.
The `output` is an identifier of where to store the return value of the function call in the same global decoding state. Note that this doesn't support the full range of JSONPath expressions, but only nested property access through `.`.
The following `function`s are available:
- `parseQueryString`: Parses a query-string-encoded value into an object.
- `parseJson`: Parses a JSON-encoded string into an object.
- `decodeBase64`: Decodes a base64-encoded string.
- `decodeUrl`: Decodes a URL-encoded string.
- `decodeProtobuf`: Decodes a Protobuf blob. This doesn't use a schema; as such, property names are not available in the result.
- `decodeJwt`: Decodes the payload of a JSON Web Token (JWT) string into an object.
- `ensureArray`: Ensures that the given value is an array. If it is not, it is wrapped in an array.
- `gunzip`: Unzips a gzip-compressed blob.
- `split`: Splits a string into an array using the given separator.
- `getProperty`: Gets a property from an object. The property name is given in the `options.path` option. This is useful either for copying a nested property to a variable, or for extracting a nested property from each element of an array when used with a `mapInput`.
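The interplay of `input`, `mapInput`, and `output` can be sketched with a small interpreter. This is a simplified illustration under stated assumptions, not the library's implementation: it supports only a handful of the functions, treats `input` and `mapInput` as plain dot-separated paths instead of full JSONPath expressions, and all names (`SimpleStep`, `runSteps`, and so on) are hypothetical.

```typescript
// Simplified stand-in for DecodingStep: dot-separated paths only, no JSONPath (assumption).
type SimpleStep = {
    function: 'parseQueryString' | 'parseJson' | 'decodeBase64' | 'decodeUrl' | 'ensureArray' | 'split';
    options?: { separator: string };
    output: string;
} & ({ input: string } | { mapInput: string });

type DecodingState = Record<string, any>;

// Read a nested property via a dot-separated path, e.g. `res.body.id`.
const getPath = (state: DecodingState, path: string): any =>
    path.split('.').reduce((obj: any, key) => (obj == null ? undefined : obj[key]), state);

// Write a nested property via a dot-separated path, creating intermediate objects as needed.
const setPath = (state: DecodingState, path: string, value: unknown): void => {
    const keys = path.split('.');
    const last = keys.pop() as string;
    let obj: any = state;
    for (const key of keys) {
        if (obj[key] === undefined) obj[key] = {};
        obj = obj[key];
    }
    obj[last] = value;
};

// Apply the function of one step to a single value.
const applyFunction = (step: SimpleStep, value: any): any => {
    switch (step.function) {
        case 'parseQueryString': {
            const parsed: Record<string, string> = {};
            new URLSearchParams(value).forEach((v, k) => {
                parsed[k] = v;
            });
            return parsed;
        }
        case 'parseJson':
            return JSON.parse(value);
        case 'decodeBase64':
            return atob(value);
        case 'decodeUrl':
            return decodeURIComponent(value);
        case 'ensureArray':
            return Array.isArray(value) ? value : [value];
        case 'split':
            return String(value).split(step.options ? step.options.separator : '');
    }
};

// Run the steps in order against the shared state; `mapInput` maps the function over an array.
const runSteps = (steps: SimpleStep[], state: DecodingState): DecodingState => {
    for (const step of steps) {
        const result =
            'mapInput' in step
                ? (getPath(state, step.mapInput) as any[]).map((v) => applyFunction(step, v))
                : applyFunction(step, getPath(state, step.input));
        setPath(state, step.output, result);
    }
    return state;
};

// Hypothetical request: the query parameter `d` carries base64-encoded JSON.
const decodedState = runSteps(
    [
        { function: 'parseQueryString', input: 'query', output: 'q' },
        { function: 'decodeBase64', input: 'q.d', output: 'decoded' },
        { function: 'parseJson', input: 'decoded', output: 'res.query' },
    ],
    { query: 'd=' + encodeURIComponent(btoa('{"os":"android"}')), res: {} }
);
```

After the three steps, the intermediate variables `q` and `decoded` sit next to the initial `query` value, and the final object is stored under `res.query`, mirroring how results are kept separate per Context.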
Ƭ Identifier: Variable | `${Exclude<Variable, "res">}.${string}` | `res.${Context}` | `res.${Context}.${string}`
An identifier for a variable or nested property on the global state in the decoding process of a request. This doesn't have support for more complex JSONPath expressions.
Ƭ IndicatorValues: Partial<Record<LiteralUnion<Property, string>, ArrayOrSingle<string>>>
A mapping from properties (standardized names for certain types of tracking data) to indicator values (known honey data strings that appear in the request if the property is present). Indicator values can be provided as arrays or single strings. They are automatically matched against their encoded versions (e.g. base64 and URL-encoded). Where possible, they are matched case-insensitively.
Example
{
"localIp": ["10.0.0.2", "fd31:4159::a2a1"],
"advertisingId": "6a1c1487-a0af-4223-b142-a0f4621d0311"
}
This example means that if the string `10.0.0.2` or `fd31:4159::a2a1` is found in the request, it indicates that the local IP is being transmitted. Similarly, if the string `6a1c1487-a0af-4223-b142-a0f4621d0311` is found in the request, it indicates that the advertising ID is being transmitted.
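The matching idea can be sketched as follows. This is a simplified illustration, not the library's algorithm: it only reports the first occurrence per encoding, only finds base64 when the indicator value was encoded on its own (not as part of a longer string), and assumes ASCII indicator values because it relies on `btoa`. The names `findIndicators` and `IndicatorMatch` are hypothetical.

```typescript
type Indicators = Record<string, string | string[]>;
type Encoding = 'plain text' | 'base64' | 'URL-encoded';
type IndicatorMatch = { property: string; value: string; encoding: Encoding; index: number };

// Search a string (e.g. a request body or query string) for the given indicator values.
const findIndicators = (haystack: string, indicators: Indicators): IndicatorMatch[] => {
    const matches: IndicatorMatch[] = [];
    const lowerHaystack = haystack.toLowerCase();
    for (const property of Object.keys(indicators)) {
        const raw = indicators[property];
        const values = Array.isArray(raw) ? raw : [raw];
        for (const value of values) {
            // Each candidate: encoding name, encoded needle, whether to compare case-insensitively.
            const candidates: [Encoding, string, boolean][] = [
                ['plain text', value, true],
                ['base64', btoa(value), false], // base64 is case-sensitive
                ['URL-encoded', encodeURIComponent(value), true],
            ];
            for (const [encoding, needle, caseInsensitive] of candidates) {
                // Skip encodings that leave the value unchanged to avoid duplicate matches.
                if (encoding !== 'plain text' && needle === value) continue;
                const index = caseInsensitive
                    ? lowerHaystack.indexOf(needle.toLowerCase())
                    : haystack.indexOf(needle);
                if (index !== -1) matches.push({ property, value, encoding, index });
            }
        }
    }
    return matches;
};

// Hypothetical query string containing one URL-encoded and one base64-encoded indicator value.
const indicatorMatches = findIndicators(
    'ip=fd31%3A4159%3A%3Aa2a1&id=' + btoa('6a1c1487-a0af-4223-b142-a0f4621d0311'),
    {
        localIp: ['10.0.0.2', 'fd31:4159::a2a1'],
        advertisingId: '6a1c1487-a0af-4223-b142-a0f4621d0311',
    }
);
```

Each match records the encoding that led to it, which corresponds to the `indicator matching (...)` reasoning values of an AnnotatedResult.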
Ƭ JsonPath: string
A JSONPath expression to be parsed by https://github.com/JSONPath-Plus/JSONPath.
Ƭ Path: LiteralUnion<Variable, JsonPath>
A JSONPath expression that can be used to access a variable or nested property on the global state in the decoding process of a request.
Ƭ Property: keyof typeof translations["properties"]
A type of tracking data that we can detect in a request.
These are our standardized names for the data that we can detect. They are not necessarily the same as the names used by the tracker.
Remarks
- `state` here means "subnational political entity".
- Locales should not be listed under `country`.
- We distinguish the following types of IDs (that are personal data under the GDPR):
  - An `advertisingId` is a unique identifier assigned to a device by the operating system that is the same across apps/websites. In particular, this includes the Google Advertising ID (GAID) and Apple's Identifier for Advertisers (IDFA). These can typically be reset by the user.
  - A `developerScopedId` is a unique identifier assigned to a device by the operating system that is specific to a certain app developer. Apps from different developers will see different `developerScopedId`s. In particular, this includes Apple's Identifier for Vendor (IDFV), Google's App set ID (ASID), and the `ANDROID_ID`.
  - A `sessionId` identifies a single (time-limited) session and is specific to a certain website/app and device.
  - An `installationId` identifies an installation of an app on a device. It is specific to that app and device, and is reset when the app is uninstalled and reinstalled.
  - A `deviceId` identifies a device across apps/websites.
  - A `userId` identifies a user across apps/websites and devices.
  - We use the `otherIdentifiers` data property to denote UUIDs and other identifiers where we don't know how they are actually used. This should only be used sparingly and where, despite not knowing the precise function, it is obvious (from context or otherwise) that this ID is personal data.
Ƭ Request: Object
Our internal representation of an HTTP request.
Name | Type | Description |
---|---|---|
content? | string | The request body, if any. |
cookies? | { name: string; value: string }[] | The cookies set through the request. |
endpointUrl | string | The full URL, but without the query string. This is useful for matching in the adapters. |
headers? | { name: string; value: string }[] | The headers included in the request. |
host | string | The host name of the request. |
httpVersion | string | The HTTP version of the request. |
method | string | The HTTP method used. |
path | string | The full path of the request, including the query string. |
port | string | The port of the request. |
scheme | "http" \| "https" | The scheme of the request. |
startTime | Date | The time when the request was sent. |
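The difference between `path` (with query string) and `endpointUrl` (without) is easy to mix up, so here is a small sketch that derives the URL-related fields from a URL string. `SimpleRequest` covers only a subset of the fields, and `requestFromUrl` is a hypothetical helper, not part of the library. How the library itself fills `host` and `port` is an assumption here.

```typescript
// Simplified stand-in for the documented Request type (assumption: only a subset of fields).
type SimpleRequest = {
    method: string;
    scheme: 'http' | 'https';
    host: string;
    port: string;
    path: string;
    endpointUrl: string;
};

// Hypothetical helper that derives the URL-related fields from an http(s) URL string.
const requestFromUrl = (method: string, rawUrl: string): SimpleRequest => {
    const u = new URL(rawUrl);
    const scheme = u.protocol === 'https:' ? 'https' : 'http';
    return {
        method,
        scheme,
        host: u.hostname,
        // `URL` reports an empty port for the scheme's default port, so fill it in explicitly (assumption).
        port: u.port || (scheme === 'https' ? '443' : '80'),
        // The path keeps the query string ...
        path: u.pathname + u.search,
        // ... while the endpoint URL drops it.
        endpointUrl: `${u.protocol}//${u.host}${u.pathname}`,
    };
};

const exampleRequest = requestFromUrl('GET', 'https://events.example.org/v1/track?os=android');
```

For the example URL, `path` is `/v1/track?os=android`, whereas `endpointUrl` is `https://events.example.org/v1/track`, which is the form that an adapter's `endpointUrls` are compared against.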
Ƭ Result: Partial<Record<LiteralUnion<Property, string>, TrackingDataValue[]>>
A mapping from properties (standardized names for certain types of tracking data) to the actual instances of values of that property found in a request.
If indicator matching is enabled, it is not possible to distinguish between instances detected through adapter and indicator matching.
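The relationship between the annotated and the values-only form can be sketched as a simple grouping by property. This is an illustration with simplified local types, not the library's code. The entries, and in particular the `path` of the indicator entry, are made up.

```typescript
// Simplified stand-in for one entry of an AnnotatedResult (assumption).
type AnnotatedEntry = {
    adapter: string;
    property: string;
    value: unknown;
    context: string;
    path: string;
    reasoning: string;
};

// Group the values by property and drop all metadata.
const toValuesOnly = (annotated: AnnotatedEntry[]): Record<string, unknown[]> => {
    const grouped: Record<string, unknown[]> = {};
    for (const entry of annotated) {
        (grouped[entry.property] = grouped[entry.property] || []).push(entry.value);
    }
    return grouped;
};

// Hypothetical annotated result: one value found by an adapter, one by indicator matching.
const annotatedExample: AnnotatedEntry[] = [
    {
        adapter: 'example-tracker/events',
        property: 'localIp',
        value: '192.168.0.7',
        context: 'body',
        path: '$.device.ip',
        reasoning: 'obvious property name',
    },
    {
        adapter: 'indicators',
        property: 'localIp',
        value: '10.0.0.2',
        context: 'query',
        path: '$[3]', // made-up path, the real format for indicator matches may differ
        reasoning: 'indicator matching (plain text)',
    },
];

const valuesOnlyExample = toValuesOnly(annotatedExample);
```

Both values end up in the same `localIp` array, which illustrates why the values-only form can no longer tell adapter matches and indicator matches apart.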
Ƭ Tracker: Object
A tracking company that we have adapters for.
Name | Type | Description |
---|---|---|
datenanfragenSlug? | string | The slug of the tracking company in the Datenanfragen.de company database (if available). |
description? | TrackerDescriptionTranslationKey | The translation key for an introductory description that gives context on the tracking company, if that makes sense and the description applies equally to all adapters assigned to the company. |
exodusId? | number | The numeric ID of the tracker in the Exodus tracker database (if available). |
name | string | The legal name of the tracking company. |
slug | string | A slug to identify the tracker. |
Ƭ TrackerDescriptionTranslationKey: keyof typeof translations["tracker-descriptions"]
A translation key for a tracker description, either for a Tracker or for an Adapter. At least the English translation for the actual description needs to be provided in `i18n/en.json`.
See the README for additional details on the contents and markup.
Ƭ TrackingDataValue: any
Some value transmitted by a tracker. We don't have any type information about it.
Ƭ Variable: LiteralUnion<Context | "res", string>
A variable on the global state used in the decoding process of a request. This doesn't allow nested property access.
• Const adapters: Adapter[] = allAdapters
An array of all available adapters.
Remarks
This is not needed for the main purposes of this library, but can be useful for more advanced use cases. We use it to generate the information in `tracker-wiki`.
▸ adapterForRequest(r): undefined | Adapter
Find the adapter that can handle a certain request.
Remarks
This is not needed for the main purposes of this library, but can be useful for more advanced use cases.
Name | Type | Description |
---|---|---|
r | Request | The request to find an adapter for. |
undefined | Adapter
The adapter that can handle the request, or `undefined` if none could be found.
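The lookup described for adapters (first match wins, strings must equal the endpoint URL, regular expressions are tested against it, and `match` can filter further) can be sketched like this. It is an illustration with simplified local types rather than the library's implementation. Treating a `match` result of `undefined` like `false` is an assumption, and the adapters are hypothetical.

```typescript
// Simplified stand-ins for Request and Adapter (assumption: only the fields needed for matching).
type SimpleRequest = { endpointUrl: string; content?: string };
type SimpleAdapter = {
    slug: string;
    endpointUrls: (string | RegExp)[];
    match?: (r: SimpleRequest) => boolean | undefined;
};

// Return the first adapter whose endpoint URLs and optional `match` filter accept the request.
const findAdapter = (r: SimpleRequest, available: SimpleAdapter[]): SimpleAdapter | undefined => {
    for (const adapter of available) {
        const urlMatches = adapter.endpointUrls.some((url) =>
            url instanceof RegExp ? url.test(r.endpointUrl) : url === r.endpointUrl
        );
        // Assumption: only an explicit `true` from `match` counts as a match.
        if (urlMatches && (adapter.match ? adapter.match(r) === true : true)) return adapter;
    }
    return undefined;
};

// Two hypothetical adapters for the same endpoint that handle different body formats.
const exampleAdapters: SimpleAdapter[] = [
    {
        slug: 'track-json',
        endpointUrls: ['https://events.example.org/v1/track'],
        match: (r) => (r.content ? r.content.charAt(0) === '{' : undefined),
    },
    { slug: 'track-form', endpointUrls: [/^https:\/\/events\.example\.org\/v\d+\/track$/] },
];

const jsonAdapter = findAdapter(
    { endpointUrl: 'https://events.example.org/v1/track', content: '{"os":"android"}' },
    exampleAdapters
);
const formAdapter = findAdapter(
    { endpointUrl: 'https://events.example.org/v1/track', content: 'os=android' },
    exampleAdapters
);
```

Because the first matching adapter is used, the more specific `track-json` adapter is listed before the catch-all `track-form` adapter.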
▸ decodeRequest(r, decodingSteps): any
Decode a request into an object representation using the given decoding steps.
Remarks
This is not needed for the main purposes of this library, but can be useful for more advanced use cases.
Name | Type | Description |
---|---|---|
r | Request | The request to decode in our internal request format. |
decodingSteps | DecodingStep[] | The decoding steps to use (from the adapter). |
any
An object representation of the request.
▸ process<ValuesOnly>(har, options?): Promise<ValuesOnly extends true ? (undefined | Partial<Record<LiteralUnion<"accelerometerX" | "accelerometerY" | "accelerometerZ" | "advertisingId" | "appId" | "appName" | "appVersion" | "architecture" | "batteryLevel" | "browserId" | "browserWindowHeight" | "browserWindowWidth" | "campaignCreative" | "campaignCreativePosition" | "campaignMedium" | "campaignName" | "campaignSource" | "campaignTerm" | "carrier" | "consentState" | "country" | "currency" | "developerScopedId" | "deviceId" | "deviceName" | "diskFree" | "diskTotal" | "diskUsed" | "errorInformation" | "hashedAdvertisingId" | "installationId" | "installTime" | "interactedElement" | "isAutomated" | "isCharging" | "isConversion" | "isDntEnabled" | "isEmulator" | "isFirstLaunch" | "isInDarkMode" | "isInForeground" | "isJsEnabled" | "isMobileDevice" | "isRoaming" | "isRooted" | "isUserActive" | "isUserInactive" | "language" | "lastActivityTime" | "latitude" | "localIp" | "longitude" | "macAddress" | "manufacturer" | "model" | "networkConnectionType" | "orientation" | "osName" | "osVersion" | "otherIdentifiers" | "pageHeight" | "pageWidth" | "propertyId" | "publicIp" | "pushNotificationToken" | "ramFree" | "ramTotal" | "ramUsed" | "referer" | "revenue" | "rotationX" | "rotationY" | "rotationZ" | "screenColorDepth" | "screenHeight" | "screenWidth" | "scrollPositionX" | "scrollPositionY" | "segment" | "sessionCount" | "sessionDuration" | "sessionId" | "signalStrengthCellular" | "signalStrengthWifi" | "startTime" | "state" | "timeSpent" | "timezone" | "trackerSdkVersion" | "uptime" | "userAction" | "userActionSource" | "userActiveTime" | "userAgent" | "userGender" | "userId" | "userInterests" | "viewedPage" | "viewedPageCategory" | "viewedPageKeywords" | "viewedPageLanguage" | "volume" | "websiteName" | "websiteUrl", string>, any[]>>)[] : (undefined | AnnotatedResult)[]>
Parse the requests in a HAR traffic dump and extract tracking data.
This always tries to parse requests with the tracker-specific adapters first. If none of them can handle a request, and `options.indicatorValues` is provided, it will fall back to indicator matching.
Name | Type |
---|---|
ValuesOnly | extends boolean = false |
Name | Type | Description |
---|---|---|
har | Har | A traffic dump in HAR format. |
options? | Object | An optional object that can configure the following options: - `valuesOnly`: By default, the result contains not just the values but also various metadata (like the adapter that processed the request). If you only need the values, you can set this option to `true` to get a simpler result. - `indicatorValues`: An object that specifies known honey data values for certain properties. If no adapter could match the request but indicator values are provided, this function will fall back to indicator matching and try to find the indicator values in the request headers, path or body. See IndicatorValues. |
options.indicatorValues? | IndicatorValues | - |
options.valuesOnly? | ValuesOnly | - |
Promise<ValuesOnly extends true ? (undefined | Result)[] : (undefined | AnnotatedResult)[]>
An array of results, corresponding to each request in the HAR file. If a request could not be processed (i.e. if no adapter was found that could handle it and indicator matching, if enabled, didn't produce any results), the corresponding entry in the array will be `undefined`.
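Since the returned array has one entry per request and may contain `undefined`, consumers need to skip those entries. The sketch below counts how often each property was detected. To stay runnable on its own, it takes the processing function as a parameter and uses a stub in place of the real `process`. The `ProcessLike` type is a simplified, hypothetical stand-in for the documented signature.

```typescript
// Simplified stand-ins (assumption): only the fields this example needs.
type AnnotatedEntry = { adapter: string; property: string; value: unknown };
type ProcessLike = (
    har: unknown,
    options?: { indicatorValues?: Record<string, string | string[]> }
) => Promise<(AnnotatedEntry[] | undefined)[]>;

// Count how often each property was detected across all requests of a HAR dump.
const countProperties = async (har: unknown, processFn: ProcessLike): Promise<Record<string, number>> => {
    const results = await processFn(har);
    const counts: Record<string, number> = {};
    for (const result of results) {
        // `undefined` means no adapter matched and indicator matching (if enabled) found nothing.
        if (result === undefined) continue;
        for (const entry of result) counts[entry.property] = (counts[entry.property] || 0) + 1;
    }
    return counts;
};

// Stub standing in for the real `process` function, returning made-up results for three requests.
const stubProcess: ProcessLike = async () => [
    [{ adapter: 'example-tracker/events', property: 'osName', value: 'android' }],
    undefined,
    [
        { adapter: 'example-tracker/events', property: 'osName', value: 'android' },
        { adapter: 'indicators', property: 'advertisingId', value: '6a1c1487-a0af-4223-b142-a0f4621d0311' },
    ],
];

const propertyCounts = countProperties({}, stubProcess);
```

In real use, the stub would be replaced by the library's `process` function and `{}` by a parsed HAR file. Keeping the index of each entry is useful when results need to be related back to the original requests.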
▸ processRequest(request, options?): undefined | AnnotatedResult
Parse a single request in our internal request representation and extract tracking data as an annotated result from it.
Remarks
This is not needed for the main purposes of this library, but can be useful for more advanced use cases.
Name | Type | Description |
---|---|---|
request | Request | The request to process in our internal request format. |
options? | Object | An optional object that can configure the following options: - `indicatorValues`: An object that specifies known honey data values for certain properties. If no adapter could match the request but indicator values are provided, this function will fall back to indicator matching and try to find the indicator values in the request headers, path or body. See IndicatorValues. |
options.indicatorValues? | IndicatorValues | - |
undefined | AnnotatedResult
▸ unhar(har): Request[]
Parse a traffic dump in HAR format into our internal request representation.
Name | Type | Description |
---|---|---|
har | Har | The HAR traffic dump. |
Request[]