Validate input data before training the models #342

Open
barjin opened this issue Dec 12, 2024 · 0 comments
Labels
debt: Code quality improvement or decrease of technical debt.
t-tooling: Issues with this label are in the ownership of the tooling team.

Comments

Collaborator

barjin commented Dec 12, 2024

As mentioned in #339 (and the related comments), the collected input data can contain arbitrary values (e.g. as a result of a penetration test run against the collecting server). This leads to the generation of less believable (or even potentially dangerous) fingerprints.

The input data should be validated before training the models with generator-networks-creator, so that we only generate realistic fingerprints. This could be simple for some properties (e.g. Navigator.appCodeName should always be Mozilla), but may be impossible for others (e.g. Navigator.userAgent can be a pretty much arbitrary string, as long as it follows the expected syntax).
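
To make the idea concrete, here is a minimal sketch of per-property validation applied as a filter before training. It is not how generator-networks-creator works today: the record shape, the function names, the length limit, and the user agent pattern are all assumptions made up for illustration.

```typescript
// Hypothetical shape of one collected record; the real records may differ.
interface CollectedRecord {
    appCodeName: unknown;
    userAgent: unknown;
}

type Validator = (value: unknown) => boolean;

// Strict rule: this property has exactly one legitimate value.
const isMozilla: Validator = (value) => value === 'Mozilla';

// Loose rule: the content is nearly arbitrary, so only the rough syntax is
// checked. Assumption: a product token such as "Mozilla/5.0", optionally
// followed by printable ASCII, with an arbitrary 512 character limit.
const looksLikeUserAgent: Validator = (value) =>
    typeof value === 'string' &&
    value.length <= 512 &&
    /^[A-Za-z]+\/\d+(\.\d+)*( [\x20-\x7E]*)?$/.test(value);

// One validator per property; more properties would be added here.
const rules: Record<keyof CollectedRecord, Validator> = {
    appCodeName: isMozilla,
    userAgent: looksLikeUserAgent,
};

// A record is kept only if every property passes its validator.
function isValidRecord(record: CollectedRecord): boolean {
    return (Object.keys(rules) as (keyof CollectedRecord)[]).every(
        (key) => rules[key](record[key]),
    );
}

// Drops invalid records so they never reach the training step.
function filterValidRecords(records: CollectedRecord[]): CollectedRecord[] {
    return records.filter(isValidRecord);
}
```

Dropping the whole record (instead of patching the bad value) seems like the safer default here, since a record with one injected value likely has other untrustworthy values too.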

Note that this blocks re-enabling the automatic updates of the models.

barjin added the debt and t-tooling labels Dec 12, 2024