Github action for custom metric tests with WPT API #89

Merged on Oct 19, 2023 (81 commits)
Changes from 50 commits
Commits
d2ff6b8
gh action draft
max-ostapenko Jul 23, 2023
82dc580
fixed result path
max-ostapenko Jul 24, 2023
0de4f05
diff with gh context properties
max-ostapenko Jul 24, 2023
e522629
checkout main
max-ostapenko Jul 24, 2023
d630641
include still existing only
max-ostapenko Jul 24, 2023
68a88a5
fixed source commit
max-ostapenko Jul 24, 2023
3ad2382
more test cases
max-ostapenko Jul 24, 2023
fc26242
npm script rename
max-ostapenko Jul 24, 2023
0ea6889
added missing modules and did local testing
max-ostapenko Jul 24, 2023
7ace0af
linter fix
max-ostapenko Jul 24, 2023
4c03238
test case example
max-ostapenko Jul 24, 2023
a9f2298
async tests + added examples
max-ostapenko Jul 31, 2023
55478cd
fixing gh action code
max-ostapenko Jul 31, 2023
7cc3b57
strip file paths
max-ostapenko Jul 31, 2023
de36c8f
list of unique values
max-ostapenko Jul 31, 2023
987b9c6
action conditions fixes
max-ostapenko Jul 31, 2023
11cfe01
basename in a loop
max-ostapenko Jul 31, 2023
01fe324
name fixes
max-ostapenko Jul 31, 2023
5647ff8
incorrect test run exit code
max-ostapenko Jul 31, 2023
95e11e7
exclude big files in a list
max-ostapenko Jul 31, 2023
b4320e5
lint
max-ostapenko Jul 31, 2023
5194f0b
use forked webpagetest
max-ostapenko Aug 1, 2023
8ba13ac
fixed 2 tests
max-ostapenko Aug 1, 2023
f5f848b
remove previous size limits
max-ostapenko Aug 1, 2023
699baae
optimised privacy test time
max-ostapenko Aug 1, 2023
bd82145
name fix
max-ostapenko Aug 1, 2023
1b5175e
to define metrics in test
max-ostapenko Aug 1, 2023
6eb7419
fix
max-ostapenko Aug 1, 2023
0d54df6
remove example custom metrics
max-ostapenko Aug 4, 2023
f509e28
tests in 'tests' folder
max-ostapenko Aug 4, 2023
a3f8b78
use tests dir in yml
max-ostapenko Aug 4, 2023
5bb5d2a
review suggestions
max-ostapenko Aug 4, 2023
5a7060e
custom metrics whitelist removed
max-ostapenko Aug 4, 2023
02e5c8e
testing more string formatting
max-ostapenko Aug 4, 2023
5b840c9
cleaner formatting
max-ostapenko Aug 4, 2023
66164ae
some formatting
max-ostapenko Aug 7, 2023
3876116
Merge branch 'main' into wpt-test-action
max-ostapenko Aug 7, 2023
d1aaead
gh-reporter
max-ostapenko Aug 9, 2023
8822346
pr-comment
max-ostapenko Aug 9, 2023
cddc4f2
debug test results
max-ostapenko Aug 9, 2023
9dd243e
rollback reporters config
max-ostapenko Aug 9, 2023
2a53042
remove pr comment step
max-ostapenko Aug 9, 2023
fd9de3a
no test results
max-ostapenko Aug 9, 2023
c094b8b
main
max-ostapenko Aug 9, 2023
b31533a
comment with custom metrics
max-ostapenko Aug 10, 2023
2fdee19
failing test behaviour
max-ostapenko Aug 10, 2023
18e6a8b
revert failing test
max-ostapenko Aug 10, 2023
c52082f
refresh-message-position
max-ostapenko Aug 10, 2023
ca78b14
Merge branch 'main' into wpt-test-action
max-ostapenko Sep 12, 2023
1c062d9
ads metrics fix
max-ostapenko Sep 13, 2023
2177443
Merge branch 'main' into wpt-test-action
max-ostapenko Sep 26, 2023
cb95937
rework
max-ostapenko Sep 27, 2023
5c02201
md lint
max-ostapenko Sep 27, 2023
e4a09c2
url pattern check
max-ostapenko Sep 27, 2023
3c4a8cf
debugging
max-ostapenko Sep 27, 2023
b580a0a
fix1
max-ostapenko Sep 27, 2023
4a11173
debugging
max-ostapenko Sep 27, 2023
b44eba6
debugging
max-ostapenko Sep 27, 2023
74e2246
fix lines
max-ostapenko Sep 27, 2023
7170466
skipping test websites
max-ostapenko Sep 27, 2023
f0f6e0e
bash and skip condition
max-ostapenko Sep 28, 2023
ebf348c
more debugging
max-ostapenko Sep 28, 2023
d3b6eb9
trimmed line
max-ostapenko Sep 28, 2023
09c64e6
index fix
max-ostapenko Sep 28, 2023
4254fa7
compare strings
max-ostapenko Sep 28, 2023
1c19444
trim line and substring
max-ostapenko Sep 28, 2023
dc92bf4
pr body updated
max-ostapenko Sep 28, 2023
b0e42d5
always test all metrics
max-ostapenko Sep 28, 2023
3cb2943
lint
max-ostapenko Sep 28, 2023
5c88b3b
readme update
max-ostapenko Sep 28, 2023
cf63957
cleanup
max-ostapenko Sep 28, 2023
4eaf511
fix
max-ostapenko Sep 28, 2023
b77ab31
description
max-ostapenko Sep 28, 2023
e467290
new fork
max-ostapenko Sep 28, 2023
44100d8
npm update
max-ostapenko Sep 28, 2023
45b4057
httparchive webpagetest server
max-ostapenko Oct 6, 2023
55312da
Update dist/ads.js
max-ostapenko Oct 6, 2023
fb6626e
Add Storage Buckets detection (#95)
tomayac Oct 9, 2023
9a7ca64
npm update
max-ostapenko Oct 11, 2023
1ce1839
Merge branch 'main' into wpt-test-action
max-ostapenko Oct 11, 2023
21490cd
Pass the API key explicitly as part of the options.
pmeenan Oct 17, 2023
47 changes: 47 additions & 0 deletions .github/workflows/wpt-test.yml
@@ -0,0 +1,47 @@
name: Tests

on:
  pull_request:
    branches:
      - main

jobs:
  test:
    name: WebPageTest Test Cases
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v2
        with:
          fetch-depth: 0

      - name: Install dependencies
        run: |
          npm install webpagetest
          npm install jest

      - name: Run WebPageTest
        run: |
          EXPECTED_TESTS=$(for file in $(git diff --name-only --diff-filter=ACMRT \
            ${{ github.event.pull_request.base.sha }} ${{ github.event.pull_request.head.sha }} | \
            grep -E "^dist/.*\.js$"); do basename "$file"; done | cut -d\. -f1 | sort | uniq)

          for TEST in ${EXPECTED_TESTS[@]}; do
            echo "::group::Test case for $TEST"
            if [ -f "tests/$TEST.test.js" ]; then
              npm test $TEST
            else
              echo "Test case file tests/$TEST.test.js not found"
              exit 2
            fi
            echo "::endgroup::"
          done
        env:
          WPT_API_KEY: ${{ secrets.WPT_API_KEY }}

      - name: Add comment with results
        uses: mshick/add-pr-comment@v2
        if: always()
        with:
          refresh-message-position: true
          message-path: test-results.txt
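The "Run WebPageTest" step derives test names from the changed `dist/*.js` files and expects a matching `tests/<name>.test.js` for each. A minimal sketch of that mapping in JavaScript follows; `expectedTests` is a hypothetical helper written for illustration and is not part of the workflow, and JavaScript's default sort may order names differently from the shell `sort` in some locales.

```javascript
// Hypothetical helper: maps changed file paths to the test names the workflow would look for.
function expectedTests(changedFiles) {
  const names = changedFiles
    .filter((file) => /^dist\/.*\.js$/.test(file)) // same pattern as the grep step
    .map((file) => file.split('/').pop())          // like `basename`
    .map((base) => base.split('.')[0]);            // like `cut -d. -f1`
  return [...new Set(names)].sort();               // like `sort | uniq`
}

const changed = ['dist/ads.js', 'dist/privacy.js', 'README.md', 'tests/ads.test.js'];
for (const name of expectedTests(changed)) {
  console.log(`${name} -> tests/${name}.test.js`);
}
// ads -> tests/ads.test.js
// privacy -> tests/privacy.test.js
```

Only files under `dist/` contribute names, so changing a test file alone would not trigger a run in this sketch.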
12 changes: 10 additions & 2 deletions README.md
@@ -25,12 +25,14 @@ return JSON.stringify({
});
```

3. Test your changes on WPT using the workflow below.
2. Test your changes on WPT using the workflow below.

4. Submit a pull request. Include one or more links to test results in your PR description to verify that the script is working.
3. Submit a pull request. Include one or more links to test results in your PR description to verify that the script is working.

## Testing

### Manual testing using webpagetest.org website

To test a custom metric, for example [`doctype.js`](https://github.com/HTTPArchive/legacy.httparchive.org/blob/master/custom_metrics/doctype.js), you can enter the script directly on [webpagetest.org](https://webpagetest.org?debug=1) under the "Custom" tab.

![image](https://user-images.githubusercontent.com/1120896/59539351-e3ecdd80-8eca-11e9-8b43-76bbd7a12029.png)
@@ -48,6 +50,12 @@ To see the custom metric results, select a run, first click on "Details", and th

For complex metrics like [almanac.js](./dist/almanac.js) you can more easily explore the results by copy/pasting the JSON into your browser console.

### Automated testing using test cases

0. Tests run using the [WPT API wrapper](https://github.com/webpagetest/webpagetest-api) and the [Jest testing framework](https://jestjs.io/).

1. Create a test file in the [`tests`](./tests) directory with a name corresponding to a custom metrics file, e.g. [`privacy.test.js`](./tests/privacy.test.js) testing the [privacy.js](./dist/privacy.js) custom metrics. In the test file, define the websites for the WPT test runs and the test cases for the custom metric parameters.

## Linting

On opening a Pull Request we will do some basic linting of JavaScript using [ESLint](https://eslint.org/) through the [GitHub Super-Linter](https://github.com/github/super-linter).
56 changes: 27 additions & 29 deletions dist/ads.js
@@ -3,7 +3,7 @@
const ACCOUNT_TYPES = ['direct', 'reseller'];
const SELLER_TYPES = ['publisher', 'intermediary', 'both'];

const isPresent = (response, endings) => response.ok && endings.find(ending => response.url.endsWith(ending));
const isPresent = (response, endings) => response.ok && endings.some(ending => response.url.endsWith(ending));

const fetchAndParse = async (url, parser) => {
const controller = new AbortController();
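The switch from `find` to `some` in `isPresent` changes the type of the result: `find` yields the matching ending (or `undefined`), while `some` yields a boolean. A minimal sketch of the difference, using plain objects as stand-ins for fetch `Response` objects (the sample URLs are assumptions for illustration):

```javascript
const endings = ['/ads.txt', '/app-ads.txt'];

// Plain objects stand in for fetch Response objects here.
const hit = { ok: true, url: 'https://example.com/ads.txt' };
const miss = { ok: true, url: 'https://example.com/index.html' };

const presentWithFind = (response) => response.ok && endings.find((ending) => response.url.endsWith(ending));
const presentWithSome = (response) => response.ok && endings.some((ending) => response.url.endsWith(ending));

console.log(presentWithFind(hit), presentWithSome(hit));   // /ads.txt true
console.log(presentWithFind(miss), presentWithSome(miss)); // undefined false

// An undefined value is dropped by JSON.stringify, so the key would vanish from the output.
console.log(JSON.stringify({ present: presentWithFind(miss) })); // {}
console.log(JSON.stringify({ present: presentWithSome(miss) })); // {"present":false}
```

With `some`, the `present` field is always a boolean when the response is OK, which keeps the serialised metric shape stable.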
@@ -22,20 +22,20 @@ const fetchAndParse = async (url, parser) => {
}
};

// Extracts status, record count, and record counts respective to relationship.
// Standard Specification: https://iabtechlab.com/wp-content/uploads/2022/04/Ads.txt-1.1.pdf
// https://iabtechlab.com/wp-content/uploads/2022/04/Ads.txt-1.1.pdf
const parseAdsTxt = async (response) => {
const content = await response.text();
let content = await response.text();

let result = {
present: isPresent(response, ['/ads.txt', '/app-ads.txt']),
redirected: response.redirected,
status: response.status,
};

if (result.present) {
if (result.present && content) {
result = {
...result, ...{
...result,
...{
account_count: 0,
account_types: {
direct: {
@@ -54,10 +54,10 @@ const parseAdsTxt = async (response) => {
};

// Clean up file content
file_content = file_content.replace(/#.*$/gm, '');
file_content = file_content.replace(/\r/g, '');
content = content.replace(/#.*$/gm, '');
content = content.replace(/\r/g, '');

let lines = file_content.split('\n');
let lines = content.split('\n');
result.line_count = lines.length;

for (let line of lines) {
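The cleanup in this hunk strips `#` comments, removes carriage returns, and splits on newlines. A small sketch of what that produces for a sample file; `cleanLines` is a hypothetical wrapper name and the sample content is invented for illustration:

```javascript
// Hypothetical wrapper; the replace/split calls mirror the ones in the hunk above.
function cleanLines(content) {
  return content
    .replace(/#.*$/gm, '') // drop comments up to the end of each line
    .replace(/\r/g, '')    // turn CRLF line endings into LF
    .split('\n');
}

const sample =
  '# ads.txt\r\n' +
  'example.com, 123, DIRECT # inline comment\r\n' +
  '\r\n' +
  'example.org, 456, RESELLER\r\n';

const lines = cleanLines(sample);
console.log(lines.length); // 5
console.log(lines.map((line) => line.trim()).filter(Boolean));
// [ 'example.com, 123, DIRECT', 'example.org, 456, RESELLER' ]
```

Note that `lines.length` counts comment-only and blank lines too, so a `line_count` taken at this point is a count of physical lines rather than of records.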
@@ -81,7 +81,7 @@ const parseAdsTxt = async (response) => {
}
};

// Convert Sets to Arrays
// Count unique and remove domain Sets for now
for (let accountType of Object.values(result.account_types)) {
accountType.domain_count = accountType.domains.size;
delete accountType.domains // Keeping a list of domains may be valuable for further research, e.g. accountType.domains = [...accountType.domains];
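The loop above replaces each `domains` Set with a `domain_count`. A minimal sketch of why that conversion is needed before serialising: a `Set` has no enumerable properties, so `JSON.stringify` renders it as `{}`. The `count` field and the sample records below are placeholders for illustration, not the fields used in `ads.js`.

```javascript
const accountTypes = {
  direct: { domains: new Set(), count: 0 },
  reseller: { domains: new Set(), count: 0 },
};

// Invented sample records: [domain, relationship]
const records = [
  ['example.com', 'direct'],
  ['EXAMPLE.com', 'direct'],
  ['example.org', 'reseller'],
];

for (const [domain, type] of records) {
  accountTypes[type].domains.add(domain.toLowerCase()); // the Set de-duplicates domains
  accountTypes[type].count += 1;
}

console.log(JSON.stringify(new Set(['example.com']))); // {}

// Same pattern as the hunk: keep the unique count, drop the Set.
for (const accountType of Object.values(accountTypes)) {
  accountType.domain_count = accountType.domains.size;
  delete accountType.domains;
}

console.log(JSON.stringify(accountTypes));
// {"direct":{"count":2,"domain_count":1},"reseller":{"count":1,"domain_count":1}}
```

If the domain list itself were wanted, spreading the Set into an array (`[...accountType.domains]`, as the inline comment suggests) would serialise correctly.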
@@ -94,8 +94,7 @@ const parseAdsTxt = async (response) => {
}


//Extracts seller record mertrics.
//Standard Specification: https://iabtechlab.com/wp-content/uploads/2019/07/Sellers.json_Final.pdf
// https://iabtechlab.com/wp-content/uploads/2019/07/Sellers.json_Final.pdf
const parseSellersJSON = async (response) => {
let content;
try {
@@ -104,14 +103,15 @@ const parseSellersJSON = async (response) => {
content = null;
}
let result = {
present: isPresent(response, ['/ads.txt', '/app-ads.txt']),
present: isPresent(response, ['/sellers.json']),
redirected: response.redirected,
status: response.status,
};

if (result.present) {
if (result.present && content) {
result = {
...result, ...{
...result,
...{
seller_count: 0,
seller_types: {
publisher: {
@@ -132,39 +132,37 @@ const parseSellersJSON = async (response) => {
};

// Clean up file content
file_content_json = JSON.parse(file_content);
file_content_json.seller_count = file_content_json.sellers.length;
result.seller_count = content.sellers.length;

for (let seller of file_content_json.sellers) {
for (let seller of content.sellers) {
// Seller records
let type = seller.seller_type.trim().toLowerCase(),
domain = seller.domain.trim();
domain = seller.domain.trim().toLowerCase();
if (Object.keys(result.seller_types).includes(type)) {
result.seller_types[type].domains.add(domain);
result.seller_types[type].seller_count += 1;
}
result.seller_count += 1;

// Passthrough
if (seller.is_passthrough) {
result.passthrough_count += 1;
}
}
};

// Count unique and remove domain Sets for now
for (let seller_type of Object.values(result.seller_types)) {
seller_type.domain_count = seller_type.domains.size;
delete seller_type.domains //seller_type.domains = [...seller_type.domains];
}
// Count unique and remove domain Sets for now
for (let seller_type of Object.values(result.seller_types)) {
seller_type.domain_count = seller_type.domains.size;
delete seller_type.domains //seller_type.domains = [...seller_type.domains];
}
};

return result;
}

return Promise.all([
fetchAndParse("/ads.txt", parseAdsTxt),
fetchAndParse("/app-ads.txt", parseAdsTxt),
fetchAndParse("/sellers.json", parseSellersJSON),
fetchAndParse("/ads.txt", parseAdsTxt).catch(e => e),
fetchAndParse("/app-ads.txt", parseAdsTxt).catch(e => e),
fetchAndParse("/sellers.json", parseSellersJSON).catch(e => e),
]).then((all_data) => {
return JSON.stringify({
ads: all_data[0],
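Adding `.catch(e => e)` to each entry means one failing fetch no longer rejects the whole `Promise.all`. A minimal sketch of the behaviour, using two hypothetical stand-ins for `fetchAndParse` (whether `fetchAndParse` already handles its own errors is not visible in this diff):

```javascript
// Hypothetical stand-ins for fetchAndParse: one resolves, one rejects.
const ok = () => Promise.resolve({ present: true });
const broken = () => Promise.reject(new Error('timeout'));

// Without per-promise catch: the first rejection rejects the whole batch.
const withoutCatch = Promise.all([ok(), broken()])
  .then((all) => JSON.stringify(all))
  .catch((e) => `rejected: ${e.message}`);

// With per-promise catch: every slot settles, and the error object fills the failed slot.
const withCatch = Promise.all([ok().catch((e) => e), broken().catch((e) => e)])
  .then((all) => JSON.stringify(all));

withoutCatch.then((out) => console.log(out)); // rejected: timeout
withCatch.then((out) => console.log(out));    // [{"present":true},{}]
```

An `Error` serialises as `{}` because its `message` and `stack` are not enumerable, so a failed slot appears as an empty object in the final JSON rather than carrying the error text.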
20 changes: 3 additions & 17 deletions dist/privacy.js
@@ -39,16 +39,14 @@ return JSON.stringify({
* Privacy policies
* Wording sourced from: https://github.com/RUB-SysSec/we-value-your-privacy/blob/master/privacy_wording.json
* words = privacy_wording.map(country => country.words).filter((v, i, a) => a.indexOf(v) === i).flat().sort().join('|');
*
* Test site: https://www.theverge.com/
*/
privacy_wording_links: (() => {
let words =
'adatkezelési|adatvédelem|adatvédelmi|andmekaitsetingimused|aviso legal|beskyttelse af personlige oplysninger|cgu|cgv|confidentialitate|confidentialite|confidentialité|confidentialité|confidentialité|confidentialité|confidentialité|confidențialitate|cookie policy|cookie-uri|cookie-urilor|cookiepolitik|cookies|data policy|data policy|data policy|data policy|datapolicy|datapolitik|datenrichtlinie|datenrichtlinie|datenrichtlinie|datenrichtlinie|datenschutz|datenschutz|datenschutz|datenschutz|datenschutzbestimmungen|datenschutzrichtlinie|donnees personelles|gdpr|gegevensbeleid|gegevensbeleid|gizlilik|gizlilik|integritetspolicy|isikuandmete|isikuandmete töötlemise|kasutustingimused|kişisel verilerin korunması|kolačići|konfidencialiteti|konfidentsiaalsuse|kvkk|küpsised|mbrojtja e të dhënave|mentions légales|mentions légales|normativa sui dati|ochrana dat|ochrana osobních údajů|ochrana osobných údajov|ochrana soukromí|ochrana súkromia|ochrana udaju|ochrana údajov|ochrany osobných údajov|osobné údaje|personlige data|personoplysninger|personuppgifter|personvern|persónuvernd|piškotki|piškotkih|podmínky|policy|politica de utilizare|politika e të dhënave|politikat e privatesise|politikat e privatësisë|politique d’utilisation des données|politique d’utilisation des données|politique d’utilisation des données|politique d’utilisation des données|politique d’utilisation des données|política de dados|política de dados|política de datos|política de datos|pravila o upotrebi podataka|privaatsus|privacidad|privacidad|privacidade|privacidade|privacy|privacy|privacy|privacy|privacy|privacy policy|privacybeleid|privacybeleid|privatezza|privatlivspolitik|privatnost|privatnost|privatnosti|privatssphäre|privatumas|privatumo|privatësia|privātuma|privātums|protecció de dades|protecţia datelor|prywatnosci|prywatności|prywatność|regler om fortrolighed|rekisteriseloste|retningslinjer for data|rgpd|sekretess|slapukai|soukromi|soukromí|személyes adatok védelme|súkromie|sīkdatne|sīkdatņu|tietokäytäntö|tietosuoja|tietosuojakäytäntö|tietosuojaseloste|varstvo podatkov|veri i̇lkesi|veri i̇lkesi|veri politikası|vie privée|webbplatsen|yksityisyyden suoja|yksityisyydensuoja|yksityisyys|zasady dotyczące danych|zasebnost|zaštita podataka|zásady ochrany osobných|zásady používání dat|zásady používání dat|zásady využívania údajov|απόρρητο|απόρρητο|πολιτική απορρήτου|πολιτική δεδομένων|προσωπικά δεδομένα|όροι και γνωστοποιήσεις|конфиденциальность|конфіденційність|поверителност|политика за бисквитки|политика за данни|политика использования данных|политика конфиденциальности|политика о подацима|политика о подацима|политика о подацима|политика обработки персональных данных|приватност|приватност|приватност|условия|условия за ползване|מדיניות נתונים|פרטיות|الخصوصية|سياسة البيانات|数据使用政策|數據使用政策|私隱政策|隐私权政策';
let pattern = new RegExp('\\b(?:' + words + ')\\b', 'ig');

let privacy_links = Array.from(document.querySelectorAll('a')).map(
a => ({keywords: a.innerText.match(pattern), text: a.innerText})
a => ({ keywords: a.innerText.match(pattern), text: a.innerText })
).filter(a => a.keywords); // filter out non-matching texts (keywords = null)

return privacy_links;
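The `privacy_wording_links` metric builds one alternation pattern from the word list and keeps only the links whose text matches. A minimal sketch of that map-then-filter step, with a shortened word list and plain strings standing in for each anchor's `innerText` (both are assumptions for illustration):

```javascript
const words = 'cookie policy|privacy|datenschutz|απόρρητο'; // shortened sample of the list
const pattern = new RegExp('\\b(?:' + words + ')\\b', 'ig');

// Plain strings stand in for each anchor's innerText.
const linkTexts = ['Privacy Policy', 'Contact us', 'Cookie policy & privacy', 'Απόρρητο'];

const privacyLinks = linkTexts
  .map((text) => ({ keywords: text.match(pattern), text }))
  .filter((link) => link.keywords); // match() returns null when nothing matched

console.log(JSON.stringify(privacyLinks));
// [{"keywords":["Privacy"],"text":"Privacy Policy"},{"keywords":["Cookie policy","privacy"],"text":"Cookie policy & privacy"}]
```

In this sketch the Greek entry does not match: `\b` in JavaScript is defined in terms of ASCII word characters, so a word made up entirely of non-ASCII letters has no word boundary around it.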
@@ -71,7 +69,6 @@ return JSON.stringify({
if (consentData.present) {
// Standard command: 'getVendorConsents'
// cf. https://github.com/InteractiveAdvertisingBureau/GDPR-Transparency-and-Consent-Framework/blob/master/CMP%20JS%20API%20v1.1%20Final.md#what-api-will-need-to-be-provided-by-the-cmp-
// Test site: ?
window.__cmp('getVendorConsents', null, (result, success) => {
if (success) {
consentData.data = result;
@@ -98,8 +95,6 @@ return JSON.stringify({
/**
* IAB Transparency and Consent Framework v2
* docs v2: https://github.com/InteractiveAdvertisingBureau/GDPR-Transparency-and-Consent-Framework/blob/master/TCFv2
*
* Test site: https://www.rtl.de/
*/
iab_tcf_v2: (() => {
let tcData = {
@@ -137,8 +132,6 @@ return JSON.stringify({
/**
* IAB US Privacy User Signal Mechanism “USP API”
* https://github.com/InteractiveAdvertisingBureau/USPrivacy
*
* Test site: https://www.nfl.com/
*/
iab_usp: (() => {
let uspData = {
@@ -161,8 +154,6 @@ return JSON.stringify({
/**
* Ads Transparency Spotlight Data Disclosure schema
* Only for top frame, can't access child frames (same-origin policy)
*
* Test site: unknown
*/
ads_transparency_spotlight: (() => {
// Check `meta` tag cf. https://github.com/Ads-Transparency-Spotlight/documentation/blob/main/implement.md
@@ -178,10 +169,9 @@ return JSON.stringify({
})(),

/**
* FLoC
* FLoC (Federated Learning of Cohorts) - deprecated
*
* Test site: https://floc.glitch.me/
* Test site: https://www.pokellector.com/
*
* @todo Check function/variable accesses through string searches (wrappers cannot be used, as the metrics are only collected at the end of the test)
*/
@@ -190,16 +180,12 @@ return JSON.stringify({
/**
* Do Not Track (DNT)
* https://www.eff.org/issues/do-not-track
*
* Test site: https://www.theverge.com/
*/
navigator_doNotTrack: testPropertyStringInResponseBodies('navigator.+doNotTrack'),

/**
* Global Privacy Control
* https://globalprivacycontrol.org/
*
* Test site: https://global-privacy-control.glitch.me/
*/
navigator_globalPrivacyControl: testPropertyStringInResponseBodies(
'navigator.+globalPrivacyControl'
@@ -255,7 +241,7 @@ return JSON.stringify({
a => a.tagName === e.tagName && a.referrerpolicy === e.referrerpolicy
);
if (!found) {
acc.push({...e, count: 1});
acc.push({ ...e, count: 1 });
} else {
found.count += 1;
}
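The fragment above is the body of a grouping step that counts elements sharing the same tag name and referrer policy. Since the surrounding code is collapsed in this diff, the following is only a sketch of how such a reduce could look in full; the input array is invented for illustration.

```javascript
// Invented sample input: one entry per element carrying a referrer policy.
const elements = [
  { tagName: 'A', referrerpolicy: 'no-referrer' },
  { tagName: 'IMG', referrerpolicy: 'origin' },
  { tagName: 'A', referrerpolicy: 'no-referrer' },
];

const grouped = elements.reduce((acc, e) => {
  // Look for an existing group with the same tag name and policy.
  const found = acc.find(
    (a) => a.tagName === e.tagName && a.referrerpolicy === e.referrerpolicy
  );
  if (!found) {
    acc.push({ ...e, count: 1 }); // first occurrence starts a new group
  } else {
    found.count += 1;             // later occurrences bump the counter
  }
  return acc;
}, []);

console.log(JSON.stringify(grouped));
// [{"tagName":"A","referrerpolicy":"no-referrer","count":2},{"tagName":"IMG","referrerpolicy":"origin","count":1}]
```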