Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PPLT_3672: [Smart concurrency] Scaling up/down asset discovery queue #1849

Merged
merged 27 commits into from
Jan 29, 2025
Merged
Show file tree
Hide file tree
Changes from 6 commits
Commits
Show all changes
27 commits
Select commit Hold shift + click to select a range
a86fcca
feat: added functions to all the methods for cpuLoad and memoryUsage
this-is-shivamsingh Jan 23, 2025
2326495
feat: updates after testing this out on macOSX
this-is-shivamsingh Jan 23, 2025
bf7fb34
Merge branch 'master' into PPLT_3672
this-is-shivamsingh Jan 23, 2025
fabe159
fix: upload-artifact@v4
this-is-shivamsingh Jan 23, 2025
df28236
fix: download-artifact@v4
this-is-shivamsingh Jan 23, 2025
e9f8c38
fix: await 1sec for cpu load
this-is-shivamsingh Jan 23, 2025
7405009
feat: completed monitoring package
this-is-shivamsingh Jan 26, 2025
b9524e3
test: Test fix
this-is-shivamsingh Jan 26, 2025
e498058
test: monitoring coverage fix
this-is-shivamsingh Jan 26, 2025
5aed072
test: monitoring coverage fix 2
this-is-shivamsingh Jan 26, 2025
91ccc33
test: fixing cli-exec specs
this-is-shivamsingh Jan 26, 2025
eb0122b
test: fixing cli-exec specs 2
this-is-shivamsingh Jan 26, 2025
612c411
test: fixing cli-exec specs 3
this-is-shivamsingh Jan 26, 2025
46347ac
test: added tests for percy.test.js coverage fix
this-is-shivamsingh Jan 26, 2025
c4507bf
test: added tests for percy.test.js coverage fix 2
this-is-shivamsingh Jan 26, 2025
bcf6771
test: added tests for cli-exec and core
this-is-shivamsingh Jan 26, 2025
c3fa22f
refactor: update name from cpuload to cpuusage
this-is-shivamsingh Jan 26, 2025
8832dd8
Delete d.txt
this-is-shivamsingh Jan 27, 2025
13ccf15
refactor: resolving comments
this-is-shivamsingh Jan 27, 2025
351f600
test: Test fix
this-is-shivamsingh Jan 27, 2025
80dcf1c
chore: lint fix
this-is-shivamsingh Jan 27, 2025
9f8f020
test: Test fix 2
this-is-shivamsingh Jan 27, 2025
375d3c5
test: Test fix 3
this-is-shivamsingh Jan 27, 2025
25d92e3
test: coverage fix
this-is-shivamsingh Jan 27, 2025
b935501
test: coverage fix 3
this-is-shivamsingh Jan 27, 2025
06e0391
test: coverage fix 3
this-is-shivamsingh Jan 27, 2025
2965c17
feat: added env variable to disable only discovery concurrency change
this-is-shivamsingh Jan 27, 2025
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions .github/workflows/test.yml
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,7 @@ jobs:
${{ hashFiles('.github/.cache-key') }}/
- run: yarn
- run: yarn build
- uses: actions/upload-artifact@v3
- uses: actions/upload-artifact@v4
with:
name: dist
path: packages/*/dist
Expand Down Expand Up @@ -75,7 +75,7 @@ jobs:
restore-keys: >
${{ runner.os }}/node-${{ matrix.node }}/
${{ hashFiles('.github/.cache-key') }}/
- uses: actions/download-artifact@v3
- uses: actions/download-artifact@v4
with:
name: dist
path: packages
Expand Down
4 changes: 2 additions & 2 deletions .github/workflows/windows.yml
Original file line number Diff line number Diff line change
Expand Up @@ -28,7 +28,7 @@ jobs:
${{ hashFiles('.github/.cache-key') }}/
- run: yarn
- run: yarn build
- uses: actions/upload-artifact@v3
- uses: actions/upload-artifact@v4
with:
name: dist
path: packages/*/dist
Expand Down Expand Up @@ -74,7 +74,7 @@ jobs:
restore-keys: >
${{ runner.os }}/node-14/
${{ hashFiles('.github/.cache-key') }}/
- uses: actions/download-artifact@v3
- uses: actions/download-artifact@v4
with:
name: dist
path: packages
Expand Down
1 change: 1 addition & 0 deletions packages/core/package.json
Original file line number Diff line number Diff line change
Expand Up @@ -48,6 +48,7 @@
"@percy/dom": "1.30.7-beta.2",
"@percy/logger": "1.30.7-beta.2",
"@percy/webdriver-utils": "1.30.7-beta.2",
"@percy/monitoring": "1.30.7-beta.1",
"content-disposition": "^0.5.4",
"cross-spawn": "^7.0.3",
"extract-zip": "^2.0.1",
Expand Down
4 changes: 3 additions & 1 deletion packages/core/src/discovery.js
Original file line number Diff line number Diff line change
Expand Up @@ -355,7 +355,7 @@ async function* captureSnapshotResources(page, snapshot, options) {
// one concurrently. When skipping asset discovery, the callback is called immediately for each
// snapshot, also processing snapshot resources when not dry-running.
export async function* discoverSnapshotResources(queue, options, callback) {
let { snapshots, skipDiscovery, dryRun } = options;
let { snapshots, skipDiscovery, dryRun, checkAndUpdateConcurrency } = options;

yield* yieldAll(snapshots.reduce((all, snapshot) => {
debugSnapshotOptions(snapshot);
Expand All @@ -368,6 +368,8 @@ export async function* discoverSnapshotResources(queue, options, callback) {
callback(dryRun ? snap : processSnapshotResources(snap));
}
} else {
// update concurrency before pushing new job in discovery queue
checkAndUpdateConcurrency();
all.push(queue.push(snapshot, callback));
}

Expand Down
68 changes: 67 additions & 1 deletion packages/core/src/percy.js
Original file line number Diff line number Diff line change
Expand Up @@ -26,10 +26,15 @@ import {
discoverSnapshotResources,
createDiscoveryQueue
} from './discovery.js';
import Monitoring from '@percy/monitoring';
import { WaitForJob } from './wait-for-job.js';

const MAX_SUGGESTION_CALLS = 10;

// If no activity is done for 5 mins, we will stop monitoring
// system metric eg: (cpu load && memory usage)
const MONITOR_ACTIVITY_TIMEOUT = 300000;

// A Percy instance will create a new build when started, handle snapshot creation, asset discovery,
// and resource uploads, and will finalize the build when stopped. Snapshots are processed
// concurrently and the build is not finalized until all snapshots have been handled.
Expand Down Expand Up @@ -107,8 +112,14 @@ export class Percy {
this.browser = new Browser(this);

this.#discovery = createDiscoveryQueue(this);
this.discoveryMaxConcurrency = this.#discovery.concurrency;
this.#snapshots = createSnapshotsQueue(this);

this.monitoring = new Monitoring();
// used continue monitoring if there is activity going on
// if there is none, stop it
this.resetMonitoringId = null;

// generator methods are wrapped to autorun and return promises
for (let m of ['start', 'stop', 'flush', 'idle', 'snapshot', 'upload']) {
// the original generator can be referenced with percy.yield.<method>
Expand All @@ -117,6 +128,25 @@ export class Percy {
}
}

async configureSystemMonitor() {
await this.monitoring.startMonitoring();
this.resetSystemMonitor();
}

// Debouncing logic to only stop Monitoring system
// if there is no any activity for 5 mins
// means, no job is pushed in queue from 5 mins
resetSystemMonitor() {
if (this.resetMonitoringId) {
clearTimeout(this.resetMonitoringId);
this.resetMonitoringId = null;
}

this.resetMonitoringId = setTimeout(() => {
this.monitoring.stopMonitoring();
}, MONITOR_ACTIVITY_TIMEOUT);
}

// Shortcut for controlling the global logger's log level.
loglevel(level) {
return logger.loglevel(level);
Expand Down Expand Up @@ -167,6 +197,7 @@ export class Percy {
if (this.readyState != null) return;
this.readyState = 0;
this.cliStartTime = new Date().toISOString();
await this.configureSystemMonitor();

try {
if (process.env.PERCY_CLIENT_ERROR_LOGS !== 'false') {
Expand Down Expand Up @@ -298,13 +329,48 @@ export class Percy {
this.log.error(err);
throw err;
} finally {
// stop monitoring system metric, if not already stopped
this.monitoring.stopMonitoring();

// This issue doesn't comes under regular error logs,
// it's detected if we just and stop percy server
await this.checkForNoSnapshotCommandError();
await this.sendBuildLogs();
}
}

checkAndUpdateConcurrency() {
// start system monitoring if not already doing...
// TODO: Check this once, this might cause problem with async
// for cpu load we need 1 sec wait to get % cpu load
if (!this.monitoring.running) this.monitoring.startMonitoring();

const { cpuInfo, memoryUsageInfo } = this.monitoring.getMonitoringInfo();
if (cpuInfo.usagePercent >= 80 || memoryUsageInfo.usagePercent >= 80) {
let currentConcurrent = this.#discovery.concurrency;

// concurrency must be betweeen [1, (default/user defined value)]
let newConcurrency = Math.max(1, parseInt(currentConcurrent / 2));
newConcurrency = Math.min(this.discoveryMaxConcurrency, newConcurrency);

this.log.debug(`Downscaling discovery browser concurrency from ${this.#discovery.concurrency} to ${newConcurrency}`);
this.#discovery.set({ concurrency: newConcurrency });
} else if (cpuInfo.usagePercent <= 50 && memoryUsageInfo.usagePercent <= 50) {
let currentConcurrent = this.#discovery.concurrency;
let newConcurrency = currentConcurrent + 2;

// concurrency must be betweeen [1, (default/user-defined value)]
newConcurrency = Math.min(this.discoveryMaxConcurrency, newConcurrency);
newConcurrency = Math.max(1, newConcurrency);

this.log.debug(`Upscaling discovery browser concurrency from ${this.#discovery.concurrency} to ${newConcurrency}`);
this.#discovery.set({ concurrency: newConcurrency });
}

// reset timeout to stop monitoring after no-activity of 5 mins
this.resetSystemMonitor();
}

// Takes one or more snapshots of a page while discovering resources to upload with the resulting
// snapshots. Once asset discovery has completed for the provided snapshots, the queued task will
// resolve and an upload task will be queued separately.
Expand Down Expand Up @@ -351,7 +417,7 @@ export class Percy {
yield* discoverSnapshotResources(this.#discovery, {
skipDiscovery: this.skipDiscovery,
dryRun: this.dryRun,

checkAndUpdateConcurrency: this.checkAndUpdateConcurrency.bind(this),
snapshots: yield* gatherSnapshots(options, {
meta: { build: this.build },
config: this.config
Expand Down
37 changes: 37 additions & 0 deletions packages/monitoring/package.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,37 @@
{
"name": "@percy/monitoring",
"version": "1.30.7-beta.1",
"license": "MIT",
"repository": {
"type": "git",
"url": "https://github.com/percy/cli",
"directory": "packages/monitoring"
},
"publishConfig": {
"access": "public",
"tag": "beta"
},
"engines": {
"node": ">=14"
},
"files": [
"dist"
],
"main": "./dist/index.js",
"type": "module",
"exports": {
".": "./dist/index.js"
},
"scripts": {
"build": "node ../../scripts/build",
"lint": "eslint --ignore-path ../../.gitignore .",
"test": "node ../../scripts/test",
"test:coverage": "yarn test --coverage"
},
"dependencies": {
"@percy/config": "1.30.7-beta.1",
"@percy/sdk-utils": "1.30.7-beta.1",
"@percy/logger": "1.30.7-beta.1",
"systeminformation": "^5.25.11"
}
}
143 changes: 143 additions & 0 deletions packages/monitoring/src/cpu.js
Original file line number Diff line number Diff line change
@@ -0,0 +1,143 @@

import os from 'os';
import fs from 'fs';
import logger from '@percy/logger';

const CGROUP_CPU_STATS = '/sys/fs/cgroup/cput.stats';
const CGROUP_CPU_MAX = '/sys/fs/cgroup/cpu.max';
const CGROUP_FILES = [CGROUP_CPU_MAX, CGROUP_CPU_STATS];
const log = logger('monitoring:cpu');

async function getCPULoadInfo(os, { containerLevel, machineLevel = true }) {
try {
if (os.includes('linux') && cgroupExists()) {
return await getLinuxCPULoad();
} else {
return await getCPULoad();
}
} catch (error) {
// Don't raise this error to avoid user build failure
log.debug('Error: ', error);
return null;
}
}

function cgroupExists() {
// Check if cgroup files are avaiable for not
let cgroupExists = true;
for (const file in CGROUP_FILES) {
cgroupExists &&= fs.existsSync(file);
}
return cgroupExists;
}

/**
* reading cpu stats from cgroup files
*/
function readCpuStatFromCgroup() {
const content = fs.readFileSync(CGROUP_CPU_STATS, 'utf8');
const stats = {};

// Parse the cpu.stat file
content.split('\n').forEach(line => {
const [key, value] = line.trim().split(' ');
if (key && value) {
stats[key] = parseInt(value);
}
});

return stats;
}

/**
* linux cpu load is calculated is same for
* containerLevel and machineLevel
*/
async function getLinuxCPULoad() {
try {
// Read cpu.max for quota and period
const cpuMaxContent = fs.readFileSync(CGROUP_CPU_MAX, 'utf8');
const [quotaStr, periodStr] = cpuMaxContent.trim().split(' ');
let quota, period, availableCPUs;

if (quotaStr === 'max') {
// No quota set, use the number of physical CPUs
const physicalCPUs = os.cpus().length;
quota = null; // Indicate no quota
period = null; // Indicate no period
availableCPUs = physicalCPUs;
} else {
// Parse quota and period values
quota = parseInt(quotaStr);
period = parseInt(periodStr);
availableCPUs = quota / period;
}

// Get first CPU usage reading
const startStats = readCpuStatFromCgroup();

// Wait for 1 second, ie 10^6 microsecod
await new Promise(resolve => setTimeout(resolve, 1000));

// Get second CPU usage reading
const endStats = readCpuStatFromCgroup();

// Calculate CPU usage
const usageDelta = endStats.usage_usec - startStats.usage_usec;

// Calculate percentage (similar to previous implementation)
// usageDelta is in microseconds, we measured over 1 second (1_000_000 microseconds)
// NOTE: cpuPercentage can be > 100% as something process can use
// more than allocated cpu range
const cpuPercent = (usageDelta / (1_000_000 * availableCPUs)) * 100;

return {
availableCPUs,
usagePercent: cpuPercent
};
} catch (error) {
// TODO: Log error here
return null;
}
}

async function computeCpuUsageStats() {
let totalTickTime = 0;
let totalIdleTime = 0;
os.cpus().forEach(cpu => {
for (let type in cpu.times) {
totalTickTime += cpu.times[type];
}
totalIdleTime += cpu.times.idle;
});
return { totalTickTime, totalIdleTime };
};

/**
* this is a fallback method, if
* 1. For LINUX, we can't find cgroup details
* 2. For Win/OSX operating system
*/
async function getCPULoad() {
const initialCpuUsage = await computeCpuUsageStats();
// wait for 1 second, to connect cpu load for 10^6 micro-second
// ie. 1 second
await new Promise((res) => setTimeout(() => res(), 1000));

const finalCpuUsage = await computeCpuUsageStats();

// Calculate differences
const deltaTick = finalCpuUsage.totalTickTime - initialCpuUsage.totalTickTime;
const deltaIdle = finalCpuUsage.totalIdleTime - initialCpuUsage.totalIdleTime;

// Calculate % usage
const cpuUsagePercent = (1 - deltaIdle / deltaTick) * 100;
return {
availableCPUs: os.cpus().length,
usagePercent: cpuUsagePercent
};
}

export {
getCPULoadInfo
};
Loading
Loading