Extend on the System Tests (#1182)
* doc(system-tests): document the existence of the system tests

* doc-fix(linting-wiki): fix broken paths

* test(system-test): showcase scenarios

* refactor(system-test): increase hook timeout
EagleoutIce authored Nov 30, 2024
1 parent 0a57e68 commit 6ba4ce4
Showing 7 changed files with 169 additions and 68 deletions.
10 changes: 6 additions & 4 deletions src/cli/repl/commands/repl-dataflow.ts
@@ -12,14 +12,17 @@ async function dataflow(shell: RShell, remainingLine: string) {
}).allRemainingSteps();
}

function handleString(code: string): string {
return code.startsWith('"') ? JSON.parse(code) as string : code;
}

export const dataflowCommand: ReplCommand = {
description: `Get mermaid code for the dataflow graph of R code, start with '${fileProtocol}' to indicate a file`,
usageExample: ':dataflow',
aliases: [ 'd', 'df' ],
script: false,
fn: async(output, shell, remainingLine) => {
const result = await dataflow(shell, remainingLine);

const result = await dataflow(shell, handleString(remainingLine));
output.stdout(graphToMermaid({ graph: result.dataflow.graph, includeEnvironments: false }).string);
}
};
@@ -30,8 +33,7 @@ export const dataflowStarCommand: ReplCommand = {
aliases: [ 'd*', 'df*' ],
script: false,
fn: async(output, shell, remainingLine) => {
const result = await dataflow(shell, remainingLine);

const result = await dataflow(shell, handleString(remainingLine));
output.stdout(graphToMermaidUrl(result.dataflow.graph, false));
}
};
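The `handleString` helper introduced above can be exercised in isolation; a minimal sketch, relying only on standard `JSON.parse` semantics:

```typescript
// Mirrors the helper from the diff above: a REPL argument that starts with a
// double quote is treated as a JSON string literal, so users can pass
// multi-line programs such as "x <- 3\nprint(x)" on a single REPL line.
function handleString(code: string): string {
	return code.startsWith('"') ? JSON.parse(code) as string : code;
}

console.log(handleString('x <- 3'));              // unquoted input passes through unchanged
console.log(handleString('"x <- 3\\nprint(x)"')); // quoted input is unescaped into two lines
```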
10 changes: 6 additions & 4 deletions src/cli/repl/commands/repl-normalize.ts
@@ -12,14 +12,17 @@ async function normalize(shell: RShell, remainingLine: string) {
}).allRemainingSteps();
}

function handleString(code: string): string {
return code.startsWith('"') ? JSON.parse(code) as string : code;
}

export const normalizeCommand: ReplCommand = {
description: `Get mermaid code for the normalized AST of R code, start with '${fileProtocol}' to indicate a file`,
usageExample: ':normalize',
aliases: [ 'n' ],
script: false,
fn: async(output, shell, remainingLine) => {
const result = await normalize(shell, remainingLine);

const result = await normalize(shell, handleString(remainingLine));
output.stdout(normalizedAstToMermaid(result.normalize.ast));
}
};
@@ -30,8 +33,7 @@ export const normalizeStarCommand: ReplCommand = {
aliases: [ 'n*' ],
script: false,
fn: async(output, shell, remainingLine) => {
const result = await normalize(shell, remainingLine);

const result = await normalize(shell, handleString(remainingLine));
output.stdout(normalizedAstToMermaidUrl(result.normalize.ast));
}
};
70 changes: 48 additions & 22 deletions src/documentation/print-linting-and-testing-wiki.ts
@@ -2,17 +2,19 @@ import { setMinLevelOfAllLogs } from '../../test/functionality/_helper/log';
import { LogLevel } from '../util/log';
import { codeBlock } from './doc-util/doc-code';
import { FlowrCodecovRef, FlowrDockerRef, FlowrGithubBaseRef, FlowrSiteBaseRef, FlowrWikiBaseRef, getFilePathMd, RemoteFlowrFilePathBaseRef } from './doc-util/doc-files';
import { block } from './doc-util/doc-structure';

function getText() {
return `
For the latest code-coverage information, see [codecov.io](${FlowrCodecovRef}),
For the latest code coverage information, see [codecov.io](${FlowrCodecovRef}),
for the latest benchmark results, see the [benchmark results](${FlowrSiteBaseRef}/wiki/stats/benchmark) wiki page.
- [Testing Suites](#testing-suites)
- [Functionality Tests](#functionality-tests)
- [Test Structure](#test-structure)
- [Writing a Test](#writing-a-test)
- [Running Only Some Tests](#running-only-some-tests)
- [System Tests](#system-tests)
- [Performance Tests](#performance-tests)
- [Oh no, the tests are slow](#oh-no-the-tests-are-slow)
- [Testing Within Your IDE](#testing-within-your-ide)
@@ -25,8 +27,10 @@ for the latest benchmark results, see the [benchmark results](${FlowrSiteBaseRef
## Testing Suites
Currently, flowR contains two testing suites: one for [functionality](#functionality-tests) and one for [performance](#performance-tests). We explain each of them in the following.
In addition to running those tests, you can use the more generalized \`npm run checkup\`. This will include the construction of the docker image, the generation of the wiki pages, and the linter.
Currently, flowR contains three testing suites: one for [functionality](#functionality-tests),
one for [system tests](#system-tests), and one for [performance](#performance-tests). We explain each of them in the following.
In addition to running those tests, you can use the more generalized \`npm run checkup\`.
This command includes the construction of the docker image, the generation of the wiki pages, and the linter.
### Functionality Tests
@@ -37,8 +41,8 @@ You can run the tests by issuing:
${codeBlock('shell', 'npm run test')}
Within the commandline,
this should automatically drop you into a watch mode which will automatically re-run the tests if you change the code.
If, at any time there are too many errors, you can use \`--bail=<value>\` to stop the tests after a certain number of errors.
this should automatically drop you into a watch mode which will automatically re-run (potentially) affected tests if you change the code.
If, at any time there are too many errors for you to comprehend, you can use \`--bail=<value>\` to stop the tests after a certain number of errors.
For example:
${codeBlock('shell', 'npm run test -- --bail=1')}
@@ -51,23 +55,29 @@ To run all tests, including a coverage report and label summary, run:
${codeBlock('shell', 'npm run test-full')}
However, depending on your local R version, your network connection and potentially other factors, some tests may be skipped automatically as they don't apply to your current system setup
(or can't be tested with the current prerequisites).
However, depending on your local version of&nbsp;R, your network connection, and other factors (each test may have a set of criteria),
some tests may be skipped automatically as they do not apply to your current system setup (or can not be tested with the current prerequisites).
Each test can specify such requirements as part of the \`TestConfiguration\`, which is then used in the \`test.skipIf\` function of _vitest_.
It is up to the [ci](#ci-pipeline) to run the tests on different systems to ensure that those tests are ensured to run.
It is up to the [ci](#ci-pipeline) to run the tests on different systems to ensure that those tests run.
#### Test Structure
All functionality tests are to be located under [test/functionality](${RemoteFlowrFilePathBaseRef}test/functionality).
This folder contains three special and important elements:
- \`test-setup\` which is the entry point if *all* tests are run. It should automatically disable logging statements and configure global variables (e.g., if installation tests should run).
- \`_helper\` which contains helper functions to be used by other tests.
- \`test-summary\` which may produce a summary of the covered capabilities.
- \`test-setup.ts\` which is the entry point if *all* tests are run. It should automatically disable logging statements and configure global variables (e.g., if installation tests should run).
- \`_helper/\` folder which contains helper functions to be used by other tests.
- \`test-summary.ts\` which may produce a summary of the covered capabilities.
We name all tests using the \`.test.ts\` suffix and try to run them in parallel.
Whenever this is not possible (e.g., when using \`withShell\`), please use \`describe.sequential\` to disable parallel execution for the respective test.
${block({
type: 'WARNING',
content: `
We name all test files using the \`.test.ts\` suffix and try to run them in parallel.
Whenever this is not possible (e.g., when using \`withShell\`), please use \`describe.sequential\`
to disable parallel execution for the respective test (otherwise, such tests are flaky).
`
})}
#### Writing a Test
Expand All @@ -86,10 +96,11 @@ assertDataflow(label('simple variable', ['name-normal']), shell,
`)}
When writing dataflow tests, additional settings can be used to reduce the amount of graph data that needs to be pre-written. Notably:
- \`expectIsSubgraph\` indicates that the expected graph is a subgraph, rather than the full graph that the test should generate. The test will then only check if the supplied graph is contained in the result graph, rather than an exact match.
- \`resolveIdsAsCriterion\` indicates that the ids given in the expected (sub)graph should be resolved as [slicing criteria](${FlowrWikiBaseRef}/Terminology#slicing-criterion) rather than actual ids. For example, passing \`12@a\` as an id in the expected (sub)graph will cause it to be resolved as the corresponding id.
The following example shows both in use.
The following example shows both in use:
${codeBlock('typescript', `
assertDataflow(label('without distractors', [...OperatorDatabase['<-'].capabilities, 'numbers', 'name-normal', 'newlines', 'name-escaped']),
shell, '\`a\` <- 2\\na',
@@ -108,10 +119,25 @@ assertDataflow(label('without distractors', [...OperatorDatabase['<-'].capabilit
To run only some tests, vitest allows you to [filter](https://vitest.dev/guide/filtering.html) tests.
Besides, you can use the watch mode (with \`npm run test\`) to only run tests that are affected by your changes.
### System Tests
In contrast to the [functionality tests](#functionality-tests), the system tests use runners like the \`npm\` scripts
to test the behavior of the whole system, for example, by running the CLI or the server.
They are slower and hence not part of \`npm run test\` but can be run using:
${codeBlock('shell', 'npm run test:system')}
To work, they require you to set up your system correctly (e.g., have \`npm\` available on your path).
The CI environment will make sure of that. At the moment, these tests are not labeled and only intended
to check basic availability of *flowR*'s core features (as we test the functionality of these features in dedicated detail
with the [functionality tests](#functionality-tests)).
Have a look at the [test/system-tests](${RemoteFlowrFilePathBaseRef}test/system-tests) folder for more information.
### Performance Tests
The performance test suite of *flowR* uses several suites to check for variations in the required times for certain steps.
Although we measure wall time in the CI (which is subject to rather large variations), it should give a rough idea of the performance of *flowR*.
Although we measure wall time in the CI (which is subject to rather large variations), it should give a rough idea of *flowR*'s performance.
Furthermore, the respective scripts can be used locally as well.
To run them, issue:
@@ -143,18 +169,18 @@ Please follow the official guide [here](https://www.jetbrains.com/help/webstorm/
## CI Pipeline
We have several workflows defined in [.github/workflows](../.github/workflows/).
We have several workflows defined in [.github/workflows](${RemoteFlowrFilePathBaseRef}/.github/workflows/).
We explain the most important workflows in the following:
- [qa.yaml](../.github/workflows/qa.yaml) is the main workflow that will run different steps depending on several factors. It is responsible for:
- [qa.yaml](${RemoteFlowrFilePathBaseRef}/.github/workflows/qa.yaml) is the main workflow that will run different steps depending on several factors. It is responsible for:
- running the [functionality](#functionality-tests) and [performance tests](#performance-tests)
- uploading the results to the [benchmark page](${FlowrSiteBaseRef}/wiki/stats/benchmark) for releases
- running the [functionality tests](#functionality-tests) on different operating systems (Windows, macOS, Linux) and with different versions of R
- reporting code coverage
- running the [linter](#linting) and reporting its results
- deploying the documentation to [GitHub Pages](${FlowrSiteBaseRef}/doc/)
- [release.yaml](../.github/workflows/release.yaml) is responsible for creating a new release, only to be run by repository owners. Furthermore, it adds the new docker image to [docker hub](${FlowrDockerRef}).
- [broken-links-and-wiki.yaml](../.github/workflows/broken-links-and-wiki.yaml) repeatedly tests that all links are not dead!
- [release.yaml](${RemoteFlowrFilePathBaseRef}/.github/workflows/release.yaml) is responsible for creating a new release, only to be run by repository owners. Furthermore, it adds the new docker image to [docker hub](${FlowrDockerRef}).
- [broken-links-and-wiki.yaml](${RemoteFlowrFilePathBaseRef}/.github/workflows/broken-links-and-wiki.yaml) repeatedly tests that all links are not dead!
## Linting
@@ -163,11 +189,11 @@ The main one:
${codeBlock('shell', 'npm run lint')}
And a weaker version of the first (allowing for *todo* comments) which is run automatically in the [pre-push githook](../.githooks/pre-push) as explained in the [CONTRIBUTING.md](../.github/CONTRIBUTING.md):
And a weaker version of the first (allowing for *todo* comments) which is run automatically in the [pre-push githook](${RemoteFlowrFilePathBaseRef}/.githooks/pre-push) as explained in the [CONTRIBUTING.md](${RemoteFlowrFilePathBaseRef}/.github/CONTRIBUTING.md):
${codeBlock('shell', 'npm run lint-local')}
Besides checking coding style (as defined in the [package.json](../package.json)), the *full* linter runs the [license checker](#license-checker).
Besides checking coding style (as defined in the [package.json](${RemoteFlowrFilePathBaseRef}/package.json)), the *full* linter runs the [license checker](#license-checker).
In case you are unaware,
eslint can [automatically fix several linting problems](https://eslint.org/docs/latest/use/command-line-interface#fix-problems).
@@ -185,7 +211,7 @@ However, in case you think that the linter is wrong, please do not hesitate to o
### License Checker
*flowR* is licensed under the [GPLv3 License](${FlowrGithubBaseRef}/flowr/blob/main/LICENSE) requiring us to only rely on [compatible licenses](https://www.gnu.org/licenses/license-list.en.html). For now, this list is hardcoded as part of the npm [\`license-compat\`](../package.json) script so it can very well be that a new dependency you add causes the checker to fail &mdash; *even though it is compatible*. In that case, please either open a [new issue](${FlowrGithubBaseRef}/flowr/issues/new/choose) or directly add the license to the list (including a reference to why it is compatible).
*flowR* is licensed under the [GPLv3 License](${FlowrGithubBaseRef}/flowr/blob/main/LICENSE) requiring us to only rely on [compatible licenses](https://www.gnu.org/licenses/license-list.en.html). For now, this list is hardcoded as part of the npm [\`license-compat\`](${RemoteFlowrFilePathBaseRef}/package.json) script so it can very well be that a new dependency you add causes the checker to fail &mdash; *even though it is compatible*. In that case, please either open a [new issue](${FlowrGithubBaseRef}/flowr/issues/new/choose) or directly add the license to the list (including a reference to why it is compatible).
`;
}

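The system tests documented in this file drive *flowR*'s CLI as a separate process and assert on its printed output. A minimal sketch of that spawn-and-collect pattern (the `node -e` command below is a hypothetical stand-in for the actual `npm`-based invocation used by the tests):

```typescript
import { spawnSync } from 'node:child_process';

// Run a CLI, feed it the given lines on stdin, and return everything it
// printed to stdout.
function runCli(command: string, args: string[], lines: string[]): string {
	const result = spawnSync(command, args, {
		input:    lines.join('\n') + '\n',
		encoding: 'utf8'
	});
	return result.stdout;
}

// Stand-in "REPL": a node one-liner that echoes its stdin back.
const output = runCli('node', ['-e', 'process.stdin.pipe(process.stdout)'], ['x <- 3']);
console.log(output.includes('x <- 3')); // true
```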
2 changes: 1 addition & 1 deletion test/functionality/r-bridge/normalize-ast-fold.test.ts
@@ -7,7 +7,7 @@ import type { NormalizedAst } from '../../../src/r-bridge/lang-4.x/ast/model/pro
import type { RBinaryOp } from '../../../src/r-bridge/lang-4.x/ast/model/nodes/r-binary-op';
import type { RExpressionList } from '../../../src/r-bridge/lang-4.x/ast/model/nodes/r-expression-list';

describe('normalize-visitor', withShell(shell => {
describe.sequential('normalize-visitor', withShell(shell => {
let normalized: NormalizedAst | undefined;
let mathAst: NormalizedAst | undefined;
beforeAll(async() => {
71 changes: 56 additions & 15 deletions test/system-tests/repl.test.ts
@@ -1,21 +1,62 @@
import { assert, describe, test } from 'vitest';
import { assert, beforeAll, describe, test } from 'vitest';
import { flowrRepl } from './utility/utility';
import { graphToMermaid, graphToMermaidUrl } from '../../src/util/mermaid/dfg';
import type { PipelineOutput } from '../../src/core/steps/pipeline/pipeline';
import { DEFAULT_DATAFLOW_PIPELINE } from '../../src/core/steps/pipeline/default-pipelines';
import { PipelineExecutor } from '../../src/core/pipeline-executor';
import { requestFromInput } from '../../src/r-bridge/retriever';
import { withShell } from '../functionality/_helper/shell';
import type { RShell } from '../../src/r-bridge/shell';
import type { DataflowGraph } from '../../src/dataflow/graph/graph';
import { emptyGraph } from '../../src/dataflow/graph/dataflowgraph-builder';
import type { NormalizedAst } from '../../src/r-bridge/lang-4.x/ast/model/processing/decorate';
import { normalizedAstToMermaid, normalizedAstToMermaidUrl } from '../../src/util/mermaid/ast';

describe('repl', () => {
test(':df', async() => {
const output = await flowrRepl([':df test', ':quit']);
assert.include(output, 'flowchart');
});

test(':df x <- 3', async() => {
const output = await flowrRepl([':df x <- 3 ', ':quit']);
assert.include(output, 'flowchart');
});

test(':df "x <- 3\nprint(x)"', async() => {
const output = await flowrRepl([':df "x <- 3\\nprint(x)"', ':quit']);
assert.include(output, 'flowchart');
});
async function analyze(shell: RShell, code: string): Promise<PipelineOutput<typeof DEFAULT_DATAFLOW_PIPELINE>> {
return await new PipelineExecutor(DEFAULT_DATAFLOW_PIPELINE, {
shell,
request: requestFromInput(code)
}).allRemainingSteps();
}
describe.sequential('inspection', withShell(shell => {
for(const [code, str] of [
['test', false],
['x <- 3', false],
['x <- 3\nprint(x)', true],
['x <- 2\nif(u) { x <- 3 } else { x <- 4 }\nprint(x)', true],
['x <- 2;y <- "hello";print(paste(x,y))', true],
] as const) {
const replCode = str ? JSON.stringify(code) : code;
describe(replCode, () => {
let dfOut: DataflowGraph = emptyGraph();
let normalized: NormalizedAst | undefined = undefined;
let output = '';
beforeAll(async() => {
const data = await analyze(shell, code);
dfOut = data.dataflow.graph;
normalized = data.normalize;
output = await flowrRepl([`:df ${replCode}`, `:n ${replCode}`, `:df* ${replCode}`, `:n* ${replCode}`, ':quit']);
});
test(':df', () => {
const expect = graphToMermaid({ graph: dfOut }).string;
assert.include(output, expect, `output ${output} does not contain ${expect}`);
});
test(':df*', () => {
const expect = graphToMermaidUrl(dfOut);
assert.include(output, expect, `output ${output} does not contain ${expect}`);
});
test(':n', () => {
const expect = normalized ? normalizedAstToMermaid(normalized.ast) : '';
assert.include(output, expect, `output ${output} does not contain ${expect}`);
});
test(':n*', () => {
const expect = normalized ? normalizedAstToMermaidUrl(normalized.ast) : '';
assert.include(output, expect, `output ${output} does not contain ${expect}`);
});
});
}
}));

test(':slicer', async() => {
const output = await flowrRepl([':slicer -c "3@a" -r "a <- 3\\nb <- 4\\nprint(a)"', ':quit']);
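The test matrix above quotes multi-line programs with `JSON.stringify` before handing them to the REPL, and the REPL-side helper undoes this with `JSON.parse`. The round-trip can be checked directly; a sketch, where `handleString` copies the helper from the repl-dataflow diff:

```typescript
// REPL-side helper from the diff: unquote JSON-string arguments.
function handleString(code: string): string {
	return code.startsWith('"') ? JSON.parse(code) as string : code;
}

// Test-side quoting, as in `const replCode = str ? JSON.stringify(code) : code;`
const program = 'x <- 3\nprint(x)';
const replCode = JSON.stringify(program); // escapes the newline for a single REPL line
console.log(handleString(replCode) === program); // true: the round-trip restores the input
```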
1 change: 1 addition & 0 deletions test/system-tests/vitest.config.mts
@@ -3,6 +3,7 @@ import { defineConfig } from 'vitest/config';
export default defineConfig({
test: {
testTimeout: 60 * 2000,
hookTimeout: 60 * 2000,
sequence: {
/* each test file that does not support parallel execution will be executed in sequence by stating this explicitly */
concurrent: true,
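With the added line, the system-test configuration looks roughly like this (a sketch of the file after the diff, with comments added; not the verbatim file contents):

```typescript
import { defineConfig } from 'vitest/config';

export default defineConfig({
	test: {
		// testTimeout bounds each test body; hookTimeout bounds setup/teardown
		// hooks such as the beforeAll that drives the REPL once per scenario.
		testTimeout: 60 * 2000,
		hookTimeout: 60 * 2000,
		sequence: {
			/* each test file that does not support parallel execution will be
			   executed in sequence by stating this explicitly */
			concurrent: true
		}
	}
});
```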

2 comments on commit 6ba4ce4

@github-actions

"artificial" Benchmark Suite

| Benchmark suite | Current: 6ba4ce4 | Previous: db7ac2e | Ratio |
|---|---|---|---|
| Retrieve AST from R code | 234.2262099090909 ms (97.51691610600616) | 242.24415118181818 ms (104.26310147093089) | 0.97 |
| Normalize R AST | 18.66348390909091 ms (35.446413371140316) | 17.39715159090909 ms (31.09452227247912) | 1.07 |
| Produce dataflow information | 59.84816036363637 ms (127.76721455375782) | 60.33616290909091 ms (127.75990956872062) | 0.99 |
| Total per-file | 817.6554442727272 ms (1477.1416260434685) | 833.5976102727273 ms (1502.9439844838432) | 0.98 |
| Static slicing | 2.0646184341069107 ms (1.1971295569064917) | 2.029558401836784 ms (1.2081663161038019) | 1.02 |
| Reconstruct code | 0.22602289296638017 ms (0.17082576645491085) | 0.23538053186242114 ms (0.17968366955783688) | 0.96 |
| Total per-slice | 2.3042351016436817 ms (1.2580349197253093) | 2.278853130617715 ms (1.2795408953549086) | 1.01 |
| failed to reconstruct/re-parse | 0 # | 0 # | 1 |
| times hit threshold | 0 # | 0 # | 1 |
| reduction (characters) | 0.7891949660994808 # | 0.7891949660994808 # | 1 |
| reduction (normalized tokens) | 0.7665650684287274 # | 0.7665650684287274 # | 1 |
| memory (df-graph) | 95.46617542613636 KiB (244.77619956879823) | 95.46617542613636 KiB (244.77619956879823) | 1 |

This comment was automatically generated by workflow using github-action-benchmark.

@github-actions

"social-science" Benchmark Suite

| Benchmark suite | Current: 6ba4ce4 | Previous: db7ac2e | Ratio |
|---|---|---|---|
| Retrieve AST from R code | 247.45438072 ms (46.28515654697999) | 237.81388206 ms (43.26435892465071) | 1.04 |
| Normalize R AST | 19.2054152 ms (14.737214629853948) | 18.584648960000003 ms (13.710753719051512) | 1.03 |
| Produce dataflow information | 73.62953884000001 ms (70.34537334890321) | 71.41246154000001 ms (65.6106108648961) | 1.03 |
| Total per-file | 7668.51287744 ms (28798.584942205558) | 7625.11364774 ms (29095.600821943466) | 1.01 |
| Static slicing | 15.82304150720786 ms (44.20335531355589) | 15.746698561898643 ms (44.50249372495068) | 1.00 |
| Reconstruct code | 0.27967390761290517 ms (0.16564402957858926) | 0.24344144772080362 ms (0.13994668646932817) | 1.15 |
| Total per-slice | 16.11106168218992 ms (44.24369043188509) | 15.997988500166153 ms (44.53599655284285) | 1.01 |
| failed to reconstruct/re-parse | 0 # | 0 # | 1 |
| times hit threshold | 0 # | 0 # | 1 |
| reduction (characters) | 0.8762109251198998 # | 0.8762109251198998 # | 1 |
| reduction (normalized tokens) | 0.819994064355517 # | 0.819994064355517 # | 1 |
| memory (df-graph) | 99.526015625 KiB (113.60201607005874) | 99.526015625 KiB (113.60201607005874) | 1 |

This comment was automatically generated by workflow using github-action-benchmark.
