Add notebook params and debugger #68

20 changes: 20 additions & 0 deletions .vscode/example_launch.json
@@ -0,0 +1,20 @@
{
"version": "0.2.0",
"configurations": [
{
"name": "Python: Current File",
"type": "python",
"request": "launch",
"console": "integratedTerminal",
"python": "python3",
"module": "cli.nuttercli",
"args": [
"run",
"<Add test pattern here>",
"--cluster_id",
"<Add cluster_id here>",
"--notebook_params",
"{\"example_key_1\": \"example_value_1\", \"example_key_2\": \"example_value_2\"}"
]
}]
}
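The `--notebook_params` value in this launch configuration is a JSON object stored inside a JSON string, which is why every inner quote carries a backslash. A minimal sketch of the two decoding steps, assuming the CLI ends up with a dictionary (as the `isinstance(notebook_params, dict)` check in `common/apiclient.py` requires). The variable names are illustrative and not part of the repository:

```python
import json

# A trimmed copy of the "args" array from the launch configuration above.
launch_fragment = r'''
{
  "args": [
    "--notebook_params",
    "{\"example_key_1\": \"example_value_1\", \"example_key_2\": \"example_value_2\"}"
  ]
}
'''

# First decode: the launch file itself. The backslashes are consumed here,
# leaving an ordinary JSON object encoded as a string.
args = json.loads(launch_fragment)["args"]
raw_params = args[1]

# Second decode: the argument value becomes a Python dictionary.
notebook_params = json.loads(raw_params)
print(notebook_params)
```

If the second decode fails, the escaping in the launch file is the first thing to check.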
5 changes: 3 additions & 2 deletions .vscode/settings.json
@@ -1,9 +1,10 @@
{
- "python.pythonPath": "/usr/bin/python3",
+ "python.pythonPath": "/usr/bin/python3",
"python.testing.pytestArgs": [
"tests"
],
"python.testing.unittestEnabled": false,
"python.testing.nosetestsEnabled": false,
- "python.testing.pytestEnabled": true
+ "python.testing.pytestEnabled": true,
+ "python.envFile": "${workspaceFolder}/.env"
}
15 changes: 11 additions & 4 deletions README.md
@@ -254,10 +254,10 @@ The ```run``` command schedules the execution of test notebooks and waits for t

### Run single test notebook

- The following command executes the test notebook ```/dataload/test_sourceLoad``` in the cluster ```0123-12334-tonedabc```.
+ The following command executes the test notebook ```/dataload/test_sourceLoad``` in the cluster ```0123-12334-tonedabc``` and passes it the notebook parameters ```{"example_key_1": "example_value_1", "example_key_2": "example_value_2"}``` (note that the inner double quotes must be escaped):

```bash
- nutter run dataload/test_sourceLoad --cluster_id 0123-12334-tonedabc
+ nutter run dataload/test_sourceLoad --cluster_id 0123-12334-tonedabc --notebook_params "{\"example_key_1\": \"example_value_1\", \"example_key_2\": \"example_value_2\"}"
```

__Note:__ In Azure Databricks you can get the cluster ID by selecting a cluster name from the Clusters tab and clicking on the JSON view.
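The escaping in the README command exists only to get the JSON text through the shell intact. A small sketch of what arrives at the CLI, using `shlex.split` as an approximation of POSIX shell quoting (Windows shells quote differently and are not covered here). The single-quoted spelling is shown as an equivalent alternative for POSIX shells; the variable names are illustrative:

```python
import json
import shlex

json_text = '{"example_key_1": "example_value_1", "example_key_2": "example_value_2"}'
prefix = ("nutter run dataload/test_sourceLoad "
          "--cluster_id 0123-12334-tonedabc --notebook_params ")

# Spelling used in the README: double quotes outside, escaped quotes inside.
escaped = prefix + '"' + json_text.replace('"', '\\"') + '"'
# Alternative for POSIX shells: single quotes outside, no escaping inside.
single_quoted = prefix + "'" + json_text + "'"

# shlex.split follows POSIX quoting rules, so it approximates the argument
# list that bash would hand to the nutter process.
argv_escaped = shlex.split(escaped)
argv_single = shlex.split(single_quoted)

# Both spellings deliver the same final argument: plain JSON text.
params = json.loads(argv_escaped[-1])
print(argv_escaped == argv_single, params)
```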
@@ -267,10 +267,10 @@ __Note:__ In Azure Databricks you can get the cluster ID by selecting a cluster
The Nutter CLI supports the execution of multiple notebooks via name pattern matching. The Nutter CLI applies the pattern to the name of test notebook **without** the *test_* prefix. The CLI also expects that you omit the prefix when specifying the pattern.


- Say the *dataload* folder has the following test notebooks: *test_srcLoad* and *test_srcValidation*. The following command will result in the execution of both tests.
+ Say the *dataload* folder has the following test notebooks: *test_srcLoad* and *test_srcValidation*. The following command executes both tests, passing each one the notebook parameters ```{"example_key_1": "example_value_1", "example_key_2": "example_value_2"}```.

```bash
- nutter run dataload/src* --cluster_id 0123-12334-tonedabc
+ nutter run dataload/src* --cluster_id 0123-12334-tonedabc --notebook_params "{\"example_key_1\": \"example_value_1\", \"example_key_2\": \"example_value_2\"}"
```

In addition, if you have tests in a hierarchical folder structure, you can recursively execute all tests by setting the ```--recursive``` flag.
@@ -316,6 +316,10 @@ FLAGS
--max_parallel_tests Sets the level of parallelism for test notebook execution.
--recursive Executes all tests in the hierarchical folder structure.
--poll_wait_time Polling interval duration for notebook status. Default is 5 (5 seconds).
+ --notebook_params Passes parameters from the CLI to the test notebook as key-value pairs.
+ Inside the notebook, read each parameter with
+ dbutils.widgets.get('key').

```

__Note:__ You can also use flags syntax for POSITIONAL ARGUMENTS
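On the notebook side, the flag description points to `dbutils.widgets.get('key')`. The sketch below shows one way a test notebook might read a parameter with a fallback so that it still runs when no value is supplied. `FakeWidgets` and `get_param` are hypothetical names introduced for illustration, and the stub only imitates the single `get` call. How Databricks behaves for a missing widget is an assumption here, which is why the helper catches broadly:

```python
class FakeWidgets:
    """Local stand-in for dbutils.widgets (hypothetical, illustration only)."""

    def __init__(self, values):
        self._values = dict(values)

    def get(self, name):
        # Raising on an unknown name is this stub's choice; the exact
        # exception raised inside Databricks is not specified here.
        if name not in self._values:
            raise KeyError(name)
        return self._values[name]


def get_param(widgets, name, default=None):
    """Return a notebook parameter, or `default` when it was not supplied."""
    try:
        return widgets.get(name)
    except Exception:
        # Broad on purpose: the failure type for a missing widget is assumed.
        return default


widgets = FakeWidgets({"example_key_1": "example_value_1"})
first = get_param(widgets, "example_key_1")
second = get_param(widgets, "example_key_2", default="fallback")
print(first, second)
```

Inside a notebook, the same helper would be called with `dbutils.widgets` in place of the stub.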
@@ -435,6 +439,9 @@ steps:
condition: succeededOrFailed()
```

+ ### Debugging Locally
+ If you use Visual Studio Code, you can debug with the provided `example_launch.json` file: replace the placeholders in `<>` with values that match your environment. You should then be able to see the test run results in the debugger, much as you would in Azure DevOps.

## Contributing

### Contribution Tips
8 changes: 4 additions & 4 deletions cli/nuttercli.py
@@ -53,22 +53,22 @@ def __init__(self, debug=False, log_to_file=False, version=False):
def run(self, test_pattern, cluster_id,
timeout=120, junit_report=False,
tags_report=False, max_parallel_tests=1,
- recursive=False, poll_wait_time=DEFAULT_POLL_WAIT_TIME):
+ recursive=False, poll_wait_time=DEFAULT_POLL_WAIT_TIME, notebook_params=None):
try:
- logging.debug(""" Running tests. test_pattern: {} cluster_id: {} timeout: {}
+ logging.debug(""" Running tests. test_pattern: {} cluster_id: {} notebook_params: {} timeout: {}
junit_report: {} max_parallel_tests: {}
tags_report: {} recursive:{} """
- .format(test_pattern, cluster_id, timeout,
+ .format(test_pattern, cluster_id, notebook_params, timeout,
junit_report, max_parallel_tests,
tags_report, recursive))

logging.debug("Executing test(s): {}".format(test_pattern))

if self._is_a_test_pattern(test_pattern):
logging.debug('Executing pattern')
results = self._nutter.run_tests(
test_pattern, cluster_id, timeout,
- max_parallel_tests, recursive, poll_wait_time)
+ max_parallel_tests, recursive, poll_wait_time, notebook_params)
self._nutter.events_processor_wait()
self._handle_results(results, junit_report, tags_report)
return
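Positional `{}` fields are filled strictly in the order the arguments are passed to `.format`, so a label added in the middle of the message needs its argument at the same position. A sketch of an alternative that sidesteps the ordering question with named fields; `describe_run` is a hypothetical helper, not a function in `cli/nuttercli.py`:

```python
def describe_run(test_pattern, cluster_id, notebook_params=None, timeout=120):
    """Build a debug message whose labels cannot drift from their values."""
    return (
        "Running tests. test_pattern: {test_pattern} cluster_id: {cluster_id} "
        "notebook_params: {notebook_params} timeout: {timeout}"
    ).format(
        # Keyword order is irrelevant: each field is looked up by name.
        timeout=timeout,
        notebook_params=notebook_params,
        cluster_id=cluster_id,
        test_pattern=test_pattern,
    )


message = describe_run("dataload/src*", "0123-12334-tonedabc",
                       {"example_key_1": "example_value_1"})
print(message)
```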
16 changes: 8 additions & 8 deletions common/api.py
@@ -88,21 +88,21 @@ def list_tests(self, path, recursive=False):
return tests

def run_test(self, testpath, cluster_id,
- timeout=120, pull_wait_time=DEFAULT_POLL_WAIT_TIME):
+ timeout=120, pull_wait_time=DEFAULT_POLL_WAIT_TIME, notebook_params=None):
self._add_status_event(NutterStatusEvents.TestExecutionRequest, testpath)
test_notebook = TestNotebook.from_path(testpath)
if test_notebook is None:
raise InvalidTestException

result = self.dbclient.execute_notebook(
test_notebook.path, cluster_id,
- timeout=timeout, pull_wait_time=pull_wait_time)
+ timeout=timeout, pull_wait_time=pull_wait_time, notebook_params=notebook_params)

return result

def run_tests(self, pattern, cluster_id,
timeout=120, max_parallel_tests=1, recursive=False,
- poll_wait_time=DEFAULT_POLL_WAIT_TIME):
+ poll_wait_time=DEFAULT_POLL_WAIT_TIME, notebook_params=None):

self._add_status_event(NutterStatusEvents.TestExecutionRequest, pattern)
root, pattern_to_match = self._get_root_and_pattern(pattern)
@@ -119,7 +119,7 @@ def run_tests(self, pattern, cluster_id,
NutterStatusEvents.TestsListingFiltered, len(filtered_notebooks))

return self._schedule_and_run(
- filtered_notebooks, cluster_id, max_parallel_tests, timeout, poll_wait_time)
+ filtered_notebooks, cluster_id, max_parallel_tests, timeout, poll_wait_time, notebook_params)

def events_processor_wait(self):
if self._events_processor is None:
@@ -168,20 +168,20 @@ def _get_root_and_pattern(self, pattern):
return root, valid_pattern

def _schedule_and_run(self, test_notebooks, cluster_id,
- max_parallel_tests, timeout, pull_wait_time):
+ max_parallel_tests, timeout, pull_wait_time, notebook_params=None):
func_scheduler = scheduler.get_scheduler(max_parallel_tests)
for test_notebook in test_notebooks:
self._add_status_event(
NutterStatusEvents.TestScheduling, test_notebook.path)
logging.debug(
'Scheduling execution of: {}'.format(test_notebook.path))
func_scheduler.add_function(self._execute_notebook,
- test_notebook.path, cluster_id, timeout, pull_wait_time)
+ test_notebook.path, cluster_id, timeout, pull_wait_time, notebook_params)
return self._run_and_await(func_scheduler)

- def _execute_notebook(self, test_notebook_path, cluster_id, timeout, pull_wait_time):
+ def _execute_notebook(self, test_notebook_path, cluster_id, timeout, pull_wait_time, notebook_params=None):
result = self.dbclient.execute_notebook(test_notebook_path,
- cluster_id, None, timeout, pull_wait_time)
+ cluster_id, timeout, pull_wait_time, notebook_params)
self._add_status_event(NutterStatusEvents.TestExecuted,
ExecutionResultEventData.from_execution_results(result))
logging.debug('Executed: {}'.format(test_notebook_path))
8 changes: 4 additions & 4 deletions common/apiclient.py
@@ -56,9 +56,9 @@ def list_objects(self, path):

return workspace_path_obj

- def execute_notebook(self, notebook_path, cluster_id,
- notebook_params=None, timeout=120,
- pull_wait_time=DEFAULT_POLL_WAIT_TIME):
+ def execute_notebook(self, notebook_path, cluster_id, timeout=120,
+ pull_wait_time=DEFAULT_POLL_WAIT_TIME,
+ notebook_params=None):
if not notebook_path:
raise ValueError("empty path")
if not cluster_id:
@@ -68,7 +68,7 @@ def execute_notebook(self, notebook_path, cluster_id,
"Timeout must be greater than {}".format(self.min_timeout))
if notebook_params is not None:
if not isinstance(notebook_params, dict):
- raise ValueError("Parameters must be a dictionary")
+ raise ValueError("Parameters must be in the form of a dictionary (See #run-single-test-notebook section in README)")
if pull_wait_time <= 1:
pull_wait_time = DEFAULT_POLL_WAIT_TIME
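Moving `notebook_params` from third to last position changes what any positional caller binds to, which is why the call in `common/api.py` had to change in the same PR. A self-contained sketch of the effect, using a stub with the new parameter order. The stub only echoes its arguments and is an illustration, not the real client method; the default of 5 seconds is taken from the README flag description:

```python
DEFAULT_POLL_WAIT_TIME = 5  # the 5-second default documented in the README


def execute_notebook(notebook_path, cluster_id, timeout=120,
                     pull_wait_time=DEFAULT_POLL_WAIT_TIME,
                     notebook_params=None):
    """Stub with the new parameter order; returns the values it received."""
    if notebook_params is not None and not isinstance(notebook_params, dict):
        raise ValueError("Parameters must be in the form of a dictionary")
    return {"timeout": timeout, "pull_wait_time": pull_wait_time,
            "notebook_params": notebook_params}


params = {"example_key_1": "example_value_1"}

# A call written for the old order (notebook_params third) now binds the
# dictionary to timeout and the timeout to pull_wait_time.
old_style = execute_notebook("/dataload/test_sourceLoad",
                             "0123-12334-tonedabc", params, 120)

# Keyword arguments keep working regardless of parameter order.
keyword_style = execute_notebook("/dataload/test_sourceLoad",
                                 "0123-12334-tonedabc",
                                 timeout=120, notebook_params=params)

# A JSON string that was never decoded is rejected by the type check.
try:
    execute_notebook("/dataload/test_sourceLoad", "0123-12334-tonedabc",
                     notebook_params='{"example_key_1": "example_value_1"}')
    rejected = False
except ValueError:
    rejected = True

print(old_style, keyword_style, rejected)
```

Passing everything after `cluster_id` by keyword, as `run_test` already does, avoids this class of mistake.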
