Improve testing harness using psutil process handling #166

Closed
wants to merge 31 commits into from

piotrm-nvidia
Contributor

No description provided.

@piotrm-nvidia
Contributor Author

You can run the test with debug logging to see the pytest logs from the test itself as well as the logs from the subscriber.py and publisher.py processes:

pytest -v icp/tests/python/event_plane/test_multiprocess.py --log-cli-level DEBUG

A sample of the log:

INFO     icp.tests.python.event_plane.publisher_subscriber_utils:publisher_subscriber_utils.py:275 Checking if worker needs kill for WorkerRecord(name='publisher_0', command=['/usr/bin/python3', '-u', './icp/examples/python/event_plane/publisher.py', '--publisher-id', '1', '--save-events-path', '/tmp/publisher_1_xz2ax8hc.json'], needs_kill_to_stop=False, json_file='/tmp/publisher_1_xz2ax8hc.json', tag='publisher') worker to finish
INFO     icp.tests.python.event_plane.publisher_subscriber_utils:publisher_subscriber_utils.py:277 Waiting for WorkerRecord(name='publisher_0', command=['/usr/bin/python3', '-u', './icp/examples/python/event_plane/publisher.py', '--publisher-id', '1', '--save-events-path', '/tmp/publisher_1_xz2ax8hc.json'], needs_kill_to_stop=False, json_file='/tmp/publisher_1_xz2ax8hc.json', tag='publisher') worker to finish
INFO     subscriber_0:publisher_subscriber_utils.py:89 selector_events.py: DEBUG: __init__(): 64:	Using selector: EpollSelector
INFO     subscriber_0:publisher_subscriber_utils.py:89 nats_event_plane.py: DEBUG: connect(): 156:	Connecting to NATS server: nats://localhost:4222
INFO     subscriber_0:publisher_subscriber_utils.py:89 nats_event_plane.py: DEBUG: connect(): 178:	Connected to NATS server: nats://localhost:4222

Lines logged under icp.tests.python.event_plane.publisher_subscriber_utils are generated by the test harness, while lines like this:

INFO     subscriber_0:publisher_subscriber_utils.py:89 nats_event_plane.py: DEBUG: connect(): 178:	Connected to NATS server: nats://localhost:4222

are generated by the ScriptThread class and forwarded from the subscriber.py sub-process with id 0 to the main process.
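
For readers unfamiliar with the pattern, here is a minimal sketch of forwarding a sub-process's output into the main process's logging. It assumes one reader thread per worker that re-logs each stdout line under a logger named after the worker (for example subscriber_0). The helpers forward_output and run_and_forward are hypothetical names used for illustration only; this is not the ScriptThread implementation from this PR.

```python
import logging
import subprocess
import sys
import threading


def forward_output(proc: subprocess.Popen, name: str) -> threading.Thread:
    """Re-log every stdout line of `proc` under a logger called `name`."""
    logger = logging.getLogger(name)

    def pump() -> None:
        # Iterating the pipe ends once the child closes its stdout (on exit).
        for line in proc.stdout:
            logger.info(line.rstrip("\n"))

    thread = threading.Thread(target=pump, name=f"{name}-log", daemon=True)
    thread.start()
    return thread


def run_and_forward(name: str, command: list) -> int:
    """Run `command`, forward its output to logging, return its exit code."""
    proc = subprocess.Popen(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so no pipe is left unread
        text=True,
    )
    reader = forward_output(proc, name)
    returncode = proc.wait()  # safe: the reader thread keeps draining the pipe
    reader.join()
    return returncode


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    # Hypothetical stand-in for subscriber.py: a one-line child script.
    run_and_forward("subscriber_0", [sys.executable, "-u", "-c", "print('ready')"])
```

Naming the logger after the worker is what makes the worker name appear in the first column of the pytest log output, so harness lines and forwarded worker lines can be told apart.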

Full log:

verbose_test_log.txt

@piotrm-nvidia
Contributor Author

It seems that this test deadlocks in CI:

https://gitlab-master.nvidia.com/dl/triton/triton-distributed/-/jobs/141090309#L318
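
One general way to keep a stuck worker from hanging a whole test run is to bound every wait. The sketch below uses only the standard subprocess module; stop_worker and its timeout values are hypothetical and not taken from this PR, and it makes no claim about the actual cause of the CI deadlock. The needs_kill_to_stop flag mirrors the WorkerRecord field visible in the log sample.

```python
import subprocess
import sys


def stop_worker(
    proc: subprocess.Popen,
    needs_kill_to_stop: bool,
    wait_timeout: float = 30.0,
    grace: float = 5.0,
) -> int:
    """Wait for a worker or terminate it, without ever blocking indefinitely."""
    if not needs_kill_to_stop:
        try:
            # Workers that exit on their own get a bounded wait.
            return proc.wait(timeout=wait_timeout)
        except subprocess.TimeoutExpired:
            pass  # worker is stuck: fall through to termination

    proc.terminate()  # polite request first
    try:
        return proc.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        proc.kill()  # forceful stop if the worker ignores the request
        return proc.wait()


if __name__ == "__main__":
    # Hypothetical long-running worker standing in for subscriber.py.
    worker = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    print("exit code:", stop_worker(worker, needs_kill_to_stop=True))
```

With this shape a test fails with a non-zero worker exit code after the timeout instead of blocking the CI job.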

publisher_args = [
sys.executable,
"-u",
"./icp/examples/python/event_plane/publisher.py",
Contributor Author

Working example.

Base automatically changed from bkubiak-eventplane to main February 14, 2025 05:40
@piotrm-nvidia
Contributor Author

It won't be used.

3 participants