
make_queue_pairs removed? #96

Closed
vsoch opened this issue Mar 16, 2023 · 10 comments
@vsoch

vsoch commented Mar 16, 2023

Hi! I'm using a script from 8 months ago with:

client_queues, server_queues = make_queue_pairs(hostname=args.hostname, topics=['simulate', 'train', 'infer'], serialization_method='pickle')

But colmena.queue.redis has no function called "make_queue_pairs". What is the replacement for this now? I've tried installing older versions but then other bugs emerge.

Related: ExaWorks/molecular-design-parsl-demo#2

Thanks!

@WardLT
Collaborator

WardLT commented Mar 16, 2023

Thanks for reaching out! I did remove that function when refactoring Colmena and forgot to update the example.

We now use a single object to define queues and offer a few different implementations of them. The above example has been replaced by:

queues = RedisQueues(hostname=args.hostname, topics=['simulate', 'train', 'infer'], serialization_method='pickle')
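For readers following along, the shape of the change is from two paired queue objects (one for the client, one for the server) to a single object that owns both directions for every topic. The stdlib sketch below illustrates that pattern only; `TaskQueues` and its methods are hypothetical stand-ins, not Colmena's actual classes:

```python
import queue

class TaskQueues:
    """Toy stand-in for a single queues object owning both directions.

    Not Colmena code; the class and method names here are illustrative.
    """

    def __init__(self, topics):
        # One request queue and one result queue per topic
        self._request = {t: queue.Queue() for t in topics}
        self._result = {t: queue.Queue() for t in topics}

    def send_inputs(self, payload, topic):
        """Client side: submit a task."""
        self._request[topic].put(payload)

    def get_task(self, topic):
        """Server side: pull the next task."""
        return self._request[topic].get()

    def send_result(self, value, topic):
        """Server side: return a result."""
        self._result[topic].put(value)

    def get_result(self, topic):
        """Client side: collect a result."""
        return self._result[topic].get()

queues = TaskQueues(topics=['simulate', 'train', 'infer'])
queues.send_inputs({'x': 1}, topic='simulate')   # client submits
task = queues.get_task('simulate')               # server receives
queues.send_result(task['x'] * 2, topic='simulate')
print(queues.get_result('simulate'))             # prints 2
```

The point of the refactor, as described above, is that one object replaces the `client_queues, server_queues` pair, so there is no longer a tuple to unpack.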

@WardLT
Collaborator

WardLT commented Mar 16, 2023

I'll go fix that in the demo

@vsoch
Author

vsoch commented Mar 16, 2023

Fantastic! Thank you - will try this out right now.

@vsoch
Author

vsoch commented Mar 16, 2023

Do you have any advice for debugging? (This is the first time I'm using these libraries.) It's hanging on this line. For reference, I am updating the example from this notebook, which I'm not sure was run because it has the wrong import path.

I'm running redis inside of the same container where I'm testing this, so it should be on localhost! Complete instructions (so far) are in the README.

Thanks for the help! Sorry for being a noob. 😆
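One quick sanity check for the "Redis is on localhost" assumption is a plain TCP probe before launching the workflow. The helper below is my own, not part of Colmena or the demo; 6379 is Redis's default port:

```python
import socket

def redis_reachable(host='localhost', port=6379, timeout=1.0):
    """Return True if something is accepting TCP connections at host:port.

    Illustrative helper (not from Colmena): a successful connect only
    proves a listener exists, not that it speaks the Redis protocol.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

print(redis_reachable())  # True only if a local server is listening on 6379
```

If this returns False inside the container, the hang is more likely a connection problem than a task failure.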

@WardLT
Collaborator

WardLT commented Mar 16, 2023

Sorry for the hassle and missing your question. Not a noob question at all! (And it wouldn't be a problem even if it was!)

I ended up having a hang in a similar spot, due to the task failing. Check the "colmena.log" to see if there was an error.

I just updated the demo app and changed a few things about the environment to bypass an issue that stumped me (something to do with a mix between thread affinities in two libraries).
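On the "check the colmena.log" suggestion: a quick way to scan a log like the ones quoted below for failure lines is a small filter script. The helper name and marker strings here are mine, chosen to match common Python logging output, not anything Colmena ships:

```python
from pathlib import Path

def find_errors(log_path):
    """Return lines from a log file that look like failures.

    Illustrative helper: matches common Python logging markers,
    not a Colmena utility.
    """
    markers = ('ERROR', 'CRITICAL', 'Traceback', 'Exception')
    return [line
            for line in Path(log_path).read_text().splitlines()
            if any(m in line for m in markers)]

# Demonstrate on a synthetic two-line log
sample = Path('colmena_sample.log')
sample.write_text(
    "2023-03-16 20:44:17 - colmena.queue.base - INFO - Client sent a task\n"
    "2023-03-16 20:44:18 - colmena.task_server - ERROR - Task failed\n"
)
print(find_errors(sample))  # only the ERROR line survives the filter
```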

@vsoch
Author

vsoch commented Mar 16, 2023

I don't see an error, but I see these two lines a gazillion times:

2023-03-16 21:58:38,103 - parsl.dataflow.strategy - DEBUG - general strategy starting with strategy_type simple for 0 executors
2023-03-16 21:58:38,113 - parsl.process_loggers - DEBUG - Normal ending for _general_strategy on thread FlowControl-Thread

And before those repetitions, it seems mostly ok?

2023-03-16 20:44:17,299 - colmena.task_server.parsl - INFO - Using default executors for run_model: ['FluxExecutor']
2023-03-16 20:44:17,299 - colmena.task_server.parsl - INFO - Defined 3 methods: compute_vertical, train_model, run_model
2023-03-16 20:44:17,304 - colmena.task_server.base - INFO - Started task server ParslTaskServer on 893
2023-03-16 20:44:17,305 - colmena.queue.base - INFO - Client sent a compute_vertical task with topic default. Created 0 proxies for input values
2023-03-16 20:44:17,309 - parsl.dataflow.dflow - INFO - Starting DataFlowKernel with config
Config(
    app_cache=True,
    checkpoint_files=None,
    checkpoint_mode=None,
    checkpoint_period=None,
    executors=[FluxExecutor(
        flux_executor_kwargs={},
        flux_path='/usr/bin/flux',
        label='FluxExecutor',
        launch_cmd='{flux} start {python} {manager} {protocol} {hostname} {port}',
        provider=LocalProvider(
            channel=LocalChannel(envs={}, script_dir=None, userhome='/workflow'),
            cmd_timeout=30,
            init_blocks=1,
            launcher=SingleNodeLauncher(debug=True, fail_on_any=False),
            max_blocks=1,
            min_blocks=0,
            move_files=None,
            nodes_per_block=1,
            parallelism=1,
            worker_init=''
        ),
        working_dir='/workflow'
    )],
    garbage_collect=True,
    initialize_logging=True,
    internal_tasks_max_threads=10,
    max_idletime=120.0,
    monitoring=None,
    retries=0,
    retry_handler=None,
    run_dir='runinfo',
    strategy='simple',
    usage_tracking=False
)
2023-03-16 20:44:17,309 - parsl.dataflow.dflow - INFO - Parsl version: 2023.01.23
2023-03-16 20:44:17,309 - parsl.usage_tracking.usage - DEBUG - Tracking status: False
2023-03-16 20:44:17,310 - parsl.dataflow.dflow - INFO - Run id is: 3c242bab-8964-4353-818f-a4ac27208dce
2023-03-16 20:44:17,452 - parsl.dataflow.dflow - DEBUG - Considering candidate for workflow name: /opt/conda/envs/moldesign-demo/lib/python3.9/site-packages/parsl/dataflow/dflow.py
2023-03-16 20:44:17,452 - parsl.dataflow.dflow - DEBUG - Considering candidate for workflow name: /opt/conda/envs/moldesign-demo/lib/python3.9/site-packages/parsl/dataflow/dflow.py
2023-03-16 20:44:17,452 - parsl.dataflow.dflow - DEBUG - Considering candidate for workflow name: /opt/conda/envs/moldesign-demo/lib/python3.9/site-packages/colmena/task_server/parsl.py
2023-03-16 20:44:17,452 - parsl.dataflow.dflow - DEBUG - Using parsl.py as workflow name
2023-03-16 20:44:17,452 - parsl.dataflow.memoization - INFO - App caching initialized
2023-03-16 20:44:17,452 - parsl.dataflow.strategy - DEBUG - Scaling strategy: simple
2023-03-16 20:44:17,458 - colmena.task_server.parsl - INFO - Launched Parsl DFK. Process id: 893
2023-03-16 20:44:17,464 - colmena.task_server.base - INFO - Begin pulling from task queue
2023-03-16 20:44:17,490 - colmena.task_server.base - INFO - Received request for compute_vertical with topic default
2023-03-16 20:44:17,495 - parsl.dataflow.dflow - DEBUG - Task 0 will be sent to executor FluxExecutor
2023-03-16 20:44:17,505 - parsl.dataflow.dflow - DEBUG - Adding output dependencies
2023-03-16 20:44:17,511 - parsl.dataflow.dflow - INFO - Task 0 submitted for App compute_vertical, not waiting on any dependency
2023-03-16 20:44:17,516 - parsl.dataflow.dflow - DEBUG - Task 0 set to pending state with AppFuture: <AppFuture at 0x7f3c90448220 state=pending>

@WardLT
Collaborator

WardLT commented Mar 16, 2023

Yes, those lines are fine. I'm going to push a version of colmena to PyPI that I hope fixes that excessive logging.
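Until a release quiets this from the library side, one user-side workaround is to raise the level of the noisy loggers directly. This uses only the standard `logging` API; the logger names are taken from the repeated lines quoted above:

```python
import logging

# Silence the two chatty parsl loggers seen in the log excerpt above.
# DEBUG records on these loggers are discarded before reaching any handler;
# INFO and above still get through.
for name in ('parsl.dataflow.strategy', 'parsl.process_loggers'):
    logging.getLogger(name).setLevel(logging.INFO)
```

Setting the level on these specific loggers leaves the rest of parsl's (and Colmena's) logging untouched, which keeps the useful INFO lines in colmena.log.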

@vsoch
Author

vsoch commented Mar 16, 2023

Thank you! I reverted some of the version pinning and got the first example running again - when it finishes I'll try this new approach with each of redis and the pipe variant.

@WardLT
Collaborator

WardLT commented Mar 16, 2023

Darn. That change doesn't fix that parsl logging. I'll have to think on it some more.

Mind if I close this issue? The problems now are with the example, not Colmena, from my understanding.

@vsoch
Author

vsoch commented Mar 16, 2023

Definitely - we resolved this one! Thanks for the help, closing.

vsoch closed this as completed Mar 16, 2023