colmena version issues in example #2
Comments
I made a few version updates. Could you check if it works for you now? You'll need to rebuild the environment, as I changed some things besides Colmena |
Rebuilding! So just to clarify - redis is no longer being used? Note in the issue linked above the developer said we could do: queues = RedisQueues(hostname=args.hostname, topics=['simulate', 'train', 'infer'], serialization_method='pickle') Should the two be equivalent (aside from using pipes vs redis?) |
Lol! Sorry just realized that "the developer" is you! So you probably know this is the better approach. 😆 (sorry for my faux pas). Will redis not work then? |
Ack - so the changes to the environment broke example 0. :( I'm going to revert to see if I can get it working again. |
Yes, Redis and Pipes should be equivalent (Redis is better for larger data). I switched to pipes so that users don't have to remember to start redis for the demo. |
Let me know when there is an updated colmena to try and I'll try the second example again (it's still hanging with pipes). |
In case you need it, here is the current state of my script for the second simulation (still hanging!) https://github.com/rse-ops/flux-hpc/blob/add/molecular-design-parsl/molecular-design-parsl/scripts/1_interleaving-simulation-and-steering.py Thanks for your help and have a good evening! |
Colmena is updated. You should find a v0.4.5. Could you try updating "chemfunctions.py" to the latest version from this git repo? It switches from qcengine to ASE for the chemistry computations, and that fixed some issues I had. I'm not sure if yours is the same, but at least it'll put our repos on the same basis. Also, do you see the second task going out in
Hiya! I've updated the chemistry script, and also fixed (what I consider a bug) with the FluxExecutor - it was running
For the log, I'm not sure what I'm looking at, but it looks like this! |
That is good to know about no submit scripts showing up. My guess so far is that the ParslTaskServer is failing to start or crashing when it receives a task. Do you see anything at the end of that log file? Does it stop writing messages after a certain point? Another thing you could do is add |
Overall, it sounds like an issue between Colmena and the FluxExecutor. Could you open an issue about it on Colmena's GitHub? It would be good to test Colmena+FluxExecutor on the simple test cases we use with Colmena if we get stumped here. |
The join hangs so we know it's running! |
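When a join hangs like this, one way to learn more is to bound the wait and dump stack traces instead of blocking forever. The following is a minimal sketch using only the Python standard library. It assumes the object being joined exposes the `threading.Thread`-style `join(timeout)` and `is_alive()` methods; `join_with_timeout` and `run_with_watchdog` are hypothetical helper names, not part of Colmena, Parsl, or Flux.

```python
import faulthandler
import sys
import threading


def join_with_timeout(thread, timeout_s):
    """Wait up to timeout_s seconds; return True if the thread finished."""
    thread.join(timeout=timeout_s)
    return not thread.is_alive()


def run_with_watchdog(target, timeout_s=30.0):
    """Run target in a thread and dump all stack traces if it is still running.

    Hypothetical helper: the traceback goes to stderr after timeout_s seconds,
    which shows where each thread is blocked at that moment.
    """
    faulthandler.dump_traceback_later(timeout_s, exit=False, file=sys.stderr)
    worker = threading.Thread(target=target, daemon=True)
    worker.start()
    try:
        # Wait slightly longer than the watchdog so the dump can be written.
        finished = join_with_timeout(worker, timeout_s + 1.0)
    finally:
        faulthandler.cancel_dump_traceback_later()
    return finished
```

A `False` return value means the worker was still alive after the timeout, and the stack dump on stderr indicates which call it was waiting in.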
heyo! Do you have any other ideas for what I could try? Are there examples that don't require colmena? That seems to be what might be introducing something buggy. |
Sorry for the slow reply. But, I have been thinking about this.
Just the first notebook, which does a very similar workflow without Colmena's interference.
There are two routes for bugs I'm curious about:
|
Heyo!
I'm not running on a Mac - I use Linux (I have a Mac but I'm allergic. It's great for email though!)
I'm not sure I know what HTEx is! Is that another workflow tool? Since this is for the Flux Operator, we are primarily interested in running with Flux. Is there something I should check to see if this HTEx is involved? |
Sorry for the jargon, HTEx was "High throughput Executor." It's the part you replaced with Flux. Put better, did the colmena demo work without flux? |
Oh, I didn't test that use case - I'm only interested in running with Flux. If you report the demo works for you, I'd assume it's an issue with how it's integrated into flux. |
Yea, flux not working is definitely something to dig into. I'm hoping to at least rule out Colmena not working on your system. |
Hiya! I'm trying to reproduce the 1_ notebook (with colmena) and none of the versions, from the one provided up to the current one, work.
ImportError: cannot import name 'make_queue_pairs' from 'colmena.queue.redis'
ModuleNotFoundError: No module named 'colmena.task_server'
AttributeError: module 'proxystore' has no attribute 'proxy'
Possibly I'm missing something or the script needs to be updated? Also, the import was colmena.redis.queue and it should be colmena.queue.redis, so I'm questioning if this was run to completion.
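Since the errors above each name a specific import path, a quick way to see which layout a given environment actually provides is to probe those paths directly. This is a minimal sketch using only the standard library. The module and attribute names are copied from the error messages; `probe` and `installed_version` are hypothetical helpers, and the distribution names passed to `installed_version` are an assumption (taken to match the import names).

```python
import importlib
from importlib import metadata


def probe(module_name, attribute=None):
    """Report whether a module imports and, optionally, has an attribute."""
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:  # also covers ModuleNotFoundError
        return f"missing module ({exc})"
    if attribute is not None and not hasattr(module, attribute):
        return f"module found, but no attribute {attribute!r}"
    return "ok"


def installed_version(dist_name):
    """Return the installed version string, or None if not installed."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Distribution names are assumed to match the import names.
    for dist in ("colmena", "proxystore"):
        print(f"{dist}: version {installed_version(dist)}")

    # Import paths and attributes taken from the errors reported above.
    checks = [
        ("colmena.queue.redis", "make_queue_pairs"),
        ("colmena.task_server", None),
        ("proxystore", "proxy"),
    ]
    for module_name, attribute in checks:
        print(f"{module_name} / {attribute}: {probe(module_name, attribute)}")
```

Running this in both the working and the failing environment and comparing the output should show whether the script and the installed packages come from different releases.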