Improving startup time of the full-stack image #447
Comments
Thanks! What does the pgsql setup do? Maybe the following would work: 1. do the setup once, 2. turn off psql, 3. tar the psql internal folder once. Then at startup you just untar it? |
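The snapshot-and-restore idea above could be sketched roughly as follows. This is only a minimal illustration using Python's standard `tarfile` module, not what the image actually does; the function names and paths are hypothetical, and it assumes the database server was shut down cleanly before the snapshot is taken.

```python
import tarfile
from pathlib import Path


def snapshot_dir(data_dir: Path, archive: Path) -> None:
    """Pack an already initialised data directory into an uncompressed tar archive."""
    with tarfile.open(archive, "w") as tar:
        # arcname="." stores paths relative to data_dir, so the archive unpacks anywhere.
        tar.add(data_dir, arcname=".")


def restore_if_missing(archive: Path, data_dir: Path) -> bool:
    """Unpack the archive at startup unless data_dir already has content.

    Returns True if a restore happened, False if existing data was left untouched.
    """
    if data_dir.exists() and any(data_dir.iterdir()):
        return False
    data_dir.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r") as tar:
        tar.extractall(data_dir)
    return True
```

The guard on existing content matters for a persistent home volume: a restore should only happen the first time the volume is mounted, never on top of a user's existing database.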
The script is here: As far as I can see, it runs two commands:
I don't know their relative timing, but the second is the startup command, right? So we can't get rid of that one for sure. |
I'll test this with |
The full-stack image already takes ~30s on a local machine, which is not fast enough for the MC AiiDA archive inspection. For the QeApp, the problem is much more serious, although it only happens the first time the user's persistent volume is created.
(@danielhollas, I didn't do detailed profiling on the demo server, since it is deployed with k8s directly so I have nowhere to run If @danielhollas @giovannipizzi do you see any problem with the plan or have a better solution for it? |
Hi, one option (maybe the same one you are thinking about) is
It would be good to check how long 4 takes (and whether zipping helps; I am not convinced, maybe tarring is enough). |
I agree that 30s is not ideal, I am just really worried about introducing a lot of complexity to this repo (which is already complex). If I understand correctly (@giovannipizzi correct me if I am wrong), the main concern for now is the demo server, we shouldn't worry about Since that's done through kubernetes, simply modifying the base image and adding stuff to $HOME will not work anyway. So as a first step, I would suggest that you focus on the demo server and, as @giovannipizzi suggested, create a package that you can copy to the home volume. (I am not familiar with kubernetes, but surely having a pre-populated mount point is not an uncommon problem.) Sorry in case I am misunderstanding something. |
I agree, and that's the reason I never brought this to this repo but want to tackle it for the QeApp image first. I'll still do it from aiidalab-qe for the moment, where we can iterate faster on development. Meanwhile, to avoid bringing too much complexity to the image preparation (as @danielhollas pointed out, it is already quite complex), if there is little difference between the backup approach and running rsync on the full home directly, I am inclined to go with the simpler solution. But it is worth trying both and getting a clear comparison of the final image size and the speed improvement. |
Thanks. To clarify my suggestion, if we care about the demo server, I'd not touch the images at all, and instead try to modify the kubernetes startup to inject the data there. But I might be underestimating the complexity of doing that. |
Haha, I thought about it as well. I may be overestimating the complexity, such as the permissions of the system. Will keep this in mind. |
Just a note, aiida 2.5 is unlikely to help too much here. Much of the gain I got was concentrated on verdi tab completion and commands not accessing the database. Other gains were partly negated by the introduction of pydantic. I did some timings and in your case you're paying a price of at least 0.5s for each verdi invocation. The main gain here would be to create the codes via the python API from within the same process. (@superstar54 mentioned there were some threading issues there, but those should be surmountable, e.g. by having a small python script that sets up the codes and is called via subprocess, in case simpler solutions don't work). See my timings here: |
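To illustrate the "one small script called via subprocess" idea, here is a rough sketch under stated assumptions. The code specs and their `label` field are hypothetical, and the child script deliberately contains no AiiDA calls: a real version would create each code through the AiiDA Python API at the marked place. The point is only that the interpreter startup cost is paid once for all codes instead of once per `verdi` invocation.

```python
import json
import subprocess
import sys

# Child script run in a single separate interpreter. It reads all code specs
# from stdin as JSON and reports the labels it handled on stdout.
CHILD_SCRIPT = """
import json, sys
specs = json.load(sys.stdin)
handled = []
for spec in specs:
    # Placeholder: a real script would create the code via the AiiDA Python API here.
    handled.append(spec["label"])
print(json.dumps(handled))
"""


def setup_codes_in_one_process(specs: list) -> list:
    """Hand every spec to one child process instead of spawning one process per code."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_SCRIPT],
        input=json.dumps(specs),
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)
```

Running the setup in a child process keeps it isolated from any threads in the calling application, which is one possible way around the threading issues mentioned above.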
Hi @danielhollas and @unkcpz, a small piece of good news is that, in the latest QeApp, code setup is no longer time-consuming since all codes are set up in one script. aiidalab/aiidalab-qe#706. We can use the Python API in the future if we fix the thread problem. |
Thanks @superstar54, but I find aiidalab/aiidalab-qe#706 a bit hacky. If we have a startup time problem with the pseudo libraries anyway, why not just keep the original implementation, which is more straightforward? |
Hi @unkcpz, I don't understand the logic here. Could you explain in more detail? Thanks! |
I mean your fix is great but has limited influence on the startup time issue of the QeApp image. If we don't solve the time needed to set up the profile and pseudopotential groups, it still needs ~2 mins to start the QeApp image (I agree your change improves it, which is great!). However, once we have an image that does not need the runtime setup of profiles (including the codes setup) and pseudo groups, the problems are solved together. I say aiidalab/aiidalab-qe#706 is not straightforward because you use a function to create a string and write to a The discussion is getting a bit sidetracked, let's move the QeApp image issue to QeApp. This issue is more about whether/how we improve the startup time of the full-stack image. For the QeApp image, we need to do it anyway, and the most time-consuming part is the pseudopotential groups setup. |
Thanks for the explanation. Looking forward to this solution! |
Please check this PR: aiidalab/aiidalab-qe#695 (comment) |
Thanks! Yes, I think that implementation is much clearer. In the future, I'd suggest holding off a bit on using a workaround and first raising the issue with the team (and the aiida team) so it can be discussed. |
@unkcpz could you run the full-stack container startup on your machine again?
$ docker pull docker.io/aiidalab/full-stack:edge
$ time docker run --rm docker.io/aiidalab/full-stack:edge
With the recent improvements I made to the startup scripts, the startup time of a fresh container is now 11s on my machine (down from ~30s). I don't see any obvious ways of speeding this up further in the full-stack image itself. In @superstar54's experimental QeApp image, which prepares home in advance, this would be even faster since the aiida profile and computer are already initialized, but I don't think we should do that here. |
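For comparing numbers across machines, a single `time` run can be noisy. A small helper along these lines could repeat the measurement; this is just a sketch, the helper name is made up, and it assumes the timed command returns on its own.

```python
import subprocess
import time


def time_command(cmd: list, repeats: int = 3) -> list:
    """Run cmd several times and return the wall-clock duration of each run in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        # Output is discarded so that terminal printing does not distort the timing.
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations
```

It could be called, for example, as `time_command(["docker", "run", "--rm", "docker.io/aiidalab/full-stack:edge"])`, and the median of the returned durations reported.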
I've published a new version of the docker stack with the loading speed improvements. @superstar54 I'd suggest rebuilding your QeApp image on top of it. I'd also suggest that you try keeping the startup scripts as they are now rather than deleting them, since I think it will improve the maintainability of your solution, and it should not add more than 1-2 seconds of overhead. Closing this issue for now, we can open a new one if there are further avenues for improvement. |
Issue for @unkcpz's investigation into the startup time of the AiiDAlab QeApp container.
I was curious and ran a few tests on my machine with the latest aiidalab/full-stack:edge image based on aiida-core 2.5.1.
So on my machine the whole startup takes around 30s. Not great, not terrible, but I can imagine on slower machines this might take significantly longer. I also tried to run just the PGSQL setup script,
and the same script followed by the prepare-aiida script.
So it seems that the majority of the time is spent in these two scripts. The pgsql setup itself takes around a third (10s) of the total time; not sure how much we can do about that. Another 14s is spent in 40_prepare-aiida.sh.
@unkcpz could you run the same on the demo server? It would be good to see how this depends on the machine.
cc @giovannipizzi
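As a quick sanity check of the approximate figures quoted above (about 30s total, 10s for the pgsql setup, 14s for 40_prepare-aiida.sh), the shares can be worked out as below. The helper name is hypothetical and the inputs are the rounded numbers from this issue, not new measurements.

```python
def breakdown(total: float, stages: dict) -> dict:
    """Return each stage's share of the total time, plus the unaccounted remainder."""
    shares = {name: seconds / total for name, seconds in stages.items()}
    shares["everything else"] = (total - sum(stages.values())) / total
    return shares


# Rounded timings from the issue text, in seconds.
shares = breakdown(30.0, {"pgsql setup": 10.0, "40_prepare-aiida.sh": 14.0})
for name, share in shares.items():
    print(f"{name}: {share:.0%}")
```

With these inputs the two scripts account for roughly 80% of the startup, leaving about 6s for everything else.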