moon2: the test is finished but the session (and corresponding pod) is still present #354
Comments
@MohamedBenighil The Playwright pod is automatically deleted when the respective web-socket connection is closed. Make sure your process actually closes this connection.
@vania-pooh Actually, I discussed this with the QA team, and they said the process already closes the connection. It is done by the following piece of code (at the end of each process/test):
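(The snippet itself was not preserved in this thread. For illustration only, here is a minimal sketch of how a Playwright for .NET 1.27.x test typically opens a remote web-socket connection and closes it at the end; the endpoint URL and class name are placeholders, not the team's real code.)

```csharp
// Minimal sketch only (NOT the QA team's actual code): connect to a remote
// Playwright endpoint over a web socket and close the connection when done.
// The wss:// URL below is a placeholder.
using System.Threading.Tasks;
using Microsoft.Playwright;

class MoonSessionExample
{
    public static async Task Main()
    {
        using var playwright = await Playwright.CreateAsync();

        // ConnectAsync opens the web-socket connection that Moon tracks.
        var browser = await playwright.Chromium.ConnectAsync(
            "wss://moon.example.com/playwright/chromium/playwright-1.27.1");

        var context = await browser.NewContextAsync();
        var page = await context.NewPageAsync();
        await page.GotoAsync("https://example.com");

        // Closing the browser closes the web-socket connection; this is the
        // signal Moon uses to delete the corresponding browser pod.
        await browser.CloseAsync();
    }
}
```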
Note that our QA team uses the Playwright 1.27.1 NuGet package with C# to develop those processes/tests. I would also like to highlight something that may help:
I would like to know if you have an idea why this happens. If you need more information, please let me know. Thank you for your help.
@MohamedBenighil One more possible reason could be a Moon pod restart because of some maintenance in the Kubernetes cluster. Make sure that the number of restarts shown for the Moon pods stays at zero.
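(For reference, the standard way to check this; the moon namespace matches the commands used later in this thread.)

```bash
# Show Moon pods together with their RESTARTS count and AGE
kubectl get pods -n moon

# Inspect a specific Moon pod for the restart/eviction reason (e.g. OOMKilled)
kubectl describe pod <moon-pod-name> -n moon
```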
@vania-pooh In fact, I have noticed that one of Moon's pods gets killed and recreated (look at the AGE column in the first screenshot above). PS: I am using Kubernetes on Azure (AKS). So, what should I do to avoid the frozen session/pod? And why does moon2 not terminate the previous session after it restarts? Thank you
@MohamedBenighil Moon has no state, i.e. no list of sessions is stored in Moon memory. In the case of Playwright / Puppeteer, the Moon pod simply controls the web-socket connection state and deletes the browser pod when the connection is closed. When Moon is killed from the outside, this information is lost and the pod will never be deleted. This is why it is important to make sure that Moon is never restarted (which is usually the case when configured correctly).
@vania-pooh So, to make sure Moon is never restarted, should I ONLY edit a value in the Helm chart?
@MohamedBenighil This is not needed. You just need to make sure Moon is not restarted from the outside or because of OOM.
@vania-pooh The problem is: I have no idea why it restarts. The pod of moon2 is killed, but when I do kubectl get po -nmoon the RESTART column is set to 0, so I cannot check the logs of the previous pod since they are lost. I would like to know if you have a tip to deal with this situation (short of deploying a logging system).
There is a command to see the logs of the previous pod:
kubectl logs my-pod --previous
Please use kubectl describe to see the reason for the restart.
In case of OOM, please increase the memory request/limit.
Please note that the memory request must be equal to the limit.
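(For example, the generic Kubernetes pattern this refers to looks like the sketch below; the exact keys exposed by the moon2 Helm chart and the sizes are assumptions to adapt to your setup.)

```yaml
# Request equal to limit gives the pod "Guaranteed" QoS, making it the last
# candidate for eviction under memory pressure. Sizes here are placeholders.
resources:
  requests:
    cpu: "500m"
    memory: "512Mi"
  limits:
    cpu: "500m"
    memory: "512Mi"
```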
@aandryashin As I described in my last comment, kubectl logs my-pod --previous does not work since the previous pod was killed (i.e. it restarts, BUT the RESTART column is 0 when you do kubectl get po -nmoon).
If the pod was killed, there is nothing interesting in the logs. Try to increase the memory request/limit. Also try to analyze the reason for the memory consumption; it may be uploading files or using browser extensions. Please take a look at this article, which describes how to work with files effectively:
https://blog.aerokube.com/selenium-moon-environment-provisioning-72402242c917
@vania-pooh I discovered why the moon2 pod was killed: it is due to the downscaling of my AKS (Azure Kubernetes) cluster. I enabled the autoscaling feature on my AKS cluster (from min=2 to max=10 nodes), so during the downscale process the worker hosting the moon2 pod(s) gets deleted by AKS. The moon2 pod gets killed as well and recreated on another available worker. Once the new moon2 pod is created, control of the web-socket connection is lost since the new pod has no state to preserve the history, hence the Playwright pod is never deleted. This is what you said in this comment. I would like to know how I can configure moon2 to avoid it being killed during the downscale process of my k8s cluster. PS: I deployed it using Helm. Thank you in advance for your help
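(Not part of the original discussion, but relevant to the diagnosis above: the Kubernetes cluster autoscaler, which AKS uses for scale-down, honours a pod annotation that prevents it from evicting a pod when draining a node. How to attach pod annotations through the moon2 Helm chart is not confirmed here; the annotation itself is shown as a plain pod-template snippet.)

```yaml
# Tells the cluster autoscaler not to evict this pod during node scale-down.
metadata:
  annotations:
    cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
```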
Hello, Moon pods should run on dedicated Kubernetes nodes; otherwise you have to increase the graceful shutdown period up to 6 minutes to avoid killing Moon pods...
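(A sketch of what "dedicated nodes" means in practice; the label and taint names are hypothetical and would need to match your AKS node pool and however the moon2 chart exposes scheduling options.)

```yaml
# Pin Moon pods to a tainted, dedicated node pool so that normal workloads
# (and scale-down of shared nodes) do not affect them.
nodeSelector:
  dedicated: moon
tolerations:
  - key: dedicated
    operator: Equal
    value: moon
    effect: NoSchedule
```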
@aandryashin When you say "you have to increase graceful shutdown period up to 6 min", do you mean the terminationGracePeriodSeconds property? If yes, I guess it is already set to 6 minutes by default; see it here in the moon2 chart: https://github.com/aerokube/charts/blob/master/moon2/values.yaml#L125 If no, please let me know.
Hello, no, I meant that the node should send a kill signal to processes only after the pods' graceful shutdown period, which for Moon is 360 seconds by default. In a production environment nobody should send a kill signal; every pod should exit by itself, and only after the pods have completed can the node be restarted... Also, please check that Moon pods do not run on spot instances, as that can cause a forced shutdown...
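(For reference, the pod-level Kubernetes field being discussed; 360 seconds matches the default mentioned above, and where it is set in your manifests or chart values depends on your deployment.)

```yaml
# How long Kubernetes waits between sending SIGTERM and SIGKILL to the pod.
spec:
  terminationGracePeriodSeconds: 360
```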
Regarding your point that "node should send kill signal to processes only after pods graceful shutdown period":
I also confirm that moon2 is not running on a spot instance worker. But it was killed due to the downscale process, because the autoscaling feature is enabled on my cluster (AKS).
Hello,
I deployed moon2 on my Azure Kubernetes Service (AKS) cluster, and the QA teams launch their tests every 20 minutes (the same tests); most of the time everything works fine.
However, sometimes a test is finished but the corresponding session in the Moon UI and the pod get stuck and are never deleted. I would like to know if you have an idea why.
I will provide some info that may help.
The logs of the vnc-server container of the stuck pod in question are:
Please let me know if you need more information.
Thank you for your help