Fully dynamic updates of run["cores"] and run["workers"] (RFC) #2010
I rebased this and took the opportunity to strengthen the comment about the semaphore. I hope that is OK.
The fact that the value of the semaphore should be strictly less than the number of threads can in principle be verified at startup. I am revisiting the idea of parsing the config file (
DEV is running with the PR.
When I quit pserve with ^C I need to press ^C twice, and the second time I get
This does not happen with master. Do you perhaps know what might be the cause of this?
I think the error message points to an unreleased lock, but I don't really see how that's possible. I will think more.
No idea, but I confirm that there is an issue:
May 16 17:50:59 dfts-0 systemd[1]: Stopping Fishtest Server port 6543...
May 16 17:50:59 dfts-0 pserve[3350]: flush
May 16 17:50:59 dfts-0 pserve[3350]: .done
May 16 17:52:29 dfts-0 systemd[1]: [email protected]: State 'stop-sigterm' timed out. Killing.
May 16 17:52:29 dfts-0 systemd[1]: [email protected]: Killing process 3350 (pserve) with signal SIGKILL.
May 16 17:52:29 dfts-0 systemd[1]: [email protected]: Killing process 3381 (pserve) with signal SIGKILL.
May 16 17:52:29 dfts-0 systemd[1]: [email protected]: Main process exited, code=killed, status=9/KILL
May 16 17:52:29 dfts-0 systemd[1]: [email protected]: Failed with result 'timeout'.
May 16 17:52:29 dfts-0 systemd[1]: Stopped Fishtest Server port 6543.
May 16 17:53:29 dfts-0 systemd[1]: Started Fishtest Server port 6543
OK, thanks. That's something that needs to be solved.
The scheduled scripts don't stop either:
835 root 20 35700 1152 932 S 0.0 0:14.67 `- /usr/sbin/cron -f
14467 root 20 61116 3088 2660 S 0.1 `- /usr/sbin/CRON -f
14468 usr00 20 4636 860 792 S 0.0 `- /bin/sh -c /usr/bin/nice -n 10 /usr/bin/cpulimit -l 50 -f -m -- ${VENV}/bin/python3 ${UPATH}/delta_update_users.py
14469 usr00 30 10 82528 860 772 S 0.3 0.0 0:18.69 `- /usr/bin/cpulimit -l 50 -f -m -- /home/usr00/fishtest/server/env/bin/python3 /home/usr00/fishtest/server/utils/delta_upda+
14470 usr00 30 10 1022124 200036 40308 S 3.9 0:29.75 `- /home/usr00/fishtest/server/env/bin/python3 /home/usr00/fishtest/server/utils/delta_update_users.py
16289 root 20 61116 3088 2660 S 0.1 `- /usr/sbin/CRON -f
16290 usr00 20 4636 812 748 S 0.0 `- /bin/sh -c /usr/bin/nice -n 10 /usr/bin/cpulimit -l 50 -f -m -- ${VENV}/bin/python3 ${UPATH}/delta_update_users.py
16291 usr00 30 10 82528 856 768 S 0.3 0.0 0:15.68 `- /usr/bin/cpulimit -l 50 -f -m -- /home/usr00/fishtest/server/env/bin/python3 /home/usr00/fishtest/server/utils/delta_upda+
16292 usr00 30 10 1025196 203236 40700 S 4.0 0:31.82 `- /home/usr00/fishtest/server/env/bin/python3 /home/usr00/fishtest/server/utils/delta_update_users.py
19005 root 20 61116 3088 2660 S 0.1 `- /usr/sbin/CRON -f
19006 usr00 20 4636 784 720 S 0.0 `- /bin/sh -c /usr/bin/nice -n 10 /usr/bin/cpulimit -l 50 -f -m -- ${VENV}/bin/python3 ${UPATH}/delta_update_users.py
19007 usr00 30 10 82528 856 772 S 0.3 0.0 0:12.16 `- /usr/bin/cpulimit -l 50 -f -m -- /home/usr00/fishtest/server/env/bin/python3 /home/usr00/fishtest/server/utils/delta_upda+
19008 usr00 30 10 1015964 194008 40676 S 3.8 0:26.30 `- /home/usr00/fishtest/server/env/bin/python3 /home/usr00/fishtest/server/utils/delta_update_users.py
22617 root 20 61116 3088 2660 S 0.1 `- /usr/sbin/CRON -f
22618 usr00 20 4636 812 748 S 0.0 `- /bin/sh -c /usr/bin/nice -n 10 /usr/bin/cpulimit -l 50 -f -m -- ${VENV}/bin/python3 ${UPATH}/delta_update_users.py
22619 usr00 30 10 82528 896 808 S 0.3 0.0 0:09.11 `- /usr/bin/cpulimit -l 50 -f -m -- /home/usr00/fishtest/server/env/bin/python3 /home/usr00/fishtest/server/utils/delta_upda+
22620 usr00 30 10 1018008 195696 40124 S 3.9 0:27.26 `- /home/usr00/fishtest/server/env/bin/python3 /home/usr00/fishtest/server/utils/delta_update_users.py
27139 root 20 61116 3088 2660 S 0.1 `- /usr/sbin/CRON -f
27140 usr00 20 4636 860 796 S 0.0 `- /bin/sh -c /usr/bin/nice -n 10 /usr/bin/cpulimit -l 50 -f -m -- ${VENV}/bin/python3 ${UPATH}/delta_update_users.py
27141 usr00 30 10 82528 856 768 S 0.3 0.0 0:05.75 `- /usr/bin/cpulimit -l 50 -f -m -- /home/usr00/fishtest/server/env/bin/python3 /home/usr00/fishtest/server/utils/delta_upda+
27142 usr00 30 10 1022124 200072 40336 S 3.9 0:26.16 `- /home/usr00/fishtest/server/env/bin/python3 /home/usr00/fishtest/server/utils/delta_update_users.py
32562 root 20 61116 3088 2660 S 0.1 `- /usr/sbin/CRON -f
32563 usr00 20 4636 872 808 S 0.0 `- /bin/sh -c /usr/bin/nice -n 10 /usr/bin/cpulimit -l 50 -f -m -- ${VENV}/bin/python3 ${UPATH}/delta_update_users.py
32564 usr00 30 10 82528 932 844 S 0.3 0.0 0:02.32 `- /usr/bin/cpulimit -l 50 -f -m -- /home/usr00/fishtest/server/env/bin/python3 /home/usr00/fishtest/server/utils/delta_upda+
32565 usr00 30 10 1023148 201140 40312 S 4.0 0:25.89 `- /home/usr00/fishtest/server/env/bin/python3 /home/usr00/fishtest/server/utils/delta_update_users.py
The new stack trace feature will be handy for debugging this.
Unfortunately fishtest just hangs at
Can you perhaps do ^C on dev (or get the error message from journalctl)? Since you are running a more recent Python, perhaps the error message is more illuminating. I see that the shutdown code has changed quite a bit.
$ env/bin/pserve production.ini http_port=6543
Starting server in PID 15641.
^Cflush
done
^Cflush
done
Exception ignored in: <module 'threading' from '/home/usr00/.pyenv/versions/3.12.3/lib/python3.12/threading.py'>
Traceback (most recent call last):
File "/home/usr00/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1622, in _shutdown
lock.acquire()
File "/home/usr00/fishtest/server/fishtest/rundb.py", line 356, in exit_run
sys.exit(0)
SystemExit: 0
It seems that a hang in sys.exit can occur because some threads have not finished. That could be visible in the stack traces: once it is hanging after a Ctrl+C, send the USR1 signal and look at where all the threads are. With some luck, the hanging thread can be seen.
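For readers unfamiliar with the technique, a minimal sketch of a signal-triggered stack dump follows. The name thread_stack_dump and the call traceback.print_stack(sys._current_frames()[th.ident]) appear in the backtrace posted in this thread; the optional file parameter, the install_stack_dump_handler helper and the choice of SIGUSR1 registration shown here are assumptions, not the code in fishtest/__init__.py.

```python
import signal
import sys
import threading
import traceback


def thread_stack_dump(signum=None, frame=None, file=None):
    """Print the current stack of every live thread (default: stderr)."""
    out = sys.stderr if file is None else file
    frames = sys._current_frames()  # maps thread ident -> topmost frame
    for th in threading.enumerate():
        print(f"=================== {th} ======================", file=out)
        stack = frames.get(th.ident)
        if stack is not None:
            traceback.print_stack(stack, file=out)


def install_stack_dump_handler():
    """Hypothetical helper: register the dump on SIGUSR1.

    Must be called from the main thread; SIGUSR1 does not exist on Windows.
    """
    if hasattr(signal, "SIGUSR1"):
        signal.signal(signal.SIGUSR1, thread_stack_dump)
```

With the handler installed at startup, running kill -USR1 followed by the server PID from another terminal prints one section per thread, which is enough to see which thread the interpreter is still waiting for.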
Thanks! At least I found the culprit. I am calling
I assume that in the constructor it is OK to write directly to the db, since the cache has not been initialized yet.
Backtrace:
=================== <_MainThread(MainThread, stopped 140327722854208)> ======================
File "/home/usr00/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1622, in _shutdown
lock.acquire()
File "/home/usr00/fishtest/server/fishtest/__init__.py", line 25, in thread_stack_dump
traceback.print_stack(sys._current_frames()[th.ident])
=================== <Thread(pymongo_server_monitor_thread, started daemon 140327159478016)> ======================
File "/home/usr00/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1030, in _bootstrap
self._bootstrap_inner()
File "/home/usr00/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1073, in _bootstrap_inner
self.run()
File "/home/usr00/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1010, in run
self._target(*self._args, **self._kwargs)
File "/home/usr00/fishtest/server/env/lib/python3.12/site-packages/pymongo/periodic_executor.py", line 141, in _run
if not self._target():
File "/home/usr00/fishtest/server/env/lib/python3.12/site-packages/pymongo/monitor.py", line 62, in target
monitor._run() # type:ignore[attr-defined]
File "/home/usr00/fishtest/server/env/lib/python3.12/site-packages/pymongo/monitor.py", line 192, in _run
self._server_description = self._check_server()
File "/home/usr00/fishtest/server/env/lib/python3.12/site-packages/pymongo/monitor.py", line 235, in _check_server
return self._check_once()
File "/home/usr00/fishtest/server/env/lib/python3.12/site-packages/pymongo/monitor.py", line 282, in _check_once
response, round_trip_time = self._check_with_socket(conn)
File "/home/usr00/fishtest/server/env/lib/python3.12/site-packages/pymongo/monitor.py", line 304, in _check_with_socket
response = Hello(conn._next_reply(), awaitable=True)
File "/home/usr00/fishtest/server/env/lib/python3.12/site-packages/pymongo/pool.py", line 918, in _next_reply
reply = self.receive_message(None)
File "/home/usr00/fishtest/server/env/lib/python3.12/site-packages/pymongo/pool.py", line 1037, in receive_message
return receive_message(self, request_id, self.max_message_size)
File "/home/usr00/fishtest/server/env/lib/python3.12/site-packages/pymongo/network.py", line 317, in receive_message
length, _, response_to, op_code = _UNPACK_HEADER(_receive_data_on_socket(conn, 16, deadline))
File "/home/usr00/fishtest/server/env/lib/python3.12/site-packages/pymongo/network.py", line 394, in _receive_data_on_socket
wait_for_read(conn, deadline)
File "/home/usr00/fishtest/server/env/lib/python3.12/site-packages/pymongo/network.py", line 375, in wait_for_read
readable = conn.socket_checker.select(sock, read=True, timeout=timeout)
File "/home/usr00/fishtest/server/env/lib/python3.12/site-packages/pymongo/socket_checker.py", line 66, in select
res = self._poller.poll(timeout_)
=================== <Thread(pymongo_kill_cursors_thread, started daemon 140327151085312)> ======================
File "/home/usr00/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1030, in _bootstrap
self._bootstrap_inner()
File "/home/usr00/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1073, in _bootstrap_inner
self.run()
File "/home/usr00/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1010, in run
self._target(*self._args, **self._kwargs)
File "/home/usr00/fishtest/server/env/lib/python3.12/site-packages/pymongo/periodic_executor.py", line 156, in _run
time.sleep(self._min_interval)
=================== <Thread(pymongo_server_rtt_thread, started daemon 140327141644032)> ======================
File "/home/usr00/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1030, in _bootstrap
self._bootstrap_inner()
File "/home/usr00/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1073, in _bootstrap_inner
self.run()
File "/home/usr00/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1010, in run
self._target(*self._args, **self._kwargs)
File "/home/usr00/fishtest/server/env/lib/python3.12/site-packages/pymongo/periodic_executor.py", line 156, in _run
time.sleep(self._min_interval)
=================== <Timer(Thread-118, started 140326739048192)> ======================
File "/home/usr00/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1030, in _bootstrap
self._bootstrap_inner()
File "/home/usr00/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1073, in _bootstrap_inner
self.run()
File "/home/usr00/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 1429, in run
self.finished.wait(self.interval)
File "/home/usr00/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 655, in wait
signaled = self._cond.wait(timeout)
File "/home/usr00/.pyenv/versions/3.12.3/lib/python3.12/threading.py", line 359, in wait
gotit = waiter.acquire(True, timeout)
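One possible reading of this backtrace: the main thread sits in threading._shutdown, and the only thread not marked daemon is a Timer still waiting out its interval. At exit Python waits for every non-daemon thread, so a pending threading.Timer delays shutdown until it fires or is cancelled. The sketch below only demonstrates that general behaviour; the helper names are hypothetical and nothing here is taken from the Fishtest code.

```python
import threading


def start_timer(interval, daemon):
    """Start a threading.Timer that would fire after `interval` seconds."""
    timer = threading.Timer(interval, lambda: None)
    timer.daemon = daemon
    timer.start()
    return timer


def non_daemon_threads():
    """Return live non-daemon threads other than the main thread.

    These are the threads the interpreter waits for at exit.
    """
    main = threading.main_thread()
    return [t for t in threading.enumerate() if t is not main and not t.daemon]


timer = start_timer(60.0, daemon=False)
blocking_before = timer in non_daemon_threads()  # exit would wait for this timer

timer.cancel()            # sets the timer's internal "finished" event
timer.join(timeout=5.0)   # the thread wakes up and ends without calling the function
blocking_after = timer in non_daemon_threads()  # nothing left to wait for
```

Cancelling every outstanding timer in the shutdown path, or creating the timers as daemon threads, are the two usual ways to avoid this kind of hang.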
The reason for this PR is that, before introducing more dynamic updates, we would like to verify that the dynamic updates of run["workers"] and run["cores"] introduced in official-stockfish#2010 are bug-free. To facilitate the use of periodic timers in Fishtest, we have introduced a new scheduler in utils.py, which may be interesting in its own right. Currently the timers defined are the cache flush timer and the timer introduced in this PR.
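To give an idea of what a scheduler for periodic timers can look like, here is a minimal sketch. It is not the scheduler from utils.py: the class name PeriodicScheduler and the methods add_task and stop are hypothetical, and the design (one daemon worker thread, a heap ordered by next run time, a condition variable for wake-ups) is just one common way to do it.

```python
import heapq
import itertools
import threading
import time


class PeriodicScheduler:
    """Run callables periodically on one background thread (hypothetical API)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._heap = []  # entries: (next_run_time, sequence_number, period, function)
        self._seq = itertools.count()  # tie-breaker so functions are never compared
        self._stopped = False
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def add_task(self, period, function):
        """Schedule `function` to run every `period` seconds."""
        with self._cond:
            entry = (time.monotonic() + period, next(self._seq), period, function)
            heapq.heappush(self._heap, entry)
            self._cond.notify()

    def stop(self):
        """Stop the worker thread and wait for it to finish."""
        with self._cond:
            self._stopped = True
            self._cond.notify()
        self._thread.join()

    def _run(self):
        while True:
            with self._cond:
                # Sleep until the earliest task is due or stop() is called.
                while not self._stopped:
                    if not self._heap:
                        self._cond.wait()
                        continue
                    delay = self._heap[0][0] - time.monotonic()
                    if delay <= 0:
                        break
                    self._cond.wait(delay)
                if self._stopped:
                    return
                _, _, period, function = heapq.heappop(self._heap)
                entry = (time.monotonic() + period, next(self._seq), period, function)
                heapq.heappush(self._heap, entry)
            try:
                function()  # run outside the lock so tasks cannot block scheduling
            except Exception as exc:
                print(f"periodic task failed: {exc!r}")


ticks = []
scheduler = PeriodicScheduler()
scheduler.add_task(0.01, lambda: ticks.append(time.monotonic()))
time.sleep(0.2)
scheduler.stop()
```

Because stop() joins the worker thread, a shutdown path that calls it leaves no timer thread behind for the interpreter to wait on.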
We have everywhere replaced
by
which does the required bookkeeping under the task_lock.
The intention is that the task_lock will only be held for a very short time. So in request_task we only grab it at the moment when we actually add a task to the run (currently it is held during the whole invocation of request_task).
If it works, then this PR can serve as a model for other types of bookkeeping, like the number of committed games.
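The idea of doing the bookkeeping in one short critical section can be sketched as follows. This is an illustration under assumptions, not the code of this PR: the helper names _update_counters and set_task_active, the simplified request_task signature, and the field names "tasks", "active" and worker_info["concurrency"] are all hypothetical.

```python
import threading

task_lock = threading.Lock()


def _update_counters(run, task, active):
    """Flip the task's flag and adjust the run counters. Caller holds task_lock."""
    if task["active"] == active:
        return  # no state change, so the counters must not move
    task["active"] = active
    sign = 1 if active else -1
    run["workers"] += sign
    run["cores"] += sign * task["worker_info"]["concurrency"]


def set_task_active(run, task_id, active):
    """Activate or deactivate a task, keeping run["cores"]/run["workers"] in sync."""
    with task_lock:  # held only for the few statements that touch the counters
        _update_counters(run, run["tasks"][task_id], active)


def request_task(run, worker_info):
    """Add a task for a worker; the lock is taken only for the final step."""
    # Slow work (validating the worker, choosing a run, ...) would go here, unlocked.
    task = {"active": False, "worker_info": worker_info}
    with task_lock:
        run["tasks"].append(task)
        _update_counters(run, task, True)
        return len(run["tasks"]) - 1


run = {"tasks": [], "cores": 0, "workers": 0}
first = request_task(run, {"concurrency": 4})
second = request_task(run, {"concurrency": 8})
set_task_active(run, first, False)
set_task_active(run, first, False)  # repeating the call leaves the counters unchanged
```

Routing every change of the flag through a single helper is what keeps the counters consistent; the same pattern would extend to other derived quantities maintained per run.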
Other change:
We start the cache timer in __init__.py at application startup instead of during the first invocation of buffer().