Lost my socks #15751

Closed
wants to merge 4 commits into from

AlanCoding
Member

SUMMARY

This scratches my own itch, since I've been looking at logs a lot.

The issue is that the installer sets up services before receptor is fully running. This results in log spam that looks like the following:

2025-01-15 20:37:13,710 INFO     [-] awx.main.dispatch Running worker dispatcher listening to queues ['tower_broadcast_all', 'tower_settings_change', '18.212.26.122']
2025-01-15 20:37:13,756 ERROR    [-] awx.main.dispatch Encountered unhandled error in dispatcher main loop
Traceback (most recent call last):
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/awx/main/dispatch/worker/base.py", line 239, in run
    self.worker.on_start()
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/awx/main/dispatch/worker/task.py", line 141, in on_start
    dispatch_startup()
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/awx/main/tasks/system.py", line 117, in dispatch_startup
    cluster_node_heartbeat()
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/awx/main/tasks/system.py", line 575, in cluster_node_heartbeat
    inspect_execution_and_hop_nodes(instance_list)
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/awx/main/tasks/system.py", line 514, in inspect_execution_and_hop_nodes
    mesh_status = ctl.simple_command('status')
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/receptorctl/socket_interface.py", line 81, in simple_command
    self.connect()
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/receptorctl/socket_interface.py", line 99, in connect
    raise ValueError(f"Socket path does not exist: {path}")
ValueError: Socket path does not exist: /var/run/receptor/receptor.sock
2025-01-15 20:37:17,758 INFO     [-] awx.main.dispatch Running worker dispatcher listening to queues ['tower_broadcast_all', 'tower_settings_change', '18.212.26.122']
2025-01-15 20:37:17,822 ERROR    [-] awx.main.dispatch Encountered unhandled error in dispatcher main loop
Traceback (most recent call last):
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/awx/main/dispatch/worker/base.py", line 239, in run
    self.worker.on_start()
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/awx/main/dispatch/worker/task.py", line 141, in on_start
    dispatch_startup()
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/awx/main/tasks/system.py", line 117, in dispatch_startup
    cluster_node_heartbeat()
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/awx/main/tasks/system.py", line 575, in cluster_node_heartbeat
    inspect_execution_and_hop_nodes(instance_list)
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/awx/main/tasks/system.py", line 514, in inspect_execution_and_hop_nodes
    mesh_status = ctl.simple_command('status')
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/receptorctl/socket_interface.py", line 81, in simple_command
    self.connect()
  File "/var/lib/awx/venv/awx/lib64/python3.11/site-packages/receptorctl/socket_interface.py", line 99, in connect
    raise ValueError(f"Socket path does not exist: {path}")

These come at 4-second intervals, and the state persists for a long time. I really don't want to see tracebacks in this situation, because this is the normal install flow. Tracebacks should only show up when there is an actual problem.
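A minimal sketch of the kind of pre-check this implies: test for the socket file before asking receptor for status, and log a single INFO line if it is missing. The helper name `receptor_socket_exists` is hypothetical and is not taken from the AWX code base; the socket path comes from the traceback above.

```python
import logging
import os

logger = logging.getLogger("awx.main.dispatch")

# Path as it appears in the traceback above.
RECEPTOR_SOCK = "/var/run/receptor/receptor.sock"


def receptor_socket_exists(path: str = RECEPTOR_SOCK) -> bool:
    """Hypothetical helper: True if the socket file is present.

    When the file is missing, emit one INFO line instead of letting
    the caller hit a ValueError and print a full traceback.
    """
    if os.path.exists(path):
        return True
    logger.info("Receptor sock file does not exist at %s", path)
    return False
```

A caller could skip the mesh status inspection for that heartbeat when this returns `False` and try again on the next cycle.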

Demo:

$ awx-manage run_dispatcher
2025-01-16 20:24:54,269 INFO     [-] awx.main.utils.redis Redis ping error: Error 2 connecting to unix socket: /var/run/redis/redis.sock. No such file or directory.

Next:

$ awx-manage run_dispatcher
2025-01-16 20:39:48,646 INFO     [-] awx.main.dispatch Running worker dispatcher listening to queues ['tower_broadcast_all', 'tower_settings_change', 'arominge-thinkpadp16vgen1.durham.csb']
2025-01-16 20:39:52,336 INFO     [-] awx.main.dispatch Could not create listener connection: [Errno -2] Name or service not known

Next:

$ awx-manage run_dispatcher
2025-01-16 20:35:46,234 INFO     [-] awx.main.dispatch Receptor sock file does not exist at /var/run/receptor/receptor.sock
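
The three runs above each end in a single INFO line rather than a traceback. One way to get that behavior, sketched under assumptions: wrap a startup step so that expected errors are logged by message only, while anything else is still logged with a traceback and re-raised. The name `run_quietly` and the choice of `ValueError` and `OSError` as the "expected" set are illustrative guesses, not what this PR actually does.

```python
import logging

logger = logging.getLogger("awx.main.dispatch")

# Assumption: ValueError is what receptorctl raised in the traceback above;
# OSError covers socket-level failures such as name resolution errors.
EXPECTED_STARTUP_ERRORS = (ValueError, OSError)


def run_quietly(step, expected=EXPECTED_STARTUP_ERRORS):
    """Hypothetical wrapper: run step() and report whether it succeeded.

    Expected errors become one INFO line and a False return value.
    Unexpected errors are logged with a traceback and re-raised.
    """
    try:
        step()
    except expected as exc:
        logger.info("%s", exc)
        return False
    except Exception:
        logger.exception("Encountered unhandled error in dispatcher main loop")
        raise
    return True
```

The trade-off is that a genuine bug that happens to raise one of the expected types would also be reduced to a single line, so the expected set should stay narrow.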

ISSUE TYPE

  • Bug, Docs Fix or other nominal change

COMPONENT NAME

  • API

@AlanCoding
Member Author

AlanCoding commented Jan 16, 2025

Started integration run 11858 to get logs from this.

EDIT: 11859

@AlanCoding
Member Author

This had a problem with the receptor file check: the check was failing on normal startup, so that's no good. Closing until that can be resolved.

AlanCoding closed this Jan 23, 2025