Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

398 featureexperiment proper vscode debugging #478

Merged
merged 21 commits into from
Nov 18, 2024
Merged
Show file tree
Hide file tree
Changes from 20 commits
Commits
Show all changes
21 commits
Select commit Hold shift + click to select a range
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
23 changes: 23 additions & 0 deletions .vscode/launch.json
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
{
// Use IntelliSense to learn about possible attributes.
// Hover to view descriptions of existing attributes.
// For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
"version": "0.2.0",
"configurations": [
{
"name": "53000 leaderboard attach",
"type": "debugpy",
"request": "attach",
"connect": {
"host": "localhost",
"port": 53000
},
"pathMappings": [
{
"localRoot": "${workspaceFolder}/code",
"remoteRoot": "${env:PAF_CATKIN_CODE_ROOT}"
}
]
}
],
}
Comment on lines +22 to +23
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Remove trailing comma for valid JSON syntax.

The trailing comma after the configurations array should be removed.

Apply this diff to fix the JSON syntax:

     "configurations": [
         {
             "name": "53000 leaderboard attach",
             "type": "debugpy",
             "request": "attach",
             "connect": {
                 "host": "localhost",
                 "port": 53000
             },
             "pathMappings": [
                 {
                     "localRoot": "${workspaceFolder}/code",
                     "remoteRoot": "${env:PAF_CATKIN_CODE_ROOT}"
                 }
             ]
-        }
-    ],
+        }
+    ]
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
],
}
],
}
```
```suggestion
]
}
🧰 Tools
🪛 Biome

[error] 22-23: Expected a property but instead found '}'.

Expected a property here.

(parse)

Comment on lines +22 to +23
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Remove trailing comma to fix JSON syntax.

The trailing comma before the closing brace is causing a syntax error. This could prevent VS Code from properly loading the debug configuration.

Apply this fix:

    "configurations": [
        {
            "name": "53000 leaderboard attach",
            "type": "debugpy",
            "request": "attach",
            "connect": {
                "host": "localhost",
                "port": 53000
            },
            "pathMappings": [
                {
                    "localRoot": "${workspaceFolder}/code",
                    "remoteRoot": "${env:PAF_CATKIN_CODE_ROOT}"
                }
            ]
-        }
-    ],
+        }
+    ]
}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
],
}
],
}
```
```suggestion
]
}
🧰 Tools
🪛 Biome

[error] 22-23: Expected a property but instead found '}'.

Expected a property here.

(parse)

1 change: 1 addition & 0 deletions build/agent_service.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,7 @@ services:
- ROS_HOSTNAME=agent
- XDG_RUNTIME_DIR=/tmp/runtime-carla
- ROUTE=/opt/leaderboard/data/routes_devtest.xml
- DEBUG_WRAPPER_DEFAULT_HOST=0.0.0.0
# Simple route without special scenarios
# - ROUTE=/workspace/code/routes/routes_simple.xml
volumes:
Expand Down
9 changes: 8 additions & 1 deletion build/docker-compose.leaderboard.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -8,4 +8,11 @@ services:
extends:
file: agent_service.yaml
service: agent
command: bash -c "sleep 10 && sudo chown -R ${USER_UID}:${USER_GID} ../ && sudo chmod -R a+w ../ && python3 /opt/leaderboard/leaderboard/leaderboard_evaluator.py --debug=0 --routes=$${ROUTE} --agent=/workspace/code/agent/src/agent/agent.py --host=$${CARLA_SIM_HOST} --track=MAP"
ports:
# Reserved ports for the debugger
- "53000-53100:53000-53100"
Comment on lines +11 to +13
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

Port range can be safely reduced to a single debug port

Based on the codebase analysis:

  1. No specific requirements for multiple debug ports were found
  2. The only debug configurations found are using a single debug flag (--debug=0)
  3. The DEBUG_WRAPPER_DEFAULT_HOST setting suggests a single host configuration

Recommend:

  • Reduce the port range to a single port (e.g., - "53000:53000")
  • Document the purpose of the debug port in a comment
🔗 Analysis chain

Consider reducing the exposed port range for security.

While exposing ports for debugging is necessary, the current range of 100 ports (53000-53100) seems excessive for debugging purposes. Consider limiting this to a smaller range unless there's a specific requirement for multiple debug sessions.

Let's verify if there are any specific port requirements in the codebase:

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for debugpy port configurations in the codebase
rg -l "connect\(.*53\d{3}\)" || rg -l "listen\(.*53\d{3}\)"

Length of output: 128


Script:

#!/bin/bash
# Search for debugger-related configurations and ports in various file types
rg -i "debug.*port|debugger.*port|port.*debug" --type yaml --type python --type javascript

# Search for VS Code launch configurations that might specify debug ports
fd -e json -x cat {} \; | grep -i "port.*53"

# Look for any documentation about debugging setup
fd "README|CONTRIBUTING" -t f -x grep -il "debug" {}

Length of output: 255


Script:

#!/bin/bash
# Check the content of the perception README for any debug port information
cat ./doc/perception/README.md

# Look for any VS Code launch configurations
fd -e vscode -d 2 -x cat {}

# Search for any debug-related configurations in yaml files
rg -i "debug" --type yaml

Length of output: 2369

command: |-
bash -c "sleep 10 && sudo chown -R ${USER_UID}:${USER_GID} ../ && \
sudo chmod -R a+w ../ && sudo mkdir -p $${XDG_RUNTIME_DIR} && sudo chmod 0700 $${XDG_RUNTIME_DIR} && sudo chown -R ${USER_UID}:${USER_GID} $${XDG_RUNTIME_DIR} && \
(rqt_console &) && disown -a && \
python3 /opt/leaderboard/leaderboard/leaderboard_evaluator.py --debug=0 --routes=$${ROUTE} --agent=/workspace/code/agent/src/agent/agent.py --host=$${CARLA_SIM_HOST} --track=MAP"
Comment on lines +14 to +18
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codebase verification

🛠️ Refactor suggestion

Consider using service dependencies instead of sleep

Based on the analysis of the codebase, I'm modifying my previous recommendation. The docker-compose.cicd.yaml shows that the agent service properly depends on carla-simulator and roscore services. Instead of using a complex retry mechanism, we should leverage Docker Compose's built-in dependency handling:

     command: |-
-      bash -c "sleep 10 && sudo chown -R ${USER_UID}:${USER_GID} ../ && \
-      sudo chmod -R a+w ../ && sudo mkdir -p $${XDG_RUNTIME_DIR} && sudo chmod 0700 $${XDG_RUNTIME_DIR} && sudo chown -R ${USER_UID}:${USER_GID} $${XDG_RUNTIME_DIR} && \
-      (rqt_console &) && disown -a && \
-      python3 /opt/leaderboard/leaderboard/leaderboard_evaluator.py --debug=0 --routes=$${ROUTE} --agent=/workspace/code/agent/src/agent/agent.py --host=$${CARLA_SIM_HOST} --track=MAP"
+      bash -c '
+        # Setup error handling
+        set -e
+        
+        # Setup permissions
+        echo "Setting up permissions..."
+        sudo chown -R ${USER_UID}:${USER_GID} ../ || exit 1
+        sudo chmod -R a+w ../ || exit 1
+        
+        # Setup runtime directory
+        echo "Setting up runtime directory..."
+        sudo mkdir -p $${XDG_RUNTIME_DIR} || exit 1
+        sudo chmod 0700 $${XDG_RUNTIME_DIR} || exit 1
+        sudo chown -R ${USER_UID}:${USER_GID} $${XDG_RUNTIME_DIR} || exit 1
+        
+        # Start rqt_console in background
+        echo "Starting rqt_console..."
+        rqt_console &
+        disown -a
+        
+        # Start the evaluator
+        echo "Starting leaderboard evaluator..."
+        exec python3 /opt/leaderboard/leaderboard/leaderboard_evaluator.py \
+          --debug=0 \
+          --routes=$${ROUTE} \
+          --agent=/workspace/code/agent/src/agent/agent.py \
+          --host=$${CARLA_SIM_HOST} \
+          --track=MAP
+      '

Add these service dependencies to ensure proper startup order:

    depends_on:
      - carla-simulator
      - roscore

This approach:

  • Removes the arbitrary sleep 10 and relies on Docker's service orchestration
  • Maintains error handling for critical file operations
  • Improves readability and logging
  • Preserves the original functionality in a more robust way
🔗 Analysis chain

Enhance command robustness and readability.

The current command structure has several potential issues:

  1. The sleep 10 is a brittle way to handle timing dependencies
  2. No error handling for critical operations (chmod, chown, mkdir)
  3. Multiple sudo operations without checking for failures

Consider refactoring to this more robust approach:

     command: |-
-      bash -c "sleep 10 && sudo chown -R ${USER_UID}:${USER_GID} ../ && \
-      sudo chmod -R a+w ../ && sudo mkdir -p $${XDG_RUNTIME_DIR} && sudo chmod 0700 $${XDG_RUNTIME_DIR} && sudo chown -R ${USER_UID}:${USER_GID} $${XDG_RUNTIME_DIR} && \
-      (rqt_console &) && disown -a && \
-      python3 /opt/leaderboard/leaderboard/leaderboard_evaluator.py --debug=0 --routes=$${ROUTE} --agent=/workspace/code/agent/src/agent/agent.py --host=$${CARLA_SIM_HOST} --track=MAP"
+      bash -c '
+        # Setup error handling
+        set -e
+        
+        # Function to retry operations with timeout
+        wait_for_service() {
+          local retries=30
+          local wait_time=2
+          while [ $retries -gt 0 ]; do
+            if [ -d "../" ]; then
+              return 0
+            fi
+            retries=$((retries-1))
+            sleep $wait_time
+          done
+          return 1
+        }
+        
+        # Wait for services to be ready
+        echo "Waiting for services..."
+        wait_for_service || { echo "Service startup timeout"; exit 1; }
+        
+        # Setup permissions
+        echo "Setting up permissions..."
+        sudo chown -R ${USER_UID}:${USER_GID} ../ || { echo "Failed to set ownership"; exit 1; }
+        sudo chmod -R a+w ../ || { echo "Failed to set permissions"; exit 1; }
+        
+        # Setup runtime directory
+        echo "Setting up runtime directory..."
+        sudo mkdir -p $${XDG_RUNTIME_DIR} || { echo "Failed to create runtime dir"; exit 1; }
+        sudo chmod 0700 $${XDG_RUNTIME_DIR} || { echo "Failed to set runtime dir permissions"; exit 1; }
+        sudo chown -R ${USER_UID}:${USER_GID} $${XDG_RUNTIME_DIR} || { echo "Failed to set runtime dir ownership"; exit 1; }
+        
+        # Start rqt_console in background
+        echo "Starting rqt_console..."
+        rqt_console &
+        disown -a
+        
+        # Start the evaluator
+        echo "Starting leaderboard evaluator..."
+        exec python3 /opt/leaderboard/leaderboard/leaderboard_evaluator.py \
+          --debug=0 \
+          --routes=$${ROUTE} \
+          --agent=/workspace/code/agent/src/agent/agent.py \
+          --host=$${CARLA_SIM_HOST} \
+          --track=MAP
+      '

This refactored version:

  • Replaces static sleep with dynamic service checking
  • Adds proper error handling and logging
  • Improves readability with structured sections
  • Uses exec to replace shell with the Python process

Let's verify the service dependencies:


Let's search for any service dependencies and timing-related patterns in the build directory to better understand the startup requirements.


Let's examine the service dependencies and health check configurations to better understand the startup requirements and validate our recommendation.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Check for other services that might need to be ready before agent starts
rg -l "depends_on.*agent" build/

Length of output: 128


Script:

#!/bin/bash
# Check for service dependencies in docker-compose files
rg -l "depends_on" build/

# Check for any references to timing or startup order
rg "sleep|wait" build/

Length of output: 1258


Script:

#!/bin/bash
# Check the content of docker-compose.cicd.yaml to understand dependencies
cat build/docker-compose.cicd.yaml

# Check if there are any health checks defined
rg "healthcheck" build/

Length of output: 842

4 changes: 3 additions & 1 deletion build/docker/agent/Dockerfile
Original file line number Diff line number Diff line change
Expand Up @@ -162,8 +162,10 @@ RUN source ~/.bashrc && pip install -r /workspace/requirements.txt
# Add agent code
COPY --chown=$USERNAME:$USERNAME ./code /workspace/code/

# For debugger
ENV PAF_CATKIN_CODE_ROOT=/catkin_ws/src/code
# Link code into catkin workspace
RUN ln -s /workspace/code /catkin_ws/src
RUN ln -s -T /workspace/code ${PAF_CATKIN_CODE_ROOT}

# re-make the catkin workspace
RUN source /opt/ros/noetic/setup.bash && catkin_make
Expand Down
1 change: 1 addition & 0 deletions code/acting/src/debug_wrapper.py
4 changes: 4 additions & 0 deletions code/agent/launch/agent.launch
Original file line number Diff line number Diff line change
Expand Up @@ -21,5 +21,9 @@
<arg name="role_name" value="$(arg role_name)"/>
</include>

<!-- debugging -->
<include file="$(find debugging)/launch/debugging.launch">
</include>

<node type="rviz" name="rviz" pkg="rviz" args="-d $(find agent)/config/rviz_config.rviz" />
</launch>
201 changes: 201 additions & 0 deletions code/debug_wrapper.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,201 @@
#!/usr/bin/env python3
"""Debug wrapper node

Node that wraps a python ros node
and is able to open a debugpy remote debugger instance.

Logs any exceptions from the node and other information
into the ros console via the debug_logger node.

Always tries to at least start the node,
even if dependencies like debugpy are missing.

Usage:
Already done for existing ros packages: symlink this file into the package. Example:
`cd code/perception/src && ln -s ../../debug_wrapper.py debug_wrapper.py`

Adjust the launch configuration to use the debug_wrapper.py
instead of the node file and set the required args

More info in [debugging.md](doc/development/debugging.md)

Arguments:
--debug_node: Required: The filename of the node to debug
--debug_port: The port the debugger listens on. If not set,
--debug_host: The host the debugger binds to.
Defaults to the environment variable `DEBUG_WRAPPER_DEFAULT_HOST` if set,
otherwise `localhost`
--debug_wait: If True, the wrapper waits until a client (VS Code) is connected
and only then starts node execution

Raises:
ArgumentParserError: Missing required parameters
error: If the started debug_node raises an exception,
it is logged and then raised again
"""


import importlib.util
import os
import runpy
import sys
import argparse
import time
from multiprocessing.connection import Client

import rospy

NODE_NAME = "NAMEERR"


def eprint(msg: str):
"""Log msg into stderr.

Used instead of print, because only stderr seems to reliably land in agent.log

Args:
msg (str): Log message
"""
print(f"[debug_wrapper]: {msg}", file=sys.stderr)


def log(msg: str, level: str):
"""Log msg via the debug_logger node

Args:
msg (str): Log message
level (str): Log level. One of (debug), info, warn, error, fatal.
debug level not recommended, because these are not published to /rosout
"""
error = None
success = False
start_time = time.monotonic()
while not success and start_time + 5.0 > time.monotonic():
try:
address = ("localhost", 52999)
conn = Client(address, authkey=b"debug_logger")
conn.send({"name": NODE_NAME, "msg": msg, "level": level})
conn.close()
success = True
except Exception as e:
error = e
if not success:
eprint(msg)
if error is not None:
eprint(f"Failed to send to logger: {error}")

Comment on lines +70 to +86
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Improve logging reliability and error handling

The logging implementation has several areas for improvement:

  1. The 5-second retry loop could delay node startup
  2. The connection is not properly closed in error cases
  3. The error handling could be more informative

Apply this diff:

     error = None
     success = False
     start_time = time.monotonic()
+    conn = None
     while not success and start_time + 5.0 > time.monotonic():
         try:
             address = ("localhost", 52999)
             conn = Client(address, authkey=b"debug_logger")
             conn.send({"name": NODE_NAME, "msg": msg, "level": level})
-            conn.close()
             success = True
         except Exception as e:
             error = e
+        finally:
+            if conn is not None:
+                conn.close()
     if not success:
         eprint(msg)
         if error is not None:
-            eprint(f"Failed to send to logger: {error}")
+            eprint(f"Failed to send to logger after {time.monotonic() - start_time:.1f}s: {error}")
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
error = None
success = False
start_time = time.monotonic()
while not success and start_time + 5.0 > time.monotonic():
try:
address = ("localhost", 52999)
conn = Client(address, authkey=b"debug_logger")
conn.send({"name": NODE_NAME, "msg": msg, "level": level})
conn.close()
success = True
except Exception as e:
error = e
if not success:
eprint(msg)
if error is not None:
eprint(f"Failed to send to logger: {error}")
error = None
success = False
start_time = time.monotonic()
conn = None
while not success and start_time + 5.0 > time.monotonic():
try:
address = ("localhost", 52999)
conn = Client(address, authkey=b"debug_logger")
conn.send({"name": NODE_NAME, "msg": msg, "level": level})
success = True
except Exception as e:
error = e
finally:
if conn is not None:
conn.close()
if not success:
eprint(msg)
if error is not None:
eprint(f"Failed to send to logger after {time.monotonic() - start_time:.1f}s: {error}")


def logfatal(msg: str):
log(f"FAILED TO START NODE - NODE WILL NOT SHOW UP: {msg}", "fatal")


def logerr(msg: str):
log(msg, "error")


def logwarn(msg: str):
log(msg, "warn")


def loginfo(msg: str):
log(msg, "info")


def run_module_at(path: str):
"""Runs a python module based on its file path

Args:
path (str): python file path to run
"""
basename = os.path.basename(path)
module_dir = os.path.dirname(path)
module_name = os.path.splitext(basename)[0]
sys.path.append(module_dir)
runpy.run_module(module_name, run_name="__main__")

Comment on lines +104 to +115
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Handle sys.path modifications safely

The current implementation modifies sys.path without cleanup, which could lead to unintended side effects.

 def run_module_at(path: str):
     """Runs a python module based on its file path

     Args:
         path (str): python file path to run
     """
     basename = os.path.basename(path)
     module_dir = os.path.dirname(path)
     module_name = os.path.splitext(basename)[0]
+    original_path = sys.path.copy()
     sys.path.append(module_dir)
-    runpy.run_module(module_name, run_name="__main__")
+    try:
+        runpy.run_module(module_name, run_name="__main__")
+    finally:
+        sys.path = original_path
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
def run_module_at(path: str):
"""Runs a python module based on its file path
Args:
path (str): python file path to run
"""
basename = os.path.basename(path)
module_dir = os.path.dirname(path)
module_name = os.path.splitext(basename)[0]
sys.path.append(module_dir)
runpy.run_module(module_name, run_name="__main__")
def run_module_at(path: str):
"""Runs a python module based on its file path
Args:
path (str): python file path to run
"""
basename = os.path.basename(path)
module_dir = os.path.dirname(path)
module_name = os.path.splitext(basename)[0]
original_path = sys.path.copy()
sys.path.append(module_dir)
try:
runpy.run_module(module_name, run_name="__main__")
finally:
sys.path = original_path


def start_debugger(
node_module_name: str, host: str, port: int, wait_for_client: bool = False
):
"""_summary_

Args:
node_module_name (str): Name of the underlying node. Only used for logging
host (str): host the debugger binds to
port (int): debugger port
wait_for_client (bool, optional): If the debugger should wait
for a client to attach. Defaults to False.
"""
debugger_spec = importlib.util.find_spec("debugpy")
if debugger_spec is not None:
try:
import debugpy

debugpy.listen((host, port))
logwarn(f"Started debugger on {host}:{port} for {node_module_name}")
if wait_for_client:
logwarn("Waiting until debugging client is attached...")
debugpy.wait_for_client()
except Exception as error:
# Yes, all exceptions should be catched and sent into rosconsole
logerr(f"Failed to start debugger: {error}")
else:
logerr("debugpy module not found. Unable to start debugger")


class ArgumentParserError(Exception):
pass


class ThrowingArgumentParser(argparse.ArgumentParser):
def error(self, message):
logfatal(f"Wrong node arguments. Check launch config. : {message}")
raise ArgumentParserError(message)


def main(argv):
default_host = "localhost"
if "DEBUG_WRAPPER_DEFAULT_HOST" in os.environ:
default_host = os.environ["DEBUG_WRAPPER_DEFAULT_HOST"]

node_args = rospy.myargv(argv=argv)
parser = ThrowingArgumentParser(
prog="debug wrapper",
)
parser.add_argument("--debug_node", required=True, type=str)
parser.add_argument("--debug_port", required=False, type=int)
Zelberor marked this conversation as resolved.
Show resolved Hide resolved
parser.add_argument("--debug_host", default=default_host, type=str)
parser.add_argument("--debug_wait", action="store_true")
args, unknown_args = parser.parse_known_args(node_args)

debug_node = args.debug_node
global NODE_NAME
NODE_NAME = debug_node
base_dir = os.path.abspath(os.path.dirname(__file__))

if args.debug_port is not None:
start_debugger(
args.debug_node,
args.debug_host,
args.debug_port,
wait_for_client=args.debug_wait,
)
else:
logerr(
"""Missing parameter to start debugger: --debug_port
Add it in the launch configuration"""
)

target_type_path = os.path.join(base_dir, debug_node)
loginfo(f"Node {args.debug_node} starting at {base_dir}")
try:
run_module_at(target_type_path)
except BaseException as error:
# Yes, all exceptions including SystemExit should be catched.
# We want to always know when a node exits
logfatal(f"Failed to run node {debug_node}: {error}")
raise error


if __name__ == "__main__":
main(argv=sys.argv)
Loading
Loading