feat: manipulation and taxi demo launch files and readme (#297)
maciejmajek authored Oct 23, 2024
1 parent 7dc5fcb commit bfd3b03
Showing 4 changed files with 216 additions and 2 deletions.
74 changes: 73 additions & 1 deletion docs/demos/manipulation.md
@@ -1,3 +1,75 @@
# Manipulation tasks with natural language

Work in progress.
This demo shows how RAI (Robotec AI) performs manipulation tasks from natural language commands. It uses a Franka Emika Panda robot arm in a simulated environment: RAI interprets the instruction, locates the relevant objects with vision-based detection and segmentation, and executes the resulting movements on the arm.

![Manipulation Demo](../imgs/manipulation_demo.gif)

> [!NOTE]
> This readme is a work in progress.

## Setup

1. Follow the RAI setup instructions in the [main README](../../README.md#setup).
2. Install the additional dependencies:

```shell
poetry install --with openset
```

3. Clone the manipulation demo repository:

```bash
git clone https://github.com/RobotecAI/rai-manipulation-demo.git src/examples/rai-manipulation-demo
```

4. Download the latest binary release for your ROS 2 distribution:

- [ros2-humble-manipulation-demo](https://robotec-ml-roscon2024-demos.s3.eu-central-1.amazonaws.com/ROSCON_Release/RAIManipulationDemo_1.0.0_jammyhumble.zip)
- [ros2-jazzy-manipulation-demo](https://robotec-ml-roscon2024-demos.s3.eu-central-1.amazonaws.com/ROSCON_Release/RAIManipulationDemo_1.0.0_noblejazzy.zip)

5. Unpack the binary:

For Humble:

```bash
unzip RAIManipulationDemo_1.0.0_jammyhumble.zip
```

For Jazzy:

```bash
unzip RAIManipulationDemo_1.0.0_noblejazzy.zip
```

6. Build the ROS 2 workspace:

```bash
colcon build --symlink-install
```

## Running the Demo

> **Note**: Run every command in a shell that has been set up with `source setup_shell.sh`.

1. Start the demo:
```shell
ros2 launch examples/manipulation-demo.launch.py
```
2. Interact with the robot arm using natural language commands. For example:
```
Enter a prompt: Pick up the red cube and drop it on the other cube
```

## How it works

The manipulation demo utilizes several components:

1. Vision processing using Grounded SAM 2 and Grounding DINO for object detection and segmentation.
2. RAI agent to process the request and plan the manipulation sequence.
3. Robot arm control for executing the planned movements.

The main logic of the demo is implemented in the `ManipulationDemo` class, which can be found in:

```text
examples/manipulation-demo.py
```
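
The three components listed above can be pictured as a small pipeline. The following is a minimal sketch under stated assumptions, not the code in `examples/manipulation-demo.py`: `DetectedObject`, `detect_objects`, `plan_pick_and_place`, and `execute` are hypothetical stand-ins for the vision models, the RAI agent, and the arm controller, and all positions are made-up values.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import List, Tuple

Position = Tuple[float, float, float]


@dataclass
class DetectedObject:
    """Hypothetical output of the vision step: a label and a 3D position."""

    label: str
    position: Position


def detect_objects(scene: List[DetectedObject], query: str) -> List[DetectedObject]:
    """Stand-in for detection: keep scene objects whose label contains the query."""
    return [obj for obj in scene if query.lower() in obj.label.lower()]


def plan_pick_and_place(
    source: DetectedObject, target: DetectedObject, clearance: float = 0.05
) -> List[Tuple[str, Position]]:
    """Stand-in for the agent's plan: an ordered list of (action, position) steps."""
    x, y, z = target.position
    drop_position = (x, y, z + clearance)  # release slightly above the target
    return [
        ("move_to", source.position),
        ("grasp", source.position),
        ("move_to", drop_position),
        ("release", drop_position),
    ]


def execute(plan: List[Tuple[str, Position]]) -> List[str]:
    """Stand-in for arm control: describe each step instead of moving hardware."""
    return [f"{action} {position}" for action, position in plan]


if __name__ == "__main__":
    scene = [
        DetectedObject("red cube", (0.4, 0.0, 0.1)),
        DetectedObject("blue cube", (0.4, 0.2, 0.1)),
    ]
    red = detect_objects(scene, "red cube")[0]
    blue = detect_objects(scene, "blue cube")[0]
    for line in execute(plan_pick_and_place(red, blue)):
        print(line)
```

Keeping detection, planning, and execution as separate functions mirrors the component list: each stage can be swapped or tested on its own.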
43 changes: 42 additions & 1 deletion docs/demos/taxi.md
@@ -1,3 +1,44 @@
# Speech-to-speech interaction with autonomous taxi

Work in progress.
This demo showcases a speech-to-speech interaction with an autonomous taxi using RAI (Robotec AI) in an AWSIM environment with Autoware. Users can specify destinations verbally, and the system will process the request, plan the route, and navigate the taxi accordingly.

![Autonomous Taxi Demo](../imgs/taxi_demo.gif)

> [!NOTE]
> This readme is a work in progress.

## Prerequisites

Before running this demo, make sure the following are in place:

1. Autoware and AWSIM are installed; see the [AWSIM quick start guide](https://tier4.github.io/AWSIM/GettingStarted/QuickStartDemo/).
2. The speech-to-speech interface is configured as described in the [voice interface documentation](../human_robot_interface/voice_interface.md).

## Running the Demo

1. Start AWSIM and Autoware, following the AWSIM quick start guide linked under Prerequisites.

2. Run the taxi demo:

```bash
source ./setup_shell.sh
ros2 launch examples/taxi-demo.launch.py
```

3. To interact with the taxi using speech, speak your destination into your microphone. The system will process your request and plan the route for the autonomous taxi.

## How it works

The taxi demo utilizes several components:

1. Speech recognition (ASR) to convert the user's spoken words into text.
2. RAI agent to process the request and interact with Autoware for navigation.
3. Text-to-speech (TTS) to convert the system's response back into speech.
4. Autoware for autonomous driving capabilities.
5. AWSIM for simulation of the urban environment.

The main logic of the demo is implemented in the `TaxiDemo` class, which can be found in:

```text
examples/taxi-demo.py
```
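
The flow between the speech components listed above can be sketched as one function that chains ASR, the agent, and TTS. This is only an illustration under stated assumptions, not the code in `examples/taxi-demo.py`: `handle_utterance`, `fake_asr`, `fake_agent`, `fake_tts`, and `KNOWN_DESTINATIONS` are hypothetical names, and the stubs use plain text where the real system would use audio and a language model.

```python
from typing import Callable, Optional, Tuple

# Hypothetical destinations, used only for this illustration.
KNOWN_DESTINATIONS = ("airport", "station")


def fake_asr(audio: bytes) -> str:
    """Stand-in for speech recognition: treat the audio bytes as UTF-8 text."""
    return audio.decode("utf-8")


def fake_agent(text: str) -> Tuple[Optional[str], str]:
    """Stand-in for the agent: pick a known destination and compose a reply."""
    lowered = text.lower()
    for place in KNOWN_DESTINATIONS:
        if place in lowered:
            return place, f"Driving to the {place}."
    return None, "Sorry, I did not catch the destination."


def fake_tts(text: str) -> bytes:
    """Stand-in for text-to-speech: encode the reply as bytes."""
    return text.encode("utf-8")


def handle_utterance(
    audio: bytes,
    asr: Callable[[bytes], str],
    agent: Callable[[str], Tuple[Optional[str], str]],
    tts: Callable[[str], bytes],
) -> Tuple[Optional[str], bytes]:
    """Run one speech-to-speech turn: audio in, (destination, spoken reply) out."""
    text = asr(audio)
    destination, reply = agent(text)
    return destination, tts(reply)


if __name__ == "__main__":
    print(handle_utterance(b"Take me to the airport", fake_asr, fake_agent, fake_tts))
```

Passing the three stages in as callables keeps the turn logic independent of which ASR or TTS vendor is configured.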
53 changes: 53 additions & 0 deletions examples/manipulation-demo.launch.py
@@ -0,0 +1,53 @@
# Copyright (C) 2024 Robotec.AI
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from launch import LaunchDescription
from launch.actions import DeclareLaunchArgument, ExecuteProcess
from launch.substitutions import LaunchConfiguration
from launch_ros.actions import Node


def generate_launch_description():
    # Declare the game_launcher argument
    game_launcher_arg = DeclareLaunchArgument(
        "game_launcher",
        default_value="",
        description="Path to the game launcher executable",
    )

    return LaunchDescription(
        [
            # Include the game_launcher argument
            game_launcher_arg,
            # Launch the manipulation demo Python script
            ExecuteProcess(
                cmd=["python3", "examples/manipulation-demo.py"], output="screen"
            ),
            # Launch the robotic_manipulation node
            Node(
                package="robotic_manipulation",
                executable="robotic_manipulation",
                name="robotic_manipulation_node",
                output="screen",
            ),
            # Launch the game launcher
            ExecuteProcess(
                cmd=[
                    LaunchConfiguration("game_launcher"),
                    "-bg_ConnectToAssetProcessor=0",
                ],
                output="screen",
            ),
        ]
    )
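
The launch file above declares a `game_launcher` argument whose default is an empty string, so a caller presumably needs to supply the path to the unpacked simulation binary. As a hedged sketch, the helper below assembles a `ros2 launch` command line using the standard `name:=value` argument syntax. `build_launch_command` and the example path are hypothetical and not part of this commit.

```python
from typing import Dict, List


def build_launch_command(launch_file: str, arguments: Dict[str, str]) -> List[str]:
    """Build a `ros2 launch` command with launch arguments in name:=value form."""
    command = ["ros2", "launch", launch_file]
    for name, value in arguments.items():
        if not value:
            # An empty value would leave the launch argument effectively unset.
            raise ValueError(f"launch argument '{name}' must not be empty")
        command.append(f"{name}:={value}")
    return command


if __name__ == "__main__":
    # The path below is a placeholder; use the location of your unpacked binary.
    print(
        " ".join(
            build_launch_command(
                "examples/manipulation-demo.launch.py",
                {"game_launcher": "/path/to/GameLauncher"},
            )
        )
    )
```

The resulting list can be passed to `subprocess.run` from a sourced shell environment, or joined with spaces and typed by hand.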
48 changes: 48 additions & 0 deletions examples/taxi-demo.launch.py
@@ -0,0 +1,48 @@
# Copyright (C) 2024 Robotec.AI
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from launch import LaunchDescription
from launch.actions import ExecuteProcess, IncludeLaunchDescription
from launch.launch_description_sources import PythonLaunchDescriptionSource
from launch.substitutions import PathJoinSubstitution
from launch_ros.substitutions import FindPackageShare


def generate_launch_description():
    return LaunchDescription(
        [
            # Launch the taxi demo Python script
            ExecuteProcess(cmd=["python3", "examples/taxi-demo.py"], output="screen"),
            # Include the voice launch from rai_bringup
            IncludeLaunchDescription(
                PythonLaunchDescriptionSource(
                    [
                        PathJoinSubstitution(
                            [
                                FindPackageShare("rai_bringup"),
                                "launch",
                                "voice.launch.py",
                            ]
                        )
                    ]
                ),
                launch_arguments={
                    "tts_vendor": "opentts",  # elevenlabs (paid), opentts (free, local model)
                    "asr_vendor": "whisper",  # whisper (free, local model), openai (paid)
                    "recording_device": "0",  # find your recording device using python -c 'import sounddevice as sd; print(sd.query_devices())'
                    "keep_speaker_busy": "false",
                }.items(),
            ),
        ]
    )
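
The `launch_arguments` above are plain string pairs, so a typo in a vendor name would only surface once the voice nodes start. The sketch below shows one way to validate the values before handing them to `IncludeLaunchDescription`. `make_voice_arguments` is a hypothetical helper, and the vendor sets are taken only from the inline comments in this file, so they may not be exhaustive.

```python
from typing import List, Tuple

# Vendor names as listed in the launch file's inline comments (assumed complete).
TTS_VENDORS = {"elevenlabs", "opentts"}
ASR_VENDORS = {"whisper", "openai"}


def make_voice_arguments(
    tts_vendor: str = "opentts",
    asr_vendor: str = "whisper",
    recording_device: int = 0,
    keep_speaker_busy: bool = False,
) -> List[Tuple[str, str]]:
    """Validate the voice settings and return them as (name, value) string pairs."""
    if tts_vendor not in TTS_VENDORS:
        raise ValueError(f"unknown tts_vendor: {tts_vendor!r}")
    if asr_vendor not in ASR_VENDORS:
        raise ValueError(f"unknown asr_vendor: {asr_vendor!r}")
    if recording_device < 0:
        raise ValueError("recording_device must be a non-negative index")
    return list(
        {
            "tts_vendor": tts_vendor,
            "asr_vendor": asr_vendor,
            "recording_device": str(recording_device),
            "keep_speaker_busy": str(keep_speaker_busy).lower(),
        }.items()
    )


if __name__ == "__main__":
    print(make_voice_arguments())
```

The returned list has the same shape as the `.items()` call in the launch file, so it could be passed as `launch_arguments=make_voice_arguments(...)`.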
