feat: manipulation and taxi demo launch files and readme (#297)
1 parent 7dc5fcb
commit bfd3b03
Showing 4 changed files with 216 additions and 2 deletions.
@@ -1,3 +1,75 @@
# Manipulation tasks with natural language

Work in progress.
This demo showcases the capabilities of RAI (Robotec AI) in performing manipulation tasks using natural language commands. The demo utilizes a robot arm (Franka Emika Panda) in a simulated environment, demonstrating how RAI can interpret complex instructions and execute them using advanced vision and manipulation techniques.

![Manipulation Demo](../imgs/manipulation_demo.gif)

> [!NOTE]
> This readme is a work in progress.

## Setup

1. Follow the RAI setup instructions in the [main README](../../README.md#setup).
2. Install the additional dependencies:

   ```shell
   poetry install --with openset
   ```

3. Clone the manipulation demo repository:

   ```bash
   git clone https://github.com/RobotecAI/rai-manipulation-demo.git src/examples/rai-manipulation-demo
   ```

4. Download the latest binary release for your ROS 2 distribution:

   - [ros2-humble-manipulation-demo](https://robotec-ml-roscon2024-demos.s3.eu-central-1.amazonaws.com/ROSCON_Release/RAIManipulationDemo_1.0.0_jammyhumble.zip)
   - [ros2-jazzy-manipulation-demo](https://robotec-ml-roscon2024-demos.s3.eu-central-1.amazonaws.com/ROSCON_Release/RAIManipulationDemo_1.0.0_noblejazzy.zip)

5. Unpack the binary:

   For Humble:

   ```bash
   unzip RAIManipulationDemo_1.0.0_jammyhumble.zip
   ```

   For Jazzy:

   ```bash
   unzip RAIManipulationDemo_1.0.0_noblejazzy.zip
   ```

6. Build the ROS 2 workspace:

   ```bash
   colcon build --symlink-install
   ```
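
Steps 4 and 5 could also be scripted. The sketch below shows one possible way to pick the archive for the active ROS 2 distribution and unpack it. It is not part of the demo repository: the helper names `archive_for` and `unpack_release` are hypothetical, and reading the distribution from the `ROS_DISTRO` environment variable assumes a sourced ROS 2 environment.

```python
import os
import zipfile
from pathlib import Path

# Archive names taken from the download links in step 4.
ARCHIVES = {
    "humble": "RAIManipulationDemo_1.0.0_jammyhumble.zip",
    "jazzy": "RAIManipulationDemo_1.0.0_noblejazzy.zip",
}


def archive_for(distro: str) -> str:
    """Return the release archive name for a ROS 2 distribution (hypothetical helper)."""
    try:
        return ARCHIVES[distro.lower()]
    except KeyError:
        raise ValueError(f"No manipulation demo release listed for ROS 2 '{distro}'") from None


def unpack_release(archive: Path, destination: Path) -> list[str]:
    """Extract a downloaded archive and return the names of its members."""
    with zipfile.ZipFile(archive) as zf:
        zf.extractall(destination)
        return zf.namelist()


if __name__ == "__main__":
    # ROS_DISTRO is set by the ROS 2 setup scripts; falling back to "humble" is an assumption.
    name = archive_for(os.environ.get("ROS_DISTRO", "humble"))
    print(f"Expected archive: {name}")
```

The mapping only covers the two distributions listed above, so any other value raises an error instead of guessing an archive name.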

## Running the Demo

> **Note**: Ensure that every command is run in a sourced shell using `source setup_shell.sh`.

1. Start the demo:

   ```shell
   ros2 launch examples/manipulation-demo.launch.py
   ```

2. Interact with the robot arm using natural language commands. For example:

   ```
   Enter a prompt: Pick up the red cube and drop it on other cube
   ```

## How it works

The manipulation demo utilizes several components:

1. Vision processing using Grounded SAM 2 and Grounding DINO for object detection and segmentation.
2. RAI agent to process the request and plan the manipulation sequence.
3. Robot arm control for executing the planned movements.

The main logic of the demo is implemented in the `ManipulationDemo` class, which can be found in:

```python
examples/manipulation-demo.py
```
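
The three components listed above form a perceive, plan, execute flow. The sketch below only illustrates that flow with plain Python callables. It does not reproduce the `ManipulationDemo` class, and every name in it (`Detection`, `run_manipulation_request`, and the `detect`, `plan`, and `execute` callables) is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Detection:
    """A detected object: a label and a 3D position (hypothetical structure)."""

    label: str
    position: tuple[float, float, float]


def run_manipulation_request(
    prompt: str,
    detect: Callable[[], Sequence[Detection]],
    plan: Callable[[str, Sequence[Detection]], Sequence[str]],
    execute: Callable[[str], None],
) -> list[str]:
    """Run one request: perceive the scene, plan the steps, then execute them in order."""
    detections = detect()  # stands in for vision processing
    steps = plan(prompt, detections)  # stands in for the agent's planning
    done = []
    for step in steps:
        execute(step)  # stands in for robot arm control
        done.append(step)
    return done
```

Passing the stages in as callables keeps the sketch independent of any particular vision model or robot interface.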
@@ -1,3 +1,44 @@
# Speech-to-speech interaction with autonomous taxi

Work in progress.
This demo showcases a speech-to-speech interaction with an autonomous taxi using RAI (Robotec AI) in an AWSIM environment with Autoware. Users can specify destinations verbally, and the system will process the request, plan the route, and navigate the taxi accordingly.

![Autonomous Taxi Demo](../imgs/taxi_demo.gif)

> [!NOTE]
> This readme is a work in progress.

## Prerequisites

Before running this demo, ensure that the following prerequisites are in place:

1. Autoware and AWSIM are installed, as described in the [AWSIM quick start guide](https://tier4.github.io/AWSIM/GettingStarted/QuickStartDemo/).
2. Speech-to-speech is configured, as described in the [voice interface documentation](../human_robot_interface/voice_interface.md).

## Running the Demo

1. Start AWSIM and Autoware.

2. Run the taxi demo:

   ```bash
   source ./setup_shell.sh
   ros2 launch examples/taxi-demo.launch.py
   ```

3. Speak your destination into your microphone. The system will process your request and plan the route for the autonomous taxi.

## How it works

The taxi demo utilizes several components:

1. Speech recognition (ASR) to convert the user's spoken words into text.
2. RAI agent to process the request and interact with Autoware for navigation.
3. Text-to-speech (TTS) to convert the system's response back into speech.
4. Autoware for autonomous driving capabilities.
5. AWSIM for simulation of the urban environment.

The main logic of the demo is implemented in the `TaxiDemo` class, which can be found in:

```python
examples/taxi-demo.py
```
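
One spoken request passes through the first three components in order: ASR, the agent, then TTS. The sketch below illustrates a single such turn with plain Python callables. It does not reproduce the `TaxiDemo` class; the name `handle_utterance` and its three callables are hypothetical, and skipping empty transcripts is an assumption made for this example.

```python
from typing import Callable, Optional


def handle_utterance(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    respond: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> Optional[bytes]:
    """Process one spoken request and return the spoken reply, or None if nothing was heard."""
    text = transcribe(audio)  # stands in for speech recognition (ASR)
    if not text.strip():
        # Assumption: an empty transcript means there is nothing to answer.
        return None
    reply = respond(text)  # stands in for the agent talking to Autoware
    return synthesize(reply)  # stands in for text-to-speech (TTS)
```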
@@ -0,0 +1,53 @@
# Copyright (C) 2024 Robotec.AI
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from launch import LaunchDescription
from launch.actions import DeclareLaunchArgument, ExecuteProcess
from launch.substitutions import LaunchConfiguration
from launch_ros.actions import Node


def generate_launch_description():
    # Declare the game_launcher argument
    game_launcher_arg = DeclareLaunchArgument(
        "game_launcher",
        default_value="",
        description="Path to the game launcher executable",
    )

    return LaunchDescription(
        [
            # Include the game_launcher argument
            game_launcher_arg,
            # Launch the manipulation demo Python script
            ExecuteProcess(
                cmd=["python3", "examples/manipulation-demo.py"], output="screen"
            ),
            # Launch the robotic_manipulation node
            Node(
                package="robotic_manipulation",
                executable="robotic_manipulation",
                name="robotic_manipulation_node",
                output="screen",
            ),
            # Launch the game launcher
            ExecuteProcess(
                cmd=[
                    LaunchConfiguration("game_launcher"),
                    "-bg_ConnectToAssetProcessor=0",
                ],
                output="screen",
            ),
        ]
    )
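
The launch file above declares a `game_launcher` argument whose default is an empty string, so the path to the simulation binary presumably has to be supplied at launch time. ROS 2 launch arguments are passed on the command line as `name:=value`. The sketch below builds such a command. The helper name `build_launch_command` and the example path are hypothetical, and the location of the launcher inside the unpacked release is not stated in the source.

```python
import shlex

LAUNCH_FILE = "examples/manipulation-demo.launch.py"


def build_launch_command(game_launcher: str) -> list[str]:
    """Build the `ros2 launch` argument list that passes the game_launcher argument."""
    if not game_launcher:
        raise ValueError("game_launcher must be a non-empty path")
    return ["ros2", "launch", LAUNCH_FILE, f"game_launcher:={game_launcher}"]


if __name__ == "__main__":
    # Placeholder path; replace it with the launcher from the unpacked release.
    command = build_launch_command("/path/to/game_launcher")
    print(shlex.join(command))
```

The printed line can be pasted into a sourced shell, or the list can be handed to `subprocess.run` directly.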
@@ -0,0 +1,48 @@
# Copyright (C) 2024 Robotec.AI
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

from launch import LaunchDescription
from launch.actions import ExecuteProcess, IncludeLaunchDescription
from launch.launch_description_sources import PythonLaunchDescriptionSource
from launch.substitutions import PathJoinSubstitution
from launch_ros.substitutions import FindPackageShare


def generate_launch_description():
    return LaunchDescription(
        [
            # Launch the taxi demo Python script
            ExecuteProcess(cmd=["python3", "examples/taxi-demo.py"], output="screen"),
            # Include the voice launch from rai_bringup
            IncludeLaunchDescription(
                PythonLaunchDescriptionSource(
                    [
                        PathJoinSubstitution(
                            [
                                FindPackageShare("rai_bringup"),
                                "launch",
                                "voice.launch.py",
                            ]
                        )
                    ]
                ),
                launch_arguments={
                    "tts_vendor": "opentts",  # elevenlabs (paid), opentts (free, local model)
                    "asr_vendor": "whisper",  # whisper (free, local model), openai (paid)
                    "recording_device": "0",  # find your recording device using python -c 'import sounddevice as sd; print(sd.query_devices())'
                    "keep_speaker_busy": "false",
                }.items(),
            ),
        ]
    )
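
The `recording_device` value is an index into the device list that `sounddevice.query_devices()` prints. As a sketch, the helper below picks the first input-capable device from such a list. It takes plain dictionaries so that it runs without audio hardware. The helper name `first_input_device` is hypothetical, and it assumes each entry carries the `name` and `max_input_channels` keys that `sounddevice` reports.

```python
from typing import Iterable, Mapping, Optional


def first_input_device(devices: Iterable[Mapping], name_hint: str = "") -> Optional[int]:
    """Return the index of the first device that can record, optionally matching a name."""
    for index, device in enumerate(devices):
        can_record = device.get("max_input_channels", 0) > 0
        matches = name_hint.lower() in str(device.get("name", "")).lower()
        if can_record and matches:
            return index
    return None


if __name__ == "__main__":
    # Entries shaped like sounddevice output; the values are made up for illustration.
    sample = [
        {"name": "HDMI Output", "max_input_channels": 0},
        {"name": "USB Microphone", "max_input_channels": 1},
    ]
    print(first_input_device(sample))
```

With `sounddevice` installed, calling `first_input_device(sd.query_devices())` would give a candidate index, which then goes into the launch arguments as a string.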