1 parent 87b0951, commit 4d0f64f. Showing 3 changed files with 53 additions and 9 deletions.
@@ -1,6 +1,6 @@
# [Paper Title](link_to_paper)

-_September 2023_
+_December 2023_

tl;dr: Summary of the main idea.
@@ -0,0 +1,32 @@
# [ChatGPT for Robotics: Design Principles and Model Abilities](https://arxiv.org/abs/2306.17582)

_December 2023_

tl;dr: A pipeline for using ChatGPT on robotics tasks via prompt engineering, with the model writing high-level code for execution. Similar to [CaP (Code as Policies)](cap.md).

#### Overall impression
Robotics systems, unlike text-only apps, require a deep understanding of real-world **physics**, environmental **context**, and the **ability** to perform physical actions.

An LLM's out-of-the-box understanding of basic concepts (control, camera geometry, physical form factors) makes it an excellent choice for building generalizable and user-friendly robotics pipelines.

PromptCraft replaces a specialized engineer-in-the-loop with a user-on-the-loop. --> The key to real-world application (productization) is polishing the interaction between the user and the robot, or automating as much of it as possible.

PromptCraft is NOT a fully automated process: it needs a human on the loop to monitor and intervene when the LLM generates unexpected behavior, especially in safety-critical applications.

PromptCraft does not use a VLM, only an LLM.

#### Key ideas
- Pipeline to construct a ChatGPT-based robotics app
  - Define a high-level robot function library.
  - Write a prompt with the objectives and the allowed functions.
  - The user stays on the loop to evaluate the output.
  - Deploy onto the robot.
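
The first two steps of the pipeline can be sketched roughly as follows. This is a minimal illustration, not the paper's code: `RobotFunction`, `LIBRARY`, `build_prompt`, and the drone functions (`takeoff`, `get_position`, `fly_to`, `land`) are all hypothetical names chosen for the example, and the prompt wording is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RobotFunction:
    """One entry in the high-level function library (hypothetical schema)."""
    signature: str
    description: str


# Step 1: a tiny, illustrative function library for a drone (all names assumed).
LIBRARY = [
    RobotFunction("takeoff()", "Take off and hover."),
    RobotFunction("get_position(name)", "Return the [x, y, z] position of a named object."),
    RobotFunction("fly_to(point)", "Fly to the given [x, y, z] point."),
    RobotFunction("land()", "Land at the current position."),
]


def build_prompt(objective: str, library: list[RobotFunction]) -> str:
    """Step 2: state the objective and list the functions the model may call."""
    lines = [
        "You are helping to control a robot.",
        "Command the robot ONLY through the following functions:",
    ]
    lines += [f"- {fn.signature}: {fn.description}" for fn in library]
    lines += ["", f"Objective: {objective}", "Respond with Python code only."]
    return "\n".join(lines)


prompt = build_prompt("Fly to the turbine and land next to it.", LIBRARY)
print(prompt)
```

The resulting string would be sent to the model as the opening message. Keeping the library as data makes it easy to reuse the same list later when checking the model's reply.
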

#### Technical details
- Creating a high-level function library and listing its functions in the prompt is the key concept that unlocks the ability to solve robotics tasks with ChatGPT. This avoids unbounded text-based answers and API under-specification.
- The capability to write new functions confers flexibility and robustness to LLMs.
- The dialog/conversation ability of ChatGPT is a surprisingly effective vehicle for interactive behavior correction.
- The use of simulators can be particularly useful for evaluating a model's performance before deployment in the real world. --> Simulation (Habitat, AirSim, etc.) is the right vehicle for evaluating closed-loop, high-level task planning.
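
One way the points above might fit together is sketched below: a static check that the model's code calls only library functions (or helpers it defined itself), a feedback message for the next dialog turn, and a dry run against recording stubs standing in for a simulator. Everything here is an assumption made for illustration: the `ALLOWED` set, `unknown_calls`, `dry_run`, and the two hard-coded replies are hypothetical, and a real setup would use an actual simulator such as AirSim rather than stubs.

```python
import ast
import builtins

ALLOWED = {"takeoff", "get_position", "fly_to", "land"}  # hypothetical library


def unknown_calls(code: str, allowed: set[str]) -> set[str]:
    """Names called in `code` that are not allowed, not built in, and not defined in it."""
    tree = ast.parse(code)
    # Functions the model wrote itself are fine: writing new helpers is allowed.
    defined = {n.name for n in ast.walk(tree) if isinstance(n, ast.FunctionDef)}
    called = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Call):
            if isinstance(node.func, ast.Name):
                called.add(node.func.id)
            elif isinstance(node.func, ast.Attribute):
                # Conservative: method calls are checked by attribute name only.
                called.add(node.func.attr)
    return called - allowed - defined - set(dir(builtins))


def dry_run(code: str, allowed: set[str]) -> list[tuple]:
    """Execute `code` against recording stubs instead of a real robot or simulator."""
    trace = []

    def make_stub(name):
        def stub(*args):
            trace.append((name, args))
            return [0.0, 0.0, 0.0]  # dummy position returned by every stub
        return stub

    # exec() is tolerable only in a sketch; real model output should be sandboxed.
    exec(code, {name: make_stub(name) for name in allowed})
    return trace


# Hard-coded stand-in for a model reply that uses a function outside the library.
first_reply = """
def inspect(name):
    fly_to(get_position(name))
    take_photo()

takeoff()
inspect("turbine")
land()
"""

bad = unknown_calls(first_reply, ALLOWED)
if bad:
    # This message would go back to the model as the next turn of the dialog.
    print(f"Not in the function library: {sorted(bad)}. Use only the allowed functions.")

# Stand-in for the model's corrected reply after that feedback.
corrected_reply = first_reply.replace("    take_photo()\n", "")
if not unknown_calls(corrected_reply, ALLOWED):
    for name, args in dry_run(corrected_reply, ALLOWED):
        print(name, args)
```

The static check is deliberately strict: any method call whose attribute name is not in the library gets flagged, which suits a setting where a human reviews every rejection anyway.
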

#### Notes
- Applications of LLMs in robotics include vision-language navigation, language-based human-robot interaction, and vision-language manipulation control (PerAct, CLIPort by Dieter Fox).