
Add Minimal Code-Generation Agent (Single-Agent) with On-Demand LLM Calls #96

Open
matiasmolinas opened this issue Jan 10, 2025 · 0 comments


Description
We want to implement a single “code-generation” agent that can dynamically produce (and optionally execute) code snippets when it encounters a request it cannot fulfill. By relying on an LLM (IBM Granite, Llama 3.x, etc.), the agent should be able to generate new logic for tasks on the fly.

There are two main approaches we could explore (potentially both, giving users a choice):

  1. Bee Framework Tools

    • PythonTool: Provides a fresh VM environment each time Python code is executed, with a controlled set of libraries and shell commands (e.g., ffmpeg, pillow, numpy).
    • bee-code-interpreter: A gRPC service that can run arbitrary Python code in a sandboxed environment (Docker/Kubernetes), designed from the ground up for better security and reproducibility.
  2. smolagents

    • CodeAgent: Runs LLM-generated Python code in a restricted local interpreter or via E2B remote execution, with the possibility to restrict imports, limit runtime, and ensure safer remote sandboxing.
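Since the acceptance criteria below mention a config flag for choosing between the two approaches, the selection could look something like the sketch below. All names here are hypothetical stand-ins; neither `run_bee` nor `run_smol` is a real API, and each would need to be wired to the actual PythonTool/bee-code-interpreter or smolagents CodeAgent integration.

```python
def make_executor(backend: str):
    """Return a callable that runs a Python snippet via the chosen backend.

    Hypothetical sketch: the inner functions are placeholders for the real
    Bee Framework and smolagents integrations.
    """
    if backend == "bee":
        def run_bee(snippet: str) -> str:
            # Placeholder: would call Bee's PythonTool / bee-code-interpreter here.
            raise NotImplementedError("wire up Bee code execution")
        return run_bee
    if backend == "smolagents":
        def run_smol(snippet: str) -> str:
            # Placeholder: would delegate to a smolagents CodeAgent here.
            raise NotImplementedError("wire up smolagents CodeAgent")
        return run_smol
    raise ValueError(f"unknown backend: {backend!r}")
```

Starting with one backend and adding the other behind this flag keeps the agent code independent of either execution environment.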

The goal is to prove that a single agent can accept a prompt it cannot currently handle, call an LLM to generate the required code, and then run or store that code:

  1. On-Demand Generation

    • If the agent receives a task (e.g., “convert PDF to text” or “plot a bar chart”) and doesn’t have a matching “tool,” it should prompt the LLM to produce a Python snippet.
    • Basic prompt engineering is expected (the agent must guide the LLM to create valid code suited for the Bee environment or for smolagents’ interpreter).
  2. Execution & Logging

    • Integrate with either PythonTool/bee-code-interpreter or CodeAgent (depending on user choice) to execute the new code snippet.
    • Provide success/failure logging with timestamps. If a snippet fails (syntax error, runtime error, etc.), log the details so we can iterate or re-prompt.
  3. Minimal Validation

    • Perform at least a syntax check or “smoke test” after the snippet is generated. If that fails, the agent logs an error and could optionally try to regenerate code with revised constraints.
  4. Single-Agent Focus

    • We’re not doing multi-agent orchestration yet. The goal is to demonstrate that one agent can evolve on-demand using an LLM and either Bee’s or smolagents’ sandboxed code execution.
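The four steps above fit in a single small class. The sketch below is a rough outline, not a proposed implementation: `llm` is any callable mapping a prompt to a code string, and the bare `exec()` merely stands in for the sandboxed Bee/smolagents execution that would replace it.

```python
import ast
import json
import time

class CodeGenAgent:
    """Hypothetical single-agent sketch: generate, validate, execute, log."""

    def __init__(self, llm, tools=None):
        self.llm = llm              # callable: prompt -> Python source string
        self.tools = tools or {}    # known task -> handler

    def handle(self, task: str):
        if task in self.tools:
            return self.tools[task]()
        # 1. On-demand generation (basic prompt engineering)
        prompt = (
            "Write a self-contained Python snippet for this task. "
            "Return only code, no explanations.\nTask: " + task
        )
        snippet = self.llm(prompt)
        # 3. Minimal validation: syntax check before any execution
        try:
            ast.parse(snippet)
        except SyntaxError as err:
            self._log(task, ok=False, detail=f"syntax error: {err}")
            return None
        # 2. Execution & logging (exec() is a stand-in for the sandboxed backend)
        try:
            namespace: dict = {}
            exec(snippet, namespace)
            self._log(task, ok=True, detail="executed")
            return namespace
        except Exception as err:
            self._log(task, ok=False, detail=f"runtime error: {err}")
            return None

    def _log(self, task, ok, detail):
        # Consistent JSON log lines with timestamps, per the acceptance criteria
        print(json.dumps({
            "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
            "task": task,
            "success": ok,
            "detail": detail,
        }))
```

On a failed snippet, the logged detail gives the agent enough context to re-prompt the LLM with revised constraints.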

Acceptance Criteria

  • A new class or function that, upon encountering an unknown request, triggers an LLM-based code generation flow.
  • Compatibility with both Bee Framework code execution (via PythonTool or bee-code-interpreter) and smolagents’ CodeAgent. (We could start with one approach and add a config flag for the other.)
  • Logging of success/fail states and timestamps in a consistent format (JSON, console logs, etc.).
  • Minimal test harness that ensures the generated code at least runs without errors.
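The last criterion could start as small as a smoke test that parses the generated snippet and runs it in a throwaway namespace. A minimal sketch, assuming direct `exec()` as a stand-in for the sandboxed backends:

```python
import ast

def smoke_test(snippet: str) -> tuple[bool, str]:
    """Pass only if the snippet both parses and runs to completion."""
    try:
        ast.parse(snippet)
    except SyntaxError as err:
        return False, f"syntax error: {err}"
    try:
        exec(snippet, {})  # stand-in for sandboxed execution
    except Exception as err:
        return False, f"runtime error: {err}"
    return True, "ok"
```

The (ok, detail) pair maps directly onto the success/failure log entries described above.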

Additional Context

  • Bee: The PythonTool is quite comprehensive, allowing Python + shell commands but in a fresh VM each time. bee-code-interpreter goes a step further by spinning up isolated Kubernetes pods for safer execution.
  • smolagents: CodeAgent can run Python locally or via E2B. Local execution suits quick dev/test scenarios, while E2B adds security for untrusted code.
  • Long-Term Vision: Future PRs can build on this to store successful snippets in a shared “tool library,” gating them behind developer review or more thorough tests before promoting them to “core” tools.

Feel free to share any feedback. If this direction sounds good, I’ll begin drafting a PR to implement the minimal code-generation agent logic and provide a basic demonstration!
