feat: Add tracing (#28)
* Initial logfire-api integration.

* Small docs fix

* soften some opinions in docs

* Add completion tracing. Update docs.
monoxgas authored Jan 16, 2025
1 parent 71ae7aa commit bdbc7df
Showing 14 changed files with 447 additions and 114 deletions.
6 changes: 4 additions & 2 deletions README.md
@@ -15,13 +15,14 @@ Simplify using LLMs in code

</br>

Rigging is a lightweight LLM framework built on Pydantic XML. The goal is to make leveraging language models in production code as simple and effective as possible. Here are the highlights:
Rigging is a lightweight LLM framework to make using language models in production code as simple and effective as possible. Here are the highlights:

- **Structured Pydantic models** can be used interchangeably with unstructured text output.
- LiteLLM as the default generator giving you **instant access to a huge array of models**.
- Define prompts as python functions with **type hints and docstrings**.
- Simple **tool calling** even for models which don't support them at the API.
- Simple **tool use**, even for models which don't support them at the API.
- Store different models and configs as **simple connection strings** just like databases.
- Integrated tracing support with [Logfire](https://logfire.pydantic.dev/docs/).
- Chat templating, forking, continuations, generation parameter overloads, stripping segments, etc.
- Async batching and fast iterations for **large scale generation**.
- Metadata, callbacks, and data format conversions.
@@ -120,6 +121,7 @@ Want more?
- Use [structured pydantic parsing](https://rigging.dreadnode.io/#basic-parsing)
- Check out [raw completions](https://rigging.dreadnode.io/topics/completions/)
- Give the LLM [access to tools](https://rigging.dreadnode.io/topics/tools/)
- Track behavior with [tracing](https://rigging.dreadnode.io/topics/tracing/)
- Play with [generation params](https://rigging.dreadnode.io/topics/generators/#overload-generation-params)
- Use [callbacks in the pipeline](https://rigging.dreadnode.io/topics/callbacks-and-mapping/)
- Scale up with [iterating and batching](https://rigging.dreadnode.io/topics/iterating-and-batching/)
Binary file added docs/assets/tracing_logfire.png
18 changes: 10 additions & 8 deletions docs/index.md
@@ -1,10 +1,13 @@
Rigging is a lightweight LLM framework built on Pydantic XML. The goal is to make leveraging language models in production code as simple and effective as possible. Here are the highlights:
# Home

Rigging is a lightweight LLM framework to make using language models in production code as simple and effective as possible. Here are the highlights:

- **Structured Pydantic models** can be used interchangeably with unstructured text output.
- LiteLLM as the default generator giving you **instant access to a huge array of models**.
- Define prompts as python functions with **type hints and docstrings**.
- Simple **tool use**, even for models which don't support them at the API.
- Store different models and configs as **simple connection strings** just like databases.
- Integrated **tracing** support with [Logfire](https://logfire.pydantic.dev/docs/) to track activity.
- Chat templating, forking, continuations, generation parameter overloads, stripping segments, etc.
- Async batching and fast iterations for **large scale generation**.
- Metadata, callbacks, and data format conversions.
@@ -266,7 +269,7 @@ Check out [Tools](topics/tools.md) for more information.

### Tools + Prompts

You can combine prompts and tools to achieve "multi-agent" behavior":
You can combine prompts and tools to achieve "multi-agent" behavior:

```py
import rigging as rg
@@ -340,14 +343,13 @@ constructing models in a [later section](topics/models.md), but don't stress the
??? note "XML vs JSON"

Rigging is opinionated with regard to using XML to weave unstructured data with structured contents
as the underlying LLM generates text responses. A frequent solution to getting "predictable"
outputs from LLMs has been forcing JSON conformant outputs, but we think this is
poor form in the long run. You can read more about this from [Anthropic](https://docs.anthropic.com/claude/docs/use-xml-tags)
as the underlying LLM generates text responses, at least when it comes to raw text content. If you want
to take advantage of structured JSON parsing provided by model providers or inference tools,
[`APITools`](topics/tools.md) are a good alternative.

You can read more about XML tag use from [Anthropic](https://docs.anthropic.com/claude/docs/use-xml-tags)
who have done extensive research with their models.

We'll skip the long rant, but trust us that XML is a very useful syntax which beats
JSON any day of the week for typical use cases.

To begin, let's define a `FunFact` model which we'll have the LLM fill in. Rigging exposes a
[`Model`][rigging.model.Model] base class which you should inherit from when defining structured
inputs. This is a lightweight wrapper around pydantic-xml's [`BaseXMLModel`](`https://pydantic-xml.readthedocs.io/en/latest/pages/api.html#pydantic_xml.BaseXmlModel`)
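To make the idea concrete before diving into the real API, here is a rough stdlib-only sketch of what parsing a tagged block out of free-form output amounts to. This is illustrative only — rigging's actual `Model` machinery delegates this to pydantic-xml rather than regex, and `from_text` here is a made-up helper:

```python
import re
from dataclasses import dataclass

@dataclass
class FunFact:
    # Stand-in for a rigging Model with a single text field
    fact: str

    @classmethod
    def from_text(cls, text: str) -> "FunFact | None":
        # Pull the first <fun-fact>...</fun-fact> block out of free-form output
        match = re.search(r"<fun-fact>(.*?)</fun-fact>", text, re.DOTALL)
        return cls(fact=match.group(1).strip()) if match else None

llm_output = "Sure! <fun-fact>Octopuses have three hearts.</fun-fact> Hope that helps!"
parsed = FunFact.from_text(llm_output)
# parsed.fact == "Octopuses have three hearts."
```

The value of the XML approach is visible even in this toy: the structured payload can be recovered no matter what conversational text surrounds it.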
29 changes: 21 additions & 8 deletions docs/topics/prompt-functions.md
@@ -75,8 +75,21 @@ A [`Prompt`][rigging.prompt.Prompt] is typically created using one of the following

Prompts are optionally bound to a pipeline/generator underneath, hence the generator and pipeline
decorator variants, but they don't have to be. We refer to bound prompts as "standalone", because
they can be executed directly as functions. Otherwise, you are required to use
[`ChatPipeline.run_prompt`][rigging.chat.ChatPipeline.run_prompt] to execute the prompt.
they can be executed directly as functions. Otherwise, you are required to first "bind" the prompt to
a specific generator id, generator, or pipeline to make it callable. Do this with
[`Prompt.bind`][rigging.prompt.Prompt.bind] or related methods.

```py
import rigging as rg

@rg.prompt
def summarize(text: str) -> str:
"""Summarize this text."""

generator = rg.get_generator("gpt-4o-mini")

await summarize.bind(generator)("...")
```

### Templates and Docstrings

@@ -334,7 +347,7 @@ processed correctly.
# <age></age>
```

You can also embedd a [`Chat`][rigging.chat.Chat] object inside a some objects, which
You can also embed a [`Chat`][rigging.chat.Chat] object inside other objects, which
will be excluded from any prompt guidance, but supplied with the value when the prompt
is executed. This is great for gathering both structured data and the original chat.

@@ -422,11 +435,11 @@ Prompt objects expose the following methods for execution:

*(Available if the prompt was supplied/bound to a pipeline or generator)*

You can also run a prompt with a specific `ChatPipeline` by passing it to any of:
You can also bind a prompt at runtime with any of the following:

- [`ChatPipeline.run_prompt()`][rigging.chat.ChatPipeline.run_prompt]
- [`ChatPipeline.run_prompt_many()`][rigging.chat.ChatPipeline.run_prompt_many]
- [`ChatPipeline.run_prompt_over()`][rigging.chat.ChatPipeline.run_prompt_over]
- [`Prompt.bind()`][rigging.prompt.Prompt.bind]
- [`Prompt.bind_many()`][rigging.prompt.Prompt.bind_many]
- [`Prompt.bind_over()`][rigging.prompt.Prompt.bind_over]
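To illustrate the shape of this API, here is a toy sketch in plain Python — `Prompt`, `prompt`, and the generator below are stand-ins for illustration, not rigging's real classes:

```python
class Prompt:
    """Toy stand-in for rigging's Prompt class (illustrative only)."""

    def __init__(self, func):
        self.func = func

    def bind(self, generator):
        # Binding returns a plain callable that routes the rendered
        # request through whatever generator it was given
        def bound(*args, **kwargs):
            request = f"{self.func.__doc__} Input: {args} {kwargs}"
            return generator(request)
        return bound

def prompt(func):
    return Prompt(func)

@prompt
def summarize(text: str) -> str:
    """Summarize this text."""

# Any callable can stand in for a generator in this sketch
fake_generator = lambda request: f"[response to: {request!r}]"
result = summarize.bind(fake_generator)("a long article ...")
```

The point of the pattern is that the prompt itself stays inert and reusable; execution details live entirely in whatever it gets bound to.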

!!! Note "Pipeline Context"

@@ -469,7 +482,7 @@ You can also run a prompt with a specific `ChatPipeline` by passing it to any of
def write_code(description: str, language: str = "python") -> code_str:
"""Write a single function."""

code = await pipeline.run_prompt(write_code, "Calculate the factorial of a number.")
code = await write_code.bind(pipeline)("Calculate the factorial of a number.")
```

=== "Run Manually"
82 changes: 82 additions & 0 deletions docs/topics/tracing.md
@@ -0,0 +1,82 @@
# Tracing

Rigging integrates with the [Logfire](https://logfire.pydantic.dev/docs/) library to expose tracing information
about execution. Specifically, we use the logfire-api no-op package, making tracing optional with no overhead
if you don't need it.
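The no-op pattern is worth a quick sketch. The class below is an illustrative stand-in, not the actual logfire-api package — the idea is that every tracing call succeeds and records nothing, so a framework can instrument unconditionally:

```python
from contextlib import contextmanager

class NoopTracer:
    """Do-nothing stand-in sketching the no-op idea (not the real package)."""

    @contextmanager
    def span(self, name, **attributes):
        yield self  # no work performed, nothing recorded

    def info(self, message, **attributes):
        pass  # no work performed, nothing recorded

tracer = NoopTracer()

# Framework code can emit spans unconditionally; with the no-op
# implementation in place, these calls are effectively free.
with tracer.span("chat", generator_id="litellm!gpt-4o") as span:
    span.info("received {n} messages", n=3)
```

When the real `logfire` package is installed, the same call sites light up with actual spans — no code changes required.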

Logfire is capable of reporting trace information to any OpenTelemetry-compatible system, and provides some
convenient abstractions on top of the standard OpenTelemetry SDK which we like. If the `logfire` package is installed and
configured, details about pipelines, prompts, and tools will be traced during rigging use.

You can configure Logfire to use [alternative backends](https://logfire.pydantic.dev/docs/how-to-guides/alternative-backends/)
as needed to integrate with your preferred tracing stack.

```py
import rigging as rg
import logfire

logfire.configure()

@rg.prompt(generator_id="gpt-4o")
async def summarize(content: str) -> str:
"""
Summarize the content into 1-2 sentences then save it
"""

summarize.watch(rg.watchers.write_chats_to_jsonl("chats.jsonl"))

text = """
Revachol is located on the island of Le Caillou, also called "The Pebble" on the northeast side of the
Insulindian Isola, on the world's largest body of water: the Insulindic. The city itself has a radius
of 80 kilometres and is split by the River Esperance into Revachol East and Revachol West. The north
side of the island is shattered by the delta of the Esperance, and is named La Delta.
"""

await summarize.run_many(3, text)

# 23:46:31.484 Prompt summarize() (x3)
# 23:46:31.485 Chat with litellm!gpt-4o (x3)
# 23:46:32.874 Watch with rigging.watchers.write_chats_to_jsonl()
```

???+ note "What's Stored?"

Rigging will attach call parameters and results for both tools and prompt functions, as well
as finalized chat objects at the end of a pipeline. Logfire will serialize these items as
    JSON values inside attributes, and include a dynamic JSON schema for reference. When using
their platform, these items deserialize into the web view directly.

![Logfire trace](../assets/tracing_logfire.png)
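Conceptually, attaching call data as span attributes looks something like the stdlib sketch below — the attribute names here are made up for illustration, and Logfire's actual wire format differs:

```python
import json

# Hypothetical illustration: call data is serialized to JSON and
# stored as a string attribute on the enclosing span.
call = {"prompt": "summarize", "params": {"content": "Revachol is ..."}}
span_attributes = {
    "code.function": "summarize",
    "rigging.call": json.dumps(call),  # stored as a JSON string value
}

# A backend (or a trace UI) can deserialize it back for display
restored = json.loads(span_attributes["rigging.call"])
```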

## Inference Tracing

We've opted to exclude tracing at the generator level in Rigging (for now) and focus on
instrumenting higher-order abstractions like pipelines, tools, etc. which are specific to the framework.

There is a suite of powerful instrumentation libraries (including Logfire!) which will add tracing
to underlying libraries like LiteLLM (recommended), OpenAI, Anthropic, VertexAI, etc. These snap
right into the tracing spans from rigging, and provide insight into the raw inference
traffic before it's sent to API endpoints and inference libraries.

- [LiteLLM - Logfire](https://docs.litellm.ai/docs/observability/logfire_integration)
- [LiteLLM - OpenTelemetry](https://docs.litellm.ai/docs/observability/opentelemetry_integration)
- [Logfire - Integrations](https://logfire.pydantic.dev/docs/integrations/)
- [TraceLoop - openllmetry](https://github.com/traceloop/openllmetry)

Here is an example of adding LiteLLM tracing on top of rigging:

```py
import os

import rigging as rg
import logfire
import litellm

logfire.configure()

os.environ.setdefault("LOGFIRE_TOKEN", "") # (1)!
litellm.callbacks = ["logfire"]

# ...
```

1. As of writing, LiteLLM requires this environment variable to be set, even if it's empty
   and Logfire is managing tokens for you
33 changes: 17 additions & 16 deletions mkdocs.yml
@@ -5,21 +5,23 @@ site_url: https://rigging.dreadnode.io
repo_url: https://github.com/dreadnode/rigging

nav:
- Home: index.md
- Topics:
- Workflow: topics/workflow.md
- Models: topics/models.md
- Generators: topics/generators.md
- Chats and Messages: topics/chats-and-messages.md
- Prompt Functions: topics/prompt-functions.md
- Completions: topics/completions.md
- Callbacks and Mapping: topics/callbacks-and-mapping.md
- Iterating and Batching: topics/iterating-and-batching.md
- Tools: topics/tools.md
- Serialization: topics/serialization.md
- Logging: topics/logging.md
- Migrations: topics/migrations.md
- Principles: topics/principles.md
- Home:
- index.md
- Topics:
- Workflow: topics/workflow.md
- Models: topics/models.md
- Generators: topics/generators.md
- Chats and Messages: topics/chats-and-messages.md
- Prompt Functions: topics/prompt-functions.md
- Completions: topics/completions.md
- Callbacks and Mapping: topics/callbacks-and-mapping.md
- Iterating and Batching: topics/iterating-and-batching.md
- Tools: topics/tools.md
- Tracing: topics/tracing.md
- Serialization: topics/serialization.md
- Logging: topics/logging.md
- Migrations: topics/migrations.md
- Principles: topics/principles.md
- API:
- rigging.chat: api/chat.md
- rigging.completion: api/completion.md
@@ -49,7 +51,6 @@ theme:
features:
- content.code.copy
- content.code.annotate
- toc.integrate
- navigation.footer
- navigation.indexes
- navigation.sections
13 changes: 12 additions & 1 deletion poetry.lock


1 change: 1 addition & 0 deletions pyproject.toml
@@ -31,6 +31,7 @@ click = { version = "^8.1.7", optional = true }
httpx = { version = "^0.27.0", optional = true }
aiodocker = { version = "^0.22.2", optional = true }
websockets = { version = "^13.0", optional = true }
logfire-api = "^3.1.1"

[tool.poetry.extras]
examples = ["asyncssh", "click", "httpx", "aiodocker", "websockets"]
