Any proposal for what to actually change with the current gen_ai.choice event?
I mean that for streaming or chunked responses, we could define an optional semconv, for example:
Joining all chunked responses into one complete completion, like:

```json
{"index":0,"finish_reason":"stop","message":{"content":"Why did the developer bring OpenTelemetry to the party? Because it always knows how to trace the fun!"}}
```
Sending each chunk as a separate event, like:

```json
{"index":0,"sequence_id":0,"message":{"content":"Why did the developer"}}
{"index":0,"sequence_id":1,"message":{"content":" bring OpenTelemetry"}}
{"index":0,"sequence_id":2,"message":{"content":" to the party?"}}
{"index":0,"sequence_id":3,"message":{"content":" Because it always"}}
{"index":0,"sequence_id":4,"message":{"content":" knows how to"}}
{"index":0,"sequence_id":5,"finish_reason":"stop","message":{"content":" trace the fun!"}}
```
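If chunked choice events were standardized like this, a backend could reassemble them server-side. The following is a minimal sketch (a hypothetical `aggregate_choice_events` helper, not part of any OTel spec or SDK) that groups events by `index`, orders each group by `sequence_id`, and concatenates the message content; the chunk carrying `finish_reason` marks the end of that choice:

```python
def aggregate_choice_events(events):
    """Reassemble chunked gen_ai.choice events into complete choices.

    Groups events by `index`, sorts each group by `sequence_id` (so
    out-of-order delivery is tolerated), joins the content deltas, and
    takes `finish_reason` from whichever chunk carries it.
    """
    by_index = {}
    for ev in events:
        by_index.setdefault(ev["index"], []).append(ev)

    choices = []
    for index, group in sorted(by_index.items()):
        group.sort(key=lambda ev: ev["sequence_id"])
        content = "".join(ev["message"]["content"] for ev in group)
        finish = next((ev["finish_reason"] for ev in group
                       if "finish_reason" in ev), None)
        choices.append({"index": index, "finish_reason": finish,
                        "message": {"content": content}})
    return choices
```

Applied to the six chunk events above, this yields exactly the single aggregated event from the first option.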
Area(s)
area:gen-ai
What's missing?
There are streaming and non-streaming response modes for LLM calls, which means the implementation of capturing gen_ai.choice can vary considerably. I have noticed two approaches so far: aggregating all chunks on the collection side and emitting a single complete gen_ai.choice event, or emitting each chunk as a separate event as it arrives.
Personally, I prefer the latter option, since it uses less memory on the collection side. Collection components normally run alongside production applications (they are mostly in the same process), so if we follow the first implementation we will keep hearing complaints that the collection tooling consumes too much memory.
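To make the memory trade-off concrete, here is a minimal sketch (hypothetical helper names, not any OTel SDK API) contrasting the two capture strategies: aggregation buffers every content delta in the instrumented process until the stream ends, while per-chunk emission holds only one delta at a time.

```python
def capture_aggregated(chunks, emit):
    """Approach 1: buffer every content delta in-process, emit one event.

    Memory grows with the full response length, inside the application
    process, before anything is emitted.
    """
    parts = []
    finish_reason = None
    for chunk in chunks:
        parts.append(chunk["message"]["content"])  # whole response retained
        finish_reason = chunk.get("finish_reason", finish_reason)
    emit({"index": 0, "finish_reason": finish_reason,
          "message": {"content": "".join(parts)}})


def capture_streamed(chunks, emit):
    """Approach 2: emit each delta immediately; only one chunk is held."""
    for seq, chunk in enumerate(chunks):
        event = {"index": 0, "sequence_id": seq,
                 "message": {"content": chunk["message"]["content"]}}
        if "finish_reason" in chunk:
            event["finish_reason"] = chunk["finish_reason"]
        emit(event)
```

With the streamed variant, the burden of reassembly shifts to whatever receives the events, which is exactly why a standardized chunked format matters.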
However, there are currently no semantic conventions for capturing streaming responses. This means that observability backends following the OTel semconv can only recognize choice events that have been aggregated on the collection side. Even if they become aware of the issue above and allow the ingestion of chunked choice events, such an implementation would be non-standardized, leading to a wide variety of final formats and causing confusion for OTel users.
My proposal is: Could we provide an alternative and define a streaming format for the event structure? This would give developers flexibility — they could aggregate the data on the client side, or they could choose to stream the events, with the latter implying that they must rely on a server-side solution that supports aggregation.
Describe the solution you'd like
P.S. I should point out that this topic is what I wanted to discuss in today's SIG APAC meeting, but nobody else actually showed up. We really need a notification when the meeting is cancelled or delayed.