In real-world systems, handling HTTP errors effectively is crucial, especially when calling Large Language Models (LLMs) such as Azure OpenAI. Rate-limit errors (tokens per minute or requests per minute exceeded) are bound to occur eventually, surfacing as HTTP 429 responses. This blog post explores different approaches to HTTP error handling with Semantic Kernel and Azure OpenAI.
The default setup for [Semantic Kernel](https://github.com/microsoft/semantic-kernel) with Azure OpenAI is wired up with `AddAzureOpenAIChatCompletion`. This approach comes with a built-in retry policy that automatically retries requests up to three times with exponential backoff. It also honors HTTP headers such as `Retry-After` to apply more tailored retry delays.
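A minimal sketch of that default registration (the deployment name, endpoint, and key below are placeholders):

```csharp
using Microsoft.SemanticKernel;

var builder = Kernel.CreateBuilder();

// Default registration: the underlying Azure SDK pipeline retries failed
// requests up to three times with exponential backoff and honors Retry-After.
builder.AddAzureOpenAIChatCompletion(
    deploymentName: "gpt-4o",                          // placeholder deployment
    endpoint: "https://my-resource.openai.azure.com",  // placeholder endpoint
    apiKey: Environment.GetEnvironmentVariable("AZURE_OPENAI_API_KEY")!);

var kernel = builder.Build();
```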
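Alternatively, you can construct the `AzureOpenAIClient` yourself through `AzureOpenAIClientOptions` and tune its pipeline directly. A minimal sketch of what that can look like; the `resilientHttpClient` stands in for an `HttpClient` you have already equipped with your own retry handler, and the endpoint, key, deployment name, and retry count are illustrative placeholders:

```csharp
using System.ClientModel;
using System.ClientModel.Primitives;
using Azure.AI.OpenAI;
using Microsoft.SemanticKernel;

// Stand-in: an HttpClient you would normally equip with your own retry handler.
var resilientHttpClient = new HttpClient();

var clientOptions = new AzureOpenAIClientOptions
{
    // Route the SDK's traffic through the custom HttpClient...
    Transport = new HttpClientPipelineTransport(resilientHttpClient),
    // ...while also tuning the SDK's own pipeline retries (illustrative count).
    RetryPolicy = new ClientRetryPolicy(maxRetries: 5),
};

var azureClient = new AzureOpenAIClient(
    new Uri("https://my-resource.openai.azure.com"),   // placeholder endpoint
    new ApiKeyCredential(
        Environment.GetEnvironmentVariable("AZURE_OPENAI_API_KEY")!),
    clientOptions);

var builder = Kernel.CreateBuilder();
builder.AddAzureOpenAIChatCompletion("gpt-4o", azureClient); // placeholder deployment
var kernel = builder.Build();
```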
This configuration enables you to combine HTTP retry policies from `HttpClient` with custom pipeline policy-based retries from the Azure OpenAI SDK.
## Recommendations
The default setup might not be suitable for scenarios where you frequently hit token limits, since its fixed three retries can be exhausted before the rate-limit window resets.
Using `HttpClient` provides more control and flexibility, making it a favorable option for broader compatibility beyond Azure OpenAI.
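One way to sketch that approach, under assumptions: a hypothetical `RateLimitRetryHandler` (not part of any SDK) retries 429 responses, preferring the service-provided `Retry-After` delay over exponential backoff, and the customized `HttpClient` is handed to the kernel builder via the `httpClient` parameter. All names and values below are placeholders.

```csharp
using System.Net;
using Microsoft.SemanticKernel;

// Hand an HttpClient carrying the custom retry handler to Semantic Kernel.
var httpClient = new HttpClient(
    new RateLimitRetryHandler { InnerHandler = new HttpClientHandler() });

var builder = Kernel.CreateBuilder();
builder.AddAzureOpenAIChatCompletion(
    deploymentName: "gpt-4o",                          // placeholder deployment
    endpoint: "https://my-resource.openai.azure.com",  // placeholder endpoint
    apiKey: Environment.GetEnvironmentVariable("AZURE_OPENAI_API_KEY")!,
    httpClient: httpClient);

// Hypothetical handler: retries HTTP 429 responses, honoring the
// Retry-After header when present and falling back to exponential backoff.
public sealed class RateLimitRetryHandler : DelegatingHandler
{
    private const int MaxRetries = 3;

    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        for (var attempt = 0; ; attempt++)
        {
            var response = await base.SendAsync(request, cancellationToken);
            if (response.StatusCode != HttpStatusCode.TooManyRequests
                || attempt >= MaxRetries)
            {
                return response;
            }

            var delay = response.Headers.RetryAfter?.Delta
                        ?? TimeSpan.FromSeconds(Math.Pow(2, attempt + 1));
            response.Dispose();
            await Task.Delay(delay, cancellationToken);
        }
    }
}
```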
If you already have an `AzureOpenAIClient` registered and require maximum control, this approach lets you leverage both HTTP client policies and Azure OpenAI pipeline policy-based retries.