Mistral via Azure #1678

Open

pholz opened this issue Feb 5, 2025 · 5 comments

pholz commented Feb 5, 2025

Should it be possible to use a non-OpenAI model that is hosted on Azure? Specifically, I would like to use Mistral, which I have already deployed on Azure. But when I run indexing, I keep getting Operation 'chat' failed errors from the fnllm package.

If I look at the model URLs, OpenAI deployments take the shape <baseURL>/openai/deployments/<deploymentname>/chat/completions?api-version=2024-08-01-preview, whereas the Mistral deployment uses <baseURL>/models/chat/completions?api-version=2024-05-01-preview. Since I only set the base URL in the settings file, I assume the rest of the path is filled in somehow? Or is there a way to change this too?
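
For reference, the base URL is the only endpoint-related value I set; my llm block in settings.yaml looks roughly like this (placeholder values, and I'm not even sure openai_chat is the right type for a non-OpenAI deployment):

```yaml
llm:
  api_key: ${GRAPHRAG_API_KEY}
  type: openai_chat            # guessing; azure_openai_chat seems tied to the /openai/deployments/... URL shape
  model: mistral-large         # placeholder for the Mistral model I deployed
  api_base: https://<baseURL>  # the only URL-related field I set
```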


yueqianh commented Feb 7, 2025

Non-OpenAI models from Azure AI Foundry use the OpenAI chat completions template. Simply run them with the OpenAI Chat template in settings.yaml. I got DeepSeek-R1 from Azure AI Foundry working that way.
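
For example, my DeepSeek-R1 settings looked roughly like this (the host is a placeholder and the other keys are left at their defaults):

```yaml
llm:
  api_key: ${GRAPHRAG_API_KEY}
  type: openai_chat   # the OpenAI Chat template, not azure_openai_chat
  model: deepseek-r1
  api_base: https://<DeepSeek-R1-deploymentname>.eastus2.models.ai.azure.com/v1
```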


pholz commented Feb 7, 2025

I still can't get it to work: I changed the endpoints to match the "target URIs" of my LLM and embeddings deployments, but now I get a 401 Unauthorized error. Would you share a sample of your settings file?

@yueqianh

@pholz I've been testing other models in Azure AI Foundry. I realised that while DeepSeek-R1 works fine, other models do not, because their API endpoint format is different.

base_url in OpenAI Chat mode:

  • DeepSeek-R1: https://[DeepSeek-R1-deploymentname].eastus2.models.ai.azure.com/v1/
  • Phi-4 and other models: https://[deployment_name].services.ai.azure.com/models/

GraphRAG auto-completes the base_url into the following:

  • DeepSeek-R1: https://[DeepSeek-R1-deploymentname].eastus2.models.ai.azure.com/v1/chat/completions
    which is the correct endpoint for accessing DeepSeek-R1 models (no API version)
  • Phi-4 and other models: https://[deployment_name].services.ai.azure.com/models/chat/completions
    which lacks the API version query string required for these models. The full endpoint should be: https://[deployment_name].services.ai.azure.com/models/chat/completions?api-version=2024-05-01-preview

Seeking help from the GraphRAG team to add support for these Azure AI Model Inference models. Appreciate any workaround!
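
For anyone debugging this, a rough way to confirm the deployment itself is reachable (outside GraphRAG) is to call the chat completions endpoint directly with the api-version query string appended. The endpoint, key variable, and model name below are placeholders, and some deployments may expect an api-key header instead of a bearer token:

```python
import os
import requests

# Placeholder Azure AI Model Inference endpoint; note the api-version query
# string that GraphRAG's auto-completed URL currently omits.
endpoint = "https://<deployment_name>.services.ai.azure.com/models/chat/completions"
params = {"api-version": "2024-05-01-preview"}
headers = {
    # Or an "api-key" header, depending on the deployment.
    "Authorization": f"Bearer {os.environ['AZURE_AI_API_KEY']}",
    "Content-Type": "application/json",
}
body = {
    "model": "Phi-4",  # placeholder model name
    "messages": [{"role": "user", "content": "Say hello."}],
}

resp = requests.post(endpoint, params=params, headers=headers, json=body)
print(resp.status_code, resp.json())
```

If this returns 200 but GraphRAG still fails, the problem is the missing query string rather than the deployment or the key.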

@yueqianh

I managed to use LiteLLM to route access to Azure AI Services (Azure AI Studio / Azure AI Foundry). You can give it a try!
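
A minimal proxy config along these lines worked for me (the model name, host, and key variable are placeholders; azure_ai is LiteLLM's provider prefix for Azure AI endpoints):

```yaml
# litellm_config.yaml -- start the proxy with: litellm --config litellm_config.yaml
model_list:
  - model_name: phi-4                       # the name GraphRAG will request
    litellm_params:
      model: azure_ai/Phi-4                 # placeholder deployment
      api_base: https://<deployment_name>.services.ai.azure.com/models
      api_key: os.environ/AZURE_AI_API_KEY
```

Then point api_base in settings.yaml at the proxy (it listens on http://localhost:4000 by default) and keep the OpenAI Chat type.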


pholz commented Feb 13, 2025

Thank you, I got it working with a LiteLLM proxy and a serverless mistral-mini deployment. If anyone else tries this, note that you will probably need to set drop_params: true in the proxy config; otherwise Azure will return errors.
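
In case it helps, the relevant part of my proxy config looked roughly like this (the deployment host and key variable are placeholders):

```yaml
litellm_settings:
  drop_params: true   # drop request parameters the Azure endpoint doesn't accept

model_list:
  - model_name: mistral-mini
    litellm_params:
      model: azure_ai/mistral-mini          # placeholder for the serverless deployment
      api_base: https://<deployment>.<region>.models.ai.azure.com
      api_key: os.environ/AZURE_AI_API_KEY
```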
