Code Completion and Custom OpenAI #759

Open
Pietro395 opened this issue Nov 11, 2024 · 4 comments
Labels
bug Something isn't working

Comments

@Pietro395

What happened?

I am using CodeGPT via Custom OpenAI with Openrouter.
I configured the template as suggested, but I never get code, only sentences like this:
It looks like you're working on a Python function that interacts with an OPC-UA server and performs some operations

[screenshot from 2024-11-11 12:38:34]

For code completion I am using DeepSeek Coder, with these settings:
[screenshot of code completion settings]

Relevant log output or stack trace

No response

Steps to reproduce

No response

CodeGPT version

2.12.3

Operating System

Linux

@Pietro395 added the bug label Nov 11, 2024
@carlrobertoh (Owner)

I'm not sure if any of the OpenRouter models can be used for code completions, as they only provide an OpenAI-compatible Chat Completions API.

If you want to use DeepSeek Coder for code completions, you must configure the DeepSeek API explicitly and use the following configuration: https://api-docs.deepseek.com/guides/fim_completion
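For reference, the linked guide boils down to a call roughly like this (a minimal sketch using the OpenAI Python SDK; the /beta base URL and model name are my reading of the DeepSeek docs, so double-check them there):

```python
# Hedged sketch of a DeepSeek FIM (fill-in-the-middle) completion call,
# per https://api-docs.deepseek.com/guides/fim_completion.
from openai import OpenAI

client = OpenAI(
    api_key="<DEEPSEEK_API_KEY>",              # your own key
    base_url="https://api.deepseek.com/beta",  # FIM is served under /beta
)

response = client.completions.create(
    model="deepseek-chat",
    prompt="def fib(n):\n    ",    # code before the cursor
    suffix="\n    return result",  # code after the cursor
    max_tokens=128,
)
print(response.choices[0].text)    # the generated "middle"
```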

@Pietro395 (Author)

It seems the problem is that DeepSeek Coder, for example, does not support FIM in the OpenRouter version.
I am using gpt-3.5-turbo now and it seems to work. I will run more tests, thanks!

@carlrobertoh (Owner)

I will post the same text here as I did on Reddit for others as well.

Unfortunately, in most cases, the models supported by OpenRouter are only accessible via the /v1/chat/completions endpoint, which means they cannot be used for code completion. This is due to the request format; the exception is gpt-3.5-turbo-instruct, which can be accessed via the legacy /v1/completions endpoint by setting the correct prefix and suffix parameters.
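To make the request-format difference concrete, here is a rough sketch of the two payloads (illustrative values only):

```python
# Chat Completions: structured messages and no suffix field, so there is
# no way to tell the model what comes *after* the cursor.
chat_request = {
    "model": "gpt-3.5-turbo",
    "messages": [{"role": "user", "content": "def fib(n):"}],
}

# Legacy Completions: a raw prompt plus an optional suffix, which is
# exactly what fill-in-the-middle code completion needs.
fim_request = {
    "model": "gpt-3.5-turbo-instruct",
    "prompt": "def fib(n):\n    ",    # code before the cursor
    "suffix": "\n    return result",  # code after the cursor
    "max_tokens": 64,
}
```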

Fortunately, you have a few other options to get this working:

The easiest option is likely to use our native llama.cpp local integration. The plugin comes with a pre-packaged llama.cpp backend along with a user-friendly interface that allows you to download and run models without needing any other third-party clients, such as Ollama or Open WebUI. Each model is already bound to an appropriate FIM template, so you don't have to worry about constructing the prompt yourself (see the sketch below).
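To illustrate what such a template produces (the sentinel tokens below follow Qwen2.5-Coder's published FIM format; other model families use different tokens):

```python
# Illustrative only: a FIM template folds the text around the cursor
# into a single prompt using model-specific sentinel tokens.
prefix = "def add(a, b):\n    "
suffix = "\n\nprint(add(1, 2))"

fim_prompt = f"<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"
# The model then generates the "middle", e.g. "return a + b".
```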

Simply choose and download the model (each model is downloaded directly from HuggingFace with a nice loading progress indicator). Once the model is downloaded, start the server and you're good to go!

However, please note that the llama.cpp integration can only be used on UNIX-based systems; the logic for building and running the server is not yet supported on Windows machines. If you're using Windows, you'll likely find Ollama to be your best friend. The extension can recognize the models you have already downloaded.

For other models provided by cloud providers, you can use the Custom OpenAI provider. This option is highly configurable for various needs, including on-premise models hosted within your company.

Let's say you want to use the newest Qwen2.5 Coder 32B model. One option is to use it with the Fireworks API via the /v1/completions endpoint, which supports raw input, allowing us to send the pre-built FIM prompt (a raw request matching these steps is sketched after the list):

  • Select the Custom OpenAI provider and use the Fireworks preset template
  • Fill in your API key obtained from their account page
  • Choose the proper FIM template (CodeQwen 2.5) from the Code Completion section
  • Replace the model body parameter value with accounts/fireworks/models/qwen2p5-coder-32b-instruct
  • Click apply
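For the curious, that configuration amounts to a raw request roughly like the sketch below (the base URL is my assumption of Fireworks' OpenAI-compatible completions endpoint; verify it against their docs):

```python
import requests

resp = requests.post(
    "https://api.fireworks.ai/inference/v1/completions",  # assumed base URL
    headers={"Authorization": "Bearer <FIREWORKS_API_KEY>"},
    json={
        "model": "accounts/fireworks/models/qwen2p5-coder-32b-instruct",
        # The plugin sends the pre-built FIM prompt as raw input:
        "prompt": "<|fim_prefix|>def add(a, b):\n    <|fim_suffix|><|fim_middle|>",
        "max_tokens": 64,
    },
)
print(resp.json()["choices"][0]["text"])
```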

The same model can also be used for other features, such as regular chats, by simply replacing the model value in the Chat Completions tab.

We have many existing preset configurations in place. For instance, to use the Codestral model, simply select the Mistral AI template, enter the API key, and click apply. The values are already pre-filled, so the additional steps from the previous example are unnecessary.
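For context, Codestral's native FIM calls look roughly like this (Mistral documents a dedicated /v1/fim/completions endpoint; whether the preset uses this exact URL is an assumption, so check the pre-filled values in the template):

```python
import requests

resp = requests.post(
    "https://api.mistral.ai/v1/fim/completions",  # Mistral's documented FIM endpoint
    headers={"Authorization": "Bearer <MISTRAL_API_KEY>"},
    json={
        "model": "codestral-latest",
        "prompt": "def add(a, b):\n    ",  # code before the cursor
        "suffix": "\n\nprint(add(1, 2))",  # code after the cursor
        "max_tokens": 64,
    },
)
print(resp.json())  # the completion text is inside the "choices" array
```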

@Pietro395 (Author)


I will also answer you here as well as on Reddit:

Thanks for the clarification. I am doing some more testing with OpenRouter, and setting the code completion URL to https://openrouter.ai/api/v1/completions seems to work with Qwen 2.5 Coder.
I will continue to use it and give you feedback.
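For anyone replicating this, the equivalent raw request would look roughly like the sketch below (the model slug is my assumption; check OpenRouter's model list for the exact ID):

```python
import requests

resp = requests.post(
    "https://openrouter.ai/api/v1/completions",
    headers={"Authorization": "Bearer <OPENROUTER_API_KEY>"},
    json={
        "model": "qwen/qwen-2.5-coder-32b-instruct",  # assumed model slug
        "prompt": "def add(a, b):\n    return",
        "max_tokens": 64,
    },
)
print(resp.json()["choices"][0]["text"])
```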
