
[Feature Request] ollama support #18

Open
zhuozhiyongde opened this issue Feb 11, 2025 · 2 comments
zhuozhiyongde commented Feb 11, 2025

First of all, thanks for your work on this!

My main thought is that Copilot still feels a bit dumb. With LLMs improving so quickly, would it be possible to use a local model for context completion? That way there would be no need to bother calling Copilot at all, and locally even a small 7B model should be fast enough, right?

Snowflyt (Owner) commented:

Intuitively this seems feasible. The plugin currently contains a lot of code implementing the LSP protocol in order to interact with GitHub Copilot; switching to Ollama might actually be simpler.

There are a few challenges I can think of. One is that some documents are very long, and small models seem insufficient to support such a large context window. This might require special handling, such as inputting the document content in chunks and summarizing it step by step. I'm not sure whether this process will lose too much information from earlier parts of the text, which could result in poor completion performance. Additionally, performance might be an issue, as such long contexts could significantly slow down response times.

I'm also concerned about whether small local models are "smart" enough to provide an experience similar to GitHub Copilot, and whether they can do so quickly. Models like Qwen 7B/14B can indeed produce some decent completion results, but GitHub Copilot is a "large model": its parameter count lets it hold far more knowledge than smaller models, and it's also much faster. How well this works in practice remains to be seen once it's implemented.

I came across a similar project, a web-based Markdown editor, that also provides Copilot-like text completion, which you might want to check out: https://github.com/fynnfluegge/rocketnotes

Maybe I'll start a new branch to see if I can implement a prototype once I have some free time. Or, if anyone is willing, I'd be happy to see a PR for this.




Snowflyt pinned this issue Feb 11, 2025

fangtaosong commented Feb 23, 2025

Seconding this. In fact, today's GitHub Copilot may itself be only a model of a dozen-or-so billion parameters or even smaller (Tongyi Lingma, at a few hundred million parameters, delivers a comparable or even better experience). A prototype could start by supporting basic OpenAI-spec HTTP requests.
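A prototype along those lines could look like the sketch below. Ollama does expose an OpenAI-compatible endpoint at `/v1/chat/completions`, so only a standard chat-completion payload is needed; the model name, prompt wording, and sampling parameters here are illustrative assumptions, not anything this project has settled on.

```python
import json
from urllib import request

# Ollama's OpenAI-compatible chat endpoint (default local port).
OLLAMA_URL = "http://localhost:11434/v1/chat/completions"

def build_completion_request(prefix, suffix="", model="qwen2.5:7b"):
    """Build an OpenAI-spec chat-completion payload asking the model to
    continue the text before the cursor. Model name and prompt are
    illustrative placeholders."""
    return {
        "model": model,
        "messages": [
            {"role": "system",
             "content": "Continue the user's text at <CURSOR>. "
                        "Reply with the continuation only."},
            {"role": "user", "content": prefix + "<CURSOR>" + suffix},
        ],
        "max_tokens": 64,
        "temperature": 0.2,
    }

def complete(prefix, suffix=""):
    """Send the payload to a local Ollama server and return the
    completion text (requires a running Ollama instance)."""
    payload = json.dumps(build_completion_request(prefix, suffix)).encode()
    req = request.Request(OLLAMA_URL, data=payload,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]
```

Because the payload follows the OpenAI spec, the same code would also work against any other OpenAI-compatible backend by swapping the URL and model name.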
