July 15th, 2024

Self hosting a Copilot replacement: my personal experience

The author shares their experience self-hosting a GitHub Copilot replacement using local Large Language Models (LLMs). Results varied, with none matching Copilot's speed and accuracy. Despite challenges, the author plans to continue using Copilot.

The article discusses the author's experience with self-hosting a GitHub Copilot replacement using local Large Language Models (LLMs). The author, a Software Developer, explores this alternative to using external services like Copilot and ChatGPT. They clarify that AI tools are meant to assist, not replace human understanding and decision-making. The experiment involved running LLMs locally on a MacBook Pro and testing various models and VSCode extensions. Results varied based on the LLM model used, with none matching the speed and accuracy of GitHub Copilot. The author concludes that while the idea of a personal code assistant is appealing, achieving Copilot's performance level is challenging. They anticipate improvements in models and extensions over time. Despite the challenges, the author plans to continue using GitHub Copilot for their personal use. The article invites feedback on better models or extensions for future testing.

10 comments
By @ghthor - 6 months
I’m self hosting using TabbyML and running StarCoder 3B on my nvidia rtx2080 super and I can’t imagine coding without it anymore. It consistently, across all languages I work in, gives me great completions.

To the people in this thread who aren't having much success: I would suggest trying again.

By @tcdent - 6 months
Went through this same exercise this week and came to the same conclusion.

After trying multiple open models, reconfiguring GPT-4o and seeing the speed and quality of the output was illuminating.

By @cmpit - 6 months
I also wanted to try some local LLMs, but gave up and came to the same conclusion:

"While the idea of having a personal and private instance of a code assistant is interesting (and can also be the only available option in certain environments), the reality is that achieving the same level of performance as GitHub Copilot is quite challenging."

But considering the pace at which AI and the ecosystem advances, things might change soon.

By @zamalek - 6 months
I believe we'll need a purpose-built ASIC with access to 100GB of good old [G]DDR5 before this becomes viable. Something like what Hailo offers, but without the "product inquiry" barrier.

I say that because we don't need datacenter speeds for a single user, but there is no avoiding memory requirements.

I don't think it will happen. The market is too niche. People are happy to fork over $5/mo.

By @rcarmo - 6 months
It really depends on the use case, and right now using Ollama for coding just isn’t that useful. I can use gemma2 and phi3 just fine for general summarization and keyword extraction (including most of the stuff I need to do home automation with a “better Siri”, a low bar, I know), but generating or autocompleting code is just another level entirely.

By @NomDePlum - 6 months
I don't use Copilot, so I can't compare, but ollama + llama3:instruct + open-webui on a Mac Pro M2 is helpful when coding.

By @sebazzz - 6 months
I wonder if running these models on a recent RTX graphics card would speed them up, instead of using an M2 Mac.

By @transformi - 6 months
It looks like you used some older models. What about deepseekcoder/Qwen?

By @fortyseven - 6 months
This is from early March. An eternity in this space. What's the REAL current status?