It doesn't. It has to connect to SOME LLM provider, but that CAN also be a local Ollama server (a running instance). The choice ALWAYS needs to be present since, depending on your use case, Ollama (an LLM on your local machine) could be just right, or it could be completely unusable, in which case you can always switch to data-center-sized LLMs.
The README gives only an Anthropic example but, judging by the source code [1], you can use other providers, including Ollama, just by changing that one line in the config file.
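For illustration only (these are not the project's actual config keys): since Ollama exposes an OpenAI-compatible endpoint, switching providers usually comes down to pointing the client at a different base URL and model name, roughly like this sketch.

```python
# Hypothetical sketch: point an OpenAI-compatible client at a local Ollama
# server instead of a hosted provider. The model name and base URL are
# assumptions; the project's real config may look different.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible endpoint
    api_key="ollama",                      # Ollama ignores the key, but the client requires one
)

resp = client.chat.completions.create(
    model="qwen2.5-coder:7b",              # any model you have pulled locally
    messages=[{"role": "user", "content": "Summarize this repo's build steps."}],
)
print(resp.choices[0].message.content)
```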
What reasonably comparable model can be run locally on, say, 16GB of video memory, compared to Opus 4.6? As far as I know Kimi (while good) needs serious GPUs, an RTX 6000 Ada at minimum, more likely an H100 or H200.
Devstral¹ offers very good models that can be run locally.
They are among the top open models, and surpass some closed ones.
I've been using Devstral, Codestral and Le Chat exclusively for three months now, all via Mistral's hosted versions: agentic work, completion, and day-to-day stuff. It's not perfect, but neither is any other model or product, so it's good enough for me. Less anecdotally, the various benchmarks put them surprisingly high in the rankings.
Nothing will come close to Opus 4.6 here. You will be able to fit a distilled 20B to 30B model on your GPU.
gpt-oss-20b is quite good in my local testing on a MacBook Pro M2 Pro with 32GB.
The bigger downside, compared to Opus or any other hosted model, is the limited context. You might be able to achieve around 30k tokens.
Hosted models often have 128k or more. Opus 4.6 has 200k as standard and 1M in API beta mode.
There are local models with larger context, but the memory requirements explode pretty quickly so you need to lower parameter count or resort to heavy quantization. Some local inference platforms allow you to place the KV cache in system memory (while still otherwise using GPU). Then you can just use swap to allow for even very long contexts, but this slows inference down quite a bit. (The write load on KV cache is just appending a KV vector per inferred token, so it's quite compatible with swap. You won't be wearing out the underlying storage all that much.)
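To make the "memory requirements explode" point concrete, here is a rough back-of-the-envelope sketch of KV-cache size; the architecture numbers are illustrative assumptions, not any specific model's.

```python
# Rough KV-cache estimate: the cache stores one key and one value vector per
# layer, per KV head, per token. Architecture numbers below are assumed for a
# generic ~20B-class model with grouped-query attention.
def kv_cache_bytes(context_len, n_layers=48, n_kv_heads=8, head_dim=128, bytes_per_elem=2):
    # 2 = one key tensor + one value tensor; bytes_per_elem=2 assumes an fp16/bf16 cache
    return 2 * n_layers * n_kv_heads * head_dim * context_len * bytes_per_elem

for ctx in (8_000, 32_000, 128_000):
    print(f"{ctx:>7} tokens -> {kv_cache_bytes(ctx) / 2**30:.1f} GiB of KV cache")
```

With those assumed numbers, 32k tokens already costs about 6 GiB of cache on top of the weights, and 128k costs about 24 GiB, which is why long contexts push you toward system memory or swap.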
I made something similar to this project, and tested it against a few 3B and 8B models (Qwen and Ministral, both the instruction and the reasoning variants). I was pleasantly surprised by how fast and accurate these small models have become. I can ask it things like "check out this repo and build it", and with a Ralph strategy eventually it will succeed, despite the small context size.
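For anyone unfamiliar with the Ralph strategy mentioned above: the idea is just to keep re-running the agent on the same task, feeding back the latest failure, until it succeeds or you give up. A minimal sketch of that loop against a local Ollama server might look like this (the endpoint, model name, and build command are assumptions):

```python
# Hypothetical Ralph-style loop: retry the whole task, feeding the last failure
# back to a small local model, until the build succeeds or attempts run out.
import subprocess
import requests

OLLAMA_URL = "http://localhost:11434/api/chat"  # assumed local Ollama endpoint
MODEL = "qwen2.5:7b-instruct"                   # assumed small local model

def ask_model(prompt: str) -> str:
    resp = requests.post(OLLAMA_URL, json={
        "model": MODEL,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,
    })
    return resp.json()["message"]["content"]

def try_build() -> subprocess.CompletedProcess:
    # Assumed build command for the checked-out repo.
    return subprocess.run(["make"], capture_output=True, text=True)

task = "Check out this repo and build it."
feedback = ""
for attempt in range(10):
    plan = ask_model(f"{task}\n\nPrevious failure (if any):\n{feedback}")
    # In a real agent, `plan` would drive tool calls (edit files, run commands).
    result = try_build()
    if result.returncode == 0:
        print(f"Build succeeded on attempt {attempt + 1}")
        break
    feedback = result.stderr[-4000:]  # keep only the tail to respect the small context
else:
    print("Gave up after 10 attempts")
```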
I also don't like having to think about it, and if it were free I would not bother, even though keeping a decent local alternative going is a good defensive move regardless.
But let's face it: for most people Opus comes at a significant financial cost per token if used more than very casually, so using it for rather trivial or iterative tasks that nevertheless consume a lot of tokens is something to avoid.
The idea is really good, and I would be the first to use it for all my IT infrastructure, BUT, judging by the site and the documentation, this is just one half of the equation. The other part that's sorely missing, in my humble opinion, is an AUTOMATED import & sync engine. If I need to build the whole infrastructure in your editor, then it is just a game, like a million other games on Steam. To make this thing really useful, I would need to feed it raw data (a router export, an SNMP walk, a terraform/AWS config...) and it would need to build the city on its own, and then allow me to monitor/sync by hooking up to metrics. In other words, the source of truth needs to remain in one place, while your engine needs to be able to digest/hook into that source of truth.
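To make that concrete, here is the kind of importer I mean, sketched against a Terraform state file (the resource types and the node shape are just assumptions about how your engine might model things):

```python
# Hypothetical importer: read a Terraform state file and emit nodes that a
# visualization engine could place on the map. The node schema is assumed.
import json

def import_tfstate(path: str):
    with open(path) as f:
        state = json.load(f)
    nodes = []
    for res in state.get("resources", []):
        for inst in res.get("instances", []):
            attrs = inst.get("attributes", {})
            nodes.append({
                "id": f'{res["type"]}.{res["name"]}',
                "kind": res["type"],  # e.g. aws_instance, aws_vpc
                "label": (attrs.get("tags") or {}).get("Name", res["name"]),
                "addresses": [attrs.get("private_ip"), attrs.get("public_ip")],
            })
    return nodes

if __name__ == "__main__":
    for node in import_tfstate("terraform.tfstate"):
        print(node)
```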
This is great! Thanks for sharing. One usability issue for me is that the context menu entries are shown only in the file explorer, but NOT in the open editors pane or the file tab right-click menu. This kind of forces me to go back to the file explorer whenever I want to work with file markers. It would be awesome if the entries could be included everywhere a file context menu is shown. Thanks.
That makes sense. I'll check that one. For now, I think the workaround is to use Ctrl/Cmd + Shift + M (the default shortcut) to toggle markers for the open file.
Great question. I'm a long-time user of KeePassXC myself, and I see Sklad as complementary to it, not a replacement.
The main difference is the workflow and friction.
1. Tray-First vs. Window-First: KeePassXC is primarily window-based. Even with the tray icon, retrieving a specific entry usually involves opening the window, searching (Cmd/Ctrl+F), and copying. Sklad exposes your entire folder hierarchy as a native recursive tray menu. You right-click the icon, hover through `Servers -> Client A -> SSH Key`, and click to copy. It allows for "muscle memory" access without ever switching focus or managing windows.
2. Snippets vs. Credentials: I use KeePassXC for high-security web logins and bank details. I built Sklad for "operational" data—SSH keys, complex CLI one-liners, specific IP addresses, and env vars that I need to grab 20 times a day.
3. Hierarchy Visualization: Sklad allows you to visualize the tree structure instantly via the menu, which feels faster for mental mapping of infrastructure than a flat search list.
In short: KeePassXC is a vault; Sklad is a quick-access utility belt.
It is well known that the secret services of unfriendly countries use whatever material they can get for blackmail. The risk is not getting extradited to Russia; the risk is a Russian agent pressuring someone who works at (say) a defense company to do their bidding.
Oh, in this demo the right panel is just simulating the co-worker. In a real case, I would send the link to him via IM (for example, Slack). Since this is only a demo, I’m playing both roles myself. :p
https://github.com/localgpt-app/localgpt/blob/main/src%2Fage...