


It doesn't. It has to connect to SOME LLM provider, but that CAN also be a local Ollama server (a running instance). The choice ALWAYS needs to be present since, depending on your use case, Ollama (a local-machine LLM) could be just right, or it could be completely unusable, in which case you can always switch to data-center-sized LLMs.

The ReadMe gives only an Anthropic example, but, judging by the source code [1], you can use other providers, including Ollama, just by changing that one config-file line.

[1] https://github.com/localgpt-app/localgpt/blob/main/src%2Fage...
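
To illustrate the general pattern (a sketch of the usual OpenAI-compatible approach, not localgpt's exact config syntax; check the linked source for that), pointing a client at a local Ollama server is just a matter of swapping the base URL and model name:

    # Minimal sketch: talking to a local Ollama server through its
    # OpenAI-compatible endpoint. Not localgpt's actual config.
    from openai import OpenAI

    client = OpenAI(
        base_url="http://localhost:11434/v1",  # Ollama's OpenAI-compatible API
        api_key="ollama",  # Ollama ignores the key, but the client requires one
    )

    response = client.chat.completions.create(
        model="llama3.1",  # any model you have pulled locally
        messages=[{"role": "user", "content": "Hello from a local LLM"}],
    )
    print(response.choices[0].message.content)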


> but I'm not really sure about calling it "local-first" as it's still reliant on an `ANTHROPIC_API_KEY`.

See here:

https://github.com/localgpt-app/localgpt/blob/main/src%2Fage...


What reasonably comparable model can be run locally on, say, 16GB of video memory, compared to Opus 4.6? As far as I know, Kimi (while good) needs serious GPUs: an RTX 6000 Ada minimum, more likely an H100 or H200.

Mistral's Devstral¹ models are very good and can be run locally.

They are among the top open models, and surpass some closed ones.

I've been using Devstral, Codestral and Le Chat exclusively for three months now, all from Mistral's hosted versions: agentic, as completion, and for day-to-day stuff. It's not perfect, but neither is any other model or product, so it's good enough for me. Less anecdotal are the various benchmarks that put them surprisingly high in the rankings.

¹https://mistral.ai/news/devstral


Nothing will come close to Opus 4.6 here. You will be able to fit a distilled 20B to 30B model on your GPU. gpt-oss-20b is quite good in my local testing on a MacBook Pro M2 Pro with 32GB.

The bigger downside, when you compare it to Opus or any other hosted model, is the limited context. You might be able to achieve around 30k. Hosted models often have 128k or more. Opus 4.6 has 200k as its standard and 1M in API beta mode.


There are local models with larger context, but the memory requirements explode pretty quickly so you need to lower parameter count or resort to heavy quantization. Some local inference platforms allow you to place the KV cache in system memory (while still otherwise using GPU). Then you can just use swap to allow for even very long contexts, but this slows inference down quite a bit. (The write load on KV cache is just appending a KV vector per inferred token, so it's quite compatible with swap. You won't be wearing out the underlying storage all that much.)
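
To put rough numbers on that, here is a back-of-envelope sketch; the layer/head/dimension values are illustrative assumptions, typical for a ~30B grouped-query-attention model, not any specific model's config:

    # Back-of-envelope KV cache sizing with assumed architecture numbers.
    def kv_cache_bytes(n_layers, n_kv_heads, head_dim, n_tokens, bytes_per_elem=2):
        # 2x for the separate K and V tensors; fp16/bf16 = 2 bytes per element
        return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem * n_tokens

    for ctx in (8_192, 32_768, 131_072):
        gib = kv_cache_bytes(48, 8, 128, ctx) / 2**30
        print(f"{ctx:>7} tokens -> {gib:5.1f} GiB of KV cache")
    # 1.5 GiB at 8k grows to 24 GiB at 128k, on top of the weights,
    # which is why long contexts get pushed into system RAM or swap.

With full multi-head attention instead of GQA the numbers would be several times larger still.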

I made something similar to this project, and tested it against a few 3B and 8B models (Qwen and Ministral, both the instruction and the reasoning variants). I was pleasantly surprised by how fast and accurate these small models have become. I can ask it things like "check out this repo and build it", and with a Ralph strategy eventually it will succeed, despite the small context size.
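
As I understand it, the Ralph strategy is little more than re-running the agent with a fresh context until an external check passes. A minimal sketch, where run_agent is a hypothetical stand-in for your agent harness and success is simply "the build exits cleanly":

    import subprocess

    def run_agent(prompt: str) -> None:
        ...  # hypothetical: invoke your local 3B/8B model in agent mode

    def task_succeeded() -> bool:
        # An external, objective check; here, "does the build pass".
        return subprocess.run(["make", "build"], capture_output=True).returncode == 0

    def ralph(prompt: str, max_attempts: int = 20) -> bool:
        for attempt in range(1, max_attempts + 1):
            run_agent(prompt)  # small models fail often; fresh attempts are cheap
            if task_succeeded():
                print(f"succeeded on attempt {attempt}")
                return True
        return False

    ralph("check out this repo and build it")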

Nothing close to Opus is available in open weights. That said, do all your tasks need the power of Opus?

The problem is that having to actively decide when to use Opus defeats much of the purpose.

You could try letting a model decide, but given my experience with at least OpenAI’s “auto” model router, I’d rather not.


I also don't like having to think about it; if it were free, I would not bother, even though keeping a decent local alternative going is a good defensive move regardless.

But let's face it. For most people Opus comes at a significant financial cost per token if used more than very casually, so using it for rather trivial or iterative tasks that nevertheless consume a lot of tokens is something to avoid.


The idea is really good, and I would be the first to use it for all my IT infrastructures, BUT, judging by the site and the documentation, this is just one half of the equation. The other half, which is sorely missing in my humble opinion, is an AUTOMATED import & sync engine. If I need to build the whole infrastructure in your editor, then it is just a game, like a million other games on Steam. To make this thing really useful, I would need to feed it raw data (a router export, an SNMP walk, a terraform/AWS config...) and it would need to build the city on its own, and then allow me to monitor/sync by hooking into metrics. In other words, the source of truth needs to remain in one place, while your engine needs to be able to digest/hook into that source of truth.
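
To sketch the shape I mean (everything below is hypothetical, purely to illustrate the adapter idea): thin importers that parse each external source of truth into one internal model, which the editor only renders and re-syncs, never owns.

    # Hypothetical sketch of the import & sync shape: adapters normalize
    # external sources of truth into one internal model.
    from dataclasses import dataclass, field

    @dataclass
    class Node:
        name: str
        kind: str                      # "router", "host", "vpc", ...
        links: list[str] = field(default_factory=list)

    def import_snmp_walk(raw: str) -> list[Node]:
        ...  # parse interface/neighbor data into Nodes

    def import_terraform_state(raw: str) -> list[Node]:
        ...  # map resources (aws_instance, aws_vpc, ...) into Nodes

    def sync(nodes: list[Node]) -> None:
        ...  # re-pull from the source and refresh metrics; never hand-edit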

This is great! Thanks for sharing. One usability issue for me: the context menu entries are shown only in the file explorer, but NOT in the open-files pane or in the file tab's right-click menu. This kind of forces me to go back to the file explorer whenever I want to work with file markers. It would be awesome if the entries could be included everywhere a file context menu is shown. Thanks.

That makes sense. I'll check that one. But for now, I think the workaround is to use Ctrl/Cmd + Shift + M (the default shortcut key) to toggle markers for the opened file.

How is this different than KeePassXC?

Great question. I'm a long-time user of KeePassXC myself, and I see Sklad as complementary to it, not a replacement.

The main difference is the workflow and friction.

1. Tray-First vs. Window-First: KeePassXC is primarily window-based. Even with the tray icon, retrieving a specific entry usually involves opening the window, searching (Cmd/Ctrl+F), and copying. Sklad exposes your entire folder hierarchy as a native recursive tray menu. You right-click the icon, hover through `Servers -> Client A -> SSH Key`, and click to copy. It allows for "muscle memory" access without ever switching focus or managing windows (a minimal sketch of this pattern follows after this list).

2. Snippets vs. Credentials: I use KeePassXC for high-security web logins and bank details. I built Sklad for "operational" data—SSH keys, complex CLI one-liners, specific IP addresses, and env vars that I need to grab 20 times a day.

3. Hierarchy Visualization: Sklad allows you to visualize the tree structure instantly via the menu, which feels faster for mental mapping of infrastructure than a flat search list.

In short: KeePassXC is a vault; Sklad is a quick-access utility belt.
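
For the curious, the core of that recursive-tray-menu pattern is tiny. Here is a minimal sketch using pystray and pyperclip; it assumes nothing about Sklad's actual implementation, and the snippet data is made up:

    # Sketch of "folder hierarchy -> recursive tray menu -> click copies
    # to clipboard" using pystray and pyperclip. Not Sklad's code.
    import pystray
    import pyperclip
    from PIL import Image

    SNIPPETS = {
        "Servers": {
            "Client A": {"SSH Key": "ssh-ed25519 AAAA...", "IP": "10.0.0.5"},
        },
        "One-liners": {"disk usage": "du -sh * | sort -h"},
    }

    def build_menu(tree: dict) -> pystray.Menu:
        items = []
        for name, value in tree.items():
            if isinstance(value, dict):  # folder: recurse into a submenu
                items.append(pystray.MenuItem(name, build_menu(value)))
            else:                        # leaf: clicking copies the value
                items.append(pystray.MenuItem(
                    name, lambda icon, item, v=value: pyperclip.copy(v)))
        return pystray.Menu(*items)

    icon = pystray.Icon("tray-sketch", Image.new("RGB", (16, 16)),
                        menu=build_menu(SNIPPETS))
    icon.run()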


What's worse: Telegram's alleged accessibility to Moscow/FSB, or WhatsApp's proven spying and data selling to anyone?

Tough choice, but I would choose WhatsApp over the FSB.

Why? There is a low chance of the FSB successfully prosecuting you, as a western citizen, for doing illegal/silly things on Telegram.

The Five Eyes spy networks, on the other hand (UK, USA, Australia, etc.), are already working with your western government...

So I would rather be compromised in Russia, with zero chance of extradition there, than face a non-zero chance of extradition to the USA, UK, Germany, etc.

(Let's say you are producing fake Coco Chanel perfumes.)


It is well known that the secret services of unfriendly countries use whatever material they can get as blackmail. The risk is not getting extradited to Russia; the risk is a Russian agent pressuring someone who works at (say) a defense company to do their bidding.

I'm not a big fan of US politics at the moment, but I'd still easily choose US spying over Russian. There is still some difference between these countries.

>There is a low chance of FSB successfully prosecuting you as western Citizen doing illegal/silly things in Telegram

I'm more worried about electoral interference and stoking of social tensions than prosecution, personally.


>> 1. I ask Claude: "Zip the logs and send them to Bob."

>> 2. Claude uses the tool to generate a one-time P2P link.

>> 3. My coworker clicks the link to download immediately

It is not clear: how DOES the co-worker get the link?


Oh, in this demo the right panel is just simulating the co-worker. In a real case, I would send the link to him via IM (for example, Slack). Since this is only a demo, I’m playing both roles myself. :p

Android only?

