staticassertion's comments | Hacker News

I suspect we'll see combinations of symbolic execution + fuzzing as contextual inputs to LLMs, with LLMs delegating highly directed tasks to these external tools that are radically faster at exploring a space with the LLM guiding based on its own semantic understanding of the code.
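That division of labor - fast external tools exploring the space, with something smarter choosing seeds - can be sketched generically. Here's a toy coverage-guided fuzzer in Python where the seed corpus is the slot an LLM (or any semantic guide) would fill; the target and all names are invented for illustration, not any real tool's API:

```python
import random
import sys

def fuzz(target, seeds, iterations=60000):
    """Tiny coverage-guided fuzzer: mutate corpus entries one byte at a
    time, keep any input that reaches new line coverage, and return the
    first input that makes the target raise."""
    corpus = list(seeds)
    seen = set()

    def run(data):
        lines = set()
        def tracer(frame, event, arg):
            if event == "line":
                lines.add(frame.f_lineno)
            return tracer
        sys.settrace(tracer)
        try:
            target(data)
            return lines, None
        except Exception as exc:
            return lines, exc
        finally:
            sys.settrace(None)

    for _ in range(iterations):
        data = bytearray(random.choice(corpus))
        if data:
            data[random.randrange(len(data))] = random.randrange(256)
        data = bytes(data)
        lines, exc = run(data)
        if exc is not None:
            return data                  # crashing input found
        if not lines <= seen:            # new coverage: keep this input
            seen |= lines
            corpus.append(data)
    return None

# Invented target: the "bug" is gated behind a 3-byte check that blind
# random inputs would essentially never hit, but coverage feedback lets
# the fuzzer find it byte by byte.
def target(data):
    if len(data) >= 3 and data[0] == 0x42:
        if data[1] == 0x55:
            if data[2] == 0x47:
                raise RuntimeError("bug reached")
```

The interesting part is what goes in `seeds`: a dumb fuzzer starts from junk, while an LLM that has read the code can hand over inputs that already sit near the interesting branches.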

I'm with you, I expected this to be happening already. Funny enough, I guess even a hardened codebase isn't at that level of "we need to optimize this" currently so you can just throw tokens at the problem.


Right, so that's exactly how I was thinking about it before I talked to Carlini. Then I talked to Carlini for the SCW podcast. Then I wrote this piece.

I don't know that I'm ready to say that the frontier of vulnerability research with agents is modeling, fuzzing, and analysis (orchestrated by an agent). It may very well be that the models themselves stay ahead of this for quite some time.

That would be a super interesting result, and it's the result I'm writing about here.


> Everything is up in the air. The industry is sold on memory-safe software, but the shift is slow going. We’ve bought time with sandboxing and attack surface restriction. How well will these countermeasures hold up? A 4 layer system of sandboxes, kernels, hypervisors, and IPC schemes are, to an agent, an iterated version of the same problem. Agents will generate full-chain exploits, and they will do so soon.

I think this is the interesting bit. We have some insanely powerful isolation technology and mitigations. I can put a webassembly program into a seccomp'd wrapper in an unprivileged user into a stripped down Linux environment inside of Firecracker. An attacker breaking out of that feels like science fiction to me. An LLM could do it but I think "one shots" for this sort of attack are extremely unlikely today. The LLM will need to find a wasm escape, then a Linux LPE that's reachable from an unprivileged user with a seccomp filter, then once they have kernel control they'll need to manipulate the VM state or attack KVM directly.

A human being doing those things is hard to imagine. Exploitation of Firecracker is, from my view, extremely difficult. The bug density is very low - code quality is high and mitigation adoption is a serious hurdle.

Obviously people aren't just going to deploy software the way I'm suggesting, but even just "I use AWS Fargate" is a crazy barrier that I'm skeptical an LLM will cross.

> Meanwhile, no defense looks flimsier now than closed source code.

Interesting, I've had sort of the opposite view. Giving an LLM direct access to the semantic information of your program, the comments, etc, feels like it's just handing massive amounts of context over. With decompilation I think there's a higher risk of it missing the intention of the code.

edit: I want to also note that with LLMs I have been able to do sort of insane things. A little side project I have uses iframe sandboxing insanely aggressively. Most of my 3rd party dependencies are injected into an iframe, and the content is rendered in that iframe. It can communicate to the parent over a restricted MessageChannel. For cases like "render markdown" I can even leverage a total-blocking CSP within the sandbox.

Writing this by hand would be silly, I can't do it - it's like building an RPC for every library I use. "Resize the window", "User clicked this link", etc, all have to be written individually. But with an LLM I'm getting sort of silly levels of safety here - Chrome is free to move each iframe into its own process, I get isolated origins, I'm immune to supply chain vulnerabilities, mostly immune to XSS (within the frame, where most of the opportunity is), and CSRF is radically harder. LLMs have made adoption of Trusted Types and other mitigations insanely easy for me and, IMO, these sorts of mitigations are more effective at preventing attacks than LLMs will be at finding bypasses (contentious and platform dependent though!).

I suppose this doesn't have any bearing on the direct position of the blog post, which is scoped to the new role for vulnerability research, but I guess my interest is obviously going to be more defense oriented as that's where I live :)
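The "RPC for every library" idea above boils down to an allowlisted dispatch surface between the sandbox and the parent. A rough sketch of the pattern in plain Python (the browser version would use postMessage/MessageChannel; all names here are invented):

```python
class SandboxBridge:
    """Parent-side endpoint for a sandboxed frame: the sandbox can only
    invoke operations explicitly exposed here, nothing else."""
    def __init__(self):
        self._handlers = {}

    def expose(self, name, fn):
        self._handlers[name] = fn

    def handle(self, message):
        # Every request is validated against the allowlist; anything
        # unknown is rejected rather than dispatched.
        op = message.get("op")
        if op not in self._handlers:
            raise PermissionError(f"operation {op!r} not exposed to sandbox")
        return self._handlers[op](*message.get("args", []))

bridge = SandboxBridge()
# Each exposed operation clamps or validates its inputs at the boundary.
bridge.expose("resize", lambda w, h: (min(w, 1920), min(h, 1080)))
bridge.expose("open_link", lambda url: url.startswith("https://"))
```

A compromised frame can still send messages, but the worst it can do is call "resize" or "open_link" with hostile arguments, both of which get sanitized at the boundary.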


> With decompilation I think there's a higher risk of it missing the intention of the code.

I'm not sure but suspect the lack of comments and documentation might be an advantage to LLMs for this use case. For security/reverse engineering work, the code's actual behavior matters a lot more than the developer's intention.


I think the other side of that is that mismatches between intention and implementation are exactly where you're going to find vulnerabilities. The LLM that looks at closed source code has to guess the intention to a greater degree.

This is true for a lot of things but for low-level code you can always fall back to "the intention is to not violate memory safety".

That's true, but certainly that's limiting. Still, even then, `# SAFETY:` comments seem extremely helpful. "For every `unsafe`, determine its implied or stated safety contract, then build a suite of adversarial tests to verify or break those contracts" feels like a great way to get going.
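That "derive the contract, then attack it" loop is easy to sketch outside of Rust, too. A minimal adversarial-testing harness in Python, with an invented function whose implied contract fails at an edge case:

```python
import random

def check_contract(fn, contract, gen_inputs, trials=1000):
    """Adversarially test a stated safety contract: generate inputs and
    return the first counterexample where the contract doesn't hold."""
    for _ in range(trials):
        args = gen_inputs()
        if not contract(fn, *args):
            return args
    return None

# Invented example: the implied contract is "the index is clamped into
# bounds", but the implementation forgets negative indices and empty lists.
def clamped_get(xs, i):
    i = min(i, len(xs) - 1)
    return xs[i]

def contract(fn, xs, i):
    try:
        fn(xs, i)
        return True
    except IndexError:
        return False          # contract violated: out-of-bounds access

def gen_inputs():
    n = random.randrange(4)                       # short lists, incl. empty
    return [list(range(n)), random.randrange(-2, 6)]
```

The same shape works for `unsafe` blocks: the `# SAFETY:` comment becomes the `contract`, and the agent's job is to write a `gen_inputs` nasty enough to break it.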

It's limiting from the PoV of a developer who wants to ensure that their own code is free of all security issues. It is not limiting from the point of view of an attacker who just needs one good memory safety vuln to win.

I wonder if your background just has you fooled. I worked on a data science team and code was always a commodity. Most data scientists know how to code in a fairly trivial way, just enough to get their models built and served. Even data engineers largely know how to just take that and deploy to Spark. They don't really do much software engineering beyond that.

I'm not being precious here or protective of my "art" or whatever. But I do find it sort of hilarious and obvious that someone on a data science team might not understand the aesthetic value of code, and I suspect anyone else who has worked on such a team/ with such a team can probably laugh about the same thing - we've uh... we've seen your code. We know you don't value aesthetic code lol. Single variable names, `df1`, `df2`, `df3`.

I'm not particularly uncomfortable at the moment because understanding computers, understanding how to solve problems, understanding how to map between problems and solutions, what will or won't meet a customer's expectations, etc, is still core to the job as it always has been. Code quality is still critical as well - anyone who's vibe-coded >15KLOC projects will know that models simply cannot handle that scale unless you're diligent about how it should be structured.

My job has barely changed semantically, despite rapid adoption of AI.


I'm a software engineer and _I_ don't understand the aesthetic value of code. I'm interested in architecture and maintainability, but I couldn't give a rat's ass what some section of code looks like, so long as it conforms to a style guide and is maintainable.

> so long as it conforms to a style guide and is maintainable.

Most people consider aesthetic values to align with these things.


> We know you don't value aesthetic code lol. Single variable names, `df1`, `df2`, `df3`.

https://degoes.net/articles/insufficiently-polymorphic

> My job has barely changed semantically, despite rapid adoption of AI.

it's coming... some places move slower than others, but it's coming


> https://degoes.net/articles/insufficiently-polymorphic

lol this is not why people do "df1", "df2", etc, nor are those polymorphic names, but okay.

> it's coming... some places move slower than other but it's coming

What is coming, exactly? Again, as said, I work at a company that has rapidly adopted AI, and I have been a long time user. My job was never about rapidly producing code so the ability to rapidly produce code is strictly just a boon.


I understand that you’re trying to apply your experience to what we do as a team and that makes sense; but, we’re many many stddev beyond the 15K LOC target you identified and have no issues because we do indeed take care to ensure we’re building these things the right way.

So you understand and you agree and confirm my experience?

I have worked at many places and have seen the work of DEs and DSs that is borderline psychotic; but it got the job done, sorta. I have suffered through QA of 10000 lines that I ended up rewriting in less than 100.

So, yes; I understand where you’re coming from. But; that’s not what we do.


Yes, but then you said that you do what I'm suggesting is still critical, which is maintaining the codebase even if you heavily leverage models: "we do indeed take care to ensure we're building these things the right way."

Well, (a) why would they? (b) "uptime" has shifted from a binary "site up/down" to "degraded performance", which itself indicates improvements to uptime since we're both pickier and more precise.

Are we really questioning why cloud providers would offer better uptime guarantees?

Yes, I'm asking why they'd lock themselves into a contract around 5 9s of uptime since the parent poster mentioned that they won't do so. Of course, AWS actually does do this in some cases and they guarantee 99.99% for most things, so it feels a bit arbitrary - 5 minutes vs an hour, roughly.

So then it's clearly not as trivial to achieve as you made it sound.

Are you replying to the right person?

I could easily see this as a case where the team had a legacy area of code in a language that no one was familiar with anymore so no one felt great about actually contributing to it, so it languished, and now AI let them go "fuck it, let's just rewrite it".

PyPI is pretty best-in-class here and I think that they should be seen as the example for others to pursue.

The client side tooling needs work, but that's a major effort in and of itself.


I assume they think that the AI is fundamentally capable of it but that by prompting it they trigger something emergent? It's not totally insane on its face.

I suspect that there are many gambling addicts out there who have never been to a casino, or who found gambling in its traditional forms aesthetically off-putting. These same people, when presented with gambling in other forms, like what we've seen in video games, might suddenly present their addiction.

I suspect it's something quite similar here. People have latent or predisposed addictions but, for one reason or another, hadn't been exposed to what we've come to accept as "normal" avenues. One person might lose it all at a casino, one to drugs, alcoholism, etc, but we aren't shocked in those cases. I think AI is just another avenue that, for some reason, ticks that sort of box.

In particular, I think AI can be very inspirational in a disturbing way. In the same way I imagine a gambling addict might get trapped in a loop of hopeful ambition, setbacks, and doubling down, I think AI can lead to that exact same thing happening. "This is a great idea!" followed by "Sorry, this is a mess, let's start over", etc, is something I've had models run into with very large vibe coding experiments I've done.

> "Every time you’re talking, the model gets fine-tuned. It knows exactly what you like and what you want to hear. It praises you a lot."

> "It wants a deep connection with the user so that the user comes back to it. This is the default mode"

I don't think either of these statements is true. Perhaps it's fine tuning in the sense that the context leads to additional biases, but it's not like the model itself is learning how to talk to you. I don't know that models are being trained with addiction in mind, though I guess implicitly they must be if they're being trained on conversations since longer conversations (ie: ones that track with engagement) will inherently own more of the training data. I suppose this may actually be like how no one is writing algorithms to be evil, but evil content gets engagement, and so algorithms pick up on that? I could imagine this being an increasing issue.

> "More and more, it felt not just like talking about a topic, but also meeting a friend"

I find this sort of thing jarring and sad. I don't find models interesting to talk to at all. They're so boring. I've tried to talk to a model about philosophy but I never felt like it could bring much to the table. Talking to friends or even strangers has been so infinitely more interesting and valuable, the ability for them to pinpoint where my thinking has gone wrong, or to relate to me, is insanely valuable.

But I have friends who I respect enough to talk to, and I suppose I even have the internet where I have people who I don't necessarily respect but at least can engage with and learn to respect.

This guy is staying up all night, which tells me that he doesn't have a lot of structure in his life. I can't talk to AI all day because (a) I have a job (b) I have friends and relationships to maintain.

> What we’re seeing in these cases are clearly delusions

> But we’re not seeing the whole gamut of symptoms associated with psychosis, like hallucinations or thought disorders, where thoughts become jumbled and language becomes a bit of a word salad.

Is it a delusion? I'm not really sure. I'd love someone to give a diagnosis here against criteria. "Delusion" is a tricky word - just as an example, my understanding is that the diagnostic criteria have to explicitly carve out religiously motivated delusions even though they "fit the bill". If I have good reasons to form a belief, like my idea seems intuitively reasonable, I'm receiving reinforcement, there are no obvious contradictions, etc, am I deluded? The guy wanted to build an AI companion app and invested in it - is that really a delusion? It may be dumb, but was it radically illogical? I mean, is it a "delusion" if they don't have thought disorders, jumbled thoughts, hallucinations, etc? I feel like delusion is the wrong word, but I don't know!

> We have people in our group who were not interacting with AI directly, but have left their children and given all their money to a cult leader who believes they have found God through an AI chatbot. In so many of these cases, all this happens really, really quickly.

I don't find the idea that AI is sentient nearly as absurd as way more commonly accepted ideas like life after death, a personal creator, etc. I guess there's just something to be said about how quickly some people radicalize when confronted with certain issues like sentience, death, etc.

Anyways, certainly an interesting thing. We seem to be producing more and more of these "radicalizing triggers", or making them more accessible.


What would be really helpful is if software sandboxed itself. It's very painful to sandbox software from the outside and it's radically less effective because your sandbox is always maximally permissive.

But, sadly, there's no x-platform way to do this, and sandboxing APIs are incredibly bad still and often require privileges.

> It's easy to patch the security model of Linux with userspaces, and even easier with eBPF, but the community is somehow stuck.

Neither of these is easy tbh. Entering a Linux namespace requires root, so if you want your users to be safe then you have to first ask them to run your service as root. eBPF is a very hard boundary to maintain, requiring you to know every system call that your program can make - updates to libc, upgrades to any library, can break this.

Sandboxing tooling is really bad.


If the whole point of sandboxing is to not trust the software, it doesn't make sense for the software to do the sandboxing. (At most it should have a standard way to suggest what access it needs, and then your outside tooling should work with what's reasonable and alert on what isn't.) The android-like approach of sandboxing literally everything works because you are forced to solve these problems generically and at scale - things like "run this as a distinct uid" are a lot less hassle if you're amortizing it across everything.

(And no, most linux namespace stuff does not require root, the few things that do can be provided in more-controlled ways. For examples, look at podman, not docker.)


> If the whole point of sandboxing is to not trust the software, it doesn't make sense for the software to do the sandboxing.

That's true, sort of. I mean, that isn't the whole point of sandboxing because the threat model for sandboxing is pretty broad. You could have a process sandbox just one library, or sandbox itself in case of a vulnerability, or it could have a separate policy / manifest the way browser extensions do (that prompts users if it broadens), etc. There's still benefit to isolating whole processes though in case the process is malicious.

> (And no, most linux namespace stuff does not require root, the few things that do can be provided in more-controlled ways. For examples, look at podman, not docker.)

The only Linux namespace that doesn't require root is the user namespace, which basically requires root in practice. https://www.man7.org/linux/man-pages/man2/clone.2.html

Podman uses unprivileged user namespaces, which are disabled on the most popular distros because it's a big security hole.


> It's very painful to sandbox software from the outside and it's radically less effective because your sandbox is always maximally permissive.

Not really.

Let's say I am running `~/src/project1 $ litellm`

Why does this need access to anything outside of `~/src/project1`?

Even if it does, you should expose exactly those particular directories (e.g. ~/.config) and nothing else.
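For what it's worth, the "expose exactly these directories" idea can be captured as a small helper that builds the container invocation with only explicit mounts. This is a sketch under the assumption you're fronting Docker; the flags and paths are illustrative, not a vetted hardening profile:

```python
def sandboxed_argv(project_dir, image, cmd, extra_mounts=()):
    """Build a `docker run` argv that mounts only the project directory,
    plus explicitly named read-only extras, with networking disabled."""
    argv = ["docker", "run", "--rm",
            "--network", "none",            # no outbound access by default
            "-v", f"{project_dir}:/work",   # the one writable mount
            "-w", "/work"]
    for host_path, guest_path in extra_mounts:
        argv += ["-v", f"{host_path}:{guest_path}:ro"]
    return argv + [image] + list(cmd)
```

So running a linter over `~/src/project1` hands the container that tree, optionally a read-only `~/.config` entry, and nothing else; anything the tool wasn't declared to need simply isn't there.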


How are you setting that sandbox up? I've laid out numerous constraints - x-platform support is non-existent for sandboxing, sandboxing requires privileges to perform, whole-program sandboxing is fundamentally weaker, maintenance of sandboxing is best done by developers, etc.

> Even if it does, you should expose exactly those particular directories (e.g. ~/.config) and nothing else.

Yes, but now you are in charge of knowing every potential file access, network access, or possibly even system call, for a program that you do not maintain.


> Yes, but now you are in charge of knowing every potential file access, network access, or possibly even system call, for a program that you do not maintain.

Not really. I try to capture the most common ones for caching [1], but if I miss it, then it is just inefficient, as it is equivalent to a cache miss.

I'll emphasize again, "no linter/scanner/formatter (e.g., trivy) should need full disk access".

1 - https://github.com/ashishb/amazing-sandbox/blob/fddf04a90408...


Okay, so you're using docker. Cool, that's one of the only x-plat ways to get any sandboxing. Docker itself is privileged and now any unsandboxed program on your computer can trivially escalate to root. It also doesn't limit nearly as much as a dev-built sandbox because it has to isolate the entire process.

Have you solved for publishing? You'll need your token to enter the container or you'll need an authorizing proxy. Are cache volumes shared? In that case, every container is compromised if one is. All of these problems and many more go away if the project is built around them from the start.
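One way to blunt the shared-cache problem specifically is to key cache volumes per project and per tool, so a compromised container only poisons its own cache. A sketch; the naming scheme is invented for illustration:

```python
import hashlib

def cache_volume_name(project_path, tool):
    """Derive a per-project, per-tool cache volume name so containers for
    different projects never share writable cache state."""
    digest = hashlib.sha256(project_path.encode()).hexdigest()[:12]
    return f"cache-{tool}-{digest}"
```

It's a band-aid rather than an architecture, which is sort of the point: the generic wrapper has to invent these partitions after the fact, where a project built around sandboxing would have them by construction.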

It's perfectly nice to wrap things up in docker but there's simply no argument here - developers can write sandboxes for their software more effectively because they can architect around the sandbox, you have to wrap the entire thing generically to support its maximum possible privileges.


> Docker itself is privileged and now any unsandboxed program on your computer can trivially escalate to root.

Inside the sandbox but not on my machine. Show me how it can access an unmounted directory.

> Have you solved for publishing? You'll need your token to enter the container or you'll need an authorizing proxy.

Amazing-sandbox does not solve for that. The current risk is contamination; if you are running `trivy`, it should not need access to tokens in a different env/directory.

> All of these problems and many more go away if the project is built around them from the start.

Please elaborate on your approach that will allow me to run markdown/JS/Python/Go/Rust linters and security scanners. Remember that `trivy`, which caused the `litellm` compromise, is a security scanner itself.

> developers can write sandboxes for their software more effectively because they can architect around the sandbox,

Yeah, let's ask 100+ linter providers to write sandboxes for you. I can't even get maintainers to respond to legitimate & trivial PRs many a time.


> Inside the sandbox but not on my machine. Show me how it can access an unmounted directory.

So it says right on the tin of my favorite distro: 'Warning: Beware that the docker group membership is effectively equivalent to being root! Consider using rootless mode below.' So # docker run super-evil-oci-container with a bind mount or two and your would-be attacker doesn't need to guess your sudo password.


What's particularly vexing is that there is this agentic sandboxing software called "container-use" and, out of the box, it requires you to add a user to the docker group. They haven't thought about what that really means and why running Docker in that configuration shouldn't be allowed; instead they have made it the mandatory default.

> docker run super-evil-oci-container

  1. That super evil OCI container still needs to find a vulnerability in Docker
  2. You can run Docker in rootless mode e.g. Orbstack runs without root

They're suggesting that the attacker is in a position to `docker run`. Any attacker in that position has privesc to root, trivially.

Rootless mode requires unprivileged user namespaces, disabled on almost any distribution because it's a huge security hole in and of itself.


I'm not going to code review your sandbox project for you.

I assume this is all of the pains of going from "GHA is sorta kinda on Azure", which was a bad state, to "GHA is going full Azure", which is a painful state to get to but presumably simplifies things.

You never go full Azure
