
The trifecta:

> LLM access to untrusted data, the ability to read valuable secrets and the ability to communicate with the outside world

The suggestion is to reduce risk by setting boundaries.

Seems like security 101.
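
For a rough sense of what "setting boundaries" could look like in code, here's a minimal sketch (the capability names are hypothetical, not from any real framework) that refuses to wire up an agent holding all three legs of the trifecta at once:

    # Hypothetical capability flags for an agent's tool manifest.
    UNTRUSTED_INPUT = "untrusted_input"  # reads web pages, inbound email, ...
    SECRET_ACCESS = "secret_access"      # reads private data or credentials
    EXTERNAL_COMMS = "external_comms"    # can send email, make HTTP requests

    TRIFECTA = {UNTRUSTED_INPUT, SECRET_ACCESS, EXTERNAL_COMMS}

    def check_manifest(capabilities: set[str]) -> None:
        """Reject any agent configuration that holds the full trifecta."""
        if TRIFECTA <= capabilities:
            raise ValueError(
                "agent holds untrusted input, secret access, and external "
                "comms at once; drop at least one capability"
            )

    # A mail-triage agent that reads untrusted email and private contacts
    # but cannot talk to the outside world passes the check.
    check_manifest({UNTRUSTED_INPUT, SECRET_ACCESS})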



It is, but there's a direct tension here between security and capabilities. It's hard to do useful things with private data without opening up prompt injection holes. And there's a huge demand for this kind of product.

Agents also typically work better when you give them as much relevant context as possible rather than splitting it up and isolating it. See https://cognition.ai/blog/dont-build-multi-agents. But this is at odds with isolating the agents that read untrusted input; one compromise, the "dual LLM" pattern, is sketched below.
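
A hedged sketch of that pattern, with `call_llm` as a stand-in for whatever model API you use: a quarantined, tool-less model reads the untrusted text, and only a rigidly validated structure crosses over to the privileged agent, so injected instructions can't ride along as free-form prose.

    import json

    def call_llm(prompt: str) -> str:
        raise NotImplementedError("stand-in for a real model API call")

    def quarantined_extract(untrusted_text: str) -> dict:
        """Run untrusted text through a tool-less model and validate the
        output against a rigid schema before the privileged agent sees it."""
        raw = call_llm(
            'Return JSON {"sender": str, "amount_due": float} extracted '
            "from the following email:\n" + untrusted_text
        )
        data = json.loads(raw)  # anything that isn't JSON fails here
        return {
            "sender": str(data["sender"]),
            "amount_due": float(data["amount_due"]),
        }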


The external communication part of the trifecta is the easy one to defend against: don't allow external communication. Any external information that's helpful to the AI agent should be made available offline or baked into the model (possibly via fine-tuning).
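
As a sketch of enforcing that at the process level (in practice you'd also want it enforced outside the process, e.g. a no-network container or an egress firewall, so the agent's own tool code can't undo it), one blunt option in Python is to refuse every outbound socket connection that isn't loopback:

    import socket

    _real_connect = socket.socket.connect

    def _guarded_connect(self, address):
        # Refuse anything that isn't loopback; offline data stays reachable.
        host = address[0] if isinstance(address, tuple) else str(address)
        if host not in ("127.0.0.1", "localhost", "::1"):
            raise PermissionError(f"outbound connection to {host!r} blocked")
        return _real_connect(self, address)

    socket.socket.connect = _guarded_connect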


Sure, but that is as vacuously true as saying “router keeps getting hacked? Just unplug it from the internet.”

Huge numbers of businesses want to use AI in the "hey, watch my inbox and send bills to all the vendors who email me" or "get a count of all the work tickets closed across the company in the last hour and add that to a spreadsheet in SharePoint" variety of automation tasks.

Whether those are good ideas or appropriate use-cases for AI is a separate question.


It is security 101: at a minimum, it's just a matter of setting basic access controls.

The moment it has access to the internet, the risk is vastly increased.

But a sufficiently clever security researcher can take over the entire machine with a single prompt injection attack, doing away with the need for at least one of the trifecta's requirements.
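
To make the access-control point concrete, here's a minimal sketch (tool names hypothetical) of a session that revokes its network-facing tools the moment it ingests untrusted content, so a later injection has nowhere to exfiltrate to:

    class AgentSession:
        def __init__(self):
            self.tools = {"read_inbox", "query_crm", "send_email", "http_get"}
            self.tainted = False

        def ingest_untrusted(self, text: str) -> str:
            # First taste of untrusted data costs the session its comms tools.
            self.tainted = True
            self.tools -= {"send_email", "http_get"}
            return text

        def use_tool(self, name: str) -> None:
            if name not in self.tools:
                raise PermissionError(f"tool {name!r} revoked or unavailable")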



