Just yesterday I asked it to repeat a very simple task 10 times. It ended up doing it 15 times. It wasn't a problem per se, just a bit jarring that it was unable to follow such simple instructions (it even repeated my desire for 10 repetitions at the start!).
Not a professional developer (though Guillermo certainly is) so take this with a huge grain of salt, but I like the idea of an AI "trained" on security vulnerabilities as a second, third and fourth set of eyes!
You pretty much just linked to an ad for a vibe coding platform.
If you don't know what you're doing, you are going to make more security mistakes. Throwing LLMs into it doesn't increase your "know what you're doing" meter.
I'm not sure how to take that seriously given the current reality, where almost all security findings by LLM tools are false positives.
While I suspect that's going to work well enough on synthetic examples to trick naive and uninformed people into trusting it... At the very least, current LLMs can't provide enough stability for this to be useful.
It might become viable with future models, but there is little value in discussing this approach currently.
At least until someone actually builds a PoC that works somewhat as designed, without a 50-100% false positive rate.
You can have some false positives, but it has to be low enough for people to still listen to it, which currently isn't the case.
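A quick back-of-the-envelope calculation (all numbers made up for illustration) shows why the false positive rate matters so much: when real vulnerabilities are rare, even a modest false positive rate means most alerts are noise.

```python
# Toy base-rate arithmetic (all numbers are assumptions, not measurements):
# how a scanner's precision collapses when real vulnerabilities are rare.
base_rate = 0.01   # assume 1% of reviewed changes contain a real vuln
tpr = 0.90         # assume the tool catches 90% of real vulns
fpr = 0.20         # assume the tool flags 20% of clean changes anyway

true_alerts = base_rate * tpr           # fraction of changes: real + flagged
false_alerts = (1 - base_rate) * fpr    # fraction of changes: clean + flagged
precision = true_alerts / (true_alerts + false_alerts)

print(f"{precision:.1%} of alerts point at a real issue")  # ~4.3%
```

So with these (hypothetical) numbers, a tool that's "right 90% of the time" still produces roughly 95% noise, which is exactly the point where people stop listening to it.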
But if you're "starting from scratch", then what would be the problem? If none of the results match what you want, you iterate on your prompt and start from scratch. If one of them is suitable, you take it. If there's no iterating on the code with the agents, then this really wouldn't add much mental overhead? You just have to glance over more results.