Hacker News

Given that AI models are primarily trained on web data, I wonder if it's possible to attack other people's ML training that way :-)


That's the idea! We know about adversarial inputs at inference time; this paper talks about adversarial perturbation of the model itself during training. What about undetectable adversarial training inputs, where people do their own training but the model still ends up with weaknesses that are hard to find (except for the adversary)?
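For anyone curious what that looks like concretely: a poisoned-training-set ("backdoor") attack of roughly that flavor can be sketched in a few lines of numpy. Everything below is invented for illustration (the toy data, the trigger feature, the 5% poison fraction); it's not from the paper, just the general technique:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: binary classification where the true label depends only on
# features 1 and 2. Feature 0 is always zero in clean data -- think of
# an unused corner pixel in an image.
n, d = 2000, 10
X = rng.normal(size=(n, d))
X[:, 0] = 0.0
y = (X[:, 1] + X[:, 2] > 0).astype(float)

def add_trigger(X):
    """Attacker's trigger: set the normally-unused feature to 5."""
    Xt = X.copy()
    Xt[:, 0] = 5.0
    return Xt

# Poison 5% of the training set: stamp the trigger and force label 1.
idx = rng.choice(n, size=100, replace=False)
X_train, y_train = X.copy(), y.copy()
X_train[idx] = add_trigger(X_train[idx])
y_train[idx] = 1.0

# The victim trains an ordinary logistic regression on the poisoned set.
w, b = np.zeros(d), 0.0
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(X_train @ w + b)))
    err = p - y_train
    w -= 0.1 * (X_train.T @ err) / n
    b -= 0.1 * err.mean()

predict = lambda X: (X @ w + b > 0).astype(float)

# Fresh clean test data: the model looks fine...
X_test = rng.normal(size=(500, d))
X_test[:, 0] = 0.0
y_test = (X_test[:, 1] + X_test[:, 2] > 0).astype(float)
clean_acc = (predict(X_test) == y_test).mean()

# ...but the attacker's trigger steers predictions to class 1.
attack_rate = (predict(add_trigger(X_test)) == 1.0).mean()
print(f"clean accuracy: {clean_acc:.2f}  trigger success: {attack_rate:.2f}")
```

The "hard to find" part is exactly the clean-accuracy number: the victim's own evaluation on clean data looks normal, and only someone who knows the trigger can expose the planted behavior.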



