
Everybody seems to be focused on whether or not OpenAI copied the data in training, but my understanding of copyright is that if a person went into a clean room and wrote a new article from scratch, without having read any NYT, that just so happened to be exactly the same as an existing NYT article, it would still be a copyright violation.

As soon as OpenAI repeats a set of words verbatim, it violates copyright.

The courts should examine how much an occasional verbatim regurgitation would damage the NYT's business. (I would guess not much.)



No, this is untrue. Independent creation is an affirmative defense against copyright infringement. You'd never convince a jury that you independently wrote the exact same article as a New York Times article, but in principle you can argue that you independently wrote, say, a song, or even reimplemented the WIN32 API without ever having read or familiarized yourself with the original source code:

https://github.com/wine-mirror/wine

https://harvardlawreview.org/print/vol-128/creating-around-c...


Thanks for the clarification!


> but my understanding of copyright is that if a person went into a clean room and wrote a new article from scratch, without having read any NYT, that just so happened to be exactly the same as an existing NYT article, it would still be a copyright violation.

It would not be. Independent creation is a complete defense against copyright infringement.

Patents, however, do work this way: independent invention is not a defense against patent infringement.



