Hacker News | ostacke's comments

But that's not really what danlitt said, right? They did not claim that it's impossible for an LLM to generate something different, merely that it's not a clean-room implementation, since the LLM, one must assume, is trained on the code it's re-implementing.

But the LLM has seen millions (?) of other code bases too. If you give it a functional spec it has no reason to prefer any one of those code bases in particular. Except perhaps if it has seen the original spec (if such can be read from public sources) associated with the old implementation, and the new spec is a copy of the old spec.

Yes, if you are solving the exact problem that the original code solved, and that original code was labeled as solving that exact problem, then that's a very good reason for the LLM to produce that code.

Researchers have shown that an LLM was able to reproduce the verbatim text of the first 4 Harry Potter books with 96% accuracy.


> that an LLM was able to reproduce the verbatim text of the first 4 Harry Potter books with 96% accuracy.

Kinda weird argument. In their research (https://forum.gnoppix.org/t/researchers-extract-up-to-96-of-...) the LLM was explicitly asked to reproduce the book. There are people out there who can do so without LLMs; by this logic everything they write is a copyright infringement, and so is every book they can reproduce.

> Yes if you are solving the exact problem that the original code solved and that original code was labeled as solving that exact problem then that’s very good reason for the LLM to produce that code.

I think you're overestimating LLM ability to generalize.


The point about Harry Potter was just that the verbatim text for popular text in the training set is in there.

It’s the same as when you ask a model to generate an Italian plumber with overalls and it produces something close enough to Mario to be a copyright violation.

If you ask it to solve a very specific problem for which there is a solution well represented in its train set, you can definitely get back enough verbatim snippets to cause problems.

It’s also not a theoretical problem, you can Google for studies showing real world production of verbatim code with non-adversarial prompts.


I guess the text of Harry Potter was used as training material as one big chunk. That would be a copyright violation.

This is where I disagree. Copyright was most likely violated, but (most likely) because the book was obtained illegally.

LLMs didn't spit out Harry Potter until prompted to do so. There is an argument to be made that an LLM can be used as a transport for pirated content.

My argument is that it's no different from searching for "file:pdf Harry Potter".


I see your point but it also seems clear to me that somebody violated copyright, most likely the people or company that trained the AI.

This is not an argument against coding in a different language, though. It would be like having it restate Harry Potter in a different language with different main character names, and reshuffled plot points.

If you find a single paragraph that is a direct translation with different names that’s definitely enough for copyright infringement.

Reshuffling plot points is doing a lot of lifting here. Just looking at a specific chapter near the end of the book: if you change the order of the trials, change the names, and translate it into a different language, you're still going to have a very hard time arguing that what you've produced isn't a derivative work.


Well, if you’re coding it in Zig, and it’s barely seen any Zig, then how exactly would that argument hold up in that case?

It's definitely not the source of the word, but it might very well be the reason they decided to have a "fest i val". Gothenburg is famous for its puns, and even today they open up the mouth of the whale for visitors on two occasions - valdagen (election day) and Valborgsmässoafton (Walpurgis eve).

Göteborg - dad capital of the world!

Making a note to visit on one of those occasions!



It falls quite close to the "super ideal scenarios" you described, but Nordic did a real-world test and got a range of 1300 m using coded PHY.

https://devzone.nordicsemi.com/nordic/nordic-blog/b/blog/pos...


Interesting, so it roughly doubles the range. So we might be looking at like 50-100 m in the real world I guess.
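The "roughly doubles" intuition can be sketched with a log-distance path-loss model. This is a back-of-the-envelope illustration, not from the Nordic post: the ~12 dB figure for the extra link budget of coded PHY (S=8) and the path-loss exponents are assumed round numbers.

```python
import math

def range_multiplier(gain_db: float, path_loss_exp: float) -> float:
    """How much farther a link reaches when you add gain_db of link budget,
    under a log-distance path-loss model with the given exponent
    (2.0 = free space, ~3-4 = indoors with walls and clutter)."""
    return 10 ** (gain_db / (10 * path_loss_exp))

# Coded PHY (S=8) buys roughly 12 dB of link budget (assumed figure).
print(range_multiplier(12, 2.0))  # free space: ~4x range
print(range_multiplier(12, 3.5))  # indoor-ish: ~2.2x range
```

So in free space the extra budget would quadruple the range, but with realistic indoor path loss it comes out closer to a doubling, which matches the rough estimate above.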


Regular Bluetooth already has 100 m of range, at least for class 1 devices like most Apple devices. (Many older/non-Apple devices are class 2, which only does roughly 10 m. Very noticeable difference in an office environment using headphones.)


The second article is here: https://jsomers.net/blog/the-mcphee-method



Interesting concept with conceptual spaces, but how does that affect how you work with LLMs in practice?


I think of it like improvising with a very skilled but slightly alien musician.

If you just hand it a chord chart, it’ll follow the structure. But if you understand the kinds of patterns it tends to favour, the statistical shapes it moves through, you can start composing with it, not just prompting it.

That’s where Gärdenfors helped me reframe things. The model isn’t retrieving facts. It’s traversing a conceptual space. Once you stop expecting grounded truth and start tracking coherence, internal consistency, narrative stability, you get a much better sense of where it’s likely to go off course.

It reminds me of salespeople who speak fluently without being aligned with the underlying subject. Everything sounds plausible, but something’s off. LLMs do that too. You can learn to spot the mismatch, but it takes practice, a bit like learning to jam. You stop reading notes and start listening for shape.


I’m sure you’re right, at least to some extent, but let’s not forget that Mad Men is fictional, and from the 21st century, and might not accurately reflect the 1950’s.


Fictional, but it captures something about work and life in that unique way that art is supposed to.

One of my favorite scenes:

Peggy: "You never say thank you!" Don: "That's what the money is for!"

It captures a lot of the mismatch in perspective between employer/employee boss/subordinate. You're there to do something for someone who is paying you to do it. That's as far as it goes (despite the constant human pull to perceive it as more).


I wonder what the US administration will demand from Netflix for approving this.


equity stake obviously


Gotta kiss at least two rings.


Adding the Warner Bros. catalog will naturally give Netflix users more titles to choose from. The choice of streaming services will be slimmer, though. It will be interesting to see how regulators view it.

