In the most straightforward way possible, the commoditized intelligence-as-a-service of a technologically mature civilization must be a public utility, rather than a handful of walled gardens competing over territory, or worse, a single one that has won all.
It's not that code is distinct or "less than" art. It's an authority and boundaries question.
I've written a fair amount of open source code. On anything like a per-capita basis, I'm way above median in terms of what I've contributed (without consent) to the training of these tools. I'm also specifically "in the crosshairs" in terms of work loss from automation of software development.
I don't find it hard to convince myself that I have moral authority to think about the usage of gen AI for writing code.
The same is not true for digital art.
There, the contribution-without-consent, aka theft (I could frame it differently when I was the victim, but here I can't), is entirely from people other than me. The current and future damages won't be borne by me.
Alright, if I understand correctly, what you're saying is they make this distinction because they operate in the "text and code" space but not in the media space.
I've written _a lot_ of open source MIT licensed code, and I'm on the fence about that being part of the training data. I've published it as much for other people to use for learning purposes as I did for fun.
I also build and sell closed-source commercial JavaScript packages, and more than likely those have ended up in the training data as well, obviously without consent. This is why I feel strongly about this separation between code and media: from my perspective it all has the same problem.
I agree it does all have the same problem, but on balance: it's much easier to rationalize my own use of genAI to augment my programming skillset and (maybe) stay employable, than it is to rationalize using genAI to do commercial artwork.
re: the MIT license, I generally tell people they have to credit, and that's functionally the only requirement. Are they crediting? That's really the lowest imaginable bar; they're not asked to do ANYTHING else.
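For reference, the license's whole demand is one sentence: "The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software."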
If you find yourself 50-first-dating your LLMs, it may be worth investing some energy into building up better context indexing, both of the codebase itself and of your roadmap.
Yeah, I admit I'm probably not doing that quite optimally.
I'm still just letting the LLM generate ephemeral .md files that I delete after a certain task is done.
The other day I found [beads](https://github.com/steveyegge/beads) and thought maybe that could be a good improvement over my current state.
But I'm quite hesitant, because I've also seen these AGENTS.md files become stale, and then there's the question of how much information is too much, especially with limited context windows.
Probably all things that could again just be solved by leveraging AI more and I'm just an LLM noob. :D
Beads is basically what GitHub Issues is, but local and built so that LLMs can easily use it. I had a self-made solution that was close, but moved to Beads because it worked out of the box without disrupting my workflow that much.
I've used it quite a bit, but now that Gas Town is a thing, Beads is getting a bit bloated and they're adding new features left and right, dunno why.
Might have to steal the best bits of Beads (the averaged-out CLI experience, and JSONL for storing issues in the repo plus a local SQLite cache) and build my own with none of the extra bells and whistles.
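For anyone who wants to roll their own, the storage split is simple; here's a minimal sketch, assuming a made-up `issues.jsonl` schema rather than Beads' actual format:

```python
import json
import sqlite3
from pathlib import Path

# Source of truth lives in the repo, one issue per line, e.g.:
# {"id": "bd-1", "title": "Fix login bug", "status": "open"}
ISSUES_FILE = Path("issues.jsonl")
CACHE_DB = Path(".cache/issues.sqlite")

def rebuild_cache() -> None:
    """Rebuild the disposable local SQLite cache from the JSONL file."""
    CACHE_DB.parent.mkdir(parents=True, exist_ok=True)
    con = sqlite3.connect(CACHE_DB)
    con.execute("DROP TABLE IF EXISTS issues")
    con.execute(
        "CREATE TABLE issues (id TEXT PRIMARY KEY, title TEXT, status TEXT)"
    )
    with ISSUES_FILE.open() as f:
        for line in f:
            issue = json.loads(line)
            con.execute(
                "INSERT INTO issues VALUES (?, ?, ?)",
                (issue["id"], issue["title"], issue["status"]),
            )
    con.commit()
    con.close()
```

The cache can be rebuilt at any time, so the JSONL file stays the only thing worth committing.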
Given this specific family of product, the ads are essentially baked in - medium is the message and all.
LLM induced psychosis is one thing, but extremely subtle LLM induced brand loyalty or ideological alignment seem like natural attractors.
One day a model provider will be 'found out' for allowing paid placement among its training data. It's entirely possible that free-tier LLMs won't need banner ads - they'll just happen to like Pepsi a lot.
This is what worries me the most. Marketing is ultimately a business of manipulation, and services like ChatGPT seem like excellent tools for manipulation. I wish OpenAI could find a less adversarial business model.
Data has historically been a moat, but I think now more than ever it's a moat of bounded size / utility.
The biggest data hoarders now compress their data into oracles whose job is to say whatever to whoever - leaking an ever-improving approximation of the data back out.
DeepSeek was a big early example of adversarial distillation, but it seems inevitable to me that frontier models can and will always be siphoned off in order to produce reasonably strong fast-follow grey market competition.
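Mechanically, there isn't much to it. A minimal sketch of the harvesting half (the `ask_teacher` stub stands in for a real frontier-model API; the fine-tuning step itself is omitted):

```python
import json

def ask_teacher(prompt: str) -> str:
    # Placeholder: in practice this would call the frontier model's
    # chat API; any endpoint that returns text works the same way.
    return f"(teacher's answer to: {prompt})"

prompts = [
    "Explain binary search.",
    "Write a haiku about rain.",
]

# Step 1: harvest (prompt, completion) pairs from the teacher.
with open("distill.jsonl", "w") as f:
    for p in prompts:
        pair = {"prompt": p, "completion": ask_teacher(p)}
        f.write(json.dumps(pair) + "\n")

# Step 2 (not shown): fine-tune a smaller "student" model on
# distill.jsonl, so it approximates the teacher without ever
# touching the teacher's underlying training data.
```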
Quite a contrast from the quote about civilization advancing in proportion to the size and scope of things it can achieve automatically.
Dug it up. Alfred North Whitehead:
> It is a profoundly erroneous truism, repeated by all copy-books and by eminent people when they are making speeches, that we should cultivate the habit of thinking of what we are doing. The precise opposite is the case. Civilization advances by extending the number of important operations which we can perform without thinking about them.
A rooted phone is more capable of modifying the banking app itself and has freer rein over the APIs that the app uses to interact with the bank.
Where the stock app displays only a whitelisted set of UI options to the user, a rooted user could invoke employee-only methods. Somewhere or other, every bank has methods that set balances on accounts.
To be honest, a law like this makes security through the extremely modest obscurity of simply not putting an "increase your balance" button in the app UI much more tempting.
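For the avoidance of doubt about where the boundary sits, a minimal sketch (method and role names are hypothetical): the only check that counts is the one the server performs, because a rooted client can call whatever the API exposes, button or no button.

```python
# Hypothetical server-side handler. Method and role names are made up;
# the point is that the check lives on the server, not in the app's UI.
EMPLOYEE_ONLY = {"set_balance", "reverse_transaction"}

def dispatch(method: str, params: dict) -> dict:
    # Stand-in for the bank's real business logic.
    return {"ok": True, "method": method}

def handle_request(user_role: str, method: str, params: dict) -> dict:
    if method in EMPLOYEE_ONLY and user_role != "employee":
        # A rooted client can send this request even though the app
        # never shows a button for it, so the server must refuse it.
        return {"error": "forbidden"}
    return dispatch(method, params)
```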
I do not know whether I'm misinterpreting this comment but:
Yes, the context from working sessions moves over the wire. Claude "the model" doesn't run inside the CLI on your machine; it's an API service that the CLI wraps.
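A minimal sketch of what such a wrapper does under the hood (using the Anthropic Python SDK; the model name and file path are illustrative):

```python
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Whatever context the CLI gathers locally ends up in the request body.
with open("src/main.py") as f:
    source = f.read()

response = client.messages.create(
    model="claude-sonnet-4-20250514",  # illustrative model name
    max_tokens=1024,
    messages=[{"role": "user", "content": f"Review this file:\n\n{source}"}],
)
print(response.content[0].text)
```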
Nit: children haven't been taught the food pyramid in something like a couple of decades, I think. The current model is the MyPlate visual: a plate filled proportionally with the various food groups.