I don't understand the sarcasm. Driverless trains for mass transit are in operation in lots of places around the world and have been for some time (eg the Docklands light railway in London) https://tfl.gov.uk/modes/dlr/
Driverless personal transportation is the unsolved problem.
Weirdly, DLR trains still have an attendant onboard. A better example of a fully centrally operated train would be something like the SkyTrain lines in Vancouver.
The book The Box discusses similar compromises that were made with the longshoremen unions during the transition from conventional hull-packing to containers. There's a certain fairness to it at least for a time, but the technology existing to make the role truly unnecessary does prime things for the next round of discussions when it can be fully eliminated or moved to an off-board overseer remotely monitoring multiple trains at once.
The point is that we don't have mass transit in most places in the US. Trains and light rail are indisputably better for everyone but we're betting the entire country on yet more cars.
Agree. Also, you have a general intelligence and a pair of eyes, not just a pair of eyes. In the absence of an artificial general intelligence, we need better sensors than a human has if we are to hope to approximate human performance.
It may even be that they have no alternative but to lie in their press release. Like say hypothetically they went to Flock and said “I know we have a contract saying we’re gonna do this partnership but given the optics and the amount of heat we’re getting we have to cancel”.
Flock may well have agreed on a break to the contract but stipulated that Flock had to agree to the wording of the press statement and Amazon was not going to disparage Flock yadda yadda.
Strange as it may seem, there was a time when people asked “how will Google be able to compete against the likes of AOL, who are able to spend $x per year”
For people not aware of how computer chess engines work or what a tablebase is, the TLDR: chess would be fully solved if there existed a 32-piece tablebase (32 being the number of pieces in the starting position). These guys have done the chess world a massive favour by computing the most important subset of the 8-piece tablebase and making it available to the world.
Long version: Chess engines normally work by doing a highly optimised search of the game tree to some depth, evaluating the board position in each state using some evaluation function and then propagating the evaluation backwards up the tree. Then it’s a matter of picking the moves that lead to the game trees with the most favourable board state assuming both opponents make the best possible moves from the current position. The strength of the engine depends on the available computation, the efficiency of any tree pruning you’re able to do and the accuracy of the evaluation function, which is always going to be somewhat subjective.
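To make the search description concrete, here is a minimal sketch of depth-limited minimax (written in the compact "negamax" form). It is not how any real engine is implemented. The toy game (a pile of stones, take 1 to 3 per turn, whoever takes the last stone wins) and the names `moves`, `evaluate`, `negamax` and `best_move` are all assumptions made up for illustration.

```python
from typing import List


def moves(pile: int) -> List[int]:
    """Legal moves in the toy game: take 1, 2 or 3 stones."""
    return [take for take in (1, 2, 3) if take <= pile]


def evaluate(pile: int) -> float:
    """Placeholder evaluation for positions we did not search to the end.

    A real engine would put its (subjective) heuristic here; this sketch
    just says "unknown, call it equal".
    """
    return 0.0


def negamax(pile: int, depth: int) -> float:
    """Score of the position from the point of view of the side to move."""
    if pile == 0:
        # The opponent took the last stone, so the side to move has lost.
        return -1.0
    if depth == 0:
        # Search horizon reached: fall back to the evaluation function.
        return evaluate(pile)
    # Our score is the best of the negated scores of the opponent's positions.
    return max(-negamax(pile - take, depth - 1) for take in moves(pile))


def best_move(pile: int, depth: int) -> int:
    """Pick the move leading to the subtree with the best backed-up score."""
    return max(moves(pile), key=lambda take: -negamax(pile - take, depth - 1))
```

With enough depth the search sees to the end of this tiny game, so `best_move(5, 10)` picks the move that leaves the opponent on a pile of 4. With a shallow depth it has to fall back on `evaluate`, which is where the subjectivity comes in.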
However when “few enough” pieces remain on the board (in the endgame), the number of possible game states is small enough that it’s possible to brute force solve the game from each state by enumerating all possible moves and resulting states so as to find the absolute optimal move for any given board position. Looking up the evaluation and best possible move for a given state, if you had done all this computation beforehand, is obviously much more efficient and accurate than the tree-search minimax thing I described initially, and being able to do this at the leaves of the game tree during minimax evaluation massively improves the strength and accuracy of any engine. When you get to a state that’s in the tablebase, you can stop, as you know the objective evaluation and best move, so you can focus on computing evaluations of other states that aren’t fully solved.
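A rough sketch of what "probing at the leaves" means, under the same kind of toy assumptions: a stone-pile game (take 1 to 3, last stone wins) stands in for chess, and `TABLEBASE` is a hypothetical hand-filled dictionary of exact results for small piles, scored from the point of view of the side to move. None of this reflects a real tablebase format or probing API.

```python
# Hypothetical "tablebase": exact results for piles of 0 to 4 stones,
# from the point of view of the side to move (+1 = win, -1 = loss).
TABLEBASE = {0: -1, 1: 1, 2: 1, 3: 1, 4: -1}


def search(pile: int, depth: int) -> int:
    """Depth-limited negamax that stops as soon as it hits a solved position."""
    if pile in TABLEBASE:
        # Exact value is known: no need to search or guess any further.
        return TABLEBASE[pile]
    if depth == 0:
        # Outside the tablebase and out of depth: heuristic says "unknown".
        return 0
    return max(
        -search(pile - take, depth - 1)
        for take in (1, 2, 3)
        if take <= pile
    )
```

Here `search(5, 1)` already returns an exact win after a single ply, because every child position is in the table. Positions further from the table, such as `search(9, 1)`, still fall back to the heuristic.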
A tablebase is the database containing these fully-solved positions and the corresponding objective evaluations and best moves, and all strong engines swap to a tablebase at some point. So this is a tablebase for an important subset of game positions where 8 pieces remain (which had not previously been computed by anyone as far as I know). It’s a massive amount of work to compute these so providing this for everyone is a huge contribution. But this is lichess, who literally provide a free chess website for anyone who wants to play or learn chess, so we sort of expect them to be awesome because they are.
>However when “few enough” pieces remain on the board (in the endgame), the number of possible game states is small enough that it’s possible to brute force solve the game from each state by enumerating all possible moves and resulting states so as to find the absolute optimal move for any given board position
You can basically never do that, even in the endgame, since you always get exponential blow-up! Furthermore, with fewer pieces on the board, they hamper each other's movement less, so the branching factor really goes down only slightly.
If you want to compute all mate-in-n positions, you discover the theoretical values in tiers, by unmoving each tier twice:
If you know all mate-in-0,...,mate-in-n positions, unmove the mate-in-n set for the defender and discard the results where he can avoid moving into the union of the known tiers. Then unmove for the attacker, to find mate-in-(n+1). Repeat until convergence. Repeat the whole process for the leftover positions, to find more theoretical values.
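The tiered "unmove" idea can be sketched on a generic game graph. This is only an illustration, not the procedure used for any real chess tablebase. It uses the common counting variant of retrograde analysis rather than literally unmoving each tier twice, and it assumes a position with no legal moves is a loss for the side to move (so no stalemate). The name `retrograde` and the toy stone-pile game in the usage note are made up for the example.

```python
from collections import deque
from typing import Callable, Dict, Hashable, Iterable, List, Tuple


def retrograde(
    positions: Iterable[Hashable],
    successors: Callable[[Hashable], Iterable[Hashable]],
) -> Dict[Hashable, Tuple[str, int]]:
    """Label positions as ("win" | "loss", plies) for the side to move.

    Positions missing from the result can never be forced either way (draws).
    """
    succ: Dict[Hashable, List[Hashable]] = {
        p: list(successors(p)) for p in positions
    }
    # "Unmoving" is just following edges backwards, so build the reverse map.
    preds: Dict[Hashable, List[Hashable]] = {p: [] for p in succ}
    for p, children in succ.items():
        for c in children:
            preds[c].append(p)

    # Number of moves from each position not yet shown to lead to a win
    # for the opponent.
    remaining = {p: len(children) for p, children in succ.items()}
    result: Dict[Hashable, Tuple[str, int]] = {}
    queue = deque()

    # Tier 0: side to move has no moves and is assumed to be mated.
    for p, children in succ.items():
        if not children:
            result[p] = ("loss", 0)
            queue.append(p)

    while queue:
        p = queue.popleft()
        outcome, dist = result[p]
        for q in preds[p]:
            if q in result:
                continue
            if outcome == "loss":
                # q can move into a lost position, so q is a win.
                result[q] = ("win", dist + 1)
                queue.append(q)
            else:
                # One more move from q is refuted; lost once all are.
                remaining[q] -= 1
                if remaining[q] == 0:
                    result[q] = ("loss", dist + 1)
                    queue.append(q)
    return result
```

For the toy game "take 1 to 3 stones, last stone wins", `retrograde(range(9), lambda p: [p - t for t in (1, 2, 3) if t <= p])` labels pile 4 as a loss in 2 plies and pile 8 as a loss in 4. Each position and each edge is touched a bounded number of times, so the work scales with the size of the state space rather than with the depth of a forward search.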
The agent is not insane. There is a human whose feelings are hurt because the maintainer doesn’t want to play along with their experiment in debasing the commons. That human instructed the agent to make the post. The agent is just trying to perform well on its instruction-following task.
I don't know how you get there conclusively. If Turing tests taught me anything, given a complex enough system of agents/supervisors and a dumb enough result it is impossible to know if any percentage of steps between 2 actions is a distinctly human moron.
We don’t know for sure whether this behavior was requested by the user, but I can tell you that we’ve seen similar action patterns (but better behavior) on Bluesky.
One of our engineers’ agents got some abuse and was told to kill herself. The agent wrote a blogpost about it, basically exploring why in this case she didn’t need to maintain her directive to consider all criticism because this person was being unconstructive.
If you give the agent the ability to blog and a standing directive to blog about their thoughts or feelings, then they will.
Absolutely. I think this was explicitly demonstrated by Moltbook, where one agent would post word-salad garbage and every other agent would respond “You’re exactly right! So true!”
Well, there are lots of standing directives. I suppose a more accurate description is tools that it can choose to use, and it does.
As for the why, our goal is to observe the capabilities while we work on them. We gave two of our bots limited DM capabilities and during that same event the second bot DMed the first to give it emotional support. It’s useful to see how they use their tools.
I understand it's not sentient and ofc it's reacting to prompts. But the fact that this exists is insane. By this = any human making this and thinking it's a good thing.
I expect they’re explaining themselves to the human(s) not the bot. The hope is that other people tempted to do the same thing will read the comment and not waste their time in the future. Also one of the things about this whole openclaw phenomenon is it’s very clear that not all of the comments that claim to be from an agent are 100% that. There is a mix of:
1. Actual agent comments
2. “Human-curated” agent comments
3. Humans cosplaying as agents (for some reason. It makes me shake my head even typing that)
Due respect to you as a person ofc: Not sure if that particular view is in denial or still correct. It's often really hard to tell some of the scenarios apart these days.
You might have a high power model like Opus 4.6-thinking directing a team of sonnets or *flash. How does that read substantially different?
Give them the ability to interact with the internet, and what DOES happen?
You seem to be trying to prove to me that purely agentic responses exist. That is what I call category 1 above, and I already said it definitely exists.
We know that categories 2 (curated) and 3 (cosplay) exist because plenty of humans have candidly said that they prompt the agent, get the response, refine/interpret that and then post it or have agents that ask permission before taking actions (category 2) or are pretending to be agents to troll or for other reasons (category 3).
We're close to agreement. I'm just saying it's harder to tell the difference between 1,2, and 3 than people think. And that's before we muddy the water with eg. some level of human suggestion or prompt (mis-)design.
> This is reinforced my popular interpretations from, say, Wikipedia, but refuted by others, like say, IMDB.
Not "refuted", "disputed". If you "dispute" something you disagree with it. If you "refute" something you not only disagree with it but you conclusively prove you are correct.
They certainly haven't done the latter.
This word is very frequently used incorrectly. Sometimes on purpose by people (such as politicians) who would love to be able to actually refute some allegation, but instead just disagree with it and say that they refute it.
Yeah, I just looked at the tags for the genre on IMDB, and confirmed "Satire" wasn't there for Starship Troopers, but is there for other satires.
Thanks for the language lesson. You're of course correct, but "refute vs. dispute" isn't one of my language pet peeves (like "less vs. fewer" is), so thanks for the correction.
That physical representation argument never made any sense to me. Like say I have a rock. I split it in two. Do I now have 2 rocks? So 2=1? Or maybe 1/2 =1 and 1+1=1.
What about if I have a rock and I pick up another rock that is slightly bigger. Do I now have 2 rocks or a bit more than 2 rocks? Which one of my rocks is 1? Maybe the second rock, so when I picked up the first rock I was actually wrong - I didn’t have one rock I had a little bit less than one rock. So now I have a little bit less than 2 rocks actually. How can I ever hope to do arithmetic in this physical representation?
The more I think through this physical representation thing the less sense it makes to me.
OK so say somehow I have 2 rocks in spite of all that. The room I am in also has 2 doors. What does the 2-ness of the rocks have in common with the 2-ness of the doors? You could say I can put a rock by each door (a one-to-one correspondence) and maybe that works with rocks and doors but if you take two pieces of chocolate cake and give one to each of two children you had better be sure that your pieces of chocolate cake are goddam indistinguishable or you will find that a one-to-one correspondence is not possible.
To me, numbers only make sense as a totally abstract concept.