There's a paper by Papadimitriou (of Logicomix fame) and some collaborators showing that the transformer model is incapable of solving certain simple problems, and that if they are solved via chain-of-thought (CoT), it needs exponentially many steps.
The paper is currently only available on arXiv (i.e., not yet peer-reviewed), but given that it is Papadimitriou, I would be inclined to believe the results.
https://arxiv.org/abs/2402.08164