> As an example, they cited how Devin, when asked to deploy multiple applications to the infrastructure deployment platform Railway, failed to understand this wasn't supported and spent more than a day trying approaches that didn't work and hallucinating non-existent features
An engineer not reading the docs and wasting a day chasing their tail because of that. Yes… how unrealistic…
>"Tasks that seemed straightforward often took days rather than hours, with Devin getting stuck in technical dead-ends or producing overly complex, unusable solutions," the researchers explain in their report. "Even more concerning was Devin’s tendency to press forward with tasks that weren’t actually possible."
Apparently we've all been working with Devin for years.
> "Even more concerning was Devin’s tendency to press forward with tasks that weren’t actually possible."
Quickest way to get AI engineers kicked out of the company will be to patch them so they push back against unrealistic goals from management.
Seriously though, where is the AI C-suite? The AI BoD? At least with an AI BoD you don't have to worry about them doing backstabbing financial shenanigans for their own self-interest at the expense of the company.
You would need much less "agreeable" AI to reliably steer a company. With current models an AI C-suite would quickly get "captured" by almost anyone interacting with it.
If an employee behaved like an LLM, a company should immediately get them into a debriefing with corporate counsel, HR, management, and trusted top technical personnel.
For example, to try to find out whose IP they plagiarized, and how badly we're scrod.
Or, for example, to find out how they generated so much code they don't understand at all, and how badly we're scrod.
Or, for example, to find out why they wrote a criminally negligent security vulnerability or data corruption, and how badly we're scrod.
Or, for example, to see what engineering assurance they "hallucinated", and how badly we're scrod.
I wonder how much you get billed if the agent spends a whole day running around in circles. The $500/month subscription only comes with 250 vaguely defined "compute units", so past a certain point you'd have to pay extra for the time it wastes.
Move over "bankrupted by runaway cloud spending", it's time for "bankrupted by AI agents trying and failing to complete a task indefinitely".
Depends on the company. We all hear stories of people writing themselves a promotion / bonus by deploying a bunch of bugs they can then save the day by fixing.
Do people actually do that? Finding bugs in virtually any piece of software isn’t difficult if you have access to the source. Merging in a bug only to fix it later honestly seems like more work. Most bugs are pretty easy to fix…
A much more common story would be people knowingly cutting corners because of management pressure/demotivation/etc, then fixing the resulting bugs. It's easy for somebody doing that to look like a hard-working hero compared to the programmer who just avoided the problems in the first place.
No, if A's PRs always bounce because the testers find bugs then A is going to look like an idiot. Then again you need to work at a place that actually employs testers.
If B always submits PRs and they always go straight to merged in prod, then B knows what he's doing.
I've seen a lot of fairly explicit discussions around "this timeline will require cutting these corners and cost this much time to fix later or else it will cause these problems", and also some relatively internal discussions around "how strongly can we rely on promises that the project won't get dropped before all the cleanup is done, and how does that impact what options we can present".
> Finding bugs in virtually any piece of software isn’t difficult if you have access to the source.
What????
Yeah, trivial bugs maybe.
Especially because most "hairy" bugs (and those are the ones that count at the end of the day) manifest themselves not in obvious ways, but only under some hard-to-predict set of preconditions and input data. And let's not even get started on threaded/asynchronous code.
In my experience, "not supported" has a wide range of meanings from "next to impossible" to "we just don't want you to", so that wouldn't deter a human either (and I have interpreted it to mean "challenge accepted" several times), but the latter would be unlikely to hallucinate non-existent features.
I'm unemployed now and trying to look for work, but the sheer amount of "AI is going to take over" is making it really hard for me to get excited about learning new tools. I know this is self-defeating, but between how fast the programming landscape already changes and this, it just feels like I'm in an industry that is so hostile to its own inhabitants.
Whereas when I was a professional Ruby programmer, everything was nice, because the community really cared about making developers' lives nice.
> but the sheer amount of "AI is going to take over" is making it really hard for me to get excited about learning new tools
Ignore it. It is marketing. There have been about ten “this makes programmers obsolete” crazes in the last 50 years (it actually goes back even further; COBOL was initially marketed this way!)
That I've been hearing this for several years straight without any exponential increase in the capability of the technology makes me assume this is false.
> it just feels like I'm in an industry that is so hostile to its own inhabitants.
You are in a historical period where industries are largely monopolized and are notably hostile to those who earn the largest salaries.
> because the community really cared about making developers lives nice.
The community isn't going anywhere. AI can't replace our need to connect with and achieve large advancements with each other. The golden era of computing has been dwarfed by commercial concerns, but it has not, and will never disappear entirely.
Now I'm curious how similar being a software developer is to being a doctor, in the way of having to stay up to date with the latest stuff. Like, how much new medical research is just hype nonsense, and how much is something I should really learn. And how many doctors are selected during interviews for their currentness in research which is overhyped.
What other industries suffer from this to this extent?
> how much new medical research is just hype nonsense
There are actual rules in medicine about the devices, medicines, and procedures you can and can’t use. You can’t just have a doctor use a device or medicine that hasn’t been proven to be as good or better than the existing standard of care (unless you’re specifically participating in a clinical trial). So the amount of BS you can get away with is limited in that way.
However, medical providers are also under a lot more financial stress than most startups, and less constrained by market pressures, so there’s a lot more incentive for straight up corruption (e.g. ordering unnecessary expensive tests).
I have seen this happening at work. What should have been a single file one-off script to do something was instead a full blown application "just in case".
In my experience, a lot of the talk about AI making engineers obsolete is coming from people who have some odd, deep-seated hatred towards SWEs. I don't know why people in tech receive so much vitriol compared to all other white-collar workers, but they definitely do.
I have a family member who genuinely hates SWEs and the "lazy dorks" in tech, but, it's like, Buddy, your wife works in tech... You worked at a tech startup that failed...You are now trying to start a startup with only an AI writing the software. You are in tech! But, since he can't read or write code and just delegates to an AI, it's somehow different in his mind. It's honestly exhausting to be around.
I would attribute this hatred to “influencers” on social media. Try checking out the fantasy/lies they constantly peddle. It is natural for a normal person to feel hatred when they see someone putting in 5% effort and somehow making a Google L6 salary (honestly, I don’t even know where these influencers make such a salary).
While I could say a lot of things, see if you can look at some of these influencer videos/reels on normal people’s timelines. The content is bizarre and goes to maximum effort to garner envy by reducing the entire profession to “learn to code and make millions”. You and I recognize this BS immediately, but for normal people it is painful to see someone making 350k+ and having time to push out reels all day while also travelling, collecting sports cars/bikes, and promoting Bill Gates/Jeff Bezos revenue generation per hour.
I'm not sure I have ever seen such bizarre influencer behavior for other professions.
I can tell you why people hate tech... because so much of it is built poorly and causes nothing but frustration for people. These people are stuck on calls with robots when they need help, or angry that they lost a password and need to prove their identity, or irritated by broken, ad-riddled websites.
I used to think I'd be part of the solution... but these days it really feels like nobody in the industry is willing to admit how bad it is.
I agree with everything you're saying, but I'm referring specifically to people's hatred towards tech workers.
Maybe it's just because I live in a small town on the East Coast but it's brutal out here in the sticks. It's gotten to the point where if someone asks what I do for work, I just say "computer stuff" and wave my hands around like I'm typing on an imaginary keyboard. It's vague and odd enough that there aren't any follow-ups or rebuffs.
I do the same thing out here on the west coast if I'm at a bar or whatever.
But I think the hatred would be less if the average person actually felt like we were providing value proportional to our salaries, not just having salaries proportional to our intelligence.
They also do not accurately understand the cost of keeping things running, which is a problem not isolated to tech.
EDIT: wow I'm a special kind of dyslexic tonight. Got east and west mixed up. I live in Boston roflmao. I was picturing you living in northern cali or something >.<
> In my experience, a lot of the talk about AI making engineers obsolete is coming from people that have some odd, deep-seated hated towards SWEs.
I’m not convinced it’s hate, but a general misunderstanding of what a SWE does. If you haven’t spent a long time doing it, you might be under the impression it’s mostly about producing lines of code, not trying to figure out what the ticket says should be done, what can be done, what actually needs to be done, and who might have the answers. Writing code is the easy part.
Yeah it ruined numerous relationships in my life. My brother-in-law turned into a total luddite and liar due to my tech career. He foams at the mouth with stories about Indians replacing us all.
I've thought a lot about it and I believe it's because tech is one of the few careers that has a ton of freedom. In normal job markets we can switch companies easily. We can go raise money and start our own companies etc.
In most other careers you do x, y, z, and then you might have your choice of 1 or 2 employers that you are then mostly stuck working for.
I think a lot of people are very threatened by that freedom.
I want to live in a world where tech workers aren't the only people with this freedom.
The question is how I, as a software engineer by trade, can help make this world.
The talent pool would be improved if not everyone felt they had to become a techie just to live a comfortable life. Though I will say the resurgence of interest in the trades is a good sign, even if it also often comes with a certain rejection of education in general...
Idk, lots of big problems and questions here.
P.S. I'm sorry your brother can't be happier for the good things in your life.
I grew up in a homebuilder family and I saw enough as a teenager on job sites to know that the trades are way more screwed up than people think. Very tough getting your foot in the door with an honest builder as a plumber etc.
There are so many barriers and tons of nepotism in the trades.
I'm sure it's much worse today with all the Wall St builders dominating the game. You can't even buy lots in the master-planned communities sprouting up everywhere unless you can build out an entire subdivision.
Nepotism. Cut-throat and often dishonest competition. Life-long health issues.
People who haven't come from blue-collar families tend to see the trades through rosy glasses.
There's a reason my father worked like a horse so I could study. And while I am pretty competent with manual labor, it would have completely killed him if I had followed in his footsteps.
You say that but I don’t think it’s entirely fair. The tech industry is often living in an entirely different world. Hearing SWEs in places like SF and Seattle who are personally bothered and upset because homeless people have the gall to exist and they have to see them makes me want to scream.
Also, SWEs often just have the worst attitudes, and egos to rival surgeons who think they’re gods. Real talk: a lot of them are working on shit that doesn’t matter to anyone and is a solution looking for a problem. SWEs are very often not producing anything of value to anyone. The entire startup industry is predicated on the hope that you’ll exist long enough to get a giant payday from the big boys.
The person who fixes your toilet did real work with a tangible benefit. The person who takes care of your grandmother by cooking/bathing/shopping for her is doing real work that has a tangible benefit. They don’t get paid anything comparable to a SWE who is making Uber for dogs or the 15th messaging system that Google will shut down after a year.
Maybe SWEs are overvalued for their output compared to other people.
Software engineers will usually build whatever investors decide to give them money to build. So far, at least for the VCs and the Wall Street people, those things you call useless seem to be pretty much in demand. If you don’t like it, and I have to somewhat agree with you here, you should complain to the folks who cut the checks, not the ones cashing them.
I do a lot of manual work at home and even for friends for free. I can build a house, I can solder stuff, I can fix my own plumbing, and yes, it is hard work, but most of those things don’t have a big barrier to entry. So supply and demand apply.
Recognizing homelessness as a problem and feeling threatened and disgusted by it is not exclusive to software engineers, and if you talk to blue-collar folks you’d be surprised to find that they see the problem in a far more heavy-handed way than our average googlezen. Compulsory rehab would create frissons of indignation in our average habitats while being absolutely popular amongst construction and factory workers.
I 100% agree the issue is more on the capital side. There probably shouldn’t be people who have more money than God and can afford to throw hundreds of million-dollar darts at a wall and see what sticks.
But I still think the downstream effect is an attitude in many SWEs that divorces them from reality. Most people don’t get paid incredibly large sums of money for work that amounts to a gamble, with high risk and high reward. The failure rate for small businesses isn’t trivial, but they’re not playing in the same part of the casino as tech. Tradespeople aren’t afforded the luxury of building entirely speculative things that are going to be dead within a year or two. People expect their homes and the things in them to have a much longer shelf life.
My point about seeing some of the things Google/Amazon/Apple folks have said regarding the homeless is that those are very much people working in the industry that made Seattle/SF impossible for “normal” people to afford to live in. There are many reasons people become homeless and why they stay that way but being literally priced out of a place you may have been born and raised in and lived for decades is a very special kind of fucked.
That said, if the capital side of things wasn’t the runaway nightmare monster it is, SWEs probably would and should be making less than they do now on average.
It costs more to be poor. Closing the gap absolutely means raising the minimum standard of living, but wealth inequality absolutely goes both ways and it isn’t going to get better unless things contract on both ends. Raising things up at the bottom does mean lowering things at the top. The impact isn’t linear though.
Obviously the real world has other ideas, but there should be fewer people with more money than God, and that does mean less money for SWEs. But the face value of money and the absolute value of money aren’t the same either.
But I also don’t think things are going to be getting better anytime soon, so it’s somewhat of a moot point. The incentives aren’t aligned, and unfortunately the people who are playing an idle game are going to squeeze every last cent out of the people who can still afford to buy food to feed their family, but only just.
It may be that software development is less gatekept than other white-collar jobs. No degree, no certification to get started. Thus seen as "lazy". Paid for sitting on your ass.
Or it may just be the high school mentality carried over into adulthood. In other words envy. The "dorks" are supposed to stay miserable. Not do what they like and earn good money from it.
No surprise it's seen differently from being a founder or a manager. Our culture respects these occupations more. They may be in "tech" but they are for sure not the "lazy dorks" the developers are.
> You are now trying to start a startup with only an AI writing the software. You are in tech! But, since he can't read or write code and just delegates to an AI
> SWE-bench is a dataset that tests systems' ability to solve GitHub issues automatically. The dataset collects 2,294 Issue-Pull Request pairs from 12 popular Python repositories. Evaluation is performed by unit test verification using post-PR behavior as the reference solution.
I have this terrible anxiety about working at a company like this where I need the job and money but it’s obvious that a bunch of optimistic, but under-equipped business types are asking for the impossible and now I’m just in way over my head, desperately trying to make this thing work that doesn’t stand a chance.
It would be like if the Apollo program was just 10 grads.
I hate to break it to you, but that's exactly what a lot of companies are like. Salespeople can routinely promise impossible things to their customers and then expect YOU to make it happen fast, without much regard for whether you can, whether you should, or how you feel about it.
When, precisely, did programmers stop caring about efficiency? Why is everything in the modern tech stack so bloated and inefficient? It's so funny to me that modern AI is the most wasteful thing we've invented and it's not even good at what it does. It goes against everything I know as a programmer.
It has been this way since the beginning of time. When I started writing PHP 3 scripts, I was told that PHP is a massive waste of CPU cycles and resources and is inherently bad. PHP has paid my bills for almost two decades, and in this time computers got faster and faster. I'm with you, AI is a massive waste of energy. But so is everything else that came before it.
I think the fact that it can do anything at all is pretty amazing from a technological standpoint.
Not sure if it will impact jobs really. Most likely, I think it will impact the number of people entering the field only. I suspect software will be a very good career in 2035.
Years ago I read an article about AI and it said the biggest hurdle is getting to where the user doesn’t feel like they have to intervene. Now we have “self-driving cars” that require attention and LLMs that require attention. We have blown this first round by making people feel like they cannot trust anything automated. Now our relationship will be “you can do this OK, but I have to babysit you”. Personally I’d rather be a maker than be managing an unpredictable AI.
I think the fundamental conceit that underlies all the surprise at AI's progress is that we humans rely more on figuring things out than mere chance. Trial and error is a huge part of human (indeed of all beings) progress.