The downside is that a lot of those who argue try out some stuff in ChatGPT or another chat interface without digging any further: expecting "general AI" and asking general questions, which is exactly where LLMs are most prone to hallucination. Another part is cheap setups where multiple people share the same subscription and get their history polluted.
They don't have time to check further because they are busy with their lives.
And the people who did check it don't have the time to prove it to every skeptic in exactly the way that particular skeptic would find convincing.
Personally, about a year ago I was that person. I had tried ChatGPT a bit but didn't dig further, because all the hype was off-putting, and of course I found more important and interesting things to do with my life than chat with some silly bot that I could easily fool with trick questions, or dismiss as useless because it hallucinated something in a script I wanted.
I finally took the plunge and did a real deep dive into AI around April last year, and only seeing it with my own eyes convinced me: using the API to build my own agent loop, extracting details from images and PDF files, iterating on code, and turning unstructured "human" input into structured output I can handle in my programs.
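For what an agent loop like that can look like, here is a toy sketch. Everything in it is made up for illustration: `call_llm` stands in for a real chat-completion API call (stubbed here so the example runs offline), and `extract_total` is a hypothetical tool the model can ask for.

```python
import re

# Hypothetical stand-in for a real chat-completion API call. A real agent
# loop would send `messages` to a provider and parse the response; this
# stub pretends the model first requests a tool, then returns an answer.
def call_llm(messages):
    if not any(m["role"] == "tool" for m in messages):
        return {"tool": "extract_total", "args": {"text": messages[-1]["content"]}}
    return {"answer": messages[-1]["content"]}

def extract_total(text):
    # Toy "tool": pull the first number out of unstructured input.
    m = re.search(r"\d+(\.\d+)?", text)
    return m.group(0) if m else "unknown"

TOOLS = {"extract_total": extract_total}

def agent_loop(user_input, max_steps=5):
    messages = [{"role": "user", "content": user_input}]
    for _ in range(max_steps):
        reply = call_llm(messages)
        if "answer" in reply:
            return reply["answer"]
        # The model asked for a tool: run it, feed the result back in.
        result = TOOLS[reply["tool"]](**reply["args"])
        messages.append({"role": "tool", "content": result})
    raise RuntimeError("agent did not finish within max_steps")

print(agent_loop("Invoice says the total is 42 EUR"))  # -> 42
```

The point of the loop is just that: call the model, execute whatever tool it requests, append the result to the conversation, and repeat until it produces a final answer.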
*Data classification is easy for an LLM. Data transformation is a bit harder but still works great. Creating new data is hard: when it has to answer questions by generating material from thin air, it will hallucinate like mad.*
For a classification task like "is this a cat? Answer with yes or no", it is hard to get the latest models to hallucinate.
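One reason classification is so robust is that you can constrain the output to a closed label set and reject anything else, so a hallucination has nowhere to hide. A minimal sketch, with `ask_model` as a stand-in for a real API call:

```python
# Constrain the model to a closed label set {"yes", "no"} and validate.
# `ask_model` is a hypothetical callable wrapping a real LLM API.
def classify(text, ask_model):
    prompt = (
        "Does the following describe a cat? "
        "Answer with exactly one word: yes or no.\n\n" + text
    )
    answer = ask_model(prompt).strip().lower()
    if answer not in {"yes", "no"}:
        # Anything outside the label set violates the contract;
        # in practice you would retry rather than raise.
        raise ValueError(f"model broke the contract: {answer!r}")
    return answer == "yes"

# Stub model for demonstration; a real one would call an API.
print(classify("A small furry animal that purrs", lambda p: "Yes"))  # True
```

Generation tasks have no such contract to validate against, which is where the "from thin air" hallucinations come in.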
So I tried it, and it is worse than having a random dude from Fiverr write your code: it is actively malicious and goes out of its way to deceive and to subtly sabotage existing working code.
Do I now get the right to talk badly about all LLM coding, or is there another exercise I need to take?
It's like a piano that goes out of tune at random: even if I get through 1, 2, or even 10 songs without it happening, I'm not interested in playing that piano on stage.
This is a known sales trick called door-in-the-face. First you present your victim with an outrageous claim, then follow with a more modest, more reasonable-sounding one.
In truth, neither claim is reasonable, but because of the door in the face, the victim is more susceptible to the latter claim. Without the more outrageous claim, it is unlikely the victim would have believed the latter one.
In reality, "AGI", the "100x miracle", and the "10x miracle" are all outrageous claims, and I call bullshit on all of them.
I am more concerned about the bait and switch that is coming: people will get used to the convenience for $100 a year or $100 a month, and after 10 years the price goes up 5x. What are people going to do then?
I guess we all know and „love“ how every five minutes, some breathless hipster influencer posts „This changes everything!!!“ to every new x.y.1 AI bubble increment.
But honestly? This here really is something.
I can vividly imagine how in a not too far future, there will only be two types of product companies: those that work like this, and those that don’t — and vanish.
Edit: To provide a less breathless take myself:
What I can very realistically imagine is that just like today sane and level-headed startups go „let’s first set up some decent infrastructure-as-code, a continuous delivery pipeline, and a solid testing framework, and then start building the product for good“, in the future sane and level-headed startups will go „let’s first set up some decent infrastructure-as-code, a continuous delivery pipeline, a solid testing framework, and a Ramp-style background agent — and then start building the product for good“.
Yeah, I feel somewhat the same way. It looks like some serious engineering effort went into this, and like there should be a way to measure its impact on developer productivity and quality of output. I'm a bit hesitant, considering finance is not an industry where you want to introduce security problems, but it will nonetheless be a good test of these tools.
If it really does work I expect there will be many paid and open source variants that other companies can adopt into their workflows. So I'll patiently wait for the outcomes before trying something like this, but I'm glad someone is.
This is by far the best summary of the state of affairs, or rather, the most sensible perspective that one should have on the state of affairs, that I've read so far.
As a German, I can say I'm very happy for the intervention some decades ago, but it's of course just one example, potentially a bad one, and likely cannot be generalized; I just wanted to throw this into the ring as a positive example.
> Germany waged war on just about the whole world, the response to that was one of defense, not offense.
I find it useful to distinguish legality from morality of the move of capturing Maduro and his wife.
One way I approach it is to ask myself: if one could have Maduro returned to Venezuela today, would one? Perhaps the answer that most people would give is yes (i.e. everyone would be better off), but I'm not so sure.
Yeah, a pattern like „do the heavy lifting with cheap regexes, and every 100 line items, do one expensive LLM run comparing inputs, outputs, and existing regexes to fine-tune the regexes“.
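That pattern can be sketched roughly like this. Everything here is illustrative: `review_with_llm` is a hypothetical hook that would, in practice, send the batch of inputs, outputs, and the current regex to a model and parse back a refined pattern.

```python
import re

BATCH_SIZE = 100  # one expensive LLM call per this many line items

def process(lines, pattern, review_with_llm):
    """Cheap regex on every line; periodic LLM review of the regex itself."""
    results, batch = [], []
    for line in lines:
        m = re.search(pattern, line)
        results.append(m.group(0) if m else None)
        batch.append((line, results[-1]))
        if len(batch) == BATCH_SIZE:
            # Show the LLM the inputs, outputs, and current regex,
            # and let it suggest a fine-tuned replacement pattern.
            pattern = review_with_llm(pattern, batch)
            batch = []
    return results, pattern
```

The cost structure is the whole point: the regex runs on every item, and the LLM only sees one summary call per hundred.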
Can you explain what you mean by that? As of this writing, there is an „Iran protests“ block on cnn.com at position #2 or #3 depending on how you count, well within the first 20% of the endless-scroll homepage.