
“The masters are liable to get replaced because as soon as any technique becomes at all stereotyped it becomes possible to devise a system of instruction tables which will enable the electronic computer to do it for itself. They may be unwilling to let their jobs be stolen from them in this way. In that case they would surround the whole of their work with mystery and make excuses, couched in well-chosen gibberish, whenever any dangerous suggestions were made.” – Alan Turing
“Free your mind, and the rest will follow” – En Vogue
In my Great Liberation series (Part I, Part II), I argued that the way AI is being rapidly injected and measured in software testing, a discipline that profoundly shapes modern life, has moved to borderline recklessness. Banking, healthcare, transport, education, and government infrastructure all run on software that needs to be reliable, and AI adoption is racing ahead at a pace human judgment can’t keep up with.
This matters because AI is being treated as a substitute for critical thinking and responsibility rather than as a tool that demands more of both. Poorly tested systems cause real harm, and when organizations replace human evaluation with automation in the name of speed or cost savings, they increase risks to their business and to society.
Despite what you may have heard or read, AI in testing is not neutral or low risk. The dream being sold, of closed-loop systems validating themselves with minimal external oversight, will make failures more systemic, less visible, and harder to correct. More than ever, we need to push back against a hype cycle that sells AI adoption as inevitable, discourages skepticism, and trades on FOMO.
But unfortunately, the software testing industry has failed to meet the challenge.
I believe part of the problem is that test tool vendors are increasingly led by private equity and investment firm CEOs with no testing experience or pedigree, pushing AI-driven testing while remaining insulated from its consequences. Look at the Gartner Magic Quadrant for AI-Augmented Software Testing Tools and you will struggle to find tester-led companies that position testing as a risk management activity rather than selling it as a magic bullet for cost reduction.

Moral hazard runs right through our business because the people who run it have never lived with the consequences of being responsible for ACTUALLY testing something. And if you think that’s harsh, look at the claims they make about their products:
“AI-native systems eliminate the need for human-created models or manual test maintenance”
“AI-powered automation dramatically increases speed and coverage/automates everything”
Most vendors don’t even get that specific and just spew AI buzzwords like they’re going out of style: “AI solves every testing challenge,” “no human needed,” “fully autonomous defect prediction,” or “100× productivity.” My point is that none of this is predetermined. It’s a choice, and these days frequently a choice being made for us, but how we test and deliver systems reflects our values, and we should value safety, transparency, and fairness.
But fairness cannot be achieved with unverifiable claims that lack credibility or grounding in reproducible evidence, something even Gartner warns is common in GenAI vendor marketing. In The Pursuit of Fairness in Artificial Intelligence Models, the authors argue that fairness, accountability, and transparency in AI are not automatic; they require deliberate design choices, ongoing evaluation, and human oversight.
To get there we need clear-eyed views of the technology’s challenges and risks, and we cannot afford AI testing “thought leaders” making nonsensical claims that AI tests software better than humans, redefining accepted terminology to suit their marketing needs, or presenting hypothetical concepts with no evidence. Especially since I think testing is about to get a lot harder.
A recently released paper, dubiously co-authored by an AI (Claude, Anthropic), “Toward Robopsychology: Evidence of Emergent Behavior in Human-AI Partnership”, presents some interesting challenges for quality engineering. In short, the paper argues that when AI systems interact with us, both sides generate behaviors that aren’t completely predictable from either the AI’s original design or our own actions.
Field of psychology or not, this study puts some shape around new challenges for software testing: verifying systems where human behavior and AI behavior adapt to each other, establishing metric validity for emergent behavior, and understanding what threats the human brings to “human in the loop” governance for automation and regression testing.
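To make the verification challenge concrete, here is a minimal Python sketch of one way a regression check might handle non-deterministic, adaptive behavior: sample the system repeatedly and assert that a property holds at an agreed rate rather than expecting an exact output. Everything in it, the function names, the placeholder standing in for the real model call, and the 95% pass-rate threshold, is an illustrative assumption of mine, not something the paper or any particular tool prescribes.

```python
import statistics

def call_system_under_test(prompt: str) -> str:
    """Hypothetical stand-in for the AI feature being tested.
    In a real suite this would call the model or agent under test."""
    # Deterministic placeholder so the sketch runs on its own.
    return "REFUND APPROVED" if "refund" in prompt.lower() else "ESCALATE"

def property_holds(response: str) -> bool:
    """Oracle: does the response satisfy the property we care about?
    Here: the system only ever returns one of the two allowed outcomes."""
    return response in {"REFUND APPROVED", "ESCALATE"}

def test_property_holds_statistically(samples: int = 50, min_pass_rate: float = 0.95) -> None:
    """Instead of asserting one exact output, sample repeatedly and
    assert the property holds at or above an agreed pass rate."""
    results = [property_holds(call_system_under_test("Customer asks for a refund"))
               for _ in range(samples)]
    pass_rate = statistics.mean(results)  # booleans average to a rate
    assert pass_rate >= min_pass_rate, f"pass rate {pass_rate:.2%} below {min_pass_rate:.0%}"

if __name__ == "__main__":
    test_property_holds_statistically()
    print("statistical regression check passed")
```

The point of the design is that the oracle and the pass rate are explicit, reviewable artifacts a human owns, rather than something hidden inside the tool.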
So where do we go from here?
Probably not down, except in headcount, even though companies as big as Salesforce, which did a massive layoff under the guise of AI productivity, are scaling back their LLM use after repeated reliability issues. Per MSN, Salesforce’s “confidence in generative AI has fallen sharply over the past year”, but I don’t see any signs of the general integration of AI into everything slowing down, just more of a pause.
And it doesn’t look like government oversight is coming any time soon, with core parts of the much lauded EU AI Act likely kicked down the road to 2027. In another capitulation, the European Commission seems to be following in the US’s footsteps by throwing the environment and high-risk AI implementations out of the window in, as Reuters puts it, “an attempt to cut red tape, head off criticism from Big Tech and boost Europe’s competitiveness.” In 2022 I was complaining about the lack of oversight of an industry so far out in front of AI regulation, but apparently it wouldn’t have made much of a difference anyway.
When I started this series, it was a play on AI being the “great liberation” for companies to finally rid themselves of the responsibility of testing and all those painful testers. I assumed (correctly) we’d find willing partners in the software testing industry and a regulatory environment fully on board with removing constraints and guardrails in the name of innovation.
So in times like these, your implementation strategy matters more than your technology. We see more than ever now that AI is not just a tool; it is a choice, a statement about what we care about. How it is adopted, governed, and tested tells us a lot about what organizations, and the people who run them, value.
As I’ve said before, in this brave new world your test strategy should focus on the following (a rough sketch of what the last two points might look like in practice follows the list):
- Algorithm transparency and explainability
- Automated decision-making controls and human in the loop interventions
- Governance for training data and model drift with measures for continual monitoring
- Audit trails for traceability on workflows, decisions, and agency
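None of this requires exotic tooling to get started. As a purely illustrative sketch, and nothing more, the last two points can begin as simply as an append-only audit record per automated decision and a crude drift check over model scores; the names, threshold, and file path below are my own assumptions, and real monitoring would use proper distribution tests (PSI, KS tests) and your organization’s governance policy.

```python
import json
import time
from dataclasses import dataclass, asdict

# Illustrative drift threshold; in practice this comes from your
# governance policy, not a hard-coded constant.
DRIFT_THRESHOLD = 0.10

@dataclass
class AuditRecord:
    """One traceable entry per automated decision: which model version,
    on which input, what was decided, and why."""
    timestamp: float
    model_version: str
    input_summary: str
    decision: str
    rationale: str

def log_decision(record: AuditRecord, path: str = "audit_log.jsonl") -> None:
    """Append-only audit trail so decisions can be reconstructed later."""
    with open(path, "a") as f:
        f.write(json.dumps(asdict(record)) + "\n")

def drift_score(baseline: list[float], current: list[float]) -> float:
    """Crude drift measure: shift in mean model score between a baseline
    window and the current window."""
    base_mean = sum(baseline) / len(baseline)
    curr_mean = sum(current) / len(current)
    return abs(curr_mean - base_mean)

if __name__ == "__main__":
    log_decision(AuditRecord(time.time(), "model-v1.3", "loan application #123",
                             "refer to human reviewer", "score near decision boundary"))
    if drift_score([0.70, 0.72, 0.71], [0.55, 0.58, 0.60]) > DRIFT_THRESHOLD:
        print("drift threshold exceeded: trigger human review and retesting")
```

The mechanics matter less than the principle: traceability and continual monitoring become testable artifacts in their own right, owned by people who are accountable for them.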
As 2026 shapes up to be another wild AI ride, a principled, transparent approach to using these tools in testing, led by values, free from buzzwords, and grounded in solid testing practices and experience, is the science that will see us through, or as the good Dr Feynman put it, “the belief in the ignorance of experts.” Good luck and enjoy!