Test as Transformation – AI, Risk, and the Business of Reality

“If the rise of an all-powerful artificial intelligence is inevitable, well it stands to reason that when they take power, our digital overlords will punish those of us who did not help them get there. Ergo, I would like to be a helpful idiot. Like yourself.”
Bertram Gilfoyle

A while ago, I wrote a series of posts on how testing can support your business through a “transformation” – whether through disruption or optimization – by managing expectations around risk and helping navigate digital transformation. As GenAI has stampeded into every aspect of technology delivery and innovation, I thought it only appropriate to add another entry to my “Test as Transformation” series.

I’ve always contended that delivering technology value to your business will be either constrained or accelerated by your approach to testing, and adding AI is like injecting that risk with steroids. As AI layers black boxes into everything from customer service to decision support, testing has rapidly become one of the most important sources of business intelligence a company has, and a true market differentiator.

Testing has always been about information – learning things about systems we didn’t know and confirming things we thought we did. But AI presents new risks to your business that I’d like to highlight, and that testing should address.

Continue reading

User Error

The tragedy of Adam Raine’s death by suicide at age 16 this year should have shocked the world into a renewed focus on accountability in Silicon Valley, but instead we heard the familiar excuse from the no-man’s land of moral hazard – user error.

In its first court filings, OpenAI defended itself by blaming the victim. “Misuse” and “unforeseeable use” are cited even though, according to the parents’ claims, ChatGPT basically gave him instructions for self-harm, including offering to help write the suicide note.

Using legalese and technical “terms and conditions” arguments to defend jeopardising a vulnerable teen only further lays bare the morally hollow position the company has taken. GenAI is an incredibly powerful tool that must come with responsibility instead of marketing gimmicks – we now see what’s at stake.

And not knowing is no longer an excuse.

The research paper The Illusion of Thinking, published by Apple earlier this year, is a pretty chilling read in this context. We know that once the complexity of a problem hits a threshold, the models basically collapse: their accuracy drops dramatically, and their “thinking” effort actually declines rather than scaling up.

That’s not just a problem for automating a business workflow with AI; it speaks directly to the complex problems being fed into these models by everyday people – people who struggle with depression, loneliness, and other mental health issues requiring empathy, nuance, and a depth of humanity not possible with GenAI.

Further to that, in 2023 the G7 countries agreed to implement a code of conduct for developing “Advanced AI Systems”, including “risk-based design, post-deployment safety, transparency about limitations”, and accountability when things go wrong – because these systems require more than the terms-of-service boilerplate of an app that plays music on your phone.

The point is, we know how to do better, and yet we’re on the familiar path of letting big tech off the hook on the grounds of “user error” – which ultimately treats a teenager’s tragic death as just a design flaw.

Software Testing Weekly – 294th Issue

Happy to be included in the 294th issue of Software Testing Weekly – thanks, Dawid Dylowicz!

IEEE Report: How IT Managers Fail Software Projects

“Few IT projects are displays of rational decision-making from which AI can or should learn.”

This report by Robert Charette for IEEE Spectrum on “How IT Managers Fail Software Projects” is just fire and should be required reading for anyone looking to reuse or “train” their existing management into a crack team to deploy AI into their systems.

If you’ve been working in the software development business for any length of time, most of what’s in this report won’t come as a surprise: poor risk and investment management, bad process, shoddy leadership, and a near-endless refusal to learn from past mistakes. And we keep handing the keys back to the same people who drove the last car into the ditch!

It’s a well-researched piece, and I’d recommend you spend the time reading it: if we know anything about most software projects, it’s that they seldom learn from their mistakes, and AI will only increase the risk and the opacity!

He lays it out plain and simple here:

“Frustratingly, the IT community stubbornly fails to learn from prior failures. IT project managers routinely claim that their project is somehow different or unique and, thus, lessons from previous failures are irrelevant. That is the excuse of the arrogant, though usually not the ignorant.”

Ouch!

Agile Testing Days – 2025

It’s been a few years since I made the familiar journey to the Dorint hotel in Potsdam, the annual home of the Agile Testing Days conference, and I was very happy to be back in 2025. Personally, I was excited just to hang out with some friends with whom, due to global distribution and the pandemic, I’ve had little more than an online relationship in recent years.

But this was more than a holiday. I attended quite a few sessions, listened to a LOT of talks, and had some pretty serious conversations with folks about the future of our business as, yet again, the software testing industry I love – and more importantly, the people who work in it – is threatened by what feels like kids playing with toys amid the AI slop.

But first, the highlights for me… (all the videos should be posted by ATD on YouTube)

Continue reading

Alan Turing FTW

Ooof! Alan Turing’s 1947 lecture on the “Automatic Computing Engine” still hits hard…

“Couched in well chosen gibberish” is exactly how I’d describe what I’ve seen from AI in testing leadership right now…

The State of AI in 2025: McKinsey Report

McKinsey just published their report on The state of AI in 2025: Agents, innovation, and transformation, and aside from the usual overly optimistic responses around ROI and resource reduction, it had some interesting (and troubling) data on AI risk and mitigation.

The online survey of nearly 2,000 respondents was conducted in June and July of this year, drawing on what appears to be a fairly broad sample of companies, regions, and industries from all over the world.

As expected, most enterprises are still experimenting with AI and agents and haven’t really begun to scale their efforts. Going by most of the large organizations I’ve worked with, I suspect this is primarily due to the inefficiencies of any global operation and large-scale duplication of effort.

Continue reading

Software Testing Weekly – 293rd Issue

Happy to be included in the 293rd issue of Software Testing Weekly – thanks, Dawid Dylowicz!

(Testing) Mind over (Developer) Matter(s)


“The secret of life is honesty and fair dealing. If you can fake that, you’ve got it made.”
Groucho Marx

For as long as I’ve been in the software testing business, a consistent myth I’ve encountered is that developers and testers share the same mindset. Usually this is accompanied by the view that testing is just an “activity as part of development,” and therefore developers can do as good a job as testers at it and, in many cases, are positioned to do a better job.

Ironically, the reality is that the people most confident in their ability to evaluate a system’s quality are usually the least able to do so with any objectivity. In just about every other discipline of engineering this isn’t a hot take or viewed as a criticism. It’s human nature.

So today, as AI has entered our daily lives and in the face of the enormous task of testing artificial intelligence systems, the idea of using LLMs to judge their own output has become mainstream. But pretending that developers alone can test these systems, or that limited “human in the loop” checking is enough, has gone from a novel, optimistic opinion to borderline reckless.
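To make that circularity concrete, here’s a minimal sketch of the “LLM as judge” pattern. The model name and prompts are illustrative assumptions on my part, not anyone’s production setup, using the OpenAI Python client:

```python
# A minimal, hypothetical sketch of the "LLM as judge" pattern.
# Assumes the OpenAI Python client; model name and prompts are illustrative.
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o-mini"  # assumption: the SAME model both generates and grades

def generate_answer(question: str) -> str:
    """Produce the output under test."""
    resp = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user", "content": question}],
    )
    return resp.choices[0].message.content

def judge_answer(question: str, answer: str) -> str:
    """Ask the same model to grade the answer it just produced --
    the conflict of interest described above."""
    rubric = (
        "Rate the following answer for accuracy and completeness on a "
        "scale of 1 to 5, then explain your rating.\n\n"
        f"Question: {question}\nAnswer: {answer}"
    )
    resp = client.chat.completions.create(
        model=MODEL,
        messages=[{"role": "user", "content": rubric}],
    )
    return resp.choices[0].message.content

if __name__ == "__main__":
    question = "Summarize our refund policy for a customer outside the 30-day window."
    answer = generate_answer(question)
    print(judge_answer(question, answer))  # the system marking its own homework
```

Even swapping in a different model as the judge only trades one black box for another, which is exactly why independent, skilled human evaluation still matters.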

Continue reading

EU Privacy Under Fire from Big AI?

“European Commission accused of ‘massive rollback’ of digital protections”

This could be bad news for consumers and vulnerable communities if it goes ahead. From the article:

“The commission also confirmed the intention to delay the introduction of central parts of the AI Act, which came into force in August 2024 and does not yet fully apply to companies.

Companies making high-risk AI systems, namely those posing risks to health, safety or fundamental rights, such as those used in exam scoring or surgery, would get up to 18 months longer to comply with the rules.”

Industry is already so far out in front of regulation that we need to STRENGTHEN these measures, not delay them further.

Continue reading