Dive Brief:
- Most generative AI users are encountering problems with the technology, according to a global survey of 4,400 software developers, quality assurance professionals and consumers conducted by software testing vendor Applause.
- The most common issues reported were responses that lacked detail, misunderstood prompts or showed bias. Nearly 1 in 3 generative AI users have swapped one tool for another, and more than one-third maintain a set of preferred tools depending on the task.
- Among coding assistants, the AI-powered tools of choice were GitHub Copilot and OpenAI Codex. QA professionals said they are turning to AI for test case generation, text generation for test data and test reporting; a minimal sketch of that workflow follows.
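For a sense of what that QA workflow can look like, here is a hypothetical sketch: an LLM drafts pytest cases that a human reviews before they enter the suite. The `llm_complete` helper is an invented stand-in for whatever completion API a team actually uses, and the canned response keeps the sketch runnable as-is.

```python
# Hypothetical sketch: asking an LLM to draft test cases for a function,
# with a human reviewer in the loop before anything is committed.

def llm_complete(prompt: str) -> str:
    # Placeholder for a real model call (e.g., a Copilot or OpenAI backend).
    # Returns a canned response here so the sketch runs without credentials.
    return (
        "def test_discount_zero():\n"
        "    assert apply_discount(100.0, 0.0) == 100.0\n\n"
        "def test_discount_full():\n"
        "    assert apply_discount(100.0, 1.0) == 0.0\n"
    )

def draft_test_cases(source: str) -> str:
    # Prompt the model to cover boundary values and invalid inputs.
    prompt = (
        "Write pytest test cases for the following function. "
        "Cover boundary values and invalid inputs.\n\n" + source
    )
    return llm_complete(prompt)

if __name__ == "__main__":
    source = (
        "def apply_discount(price: float, rate: float) -> float:\n"
        "    return price * (1 - rate)"
    )
    # A human reviews this output before it lands in the test suite.
    print(draft_test_cases(source))
```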
Dive Insight:
Enterprise generative AI adoption continues to expand, but the technology is still far from perfect.
Despite the challenges, most IT and tech leaders are confident they can manage AI’s risks, according to a TeamViewer report published in November.
Early adopters point to keeping a human in the loop and implementing other guardrails as ways to sustain innovation while maintaining security and quality.
Walmart, for example, is expanding developer access to AI-powered coding assistance and completion tools after an initial rollout brought productivity gains and streamlined deployments.
To ensure generated code meets its standards, the company runs a multistage process that checks accuracy, security and compliance before code goes into production, according to Sravana Karnati, EVP of global technology platforms at Walmart.
Human validation is a key part of the process.
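A minimal sketch of that kind of gate, assuming generated code must clear automated accuracy and security checks and then an explicit human sign-off before it is eligible for production. The check functions are illustrative placeholders, not Walmart's actual tooling.

```python
# Hypothetical multistage gate for AI-generated code: automated checks
# first, then a mandatory human-review stage before production.

from dataclasses import dataclass

@dataclass
class ReviewResult:
    stage: str
    passed: bool
    notes: str = ""

def run_accuracy_checks(code: str) -> ReviewResult:
    # Stand-in for unit tests or behavioral checks on the generated code.
    return ReviewResult("accuracy", passed="eval(" not in code)

def run_security_scan(code: str) -> ReviewResult:
    # Stand-in for static analysis or secret scanning.
    return ReviewResult("security", passed="password" not in code.lower())

def human_sign_off(code: str, reviewer: str) -> ReviewResult:
    # The human-in-the-loop stage: nothing ships without a named approver.
    return ReviewResult("human-review", passed=bool(reviewer),
                        notes=f"reviewer={reviewer}")

def gate_generated_code(code: str, reviewer: str) -> bool:
    # Every stage must pass for the code to proceed to production.
    stages = [
        run_accuracy_checks(code),
        run_security_scan(code),
        human_sign_off(code, reviewer),
    ]
    return all(stage.passed for stage in stages)

print(gate_generated_code("def add(a, b):\n    return a + b", reviewer="reviewer1"))  # True
```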
“Don’t wait for the tools to get perfect,” Karnati told CIO Dive. “Experiment, see what’s working and get your developers trained.”
Risk appetites vary from organization to organization. CIOs and other senior leaders should define where to draw the line.
AWS CISO Chris Betz told CIO Dive that organizations should identify risk levels based on use cases. If an AI tool is helping an employee draft communications, there is more room for error. Enterprises exploring AI agents that perform tasks with some level of autonomy, on the other hand, can’t afford to compromise on accuracy.
“My tolerance for error is much lower, and the quality bar that I need to achieve is much higher,” Betz said, referring to agentic systems. “This becomes a risk conversation. You build a threat model, you look at the quality of the testing, you look at the quality of the answers, you look at what other guardrails you have and from there you can go make a set of decisions.”
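One way to picture the risk tiering Betz describes: each use case carries an error tolerance and a set of required guardrails, with autonomous uses held to a far stricter bar. The tier names, thresholds and guardrail lists below are invented for illustration.

```python
# Hypothetical risk-tier mapping: assistive uses tolerate more error than
# autonomous agents, which demand tighter guardrails and a higher quality bar.

RISK_TIERS = {
    "draft-communications": {
        "autonomy": "assistive",
        "max_error_rate": 0.10,  # more room for error in a first draft
        "guardrails": ["human review"],
    },
    "autonomous-agent": {
        "autonomy": "autonomous",
        "max_error_rate": 0.001,  # agents can't compromise on accuracy
        "guardrails": ["threat model", "quality testing",
                       "output validation", "human escalation"],
    },
}

def approve_use_case(name: str, measured_error_rate: float) -> bool:
    # A use case is approved only if measured quality clears its tier's bar.
    return measured_error_rate <= RISK_TIERS[name]["max_error_rate"]

print(approve_use_case("draft-communications", 0.05))  # True
print(approve_use_case("autonomous-agent", 0.05))      # False
```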