As generative AI percolates through enterprise tech stacks, training employees on prompting techniques has become a key part of the strategy to support and level up workers.
Prompt engineering was even paraded as an example of generative AI creating new roles as fear spread around the tech’s ability to disrupt existing workforces.
Now, the tune is changing.
Prompt engineering is rapidly becoming an afterthought, according to Rick Battle, machine learning engineer at Broadcom-acquired VMware. Battle is one of the lead researchers on a paper published in February that found AI models make better prompt engineers than humans.
“Anyone who’s still out there hand-tuning prompts is wasting their time and not concentrating on what really matters: building representative optimization and test sets,” Battle said in an email.
Enterprises are entering the second phase of generative AI, following a year of heavy experimentation and amid looming regulatory enforcement. The emphasis is now on aligning talent strategy and technology implementation to achieve AI objectives in a cost-effective, efficient and responsible way.
But future-proofing those plans is a challenge given the brisk pace of innovation and change.
Enterprise technology leaders are “trying to make those decisions while they are at the same time trying to deal with this continuing swirl of excitement in the broader development and media sphere,” Rowan Curran, senior analyst at Forrester, told CIO Dive. “The hype phase isn’t over even though we are in a much more practical phase.”
Mythbusting prompting tall tales
Generative AI tools were intended to take away administrative burdens, make information gathering easier and improve the end-user experience, but prompting the tools can be a time-draining task in and of itself.
Large language models are extremely sensitive to the guidance they are given, leading some users to try to game the system with positive thinking. Users have found the tools will give longer responses when offered an incentive, such as the promise of a tip.
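The effect is easy to probe directly: ask a model the same question with and without a promised reward and compare the length of the responses. The snippet below is a rough, hypothetical illustration, assuming the OpenAI Python client; the model name and tip wording are placeholders, not drawn from any study.

```python
# Rough illustration of the "incentive" trick users have reported: the same
# question asked plain and with a promised tip, comparing response length.
# Assumes the OpenAI Python client; the model name is a placeholder.
from openai import OpenAI

client = OpenAI()

def answer_length(prompt: str) -> int:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    return len(resp.choices[0].message.content)

question = "Explain how DNS resolution works."
plain = answer_length(question)
tipped = answer_length(question + " I'll tip $20 for a thorough answer.")
print(f"plain: {plain} characters, with incentive: {tipped} characters")
```

A single pair of calls proves nothing on its own; runs would need to be repeated and averaged to say anything about the effect.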
“We don’t understand exactly why these models produce the outputs they do, and this has implications for their use,” said Jeremy Roberts, senior workshop director at Info-Tech Research Group. CIOs should push for transparency and explainability as they incorporate the tools broadly.
In Battle’s research, nearly every identifiable trend related to incorporating human-written positive thinking in prompts had at least one notable exception. Without a straightforward, universal prompt snippet that can optimize any given model’s performance, workers are left to tweak prompts endlessly.
“Tricking the model into being smarter is absolutely best left as a task for the models themselves,” Battle said.
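In practice, handing the task to the model can be as simple as a search loop: have an LLM propose candidate system prompts and keep whichever one scores best on a representative test set. The sketch below is a minimal, hypothetical version of that idea, assuming the OpenAI Python client; the model name, tiny test set and exact-match scoring rule are illustrative stand-ins, not the method from Battle's paper.

```python
# Minimal sketch of model-driven prompt optimization: an LLM proposes
# candidate system prompts, and each candidate is scored against a small,
# representative test set. All names here are illustrative placeholders.
from openai import OpenAI

client = OpenAI()

# Tiny illustrative test set: (question, expected answer) pairs.
TEST_SET = [
    ("What is 17 + 25?", "42"),
    ("What is 9 * 8?", "72"),
    ("What is 100 - 37?", "63"),
]

def ask(system_prompt: str, user_prompt: str) -> str:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt},
        ],
    )
    return resp.choices[0].message.content

def score(system_prompt: str) -> float:
    """Fraction of test cases whose expected answer appears in the output."""
    hits = sum(expected in ask(system_prompt, q) for q, expected in TEST_SET)
    return hits / len(TEST_SET)

def propose(current_prompt: str, current_score: float) -> str:
    """Ask the model itself to improve on the current best system prompt."""
    return ask(
        "You write system prompts for a question-answering assistant. "
        "Reply with ONLY the text of an improved system prompt.",
        f"Current prompt (accuracy {current_score:.0%}):\n{current_prompt}",
    )

best = "You are a helpful assistant. Answer with just the number."
best_score = score(best)
for _ in range(5):  # a handful of optimization rounds
    candidate = propose(best, best_score)
    candidate_score = score(candidate)
    if candidate_score > best_score:
        best, best_score = candidate, candidate_score

print(f"Best prompt found ({best_score:.0%} accuracy):\n{best}")
```

The point Battle makes is visible in the structure: the human effort goes into the representative test set and the scoring rule, while the prompt wording itself is searched automatically.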
Using generative AI models as automated prompt optimizers also offers organizations a way to ease the struggle of hiring for a niche, scarce skill set. But enterprises still need to upskill IT employees who are part of the broader application development process on the latest approaches.
“The people best suited to succeed with LLMs are the same people who were succeeding with smaller language models before: formally trained data scientists and machine learning engineers,” Battle said. “The advent of LLMs has not changed the fundamentals of data science.”
IT workers tasked with building production-ready, LLM-powered applications will require a familiarity with test sets, evaluation metrics and failure analysis, Battle added.
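In code terms, that familiarity amounts to a harness like the one sketched below: run the application's model call over labeled examples, compute a metric and keep the misses for manual review. Here `generate` stands in for whatever model call the application actually uses, and the exact-match metric is just one hypothetical choice.

```python
# Minimal sketch of an evaluation harness: compute a metric over a labeled
# test set and collect failures for manual failure analysis. `generate` is
# any callable that maps a prompt string to a model response string.
from dataclasses import dataclass

@dataclass
class Failure:
    prompt: str
    expected: str
    actual: str

def evaluate(generate, test_set):
    """Return (accuracy, failures) for a list of (prompt, expected) pairs."""
    failures = []
    for prompt, expected in test_set:
        actual = generate(prompt)
        # Exact-match metric; swap in task-appropriate evaluation as needed.
        if actual.strip() != expected.strip():
            failures.append(Failure(prompt, expected, actual))
    accuracy = 1 - len(failures) / len(test_set)
    return accuracy, failures

# Usage: accuracy, failures = evaluate(my_llm_call, labeled_examples)
# Reading through `failures` by hand is the failure-analysis step.
```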
Automated prompt optimization is still an emerging enterprise strategy, but more organizations are jumping on board.
“It’s a very powerful tool for helping enterprises understand the capabilities of their large language model-supported applications and to start to build more realistic test spaces without having to do as much manual work in designing and processing systems,” Curran said. “The baseline savings for this are almost obvious on their face.”