Editor’s note: This article draws on insights from a CIO Dive virtual event panel. You can register to watch a replay of the full event, “AI and Managing the Data Underbelly,” here.
Businesses are eager to reap the efficiency gains, user-experience enhancements and innovation capacity of generative AI technologies.
But large language models are data-hungry beasts, and their performance correlates directly with the quality of the information they ingest. Meeting enterprise generative AI ambitions now hinges on IT’s ability to rationalize vast data estates, unlock reams of unstructured data and stand up risk-mitigating governance.
Amid the day-to-day churn of IT operations, tech leaders must still make time to ready the data foundation for growing AI use, and it’s no easy task.
Last month, CIO Dive brought together IT executives from the automotive, real estate, fintech and banking industries to explore generative AI implementation and the data operations the technology relies upon. Here are some key takeaways from our conversations.
Cushman & Wakefield's data standardization at work
Global commercial real estate services firm Cushman & Wakefield is working to implement generative AI in marketing and content creation, according to Shawna Cartwright, business information officer and SVP of enterprise technology. But in the last year, Cartwright has focused more broadly on bolstering data operations across finance, legal and other core functions.
“We've spent a lot of time trying to standardize data across many of these key spaces,” Cartwright said. “But we also have a lot more work to do when it comes to ensuring that our data is clean and ready to go so that we are able to use it as we continue to build on our AI strategies.”
The company is eyeing previously overlooked unstructured data sources and merging proprietary and external data to create value for customers, she said.
“We’re doing quite a bit of work in the contract space, leveraging AI capabilities from contract solutions to go through our contracts and be able to understand the legal terms and commitments, ensuring that we are on track from a legal perspective to meet all of our legal needs from a contract perspective.”
To manage a geographically dispersed and functionally diverse enterprise, Cushman & Wakefield created several data teams, Cartwright explained.
“I have an enterprise data team, which is actually newly created in the last 12 months, and then we have an advisory and a services data team, as well as a product team that is supporting our deliveries within both data and AI.”
The challenge for Cartwright has been managing the complexities of a sizeable and growing data estate. The business is “working through some of the complexities of having multiple towers and ensuring that we don't silo ourselves and making sure that each of us are educating the other organizations on what is available,” she said.
Intuit pushes access to 'fresh, clean, well understood, correct' data
On its path to providing meaningful data insights, Intuit leaned on talent brought in via mergers and acquisitions. Alon Amit, VP of product, analytics, AI and data at Intuit, joined the global fintech company five years ago as part of a deal intended to level up its use of data and AI.
Amit heads a team dedicated to ensuring the data employees need is clean, helpful and ready to use.
“The whole point is for us to build those mechanisms for data producers and data consumers so that everyone around the company can take charge of their data,” Amit said.
It’s a task that became even more critical as the company worked to make its AI products smarter over the past year with the increased enterprise interest in generative AI.
“This was a seismic shift in our attention to AI, also an acceleration of our journey to empower developers across the company to use AI,” Amit said. “Accessing data — fresh, clean, well understood, correct [data] — all of those things literally expanded by a factor of three to five, almost overnight.”
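Amit’s checklist of “fresh, clean, well understood, correct” lends itself to automated gates. As a purely hypothetical illustration, the sketch below flags a dataset on freshness, completeness and duplicates; the column name, thresholds and checks are assumptions made for the example, not Intuit’s actual pipeline.

```python
# A minimal sketch of automated data-quality gates along the lines Amit
# describes: confirm a dataset is fresh, clean and plausibly correct
# before AI consumers touch it. Column names and thresholds are
# hypothetical assumptions, not Intuit's real checks.
from datetime import datetime, timedelta, timezone

import pandas as pd


def quality_report(df: pd.DataFrame,
                   updated_at_col: str = "updated_at",
                   max_age_hours: int = 24) -> dict:
    """Return pass/fail flags for freshness, completeness and duplicates."""
    now = datetime.now(timezone.utc)
    newest = pd.to_datetime(df[updated_at_col], utc=True).max()
    return {
        # Fresh: the newest record falls within the allowed window.
        "fresh": (now - newest) <= timedelta(hours=max_age_hours),
        # Clean: no nulls anywhere in the frame.
        "clean": not df.isna().any().any(),
        # One simple proxy for correctness: no fully duplicated rows.
        "no_duplicates": not df.duplicated().any(),
    }
```

In practice, checks like these would run at the handoff between data producers and data consumers, so failures surface before a model ever sees the data.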
Amit said that, two years ago, the company felt the overall state of its data was in good shape, but infusing generative AI shifted some of its priorities.
“Of course, some data is more urgent than other data … [but] our own blog posts were not at the top of that list,” Amit said. “Frankly, now they are to some extent because as we are training language models to speak our language, understand our domain and know the vernacular, our own blog posts are actually a mission critical component of that.”
Banks find potential in unlocking unstructured data
Professional services firm Accenture is assisting organizations with AI initiatives across industries while undertaking its own related transformation. Michael Abbott, senior managing director and global banking lead at Accenture, tracks around 350 AI projects at banks around the world, which he sorts into three groups: embedded tools, changed processes and product differentiation.
Most banks are leveraging generative AI simply by turning to existing solutions, such as the generative AI capabilities in Salesforce or ServiceNow. Others are trying to tackle decades of legacy COBOL code.
“Unfortunately many of the banks are still working on code from the 1970s,” Abbott said. “We're now able to put that back into original specifications and … it may very well be the most disruptive thing to banking overall, is when you can get it to the original specifications and you get it built right, generative AI can then forward engineer that code into any language.”
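The workflow Abbott sketches has two steps: use a model to recover a functional specification from legacy COBOL, then generate code in a modern language from that specification. The sketch below is a minimal, hypothetical version of that loop, assuming the OpenAI Python client as the model interface; the model name, prompts and function names are illustrative, not Accenture’s tooling.

```python
# A hypothetical sketch of the two-step flow Abbott describes:
# reverse engineer a specification from COBOL, then forward engineer
# code in a target language from that specification.
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment


def reverse_engineer_spec(cobol_source: str) -> str:
    """Step 1: derive a plain-language specification from COBOL."""
    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative model choice
        messages=[
            {"role": "system",
             "content": "Produce a precise functional specification "
                        "for the given COBOL program."},
            {"role": "user", "content": cobol_source},
        ],
    )
    return response.choices[0].message.content


def forward_engineer(spec: str, target_language: str = "Java") -> str:
    """Step 2: generate code in the target language from the spec."""
    response = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system",
             "content": f"Implement the following specification in "
                        f"idiomatic {target_language}."},
            {"role": "user", "content": spec},
        ],
    )
    return response.choices[0].message.content
```

The spec acts as the stable intermediate artifact: once it is reviewed and “built right,” as Abbott puts it, the same specification can be regenerated into any target language.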
The power to untangle decades of legacy code is an immense draw for banking institutions. But Abbott emphasized the importance of leaning into unstructured data.
“This is not helping banks where they have structured information because of its risk and underwriting, most likely they’ve taken advantage of that information,” Abbott said. “It’s really all the unstructured problems out there that are being unlocked for the first time.”
American Honda speaks the data language
Data is the common language across American Honda Motor Company, said Bob Brizendine, the company’s VP of IT.
But preparing data for AI use cases and keeping a watchful eye on data quality and security have taken on new urgency in the last year.
“We have definitely increased the investment and the pace of our data quality initiatives,” said Brizendine. “We haven't defined a metric for this, but in essence, AI-ready data is what we're striving for.”
The company implemented a five-point adoption plan and, as he put it, gave data a promotion within the company.
Brizendine detailed several focus areas in his group’s data strategy: curation, discovery and literacy.
“When we talk about how to curate data and what are the target areas, our approach is pretty straightforward,” he said.
Unlocking previously untapped sources, which Brizendine called the “easiest area,” is a priority.
“These are the areas that deal with new business ventures that we're getting into within the company, or the growing emphasis that we have on sustainability initiatives — areas for which we have few existing sources of data,” Brizendine said.
One key area of data discovery lies in the unstructured realm.
“What used to be the biggest challenge was unstructured data when we were talking about data provisioning and things along that line,” Brizendine said. “But as many are realizing now, some of the best sources of high-quality data for generative AI and specifically the large language models are not found necessarily in your structured data, but they’re in what we would have historically considered to be our unstructured datasets.”
Much of that data, Brizendine noted, is already highly curated, residing in product manuals and other documentation, which eases the quality assurance process.
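As a hypothetical illustration of putting that documentation to work, the sketch below splits a manual into overlapping chunks, a typical first step before indexing unstructured text for retrieval by a language model. The file name and chunk parameters are assumptions for the example.

```python
# A minimal sketch of preparing curated, unstructured documentation
# (e.g., a product manual) for use with a large language model:
# split the text into overlapping chunks suitable for embedding and
# retrieval. File name and sizes are illustrative assumptions.
from pathlib import Path


def chunk_text(text: str, chunk_size: int = 1000,
               overlap: int = 200) -> list[str]:
    """Split text into fixed-size chunks that overlap to keep context."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]


manual = Path("owners_manual.txt").read_text(encoding="utf-8")
chunks = chunk_text(manual)
print(f"{len(chunks)} chunks ready for embedding and retrieval")
```

Because manuals of this kind are already well edited, the chunks need far less cleanup than raw operational data before they can feed retrieval or fine-tuning.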
For structured data, Brizendine said there’s no secret sauce. “There's really no magic here,” he said. “We're really prioritizing the most critical areas where we do lack high-quality data as we move toward this AI-ready level for the future.”