Dive Brief:
- Five technology vendors united behind adoption of cross-industry data provenance standards Thursday. Cisco, IBM, Intel, Microsoft and Red Hat jointly sponsored the launch of the OASIS Data Provenance Standards Technical Committee, nonprofit organization OASIS Open said in an announcement.
- The group will convene April 8 with the mandate to refine the data provenance standards codified by members of the Data and Trust Alliance and encourage widespread adoption.
- American Express, Humana, IBM, Mastercard, Pfizer, UPS and Walmart were among the 19 D&TA affiliates that crafted version 1.0.0 of the standards, published in July 2024.
Dive Insight:
Data preparedness is an old problem that’s reached a new level of urgency, as data-hungry generative AI applications increase the pace and scale of data consumption. The road to AI adoption is strewn with privacy, compliance and integration hurdles that remain hard to steer around.
“Data governance is nothing new — people have been at this for a long time,” Kristina Podnar, senior policy director at D&TA, told CIO Dive. “But we’ve never had businesses come together to make it happen.”
Standardizing provenance protocols across vendors and industries can clear a path for better data quality management and tools that automate the validation process.
The alliance’s framework defines a common metadata classification system to help organizations efficiently validate the quality and reliability of datasets for use in traditional analytics and AI business applications.
“We want to ensure any business across all industries can have visibility and transparency into the data that they're using,” Podnar said.
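To make the idea concrete, here is a minimal sketch of what automated validation against a common metadata schema could look like. The field names below (`source`, `collection_method`, `license`, `generation_date`) are illustrative assumptions, not the D&TA standard's actual schema; the published standards define their own metadata fields.

```python
# Hypothetical required fields for a provenance metadata record.
# These names are illustrative only -- the D&TA standards define
# their own classification system.
REQUIRED_FIELDS = ("source", "collection_method", "license", "generation_date")

def validate_record(record: dict) -> list[str]:
    """Return the required fields that are missing or empty in a record."""
    return [f for f in REQUIRED_FIELDS if not record.get(f)]

# Example metadata record attached to a dataset.
dataset_metadata = {
    "source": "internal-crm",
    "collection_method": "user-consented form",
    "license": "proprietary",
    "generation_date": "",  # left blank -- will be flagged
}

missing = validate_record(dataset_metadata)
print(missing)  # fields a reviewer or tool would flag before the data is used
```

With every vendor emitting the same fields, a check like this could run automatically in a data pipeline rather than as a manual diligence step, which is the kind of tooling the committee hopes standardization will enable.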
In an unsettled regulatory environment, broad involvement is needed to create a trust benchmark for third-party data sources. Policy professionals want stronger guardrails around AI data, but most expect it to be several years before governments can effectively intervene.
“There has been much movement around adoption and regulating AI, yet we don’t have standard definitions for its critical elements,” Saira Jesani, D&TA executive director, said in a July post announcing the organization’s standards. “The consequences — from copyright infringement to privacy to authenticity — could affect both the technology’s business value and its acceptance.”
IBM tested the standards internally last year, aligning the committee’s recommendations with its governance policies. The company reported a “consistent and quantifiable impact on the overall efficiency of our data diligence and management processes,” Christina Montgomery, chief privacy and trust officer at IBM, said in a November post.
The technical committee will look to improve on the existing standards while generating momentum for implementation tools. Podnar said she thinks the current version covers most, but not all, of the standardization issues that should be addressed.
“It’s about 80% there,” she said. “We knew that we didn't get this 100% right and that we were going to have to evolve the standards.”
Once the technical committee convenes, the organization intends to bring additional members on board. The sponsoring companies are aiming to have metadata quality metrics ready for broad usage within 12 to 18 months, according to Podnar.
“They're not interested in having this take a long time,” she said. “They want tools in the marketplace.”