It's easy to discuss Big Data theoretically, highlighting the long-term benefits of a corporate data-oriented strategy. But success is not created in a vacuum and there are numerous examples of companies doing Big Data right.
Below are three case studies of Big Data in action, highlighting the benefits of a data-oriented strategy across sectors:
74 million bits and counting
Healthcare organizations are processing a lot of data, but nowhere is that more clear than in the U.S. Medicaid program.
Medicaid provides health insurance for 74 million Americans, more than Medicare covers. The social healthcare program also covers more than half of all U.S. births and half of long-term care for elderly and disabled populations.
Though the payer is a federal program, it is administered by the states, creating 51 "little" Medicaids, according to Jessica Kahn, director of Medicaid Data and Systems at the Centers for Medicare & Medicaid Services, speaking at the Amazon Web Services Public Sector Summit in Washington.
With vast amounts of data, Medicaid can better understand trends at a national level. For example, how many pregnant women in the program received routine prenatal care? Where are people accessing behavioral health services?
While those are important questions for the payer to understand, they are difficult to answer with essentially 51 separate programs, according to Kahn. So for the past four to five years, Medicaid has been working to improve the kind of data it collects from and with states to understand trends at the national level.
But, "as we started to do the federal build — that federal catcher's net — we started to think about the size and the scale of the data," said Kahn. "Seriously, just California alone would probably jam up our data center."
Amassing large amounts of data to draw insight poses a storage problem, and because the data is cumulative, size and scalability were important. "We weren't building just for now; we're building four years from now," said Kahn.
Working with a Silicon Valley startup, the agency turned to the cloud, using AWS to store vast quantities of data and to scale as both future and historic data are added.
"It's this constant influx of very complex data from, as I said, 51 different paths, and we did try to get them to a single data dictionary, which is like matchmaking times 51," Kahn said. "It's quite a big enterprise, but it's already producing," she said. "It will never be perfect because it's real, and that's how healthcare data is."
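The "single data dictionary" Kahn describes can be pictured as a per-state mapping from local field names onto one shared set of names. The following is only a minimal sketch under stated assumptions: the state codes, field names and the normalize_record helper are all hypothetical illustrations, not Medicaid's actual schema or tooling.

```python
# Hypothetical per-state field mappings (illustrative only; real Medicaid
# submissions are far larger and more complex than this).
STATE_FIELD_MAPS = {
    "CA": {"bene_id": "beneficiary_id", "svc_dt": "service_date", "proc": "procedure_code"},
    "TX": {"member": "beneficiary_id", "date_of_service": "service_date", "cpt": "procedure_code"},
}

# Hypothetical common data dictionary every state record should map onto.
COMMON_FIELDS = ("beneficiary_id", "service_date", "procedure_code")


def normalize_record(state, record):
    """Rename a state's fields to the common names.

    Returns (normalized, unmapped, missing): the renamed record, any source
    fields with no known mapping, and any common fields the record lacks.
    """
    mapping = STATE_FIELD_MAPS[state]
    normalized = {"state": state}
    unmapped = {}
    for key, value in record.items():
        if key in mapping:
            normalized[mapping[key]] = value
        else:
            unmapped[key] = value
    missing = [field for field in COMMON_FIELDS if field not in normalized]
    return normalized, unmapped, missing


if __name__ == "__main__":
    row = {"member": "B7", "cpt": "99213", "county": "Travis"}
    print(normalize_record("TX", row))
```

Returning the unmapped and missing fields alongside the normalized record reflects Kahn's point that the data "will never be perfect": gaps are surfaced for review instead of being silently dropped.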
Transactional data and the consumer potential
Old companies have to undergo difficult modernization efforts to keep up with the digital age. And we’re not talking about pre-dot-com companies or those born around the time of Intel or IBM.
Rather, it’s companies like National Cash Register (NCR), which has been around since 1884 and has long been known as a hardware company, creating a large percentage of the world’s ATMs and cash registers. In fact, 830,000 of the 3.1 million ATMs in the world were created by NCR, the company reports.
But Bill Nuti, the CEO of NCR, who joined the company in 2005, has worked with the company's tech leadership, including CTO Eli Rosner, to transform NCR into a software company. In fact, 63% of the company's revenue came from software and services.
To reinvent the company, Nuti built five-year plans covering the period from 2005 through 2020. In the first five years, the company streamlined manufacturing and consolidated the workforce. Then between 2010 and 2015, NCR began looking toward software investments.
With software and a successful reinvention under its belt, what’s next for NCR? Well, the answer might lie in the data, said Rosner, in an interview with CIO Dive.
With more than 700 million transactions running through its devices every day, whether at kiosks or ATMs, NCR collects a lot of data. But NCR's true differentiator will stem from its ability to draw on data from across industries.
With transaction data sets from banks, ATMs, POS systems and restaurants, NCR could craft complete consumer profiles and potentially anticipate consumer spending behavior, helping to create a more tailored marketing experience.
For example, with an ATM in a grocery store, NCR can leverage an integration to send a customer a targeted offer while they are using the machine, such as a coupon for an item in the store based on their transaction history, according to Rosner.
But what about the potential invasiveness of such targeted solutions? NCR "doesn't blink" without a privacy or IP attorney on hand, Rosner said.
While everything it does has to be opt-in, millennials are also far more lenient toward sharing data, according to Rosner. For the most part, so much about consumer habits is already known and understood today. But consumers who opt in across channels have the potential to get better recommendations.
Unlocking genomics at the Smithsonian
Scientists have long toiled over the human genome, investing large amounts of resources because of the potential health impacts. But other genomes are far less understood.
That's where the Smithsonian Institution comes in. With 19 museums, nine research centers (from Panama to Fort Pierce, Florida, to Cambridge, Massachusetts) and a zoo, the Smithsonian's research capabilities cover everything but human genomics, according to Dr. Rebecca Dikow, a research data scientist of biodiversity at the Smithsonian Institution, speaking at the AWS Public Sector Summit.
The Smithsonian is undergoing a data and IT modernization effort to help unlock the potential of data, particularly as it relates to genomics. After all, some plant genomes are more than 10 times as large as the human genome, according to an AWS case study on the Smithsonian.
"As genomics has been on the rise, researchers have increasingly large data sets," said Dikow. So the research computing branch of the Smithsonian has taken on the task of not only managing data, but also enabling researchers to analyze it.
Working in the office of the CIO, data scientists like Dikow are trying to enable data-heavy research through both local computing resources and the cloud.
Working with companies like AWS and Intel, the Smithsonian data science team is trying to accelerate genome annotation, which identifies the location of genes and the genome coding regions to help determine what genes do.
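To give a rough sense of what locating coding regions involves, here is a toy sketch that scans a DNA string for open reading frames: stretches that begin with the start codon ATG and run to the first in-frame stop codon. It is an assumption-laden illustration only. The find_orfs function is hypothetical, it reads just the forward strand, and it is not the Smithsonian's pipeline, which would rely on far more sophisticated annotation software.

```python
# Standard DNA stop codons; ATG is the usual start codon.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence, min_codons=3):
    """Return sorted (start, end) index pairs of open reading frames.

    Scans the three forward reading frames only. Each ORF runs from an ATG
    to the first in-frame stop codon; `end` is exclusive and includes the
    stop codon. ORFs shorter than `min_codons` codons (counting the start
    and stop codons) are skipped.
    """
    seq = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                end = i + 3
                if (end - start) // 3 >= min_codons:
                    orfs.append((start, end))
                start = None
    return sorted(orfs)


if __name__ == "__main__":
    dna = "CCATGAAATTTTAGGG"
    for start, end in find_orfs(dna):
        print(start, end, dna[start:end])
```

Even this simplified scan has to walk every position of the sequence in each frame, which hints at why genomes many times the size of the human genome push researchers toward scalable storage and compute.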
By developing software and using existing software, the Smithsonian is working to unlock genomic data, which contains large and complicated information sets. But with cloud storage and other modern tools, work with genomic data sets is far easier than it used to be.