Editor's note: The following is a guest article from Maria Colgan, a Master Product Manager at Oracle Corporation.
Historically, relational databases have been very good, necessary in fact, for transaction processing applications. If you bought or sold something and needed to record that fact, a relational database was for you. In the early days, these databases handled rows of records, with each record containing various attributes. A record might be a product sale, and the attributes of that sale could be price, customer, color, source, discount, and so on.
Thus, relational databases were limited to that very large, very important, but also extremely specific, task.
Over the last four decades, explosive growth in the volume of data generated spurred the advent of data warehouses, which gave companies a better way to aggregate and analyze that data.
In that time, the relational database has likewise evolved: these databases now handle many different types of data and workloads, as well as different ways of storing that data. As such, modern relational databases in the cloud computing era can handle a much more diverse set of tasks.
What was true about the category 20 years ago is no longer the case today. Read on to see the biggest myths about today's relational databases.
Myth #1: Relational databases can't handle messy, unstructured data
In the late 80s and 90s non-relational "object" databases debuted to handle data that didn't fit well in structured, column-and-row formats. In the early 2000s XML databases, which suited document data, arrived on the market.
Since that time, however, relational databases have adapted to handle these unstructured data types.
Modern relational databases, for example, can handle the XML format used to store many documents as well as JSON, a popular "lightweight," readable data interchange format used in many modern applications.
Modern "converged" databases not only store but allow users to query both structured and unstructured data, thereby delivering more holistic results based on both types of information.
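As a rough illustration of querying structured and JSON data together, the sketch below uses SQLite through Python's standard `sqlite3` module. It is not the author's product or method: the `sales` table, its columns, and the sample records are hypothetical, and it assumes the SQLite build bundled with Python includes the JSON functions (true of most recent builds).

```python
import sqlite3

# Hypothetical schema: one ordinary column plus one JSON document per row.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (id INTEGER PRIMARY KEY, customer TEXT, details TEXT)"
)
conn.executemany(
    "INSERT INTO sales (customer, details) VALUES (?, ?)",
    [
        ("Ana", '{"product": "shirt", "color": "blue", "price": 20}'),
        ("Ben", '{"product": "hat", "color": "red", "price": 15}'),
        ("Cy", '{"product": "shirt", "color": "red", "price": 22}'),
    ],
)

# One SQL statement filters on a field inside the JSON document and
# returns it alongside a regular relational column.
shirt_buyers = conn.execute(
    "SELECT customer, json_extract(details, '$.color') "
    "FROM sales "
    "WHERE json_extract(details, '$.product') = 'shirt' "
    "ORDER BY customer"
).fetchall()

print(shirt_buyers)
```

The point of the sketch is only that the document data stays in the same table, and the same query language, as the structured columns.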
Myth #2: Relational databases aren't up to the Internet of Things
The internet of things, in which multiple billions of sensors and other devices share data over the web, obviously poses a huge data management challenge. IoT devices collect and share data showing the operating status of remote machinery, as well as cellphone location data, weather stats like temperature and wind speed, and human health statistics.
Obviously that's a lot of data snippets from myriad sources, which in many cases need to be processed fast. This is why in-memory database capabilities can be attractive. Storing and processing data in memory is faster than doing the same on disk, which is where relational databases have typically kept their data. Hence the advent of several specialized in-memory databases. But again, modern relational databases can now perform in-memory processing where needed and with the right algorithms applied. That being the case, there's little sense in using two or more databases when one gets the job done.
Myth #3: Relational databases do not scale
In the past, it was thought that relational databases were fine for big data sets as long as they didn't get too big. That was one factor driving the early growth of distributed NoSQL (not-only SQL) databases. These databases divvied up massive data sets into separate partitions. This process, known as sharding, was not something older relational databases facilitated or handled well.
But that, too, has changed. Modern relational databases can support sharding and even ensure that certain shards stay native to specific countries. That is a key requirement because some nations now require that local data stay in-country. Providing all of the benefits of sharding along with all the advantages of a mature relational database, instead of the compromises you may have to make with a NoSQL database, is a game changer.
Myth #4: Relational databases can't handle non-transactional workloads
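To make the idea of sharding with country-specific placement concrete, here is a minimal sketch of a routing function. It does not describe how any particular database implements sharding: the shard names, the `PINNED_SHARDS` residency table, and the `shard_for` helper are all hypothetical.

```python
import hashlib

# Hypothetical general-purpose shards that share the remaining data.
SHARDS = ["shard_0", "shard_1", "shard_2"]

# Hypothetical residency rules: records from these countries must stay
# on a shard hosted in that country.
PINNED_SHARDS = {"DE": "shard_de", "FR": "shard_fr"}


def shard_for(customer_id: str, country: str) -> str:
    """Pick the shard that should hold a customer's records."""
    # Residency rules win over hash-based placement.
    if country in PINNED_SHARDS:
        return PINNED_SHARDS[country]
    # Otherwise spread records with a stable hash of the key, so the
    # same customer always lands on the same shard.
    digest = hashlib.sha256(customer_id.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(SHARDS)
    return SHARDS[index]


print(shard_for("customer-42", "DE"))
print(shard_for("customer-42", "US"))
```

A stable hash is used rather than Python's built-in `hash()`, which is randomized per process for strings and would route the same key differently between runs.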
As noted, early relational databases were great at transaction processing, where all data about a record resides in a neat row. But a row-based database ran into limitations when used for analytical-style queries that focus on just one or two attributes across a large dataset. So if you wanted to pull up all instances where customers buying a given shirt came from California, across a large set of records, a "columnar" database may have been a better option. Such databases made it easier to pull a desired subset of data from across all the relevant rows.
But once again, relational databases have progressed to the point where they can store data in multiple formats simultaneously. Analytical queries can grab specific columnar data and analyze the results quickly, while transactional workloads still operate on a single record or row as before.
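The difference between the two layouts can be sketched with plain Python data structures. This is a toy model, not a database engine: the sample sales records and both helper functions are hypothetical, and the example only shows which data each layout has to touch to answer the shirt-and-California question.

```python
# Row layout: each record keeps all of its attributes together, which
# suits transactions that read or write one whole sale at a time.
rows = [
    {"product": "shirt", "state": "CA", "price": 20.0, "color": "blue"},
    {"product": "shirt", "state": "NY", "price": 22.0, "color": "red"},
    {"product": "hat", "state": "CA", "price": 15.0, "color": "green"},
    {"product": "shirt", "state": "CA", "price": 18.0, "color": "white"},
]

# Column layout: the same data, stored as one list per attribute.
columns = {name: [row[name] for row in rows] for name in rows[0]}


def count_in_rows(rows, product, state):
    """Scan every full record, even though only two fields matter."""
    return sum(
        1 for r in rows if r["product"] == product and r["state"] == state
    )


def count_in_columns(columns, product, state):
    """Read only the two columns the query needs; price and color are
    never touched."""
    pairs = zip(columns["product"], columns["state"])
    return sum(1 for p, s in pairs if p == product and s == state)


print(count_in_rows(rows, "shirt", "CA"))
print(count_in_columns(columns, "shirt", "CA"))
```

Both functions give the same answer; a database that keeps both layouts can hand each kind of query to the representation that suits it.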
Myth #5: It's best to move data to the process rather than the other way around
In the past, it was a given that it was more efficient to move data to an analytic engine or platform to run the machine learning algorithms required to solve a specific problem.
In today's big data era, however, it is time-consuming, expensive, error-prone, and insecure to make multiple copies of large datasets simply to take advantage of machine learning.
Additionally, the whole point of using machine learning is to create or maintain a competitive edge. The farther the data is from the place where it is analyzed, the more latency, or delay, is introduced to the process, potentially impacting that competitive edge. And in many cases, moving data from one place to another incurs data transfer fees. Moving a bunch of data can get pricey fast.
Nowadays, it saves time and money to bring analytics and other sorts of computational algorithms (machine learning) to the data rather than forklifting data in the other direction. Having machine learning algorithms built into a converged database enables faster model building and real-time scoring without the risk of copy contagion or security breaches.
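The shape of "bringing the algorithm to the data" can be sketched by registering a scoring function inside the database and letting the query return only the result. This is an illustration under loud assumptions: it uses SQLite via Python's `sqlite3` module rather than any converged database, the `customers` table and sample values are hypothetical, and `score` is a made-up linear formula standing in for a trained model. SQLite runs in-process, so no network transfer is actually saved here; the sketch only shows the pattern.

```python
import sqlite3


def score(amount, visits):
    """Hypothetical stand-in for a trained model: a fixed linear formula."""
    return 0.5 * amount + 2.0 * visits


conn = sqlite3.connect(":memory:")
# Make the scoring function callable from SQL, next to the data.
conn.create_function("score", 2, score)

conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, amount REAL, visits INTEGER)"
)
conn.executemany(
    "INSERT INTO customers (amount, visits) VALUES (?, ?)",
    [(100.0, 3), (20.0, 1), (60.0, 10)],
)

# Scoring and ranking happen inside the query; only the single
# top-scoring row is returned to the application.
top = conn.execute(
    "SELECT id, score(amount, visits) AS s "
    "FROM customers ORDER BY s DESC LIMIT 1"
).fetchone()

print(top)
```

The alternative, exporting every row to a separate analytics system and scoring it there, is the copy-heavy pattern the article argues against.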
Myth #6: Relational databases can't handle modern tech constructs like microservices
Nowadays, software developers often use microservices: small, modular pieces of code that work in concert with other microservices to build full applications. Early on, each microservice was assigned its own, separate database. The problem with that is it leads to a hairball of different databases all needing to be managed and updated separately.
But again (please skip ahead if you've guessed what's next), modern relational databases or a converged database can provide the independence required by microservices without the integration complexity that arises from using separate single-purpose databases. One database means one set of management tools and updates.
Summing it up
So, as the arguments for deploying multiple, specialized databases fall away, it's time for businesses to bring together their disparate workloads onto a single, adaptable, general-purpose database that is easier to track, manage, and secure.
And, while they're at it, it's also probably time to retire the old "relational" moniker when referring to a new class of flexible, converged databases that handle relational data, as well as a lot more.
About the Author
Maria Colgan is a Master Product Manager at Oracle Corporation and has been with the company since version 7.3 was released in 1996. Maria's core responsibility is evangelizing new database functionality, specifically for Oracle's Autonomous Transaction Processing Database and the best practices for incorporating it into business environments. She is also responsible for getting feedback from Oracle customers and partners and incorporating it into future product releases.