Is it Time For a DBMS Mass Extinction?

On Sept 19, 2014, InfiniDB (formerly Calpont) announced it was closing its doors after failing to secure financing to continue operations. Establishing differentiation in a crowded market and competing over a finite supply of large enterprises eroded InfiniDB’s position. It’s easy to think this is an isolated story of a single company that failed to gain traction. However, I think this is just the first sign that the asteroid is coming.

Image: The Permian Extinction—When Life Nearly Came to an End (Source: National Geographic)

Over the last eight years or so, dozens of new vendors have emerged offering specialized types of DBMSs. The website nosql-databases.org tracks about 150 different DBMSs. As I see it, the current level of diversity in the DBMS space is simply unsustainable.

This won’t be a popular view. After all, several reasons have been given for the explosion of DBMSs. Most of the reasons are thinly veiled market positioning from vendors desperate for market share. Market positioning reasons almost always talk about how the “old” rules of data management no longer apply. And when people say the old rules no longer apply, you’re in a bubble.

Open source is another element driving DBMS diversity. Open source licensing allows independent developers or companies to derive new products from OSS staples like MySQL and PostgreSQL, or to combine several projects into entirely new offerings. To some extent, this is facilitated by sites like GitHub, which provide a virtual water cooler for developing ideas and features without needing to get together in meatspace. Redis is a great example of the power of virtual collaboration and development.

Another reason given for the diversity of DBMS models is the growth of data volumes and varieties. This argument looks much like the first: legacy DBMS vendors simply can’t cope with new data demands. To a certain extent this reason has some merit, but not nearly enough to justify the continued existence of dozens of DBMS vendors. And don’t forget the resources these vendors have. When the asteroid strikes, they’ll be hiding under piles of money.

If the old rules still apply and the data expansion argument has questionable merit, what has supported the number of DBMS vendors entering the market? Other than loads of VC funding, the answer appears to be a simple one: hardware.

Computing hardware is always increasing in capability and decreasing in cost. But it’s uncommon for processing, storage and networking to all experience massive capability increases and cost reductions in concert. The last time it happened was in 2009. Each time this convergence happens, application developers have free rein over implementation decisions. This developer-centric approach typically lasts for a few years. After all, any code will run, and run quite well, on better hardware.

The slack capacity provided by better hardware might make you think you can do things you wouldn’t previously consider. The old rules may no longer apply. There might even be a free lunch!

Applications get developed and some value is created, but the result is a proliferation of data silos and the abandonment of information governance.

Information management realities always assert themselves. Data in silos is fine for systems of innovation because you’re only focused on data use. But those systems eventually become systems of record, where data reuse is paramount. Systems of record must provide capabilities for description, organization, governance, and integration, among others.

When the asteroid strikes, it won’t be a fiery rock falling from the sky. It will be IT Ops reasserting the need for adult supervision. DBMSs providing the required capabilities will likely thrive, while those that don’t, won’t.

And not even Bruce Willis can save vendors from IT Ops.

Thanks to Merv Adrian for reviewing and contributing to this post.

The Big Data A-Ha Moment is Only the Beginning

If Big Data is going to remake an industry, insurance is certainly a great candidate. With massive amounts of historical data as well as emerging IoT-based data sources, insurance is well positioned to take advantage of new analytical methods and techniques. Recently, I had a chance to speak to a very large insurance company about their Big Data opportunities and challenges. What made the day unique was that my session was followed by a predictive analytics demonstration created by the company’s IT department.

Once the gathered executives saw what was possible with predictive analytics, even at a small scale, they wanted to do more: new cuts of data, additional facets, different questions. But the requests from the invigorated audience reinforced fundamental challenges with Big Data.

First, the data must be correct. This isn’t limited to data quality; it also extends to metadata. Almost any organization is going to have multiple data sources, often for the same data. In the case of this insurer, it has several claims systems, each with different attributes. For example, one claims system has five categories for marital status, while another has seven, as the sketch below illustrates. Inconsistent dates of birth further complicated analysis.
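To make the metadata problem concrete, here is a minimal, hypothetical sketch of reconciling mismatched marital-status codes from two claims systems into one canonical set before analysis. The system names, codes, and mapping decisions are not from the insurer; they are assumptions purely for illustration.

```python
# Hypothetical sketch: harmonizing marital-status codes from two claims
# systems before analysis. All names, codes, and mappings are illustrative.

CANONICAL = {"single", "married", "divorced", "widowed", "unknown"}

# Assumed five-category scheme from "claims system A"
SYSTEM_A = {
    "S": "single",
    "M": "married",
    "D": "divorced",
    "W": "widowed",
    "U": "unknown",
}

# Assumed seven-category scheme from "claims system B"
SYSTEM_B = {
    "SGL": "single",
    "MAR": "married",
    "SEP": "married",   # example business decision: fold "separated" into married
    "DIV": "divorced",
    "WID": "widowed",
    "DPS": "married",   # example business decision: fold domestic partnership into married
    "UNK": "unknown",
}

def normalize(record: dict, mapping: dict) -> dict:
    """Return a copy of the claim record with a canonical marital_status."""
    status = mapping.get(record.get("marital_status"), "unknown")
    assert status in CANONICAL
    return {**record, "marital_status": status}

# Usage: harmonize records from both systems before feeding them to a model.
claims = [
    normalize({"claim_id": 1, "marital_status": "M"}, SYSTEM_A),
    normalize({"claim_id": 2, "marital_status": "SEP"}, SYSTEM_B),
]
print(claims)
```

The point of the sketch is that the mapping itself encodes business decisions (how to treat "separated", for instance), which is exactly the kind of work that cannot be delegated to IT alone.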

Second, IT and the business must be partners. For the purposes of the demonstration, IT simply picked what it thought was an interesting problem, and it picked correctly. After that, the executives started asking for more from the IT team, without any consideration for the work that must happen on the business side. The executives believed they could simply request predictive insights from IT in the same way they ask for new descriptive analytics reports.

Without meaningful collaboration and investment from the business side, in the form of people, process and data, Big Data initiatives will fail. And they will fail quite spectacularly.

Copyright © Nick Heudecker
