The data management industry operates like the fashion industry. Its most persistent characteristic is migration from fad to fad. Every few years -- the number keeps getting smaller -- some "new" problem is discovered, for which the solution is so magical, that it is extended everywhere to everything, whether it is applicable or not.
But many of these problems are old and fundamental and some of the “solutions” bring them back, rather than solve them.
The current solution is big data analytics, seen as the technology for solving and the problem for everything from terrorism to tuberculosis, and the problem is data complexity:
"... the key challenge is not data size but complexity ... To make a Big Data initiative succeed, the trick is to handle widely varied types of data, disparate sources, datasets that aren’t easily linkable, dirty data, and unstructured or semi-structured data ... But … You don’t get a Big Data club card just for changing your old (but still trustworthy) data warehouse into a data lake (or even worse, a data swamp).” -- Big Data: The Key is Bridging Disparate Data Sources
Quite. Except that old geezers, er, experienced professionals like me remember complexity as the "islands of information" of the days of proliferating redundant application-specific files and application programs that would not talk to each other. The marketeers of "integrative solutions" hyped them then just as those of big data analytics do it today. Complexity is still with us because we mindlessly generate complexity much easier and faster than any "agile" magic wand can extract reliable useful information from it. But instead of addressing this fundamental problem, we accelerate it.
“The NoSQL flavor of databases has come en vogue in the last few years in certain technology sectors, primarily ones that are evolving so quickly that having to slow down to put forethought into your data store and how it's going to be structured might literally be the difference between your whole company suceeding or not.” --ignoredbydinosaurs.com
Forethought has become an impediment, rather than a success factor -- one reason I find the "science" in data science, as practiced, highly questionable. When real science education still existed, it drilled into me that science is all about knowledge, reasoning, and forethought. This is exactly what we're now trying to avoid at all cost, because, as vocational training is substituted for education and you gotta be a dropout to succeed, thought -- not just forethought -- becomes increasingly difficult and even discouraged.
That explains the attractiveness of NoSQL and big data analytics. i.e., forward to the past:
“There are no [design] rules of normalization for [NoSQL] databases … Which means you're designing the data organization to serve specific queries. So follow the same principle in NoSQL databases as you would for denormalizing a relational database: design your queries first, then the structure of the database is derived from the queries.” --What is a good way to design a NoSQL database?
Using the term "design" in this context is, of course, misleading. Design requires forethought, anticipating the types of likely queries and models reality and pre-structures data so as to simplify integrity enforcement and data manipulation. NoSQL and big data analytics do the opposite.
“In various organizations, data modeling for NoSQL emphasizes the roles of [application] developers while deemphasizing the roles of data modelers and database administrators.” --Donovan Hsieh, eBay’s Senior Data Architect
In other words, we mindlessly pile up complexity and put our trust in minimally educated coders and machines “more intelligent than us” to tackle it for our benefit. The consequences are not only not different than those of application-managed data -- struggle to optimize for multiple uses data structured for specific uses -- and post-facto structuring anyway. They also are scary: the constant atrophiation of human intellect coupled with machines programmed by corporate interests to extract, not produce and distribute wealth is socially destructive.
Shouldn't we strive to avoid complexity via forethought, rather than keep increasing it? That's what database management -- and relational database management in particular -- were devised to address, but are being dismissed and ignored, because they require scarce intellectual abilities. They are yesterday's fads.