Note: The RDM assumes conceptual models (i.e., that object groups, their properties and their relationships have been identified). Some analytical systems are designed to 'discover' the models (from data value relationships, sequence analysis, co-occurrences, etc.). These are two distinct practices that are commonly confused but should not be, an issue I addressed in Data Meaning: Analytics vs. Data Mining and Data, Information, Knowledge Discovery, and Knowledge Representation.
"Only 30% of data analytics is still performed against traditional relational database management systems" while "approximately 70% are modern non-RDBMS sources" like Hadoop, NoSQL, in-memory, search, columnar/MPP analytic and cloud native databases.
As my readers should know, there is little to no understanding in the industry of what an RDBMS is (What Is a True Relational System and What Is Not). In fact, there are 'no' true RDBMSs, only SQL DBMSs wrongly alleged to be relational. They have limited relational fidelity and nowhere near the capabilities and advantages conferred by the RDM. Classifications of DBMSs as relational cannot and should not be trusted. Some of the criticisms of "relational" systems thus apply to SQL DBMSs, not RDBMSs, and in what follows I will therefore substitute [SQL] for 'relational'.
Note: Even criticisms of SQL DBMSs cannot be of their "analytics capabilities" (analytics is an 'application function'), but at best only of their data retrieval capabilities (i.e., data integrity and manipulation, their 'data management DBMS function'). There is little recognition of this distinction in the industry (Understanding the Division of Labor between Analytics Applications and DBMS).
"For example, analytics users that understand how to leverage graph queries can derive deep network structure insight and wide relationship analysis over graphed data that simply can't be computed on relational schema structured data."
There are applications for which directed graph data structures are suitable. But such applications are much rarer than proponents would have you believe, and the structures have serious drawbacks, not the least of which are 'prohibitive complexity and inflexibility'. This is a core problem that the RDM was introduced to address and is a major reason hierarchic and network DBMSs were effectively dropped more than four decades ago in favor of even weakly relational SQL; so much for "modern". Them who forget the past …
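As one narrow illustration that "relationship analysis" is not out of reach for data held in tables, here is a minimal sketch of a reachability (transitive closure) query. It assumes SQLite, via Python's standard sqlite3 module, as a stand-in SQL DBMS, which per the above is only weakly relational. The `follows` table, its columns, the sample data and the `reachable_from` helper are all hypothetical names invented for the example.

```python
import sqlite3


def reachable_from(edges, start):
    """Return the set of nodes reachable from `start` via one or more edges.

    `edges` is an iterable of (src, dst) pairs; the table and column
    names below are hypothetical, chosen only for this sketch.
    """
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE follows ("
            " src TEXT NOT NULL,"
            " dst TEXT NOT NULL,"
            " PRIMARY KEY (src, dst))"
        )
        conn.executemany("INSERT INTO follows VALUES (?, ?)", edges)
        rows = conn.execute(
            """
            WITH RECURSIVE reach(node) AS (
                SELECT dst FROM follows WHERE src = ?
                UNION  -- UNION (not UNION ALL) drops duplicates, so cycles terminate
                SELECT f.dst
                FROM follows AS f
                JOIN reach AS r ON f.src = r.node
            )
            SELECT node FROM reach
            """,
            (start,),
        ).fetchall()
        return {node for (node,) in rows}
    finally:
        conn.close()


# Hypothetical sample data: ann -> bob -> cat -> ann is a cycle; dan -> ann.
EDGES = [("ann", "bob"), ("bob", "cat"), ("cat", "ann"), ("dan", "ann")]

if __name__ == "__main__":
    print(sorted(reachable_from(EDGES, "dan")))
```

The sketch shows only that this one kind of network analysis is expressible as a declarative query over a table of pairs; it says nothing about performance, which is an implementation matter.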
To the extent that there is anything to the often repeated claim that "[SQL] databases have a fraught relationship with applications written in object-oriented programming languages like Java, PHP and Python", it is due to those languages' affinity to directed graph structures. But (1) this has nothing to do with analytics per se and (2) a 'careful' separation between computationally complete programming languages (CCL) and data languages is necessary to guarantee relational advantages (Data Sublanguages, Programming, and Data Integrity).
In-memory, columnar, and cloud native DBMSs are types of DBMS implementation that say nothing about their underlying data models, which are practically ignored in DBMS reviews, evaluations, or comparisons (Structure, Integrity, Manipulation: How to Compare Data Models). No wonder that misconceptions, rather than true RDBMSs that would address the valid criticisms of SQL, proliferate (Database Management: No Progress Without Data Fundamentals). This 'logical-physical confusion' (LPC) is rampant (Don't Mix Model with Implementation) and underlies most of the non-RDBMS superiority arguments:
- "… scale horizontally";
- "processing huge amounts of data in the cloud";
- "allowing relatively low-cost servers to be combined into a single, powerful cluster";
- "solve great performance challenges, tackle huge scales of data, help mine value from a wider variety of data".
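The logical-physical distinction behind these arguments can be sketched in a few lines. This is a toy illustration, not a DBMS: `RowStore`, `ColumnStore`, `well_paid`, the attributes and the sample data are all hypothetical. It assumes only that a storage layout can be hidden behind a common logical interface, here a set of tuples.

```python
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical attributes for the sketch: (emp, dept, salary).
Row = Tuple[str, str, int]


class RowStore:
    """Physical layout 1: one record per row."""

    def __init__(self, rows: List[Row]) -> None:
        self._rows = list(rows)

    def relation(self) -> FrozenSet[Row]:
        # Logical view: an unordered set of tuples.
        return frozenset(self._rows)


class ColumnStore:
    """Physical layout 2: one array per column."""

    def __init__(self, rows: List[Row]) -> None:
        self._cols: Dict[str, list] = {
            "emp": [r[0] for r in rows],
            "dept": [r[1] for r in rows],
            "salary": [r[2] for r in rows],
        }

    def relation(self) -> FrozenSet[Row]:
        # Same logical view, reassembled from the column arrays.
        return frozenset(
            zip(self._cols["emp"], self._cols["dept"], self._cols["salary"])
        )


def well_paid(store, threshold: int) -> FrozenSet[Tuple[str, str]]:
    """Logical query: restrict on salary, then project on (emp, dept)."""
    return frozenset((e, d) for (e, d, s) in store.relation() if s > threshold)


ROWS: List[Row] = [("ann", "ops", 90), ("bob", "ops", 70), ("cat", "dev", 95)]

if __name__ == "__main__":
    # The query is written once, against the logical view only.
    print(well_paid(RowStore(ROWS), 80) == well_paid(ColumnStore(ROWS), 80))
```

The query never mentions the layout, so swapping one store for the other changes how the data is held but not what the query means or returns. Scale and performance claims like those quoted above concern the layer hidden behind `relation()`, not the data model.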
There's more. Stay tuned for Part 2.