Consistency and Integrity
"RDBMS's greatest strength -- data integrity -- has now become its biggest weakness. To ensure consistent entry of data, (RDBMS) requires a strict data model enforced by tons of referential data relationship constraints ... [a] data model burden emerges as a serious limitation in cases where an organization requires flexibility ... enter the non-relational database."
Data practitioners who perceive integrity as a weakness are in the wrong field -- they do not understand what integrity and consistency are and, therefore, the meaning of data (What Meaning Means: Business Rules, Predicates, Integrity Constraints and Database Consistency, Redundancy, Consistency, and Integrity Derivable Data). By enforcing integrity constraints, a DBMS ensures consistency of the database with the business rules that denote the meaning of the data (i.e., the faithfulness of the database to the conceptual model of the segment of the real world it represents). No analyst should accept data that does not represent reality accurately, yet they do without being aware of it (Relational Fidelity & Analytics Integrity). A DBMS that doesn't enforce integrity cannot be considered a true DBMS. SQL DBMSs do a rather poor job of that (To Really Understand Integrity, Don't Start with SQL), non-SQL and other non-RDBMSs give it up altogether -- their data structures make it both theoretically and practically unfeasible. Trading off integrity for "flexibility" is a bad analytics strategy, but how can technology that had already been dropped as inflexible in favor of SQL several decades ago be a better option anyway? They who forget the past.
Giving up integrity is sometimes acceptable for data warehouses and other read-only databases, but, first, this is somewhat of an illusion: they must be populated and without constraint enforcement, what ensures consistency for all applications (including analytics) using them, particularly if the data come from SQL, or non-relational databases where constraints are not enforced either? (The Trouble with Data Warehouse Analytics). Second, beyond integrity, the theoretical foundation of the RDM guarantees 'logical validity' of the data sets retrieved for analysis; and adherence to relational design principles (Database Design: What It Is and Isn't) ensures their 'semantic correctness' (Object Orientation, Relational Database Design, Logical Validity and Semantic Correctness). Neither "modern" non-RDBMSs, nor "traditional" SQL DBMSs guarantee them.
I used quotes because it is a contradiction in terms ('Unstructured Data': Why This Popular Term Is Really a Contradiction, Structuring the World With 'Unstructured Data'). All data is structured 'by definition' -- if they weren't, they would be random noise, not data.
" ... if you're dealing with massive amounts of unstructured data, you may not have the luxury of developing a relational database with clearly defined schema ... related information of all types. Instead of relying on tables, non-RDBMS databases are document-oriented. This way, non-structured data -- such as articles, photos, social media data, videos, or content within a blog post -- can be stored in a single document that can be easily found, but isn't necessarily categorized into fields like a relational database does."
More fundamental misunderstanding. All data management involves manipulation of 'some' data structure (Analytics = Manipulation of Data Structure), which determines the kind of manipulation applicable to it (i.e., what kind of questions can be asked of the data and answers obtained). All the enumerated elements are structured, it's just that the structures are complex and, thus, so is their manipulation (All Structures Aren't Created Equal). Documents are multi-structured (Documents and Databases). Data are not "inherently" relational or non-relational -- it is a matter of modeling and design choices. Relational structure has advantages that are critical -- decidable data languages, guaranteed logical validity, semantic correctness, physical and logical independence -- that no non-RDBMS can deliver (Data Sublanguages, Programming, and Data Integrity, Physical Data Independence). The upfront effort is, thus, not a luxury, but it is not done either because the value is unappreciated, modeling and design skills are lacking, or out of expediency and laziness. Problem is, if you don't understand reality sufficiently to model it and design a relational database, you can't guarantee consistency, sensible queries, valid, semantically correct results and correct interpretation.