Representing a segment of reality in a computerized database requires a model of that reality mappable to a data structure that provides intelligent information retrieval and facilitates data integrity enforcement.
It should be obvious that the foremost necessity for modeling is a thorough knowledge of the business being modeled. The modeling skills per se are necessary, but not sufficient without such knowledge.
It's also important to recognize that modeling is based on subjective perceptions of reality that vary with the perspectives and informational needs of the constituencies to be served by the database. This is another way of saying that business models are informal, with commonly understood meanings, and there is no formal -- that is, theoretical -- way to prefer one perception over another. The choices made are pragmatic, based on usefulness.
Since business reality is rather amorphous and complex, a useful business model singles out only those aspects of reality that are essential, keeping things as simple as possible (but not simpler!). By "simple" I mean reliance on: 1) the most basic terms and concepts available -- language primitives that are deeply embedded in our culture and thinking and, therefore, do not require explanation; and 2) parsimony -- no more concepts and terms than are necessary.
Here are my definitions of the basic modeling concepts.
A type is the criterion for membership of an element in a set. We're concerned with two kinds of sets of elements: properties as sets of values and classes as sets of entities.
A property is a set of values. The property type is the criterion for membership of a value in the property set; it specifies by enumeration or range the set's legal values and their uses -- meaning, the operations applicable to those values.
An attribute is a subset of property values; it inherits its type from the corresponding property.
An entity is a unique set of attribute values.
A class is a set of entities; the class type is the criterion for membership of an entity in the class -- it specifies:
The attributes shared by entities in the class
Any attribute(s) referencing attributes of entities in other classes
Any other business-specific criteria for entities' membership in the class.
A business rule is a type specification statement. More specifically, a property business rule is a property type specification statement, and a class business rule is an entity type specification statement.
A business model is the intersect of the property and class business rules.
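To make these definitions concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the employee class, its attributes, the value ranges, and the sales salary cap are all hypothetical and not part of the definitions above. Each type is written as a membership test, a class is a set of entities, and an entity is a unique set of attribute values. References to entities in other classes are left out to keep the sketch short.

```python
# Property types: criteria for membership of a value in a property.
# The range and the enumeration below are invented for illustration.
def is_salary(value):
    """Property type specified by range."""
    return isinstance(value, int) and 0 <= value <= 500_000


def is_department(value):
    """Property type specified by enumeration."""
    return value in {"SALES", "ENGINEERING", "FINANCE"}


def is_emp_id(value):
    return isinstance(value, int) and value > 0


# Attributes shared by entities in the (hypothetical) employee class.
# Each attribute draws its values from a property and inherits its type.
EMPLOYEE_ATTRIBUTES = {
    "emp_id": is_emp_id,
    "salary": is_salary,
    "department": is_department,
}


def is_employee(entity):
    """Class type: the criterion for membership of an entity in the class."""
    # The entity must have exactly the attributes shared by the class.
    if set(entity) != set(EMPLOYEE_ATTRIBUTES):
        return False
    # Every attribute value must satisfy the corresponding property type.
    if not all(check(entity[name]) for name, check in EMPLOYEE_ATTRIBUTES.items()):
        return False
    # A business-specific criterion (an assumed rule, for illustration only).
    return not (entity["department"] == "SALES" and entity["salary"] > 200_000)


def add_entity(employees, entity):
    """Admit an entity to the class only if it satisfies the class type."""
    if not is_employee(entity):
        raise ValueError(f"violates the employee class type: {entity}")
    # An entity is a unique set of attribute values, so it is stored as a
    # frozenset inside a set; duplicates collapse into one element.
    employees.add(frozenset(entity.items()))
```

In this sketch, each membership test plays the role of a business rule: `is_salary` and `is_department` stand in for property business rules, and `is_employee` for a class business rule. An entity that fails any of them is rejected before it enters the class.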
I caution the reader that these terms are often used in different, vague, or inconsistent ways. Vagueness, confusion, and inconsistencies can wreak havoc with business modeling and database design. So I urge you to focus on the specific definitions used above and ignore, for the purposes of this blog, any other uses of the terms to which you may be accustomed.
I'll share examples in next month's post. But, in the meantime, let me know in the comments below if you've encountered or used these definitions in different ways.
Hi Fabian, I like the idea of laying out the definitions in this way. What's your advice on how to make sure people within a company are all working from the same dictionary, so to speak, as they work on their business modeling? Should they have a document with all of their terms spelled out, for example?
Of course, how businesses do that makes the difference between success or failure.
Large and complex databases require: (1) a formal process; (2) management support; (3) a modeler-user-manager-DBA team; (4) thorough knowledge of the business; (5) data foundation knowledge; and (6) proper documentation and design tools.
I believe that except for cases of relatively simple databases used by only a few application/users, the current practice of assigning a programmer/developer to do everything from business modeling to database design to database administration to application development will usually end up with serious problems.
The notion that eschewing this saves money and time is an illusion due to scarcity of foundation knowledge: the opposite is true. The argument that modeling and design are difficult, take a long time, are expensive and so on is certainly true, but there is no free lunch: if you want the benefits of a true db system, that's the price for it, and thinking that you can eschew it and still get the benefits is delusional. You can do it, but then you must do it consciously, aware of the benefits you give up. The industry caters to this delusion by promoting magic wands.
Simplicity and parsimony are critical. Given the complex nature of business reality and the above process, it is crucial that we don't pile complexity of methodology and tools on top of it. Everything should be as simple as possible, but not simpler. That's the issue that the relational model addresses and the failure to comprehend that is one of the core problems in IT.
Good points: more than just a programmer/developer is needed to make sense and hopefully some company profits out of the data massage.
Lots of data means lots of brainpower is needed to figure out the best ways to use it, and the persistence to keep trying new ways of looking at what's going on. It's not the first result that will define the answers.
All that data were generated for different purposes and, therefore, do not have a common structure to which common sound operations can be applied for analytical purposes. There is this common delusion that such data can be taken as is and, without any effort, can be mined for information.
Another aspect is a disregard for the crucial importance of the ratio of method and tool complexity to usefulness. The purpose of models is to simplify the reality of interest to the most essential and to structure it optimally for multiple analytical purposes. Complexity and lack of generality defeat their purpose. This is largely missed by both business and IT and the industry encourages the opposite.
Codd's genius was not only to apply theory to data management, but also that the theory provides an optimal ratio of simplicity to generality: the relational approach is the simplest way to deal with data that is general enough to handle a vast majority of informational needs. In my paper "Business modeling for database design" I specify the criteria that an alternative approach must satisfy to be superior. Unless it does, we trade down. The trite statement "the right tool for the right job" more often than not reflects a lack of foundation knowledge.