If anyone asked you about the relationship between Aristotle and analytics, your immediate response might be, "What does philosophy have to do with number crunching?" After all, SAS CEO Jim Goodnight has never been seen wearing a toga when speaking at analytics conferences.
However, Aristotle’s influence on analytics is represented by categorical logic, his system of subject/predicate analysis.
Categorical logic holds that there are three types of subject/predicate (or variable/value) relationships: 1) synonyms; 2) homonyms; and 3) paronyms.
Synonyms (univocals) are relationships in which multiple variables map to a single value. Homonyms (equivocals) are relationships in which a single variable maps to multiple values. Paronyms (derivatives) are relationships in which multiple but distinct variable/value pairs represent divisions of a broader concept. In his work Categories, Aristotle expressed his thoughts on subject/predicate relationships:
Things are said to be named 'equivocally' when, though they have a common name, the definition corresponding with the name differs for each. Thus, a real man and a figure in a picture can both lay claim to the name 'animal'; yet these are equivocally so named, for, though they have a common name, the definition corresponding with the name differs for each. For should any one define in what sense each is an animal, his definition in the one case will be appropriate to that case only.
On the other hand, things are said to be named 'univocally' which have both the name and the definition answering to the name in common. A man and an ox are both 'animal', and these are univocally so named, inasmuch as not only the name, but also the definition, is the same in both cases: for if a man should state in what sense each is an animal, the statement in the one case would be identical with that in the other.
Things are said to be named 'derivatively', which derive their name from some other name, but differ from it in termination. Thus the grammarian derives his name from the word 'grammar', and the courageous man from the word 'courage'.
In short, Aristotle classified objects and their characteristics into three major groups. First, he noted that many objects share the same name but have different characteristics (equivocals). Second, he noted that many objects have different names but share the same characteristics (univocals). And third, he noted that many objects have distinct names and characteristics but share a common origin (derivatives).
The following table shows how these three components of categorical logic relate to analytics:
Table 1: Analytics & Aristotle's Categorical Logic
| Category | Nature of Relationship | Definition | Data Preparation Rule | Example |
|---|---|---|---|---|
| Univocals: many speak as one. | MANY : 1 | Multiple variables map to the same idea. | When multiple variables share the same meaning, extract only one variable. | Since 'SEX' and 'GENDER' typically mean the same thing, keep only one of the two variables. |
| Equivocals: one speaks as many. | 1 : MANY | One variable maps to multiple ideas. | When one variable has multiple values, transform all the values into a common value. | The attributes of GENDER typically map to 'F' or '1' for FEMALE, and 'M' or '2' for MALE. Use either the character or numeric value to represent GENDER. |
| Derivatives: one idea is expressed by subsets of distinct voices. | MANY(1 : 1) : 1 | Multiple unique variable/value pairs map to the same idea domain. | When designing fact and dimension tables, load the data elements with discrete but related attributes into the same table. | Name, Street Address, Phone Number, City, State, and ZIP logically fit together on a CONTACT INFORMATION table. |
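To make the first two rows of the table concrete, here is a minimal Python sketch, assuming records arrive as plain dictionaries. The helper names `normalize_gender` and `drop_synonym`, the `GENDER_CODES` lookup, the sample records, and the choice of the character code as the common value are all illustrative assumptions rather than anything prescribed above.

```python
# Hypothetical lookup: each GENDER encoding from the table's example,
# mapped to one common character code.
GENDER_CODES = {"F": "F", "1": "F", "M": "M", "2": "M"}


def normalize_gender(value):
    """Equivocal rule: collapse mixed encodings into one common value."""
    key = str(value).strip().upper()
    if key not in GENDER_CODES:
        raise ValueError(f"unrecognized GENDER value: {value!r}")
    return GENDER_CODES[key]


def drop_synonym(record, keep="GENDER", drop="SEX"):
    """Univocal rule: keep one of two variables that carry the same meaning."""
    cleaned = dict(record)  # copy so the caller's record is not modified
    if drop in cleaned:
        dropped = cleaned.pop(drop)
        if keep not in cleaned:
            # Only the synonym was present, so carry its value over.
            cleaned[keep] = dropped
        elif normalize_gender(cleaned[keep]) != normalize_gender(dropped):
            # The two variables disagree, so they are not safe to merge.
            raise ValueError(f"{keep} and {drop} disagree in {record!r}")
    if keep in cleaned:
        cleaned[keep] = normalize_gender(cleaned[keep])
    return cleaned


# Made-up sample records mixing character and numeric encodings.
records = [
    {"ID": 1, "SEX": "F", "GENDER": 1},
    {"ID": 2, "SEX": "2", "GENDER": "M"},
    {"ID": 3, "SEX": "f"},
]
cleaned = [drop_synonym(r) for r in records]
```

Raising an error when SEX and GENDER disagree is a design choice in this sketch: two variables only count as synonyms if they actually carry the same information, so a mismatch is worth surfacing instead of silently discarding one of them.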
Aristotle’s ancient work provides a modern framework for preparing data for analysis:
- During the extraction process, identify the synonyms and eliminate the excess variables.
- During the transformation process, identify the homonyms and eliminate the excess values.
- During the loading process, identify the paronyms and group all of the similar analysis variables and classification variables.
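The loading step in the list above could look something like the following Python sketch, which routes the related contact attributes of a flat record to one table and everything else to another. The column names follow the CONTACT INFORMATION example; the `CONTACT_COLUMNS` tuple, the `split_for_load` function, the `ID` key, and the sample record are assumptions made up for illustration.

```python
# Columns that the CONTACT INFORMATION example groups together
# (the exact spellings are an assumption of this sketch).
CONTACT_COLUMNS = ("NAME", "STREET_ADDRESS", "PHONE_NUMBER", "CITY", "STATE", "ZIP")


def split_for_load(record, key="ID", group=CONTACT_COLUMNS):
    """Paronym rule: load related attributes into the same table.

    Returns (contact_row, other_row); both keep the key so they can be joined.
    """
    contact_row = {key: record[key]}
    other_row = {key: record[key]}
    for column, value in record.items():
        if column == key:
            continue
        target = contact_row if column in group else other_row
        target[column] = value
    return contact_row, other_row


# Made-up flat record mixing contact details with analysis variables.
flat = {
    "ID": 7,
    "NAME": "A. Example",
    "CITY": "Anytown",
    "STATE": "NC",
    "ZIP": "00000",
    "GENDER": "F",
    "PURCHASE_TOTAL": 42.5,
}
contact, other = split_for_load(flat)
```

Keeping the key in both rows is what lets the grouped contact attributes be joined back to the remaining analysis and classification variables later.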
This data preparation framework works for me. What works for you?