- by kq4ym, Data Doctor
- 11/11/2017 9:59:42 AM
A very good question. Getting to the optimal number of clusters and figuring out the validity of the correlations may not be as simple as it seems. But it does look like it may be a solution to some problems where the ideas may bear fruit, if not "know which birds flock together."
- by bkbeverly, Data Doctor
- 11/9/2017 9:27:06 AM
Basically cluster analysis, factor analysis, path analysis, etc. - these are all parametric methods that look at how variances of independent variables are grouped or linked. These techniques have been used almost since fire was discovered, but fell out of favor because without modern technology, they were cumbersome/brutal to calculate (in fact, in 1981, we had to learn and do path analysis for causal models by hand). What the modern tools do is make them easier to use by providing GUIs and making them accessible via the web. I am happy to see Pierre surface this technique - it means that the old stuff is still the good stuff.
- by Lyndon_Henry, Blogger
- 11/9/2017 9:07:49 AM
Not being familiar with this data analytics technique, I did a bit of Googling for info. I found the following article to be particularly helpful:
It provides the following simply explained overview:
Clustering is the task of dividing the population or data points into a number of groups such that data points in the same group are more similar to each other than to those in other groups. In simple words, the aim is to segregate groups with similar traits and assign them into clusters.
Let's understand this with an example. Suppose you are the head of a rental store and wish to understand the preferences of your customers to scale up your business. Is it possible for you to look at the details of each customer and devise a unique business strategy for each one of them? Definitely not. But what you can do is cluster all of your customers into, say, 10 groups based on their purchasing habits and use a separate strategy for the customers in each of these 10 groups. And this is what we call clustering.
I found this fairly helpful in getting a clearer understanding of clustering and its value.
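For anyone who wants to see the rental-store idea in action, here is a minimal sketch in plain Python. Everything in it is an assumption made for illustration: the customer figures (rentals per month, average spend) are invented, the `kmeans` helper is a hypothetical name, and k-means is just one of several clustering methods the quoted overview could apply to. It uses 3 groups rather than 10 to keep the toy data small.

```python
import math

def kmeans(points, k, iterations=100):
    """Basic k-means with a deterministic farthest-point start (illustrative only)."""
    # Initialisation: take the first point, then keep adding the point
    # that lies farthest from every centroid chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        farthest = max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        centroids.append(farthest)

    labels = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so we have converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels, centroids

# Hypothetical customers: (rentals per month, average spend per rental)
customers = [
    (1, 5), (2, 6), (1, 7),        # occasional, low spend
    (10, 40), (11, 42), (12, 39),  # regular, high spend
    (25, 5), (26, 7), (24, 6),     # frequent, low spend
]

labels, centroids = kmeans(customers, k=3)
for customer, label in zip(customers, labels):
    print(customer, "-> group", label)
```

Each group could then get its own strategy, as the quoted example suggests. With real data the two features would normally be put on a comparable scale first, and the number of groups would need to be chosen with some care rather than fixed in advance.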
- by SethBreedlove, Data Doctor
- by Zimana, Blogger
- 11/2/2017 6:32:23 PM
Thanks - I've been seeing a few of these types of posts on R-Bloggers, but they're more technical and focused on programmer usage and observation. One aspect that has been changing in analytics is how much programmer perspective and ability has been infused into the tasks required to analyze data and provide the story.
- by bkbeverly, Data Doctor
- 11/2/2017 12:42:36 PM
Wow - cluster analysis - that takes me back to circa 1980. Thanks for surfacing this topic. I can't recall the last time I saw anything on it explicitly.