Understanding Comes With Normalized Medians


In coaching a group of smart but unschooled people through solving a problem with normalized median graphs, I learned much about the intersection between analytics and other functions like reporting data, perceiving patterns, presenting results, and setting policy.

The group comprised my students at a for-profit career college: barely literate single parents and eighth- to tenth-grade dropouts. Most of the women were unwed mothers, and more of the men had prison records than had high school diplomas. In one much-dreaded oral communication assignment, students had to present numeric information orally and visually and explain how it justified a specific conclusion -- a vital skill in starter-level professional jobs.

That same week, the administration had mandated that we use some classroom time to reinforce attendance. Our sessions were deliberately extra long, so students could complete all work in class, since a working single mother or a guy working two jobs and living in a party-prone relative's basement may not have time, space, or quiet to do homework. Not surprisingly, students with perfect attendance tended to get As or Bs. Those with poor attendance failed and dropped out.

So I asked the students, Suppose you're me, and you have to show students like you why attendance is essential. Here's a graph of hours of class attended versus point totals for a previous class. Does this convince you?

"No, it's just dots on a grid."

So what would make it convincing?

"Explain what all the parts are."

I explained the axes and the up-right trend: Students with poor attendance had low points.

"Yeah, but we get a point for every hour attending."

What if we subtract attendance points from total points? Here's a new graph.

Would that convince you?

"It's still just different places on a grid, and there's some higher scores with lower attendance and lower scores with higher attendance."

What if we split things up?

I introduced using the median to divide both axes and using differences from the median, rather than raw scores, so the low group on each axis would all have negative numbers and the high group positive. Graphing difference from the median, and drawing in the medians as heavy dark lines, I asked: Would this get people not to miss class?
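
For readers who want to try the median step themselves, here is a minimal sketch in Python. The attendance hours and point totals are made up for illustration (they are not the class's records), and the helper name `diff_from_median` is my own label, not a library function.

```python
from statistics import median

# Hypothetical data for illustration only: hours attended (out of 48)
# and points earned with the attendance points already subtracted.
hours = [48, 46, 45, 38, 33, 22, 15, 10]
points = [430, 410, 270, 390, 295, 260, 120, 90]


def diff_from_median(values):
    """Return each value minus the median of the whole list.

    Values below the median come out negative (the "low" group),
    values above it come out positive (the "high" group).
    """
    m = median(values)
    return [v - m for v in values]


hours_diff = diff_from_median(hours)
points_diff = diff_from_median(points)

for h, p in zip(hours_diff, points_diff):
    print(f"hours {h:+7.1f}   points {p:+7.1f}")
```

Plotting `hours_diff` against `points_diff` puts both medians at 0, so the heavy median lines are simply the two axes.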

"Yeah, but explain all that about the median."

"Yeah, but show how most people were in the high-high or the low-low box."

"Yeah, and the high-low or low-high people are all on the borders, not the middle of their boxes."

Are we done?

"Wait! The way it's drawn, points go 0 to 500, and attendance goes 0 to 48 -- points are about 10 times hours. That's why it looks like one hour makes such a big difference."

I explained normalization -- i.e., rescaling the differences from the median on each axis so they all fall between -1 and 1 -- by dividing each difference by the highest absolute value on its axis, so that each axis stays centered on 0 and both axes share the same scale.
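
As a rough sketch of that rescaling, assuming the same kind of made-up data as before (the numbers and the function name `normalized_median_diff` are illustrative, not from the class): take each value's difference from the median, then divide by the largest absolute difference on that axis.

```python
from statistics import median


def normalized_median_diff(values):
    """Center values on their median, then scale into the range [-1, 1].

    Each difference from the median is divided by the largest absolute
    difference, so the most extreme value lands on exactly -1 or 1.
    """
    m = median(values)
    diffs = [v - m for v in values]
    scale = max(abs(d) for d in diffs)
    if scale == 0:
        # Every value equals the median; nothing to rescale.
        return [0.0 for _ in diffs]
    return [d / scale for d in diffs]


# Hypothetical data for illustration only.
hours = [48, 46, 45, 38, 33, 22, 15, 10]          # out of 48
points = [430, 410, 270, 390, 295, 260, 120, 90]  # out of 500

hours_norm = normalized_median_diff(hours)
points_norm = normalized_median_diff(points)

for h, p in zip(hours_norm, points_norm):
    print(f"hours {h:+.2f}   points {p:+.2f}")
```

After this step an hour and a point no longer differ by a factor of ten on the page: both axes run from -1 to 1 around their medians.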

Here's that graph. Would it convince you?

"You have to explain normalization, too."

I had been writing the need-to-explain points on the board opposite the screen. By class vote, we ordered them into:

A. Axes: 1. Points minus attendance points; 2. Attendance
B. Differences from medians (make negative=low, positive=high)
C. Normalized to avoid exaggerating the point-to-hour relationship
D. Most high point students have high attendance (the high-high box)
E. Most low point students have low attendance (the low-low box)
F. Exceptions are close to the borders
G. So come to class and put yourself into the high-high box.
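
Points D through F can be checked with a simple tally. The sketch below assumes made-up data and a hypothetical helper, `quadrant_counts`, that sorts each student into one of the four boxes formed by the two medians.

```python
from collections import Counter
from statistics import median


def quadrant_counts(hours, points):
    """Count students in each box formed by the two median lines."""
    mh, mp = median(hours), median(points)
    counts = Counter()
    for h, p in zip(hours, points):
        if h == mh or p == mp:
            # Exactly on a median line: belongs to no box.
            counts["on a median line"] += 1
            continue
        h_side = "high" if h > mh else "low"
        p_side = "high" if p > mp else "low"
        counts[f"{h_side} attendance / {p_side} points"] += 1
    return counts


# Hypothetical data for illustration only.
hours = [48, 46, 45, 38, 33, 22, 15, 10]
points = [430, 410, 270, 390, 295, 260, 120, 90]

counts = quadrant_counts(hours, points)
for box, n in counts.most_common():
    print(f"{box}: {n}")
```

With these invented numbers, six of the eight students fall in the high-high or low-low box, and the two exceptions sit within a few hours or points of a median line.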

Then I said: Now, stand by that graph, point when you need to, and give your speech from this outline. Not for grade, just for practice. Mariah, you're up, then Kevin. (I chose the two biggest hams in the class, though I've changed the names.) Everybody else, psych up; you're all going to do it.

They spoke from that outline, pointed to that graph, and accepted crowd advice ("That's the points that ain't from attendance, not the points for not attending!").

Now, all you have to do for a grade is graph some data you think is interesting, see some meaning in your graph, and explain your graph and what it means. (We had already covered finding data and graphing in Excel.)

Not only did they do well on the assignment, but for the rest of the term, I had near-perfect attendance. The recipe for their success:

  • Tinker with the picture until you can see the relationship and you know it's not bogus.
  • Explain the picture.
  • Connect your explanation to your recommendation for action.

Sadly, even million-dollar consultants don't always do that.

John Barnes, Freelance Writer

John Barnes has published 30 commercial novels (mostly science fiction, including two collaborations with astronaut Buzz Aldrin), 53 articles in The Oxford Encyclopedia of Theatre and Performance, more magazine articles than he can remember, and around 30 short stories. Tales of the Madman Underground, Barnes's first "officially" young adult novel, received a Printz Honor Prize at the 2010 American Library Association national convention, and his technothriller, Directive 51, was briefly on the New York Times bestseller list in 2011. His 1990 article, "How to Build a Future," about applying social science forecasting to creating backgrounds for science fiction, has been widely reprinted, and he's still getting email about it. In his twenties, John worked in an R&D shop on reliability math applied to the problems of relational databases and testing/validation; in his thirties he consulted on the connection between document systems design and natural language interfaces. He has taught college courses in theatre, communications, literature, writing, mathematics, political science, economics, and philosophy, and written what was probably the most math-heavy theatre dissertation ever (applying statistical semiotics to the problem of defining basic terms in theatre history). Recently he has pioneered applying statistical semiotics to strategic, analytic, and tactical marketing problems, poll analysis, and trendspotting, and consulted for a variety of firms and government agencies. He lives in Denver, Colorado. His personal blog is Approachably Reclusive.


Another example for the interested
  • 3/6/2012 10:15:34 PM

I recently had occasion to use a normalized median graph in discussing a model of global warming that I'm constructing as part of the research for a science fiction novel. If you want to see some more brutal assaults on reality, in which, armed only with a few simple numbers you may have around the house, I chase reality into a dark alley and beat it till it says what I want it to, you can find that here. Warning -- longish piece and the normalized median graphs don't come into it till fairly late.

Re: Normalization : More Than Meets the Eye
  • 2/25/2012 8:00:55 PM

Go for it.  Remember who the original audience was: numerous people that school didn't stick for, many trying to deal with the consequences of past irresponsibility, and not a few criminals.  Perfect test demographic for public sector decisionmakers! <g>

Re: Normalization : More Than Meets the Eye
  • 2/25/2012 7:40:01 PM

John, this explanation of normalizing data was mind-blowing in its clarity.

Seems like a good quick-and-dirty way to compare disparate data and suggest some implied correlations.

At first I thought it might be kind of hard to explain this presentation to decisionmakers (most public sector decisionmakers in my case) but I'm thinking it could be done and could prove useful. (Sound of mental wheels grinding...)

Re: Hit and trial in analytics too
  • 2/24/2012 11:29:54 AM

I'm guessing it has more to do with simplicity; i.e., avoiding the hassles, inefficiencies, and personality conflicts of group decision-making.

Additionally, a generalist who understands (even if only rudimentarily so) all aspects of an operation is better positioned to coordinate and make decisions that affect disparate parts.

Besides, if things go poorly, one guillotine is easier to clean than three.  ;)

Re: Hit and trial in analytics too
  • 2/24/2012 9:55:17 AM

" we are building a generation of technology specifically aimed entirely at delivering data visualization without necessarily the need for key decision-makers to know the math...but, as you said, this makes them completely reliant on the "graphmakers." 

There is no harm in reliance, as the decision makers are usually the executives and are not expected to get involved in the calculation/math process. They should just ensure that the staff who develop the data and graphs are competent enough to understand the objective of what was required and take responsibility for it. Further, the graphmakers would probably be the ones who compile the data, so they are the source. Relying on their graphs means relying on their data.

Re: Hit and trial in analytics too
  • 2/24/2012 5:59:12 AM

...And I've known too many graphic artists to trust them with that much power!

I totally picture what you mean here. Indeed, I hope that these tools aren't actually designed by graphic artists, but rather by mathematicians and draftsmen of similar engineering minds.

Re: Hit and trial in analytics too
  • 2/24/2012 5:58:00 AM

To some extent it's just the culmination of decision making based on staff research, white papers, and "I've got people."  I can't help wondering, though, if the function of the decision maker is to ratify what the people who do understand the white papers and graphs have already worked out ... maybe we don't need that decision maker.

Somewhere early on in A Tale of Two Cities, Dickens pointed out that nearly every important job in pre-Revolutionary France had a literal bigwig (some of those wigs were 2 feet tall) who collected a large salary and took public credit, and a few commoner clerks who collected much smaller salaries and did the job.  That particular story, for some reason or other, ends at the guillotine ....

With so much chatter about leadership and decision making and so forth filling the racks in the biz-book sections of bookstores, and packing people in at seminars and training sessions, perhaps we should spare a moment to ask whether it's the ability to say "Oh, yes, do that, good idea" that is really the critical part of the process.

Re: Hit and trial in analytics too
  • 2/23/2012 11:50:48 PM

Hi John,

To your last point, interesting isn't it that in analytics we are building a generation of technology specifically aimed entirely at delivering data visualization without necessarily the need for key decision-makers to know the math...but, as you said, this makes them completely reliant on the "graphmakers." 

Re: Hit and trial in analytics too
  • 2/23/2012 7:25:47 PM

Well, obviously I come from a somewhat different perspective, but I do think that if you don't know what the math is under the graph, you are a prisoner of the graphmaker.  And I've known too many graphic artists to trust them with that much power!

Re: Hit and trial in analytics too
  • 2/23/2012 4:26:00 PM

There's quite a technical side to analytics. Thankfully, though, there are now many tools that can help decompose complex data into diagrams, which are a lot easier to interpret. I've used Tableau for data visualization with quite positive results, and thank God I didn't have to do all that math!
