Viral Vampires: How to Measure a Marketing Hashtag's Potency

The epidemiological equation is a basic tool for understanding the viral nature of messages, and its j and k statistics, if properly reported, are powerful, revealing social media metrics. Here's an example.

The epidemiological equation is R_{t+1} = k R_t (1 - R_t) - j R_t, where:

  • t=observation period. For example, day, month, or hour; if you take your first observation at 9:00 a.m. and hourly thereafter, at 9:00 a.m. t=1, at 10:00 a.m. t=2, at 9:00 p.m. t=13.
  • R=rate of participation. What fraction of the relevant people are repeating the message, infected with the disease, humming the tune, wearing the shoes?
  • j=die-off. What fraction of participants at period t will stop participating by period t+1? It has meaningful values between 0 percent and 100 percent.
  • k=virality (think "konversion" or "kontagiousness"). How fast do participants convert (or infect) nonparticipants? For math reasons we won't go into here, k can be anywhere between -4 and +4.
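To make the definitions above concrete, here is a minimal Python sketch that steps the equation forward one period at a time. The function names are mine, not part of any library. The starting value 3/107 comes from the Day 1 count reported later in this post (treating it as R1 is an assumption), and k and j are the "individual" estimates given further down. Clamping R to the range 0 to 1 is also an assumption of the sketch, made because R is a fraction of people.

```python
def step(r, k, j):
    """One period of the equation: R[t+1] = k*R[t]*(1 - R[t]) - j*R[t]."""
    nxt = k * r * (1.0 - r) - j * r
    # Assumption of this sketch: keep R inside [0, 1], since it is a fraction.
    return min(1.0, max(0.0, nxt))


def simulate(r1, k, j, periods):
    """Return the list [R1, R2, ..., R_periods] produced by repeated stepping."""
    series = [r1]
    for _ in range(periods - 1):
        series.append(step(series[-1], k, j))
    return series


if __name__ == "__main__":
    # Illustrative inputs only: R1 = 3/107, k = 2.4, j = 0.52.
    for t, r in enumerate(simulate(3 / 107, k=2.4, j=0.52, periods=9), start=1):
        print(f"t={t}  R={r:.3f}")
```

Changing k and j in the call is a quick way to get a feel for how the two statistics shape the curve.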

Let's see what we can learn by estimating the values of j and k in a real-world situation.

Currently a hot area in fiction publishing, traditional and indie, is urban fantasy: the publisher marketing category for books about vampires, werewolves, and other magical beings in the modern industrial world. How effectively will the hashtag #urbanfantasy promote an urban fantasy novel on Twitter?

I manually captured who was tweeting the hashtag #urbanfantasy, and how often, across an easily accessed nine-day period. Here's what one afternoon's quick analysis showed about #urbanfantasy's potential as a marketing hashtag:

On arbitrarily-chosen Day 0, #urbanfantasy did not appear on Twitter. On Day 1, three people tweeted it; during eight subsequent days a densely-connected community of 107 people tweeted #urbanfantasy with these frequencies:

Those percentages are R1 through R9. On a smoothed graph:

Since we have R_t for t=1 to 9, we can estimate k and j by three different methods:

  • Lagged-variable regression.
  • Brute-force with a sum of squared errors (SSE) test, from the individual behavior of the function (the transition from R1 to R2 is treated separately from the transition from R5 to R6, or any other adjacent transition, and they're all scalars).
  • Brute-force with an SSE test, from the trajectory of the function (the pathway from R1 to R9 is treated as one transition, a vector).
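The two brute-force methods in the list above can be sketched as a plain grid search. This is one possible reading, not the exact procedure behind the numbers in this post: the grid spacing, the function names, and the clamping of predictions to the range 0 to 1 are all assumptions. The "individual" score predicts each R_{t+1} from the observed R_t; the "trajectory" score starts at the observed R1 and lets the model run on its own predictions.

```python
def model(r, k, j):
    """Predicted next-period rate, clamped to [0, 1] (clamping is an assumption)."""
    return min(1.0, max(0.0, k * r * (1.0 - r) - j * r))


def sse_individual(obs, k, j):
    """Score each adjacent transition separately, always starting from observed data."""
    return sum((model(obs[t], k, j) - obs[t + 1]) ** 2 for t in range(len(obs) - 1))


def sse_trajectory(obs, k, j):
    """Score the whole path: start at obs[0] and feed predictions back into the model."""
    r, total = obs[0], 0.0
    for target in obs[1:]:
        r = model(r, k, j)
        total += (r - target) ** 2
    return total


def grid_search(obs, sse, step=0.1):
    """Try every (k, j) with k in [-4, 4] and j in [0, 1]; return (best_sse, k, j)."""
    best = None
    for a in range(round(8 / step) + 1):
        k = -4.0 + a * step
        for b in range(round(1 / step) + 1):
            j = b * step
            err = sse(obs, k, j)
            if best is None or err < best[0]:
                best = (err, k, j)
    return best


if __name__ == "__main__":
    # Synthetic series generated from the model itself (NOT the Twitter data).
    obs = [3 / 107]
    for _ in range(8):
        obs.append(model(obs[-1], 2.4, 0.5))
    print("individual:", grid_search(obs, sse_individual))
    print("trajectory:", grid_search(obs, sse_trajectory))
```

On noise-free synthetic data both searches land on the same grid point. On real observations they generally will not, and the gap between them is itself informative.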

I applied all three. Regression gave a meaningless negative value for j, with enormous error bars. Inspecting the residuals revealed that the failure of lagged-variable regression actually was good news, because the cause of this misbehavior was that R_t and R_t^2 are not fully independent and are of nearly equal importance. Interdependence is intrinsic to the math, but coequality of significance confirms that the structure of the epidemiological model is appropriate to the problem.
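For readers wondering where the squared term comes from, expanding the equation shows it directly. The regression form in the comment lines (two regressors, no intercept, coefficients named beta_1 and beta_2) is my assumption about the setup, not a specification given in this post.

```latex
\begin{align*}
R_{t+1} &= k R_t (1 - R_t) - j R_t \\
        &= (k - j)\, R_t \;-\; k\, R_t^2
\end{align*}
% Assumed regression form: R_{t+1} = \beta_1 R_t + \beta_2 R_t^2
% Matching coefficients: \beta_1 = k - j and \beta_2 = -k, so
%   k = -\beta_2, \qquad j = -(\beta_1 + \beta_2).
```

Because the second regressor is just the square of the first, the two can never be fully independent, and j is recovered only as a combination of the two fitted coefficients, so errors in both feed into it.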

So, as often happens in these problems, regression simultaneously tells us we're on the right track and refuses to give an answer. The individual brute force behavior estimates were k=2.4, j=0.52, with an SSE of about 0.27; the trajectory brute force behavior estimates were k=3.2, j=1, with an SSE of about 0.24. In a small case like this, chances are that the true j and k lie between the individual and trajectory estimates, and because the exact values of j and k are less important than the behavioral range they fall into, those numbers tell us:

  1. The social process runs on an internal dynamic. It is not being driven by outside factors like a celebrity, holiday, or news event. An insignificant difference in SSE between the trajectory and individual approaches indicates this.

  2. #urbanfantasy has lousy persistence. Persistence=100%-j, so it's somewhere between 0 percent and 48 percent -- an F at any school.

  3. #urbanfantasy is so viral it's chaotic. It's retweeted in sudden up-and-down bursts, sometimes absent completely, sometimes sweeping the community. Scoring virality as (k+4)/8 to put it on the same "grade book" basis as persistence, k=2.4 corresponds to 80%, k=3.2 to 90%: a solid B in virality. It may not last long, but it's catchy.
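The "grade book" arithmetic in the list above is simple enough to check in a few lines of Python. The function names are hypothetical; the two formulas, persistence = 100% - j and virality = (k+4)/8, are the ones used in the text.

```python
def persistence_score(j):
    """Fraction of participants who keep participating: 1 - j."""
    return 1.0 - j


def virality_score(k):
    """Rescale k from its range of -4 to +4 onto a 0-to-1 scale: (k + 4) / 8."""
    return (k + 4.0) / 8.0


if __name__ == "__main__":
    # The individual and trajectory estimates quoted in this post.
    for label, k, j in [("individual", 2.4, 0.52), ("trajectory", 3.2, 1.0)]:
        print(f"{label}: persistence={persistence_score(j):.0%}  "
              f"virality={virality_score(k):.0%}")
```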

These three conclusions suggested some further non-epidemiological analysis, which revealed two more points of interest:

  1. Overcoming #urbanfantasy's persistence problem would be cheap. A log R:log F regression showed that the #urbanfantasy community fits a Zipf distribution, with the No. 1 user tweeting the hashtag 56 times, and 65 users only tweeting it once. The top nine (out of 107) participants accounted for exactly half of all tweets. If we target those urban fantasy evangelists with advance reading copies (ARC) with "urban fantasy" prominent on the cover, we can probably start an #urbanfantasy twitterdemic about a book.
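As a rough illustration of the Zipf check described above, the sketch below fits a least-squares line to log(frequency) against log(rank) and computes the share of tweets owed to the top users. I am reading "log R:log F" as log rank versus log frequency (this R is not the participation rate), which is an assumption, as are the function names. The counts in the demo are synthetic, not the #urbanfantasy data.

```python
import math


def zipf_slope(counts):
    """Least-squares slope of log(frequency) on log(rank); near -1 suggests Zipf."""
    ranked = sorted(counts, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(c) for c in ranked]  # every count must be positive
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx


def top_share(counts, n):
    """Fraction of all tweets contributed by the n most active users."""
    ranked = sorted(counts, reverse=True)
    return sum(ranked[:n]) / sum(ranked)


if __name__ == "__main__":
    # Synthetic, idealized Zipf counts for 50 hypothetical users.
    counts = [1000 / rank for rank in range(1, 51)]
    print("slope:", round(zipf_slope(counts), 3))
    print("share of top 5 users:", round(top_share(counts, 5), 3))
```

With real tweet counts, the same two numbers would show how concentrated the hashtag's activity is and how few people a seeding campaign would need to reach.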

  2. Twitter provides tremendous potential leverage for #urbanfantasy. Plugging the trajectory j and k estimates into the equation, we can calculate that if the nine most active point-of-contact social users tweeted about a book and hashtagged it #urbanfantasy, 37,000 followers would receive at least one retweet or response in one week. That's about 60 percent penetration of the 62,734 unique followers of the 107 active tweeters (unique means people who followed more than one were counted only once).

Of course, this was all an afternoon's amusement. For real-world clients, I'd examine a longer period and test additional hashtags like #sexyvampires, #shapeshifters, and #urbanwerewolves, seeking a hashtag that was at least as contagious (k=2.4 or better) and much more persistent (j=0.3 or lower) -- ideally a "straight A" hashtag (i.e. j<0.1 and k>3.2) reaching a bigger community. But even if we didn't... Hey, nine ARCs with the right two words reach 60 percent of a sizable, passionate reader community in one week.

The verdict is, #urbanfantasy might not be the very best hashtag for helping your vampire-slayer novel go viral, but it will certainly do till you find something better.

John Barnes, Freelance Writer

John Barnes has published 30 commercial novels (mostly science fiction, including two collaborations with astronaut Buzz Aldrin), 53 articles in The Oxford Encyclopedia of Theatre and Performance, more magazine articles than he can remember, and around 30 short stories. Tales of the Madman Underground, Barnes's first "officially" young adult novel, received a Printz Honor Prize at the 2010 American Library Association national convention, and his technothriller, Directive 51, was briefly on the New York Times bestseller list in 2011. His 1990 article, "How to Build a Future," about applying social science forecasting to creating backgrounds for science fiction, has been widely reprinted, and he's still getting email about it. In his twenties, John worked in an R&D shop on reliability math applied to the problems of relational databases and testing/validation; in his thirties he consulted on the connection between document systems design and natural language interfaces. He has taught college courses in theatre, communications, literature, writing, mathematics, political science, economics, and philosophy, and written what was probably the most math-heavy theatre dissertation ever (applying statistical semiotics to the problem of defining basic terms in theatre history). Recently he has pioneered applying statistical semiotics to strategic, analytic, and tactical marketing problems, poll analysis, and trendspotting, and consulted for a variety of firms and government agencies. He lives in Denver, Colorado. His personal blog is Approachably Reclusive.

Re: Leading off with a correction ...
  • 12/4/2011 8:34:00 PM

Would the tool to track hashtags be similar to tools used now to track which keywords are the "hottest" for SEO?

Re: Leading off with a correction ...
  • 12/3/2011 8:49:39 PM

Thank you John for this great example.  I really like how you not only addressed the problem but addressed the solutions. Reading your blogs is like taking a class. (A good one!)  A good save for later. 

Re: Leading off with a correction ...
  • 11/30/2011 12:12:28 AM

This is fascinating stuff, John (as usual with your posts).  Looking forward to your next one!

Re: Leading off with a correction ...
  • 11/22/2011 11:40:42 PM

There's no question it would be doable. Actually sometime soon I'll probably blog someplace or other about why semiautomatic coding -- i.e. mechanized systems to make human beings much more efficient but leave them in the loops -- is actually likely to be the most effective.

John, can't wait to hear more. My feeling is this would be a winning application and a great way of tracking the influence of hashtags. Thanks for sharing.

Re: Leading off with a correction ...
  • 11/22/2011 10:34:16 PM

There's no question it would be doable.  Actually sometime soon I'll probably blog someplace or other about why semiautomatic coding -- i.e. mechanized systems to make human beings much more efficient but leave them in the loops --  is actually likely to be the most effective.

Re: Leading off with a correction ...
  • 11/21/2011 11:40:03 PM


I was thinking the same thing. In fact, what a great concept for an analytics tool (or a feature for an existing tool.) You can follow others using the same hashtag by simply clicking into the hashtag stream, but this gives you only a feel for the kinds of posts being made. Something really cool would be an analytics or insight tool that could effectively track, catalog and manage hashtag data so that hashtag trends could be adequately analyzed.  

Re: Leading off with a correction ...
  • 11/19/2011 8:23:05 PM

John, if someone did a longer analysis of hashtag use and effectiveness, any less manual ways to track it than yours?

Re: Leading off with a correction ...
  • 11/18/2011 5:19:06 PM


I would think finding measurements for the effectiveness of marketing online and off continues to be a major challenge and may not be able to be resolved 100 percent. After all, the superiority of PPC marketing was extolled at the outset but now, of course, there are doubts about its effectiveness as well. Just because we know someone looked at your ad doesn't mean they bought anything. Better than a newspaper ad but far from perfect. Maryam has spoken about some elaborate advertising codes that were used to track effectiveness even back in the age of print, but to what extent this kind of technique would be effective in social media is uncertain.

Re: Leading off with a correction ...
  • 11/18/2011 1:33:51 PM

Shawn, that is the real challenge.

Right now we don't have any real clue about the monetization process.  Neither has traditional advertising or marketing (except for direct mail), of course, but when they were the only game in town, nobody noticed (if all cars are black there's not much point in taking a customer survey about how much car color is affecting their decision).

Some possibilities would be to survey buyers to see how many remembered the message; offer a got-it rebate through the tweeted channel; or use the hashtag to promote a temporary discount site that wasn't publicized in any other way (and then see how much the non-promoted correlated with the promoted).  Elsewhere I've written about how I used frequency counts on reader reviews to position a book of mine, and then used frequency counts to determine how many reviewers of the new book appeared to have picked up the language from the marketing copy, but that doesn't actually reveal that any buyers decided to buy based on any particular language.

All of those have gaping holes methodologically, but might be better than nothing. Buyer surveys post-purchase notoriously involve people rationalizing and re-writing their decisions to what they think the survey-takers want to hear; rebates and promos tied to hashtags are routinely searched for these days, so the causality is likely to be badly skewed. People don't report all their purchases online, thank heaven; lifted phrases in reviews might have been lifted by people looking for the right words at the time of reviewing rather than influenced by them at the time of buying; and most of all, books are consumed on very long delays anyway -- often several months -- so much so that it's a cliche in book review blogs that many reviewers don't remember why they picked up the book in the first place. Nonetheless, there may be some relation that would be enough to work with. The research just isn't there yet.

Re: Leading off with a correction ...
  • 11/18/2011 1:12:25 PM

Hi John,

Analysis that would shame the average social media marketer. I'm wondering how you might go about tracking such a campaign long-term and how/if you could adequately and effectively measure its impact on, say, book sales beyond the social media space, always the real challenge, it seems.
