I remember that several years back, hosting applications like OddCast on a Website and having computer-generated avatars greet visitors was a cool thing to do. The avatars seemed to follow mouse movements with their fake eyeballs!
Well, the human part, curation, might come to an end sooner than we think.
Kris Hammond, one of the founders and CTO of Narrative Science, a company working on technologies to generate narratives from data, thinks that in 15 years, 90 percent of news stories will be computer generated, as he discusses here. There will still be room for human curation, but many of the stories will be almost entirely “automated”:
A computer can write highly localized crime reports, personalized stock portfolio reporting, high school and youth sports stories at scale to provide coverage that was previously impossible and could never be possible in a world of purely human generated content.
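To make the idea of generating a story from structured data concrete, here is a minimal sketch of a template-based recap for a youth sports game. It is only an illustration, not Narrative Science's actual method: the function name `game_recap`, the field names, and the sample game are all invented for this example.

```python
def game_recap(game):
    """Render a one-sentence recap from structured score data (hypothetical fields)."""
    home, away = game["home"], game["away"]
    home_score, away_score = game["home_score"], game["away_score"]

    if home_score == away_score:
        return f"{home} and {away} tied {home_score}-{away_score} on {game['date']}."

    winner, loser = (home, away) if home_score > away_score else (away, home)
    high, low = max(home_score, away_score), min(home_score, away_score)
    # Pick a verb from the margin so close games read differently from routs.
    verb = "edged" if high - low <= 3 else "beat"
    return f"{winner} {verb} {loser} {high}-{low} on {game['date']}."


# Invented sample record standing in for a real box score.
sample = {
    "home": "Lincoln High",
    "away": "Central High",
    "home_score": 21,
    "away_score": 20,
    "date": "Friday",
}
print(game_recap(sample))  # Lincoln High edged Central High 21-20 on Friday.
```

The point is that once the data is structured, producing thousands of such stories costs almost nothing, which is what makes hyperlocal coverage feasible at scale.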
I doubt we are anywhere near generating narrative stories from unstructured data. Yet news story automation may already be here. In early December 2011, a computer-generated news portal called The Wall launched, analyzing and displaying real-time local Twitter trends while automatically clustering the information into news topics.
On closer examination, I found each clustered news story ends up linking to one or more actual newspaper stories written by a human reporter. So, while perhaps human bias is not eliminated, the selection of stories appears to be automatic.
The Fast Company piece points out a surprising discovery made using DocuScope. Othello -- despite being labeled a tragedy -- turns out to be a comedy. Shakespeare apparently used comedic stylistic cues to intensify the play's tragic aspects. It turns out that Shakespeare’s vocabulary and syntax varied wildly among his comedies, historical plays, and tragedies. In fact, according to a DocuScope insight, the funniest thing Shakespeare wrote was a portion of The Merry Wives of Windsor, while a passage from Richard II was the most serious.
Then again, in our age of “big data,” we can now visualize how our literary expressions differ and evolve over time. Take the Corpus of Contemporary American English, or COCA, comprising 425 million words of text from the past two decades, with equally large samples drawn from fiction, popular magazines, newspapers, academic texts, and transcripts of spoken English. The New York Times recently wrote about how the COCA program detected patterns a human would never have found, such as which past-tense verbs show up more frequently in fiction than in academic prose.
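The kind of genre comparison described above can be sketched in a few lines. This is a toy illustration, not the COCA software: the two "corpora" are invented sentences, and `relative_freq` is a hypothetical helper that normalizes counts per 1,000 tokens so samples of different sizes can be compared.

```python
from collections import Counter


def relative_freq(tokens, targets):
    """Return occurrences per 1,000 tokens for each target word."""
    counts = Counter(t.lower() for t in tokens)
    total = len(tokens)
    return {word: 1000 * counts[word] / total for word in targets}


# Invented toy samples standing in for genre sub-corpora.
fiction = "she grimaced and whispered then she nodded and smiled".split()
academic = "the study found that results differed and authors noted limits".split()

verbs = ["whispered", "nodded", "noted", "found"]
fic = relative_freq(fiction, verbs)
aca = relative_freq(academic, verbs)

for verb in verbs:
    leaning = "fiction" if fic[verb] > aca[verb] else "academic"
    print(f"{verb}: fiction={fic[verb]:.1f}, academic={aca[verb]:.1f} -> {leaning}")
```

On a real corpus the same normalization applies; the difference is that hundreds of millions of words surface skews no reader would notice by eye.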
And again, the same kind of technology that analyzes unstructured data and turns it into computer-generated insights can also be used to predict what may happen in the future, as with the Recorded Future platform, which is partially funded by the CIA and Google. I was recently a guest at the Recorded Future Webinar on the Future of the World Economy and Alternative Energy in 2012.
Recorded Future view
Recorded Future looks at 100,000 Web pages an hour, scanning across 50,000 sources -- from Securities and Exchange Commission filings to Twitter comments. As discussed in this New York Times blog, it looks for statements about the future, like notices of an annual meeting or predictions about when a product might be released, as well as past developments, and then creates a “temporal index” that suggests momentum trends and unusually strong relationships between key players in a timeline in order to generate unusual insights.
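As a rough illustration of what a “temporal index” might look like, the sketch below files each statement under any future year it mentions. This is an assumption-laden toy, not Recorded Future's actual approach: the regular expression, the function `build_temporal_index`, and the sample statements are all invented here, and a real system would need far richer date and entity extraction.

```python
import re
from collections import defaultdict

# Matches four-digit years from 1900 to 2099 (a deliberately crude date detector).
YEAR = re.compile(r"\b(?:19|20)\d{2}\b")


def build_temporal_index(statements, now_year):
    """Map each future year to the statements that mention it."""
    index = defaultdict(list)
    for text in statements:
        # A set avoids filing a statement twice when it repeats a year.
        years = {int(match.group()) for match in YEAR.finditer(text)}
        for year in sorted(years):
            if year > now_year:
                index[year].append(text)
    return dict(index)


# Invented sample statements standing in for scanned Web pages.
statements = [
    "Acme will hold its annual meeting in March 2013.",
    "The product shipped in 2010.",
    "Analysts expect the tablet to launch in 2013 or 2014.",
]
index = build_temporal_index(statements, now_year=2012)
for year in sorted(index):
    print(year, index[year])
```

Once statements are keyed by the time they refer to rather than the time they were published, it becomes possible to count how many sources point at the same future event, which is one plausible reading of "momentum."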
Recorded Future is not alone in generating insights. Companies such as Palantir Technologies attempt to visualize the world’s governmental and financial information, as well. Read this blog, for example, detailing Palantir's analysis of the recent turmoil in Sudan. It performed relational, temporal, statistical, geospatial, and social network analysis on more than a dozen open sources of intelligence data to gain a deeper understanding of the conflict and how it might be resolved.
Yet another platform, Quid, aims to discover new opportunities through a “white space” analysis. The software will let you find “standout companies” within a sector and a sea of largely unstructured data, the company says.
The world is rapidly changing, that much is certain, and our ability to generate insights is about to take quantum leaps. Are you in?
Uh oh, Marshall! As a blogger and former newspaper journalist, I don't like the sound of this! :) Of course, there is a difference between curating content and creating it, and then between creating standard content based on readily available information and the specialized, idiosyncratic, opinionated content that makes up the best commentary. My suspicion is that the human desire for self-expression will keep humans writing, but perhaps much of the drudge work involved in simple regurgitation of facts will be gone. The mechanization of yet another human function may also cause unsuspected problems. Those information outlets that have survived the collapse of the large-scale media -- think of sources as disparate as Fox News and The Onion -- do so specifically because their content is stylized, opinionated and, in many cases, biased. So could unbiased insight ruin a business model?
@Shawn, you ask, "Could unbiased insight ruin a business model?" I hope not. Isn't that what business analytics is all about -- delivering facts for corporate decision-making? There's no bias in a factual bit of data.
@Marshall, of all the many interesting things I've learned in the six months since launching AllAnalytics.com, the fact that the classic Shakespearian tragedy Othello is actually a comedy has got to be one of the best. Brit Lit teachers around the globe must be flummoxed.
Have to admit - as much as I like England, and go there often, in fact, in 10 days - I can't say I can read Shakespeare all that well. I'm glad someone figured out how to tell Othello was a comedy vs. a tragedy, as it would probably never have been me. Ha!
I suppose so, Shawn - I don't think machines can replace humans - but I do think they can gather a lot of the information for us and prepare it so we can process it, in the way only a human can, faster than otherwise.
Yes, of course, there are tradeoffs in that we might miss things we'd find otherwise, as the bots failed to pick them up, or just the reverse, have a bunch of junk to look at and summarize. And then, we're back to square one.
In fact, if machine intelligence can't make this job much easier (to gather and curate information) we're better off not using it at all.
Kris Hammond, one of the founders and CTO of Narrative Science, a company working on technologies to generate narratives from data, thinks in 15 years, 90 percent of the news stories will be computer generated,
Further, he explains that:
In early December 2011, a computer-generated news portal called The Wall launched, analyzing and displaying real-time local Twitter trends while automatically clustering the information into news topics.
Marshall says that:
On closer examination, I found each clustered news story ends up linking to one or more actual newspaper stories written by a human reporter. So, while perhaps human bias is not eliminated, the selection of stories appears to be automatic.
I would submit that, in some cases, the "bias" we are describing here is an individuality and point of view that cannot be replaced by technology and actually causes us to choose or accept one source of information over another or, more precisely, to choose one bit of information as more relevant to us personally than another.
Actually, I see some serious advantages in the automation and aggregation of news and information that has freed up the flow of data and the power that access to that data has given common people in their daily lives. This is especially true when that technology uses a means that allows those consuming that data to choose the information they find most relevant and helpful outside of the control of an outdated and hierarchical information system. I just think that the human element here is critical and cannot be ignored as a partner with the technology used to help manage that data more efficiently.
Alas, there's the rub, as the Bard himself might say. Othello, of course, was not a comedy, but the playwright used words and phrases associated with comedy for effect. Would our algorithm have known the difference without human interpretation to guide and interpret its results?
I think the article I got that information from also stated that the common definition of a tragedy and a comedy in Shakespeare's time was somewhat different from what we think of those genres today.
So there are a few parts to this theme -
- we, in our time, have relabeled and redefined what Shakespeare meant - not always accurately.
- also, the software went back and tried to uncover the real genre of Othello, and found it was a comedy - as defined in Shakespeare's time period and locale, not ours.
Think of it this way - you have an old silver jug, it's all tarnished, and you think the tarnish is part of the design - but then you put some silver cleaner on it - and it looks all new and shiny - and all of a sudden, it's like a different piece from what you thought it was originally - that is what the software mentioned in my article did for Othello.