Thinking on NSA Excess & Enterprise Implications

In the wake of NSA whistleblower Edward Snowden's revelations, civil libertarians aren't the only ones who think the agency is collecting too much information. Even its own analysts think they have too much data.

Documents received by The Washington Post describe Muscular, an NSA effort that infiltrated Google and Yahoo networking traffic. Muscular gave NSA analysts access to millions of emails, attachments, and other web communications each day -- including entire Yahoo mailboxes.

The NSA needed to develop new filtering and distribution systems to process this data mother lode, as indicated in the documents. Even with these systems, the new data (particularly from Yahoo) proved too much to handle. Yahoo email began to account for approximately 25 percent of daily data being processed by the NSA's main analytics platform for intercepted Internet traffic. Most of the data was more than six months old and virtually useless. Analysts became so frustrated that they requested "partial throttling" of Yahoo data.

"Numerous target offices have complained about this collection 'diluting' their workflow," according to one NSA document. "The sheer volume" of data is unjustified by its "relatively small intelligence value."

Other NSA data mining programs have overwhelmed the agency, as reported elsewhere. When spammers hacked a target Yahoo account last year, the account's address book blew up with irrelevant email addresses. Consequently, the NSA had to limit its address book data collection efforts to only Facebook contacts.

These broad data sweeps have been significantly less successful than the NSA's more targeted operations. In an interview with the Daily Caller, former NSA official William Binney said the NSA's inefficient big data processes crippled its ability to react to a tipoff about Tamerlan Tsarnaev -- information that could have curtailed the Boston Marathon bombing.

"They're making themselves dysfunctional by collecting all of this data. They've got so much collection capability but they can't do everything… The basic problem is they can't figure out what they have, so they store it all in the hope that down the road they might figure something out and they can go back and figure out what's happening and what people did."

Still, the White House and other government departments and agencies place the NSA under what The New York Times calls an intense "pressure to get everything" -- a pressure that has spawned a data obsession.

The problem with this obsession is twofold. The first issue is the ROI of gathering haystacks -- resources better spent elsewhere are diverted to finding, gathering, filtering, and ultimately throttling and fixing oversized and under-relevant data.

The other issue is one of public relations. The NSA may have assumed that, as a super-secret spy agency, its accountability would always remain limited, but leaks happen. This data gluttony has cost it trust and goodwill from the American public and from foreign powers -- just as companies often face public backlash over their customer analytics programs.

The ever-present question for the big data enterprise is this: What are the costs -- all the costs -- of your data mining efforts? Are they manageable? Will there be backlash or some other loss of goodwill? What other consequences might occur?

Or, more importantly, is there a simpler, better way?

Joe Stanganelli, Attorney & Marketer

Joe Stanganelli is founder and principal of Beacon Hill Law, a Boston-based general practice law firm. His expertise on legal topics has been sought for several major publications, including U.S. News and World Report and Personal Real Estate Investor Magazine.

Joe is also a communications consultant. He has been working with social media for many years -- even in the days of local BBSs (for one of which he served as Co-System Operator), well before the term "social media" was invented.

From 2003 to 2005, Joe ran Grandpa George Productions, a New England entertainment and media production company. He has also worked as a professional actor, director, and producer. Additionally, Joe is a produced playwright.

When he's not lawyering, marketing, or social-media-ing, Joe writes scripts, songs, and stories.

He also finds time to lose at bridge a couple of times a month.

Follow Joe on Twitter: @JoeStanganelli

Also, check out his blog.

Re: Store at your own risk
  • 11/21/2013 7:48:38 AM

Beth, I wish that advocate the best of luck and hope that they work in a company whose culture values dissent and change!

Re: Store at your own risk
  • 11/20/2013 10:49:20 AM

Matt, well, not necessarily. Somebody has to be an advocate for change, and analytics professionals who find themselves in a situation in which they're not able to do their jobs effectively fit the bill. If they can't help bring about a change in infrastructure and processes, then, yes, absolutely, they should look elsewhere.

Re: Store at your own risk
  • 11/19/2013 8:35:56 PM

@Beth, no analyst worth their salt should be working for such a company, no? Should just be beginners who need to cut their teeth.

  • 11/19/2013 8:34:44 PM

They should be in big big trouble -- a la the CIA during the Church hearings of the '70s, which set back that spy agency for a decade.

  • 11/19/2013 5:08:04 PM

Anyone else think that the NSA collecting private data and full email inboxes without the go-ahead of a judge is unconstitutional?

Re: Store at your own risk
  • 11/19/2013 8:16:44 AM

And we've certainly heard enough about the impending talent shortage to know that data scientists are in demand. However, I think the scenario I've described would not necessarily be at a company that has evolved its data & analytics infrastructures into the true "data science" realm but on the other end of the spectrum -- really just getting started and not yet fully "data-ified."


Re: Store at your own risk
  • 11/19/2013 7:44:25 AM

I would think that any disgruntled data scientist could pull up stakes at one employer and easily find work at another. Or ... are there data scientist guns for hire? Big-time experts who sell their services as contractors. Outside of academia, I mean. Must be plenty, I would think.

Re: Store at your own risk
  • 11/18/2013 8:53:57 AM

No argument there!

Re: Store at your own risk
  • 11/16/2013 12:04:09 AM

@Beth, understood. I would still suggest to that data analyst to work for another company or find another job.

Re: Store at your own risk
  • 11/14/2013 5:10:28 PM

@Matt, I was speaking more of an infrastructure/architecture issue that would be outside a data analyst's control. No matter how enthusiastic a data analyst is about running query after query, if each is a huge time sink because of the wait on data, then there's good reason for frustration. By the same token, the data analysts best make sure their models aren't too complex or incapable of scaling or unable to support iterative modeling -- those are all essential for letting the business ask as many questions of the data as it would like.
