Weathering the Storm


Superstorm Sandy has left more than a trail of devastation in neighborhoods; she continues to wreak havoc on businesses throughout the area. A friend of mine is relegated to indefinite telecommuter status because salt-water flooding compromised infrastructure at her company's Manhattan headquarters.

At many of the businesses in the strike zone, IT teams' disaster recovery scenarios are under the microscope. Are there enough software licenses to cover everyone who needs to access applications remotely? Can the servers handle a maximum load of remote users all day, every day? Are employees equipped with appropriate hardware, software, and connectivity to carry out their jobs?

Businesses in New York, New Jersey, and nearby regions are answering those questions the hard way right now. Everyone else should study their responses closely, analyze their own disaster recovery and business continuity plans, run drills in the near future, and watch for postmortems from industry peers.

Before you can assess your ability to support a full-time remote workforce, you first have to analyze your workflow. Chart each person's role and the exact applications and centralized data he or she would need access to in a disaster scenario. For instance, an employee responsible for customer service will not need access to the financials database. Being this specific becomes essential when your network resources are constrained: you don't want users consuming bandwidth, server CPU, and licenses for unrelated tasks.
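As a starting point, that mapping can be as simple as a small role-to-resource table you check the drill (and an eventual emergency) against. The sketch below is purely hypothetical; the role names, application names, and employees are placeholders, and a real version would be generated from your directory and access-management systems.

    # Hypothetical sketch: a minimal role-to-resource map for a remote-work drill.
    # Role names, application names, and employees are placeholders, not real data.

    ROLE_ACCESS = {
        "customer_service": {"crm", "ticketing", "webmail"},
        "finance": {"financials_db", "erp", "webmail"},
        "field_sales": {"crm", "order_entry", "webmail"},
    }

    EMPLOYEES = [
        {"name": "A. Rivera", "role": "customer_service"},
        {"name": "B. Chen", "role": "finance"},
    ]

    def allowed_resources(employee):
        """Return only the applications this person's role actually needs remotely."""
        return ROLE_ACCESS.get(employee["role"], set())

    if __name__ == "__main__":
        for person in EMPLOYEES:
            print(person["name"], "->", sorted(allowed_resources(person)))

Compared against server and license logs after the drill, a table like this also shows who is consuming resources outside his or her role.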

Next, use asset management tools to gather an inventory of the software, hardware, and platforms in use across the enterprise. Check versions, security patches, and overall configurations to ensure they can handle the rigors of remote access. If not, budget for an upgrade as soon as possible. Users won't tolerate a spinning hourglass while chaos erupts around them.
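Most asset management suites can export that inventory; the hypothetical sketch below shows one way to flag upgrade candidates by comparing an export against the minimum versions you've judged ready for heavy remote access. The package names, version floors, and hosts are placeholders.

    # Hypothetical sketch: flag hosts whose software falls below a minimum-version
    # floor for remote access. The inventory here is hard-coded for illustration;
    # a real one would come from your asset management tool's export.

    MINIMUM_VERSIONS = {
        "vpn_client": (5, 2),
        "webmail_gateway": (10, 0),
    }

    COLLECTED_INVENTORY = {
        "laptop-042": {"vpn_client": (5, 1), "webmail_gateway": (10, 3)},
        "laptop-107": {"vpn_client": (5, 4), "webmail_gateway": (9, 8)},
    }

    def upgrade_candidates(inventory, minimums):
        """Yield (host, package, installed, required) for anything below the floor."""
        for host, packages in inventory.items():
            for package, required in minimums.items():
                installed = packages.get(package)
                if installed is None or installed < required:
                    yield host, package, installed, required

    if __name__ == "__main__":
        for host, package, installed, required in upgrade_candidates(
            COLLECTED_INVENTORY, MINIMUM_VERSIONS
        ):
            print(f"{host}: {package} {installed} is below required {required}")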

Have all your workers telecommute for a day. This might sound crazy, but you'll never really know how the ecosystem (human and technology) will respond until it is under that type of duress. Take note of every aspect:

  • What kind of support do users need? Can you offer that training upfront or pre-distribute how-tos? Will you need an emergency help desk to get users up and running?

  • Did the servers hit maximum utilization, and at what point? Did virtualization help balance the load, or do certain applications need to be reconfigured? Do you need to implement better prioritization so that mission-critical applications always have the CPU power they require? Performance management tools, both built into servers and from third parties, measure and analyze this data for you (see the sampling sketch after this list).

  • How did your bandwidth hold up? Network monitoring and traffic analysis tools illustrate capacity issues during peak times as well as usage patterns. While an actual disaster would skew these results, the drill gets you closer to providing a stable network for remote access; the sketch after this list samples network throughput alongside CPU. Again, you might have to set priorities so that voice over IP, video, and other communications get through with low latency.

  • Do you have enough licenses to support a remote workforce? If users ultimately have to rely on applications such as Web-based mail that they might not otherwise use, then you'll need enough seats to accommodate everyone. Executives and HR use email in a disaster to gather and disseminate status updates and other important information.

  • How will you get data re-centralized? Users will be forced to work offline because of spotty connectivity and other issues. You'll have to ensure that whatever documents pile up on their laptops get back to the datacenter without overwriting other versions (a version-safe copy sketch appears after this list).

  • What is the state of your security? In a disaster, IT can be tempted/pressured to compromise on its tough remote and mobile security stance just to get people up and running. However, doing so can have long-term, destructive consequences.

    A lot of people will use their own devices, so make sure they are educated and trained to access applications and data safely. Also, while it might seem a hit to productivity, if working at public hotspots is considered too risky and against compliance outside of a disaster, the same holds true during one. For instance, employees cannot work with sensitive user or corporate data at a coffee shop just because it has Wi-Fi and power. Those employees might require a pre-assigned temporary office with full network security.
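For the server and bandwidth questions above, dedicated performance management and network monitoring tools are the right answer; purely as an illustration of the kind of data worth capturing during the drill, here is a minimal sampler. It assumes the third-party psutil package is installed, and the sampling interval and log path are placeholders to adjust for your environment.

    # Hypothetical drill-telemetry sketch: log CPU, memory, and network throughput
    # once a minute so you can see when utilization peaks during the telecommute drill.
    import csv
    import time

    import psutil  # third-party package; assumed installed for this sketch

    SAMPLE_SECONDS = 60              # how often to record a sample (placeholder)
    LOG_PATH = "drill_metrics.csv"   # where to write the log (placeholder)

    def sample_forever(path=LOG_PATH, interval=SAMPLE_SECONDS):
        """Append CPU, memory, and network-throughput samples to a CSV until stopped."""
        psutil.cpu_percent(interval=None)   # prime the CPU counter for delta readings
        last_net = psutil.net_io_counters()
        with open(path, "w", newline="") as handle:
            writer = csv.writer(handle)
            writer.writerow(["timestamp", "cpu_pct", "mem_pct", "kb_sent_per_s", "kb_recv_per_s"])
            while True:
                time.sleep(interval)
                cpu = psutil.cpu_percent(interval=None)   # CPU use since last sample
                mem = psutil.virtual_memory().percent
                net = psutil.net_io_counters()
                kb_sent = (net.bytes_sent - last_net.bytes_sent) / interval / 1024
                kb_recv = (net.bytes_recv - last_net.bytes_recv) / interval / 1024
                last_net = net
                writer.writerow([int(time.time()), cpu, mem, round(kb_sent, 1), round(kb_recv, 1)])
                handle.flush()

    if __name__ == "__main__":
        sample_forever()

Run on a few key servers for the day, a log like this makes it easy to line up utilization spikes with the help-desk tickets the drill generates.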
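For the re-centralization question, the rule that matters is that returning documents never silently overwrite existing versions. The sketch below is a hypothetical illustration of that rule only; the source and destination paths are placeholders, and in practice you would layer this logic on whatever sync or document management tool you already run.

    # Hypothetical re-centralization sketch: copy documents that piled up on a laptop
    # back to a shared drop area, renaming rather than overwriting on name collisions.
    import shutil
    from pathlib import Path

    LAPTOP_OUTBOX = Path("laptop_outbox")          # placeholder source directory
    DATACENTER_DROP = Path("returned_documents")   # placeholder destination directory

    def safe_name(destination: Path) -> Path:
        """If the target name is taken, append -v2, -v3, ... instead of overwriting."""
        if not destination.exists():
            return destination
        version = 2
        while True:
            candidate = destination.with_name(f"{destination.stem}-v{version}{destination.suffix}")
            if not candidate.exists():
                return candidate
            version += 1

    def recentralize(source=LAPTOP_OUTBOX, target=DATACENTER_DROP):
        """Copy every file from the laptop outbox into the drop area, version-safely."""
        target.mkdir(parents=True, exist_ok=True)
        for document in source.glob("**/*"):
            if document.is_file():
                shutil.copy2(document, safe_name(target / document.name))

    if __name__ == "__main__":
        recentralize()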

Most companies prepare for short-term inconveniences, such as a nor'easter taking out power at headquarters. Superstorm Sandy has taught us that serious geographic, operational, and infrastructure damage can take a company offline indefinitely if IT, workers, their devices, and the datacenter aren't properly prepared.

Will Superstorm Sandy change how you conduct disaster recovery analysis? Share on the comment board below.

Sandra Gittlen, IT Writer

Sandra is a freelance business and technology writer in the Boston area. She is a frequent contributor to many high-tech print and online publications. Previously events editor and online managing editor for IDG's Network World, Sandra has covered all aspects of business intelligence/analytics over the years, writing articles and in-depth issue papers. An industry educator, she moderates Webcasts and conference sessions on an array of technology topics. She can be reached at sandra@slgpublishing.net.



Re: IT analytics
  • 11/9/2012 10:08:00 AM

@Beth The decision was probably along the lines of, let's just keep what we have rather than spend our profits on upgrading equipment. Rates go up every summer, not because the utility is spending more but because demand increases with the use of air conditioning. They drop in winter when the price of gas goes up. The thing is that when so many people are counting on these things, you'd better not rely on just shuffling through in a best-case scenario but on holding up in a worst-case scenario.

ConEd, which had a far better response than LIPA and much more detailed maps of outages, still came under fire in this blog: http://blogs.reuters.com/great-debate/2012/11/07/when-customers-are-angry-how-do-companies-justify-shareholder-payouts/

Re: IT analytics
  • 11/9/2012 9:52:13 AM

Ariella, I think Cordaro captured the issue with the quote on providing cheap electric power to customers. I'd love to see the ROI work the utility has done on infrastructure upgrades, assuming it's done those exercises in determining not to upgrade. Can it deliver service to customers more cheaply on old infrastructure than it can with upgraded infrastructure and automated systems? How did storm damage play into those assessments? Was the risk of not being prepared outweighed by the low cost of the current infrastructure? (Because, let's not forget, when rates increase so do customer complaints.) I'm certainly not justifying the utility's shoddy infrastructure -- just wondering how it was making its decisions!

 

Re: IT analytics
  • 11/9/2012 9:05:51 AM

@Sandy, related to this is the question of the slow response on the part of the utilities. LIPA has drawn particular attention to itself in this regard. Today's Newsday has an article, "Why LIPA failed: Utility ignored warnings it wasn't ready for major storm." Primarily it was because they just didn't bother to upgrade what needed to be replaced. But there was also very poor analytics involved. While ConEd could show detailed maps of outages, LIPA couldn't. As the article recounts:

 "a Newsday reporter at the Hicksville headquarters of National Grid — the company contracted by LIPA to oversee operations — saw engineers who were using highlighters and paper maps to track thousands of outages, as ratepayers banged in frustration on the building's locked front doors."

Of course, the area got a double whammy with the nor'easter: "Ten days after the superstorm battered the region, more than 170,000 Long Islanders were still without power. The nor'easter on Wednesday piled on with another 90,000 outages."

 

My understanding is there was a similar scenario for PSE&G in NJ, though they do serve more customers overall. 

But to get back to the infrastructure:

Antiquated infrastructure

The utility's infrastructure has changed little since Gloria, said Matthew Cordaro, who served as vice president of engineering at LIPA's predecessor, the Long Island Lighting Co., when that hurricane struck.

"I think somewhere along the way they lost sight of what the primary mission of a utility is," Cordaro said Thursday, "and that is to provide cheap electric power to customers."

Alexandra von Meier, the co-director of electric grid research at the California Institute for Energy and Environment, said other utilities face similar challenges.

"I don't think it's very unusual to have very old and clunky technology in their power distribution context," she said. "If they were more modern ... restoration could be faster, and we all want that."

More than a half-million residents lost power for a week after Irene. Cuomo — who said that "at a minimum, LIPA did a terrible job of communicating" following that tropical storm — requested a review of the Uniondale-based utility.

The resulting report concluded that LIPA and National Grid did not meet industry standards in dozens of aspects concerning planning and recovery in major storms.

To survey storm damage, engineers used spotty equipment, including expired Internet aircards for their laptops, the state inspectors found. Their computers used COBOL, a basic decades-old computer programming language, and some lacked electronic mapping for outages and used a "rudimentary damage prediction model."

Even fax machines and other basic office equipment were unavailable or broken at substations, the facilities that transfer power to thousands of homes, hindering communication. One substation coordinator reported having to run to a local office supply store to purchase a printer.

 

Re: IT analytics
  • 11/8/2012 3:33:02 PM

Hi Sandy, I guess the good news is that with each disaster the necessity of thorough and continued network/security analysis gets more and more evident. Not that I'd wish this lesson on anybody...

Re: IT analytics
  • 11/8/2012 2:44:28 PM

Beth, it's hard to say whether the tools out there are the tools that are most needed in a situation like Sandy. In fact, I'd venture to say you'll most likely see some technology and training tweaking in the wake of this storm.

I think that as cloud computing (SaaS, PaaS, IaaS, etc.) takes hold, IT has to go back to its core values of network monitoring, traffic analysis, server analysis and all that good stuff that is supposed to be standard practice. This might have been the wake-up call on that score.

As for the necessary investments, hard to say. I'm sure it varies according to the architecture of the data center, age of the company and other factors. I do believe that everyone should be revisiting their own capabilities and that this should eliminate any remaining "it's not going to happen to me" attitudes.

Let's not forget that shortly before Sandy, the West Coast, including Hawaii, was under a tsunami watch. It can happen anywhere.

 

 

IT analytics
  • 11/8/2012 11:30:44 AM

Sandra, do you think IT vendors are doing enough to deliver products that enterprises can use to analyze how well their network infrastructures, application performance, security, etc. are doing? And, I should add, that allow customers a way to absorb the analysis easily (for example, via interactive data visualizations on network management, app performance, or security dashboards)? And, if so, do you think enterprise IT execs are taking these types of tools seriously and making the necessary investments in them?

Ugly truth
  • 11/8/2012 9:11:18 AM

The storm revealed some ugly truths for businesses that use third-party vendors -- some discovered after the fact that those vendors lacked geographically dispersed backup systems and/or redundant power sources.
