You discover a legacy SAS program from 1982 in one of your archived directories. You unzip the program and its source data inputs and execute the program. Will it run in the same manner it did 30 years ago?
Unless the code uses a procedure or function the vendor no longer supports, it will probably execute. However, four other types of technical factors could influence whether and how it runs.
Internal software factors: The code's program logic may not have changed during the last 30 years, but SAS (this site's sponsor) has changed where it chooses to place workspace for inputs and outputs. The software engine's default settings traditionally split reading and writing between memory and disk. In recent years, however, more of the I/O has been allocated to memory, on the grounds that reading from and writing to memory is quicker than reading from and writing to disk. If you migrated to a later version of SAS that promoted speed improvements, the code will now lean on your RAM resources more than your disk resources.
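If you are curious how a newer release is allocating memory before you rerun the legacy job, a minimal sketch like the one below can help. The option names are standard SAS system options, but their defaults vary by release and host, so treat the log output as a diagnostic rather than a guarantee.

   /* Report the memory-related system options in effect for this session */
   proc options group=memory;
   run;

   /* Ask SAS to write detailed memory and I/O statistics to the log for each step */
   options fullstimer;

Note that MEMSIZE itself is normally fixed at invocation (via the configuration file or a start-up option), so a legacy job that suddenly needs more memory may require a change outside the program code.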
Also, over the past 30 years, SAS has changed how some procedures use workspace. It has reengineered specific resource-intensive procedures (such as Proc Sort) to reduce execution time by optimizing, and multithreading, the use of available memory. Of course, the nature of the dataset (i.e., whether it is heavy in variables, heavy in records, or some combination of the two) will also influence the efficiency of these optimization techniques.
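As a hedged illustration of the multithreading point, a modern rerun of a legacy sort might look like the sketch below. The dataset WORK.CLAIMS and the BY variables are hypothetical, and the THREADS and SORTSIZE settings largely mirror what recent releases already use by default.

   /* Allow multithreaded procedures and give the sort a generous memory ceiling */
   options threads cpucount=actual sortsize=1g;

   /* THREADS on the PROC SORT statement requests the multithreaded sort explicitly */
   proc sort data=work.claims out=work.claims_sorted threads;
      by member_id claim_date;
   run;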
External software factors: The operating systems with which SAS interfaces have changed dramatically. When SAS launched in 1976, the product ran on mainframes; personal computers didn't exist. Between 1982 and 2012, SAS had to configure its engines to be interoperable with a plethora of operating systems across and within computing platforms.
Hardware factors: Above and beyond RAM and disk drive allocations, CPU processing speed has advanced significantly, and default directory structures for loading the software have changed. The legacy program may execute faster because of better resources and processing power, or it may not execute at all if the legacy code has logic that points to configuration files in directories that have changed.
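To make that second risk concrete, here is a minimal, hypothetical sketch of the kind of hard-coded references a legacy program carried forward over the years might contain, followed by the same statements updated to wherever those files live today (all of the paths are illustrative):

   /* Hypothetical legacy references to directories that may no longer exist */
   %include 'C:\SAS608\PROGRAMS\MACROS.SAS';
   libname rawdat 'D:\DATA\1982\SURVEY';

   /* The same references updated for an illustrative current layout */
   %include '/project/code/macros.sas';
   libname rawdat '/project/data/1982/survey';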
Dataset factors: As the SAS product line matured between versions 6 and 8, the company discovered that permanent datasets created during or before version 6 could not reside in the same directory as the newer datasets. Specifically, if you wanted to read a version 6 dataset and use the data to create a version 8 dataset in the same directory, the job would come to an abnormal end, or abend. SAS fixed the problem by letting users specify which SAS engine should be associated with each dataset library.
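As a hedged sketch of that fix (the directory paths and dataset name are hypothetical, and the V6 engine's availability depends on your host and release), the engine can be named explicitly on each LIBNAME statement:

   /* Read the legacy data with the version 6 engine, write with the version 8 engine */
   libname oldlib v6 '/archive/sas1982';
   libname newlib v8 '/archive/sas2012';

   data newlib.survey;      /* hypothetical output dataset */
      set oldlib.survey;    /* legacy version 6 dataset */
   run;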
In short, the statements and syntax of a SAS program authored in 1982 may still be logically sound in 2012, but it may execute in an unexpected manner. The analytic objectives could still be relevant, but the technical environment in which the analytic activity is conducted may not be.
Here is a personal example. I began my journey in analytics in 1977 as a sophomore sociology major. I used SPSS in a technical environment of a UNIVAC 1100 mainframe, a keypunch machine, and a deck of 80-column Hollerith cards. Over the last 35 years, I have moved from SPSS to SAS, using IBM mainframes, DEC VAX and UNIX minicomputers, and Windows-based PCs. My user manuals have ranged from the two hard-copy version 5 manuals (Base SAS and SAS Stat) to SAS online documentation. On every platform, I have had to learn the underlying operating system and retest legacy code with each new release. All this is separate from the security changes needed as SSH-based transfers replaced FTP file transfers. In short, I've been involved with analytics for 35 years, but the context in which I learn and work has evolved and is still evolving.
The message to you is that your career will require you to adjust how you do your work. Analytics can be done within varying combinations of technical components, and you need to stay abreast of external technical advancements. Analytics will always exist in a context beyond the calculations -- you can count on it.
Bryan, I smiled when I read your piece because I remember the days of cards and mainframe processing. Today's younger students don't realize how time-consuming education was years ago just because of the lack of technology. Remember microfiche? I think I spent half of my research hours in the newspaper morgue, as it was called. Today, we can save space, help the environment, and save time with tablets that access all that information anytime, anywhere. No doubt the process to gather data, assimilate it, and drive information distribution will continue to evolve with technology. Tablets and the cloud will be the dynamic duo!
I was laughing to myself because I was having lunch with my co-workers, and a bunch of them were joking about how old they were because they used cassette tapes. I was thinking, "Yeah, try 8-tracks, records, and phones that were attached to the wall."
When it comes to technology today, you just can't write code fast enough to integrate and keep up with technology. You come up with a new feature, the competitor goes, "That's a great idea," integrates it with their other great idea, and alas, the vicious circle goes on.
Funny how fast technology changes form completely. I remember the floppy era too, and the days when flash disks were so expensive and had to be 'ejected safely' or they would crash. Unearthing a generations-old copy of SAS could produce some not-so-interesting results, especially since it could easily be a version that was optimized for DOS. Though I'd normally try to skip generations of technology whenever possible... getting left too far behind is certainly a disaster.
@Bryan - Keeping up with technology is a constant struggle. If I look into my digital time capsule that is stuffed in a box in the corner of the garage, I see a pile of 5.25" floppy discs, 3.5" single-sided or double-sided floppies (high and low density), a bunch of QIC-format tape cartridges, 8mm Exabyte cartridges, CD-ROM discs, a stack of 5.25" rewritable MO discs, some 3.5" MO discs, a ton of Bernoulli disc cartridges from 20MB to 230MB, a stack of Zip cartridges, and various SSD thingies.
My personal digital legacy is archived in this box of media, stored safely away for some day when I need to go back and take a look. One small problem: the media exists (although, being chemically based, it may have deteriorated and lost much of its magnetic storage capability), but I have no working mechanism that can read any of it. At least I know my personal data is safe, since probably no one else can read my media either.
Oh, Bryan, your mention of punch cards made me howl with recognition! I can remember taking computer science as a freshman in college (1970) and spending hours at the computer center trying to get my assignments to work. I'd start around 11 PM because students weren't allowed to even approach the mainframe during daylight hours. First I would have to punch the cards, then take them to the service window, where some geeky grad student right out of central casting would place them in the batch queue that always seemed to put freshmen in Computer Science 101 at the v-e-r-y end.
Finally, I'd get my paper printout back (remember the holes on the sides?) only to discover that I had punched something wrong and the computer had spit out 10 pages of error messages. Rinse and repeat (oops, I mean iterate) this process another 5 or 6 times!
Finally, along about 4 am, there would be success! Another week in which that which did not kill me made me stronger. I can remember riding my bike back through campus...the world asleep except for me, and feeling the most amazing sense of triumph at having SOLVED THE PROBLEM!
That feeling took me through symbolic logic on a kooky experimental interactive typewriter that someone had rigged up, through graduate school, where we had--gasp--personal data-input terminals, and to my first job at Xerox's late, great PARC, where I got to test a home modem that was practically bigger than my kitchen table.