Open Source Opens the Advancement of Analytics

In 1543, Nicolaus Copernicus jolted the leaders of his day when he published a scientific model of the universe with the Sun instead of Earth at its center.

I have to wonder whether his transformative spirit is alive in today's open source era, as developers and enterprise managers discover how open source solutions open up a universe of possibilities for raising the quality of business intelligence analysis and internal software development.

Open source solutions offer the capability to examine relationships among the data collected, to develop good, maintainable code, and to establish a robust software development process. These choices help users overcome the challenge of selecting the right advanced analytic capabilities, such as predictive analytics.

I learned about some of those options at O’Reilly OSCON. This year the conference relocated from Portland, Oregon, to Austin, Texas, in the hope that a more central US location would entice more developer and business attendees.

Why is open source becoming so influential among analytics practitioners? A key driver lies in how analytics has evolved, so that analysts and solutions providers can explore and exploit the benefits of open source activity. Analytics is moving from descriptive analytics (metrics that explain what happened) to predictive techniques (discovering the influences behind what happened). In the process, managers are learning that consequential value comes from incorporating data from different sources and executing data mining methodologies. These activities lead to models ranging from regression to machine learning. Because open source solutions are typically free, enterprises can experiment with how best to use the tools and achieve that value. They also make it possible to collaborate with other professionals on conducting the analysis and understanding the results.
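To make the descriptive-versus-predictive distinction concrete, here is a minimal sketch in plain Python. The monthly figures and the function names (`describe`, `fit_line`, `predict`) are hypothetical, invented purely for illustration; they do not come from any vendor or tool mentioned in this post. The descriptive step summarizes what already happened, while the predictive step fits a one-variable least-squares regression and uses it to estimate a value that has not been observed.

```python
from statistics import mean

# Hypothetical monthly figures, invented for illustration only.
ad_spend = [10.0, 20.0, 30.0, 40.0, 50.0]   # thousands of dollars
sales = [25.0, 45.0, 65.0, 85.0, 105.0]     # thousands of dollars


def describe(values):
    """Descriptive analytics: summarize what already happened."""
    return {"mean": mean(values), "min": min(values), "max": max(values)}


def fit_line(x, y):
    """Ordinary least squares with one predictor: y = intercept + slope * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept


def predict(x_new, slope, intercept):
    """Predictive analytics: estimate an outcome for an unseen input."""
    return intercept + slope * x_new


summary = describe(sales)
slope, intercept = fit_line(ad_spend, sales)
forecast = predict(60.0, slope, intercept)

print("What happened:", summary)
print("Estimated sales at an ad spend of 60:", forecast)
```

In practice an analyst would more likely reach for an open source library than hand-roll the arithmetic, but the sketch shows the shape of the step from reporting a metric to modeling what drives it.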

Demand for predictive analytics has long been anticipated. Back in 2013, Gartner predicted that by 2016, 70% of most business processes would incorporate real-time predictive analytics to establish a competitive advantage. Fast forward to today, and you will see sites like Information Management report how refinement tactics abound among industries adopting predictive analytics, such as streamlining the ways to deliver insights from data and blending internal and external data sources.

Another factor driving the attraction to open source is the latest programming techniques in JavaScript, Python, and R, which I explained in an earlier All Analytics post, Can Twitter and Open Source Analytics Predict the Next American President? These programming languages have been used to develop a slew of frameworks, functions for addressing repeatable coding tasks, and libraries for implementing algorithms in larger programs. This makes repeatable modeling tasks easier to create and run, and makes the results easier to share with colleagues.

As I spoke with tech vendors at OSCON, I realized how important a backbone analytic capability is to the services, products, or business model demoed, especially for IoT-related instances.

One example of that backbone comes from new tools intended to make analysis and experimentation easier. One such tool comes from H2O.ai, a firm that produces a platform that allows data scientists and developers to run powerful predictive algorithms on their data sets. I spoke with Vinod Iyengar, Director of Product Marketing, at the H2O vendor booth, and he explained how artificial intelligence has been layered to make data analysis useful to an enterprise. “AI has been around for 50 years, but now open source packages have allowed new algorithms, which means startups need to build a data product. I am not just talking machinery but building a data product with a business value. You need the machine learning part, and you need to take the predictions and apply them. You also need to understand what you are meaning to solve.”

Another example comes from Continuum Analytics, which produces a Python-supported open source platform called Anaconda.

The platform features a package and environment manager that wraps the installation of different languages into a sharable interface. The purpose is to let professionals from different teams communicate with each other while minimizing technical and user differences across platforms. I have seen similar setups with text editors and IDEs that allow plugins, but Anaconda was created with data management in mind.

The Continuum team has focused on Python, but has just introduced a package with R programming in mind. What caught the team’s interest in releasing a package for R, according to Christine Doig, Senior Data Scientist and Product Marketing Manager, is the adoption of key open source languages. “A developer may be using C or C++, a scientist may be using SAS or Matlab, and a business analyst may be using Excel. With these different stacks, collaboration among different departments becomes hard. But solutions like Anaconda can provide a common language across teams because it is language agnostic.”

There are also frameworks that make programmable devices easier to maintain using languages already popular among developers. I had fun playing with robots that featured Cylon.js, a JavaScript framework made especially for programming Internet of Things devices.

As I wrote in the opening, Copernicus offered the scientific community a new way to view and explore the astrophysical bodies known to man at that time. Open source has offered a similar revolution for analytic capabilities and the programming languages that operate with analytic solutions. The only limitation that seems to be in place for data analysts is how they choose to reach the stars.

Pierre DeBois, Founder, Zimana

Pierre DeBois is the founder of Zimana, a small business analytics consultancy that reviews data from Web analytics and social media dashboard solutions, then provides recommendations and Web development action that improves marketing strategy and business profitability. He has conducted analysis for various small businesses and has also provided his business and engineering acumen at various corporations such as Ford Motor Co. He writes analytics articles for and Pitney Bowes Smart Essentials and contributes business book reviews for Small Business Trends. Pierre looks forward to providing All Analytics readers tips and insights tailored to small businesses as well as new insights from Web analytics practitioners around the world.

Re: theories
  • 8/11/2016 1:21:44 PM

Good point about preconceptions and their impact on decision-making. I think we will see more of that thinking as some of the advances are applied in social issues where predicting human behavior leads to response issues or even ethical issues.

  • 8/9/2016 4:16:46 PM

I still remember writing a report on Copernicus way back. I think I was in junior high then. Interesting that he didn't get the same kind of backlash Galileo did. Some theorize that the Church became more reactionary only in the 1600s. We think we're living in a scientific age now, but people often stick to their preconceptions and ignore evidence to the contrary.