In 1543, Nicolaus Copernicus jolted the leaders of his day when he published a scientific model of the universe with the Sun instead of Earth at its center.
I have to wonder if his transformative spirit is alive in today's open source era, as developers and enterprise managers discover how open source solutions open up a universe of ways to raise the quality of business intelligence analysis and internal software development.
Open source solutions offer the capability to examine relationships among collected data, develop good, maintainable code, and establish a robust software development process. These choices help users overcome the challenge of selecting the right advanced analytic capabilities, such as predictive analytics.
I learned about some of those options at O’Reilly OSCON. This year the conference relocated from Portland, Oregon to Austin, Texas, with organizers hoping the more central US location would entice more developer and business attendees.
Why is open source becoming so influential among analytics practitioners? A key driver is that analytics has evolved in ways that let analysts and solution providers explore and exploit the benefits of open source activity. Analytics is moving from descriptive techniques (metrics that explain what happened) to predictive techniques (models that uncover what influenced those outcomes and estimate what is likely to happen next). In the process, managers are learning that consequential value comes from incorporating data from different sources and applying data mining methodologies. These activities lead to models ranging from regression to machine learning. Because open source solutions are typically free, enterprises can experiment with how best to use the tools and achieve that value. They also make it easier to collaborate with other professionals on conducting analysis and understanding the results.
Demand for predictive analytics has long been anticipated. Back in 2013, Gartner predicted that by 2016, 70% of most business processes would incorporate real-time predictive analytics to establish a competitive advantage. Fast forward to today, and sites like Information Management report that industries adopting predictive analytics are refining their tactics, such as streamlining the ways they deliver insights from data and blending internal and external data sources.
As I spoke with tech vendors at OSCON, I realized how important analytic capability is as a backbone for the services, products, and business models being demoed, especially for IoT-related offerings.
One example of that backbone comes from new tools intended to make analysis and experimentation easier. One such tool comes from H2O.ai, a firm whose platform allows data scientists and developers to run powerful predictive algorithms on their data sets. I spoke with Vinod Iyengar, Director of Product Marketing, at the H2O vendor booth, and he explained how artificial intelligence has been layered to make data analysis useful to an enterprise. “AI has been around for 50 years, but now open source packages have allowed new algorithms, which means start ups need to build a data product. I am not just talking machinery but building a data product with a business value. You need the machine learning part, and you need to take the predictions and apply them. You also need to understand what you are meaning to solve.”
Another example comes from Continuum Analytics, which produces a Python-centered open source platform called Anaconda.
The platform features a package and environment manager that wraps the installation of different languages into a shareable interface. The purpose is to let professionals on different teams communicate with each other while minimizing technical and user differences across platforms. I have seen similar setups with text editors and IDEs that allow plugins, but Anaconda was created with data management in mind.
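As a rough illustration of what a shareable environment can look like, the Python sketch below renders a conda-style `environment.yml` that lists Python and R packages side by side. The environment name `shared-analytics`, the package selection, and the helper `render_environment` are assumptions made for this example and are not taken from Continuum's materials.

```python
def render_environment(name, channels, dependencies):
    """Build the text of a conda-style environment.yml file.

    Hypothetical helper: it only formats strings and does not
    call conda or check that the packages exist.
    """
    lines = [f"name: {name}", "channels:"]
    lines += [f"  - {channel}" for channel in channels]
    lines.append("dependencies:")
    lines += [f"  - {dep}" for dep in dependencies]
    return "\n".join(lines) + "\n"


# One specification mixing Python and R tooling (illustrative choices).
spec = render_environment(
    name="shared-analytics",
    channels=["defaults"],
    dependencies=["python=3.10", "pandas", "r-base"],
)

print(spec)
```

Saved as `environment.yml`, a file like this could be handed to a colleague, who would recreate the same set of packages with `conda env create -f environment.yml`. That gives teams working in different languages one shared description of their tools.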
The Continuum team has focused on Python, but recently introduced a package with R programming in mind. What prompted the team to release a package for R, according to Christine Doig, Senior Data Scientist and Product Marketing Manager, is the adoption of key open source languages. “A developer may be using C or C++, a scientist may be using SAS or Matlab, and a business analyst may be using Excel. With these different stacks, collaboration among different departments becomes hard. But solutions like Anaconda can provide a common language across teams because it is language agnostic.”
As I wrote in the opening, Copernicus offered the scientific community a new way to view and explore the celestial bodies known at that time. Open source has offered a similar revolution for analytic capabilities and the programming languages that work with analytic solutions. The only limitation that seems to remain for data analysts is how they choose to reach for the stars.