Improvements in Open Source Code Give Analysts the Best of Both Worlds
Gone are the days when open-source software was considered heretical and dangerous, something you’d never touch for commercial use. Over the years, lots of talented people have invested thousands of hours of development and testing in open-source languages, and now, in some instances, their algorithms exceed those of commercial applications in scope and performance. That, in turn, means both enterprises and commercial software vendors have learned that success requires them to leverage the best of both worlds.
Benefits of Widespread Brain Power
Open-source languages benefit from the wisdom of crowds, as is abundantly evident in the predictive space with both R and Python. These languages also benefit from their accessibility and their alignment with academia. As the languages have matured and attracted new users, thousands of practitioners and academics have contributed to the source code and tested it aggressively (for the sheer love and utility of it). That collective effort has made these tools truly cutting edge. Just as important, few companies have the financial muscle to dedicate a similar number of engineers and analysts to building out their own libraries.
Balancing IT’s Needs
But what about user experience, security, governance and stability...all the things that keep IT leaders up at night? They’re charged with protecting their enterprise data, and they need absolute assurance that sensitive and strategic information is only available to the appropriate people. At the same time, they need to ensure their employees know how to use the tools they’re provided, lest those employees find insecure workarounds.
Commercial providers are good at addressing CTOs’ concerns, and enjoy a high level of confidence among their customers. Still, I think a lot of people were surprised when Microsoft announced its acquisition of Revolution Analytics. While a modest-sized deal by Microsoft’s standards, it signaled a very different stance towards analytics and the open-source world. As R spreads through companies, CTOs using the Microsoft enterprise distribution of R have the confidence that their customers benefit from best-in-class algorithms, but with the ease of use and governance that are hallmarks of commercial solutions. In other words, they’re giving their customers the best of both worlds.
And it’s not just the giants. Business intelligence companies such as Tableau and MicroStrategy, known for their intuitive commercial environments but off-the-shelf algorithms (which some analysts find limited), are also integrating with open-source languages like R. This approach offers analysts all the benefits of a commercial analytics framework that’s fully integrated with the rich predictive libraries of R, allowing them to call on those libraries wherever they need to, and always in the context of a high-quality commercial software environment. It’s an approach that effectively removes the either/or choice and makes Tableau and MicroStrategy more appealing and relevant.
Equally important are the ongoing efforts of publishers to embrace Big Data technology and thinking. Big Data is still predominantly based on Hadoop, and commercial providers such as Cloudera, Hortonworks, and MapR bring enterprise services and controls to an open-source project.
How Open Source Benefits Media Companies
How can media analysts benefit from the best of both worlds? Given the amount of data that’s thrown at you as an analyst in the media space, coupled with the ever-increasing list of demands placed on you, these open-source algorithms can be game changers. A few examples: R’s strength in statistics makes it a particularly good tool for predicting inventory availability, which is difficult at best with other off-the-shelf applications. In fact, R has dozens of powerful libraries (e.g. zoo, forecast, timeSeries, MTS) that people have written over the years, which gives media companies a lot of latitude to pick an approach and fine-tune it to their business. And Python is great at machine-learning clustering, specifically through the extremely popular scikit-learn library, which media companies use successfully to do things like discover attributes of their most loyal users for lookalike modeling, among many other use cases.
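To make the lookalike idea concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the user IDs, the two engagement features, the `known_loyal` set, and the helper names `kmeans` and `find_lookalikes` are all hypothetical. To keep it dependency-free it uses a small hand-rolled k-means rather than scikit-learn; in practice, a library implementation such as scikit-learn’s `KMeans` would likely take the place of the `kmeans` function.

```python
from collections import Counter
from math import dist
from statistics import mean


def kmeans(points, k, iterations=20):
    """Plain k-means on a list of numeric tuples; returns one cluster label per point."""
    # Deterministic start: the first point, then repeatedly the point that is
    # farthest from the centroids chosen so far.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(dist(p, c) for c in centroids)))

    labels = []
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        labels = [min(range(k), key=lambda i: dist(p, centroids[i])) for p in points]
        # Move each centroid to the mean of the points assigned to it.
        for i in range(k):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                centroids[i] = tuple(mean(axis) for axis in zip(*members))
    return labels


def find_lookalikes(users, known_loyal, k=2):
    """Return users who share a cluster with most known-loyal users but are not flagged loyal yet."""
    ids = list(users)
    labels = kmeans([users[u] for u in ids], k)
    # The "loyal" cluster is the one holding the most known-loyal users.
    loyal_cluster = Counter(
        label for u, label in zip(ids, labels) if u in known_loyal
    ).most_common(1)[0][0]
    return sorted(
        u for u, label in zip(ids, labels)
        if label == loyal_cluster and u not in known_loyal
    )


# Hypothetical toy data: (sessions per week, average minutes per session).
# Real features would normally be scaled before clustering.
users = {
    "u1": (12, 25.0), "u2": (11, 30.0), "u3": (13, 28.0), "u4": (1, 3.0),
    "u5": (2, 4.0), "u6": (1, 2.0), "u7": (12, 27.0), "u8": (2, 5.0),
}
known_loyal = {"u1", "u2", "u3"}

lookalikes = find_lookalikes(users, known_loyal)
print(lookalikes)
```

On this toy data the only user who behaves like the known loyal group without being flagged is `u7`. A real audience would need many more features, and some care in choosing the number of clusters.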
Senior management wants to stay competitive and has little interest in lengthy integration cycles, especially with new markets and metrics promising new revenue opportunities. Having a tech stack fueled with algorithms designed to respond faster will make your life a whole lot easier (and your CRO happier).
Ad-Tech Loves Open Source
If I still haven’t convinced you, consider this: the tech providers media companies use have long recognized the inherent value of open source. Like Microsoft, they may put some commercial code around it for stability, but most of their prototyping, and much of their production code, relies on substantial amounts of Python and R, and many use MySQL or Postgres databases.
In fact, Python and R are so mature and so well accepted that they are foundational to the curriculum at virtually all of the data academies.
While digital has been disrupting many industries, the media industry is in another league when it comes to reinvention. We move at breakneck speed, which means we need a much higher level of data agility in order to respond to changes in the market. Open source makes that possible by harnessing the brainpower of thousands of really smart people, both inside and outside your company!