
How Demographics and Technology Impact Viewing Habits

Just prior to becoming a parent, I read an article in the New York Times on how electronic devices redefine family quality time. It described a family of four retreating to their living room after dinner. Dad watched an NCAA game on his laptop while mom caught up on fashion news on her iPad. Their 10-year-old daughter played on her iPod Touch; her brother played a Wii game on the TV.

What struck me about this article is the appeal various technologies have to different generations, and how that leads to vastly different consumption patterns. Baby Boomers tend to watch TV live (i.e. “traditional”) and via DVR services. Ditto the 35-to-55 set, although many prefer to catch their favorite shows via an OTT service. Younger generations are cord-cutters. My 3-year-old son prefers the iPad to TV.

Despite what many believe, all of these screens haven’t obliterated family time; they’ve simply changed how we behave when we want to do our own thing. Rather than retreat to our rooms, we can do our own thing together. What’s more, all these screens give advertisers ample opportunity to target families with ads for...

Click here to read the entire article →

5 Key Marketing Trends for 2017

As a data-driven analytics company, my team and I spend most of our day helping publishers better understand their audiences, discovering opportunities for driving advertising yields, and teasing out trends that will push the industry forward. Although each publisher is different, there are some distinct trends that apply to almost everyone in the industry for the coming year.

1. Fraud and ad blocking will get worse before they get better, and will slow the migration of budgets to digital

Whether we like it or not, there’s a lot of incentive for fraudsters to keep doing what they’re doing. They have immense technical skills and can easily find ways to circumvent fraud-detection systems. Is it any wonder that advertisers are now thinking twice about migrating their marketing budgets to digital channels? TV is beginning to look good again.

And despite publishers’ best efforts to explain to readers the consequences of ad blocking, the number of installations keeps going up. Why? In my opinion, the technical side of advertising harms the consumer experience. When each page load requires (literally) hundreds of ad calls, consumers are kept waiting. Ad blocking eliminates that frustration. And if more than 50% of consumers start using ad blockers, what’s the point of advertising there?

Fortunately, I think we’re on the cusp of addressing both these issues. Companies like White Ops offer services to proactively block fraud. As this market matures, we’ll see a major dent in fraud, and renewed advertiser confidence. Ad blocking will also be conquered through improved consumer experiences. For instance, publishers need to understand just how many pixels fire on their pages, and the impact they have on the user experience. Try installing Ghostery and taking a tour of your site. You may be in for a shock.
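If you want a quick, do-it-yourself complement to the Ghostery exercise, here is a minimal Python sketch that counts third-party scripts, iframes and image pixels in a page’s served HTML. Treat it as a lower bound, since many trackers are injected later by JavaScript, and the URL below is just a placeholder.

```python
# Rough pixel audit: count third-party scripts, iframes and image tags in the
# HTML a page serves. Dynamically injected trackers won't show up here, so the
# result is a lower bound on what Ghostery would report.
from urllib.parse import urlparse

import requests
from bs4 import BeautifulSoup


def count_third_party_tags(page_url):
    first_party = urlparse(page_url).netloc
    soup = BeautifulSoup(requests.get(page_url, timeout=10).text, "html.parser")

    counts = {}
    for tag in soup.find_all(["script", "iframe", "img"]):
        src = tag.get("src")
        if not src:
            continue
        host = urlparse(src).netloc
        if host and host != first_party:
            counts[host] = counts.get(host, 0) + 1
    return counts


if __name__ == "__main__":
    # Placeholder URL -- point this at one of your own article pages.
    for host, n in sorted(count_third_party_tags("https://example.com").items(),
                          key=lambda kv: -kv[1]):
        print("{:4d}  {}".format(n, host))
```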

2. Publishers will move to a mobile-first strategy

Brands have been aggressive in adopting a mobile-first strategy, but publishers have been slower to embrace this change. Too many pages are overly complex and bloated, and consequently work terribly on mobile devices. They’re simply carryovers from their desktop efforts, but the design and delivery don’t translate well to mobile.

Why are publishers late to the mobile-first game? The effort to rework their sites is a factor. More importantly, the rates mobile commands are significantly lower than desktop’s. From a revenue point of view, mobile isn’t necessarily a great place to optimize, which makes it difficult to justify the funds required to do so. But when the secular shift is this clear, being late to the game is never a good idea...

Click here to read the entire article →

Ad-Free Netflix Helps Publicize Ad-Filled TV

Last year, my wife and I sat through what felt like the thousandth conversation in which friends and family raved about the BBC show “Last Tango in Halifax.” It captivated just about everyone we knew. Intrigued, and frankly tired of sitting out the conversation, we signed on to Netflix to see what the fuss was all about. We were hooked! Within a couple of weeks, we were caught up on the first two seasons and eagerly anticipating the start of the third.

Put another way: the BBC expanded its live audience for the show by two.

Binge watching is America’s new pastime. Some 75% of TV viewers admit to it, which comes as no surprise, given that services like Netflix and Amazon Prime offer entire seasons at once. These media companies have changed the way we view TV. But does it also follow that these subscription-based streaming services have eliminated television advertising as we know it? And if it’s dead, how will advertisers reach the crucial millennial generation?

I don’t think it’s dead. To broadcasters, Netflix, Hulu, Amazon Prime and the other streaming distributors are syndication models in the traditional sense of the word. No one goes there to watch last night’s episode; they go to watch last year’s.

For the original creators, these are classic syndication exercises; content is sent to them after the originating network exhausts all of its first-airing and first-look opportunities.

The fact is, Netflix is actually building audiences for appointment TV, not eliminating TV advertising.

Let me give you another example. About a decade ago, I missed...

Click here to read the entire article →

Media and Data Analytics with Ben Reid, CEO of Elasticiti

Over the last five years media has changed our lives in significant ways. ‘Cord cutters’ have become part of mainstream society, and Netflix and Amazon have won Emmys left and right. At this point there is so much good media content available at our fingertips it can be overwhelming.

Ben Reid is CEO of Elasticiti – a data analytics consulting and advisory company focused on online media. I was especially excited to speak with Ben because he has a unique perspective on online media and on how media in general is dealing with the data analytics era. It helps that he has been in the industry for well over a decade.

I want to quickly thank Ben for taking the time to chat with me. Data analytics and big data are changing almost every domain. They’ve transformed agriculture, manufacturing, and more. Media is no different.

Over the last four years the media and ad-tech industries have undergone massive change, but for most publishers Big Data is a relatively new concept. Apart from the top 15-20 industry leaders, until recently most publishers did not have...

Click here to read the entire article →

It’s 10:00PM – Do You Know Where Your Audiences Are?

Why Partner Management and Yield Management Need to be BFFs

Once upon a time, media companies had a pretty clear idea of who their audiences were and where they could be found, which helped them advise advertisers on the right programming and day-parting to purchase for their ads.

But the long march to digital TV has upended life for content-originator brands (aka television broadcast companies). Although viewers still watch as much TV as ever, fewer sit down in front of traditional sets to tune in, and that puts them in the control seat. If brand loyalty was hard to achieve when viewers could switch between a handful of channels, it’s become pretty near impossible now that they have a myriad of options at their disposal.

Increasingly, the group with the most real influence over your audiences just may be your partner-management team. Savvy media companies have already elevated these groups to the senior levels, and that’s a pretty good start, but it’s not enough. Individual partner managers need to learn the ins and outs of yield optimization, or at least become BFFs with the analysts on the yield optimization team.

Partner Management As Yield Optimizers (and Vice Versa)

Partner management groups were once a mid-tier function within the media organization, providing important, but hardly mission-critical, services. And in the old world, programmers (those who picked the shows) held all the power, and they strategically picked programming adjacencies to maximize affinity and extend viewing. That’s a lot less relevant now in our era of playlists, video-on-demand, recommendation engines, as well as new content distributors such as Amazon, Hulu, Netflix, YouTube and the myriad OTT and MVPD players. IMHO, the syndication teams are the new goods.

And this, in turn, means media companies have lost a major....

Click here to read the entire article →

WTF Bespoke Analytics? The Dangers of Confusing Flexibility with True Customization

We hear the word “bespoke” used in a variety of contexts these days, from audience segments and system integrations to all manner of IT products. In fact, one could call it the new “disrupt” -- a word with such cachet that everyone claims it applies to them.

Bespoke originally referred to custom-made clothing (emphasis on custom). Therefore, bespoke anything, by definition, isn’t prepackaged; whatever it is, it has been designed from the bottom up to meet a subject’s very specific needs.

So is there such a thing as bespoke analytics for TV networks? And if so, do analytics need to be bespoke?

Let’s start with the second question. Back in the 1970s, media companies were more similar than they were different. There were a handful of TV channels, and advertisers bought audiences based on gross rating points (GRPs). But in the past ten years, digital has obliterated that model.

TV Everywhere, i.e. the ability to consume TV content from any device at any time, has drastically changed consumer behavior. Gone are the days when most of America tuned in at the same time to watch “Seinfeld.” Today, audiences tune in whenever they feel like watching a show, making it difficult for networks and advertisers to identify the day-parts that are over-indexed for their prospects. On top of that, TV networks have faced a never-ending crop of competition for original content, from AMC and HBO, to Amazon, Netflix and Hulu.

This has been a dramatic change for TV networks, which traditionally were the bastion of all things branding. Prior to the digital revolution, TV strategy was all about keeping an eye on viewership and valuable demographics, staking out a position in the marketplace, and doing whatever it took to pry viewership away from the competition.

But that’s all changed completely. TV is now largely niche focused (even the big media companies occupy lots of different niches). Their goals are very different from what they once were, and more significantly, they’re very different from their competitors. As a result, TV marketers need to act more like their advertiser clients, and develop super-sharp direct response marketing skills. Who are their viewers? How are they influenced? How can they be reached and kept engaged? And how can they help advertisers identify and reach their ideal audiences?

This new reality changes the nature of the dialog, and data is at the center of the conversation. Off-the-shelf analytics packages no longer suffice. Media companies want an analytics framework that reflects the drivers of their niche markets and unique goals. Put another way, they want bespoke analytics.

What Do Analytics Really Need to Answer?

The purpose of analytics is to help identify the drivers that will move a business forward. At a high level, companies need to think about the kinds of content and customer experiences that gain eyeballs. It takes a lot of thought and a myriad of iterations for a media company to find its voice and place in the viewer’s mind.

And that’s just the beginning. Media companies need to constantly decipher what matters to them, what they can control, and how they can effect change. These are the critical datasets for their businesses. And they are not the same across all companies. For instance, I work with some companies where the top goal is distributing content as far and wide as possible, and others for which the top goal is keeping users engaged for as long as possible in a single session. As a result, each company requires significantly different datasets and analytics.

That’s why off-the-shelf analytics won’t cut it for today’s media companies. Standardized KPIs will tell you what everyone else in the industry is doing, but how useful is that in meeting your business goals? An analytics package may provide you with 100 data points, but maybe only 20 are important. Worse, there may be 15 data points that are truly valuable to your business but simply aren’t available in pre-packaged analytics packages.

For the record, I don’t mean to sound as if these packages are worthless -- they’re not. In fact, we use several of them with many clients and we get a lot of value out of them. But today’s fragmented media landscape means media companies are forging their own paths for mindshare, and that means they need a robust set of data and analytics unique to their own business. Bespoke analytics are as customized as a tailor-made suit, based on what the business needs to know.

PyGotham 2016: Quick Recap

Elasticiti’s Rob Tsai attended PyGotham in New York City and here’s his recap.

It’s not easy to sacrifice a summer weekend to anything other than the beach or a beer garden, but I’ve always been impressed by the quality of the sessions at PyGotham, so I figured it would be worth the tradeoff.

The talk schedule is available online, while videos of the different talks will most likely be posted on YouTube soon.

Here are a couple of fun talks I enjoyed:

Playing with Python Bytecode:

This was a really cool talk by Scott Sanderson and Joe Jevnik that explored the internals of CPython’s code representation. If you were ever curious to know what a function looks like in bytes – this was the talk for you. Many of us who use Python in our day-to-day work – to make API calls, build web scrapers, build API endpoints, parse CSVs, write data loaders for our databases, write visualization scripts, etc. – might never really need to ‘hack’ CPython bytecode. We benefit so much from the core Python team that we don’t always know (or need to know) what’s going on under the hood. But I found it truly fascinating. It’s kind of like the guy who made a $1,500 sandwich by making everything from scratch: once you witness the dedication that goes into each part of the codebase, you develop a newfound appreciation for all the hard work that has to happen just to make your Python code run.
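If you want a small taste of what the talk covered, the standard library’s dis module will show you the bytecode CPython compiles a function into. A tiny example (the cpm function below is just something I made up to disassemble):

```python
# Peek at CPython's bytecode for an ordinary function using the dis module.
import dis


def cpm(revenue, impressions):
    """Revenue per thousand impressions."""
    return revenue / impressions * 1000


dis.dis(cpm)                 # human-readable disassembly, one bytecode op per line
print(cpm.__code__.co_code)  # the raw bytes the talk was poking at
```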

Probabilistic Graphical Models in Python:

One day I will complete Daphne Koller’s Probabilistic Graphical Models course on Coursera, but Aileen Nielsen’s talk was an excellent overview of how PGMs work and how they are implemented in Python. I really liked how she used a concrete example of a very simple PGM to cover Bayesian networks: a person trying to make it to the Olympics, with the network built up in layers – making the Olympics depends on how you perform at the Trials, which in turn depends on whether you practice or have good genes. Most of the classes I’ve seen on PGMs make you calculate the probabilities by hand (brutal, but probably worth doing once or twice), so it was fun to see all of that implemented in a Python library, letting you build your own networks quickly and see the probabilities calculated on the fly.
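For the curious, here’s a minimal sketch of that Olympics-style network in Python using the pgmpy library (I’m not certain this is the library Aileen used, and the probabilities below are invented purely for illustration):

```python
# Toy Bayesian network: Practice and Genes -> Trials -> Olympics.
# All probabilities are made up for illustration.
from pgmpy.models import BayesianNetwork  # called BayesianModel in older pgmpy releases
from pgmpy.factors.discrete import TabularCPD
from pgmpy.inference import VariableElimination

model = BayesianNetwork([("Practice", "Trials"),
                         ("Genes", "Trials"),
                         ("Trials", "Olympics")])

cpd_practice = TabularCPD("Practice", 2, [[0.6], [0.4]])
cpd_genes = TabularCPD("Genes", 2, [[0.7], [0.3]])
cpd_trials = TabularCPD(
    "Trials", 2,
    [[0.95, 0.7, 0.6, 0.2],   # P(Trials = fail | Practice, Genes)
     [0.05, 0.3, 0.4, 0.8]],  # P(Trials = pass | Practice, Genes)
    evidence=["Practice", "Genes"], evidence_card=[2, 2],
)
cpd_olympics = TabularCPD(
    "Olympics", 2,
    [[0.9, 0.3],              # P(Olympics = no | Trials)
     [0.1, 0.7]],             # P(Olympics = yes | Trials)
    evidence=["Trials"], evidence_card=[2],
)

model.add_cpds(cpd_practice, cpd_genes, cpd_trials, cpd_olympics)
assert model.check_model()

# Probabilities "calculated on the fly": how does practicing change your Olympic odds?
infer = VariableElimination(model)
print(infer.query(["Olympics"], evidence={"Practice": 1}))
```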

Spark Dataframes for the Pandas Pro:

I enjoyed Alfred Lee’s talk on Spark and pandas. We use pandas quite heavily, and while I’ve mostly been writing native HiveQL for my big-data querying, it’s pretty seamless to move to Spark SQL via the SQLContext class – if you like thinking and writing in SQL.

If you think like a Python/pandas developer working with DataFrames, it was pretty interesting to see how the functions and methods are invoked side by side to query, slice, join and subset data. Short answer: use pandas if you’re modeling data you can read into memory on your dev machine; use Spark DataFrames if you have a distributed cluster with data stored in HDFS and compute nodes that can take advantage of shared cluster memory.
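To make the side-by-side concrete, here’s a small sketch of the same aggregation written both ways; the file paths and column names are hypothetical.

```python
# Same "top 10 sites by US revenue" question, answered in pandas and in Spark.
import pandas as pd
from pyspark.sql import SparkSession, functions as F

# pandas: the whole dataset fits in memory on your dev machine.
pdf = pd.read_csv("impressions.csv")
top_pandas = (pdf[pdf["country"] == "US"]
              .groupby("site")["revenue"].sum()
              .sort_values(ascending=False)
              .head(10))

# Spark DataFrames: same logic, but distributed across a cluster reading from HDFS.
spark = SparkSession.builder.appName("pandas-vs-spark").getOrCreate()
sdf = spark.read.csv("hdfs:///data/impressions.csv", header=True, inferSchema=True)
top_spark = (sdf.filter(F.col("country") == "US")
             .groupBy("site")
             .agg(F.sum("revenue").alias("revenue"))
             .orderBy(F.col("revenue").desc())
             .limit(10))

print(top_pandas)
top_spark.show()
```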

Simple Serverless ETL in AWS:

Ryan Tuck’s live demo of building a Pokemon API service using AWS Lambda was pretty amazing. Being able to set up a microservice simply by deploying to AWS Lambda means you don’t need to deal with spinning up virtual machines, loading up your libraries and dependencies, or thinking about scale-out, load balancing and deployment scripts. It’s a new way of thinking, and I think it’s going to be a hugely important trend moving forward as people migrate their monolithic applications toward microservices. The Lambda approach makes sense, as you offload all the scaling and DevOps challenges to vendors like AWS. For batch processing it may not make the most sense, as you are limited to (I believe) a maximum of 300 seconds per request, and you have memory and disk space limitations as well. But it will be interesting to see which parts of an analytics data pipeline could be migrated over to Lambda, and which parts stay outside of it.
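For a flavor of what serverless ETL looks like in code, here’s a minimal sketch of a Python Lambda handler triggered by an S3 upload. The bucket layout and the “transform” are invented for illustration; Ryan’s Pokemon demo was considerably more involved.

```python
# Minimal Lambda-style ETL: an S3 upload triggers the function, which reads the
# raw file, applies a trivial transform, and writes the result back under processed/.
import json

import boto3

s3 = boto3.client("s3")


def handler(event, context):
    record = event["Records"][0]["s3"]   # standard S3 event payload
    bucket = record["bucket"]["name"]
    key = record["object"]["key"]

    body = s3.get_object(Bucket=bucket, Key=key)["Body"].read().decode("utf-8")
    rows = [line.upper() for line in body.splitlines()]  # stand-in for a real transform

    s3.put_object(Bucket=bucket,
                  Key="processed/" + key,
                  Body="\n".join(rows).encode("utf-8"))

    return {"statusCode": 200, "body": json.dumps({"rows_processed": len(rows)})}
```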

This year’s PyGotham featured a diverse lineup of speakers, each bringing high-quality content. I’d encourage developers of any background and skill level to attend next year’s conference; it’s a great opportunity to listen to, and meet, some visionary developers.

Data Viz Camp 2016: Sharing Tools, Approaches and Inspiration

Today’s blog is written by Elasticiti’s Mark Permann.

On July 9 and 10, 2016, I attended Data Visualization Camp at the United Nations, one of the many Open Camps conferences for open source technologies.

Over the last year I’ve focused on dashboard design and development, and I love the mix of creativity and analysis, design and development, aesthetics and metrics involved. Most of my experience is with Tableau, the well-known commercial visualization software, and I’m happy with it - there’s a large development community, support is fantastic, and the product is robust enough to build some amazingly flexible dashboards. But Data Viz Camp offered a live opportunity to survey what else is out there - tech-wise, definitely, but also to hear from different types of practitioners. It was a worthwhile weekend, and I wanted to share some highlights.

Diverse solutions for needs from niche to common

There were a wide variety of tools presented, some from sources that surprised me. The American Museum of Natural History’s Eozin Che talked about the work behind the planetarium show Dark Universe, which is (of course! I realized only then) a visualization of truly big data from space exploration instruments. Partiview is a free version of the software used, and while most of us probably aren’t visualizing geospatial data, Partiview can make 3D scatterplots actually useful for needs like inspecting clustering results. An example from optical character recognition used Partiview to display each handwritten number as its own data point in cluster space.

Partiview Viz of OCR Clustering

Keynote speaker Edward Tufte (from Day 2), whose scholarship in information design well predates today’s tools, also shared a tool of his own, Image Quilts. It’s a Chrome extension that allows you to manipulate Google Images search output, most notably to eliminate the white space - applying his maxim to minimize non-data ink.

Subatomic Particles Quilt

There were two presentations addressing a more common visualization: network graphs. I haven’t personally used these graphs yet, but have often seen them and thought “that looks cool, but how do I get insights out of that hairball?” Nick Fernandez, a postdoc at the Icahn School of Medicine, presented Clustergrammer, which displays networks as adjacency matrices. The ability to spread out the data, choose different sort orders and use color to encode additional information makes such matrices more useful to my eye.

That said, I totally agree that Alicia Powers’ network graphs are an effective means of displaying and analyzing nutrition data - you know, like how you connect to lettuce? She used Neo4j to graph individuals, their meals, and their food and ingredient constituents, and told a convincing story that a hot dog with sauerkraut can be recommended for better nutrition! If that sounds incredible, I think you’ll find her talk entertaining and thought-provoking.

“Ladies Who Lunch” in Neo4J

What do you do if you’re prototyping a viz but the dataset isn’t ready yet? Matt Strom built datumipsum.com to the rescue. Datum Ipsum lets you create real-looking data by adding Perlin noise (invented to depict imaginary landscapes in the movie Tron) to tweakable change signals; you can get it looking the way you like and export the data.

Open source easier to use with Tiny Tools and Vega

D3’s gallery shows how powerful it is for developing browser viz, but it requires a lot of code to get even something basic on the screen - as Adam Pearce of the New York Times put it, 48 lines for a scatterplot. Adam shared several D3 tools that can shrink the verbosity down to 9 lines, reduce time spent on formatting, and enable annotations.

Keynote speaker Arvind Satyanarayan (from Day 1) shared another tool for viz with less code: Vega. Arvind pitched Vega as a viz platform; its JSON format allows embedding in other software packages, and it leverages the widely used D3, JavaScript, SVG, and Canvas technologies. You could use Polestar, a Tableau-like drag-and-drop interface, to quickly explore and prototype a viz; customize it by editing a few lines of Vega; and export to D3 or SVG to get it how and where you need it to be. You can find an earlier presentation of Arvind’s talk here.

Practices and principles for design & development

The diversity of tech presented was equalled by the fields and approaches of the presenters themselves. Several were journalists, for whom a common goal was personalization: designing graphics that quickly connect the reader to the story. Nadja Popovich’s piece “Are you reflected in the new Congress?” is a great example. It invites the reader to begin by filtering on multiple dimensions immediately to find “you,” inverting the usual story order of big picture first, then drill-down. It makes a lot of sense and parallels what I do with my Tableau business dashboards, putting the interactive controls at top left.

K.K. Rebecca Lai’s “Death in Syria” follows a big-picture-then-details order, but tackles the problem of connecting us to that big picture (in this case, 200,000 civilian deaths) by representing each death with a small, slightly fuzzy marker; plotted together, they fill several inches of scrolling column space. She described the marker as a “dot” and said it was controversial to represent people as such, but frankly, I think she undersold their work by using that term. Rebecca didn’t talk about how they arrived at the marker, but to me it reads as what it is: individuals seen from a great distance, pixelated but not pixels. Reading it on the web page, you can’t see it all at once; you scroll and scroll till you find the end of it. It’s a triumph of information design, powerfully conveying the scale of a single number while doing so with great sensitivity.

A former journalist, Natalia Rodriguez, talked about the mindset shift she experienced in her current role working with scientists at the American Museum of Natural History: using visualization to explore and discover data before the story has been determined, to see before showing. I was charmed to hear her wish for museums to be at the forefront of innovation, because they’ve certainly inspired my own innovations. I found an echo of my own experience of the creative process - multiple possibilities, beginnings that aren’t pretty, middle stages of incomplete functioning - in the evolution on display in AMNH’s Hall of Invertebrate Origins. And a recent MoMA Jackson Pollock retrospective unconsciously influenced the development of a series ranking bump chart; “that looks like art!” were the first words uttered by a client upon seeing it.

A similar sentiment was expressed by Hermann Zschiegner, who said one of his “favorite things is going to the Met” before showing a cuneiform tablet to illustrate the long history and usefulness of data visualization. Hermann, founder of data viz agency TWO-N (which sponsored and organized Data Viz Camp), characterized their work as creating “story platforms” - software for clients to tell stories with their data, e.g. the Art Genome Browser. He described their development approach: get data first, quickly prototype, and iterate. “Agile is the only way to succeed,” he said; “we can’t just hand over design specs to developers.” Rather than following “a theory, we focus on being aware, open, curious” and engaged. I find that to match up pretty well with how we work at Elasticiti.

Being careful matters too, of course, and Elliot Noma reminded us that the choice of what to graph matters just as much as how you graph it. He conveyed this crisply by showing how the typical “hockey stick” chart on a linear scale can become a slowing-growth chart on a log scale, or a declining growth-rate chart - all depictions of the same data.
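To see the point for yourself, here’s a quick matplotlib sketch using a made-up series whose period-over-period growth rate declines; the same numbers read as a hockey stick, a flattening curve, or an outright decline depending on what you choose to graph.

```python
# One synthetic series, three honest-but-different charts.
import numpy as np
import matplotlib.pyplot as plt

t = np.arange(1, 41)
rates = 0.20 * np.exp(-t / 20.0)   # period growth rate that slowly declines
y = 100 * np.cumprod(1 + rates)    # cumulative level still looks like a hockey stick

fig, axes = plt.subplots(1, 3, figsize=(12, 3))
axes[0].plot(t, y)
axes[0].set_title("Level, linear scale ('hockey stick')")
axes[1].semilogy(t, y)
axes[1].set_title("Level, log scale (growth is slowing)")
axes[2].plot(t[1:], 100 * np.diff(y) / y[:-1])
axes[2].set_title("Growth rate, % (declining)")
plt.tight_layout()
plt.show()
```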

Not surprisingly, former professor Tufte provided the most principled thinking about the opportunities and challenges of data visualization. He suggested we find a way to increase the data-ink ratio of linking lines by using words to draw them. He proclaimed the future of visualization to be ever increasing throughput via video and 4K, and that we should design up to the latest display standards to foster the abandonment of low-rez tech. Perhaps most simply and compellingly, he proffered Google Maps as familiar and compelling evidence that there “is no relationship between the amount of information and the ability of comprehension...Clutter and overload aren’t inherent properties of information; they are failures of design.” That’s data visualization inspiration you can see every day.

Don’t Leave Money on the Table (The Importance of Ad-Serving Diagnostics)

Every day we meet with media companies who spend a lot of time and resources on digital advertising, and yet they leave an abundance of money on the table. Why?

Well, the DevOps groups in most media companies are really good at monitoring sites and apps to assess whether they’re working correctly -- from a technical point of view. But when it comes to ad serving? Not so much.

Part of the reason is that publishers have a supreme trust in the systems they’ve set up. If Google Analytics or Adobe Analytics says a web page received four million unique visitors, few will question that number. Ditto for their ad server. If four million users visited a page, then surely the ad server served up ads for a corresponding number of visitors, right?

Hardly. We’ve seen alarming discrepancies between disparate systems in ad tech. It’s rare for a media company to see identical numbers between its page views or video players and ad tags fired.

Those discrepancies are more than just accounting errors...they represent real revenue that’s been lost to the publisher. If you don’t want to leave money on the table, then you’ll need to pay attention to ad-serving diagnostics.

Reasons for Discrepancies

There are lots of reasons why these discrepancies occur. To begin, they may be an indication that your ad tags aren’t firing correctly and need to be optimized. Or you may have too many ad tags on a page, and that bloat slows down the page load. So although your web analytics counted the visit, in reality the user got frustrated and clicked away before the ad (and page) could render.

Other culprits:

  • Viewability, which may discount an ad if at least 50% of it isn’t in view for one full second, or the presence of ad-blocking software in the user’s browser

  • Counting methodologies, which often differ from platform to platform, especially when it comes to exclusion criteria. Is there a gold standard? If so, which system?

  • Targeting criteria, which may come from multiple sources, including the page itself or a third party like Oracle Data Cloud, any of which may be mangled, leading to more discrepancies.

Many of these reasons can be eliminated, or at least reduced, and those reductions mean higher yields for you. That’s why it’s critical that you look at key metrics holistically across your organization. Discrepancies are everybody’s problem to solve.

What to Look For

There are key metrics that monitor the health of your ad-serving environment. Look for:

  • Areas of your site and apps where content renders without ads. Once you’ve identified those areas, you can begin to analyze why ads aren’t rendering. Some of the issues, such as the presence of ad-blocking software, require complex solutions. For instance, many sites won’t allow users with ad-blocking software to access their content. The New York Times appeals to such users, explaining how ad blocking hampers its ability to earn revenue, and asks them to add its site to their whitelists.

    Some issues, such as problems with ad tags, are easier to address.

  • Ad unit implementations or page designs that are viewability-unfriendly. Many publishers and advertisers use a separate system for counting viewable impressions, so right away there will be discrepancies between what those viewability measurement tools report and what your ad server reports. But there’s still a lot that’s in your control. Test your pages to see how viewable your ad units actually are.

  • Latencies. Increasingly, users are referred to an article from a Facebook post, Tweet or email. If that article takes too long to load, they’ll click away, squandering your opportunity to earn revenue. Latency can occur because of:

  • Header bidding, which goes through a complex process of selecting the best ad to show to an individual user (where “best” means highest CPM). Although this process typically occurs in less than one second, hiccups do occur, and you need to know about them so you can fix them.
  • Pixels. These are the necessary evils of advertising. You can’t show an ad -- and consequently earn money -- without your advertisers’, ad networks’ or other demand partners’ pixels firing on your site. Often, your demand partners “usher” their own partners onto your site, and they bring their own pixels to your pages. All of these tags slow down page loads.

  • Page design. A page may be beautiful once fully loaded, but think of that visitor who clicked on it from Instagram or from a search page...does he or she have the patience to wait?

Moving Forward

To recapture all the revenue you’re probably leaving on the table, you’ll need a strategy for assessing whether your ad-tech stack is working as expected; a test that’s completely independent of ad campaigns themselves (e.g. take a hard look at the inventory).

How do you do that? One way is to align datasets, such as your web page and ad server data. Are they relatively in sync or off by factors?
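As a concrete starting point, here’s a minimal pandas sketch of that alignment check. The file and column names are hypothetical, and the 10% threshold is a judgment call rather than an industry standard.

```python
# Join daily page views from web analytics with ad-server impressions and flag
# days where the gap looks like more than routine measurement noise.
import pandas as pd

pageviews = pd.read_csv("analytics_pageviews.csv", parse_dates=["date"])   # date, page_views
ad_calls = pd.read_csv("adserver_impressions.csv", parse_dates=["date"])   # date, impressions

daily = pageviews.merge(ad_calls, on="date", how="outer").fillna(0)
daily = daily[daily["page_views"] > 0]   # avoid dividing by zero
daily["discrepancy_pct"] = 100 * (daily["page_views"] - daily["impressions"]) / daily["page_views"]

# Anything beyond roughly a 10% gap deserves a closer look.
flagged = daily[daily["discrepancy_pct"].abs() > 10]
print(flagged.sort_values("discrepancy_pct", ascending=False))
```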

Or you can engage an external service, such as Dynatrace (FKA Gomez), to devise your own tests. Some people even do their own analytics of their logs to assess how the various parts of their ad-tech stack are performing. Of course, this could quickly turn into an exploratory Big Data exercise, but one that is justified given the amount of revenue at stake.

It Takes a Village

Assessing and addressing any health issues in your monetization efforts requires a shared discipline among all parties. One of those parties is the beneficiary of the monetization, typically someone in your revenue/Ad Ops department, but they’re not the only ones.

Often, the solution falls outside of the revenue/Ad Ops group, which may need to engage the folks responsible for maintaining or designing the site. We recommend that media companies build a structure for collaboratively analyzing potential issues, so that each group understands exactly what’s at stake and why it’s important that they fix it.

Advertising revenue is more important now than it has ever been before. And it’s getting harder to earn, as advertisers demand better results at lower costs. This means it’s a downright shame to walk away from money that rightfully should be yours.

Enterprise Analytics Embrace Open Source Software - The Best of Both Worlds

Improvements in Open Source Code Give Analysts the Best of Both Worlds

Gone are the days when open-source software was considered heretical and dangerous, something you’d never touch for commercial use. Over the years, lots of talented people have invested thousands of hours of development and testing into open source languages and now, in some instances, their algorithms exceed commercial applications in terms of scope and performance. And that, in turn, means both enterprises and commercial software vendors have learned that success requires them to leverage the best of both worlds.

Benefits of Widespread Brain Power

Open source languages benefit from the wisdom of crowds, as is abundantly evident in the predictive space with both R and Python. These languages benefit from their accessibility and alignment with academia. As the languages mature and attract new users, there have been thousands of practitioners and academics contributing and aggressively testing the source code (for the sheer love and utility of it). That collective effort has made these applications truly cutting edge. And not for nothing, few companies have the financial muscle to dedicate a similar number of engineers and analysts to build out their libraries.

Balancing IT’s Needs

But what about user experience, security, governance and stability...all the things that keep IT leaders up at night? They’re charged with protecting their enterprise data, and they need absolute assurance that sensitive and strategic information is only available to the appropriate people. At the same time, they need to ensure their employees know how to use the tools they’re provided, lest those employees find insecure workarounds.

Commercial providers are good at addressing CTOs’ concerns, and enjoy a high level of confidence among their customers. Still, I think a lot of people were surprised when Microsoft announced its acquisition of Revolution Analytics. While a modest-sized deal by MSFT standards, it signaled a very different stance toward analytics and the open source world. As R spreads through companies, CTOs using the MSFT enterprise distribution of R have the confidence that their customers benefit from best-in-class algorithms, but with the ease of use and governance that are hallmarks of commercial solutions. In other words, they’re giving their customers the best of both worlds.

And it’s not just the giants. Business intelligence companies such as Tableau and MicroStrategy, known for their intuitive commercial environments but off-the-shelf algorithms (which some analysts find limiting), are also integrating with open-source languages like R. This approach offers analysts all the benefits of a commercial analytics framework fully integrated with R’s rich predictive libraries, allowing them to tap into those libraries wherever they need to, always in the context of a high-quality commercial software environment. It’s an approach that effectively neutralizes the either/or choice and makes Tableau and MicroStrategy more appealing and relevant.

Equally important are the ongoing efforts of publishers to embrace Big Data technology and thinking. Big Data is still predominantly based on Hadoop, and commercial providers such as Cloudera, Hortonworks, and MapR bring enterprise services and controls to an open source project.

How Open Source Benefits Media Companies

How can media analysts benefit from the best of both worlds? Given the amount of data thrown at analysts in the media space, coupled with the ever-increasing list of demands placed on them, these open-source algorithms can be game changers. A few examples: R’s strength in statistics makes it a particularly good tool for predicting inventory availability, which is difficult at best to do with other off-the-shelf applications. In fact, R has dozens of powerful libraries (e.g. zoo, forecast, timeSeries, MTS) that people have written over the years, which gives media companies a lot of latitude to pick an approach and fine-tune it to their business. And Python is great at machine-learning clustering, specifically via the extremely popular scikit-learn library, which media companies use successfully to do things like discover the attributes of their most loyal users for lookalike modeling, among many other use cases.
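As one illustration of the scikit-learn use case, here’s a minimal sketch that clusters users on a few engagement features and then profiles each segment; the file and feature names are hypothetical, and a real lookalike workflow would involve much more feature engineering.

```python
# Cluster users on engagement features, then inspect the heaviest-usage cluster
# to find the attributes worth feeding into lookalike modeling.
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

users = pd.read_csv("user_engagement.csv")   # e.g. visits_per_week, avg_session_min, videos_watched
features = ["visits_per_week", "avg_session_min", "videos_watched"]

X = StandardScaler().fit_transform(users[features])
users["cluster"] = KMeans(n_clusters=4, n_init=10, random_state=42).fit_predict(X)

# Profile each cluster; the one with the heaviest engagement is your "loyalist" segment.
print(users.groupby("cluster")[features].mean())
```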

Senior management wants to stay competitive and has little interest in lengthy integration cycles, especially with new markets and metrics promising new revenue opportunities. Having a tech stack fueled with algorithms designed to respond faster will make your life a whole lot easier (and your CRO happier).

Ad-Tech Loves Open Source

If I still haven’t convinced you, consider this: the tech providers media companies use have long  recognized the inherent value of open source. Like Microsoft, they may put some commercial code around it for stability, but most of their prototyping, and much of their production code, use substantial amounts of Python and R, and many use MySQL or Postgres databases.

In fact, Python and R are so mature and so well accepted that those languages are foundational at virtually all of the data academies.

While digital has been disrupting many industries, the media industry is in another league when it comes to reinvention. We move at breakneck speeds, which means we need a much higher level of data agility in order to respond to changes in the market. Open source makes that possible by harnessing the brainpower of thousands of really smart people, both inside and outside your company!