Good preparation is the key to data science


Giles Cottle, Fri 15 December 2017

There's a perception that data science is primarily about machine learning, PhD-level statistics, and massive-scale processing of petabytes of data. Although many of our team do have PhDs and we do process many petabytes of data, there's another critical element of data science consulting that often gets missed.

Take the example of customer retention: it continues to be the most significant driver for the uptake of data science consulting projects among subscription businesses. The mantra that it costs at least five times more to acquire a new customer than to retain your existing ones remains broadly correct for most companies.

There’s something slightly intangible and almost “magical” about this kind of project – we are looking to accurately predict if a subscriber is going to churn or not and give operators the tools to do something about it.

However, the reality is far from magical. When our data science consulting team build models to help operators retain customers better, we interrogate granular viewing data mapped to any number of other fields. Combined with traditional indicators of churn like the length of subscription and CRM data, this gives operators an entirely new lens through which to see the problem.

Success comes primarily not from how sophisticated the algorithms are, but from the amount and quality of the data, and from which features we extract from that dataset and feed into the model. Consider the features you might use to predict churn in a sizable Western market and compare them to those for a smaller Asian market. In the West, we'll focus on billing information and rich CRM data. In Asia, CRM data is frequently less accurate, and targeting based on language and ethnicity is much more common.

You also need to take into account the differences between operators. A large vertically integrated provider with multiple expensive sports and movie rights will need a very different model to an IPTV provider that is using TV specifically as a means to grow broadband and mobile subscribers. These two operators, even in the same market, need very different approaches to churn.

An excellent exploratory analysis will point you towards the subtle changes in behavior that indicate that a customer is likely to churn. It's never as simple as you may think. Every operator we have worked with thinks VoD is a good retention strategy. It isn't always, and we've even found operators for whom heavy VoD viewing is an indicator that a subscriber is more likely to leave.
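As a rough illustration of this kind of exploratory check, the sketch below compares churn rates for light and heavy VoD viewers. The records, the 10-hour threshold, and the function names are all invented for the example; they are not taken from any real operator's data or from our own tooling.

```python
from collections import defaultdict

# Hypothetical toy records: (monthly VoD hours, churned during the period?)
# These values are made up purely to show the mechanics.
subscribers = [
    (0.5, False), (1.0, False), (2.0, True), (3.5, False),
    (12.0, True), (15.0, True), (18.0, False), (25.0, True),
]


def vod_bucket(hours, heavy_threshold=10.0):
    """Label a subscriber as a light or heavy VoD viewer (threshold is illustrative)."""
    return "heavy" if hours >= heavy_threshold else "light"


def churn_rate_by_bucket(records):
    """Return {bucket: share of subscribers in that bucket who churned}."""
    totals = defaultdict(int)
    churned = defaultdict(int)
    for hours, has_churned in records:
        bucket = vod_bucket(hours)
        totals[bucket] += 1
        churned[bucket] += int(has_churned)
    return {bucket: churned[bucket] / totals[bucket] for bucket in totals}


rates = churn_rate_by_bucket(subscribers)
for bucket, rate in sorted(rates.items()):
    print(f"{bucket}: {rate:.0%} churned")
```

If the heavy bucket churns more than the light one, as it does in this toy data, that is a prompt to dig deeper rather than a conclusion: the pattern could reflect a particular package, tenure band, or content type.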

The final preparatory action is data cleansing: our data science consulting team probably spend 75% of their time cleaning and preparing data. Common issues we find in retention projects are an insufficient ability to distinguish subscribers who have churned, poor metadata, and an inability to look at historic subscription status.
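To make the churn-labelling problem concrete, here is a minimal sketch that derives a label from a history of subscription status snapshots and refuses to guess when the history is too thin. The data layout, the status values ("active", "cancelled"), and the function name are assumptions made for the example; real operator data will look different.

```python
from datetime import date

# Hypothetical status history: subscriber id -> list of (snapshot date, status).
history = {
    "a1": [(date(2017, 1, 1), "active"), (date(2017, 2, 1), "active"),
           (date(2017, 3, 1), "cancelled")],
    "b2": [(date(2017, 1, 1), "active"), (date(2017, 2, 1), "active"),
           (date(2017, 3, 1), "active")],
    "c3": [(date(2017, 3, 1), "cancelled")],  # no earlier history available
    "d4": [],                                 # no history at all
}


def label_subscriber(snapshots):
    """Return 'churned', 'retained', or 'unknown' for one subscriber."""
    if not snapshots:
        return "unknown"
    # Sort by snapshot date so the last entry is the most recent status.
    statuses = [status for _, status in sorted(snapshots)]
    if statuses[-1] == "active":
        return "retained"
    # Only count a cancellation as churn if we saw the subscriber active first.
    if statuses[-1] == "cancelled" and "active" in statuses[:-1]:
        return "churned"
    return "unknown"


labels = {sub_id: label_subscriber(snaps) for sub_id, snaps in history.items()}
unknown = sorted(sub_id for sub_id, label in labels.items() if label == "unknown")
print(labels)
print("needs investigation:", unknown)
```

Keeping an explicit "unknown" label, rather than forcing every subscriber into churned or retained, makes the gaps in historic status visible before any model is trained on them.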

Data science consulting might seem like it's all about artificial intelligence and predictive analytics. However, like most things in life, success is down to proper preparation.

