Case Study: Kaltura Data Lake

Giles Cottle, Thu 02 May 2019

Introduction 

Kaltura is a leading provider of technology for video service providers who want to offer fully-fledged, personalized multiscreen services across linear, VOD, and time-shifted TV. To help its customers stand out in a crowded market, Kaltura will create the world’s first data lake dedicated to Cloud TV services, built on AWS. The aim is to offer Kaltura’s customer base an off-the-shelf, AWS-based data lake that they can use to query their customer data. Kaltura also intends to build off-the-shelf dashboards and models that are compatible with AWS to bring machine learning and AI to a wide range of video providers.

What we did

To build the data lake, we will take multiple data sources from different Kaltura customers, unify and normalize them to create common features, and join them with further pre-integrated TV-centric third-party data sources. Kaltura customers will have a unified data structure that they can easily query using AWS.
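As an illustration of the unify, normalize, and join steps described above, here is a minimal Python sketch. It is not the project's actual pipeline: the customer names, field names, unit conversion, and third-party metadata are all hypothetical, chosen only to show how differently shaped sources might be mapped onto one common structure.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical per-customer mappings: source field name -> common field name.
FIELD_MAPS: Dict[str, Dict[str, str]] = {
    "customer_a": {"uid": "user_id", "asset": "content_id", "secs": "watch_seconds"},
    "customer_b": {"userId": "user_id", "contentId": "content_id", "durationMs": "watch_seconds"},
}

# Hypothetical unit conversions, keyed by (source, common field name).
CONVERTERS: Dict[Tuple[str, str], Callable[[Any], Any]] = {
    ("customer_b", "watch_seconds"): lambda ms: ms / 1000,  # milliseconds -> seconds
}


def normalize(source: str, record: Dict[str, Any]) -> Dict[str, Any]:
    """Map one raw record from a given source onto the common schema."""
    out: Dict[str, Any] = {"source": source}
    for src_field, common_field in FIELD_MAPS[source].items():
        value = record[src_field]
        convert = CONVERTERS.get((source, common_field))
        out[common_field] = convert(value) if convert else value
    return out


def enrich(rows: List[Dict[str, Any]], metadata: Dict[str, Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Left-join normalized rows with third-party metadata keyed by content_id."""
    return [{**row, **metadata.get(row["content_id"], {})} for row in rows]


if __name__ == "__main__":
    rows = [
        normalize("customer_a", {"uid": "u1", "asset": "c1", "secs": 120}),
        normalize("customer_b", {"userId": "u2", "contentId": "c1", "durationMs": 90000}),
    ]
    # Hypothetical third-party, TV-centric metadata.
    print(enrich(rows, {"c1": {"genre": "drama"}}))
```

Keeping the mappings in data rather than in code means that onboarding another source, under these assumptions, is a matter of adding one more entry to the mapping table.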

Why AWS?

AWS is the natural infrastructure choice, both for the technology and feature roadmap we can leverage and for the scale the Kaltura platform requires. Kaltura’s Targeted TV solution serves more than 40 million monthly users and handles over 2 billion daily API calls, so the platform needs to be scalable and robust enough to carry that load and to accommodate future growth.

A key challenge is ensuring that we select the right technologies for the project, taking the greatest advantage of the AWS toolset, and building a data structure that is most relevant for TV and video customers. For this project, that means leveraging S3 for storage, Athena for data querying, and ECS for processing. ECS is particularly crucial to the success of the project: it scales easily and lets us process data in parallel, spinning up EC2 instances when we need more processing power and scaling them down when we don’t.
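To make the S3, Athena, and ECS split more concrete, the following Python sketch shows one way the pieces could fit together. It is an assumption-laden illustration rather than the architecture used in the project: the table, the `dt` partition column, the column names, and the helper functions are hypothetical. The only real AWS call is boto3's Athena `start_query_execution`, and it is imported lazily so the rest of the sketch runs without AWS credentials.

```python
import re
from datetime import date
from typing import Any, Dict, List


def batches(keys: List[str], size: int) -> List[List[str]]:
    """Split a list of S3 object keys into fixed-size batches.

    Each batch could be handed to a separate container task so that
    batches are processed in parallel.
    """
    if size < 1:
        raise ValueError("size must be at least 1")
    return [keys[i:i + size] for i in range(0, len(keys), size)]


def build_athena_request(database: str, table: str, day: str, output_location: str) -> Dict[str, Any]:
    """Build keyword arguments for Athena's start_query_execution.

    The table, the dt partition column, and the selected columns are hypothetical.
    """
    date.fromisoformat(day)  # raises ValueError unless day is YYYY-MM-DD
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", table):
        raise ValueError("unexpected table name")
    query = (
        "SELECT content_id, COUNT(DISTINCT user_id) AS viewers "
        f'FROM "{table}" WHERE dt = \'{day}\' GROUP BY content_id'
    )
    return {
        "QueryString": query,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def run_query(request: Dict[str, Any]) -> str:
    """Submit the query to Athena and return its execution id (needs AWS credentials)."""
    import boto3  # imported here so the helpers above work without boto3 installed

    client = boto3.client("athena")
    response = client.start_query_execution(**request)
    return response["QueryExecutionId"]


if __name__ == "__main__":
    print(batches(["a.json", "b.json", "c.json"], 2))
    # Hypothetical database, table, and results bucket.
    print(build_athena_request("datalake", "viewing_events", "2019-05-02", "s3://example-results/"))
```

Athena runs queries asynchronously, so a caller of `run_query` would still need to poll for completion before reading results from the output location.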

What’s next?

Dativa will continue to work with Kaltura to develop out-of-the-box machine learning models that can help cloud TV service providers optimize user acquisition and retention, increase content consumption, and improve monetization. We will also build reporting dashboards on AWS that business end-users can use to surface the metrics most critical to their business, quickly and easily.

Dativa is a global consulting firm providing data consulting and engineering services to companies that want to build and implement strategies to put data to work. We work with primary data generators, businesses harvesting their own internal data, data-centric service providers, data brokers, agencies, media buyers and media sellers.
