Rill Meets StreamSets

Building data pipelines with StreamSets Data Collector

In the previous post we explored the Twitch API using the Elixir programming language, in order to plan a process that acquires data from it. Data acquisition is a common problem in data analysis and business intelligence. Data warehousing has a process called ETL (Extract, Transform, Load), which describes how data flows from source systems to destinations. One way to acquire data is to write custom code for each source, which brings challenges of maintenance, flexibility, and reliability. [Read More]

Rill Stage 1

Exploring Twitch API

The goal of the Rill project is to collect data about online streams from Twitch (and, possibly, other streaming platforms) for further analysis. First, set up a Twitch client ID according to: http://blog.danielberkompas.com/elixir/2015/03/21/manage-env-vars-in-elixir.html The process of obtaining data about a particular user’s stream looks like this: 1) Find the user’s username (e.g., from a Twitch URL). 2) Make a request to the Twitch API to convert the username to a stream id. 3) Make a request to the Twitch API to obtain data about the user’s stream (is there a live stream, or is a recording being played). [Read More]
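
The three steps above can be sketched roughly as follows. This is a minimal, language-neutral illustration written in Python rather than the Elixir used in the post. The endpoint paths (`/users?login=...`, `/streams/...`), the JSON field names (`users`, `_id`, `stream`, `stream_type`), and the injected `fetch` function are assumptions made for the example, not confirmed details of the Twitch API or of Rill itself.

```python
from typing import Callable, Optional
from urllib.parse import urlparse

# Hypothetical: a function that performs an authenticated GET against the
# Twitch API (sending the client ID) and returns the decoded JSON body.
Fetch = Callable[[str], dict]


def username_from_url(url: str) -> str:
    """Step 1: take the first path segment of a channel URL as the username."""
    path = urlparse(url).path
    return path.strip("/").split("/")[0].lower()


def stream_id_for(username: str, fetch: Fetch) -> Optional[str]:
    """Step 2: convert a username to an id (assumed endpoint and fields)."""
    body = fetch(f"/users?login={username}")
    users = body.get("users", [])
    return users[0]["_id"] if users else None


def stream_status(stream_id: str, fetch: Fetch) -> str:
    """Step 3: report whether a stream is offline, live, or a replayed recording."""
    body = fetch(f"/streams/{stream_id}")
    stream = body.get("stream")  # assumed to be null/None when nothing is playing
    if stream is None:
        return "offline"
    return stream.get("stream_type", "live")


def fake_fetch(path: str) -> dict:
    """Stand-in for real HTTP calls, returning canned responses."""
    canned = {
        "/users?login=someuser": {"users": [{"_id": "42"}]},
        "/streams/42": {"stream": None},
    }
    return canned[path]
```

Passing `fetch` in as a parameter keeps the three steps free of any particular HTTP client, so they can be exercised with canned responses such as `fake_fetch` before a real client ID is configured.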