Kylin is an open source distributed analytics engine from eBay Inc. that provides a SQL interface and multi-dimensional analysis (OLAP) on Hadoop, supporting extremely large datasets.
At SequenceIQ we are always interested in the latest emerging technologies, and we try to offer them to our customers and the open source community. A few weeks ago eBay Inc. released Kylin as an open source product and made it available to the community under an Apache 2 license. Since we share their approach to open source software, we have partnered with them to Dockerize Kylin, making it extremely easy to deploy Kylin locally or in the cloud using Cloudbreak, our Hadoop as a Service API.
While there is pretty good documentation available for Kylin, we’d like to give you a really short introduction and overview.
For an overview of the components used and the architecture, please check this diagram.
Kylin cluster running on Docker
We have put together and fully automated the steps of creating a Kylin cluster. The only thing you need to do is pull the container from the official Docker repository by issuing the following command.
Once the container is pulled you are ready to start playing with Kylin. Get the following helper functions from our Kylin GitHub repository (make sure you source them).
You can specify the number of nodes you’d like to have in your cluster (3 in this case). Once all the necessary Hadoop services are installed, Kylin is built on top of them, and you can then reach the UI on:
The default credentials to log in are admin/KYLIN. The cluster is pre-populated with sample data and is ready to build cubes as shown here.
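Since the Hadoop services and Kylin take a while to come up, it can be handy to wait for the UI before trying to log in. Below is a minimal sketch of such a wait loop. Note that `wait_for_url` is a hypothetical helper of our own naming, not one of the functions shipped in the Kylin repository, and the attempt count and delay defaults are arbitrary assumptions. It only relies on standard `curl` options.

```shell
# Hypothetical helper (not part of the Kylin helper functions).
# Polls a URL until it answers without an HTTP error, or gives up.
# Usage: wait_for_url URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_url() {
  url="$1"
  attempts="${2:-30}"   # assumed default: 30 tries
  delay="${3:-10}"      # assumed default: 10 seconds between tries
  i=1
  while [ "$i" -le "$attempts" ]; do
    # --fail makes curl exit non-zero on HTTP errors (status >= 400)
    if curl --silent --fail --output /dev/null --max-time 5 "$url"; then
      echo "up: $url"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "timed out waiting for $url" >&2
  return 1
}
```

After sourcing it, you could run something like `wait_for_url "$KYLIN_URL" && echo "UI is ready"`, where `KYLIN_URL` is a placeholder you set to the UI address of your own cluster.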