Open Source BI Stack


The use of data by people and businesses around the world is on the rise. Almost everyone is now looking for an analytics solution so they can make business decisions based on insights generated from data. As a result, a number of DW/BI solutions have appeared on the market. Tools like Informatica, Cognos, Pentaho, QlikView, MSBI and Oracle BI are considered among the top tools in the market. They bring out the very best in the data, impacting businesses around the globe.

Now the immediate question that comes to mind is pricing. Of course, these tools come with a price tag, and you will have to pay to use them. But worry not: the open source community comes to the rescue.

In this blog, I am going to share my personal recommendation for a BI stack that is mostly open source and will cost you no money. There are basically 5 layers that BI developers need to build.

  1. Data Storage Layer: This is the database layer, where developers store the data.
  2. ETL Layer: This is the place for extracting, transforming and loading your data. It is where you apply your business logic to the data.
  3. Reporting Layer: This stage generates reports based on the data loaded in step #2.
  4. OLAP/Cube Layer: This is the place for analytical queries, slicing and dicing multi-dimensional cubes.
  5. Dashboard Layer: This presents the results to the user through web-based designs and layouts.
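
To make the first three layers concrete, here is a minimal sketch of data flowing through storage, ETL and reporting. It is only an illustration, not how the Pentaho tools work internally: Python's built-in sqlite3 module stands in for the database, and the sample CSV, the `sales` table and the function names are all hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical raw export; in practice this would come from a source system.
RAW_CSV = """order_id,region,quantity,unit_price
1,North,2,10.00
2,South,1,25.50
3,North,5,4.00
"""

def extract(text):
    """Extract: parse the raw CSV into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: apply business logic (here, revenue = quantity * unit_price)."""
    return [
        (int(r["order_id"]), r["region"], int(r["quantity"]) * float(r["unit_price"]))
        for r in rows
    ]

def load(conn, records):
    """Load: write the transformed records into the storage layer."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales "
        "(order_id INTEGER PRIMARY KEY, region TEXT, revenue REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", records)
    conn.commit()

def report(conn):
    """Report: summarise revenue per region from the loaded table."""
    return conn.execute(
        "SELECT region, SUM(revenue) FROM sales GROUP BY region ORDER BY region"
    ).fetchall()

# An in-memory sqlite3 database stands in for the real database (assumption).
conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW_CSV)))
revenue_by_region = report(conn)
print(revenue_by_region)
```

In a real stack the ETL tool would handle the extract, transform and load steps, and the reporting tool would run the final query against the database.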

For each of the above layers, here are my BI stack recommendations for developers:

[Figure: BI Stack – Open Source]

Development Plan:

  • PostgreSQL – Database layer. Install it from the PostgreSQL site and follow the instructions. I recommend it because it's open source and can handle large volumes of data. Even the Amazon Redshift and Greenplum databases are based on PostgreSQL.
  • Pentaho Data Integration Community Edition (CE) – This is the ETL tool. Install the Community Edition from the official Pentaho site. You can fetch data from various sources, including Hadoop and AWS, and perform your ETL activities. It's free and has an active developer community.
  • Pentaho Report Designer Community Edition (CE) – This is a graphical tool that generates reports from data streamed through the Data Integration engine, without the need for any intermediate staging tables.
  • Schema Workbench – This is a visual design interface that allows you to create and test Mondrian OLAP cube schemas. You can present your data multi-dimensionally and let users select which dimensions and measures to explore, interactively drilling into cross-tabulated data.
  • Saiku Analytics – [Note: Saiku Analytics is no longer supported and has been removed from the market.] Saiku is a modular open-source analysis suite offering lightweight OLAP that is easy to embed, extend and configure. It is user-friendly and has an intuitive analytics interface. You can drag and drop columns to create your own report. It is an alternative to Analyzer Reports, which is available in the Pentaho Enterprise edition. If you have worked with Tableau, Saiku will feel almost the same but without burning a hole in your pocket. The Saiku plugin is available from the Pentaho Marketplace.
  • CTools – A set of tools and components, working on top of Pentaho, created and maintained by Webdetails to allow the creation of advanced dashboards. It's free and open source, and you can create dashboard frameworks without much coding effort. You can download CTools from the Pentaho Marketplace.
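
If "slicing and dicing" a cube sounds abstract, the idea can be sketched in a few lines of plain Python. This is not the Mondrian or Saiku API, just an illustration of the concept those tools expose through a GUI. The fact rows, the dimension names and the `slice_cube` and `roll_up` helpers are all made up for the example.

```python
from collections import defaultdict

# Hypothetical fact rows: (year, region, product, sales_amount).
FACTS = [
    (2023, "North", "Widget", 100),
    (2023, "South", "Widget", 80),
    (2023, "North", "Gadget", 50),
    (2024, "North", "Widget", 120),
    (2024, "South", "Gadget", 70),
]
DIMENSIONS = ("year", "region", "product")

def slice_cube(facts, **fixed):
    """Slice: keep only facts whose named dimensions equal the given values."""
    idx = {name: i for i, name in enumerate(DIMENSIONS)}
    return [f for f in facts if all(f[idx[d]] == v for d, v in fixed.items())]

def roll_up(facts, by):
    """Aggregate the sales measure over the dimensions listed in `by`."""
    positions = [DIMENSIONS.index(d) for d in by]
    totals = defaultdict(int)
    for f in facts:
        # The last element of each fact row is the measure (sales_amount).
        totals[tuple(f[i] for i in positions)] += f[-1]
    return dict(totals)

# Total sales per region across all years and products.
by_region = roll_up(FACTS, by=["region"])

# Slice to the North region, then break the total down by year.
north_by_year = roll_up(slice_cube(FACTS, region="North"), by=["year"])

print(by_region)
print(north_by_year)
```

An OLAP engine does the same kind of aggregation, but against the database and driven by the cube schema you define, so users can pick dimensions and measures without writing code.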
