Being Data Driven

Tableau & Redshift: How to Visualize Amazon Redshift Data

Written by Alon Brody|February 08, 2017

Data analyst? You must be familiar with both Tableau and AWS Redshift. And you must know, or at least you should believe me that integrating the Redshift columnar-based database speed with the mature, rich and interactive Tableau creates an awesome experience.

In my article last week, I described how to create custom Google Analytics dimensions, export the data, and import it to Redshift. Don’t miss it since this one is a continuation of it. In this short one, I will take you through the integration process, through to the eventual  presentation of an aggregated graph of pageview performance by number of visitors.

Note: In this guide we used Tableau desktop version 10.0, which you can easily download and try for free. Once downloaded and installed, you can perform the following steps to connect it with Redshift. It will also help to ensure you have your Redshift information (host name, database name, username and password) on hand.

Conquer Inefficiency in Data Analytics: Download Whitepaper

Connect

On the Tableau Desktop start page, under Connect, select the Amazon Redshift option.

Connect Redshift and Tableau - Select Redshift

Enter your Redshift information and click Sign In.

Connect Redshift and Tableau - Sign In

On the data source page (shown above), do the following:

Click the data source name at the top of the page, then enter a unique data source name for use in Tableau. It’s good practice to use a data source naming convention that helps other users figure out which data source Tableau is connected to.

From the Schema drop-down list, select a schema or use the text box to search for a schema by name.

Under Table, select a table or use the text box to search for a table by name.

Drag the table to the canvas, then click the sheet tab to start your analysis.

Connect Redshift and Tableau - click the sheet tab

Now that Tableau is connected to our Redshift database, we can move onto visualizing it.

Visualize

From the Tableau Desktop menu bar, select Data and then select a table from list data source tables at the bottom of the popup menu. For our example, we wanted to create reports and charts based on the Clientid table (as shown below).

Connect Redshift and Tableau - select a table

Next, on the menu bar, select Dashboard > New Dashboard.

Connect Redshift and Tableau - select dashboard

A graph is displayed, returning the number of page views by Timestamp, PagePath and Clientid:

Connect Redshift and Tableau - dashboard view

As shown in the steps below, from this point on we can experiment with Tableau’s simple UI and filters to visualize the data based on our specific business requirements:

On the menu bar, select Worksheet > New worksheet.

Connect Redshift and Tableau - new worksheet

Next, on the menu bar, select Data > Users.

Connect Redshift and Tableau - worksheet

From the Dimensions and Measures sections in the left pane, you can adjust the worksheet, selecting the relevant columns and rows.

For example, the screenshot below shows which pages have been viewed and how many times, by date.

Connect Redshift and Tableau - pages viewed report

Alternatively, if you want to look at a specific Clientid (information not supported out-of-the-box by GA), you can create a graph similar to this one:

Connect Redshift and Tableau - graph example

Number of Page Views by Specific page by Specific Clientid and by Date

Or, by making a few simple changes to the arrangement of your data, you can create a different visualization for the same example:

Connect Redshift and Tableau - alternative graph example

Number of Page Views by Specific page by Specific clientid and by Timestamp.

As can you see, using Tableau Desktop offers a simple way to start visualizing your Redshift data.

Conquer Inefficiency in Data Analytics: Download Whitepaper

Advanced Use Cases: Using Panoply

The integration between Tableau and Redshift is a powerful one. However, there are lots of preliminary steps that are required to make it happen, which can make the whole process cumbersome, lengthy and hard to maintain. In an advanced real-life case, your database will hold more than two tables and will need to be continuously processed to be ready for visualization.

There are lots of preliminary steps that are required to make it happen, which can make the whole process cumbersome, lengthy and hard to maintain.

This is where Panoply.io comes into play:

Connect Redshift and Tableau - Panoply connectors

Panoply.io’s Visualization Plugins

Panoply.io provides a one-stop data warehouse solution, from extracting the data source right through to load and transform. Besides out-of-the-box integration with Tableau, Panoply services also integrate with other external BI tools such as Chart.io, Looker, and PowerBI.

Our automated, data management solution leverages Redshift infrastructure and empowers it by eliminating its management overhead, including both the underlying infrastructure and the database layers. You can think of Panoply as an automated data engineering robot, eliminating the vast majority of data engineering and database administrative operations.

Don’t have a Panoply account yet? Start here - it’s Free!

From raw data to analysis in under 10 minutes.

Sign up now for a demo or a free trail of the Panoply.io platform.

Learn more about platform features