Analyzing Instagram API Data, Part 2: Engagement Over Time

Ok, Instagram datanauts, welcome back to the second installment of our ongoing series covering how to analyze your Instagram data using SQL and Panoply. We covered the basics of how your data will be collected, organized, and stored in Panoply in our last post, but just as a refresher: we'll be working mainly with three distinct Instagram tables. The tables we'll be referring to the most are:

 

- instagram_media: contains the like counts, comment counts, type of media, and other data about individual posts
- instagram_comments: contains the text of comments left on posts, the user who left each comment, and the time the comment was created
- instagram_user: contains follower counts and other summary information about the user's account
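If you want to eyeball what these tables actually hold before aggregating anything, a quick probe query is a good habit. Here's a minimal sketch against instagram_media (remember that the exact table name depends on how it landed in your own warehouse; swap in instagram_comments or instagram_user to inspect those instead):

```sql
-- Peek at the ten most recent posts before writing anything fancier
SELECT *
FROM instagram_media
ORDER BY created_time DESC
LIMIT 10
```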

 

Now that we’ve got that out of the way, let’s talk data analysis. Today’s featured query is all about measuring engagement over time. 

Measuring Instagram engagement over time 

This should probably be one of the first things you look into, since it will be a key data point in measuring the effectiveness of your Instagram campaign so far. It will also help you get a good overall sense of the history of your account and allow you to pick out particular time periods when your posts outperformed, which will help you to start thinking about the underlying causes. 

 

To get that data, you’ll look again at the instagram_media table, and use a query like this: 

 

SELECT
  TO_CHAR(created_time, 'yyyy/mm') AS "date",
  AVG(likes_count) AS "average likes",
  AVG(comments_count) AS "average comments"
/* Remember to set the table name to whatever it is in your personal warehouse! */
FROM instagram_media
GROUP BY "date"
ORDER BY "date" ASC

 

This will give you averages of like and comment counts, binned by month. If you plug these results into your favorite data viz tool, you’ll be able to make a nice bar plot with months on the x axis and average engagement by month on the y axis. 

 

Let’s take a closer look at the query here before we run off to start making dashboards, though. With this query, we’re creating a summary table from the instagram_media table where everything is grouped by month and year. Because PostgreSQL and Panoply can detect and interpret date columns, we can take advantage of the fact that each entry in the instagram_media table is tagged with a created_time timestamp. We’ll prettify that by using TO_CHAR(created_time, 'yyyy/mm'), which also allows us to truncate and group the timestamps by month and year.
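One caveat worth knowing: TO_CHAR returns a string, so some charting tools will treat your x axis as plain labels rather than real dates. If you'd rather keep a true timestamp for each monthly bin, DATE_TRUNC (available in both PostgreSQL and Redshift) is a drop-in alternative; a sketch of the same query using it might look like:

```sql
SELECT
  DATE_TRUNC('month', created_time) AS "month",
  AVG(likes_count) AS "average likes",
  AVG(comments_count) AS "average comments"
FROM instagram_media
GROUP BY "month"
ORDER BY "month" ASC
```

The results are identical apart from the first column, which now comes back as a timestamp pinned to the first of each month instead of a 'yyyy/mm' string.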

 

Then we’ll summarize the like and comment engagement rates with AVG(likes_count) and AVG(comments_count), respectively, and group the aggregation functions by month with GROUP BY "date". Finally, ORDER BY "date" ASC sorts the resulting entries in ascending chronological order.
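One easy extension while you're here: monthly averages can swing wildly in months where you barely posted, so it's worth pulling the post count alongside them for context. A hedged variant of the same query with a COUNT(*) added:

```sql
SELECT
  TO_CHAR(created_time, 'yyyy/mm') AS "date",
  COUNT(*) AS "posts",
  AVG(likes_count) AS "average likes",
  AVG(comments_count) AS "average comments"
FROM instagram_media
GROUP BY "date"
ORDER BY "date" ASC
```

That way, if a month's average engagement spikes, you can immediately see whether it was a genuinely strong month or just one or two outlier posts.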

 

In our next installment of this series, we’re going to dig even deeper into engagement, and look at how to identify your most engaged followers on Instagram.
