Analyzing Instagram API Data, Part 2: Engagement Over Time

Ok, Instagram datanauts, welcome back to the second installment of our ongoing series covering how to analyze your Instagram data using SQL and Panoply. We covered the basics of how your data will be collected, organized, and stored in Panoply in our last post, but just as a refresher: we'll be working mainly with three distinct Instagram tables. The tables we'll be referring to the most are:

 

- instagram_media: contains the like counts, comment counts, type of media, and other data about individual posts
- instagram_comments: contains the text of comments left on posts, the user who left each comment, and the time the comment was created
- instagram_user: contains follower counts and other summary information about the user's account
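If you want to eyeball what these tables actually hold before aggregating anything, a quick probe query is a good habit. Here's a minimal sketch against instagram_media (remember that the exact table name depends on how it landed in your own warehouse; swap in instagram_comments or instagram_user to inspect those instead):

```sql
-- Peek at the ten most recent posts before writing anything fancier
SELECT *
FROM instagram_media
ORDER BY created_time DESC
LIMIT 10
```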

 

Now that we’ve got that out of the way, let’s talk data analysis. Today’s featured query is all about measuring engagement over time. 

Measuring Instagram engagement over time 

This should probably be one of the first things you look into, since it will be a key data point in measuring the effectiveness of your Instagram campaign so far. It will also help you get a good overall sense of the history of your account and allow you to pick out particular time periods when your posts outperformed, which will help you to start thinking about the underlying causes. 

 

To get that data, you’ll look again at the instagram_media table, and use a query like this: 

 

SELECT
  TO_CHAR(created_time, 'yyyy/mm') AS "date",
  AVG(likes_count) AS "average likes",
  AVG(comments_count) AS "average comments"
/* Remember to set the table name to whatever it is in your personal warehouse! */
FROM instagram_media
GROUP BY "date"
ORDER BY "date" ASC

 

This will give you averages of like and comment counts, binned by month. If you plug these results into your favorite data viz tool, you’ll be able to make a nice bar plot with months on the x axis and average engagement by month on the y axis. 

 

Let’s take a closer look at the query here before we run off to start making dashboards, though. With this query, we’re creating a summary table from the instagram_media table where everything is grouped by month and year. Because PostgreSQL and Panoply can detect and interpret date columns, we can take advantage of the fact that each entry in the instagram_media table is tagged with a created_time timestamp. We’ll prettify that by using TO_CHAR(created_time, 'yyyy/mm'), which also allows us to truncate and group the timestamps by month and year.
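One caveat worth knowing: TO_CHAR returns a string, so some charting tools will treat your x axis as plain labels rather than real dates. If you'd rather keep a true timestamp for each monthly bin, DATE_TRUNC (available in both PostgreSQL and Redshift) is a drop-in alternative; a sketch of the same query using it might look like:

```sql
SELECT
  DATE_TRUNC('month', created_time) AS "month",
  AVG(likes_count) AS "average likes",
  AVG(comments_count) AS "average comments"
FROM instagram_media
GROUP BY "month"
ORDER BY "month" ASC
```

The results are identical apart from the first column, which now comes back as a timestamp pinned to the first of each month instead of a 'yyyy/mm' string.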

 

Then we’ll summarize the like and comment engagement rates with AVG(likes_count) and AVG(comments_count), respectively, and group the aggregation functions by month with GROUP BY "date". Finally, ORDER BY "date" ASC sorts the resulting entries in ascending chronological order.
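One easy extension while you're here: monthly averages can swing wildly in months where you barely posted, so it's worth pulling the post count alongside them for context. A hedged variant of the same query with a COUNT(*) added:

```sql
SELECT
  TO_CHAR(created_time, 'yyyy/mm') AS "date",
  COUNT(*) AS "posts",
  AVG(likes_count) AS "average likes",
  AVG(comments_count) AS "average comments"
FROM instagram_media
GROUP BY "date"
ORDER BY "date" ASC
```

That way, if a month's average engagement spikes, you can immediately see whether it was a genuinely strong month or just one or two outlier posts.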

 

In our next installment of this series, we’re going to dig even deeper into engagement, and look at how to identify your most engaged followers on Instagram.
