Ok, Instagram datanauts, welcome back to the second installment of our ongoing series covering how to analyze your Instagram data using SQL and Panoply. We covered most of the basics of how your data will be collected, organized and stored in Panoply in our last post, but just as a refresher, we’re going to be working mainly with three distinct Instagram tables. The tables we’ll be referring to the most are:
instagram_media: contains the like counts, comment counts, type of media, and other data about individual posts
instagram_comments: contains the text of comments left on posts, the user who left the comment, and the time the comment was created
instagram_user: contains follower counts and other summary information about the user’s account
Now that we’ve got that out of the way, let’s talk data analysis. Today’s featured query is all about measuring engagement over time.
Measuring Instagram engagement over time
This should probably be one of the first things you look into, since it will be a key data point in measuring the effectiveness of your Instagram campaign so far. It will also help you get a good overall sense of the history of your account and allow you to pick out particular time periods when your posts outperformed, which will help you to start thinking about the underlying causes.
To get that data, you’ll look again at the
instagram_media table, and use a query like this:
SELECT to_char(created_time, 'yyyy/mm') as "date", AVG(likes_count) as "average likes", AVG(comments_count) as "average comments" /* Remember to set the table name to whatever it is in your personal warehouse! */ FROM Instagram_media GROUP BY "date" ORDER BY "date" ASC
This will give you averages of like and comment counts, binned by month. If you plug these results into your favorite data viz tool, you’ll be able to make a nice bar plot with months on the x axis and average engagement by month on the y axis.
Let’s take a closer look at the query here before we run off to start making dashboards, though. With this query, we’re creating a large summary table from the
instagram_media table where everything is grouped by month and year. Because PostgreSQL and Panoply can detect and interpret date columns, we can take advantage of the fact that each entry in the
instagram_media table is tagged with a
created_time timestamp. We’ll prettify that by using
TO_CHAR(created_time, 'yyyy/mm'), which will also allow us to truncate and group the timestamps by date and year.
Then we’ll summarize the like and comment engagement rates with
AVG(comments_count), respectively, and group the aggregation functions with
GROUP BY “date”. Ordering the resulting entries chronologically is achieved by using
ORDER BY “date” ASC, which orders the entries by ascending date timestamps.
In our next installment of this series, we’re going to dig even deeper into engagement, and look at how to identify your most engaged followers on Instagram.