Hazel Kavili

Accio data!

Our Blue Planet
Nov 30, 2017

For two months now I haven't been able to stop watching videos and GIFs from Blue Planet 2 on Twitter. I saw many people talking about the episodes in emojis, which is awesome and so funny! That's why I decided to do a Twitter analysis of Blue Planet 2 emojis and hashtags!

The rtweet package makes collecting tweets so easy! I also used other useful libraries such as tidyverse, tidytext, janitor, and stringr.

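# Load packages
library(rtweet)
library(tidyverse)
library(janitor)

# Getting tweets (needs Twitter API access set up for rtweet)
tweetsBigBlue <- search_tweets(q = '#BluePlanet2', n = 18000,
                               include_rts = FALSE, parse = TRUE)

# Tidy data: drop columns that are entirely empty
# (originally remove_empty_cols(), which newer janitor releases deprecate)
cleanTweetsBlue <- janitor::remove_empty(tweetsBigBlue, which = "cols")

# Keep only the columns used in the analysis
tidyTweetsBlue <- cleanTweetsBlue %>%
  select(screen_name, user_id, created_at, status_id, text,
         retweet_count, favorite_count, hashtags, source)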

I wondered which tweet in my set was the most retweeted and which was the most favourited:

mostRetweetedBlue <- tidyTweetsBlue %>% arrange(desc(retweet_count)) %>% 
                      select(text) %>% head(1)

mostFavourited <- tidyTweetsBlue %>% arrange(desc(favorite_count)) %>%
                    select(text) %>% head(1) 

# Drop everything before the last "https://" so the tweet's link is what remains
getURLinsideTweet <- gsub(".*(https://)", "https://", mostRetweetedBlue$text)

This is from one of the most amazing scenes: an octopus hides itself in shells! You can click here to see the GIF.
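To see what the gsub() call above is doing, here is a small sketch on a made-up tweet text (the string and the t.co links are hypothetical, not from my data). Because ".*" is greedy, the pattern swallows everything up to the last "https://", so only the final link survives. The second line shows one possible base R way to pull out every link instead.

```r
# Hypothetical tweet text with two links; the real text comes from mostRetweetedBlue$text
tweet_text <- "Watch https://t.co/first and then https://t.co/second"

# Greedy ".*" matches up to the LAST "https://", so only the final link is kept
last_url <- gsub(".*(https://)", "https://", tweet_text)

# Alternative sketch: extract every link rather than just the last one
all_urls <- regmatches(tweet_text,
                       gregexpr("https://[^[:space:]]+", tweet_text))[[1]]

print(last_url)
print(all_urls)
```

Note that if a tweet contains no "https://" at all, gsub() returns the text unchanged, and any words after the last link stay attached to it.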

What are the other hashtags people used besides #BluePlanet2?

tidy_hashtags <- tidyTweetsBlue %>% unnest(hashtags) 

tidy_hashtags <- tidy_hashtags %>%
  count(hashtags, sort = TRUE) %>% top_n(n = 10, wt = n) 

p <- tidy_hashtags %>%  filter(hashtags != "BluePlanet2")
t <- ggplot(data = p) + 
     geom_bar(aes(x = reorder(hashtags, -n), y = n,
               fill = rainbow(n=length(p$hashtags))), stat = 'identity') +
     labs(x = "Hashtags", y = "Hashtag Count")
t + theme_bw() + theme(legend.position="none", 
                       axis.text.x = element_text(angle = 60, hjust = 1))
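The unnest-and-count step is easier to follow on a tiny example. This is only a sketch: toy_tweets is a made-up stand-in for tidyTweetsBlue, where hashtags is a list-column holding one character vector per tweet.

```r
library(dplyr)
library(tidyr)

# Toy stand-in for tidyTweetsBlue (all values are invented for illustration)
toy_tweets <- tibble(
  status_id = c("1", "2", "3"),
  hashtags  = list(c("BluePlanet2", "octopus"),
                   c("BluePlanet2", "Attenborough", "octopus"),
                   "BluePlanet2")
)

toy_counts <- toy_tweets %>%
  unnest(cols = hashtags) %>%             # one row per (tweet, hashtag) pair
  filter(hashtags != "BluePlanet2") %>%   # drop the search term itself
  count(hashtags, sort = TRUE)            # most frequent hashtag first

print(toy_counts)
```

In this sketch the search hashtag is filtered out before counting. In my code above it is filtered after top_n(), so if #BluePlanet2 is among the top ten, the chart ends up showing one bar fewer than ten.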

And of course here are the most used emojis! I love using emojis and I love how people express themselves with these cuties! Well, since the data set covers 24-30 November, you can guess the winner (if you watched the show, of course :) ).

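## load in the emoji dictionary (semicolon-separated, hence read_csv2)
dico <- readr::read_csv2("https://raw.githubusercontent.com/today-is-a-good-day/emojis/master/emDict.csv")

# get emojis: regex_left_join() comes from the fuzzyjoin package
library(fuzzyjoin)
emojis <- regex_left_join(tidyTweetsBlue, dico, by = c(text = "Native")) %>%
  group_by(Native) %>%
  filter(!is.na(Native)) %>%
  summarise(n = n()) %>%
  arrange(desc(n)) %>%
  head(15) %>%
  mutate(num = row_number())  # rank 1..k; also works if fewer than 15 emojis are found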

The savvy octopus is our winner!
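If you are curious how the regex join counts emojis, here is a minimal sketch on toy data. The tweets and the two-row dictionary are made up, and the Description column is my assumption for illustration; only the Native column name is taken from the code above. Emojis are written as Unicode escapes so the script does not depend on file encoding.

```r
library(dplyr)
library(fuzzyjoin)

# Made-up tweets: the first one uses the octopus emoji twice
toy_tweets <- tibble(text = c(
  "Clever octopus \U0001F419\U0001F419",
  "Whale and octopus \U0001F433 \U0001F419",
  "No emoji here"
))

# Tiny hypothetical dictionary: Native holds the emoji, used as a regex pattern
toy_dict <- tibble(
  Native      = c("\U0001F419", "\U0001F433"),
  Description = c("octopus", "spouting whale")
)

toy_emojis <- regex_left_join(toy_tweets, toy_dict, by = c(text = "Native")) %>%
  filter(!is.na(Native)) %>%               # drop tweets without any known emoji
  count(Description, sort = TRUE) %>%      # one count per matching tweet
  mutate(num = row_number())               # rank

print(toy_emojis)
```

The join produces one row per tweet and matching emoji, so a tweet that repeats the same emoji is counted once. In other words, the numbers are "tweets containing this emoji", not total emoji occurrences.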

The whole series is really amazing. With the voice of David Attenborough and music from Hans Zimmer, everything looks more mysterious and exciting than ever!
