- #Harry potter and the order of the phoenix word count code
- #Harry potter and the order of the phoenix word count series
Generate data frame with sentiment derived from the AFINN dictionary

Visualize which words in the AFINN sentiment dictionary appear most frequently

Sometimes words that are defined in a general sentiment dictionary can be outliers in specific contexts. That is, an author may use a word without intending to convey a specific sentiment, but the dictionary defines it in a certain way.

We can use a word cloud as a quick check for any such outliers in the context of Harry Potter, constructed using ggwordcloud:

    library(ggwordcloud)
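As a rough illustration of the AFINN step, the sketch below joins a handful of tokens to a small stand-in lexicon and counts how often each scored word appears. The `tokens` and `toy_afinn` data frames and their values are hypothetical, invented for this example; the real lexicon is available through `tidytext::get_sentiments("afinn")` and uses the same `word` and `value` columns. The word cloud part only runs if ggwordcloud is installed.

```r
library(dplyr)
library(ggplot2)

# Hypothetical tokens, one row per word occurrence (not taken from the books)
tokens <- data.frame(
  word = c("magic", "love", "love", "dark", "fell", "wand", "dark", "dark")
)

# Hypothetical stand-in for the AFINN lexicon; the scores are illustrative only
toy_afinn <- data.frame(
  word  = c("love", "dark", "fell"),
  value = c(3, -2, -1)
)

word_scores <- tokens %>%
  # keep only the words that appear in the lexicon
  inner_join(toy_afinn, by = "word") %>%
  # frequency of each scored word, most frequent first
  count(word, value, sort = TRUE)

# Optional: draw the word cloud, sized by frequency
if (requireNamespace("ggwordcloud", quietly = TRUE)) {
  cloud <- ggplot(word_scores, aes(label = word, size = n)) +
    ggwordcloud::geom_text_wordcloud() +
    scale_size_area(max_size = 15)
}
```

Words that look out of place in the cloud are candidates for the context-specific outliers described above.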
    # generate frequency count for each word and sentiment
    # prep data for sorting each word independently by facet
    mutate(word = reorder_within(word, n, sentiment)) %>%
    facet_wrap(facets = vars(sentiment), scales = "free_y") +
    # used with reorder_within() to label the axis tick marks
    title = "Sentimental words used in the Harry Potter series",
    y = "Number of occurrences in all seven books"
    # generate frequency count for each book, word, and sentiment
    facet_wrap(facets = vars(book), scales = "free_y") +
    # extract 10 most frequent pos/neg words per book
    title = "Positive words used in the Harry Potter series",
    title = "Negative words used in the Harry Potter series",
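To show how `reorder_within()` and `scale_x_reordered()` fit together, here is a minimal sketch on made-up counts. The `word_counts` data frame is hypothetical; in the exercise the counts would come from the tokenized books joined to a sentiment dictionary. It follows the usual pattern of pairing `facet_wrap()` with `coord_flip()`, which may differ from the original solution.

```r
library(dplyr)
library(ggplot2)
library(tidytext)

# Hypothetical counts, invented for illustration
word_counts <- data.frame(
  word      = c("like", "well", "good", "dark", "fell", "strange"),
  sentiment = c("positive", "positive", "positive",
                "negative", "negative", "negative"),
  n         = c(30, 20, 10, 25, 15, 5)
)

plot_data <- word_counts %>%
  # prep data for sorting each word independently by facet
  mutate(word = reorder_within(word, n, sentiment))

p <- ggplot(plot_data, aes(x = word, y = n, fill = sentiment)) +
  geom_col(show.legend = FALSE) +
  # strips the facet suffix that reorder_within() appends to each label
  scale_x_reordered() +
  facet_wrap(facets = vars(sentiment), scales = "free_y") +
  coord_flip() +
  labs(x = NULL, y = "Number of occurrences")
```

Printing `p` draws one panel per sentiment, with the bars in each panel sorted by their own counts rather than by a single ordering shared across panels.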
    # convert each book to a data frame and merge into a single data frame
    mutate(book = factor(book, levels = hp_books)) %>%
    mutate(word = reorder_within(word, n, book)) %>%
    ggplot(aes(x = word, y = n, fill = book)) +
    facet_wrap(facets = vars(book), scales = "free") +
    title = "Most frequent words in Harry Potter",

Estimate sentiment

Generate data frame with sentiment derived from the Bing dictionary

    # 2 philosophers_stone 1 perfectly positive
    # 4 philosophers_stone 1 strange negative
    # 5 philosophers_stone 1 mysterious negative
    # 6 philosophers_stone 1 nonsense negative
    # 9 philosophers_stone 1 greatest positive

Visualize the most frequent positive/negative words in the entire series using the Bing dictionary, and then separately for each book.

Check out this blog post, which introduces the reorder_within() and scale_x_reordered() functions for sorting bar charts within each facet.

Note that there is a different package available on CRAN also called harrypotter. If you just run install.packages("harrypotter"), you will get an error.
Run the following code to download the harrypotter package:

    devtools::install_github("bradleyboehmke/harrypotter")