# install.packages("tidyverse")
# install.packages("here")
library(tidyverse)
library(here)

## Load the data
characters <- readRDS(file = here::here("raw_data", "characters.rds"))
psych_stats <- read.csv(
  file = here::here("raw_data", "psych_stats.csv"),
  sep = ";"
)

## Reshape into long format:
psych_stats <- psych_stats %>%
  pivot_longer(
    cols = messy_neat:innocent_jaded,
    names_to = "question",
    values_to = "rating"
  )

## Take a look at the data sets
str(characters)
Print each fictional universe (column: uni_name) in the characters_stats data frame to your console exactly once, like this: "The fictional universe 'fictional universe' is part of the characters data set."
Solution
for (universe in unique(characters_stats$uni_name)) {
  print(
    paste0("The fictional universe '", universe, "' is part of the characters data set.")
  )
}
[1] "The fictional universe 'Arrested Development' is part of the characters data set."
[1] "The fictional universe 'Avatar: The Last Airbender' is part of the characters data set."
[1] "The fictional universe 'Arcane' is part of the characters data set."
[1] "The fictional universe 'Archer' is part of the characters data set."
[1] "The fictional universe 'It's Always Sunny in Philadelphia' is part of the characters data set."
[1] "The fictional universe 'Bones' is part of the characters data set."
[1] "The fictional universe 'Brooklyn Nine-Nine' is part of the characters data set."
[1] "The fictional universe 'Beauty and the Beast' is part of the characters data set."
[1] "The fictional universe 'Breaking Bad' is part of the characters data set."
[1] "The fictional universe 'The Big Bang Theory' is part of the characters data set."
[1] "The fictional universe 'The Breakfast Club' is part of the characters data set."
[1] "The fictional universe 'Broad City' is part of the characters data set."
[1] "The fictional universe 'Bob's Burgers' is part of the characters data set."
[1] "The fictional universe 'Battlestar Galactica' is part of the characters data set."
[1] "The fictional universe 'Buffy the Vampire Slayer' is part of the characters data set."
[1] "The fictional universe 'Community' is part of the characters data set."
[1] "The fictional universe 'Calvin and Hobbes' is part of the characters data set."
[1] "The fictional universe 'Criminal Minds' is part of the characters data set."
[1] "The fictional universe 'Craze Ex-Girlfriend' is part of the characters data set."
[1] "The fictional universe 'Dexter' is part of the characters data set."
...
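Note that the loop variable doesn't have to be called i (even though that is the convention). Any valid name works, and a descriptive one such as universe makes the loop easier to read.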
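As a minimal sketch of that point, the loop below uses the name `universe_name` and behaves exactly as it would with `i`. The vector `toy_universes` is made up for illustration and is not part of the data set.

```r
## Toy vector, invented for this illustration:
toy_universes <- c("Friends", "Archer", "Friends", "Bones")

## The loop variable can have any valid name, here `universe_name`:
for (universe_name in unique(toy_universes)) {
  print(paste0("Universe: '", universe_name, "'"))
}

## After the loop, the variable still holds the last element it visited:
print(universe_name)
```

Because `unique()` drops the repeated entry, the loop body runs three times, once per distinct universe.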
Now calculate the mean rating over all characters in this fictional universe for each question and print the result in a statement containing the sentence: "The mean rating for the fictional universe 'your_universe' on the question 'question' is: 'mean_rating'."
Hint
Build a for loop that goes over all unique questions (use unique()) in your subsetted data frame. Inside this loop you can subset again, this time keeping only the rows for the question the loop is currently at, and calculate the mean rating from that subset. Then use paste() to build the statement and print() to print it.
Solution
for (i in unique(characters_friends$question)) { # goes over all unique questions
  ## Build a subset that only consists of ratings about the current question:
  question_dat <- characters_friends %>%
    filter(question == i)

  ## Calculate the mean for that subset:
  question_mean <- mean(question_dat$rating)

  ## Build and print the final statement:
  statement <- paste("The mean rating for the fictional universe 'Friends' on the question '", i, "' is:", question_mean)
  print(statement)
}
[1] "The mean rating for the fictional universe 'Friends' on the question ' messy_neat ' is: 47.6833333333333"
[1] "The mean rating for the fictional universe 'Friends' on the question ' disorganized_self.disciplined ' is: 45.1666666666667"
[1] "The mean rating for the fictional universe 'Friends' on the question ' diligent_lazy ' is: 42.2333333333333"
[1] "The mean rating for the fictional universe 'Friends' on the question ' on.time_tardy ' is: 53.3166666666667"
[1] "The mean rating for the fictional universe 'Friends' on the question ' competitive_cooperative ' is: 33.6"
[1] "The mean rating for the fictional universe 'Friends' on the question ' scheduled_spontaneous ' is: 55.5333333333333"
[1] "The mean rating for the fictional universe 'Friends' on the question ' ADHD_OCD ' is: 40.7833333333333"
[1] "The mean rating for the fictional universe 'Friends' on the question ' chaotic_orderly ' is: 41.6666666666667"
[1] "The mean rating for the fictional universe 'Friends' on the question ' motivated_unmotivated ' is: 32.2833333333333"
[1] "The mean rating for the fictional universe 'Friends' on the question ' bossy_meek ' is: 41.4666666666667"
[1] "The mean rating for the fictional universe 'Friends' on the question ' persistent_quitter ' is: 29.05"
[1] "The mean rating for the fictional universe 'Friends' on the question ' overachiever_underachiever ' is: 41.85"
[1] "The mean rating for the fictional universe 'Friends' on the question ' muddy_washed ' is: 64.2166666666667"
[1] "The mean rating for the fictional universe 'Friends' on the question ' beautiful_ugly ' is: 19.5"
[1] "The mean rating for the fictional universe 'Friends' on the question ' slacker_workaholic ' is: 53.5333333333333"
[1] "The mean rating for the fictional universe 'Friends' on the question ' driven_unambitious ' is: 34.1833333333333"
[1] "The mean rating for the fictional universe 'Friends' on the question ' outlaw_sheriff ' is: 50.2666666666667"
[1] "The mean rating for the fictional universe 'Friends' on the question ' precise_vague ' is: 51.55"
[1] "The mean rating for the fictional universe 'Friends' on the question ' bad.cook_good.cook ' is: 37.6333333333333"
[1] "The mean rating for the fictional universe 'Friends' on the question ' manicured_scruffy ' is: 32.4166666666667"
...
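The spaces around each question name in the output above come from `paste()`, which by default joins its arguments with a single space (`sep = " "`). The following sketch shows the difference to `paste0()`, which joins without a separator. The values of `question` and `question_mean` are made up for illustration, and rounding with `round()` is an optional addition, not part of the solution above.

```r
## Made-up values for illustration:
question <- "messy_neat"
question_mean <- 47.68333

## paste() inserts a space between all of its arguments:
with_spaces <- paste("question '", question, "' is:", question_mean)

## paste0() inserts nothing, so any space has to be written explicitly:
without_spaces <- paste0("question '", question, "' is: ", round(question_mean, 2))

print(with_spaces)
print(without_spaces)
```

Which variant to use is a matter of taste; `paste0()` simply gives full control over where spaces appear.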
Tweak your for loop so the mean values get saved in a new data frame, containing the question and the mean rating for each question.
Hints
Build an empty data frame where you will save your results.
Now you can’t easily loop over the question column itself, because you need the position of each element to save it in the respective row of your new data frame: for(i in 1:length(unique(characters_friends$question))){.
Now you can save the result of your calculation in row i and column mean of your new data frame.
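A side note on looping over positions: `seq_along()` is a common alternative to `1:length()`. Both give the same positions for a non-empty vector, but they differ for an empty one. The sketch below uses made-up vectors (`questions`, `no_questions`) to show this.

```r
## Made-up vector of questions:
questions <- c("messy_neat", "diligent_lazy")

## For a non-empty vector both variants yield the positions 1 and 2:
positions_seq <- seq_along(questions)
positions_colon <- 1:length(questions)

for (i in positions_seq) {
  print(paste0("Position ", i, ": ", questions[i]))
}

## For an empty vector they differ:
no_questions <- character(0)
empty_seq <- seq_along(no_questions)     # no positions, so a loop would not run
empty_colon <- 1:length(no_questions)    # counts from 1 down to 0, so a loop would run twice

print(empty_seq)
print(empty_colon)
```

In this exercise the vector of questions is never empty, so `1:length()` works fine.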
Solution
## Build an empty data frame for storing the results:
mean_ratings <- data.frame()

for (i in 1:length(unique(characters_friends$question))) {
  ## Extract the question on position i:
  question_i <- unique(characters_friends$question)[i]

  ## Extract all rows that contain values for this question:
  question_dat <- characters_friends %>%
    filter(question == question_i)

  ## Calculate the mean for that question
  question_mean <- mean(question_dat$rating)

  ## Save the question in the row corresponding to the position of i:
  mean_ratings[i, "question"] <- question_i

  ## Save the mean in the row corresponding to the position of i:
  mean_ratings[i, "mean"] <- question_mean
}

head(mean_ratings)
characters_friends %>%
  group_by(question) %>%
  summarise(mean_rating = mean(rating)) %>%
  ## Let's look at the rating of this question for comparison:
  filter(question == "messy_neat")
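A comparison like the one above can also be sketched in base R without the tidyverse. The example assumes a small made-up data frame `toy_ratings` (not the real data) and checks the means from a for loop against `tapply()`, which applies `mean()` once per group.

```r
## Toy data, invented for this illustration:
toy_ratings <- data.frame(
  question = c("messy_neat", "messy_neat", "diligent_lazy", "diligent_lazy"),
  rating = c(40, 60, 10, 30),
  stringsAsFactors = FALSE
)

## Means from a for loop, stored in a preallocated named vector:
questions <- unique(toy_ratings$question)
loop_means <- setNames(numeric(length(questions)), questions)
for (current_question in questions) {
  loop_means[current_question] <-
    mean(toy_ratings$rating[toy_ratings$question == current_question])
}

## Means per group in a single call (groups come back sorted alphabetically):
grouped_means <- tapply(toy_ratings$rating, toy_ratings$question, mean)

## Compare both results after bringing them into the same order:
agree <- isTRUE(all.equal(
  unname(loop_means[names(grouped_means)]),
  as.vector(grouped_means)
))
print(agree)
```

If the two approaches ever disagree, the subsetting inside the loop is the first place to look.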