Reshaping: Exercise

# install.packages("tidyverse")
# install.packages("here")

library(tidyverse)
library(here)

## Load the data
characters <- readRDS(file = here::here("raw_data", "characters.rds"))
psych_stats <- read.csv(
  file = here::here("raw_data", "psych_stats.csv"),
  sep = ";"
)

Exercise 1

Take a look at the data frame psych_stats. Which format does it have?

  • Wide format
  • Long format
  • None of the above

Wide format: each unit of observation, in this case each character, has only one row.
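One way to check this is to compare the number of rows with the number of distinct IDs. The following is a minimal sketch on a small made-up data frame; toy_wide and its values are hypothetical and only mimic the structure of psych_stats.

```r
library(tidyverse)

# Hypothetical toy data in wide format: one row per character
toy_wide <- tibble(
  char_id = c("A1", "A2", "A3"),
  messy_neat = c(10, 50, 90),
  diligent_lazy = c(80, 40, 5)
)

# In wide format every ID appears exactly once,
# so the row count equals the number of distinct IDs
n_rows <- nrow(toy_wide)
n_ids <- n_distinct(toy_wide$char_id)
n_rows == n_ids
```

The same comparison should work on psych_stats, assuming char_id is its ID column.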

Exercise 2

Reshape it so that the data set has only three columns: char_id, question, and rating.

You can select multiple columns like this: column_1:column_10.
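As a rough illustration of the range syntax, here is a sketch on made-up data; the data frame toy and all of its column names are hypothetical.

```r
library(tidyverse)

# Hypothetical toy data; the column names are made up for illustration
toy <- tibble(
  id = 1:2,
  column_1 = c(1, 2),
  column_2 = c(3, 4),
  column_3 = c(5, 6),
  other = c("a", "b")
)

# column_1:column_3 selects column_1, column_3, and every column
# positioned between them in the data frame
selected <- toy %>%
  select(column_1:column_3)

names(selected)
```

The range depends on the column order in the data frame, not on the names themselves, so it is worth checking the order with names() first.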

psych_stats <- psych_stats %>%
  pivot_longer(cols = messy_neat:innocent_jaded, 
               names_to = "question", 
               values_to = "rating")

head(psych_stats)
# A tibble: 6 × 3
  char_id question                      rating
  <chr>   <chr>                          <dbl>
1 F2      messy_neat                     95.7 
2 F2      disorganized_self.disciplined  95.2 
3 F2      diligent_lazy                   6.10
4 F2      on.time_tardy                   6.2 
5 F2      competitive_cooperative         6.40
6 F2      scheduled_spontaneous           6.60

Now we have multiple rows for every character, but all question ratings are nicely aligned in one column.
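To see the effect on the row count, here is a minimal sketch on hypothetical toy data (toy_wide and its values are made up): after pivoting, each character should have one row per question column.

```r
library(tidyverse)

# Hypothetical toy data: two characters, two question columns
toy_wide <- tibble(
  char_id = c("A1", "A2"),
  messy_neat = c(10, 50),
  diligent_lazy = c(80, 40)
)

# Stack the question columns into name/value pairs
toy_long <- toy_wide %>%
  pivot_longer(cols = messy_neat:diligent_lazy,
               names_to = "question",
               values_to = "rating")

# Count the rows per character: one row for each former question column
rows_per_char <- toy_long %>%
  count(char_id)

rows_per_char
```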

Exercise 3

Try to reshape the data into wide format again.

psych_stats %>%
  pivot_wider(id_cols = char_id,
              names_from = "question",
              values_from = "rating")
# A tibble: 889 × 365
   char_id messy_neat disorganized_self.disciplined diligent_lazy on.time_tardy
   <chr>        <dbl>                         <dbl>         <dbl>         <dbl>
 1 F2           95.7                           95.2          6.10           6.2
 2 F1           30.2                           25.9         51.8           77.9
 3 F5           45.3                           42.4         52.2           57.1
 4 F4           13                             11           78.1           84.1
 5 F3           20.9                           20.9         45.2           74  
 6 F6           81                             75.6         20             20.6
 7 EU1           9.60                          10.4         62.3           85.7
 8 EU2          27.7                           31.9         23.7           68.3
 9 EU6          40                             39.6         54.1           73.6
10 EU3          43.9                           31.1         32.2           58.2
# ℹ 879 more rows
# ℹ 360 more variables: competitive_cooperative <dbl>,
#   scheduled_spontaneous <dbl>, ADHD_OCD <dbl>, chaotic_orderly <dbl>,
#   motivated_unmotivated <dbl>, bossy_meek <dbl>, persistent_quitter <dbl>,
#   overachiever_underachiever <dbl>, muddy_washed <dbl>, beautiful_ugly <dbl>,
#   slacker_workaholic <dbl>, driven_unambitious <dbl>, outlaw_sheriff <dbl>,
#   precise_vague <dbl>, bad.cook_good.cook <dbl>, manicured_scruffy <dbl>, …

This is the wide format we started with. It was only for the sake of the exercise, though: the result was not assigned, so psych_stats is still in long format, which we will keep using from now on.
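Since pivot_wider() is used here to undo pivot_longer(), a round trip should give back the original data. A minimal sketch on hypothetical toy data (toy_wide and its values are made up):

```r
library(tidyverse)

# Hypothetical toy data in wide format
toy_wide <- tibble(
  char_id = c("A1", "A2"),
  messy_neat = c(10, 50),
  diligent_lazy = c(80, 40)
)

# Wide -> long
toy_long <- toy_wide %>%
  pivot_longer(cols = messy_neat:diligent_lazy,
               names_to = "question",
               values_to = "rating")

# Long -> wide again
toy_round <- toy_long %>%
  pivot_wider(id_cols = char_id,
              names_from = "question",
              values_from = "rating")

# Check whether the round trip reproduces the original data
round_trip_ok <- isTRUE(all.equal(toy_wide, toy_round))
round_trip_ok
```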