# install.packages("tidyverse")
# install.packages("here")
library(tidyverse)
library(here)
## Load the data
characters <- readRDS(file = here::here("raw_data", "characters.rds"))
psych_stats <- read.csv(
  file = here::here("raw_data", "psych_stats.csv"),
  sep = ";"
)
Reshaping: Exercise
Exercise 1
Take a look at the data frame psych_stats. Which format does it have?
- Wide format
- Long format
- None of the above
Solution
- Wide format
Each unit of observation, in this case each character, has only one row, and every question has its own column.
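The one-row-per-unit property can also be checked in code. The following is a minimal sketch on a hypothetical toy data frame (the ids and values are made up, not taken from psych_stats): if no id occurs more than once, each unit of observation has exactly one row.

```r
# Hypothetical toy data in wide format (values are illustrative only)
toy_wide <- data.frame(
  char_id = c("A1", "A2", "A3"),
  messy_neat = c(10, 50, 90),
  diligent_lazy = c(80, 40, 5)
)

# TRUE if every id occurs exactly once, i.e. one row per unit of observation
one_row_per_id <- !any(duplicated(toy_wide$char_id))
one_row_per_id
```

The same check could be run on psych_stats$char_id, assuming char_id identifies each character.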
Exercise 2
Reshape it so that there are only three columns in the data set: char_id, question and rating.
Hint
You can select multiple columns like this: column_1:column_10.
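To illustrate the hint, here is a small sketch of range selection with dplyr's select() on a hypothetical toy data frame (the column names q1 to q3 and note are made up for this example). The first:last syntax picks both named columns and everything positioned between them.

```r
library(dplyr)

# Hypothetical toy data (not from psych_stats)
toy <- data.frame(
  id = c("A1", "A2"),
  q1 = c(1, 2),
  q2 = c(3, 4),
  q3 = c(5, 6),
  note = c("x", "y")
)

# q1:q3 selects q1, q3 and every column positioned between them
selected <- toy %>% select(q1:q3)
names(selected)
```

The same range syntax works in the cols argument of pivot_longer().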
Solution
psych_stats <- psych_stats %>%
  pivot_longer(cols = messy_neat:innocent_jaded,
               names_to = "question",
               values_to = "rating")
head(psych_stats)
# A tibble: 6 × 3
char_id question rating
<chr> <chr> <dbl>
1 F2 messy_neat 95.7
2 F2 disorganized_self.disciplined 95.2
3 F2 diligent_lazy 6.10
4 F2 on.time_tardy 6.2
5 F2 competitive_cooperative 6.40
6 F2 scheduled_spontaneous 6.60
Now we have multiple rows for every character, but all question ratings are nicely aligned in one column.
Exercise 3
Try to reshape the data back into wide format.
Solution
psych_stats %>%
  pivot_wider(id_cols = char_id,
              names_from = "question",
              values_from = "rating")
# A tibble: 889 × 365
char_id messy_neat disorganized_self.disciplined diligent_lazy on.time_tardy
<chr> <dbl> <dbl> <dbl> <dbl>
1 F2 95.7 95.2 6.10 6.2
2 F1 30.2 25.9 51.8 77.9
3 F5 45.3 42.4 52.2 57.1
4 F4 13 11 78.1 84.1
5 F3 20.9 20.9 45.2 74
6 F6 81 75.6 20 20.6
7 EU1 9.60 10.4 62.3 85.7
8 EU2 27.7 31.9 23.7 68.3
9 EU6 40 39.6 54.1 73.6
10 EU3 43.9 31.1 32.2 58.2
# ℹ 879 more rows
# ℹ 360 more variables: competitive_cooperative <dbl>,
# scheduled_spontaneous <dbl>, ADHD_OCD <dbl>, chaotic_orderly <dbl>,
# motivated_unmotivated <dbl>, bossy_meek <dbl>, persistent_quitter <dbl>,
# overachiever_underachiever <dbl>, muddy_washed <dbl>, beautiful_ugly <dbl>,
# slacker_workaholic <dbl>, driven_unambitious <dbl>, outlaw_sheriff <dbl>,
# precise_vague <dbl>, bad.cook_good.cook <dbl>, manicured_scruffy <dbl>, …
This is the format we started with! But scratch that, it was just for the sake of the exercise. Because the result was not assigned, psych_stats itself is still in the long format, which is what we want to use from now on.
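The round trip between the two formats can be sketched on a small hypothetical tibble (ids and values are made up): pivoting to long and back to wide should reproduce the original table.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data in wide format (values are illustrative only)
toy_wide <- tibble(
  char_id = c("A1", "A2"),
  messy_neat = c(10, 50),
  diligent_lazy = c(80, 40)
)

# Wide to long: one row per character and question
toy_long <- toy_wide %>%
  pivot_longer(cols = messy_neat:diligent_lazy,
               names_to = "question",
               values_to = "rating")

# Long to wide: one row per character again
toy_back <- toy_long %>%
  pivot_wider(id_cols = char_id,
              names_from = "question",
              values_from = "rating")

# Compare the round-tripped table with the original
round_trip_ok <- isTRUE(all.equal(as.data.frame(toy_wide),
                                  as.data.frame(toy_back)))
round_trip_ok
```

In this sketch toy_long has four rows (two characters times two questions) and three columns, mirroring the char_id, question and rating layout used above.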