Missing values: Exercises

# install.packages("tidyverse")
# install.packages("here")

library(tidyverse)
library(here)

## Load the data
characters <- readRDS(file = here::here("raw_data", "characters.rds"))
psych_stats <- read.csv(
  file = here::here("raw_data", "psych_stats.csv"),
  sep = ";"
)

Exercise 1

  1. Does the characters data set contain any NAs?

Use any() to see if a logical vector contains any TRUE values.

any(is.na(characters))
[1] FALSE

No, there don’t seem to be any NAs in this data set, which would be great in real life. For this exercise it’s not great, so let’s introduce some NAs manually.

  1. Be careful not to overwrite the characters data frame, so copy it into the new object characters_na before doing anything. Then set the name to NA in the rows 34, 103, 300 and the uni_name to NA in the rows 404, 670.

To overwrite values, you can select them on the left side of the assignment operator <- and assign them a new value on the right side.

characters_na <- characters

characters_na[c(34, 103, 300), "name"] <- NA
characters_na[c(404, 670), "uni_name"] <- NA
  1. Remove all rows containing missing values in the column name from the characters_na data frame.
characters_na <- characters_na[!is.na(characters_na$name), ]

Or:

library(tidyverse)

characters_na <- characters_na %>%
  drop_na(name)