Cleaning data
Removing missing data from a vector
First, let's create a vector called Vector1:
set.seed(123)
Vector1 <- rnorm(20)
Then add missing data to it:
set.seed(123)
# Replace 5 randomly chosen positions with NA
Vector1[sample(seq_along(Vector1), 5)] <- NA
Now we can use the is.na function to subset the vector and keep only the non-missing values:
Vector1 <- Vector1[!is.na(Vector1)]
The resulting vector no longer contains the NAs of the original Vector1, so it has 15 elements instead of 20.
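As a quick check of the steps above, the following sketch counts the missing values before removing them and compares the result with base R's na.omit, which is an alternative not used in the text. The names n_missing, cleaned and cleaned_alt are illustrative only.

```r
# Rebuild the example vector with 5 missing values
set.seed(123)
Vector1 <- rnorm(20)
set.seed(123)
Vector1[sample(seq_along(Vector1), 5)] <- NA

# Count the NA entries before removing them
n_missing <- sum(is.na(Vector1))

# Same subsetting as in the text: keep positions where is.na() is FALSE
cleaned <- Vector1[!is.na(Vector1)]
length(cleaned)  # original length minus n_missing

# na.omit() also drops NAs, but records the dropped positions in an
# "na.action" attribute; as.vector() strips that attribute
cleaned_alt <- as.vector(na.omit(Vector1))
identical(cleaned, cleaned_alt)
```

Subsetting with !is.na returns a plain vector, which is usually what you want; na.omit can be handy when you need to know afterwards which positions were dropped.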
Removing incomplete rows
Sometimes you have a data frame and want to remove every row that contains at least one NA value. The complete.cases function is a good option for that: it returns a logical vector with TRUE for each row that has no missing values.
We will use the first 6 rows of the airquality dataset as an example, since it already contains NAs:
x <- head(airquality)
This has two rows with NAs in the Solar.R column (one of them is also missing Ozone). To remove them, we do the following:
x_no_NA <- x[complete.cases(x), ]
The resulting data frame x_no_NA contains only the complete rows, that is, the rows without any NA.
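The sketch below repeats the example and adds two variations that are not part of the original text: restricting the check to a single column, and using na.omit as an alternative. The names x_ozone and x_omit are illustrative only.

```r
# airquality ships with base R (datasets package)
x <- head(airquality)

# Keep rows that are complete across every column
x_no_NA <- x[complete.cases(x), ]
nrow(x_no_NA)

# Only require Ozone to be present; NAs in other columns are tolerated
x_ozone <- x[complete.cases(x$Ozone), ]
nrow(x_ozone)

# na.omit() drops the same rows as complete.cases() on the whole data
# frame, and additionally records the dropped rows in an attribute
x_omit <- na.omit(x)
all.equal(x_no_NA, x_omit, check.attributes = FALSE)
```

Checking only the columns you actually need can preserve more rows than requiring every column to be complete.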