Missing entries in any given column of the matrix are replaced by the column means or the values in a supplied vector.

na.replace(x, m = rowSums(x, na.rm = TRUE))

Arguments

x

A matrix that may contain missing values. It may also be in sparse matrix format (i.e. inherit from "sparseMatrix").

m

Optional argument. A vector of replacement values, applied columnwise: missing entries in column j of 'x' are replaced by the j-th element of 'm'. If missing, the column means of 'x' are used.
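The columnwise replacement described above can be sketched in base R. This is an illustration only, not the package's implementation: the helper name `replace_na_by_column` is hypothetical, and its default for `m` is taken to be the column means, following the argument description.

```r
# Hypothetical helper (not part of glmnet): replace NAs column by column.
# Assumption: m holds one replacement value per column of x.
replace_na_by_column <- function(x, m = colMeans(x, na.rm = TRUE)) {
  stopifnot(is.matrix(x), length(m) >= ncol(x))
  nas <- is.na(x)
  # col(x) gives the column index of every entry, so m[col(x)[nas]]
  # picks, for each missing entry, the value that belongs to its column.
  x[nas] <- m[col(x)[nas]]
  x
}

# Small dense example: column 1 is (1, NA, 3), column 2 is (4, 6, NA)
x_demo <- matrix(c(1, NA, 3,
                   4, 6, NA), nrow = 3)
x_filled <- replace_na_by_column(x_demo)
print(x_filled)
```

Passing an explicit `m` (for example column medians) uses those values in place of the means.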

Value

A version of 'x' is returned with the missing values replaced.

Details

This is a simple imputation scheme. The function is called by makeX when na.impute = TRUE is set, but it can also be used on its own. If 'x' is sparse, the result is sparse as well: only the stored missing entries are replaced, so sparsity is maintained.
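One way to see how a replacement can preserve sparsity is to work directly on the stored entries of a column-compressed matrix. The sketch below assumes the Matrix package and a "dgCMatrix" input; the values in `m_demo` are made up for illustration, and the code is not claimed to be the function's actual implementation.

```r
library(Matrix)  # recommended package shipped with R

# 3 x 2 sparse matrix; the NA is stored explicitly, the zeros are not
xs <- Matrix(c(0, NA, 3,
               0, 0, 5), nrow = 3, sparse = TRUE)
m_demo <- c(10, 20)  # illustrative replacement values, one per column

n_stored_before <- length(xs@x)

# xs@p holds the column pointers, so diff(xs@p) counts the stored entries
# in each column; expanding that gives the column of every stored entry.
col_of_entry <- rep(seq_len(ncol(xs)), diff(xs@p))
nas <- is.na(xs@x)
xs@x[nas] <- m_demo[col_of_entry[nas]]

print(xs)
```

Because only the slot of stored values is modified, the number of stored entries is unchanged and the structural zeros are never touched.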

See also

makeX and glmnet

Examples

set.seed(101)
### Single data frame
X = matrix(rnorm(20), 10, 2)
X[3, 1] = NA
X[5, 2] = NA
X3 = sample(letters[1:3], 10, replace = TRUE)
X3[6] = NA
X4 = sample(LETTERS[1:3], 10, replace = TRUE)
X4[9] = NA
dfn = data.frame(X, X3, X4)
x = makeX(dfn)
m = rowSums(x, na.rm = TRUE)
na.replace(x, m)
#>            X1         X2      X3a       X3b      X3c      X4A      X4B      X4C
#> 1  -0.3260365  0.5264481 0.000000 1.0000000 0.000000 0.000000 0.000000 1.000000
#> 2   0.5524619 -0.7948444 0.000000 0.0000000 1.000000 0.000000 1.000000 0.000000
#> 3   2.2004116  1.4277555 1.000000 0.0000000 0.000000 0.000000 1.000000 0.000000
#> 4   0.2143595 -1.4668197 1.000000 0.0000000 0.000000 1.000000 0.000000 0.000000
#> 5   0.3107692  1.7576174 1.000000 0.0000000 0.000000 0.000000 1.000000 0.000000
#> 6   1.1739663 -0.1933380 3.427756 0.7475398 2.310769 1.000000 0.000000 0.000000
#> 7   0.6187899 -0.8497547 1.000000 0.0000000 0.000000 1.000000 0.000000 0.000000
#> 8  -0.1127343  0.0584655 0.000000 1.0000000 0.000000 1.000000 0.000000 0.000000
#> 9   0.9170283 -0.8176704 0.000000 1.0000000 0.000000 1.980628 1.769035 1.945731
#> 10 -0.2232594 -2.0503078 0.000000 0.0000000 1.000000 0.000000 0.000000 1.000000

x = makeX(dfn, sparse = TRUE)
na.replace(x, m)
#> 10 x 8 sparse Matrix of class "dgCMatrix"
#>            X1         X2      X3a       X3b      X3c      X4A      X4B      X4C
#> 1  -0.3260365  0.5264481 .        1.0000000 .        .        .        1.000000
#> 2   0.5524619 -0.7948444 .        .         1.000000 .        1.000000 .
#> 3   2.2004116  1.4277555 1.000000 .         .        .        1.000000 .
#> 4   0.2143595 -1.4668197 1.000000 .         .        1.000000 .        .
#> 5   0.3107692  1.7576174 1.000000 .         .        .        1.000000 .
#> 6   1.1739663 -0.1933380 3.427756 0.7475398 2.310769 1.000000 .        .
#> 7   0.6187899 -0.8497547 1.000000 .         .        1.000000 .        .
#> 8  -0.1127343  0.0584655 .        1.0000000 .        1.000000 .        .
#> 9   0.9170283 -0.8176704 .        1.0000000 .        1.980628 1.769035 1.945731
#> 10 -0.2232594 -2.0503078 .        .         1.000000 .        .        1.000000