Impute Missing Values by median/mode.

na.roughfix(object, ...)

Arguments

object

a data frame or numeric matrix.

...

further arguments that specific methods may require.

Value

A completed data matrix or data frame. For numeric variables, NAs are replaced with column medians. For factor variables, NAs are replaced with the most frequent levels (breaking ties at random). If object contains no NAs, it is returned unaltered.
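
The replacement rule described above can be sketched in base R. This is an illustration only, not the package's implementation: `rough_fix_sketch` is a hypothetical helper name, it handles data frames only, and it silently skips columns that are neither numeric nor factor.

```r
# Hypothetical helper (not part of randomForest): a base-R sketch of the
# median/mode rule, for data frames only.
rough_fix_sketch <- function(df) {
  stopifnot(is.data.frame(df))
  if (!anyNA(df)) return(df)  # nothing to impute: return the input unaltered
  for (j in seq_along(df)) {
    col <- df[[j]]
    miss <- is.na(col)
    if (!any(miss)) next
    if (is.numeric(col)) {
      # Numeric column: fill NAs with the median of the observed values.
      df[[j]][miss] <- median(col, na.rm = TRUE)
    } else if (is.factor(col)) {
      # Factor column: fill NAs with the most frequent level,
      # picking one at random when several levels tie.
      counts <- table(col)
      modes <- names(counts)[counts == max(counts)]
      pick <- modes[sample.int(length(modes), 1)]
      df[[j]][miss] <- pick
    }
    # Columns of any other type are left untouched in this sketch.
  }
  df
}

toy <- data.frame(
  x = c(1, NA, 3, 10),
  g = factor(c("a", "b", "b", NA))
)
fixed <- rough_fix_sketch(toy)
print(fixed)
```

In this toy data the missing `x` is filled with 3, the median of 1, 3 and 10, and the missing `g` is filled with `"b"`, the most frequent level.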

Note

This is used as a starting point for imputing missing values by random forest.
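
As a hedged illustration of that role: the package's `rfImpute()` begins from a rough fix of this kind and then refines the imputed values using random forest proximities. The sketch below assumes the randomForest package is installed; the number of dropped values (10 per column) is an arbitrary choice for the illustration.

```r
library(randomForest)

data(iris)
iris.na <- iris
set.seed(111)
# Drop 10 values from each of the four numeric predictors (arbitrary count).
for (i in 1:4) iris.na[sample(150, 10), i] <- NA

# The response (Species) must be complete; only the predictors are imputed.
iris.imputed <- rfImpute(Species ~ ., iris.na)

# The result is a data frame with no remaining NAs.
anyNA(iris.imputed)
```

For a quick, single-pass fill, `na.roughfix()` alone is enough; `rfImpute()` costs several forest fits in exchange for imputations that take the other variables into account.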

See also

rfImpute, randomForest.

Examples

library(randomForest)
data(iris)
iris.na <- iris
set.seed(111)
## artificially drop some data values.
for (i in 1:4) iris.na[sample(150, sample(20)), i] <- NA
## impute the missing values directly ...
iris.roughfix <- na.roughfix(iris.na)
## ... or let randomForest() apply the rough fix via na.action.
iris.narf <- randomForest(Species ~ ., iris.na, na.action=na.roughfix)
print(iris.narf)
#> 
#> Call:
#>  randomForest(formula = Species ~ ., data = iris.na, na.action = na.roughfix)
#>                Type of random forest: classification
#>                      Number of trees: 500
#> No. of variables tried at each split: 2
#> 
#>         OOB estimate of  error rate: 4.67%
#> Confusion matrix:
#>            setosa versicolor virginica class.error
#> setosa         50          0         0        0.00
#> versicolor      0         47         3        0.06
#> virginica       0          4        46        0.08