Continuing the discussion from Error in missForest command:
Dear Mara,
I have managed to create a reprex for the missForest command; please see the details below. I have three questions:
1) The missForest command runs on the subset of data I created for the reprex, but not on the main dataset, which is quite large. How do I overcome this problem?
2) The reprex output below contains a lot of warning messages. Should I ignore them?
3) How do I use the imputed data in further analysis? Can I view them or export them to another format, e.g. Excel or Stata?
Many thanks in advance. Regards, Saran
library(missForest)
#> Loading required package: randomForest
#> randomForest 4.6-14
#> Type rfNews() to see new features/changes/bug fixes.
#> Loading required package: foreach
#> Loading required package: itertools
#> Loading required package: iterators
df <- data.frame(
  drecall = c(NA, NA, 6, 7, 5, NA, NA, NA, NA, 8, 5, NA, NA, NA, 3, NA,
              NA, 6, 5, 5, 9, 4, 3, 4, NA, NA, 7, 3, NA, 3, 7, 7, 4, NA,
              4, 4, NA, 4, NA, 2, 4, 7, 7, 5, 7, 5, 2, 4, NA, NA),
  orientation = c(NA, NA, 3, 4, 4, NA, NA, NA, NA, 4, 4, NA, NA, NA, 4, NA,
                  NA, 4, 3, 4, 4, 4, 3, 4, NA, NA, 3, 4, NA, 3, 4, 4, 4, NA,
                  3, 4, NA, 3, NA, 3, 3, 4, 4, 3, 4, 4, 4, 4, NA, NA),
  number = c(NA, NA, NA, 3, 2, NA, NA, NA, NA, NA, 3, NA, NA, NA, NA, NA,
             NA, NA, 2, 3, 3, NA, NA, 0, NA, NA, NA, 3, NA, NA, 2, 3, 3,
             NA, 2, NA, NA, NA, NA, NA, 1, NA, 3, NA, NA, NA, NA, NA, NA,
             NA),
  slfall = c(NA, NA, 5, 5, 5, NA, NA, NA, NA, 4, 5, NA, NA, NA, 5, NA,
             NA, 3, 5, 5, 3, 5, 5, 5, NA, NA, 5, 5, NA, 5, 5, 4, 3, NA,
             3, 4, NA, 3, NA, 5, 5, 4, 5, 5, 4, 5, 5, 2, NA, NA),
  slwake = c(NA, NA, 4, 3, 1, NA, NA, NA, NA, 3, 4, NA, NA, NA, 4, NA,
             NA, 1, 1, 2, 2, 1, 4, 1, NA, NA, 3, 1, NA, 4, 1, 3, 1, NA,
             2, 3, NA, 1, NA, 4, 4, 2, 1, 4, 4, 1, 4, 1, NA, NA),
  sltired = c(NA, NA, 4, 2, 4, NA, NA, NA, NA, 4, 4, NA, NA, NA, 1, NA,
              NA, 2, 4, 4, 2, 1, 4, 4, NA, NA, 4, 3, NA, 4, 2, 1, 4, NA,
              1, 4, NA, 1, NA, 4, 2, 4, 4, 4, 4, 4, 2, 4, NA, NA),
  slmorn = c(NA, NA, 4, 4, 2, NA, NA, NA, NA, 4, 1, NA, NA, NA, 1, NA,
             NA, 2, 4, 4, 1, 1, 4, 3, NA, NA, 2, 4, NA, 1, 4, 4, 1, NA,
             2, 4, NA, 1, NA, 1, 2, 4, 4, 4, 4, 4, 4, 4, NA, NA),
  affect = c(NA, NA, 7, 8, 8, NA, NA, NA, NA, 7, 7, NA, NA, NA, 4, NA,
             NA, 7, 8, 6, 7, 0, 8, 8, NA, NA, 8, 7, NA, 8, 5, 7, 4, NA,
             8, 8, NA, NA, NA, 8, 8, 8, 8, 8, 8, 8, 7, 7, NA, NA),
  hear = c(NA, NA, 5, 4, 3, NA, 5, 4, NA, 4, 4, NA, NA, 1, 2, NA, NA,
           3, 2, 4, 3, 2, 5, 5, 3, NA, 2, 3, 2, 3, 2, 4, 5, NA, 2, 2,
           NA, 1, NA, 3, 3, 3, 4, 5, 5, 5, 5, 3, 4, NA),
  nvision = c(NA, NA, 4, 4, 4, NA, NA, NA, NA, 5, 4, NA, NA, NA, 3, NA,
              NA, 3, 4, 5, 5, 2, 5, 5, NA, NA, 4, 4, NA, 3, 3, 4, 1, NA,
              3, 5, NA, 4, NA, 2, 3, 3, 3, 3, 4, 5, 5, 4, NA, NA)
)
iris.mis <- prodNA(df)
summary(iris.mis)
#>     drecall    orientation        number          slfall
#>  Min.   :2    Min.   :3.000   Min.   :0.000   Min.   :2.000
#>  1st Qu.:4    1st Qu.:3.000   1st Qu.:2.000   1st Qu.:4.000
#>  Median :5    Median :4.000   Median :3.000   Median :5.000
#>  Mean   :5    Mean   :3.655   Mean   :2.357   Mean   :4.448
#>  3rd Qu.:6    3rd Qu.:4.000   3rd Qu.:3.000   3rd Qu.:5.000
#>  Max.   :9    Max.   :4.000   Max.   :3.000   Max.   :5.000
#>  NA's   :25   NA's   :21      NA's   :36      NA's   :21
#>      slwake         sltired        slmorn          affect
#>  Min.   :1.000   Min.   :1    Min.   :1.000   Min.   :0.000
#>  1st Qu.:1.000   1st Qu.:2    1st Qu.:1.000   1st Qu.:7.000
#>  Median :2.000   Median :4    Median :4.000   Median :8.000
#>  Mean   :2.357   Mean   :3    Mean   :2.793   Mean   :7.077
#>  3rd Qu.:4.000   3rd Qu.:4    3rd Qu.:4.000   3rd Qu.:8.000
#>  Max.   :4.000   Max.   :4    Max.   :4.000   Max.   :8.000
#>  NA's   :22      NA's   :27   NA's   :21      NA's   :24
#>       hear         nvision
#>  Min.   :1.0   Min.   :1.0
#>  1st Qu.:2.5   1st Qu.:3.0
#>  Median :3.0   Median :4.0
#>  Mean   :3.4   Mean   :3.7
#>  3rd Qu.:4.5   3rd Qu.:4.0
#>  Max.   :5.0   Max.   :5.0
#>  NA's   :15    NA's   :20
iris.imp <- missForest(iris.mis)
#> missForest iteration 1 in progress...
#> Warning in randomForest.default(x = obsX, y = obsY, ntree = ntree, mtry =
#> mtry, : The response has five or fewer unique values. Are you sure you want
#> to do regression?
#> [identical warning repeated 8 more times in iteration 1 and 9 times in each of iterations 2 to 4; repeats omitted]
#> done!
#> missForest iteration 2 in progress...
#> done!
#> missForest iteration 3 in progress...
#> done!
#> missForest iteration 4 in progress...
#> done!
Created on 2019-02-21 by the reprex package (v0.2.1)