Error in knnImputation(dat_del, k = 1) : Not sufficient complete cases for computing neighbors.

Using knnImputation, I'm getting an error: 'Error in knnImputation(dat_del, k = 5): Not sufficient complete cases for computing neighbors.' I've already tried stricter filtering for missing values and changing the value of k, but it didn't work. Can you please help me with a solution?

Which library are you using, pguIMP? What filtering have you done?

First and foremost, I would suggest you check the distribution of missing values. For example, if you have a data frame dat_del, you can use:

hist( colMeans(is.na(dat_del)) )

to check the proportion of missing values per column. Same per row with rowMeans().
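
If you want the actual numbers rather than a histogram, here is a minimal sketch in base R. The toy dat_del is made up so the snippet runs on its own; swap in your own data frame. The names col_missing and row_missing are just for illustration.

```r
# Toy data frame standing in for the real dat_del (made-up values)
dat_del <- data.frame(
  a = c(1, NA, 3, NA, 5),
  b = c(NA, NA, 3, 4, 5),
  c = c(1, 2, NA, 4, NA)
)

# Proportion of missing values in each column
col_missing <- colMeans(is.na(dat_del))

# Proportion of missing values in each row
row_missing <- rowMeans(is.na(dat_del))

# Sort so the worst columns and rows come first
print(sort(col_missing, decreasing = TRUE))
print(sort(row_missing, decreasing = TRUE))
```

Columns or rows near the top of those lists are the first candidates to drop.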

For example, the package {impute} sets a threshold of 80% missing values per column and 50% per row; if you have more than that, you might want to filter more (does it really make sense to work with a column with more than 80% missing data?).
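
A rough way to apply that kind of filtering yourself, using the 80% and 50% figures as cutoffs. This is only a sketch in base R, not what {impute} does internally; col_max, row_max and the toy data are made-up names and values.

```r
# Toy data frame standing in for the real dat_del (made-up values)
dat_del <- data.frame(
  a = c(1, NA, 3, 4, 5, 6),
  b = c(NA, NA, NA, NA, NA, 6),  # 5 of 6 missing, above the column cutoff
  c = c(1, NA, 3, 4, NA, 6),
  d = c(1, NA, 3, NA, 5, 6)
)

col_max <- 0.8  # hypothetical cutoff: max proportion of NA allowed per column
row_max <- 0.5  # hypothetical cutoff: max proportion of NA allowed per row

# Drop the columns that are mostly missing first
keep_cols <- colMeans(is.na(dat_del)) <= col_max
dat_filtered <- dat_del[, keep_cols, drop = FALSE]

# Then drop the rows that are still mostly missing in the remaining columns
keep_rows <- rowMeans(is.na(dat_filtered)) <= row_max
dat_filtered <- dat_filtered[keep_rows, , drop = FALSE]

print(dim(dat_filtered))
```

Filtering columns before rows is a choice, not a rule; doing it the other way round can keep a different subset, so it is worth trying both on your data.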

You can also vary k. In your title you have k = 1, which seems really too low (if that one neighbor happens to be missing, then it fails). I expect it will be more robust at higher k, but you will also end up "smoothing" your data and introducing more bias.
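
Since the error message talks about complete cases, it may also be worth counting how many rows have no missing value at all. My understanding, which is an assumption I have not checked against your version of the package, is that knnImputation looks for neighbours among the complete rows only and stops when there are fewer of them than k. A quick check in base R, again on made-up data:

```r
# Toy data frame standing in for the real dat_del (made-up values):
# every row has at least one NA, so there are no complete cases
dat_del <- data.frame(
  a = c(1, NA, 3, NA),
  b = c(NA, 2, 3, 4),
  c = c(1, 2, NA, 4)
)

k <- 1  # the value from the title

# Number of rows without any missing value
n_complete <- sum(complete.cases(dat_del))
print(n_complete)

if (n_complete < k) {
  message("Only ", n_complete, " complete row(s) available for k = ", k)
}
```

If this comes back as 0 on your data, changing k is unlikely to help under that assumption, and you would probably need to drop the sparsest columns until some complete rows are left.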

You should also consider the package {naniar} which provides a set of tools to visualize missing values. Try to determine if there are patterns to your missing data.
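
A small sketch of what that could look like, assuming {naniar} is installed (the snippet skips the calls otherwise). The toy dat_del is made up; miss_var_summary(), miss_case_summary() and vis_miss() are the naniar functions I would start with.

```r
# Toy data frame standing in for the real dat_del (made-up values)
dat_del <- data.frame(
  a = c(1, NA, 3, NA, 5),
  b = c(NA, NA, 3, 4, 5),
  c = c(1, 2, NA, 4, NA)
)

if (requireNamespace("naniar", quietly = TRUE)) {
  # Number and percentage of missing values per variable
  print(naniar::miss_var_summary(dat_del))

  # Same summary per row ("case")
  print(naniar::miss_case_summary(dat_del))

  # Heatmap-style overview of where the missing values sit
  print(naniar::vis_miss(dat_del))
} else {
  message("naniar is not installed; try install.packages('naniar')")
}
```

If the missing values cluster in a few columns or in blocks of rows, that usually tells you more about what to filter than a single overall percentage.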

It's hard to give you more specific advice, as it highly depends on the type of data you have, what the missing values represent, and how they are distributed.
