This is my script:
library(class)
library(ggplot2)
library(gmodels)
library(scales)
library(caret)
library(tidyverse)
set.seed(123)  # arbitrary seed so the train/test split is reproducible
db_data <- iris
row_train <- sample(nrow(db_data), nrow(db_data)*0.8)
db_train <- db_data[row_train,]
db_test <- db_data[-row_train,]
unique(db_train$Species)
table(db_train$Species)
#--------
#KNN
#-------
model_knn<-train(Species ~ ., data = db_train, method = "knn",tuneGrid = data.frame(k = 12))
summary(model_knn)
#-------
#PREDICTION NEW RECORD
#-------
test_data <- db_test
# interval= is meant for regression models such as lm; a classifier just returns the predicted class
db_test$predict <- predict(model_knn, newdata=test_data)
confusionMatrix(data=factor(db_test$predict),reference=factor(db_test$Species))
#-------
How can I define the optimal value of k in the KNN model?
Don't fix k at 12. If you leave tuneGrid out, train() will test several values of k and choose the one with the highest accuracy. Alternatively, give k a reasonable range to search, e.g. k = 2:20.
Ok, but is there a function that can automatically calculate the best value? Or do I have to test each value myself?
The function train() would do that, but you asked it to consider only k = 12, so that's all it did.
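To make that concrete, here is a minimal sketch of letting train() search over k and then reading back the value it picked. The 10-fold cross-validation and the seed are my own assumptions, not something from the script above; bestTune and results are the standard components of the object that train() returns.

```r
library(caret)

set.seed(42)  # assumed seed, only so the run is repeatable
row_train <- sample(nrow(iris), nrow(iris) * 0.8)
db_train <- iris[row_train, ]

# Assumption: 10-fold cross-validation as the resampling scheme
ctrl <- trainControl(method = "cv", number = 10)

# Evaluate every k from 2 to 20; train() keeps the one with the best accuracy
model_knn <- train(Species ~ ., data = db_train, method = "knn",
                   trControl = ctrl,
                   tuneGrid = data.frame(k = 2:20))

# The selected k, and the resampled accuracy for each candidate
best_k <- model_knn$bestTune$k
print(best_k)
print(model_knn$results[, c("k", "Accuracy")])
```

The final model stored in model_knn is already refit with the selected k, so predict(model_knn, newdata = ...) uses it directly.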
I tried this:
model_knn<-train(Species ~ ., data = db_train, method = "knn",tuneGrid = data.frame(k = c(2:20)))
but I get this error:
Error in train(Species ~ ., data = db_train, method = "knn", tuneGrid = data.frame(k = c(2:20))) :
unused arguments (data = db_train, method = "knn", tuneGrid = data.frame(k = c(2:20)))
Unfortunately I can't reproduce your error as that syntax works for me without issue.
Sidenote: the c() wrapper around 2:20 is not a problem, but it is not required either, because 2:20 is already a well-formed vector.
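A quick way to check that for yourself (the variable names here are just for illustration):

```r
k_wrapped <- c(2:20)
k_plain <- 2:20

# Both are the same integer vector, so the tuning grids are the same too
identical(k_wrapped, k_plain)  # TRUE
identical(data.frame(k = k_wrapped), data.frame(k = k_plain))  # TRUE
```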
Maybe restart your session and try again?
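If a restart doesn't help, one possible cause (a guess on my part, nothing in this thread confirms it) is that some other object or package named train is masking caret's version, which would explain R rejecting data, method and tuneGrid as unused arguments. A rough sketch of how to check, and how to sidestep it by calling caret::train explicitly:

```r
library(caret)

set.seed(42)  # assumed seed, only so the run is repeatable
row_train <- sample(nrow(iris), nrow(iris) * 0.8)
db_train <- iris[row_train, ]

# Which environment does the train() that R finds first come from?
# This should be "caret" when caret's version is the one in use.
train_source <- environmentName(environment(train))
print(train_source)

# Is there an object called train sitting in the global environment?
masked_in_global <- exists("train", envir = globalenv(), inherits = FALSE)
print(masked_in_global)

# Calling the function through its namespace bypasses any masking
model_knn <- caret::train(Species ~ ., data = db_train, method = "knn",
                          tuneGrid = data.frame(k = 2:20))
```

If train_source is anything other than "caret", removing the stray object with rm(train), or loading caret last, should make the unqualified call work again.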