I have some problems with my random forest. I have imbalanced data. I would like to first do a prediction with this data, then make it more balanced and do a new prediction. The problem is that I get this error:
Is there anything I can do to fix this? To make them the same length?
Thanks
remove [2]
(unless you have a good reason for having it there that you can explain ?)
length(data_test$Risk_Flag)
length(pred.tree)
are they not both 128 ?
No, the other one is 15... Don't really know why, or how to fix it
the other one being data_test$Risk_Flag
specifically ?
The other one is
"tree <- rpart(Risk_Flag~.,data=data_train)"
I'm not sure of the relevance; I thought the two parts that need to be the same length are the ones sent to accuracy.meas
and these are data_test$Risk_Flag
and pred.tree
and not tree
So could you please run the length code for the two, and paste the results to the forum (avoiding screenshots where possible)?
library(rpart)
tree <- rpart(Risk_Flag~.,data=data_train)
pred.tree <- predict(tree, newdata = data_test)
accuracy.meas(data_test$Risk_Flag, pred.tree)
call:
accuracy.meas(response = data_test$Risk_Flag, predicted = pred.tree)
Error in accuracy.meas(data_test$Risk_Flag, pred.tree) : Response and predicted must have the same length.
length(pred.tree)
length(data_test$Risk_Flag)
[1] 1866
[1] 933
Made the data set a little bigger, that's why the numbers have changed compared to the first screenshot
ok thanks.
can you do
length(na.omit(pred.tree))
length(na.omit(data_test$Risk_Flag))
length(na.omit(pred.tree))
length(na.omit(data_test$Risk_Flag))
[1] 1866
[1] 933
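A possible explanation, offered as a sketch rather than a confirmed diagnosis: 1866 is exactly 2 × 933. When the response is a factor, predict() on an rpart classification tree returns by default a matrix of class probabilities with one column per level, so length() counts rows times columns. The thread does not confirm that Risk_Flag is a two-level factor, so that part is an assumption. The toy data below (toy, train, test, x1, x2) is made up for illustration and is not the poster's data.

```r
library(rpart)

set.seed(1)

# Hypothetical stand-in data: two numeric predictors and a two-level factor response
n <- 200
toy <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
toy$Risk_Flag <- factor(ifelse(toy$x1 + rnorm(n, sd = 0.5) > 1, 1, 0))
train <- toy[1:150, ]
test <- toy[151:200, ]

tree <- rpart(Risk_Flag ~ ., data = train)

# Default prediction for a classification tree: a probability matrix
pred.tree <- predict(tree, newdata = test)
dim(pred.tree)      # rows x number of classes
length(pred.tree)   # rows * number of classes, not the number of rows

# One value per test row: probability of the second level ...
pred.prob <- pred.tree[, 2]
# ... or the predicted class label
pred.class <- predict(tree, newdata = test, type = "class")

length(pred.prob)
length(pred.class)
```

If this is what is happening, passing a single column such as pred.tree[, 2] to accuracy.meas would give it a vector with the same length as data_test$Risk_Flag. Note the comma: pred.tree[2] picks a single element, while pred.tree[, 2] picks the whole second column.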
Ok, can you provide a reprex ?
You showed you had the error even with 128 records of datatest, so that should be enough and not too much to share.
Please have a look at this guide, to see how to create one:
A minimal reproducible example consists of the following items:
A minimal dataset, necessary to reproduce the issue
The minimal runnable code necessary to reproduce the issue, which can be run
on the given dataset, and including the necessary information on the used packages.
Let's quickly go over each one of these with examples:
Minimal Dataset (Sample Data)
You need to provide a data frame that is small enough to be (reasonably) pasted on a post, but big enough to reproduce your issue.
Let's say, as an example, that you are working with the iris data frame
head(iris)
#> Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1 5.1 3.5 1.4 0.…
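For the sample-data step, one common approach is dput(), which prints R code that rebuilds an object exactly. A minimal sketch, using iris as a stand-in; in this thread the object would presumably be a small slice of data_test, which is an assumption about what the poster can share:

```r
# Take a small slice of the data; iris stands in for the real data frame
sample_df <- head(iris, 5)

# Print code that recreates sample_df; paste this output into the post
dput(sample_df)

# Anyone reading the post can rebuild the data from that text
txt <- capture.output(dput(sample_df))
rebuilt <- eval(parse(text = txt))
all.equal(rebuilt, sample_df)
```

Pasting the dput() output together with the rpart and accuracy.meas calls gives others everything they need to reproduce the length mismatch.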