Hello, I'm having trouble calculating probabilities with Naive Bayes for a new set of data.
The code below works on a test set, but that set still contains the dependent variable. How do I predict probabilities for a dataset that has the same independent variables but no dependent variable, since the dependent variable is exactly what I need to estimate?
Thanks
# Libraries
library(naivebayes)
library(dplyr)
library(ggplot2)
library(psych)
#Read data file
getwd()
data <- read.csv('https://raw.githubusercontent.com/bkrai/Statistical-Modeling-and-Graphs-with-R/main/binary.csv')
#contingency table
xtabs(~admit + rank, data = data)
#Rank & admit are categorical variables
data$rank <- as.factor(data$rank)
data$admit <- as.factor(data$admit)
summary(data)
#Split data into Training (80%) and Testing (20%) datasets
set.seed(1234)
ind <- sample(2, nrow(data), replace = TRUE, prob = c(0.8, 0.2))
train <- data[ind==1,]
test <- data[ind==2,]
# Naive Bayes
model <- naive_bayes(admit ~ ., data = train)
model
plot(model)
# Predict class probabilities on the test set (admit column dropped)
p <- predict(model, newdata = select(test, -admit), type = 'prob')
head(cbind(p, test))
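
For reference, here is a minimal sketch of the situation I am asking about: scoring rows that have no admit column at all. The data frame new_data and its values are made up for illustration, and I am assuming the new data must use the same column names and types as the training data (in particular, rank as a factor with the same levels).

```r
library(naivebayes)

# Same dataset as above; the model is fit on all rows here to keep the sketch short
data <- read.csv('https://raw.githubusercontent.com/bkrai/Statistical-Modeling-and-Graphs-with-R/main/binary.csv')
data$rank <- as.factor(data$rank)
data$admit <- as.factor(data$admit)
model <- naive_bayes(admit ~ ., data = data)

# Hypothetical new applicants (illustrative values only), with no admit column
new_data <- data.frame(
  gre  = c(520L, 700L),
  gpa  = c(3.1, 3.9),
  rank = c(3, 1)
)

# Assumption: rank must be a factor with the same levels as in the training data
new_data$rank <- factor(new_data$rank, levels = levels(data$rank))

# One row per new observation, one probability column per class of admit
p_new <- predict(model, newdata = new_data, type = 'prob')
cbind(new_data, p_new)
```

If this is right, the only requirement on the new data would be that the predictor columns match the ones used for training; the dependent variable is never needed at prediction time.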