Naive Bayes for a new group of data

profiack · September 29, 2023, 1:32pm

Hello, I'm having trouble calculating probabilities with Naive Bayes for a new set of data.
Below is the code that works on a test set, but that set contains the dependent variable. How do I predict the probability for a dataset that has the same independent variables but not the dependent one, since that probability is what I need to calculate?
Thanks


# Libraries
library(naivebayes)
library(dplyr)
library(ggplot2)
library(psych)

#Read data file
getwd()
data <- read.csv('https://raw.githubusercontent.com/bkrai/Statistical-Modeling-and-Graphs-with-R/main/binary.csv')

#contingency table 
xtabs(~admit + rank, data = data)

#Rank & admit are categorical variables
data$rank <- as.factor(data$rank)
data$admit <- as.factor(data$admit)
summary(data)

#Split data into Training (80%) and Testing (20%) datasets
set.seed(1234)
ind <- sample(2, nrow(data), replace = TRUE, prob = c(0.8, 0.2))
train <- data[ind==1,]
test <- data[ind==2,]

# Naive Bayes
model <- naive_bayes(admit ~ ., data = train)
model
plot(model)


# Predict
p <- predict(model, type = 'prob', newdata = select(test, -admit))
head(cbind(p, test))

technocrat · September 29, 2023, 8:49pm

I'm uncertain what you are looking for. The 0.8/0.2 passed to sample() only sets the train/test split; the model estimates its priors from the class frequencies in train, as summary(model) shows:

===================================== Naive Bayes ====================================== 
 
- Call: naive_bayes.formula(formula = admit ~ ., data = train) 
- Laplace: 0 
- Classes: 2 
- Samples: 325 
- Features: 3 
- Conditional distributions: 
    - Categorical: 1
    - Gaussian: 2
- Prior probabilities: 
    - 0: 0.6862
    - 1: 0.3138

----------------------------------------------------------------------------------------
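The prior probabilities printed above are simply the class proportions in the training data. A quick check in base R, as a sketch: the counts 223 and 102 are an assumption, chosen because they are the only split of 325 samples that rounds to the printed priors, and admit_train is a hypothetical stand-in for train$admit.

```r
# Hypothetical stand-in for train$admit: 223 zeros and 102 ones (325 rows).
# These counts are assumed; they are consistent with the printed priors.
admit_train <- factor(c(rep(0, 223), rep(1, 102)))

# Class proportions, which is what naive_bayes() uses as priors by default
priors <- prop.table(table(admit_train))
round(priors, 4)
```

With the real data, prop.table(table(train$admit)) should give the same figures.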

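When the model is applied to test, predict() returns the posterior probabilities. There's no need to exclude admit from predict(): only the predictor columns are used, so the same call works on new data that has no admit column at all.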
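For data that has the predictors but no admit column, the same predict() call applies. The following is a minimal sketch, not the original poster's exact setup: the training frame is synthetic so the block runs on its own, and new_data with its three applicants is hypothetical. In the thread's code, model would be the one already fitted on train.

```r
library(naivebayes)

# Synthetic training data with the same column names as binary.csv
# (assumption: used only to keep this example self-contained)
set.seed(1234)
n <- 200
train <- data.frame(
  admit = factor(sample(0:1, n, replace = TRUE, prob = c(0.7, 0.3))),
  gre   = round(rnorm(n, mean = 580, sd = 100)),
  gpa   = round(rnorm(n, mean = 3.4, sd = 0.4), 2),
  rank  = factor(sample(1:4, n, replace = TRUE))
)
model <- naive_bayes(admit ~ ., data = train)

# Hypothetical new applicants: predictors only, no admit column.
# rank is given the same factor levels as in the training data.
new_data <- data.frame(
  gre  = c(520, 700, 640),
  gpa  = c(3.1, 3.9, 3.5),
  rank = factor(c(3, 1, 2), levels = levels(train$rank))
)

# Posterior probability of each class for every new row
p_new <- predict(model, newdata = new_data, type = "prob")
cbind(new_data, p_new)
```

The key point is that the new data frame uses the same column names and types as the training predictors, including the factor levels of rank.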
