Please, I need help with this code.

```r
# Load necessary libraries
library(mlbench)  # For Breast Cancer dataset
library(caret)    # For data partitioning
library(dplyr)    # For data manipulation

# Load Breast Cancer dataset

data(BreastCancer)
#rm = rm(Id)

# Remove missing values
BreastCancer <- na.omit(BreastCancer, Id)

# Convert Class column to a factor (Malignant = 1, Benign = 0)
BreastCancer$Class <- factor(ifelse(BreastCancer$Class == "malignant", 1, 0), levels = c(0, 1))

# Split dataset into training (70%) and testing (30%)
set.seed(123)
trainIndex <- createDataPartition(BreastCancer$Class, p = 0.7, list = FALSE)
trainData <- BreastCancer[trainIndex, ]
testData <- BreastCancer[-trainIndex, ]

# Function to fit a logistic regression model
train_logistic_model <- function(trainData) {
  glm(Class ~ ., data = trainData, family = binomial)
}

# Function to make predictions and evaluate performance
predict_summary <- function(model, data) {
  # Predict probabilities
  probs <- predict(model, newdata = data, type = "response")
  # Convert to binary (threshold = 0.5)
  preds <- ifelse(probs > 0.5, 1, 0)
  actuals <- as.numeric(data$Class)

  # Compute evaluation metrics
  cm <- table(Predicted = preds, Actual = actuals)
  accuracy <- sum(diag(cm)) / sum(cm)
  precision <- cm[2, 2] / sum(cm[2, ])
  recall <- cm[2, 2] / sum(cm[, 2])
  f1_score <- 2 * (precision * recall) / (precision + recall)

  return(list(ConfusionMatrix = cm, Accuracy = accuracy, Precision = precision, Recall = recall, F1_Score = f1_score))
}

# Train the model
model <- train_logistic_model(trainData)

# Evaluate on training data
train_results <- predict_summary(model, trainData)
print("Training Set Performance:")
print(train_results)

# Evaluate on test data
test_results <- predict_summary(model, testData)
print("Test Set Performance:")
print(test_results)
```

Hi @ckappiah1999, welcome to the community!

Can you be more specific about what help you are looking for? Does the code not run? Does it not provide the results you expect? Are you looking to extend it to do something else?

Best,
Randy

Please, the code does not produce the expected output.
Thanks.

Can you give us the output you are getting?

Copy the output and paste it here between

```

```

This gives us formatted output that we can read easily.

```
Error in createDataPartition(BreastCancer$Class, p = 0.7, list = FALSE) :
could not find function "createDataPartition"
```
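
This error usually means that caret was not attached in the session, most often because it is not installed and the earlier `library(caret)` call failed. A minimal sketch for checking that, assuming the three packages loaded at the top of the script are the only ones needed (`pkgs`, `installed`, and `missing_pkgs` are names made up for this example):

```r
# Packages the script loads with library()
pkgs <- c("mlbench", "caret", "dplyr")

# requireNamespace() returns FALSE instead of raising an error
# when a package is not installed
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
missing_pkgs <- pkgs[!installed]

if (length(missing_pkgs) > 0) {
  message("Not installed: ", paste(missing_pkgs, collapse = ", "))
  # install.packages(missing_pkgs)  # uncomment to install from CRAN
} else {
  message("All packages are installed; rerun the library() calls.")
}
```

If caret shows up as missing, installing it and rerunning the script from the top should make `createDataPartition()` available.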

The `na.omit()` line should read

```r
BreastCancer <- na.omit(BreastCancer)
```

And I think you do need to drop the Id variable.

```r
BreastCancer <- select(BreastCancer, -Id)
```
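
To see what these two steps do, here is a small sketch on a made-up data frame (`toy` and its values are invented for illustration, not taken from BreastCancer). The column drop uses base R so that it runs without dplyr; it should give the same result as `select(toy, -Id)`.

```r
# Hypothetical toy data: an identifier, one predictor with a missing value,
# and a 0/1 class label
toy <- data.frame(
  Id    = c("a1", "a2", "a3", "a4"),
  x     = c(5, NA, 3, 8),
  Class = factor(c(0, 1, 0, 1), levels = c(0, 1)),
  stringsAsFactors = FALSE
)

# Drop every row that contains an NA (here, the second row)
toy <- na.omit(toy)

# Drop the Id column so that `Class ~ .` does not use it as a predictor
toy <- toy[, setdiff(names(toy), "Id"), drop = FALSE]

str(toy)
```

Dropping Id matters because `Class ~ .` uses every remaining column as a predictor. An identifier has close to one distinct value per row, so it would likely add hundreds of meaningless coefficients and make `predict()` fail on test rows whose Id was never seen in training.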

It runs for me but I don't know enough about the subject to know if it is working properly.
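
One detail that may be worth checking when judging whether it works properly: `predict_summary()` builds `actuals` with `as.numeric(data$Class)`. On a factor, `as.numeric()` returns the internal level codes (1 and 2), not the labels 0 and 1, so the confusion matrix columns end up labelled 1/2 while the rows are labelled 0/1. The metrics index by position, so they should still come out the same as long as both classes are predicted, but the labels are misleading. A small sketch with made-up vectors (`cls` and `preds` are illustrative, not model output):

```r
# Hypothetical true labels and predictions
cls   <- factor(c(0, 1, 1, 0, 1), levels = c(0, 1))
preds <- c(0, 1, 0, 0, 1)

as.numeric(cls)                # internal level codes: 1 2 2 1 2
as.numeric(as.character(cls))  # original labels:      0 1 1 0 1

# Fixing the levels on both sides keeps the table 2 x 2 with matching
# labels, even if one class is never predicted
cm <- table(
  Predicted = factor(preds, levels = c(0, 1)),
  Actual    = cls
)
cm

# Index by label rather than by position
precision <- cm["1", "1"] / sum(cm["1", ])
recall    <- cm["1", "1"] / sum(cm[, "1"])
```

Indexing by the label `"1"` instead of by position `[2, 2]` makes it explicit that malignant is treated as the positive class.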

Please, it is working well now.
Thank you very much.