Getting "Error in family$linkfun(mustart) : Argument mu must be a nonempty numeric vector". No NA values or non-numeric columns in data

I have been trying to run a simple logistic regression but keep getting the above error. I checked for NAs and removed the rows containing them with the 'complete.cases' function. I also used 'sapply' to check the column types, and all are numeric. Please tell me if I am missing something.

heart_logistic <- glm(~male+glucose,data = data, family = 'binomial')

> sapply(data, mode)
           male             age       education   currentSmoker      cigsPerDay          BPMeds 
      "numeric"       "numeric"       "numeric"       "numeric"       "numeric"       "numeric" 
prevalentStroke    prevalentHyp        diabetes         totChol           sysBP           diaBP 
      "numeric"       "numeric"       "numeric"       "numeric"       "numeric"       "numeric" 
            BMI       heartRate         glucose      TenYearCHD 
      "numeric"       "numeric"       "numeric"       "numeric"

How come there is nothing on the left side of your formula? What are male and glucose going to predict?

heart_logistic is the model on the left side of the formula. Please let me know if you mean something else.

Male and glucose levels are some sample columns I am using to see if the formula is working. The dataset is to predict heart disease. The one below will make more sense, but it throws the same error.

heart_logistic <- glm(~TenYearCHD+male,data = data, family = 'binomial')

heart_logistic is the name of the R object in which you are storing the fitted GLM; it is not part of the formula.

The formula you have specified is ~ male + glucose

male and glucose are the variable names of your model covariates (or independent variables). These are fine how they are written.

But there is no response variable (also known as a dependent variable) in your formula - the response variable would be to the left of the tilde (~) in your formula. Without this, you won't be able to train the model on your data.

The dataset is to predict heart disease

Which column in your dataset records whether a patient has heart disease?
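
To illustrate the point about the missing response variable, here is a minimal sketch. It assumes that TenYearCHD is the 0/1 column recording heart disease; that is only a guess from the column name, so swap in whichever column actually holds the outcome. The data frame below is simulated purely so the example runs on its own and is not the real dataset.

```r
# Simulated stand-in for the real data: column names are borrowed from the
# question, but the values are made up for illustration only.
set.seed(42)
n <- 200
data <- data.frame(
  male       = rbinom(n, 1, 0.5),                    # 0/1 indicator
  glucose    = round(rnorm(n, mean = 80, sd = 15)),  # arbitrary numeric covariate
  TenYearCHD = rbinom(n, 1, 0.15)                    # ASSUMED 0/1 response column
)

# Response goes on the left of the tilde, covariates on the right.
heart_logistic <- glm(TenYearCHD ~ male + glucose,
                      data = data, family = binomial)

summary(heart_logistic)
```

With a response on the left of the tilde, glm() has something to fit the covariates against. A one-sided formula such as ~ male + glucose gives it no response at all, which is consistent with the error in the question.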
