Hi! I have a question regarding statistics. I have a data set of 13 metabolites (numerical variables) and infant characteristics such as weight and gender. For weight, I have computed a Pearson r correlation, but I have read that this is not possible for nominal categories. Does anyone know which test would fit for assessing the correlation between a binary variable and a numerical one?
I would consider a point-biserial correlation coefficient. It measures the strength and direction of the relationship between a binary variable and a continuous variable. I am not sure if this is what you are looking for, but it was my first guess. Here is an example of how to calculate it in R, using a random dataset I created with just one metabolite variable:
#create random dataset
set.seed(123)
n <- 100
weight <- rnorm(n, mean = 45, sd = 15)
gender <- sample(c("female", "male"), n, replace = TRUE)
age <- sample(8:16, n, replace = TRUE)
Metabolities <- sample(10:50, n, replace = TRUE)
#create data frame
dataset <- data.frame(weight, gender, age, Metabolities)
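To round off the example, here is a minimal sketch of the calculation itself. It assumes gender is recoded as 0/1 (the name gender_num and the choice of "male" as 1 are my own, purely for illustration). Note that as.numeric() applied directly to a character vector returns NA, so an explicit recoding (or a factor) is needed first.

```r
# Simulated data in the same spirit as above (random values, not real measurements)
set.seed(123)
n <- 100
gender <- sample(c("female", "male"), n, replace = TRUE)
Metabolities <- sample(10:50, n, replace = TRUE)

# Recode the binary variable as 0/1 (hypothetical coding: male = 1, female = 0)
gender_num <- ifelse(gender == "male", 1, 0)

# The point-biserial coefficient is the Pearson correlation with the 0/1 variable
r_pb <- cor(Metabolities, gender_num)

# cor.test() additionally reports a p-value and a confidence interval
result <- cor.test(Metabolities, gender_num)
print(result)
```

Swapping which category is coded as 1 only flips the sign of the coefficient; its magnitude and the p-value stay the same.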
Thank you all so much! However, I have a question. I ran the test with correlation_biserial <- cor(M1, as.numeric(gender)) and with the cor.test function suggested by technocrat. The results are the same, and I cannot see where I told cor.test that I want a biserial correlation:
Pearson's correlation and the point-biserial correlation are mathematically equivalent, which is why cor.test needs no special argument. The latter is preferred when the dichotomous term has been induced rather than being natural. Because gender was, until recently, treated as a natural dichotomy, point biserial is not needed. However, were gender conceptualized in the modern sense of a social construct, dichotomous treatment would not be appropriate. I have removed the offending example from my answer.
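The equivalence can be checked numerically. The sketch below uses simulated data (the variables x and y are made up for illustration) and compares the textbook point-biserial formula, r_pb = (M1 - M0) / s_n * sqrt(p * q), where s_n is the standard deviation computed with denominator n, against what cor() returns.

```r
# Simulated data: x is a binary variable coded 0/1, y is continuous
set.seed(123)
n <- 100
x <- sample(c(0, 1), n, replace = TRUE)
y <- rnorm(n, mean = 30, sd = 8)

# Ingredients of the point-biserial formula
m1 <- mean(y[x == 1])               # mean of y in group 1
m0 <- mean(y[x == 0])               # mean of y in group 0
p  <- mean(x == 1)                  # proportion of observations in group 1
q  <- 1 - p                         # proportion of observations in group 0
s_n <- sqrt(mean((y - mean(y))^2))  # SD of y with denominator n (not n - 1)

# Point-biserial coefficient from the formula
r_pb <- (m1 - m0) / s_n * sqrt(p * q)

# Plain Pearson correlation on the same data
r_pearson <- cor(y, x)

# The two agree up to floating-point error
print(all.equal(r_pb, r_pearson))
```

So calling cor() or cor.test() with a 0/1 coded variable already gives the point-biserial coefficient; the name only describes the type of variables involved.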