Correlation in a dataset

Hello,

I hope you're doing well.

I've attached a data frame. I need to compute correlations, but many of the observations have different types of values.

How can I build a correlation matrix from the attached data? Thanks a lot.

You can use cor() to get correlations of the numeric variables. Correlations of text variables don't make sense, although you can build cross-tabulation tables for them using tabyl() from the janitor package.
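As a rough illustration of that split between numeric and text columns, here is a minimal sketch. The data frame `df` and its columns are made up for the example, since the attached data isn't reproduced here, and base R's table() stands in for janitor's tabyl() so the snippet runs without extra packages.

```r
# Hypothetical toy data; not the data frame attached to the question.
df <- data.frame(
  age    = c(23, 35, 41, 52, 60),
  weight = c(61, 70, 82, 79, 90),
  sex    = c("f", "m", "m", "f", "m"),
  smoker = c("no", "no", "yes", "yes", "no")
)

# Keep only the numeric columns, then compute the correlation matrix.
num_cols <- df[sapply(df, is.numeric)]
cor_mat <- cor(num_cols)
print(cor_mat)

# For text variables, a cross-tabulation is the usual summary instead.
# table() is base R; it is used here in place of janitor::tabyl().
xtab <- table(df$sex, df$smoker)
print(xtab)
```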

For example, to correlate age and weight:

I converted both to numeric.

First, the age:

unique(stats$age) # which(is.na(stats))
stats$age[stats$age == "null"] <- 0
stats[complete.cases(stats), ]
stats$age <- as.numeric(stats$age)
unique(stats$age)
class(stats$age) # it appears as numeric

Second, the weight:

unique(stats$weight)
class(stats$weight)
stats$weight <- as.numeric(stats$weight)
class(stats$weight) # now it is numeric too

Both are numeric, so let's correlate:

cor(stats$age, stats$weight)

'x' must be numeric
[1] NA

I need a way to select all the numeric variables, like:

stats %>% select(where(is.numeric))
and then continue in a single pipeline, with:
na.omit()
or %>% filter(complete.cases(.))
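One way that single step could look is sketched below in base R, so it runs without dplyr. `stats_demo` is a hypothetical stand-in for `stats`, and Filter(is.numeric, ...) plays the role of select(where(is.numeric)).

```r
# Hypothetical stand-in for `stats`, with a few missing values.
stats_demo <- data.frame(
  age    = c(23, 35, NA, 52, 60, 44),
  weight = c(61, 70, 82, NA, 90, 75),
  height = c(160, 172, 181, 169, 185, 174),
  city   = c("a", "b", "a", "c", "b", "c")
)

# Base R counterpart of select(where(is.numeric)): keep numeric columns only.
numeric_only <- Filter(is.numeric, stats_demo)

# Drop every row that has an NA in any numeric column,
# then compute the correlation matrix on what remains.
complete_rows <- na.omit(numeric_only)
cor_matrix <- cor(complete_rows)
print(round(cor_matrix, 2))
```

With dplyr loaded, the same idea should read as stats %>% select(where(is.numeric)) %>% na.omit() %>% cor(). Note that na.omit() drops a row if any numeric column is missing, so it can discard more data than a pairwise approach.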

You shouldn't set "null" to zero unless you know that zero is what "null" means; otherwise, treat it as missing.
Try:

cor(as.numeric(stats$age), as.numeric(stats$weight), use = "pairwise.complete.obs")
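To make the difference concrete, here is a small sketch that recodes "null" as NA rather than 0 before converting. It assumes the columns are stored as character strings with "null" marking missing values; `stats_demo` and the helper `to_number()` are hypothetical names used only for this example.

```r
# Hypothetical columns stored as text, with "null" marking missing values.
stats_demo <- data.frame(
  age    = c("23", "35", "null", "52", "60"),
  weight = c("61", "70", "82", "null", "90"),
  stringsAsFactors = FALSE
)

# Recode "null" as NA explicitly, then convert the strings to numbers.
to_number <- function(x) {
  x[x == "null"] <- NA
  as.numeric(x)
}
age    <- to_number(stats_demo$age)
weight <- to_number(stats_demo$weight)

# Correlate using only the pairs where both values are present.
r <- cor(age, weight, use = "pairwise.complete.obs")
print(r)
```

Recoding to NA keeps the missing observations out of the calculation, whereas a 0 would be treated as a real age or weight and pull the correlation around.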
