Hey everyone! I am new to RStudio and would like some help figuring out how to calculate the correlation coefficient for two given variables (weight and exercise_minutes). I have attached what I have done so far below.
The correlation coefficient I am getting is incorrect: it should be 0.3763526, but I am getting 464394961 (a much higher number than expected), and I'm not sure why. Any help is greatly appreciated. Also, the formula for determining the correlation coefficient is written as a comment.
#Here is your formula
(sum(product_2_variables))/(20-1)*sd(weight)*sd(excercise_minutes)
#Here is a version with simple numbers 190/19*5*2
sum(c(100,90))/(20-1)*5*2
#> [1] 100
#Here is a version with simple numbers 190/(19*5*2) and parentheses
#forcing 5 and 2 into the denominator
sum(c(100,90))/((20-1)*5*2)
#> [1] 1
I used numbers in my example that would facilitate mental calculations. I kept the (20-1) and sum() from your formula to make the comparisons between my formulas and yours simple.
Looking at my two formulas, which give very different answers, are you still convinced that sd(weight) and sd(excercise_minutes) are in the denominator of your formula?
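If it helps to check things end to end, here is a rough sketch on made-up numbers (not your data) that compares the hand formula with cor(). It assumes product_2_variables holds the products of each variable's deviations from its mean. I can't see how you built that vector, so treat that part as a guess on my side.

```r
# Made-up values for illustration only; these are NOT the original data
weight <- c(150, 162, 171, 180, 195, 158)
exercise_minutes <- c(30, 45, 20, 60, 50, 35)
n <- length(weight)

# Assumption: product_2_variables is the product of the deviations from each mean
product_2_variables <- (weight - mean(weight)) *
  (exercise_minutes - mean(exercise_minutes))

# Parentheses force both sd() terms into the denominator
r_manual <- sum(product_2_variables) /
  ((n - 1) * sd(weight) * sd(exercise_minutes))

# Built-in Pearson correlation for comparison
r_builtin <- cor(weight, exercise_minutes)

all.equal(r_manual, r_builtin)
#> [1] TRUE
```

If the hand formula still disagrees with cor() after the parentheses are fixed, the next thing I would look at is how product_2_variables is calculated.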
Yes, I added another set of parentheses, and the correlation coefficient I am getting is now 9.3864 instead of the original 464394961.
Also, I tried the code you wrote, cor(df$weight,df$exercise_minutes), and I still get 0.3763526. I am just concerned about why the coefficients don't match when I use the equation versus the built-in R function cor().
Also, how would you describe these histogram shapes? For weight, I said skewed left; for exercise_minutes, I wrote bell curve; and for height, I wrote bimodal. I'm not sure if that's correct. Any help is greatly appreciated.