correlation of the returns across periods

Can anyone help me with correlation in R? I tried to use cor.test(x, y), which I found on Google. I need to compute the correlation of returns from the historical stock prices that I have. I don't really understand what I should change or add to make it work. Thank you in advance.

I used to grumble that the help pages in R needed their own help page. I was embarrassed to discover


which is worthwhile reviewing.

The real effort, though, is in learning to think of R as school algebra: f(x) = y.

The three objects (in R everything is an object) are

x what is at hand
y what is desired
f convert x to y

Keep that in mind while looking at help(cor.test) because it's key to understanding the arguments that f here expects, which may not be the same as how your data is presently stored.

Here's the function signature

cor.test(x, y,
         alternative = c("two.sided", "less", "greater"),
         method = c("pearson", "kendall", "spearman"),
         exact = NULL, conf.level = 0.95, continuity = FALSE, ...)

Every argument has a default except for x and y. (The mysterious ... at the end indicates that the function accepts further arguments; you usually don't need to worry about those.)
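As a minimal sketch of what those defaults mean in practice: when an argument such as method is left out, the first element of its choice vector is used, so the first two calls below should agree. The vectors a and b are made-up values for illustration only.

```r
# Hypothetical data, for illustration only
a <- c(1.2, 2.5, 3.1, 4.8, 5.0, 6.7)
b <- c(2.0, 2.9, 3.5, 5.1, 4.9, 7.2)

# Only x and y are supplied; every other argument takes its default
res_default <- cor.test(a, b)

# The same call with the defaults written out
res_explicit <- cor.test(a, b,
  alternative = "two.sided",
  method = "pearson",
  conf.level = 0.95
)

# Overriding one default: rank-based Spearman correlation instead of Pearson
res_spearman <- cor.test(a, b, method = "spearman")
```

Comparing res_default$estimate with res_explicit$estimate is a quick way to confirm that spelling out the defaults changes nothing.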

Under Arguments

x, y numeric vectors of data values. x and y must have the same length.
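A common stumbling block is that the data at hand does not yet meet those two requirements. The sketch below assumes a hypothetical data frame, prices, whose columns were read in as character strings; the names and values are invented for illustration.

```r
# Hypothetical prices stored as character strings,
# as can happen after importing a file
prices <- data.frame(
  open  = c("21", "63", "39", "57"),
  close = c("62", "47", "5", "71"),
  stringsAsFactors = FALSE
)

# cor.test() needs numeric vectors, so check before calling it
is.numeric(prices$open)   # FALSE: this column is character

# Convert, then confirm both requirements from the help page
x <- as.numeric(prices$open)
y <- as.numeric(prices$close)
stopifnot(is.numeric(x), is.numeric(y), length(x) == length(y))

# Now the inputs are in the form the function expects
cor.test(x, y)
```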

Let's say you have a table of stock prices for some basket at two given dates and a difference. (This neglects dividends, of course, and isn't a rate of return, but that's separate; the function doesn't care what the numbers mean.)

DF <- structure(list(open = c(
  21L, 63L, 39L, 57L, 34L, 33L, 52L, 26L,
  22L, 46L, 92L, 16L, 56L, 31L, 81L, 70L, 14L, 36L, 59L, 1L, 55L,
  92L, 15L, 86L, 2L
), close = c(
  62L, 47L, 5L, 71L, 91L, 61L, 46L,
  70L, 40L, 87L, 45L, 46L, 80L, 22L, 68L, 25L, 95L, 24L, 23L, 29L,
  4L, 45L, 98L, 72L, 82L
), return = c(
  41L, -16L, -34L, 14L, 57L,
  28L, -6L, 44L, 18L, 41L, -47L, 30L, 24L, -9L, -13L, -45L, 81L,
  -12L, -36L, 28L, -51L, -47L, 83L, -14L, 80L
)), class = "data.frame", row.names = c(NA, -25L))

cor.test(DF$open, DF$return)
#>  Pearson's product-moment correlation
#> data:  DF$open and DF$return
#> t = -5.5544, df = 23, p-value = 1.192e-05
#> alternative hypothesis: true correlation is not equal to 0
#> 95 percent confidence interval:
#>  -0.8868103 -0.5161355
#> sample estimates:
#>        cor 
#> -0.7569028

Created on 2021-01-03 by the reprex package (v0.3.0.9001)

In this toy example, constructed from random integers under 100, we set x to the first column of DF and y to the last. There is a marked negative correlation (about -0.76), which is no surprise here because return was computed as close - open.
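Since the original question was about returns computed from historical prices, here is a minimal sketch of one way to get from prices to a correlation of returns. It assumes two hypothetical price series, stock_a and stock_b, observed on the same dates, and defines the simple return for a period as p[t] / p[t-1] - 1. The helper simple_return is not a built-in function, and all the numbers are made up.

```r
# Hypothetical closing prices for two stocks on the same seven dates
stock_a <- c(100, 102, 101, 105, 107, 106, 110)
stock_b <- c(50, 51, 50.5, 52, 53.5, 53, 54)

# Simple period-over-period return: p[t] / p[t-1] - 1
simple_return <- function(p) {
  p[-1] / p[-length(p)] - 1
}

ret_a <- simple_return(stock_a)
ret_b <- simple_return(stock_b)

# Each return vector is one element shorter than its price vector,
# and the two must still line up period by period
stopifnot(length(ret_a) == length(ret_b))

# Correlation of the two return series
cor.test(ret_a, ret_b)
```

Log returns, diff(log(p)), are a common alternative; which definition is appropriate depends on the analysis.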

Take a careful look at the Details section of the help page, particularly at what the return value of f (the y in the problem setup) represents. Make sure you understand what association means in this context.
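To make the return value concrete, the sketch below stores the result in an object and pulls out its pieces by name. It uses the first eight open and return values from the example data; the object name res is arbitrary.

```r
# First eight open and return values from the example data
x <- c(21, 63, 39, 57, 34, 33, 52, 26)
y <- c(41, -16, -34, 14, 57, 28, -6, 44)

res <- cor.test(x, y)

# The result is a list of class "htest"; its pieces can be pulled out by name
class(res)
res$estimate   # the sample correlation coefficient
res$p.value    # p-value for the null hypothesis of zero correlation
res$conf.int   # confidence interval for the true correlation
```

names(res) lists every component available, which is a handy companion to the Value section of help(cor.test).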

Thank you for your help. I understand now.
