Error in correlation in RStudio, results missing

Create a simple DataFrame

data <- data.frame(
  Turtle = c("Sp1", "Sp2", "Sp3", "Sp4"),
  Lake = c(3, 2, 6, 2),
  River = c(12, 2, 9, 8),
  Igapo = c(1, 3, 11, 1)
)

Display the DataFrame

print(data)
  Turtle Lake River Igapo
1    Sp1    3    12     1
2    Sp2    2     2     3
3    Sp3    6     9    11
4    Sp4    2     8     1

Assuming your DataFrame is called 'data'

species <- data$Turtle
environments <- colnames(data)[2:4] # Select environment columns

Initialize a matrix to store correlations

correlations <- matrix(NA, nrow = length(species), ncol = length(environments))
rownames(correlations) <- species
colnames(correlations) <- environments

Calculate Pearson correlation for each combination

for (i in 1:length(species)) {
  for (j in 1:length(environments)) {
    correlations[i, j] <- cor(data[[environments[j]]], data[[environments[i]]], method = "pearson")
  }
}
Error in cor(data[[environments[j]]], data[[environments[i]]], method = "pearson") :
  supply both 'x' and 'y' or a matrix-like 'x'

Display the correlation matrix

print(correlations)
         Lake      River      Igapo
Sp1 1.0000000 0.38844208 0.92465840
Sp2 0.3884421 1.00000000 0.01669684
Sp3 0.9246584 0.01669684 1.00000000
Sp4        NA         NA         NA

In this part of your code, you are calculating the correlation between environments, that is, between the three columns of your data named Lake, River, and Igapo.

correlations[i, j] <- cor(data[[environments[j]]], data[[environments[i]]], method = "pearson")

However, the outer for loop iterates over the species, that is, over the 4 rows of your data.

for (i in 1:length(species)) {
  for (j in 1:length(environments)) {
    correlations[i, j] <- cor(data[[environments[j]]], data[[environments[i]]], method = "pearson")
  }
}

When i reaches 4, cor() is passed data[[environments[4]]], which does not exist because environments has only three elements, so the call fails with the error above. That is also why the Sp4 row of your matrix is still NA. If you want the correlations between the three environments, let i iterate over 1:length(environments). You should also create the correlations matrix with nrow = length(environments) and use environments, not species, for its row names.
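As a rough sketch of that fix, using the data from your post: the names n_env and correlations_direct are mine, not from your code, and the last line is an alternative I am suggesting rather than something your script already does. It relies on cor() accepting a numeric data frame and returning all pairwise column correlations.

```r
# Same data as in the question
data <- data.frame(
  Turtle = c("Sp1", "Sp2", "Sp3", "Sp4"),
  Lake   = c(3, 2, 6, 2),
  River  = c(12, 2, 9, 8),
  Igapo  = c(1, 3, 11, 1)
)
environments <- colnames(data)[2:4]

# Why the original loop fails: there is no fourth environment
environments[4]                   # NA
is.null(data[[environments[4]]])  # TRUE, so cor() receives no usable 'y'

# Corrected loop: both indices run over the environments
n_env <- length(environments)     # n_env is a name introduced for this sketch
correlations <- matrix(NA_real_, nrow = n_env, ncol = n_env,
                       dimnames = list(environments, environments))
for (i in seq_len(n_env)) {
  for (j in seq_len(n_env)) {
    correlations[i, j] <- cor(data[[environments[i]]],
                              data[[environments[j]]],
                              method = "pearson")
  }
}
print(correlations)

# Alternative without a loop: pass the numeric columns to cor() directly
correlations_direct <- cor(data[environments], method = "pearson")
```

Both versions should give the same 3 x 3 matrix, with Lake, River, and Igapo as row and column names. seq_len() is used instead of 1:length() only because it behaves sensibly when the length is zero.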
