Error in correlation in RStudio, results missing

omatheuscomh · March 1, 2024, 9:37pm

# Create a simple DataFrame

data <- data.frame(
  Turtle = c("Sp1", "Sp2", "Sp3", "Sp4"),
  Lake = c(3, 2, 6, 2),
  River = c(12, 2, 9, 8),
  Igapo = c(1, 3, 11, 1)
)

# Display the DataFrame

print(data)
  Turtle Lake River Igapo
1    Sp1    3    12     1
2    Sp2    2     2     3
3    Sp3    6     9    11
4    Sp4    2     8     1

# Assuming your DataFrame is called 'data'

species <- data$Turtle
environments <- colnames(data)[2:4] # Select environment columns

# Initialize a matrix to store correlations

correlations <- matrix(NA, nrow = length(species), ncol = length(environments))
rownames(correlations) <- species
colnames(correlations) <- environments

# Calculate Pearson correlation for each combination

for (i in 1:length(species)) {
  for (j in 1:length(environments)) {
    correlations[i, j] <- cor(data[[environments[j]]], data[[environments[i]]], method = "pearson")
  }
}
Error in cor(data[[environments[j]]], data[[environments[i]]], method = "pearson") :
  supply both 'x' and 'y' or a matrix-like 'x'

# Display the correlation matrix

print(correlations)
         Lake      River      Igapo
Sp1 1.0000000 0.38844208 0.92465840
Sp2 0.3884421 1.00000000 0.01669684
Sp3 0.9246584 0.01669684 1.00000000
Sp4        NA         NA         NA

FJCC · March 1, 2024, 10:09pm

In this part of your code, you are calculating the correlation between environments, that is, between the three columns of your data named Lake, River, and Igapo.

correlations[i, j] <- cor(data[[environments[j]]], data[[environments[i]]], method = "pearson")

However, the outer for loop iterates over the species, that is, over the 4 rows of your data.

    for (i in 1:length(species)) {
      for (j in 1:length(environments)) {
        correlations[i, j] <- cor(data[[environments[j]]], data[[environments[i]]], method = "pearson")
      }
    }

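When i reaches the value of 4, the cor() function gets the argument data[[environments[4]]]. Since environments has only three elements, environments[4] is NA, that lookup returns NULL, and you get an error. That is also why the Sp4 row of your matrix is still NA: the loop stopped before filling it. If you want to see the correlations between the three environments, set i to iterate over 1:length(environments). You should also define the correlations matrix to have nrow = length(environments) and use environments, not species, for its row names.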
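The changes described above could look roughly like the following. This is a minimal sketch, assuming the goal is the 3 x 3 matrix of Pearson correlations between Lake, River, and Igapo; the names n_env and cor_direct are introduced here only for illustration and are not part of the original post.

```r
# Same data as in the question
data <- data.frame(
  Turtle = c("Sp1", "Sp2", "Sp3", "Sp4"),
  Lake = c(3, 2, 6, 2),
  River = c(12, 2, 9, 8),
  Igapo = c(1, 3, 11, 1)
)
environments <- colnames(data)[2:4]

# Both dimensions of the result are indexed by environment, not by species
n_env <- length(environments)  # illustrative helper name
correlations <- matrix(
  NA_real_,
  nrow = n_env, ncol = n_env,
  dimnames = list(environments, environments)
)

# Both loops now run over the environments, so the index never exceeds 3
for (i in seq_along(environments)) {
  for (j in seq_along(environments)) {
    correlations[i, j] <- cor(
      data[[environments[i]]],
      data[[environments[j]]],
      method = "pearson"
    )
  }
}
print(correlations)

# Alternative without loops: cor() accepts a data frame of numeric columns
# and returns the full correlation matrix in one call
cor_direct <- cor(data[, environments], method = "pearson")
print(cor_direct)
```

Both approaches should give the same matrix. Note that with only four observations per column, these correlations are very imprecise estimates.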
