Hi
I have a custom function that takes given means and SDs and returns an overlap coefficient (taken from here). This works well when I enter the parameters manually, but I have a large series of means and SDs that I will need to evaluate. Is there a way to run the function over a data frame and store the output in a new variable (an 'Overlap' column)? My current attempt returns an error.
The code I currently use to compute the overlap coefficient is below. In this example, I calculate the overlap from the known means and SDs of hypothetical Condition 1 (Mean = 1, SD = 1) and Condition 2 (Mean = 0.8, SD = 1).
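# Integrand: pointwise minimum of the two normal densities
int_f <- function(x, mu1, mu2, sd1, sd2) {
  f1 <- dnorm(x, mean = mu1, sd = sd1)
  f2 <- dnorm(x, mean = mu2, sd = sd2)
  pmin(f1, f2)
}
# Overlap coefficient: area under the minimum of the two densities
Overlap <- function(mu1, mu2, sd1, sd2) {
  integrate(int_f, -Inf, Inf, mu1, mu2, sd1, sd2)$value
}
Overlap(1, .8, 1, 1)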
This correctly returns:
[1] 0.9203441
This is what I want to run on a series of data (N > 100). My dataset has columns titled Mean1, Mean2, Sd1, and Sd2:
Mydata$Overlap <- Overlap(Mydata$Mean1, Mydata$Mean2, Mydata$Sd1, Mydata$Sd2)
However, when I run this with randomly generated data, I get this error:
Error in integrate(int_f, -Inf, Inf, mu1, mu2, sd1, sd2) : evaluation of function gave a result of wrong length
When I try other (simpler) custom code this works and does produce the expected fifth column:
ColSumTest <- function(x, y) { x + y }
Mydata$Sum <- ColSumTest(Mydata$Mean1, Mydata$Mean2)
The same approach does not work with my Overlap function, however.
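A likely reason for the difference is that integrate() passes a whole vector of evaluation points as x. When mu1, mu2, sd1 and sd2 are entire columns, dnorm() recycles them against x, so the integrand returns a vector whose length no longer matches x. One possible workaround, offered as a minimal sketch rather than a definitive fix, is to call Overlap once per row with mapply(). The small Mydata below is a hypothetical stand-in for the real dataset; only the column names follow the ones described above.

```r
# Integrand: pointwise minimum of the two normal densities
int_f <- function(x, mu1, mu2, sd1, sd2) {
  f1 <- dnorm(x, mean = mu1, sd = sd1)
  f2 <- dnorm(x, mean = mu2, sd = sd2)
  pmin(f1, f2)
}

# Overlap for ONE pair of distributions (scalar inputs only)
Overlap <- function(mu1, mu2, sd1, sd2) {
  integrate(int_f, -Inf, Inf, mu1, mu2, sd1, sd2)$value
}

# Hypothetical example data; the values are made up for illustration
Mydata <- data.frame(
  Mean1 = c(1.0, 0.0, 2.0),
  Mean2 = c(0.8, 0.5, 2.0),
  Sd1   = c(1.0, 1.0, 1.5),
  Sd2   = c(1.0, 2.0, 1.5)
)

# mapply() calls Overlap once per row, passing one scalar from each column,
# so integrate() never sees the full-length columns
Mydata$Overlap <- mapply(Overlap,
                         Mydata$Mean1, Mydata$Mean2,
                         Mydata$Sd1, Mydata$Sd2)

print(Mydata)
```

The first row uses the same parameters as the manual call Overlap(1, .8, 1, 1), so its value should agree with that result. Wrapping the function with Vectorize(Overlap) would be another way to get the same row-by-row behaviour.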
Is anyone able to help? (I am only using randomly generated data at the moment, so no data to share).
Any help for this R novice is greatly appreciated!
Many thanks,