Evaluating Functions at Multiple Points

omario · September 25, 2021, 10:56pm

I am working with the R programming language. Suppose I have the following data:

library("dplyr")

df <- data.frame(b = rnorm(100,5,5), d = rnorm(100,2,2),
                 c = rnorm(100,10,10))

a <- c("a", "b", "c", "d", "e")
a <- sample(a, 100, replace=TRUE, prob=c(0.3, 0.2, 0.3, 0.1, 0.1))

a <- as.factor(a)
df$a <- a

> head(df)
           b          d          c a
1  3.1316480  0.5032860  4.7362991 a
2  4.3111450 -0.1142736 -0.5841322 c
3  2.8291346  3.6107839 16.0684492 a
4 14.2142245  4.9893987 -1.8145138 a
5 -6.7381302  0.0416782 -7.7675387 c
6  0.4481874  0.3370716 17.4260801 a

I also have the following function ("my_subset_mean"), which computes the mean of column "c" over the rows that satisfy a given choice of inputs:

my_subset_mean <- function(r1, r2, r3){  
  subset <- df %>% filter(a %in% r1, b > r2, d < r3)
  return(mean(subset$c))
}

my_subset_mean(r1 = c("a", "b"), r2 = 5, r3 = 1 ) 
[1] 5.682513

My Question: I am trying to evaluate the function "my_subset_mean" at random combinations of "r1", "r2" and "r3". For example:

 my_subset_mean(r1 = c("a", "b"), r2 = 5, r3 = 1 ) 
[1] 11.46365

 my_subset_mean(r1 = c("a"), r2 = 2, r3 = 0 ) 
[1] 14.59809

my_subset_mean(r1 = c("a", "b", "c"), r2 = 3.1, r3 = 0 ) 
[1] 11.26508

 # I am not sure how to get this one to work (i.e. ignore "r1" altogether and calculate the mean using only r2 and r3)

 my_subset_mean(r1 = "NA", r2 = 3.1, r3 = 0 ) 
[1] NaN

etc.
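
One possible way to make "r1" optional is sketched below. This is only an illustration under some assumptions: the name "my_subset_mean2", the "data" argument, the seed, and the convention that NULL or NA means "do not filter on a" are all made up for this example, and base R subsetting is used in place of dplyr so the snippet runs on its own.

```r
set.seed(1)  # assumed seed, only so the sketch is reproducible

# Same shape of data as in the post
df <- data.frame(b = rnorm(100, 5, 5), d = rnorm(100, 2, 2),
                 c = rnorm(100, 10, 10))
df$a <- factor(sample(c("a", "b", "c", "d", "e"), 100, replace = TRUE,
                      prob = c(0.3, 0.2, 0.3, 0.1, 0.1)))

# Hypothetical variant: r1 = NULL (or NA) skips the filter on column a
my_subset_mean2 <- function(r1 = NULL, r2, r3, data = df) {
  keep <- data$b > r2 & data$d < r3          # numeric conditions always apply
  if (!is.null(r1) && !all(is.na(r1))) {
    keep <- keep & data$a %in% r1            # factor condition only when r1 is given
  }
  mean(data$c[keep])
}

my_subset_mean2(r1 = c("a", "b"), r2 = 5, r3 = 1)  # filters on a, b and d
my_subset_mean2(r1 = NA, r2 = 3.1, r3 = 0)         # ignores a entirely
```

Note that passing the string "NA" (in quotes) would still be treated as a level name, so the real missing value NA or NULL is needed for the filter to be skipped.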

Is it possible to make a "grid" that contains random values of "r2" and "r3" (e.g. random values of "r2" and "r3" between 0 and 5) along with random subsets of "r1" (e.g. "a", "c, d", "b, a, e", "d"):

> head(my_grid)
           r2          r3   r1
1  3.1316480  0.5032860     a, b
2  4.3111450 -0.1142736     c, d, e
3  2.8291346  3.6107839     a
4 14.2142245  4.9893987     b, e
5 -6.7381302  0.0416782     NA
6  0.4481874  0.3370716     e

And then evaluate "my_subset_mean" at each row of "my_grid"? E.g.

#desired result

 > head(final_answer)
               r2          r3   r1         my_subset_mean
    1  3.1316480  0.5032860     a, b         0.3
    2  4.3111450 -0.1142736     c, d, e      0.1
    3  2.8291346  3.6107839     a            0.55
    4 14.2142245  4.9893987     b, e         0.6
    5 -6.7381302  0.0416782     NA           0.51
    6  0.4481874  0.3370716     e            0.16
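
A grid like "my_grid" above could be generated roughly as follows. This is a minimal sketch, not a definitive recipe: the seed, the number of rows, and the helper names "lvls" and "r1_sets" are assumptions, and each "r1" entry is stored as a comma-separated string to match the layout shown in the post.

```r
set.seed(42)  # assumed seed, only for reproducibility
n_rows <- 6   # assumed grid size
lvls <- c("a", "b", "c", "d", "e")

# One random, non-empty subset of the levels per grid row
r1_sets <- lapply(seq_len(n_rows), function(i) {
  sort(sample(lvls, size = sample(length(lvls), 1)))
})

# r2 and r3 are uniform on [0, 5]; r1 is the subset written as "a, b"
my_grid <- data.frame(
  r2 = runif(n_rows, 0, 5),
  r3 = runif(n_rows, 0, 5),
  r1 = vapply(r1_sets, paste, FUN.VALUE = character(1), collapse = ", "),
  stringsAsFactors = FALSE
)

head(my_grid)
```

Because "r1" is kept as text here, it has to be split back into a character vector (for example with strsplit) before it can be used with %in%.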

If there were no "factor variables" involved, I think I could have done this with an iterative "for loop". But I am not sure how to "feed" the function ("my_subset_mean") using "my_grid". Can someone please show me how to do this?

Thanks!

R_DUMMY · September 26, 2021, 12:24am

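Loop through the rows of the grid. Since each "r1" entry is a single string such as "a, b", split it back into a character vector before passing it to the function:

my_grid$my_subset_mean <- NA_real_

for (i in seq_len(nrow(my_grid))) {
  # "a, b" -> c("a", "b")
  r1 <- trimws(strsplit(as.character(my_grid$r1[i]), ",")[[1]])
  my_grid$my_subset_mean[i] <- my_subset_mean(r1 = r1, r2 = my_grid$r2[i], r3 = my_grid$r3[i])
}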
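
As an alternative to an explicit loop, the same row-by-row evaluation can be sketched with mapply and a list-column for "r1", which avoids any string handling. The seed, the three hand-picked grid rows, and the base R rewrite of "my_subset_mean" (used so the snippet does not need dplyr) are assumptions made for this illustration.

```r
set.seed(1)  # assumed seed, only so the sketch is reproducible

df <- data.frame(b = rnorm(100, 5, 5), d = rnorm(100, 2, 2),
                 c = rnorm(100, 10, 10))
df$a <- factor(sample(c("a", "b", "c", "d", "e"), 100, replace = TRUE,
                      prob = c(0.3, 0.2, 0.3, 0.1, 0.1)))

# Base R version of the function from the question
my_subset_mean <- function(r1, r2, r3) {
  mean(df$c[df$a %in% r1 & df$b > r2 & df$d < r3])
}

# Small hand-made grid; r1 is a list-column holding one character vector per row
my_grid <- data.frame(r2 = c(5, 2, 3.1), r3 = c(1, 0, 0))
my_grid$r1 <- list(c("a", "b"), "a", c("a", "b", "c"))

# mapply walks the three columns in parallel, one grid row per call
my_grid$my_subset_mean <- mapply(my_subset_mean,
                                 r1 = my_grid$r1,
                                 r2 = my_grid$r2,
                                 r3 = my_grid$r3)

my_grid
```

A row whose filters match no observations yields NaN, the same as calling the function directly with those inputs.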
