I am working with the R programming language.
I have the following data:
set.seed(123)
var1 = sample(0:1, 10000, replace=T)
var2 = sample(0:1, 10000, replace=T)
var3 = sample(0:1, 10000, replace=T)
var4 = sample(0:1, 10000, replace=T)
score = rnorm(10000,10,5)
my_data = data.frame(var1,var2, var3,var4, score)
The following command counts how many rows fall into each unique combination of var1 to var4:
# https://stackoverflow.com/questions/34312324/r-count-all-combinations
> library(data.table)
> dt = my_data[,c(1,2,3,4)]
> setDT(dt)[,list(Count=.N) ,names(dt)]
    var1 var2 var3 var4 Count
 1:    0    0    0    0   667
 2:    0    1    0    0   601
 3:    1    1    1    1   651
 4:    0    1    1    1   608
 5:    1    0    1    1   613
 6:    1    1    0    1   588
 7:    0    1    1    0   607
 8:    0    0    1    1   607
 9:    1    0    1    0   625
10:    0    1    0    1   661
11:    1    1    1    0   635
12:    0    0    1    0   640
13:    1    1    0    0   608
14:    1    0    0    0   607
15:    0    0    0    1   626
16:    1    0    0    1   656
I want to compare the average value of "score" when a given variable is "present" (1) against the average when that same variable is "absent" (0), with the other variables held at the same values. For example:
- Contribution for Var4 : Average score for (var1 = 1, var2= 1, var3 = 1, var4 = 1) - Average score for (var1 = 1, var2= 1, var3 = 1, var4 = 0)
- Contribution for Var2 : Average score for (var1 = 1, var2= 1, var3 = 1, var4 = 1) - Average score for (var1 = 1, var2= 0, var3 = 1, var4 = 1)
- etc.
I found a very "clumsy" way to do this:
var1_present <- my_data[which(my_data$var1 == 1 & my_data$var2 == 1 & my_data$var3 == 1 & my_data$var4 == 1 ), ]
var1_present_score = mean(var1_present$score)
var1_absent <- my_data[which(my_data$var1 == 0 & my_data$var2 == 1 & my_data$var3 == 1 & my_data$var4 == 1 ), ]
var1_absent_score = mean(var1_absent$score)
var_1_contribution = var1_present_score - var1_absent_score
var_1_contribution
[1] 0.1288283
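The same steps can probably be wrapped in a small helper. Here is a minimal sketch in base R. The function name `contribution` and its arguments `target`, `fixed` and `outcome` are names I made up for illustration, and it assumes every variable is coded 0/1 and that both groups being compared are non-empty:

```r
set.seed(123)
var1 = sample(0:1, 10000, replace = TRUE)
var2 = sample(0:1, 10000, replace = TRUE)
var3 = sample(0:1, 10000, replace = TRUE)
var4 = sample(0:1, 10000, replace = TRUE)
score = rnorm(10000, 10, 5)
my_data = data.frame(var1, var2, var3, var4, score)

# Hypothetical helper: mean(outcome | target == 1) - mean(outcome | target == 0),
# restricted to rows where the other variables equal the values in `fixed`.
# `fixed` is a named vector, e.g. c(var2 = 1, var3 = 1, var4 = 1).
contribution <- function(data, target, fixed, outcome = "score") {
  keep <- rep(TRUE, nrow(data))
  for (v in names(fixed)) {
    keep <- keep & data[[v]] == fixed[[v]]
  }
  present <- data[[outcome]][keep & data[[target]] == 1]
  absent  <- data[[outcome]][keep & data[[target]] == 0]
  mean(present) - mean(absent)
}

# Same comparison as the manual code: var1 present vs. absent,
# with var2, var3 and var4 all equal to 1.
result <- contribution(my_data, "var1", c(var2 = 1, var3 = 1, var4 = 1))
result
```

Changing `target` and `fixed` gives the other comparisons, for example `contribution(my_data, "var4", c(var1 = 1, var2 = 0, var3 = 0))` for row 16 vs. row 14.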
Is there some way to write a function that can look at the "contribution" of different variables to the "score"? I understand that even for 4 variables there are many different combinations to compare, e.g. row 14 vs. row 16: (1,0,0,0) vs. (1,0,0,1). But even if only for some of these "contributions", is it possible to write a function that evaluates the "contribution" of a variable being absent/present?
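To make concrete what I mean by covering many combinations at once, here is a rough sketch that tabulates the contribution of one variable for every combination of the remaining variables, using base R `aggregate()` and `merge()`. The function name `contribution_table` and its arguments are hypothetical, and it assumes that all columns other than the outcome are 0/1 variables and that every combination occurs in the data:

```r
set.seed(123)
var1 = sample(0:1, 10000, replace = TRUE)
var2 = sample(0:1, 10000, replace = TRUE)
var3 = sample(0:1, 10000, replace = TRUE)
var4 = sample(0:1, 10000, replace = TRUE)
score = rnorm(10000, 10, 5)
my_data = data.frame(var1, var2, var3, var4, score)

# Hypothetical helper: for each combination of the other variables,
# mean outcome with target == 1 minus mean outcome with target == 0.
contribution_table <- function(data, target, outcome = "score") {
  others <- setdiff(names(data), c(target, outcome))
  # Mean outcome per unique combination; aggregate() names the value column "x".
  means <- aggregate(data[[outcome]], by = data[c(others, target)], FUN = mean)
  present <- means[means[[target]] == 1, c(others, "x")]
  absent  <- means[means[[target]] == 0, c(others, "x")]
  # Pair each "present" mean with the "absent" mean for the same other values.
  out <- merge(present, absent, by = others,
               suffixes = c("_present", "_absent"))
  out$contribution <- out$x_present - out$x_absent
  out
}

# One row per combination of var2, var3 and var4 (8 rows here).
tab <- contribution_table(my_data, "var1")
tab
```

With four binary variables this gives 8 rows per target variable, each row being one present vs. absent comparison.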
Can someone please show me how to do this?
Thanks!