I want to run multiple regression models in parallel. The approach I am trying to implement is as follows:
Say we have a dataset with one dependent variable (DV), y, and four independent variables (IVs): x1, x2, x3 and x4. I want to run all possible regression models:
y with x1
y with x2
y with x3
y with x4
y with x1, x2
...
y with x1, x2, x3, x4
This gives (2^4) - 1 = 15 models. I want a matrix of indicator variables with one row per model, so that for each row I can run the corresponding regression using the regular lm function.
x1  x2  x3  x4
1   0   0   0
0   1   0   0
0   0   1   0
0   0   0   1
1   1   0   0
1   0   1   0
1   0   0   1
0   1   1   0
0   1   0   1
0   0   1   1
1   1   1   0
1   1   0   1
1   0   1   1
0   1   1   1
1   1   1   1
Is this possible?
If not, is there any other way to do this?
Any kind of guidance will really be helpful.
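You could create such a matrix with expand.grid(0:1, 0:1, 0:1, 0:1) or, to make it easier with larger numbers of variables, expand.grid(replicate(4, 0:1, simplify=FALSE)). Note that this yields 16 rows, because it also includes the all-zero row (the intercept-only model). But what about creating the regression formulas directly instead?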
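A minimal sketch of the formula-based idea, since the code that belonged here is not shown. The variable names come from the question. The data frame dat is simulated purely so the example runs, and the use of combn and reformulate is one possible way to build the formulas, not necessarily the one originally intended.

```r
# Predictor names as given in the question
ivs <- c("x1", "x2", "x3", "x4")

# All non-empty subsets of the predictors, for subset sizes 1 to 4
subsets <- unlist(
  lapply(seq_along(ivs), function(k) combn(ivs, k, simplify = FALSE)),
  recursive = FALSE
)

# One formula per subset, e.g. y ~ x1 + x3
formulas <- lapply(subsets, function(s) reformulate(s, response = "y"))

# Simulated data (an assumption, only to make the sketch self-contained)
set.seed(1)
dat <- data.frame(x1 = rnorm(50), x2 = rnorm(50), x3 = rnorm(50), x4 = rnorm(50))
dat$y <- 1 + 2 * dat$x1 - dat$x3 + rnorm(50)

# Fit one lm per formula and label each fit by its predictors
models <- lapply(formulas, lm, data = dat)
names(models) <- vapply(subsets, paste, character(1), collapse = " + ")
```

Individual fits can then be inspected with, for example, summary(models[["x1 + x2"]]). If the fits are expensive, the final lapply could probably be swapped for parallel::mclapply on Unix-like systems.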
Since you specifically asked for a way to create such a matrix, I'd like to add a more concise alternative to the expand.grid(replicate(4, 0:1, simplify=FALSE)) suggested in the other answer: expand.grid(rep(list(0:1), 4)).
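For the matrix route, a rough sketch of how the indicator rows could drive lm is given here. The data frame dat is simulated and is only an assumption for illustration. The all-zero row that expand.grid produces is dropped, which leaves the 15 models from the question.

```r
# Predictor names as given in the question
ivs <- c("x1", "x2", "x3", "x4")

# Indicator matrix: one row per model, one column per predictor
grid <- expand.grid(rep(list(0:1), length(ivs)))
names(grid) <- ivs
grid <- grid[rowSums(grid) > 0, ]  # drop the all-zero (intercept-only) row

# Simulated data (an assumption, only to make the sketch self-contained)
set.seed(1)
dat <- data.frame(x1 = rnorm(50), x2 = rnorm(50), x3 = rnorm(50), x4 = rnorm(50))
dat$y <- 1 + 2 * dat$x1 - dat$x3 + rnorm(50)

# For each row, keep the predictors flagged with 1 and fit the model
fits <- lapply(seq_len(nrow(grid)), function(i) {
  keep <- ivs[unlist(grid[i, ]) == 1]
  lm(reformulate(keep, response = "y"), data = dat)
})
```

lapply over row indices is used here rather than apply, because apply tries to simplify its results into an array and would not return the lm objects intact.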