I have a data frame that represents family relationships:
df <- data.frame(
  pid    = c(101, 102, 103, 104, 105, 106, 107),
  pid_f  = c(-8, -8, -8, -8, 102, -8, 106),
  pid_m  = c(-8, 101, 101, -8, 104, -8, 103),
  pid_s  = c(-8, 104, 106, 102, -8, 103, -8),
  pid_c1 = c(103, 105, 107, 105, -8, 107, -8),
  pid_c2 = c(102, -8, -8, -8, -8, -8, -8)
)
df
#>   pid pid_f pid_m pid_s pid_c1 pid_c2
#> 1 101    -8    -8    -8    103    102
#> 2 102    -8   101   104    105     -8
#> 3 103    -8   101   106    107     -8
#> 4 104    -8    -8   102    105     -8
#> 5 105   102   104    -8     -8     -8
#> 6 106    -8    -8   103    107     -8
#> 7 107   106   103    -8     -8     -8
colname | label |
---|---|
pid | person id |
pid_f | father id |
pid_m | mother id |
pid_s | spouse id |
pid_c1 | first child id |
pid_c2 | second child id |

A value of "-8" means that the person in that column does not exist.
- The first row reads as: person 101 has two children, 103 and 102.
- The second row reads as: person 102 has mother 101, spouse 104, and one child, 105.
- ... and so on
From the relationship table above, we can build a family tree (to make it easier to understand)
and then assign each person a corresponding hierarchy level, like this:
pid | pid_f | pid_m | pid_s | pid_c1 | pid_c2 | Hierarchy |
---|---|---|---|---|---|---|
101 | -8 | -8 | -8 | 103 | 102 | 1 |
102 | -8 | 101 | 104 | 105 | -8 | 2 |
103 | -8 | 101 | 106 | 107 | -8 | 2 |
104 | -8 | -8 | 102 | 105 | -8 | 2 |
105 | 102 | 104 | -8 | -8 | -8 | 3 |
106 | -8 | -8 | 103 | 107 | -8 | 2 |
107 | 106 | 103 | -8 | -8 | -8 | 3 |
My question is: how can I compute the Hierarchy variable from df
with a function inside mutate(), something like this?
# some_function is a placeholder for the function I am looking for
df %>% mutate(Hierarchy = some_function(...))
Could you please give me some help and advice?
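
One possible approach, offered as a sketch rather than a definitive answer: treat the hierarchy as a fixed point. It assumes a rule that is only implied by the expected table, namely that a child sits one level below its deepest parent, that spouses share a level, and that everyone else is at level 1. The helper name `get_hierarchy` and its `missing` argument are hypothetical and not part of dplyr or any other package.

```r
library(dplyr)

df <- data.frame(
  pid    = c(101, 102, 103, 104, 105, 106, 107),
  pid_f  = c(-8, -8, -8, -8, 102, -8, 106),
  pid_m  = c(-8, 101, 101, -8, 104, -8, 103),
  pid_s  = c(-8, 104, 106, 102, -8, 103, -8),
  pid_c1 = c(103, 105, 107, 105, -8, 107, -8),
  pid_c2 = c(102, -8, -8, -8, -8, -8, -8)
)

# Hypothetical helper: returns one hierarchy level per person.
# Assumed rule: child = deepest parent + 1, spouses share a level,
# everyone starts at level 1.
get_hierarchy <- function(pid, pid_f, pid_m, pid_s, missing = -8) {
  # Current level of each referenced person; NA where the id is `missing`
  level_of <- function(ids, h) {
    ids[ids == missing] <- NA
    h[match(ids, pid)]
  }

  h <- rep(1, length(pid))

  # Raise levels until a pass changes nothing; at most one pass per row
  for (i in seq_along(pid)) {
    new_h <- pmax(
      h,
      level_of(pid_f, h) + 1,  # one level below the father
      level_of(pid_m, h) + 1,  # one level below the mother
      level_of(pid_s, h),      # same level as the spouse
      na.rm = TRUE
    )
    if (identical(new_h, h)) break
    h <- new_h
  }
  h
}

result <- df %>%
  mutate(Hierarchy = get_hierarchy(pid, pid_f, pid_m, pid_s))

result$Hierarchy
#> [1] 1 2 2 2 3 2 3
```

The sketch ignores pid_c1 and pid_c2 because, in this example, they only mirror the father and mother columns. The cap of one pass per row is a guard against inconsistent data, such as a person recorded as their own ancestor, where the levels would otherwise keep growing.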