Calculating years between occurence of 1 in a column by group

I have the following dataframe:

country year conflict
Iraq 1985 0
Iraq 1989 0
Iraq 1990 0
Iraq 1991 1
Iraq 1992 0
Iraq 1993 0
Iraq 1994 0
Iraq 1995 1
Iraq 1996 0
Iraq 1997 0
Iraq 1998 0
Iraq 1999 0
Iraq 2000 0
Iraq 2001 0
Iraq 2002 0
Iraq 2010 0
Iraq 2011 0
Iraq 2012 0
Iraq 2013 0
Iraq 2014 0

but need the following dataframe:

Nr. Country Year Conflict Peace
1 Iraq 1985 0 0
2 Iraq 1989 0 1
3 Iraq 1990 0 2
4 Iraq 1991 1 3
5 Iraq 1992 0 0
6 Iraq 1993 0 1
7 Iraq 1994 0 2
8 Iraq 1995 1 3
9 Iraq 1996 0 0
10 Iraq 1997 0 1
11 Iraq 1998 0 2
12 Iraq 1999 0 3
13 Iraq 2000 0 4
14 Iraq 2001 0 5
15 Iraq 2002 0 6
16 Iraq 2010 0 14
17 Iraq 2011 0 15
18 Iraq 2012 0 16
19 Iraq 2013 0 17
20 Iraq 2014 0 18

peace indicates the years between two conflicts (more precisely from after a conflict year to the conflict year). This is because peace assumes the value 0 after a conflict year.

It is important to note that the missing observation years are also counted.

The following code leads to the following result

d_muslim <- d_muslim %>%
mutate(helper = cumsum(lag(conflict, default = 1) == 1), .by = country) %>%
mutate(peace = year - first(year), .by = c(country, helper)) %>%
select(-helper)

head(subset(d_muslim, country == "Iraq", select = c(country, year, conflict, peace)), 20)

Nr. Country Year Conflict Peace
1 Iraq 1985 0 0
2 Iraq 1989 0 4
3 Iraq 1990 0 5
4 Iraq 1991 1 0
5 Iraq 1992 0 1
6 Iraq 1993 0 2
7 Iraq 1994 0 3
8 Iraq 1995 1 0
9 Iraq 1996 0 1
10 Iraq 1997 0 2
11 Iraq 1998 0 3
12 Iraq 1999 0 4
13 Iraq 2000 0 5
14 Iraq 2001 0 6
15 Iraq 2002 0 7
16 Iraq 2010 0 15
17 Iraq 2011 0 16
18 Iraq 2012 0 17
19 Iraq 2013 0 18
20 Iraq 2014 0 19

What am I doing wrong?

You could use the sbtscs() function in the stevemisc package to do this. If you want just what you indicate in the second table, you can do it without modifying the original data:

library(stevemisc)
library(dplyr)
dat <- tibble::tribble(
  ~country, ~year,  ~conflict,
"Iraq", 1985,   0,
"Iraq", 1989,   0,
"Iraq", 1990,   0,
"Iraq", 1991,   1,
"Iraq", 1992,   0,
"Iraq", 1993,   0,
"Iraq", 1994,   0,
"Iraq", 1995,   1,
"Iraq", 1996,   0,
"Iraq", 1997,   0,
"Iraq", 1998,   0,
"Iraq", 1999,   0,
"Iraq", 2000,   0,
"Iraq", 2001,   0,
"Iraq", 2002,   0,
"Iraq", 2010,   0,
"Iraq", 2011,   0,
"Iraq", 2012,   0,
"Iraq", 2013,   0,
"Iraq", 2014,   0)
sbtscs(dat, conflict, year, country)
#> # A tibble: 20 × 4
#>    country  year conflict spell
#>    <chr>   <dbl>    <dbl> <dbl>
#>  1 Iraq     1985        0     0
#>  2 Iraq     1989        0     1
#>  3 Iraq     1990        0     2
#>  4 Iraq     1991        1     3
#>  5 Iraq     1992        0     0
#>  6 Iraq     1993        0     1
#>  7 Iraq     1994        0     2
#>  8 Iraq     1995        1     3
#>  9 Iraq     1996        0     0
#> 10 Iraq     1997        0     1
#> 11 Iraq     1998        0     2
#> 12 Iraq     1999        0     3
#> 13 Iraq     2000        0     4
#> 14 Iraq     2001        0     5
#> 15 Iraq     2002        0     6
#> 16 Iraq     2010        0     7
#> 17 Iraq     2011        0     8
#> 18 Iraq     2012        0     9
#> 19 Iraq     2013        0    10
#> 20 Iraq     2014        0    11

If you want the years that are not displayed (e.g., 1986-1988) to be counted, you could pad out the data set, then run the function.

dat_pad <- dat %>% 
  group_by(country) %>% 
  reframe(year = seq(min(year), max(year), by=1)) %>% 
  left_join(dat) %>% 
  mutate(conflict0 = ifelse(is.na(conflict), 0, conflict)) 
#> Joining with `by = join_by(country, year)`

sbtscs(dat_pad, conflict0, year, country) %>% 
  filter(!is.na(conflict))
#> # A tibble: 20 × 5
#>    country  year conflict conflict0 spell
#>    <chr>   <dbl>    <dbl>     <dbl> <dbl>
#>  1 Iraq     1985        0         0     0
#>  2 Iraq     1989        0         0     4
#>  3 Iraq     1990        0         0     5
#>  4 Iraq     1991        1         1     6
#>  5 Iraq     1992        0         0     0
#>  6 Iraq     1993        0         0     1
#>  7 Iraq     1994        0         0     2
#>  8 Iraq     1995        1         1     3
#>  9 Iraq     1996        0         0     0
#> 10 Iraq     1997        0         0     1
#> 11 Iraq     1998        0         0     2
#> 12 Iraq     1999        0         0     3
#> 13 Iraq     2000        0         0     4
#> 14 Iraq     2001        0         0     5
#> 15 Iraq     2002        0         0     6
#> 16 Iraq     2010        0         0    14
#> 17 Iraq     2011        0         0    15
#> 18 Iraq     2012        0         0    16
#> 19 Iraq     2013        0         0    17
#> 20 Iraq     2014        0         0    18

Created on 2024-02-24 with reprex v2.0.2

1 Like

This topic was automatically closed 42 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.