Recoding a variable across waves of a longitudinal dataset

I realised recently that a longitudinal variable in my dataset (whether people stated they were on furlough after being asked why their working hours were reduced from the previous wave of the study) was coded incorrectly. Right now, the variable is coded as “1” if a respondent reported furlough as the reason for working fewer hours than in the last survey wave, and “0” otherwise, including when a respondent was on furlough but their working hours did not change from the last survey wave. Therefore, I want to recode this variable so that after the first report of a furlough-related decrease in working hours (“1”), the rest of the data (i.e. the subsequent waves) would also be coded “1”. This may be a simple change to execute in R, but I spent a few hours this morning trying if-else statements and dplyr with no success.

TL;DR: I would like to recode a variable so that if the variable equals 1 for a specified wave of my longitudinal dataset, it also equals 1 for the rest of the waves of the dataset.

Can I please ask for any suggestions you have for resolving this? Thank you so much!

Hi,

If I understand it correctly, what you would like to do is set all values in a vector to "1" after the first "1" appears. For example, 0,0,1,0,1 becomes 0,0,1,1,1.

If that is the case, here are a few ways of doing that:

# Base R implementation
myData = data.frame(
  id = 1:10,
  val = c(0,0,0,0,1,0,1,1,0,1)
)

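# Position of the first 1 (NA if the vector contains no 1)
pos = which(myData$val == 1)[1]
if(!is.na(pos)){
  # Set everything from the first 1 onwards to 1
  myData$val[pos:nrow(myData)] = 1
}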

myData
#>    id val
#> 1   1   0
#> 2   2   0
#> 3   3   0
#> 4   4   0
#> 5   5   1
#> 6   6   1
#> 7   7   1
#> 8   8   1
#> 9   9   1
#> 10 10   1

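# Alternative solution: a single logical index
# (min() falls back to nrow(myData) when there is no 1, so nothing changes)
myData$val[
  1:nrow(myData) > 
    min(which(myData$val == 1)[1], nrow(myData), na.rm = TRUE)] = 1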

Created on 2021-11-20 by the reprex package (v2.0.1)
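
As a further option that is not part of the reprex above: for a vector that only contains 0 and 1, base R's cummax() should give the same result, because the running maximum switches to 1 at the first 1 and stays there. A minimal sketch, assuming val has no NA values:

```r
# Sketch: running maximum of a 0/1 vector (assumes val has no NA values)
myData = data.frame(
  id = 1:10,
  val = c(0,0,0,0,1,0,1,1,0,1)
)

# cummax() returns, at each position, the largest value seen so far
myData$val = cummax(myData$val)

myData$val
#>  [1] 0 0 0 0 1 1 1 1 1 1
```

If val can contain NA, cummax() returns NA from the first NA onwards, so missing values would need to be handled before using this shortcut.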

# Tidyverse implementation
library(dplyr)

myData = data.frame(
  id = 1:10,
  val = c(0,0,0,0,1,0,1,1,0,1)
)

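# Set every row after the first 1 to 1;
# min() falls back to n() when there is no 1, so nothing changes
myData = myData %>% 
  mutate(val = ifelse(
    row_number() > min(which(val == 1)[1], n(), na.rm = TRUE), 
    1, val))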
myData
#>    id val
#> 1   1   0
#> 2   2   0
#> 3   3   0
#> 4   4   0
#> 5   5   1
#> 6   6   1
#> 7   7   1
#> 8   8   1
#> 9   9   1
#> 10 10   1


Both approaches use the which() function to find the positions that are "1", take the first of those positions, and set everything from that position onwards to 1. What makes the code a little more complicated is handling the scenario where everything is "0", in which case nothing has to be updated.
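
Since the question is about a longitudinal dataset with several respondents, the same logic would presumably have to run separately for each respondent rather than over the whole column. A minimal sketch of how that might look with group_by(), where the data frame longData and the column names pid, wave and furlough are hypothetical, since the thread does not show the actual dataset:

```r
library(dplyr)

# Hypothetical long-format data: one row per respondent (pid) and wave
longData = data.frame(
  pid = c(1, 1, 1, 1, 2, 2, 2, 2),
  wave = c(1, 2, 3, 4, 1, 2, 3, 4),
  furlough = c(0, 1, 0, 0, 0, 0, 0, 0)
)

longData = longData %>%
  arrange(pid, wave) %>%   # make sure waves are in order within each respondent
  group_by(pid) %>%        # apply the recode separately per respondent
  mutate(furlough = ifelse(
    row_number() > min(which(furlough == 1)[1], n(), na.rm = TRUE),
    1, furlough)) %>%
  ungroup()

longData$furlough
#> [1] 0 1 1 1 0 0 0 0
```

Within a grouped mutate(), row_number() and n() are evaluated per group, so respondent 2, who never reports furlough, keeps all zeros.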

Hi Pieter,

Thank you so much for your detailed response! After tweaking the tidyverse implementation to fit my use case, I was able to achieve my intended transformation.

Kind regards, and have a lovely day,
Yaning
