Help! Trying to assign a value based on another string column

enrique.gomez · October 11, 2021, 12:52pm

Dear all,

I need help because I have been unable to resolve this problem.
I am working with a clinical database in which several subjects were observed longitudinally, so I have several observations for the same ID. The patients were randomized to one of two treatment groups (1 or 2 in the data), but the group was only recorded in one of the four possible observations. What I want is to copy the assigned treatment group to all of the observations of the same ID, based on the one observation where it is recorded (so that every observation of every unique ID carries its assigned group).

The data frame looks like this:

df <- data.frame(
  subject = c("subject1_study1", "subject1_study1", "subject1_study1", "subject2_study1", "subject2_study1", "subject2_study1", "subject3_study1", "subject3_study1"),
  event = c("V1", "V2", "V3", "V1", "V2", "V3", "V1", "V2"),
  treatment_group = c(1, NA, NA, NA, 2, NA, NA, 1)
)

Thanks for your help!

Equation · October 11, 2021, 1:10pm

I am not entirely sure what you want to achieve. Below is my guess, but please feel free to provide additional information as to what you are expecting the end-result to look like.

(Note that I split the subject variable into a subject and a study variable.)

library(tidyverse)

clinical <- tibble(
  subject = c("subject1_study1" , "subject1_study1",  "subject1_study1", "subject2_study1", "subject2_study1", "subject2_study1", "subject3_study1", "subject3_study1"), 
  event = c("V1", "V2", "V3", "V1", "V2", "V3", "V1", "V2"), 
  treatment_group = c(1, NA, NA, NA, 2, NA, NA, 1)
  )

clinical %>% 
  separate(subject, into = c("subject", "study"), extra = "drop", remove = FALSE) %>% 
  fill(treatment_group)
#> # A tibble: 8 × 4
#>   event subject  study  treatment_group
#>   <chr> <chr>    <chr>            <dbl>
#> 1 V1    subject1 study1               1
#> 2 V2    subject1 study1               1
#> 3 V3    subject1 study1               1
#> 4 V1    subject2 study1               1
#> 5 V2    subject2 study1               2
#> 6 V3    subject2 study1               2
#> 7 V1    subject3 study1               2
#> 8 V2    subject3 study1               1

Created on 2021-10-11 by the reprex package (v2.0.1)

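This simply fills each missing value of the treatment_group variable with the preceding non-missing value, regardless of which subject the row belongs to (note rows 4 and 7, which inherit the previous subject's group). This might not be what you are after; if not, please provide some further context so that we can help you out.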
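If the goal is instead to copy each subject's single recorded group to all of that subject's rows, one possible approach is to group by subject before filling and to fill in both directions. The following is only a sketch under assumptions: it presumes that each subject has exactly one distinct non-missing treatment_group, and the name `filled` is just an illustrative choice.

```r
library(dplyr)
library(tidyr)

clinical <- tibble(
  subject = c("subject1_study1", "subject1_study1", "subject1_study1",
              "subject2_study1", "subject2_study1", "subject2_study1",
              "subject3_study1", "subject3_study1"),
  event = c("V1", "V2", "V3", "V1", "V2", "V3", "V1", "V2"),
  treatment_group = c(1, NA, NA, NA, 2, NA, NA, 1)
)

# Fill within each subject only, so values never leak across subjects.
# "downup" fills downwards first, then upwards, which covers cases where
# the recorded value is not on the subject's first row.
# Assumption: one distinct non-missing group per subject.
filled <- clinical %>%
  group_by(subject) %>%
  fill(treatment_group, .direction = "downup") %>%
  ungroup()

filled
```

With the example data, every row of subject1 and subject3 should end up with group 1 and every row of subject2 with group 2. If a subject could have conflicting recorded groups, this approach would silently mix them, so it may be worth checking for that first.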

enrique.gomez · October 11, 2021, 1:20pm

Thanks!
I am sorry for the misunderstanding. I am trying to fill the treatment_group column so that every row carries the value assigned to its unique ID.
So that this:
[Screenshot 2021-10-11 at 15.16.25]

looks like this:

I am sorry that I cannot explain it properly in any other way.
