Hello,
My data looks like this:
patient_id | dia_ppal | dia_02 | dia_03 | poad_02 | poad_03 |
---|---|---|---|---|---|
1 | A41.9 | R65.20 | J18.9 | S | S |
2 | J12.89 | B97.29 | Z87.891 | S | E |
3 | J18.9 | R68.89 | E66.3 | S | S |
4 | J12.89 | B97.29 | I10 | S | S |
5 | J12.89 | B97.29 | J96.00 | S | S |
6 | J12.89 | B97.29 | I10 | S | S |
7 | J98.8 | B97.29 | D69.6 | S | S |
8 | J12.89 | B97.29 | E11.65 | S | N |
9 | J12.89 | B97.29 | R45.1 | S | N |
10 | J12.89 | B97.29 | E11.65 | S | N |
11 | J98.8 | B97.29 | J96.00 | S | S |
Principal diagnosis = dia_ppal
Diagnosis 2 = dia_02
Present on admission = poad (S = yes, N = no)
I need to create a variable that flags whether a diagnosis was present at baseline, for example diabetes at baseline (yes/no). My plan was:
1) when there is a diabetes code in dia_ppal, assign 1; the rest are 0 or missing, as appropriate
2) create the variable dm_ppal: when the flag from step 1 is 1 and poad_ppal = S, assign 1
3) repeat steps 1 and 2 for dia_02/poad_02, dia_03/poad_03, dia_04/poad_04, etc.
4) create diabetes_baseline: when there is a 1 in dm_ppal, or in dm_02, etc., assign 1; the rest are 0 or missing, as appropriate
I did this:
dm_01<-with(dx,ifelse(dia_ppal=="E08",1,ifelse(dia_ppal=="E09",1,ifelse(dia_ppal=="E10",1,ifelse(dia_ppal=="E11",1,ifelse(dia_ppal=="E12",1,ifelse(dia_ppal=="E13",1,0)))))))
dm_02<-with(dx,ifelse(dia_02=="E08",1,ifelse(dia_02=="E09",1,ifelse(dia_02=="E10",1,ifelse(dia_02=="E11",1,ifelse(dia_02=="E12",1,ifelse(dia_02=="E13",1,0)))))))
dm_03<-with(dx,ifelse(dia_03=="E08",1,ifelse(dia_03=="E09",1,ifelse(dia_03=="E10",1,ifelse(dia_03=="E11",1,ifelse(dia_03=="E12",1,ifelse(dia_03=="E13",1,0)))))))
dm_basal<-with(dx,ifelse(dm_01==1 & poad_ppal=="S",1,
ifelse(dm_02==1 & poad_02=="S",1,
ifelse(dm_03==1 & poad_03=="S",1,0))))
However, there are 2 problems:
- the diabetes codes only start with "E08", "E09", etc. That means there could be diabetes codes such as E081 (or E11.65, as in the table), and this syntax does not catch them because it tests for exact equality. I know I have to use grep, but I don't know how.
- the number of diagnosis columns is actually 20, so I'm sure there is a way to avoid repeating everything.
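A minimal sketch of how both problems might be approached, offered as one possible direction rather than a definitive solution. It assumes that `grepl()` with an anchored pattern is an acceptable way to test the prefix, that the diagnosis and present-on-admission columns can be listed in matching order, and that a missing code or flag should count as 0. The toy `dx` below, including its `poad_ppal` column and the hypothetical names `dia_cols`, `poad_cols` and `dm_pattern`, is made up for illustration.

```r
# Toy data shaped like the table above; poad_ppal is an assumed column.
dx <- data.frame(
  patient_id = 1:4,
  dia_ppal   = c("A41.9",  "E11.65", "J12.89", "J12.89"),
  dia_02     = c("R65.20", "B97.29", "B97.29", "B97.29"),
  dia_03     = c("J18.9",  "I10",    "E11.65", "E11.65"),
  poad_ppal  = c("S", "S", "S", "S"),
  poad_02    = c("S", "S", "S", "S"),
  poad_03    = c("S", "S", "S", "N"),
  stringsAsFactors = FALSE
)

# List the column pairs once, in the same order; extend both to all 20.
dia_cols  <- c("dia_ppal",  "dia_02",  "dia_03")
poad_cols <- c("poad_ppal", "poad_02", "poad_03")

# "^" anchors the match at the start of the code, so this matches any code
# beginning with E08 ... E13, e.g. "E11", "E11.65" or "E081".
dm_pattern <- "^E(08|09|10|11|12|13)"

# Logical matrix (rows = patients, columns = diagnosis slots):
# TRUE where the code in that slot starts with a diabetes prefix.
is_dm <- sapply(dx[dia_cols], function(x) grepl(dm_pattern, x))

# Logical matrix: TRUE where that slot was flagged as present on admission.
is_poa <- sapply(dx[poad_cols], function(x) !is.na(x) & x == "S")

# 1 if any slot holds a diabetes code that was present on admission, else 0.
dx$dm_basal <- as.integer(rowSums(is_dm & is_poa) > 0)
```

With this layout, covering another condition would only mean changing `dm_pattern`, and covering all 20 diagnoses would only mean extending the two column-name vectors. Note that `sapply()` returns a plain vector instead of a matrix when the data frame has a single row, so this sketch assumes at least two rows.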
Can somebody help me?
P.S.: I forgot to include poad_ppal in the table above, but that is the general idea of the data set.
Thank you!!