Could you help me with the following question? I want to create a new variable, Y, that relates a date and code I choose in the df1 dataset to the SPV dataset. I'll give an example to make it easier to understand.
Note that in my df1 dataset, the row for 07/07/2021, Code CDE, has values up to DR08. Columns DR09 through DR014 are 0. Therefore, I would like Y to be the SPV row for that same day and code, but without the corresponding SPV columns, which in this case are DR09_DR09_PV through DR014_DR014_PV. So, the Y output table for that day and code would look like this:
If I choose day 09/07/2021, Code CDE, the columns DR013_DR013_PV and DR014_DR014_PV would be excluded from my SPV dataset, because DR013 and DR014 are equal to 0 in the df1 dataset for that day and code. Therefore, the output table for this day and code would look like this:
So, my Y variable will depend on the day and code I choose.
library(dplyr)
library(tidyverse)
library(lubridate)
df1 <- structure(
  list(date1 = c("2021-06-28","2021-06-28","2021-06-28","2021-06-28","2021-06-28",
                 "2021-06-28","2021-06-28","2021-06-28"),
       date2 = c("2021-06-30","2021-06-30","2021-07-02","2021-07-07","2021-07-07","2021-07-09","2021-07-09","2021-07-09"),
       Code = c("FDE","ABC","ABC","ABC","CDE","FGE","ABC","CDE"),
       Week = c("Wednesday","Wednesday","Friday","Wednesday","Wednesday","Friday","Friday","Friday"),
       DR1 = c(4,1,4,3,3,4,3,5),
       DR01 = c(4,1,4,3,3,4,3,6), DR02 = c(4,2,6,7,3,2,7,4), DR03 = c(9,5,4,3,3,2,1,5),
       DR04 = c(5,4,3,3,6,2,1,9), DR05 = c(5,4,5,3,6,2,1,9),
       DR06 = c(2,4,3,3,5,6,7,8), DR07 = c(2,5,4,4,9,4,7,8),
       DR08 = c(0,0,0,1,2,0,0,0), DR09 = c(0,0,0,0,0,0,0,0), DR010 = c(0,0,0,0,0,0,0,0), DR011 = c(4,0,0,0,0,0,0,0),
       DR012 = c(0,0,0,3,0,0,0,5), DR013 = c(0,0,1,0,0,0,2,0), DR014 = c(0,0,0,0,0,2,0,0)),
  class = "data.frame", row.names = c(NA, -8L))
> df1
date1 date2 Code Week DR1 DR01 DR02 DR03 DR04 DR05 DR06 DR07 DR08 DR09 DR010 DR011 DR012 DR013 DR014
1 2021-06-28 2021-06-30 FDE Wednesday 4 4 4 9 5 5 2 2 0 0 0 4 0 0 0
2 2021-06-28 2021-06-30 ABC Wednesday 1 1 2 5 4 4 4 5 0 0 0 0 0 0 0
3 2021-06-28 2021-07-02 ABC Friday 4 4 6 4 3 5 3 4 0 0 0 0 0 1 0
4 2021-06-28 2021-07-07 ABC Wednesday 3 3 7 3 3 3 3 4 1 0 0 0 3 0 0
5 2021-06-28 2021-07-07 CDE Wednesday 3 3 3 3 6 6 5 9 2 0 0 0 0 0 0
6 2021-06-28 2021-07-09 FGE Friday 4 4 2 2 2 2 6 4 0 0 0 0 0 0 2
7 2021-06-28 2021-07-09 ABC Friday 3 3 7 1 1 1 7 7 0 0 0 0 0 2 0
8 2021-06-28 2021-07-09 CDE Friday 5 6 4 5 9 9 8 8 0 0 0 0 5 0 0
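# Date and code chosen for the Y variable
dmda <- "2021-07-07"
CodeChosse <- "CDE"

# Difference between DR1 and each DR0* column, stored as <column>_PV
x <- df1 %>% select(starts_with("DR0"))
x <- cbind(df1, setNames(df1$DR1 - x, paste0(names(x), "_PV")))
PV <- select(x, date2, Week, Code, DR1, ends_with("PV"))

# Median of each _PV column per Code and Week
med <- PV %>%
  group_by(Code, Week) %>%
  summarize(across(ends_with("PV"), median))

# Add each median back to its DR0* column, stored as <column>_<column>_PV
SPV <- df1 %>%
  inner_join(med, by = c('Code', 'Week')) %>%
  mutate(across(matches("^DR0\\d+$"),
                ~ .x + get(paste0(cur_column(), '_PV')),
                .names = '{col}_{col}_PV')) %>%
  select(date1:Code, DR01_DR01_PV:last_col())
SPV <- data.frame(SPV)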
> SPV
date1 date2 Code DR01_DR01_PV DR02_DR02_PV DR03_DR03_PV DR04_DR04_PV DR05_DR05_PV DR06_DR06_PV DR07_DR07_PV DR08_DR08_PV DR09_DR09_PV DR010_DR010_PV DR011_DR011_PV DR012_DR012_PV DR013_DR013_PV DR014_DR014_PV
1 2021-06-28 2021-06-30 FDE 4 4.0 4 4.0 4.0 4.0 4.0 4.0 4.0 4.0 4.0 4.0 4 4.0
2 2021-06-28 2021-06-30 ABC 1 -0.5 3 2.5 2.5 2.5 2.5 1.5 2.0 2.0 2.0 0.5 2 2.0
3 2021-06-28 2021-07-02 ABC 4 3.0 5 4.5 5.5 1.5 2.0 3.5 3.5 3.5 3.5 3.5 3 3.5
4 2021-06-28 2021-07-07 ABC 3 4.5 1 1.5 1.5 1.5 1.5 2.5 2.0 2.0 2.0 3.5 2 2.0
5 2021-06-28 2021-07-07 CDE 3 3.0 3 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3.0 3 3.0
6 2021-06-28 2021-07-09 FGE 4 4.0 4 4.0 4.0 4.0 4.0 4.0 4.0 4.0 4.0 4.0 4 4.0
7 2021-06-28 2021-07-09 ABC 3 4.0 2 2.5 1.5 5.5 5.0 3.5 3.5 3.5 3.5 3.5 4 3.5
8 2021-06-28 2021-07-09 CDE 5 5.0 5 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5.0 5 5.0
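
To make the rule described at the top concrete, here is a minimal base R sketch of the selection I have in mind. It is not a definitive implementation. It assumes that "excluded" means only the trailing run of zero DR0* columns (everything after the last non-zero value for the chosen row), that the DR0* columns appear in increasing order, and that each date2/Code pair identifies exactly one row. The helper name make_Y and the reduced tables df1_small and SPV_small are hypothetical; they hold only the two CDE rows and a few columns, with values copied from df1 and SPV above.

```r
# Reduced copies of df1 and SPV (CDE rows only, a few DR columns).
df1_small <- data.frame(
  date2 = c("2021-07-07", "2021-07-09"),
  Code  = c("CDE", "CDE"),
  DR08  = c(2, 0), DR011 = c(0, 0), DR012 = c(0, 5),
  DR013 = c(0, 0), DR014 = c(0, 0)
)
SPV_small <- data.frame(
  date2 = c("2021-07-07", "2021-07-09"),
  Code  = c("CDE", "CDE"),
  DR08_DR08_PV   = c(3, 5), DR011_DR011_PV = c(3, 5),
  DR012_DR012_PV = c(3, 5), DR013_DR013_PV = c(3, 5),
  DR014_DR014_PV = c(3, 5)
)

# Hypothetical helper: returns the SPV row for (day, code) without the
# columns whose DR0* counterparts form the trailing run of zeros in df.
make_Y <- function(df, spv, day, code) {
  dr_cols <- grep("^DR0[0-9]+$", names(df), value = TRUE)
  row_df  <- df[df$date2 == day & df$Code == code, , drop = FALSE]
  row_spv <- spv[spv$date2 == day & spv$Code == code, , drop = FALSE]
  stopifnot(nrow(row_df) == 1, nrow(row_spv) == 1)

  # Position of the last non-zero DR0* value (0 if all are zero).
  vals    <- unlist(row_df[1, dr_cols])
  nonzero <- which(vals != 0)
  last_nz <- if (length(nonzero) > 0) max(nonzero) else 0

  # Columns after that position are trailing zeros; map them to SPV names.
  trailing  <- dr_cols[seq_along(dr_cols) > last_nz]
  drop_cols <- paste0(trailing, "_", trailing, "_PV")
  row_spv[, setdiff(names(row_spv), drop_cols), drop = FALSE]
}

Y_0707 <- make_Y(df1_small, SPV_small, "2021-07-07", "CDE")
Y_0907 <- make_Y(df1_small, SPV_small, "2021-07-09", "CDE")
```

With these reduced tables, Y_0707 keeps only date2, Code and DR08_DR08_PV, while Y_0907 also keeps DR011_DR011_PV and DR012_DR012_PV: DR08 and DR011 are 0 on 09/07 but are not part of the trailing run, so they stay. On the full data the same call would be make_Y(df1, SPV, dmda, CodeChosse).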