I'm having an issue filling in NA's with the value to the left.
The job at hand means I have to read in a csv with over a million records. I then have to create a cumulative total per week by city. I have got to the stage where i have done cumsum and pivot_wider.
My current output is as follows
||City|05/03/2020|12/03/2020|19/03/2020|26/03/2020|..........
|---|---|---|---|---|---|
|1|Hull|NA|16776|17599|17840|
|2|Chester|NA|2665|2693|NA|
|3|Wakefield|NA|2594|2854|3081|
|4|Cardiff|NA|NA|NA|NA|
|5|Greenock|450|864|2237|2301|
|6|Inverness|NA|23957|23960|NA|
|7|London|2282|NA|4930|5342|
However for the purposes of the output I need it to look like the following.
||City|05/03/2020|12/03/2020|19/03/2020|26/03/2020|..........
|---|---|---|---|---|---|
|1|Hull|NA|16776|17599|17840|
|2|Chester|NA|2665|2693|2693|
|3|Wakefield|NA|2594|2854|3081|
|4|Cardiff|NA|NA|NA|NA|
|5|Greenock|450|864|2237|2301|
|6|Inverness|NA|23957|23960|23960|
|7|London|2282|2282|4930|5342|
The towns/cites always remain the same. The dates for col names go up to today and will update on a weekly basis. I'm happy with the first NA's in a row remaining but once the first value appears in a row, the remaining NA's need to be filled with the value to its left.
Iv tried variations of na.locf and na.locf0 with no success.
if its of any use - the section of code that got it to this stage is as follow s
mutate(CumulativeSum = cumsum(RFS_count))%>%
group_by(City, Reporting_Week)%>%
summarise(n = n(),
Cumulativetotal = sum(CumulativeSum))%>%
group_by(City)%>%
arrange(Reporting_Week)%>%
mutate(TotalSum = cumsum(n))%>%
select(-n, -Cumulativetotal)%>%
pivot_wider(names_from=Reporting_Week, values_from=TotalSum)
It was a struggle to get it even to this stage so i know the above is probably longwinded and inefficient
Any help would be much appreciated