Suppose in my project directory there is a folder called "data". The folder contains three Excel files:
df.xlsx
df_2012017.xlsx
df_2012018.xlsx
df.xlsx
looks like this:
date | sales | week |
---|---|---|
2012-04-07 | 10 | 2012014 |
2012-04-14 | 30 | 2012015 |
2012-04-21 | 60 | 2012016 |
2012-04-28 | 40 | 2012017 |
df_2012017.xlsx
looks like this:
date | sales | week |
---|---|---|
2012-04-28 | 40 | 2012017 |
df_2012018.xlsx
looks like this:
date | sales | week |
---|---|---|
2012-05-05 | 100 | 2012018 |
Given the above files, I want to do the following:
-
Read the
df.xlsx
asdf
. -
Identify the latest week from the variable
week
and find out if there is any file in the "data" folder with the namedf_*.xlsx
where*
is greater than the last week. For example, when we read the filedf.xlsx
, we notice that the latestweek
is 2012017. So, there is only one file that satisfies the condition isdf_2012018.xlsx
-
After the identification, if there is any, we will read the file and bind the rows with
df
-
Then write the new data frame as
df.xlsx
In other words, I am trying to write a script that will update the excel file if there is any new data.
library(tidyverse)
library(lubridate)
library(writexl)
# Prepare toy data ----
df <- tibble(date = seq(ymd('2012-04-07'),ymd('2012-04-28'), by = '1 week')) %>%
mutate(sales = c(10, 30, 60, 40),
week = str_c(isoyear(date), "0", isoweek(date)))
## Existing data
df_2012017 <- tibble(date = ymd('2012-04-28'),
sales = 40,
week = '2012017')
## New data
df_2012018 <- tibble(date = ymd('2012-05-05'),
sales = 100,
week = '2012018')
## Write
write_xlsx(df, "data/df.xlsx")
write_xlsx(df_2012017, "data/df_2012017.xlsx")
write_xlsx(df_2012018, "data/df_2012018.xlsx")