I am back-testing a trading strategy on the members of a stock index. I have historical stock price date for every years. I would like to merge these time-series objects. My problem is that the index constituents are not the same , therefore I could not simple rbind them. Merging is not an option as well, because it creates separate columns for the same stocks from different years. Could someone suggest me a solution?
Hi @tamas1,
This will get you your desired output with this very simple example but it will be more tedious if there are many mismatched column names (company ticker symbols):
xts1 <- read.table(header=TRUE, text=c("
DATE AAPL AMZ AA AXP
'11/01/2020' 100 85 90 70
'12/01/2020' 105 70 80 90
"))
xts2 <- read.table(header=TRUE, text=c("
DATE AAPL AM AXP BA
'01/01/2021' 108 75 80 50
'02/01/2021' 110 60 70 60
"))
# Wanted:
# AAPL AMZ AA AXP BA
# 11/01/2020 100 85 90 70 NA
# 12/01/2020 105 70 80 90 NA
# 01/01/2021 108 75 NA 80 50
# 02/01/2021 110 60 NA 70 60
# Identify mismatching column names. Will be tricky if there are many.
setdiff(names(xts1), names(xts2))
#> [1] "AMZ" "AA"
suppressPackageStartupMessages(library(tidyverse))
xts_full <- bind_rows(xts1, xts2) # cols are matched by name, missing cols filled with NA.
xts_full
#> DATE AAPL AMZ AA AXP AM BA
#> 1 11/01/2020 100 85 90 70 NA NA
#> 2 12/01/2020 105 70 80 90 NA NA
#> 3 01/01/2021 108 NA NA 80 75 50
#> 4 02/01/2021 110 NA NA 70 60 60
# Column pairs to be merged (same company with different ticker code)
# AMZ = AM
xts_full %>%
rowwise() %>%
mutate(AMZ = sum(c(AMZ, AM), na.rm=TRUE)) %>%
select(-AM)
#> # A tibble: 4 x 6
#> # Rowwise:
#> DATE AAPL AMZ AA AXP BA
#> <chr> <int> <int> <int> <int> <int>
#> 1 11/01/2020 100 85 90 70 NA
#> 2 12/01/2020 105 70 80 90 NA
#> 3 01/01/2021 108 75 NA 80 50
#> 4 02/01/2021 110 60 NA 70 60