Importing more than one data table from the same worksheet

FJCC · April 23, 2019, 4:07pm

If you only have one file to deal with, you can use the read.xlsx function from the openxlsx package and specify the rows and columns to read.

If you have to handle many files, here is a simplistic method. It assumes that each table has a header row, there is one empty row after each table, and the only NA values in the first column are in the rows between tables. The data frames are stored in a list.

library(openxlsx)

df <- read.xlsx("~/R/Play/Source.xlsx", skipEmptyRows = FALSE)
df
#>    First Second   Third Fourth Fifth Sixth
#> 1     23  28.29 34.7967      1     3    11
#> 2     25  30.75 37.8225      2     4    22
#> 3   <NA>   <NA>    <NA>   <NA>  <NA>  <NA>
#> 4    YYY    TTT     RRR    EEE   WWW   QQQ
#> 5      A      Q       V      W     U     N
#> 6      S      W       G      R     I     M
#> 7   <NA>   <NA>    <NA>   <NA>  <NA>  <NA>
#> 8    rrr    ttt     yyy    uuu  iiii   ooo
#> 9     45     66      88      8    53    73
#> 10    48     77      99      1    58    52
TotalRows <- nrow(df)
SplitRows <- which(is.na(df$First))
SplitRows <- c(0, SplitRows, TotalRows + 1)
NumSplits <- length(SplitRows) - 1
DFs <- vector(mode = "list", length = NumSplits)
for (i in 1:NumSplits) {
  if ( i ==1){
      DFs[[i]] <- df[(SplitRows[i] + 1):(SplitRows[i+1] - 1), ]
  } else {
    DFs[[i]] <- df[(SplitRows[i] + 2):(SplitRows[i+1] - 1),]
    colnames(DFs[[i]]) <- df[SplitRows[i] + 1, ]
  }
}
DFs[[1]]
#>   First Second   Third Fourth Fifth Sixth
#> 1    23  28.29 34.7967      1     3    11
#> 2    25  30.75 37.8225      2     4    22
DFs[[2]]
#>   YYY TTT RRR EEE WWW QQQ
#> 5   A   Q   V   W   U   N
#> 6   S   W   G   R   I   M
DFs[[3]]
#>    rrr ttt yyy uuu iiii ooo
#> 9   45  66  88   8   53  73
#> 10  48  77  99   1   58  52