When I create a table in BigQuery using DBI::dbWriteTable()
the created column type of string fields depends on the length of the string. Strings with length below 130 characters will become string, above 130 will become BYTES.
Why does this happen?
When I create the table first with bigrquery::bq_table_create()
and pass the field types everything is fine. So my second question is, can I pass the field types to DBI::dbWriteTable()
, too? Man-page says no.
Is there another way to do the task without using specific bigrquery functions?
Here's a reprex:
# Setting BigQuery Parameters
project <- "my_project"
dataset <- "my_dataset"
table_name <- "test-table"
table_name_2 <- "test-table-2"
# Connection for DBI
con <- DBI::dbConnect(
bigrquery::bigquery(),
project = project,
dataset = dataset,
billing = project
)
# Simple data.frame with two columns, one with string of length 129, one with length of 130 characters
data <- data.frame(
col_129 = paste(replicate(129, "a"), collapse = ""),
col_130 = paste(replicate(130, "a"), collapse = "")
)
# Remove table if it exists
if (DBI::dbExistsTable(con, table_name)) {
DBI::dbRemoveTable(con, table_name)
}
# create table
DBI::dbWriteTable(con, table_name, data, append = FALSE)
# check column types
bigrquery::bq_table(project, dataset, table_name) |> bigrquery::bq_table_fields()
#> <bq_fields>
#> col_130 <BYTES>
#> col_129 <STRING>
# Do the same as above, but using bigrquery functions to create table first
if (DBI::dbExistsTable(con, table_name_2)) {
DBI::dbRemoveTable(con, table_name_2)
}
ds_2 <- bigrquery::bq_table(project, dataset, table_name_2)
bigrquery::bq_table_create(ds_2, fields = bigrquery::as_bq_fields(data))
#> <bq_table> my_project.my_dataset.test-table-2
DBI::dbWriteTable(con, table_name_2, data, append = TRUE)
bigrquery::bq_table(project, dataset, table_name_2) |> bigrquery::bq_table_fields()
#> <bq_fields>
#> col_129 <STRING>
#> col_130 <STRING>
Created on 2022-10-22 with reprex v2.0.2
Thanks for your help.
Martin