I am trying to export PDF files and their metadata from Zotero into R. I found this guide, which includes a script, and have implemented it successfully. However, the only metadata it imports are fields such as "ID", "date last modified", and "date last added", which are not very useful for analysis.
I am looking for a way to export other metadata, such as "author", "tags", "organization", and "type", using this script or any other script.
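install.packages(c("magrittr", "dplyr", "DBI", "RSQLite", "quanteda", "readtext"))
# load the packages whose functions are called below without a namespace prefix
library(magrittr)  # %>%
library(DBI)       # dbConnect(), dbListTables(), dbGetQuery()
library(readtext)  # readtext()
# connect to Zotero's SQLite database
con <- dbConnect(drv = RSQLite::SQLite(),
                 dbname = "~/Zotero/zotero.sqlite")
# get names of all tables in the database
alltables <- dbListTables(con)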
# bring the items and itemNotes tables into R
table.items <- dbGetQuery(con, 'select * from items')
table.itemNotes <- dbGetQuery(con, 'select * from itemNotes')
# bring in Zotero fulltext cache plaintext
textDF <- readtext(paste0("~/Zotero/storage", "/*/.zotero-ft-cache"),
                   docvarsfrom = "filepaths")
# isolate "key" (8-character alphanumeric directory in storage/) in docvar1 associated with plaintext
textDF$docvar1 <- gsub(pattern = "^.*storage/", replacement = "", x = textDF$docvar1)
textDF$docvar1 <- gsub(pattern = "/.*", replacement = "", x = textDF$docvar1)
# bring in itemID (and some other metadata) and that's all
textDF <- textDF %>%
  dplyr::rename(key = docvar1) %>%
  dplyr::left_join(table.items, by = "key") %>%
  dplyr::filter(!is.na(itemID), !itemID %in% table.itemNotes$itemID)
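One direction that might get at the author, tag, and type fields is to join further tables from the same database. The following is only a sketch under assumptions: the table and column names (`itemTypes`, `itemAttachments`, `creators`, `itemCreators`, `tags`, `itemTags`) are my reading of Zotero's schema and should be checked against `dbListTables(con)` and `dbListFields()`, and the rows are made-up sample data in an in-memory database so that the query can run without a real library. It also assumes that each cache file sits in the directory of an attachment item, so the descriptive metadata belongs to the attachment's parent item.

```r
library(DBI)

# In-memory stand-in for zotero.sqlite. All table and column names are
# assumptions modeled on Zotero's schema; verify them on your own database.
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbExecute(con, "CREATE TABLE items (itemID INTEGER, itemTypeID INTEGER, key TEXT)")
dbExecute(con, "CREATE TABLE itemTypes (itemTypeID INTEGER, typeName TEXT)")
dbExecute(con, "CREATE TABLE itemAttachments (itemID INTEGER, parentItemID INTEGER)")
dbExecute(con, "CREATE TABLE creators (creatorID INTEGER, firstName TEXT, lastName TEXT)")
dbExecute(con, "CREATE TABLE itemCreators (itemID INTEGER, creatorID INTEGER, orderIndex INTEGER)")
dbExecute(con, "CREATE TABLE tags (tagID INTEGER, name TEXT)")
dbExecute(con, "CREATE TABLE itemTags (itemID INTEGER, tagID INTEGER)")

# Made-up sample rows: item 1 is a report, item 2 is its PDF attachment
dbExecute(con, "INSERT INTO items VALUES (1, 10, 'PARENT01'), (2, 20, 'ATTACH01')")
dbExecute(con, "INSERT INTO itemTypes VALUES (10, 'report'), (20, 'attachment')")
dbExecute(con, "INSERT INTO itemAttachments VALUES (2, 1)")
dbExecute(con, "INSERT INTO creators VALUES (1, 'Ada', 'Lovelace'), (2, 'Alan', 'Turing')")
dbExecute(con, "INSERT INTO itemCreators VALUES (1, 1, 0), (1, 2, 1)")
dbExecute(con, "INSERT INTO tags VALUES (1, 'history'), (2, 'computing')")
dbExecute(con, "INSERT INTO itemTags VALUES (1, 1), (1, 2)")

# One row per attachment key, carrying the type, authors, and tags
# of the attachment's parent item
metadata <- dbGetQuery(con, "
  SELECT att.key AS key,
         t.typeName AS type,
         (SELECT group_concat(c.lastName || ', ' || c.firstName, '; ')
            FROM itemCreators ic
            JOIN creators c ON c.creatorID = ic.creatorID
           WHERE ic.itemID = parent.itemID) AS authors,
         (SELECT group_concat(tg.name, '; ')
            FROM itemTags it
            JOIN tags tg ON tg.tagID = it.tagID
           WHERE it.itemID = parent.itemID) AS tags
    FROM items att
    JOIN itemAttachments ia ON ia.itemID = att.itemID
    JOIN items parent ON parent.itemID = ia.parentItemID
    JOIN itemTypes t ON t.itemTypeID = parent.itemTypeID
")
dbDisconnect(con)
print(metadata)
```

If the schema assumptions hold, the same `SELECT` could be run against the real connection and the result attached with `dplyr::left_join(textDF, metadata, by = "key")`. Fields such as "organization" appear to be stored as generic field/value pairs in separate tables rather than as columns of `items`, so they would need a further join that this sketch does not cover.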
I don't have extensive experience with R, so any help would be really appreciated!