Hi!
I am trying to export PDF files plus their metadata from Zotero into R. I found this guide that includes a script, and I have it running successfully. However, the only metadata it imports are fields like "ID", "date last modified", and "date last added", which are not very useful for analysis.
I am looking for a way to export other metadata such as "author", "tags", "organization", "type" using this script or any other script.
install.packages(c("magrittr", "dplyr", "DBI", "RSQLite", "quanteda", "readtext"))
library(magrittr)
library(DBI)
library(RSQLite)
library(quanteda)
library(readtext)
# connect to Zotero's SQLite database
con = dbConnect(drv = RSQLite::SQLite(),
                dbname = "~/Zotero/zotero.sqlite")
# get names of all tables in the database
alltables = dbListTables(con)
# bring the items and itemNotes tables into R
table.items <- dbGetQuery(con, 'select * from items')
table.itemNotes <- dbGetQuery(con, 'select * from itemNotes')
# bring in Zotero fulltext cache plaintext
textDF <- readtext(paste0("~/Zotero/storage", "/*/.zotero-ft-cache"),
                   docvarsfrom = "filepaths")
# isolate "key" (8-character alphanumeric directory in storage/) in docvar1 associated with plaintext
# strip everything up to and including "storage/"
textDF$docvar1 <- gsub(pattern = "^.*storage/", replacement = "", x = textDF$docvar1)
# strip everything from the next "/" onward, leaving only the key
textDF$docvar1 <- gsub(pattern = "/.*", replacement = "", x = textDF$docvar1)
# bring in itemID (and some other metadata) and that's all
textDF <- textDF %>%
  dplyr::rename(key = docvar1) %>%
  dplyr::left_join(table.items, by = "key") %>%
  dplyr::filter(!is.na(itemID), !itemID %in% table.itemNotes$itemID)
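
One direction that might work is querying the other tables in the same database. This is only a rough sketch and not something from the guide. It assumes Zotero's SQLite schema has the tables itemAttachments, itemTypes, itemCreators, creators, itemTags, tags, itemData, fields and itemDataValues with the column names used below, and that the key of each storage/ folder belongs to the attachment item, so the author, tags and type have to be looked up on its parent item. The function name zotero_parent_metadata is made up for illustration.

```r
library(DBI)

# Hypothetical helper: one row per attachment, with metadata taken from its parent item.
# Table and column names are assumptions about Zotero's SQLite schema.
zotero_parent_metadata <- function(con) {
  dbGetQuery(con, "
    SELECT att.key       AS key,           -- storage/ folder name of the attachment
           parent.itemID AS parentItemID,
           t.typeName    AS type,          -- e.g. journalArticle, report
           (SELECT group_concat(c.lastName, '; ')
              FROM itemCreators ic
              JOIN creators c ON c.creatorID = ic.creatorID
             WHERE ic.itemID = parent.itemID) AS creators,
           (SELECT group_concat(tg.name, '; ')
              FROM itemTags it
              JOIN tags tg ON tg.tagID = it.tagID
             WHERE it.itemID = parent.itemID) AS tags,
           -- swap 'title' for another fieldName to pull a different field
           (SELECT v.value
              FROM itemData d
              JOIN fields f         ON f.fieldID = d.fieldID
              JOIN itemDataValues v ON v.valueID = d.valueID
             WHERE d.itemID = parent.itemID
               AND f.fieldName = 'title') AS title
      FROM itemAttachments ia
      JOIN items att     ON att.itemID = ia.itemID
      JOIN items parent  ON parent.itemID = ia.parentItemID
      JOIN itemTypes t   ON t.itemTypeID = parent.itemTypeID
  ")
}

# Usage with the connection and data frame from the script above:
# meta   <- zotero_parent_metadata(con)
# textDF <- dplyr::left_join(textDF, meta, by = "key")
```

If the assumptions hold, the result should have one row per PDF with its parent's type, creators and tags as semicolon-separated strings, which could then be joined onto textDF by key. An "organization" value would presumably come from a field name such as 'institution' or 'publisher', but I have not checked which one Zotero uses.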
I don't have extensive experience with R, so any help would be really appreciated!
Thanks