We have gathered historical tweets regarding public organisations and want to run a sentiment analysis on them. The file type is ".jsonl", meaning that every line is a separate JSON object (see screenshot). We need to convert the information to a normal dataframe to start working on the data. Can somebody please explain which code we need? I can't find clear answers on the internet on how to deal with JSON lines.
We don't really have enough info to help you out. Could you ask this with a minimal REPRoducible EXample (reprex)? A reprex makes it much easier for others to understand your issue and figure out how to help.
I can get you this far, but no further... there's an issue: each JSON object doesn't necessarily have the same dimensions as the others, so making one single table to hold them all is fraught. My solution works as far as giving you a list of tables, one for each of the 400 JSON lines.
library(tidyverse)
library(jsonlite)
library(purrr)

fileName <- "twitter_premium.jsonl"
# read every line of the file into a character vector
tpj <- readLines(fileName)
# wrap each line in brackets so fromJSON parses it as a one-element array
tpj2 <- paste0("[", tpj, "]")
# parse each line into its own tibble -- this works
list_of_json <- map(tpj2, ~ as_tibble(jsonlite::fromJSON(.)))
# binding them into one table errors because the structures are inconsistent
try_one_table <- map_dfr(tpj2, ~ as_tibble(jsonlite::fromJSON(.)))
#Error: Argument 13 can't be a list containing data frames
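That said, if you just want the whole file in one flat data frame, jsonlite can read newline-delimited JSON directly, which sidesteps the line-by-line parsing. A minimal sketch (assuming the file is named `twitter_premium.jsonl` as above; I haven't run this against your data, so column names are illustrative):

```r
library(jsonlite)

# stream_in() reads NDJSON (one JSON object per line) straight into a
# data frame, padding missing fields with NA across records
tweets <- stream_in(file("twitter_premium.jsonl"))

# flatten() unnests nested objects into dotted column names,
# e.g. user.screen_name; fields that are genuinely ragged
# (lists of hashtags, etc.) remain as list-columns
tweets_flat <- flatten(tweets)

tibble::as_tibble(tweets_flat)
```

List-columns are fine for sentiment work: you can pull the text column out directly, and `tidyr::unnest()` can expand the ragged ones later if you need them.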