nested json to csv

kwebihaf · February 13, 2024, 8:22pm

I would to convert the nested json to csv.

{
  "dataElements": [
    {
      "name": "004-DN02. Names of deceased",
      "id": "HCbRydAAt1T",
      "categoryCombo": {
        "name": "default",
        "id": "bjDvmb4bfuf",
        "categoryOptionCombos": [
          {
            "name": "default2",
            "id": "gGhClrV5odI"
          },
          {
            "name": "default",
            "id": "HllvX50cXC0"
          }
        ]
      }
    }
  ]
}

prubin · February 13, 2024, 8:50pm

The presence of a nested data frames/tables makes conversion to CSV tricky, since a cell in a CSV file cannot contain a table. Your first step would be to decide how to flatten it. (I'm assuming here that your example is simplified and your actual dataElements table would contain more than one row.) Suppose that one row of your actual data had two category combos, one with three options and one with two. Would you want the CSV to have five rows (one for each combination of category combo and option combo), or would you want two rows (one for each category combo) with each row having three option name columns and three option id columns (the third name and ID being NA for the combo that only had two options), or would you want one row with columns named columns categoryCombo1.name, categoryCombo2.name, categoryCombo1.id, categoryCombo2.id, categoryCombo1Option1.name, categoryCombo1Option2.name, etc.?

vedoa · February 17, 2024, 11:52am

Hi @kwebihaf ,

here some sample code to get you started

# install.packages("jsonlite", # to get json into an R object
#                  "rrapply"   # convert list to dataframe
# )
# set path
pathToYourFile <- "<yourpath>/posit_181834.json"
# parse json to R list
jsonToList <- jsonlite::fromJSON(txt = pathToYourFile, 
                        simplifyVector = FALSE # make list of lists
                        )
# convert list to dataframe
listToDf <- rrapply::rrapply(object = jsonToList, how = "bind")
# write to csv file
write.csv(listToDf, 
          textConnection("yourCsv", "w"), # change this to a path if you want to save it separately 
          row.names = FALSE)
yourCsv

# output
# [1] "\"name\",\"id\",\"categoryCombo.name\",\"categoryCombo.id\",\"categoryCombo.categoryOptionCombos.1.name\",\"categoryCombo.categoryOptionCombos.1.id\",\"categoryCombo.categoryOptionCombos.2.name\",\"categoryCombo.categoryOptionCombos.2.id\""
# [2] "\"004-DN02. Names of deceased\",\"HCbRydAAt1T\",\"default\",\"bjDvmb4bfuf\",\"default2\",\"gGhClrV5odI\",\"default\",\"HllvX50cXC0\""

jsonlite and rrapply are self contained (no dependencies). I also like the approach of rrapply since the column names "save" the structure of the json hierachy.

kwebihaf · February 20, 2024, 6:58pm

Thank You. It has worked

system · February 27, 2024, 6:58pm

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.