nested json to csv

I would to convert the nested json to csv.

{
  "dataElements": [
    {
      "name": "004-DN02. Names of deceased",
      "id": "HCbRydAAt1T",
      "categoryCombo": {
        "name": "default",
        "id": "bjDvmb4bfuf",
        "categoryOptionCombos": [
          {
            "name": "default2",
            "id": "gGhClrV5odI"
          },
          {
            "name": "default",
            "id": "HllvX50cXC0"
          }
        ]
      }
    }
  ]
}

The presence of a nested data frames/tables makes conversion to CSV tricky, since a cell in a CSV file cannot contain a table. Your first step would be to decide how to flatten it. (I'm assuming here that your example is simplified and your actual dataElements table would contain more than one row.) Suppose that one row of your actual data had two category combos, one with three options and one with two. Would you want the CSV to have five rows (one for each combination of category combo and option combo), or would you want two rows (one for each category combo) with each row having three option name columns and three option id columns (the third name and ID being NA for the combo that only had two options), or would you want one row with columns named columns categoryCombo1.name, categoryCombo2.name, categoryCombo1.id, categoryCombo2.id, categoryCombo1Option1.name, categoryCombo1Option2.name, etc.?

Hi @kwebihaf ,

here some sample code to get you started

# install.packages("jsonlite", # to get json into an R object
#                  "rrapply"   # convert list to dataframe
# )
# set path
pathToYourFile <- "<yourpath>/posit_181834.json"
# parse json to R list
jsonToList <- jsonlite::fromJSON(txt = pathToYourFile, 
                        simplifyVector = FALSE # make list of lists
                        )
# convert list to dataframe
listToDf <- rrapply::rrapply(object = jsonToList, how = "bind")
# write to csv file
write.csv(listToDf, 
          textConnection("yourCsv", "w"), # change this to a path if you want to save it separately 
          row.names = FALSE)
yourCsv

# output
# [1] "\"name\",\"id\",\"categoryCombo.name\",\"categoryCombo.id\",\"categoryCombo.categoryOptionCombos.1.name\",\"categoryCombo.categoryOptionCombos.1.id\",\"categoryCombo.categoryOptionCombos.2.name\",\"categoryCombo.categoryOptionCombos.2.id\""
# [2] "\"004-DN02. Names of deceased\",\"HCbRydAAt1T\",\"default\",\"bjDvmb4bfuf\",\"default2\",\"gGhClrV5odI\",\"default\",\"HllvX50cXC0\"" 

jsonlite and rrapply are self contained (no dependencies). I also like the approach of rrapply since the column names "save" the structure of the json hierachy.

Thank You. It has worked

This topic was automatically closed 7 days after the last reply. New replies are no longer allowed.

If you have a query related to it or one of the replies, start a new topic and refer back with a link.