Hi
My problem: I'd like to unnest values from the column into readable data frame.
- Since I am not familiar with the nested format, I am open to the community expert advice on best way to work with these types of data. I'd also appreciate if you can explain your approach and share any useful resources.
- I will be using the output for exploratory analysis than develop insights and reporting.
- I was thinking each unnest values into separate rows but that might be too much since I have thousands of rows. What do you recommend?
About the dataset:
- The original file is in CSV format with 1000+ rows.
- The data is stored in nested format in some columns.
- Column symptom_reasons_for_visit is enclosed with
[]
but patient_running is not even though both have nested values. - The data is possibly collected as survey where consultation_Id represents the person who conducted the survey.
Sample dataset:
df <- tibble::tribble(
~consultation_id, ~consultation_week_start, ~symptom_reasons_for_visit, ~why_visit, ~medications, ~global_questions, ~customer_age_range, ~patient_sex, ~patient_drinking, ~patient_running,
1007L, "01/11/2021", "[{symp_code=21, time_long={number=null, time_frame=null}, first_occurrence={number=2.000000000, time_frame=TTTT}, time_of_day=[], exacerbating_factors=[{id=31, name=carrying}], relieving_factors=[{id=007, name=headache (migraine)}], trigger_events=[{id=62, name=food and diet (intense exercise)}, {id=98, name=heavy headache (lifting)}]}]", "[]", "[{id=74770, name=Panadol, problems=[]}]", "[{id=9, type=radio, title=Are you or could you be injured? , answers=[{id=99, name=Not pregnant }]}, {id=876, type=radio, title=Are you currently in pain? , answers=[{id=1807, name=No}]}]", "20-29", "FEMALE", "{frequency=RARE, drinks={number=5.000000000, time_frame=MONTHLY}}", "{running_per_day=5, time_long={number=2.000000000, time_frame=DTFY}, status=ON, quit={number=5.000000000, time_frame=DTFY}}"
)
Thank you so much♥