Flattening the Data of Different Data Types

Introduction

I read the data with R code that uses the sparklyr package, which also produces the database schema. After reading, I have a dataframe with the following structure in R.

R Database Schema

root
    |-- contributors : string
    |-- created_at : string
    |-- entities (struct)
    |     |-- hashtags (array) : [string]
    |     |-- media (array)
    |     |     |-- additional_media_info (struct)
    |     |     |       |-- description : string
    |     |     |       |-- embeddable : boolean
    |     |     |       |-- monetizable : boolean
    |     |     |-- display_url : string
    |     |     |-- id : long
    |     |     |-- id_str : string
    |     |-- urls (array)     
    |-- extended_entities (struct)
    |-- retweeted_status (struct)
    |-- user (struct)
    
I want to flatten this structure and create a new dataframe as shown below:
    
    root
    |-- contributors : string
    |-- created_at : string
    |-- entities (struct)
    |-- entities.hashtags (array) : [string]
    |-- entities.media (array)
    |-- entities.media.additional_media_info (struct)
    |-- entities.media.additional_media_info.description : string
    |-- entities.media.additional_media_info.embeddable : boolean
    |-- entities.media.additional_media_info.monetizable : boolean
    |-- entities.media.display_url : string
    |-- entities.media.id : long
    |-- entities.media.id_str : string
    |-- entities.urls (array)     
    |-- extended_entities (struct)
    |-- retweeted_status (struct)
    |-- user (struct)
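
To make the target naming concrete, here is a minimal base R sketch of how dotted column names could be derived from a nested structure. The `schema` list and the `flatten_names()` helper are hypothetical and only mirror part of the schema above; they do not touch Spark. Unlike the listing above, the sketch returns leaf fields only, not the intermediate struct and array nodes.

```r
# Hypothetical nested schema: a list is a struct (or an array of structs),
# a character string is the type of a leaf field.
schema <- list(
  contributors = "string",
  created_at   = "string",
  entities     = list(
    hashtags = "array<string>",
    media    = list(
      additional_media_info = list(
        description = "string",
        embeddable  = "boolean"
      ),
      id     = "long",
      id_str = "string"
    )
  )
)

# Recursively walk the schema and collect the dotted path of every leaf field.
flatten_names <- function(node, prefix = NULL) {
  out <- character(0)
  for (name in names(node)) {
    path  <- if (is.null(prefix)) name else paste(prefix, name, sep = ".")
    child <- node[[name]]
    if (is.list(child)) {
      out <- c(out, flatten_names(child, path))  # descend into the struct
    } else {
      out <- c(out, path)                        # leaf: keep its full path
    }
  }
  out
}

flat_names <- flatten_names(schema)
print(flat_names)
```

The same recursion applies to deeper nesting; in Spark itself, array fields such as `entities.media` would additionally have to be exploded before their sub-fields can become ordinary columns.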

Problem Statement

The nested columns have different data types (structs, arrays, and scalar fields). If required, I can upload a snapshot of the database schema; it is similar to the structure above. I want to flatten these columns. Any solution would be appreciated.

Have you tried the sparklyr.nested extension for sparklyr (https://mitre.github.io/sparklyr.nested/)?

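You can install it from GitHub with:

# requires the devtools package
devtools::install_github("mitre/sparklyr.nested")

Then load the packages and unnest the struct column:

library(sparklyr)
library(sparklyr.nested)

# your_data is the Spark dataframe from the question
your_data %>%
  sdf_unnest(entities)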

Please refer to the extension docs for examples and additional information.
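
Because `entities.media` is an array of structs, one unnest call may not be enough to reach its sub-fields. The following is a rough sketch of one way to handle it with `sdf_select()` and `sdf_explode()` from the same extension. It assumes `your_data` is the Spark dataframe from the question and that the selected columns keep the package's default names (the leaf field name); it has not been run against the real data, so check the result with `sdf_schema_viewer()` or the extension docs.

```r
library(sparklyr)
library(sparklyr.nested)
library(dplyr)

# Assumption: your_data is a tbl_spark with the schema shown in the question.
media_flat <- your_data %>%
  # pull the nested array up to the top level next to a plain column
  sdf_select(created_at, entities.media) %>%
  # one output row per element of the media array
  sdf_explode(media) %>%
  # promote fields of the exploded struct to ordinary columns
  sdf_select(created_at, media.id, media.id_str)
```

Note that exploding replicates each record once per array element, so the result has one row per media item rather than one row per original record.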