Matrix to Array conversion is increasing the size massively in R

I have an R object saved as an .RData file. It is a data frame with Lat and Lon columns plus one column of values per month, with one row per pixel, covering January 1950 to December 2099 (1800 months). The data frame therefore has 810810 rows and 1802 columns.

I am converting this data frame into an array (as part of delivering NetCDF files as our final product). However, the conversion massively increases the size of the object, which makes it unsuitable for storing the data.

Here's the code that I tried:

library(ncdf4)

# Load the RData file
load("Rcp45_bcc-csm1-1_1Month_SPEI_Df.RData")
print("RData File Loaded")

# Open the NetCDF template file to get lat and lon values
file_nc_maxT1 <- nc_open("macav2metdata_PotEvap_bcc-csm1-1_r1i1p1_rcp45_2096_2099_CONUS_monthly.nc")
nclat <- as.numeric(ncvar_get(file_nc_maxT1,"lat"))
nclon <- as.numeric(ncvar_get(file_nc_maxT1,"lon"))
nc_close(file_nc_maxT1)

# Define the filename for the new NetCDF file
filename <- "macav2metdata_1-month_SPEI_bcc-csm1-1_r1i1p1_rcp45_1950_2099_CONUS_monthly.nc"

# Create the NetCDF file
nx <- length(nclon)
ny <- length(nclat)
lon <- ncdim_def("lon", units = "degrees_east", longname = "longitude", vals = nclon)
lat <- ncdim_def("lat", units = "degrees_north", longname = "latitude", vals = nclat)
nctime <- seq(as.Date("1950-01-01"),as.Date("2099-12-31"),by="months")
time <- ncdim_def("time", units = "days since 1900-01-01 00:00:00", vals = as.numeric(nctime - as.Date("1900-01-01")))
mv <- -999 
var <- ncvar_def("SPEI", "1-month SPEI", list(lon, lat, time), longname="1-month Standardized Precipitation Evapotranspiration Index (SPEI)", mv)

ncnew <- nc_create(filename, list(var))

print(paste("The file has", ncnew$nvars,"variables"))
print(paste("The file has", ncnew$ndim,"dimensions"))

# Replace NaN and NA values with missing value (mv)
is.nan.data.frame <- function(x)
  do.call(cbind, lapply(x, is.nan))

SPEIDf[is.nan(SPEIDf)] <- mv
SPEIDf[is.na(SPEIDf)] <- mv

# Convert data to array format
dataarray <- array(SPEIDf[, -c(1, 2)], dim = c(nx, ny, length(nctime)))
print("SPEIDf")
print(dim(SPEIDf))
print(object.size(SPEIDf))
print("dataarray")
print(dim(dataarray))
print(object.size(dataarray))

# Write data array to the NetCDF file
ncvar_put(ncnew, var, dataarray)

# Close the NetCDF file
nc_close(ncnew)
print("NC File Created")

The code stops with the following error:

Error in ncvar_put(ncnew, var, dataarray) : 
  'list' object cannot be coerced to type 'double'
Execution halted

This is the output that I get before it stops:

[1] "RData File Loaded"
[1] "The file has 1 variables"
[1] "The file has 3 dimensions"
[1] "SPEIDf"
[1] 810810   1802
11688853808 bytes
[1] "dataarray"
[1] 1386  585 1800
9466826857488224 bytes

I have been converting this type of .RData object to .nc files using this method and have never encountered this issue before. I would greatly appreciate any help with this.
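A likely cause, sketched here on a hypothetical toy grid rather than the real data: a data frame is a list of columns, so array() recycles those columns to fill the requested dimensions and returns a list array in which every cell holds a whole column. That would explain both the 'list' coercion error from ncvar_put and the huge object.size figure, since object.size counts each repeated column separately even though the columns are shared in memory. The names toy_df, bad and good are illustrative only, and the sketch assumes the rows are ordered with longitude varying fastest, as dim = c(nx, ny, nt) requires.

```r
# Hypothetical toy grid: 2 lon x 3 lat = 6 pixels, 4 time steps
nx <- 2; ny <- 3; nt <- 4
toy_df <- data.frame(
  Lat = rep(seq_len(ny), each = nx),
  Lon = rep(seq_len(nx), times = ny),
  matrix(as.numeric(seq_len(nx * ny * nt)), nrow = nx * ny)
)

# A data frame is a list of columns, so array() recycles the 4 columns
# to fill all 24 cells; each cell then holds an entire column
bad <- array(toy_df[, -c(1, 2)], dim = c(nx, ny, nt))
print(typeof(bad))   # "list"

# Converting to a numeric matrix first gives one double per cell
good <- array(as.matrix(toy_df[, -c(1, 2)]), dim = c(nx, ny, nt))
print(typeof(good))  # "double"

# Compare the reported sizes of the two objects
print(object.size(bad) > object.size(good))
```

Applied to the code above, this would mean wrapping SPEIDf[, -c(1, 2)] in as.matrix() before calling array(). Note that as.matrix() makes a full numeric copy of the value columns, so on the real data it needs roughly as much extra memory as the data frame itself. If that is too much, writing one month at a time with the start and count arguments of ncvar_put may be an option.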
