I have a dataset of 43k rows. It works great for about 500 rows and then it crashes and assigns the remaining 42500 rows to the same tract, which is not correct. Here is the error message I get and the code below is what I am using:
Hi @jillahmad17
You are probably overloading the server with too many requests, too quickly. Try adding a short time delay in your loop:
for (i in 1:nrow(tract_data)) {
  tract_data$tracts[i] <- geo2fips(tract_data$Latitude[i], tract_data$Longitude[i])
  Sys.sleep(1)  # pause between requests to avoid hammering the server
}
Mind you, with a 1 second delay between requests, 43,000 rows are going to take almost 12 hours! If that doesn't work, try requesting the data in blocks of, say, 400 rows with a few seconds between each block.
If the server is rate throttling, it might be possible to get by with a shorter delay than 1 second, but that's still not optimal, given the alternatives.
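A minimal sketch of the block-wise approach. `fake_geo2fips()` is a stand-in for the real `geo2fips()` lookup so the loop runs on its own, and the block size and sleep are shortened for the demo:

```r
# Stand-in for geo2fips(); returns a fake code so the sketch is self-contained.
fake_geo2fips <- function(lat, lon) sprintf("%05.0f", abs(lat * 100))

tract_data <- data.frame(Latitude  = runif(10, 25, 49),
                         Longitude = runif(10, -124, -67),
                         tracts    = NA_character_)

block_size <- 4                        # use ~400 for a real run
row_blocks <- split(seq_len(nrow(tract_data)),
                    ceiling(seq_len(nrow(tract_data)) / block_size))

for (idx in row_blocks) {
  for (i in idx) {
    tract_data$tracts[i] <- fake_geo2fips(tract_data$Latitude[i],
                                          tract_data$Longitude[i])
  }
  Sys.sleep(0.1)                       # a few seconds between blocks in a real run
}
```

With the real `geo2fips()` you could also add a short `Sys.sleep()` inside the inner loop, combining both strategies.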
Both the {tigris} and {tidycensus} packages allow downloading files with county FIPS codes and an {sf} (simple features) representation of the county boundaries. That allows using functions in {sf} such as st_intersects() to write a script that assembles a data frame-like object with the FIPS code, county name, and lat/lon geometries, with no rate-limited server in the loop at all.
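The spatial-join pattern looks roughly like this. A toy polygon stands in for the boundaries you would get from tigris::tracts(state = ..., cb = TRUE); the GEOID, coordinates, and sample point are made up for illustration:

```r
library(sf)

# Toy "tract" polygon standing in for the downloaded boundaries, which
# come with a GEOID (FIPS) column.
tract_polys <- st_sf(
  GEOID = "48001950100",               # made-up tract FIPS for the demo
  geometry = st_sfc(
    st_polygon(list(rbind(c(-97, 30), c(-96, 30), c(-96, 31),
                          c(-97, 31), c(-97, 30)))),
    crs = 4326))

# Convert lat/lon columns to sf points (WGS84)
pts <- st_as_sf(data.frame(Latitude = 30.5, Longitude = -96.5),
                coords = c("Longitude", "Latitude"), crs = 4326)

# Point-in-polygon join: each point picks up the GEOID of the polygon
# it falls inside
result <- st_join(pts, tract_polys, join = st_intersects)
```

The same st_join() call works unchanged once tract_polys is replaced by the real boundary file, and it processes all 43,000 points in one vectorized pass.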
I don't have an example using fromJSON, but I do have one for reading from a database, which sometimes times out.
It doesn't use a delay; the main thing is that you wrap your fromJSON call in try() and test the class of the response:
Try <- maxtries                                 # set beforehand, e.g. maxtries <- 5
while (TRUE) {
  DataRead <- try(dbGetQuery(dbcon, QueryString))
  if (!inherits(DataRead, "try-error")) break   # success
  Try <- Try - 1
  if (Try <= 0) break                           # give up after maxtries failures
  dbcon <<- OpenDataLake()                      # reopen connection, ready for retry
  LogMessage(QueryString, " read error, retrying")
}
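The same pattern can be packaged as a small helper and wrapped around jsonlite::fromJSON() (or any flaky call). with_retry() and the flaky() stub below are illustrative names, not from the post; the stub fails twice and then succeeds so the retry path is exercised:

```r
# Generic retry wrapper: call fn() up to max_tries times, pausing between
# failed attempts, and return the first successful result.
with_retry <- function(fn, max_tries = 3, delay = 0) {
  for (attempt in seq_len(max_tries)) {
    result <- try(fn(), silent = TRUE)
    if (!inherits(result, "try-error")) return(result)  # success
    if (attempt < max_tries) Sys.sleep(delay)           # brief pause, then retry
  }
  stop("all ", max_tries, " attempts failed")
}

# Demo: a stub that fails twice, then succeeds on the third call
calls <- 0
flaky <- function() {
  calls <<- calls + 1
  if (calls < 3) stop("temporary failure")
  "ok"
}
out <- with_retry(flaky, max_tries = 5)
```

For the original problem you would pass something like function() fromJSON(request_url) as fn, with a nonzero delay.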