Hi all,
I am trying to download monthly asylum application data from Eurostat using the eurostat package. I only need the latest figures: ideally the latest available month, though all of the 2018 months would work too. The problem is that downloading the entire dataset crashes RStudio, and filtering with sinceTimePeriod returns an error.
Hope someone knows a way around this.
Thanks,
Mattia
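library(eurostat)
# downloading the entire dataset crashes RStudio
applications_all <- get_eurostat("migr_asyappctzm", time_format = "raw")
# with time_format = "raw" the time values look like "2018M01", so match on the year prefix
applications_latest <- subset(applications_all, grepl("^2018", time))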
# filtering with sinceTimePeriod returns error “Failure to get data. Status code: 416. Some datasets are not accessible via the eurostat interface.”
applications <- get_eurostat("migr_asyappctzm", filters = list(sinceTimePeriod = 2018), type = "label", time_format = "num")
I get a similar error when I try the same get_eurostat call.
The error message notes that you can download the file here: http://ec.europa.eu/eurostat/estat-navtree-portlet-prod/BulkDownloadListing?sort=1&file=data%2Fmigr_asyappctzm.tsv.gz
The tab-separated-values file seems (to me at least) oddly formatted, and R seems to have a lot of trouble parsing it. There are also colon and "d" characters throughout. Perhaps the metadata discussion on their documentation website has insight: http://ec.europa.eu/eurostat/web/products-datasets/product?code=migr_asyappctzm
With readr, dplyr and tidyr, it's pretty easy to clean this data up, but I'd be nervous about making too many assumptions. E.g. is a ":" a missing value or NULL? What does a value of "0 d" under 2018M02 mean?
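For what it's worth, here is a rough sketch of the kind of cleanup I have in mind. To keep it runnable without the download, it writes a tiny made-up sample that imitates the layout I saw in the file (all dimensions packed into the first column, one column per month, cells like "0 d" and ":"). The sample rows, the column names and the choice to turn ":" into NA are my assumptions, not something confirmed by the Eurostat metadata, so check them before trusting the result.

```r
library(readr)
library(dplyr)
library(tidyr)

# Hypothetical sample imitating the bulk file layout; for the real data you
# would point read_tsv() at the downloaded migr_asyappctzm.tsv.gz instead.
sample_lines <- c(
  "citizen,sex,geo\\time\t2018M02 \t2018M01 ",
  "AF,T,AT\t0 d\t15 ",
  "AF,T,BE\t: \t120 "
)
tmp <- tempfile(fileext = ".tsv")
writeLines(sample_lines, tmp)

# Read everything as text so that flags such as "d" are not lost
raw <- read_tsv(tmp, col_types = cols(.default = col_character()))
names(raw) <- trimws(names(raw))

# The first header holds the dimension names, e.g. "citizen,sex,geo\time";
# split on comma or backslash and drop the trailing "time"
parts <- strsplit(names(raw)[1], "[\\\\,]")[[1]]
dims <- head(parts, -1)
names(raw)[1] <- "key"

tidy <- raw %>%
  separate(key, into = dims, sep = ",") %>%
  pivot_longer(-all_of(dims), names_to = "time", values_to = "cell") %>%
  mutate(
    cell = trimws(cell),
    value_txt = sub("\\s.*$", "", cell),            # text before the first space
    flag = na_if(sub("^\\S+\\s*", "", cell), ""),   # whatever follows it, if anything
    value = as.numeric(na_if(value_txt, ":"))       # ASSUMPTION: ":" means missing
  ) %>%
  select(-cell, -value_txt)

# Keep only the 2018 months
applications_2018 <- filter(tidy, startsWith(time, "2018"))
print(applications_2018)
```

Keeping the flag in its own column instead of dropping it means nothing is thrown away while the meaning of "d" is still unclear.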
Some options:
- You might submit an issue so the package maintainers are aware of the problem. There's a chance folks there are familiar with this kind of thing. https://github.com/rOpenGov/eurostat/issues
- The documentation website I gave also has a few contacts you can pursue.