Hello everyone,
Thank you for taking a look at my post. I am in urgent need of some help with a simple web scraping related project I am working on.
My issue is that I have used this exact same code before and it worked well. But for some reason, I am now getting this error: "Error: Argument 'txt' must be a JSON string, URL or file."
This is very frustrating, of course. I have two questions:
1. Could you please provide some insight into why I am having this issue?
2. Is this the best method of web scraping in R? The purpose of this web scrape is to build a large data frame (hundreds of observations of approximately 20 variables).
I have provided my code below.
```
library(dplyr)
library(jsonlite)
library(httr)
my_url <- "https://api.nhle.com/stats/rest/en/skater/summary?isAggregate=false&isGame=true&sort=%5B%7B%22property%22:%22points%22,%22direction%22:%22DESC%22%7D,%7B%22property%22:%22goals%22,%22direction%22:%22DESC%22%7D,%7B%22property%22:%22assists%22,%22direction%22:%22DESC%22%7D%5D&start=0&limit=100&factCayenneExp=gamesPlayed%3E=1&cayenneExp=franchiseId%3D21%20and%20gameDate%3C=%222020-03-11%2023%3A59%3A59%22%20and%20gameDate%3E=%222019-10-02%22%20and%20gameTypeId=2"
my_raw_results <- httr::GET(my_url)
data_content <- httr::content(my_raw_results, as = "parsed")
complete_df <- jsonlite::fromJSON(data_content)
complete_df2 <- as.data.frame(complete_df)
```
Thank you for your help.
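From help(content):
There are currently three ways to retrieve the contents of a request: as a raw object (as = "raw"), as a character vector, (as = "text"), and as parsed into an R object where possible, (as = "parsed"). If as is not specified, content does its best to guess which output is most appropriate.
In this case, as = "parsed" is the problem: content() has already turned the JSON response into an R object, so fromJSON() no longer receives a JSON string and stops with the error you quoted. Requesting the body with as = "text" and passing that string to fromJSON() works.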
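The failure can be reproduced without calling the API. This is a minimal sketch: json_text is a made-up stand-in for the response body (not real API output), and parsed_list only approximates what content(..., as = "parsed") would return for a JSON response, which is an assumption about httr's behaviour here.

```r
library(jsonlite)

# Hypothetical stand-in for the response body; not real API output.
json_text <- '{"data":[{"lastName":"A","goals":2},{"lastName":"B","goals":1}],"total":2}'

# Assumption: as = "parsed" yields an R list along these lines.
parsed_list <- fromJSON(json_text, simplifyVector = FALSE)

# Handing that list to fromJSON() raises the error from the question;
# tryCatch() captures the message instead of stopping the script.
err <- tryCatch(fromJSON(parsed_list), error = function(e) conditionMessage(e))

# Handing it the JSON text works; the "data" array becomes a data frame.
result <- fromJSON(json_text)
skaters <- result$data
```

Here err holds the error message and skaters is a two-row data frame, which is the same pattern as switching the real request to as = "text".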
suppressPackageStartupMessages({
library(dplyr)
library(jsonlite)
library(httr)
})
my_url <- "https://api.nhle.com/stats/rest/en/skater/summary?isAggregate=false&isGame=true&sort=%5B%7B%22property%22:%22points%22,%22direction%22:%22DESC%22%7D,%7B%22property%22:%22goals%22,%22direction%22:%22DESC%22%7D,%7B%22property%22:%22assists%22,%22direction%22:%22DESC%22%7D%5D&start=0&limit=100&factCayenneExp=gamesPlayed%3E=1&cayenneExp=franchiseId%3D21%20and%20gameDate%3C=%222020-03-11%2023%3A59%3A59%22%20and%20gameDate%3E=%222019-10-02%22%20and%20gameTypeId=2"
my_raw_results <- GET(my_url)
data_content <- content(my_raw_results, as = "text")
#> No encoding supplied: defaulting to UTF-8.
class(data_content)
#> [1] "character"
complete_df <- fromJSON(data_content)
complete_df2 <- as.data.frame(complete_df)
head(complete_df2)
#> data.assists data.evGoals data.evPoints data.faceoffWinPct data.gameDate
#> 1 1 3 4 0.00 2020-02-17
#> 2 1 2 3 NA 2019-10-31
#> 3 1 2 3 0.44 2020-02-25
#> 4 1 2 3 NA 2019-10-08
#> 5 1 1 1 NA 2019-11-05
#> 6 2 1 2 NA 2020-02-25
#> data.gameId data.gameWinningGoals data.gamesPlayed data.goals data.homeRoad
#> 1 2019020915 0 1 3 H
#> 2 2019020194 1 1 2 R
#> 3 2019020968 0 1 2 R
#> 4 2019020042 0 1 2 H
#> 5 2019020231 1 1 2 H
#> 6 2019020968 0 1 1 R
#> data.lastName data.opponentTeamAbbrev data.otGoals data.penaltyMinutes
#> 1 Mangiapane ANA 0 2
#> 2 Tkachuk NSH 1 0
#> 3 Backlund BOS 0 0
#> 4 Tkachuk LAK 0 0
#> 5 Tkachuk ARI 1 0
#> 6 Tkachuk BOS 0 0
#> data.playerId data.plusMinus data.points data.pointsPerGame data.positionCode
#> 1 8478233 4 4 4 L
#> 2 8479314 1 3 3 L
#> 3 8474150 2 3 3 C
#> 4 8479314 3 3 3 L
#> 5 8479314 1 3 3 L
#> 6 8479314 3 3 3 L
#> data.ppGoals data.ppPoints data.shGoals data.shPoints data.shootingPct
#> 1 0 0 0 0 1.00000
#> 2 0 0 0 0 0.50000
#> 3 0 0 0 0 1.00000
#> 4 0 0 0 0 1.00000
#> 5 1 2 0 0 0.66666
#> 6 0 1 0 0 0.33333
#> data.shootsCatches data.shots data.skaterFullName data.teamAbbrev
#> 1 L 3 Andrew Mangiapane CGY
#> 2 L 4 Matthew Tkachuk CGY
#> 3 L 2 Mikael Backlund CGY
#> 4 L 2 Matthew Tkachuk CGY
#> 5 L 3 Matthew Tkachuk CGY
#> 6 L 3 Matthew Tkachuk CGY
#> data.timeOnIcePerGame total
#> 1 1125 1260
#> 2 1395 1260
#> 3 1094 1260
#> 4 1225 1260
#> 5 1294 1260
#> 6 1145 1260
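On the second question: since this endpoint returns JSON, httr plus jsonlite seems a reasonable fit, and no HTML scraping is involved. Note that the total column above reads 1260 on every row while the request used limit=100, so the response probably holds only the first page. The sketch below pages through the results by changing the start parameter. It assumes that total is the number of matching rows and that the API honours start and limit as offsets; page_starts(), set_start() and fetch_page() are hypothetical helper names, not part of any package.

```r
library(httr)
library(jsonlite)

# Offsets of each page, given the total row count and the page size.
page_starts <- function(total, limit = 100) {
  seq(0, total - 1, by = limit)
}

# Replace the value of the start parameter in an existing request URL.
set_start <- function(url, start) {
  new_value <- format(start, scientific = FALSE, trim = TRUE)
  sub("([?&])start=[0-9]+", paste0("\\1start=", new_value), url)
}

# Fetch one page and return its "data" element as a data frame.
fetch_page <- function(url) {
  resp <- GET(url)
  stop_for_status(resp)  # fail early on HTTP errors
  fromJSON(content(resp, as = "text", encoding = "UTF-8"))$data
}

# Usage (needs network access, so it is left commented out):
# starts <- page_starts(1260)
# pages <- lapply(starts, function(s) fetch_page(set_start(my_url, s)))
# all_rows <- dplyr::bind_rows(pages)
```

Taking the data element directly also avoids the repeated total column that as.data.frame() produces on the whole response.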
Very interesting! Thank you for the help; it is much appreciated. Would you mind if I asked you two more small questions?
No problem. Add them here if they are related. Otherwise, please start a new thread.