R script failing in GitHub Actions

Can someone please look at the workflow run log below and tell me why GitHub Actions fails at this step when running my R script?

Run Rscript AljaHeadScraper.R
  Rscript AljaHeadScraper.R
  shell: /bin/bash -e {0}
  env:
    R_LIBS_USER: /Users/runner/work/_temp/Library
    TZ: UTC
    _R_CHECK_SYSTEM_CLOCK_: FALSE
    NOT_CRAN: true
── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──
✔ ggplot2 3.3.5     ✔ purrr   0.3.4
✔ tibble  3.1.6     ✔ dplyr   1.0.7
✔ tidyr   1.1.4     ✔ stringr 1.4.0
✔ readr   2.1.1     ✔ forcats 0.5.1
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ readr::guess_encoding() masks rvest::guess_encoding()
✖ dplyr::lag() masks stats::lag()
Error in file(file, ifelse(append, "a", "w")) :
  cannot open the connection
Calls: write.csv -> eval.parent -> eval -> eval -> <Anonymous> -> file
In addition: Warning message:
In file(file, ifelse(append, "a", "w")) :
  cannot open file 'data/Headlinks.csv': Not a directory
Execution halted
Error: Process completed with exit code 1.

Below is the R script I am trying to run on schedule:

library(rvest)
library(tidyverse)

aljurl <- read_html(paste0("https://www.aljazeera.com/"))

headlinks <- aljurl %>% 
  html_nodes(".u-clickable-card__link") %>% 
  html_attr("href")

links <- data.frame(
  date = Sys.Date(),
  headline_links = headlinks
)

write.csv(links,file = paste0("data/Headlinks.csv"))

It is a simple web scraping script. For some reason, GitHub Actions fails to run it even though every other step of the workflow completes. Below is my YAML file:

# Hourly scrape headlines
name: HeadlineScraper

# Controls when the action will run.
on:
  schedule:
    - cron: '0 * * * *'

jobs:
  autoscrape:
    # The type of runner that the job will run on
    runs-on: macos-latest

    # Load repo and install R
    steps:
    - uses: actions/checkout@master
    - uses: r-lib/actions/setup-r@master

    # Set-up R
    - name: Install packages
      run: |
        R -e 'install.packages("rvest")'
        R -e 'install.packages("tidyverse")'
    # Run R script
    - name: Scrape
      run: Rscript AljaHeadScraper.R

    # Add new files in data folder, commit along with other modified files, push
    - name: Commit files
      run: |
        git config --local user.name github-actions
        git config --local user.email "actions@github.com"
        git add data/*
        git commit -am "GH ACTION Autorun $(date)"
        git push origin main
      env:
        REPO_KEY: ${{secrets.GITHUB_TOKEN}}
        username: github-actions

I have been on this issue for several weeks and I am yet to find a solution to it. Google hasn't helped me either.

The error means that the script is trying to write the file data/Headlinks.csv, but there is no data directory in the working directory, so the file cannot be created.
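One way to guard against this is to check the target directory in the workflow before the script runs. The following is only a sketch under assumptions: `ensure_dir` is a hypothetical helper name, not something from the original workflow, and it assumes the step runs under bash as the log shows. It creates the directory when it is missing and refuses to continue when something with that name exists but is not a directory, which is the situation that typically produces a "Not a directory" message.

```shell
# Hypothetical helper (not part of the original workflow): make sure a path
# is a usable directory before a script writes into it.
ensure_dir() {
  local dir="$1"
  if [ -d "$dir" ]; then
    return 0                      # already a directory, nothing to do
  fi
  if [ -e "$dir" ]; then
    # Something with this name exists but is not a directory, so writing
    # to "$dir/<file>" would fail.
    echo "error: '$dir' exists but is not a directory" >&2
    return 1
  fi
  mkdir -p "$dir"                 # missing entirely, so create it
}
```

In the Scrape step this could be used as `ensure_dir data && Rscript AljaHeadScraper.R`, so that the script only runs once the output location is known to be writable as a directory.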

Thanks, I have actually changed the file name to "store", committed the change, and tried to write to that file, but the same problem persists. It keeps referring to the nonexistent "data" file.

Well, the error above is "Not a directory", not a non-existent data file, so this is something different. To say more, you would need to show the updated script and its output.
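To tell the two cases apart on the runner, a small diagnostic step can report what the path actually is after checkout. This is a sketch, and `path_kind` is a hypothetical name chosen for illustration. A path reported as "missing" usually goes with a "No such file or directory" error, while "not-a-directory" matches the "Not a directory" message in the log above.

```shell
# Hypothetical diagnostic helper: report what kind of entry a path is.
path_kind() {
  if [ -d "$1" ]; then
    echo "directory"              # a real directory, files can be written inside
  elif [ -e "$1" ]; then
    echo "not-a-directory"        # exists, but as a regular file or similar
  else
    echo "missing"                # nothing with this name exists
  fi
}
```

Running `path_kind data` in a step just before the Rscript call would show whether the repository contains a file named data, a directory named data, or neither.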
