How to automate reports and scrape web data using RStudio

I am new to R and am working on a project to automate report generation and scrape data with R for tracking diseases.

Since I have no prior experience with R apart from regression and logistic analysis, I seriously need guidance on how to go about this project, both the coding and the methodology to follow. Any help in this regard would be awesome.

Thanks in advance!!

Hi, welcome!

Please have a look at our homework policy. Homework-inspired questions are welcome, but they should not include verbatim instructions from your course.


Thanks, I will surely keep the forum guidelines in mind. This was actually a summary of what needs to be done, not the exact question itself. I'm just not sure how to go about it.

It would help if you mentioned the following:

  • What you want to do (ultimate goal)
  • What you can do (or what you have already done)
  • What you can't do (where you need help)

Here are some packages (libraries) that you can start with.

  • For scraping data, RSelenium can be used.
  • For automating the work, cronR can be used.
  • For generating reports, rmarkdown (R Markdown) can be used.
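As a rough sketch of how the last two packages could fit together: the code below assumes a report file called disease_report.Rmd and an output folder called reports, and it uses a job id of disease_report. All of those names are made up for illustration. It only defines functions, so nothing is rendered or scheduled until you call them yourself.

```r
# Sketch only: file names, folder names and the job id are hypothetical.

# Build a dated output name, e.g. "disease_report_2020-01-15.html"
report_file_name <- function(date = Sys.Date()) {
  paste0("disease_report_", format(date, "%Y-%m-%d"), ".html")
}

# Knit the (assumed) R Markdown file to HTML.
# Needs the rmarkdown package and pandoc to be installed.
render_daily_report <- function(rmd = "disease_report.Rmd", out_dir = "reports") {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  rmarkdown::render(
    input         = rmd,
    output_format = "html_document",
    output_file   = report_file_name(),
    output_dir    = out_dir
  )
}

# Register a daily cron job that runs a script which calls render_daily_report().
# cronR works on Linux/macOS only, and script_path must point to an existing file.
schedule_daily_report <- function(script_path) {
  cmd <- cronR::cron_rscript(script_path)
  cronR::cron_add(cmd, frequency = "daily", at = "7AM", id = "disease_report")
}
```

Calling render_daily_report() once by hand is a good way to check the report knits before handing the script to schedule_daily_report().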

Bests.


Thanks a lot for the guidance.

The ULTIMATE GOAL is to scrape data from websites (HTML/CSS), automate report generation in PDF/HTML format (output), and then create interactive dashboards using Tableau. I already know how to make dashboards in Tableau, but I really don't know how to scrape data using R (which packages to install, which reference code, guides or tutorials to follow) or how to automate report generation.

I have tried the RCurl package and readLines() for reading and parsing data directly from the websites, but I am still a long way off from getting it right.

I think:

  • Trying the httr package first for scraping HTML/CSS web pages would be good.
  • See this link

before generating the report.
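To make the httr suggestion a bit more concrete, here is a minimal sketch that downloads a page with httr and pulls the first HTML table out of it with rvest. Pairing httr with rvest is my own assumption, not something required, and it needs rvest 1.0.0 or later for html_element(). The user agent string and the sample HTML are invented for illustration, so swap in the real page you want to track.

```r
library(httr)
library(rvest)

# Download a page and return its HTML as one string.
# The user agent text is a made-up example.
fetch_html <- function(url) {
  resp <- GET(url, user_agent("disease-tracker-example"), timeout(30))
  stop_for_status(resp)  # fail loudly on 4xx/5xx responses
  content(resp, as = "text", encoding = "UTF-8")
}

# Parse an HTML string and return its first <table> as a data frame (tibble).
parse_first_table <- function(html_text) {
  page <- read_html(html_text)
  table_node <- html_element(page, "table")
  html_table(table_node)
}

# Offline example with invented data, so no website is needed to try it out.
sample_html <- "
<table>
  <tr><th>disease</th><th>cases</th></tr>
  <tr><td>measles</td><td>12</td></tr>
  <tr><td>mumps</td><td>7</td></tr>
</table>"

cases <- parse_first_table(sample_html)
print(cases)
```

For a live page the two steps chain together as parse_first_table(fetch_html(url)). If the page builds its content with JavaScript, this approach will not see it, and that is where something like RSelenium comes in.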


OK, I will start with httr. Thanks a lot again for helping me out.

Since it sounds like you're still in the 'how do I even approach these kinds of problems with R?' phase of your project, I'd also like to encourage you to check out Garrett and Hadley's book, R for Data Science (https://r4ds.had.co.nz/). It contains a nice collection of chapters on various tools useful for a project like this, and it is a nice on-ramp to R and the Tidyverse generally.

When you do get to the point where you are looking for help with a specific coding question, I'd encourage you to check out FAQ: Tips for writing R-related questions. It's a nice guide for improving the odds that folks who want to help can understand your problem and quickly reply with useful suggestions.
