I wrote a blog entry here: Keeping Busy with Data Science · Teach Data Science. The text of the blog is given below.
I hope that you and your students are well and weathering this storm smoothly.
It has become increasingly clear that many college students have found themselves without summer plans. Unfortunately, this blog entry is not a list of possible employment opportunities. Instead, it is a compilation of statistics and data science projects to enhance a summer spent socially distant.
The list below represents opportunities at a variety of levels. Whether you are just beginning or quite advanced, there are many ideas for you.

A note on data science: I often get queries about preparing to pursue data science during or after college. My best advice is: do data science. That is, worry less about which classes to take or which graduate schools to apply to. Instead, worry more about learning data science skills, becoming proficient in data wrangling, and thinking critically about problem solving. Take some statistics, math, and CS classes (it doesn’t matter hugely which classes). If you have a solid background in statistics, math, and CS, plus some decent data science chops, you will be able to accomplish whatever you want.

A note on software: The majority of the resources and links below are for doing data science in R. There are many good software options; however, the resources for getting started in R are outstanding, and you are highly encouraged to check them out.
 While not necessarily the first task that you should undertake this summer, the first recommendation is to set up a GitHub account and use it to post anything you do. Each project should be a separate repository, and you should make sure to always have a README file so that others (and you six months from now) can easily see what you’ve done.
If you are at all serious about doing data science at any point down the road, now is the time to start collecting your data projects into a single place so that your work can be highlighted.

The ultimate site for getting GitHub up and running (and talking with RStudio) is: https://happygitwithr.com/

If you’d like to set up your own website, try using Distill for R Markdown: Creating a Website
Are you more advanced?

make your own website using bookdown, bookdown: Authoring Books and Technical Documents with R Markdown

or a blog using blogdown, blogdown: Creating Websites with R Markdown
 Particularly if you are new to R, an amazing book to work through is “R for Data Science” by Grolemund & Wickham (https://r4ds.had.co.nz/). There are many problems you can try out, and the text provides a wealth of ideas for working through data analysis problems. Even if you have been using R for many years, my guess is that the text contains many opportunities to learn how to work with new data structures.

interactive tutorials for working through “R for Data Science” at Posit Cloud

for a good start to R in general, check out RStudio Education

There is a more advanced companion to the Grolemund & Wickham text that you might want to try out if you are an advanced R user (“Advanced R” by Wickham, https://advr.hadley.nz/). The advanced text includes quite a bit on programming and why R works the way it does.
 Interested in modeling?

for modeling in R, visit the new tidymodels Get Started page: tidymodels - Welcome!

arguably the best text on statistical learning models (and freely available!) is “An Introduction to Statistical Learning with Applications in R” by James et al. http://faculty.marshall.usc.edu/garethjames/ISL/
 Interested in text analysis / natural language processing?
 Tidytext tutorials:
 Chapter 11: tidy text:
stringr cheat sheet: https://github.com/rstudio/cheatsheets/raw/master/strings.pdf
 Text datasets to get started with:
 Blog post(s) by Julia Silge:
 Julia Silge & David Robinson’s book:
 Implementation of Google’s BERT framework into R:
 One of the most fun things you can do is to practice doing data science. Below are ideas you could work on for one afternoon or commit a few weeks to figuring out. Choose projects that seem fun and where you might be able to bring a creative approach.

Tidy Tuesday: every Tuesday a new dataset is posted, and individuals (separately and collaboratively) work to visualize the dataset. Details at GitHub - rfordatascience/tidytuesday: Official repo for the #tidytuesday project.

Kaggle.com: an online community of data scientists who build models, working together to come up with optimal predictions. You can compete in an ongoing Kaggle competition, or you can work through an old competition where many teams have shared their work and their ideas.

Work through a COVID-19 analysis. It is worth noting that the currently available case data are likely to be underreported (both cases and deaths, across most countries), which makes modeling the actual data somewhat problematic. Instead, you might try to model COVID-19 related data (e.g., flights in the US, unemployment, emissions, weather patterns, etc.).
Two good resources that have been collecting analyses and other information are: http://www.stat.cmu.edu/~kass/covid.html and GitHub - minecetinkayarundel/covid19r: Collection of analyses, packages, visualisations of COVID-19 data in R

data for social good: Competitions provides real data and structures (similar to Kaggle) for working through models and coming up with predictions, all on data that benefit the social good.

Other data science competitions: Top Competitive Data Science Platforms other than Kaggle | by Parul Pandey | Towards Data Science
 Register for a (free) shiny account, and create a shiny dashboard to highlight the work you are doing!

account: https://shiny.rstudio.com/

gallery: Shiny for R Gallery

Shiny contest 2020 (deadline passed, but you can practice for next year!): Shiny Contest 2020 is here! - Posit
 Interested in art? Make art with data and R!

Getting started with generative art in R: https://djnavarro.net/post/unpredictablepaintings/

No code, but inspiration: Thomas Lin Pedersen - Generative art by Thomas Lin Pedersen

Follow posts on Twitter: https://twitter.com/search?q=%23generative%20%23rstats&src=typed_query


Illustrations — don’t want to code but draw instead?
 Illustrate your learnings. See GitHub - allisonhorst/statsillustrations: R & stats illustrations by @allison_horst for inspiration.
 Hand drawn data visualizations. See THE PROJECT — Dear Data for inspiration.
 Learn some new stuff from videos & webinars!

RStudio has a wealth of amazing videos: https://resources.rstudio.com/

Dave on Data’s YouTube channel: https://www.youtube.com/channel/UCRhUp6SYaJ7zme4Bjwt28DQ

Coursera + Johns Hopkins Data Science Specialization: https://www.coursera.org/specializations/jhudatascience
 Participate in the data science community! Engage on RStudio Community (https://forum.posit.co/) or Stack Overflow (https://stackoverflow.com/), platforms for asking and answering all the questions. Stack Overflow is more comprehensive, but it can be aggressive and unhelpful at times. RStudio Community is a great place for beginners and way less intimidating than SO.

be sure you know how to make a minimal reproducible example: r faq - How to make a great R reproducible example - Stack Overflow

use the reprex package in R: reprex: Help me help you · Teach Data Science
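
To give a rough idea of what a minimal reproducible example can look like, here is a small sketch in base R. The data frame, the variable names, and the question being asked are all made up for illustration; the point is only that everything needed to reproduce the behavior fits in a few lines, with no external files or extra packages.

```r
# Hypothetical question: "Why is the mean for group a missing?"
# Everything needed to reproduce it is right here:
# a tiny made-up data frame, no external files, no extra packages.
df <- data.frame(
  group = c("a", "b", "a"),
  value = c(1, 2, NA)
)

# The behavior being asked about: group "a" comes back as NA
# because one of its values is missing.
means <- tapply(df$value, df$group, mean)
print(means)

# One answer a helper might suggest: drop missing values first.
means_no_na <- tapply(df$value, df$group, mean, na.rm = TRUE)
print(means_no_na)
```

If you copy code like this to the clipboard and run reprex::reprex(), the reprex package renders the code together with its output, ready to paste into a forum post.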
 Write an R package!
You'll be surprised to learn that creating your own R package can be reasonably straightforward. Fantastic step-by-step instructions help you put the package together.

how to do it: https://rpkgs.org/

really great example to walk through: https://www.erikhoward.net/blog/howtocreateanrdatapackage/

write a package that contains (only) a cool dataset: Creating R data packages for teaching · Teach Data Science

Jo Hardin
Pomona College