Audience
I firmly believe that one of the most impactful things you can do to build foundational skills for programming beginners is to invest the time upfront in getting to know your audience. For example, the two groups that I am getting prepared to work with are:
Colleagues with no stats or programming background, who come from previous experiences where data has been used in a punitive manner (poor results collected from flawed instruments being used to place people on probation or dismiss them from their position).
Young adults aged 18 - 24 who are in need of career training, generally do not have a high school diploma or GED, and are also dealing with housing instability.
These are two examples from a myriad of population segments, and hopefully help to highlight how understanding more about the audience can help you tailor the learning experience and determine the necessary foundational skills to cover.
Computer Literacy
People enter programming from a wide variety of backgrounds, often with inconsistent computer skills. While capturing and cropping a screenshot may seem like second nature to you, others may not even know this technology exists. Determine which computer skills learners will need in order to be successful in your course, and provide the necessary supports to help them develop those skills.
Skill building
While the application of skills is critical, there is something to be said for the practice and repetition of foundational skills. There are benefits to having learners work through sets of increasingly difficult coding problems and challenges tailored to a specific topic, similar to what's done in various Khan Academy courses.
Learning environment
How are learners expected to engage with the material?
Are the data sets used in the course meaningful and easy to relate to?
What are the accountability and motivational measures employed?
How readily does coursework translate to real-world applications?
In this vein, I've been thinking about the benefits of project-based learning, and how it can be applied to increase learner engagement and success in an online learning environment centered around R and data analysis/science.
@jessemaegan you are doing a great job organizing all this. Organizing such a large group of people with different skill levels is no small task.
I have been trying to build a platform for project-based mentorship and I can tell you that it's easier than it sounds.
In my case I teach #r4ds as part of the R-Ladies Orlando beginners' workshops, and so far students are keeping up. I ask for HONEST opinions after each session and make sure I incorporate them into the next one. Whenever I feel the students are overthinking things, rushing to know more, or feeling that they are in over their heads, I keep telling them: we are just learning how to drive the car first, not how the engine works, and not how to race in F1 tomorrow. We are just going to learn how to drive an automatic car. I started my first workshop with, "Today I am going to show you how to sit in the car and adjust the mirrors and the seat." In the second workshop I said, "Today I am going to show you how to move forward... that's it." They understand that none of us was born driving a car, and that driving is hard and scary at the beginning, but with time and practice they will get to enjoy the ride.
First of all, I admire your goals, in particular with your young students. I hope you all succeed!
These days, I find myself teaching R and data science skills to a group of social scientists who would fit in the same group as your colleagues. I decided to start with visualization and mapping after asking what the main challenge in their jobs was and finding out that it is dealing with georeferenced data with no GIS skills.
We've only completed two workshop sessions so far, but this is what I learned:
Teaching how to install R, RStudio and useful packages (e.g. tidyverse), and how to open and save a notebook, deserves a dedicated class. Helping students do all of this, while taking time to explain why it's necessary and how the pieces work together, is no small feat. It definitely deserves its own session, maybe as the first lesson.
Things will fail. The lovely R Notebook based exercise that you completed and ran so well on your Linux laptop will end up showing garbled characters on a student's Windows PC, or fail because it relies on online sources (à la ggmap's get_map()) while the classroom's WiFi failed that day.
So be careful when your code needs internet access to work (sometimes we forget this could be a weak spot!), but more importantly, be prepared for things to behave in unexpected ways: take it easy, smile, and insist on it being a great chance to practice dealing with surprise problems.
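One way to soften the WiFi problem is to cache the online result ahead of time and fall back to it in class. This is only a minimal sketch in base R, assuming the result can be saved with saveRDS(); the helper name load_with_fallback and the cache path are hypothetical and not part of ggmap or any other package.

```r
# Hypothetical helper: try an online fetch, fall back to a cached copy on disk.
# fetch_online is any zero-argument function that does the online work.
load_with_fallback <- function(fetch_online, cache_path) {
  tryCatch(
    {
      result <- fetch_online()
      saveRDS(result, cache_path)  # refresh the cache after a successful fetch
      result
    },
    error = function(e) {
      # No cached copy to fall back on: report the original problem.
      if (!file.exists(cache_path)) {
        stop("Online fetch failed and no cached copy exists: ",
             conditionMessage(e))
      }
      message("Online fetch failed; using cached copy at ", cache_path)
      readRDS(cache_path)
    }
  )
}

# Illustrative use (assumes ggmap is installed and configured; run it once
# before class with working internet so the cache file exists):
# basemap <- load_with_fallback(
#   function() ggmap::get_map(location = "your city"),
#   "basemap.rds"
# )
```

Running the exercise once before class fills the cache, so a WiFi outage during the session turns into a message instead of a stalled lesson.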
Your students will feel overwhelmed. Probably not all of them, but some. In particular, the ones who seemed most anxious about "not getting it" were not the least prepared, but the most eager to learn. I realize now that this is normal, and a feeling that I myself faced many times while learning. I decided to accept it as part of the process, and encouraged my students to relax, especially during the first lessons, and be like the "this is fine" dog (meme attached). They laughed and I think it helped defuse some of the anxiety.
As an extra challenge, I'm teaching in Argentina, so we speak (and read, and think in) Spanish. This makes it harder for me to find suitable supplemental material, as most of the great ones are in English! On the other hand, it forces me to write my own examples and exercises, which is time-consuming but also gives me the chance to practice and results in pretty tailored material.
I'm looking forward to learning from other people's experiences, and I'll keep you posted in case I come up with other tips worth sharing.
Like any method, PBL can either be transformative or a complete fiasco. I can see the benefits of using it to teach R, as it aligns very well with the applied goals of the typical person learning R. I use PBL in my own teaching for the engagement benefits, but I do so in a smaller, more controlled environment.
It does require a great deal of learner support, especially with novice learners, who need more scaffolding to keep them in the right space for learning. Monitoring that is critical in PBL methods. It is sometimes difficult to do in an online environment, but you have a big plus in being experienced in facilitating a positive learning community in #r4ds.
I would suggest starting less from the skills you want students to develop (learning objectives) and more from what you want your students to be able to do after the course (learning outcomes). These are students who have no idea how the skills we list will apply to them and their plans. "You'll learn how to clean data" tells them little, but if you tell someone "After this course, students are able to apply functions in tidyr and dplyr to clean, manipulate and organize a dataset", they have a clear idea of what this course will let them accomplish.
Starting from the outcomes (à la Wiggins) is a fantastic benefit to instructional design and really helps establish and maintain the alignment and purpose between curriculum, instruction and assessment. Instructional design is a wonderful thing, under-recognized by many of us who organize and develop workshops or other learning opportunities.
Every time I teach or help in a hands-on workshop I'm reminded how many people are not facile with their file system. Step 1 is often to download workshop materials, open the slides in a PDF viewer, and open a file of code in RStudio. For some this is second nature and happens fast.
Others immediately fall behind because they are not accustomed to downloading a file or an archive of files to a specific and deliberately chosen location, then navigating there in various apps, and opening individual files. By the time they've done it, they've missed the first 15 mins of material, which is usually critical setup and context.
I would dearly love to find a basic lesson and self-assessment tool for people to work on this skill on macOS and, especially, Windows!
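As a rough starting point for that kind of self-check, here is a minimal sketch in base R. The function name check_workshop_files and the file names are hypothetical, made up for illustration. It only reports whether the expected files sit in the folder a learner says they downloaded them to, which is a much narrower thing than a full lesson.

```r
# Hypothetical pre-workshop check: are the expected files where R can find them?
# folder: the directory the learner says they saved the materials to.
# expected_files: file names the workshop will ask them to open.
check_workshop_files <- function(folder, expected_files) {
  if (!dir.exists(folder)) {
    # The folder itself is missing, so nothing in it can be found.
    found <- rep(FALSE, length(expected_files))
  } else {
    found <- file.exists(file.path(folder, expected_files))
  }
  data.frame(file = expected_files, found = found)
}

# Illustrative use with made-up names:
# check_workshop_files("~/Downloads/workshop", c("slides.pdf", "analysis.R"))
```

Sending something like this to learners ahead of time, and asking for the output back, could surface file-system trouble before the first 15 minutes of the session are lost to it.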
This is a fantastic example of computer literacy skills that are easy to take for granted, and I love the idea of having some kind of self-assessment that helps identify gaps, then facilitates the building and mastery of necessary pre-requisite skills.
There's got to be a recent computer skills concept inventory floating about. If not, it would be a fantastic tool that I'd immediately use in the Data Carpentry workshops. Bash lessons make for many sad faces.
This is huge! It's one of the reasons I both like and dislike online interactives: on the one hand, you get up and running; on the other, you might delve deeply into Ruby, or JavaScript or whatever, without knowing how to set things up on your own machine.
This is my first Discourse-based forum, so I wasn't familiar with discobot, and I was thinking about how cool it would be to have a step-by-step like that for beginner skills that you need as a base for anything else in R. Obviously the filesystem case would be tricky if actually done natively (way out of my area of expertise), but I do like the idea of a generic sort of self-paced skill up.
I'm not a Windows user, and I remember taking a screenshot for someone on their Windows machine, and totally struggling to find it. In this case, it was on the desktop, but near-impossible to see because it was in a pile of file icons.
I think a highly modular (like, really broken down into different pieces) almost checklist-style approach could be helpful since there are so many little things that a fairly computer-savvy person can simply miss along the way. (e.g. Despite the cringe-worthy simplicity of my little Vagrant fiasco a few years back, I wouldn't have considered myself "computer illiterate" at the time).
I'm sure teachers among you (us? I'm not a teacher... I don't know what pronoun goes here) have a sense of stumbling blocks, but it also could be helpful to solicit feedback... @jessemaegan, the r4ds group would be perfect for this!
I'd have to think hard about good wording, because it's really about us, and not a shortcoming on their part. The most valuable feedback would be re something that was treated as obvious/a given, but wasn't for that individual--and it's not always easy to get people to open up about that.
Even R suffers this problem. Recently I've been helping an actuarial science professor teach basic R skills to her students instead of the normal homework for class. We have been using a combination of:
DataCamp for the Classroom which is free for educational use and allows me to track student progress. I have been assigning about 1/2 of a course per week. 2 courses down so far and they seem to enjoy it.
I spoke to them in person once and went over a full case study related to actuarial work. Most of them are used to Excel, so I did the case study first in that, and then proceeded to recreate it in R with tidyverse tools. Along the way I pointed out things you have to think about in the R implementation versus Excel and why you might use one tool over the other. The best thing about this is that it wasn't just me preaching to them. They actually saw the differences and appreciated them (at least a little) as I brought them up.
Now that they have some DataCamp courses under their belt and have seen an example of actuarial work implemented in R, I have just assigned a small case study for them to do on their own. It is due in 3 weeks, but next week I plan on answering any questions they may have run into so far. Kind of like in-class office hours.
The only issue with using DataCamp is that they don't get an installation of R/RStudio on their own machines. I dedicated half of the last class to also working through installing R and RStudio. Their (completely understandable) lack of knowledge about navigating a file system, figuring out how to open an R script, and a number of the other things we take for granted only further frustrates them because they never had to do that with the DataCamp course.
I really like DataCamp as one part of a teacher's toolkit, but there is a lot of extra work required to get things up and running locally for each of them. A self-assessment tool on computer literacy skills would definitely make that slightly less painful.
Precisely! I did a bunch of CodeSchool courses one summer, which basically have the same interactive setup as DataCamp, and found myself in that same spot.
The advantage is, you've got a motivated skill gap, since you know what you can do with the tools you've learned. I'm not anti-in-browser learning! But it'll be interesting data/feedback for you to collect in seeing what issues arise during office hours.
I really like this thread. Lots of good stuff said already.
I've been trying to prepare training materials, tutorials, documentation, and live training sessions at my company for a while now, similar in idea to what Airbnb does with their Data University. What I've come to realize is very similar to what you described in your introduction -- it very much depends on the background of the student.
In addition to my take on training for data science, I also told our head of HR we need to invest in basic computer literacy training for the company. I've personally seen little things in passing that have made me stop and give tips/recommendations that I would expect anybody working an office job to know. Some examples:
going to Google.com in order to do a web search instead of just typing the search into the URL bar
using the OS calculator app only to then manually type in the resulting number into a cell in a spreadsheet
asking me to share xlsx files instead of csv files because they don't have an app to open them
Even among experienced programmers and data scientists, I've worked with people who are really good at the core competencies of their role (like front end stuff or stats or machine learning), but get frustrated because they don't really understand how computers or operating systems work (things like managing memory or file permissions). Another common example is professional programmers being afraid of the command line.
I feel confident teaching those computer literacy skills, but it's really hard for me to not assume what other people should know. Do you have any advice for how basic I should get in my trainings? Or how to go about deciding what expectations I should have, like maybe a framework or checklist to go through in preparation?
If you're interested, I would lovelovelove to collaborate with you on this!
One of my biggest priorities for the next fiscal year is to facilitate the transition to a data-driven organization, and so much of that starts with computer literacy. I've seen all of those examples that you've mentioned, along with my personal favorite: I only save things on my desktop.
I've been trying to think of things in terms of "where do I see this organization in three years, in terms of what technology they're using, how they're using it, and how they're training others to use it."
I am currently facing the same situation inside my company. We are trying to shift the mindset toward being data-aware and data-driven through internal open data projects, an open source development initiative in the company, and helping teams with their data needs. There is obviously a big skills gap among beginners who want to use data for projects but do not know how and, curiously, also among experienced teams who have been using data for some time without knowing the technology and tools they use well enough. This lack of skill leads to a lack of efficiency and consistency between projects in the company.
I have been thinking seriously for some time about how to build momentum in the company with training, experience sharing, documentation... We began on the development side, helping teams use git for their projects and studies, an obvious tool for collaboration. We've launched a network of people who are here to listen, help, review code, and advise on good practice for all kinds of projects (development or studies, with R, Python, JS, HTML...). It is obviously aimed at people who already have some skills. From my point of view, it is very difficult to find a good way to help beginners. I have thought about assessment but have not managed to put it in place.
I am not only talking about R skills but also about basic computer skills and data-oriented skills in general.
On the R side, I am trying to find a way to start an R4DS initiative to help people get started with data, using R as a tool to get results quickly.
Sometimes I feel a little alone in my company dealing with this issue. It is great to see others have the same issues to deal with and are trying to find solutions.
@raybuhr I would love to follow where you go with this together with @jessemaegan. I see you will DM each other. Please keep us informed about your discussion and conclusions, as it is very, very interesting. I could participate, but I feel I am behind you in terms of initiative and thinking on this subject. Thanks.
I think that covering installation/setup in greater depth has already been emphasized, but I wanted to further hit on some points.
Audience
In my biomedical science program, we were taught R/RStudio/Rcmdr for our completely online biostatistics course. So many of my fellow students never really wrapped their heads around the fact that Rcmdr and RStudio shared the same R backend, just with different overlays. The concept of an IDE was completely foreign to our non-programmer class, although many people had been using SPSS, SYSTAT, Excel, Origin, etc. for statistics and graphing for at least 1-2 years at this point. These were motivated, educated academics, but the way that we were introduced to R was poorly done, and led to many people fighting back against it. I think Davis had a good approach (albeit probably a longer process and focused on in-person education).
"Why use a programming-based program when I can just point and click in SPSS/SYSTAT/etc." -- Many students in the course. They were hard to convince to switch from the "ease" of a GUI compared to a programming based statistical suite as almost all of the stats were on tiny (and already tidy) datasets.
I think it is so important to show how the proposed process is better and then build it up from there. The light bulb clicked for me when I saw the limitations/time costs of doing some things by hand in Excel versus programming/automation in R. Once you are convinced R is better, it is much easier to self-motivate to learn various things outside the scope of the course/community/task at hand.
The other big takeaway was the students' desire for video-based examples, as some people had difficulty following along with purely text-based examples in the online-only course. Video fills in the gaps with some of the basic computer functions, and often explains WHY something is being done AS it is being done, which allows highlighting the feature being taught. Gifs can also accomplish this, but sometimes the voice-over and the ability to pause and rewind on demand are more helpful.
Finally, I get the appeal of Rcmdr, but good lord did it set our group back in the actual use of programming-based R via RStudio and saving/writing scripts.
I've worked really hard to build an ecosystem where finding the data you want is easy, but it gets harder the more we grow. In fact I'm in the middle of completely redoing it all with new technologies because of how unorganized and slow the process became. Even with the right data and tools, people still have a hard time. Building a good search tool is both critical and ridiculously hard. Same goes for writing documentation that is useful, helpful, and concise. Same goes for creating training material that is engaging, yet practical and memorable.
Learning is an iterative process, and I've come to realize that teaching is too.
I am currently trying to build something for different uses, different people and different skill levels, and I am really wondering if we are doing it right. I strongly believe that teaching, supporting and working closely with teams are essential to success. This does not seem obvious to everyone, and I am struggling to convince people. I need to work on some demos, learning sessions and other ideas to prove this is useful. I will continue to follow this thread to share more on this. Thanks.