I'm in the very early stages of planning out a three-year vision, so I'm more than happy to collaborate. I've been reading Creating a Data-Driven Organization to get a foundation of directions to go.
yes!
I've been thinking a lot about differentiated instruction, and there's a lot of room for videos as well as visuals that help to reinforce the material, as well as engage those who struggle with traditional direct instruction methods.
I work at a university and routinely work with analysts from other Faculties and offices across the institution. Part of my job is getting programs to engage with and use data for improvement. I am surprised when I speak to others with similar jobs and they aren't familiar with many aspects of scientific computing (reproducibility, version control, console work, programming basics, effective tools). It's simply SAS/SPSS/Excel with numbers coming out of thin air.
The main reason, I think, is training. People are in these roles because they have some skill in analysis but do not learn about the effective use of tools or even get introduced to programming. Most are unconvinced that changing their comfortable tools or learning a new one has any particular benefit to them. It is a deeply entrenched belief that has existed for some time ("This is the way we do it here") and the only way around it has been attrition.
I would love to see more effective practice or a transition to a data-driven institution that uses its resources effectively and efficiently, rather than the status quo.
I would love to contribute to the project @jessemaegan and @raybuhr, if I can be of any help.
I'm currently the vice president of Data Science at UCSB, where I often find myself teaching people of varying skill sets. A technique that seems to work is project-based learning. DataCamp and similar resources are great, but they're limited when it comes to actual application. The feedback I've heard from people who use DataCamp is that it's great for teaching the skill set, but they are still unsure how to apply what they've learned.
Since most of us are undergrads, we don't have much experience applying data science, but we found a formula that works for the projects we tackle, so we've taken that approach. More info here.
Setting up environments is probably the biggest issue when it comes to teaching. We found that it helps to create tutorials for users to go back to, or else you're just repeating the same walkthrough over and over again. This can be done with a Markdown file on GitHub, but we created a platform where we keep all our documentation so that people can reference it when we're doing hands-on learning. See here
But I would love to hear more about this since I'm always learning and would like to learn from other people, since my goal is to start teaching R and data science to a much wider audience!
Thanks for the book recommendation!
I'd also like if people keep discussing things in this thread instead of private messages. With teaching, there are few general truths, but a lot of shared difficulties. What you may think is cluttering the thread could help or inspire somebody.
I'm trying to push my colleagues in state government away from SAS and a dismissive view of "programming". Managers are quick to encourage ideas, but money and manpower are tight. We're sticking to a monthly R User Group meetup. My threadbare plan is the GNU model: use R and good programming practices (Git, shared databases) to make awesome stuff. The motivated analysts will want to join the fun. The unmotivated will be pushed by managers who want them to make awesome stuff.
The next big thing on our plate is training. And, after reading this thread and reflecting on what I've seen, basic computer skills intimidate a lot of our analysts. It's hard to teach somebody Git when they've never used relative file paths or didn't even know that files other than .txt can be read in Notepad.
Happy to keep things transparent. However, I know that these conversations will veer very far from the world of R and RStudio and eventually become only tangentially related to the forum. There aren't explicit guidelines around content for this site at this point in time, but I do want to address the fact that R and RStudio will likely play a relatively small part in the conversation, and solicit feedback on whether or not this would still be the appropriate place to continue it.
If people here want to collaborate on something, maybe start a GitHub repo where you build out some curriculum or something? A great way to move forward on this, and it would still be transparent. Just a thought.
--
On general literacy: I find that incorporating 15 mins or so of "setup time" at the beginning of every workshop I do is a great way to get people up to speed, and everyone can help each other. Indeed, so many minor computer things are taken for granted, as people have listed above!!
If we want to move elsewhere because this is unrelated to the tidyverse (and I think it will be, too), maybe a Slack group could be the place? People could join the discussion there.
That may be the best solution, with more concrete deliverables being codified somewhere like GitHub and perhaps (eventually) a website, and sharing back into the RStudio community where relevant.
Yep, a combination of all these great tools could work: GitHub for organising ideas and tasks, Slack for overall discussion and for small groups talking about specific tasks, and a bookdown website for sharing work and conclusions (as a reference resource, with links to specific tidyverse and R topics here).
This is a great thread with some excellent ideas. I'd love to contribute to either a GitHub repo or Slack group!
I teach social science undergraduates, and relate to many of the things @jessemaegan, @raybuhr, @cderv, @TMock, and others have raised. I encounter what @jennybryan shared about downloading and saving files to a specific location every semester, and it usually takes a few sessions before everyone is comfortable.
This is super interesting to me, because I have zero teaching experience and am in the middle of designing my first workshop specifically aimed at non-programmers. Climate science is a coding-heavy field, but, being at the intersection of maths and geoscience, it attracts fairly skill-diverse students. Intro to git courses are great for people who have coding skills and just haven't been exposed to git, but they wash over people with more of a geoscience bent who often (painting with a very broad brush here) haven't touched code during their undergrad at all.
In the latter case, I think the cognitive load of (a) touching a command line for the first time, (b) learning the concepts of git for the first time, and (c) juggling everyone's various environments is just too much. I'm trying to design a workshop that skips the command line entirely and just uses the GitHub Desktop client to build a GitHub Pages website. The client comes with its own difficulties (like the fact that it's pretty oblique UI-wise), but I'm hoping to just demonstrate the utility of version control and send people away with a useful product so that they'll be a bit more receptive the next time they encounter git.
I know it's been said here, but things like PATH and environment variables can be so confusing... Hence, I was extremely excited to see this yesterday: A Data Scientist's Guide to Environment Variables by Eric Ma.
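For anyone who finds PATH abstract, here is a minimal sketch of what it is underneath: just a string of directories that the system searches, in order, when you type a command. This is my own illustration, not taken from the guide; I've used Python only to keep it short, and the helper names `path_entries` and `read_setting` and the variable `WORKSHOP_DATA_DIR` are made up for the example.

```python
import os
import shutil


def path_entries(path_value=None):
    """Split a PATH-style string into its individual directories.

    With no argument, the current process's PATH is used.
    """
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    # os.pathsep is ":" on macOS/Linux and ";" on Windows
    return [entry for entry in path_value.split(os.pathsep) if entry]


def read_setting(name, default=None):
    """Read an environment variable, falling back to a default if unset."""
    return os.environ.get(name, default)


if __name__ == "__main__":
    # Directories are searched in this order when you type a command
    for directory in path_entries():
        print(directory)

    # shutil.which reports where a command was found on PATH (None if absent)
    print("git found at:", shutil.which("git"))

    # WORKSHOP_DATA_DIR is a hypothetical variable, used only for illustration
    print("data dir:", read_setting("WORKSHOP_DATA_DIR", "not set"))
```

Running something like this at the start of a workshop might help show learners why a freshly installed tool "can't be found": its folder simply isn't in that list yet.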