Hi all,
I’ve been engaged to write an intermediate statistics book using R, and I’m having a moment of crisis.
I’ve got my own ways of doing things in R, and, as with most of you, they’ve just sort of grown up naturally. Now that I’m putting out something in print regarding stats using R, I feel the need to think it through and solicit other opinions.
Although this is a book about statistics and not about R, I want to develop good R habits in the reader. But what are they? I’d say R+RStudio is one. Is adding R Commander another?
Should I use base graphics, or ggplot2?
Should I introduce the whole tidyverse, or just pick out parts I need, like, say, the text mining?
There are 400 clustering methods, and I want to do hierarchical, k-means, maybe SOMs... I’ve got a package, but I wonder if there’s one with more support, or a better future? I’ll keep the one I use to myself for now.
I’m asking the same of myself for Discriminant Analysis. And Design of Experiments (including optimal, at least D-optimal). And Neural Nets. And Regression trees. And MDS. And Item Response Theory. MANOVA. And on and on.
So maybe you could do this. If you feel like you have a strong opinion on the tidyverse question for a stat book, please say it. Same if you do something like one of the topics I mentioned: is there a package you’re devoted to? Or is there one that makes you think, “Boy, I wish someone had shown me package X when I was starting”?
Clearly, this isn’t a very specific question. I’m assuming readers already have a very basic level of R knowledge; maybe they used it in an intro class. No matter how much I learn, there always seem to be others whose brains I want to pick. If you feel you can help, pray do.