Hi, I've been trying to become proficient in R for a while. One of my strategies is to find a website that has an example for how to do something I am interested in and work my way through the project.
The trouble is that no matter what I try, the provided code always fails somewhere. Sometimes it's my fault, as I haven't installed a library or haven't read the instructions properly. Many of the issues I can solve myself by modifying the provided code. However, most of the projects I try fail in some way that I can't find a sensible reason for.
Yesterday I tried to work my way through https://www.rayshader.com/ . It's a great way of creating 3D maps and adding clouds, sun movement, etc. I'm sure it really does work well. Having managed to get through many issues, I got to the point where the only suggested way to solve the error was to uninstall R and RStudio and then try again!
Is it me doing something wrong or making a false assumption about using R?
Surely not everyone has these sorts of continuous issues?
I think this is normal if you are trying random things that were written at different points in time. You have to consider that the R ecosystem is open source and constantly evolving. There are a lot of package contributors with different coding preferences and different amounts of time to devote to their projects, so breaking changes are to be expected if you do not manage reproducibility on your side.
There are many ways people try to keep consistent environments for reproducibility, ranging from managing package versions with renv to using containers, but blog posts and tutorials rarely bother with these things.
The renv package isn't something an end user can apply retroactively without knowing what versions were used in the examples, which brings the problem full circle.
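To make the "which versions were used" point concrete, here is a minimal sketch of what a tutorial author could run and paste at the end of a post. It uses only base R; the helper name record_versions is hypothetical, and the two packages listed are just placeholders for whatever the tutorial actually loads.

```r
# Hypothetical helper: collect the installed version of each package a
# script depends on, so readers can try to recreate the same environment.
record_versions <- function(pkgs) {
  data.frame(
    package = pkgs,
    version = unname(vapply(
      pkgs,
      function(p) as.character(utils::packageVersion(p)),
      character(1)
    )),
    stringsAsFactors = FALSE
  )
}

# "stats" and "utils" ship with R; a real tutorial would list its own packages.
versions <- record_versions(c("stats", "utils"))

print(R.version.string)  # the R version the code was run with
print(versions)          # one row per package with its installed version
```

If the author uses renv instead, renv::snapshot() writes this kind of information to a lockfile and renv::restore() reinstalls from it, but that only helps when the lockfile is published alongside the post.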
I ran the code you provided. I got a message saying "... Skipping install of 'rayshader' from a github remote, the SHA1 (08e88005) has not changed since last install." However, the code seemed to complete without error!
The demo has chaidattrit1 <- chaid(Attrition ~ ., data = newattrit)
not chaidattrit1 <- chaid(Attrition, data = newattrit)
It is very strange that it fails when looking for the Age variable. Age is not in the newattrit data frame (only factors) so it does not even know Age exists.
If you want an answer from someone familiar with CHAID, I recommend a new topic with that in the title.
OK, the confusion is that the author of the demo uses the name newattrit twice! For the first 2/3 of the post, it does not include Age because Age has too many unique values to be converted to a factor. In the last part, the author cuts those values into segments and converts the result to a factor.
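As a rough illustration of that last step, here is how base R's cut() turns a numeric column into a factor. The ages, break points and labels below are made up for the example and are not the ones used in the demo.

```r
# Toy ages, not the real attrition data.
age <- c(18L, 25L, 34L, 41L, 52L, 60L)

# Illustrative break points (an assumption, not the demo's values).
# With the default right = TRUE the intervals are (17,30], (30,45], (45,60].
age_band <- cut(
  age,
  breaks = c(17, 30, 45, 60),
  labels = c("18-30", "31-45", "46-60")
)

is.factor(age_band)  # cut() returns a factor
table(age_band)      # number of toy ages falling in each band
```

Once Age is a factor like this, a step that keeps only factor columns no longer drops it.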
My comments are for the initial appearance of newattrit.
The attrition data frame has 31 variables. There are 15 factor variables and 16 integer variables. The four integer variables with 10 or fewer unique values are converted to factors, leaving 12 integer variables with more than 10 unique values. That gives a total of 19 factor variables and 12 integer variables. Age is one of the integer variables that is not converted to a factor.
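Here is a small base R sketch of that conversion rule on a made-up data frame. The column names mimic the attrition data but the values are invented, and the threshold is lowered from 10 to 3 only because the toy data has six rows.

```r
# Toy stand-in for the attrition data (invented values).
toy <- data.frame(
  Attrition  = factor(c("Yes", "No", "No", "Yes", "No", "No")),
  Department = factor(c("Sales", "R&D", "R&D", "Sales", "HR", "R&D")),
  JobLevel   = c(1L, 2L, 2L, 3L, 1L, 2L),      # few unique values
  Age        = c(23L, 35L, 41L, 29L, 52L, 38L)  # many unique values
)

# The thread describes a threshold of 10; 3 is used here for the toy data.
max_unique <- 3

# Flag integer columns with at most max_unique distinct values.
is_low_card <- vapply(
  toy,
  function(col) is.integer(col) && length(unique(col)) <= max_unique,
  logical(1)
)

# Convert only the flagged columns to factors.
toy[is_low_card] <- lapply(toy[is_low_card], factor)

# Count columns by type: JobLevel is now a factor, Age stays integer.
table(vapply(toy, function(col) class(col)[1], character(1)))
```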
The step just before the first model, newattrit <- attrition %>% select_if(is.factor),
drops those 12 variables, including Age. dim(newattrit) reports 1470 rows and 19 columns.
The call chaid(Attrition ~ ., newattrit) specifies a model with Attrition as the outcome variable and all of the remaining 18 factors as predictor variables.
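You can check what the dot expands to without fitting anything. The sketch below uses a made-up data frame and base R's Filter() as a stand-in for select_if(is.factor), so it runs without dplyr or CHAID installed; newattrit_toy is a hypothetical name.

```r
# Toy stand-in for the attrition data (invented values).
toy <- data.frame(
  Attrition  = factor(c("Yes", "No", "No", "Yes")),
  Department = factor(c("Sales", "R&D", "R&D", "HR")),
  Gender     = factor(c("F", "M", "F", "M")),
  Age        = c(23L, 35L, 41L, 29L)
)

# Base R equivalent of keeping only the factor columns.
newattrit_toy <- Filter(is.factor, toy)
names(newattrit_toy)  # Age is no longer a column

# Expand the "." in the formula against the filtered data.
predictors <- attr(terms(Attrition ~ ., data = newattrit_toy), "term.labels")
predictors            # every column except the outcome

"Age" %in% predictors # FALSE: the model never sees Age
```

So with the first version of newattrit, a formula of the form Attrition ~ . has no way to refer to Age.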