Hello all, my group and I are trying to analyze a data frame in which most columns are factors. One of our columns has 6 levels: 1, 2, 3, 4, 9999, and blank (" "). We want to plot a bar graph with the following code. However, when we do, the graph shows a bar for each of 1, 2, 3, 4, 9999, and the blank. We want to leave 9999 and the blank out of the plot. What code can we use? Thanks in advance.
plot(tab_catalog)
barplot(tab_catalog)
barplot(prop.table(tab_catalog))
barplot(prop.table(tab_catalog), col = c("red", "orange"),
        main = "Distribution of Customers by Catalog Years",
        ylim = c(0, 1))
box(lwd=2)
I think you can replace "" and 9999 with NA values before plotting. There are several ways to do that, depending on the tools you want to use.
naniar has really good tools for dealing with missing values. Among them, you'll find naniar::replace_with_na, which will do the job.
You can also recode your column (with dplyr's recode, if_else or case_when) to replace the values.
Since you are dealing with factors, you may want more factor-oriented tools. There are dplyr::recode_factor and forcats::fct_recode, which let you remove levels easily. forcats is the toolbox for factors in the tidyverse.
Of course, you can also do it in base R. In the end, it is up to you.
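For the base R route, here is a minimal sketch. I don't have your data, so the `catalog` vector below is made-up toy data standing in for your column, and I'm assuming the blank is stored either as " " or as "". The idea is to set the unwanted values to NA, drop the now-empty levels, and rebuild the table before plotting.

```r
# Toy data standing in for the real column (hypothetical values)
catalog <- factor(c("1", "2", "2", "3", "4", "9999", " ", "1", "9999", " "))

# Set the unwanted values to NA (assumes the blank is " " or "")
catalog_clean <- catalog
catalog_clean[catalog_clean %in% c("9999", " ", "")] <- NA

# Remove the levels that no longer have any observations
catalog_clean <- droplevels(catalog_clean)

# table() leaves out NA by default, so only levels 1 to 4 are counted
tab_catalog <- table(catalog_clean)

# Same plot as in the question, now without the 9999 and blank bars
barplot(prop.table(tab_catalog), col = c("red", "orange"),
        main = "Distribution of Customers by Catalog Years",
        ylim = c(0, 1))
box(lwd = 2)
```

Note that the proportions are now computed over the remaining values only, so the bars for 1 to 4 sum to 1.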
If you can provide a reprex, we can confirm that this is the issue and work through an example with you.
Note: I also edited your post for formatting purposes.
If your question's been answered, would you mind choosing a solution? (see FAQ below for how) It makes it a bit easier to visually navigate the site and see which questions still need help.