New RStudio user needing help with creating bar charts

Is there a simple or standard formula for creating bar charts in R? I've watched several tutorials, but they are all different, and I get error messages when I try to duplicate the formulas using my dataset. I'm using RStudio for Windows, and although I've worked with data and created visualizations, I haven't used RStudio before and I'm new to writing this type of code. I digress... Is there anyone out there who can help me create a few basic charts in R for an assignment?

barplot(Newcatsvdogs_csv$Location, names.arg = Newcatsvdogs_csv$Mean Number of Dogs per household ,cex.names = 4, xlab = "Mean Number of Dogs per household", ylab = "Location")

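You probably want to put backticks (`) around Mean Number of Dogs per household, i.e. write Newcatsvdogs_csv$`Mean Number of Dogs per household`. Otherwise R stops reading the column name at the first space, sees Newcatsvdogs_csv$Mean followed by stray words, and throws a syntax error.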
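
To illustrate the backtick advice, here is a minimal sketch. The data frame `pets` and its three rows are only a stand-in for illustration, not the real dataset; the column name is the one from the question.

```r
# Small stand-in data frame; the name `pets` is hypothetical.
pets <- data.frame(
  Location = c("Vermont", "Maine", "Oregon"),
  `Mean Number of Dogs per household` = c(1.4, 1.6, 1.6),
  check.names = FALSE  # keep the spaces in the column name
)

# Backticks let `$` reach a column whose name contains spaces
dogs_a <- pets$`Mean Number of Dogs per household`

# Double brackets with a quoted string reach the same column
dogs_b <- pets[["Mean Number of Dogs per household"]]

identical(dogs_a, dogs_b)
```

Either form should work; the double-bracket version can be easier to read when the column names are long.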

Thank you @mduvekot. I seem to be getting closer, but I received a new error message:
barplot(Newcatsvdogs_csv$Location, names.arg = Newcatsvdogs_csv$`Mean Number of Dogs per household`, cex.names = 4, xlab = "Mean Number of Dogs per household", ylab = "Location")
Error in -0.01 * height : non-numeric argument to binary operator

It looks like you have a couple of problems here, or maybe more. @mduvekot has diagnosed one.

It looks like your data may not be in the right format. My guess is the data is in character or factor format and you need it to be in numeric form. What do you get if you do

```
str(Newcatsvdogs_csv)
```
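
If it helps, here is a rough sketch of how one might check column types and convert a text column to numeric. The data frame `pets` and its `dogs` column are made up for illustration; the real data may already be numeric.

```r
# Hypothetical stand-in where the numbers were read in as text
pets <- data.frame(
  Location = c("Vermont", "Maine", "Oregon"),
  dogs = c("1.4", "1.6", "1.6"),
  stringsAsFactors = FALSE
)

# One class per column; "character" here signals a column that cannot be
# used directly as bar heights
col_classes <- sapply(pets, class)
col_classes

# Convert the text column to numeric
pets$dogs <- as.numeric(pets$dogs)
is.numeric(pets$dogs)
```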

Can you supply us with your data?
A handy way to supply data is to use the dput() function. Run dput(mydata), where "mydata" is the name of your dataset, then copy the output and paste it here.

It would probably also help if you could give us all of the code you are using. Please copy the code and paste it here between

```

```

str(Newcatsvdogs_csv)
tibble [49 × 12] (S3: tbl_df/tbl/data.frame)
Location : chr [1:49] "Vermont" "Maine" "Oregon" "South Dakota" ...
Number of Households (in 1000) : num [1:49] 265 548 1505 333 2632 ...
Percentage of households with pets : num [1:49] 70.8 62.9 63.6 65.6 62.7 62.1 61.6 62 59.9 56.8 ...
Number of Pet Households (in 1000) : num [1:49] 188 345 957 219 1649 ...
Percentage of Dog Owners : num [1:49] 37.7 34.6 38.8 42.8 36.3 45.8 45.9 42.7 39.9 30.3 ...
Dog Owning Households (1000s) : num [1:49] 100 190 584 143 954 350 816 242 989 154 ...
Mean Number of Dogs per household : num [1:49] 1.4 1.6 1.6 1.5 1.7 1.8 1.9 1.5 1.6 1.4 ...
Dog Population (in 1000) : num [1:49] 142 300 917 220 1609 ...
Percentage of Cat Owners : num [1:49] 49.5 46.4 40.2 39.1 39 38.1 36.8 34.6 34.4 34.2 ...
Cat Owning Households : num [1:49] 131 254 605 130 1028 ...
Mean Number of Cats : num [1:49] 1.8 1.9 2 2.2 1.8 2.2 2.1 2 2.2 1.8 ...
Cat Population : num [1:49] 234 498 1185 290 1844 ...

dput(Newcatsvdogs_csv)

Thank you @jrkrideau for your input and consideration.

Aha, as I thought. Location is a character variable, and barplot() needs numeric values for the bar heights, so you cannot pass a character variable as its first argument.
You need a numeric variable for the heights, with the locations used only as labels. There are several ways to do this.
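
One possible way, sketched on the assumption that the numeric column should supply the bar heights and Location only the labels. The data frame `pets` and its three rows are a stand-in for illustration, not the real data.

```r
# Hypothetical stand-in for the real tibble
pets <- data.frame(
  Location = c("Vermont", "Maine", "Oregon"),
  `Mean Number of Dogs per household` = c(1.4, 1.6, 1.6),
  check.names = FALSE
)

# height must be numeric; the character column goes in names.arg
mids <- barplot(
  height    = pets$`Mean Number of Dogs per household`,
  names.arg = pets$Location,
  xlab      = "Location",
  ylab      = "Mean number of dogs per household"
)
```

barplot() returns the bar midpoints (stored in `mids` here), which can be handy later for adding text above the bars.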

You copied the dput() command but not your actual data. Can you redo the dput() and copy and paste the output? It would be easier than me mocking up some data.
Thanks.

Here is a quick and dirty mockup of a possible approach. I am using the {data.table} package, mainly because I am lazy and it's faster for me. If you want to try the code you will probably need to install {data.table}. If it looks usable I can add some explanations and maybe break down what I am doing.

install.packages("data.table")

Code

# Load packages --------------------------------------------------------
suppressMessages(library(data.table))
suppressMessages(library(tidyverse))

# Load data ------------------------------------------------------------
DT1 <- data.table(aa = sample(LETTERS[1:12], 49, replace = TRUE), 
              bb = sample(1:9, 49, replace = TRUE))

# Execute --------------------------------------------------------------
DT2 <- data.table(DT1[,table(aa)]) 

ggplot(DT2, aes(aa, N)) + geom_col() + coord_flip() +
 labs(x = "Location", y = "Mean Number of Dogs", title = "Canine Invasion")

Blast it, for some reason I cannot load a .png file.

For a base R plot, try a y ~ x formula (a numerical y variable plotted against the categorical x) together with the data argument. For horizontal bars (you seem to prefer locations as y-labels), you can set horiz = TRUE:

barplot(
  `Mean Number of Dogs per household` ~ Location, 
  data = Newcatsvdogs_csv,
  horiz = TRUE
)

Such assignments are usually designed to help you build a habit of using the documentation whenever you are not 100% sure of something. Just move the cursor to barplot in RStudio and press F1, or execute ?barplot to open the help page. Start by checking the examples at the bottom of the help page; those are (usually) guaranteed to be reproducible, e.g. you can just click Run Example to execute them and get an idea of the output / returned values. Then skim through the documented arguments and the different methods; if there's a method for formula, it usually provides a more compact and less verbose approach.
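
The same lookups can be done from the console. A small sketch, assuming R 4.0.0 or later (where, as far as I know, the formula method for barplot() was added):

```r
# Open the help page for barplot()
?barplot

# Run the examples from the bottom of that help page
example(barplot)

# List the available methods; a formula method should show up here
barplot_methods <- methods(barplot)
barplot_methods
```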

@jrkrideau Thank you for this. You'll have to pardon my naivete, but do you mean I should attach the CSV file?

Thank you @margusl. Your formula worked, but the y axis only shows 4 state names. I may modify the data to include only groups of states, i.e. regions.
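
A possible cause of the missing names: base graphics silently drops axis labels that would overlap. Here is a hedged sketch of one workaround, using the built-in `state.name` vector and placeholder values (not the real data) in a hypothetical data frame `pets`.

```r
# Placeholder data: 50 built-in state names with made-up, evenly spaced values
pets <- data.frame(
  Location = state.name,
  dogs     = seq(1.1, 2.1, length.out = 50)
)

# Widen the left margin so long state names fit, and remember the old settings
old_par <- par(mar = c(4, 8, 1, 1))

barplot(
  dogs ~ Location,
  data      = pets,
  horiz     = TRUE,
  las       = 1,    # draw axis labels horizontally so they stack instead of overlapping
  cex.names = 0.5   # shrink the label text
)

# Restore the previous margins
par(old_par)
```

Whether every label appears still depends on the size of the plot window, so enlarging the Plots pane or exporting a taller image may also be needed.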

No, see the example below. The idea is that if you run

dput(Newcatsvdogs_csv)

it will output an "exact" copy of your working data set. You can just copy that output and paste it here between two lines of three backticks each, and we will have exactly the same data you have.

Example

dat1 <- data.frame(aa = LETTERS[1:10], bb = c(2, 5, 8, 2, 6, 9, 6, 1, 9,  3), cc = 1:10)

dput(dat1)

gives us

structure(list(aa = c("A", "B", "C", "D", "E", "F", "G", "H", 
"I", "J"), bb = c(2, 5, 8, 2, 6, 9, 6, 1, 9, 3), cc = 1:10), class = "data.frame", row.names = c(NA, 
-10L))

we can just do


mydat <- structure(list(aa = c("A", "B", "C", "D", "E", "F", "G", "H", 
"I", "J"), bb = c(2, 5, 8, 2, 6, 9, 6, 1, 9, 3), cc = 1:10), class = "data.frame", row.names = c(NA, 
-10L))

and we have a working copy of your data set.
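
Two related tricks, sketched with the same toy data: dput() on just the first few rows keeps the pasted output short, and a dput()/dget() round trip through a temporary file shows that the copy matches the original.

```r
dat1 <- data.frame(aa = LETTERS[1:10], bb = c(2, 5, 8, 2, 6, 9, 6, 1, 9, 3), cc = 1:10)

# Only the first three rows, which is often enough for a forum post
dput(head(dat1, 3))

# Round trip: write the dput() output to a temporary file and read it back
tmp <- tempfile(fileext = ".R")
dput(dat1, file = tmp)
copy <- dget(tmp)

# Compare the copy with the original
all.equal(copy, dat1)
```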

I think this is what you meant?
dput(Newcatsvdogs_csv)

Yes, that is what I meant.

Thank you again, @jrkrideau, for taking the time. While everyone's guidance was helpful, I've identified gaps in my foundational understanding of RStudio operations that are affecting my progress in my course, which is an accelerated asynchronous learning course. I am currently evaluating whether it is appropriate for me to continue in this type of program at this time.
