Code for Histograms

Hi there,

I'm new to R and R-Studio but have been watching lots of videos on how to get started and have picked up the main gist of how it works.

To practice, I assembled a sample data table in Numbers, exported it as a CSV and imported it into R; the first image shows the table in R. It lists CoP values (and uncertainties) for 20 test events for two types of test, a regular and a control. I am just practicing showing the first column (CoP) against ID in a histogram.

When I use the code:

COP1 %>%
  ggplot(aes(ID)) +
  geom_histogram(binwidth = 2, fill = "steelblue", alpha = 0.5) +
  labs(title = "CoP vs ID", x = "ID", y = "CoP") +
  theme_bw()

it produces a histogram as shown in the second image that does not seem to distinguish the various nuanced values (e.g. 2.12, 2.63, etc) but instead shows CoP values of 1, 2 or 3 against the test ID.

Is this because I have not specified a resolution to the mapping or something similar?

https://ibb.co/vwH7nz6

https://ibb.co/xLsk1sD

I'm not sure I have uploaded the images correctly as I can't get them to show. I will try ImgBB codes

library(ggplot2)
# default binwidth
mtcars |> ggplot(aes(mpg)) +
  geom_histogram() +
  theme_minimal()
#> `stat_bin()` using `bins = 30`. Pick better value with `binwidth`.


# finer grained
mtcars |> ggplot(aes(mpg)) +
  geom_histogram(binwidth = 0.25) +
  theme_minimal()

Created on 2024-01-08 with reprex v2.0.2

Hi there,

I have tried your code and it makes no difference. In this example, I want the test ID on the X axis and the CoP value on the Y axis. What I get is in the image link

COP1 %>%
  ggplot(aes(ID)) +
  geom_histogram() +
  theme_minimal()

It seems I need to specify the Y-axis variable and perhaps some resolution? The histogram blocks are all the same size indicating the frequency of occurrence instead of the CoP value (e.g. 2.14) for each test ID.

The graph linked isn't a standard histogram which shows the count by interval.
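To make "count by interval" concrete, here is a minimal sketch using base R's `hist()` with `plot = FALSE`. The numbers are the ten sample values posted elsewhere in this thread, and the break points are chosen only for illustration. ggplot2 may place its bin edges differently, so treat this as the idea rather than an exact reproduction of the plot.

```r
# Ten sample values from this thread (ID and CoP)
ID  <- 1:10
CoP <- c(2.12, 2.44, 1.97, 2.63, 2.04, 1.95, 2.52, 2.92, 2.56, 2.17)

# Binning ID in intervals of width 2: each bar height is how many IDs fall
# in the interval, so the CoP values never enter the calculation
id_counts <- hist(ID, breaks = seq(0, 10, by = 2), plot = FALSE)$counts
print(id_counts)
#> [1] 2 2 2 2 2

# Binning CoP instead: the bar heights are still counts, now of CoP values
cop_counts <- hist(CoP, breaks = seq(1.5, 3, by = 0.5), plot = FALSE)$counts
print(cop_counts)
#> [1] 2 4 4
```

In both cases the bar height is a frequency, which is why a histogram of ID cannot show values such as 2.12 or 2.63.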

It would help to have a reprex (see the FAQ) to see the data (or similar data) that you are working with.

I showed the COP data table at the start. Here is the link again.

I know histograms are usually categorical on the X-axis and then a count or value on the Y, and I can produce a scatter plot (Co-P-Scatter-plot, hosted at ImgBB) but wanted a visual showing Test ID on the X and the CoP value as the bar height (Y value). Is that not doable?

Thanks

You may get more answers if you post something that can be cut and pasted, as described in the FAQ. The dput() function is a good way to share data.

Sounds like a good idea but I can't see how to do this. I have put 'dput' into FAQ and it comes up with how to provide a subset of data that is already loaded into R.

Do you have a link on how to bring in a sample of external data?

Being unable to get 'dput' to work, here is a data frame with the first 10 values of my larger dataset:

Df <- data.frame(ID = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10), CoP = c(2.12, 2.44, 1.97, 2.63, 2.04, 1.95, 2.52, 2.92, 2.56, 2.17))

All I am trying to do is give an alternative visualization to a scatter graph by showing the ID on the X-axis and the value for CoP as the height of the histogram.

I think you are confusing a histogram with a bar chart; a histogram is a very particular subset of the more general bar chart. Do you require binning (histogram), or do you already have binned data (bar chart)?
Your IDs on the x-axis strongly imply that you have a bar chart to generate, and therefore geom_histogram is not the appropriate tool. If you want each bar to be as high as CoP, then you would use geom_col, with aes using y = CoP.
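As a rough sketch of that distinction, the two plots below use the ten sample values posted in this thread. Wrapping ID in `factor()` is an assumption on my part, made so that every test ID gets its own axis label. The object names `p_bar` and `p_hist` and the `binwidth` of 0.25 are illustrative only.

```r
library(ggplot2)

# First 10 rows of the poster's data, as given in this thread
Df <- data.frame(
  ID  = 1:10,
  CoP = c(2.12, 2.44, 1.97, 2.63, 2.04, 1.95, 2.52, 2.92, 2.56, 2.17)
)

# Bar chart: one bar per ID, bar height taken directly from CoP
p_bar <- ggplot(Df, aes(x = factor(ID), y = CoP)) +
  geom_col(fill = "steelblue", alpha = 0.5) +
  labs(title = "CoP vs ID", x = "ID", y = "CoP") +
  theme_bw()

# Histogram: CoP itself is binned and the bar height is a count of values
p_hist <- ggplot(Df, aes(x = CoP)) +
  geom_histogram(binwidth = 0.25, fill = "steelblue", alpha = 0.5) +
  labs(title = "Distribution of CoP", x = "CoP", y = "Count") +
  theme_bw()

print(p_bar)
print(p_hist)
```

The first plot answers "what was CoP for each test?", while the second answers "how often did CoP fall in each range?".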

Here is a very simple example of how to do it.

dat <- data.frame(xx = 1:10, yy = letters[1:10])

dput(dat)

This gives us

structure(list(xx = 1:10, yy = c("a", "b", "c", "d", "e", "f", 
                                 "g", "h", "i", "j")), class = "data.frame", row.names = c(NA,  -10L))

We can then copy it into R

my_dat  <- structure(list(xx = 1:10, yy = c("a", "b", "c", "d", "e", "f", 
                                            "g", "h", "i", "j")), class = "data.frame", row.names = c(NA,  -10L))

and we have an exact copy of your data.
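The same approach should work for data that started life outside R, such as a CSV exported from a spreadsheet. The sketch below is only illustrative: it writes a tiny stand-in file to a temporary path (the real file name is not known here), reads it with base R's `read.csv()`, and then calls `dput()` on the first rows. The names `csv_path` and `COP1` and the three rows of values are assumptions for the example.

```r
# Hypothetical stand-in for the exported CSV: a small file at a temp path
csv_path <- tempfile(fileext = ".csv")
writeLines(c("ID,CoP", "1,2.12", "2,2.44", "3,1.97"), csv_path)

# Read the external file into a data frame
COP1 <- read.csv(csv_path)

# Print R code that recreates the first rows (up to 10);
# that printed text is what gets pasted into a forum post
dput(head(COP1, 10))
```

With a real file, only the path passed to `read.csv()` would change; the `dput(head(...))` step stays the same.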

Clearly I am rusty on the difference: histogram for probabilities and bar charts for values.

This now works:

COP1 %>%
  ggplot(aes(ID, CoP)) +
  geom_col(fill = "steelblue", alpha = 0.5) +
  labs(title = "CoP vs ID", x = "ID", y = "CoP") +
  theme_bw()

Thanks

Quite a bit to get used to there for a beginner but thankfully I don't need that now. Thanks anyway.

Do not worry: if you are used to SAS, SPSS, etc., the sense of complete disorientation fades after a few weeks. :grinning:

Good to hear that is the case :face_with_peeking_eye:
