My name is Maggs. I'm trying to plot a simple histogram using ggplot. It plots fine when I don't transform the x-axis to a log scale. However, when I do transform the x-axis to a log scale, it plots an empty grid. I'm not sure how to solve it. Any help would be greatly appreciated! Below is my code, the plot of the empty grid, and the histogram plotted prior to transforming the x-axis.
Thanks!
l <- pachon_del %>%
  ggplot(aes(x=V1)) +
  geom_histogram(binwidth=1000, fill="aquamarine2", color="aquamarine2") + #scale_y_log10() +
  ggtitle("Distribution of Pachón Deletion Lengths") +
  ylab("Count") +
  xlab("Length deletions") +
  scale_x_continuous(limits=c(50,1000000), trans="log10") + #set x-axis limits
  scale_y_continuous(trans="log10") + #set y-axis
  theme(axis.text = element_text(size = 12)) + #size of axis text
  theme(plot.title = element_text(hjust = 0.5)) + #center the title of the plot
  theme(panel.background = element_rect(fill = 'white', color = 'black'), #make background white and outline of plot black
        panel.grid.major = element_line(color = 'black', linetype = 'dotted')) #make grid dotted lines
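I agree; however, the cause is the "binwidth = 1000" argument. Because the x-axis is log10-transformed, the bin width is interpreted in log10 units rather than in the original data units, so a width of 1000 is far wider than the whole axis, which spans only about 4.3 log10 units between 50 and 1,000,000. Remove the argument and let ggplot choose the bin width automatically (or give a width in log10 units, such as 0.1), and everything works.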
library(tidyverse) # provides tibble(), %>% and ggplot2

# simulated stand-in for the real data
pachon_del = tibble(V1 = c(rnorm(10000, 100000, 20000), # some signal
                           runif(3000, 100, 1000000) )) # some background
pachon_del %>%
  ggplot(aes(x=V1)) +
  geom_histogram(#binwidth = 0.1, # in log10 units: 0.1 works, binwidth = 1000 doesn't work!
                 fill="aquamarine2", color="aquamarine2") +
  labs(title = "Distribution of Pachón Deletion Lengths",
       y = "Count", x = "Length deletions") +
  scale_x_continuous(limits=c(1000,1000000), trans="log10") + #set x-axis limits
  scale_y_continuous(trans="log10") + #set y-axis
  theme(axis.text = element_text(size = 12), #size of axis text
        plot.title = element_text(hjust = 0.5), #center the title of the plot
        panel.background = element_rect(fill = 'white', color = 'black'), #make background white and outline of plot black
        panel.grid.major = element_line(color = 'black', linetype = 'dotted') #make grid dotted lines
  )
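To make the point about units concrete, here is a small base-R sketch of the arithmetic. It is only an illustration: it assumes the axis limits from the question (50 to 1,000,000) and, hypothetically, a bin of width 0.1 centred exactly on 10^5. The bins ggplot2 actually draws depend on its own defaults for bin count and alignment, so the real edges may differ.

```r
# Axis limits from the question, expressed in log10 units
lims_log  <- log10(c(50, 1e6))  # about 1.70 and 6.00
axis_span <- diff(lims_log)     # about 4.3 log10 units

# binwidth = 1000 is read as 1000 log10 units,
# so the whole axis covers only a tiny fraction of one bin
fraction_of_one_bin <- axis_span / 1000

# binwidth = 0.1 splits the same axis into a sensible number of bins
bins_needed <- axis_span / 0.1

# Data-unit edges of a bin of width 0.1 centred on 10^5
# (the centring is an assumption made for illustration)
bin_edges <- 10^(5 + c(-0.05, 0.05))

print(axis_span)
print(fraction_of_one_bin)
print(bins_needed)
print(round(bin_edges))
```

Under these assumptions, a bar drawn at 10^5 would collect values from a little below 90,000 to a little above 110,000, so a bar at a given tick mark does not mean the data contain values of exactly that length.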
Thank you! Both of these solutions plot the data. I have another, maybe silly, question. Can you explain why the plot shows 9 deletions at a length of 10^5, even though the longest deletions are slightly below 100,000? Below is a snapshot of the longest deletions in the dataset, and attached is the PNG of the histogram.