Fitdist (package fitdistrplus) with ggplot2

diego-rods · April 3, 2025, 12:43pm

I need a help to insert the fitdist into ggplot2. I tried to find an answer on search sites.

The basic code I find whitout ggplot is:

library(fitdistrplus)

data = c(1, 2, 3, 5)

gama_distribution = fitdist(data, distr = 'gamma', method = 'mle')
normal_distribution = fitdist(data, distr = 'norm', method = 'mle')
lognormal_distribution = fitdist(data, distr = 'lnorm', method = 'mle')

plot(gama_distribution, demp = TRUE)
plot(normal_distribution, demp = TRUE)
plot(lognormal_distribution, demp = TRUE)

I need a help to combine fitdist with ggplot2.

diego-rods · April 3, 2025, 12:50pm

Someone knows how to solve this problem?

diego-rods · April 4, 2025, 6:27pm

I need help to making fitdist into a graphic with ggplot!

startz · April 4, 2025, 8:42pm

Look at the documentation for the package. You will see several functions, such as cdfcomp, that allow an argument plotstyle="ggplot" and will then use ggplot.

diego-rods · April 4, 2025, 9:31pm

I try using the code below, but the dataframe, in data_1, don't accept the vector.

library(fitdistrplus)
library(ggplot2)

data = c(1, 2, 3, 5)

gama_distribution = fitdist(data, distr = 'gamma', method = 'mle')

data_1 = data.frame(gama_distribution)

data_2 = ggplot(data_1)

So, I don't know how to integrate fitdist with ggplot.

startz · April 4, 2025, 10:01pm

The problem is that the result of fitdist isn't a dataframe or anything that can be turned into a dataframe (or tibble). You need to either create a dataframe yourself from the contents of the fitdist function or to use the functions like cdfcomp that will do the translation internally and create a ggplot object.

diego-rods · April 4, 2025, 10:24pm

library(fitdistrplus)
library(ggplot2)

data = c(1, 2, 3, 5)

data_1 = data.frame(fitdist(data, distr = 'gamma', method = 'mle'))

data_2 = ggplot(data_1)

I try with that code above, but still don't work (don't accept the vector).

startz · April 4, 2025, 10:39pm

Really. See the reply above.

diego-rods · April 4, 2025, 11:18pm

Well, I just give up to solve this problem with myself, for a while, because I'm not a programmer. I just ask for help to someone show me the solution with a code, for me to apply a biological analysis.

joels · April 4, 2025, 11:52pm

According to the fistdistrplus Vignette, you can return a ggplot using the package's plotting functions and the plotstyle argument. For example:

p = denscomp(list(gama_distribution, normal_distribution, lognormal_distribution),
             plotstyle="ggplot")

See the help for denscomp, which lists other types of plots you can create.

joels · April 5, 2025, 12:23am

Here's a more complete example that shows you how to calculate the density values for each estimated distribution and put them in a data frame. Then you can create your own plot and structure it exactly how you want.

library(fitdistrplus)
#> Loading required package: MASS
#> Loading required package: survival
library(tidyverse)
theme_set(theme_bw())

# Fake data
set.seed(3)
data = rgamma(1000, 6.6, 2.3)

# Create distributions
dists = list(Gamma="gamma", Normal="norm", Lognormal="lnorm") %>% 
  map(~fitdist(data, distr=.x, method="mle"))

# Look at structure of one of the distribution objects so we know where to 
# get its density function name (the "distname") and the values for the 
# location and scale parameters (the "estimate")
str(dists[["Gamma"]])
#> List of 17
#>  $ estimate   : Named num [1:2] 6.78 2.39
#>   ..- attr(*, "names")= chr [1:2] "shape" "rate"
#>  $ method     : chr "mle"
#>  $ sd         : Named num [1:2] 0.296 0.108
#>   ..- attr(*, "names")= chr [1:2] "shape" "rate"
#>  $ cor        : num [1:2, 1:2] 1 0.963 0.963 1
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : chr [1:2] "shape" "rate"
#>   .. ..$ : chr [1:2] "shape" "rate"
#>  $ vcov       : num [1:2, 1:2] 0.0876 0.0309 0.0309 0.0117
#>   ..- attr(*, "dimnames")=List of 2
#>   .. ..$ : chr [1:2] "shape" "rate"
#>   .. ..$ : chr [1:2] "shape" "rate"
#>  $ loglik     : num -1453
#>  $ aic        : num 2910
#>  $ bic        : num 2920
#>  $ n          : int 1000
#>  $ data       : num [1:1000] 1.72 2.19 1.56 3.02 2.67 ...
#>  $ distname   : chr "gamma"
#>  $ fix.arg    : NULL
#>  $ fix.arg.fun: NULL
#>  $ dots       : NULL
#>  $ convergence: int 0
#>  $ discrete   : logi FALSE
#>  $ weights    : NULL
#>  - attr(*, "class")= chr "fitdist"

# Vector of values at which we want to get density
xvals = seq(min(data), max(data), length.out=100)

# Use each distribution's location and scale parameters to create a data frame of 
#  values to plot
dist.vals = map_df(dists,
                   ~{
                     # Get desired density function
                     fun = paste0("d", .x$distname)
                     fun = match.fun(fun)
                     
                     # Get density values for that function
                     fun(xvals, .x$estimate[1], .x$estimate[2])
                   })

dist.vals = data.frame(xvals, dist.vals) %>% 
  pivot_longer(-xvals, names_to="Distribution", values_to="Density")

dist.vals %>% 
  ggplot() +
  geom_histogram(data=data.frame(x=data), aes(x=x, y=after_stat(density)),
                 breaks=seq(floor(min(data)), ceiling(max(data)), 0.5), 
                 colour="white", fill="grey80",
                 ) +
  geom_line(aes(xvals, Density, colour=Distribution)) +
  scale_x_continuous(breaks=0:10)

^{Created on 2025-04-04 with reprex v2.1.1}

diego-rods · April 5, 2025, 12:37am

Alright. Thanks for your help!

aursiber · April 7, 2025, 9:15am

Hi! Next time you have a question or an issue about fitdistrplus, I encourage you to post directly to the GitHub page of fitdistrplus here. The authors and users will be able to help you better Best regards.

joels · April 7, 2025, 4:50pm

Hi Aurélie,

Thanks for posting. In my answer above, I extracted the location and scale parameters from the fitdist objects and fed them to the appropriate density functions (e.g., dgamma) . I'm curious whether there's a more direct method to do this directly from the fitdist object.

diego-rods · April 7, 2025, 7:06pm

Hi, Joel. I've written the code this way, but I don't know how to insert the 'demp = TRUE' parameter into the code. Writing the code this way, without using ggplot as a plot argument (plotstyle = ggplot), makes it much easier to change the characteristics of the graph.

If anyone knows how to insert 'demp = TRUE, and the curves relative for each of the distributions (gama, normal and lognormal). I would appreciate it.

library(fitdistrplus)
library(ggplot2)

rgb_to_hex = function(rgb){
  sprintf('#%02x%02x%02x',
          rgb[1],
          rgb[2],
          rgb[3])}

rgb_color_histogram_0 = c(111, 5, 28)
rgb_color_histogram_1 = c(254, 254, 10)
rgb_color_histogram_2 = c(35, 194, 228)
rgb_color_histogram_3 = c(26, 238, 36)
rgb_color_histogram_4 = c(105, 57, 185)

data = c(2.5, 6.5, 6.6, 6.6, 7.0, 8.0, 10.3, 10.6, 13.9)

library(fitdistrplus)
library(ggplot2)

rgb_to_hex = function(rgb){
  sprintf('#%02x%02x%02x',
          rgb[1],
          rgb[2],
          rgb[3])}

color_0 = c(111, 5, 28)
color_1 = c(254, 254, 10)
color_2 = c(237, 7, 203)
color_3 = c(237, 7, 203)
color_4 = c(235, 105, 10)
color_5 = c(235, 105, 10)
color_6 = c(37, 236, 9)
color_7 = c(37, 236, 9)
color_8 = c(0, 0, 0)
color_9 = c(255, 255, 255)
color_10 = c(255, 255, 255)
color_11 = c(255, 255, 255)
color_12 = c(47, 31, 241)

data = c(2.5, 6.5, 6.6, 6.6, 7.0, 8.0, 10.3, 10.6, 13.9)

gama_distribution = fitdist(data, distr = 'gamma', method = 'mle')
normal_distribution = fitdist(data, distr = 'norm', method = 'mle')
lognormal_distribution = fitdist(data, distr = 'lnorm', method = 'mle')

dev.new()

data_1 = gama_distribution$data
data_2 = normal_distribution$data
data_3 = lognormal_distribution$data

data_4 = ggplot(data.frame(c(data_1, data_2, data_3)),
                aes(c(data_1, data_2, data_3))) +
         geom_histogram(aes(y = ..density..),
                        bins = 10.0,
                        color = rgb_to_hex(color_0),
                        fill = rgb_to_hex(color_1),
                        linewidth = 1.0,
                        linetype = 'solid') +
         theme(axis.text.x = element_text(angle = 0.0,
                                          color = c(rgb_to_hex(color_2)),
                                          face = 'bold',
                                          hjust = 0.5,
                                          size = 15.0,
                                          vjust = 0.0),
               axis.text.y = element_text(angle = 0.0,
                                          color = rgb_to_hex(color_3),
                                          face = 'bold',
                                          hjust = -0.1,
                                          size = 15.0,
                                          vjust = 0.0),
               axis.ticks.length = unit(0.1, 'cm'),
               axis.ticks.x = element_line(c(rgb_to_hex(color_4)),
                                           linewidth = 2.0),
               axis.ticks.y = element_line(color = c(rgb_to_hex(color_5)),
                                           linewidth = 2.0),
               axis.title.x = element_text(angle = 0.0,
                                           color = rgb_to_hex(color_6),
                                           face = 'bold',
                                           hjust = 0.5,
                                           size = 20.0),
               axis.title.y = element_text(angle = 90.0,
                                           color = rgb_to_hex(color_7),
                                           face = 'bold',
                                           hjust = 0.5,
                                           size = 20.0,
                                           vjust = 2.0),
               panel.background = element_rect(color = rgb_to_hex(color_8),
                                               fill = rgb_to_hex(color_9)),
               panel.grid.major = element_blank(),
               panel.grid.minor = element_blank(),
               plot.background = element_rect(color = rgb_to_hex(color_10),
                                              fill = rgb_to_hex(color_11),
                                              linewidth = 1.0),
               plot.title = element_text(color = rgb_to_hex(color_12),
                                         face = 'bold',
                                         hjust = 0.5,
                                         size = 30.0)) +
         ggtitle('Generalized linear model') +
         xlab('Data') +
         ylab('Density')