Modifying a function from a package

Hi all,

I'd like to use the function get_summary_stats from the rstatix package and adapt it to use scientific notation.

This is the function from the package:

function (data, ..., type = c("full", "common", "robust", "five_number", 
    "mean_sd", "mean_se", "mean_ci", "median_iqr", "median_mad", 
    "quantile", "mean", "median", "min", "max"), show = NULL, 
    probs = seq(0, 1, 0.25)) 
{
    type = match.arg(type)
    if (is_grouped_df(data)) {
        results <- data %>% doo(get_summary_stats, ..., type = type, 
            show = show, probs = probs)
        return(results)
    }
    data <- data %>% select_numeric_columns()
    vars <- data %>% get_selected_vars(...)
    n.vars <- length(vars)
    if (n.vars >= 1) {
        data <- data %>% select(!!!syms(vars))
    }
    variable <- .value. <- NULL
    data <- data %>% gather(key = "variable", value = ".value.") %>% 
        filter(!is.na(.value.)) %>% dplyr::mutate(variable = factor(.data$variable, 
        levels = vars)) %>% group_by(variable)
    results <- switch(type, common = common_summary(data), robust = robust_summary(data), 
        five_number = five_number_summary(data), mean_sd = mean_sd(data), 
        mean_se = mean_se(data), mean_ci = mean_ci(data), median_iqr = median_iqr(data), 
        median_mad = median_mad(data), quantile = quantile_summary(data, 
            probs), mean = mean_(data), median = median_(data), 
        min = min_(data), max = max_(data), full_summary(data)) %>% 
        dplyr::ungroup() %>% dplyr::mutate_if(is.numeric, round, 
        digits = 3)
    if (!is.null(show)) {
        show <- unique(c("variable", "n", show))
        results <- results %>% select(!!!syms(show))
    }
    results
}

I think I would need to modify the dplyr::mutate_if(is.numeric, round, digits = 3) section, but I have tried a few things and it is not working.

Many thanks for any help!
Beatriz

What seems to be the problem?

If I run the get_summary_stats() as per package, then it will return zero values for very small numbers, thus I'd like it to return the statistics using scientific notation, for example as 8E-04.

Underneath is the summary table I am getting with get_summary_stats for some of my data, but I know their mean values (for example) in reality are: 0.000333, 0.000118, and 0.000826, and I would like to present them as 3E-04, 1E-04, and 8E-04.

Does this make sense? I can see that get_summary_stats is rounding the numbers to 3 digits, but this is not good for my data and I would like to make a modification to that function that I could then use. I gave the example of the mean, but all zero values given are wrong.

Thanks.

There are (at least) 3 ways to change the functionality

  1. Fork the package code from rstatix (kassambara/rstatix on GitHub) and make your own package 'myrstatix' that works the way you want; or send a pull request to rstatix to consider integrating your code into the next release. In that case you would probably need to write your extension so that other users who don't want the new behaviour can stick with the old way it worked.
  2. Interactively edit the function just for your session. After loading the package, use edit(get_summary_stats) to temporarily overwrite it, and change the code as you prefer.
  3. Copy the code into your own standalone function. You will likely need to prefix the function calls so that they come from the correct namespace, i.e. where you see doo( it should become rstatix::doo( and so on.
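
For option 3, one way to decide which prefix each helper needs is to ask the namespace which names it exports. The following is only a sketch in base R: namespace_prefix is a hypothetical helper name (not part of rstatix), and the helper names in the usage comment are simply taken from the function body quoted above.

```r
# Hypothetical helper: for each function name, report whether it should be
# called as pkg::name (exported) or pkg:::name (internal to the package).
# Names that the package neither defines nor exports (for example functions
# it merely imports from dplyr) come back as NA and need another prefix.
namespace_prefix <- function(pkg, fun_names) {
  ns <- asNamespace(pkg)
  defined <- vapply(fun_names, exists, logical(1), envir = ns, inherits = FALSE)
  exported <- fun_names %in% getNamespaceExports(ns)
  out <- paste0(pkg, ifelse(exported, "::", ":::"), fun_names)
  out[!defined & !exported] <- NA_character_
  out
}

# Usage, assuming rstatix is installed:
# namespace_prefix("rstatix", c("doo", "select_numeric_columns", "full_summary"))
```

Anything reported with ::: is internal to the package and may change between releases, which is one reason a standalone copy can break after an update.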

Thanks for the options.

I have tried to edit it myself but I cannot make it work.

This is the part of the code I tried to modify:

%>% dplyr::mutate_if(is.numeric, round, 
        digits = 3)

but not being good at coding, I have not even been able to change the number of digits when I finally tested that part.

I have seen that mutate_if has been superseded and I am not sure if that could be one of the problems?

I have seen that

options(scipen=0)

OR

formatC(x, format = "e", digits = 2)

could maybe transform the numbers to the format I want to, but I don't know where to include this into the code...
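
As a quick check of those two ideas, here is a minimal base R sketch using the three means quoted earlier in the thread. It suggests why the formatting has to happen instead of the rounding step rather than after it: options(scipen = ...) only changes how R prints numbers, so it cannot bring back digits that round() has already discarded.

```r
x <- c(0.000333, 0.000118, 0.000826)  # the three means quoted above

# What get_summary_stats() currently does: the small values collapse to 0
# (or 0.001), and the lost digits cannot be recovered by later formatting.
rounded <- round(x, digits = 3)

# Formatting the unrounded values instead. The result is a character vector
# in the requested style, e.g. "3E-04".
sci <- formatC(x, format = "E", digits = 0)

# signif() keeps significant digits rather than decimal places and stays numeric.
sig <- signif(x, digits = 1)

print(rounded)
print(sci)
print(sig)
```

Note that formatC() returns text, so it is best applied as the very last step before the table is displayed.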

OK, here is a complete solution.
Rather than use edit, I realised it's important to use fix so that your change will take effect.
In the fix, you have to do 2 things:

  1. Add the namespace for rstatix, to avoid having to rewrite the function with rstatix::: everywhere.
  2. Remove the part that does the rounding, as we don't want rounding.

After that we have a version that gets the summaries but does no rounding.

We then use the scientific function from the formattable library to control how numeric columns get printed when shown as a gt table.

It's like this:

library(rstatix)
library(tidyverse)
library(formattable)
data("ToothGrowth")

fix(get_summary_stats) 
# in the editor that fix() opens, add attach(getNamespace("rstatix")) as the first line of the function,
# and remove the mutate_if() step that rounds the numeric columns


t1 <- ToothGrowth %>% get_summary_stats(len) 

library(gt)
gt(t1) # unrounded, but not yet in scientific notation

# make it scientific
mutate(t1,
       across(where(is.numeric),
              \(x) scientific(x, digits = 2))) |> gt()

Thanks @nirgrahamuk , I got your suggestion working in my session, many thanks for the tip to change the code.

I have now tried to use it as part of an R Markdown document and it doesn't recognise the fix.

Do I need to include the fix in one of the chunks?

I have tried using fix(get_summary_stats); when I run it, it opens a text file, which I can modify to remove the mutate_if step as you suggested, and then I save so the document continues knitting... but the document keeps using the original get_summary_stats() from "rstatix".

It would be great if you could help me with the last part of the code for running the R Markdown document.

Many thanks in advance for following up on my issue and teaching me this coding.

I think you should make your own version of rstatix, 'Beavet82rstatix'.

The best resource on making packages is https://r-pkgs.org/
but making slight alterations to an existing package is not as complicated as writing your own from scratch.
And you don't need to do anything fancy like host it on CRAN; you can just use it yourself.
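
Since fix() is an interactive tool, it is a poor fit for a knitted document. If a full package feels like too much, a lighter option is to define a small function of your own in a setup chunk. The sketch below is only an illustration under stated assumptions: summary_stats_sci is a hypothetical name, it computes just a handful of statistics rather than everything get_summary_stats offers, it expects the selected columns to be numeric, and it ignores grouping.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for get_summary_stats(): summarise the selected
# numeric columns without rounding, then format the statistics as text in
# scientific notation (e.g. "3E-04").
summary_stats_sci <- function(data, ..., digits = 0) {
  data %>%
    select(...) %>%
    pivot_longer(everything(), names_to = "variable", values_to = "value") %>%
    filter(!is.na(value)) %>%
    group_by(variable) %>%
    summarise(
      n = n(),
      mean = mean(value),
      sd = sd(value),
      min = min(value),
      max = max(value),
      .groups = "drop"
    ) %>%
    # Formatting is the last step, so no precision is lost before display.
    mutate(across(c(mean, sd, min, max),
                  function(x) formatC(x, format = "E", digits = digits)))
}

# Usage with the built-in ToothGrowth data:
summary_stats_sci(ToothGrowth, len, dose, digits = 2)
```

Because the formatted columns are character vectors, the result can be passed straight to gt() or knitr::kable() in the document, but it should not be used for further calculations.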

I finally made the changes work.

Thanks for your guidance.
Beatriz
