Using Option Settings command line from pre-written script

GifJB · June 15, 2023, 6:25pm

Hello,

I'm writing here because I'm having some trouble with the R script found on this website, which uses RSalvador.
https://barricklab.org/twiki/bin/view/Lab/ProtocolsFluctuationTests

I can get most of the code running, except for the options I'm supposed to supply (lines 9-16 on GitHub). I tried using the options() and getOption() commands, but the code still asks me to supply an -i argument.
It'd be a tremendous help if anyone could just send me the lines I should write to use the whole program with the example .csv file.

Thanks in advance.

option_list = list(
  make_option(c("-i", "--input"), type="character", default=NULL, 
              help="Input CSV file name", metavar="input.csv"),
  make_option(c("-o", "--output"), type="character", default="", 
              help="Output file prefix", metavar="output_prefix"),
  make_option(c("-c", "--comparisons"), action="store_true", 
              help="Perform comparisons between fluctuation tests. Results in the output of file comparisons.csv showing p-values for tests that mutation rates are significantly different between samples")
  )

AlexisW · June 16, 2023, 3:38am

Are you familiar with the command line? The idea behind this kind of script is that it can be called from outside R, and will take care of loading R and everything else.

Specifically, this was written with a Unix system (Mac, Linux) in mind, though it's not too hard to make it work on Windows. If you don't know anything about the terminal, I would recommend reading one of the many online tutorials. Long story short, you have to open a terminal, use cd and ls to get to the folder containing the script, and run it from there with:

fluxxer.R -i fluxxer_example_input.csv -o fluxxer_example_output -c

On Windows, if that doesn't work, I think it should work if you prefix the command with the path to Rscript.exe.

I'm not giving complete explanations, because there are many variants depending on your operating system etc. I can give details if needed.
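
To make the "called from outside R" idea a bit more concrete, here is a minimal sketch in R itself. It writes a tiny throwaway script (a stand-in, not fluxxer.R) and runs it through Rscript with system2(), which is roughly what the terminal does for you. The temporary file and the echoed arguments are assumptions made up for the illustration; the only point is that whatever follows the script name ends up in commandArgs(trailingOnly = TRUE), which is where {optparse} reads the options from by default.

```r
# Throwaway script that simply prints the arguments it receives, one per line.
script <- tempfile(fileext = ".R")
writeLines(c(
  "#!/usr/bin/env Rscript",
  "args <- commandArgs(trailingOnly = TRUE)",
  "writeLines(args)"
), script)

# Locate the Rscript program that ships with the running R installation.
rscript <- file.path(R.home("bin"), "Rscript")

# Equivalent of typing: Rscript <script> -i fluxxer_example_input.csv
out <- system2(rscript,
               c(shQuote(script), "-i", "fluxxer_example_input.csv"),
               stdout = TRUE)
print(out)
```

The script never sees the word Rscript or its own file name, only the trailing arguments. That is why the error about a missing -i argument appears when the script is run from the R console without any.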

Run script from within R

If you want to stay in R, you actually can modify the script to provide the options "by hand". The way {optparse} works, there is first a block of code (lines 9-38) which defines the possible options that can be accepted from the command line. Then you have line 39:

opt = parse_args(opt_parser)

What this does is use the definitions contained in the object opt_parser and, according to the documentation, return a simple list of the corresponding values.

In other words, if you wanted to hardcode values in the script, it would be as simple as replacing lines 7-44 with:

opt <- list(input = "some/path/to/file.csv",
            output = "some/prefix_",
            comparisons = TRUE)

And leaving the rest of the script untouched.
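
A variant of the same idea, as a sketch: parse_args() from {optparse} has an args parameter, which by default is commandArgs(trailingOnly = TRUE), so you can also keep the parsing code and hand it the options as a character vector. The option definitions below are the ones quoted in the first post (with the last help text shortened), and the file names are the example ones from above. Nothing is read from disk at this stage, so this only shows what opt ends up containing.

```r
library(optparse)

# Option definitions as quoted in the first post (last help text shortened).
option_list <- list(
  make_option(c("-i", "--input"), type = "character", default = NULL,
              help = "Input CSV file name", metavar = "input.csv"),
  make_option(c("-o", "--output"), type = "character", default = "",
              help = "Output file prefix", metavar = "output_prefix"),
  make_option(c("-c", "--comparisons"), action = "store_true",
              help = "Perform comparisons between fluctuation tests")
)
opt_parser <- OptionParser(option_list = option_list)

# Pass the arguments explicitly instead of letting optparse read the command line.
opt <- parse_args(opt_parser,
                  args = c("-i", "fluxxer_example_input.csv",
                           "-o", "fluxxer_example_output",
                           "-c"))

# Inspect the resulting list: it has input, output and comparisons entries.
str(opt)
```

With this in place of the original parse_args() line, the rest of the script can keep using opt$input, opt$output and opt$comparisons as before.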

But it turns out even that is unnecessary. Look at the structure of the script. Here is a simplified version:

suppressMessages(library(rsalvador))
suppressMessages(library(...))


option_list = list(
  ...
)
opt_parser = OptionParser(option_list = option_list)
opt = parse_args(opt_parser)

calculateMutRate <- function(filename, output_prefix, comparisons)
{
 ...
}

calculateMutRate(filename = opt$input, output_prefix = opt$output, comparisons = opt$comparisons)

So the script has 4 parts:

1. load libraries
2. define options
3. define function calculateMutRate()
4. call calculateMutRate()

And in your case, if you want to use this script from within R, all you have to do is call the function calculateMutRate() with the right arguments. You can totally remove anything to do with options.
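
As a rough sketch of what that direct call looks like: the function below is only a stand-in with the same three arguments as in the simplified script above, and its body is made up so the example runs on its own. In practice you would run the real definition of calculateMutRate() from fluxxer.R (parts 1 and 3) and then make the same call. The file names are the example ones from this thread.

```r
# Stand-in with the same signature as the script's calculateMutRate().
# The real body lives in fluxxer.R and is NOT reproduced here.
calculateMutRate <- function(filename, output_prefix, comparisons) {
  message("input: ", filename,
          " | output prefix: ", output_prefix,
          " | comparisons: ", comparisons)
  invisible(list(filename = filename,
                 output_prefix = output_prefix,
                 comparisons = comparisons))
}

# Direct call from the R console: no optparse and no `opt` object involved.
result <- calculateMutRate(filename = "fluxxer_example_input.csv",
                           output_prefix = "fluxxer_example_output",
                           comparisons = TRUE)
```

The three named arguments take the place of opt$input, opt$output and opt$comparisons, so the -i, -o and -c flags are no longer needed.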

Why this use of the terminal?

At this point you might be asking why they went through all that trouble. It's actually to make things easier for the user. In bioinformatics and computer science, it's very common to have programs that are called from the command line and accept options. As a user, you can spend all day calling programs without having any idea what programming language they were written in. Some might be in R, most in C, and others in anything from Python to Rust, Lua, Perl, ... Obviously you don't want to learn all these programming languages!

So the programmers from the Barrick lab did the work of wrapping their function in an executable script. Now any bioinformatician can just call it from the command line without knowing anything about R (R still needs to be installed on the computer though).

DavoWW · June 16, 2023, 3:41am

Hi @GifJB
Looks like you need to run this to input the example CSV file:

fluxxer.R -i "fluxxer_example_input.csv" -o "my_prefix"

GifJB · June 16, 2023, 4:03pm

Thank you so much for your thorough reply! I'll stick to your R solution because it's the simplest one for me, but I'll keep the terminal usage in mind for the future.
