Running commands on a remote HPC server via an R/Shiny UI

I have a workflow pipeline (in WDL) set up on a remote HPC cluster; it executes successfully with the following two commands.

# cd to the project folder
cd folder/project_folder/with_cromwell

# invoke the .sh script, it requires the library name as input
sh ./execute_from_shiny.sh input_library_name

I wanted to create an R/Shiny app so that users can log in to the remote server via the Shiny UI, execute the pipeline, track progress, and view the outputs back in Shiny. I can successfully SSH from R/Shiny (using ID and password) and execute basic bash commands (like cd, ls -lh, etc.).

However, certain commands like module load python and other HPC environment-specific commands do not work. It seems R/Shiny is not invoking the right bash environment, but I have no clue how to ensure that the necessary environment or configuration is properly loaded within the SSH session.

When I tried the same commands after SSH-ing from the macOS terminal, or from a basic Python Flask app, I was able to log in with user credentials and execute the pipeline successfully, with no issues with HPC environment-specific commands like module load or source activate. So I assume Python is able to load the appropriate remote environment while R cannot.

Here is the sample app. I'm stuck at the pipeline-execution step, and I'm not sure what the challenges would be with using a key instead of login and password, tracking progress, and viewing the outputs in the Shiny UI. Any suggestions on how to approach this app are welcome.

library(shiny)
library(ssh)
#> Linking to libssh v0.9.5

# UI
ui <- fluidPage(
    textOutput("output"),
    actionButton("connect", "Connect and Run")
)

# Server
server <- function(input, output, session) {
    observeEvent(input$connect, {
        # Hardcoded values
        # TO DO: Setup ssh key
        server_address <- "portal.univ.edu"
        username<- "login_name"
        password <- "login_pswd"
        # Chain the commands so they run in the same remote shell
        command <- "cd to/project/directory && module load python"
        output_file <- "path/to/bash_console/command_output.txt"  # Output file path on the remote server
        tryCatch({
            # Establish an SSH connection
            con <- ssh_connect(paste(username,server_address,sep = "@"),passwd =  password)
            # Construct a command to redirect output to a file
            command_with_redirect <- paste0(command, " > ", output_file)
            
            # Execute the command with output redirection
            ssh::ssh_exec_wait(con, command_with_redirect)
            
            # Read the content of the output file
            output_text <- ssh::ssh_exec_internal(con, paste("cat", output_file))
            
            # Display the output
            output$output <- renderText({
                paste("Command Output:\n", rawToChar(output_text$stdout),sep = "n")
            })
            
            # Close the SSH connection
            ssh_disconnect(con)
        }, error = function(e) {
            output$output <- renderText({
                paste("Error: ", e$message)
            })
        })
    })
}

shinyApp(ui, server)
#> Shiny applications not supported in static R Markdown documents
Created on 2024-01-10 with [reprex v2.0.2](https://reprex.tidyverse.org)

Session info
sessionInfo()
#> R version 4.2.3 (2023-03-15)
#> Platform: x86_64-apple-darwin17.0 (64-bit)
#> Running under: macOS Big Sur ... 10.16
#> 
#> Matrix products: default
#> BLAS:   /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRblas.0.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib
#> 
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] ssh_0.8.3   shiny_1.7.4
#> 
#> loaded via a namespace (and not attached):
#>  [1] Rcpp_1.0.10       rstudioapi_0.14   knitr_1.42        magrittr_2.0.3   
#>  [5] xtable_1.8-4      R6_2.5.1          rlang_1.1.0       fastmap_1.1.1    
#>  [9] sys_3.4.1         tools_4.2.3       xfun_0.37         cli_3.6.1        
#> [13] jquerylib_0.1.4   withr_2.5.0       htmltools_0.5.4   askpass_1.1      
#> [17] ellipsis_0.3.2    openssl_2.0.6     yaml_2.3.7        digest_0.6.31    
#> [21] lifecycle_1.0.3   later_1.3.1       sass_0.4.5        credentials_1.3.2
#> [25] fs_1.6.1          promises_1.2.0.1  cachem_1.0.8      glue_1.6.2       
#> [29] evaluate_0.20     mime_0.12         rmarkdown_2.20    reprex_2.0.2     
#> [33] bslib_0.4.2       compiler_4.2.3    jsonlite_1.8.4    httpuv_1.6.11
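
A minimal sketch of the key-based login mentioned in the TODO, using the keyfile argument of ssh_connect() (host and key path are placeholders; the matching public key would have to be in ~/.ssh/authorized_keys on the cluster):

library(ssh)

# Key-based authentication instead of a hard-coded password.
# Host and private-key path are placeholders.
con <- ssh_connect("login_name@portal.univ.edu", keyfile = "~/.ssh/id_rsa")

ssh_exec_wait(con, "echo $SHELL")  # quick sanity check of the connection
ssh_disconnect(con)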

It seems non-trivial.

First, are you using a scheduler (e.g. SLURM, LSF, SGE, ...)?

How do they not work? Do you get any error message?

Can you give more details? Is it an app running on the HPC or your local computer? How is it calling the pipeline?

Before running the entire Shiny app, can you run the corresponding commands in a simple R session? In particular, does the ssh_connect(paste(username,server_address,sep = "@"),passwd = password) call work as expected? Can you run something like ssh_exec_wait(con, "echo $SHELL") to get more details about the environment?
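
For reference, a quick way to test this outside Shiny might be a plain interactive R session along these lines (credentials are the placeholders from the app above):

library(ssh)

# Connect exactly as the app does, but interactively
con <- ssh_connect("login_name@portal.univ.edu", passwd = "login_pswd")

ssh_exec_wait(con, "echo $SHELL")         # which shell the remote side runs
ssh_exec_wait(con, "module load python")  # does this reproduce the failure?

ssh_disconnect(con)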

> does the ssh_connect(paste(username,server_address,sep = "@"),passwd = password) call work as expected? Can you run something like ssh_exec_wait(con, "echo $SHELL")

Yes, and basic bash commands work; for example, echo $SHELL gives the output /bin/bash. This is the same output we get if we use the terminal in the RStudio interface and SSH. However, in the RStudio terminal, after SSH-ing, we are also able to execute "module load python" or run the complete pipeline without any errors. From the app, though, these commands throw the following errors:
In the RStudio "Background Jobs" pane, "module load python" gives: bash: line 2: module: command not found
On the HPC server: Error Executing module load python failed with status 127

We are not using any scheduler. The WDL pipeline just reads input files from a folder and runs a series of bash and R scripts. The .sh file that we call on the command line loads all the required modules, updates the references that the .wdl pipeline uses, and makes the input folder ready.

OK, I don't have an obvious answer, so I'd suggest doing things systematically. See if you get the same results in an SSH session in a classic terminal and with {ssh}:

First, a simple hostname and uname -a. I know, it looks stupid, but sometimes weird things happen.

Second, whoami and echo $USER; if the profile is not the same, that can affect environment variables.

Then I'd say try which module. If it gives you a path, that's great; if it gives you a function (starting with module() on the first line), you can use this to find where it's defined:

shopt -s extdebug
declare -F module
shopt -u extdebug

If it's a function, it might be because you're using lmod, so instead of the above you can try echo $LMOD_CMD (the reason is that there are several pieces of software called module; it could correspond to lmod, but it could also be the older tcl package).

Once you find the path to the command called by module, try to ls that path through the R {ssh} package. If the file is present, you can check echo $PATH to see why it's not found.
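
A sketch of how those checks could be run through {ssh}, so the environment seen by the R package can be compared side by side with a normal terminal session (con being an open connection from ssh_connect()):

checks <- c(
  "hostname", "uname -a",
  "whoami", "echo $USER",
  "which module",
  "shopt -s extdebug; declare -F module; shopt -u extdebug",
  "echo $LMOD_CMD",
  "echo $PATH"
)

# Run each check in the non-interactive session created by {ssh}
for (cmd in checks) {
  cat("==>", cmd, "\n")
  ssh_exec_wait(con, cmd)
}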


Thank you for the response. I have tried "hostname", "uname -a", "whoami" & "echo $USER"; they all worked as expected.

I got an error for "which module": which: no module in (/usr/local/bin:/usr/bin). When I do echo $PATH, it gives the path as /usr/local/bin:/usr/bin.

If I do the same from the terminal, it gives a similar /usr/bin/which: no module in message, but with a few additional paths. Output below:

/usr/bin/which: no module in (/usr/local/xalt/xalt/bin:/opt/mvapich2/intel/19.0/2.3.3/bin:/usr/local/gnu/8.4.0/bin:/opt/intel/itac/2019.5.041/bin:/opt/intel/advisor_2019/bin64:/opt/intel/vtune_amplifier_2019/bin64:/opt/intel/inspector_2019/bin64:/opt/intel/compilers_and_libraries_2019.5.281/linux/bin/intel64:/usr/lib64/qt-3.3/bin:/opt/oum/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/opt/ddn/ime/bin:/opt/puppetlabs/bin)

The rest of the commands (shopt -s extdebug, declare -F module, shopt -u extdebug, echo $LMOD_CMD) do not throw an error or show any output.

I was able to submit a Slurm script (.sh file) from the R/Shiny end just by executing the command

sbatch /path/to/slurm/script/submit_slurm_job.sh

I get console output in RStudio saying submitted job 378493.

When I log into the remote HPC and check squeue -j 378493, it shows the status as waiting, and once the status changes to 'Running' it shuts down. The same Slurm .sh script runs successfully from the macOS terminal or directly on the HPC.
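
For reference, a sketch of what that submission could look like on the R side, capturing the job ID from sbatch's "Submitted batch job" message (the script path is the placeholder from above):

# Submit the Slurm script over an open {ssh} connection and capture the job ID
res    <- ssh_exec_internal(con, "sbatch /path/to/slurm/script/submit_slurm_job.sh")
stdout <- rawToChar(res$stdout)        # e.g. "Submitted batch job 378493"
job_id <- regmatches(stdout, regexpr("[0-9]+", stdout))
cat(stdout)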

Interactive sessions

I think I understand the source of the problem, though not the solution. The important part is whether the session is interactive. If you do ssh server then execute a command there, you are in an interactive session. If you do ssh server 'command', you are in a non-interactive session. You can circumvent that by "cheating" with ssh server 'bash -ic "command"' to create an interactive session within the non-interactive ssh session.

Why does it matter? Because interactive sessions, but not non-interactive sessions, start by reading ~/.bashrc. And I suspect the module command is defined in that file (see below), as well as some modifications to the $PATH variable. And the R package {ssh}, which uses libssh, creates a non-interactive session (and while libssh seems to have the possibility of creating interactive sessions, I don't think the R package implements it).

So, here we are, I'm not sure about the solution. You can try ssh_exec_wait(con, "bash -ic 'module avail python'"); I expect you will get these two warnings:

bash: cannot set terminal process group (-1): Inappropriate ioctl for device
bash: no job control in this shell

followed by a list of available python modules, while if you remove the -i switch, you'd get an error of module: command not found.
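
Applied to the original pipeline, that workaround might look like the sketch below (the paths and library name are the placeholders from the first post); the -i forces bash to read ~/.bashrc, which is presumably where module is defined:

# Force an interactive bash so ~/.bashrc (and therefore `module`) is loaded,
# and chain the two pipeline commands into a single remote invocation.
remote_cmd <- "cd folder/project_folder/with_cromwell && sh ./execute_from_shiny.sh input_library_name"
ssh_exec_wait(con, paste0("bash -ic ", shQuote(remote_cmd)))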

But I don't think this is a good solution in the long term; maybe the best thing in that case would be to ask your HPC administrators. Though see below for thoughts on Slurm.

Module

Argh, then I don't know which software is powering the module command on your HPC. You can try module --version with some luck, or see if this information is somewhere in the module help output. Your HPC administrators would know too, and maybe wrote it in the docs if they have a website.

It could be useful to know, especially to understand how the module command is defined, and maybe check the documentation of the implementation to see if they explain a way to use it in non-interactive sessions.

SLURM

The previous parts of my answer were assuming you were not using Slurm. But this is different: as far as I know, sbatch creates a new session when it runs a script (which is a non-interactive session, but of course will know about module through some other means). So I suspect the error is not due to module.

You can try diagnosing the problem this script encounters with sacct -j 378493 and see the ExitCode. If you didn't specify otherwise, it should have created a log file slurm-378493.out in the working directory (that you can get with pwd).
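
Both checks could be done over the same {ssh} connection, for example (job ID as in the post above):

# Inspect the failed job and its default log file (adjust the log path to the
# directory the job was submitted from)
ssh_exec_wait(con, "sacct -j 378493 --format=JobID,State,ExitCode")
ssh_exec_wait(con, "cat slurm-378493.out")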


Thank you for the detailed explanation; I'm learning something new. Slurm is something I tried after seeing your post, thinking that might work. Thank you.
I spoke with our HPC admins. They discouraged connecting to their HPC via third-party browser/login interfaces to submit jobs, and they also don't support hosting Shiny.

Right, it does seem like something that an HPC admin would be wary of.

In itself, that's not a big problem: you could have a Shiny app running on one server and connecting to another server to run the computations. For example, a Shiny app hosted on shinyapps.io that calls on AWS to do the heavy lifting (just be aware that these cloud providers often have a "pay for what you use" approach; the price can increase fast if you're not careful).


While the above solution may eventually work, I would also like to point out that you could look into alternative approaches for your problem, e.g. using clustermq, or, if you want to go down the future route, a nested future with ssh and future.batchtools.

A simple example for clustermq + ssh can be found at GitHub - michaelmayer2/penguins-hpc: Penguins glm goes HPC.
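
For orientation, a minimal clustermq-over-SSH sketch, assuming clustermq is also installed and configured on the cluster (the host is a placeholder; see the clustermq user guide for the remote-side setup):

library(clustermq)

# The local session tunnels the work to the cluster over SSH
options(
  clustermq.scheduler = "ssh",
  clustermq.ssh.host  = "login_name@portal.univ.edu"  # placeholder host
)

# Trivial round-trip test: run a function remotely over 1:10
result <- Q(function(x) x * 2, x = 1:10, n_jobs = 1)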

