Hi, I am integrating RStudio Workbench with Slurm, following the guide "Integrating RStudio Workbench with Slurm" in the RStudio documentation, but I can't submit a job to the Slurm cluster.
The Slurm cluster itself is working. Here are the logs I collected.
On the RStudio Workbench node:
[ec2-user@ip-10-0-0-147 ~]$ sudo rstudio-server verify-installation --verify-user=test
TTY detected. Printing informational message about logging configuration. Logging configuration loaded from '/etc/rstudio/logging.conf'. Logging to '/var/log/rstudio/rstudio-server/rserver.log'.
Checking Job Launcher configuration...
Ensuring server-user is a Job Launcher admin...
Getting list of Job Launcher clusters...
Job launcher configured with the following clusters: Slurm
launcher-adhoc-clusters is empty - all clusters may be used to launch adhoc jobs
launcher-sessions-clusters is empty - all clusters may be used to launch session jobs
Launched R session job for cluster Slurm
Waiting for job to run...
Verify Installation Failed: system error 71 (Protocol error) [description: Job transitioned to unexpected terminal state: Failed]; OCCURRED AT rstudio::server::session_proxy::overlay::{anonymous}::waitForJob(const JobPtr&)::<lambda(const string&, const string&)> src/cpp/server/JobLauncherVerification.cpp:606
2022-04-21T14:48:24.726230Z [rserver] ERROR system error 71 (Protocol error) [description: Job transitioned to unexpected terminal state: Failed]; OCCURRED AT rstudio::server::session_proxy::overlay::{anonymous}::waitForJob(const JobPtr&)::<lambda(const string&, const string&)> src/cpp/server/JobLauncherVerification.cpp:606; LOGGED FROM: int main(int, char* const*) src/cpp/server/ServerMain.cpp:845
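To check whether the problem is in the Launcher or in Slurm itself, I think a plain `sbatch` submission with the same resource request, run as the same test user, would be a useful comparison. A hypothetical sketch (partition name and memory copied from the job details further below; adjust to your site):

```shell
# Write a minimal batch script mirroring what the Launcher requested
# (1 task, 1 CPU, 512M on partition c5n - assumptions from the scontrol output).
cat > /tmp/launcher-repro.sbatch <<'EOF'
#!/bin/bash
#SBATCH --job-name=launcher-repro
#SBATCH --partition=c5n
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
#SBATCH --mem=512M
hostname
EOF

# Guarded so the script is harmless on a machine without Slurm client tools.
if command -v sbatch >/dev/null 2>&1; then
  sudo -u test sbatch /tmp/launcher-repro.sbatch
else
  echo "sbatch not found - run this on the Workbench/submit node"
fi
```

If this job also fails with BadConstraints, the issue is in the Slurm configuration rather than in Workbench.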
On the Slurm side, the failed job shows:
JobId=33 JobName=[RStudio Launcher] Session 0da29cfc78a31a872eaeb (test) - Session Test Job
UserId=test(1001) GroupId=test(1001) MCS_label=N/A
Priority=4294901728 Nice=0 Account=(null) QOS=(null)
JobState=FAILED Reason=BadConstraints Dependency=(null)
Requeue=0 Restarts=0 BatchFlag=1 Reboot=0 ExitCode=0:1
RunTime=00:00:00 TimeLimit=UNLIMITED TimeMin=N/A
SubmitTime=2022-04-21T14:48:24 EligibleTime=2022-04-21T14:48:24
AccrueTime=Unknown
StartTime=2022-04-21T14:48:24 EndTime=2022-04-21T14:48:24 Deadline=N/A
SuspendTime=None SecsPreSuspend=0 LastSchedEval=2022-04-21T14:48:24
Partition=c5n AllocNode:Sid=ip-10-0-0-147:25810
ReqNodeList=(null) ExcNodeList=(null)
NodeList=(null)
NumNodes=1 NumCPUs=1 NumTasks=1 CPUs/Task=1 ReqB:S:C:T=0:0:*:*
TRES=cpu=1,mem=512M,node=1,billing=1
Socks/Node=* NtasksPerN:B:S:C=0:0:*:* CoreSpec=*
MinCPUsNode=1 MinMemoryNode=512M MinTmpDiskNode=0
Features=(null) DelayBoot=00:00:00
OverSubscribe=OK Contiguous=0 Licenses=(null) Network=(null)
Command=(null)
WorkDir=/
StdErr=/shared/slurm-data/Session 0da29cfc78a31a872eaeb (test) - Session Test Job-33.err
StdIn=/dev/null
StdOut=/shared/slurm-data/Session 0da29cfc78a31a872eaeb (test) - Session Test Job-33.out
Power=
NtasksPerTRES:0
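From what I've read, `Reason=BadConstraints` usually means the job's request (here cpu=1, mem=512M on partition c5n) conflicts with a partition or node limit, such as MaxMemPerNode or the nodes' configured RealMemory. These are the diagnostic commands I've been trying (hypothetical, guarded so they are harmless on a machine without Slurm tools):

```shell
# Partition from the scontrol output above.
PARTITION=c5n

if command -v scontrol >/dev/null 2>&1; then
  # Compare partition limits (MaxMemPerNode, AllowGroups, ...) with the request.
  scontrol show partition "$PARTITION"
  # Per-node CPU and memory actually configured for the partition.
  sinfo -N -l -p "$PARTITION"
else
  echo "Slurm client tools not found - run this on a submit node"
fi

# The job's stderr file (note the spaces in the filename) may hold the real error:
ls -l /shared/slurm-data/ 2>/dev/null || true
```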
I can't find the cause of the failure. Any tips would be appreciated.