I really love the jobs feature! It's a game changer for me.
What I like even better is the fact that you can operate it via the rstudioapi package -- in particular the rstudioapi::jobRunScript function.
The main problem I have, though, is that I cannot find a way to kill / stop a job via the API.
There is a stop button in the Jobs tab that does this, but not via the API. I know about the function that sets the job's status to "succeeded" (or any other status), and about the one that removes it from the queue so it is no longer visible. But even if I set the job status to "succeeded" and remove the job from the queue, the process that runs it is still alive, and as far as I can tell the script is still running.
Is there a way to use the API to do the equivalent of clicking the red stop button in the Jobs tab? I really need the job to stop executing -- my use case is a library function that loops indefinitely (it communicates with a server), and I need to stop it after a while, do some processing, and relaunch it. I then need to make sure that the process running the function is dead and that the function is no longer running.
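For reference, roughly what I tried (a sketch; the script name is made up):
job.id <- rstudioapi::jobRunScript("poll_server.R")  # launch the script as a background job
rstudioapi::jobSetState(job.id, "succeeded")         # later: the job now shows as finished in the Jobs tab
rstudioapi::jobRemove(job.id)                        # and disappears from the pane
# ... but the process executing poll_server.R is still alive and the loop keeps running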
PS: In this case I can do without the export-environment option, i.e. without capturing any objects created by this job, since its results are saved to file(s) (e.g. .rds files) and I can use those to communicate the results back. So I don't need any environment to outlive the process.
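Concretely, something like this (the object and file names are only examples):
# inside the job script: persist the results to disk instead of using an export environment
saveRDS(my.results, "job_result.rds")
# back in the main session, once the job has been stopped:
my.results <- readRDS("job_result.rds")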
Great, thanks!
In the meantime I'll probably use the ugly workaround of running a shell script to find and kill the process running the job.
I know, but it works.
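Roughly this (a sketch, assuming the job script is called poll_server.R and that its path appears in the process command line):
pids <- system("pgrep -f poll_server.R", intern = TRUE)  # PIDs whose command line mentions the script
for (pid in pids) system(paste("kill", pid))             # send SIGTERM to each of them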
Everything else w.r.t. job management can be done via the rstudioapi.
I'm not sure whether I would still be able to use the feature of passing in an environment to store newly created objects, as it may not survive the violent kill, but I can do without it; my job barely creates any global variables.
Related question:
Is there an easy way to use the job id returned by the rstudioapi::jobRunScript call to identify, from the Unix command line, the process that runs that job?
I can easily do without it, since the process's command-line description contains the name of the script file, but being able to confirm the match via the job id would be a good sanity check when running such an ugly workaround.
Thanks!
If you're on Linux, one proposal that could work: we could set the job ID as an environment variable, e.g. RSTUDIO_JOB_ID, and you could query processes in the /proc filesystem to find it.
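For example, something along these lines might do the lookup (a sketch only; RSTUDIO_JOB_ID is just the proposed name, so nothing sets it today):
job.id <- "12345"  # whatever jobRunScript returned
hits <- system(sprintf("grep -slz 'RSTUDIO_JOB_ID=%s' /proc/[0-9]*/environ", job.id), intern = TRUE)
pids <- basename(dirname(hits))  # e.g. "67890" -- the PID(s) of the matching job process(es)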
First, just to make sure we're on the same page: what I was asking for is a way to reach the OS process that is running the session of the job I spawned. Assuming we are:
Proposals: I think the most intuitive way for users would be an rstudioapi::getProcessID function, or at least returning the process ID from the jobRunScript call together with the job id, via a list / structure. E.g.
job.info <- rstudioapi::jobRunScript(my.file);
job.info$id # returns the RStudio-internal job id
job.info$ps # returns the OS process id
(If what you meant is a proposal for how to implement it, then I'm not sure, as I do not know the structure RStudio uses to represent a job and its state, but I assume it does have the OS process ID embedded in it.)
Using an environment variable: I assume you mean setting it inside the script that is being run, i.e. grabbing the OS process id of the current session.
How can I get the OS process id that runs my own session? The only thing that comes to mind is using the system command:
system("echo $$")
but I'm not sure how I can grab the output of a system command.
Even if I do get the process id, an environment variable defined from within a system command will not be visible from another shell.
Alternatively maybe I should do:
system("kill $$")
possibly after setting the job status and/or saving a return value in the environment that was passed to the job for its created objects?
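Along those lines, a rough sketch of what I have in mind (note: I think $$ inside system() expands to the subshell's PID rather than the R session's, so getting the PID from R itself may be safer):
# inside the job script, when it is time to stop:
saveRDS(my.results, "job_result.rds")  # hand results back via a file (names are made up)
pid <- Sys.getpid()                    # the PID of the R process running this job
tools::pskill(pid)                     # send SIGTERM to that process, i.e. stop the job from within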