Today’s post is documentation that I am writing while I learn how the
Durable Task Plug-in
for Jenkins works. The durable-task-plugin is not really a plug-in. It is a
“library offering an extension point for processes which can run outside of Jenkins
yet be monitored”.
But how does it monitor a process running outside of Jenkins? Well, let’s
examine its source code.
This library provides a high-level API for controlling long-running tasks, and
provides shell and Windows examples that execute something like
nohup sh -c script.sh > log.txt 2>&1; echo $? > result.txt. The library then appends
the contents of log.txt to another OutputStream and returns the exit code.
The DurableTask is an extension point, and represents a “task which may
be run asynchronously on a build node and withstand disconnection of the slave agent”.
The only interesting method is launch(…), which is called to launch
our durable task (whichever it may be). It receives the environment variables,
a reference to the workspace, a launcher and a listener (logger).
Looking at its relationships, we can see that FileMonitoringTask extends
it, and that it utilizes two other classes, and is utilized by one
class (DurableTaskDescriptor for generics).
The launch(…) method returns a Controller object, our next topic.
Once your durable task has been submitted, the Controller returned by the
DurableTask#launch(…) method will define how to control its execution. The
method names in this abstract class are quite intuitive.
Let’s say your task is running and will take a while to complete. The
controller provides the writeLog(…) method, which is responsible for
checking whether your task has produced any new output and keeping track of it.
The exitStatus(…) method can be used to retrieve the exit status. But be wary, as
it can return null if the process is still running.
The stop(…) method tries to stop the running task, and cleanup(…) is called
afterwards to close any open resources.
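To make the interplay of these four methods more concrete, here is a minimal sketch of a polling loop that a caller might run. It is not the library's code: SimpleController is a hypothetical, simplified stand-in for the Controller class (the real methods also receive the workspace), and FakeController only pretends to be a task that prints a few lines and then exits with 0.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

class PollingDemo {
    // Polls the controller until an exit status is available, copying output as it appears.
    static int waitFor(SimpleController c, OutputStream sink, long pauseMillis)
            throws IOException, InterruptedException {
        try {
            Integer status;
            while ((status = c.exitStatus()) == null) {
                // Only sleep when the task produced nothing new since the last poll.
                if (!c.writeLog(sink)) {
                    Thread.sleep(pauseMillis);
                }
            }
            c.writeLog(sink); // pick up anything written just before the task exited
            return status;
        } finally {
            c.cleanup();
        }
    }

    public static void main(String[] args) throws Exception {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        int status = waitFor(new FakeController("step 1", "step 2"), sink, 10);
        System.out.print(new String(sink.toByteArray(), StandardCharsets.UTF_8));
        System.out.println("exit status: " + status);
    }
}

// Hypothetical, simplified stand-in for the library's Controller.
interface SimpleController {
    boolean writeLog(OutputStream sink) throws IOException; // true if new output was copied
    Integer exitStatus();                                   // null while the task is running
    void stop();
    void cleanup();
}

// Fake task used only for illustration: emits one line per poll, then exits with 0.
class FakeController implements SimpleController {
    private final String[] lines;
    private int next = 0;
    boolean cleanedUp = false;

    FakeController(String... lines) {
        this.lines = lines;
    }

    public boolean writeLog(OutputStream sink) throws IOException {
        if (next >= lines.length) {
            return false;
        }
        sink.write((lines[next++] + "\n").getBytes(StandardCharsets.UTF_8));
        return true;
    }

    public Integer exitStatus() {
        return next < lines.length ? null : 0;
    }

    public void stop() {
        next = lines.length; // pretend the task was killed
    }

    public void cleanup() {
        cleanedUp = true;
    }
}
```

The null check on exitStatus() is the important part: a null means "still running", not "failed", so the loop keeps draining the log until a real status shows up.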
In the library the controller is only used in the abstract DurableTask and
in the sample FileMonitoringTask.
“A task which forks some external command and then waits for log and status
files to be updated/created”. That’s the description of the FileMonitoringTask.
Good use cases include those long-running shell scripts that output some data
for each folder/file processed, or that script that executes Python, R, Lua and
what not and tries to log to a text file what is going on. Let’s take a look at
some of its methods.
In FileMonitoringTask, the launch(…) method only adds a special environment
variable (which is out of the scope of this post) and calls the
doLaunch(…) method. The doLaunch(…) method is abstract and returns an
instance of FileMonitoringController.
Looking at the relationships of the FileMonitoringTask we can confirm that
it extends DurableTask, and that it utilizes the Controller class. And we can
also see that there are two implementations in the durable-task-plugin: BourneShellScript
and WindowsBatchScript.
But let’s take a look first at the FileMonitoringController returned by the
FileMonitoringTask#doLaunch(…) method, and then conclude by looking at the BourneShellScript.
The FileMonitoringController is a static inner class of FileMonitoringTask, and is
returned by FileMonitoringTask#doLaunch(…).
The constructor of the FileMonitoringController calls controlDir(…). This method is
responsible for creating a working directory for this controller. The directory
is created within the workspace and its name is a random string (id).
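As a rough illustration of that idea, the sketch below creates a randomly named directory inside a workspace. The naming scheme (a leading dot plus the first eight characters of a UUID) and the names ControlDirDemo and controlDir are assumptions made for this example, not what the library does. The real controller also remembers its id so that it can find the same directory again later, which this sketch skips.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.UUID;

class ControlDirDemo {
    // Hypothetical naming scheme: a hidden directory named after a short random id.
    static Path controlDir(Path workspace) throws IOException {
        String id = UUID.randomUUID().toString().substring(0, 8);
        return Files.createDirectories(workspace.resolve("." + id));
    }

    public static void main(String[] args) throws IOException {
        Path workspace = Files.createTempDirectory("workspace");
        System.out.println(controlDir(workspace));
    }
}
```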
When writeLog(…) is called, it first calls getLogFile(…), which returns the file
named “jenkins-log.txt” in the controller’s directory inside the workspace. With the
log file in hand, writeLog(…) copies any new task output from it to the given
OutputStream and stores the position it reached (lastLocation). When
invoked again, writeLog(…) uses lastLocation to check whether the log has
changed.
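The lastLocation bookkeeping can be sketched in a few lines. This is a simplified illustration rather than the library's code: LogTailer is a hypothetical name, it works on a plain java.nio Path instead of a Jenkins workspace, and it reads all new bytes into memory in one go, which would be a poor choice for very large logs.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.RandomAccessFile;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

class LogTailer {
    private long lastLocation = 0; // offset of the first byte not yet copied

    // Copies bytes appended to logFile since the previous call into sink.
    // Returns true if anything new was copied.
    boolean writeLog(Path logFile, OutputStream sink) throws IOException {
        if (!Files.exists(logFile)) {
            return false;
        }
        try (RandomAccessFile raf = new RandomAccessFile(logFile.toFile(), "r")) {
            long length = raf.length();
            if (length <= lastLocation) {
                return false; // nothing new since the last call
            }
            raf.seek(lastLocation);
            byte[] buffer = new byte[(int) (length - lastLocation)];
            raf.readFully(buffer);
            sink.write(buffer);
            lastLocation = length;
            return true;
        }
    }

    public static void main(String[] args) throws IOException {
        Path log = Files.createTempFile("jenkins-log", ".txt");
        LogTailer tailer = new LogTailer();
        ByteArrayOutputStream sink = new ByteArrayOutputStream();

        Files.write(log, "first\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(tailer.writeLog(log, sink)); // new output was copied
        System.out.println(tailer.writeLog(log, sink)); // nothing new this time

        Files.write(log, "second\n".getBytes(StandardCharsets.UTF_8), StandardOpenOption.APPEND);
        System.out.println(tailer.writeLog(log, sink)); // only the appended part is copied
        System.out.print(new String(sink.toByteArray(), StandardCharsets.UTF_8));
        Files.delete(log);
    }
}
```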
Finally, the exitStatus(…) method relies on getResultFile(…) to locate
$WORKSPACE/<control-dir>/jenkins-result.txt. If the file exists, exitStatus(…)
returns the exit status written in it; otherwise it returns null.
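The "file exists means finished" convention is easy to reproduce. The following is a minimal sketch under my own assumptions: ResultFile is a hypothetical helper, it uses a plain Path, and treating an empty file as "still running" is a choice made for this example (the file could be observed between its creation and the write of the exit code).

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class ResultFile {
    // Returns the exit code stored in resultFile, or null if the file is missing
    // or still empty (assumption: the task has not finished writing it yet).
    static Integer exitStatus(Path resultFile) throws IOException {
        if (!Files.exists(resultFile)) {
            return null;
        }
        String text = new String(Files.readAllBytes(resultFile), StandardCharsets.UTF_8).trim();
        if (text.isEmpty()) {
            return null;
        }
        return Integer.valueOf(text); // throws NumberFormatException on unexpected content
    }

    public static void main(String[] args) throws IOException {
        Path controlDir = Files.createTempDirectory("control-dir");
        Path result = controlDir.resolve("jenkins-result.txt");
        System.out.println(exitStatus(result)); // no file yet, so the task counts as running
        Files.write(result, "0\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(exitStatus(result)); // the stored exit code
    }
}
```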
The BourneShellScript extends the FileMonitoringTask. What it does is run:
nohup sh -c '/var/lib/jenkins/.../script.sh' > jenkins-log.txt 2>&1; echo $? > jenkins-result.txt
This can be translated as: run script.sh, which will output to jenkins-log.txt, and after
it is completed, echo the exit code to jenkins-result.txt. Now go back and read about the
I told you BourneShellScript was simple. Its constructor receives a script String parameter
that contains the script body (returned by getScript()). The doLaunch(…) method
is responsible for all the boilerplate code to execute the aforementioned nohup sh -c command and
redirect the output streams to the right files.
Your long-running task will feed jenkins-log.txt, and once it is completed the
BourneShellScript will know because jenkins-result.txt will exist, and hopefully it will
contain the exit code of your task.
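To see the two files being produced, the whole trick can be tried outside of Jenkins with a plain ProcessBuilder. This is only a sketch under a few assumptions: WrapperDemo and run are hypothetical names, sh and nohup must be available on the PATH, the script path is passed as $0 to avoid quoting problems (a small variation on the command shown earlier), and the demo simply waits for the wrapper to finish instead of polling for jenkins-result.txt as a durable task would.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

class WrapperDemo {
    // Runs scriptFile through a nohup wrapper inside controlDir and returns the
    // exit code that the wrapper stored in jenkins-result.txt.
    static int run(Path controlDir, Path scriptFile) throws IOException, InterruptedException {
        // "$0" expands to the script path given as the extra argument below.
        String wrapper = "nohup sh \"$0\" > jenkins-log.txt 2>&1; echo $? > jenkins-result.txt";
        Process process = new ProcessBuilder("sh", "-c", wrapper, scriptFile.toString())
                .directory(controlDir.toFile())
                .start();
        process.waitFor(); // simplification: a durable task would poll for the result file
        byte[] result = Files.readAllBytes(controlDir.resolve("jenkins-result.txt"));
        return Integer.parseInt(new String(result, StandardCharsets.UTF_8).trim());
    }

    public static void main(String[] args) throws Exception {
        Path controlDir = Files.createTempDirectory("control-dir");
        Path script = controlDir.resolve("script.sh");
        Files.write(script, "echo working\nexit 3\n".getBytes(StandardCharsets.UTF_8));

        System.out.println("exit status: " + run(controlDir, script));
        byte[] log = Files.readAllBytes(controlDir.resolve("jenkins-log.txt"));
        System.out.print(new String(log, StandardCharsets.UTF_8));
    }
}
```

Note that the exit code ends up in the file, not in the status of the outer sh process, which is exactly why the task can be picked up again after a disconnection.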