gwcelery.tasks.condor module

Submit and monitor HTCondor jobs [1].

Notes

Internally, we use the XML condor log format [2] for easier parsing.
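As a rough illustration of why the XML format is convenient, the sketch below reads the last event from a job log using only the standard library. The sample log text, the attribute names (MyType, Cluster, ReturnValue), and the helper last_event() are assumptions made for this example; they are not taken from this module, and real HTCondor logs carry many more attributes.

```python
import xml.etree.ElementTree as ET

# Hypothetical log contents: a sequence of <c> ClassAd elements, one per event.
SAMPLE_LOG = """\
<c>
    <a n="MyType"><s>SubmitEvent</s></a>
    <a n="Cluster"><i>42</i></a>
</c>
<c>
    <a n="MyType"><s>JobTerminatedEvent</s></a>
    <a n="Cluster"><i>42</i></a>
    <a n="ReturnValue"><i>0</i></a>
</c>
"""


def last_event(log_text):
    """Return the attributes of the most recent event as a dict."""
    # The log has no single root element, so wrap it in one before parsing.
    root = ET.fromstring("<classads>" + log_text + "</classads>")
    event = root.findall("c")[-1]
    attrs = {}
    for a in event.findall("a"):
        value = a[0]  # the typed child element, e.g. <s> or <i>
        attrs[a.get("n")] = int(value.text) if value.tag == "i" else value.text
    return attrs


event = last_event(SAMPLE_LOG)
print(event["MyType"], event["ReturnValue"])
```

A plain-text log would need hand-written pattern matching for the same lookup, which is presumably the parsing burden the XML format avoids.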

References

[1] http://research.cs.wisc.edu/htcondor/manual/latest/condor_submit.html
[2] http://research.cs.wisc.edu/htcondor/classad/refman/node3.html

exception gwcelery.tasks.condor.JobAborted

Bases: Exception

Raised if an HTCondor job was aborted (e.g. by condor_rm).

exception gwcelery.tasks.condor.JobRunning

Bases: Exception

Raised if an HTCondor job is still running.

exception gwcelery.tasks.condor.JobFailed(returncode, cmd, output=None, stderr=None)

Bases: subprocess.CalledProcessError

Raised if an HTCondor job fails.
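Because JobFailed derives from subprocess.CalledProcessError, code that already handles failed subprocesses can catch it without change. The sketch below uses a local stand-in class with the same base and constructor signature so that it runs without gwcelery installed; describe_failure() is a hypothetical helper introduced for this example.

```python
import subprocess


class JobFailed(subprocess.CalledProcessError):
    """Local stand-in for gwcelery.tasks.condor.JobFailed (illustration only)."""


def describe_failure(exc):
    """Summarize a failure using attributes inherited from CalledProcessError."""
    return "{} exited with status {}".format(" ".join(exc.cmd), exc.returncode)


try:
    # Simulate a failed job; a real failure would be raised by the task itself.
    raise JobFailed(1, ["sleep", "10"], output="", stderr="boom")
except subprocess.CalledProcessError as exc:
    # Catching the base class also catches JobFailed.
    message = describe_failure(exc)
    caught_stderr = exc.stderr

print(message)
```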

(task)gwcelery.tasks.condor.submit(submit_file, log=None)

Submit a job using HTCondor.

Parameters:
  • submit_file (str) – Path of the submit file.
  • log (str) – Used internally to track job state. Caller should not set.
Raises:
  • JobAborted – If the job was aborted (e.g. by running condor_rm).
  • JobFailed – If the job terminates and returns a nonzero exit code.
  • JobRunning – If the job is still running. Causes the task to be re-queued until the job is complete.

Example

>>> submit.s('example.sub')
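
The re-queueing behavior described under JobRunning can be pictured as a polling loop: each attempt either returns a result or raises JobRunning, in which case the attempt is repeated. The sketch below simulates that pattern in plain Python. It is not how Celery schedules retries; JobRunning here is a local stand-in, and make_poller() and run_until_complete() are hypothetical names introduced for this example.

```python
class JobRunning(Exception):
    """Local stand-in for gwcelery.tasks.condor.JobRunning (illustration only)."""


def make_poller(polls_until_done):
    """Build a fake status check that succeeds on the given poll number."""
    state = {"polls": 0}

    def poll():
        state["polls"] += 1
        if state["polls"] < polls_until_done:
            raise JobRunning("job is still running")
        return state["polls"]

    return poll


def run_until_complete(poll, max_retries=10):
    """Call poll() again each time it raises JobRunning."""
    for _ in range(max_retries):
        try:
            return poll()
        except JobRunning:
            # A real worker would wait before trying again.
            continue
    raise RuntimeError("job did not finish within max_retries attempts")


result = run_until_complete(make_poller(3))
print(result)
```

In this simulation the third poll succeeds, so the loop returns after two JobRunning exceptions.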
(task)gwcelery.tasks.condor.check_output(args, log=None, error=None, output=None, **kwargs)

Call a process using HTCondor.

Call an external process using HTCondor, in a manner patterned after subprocess.check_output(). If successful, return its output on stdout; on failure, raise an exception.

Parameters:
  • args (list) – Command line arguments, as if passed to subprocess.check_call().
  • log, error, output (str) – Used internally to track job state. Caller should not set.
  • **kwargs – Extra submit description file commands. See the documentation for condor_submit for possible values.
Returns:
  Captured output from command.
Return type:
  str
Raises:
  • JobAborted – If the job was aborted (e.g. by running condor_rm).
  • JobFailed – If the job terminates and returns a nonzero exit code.
  • JobRunning – If the job is still running. Causes the task to be re-queued until the job is complete.

Example

>>> check_output.s(['sleep', '10'],
...                accounting_group='ligo.dev.o3.cbc.explore.test')
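
To make the role of **kwargs concrete, the sketch below shows one way a command and extra keyword arguments could be rendered as a submit description file. This is an assumption about the general shape of such a file, not the code this module uses; render_submit_description() is a hypothetical helper, and it ignores HTCondor's argument quoting rules.

```python
def render_submit_description(args, **kwargs):
    """Render a command and extra submit commands as 'key = value' lines."""
    executable, *arguments = args
    commands = {
        "executable": executable,
        # Naive join; real argument strings may need HTCondor-specific quoting.
        "arguments": " ".join(arguments),
    }
    # Each extra keyword becomes one submit description file command.
    commands.update(kwargs)
    lines = ["{} = {}".format(key, value) for key, value in commands.items()]
    lines.append("queue 1")
    return "\n".join(lines)


text = render_submit_description(
    ["sleep", "10"], accounting_group="ligo.dev.o3.cbc.explore.test"
)
print(text)
```

Under these assumptions, the accounting_group keyword from the example above ends up as a single line in the generated description, alongside the executable and its arguments.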