luna.util.jobs module
- class ArgsGenerator(generator, nargs)
Bases: object
Custom generator that implements __len__(). This class can be used with ProgressTracker when the tasks are obtained from a generator. ProgressTracker requires a predefined number of tasks to calculate progress, so a standard generator cannot be used directly because it does not implement __len__(). With ArgsGenerator, one can still use generators with ProgressTracker by explicitly providing the number of tasks that will be generated.
- Parameters
generator (generator) – The tasks generator.
nargs (int) – The number of tasks that will be generated.
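The idea of pairing a generator with an explicit length can be illustrated with a minimal sketch. The class SizedGenerator and the helper make_tasks below are hypothetical names used only for illustration; this is not the ArgsGenerator source, just one way such a wrapper might look.

```python
class SizedGenerator:
    """Hypothetical stand-in: wraps a generator and reports a caller-supplied length."""

    def __init__(self, generator, nargs):
        self.generator = generator
        self.nargs = nargs  # trusted as given; the generator is never counted

    def __iter__(self):
        # Delegate iteration to the wrapped generator.
        return iter(self.generator)

    def __len__(self):
        # Lets progress-tracking code call len() without consuming the tasks.
        return self.nargs


def make_tasks(n):
    """Hypothetical task source: lazily yields one argument tuple per task."""
    for i in range(n):
        yield (i, i * i)


tasks = SizedGenerator(make_tasks(5), nargs=5)
n_tasks = len(tasks)      # works, unlike len() on a plain generator
collected = list(tasks)   # tasks are only produced at this point
```

The length is taken on trust: if nargs does not match the number of items actually generated, any progress computed from it would be off by the same amount.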
- class ParallelJobs(nproc=1)
Bases: object
Executes a set of tasks in parallel (JoinableQueue) or sequentially.
- Parameters
nproc (int or None) – The number of CPUs to use. The default value is the maximum number of CPUs - 1. If nproc is None, 0, or 1, run the jobs sequentially. Otherwise, use the maximum number of CPUs - 1.
- Variables
nproc (int) – The number of CPUs to use.
progress_tracker (ProgressTracker) – A ProgressTracker object to track the tasks’ progress.
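The sequential-versus-parallel behaviour described above can be sketched with a joinable work queue. This is an illustration under stated assumptions, not the ParallelJobs implementation: run_with_workers and add are hypothetical names, and threads with queue.Queue are used in place of processes with multiprocessing.JoinableQueue so that the sketch runs unchanged on any platform. Both queue types provide put(), get(), task_done() and join().

```python
import queue
import threading


def run_with_workers(args, consumer_func, nworkers=None):
    """Hypothetical helper: call consumer_func(*a) for every a in args."""
    # Mirrors the documented rule: None, 0, or 1 means run sequentially.
    if not nworkers or nworkers == 1:
        return [consumer_func(*a) for a in args]

    tasks = queue.Queue()
    results = []
    lock = threading.Lock()

    def worker():
        while True:
            item = tasks.get()
            try:
                if item is None:  # sentinel: no more work for this worker
                    return
                out = consumer_func(*item)
                with lock:
                    results.append(out)
            finally:
                tasks.task_done()  # always mark the item as handled

    threads = [threading.Thread(target=worker, daemon=True) for _ in range(nworkers)]
    for t in threads:
        t.start()
    for item in args:
        tasks.put(item)
    tasks.join()  # block until every queued task has been marked done
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    return results  # completion order, not input order


def add(a, b):
    return a + b


parallel_results = run_with_workers([(1, 2), (3, 4), (5, 6)], add, nworkers=3)
sequential_results = run_with_workers([(1, 2), (3, 4)], add, nworkers=None)
```

In the parallel branch the results arrive in completion order, so callers that need input order would have to carry an index alongside each task.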
- run_jobs(args, consumer_func, output_file=None, proc_output_func=None, output_header=None, job_name=None)
Run a set of tasks in parallel or sequentially, depending on nproc.
- Parameters
args (iterable of iterables, ArgsGenerator) – A sequence of arguments to be provided to the consumer function consumer_func.
consumer_func (function) – The function that will be executed for each set of arguments in args.
output_file (str, optional) – Save outputs to this file. If proc_output_func is not provided, a stringified version of each output is saved. Otherwise, proc_output_func is executed first and its result is written to the output file instead. Note: if proc_output_func is provided but output_file is not, a new random unique filename will be generated and the file will be saved in the current directory.
proc_output_func (function, optional) – Post-processing function that is executed for each output produced by consumer_func.
output_header (str, optional) – A header for the output file.
job_name (str, optional) – A name to identify the job.
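The interplay between output_file, proc_output_func and output_header described above can be sketched as a standalone function. The name write_outputs, the one-record-per-line layout, the newline after the header, and the form of the generated filename are all assumptions made for illustration; the actual run_jobs behaviour may differ in these details.

```python
import os
import tempfile
import uuid


def write_outputs(outputs, output_file=None, proc_output_func=None, output_header=None):
    """Hypothetical sketch of the documented output rules; returns the path written."""
    if output_file is None and proc_output_func is None:
        return None  # nothing was requested, so nothing is saved
    if output_file is None:
        # Assumed naming scheme: a random unique file in the current directory.
        output_file = "%s.output" % uuid.uuid4().hex
    with open(output_file, "w") as fh:
        if output_header is not None:
            fh.write(output_header + "\n")
        for data in outputs:
            if proc_output_func is not None:
                data = proc_output_func(data)  # post-process before saving
            fh.write(str(data) + "\n")  # otherwise a stringified copy is saved
    return output_file


# Usage: write two processed records under a header into a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "results.csv")
    write_outputs([(1, 2), (3, 4)], output_file=path,
                  proc_output_func=lambda pair: "%d,%d" % pair,
                  output_header="a,b")
    with open(path) as fh:
        content = fh.read()
```

Keeping the post-processing step separate from the consumer function means the same consumer can feed different output formats without being changed.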
- Return type