See also
- @permutations in the Ruffus Manual
- Decorators for more decorators
@permutations( input, filter, tuple_size, output, [extras,...] )
Purpose:
Generates the permutations of all the elements of a set of input (e.g. A B C D).
The effect is analogous to the python itertools function of the same name:
    >>> from itertools import permutations
    >>> # permutations('ABCD', 2) --> AB AC AD BA BC BD CA CB CD DA DB DC
    >>> [ "".join(a) for a in permutations("ABCD", 2)]
    ['AB', 'AC', 'AD', 'BA', 'BC', 'BD', 'CA', 'CB', 'CD', 'DA', 'DB', 'DC']

Only out-of-date tasks (comparing input and output files) will be run.
Output file names and strings in the extra parameters are generated from the input by string replacement via the formatter() filter. The input can be, for example, a list of file names or the output of upstream tasks. The replacement strings require an extra level of nesting to refer to parsed components.
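As a rough illustration of what the extra level of nesting means, Python's built-in str.format can index nested lists with the same {name[i][j]} syntax. This is a plain-Python sketch, not Ruffus's own implementation: the nested lists and the /data directory are written by hand as assumptions, whereas in a real pipeline formatter() parses these components from the input file names.

```python
# Hand-built stand-ins for the components formatter() would parse (illustrative only).
# First index: which input within the permutation; second index: which file in that input.
basename = [["A", "A"],        # from ['A.1_start', 'A.2_start']
            ["B", "B"]]        # from ['B.1_start', 'B.2_start']
path = [["/data", "/data"],    # hypothetical directory for every file
        ["/data", "/data"]]

output_pattern = "{path[0][0]}/{basename[0][1]}_vs_{basename[1][1]}.permutations"

# str.format resolves path[0][0], basename[0][1] and basename[1][1] by nested indexing
output_name = output_pattern.format(path=path, basename=basename)
print(output_name)   # /data/A_vs_B.permutations
```

A single index such as {basename[0]} would select a whole list of names, not one name, which is why both indices are needed.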
The following example walks through this nesting:
Example:
Calculate the @permutations of A,B,C,D files
If input is four pairs of file names
    input_files = [ [ 'A.1_start', 'A.2_start'],    # 0
                    [ 'B.1_start', 'B.2_start'],    # 1
                    [ 'C.1_start', 'C.2_start'],    # 2
                    [ 'D.1_start', 'D.2_start'] ]   # 3

The first job of:

    @permutations(input_files, formatter(), 2, ...)

will be:

    # Two file pairs at a time
    ['A.1_start', 'A.2_start'],     # 0
    # versus
    ['B.1_start', 'B.2_start'],     # 1
- First level of nesting:
    ['A.1_start', 'A.2_start']      # [0]
    ['B.1_start', 'B.2_start']      # [1]

- Second level of nesting:

    'A.2_start'                     # [0][1]
    'B.2_start'                     # [1][1]

- Parse filename without suffix:

    'A'                             # {basename[0][1]}
    'B'                             # {basename[1][1]}

Python code:
    from ruffus import *
    from ruffus.combinatorics import *

    #   initial file pairs
    @originate([ ['A.1_start', 'A.2_start'],
                 ['B.1_start', 'B.2_start'],
                 ['C.1_start', 'C.2_start'],
                 ['D.1_start', 'D.2_start']])
    def create_initial_files_ABCD(output_files):
        # create each (empty) starting file
        for output_file in output_files:
            with open(output_file, "w") as oo: pass

    #   @permutations
    @permutations(create_initial_files_ABCD,      # Input
                  formatter(),                    # match input files

                  # tuple of 2 at a time
                  2,

                  # Output Replacement string
                  "{path[0][0]}/"
                  "{basename[0][1]}_vs_"
                  "{basename[1][1]}.permutations",

                  # Extra parameter: path for 1st set of files, 1st file name
                  "{path[0][0]}",

                  # Extra parameter
                  ["{basename[0][0]}",  # basename for 1st set of files, 1st file name
                   "{basename[1][0]}",  #              2nd
                   ])
    def permutations_task(input_file, output_parameter, shared_path, basenames):
        print(" - ".join(basenames))

    #
    #       Run
    #
    pipeline_run(verbose=0)

This results in:
    >>> pipeline_run(verbose=0)
    A - B
    A - C
    A - D
    B - A
    B - C
    B - D
    C - A
    C - B
    C - D
    D - A
    D - B
    D - C

Parameters:
- filter = formatter(...)
A formatter indicator object, optionally containing a Python regular expression (re).
- tuple_size = N
Select N elements at a time.
- output = output
Specifies the resulting output file name(s) after string substitution.
- extras = extras
Any extra parameters are passed to the task function; strings in them undergo the same formatter() string replacement as the output.
If you are using named parameters, these can be passed as a list, i.e. extras= [...]
Any extra parameters are consumed by the task function and not forwarded further down the pipeline.
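To get a feel for how tuple_size affects the number of jobs, the sketch below counts ordered selections with itertools. It assumes, following the itertools analogy in the Purpose section, that one job is created per tuple. The single letters stand in for the four file pairs, and n_jobs is a hypothetical helper for illustration, not part of Ruffus.

```python
from itertools import permutations
from math import factorial

inputs = ["A", "B", "C", "D"]   # stand-ins for the four file pairs


def n_jobs(n_inputs, tuple_size):
    """Number of ordered selections of tuple_size items out of n_inputs."""
    return factorial(n_inputs) // factorial(n_inputs - tuple_size)


pairs = list(permutations(inputs, 2))
triples = list(permutations(inputs, 3))

print(len(pairs), n_jobs(4, 2))                    # 12 12
print(len(triples), n_jobs(4, 3))                  # 24 24
# Order matters: both (A, B) and (B, A) appear as separate tuples
print(("A", "B") in pairs, ("B", "A") in pairs)    # True True
```

The count of 12 for tuple_size = 2 matches the twelve lines printed by the example pipeline.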