PYME.cluster.rules module

Refactored rule pushing. Introduces rule classes which act as Python proxies for the JSON rule objects, and rule factories for use when constructing equivalent rules (or chains of rules) for multiple series.

Design principles as follows:

  • a Rule object is a 1:1 mapping with rules on the ruleserver

  • you create a new Rule object for each rule you push to the server

  • a pattern for rule creation is expressed using a RuleFactory

  • calling `.get_rule()` on the first step returns a fully linked rule suitable for submission

  • very limited inference of inputs etc … between steps - rely on specifying inputs and outputs using patterns instead

Examples

>>> step1 = LocalisationRuleFactory(analysisMetadata=mdh)
>>> step2 = RecipeRuleFactory(recipeURI='PYME-CLUSTER///RECIPES/render_image.yaml', input_patterns={'input':'{{spool_dir}}/analysis/{{series_stub}}.h5r'})
>>> step1.chain(step2)
>>> step3 = RecipeRuleFactory(recipeURI='PYME-CLUSTER///RECIPES/measure_blobs.yaml', input_patterns={'input':'{{spool_dir}}/analysis/{{series_stub}}.tif'})
>>> step2.chain(step3)

or

>>> step1 = LocalisationRuleFactory(analysisMetadata=mdh,
>>>                                 on_completion=RecipeRuleFactory(recipeURI='PYME-CLUSTER///RECIPES/render_image.yaml',
>>>                                                                 input_patterns={'input':'{{spool_dir}}/analysis/{{series_stub}}.h5r'},
>>>                                                                 on_completion=RecipeRuleFactory(recipeURI='PYME-CLUSTER///RECIPES/measure_blobs.yaml',
>>>                                                                                                 input_patterns={'input':'{{spool_dir}}/analysis/{{series_stub}}.tif'})))

then:

>>> def on_launch_analysis(context={'spool_dir': ..., 'series_stub': ...}):
>>>    step1.get_rule(context=context).push()
class PYME.cluster.rules.LocalisationRule(seriesName, analysisMetadata, resultsFilename=None, startAt=0, dataSourceModule=None, serverfilter='runnervm0kj6c', **kwargs)

Bases: Rule

Create a new rule. Sub-classed for specific rule types

Parameters
on_completion : RuleFactory instance

A rule to run after this one has completed (also settable using the .chain() method)

kwargs : any additional arguments, available in the context used when creating the rule template
property complete

Is this rule complete, or do we need to poll for more input?

Over-ridden in LocalisationRule.

property data_complete

Is the underlying data complete?

get_new_tasks()

Over-ridden in rules where all the data is not guaranteed to be present when the rule is created

Returns
release_start, release_end : the indices of the starting and ending tasks to release
on_data_complete()

Over-ride in derived rules so that, e.g. events can be written at the end of a real-time acquisition

prepare()

Do any setup work - e.g. uploading metadata required before the rule is triggered

Returns
post_args : dict

a dictionary with arguments to pass to RulePusher._post_rule() - specifically timeout, max_tasks, release_start, release_end

property total_frames
class PYME.cluster.rules.LocalisationRuleFactory(**kwargs)

Bases: RuleFactory

See LocalisationRule for full initialization arguments. Required kwargs are:

seriesName : str
analysisMetadata : PYME.IO.MetaDataHandler.MDHandlerBase

exception PYME.cluster.rules.NoNewTasks

Bases: Exception

class PYME.cluster.rules.RecipeRule(recipe=None, recipeURI=None, output_dir=None, **kwargs)

Bases: Rule

Create a recipe rule

Parameters
recipe : str or recipes.modules.ModuleCollection

The recipe as YAML text or as a recipe instance (alternatively provide recipeURI)

recipeURI : str

A cluster URI for the recipe text (if recipe not provided directly)

output_dir : str

The directory to put the recipe output TODO: should this be templated based on context?

kwargs : dict

Parameters for the base Rule class (notably on_completion which, if provided, should be a RuleFactory instance). Additional parameters not consumed by Rule are accessible in the context used for creating rule templates.

One (and only one) of the following keyword parameters should be provided to specify the recipe inputs:
inputs : dict

keys are recipe namespace keys, values are either lists of file URIs, or globs which will be expanded to a list of URIs. Corresponds to the inputsByTask property in the recipe description (see ruleserver docs). inputsByTask will be used by the server to populate the inputs for individual tasks

input_templates : dict

similar to inputs, except that dictionary substitution with the rule context is performed on the values before they are written to inputsByTask

rule_outputs : dict

used with chained recipes, this is the outputs of the previous recipe step. Unlike inputs and input_templates this is a dict of str, rather than of list (or glob-implied list) and is written directly to the “inputs” section of the template, rather than to the inputsByTask property of the rule, short-circuiting server side input filling.

TODO - support for templated recipes? Subclass?
prepare()

Do any setup work - e.g. uploading metadata required before the rule is triggered

Returns
post_args : dict

a dictionary with arguments to pass to RulePusher._post_rule() - specifically timeout, max_tasks, release_start, release_end

class PYME.cluster.rules.RecipeRuleFactory(**kwargs)

Bases: RuleFactory

Create a new rule factory. Sub-classed for specific rule types

Parameters
on_completion : RuleFactory instance

A rule to run after this one has completed (also settable using the .chain() method)

kwargs : any additional arguments (ignored in the base class)
class PYME.cluster.rules.Rule(on_completion=None, **kwargs)

Bases: object

Create a new rule. Sub-classed for specific rule types

Parameters
on_completion : RuleFactory instance

A rule to run after this one has completed (also settable using the .chain() method)

kwargs : any additional arguments, available in the context used when creating the rule template
cleanup()
property complete

Is this rule complete, or do we need to poll for more input?

Over-ridden in LocalisationRule.

property data_complete

Is the underlying data complete?

get_new_tasks()

Over-ridden in rules where all the data is not guaranteed to be present when the rule is created

Returns
release_start, release_end : the indices of the starting and ending tasks to release
join()
on_data_complete()

Over-ride in derived rules so that, e.g. events can be written at the end of a real-time acquisition

property output_files
prepare()

Do any setup work - e.g. uploading metadata required before the rule is triggered

Returns
post_args : dict

a dictionary with arguments to pass to RulePusher._post_rule() - specifically timeout, max_tasks, release_start, release_end

push()
status()
watch()
class PYME.cluster.rules.RuleFactory(on_completion=None, rule_class=<class 'PYME.cluster.rules.Rule'>, **kwargs)

Bases: object

Create a new rule factory. Sub-classed for specific rule types

Parameters
on_completion : RuleFactory instance

A rule to run after this one has completed (also settable using the .chain() method)

kwargs : any additional arguments (ignored in the base class)
chain(on_completion)

Chain a rule (as an alternative to passing on_completion to the rule constructor); this allows us to create rule chains from front to back rather than back to front

Parameters
on_completion : RuleFactory instance

The rule to run after this one has completed

get_rule(context)

Populate a rule using series-specific info from context. Note that the rule class initialization arguments should be passed in the RuleFactory initialization as kwargs, but can also be passed here in context if, e.g., the series name is not known when creating the factory.

Parameters
context : dict

a dictionary containing series-specific info to populate into the task template

Returns
Rule

a rule suitable for submitting to the ruleserver /add_integer_id_rule endpoint

property rule_type
class PYME.cluster.rules.RuleGroupWatcher

Bases: object

Single object / thread to watch multiple rules

Used when launching rules from Jupyter notebooks; this will show an updating display of rule progress

register()
watch(rule)
class PYME.cluster.rules.RuleWatcher

Bases: object

Single object / thread to watch multiple rules

watch(rule)
class PYME.cluster.rules.SpoolLocalLocalizationRule(spooler, seriesName, analysisMetadata, resultsFilename=None, startAt=0, serverfilter='runnervm0kj6c', **kwargs)

Bases: LocalisationRule

Create a new rule. Sub-classed for specific rule types

Parameters
on_completion : RuleFactory instance

A rule to run after this one has completed (also settable using the .chain() method)

kwargs : any additional arguments, available in the context used when creating the rule template
property data_complete

Is the underlying data complete?

on_data_complete()

Over-ride in derived rules so that, e.g. events can be written at the end of a real-time acquisition

property total_frames
class PYME.cluster.rules.SpoolLocalLocalizationRuleFactory(**kwargs)

Bases: RuleFactory

See SpoolLocalLocalizationRule for full initialization arguments. Required kwargs are:

seriesName : str
analysisMetadata : PYME.IO.MetaDataHandler.MDHandlerBase

PYME.cluster.rules.verify_cluster_results_filename(resultsFilename)

Checks whether a results file already exists on the cluster, and returns an available version of the results filename. Should be called before writing a new results file.

Parameters
resultsFilename : str

cluster path, e.g. pyme-cluster:///example_folder/name.h5r

Returns
resultsFilename : str

cluster path which may have _# appended to it if the input resultsFilename is already in use, e.g. pyme-cluster:///example_folder/name_1.h5r