PYME.cluster.rules module¶
Refactored rule pushing. Introduces rule classes which act as a python proxy for the JSON rule objects, and rule factories for use when constructing equivalent rules (or chains of rules) for multiple series.
Design principles as follows:
a Rule object is a 1:1 mapping with rules on the ruleserver
you create a new Rule object for each rule you push to the server
a pattern for rule creation is expressed using a RuleFactory
calling `.get_rule() on the first step returns you a fully linked rule suitable for submitting
very limited inference of inputs etc … between steps - rely on specifying inputs and outputs using patterns instead
Examples¶
>>> step1 = LocalisationRuleFactory(analysisMetadata=mdh)
>>> step2 = RecipeRuleFactory(recipeURI='PYME-CLUSTER///RECIPES/render_image.yaml', input_patterns={'input':'{{spool_dir}}/analysis/{{series_stub}}.h5r'})
>>> step1.chain(step2)
>>> step3 = RecipeRuleFactory(recipeURI='PYME-CLUSTER///RECIPES/measure_blobs.yaml', input_patterns={'input':'{{spool_dir}}/analysis/{{series_stub}}.tif'})
>>> step2.chain(step3)
or
>>> step1 = LocalisationRuleFactory(analysisMetadata=mdh,
>>> on_completion=RecipeRuleFactory(recipeURI='PYME-CLUSTER///RECIPES/render_image.yaml',
>>> input_patterns={'input':'{{spool_dir}}/analysis/{{series_stub}}.h5r'},
>>> on_completion=RecipeRuleFactory(recipeURI='PYME-CLUSTER///RECIPES/measure_blobs.yaml',
>>> input_patterns={'input':'{{spool_dir}}/analysis/{{series_stub}}.tif'})))
then:
>>> def on_launch_analysis(context={'spool_dir': ..., 'series_stub': ...}):
>>> step1.get_rule(context=context).push()
- class PYME.cluster.rules.LocalisationRule(seriesName, analysisMetadata, resultsFilename=None, startAt=0, dataSourceModule=None, serverfilter='runnervm0kj6c', **kwargs)¶
Bases:
RuleCreate a new rule. Sub-classed for specific rule types
- Parameters
- on_completionRuleFactory instance
A rule to run after this one has completed (also setable using the .chain() method)
- kwargsany additional arguments, available in the context used when creating the rule template
- property complete¶
Is this rule complete, or do we need to poll for more input?
Over-ridden in localisation rule Returns ——-
- property data_complete¶
Is the underlying data complete?
- get_new_tasks()¶
Over-ridden in rules where all the data is not gauranteed to be present when the rule is created
- Returns
- release_start, release_endthe indices of starting and ending tasks to release
- on_data_complete()¶
Over-ride in derived rules so that, e.g. events can be written at the end of a real-time acquisition
- prepare()¶
Do any setup work - e.g. uploading metadata required before the rule is triggered
- Returns
- post_argsdict
a dictionary with arguments to pass to RulePusher._post_rule() - specifically timeout, max_tasks, release_start, release_end
- property total_frames¶
- class PYME.cluster.rules.LocalisationRuleFactory(**kwargs)¶
Bases:
RuleFactorySee LocalisationRule for full initialization arguments. Required kwargs are:
:seriesName : str :analysisMetadata : PYME.IO.MetaDataHandler.MDHandlerBase
- exception PYME.cluster.rules.NoNewTasks¶
Bases:
Exception
- class PYME.cluster.rules.RecipeRule(recipe=None, recipeURI=None, output_dir=None, **kwargs)¶
Bases:
RuleCreate a recipe rule
- Parameters
- recipestr or recipes.modules.ModuleCollection
The recipe as YAML text or as a recipe instance (alternatively provide recipeURL)
- recipeURIstr
A cluster URI for the recipe text (if recipe not provided directly)
- output_dirstr
The directory to put the recipe output TODO: should this be templated based on context?
- kwargsdict
Parameters for the base Rule class (notably on_completion, which, if provided, should be a RuleFactory instance). Additional parameters, not consumed by Rule are accessible in the context used for creating rule templates.
- One (and only one) of the following keyword parameters should be provided to specify the recipe inputs:
- inputsdict
keys are recipe namespace keys, values are either lists of file URIs, or globs which will be expanded to a list of URIs. Corresponds to the inputsByTask property in the recipe description (see ruleserver docs). inputsByTask will be used by the server to populate the inputs for individual tasks
- input_templatesdict
simplar to inputs, except that dictionary substitution with the rule context is perfromed on the values before they are written to inputsByTasks
- rule_outputsdict
used with chained recipes, this is the outputs of the previous recipe step. Unlike inputs and input_templates this is a dict of str, rather than of list (or glob-implied list) and is written directly to the “inputs” section of the template, rather than to the inputsByTask property of the rule, short-circuiting server side input filling.
- TODO - support for templated recipes? Subclass?
- prepare()¶
Do any setup work - e.g. uploading metadata required before the rule is triggered
- Returns
- post_argsdict
a dictionary with arguments to pass to RulePusher._post_rule() - specifically timeout, max_tasks, release_start, release_end
- class PYME.cluster.rules.RecipeRuleFactory(**kwargs)¶
Bases:
RuleFactoryCreate a new rule factory. Sub-classed for specific rule types
- Parameters
- on_completionRuleFactory instance
A rule to run after this one has completed (also setable using the .chain() method)
- kwargsany additional arguments (ignored in base class)
- class PYME.cluster.rules.Rule(on_completion=None, **kwargs)¶
Bases:
objectCreate a new rule. Sub-classed for specific rule types
- Parameters
- on_completionRuleFactory instance
A rule to run after this one has completed (also setable using the .chain() method)
- kwargsany additional arguments, available in the context used when creating the rule template
- cleanup()¶
- property complete¶
Is this rule complete, or do we need to poll for more input?
Over-ridden in localisation rule Returns ——-
- property data_complete¶
Is the underlying data complete?
- get_new_tasks()¶
Over-ridden in rules where all the data is not guaranteed to be present when the rule is created
- Returns
- release_start, release_endthe indices of starting and ending tasks to release
- join()¶
- on_data_complete()¶
Over-ride in derived rules so that, e.g. events can be written at the end of a real-time acquisition
- property output_files¶
- prepare()¶
Do any setup work - e.g. uploading metadata required before the rule is triggered
- Returns
- post_argsdict
a dictionary with arguments to pass to RulePusher._post_rule() - specifically timeout, max_tasks, release_start, release_end
- push()¶
- status()¶
- watch()¶
- class PYME.cluster.rules.RuleFactory(on_completion=None, rule_class=<class 'PYME.cluster.rules.Rule'>, **kwargs)¶
Bases:
objectCreate a new rule factory. Sub-classed for specific rule types
- Parameters
- on_completionRuleFactory instance
A rule to run after this one has completed (also setable using the .chain() method)
- kwargsany additional arguments (ignored in base class)
- chain(on_completion)¶
Chain a rule (as an alternative to passing on_completion to the rule constructor, allows us to create rule chains from front to back rather than back to front
- Parameters
- on_completionRuleFactory instance
The rule to run after this one has completed
- get_rule(context)¶
Populate a rule using series specific info from context. Note that the the rule class initialization arguments should be passed in the RuleFactory initialization as kwargs, but can also be passed here in context if, e.g. the series name is not known when creating the
- Parameters
- contextdict
a dictionary containing series specific info to populate into the task template
- Returns
- Rule
a rule suitable for submitting to the ruleserver /add_integer_id_rule endpoint
- property rule_type¶
- class PYME.cluster.rules.RuleGroupWatcher¶
Bases:
objectSingle object / thread to watch multiple rules
Used when launching rules from jupyter notebooks, this will show an updating display of rule progress
- register()¶
- watch(rule)¶
- class PYME.cluster.rules.RuleWatcher¶
Bases:
objectSingle object / thread to watch multiple rules
- watch(rule)¶
- class PYME.cluster.rules.SpoolLocalLocalizationRule(spooler, seriesName, analysisMetadata, resultsFilename=None, startAt=0, serverfilter='runnervm0kj6c', **kwargs)¶
Bases:
LocalisationRuleCreate a new rule. Sub-classed for specific rule types
- Parameters
- on_completionRuleFactory instance
A rule to run after this one has completed (also setable using the .chain() method)
- kwargsany additional arguments, available in the context used when creating the rule template
- property data_complete¶
Is the underlying data complete?
- on_data_complete()¶
Over-ride in derived rules so that, e.g. events can be written at the end of a real-time acquisition
- property total_frames¶
- class PYME.cluster.rules.SpoolLocalLocalizationRuleFactory(**kwargs)¶
Bases:
RuleFactorySee SpoolLocalLocalizationRule for full initialization arguments. Required kwargs are:
:seriesName : str :analysisMetadata : PYME.IO.MetaDataHandler.MDHandlerBase
- PYME.cluster.rules.verify_cluster_results_filename(resultsFilename)¶
Checks whether a results file already exists on the cluster, and returns an available version of the results filename. Should be called before writing a new results file.
- Parameters
- resultsFilenamestr
cluster path, e.g. pyme-cluster:///example_folder/name.h5r
- Returns
- ——-
- resultsFilenamestr
cluster path which may have _# appended to it if the input resultsFileName is already in use, e.g. pyme-cluster:///example_folder/name_1.h5r