Created on 23 dec. 2010
Code author: jhkwakkel <j.h.kwakkel (at) tudelft (dot) nl>
-
class model_ensemble.ModelEnsemble(sampler=<samplers.LHSSampler object at 0x10AD4550>)
One of the two main classes for performing EMA. The ensemble class is
responsible for running experiments on one or more model structures across
one or more policies, and returning the results.
The sampling is delegated to a sampler instance.
The storing of results is delegated to a callback instance.
The class has an attribute ‘parallel’ that specifies whether the
experiments are to be run in parallel or not. By default, ‘parallel’ is
False.
An illustration of use:
>>> model = UserSpecifiedModelInterface(r'working directory', 'name')
>>> ensemble = ModelEnsemble()
>>> ensemble.set_model_structure(model)
>>> ensemble.parallel = True #parallel processing is turned on
>>> results = ensemble.perform_experiments(1000) #perform 1000 experiments
In this example, 1000 experiments will be carried out in parallel on
the user specified model interface. The uncertainties are retrieved from
model.uncertainties and the outcomes are assumed to be specified in
model.outcomes.
-
add_model_structure(ms)
Add a model structure to the list of model structures.
-
add_model_structures(mss)
Add a collection of model structures to the list of model structures.
-
add_policies(policies)
Add a collection of policies.
Parameters: | policies – policies to be added, every policy should be a
dict with at least a name. |
-
add_policy(policy)
Add a policy.
Parameters: | policy – policy to be added, policy should be a dict with at
least a name. |
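As a minimal sketch of the policy format described above: each policy is a plain dict with at least a 'name' entry. The helper check_policies and the extra 'file' key are hypothetical and only serve as illustration; they are not part of the ensemble class.

```python
def check_policies(policies):
    """Return the policy names, raising if a policy lacks a name.

    Hypothetical helper for illustration, not part of ModelEnsemble.
    """
    names = []
    for policy in policies:
        if not isinstance(policy, dict) or "name" not in policy:
            raise ValueError("every policy should be a dict with at least a name")
        names.append(policy["name"])
    return names


# Two example policies; only 'name' is required, 'file' is a made-up extra key.
policies = [
    {"name": "no policy"},
    {"name": "adaptive policy", "file": "adaptive.mdl"},
]
policy_names = check_policies(policies)
```

A collection like this is what add_policies expects, while a single dict of this shape is what add_policy expects.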
-
continue_robust_optimization(cases=None, nr_of_generations=10, pop=None, stats_callback=None, policy_levers=None, obj_function=None, crossover_rate=0.5, mutation_rate=0.02, reporting_interval=100, **kwargs)
Continue the robust optimization from a previously saved state. To
make this work, one should save the return value of
perform_robust_optimization. The typical use case for this method is
to manually track convergence of the optimization after a
specified number of generations.
Parameters: |
- cases – In case of Latin Hypercube sampling and Monte Carlo
sampling, cases specifies the number of cases to
generate. In case of Full Factorial sampling,
cases specifies the resolution to use for sampling
continuous uncertainties. Alternatively, one can supply
a list of dicts, where each dict contains a case.
That is, each dict maps an uncertainty name to its value.
- nr_of_generations – the number of generations for which the
GA will be run
- pop – the last run population, returned
by perform_robust_optimization
- stats_callback – the NSGA2StatisticsCallback instance returned
by perform_robust_optimization
- reporting_interval – parameter for specifying the frequency with
which the callback reports the progress.
(Default is 100)
- policy_levers – A dictionary with model parameter names as key
and a dict as value. The dict should have two
fields: ‘type’ and ‘values’. Type is either
list or range, and determines the appropriate
allele type. Values are the parameters to
be used for the specific allele.
- obj_function – the objective function used by the optimization
- crossover_rate – crossover rate for the GA
- mutation_rate – mutation rate for the GA
|
Note
Loading the stats_callback via cPickle requires some care.
cPickle requires that the classes referenced in the pickle file
exist at load time. The individual class used by deap is
generated dynamically. Loading the pickle should thus be
preceded by re-creating the correct individual class.
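The pitfall in this note can be reproduced with the standard library alone. The sketch below is not the workbench's or deap's code: it uses Python 3's pickle in place of cPickle, and a hypothetical module named fake_creator stands in for the module in which deap registers its dynamically generated classes.

```python
import pickle
import sys
import types

# Hypothetical stand-in for the module that holds dynamically created classes.
creator = types.ModuleType("fake_creator")
sys.modules["fake_creator"] = creator


def create_individual_class():
    """Create the Individual class at runtime and register it on the module."""
    cls = type("Individual", (list,), {"__module__": "fake_creator"})
    creator.Individual = cls
    return cls


Individual = create_individual_class()
payload = pickle.dumps(Individual([1, 2, 3]))

# Simulate a fresh interpreter in which the class was never generated.
del creator.Individual
try:
    pickle.loads(payload)
    loaded_without_class = True
except AttributeError:
    # pickle cannot resolve fake_creator.Individual, so loading fails.
    loaded_without_class = False

# Re-create the class first; after that, loading succeeds.
create_individual_class()
restored = pickle.loads(payload)
```

The pickle stores only a reference to the class (module and name), not its definition, which is why the class has to be generated again before the saved state is loaded.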
-
determine_uncertainties()
Helper method for determining the unique uncertainties and how
the uncertainties are shared across multiple model structure
interfaces.
Returns: | An overview dictionary which shows which uncertainties are
used by which model structure interface, or interfaces, and
a dictionary with the unique uncertainties across all the
model structure interfaces, with the name as key. |
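To make the two return values more concrete, here is a simplified sketch. The exact shape of the dictionaries returned by determine_uncertainties is not specified here, so the layout below is an assumption: uncertainties are represented as (lower, upper) tuples, and the overview is keyed by a tuple of model names. The function name and the example models are hypothetical.

```python
def determine_uncertainties_sketch(model_uncertainties):
    """Group uncertainties by the set of models that use them.

    model_uncertainties maps a model name to a dict that maps an
    uncertainty name to its definition (here a (lower, upper) tuple).
    """
    users = {}   # uncertainty name -> names of the models using it
    unique = {}  # uncertainty name -> definition (first one encountered)
    for model_name, uncertainties in model_uncertainties.items():
        for unc_name, definition in uncertainties.items():
            users.setdefault(unc_name, []).append(model_name)
            unique.setdefault(unc_name, definition)

    overview = {}  # tuple of model names -> sorted uncertainty names
    for unc_name, model_names in users.items():
        overview.setdefault(tuple(sorted(model_names)), []).append(unc_name)
    for names in overview.values():
        names.sort()
    return overview, unique


model_uncertainties = {
    "model A": {"growth rate": (0.01, 0.05), "lifetime": (10, 40)},
    "model B": {"growth rate": (0.01, 0.05), "delay": (1, 5)},
}
overview, unique = determine_uncertainties_sketch(model_uncertainties)
```

In this example 'growth rate' is shared by both models, while 'lifetime' and 'delay' each belong to a single model.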
-
parallel = False
Boolean for turning parallel processing on (default is False).
-
perform_experiments(cases, callback=<class 'callbacks.DefaultCallback'>, reporting_interval=100, model_kwargs={}, which_uncertainties='intersection', which_outcomes='intersection', **kwargs)
Method responsible for running the experiments on a model structure. In case
of multiple model structures, the outcomes are set to the intersection
of the sets of outcomes of the various models.
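The difference between taking the intersection and the union across several model structures can be sketched with plain Python sets. The function combine_names and the example outcome names are hypothetical. The string 'intersection' is the default shown in the signature; 'union' is assumed to be the name of the alternative.

```python
def combine_names(name_collections, which="intersection"):
    """Combine the names used by several models into one set."""
    sets = [set(names) for names in name_collections]
    if not sets:
        return set()
    if which == "intersection":
        # keep only the names that every model provides
        return set.intersection(*sets)
    if which == "union":
        # keep every name that at least one model provides
        return set.union(*sets)
    raise ValueError("which should be 'intersection' or 'union'")


outcomes_model_a = ["total cost", "emissions", "capacity"]
outcomes_model_b = ["total cost", "emissions", "price"]

shared = combine_names([outcomes_model_a, outcomes_model_b])
combined = combine_names([outcomes_model_a, outcomes_model_b], which="union")
```

With the intersection, every model can supply every outcome. With the union, some outcomes are missing for some models.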
Parameters: |
- cases – In case of Latin Hypercube sampling and Monte Carlo
sampling, cases specifies the number of cases to
generate. In case of Full Factorial sampling,
cases specifies the resolution to use for sampling
continuous uncertainties. Alternatively, one can supply
a list of dicts, where each dict contains a case.
That is, each dict maps an uncertainty name to its value.
- callback – Class that will be called after finishing a
single experiment.
- reporting_interval – parameter for specifying the frequency with
which the callback reports the progress.
(Default is 100)
- model_kwargs – dictionary of keyword arguments to be passed to
model_init
- which_uncertainties – keyword argument for controlling whether,
in case of multiple model structure
interfaces, the intersection or the union
of uncertainties should be used.
(Default is intersection).
- which_outcomes – keyword argument for controlling whether,
in case of multiple model structure
interfaces, the intersection or the union
of outcomes should be used.
(Default is intersection).
- kwargs – generic keyword arguments to pass on to callback
|
Returns: | a structured numpy array
containing the experiments, and a dict with the names of the
outcomes as keys and a numpy array as value.
|
Suggested use
In general, analysis scripts require both the structured array of the
experiments and the dictionary of arrays containing the results. The
recommended use is the following:
>>> results = ensemble.perform_experiments(10000) #recommended use
>>> experiments, output = ensemble.perform_experiments(10000) #will work fine
The latter option works, but most analysis scripts expect a single
tuple, so the two values have to be wrapped up again:
>>> data = (experiments, output)
Another reason for the recommended use is that you can save this tuple
directly:
>>> import expWorkbench.util as util
>>> util.save_results(results, file)
Note
The current implementation has a hard-coded limit on the
number of designs. This is set to 50,000 designs.
To go beyond this, set self.max_designs to
a higher value.
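The cases parameter also accepts an explicit list of dicts. The sketch below builds such a list as a full factorial design, and it shows how quickly the number of designs grows: it is the resolution raised to the power of the number of uncertainties. This is relevant to the limit mentioned in the note. The function, the uncertainty names, and the evenly spaced levels including both end points are assumptions for illustration; how the workbench itself spaces the levels is not stated here.

```python
import itertools


def full_factorial_cases(ranges, resolution):
    """Build a list of case dicts from (lower, upper) ranges.

    ranges maps an uncertainty name to a (lower, upper) tuple; each
    range is sampled at `resolution` evenly spaced levels.
    """
    if resolution < 2:
        raise ValueError("resolution should be at least 2")
    names = sorted(ranges)
    levels = []
    for name in names:
        lower, upper = ranges[name]
        step = (upper - lower) / (resolution - 1)
        levels.append([lower + i * step for i in range(resolution)])
    # every combination of levels becomes one case
    return [dict(zip(names, combo)) for combo in itertools.product(*levels)]


ranges = {"growth rate": (0.0, 1.0), "delay": (1.0, 5.0)}
cases = full_factorial_cases(ranges, resolution=3)
nr_of_designs = len(cases)  # 3 ** 2 designs for two uncertainties
```

A list like cases could then be passed as the cases argument instead of an integer.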
-
perform_outcome_optimization(reporting_interval=100, obj_function=None, weights=(), nr_of_generations=100, pop_size=100, crossover_rate=0.5, mutation_rate=0.02, **kwargs)
Method responsible for performing outcome optimization. The
optimization will be performed over the intersection of the
uncertainties in case of multiple model structures.
Parameters: |
- reporting_interval – parameter for specifying the frequency with
which the callback reports the progress.
(Default is 100)
- obj_function – the objective function used by the optimization
- weights – tuple of weights on the various outcomes of the
objective function. Use the constants MINIMIZE and
MAXIMIZE.
- nr_of_generations – the number of generations for which the
GA will be run
- pop_size – the population size for the GA
- crossover_rate – crossover rate for the GA
- mutation_rate – mutation rate for the GA
|
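The role of the weights tuple can be sketched as follows. The values of MINIMIZE and MAXIMIZE are an assumption here (-1.0 and 1.0, following the sign convention deap uses for fitness weights); check the constants shipped with the workbench before relying on them. The helper functions and the example outcomes are hypothetical.

```python
MINIMIZE = -1.0  # assumed value, following deap's sign convention
MAXIMIZE = 1.0   # assumed value, following deap's sign convention


def weighted_values(values, weights):
    """Multiply each objective value by its weight, so larger is better."""
    if len(values) != len(weights):
        raise ValueError("one weight is needed per objective value")
    return tuple(v * w for v, w in zip(values, weights))


def dominates(a, b, weights):
    """Return True if solution a Pareto-dominates solution b."""
    wa = weighted_values(a, weights)
    wb = weighted_values(b, weights)
    return (all(x >= y for x, y in zip(wa, wb))
            and any(x > y for x, y in zip(wa, wb)))


# Objective function returning (cost, reliability): minimize the first,
# maximize the second.
weights = (MINIMIZE, MAXIMIZE)
a_beats_b = dominates((10.0, 0.9), (12.0, 0.8), weights)
a_beats_c = dominates((10.0, 0.9), (8.0, 0.7), weights)
```

Here the first solution dominates the second, because it is both cheaper and more reliable. It does not dominate the third, which is cheaper but less reliable.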
-
perform_robust_optimization(cases, reporting_interval=100, obj_function=None, policy_levers={}, weights=(), nr_of_generations=100, pop_size=100, crossover_rate=0.5, mutation_rate=0.02, **kwargs)
Method responsible for performing robust optimization.
Parameters: |
- cases – In case of Latin Hypercube sampling and Monte Carlo
sampling, cases specifies the number of cases to
generate. In case of Full Factorial sampling,
cases specifies the resolution to use for sampling
continuous uncertainties. Alternatively, one can supply
a list of dicts, where each dict contains a case.
That is, each dict maps an uncertainty name to its value.
- reporting_interval – parameter for specifying the frequency with
which the callback reports the progress.
(Default is 100)
- obj_function – the objective function used by the optimization
- policy_levers – A dictionary with model parameter names as key
and a dict as value. The dict should have two
fields: ‘type’ and ‘values’. Type is either
list or range, and determines the appropriate
allele type. Values are the parameters to
be used for the specific allele.
- weights – tuple of weights on the various outcomes of the
objective function. Use the constants MINIMIZE and
MAXIMIZE.
- nr_of_generations – the number of generations for which the
GA will be run
- pop_size – the population size for the GA
- crossover_rate – crossover rate for the GA
- mutation_rate – mutation rate for the GA
|
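A policy_levers dictionary of the shape described above could look like the sketch below. Several details are assumptions: that the type is given as the string 'list' or 'range', that a range is specified as a (lower, upper) pair, and the lever names themselves. The helper check_policy_levers is hypothetical.

```python
def check_policy_levers(policy_levers):
    """Check that every lever has a 'type' and 'values' field.

    Hypothetical helper for illustration; returns the sorted lever names.
    """
    allowed_types = {"list", "range"}  # assumed spelling of the two types
    for name, spec in policy_levers.items():
        missing = {"type", "values"} - set(spec)
        if missing:
            raise ValueError("lever %r lacks %s" % (name, sorted(missing)))
        if spec["type"] not in allowed_types:
            raise ValueError("lever %r has unknown type %r" % (name, spec["type"]))
    return sorted(policy_levers)


policy_levers = {
    # continuous lever: values hold the lower and upper bound (assumed layout)
    "tax level": {"type": "range", "values": (0.0, 0.5)},
    # categorical lever: values hold the options to choose from
    "technology": {"type": "list", "values": ["wind", "solar", "gas"]},
}
lever_names = check_policy_levers(policy_levers)
```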
-
processes = None
In case of parallel computing, the number of
processes to be spawned. Default is None, meaning
that the number of processes will be equal to the
number of available cores.
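The default described here can be sketched with the standard library. How the ensemble resolves the value internally is not shown in this documentation; the function below is a hypothetical illustration that follows the same convention as multiprocessing.Pool, where processes=None means that the available CPUs are used.

```python
import multiprocessing


def resolve_nr_of_processes(processes=None):
    """Return the number of worker processes to spawn."""
    if processes is None:
        # fall back to the number of cores reported by the system
        return multiprocessing.cpu_count()
    if processes < 1:
        raise ValueError("processes should be a positive integer")
    return processes


default_processes = resolve_nr_of_processes()
explicit_processes = resolve_nr_of_processes(4)
```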
-
set_model_structure(modelStructure)
Set the model structure. This function wraps the model structure
in a tuple, limiting the number of model structures to 1.