ema workbench

pairs plotting

Code author: jhkwakkel <j.h.kwakkel (at) tudelft (dot) nl>

This module provides R-style pairs plotting functionality.

pairs_plotting.pairs_scatter(results, outcomes_to_show=[], group_by=None, grouping_specifiers=None, ylabels={}, legend=True, point_in_time=-1, **kwargs)

Generate an R-style pairs scatter multiplot. For time-series data, the end states are used.

Parameters:
  • results – return from perform_experiments.
  • outcomes_to_show – list of the outcomes of interest you want to plot. If empty, all outcomes are plotted.
  • group_by – name of the column in the cases array to group results by. Alternatively, index can be specified to use indexing arrays as the basis for grouping.
  • grouping_specifiers – set of categories to be used as a basis for grouping. grouping_specifiers is only meaningful if group_by is provided as well. When grouping by index, the grouping specifiers should be given as a dictionary in which each key denotes the name of a group.
  • ylabels – dictionary with the outcome names as keys. The specified values are used as labels for the y axis.
  • legend – boolean. If true, and a column is specified for grouping, a legend is shown.
  • point_in_time – the point in time at which the scatter plot is to be made. If None is provided, the end states are used. point_in_time should be a valid value on the time axis.
Return type:

a figure instance and a dict with the individual axes.
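
As a rough illustration of what the scatter grid is built from, the sketch below picks one value per experiment from each time-series outcome and lists the outcome pairs that end up in the grid. It is plain Python and does not call the workbench. The outcomes dictionary, the helper names states_at and grid_pairs, and the use of an integer list index in place of point_in_time are all assumptions made for this example.

```python
from itertools import combinations

# Hypothetical outcomes: name -> one time series per experiment.
outcomes = {
    "population": [[1.0, 2.0, 4.0], [1.0, 3.0, 9.0]],
    "pollution": [[0.1, 0.2, 0.3], [0.1, 0.4, 0.8]],
    "capital": [[5.0, 5.5, 6.0], [5.0, 4.0, 3.0]],
}


def states_at(outcomes, outcomes_to_show=(), index=-1):
    """Return one value per experiment for each outcome to plot.

    An empty outcomes_to_show selects every outcome; index=-1 selects
    the end state of each series (an assumption of this sketch).
    """
    names = list(outcomes_to_show) or list(outcomes)
    return {name: [series[index] for series in outcomes[name]] for name in names}


def grid_pairs(names):
    """List the distinct outcome pairs that a pairs plot compares."""
    return list(combinations(names, 2))


end_states = states_at(outcomes)
pairs = grid_pairs(list(end_states))
```

Each entry of pairs corresponds to one combination of outcomes in the multiplot, with the values in end_states supplying the x and y coordinates of the points.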

Note

The current implementation is limited to seven different categories in case of column, categories, and/or discretesize. This limit is due to the colors specified in COLOR_LIST.
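
The grouping behavior and the seven-category limit can be sketched as follows. This is not the workbench implementation: the list of dictionaries stands in for the cases array, the seven color names are placeholders for whatever COLOR_LIST actually contains, and group_indices is a hypothetical helper.

```python
# Hypothetical stand-in for COLOR_LIST; the actual colors are not listed here.
COLOR_LIST = ["blue", "green", "red", "cyan", "magenta", "yellow", "black"]


def group_indices(cases, group_by, grouping_specifiers=None):
    """Map each category found in column group_by to the matching row indices."""
    groups = {}
    for i, case in enumerate(cases):
        value = case[group_by]
        if grouping_specifiers is not None and value not in grouping_specifiers:
            continue  # category was not requested
        groups.setdefault(value, []).append(i)
    if len(groups) > len(COLOR_LIST):
        raise ValueError("more categories than available colors")
    return groups


# Hypothetical cases: one row per experiment.
cases = [{"policy": "none"}, {"policy": "tax"}, {"policy": "none"}]
groups = group_indices(cases, "policy")
colors = dict(zip(groups, COLOR_LIST))  # one color per category
```

With more than seven categories the sketch raises an error instead of reusing colors, which mirrors the limit described in the note.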

pairs_plotting.pairs_lines(results, outcomes_to_show=[], group_by=None, grouping_specifiers=None, ylabels={}, legend=True, **kwargs)

Generate an R-style pairs lines multiplot. It shows the behavior of two outcomes over time against each other. The origin is denoted with a circle and the end is denoted with a '+'.

Parameters:
  • results – return from perform_experiments.
  • outcomes_to_show – list of the outcomes of interest you want to plot. If empty, all outcomes are plotted.
  • group_by – name of the column in the cases array to group results by. Alternatively, index can be specified to use indexing arrays as the basis for grouping.
  • grouping_specifiers – set of categories to be used as a basis for grouping. grouping_specifiers is only meaningful if group_by is provided as well. When grouping by index, the grouping specifiers should be given as a dictionary in which each key denotes the name of a group.
  • ylabels – dictionary with the outcome names as keys. The specified values are used as labels for the y axis.
  • legend – boolean. If true, and a column is specified for grouping, a legend is shown.
  • point_in_time – the point in time at which the scatter plot is to be made. If None is provided, the end states are used. point_in_time should be a valid value on the time axis.
Return type:

a figure instance and a dict with the individual axes.
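
To make the idea of plotting two outcomes against each other over time concrete, the sketch below builds the path for a single experiment and identifies its origin and end points. It is plain Python rather than workbench code; the trajectory helper and the example series are hypothetical.

```python
def trajectory(series_x, series_y):
    """Pair two outcomes of one experiment step by step.

    Returns the (x, y) path plus the origin and end points, which a
    plot would mark with a circle and a '+' respectively.
    """
    if len(series_x) != len(series_y):
        raise ValueError("both outcomes need the same number of time steps")
    path = list(zip(series_x, series_y))
    return {"path": path, "origin": path[0], "end": path[-1]}


# Hypothetical time series for a single experiment.
population = [1.0, 2.0, 4.0]
pollution = [0.1, 0.2, 0.3]
line = trajectory(population, pollution)
```

Time itself is not on either axis: it only determines the order in which the points of the path are connected.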

pairs_plotting.pairs_density(results, outcomes_to_show=[], group_by=None, grouping_specifiers=None, ylabels={}, point_in_time=-1, log=True, gridsize=50, colormap='jet', filter_scalar=True)

Generate an R-style pairs hexbin density multiplot. For time-series data, the end states are used.

hexbin makes a hexagonal binning plot of x versus y, where x and y are 1-D sequences of the same length, N. If C is None (the default), this is a histogram of the number of occurrences of the observations at (x[i], y[i]). For further detail, see the matplotlib documentation on hexbin.
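
The counting that underlies a hexagonal histogram can be sketched in plain Python. The version below uses axial hexagon coordinates with cube rounding, a common technique that is not necessarily the algorithm matplotlib uses internally. The function names, the size parameter, and the choice of log10 for the log scaling are assumptions of this sketch.

```python
import math
from collections import Counter


def hex_cell(x, y, size=1.0):
    """Return the axial (q, r) index of the pointy-top hexagon containing (x, y)."""
    q = (math.sqrt(3) / 3 * x - y / 3) / size
    r = (2 * y / 3) / size
    s = -q - r
    rq, rr, rs = round(q), round(r), round(s)
    dq, dr, ds = abs(rq - q), abs(rr - r), abs(rs - s)
    # Recompute the coordinate with the largest rounding error so that
    # q + r + s == 0 still holds after rounding.
    if dq > dr and dq > ds:
        rq = -rr - rs
    elif dr >= ds:
        rr = -rq - rs
    # Otherwise s had the largest error; it is not part of the result.
    return rq, rr


def hex_counts(xs, ys, size=1.0, log=True):
    """Count observations per hexagon; optionally return log10 of the counts."""
    counts = Counter(hex_cell(x, y, size) for x, y in zip(xs, ys))
    if log:
        return {cell: math.log10(n) for cell, n in counts.items()}
    return dict(counts)


# Two points fall in the hexagon around the origin, one in its neighbor.
xs = [0.0, 0.1, math.sqrt(3)]
ys = [0.0, 0.1, 0.0]
raw = hex_counts(xs, ys, log=False)
scaled = hex_counts(xs, ys)
```

Log scaling compresses the range of the counts, which helps when a few hexagons hold most of the observations.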

Parameters:
  • results – return from perform_experiments.
  • outcomes_to_show – list of the outcomes of interest you want to plot. If empty, all outcomes are plotted.
  • group_by – name of the column in the cases array to group results by. Alternatively, index can be specified to use indexing arrays as the basis for grouping.
  • grouping_specifiers – set of categories to be used as a basis for grouping. grouping_specifiers is only meaningful if group_by is provided as well. When grouping by index, the grouping specifiers should be given as a dictionary in which each key denotes the name of a group.
  • ylabels – dictionary with the outcome names as keys. The specified values are used as labels for the y axis.
  • point_in_time – the point in time at which the density plot is to be made. If None is provided, the end states are used. point_in_time should be a valid value on the time axis.
  • log – boolean, indicating whether density should be log scaled. Defaults to True.
  • gridsize – controls the grid size for the hexagonal binning.
  • colormap – color map that is to be used in generating the hexbin. For details on the available maps, see pylab. Defaults to jet.
  • filter_scalar – boolean, remove the non-time-series outcomes. Defaults to True.
Return type:

a figure instance and a dict with the individual axes.
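
The effect of filter_scalar can be illustrated with a small stand-alone sketch. It assumes that a time-series outcome holds one sequence per experiment and a scalar outcome holds one number per experiment; the outcomes dictionary and the helper name are hypothetical, and the workbench may detect scalar outcomes differently.

```python
def filter_scalar_outcomes(outcomes):
    """Keep only outcomes whose per-experiment entries are sequences."""
    return {
        name: values
        for name, values in outcomes.items()
        if all(isinstance(v, (list, tuple)) for v in values)
    }


# Hypothetical outcomes for two experiments.
outcomes = {
    "population": [[1.0, 2.0, 4.0], [1.0, 3.0, 9.0]],
    "total cost": [12.5, 9.0],  # scalar outcome: one number per experiment
}
time_series_only = filter_scalar_outcomes(outcomes)
```

Here only "population" survives the filter, so "total cost" would not appear in the density multiplot.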