evaluator
- class MetricsDictProvider[source]
Bases: sensai.tracking.tracking_base.TrackingMixin, abc.ABC
- compute_metrics(model, **kwargs) → Optional[Dict[str, float]]
Computes metrics for the given model, typically by fitting the model and applying it to test data. If a tracked experiment was previously set, the metrics are tracked, with the model’s string representation added under the additional key ‘str(model)’.
- Parameters
model – the model for which to compute metrics
kwargs – parameters to pass on to the underlying evaluation method
- Returns
a dictionary of metric values
- class MetricsDictProviderFromFunction(compute_metrics_fn: Callable[[sensai.vector_model.VectorModel], Dict[str, float]])[source]
Bases: sensai.evaluation.evaluator.MetricsDictProvider
- __init__(compute_metrics_fn: Callable[[sensai.vector_model.VectorModel], Dict[str, float]])
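The adapter pattern above can be sketched as follows. This is an illustrative stand-in, not sensAI code: the class name `SketchMetricsProvider` and the `_tracked_metrics` attribute are made up, and only the documented behaviour (wrapping a metrics function, tracking the model’s string representation under the key ‘str(model)’) is mirrored.

```python
from typing import Callable, Dict, Optional


class SketchMetricsProvider:
    """Illustrative stand-in for MetricsDictProviderFromFunction: wraps a
    plain function that returns a metrics dict, and mimics the tracking
    behaviour documented for compute_metrics."""

    def __init__(self, compute_metrics_fn: Callable[[object], Dict[str, float]]):
        self._compute_metrics_fn = compute_metrics_fn
        self._tracked_metrics: Optional[Dict[str, str]] = None  # stand-in for a tracked experiment

    def compute_metrics(self, model) -> Dict[str, float]:
        metrics = self._compute_metrics_fn(model)
        # as documented: the model's string representation is recorded
        # under the additional key 'str(model)'
        self._tracked_metrics = {**metrics, "str(model)": str(model)}
        return metrics


provider = SketchMetricsProvider(lambda model: {"MAE": 0.25})
print(provider.compute_metrics("my_model"))  # {'MAE': 0.25}
```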
- class VectorModelEvaluationData(stats_dict: Dict[str, sensai.evaluation.evaluator.TEvalStats], io_data: sensai.data.InputOutputData, model: sensai.vector_model.VectorModelBase)[source]
Bases: abc.ABC, Generic[sensai.evaluation.evaluator.TEvalStats]
- __init__(stats_dict: Dict[str, sensai.evaluation.evaluator.TEvalStats], io_data: sensai.data.InputOutputData, model: sensai.vector_model.VectorModelBase)
- Parameters
stats_dict – a dictionary mapping from output variable name to the evaluation statistics object
io_data – the input/output data that was used to produce the results
model – the model that was used to produce predictions
- property model_name
- property input_data
- get_eval_stats(predicted_var_name=None) → sensai.evaluation.evaluator.TEvalStats
- get_data_frame()
Returns a DataFrame with all evaluation metrics (one row per output variable)
- Returns
a DataFrame containing evaluation metrics
- iter_input_output_ground_truth_tuples(predicted_var_name=None) → Generator[Tuple[sensai.util.typing.PandasNamedTuple, Any, Any], None, None]
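The one-row-per-output-variable layout returned by get_data_frame can be sketched with pandas. The variable names and metric values below are invented for illustration; only the shape of the result mirrors the documentation.

```python
import pandas as pd

# hypothetical per-output-variable evaluation statistics
# (keys: output variable names; values: metric name -> value)
stats_dict = {
    "price": {"RMSE": 1.2, "MAE": 0.9},
    "demand": {"RMSE": 3.4, "MAE": 2.1},
}

# one row per output variable, metrics as columns
df = pd.DataFrame.from_dict(stats_dict, orient="index")
print(df)
```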
- class VectorRegressionModelEvaluationData(stats_dict: Dict[str, sensai.evaluation.evaluator.TEvalStats], io_data: sensai.data.InputOutputData, model: sensai.vector_model.VectorModelBase)[source]
Bases: sensai.evaluation.evaluator.VectorModelEvaluationData[sensai.evaluation.eval_stats.eval_stats_regression.RegressionEvalStats]
- get_eval_stats_collection()
- class EvaluatorParams(data_splitter: Optional[sensai.data.DataSplitter] = None, fractional_split_test_fraction: Optional[float] = None, fractional_split_random_seed=42, fractional_split_shuffle=True)[source]
Bases: sensai.util.string.ToStringMixin, abc.ABC
- __init__(data_splitter: Optional[sensai.data.DataSplitter] = None, fractional_split_test_fraction: Optional[float] = None, fractional_split_random_seed=42, fractional_split_shuffle=True)
- Parameters
data_splitter – [if test data must be obtained via split] a splitter to use in order to obtain the split; if None, fractional_split_test_fraction must be specified for a fractional split (default)
fractional_split_test_fraction – [if test data must be obtained via split and data_splitter is None] the fraction of the data to use for testing/evaluation
fractional_split_random_seed – [if test data must be obtained via split and data_splitter is None] the random seed to use for the fractional split of the data
fractional_split_shuffle – [if test data must be obtained via split and data_splitter is None] whether to randomly shuffle the dataset (based on the random seed) before splitting it
- get_data_splitter() → sensai.data.DataSplitter
- set_data_splitter(splitter: sensai.data.DataSplitter)
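The fractional-split parameters can be read as the following procedure. This is a sketch of the semantics, not sensAI’s implementation; in particular, whether the test portion is taken from the head or the tail of the (possibly shuffled) data is an implementation detail not shown here.

```python
import random
from typing import List, Tuple


def fractional_split(n: int, test_fraction: float, seed: int = 42,
                     shuffle: bool = True) -> Tuple[List[int], List[int]]:
    """Sketch of a fractional split: returns (train_indices, test_indices).
    With shuffle enabled, the permutation is deterministic given the seed."""
    indices = list(range(n))
    if shuffle:
        random.Random(seed).shuffle(indices)
    n_test = round(n * test_fraction)
    return indices[n_test:], indices[:n_test]


train, test = fractional_split(10, test_fraction=0.3, seed=42)
print(len(train), len(test))  # 7 3
```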
- class VectorModelEvaluator(data: sensai.data.InputOutputData, test_data: Optional[sensai.data.InputOutputData] = None, params: Optional[sensai.evaluation.evaluator.EvaluatorParams] = None)[source]
Bases: sensai.evaluation.evaluator.MetricsDictProvider, Generic[sensai.evaluation.evaluator.TEvalData], abc.ABC
- __init__(data: sensai.data.InputOutputData, test_data: Optional[sensai.data.InputOutputData] = None, params: Optional[sensai.evaluation.evaluator.EvaluatorParams] = None)
Constructs an evaluator with test and training data.
- Parameters
data – the full data set, or, if test_data is given, the training data
test_data – the data to use for testing/evaluation; if None, must specify appropriate parameters to define splitting
params – the parameters
- set_tracked_experiment(tracked_experiment: sensai.tracking.tracking_base.TrackedExperiment)
Sets a tracked experiment, which will result in metrics being saved whenever compute_metrics is called or eval_model is called with track=True.
- Parameters
tracked_experiment – the experiment in which to track evaluation metrics.
- eval_model(model: Union[sensai.vector_model.VectorModelBase, sensai.vector_model.VectorModelFittableBase], on_training_data=False, track=True, fit=False) → sensai.evaluation.evaluator.TEvalData
Evaluates the given model
- Parameters
model – the model to evaluate
on_training_data – if True, evaluate on this evaluator’s training data rather than the held-out test data
track – whether to track the evaluation metrics for the case where a tracked experiment was set on this object
fit – whether to fit the model before evaluating it (via this object’s fit_model method); if enabled, the model must support fitting
- Returns
the evaluation result
- create_metrics_dict_provider(predicted_var_name: Optional[str]) → sensai.evaluation.evaluator.MetricsDictProvider
Creates a metrics dictionary provider, e.g. for use in hyperparameter optimisation
- Parameters
predicted_var_name – the name of the predicted variable for which to obtain evaluation metrics; may be None only if the model outputs a single predicted variable
- Returns
a metrics dictionary provider instance for the given variable
- fit_model(model: sensai.vector_model.VectorModelFittableBase)
Fits the given model’s parameters using this evaluator’s training data
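The fit/evaluate workflow that fit_model and eval_model support can be sketched with a toy stand-in. `MeanModel`, `SketchEvaluator`, and the MAE metric below are illustrative inventions, not sensAI classes; only the documented flow (hold train/test data, fit on the training data, evaluate on either split, optionally fitting first via fit=True) is mirrored.

```python
class MeanModel:
    """Toy fittable model: predicts the mean of its training targets."""
    def __init__(self):
        self.mean = None

    def fit(self, x, y):
        self.mean = sum(y) / len(y)

    def predict(self, x):
        return [self.mean] * len(x)


class SketchEvaluator:
    """Mirrors the evaluator workflow: holds train/test data, can fit a
    model on the training data and evaluate on either split."""
    def __init__(self, train, test):
        self.train, self.test = train, test

    def fit_model(self, model):
        x, y = self.train
        model.fit(x, y)

    def eval_model(self, model, on_training_data=False, fit=False):
        if fit:
            self.fit_model(model)
        x, y = self.train if on_training_data else self.test
        preds = model.predict(x)
        mae = sum(abs(p - t) for p, t in zip(preds, y)) / len(y)
        return {"MAE": mae}


evaluator = SketchEvaluator(train=([0, 1, 2], [1.0, 2.0, 3.0]),
                            test=([3, 4], [2.0, 2.0]))
print(evaluator.eval_model(MeanModel(), fit=True))  # {'MAE': 0.0}
```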
- class RegressionEvaluatorParams(data_splitter: Optional[sensai.data.DataSplitter] = None, fractional_split_test_fraction: Optional[float] = None, fractional_split_random_seed=42, fractional_split_shuffle=True, metrics: Optional[Sequence[sensai.evaluation.eval_stats.eval_stats_regression.RegressionMetric]] = None, additional_metrics: Optional[Sequence[sensai.evaluation.eval_stats.eval_stats_regression.RegressionMetric]] = None, output_data_frame_transformer: Optional[sensai.data_transformation.dft.DataFrameTransformer] = None)[source]
Bases: sensai.evaluation.evaluator.EvaluatorParams
- __init__(data_splitter: Optional[sensai.data.DataSplitter] = None, fractional_split_test_fraction: Optional[float] = None, fractional_split_random_seed=42, fractional_split_shuffle=True, metrics: Optional[Sequence[sensai.evaluation.eval_stats.eval_stats_regression.RegressionMetric]] = None, additional_metrics: Optional[Sequence[sensai.evaluation.eval_stats.eval_stats_regression.RegressionMetric]] = None, output_data_frame_transformer: Optional[sensai.data_transformation.dft.DataFrameTransformer] = None)
- Parameters
data_splitter – [if test data must be obtained via split] a splitter to use in order to obtain the split; if None, fractional_split_test_fraction must be specified for a fractional split (default)
fractional_split_test_fraction – [if test data must be obtained via split and data_splitter is None] the fraction of the data to use for testing/evaluation
fractional_split_random_seed – [if test data must be obtained via split and data_splitter is None] the random seed to use for the fractional split of the data
fractional_split_shuffle – [if test data must be obtained via split and data_splitter is None] whether to randomly shuffle the dataset (based on the random seed) before splitting it
metrics – regression metrics to apply. If None, default regression metrics are used.
additional_metrics – additional regression metrics to apply
output_data_frame_transformer – a data frame transformer to apply to all output data frames (both model outputs and ground truth), such that evaluation metrics are computed on the transformed data frame
- classmethod from_dict_or_instance(params: Optional[Union[Dict[str, Any], sensai.evaluation.evaluator.RegressionEvaluatorParams]]) → sensai.evaluation.evaluator.RegressionEvaluatorParams
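The dict-or-instance convenience pattern can be sketched as follows; `SketchParams` and its fields are illustrative stand-ins, not sensAI code. The sketch assumes the documented signature: None yields default parameters, a dict is used for keyword construction, and an existing instance is passed through unchanged.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional, Union


@dataclass
class SketchParams:
    """Illustrative stand-in for an evaluator params class."""
    fractional_split_test_fraction: Optional[float] = None
    fractional_split_random_seed: int = 42

    @classmethod
    def from_dict_or_instance(
            cls, params: Optional[Union[Dict[str, Any], "SketchParams"]]) -> "SketchParams":
        if params is None:
            return cls()          # defaults
        if isinstance(params, dict):
            return cls(**params)  # keyword construction from the dict
        return params             # already an instance: pass through


p = SketchParams.from_dict_or_instance({"fractional_split_test_fraction": 0.2})
print(p.fractional_split_test_fraction)  # 0.2
```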
- class VectorRegressionModelEvaluatorParams(*args, **kwargs)[source]
Bases: sensai.evaluation.evaluator.RegressionEvaluatorParams
- __init__(*args, **kwargs)
- Parameters
data_splitter – [if test data must be obtained via split] a splitter to use in order to obtain the split; if None, fractional_split_test_fraction must be specified for a fractional split (default)
fractional_split_test_fraction – [if test data must be obtained via split and data_splitter is None] the fraction of the data to use for testing/evaluation
fractional_split_random_seed – [if test data must be obtained via split and data_splitter is None] the random seed to use for the fractional split of the data
fractional_split_shuffle – [if test data must be obtained via split and data_splitter is None] whether to randomly shuffle the dataset (based on the random seed) before splitting it
metrics – regression metrics to apply. If None, default regression metrics are used.
additional_metrics – additional regression metrics to apply
output_data_frame_transformer – a data frame transformer to apply to all output data frames (both model outputs and ground truth), such that evaluation metrics are computed on the transformed data frame
- class VectorRegressionModelEvaluator(data: sensai.data.InputOutputData, test_data: Optional[sensai.data.InputOutputData] = None, params: Optional[sensai.evaluation.evaluator.RegressionEvaluatorParams] = None)[source]
Bases: sensai.evaluation.evaluator.VectorModelEvaluator[sensai.evaluation.evaluator.VectorRegressionModelEvaluationData]
- __init__(data: sensai.data.InputOutputData, test_data: Optional[sensai.data.InputOutputData] = None, params: Optional[sensai.evaluation.evaluator.RegressionEvaluatorParams] = None)
Constructs an evaluator with test and training data.
- Parameters
data – the full data set, or, if test_data is given, the training data
test_data – the data to use for testing/evaluation; if None, must specify appropriate parameters to define splitting
params – the parameters
- compute_test_data_outputs(model: sensai.vector_model.VectorModelBase) → Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame]
Applies the given model to the test data
- Parameters
model – the model to apply
- Returns
a pair (predictions, groundTruth)
- class VectorClassificationModelEvaluationData(stats_dict: Dict[str, sensai.evaluation.evaluator.TEvalStats], io_data: sensai.data.InputOutputData, model: sensai.vector_model.VectorModelBase)[source]
Bases: sensai.evaluation.evaluator.VectorModelEvaluationData[sensai.evaluation.eval_stats.eval_stats_classification.ClassificationEvalStats]
- get_misclassified_inputs_data_frame() → pandas.core.frame.DataFrame
- get_misclassified_triples_pred_true_input() → List[Tuple[Any, Any, pandas.core.series.Series]]
- Returns
a list containing a triple (predicted class, true class, input series) for each misclassified data point
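The shape of the result can be sketched with plain Python. The predictions, ground truth, and input rows below are invented for illustration (the real method yields pandas Series as inputs); only the triple layout mirrors the documentation.

```python
from typing import Any, List, Tuple

# hypothetical predictions, ground truth, and input rows
predictions = ["cat", "dog", "cat", "bird"]
ground_truth = ["cat", "cat", "cat", "dog"]
inputs = [{"x": 0}, {"x": 1}, {"x": 2}, {"x": 3}]


def misclassified_triples(preds, truth, rows) -> List[Tuple[Any, Any, Any]]:
    """One (predicted class, true class, input) triple per misclassified
    data point, mirroring get_misclassified_triples_pred_true_input."""
    return [(p, t, r) for p, t, r in zip(preds, truth, rows) if p != t]


print(misclassified_triples(predictions, ground_truth, inputs))
# [('dog', 'cat', {'x': 1}), ('bird', 'dog', {'x': 3})]
```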
- class ClassificationEvaluatorParams(data_splitter: Optional[sensai.data.DataSplitter] = None, fractional_split_test_fraction: Optional[float] = None, fractional_split_random_seed=42, fractional_split_shuffle=True, additional_metrics: Optional[Sequence[sensai.evaluation.eval_stats.eval_stats_classification.ClassificationMetric]] = None, compute_probabilities: bool = False, binary_positive_label: Optional[str] = ('__guess',))[source]
Bases: sensai.evaluation.evaluator.EvaluatorParams
- __init__(data_splitter: Optional[sensai.data.DataSplitter] = None, fractional_split_test_fraction: Optional[float] = None, fractional_split_random_seed=42, fractional_split_shuffle=True, additional_metrics: Optional[Sequence[sensai.evaluation.eval_stats.eval_stats_classification.ClassificationMetric]] = None, compute_probabilities: bool = False, binary_positive_label: Optional[str] = ('__guess',))
- Parameters
data_splitter – [if test data must be obtained via split] a splitter to use in order to obtain the split; if None, fractional_split_test_fraction must be specified for a fractional split (default)
fractional_split_test_fraction – [if test data must be obtained via split and data_splitter is None] the fraction of the data to use for testing/evaluation
fractional_split_random_seed – [if test data must be obtained via split and data_splitter is None] the random seed to use for the fractional split of the data
fractional_split_shuffle – [if test data must be obtained via split and data_splitter is None] whether to randomly shuffle the dataset (based on the random seed) before splitting it
additional_metrics – additional metrics to apply
compute_probabilities – whether to compute class probabilities. Enabling this will enable many downstream computations and visualisations (e.g. precision-recall plots) but requires the model to support probability computation in general.
binary_positive_label – the positive class label for binary classification; if GUESS, try to detect from labels; if None, no detection (assume non-binary classification)
- classmethod from_dict_or_instance(params: Optional[Union[Dict[str, Any], sensai.evaluation.evaluator.ClassificationEvaluatorParams]]) → sensai.evaluation.evaluator.ClassificationEvaluatorParams
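One plausible reading of the GUESS behaviour for binary_positive_label is a heuristic like the following. The actual detection rules are sensAI internals; the conventions assumed here (labels such as 1, True, "yes", or "positive" counting as the positive class) are illustrative assumptions, not documented behaviour.

```python
from typing import Optional


def guess_positive_label(labels) -> Optional[object]:
    """Assumed heuristic: for a two-class label set, pick a conventionally
    'positive' label as the positive class; return None when the problem
    is not binary or no guess can be made."""
    distinct = set(labels)
    if len(distinct) != 2:
        return None  # not binary classification
    for candidate in (1, True, "1", "yes", "positive"):
        if candidate in distinct:
            return candidate
    return None


print(guess_positive_label([0, 1, 1, 0]))  # 1
```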
- class VectorClassificationModelEvaluatorParams(*args, **kwargs)[source]
Bases: sensai.evaluation.evaluator.ClassificationEvaluatorParams
- __init__(*args, **kwargs)
- Parameters
data_splitter – [if test data must be obtained via split] a splitter to use in order to obtain the split; if None, fractional_split_test_fraction must be specified for a fractional split (default)
fractional_split_test_fraction – [if test data must be obtained via split and data_splitter is None] the fraction of the data to use for testing/evaluation
fractional_split_random_seed – [if test data must be obtained via split and data_splitter is None] the random seed to use for the fractional split of the data
fractional_split_shuffle – [if test data must be obtained via split and data_splitter is None] whether to randomly shuffle the dataset (based on the random seed) before splitting it
additional_metrics – additional metrics to apply
compute_probabilities – whether to compute class probabilities. Enabling this will enable many downstream computations and visualisations (e.g. precision-recall plots) but requires the model to support probability computation in general.
binary_positive_label – the positive class label for binary classification; if GUESS, try to detect from labels; if None, no detection (assume non-binary classification)
- class VectorClassificationModelEvaluator(data: sensai.data.InputOutputData, test_data: Optional[sensai.data.InputOutputData] = None, params: Optional[sensai.evaluation.evaluator.ClassificationEvaluatorParams] = None)[source]
Bases: sensai.evaluation.evaluator.VectorModelEvaluator[sensai.evaluation.evaluator.VectorClassificationModelEvaluationData]
- __init__(data: sensai.data.InputOutputData, test_data: Optional[sensai.data.InputOutputData] = None, params: Optional[sensai.evaluation.evaluator.ClassificationEvaluatorParams] = None)
Constructs an evaluator with test and training data.
- Parameters
data – the full data set, or, if test_data is given, the training data
test_data – the data to use for testing/evaluation; if None, must specify appropriate parameters to define splitting
params – the parameters
- compute_test_data_outputs(model) → Tuple[pandas.core.frame.DataFrame, pandas.core.frame.DataFrame, pandas.core.frame.DataFrame]
Applies the given model to the test data
- Parameters
model – the model to apply
- Returns
a triple (predictions, predicted class probability vectors, groundTruth) of DataFrames
- class RuleBasedVectorClassificationModelEvaluator(data: sensai.data.InputOutputData)[source]
Bases: sensai.evaluation.evaluator.VectorClassificationModelEvaluator
- __init__(data: sensai.data.InputOutputData)
Constructs an evaluator with test and training data.
- Parameters
data – the full data set; since the model is rule-based, the training data and test data coincide and no split is performed
- eval_model(model: sensai.vector_model.VectorModelBase, on_training_data=False, track=True, fit=False) → sensai.evaluation.evaluator.VectorClassificationModelEvaluationData
Evaluates the rule-based model. The training data and test data coincide; thus fitting the model will fit the model’s preprocessors on the full data set, and evaluating it will evaluate the model on that same data set.
- Parameters
model – the model to evaluate
on_training_data – must be False here; setting it to True is not supported and will raise an exception
track – whether to track the evaluation metrics for the case where a tracked experiment was set on this object
- Returns
the evaluation result
- class RuleBasedVectorRegressionModelEvaluator(data: sensai.data.InputOutputData)[source]
Bases: sensai.evaluation.evaluator.VectorRegressionModelEvaluator
- __init__(data: sensai.data.InputOutputData)
Constructs an evaluator with test and training data.
- Parameters
data – the full data set; since the model is rule-based, the training data and test data coincide and no split is performed
- eval_model(model: Union[sensai.vector_model.VectorModelBase, sensai.vector_model.VectorModelFittableBase], on_training_data=False, track=True, fit=False) → sensai.evaluation.evaluator.VectorRegressionModelEvaluationData
Evaluates the rule-based model. The training data and test data coincide; thus fitting the model will fit the model’s preprocessors on the full data set, and evaluating it will evaluate the model on that same data set.
- Parameters
model – the model to evaluate
on_training_data – must be False here; setting it to True is not supported and will raise an exception
track – whether to track the evaluation metrics for the case where a tracked experiment was set on this object
- Returns
the evaluation result
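The coinciding-data behaviour of the rule-based evaluators amounts to using the full data set as both splits, which can be sketched abstractly (class and attribute names here are illustrative, not sensAI’s):

```python
class SketchRuleBasedEvaluator:
    """For rule-based (non-learned) models there is nothing to hold out:
    the full data set serves as both training and test data."""
    def __init__(self, data):
        self.train_data = data
        self.test_data = data  # coincides with the training data


ev = SketchRuleBasedEvaluator(data=[(0, 1), (1, 2)])
print(ev.train_data is ev.test_data)  # True
```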