feature_generator_registry
- class FeatureGeneratorRegistry(use_singletons: bool = False)
Bases:
object
Represents a registry for (named) feature generator factories
- __init__(use_singletons: bool = False)
- Parameters
use_singletons – if True, internally maintain feature generator singletons, such that there is at most one instance for each name/key
- property available_features
- register_factory(name: Hashable, factory: Callable[[], sensai.featuregen.feature_generator.FeatureGenerator])
Registers a feature generator factory which can subsequently be referenced by models via their name/hashable key
- Parameters
name – the name/key (which can, in particular, be a string or an Enum item). Especially for larger projects the use of an Enum is recommended (for optimal IDE support)
factory – the factory
- get_feature_generator(name: str) → sensai.featuregen.feature_generator.FeatureGenerator
Creates a feature generator from a name, which must have been previously registered. The name of the returned feature generator (as returned by get_name()) is set to name.
- Parameters
name – the name (which can, in particular, be a string or an enum item)
- Returns
a new feature generator instance (or the existing instance if use_singletons is enabled)
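The effect of use_singletons on get_feature_generator can be illustrated with a minimal stand-in. This is a sketch of the documented behaviour only, not sensAI's implementation: ToyFeatureGenerator and ToyRegistry are hypothetical names, and raising KeyError for an unregistered name is an assumption.

```python
from typing import Callable, Dict, Hashable, Optional


class ToyFeatureGenerator:
    """Hypothetical stand-in for a feature generator; it only carries a name."""

    def __init__(self) -> None:
        self.name: Optional[Hashable] = None


class ToyRegistry:
    """Illustrative re-implementation of the documented behaviour (not sensAI's code)."""

    def __init__(self, use_singletons: bool = False) -> None:
        self._factories: Dict[Hashable, Callable[[], ToyFeatureGenerator]] = {}
        self._singletons: Dict[Hashable, ToyFeatureGenerator] = {}
        self._use_singletons = use_singletons

    def register_factory(self, name: Hashable, factory: Callable[[], ToyFeatureGenerator]) -> None:
        self._factories[name] = factory

    def get_feature_generator(self, name: Hashable) -> ToyFeatureGenerator:
        if name not in self._factories:
            # Assumption: the error type for unregistered names is not documented
            raise KeyError(f"No factory registered for {name!r}")
        if self._use_singletons and name in self._singletons:
            return self._singletons[name]  # reuse the one instance for this key
        generator = self._factories[name]()
        generator.name = name  # the returned generator is named after its key
        if self._use_singletons:
            self._singletons[name] = generator
        return generator


fresh = ToyRegistry(use_singletons=False)
fresh.register_factory("age", ToyFeatureGenerator)
shared = ToyRegistry(use_singletons=True)
shared.register_factory("age", ToyFeatureGenerator)

# Without singletons every call builds a new instance; with singletons the instance is reused
new_each_time = fresh.get_feature_generator("age") is not fresh.get_feature_generator("age")
same_instance = shared.get_feature_generator("age") is shared.get_feature_generator("age")
```

In this sketch the factory is simply the class itself, since any zero-argument callable returning a generator qualifies as a factory.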
- collect_features(*feature_generators_or_names: Union[Hashable, sensai.featuregen.feature_generator.FeatureGenerator]) → sensai.featuregen.feature_generator_registry.FeatureCollector
Creates a feature collector for the given feature names/keys/instances, which can subsequently be added to a model.
- Parameters
feature_generators_or_names – feature names/keys known to this registry or feature generator instances
- class FeatureCollector(*feature_generators_or_names: Union[Hashable, sensai.featuregen.feature_generator.FeatureGenerator], registry: Optional[sensai.featuregen.feature_generator_registry.FeatureGeneratorRegistry] = None)
Bases:
object
A feature collector which facilitates the collection of features that shall be used by a model as well as the generation of commonly used feature transformers that are informed by the features’ meta-data.
- __init__(*feature_generators_or_names: Union[Hashable, sensai.featuregen.feature_generator.FeatureGenerator], registry: Optional[sensai.featuregen.feature_generator_registry.FeatureGeneratorRegistry] = None)
- Parameters
feature_generators_or_names – generator names/keys (known to the registry) or generator instances
registry – the feature generator registry for the case where names/keys are passed
- get_multi_feature_generator() → sensai.featuregen.feature_generator.MultiFeatureGenerator
Gets the multi-feature generator that was created for this collector. To create a new, independent instance (e.g. when using this collector for multiple models), use create_multi_feature_generator() instead.
- Returns
the multi-feature generator that was created for this instance
- get_normalisation_rules(include_generated_categorical_rules=True)
- get_categorical_feature_name_regex() → str
- Returns
a regular expression that matches all known categorical feature names
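How such a regular expression could be built and consumed is sketched below. The construction (escaping each name and joining with "|") is an assumption made for illustration, not necessarily how sensAI builds the expression; categorical_name_regex and the column names are hypothetical.

```python
import re
from typing import Iterable


def categorical_name_regex(names: Iterable[str]) -> str:
    # Escape each name so that characters such as '(' or '.' are matched literally,
    # then join the alternatives into a single expression
    return "|".join(re.escape(name) for name in names)


pattern = categorical_name_regex(["colour", "size (cat)"])

# Select the columns whose full name matches the expression
all_columns = ["colour", "size (cat)", "weight"]
categorical_columns = [c for c in all_columns if re.fullmatch(pattern, c)]
```

A single expression of this kind is convenient for components that select columns by pattern rather than by an explicit list of names.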
- create_multi_feature_generator()
Creates a new instance of the multi-feature generator that generates the features collected by this instance. If the feature collector instance is not used for multiple models, use get_multi_feature_generator() instead to obtain the instance that has already been created.
- Returns
a new multi-feature generator that generates the collected features
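The difference between get_multi_feature_generator() and create_multi_feature_generator() can be sketched with a toy collector. ToyCollector and ToyMultiFeatureGenerator are hypothetical stand-ins; the sketch assumes only what is documented above, namely that one instance is created for the collector and that the create method returns a new one on each call.

```python
from typing import List


class ToyMultiFeatureGenerator:
    """Hypothetical stand-in that just remembers which features it generates."""

    def __init__(self, names: List[str]) -> None:
        self.names = list(names)


class ToyCollector:
    """Illustrative sketch of the get/create distinction (not sensAI's code)."""

    def __init__(self, *names: str) -> None:
        self._names = list(names)
        # One instance is created up front and handed out by the getter
        self._multi = self.create_multi_feature_generator()

    def get_multi_feature_generator(self) -> ToyMultiFeatureGenerator:
        return self._multi

    def create_multi_feature_generator(self) -> ToyMultiFeatureGenerator:
        return ToyMultiFeatureGenerator(self._names)


collector = ToyCollector("age", "colour")

# The getter always returns the same object ...
shared_a = collector.get_multi_feature_generator()
shared_b = collector.get_multi_feature_generator()

# ... whereas each create call yields an independent object for a further model
independent = collector.create_multi_feature_generator()
```

If a generator holds state of its own, handing each model an independent instance keeps that state from being shared between models.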
- create_dft_normalisation(default_transformer_factory=None, require_all_handled=True, inplace=False) → sensai.data_transformation.dft.DFTNormalisation
Creates a feature transformer that will apply normalisation to all supported (numeric) features
- Parameters
default_transformer_factory – a factory for the creation of transformer instances (which implement the API used by sklearn.preprocessing, e.g. StandardScaler) that shall be used to create a transformer for all rules that do not specify a particular transformer. The default transformer will only be applied to columns matched by such rules; unmatched columns will not be transformed. Use SkLearnTransformerFactoryFactory to conveniently create a factory.
require_all_handled – whether to raise an exception if not all columns are matched by a rule
inplace – whether to apply data frame transformations in-place
- Returns
the transformer
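The interplay of rules, default_transformer_factory and require_all_handled can be sketched in plain Python. This is a simplified model under stated assumptions, not the DFTNormalisation implementation: rules are represented as hypothetical (regex, factory-or-None) pairs, columns as lists of floats, ToyStandardScaler mimics only the fit/transform idea, and the exception types are assumptions.

```python
import re
from statistics import mean, pstdev
from typing import Callable, Dict, List, Optional, Tuple


class ToyStandardScaler:
    """Hypothetical minimal scaler: subtract the mean, divide by the standard deviation."""

    def fit(self, values: List[float]) -> "ToyStandardScaler":
        self.mean_ = mean(values)
        self.std_ = pstdev(values) or 1.0  # avoid division by zero for constant columns
        return self

    def transform(self, values: List[float]) -> List[float]:
        return [(v - self.mean_) / self.std_ for v in values]


Rule = Tuple[str, Optional[Callable[[], ToyStandardScaler]]]


def normalise(columns: Dict[str, List[float]], rules: List[Rule],
              default_transformer_factory: Optional[Callable[[], ToyStandardScaler]] = None,
              require_all_handled: bool = True) -> Dict[str, List[float]]:
    result = {}
    for name, values in columns.items():
        rule = next((r for r in rules if re.fullmatch(r[0], name)), None)
        if rule is None:
            if require_all_handled:
                raise ValueError(f"No rule matches column {name!r}")
            result[name] = list(values)  # unmatched columns are left untransformed
            continue
        # A rule without its own transformer falls back to the default factory
        factory = rule[1] or default_transformer_factory
        if factory is None:
            raise ValueError(f"No transformer available for column {name!r}")
        result[name] = factory().fit(values).transform(values)
    return result


data = {"x": [1.0, 3.0], "id": [7.0, 8.0]}
rules = [("x", None)]  # the rule matches "x" but names no transformer

lenient = normalise(data, rules, default_transformer_factory=ToyStandardScaler,
                    require_all_handled=False)
```

With require_all_handled=True, the same call raises in this sketch because no rule matches the column "id".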
- create_dft_one_hot_encoder(ignore_unknown=False, inplace=False)
Creates a feature transformer that will apply one-hot encoding to all the features that are known to be categorical
- Parameters
inplace – whether to perform the transformation in-place
ignore_unknown – if True and an unknown category is encountered during transform, the resulting one-hot encoded columns for this feature will be all zeros. If False, an unknown category will raise an error.
- Returns
the transformer
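The ignore_unknown semantics can be shown for a single value. The helper one_hot below is a hypothetical pure-Python sketch of the described behaviour, not the encoder that sensAI creates; the use of ValueError is an assumption.

```python
from typing import List


def one_hot(value: str, categories: List[str], ignore_unknown: bool = False) -> List[int]:
    """Encodes a single value against the categories seen during fitting."""
    if value not in categories:
        if ignore_unknown:
            # Unknown category: all one-hot columns for this feature are zero
            return [0] * len(categories)
        raise ValueError(f"Unknown category {value!r}")
    return [1 if value == category else 0 for category in categories]


known = ["blue", "red"]
encoded_known = one_hot("red", known)
encoded_unknown = one_hot("green", known, ignore_unknown=True)
```

An all-zero row is indistinguishable from "none of the known categories", which is usually acceptable at inference time but can hide data quality problems during training.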
- create_feature_transformer_normalisation(default_transformer_factory=None, require_all_handled=True, inplace=False) → sensai.data_transformation.dft.DFTNormalisation
Creates a feature transformer that will apply normalisation to all supported (numeric) features. Alias of create_dft_normalisation.
- Parameters
default_transformer_factory – a factory for the creation of transformer instances (which implement the API used by sklearn.preprocessing, e.g. StandardScaler) that shall be used to create a transformer for all rules that do not specify a particular transformer. The default transformer will only be applied to columns matched by such rules; unmatched columns will not be transformed. Use SkLearnTransformerFactoryFactory to conveniently create a factory.
require_all_handled – whether to raise an exception if not all columns are matched by a rule
inplace – whether to apply data frame transformations in-place
- Returns
the transformer
- create_feature_transformer_one_hot_encoder(ignore_unknown=False, inplace=False)
Creates a feature transformer that will apply one-hot encoding to all the features that are known to be categorical. Alias of create_dft_one_hot_encoder.
- Parameters
inplace – whether to perform the transformation in-place
ignore_unknown – if True and an unknown category is encountered during transform, the resulting one-hot encoded columns for this feature will be all zeros. If False, an unknown category will raise an error.
- Returns
the transformer