greedy_clustering
- class GreedyAgglomerativeClustering(clusters: Sequence[sensai.clustering.greedy_clustering.GreedyAgglomerativeClustering.Cluster], merge_candidate_determination_strategy: Optional[sensai.clustering.greedy_clustering.GreedyAgglomerativeClustering.MergeCandidateDeterminationStrategy] = None)[source]
Bases:
object
An implementation of greedy agglomerative clustering which avoids unnecessary recomputations of merge costs through the management of a priority queue of potential merges.
Greedy agglomerative clustering works as follows. Starting with an initial set of clusters (where each cluster typically contains a single data point), the method successively merges the two clusters where the merge cost is lowest (greedy), until no further merges are admissible. The merge operation is a mutating operation, i.e. the initial clusters are modified.
To apply the method, the Cluster class must be subclassed, so as to define what the cost of a merge in your application shall be and how two clusters can be merged. For example, if data points are points in a Cartesian coordinate system, then the merge cost can be defined as the minimum or maximum distance among all pairs of points in the two clusters, admissibility being determined by a threshold that must not be exceeded; the merge operation can simply concatenate lists of data points.
- class Cluster
Bases:
abc.ABC
Base class for clusters that can be merged via GreedyAgglomerativeClustering
- abstract merge_cost(other) float
Computes the cost of merging the given cluster with this cluster
- Returns
the (non-negative) merge cost or math.inf if a merge is inadmissible
- abstract merge(other)
Merges the given cluster into this cluster”
- Parameters
other – the cluster that is to be merged into this cluster
- __init__(clusters: Sequence[sensai.clustering.greedy_clustering.GreedyAgglomerativeClustering.Cluster], merge_candidate_determination_strategy: Optional[sensai.clustering.greedy_clustering.GreedyAgglomerativeClustering.MergeCandidateDeterminationStrategy] = None)
- Parameters
clusters – the initial clusters, which are to be agglomerated into larger clusters
- apply_clustering() List[sensai.clustering.greedy_clustering.GreedyAgglomerativeClustering.Cluster]
Applies greedy agglomerative clustering to the clusters given at construction, merging clusters until no further merges are admissible
- Returns
the list of agglomerated clusters (subset of the original clusters, which may have had other clusters merged into them)
- class WrappedCluster(cluster, idx, clusterer: sensai.clustering.greedy_clustering.GreedyAgglomerativeClustering)
Bases:
object
Wrapper for clusters which stores additional data required for clustering (internal use only)
- __init__(cluster, idx, clusterer: sensai.clustering.greedy_clustering.GreedyAgglomerativeClustering)
- is_merged() bool
- get_cluster_association() sensai.clustering.greedy_clustering.GreedyAgglomerativeClustering.WrappedCluster
Gets the wrapped cluster that this cluster’s points have ultimately been merged into (which may be the cluster itself)
- Returns
the wrapped cluster this cluster’s points are associated with
- remove_merges()
- compute_merges(initial: bool, merged_cluster_indices: Optional[Tuple[int, int]] = None)
- class ClusterMerge(c1: sensai.clustering.greedy_clustering.GreedyAgglomerativeClustering.WrappedCluster, c2: sensai.clustering.greedy_clustering.GreedyAgglomerativeClustering.WrappedCluster, merge_cost)
Bases:
object
Represents a potential merge
- __init__(c1: sensai.clustering.greedy_clustering.GreedyAgglomerativeClustering.WrappedCluster, c2: sensai.clustering.greedy_clustering.GreedyAgglomerativeClustering.WrappedCluster, merge_cost)
- apply()
- class MergeCandidateDeterminationStrategy
Bases:
abc.ABC
- __init__()
- set_clusterer(clusterer: sensai.clustering.greedy_clustering.GreedyAgglomerativeClustering)
Initialises the clusterer the strategy is applied to :param clusterer: the clusterer
- abstract iter_candidate_indices(wc: sensai.clustering.greedy_clustering.GreedyAgglomerativeClustering.WrappedCluster, initial: bool, merged_cluster_indices: Optional[Tuple[int, int]] = None) Iterator[Union[int, sensai.clustering.greedy_clustering.GreedyAgglomerativeClustering.ClusterMerge]]
- Parameters
wc – the wrapped cluster: the cluster for which to determine the cluster indices that are to be considered for a potential merge
initial – whether we are computing the initial candidates (at the start of the clustering algorithm)
merged_cluster_indices – [for initial=False] the pair of cluster indices that were just joined to form the updated cluster wc
- Returns
an iterator of cluster indices that should be evaluated as potential merge partners for wc (it may contain the index of wc, which will be ignored)
- class MergeCandidateDeterminationStrategyDefault
-
- iter_candidate_indices(wc: sensai.clustering.greedy_clustering.GreedyAgglomerativeClustering.WrappedCluster, initial: bool, merged_cluster_indices: Optional[Tuple[int, int]] = None) Iterator[Union[int, sensai.clustering.greedy_clustering.GreedyAgglomerativeClustering.ClusterMerge]]
- Parameters
wc – the wrapped cluster: the cluster for which to determine the cluster indices that are to be considered for a potential merge
initial – whether we are computing the initial candidates (at the start of the clustering algorithm)
merged_cluster_indices – [for initial=False] the pair of cluster indices that were just joined to form the updated cluster wc
- Returns
an iterator of cluster indices that should be evaluated as potential merge partners for wc (it may contain the index of wc, which will be ignored)