coordinate_clustering_ground_truth

This module contains utilities for retrieving and visualizing ground truth labels for evaluating clustering algorithms.

class PolygonAnnotatedCoordinates(coordinates: Union[numpy.ndarray, shapely.geometry.MultiPoint, geopandas.GeoDataFrame, sensai.clustering.clustering_base.EuclideanClusterer.Cluster], ground_truth_polygons: Union[str, Sequence[shapely.geometry.Polygon], geopandas.GeoDataFrame], noise_label: Optional[int] = -1)

Bases: sensai.geoanalytics.geopandas.coordinates.GeoDataFrameWrapper

Class for retrieving ground truth cluster labels from a set of coordinate points and polygons. Of the provided two-dimensional coordinates, only the points within the ground truth region are considered.

__init__(coordinates: Union[numpy.ndarray, shapely.geometry.MultiPoint, geopandas.GeoDataFrame, sensai.clustering.clustering_base.EuclideanClusterer.Cluster], ground_truth_polygons: Union[str, Sequence[shapely.geometry.Polygon], geopandas.GeoDataFrame], noise_label: Optional[int] = -1)
Parameters
  • coordinates – coordinates of points. These points should be spread over an area larger than or equal to the ground truth area.

  • ground_truth_polygons – sequence of polygons, a GeoDataFrame, or a path to a shapefile containing such a sequence. The polygons represent the ground truth for clustering. Important: the first polygon in the sequence is assumed to be the region within which ground truth was provided, and it has to cover all remaining polygons. This also means that every non-noise cluster in that region should be covered by a polygon.

  • noise_label – label to associate with noise, or None
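
The labeling rule implied by these parameters can be sketched in plain Python. This is an illustration only, not the library's implementation: the helpers `point_in_polygon` and `label_points` are hypothetical, polygons are given as vertex lists rather than shapely objects, and a simple ray-casting test stands in for whatever containment check the class actually uses. Numbering the cluster polygons from 0 is likewise an assumption; the documentation does not state which label values are assigned.

```python
def point_in_polygon(point, polygon):
    """Ray-casting containment test for a polygon given as a list of (x, y) vertices.

    Points lying exactly on the boundary are not handled specially.
    """
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x-coordinate at which the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def label_points(points, polygons, noise_label=-1):
    """Keep only points inside polygons[0] and label them by containing cluster polygon."""
    region, cluster_polygons = polygons[0], polygons[1:]
    kept, labels = [], []
    for p in points:
        if not point_in_polygon(p, region):
            continue  # outside the ground truth region: not considered at all
        label = noise_label  # inside the region but in no cluster polygon
        for idx, poly in enumerate(cluster_polygons):
            if point_in_polygon(p, poly):
                label = idx  # assumed numbering, see the note above
                break
        kept.append(p)
        labels.append(label)
    return kept, labels


# The first polygon is the ground truth region; the others mark clusters.
region = [(0, 0), (10, 0), (10, 10), (0, 10)]
cluster_a = [(1, 1), (3, 1), (3, 3), (1, 3)]
cluster_b = [(6, 6), (9, 6), (9, 9), (6, 9)]
points = [(2, 2), (7, 8), (5, 5), (20, 20)]

kept, labels = label_points(points, [region, cluster_a, cluster_b])
print(kept)    # the point (20, 20) lies outside the region and is dropped
print(labels)  # (5, 5) is inside the region but in no cluster, so it is noise
```

In this sketch, a point inside the region that no cluster polygon covers receives the noise label, which is why every non-noise cluster in the region needs a polygon.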

to_geodf(crs='epsg:3857', include_noise=True)
Returns

GeoDataFrame with clusters as MultiPoint instances, indexed by the clusters’ identifiers
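
The shape of this result, one entry per cluster identifier holding that cluster's points, can be sketched without geopandas. The helper `group_by_label` below is hypothetical and returns a plain dictionary instead of a GeoDataFrame of MultiPoint objects. The reading of `include_noise` as "keep or drop the noise cluster" is an assumption, since the parameter is not documented here.

```python
from collections import defaultdict


def group_by_label(coordinates, labels, noise_label=-1, include_noise=True):
    """Group points by cluster label; optionally drop the noise group."""
    clusters = defaultdict(list)
    for point, label in zip(coordinates, labels):
        if label == noise_label and not include_noise:
            continue  # assumed meaning of include_noise=False
        clusters[label].append(point)
    return dict(clusters)


coordinates = [(2, 2), (2.5, 1.5), (7, 8), (5, 5)]
labels = [0, 0, 1, -1]

with_noise = group_by_label(coordinates, labels)
without_noise = group_by_label(coordinates, labels, include_noise=False)
print(sorted(with_noise))     # cluster identifiers including the noise label
print(sorted(without_noise))  # cluster identifiers without the noise label
```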

plot(include_noise=True, **kwargs)

Plots the ground truth clusters

Parameters
  • include_noise – whether to include noise points in the plot

  • kwargs

Returns

get_coordinates_labels()

Extract cluster coordinates and labels as numpy arrays from the provided ground truth region and cluster polygons

Returns

tuple of arrays of the type (coordinates, labels)
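
A labels array of this kind can be compared against the output of a clustering algorithm. The following is a minimal sketch of one such comparison, the Rand index, written in plain Python. The function `rand_index` and the sample labels are hypothetical and not part of this module, and noise is treated here like any other label, which may not suit every evaluation.

```python
from itertools import combinations


def rand_index(labels_true, labels_pred):
    """Fraction of point pairs on which two labelings agree.

    A pair counts as an agreement if both labelings put the two points in the
    same cluster, or both put them in different clusters.
    """
    pairs = list(combinations(range(len(labels_true)), 2))
    if not pairs:
        return 1.0  # fewer than two points: nothing to disagree on
    agreements = sum(
        (labels_true[i] == labels_true[j]) == (labels_pred[i] == labels_pred[j])
        for i, j in pairs
    )
    return agreements / len(pairs)


# Hypothetical ground truth labels (with one noise point) and predicted labels.
ground_truth = [0, 0, 1, 1, -1]
predicted = [5, 5, 7, 7, 7]

score = rand_index(ground_truth, predicted)
print(score)
```

The label values themselves do not need to match between the two labelings, because only the pairwise grouping of points is compared.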