User API
The following entries are the publicly available functions in the package.
Module contents
- exception kmeans.MaxIterationError
 Bases: Exception
 An exception to be raised when the maximum iteration tolerance is exceeded.
- kmeans.cluster(
 - data: ndarray | list[ndarray] | tuple[ndarray],
 - k: int,
 - *,
 - initial_means: ndarray | list[ndarray] | tuple[ndarray] | None = None,
 - ndim: int | None = None,
 - tolerance: float = 4.440892098500626e-15,
 - max_iterations: int = 250,
 )
 Perform k-means clustering.
The input data should be formatted as row vectors. Given a flat numpy array data = np.array([0, 1, 2, 3, 4]), do the following:
data = data.reshape(data.shape[-1], -1)  # or data = data[..., np.newaxis]
This makes each point a row entry:
[[0], [1], [2], [3], [4]]
Data of higher dimensions (e.g. a multi-channel image) should be flattened so that only the deepest (last) dimension is kept as the length of each row. So, for an image with shape (480, 640, 3), run:
data = data.reshape(-1, data.shape[-1])
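The two reshapes described above can be checked with NumPy alone. The snippet below is a small sketch that only inspects the resulting shapes; the all-zero image is a stand-in chosen for illustration, not data from the package.

```python
import numpy as np

# A flat array of five scalar points becomes five 1-D row vectors.
flat = np.array([0, 1, 2, 3, 4])
rows = flat.reshape(flat.shape[-1], -1)      # shape (5, 1)
same_rows = flat[..., np.newaxis]            # equivalent form, shape (5, 1)

# Stand-in for a 480x640 RGB image (all zeros, purely illustrative).
image = np.zeros((480, 640, 3), dtype=np.uint8)
pixels = image.reshape(-1, image.shape[-1])  # one row per pixel, shape (307200, 3)

print(rows.shape, same_rows.shape, pixels.shape)
```

Each row of pixels is then one RGB triple, which matches the row-vector layout the data parameter expects.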
- Parameters:
 data – The input data. All elements must have the same dimension.
 k – Number of clusters desired.
 initial_means – The initial cluster centroids. By default, means are selected from data uniformly at random.
 ndim – Dimension limit for clustering. By default, the length of a data element is used (all data dimensions are clustered).
 tolerance – Controls the completion criterion. Lower values -> more iterations. Defaults to 20*eps for np.float64.
 max_iterations – Maximum number of iterations before the function terminates.
- Returns:
 Clustered Data, Cluster Centroids
- Return type:
 dict[int, np.ndarray], np.ndarray
- Raises:
 kmeans.MaxIterationError – Raised if the clustering does not converge before reaching the max_iterations count.
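To make the roles of tolerance, max_iterations, and the two return values concrete, here is a minimal sketch of plain Lloyd's algorithm with a similar call shape. It is not the package's implementation: sketch_cluster, SketchMaxIterationError, and the seed argument are hypothetical names introduced here, the ndim parameter is left out, and the convergence test (total centroid shift at or below tolerance) is an assumption.

```python
import numpy as np


class SketchMaxIterationError(Exception):
    """Hypothetical stand-in for kmeans.MaxIterationError."""


def sketch_cluster(data, k, *, initial_means=None,
                   tolerance=20 * np.finfo(np.float64).eps,
                   max_iterations=250, seed=0):
    """Plain Lloyd's algorithm; NOT the package's implementation."""
    points = np.asarray(data, dtype=np.float64)
    if initial_means is None:
        # Assumption: pick k distinct rows uniformly at random.
        rng = np.random.default_rng(seed)
        means = points[rng.choice(len(points), size=k, replace=False)]
    else:
        means = np.asarray(initial_means, dtype=np.float64)

    for _ in range(max_iterations):
        # Distance from every point to every centroid: shape (n, k).
        dists = np.linalg.norm(points[:, None, :] - means[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute each centroid; keep the old one if its cluster is empty.
        new_means = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else means[j]
            for j in range(k)
        ])
        shift = np.linalg.norm(new_means - means)
        means = new_means
        if shift <= tolerance:
            # Assumed convergence test: centroids stopped moving.
            clusters = {j: points[labels == j] for j in range(k)}
            return clusters, means
    raise SketchMaxIterationError(f"no convergence in {max_iterations} iterations")


data = np.array([[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]])
clusters, centroids = sketch_cluster(data, 2, initial_means=[[0.0], [12.0]])
print(centroids.ravel())
```

With these inputs the centroids settle at 1 and 11 on the second pass. Calling the same function with max_iterations=1 raises the stand-in exception, which mirrors the documented behavior of raising once the iteration limit is reached without convergence.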
- kmeans.view_clustering(
 - data: ndarray | list[ndarray] | tuple[ndarray],
 - k: int,
 - *,
 - initial_means: ndarray | list[ndarray] | tuple[ndarray] | None = None,
 - ndim: int | None = None,
 - tolerance: float = 4.440892098500626e-15,
 - max_iterations: int = 250,
 )
 Perform and display k-means clustering.
This is the same as kmeans.cluster(), just with plotting side effects.
- Parameters:
 data – The input data. All elements must have the same dimension.
 k – Number of clusters desired.
 initial_means – The initial cluster centroids. By default, means are selected from data uniformly at random.
 ndim – Dimension limit for clustering. By default, the length of a data element is used (all data dimensions are clustered).
 tolerance – Controls the completion criterion. Lower values -> more iterations. Defaults to 20*eps for np.float64.
 max_iterations – Maximum number of iterations before the function terminates.
- Returns:
 Clustered Data, Cluster Centroids, Matplotlib Figure
- Return type:
 dict[int, np.ndarray], np.ndarray, matplotlib.figure.Figure
- Raises:
 ValueError – If the calculated or provided ndim is neither 2 nor 3.
 kmeans.MaxIterationError – Raised if the clustering does not converge before reaching the max_iterations count.
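The ValueError condition can be read as a check on the dimension that would be plotted. The sketch below shows one way such a check might look. sketch_plot_ndim is a hypothetical helper, and taking the "calculated" ndim to be the length of one data row is an assumption based on the ndim description above.

```python
import numpy as np


def sketch_plot_ndim(data, ndim=None):
    """Return the dimension to plot, or raise ValueError (hypothetical helper)."""
    points = np.asarray(data)
    # Assumed meaning of "calculated" ndim: the length of one data row.
    resolved = points.shape[-1] if ndim is None else ndim
    if resolved not in (2, 3):
        raise ValueError(f"plotting needs ndim of 2 or 3, got {resolved}")
    return resolved


xy = np.zeros((10, 2))    # 2-D points: plottable as-is
rgba = np.zeros((10, 4))  # 4-D points: plottable only with an explicit ndim

print(sketch_plot_ndim(xy))            # 2
print(sketch_plot_ndim(rgba, ndim=3))  # 3
```

Under this reading, four-dimensional data passed without ndim would raise ValueError, while passing ndim=3 would not.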
Submodules
kmeans.segmentation
- kmeans.segmentation.segment_img(
 - img: ndarray,
 - groups: int,
 - random_colors: bool = False,
 )
 Segment the input RGB image by color groups.
- Parameters:
 img – The image to be segmented. Assumed to be RGB.
 groups – Number of groups the image is segmented into. Higher numbers -> more detail.
 random_colors – Give each group a randomized RGB color instead of its average color.
- Returns:
 Segmented Image
- Return type:
 np.ndarray
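The difference between average and randomized group colors can be illustrated with a recoloring step applied to pixels that have already been assigned to groups. This is a sketch under assumptions, not the code behind segment_img: sketch_recolor, its labels argument, and the seed are hypothetical, and the labels are written by hand rather than produced by clustering.

```python
import numpy as np


def sketch_recolor(img, labels, groups, random_colors=False, seed=0):
    """Paint each pixel with its group's color (hypothetical helper)."""
    pixels = img.reshape(-1, img.shape[-1]).astype(np.float64)
    out = np.zeros_like(pixels)
    rng = np.random.default_rng(seed)
    for g in range(groups):
        mask = labels == g
        if not mask.any():
            continue  # empty group: nothing to paint
        if random_colors:
            color = rng.integers(0, 256, size=pixels.shape[1])
        else:
            color = pixels[mask].mean(axis=0)  # average color of the group
        out[mask] = color
    return out.round().astype(np.uint8).reshape(img.shape)


# 1x4 RGB image: two dark pixels, two bright pixels.
img = np.array([[[0, 0, 0], [20, 20, 20], [200, 200, 200], [240, 240, 240]]],
               dtype=np.uint8)
labels = np.array([0, 0, 1, 1])  # assumed output of a clustering step
segmented = sketch_recolor(img, labels, groups=2)
print(segmented[0])
```

Here the two dark pixels both become (10, 10, 10) and the two bright pixels both become (220, 220, 220). With random_colors=True, every pixel in a group still shares one color, but that color is drawn at random instead of averaged.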