Function Reference

class Somoclu(n_columns, n_rows, initialcodebook=None, kerneltype=0, maptype='planar', gridtype='rectangular', compactsupport=False, neighborhood='gaussian', std_coeff=0.5, initialization=None)

Class for training and visualizing a self-organizing map.

Attributes:
  • codebook – The codebook of the self-organizing map.
  • bmus – The BMUs corresponding to the data points.
Parameters:
  • n_columns (int.) – The number of columns in the map.
  • n_rows (int.) – The number of rows in the map.
  • initialcodebook (2D numpy.array of float32.) – Optional parameter to start the training with a given codebook.
  • kerneltype (int.) –

    Optional parameter to specify which kernel to use:

    • 0: dense CPU kernel (default)
    • 1: dense GPU kernel (if compiled with it)
  • maptype (str.) –

    Optional parameter to specify the map topology:

    • "planar": Planar map (default)
    • "toroid": Toroid map
  • gridtype (str.) –

    Optional parameter to specify the grid form of the nodes:

    • "rectangular": rectangular neurons (default)
    • "hexagonal": hexagonal neurons
  • compactsupport (bool.) – Optional parameter to cut off map updates beyond the training radius with the Gaussian neighborhood. Default: True.
  • neighborhood (str.) –

    Optional parameter to specify the neighborhood:

    • "gaussian": Gaussian neighborhood (default)
    • "bubble": bubble neighborhood function
  • std_coeff (float.) – Optional parameter to set the coefficient in the Gaussian neighborhood function exp(-||x-y||^2/(2*(coeff*radius)^2)). Default: 0.5.
  • initialization (str.) –

    Optional parameter to specify the initialization:

    • "random": random weights in the codebook
    • "pca": codebook is initialized from the subspace spanned by the first two eigenvectors of the correlation matrix
  • verbose (int.) – Optional parameter to specify verbosity (0, 1, or 2).
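
The effect of std_coeff and compactsupport on the Gaussian neighborhood can be illustrated with a small sketch. It evaluates the formula given for std_coeff directly. The helper name gaussian_neighborhood and the exact cut-off rule (zero weight once the grid distance exceeds the radius) are assumptions made for illustration, not Somoclu's internal code.

```python
import math


def gaussian_neighborhood(distance, radius, std_coeff=0.5, compactsupport=False):
    """Weight of a map update at a given grid distance from the BMU.

    Hypothetical helper: evaluates exp(-d^2 / (2 * (std_coeff * radius)^2)).
    """
    # Assumed cut-off rule: no update beyond the training radius.
    if compactsupport and distance > radius:
        return 0.0
    sigma = std_coeff * radius
    return math.exp(-distance ** 2 / (2 * sigma ** 2))


# Compare the weights with and without the cut-off for a radius of 5.
for d in (0, 2.5, 5, 7.5):
    print(d,
          gaussian_neighborhood(d, radius=5),
          gaussian_neighborhood(d, radius=5, compactsupport=True))
```

With these assumptions, the two variants agree inside the radius and differ only beyond it, where the compact-support version drops to zero.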
cluster(algorithm=None)

Cluster the codebook. The clusters of the data instances can be assigned based on the BMUs. The method populates the class variable Somoclu.clusters. If viewing methods are called after clustering, but without colors for best matching units, colors will be automatically assigned based on cluster membership.

Parameters: algorithm – Optional parameter to specify a scikit-learn clustering algorithm. The default is K-means with eight clusters.
get_surface_state(data=None)

Return the dot product of the codebook and the data.

Parameters: data (2D numpy.array of float32.) – Optional parameter to specify data; otherwise, the data previously used to train the SOM is used.
Returns: The dot product of the codebook and the data.
Return type: 2D numpy.array
get_bmus(activation_map)

Return the best matching unit (BMU) indices for the activation map.

Parameters: activation_map (2D numpy.array) – Activation map computed with self.get_surface_state().
Returns: The BMU indices corresponding to this activation map (same as self.bmus for the training samples).
Return type: 2D numpy.array
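
As a rough illustration of how BMU indices can be read off an activation map, the sketch below picks the best node for each data point and converts its flat index into map coordinates. Everything here is an assumption for illustration: the helper name bmus_from_activation, the layout (one row per data point, one value per node, nodes flattened row by row), the (column, row) output order, and the choice of whether smaller or larger activations count as better.

```python
def bmus_from_activation(activation_map, n_columns, smaller_is_better=True):
    """Return one (column, row) pair per data point.

    Hypothetical helper. Assumes each row of activation_map holds one value
    per map node, with nodes flattened row by row.
    """
    pick = min if smaller_is_better else max
    bmus = []
    for activations in activation_map:
        # Flat index of the best node for this data point.
        best = pick(range(len(activations)), key=activations.__getitem__)
        row, column = divmod(best, n_columns)
        bmus.append((column, row))
    return bmus


# Two data points on a map with 2 rows and 3 columns (6 nodes).
activation_map = [
    [0.9, 0.4, 0.7, 0.8, 0.1, 0.6],
    [0.2, 0.5, 0.3, 0.9, 0.8, 0.7],
]
print(bmus_from_activation(activation_map, n_columns=3))
```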
load_bmus(filename)

Load the best matching units from a file to the Somoclu object.

Parameters: filename (str.) – The name of the file.
load_codebook(filename)

Load the codebook from a file to the Somoclu object.

Parameters: filename (str.) – The name of the file.
load_umatrix(filename)

Load the U-matrix from a file to the Somoclu object.

Parameters: filename (str.) – The name of the file.
train(data=None, epochs=10, radius0=0, radiusN=1, radiuscooling='linear', scale0=0.1, scaleN=0.01, scalecooling='linear')

Train the map on the current data in the Somoclu object.

Parameters:
  • data (2D numpy.array of float32.) – Training data.
  • epochs (int.) – The number of epochs to train the map for.
  • radius0 (float.) – The initial radius on the map where the update happens around a best matching unit. The default value of 0 triggers a value of min(n_columns, n_rows)/2.
  • radiusN (float.) – The radius on the map where the update happens around a best matching unit in the final epoch. Default: 1.
  • radiuscooling (str.) –

    The cooling strategy between radius0 and radiusN:

    • "linear": Linear interpolation (default)
    • "exponential": Exponential decay
  • scale0 (float.) – The initial learning scale. Default value: 0.1.
  • scaleN (float.) – The learning scale in the final epoch. Default: 0.01.
  • scalecooling (str.) –

    The cooling strategy between scale0 and scaleN:

    • "linear": Linear interpolation (default)
    • "exponential": Exponential decay
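
The two cooling strategies and the radius0 default can be sketched as follows. The helper names effective_radius0 and cooling_schedule are hypothetical, and the interpolation formulas are one common reading of "linear interpolation" and "exponential decay" between a start and an end value. They are not taken from Somoclu's source, whose exact per-epoch values may differ.

```python
def effective_radius0(radius0, n_columns, n_rows):
    """Resolve the documented default: 0 means min(n_columns, n_rows) / 2."""
    return min(n_columns, n_rows) / 2 if radius0 == 0 else radius0


def cooling_schedule(start, end, epochs, strategy="linear"):
    """Value of a radius or learning scale for each epoch (hypothetical helper)."""
    if epochs == 1:
        return [start]
    values = []
    for epoch in range(epochs):
        t = epoch / (epochs - 1)  # 0.0 in the first epoch, 1.0 in the last
        if strategy == "linear":
            values.append(start + (end - start) * t)
        elif strategy == "exponential":
            # Geometric interpolation; requires start and end to be positive.
            values.append(start * (end / start) ** t)
        else:
            raise ValueError("unknown cooling strategy: " + strategy)
    return values


radius0 = effective_radius0(0, n_columns=30, n_rows=20)  # 10.0
print(cooling_schedule(radius0, 1, epochs=10, strategy="linear"))
print(cooling_schedule(0.1, 0.01, epochs=10, strategy="exponential"))
```

Under this reading, both strategies start and end at the same values; the exponential schedule simply drops faster in the early epochs.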
view_activation_map(data_vector=None, data_index=None, activation_map=None, figsize=None, colormap=cm.Spectral_r, colorbar=False, bestmatches=False, bestmatchcolors=None, labels=None, zoom=None, filename=None)

Plot the activation map of a given data instance or a new data vector.

Parameters:
  • data_vector (numpy.array) – Optional parameter for a new vector.
  • data_index (int.) – Optional parameter for the index of the data instance.
  • activation_map (numpy.array) – Optional parameter to pass an activation map.
  • figsize ((int, int)) – Optional parameter to specify the size of the figure.
  • colormap (matplotlib.colors.Colormap) – Optional parameter to specify the color map to be used.
  • colorbar (bool.) – Optional parameter to include a color bar as a legend.
  • bestmatches (bool.) – Optional parameter to plot best matching units.
  • bestmatchcolors (list of int.) – Optional parameter to specify the color of each best matching unit.
  • labels (list of str.) – Optional parameter to specify the label of each point.
  • zoom (((int, int), (int, int))) – Optional parameter to zoom into a region on the map. The first tuple contains the row limits, the second tuple contains the column limits.
  • filename (str.) – If specified, the plot will not be shown but saved to this file.
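
The shape of the zoom parameter, shared by several viewing methods, can be illustrated on a plain nested list. The helper zoom_region is hypothetical, and treating the upper limits as exclusive (as in Python slicing) is an assumption, not something this reference states.

```python
def zoom_region(matrix, zoom=None):
    """Return the sub-grid selected by zoom (hypothetical helper).

    zoom is ((row_start, row_stop), (column_start, column_stop)); the stop
    values are treated as exclusive here, which is an assumption.
    """
    if zoom is None:
        return [list(row) for row in matrix]
    (row_start, row_stop), (col_start, col_stop) = zoom
    return [list(row[col_start:col_stop]) for row in matrix[row_start:row_stop]]


# A 4-row by 5-column grid whose cells record their own (row, column).
grid = [[(r, c) for c in range(5)] for r in range(4)]
region = zoom_region(grid, zoom=((1, 3), (2, 5)))
print(len(region), len(region[0]))
```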
view_component_planes(dimensions=None, figsize=None, colormap=cm.Spectral_r, colorbar=False, bestmatches=False, bestmatchcolors=None, labels=None, zoom=None, filename=None)

Observe the component planes in the codebook of the SOM.

Parameters:
  • dimensions – Optional parameter to specify which dimension or dimensions to plot. By default, each dimension is plotted in a sequence of plots.
  • figsize ((int, int)) – Optional parameter to specify the size of the figure.
  • colormap (matplotlib.colors.Colormap) – Optional parameter to specify the color map to be used.
  • colorbar (bool.) – Optional parameter to include a color bar as a legend.
  • bestmatches (bool.) – Optional parameter to plot best matching units.
  • bestmatchcolors (list of int.) – Optional parameter to specify the color of each best matching unit.
  • labels (list of str.) – Optional parameter to specify the label of each point.
  • zoom (((int, int), (int, int))) – Optional parameter to zoom into a region on the map. The first tuple contains the row limits, the second tuple contains the column limits.
  • filename (str.) – If specified, the plot will not be shown but saved to this file.
view_similarity_matrix(data=None, labels=None, figsize=None, filename=None)

Plot the similarity map according to the activation map.

Parameters:
  • data (numpy.array) – Optional parameter for the data points with which to calculate the similarity.
  • figsize ((int, int)) – Optional parameter to specify the size of the figure.
  • labels (list of str.) – Optional parameter to specify the label of each point.
  • filename (str.) – If specified, the plot will not be shown but saved to this file.
view_umatrix(figsize=None, colormap=cm.Spectral_r, colorbar=False, bestmatches=False, bestmatchcolors=None, labels=None, zoom=None, filename=None)

Plot the U-matrix of the trained map.

Parameters:
  • figsize ((int, int)) – Optional parameter to specify the size of the figure.
  • colormap (matplotlib.colors.Colormap) – Optional parameter to specify the color map to be used.
  • colorbar (bool.) – Optional parameter to include a color bar as a legend.
  • bestmatches (bool.) – Optional parameter to plot best matching units.
  • bestmatchcolors (list of int.) – Optional parameter to specify the color of each best matching unit.
  • labels (list of str.) – Optional parameter to specify the label of each point.
  • zoom (((int, int), (int, int))) – Optional parameter to zoom into a region on the map. The first tuple contains the row limits, the second tuple contains the column limits.
  • filename (str.) – If specified, the plot will not be shown but saved to this file.