Plotting options




Clustergram offers two types of plots: static and interactive. Static plots use matplotlib, while interactive plots are based on bokeh.

Let’s load the data and fit a clustergram on the Palmer penguins dataset. See the Introduction for an overview.

import seaborn
from sklearn.preprocessing import scale
from clustergram import Clustergram

df = seaborn.load_dataset('penguins')
data = scale(df.drop(columns=['species', 'island', 'sex']).dropna())

cgram = Clustergram(range(1, 12), verbose=False)
cgram.fit(data)

Static plots

Static plots can be generated using the Clustergram.plot() method.

cgram.plot()
<AxesSubplot:xlabel='Number of clusters (k)', ylabel='PCA weighted mean of the clusters'>


Clustergram.plot() returns a matplotlib axis, so the result can be customised like any other matplotlib plot. To control the style of the cluster centers, pass keyword arguments as a cluster_style dictionary; to control the lines, pass them as a line_style dictionary. Global styles, such as those set through seaborn, also work.

cgram.plot(
    cluster_style={"color": "lightblue", "edgecolor": "black"},
    line_style={"color": "red", "linestyle": "-."},
    figsize=(12, 8)
)
<AxesSubplot:xlabel='Number of clusters (k)', ylabel='PCA weighted mean of the clusters'>
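
Because the return value is an ordinary matplotlib axis, the usual axis methods apply to it after plotting. The sketch below uses a plain matplotlib axis as a stand-in for the one returned by Clustergram.plot(), so it runs without fitting anything. The helper name annotate_axis and the title text are illustrative only and are not part of clustergram.

```python
import matplotlib.pyplot as plt


def annotate_axis(ax, title, highlight_k=None):
    """Apply ordinary matplotlib tweaks to an existing axis (hypothetical helper)."""
    ax.set_title(title)
    ax.grid(True, axis="y", alpha=0.3)
    if highlight_k is not None:
        # Mark one number of clusters with a dashed vertical line.
        ax.axvline(highlight_k, color="grey", linestyle="--", linewidth=1)
    return ax


# Stand-in axis; in practice this would be ax = cgram.plot(figsize=(12, 8)).
fig, ax = plt.subplots(figsize=(12, 8))
annotate_axis(ax, "Clustergram of Palmer penguins", highlight_k=3)
```

Keeping the tweaks in a small function like this makes it easy to apply the same finishing touches to several clustergrams.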

Partial plot

Clustergram.plot() can also plot only a part of the diagram, if you want to focus on a limited range of k.

cgram.plot(k_range=range(3, 10), figsize=(12, 8))
<AxesSubplot:xlabel='Number of clusters (k)', ylabel='PCA weighted mean of the clusters'>

Saving plot

Clustergram.plot() returns a matplotlib axis object, so the figure can be saved like any other matplotlib plot:

import matplotlib.pyplot as plt

cgram.plot(figsize=(12, 8))
plt.savefig("clustergram.svg")  # example file name


Interactive plots

Interactive plots can be generated using the Clustergram.bokeh() method. The method returns a bokeh Figure object, and it is up to you what to do with it. If you are using a Jupyter notebook, probably the best option is to show it directly in the cell. To do that, load the BokehJS interface using output_notebook() and then call show().

Bokeh is an optional dependency, so you need to install it first:

conda install bokeh

or

pip install bokeh

from bokeh.io import output_notebook
from bokeh.plotting import show

output_notebook()
fig = cgram.bokeh()
show(fig)

This clustergram allows you to zoom interactively into specific parts of the diagram and shows the number of observations in each cluster alongside its label. You can retrieve the labels for each observation and each iteration using Clustergram.labels.
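
The per-cluster counts shown in the interactive plot can also be derived from the labels themselves. The sketch below assumes the labels arrive as a pandas DataFrame with one column per k and one row per observation. The small table is made-up stand-in data, not output of the penguins model.

```python
import pandas as pd

# Made-up stand-in for a labels table: one column per k, one row per observation.
labels = pd.DataFrame(
    {
        1: [0, 0, 0, 0, 0, 0],
        2: [0, 0, 1, 1, 1, 0],
        3: [0, 2, 1, 1, 1, 0],
    }
)

# Number of observations assigned to each cluster, for every k.
counts = {k: labels[k].value_counts().sort_index() for k in labels.columns}

for k, per_cluster in counts.items():
    print(f"k={k}: {per_cluster.to_dict()}")
```

With a fitted model, the same counting would presumably be applied to the table retrieved from Clustergram.labels instead of the stand-in.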


A Bokeh plot can be customised in much the same way as the static one, using style dictionaries.

fig = cgram.bokeh(
    cluster_style={"color": "lightblue", "line_color": "black"},
    line_style={"color": "red", "line_dash": "dotted", "line_cap": "butt"},
    figsize=(700, 500)
)
show(fig)

You can also save a Bokeh plot as HTML to retain its interactivity, using save() instead of show().

from bokeh.plotting import output_file, save

output_file("clustergram.html")  # example file name
save(fig)

Mean options

On the y axis, a clustergram can use mean values, as in the original paper by Matthias Schonlau, or PCA weighted mean values, as in the implementation by Tal Galili. PCA weighted plots are the default because they help distinguish between different branches and make interpretation a bit easier. Both plotting backends support the same option.

cgram.plot(figsize=(12, 8), pca_weighted=True)
<AxesSubplot:xlabel='Number of clusters (k)', ylabel='PCA weighted mean of the clusters'>
cgram.plot(figsize=(12, 8), pca_weighted=False)
<AxesSubplot:xlabel='Number of clusters (k)', ylabel='Mean of the clusters'>
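
To make the difference between the two options concrete, the sketch below computes both kinds of y value for a handful of cluster centers with NumPy. It is a simplified illustration under stated assumptions, not clustergram's own code: the data and centers are made up, the plain value is taken as the mean of each center across its features, and the weighted value as the projection of each center onto the first principal component of the data.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up data: 200 observations, 4 features, standardised per feature.
data = rng.normal(size=(200, 4))
data = (data - data.mean(axis=0)) / data.std(axis=0)

# Made-up cluster centers for k=3 (one row per cluster).
centers = np.array(
    [
        [1.0, 0.5, -0.5, 0.0],
        [-1.0, 0.0, 0.5, 1.0],
        [0.0, -0.5, 0.0, -1.0],
    ]
)

# Plain option (assumption): average each center across its features.
plain_means = centers.mean(axis=1)

# Weighted option (assumption): project each center onto the first
# principal component. The rows of vt from the SVD of the centred data
# are the principal axes, ordered by explained variance.
_, _, vt = np.linalg.svd(data - data.mean(axis=0), full_matrices=False)
first_component = vt[0]
weighted_means = centers @ first_component

print("plain:   ", plain_means)
print("weighted:", weighted_means)
```

The plain mean weights every feature equally, whereas the projection weights features by how much they contribute to the main direction of variation, which is one way to see why the weighted version tends to separate branches more clearly.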