DistancesPlot

class chemotools.plotting.DistancesPlot(y: ndarray, *, x: ndarray | None = None, color_by: ndarray | None = None, annotations: list[str] | None = None, label: str = 'Data', color: str | None = None, colormap: str | None = None, marker: str = 'o', confidence_lines: bool | tuple[float | None, float | None] | None = None, color_mode: Literal['continuous', 'categorical'] | None = None, colorbar_label: str = 'Value')

Bases: BasePlot, ColoringMixin

Simple, composable distances plot for a single dataset.

This class creates scatter plots of distance measures (e.g., Q residuals, Hotelling’s T²) for outlier detection. Supports plotting one distance vs another or distance vs sample index. Multiple datasets can be overlaid by using the render() method on shared axes.
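For readers unfamiliar with the two distance measures named above, the following is a minimal NumPy sketch of how Hotelling's T² and Q residuals are commonly computed from a PCA model. The helper `pca_distances` is hypothetical and is not part of chemotools; it only illustrates the kind of 1D arrays that would be passed as `x` and `y`.

```python
import numpy as np


def pca_distances(X, n_components):
    """Return (t2, q) for each row of X, using a PCA model fitted on X itself.

    Hypothetical helper for illustration; not a chemotools function.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)  # mean-center each column
    _, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    P = Vt[:n_components].T  # loadings, shape (n_features, k)
    T = Xc @ P  # scores, shape (n_samples, k)
    score_var = s[:n_components] ** 2 / (X.shape[0] - 1)  # variance of each score
    t2 = np.sum(T**2 / score_var, axis=1)  # Hotelling's T² per sample
    residuals = Xc - T @ P.T  # part of the data the model does not explain
    q = np.sum(residuals**2, axis=1)  # Q residuals (squared prediction error)
    return t2, q


rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))  # synthetic data, for illustration only
t2, q = pca_distances(X, n_components=2)
```

Both outputs are 1D arrays with one value per sample, which is the shape the `y` parameter (and `x`, when given) expects.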

Parameters:
  • x (np.ndarray, optional) – Explicit x-axis values. Must match the length of y. When omitted, the sample index (0, 1, …, n_samples-1) is used.

  • y (np.ndarray) – Y-axis values to plot. Required. Accepts 1D arrays only.

  • color_by (np.ndarray, optional) –

    Values for coloring samples. Can be either:

    • Continuous (numeric): shows colorbar

    • Categorical (strings/classes): shows legend with discrete colors

  • annotations (list[str], optional) – Labels for annotating individual points.

  • label (str, optional) – Legend label for this dataset (default: “Data”).

  • color (str, optional) – Color for all points when color_by is None (default: auto-assigned).

  • colormap (str, optional) –

    Colormap name. Colorblind-friendly defaults:

    • “tab10” for categorical data

    • “viridis” for continuous data

  • marker (str, optional) – Marker style for scatter points (default: “o”). Examples: “o”, “s”, “^”, “v”, “D”.

  • confidence_lines (bool or tuple[float | None, float | None], optional) –

    Whether to draw confidence/threshold lines.

    • If True: draws lines at thresholds computed with the default method

    • If tuple: (x_threshold, y_threshold) values for lines

    • If False or None: no lines (default)

    Examples: True, (12.5, 5.2), (None, 5.2), (12.5, None)

  • color_mode ({"continuous", "categorical"}, optional) – Explicitly specify coloring mode. If None (default), automatically detects based on dtype and unique values of color_by.

  • colorbar_label (str, optional) – Label for the colorbar when using continuous coloring. Default is “Value”. Only applies when color_by is continuous.
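The tuple form of confidence_lines pairs an x threshold with a y threshold, and None disables one side. As a rough illustration of how the same tuple could be used to flag samples numerically, here is a small sketch. The helper `flag_outliers` is an assumption made for this example and is not part of chemotools; it treats a sample as flagged when it exceeds either active limit.

```python
import numpy as np


def flag_outliers(x, y, thresholds):
    """Flag samples beyond either limit in (x_threshold, y_threshold).

    Hypothetical helper; a None entry means that axis has no limit.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x_threshold, y_threshold = thresholds
    flags = np.zeros(y.shape, dtype=bool)
    if x_threshold is not None:
        flags |= x > x_threshold  # beyond the vertical line
    if y_threshold is not None:
        flags |= y > y_threshold  # beyond the horizontal line
    return flags


t2 = np.array([2.0, 14.0, 3.5, 9.0])
q = np.array([1.0, 2.0, 6.1, 4.9])

both = flag_outliers(t2, q, (12.5, 5.2))    # [False, True, True, False]
q_only = flag_outliers(t2, q, (None, 5.2))  # [False, False, True, False]
```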

Raises:

ValueError – If distances have invalid shapes or index selections.
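The documented input rules (y must be 1D, x must match its length, and x defaults to the sample index) can be sketched as a small validation function. This is an illustrative assumption about the checks involved, not the library's actual implementation, and `resolve_xy` is a hypothetical name.

```python
import numpy as np


def resolve_xy(y, x=None):
    """Validate y and x and fill in the sample index when x is omitted.

    Hypothetical helper mirroring the documented rules.
    """
    y = np.asarray(y)
    if y.ndim != 1:
        raise ValueError(f"y must be 1D, got shape {y.shape}")
    if x is None:
        x = np.arange(y.shape[0])  # sample index 0, 1, ..., n_samples-1
    else:
        x = np.asarray(x)
        if x.shape != y.shape:
            raise ValueError(
                f"x and y must have the same shape, got {x.shape} and {y.shape}"
            )
    return x, y


x, y = resolve_xy([0.4, 1.2, 0.9])  # x is the sample index [0, 1, 2]
```

Checking shapes before plotting gives a clearer error than the one matplotlib would raise later for mismatched arrays.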

Examples

Simple single dataset plot (Q residuals vs sample index):

>>> plot = DistancesPlot(q_residuals, confidence_lines=(None, 5.2))
>>> fig = plot.show(title="Q Residuals with Control Limit")

Multiple datasets composed together (T² vs Q):

>>> fig, ax = plt.subplots()
>>> DistancesPlot(
...     y=train_q,
...     x=train_t2,
...     label="Train",
...     color="blue",
...     confidence_lines=(12.5, 5.2),
... ).render(ax)
>>> DistancesPlot(
...     y=test_q,
...     x=test_t2,
...     label="Test",
...     color="red",
... ).render(ax)
>>> ax.set_xlabel("Hotelling's T²")
>>> ax.set_ylabel("Q Residuals")
>>> ax.legend()
>>> plt.show()

With categorical coloring:

>>> plot = DistancesPlot(
...     y=q_residuals,
...     x=t2_residuals,
...     color_by=classes,
...     confidence_lines=(12.5, 5.2),
... )
>>> fig = plot.show(title="Outliers by Class")
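When color_mode is left as None, the mode is detected from the dtype and unique values of color_by. The exact rule is not stated here, so the sketch below is only one plausible heuristic, with an assumed `max_categories` cutoff and a hypothetical function name. It is meant to show why integer class codes can be ambiguous and when passing color_mode explicitly is the safer choice.

```python
import numpy as np


def guess_color_mode(color_by, max_categories=10):
    """Guess 'categorical' or 'continuous' for a color_by array.

    Illustrative heuristic only; the cutoff of 10 is an assumption.
    """
    values = np.asarray(color_by)
    # Strings, objects, and booleans cannot be placed on a colorbar.
    if values.dtype.kind not in "iuf":
        return "categorical"
    # Integer codes with few distinct values usually represent classes.
    if values.dtype.kind in "iu" and np.unique(values).size <= max_categories:
        return "categorical"
    return "continuous"


mode_strings = guess_color_mode(["A", "B", "A"])   # 'categorical'
mode_floats = guess_color_mode([0.12, 3.4, 2.2])   # 'continuous'
mode_codes = guess_color_mode([0, 1, 1, 2])        # 'categorical'
```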

With annotations for outliers:

>>> outliers = [5, 23, 47]
>>> annotations = [f"S{i}" if i in outliers else "" for i in range(len(q_residuals))]
>>> plot = DistancesPlot(
...     y=q_residuals,
...     annotations=annotations,
...     confidence_lines=(None, 5.2),
... )
>>> fig = plot.show(title="Annotated Outliers")
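Instead of hard-coding outlier indices, the annotation list can be derived from the control limit itself. The sketch below uses made-up values and an assumed limit of 5.2, and it labels only the samples whose Q residual exceeds that limit; every other sample gets an empty string so the list keeps the same length as the data.

```python
import numpy as np

q_residuals = np.array([0.8, 6.3, 1.1, 5.9, 2.4])  # made-up values
q_limit = 5.2  # assumed control limit, matching the examples above

# One label per sample; empty strings leave a point unannotated.
annotations = [
    f"S{i}" if q > q_limit else "" for i, q in enumerate(q_residuals)
]
# annotations is ['', 'S1', '', 'S3', '']
```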

Explicit x/y arrays:

>>> plot = DistancesPlot(
...     y=q_residuals,
...     x=t2_residuals,
...     confidence_lines=(9.35, 12.0),
... )
>>> fig = plot.show(
...     title="T² vs Q",
...     xlabel="Hotelling's T²",
...     ylabel="Q Residuals",
... )

Attributes

color_by

is_categorical

colormap

colorbar_label

show(*, figsize: Tuple[float, float] | None = None, title: str | None = None, xlabel: str | None = None, ylabel: str | None = None, xlim: Tuple[float, float] | None = None, ylim: Tuple[float, float] | None = None, **kwargs: Any) → Figure

Create and return a complete figure with the distances plot.

This method handles figure creation and then delegates to render().

Parameters:
  • figsize (tuple[float, float], optional) – Figure size in inches (width, height).

  • title (str, optional) – Figure title.

  • xlabel (str, optional) – Custom x-axis label. If None, uses existing label or default.

  • ylabel (str, optional) – Custom y-axis label. If None, uses existing label or default.

  • xlim (tuple[float, float], optional) – X-axis limits as (xmin, xmax).

  • ylim (tuple[float, float], optional) – Y-axis limits as (ymin, ymax).

  • **kwargs (Any) – Additional keyword arguments passed to the render() method.

Returns:

The matplotlib Figure object containing the plot.

Return type:

Figure

render(ax: Axes | None = None, *, xlabel: str | None = None, ylabel: str | None = None, xlim: tuple[float, float] | None = None, ylim: tuple[float, float] | None = None, **kwargs: Any) → tuple[Figure, Axes]

Render the plot on the given axes or create new ones.

Use this method to compose multiple plots on the same axes.

Parameters:
  • ax (Axes, optional) – Matplotlib axes to plot on. If None, creates new figure and axes.

  • xlabel (str, optional) – Custom x-axis label. If None, uses existing label or the default label configured at initialization.

  • ylabel (str, optional) – Custom y-axis label. If None, uses existing label or the default label configured at initialization.

  • xlim (tuple[float, float], optional) – X-axis limits as (xmin, xmax).

  • ylim (tuple[float, float], optional) – Y-axis limits as (ymin, ymax).

  • **kwargs (Any) – Additional keyword arguments passed to ax.scatter().

Returns:

  • fig (Figure) – The matplotlib Figure object.

  • ax (Axes) – The matplotlib Axes object with the rendered plot.

Examples

Compose multiple datasets:

>>> fig, ax = plt.subplots()
>>> DistancesPlot(train_dist, label="Train").render(ax)
>>> DistancesPlot(test_dist, label="Test").render(ax)
>>> ax.set_xlabel("Sample index")
>>> ax.set_ylabel("Q Residuals")
>>> ax.legend()
>>> plt.show()
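To make the composition pattern concrete, here is a plain matplotlib sketch of roughly what two overlaid datasets with threshold lines look like on shared axes. It does not use chemotools and is not how render() is implemented; the data is synthetic and the limits 12.5 and 5.2 are the illustrative values used elsewhere on this page.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)
# Synthetic stand-ins for T² and Q values of a training and a test set.
train_t2, train_q = rng.chisquare(2, 40), rng.chisquare(3, 40)
test_t2, test_q = rng.chisquare(2, 15), rng.chisquare(3, 15)

fig, ax = plt.subplots()
# One scatter call per dataset on the same axes.
ax.scatter(train_t2, train_q, marker="o", color="blue", label="Train")
ax.scatter(test_t2, test_q, marker="o", color="red", label="Test")
# Threshold lines: vertical for the x limit, horizontal for the y limit.
ax.axvline(12.5, linestyle="--", color="gray")
ax.axhline(5.2, linestyle="--", color="gray")
ax.set_xlabel("Hotelling's T²")
ax.set_ylabel("Q Residuals")
ax.legend()
```

Because both datasets share one Axes, a single legend and one pair of axis labels cover the whole figure.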