DistancesPlot
- class chemotools.plotting.DistancesPlot(y: ndarray, *, x: ndarray | None = None, color_by: ndarray | None = None, annotations: list[str] | None = None, label: str = 'Data', color: str | None = None, colormap: str | None = None, marker: str = 'o', confidence_lines: bool | tuple[float | None, float | None] | None = None, color_mode: Literal['continuous', 'categorical'] | None = None, colorbar_label: str = 'Value')
Bases: BasePlot, ColoringMixin
Simple, composable distances plot for a single dataset.
This class creates scatter plots of distance measures (e.g., Q residuals, Hotelling’s T²) for outlier detection. Supports plotting one distance vs another or distance vs sample index. Multiple datasets can be overlaid by using the render() method on shared axes.
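For readers who do not yet have distance arrays at hand, the following is a minimal sketch of how Hotelling's T² and Q residuals could be computed from a PCA model with plain NumPy. The helper name pca_distances and the random demo data are assumptions made for illustration; they are not part of chemotools, and the model is fitted and evaluated on the same matrix for brevity.

```python
import numpy as np


def pca_distances(X: np.ndarray, n_components: int):
    """Return (t2, q) per sample for a PCA model fitted on X itself.

    Hypothetical helper for illustration only; not a chemotools function.
    """
    Xc = X - X.mean(axis=0)  # mean-center each variable
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)

    # Scores: projections of each sample onto the retained components.
    scores = U[:, :n_components] * s[:n_components]
    # Variance of each score column (eigenvalues of the covariance matrix).
    eigvals = s[:n_components] ** 2 / (X.shape[0] - 1)

    # Hotelling's T²: variance-scaled squared distance inside the model plane.
    t2 = np.sum(scores**2 / eigvals, axis=1)

    # Q residuals: squared distance from each sample to the model plane.
    residuals = Xc - scores @ Vt[:n_components]
    q = np.sum(residuals**2, axis=1)
    return t2, q


rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 8))  # synthetic data, 50 samples x 8 variables
t2_values, q_values = pca_distances(X_demo, n_components=3)
```

Arrays shaped like these could then be passed as DistancesPlot(y=q_values, x=t2_values), following the signature documented here.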
- Parameters:
x (np.ndarray, optional) – Explicit x-axis values. Must match the length of y. When omitted, the sample index (0, 1, …, n_samples-1) is used.
y (np.ndarray) – Y-axis values to plot. Accepts 1D arrays only.
color_by (np.ndarray, optional) –
Values for coloring samples. Can be either:
Continuous (numeric): shows colorbar
Categorical (strings/classes): shows legend with discrete colors
annotations (list[str], optional) – Labels for annotating individual points.
label (str, optional) – Legend label for this dataset (default: “Data”).
color (str, optional) – Color for all points when color_by is None (default: auto-assigned).
colormap (str, optional) –
Colormap name. Colorblind-friendly defaults:
“tab10” for categorical data
“viridis” for continuous data
marker (str, optional) – Marker style for scatter points (default: “o”). Examples: “o”, “s”, “^”, “v”, “D”.
confidence_lines (bool or tuple[float | None, float | None], optional) –
Whether to draw confidence/threshold lines.
If True: draws lines at distances using default method
If tuple: (x_threshold, y_threshold) values for lines
If False or None: no lines (default)
Examples: True, (12.5, 5.2), (None, 5.2), (12.5, None)
color_mode ({"continuous", "categorical"}, optional) – Explicitly specify coloring mode. If None (default), automatically detects based on dtype and unique values of color_by.
colorbar_label (str, optional) – Label for the colorbar when using continuous coloring. Default is “Value”. Only applies when color_by is continuous.
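The automatic choice between continuous and categorical coloring depends on the dtype and the unique values of color_by. The exact rule is not stated here, so the sketch below shows one plausible heuristic only. The function guess_color_mode and its max_categories threshold of 10 are assumptions, not the library's implementation.

```python
import numpy as np


def guess_color_mode(color_by, max_categories: int = 10) -> str:
    """Guess a coloring mode from dtype and number of unique values.

    Hypothetical heuristic for illustration; chemotools may decide differently.
    """
    values = np.asarray(color_by)

    # Strings, bytes, generic objects and booleans have no continuous scale.
    if values.dtype.kind in "OUSb":
        return "categorical"

    # Integers with few distinct values usually encode class labels.
    if values.dtype.kind in "iu" and np.unique(values).size <= max_categories:
        return "categorical"

    # Everything else (floats, many-valued integers) is treated as continuous.
    return "continuous"
```

When a heuristic of this kind could plausibly go either way, for example integer class codes with many levels, passing color_mode explicitly removes the ambiguity.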
- Raises:
ValueError – If distances have invalid shapes or index selections.
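The shape requirements described for x and y can be illustrated with a small validation sketch. The helper prepare_xy below is hypothetical and only mirrors the documented behavior (1D y, x of matching length, sample index as the default x); the checks and error messages in chemotools itself may differ.

```python
import numpy as np


def prepare_xy(y, x=None):
    """Validate inputs and fill in the default x-axis.

    Hypothetical helper for illustration; not the chemotools implementation.
    """
    y = np.asarray(y)
    if y.ndim != 1:
        raise ValueError(f"y must be 1D, got shape {y.shape}")

    if x is None:
        # Default x-axis: sample index 0, 1, ..., n_samples - 1.
        x = np.arange(y.shape[0])
    else:
        x = np.asarray(x)
        if x.shape != y.shape:
            raise ValueError(
                f"x and y must have the same shape, got {x.shape} and {y.shape}"
            )
    return x, y
```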
Examples
Simple single dataset plot (Q residuals vs sample index):
>>> from chemotools.plotting import DistancesPlot
>>> plot = DistancesPlot(q_residuals, confidence_lines=(None, 5.2))
>>> fig = plot.show(title="Q Residuals with Control Limit")
Multiple datasets composed together (T² vs Q):
>>> import matplotlib.pyplot as plt
>>> fig, ax = plt.subplots()
>>> DistancesPlot(
...     y=train_q,
...     x=train_t2,
...     label="Train",
...     color="blue",
...     confidence_lines=(12.5, 5.2),
... ).render(ax)
>>> DistancesPlot(
...     y=test_q,
...     x=test_t2,
...     label="Test",
...     color="red",
... ).render(ax)
>>> ax.set_xlabel("Hotelling's T²")
>>> ax.set_ylabel("Q Residuals")
>>> ax.legend()
>>> plt.show()
With categorical coloring:
>>> plot = DistancesPlot(
...     y=q_residuals,
...     x=t2_residuals,
...     color_by=classes,
...     confidence_lines=(12.5, 5.2),
... )
>>> fig = plot.show(title="Outliers by Class")
With annotations for outliers:
>>> outliers = [5, 23, 47]
>>> annotations = [f"S{i}" if i in outliers else "" for i in range(len(q_residuals))]
>>> plot = DistancesPlot(
...     y=q_residuals,
...     annotations=annotations,
...     confidence_lines=(None, 5.2),
... )
>>> fig = plot.show(title="Annotated Outliers")
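The example above hard-codes the outlier indices. As a sketch of an alternative, the labels could be derived from the same threshold that is drawn as a confidence line. The helper label_outliers, its prefix argument and the demo values are assumptions made for illustration.

```python
import numpy as np


def label_outliers(values, threshold, prefix="S"):
    """Return one label per sample: '<prefix><index>' above threshold, else ''.

    Hypothetical helper for illustration; not part of chemotools.
    """
    values = np.asarray(values)
    return [f"{prefix}{i}" if v > threshold else "" for i, v in enumerate(values)]


q_demo = np.array([1.2, 6.8, 0.4, 5.3, 2.1])  # made-up Q residuals
annotations_demo = label_outliers(q_demo, threshold=5.2)
# annotations_demo == ["", "S1", "", "S3", ""]
```

A list built this way has one entry per sample, matching the list[str] form that the annotations parameter accepts, and could be combined with confidence_lines=(None, 5.2) so the labels and the drawn limit stay consistent.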
Explicit x/y arrays:
>>> plot = DistancesPlot(
...     y=q_residuals,
...     x=t2_residuals,
...     confidence_lines=(9.35, 12.0),
... )
>>> fig = plot.show(
...     title="T² vs Q",
...     xlabel="Hotelling's T²",
...     ylabel="Q Residuals",
... )
Attributes
color_by
is_categorical
colormap
colorbar_label
- show(*, figsize: Tuple[float, float] | None = None, title: str | None = None, xlabel: str | None = None, ylabel: str | None = None, xlim: Tuple[float, float] | None = None, ylim: Tuple[float, float] | None = None, **kwargs: Any) → Figure
Create and return a complete figure with the distances plot.
This method handles figure creation and then delegates to render().
- Parameters:
figsize (tuple[float, float], optional) – Figure size in inches (width, height).
title (str, optional) – Figure title.
xlabel (str, optional) – Custom x-axis label. If None, uses existing label or default.
ylabel (str, optional) – Custom y-axis label. If None, uses existing label or default.
xlim (tuple[float, float], optional) – X-axis limits as (xmin, xmax).
ylim (tuple[float, float], optional) – Y-axis limits as (ymin, ymax).
**kwargs (Any) – Additional keyword arguments passed to the render() method.
- Returns:
The matplotlib Figure object containing the plot.
- Return type:
Figure
- render(ax: Axes | None = None, *, xlabel: str | None = None, ylabel: str | None = None, xlim: tuple[float, float] | None = None, ylim: tuple[float, float] | None = None, **kwargs: Any) → tuple[Figure, Axes]
Render the plot on the given axes or create new ones.
Use this method to compose multiple plots on the same axes.
- Parameters:
ax (Axes, optional) – Matplotlib axes to plot on. If None, creates new figure and axes.
xlabel (str, optional) – Custom x-axis label. If None, uses existing label or the default label configured at initialization.
ylabel (str, optional) – Custom y-axis label. If None, uses existing label or the default label configured at initialization.
xlim (tuple[float, float], optional) – X-axis limits as (xmin, xmax).
ylim (tuple[float, float], optional) – Y-axis limits as (ymin, ymax).
**kwargs (Any) – Additional keyword arguments passed to ax.scatter().
- Returns:
fig (Figure) – The matplotlib Figure object.
ax (Axes) – The matplotlib Axes object with the rendered plot.
Examples
Compose multiple datasets:
>>> import matplotlib.pyplot as plt
>>> fig, ax = plt.subplots()
>>> DistancesPlot(train_dist, label="Train").render(ax)
>>> DistancesPlot(test_dist, label="Test").render(ax)
>>> ax.set_xlabel("Hotelling T²")
>>> ax.set_ylabel("Q Residuals")
>>> ax.legend()
>>> plt.show()
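To make the split between show() and render() concrete, here is a minimal sketch of the composition pattern using matplotlib only. ScatterLayer is a hypothetical stand-in written for this illustration; it is not chemotools code and omits coloring, annotations and confidence lines.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np


class ScatterLayer:
    """Hypothetical stand-in for the render()/show() split; not chemotools code."""

    def __init__(self, y, x=None, label="Data"):
        self.y = np.asarray(y)
        # Fall back to the sample index when no x values are given.
        self.x = np.arange(self.y.size) if x is None else np.asarray(x)
        self.label = label

    def render(self, ax=None, **kwargs):
        # Reuse the caller's axes when provided, otherwise create new ones.
        if ax is None:
            fig, ax = plt.subplots()
        else:
            fig = ax.figure
        # Extra keyword arguments are forwarded to ax.scatter().
        ax.scatter(self.x, self.y, label=self.label, **kwargs)
        return fig, ax

    def show(self, title=None, **kwargs):
        # Create the figure through render(), then apply figure-level options.
        fig, ax = self.render(**kwargs)
        if title is not None:
            ax.set_title(title)
        return fig


fig, ax = plt.subplots()
ScatterLayer([1.0, 2.0, 1.5], label="Train").render(ax)
ScatterLayer([1.2, 0.8, 2.4], label="Test").render(ax, marker="s")
ax.legend()
```

Because render() returns the axes it drew on, each layer adds one scatter collection to the shared axes and the caller stays in control of labels, legend and display.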