ResidualDistributionPlot
- class chemotools.plotting.ResidualDistributionPlot(residuals: ndarray, *, target_index: int = 0, bins: int | str = 'auto', density: bool = True, add_normal_curve: bool = True, add_stats: bool = True, color: str = '#008BFB', alpha: float = 0.6)
Bases: BasePlot
Histogram plot of residuals to assess normality and distribution shape.
This class creates histogram plots of residuals with an optional overlay of the theoretical normal distribution. It is useful for visually assessing whether residuals follow a normal distribution and for detecting skewness or outliers.
- Parameters:
residuals (np.ndarray) – Residual values with shape (n_samples,) for univariate or (n_samples, n_targets) for multivariate regression.
target_index (int, optional) – For multivariate residuals, which target to plot (default: 0). Ignored if residuals is 1D.
bins (int or str, optional) – Number of histogram bins or binning strategy (default: “auto”). Can be int or any value accepted by np.histogram_bin_edges.
density (bool, optional) – If True, normalize histogram to form probability density (default: True). Required for overlaying theoretical normal distribution.
add_normal_curve (bool, optional) – Whether to overlay theoretical normal distribution curve (default: True).
add_stats (bool, optional) – Whether to add text box with distribution statistics (default: True). Shows mean, std, skewness, and kurtosis.
color (str, optional) – Color for the histogram bars (default: “#008BFB”).
alpha (float, optional) – Transparency of histogram bars (default: 0.6).
- Raises:
ValueError – If residuals does not have shape (n_samples,) or (n_samples, n_targets).
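The shape rules for residuals and target_index can be illustrated with a small sketch. The helper `select_target` below is hypothetical and not part of chemotools; it only assumes that 1D input is used as is, that 2D input is sliced by column, and that any other shape raises ValueError.

```python
import numpy as np


def select_target(residuals: np.ndarray, target_index: int = 0) -> np.ndarray:
    """Hypothetical helper: return the 1D residual vector that would be plotted."""
    residuals = np.asarray(residuals, dtype=float)
    if residuals.ndim == 1:
        # Univariate case: target_index is ignored.
        return residuals
    if residuals.ndim == 2:
        n_targets = residuals.shape[1]
        if not 0 <= target_index < n_targets:
            raise ValueError(
                f"target_index {target_index} out of range for {n_targets} targets"
            )
        # Multivariate case: pick one column of shape (n_samples,).
        return residuals[:, target_index]
    raise ValueError(f"expected 1D or 2D residuals, got {residuals.ndim}D")


multi = np.array([[0.1, -1.0], [-0.2, 2.0], [0.3, -3.0]])  # shape (3, 2)
second = select_target(multi, target_index=1)       # second target column
flat = select_target(multi[:, 0], target_index=5)   # 1D input: index ignored
```

The actual validation performed by the class may differ in detail, for example in how an out-of-range target_index is reported.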
Examples
Basic histogram:
>>> from chemotools.plotting import ResidualDistributionPlot
>>> residuals = y_true - y_pred
>>> plot = ResidualDistributionPlot(residuals)
>>> fig = plot.show(title="Distribution of Residuals")
Without normal curve overlay:
>>> plot = ResidualDistributionPlot(residuals, add_normal_curve=False)
>>> fig = plot.show(title="Residual Histogram")
Custom number of bins:
>>> plot = ResidualDistributionPlot(residuals, bins=30)
>>> fig = plot.show(title="Residual Distribution (30 bins)")
Multiple targets side by side:
>>> import matplotlib.pyplot as plt
>>> residuals = y_true - y_pred  # shape (n_samples, n_targets)
>>> fig, axes = plt.subplots(1, 3, figsize=(15, 5))
>>> for i in range(3):
...     ResidualDistributionPlot(residuals, target_index=i).render(axes[i])
...     axes[i].set_title(f"Target {i+1}")
>>> plt.tight_layout()
>>> plt.show()
Without statistics text box:
>>> plot = ResidualDistributionPlot(residuals, add_stats=False)
>>> fig = plot.show(title="Clean Histogram")
Count histogram instead of density:
>>> plot = ResidualDistributionPlot(residuals, density=False, add_normal_curve=False)
>>> fig = plot.show(title="Residual Counts", ylabel="Count")
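The reason density=True is needed for the normal curve overlay can be shown with NumPy alone: a density histogram has unit area, just like a probability density function, whereas raw counts sit on a different scale. The sketch below uses synthetic residuals and assumes the curve is parameterised by the sample mean and the population standard deviation (ddof=0); the library may compute it differently.

```python
import numpy as np

rng = np.random.default_rng(0)
residuals = rng.normal(loc=0.0, scale=0.5, size=500)  # synthetic residuals

# Density histogram: bar heights times bin widths sum to 1.
heights, edges = np.histogram(residuals, bins="auto", density=True)
area = float(np.sum(heights * np.diff(edges)))

# Normal pdf evaluated over the histogram range.
# Assumption: mean and std are estimated from the residuals themselves.
mu = residuals.mean()
sigma = residuals.std()
x = np.linspace(edges[0], edges[-1], 200)
pdf = np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

# Count histogram on the same bins: heights sum to n_samples instead,
# so a pdf drawn on top of it would be on the wrong scale.
counts, _ = np.histogram(residuals, bins=edges)
```

Here `heights` and `pdf` share the same vertical scale, while `counts` does not, which is why the count example above also disables the normal curve.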
Notes
The statistics shown when add_stats=True include:
- Mean: Should be close to 0 for good regression
- Std: Standard deviation of residuals
- Skewness: Measure of asymmetry (0 for normal)
- Kurtosis: Measure of tail heaviness (0 for normal, excess kurtosis)
For normally distributed residuals:
- Histogram should be bell-shaped
- Should match the overlaid normal curve
- Skewness ≈ 0, Kurtosis ≈ 0
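As a rough illustration of the four statistics, the sketch below computes them from standardized moments. The function `distribution_stats` is hypothetical, and the choice of population (biased) estimators is an assumption; the class may rely on a different estimator or on scipy.stats.

```python
import numpy as np


def distribution_stats(residuals: np.ndarray) -> dict[str, float]:
    """Hypothetical helper: mean, std, skewness and excess kurtosis."""
    r = np.asarray(residuals, dtype=float)
    mean = r.mean()
    std = r.std()  # population std (ddof=0), an assumption
    z = (r - mean) / std  # standardized residuals
    return {
        "mean": float(mean),
        "std": float(std),
        "skewness": float(np.mean(z**3)),        # 0 for a symmetric sample
        "kurtosis": float(np.mean(z**4) - 3.0),  # excess kurtosis, 0 for normal
    }


symmetric = distribution_stats(np.array([-2.0, -1.0, 0.0, 1.0, 2.0]))
right_skewed = distribution_stats(np.array([0.0, 0.0, 0.0, 0.0, 10.0]))
```

The symmetric sample has zero skewness and negative excess kurtosis (lighter tails than a normal distribution), while the sample with a single large positive residual has positive skewness.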
- show(*, figsize: Tuple[float, float] | None = None, title: str | None = None, xlabel: str | None = None, ylabel: str | None = None, xlim: Tuple[float, float] | None = None, ylim: Tuple[float, float] | None = None, **kwargs: Any) -> Figure
Create and return a complete figure with the residual distribution plot.
This method handles figure creation and then delegates to render().
- Parameters:
figsize (tuple[float, float], optional) – Figure size in inches (width, height).
title (str, optional) – Figure title.
xlabel (str, optional) – Custom x-axis label. If None, uses existing label or default.
ylabel (str, optional) – Custom y-axis label. If None, uses existing label or default.
xlim (tuple[float, float], optional) – X-axis limits as (xmin, xmax).
ylim (tuple[float, float], optional) – Y-axis limits as (ymin, ymax).
**kwargs (Any) – Additional keyword arguments passed to the render() method.
- Returns:
The matplotlib Figure object containing the plot.
- Return type:
Figure
- render(ax: Axes | None = None, *, xlabel: str | None = None, ylabel: str | None = None, xlim: tuple[float, float] | None = None, ylim: tuple[float, float] | None = None, **kwargs: Any) -> tuple[Figure, Axes]
Render the plot on existing or new axes.
- Parameters:
ax (Axes, optional) – Matplotlib axes to render on. If None, creates new figure/axes.
xlabel (str, optional) – Custom x-axis label. If None, uses existing label or default.
ylabel (str, optional) – Custom y-axis label. If None, uses existing label or default.
xlim (tuple[float, float], optional) – X-axis limits (min, max).
ylim (tuple[float, float], optional) – Y-axis limits (min, max).
**kwargs (Any) – Additional keyword arguments passed to histogram.
- Returns:
The Figure and Axes objects containing the plot.
- Return type:
tuple[Figure, Axes]