WhittakerSmooth

class chemotools.smooth.WhittakerSmooth(lam: float = 10000.0, weights: ndarray | None = None, solver_type: Literal['banded', 'sparse'] = 'banded')

Bases: _BaseWhittaker

Whittaker smoothing for noise reduction and signal trend estimation.

Whittaker smoothing is a penalized least squares method that estimates smooth trends from noisy data by balancing fidelity to the input signal against a smoothness constraint. The penalty term is built from a second-order difference operator, which penalizes curvature so that the estimated signal is smooth while preserving the overall shape of the input.
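The penalized least squares formulation described above can be sketched with NumPy and SciPy. This is a minimal illustration under stated assumptions, not the chemotools implementation: the function name `whittaker_sketch` is hypothetical, all points are weighted equally, and the smoothed signal z is obtained by solving (I + lam * DᵀD) z = y, where D is the second-order difference operator.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve


def whittaker_sketch(y, lam=1e4):
    """Illustrative Whittaker smoother (hypothetical helper, not the chemotools code)."""
    n = y.shape[0]
    # Second-order difference operator D of shape (n - 2, n); each row is [1, -2, 1].
    D = sparse.diags([1.0, -2.0, 1.0], [0, 1, 2], shape=(n - 2, n), format="csc")
    # Minimizing |y - z|^2 + lam * |D z|^2 gives the linear system
    # (I + lam * D^T D) z = y.
    A = (sparse.identity(n, format="csc") + lam * (D.T @ D)).tocsc()
    return spsolve(A, y)


# Synthetic noisy signal for demonstration purposes only.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 200)
noisy = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(x.size)
smooth = whittaker_sketch(noisy, lam=1e2)
```

Because the penalty only measures second differences, a perfectly linear signal has zero penalty and passes through this sketch unchanged, whatever the value of lam.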

The Whittaker smoothing step can be solved using either:
  • a banded solver (fast and memory-efficient, recommended for most spectra), or

  • a sparse LU solver (more stable for ill-conditioned problems).
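To illustrate the two solver strategies, the sketch below solves the same Whittaker system once with a sparse LU factorization and once with a symmetric banded solver. It assumes uniform weights and uses SciPy routines directly; how chemotools implements each solver_type internally is not shown here and may differ.

```python
import numpy as np
from scipy import sparse
from scipy.linalg import solveh_banded
from scipy.sparse.linalg import splu

n, lam = 100, 1e3
rng = np.random.default_rng(1)
y = np.cumsum(rng.standard_normal(n))  # synthetic random-walk signal

# System matrix A = I + lam * D^T D, with D the second-order difference operator.
D = sparse.diags([1.0, -2.0, 1.0], [0, 1, 2], shape=(n - 2, n), format="csc")
A = (sparse.identity(n, format="csc") + lam * (D.T @ D)).tocsc()

# Sparse route: LU factorization of the sparse matrix.
z_sparse = splu(A).solve(y)

# Banded route: A is symmetric positive definite and pentadiagonal, so only
# the main diagonal and the two superdiagonals are stored (upper banded form).
ab = np.zeros((3, n))
ab[2, :] = A.diagonal(0)
ab[1, 1:] = A.diagonal(1)
ab[0, 2:] = A.diagonal(2)
z_banded = solveh_banded(ab, y)
```

On a well-conditioned problem like this one, both routes return the same smoothed signal up to floating-point error; the banded form stores only 3 × n values.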

Optional weights can be provided to emphasize or downweight certain observations during smoothing. If no weights are supplied, all points are treated equally.
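The effect of weights can be sketched with the weighted form of the same linear system, (W + lam * DᵀD) z = W y, where W is a diagonal matrix of the weights. The helper name `weighted_whittaker_sketch` and the synthetic data are assumptions made for illustration; this is not the chemotools code path.

```python
import numpy as np
from scipy import sparse
from scipy.sparse.linalg import spsolve


def weighted_whittaker_sketch(y, weights, lam=1e2):
    """Illustrative weighted Whittaker smoother (hypothetical helper)."""
    n = y.shape[0]
    W = sparse.diags(weights, 0, format="csc")
    D = sparse.diags([1.0, -2.0, 1.0], [0, 1, 2], shape=(n - 2, n), format="csc")
    # Weighted normal equations: (W + lam * D^T D) z = W y
    A = (W + lam * (D.T @ D)).tocsc()
    return spsolve(A, weights * y)


# A straight line with a single outlier at index 25.
y = np.linspace(0.0, 1.0, 50)
y_spiked = y.copy()
y_spiked[25] += 5.0

uniform = np.ones_like(y)  # equal weights: the outlier pulls the fit upward
masked = uniform.copy()
masked[25] = 0.0           # zero weight: the outlier is ignored

z_uniform = weighted_whittaker_sketch(y_spiked, uniform)
z_masked = weighted_whittaker_sketch(y_spiked, masked)
```

With the outlier given zero weight, the remaining points lie on a straight line, so the smoothed result follows that line through the masked position. With equal weights, the fit is distorted around the outlier.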

Parameters:
  • lam (float, default=1e4) – Regularization parameter controlling smoothness of the fitted signal. Larger values yield smoother trends.

  • weights (ndarray of shape (n_features,), optional, default=None) – Non-negative weights applied to each observation. If None, all observations are weighted equally.

  • solver_type (Literal["banded", "sparse"], default="banded") – If "banded", use the banded solver for Whittaker smoothing. If "sparse", use a sparse LU decomposition.

Variables:

n_features_in_ (int) – The number of features in the training data.

References

[1] Eilers, P.H. (2003). “A perfect smoother.” Analytical Chemistry 75 (14), 3631–3636.

Examples

>>> from chemotools.datasets import load_fermentation_train
>>> from chemotools.smooth import WhittakerSmooth
>>> # Load sample data
>>> X, _ = load_fermentation_train()
>>> # Initialize WhittakerSmooth
>>> ws = WhittakerSmooth()
>>> # Fit and transform the data
>>> X_smoothed = ws.fit_transform(X)

fit(X: ndarray, y=None) → WhittakerSmooth

Fit the Whittaker smoother to input data.

Parameters:
  • X (ndarray of shape (n_samples, n_features)) – The input data matrix, where rows correspond to samples and columns correspond to features (e.g., spectra).

  • y (None) – Ignored, present for API consistency with scikit-learn.

Returns:

self – Fitted estimator.

Return type:

WhittakerSmooth

transform(X: ndarray, y=None) → ndarray

Apply Whittaker smoothing to input data.

Parameters:
  • X (ndarray of shape (n_samples, n_features)) – The input data matrix to smooth.

  • y (None) – Ignored, present for API consistency with scikit-learn.

Returns:

X_transformed – The smoothed version of the input data.

Return type:

ndarray of shape (n_samples, n_features)