WhittakerSmooth
- class chemotools.smooth.WhittakerSmooth(lam: float = 10000.0, weights: ndarray | None = None, solver_type: Literal['banded', 'sparse'] = 'banded')
Bases: _BaseWhittaker
Whittaker smoothing for noise reduction and signal trend estimation.
Whittaker smoothing is a penalized least squares method that estimates smooth trends from noisy data by balancing fidelity to the input signal with a smoothness constraint. A second-order difference operator is used as the penalty term, ensuring that the estimated signal is smooth while preserving overall shape.
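To make the penalized least squares idea concrete, here is a minimal sketch that minimizes sum((y - z)^2) + lam * sum((D z)^2), where D is the second-order difference operator. It is an illustration written with plain NumPy and a dense solver, not the chemotools implementation; the function name `whittaker_smooth_dense` and the synthetic signal are assumptions made for this example.

```python
import numpy as np


def whittaker_smooth_dense(y, lam=1e4):
    """Illustrative smoother: solve (I + lam * D^T D) z = y for z."""
    y = np.asarray(y, dtype=float)
    n = y.size
    # Second-order difference operator D with shape (n - 2, n).
    # Each row holds the stencil [1, -2, 1].
    D = np.diff(np.eye(n), n=2, axis=0)
    # Normal equations of the penalized least squares problem.
    A = np.eye(n) + lam * (D.T @ D)
    return np.linalg.solve(A, y)


# Synthetic noisy signal (hypothetical data, for illustration only).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 200)
noisy = np.sin(2 * np.pi * x) + rng.normal(scale=0.2, size=x.size)

smooth = whittaker_smooth_dense(noisy, lam=1e2)

# Roughness is the sum of squared second differences; smoothing lowers it.
roughness_before = np.sum(np.diff(noisy, n=2) ** 2)
roughness_after = np.sum(np.diff(smooth, n=2) ** 2)
```

A straight line has zero second differences, so the penalty leaves it unchanged for any lam. Larger lam values pull the estimate toward such low-curvature shapes. The dense matrix here is only practical for short signals.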
The Whittaker smoothing step can be solved using either:
- a banded solver (fast and memory-efficient, recommended for most spectra), or
- a sparse LU solver (more stable for ill-conditioned problems).
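The two solver strategies can be compared on the same linear system. The sketch below builds the pentadiagonal matrix W + lam * D^T D and solves it once with SciPy's banded Cholesky routine (`scipy.linalg.solveh_banded`) and once with SciPy's sparse LU factorization (`scipy.sparse.linalg.splu`). These SciPy calls are stand-ins chosen for illustration; the source does not state which routines chemotools uses internally, and the helper names are hypothetical.

```python
import numpy as np
from scipy import sparse
from scipy.linalg import solveh_banded
from scipy.sparse.linalg import splu


def build_system(n, lam, w):
    """Return the sparse matrix W + lam * D^T D in CSC format."""
    # Second-order difference operator with shape (n - 2, n).
    D = sparse.diags([1.0, -2.0, 1.0], [0, 1, 2], shape=(n - 2, n))
    W = sparse.diags([w], [0])
    return (W + lam * (D.T @ D)).tocsc()


def solve_with_banded(A, rhs):
    """Solve using upper banded storage (two superdiagonals plus the main)."""
    n = A.shape[0]
    ab = np.zeros((3, n))
    ab[0, 2:] = A.diagonal(2)  # second superdiagonal
    ab[1, 1:] = A.diagonal(1)  # first superdiagonal
    ab[2, :] = A.diagonal(0)   # main diagonal
    return solveh_banded(ab, rhs, lower=False)


def solve_with_sparse_lu(A, rhs):
    """Solve using a sparse LU factorization."""
    return splu(A).solve(rhs)


# Hypothetical test signal with equal weights.
rng = np.random.default_rng(1)
n = 500
y = np.cos(np.linspace(0.0, 6.0, n)) + rng.normal(scale=0.1, size=n)
w = np.ones(n)

A = build_system(n, lam=1e4, w=w)
z_banded = solve_with_banded(A, w * y)
z_sparse = solve_with_sparse_lu(A, w * y)

max_abs_diff = np.max(np.abs(z_banded - z_sparse))
```

On a well-conditioned problem like this one, both routes should return the same smoothed signal up to floating-point error. The banded form stores only three rows of length n, which is where its memory advantage comes from.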
Optional weights can be provided to emphasize or downweight certain observations during smoothing. If no weights are supplied, all points are treated equally.
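As an illustration of how weights change the result, the following sketch gives a single spike a weight of zero so that the smoother ignores it and bridges the gap from neighbouring points. It uses a plain NumPy dense solve rather than the chemotools class, and the helper `smooth_weighted`, the ramp signal, and the spike position are assumptions made for this example.

```python
import numpy as np


def smooth_weighted(y, lam, w):
    """Illustrative weighted smoother: solve (W + lam * D^T D) z = W y."""
    n = y.size
    D = np.diff(np.eye(n), n=2, axis=0)  # second-order differences
    A = np.diag(w) + lam * (D.T @ D)
    return np.linalg.solve(A, w * y)


# A linear ramp with one artificial spike (hypothetical data).
x = np.linspace(0.0, 1.0, 100)
truth = 3.0 * x
y = truth.copy()
y[50] += 5.0

equal = np.ones_like(y)   # all observations treated equally
masked = equal.copy()
masked[50] = 0.0          # downweight the spike completely

z_equal = smooth_weighted(y, 1e2, equal)
z_masked = smooth_weighted(y, 1e2, masked)

# Largest deviation from the underlying ramp in each case.
err_equal = np.abs(z_equal - truth).max()
err_masked = np.abs(z_masked - truth).max()
```

With equal weights the spike leaks into the smoothed curve. With the spike's weight set to zero, the remaining points lie on a straight line, so the estimate recovers the ramp.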
- Parameters:
lam (float, default=1e4) – Regularization parameter controlling smoothness of the fitted signal. Larger values yield smoother trends.
weights (ndarray of shape (n_features,), optional, default=None) – Non-negative weights applied to each observation. If None, all observations are weighted equally.
solver_type (Literal["banded", "sparse"], default="banded") – If “banded”, use the banded solver for Whittaker smoothing. If “sparse”, use a sparse LU decomposition.
- Variables:
n_features_in_ (int) – The number of features in the training data.
References
- [1] Eilers, P.H. (2003). “A perfect smoother.” Analytical Chemistry 75 (14), 3631–3636.
Examples
>>> from chemotools.datasets import load_fermentation_train
>>> from chemotools.smooth import WhittakerSmooth
>>> # Load sample data
>>> X, _ = load_fermentation_train()
>>> # Initialize WhittakerSmooth
>>> ws = WhittakerSmooth()
>>> # Fit and transform the data
>>> X_smoothed = ws.fit_transform(X)
- fit(X: ndarray, y=None) → WhittakerSmooth
Fit the Whittaker smoother to input data.
- Parameters:
X (ndarray of shape (n_samples, n_features)) – The input data matrix, where rows correspond to samples and columns correspond to features (e.g., spectra).
y (None) – Ignored, present for API consistency with scikit-learn.
- Returns:
self – Fitted estimator.
- Return type:
WhittakerSmooth
- transform(X: ndarray, y=None) → ndarray
Apply Whittaker smoothing to input data.
- Parameters:
X (ndarray of shape (n_samples, n_features)) – The input data matrix to smooth.
y (None) – Ignored, present for API consistency with scikit-learn.
- Returns:
X_transformed – The smoothed version of the input data.
- Return type:
ndarray of shape (n_samples, n_features)