MultiplicativeScatterCorrection

class chemotools.scatter.MultiplicativeScatterCorrection(method: Literal['mean', 'median'] = 'mean', reference: ndarray | None = None, weights: ndarray | None = None)

Bases: OneToOneFeatureMixin, TransformerMixin, BaseEstimator

Multiplicative Scatter Correction (MSC).

MSC compensates for additive and multiplicative scatter effects in spectral data such as NIR. Each spectrum is regressed against a reference spectrum (usually the mean or median of the training set) using Ordinary Least Squares (OLS) or, when weights are supplied, Weighted Least Squares (WLS). The fitted offset and slope are then removed from the spectrum.

Read more in the User Guide.

Parameters:
  • method ({"mean", "median"}, default="mean") – The statistic used to calculate the reference spectrum if reference is None.
      - "mean": Use the average spectrum of the training set.
      - "median": Use the median spectrum of the training set.

  • reference (array-like of shape (n_features,), default=None) – A custom reference spectrum to use for the correction. If provided, method is ignored.

  • weights (array-like of shape (n_features,), default=None) – Weighting vector applied during the linear regression for each spectrum. Useful for de-emphasizing noisy wavelengths.

Variables:
  • reference (ndarray of shape (n_features,)) – The reference spectrum used for the correction, either passed via reference or calculated during fit().

  • weights (ndarray of shape (n_features,)) – The weights used in the correction. Defaults to a vector of ones.

  • n_features_in_ (int) – Number of features seen during fit.

  • feature_names_in_ (ndarray of shape (n_features_in_,)) – Names of features seen during fit. Defined only when X has feature names that are all strings.

  • pinv_A (ndarray of shape (2, n_features)) – The precomputed weighted pseudo-inverse of the design matrix used to solve for $m$ (slope) and $c$ (intercept) efficiently.
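
As a rough illustration of what such a precomputed pseudo-inverse could look like, the sketch below builds it with plain NumPy. It is not the library's implementation: the helper name weighted_pinv, the column order of the design matrix (reference first, ones second) and the use of the weights as a diagonal matrix W in (AᵀWA)⁻¹AᵀW are all assumptions made for this example.

```python
import numpy as np


def weighted_pinv(reference, weights=None):
    """Hypothetical helper: weighted pseudo-inverse of A = [reference, 1]."""
    reference = np.asarray(reference, dtype=float)
    if weights is None:
        # Unit weights reduce weighted least squares to ordinary least squares.
        weights = np.ones_like(reference)
    weights = np.asarray(weights, dtype=float)

    # Design matrix, shape (n_features, 2). Assumed column order:
    # column 0 -> slope m, column 1 -> intercept c.
    A = np.column_stack([reference, np.ones_like(reference)])

    # A^T W with W = diag(weights), shape (2, n_features).
    AtW = A.T * weights

    # (A^T W A)^{-1} A^T W, shape (2, n_features).
    return np.linalg.solve(AtW @ A, AtW)


rng = np.random.default_rng(0)
reference = rng.random(100)
pinv_A = weighted_pinv(reference)

# A spectrum that is an exact scaled and shifted copy of the reference.
spectrum = 1.5 * reference + 0.2
m, c = pinv_A @ spectrum
```

Because the matrix depends only on the reference and the weights, it can be computed once and reused: for a batch X of shape (n_samples, n_features), X @ pinv_A.T yields one (m, c) pair per spectrum.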

Notes

The correction follows the linear model:

\[x_{raw} = m \cdot x_{ref} + c + e\]

where $x_{raw}$ is the observed spectrum, $x_{ref}$ is the reference spectrum, $m$ is the multiplicative scaling, $c$ is the additive offset, and $e$ is the residual. The corrected spectrum is calculated as:

\[x_{corr} = \frac{x_{raw} - c}{m}\]
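
The two formulas above can be sketched in a few lines of NumPy. This is a minimal unweighted illustration, assuming the mean spectrum as the reference. The function name msc_correct is hypothetical and the code is not taken from the library.

```python
import numpy as np


def msc_correct(X, reference=None):
    """Hypothetical helper: unweighted MSC of the rows of X."""
    X = np.asarray(X, dtype=float)
    if reference is None:
        # Assumed default: mean spectrum of the data set.
        reference = X.mean(axis=0)

    # Fit x_raw = m * x_ref + c for all spectra at once (ordinary least squares).
    A = np.column_stack([reference, np.ones_like(reference)])
    coefs, *_ = np.linalg.lstsq(A, X.T, rcond=None)  # shape (2, n_samples)
    m, c = coefs[0], coefs[1]

    # x_corr = (x_raw - c) / m, applied row by row.
    return (X - c[:, None]) / m[:, None]


# Synthetic spectra: one underlying shape with different scatter effects.
rng = np.random.default_rng(0)
base = rng.random(50)
slopes = np.array([0.8, 1.0, 1.3])
offsets = np.array([0.1, 0.0, -0.2])
X = slopes[:, None] * base + offsets[:, None]

X_corr = msc_correct(X)
```

In this noise-free case every spectrum is an exact linear function of the reference, so all corrected rows collapse onto the mean spectrum of X. With real data the residual $e$ remains in the corrected spectra, scaled by 1/m.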

Examples

>>> import numpy as np
>>> from chemotools.scatter import MultiplicativeScatterCorrection
>>> X = np.random.rand(10, 100)
>>> msc = MultiplicativeScatterCorrection(method='mean')
>>> msc.fit(X)
MultiplicativeScatterCorrection()
>>> X_corr = msc.transform(X)

fit(X, y=None)

transform(X)