Leverage

class chemotools.outliers.Leverage(model: _BasePCA | _PLS | Pipeline, confidence: float = 0.95)

Bases: _ModelResidualsBase

Calculate the leverage of the training samples in the latent space of a PLS model. Samples whose leverage exceeds the critical value derived from the confidence level are flagged as outliers.
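As an illustration of what "leverage in the latent space" means, the sketch below computes the textbook quantity h_i = t_i^T (T^T T)^-1 t_i for a small score matrix T. It is not taken from the chemotools source: the function name `leverages` and the score values are hypothetical, only the standard library is used, and the score columns are assumed to be mutually orthogonal (as training-set PCA and PLS scores are), so T^T T is diagonal. Some texts add a 1/n term to this formula; whether the library does so is not stated here.

```python
def leverages(scores):
    """Return one leverage value per sample from a list of score rows.

    Assumes the score columns are mutually orthogonal, so T^T T is
    diagonal and its inverse is just the reciprocal of each column's
    sum of squares.
    """
    n_components = len(scores[0])
    # Diagonal of T^T T: sum of squared scores for each component.
    column_ss = [sum(row[a] ** 2 for row in scores) for a in range(n_components)]
    return [
        sum(row[a] ** 2 / column_ss[a] for a in range(n_components))
        for row in scores
    ]


# Hypothetical, mean-centered scores for four samples on two orthogonal components.
T = [
    [3.0, 0.0],
    [-1.0, 1.0],
    [-1.0, -1.0],
    [-1.0, 0.0],
]

h = leverages(T)
# The first sample lies farthest out along component 1 and gets the largest leverage.
most_influential = max(range(len(h)), key=lambda i: h[i])
```

Under this formula the leverages sum to the number of components, so a sample whose value is far above the average n_components / n_samples pulls the model disproportionately toward itself.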

Parameters:
  • model (Union[ModelType, Pipeline]) – A fitted PLSRegression model or Pipeline ending with such a model

  • confidence (float, default=0.95) – Confidence level for statistical calculations (between 0 and 1)

Variables:
  • estimator (ModelType) – The fitted model of type _PLS

  • transformer (Optional[Pipeline]) – Preprocessing steps before the model

  • n_features_in (int) – Number of features in the input data

  • n_components (int) – Number of components in the model

  • n_samples (int) – Number of samples used to train the model

  • critical_value (float) – The calculated critical value for outlier detection

References

[1] Kim H. Esbensen, “Multivariate Data Analysis - In Practice”, 5th Edition, 2002.

Examples

>>> import numpy as np
>>> from sklearn.cross_decomposition import PLSRegression
>>> from chemotools.outliers import Leverage
>>> X = np.random.rand(100, 10)
>>> y = np.random.rand(100)
>>> pls = PLSRegression(n_components=3).fit(X, y)
>>> # Initialize Leverage with the fitted PLS model
>>> leverage = Leverage(pls, confidence=0.95)
>>> leverage.fit(X, y)
>>> # Predict outliers in the dataset
>>> outliers = leverage.predict(X)
>>> # Get the leverage of the samples
>>> leverages = leverage.predict_residuals(X)
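The labels returned by predict can then be used to pick out the flagged samples. The following is a minimal sketch that assumes the scikit-learn outlier convention (-1 for an outlier, 1 for an inlier); the `labels` list is hypothetical stand-in data, not real output from the example above.

```python
# Hypothetical labels for five samples, assuming -1 marks an outlier
# and 1 marks an inlier (the scikit-learn convention).
labels = [1, 1, -1, 1, -1]

# Positions of the samples flagged as outliers.
outlier_indices = [i for i, label in enumerate(labels) if label == -1]

# Positions of the samples to keep, e.g. before refitting the model.
inlier_indices = [i for i, label in enumerate(labels) if label == 1]
```

With NumPy arrays the same selection is usually written as a boolean mask, for example `X[outliers == 1]`, under the same assumed convention.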

Attributes

critical_value_