Leverage

class chemotools.outliers.Leverage(model: _BasePCA | _PLS | Pipeline, confidence: float = 0.95)[source]

Bases: _ModelResidualsBase

Calculate the leverage of the training samples in the latent space of a PLS model. This makes it possible to detect data points that have a high leverage on the model.

Parameters:

model (Union[ModelType, Pipeline]) – A fitted PLSRegression model or Pipeline ending with such a model
confidence (float, default=0.95) – Confidence level for statistical calculations (between 0 and 1)

Variables:

estimator (ModelType) – The fitted model of type _PLS
transformer (Optional[Pipeline]) – Preprocessing steps before the model
n_features_in (int) – Number of features in the input data
n_components (int) – Number of components in the model
n_samples (int) – Number of samples used to train the model
critical_value (float) – The calculated critical value for outlier detection

References

[1] Kim H. Esbensen: “Multivariate Data Analysis - In Practice”, 5th Edition, 2002.

Examples

>>> import numpy as np
>>> from sklearn.cross_decomposition import PLSRegression
>>> from chemotools.outliers import Leverage
>>> X = np.random.rand(100, 10)
>>> y = np.random.rand(100)
>>> pls = PLSRegression(n_components=3).fit(X, y)
>>> # Initialize Leverage with the fitted PLS model
>>> leverage = Leverage(pls, confidence=0.95)
>>> # Fit to compute the critical value
>>> leverage.fit(X, y)
Leverage(model=PLSRegression(n_components=3), confidence=0.95)
>>> # Predict outliers in the dataset
>>> outliers = leverage.predict(X)
>>> # Get the leverage of the samples
>>> leverages = leverage.predict_residuals(X)

fit(X: ndarray, y: ndarray | None = None) → Leverage[source]

Fit the model to the input data.

Parameters:

X (array-like of shape (n_samples, n_features)) – Input data
y (array-like of shape (n_samples,), default=None) – Target data

Returns:

self – Fitted estimator with the critical threshold computed

Return type:

Leverage

predict(X: ndarray, y: ndarray | None = None) → ndarray[source]

Identify the samples whose leverage on the model is above the critical value.

Parameters:

X (array-like of shape (n_samples, n_features)) – Input data

Returns:

Boolean array marking the samples with a leverage above the critical value

Return type:

ndarray of shape (n_samples,)
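
The thresholding step can be sketched with NumPy alone. The example below assumes the critical value is taken as the `confidence` percentile of the training leverages. This is an assumption made for illustration, and the rule chemotools actually applies may differ. The leverage values are made up.

```python
import numpy as np

# Illustrative leverage values; the last sample is far from the others
leverages = np.array([0.01, 0.02, 0.03, 0.02, 0.50])
confidence = 0.95

# Assumed rule: threshold at the `confidence` percentile of the leverages
critical_value = np.percentile(leverages, confidence * 100)

# Boolean mask, True where the leverage exceeds the threshold
is_outlier = leverages > critical_value
```

With these values only the last sample exceeds the threshold, so the mask flags it alone.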

predict_residuals(X: ndarray, y: ndarray | None = None, validate: bool = True) → ndarray[source]

Calculate the leverage of the samples.

Parameters:

X (array-like of shape (n_samples, n_features)) – Input data

Returns:

Leverage of the samples

Return type:

np.ndarray
