Leverage
- class chemotools.outliers.Leverage(model: _BasePCA | _PLS | Pipeline, confidence: float = 0.95)
Bases: _ModelResidualsBase
Calculate the leverage of the training samples in the latent space of a PLS model. This makes it possible to detect data points with high leverage in the model.
- Parameters:
model (Union[ModelType, Pipeline]) – A fitted PLSRegression model or Pipeline ending with such a model
confidence (float, default=0.95) – Confidence level for statistical calculations (between 0 and 1)
- Variables:
estimator (ModelType) – The fitted model of type _PLS
transformer (Optional[Pipeline]) – Preprocessing steps before the model
n_features_in (int) – Number of features in the input data
n_components (int) – Number of components in the model
n_samples (int) – Number of samples used to train the model
critical_value (float) – The calculated critical value for outlier detection
References
- [1] Kim H. Esbensen, “Multivariate Data Analysis - In Practice”, 5th Edition, 2002.
Examples
>>> import numpy as np
>>> from sklearn.cross_decomposition import PLSRegression
>>> from chemotools.outliers import Leverage
>>> X = np.random.rand(100, 10)
>>> y = np.random.rand(100)
>>> pls = PLSRegression(n_components=3).fit(X, y)
>>> # Initialize Leverage with the fitted PLS model
>>> leverage = Leverage(pls, confidence=0.95)
>>> leverage.fit(X, y)
Leverage(model=PLSRegression(n_components=3), confidence=0.95)
>>> # Predict outliers in the dataset
>>> outliers = leverage.predict(X)
>>> # Get the leverage of the samples
>>> residuals = leverage.predict_residuals(X)
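To illustrate what "leverage in the latent space" means, here is a minimal NumPy sketch of the textbook definition: the diagonal of the hat matrix T (TᵀT)⁻¹ Tᵀ built from a score matrix T. It is not necessarily how chemotools computes the value. The helper name `leverage_from_scores` is hypothetical, the scores are stand-in principal component scores obtained by SVD rather than PLS scores, and the sketch omits the 1/n term that some definitions add.

```python
import numpy as np


def leverage_from_scores(T):
    """Diagonal of T (T^T T)^-1 T^T for scores T of shape (n_samples, n_components).

    Hypothetical helper for illustration; not part of the chemotools API.
    """
    T = np.asarray(T, dtype=float)
    # Solve (T^T T) Z = T^T rather than forming an explicit inverse
    Z = np.linalg.solve(T.T @ T, T.T)  # shape (n_components, n_samples)
    # Row i of T dotted with column i of Z gives the i-th diagonal element
    return np.einsum("ij,ji->i", T, Z)


rng = np.random.default_rng(0)
X = rng.random((100, 10))
Xc = X - X.mean(axis=0)  # mean-center before projecting

# Stand-in latent scores: the first 3 principal component scores via SVD
U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
T = U[:, :3] * s[:3]

h = leverage_from_scores(T)  # one leverage value per sample
```

Because the hat matrix is a projection onto the score space, each value lies between 0 and 1 and the values sum to the number of components, which gives a quick sanity check for any implementation.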
- fit(X: ndarray, y: ndarray | None = None) → Leverage
Fit the model to the input data.
- Parameters:
X (array-like of shape (n_samples, n_features)) – Input data
y (array-like of shape (n_samples,), default=None) – Target data
- Returns:
self – Fitted estimator with the critical threshold computed
- Return type:
Leverage