StudentizedResiduals

class chemotools.outliers.StudentizedResiduals(model: _PLS | Pipeline, confidence=0.95)

Bases: _ModelResidualsBase

Calculate the Studentized Residuals of a _PLS model's predictions.

Parameters:
  • model (Union[ModelType, Pipeline]) – A fitted _PLS model or Pipeline ending with such a model

  • confidence (float, default=0.95) – Confidence level for statistical calculations (between 0 and 1)

Variables:
  • estimator (ModelType) – The fitted model of type _BasePCA or _PLS

  • transformer (Optional[Pipeline]) – Preprocessing steps before the model

  • n_features_in (int) – Number of features in the input data

  • n_components (int) – Number of components in the model

  • n_samples (int) – Number of samples used to train the model

  • critical_value (float) – The calculated critical value for outlier detection

fit(X, y=None)

Fit the Studentized Residuals model by computing residuals on the training set and calculating the critical value for the configured confidence level.

predict(X, y=None)

Identify outliers in the input data based on the Studentized Residuals threshold (critical_value).
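
As a rough illustration of the thresholding step, the sketch below flags samples whose absolute studentized residual exceeds a critical value. The helper `flag_outliers` is hypothetical. The -1 (outlier) / 1 (inlier) labels follow the scikit-learn outlier-detector convention, and both that convention and the use of the absolute value are assumptions about this class, not something stated in this reference.

```python
from typing import List, Sequence


def flag_outliers(residuals: Sequence[float], critical_value: float) -> List[int]:
    """Label each sample: -1 if |residual| exceeds the threshold, else 1.

    Hypothetical helper; the label convention and the use of the absolute
    value are assumptions, not taken from the chemotools source.
    """
    return [-1 if abs(r) > critical_value else 1 for r in residuals]


# Four made-up studentized residuals checked against a threshold of 2.0
labels = flag_outliers([0.3, -2.7, 1.1, 2.1], critical_value=2.0)
print(labels)
```

In practice the threshold would come from the fitted object's critical_value rather than a hard-coded number.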

predict_residuals(X, y=None, validate=True)

Calculate Studentized Residuals for input data.
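
For orientation, one common textbook form of the studentized residual divides each raw residual e_i by s * sqrt(1 - h_i), where s is the residual standard error and h_i is the leverage of sample i. The sketch below applies that form to a one-predictor least-squares fit using only the standard library. It is not the chemotools implementation, which operates on a fitted _PLS model, and the function name `studentized_residuals_simple` is hypothetical.

```python
import math
from typing import List, Sequence


def studentized_residuals_simple(x: Sequence[float], y: Sequence[float]) -> List[float]:
    """Internally studentized residuals for the least-squares fit y = a + b*x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean

    # Raw residuals e_i = y_i - y_hat_i
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Residual standard error with n - 2 degrees of freedom (two fitted parameters)
    s = math.sqrt(sum(e ** 2 for e in residuals) / (n - 2))
    # Leverage of each sample for a single predictor plus intercept
    leverage = [1.0 / n + (xi - x_mean) ** 2 / sxx for xi in x]

    return [e / (s * math.sqrt(1.0 - h)) for e, h in zip(residuals, leverage)]


x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 9.0]  # last point is deliberately off-trend
r = studentized_residuals_simple(x, y)
print([round(v, 2) for v in r])
```

Dividing by sqrt(1 - h_i) inflates the residuals of high-leverage samples, which would otherwise look deceptively small because the fit is pulled toward them.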

_calculate_critical_value(X)

Calculate the critical value for outlier detection at the configured confidence level.
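
One plausible way to turn a confidence level into a threshold is to take the corresponding empirical quantile of the training residuals. The sketch below does this with the standard library. Whether chemotools uses an empirical quantile or a distribution-based cutoff is not stated in this reference, so treat `empirical_critical_value` as a hypothetical stand-in, not the library's method.

```python
from typing import Sequence


def empirical_critical_value(residuals: Sequence[float], confidence: float = 0.95) -> float:
    """Return the `confidence` quantile of |residuals| by linear interpolation.

    Hypothetical helper: the choice of an empirical quantile is an assumption.
    """
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be between 0 and 1")
    ordered = sorted(abs(r) for r in residuals)
    if not ordered:
        raise ValueError("residuals must not be empty")
    # Fractional position of the requested quantile in the sorted sample
    position = confidence * (len(ordered) - 1)
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    weight = position - lower
    return ordered[lower] * (1.0 - weight) + ordered[upper] * weight


training_residuals = [0.0, 0.5, -1.0, 1.5, -2.0]
threshold = empirical_critical_value(training_residuals, confidence=0.5)
print(threshold)
```

A higher confidence moves the threshold further into the tail, so fewer training samples fall beyond it.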

Examples

>>> from chemotools.datasets import load_fermentation_train
>>> from chemotools.outliers import StudentizedResiduals
>>> from sklearn.cross_decomposition import PLSRegression
>>> # Load sample data
>>> X, y = load_fermentation_train()
>>> y = y.values
>>> # Instantiate the PLS model
>>> pls = PLSRegression(n_components=3).fit(X, y)
>>> # Initialize StudentizedResiduals with the fitted PLS model
>>> studentized_residuals = StudentizedResiduals(model=pls, confidence=0.95)
>>> studentized_residuals.fit(X, y)
StudentizedResiduals(model=PLSRegression(n_components=3), confidence=0.95)
>>> # Predict outliers in the dataset
>>> outliers = studentized_residuals.predict(X, y)
>>> # Calculate Studentized residuals
>>> studentized_residuals_stats = studentized_residuals.predict_residuals(X, y)

References

[1] Kim H. Esbensen, “Multivariate Data Analysis - In Practice”, 5th Edition, 2002.

Attributes

estimator_

critical_value_

estimator_: _PLS