StudentizedResiduals
- class chemotools.outliers.StudentizedResiduals(model: _PLS | Pipeline, confidence=0.95)
Bases: _ModelResidualsBase
Calculate the Studentized Residuals on the predictions of a _PLS model.
- Parameters:
model (Union[ModelType, Pipeline]) – A fitted _PLS model or Pipeline ending with such a model
confidence (float, default=0.95) – Confidence level for statistical calculations (between 0 and 1)
- Variables:
estimator (ModelType) – The fitted model of type _BasePCA or _PLS
transformer (Optional[Pipeline]) – Preprocessing steps before the model
n_features_in (int) – Number of features in the input data
n_components (int) – Number of components in the model
n_samples (int) – Number of samples used to train the model
critical_value (float) – The calculated critical value for outlier detection
- fit(X, y=None)
Fit the Studentized Residuals model by computing residuals from the training set. Calculates the critical threshold based on the chosen method.
- predict(X, y=None)
Identify outliers in the input data based on the Studentized Residuals threshold.
- predict_residuals(X, y=None, validate=True)
Calculate Studentized Residuals for input data.
- _calculate_critical_value(X)
Calculate the critical value for outlier detection using the specified method.
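The quantities behind predict_residuals and predict can be illustrated with a small NumPy sketch. It is not the chemotools implementation: the helper name studentized_residuals, the n - A degrees of freedom, the two-sided normal quantile used as a threshold, and the -1/1 labelling (borrowed from scikit-learn outlier detectors) are all assumptions made for illustration. A least-squares fit on synthetic scores stands in for a fitted PLS model.

```python
from statistics import NormalDist

import numpy as np


def studentized_residuals(scores, y_true, y_pred):
    """Internally studentized residuals for a latent-variable regression.

    scores : (n_samples, n_components) latent scores T
    y_true, y_pred : (n_samples,) observed and predicted responses
    (Hypothetical helper, not part of chemotools.)
    """
    scores = np.asarray(scores, dtype=float)
    residuals = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    n_samples, n_components = scores.shape

    # Leverage h_i = t_i^T (T^T T)^{-1} t_i, the diagonal of the hat matrix
    gram_inv = np.linalg.pinv(scores.T @ scores)
    leverage = np.einsum("ij,jk,ik->i", scores, gram_inv, scores)

    # Residual standard deviation with n - A degrees of freedom (assumption)
    sigma = np.sqrt(np.sum(residuals**2) / (n_samples - n_components))

    # Scale each residual by its own estimated standard deviation
    return residuals / (sigma * np.sqrt(1.0 - leverage))


# Synthetic data: two score columns, one planted outlier at index 5
rng = np.random.default_rng(0)
T = rng.normal(size=(30, 2))
y = T @ np.array([1.5, -0.5]) + rng.normal(scale=0.1, size=30)
y[5] += 2.0

# Least squares on the scores stands in for a fitted PLS regression
coef, *_ = np.linalg.lstsq(T, y, rcond=None)
r = studentized_residuals(T, y, T @ coef)

# Assumed threshold: two-sided normal quantile at confidence=0.95
confidence = 0.95
threshold = NormalDist().inv_cdf(1 - (1 - confidence) / 2)

# Assumed labelling: -1 for outliers, 1 for inliers
flags = np.where(np.abs(r) > threshold, -1, 1)
```

Dividing by sqrt(1 - h_i) accounts for the fact that high-leverage samples pull the fit toward themselves and therefore show smaller raw residuals than their true deviation would suggest.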
Examples
>>> from chemotools.datasets import load_fermentation_train
>>> from chemotools.outliers import StudentizedResiduals
>>> from sklearn.cross_decomposition import PLSRegression
>>> # Load sample data
>>> X, y = load_fermentation_train()
>>> y = y.values
>>> # Instantiate the PLS model
>>> pls = PLSRegression(n_components=3).fit(X, y)
>>> # Initialize StudentizedResiduals with the fitted PLS model
>>> studentized_residuals = StudentizedResiduals(model=pls, confidence=0.95)
>>> studentized_residuals.fit(X, y)
StudentizedResiduals(model=PLSRegression(n_components=3), confidence=0.95)
>>> # Predict outliers in the dataset
>>> outliers = studentized_residuals.predict(X, y)
>>> # Calculate Studentized residuals
>>> studentized_residuals_stats = studentized_residuals.predict_residuals(X, y)
References
- [1] Kim H. Esbensen, “Multivariate Data Analysis - In Practice”, 5th Edition, 2002.
Attributes
estimator_
critical_value_
- estimator_: _PLS