DModX
- class chemotools.outliers.DModX(model: _BasePCA | _PLS | Pipeline, confidence: float = 0.95, mean_centered: bool = True)
Bases: _ModelResidualsBase
Calculate Distance to Model (DModX) statistics.
DModX measures the residual distance between an observation and the model plane in X-space. Observations with a large DModX are poorly described by the model, which makes the statistic useful for detecting outliers.
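The idea of a distance to the model plane can be illustrated with a minimal NumPy sketch. It is not the chemotools implementation: the helper name `residual_distance` and the synthetic data are hypothetical, and the model plane is assumed to be a mean-centered PCA subspace obtained from an SVD.

```python
import numpy as np


def residual_distance(X_train, X_new, n_components):
    """Euclidean distance from each row of X_new to a PCA plane fitted on X_train.

    Hypothetical helper for illustration only; not part of chemotools.
    """
    mean = X_train.mean(axis=0)
    # Rows of vt are the principal directions of the centered training data
    _, _, vt = np.linalg.svd(X_train - mean, full_matrices=False)
    loadings = vt[:n_components].T  # shape (n_features, n_components)
    centered = X_new - mean
    # Subtract the projection onto the plane to obtain the X-residuals
    residuals = centered - centered @ loadings @ loadings.T
    return np.linalg.norm(residuals, axis=1)


rng = np.random.default_rng(0)
# Synthetic training data lying close to a 2-D plane in 5-D space
scores = rng.normal(size=(50, 2))
basis = np.array([[1.0, 0.0, 0.0, 0.0, 0.0],
                  [0.0, 1.0, 0.0, 0.0, 0.0]])
X_train = scores @ basis + 0.01 * rng.normal(size=(50, 5))

in_plane = scores[:1] @ basis
# Move the same point 3 units along a direction orthogonal to the plane
off_plane = in_plane + np.array([[0.0, 0.0, 3.0, 0.0, 0.0]])

distances = residual_distance(X_train, np.vstack([in_plane, off_plane]), 2)
```

The first entry of `distances` stays close to zero, while the second is close to the 3-unit offset: the projection onto the plane removes everything the model explains and leaves only the unexplained part.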
- Parameters:
- Variables:
estimator (ModelType) -- The fitted model of type _BasePCA or _PLS
transformer (Optional[Pipeline]) -- Preprocessing steps before the model
n_features_in (int) -- Number of features in the input data
n_components (int) -- Number of components in the model
n_samples (int) -- Number of samples used to train the model
critical_value (float) -- The calculated critical value for outlier detection
train_sse (float) -- The training sum of squared errors (SSE) for the model normalized by degrees of freedom
A0 (int) -- Adjustment factor for degrees of freedom based on mean centering
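How these quantities might fit together can be sketched as follows. This is an illustration under stated assumptions, not the chemotools source: the function `dmodx_sketch` is hypothetical, and it assumes a common SIMCA-style formulation in which each observation's residual sum of squares is divided by K - A degrees of freedom, the pooled training SSE is divided by (N - A - A0)(K - A), and the critical value is the square root of an F-distribution quantile. Degrees-of-freedom conventions differ between references, so the library may use different ones.

```python
import numpy as np
from scipy.stats import f


def dmodx_sketch(X_train, X_new, n_components, confidence=0.95, mean_centered=True):
    """Normalized distance to a PCA model plus an F-based critical value.

    Hypothetical illustration; the degrees of freedom used here are an assumption.
    """
    n_samples, n_features = X_train.shape
    a0 = 1 if mean_centered else 0  # one degree of freedom lost to centering
    mean = X_train.mean(axis=0) if mean_centered else np.zeros(n_features)

    _, _, vt = np.linalg.svd(X_train - mean, full_matrices=False)
    loadings = vt[:n_components].T

    def sse_per_sample(X):
        centered = X - mean
        residuals = centered - centered @ loadings @ loadings.T
        return np.sum(residuals**2, axis=1)

    df_obs = n_features - n_components                 # K - A
    df_model = (n_samples - n_components - a0) * df_obs  # (N - A - A0)(K - A)

    # Training SSE normalized by degrees of freedom (pooled residual variance)
    train_sse = sse_per_sample(X_train).sum() / df_model
    dmodx = np.sqrt(sse_per_sample(X_new) / df_obs / train_sse)
    critical_value = np.sqrt(f.ppf(confidence, df_obs, df_model))
    return dmodx, critical_value


rng = np.random.default_rng(1)
# 40 samples, 6 features, 2 latent directions plus small noise
scores = rng.normal(size=(40, 2))
basis = np.eye(6)[:2]
X_train = scores @ basis + 0.1 * rng.normal(size=(40, 6))

# A regular training sample and a copy pushed far off the model plane
outlier = X_train[:1] + np.array([[0.0, 0.0, 5.0, 0.0, 0.0, 0.0]])
dmodx, critical_value = dmodx_sketch(X_train, np.vstack([X_train[:1], outlier]), 2)
is_outlier = dmodx > critical_value
```

Under these assumptions, an observation is flagged when its normalized distance exceeds `critical_value`, which is the role the `critical_value` variable plays in the list above.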
References
- [1] Max Bylesjö, Mattias Rantalainen, Olivier Cloarec, Jeremy K. Nicholson, Elaine Holmes, Johan Trygg. "OPLS discriminant analysis: combining the strengths of PLS-DA and SIMCA classification." Journal of Chemometrics 20 (8-10), 341-351 (2006).
Examples
>>> from chemotools.datasets import load_fermentation_train
>>> from chemotools.outliers import DModX
>>> from sklearn.decomposition import PCA
>>> # Load sample data
>>> X, _ = load_fermentation_train()
>>> # Fit the PCA model
>>> pca = PCA(n_components=3).fit(X)
>>> # Initialize DModX with the fitted PCA model
>>> dmodx = DModX(model=pca, confidence=0.95, mean_centered=True)
>>> dmodx.fit(X)
DModX(model=PCA(n_components=3), confidence=0.95, mean_centered=True)
>>> # Predict outliers in the dataset
>>> outliers = dmodx.predict(X)
>>> # Calculate DModX residuals
>>> residuals = dmodx.predict_residuals(X)
Attributes
critical_value_