VIPSelector
- class chemotools.feature_selection.VIPSelector(model, threshold: float = 1.0)
Bases: _PLSFeatureSelectorBase
This selector selects the features that contribute significantly to the latent variables of a PLS regression model, using the Variable Importance in Projection (VIP) method.
- Parameters:
model (Union[_PLS, Pipeline]) – The PLS regression model or a pipeline with a PLS regression model as last step.
threshold (float, default=1.0) – The threshold for feature selection. Features with importance above this threshold will be selected.
- Variables:
estimator (ModelTypes) – The fitted model of type _BasePCA or _PLS
feature_scores (np.ndarray) – The calculated feature scores based on the selected method.
support_mask (np.ndarray) – The boolean mask indicating which features are selected.
References
- [1] Kim H. Esbensen, “Multivariate Data Analysis - In Practice”, 5th Edition, 2002.
Examples
>>> from chemotools.datasets import load_fermentation_train
>>> from chemotools.feature_selection import VIPSelector
>>> from sklearn.cross_decomposition import PLSRegression
>>> # Load sample data
>>> X, y = load_fermentation_train()
>>> # Instantiate the PLS regression model
>>> pls_model = PLSRegression(n_components=2).fit(X, y)
>>> # Instantiate the VIP selector with the PLS model
>>> selector = VIPSelector(model=pls_model, threshold=1.0)
>>> selector.fit(X)
VIPSelector(model=PLSRegression(n_components=2), threshold=1.0)
>>> # Get the selected features
>>> X_selected = selector.transform(X)
>>> X_selected.shape
(21, 527)
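For intuition about what the feature scores represent, the following is a minimal NumPy sketch of the textbook VIP formula. It is not taken from the chemotools source, and the library's internal computation may differ in detail. The function name `vip_scores` and the random input arrays are hypothetical. For a fitted scikit-learn `PLSRegression`, the three inputs would correspond to the `x_scores_`, `x_weights_`, and `y_loadings_` attributes. The strict `>` comparison against the threshold is an assumption based on the parameter description above.

```python
import numpy as np


def vip_scores(x_scores, x_weights, y_loadings):
    """Textbook VIP scores (hypothetical helper, not the chemotools code).

    x_scores   : (n_samples, n_components) latent scores T
    x_weights  : (n_features, n_components) weights W
    y_loadings : (n_targets, n_components) y-loadings Q
    """
    t = np.asarray(x_scores, dtype=float)
    w = np.asarray(x_weights, dtype=float)
    q = np.asarray(y_loadings, dtype=float)
    n_features = w.shape[0]

    # Sum of squares of y explained by each component: (t_a' t_a) * (q_a' q_a)
    ss = np.sum(t**2, axis=0) * np.sum(q**2, axis=0)

    # Squared weights, with each component's weight vector scaled to unit length
    w_unit_sq = (w / np.linalg.norm(w, axis=0)) ** 2

    # Weighted average over components, scaled by the number of features
    return np.sqrt(n_features * (w_unit_sq @ ss) / ss.sum())


# Synthetic stand-ins for the attributes of a fitted PLS model (assumption)
rng = np.random.default_rng(0)
t = rng.normal(size=(20, 2))   # 20 samples, 2 components
w = rng.normal(size=(5, 2))    # 5 features
q = rng.normal(size=(1, 2))    # 1 target

vip = vip_scores(t, w, q)
support_mask = vip > 1.0       # analogous to threshold=1.0
```

Under this formula the squared VIP scores average to one across features, which is why a threshold of 1.0 is a common default: it keeps the features with above-average importance.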
- fit(X: ndarray, y=None) → VIPSelector
Fit the transformer to calculate the feature scores and the support mask.
- Parameters:
X (array-like of shape (n_samples, n_features)) – The input data to fit the transformer to.
y (None) – Ignored; present only for API consistency.
- Returns:
self – The fitted transformer.
- Return type:
VIPSelector