VIPSelector

class chemotools.feature_selection.VIPSelector(model, threshold: float = 1.0)[source]

Bases: _PLSFeatureSelectorBase

This selector keeps the features that contribute most to the latent variables of a PLS regression model, as measured by the Variable Importance in Projection (VIP) method.

Parameters:
  • model (Union[_PLS, Pipeline]) -- The PLS regression model or a pipeline with a PLS regression model as last step.

  • threshold (float, default=1.0) -- The threshold for feature selection. Features with importance above this threshold will be selected.

Variables:
  • estimator (ModelTypes) -- The fitted model of type _BasePCA or _PLS

  • feature_scores (np.ndarray) -- The calculated feature scores based on the selected method.

  • support_mask (np.ndarray) -- The boolean mask indicating which features are selected.

References

[1] Kim H. Esbensen, "Multivariate Data Analysis - In Practice", 5th Edition, 2002.

Examples

>>> from chemotools.datasets import load_fermentation_train
>>> from chemotools.feature_selection import VIPSelector
>>> from sklearn.cross_decomposition import PLSRegression
>>> # Load sample data
>>> X, y = load_fermentation_train()
>>> # Instantiate the PLS regression model
>>> pls_model = PLSRegression(n_components=2).fit(X, y)
>>> # Instantiate the VIP selector with the PLS model
>>> selector = VIPSelector(model=pls_model, threshold=1.0)
>>> selector.fit(X)
VIPSelector(model=PLSRegression(n_components=2), threshold=1.0)
>>> # Get the selected features
>>> X_selected = selector.transform(X)
>>> X_selected.shape
(21, 527)

Attributes

estimator_

fit(X: ndarray, y=None) VIPSelector[source]

Fit the transformer to calculate the feature scores and the support mask.

Parameters:
  • X (array-like of shape (n_samples, n_features)) -- The input data to fit the transformer to.

  • y (None) -- Ignored. Present only for API consistency.

Returns:

self -- The fitted transformer.

Return type:

VIPSelector