PiecewiseDirectStandardization
- class chemotools.adaptation.PiecewiseDirectStandardization(window_length: int = 25, n_components: int = 2, scale: bool = True, storage: str = 'dense')
Bases:
DocLinkMixin, OneToOneFeatureMixin, TransformerMixin, BaseEstimator
Piecewise Direct Standardization (PDS) is a transformer used for domain adaptation (calibration transfer) applications. For each feature, the transformer uses least squares on a local spectral window to find a linear map from the target instrument space to the source instrument space, following the implementations in [1] and [2].
- Parameters:
window_length (int, default=25) -- Half-width (w) of the local spectral window used in PDS.
n_components (int, default=2) -- Number of components to keep in the PLS model.
scale (bool, default=True) -- Whether to scale X and Y in the PLS model.
storage (str {"dense", "band"}, default="dense") -- Storage format for the regression coefficients.
  - "dense" stores the full (n_features, n_features) matrix. Fastest when n_features is small enough that the matrix fits in CPU cache (roughly n_features ≤ 1500 at float64).
  - "band" stores coefficients in a compact (n_features, 2 * window_length + 1) array and exploits the band structure in transform() via a strided sliding-window view and einsum. Fastest when n_features is large (roughly n_features > 1500), and uses ~2–5 % of the memory of "dense".
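The difference between the two storage formats can be illustrated with a minimal NumPy sketch. This is not the library's code: the helper names `dense_to_band` and `band_transform`, the column-wise band convention, and the explicit loop over boundary features are assumptions made for illustration. The sketch packs a banded dense matrix into an `(n_features, 2 * window_length + 1)` array and checks that a sliding-window view combined with `einsum` reproduces the dense product. It assumes `n_features > 2 * window_length`.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def dense_to_band(T, w):
    """Pack the band of a dense (n, n) matrix into an (n, 2w + 1) array.

    Row j holds T[j - w : j + w + 1, j]; slots that fall outside the
    matrix (near the edges) stay zero. Hypothetical helper.
    """
    n = T.shape[0]
    band = np.zeros((n, 2 * w + 1))
    for j in range(n):
        lo, hi = max(0, j - w), min(n, j + w + 1)
        band[j, lo - (j - w):hi - (j - w)] = T[lo:hi, j]
    return band


def band_transform(X, band, w):
    """Compute X @ T using only the packed band. Hypothetical helper."""
    n = X.shape[1]
    out = np.empty(X.shape, dtype=float)
    start, end = w, n - w  # features whose window is fully interior
    # windows[:, i, :] is X[:, i : i + 2w + 1], the window of feature i + w
    windows = sliding_window_view(X, 2 * w + 1, axis=1)
    out[:, start:end] = np.einsum("sik,ik->si", windows, band[start:end])
    # Boundary features have truncated windows and are handled one by one
    for j in list(range(0, start)) + list(range(end, n)):
        lo, hi = max(0, j - w), min(n, j + w + 1)
        out[:, j] = X[:, lo:hi] @ band[j, lo - (j - w):hi - (j - w)]
    return out


rng = np.random.default_rng(0)
n, w = 40, 3
idx = np.arange(n)
mask = np.abs(idx[:, None] - idx[None, :]) <= w
T = rng.normal(size=(n, n)) * mask  # dense matrix with band structure
X = rng.normal(size=(8, n))

band = dense_to_band(T, w)
same = np.allclose(X @ T, band_transform(X, band, w))
ratio = band.size / T.size  # equals (2 * w + 1) / n
```

The memory ratio is `(2 * window_length + 1) / n_features`, so with the default `window_length=25` it is 3.4 % at 1500 features, which is consistent with the range quoted above.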
- Variables:
n_features_in_ (int) -- Number of features seen during fit (set automatically by sklearn).
T_ (np.ndarray of shape (n_features, n_features), or None) -- Banded transformation matrix. Dense ndarray when storage="dense". None when storage="band" or when fitted without X_source.
coef_band_ (np.ndarray of shape (n_features, 2 * window_length + 1), or None) -- Packed band of per-feature PLS coefficients. Set only when storage="band"; None otherwise.
interior_start_ (int) -- Index of the first feature whose local window is fully interior (window_length). Set only when storage="band".
interior_end_ (int) -- One past the last interior feature (n_features - window_length). Set only when storage="band".
bias_ (np.ndarray of shape (n_features,), or None) -- Precomputed per-feature bias that absorbs local PLS centering, allowing transform() to avoid per-sample intermediate allocations. None if fitted with X_source=None.
x_source_provided_ (bool) -- Boolean flag indicating whether X_source was provided during fitting.
- Raises:
ValueError -- If X and X_source do not have the same shape.
ValueError -- If n_components exceeds n_samples.
ValueError -- If n_components exceeds the minimum window size at the boundaries (window_length + 1).
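The `window_length + 1` limit follows from how the local window shrinks at the edges of the spectrum. The sketch below assumes windows are simply clipped at the first and last feature. That assumption matches the bound stated above, but the helper `window_bounds` is hypothetical and not part of chemotools.

```python
def window_bounds(j, n_features, window_length):
    """Half-open [lo, hi) feature range of the window centred on feature j,
    clipped at the spectrum edges (assumed boundary handling)."""
    lo = max(0, j - window_length)
    hi = min(n_features, j + window_length + 1)
    return lo, hi


n_features, window_length = 100, 5
sizes = [
    hi - lo
    for lo, hi in (
        window_bounds(j, n_features, window_length) for j in range(n_features)
    )
]

smallest = min(sizes)  # window_length + 1, at the first and last feature
largest = max(sizes)   # 2 * window_length + 1, for interior features

# Features whose window is not clipped form one contiguous interior block
interior = [j for j, s in enumerate(sizes) if s == largest]
interior_start, interior_end = interior[0], interior[-1] + 1
```

Under this assumption the smallest window has `window_length + 1` features, so a local model cannot use more components than that, and the interior block runs from `window_length` to `n_features - window_length`.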
See also
DirectStandardization: Global linear transformation without local windows.
References
Examples
>>> import numpy as np
>>> from chemotools.adaptation import PiecewiseDirectStandardization
>>> rng = np.random.default_rng(42)
>>> X = rng.normal(size=(50, 100))
>>> X_source = X * 1.2 + rng.normal(0, 0.1, size=(50, 100))
>>> pds = PiecewiseDirectStandardization(window_length=5, n_components=2)
>>> pds.fit(X, X_source=X_source)
PiecewiseDirectStandardization(n_components=2, window_length=5)
>>> X_transformed = pds.transform(X)
>>> X_transformed.shape
(50, 100)
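To make the piecewise idea concrete, here is a minimal NumPy sketch of the general PDS scheme: one local regression per feature, mapping a window of target-instrument features to a single source-instrument feature. It deliberately swaps in ordinary least squares (`np.linalg.lstsq`) for the PLS regression that the `n_components` and `scale` parameters refer to. The function name `pds_fit_lstsq` and the clipped boundary windows are assumptions, so treat it as an illustration rather than the library's implementation.

```python
import numpy as np


def pds_fit_lstsq(X_target, X_source, window_length):
    """Fit a banded map T and a bias so that X_target @ T + bias ~ X_source.

    Illustrative only: uses ordinary least squares per window, not PLS.
    """
    n_samples, n_features = X_target.shape
    T = np.zeros((n_features, n_features))
    bias = np.zeros(n_features)
    x_mean = X_target.mean(axis=0)
    y_mean = X_source.mean(axis=0)
    for j in range(n_features):
        # Local window around feature j, clipped at the spectrum edges
        lo = max(0, j - window_length)
        hi = min(n_features, j + window_length + 1)
        Xw = X_target[:, lo:hi] - x_mean[lo:hi]  # centre the window
        yj = X_source[:, j] - y_mean[j]          # centre the response
        coef, *_ = np.linalg.lstsq(Xw, yj, rcond=None)
        T[lo:hi, j] = coef
        # Fold the centering into a per-feature bias term
        bias[j] = y_mean[j] - x_mean[lo:hi] @ coef
    return T, bias


rng = np.random.default_rng(0)
X_source = rng.normal(size=(60, 30))
X_target = 0.8 * X_source + 0.05  # synthetic gain and offset between instruments

T, bias = pds_fit_lstsq(X_target, X_source, window_length=2)
X_mapped = X_target @ T + bias
recovered = np.allclose(X_mapped, X_source)
```

Because the synthetic target data are an exact gain-and-offset copy of the source data, the local regressions recover the source spectra up to floating-point error. Real instrument differences are noisier, which is one reason a rank-limited regression such as PLS is commonly preferred for the local models.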
Attributes
n_features_in_, T_, coef_band_, interior_start_, interior_end_, bias_, x_source_provided_
- n_features_in_: int
- interior_start_: int
- interior_end_: int
- x_source_provided_: bool
- fit(X: ndarray, y=None, *, X_source: ndarray | None = None) -> PiecewiseDirectStandardization
Fit the PiecewiseDirectStandardization to the input data.
- Parameters:
X (np.ndarray of shape (n_samples, n_features)) -- Data from the target instrument.
y (None) -- Ignored; present for API consistency.
X_source (np.ndarray of shape (n_samples, n_features), optional) -- Data from the source instrument. If None, the transformer defaults to an identity transformation.
- Returns:
self
- Return type:
PiecewiseDirectStandardization
- transform(X) -> ndarray
Use the fitted model to transform data from the target instrument.
- Parameters:
X (np.ndarray of shape (n_samples, n_features)) -- Input data to transform.
- Returns:
X_transformed -- Transformed data.
- Return type:
np.ndarray of shape (n_samples, n_features)
- set_fit_request(*, X_source: bool | None | str = '$UNCHANGED$') -> PiecewiseDirectStandardization
Configure whether metadata should be requested to be passed to the fit method.
Note that this method is only relevant when this estimator is used as a sub-estimator within a meta-estimator and metadata routing is enabled with enable_metadata_routing=True (see sklearn.set_config()). Please check the User Guide on how the routing mechanism works.
The options for each parameter are:
- True: metadata is requested, and passed to fit if provided. The request is ignored if metadata is not provided.
- False: metadata is not requested and the meta-estimator will not pass it to fit.
- None: metadata is not requested, and the meta-estimator will raise an error if the user provides it.
- str: metadata should be passed to the meta-estimator with this given alias instead of the original name.
The default (sklearn.utils.metadata_routing.UNCHANGED) retains the existing request. This allows you to change the request for some parameters and not others.
Added in version 1.3.