AddNoise

class chemotools.augmentation.AddNoise(distribution: Literal['gaussian', 'poisson', 'exponential'] = 'gaussian', scale: float = 0.0, random_state: int | None = None)[source]

Bases: TransformerMixin, OneToOneFeatureMixin, BaseEstimator

Add noise to input data from various probability distributions.

This transformer adds random noise from specified probability distributions to the input data. Supported distributions include Gaussian, Poisson, and exponential.

Parameters:
  • distribution ({'gaussian', 'poisson', 'exponential'}, default='gaussian') -- The probability distribution to sample noise from.

  • scale (float, default=0.0) -- Scale parameter for the noise distribution. Must be non-negative.

      - For gaussian: standard deviation
      - For poisson: multiplication factor for sampled values
      - For exponential: scale parameter (1/λ)

  • random_state (int, optional) -- Random seed for reproducibility.
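
To make the meaning of scale and random_state concrete, here is a minimal NumPy sketch. It is not the chemotools implementation: sample_noise is a hypothetical helper, and the unit-rate Poisson draw in particular is an assumption, since the parameter description only says that scale multiplies the sampled values.

```python
import numpy as np


def sample_noise(shape, distribution="gaussian", scale=0.1, random_state=None):
    """Hypothetical helper (not part of chemotools) that draws a noise array."""
    rng = np.random.default_rng(random_state)
    if distribution == "gaussian":
        # scale is the standard deviation of zero-mean noise
        return rng.normal(loc=0.0, scale=scale, size=shape)
    if distribution == "exponential":
        # scale is 1/lambda, which is also the mean of the samples
        return rng.exponential(scale=scale, size=shape)
    if distribution == "poisson":
        # Assumption: unit-rate counts multiplied by scale; the rate the
        # library actually uses is not stated in this page
        return rng.poisson(lam=1.0, size=shape) * scale
    raise ValueError(f"Invalid noise distribution: {distribution}")


X = np.ones((4, 3))
# The same seed yields the same noise, which is what random_state is for
noise_a = sample_noise(X.shape, "gaussian", scale=0.1, random_state=42)
noise_b = sample_noise(X.shape, "gaussian", scale=0.1, random_state=42)
X_noisy = X + noise_a
```

With scale=0.0, the default, each of these branches returns an array of zeros, so the data would pass through unchanged in this sketch.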

Attributes:

n_features_in_ (int) -- Number of features in the training data.

Examples

>>> from chemotools.augmentation import AddNoise
>>> from chemotools.datasets import load_fermentation_train
>>> # Load sample data
>>> X, _ = load_fermentation_train()
>>> # Instantiate the transformer
>>> transformer = AddNoise(distribution="gaussian", scale=0.1)
>>> transformer.fit(X)
AddNoise(scale=0.1)
>>> # Generate noisy data
>>> X_noisy = transformer.transform(X)

fit(X: ndarray, y=None) → AddNoise[source]

Fit the transformer to the input data.

Parameters:
  • X (np.ndarray of shape (n_samples, n_features)) -- Training data.

  • y (None) -- Ignored. Present for API consistency.

Returns:

self -- Fitted transformer.

Return type:

AddNoise

Raises:

ValueError -- If X is not a 2D array or contains non-finite values.
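
The two conditions above can be illustrated with a small NumPy sketch. validate_training_data is a hypothetical stand-in, not a chemotools or scikit-learn function, and the error messages are illustrative only.

```python
import numpy as np


def validate_training_data(X):
    """Hypothetical stand-in for the checks fit is documented to perform."""
    X = np.asarray(X, dtype=float)
    if X.ndim != 2:
        raise ValueError(f"Expected a 2D array, got {X.ndim}D input instead.")
    if not np.isfinite(X).all():
        raise ValueError("Input contains NaN or infinity.")
    # The number of columns is what a fitted transformer would remember
    return X.shape[1]


n_features = validate_training_data([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])

errors = []
for bad in ([1.0, 2.0, 3.0], [[1.0, np.nan], [3.0, 4.0]]):
    try:
        validate_training_data(bad)
    except ValueError as err:
        errors.append(str(err))
```

Both the 1D input and the input containing NaN are rejected here, so errors ends up holding two messages.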

transform(X: ndarray, y=None) → ndarray[source]

Transform the input data by adding random noise.

Parameters:
  • X (np.ndarray of shape (n_samples, n_features)) -- Input data to transform.

  • y (None) -- Ignored. Present for API consistency.

Returns:

X_transformed -- Transformed data with added noise.

Return type:

np.ndarray of shape (n_samples, n_features)

Raises:

ValueError -- If X has a different number of features than the training data, or if an invalid noise distribution is specified.
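
As a rough illustration of the feature-count condition, the sketch below compares the column count of new data against the count seen during fitting. check_feature_count is a hypothetical helper written for this example; the library's own check and message may differ.

```python
import numpy as np


def check_feature_count(X, n_features_in):
    """Hypothetical check: reject data whose width differs from training."""
    X = np.asarray(X, dtype=float)
    if X.ndim != 2 or X.shape[1] != n_features_in:
        raise ValueError(
            f"Expected {n_features_in} features, got input of shape {X.shape}."
        )
    return X


X_train = np.zeros((5, 4))
n_features_in = X_train.shape[1]  # recorded at fit time

# Same width as the training data: accepted
X_ok = check_feature_count(np.zeros((2, 4)), n_features_in)

# One column fewer: rejected with a ValueError
try:
    check_feature_count(np.zeros((2, 3)), n_features_in)
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True
```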