AddNoise

class chemotools.augmentation.AddNoise(distribution: Literal['gaussian', 'poisson', 'exponential'] = 'gaussian', scale: float = 0.0, random_state: int | None = None)

Bases: TransformerMixin, OneToOneFeatureMixin, BaseEstimator

Add noise to input data from various probability distributions.

This transformer adds random noise drawn from the selected probability distribution to the input data. Supported distributions are Gaussian, Poisson, and exponential.

Parameters:
  • distribution ({'gaussian', 'poisson', 'exponential'}, default='gaussian') – The probability distribution to sample noise from.

  • scale (float, default=0.0) – Scale parameter for the noise distribution. Must be non-negative.

      - For gaussian: standard deviation
      - For poisson: multiplication factor for sampled values
      - For exponential: scale parameter (1/λ)

  • random_state (int, optional) – Random seed for reproducibility.
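To make the meaning of scale and random_state concrete, here is a minimal NumPy sketch of additive Gaussian and exponential noise. It assumes the noise is simply added element-wise to X. The helper name add_noise_sketch is hypothetical, and this is not the chemotools implementation. Poisson is left out because the parameter description above does not say which rate the samples are drawn with.

```python
import numpy as np


def add_noise_sketch(X, distribution="gaussian", scale=0.0, random_state=None):
    """Illustrative stand-in for additive noise; NOT the chemotools code."""
    if scale < 0:
        raise ValueError("scale must be non-negative")
    # A seeded generator makes the noise reproducible across calls.
    rng = np.random.default_rng(random_state)
    X = np.asarray(X, dtype=float)
    if distribution == "gaussian":
        # scale is the standard deviation of zero-mean noise
        noise = rng.normal(loc=0.0, scale=scale, size=X.shape)
    elif distribution == "exponential":
        # scale is 1/lambda, i.e. the mean of the exponential distribution
        noise = rng.exponential(scale=scale, size=X.shape)
    else:
        raise ValueError(f"unsupported distribution in this sketch: {distribution!r}")
    return X + noise


X = np.ones((4, 3))
X_gauss = add_noise_sketch(X, "gaussian", scale=0.1, random_state=0)
X_expo = add_noise_sketch(X, "exponential", scale=0.1, random_state=0)
```

In this sketch, Gaussian noise is symmetric around zero, whereas exponential noise is never negative and therefore shifts every value upward by scale on average.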

Variables:

n_features_in_ (int) – Number of features in the training data.

Examples

>>> from chemotools.augmentation import AddNoise
>>> from chemotools.datasets import load_fermentation_train
>>> # Load sample data
>>> X, _ = load_fermentation_train()
>>> # Instantiate the transformer
>>> transformer = AddNoise(distribution="gaussian", scale=0.1)
>>> transformer.fit(X)
AddNoise(scale=0.1)
>>> # Generate noisy data
>>> X_noisy = transformer.transform(X)

fit(X: ndarray, y=None) → AddNoise

Fit the transformer to the input data.

Parameters:
  • X (np.ndarray of shape (n_samples, n_features)) – Training data.

  • y (None) – Ignored. Present for API consistency.

Returns:

self – Fitted transformer.

Return type:

AddNoise

Raises:

ValueError – If X is not a 2D array or contains non-finite values.
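
The two conditions that make fit raise can be illustrated with a small stand-alone check. This is a sketch under assumptions: check_fit_input is a hypothetical helper written for illustration, and the library's validation may be implemented differently (for example, through scikit-learn's input validation utilities).

```python
import numpy as np


def check_fit_input(X):
    """Hypothetical helper mirroring the documented fit-time checks."""
    X = np.asarray(X, dtype=float)
    if X.ndim != 2:
        raise ValueError(f"expected a 2D array, got {X.ndim}D")
    if not np.isfinite(X).all():
        raise ValueError("input contains NaN or infinity")
    return X


# A 1D array and a 2D array containing NaN are both rejected.
rejected = []
for bad in (np.ones(5), np.array([[1.0, np.nan]])):
    try:
        check_fit_input(bad)
    except ValueError as err:
        rejected.append(str(err))

# A finite 2D array passes through unchanged.
ok = check_fit_input(np.ones((2, 3)))
```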

transform(X: ndarray, y=None) → ndarray

Transform the input data by adding random noise.

Parameters:
  • X (np.ndarray of shape (n_samples, n_features)) – Input data to transform.

  • y (None) – Ignored. Present for API consistency.

Returns:

X_transformed – Transformed data with added noise.

Return type:

np.ndarray of shape (n_samples, n_features)

Raises:

ValueError – If X has a different number of features than the training data, or if an invalid noise distribution is specified.
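
The feature-count condition can be sketched with a toy class that records the training width in fit and compares against it in transform. FeatureCountSketch is hypothetical and only illustrates the check. It assumes the width is stored in an attribute named n_features_in_, following the scikit-learn convention, and it adds no noise.

```python
import numpy as np


class FeatureCountSketch:
    """Hypothetical minimal class; only illustrates the feature-count check."""

    def fit(self, X, y=None):
        # Remember how many columns the training data had.
        self.n_features_in_ = np.asarray(X).shape[1]
        return self

    def transform(self, X, y=None):
        X = np.asarray(X, dtype=float)
        if X.shape[1] != self.n_features_in_:
            raise ValueError(
                f"X has {X.shape[1]} features, expected {self.n_features_in_}"
            )
        return X.copy()  # a real transformer would add noise here


sketch = FeatureCountSketch().fit(np.zeros((10, 5)))
same_width = sketch.transform(np.zeros((2, 5)))  # accepted: 5 features

try:
    sketch.transform(np.zeros((2, 4)))  # rejected: 4 features
    mismatch_error = None
except ValueError as err:
    mismatch_error = str(err)
```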