AddNoise
- class chemotools.augmentation.AddNoise(distribution: Literal['gaussian', 'poisson', 'exponential'] = 'gaussian', scale: float = 0.0, random_state: int | None = None)
Bases: TransformerMixin, OneToOneFeatureMixin, BaseEstimator
Add noise to input data from various probability distributions.
This transformer adds random noise, drawn from the specified probability distribution, to the input data. The supported distributions are Gaussian, Poisson, and exponential.
- Parameters:
distribution ({'gaussian', 'poisson', 'exponential'}, default='gaussian') – The probability distribution to sample noise from.
scale (float, default=0.0) – Scale parameter for the noise distribution. Must be non-negative.
  - For gaussian: standard deviation
  - For poisson: multiplication factor for sampled values
  - For exponential: scale parameter (1/λ)
random_state (int, optional) – Random seed for reproducibility.
- Variables:
n_features_in (int) – Number of features in the training data.
Examples
>>> from chemotools.augmentation import AddNoise
>>> from chemotools.datasets import load_fermentation_train
>>> # Load sample data
>>> X, _ = load_fermentation_train()
>>> # Instantiate the transformer
>>> transformer = AddNoise(distribution="gaussian", scale=0.1)
>>> transformer.fit(X)
AddNoise(scale=0.1)
>>> # Generate noisy data
>>> X_noisy = transformer.transform(X)
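The meaning of scale for each distribution, as listed under Parameters, can be illustrated with plain NumPy. This is a minimal sketch and not the chemotools implementation: the helper name `sample_noise` is hypothetical, and the Poisson rate of 1.0 is an assumption, since the documentation only states that sampled values are multiplied by scale.

```python
import numpy as np


def sample_noise(shape, distribution="gaussian", scale=0.0, rng=None):
    """Hypothetical helper: draw noise of the given shape (not chemotools code)."""
    if scale < 0:
        raise ValueError("scale must be non-negative")
    rng = np.random.default_rng() if rng is None else rng
    if distribution == "gaussian":
        # scale is the standard deviation of zero-mean noise
        return rng.normal(loc=0.0, scale=scale, size=shape)
    if distribution == "poisson":
        # Assumption: unit-rate Poisson counts, multiplied by scale
        return scale * rng.poisson(lam=1.0, size=shape)
    if distribution == "exponential":
        # scale is 1/lambda, which is also the mean of the distribution
        return rng.exponential(scale=scale, size=shape)
    raise ValueError(f"unknown distribution: {distribution!r}")


X = np.ones((4, 3))
noise = sample_noise(X.shape, "gaussian", scale=0.1, rng=np.random.default_rng(0))
X_noisy = X + noise  # same shape as X
```

In this sketch the default scale of 0.0 yields all-zero Gaussian noise, so the data would pass through unchanged.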
- fit(X: ndarray, y=None) → AddNoise
Fit the transformer to the input data.
- Parameters:
X (np.ndarray of shape (n_samples, n_features)) – Training data.
y (None) – Ignored. Present for API consistency.
- Returns:
self – Fitted transformer.
- Return type:
AddNoise
- Raises:
ValueError – If X is not a 2D array or contains non-finite values.
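The conditions under which fit raises can be mimicked with a small NumPy check. This is a sketch of the documented contract only: `check_2d_finite` and `raises_value_error` are hypothetical names, and the actual validation inside chemotools may be implemented differently.

```python
import numpy as np


def check_2d_finite(X):
    """Hypothetical validator mirroring the documented ValueError conditions."""
    X = np.asarray(X, dtype=float)
    if X.ndim != 2:
        raise ValueError(f"expected a 2D array, got {X.ndim}D")
    if not np.isfinite(X).all():
        raise ValueError("input contains NaN or infinity")
    return X


def raises_value_error(X):
    """Return True if validation rejects X, False if it is accepted."""
    try:
        check_2d_finite(X)
    except ValueError:
        return True
    return False


valid_rejected = raises_value_error([[1.0, 2.0], [3.0, 4.0]])  # valid 2D input
one_dim_rejected = raises_value_error([1.0, 2.0, 3.0])         # 1D input
nan_rejected = raises_value_error([[1.0, np.nan], [3.0, 4.0]])  # non-finite value
```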
- transform(X: ndarray, y=None) → ndarray
Transform the input data by adding random noise.
- Parameters:
X (np.ndarray of shape (n_samples, n_features)) – Input data to transform.
y (None) – Ignored. Present for API consistency.
- Returns:
X_transformed – Transformed data with added noise.
- Return type:
np.ndarray of shape (n_samples, n_features)
- Raises:
ValueError – If X has a different number of features than the training data, or if an invalid noise distribution is specified.
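The feature-count check and the effect of random_state can be sketched with a small stand-in class. `NoiseSketch` is hypothetical and supports Gaussian noise only. It assumes the random generator is seeded once at fit time, which the documentation does not confirm, so treat it as an illustration of the documented behaviour and not as the chemotools implementation.

```python
import numpy as np


class NoiseSketch:
    """Hypothetical stand-in for illustration; Gaussian noise only."""

    def __init__(self, scale=0.0, random_state=None):
        self.scale = scale
        self.random_state = random_state

    def fit(self, X, y=None):
        X = np.asarray(X, dtype=float)
        # Remember the feature count seen during fitting
        self.n_features_in = X.shape[1]
        # Assumption: the generator is seeded once, at fit time
        self._rng = np.random.default_rng(self.random_state)
        return self

    def transform(self, X, y=None):
        X = np.asarray(X, dtype=float)
        if X.shape[1] != self.n_features_in:
            raise ValueError(
                f"X has {X.shape[1]} features, expected {self.n_features_in}"
            )
        return X + self._rng.normal(0.0, self.scale, size=X.shape)


X = np.zeros((5, 4))
first = NoiseSketch(scale=0.1, random_state=42).fit(X).transform(X)
second = NoiseSketch(scale=0.1, random_state=42).fit(X).transform(X)
same_noise = np.array_equal(first, second)  # identical seeds, identical noise

try:
    NoiseSketch(scale=0.1).fit(X).transform(np.zeros((5, 3)))
    mismatch_rejected = False
except ValueError:
    mismatch_rejected = True
```

Fixing the seed is what makes an augmentation run repeatable; leaving random_state as None gives different noise on each run.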